Handwritten Text Recognition on tranScriptorium Datasets: Bentham R0 (HTR Competition 2014)
Historical Handwritten Text Recognition
The Bentham collection is a set of images of works on law and moral philosophy written by the philosopher Jeremy Bentham.
The selected subset was written by several hands (Bentham himself and his secretaries) and shows significant variability and difficulty in both the quality of the text images and the writing styles. Training and test data were provided as carefully segmented line images, along with the corresponding transcripts.
This dataset is freely available for research purposes and is provided in two parts: the images and the ground truth (GT). The GT includes the layout and the line-level transcription of each image in PAGE format.
The dataset includes a README describing the amount of data in the training and test sets.
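Because the GT stores layout and line-level transcriptions in PAGE format, a short sketch of how such a file might be read can help with orientation. The example below is an illustration only, not a tool shipped with the dataset. It assumes the common PAGE XML layout in which each `TextLine` element carries a `Coords` element with a `points` attribute and a `TextEquiv/Unicode` child holding the transcript. Older PAGE schema versions encode coordinates differently, so the actual files should be checked first. The sample document, its namespace version, and the function names `local_name`, `parse_points`, and `read_lines` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed PAGE document used only for illustration.
# Real GT files carry more metadata and may use another schema version.
SAMPLE_PAGE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1">
      <Coords points="100,100 1900,100 1900,400 100,400"/>
      <TextLine id="r1l1">
        <Coords points="100,100 1900,100 1900,200 100,200"/>
        <TextEquiv><Unicode>first example line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l2">
        <Coords points="100,220 1900,220 1900,320 100,320"/>
        <TextEquiv><Unicode>second example line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code is schema-version agnostic."""
    return tag.rsplit("}", 1)[-1]


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_lines(xml_text):
    """Return (line_id, polygon, transcript) for each TextLine, in document order."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        polygon, transcript = [], ""
        # Only direct children are inspected, so any word-level TextEquiv
        # nested deeper inside the line is ignored.
        for child in elem:
            name = local_name(child.tag)
            if name == "Coords":
                polygon = parse_points(child.get("points", ""))
            elif name == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        transcript = sub.text or ""
        lines.append((elem.get("id"), polygon, transcript))
    return lines


for line_id, polygon, transcript in read_lines(SAMPLE_PAGE_XML):
    print(line_id, len(polygon), transcript)
```

Matching on local element names rather than a fixed namespace is a deliberate choice here: PAGE namespaces differ between schema releases, and the release used by this dataset is not stated in the description above.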