Handwritten Text Recognition on tranScriptorium Datasets: Bentham R1 (HTR Competition 2015)

Ground Truth

GT for the HTR Competition 2015

2017-01-20 (v. 1)

Contact author

Joan Andreu Sánchez

Pattern Recognition and Human Language Technologies - Universitat Politècnica de


(+34) 96 387 7358

(+34) 96 387 7359

This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.


Handwritten Text Recognition of Historical Documents


This dataset corresponds to the second edition of the Handwritten Text Recognition (HTR) contest on the tranScriptorium datasets, which was held in the context of the International Conference on Document Analysis and Recognition 2015. Two tracks with different conditions on the use of training data were proposed. The handwritten images for this contest were drawn from the English "Bentham collection" dataset used in the tranScriptorium project. A small subset of this collection was chosen for the present HTR competition. The selected subset was written by several hands and presents significant variability and difficulty in text image quality, writing styles, and crossed-out text. This contest was clearly more difficult than the first edition, both for training and for testing. A portion of the training dataset and the full test dataset were provided in the form of carefully segmented line images, along with the corresponding transcripts. Another portion of the training dataset was provided as raw images with their corresponding transcripts at region level.


