ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records (ICDAR2019HDRC)

Ground Truth

Handwritten Character Recognition on extracted textlines

2019-08-29 (v. 1)

Contact author

Rajkumar Saini, Derek Dobson, Jon Morrey, Marcus Liwicki, Foteini Simistira Liwicki

LTU, Sweden

rajkumar.saini@ltu.se, marcus.liwicki@ltu.se

+46 (0)920 491006

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


ICDAR2019HDRC, Historical Chinese documents, Document Image Analysis, Textline Recognition


The goal of this competition is to recognize (OCR) the given extracted text lines and, if possible, to locate the segmentation points between characters. The advantage of the character-level task is that, once the characters are segmented and recognized, synthetic historical images can be generated from them. Training data will also be available in PAGE-XML format.
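Since the training data is offered in PAGE-XML, a minimal sketch of reading text-line bounding boxes and transcriptions from such a file may be useful. It assumes the common PAGE layout in which each TextLine carries a Coords element with a points attribute and a TextEquiv/Unicode transcription; the exact schema version and element nesting used in this dataset are not stated on this page, so the sample document, the function names, and the example transcription below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, NOT taken from the dataset: one text line in PAGE-XML style.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="200" imageHeight="400">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Coords points="10,20 40,20 40,380 10,380"/>
        <TextEquiv><Unicode>王氏家譜</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so the code does not depend on the schema version."""
    return tag.rsplit("}", 1)[-1]


def parse_points(points_attr):
    """Turn a points string such as '10,20 110,20' into [(10, 20), (110, 20)]."""
    return [tuple(int(float(v)) for v in pair.split(","))
            for pair in points_attr.split()]


def read_text_lines(xml_text):
    """Return one dict per TextLine with its id, bounding box, and transcription."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        points, text = [], None
        for child in elem:
            name = local_name(child.tag)
            if name == "Coords" and child.get("points"):
                points = parse_points(child.get("points"))
            elif name == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        text = sub.text or ""
        bbox = None
        if points:
            # Axis-aligned box (x_min, y_min, x_max, y_max) around the polygon.
            xs, ys = zip(*points)
            bbox = (min(xs), min(ys), max(xs), max(ys))
        lines.append({"id": elem.get("id"), "bbox": bbox, "text": text})
    return lines


if __name__ == "__main__":
    for line in read_text_lines(SAMPLE):
        print(line["id"], line["bbox"], line["text"])
```

Matching on local element names rather than a fixed namespace is a deliberate choice here: PAGE namespaces differ between schema releases, and the release used for these files is not given. If the files store coordinates as individual Point elements instead of a points attribute, the Coords branch would need to be adapted.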

ICDAR2019HDRCgroundTruth.zip | data (99 MB) | 21 | Ground Truth Files


Rajkumar Saini 09-02-2019 13:48
The text-line bounding boxes and ground truth can be found in the XML files provided in the ground truth zip file.
