ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records (ICDAR2019HDRC)

Ground Truth

Complete, integrated textline detection and recognition on a large dataset

2019-08-29 (v. 1)

Contact author

Rajkumar Saini, Derek Dobson, Jon Morrey, Marcus Liwicki, Foteini Simistira Liwicki

LTU, Sweden

rajkumar.saini@ltu.se, marcus.liwicki@ltu.se

+46 (0)920 491006

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


ICDAR2019HDRC, Historical Chinese documents, Document Image Analysis, end to end


The scope of this competition is to detect (bounding box) and recognize (OCR) text lines in a given document image. The training data will also be available in PAGE-XML format. Each PAGE-XML file will contain the locations of the text lines and their corresponding text.
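As a rough illustration of how the line locations and text might be read from such a file, here is a minimal Python sketch. It assumes the common PAGE-XML layout in which each TextLine element has a Coords child with a "points" attribute ("x,y x,y ...") and a TextEquiv/Unicode child holding the transcription. The exact schema version used by this dataset is not stated here, so the sample document, its namespace, and the helper names (local_name, read_text_lines) are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET

# Made-up sample for illustration only; not taken from the dataset.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Coords points="100,50 160,50 160,900 100,900"/>
        <TextEquiv><Unicode>王氏家譜</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across schema versions."""
    return tag.rsplit("}", 1)[-1]


def read_text_lines(xml_text):
    """Return one dict per TextLine with its id, bounding box, and text.

    The bounding box is (x_min, y_min, x_max, y_max), derived from the
    polygon in the assumed Coords/@points attribute.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        points, text = [], ""
        # Only direct children are inspected, so word-level text is ignored.
        for child in elem:
            name = local_name(child.tag)
            if name == "Coords" and child.get("points"):
                points = [
                    tuple(int(float(v)) for v in pair.split(","))
                    for pair in child.get("points").split()
                ]
            elif name == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        text = sub.text or ""
        bbox = None
        if points:
            xs = [x for x, _ in points]
            ys = [y for _, y in points]
            bbox = (min(xs), min(ys), max(xs), max(ys))
        lines.append({"id": elem.get("id"), "bbox": bbox, "text": text})
    return lines


if __name__ == "__main__":
    for line in read_text_lines(SAMPLE):
        print(line["id"], line["bbox"], line["text"])
```

Matching on local tag names rather than a fixed namespace is a deliberate choice here, since PAGE-XML files from different tool versions declare different namespace URIs.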


Rajkumar Saini 09-02-2019 13:47
The ground truth for the other tasks also contains the XML files required for this task.
