ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records (ICDAR2019HDRC)

Ground Truth

Handwritten Character Recognition on extracted textlines

2019-08-29 (v. 1)

Contact author

Rajkumar Saini, Derek Dobson, Jon Morrey, Marcus Liwicki, Foteini Simistira Liwicki

LTU, Sweden

rajkumar.saini@ltu.se, marcus.liwicki@ltu.se

+46 (0)920 491006


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Keywords

ICDAR2019HDRC, Historical Chinese documents, Document Image Analysis, Textline Recognition

Description

The goal of this competition is to recognize (OCR) the provided extracted text lines and, where possible, to find the segmentation points between characters. The advantage of the character-level task is that, once the characters are segmented and recognized, synthetic historical images can be generated from them. Training data will also be available in PAGE-XML format.

File                          Type  Size   Downloads  Description
ICDAR2019HDRCgroundTruth.zip  data  99 MB  1          Ground Truth Files

Comments

Rajkumar Saini 09-02-2019 13:48
The text line bounding boxes and ground truth can be found in the XML files provided in the ground truth zip file.
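
As a rough illustration of how the bounding boxes and transcriptions might be read from such XML files, here is a minimal Python sketch. It assumes the files follow the usual PAGE-XML layout, in which each TextLine element has a Coords child with a `points` attribute and a TextEquiv/Unicode child holding the transcription. The actual structure of the files in the zip is not confirmed here, and the sample document, the namespace version, and the function name `read_text_lines` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in PAGE-XML style; not taken from the dataset.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="100" imageHeight="200">
    <TextRegion id="r1">
      <Coords points="0,0 100,0 100,200 0,200"/>
      <TextLine id="l1">
        <Coords points="10,20 40,20 40,180 10,180"/>
        <TextEquiv><Unicode>王氏家譜</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so any PAGE schema version matches."""
    return tag.rsplit("}", 1)[-1]


def read_text_lines(root):
    """Return a list of dicts with id, bounding box, and text per TextLine.

    Assumes Coords carries a 'points' attribute of the form "x1,y1 x2,y2 ...".
    The bounding box is (min_x, min_y, max_x, max_y), or None if no points.
    """
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        points, text = [], ""
        for child in elem:
            name = local_name(child.tag)
            if name == "Coords":
                raw = child.get("points", "")
                points = [
                    tuple(int(round(float(v))) for v in pair.split(","))
                    for pair in raw.split()
                ]
            elif name == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        text = sub.text or ""
        bbox = None
        if points:
            xs = [x for x, _ in points]
            ys = [y for _, y in points]
            bbox = (min(xs), min(ys), max(xs), max(ys))
        lines.append({"id": elem.get("id"), "bbox": bbox, "text": text})
    return lines


lines = read_text_lines(ET.fromstring(SAMPLE))
```

To read a file from the extracted archive instead of the sample string, pass `ET.parse(path).getroot()` to `read_text_lines`. Matching on local tag names rather than a fixed namespace is a deliberate choice here, since the schema version used by the ground truth files is not stated.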
