ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records (ICDAR2019HDRC)

Ground Truth

Layout Analysis on structured historical document images

2019-08-29 (v. 1)

Contact author

Rajkumar Saini, Derek Dobson, Jon Morrey, Marcus Liwicki, Foteini Simistira Liwicki

LTU, Sweden

rajkumar.saini@ltu.se, marcus.liwicki@ltu.se

+46 (0)920 491006

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


ICDAR2019HDRC, Historical Chinese documents, Document Image Analysis, Textline Detection, segmentation


The scope of this competition is to segment the page into different classes by assigning a different pixel value to each class. There are 2 annotated classes:

RGB=0b00...1000=0x000008: text (foreground)
RGB=0b00...0001=0x000001: non-text (background)

The training data will be available as pixel-labeled images. To avoid unfair penalties for the boundary regions, we add a value for boundary pixels:

RGB=0b10...0000=0x800000: boundary pixel (to be combined with one of the classes, except background)

For example, a boundary text pixel is represented as: boundary+text=0x800008
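The label encoding described above can be illustrated with a minimal Python sketch. It assumes that the three 8-bit channels are packed as 0xRRGGBB, so that the boundary flag 0x800000 is the top bit of the red channel and the class value sits in the blue channel. The function names rgb_to_label and decode_label are hypothetical and are not part of the competition tooling.

```python
# Label values taken from the dataset description.
TEXT = 0x000008        # text (foreground)
BACKGROUND = 0x000001  # non-text (background)
BOUNDARY = 0x800000    # boundary flag, combined with a class value


def rgb_to_label(r, g, b):
    """Pack 8-bit R, G, B values into one 24-bit integer (assumed 0xRRGGBB)."""
    return (r << 16) | (g << 8) | b


def decode_label(value):
    """Return (class_name, is_boundary) for a packed 24-bit label value."""
    is_boundary = bool(value & BOUNDARY)
    # Clear the boundary bit so only the class bits remain.
    class_bits = value & ~BOUNDARY & 0xFFFFFF
    if class_bits == TEXT:
        name = "text"
    elif class_bits == BACKGROUND:
        name = "background"
    else:
        name = "unknown"
    return name, is_boundary


# Three sample pixels: plain text, plain background, boundary text.
examples = [(0, 0, 8), (0, 0, 1), (128, 0, 8)]
decoded = [decode_label(rgb_to_label(*px)) for px in examples]
for px, result in zip(examples, decoded):
    print(px, result)
```

Under this assumption, an evaluation script could test the boundary bit first and treat such pixels leniently, then compare only the remaining class bits against the prediction.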

ICDAR2019HDRCgroundTruth.zip  data (99 MB)  21  Ground Truth Files


Rajkumar Saini 09-02-2019 13:50
The ground truth for this task can be found in the 'segmentation' directory in the ground truth zip file.
