ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records (ICDAR2019HDRC)

Ground Truth

Layout Analysis on structured historical document images

2019-08-29 (v. 1)

Contact author

Rajkumar Saini, Derek Dobson, Jon Morrey, Marcus Liwicki, Foteini Simistira Liwicki

LTU, Sweden

rajkumar.saini@ltu.se, marcus.liwicki@ltu.se

+46 (0)920 491006


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Keywords

ICDAR2019HDRC, Historical Chinese documents, Document Image Analysis, Textline Detection, segmentation

Description

The scope of this competition is to segment the page into different classes by assigning a different pixel value to each class. There are 2 annotated classes:

RGB=0b00...1000=0x000008: text (foreground)
RGB=0b00...0001=0x000001: non-text (background)

The training data will be available as pixel-labeled images. To avoid unfair penalties in the boundary regions, we add a value for boundary pixels:

RGB=0b10...0000=0x800000: boundary pixel (to be combined with one of the classes, except background)

For example, a boundary text pixel is represented as: boundary+text=0x800008
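The pixel encoding described above can be sketched in a few lines of Python. This is only an illustration, assuming the values are packed as 0xRRGGBB (so the class bits sit in the blue channel and the boundary bit in the red channel). The names `pack_rgb` and `decode_label` are hypothetical helpers and not part of any official competition tooling.

```python
# Bit flags taken from the dataset description, assumed packed as 0xRRGGBB.
TEXT = 0x000008        # text (foreground)
BACKGROUND = 0x000001  # non-text (background)
BOUNDARY = 0x800000    # boundary flag, combined with a class value


def pack_rgb(r, g, b):
    """Pack 8-bit R, G, B channel values into one 24-bit integer."""
    return (r << 16) | (g << 8) | b


def decode_label(r, g, b):
    """Return (class_name, is_boundary) for one ground-truth pixel."""
    value = pack_rgb(r, g, b)
    is_boundary = bool(value & BOUNDARY)
    class_bits = value & ~BOUNDARY  # clear the boundary flag
    if class_bits == TEXT:
        return "text", is_boundary
    if class_bits == BACKGROUND:
        return "background", is_boundary
    raise ValueError(f"unexpected label value 0x{value:06X}")


# A boundary text pixel, 0x800008, has R=0x80, G=0x00, B=0x08.
print(decode_label(0x80, 0x00, 0x08))
```

Masking out the boundary bit before comparing keeps the class test independent of whether a pixel lies on a boundary, which matches the "boundary+text" combination given in the description.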

File                           Type   Size    Downloads   Description
ICDAR2019HDRCgroundTruth.zip   data   99 MB   3           Ground Truth Files

Comments

Rajkumar Saini 09-02-2019 13:50
The ground truth for this task can be found in the 'segmentation' directory in the ground truth zip file.
