ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records (ICDAR2019HDRC)
Complete, integrated textline detection and recognition on a large dataset
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
The scope of this competition is to detect and recognize (OCR) the text lines in a given document image. The training data is also available in PAGE-XML format. Each PAGE-XML file contains the locations of the text lines and their corresponding text. Accordingly, a PAGE-XML file of the same form is expected as output for each input document image.
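As a rough illustration of the kind of file described above, the sketch below reads text-line polygons and transcriptions from a PAGE-XML string using only the Python standard library. It assumes the usual PAGE layout (`TextLine` elements holding a `Coords` element with a `points` attribute and a `TextEquiv`/`Unicode` child); the sample document, its namespace version, and the helper name `read_text_lines` are illustrative assumptions, not taken from the challenge data.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix so any PAGE schema version matches."""
    return tag.rsplit("}", 1)[-1]


def read_text_lines(xml_string):
    """Return a list of dicts with the id, polygon points and text of each TextLine.

    Hypothetical helper: assumes Coords carries a 'points' attribute of the
    form "x1,y1 x2,y2 ...".
    """
    root = ET.fromstring(xml_string)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        points, text = [], ""
        for child in elem:
            name = local_name(child.tag)
            if name == "Coords":
                raw = child.get("points", "")
                points = [tuple(int(v) for v in p.split(",")) for p in raw.split()]
            elif name == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        text = sub.text or ""
        lines.append({"id": elem.get("id"), "points": points, "text": text})
    return lines


# Made-up sample document, for illustration only.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="sample.jpg" imageWidth="100" imageHeight="200">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Coords points="10,10 30,10 30,180 10,180"/>
        <TextEquiv><Unicode>王氏家譜</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

if __name__ == "__main__":
    for line in read_text_lines(SAMPLE):
        print(line["id"], line["points"], line["text"])
```

Matching on local tag names rather than a fixed namespace is a deliberate choice here, since the exact schema version used by the dataset is not stated in this description.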
The evaluation is based on a graph-based string edit distance, averaged over all documents. Researchers are nevertheless free to evaluate performance with other methods as well.
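To make the idea of a per-document edit distance averaged over all documents concrete, here is a minimal sketch using the plain Levenshtein distance. It is not the graph-based variant used by the challenge, whose details are not given here; it assumes each document has already been reduced to one reference string and one predicted string, and the function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Plain Levenshtein distance (insert, delete, substitute all cost 1)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, 1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, 1):
            current.append(min(
                previous[j] + 1,                            # deletion
                current[j - 1] + 1,                         # insertion
                previous[j - 1] + (ref_char != hyp_char),   # substitution or match
            ))
        previous = current
    return previous[-1]


def mean_edit_distance(pairs):
    """Average the edit distance over (reference, hypothesis) document pairs."""
    if not pairs:
        raise ValueError("at least one document pair is required")
    return sum(edit_distance(ref, hyp) for ref, hyp in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up strings, for illustration only.
    documents = [("王氏家譜", "王氏家譜"), ("光緒三年", "光緒二年")]
    print(mean_edit_distance(documents))
```

Because the distance is computed on raw character sequences, it can be applied to Chinese text without tokenization; normalizing by reference length would be a separate design decision.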
The tool is found here.