REID2019 Competition Dataset (Recognition of Early Indian Printed Documents) (REID2019)

2019-05-31 (v. 1)

Contact author

Christian Clausner

University of Salford

c.clausner@salford.ac.uk

01612956749

You can cite this dataset as: Christian Clausner, REID2019 Competition Dataset (Recognition of Early Indian Printed Documents) (REID2019), 1, ID: REID2019_1, URL: http://tc11.cvc.uab.es/datasets/REID2019_1

Dataset Information

Dataset URL

https://www.primaresearch.org/datasets/REID2019

Keywords

Bengali, OCR, Segmentation, Printed

Description

For the most part, the scanned images contain a single column of text lines, with a small number also containing illustrations. Some pages contain marginal data such as numbers, handwritten notes, and decorative frames. The evaluation set consisted of 56 images, chosen as a representative sample to ensure a balanced presence of the different issues affecting layout analysis and OCR. Such issues include non-straight text lines, show-through or bleed-through, faded ink, decorations, the presence of non-rectangular regions, varying text column widths, varying font sizes, the presence of separators, and various aging- and scanning-related issues. In addition to the evaluation set, 25 representative images were selected as the example set that was provided to the authors with ground truth.
