Handwritten Annotation Detection Dataset (AnnotationDB)
Handwriting, Annotation, Segmentation, Historic, German, Documents
The dataset contains 40 images for training and validation and 10 images for testing.
The document pages in the dataset come from multiple sources and were digitized using different devices. The resulting variance makes the dataset especially challenging for segmentation tasks.
All images are labeled with their respective ground truths, which are available in the PAGE format and as PNG files. The PNG files encode the classes in the blue color channel and allow for ambiguous regions (cf. ICDAR2017 Competition on Layout Analysis for Challenging Medieval Manuscripts).
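As a rough illustration of how the PNG ground truth might be inspected, the sketch below counts the values found in the blue channel and splits a value into its set bits. It assumes, following the competition format cited above, that each class occupies one bit of the blue channel, so that a pixel in an ambiguous region carries more than one bit. The actual class-to-value mapping of this dataset is not stated here and should be checked against the dataset documentation. The function names `blue_channel_histogram` and `split_bit_flags` are hypothetical helpers, not part of any dataset tooling.

```python
from collections import Counter


def blue_channel_histogram(pixels):
    """Count how often each blue-channel value occurs.

    `pixels` is any iterable of (R, G, B) tuples. How the PNG is loaded
    is left open; any loader that yields RGB tuples will do.
    """
    return Counter(blue for _red, _green, blue in pixels)


def split_bit_flags(value):
    """Return the individual bits set in an 8-bit value.

    Assumption: one bit per class, so a value with several bits set
    marks a pixel that belongs to several classes (an ambiguous region).
    """
    return [1 << i for i in range(8) if value & (1 << i)]


# Tiny synthetic example; these values are illustrative, not real labels.
sample_pixels = [(0, 0, 1), (0, 0, 1), (0, 0, 2), (128, 0, 6)]
histogram = blue_channel_histogram(sample_pixels)
ambiguous_values = [v for v in histogram if len(split_bit_flags(v)) > 1]
```

Listing the distinct blue-channel values first is a cautious way to confirm which encodings actually occur before hard-coding any class mapping.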