Persian Heritage Image Binarization Dataset (PHIBD 2012) (PHIBD 2012)

Research Tasks

Binarization of PHIBD 2012 dataset

2018-08-03 (v. 1)

Contact author

Hossein Ziaie Nafchi, Seyed Morteza Ayatollahi, Reza Farrahi Moghaddam, and Mohamed Cheriet

Synchromedia Laboratory ETS, Montreal, (Quebec) Canada H3C 1K3

mohamed.cheriet@etsmtl.ca

+1(514)396-8972

+1(514)396-8595

Description

Binarization of handwritten Document Images.

There are actually two tasks, depending on the nature of the binarization method used.

  1. For regular binarization methods, the task is to binarize all 15 document images.
  2. For learning-based binarization methods, the task is to use images number 1 to 5 for training, and then binarize images number 6 to 16.

A few baseline methods have been provided: PC (phase congruency) binarization method [Ziaei2012], and SGL/BGL binarization method [Farrahi2009, Farrahi2010]. The SGL/BGL method uses a rough binarization as its initialization.

References

  • [Ziaei2013] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam, and Mohamed Cheriet. Persian historical document dataset with introduction to PhaseGT: A ground truthing application, to be submitted to ICDAR’13.
  • [Ziaei2012] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam and Mohamed Cheriet, Historical Document Binarization Based on Phase Information of Images, in ACCV’12 Workshop on e-Heritage, Daejeon, South Korea, Nov 5-10, 2012.
  • [Farrahi2009] Reza Farrahi Moghaddam, and Mohamed Cheriet, RSLDI: Restoration of single-sided low-quality document images, Pattern Recognition, Volume 42, Issue 12, p.3355–3364 (2009) DOI: 10.1016/j.patcog.2008.10.021
  • [Farrahi2010] Reza Farrahi Moghaddam, and Mohamed Cheriet, A multi-scale framework for adaptive binarization of degraded document images, Pattern Recognition, Volume 43, Issue 6, Number 6, p.2186–2198 (2010) DOI: 10.1016/j.patcog.2009.12.024
  • [Cheriet2012] Mohamed Cheriet, Reza Farrahi Moghaddam, and Rachid Hedjam, A learning framework for the optimization and automation of document binarization methods, Computer Vision and Image Understanding, Volume Accepted, p.– (2012) DOI: 10.1016/j.cviu.2012.11.003

 

Protocol

  1. For regular methods, the average F-measure of the binarized images against the provided ground truth is used as the performance of the binarization method in question.
  2. For learning-based methods, the average F-measure of the binarized images number 6 to 15 against the provided ground truth is used as the performance of the binarization method in question.

PHIDB Binarization

 

 

 

 

 

 

A metacode of a learning-based binarization method based on stroke gray level (SGL) and background gray level (BGL) is provided. The executable of the method will be provided in near future.

The proposed learning-based binarization method uses the SGL and the BGL to determine a locally-adaptive threshold value based on a parameter (alpha). The optimal selection of this parameter is the learning part of this method.

FileTypeSizeDownloadsDescription
Task.zipdata(2 MB)2Binarization task

Comments

No comments on this dataset yet.

Add your comment

In order to comment on a dataset you need to be logged on
Register Now!

Valoration

In order to rate this dataset you need to be logged on
Register Now!