Persian Heritage Image Binarization Dataset (PHIBD 2012)
Binarization of PHIBD 2012 dataset
Binarization of handwritten document images.
There are two tasks, depending on the type of binarization method used.
- For regular binarization methods, the task is to binarize all 15 document images.
- For learning-based binarization methods, the task is to use images number 1 to 5 for training, and then binarize images number 6 to 15.
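The two task definitions above can be sketched as a small helper. This is only an illustration: it assumes the 15 images are numbered 1 to 15, and the names `ALL_IMAGES` and `split_for_task` are hypothetical, not part of the dataset package.

```python
# Assumption: the dataset holds 15 images, numbered 1 to 15.
ALL_IMAGES = list(range(1, 16))


def split_for_task(learning_based):
    """Return (training, evaluation) image numbers for the chosen task."""
    if learning_based:
        # Images 1 to 5 are used for training, the rest are binarized and scored.
        return ALL_IMAGES[:5], ALL_IMAGES[5:]
    # Regular methods need no training and binarize every image.
    return [], ALL_IMAGES
```

For example, `split_for_task(True)` gives the training images 1 to 5 and the evaluation images 6 to 15.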
A few baseline methods are provided: the phase congruency (PC) binarization method [Ziaei2012] and the SGL/BGL binarization method [Farrahi2009, Farrahi2010]. The SGL/BGL method uses a rough binarization as its initialization.
- [Ziaei2013] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam, and Mohamed Cheriet. Persian historical document dataset with introduction to PhaseGT: A ground truthing application, to be submitted to ICDAR’13.
- [Ziaei2012] Hossein Ziaei Nafchi, Reza Farrahi Moghaddam and Mohamed Cheriet, Historical Document Binarization Based on Phase Information of Images, in ACCV’12 Workshop on e-Heritage, Daejeon, South Korea, Nov 5-10, 2012.
- [Farrahi2009] Reza Farrahi Moghaddam, and Mohamed Cheriet, RSLDI: Restoration of single-sided low-quality document images, Pattern Recognition, Volume 42, Issue 12, p.3355–3364 (2009) DOI: 10.1016/j.patcog.2008.10.021
- [Farrahi2010] Reza Farrahi Moghaddam, and Mohamed Cheriet, A multi-scale framework for adaptive binarization of degraded document images, Pattern Recognition, Volume 43, Issue 6, p.2186–2198 (2010) DOI: 10.1016/j.patcog.2009.12.024
- [Cheriet2012] Mohamed Cheriet, Reza Farrahi Moghaddam, and Rachid Hedjam, A learning framework for the optimization and automation of document binarization methods, Computer Vision and Image Understanding, accepted (2012) DOI: 10.1016/j.cviu.2012.11.003
- For regular methods, the performance measure is the average F-measure of all binarized images against the provided ground truth.
- For learning-based methods, the performance measure is the average F-measure of binarized images number 6 to 15 against the provided ground truth.
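As a rough illustration of the criteria above, the sketch below computes a pixel-level F-measure (the harmonic mean of precision and recall) and averages it over several images. It assumes each image is a 2-D list in which 1 marks a text (ink) pixel and 0 marks background; that encoding and the function names are assumptions made for this example, and the official evaluation code may differ.

```python
def f_measure(pred, truth):
    """Pixel-level F-measure of one binarized image against its ground truth.

    Assumption: both images are 2-D lists of equal size, 1 = text pixel.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # text pixel correctly detected
            elif p and not t:
                fp += 1  # background wrongly marked as text
            elif t and not p:
                fn += 1  # text pixel missed
    if tp == 0:
        # No correctly detected text pixel: the score is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f_measure(preds, truths):
    """Mean F-measure over paired lists of binarized and ground-truth images."""
    scores = [f_measure(p, t) for p, t in zip(preds, truths)]
    return sum(scores) / len(scores)
```

For a learning-based method, `preds` and `truths` would hold only images number 6 to 15.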
A metacode of a learning-based binarization method based on stroke gray level (SGL) and background gray level (BGL) is provided. An executable of the method will be provided in the near future.
The proposed learning-based binarization method uses the SGL and the BGL to determine a locally adaptive threshold value based on a parameter (alpha). Selecting the optimal value of this parameter is the learning part of the method.
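The idea can be sketched as follows. This is not the provided metacode: the threshold rule `SGL + alpha * (BGL - SGL)` is an assumption chosen only to illustrate a threshold placed between the two gray levels, and the helper names `binarize` and `learn_alpha`, the candidate grid, and the `score` callback are all hypothetical. Learning is shown as a plain grid search over alpha on the training images.

```python
def binarize(gray, sgl, bgl, alpha):
    """Mark a pixel as text (1) when it is darker than a local threshold.

    gray, sgl, bgl are 2-D lists of equal size: the input image and the
    per-pixel stroke and background gray-level estimates.
    Assumed rule (illustrative only): threshold = SGL + alpha * (BGL - SGL).
    """
    out = []
    for g_row, s_row, b_row in zip(gray, sgl, bgl):
        out.append([
            1 if g < s + alpha * (b - s) else 0
            for g, s, b in zip(g_row, s_row, b_row)
        ])
    return out


def learn_alpha(train_set, score, candidates):
    """Pick the candidate alpha with the highest mean score on the training set.

    train_set: list of (gray, sgl, bgl, truth) tuples.
    score: function(binarized, truth) -> float, for example an F-measure.
    """
    def mean_score(alpha):
        total = sum(score(binarize(g, s, b, alpha), t)
                    for g, s, b, t in train_set)
        return total / len(train_set)

    return max(candidates, key=mean_score)
```

With the task defined above, `train_set` would be built from images number 1 to 5, and the learned alpha would then be applied to images number 6 to 15.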
|Task.zip||data||(2 MB)||12||Binarization task|