Handwritten Chess Scoresheet Dataset (HCS)

2021-07-04 (v. 1)

Contact author

Owen Eicher

Colorado School of Mines

oeicher@mymail.mines.edu

(970)-946-8898

You can cite this dataset as: Owen Eicher, Handwritten Chess Scoresheet Dataset (HCS), 1, ID: HCS_1, URL: https://tc11.cvc.uab.es/datasets/HCS_1

Dataset Information

Keywords

Handwriting Recognition, Latin Handwriting, Chess, Score Sheet

Description

The Handwritten Chess Scoresheet (HCS) dataset is an annotated collection of chess scoresheet images from live chess events. It was created to support research on handwriting recognition of chess moves, with the aim of automating the scoresheet digitization process. The dataset contains 158 games comprising 215 pages of scoresheets, digitized using a standard cellphone camera in natural lighting conditions. The images are tightly cropped, and a standard corner detection-based transformation is applied to eliminate perspective distortion. The headers and footers were also cropped out of each image to maintain player anonymity. Each scoresheet has a total of 120 move boxes, though in most cases some of the boxes on a given scoresheet are blank.

Also attached (please check the "Ground Truth" section) are two text files that contain ground truth data for the scoresheets. A folder named "empty scoresheets" contains a few scoresheets with no handwriting on them. These are included in the dataset intentionally to document the format of the chess scoresheets used and to help with extracting individual text boxes from scoresheet images.

 

Authors:

  • Eicher, Owen
  • Farmer, Denzel
  • Li, Yiyan
  • Majid, Nishatul

Technical Details

Scoresheet images are stored as PNG files and follow the naming convention [Game #]_[Page #].png. All ground truth labels are stored in one of two text files, either training_tags.txt or testing_tags.txt. Training labels use the format [Game #]_[Page #]_[Move #]_[black/white] [label], with a single space between the move description and the label. Testing labels are formatted similarly, except that the page number is omitted, since the ground truth for a game applies to both sheets: [Game #]_[Move #]_[black/white] [label]. Not all games have two pages, however, so testing tags are given only for a select few games for which two pages are available.
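
The label formats described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset: the names MoveTag, parse_tag and image_name are hypothetical, and it assumes that the game, page and move fields contain no underscores or spaces and that the color field is written as "white" or "black". The game and page fields are kept as strings so that any zero-padding used in the files is preserved when rebuilding an image name.

```python
from typing import NamedTuple, Optional


class MoveTag(NamedTuple):
    """One ground truth entry (hypothetical helper type, not shipped with HCS)."""
    game: str             # kept as text so any zero-padding is preserved
    page: Optional[str]   # None for testing tags, which omit the page number
    move: int
    color: str            # "white" or "black"
    label: str            # the transcribed move text


def parse_tag(line: str) -> MoveTag:
    """Parse one line of training_tags.txt or testing_tags.txt.

    Assumed layouts, taken from the dataset description:
      training: [Game #]_[Page #]_[Move #]_[black/white] [label]
      testing:  [Game #]_[Move #]_[black/white] [label]
    """
    # A single space separates the move description from the label.
    key, sep, label = line.rstrip("\r\n").partition(" ")
    if not sep:
        raise ValueError(f"no space-separated label in line: {line!r}")

    parts = key.split("_")
    if len(parts) == 4:        # training format, page number present
        game, page, move, color = parts
    elif len(parts) == 3:      # testing format, page number omitted
        game, move, color = parts
        page = None
    else:
        raise ValueError(f"unexpected move description: {key!r}")

    color = color.lower()
    if color not in ("white", "black"):
        raise ValueError(f"unexpected color field: {color!r}")

    return MoveTag(game, page, int(move), color, label)


def image_name(game: str, page: str) -> str:
    """Build a scoresheet file name following [Game #]_[Page #].png."""
    return f"{game}_{page}.png"
```

With these helpers, a training line can be linked back to its scoresheet image with image_name(tag.game, tag.page). A testing tag has no page field, so the move it describes may appear on either sheet of the game.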

A single .zip file containing all of the data is 367 MB in size, with the vast majority of that file size dedicated to scoresheet images.

File     Type  Size    Downloads  Description
HCS.zip  data  367 MB  88         Folder containing all images and associated ground truth data.