Handwritten Chess Scoresheet Dataset (HCS)
Dataset Information
Keywords
Handwriting Recognition, Latin Handwriting, Chess, Score Sheet
Description
The Handwritten Chess Scoresheet (HCS) dataset is an annotated collection of chess scoresheet images from live chess events. It was created to support research on handwriting recognition of chess moves, with the goal of automating the scoresheet digitization process. The dataset contains 158 games comprising 215 pages of scoresheets, digitized using a standard cellphone camera in natural lighting conditions. The images are tightly cropped, and a standard corner-detection-based transformation is applied to eliminate perspective distortion. The headers and footers were also cropped out of each image to maintain player anonymity. Each scoresheet has a total of 120 move boxes, though in most cases a given scoresheet contains some blank boxes.
Also attached (please check the "Ground Truth" section) are two text files that contain ground truth data for the scoresheets. A folder named "empty scoresheets" contains a few scoresheets with no handwriting on them. These are included intentionally to show the format of the chess scoresheets used and to help with extracting individual text boxes from scoresheet images.
Authors:
- Eicher, Owen
- Farmer, Denzel
- Li, Yiyan
- Majid, Nishatul
Technical Details
Scoresheet images are stored as PNG files and follow the naming convention [Game #]_[Page #].png. All ground truth labels are stored in one of two text files, either training_tags.txt or testing_tags.txt. Training labels use the format [Game #]_[Page #]_[Move #]_[black/white] [label], with a single space between the move description and the label. Testing labels are formatted similarly, except that the page number is omitted, since the ground truth for a game applies to both sheets: [Game #]_[Move #]_[black/white] [label]. Not all games have two pages, however, so testing tags are given only for the few games that have two pages available.
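The label formats described above can be parsed with a few lines of Python. The following is a minimal sketch rather than an official loader: the names `MoveTag`, `parse_tag_line`, and `image_name` are hypothetical, and the sample lines in the usage note are invented for illustration. It assumes that the game, page, and move fields contain no underscores, and it keeps every field as a string so that any zero-padding used in the files is preserved.

```python
from typing import NamedTuple, Optional


class MoveTag(NamedTuple):
    """One ground-truth entry (hypothetical container, not part of the dataset)."""
    game: str
    page: Optional[str]  # None for testing tags, which omit the page number
    move: str
    player: str          # "black" or "white"
    label: str


def parse_tag_line(line: str) -> MoveTag:
    """Parse one line of training_tags.txt or testing_tags.txt."""
    # The move description and the label are separated by a single space.
    key, label = line.rstrip("\n").split(" ", 1)
    fields = key.split("_")
    if len(fields) == 4:
        # Training format: [Game #]_[Page #]_[Move #]_[black/white]
        game, page, move, player = fields
    elif len(fields) == 3:
        # Testing format: [Game #]_[Move #]_[black/white]
        game, move, player = fields
        page = None
    else:
        raise ValueError(f"unexpected tag format: {line!r}")
    if player not in ("black", "white"):
        raise ValueError(f"unexpected player field: {player!r}")
    return MoveTag(game, page, move, player, label.strip())


def image_name(tag: MoveTag) -> str:
    """Return the [Game #]_[Page #].png file name for a training tag."""
    if tag.page is None:
        raise ValueError("testing tags carry no page number")
    return f"{tag.game}_{tag.page}.png"
```

For example, with an invented training line, `parse_tag_line("001_0_12_white Nf3")` yields a tag whose `image_name` is `001_0.png`, while an invented testing line such as `"001_12_black e5"` yields a tag with `page` set to `None`.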
A single .zip file containing all of the data is 367 MB in size, with the vast majority of that file size dedicated to scoresheet images.
File | Type | Size | Downloads | Description |
---|---|---|---|---|
HCS.zip | data | 367 MB | 88 | Folder containing all images and associated ground truth data. |