TUAT Nakagawa Lab. HANDS - Vietnamese Online Handwriting Database (HANDS-VNOnDB)

2016-11-25 (v. 1)

Contact author

NGUYEN TUAN HUNG

Tokyo University of Agriculture and Technology

ntuanhung@gmail.com

+81-423-88-7144

You can cite this dataset as: NGUYEN TUAN HUNG, TUAT Nakagawa Lab. HANDS - Vietnamese Online Handwriting Database (HANDS-VNOnDB), version 1, ID: HANDS-VNOnDB_1, URL: http://tc11.cvc.uab.es/datasets/HANDS-VNOnDB_1

Dataset Information

Keywords

Vietnamese, Online Handwriting Database

Description

HANDS-VNOnDB (VNOnDB for short) provides 1,146 paragraphs of Vietnamese handwritten text, comprising 7,296 lines, more than 480,000 strokes, and more than 380,000 characters written by 200 Vietnamese writers. Writers were asked to freely write ground-truth text taken from a corpus of Vietnamese text. The ground-truth text is derived from the VieTreeBank corpus, which is based on Vietnamese newspapers and therefore contains all Vietnamese characters and some special symbols. The patterns were collected on Fujitsu tablet PCs (FMVT8170) with a stylus pen at a high sampling rate (120 Hz). Each sequence contains multiple lines with various delayed strokes. VNOnDB is available for research purposes only. For commercial use, including use in products for sale, please contact us via email.

Technical Details

The ink corresponding to each writer with a certain ground-truth text is stored in an InkML file. An InkML file mainly contains three kinds of information:

  1. General information such as a description, the writer ID, and the ground-truth text ID;
  2. The ink, which is a set of traces made of points;
  3. The paragraph-level ground truth, which is contained at the beginning of each TraceGroup.
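
As an illustration of the three kinds of information listed above, the following minimal Python sketch reads them from an InkML document using only the standard library. The element names (ink, annotation, traceGroup, trace) and the namespace come from the W3C InkML specification. The annotation type values (writerID, truth), the embedded sample document, and the helper names parse_trace and read_inkml are assumptions made for this example; the actual VNOnDB files may use different annotation types or nest trace groups differently, so inspect a real file before relying on this.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C InkML specification.
INKML_NS = {"ink": "http://www.w3.org/2003/InkML"}

# Hypothetical miniature InkML document, NOT taken from the dataset.
# The annotation types "writerID" and "truth" are assumptions.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <annotation type="writerID">w001</annotation>
  <traceGroup>
    <annotation type="truth">Việt Nam</annotation>
    <trace id="t0">10 20, 11 22, 13 25</trace>
    <trace id="t1">30 20, 31 21</trace>
  </traceGroup>
</ink>"""


def parse_trace(text):
    """Turn 'x y, x y, ...' into a list of (x, y) float tuples."""
    points = []
    for chunk in text.strip().split(","):
        values = chunk.split()
        if len(values) >= 2:  # ignore empty or malformed chunks
            points.append((float(values[0]), float(values[1])))
    return points


def read_inkml(xml_text):
    """Return (general_info, groups) parsed from an InkML string."""
    root = ET.fromstring(xml_text)

    # 1. General information: annotations that are direct children of <ink>.
    info = {
        a.get("type"): (a.text or "").strip()
        for a in root.findall("ink:annotation", INKML_NS)
    }

    groups = []
    for group in root.findall("ink:traceGroup", INKML_NS):
        # 3. Ground truth: first annotation at the start of the trace group.
        truth = group.find("ink:annotation", INKML_NS)
        # 2. Ink: each trace is one stroke, stored as a list of points.
        strokes = [
            parse_trace(t.text or "")
            for t in group.findall("ink:trace", INKML_NS)
        ]
        groups.append({
            "truth": (truth.text or "").strip() if truth is not None else None,
            "strokes": strokes,
        })
    return info, groups


info, groups = read_inkml(SAMPLE)
print(info)
print(groups[0]["truth"], len(groups[0]["strokes"]))
```

To process a file on disk instead of the embedded sample, the same function can be called on the file's contents read as UTF-8 text.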

The total size of the dataset is approximately 300 MB (100 MB zipped).

File | Type | Size | Downloads | Description
VNOnDB_Paragraph.zip | data | 102 MB | 8 | Compressed file of online handwritten patterns with ground truth at the paragraph level. It also contains the lists of files used in the train, validation, and test sets.
VNOnDB_Line.zip | article | 100 MB | 7 | Compressed file of online handwritten patterns with ground truth at the line level. It also contains the lists of files used in the train, validation, and test sets.
