TUAT Nakagawa Lab. HANDS - Vietnamese Online Handwriting Database (HANDS-VNOnDB)
Dataset Information
Keywords
Vietnamese, Online Handwriting Database
Description
HANDS-VNOnDB (VNOnDB for short) provides 1,146 Vietnamese paragraphs of handwritten text comprising 7,296 lines, more than 480,000 strokes, and more than 380,000 characters, written by 200 Vietnamese writers. Writers were asked to write the ground-truth text freely, without constraints on writing style. Our ground-truth text is derived from the VieTreeBank corpus, which is based on Vietnamese newspapers and therefore contains all Vietnamese characters and some special symbols. To collect the patterns, we used Fujitsu Tablet PCs (FMVT8170) with a stylus pen at a high sampling rate (120 Hz). Each sequence contains multiple lines with various delayed strokes. VNOnDB is available for research purposes only. For commercial use, including use in products for sale, please contact us via email.
Technical Details
The ink corresponding to each writer with a certain ground-truth text is stored in an InkML file. An InkML file mainly contains three kinds of information:
- General information such as the description, writer id, ground-truth text id, etc.;
- The ink, stored as a set of traces made of points;
- The paragraph-level ground truth, stored at the beginning of each TraceGroup.
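The three kinds of information listed above can be read with a standard XML parser. The following is a minimal sketch, not the official loader for this dataset. It assumes that each trace holds comma-separated points whose first two values are plain x and y coordinates, and that the ground truth is an `annotation` element placed as the first child of a `traceGroup`. The function name `parse_inkml` and the `SAMPLE` document are hypothetical and only illustrate the layout; check them against a real file from the archive before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature InkML document, used only for illustration.
# Real VNOnDB files may carry more metadata and a different layout.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <traceGroup>
    <annotation type="truth">xin chào</annotation>
    <trace>10 20, 11 22, 13 25</trace>
    <trace>30 40, 31 41</trace>
  </traceGroup>
</ink>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_inkml(xml_text):
    """Return (traces, truths) from an InkML string.

    traces: list of strokes, each a list of (x, y) float tuples.
    truths: text of every annotation found as the first child of a traceGroup.
    Assumption: points are plain absolute values, without InkML's
    difference-encoding prefixes.
    """
    root = ET.fromstring(xml_text)
    traces = []
    truths = []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "trace":
            points = []
            for chunk in (elem.text or "").split(","):
                values = chunk.split()
                if len(values) >= 2:
                    points.append((float(values[0]), float(values[1])))
            traces.append(points)
        elif name == "traceGroup":
            first = next(iter(elem), None)
            if first is not None and local_name(first.tag) == "annotation":
                truths.append((first.text or "").strip())
    return traces, truths


if __name__ == "__main__":
    strokes, labels = parse_inkml(SAMPLE)
    print(len(strokes), "strokes; ground truth:", labels)
```

To process a file from the archive, read it as text (for example with `open(path, encoding="utf-8").read()`) and pass the result to `parse_inkml`. Matching on local tag names keeps the sketch independent of whether the files declare the InkML namespace.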
The total size of the dataset is about 300 MB (about 100 MB zipped).
File | Type | Size | Downloads | Description |
---|---|---|---|---|
VNOnDB_Paragraph.zip | data | (102 MB) | 144 | This compressed file contains the online handwritten patterns with ground truth at the paragraph level. It also contains the lists of files used in the training, validation, and test sets. |
VNOnDB_Line.zip | article | (100 MB) | 155 | This compressed file contains the online handwritten patterns with ground truth at the line level. It also contains the lists of files used in the training, validation, and test sets. |