ICFHR2018 Competition on Vietnamese Online Handwritten Text Recognition Database (HANDS-VNOnDB2018)
Vietnamese, Online Handwriting Database, ICFHR, recognition, competition
HANDS-VNOnDB2018 (VNOnDB2018 for short) is the database used for the ICFHR2018 Competition on Vietnamese Online Handwritten Text Recognition. It provides 1,146 Vietnamese paragraphs of handwritten text comprising 7,296 lines, more than 480,000 strokes, and more than 380,000 characters, written by 200 Vietnamese writers.
Writers were asked to freely write ground-truth text taken from a corpus of Vietnamese text. The ground-truth text is derived from the VieTreeBank corpus, which contains all Vietnamese characters and some special symbols because it is based on Vietnamese newspapers. To collect the patterns, we used Fujitsu PC Tablets (FMVT8170) with a stylus pen at a high sampling rate (120 Hz). Each sequence contains multiple lines with various delayed strokes. VNOnDB2018 is available for research purposes only. For commercial use or use in products for sale, please contact us via email.
The following is the structure of each InkML file in VNOnDB2018. There are two main sections: a description section (including description, content_category, language, writer index, gender, age, ...) and a trajectory data section (consisting of multiple "traceGroup" elements). Each "traceGroup" element contains the ground-truth text in a "Tg_Truth" tag and stroke data in "trace" elements, where each stroke is represented by the x- and y-coordinates of its points.
<Description>Cursive online handwriting</Description>
<trace id="tr_0_0"> x1 y1, x2 y2, x3 y3, ....</trace>
<trace id="tr_0_1"> x11 y11, x12 y12, x13 y13, ....</trace>
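The structure described above could be read with a short Python sketch such as the following. It is only an illustration, not an official loader: it assumes that "traceGroup" elements are not nested inside each other, that each point is an "x y" pair separated from the next by a comma, and that any XML namespace on the tags can be ignored. The names `local_name`, `parse_trace`, `read_trace_groups`, and the embedded `SAMPLE` document are hypothetical and are not part of the database.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag name without a '{namespace}' prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def parse_trace(text):
    """Turn 'x1 y1, x2 y2, ...' into a list of (x, y) float tuples."""
    points = []
    for pair in text.split(","):
        fields = pair.split()
        if len(fields) >= 2:  # skip empty or malformed entries
            points.append((float(fields[0]), float(fields[1])))
    return points


def read_trace_groups(root):
    """Collect ground-truth text and strokes for each traceGroup.

    Assumption: traceGroup elements are flat (not nested).
    """
    groups = []
    for elem in root.iter():
        if local_name(elem.tag) != "traceGroup":
            continue
        truth = None
        strokes = []
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "Tg_Truth":
                truth = (child.text or "").strip()
            elif name == "trace":
                strokes.append(parse_trace(child.text or ""))
        groups.append({"truth": truth, "strokes": strokes})
    return groups


# Hypothetical miniature document that mimics the layout described above.
SAMPLE = """<ink>
  <traceGroup id="0">
    <Tg_Truth>Viet Nam</Tg_Truth>
    <trace id="tr_0_0">10 20, 11 22, 13 25</trace>
    <trace id="tr_0_1">30 20, 31 21</trace>
  </traceGroup>
</ink>"""

groups = read_trace_groups(ET.fromstring(SAMPLE))
for group in groups:
    print(group["truth"], len(group["strokes"]), "strokes")
```

To read a file from one of the archives instead of the embedded sample, `ET.parse(path).getroot()` can be passed to `read_trace_groups`, where `path` is the location of an extracted InkML file.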
|InkData_line.zip||data||(239 MB)||311||The VNOnDB_Line compressed file contains online handwritten patterns with ground truth at the line level.|
|VNOnDB_ICFHR2018_dataSplit.zip||data||(7 KB)||288||This file contains the split of VNOnDB2018 into training, validation, and testing sets used for the ICFHR2018 Competition on Vietnamese Online Handwritten Text Recognition.|
|InkData_paragraph.zip||data||(239 MB)||197||The VNOnDB_Paragraph compressed file contains online handwritten patterns with ground truth at the paragraph level.|
|InkData_word.zip||data||(246 MB)||278||The VNOnDB_Word compressed file contains online handwritten patterns with ground truth at the word level.|
H. T. Nguyen, C. T. Nguyen, P. T. Bao, M. Nakagawa, "A database of unconstrained Vietnamese online handwriting and recognition experiments by recurrent neural networks," https://www.sciencedirect.com/science/article/pii/S0031320318300141