ICFHR 2014 CROHME: Fourth International Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME-2014)

Research Tasks

Mathematical Expression Recognition

2015-02-16 (v. 1)

Contact author

Harold Mouchère

University of Nantes / IRCCyN


+33 2-40-68-30-82


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


The difficulty of recognizing mathematical expressions depends on the number of different symbols, the number of allowed layouts, and the grammar used. The competition defines 4 levels (tasks), from 41 symbols to 101 symbols, with increasing difficulty in the grammar of allowed expressions.


The competition defines an evaluation protocol:

  • Participants can use the available training dataset (and more).
  • Candidate systems take an InkML file (without ground truth) as input and must write an InkML file as output, containing the symbol segmentation, the symbol recognition, and the expression interpretation in MathML format. This is exactly the same format as the provided training dataset. Since 2013, systems can also generate label graph (LG) files.
  • The evaluation first converts the InkML files into LG files and then compares the resulting InkML and LG files with the ground truth using a provided script.
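As an illustration of the input side of this protocol, the following minimal sketch reads the strokes from an InkML document. It assumes the standard W3C InkML namespace and that each `<trace>` element holds comma-separated points whose first two values are the x and y coordinates. The function name `read_strokes` and the sample ink are hypothetical and are not part of the competition tools.

```python
import xml.etree.ElementTree as ET

# W3C InkML namespace (assumed to be the one used by the input files).
INKML_NS = {"ink": "http://www.w3.org/2003/InkML"}


def read_strokes(inkml_text):
    """Return a dict mapping each trace id to its list of (x, y) points."""
    root = ET.fromstring(inkml_text)
    strokes = {}
    for trace in root.findall("ink:trace", INKML_NS):
        points = []
        for point in trace.text.strip().split(","):
            coords = point.split()
            # Keep only x and y; any extra channels (e.g. time) are ignored.
            points.append((float(coords[0]), float(coords[1])))
        strokes[trace.get("id")] = points
    return strokes


# Hypothetical two-stroke sample, not taken from the dataset.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <trace id="0">10 20, 11 22, 13 25</trace>
  <trace id="1">30 20, 30 40</trace>
</ink>"""

strokes = read_strokes(SAMPLE)
print(len(strokes))   # number of strokes in the sample
print(strokes["1"])   # points of the second stroke
```

A recognizer would group these stroke ids into symbols and attach a label and a MathML interpretation to them in its output file.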

Several aspects are measured. They are:

  1. ST_Rec: the stroke classification rate, representing the percentage of strokes with the correct symbol,
  2. SYM_Seg: the symbol segmentation rate, defining the percentage of symbols correctly segmented,
  3. SYM_Rec: the symbol recognition rate, computing the performance of the symbol classifier when considering only the correctly segmented symbols,
  4. STRUCT: the MathML structure recognition rate, computing the percentage of expressions (MEs) having the correct MathML tree as output, irrespective of the symbols attached to its leaves,
  5. EXP_Rec: the expression recognition rate, giving the percentage of MEs recognized entirely correctly,
  6. EXP-Rec_1, _2, _3: the percentage of MEs recognized with at most 1 error, 2 errors and 3 errors (in terminal symbols or in MathML node tags), given that the tree structure is correct,
  7. Recall and Precision for segments and recognized segments (symbols),
  8. Recall and Precision for spatial relations between the symbols.
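To make the stroke-level and segment-level measures above concrete, here is a minimal sketch of how they could be computed for a single expression. It is not the provided evaluation script. It assumes that a symbol is represented as a pair (set of stroke ids, label), and the function names `stroke_classification_rate` and `segment_scores` are hypothetical.

```python
def stroke_classification_rate(truth, output):
    """Fraction of strokes assigned the correct symbol label (cf. ST_Rec)."""
    truth_labels = {s: label for seg, label in truth for s in seg}
    out_labels = {s: label for seg, label in output for s in seg}
    correct = sum(1 for s, label in truth_labels.items()
                  if out_labels.get(s) == label)
    return correct / len(truth_labels)


def segment_scores(truth, output):
    """Recall and precision for segments and for recognized segments."""
    truth_segs = {seg for seg, _ in truth}
    out_segs = {seg for seg, _ in output}
    correct_segs = truth_segs & out_segs        # same stroke grouping
    correct_syms = set(truth) & set(output)     # same grouping and label
    return {
        "seg_recall": len(correct_segs) / len(truth_segs),
        "seg_precision": len(correct_segs) / len(out_segs),
        "sym_recall": len(correct_syms) / len(truth_segs),
        "sym_precision": len(correct_syms) / len(out_segs),
    }


# Hypothetical example: "x + 1", where the two-stroke "x" is over-segmented.
truth = [(frozenset({0, 1}), "x"), (frozenset({2}), "+"), (frozenset({3}), "1")]
output = [(frozenset({0}), "c"), (frozenset({1}), ")"),
          (frozenset({2}), "+"), (frozenset({3}), "1")]

st_rec = stroke_classification_rate(truth, output)
scores = segment_scores(truth, output)
print(st_rec)
print(scores)
```

In this example two of the three ground-truth symbols are found among the four output symbols, so recall and precision differ, which is why both are reported.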

As the MathML structure is not unique for a given expression, the evaluation tool provides a normalization step that converts trees to canonical structures.

