ICPR 2020 Competition on HArvesting Raw Tables (ICPR-2020-CHART-UB_PMC) (ICPR2020-CHART-INFO)

Research Tasks

Data Extraction

Description

The goal of this task is to convert all of the previously extracted information into the tabular data of the chart. We break this task into two subtasks: (a) plot element detection and classification, and (b) data conversion. Competitor systems are expected to produce output for both subtasks. Competitors are also permitted to perform this task for only certain classes of charts.

More details can be found at: https://chartinfo.github.io/

Protocol

**Detection.** For an element to be correctly detected, it must be **assigned to the correct class.** We will use a variation on mean squared error (MSE) to evaluate the representation of each element with the correct class. For each element, we compute a score between 0 and 1, where 1 represents an exact prediction, and predictions farther away than a distance threshold, T, receive a score of 0. **The score is max(0, 1 - (D/T)^2)**, where D is the Euclidean distance between the predicted and ground-truth (GT) points. The distance threshold, T, is set to **1%** of the smallest image dimension. Because there are many ways to pair predicted and GT points, we will find the minimum-cost pairing (i.e., solve the corresponding bipartite graph matching problem).
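The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the official evaluation code: the function names (`point_score`, `distance_threshold`, `best_pairing_score`) are hypothetical, points are assumed to be `(x, y)` pixel tuples that already share the same class, and the pairing is found by brute force over permutations, which is only practical for a handful of points. How the per-pair scores are aggregated or normalized is not stated here, so the sketch simply returns the total score of the best one-to-one pairing.

```python
from itertools import permutations
from math import hypot


def distance_threshold(width, height, fraction=0.01):
    """T is 1% of the smallest image dimension (fraction is from the text above)."""
    return fraction * min(width, height)


def point_score(pred, gt, threshold):
    """Score one predicted point against one GT point: max(0, 1 - (D/T)^2)."""
    d = hypot(pred[0] - gt[0], pred[1] - gt[1])  # Euclidean distance D
    return max(0.0, 1.0 - (d / threshold) ** 2)


def best_pairing_score(preds, gts, width, height):
    """Total score of the best one-to-one pairing of predicted and GT points.

    Assumption: maximizing the summed score stands in for the minimum-cost
    pairing. Brute force is used for clarity; it is exponential in the number
    of points and is not how a real evaluator would do it.
    """
    t = distance_threshold(width, height)
    # Permute the longer list so every point in the shorter list gets a partner.
    # The score is symmetric, so it does not matter which side is which.
    short, long_ = (preds, gts) if len(preds) <= len(gts) else (gts, preds)
    best = 0.0
    for perm in permutations(range(len(long_)), len(short)):
        total = sum(point_score(short[i], long_[j], t) for i, j in enumerate(perm))
        best = max(best, total)
    return best
```

For example, on a 1000 x 500 image the threshold is 5 pixels, so a prediction 3 pixels from its GT point scores 1 - (3/5)^2 = 0.64, and a prediction 5 or more pixels away scores 0. For larger point sets, a polynomial-time assignment solver such as the Hungarian algorithm would be the usual replacement for the permutation loop.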

**Extraction.** Data series names should come from the chart legend (if there is one). If the data series names are not specified in the chart image, then the predicted names are **ignored for evaluation purposes**.

