## ICPR 2020 Competition on HArvesting Raw Tables (ICPR-2020-CHART-UB_PMC) (ICPR2020-CHART-Info)

## Research Tasks

# Data Extraction

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

### Description

The goal of this task is to convert all of the previously extracted information into the tabular data of the chart. We break this task into two subtasks: (a) plot element detection and classification, and (b) data conversion. Competitor systems are expected to produce output for both subtasks. Competitors are also permitted to perform this subtask for only certain classes of charts.

More details can be found at: https://chartinfo.github.io/

### Protocol

**Detection**. For an element to be correctly detected, it must be **assigned to the correct class.** We will use a variation on MSE to evaluate the representation of each element that has the correct class. For each element, we compute a score between 0 and 1, where 1 represents an exact prediction and predictions farther than a distance threshold, T, from the ground truth (GT) receive a score of 0. **The score is max(0, 1 - (D/T)^2)**, where D is the Euclidean distance between the predicted and GT points. The distance threshold, T, is set to **1%** of the smallest image dimension. Because there are many ways to pair predicted and GT points, we will find the minimum-cost pairing (i.e., solve the corresponding bipartite graph matching problem).
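The scoring rule above can be illustrated with a minimal Python sketch. It is not the official evaluation code: the function names `point_score` and `best_pairing_score`, the `(x, y)` tuple representation of points, and the brute-force search over pairings are all assumptions made for illustration. The sketch treats the cost of a pair as 1 minus its score, so maximizing the total score is the same as minimizing the total cost when both sets have the same size. It returns the raw total and applies no normalization, since none is specified in this section.

```python
from itertools import permutations
from math import hypot


def point_score(pred, gt, threshold):
    """Score one predicted/GT pair as max(0, 1 - (D/T)^2)."""
    d = hypot(pred[0] - gt[0], pred[1] - gt[1])  # Euclidean distance D
    return max(0.0, 1.0 - (d / threshold) ** 2)


def best_pairing_score(preds, gts, image_size, fraction=0.01):
    """Total score of the best one-to-one pairing of predicted and GT points.

    Brute force over permutations: only practical for a handful of points.
    A real evaluator would use a polynomial-time assignment solver instead.
    """
    # T is 1% of the smallest image dimension by default.
    threshold = fraction * min(image_size)
    # Permute the longer list so each point in the shorter list gets a partner.
    short, long_ = (preds, gts) if len(preds) <= len(gts) else (gts, preds)
    best = 0.0
    for perm in permutations(long_, len(short)):
        total = sum(point_score(a, b, threshold) for a, b in zip(short, perm))
        best = max(best, total)
    return best


if __name__ == "__main__":
    # Hypothetical 1000 x 500 image, so T = 5 pixels.
    predicted = [(100, 100), (200, 200)]
    ground_truth = [(200, 203), (100, 100)]
    print(best_pairing_score(predicted, ground_truth, (1000, 500)))
```

In this made-up example, one prediction matches its GT point exactly (score 1) and the other is 3 pixels away with T = 5, giving 1 - (3/5)^2 = 0.64, so the best pairing totals about 1.64.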

**Extraction.** Data series names should come from the chart legend (if there is one). If the data series names are not specified in the chart image, then the predicted names are **ignored for evaluation purposes**.
