The large scene video text dataset for scene video text spotting (LSVTD)
Research Tasks
Video Text Detection
2021-06-01 (v. 1)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
Description
TASK 1 - Video Text Detection
The task is to locate the words in each video frame in terms of their affine bounding boxes. Each ground-truth bounding box consists of 4 coordinate points.
Protocol
Results are evaluated with a single Intersection-over-Union (IoU) criterion at a threshold of 50%, similar to standard practice in object recognition and the Pascal VOC challenge [1]. Recall, Precision and F-score are used as the evaluation metrics.
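The protocol above can be illustrated with a minimal Python sketch. It is not the official evaluation script: the function names (`quad_iou`, `evaluate`), the in-memory box format (a list of four `(x, y)` tuples), the assumption that boxes are convex quadrilaterals, the greedy one-to-one matching, and the choice to count a match only when IoU strictly exceeds 0.5 are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Quad = List[Point]  # assumed format: four (x, y) vertices of a convex quadrilateral


def signed_area(poly: List[Point]) -> float:
    """Shoelace formula; positive when vertices are in counter-clockwise order."""
    n = len(poly)
    return 0.5 * sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )


def to_ccw(poly: List[Point]) -> List[Point]:
    """Return the polygon with counter-clockwise vertex order."""
    return list(poly) if signed_area(poly) >= 0 else list(poly)[::-1]


def clip(subject: List[Point], clipper: List[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` against a convex, CCW `clipper`."""
    output = list(subject)
    n = len(clipper)
    for i in range(n):
        if not output:
            break
        a, b = clipper[i], clipper[(i + 1) % n]

        def side(p: Point) -> float:
            # > 0 when p lies to the left of the directed edge a -> b (inside).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        current, output = output, []
        for j in range(len(current)):
            p, q = current[j], current[(j + 1) % len(current)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # edge p -> q crosses the clipping line
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def quad_iou(q1: Quad, q2: Quad) -> float:
    """Intersection-over-Union of two convex quadrilaterals."""
    p1, p2 = to_ccw(q1), to_ccw(q2)
    inter_poly = clip(p1, p2)
    inter = abs(signed_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    union = abs(signed_area(p1)) + abs(signed_area(p2)) - inter
    return inter / union if union > 0 else 0.0


def evaluate(gt_frames: List[List[Quad]], det_frames: List[List[Quad]],
             threshold: float = 0.5) -> Tuple[float, float, float]:
    """Return (recall, precision, f_score) accumulated over all frames.

    Assumption: each detection is greedily matched to the unmatched
    ground-truth box with the highest IoU in the same frame.
    """
    tp = n_gt = n_det = 0
    for gts, dets in zip(gt_frames, det_frames):
        n_gt += len(gts)
        n_det += len(dets)
        used = set()
        for det in dets:
            best_iou, best_j = 0.0, -1
            for j, gt in enumerate(gts):
                if j in used:
                    continue
                iou = quad_iou(det, gt)
                if iou > best_iou:
                    best_iou, best_j = iou, j
            if best_j >= 0 and best_iou > threshold:
                used.add(best_j)
                tp += 1
    precision = tp / n_det if n_det else 0.0
    recall = tp / n_gt if n_gt else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return recall, precision, f_score
```

As a usage example, `evaluate(gt_frames, det_frames)` takes two parallel lists with one entry per video frame, each entry being the list of quadrilaterals for that frame. The clipping step relies on the boxes being convex, which holds for affine-transformed rectangles; non-convex annotations would need a general polygon library instead.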
References
[1] M. Everingham, S. A. Eslami, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman (2014). The Pascal Visual Object Classes Challenge: A Retrospective. International Journal of Computer Vision, 111(1), 98-136.