The large scene video text dataset for scene video text spotting (LSVTD)

Research Tasks

Video Text Detection

2021-06-01 (v. 1)

Contact author

Baorui Zou

Hikvision Research Institute

(+86) 18826072052

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.


TASK 1 - Video Text Detection 

This task is to obtain the locations of words in each video frame in terms of their affine bounding boxes. The ground truth for each bounding box consists of 4 coordinate points.


We evaluate results using a single Intersection-over-Union (IoU) criterion with a threshold of 50%, similar to standard practice in object recognition and the Pascal VOC challenge [1]. Recall, Precision, and F-score are used as the evaluation metrics.
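The criterion above can be illustrated with a minimal sketch. This is not the official evaluation script. It makes several assumptions that the task description does not state: each box is a convex quadrilateral given as four (x, y) points in boundary order, a detection counts as correct when its IoU with a ground-truth box is at least 0.5, each ground-truth box is matched at most once (greedy matching), and the F-score is the harmonic mean of Precision and Recall. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def signed_area(poly: Sequence[Point]) -> float:
    """Shoelace formula; positive when vertices run counter-clockwise."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def side(a: Point, b: Point, p: Point) -> float:
    """Cross product: > 0 if p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def clip(subject: Polygon, clipper: Polygon) -> Polygon:
    """Sutherland-Hodgman clipping; clipper must be convex and counter-clockwise."""
    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        if not current:
            break
        for j in range(len(current)):
            p, q = current[j], current[(j + 1) % len(current)]
            sp, sq = side(a, b, p), side(a, b, q)
            if sp >= 0:  # p is inside or on the clip edge
                output.append(p)
            if (sp > 0 > sq) or (sp < 0 < sq):  # edge p -> q crosses the clip line
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def quad_iou(a: Polygon, b: Polygon) -> float:
    """IoU of two convex quadrilaterals (assumption: inputs are convex)."""
    a = list(a) if signed_area(a) >= 0 else list(a)[::-1]
    b = list(b) if signed_area(b) >= 0 else list(b)[::-1]
    inter_poly = clip(a, b)
    inter = abs(signed_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0


def evaluate(gt_frames, det_frames, threshold: float = 0.5):
    """Return (precision, recall, f_score) over per-frame lists of boxes."""
    tp = n_gt = n_det = 0
    for gts, dets in zip(gt_frames, det_frames):
        n_gt += len(gts)
        n_det += len(dets)
        matched = set()  # indices of ground-truth boxes already claimed
        for det in dets:
            best_iou, best_idx = 0.0, -1
            for idx, gt in enumerate(gts):
                if idx in matched:
                    continue
                value = quad_iou(det, gt)
                if value > best_iou:
                    best_iou, best_idx = value, idx
            # Assumption: ">=" at the 50% threshold; the text does not say.
            if best_idx >= 0 and best_iou >= threshold:
                matched.add(best_idx)
                tp += 1
    precision = tp / n_det if n_det else 0.0
    recall = tp / n_gt if n_gt else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score
```

For example, with one ground-truth box per frame and two detections, of which only one overlaps the ground truth sufficiently, `evaluate` would report a Recall of 1.0 and a Precision of 0.5. Greedy matching is chosen here only for brevity; an evaluation tool may resolve competing matches differently.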





[1] M. Everingham, S. A. Eslami, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman (2014). The Pascal Visual Object Classes Challenge: A Retrospective. International Journal of Computer Vision, 111(1), 98-136.

