A Dataset for Arabic Text Detection, Tracking and Recognition in News Videos (AcTiV)
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
Artificial Arabic Text Detection; Artificial Arabic Text Tracking; Arabic Video-OCR; News Video; Video Indexing; Content-based Research
AcTiV (Arabic Text in Video) is the first publicly accessible annotated dataset designed to assess the performance of Arabic video OCR systems. The challenges it addresses are the variability of text patterns and the presence of complex backgrounds containing objects that resemble text characters. AcTiV lets users test their systems’ ability to locate, track and read text objects in videos. The current version of the dataset includes 80 videos collected from 4 different Arabic news channels. Two types of video stream were chosen: Standard-Definition (720x576, 25 fps) and High-Definition (1920x1080, 25 fps). We mainly focus on text displayed as an overlay in news video, which can be classified into two types: static text and dynamic text.
Two sub-datasets are derived from the AcTiV database: AcTiV-D (D for detection) and AcTiV-R (R for recognition). AcTiV-D is a sub-dataset of non-redundant frames used to measure the performance of single-frame methods that detect and localize text regions in still HD/SD images. AcTiV-R is a sub-dataset of cropped images used to measure the performance of Arabic OCR systems at reading text in video frames.
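As an illustration of what measuring performance on the two sub-datasets could look like, the sketch below scores a detected text box against a ground-truth box with intersection-over-union (the AcTiV-D style task) and scores a recognized string against its transcription with a character error rate (the AcTiV-R style task). This is not the evaluation protocol defined by AcTiV: the `Box` layout, the function names and the choice of metrics are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned text box in pixels (hypothetical layout, not AcTiV's format)."""
    x: int
    y: int
    width: int
    height: int


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, a value in [0, 1]."""
    overlap_w = max(0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    overlap_h = max(0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    intersection = overlap_w * overlap_h
    union = a.width * a.height + b.width * b.height - intersection
    return intersection / union if union else 0.0


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance, counted in Unicode code points."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_char != hyp_char),  # substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        return 1.0 if hypothesis else 0.0
    return edit_distance(reference, hypothesis) / len(reference)
```

A detection is commonly counted as correct when its overlap with a ground-truth box exceeds a chosen threshold; the threshold, like the rest of this sketch, would have to be taken from the dataset's own evaluation documentation.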
Typical video frames from the proposed dataset. Top sub-figures: examples of Russia Today and ElWataniya1 frames. Bottom sub-figures: examples of Aljazeera HD and France 24 frames.