Arbitrary-Shaped Text (ICDAR-2019 ArT)

Dataset Information
Dataset URL
http://bjyz-ai.epc.baidu.com/broad/download?dataset=art
Keywords
Arbitrary Shaped Text, Chinese, English
Description
Update: an alternative download is available through the RRC Platform (registration required): https://rrc.cvc.uab.es/?ch=14&com=downloads
ArT is a combination of Total-Text, SCUT-CTW1500, and Baidu Curved Scene Text, which were collected to introduce the arbitrary-shaped text problem to the scene text community. On top of the 3,055 existing images, 7,111 new images were added to the combined set, making ArT one of the larger-scale scene text datasets today. In total, the ArT dataset contains 10,166 images, split into a training set of 5,603 images and a testing set of 4,563 newly collected images. ArT was collected with text shape diversity in mind, so all existing text shapes (i.e., horizontal, multi-oriented, and curved) are well represented in the dataset.
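
The image counts quoted above can be cross-checked with a few lines of Python. This is only a sanity-check sketch: the constant names and the `split_summary` helper are hypothetical and not part of any official ArT tooling, and the numbers are copied directly from the description.

```python
# Image counts as stated in the dataset description (names are illustrative).
EXISTING_IMAGES = 3055   # images inherited from the earlier datasets
NEW_IMAGES = 7111        # images added for ArT
TRAIN_IMAGES = 5603      # training split
TEST_IMAGES = 4563       # testing split
TOTAL_IMAGES = 10166     # total stated in the description


def split_summary():
    """Check that the stated counts are consistent and return split fractions."""
    if EXISTING_IMAGES + NEW_IMAGES != TOTAL_IMAGES:
        raise ValueError("existing + new images do not match the stated total")
    if TRAIN_IMAGES + TEST_IMAGES != TOTAL_IMAGES:
        raise ValueError("train + test images do not match the stated total")
    return {
        "train_fraction": round(TRAIN_IMAGES / TOTAL_IMAGES, 3),
        "test_fraction": round(TEST_IMAGES / TOTAL_IMAGES, 3),
    }


if __name__ == "__main__":
    print(split_summary())
```

Both sums come to 10,166, so the stated totals agree with each other; the split is roughly 55% training and 45% testing.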