The Street View Text Dataset (SVT)
OCR, Real Scene, Urban Scene, Scene Text, Word Spotting, Scene Text Recognition, Scene Text Detection, Scene Text Localization
Example images from the Street View Text dataset.
The Street View Text (SVT) dataset was harvested from Google Street View. Image text in this data exhibits high variability and often has low resolution. In dealing with outdoor street-level imagery, we note two characteristics: (1) image text often comes from business signage, and (2) business names are easily available through geographic business searches. These factors make the SVT dataset uniquely suited for word spotting in the wild: given a street view image, the goal is to identify words from nearby businesses. More details about the dataset can be found in our paper, Word Spotting in the Wild. For our up-to-date benchmarks on this data, see our paper, End-to-end Scene Text Recognition.
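To make the lexicon-driven setting concrete, here is a minimal sketch of how a noisy reading of a sign might be snapped to the closest word in a lexicon of nearby business names. It is an illustration only, not the method of either paper cited on this page. The edit-distance matching, the function names `edit_distance` and `spot_word`, and the sample lexicon are all assumptions made for this example.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def spot_word(raw_reading: str, lexicon: Sequence[str]) -> str:
    """Return the lexicon word closest to a noisy reading, ignoring case."""
    reading = raw_reading.upper()
    return min(lexicon, key=lambda w: edit_distance(reading, w.upper()))


# Hypothetical lexicon, standing in for words from nearby business names.
lexicon = ["STARBUCKS", "COFFEE", "PHARMACY", "HOTEL"]
print(spot_word("C0FFEE", lexicon))
```

Restricting the answer to a small per-image lexicon is what makes low-resolution, highly variable text tractable: the recognizer only has to separate a handful of candidate words rather than read arbitrary strings.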
This dataset has word-level annotations only (no character bounding boxes) and should be used for:
- cropped lexicon-driven word recognition and
- full image lexicon-driven word detection and recognition.
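For the full-image task, a predicted word is typically scored against the word-level ground-truth boxes. The sketch below shows one common way to do that, assuming axis-aligned boxes given as (x, y, width, height) and a PASCAL VOC-style overlap threshold of 0.5. The `WordBox` type, the function names, and the threshold are assumptions for illustration and are not taken from this page.

```python
from typing import NamedTuple


class WordBox(NamedTuple):
    """Hypothetical word-level annotation: top-left corner, size, and label."""
    x: float
    y: float
    width: float
    height: float
    word: str


def iou(a: WordBox, b: WordBox) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    inter = ix * iy
    union = a.width * a.height + b.width * b.height - inter
    return inter / union if union > 0 else 0.0


def is_correct(pred: WordBox, truth: WordBox, threshold: float = 0.5) -> bool:
    """A prediction counts if the words match (ignoring case) and boxes overlap enough."""
    return pred.word.upper() == truth.word.upper() and iou(pred, truth) > threshold


truth = WordBox(0, 0, 100, 50, "HOTEL")
pred = WordBox(10, 0, 100, 50, "hotel")
print(is_correct(pred, truth))
```

Because only word-level boxes are annotated, this kind of check can score word detection and recognition but not character-level localization.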
Kai Wang, Boris Babenko, and Serge Belongie. End-to-end Scene Text Recognition. ICCV 2011, Barcelona, Spain.
Kai Wang and Serge Belongie. Word Spotting in the Wild. ECCV 2010, Heraklion, Crete, Greece.