1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.
Scene text plays a significant role in image and video understanding, which has made great progress in recent years. Most existing models on text detection in the wild have the assumption that all the texts are surrounded by a rotated rectangle or quadrangle. While there also exist lots of curved texts in the wild, which would not be bounded by a regular bounding box. In this paper, we develop a novel architecture to localize the text regions, which can deal with curved-shape scene texts. Specifically, we first design a text-related feature enhancement module by incorporating the prior knowledge of the text shape to enhance the feature representations. After that, based on the enhanced features, we employ a region proposal network to generate the candidate boxes of scene texts. For each text candidate, a pyramid region-of-interest pooling attention module is utilized to extract the fixed-size features. Finally, we exploit the box-aware contextbased text segmentation module and box refinement network to obtain the location of scene text. Experiments are conducted on four challenging benchmarks CTW1500, totalTEXT, ICDAR-2015 and MLT, and the experimental results have demonstrated the superiority of our model.
© Copyright 2021 IEEE – All rights reserved. Use of this website signifies your agreement to the IEEE Terms and Conditions.
A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.