To robustly detect arbitrary-shaped scene texts, bottom-up methods are widely explored for their flexibility. Because scene texts have highly homogeneous texture and a cluttered distribution, it is nontrivial for segmentation-based methods to discover the separatrixes between adjacent instances. To effectively separate nearby texts, many methods adopt a seed expansion strategy that segments shrunken text regions as seed areas and then iteratively expands the seed areas into intact text regions. In search of a more straightforward approach that does not rely on seed area segmentation and avoids the possible error accumulation brought by iterative processing, we propose a redundancy removal strategy. In this work, we directly explore two types of fuzzy semantics, text and separatrix, that do not possess specific boundaries, and separate cluttered instances by excluding the separatrix pixels from text regions. To deal with the fuzzy semantic boundaries, we also conduct reliability analysis in both the optimization and inference stages to suppress false positive pixels at ambiguous locations. Experiments on benchmark datasets demonstrate the effectiveness of our method.
Arbitrary-shaped scene text detection aims to accurately locate tight text regions of arbitrary shapes in natural scene images. It has a wide range of applications, such as text recognition, scene parsing and autonomous driving. The main challenge of robust scene text detection lies in the complex appearance of texts, such as arbitrary shapes, skewed viewpoints and large aspect ratios.
To deal with arbitrary shapes, mainstream methods seek bottom-up solutions for their flexibility and treat text detection as a segmentation problem. However, as mixtures of stroke and background pixels, text regions are highly homogeneous textures that do not possess natural and clear boundaries. Besides, as shown in Figure 1 (a), scene texts are often densely cluttered and sometimes even contiguous due to the coarse polygon annotations. Thus, effectively separating cluttered instances becomes the most intractable problem for segmentation-based methods. False positive pixels along the instance separatrix areas often merge adjacent instances, which can dramatically degrade the detection results even though these pixels make up a very small proportion of the whole image. A typical solution is two-stage processing [1], [2], [3], which avoids directly discovering text separatrixes. Such methods first segment shrunken text regions to find separated instance seeds, and then expand these seed areas iteratively and exhaustively to recover the intact text regions. Though the seed area extraction and iterative region expansion strategy can help separate cluttered instances, its performance relies heavily on the seed area segmentation, and errors may accumulate throughout the iterative expansion procedure. Given that, we seek a more straightforward strategy that discovers the specific instance separatrixes by directly modeling their unique semantics.
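To make the idea of separating instances by removing separatrix pixels concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes that binary text and separatrix masks are already available (in practice they would come from a segmentation network), and the function name `separate_instances`, the 4-connectivity, and the toy masks are all illustrative assumptions.

```python
from collections import deque


def separate_instances(text_mask, sep_mask):
    """Label 4-connected text components after dropping separatrix pixels.

    Both masks are hypothetical binary grids (lists of lists of 0/1).
    Returns (labels, count), where labels uses 0 for background.
    """
    h, w = len(text_mask), len(text_mask[0])
    # Redundancy removal: keep a pixel only if it is text and not separatrix.
    keep = [[bool(text_mask[y][x]) and not sep_mask[y][x] for x in range(w)]
            for y in range(h)]
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not keep[y][x] or labels[y][x]:
                continue
            # Start a new instance and flood-fill it with breadth-first search.
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and keep[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy example: two contiguous text instances that touch along column 3.
text = [[1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1]]
sep = [[0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0]]
empty = [[0] * 7 for _ in range(2)]

_, merged = separate_instances(text, empty)  # no separatrix information
labels, split = separate_instances(text, sep)  # separatrix pixels excluded
print(merged, split)  # 1 2
```

In the toy example, the two instances merge into one component when no separatrix is predicted, while excluding the single separatrix column splits them in one pass, with no seed segmentation or iterative expansion. The sketch also shows why a few missed separatrix pixels matter: one remaining bridge pixel would reconnect the two components.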