Pre-Processing of Document Images Obtained with a Smartphone

Hassan El Bahi; Abdelkarim Zatni

doi:10.15866/irecos.v11i12.10955

Hassan El Bahi^(1*), Abdelkarim Zatni^(2)

^(*) Corresponding author

Abstract

In recent years, text preprocessing has become increasingly important in pattern recognition because of its use in various domains. The main difficulty is building an effective text preprocessing system that can overcome perspective distortion, nonuniform illumination, and poor focus. Many systems have been proposed, but document images acquired with a smartphone camera have received less attention. This paper presents a complete text preprocessing system for document images obtained with mobile phones. The system works in successive stages. First, a method based on edge detection, morphological operations, and heuristic rules extracts the text area from the document image. Second, perspective distortion is corrected with a new method that relies on polar coordinates and bilinear interpolation. Next, a simple method based on the projection profile eliminates marginal noise. Finally, a new technique based on connected component (CC) analysis segments the text into individual lines. The experiments were performed on two public databases: the sample and test sets of the ICDAR2015 Smartphone Document OCR dataset. Experimental results show that the presented system achieves a very good extraction rate and works efficiently under different types of document image distortion.
Copyright © 2016 Praise Worthy Prize - All rights reserved.
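
Two of the building blocks named in the abstract, bilinear interpolation and the projection profile, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the list-of-lists image representation, and the min_ink threshold are assumptions made here for illustration, and the polar-coordinate mapping of the paper is not reproduced.

```python
from typing import List, Optional, Tuple

# Assumed representation: row-major grayscale image as nested lists.
Image = List[List[float]]


def bilinear_sample(img: Image, x: float, y: float) -> float:
    """Sample img at a fractional position (x, y) by bilinear interpolation."""
    h, w = len(img), len(img[0])
    # Clamp the position so the 2x2 neighbourhood stays inside the image.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two rows, then along y between them.
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def row_profile(binary: Image) -> List[int]:
    """Horizontal projection profile: number of ink pixels (> 0) per row."""
    return [sum(1 for v in row if v > 0) for row in binary]


def trim_rows(binary: Image, min_ink: int = 1) -> Optional[Tuple[int, int]]:
    """Return (first, last) row whose profile reaches min_ink, or None.

    min_ink is a hypothetical threshold, not a value from the paper.
    """
    rows = [i for i, c in enumerate(row_profile(binary)) if c >= min_ink]
    return (rows[0], rows[-1]) if rows else None
```

A rectification step would typically call bilinear_sample once per output pixel, at the source position given by the inverse geometric mapping. Applying trim_rows to the image and to its transpose gives a bounding box that excludes empty margins.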

Keywords

Preprocessing; Smartphone; Text Detection; Perspective Correction; Text Segmentation
