
Arabic Handwriting Text Offline Recognition Using the HMM Toolkit (HTK)



Recognition of handwritten Arabic text still awaits accurate solutions. A good handwritten Arabic recognition system faces many difficulties, such as the unlimited variation in human handwriting, the similarity between different character shapes, and the position of characters within the word. This paper presents a recognition system for handwritten Arabic text. It decomposes the text image into text-line images, extracts a set of simple statistical features from a narrow window sliding along each text line, and then feeds the resulting feature vectors to the Hidden Markov Model Toolkit (HTK). HTK is a portable toolkit for building speech recognition systems. In the recognition stage, the concatenation of characters to form words is modelled by simple lexical models, each word is modelled by a stochastic finite-state automaton (SFSA), and the concatenation of words into sentences is modelled by an n-gram language model. The proposed system is applied to a corpus built from text-line examples from the “Arabic-Numbers” data set, which contains 1905 sentences and 47 words. The sentences were written by 5 different people.
Copyright © 2014 Praise Worthy Prize - All rights reserved.
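
The sliding-window step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical features are used, so the two features below (ink density and the normalised vertical centre of the ink) are assumptions chosen for illustration, as are the function name `window_features` and its `width` and `step` parameters. The window here moves left to right; converting the vectors to HTK's parameter file format is not shown.

```python
from typing import List, Tuple

# A binarised text-line image: rows of 0 (background) and 1 (ink).
Image = List[List[int]]


def window_features(line: Image, width: int = 3, step: int = 1) -> List[Tuple[float, float]]:
    """Slide a narrow window along a text-line image.

    Returns one feature vector per window position:
    (ink density, vertical centre of the ink scaled to [0, 1]).
    Both features are illustrative assumptions, not the paper's feature set.
    """
    height = len(line)
    n_cols = len(line[0]) if height else 0
    vectors = []
    for left in range(0, n_cols - width + 1, step):
        ink = 0       # number of ink pixels inside the window
        row_sum = 0   # sum of row indices weighted by ink count
        for r in range(height):
            count = sum(line[r][left:left + width])
            ink += count
            row_sum += r * count
        density = ink / (height * width)
        # Centre of gravity of the ink; 0.5 is used for an empty window.
        if ink and height > 1:
            centre = (row_sum / ink) / (height - 1)
        else:
            centre = 0.5
        vectors.append((density, centre))
    return vectors
```

Because Arabic is written right to left, one option is to reverse the returned list (or each image row) before training so that the observation sequence follows the writing direction; whether the paper does this is not stated in the abstract.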


Handwritten Arabic Text; Hidden Markov Model Toolkit (HTK); Stochastic Finite-State Automaton





