Existing techniques in Arabic Characters Recognition (ACR)

Aid Ahmed Radaideh, Mohd Shafry Mohd Rahim


Character-level recognition of printed or handwritten text, whether online or off-line, plays an important role in many applications, e.g. recognizing text on bank cheques, editing old documents, or translating online writing. Recognizing Arabic characters in handwritten and printed text is a challenging task that researchers address in two domains: online and off-line. The challenges stem from the cursive nature of Arabic script. The approaches adopted for Arabic character recognition are likewise broadly classified into two groups: those with segmentation and those without. Machines can classify characters easily if the characters are properly presented to them. The aim of this paper is to critically review the state of the art and to highlight the drawbacks of recent Arabic character recognition (ACR) work, in order to identify what robust online or off-line character recognition requires. The paper is intended to guide new researchers in improving on or overcoming the problems faced by previous studies, and it also provides a brief summary of ACR databases.


Character recognition; Character segmentation; Feature extraction; Classification; Arabic character recognition





DOI: http://dx.doi.org/10.26713%2Fjims.v8i5.469

eISSN 0975-5748; pISSN 0974-875X