
A Machine Learning Approach to Anaphora Resolution in Arabic




Anaphora resolution is a widely studied area of Natural Language Processing (NLP). It is crucial for many NLP applications, including information extraction, question answering and text summarization. Most earlier work on anaphora resolution targets English and other European languages. Arabic has not been sufficiently studied with respect to anaphora resolution and has rarely been the subject of machine learning experiments. In this paper we present a machine learning approach to resolving pronominal anaphora in Arabic, and we determine the appropriate features for this task. Several classifiers, namely naive Bayes, K-nearest neighbors and linear logistic regression, are employed as base classifiers for each of the feature sets. We conduct an in-depth study of different feature sets to identify effective features and to investigate their effect on anaphora resolution performance. Finally, a wide range of comparative experiments is conducted on Quranic datasets. The experimental results on the Arabic Quran training corpus demonstrate that the proposed method is feasible for pronominal anaphora resolution in Arabic.
Copyright © 2014 Praise Worthy Prize - All rights reserved.


Anaphora Resolution; Natural Language Processing; Supervised Machine Learning
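
The abstract frames pronominal anaphora resolution as supervised classification over feature sets. As a rough illustration only, the sketch below trains a small categorical naive Bayes model (one of the three classifier families named in the abstract) on pronoun/candidate pairs and then picks the best-scoring candidate. The feature names (gender_agree, number_agree, distance), the toy training rows, and the resolve helper are all hypothetical; the paper's actual features, corpus, and selection procedure are not given on this page.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: each row describes one (pronoun, candidate antecedent)
# pair; label 1 means the candidate is the true antecedent. The feature names
# and values are illustrative assumptions, not the paper's feature set.
TRAIN = [
    ({"gender_agree": True, "number_agree": True, "distance": "near"}, 1),
    ({"gender_agree": True, "number_agree": True, "distance": "far"}, 1),
    ({"gender_agree": True, "number_agree": True, "distance": "near"}, 1),
    ({"gender_agree": True, "number_agree": False, "distance": "near"}, 0),
    ({"gender_agree": False, "number_agree": True, "distance": "near"}, 0),
    ({"gender_agree": False, "number_agree": False, "distance": "far"}, 0),
]


class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows):
        self.class_counts = Counter(label for _, label in rows)
        # (label, feature name) -> counts of each observed feature value
        self.feature_counts = defaultdict(Counter)
        # feature name -> set of values seen in training
        self.values = defaultdict(set)
        for feats, label in rows:
            for name, value in feats.items():
                self.feature_counts[(label, name)][value] += 1
                self.values[name].add(value)
        return self

    def log_scores(self, feats):
        """Return the unnormalized log posterior for each class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # class prior
            for name, value in feats.items():
                seen = self.feature_counts[(label, name)][value]
                score += math.log((seen + 1) / (count + len(self.values[name])))
            scores[label] = score
        return scores

    def predict(self, feats):
        scores = self.log_scores(feats)
        return max(scores, key=scores.get)


def resolve(model, candidates):
    """Pick the candidate whose pair has the highest log-odds of label 1.

    `candidates` is a list of (candidate_id, feature_dict) tuples.
    """
    def log_odds(feats):
        scores = model.log_scores(feats)
        return scores[1] - scores[0]

    return max(candidates, key=lambda c: log_odds(c[1]))[0]


model = NaiveBayes().fit(TRAIN)
candidates = [
    ("cand_a", {"gender_agree": False, "number_agree": True, "distance": "near"}),
    ("cand_b", {"gender_agree": True, "number_agree": True, "distance": "far"}),
]
best = resolve(model, candidates)
```

Ranking candidates by log-odds rather than taking the first positive prediction is a design choice of this sketch. The same pair representation could be passed to a K-nearest-neighbors or logistic regression classifier to compare feature sets in the way the abstract describes.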





