
A Machine Learning Approach to Anaphora Resolution in Arabic




DOI: https://doi.org/10.15866/irecos.v9i12.4786

Abstract


Anaphora resolution is a widely studied area of Natural Language Processing (NLP). It is crucial for many NLP applications, including information extraction, question answering, and text summarization. Most earlier work on anaphora resolution targets English and other European languages. Arabic has not been sufficiently studied with respect to anaphora resolution and has rarely been the subject of machine learning experiments. In this paper we present a machine learning approach to resolving pronominal anaphora in Arabic, and we determine the appropriate features for this task. We employ several classifiers, namely naive Bayes, K-nearest neighbors, and linear logistic regression, as base classifiers for each of the feature sets. We conduct an in-depth study of different feature sets to identify effective features and to investigate their effect on anaphora resolution performance. Finally, we conduct a wide range of comparative experiments on Quranic datasets. The experimental results on the Arabic Quran training corpus demonstrate that the proposed method is feasible for pronominal anaphora resolution in Arabic.
Copyright © 2014 Praise Worthy Prize - All rights reserved.

Keywords


Anaphora Resolution; Natural Language Processing; Supervised Machine Learning


