Evaluating Knowledge-Based Semantic Measures on Arabic



DOI: https://doi.org/10.15866/irecap.v4i5.4248

Abstract


Semantic measures have received wide attention from researchers as a way to handle various issues across tasks in computational linguistics and information retrieval. In this paper, we experimentally investigate the performance of knowledge-based semantic measures on the Arabic language. State-of-the-art semantic measures are adapted to two knowledge sources: a highly structured source (Arabic WordNet) and a semi-structured source (Arabic Wikipedia). The performance of the different semantic measures is evaluated on four Arabic benchmark data sets for the word-to-word semantic similarity/relatedness task. The evaluation results show that Wikipedia is a competitive and promising knowledge source in terms of its high degree of coverage and the variety of semantic features that can be extracted from it.
Copyright © 2014 Praise Worthy Prize - All rights reserved.
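
The evaluation set-up summarized in the abstract (score word pairs with a knowledge-based measure, then compare the scores with human judgments) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the toy taxonomy, the word pairs and the human ratings are invented, a Wu-and-Palmer-style depth measure stands in for the measures the paper adapts, and Spearman rank correlation is used as the agreement statistic although the abstract does not state which statistic the authors report.

```python
# Hypothetical toy is-a taxonomy (child -> parent); a stand-in for a
# WordNet-like hierarchy, not data taken from Arabic WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "cat": "animal",
    "dog": "animal",
    "tree": "plant",
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Wu-and-Palmer-style similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first concept that is also an ancestor of a
    # is the deepest common subsumer of the two concepts.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented word pairs and invented human ratings, for illustration only.
PAIRS = [("cat", "dog"), ("cat", "tree"), ("cat", "animal"), ("dog", "plant")]
HUMAN_RATINGS = [3.5, 0.5, 3.0, 1.0]

if __name__ == "__main__":
    scores = [wu_palmer(a, b) for a, b in PAIRS]
    print("measure scores:", scores)
    print("Spearman correlation with human ratings:", spearman(scores, HUMAN_RATINGS))
```

With a real resource, the `PARENT` table would be replaced by the hierarchy of the chosen knowledge source and the pair list by a benchmark data set; the comparison step stays the same.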

Keywords


Semantic Measures; Arabic WordNet; Wikipedia; Knowledge-Based Methods
