A Novel Framework for Alias Detection in Web Search Using Extreme Learning Machine (ELM) Approach
Search engines have become powerful tools for retrieving information from the Web. However, the practice of referring to the same entity by multiple names creates problems when collecting data. For example, Mohandas Karamchand Gandhi, Father of the Nation, is also known as Gandhiji and by other pseudonyms. Such synonyms (different names for one entity) can reduce the relevance of search engine results. Extracting the aliases of an entity is important for various Web tasks, such as automatic metadata extraction, entity disambiguation, and social network analysis. In this paper, an alias detection framework is proposed that finds alternate names of a given entity by applying lexical patterns to automatically downloaded web snippets. Similarity scores for the extracted alias names are calculated using string similarity measures, and a novel method is introduced that detects irrelevant alias names by ranking them with an Extreme Learning Machine (ELM). ELM is a learning algorithm for single-hidden-layer feedforward neural networks that offers good generalization performance and fast learning speed, training the network in a single iteration. Its ranking performance is compared against that of a Support Vector Machine (SVM). On the given alias dataset, ELM outperformed SVM in precision, recall, and F-score, with reported figures of 17.28%, 90.70%, and 0.34%, respectively.
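The single-iteration training mentioned in the abstract can be illustrated with a minimal sketch of a generic ELM. It is not the authors' implementation. It assumes each candidate alias is represented by two similarity scores; the feature values, the labels, and the names `train_elm` and `score_elm` are hypothetical. The hidden-layer weights are drawn at random and left fixed, and only the output weights are solved for, in one least-squares step.

```python
import numpy as np

def hidden_layer(X, W, b):
    """Sigmoid activations of the randomly weighted hidden layer."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, y, n_hidden=20, seed=0):
    """Fit a single-hidden-layer network in one pass, with no iterative tuning."""
    rng = np.random.default_rng(seed)
    # Input weights and biases are random and never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = hidden_layer(X, W, b)
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def score_elm(model, X):
    """Return one relevance score per row of X."""
    W, b, beta = model
    return hidden_layer(X, W, b) @ beta

# Hypothetical toy data: each row holds two similarity scores for a candidate
# alias; label 1.0 marks a correct alias and 0.0 an irrelevant one.
X = np.array([[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]])
y = np.array([1.0, 1.0, 0.0, 0.0])

model = train_elm(X, y)
scores = score_elm(model, X)
ranking = np.argsort(-scores)  # candidate indices, most relevant first
```

Sorting candidates by the predicted score gives a ranking in which low-scoring candidates can be treated as irrelevant aliases; where to place the cut-off is a design choice that the abstract does not specify.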
Copyright © 2014 Praise Worthy Prize - All rights reserved.