
Web Mining: the Demystification of Multifarious Aspects

M. Ambika(1*), K. Latha(2)

(1) Department of Computer Science and Engineering, Anna University, BIT Campus, India
(2) Department of Computer Science and Engineering, Anna University, BIT Campus, India
(*) Corresponding author


DOI: https://doi.org/10.15866/irecos.v9i1.1040

Abstract


With the rapid growth of the World Wide Web, the amount of web-based information is increasing exponentially. As the saying goes, “we are drowning in data, but starving for knowledge”. Given the sheer variety of content on the web, retrieving interesting, relevant, and required information has become a very difficult task. Web mining is a popular and successful technique that has shown much promise for this problem. This paper presents an in-depth study of the techniques available for web mining and compares the various techniques, processes, and algorithms. It also discusses how these approaches facilitate the use of the Internet, and concludes with applications and some open research issues.
Copyright © 2014 Praise Worthy Prize - All rights reserved.

Keywords


Web Mining; Web Content Mining; Web Structure Mining; Web Usage Mining



