
Web Mining: the Demystification of Multifarious Aspects




With the rapid growth of the World Wide Web, the amount of web-based information is also increasing exponentially. It is often said that “we are drowning in data, but starving for knowledge”. Given the sheer variety of content on the web, retrieving interesting, relevant, and required information has become a very difficult task. Web mining is a popular technique that has shown much promise in addressing this problem. This paper presents an in-depth study of the techniques available for web mining and compares the various techniques, processes, and algorithms. It concludes with applications and some open research issues, and discusses how these approaches facilitate the use of the Internet.
Copyright © 2014 Praise Worthy Prize - All rights reserved.


Web Mining; Web Content Mining; Web Structure Mining; Web Usage Mining





