A Comparative and Performance Analysis of Similarity Metrics in Recommender System Based on Hadoop Framework

Y. S. Sneha(1*), Neetha Susan Thampi(2), G. Mahadevan(3)

(1) Assistant Professor, JSSATE, Bangalore and Research Scholar Anna University, Chennai, India
(2) Toxicology Analyst in CASMEDCO Data Management PVT LTD, India
(3) Principal, Annai College of Engineering and Technology, Kovilacheri, Kumbakonam, India
(*) Corresponding author


With the exponential growth of information available on the web, customers need personal assistance in finding the best item in a large set of items, and in finding the most popular items. A recommender system provides this assistance: it is a software tool that uses knowledge discovery techniques to produce personalized recommendations. Most recommenders rely on machine learning algorithms and techniques that use a standard data set to produce predictions. With the tremendous growth in customers and items in recent years, a key challenge for a recommender system is to produce quality recommendations based on the similarity of users. It is equally challenging to maintain output quality when the data set is huge, to reduce latency, to group users with similar interests, and to produce recommendations within seconds for millions of customers and products. New recommendation technologies are therefore needed that deliver quality output to users with similar tastes. In this paper we present a comparative analysis of various similarity metrics used in recommender systems for clustering users on the Hadoop framework. The experimental results show that Spearman rank correlation performs best among the similarity metrics compared, according to three evaluation metrics.
Copyright © 2014 Praise Worthy Prize - All rights reserved.


Collaborative Filtering (CF); Spearman Correlation Coefficient; Karl Pearson Coefficient; Tanimoto Coefficient; Log Likelihood; Cosine Coefficient
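
For orientation, the following is a minimal single-machine sketch of four of the similarity metrics named in the keywords, computed between two users over their co-rated items. It is not the authors' Hadoop-based implementation. The rating dictionaries, item ids, and function names are hypothetical and chosen only for illustration, and log likelihood is omitted for brevity.

```python
from math import sqrt

def pearson(x, y):
    """Karl Pearson correlation of two equal-length rating lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sqrt(sum((p - mx) ** 2 for p in x))
    sy = sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx and sy else 0.0

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def cosine(x, y):
    """Cosine of the angle between the two rating vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    nx = sqrt(sum(p * p for p in x))
    ny = sqrt(sum(q * q for q in y))
    return dot / (nx * ny) if nx and ny else 0.0

def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient on the sets of rated items."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0

# Hypothetical ratings: item id -> rating on a 1-5 scale.
alice = {"i1": 5, "i2": 3, "i3": 4, "i4": 1}
bob = {"i1": 4, "i2": 2, "i3": 5, "i5": 3}

common = sorted(set(alice) & set(bob))   # co-rated items only
x = [alice[i] for i in common]
y = [bob[i] for i in common]

scores = {
    "pearson": pearson(x, y),
    "spearman": spearman(x, y),
    "cosine": cosine(x, y),
    "tanimoto": tanimoto(alice, bob),
}
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Note that the Tanimoto coefficient here ignores rating values and looks only at which items each user rated, while the other three metrics use the ratings on the shared items.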

