
Comparing the Speed and Accuracy of Multi-Label Classification Models




DOI: https://doi.org/10.15866/irecos.v10i8.6697

Abstract


A fast and accurate multi-label classification method is needed to manage the rapidly increasing number of journal articles, each of which may cover more than one field of study. This research compares the speed and accuracy of two multi-label classification models. The first model combines Label Powerset (LP), ReliefF (RF), and Fuzzy Similarity-based k-Nearest Neighbor (FSkNN). The second model combines LP, Distinguishing Feature Selector (DFS), and FSkNN. Speed is measured by training time and testing time, while accuracy is measured using Hamming loss. In the experiment, LP-DFS-FSkNN is faster and more accurate: its training time and Hamming loss are lower than those of LP-RF-FSkNN, while the testing time of both models is the same.
Copyright © 2015 Praise Worthy Prize - All rights reserved.
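
For readers unfamiliar with the terms in the abstract, the following is a minimal Python sketch of two of the building blocks it names: the Label Powerset transformation, which turns each distinct label combination into a single class, and Hamming loss, the fraction of label slots predicted incorrectly. The function names, the binary indicator representation, and the toy data are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce the authors' feature selection (ReliefF, DFS) or the FSkNN classifier.

```python
from typing import Dict, List, Sequence, Tuple

LabelRow = Sequence[int]  # one article's labels as 0/1 indicators (assumed encoding)


def label_powerset(y: Sequence[LabelRow]) -> Tuple[List[int], Dict[Tuple[int, ...], int]]:
    """Map each distinct label combination to a single class id.

    Returns the per-sample class ids and the combination -> id mapping.
    """
    classes: Dict[Tuple[int, ...], int] = {}
    ids: List[int] = []
    for row in y:
        key = tuple(row)
        # A new combination gets the next unused id; a known one reuses its id.
        ids.append(classes.setdefault(key, len(classes)))
    return ids, classes


def hamming_loss(y_true: Sequence[LabelRow], y_pred: Sequence[LabelRow]) -> float:
    """Fraction of (sample, label) slots where prediction and truth differ."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n_labels = len(y_true[0])
    mismatches = 0
    for truth, pred in zip(y_true, y_pred):
        if len(truth) != n_labels or len(pred) != n_labels:
            raise ValueError("every row must have the same number of labels")
        mismatches += sum(t != p for t, p in zip(truth, pred))
    return mismatches / (len(y_true) * n_labels)


if __name__ == "__main__":
    # Toy data: 4 articles, 3 hypothetical fields of study.
    y_true = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0]]
    y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 0]]
    ids, mapping = label_powerset(y_true)
    print("Label Powerset class ids:", ids)
    print("Hamming loss:", hamming_loss(y_true, y_pred))
```

A lower Hamming loss means fewer wrong label assignments, which is the sense in which the abstract calls one model more accurate than the other.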

Keywords


Multi-Label Classification; Machine Learning; Fuzzy Similarity-based k-Nearest Neighbor; ReliefF; Distinguishing Feature Selector



