
Semi-Supervised Multi-Label Classification Through Topological Active Learning

Assia Benyettou(1*), Younès Bennani(2), Abdelkader Benyettou(3), Abderrahmane Bendahmane(4), Guénaël Cabanes(5)

(1) SIMPA Laboratory, Department of Computer Sciences, Faculty of Mathematics and Computers Sciences, University of Sciences and Technology of Oran, Mohammed Boudiaf, USTO-MB, Algeria
(2) LIPN - UMR 7030 CNRS, Université Paris 13, ComUE Université Sorbonne Paris Cité, France
(3) SIMPA Laboratory, Department of Computer Sciences, Faculty of Mathematics and Computers Sciences, University of Sciences and Technology of Oran, Mohammed Boudiaf, USTO-MB, Algeria
(4) SIMPA Laboratory, Department of Computer Sciences, Faculty of Mathematics and Computers Sciences, University of Sciences and Technology of Oran, Mohammed Boudiaf, USTO-MB, Algeria
(5) LIPN - UMR 7030 CNRS, Université Paris 13, ComUE Université Sorbonne Paris Cité, France
(*) Corresponding author


DOI: https://doi.org/10.15866/irecap.v7i3.12742

Abstract


Multi-label classification is becoming increasingly widespread as a data mining technique. Its objective is to assign instances to several non-exclusive groups, and it is applied in areas such as news categorization, image labeling and music classification, among others. Our contribution is to combine the active learning paradigm with the topological power of the Act-SOM for semi-supervised multi-label classification, taking the multi-label information into account and selecting the unlabeled data that can lead to the largest reduction of the expected model loss. This paper addresses various multi-label datasets and presents active learning results for: 1) the transductive classifier TSVM with relevance sampling methods for multi-labeled data in various application domains; 2) the proposed semi-supervised classifier Act-SOM in multi-label active learning, which adopts a strategy based on evaluating the uncertainty of the labels. Act-SOM based on active learning selects the most uncertain data while clearly improving the test rate with less than 30% of labeled instances added, which is our main contribution. We present the results of the statistical tests using critical diagrams. The potential of the proposed multi-label classification method is thus demonstrated, due mainly to the competitive properties with global consistency of the semi-supervised Act-SOM through topological active learning.
Copyright © 2017 Praise Worthy Prize - All rights reserved.
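
The label-uncertainty selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the Act-SOM procedure itself, which the abstract does not detail; it assumes a generic classifier that outputs one probability per label for each unlabeled instance, and it scores uncertainty as the mean binary entropy over labels. The function names, the `budget` parameter and the sample probabilities are hypothetical.

```python
import math


def label_uncertainty(probs):
    """Mean binary entropy (in bits) over the label probabilities of one instance.

    Assumes `probs` is a non-empty list of per-label probabilities in [0, 1].
    """
    def binary_entropy(p):
        # A label predicted with certainty carries no uncertainty.
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    return sum(binary_entropy(p) for p in probs) / len(probs)


def select_most_uncertain(pool_probs, budget):
    """Return the indices of the `budget` most uncertain instances in the pool."""
    scores = [label_uncertainty(p) for p in pool_probs]
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical per-label probabilities for three unlabeled instances.
pool = [
    [0.95, 0.02, 0.90],  # confident on every label
    [0.50, 0.45, 0.55],  # close to 0.5 on every label
    [0.80, 0.30, 0.60],
]
picked = select_most_uncertain(pool, budget=1)
print(picked)
```

In an active learning loop, the selected instances would be sent to an annotator, added to the labeled set, and the classifier retrained before the next selection round.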

Keywords


Multi-Label Learning; Active Learning; SOM; TSVM; Label Uncertainty Strategy


