A New Gene Selection Method Based on Maximum Correlation and Minimum Redundancy

M. Ebrahimpour; H. Mahmoodian; R. Ghayour

doi:10.15866/irena.v1i3.6621

M. Ebrahimpour^(1*), H. Mahmoodian^(2), R. Ghayour^(3)

^(*) Corresponding author

Abstract

Microarray technology is a powerful tool for analyzing the behavior of thousands of genes simultaneously, and it plays an important role in diagnosis, detection and treatment methods. Standard statistical methods are not suited to classification and diagnosis when the number of genes is far greater than the number of samples, as is typical of microarray data. Selecting a small set of genes with high potential for classification is thus an important goal of microarray data analysis. In this paper, we propose a new feature selection method based on maximum correlation and minimum redundancy (MCMR). We evaluate the performance of MCMR on three microarray data sets: the colon cancer, breast cancer and DLBCL data sets. In general, MCMR can significantly reduce the number of selected genes and performs better than SNR, PCC and Fisher score.
Copyright © 2013 Praise Worthy Prize - All rights reserved.
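
The abstract does not state the MCMR criterion itself, so the following is only a minimal sketch of the general idea of trading correlation with the class label against redundancy among already selected genes. It assumes absolute Pearson correlation for both quantities and a greedy "relevance minus mean redundancy" score; these choices, the function names `pearson` and `select_genes`, and the toy data are illustrative assumptions, not the authors' method.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_genes(X, y, k):
    """Greedily pick k gene indices from X (rows = samples, columns = genes).

    Assumed score (not taken from the paper):
    |corr(gene, label)| minus the mean |corr(gene, selected gene)|.
    """
    n_genes = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_genes)]
    relevance = [abs(pearson(c, y)) for c in cols]
    # Start with the single gene most correlated with the class label.
    selected = [max(range(n_genes), key=lambda j: relevance[j])]
    while len(selected) < min(k, n_genes):
        best, best_score = None, -math.inf
        for j in range(n_genes):
            if j in selected:
                continue
            redundancy = sum(abs(pearson(cols[j], cols[s])) for s in selected) / len(selected)
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


# Toy data (hypothetical): 8 samples, 4 genes.
# Gene 1 is a noisy copy of gene 0; gene 2 is less relevant but less redundant;
# gene 3 is uncorrelated with the label.
X = [
    [4, 4.5, 5.5, 6],
    [2, 1.5, 5.5, 4],
    [4, 3.5, 2.5, 4],
    [2, 2.5, 2.5, 6],
    [8, 8.5, 7.5, 6],
    [6, 5.5, 7.5, 4],
    [8, 7.5, 4.5, 4],
    [6, 6.5, 4.5, 6],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

print(select_genes(X, y, 2))  # [0, 2]
```

On this toy data, ranking by correlation with the label alone would pick genes 0 and 1, which carry nearly the same information; the redundancy penalty makes the sketch prefer gene 2 as the second gene.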

Keywords

Gene Selection; Classification; Correlation; Redundancy
