Open Access Open Access  Restricted Access Subscription or Fee Access

A Heuristic Method Based on Multi-Objective Optimization Concept for Solving RNA Multiple Alignment

Arakil Chentoufi(1*), Abdelhakim El Fatmi(2), Molay Ali Bekri(3), Said Benhlima(4), Mohamed Sabbane(5)

(1) Faculty of Science, Moulay Ismail University, Computer Science Department, MACS Lab, Morocco
(2) Faculty of Science, Moulay Ismail University, Computer Science Department, MACS Lab, Morocco
(3) Faculty of Science, Moulay Ismail University, Computer Science Department, MACS Lab, Morocco
(4) Faculty of Science, Moulay Ismail University, Computer Science Department, MACS Lab, Morocco
(5) Faculty of Science, Moulay Ismail University, Computer Science Department, MACS Lab, Morocco
(*) Corresponding author


DOI: https://doi.org/10.15866/ireaco.v10i5.12395

Abstract


Multiple sequence alignment (MSA) is an NP-complete and important problem in Bioinformatics. For this reason, a number of computational approaches has been developed to achieve the optimal alignment. However, this goal remains a big challenge. MSA can be also treated as a multi-objective optimization problem. In the same way, we present a new method using Pareto Front and Genetic Algorithm (GA), called MOO-RNA, to align a set of RNA sequences. We validate our method on a set of alignments of Bralibase II. The results show that the quality of our method, in terms of Sum-of-Pairs Score (SPS) and Structure Conservation Index (SCI), is improved.
Copyright © 2017 Praise Worthy Prize - All rights reserved.

Keywords


Bioinformatics; Multiple Sequence Alignment; Secondary Structure; Objective Function; Optimization Multi-Objective

Full Text:

PDF


References


M. Höchsmann, B. Voss, and R. Giegerich, “Pure multiple rna secondary structure alignments: a progressive profile approach,” IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), vol. 1, no. 1, pp. 53–62, 2004.
http://dx.doi.org/10.1109/tcbb.2004.11

C. Notredame, “Recent progress in multiple sequence alignment: a survey,” Pharmacogenomics, vol. 3, no. 1, pp. 131–144, 2002.
http://dx.doi.org/10.1517/14622416.3.1.131

S. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of molecular biology, vol. 48, no. 3, pp. 443–453, 1970.
http://dx.doi.org/10.1016/0022-2836(70)90057-4

T. F. Smith and M. S. Waterman, “Identification of common molecular subsequences,” Journal of molecular biology, vol. 147, no. 1, pp. 195– 197, 1981.
http://dx.doi.org/10.1016/0022-2836(81)90087-5

L. Wang and T. Jiang, “On the complexity of multiple sequence alignment,” Journal of computational biology, vol. 1, no. 4, pp. 337–348, 1994.
http://dx.doi.org/10.1089/cmb.1994.1.337

B. Knudsen, “Optimal multiple parsimony alignment with affine gap cost using a phylogenetic tree,” in International Workshop on Algorithms in Bioinformatics. Springer, 2003, pp. 433–446.
http://dx.doi.org/10.1007/978-3-540-39763-2_31

D.-F. Feng and R. F. Doolittle, “Progressive sequence alignment as a prerequisitetto correct phylogenetic trees,” Journal of molecular evolution, vol. 25, no. 4, pp. 351–360, 1987.
http://dx.doi.org/10.1007/bf02603120

O. Gotoh, “Optimal alignment between groups of sequences and its application to multiple sequence alignment,” Computer applications in the biosciences: CABIOS, vol. 9, no. 3, pp. 361–370, 1993.
http://dx.doi.org/10.1093/bioinformatics/9.3.361

A. V. Lukashin, J. Engelbrecht, and S. Brunak, “Multiple alignment using simulated annealing: branch point definition in human mrna splicing,” Nucleic acids research, vol. 20, no. 10, pp. 2511–2516, 1992.
http://dx.doi.org/10.1093/nar/20.10.2511

C. E. Lawrence, S. F. Altschul, M. S. Boguski, J. S. Liu, A. F. Neuwald, J. C. Wootton et al., “Detecting subtle sequence signals: a gibbs sampling strategy for multiple alignment,” SCIENCE-NEW YORK THEN WASHINGTON-, vol. 262, pp. 208–208, 1993.
http://dx.doi.org/10.1126/science.8211139

J. Stoye, S. W. Perrey, and A. Dress, “Improving the divide-and- conquer approach to sum-of-pairs multiple sequence alignment,” Applied mathematics letters, vol. 10, no. 2, pp. 67–73, 1997.
http://dx.doi.org/10.1016/s0893-9659(97)00013-x

O. Gotoh, “Multiple sequence alignment: algorithms and applications,” Advances in biophysics, vol. 36, pp. 159–206, 1999.
http://dx.doi.org/10.1016/s0065-227x(99)80007-0

F. Sievers, A. Wilm, D. Dineen, T. J. Gibson, K. Karplus, W. Li, R. Lopez, H. McWilliam, M. Remmert, J. Söding et al., “Fast, scalable generation of high-quality protein multiple sequence alignments using clustal omega,” Molecular systems biology, vol. 7, no. 1, p. 539, 2011.
http://dx.doi.org/10.1038/msb.2011.75

C. Notredame, D. G. Higgins, and J. Heringa, “T-coffee: A novel method for fast and accurate multiple sequence alignment,” Journal of molecular biology, vol. 302, no. 1, pp. 205–217, 2000.
http://dx.doi.org/10.1006/jmbi.2000.4042

T. Lassmann and E. L. Sonnhammer, “Kalign–an accurate and fast multiple sequence alignment algorithm,” BMC bioinformatics, vol. 6, no. 1, p. 1, 2005.
http://dx.doi.org/10.1093/nar/gkl191

G. Blackshields, F. Sievers, W. Shi, A. Wilm, and D. G. Higgins, “Sequence embedding for fast construction of guide trees for multiple sequence alignment,” Algorithms for Molecular Biology, vol. 5, no. 1, p. 21, 2010.
http://dx.doi.org/10.1186/1748-7188-5-21

J. Söding, “Protein homology detection by hmm–hmm comparison,” Bioinformatics, vol. 21, no. 7, pp. 951–960, 2005.
http://dx.doi.org/10.1093/bioinformatics/bti125

J. D. Thompson, D. G. Higgins, and T. J. Gibson, “Clustal w: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic acids research, vol. 22, no. 22, pp. 4673–4680, 1994.
http://dx.doi.org/10.1093/nar/22.22.4673

X. Huang and W. Miller, “Lalign-find the best local alignments between two sequences,” Adv. Appl. Math, vol. 12, p. 373, 1991.
http://dx.doi.org/10.1093/nar/19.17.4663

R. Muth and U. Manber, “Approximate multiple string search,” in Combinatorial Pattern Matching. Springer, 1996, pp. 75–86.
http://dx.doi.org/10.1007/3-540-61258-0_7

C. Notredame and D. G. Higgins, “Saga: sequence alignment by genetic algorithm,” Nucleic acids research, vol. 24, no. 8, pp. 1515–1524, 1996.
http://dx.doi.org/10.1093/nar/24.8.1515

Z.-J. Lee, S.-F. Su, C.-C. Chuang, and K.-H. Liu, “Genetic algorithm with ant colony optimization (ga-aco) for multiple sequence alignment,” Applied Soft Computing, vol. 8, no. 1, pp. 55–78, 2008.
http://dx.doi.org/10.1016/j.asoc.2006.10.012

S. Wu, M. Lee, Y. Lee, and T. M. Gatton, “Multiple sequence alignment using ga and nn,” International Journal of Signal Processing Image Processing and Pattern Recognition, vol. 1, no. 1, pp. 21–30, 2008.
http://dx.doi.org/10.14257/ijsip.2013.6.6.16

P. Seeluangsawat and P. Chongstitvatana, “A multiple objective evolutionary algorithm for multiple sequence alignment,” in Proceedings of he 7th annual conference on Genetic and evolutionary computation. ACM, 2005, pp. 477–478.
http://dx.doi.org/10.1145/1068009.1068088

M. Abbasi, L. Paquete, and F. B. Pereira, “Local search for multiobjective multiple sequence alignment,” in International Conference on Bioinformatics and Biomedical Engineering. Springer, 2015, pp. 175– 182.
http://dx.doi.org/10.1007/978-3-319-16480-9_18

H. Zhu, Z. He, and Y. Jia, “A novel approach to multiple sequence alignment using multiobjective evolutionary algorithm based on decomposition,” IEEE journal of biomedical and health informatics, vol. 20, no. 2, pp. 717–727, 2016
http://dx.doi.org/10.1109/jbhi.2015.2403397

Á. Rubio-Largo, M. A. Vega-Rodríguez, and D. L. González-Álvarez, “Hybrid multiobjective artificial bee colony for multiple sequence alignment,” Applied Soft Computing, vol. 41, pp. 157–168, 2016.
http://dx.doi.org/10.1016/j.asoc.2015.12.034

M. Kaya, A. Sarhan, and R. Alhajj, “Multiple sequence alignment with affine gap by using multi-objective genetic algorithm,” Computer methods and programs in biomedicine, vol. 114, no. 1, pp. 38–49, 2014.
http://dx.doi.org/10.1016/j.cmpb.2014.01.013

M. Kaya, B. Kaya, and R. Alhajj, “A novel multi-objective genetic algorithm for multiple sequence alignment,” International Journal of Data Mining and Bioinformatics, vol. 14, no. 2, pp. 139–158, 2016.
http://dx.doi.org/10.1504/ijdmb.2016.074684

R. R. Rani and D. Ramyachitra, “Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm,” Biosystems, vol. 150, pp. 177–189, 2016.
http://dx.doi.org/10.1016/j.biosystems.2016.10.005

K. Katoh, K.-i. Kuma, H. Toh, and T. Miyata, “Mafft version 5: improvement in accuracy of multiple sequence alignment,” Nucleic acids research, vol. 33, no. 2, pp. 511–518, 2005.
http://dx.doi.org/10.1093/nar/gki198

R. C. Edgar, “Muscle: multiple sequence alignment with high accuracy and high throughput,” Nucleic acids research, vol. 32, no. 5, pp. 1792– 1797, 2004.
http://dx.doi.org/10.1093/nar/gkh340

S. R. Sneath PHA, “Numerical taxonomy,” Pharmacogenomics, 1973.
http://dx.doi.org/10.1006/rwgn.2001.1272

I. L. Hofacker, “Vienna rna secondary structure server,” Nucleic acids research, vol. 31, no. 13, pp. 3429–3431, 2003.
http://dx.doi.org/10.1093/nar/gkg599

P. P. Gardner, A. Wilm, and S. Washietl, “A benchmark of multiple sequence alignment programs upon structural rnas,” Nucleic acids research, vol. 33, no. 8, pp. 2433–2439, 2005.
http://dx.doi.org/10.1093/nar/gki541

J. D. Thompson, F. Plewniak, and O. Poch, “Balibase: a benchmark alignment database for the evaluation of multiple alignment programs.” Bioinformatics, vol. 15, no. 1, pp. 87–88, 1999.
http://dx.doi.org/10.1093/bioinformatics/15.1.87

I. L. Hofacker, M. Fekete, and P. F. Stadler, “Secondary structure prediction for aligned rna sequences,” Journal of molecular biology, vol. 319, no. 5, pp. 1059–1066, 2002.
http://dx.doi.org/10.1016/s0022-2836(02)00308-x

K. Katoh and H. Toh, “Parttree: an algorithm to build an approximate tree from a large number of unaligned sequences,” Bioinformatics, vol. 23, no. 3, pp. 372–374, 2007.
http://dx.doi.org/10.1093/bioinformatics/btl592

C. Grasso and C. Lee, “Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems,” Bioinformatics, vol. 20, no. 10, pp. 1546–1556, 2004.
http://dx.doi.org/10.1093/bioinformatics/bth126

J. Pei, R. Sadreyev, and N. V. Grishin, “Pcma: fast and accurate multiple sequence alignment based on profile consistency,” Bioinformatics, vol. 19, no. 3, pp. 427–428, 2003.
http://dx.doi.org/10.1093/bioinformatics/btg008

A. Richardson, “Nonparametric statistics for non-statisticians: A step-by-step approach by gregory w. corder, dale i. foreman,” International Statistical Review, vol. 78, no. 3, pp. 451–452, 2010.
http://dx.doi.org/10.1111/j.1751-5823.2010.00122_6.x


Refbacks

  • There are currently no refbacks.



Please send any question about this web site to info@praiseworthyprize.com
Copyright © 2005-2019 Praise Worthy Prize