Local Similarities Approximation in DNA Sequences Based on Pairwise Sequence Aligner Algorithm

N. Al-Shanableh; H. Al-Zoubi; M. Al Rababaa

doi:10.15866/irecos.v8i1.2712

N. Al-Shanableh^(1*), H. Al-Zoubi^(2), M. Al Rababaa^(3)

^(*) Corresponding author

Abstract

Sequence alignment is a way of arranging primary sequences of DNA, RNA, or protein to identify regions of similarity. These regions may be a consequence of functional, structural, or evolutionary relationships between the sequences. This paper proposes an algorithm for finding approximate local similarities in DNA sequences (AFALS-N). The algorithm finds similarities between two sequences by generating all possible words in the first sequence and then finding their exact matches in the second sequence. Selecting among the obtained results is essential because it reduces the number of candidate results, which in turn reduces the searching time. Results show that the proposed algorithm reduces the searching time to an average of 20% relative to the PatternHunter algorithm. The objective of the work was met by maintaining a balance among execution time, seed size, and sensitivity. Improved execution time with 66% sensitivity is obtained using the same word size as in other algorithms.
Copyright © 2013 Praise Worthy Prize - All rights reserved.
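
As a rough illustration of the word-based step described in the abstract (generate every word of the first sequence, then look for exact matches in the second), the following Python sketch indexes fixed-length words and reports matching positions. It is not the authors' AFALS-N implementation: the function names `index_words` and `exact_word_matches`, the word size `k`, and the example sequences are assumptions made for illustration, and the result-selection step mentioned in the abstract is omitted.

```python
from collections import defaultdict


def index_words(seq, k):
    """Map every length-k word of seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def exact_word_matches(first, second, k):
    """Return (i, j) pairs such that first[i:i+k] == second[j:j+k].

    Hypothetical helper: words come from `first`, exact matches are
    looked up while scanning `second`.
    """
    index = index_words(first, k)
    hits = []
    for j in range(len(second) - k + 1):
        # .get avoids inserting new keys into the defaultdict
        for i in index.get(second[j:j + k], ()):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Illustrative sequences only, not data from the paper.
    for i, j in exact_word_matches("ACGTACGT", "TTACGTAA", 4):
        print(f"word at first[{i}] matches second[{j}]")
```

Each hit could serve as a seed for a later extension step. How such seeds are filtered or extended in AFALS-N is not stated in the abstract, so the sketch leaves that out.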

Keywords

DNA Sequences; Pairwise Alignment; String Matching; PatternHunter Algorithm; AFALS-N Algorithm
