Algorithmic Improvements in Pattern Matching

L. Dudás(1*)

(1) Department of Information Engineering, University of Miskolc, Hungary
(*) Corresponding author


DOI's assignment:
the author of the article can submit here a request for assignment of a DOI number to this resource!
Cost of the service: euros 10,00 (for a DOI)

Abstract


This paper shows two algorithmic improvements for pattern matching introducing two new heuristics. For test purposes the author implemented three well-known heuristics to ensure the O(n) worst case time, the O(n log( (m) /m) average case time and the O(n/m) best case time of searching for an m length pattern in an n length text that use a ( letter alphabet. The first new heuristic predicts the better search direction in the pattern preprocessing phase taking into consideration the unsymmetrical property of the pattern and using it in the search process. This method assumes that the full text resides in the RAM and exploits the fact that the memory reading is not direction dependent. The paper gives the prediction function for uniform character distribution and for natural language text alike and in addition presents a generalized form. The second new heuristic is effective on natural language texts and achieves a longer average jump than the pattern length, making compromise on completeness of search. The method exploits the fact that minimal distances can be determined between occurrences of different substrings for a language by a statistical preprocessing. Tests proved that both heuristics result in fewer jumps and character comparisons in the search phase of the pattern matching.
Copyright © 2018 Praise Worthy Prize - All rights reserved.

Keywords


Pattern Matching; Search Direction Prediction; DNA Search; Natural Language Text; Long Jumps

Full Text:

PDF


Refbacks

  • There are currently no refbacks.



Please send any question about this web site to info@praiseworthyprize.com
Copyright © 2005-2019 Praise Worthy Prize