A genetic algorithm is proposed as an alternative to the traditional linear programming method for scoring covariance models in non-coding RNA (ncRNA) gene searches. The standard method is guaranteed to find the best score, but it is too slow for general use. Most of the search space examined by the linear programming method bears no resemblance to any sequence observed in real sequence data, which motivates the use of genetic algorithms (GAs) to quickly reject such regions of the search space. Because the search space has many local minima, gradient descent is an unattractive alternative. It is shown that a fixed-length representation for the alignment of two sequences, taken from the protein threading literature, can be adapted for use with covariance models.
This document was originally published by IEEE in 2006 IEEE Mountain Workshop on Adaptive and Learning Systems. Copyright restrictions may apply. DOI: 10.1109/SMCALS.2006.250691
Available at: http://works.bepress.com/jennifer_smith/3/