Article
Island method for estimating the statistical significance of profile-profile alignment scores
BMC Bioinformatics
  • Aleksandar Poleksic, University of Northern Iowa
Document Type
Article
Disciplines
Abstract

Background: In the last decade, a significant improvement in detecting remote similarity between protein sequences has been made by utilizing alignment profiles in place of amino-acid strings. Unfortunately, no analytical theory is available for estimating the significance of a gapped alignment of two profiles. Many experiments suggest that the distribution of local profile-profile alignment scores is of the Gumbel form. However, estimating distribution parameters by random simulations turns out to be computationally very expensive.

Results: We demonstrate that the background distribution of profile-profile alignment scores heavily depends on profiles' composition and thus the distribution parameters must be estimated independently, for each pair of profiles of interest. We also show that accurate estimates of statistical parameters can be obtained using the "island statistics" for profile-profile alignments.

Conclusion: The island statistics can be generalized to profile-profile alignments to provide an efficient method for the alignment score normalization. Since multiple island scores can be extracted from a single comparison of two profiles, the island method has a clear speed advantage over the direct shuffling method for comparable accuracy in parameter estimates.
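
As a rough illustration of the idea summarized above, the sketch below estimates the two Gumbel parameters from a pool of island peak scores and converts a raw score to a P-value. It uses the estimator commonly given for the island method on integer-valued sequence alignment scores; whether the article uses exactly this form for profile-profile alignments is an assumption. The function names, the cutoff, and the "search area" argument (the summed product of profile lengths over the comparisons that produced the islands) are hypothetical and chosen only for this example.

```python
import math

def estimate_gumbel_params(island_scores, cutoff, search_area):
    """Island-style estimates of Gumbel lambda and K.

    Assumes integer (lattice) scores; only islands scoring at or
    above `cutoff` are used. `search_area` is the total m * n area
    of the comparisons the islands were collected from.
    """
    tail = [s for s in island_scores if s >= cutoff]
    if not tail:
        raise ValueError("no island scores at or above the cutoff")
    mean_excess = sum(tail) / len(tail) - cutoff
    if mean_excess <= 0:
        raise ValueError("all retained scores equal the cutoff; lower it")
    # Maximum-likelihood decay rate for a geometric-like score tail.
    lam = math.log(1.0 + 1.0 / mean_excess)
    # K rescales the island count observed at the cutoff to unit area.
    k = len(tail) * math.exp(lam * cutoff) / search_area
    return lam, k

def p_value(score, lam, k, search_area):
    """Gumbel tail probability P(S >= score) for one comparison."""
    expected = k * search_area * math.exp(-lam * score)  # E-value
    return -math.expm1(-expected)  # 1 - exp(-E), stable for small E

# Toy island scores, invented for illustration only.
lam, k = estimate_gumbel_params([10, 11, 12, 13], cutoff=10, search_area=1000)
p = p_value(25, lam, k, search_area=1000)
```

Because one comparison of two profiles yields many islands, a pool like the toy list above can be filled from far fewer comparisons than direct shuffling would need, which is the speed advantage the abstract refers to.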

Department
Department of Computer Science
Comments

First published in BMC Bioinformatics, v. 10 n. 112, (2009), 12 pages, published by BioMed Central Ltd. DOI: https://doi.org/10.1186/1471-2105-10-112.

Original Publication Date
1-1-2009
DOI of published version
10.1186/1471-2105-10-112
Repository
UNI ScholarWorks, University of Northern Iowa, Rod Library
Date Digital
2009
Copyright
©2009 Aleksandar Poleksic. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Creative Commons License
Creative Commons Attribution 4.0
Language
EN
File Format
application/pdf
Citation Information
On article: "This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited." SHERPA/RoMEO website: publisher's version can be archived in IR. 3/2/2018