"Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach" by Carson Andorf

Selected Works of Drena Dobbs

Article

Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach

BMC Bioinformatics

Carson Andorf, Iowa State University
Drena Dobbs, Iowa State University
Vasant Honavar, Iowa State University

Document Type

Article

Publication Version

Published Version

Publication Date

1-1-2007

DOI

10.1186/1471-2105-8-284

Abstract

Background

Incorrectly annotated sequence data are becoming more commonplace as databases increasingly rely on automated techniques for annotation. Hence, there is an urgent need for computational methods for checking consistency of such annotations against independent sources of evidence and detecting potential annotation errors. We show how a machine learning approach designed to automatically predict a protein's Gene Ontology (GO) functional class can be employed to identify potential gene annotation errors.

Results

In a set of 211 previously annotated mouse protein kinases, we found that 201 of the GO annotations returned by AmiGO appear to be inconsistent with the UniProt functions assigned to their human counterparts. In contrast, 97% of the predicted annotations generated using a machine learning approach were consistent with the UniProt annotations of the human counterparts, as well as with available annotations for these mouse protein kinases in the Mouse Kinome database.

Conclusion

We conjecture that most of our predicted annotations are, therefore, correct and suggest that the machine learning approach developed here could be routinely used to detect potential errors in GO annotations generated by high-throughput gene annotation projects.

Comments

This article is from BMC Bioinformatics 8 (2007): 284, doi: 10.1186/1471-2105-8-284. Posted with permission.

Rights

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Andorf et al

2007

File Format

application/pdf

Citation Information

Carson Andorf, Drena Dobbs and Vasant Honavar. "Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach" BMC Bioinformatics Vol. 8 (2007) p. 284
Available at: http://works.bepress.com/drena-dobbs/23/