"Glycosylation site prediction using ensembles of Support Vector Machine classifiers" by Cornelia Caragea

Selected Works of Drena Dobbs

Article

Glycosylation site prediction using ensembles of Support Vector Machine classifiers

BMC Bioinformatics

Cornelia Caragea, Iowa State University
Jivko Sinapov, Iowa State University
Adrian Silvescu, Iowa State University
Drena Dobbs, Iowa State University
Vasant Honavar, Iowa State University

Document Type

Article

Publication Version

Published Version

Publication Date

1-1-2007

DOI

10.1186/1471-2105-8-438

Abstract

Background: Glycosylation is one of the most complex post-translational modifications (PTMs) of proteins in eukaryotic cells. Glycosylation plays an important role in biological processes ranging from protein folding and subcellular localization to ligand recognition and cell-cell interactions. Experimental identification of glycosylation sites is expensive and laborious. Hence, there is significant interest in the development of computational methods for reliable prediction of glycosylation sites from amino acid sequences.

Results: We explore machine learning methods for training classifiers to predict the amino acid residues that are likely to be glycosylated using information derived from the target amino acid residue and its sequence neighbors. We compare the performance of Support Vector Machine classifiers and ensembles of Support Vector Machine classifiers trained on a dataset of experimentally determined N-linked, O-linked, and C-linked glycosylation sites extracted from O-GlycBase version 6.00, a database of 242 proteins from several different species. The results of our experiments show that the ensembles of Support Vector Machine classifiers outperform single Support Vector Machine classifiers on the problem of predicting glycosylation sites in terms of a range of standard measures for comparing the performance of classifiers. The resulting methods have been implemented in EnsembleGly, a web server for glycosylation site prediction.

Conclusion: Ensembles of Support Vector Machine classifiers offer an accurate and reliable approach to automated identification of putative glycosylation sites in glycoprotein sequences.
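The approach described in the Results, classifying a target residue from the residue itself and its sequence neighbors and then combining several classifiers, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article or from EnsembleGly: the function names, the window half-width, the `-` padding symbol, the one-hot encoding, and the majority-vote rule. The abstract does not state how the ensemble combines its members, and the placeholder classifiers below are plain callables, not trained Support Vector Machines.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # assumed padding symbol for positions beyond the sequence ends


def extract_windows(sequence, targets="ST", half_width=3):
    """Return (position, window) pairs centered on each candidate residue."""
    padded = PAD * half_width + sequence + PAD * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # Residue i sits at padded index i + half_width, so the slice
            # below is centered on it and has length 2 * half_width + 1.
            windows.append((i, padded[i:i + 2 * half_width + 1]))
    return windows


def one_hot(window):
    """Encode a window as a flat 0/1 vector, 20 entries per position.

    Padding positions are encoded as all zeros.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def majority_vote(classifiers, features):
    """Combine binary (0/1) predictions from several classifiers."""
    votes = sum(clf(features) for clf in classifiers)
    return 1 if 2 * votes > len(classifiers) else 0


# Hypothetical usage: candidate Ser/Thr sites in a made-up sequence.
windows = extract_windows("MKTSAPNGS", targets="ST", half_width=2)
features = [one_hot(window) for _, window in windows]

# Placeholder classifiers standing in for trained ensemble members.
placeholder_members = [lambda f: 1, lambda f: 1, lambda f: 0]
predictions = [majority_vote(placeholder_members, f) for f in features]
```

In practice each ensemble member would be a classifier trained on labeled windows; the callables here only show where such models would plug in.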

Comments

This article is from BMC Bioinformatics 8 (2007): 438, doi: 10.1186/1471-2105-8-438. Posted with permission.

Rights

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Caragea et al

2007

File Format

application/pdf

Citation Information

Cornelia Caragea, Jivko Sinapov, Adrian Silvescu, Drena Dobbs, et al. "Glycosylation site prediction using ensembles of Support Vector Machine classifiers" BMC Bioinformatics Vol. 8 (2007) p. 438
Available at: http://works.bepress.com/drena-dobbs/20/