The ability to identify protein binding sites and to detect specific amino acid residues that contribute to the specificity and affinity of protein interactions has important implications for problems ranging from rational drug design to the analysis of metabolic and signal transduction networks. Support vector machines (SVMs) and related kernel methods offer an attractive approach to predicting protein binding sites. An appropriate choice of kernel function is critical to the performance of an SVM, and kernel functions offer a way to incorporate domain-specific knowledge into the classifier. We compare the performance of three types of kernel functions (identity kernel, sequence-alignment kernel, and amino acid substitution matrix kernel) for predicting protein-protein, protein-DNA, and protein-RNA binding sites. The results show that the identity kernel is quite effective on all three tasks, and that the substitution kernel, based on amino acid substitution matrices that take into account structural or evolutionary conservation or physicochemical properties of amino acids, yields a modest improvement in the performance of the resulting SVM classifiers.
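To make the distinction between the identity kernel and the substitution matrix kernel concrete, the following is a minimal sketch, not the exact formulation used in this work. It assumes that each residue is represented by a fixed-length window of neighboring residues and that a kernel value is a sum of per-position residue similarities. The function names and the three-letter `TOY_SCORES` table are hypothetical; the scores are made up for illustration and are not taken from any published substitution matrix.

```python
from typing import Dict, Tuple


def identity_kernel(x: str, y: str) -> int:
    """Count the positions at which two equal-length windows share a residue."""
    if len(x) != len(y):
        raise ValueError("windows must have equal length")
    return sum(1 for a, b in zip(x, y) if a == b)


def substitution_kernel(x: str, y: str,
                        scores: Dict[Tuple[str, str], float]) -> float:
    """Sum per-position similarity scores looked up in a substitution table."""
    if len(x) != len(y):
        raise ValueError("windows must have equal length")
    return sum(scores[(a, b)] for a, b in zip(x, y))


# Hypothetical, made-up similarity scores over a toy alphabet {A, L, K}.
# The table is symmetric and positive semidefinite, so the summed kernel is valid.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("A", "A"): 2.0, ("L", "L"): 2.0, ("K", "K"): 2.0,
    ("A", "L"): 1.0, ("L", "A"): 1.0,
    ("A", "K"): 0.0, ("K", "A"): 0.0,
    ("L", "K"): 0.0, ("K", "L"): 0.0,
}

if __name__ == "__main__":
    # Only the middle position matches exactly, but A and L are scored as similar.
    print(identity_kernel("ALK", "LLA"))
    print(substitution_kernel("ALK", "LLA", TOY_SCORES))
```

Under these assumptions, the identity kernel credits only exact residue matches, whereas the substitution kernel also gives partial credit to pairs of similar residues. A real substitution matrix is not guaranteed to be positive semidefinite, so it may need to be transformed before it can serve as a valid kernel.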
Available at: http://works.bepress.com/drena-dobbs/49/