We analyze the characteristics of protein–protein interfaces using the largest datasets available from the Protein Data Bank (PDB). We start by comparing interfaces with protein cores and non-interface surfaces. The results show that interfaces differ from protein cores and non-interface surfaces in residue composition, sequence entropy, and secondary structure. Because interfaces, protein cores, and non-interface surfaces have different solvent accessibilities, it is important to determine whether the observed differences arise from solvent accessibility or from function. We separate out the effect of solvent accessibility by comparing interfaces with a set of residues that have the same solvent accessibility as the interfaces. This strategy reveals residue distribution propensities that are not observable when interfaces are compared with protein cores and non-interface surfaces. We conclude that interfaces are enriched in hydrophobic residues, particularly aromatic residues, and that the interactions apparently favored in interfaces include pairs of oppositely charged residues and hydrophobic pairs. Surprisingly, Pro-Trp pairs are overrepresented in interfaces, presumably because of favorable geometries. The analysis is repeated using three datasets with different constraints on sequence similarity and structure quality, and consistent results are obtained across these datasets. We have also investigated the characteristics of heteromeric and homomeric interfaces separately.
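The accessibility-matched comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of relative solvent accessibility (RSA) values in [0, 1], the bin width of 0.2, the sampling with replacement, and the log2 frequency ratio as the propensity measure are all assumptions made for illustration, and the residues are toy data.

```python
import math
import random
from collections import Counter, defaultdict


def rsa_bin(rsa, bin_width=0.2):
    """Map a relative solvent accessibility value to an integer bin index."""
    return int(rsa / bin_width)


def matched_reference(interface, pool, bin_width=0.2, seed=0):
    """Draw one pool residue (with replacement) from the same RSA bin as each
    interface residue, so the reference set mirrors the interface's
    accessibility distribution. Inputs are lists of (residue_name, rsa)."""
    rng = random.Random(seed)
    by_bin = defaultdict(list)
    for aa, rsa in pool:
        by_bin[rsa_bin(rsa, bin_width)].append(aa)
    reference = []
    for _, rsa in interface:
        candidates = by_bin.get(rsa_bin(rsa, bin_width))
        if candidates:  # interface residues with an empty pool bin are skipped
            reference.append(rng.choice(candidates))
    return reference


def log_propensity(interface_aas, reference_aas):
    """log2 of (frequency in interface / frequency in reference) per residue
    type; types absent from the reference set are omitted."""
    fi, fr = Counter(interface_aas), Counter(reference_aas)
    ni, nr = len(interface_aas), len(reference_aas)
    return {
        aa: math.log2((fi[aa] / ni) / (fr[aa] / nr))
        for aa in fi
        if fr[aa] > 0
    }


# Toy data (hypothetical): each RSA bin of the pool holds one residue type,
# so the sampled reference set does not depend on the random seed.
interface = [("TRP", 0.1), ("TRP", 0.3), ("LEU", 0.5), ("LYS", 0.7)]
pool = [("TRP", 0.05), ("LEU", 0.25), ("LEU", 0.45), ("LYS", 0.65)]

reference = matched_reference(interface, pool)
scores = log_propensity([aa for aa, _ in interface], reference)
print(scores)  # {'TRP': 1.0, 'LEU': -1.0, 'LYS': 0.0}
```

A positive score means the residue type is more frequent in the interface than in the accessibility-matched reference set, and a negative score means it is less frequent. The same ratio could in principle be computed over residue pairs rather than single residues.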
Available at: http://works.bepress.com/drena-dobbs/22/