Position-weight matrices (PWMs) are widely used to locate transcription factor binding sites in DNA sequences. Most existing PWMs offer low sensitivity and low specificity. We present a new computational algorithm, a modification of the Staden–Bucher approach, that improves a PWM. We applied the proposed technique to the PWM of the GC-box, the binding site for Sp1. A comparison of the old and new PWMs shows that the new one increases both sensitivity and specificity. The statistical parameters of the GC-box distribution in promoter regions, in the human genome as a whole, and in each chromosome are presented. Most commonly used PWMs are 4-row mononucleotide matrices, although 16-row dinucleotide matrices are known to be more informative. The algorithm efficiently determines 16-row matrices, and preliminary results show that such matrices perform better than 4-row matrices.
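For readers unfamiliar with how a 4-row mononucleotide PWM is used to locate candidate sites, the following is a minimal sketch of generic log-odds scanning. It is not the modified Staden–Bucher algorithm described in the article. The function names `log_odds_pwm` and `scan`, the uniform background frequency, the pseudocount, the threshold, and the toy counts are all assumptions made for illustration; the toy motif is not the article's GC-box matrix.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2-odds scores.

    Assumes a uniform background and a flat pseudocount (illustrative choices).
    Returns one dict per motif position, mapping base -> score.
    """
    pwm = []
    for column in counts:
        total = sum(column[b] + pseudocount for b in BASES)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (offset, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[j][ch] for j, ch in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits

# Toy counts for a hypothetical 4-position motif "GGCG" (not real GC-box data).
toy_counts = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
]
toy_pwm = log_odds_pwm(toy_counts)
hits = scan("ATGGCGTTAGGCGA", toy_pwm, threshold=4.0)
for offset, score in hits:
    print(offset, round(score, 2))
```

Raising the threshold trades sensitivity for specificity, which is the balance the abstract refers to. A 16-row dinucleotide matrix would follow the same pattern, with each column indexed by a pair of adjacent bases instead of a single base.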
Available at: http://works.bepress.com/naum_gershenzon/30/
© Gershenzon, et al. 2005. Published by Oxford University Press. All rights reserved.
The following article appeared in Nucleic Acids Research 33(7), and may be found at http://nar.oxfordjournals.org/content/33/7/2290.full