Genome scanning methods for comparing sequences between groups, with application to HIV vaccine trials
Biometrics (2008)
Abstract
Consider a placebo-controlled preventive HIV vaccine efficacy trial. An HIV amino acid sequence is measured from each volunteer who acquires HIV, and these sequences are aligned together with the reference HIV sequence represented in the vaccine. We develop genome scanning methods to identify positions at which the amino acids in infected vaccine recipient sequences either (A) are more divergent from the reference amino acid than the amino acids in infected placebo recipient sequences or (B) have a different frequency distribution than the placebo sequences, irrespective of a reference amino acid. We consider t-test-type statistics for problem A and Euclidean, Mahalanobis, and Kullback-Leibler-type statistics for problem B. The test statistics incorporate weights to reflect biological information contained in different amino acid positions and mismatches. Position-specific p-values are obtained by approximating the null distribution of the statistics either by a permutation procedure or by nonparametric estimation. A permutation method is used to estimate a cut-off p-value that controls the per-comparison error rate at a prespecified level. The methods are examined in simulations and are applied to two HIV examples. The methods for problem B address the general problem of comparing discrete frequency distributions between groups in a high-dimensional data setting.
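To make problem B concrete, the following is a minimal sketch, not the authors' implementation, of a permutation p-value at a single aligned position. It uses an unweighted squared Euclidean distance between the two groups' amino acid frequency vectors. The function names, the 20-letter alphabet, the number of permutations, and the omission of the biological weights mentioned in the abstract are all assumptions made for illustration.

```python
import random
from collections import Counter

# Assumed alphabet: the 20 standard amino acids (gaps and ambiguity codes are
# not handled separately in this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def frequencies(residues):
    """Relative frequency of each amino acid among the residues at one position."""
    counts = Counter(residues)
    n = len(residues)
    return [counts[a] / n for a in AMINO_ACIDS]

def euclidean_stat(vaccine, placebo):
    """Unweighted squared Euclidean distance between the two frequency vectors."""
    fv = frequencies(vaccine)
    fp = frequencies(placebo)
    return sum((a - b) ** 2 for a, b in zip(fv, fp))

def permutation_pvalue(vaccine, placebo, n_perm=2000, seed=0):
    """Permutation p-value: shuffle group labels and recompute the statistic."""
    rng = random.Random(seed)
    observed = euclidean_stat(vaccine, placebo)
    pooled = list(vaccine) + list(placebo)
    n_v = len(vaccine)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if euclidean_stat(pooled[:n_v], pooled[n_v:]) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Calling `permutation_pvalue` once per aligned position would give the position-specific p-values; choosing the cut-off p-value that controls the per-comparison error rate is a separate step that this sketch does not cover.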
Citation Information
Peter B. Gilbert, Chunyuan Wu, and David Jobes. "Genome scanning methods for comparing sequences between groups, with application to HIV vaccine trials." Biometrics Vol. 64 (2008)
Available at: http://works.bepress.com/peter_gilbert/24/