This paper describes a method for estimating the performance of a multiple screening test when those who test negative do not have their true disease status determined. The methodology is motivated by a dataset of 49,927 subjects who were each given K = 6 binary tests for bowel cancer. A complicating factor is that individuals may have polyps present in the bowel, a condition that the screening test is not designed to detect but which may be worth diagnosing. The methodology is based on a multinomial logit model for Pr(S|R_6), the probability distribution of patient status S (healthy, polyps or diseased) conditional on the results R_6 from the six binary tests. An advantage of the methodology is that the modeling is data driven. In particular, we require no assumptions about (i) correlation within subjects, (ii) the relative sensitivity of the K tests, or (iii) the conditional independence of the tests. The model leads to simple estimates of the trade-off between different errors as the number of tests is varied, which are presented graphically using ROC curves. Finally, the model allows us to estimate better protocols for assigning subjects to the disease group, as well as the gains in accuracy from these protocols.
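For orientation, a multinomial logit model for a three-category status generically takes the form sketched below. The linear predictors \eta_s and the choice of "healthy" as the reference category are notation assumed here for illustration; they are not taken from the paper, and the way \eta_s depends on R_6 is left unspecified.

```latex
\Pr(S = s \mid R_6)
  = \frac{\exp\{\eta_s(R_6)\}}
         {\sum_{t \in \{\text{healthy},\,\text{polyps},\,\text{diseased}\}} \exp\{\eta_t(R_6)\}},
\qquad
\eta_{\text{healthy}}(R_6) \equiv 0 .
```

Under this form the three probabilities sum to one for every result pattern R_6, and each non-reference predictor is the log-odds of that status relative to healthy.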
- ROC curves
Available at: http://works.bepress.com/chris_lloyd/1/