Article
Variable importance in matched case-control studies in settings of high-dimensional data
Journal of the Royal Statistical Society, Series C (2014)
  • Raji Balasubramanian, University of Massachusetts - Amherst
  • E. Andres Houseman
  • Brent A Coull
  • M H Lev
  • L H Schwamm
  • Rebecca A Betensky
Abstract

We propose a method for assessing variable importance in matched case-control investigations and other highly stratified studies characterized by high-dimensional data (p >> n). In simulated and real datasets, we show that the proposed algorithm performs better than a conventional univariate method (conditional logistic regression) and a popular multivariable algorithm (Random Forests) that does not take the matching into account. The methods are applicable to wide-ranging, high-impact clinical studies, including metabolomic and proteomic studies and neuroimaging analyses, such as those assessing stroke and Alzheimer’s disease. The methods proposed in this paper have been implemented in a freely available R library (http://cran.r-project.org/web/packages/RPCLR/index.html).

Publication Date
2014
Citation Information
Raji Balasubramanian, E. Andres Houseman, Brent A Coull, M H Lev, et al. "Variable importance in matched case-control studies in settings of high-dimensional data" Journal of the Royal Statistical Society, Series C (2014)
Available at: http://works.bepress.com/raji_balasubramanian/25/