Estimating linear regression models in the presence of a censored independent variable
Abstract
The current study examined the impact of a censored independent variable, after adjusting for a second independent variable, when estimating regression coefficients using naïve ordinary least squares (OLS), partial OLS and full-likelihood models. We used Monte Carlo simulations to determine the bias associated with all three regression methods. We demonstrated that substantial bias was introduced in the estimation of the regression coefficient associated with the variable subject to a ceiling effect when naïve OLS regression was used. Furthermore, minor bias was transmitted to the estimation of the regression coefficient associated with the second independent variable. High correlation between the two independent variables improved estimation of the censored variable's coefficient at the expense of estimation of the other coefficient. Partial OLS and maximum-likelihood estimation were shown to result in, at most, negligible bias in estimation. Furthermore, we demonstrated that the full-likelihood method was robust under misspecification of the joint distribution of the independent random variables. Lastly, we provided an empirical example using National Population Health Survey (NPHS) data to demonstrate the practical implications of our main findings and the simple methods available to circumvent the bias identified in the Monte Carlo simulations. Our results suggest that researchers need to be aware of the bias associated with naïve OLS estimation when estimating regression models in which at least one independent variable is subject to a ceiling effect.
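The kind of Monte Carlo comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation design: the sample size, coefficients, ceiling value, correlation and the function name `simulate_bias` are all assumptions chosen for illustration. The sketch also assumes that "partial OLS" means OLS restricted to observations whose independent variable is not censored, which is a reading the abstract does not spell out. The full-likelihood method is not implemented here.

```python
import numpy as np


def simulate_bias(n=2000, reps=200, rho=0.5, beta=(1.0, 2.0, 3.0),
                  ceiling=0.5, seed=0):
    """Estimate the bias of naive and partial OLS when x1 has a ceiling.

    All defaults are illustrative assumptions, not values from the paper.
    Returns (naive_bias, partial_bias), each ordered as
    (intercept, x1 coefficient, x2 coefficient).
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    naive = np.empty((reps, 3))
    partial = np.empty((reps, 3))

    for r in range(reps):
        # Two correlated standard-normal independent variables.
        x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        x1, x2 = x[:, 0], x[:, 1]
        y = beta[0] + beta[1] * x1 + beta[2] * x2 + rng.normal(size=n)

        # Ceiling effect: values of x1 above the ceiling are recorded
        # as the ceiling itself.
        x1_obs = np.minimum(x1, ceiling)
        design = np.column_stack([np.ones(n), x1_obs, x2])

        # Naive OLS: treat the censored values as if they were exact.
        naive[r] = np.linalg.lstsq(design, y, rcond=None)[0]

        # Partial OLS (assumed meaning): drop the censored observations.
        keep = x1 < ceiling
        partial[r] = np.linalg.lstsq(design[keep], y[keep], rcond=None)[0]

    return naive.mean(axis=0) - beta, partial.mean(axis=0) - beta


if __name__ == "__main__":
    naive_bias, partial_bias = simulate_bias()
    print("naive OLS bias   (b0, b1, b2):", np.round(naive_bias, 3))
    print("partial OLS bias (b0, b1, b2):", np.round(partial_bias, 3))
```

Under these assumed settings, the naive fit should show a clearly nonzero bias in the coefficient of the censored variable and a smaller bias in the second coefficient, while the fit restricted to uncensored observations should show bias close to zero. Varying `rho` gives a rough feel for how correlation between the two independent variables shifts bias from one coefficient to the other.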
Suggested Citation
Peter C. Austin. "Estimating linear regression models in the presence of a censored independent variable" Statistics in Medicine 23 (2004): 411-429.