## Abstract

Background: In nutritional epidemiology, it is common to fit models in which several dietary variables are included. However, with standard instruments for dietary assessment, not only are the intakes of many nutrients often highly correlated, but the errors in the estimation of the intake of different nutrients are also correlated. The effect of this error correlation on the results of observational studies has been little investigated. This paper describes the effect on multivariate regression coefficients of different levels of correlation, both between the variables themselves and between the errors of estimation of these variables.

Methods: Using a simple model for the multivariate error structure, we examine the effect on the estimates of bivariate linear regression coefficients of (1) differential precision of measurement of the two independent variables, (2) differing levels of correlation between the true values of the two variables, and (3) differing levels of correlation between the errors of measurement of the two variables. As an example, the prediction of plasma vitamin C levels by dietary intake variables is considered, using data from the European Prospective Investigation of Cancer (EPIC) Norfolk study in which dietary intake was estimated using both a food frequency questionnaire (FFQ) and a 7-day diary (7DD). The dietary variables considered are vitamin C, fat, and energy, with different approaches taken to energy adjustment.
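The abstract does not spell out the error model. As an illustration only, one common choice is the classical additive measurement error model, assumed here rather than taken from the paper. The symbols $W_j$, $X_j$, $U_j$, $\Sigma_X$ and $\Sigma_U$ are notation introduced for this sketch: $W_j$ is the measured intake, $X_j$ the true intake, and $U_j$ the error, with errors taken to be independent of the true values and of the outcome residual $\varepsilon$.

$$
W_j = X_j + U_j \quad (j = 1, 2), \qquad Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon
$$

Under these assumptions, regressing $Y$ on the measured values $(W_1, W_2)$ gives coefficients that converge to

$$
\boldsymbol{\beta}^{*} = (\Sigma_X + \Sigma_U)^{-1} \, \Sigma_X \, \boldsymbol{\beta},
$$

where $\Sigma_X$ is the covariance matrix of the true values and $\Sigma_U$ that of the errors. The three factors listed above map onto the diagonal of $\Sigma_U$ (precision), the off-diagonal of $\Sigma_X$ (true correlation), and the off-diagonal of $\Sigma_U$ (error correlation).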

Results: When the error correlation is zero, the estimates of the bivariate regression coefficients reflect the precision of measurement of the two variables and mutual confounding. The sum of the observed regression coefficients is biased towards the null as in univariate regression. When the error correlation is non-zero but below about 0.7, the effect is minor. However, as the error correlation increases beyond 0.8, the effect becomes large and highly dependent on the relative precision with which the two variables are measured. At the extreme, the bivariate estimates can become indefinitely large. In the example, the error correlation between fat and energy using the FFQ appears to be over 0.9, the corresponding value for the 7DD being approximately 0.85. The error correlation between vitamin C and fat, and vitamin C and energy, appears to be below 0.5 and smaller for the 7DD than for the FFQ. The impact of these error correlations on bivariate regression coefficients is large. The effect of energy adjustment differs widely between vitamin C and fat.
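The dependence on error correlation described above can be explored numerically. The following is a minimal sketch, assuming a classical additive error model (measured value = true value + error, with errors independent of the true values and of the outcome residual) and true variables scaled to unit variance. The function name `observed_coefficients` and all parameter values are hypothetical choices for illustration; they are not the paper's model or data.

```python
import numpy as np


def observed_coefficients(beta, rho_x, rho_u, err_var_1, err_var_2):
    """Large-sample bivariate regression coefficients under additive error.

    Assumptions (illustrative, not taken from the paper):
    - both true variables have unit variance and correlation rho_x;
    - the errors have variances err_var_1, err_var_2 and correlation rho_u;
    - errors are independent of the true values and of the outcome residual.
    """
    beta = np.asarray(beta, dtype=float)
    # Covariance matrix of the true values.
    sigma_x = np.array([[1.0, rho_x], [rho_x, 1.0]])
    # Covariance matrix of the measurement errors.
    cov_u = rho_u * np.sqrt(err_var_1 * err_var_2)
    sigma_u = np.array([[err_var_1, cov_u], [cov_u, err_var_2]])
    # Observed coefficients: (Sigma_X + Sigma_U)^{-1} Sigma_X beta.
    return np.linalg.solve(sigma_x + sigma_u, sigma_x @ beta)


# Only the first variable truly affects the outcome; the second is measured
# more precisely (smaller error variance) than the first.
for rho_u in (0.0, 0.5, 0.7, 0.8, 0.9, 0.95):
    b1, b2 = observed_coefficients(
        [1.0, 0.0], rho_x=0.5, rho_u=rho_u, err_var_1=1.0, err_var_2=0.5
    )
    print(f"error correlation {rho_u:.2f}: b1 = {b1:.3f}, b2 = {b2:.3f}")
```

Varying `err_var_1` and `err_var_2` relative to each other shows how the relative precision of the two measurements changes the pattern, which is the sensitivity the results describe.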

Conclusion: High levels of error correlation can have a large effect on bivariate regression estimates, varying widely depending on which two variables are considered. In particular, the effect of energy adjustment will vary widely. For vitamin C, the effect of energy adjustment appears negligible, whereas for fat the effect is large, indicating that error correlation close to one can partially remove regression dilution due to measurement error. If, for fat intake, energy adjustment is performed by using energy density, the partial removal of regression dilution is achieved at the expense of a substantial reduction in the true variance.

Original language | English |
---|---|
Pages (from-to) | 1373-1381 |
Number of pages | 9 |
Journal | International Journal of Epidemiology |
Volume | 33 |
Issue number | 6 |
Early online date | 27 Aug 2004 |
DOIs | |
Publication status | Published - Dec 2004 |