Correlated response data often arise in longitudinal and familial studies. The marginal regression model and its associated generalized estimating equation (GEE) method are increasingly popular for handling such data. Pepe and Anderson pointed out that there is an important yet implicit assumption behind the marginal model and GEE. If the assumption is violated and a nondiagonal working correlation matrix is used in GEE, biased estimates of regression coefficients may result. On the other hand, if a diagonal working correlation matrix is used, the resulting estimates are (nearly) unbiased whether or not the assumption is violated. A straightforward interpretation of this phenomenon is lacking, in part because no closed form is available for the resulting GEE estimates. In this note, we show how the bias may arise in the context of linear regression, where the GEE estimates of regression coefficients are the ordinary or generalized least squares (LS) estimates. We also explain why the generalized LS estimator may be biased, in contrast to the well-known result that it is usually unbiased. In addition, we discuss the bias properties of the sandwich variance estimator of the ordinary LS estimate.
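The mechanism can be illustrated with a small simulation. The following is only a sketch under assumptions that are not taken from the paper: two observations per cluster, no intercept, independent standard normal errors, a time-varying covariate whose second value depends on the first outcome, and a fixed working correlation `rho`. The function name `simulate_estimates` and the parameters `beta`, `gamma`, and `rho` are hypothetical. In this setup each outcome has the right mean given its own covariate, but not given the full covariate history, which is one way the implicit assumption can fail.

```python
import numpy as np


def simulate_estimates(n=200_000, beta=1.0, gamma=0.5, rho=0.5, seed=0):
    """Return (ordinary LS, generalized LS) slope estimates.

    Assumed data-generating process (illustrative only):
      y1 = beta * x1 + e1
      x2 = gamma * y1 + u      (second covariate depends on first outcome)
      y2 = beta * x2 + e2
    so E[y_t | x_t] = beta * x_t holds for each t, while
    E[y1 | x1, x2] differs from beta * x1 because x2 carries e1.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    e1 = rng.standard_normal(n)
    y1 = beta * x1 + e1
    x2 = gamma * y1 + rng.standard_normal(n)
    e2 = rng.standard_normal(n)
    y2 = beta * x2 + e2

    X = np.stack([x1, x2], axis=1)  # shape (n, 2): one row per cluster
    Y = np.stack([y1, y2], axis=1)

    # Inverse of the assumed nondiagonal working correlation matrix.
    R_inv = np.linalg.inv(np.array([[1.0, rho], [rho, 1.0]]))

    def weighted_ls(W):
        # Solves sum_i X_i' W (Y_i - X_i b) = 0 for the scalar slope b.
        num = np.einsum("it,ts,is->", X, W, Y)
        den = np.einsum("it,ts,is->", X, W, X)
        return num / den

    ols = weighted_ls(np.eye(2))  # diagonal (independence) working matrix
    gls = weighted_ls(R_inv)      # nondiagonal working matrix
    return ols, gls


if __name__ == "__main__":
    ols, gls = simulate_estimates()
    print(f"ordinary LS estimate:    {ols:.3f}")
    print(f"generalized LS estimate: {gls:.3f}")
```

With these assumed values the ordinary LS estimate stays close to the true slope of 1, while the generalized LS estimate settles near 0.875. The off-diagonal weights bring the cross-product of `x2` and `e1` into the estimating equation, and that term has expectation `gamma`, not zero.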
Bibliographical note
Funding Information:
Wei Pan is Assistant Professor, and Thomas A. Louis and John E. Connett are Professors, Division of Biostatistics, University of Minnesota, A460 Mayo Building, Minneapolis, MN 55455 (E-mail addresses: weip, tom, firstname.lastname@example.org). The authors thank two referees, an associate editor, and the editor for careful reading and many constructive comments. This research was partially supported by a grant (1U01-HL59275) and contract (N01-HR-96140) from the National Institutes of Health.
- Generalized estimating equation (GEE)
- Generalized least squares (GLS)
- Ordinary least squares (OLS)