The usefulness of the lz person-fit index was investigated with achievement test data from 20 exams given to more than 3,200 college students. Results for three methods of estimating θ showed that the distributions of lz were not consistent with its theoretical distribution, resulting in general overfit to the item response theory model and underidentification of potentially nonfitting response vectors. The distributions of lz were not improved by the Bayesian θ estimation method. A follow-up Monte Carlo simulation study using item parameters estimated from real data resulted in mean lz approximating the theoretical value of 0.0 for one of the three θ estimation methods, but all standard deviations were substantially below the theoretical value of 1.0. Use of the lz distributions from these simulations resulted in levels of identification of significant misfit consistent with the nominal error rates. The reasons for the nonstandardized distributions of lz observed in both of these data sets were investigated in additional Monte Carlo simulations. These simulations showed that the distribution of item difficulties was primarily responsible for the nonstandardized distributions, with smaller effects for item discrimination and guessing. It is recommended that with real tests, identification of significantly nonfitting examinees be based on empirical distributions of lz generated from Monte Carlo simulations using item parameters estimated from real data.
- item response theory
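
The recommendation in the abstract, basing misfit cutoffs on a simulated rather than theoretical distribution of lz, can be illustrated with a minimal sketch. It assumes a three-parameter logistic model with scaling constant D = 1.7, the usual standardized log-likelihood form of lz, a grid-search maximum likelihood estimate of θ, and simulees drawn from a standard normal distribution. None of these choices, nor the function names and item parameters below, are taken from the study itself; they are placeholders for illustration.

```python
import math
import random


def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model (D = 1.7 assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def log_lik(u, probs):
    """Log-likelihood of a 0/1 response vector given item probabilities."""
    return sum(math.log(p) if x else math.log(1.0 - p) for x, p in zip(u, probs))


def lz(u, theta, items):
    """Standardized log-likelihood person-fit index at a given theta."""
    probs = [p_3pl(theta, *it) for it in items]
    l0 = log_lik(u, probs)
    mean = sum(p * math.log(p) + (1 - p) * math.log(1 - p) for p in probs)
    var = sum(p * (1 - p) * math.log(p / (1 - p)) ** 2 for p in probs)
    return (l0 - mean) / math.sqrt(var)


def ml_theta(u, items, grid):
    """Grid-search ML estimate of theta (a simple stand-in estimator)."""
    return max(grid, key=lambda t: log_lik(u, [p_3pl(t, *it) for it in items]))


def empirical_cutoff(items, n_sim=1000, alpha=0.05, seed=0):
    """Simulate model-fitting examinees and return (cutoff, sorted lz values).

    The cutoff is the empirical lower-tail alpha quantile of lz computed at
    the *estimated* theta, which is what would be available in practice.
    """
    rng = random.Random(seed)
    grid = [g / 10.0 for g in range(-40, 41)]  # theta from -4.0 to 4.0
    values = []
    for _ in range(n_sim):
        theta = rng.gauss(0.0, 1.0)  # assumed N(0, 1) ability distribution
        u = [1 if rng.random() < p_3pl(theta, *it) else 0 for it in items]
        values.append(lz(u, ml_theta(u, items, grid), items))
    values.sort()
    return values[int(alpha * n_sim)], values


if __name__ == "__main__":
    # Hypothetical item parameters (a, b, c); in practice these would be
    # the estimates obtained from the real test data.
    rng = random.Random(42)
    items = [(rng.uniform(0.5, 2.0), rng.gauss(0.0, 1.0), 0.2) for _ in range(30)]
    cutoff, values = empirical_cutoff(items, n_sim=500)
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    print(f"simulated mean lz = {mean:.3f}, SD = {sd:.3f}")
    print(f"empirical 5% cutoff = {cutoff:.3f} (theoretical normal cutoff: -1.645)")
```

An examinee's observed lz would then be flagged as significantly nonfitting when it falls below the simulated cutoff rather than below the standard normal value. The θ estimator is the part most worth swapping out, since the abstract reports that the shape of the lz distribution depended on the estimation method.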