Abstract
Three multiple indicators, multiple causes (MIMIC) methods were developed to assess differential item functioning (DIF) in polytomous items: the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA). In a series of simulations, all three methods yielded a well-controlled Type I error rate when tests contained no DIF items. M-ST and M-SP began to yield an inflated Type I error rate and deflated power when tests contained 10% and 20% DIF items, respectively. M-PA maintained the expected Type I error rate and high power even when tests contained as many as 40% DIF items. An iterative MIMIC procedure was proposed to select a small set of DIF-free items to serve as the anchor in M-PA. In a series of simulations, this procedure selected DIF-free items with very high accuracy. Two simulated data sets were then analyzed to illustrate the application of these MIMIC methods to DIF assessment in polytomous items.
