Methods are developed that investigate the fit of parametric item response models by comparing them to models fitted under nonparametric assumptions. The approach is primarily graphical, but is made inferential through resampling from an estimated parametric model. The identifiability and estimation consistency of item response theory models are discussed and shown to be vital to the interpretation of differences between two fitted item response theory models. Simulation studies and real-data examples illustrate these techniques.
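The resampling idea described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' procedure. It assumes a two-parameter logistic model whose item parameter estimates (`a_hat`, `b_hat`) come from some external calibration program. It assumes a Gaussian-kernel Nadaraya-Watson smoother on normal-quantile ability proxies as the nonparametric fit. It assumes a density-weighted root integrated squared difference as the discrepancy. All function names, the bandwidth `h`, and the grid are hypothetical choices. For brevity the item parameters are not re-estimated in each replicate, which a fuller implementation would likely do.

```python
import numpy as np
from statistics import NormalDist


def icc_2pl(theta, a, b):
    """Two-parameter logistic item characteristic curve (assumed parametric model)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def simulate(rng, n, a, b):
    """Draw n abilities from N(0, 1) and binary responses from the 2PL model."""
    theta = rng.standard_normal(n)
    p = icc_2pl(theta[:, None], a[None, :], b[None, :])
    return (rng.random(p.shape) < p).astype(float)


def ability_proxy(x, rng):
    """Replace total-score ranks by standard normal quantiles (ties broken at random)."""
    n = x.shape[0]
    total = x.sum(axis=1)
    # Jitter below 1 breaks ties without reordering distinct integer totals.
    order = np.argsort(total + 0.5 * rng.random(n))
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(r / (n + 1)) for r in ranks])


def kernel_icc(x, proxy, grid, h):
    """Nadaraya-Watson estimate of every item's ICC on the grid (Gaussian kernel)."""
    w = np.exp(-0.5 * ((grid[:, None] - proxy[None, :]) / h) ** 2)
    return (w @ x) / w.sum(axis=1, keepdims=True)


def discrepancy(x, a, b, grid, h, rng):
    """Per-item root integrated squared difference, weighted by the N(0, 1) density."""
    nonpar = kernel_icc(x, ability_proxy(x, rng), grid, h)
    par = icc_2pl(grid[:, None], a[None, :], b[None, :])
    wt = np.exp(-0.5 * grid ** 2)
    wt /= wt.sum()
    return np.sqrt((((nonpar - par) ** 2) * wt[:, None]).sum(axis=0))


def bootstrap_pvalues(x, a_hat, b_hat, n_boot=200, h=0.3, seed=0):
    """Compare observed discrepancies with those of data simulated from the fitted model."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-3.0, 3.0, 61)
    observed = discrepancy(x, a_hat, b_hat, grid, h, rng)
    null = np.array([
        discrepancy(simulate(rng, x.shape[0], a_hat, b_hat), a_hat, b_hat, grid, h, rng)
        for _ in range(n_boot)
    ])
    # Add-one correction keeps the Monte Carlo p-values strictly positive.
    return (1 + (null >= observed).sum(axis=0)) / (n_boot + 1)
```

Calling `bootstrap_pvalues(x, a_hat, b_hat)` on an examinee-by-item 0/1 matrix `x` returns one Monte Carlo p-value per item. Because the simulated data pass through the same smoother and ability proxy as the observed data, smoothing bias affects both sides of the comparison in the same way.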