Using the Gini-Light-Margolin concept of partitioning variance for qualitative data, correspondences are established between various kappa statistics and intraclass correlation coefficients under general conditions (multiple raters and polychotomous category systems). A measure of marginal symmetry for multiple ratings is also developed and is shown to have a proportion-of-variance explanation.
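As a rough illustration of the kind of correspondence described above, the sketch below computes the multiple-rater kappa of Fleiss (1971) in two ways: from its usual agreement-proportion formula, and from a within-subject versus total partition of Gini's variation for categorical data, in the spirit of Light and Margolin (1971). It is not taken from the article. The function names, the example ratings table, and the assumption of an equal number of raters per subject are all introduced here for illustration.

```python
from fractions import Fraction


def gini_variation(counts):
    """Gini's variation for categorical counts: n/2 - (sum of squared counts) / (2n)."""
    n = sum(counts)
    return Fraction(n, 2) - Fraction(sum(c * c for c in counts), 2 * n)


def fleiss_kappa(table):
    """Fleiss (1971) kappa; table[i][j] = number of raters placing subject i in category j.

    Assumes every subject is rated by the same number of raters.
    """
    n_subjects = len(table)
    n_raters = sum(table[0])
    # Mean observed proportion of agreeing rater pairs per subject.
    p_bar = Fraction(
        sum(sum(c * c for c in row) - n_raters for row in table),
        n_subjects * n_raters * (n_raters - 1),
    )
    # Chance agreement from the pooled category proportions.
    totals = [sum(col) for col in zip(*table)]
    p_e = sum(Fraction(t, n_subjects * n_raters) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)


def kappa_from_variation(table):
    """The same quantity as one minus a ratio of within-subject to total variation."""
    n_subjects = len(table)
    n_raters = sum(table[0])
    # Within-subject variation: Gini variation of each subject's ratings, summed.
    wss = sum(gini_variation(row) for row in table)
    # Total variation: Gini variation of all ratings pooled over subjects.
    tss = gini_variation([sum(col) for col in zip(*table)])
    within_mean = wss / (n_subjects * (n_raters - 1))
    total_mean = tss / (n_subjects * n_raters)
    return 1 - within_mean / total_mean


# Hypothetical data: 5 subjects, 4 raters, 3 categories.
ratings = [[4, 0, 0], [2, 2, 0], [1, 1, 2], [0, 4, 0], [0, 1, 3]]
print(fleiss_kappa(ratings), kappa_from_variation(ratings))  # both print 51/131
```

Exact fractions are used so that the two routes can be compared for equality rather than to within rounding error. Under the equal-raters assumption, the agreement is an algebraic identity rather than a coincidence of this data set.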
References
1. Collis, G. M. (1985). Kappa, measures of marginal symmetry and intraclass correlations. Educational and Psychological Measurement, 45, 55-62.
2. Conger, A. J. (1980). Integration and generalization of kappas for multiple raters. Psychological Bulletin, 88, 322-328.
3. Fleiss, J. L. (1965). Estimating the accuracy of dichotomous judgements. Psychometrika, 30, 469-479.
4. Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378-382.
5. Fleiss, J. L. and Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 33, 613-619.
6. Fleiss, J. L. and Cuzick, J. (1979). The reliability of dichotomous judgements: Unequal number of judges per subject. Applied Psychological Measurement, 3, 537-542.
7. Gini, C. (1912). Variabilità e mutabilità: contributo allo studio delle distribuzioni e delle relazioni statistiche. Bologna: Cuppini.
8. Gini, C. (1939). Variabilità e Concentrazione. Vol. 1 di: Memorie di metodologia statistica. Milano: Giuffrè.
9. Krippendorff, K. (1970). Bivariate agreement coefficients for reliability of data. In E. F. Borgatta and G. W. Bohrnstedt (Eds.), Sociological methodology 1970. San Francisco: Jossey-Bass.
10. Light, R. J. and Margolin, B. H. (1971). An analysis of variance for categorical data. Journal of the American Statistical Association, 66, 534-544.
11. Rae, G. (1984). On measuring agreement among several judges on the presence or absence of a trait. Educational and Psychological Measurement, 44, 247-253.
12. Shrout, P. E. and Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.
13. Winer, B. J. (1971). Statistical principles in experimental design (2nd ed.). New York: McGraw-Hill.