PainScience.com β€’ Good advice for aches, pains & injuries

The measurement of observer agreement for categorical data


Tags: stats, classics, scientific medicine

Two articles on PainSci cite Landis 1977: (1) Is Diagnosis for Pain Problems Reliable? (2) Trigger Point Doubts

PainSci notes on Landis 1977:

Landis and Koch suggested labels for ranges of Cohen's kappa (https://en.wikipedia.org/wiki/Cohen%27s_kappa) values, describing κ = 0–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1 as almost perfect. These labels were just expert opinion, and they remain controversial, but they have been widely cited and used ever since, because they are imprecise enough to be “good enough” for many purposes.
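To make the ranges concrete, here is a minimal sketch in Python of the standard Cohen's kappa calculation for two raters, plus a helper that maps a kappa value to the Landis–Koch labels quoted above. The function names and the example data are my own illustration, not from the paper (which develops more general kappa-type statistics for multiple observers).

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters
    who each assigned a category to the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

def landis_koch_label(kappa):
    """Map a kappa value to the Landis & Koch (1977) descriptive label.
    (Values below 0 fall outside the quoted ranges; treated as 'slight' here.)"""
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Hypothetical example: two clinicians diagnosing the same ten patients.
a = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes"]
b = ["yes", "no", "no", "no", "yes", "yes", "yes", "yes", "no", "yes"]
k = cohens_kappa(a, b)
print(round(k, 3), landis_koch_label(k))
```

Note how the raters agree on 8 of 10 patients (80%), yet kappa is only about 0.58 ("moderate"), because much of that raw agreement would be expected by chance alone.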

original abstract †Abstracts here may not perfectly match the originals, for a variety of technical and practical reasons. Some abstracts are truncated for my purposes here, if they are particularly long-winded and unhelpful. I occasionally add clarifying notes, and I make some minor corrections.

This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and measures of interobserver agreement are developed as generalized kappa-type statistics. These procedures are illustrated with a clinical diagnosis example from the epidemiological literature.

This page is part of the PainScience BIBLIOGRAPHY, which contains plain language summaries of thousands of scientific papers & other sources. It’s like a highly specialized blog. A few highlights: