PainSci summary of Herzog 2016
This page is one of thousands in the PainScience.com bibliography. It is not a general article: it is focused on a single scientific paper, and it may provide only just enough context for the summary to make sense. Links to other papers and more general information are provided at the bottom of the page, as often as possible.
★★★★☆ 4-star ratings are for bigger/better studies and reviews published in more prestigious journals, with only quibbles. Ratings are a highly subjective opinion, and subject to revision at any time. If you think this paper has been incorrectly rated, please let me know.
People mostly assume that MRI is a reliable technology, but if you send the same patient to get ten different MRIs, interpreted by ten different radiologists from different facilities, apparently you get ten markedly different explanations for her symptoms. A 63-year-old volunteer with sciatica allowed herself to be scanned again and again and again for science. The radiologists, who did not know they were being tested, cooked up forty-nine distinct “findings.” Sixteen were unique; not one was found in all ten reports, and only one was found in nine of the ten. On average, each radiologist made about a dozen errors, seeing one or two things that weren’t there and missing about eleven things that were. That’s a lot of errors, and not a lot of reliability. The authors clearly believe that some MRI providers are better than others, and that’s probably true, but we also need to ask the question: is any MRI reliable?
(See also my more informal description of this study, which includes an amazing personal example of an imaging error.)
original abstract
† Abstracts here may not perfectly match originals, for a variety of technical and practical reasons. Some abstracts are truncated for my purposes here, if they are particularly long-winded and unhelpful. I occasionally add clarifying notes. And I make some minor corrections.
BACKGROUND CONTEXT: In today’s health-care climate, magnetic resonance imaging (MRI) is often perceived as a commodity, a service where there are no meaningful differences in quality and thus an area in which patients can be advised to select a provider based on price and convenience alone. If this prevailing view is correct, then a patient should expect to receive the same radiological diagnosis regardless of which imaging center he or she visits, or which radiologist reviews the examination. Based on their extensive clinical experience, the authors believe that this assumption is not correct and that it can negatively impact patient care, outcomes, and costs.
PURPOSE: This study is designed to test the authors’ hypothesis that radiologists’ reports from multiple imaging centers performing a lumbar MRI examination on the same patient over a short period of time will have (1) marked variability in interpretive findings and (2) a broad range of interpretive errors.
STUDY DESIGN: This is a prospective observational study comparing the interpretive findings reported for one patient scanned at 10 different MRI centers over a period of 3 weeks to each other and to reference MRI examinations performed immediately preceding and following the 10 MRI examinations.
PATIENT SAMPLE: The sample is a 63-year-old woman with a history of low back pain and right L5 radicular symptoms.
OUTCOME MEASURES: Variability was quantified using percent agreement rates and Fleiss kappa statistic. Interpretive errors were quantified using true-positive counts, false-positive counts, false-negative counts, true-positive rate (sensitivity), and false-negative rate (miss rate).
METHODS: Interpretive findings from 10 study MRI examinations were tabulated and compared for variability and errors. Two of the authors, both subspecialist spine radiologists from different institutions, independently reviewed the reference examinations and then came to a final diagnosis by consensus. Errors of interpretation in the study examinations were considered present if a finding present or not present in the study examination’s report was not present in the reference examinations.
RESULTS: Across all 10 study examinations, there were 49 distinct findings reported related to the presence of a distinct pathology at a specific motion segment. Zero interpretive findings were reported in all 10 study examinations and only one finding was reported in nine out of 10 study examinations. Of the interpretive findings, 32.7% appeared only once across all 10 of the study examinations’ reports. A global Fleiss kappa statistic, computed across all reported findings, was 0.20±0.06, indicating poor overall agreement on interpretive findings. The average interpretive error count in the study examinations was 12.5±3.2 (both false-positives and false-negatives). The average false-negative count per examination was 10.9±2.9 out of 25 and the average false-positive count was 1.6±0.9, which correspond to an average true-positive rate (sensitivity) of 56.4%±11.7 and miss rate of 43.6%±11.7.
CONCLUSIONS: This study found marked variability in the reported interpretive findings and a high prevalence of interpretive errors in radiologists’ reports of an MRI examination of the lumbar spine performed on the same patient at 10 different MRI centers over a short time period. As a result, the authors conclude that where a patient obtains his or her MRI examination and which radiologist interprets the examination may have a direct impact on radiological diagnosis, subsequent choice of treatment, and clinical outcome.
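As a rough sanity check on the arithmetic, the headline numbers in the abstract can be reproduced from its own averages. This is only a sketch: the constants are copied from the abstract, the function and variable names are made up for illustration, and it assumes every report was scored against the same 25 reference findings (the abstract says "out of 25"). With a fixed denominator, the average of the per-report rates equals the rate computed from the average counts.

```python
# Averages quoted in the abstract (across the 10 study examinations).
REFERENCE_FINDINGS = 25       # findings present on the reference examinations
MEAN_FALSE_NEGATIVES = 10.9   # reference findings missed per report
MEAN_FALSE_POSITIVES = 1.6    # reported findings absent from the reference
DISTINCT_FINDINGS = 49        # distinct findings across all reports
SHARE_REPORTED_ONCE = 0.327   # fraction of findings that appeared only once


def summarize(reference, false_neg, false_pos):
    """Derive error rates from counts (hypothetical helper, not from the paper)."""
    true_pos = reference - false_neg
    return {
        "true_positives": true_pos,
        "sensitivity": true_pos / reference,   # true-positive rate
        "miss_rate": false_neg / reference,    # false-negative rate
        "total_errors": false_neg + false_pos,
    }


stats = summarize(REFERENCE_FINDINGS, MEAN_FALSE_NEGATIVES, MEAN_FALSE_POSITIVES)
unique_findings = round(DISTINCT_FINDINGS * SHARE_REPORTED_ONCE)

print(f"sensitivity: {stats['sensitivity']:.1%}")
print(f"miss rate:   {stats['miss_rate']:.1%}")
print(f"errors per report: {stats['total_errors']:.1f}")
print(f"findings reported only once: {unique_findings}")
```

The results line up with the abstract: a sensitivity of about 56.4%, a miss rate of about 43.6%, 12.5 errors per report, and 16 findings that appeared in only one report.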
- “Interrater reliability: the kappa statistic,” Mary L McHugh, Biochem Med (Zagreb), 2012.
- “The measurement of observer agreement for categorical data,” J R Landis and G G Koch, Biometrics, 1977.
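For readers who want to see what a Fleiss kappa actually measures, here is a minimal sketch of the standard formula. The ratings table is invented purely for illustration (five hypothetical findings, ten hypothetical radiologists, each either reporting the finding or not). It is not the study's data, which are not reproduced on this page, and the function name is my own.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j. Every row must sum to the same
    number of raters."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs, per subject, then averaged.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance alone.
    p_e = sum(p * p for p in p_j)
    if p_e == 1:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical data: columns are [reported, not reported] out of 10 raters.
toy_ratings = [
    [9, 1],
    [5, 5],
    [1, 9],
    [3, 7],
    [6, 4],
]
print(f"toy Fleiss kappa: {fleiss_kappa(toy_ratings):.2f}")
```

Kappa is 1 when raters agree perfectly and near 0 when they agree no more often than chance would predict, which is why the study's global value of 0.20 reads as so little agreement.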
These eight articles on PainScience.com cite Herzog 2016 as a source:
- When to Worry About Low Back Pain — And when not to! What’s bark and what’s bite?
- Save Yourself from Low Back Pain! — Low back pain myths debunked and all your treatment options reviewed
- Save Yourself from Neck Pain! — A complete guide to chronic neck pain and the disturbing sensation of a “crick”
- The Mind Game in Low Back Pain — How back pain is powered by fear and loathing, and greatly helped by rational confidence
- MRI and X-Ray Often Worse than Useless for Back Pain — Medical guidelines “strongly” discourage the use of MRI and X-ray in diagnosing low back pain, because they produce so many false alarms
- Is Diagnosis for Pain Problems Reliable? — Reliability science shows that health professionals can’t agree on many popular theories about why you’re in pain
- Digital Motion X-Ray — What’s the risk from the radiation exposure? Is the diagnostic potential worth it?
- A Rational Guide to Fibromyalgia — The science of the mysterious disease of pain, exhaustion, and mental fog