Most Pain Treatments Damned With Faint Praise
Most controversial and alternative therapies are fighting over scraps of “positive” scientific evidence that damn them with the faint praise of small effect sizes that cannot impress
It is common for dubious therapies and treatments to be based on studies that were only technically, superficially positive — but when you look at the data, you find only evidence of a trivial beneficial effect. Even if that minor benefit is real, it’s not impressive, and the treatment is damned with faint praise.
However, the small effect is often just one of many problems with such research.
Surprisingly, weak evidence like this is routinely exploited by people who should know better. For instance, scientific reviews and clinical guidelines often include treatment recommendations that are based on inadequate evidence.1 Many forces motivate such carelessness.
If a therapy actually works well, it should be easy to prove it.2 Large treatment effects are quite rare in medicine in general — because biology is so complicated, and people are so different — but when they do occur, they are impressive enough to leave little room for argument. When a treatment is clearly shown to be effective, it’s exciting! It makes headlines, and it should.
But it’s also incredibly rare.
Most slightly “positive” study results are actually just bogus
The weaker a positive result, the more likely it is to be misleading: not actually positive at all. There are several ways that a “positive” study can really be negative …
- Flukes! Sometimes, by pure chance, things turn out well for the study subjects — just enough to make an ineffective treatment seem a little bit effective.
- False positives! These are more likely when testing weirder treatment ideas that aren’t plausible, “testing magic.” (This is a tricky, technical issue: basically, absurd hypotheses warp the math of statistical significance.34 There’s a quick sketch of the arithmetic just after this list.)
- Lies! Many “positive” studies are simply worthless junk — true pseudoscience — in ridiculous non-journals published by quacks with a self-promotional agenda (predatory journals7). Such experiments have far too many fatal flaws to mean anything.8
- More lies! It sounds incredible, but you’d be amazed how often people cite “positive” studies that just aren’t. If you read them, you end up scratching your head and thinking, “Um, but this is bad news … ”9
- Honest mistakes! Plus some that are harder to forgive … Even well-trained, professional researchers have biases and can screw up. Imagine how much worse it is when an amateur is keen to prove that their favourite therapy works! There are an almost unbelievable number of ways that researchers can unconsciously bend experiments to produce the results they’d prefer to see, and/or they can spin the interpretation.10 Weakly positive results are a common symptom of this. Science is hard, and most of it is just wrong.11
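Here’s a rough sketch of the “testing magic” arithmetic mentioned above. The numbers are made up but plausible (a standard 0.05 significance threshold, 80% power, and an assumed 1-in-100 chance that a tested treatment idea actually works); they are my own illustration, not data from any cited study:

```python
# A hypothetical back-of-the-envelope calculation (not from any cited paper):
# why "p < 0.05" wins pile up for implausible treatments.

alpha = 0.05   # chance a useless treatment "passes" a trial anyway (false positive rate)
power = 0.80   # chance a trial detects a treatment that genuinely works
prior = 0.01   # assume only 1 in 100 tested treatment ideas actually works

true_positives  = prior * power          # genuine effects correctly detected
false_positives = (1 - prior) * alpha    # useless treatments that test "positive"

share_bogus = false_positives / (true_positives + false_positives)
print(f"Share of 'positive' results that are bogus: {share_bogus:.0%}")  # ~86%
```

With those assumptions, roughly six out of seven “positive” results are false alarms, and the less plausible the idea, the worse that ratio gets.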
Early studies of a treatment tend to be sketchier and “positive,” often conducted by proponents trying to produce scientific justification for their methods. Eventually less biased investigators do better quality studies, and the results are negative. This is a classic pattern in the history of science in general, especially medicine.12
So you can see why I’m a little skeptical when someone enthusiastically shows me one paper from an obscure journal reporting a “significant” benefit to, say, acupuncture — which has probably been the subject of more of these “positive” studies than any other treatment.

Give small a chance
I’ve been too critical of small effect sizes over the years. This is a bit of a mea culpa, updating this article in 2025. Small effect sizes don’t necessarily mean that a study is worthless.
We use microscopes to look for small things, and we (mainly) use clinical trials to look for small and specific effects. The more obvious the benefit of a treatment, the less we need to test it.13 We only need a randomized controlled trial when the apparent effect size is so small that it might be a fluke, and so we have to confirm that it’s genuine, rather than illusion or delusion.
And so, small benefits may still matter for many reasons:
- A small effect may be worthwhile if it’s cheap, easy, and safe to achieve. We never just want to know about the juice — we also want to know if it’s worth the squeeze. (More on this one below.)
- It might be the average of a wide range of outcomes: some people will get less, some more — sometimes predictably. There’s a huge difference between “always just 5%” and “5% on average but sometimes 0% and sometimes 30%, depending on your genes.” (Also more on this one below.)
- It’s just one of several effects, maybe part of an unmeasured process or whole that’s greater than the sum of its parts — which defines good therapy. A small effect on pain may come with more meaningful benefits, like less suffering and more function (pain self-efficacy).
- Small effects can matter much more to people who need them the most. Dropping from 9 to 8 on the pain scale may seem like a much bigger deal than 3 to 2.
- The mechanism might be important in principle, the tip of an iceberg of potential. Proving that a drug works at all could suggest ways to make it work better, and send researchers back to their labs with new ideas. Adjustments to the intervention, and/or exactly what is measured, could reveal a more substantial effect.
- Not all worthwhile benefits can be felt. No one can feel 30% less likely to get injured, or that they will recover from a tendinitis 20% faster.
- The truth is inherently valuable. If homeopathy actually worked, even just a tiny bit, I’d want to know how it could possibly work at all!
And yet clinical significance is still significant, and so a tiny benefit is still mostly unimpressive — assuming it’s even real, which it probably isn’t. Most good news from weak trials of pain treatments is as wrong as the 1989 cold fusion claim — just with less media attention.
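To make the statistical-versus-clinical distinction concrete, here’s a small simulation of my own (hypothetical numbers, simulated data, not any real trial): with a big enough sample, a one-point improvement on a 100-point pain scale can be “statistically significant” while remaining clinically trivial.

```python
# Hypothetical simulation: a "significant" p-value attached to a trivial benefit.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 20_000  # a very large (imaginary) trial

# Simulated pain scores (0-100 scale); the treated group averages just 1 point lower
control = rng.normal(loc=50, scale=15, size=n)
treated = rng.normal(loc=49, scale=15, size=n)

t_stat, p_value = stats.ttest_ind(treated, control)
pooled_sd = np.sqrt((control.std(ddof=1) ** 2 + treated.std(ddof=1) ** 2) / 2)
cohens_d = (control.mean() - treated.mean()) / pooled_sd

print(f"p-value: {p_value:.4f}")                   # almost certainly far below 0.05: "significant!"
print(f"effect size (Cohen's d): {cohens_d:.2f}")  # around 0.07: clinically trivial
```

The point of the sketch: “significant” only means the effect is probably not zero; it says nothing about whether the effect is big enough to care about.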
Is it really “better than nothing”? Only if it’s a legit small effect!
If you’re a glass-is-half-full person, you might be happy to say that weakly positive results are “better than nothing.” Science says chiropractic adjustment of my back might improve my back pain by 3%? Heck, I’ll take 3%!
Sometimes the better-than-nothing interpretation is fair and fine,14 and I’ve used it myself many times. But let’s not confuse optimistic pragmatism with actual knowledge. Weakly positive results, even real ones, do not mean it’s truly established that a treatment “works a little bit.” The bar for that is a lot higher!
The whole point here is that weak results are mostly an illusion, an artifact.
For a small treatment effect to be accepted as genuine, it has to beat the null hypothesis in most studies over time. This is hard.
The null hypothesis — a pillar of the scientific method — is the default assumption that a treatment does nothing until good data says otherwise. In plain English: most hopeful ideas turn out to be wrong when carefully checked. And therefore most weakly positive results will turn out to be the product of bias, p-hacking, and wishful thinking. And that’s fine.
Treatments should generally be considered useless until proven effective. The burden of proof is on the pusher of the idea (and it’s a heavy burden). Until they meet it, the null hypothesis looms over them, still very likely to win in the long run.
The null hypothesis has kicked a lot of theories’ butts over the centuries. It is still the champ.
Maybe the effect of a treatment is just hard to confirm?
The optimist (or ideologue) is inevitably going to say, “But maybe there is a strong effect and it’s just erratic, hard for science to pin down!”
Perhaps.
But not usually. In practice, this is a kind of “special pleading” for an exception,15 a slightly more specific version of “science doesn’t know everything” (a classic non sequitur from people defending quackery16). Yes, science might catch up and validate something previously missed.
But it’s unlikely. And even if such a benefit is real, an effect so erratic that science can’t find any promising evidence of it tends to be awkward or lame in practice too. If a standardized treatment protocol can’t deliver the goods in a somewhat reliable fashion, to the people who need it — then it’s not really useful medicine. Or at least it’s not medicine I want to spend my time/money on until its “erratic” nature is better understood.
“A promising treatment is often in fact merely the larval stage of a disappointing one. At least a third of influential trials suggesting benefit may either ultimately be contradicted or turn out to have exaggerated effectiveness.”
Bastian, 2006, J R Soc Med
Fighting over scraps
The science of painful problems is still generally rudimentary and preliminary, late to the evidence-based medicine party. We can try to critically assess it, and I do, but “replication needed” is usually all that really needs to be said. That covers most of the bases. At the end of the day, if slightly promising results cannot be confirmed by other researchers, it doesn’t really matter what was wrong with the original research. Either a treatment works well enough to consistently produce measurably useful results for some patients, or it doesn’t.
Controversy about many popular treatments and therapies is much ado about not much, mostly just fighting over pathetic scraps of evidence of minor effects. After decades of study, the effectiveness of a therapy should be clear and significant in order to justify its continued existence or “more study.” If it’s still hopelessly mired in controversy after so many years — more than a century in some cases (*cough* homeopathy *cough*) — how good can it possibly be? Why would anyone — patient or professional — feel enthusiastic about a therapy that can’t clearly show its superiority in a fair scientific test? Where’s the value in even debating a therapy that is clearly not working any miracles, that has a trivial benefit at best?
The long-term persistence of such debate constitutes evidence of absence. Several dozen crappy studies with weakly positive results are roughly equivalent to proof that there’s no beef, with or without high quality studies to put the nail in the coffin. More research is a waste of time and resources.17
Science, as they say, really delivers the goods: missions to Mars, long lives, the internet. A therapy has to deliver the goods. It’s got to help most people a fair amount and most of the time … or who cares?
Until it impresses you, it’s just some idea that hasn’t yet shown much promise.
It’s okay not to know

What would Carl Sagan do? Always a good question.
Readers and patients are forever asking me what my “hunch” is about a therapy: does it work? Is there anything to it? I’m honoured that my opinion is so sought after, but I usually won’t take the bait. Like Carl Sagan, “I try not to think with my gut.”
It’s okay not to know. It’s okay for the jury to be out.
And it had better be, because there’s still a great deal of mystery in musculoskeletal health science. Most of the scientific evidence that I interpret for readers of PainScience fails the “impress me” test. Even when that evidence is actually positive — and it’s hard to tell — it’s often only slightly positive. Even when there’s evidence that a therapy works, it’s usually weak evidence: some studies concluded that maybe it helps some people, some of the time … while other studies, almost always the better ones, showed no effect at all. I’m supposed to get excited about this? To justify real confidence in a therapy, we want really good evidence, evidence that makes you sit up and take notice, evidence that ends arguments because it’s just that clear.
Anything less fails to impress!
I don’t want to believe. I want to know.
Carl Sagan
We must somehow find a way to make peace with limited information, eagerly seeking more, without being dogmatic about premature conclusions.
Science and The Game Of 20 Questions, by Val Jones
About Paul Ingraham

I am a science writer in Vancouver, Canada. I was a Registered Massage Therapist for a decade and the assistant editor of ScienceBasedMedicine.org for several years. I’ve had many injuries as a runner and ultimate player, and I’ve been a chronic pain patient myself since 2015. Full bio.
What’s new in this article?
Jan 24, 2025 — A rare update to this old article, acknowledging that I may have been too dismissive of small effect sizes over the years. See the new section, “Give small a chance.”
2020 — An unusually large batch of typo corrections. (I don’t normally consider correcting typos to be worthy of logging an update, but in this case… sheesh.)
2017 — Added a couple citations and an important technical point about false positives when “testing magic.”
2016 — Science update: citation to Pereira 2012 about the lack of large treatment effects in medicine.
2009 — Publication.
Notes
- Colquhoun D. Recommendations are made in the absence of any good treatments. BMJ. 2017;(358):j3975. Dr. David Colquhoun briefly but persuasively argues that clinical guidelines and scientific reviews routinely make recommendations based on inadequate evidence, substantially due to a common failure to appreciate the risk of false positives in positive studies of treatments with low prior plausibility: “every false positive not only harms patients (and budgets) but also provides ammunition for the antiscience brigade, who are now so evident.”
- Standard proof caveat: nothing is ever truly “proved,” of course. When we talk of proof in science, we don’t mean total certainty, but more like the certainty you feel about the sun rising tomorrow.
- Nuzzo R. Scientific method: statistical errors. Nature. 2014 Feb;506(7487):150–2. PubMed 24522584 ❐ “The more implausible the hypothesis — telepathy, aliens, homeopathy — the greater the chance that an exciting finding is a false alarm, no matter what the P value is.”
- Pandolfi M, Carreras G. The faulty statistics of complementary alternative medicine (CAM). Eur J Intern Med. 2014 Sep;25(7):607–9. PubMed 24954813 ❐
- The word “significant” in scientific abstracts is routinely misleading. It does not mean that the results are large or meaningful, and in fact is used to hide precisely the opposite. When only “significance” is mentioned, it almost invariably refers to the notoriously problematic “p-value,” a technically true distraction from the more meaningful truth of a tiny “effect size”: results that are not actually impressive. This practice has been considered bad form by experts for decades, but is still extremely common. See Statistical Significance Abuse: A lot of research makes scientific evidence seem much more “significant” than it is.
- One of my favourites is another technically correct but misleading stats term, “trending.” When the results are positive but not statistically significant, paper authors will often still summarize by saying that there was a “positive trend” in the data: not enough to claim significance, mind you, but not actually negative. It’s a good way of making a worthless study still sound a little positive.
- A “predatory journal” is a fraudulent journal that publishes anything for pay (literally anything, even gibberish), without peer review. This is a new kind of junk science, as bad as any pseudoscience. These “journals” are scams: their purpose is to rip off academics who are desperate to “publish or perish.” There are thousands of predatory journals now, many of which have high superficial legitimacy (they look a lot like real journals, e.g. actually indexed in PubMed). Some of the research is undoubtedly earnest, but cannot be trusted without peer-review. See Gasparyan et al. and 14 Kinds of Bogus Citations.
- Except it’s usually noteworthy that, even by cheating and lying and bending every rule in their favour, they still couldn’t produce better results!
- Ingraham. 14 Kinds of Bogus Citations: Classic ways to self-servingly screw up references to science, like “the sneaky reach” or “the uncheckable”. PainScience.com. 6094 words.
- Cuijpers P, Cristea IA. How to prove that your therapy is effective, even when it is not: a guideline. Epidemiol Psychiatr Sci. 2016 Oct;25(5):428–435. PubMed 26411384 ❐ A clear explanation of all the ways that trials can go wrong — or, as the title mischievously implies, all the ways trials can be made to go wrong. Although written about psychotherapy research, it is directly relevant to musculoskeletal medicine: both fields share the problem of lots of junky little trials done by professionals trying to prove their pet theories, which produces a lot of “positive” results that just aren’t credible.
- Ioannidis J. Why Most Published Research Findings Are False. PLoS Medicine. 2005 Aug;2(8):e124. PainSci Bibliography 55463 ❐
- Pereira TV, Horwitz RI, Ioannidis JPA. Empirical evaluation of very large treatment effects of medical interventions. JAMA. 2012 Oct;308(16):1676–84. PubMed 23093165 ❐
A “very large effect” in medical research is probably exaggerated, according to Stanford researchers. Small trials of medical treatments often produce results that seem impressive. However, when more and better trials are performed, the results are usually much less promising. In fact, “most medical interventions have modest effects” and “well-validated large effects are uncommon.”
- Scientific tests are mostly designed to identify or confirm relatively subtle relief that we can’t be sure we can feel. If a headache nostrum made your headache vanish in thirty minutes, every time, you could do a carefully controlled trial to confirm it … but it would be about as surprising as a test of the moistening power of showers.
The shower analogy is mine, but the classic satirical example is parachutes: we don’t need an RCT to know they work (see Smith et al).
But that blasted parachute thing has been thoroughly abused. It was cooked up originally to denigrate EBM absolutism, but then widely exploited to just sneer at an EBM straw man, and now it’s just used by quacks and cranks to argue that what they believe is so obvious that it doesn’t need to be tested any more than parachutes do. For instance, a doctor friend of mine “once had an iodine quack (‘Harvard trained!’) proudly tell me there had never been an RCT of parachutes when I asked for evidence that ‘everybody needs more iodine.’”
- Whether you use unimpressive positive results to justify giving a treatment a try depends largely on other factors: Is it expensive? Is it dangerous? Will it interfere with other, better treatment options? And so on. It’s a pragmatic calculation, not a scientific conclusion.
- Special pleading is an informal fallacy: claiming an exception to a general trend or principle without actually establishing that it is one, either using a thin rationalization or even just using the exception as evidence for itself (“the rules don’t apply to my claim because my claim is an exception to the rule”).
- It’s true but obvious, and irrelevant to their point … which is that their kooky treatment beliefs are so exotic that they are immune to investigation and criticism, beyond the reach of science. Nope! Not even close! It’s like declaring a leaky old canoe to be seaworthy because we don’t yet know everything about the ocean depths.
- Gorski DH, Novella SP. Clinical trials of integrative medicine: testing whether magic works? Trends in Molecular Medicine. 2014. PainSci Bibliography 53769 ❐
A lot of dead horses are getting beaten in alternative medicine: pointlessly studying silly treatments like homeopathy and reiki over and over again, as if it’s going to tell us something we don’t already know. This point has been made ad nauseam on ScienceBasedMedicine.org since its founding in 2009, but here Drs. Novella and Gorski make the case against testing “whether magic works” in a high-impact journal, Trends in Molecular Medicine.