Let’s go back to the eight studies in Ernst’s review article on homeopathic arnica—which we chose pretty arbitrarily—because they demonstrate a phenomenon which we see over and over again with CAM studies: most of the trials were hopelessly methodologically flawed, and showed positive results for homeopathy; whereas the couple of decent studies—the most ‘fair tests’—showed homeopathy to perform no better than placebo.*
So now you can see, I would hope, that when doctors say a piece of research is ‘unreliable’, that’s not necessarily a stitch-up; when academics deliberately exclude a poorly performed study that flatters homeopathy, or any other kind of paper, from a systematic review of the literature, it’s not through a personal or moral bias: it’s for the simple reason that if a study is no good, if it is not a ‘fair test’ of the treatments, then it might give unreliable results, and so it should be regarded with great caution.
There is a moral and financial issue here too: randomising your patients properly doesn’t cost money. Blinding your patients to whether they had the active treatment or the placebo doesn’t cost money. Overall, doing research robustly and fairly does not necessarily require more money, it simply requires that you think before you start. The only people to blame for the flaws in these studies are the people who performed them. In some cases they will be people who turn their backs on the scientific method as a ‘flawed paradigm’; and yet it seems their great new paradigm is simply ‘unfair tests’.
These patterns are reflected throughout the alternative therapy literature. In general, the studies which are flawed tend to be the ones that favour homeopathy, or any other alternative therapy; and the well-performed studies, where every controllable source of bias and error is excluded, tend to show that the treatments are no better than placebo.
This phenomenon has been carefully studied, and there is an almost linear relationship between the methodological quality of a homeopathy trial and the result it gives. The worse the study—which is to say, the less it is a ‘fair test’—the more likely it is to find that homeopathy is better than placebo. Academics conventionally measure the quality of a study using standardised tools like the ‘Jadad score’, a seven-point tick list that includes things we’ve been talking about, like ‘Did they describe the method of randomisation?’ and ‘Was plenty of numerical information provided?’
This graph, from Ernst’s paper, shows what happens when you plot Jadad score against result in homeopathy trials. Towards the top left, you can see rubbish trials with huge design flaws which triumphantly find that homeopathy is much, much better than placebo. Towards the bottom right, you can see that as the Jadad score tends towards the top mark of 5, as the trials become more of a ‘fair test’, the line tends towards showing that homeopathy performs no better than placebo.
There is, however, a mystery in this graph: an oddity, and the makings of a whodunnit. That little dot on the right-hand edge of the graph, representing the ten best-quality trials, with the highest Jadad scores, stands clearly outside the trend of all the others. This is an anomalous finding: suddenly, only at that end of the graph, there are some good-quality trials bucking the trend and showing that homeopathy is better than placebo.
What’s going on there? I can tell you what I think: some of the papers making up that spot are a stitch-up. I don’t know which of the ten papers, how it happened, or who did it, but that’s what I think. Academics often have to couch strong criticism in diplomatic language. Here is Professor Ernst, the man who made that graph, discussing the eyebrow-raising outlier. You might decode his Yes, Minister diplomacy, and conclude that he thinks there’s been a stitch-up too.
There may be several hypotheses to explain this phenomenon. Scientists who insist that homeopathic remedies are in every way identical to placebos might favour the following. The correlation provided by the four data points (Jadad score 1–4) roughly reflects the truth. Extrapolation of this correlation would lead them to expect that those trials with the least room for bias (Jadad score = 5) show homeopathic remedies are pure placebos. The fact, however, that the average result of the 10 trials scoring 5 points on the Jadad score contradicts this notion, is consistent with the hypothesis that some (by no means all) methodologically astute and highly convinced homeopaths have published results that look convincing but are, in fact, not credible.
But this is a curiosity and an aside. In the bigger picture it doesn’t matter, because even including these suspicious studies, the ‘meta-analyses’ still show, overall, that homeopathy is no better than placebo.
Meta-analyses?
Meta-analysis
This will be our last big idea for a while, and this is one that has saved the lives of more people than you will ever meet. A meta-analysis is a very simple thing to do, in some respects: you just collect all the results from all the trials on a given subject, bung them into one big spreadsheet, and do the maths on that, instead of relying on your own gestalt intuition about all the results from each of your little trials. It’s particularly useful when there have been lots of trials, each too small to give a conclusive answer, but all looking at the same topic.
So if there are, say, ten randomised, placebo-controlled trials looking at whether asthma symptoms get better with homeopathy, each of which has a paltry forty patients, you could put them all into one meta-analysis and effectively (in some respects) have a four-hundred-person trial to work with.
In some very famous cases (famous, at least, in the world of academic medicine), meta-analyses have shown that a treatment previously believed to be ineffective is in fact rather good. Nobody had been able to spot the benefit because each of the trials that had been done was, individually, too small to detect it.
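To make the arithmetic concrete, here is a minimal sketch of one common way of 'doing the maths': a fixed-effect, inverse-variance pooling of log odds ratios. It is only an illustration under stated assumptions. The trial counts are invented, the function names are hypothetical, and real meta-analyses involve many further choices (which trials to include, whether they are similar enough to pool) that are ignored here.

```python
import math

# Hypothetical results, invented purely for illustration: ten small trials,
# each with 20 patients on treatment and 20 on placebo.
# Each tuple is (improved on treatment, improved on placebo).
TRIALS = [(12, 8), (11, 9), (13, 8), (10, 9), (12, 7),
          (11, 8), (9, 9), (13, 9), (12, 9), (10, 7)]
ARM_SIZE = 20
Z_95 = 1.96  # normal quantile for a 95% confidence interval


def log_odds_ratio(treat_improved, placebo_improved, arm_size=ARM_SIZE):
    """Return (log odds ratio, standard error) for one two-arm trial."""
    a, b = treat_improved, arm_size - treat_improved
    c, d = placebo_improved, arm_size - placebo_improved
    estimate = math.log((a * d) / (b * c))
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_err


def confidence_interval(estimate, std_err):
    """Return the (low, high) ends of the 95% confidence interval."""
    return estimate - Z_95 * std_err, estimate + Z_95 * std_err


def pool_fixed_effect(results):
    """Inverse-variance weighted average of (estimate, std_err) pairs."""
    weights = [1 / se ** 2 for _, se in results]
    pooled = sum(w * est for w, (est, _) in zip(weights, results)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


def excludes_zero(interval):
    """True if the interval rules out 'no difference' (a log odds ratio of 0)."""
    low, high = interval
    return low > 0 or high < 0


results = [log_odds_ratio(t, p) for t, p in TRIALS]
single_trial_verdicts = [excludes_zero(confidence_interval(*r)) for r in results]
pooled_interval = confidence_interval(*pool_fixed_effect(results))

print("Trials conclusive on their own:", sum(single_trial_verdicts))
print("Pooled result conclusive:", excludes_zero(pooled_interval))
```

With these made-up numbers, none of the ten forty-patient trials can rule out 'no difference' on its own, but the pooled estimate can. That is the sense in which lots of inconclusive little trials can add up to one conclusive answer.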
As I said, information alone can be life-saving, and one of the greatest institutional innovations of the past thirty years is undoubtedly the Cochrane Collaboration, an international not-for-profit organisation of academics, which produces systematic summaries of the research literature on healthcare, including meta-analyses.
The logo of the Cochrane Collaboration features a simplified ‘blobbogram’, a graph of the results from a landmark meta-analysis which looked at an intervention given to pregnant mothers. When people give birth prematurely, as you might expect, the babies are more likely to suffer and die. Some doctors in New Zealand had the idea that giving a short, cheap course of a steroid might help improve outcomes, and seven trials testing this idea were done between 1972 and 1981. Two of them showed some benefit from the steroids, but the remaining five failed to detect any benefit, and because of this, the idea didn’t catch on.
Eight years later, in 1989, a meta-analysis was done by pooling all this trial data. If you look at the blobbogram in the logo on the previous page, you can see what happened. Each horizontal line represents a single study: if the line is over to the left, it means the steroids were better than placebo, and if it is over to the right, it means the steroids were worse. If the horizontal line for a trial touches the big vertical ‘nil effect’ line going down the middle, then the trial showed no clear difference either way. One last thing: the longer a horizontal line is, the less certain the outcome of the study was.
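The reading rules just described can be written down as a short sketch. It assumes each horizontal line is a confidence interval on an odds-ratio scale, with the vertical 'nil effect' line at 1.0. The intervals below are invented for illustration and are not the steroid trial data, and the function names are hypothetical.

```python
NIL_EFFECT = 1.0  # assumed odds-ratio scale, where 1.0 means "no difference"


def read_line(low, high):
    """Describe one horizontal line, given the ends of its confidence interval."""
    if low <= NIL_EFFECT <= high:
        verdict = "no clear difference"  # the line touches the vertical line
    elif high < NIL_EFFECT:
        verdict = "treatment better"     # entirely left of the vertical line
    else:
        verdict = "treatment worse"      # entirely right of the vertical line
    return verdict, high - low  # a longer line means a less certain result


def draw_line(low, high, scale_max=2.0, width=41):
    """Render one trial as a row of text, with '|' as the nil-effect line."""
    def column(x):
        clipped = min(max(x, 0.0), scale_max)
        return round(clipped / scale_max * (width - 1))

    row = [" "] * width
    for i in range(column(low), column(high) + 1):
        row[i] = "-"
    row[column(NIL_EFFECT)] = "|"
    return "".join(row)


# Hypothetical intervals, invented for illustration only.
EXAMPLE_TRIALS = {
    "Trial A": (0.30, 0.80),
    "Trial B": (0.40, 1.60),
    "Trial C": (1.10, 1.90),
}

for name, (low, high) in EXAMPLE_TRIALS.items():
    verdict, length = read_line(low, high)
    print(f"{name} {draw_line(low, high)} {verdict} (line length {length:.2f})")
```

Each printed row is a crude text version of one line in a blobbogram: where it sits relative to the `|` gives the verdict, and how long it is gives the uncertainty.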
Looking at the blobbogram, we can see that there are lots of not-very-certain studies, long horizontal lines, mostly touching the central vertical line of ‘no effect’; but they’re all a bit over to the left,