This is the fourth article in the ‘Why do we do systematic reviews?’ series (see references below for the previous articles [1, 2, 3]). The series explores the reasons for undertaking a systematic review, of which four main reasons seem popular. The third most popular, with 23.6% of the votes at the time of writing, is ‘To quantify, quite tightly, how good the intervention is’.
I will use this article to highlight the clear, demonstrable failings of standard systematic review methods in quantifying – accurately – how good an intervention is. I will then bring it round to the case for rapid reviews.
To be clear, I find this reason the most depressing of them all – the belief that systematic reviews can be relied upon to give an accurate estimate of the effect size of an intervention. It highlights a real problem at the heart of EBM, stemming from the prominent place systematic reviews occupy: they are seen as sitting at the pinnacle of the evidence pyramid. Yet the shortcomings are not much discussed and people, all too often, uncritically accept the results. It is the lack of transparency that is particularly insidious. While a systematic review based on published journal articles might be the best available evidence, that doesn’t make it right or even particularly accurate. One of my favourite quotes seems pertinent:
“Those that choose the lesser of two evils soon forget they chose evil at all”
My main concern, apart from the length of time it takes to do a traditional systematic review, is publication bias. This is the situation whereby 30–50% of all trials are never published. If a central ethos of systematic reviews is to incorporate all trials, then they have failed from the outset. We know that Cochrane, one of the more rigorous producers of systematic reviews, doesn’t have a robust system for dealing with unpublished trials. In fact I would describe it as un-systematic.
Do these missing trials have an effect? Absolutely. In 2008 Turner published an analysis of unpublished studies of antidepressants. For this he had to look through regulatory data (in this case from the FDA) to find the unpublished information. While pharmaceutical companies need to register their trials with the FDA, they are under no obligation to publish them. By carefully going through the FDA data (a non-trivial task) you can find trials that have been carried out but not published. The article reported that there were 38 studies favouring an antidepressant, of which 37 were published. Of the 36 that were negative, only 3 were published. So there is a roughly even split between positive and negative trials, but if you relied on published studies alone nearly all would appear positive. The results are obviously likely to be distorted – which is what Turner reported.
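To make that arithmetic concrete, here is a minimal sketch (in Python; the four counts are the ones quoted above from Turner’s paper, the variable names and everything else are mine) of how the published literature comes to look so much more positive than the full set of trials:

```python
# A minimal sketch of the arithmetic behind the Turner figures quoted above.
# The four counts come from the text; everything else is illustrative.

positive_trials, positive_published = 38, 37   # trials favouring the antidepressant
negative_trials, negative_published = 36, 3    # trials not favouring it

all_trials = positive_trials + negative_trials
all_published = positive_published + negative_published

# Across everything submitted to the regulator, positive and negative results are roughly even...
print(f"Positive share of all trials:       {positive_trials / all_trials:.0%}")        # about half

# ...but among the published literature alone the picture looks overwhelmingly positive.
print(f"Positive share of published trials: {positive_published / all_published:.0%}")  # over 90%
```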
Four years later, Hart, using similar methods, looked at a much wider group of interventions:
“Overall, addition of unpublished FDA trial data caused 46% (19/41) of the summary estimates from the meta-analyses to show lower efficacy of the drug, 7% (3/41) to show identical efficacy, and 46% (19/41) to show greater efficacy.”
In just under 50% of cases the discrepancy was greater than 10%. But what is possibly more troubling is that the results are unpredictable: there is no way of knowing whether the result of a meta-analysis based on published trials is likely to under-estimate or over-estimate the true effect size. Historically the assumption (acknowledged by Hart) was that it would be the negative trials that went unpublished (as Turner had shown), but this was simply not borne out in the larger study.
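To illustrate why that unpredictability matters, here is a hedged sketch – invented effect sizes and standard errors, with simple fixed-effect inverse-variance pooling, not Hart’s actual data or method – of how adding the unpublished trials can drag a pooled estimate down or push it up:

```python
# Illustration only: invented trial results, fixed-effect inverse-variance pooling.
# The point is that the unseen trials can move the summary estimate in either direction.

def pooled_effect(trials):
    """Fixed-effect meta-analysis: inverse-variance weighted mean of trial effect sizes."""
    weights = [1 / se ** 2 for _, se in trials]
    return sum(w * effect for w, (effect, _) in zip(weights, trials)) / sum(weights)

published = [(0.45, 0.10), (0.50, 0.12), (0.40, 0.15)]        # (effect size, standard error)
unpublished_negative = [(0.05, 0.12), (0.10, 0.14)]           # hypothetical near-null trials
unpublished_positive = [(0.70, 0.12), (0.65, 0.14)]           # hypothetical strongly positive trials

print(f"Published trials only:       {pooled_effect(published):.2f}")
print(f"Plus unpublished 'negative': {pooled_effect(published + unpublished_negative):.2f}")  # estimate falls
print(f"Plus unpublished 'positive': {pooled_effect(published + unpublished_positive):.2f}")  # estimate rises
```

Without sight of the submerged trials, you cannot tell which of those three summary figures you are actually looking at.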
Tom Jefferson (known mostly for his work on Tamiflu) used the analogy of an iceberg in relation to data used for evidence synthesis.
He points out that most systematic reviews are missing huge amounts of data: not just the trials that never make it into journal articles, but also the large amount of data held in documents such as clinical study reports.
The iceberg image is particularly powerful as it exposes the nonsense of claiming that systematic reviews are based on ‘everything’. The inclusion criterion for most systematic reviews (i.e. use all published journal articles) is arbitrary and not evidence-based. It’s as much about convenience as ‘truth’. The papers by Hart and Turner show the effects.
Bottom line 1: if you look at any systematic review, there is only around a 50% chance that the estimate of effect is within 10% of the real figure. This is compounded by the fact that you can’t tell whether the review you’re looking at is close, an overestimate or an underestimate.
But how does this link to rapid reviews? It starts with asking what, when looking at a systematic review, you can actually say about it. I’m reduced to thinking you can say it gives a ballpark figure of the effectiveness of an intervention – nothing more. And, if you’re happy with a ballpark figure, do it quickly! I actually find this thinking quite liberating!
Using the iceberg example, is it really problematic to rely on a good sample of published journal articles? If your position is that you need all trials, and that a sample is therefore wrong, then you have to concede that the vast majority of current systematic reviews are wrong. On the other hand, if you’re happy with a sample, why is a 70% sample of the trials appreciably better than a 60% sample (as you might get via a rapid search as part of a rapid review)? It is nonsensical.
Reapplying the argument outlined above to the iceberg theme: a systematic review and a rapid review both sample only the visible tip (the published trials); the rapid review simply takes a slightly smaller slice of it, while the bulk of the data stays submerged either way.
But is there any evidence of what happens when you take a sample of published trials (versus a systematic review that attempts to find all published trials)? I’m aware of two articles [8, 9] that use different sampling techniques, and both report almost identical outcomes to full-blown systematic reviews. My own work on rapid reviews, although not peer-reviewed, shows similar results.
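As a back-of-the-envelope illustration of the sampling argument (a toy simulation with invented trial effects, not the method of either cited paper), pooling a random 60% sample of published trials typically lands very close to pooling all of them:

```python
# A toy simulation of the sampling argument: an average from a random ~60% sample of
# published trial effects sits very close to the average from all of them.

import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# 40 simulated published trial effect estimates scattered around a 'true' effect of 0.40
published_effects = [random.gauss(0.40, 0.10) for _ in range(40)]

full_estimate = statistics.mean(published_effects)
sample_estimate = statistics.mean(random.sample(published_effects, k=24))  # a 60% sample

print(f"All 40 published trials:  {full_estimate:.2f}")
print(f"Random 60% sample (n=24): {sample_estimate:.2f}")  # typically within a few hundredths
```

The sample misses information, of course – but so does the ‘complete’ set of published trials, because both sit above the waterline of the iceberg.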
To me, the notion that you must find every published article is a nonsense: it is wasteful and therefore unethical. This is compounded by the (possibly wilful) suppression of this information. People look to systematic reviews for accurate information, and the fact that the shortcomings are not advertised is negligent. Systematic reviews waste an awful lot of effort, time and resources. Methodologists fiddle around – at great effort and cost – trying to remove biases that invariably have a much smaller effect than the missing trials. As I have stated before, one of the few reasons I can see for maintaining this is an economic one: a high methodological entry point acts as a barrier to entry for competitors.
Bottom line 2: do reviews rapidly, and explicitly mention the methodological shortcomings. If more accurate information is really required, then you invariably need more resources and need to go below the surface, to the depths of the iceberg.
- Why do we do systematic reviews?
- Why do we do systematic reviews? Part 2
- Why do we do systematic reviews? Part 3
- Searching for unpublished data for Cochrane reviews: cross sectional study. Schroll JB et al. BMJ 2013;346:f2231.
- Selective publication of antidepressant trials and its influence on apparent efficacy. Turner EH et al. N Engl J Med 2008;358(3):252-60.
- Effect of reporting bias on meta-analyses of drug trials: reanalysis of meta-analyses. Hart B et al. BMJ 2012;344:d7202.
- McMaster Premium LiteratUre Service (PLUS) performed well for identifying new studies for updated Cochrane reviews. Hemens BJ et al. J Clin Epidemiol 2012;65(1):62-72.e1.
- A pragmatic strategy for the review of clinical evidence. Sagliocca L et al. J Eval Clin Pract 2013. doi: 10.1111/jep.1202
- Economics and EBM. Liberating the Literature. October 2014.