An interesting exchange on Twitter – reflections

Last week I posted “An interesting exchange on Twitter”. I ended that post with this:

When might:

  • the largest trial suffice?
  • a rapid review suffice?
  • a systematic review suffice?
  • you need to do a full systematic review, using all the data (including unpublished data such as CSRs, as seen with the Tamiflu work of Tom Jefferson)?

This triggered further exchanges on Twitter, which have led to further reflections. Perhaps my biggest one is that much of the debate is driven by ideology, not evidence. An example is the use of the largest trial to guide decisions. It was pointed out – by James Thomas, someone with impeccable systematic review credentials – that:

“Relying on the largest study results in no important differences in findings in only 34% of cases. This doesn’t look to be sufficiently reliable for practical purposes to me…”

I don’t disagree. But I can’t help feeling that, if one were curious, one would seek to understand when the largest trial agreed and when it didn’t. For instance, I suspect that if you plotted agreement (between the largest RCT and the subsequent meta-analysis) against sample size, the pattern would not be random. As the trial gets larger the agreement is likely to be stronger. This makes sense: the larger the trial, the greater its weight is likely to be in the meta-analysis. So, there are two options:

  • Default to saying it’s unreliable, close the book and help preserve the status quo.
  • Explore the relationship and see if you can – in some situations – rely on the largest trial.

The data is there and I think it’d make an interesting study on its own (even if my assumptions prove to be wrong).
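
To illustrate the weighting point (not the agreement question itself, which needs real data), here is a minimal sketch in Python. It assumes a fixed-effect, inverse-variance meta-analysis and makes the simplifying assumption that each trial's variance is proportional to 1/n, so its weight is proportional to its sample size. The function names and the trial sizes are hypothetical, made up purely for illustration.

```python
def largest_trial_weight_share(sample_sizes):
    """Share of the total fixed-effect weight held by the largest trial.

    Simplifying assumption: each trial's effect-estimate variance is
    proportional to 1/n, so its inverse-variance weight is proportional to n.
    """
    if not sample_sizes:
        raise ValueError("need at least one trial")
    weights = [float(n) for n in sample_sizes]
    return max(weights) / sum(weights)


def pooled_estimate(effects, sample_sizes):
    """Fixed-effect pooled estimate under the same variance assumption."""
    if len(effects) != len(sample_sizes) or not effects:
        raise ValueError("effects and sample_sizes must match and be non-empty")
    total = float(sum(sample_sizes))
    return sum(e * n for e, n in zip(effects, sample_sizes)) / total


# Hypothetical sets of trials: same three small trials, different largest trial.
modest_largest = [100, 120, 150, 200]
dominant_largest = [100, 120, 150, 2000]

print(largest_trial_weight_share(modest_largest))    # 200 / 570
print(largest_trial_weight_share(dominant_largest))  # 2000 / 2370
```

Under these assumptions, the more the largest trial dominates the total sample, the more the pooled estimate is pulled towards it, which is one mechanical reason agreement might rise with trial size. Whether it actually does is the empirical question.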

Another ideological assertion (as I would characterise it) came from James and Leonie van Grootel, who both say roughly the same thing. Quoting Leonie:

I am not aware either of the mentioned distinction SR/SR+. You make a decision how to balance your search strategy between precision and specificity and the choices, such as only to include peer reviewed journal articles, are carefully explained as potential limitations.

And James reported:

The systematic review / ‘full’ systematic review distinction doesn’t work for me, as the above definition will include data from CSRs and other unpublished sources when relevant.

They both make the case that you should seek appropriate evidence and that this can include CSRs and unpublished data. I would love to see anything that guides this decision-making process. Is it faith-based or evidence-based? James later restated a line from Cochrane:

The Cochrane Handbook says “a systematic review attempts to collate all the empirical evidence that fits pre-specified eligibility criteria… to answer a specific research question”.

I love the use of the term ‘attempts’; it just makes me smile as it’s bordering on meaningless in this context. A slight aside: I ‘attempt’ to touch my toes but fail badly. How hard do you ‘attempt’ to find all the empirical evidence?

Erick Turner shares my perspective:

“The usual interpretation of the term “systematic review” seems to be that you should systematically review the published literature…and systematically ignore data in regulatory documents (and thereby ignore unpublished studies)”.

There is evidence out there:

Searching for unpublished data for Cochrane reviews: cross sectional study. Schroll JB et al. BMJ. 2013 Apr 23;346:f2231. This indicates that many respondents attempt to locate unpublished data, but that success is low and the process is likely to be unsystematic: “Manufacturers and regulatory agencies were uncommon sources of unpublished data.”

The striking discord here is that those from a systematic review production world do not see the distinction while those outside of it see it fairly clearly. Bias/ideology on both sides?

I think a lot of this comes down to the question of how much data you need. I love Tom Jefferson’s use of the iceberg metaphor:

The metaphor was taken from “Restoring invisible and abandoned trials: a call for people to publish the findings”, BMJ 2013; 346.

The point is that there is lots of data out there and if you rely only (or largely) on published journal articles (the stuff above the waterline) then you’re missing loads of data.

So, that leads to the following question:

Do you need all the data or not?

This is a fundamental point, one that is infrequently articulated. As far as I’m aware, the work around Tamiflu is the review that has made the most extensive use of data, and that was a massive undertaking. So, from a resource perspective, it’s impractical to do this with any regularity. So, the question then becomes:

If you don’t need all the data, how much do you need?

In their article “Can it really be true that 50% of research is unpublished?”, Iain Chalmers and Paul Glasziou report:

Reviewers trying to summarize all the research addressing a particular question are limited by access only to a biased subsample of what has been done.

This (relying on published articles) has consequences! In his seminal 2008 paper “Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy”, Turner highlighted that the majority of the positive trials (for antidepressants) were published and the majority of the negative ones were unpublished. The paper reports: “According to the published literature, it appeared that 94% of the trials conducted were positive. By contrast, the FDA analysis showed that 51% were positive.” Relying on the published literature gives you a “biased subsample” (the quote from Chalmers and Glasziou) and a significant distortion of an intervention’s likely ‘worth’. These findings are not a one-off; other studies have found big differences between all trials and published trials.

The one thing that strikes me is how arbitrary (or is it faith-based?) it is to rely on journal articles. I believe that when systematic reviews started off in earnest (perhaps 25 years ago) the problem of reporting bias was considerably less well understood and therefore using just journal articles seemed entirely reasonable. Since then structures, belief systems, careers and ideologies have been built on this premise. Is it any wonder there is resistance when new (and inconvenient) evidence surfaces?

So, where to go from here?

Rapid reviews are part of this debate! I’ve typically said that relying on published articles is likely to get you to a ballpark figure and if you’re happy with ballpark then do it quickly. Caroline Struthers has changed my perspective on this with her tweet:

My view is that if you use incomplete (and if published probably biased) data you cannot say with any confidence you are in the right ballpark. Unpublished data may also be useless if the wrong primary outcomes were used because researchers didn’t involve patients

So, this is where I’m at. It’s a mess. Setting out the problems is the easy part, but how to solve them is another issue entirely.

But, to me, it all boils down to this: for a given review topic and a given context, how much data do you need to provide a robust answer? We lack evidence to guide this decision, so we rely on faith and the nonsensical and entirely arbitrary decision to rely on published journal articles.


One thought on “An interesting exchange on Twitter – reflections”

  1. Thanks Jon. The last sentence is a great summary of where we’re at in the systematic review “industry”. I think we need to cut systematic review emissions as a matter of urgency to improve evidence efficiency and reduce pollution.

