Professor Irina Volkov, Director of Evidence-Based Medicine at a teaching hospital, faces a complex challenge for her upcoming presentation to the National Institute for Health and Care Excellence (NICE). She's been asked to provide expert testimony about treatment options for a chronic condition where multiple systematic reviews exist, but they reach conflicting conclusions. Her analysis reveals three Cochrane systematic reviews (published 2019-2023), two rapid reviews commissioned by different health technology assessment agencies, four network meta-analyses comparing different intervention combinations, and two umbrella reviews attempting to synthesise the systematic reviews. The reviews use different inclusion criteria, quality assessment methods, and statistical approaches. Some conclude there is strong evidence for intervention A, others favour intervention B, and several suggest insufficient evidence for any recommendation. The NICE committee needs Irina's expert interpretation to guide national treatment guidelines that will affect prescribing decisions across the NHS, potentially impacting 250,000 patients annually and £50 million in healthcare spending.

1. I'd like the group to develop a systematic approach for Irina to evaluate the quality and reliability of these conflicting reviews. Please discuss how tools like AMSTAR-2 should be applied, and bring evidence about which quality domains are most critical when reviews reach different conclusions about the same intervention.

2. Given the mix of review types Irina is dealing with, I want the group to critically analyse how different synthesis methodologies should be weighted in her evidence evaluation. Please discuss the relative strengths of Cochrane reviews versus network meta-analyses versus umbrella reviews and how methodological rigor should influence evidence interpretation.

3. Looking at the conflicting conclusions, I'd like the group to explore the potential sources of discordance between reviews. Please discuss how differences in inclusion criteria, search strategies, quality assessment, and statistical methods could lead to different conclusions and how Irina should present this uncertainty to the NICE committee.

4. Expert Synthesis and Recommendations: Finally, I want the group to consider Irina's role as an expert synthesiser of conflicting evidence. Please discuss how she should move beyond simply describing the reviews to providing informed interpretation that guides policy decisions. What frameworks exist for expert evidence synthesis, and how should uncertainty be communicated to decision-makers?

The following two articles could help you:

Shea, B.J., Reeves, B.C., Wells, G., Thuku, M., Hamel, C., Moran, J., Ivory, J., Tugwell, P., Welch, V. and Kristjansson, E. (2017) 'AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions', BMJ, 358, j4008. Available at: https://doi.org/10.1136/bmj.j4008

Salanti, G. (2012) 'Indirect and mixed-treatment comparison, network, or multiple-treatments meta-analysis: many names, many benefits, one statistical framework', Research Synthesis Methods, 3(2), pp. 80-97. Available at: https://doi.org/10.1002/jrsm.1037

Ahmed

Here is a consolidated overview of what the analysis covers, Ahmed, so your group can see the architecture clearly before engaging with each section.

Overview for the Group

The response is structured around four interconnected analytical arguments:

Q1 - AMSTAR-2 Application: The analysis distinguishes AMSTAR-2's seven critical items from its nine non-critical ones, following Shea et al. (2017) directly. The most important domains when reviews reach conflicting conclusions are protocol pre-registration (Item 2), search comprehensiveness (Item 4), exclusion justification (Item 7), risk-of-bias assessment and its incorporation into conclusions (Items 9 and 13), appropriate statistical methods (Item 11), and publication bias assessment (Item 15). A comparative AMSTAR-2 matrix across all 11 reviews is proposed as the foundation of Irina's presentation. Critically, AMSTAR-2 produces an overall confidence rating (high, moderate, low, or critically low), not a score, and collapsing it into a numeric tally (e.g., "13/16") is a specific misuse the tool's authors warn against (De Santis et al., 2023, PMID 36927334).
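
To make the "rating, not score" point concrete for the group, here is a minimal Python sketch of the rating logic described in Shea et al. (2017). The function name, the input format (a set of item numbers judged as not met), and the two example reviews are assumptions for illustration only. It also simplifies the tool: "partial yes" answers are ignored, and the appraiser's discretion to downgrade a review with many non-critical weaknesses is not modelled.

```python
# Critical items as listed in Shea et al. (2017); the other nine items are non-critical.
CRITICAL_ITEMS = frozenset({2, 4, 7, 9, 11, 13, 15})


def amstar2_confidence(unmet_items):
    """Map the AMSTAR-2 items judged 'not met' to an overall confidence rating.

    Simplified sketch: 'partial yes' responses and appraiser discretion are not modelled.
    """
    unmet = set(unmet_items)
    if not unmet <= set(range(1, 17)):
        raise ValueError("AMSTAR-2 items are numbered 1 to 16")

    critical_flaws = len(unmet & CRITICAL_ITEMS)
    noncritical_weaknesses = len(unmet - CRITICAL_ITEMS)

    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    if noncritical_weaknesses > 1:
        return "moderate"
    return "high"


# Two hypothetical reviews that would both "score" 13/16 if items were simply counted.
review_x_unmet = {1, 3, 10}   # three non-critical weaknesses
review_y_unmet = {2, 7, 15}   # three critical flaws

for name, unmet in [("Review X", review_x_unmet), ("Review Y", review_y_unmet)]:
    print(name, "->", amstar2_confidence(unmet))
```

Both hypothetical reviews miss three items, yet one lands at moderate confidence and the other at critically low, which is why a tally such as "13/16" hides the information the committee most needs.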

Q2 - Methodological weighting: Drawing on Salanti (2012), NMAs are positioned as high-value when three conditions are met: transitivity is justified (patient populations and contexts are comparable across the network), consistency is confirmed (direct and indirect estimates agree), and common heterogeneity is reasonable. Where these are unverified, NMAs should be treated as hypothesis-generating rather than conclusive. Cochrane reviews take precedence as primary evidence; rapid reviews serve a supplementary and corroborative role only.
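
For the group's discussion of the consistency condition, the sketch below shows one simple way to compare a direct estimate with an indirect one. It uses the Bucher adjusted indirect comparison for a single A-B-C loop, which is a swapped-in simplification and not the full network model Salanti (2012) describes. All function names and the numbers in the example are hypothetical, and effects are assumed to be on an additive scale such as log odds ratios.

```python
import math


def indirect_estimate(d_ab, se_ab, d_ac, se_ac):
    """Bucher indirect estimate of C versus B through the common comparator A.

    d_ab is the effect of B relative to A, d_ac the effect of C relative to A,
    both on an additive scale (for example log odds ratios).
    """
    d_bc = d_ac - d_ab
    se_bc = math.sqrt(se_ab ** 2 + se_ac ** 2)  # variances add for independent estimates
    return d_bc, se_bc


def inconsistency(d_direct, se_direct, d_indirect, se_indirect):
    """Return the direct-minus-indirect difference and its z statistic."""
    diff = d_direct - d_indirect
    se_diff = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    return diff, diff / se_diff


# Hypothetical log odds ratios: B vs A, C vs A, and a direct C vs B comparison.
d_ind, se_ind = indirect_estimate(d_ab=-0.30, se_ab=0.10, d_ac=-0.50, se_ac=0.12)
diff, z = inconsistency(d_direct=0.10, se_direct=0.15, d_indirect=d_ind, se_indirect=se_ind)
print(f"indirect C vs B: {d_ind:.2f} (SE {se_ind:.3f})")
print(f"direct minus indirect: {diff:.2f}, z = {z:.2f}")
```

A large absolute z value would flag disagreement between direct and indirect evidence in that loop. A small one does not prove consistency, since such tests tend to have low power, so the transitivity argument still has to be made on clinical grounds.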

Q3 - Sources of discordance: Six primary drivers are analysed, including differences in how population, intervention, comparator, and outcome are defined; search comprehensiveness gaps (and their link to publication bias); the choice of quality assessment instrument and, crucially, how risk-of-bias findings are acted upon; and statistical model choices (fixed vs. random effects, heterogeneity handling, the minimum-study threshold for pooling). The Jadad algorithm is introduced as a stepwise tool for adjudicating between specific conflicting review pairs.
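
To illustrate how the statistical model choice alone can drive discordance, here is a small Python sketch that pools the same studies under a fixed-effect model and under a random-effects model. The DerSimonian-Laird estimator is used for the between-study variance as one common option, not because the reviews in the scenario are known to use it. The function name and the four study results are made up for illustration.

```python
import math


def pool(effects, ses):
    """Inverse-variance pooling under fixed-effect and random-effects models.

    Returns ((fixed_estimate, fixed_se), (random_estimate, random_se), tau2),
    with tau2 estimated by the DerSimonian-Laird method.
    """
    w = [1 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q and the DerSimonian-Laird between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0

    w_re = [1 / (se ** 2 + tau2) for se in ses]
    sw_re = sum(w_re)
    random_est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re

    return (fixed, math.sqrt(1 / sw)), (random_est, math.sqrt(1 / sw_re)), tau2


# Hypothetical log odds ratios and standard errors from four trials.
effects = [-0.60, -0.10, 0.05, -0.45]
ses = [0.10, 0.12, 0.15, 0.20]

(fixed, fixed_se), (rand, rand_se), tau2 = pool(effects, ses)
for label, est, se in [("fixed", fixed, fixed_se), ("random", rand, rand_se)]:
    print(f"{label}: {est:.3f} (95% CI {est - 1.96 * se:.3f} to {est + 1.96 * se:.3f})")
print(f"tau^2 = {tau2:.3f}")
```

With these invented inputs the fixed-effect interval excludes zero while the random-effects interval does not, so two reviews pooling identical trials could reasonably report "strong evidence" and "insufficient evidence" respectively.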

Q4 - Expert synthesis frameworks: The GRADE Evidence to Decision framework is identified as the appropriate vehicle for structuring Irina's testimony, since it is the language NICE already uses. Scenario analysis (best/central/worst case), evidence maps, and conditional recommendations are proposed as mechanisms for communicating calibrated uncertainty without sacrificing actionability. The analysis concludes with a direct connection to the NHS spending context: low-certainty evidence widens the credible intervals around cost-effectiveness estimates, which the NICE committee must factor into its threshold decision.