On the appropriate interpretation of evidence: the example of anti-vascular endothelial growth factor for diabetic macular edema

DOI: https://doi.org/10.21203/rs.3.rs-1030290/v1

Abstract

Background: Different network meta-analyses (NMAs) on the same topic can report different findings. In this review we investigated NMAs comparing ranibizumab with aflibercept for diabetic macular edema in the hope of illuminating why such differences occurred.

Findings: For the binary outcome of best corrected visual acuity, the reviews all agreed on there being no clear difference between the two treatments, while the continuous outcomes all favoured aflibercept over ranibizumab. We discuss four points of particular concern illustrated by five similar NMAs: network differences, PICO differences, different data from the same measures of effect, and differences in what is truly significant.

Conclusions: Closer inspection of each of these reviews shows that the methods, including the searches and analyses, all differ, but the findings, although presented differently and sometimes interpreted differently, were similar.

1 Background

With the rapid increase in biomedical evidence, systematic reviews offer an opportunity to base healthcare decisions on comprehensive summaries of the best available evidence on a topic (1–3). Current knowledge may be imperfect, but decisions should be better informed when taken in the light of the best, most up-to-date knowledge. It is crucially important that the systematic review itself is both clear and accurate for local interpretation by healthcare decision makers (healthcare practitioners and policy makers) (3, 4). When different research teams take similar approaches to summarizing evidence but report contradictory findings, there is a problem. In this review we investigate one such example in the hope of illuminating why the differences in findings occurred.

Our example is taken from network meta-analyses (NMAs) comparing ranibizumab (a monoclonal antibody fragment) with aflibercept (a recombinant fusion protein), both inhibitors of vascular endothelial growth factor (VEGF), for diabetic macular edema (DME). DME leads to impaired vision-related functioning and quality of life (QoL) (7) and is the main cause of moderate to severe vision impairment in people with diabetes. DME constitutes a substantial economic burden for patients and public health systems (8, 9). The therapeutic goal for people with DME is to improve visual function and vision-related QoL (10). Anti-VEGF therapy is recommended by several clinical guidelines as first-line treatment (11, 12). Ranibizumab and aflibercept are commonly used in clinical practice, but there are limited direct comparisons of these two drugs. Network meta-analysis then becomes an attractive option, as NMAs use past studies in which each drug was compared directly with other controls to create statistical indirect comparisons of the two medications of current interest.

2 Findings

We identified five NMAs (13–17) by searching electronic databases (PubMed, Embase, Cochrane Library, Web of Science, CNKI, Wanfang, VIP) and screening the results. The AMSTAR-2 tool (3) was used to rate the quality of each review.

Almost every aspect of the methods varied between reviews (Table 1). Although the question under investigation was consistent, the searches, the numbers of studies used, the definitions of eligible participants, the comparisons from which data were sourced, and the acceptable outcomes were mostly inconsistent.

 
Table 1. Summary of included NMAs

 

Study IDa

 

Korobelnik 2015(13)

Régnier 2014(14)

Zhang 2016(15)

Muston 2018(16)

Virgili 2018(17)

Protocol identified

Search

Cochrane

EMBASE

MEDLINE

b

b

c

b

Others

       

d

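| Item | Korobelnik 2015(13) | Régnier 2014(14) | Zhang 2016(15) | Muston 2018(16) | Virgili 2018(17) |
| --- | --- | --- | --- | --- | --- |
| Date | 01/2013 | 02/2014 | 08/2015 | 12/2016 | 04/2017 |
| Number of studies | 11 | 8 | 21 | 13 | 24 |
| Statistics | | | | | |
| Network model | Bayesian | Bayesian | Bayesian | Bayesian | Frequentist |
| Sensitivity analysis | Heterogeneous studies; ethnic group | Ethnic group | | Heterogeneous studies; ethnic group | Studies at higher risk of bias^f |
| Covariates | baseline BCVA and/or CRT | baseline BCVA and/or CRT | | baseline BCVA and/or CRT | |
| Other | | | | Some IPD | |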

Participants

Diabetic macular edema

Significant, focal or diffuse

,

Baseline BCVA & CRT varied - 24-78 letters

 

As for Korobelnik 2015

Baseline visual acuity between 20/200 and 20/40

DME secondary to diabetes involving the center of the macula

     

retinal thickening due to DME/clinically significant macula edema with DR

   

previously received central/peripheral laser or treatment naïve included

Interventionsh

aflibercept

 

2q4 or 2q8

2 mg; bimonthly

intravitreal

2q8

2 mgg

+ laser

   

 

ranibizumab

 

0.5 mg, PRN

0.5 mg, PRN

intravitreal

0.5 mg, PRN or 0.5 mg T&E or 0.3 mg, q4

0.5 mg or 0.3 mg

+ laser

 

deferred

 

prompt

dexamethasone

implants

 

implants

   


         

bevacizumab

     

intravitreal

1.25 mg

1.25 mg

+ laser

 

 

triamcinolone acetonide

     

intravitreal

4 mg, q4/PRN or 4 mg, q4

 

+ laser

 

 

pegaptanib

         

0.3 mg

Laser

 

 

+ sham injection

       

Sham

 

   

Outcomes

Binary

ETDRS letterse

>10 and >15 gain; >10 and >15 loss

>10 gain

 

>10 and >15 gain; >10 and >15 loss

>15 gain

AEs

 

 

Continuous

(average change)

in BCVA using ETDRS charts

 

     

CMT

 

CRT measured using OCT

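| Item | Korobelnik 2015(13) | Régnier 2014(14) | Zhang 2016(15) | Muston 2018(16) | Virgili 2018(17) |
| --- | --- | --- | --- | --- | --- |
| Quality rating (AMSTAR-2) | Low | Low | Low | Low | High |
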

a Sorted by search date
b Including In-Process Citations and Daily Update
c PubMed
d International Clinical Trials Registry Platform; ISRCTN registry; LILACS; Novartis Clinical Trials database; US National Institutes of Health Ongoing Trials Register ClinicalTrials.gov; World Health Organization
e In BCVA
f Post hoc
g Regarding drug dose and monitoring/retreatment regimen, Virgili 2018 included schemes that are either on-label or commonly used in clinical practice (such as monthly, bimonthly, PRN, T&E)
h None of the included NMAs included RCTs that investigated conbercept.
AE: adverse event; BCVA: best corrected visual acuity; CI: credible/confidence interval; CMT: central macular thickness; CRT: central retinal thickness; DME: diabetic macular edema; DR: diabetic retinopathy; ETDRS: Early Treatment Diabetic Retinopathy Study; IPD: individual patient data; NI: no information; NMA: network meta-analysis; OCT: optical coherence tomography; PRN: pro re nata; q4: every 4 weeks; T&E: treat-and-extend; 2q8: 2 mg every 8 weeks


Table 2 summarises three outcomes reported in the NMAs. The first shows how, for an identical binary outcome, different reviews gathered data from different studies and, partly because of this, arrived at slightly different point estimates, although all agreed on there being no clear difference between the two treatments (all 95% intervals straddled 1, the null value for an odds ratio). The other two outcomes reproduce the results from each review and illustrate how the same measure is reported in different ways and in different combinations across the five reviews.


 
Table 2. Summary of three outcomes in NMAs

| Outcome | Régnier 2014(14) | Korobelnik 2015(13) | Zhang 2016(15) | Muston 2018(16) | Virgili 2018(17) |
| --- | --- | --- | --- | --- | --- |
| Gain ≥ 10 ETDRS letters at 12 months (three reviews^a), OR [95% CrI] | 0.63 [0.19 to 1.63] | 1.59 [0.75 to 3.35] | NR | 1.79 [0.63 to 4.06] | NR |
| Studies reporting these data^b in each NMA: | | | | | |
| Elman 2010(18) [DRCR.net Protocol I] | Included | Included | NR | Included | NR |
| Mitchell 2011(19) [RESTORE] | Included | Included | NR | Included | NR |
| Korobelnik 2014(20) [VIVID; VISTA] | Included | Included | NR | Included | NR |
| Massin 2010(21) [RESOLVE] | Included | Not included^c | NR | Not included^c | NR |
| Googe 2011(22) [DRCR.net Protocol J] | Not included^d | Included | NR | Included | NR |
| Do DV 2012(23) [Da VINCI] | Included | Not included^d | NR | Not included^d | NR |
| Ishibashi 2015(24) [REVEAL] | Not included, focus on Asian population^e | Included^e | NR | Included | NR |
| RESPOND(25) [NCT01135914] | Included | Not included^f | NR | Included | NR |
| Nguyen 2009(26) [READ-2] | Included^e | Not included^g | NR | Not included^g | NR |
| Average change in BCVA^h at 12 months (five reviews), MD [95% CrI] | 4.5 [1.5 to 7]^i | 4.67 [2.45 to 6.87] | 2.07 [-0.97 to 5.33] | 5.20 [1.90 to 8.52] | 4 [2.5 to 5.5] |
| Gain ≥ 10 ETDRS letters^h at 12 months (five reviews), OR [95% CrI] | 0.63 [0.19 to 1.63] | 1.59 [0.75 to 3.35] | NR | 1.79 [0.63 to 4.06] | NR |
| Gain ≥ 15 ETDRS letters^h at 12 months (five reviews), OR [95% CrI] | NR | NR | NR | 2.30 [1.12 to 4.20] | 1.33 [1.06 to 1.67]^j |

a Zhang 2016 and Virgili 2018 did not report this outcome.
b The additional reasons presented for ‘included’ or ‘not included’ were identified by the author team of this review, not in the original texts of the NMAs.
c Data unavailable on ranibizumab 0.5 mg
d Unclear reason for exclusion
e Included in sensitivity analysis
f Unpublished when NMA conducted
g Data only reported at 6 months
h Higher values represent better visual acuity measured using ETDRS letters
i Data were analysed by the author team of this review (Bayesian network model/random effects, using ADDIS software), not reported in the original text of Régnier 2014.
j Data were risk ratio (RR) and its 95% CrI
CrI: credible interval; BCVA: best corrected visual acuity; ETDRS: Early Treatment Diabetic Retinopathy Study; MD: mean difference; NMA: network meta-analysis; NR: not reported; OR: odds ratio.


3 Discussion

The reader must continue to think ‘cleanly’ amid data that may not be quite so clean. Below we discuss some points of particular concern that are illustrated by these five similar NMAs.

3.1 Network differences

Network meta-analysis employs data from (in these cases) randomised trials, constructing the comparisons of interest by somewhat assumption-heavy observational methods. For example, aflibercept versus ranibizumab is the comparison of interest (referred to as the decision set). Aflibercept or ranibizumab, however, may only have been compared with sham injection in randomised trials. We use the term ‘supplementary set’ to refer to interventions, such as sham injection, that are included in the network meta-analysis for the purpose of improving inference among interventions in the decision set(4). Because review teams select different supplementary sets, different network structures can be built for the same clinical problem. In selecting which competing interventions to include in the decision set, researchers should ensure that the transitivity assumption is likely to hold, mostly on the basis of clinical considerations(4, 27).
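To make the idea of an indirect comparison concrete, the following is a minimal sketch of the simple adjusted indirect comparison (often called the Bucher method) for two drugs that were each compared only with sham injection. This is a deliberately simpler stand-in, not the Bayesian or frequentist network models used in the five NMAs, and the input odds ratios are hypothetical values chosen for illustration, not data from any included trial.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% interval


def log_or_and_se(odds_ratio, lower, upper):
    """Recover log(OR) and its standard error from a reported 95% interval."""
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return log_or, se


def indirect_comparison(a_vs_c, b_vs_c):
    """Indirect OR for A versus B through a common comparator C.

    Each argument is a tuple (OR, lower, upper). The log odds ratios are
    subtracted and their variances are added, so the indirect estimate is
    always less precise than either direct estimate.
    """
    log_ac, se_ac = log_or_and_se(*a_vs_c)
    log_bc, se_bc = log_or_and_se(*b_vs_c)
    log_ab = log_ac - log_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return (
        math.exp(log_ab),
        math.exp(log_ab - Z_95 * se_ab),
        math.exp(log_ab + Z_95 * se_ab),
    )


# Hypothetical direct estimates against sham injection (assumed values).
aflibercept_vs_sham = (4.0, 2.0, 8.0)
ranibizumab_vs_sham = (2.0, 1.0, 4.0)

indirect = indirect_comparison(aflibercept_vs_sham, ranibizumab_vs_sham)
print("Indirect OR %.2f [%.2f to %.2f]" % indirect)
```

With these assumed inputs the indirect odds ratio is 2.0, but its interval (roughly 0.75 to 5.33) straddles 1 even though neither direct interval falls below 1. The sketch is valid only if the two sets of sham-controlled trials are similar enough for the transitivity assumption to hold.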

Provided the theoretical assumptions (transitivity and consistency) hold, there is no absolute right or wrong in the construction of a network structure. Readers of the review should carefully consider whether they feel the indirect comparisons in the network are sensible and make the best use of the available data.

3.2 PICO differences

Mostly, the different PICO choices made in different NMAs can all be considered to make clinical sense. These differences do then lead to results that are not identical. For example, one review team may feel that a systematic difference in the participants of a particular trial makes it inappropriate to network its data with those of other studies (e.g., Ishibashi 2015, included only in a sensitivity analysis in Régnier 2014 but in the main analysis of Korobelnik 2015). Variations in the dosage of treatments may, in the view of one review team, make a study ineligible but be acceptable to other researchers (e.g., Massin 2010, included in Régnier 2014, excluded from Korobelnik 2015 and Muston 2018). The time-point of outcome assessment can add further differences (e.g., Nguyen 2009, included in Régnier 2014, excluded from Korobelnik 2015 and Muston 2018).

These decisions, all made with the best of intentions, lead to the inclusion of different studies contributing to the final, slightly different, results illustrated in Table 2. It should not be a surprise that clinicians and researchers evolve their ideas over time and differ even at the same time. Readers need to consider and understand which participants are included in, and excluded from, the review, which treatments are its focus and whether there are omissions, and which outcomes are being reported and why those choices were made.

3.3 Different data from the same measures of effect

Clinicians and patients tend first to ask whether the treatment will help them, for example, ‘get better’ (a question that merits a binary answer) and only then, as a second preference, to seek more detailed information on the degree of improvement (meriting an answer on a continuous scale). The dichotomous or binary outcome, though, is often a crude and even arbitrary cut-off within an ostensibly continuous measure. Continuous measures are, however, often a research fabrication and not truly continuous.

In the examples in Table 2 the average improvement seems relatively consistently to be around 4 letters. It is difficult to understand what this may mean for any one patient’s life. In averaging across the groups something may be lost, however, that is revealed in the binary outcome, and Table 2 gives good grounds for speculation. The reviews that report a ≥10-letter gain consistently show no clear difference between aflibercept and ranibizumab, but the two latest reviews report a new binary outcome (≥15-letter gain) and both show an advantage for those allocated to aflibercept. Perhaps averaging across all people in the trials has masked an important group of people who respond better to aflibercept. But these are clinical and research points of debate.

Overall, the five reviews have reported results that are complicated, thought provoking, but not truly inconsistent with each other. The reader needs to consider the value of the outcome for their need. The researchers may favour the continuous measure of function, the clinician or patient the binary cut-off for better/not better and the policy maker the economics.
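Table 2 reports the ≥15-letter gain as an odds ratio in one review and as a risk ratio in another, so the two numbers are not directly comparable. The following minimal sketch, using hypothetical counts that are not drawn from any included trial, shows how the same 2×2 data yield different values on the two scales.

```python
def risk_ratio(events_treat, n_treat, events_ctrl, n_ctrl):
    """Ratio of the proportions of responders in the two groups."""
    return (events_treat / n_treat) / (events_ctrl / n_ctrl)


def odds_ratio(events_treat, n_treat, events_ctrl, n_ctrl):
    """Ratio of the odds of responding in the two groups."""
    odds_treat = events_treat / (n_treat - events_treat)
    odds_ctrl = events_ctrl / (n_ctrl - events_ctrl)
    return odds_treat / odds_ctrl


# Hypothetical trial: 40/100 responders on one drug, 30/100 on the other.
rr = risk_ratio(40, 100, 30, 100)
or_value = odds_ratio(40, 100, 30, 100)
print(f"RR = {rr:.2f}, OR = {or_value:.2f}")
```

For these assumed counts the risk ratio is about 1.33 and the odds ratio about 1.56. When the outcome is common, as a gain in letters can be, the odds ratio lies further from 1 than the risk ratio, so a larger odds ratio in one review does not by itself indicate a larger effect than a smaller risk ratio in another.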

3.4 Differences in what is truly significant

Even when the synthesis of data reaches a pre-stated level of statistical significance, the finding for that outcome measure may not have great clinical impact. Confusion easily arises when the same data are commented on from a statistical perspective by some and from a clinical perspective by others. Careful consideration is required from the reader to understand the assessment of the reviewers: are they reporting the clinical perspective, the statistical perspective, or a mixture of both?

A further danger of differing interpretations of the same findings arises when confidence intervals straddle zero (for continuous data) or 1 (for binary data, as for all the ≥10-letter gain findings in Table 2). It is easy for reviewers and readers of the reviews to confuse ‘no evidence of an effect’ with ‘evidence of no effect’. Wide confidence intervals, for example the 0.63 to 4.06 of Muston 2018 in Table 2, straddle 1 or unity. In this case it is wrong to claim that aflibercept has ‘no effect’ or is ‘no different’ from ranibizumab; both statements carry too much certainty. It is true that no clear difference has been shown, but neither has it been shown that the two drugs are equivalent. If a possible beneficial effect is mentioned in the conclusion, a possible harmful effect should also be mentioned and discussed.
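The reading of an interval against its null value can be sketched as a small check. The intervals below are copied from Table 2; the function name and the wording of the labels are illustrative assumptions, not terminology from the reviews.

```python
def read_interval(lower, upper, null_value):
    """Describe an interval relative to the null value.

    The null value is 1 for ratios (OR, RR) and 0 for differences (MD).
    """
    if lower <= null_value <= upper:
        # Absence of evidence of a difference, not evidence of equivalence.
        return "no clear difference"
    return "clear difference"


# Gain of >= 10 ETDRS letters, OR with 95% interval (Table 2).
gain_10_letters = {
    "Régnier 2014": (0.19, 1.63),
    "Korobelnik 2015": (0.75, 3.35),
    "Muston 2018": (0.63, 4.06),
}
# Average change in BCVA, MD with 95% interval (Table 2).
mean_change = {
    "Zhang 2016": (-0.97, 5.33),
    "Virgili 2018": (2.5, 5.5),
}

or_readings = {k: read_interval(lo, hi, 1.0) for k, (lo, hi) in gain_10_letters.items()}
md_readings = {k: read_interval(lo, hi, 0.0) for k, (lo, hi) in mean_change.items()}
print(or_readings)
print(md_readings)
```

The label ‘no clear difference’ is deliberately not ‘no difference’: an interval such as 0.63 to 4.06 remains compatible with both a moderate harm and a fourfold benefit on the odds scale.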

As always, really thinking about the meaning of findings is key. Together, the point estimate and confidence interval provide information to assess the effects of the intervention on the outcome. For example, in the evaluation of these drugs on BCVA it could have been decided that it would be clinically useful if the medication increased BCVA from baseline by 5 letters, and at the very least by 2 letters. Virgili 2018 reports an effect estimate of an increase from baseline of 4 letters with a 95% confidence interval from 2.5 to 5.5 letters. This allows the conclusion that aflibercept was useful, since both the point estimate and the entire range of the interval exceed the criterion of an increase of 2 letters. The data from the Régnier 2014 review gave a similar point estimate (4.5 letters) but with a wider interval, from 1.5 to 7 letters. In this case, although it could still be concluded that the best estimate of the aflibercept effect is that it provides net benefit, the reader could not be so confident, as the possibility still has to be entertained that the effect could be between 1.5 and 2 letters, a low range that had been pre-specified to be of little clinical value. The contrast of Régnier 2014 and Virgili 2018 serves well to illustrate how very similar findings may justify subtly different implications. The reviewers carry a responsibility to help the reader by clear reporting and thoughtful, inclusive explanations, but where this has not happened the readers may have to do this for themselves.
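The comparison of an interval with a pre-specified minimum useful effect can be sketched as follows. The 2-letter threshold and the two estimates are the ones used in the paragraph above; the function and its category labels are illustrative assumptions rather than an established classification.

```python
def clinical_reading(point, lower, upper, minimum_useful=2.0):
    """Compare an estimate and its interval with a minimum useful effect."""
    if lower > minimum_useful:
        return "useful: whole interval exceeds the threshold"
    if point > minimum_useful:
        return "probably useful: interval includes values at or below the threshold"
    if upper > minimum_useful:
        return "uncertain: only part of the interval exceeds the threshold"
    return "unlikely to be useful: whole interval at or below the threshold"


# Mean gain in BCVA (ETDRS letters) with 95% interval, from Table 2.
virgili_2018 = clinical_reading(4.0, 2.5, 5.5)
regnier_2014 = clinical_reading(4.5, 1.5, 7.0)
print("Virgili 2018:", virgili_2018)
print("Régnier 2014:", regnier_2014)
```

The two reviews land in different categories despite similar point estimates, which is the subtle difference in implication described above. The result depends entirely on the threshold chosen, which should be fixed before the data are seen.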

4 Conclusions

We have summarised the methods and findings of five NMAs on the same topic which produced what seemed like somewhat different findings from similar data sets. Closer inspection of each of these reviews shows that the methods, including the searches and analyses, all differ, but the findings, although presented differently and sometimes interpreted differently, were similar.

As always, the critical reader of a review should think about the review in detail. This is helped by long-established checklists (28). Furthermore, Grading of Recommendations Assessment, Development, and Evaluation (GRADE) offers a transparent and structured process for developing and presenting summaries of evidence, including its quality, for systematic reviews and recommendations in health care (29).

As is common in different trials and reviews, outcomes – even the same measures – can be legitimately reported in several different ways. There is no avoiding the need to think through what the numbers really mean in terms of people, services and policies. This may necessitate careful, subtle, humane, and expert consideration.

Abbreviations

NMA: network meta-analysis

DME: diabetic macular edema

BCVA: best corrected visual acuity

QoL: quality of life

VEGF: vascular endothelial growth factor

GRADE: grading of recommendations assessment, development, and evaluation

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and material

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests

None.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Authors’ contributions:

Jing Wu: study design, draft, review, revision;

Clive Adams: draft, review, revision;

Xiaoning He: screening of articles, data extraction;

Fang Qi: screening of articles, data extraction, statistical analysis;

Jun Xia: draft, review, revision.

All authors read and approved the final manuscript.

Acknowledgements

We thank everyone who kindly provided assistance during our preparation of this manuscript.

References

  1. Bastian H, Glasziou P, and Chalmers I. Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Medicine. 2010;7(9):e1000326.
  2. Mulrow CD. Rationale for systematic reviews. BMJ. 1994;309(6954):597–9.
  3. Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008.
  4. Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA. Cochrane Handbook for Systematic Reviews of Interventions version 6.1 (updated September 2020). Cochrane. 2020.
  5. Ciulla TA, Amador AG, and Zinman B. Diabetic retinopathy and diabetic macular edema: pathophysiology, screening, and novel therapies. Diabetes Care. 2003;26(9):2653–64.
  6. Fenwick EK, Xie J, Ratcliffe J, Pesudovs K, Finger RP, Wong TY, et al. The Impact of Diabetic Retinopathy and Diabetic Macular Edema on Health-Related Quality of Life in Type 1 and Type 2 Diabetes. Investigative Ophthalmology & Visual Science. 2012;53(2):677–84.
  7. Hariprasad SM, Mieler WF, Grassi M, Green JL, Jager RD, and Miller L. Vision-related quality of life in patients with diabetic macular oedema. British Journal of Ophthalmology. 2008;92(1):89–92.
  8. Chen E, Looman M, Laouri M, Gallagher M, Van Nuys K, Lakdawalla D, et al. Burden of illness of diabetic macular edema: literature review. Current Medical Research & Opinion. 2010;26(7):1587.
  9. P.H. S, M.L. M, C. B, E. J, P. H, and S. K. Reported symptoms and quality-of-life impacts in patients having laser treatment for sight-threatening diabetic retinopathy. Diabetic Medicine. 2006(No.1):60–6.
  10. Jain A, Varshney N, and Smith C. The Evolving Treatment Options for Diabetic Macular Edema. Int J Inflam. 2013;2013:689276.
  11. Schmidt-Erfurth U, Garcia-Arumi J, Bandello F, Berg K, Chakravarthy U, Gerendas BS, et al. Guidelines for the Management of Diabetic Macular Edema by the European Society of Retina Specialists (EURETINA). Ophthalmologica. 2017.
  12. Gemmy, CM, Cheung, Young, Hee, Yoon, et al. Diabetic macular oedema: evidence-based treatment recommendations for Asian countries. Clinical & Experimental Ophthalmology. 2017.
  13. Korobelnik JF, Kleijnen J, Lang SH, Birnie R, Leadley RM, Misso K, et al. Systematic review and mixed treatment comparison of intravitreal aflibercept with other therapies for diabetic macular edema (DME). BMC Ophthalmology. 2015;15(1):52.
  14. Stephane R, William M, Felicity A, Jonathan W, Vladimir B, and Andreas W. Efficacy of Anti-VEGF and Laser Photocoagulation in the Treatment of Visual Impairment due to Diabetic Macular Edema: A Systematic Review and Network Meta-Analysis. PLoS One. 2014;9(7):e102309.
  15. Lu Z, Wen W, Yan G, Jie L, Xie L, and Alan S. The Efficacy and Safety of Current Treatments in Diabetic Macular Edema: A Systematic Review and Network Meta-Analysis. PLoS One. 2016;11(7):e0159553.
  16. Muston D, Korobelnik JF, Reason T, Hawkins N, Chatzitheofilou I, Ryan F, et al. An efficacy comparison of anti-vascular growth factor agents and laser photocoagulation in diabetic macular edema: a network meta-analysis incorporating individual patient-level data. BMC Ophthalmology. 2018;18(1).
  17. Virgili G, Parravano M, Evans JR, Gordon I, and Lucenteforte E. Anti-vascular endothelial growth factor for diabetic macular oedema: a network meta-analysis. Cochrane Database of Systematic Reviews. 2017;6(6):CD007419.
  18. Elman MJ, Aiello LP, Beck RW, Bressler NM, and Sun JK. Randomized Trial Evaluating Ranibizumab Plus Prompt or Deferred Laser or Triamcinolone Plus Prompt Laser for Diabetic Macular Edema. Ophthalmology. 2010;117(6):1064-77.e35.
  19. Mitchell P, Bandello F, Schmidt-Erfurth U, Lang GE, Massin P, Schlingemann RO, et al. The RESTORE study: ranibizumab monotherapy or combined with laser versus laser monotherapy for diabetic macular edema. Ophthalmology. 2011;118(4):615–25.
  20. Korobelnik JF, Do DV, Schmidt-Erfurth U, Boyer DS, Holz FG, Heier JS, et al. Intravitreal aflibercept for diabetic macular edema. Ophthalmology. 2014;121(11):2247–54.
  21. Massin P, Bandello F, Garweg JG, Hansen LL, Harding SP, et al. Safety and efficacy of ranibizumab in diabetic macular edema (RESOLVE Study): a 12-month, randomized, controlled, double-masked, multicenter phase II study. Diabetes Care. 2010;33:2399–405.
  22. Googe J, Brucker AJ, Bressler NM, Qin H, Aiello LP, Antoszyk A, et al. Randomized trial evaluating short-term effects of intravitreal ranibizumab or triamcinolone acetonide on macular edema after focal/grid laser for diabetic macular edema in eyes also receiving panretinal photocoagulation. Retina (Philadelphia, Pa). 2011(No.6):1009–27.
  23. Do DV, Quan DN, Boyer D, Schmidt-Erfurth U, and Heier JS. One-year outcomes of the DA VINCI study of VEGF trap-eye in eyes with diabetic macular edema. Ophthalmology. 2012;119(8):1658–65.
  24. Ishibashi T, Li X, Koh A, Lai TY, Lee FL, Lee WK, et al. The REVEAL Study: Ranibizumab Monotherapy or Combined with Laser versus Laser Monotherapy in Asian Patients with Diabetic Macular Edema. Ophthalmology. 2015;122(7):1402–15.
  25. ClinicalTrials.gov. Safety, efficacy and cost-efficacy of ranibizumab (monotherapy or combination with laser) in the treatment of diabetic macular edema (DME) (RESPOND). NCT01135914. https://www.clinicaltrials.gov/ct2/show/NCT01135914?term=RESPOND&cond=DME&rank=1. 7 May 2018.
  26. Nguyen QD, Shah SM, Heier JS, Do DV, Lim J, Boyer D, et al. Primary End Point (Six Months) Results of the Ranibizumab for Edema of the mAcula in diabetes (READ-2) study. Ophthalmology. 2009;116(11):2175-81.e1.
  27. Salanti G, Ades AE, and Ioannidis JP. Graphical methods and numerical summaries for presenting results from multiple-treatment meta-analysis: an overview and tutorial. J Clin Epidemiol. 2011;64(2):163–71.
  28. Critical Appraisal Checklist For A Systematic Review. https://www.gla.ac.uk/media/Media_64047_smxx.PDF. Accessed 20 March 2021.
  29. Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, Brozek J, et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. Journal of Clinical Epidemiology. 2011;64(4):383–94.