We performed a methodological review of the reporting of safety results, focusing on immune-related serious adverse events (irSAEs), in publications and registries for all currently FDA-approved ICIs (Appendix 1): anti-CTLA-4 (ipilimumab), anti-PD-1 (nivolumab, pembrolizumab and cemiplimab) and anti-PD-L1 (atezolizumab, avelumab and durvalumab).
2.1 Terminology and definitions
A complete and detailed list of the terms and definitions used in this study is provided in Appendix 2: structural hierarchy of adverse events, severity of adverse events, seriousness of adverse events, immune-related adverse events (irAEs) and immune-related serious adverse events (irSAEs).
2.2 Search for publications
A search of MEDLINE via PubMed was conducted to identify all randomized controlled trials (RCTs) assessing currently FDA-approved ICIs (Appendix 1). The search algorithm included keywords and free-text words for immune checkpoint inhibitors or blockers (anti-CTLA-4, anti-PD-1, anti-PD-L1) and the drug names of currently FDA-approved ICIs, and applied the Cochrane filter (sensitivity- and specificity-maximizing version) to identify RCTs (Appendix 3).
2.3 Eligibility criteria
Phase III RCTs of any FDA-approved ICI used in cancer treatment that were published in English before March 2019 were included in this study. Phase I, II or IV trials, duplicates, abstracts of conference proceedings, case reports / series, editorials, commentaries, expert opinions, letters, narrative reviews, secondary reports, retrospective analyses, systematic reviews, meta-analyses and non-English publications were excluded.
2.4 Selection process
All references were evaluated for eligibility by one of the authors (ZK); publications of doubtful eligibility were evaluated and approved by a second author (AD). Screening followed a two-step process: (1) title/abstract screening using Rayyan and (2) full-text screening.
2.5 Search for corresponding registration on ClinicalTrials.gov
For each selected published trial, ClinicalTrials.gov was searched for the corresponding RCT using the NCT number when provided in the publication. Had the registration number not been reported (which was not the case for any of the eligible trials), we planned to search for the trial acronym or key elements of the trial to identify the registration. According to the Food and Drug Administration Amendments Act of 2007 (FDAAA 801), the results of applicable clinical trials (trials with at least one site in the US) must be submitted within 12 months of the primary completion date. We therefore evaluated whether results were posted within 1 year for the trials subject to this requirement.
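As an illustration only, the 12-month check described above could be sketched in Python as follows. The function and parameter names are hypothetical, the choice of which registry date counts as "results posted" is an assumption, and treating the deadline day itself as on time is also an assumption; none of these details is specified in the text.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping to the month end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def results_posted_on_time(
    primary_completion: date,
    results_posted: Optional[date],
    has_us_site: bool,
    window_months: int = 12,
) -> Optional[bool]:
    """Return None when the trial is not subject to the requirement,
    otherwise whether results were posted within the window.

    Assumption: "at least one US site" is used as the sole applicability
    criterion, mirroring the simplified definition given in the text.
    """
    if not has_us_site:
        return None
    if results_posted is None:
        # No results posted at all counts as not on time.
        return False
    # Assumption: posting on the deadline day itself is on time.
    return results_posted <= add_months(primary_completion, window_months)
```

A trial without a US site returns `None` rather than `False`, so that trials outside the scope of the requirement are not counted as late.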
2.6 Data extraction
A structured data extraction form in Excel was used for each trial. Extraction was carried out in duplicate (ZK and SM), with any disagreements resolved through discussion and consensus. The following information was collected from the publication and ClinicalTrials.gov:
2.6.1 From the published report:
- Publication characteristics: title, first author, date of online publication, journal name, type of journal (specialty or general medical), funding source and whether the ClinicalTrials.gov NCT number was reported.
- Medical indications and interventions: type of cancer, stage, ICI medication administered, whether ICI was given as monotherapy or combination therapy and the treatment duration.
- Trial characteristics: study design, blinding (open label, single or double blind), countries where the trial was conducted, primary outcome (overall survival, progression-free survival, or other outcome), start and end dates of recruitment, sample size and planned follow-up duration.
2.6.2 From the registry results:
- Registration information, trial start and primary completion dates, primary sponsor (pharmaceutical company, academic institution or other).
2.6.3 From both sources:
- Evaluation of the reporting of safety: We evaluated the reporting of overall safety, and of irAEs and irSAEs, in the text, tables and figures, as well as in supplementary information (if any), using the following items based on the CONSORT Extension for Reporting Harms and safety guidelines / recommendations for reporting AEs in oncology [18–20]:
Evaluation of overall / general safety information
- Population of analysis: we evaluated whether safety was analyzed in all randomly assigned participants (intention-to-treat) or in a defined safety population (e.g., as-treated population) and we collected the number of participants analyzed in each treatment arm
- Use of a validated instrument for coding and grading AEs (MedDRA, CTCAE, etc.)
- Reporting of:
  - a frequency threshold for AEs and SAEs (reporting of all AEs or only those occurring with a sufficient frequency)
  - the overall rate of AEs
  - the overall rate of SAEs
  - treatment-related adverse events (trAEs)
  - serious treatment-related adverse events (trSAEs)
  - withdrawals from treatment due to AEs and trAEs
  - deaths due to AEs and trAEs
Evaluation of specific safety information: irAEs and irSAEs
- Terminology used: how irAEs were referred to (“select trAEs”, “AEs of interest”, “immune AEs”, “immune-related AEs” or “immune-mediated AEs”).
- Reporting of:
  - a definition for irAEs and irSAEs
  - whether and how the investigators distinguished irAEs from other trAEs
  - an overall rate for irAEs and irSAEs
  - a frequency threshold for irAEs and irSAEs (reporting of all AEs or only those occurring with a sufficient frequency)
  - the structural hierarchy used to describe irAEs: MedDRA System Organ Class (SOC), which is more general (e.g., skin, gastrointestinal), Preferred Terms (PTs), which are more specific (e.g., rash, colitis), or any other level used for reporting irAEs
  - the severity of irAEs according to the NCI-CTCAE Grading Classification
Of note, AEs were considered immune-related only when clearly indicated as such by the authors. In other words, trAEs similar to irAEs (e.g., pneumonitis or colitis) that were not reported as having an underlying immune etiology were not considered in the assessment. Definitions for key terms are reported in Appendix 2.
2.7 Concordance of key safety data between publications and registry results
For each trial, general safety parameters (listed above) as well as the incidence of irAEs and irSAEs were compared between the publication and results posted on ClinicalTrials.gov. When this was not possible, the reason (unreported value, inconsistent reporting format, etc.) was noted.
We first compared the overall incidence of each safety parameter between the two sources for each arm using the following approach (illustrated graphically in Appendix 4):
(1) Concordant: when the reported values matched between the two sources for all treatment arms
(2) Partially concordant: when the value matched for one / some arm(s), but not all arms of the trial
(3) Discordant: if none of the reported values in the treatment arms matched between the two sources
(4) Not assessable or comparable: if the value was not reported in one or both sources (in which case the source(s) missing the value were noted), or if the values were not presented in the same format (precluding a direct concordance assessment)
The reported frequencies from the two sources were marked as a match if the rounded percentages were within ±1% of one another.
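The four categories and the matching rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, percentages are assumed to be rounded half-up to whole numbers, "±1%" is read as an absolute difference of one percentage point, and a value missing in any arm is assumed to make the whole parameter not assessable.

```python
import math
from typing import Dict, Optional

# Percentage reported for each trial arm; None means the value was not reported.
Rates = Dict[str, Optional[float]]


def round_half_up(pct: float) -> int:
    """Round a percentage to the nearest whole number (assumption: half-up)."""
    return math.floor(pct + 0.5)


def values_match(pub_pct: float, reg_pct: float, tolerance: int = 1) -> bool:
    """Match when the rounded percentages differ by at most `tolerance` points."""
    return abs(round_half_up(pub_pct) - round_half_up(reg_pct)) <= tolerance


def classify_concordance(publication: Rates, registry: Rates) -> str:
    """Classify one safety parameter across all arms of a trial."""
    # Different arm labels stand in for "not presented in the same format".
    if not publication or set(publication) != set(registry):
        return "not assessable"
    # Assumption: a missing value in any arm blocks the comparison.
    all_values = list(publication.values()) + list(registry.values())
    if any(value is None for value in all_values):
        return "not assessable"
    matches = [values_match(publication[arm], registry[arm]) for arm in publication]
    if all(matches):
        return "concordant"
    if any(matches):
        return "partially concordant"
    return "discordant"
```

For example, `classify_concordance({"ICI": 12.0, "control": 30.0}, {"ICI": 12.0, "control": 35.0})` falls in the partially concordant category, because only one of the two arms matches.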
After comparing general safety information, we compared the incidence of specific types of irSAEs (e.g., pneumonitis, colitis, rash) between the two sources in each trial arm. The same approach was used to assess concordance between the two sources regarding irSAEs (Appendix 4).
When there were several publications for a given trial, only the article with the publication date closest to the date on which the trial results were posted on ClinicalTrials.gov was considered. This ensured that detected differences did not result from comparing newer results posted on ClinicalTrials.gov with older published information. In addition, if the investigators of a trial had published efficacy and safety outcomes in separate articles, the publication reporting safety results was selected for the purposes of our study.
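The selection rule can be sketched in Python as below. The `Publication` class and its fields are hypothetical, and the order in which the two rules are combined (articles reporting safety results are preferred first, then the one closest in date to the registry posting) is an assumption, since the text does not state how the two criteria interact.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Publication:
    label: str            # hypothetical identifier for the article
    published: date       # date of online publication
    reports_safety: bool  # whether the article reports safety results


def select_publication(pubs: List[Publication], results_posted: date) -> Publication:
    """Pick the article to compare against the registry results."""
    if not pubs:
        raise ValueError("at least one publication is required")
    # Assumption: articles reporting safety results take priority when any exist.
    candidates = [p for p in pubs if p.reports_safety] or pubs
    # Closest publication date to the registry posting date, before or after.
    return min(candidates, key=lambda p: abs((p.published - results_posted).days))
```

Distance is measured in absolute days, so an article published shortly after the registry posting is treated the same as one published shortly before it.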
2.8 Statistical analysis
Data analysis was descriptive. Frequencies and proportions are reported for categorical data, and medians and interquartile ranges for continuous data. Statistical analysis was performed using R software (v3.3.1).
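The analysis itself was performed in R; the short Python sketch below only illustrates the two kinds of summary described. The function names are hypothetical, and the quantile definition (the `inclusive` method of `statistics.quantiles`) is an assumption, because the text does not state which definition was used.

```python
from collections import Counter
from statistics import median, quantiles
from typing import Dict, Hashable, Sequence, Tuple


def summarize_categorical(values: Sequence[Hashable]) -> Dict[Hashable, Tuple[int, float]]:
    """Return the frequency and proportion of each category."""
    if not values:
        raise ValueError("no values to summarize")
    total = len(values)
    return {category: (n, n / total) for category, n in Counter(values).items()}


def summarize_continuous(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return the median with the first and third quartiles (the IQR bounds)."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    # Assumption: quartiles by linear interpolation over the observed data.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3
```

For instance, `summarize_categorical` applied to the blinding status of each trial would give the number and share of open-label, single-blind and double-blind trials.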