Study cohort
Overall, the study cohort of our project comprises the following sub-cohorts:
- Public Germany (IITs with German contribution)
- Public Germany gov (reference sub-cohort: IITs funded by the governmental organizations DFG and BMBF within their clinical trials programs)
- Public Germany other (IITs funded by other non-commercial organizations or funding programs)
- Public International (IITs without German contribution)
- Commercial Germany (ISTs with German contribution)
- Commercial International (ISTs without German contribution)
Establishing the study cohort
In Germany, two main research funding organizations have supported IITs within dedicated clinical trials funding programs since 2005: the German Research Foundation (DFG) [38] (also funding this project) and the German Federal Ministry of Education and Research (BMBF) [39]. IITs funded within these programs served as the reference sub-cohort, whose study characteristics guided the creation of the comparison sub-cohorts. Between 2005 and the cut-off date of 31 Dec 2016, a total of 77 completed IITs were recorded and available in the databases of DFG and BMBF. For our research project, we focused on 60 trials (27 funded by the DFG and 33 by the BMBF) that met the following criteria:
- Therapeutic randomized controlled trial
- Interventional
- Multicenter
- Confirmatory
- Year of study application or study start: 2005 or later
- Study completion up to the cut-off date 31 Dec 2016
These characteristics were used as eligibility criteria for the creation of the comparison sub-cohorts.
Furthermore, we aimed to create sub-cohorts that did not differ substantially from each other concerning the sample size. Therefore, we limited the trials of the other sub-cohorts to the maximum number of participants of the reference sub-cohort, which was 4005.
The study information was taken from the funder websites and study registries.
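The eligibility rules above can be expressed as a simple filter. The following is a minimal sketch, not the project's actual procedure: the `Trial` record and its field names are hypothetical, the "application or start year" criterion is simplified to a single `start_year` field, and the example trials are invented. In the text, the participant cap applies to the comparison sub-cohorts.

```python
from dataclasses import dataclass
from datetime import date

CUTOFF = date(2016, 12, 31)   # cut-off date stated in the text
MAX_PARTICIPANTS = 4005       # largest trial in the reference sub-cohort


@dataclass
class Trial:
    # Hypothetical record layout; field names are illustrative only.
    trial_id: str
    therapeutic_rct: bool
    interventional: bool
    multicenter: bool
    confirmatory: bool
    start_year: int            # simplification of "application or start year"
    completion_date: date
    participants: int


def is_eligible(t: Trial) -> bool:
    """Return True if the trial meets every listed criterion and the size cap."""
    return (
        t.therapeutic_rct
        and t.interventional
        and t.multicenter
        and t.confirmatory
        and t.start_year >= 2005
        and t.completion_date <= CUTOFF
        and t.participants <= MAX_PARTICIPANTS
    )


# Invented example records.
trials = [
    Trial("T1", True, True, True, True, 2007, date(2012, 5, 1), 350),
    Trial("T2", True, True, True, True, 2004, date(2010, 3, 1), 120),    # started too early
    Trial("T3", True, True, True, True, 2010, date(2015, 9, 30), 9000),  # exceeds size cap
]
eligible_ids = [t.trial_id for t in trials if is_eligible(t)]
```

Only `T1` passes all checks in this invented example.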
Creation of the sub-cohorts
To achieve a sufficient sample size of completed IITs with at least one study site in Germany, we complemented the 60 trials (Public Germany gov) retrieved from the DFG database German Project Information System (GEPRIS) [38] and the BMBF website [39] with an equal number of IITs funded by other German non-commercial organizations (Public Germany other), giving a total of 120 trials (Public Germany) (Table 1).
The German Clinical Trials Register (DRKS) is an approved Primary Register in the WHO Registry network [40] and the central portal for information on clinical research in Germany [41]. It provides a complete and up-to-date overview of trials conducted in Germany. Therefore, we used the DRKS as the primary source for the German sub-cohorts Public Germany and Commercial Germany. We considered all eligible trials that were included in the DRKS and supplemented both sub-cohorts with trials drawn from ClinicalTrials.gov, a study registry that includes clinical trials conducted all over the world (210 countries) [42].
The trials for the two international sub-cohorts without German contribution (Public International and Commercial International) were all taken from ClinicalTrials.gov. For both sub-cohorts we included 200 trials each (please refer to “Sample size and statistical analysis”).
Table 1: Study cohort. For search strategies, please refer to Additional file 1.
| Sub-cohort | Source | Number of trials |
|---|---|---|
| IITs | | |
| Public Germany | | 120 |
| Public Germany gov | DFG/GEPRIS (n=27), BMBF website (n=33) | 60 |
| Public Germany other | DRKS (n=47), ClinicalTrials.gov (n=13) | 60 |
| Public International | ClinicalTrials.gov | 200 |
| ISTs | | |
| Commercial Germany* | DRKS (n=42), ClinicalTrials.gov (n=158) | 171 |
| Commercial International | ClinicalTrials.gov | 200 |
* Due to an insufficient number of non-drug ISTs in the registries searched, we could only include 171 trials in the sub-cohort Commercial Germany (please refer to section “Balancing process” and Table 3).
Balancing of the sub-cohorts regarding study phase and study site location
Our study cohort is not a random sample of a defined population of studies but rather a compilation of sub-cohorts that are similar to the reference sub-cohort Public Germany with respect to important characteristics. Therefore, we decided to account by design for two study characteristics that are probably associated with the impact measures considered: study phase and proportion of German study sites. We preferred to control for these characteristics by balanced design (also referred to as frequency matching) and not only by analysis.
The development process of a new drug normally goes through four study phases (Table 2). After passing phase 3, the drug is usually approved by a regulatory authority and, if successful, can then be used for health care of the general population. Phase 4 post-approval studies can follow. Therefore, it is evident that the probability for drug trials having an impact on medical practice changes with the study phase of the trial.
To prevent bias that might arise from systematic differences in study phase between the sub-cohorts, we balanced the three sub-cohorts Public International, Commercial Germany and Commercial International on the basis of the proportion of each study phase, separately for drug trials and non-drug trials (Table 2). Little is known about the influence of the study site location on research impact. Most (77 %) of the IITs included in the sub-cohort Public Germany were national trials, i.e. all participating study sites were located in Germany, but some of the trials (23 %) had one or more study sites outside Germany. To address this possibly biasing factor, we balanced the other comparison sub-cohort with German contribution, Commercial Germany, for this factor, i.e. the proportion of German study sites among all study sites.
Balancing process
For each of the comparison sub-cohorts Commercial Germany, Public International and Commercial International, we selected all trials fulfilling the eligibility criteria from the trial registries and downloaded them into an Excel database. The search strategies used to identify the trials in the registries are shown in the supplemental material for each sub-cohort (Additional file 1).
For each trial studying a drug or biological product, we determined the study phase according to the U.S. National Library of Medicine [43] classification scheme (phase 1-4). If the study phase was reported in the study registry, we verified and used that information. If it was not reported, we determined the study phase ourselves according to the classification scheme, on the basis of the information available in the registries (Table 2).
For non-drug trials, a similar classification scheme is not commonly used. To be able to consider the development and implementation phase for non-drug interventions as well, we applied the same classification criteria as for drug trials and classified them as S, A, B, or C trials (Table 2).
For all trials of German ISTs (Commercial Germany), we calculated the proportion of German study sites.
Table 2: Study phase classification scheme for drug trials and non-drug trials
| Phase of drug trial/non-drug trial | Classification criteria |
|---|---|
| 1/S | Safety study. Question: “Is the therapy safe?” The trial focuses on the safety of a drug/therapy. The aim is to determine a safe dose range as well as the most common and serious adverse events associated with the drug/therapy. It is conducted with a small number of healthy participants. |
| 2/A | Pilot, feasibility, tolerability study. Question: “Is there a therapy effect?” The trial is explicitly defined as a pilot study or feasibility study, or it can be assumed from the description that the therapy is either new or has never been investigated with regard to a specific outcome. The clinical trial collects initial data on drug/treatment efficacy, i.e. whether or not a drug/treatment works in a specific study population, while continuing to monitor drug safety as well as short-term adverse events. |
| 3/B | Efficacy study. Question: “How large is the therapy effect?” or “Is the effect larger than the effect of other therapies?” Investigation and comparison of efficacy and safety under controlled conditions. The drug/therapy has already been tested, but more information is needed to establish the therapy. The clinical trial delves deeper into the safety and efficacy of a drug/treatment using different study populations, drug/treatment dosages, and combinations with other established drugs/treatments. |
| 4/C | Effectiveness study. Question: “How can the effect be improved?” Effectiveness and safety under real-life conditions. The drug/therapy is approved for marketing/established, but needs to be optimized, implemented in practice and evaluated over a longer time period under routine conditions. Additional information on the safety, efficacy and/or optimal use of a drug/therapy is collected. |
To obtain comparable sub-cohorts, we used stratified randomization. For each sub-cohort, we sorted both drug trials and non-drug trials by study phase. For the German ISTs, we used the proportion of German study sites as a secondary sorting parameter within each study phase. All trials of the same study phase (for German ISTs, also of the same study site proportion) were then numbered consecutively. On the basis of the percentages of each study phase (and study site proportion for German ISTs) in the sub-cohort Public Germany, we calculated the number of trials needed per study phase and study site proportion for the comparison sub-cohorts. Then, for each sub-cohort, we selected the required number of trials for each study phase/study site proportion by using a random number generator. Duplicates were excluded and replacement trials were drawn at random. Due to an insufficient number of non-drug ISTs in the registries, we considered all 78 identified eligible non-drug trials for inclusion in the sub-cohort Commercial Germany (Tables 1 and 3).
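The balancing step can be illustrated with a short sketch of frequency matching: stratum quotas are derived from the reference distribution, and trials are then drawn at random within each stratum. This is an illustration under stated assumptions, not the procedure actually run in Excel. The function names, the largest-remainder rounding rule, the fixed seed and the example data are all hypothetical. For German ISTs, a stratum could be a (phase, site-proportion category) pair instead of the phase alone.

```python
import random
from collections import Counter, defaultdict


def target_counts(reference_strata, n_target):
    """Allocate n_target trials across strata in proportion to the reference.

    Largest-remainder rounding makes the quotas sum exactly to n_target.
    """
    freq = Counter(reference_strata)
    total = sum(freq.values())
    exact = {s: n_target * c / total for s, c in freq.items()}
    counts = {s: int(x) for s, x in exact.items()}
    shortfall = n_target - sum(counts.values())
    # Give the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - counts[s], reverse=True)
    for s in by_remainder[:shortfall]:
        counts[s] += 1
    return counts


def balanced_sample(candidates, counts, seed=0):
    """Draw the required number of trials per stratum without replacement.

    candidates: iterable of (trial_id, stratum) pairs.
    If a stratum has fewer candidates than its quota, all of them are taken.
    """
    rng = random.Random(seed)
    pools = defaultdict(set)          # sets drop duplicate trial IDs
    for trial_id, stratum in candidates:
        pools[stratum].add(trial_id)
    selected = []
    for stratum, quota in counts.items():
        pool = sorted(pools[stratum])
        selected.extend(rng.sample(pool, min(quota, len(pool))))
    return selected


# Invented example: reference distribution over phases, then a draw of 8 trials.
reference = ["2/A"] * 3 + ["3/B"] * 6 + ["4/C"] * 3
quotas = target_counts(reference, 8)
phases = ["2/A"] * 5 + ["3/B"] * 10 + ["4/C"] * 1
candidates = [(f"C{i}", phase) for i, phase in enumerate(phases)]
chosen = balanced_sample(candidates, quotas)
```

In this invented example the `4/C` stratum has only one candidate, so the draw falls one trial short of the target of eight. This is the same situation the text describes for non-drug ISTs in Commercial Germany.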
Data extraction
Study characteristics extracted
For each included trial, we determined or extracted the following pre-defined study characteristics from the trial registries:
- Study title and acronym
- Start date of study (enrollment)
- Date of study completion
- Type of intervention (drug, surgery/procedure/medical device/manual therapy, behavioral, or other [e.g. biological agents, bone marrow cells, etc.])
- Medical field (according to the slightly modified version of the medical fields specified in the “(Model) Specialty Training Regulations 2003” of the German Medical Association [44])
- Number of participants (sample size)
- Number of primary outcomes
- Sponsor/Funding sources (commercial/non-commercial)
- Results reported in study register (yes/no)
- Publication references reported/linked to study register (yes/no)
- Other/secondary study register ID numbers, e.g. Eudra-CT [45], ISRCTN [46]
For trials with missing trial characteristics in DRKS or ClinicalTrials.gov, we also considered information reported in secondary study registries. For trials included in the Public Germany gov sub-cohort we also considered the basic study information available in the DFG and BMBF databases.
For further information on extracted study characteristics, please refer to Additional file 2.
Piloting of the data extraction process
A manual describing the definitions for the data to be extracted was developed, i.e. for each variable it specified which data were to be extracted and how. Following these detailed data extraction instructions, members of the research team (AB, AI, KW, LR, SB, SL) independently double-extracted study data into the project database (MS Access 2010). The researchers were trained, and data extraction was piloted on a test data set of 30 trials for which all researchers performed data extraction independently. We compared the results and, where necessary, discussed, edited and complemented the instructions before proceeding with the actual data extraction. Any discrepancies or disagreements were resolved through discussion or by consulting a third researcher until consensus was reached.
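Comparing two independent extractions amounts to listing every variable on which the extractors disagree. A minimal sketch follows, assuming each extraction is held as a dictionary keyed by trial ID. This layout, the variable names and the example values are hypothetical and do not reflect the project's Access database.

```python
def find_discrepancies(extraction_a, extraction_b):
    """Compare two independent extractions of the same trials.

    Each argument maps trial_id -> {variable: value}. Returns a sorted list of
    (trial_id, variable, value_a, value_b) tuples, one per mismatch. Values
    recorded by only one extractor appear with None on the other side.
    """
    mismatches = []
    for trial_id in sorted(set(extraction_a) | set(extraction_b)):
        rec_a = extraction_a.get(trial_id, {})
        rec_b = extraction_b.get(trial_id, {})
        for variable in sorted(set(rec_a) | set(rec_b)):
            a, b = rec_a.get(variable), rec_b.get(variable)
            if a != b:
                mismatches.append((trial_id, variable, a, b))
    return mismatches


# Invented example data for two extractors.
first = {
    "T1": {"sample_size": 350, "primary_outcomes": 1},
    "T2": {"sample_size": 120, "primary_outcomes": 2},
}
second = {
    "T1": {"sample_size": 350, "primary_outcomes": 1},
    "T2": {"sample_size": 210, "primary_outcomes": 2},
}
to_discuss = find_discrepancies(first, second)
```

The resulting list is what the extractors would take into the consensus discussion.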
Assessing research impact
We examined research impact by assessing the proportion of trials that were published as well as the citation rate of their publication(s). In particular, we were interested in the proportion of trials and publications, respectively, cited by a systematic review or meta-analysis or a clinical guideline (Figure 1).
Figure 1: Research translation from trial results to clinical implementation over time. The figure is based on the research impact assessment concepts of Sarli et al. [34] and was adapted for this project.
Identifying primary research articles
For each included trial, we searched for corresponding articles included in biomedical databases to assess the proportion of conducted research that has been published.
Citations in registries
We examined whether a publication or its reference is directly attached or linked to the registry entry and whether trial results are reported in the study register.
Publications in bibliographic databases
Based on extracted data and keywords derived from the trials, we systematically searched in the following electronic databases for publications that correspond to the included trials:
- Study registries (DRKS, ClinicalTrials.gov, ISRCTN, EU Clinical Trials Register)
- Medline (via PubMed) [47]
- Cochrane Central Register of Controlled Trials (CENTRAL) [48]
- LIVIVO (interdisciplinary search engine for life sciences literature) [49]
- Web of Science (WoS) [50]
- Google Scholar [51]
- Google [52]
- Study website
- PubMed tools “Similar articles” and “Cited by”
For each trial, the search was conducted in the following order and with the following search terms: 1. Register Identifier (NCT ID, DRKS ID, etc.[1]); 2. Acronym; 3. Name of applicant/investigator(s); 4. Study title; 5. Study methods/PICO (Population, Intervention, Comparison, Outcome) components [53]; 6. Funding number.
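The fixed search order can be captured as an ordered list of fields from which search terms are generated one by one. The sketch below is illustrative only: the field names and the example trial record are hypothetical, and no database is actually queried.

```python
# Fields in the order given in the text; the names are hypothetical.
SEARCH_ORDER = [
    "register_id",     # 1. register identifier (NCT ID, DRKS ID, etc.)
    "acronym",         # 2. acronym
    "investigator",    # 3. name of applicant/investigator(s)
    "title",           # 4. study title
    "pico",            # 5. study methods/PICO components
    "funding_number",  # 6. funding number
]


def search_terms(trial):
    """Yield (step, field, term) in the prescribed order, skipping empty fields."""
    for step, field in enumerate(SEARCH_ORDER, start=1):
        term = (trial.get(field) or "").strip()
        if term:
            yield step, field, term


# Invented example: a trial with an identifier and a title but no acronym.
example_trial = {"register_id": "NCT01234567", "acronym": "", "title": "Example trial"}
steps = list(search_terms(example_trial))
```

Each yielded term would then be entered into the databases listed above until the corresponding publications are found.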
References of publications that corresponded to the trial were downloaded into a reference management database (EndNote). The full text of the article was retrieved, e.g. by the departmental librarian, and attached to the corresponding reference. If we were unable to decide on the eligibility of an article based on the database entry, we also retrieved the full-text article for further evaluation and decision. We only considered full publications, i.e. articles published in a peer-reviewed scientific journal that contain at least some information on the study’s objectives, methods and/or results.
Identifying secondary research articles
Cited by reviews
We downloaded the bibliographic citations, including the digital object identifier (DOI), of all references citing the publication from the databases Medline (via PubMed) [47] and WoS [50] by means of the “Cited by” function (PubMed/Medline) and the “Times cited” function (WoS). This was done automatically by a program developed by one of the authors (KN). To determine which of the articles citing the publication is a systematic review or meta-analysis, we used Epistemonikos, a multi-collaborative database of health research evidence and the largest source of systematic reviews and other types of scientific evidence [54]. Its primary aim is to identify all systematic reviews relevant for health decision-making by regularly screening multiple electronic databases and other sources, including the Cochrane Database of Systematic Reviews (CDSR), PubMed, Excerpta Medica database (EMBASE), Cumulative Index to Nursing and Allied Health Literature (CINAHL), Psychological Information (PsycINFO) database, Latin American and Caribbean Health Sciences Literature (LILACS), the Campbell Collaboration Online Library, the Joanna Briggs Institute (JBI) Database of Systematic Reviews and Implementation Reports, and the Evidence for Policy and Practice Information and Co-ordinating Centre (EPPI-Centre) Evidence Library [55-62]. In Epistemonikos, potentially eligible articles are classified by a machine-learning algorithm and then checked by a network of human collaborators. Apart from systematic reviews, Epistemonikos also includes broad syntheses, i.e. summaries of systematic reviews [63].
We consider comparing the citing references with the content of Epistemonikos a reliable method to determine the publication type and also deem it suitable for publications that are not indexed with a publication type, e.g. because they are not included in Medline.
We matched the DOI of each downloaded citing reference with the record DOIs included in Epistemonikos. For publications without a DOI, we matched the publication title. For this purpose, Epistemonikos provided on request a master list of all records (as of 28 June 2019), containing for each record the bibliographic citation information, DOI, journal title, publication year, PubMed identifier (PMID)/Cochrane ID, and the Epistemonikos ID and classification type (broad synthesis or systematic review). The matching process was done automatically by a program written by one of the authors (KN) in the Python programming language [64]. The references of all identified matching pairs were entered into the project Access database and linked to the reference of the “parent” publication.
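The matching logic described here (DOI first, title as a fallback for references without a DOI) can be sketched as follows. This is not the authors' program. The record layout, the normalization rules and the example entries are assumptions made for illustration, and the DOIs shown are placeholders.

```python
import re


def normalize_doi(doi):
    """Lower-case a DOI and strip a leading resolver URL or 'doi:' prefix."""
    if not doi:
        return None
    doi = doi.strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)
    return doi or None


def normalize_title(title):
    """Reduce a title to lower-case alphanumeric words for comparison."""
    return " ".join(re.findall(r"[a-z0-9]+", (title or "").lower()))


def match_citing_references(citing, master):
    """Return (citing_record, master_record) pairs.

    A citing record with a DOI is matched by DOI only; one without a DOI is
    matched by normalized title.
    """
    by_doi, by_title = {}, {}
    for rec in master:
        doi = normalize_doi(rec.get("doi"))
        if doi:
            by_doi[doi] = rec
        title = normalize_title(rec.get("title"))
        if title:
            by_title[title] = rec
    pairs = []
    for ref in citing:
        doi = normalize_doi(ref.get("doi"))
        if doi:
            hit = by_doi.get(doi)
        else:
            hit = by_title.get(normalize_title(ref.get("title")))
        if hit is not None:
            pairs.append((ref, hit))
    return pairs


# Invented master list and citing references with placeholder DOIs.
master = [
    {"doi": "10.1000/example.1", "title": "A systematic review of X", "type": "systematic-review"},
    {"doi": None, "title": "Overview of reviews on Y", "type": "broad-synthesis"},
]
citing = [
    {"doi": "https://doi.org/10.1000/EXAMPLE.1", "title": "A systematic review of X"},
    {"doi": None, "title": "Overview of Reviews on Y."},
    {"doi": "10.1000/example.9", "title": "A primary study"},
]
matched_types = [hit["type"] for _, hit in match_citing_references(citing, master)]
```

Exact title matching after normalization is a deliberately conservative choice. Fuzzy matching would find more pairs but would also require manual checking of false positives.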
For further assessment of the impact of the trial results in clinical guidelines we focused on the reviews identified by this process.
Cited by clinical guidelines
To identify clinical guidelines that include results deriving from our trial cohort, we manually searched the following three guideline databases: the search portal for German guidelines (AWMF Guidelines) and, for international guidelines, the Turning Research Into Practice (TRIP) database and National Institute for Health and Care Excellence (NICE) evidence search. The guideline database of the Association of the Scientific Medical Societies (AWMF) of Germany contains guidelines and related documents of all member medical specialist societies in Germany [65]. The Trip medical database [66] provides a search engine that enables healthcare professionals to easily search, find and use research evidence (e.g. international guidelines) in practice and/or care. NICE evidence search [67] offers free access to high quality evidence on (public) health, drugs and health technologies, social care, and healthcare management and implementation. It contains consolidated and synthesized evidence from various established sources such as the British National Formulary (BNF), Clinical Knowledge Summaries (CKS), Scottish Intercollegiate Guidelines Network (SIGN), the Cochrane Library, and Royal Colleges [68-71]. A variety of documents can be retrieved from NICE including systematic reviews, guidance, evidence summaries and patient information [72].
We searched for guidelines citing the original publication and/or the systematic review(s) identified by the matching process mentioned above. The search was performed by using (parts of) the article title, name of first author, intervention, and disease.
We also searched for the register identifier of the trials to identify guidelines citing study information or results included in the trial registers.
We complemented the manual search with an automatic search tool programmed by KN (please refer to “Methods”, “Sub-study”).
Characteristics of primary research articles
The following information on the publication characteristics of an original article was extracted:
- Reference information (author, title, journal, volume, issue, pages)
- Type of publication: protocols, method papers, or result articles
- Date of publication (electronic version)
- Date of publication (print version)
- DOI
- Type of research article
- Country of first author
- Free full-text article availability (open/closed access)
- Free PubMed Central (PMC) article availability (yes/no)
- Distribution rights (creative commons license)
- Search term(s) by which publication was found
- Database(s) where publication was found
- Study registry identifier as reported in database and/or article
- Language of article
Characteristics of secondary research articles
Systematic reviews and meta-analyses
We determined and extracted the following characteristics of secondary research articles:
- Reference information (author, title, journal, volume, issue, pages)
- Date of publication (electronic version)
- Date of publication (print version)
- DOI
- Type of review according to Epistemonikos classification: systematic review or broad synthesis
- Context of publication citation: whether the publication is cited in general, e.g. in the introduction or discussion section, or study results are included or excluded in the systematic review or meta-analysis
Guidelines
For the retrieved guidelines we extracted the following characteristics:
- Title
- Year of publication
- Guideline identifier (e.g. AWMF register number)
- Database in which the guideline was found: TRIP, AWMF or NICE
- Language of guideline: English, non-English (e.g. German, French, etc.)
- Guideline quality: S1/S2/S3 (only applicable for German AWMF guidelines[2])
Sample size and statistical analysis
With the size of the sub-cohort Public Germany being restricted to n=120 trials, it is possible to estimate the proportion of published trials (primary outcome) with a standard error (SE) of less than 0.05 in this sub-cohort. The intended sample sizes of n=200 trials for the other three sub-cohorts will lead to SEs of about 0.035 for the corresponding estimated proportions in these sub-cohorts. Since the comparison of sub-cohorts with regard to publication proportions will be based on the more informative outcome time to publication, these sample sizes were chosen to achieve a power of over 90% (significance level of 5%) for a hazard ratio of 1.6 (increase of publication hazard) or 0.625 (decrease of publication hazard) assuming an overall publication proportion of 50% over a long follow-up period. There will be no adjustments for the number of comparisons. The time to publication analysis will properly take different follow-up lengths for the individual studies into account. In our planned analysis, we will present Kaplan-Meier plots of time-to-publication for the four sub-cohorts as well as results of Cox regression analyses, considering study characteristics. The intended sample sizes for the study cohorts will provide reasonable power for the detection of moderate to large differences between IITs and ISTs, also for the other endpoints considered.
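The stated standard errors follow from the usual formula for a proportion, SE = sqrt(p(1-p)/n), evaluated at the least favourable value p = 0.5. The sketch below reproduces these figures and adds an approximate power calculation for the time-to-publication comparison. The power part uses Schoenfeld's approximation for the log-rank test with two sub-cohorts of 200 trials each. That choice is an assumption made for illustration, since the text does not state which method was used.

```python
from math import log, sqrt
from statistics import NormalDist


def se_proportion(p, n):
    """Standard error of a proportion p estimated from n trials."""
    return sqrt(p * (1 - p) / n)


def logrank_power(n1, n2, event_fraction, hazard_ratio, alpha=0.05):
    """Approximate two-sided log-rank power via Schoenfeld's formula.

    Assumption: textbook approximation chosen for illustration; it is not
    necessarily the method behind the figures quoted in the text.
    """
    events = (n1 + n2) * event_fraction          # expected number of publications
    share1 = n1 / (n1 + n2)                      # allocation fraction of group 1
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z = sqrt(events * share1 * (1 - share1)) * abs(log(hazard_ratio)) - z_alpha
    return NormalDist().cdf(z)


se_120 = se_proportion(0.5, 120)                     # sub-cohort Public Germany
se_200 = se_proportion(0.5, 200)                     # other sub-cohorts
power_increase = logrank_power(200, 200, 0.5, 1.6)   # increased publication hazard
power_decrease = logrank_power(200, 200, 0.5, 0.625) # decreased publication hazard
```

Under these assumptions the standard errors are about 0.046 and 0.035. The approximate power for two sub-cohorts of 200 trials is slightly above 0.9 for both hazard ratios, because 0.625 is the reciprocal of 1.6.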
Although all trials included in the sub-cohorts met the inclusion criteria and were balanced for study phase (and, for the German IITs and ISTs, for the proportion of German study sites), the sub-cohorts may still be heterogeneous with respect to other factors. This makes a comparison of research impact susceptible to bias. Therefore, we attempt to create comparable groups by a) pre-defining inclusion criteria and b) conducting a propensity score analysis to evaluate additional influencing factors [73-75]. Study characteristics that turn out to influence research impact will be adjusted for in the regression model to address confounding. In addition to the regression analyses, we plan a propensity score analysis as a form of sensitivity analysis, using documented study characteristics that are not controlled for by design, for instance study status, study size, and number of primary outcomes. This approach is intended to minimize possible bias when comparing research impact between the sub-cohorts.
Values will be quantified by means of absolute number, percentage, median and range.
Sub-study: Developing and validating a robust semi-automatic tool for follow-up
We also developed and validated a robust methodological tool that allows trials to be followed up and research impact analyses to be replicated periodically in a semi-automated manner. The tool, called DOIScout, comprises two main features. The first is an automatic search for publications using their study register identifier (e.g. NCT01234567). The second focuses on the impact of the identified publications using the PubMed and WoS citation tracking functions, i.e. how many times a publication has been cited by other articles (PubMed function “Cited by”, WoS function “Times Cited”). Moreover, the tool is designed to automatically search specific guideline databases (AWMF, TRIP, NICE) for guidelines citing the publication. The DOIScout collects the bibliographic information of the identified citations and the sources (databases) in which they were found. The tool also includes several secondary features that facilitate workflows, for example importing PubMed and WoS files and downloading full-text articles (PDFs) when available. Ultimately, the DOIScout will be made available as an open-source, user-friendly tool so that related research projects and the wider scientific community can benefit from it.
[1] Clinical trial identification number assigned by the study registry, e.g. ClinicalTrials.gov.
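A search by register identifier presupposes that such identifiers can be recognized reliably in free text such as abstracts or registry fields. The following sketch shows one way to do this with regular expressions. It is not taken from DOIScout: the patterns encode the commonly used identifier formats of the registries named in this protocol (an assumption), the function name is hypothetical, and the DRKS number in the example is a placeholder.

```python
import re

# Assumed identifier formats: prefix plus eight digits for ClinicalTrials.gov,
# DRKS and ISRCTN; YYYY-NNNNNN-CC for EudraCT.
REGISTRY_PATTERNS = {
    "ClinicalTrials.gov": re.compile(r"\bNCT\d{8}\b"),
    "DRKS": re.compile(r"\bDRKS\d{8}\b"),
    "ISRCTN": re.compile(r"\bISRCTN\d{8}\b"),
    "EudraCT": re.compile(r"\b\d{4}-\d{6}-\d{2}\b"),
}


def find_registry_ids(text):
    """Return {registry: sorted unique identifiers} found in free text."""
    found = {}
    upper = text.upper()  # identifiers are sometimes written in lower case
    for registry, pattern in REGISTRY_PATTERNS.items():
        ids = sorted(set(pattern.findall(upper)))
        if ids:
            found[registry] = ids
    return found


# Invented example sentence with one identifier repeated in lower case.
sample = "Trial registration: NCT01234567; DRKS00000001. See also nct01234567."
ids = find_registry_ids(sample)
```

Identifiers written with a space after the prefix (for example "ISRCTN 12345678") would be missed by these patterns and would need an extra normalization step.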
[2] The AWMF S-classification scheme classifies guidelines into classes S1, S2 and S3. Class S1 guidelines consist of action recommendations by experts but lack a systematic development process. S2 guidelines are either developed using a systematic analysis of the scientific evidence (S2e) or a structured consensus finding by a representative body (S2k). S3 guidelines combine both aspects and form the highest class of guidelines.