Preclinical studies form the basis of research aimed at improving human health. Also known as basic research, these studies are performed in laboratory settings using various non-human models such as cell cultures, invertebrates, and/or vertebrate animals. Preclinical studies demonstrate how biological systems and disease processes work and provide the knowledge to understand if an intervention is safe for clinical research that involves direct interaction with human subjects. Translational research is the interface between preclinical and clinical trials, applying the findings from preclinical trials to human studies.
Unfortunately, despite the convention and necessity of conducting preclinical translational research, many interventions that appear promising in preclinical studies fail to demonstrate safety and efficacy in clinical trials. Even worse, when interventions are deemed appropriate for clinical testing, over 80% of the human results fail to confirm the preclinical findings, either because the effects seen in humans contradict them or because the preclinical effects were under- or overestimated. For example:
- Out of 101 technologies (drugs, devices, and genetic therapies) that were tested in preclinical models, only 27 were deemed appropriate for human trials, and of those, only one technology conferred a clinical benefit.
- Dozens of drugs developed for the treatment of amyotrophic lateral sclerosis (ALS) were studied in preclinical mouse models, and all demonstrated efficacy. Yet, when tested in humans, all but one trial failed to confer benefits to patients. Additionally, 100 drugs for ALS were studied in mice. Only eight of these drugs advanced to clinical studies. All eight clinical trials, comprising thousands of enrolled patients, failed to demonstrate efficacy.
- A systematic review compared the treatment effects demonstrated in animal and human trials examining the same intervention and diseases. Discordance was found between the models, with benefits observed in animal models but not humans, and vice versa, demonstrating that the results from trials in one model may not be applicable for another model.
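The counts quoted in the examples above can be turned into simple attrition rates. The following is a minimal sketch in Python; the function and variable names are illustrative assumptions, and the only data used are the figures already stated in the text.

```python
def rate(successes: int, total: int) -> float:
    """Return successes / total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * successes / total


# Counts quoted in the text: 101 technologies tested in preclinical models,
# 27 deemed appropriate for human trials, 1 with a clinical benefit.
tested, advanced, beneficial = 101, 27, 1

print(f"Advanced to human trials: {rate(advanced, tested):.1f}%")      # 26.7%
print(f"Benefit among advanced:   {rate(beneficial, advanced):.1f}%")  # 3.7%
print(f"Benefit among all tested: {rate(beneficial, tested):.1f}%")    # 1.0%

# ALS example: 100 drugs studied in mice, 8 taken into clinical studies,
# none of which demonstrated efficacy.
als_tested, als_advanced, als_effective = 100, 8, 0

print(f"ALS drugs advanced:       {rate(als_advanced, als_tested):.1f}%")     # 8.0%
print(f"ALS trials with efficacy: {rate(als_effective, als_advanced):.1f}%")  # 0.0%
```

Put this way, roughly one in a hundred of the technologies in the first example produced a clinical benefit.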
While animals have analogous physiologies to humans, they are different species. Biology is complex, and the mechanisms of action of an intervention in animals may or may not work in humans, and vice versa. Hence, the biological variability between species may lead to discordance between models. It is also likely that some differences between study types are due to systematic human-caused errors related to the design, conduct, and reporting of preclinical studies. These errors can bias the preclinical results inappropriately in the direction of benefit or harm. Thus, when the intervention is tested in humans, the biased results cannot be reproduced.
The consequence of discordant preclinical trials is that extensive resources and money have been dedicated to pursuing treatments in patients based on erroneous and/or biased preclinical results. Most devastating is that patients dedicate hope and months or years of their lives to experimental treatments that may prove to be less efficacious than the animal studies suggested, or even harmful.
Despite the inappropriateness of claiming that results from preclinical trials are directly applicable to humans, it is not uncommon for researchers to make these claims in the conclusions of their work, perhaps in an attempt to inflate their physiological relevance. This tactic is misleading and should be avoided. The results from a preclinical study cannot be generalized to human populations because they lack external validity, a concept that asks the question “how likely is it that the observed effect would occur outside of the study?” In both preclinical and clinical research, the results are only externally valid to a study population with similar characteristics, e.g., with respect to species, age, and gender. For example, consider a clinical trial that examined X intervention in Caucasian male adults aged 30 to 50 years with Y condition. The results from this trial are only applicable to a similar population: Caucasian male adults aged 30 to 50 years with Y condition. If X intervention is studied in a different population, such as Caucasian females aged 40 to 70 years, the results may vary.
External validity is a theoretical concept and it is possible that interventions are applicable outside of study populations. However, clinical research is based on probabilities, and the likelihood that results will translate between populations is reduced when studied populations are different. Considering that caution is needed when applying the results from one human population to another, it is inappropriate to claim that the results are applicable from one species to another.
The failures of preclinical to clinical translation may be mitigated if preclinical studies are conducted with higher methodological quality, reproduced/replicated in multiple animal models, and/or evaluated to assess the likelihood they will concordantly translate to humans before commencing clinical studies. The translatability of preclinical studies can be evaluated using the Clinical Relevance Assessment of Animal Preclinical Research (RA-A) Tool. The tool is composed of eight domains comprising 27 questions that assess the likelihood that results from preclinical studies will successfully translate into clinical trials and demonstrate concordant results in human subjects. The tool should not be used to evaluate the “quality” of a preclinical trial; to assess veterinary, in vitro, or in silico research; or to assess exploratory preclinical studies developing new animal models of disease. A summary of the eight domains of the RA-A Tool is as follows:
Domain 1: Clinical Translatability of Results to Human Disease or Condition (Construct Validity)
Purpose: to assess if statistically positive results from the primary/secondary outcomes in the preclinical study could translate into clinical benefits. The following components are assessed:
- Does the model represent the human disease?
- Did the researchers characterize and confirm that the disease was present in the model?
- Were the methods and timing of the intervention relevant to humans?
- If surrogate outcomes were used, were they validated in previous studies and do they correlate with the clinical outcome?
- Was a systematic review/meta-analysis conducted demonstrating concordance between the animal and human model of the disease?
Domain 2: Experimental Design and Analysis
Purpose: to assess how the researchers designed the study and analyzed the results. A particular emphasis is placed on assessing random and measurement errors. The following components are assessed:
- Sample size calculations
- Type and distribution of data
- Multiple hypothesis testing
- Dose response, accuracy, precision, sampling error, and adjustments for measurement error
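To illustrate why multiple hypothesis testing appears in this domain, the sketch below computes how the chance of at least one false-positive finding grows with the number of tests. It rests on assumptions that are not taken from the RA-A Tool: the tests are treated as independent, a significance level of 0.05 is used, and the Bonferroni correction is shown only as one common adjustment. The function names are illustrative.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction
    (an assumed, commonly used adjustment; others exist)."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


alpha = 0.05  # assumed significance level for illustration
for n in (1, 5, 10, 20):
    fwer = family_wise_error_rate(alpha, n)
    per_test = bonferroni_threshold(alpha, n)
    print(f"{n:>2} tests: chance of >=1 false positive = {fwer:.3f}, "
          f"Bonferroni per-test threshold = {per_test:.4f}")
```

Under these assumptions, a study that tests ten outcomes without adjustment has roughly a 40% chance of reporting at least one spurious "significant" effect.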
Domain 3: Bias (Internal Validity)
Purpose: to assess the risk of bias of the study. The following components are assessed:
- Selective outcome reporting (the SYRCLE risk of bias tool for animal studies is recommended as an accompaniment to evaluate the risk of bias for this domain)
Domain 4: Reproducibility of Results in a Range of Clinically Relevant Conditions (External Validity)
Purpose: to assess if the results reported in this study have been reproduced in a range of clinically relevant conditions. This domain assesses whether the results were reproduced using animal models with:
- different methods of disease induction
- different genetic compositions
- different ages
- different sexes
Domain 5: Replicability of Methods and Results in the Same Model
Purpose: to assess if the methods and results from this study have been replicated in other studies using the same animal model. The following components are assessed:
- Were the methods used to conduct the study sufficiently described to allow for the study to be replicated?
- Has an external research team attempted to replicate the methods and the results either independently or in separate studies?
Domain 6: Implications of the Study Findings (Study Conclusions)
Purpose: to assess if the conclusions from the trial appropriately reflect the results. The following components are assessed:
- Did the researchers take all the findings and limitations (construct validity, experimental design and analysis, bias, external validity, and replicability) into consideration when arriving at conclusions?
- Did the researchers make appropriate recommendations for further preclinical studies that are necessary before conducting clinical trials?
Domain 7: Research Integrity
Purpose: to assess if the researchers conducted the study in line with research ethics and integrity. The following components are assessed:
- Were ethical and regulatory approvals acquired before conducting the study?
- Did the researchers minimize errors in their data by using password-protected procedures for data collection, storage, logging, and analysis to reduce the risk of unintentional changes to the data?
Domain 8: Research Transparency
Purpose: to assess if the researchers transparently reported their results in line with guidelines such as the ARRIVE checklist for animal studies. The following components are assessed:
- Did the researchers register and/or publish a protocol before the study was conducted?
- Is the protocol concordant with the methods and outcomes reported in the completed study?
- Were deviations from the protocol described?
- Were individual animal data available for an external research team to analyze?
Researchers use the RA-A Tool by answering each of the 27 questions with one of five responses: Yes, Probably Yes, No, Probably No, or No Information. Based on the responses to each question, an overall classification is assigned to each of the eight domains: Low Concern, Moderate Concern, or High Concern. These domain classifications result in one of two overall Clinical Relevance classifications for the trial as a whole:
1) High Clinical Relevance: the trial has been adequately conducted and the results are more likely to translate to human trials.
2) Uncertain Clinical Relevance: limitations in the design and conduct of the trial indicate that the results may not translate to human trials.
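The three-level workflow described above (question answers, then domain classifications, then an overall classification) can be sketched as a small data model. The answer options and classification labels come from the text, but the aggregation rules in `classify_domain` and `classify_study` are hypothetical placeholders: the text does not state how answers map to concern levels or how concern levels map to the overall classification, so the tool's own guidance should be used in practice.

```python
from __future__ import annotations

from enum import Enum


class Answer(Enum):
    YES = "Yes"
    PROBABLY_YES = "Probably Yes"
    NO = "No"
    PROBABLY_NO = "Probably No"
    NO_INFORMATION = "No Information"


class Concern(Enum):
    LOW = "Low Concern"
    MODERATE = "Moderate Concern"
    HIGH = "High Concern"


def classify_domain(answers: list[Answer]) -> Concern:
    """Assign a concern level to one domain.

    ASSUMED RULE (not from the RA-A Tool): any "No" or "Probably No"
    gives High Concern; otherwise any "No Information" gives Moderate
    Concern; otherwise Low Concern.
    """
    if not answers:
        raise ValueError("a domain needs at least one answer")
    if any(a in (Answer.NO, Answer.PROBABLY_NO) for a in answers):
        return Concern.HIGH
    if any(a is Answer.NO_INFORMATION for a in answers):
        return Concern.MODERATE
    return Concern.LOW


def classify_study(domains: dict[str, Concern]) -> str:
    """Assign the overall classification.

    ASSUMED RULE (not from the RA-A Tool): High Clinical Relevance only
    when every domain is Low Concern; otherwise Uncertain.
    """
    if not domains:
        raise ValueError("at least one domain is required")
    if all(c is Concern.LOW for c in domains.values()):
        return "High Clinical Relevance"
    return "Uncertain Clinical Relevance"


# Hypothetical answers for two of the eight domains, for illustration only.
example_answers = {
    "Research Integrity": [Answer.YES, Answer.PROBABLY_YES],
    "Research Transparency": [Answer.YES, Answer.NO_INFORMATION,
                              Answer.YES, Answer.PROBABLY_NO],
}
domain_concerns = {name: classify_domain(a) for name, a in example_answers.items()}
for name, concern in domain_concerns.items():
    print(f"{name}: {concern.value}")
print("Overall:", classify_study(domain_concerns))
```

Keeping the two aggregation rules in separate functions means either one can be swapped for the tool's actual criteria without touching the data model.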
Given the importance of preclinical trials in establishing the safety and efficacy of interventions and the translational failures of basic science, it is of utmost importance that researchers, practitioners, and healthcare stakeholders are aware of these components before interventions are recommended for application in human studies. Through enhanced awareness about these translational elements, the design, conduct, and evaluation of preclinical trials may be improved. Consequently, a more accurate signal of the interventions that should be studied in human trials could be obtained to optimize the resources and funding dedicated to additional preclinical and clinical research.
Most importantly, the precious hope and time that patients dedicate to treatments that may ultimately fail would not be wasted if the clinical relevance could be better understood before the intervention is studied in humans. Research and healthcare communities need to be equal partners in understanding the limitations of preclinical studies, advocating for their appropriate conduct, and ensuring their results are appropriately interpreted and disseminated in research, healthcare, and public discourse.