Missing data bias occurs when:
- Participants’ data is not included in an analysis because the researchers purposely omit their data
- Participants drop out or die before the trial is completed
When data is missing, it cannot contribute to the results.
Because the analyzed data set is incomplete, missing data can bias the estimate of the true outcomes. Statistical techniques can compensate for dropouts and deaths and reduce the risk of bias.
Clinical research involves enrolling tens to thousands of participants in a study. Over a period of weeks to years, the effect of different interventions, risk factors, or exposures on one or more outcomes is compared among participants who have been allocated to different groups within the study.
In most studies, participants return to the investigators during follow-up periods so that their outcomes can be measured. Ideally, all participants will return at each follow-up point. In practice, some participants drop out of studies, especially in trials with many participants.
Why do people drop out of studies?
The reasons for dropping out may be unrelated to the study. For example, participants may have time conflicts that prevent them from continuing with the study.
Participants may also drop out due to side effects from the intervention received during the trial. In more severe cases, participants die during the timeframe of a study.
In all cases, data from the participants who dropped out of the study may be missing from study analyses if the data could not be collected. Alternatively, the data from participants who dropped out or died could have been available, but the researchers neglected to include it in the analyses.
Regardless of the reason, when data from participants is missing, the effect of the intervention, risk factor, or exposure may be underestimated or overestimated, which leads to biased conclusions.
In contrast to selection bias, in which participants are inappropriately excluded before they are enrolled in a study, missing data bias occurs after participants are enrolled in the study.
Examples of missing data bias
Consider the following hypothetical examples of bias due to missing data:
- A group of researchers studied the effect of major surgery versus chemotherapy in people with advanced cancer. During the study, 5% of people who underwent surgery developed complications due to the surgical intervention and died. When analyzing the results, the researchers excluded the participants who died and compared long-term mortality between people who survived surgery and those who received chemotherapy. The outcomes and conclusions of this study would be biased due to missing data because the 5% of people who died were excluded from the analysis.
- In a preclinical study, researchers compared two drugs, drug A and drug B, and examined their effect on the physical activity of mice. Approximately 2% of mice receiving drug A developed toxicity; their physical activity decreased considerably, and they were removed from the study. When analyzing the results, the researchers excluded the 2% of mice that developed toxicity. Consequently, the outcomes and conclusions were biased due to the missing data from the mice that were inappropriately excluded.
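The arithmetic behind the first example can be sketched in Python. Only the 5% surgical death rate comes from the example; the group sizes and the long-term death counts are hypothetical values assumed purely for illustration, not results from any real trial.

```python
# Hypothetical numbers for illustration only (not from a real trial).

def mortality(deaths: int, participants: int) -> float:
    """Return the proportion of participants who died."""
    return deaths / participants

surgery_enrolled = 1000
surgical_deaths = 50         # 5% died of surgical complications (from the example)
surgery_late_deaths = 285    # assumed long-term deaths among the 950 survivors
chemo_enrolled = 1000
chemo_deaths = 350           # assumed long-term deaths in the chemotherapy group

# Biased analysis: participants who died of surgery are dropped entirely.
surgery_excluding = mortality(surgery_late_deaths,
                              surgery_enrolled - surgical_deaths)

# Analysis that counts every enrolled participant.
surgery_all = mortality(surgical_deaths + surgery_late_deaths, surgery_enrolled)
chemo = mortality(chemo_deaths, chemo_enrolled)

print(f"Surgery, deaths excluded: {surgery_excluding:.3f}")
print(f"Surgery, all enrolled:    {surgery_all:.3f}")
print(f"Chemotherapy:             {chemo:.3f}")
```

Under these assumed numbers, excluding the surgical deaths makes surgery look better than it is: the apparent advantage over chemotherapy is larger than the advantage seen when every enrolled participant is counted.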
If participants in a study dropped out and did not provide data, how could the data be missing if it was never provided? In this case, “missing data” does not literally mean that the data was collected and then lost. Rather, the terminology is used to convey that the full data set is incomplete.
To compensate for missing or incomplete data, researchers can use an intention-to-treat (ITT) analysis, in which all participants are analyzed in the group to which they were originally assigned, whether or not they completed the study. Because outcomes for participants who drop out are not observed, an ITT analysis is often paired with imputation, in which the missing values are estimated from the available data. By performing an ITT analysis, the balance between study groups and the statistical power (the probability of correctly rejecting a false null hypothesis) are maintained when analyzing the results. Preserving balance and power is critical because an unbalanced study may bias the results in favor of one group over another, and an underpowered study may be too small to detect true differences between the groups. Consequently, the study conclusions may be incorrect.
How to evaluate missing data bias
To evaluate bias due to missing data, perform the following steps:
- Assess how many participants were included in the study and assigned to each group.
- In the results, check whether the number of participants analyzed matches the number that was included in each group.
- If the numbers match, bias due to missing data is unlikely.
- If the numbers do not match, look to see if the researchers used an ITT analysis. They may have used an ITT analysis but failed to indicate in the tables and figures the number of participants analyzed in each group and reflected in the results.
- If no ITT analysis was used, bias due to missing data is likely present, and the outcomes and conclusions must be interpreted cautiously because they may be skewed away from the truth.
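The steps above can be sketched as a small Python helper. This is an illustrative sketch, not a validated appraisal tool: the `Arm` class, the `assess_missing_data` function, and the group sizes in the usage example are all hypothetical names and values introduced here.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """One study group (hypothetical structure for this sketch)."""
    name: str
    assigned: int   # participants allocated to this group
    analyzed: int   # participants reflected in the reported results


def assess_missing_data(arms: list[Arm], itt_reported: bool) -> str:
    """Compare assigned and analyzed counts, then check for an ITT analysis."""
    mismatched = [arm.name for arm in arms if arm.analyzed != arm.assigned]
    if not mismatched:
        return "numbers match"
    if itt_reported:
        return "numbers differ; ITT reported: " + ", ".join(mismatched)
    return "numbers differ; no ITT reported: " + ", ".join(mismatched)


# Usage with made-up counts: 50 surgery participants are absent from the results.
arms = [Arm("surgery", 1000, 950), Arm("chemotherapy", 1000, 1000)]
print(assess_missing_data(arms, itt_reported=False))
```

The helper only flags where the counts disagree; deciding how much the missing data matters still requires reading how the study handled the participants who dropped out or died.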