Background: The study design used to develop prediction models in observational healthcare databases (e.g., case-control and cohort) may impact the models' clinical usefulness. We aim to quantify how the choice of design impacts prediction model performance.
Aim: To empirically investigate differences between models developed using a case-control design and a cohort design.
Methods: Using a US claims database, we replicated two published prediction models (dementia and type 2 diabetes) that were developed using a case-control design, and also trained models for the same prediction questions using cohort designs. We validated each model on data mimicking the point in time at which the models would be applied in clinical practice. We calculated the models' discrimination and calibration-in-the-large performance.
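The two performance measures named above can be illustrated with a minimal sketch. This is not the authors' evaluation pipeline; it is a pure-Python toy computing discrimination as the area under the ROC curve (via the rank-sum statistic) and calibration-in-the-large as the gap between the observed outcome rate and the mean predicted risk, with made-up predictions and outcomes:

```python
# Hedged sketch of the two evaluation measures, on toy data.

def auc(y_true, y_pred):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) statistic."""
    pos = [p for p, y in zip(y_pred, y_true) if y == 1]
    neg = [p for p, y in zip(y_pred, y_true) if y == 0]
    # Count pairs where a case is ranked above a non-case (ties count half).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_in_the_large(y_true, y_pred):
    """Observed outcome rate minus mean predicted risk; 0 is ideal."""
    return sum(y_true) / len(y_true) - sum(y_pred) / len(y_pred)

# Toy validation set: 5 patients, binary outcome, predicted risks.
y_true = [0, 0, 0, 1, 1]
y_pred = [0.1, 0.2, 0.3, 0.8, 0.9]

print(auc(y_true, y_pred))                       # 1.0 (cases ranked above non-cases)
print(calibration_in_the_large(y_true, y_pred))  # ~ -0.06 (slight overprediction)
```

A perfectly ranked toy set gives an AUC of 1.0 regardless of the risk scale, while calibration-in-the-large is sensitive to that scale; this is why a model can discriminate well yet still be miscalibrated.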
Results: The dementia models obtained areas under the receiver operating characteristic curve (AUC) of 0.560 and 0.897 for the case-control and cohort designs, respectively. The type 2 diabetes models obtained AUCs of 0.733 and 0.727 for the case-control and cohort designs, respectively. The dementia and diabetes case-control models were both poorly calibrated, whereas the dementia cohort model achieved good calibration. We show that careful construction of a case-control design can lead to discriminative performance comparable to that of a cohort design, but case-control designs generally oversample the outcome, leading to miscalibration.
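The miscalibration mechanism described in the results can be sketched numerically. In a 1:1 case-control sample, the apparent outcome rate is 50%, so a model fit on that sample learns an intercept far above the true population rate and overpredicts when applied to a cohort. A standard remedy is an intercept (offset) correction on the logit scale; the figures below are toy numbers, not taken from the paper:

```python
import math

# Hedged sketch: why 1:1 case-control sampling miscalibrates risks,
# and how an intercept offset recalibrates them. Toy numbers only.

pop_rate = 0.01   # assumed true population outcome rate
cc_rate = 0.5     # outcome rate in a 1:1 case-control sample

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Shift needed to move predictions from the sample scale to the population scale.
offset = logit(pop_rate) - logit(cc_rate)

cc_pred = 0.5  # a typical prediction from the case-control model
recalibrated = sigmoid(logit(cc_pred) + offset)

print(round(recalibrated, 4))  # 0.01 — back on the population risk scale
```

The offset leaves the ranking of patients (and hence the AUC) unchanged, which is consistent with the finding that a case-control model can discriminate well while its predicted risks are systematically too high.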
Conclusion: Any case-control design can be converted to a cohort design. We recommend that researchers with observational data use the less subjective and generally better calibrated cohort design. However, if a carefully constructed case-control design is used, then the model must be prospectively validated using a cohort design for a fair evaluation, and it must be recalibrated.

Figure 1

Figure 2

Figure 3
Supplementary files are associated with this preprint.
Posted 23 Mar, 2021