Revisiting Methods For Modeling Longitudinal and Survival Data: The Framingham Heart Study
Background: Statistical methods for modeling longitudinal and time-to-event data have received much attention in medical research and are becoming increasingly useful. In clinical studies, such as those of cancer and AIDS, longitudinal biomarkers are used to monitor disease progression and to predict survival. These longitudinal measures are often missing at failure times and may be prone to measurement error. More importantly, time-dependent survival models that include the raw longitudinal measurements may lead to biased results. In previous studies, these two types of data have frequently been analyzed separately, with a mixed effects model used for the longitudinal data and a survival model applied to the event outcome.
Methods: In this paper we compare joint maximum likelihood methods, a two-step approach, and a time-dependent covariate method that link longitudinal data to survival data, with emphasis on using longitudinal measures to predict survival. We apply a Bayesian semi-parametric joint method and a maximum likelihood joint method that maximizes the joint likelihood of the time-to-event and longitudinal measures. We also implement the two-step approach, which estimates random effects separately, and a classic time-dependent covariate model. We use simulation studies to assess bias, accuracy, and coverage probabilities for the estimates of the link parameter that connects the longitudinal measures to survival times.
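For orientation, a commonly used shared-random-effects formulation shows where a link parameter of this kind enters. This is a sketch in assumed notation, not necessarily the exact specification used in the manuscript: the symbols y_i, m_i, b_i, w_i, h_0 and alpha are introduced here for illustration only.

```latex
% Longitudinal submodel (linear mixed effects model), subject i at time t:
y_i(t) = m_i(t) + \varepsilon_i(t), \qquad
m_i(t) = x_i(t)^{\top}\beta + z_i(t)^{\top} b_i, \qquad
b_i \sim N(0, D), \quad \varepsilon_i(t) \sim N(0, \sigma^2)

% Survival submodel: the hazard depends on the error-free trajectory m_i(t)
% through the link parameter \alpha, with baseline covariates w_i:
h_i(t) = h_0(t)\, \exp\!\left\{ w_i^{\top}\gamma + \alpha\, m_i(t) \right\}
```

Under this notation, a joint method estimates all parameters from the combined likelihood of both submodels, a two-step approach first fits the mixed model and then substitutes the fitted trajectory for m_i(t) in the hazard, and a time-dependent covariate model substitutes the observed, error-prone y_i(t) directly.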
Results: Simulation results demonstrate that the two-step approach performs best at estimating the link parameter when variability in the longitudinal measure is low but is somewhat biased downward when the variability is high. The Bayesian semi-parametric and maximum likelihood joint methods yield higher link parameter estimates under both low and high variability in the longitudinal measure. The time-dependent covariate method consistently underestimates the link parameter. We illustrate these methods using data from the Framingham Heart Study, in which lipid measurements and myocardial infarction data were collected over a period of 26 years.
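The performance measures named in the Methods (bias, accuracy, and coverage probability) can be computed from simulation replicates along the following lines. This is a minimal sketch, not the authors' code: the function name `summarize_link_estimates` is hypothetical, accuracy is taken here to mean mean squared error, and coverage is based on Wald-type intervals (estimate ± z × standard error), all of which are assumptions. For a Bayesian fit, credible intervals would take the place of the Wald intervals.

```python
import math
from statistics import mean


def summarize_link_estimates(estimates, std_errors, true_value, z=1.96):
    """Summarize simulation replicates of a link-parameter estimate.

    estimates   : one point estimate per simulated data set
    std_errors  : matching standard errors (assumed available from each fit)
    true_value  : the link parameter used to generate the data
    z           : normal quantile for the Wald interval (1.96 gives ~95%)
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("estimates and std_errors must be non-empty and equal in length")

    errors = [est - true_value for est in estimates]

    # Bias: average signed deviation from the true value.
    bias = mean(errors)

    # Accuracy, taken here as mean squared error (an assumption).
    mse = mean([e * e for e in errors])

    # Coverage: share of Wald intervals that contain the true value.
    covered = [abs(est - true_value) <= z * se
               for est, se in zip(estimates, std_errors)]
    coverage = sum(covered) / len(covered)

    return {"bias": bias, "mse": mse, "rmse": math.sqrt(mse), "coverage": coverage}
```

A negative `bias` corresponds to the downward bias (underestimation) described above, and a `coverage` well below the nominal level signals intervals that miss the true link parameter too often.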
Conclusions: Traditional methods for modeling longitudinal and survival data that use the observed longitudinal data, such as the time-dependent covariate method, tend to provide downwardly biased estimates. The two-step approach and joint models provide better estimates, although how these methods compare may depend on the underlying residual variance.
Figure 1
Due to technical limitations, full-text HTML conversion of this manuscript could not be completed. However, the manuscript can be downloaded and accessed as a PDF.
Due to technical limitations, Tables 1-6 are provided in the Supplementary Files section.
This is a list of supplementary files associated with this preprint.
Tables 1-6
S1: Type I Errors for Link (Exponential and Weibull Distribution, N = 100)
S2: Estimates and Confidence Intervals for Link (Weibull Distribution, N = 100)
S3: Estimates and Confidence Intervals for Age (Weibull Distribution, N = 100)
S4: Estimates and Confidence Intervals for Sex (Weibull Distribution, N = 100)
S5: Bayesian Semi-Parametric Joint Modeling Exact Prior Distributions
Posted 19 Jan, 2021