On estimating survival and hazard ratios using one time-scale when cohort data have multiple time-scales: a simulation study

Background: When estimating survival functions and hazard ratios during the analysis of cohort data, we often choose one time-scale, such as time-on-study, as the primary time-scale, and include a ﬁxed covariate, such as age at entry, in the model. However, we rarely consider the possibility of simultaneous eﬀects of multiple time-scales on the hazard function. Methods: In a simulation study, within the framework of ﬂexible parametric models, we investigate whether relying on one time-scale and ﬁxed covariate as proxy for the second time-scale is suﬃcient in capturing the true survival functions and hazard ratios when there are actually two underlying time-scales. Result: We demonstrate that the one-time-scale survival models appeared to approximate well the survival proportions, however, large bias was observed in the log hazard ratios if the covariate of interest had interactions with the second time-scale or with both time-scales. Conclusion: We recommend to exercise caution and encourage ﬁtting models with multiple time-scales if it is suspected that the cohort data have underlying non-proportional hazards on the second time-scale or both time-scales.


Introduction
Survival analysis of time-to-event data can include several time-scales, such as time since diagnosis, treatment duration, attained age or calendar time. Multiple time-scales can be seen in situations, where the event rate may vary along more than one time-scale. For example, breast cancer incidence depends on attained age as well as time since first childbirth; in many chronic diseases such as diabetes, the risk of complications is contingent on time since diagnosis and attained age; the risk of getting an infection in intensive care unit may depend on length of stay and calendar time [1,2].
When analysing cohort data, we may be aware of the presence of two or more timescales that may simultaneously affect the rates of an outcome. Time-on-study and attained age are the most frequently encountered time-scales. However, the joint effects of both time-scales are rarely considered. Instead, one time-scale, such as time-on-study, is often chosen as the main time-scale for the baseline hazard, while the effect of the second time-scale is not modelled, rather a time-fixed covariate, such as age at entry is included in the model.
The usual approach to jointly modelling two time-scales is to treat the main time-scale as continuous, and split the data into very short intervals along the second time-scale, thus assuming constant effects within each interval. The obtained data can then be analysed with a Cox model, or a parametric model. The flexible parametric models have become a popular alternative in analysing survival data, as they can capture complex hazard shapes, and enable smooth predictions for relative and absolute effects [3,4,5]. A Poisson regression model can also be used if the observations are further split along the first time-scale in addition to time-intervals on the second time-scale. This may lead to very fine splitting of the observed data to uphold the assumption of the constant hazard rate within each interval.
In large population-based register studies, splitting observations into very short intervals along two or more time-scales can become unmanageable with current technology. Time-splitting can further exacerbate the challenges of fitting very complex models. The absence of a well documented standard software to jointly model multiple time-scales can be an additional deterrent to fitting multiple time-scales models.
These limitations play a role in our preference to model how the rate changes as a function of only one time-scale. Unfortunately, we rarely consider the potential bias in estimates from a misspecified model that ignores simultaneous effects of more than one time-scale. In this study, within the framework of flexible parametric models, we aim to investigate whether modelling one time-scale while adjusting for a constant proxy for the second time-scale suffices in different situations when the hazard rates in the data depend on two time-scales. In situations where these models appear to be insufficient, we also aim to assess bias in survival and hazard ratio estimates.
Assessment of choosing an optimal time-scale is out of scope of this study as there has been a considerable attention paid to the choice of a time-scale when analysing cohort data [6,7,8,9], as well as combining time-scales into a unique optimal single time-scale [10, 11, 12].

Methods
Models with one or more time-scales where h 0 is the baseline hazard function, γ is the corresponding parameter vector and x is a vector of covariates with associated log hazard ratios β. In the Cox model, the baseline hazard is not modelled directly due to the use of the partial likelihood [13], whereas in the Poisson regression it is common to model the behaviour of the baseline hazard by categorising t 1 through function g into time-intervals, say monthly or yearly intervals. If a continuous format of the time-scale is preferred then g can be any polynomial function or a spline function, which allows for a smooth non-linear function.
In situations where it is known that the hazard rate may depend on two timescales, such as time from diagnosis and attained age, the usual approach is to include the time from diagnosis, t 1 , in the baseline hazard and use age at entry, a 0 , as a constant proxy for the second time-scale, as shown in the following model: In this model a 0 is part of the covariates x * , and the relative effect of a 0 , modelled by function f , is part of the vector of the log hazard ratios β * . Similar to g, function f is usually specified as a categorical function (such as five-yearly intervals), or as a smoothing spline function. To allow for time-dependent effects of age at diagnosis and of any x p (p = 1, ...P ) covariate in x, model (2) can include interaction terms with t 1 : log(h(t 1 ; γ, β a0 , γ a0 , β, γ xp )) = g(t 1 ; γ) + f (a 0 ; β a0 ) + r(g int (t 1 )f int (a 0 ); γ a0 )+ where P is the number of covariates, excluding age, with time-dependent effects. Now, a model that includes both time from diagnosis, t 1 , and attained age, t 2 , in the baseline hazard function, h 0 , assuming proportional hazards, can be written as: Furthermore, model (4) can also incorporate interaction between two time-scales t 1 and t 2 : in the medical setting, such as our example, using the time of diagnosis as the time origin, and a 0 as an offset term, we can specify attained age, t 2 , in terms of t 1 and a 0 , therefore transforming model (4) into Non-proportional hazards as well as interactions between time-scales can be accommodated by including interactions between the covariates and the time-scales: In this study, within the framework of flexible parametric models, our focus is on errors in predictions when fitting models as shown in equations (2)

Flexible parametric models
Flexible parametric survival models (FPMs) are parametric models that use restricted cubic splines to model the baseline hazard function as well as timedependent effects on the log hazard scale or log cumulative hazard scale [3,4,5]. A flexible parametric model on the log hazard scale with one time-scale that includes interactions with time, is written as: where h is a hazard function over time-scale t, x is a vector of covariates with associated log hazard ratios β, with additional p th time-dependent effect for covariate x p , with p = 1, . . . , P . The baseline hazard is represented by the restricted cubic spline function, s(log(t); γ 0 , k 0 ), with knot location vector, k 0 , and associated coefficient vector γ 0 from the spline expansion [5]. Users need to determine the number of knots for the spline expansion of log(t) and for each of the spline expansions for the time-dependent effects. Interactions between the covariates and the restricted cubic spline function of log(t) is omitted to fit a model with proportional hazards.
It should be noted that it is common to model log(t) in FPMs on the log hazard scale, but other functional forms of time t can be used as well.

Simulation strategy
This simulation study was conducted to evaluate bias in estimates of survival functions and hazard ratios when modelling hazard rates using FPMs with one timescale given that the true hazard rates depend on two time-scales. For this purpose, we simulated cohort data loosely based on haematological cancer patients who relapsed after transplantation, and were followed from relapse until death. We simultaneously simulated two time-scales -time since relapse and time since initial transplantation. In the interest of parsimony, we chose to exclude calendar time and attained age, which should not be ignored in the clinical study of death rates in these patients. To the simulated data we fitted FPMs with time since relapse as the time-scale, and used the time since transplantation until relapse as a constant proxy for the time since transplantation, while also allowing for possible non-proportional hazards.

Simulation design and data generation
We simulated 1000 datasets with a sample size of 1000 from each of the seven datagenerating mechanisms (DGMs) with and without interaction terms with timescales presented in Table 1. Each data-generating mechanism simulated time-toevent data based on two time-scales: the first time-scale, t, time (up to 10 years) from relapse until death or censoring, and the second time-scale, t + c 0 , time from transplantation, where c 0 represents the time from transplantation until relapse, and was generated from a uniform distribution U (0, 5). Notation c 0 for offset is used instead of a 0 from equations (2)- (7) to make it clear that these offsets are Different DGMs with and without interactions with time-scales were chosen to reflect the wide spectrum of situations that we usually encounter when analysing cohort data. Scenarios S 1 , S 2 generated data with proportional hazards for x; S 3 , S 4 simulated interaction between x and t, while S 5 , S 6 simulated interaction between x and t + c 0 ; S 7 generated data with both interaction terms: between x and t, and between x and t + c 0 . Of the seven data-generating mechanisms, three scenarios (S 2 , S 4 , S 6 ) also simulated interactions between the two time-scales.
For reference, to mirror the true models used for simulating data, we fitted flexible parametric models that jointly modelled both time-scales using three degrees of freedom for log(t) and for t + c 0 . Since these models are close approximations of the true models, they are only presented in the Supplementary Material.

Figures 2 -4 present the bias in predicted survival functions and the log hazard
ratios at follow-up times t = 1, 3, 5 (years since relapse) and c 0 = 0.5, 4 (time of relapse after transplantation) for the fitted models for the DGMs.

Scenarios with proportional hazards
For two simpler scenarios S 1 , S 2 , which simulated data with proportional hazards for x, the absolute bias was low (<0.06) in the survival estimates from all fitted [1] Supplementary material may also be found in the online version of this article. proportional hazards for x, showed close estimates to the true values (absolute bias <0.005). These models were aimed to correspond to S 1 and S 2 , respectively. The AIC and BIC were both seen to also support models M1, M2 most of the time in respect to these scenarios, as shown in Table 3.

Discussion
The purpose of this simulation study was to assess the performance of fitted models All models underperformed when estimating the log hazard ratios from data with interaction between x and the second time-scale t + c 0 . It is highly likely that a researcher encountering data similar to data from S 5 , S 6 , may choose models that include an interaction term between x and the main time-scale, such as models M3, M4 or M7, which include such an interaction. However, this can lead to misleading interpretation that the variation in the effect of x on the hazard rates is solely due to the main time-scale (t) rather than the underlying second time-scale (t + c 0 ). Even though model M7 that includes both interaction x · log(t) and interaction x · c 0 was seen to have very close survival predictions to the true values, the bias in the log hazard ratios was still visibly large, especially for follow-up time beyond five years.
We can never be safe from such pitfalls in the analysis of real-world data. With our simulation study, we hope to raise awareness of such pitfalls and direct attention towards more nuanced analysis with cautious approach to modeling time-scales. It should be noted that the simulated scenarios were designed to have joint strong effects from both time-scales, and therefore we do not expect to observe as severe bias in situations with less strong effects.
As in the majority of simulation studies, we only looked at limited number of scenarios, and made simplifying assumptions. In particular, we simulated data using   Figure 1 Survival proportions, hazard rates and log hazard ratios of the seven data generating mechanisms over time t since relapse given relapses occurred at c 0 = 0.5, 4 years after transplantation. Note that the y-scales differ across scenarios.

Additional Files
Additional file 1 A HTML file containing all the results from the data analyses with additional figures and tables.