Predictor Naïve Temporal Baseline Hazard of Recurrent Ischemic Stroke


Correlations between risk factors and the recurrence of ischemic stroke (IS) are well established; however, does the hazard of recurrent IS change over time even without the influence of established risk factors? This study aimed to quantify the hazard of recurrent IS at different time points after the index IS. This was a population cohort study using data extracted for 7697 patients with a history of first IS attack registered with the National Neurology Registry of Malaysia. A repeated time-to-event model of recurrent IS was developed using NONMEM version 7.5. Three baseline hazard models were fitted to the data. The best model was selected using maximum likelihood estimation, clinical plausibility and visual predictive checks. Three hundred and thirty-three (4.32%) patients developed at least one recurrent IS within the maximum 7.37 years of follow-up. In the absence of significant risk factors, the hazard of recurrent IS was predicted to be 0.71 within the first month after the index IS and decreased to 0.022 between the first and third months after the index attack. The hazard of IS recurrence increased with the presence of typical risk factors such as hyperlipidaemia (HR, 2.64 [2.10-3.33]), hypertension (HR, 1.97 [1.43-2.72]), and ischemic heart disease (HR, 2.21 [1.69-2.87]). In conclusion, in the absence of significant risk factors, the predicted hazard of recurrent IS was prominent in the first month after the index IS and was non-zero even three months or later after the index IS. Optimal secondary preventive treatment should incorporate the 'natural risk' of IS recurrence.


Introduction
Stroke is the world's second leading cause of death (1)(2)(3)(4). For survivors of acute ischemic stroke (IS), the risk of a repeated stroke is significantly higher (5)(6)(7). Nearly 33% of the IS population in Malaysia experienced a recurrent stroke (8). Neurological damage from a recurrence is usually more severe, harder to manage, and associated with higher mortality than that from the first stroke (9). Therefore, secondary prevention is crucial to reduce IS recurrence events (9).
The prognosis of recurrent IS has been widely studied. The probability of recurrent IS after the index attack has been reported to vary over time, with recurrence in 11.2-30% of patients within the first 24 months (10, 11) and 9.5% within 5 years after the IS attack (12). The most recent study reported an IS recurrence rate of 1.2% in the first 30 days, 3.4% within 90 days, 7.4% within one year, and 19.4% within 5 years (13). Moreover, the reported risk factors for recurrent stroke vary (14)(15)(16); hypertension (HTN), atrial fibrillation (AF), diabetes mellitus (DM), hyperlipidaemia (HPLD), ischemic heart disease (IHD), and smoking were the most commonly reported predictors of recurrent stroke (17,18). Despite improvements in risk classification and prevention measures for recurrent IS over the last decades, IS remains a devastating disease. Currently, most secondary prevention of IS focuses on reducing and controlling the risk factors that lead to recurrent IS. Nevertheless, is the probability of recurrent IS different in the absence of the influence of these risk factors?
The majority of previous prognostic studies of recurrent stroke used the most common semi-parametric survival analysis method, Cox regression. The Cox model incorporates the effect of covariates on the hazard without quantifying the form of the recurrent stroke hazard at baseline or the distribution of the hazard function. The baseline hazard of an event is defined as the hazard of having the event of interest when all risk factors or covariates are set to zero. Thus, in the case of recurrent IS, the baseline hazard should be defined as the hazard of having a recurrent IS after the index event in the absence of risk factors. The hazard function, in turn, quantifies the probability of IS recurrence during a very small time interval, given that the individual has survived to the beginning of the interval. Unlike Cox regression, the parametric approach to survival analysis quantifies both the baseline hazard and the distribution of the hazard function of the event. This permits more time-dependent prognostic information that better reflects the expected 'natural history' of the disease. Moreover, information on the distribution of recurrent IS after the index IS is limited, and studies using the parametric approach on this topic are still lacking (19,20). Validated prognostic models of recurrent IS are also limited. Thus, this study aimed to quantify the hazard of recurrent IS at different time points after the index IS when the influence of significant risk factors was absent, and to develop a validated parametric prognostic model of recurrent IS.

Patients and data acquisition
This was a population cohort study involving a secondary analysis of data from the National Neurology Registry (NNEUR) of Malaysia. Data of all Malaysian patients with a history of index IS from August 2009 to December 2016 were extracted from the NNEUR. Details of the National Stroke Registry of Malaysia have been published previously (21)(22)(23). Stroke was diagnosed according to the World Health Organization's criteria (24). All diagnoses were confirmed using brain computed tomography or magnetic resonance imaging. Index IS was defined as the first stroke registered in the NNEUR for the patient from 2009 to 2016. Recurrent IS was defined as any IS event recorded by the participating hospitals after the index IS for a specific patient in the NNEUR database. Malaysian adults aged above 18 years with a history of IS who were registered with the NNEUR were included. Non-Malaysian citizens and patients with diagnoses other than IS were excluded from the study. The minimum number of events needed to develop this prognostic model was calculated as 228 using an online sample size calculator for survival analysis (Sample Size Calculators, sample-size.net).

Stroke Registry in Malaysia
The NNEUR in Malaysia was established in 2009. It has recorded stroke cases from a multi-ethnic population across 13 states in the country. As a multicentre, hospital-based registry, the NNEUR aims to provide comprehensive epidemiological data on the country's stroke statistics, trends, and management. The registry's development is funded by the Ministry of Health (MOH), Malaysia. A comprehensive description of the NNEUR has been published previously (25).

Ethics Approval
Ethical approval for this study was obtained from the Medical Research and Ethics Committee (MREC), Ministry of Health, Malaysia (Research ID: NMRR-08-1631-3189). All methods were performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all subjects included in this study.

Collected variables
Demographic data and concomitant diseases, including DM, HTN, HPLD, IHD, and hyperuricemia, were tested as covariates. Concomitant diseases were defined by physician diagnosis, by the patients' electronic records, or by deduction from the medication history and the medications prescribed at discharge.

The probability of remaining free of recurrent IS was derived from the hazard as

$$S(t) = \exp\left(-\int_0^t h(u)\,du\right) \qquad (1)$$

where S(t) is the time course of the probability of survival (the survivor function) calculated from the time-varying hazard h(t). S(t) is a function of the cumulative hazard over the interval from time zero to time t and describes the probability of not experiencing any recurrent IS within this interval.

Data for external validation
The base model was developed by exploring different functions for the hazard h(t), starting from a simple time-independent constant hazard and then gradually progressing to more complex functions, including Gompertz and Weibull, according to Equations (2). Between-subject variability around the hazard was estimated, assuming an exponential distribution for the random effect.
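As an illustration of the three candidate hazard shapes named above, the following Python sketch evaluates a constant, a Gompertz and a Weibull hazard and converts each into a survival probability by numerically integrating the cumulative hazard. The parameterizations, the parameter names (`lam`, `gamma`) and the numerical values are assumptions chosen for illustration; they are common textbook forms and are not taken from the study's NONMEM code.

```python
import math

def constant_hazard(t, lam):
    # Time-independent hazard: h(t) = lam
    return lam

def gompertz_hazard(t, lam, gamma):
    # Assumed Gompertz form: h(t) = lam * exp(gamma * t)
    return lam * math.exp(gamma * t)

def weibull_hazard(t, lam, gamma):
    # Assumed Weibull form: h(t) = lam * gamma * (lam * t)^(gamma - 1)
    return lam * gamma * (lam * t) ** (gamma - 1)

def survival(hazard, t, n_steps=10000, **params):
    # S(t) = exp(-cumulative hazard); the integral from 0 to t is
    # approximated with the midpoint rule.
    dt = t / n_steps
    cum_hazard = sum(hazard((i + 0.5) * dt, **params) for i in range(n_steps)) * dt
    return math.exp(-cum_hazard)

# Hypothetical parameter values, for illustration only
print(survival(constant_hazard, 2.0, lam=0.1))
print(survival(gompertz_hazard, 2.0, lam=0.1, gamma=-0.5))
print(survival(weibull_hazard, 2.0, lam=0.1, gamma=1.5))
```

A negative `gamma` in the Gompertz form gives a hazard that declines with time, while a positive value gives an increasing hazard.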

Development of the covariate model
Possible explanatory variables that may influence or predict changes in the hazard were explored by including each explanatory variable in the hazard function. A parameter, $\beta_n$, for each of the n explanatory variables, $X_n$, was estimated using the following equation:

$$h(t) = h_0(t) \cdot \exp\left(\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n\right)$$

where $h_0$ is the baseline hazard and $\beta_n$ is the coefficient for the explanatory variable $X_n$, describing how the hazard varies with that variable. Exponentiating the coefficient gives the hazard ratio (HR), which reflects the influence of the explanatory variable relative to the hazard when the variable is not present.
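The relationship between coefficients and hazard ratios can be sketched in a few lines of Python. The function name `hazard_with_covariates` is hypothetical, the baseline hazard of 1.0 is a placeholder, and the example assumes binary covariates (1 = condition present) with no interaction terms. The hazard ratios are the ones reported in this paper for HPLD, IHD and HTN.

```python
import math

def hazard_with_covariates(h0, betas, xs):
    # h = h0 * exp(sum of beta_n * X_n), the log-linear covariate model
    return h0 * math.exp(sum(b * x for b, x in zip(betas, xs)))

# Reported hazard ratios: HPLD 2.64, IHD 2.21, HTN 1.97.
# Each coefficient is the natural log of the corresponding hazard ratio.
reported_hr = {"HPLD": 2.64, "IHD": 2.21, "HTN": 1.97}
betas = [math.log(hr) for hr in reported_hr.values()]

# Relative hazard (baseline set to 1.0) for a patient with HPLD only
print(hazard_with_covariates(1.0, betas, [1, 0, 0]))

# Relative hazard for a patient with all three conditions,
# assuming the effects combine multiplicatively
print(hazard_with_covariates(1.0, betas, [1, 1, 1]))
```

Under these assumptions the effects multiply, so the combined multiplier for all three conditions is the product of the individual hazard ratios.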
Initially, the covariates were tested in a univariate manner, i.e. each covariate relationship was evaluated on the base hazard individually. Then, based on the results, covariate relationships were identified for a systematic covariate search using a stepwise approach, i.e. stepwise forward inclusion followed by backward elimination (27).
In the forward inclusion, the statistical significance level was set to P < 0.05, which corresponds to a reduction in the objective function value (OFV) of at least 3.84 for one degree of freedom (addition of one covariate parameter).
In the backward elimination, the significance level was set to P < 0.01, corresponding to an increase in the OFV of at least 6.64, for one degree of freedom, for a covariate to be kept in the model.
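The OFV cut-offs quoted above are critical values of the chi-square distribution with one degree of freedom. The sketch below recovers them using only the Python standard library, relying on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable. The function names are hypothetical, and the example OFV drop of 5.2 is illustrative only.

```python
from statistics import NormalDist

def chi2_1df_critical(p):
    # For 1 degree of freedom, the chi-square critical value at level p
    # equals the squared two-sided standard normal quantile.
    z = NormalDist().inv_cdf(1 - p / 2)
    return z * z

def passes_threshold(delta_ofv, p):
    # True if the change in OFV meets or exceeds the critical value
    return delta_ofv >= chi2_1df_critical(p)

print(round(chi2_1df_critical(0.05), 2))  # forward inclusion cut-off
print(round(chi2_1df_critical(0.01), 2))  # backward elimination cut-off

# A covariate lowering the OFV by 5.2 would enter at P < 0.05
# but would not be retained at P < 0.01.
print(passes_threshold(5.2, 0.05), passes_threshold(5.2, 0.01))
```

The computed values agree with the cut-offs in the text to within rounding.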

Model evaluation
Parameters were estimated using the LAPLACE method (ADVAN=6 TOL=9 NSIG=3) in NONMEM to obtain maximum likelihood estimates of the time-to-event parameters. The parametric repeated time-to-event (RTTE) analysis was performed using NONMEM v7.5 and Perl-speaks-NONMEM (PsN) version 4.1.0.7. Model selection was based on comparison of the OFV between models, bootstrap confidence intervals for the parameter estimates, and biological plausibility. The improvement in fit was measured by the decrease in the OFV generated by NONMEM (28). The difference in OFV between two hierarchical models is approximately $\chi^2$ distributed and can be tested for significance with $\chi^2_{1,0.05} = 3.84$. To evaluate the predictive performance of the model throughout model building, Kaplan-Meier visual predictive checks (VPCs) for internal and external validation were generated using Xpose4 (version 4.7.1) (29,30) in RStudio (version 1.1.456, RStudio, Inc., Boston, MA, http://www.rstudio.com/). The plots were based on 1000 simulated datasets. To enable simulations at time points where no clinical observations had been made, extra dummy time points up to 7.37 years were added to the dataset for all individuals. Parameter uncertainty was evaluated through the relative standard error (RSE) obtained from the sampling importance resampling (SIR) method (31).
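A Kaplan-Meier VPC compares the Kaplan-Meier curve of the observed data with the range of Kaplan-Meier curves computed from the simulated datasets. The sketch below shows only the underlying Kaplan-Meier estimator for the time to a first event, written in plain Python. It is not the Xpose4 implementation used in the study, and the small dataset is invented for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_censored = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the conditional survival at t
            surv *= 1 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_events + n_censored
    return curve

# Hypothetical follow-up times (years) and event indicators
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1]):
    print(t, round(s, 3))
```

In a VPC, the same estimator would be applied to each simulated dataset, and a prediction interval across those curves would be plotted around the observed curve.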

External validation
Data from 2692 patients with and without recurrent IS were used to validate the final model externally. The parameter estimates obtained from the final model were used to simulate 1000 replicates of the dataset and to plot the VPC. The predictive performance of the final model was then evaluated by its ability to predict the probability of not having recurrent IS in the validation data, by overlaying the VPC on the Kaplan-Meier curve of the validation data.

Clinical application of the developed model
An online prognostic calculator for the risk of recurrent IS was developed based on the final model. The probability of early (within a year) and late (2 years and 4 years after the index IS) IS recurrence was predicted for two clinical scenarios using the calculator. The probability of recurrent IS was calculated for each scenario:
Scenario 1: A patient with a history of IHD, HTN and HPLD had a first IS attack.
Scenario 2: A patient with no concomitant diseases had a first IS attack.

Results
Out of 7697 subjects, 333 patients (4.32%) developed a recurrent IS within the maximum 7.37 years of follow-up. The median time to the first recurrent IS was 1.2 years, while the time to the second recurrent IS was 2.2 years. The study population included all age groups from young to elderly, with a median age of 63.47 years at the time of the index IS. As shown in Table 1, most of the patients were females (4289, 55.72%). Smokers made up 48% of the study population. Of the 7697 subjects, 3493 (45.38%) had diabetes before the index IS and 5506 (71.5%) had HTN. A total of 2028 subjects (26.34%) had HPLD, 879 (11.4%) had IHD, and 3.4% had AF before the index IS.

Baseline hazard model
The hazard function that gave the best result with regard to OFV, clinical plausibility, and Kaplan-Meier plots was a Gompertz model defining the hazard for four time intervals (Table 2). Without the influence of any significant predictors, the hazard of recurrent IS was 0.71 within the first month and 0.0224 between 1 and 3 months after the index IS, while the hazard of recurrent IS more than 3 months after the index IS was small but non-zero (0.0002) (Figure 1).

Concurrent diseases influencing the risk of having recurrent IS after index IS
Table 4 shows the covariates retained in the final model. The recurrent IS rate in those with HPLD before the index IS was 2.65 times higher than in those without HPLD prior to the index IS (Figure 2). Kaplan-Meier VPCs for recurrent IS after the index IS showed good predictions (Figure 3). Panels (a) and (b) of Figure 3 indicate that the final model described the observed data adequately for the internal and external validation, respectively, while panels (c), (d), and (e) compare the VPCs between IS patients who had HPLD, IHD, and HTN, respectively, and those who did not have these concurrent diseases.

Clinical application of the developed model
A calculator tool (MyReCuRIS) has been developed for this model and can be accessed via this link: https://www.calconic.com/calculator-widgets/recurrent-stroke cal/60e3a89aa17c7c002c61f655?layouts=true. The probability of IS recurrence after a given period was estimated using the calculator for the two clinical scenarios. Scenario 1: A patient with a history of IHD, HTN and HPLD had a first IS attack. The probability of IS recurrence was calculated as 2.48%, 6.32%, and 3.23% after 1, 2 and 4 years from the first IS, respectively. Scenario 2: A patient with no concomitant diseases had a first IS attack. The probability of IS recurrence was calculated as 0.2%, 0.5%, and 0.28% after 1, 2 and 4 years from the first IS, respectively.

Discussion
To our knowledge, this is the first study to incorporate the predicted hazard of recurrent IS in the absence of risk factors, at different time points after the first IS, as a component of a prognostic model of recurrent IS. A previous study (32) reported a constant baseline hazard of recurrent IS over time, which assumes that the hazard of recurrent IS in the absence of risk factors is constant throughout the study period. In our population, the baseline hazard of recurrent IS was predicted to be non-zero even three months after the first IS attack. This indicates that, in the absence of any risk factors, the hazard of recurrent IS is present and changes at different time points after the first attack. The 'natural history' of stroke is postulated to play a role in recurrent IS. In addition to reducing the risk factors for recurrent IS, this 'natural history' of IS should be taken into consideration when identifying an optimal secondary preventive treatment.
In addition to the baseline hazard of recurrent IS, the influence of time was incorporated to predict recurrent IS. The hazard of having a recurrent IS during the first 30 days after the index IS was nearly twice that of the subsequent intervals. These results are consistent with previous findings that the maximum incidence of recurrent stroke occurred in the first 30 days after the initial stroke (33,34).
Recurrent stroke is associated with increased disability and mortality compared with the index stroke (35). Even with appropriate secondary prevention, the risk of recurrence after IS is high, especially in the early phase after stroke (36). It has been reported that the risk of stroke recurrence is higher within the first year after the initial stroke (between 6% and 14%) than in subsequent years (4% annually) (37)(38)(39). A more recent study showed that the incidence of stroke recurrence was highest during the first year after the index stroke (12.8%), with a declining annual rate of 6.3% during the second year and 5.1% (95% CI, 4.0-6.5) during the third year after the index stroke (15). In the current model, the hazard of recurrent IS was estimated as increasing exponentially with time, which is in concordance with previous reports (13). This may suggest that disease progression over time influences the prognosis and other risk factors; further studies are required to investigate preventive measures that could alter this progression and minimise the hazard of recurrent IS.
Moreover, this study also found that the hazard of recurrence increased by half in the first 6 months after the index IS. These findings may guide clinicians to maintain close monitoring and intervention, especially during the first month, and suggest a need for more intensive patient follow-up to ensure adherence to, and efficacy of, the secondary preventive drugs given to IS patients during the first 6 months after the index IS.
In this study, IHD, HPLD, and HTN were identified as independent predictors of recurrent IS. These findings are consistent with previously reported data (20)(21)(22)(23). Additionally, this model quantified the hazard of recurrent IS in patients with a history of HPLD, IHD, and HTN. Having HPLD, IHD or HTN was found to increase the hazard of developing recurrent IS 2.64-fold, 2.21-fold, and 1.97-fold, respectively. Effective management of these co-morbidities is necessary to reduce the risk of recurrent IS. The developed TTE model describes the hazard of recurrent IS during different time intervals after the index IS and in the presence of concurrent diseases, and allows the simulation of RTTE data based on the final model. The developed tool may help physicians to identify patients at high risk of recurrent stroke through the estimated risk, which may help in planning personalised care after the index IS to prevent recurrence.

Limitations
This was a retrospective study based on the available data from the National Stroke Registry of Malaysia. Therefore, the first stroke captured in the NNEUR from 2009 to 2016 was assumed to be the first-ever stroke experienced by the patient. Data on any TIA or stroke before the establishment of the NNEUR were not available and were not considered in the current study. Due to the nature of the data captured from the registry database, the comorbidities were analysed independently. Nevertheless, this was a population-based study with a large sample representing various ethnic groups across the country. This model may provide an insight into the importance of frequent follow-up, especially in the early period (for example, within the first 6 months to one year), and may thus prompt a positive shift in follow-up schedules for the Malaysian population during management to prevent recurrent IS.

Conclusion
Incorporating time into the prediction of recurrent IS risk may contribute positively to predicting the prognosis of recurrent IS. In the absence of significant risk factors, the predicted hazard of recurrent IS was prominent in the first month after the index IS and was non-zero even three months or later after the index IS. Optimal secondary preventive treatment should not focus only on reducing the common established risk factors but should also incorporate the 'natural risk' of IS recurrence. Future studies to determine which secondary prevention measures alter the baseline hazard of IS recurrence are vital. In addition to concomitant diseases, time is also a vital predictor of the risk of recurrent IS in this population. Future studies are required to determine drugs that could alter the changes in the hazard of recurrent IS over time. These results may add to the knowledge on patient follow-up schedules during management to prevent recurrent IS.

Abbreviations: λ, baseline hazard; ∆OFV, change in objective function value; RSE, relative standard error; 95% CI, 95% confidence intervals.

Abbreviations: ∆OFV, change in objective function value; aHR, adjusted hazard ratio; HPLD, hyperlipidemia; HTN, hypertension; IHD, ischemic heart disease; RSE, relative standard error; 95% CI, 95% confidence intervals.

Figure 1
Baseline hazard during different time intervals after index IS: first month, after the first month until 3 months, after 3 months until 3 years, and after 3 years.

Figure 2
Effect of concurrent diseases on hazard of having recurrent IS after index IS.