This is a study of n = 470 incident cases of ischemic stroke with a 1-year follow-up. The cases were drawn from a recently published case-control study, which was embedded in the Ludwigshafen population-based stroke registry covering about 93% of all stroke patients below the age of 80 years in the urbanized industrial area of Ludwigshafen. Figure 1 shows the corresponding flow chart.
The diagnosis of “stroke” is based on the definition of the World Health Organization.
Cases gave written informed consent. They were of white ethnicity and between 18 and 80 years of age. Exclusion criteria were additional previous events such as stroke of any etiology, acute transient ischemic attack, intracerebral, subdural or subarachnoid hemorrhage, or myocardial infarction within the previous 90 days, as well as dementia, severe aphasia and other relevant communication barriers, and withdrawal of consent.
These cases, i.e., the study population analyzed in this work, underwent a personal interview conducted by trained interviewers using a standardized questionnaire. The interviewers were guided by a handbook. Instructions and training were given to each of the 11 interviewers (6 medical doctors, 3 nurses and 4 medical students) prior to the start of the study. The quality of data collection was ensured by regular training and monitoring of the interviewers. Data were double-entered and checked for completeness and plausibility.
Sociodemographic data and behavioral factors were self-reported, whereas cardio-metabolic data were verified by medical personnel using medical records (F.P.). Previous risk diseases were likewise assessed on the basis of medical records, using standard definitions for hypertension, diabetes mellitus, hypercholesterolemia, and atrial fibrillation. Body weight and body height were both measured and assessed by questionnaire.
Active follow-up was conducted by telephone one, three and twelve months after the diagnosis. These contacts served exclusively to assess the occurrence of a subsequent stroke during the period since the previous contact. Therefore, no exact event time points were available for the event “subsequent stroke”. Vital status was obtained from the local population registry, and the date of death was recorded for deceased cases. Neither the severity nor the type of a second stroke was assessed in this follow-up.
Baseline characteristics of patients are summarized with descriptive statistical methods. Continuous variables are described by means and standard deviations. Categorical data are summarized as absolute and relative frequencies.
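As a purely illustrative sketch, such baseline summaries can be obtained in R as follows; the data frame `baseline` and its columns are hypothetical placeholders rather than the actual study variables.

```r
## Hypothetical data frame 'baseline' with one row per patient
## Continuous variables: mean and standard deviation
mean(baseline$age, na.rm = TRUE)
sd(baseline$age, na.rm = TRUE)

## Categorical variables: absolute and relative frequencies
table(baseline$sex)
prop.table(table(baseline$sex))
```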
To analyze the risk factors that might affect the time to death or subsequent stroke, we applied the weighted all-cause hazard ratio, thereby taking the occurrence of competing events into account [13, 14]. Thus, different “relevance weights” were applied to the two competing components “death” and “subsequent stroke”. For choosing the weights, we followed the recommendations for finding a weighting scheme described by Ozga & Rauch, which are also explained in the additional file [AdditionalFile.pdf, 14].
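In generic notation (which may deviate from that used in [13, 14]), the weighted all-cause hazard combines the cause-specific hazards of the competing components with their relevance weights, and the weighted all-cause hazard ratio contrasts this quantity between the groups defined by a risk factor:

\[
\lambda_w(t) = \sum_{k} w_k \, \lambda_k(t), \qquad
\mathrm{HR}_w(t) = \frac{\sum_{k} w_k \, \lambda_k^{(1)}(t)}{\sum_{k} w_k \, \lambda_k^{(0)}(t)},
\]

where \(\lambda_k(t)\) denotes the cause-specific hazard of event type \(k\), \(w_k\) the corresponding relevance weight, and the superscripts \((1)\) and \((0)\) the two groups being compared.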
As a first step, we determined “death” to be the most relevant outcome, and thus a weight of 1 was assigned to “death”. In a second step, the event “subsequent stroke” was set in relation to “death” by answering the purely theoretical question of how many strokes are considered to be as harmful as one death. The clinical consideration that a stroke represents a severe clinical event but may be survived, thereby resulting in more stroke events than deaths within the observed time period, led us to choose a weight of 0.7 for the event “stroke”. Additionally, the “disability weights” described in the “Global Burden of Disease” study 2016 supported our decision. Although the definitions of these “disability weights” are not directly comparable to our approach (i.e., they were based on clinical characteristics documented in medical records, such as speech impairment or being confined to bed or a wheelchair), they also set the event “stroke” in relation to “death”. Disability weights for ischemic stroke ranged from 0.019 to 0.588, with a maximal upper confidence limit of 0.744.
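With these relevance weights, the weighted all-cause hazard introduced above reads in each group:

\[
\lambda_w(t) = 1 \cdot \lambda_{\text{death}}(t) + 0.7 \cdot \lambda_{\text{stroke}}(t).
\]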
The weighted all-cause hazard ratio was originally introduced for a two-group comparison. For our analysis, we extended this effect measure to adjust for confounders [see AdditionalFile.pdf for methodological details]. In addition to the weighted all-cause hazard ratio, we also provide results for the non-weighted all-cause hazard ratio resulting from the Cox proportional hazards model and for the cause-specific hazard ratios of each component separately. Along with the analyses using other weighting schemes, these latter analyses are interpreted as sensitivity analyses. All hazard ratios are reported with 95% confidence intervals; for the weighted all-cause hazard ratio, these are estimated via bootstrap sampling with 10,000 runs. Risk diseases and risk factors evaluated at baseline were atrial fibrillation, hypertension, diabetes mellitus, hypercholesterolemia, and smoking. Furthermore, all estimates were adjusted for age (continuous) and sex.
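A minimal R sketch of the non-weighted sensitivity analyses (all-cause and cause-specific Cox models) and of the percentile bootstrap for the weighted estimate is given below. The data frame `dat`, its variable coding, and the function `estimate_whr()` are hypothetical placeholders; the actual weighted estimator is the confounder-adjusted extension described in the additional file and in [13, 14].

```r
library(survival)

## Hypothetical analysis data set 'dat' with one row per patient:
##   time   - follow-up time (days)
##   status - 0 = censored, 1 = death, 2 = subsequent stroke
## plus the baseline covariates listed above.

## Non-weighted all-cause hazard ratios: any first event counts as an event
fit_all <- coxph(Surv(time, status > 0) ~ af + hypertension + diabetes +
                   hypercholesterolemia + smoking + age + sex, data = dat)

## Cause-specific hazard ratios: the competing event is treated as censoring
fit_death  <- coxph(Surv(time, status == 1) ~ af + hypertension + diabetes +
                      hypercholesterolemia + smoking + age + sex, data = dat)
fit_stroke <- coxph(Surv(time, status == 2) ~ af + hypertension + diabetes +
                      hypercholesterolemia + smoking + age + sex, data = dat)

## Percentile bootstrap (10,000 resamples) for the weighted all-cause hazard ratio;
## estimate_whr() is a placeholder for the estimator described in the additional file.
set.seed(1)                           # R's default RNG is the Mersenne Twister
boot_est <- replicate(10000, {
  idx <- sample(nrow(dat), replace = TRUE)
  estimate_whr(dat[idx, ], weights = c(death = 1, stroke = 0.7))
})
quantile(boot_est, c(0.025, 0.975))   # 95% bootstrap confidence interval
```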
The stroke risk factors considered in this study were defined as follows (unpublished study report); an illustrative coding of these definitions is sketched after the list:
- atrial fibrillation: “persistent” and “paroxysmal”
- hypertension: values ≥140/90 mmHg at rest at three time points
- diabetes mellitus: fasting plasma glucose >125 mg/dl (>7 mmol/l), peak blood glucose >200 mg/dl (>11 mmol/l), or two-hour value in an oral glucose tolerance test >200 mg/dl (>11 mmol/l)
- smoking: consumption of at least one cigarette/day, five cigarettes/week, one pack of cigarettes/month, or two cigars or two pipes/week over a period of six months or more during life
- hypercholesterolemia: fasting total serum cholesterol ≥200 mg/dl
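Purely as an illustration of how these definitions translate into binary baseline covariates, and assuming hypothetical raw variables in the data frame `dat` (the names are placeholders, and simplifications such as ignoring the requirement of three blood-pressure measurements are deliberate), one might code:

```r
## Illustrative derivation of binary risk-factor indicators (hypothetical variable names)
dat$hypertension <- with(dat, sbp >= 140 | dbp >= 90)            # simplified: single measurement
dat$diabetes     <- with(dat, fpg > 125 | peak_glucose > 200 |   # all values in mg/dl
                              ogtt_2h > 200)
dat$hyperchol    <- with(dat, total_chol >= 200)                 # fasting total serum cholesterol
dat$af           <- with(dat, af_persistent | af_paroxysmal)
```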
Since no information was available on the treatment of vascular risk factors and risk diseases, the therapeutic regimen could not be included as a predictor in the analysis.
As mentioned above, the exact time of an event was only available for deaths but not for non-fatal strokes; in the latter case, only time intervals were available (“interval censoring”). For unadjusted univariable analyses, several methods have been proposed to account for this, such as different score tests or a parametric model [20-23]. The parametric model can easily be extended to a model adjusting for confounders. However, since we have no information about the underlying event-time distributions and a misspecification of the distribution can introduce serious bias, we decided against a parametric approach and in favor of a naive approach. If the event time is known to fall into an interval, the naive approach assumes that the event occurred at the upper boundary of the interval, i.e., at the next observed time point. To assess the robustness of this naive approach, we repeated the analyses with the event times replaced by the lower boundary and by the middle of the interval. We used the statistical software R, version 3.5.1, for all analyses. The Mersenne Twister algorithm was used for the random number generation needed to calculate the bootstrap confidence intervals.
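The naive handling of the interval-censored stroke times and the two corresponding sensitivity analyses can be written down concisely; the variable names (`exact_time`, `t_lower`, `t_upper`) are hypothetical placeholders, and the naive time corresponds to the `time` variable used in the Cox model sketch above.

```r
## Hypothetical variables: status (0 = censored, 1 = death, 2 = subsequent stroke),
## exact_time (known for deaths and censored cases),
## t_lower / t_upper (interval boundaries for non-fatal strokes, e.g. the 3- and 12-month contacts)

## Naive main analysis: stroke assumed to occur at the upper interval boundary
dat$time       <- ifelse(dat$status == 2, dat$t_upper, dat$exact_time)

## Sensitivity analyses: lower boundary and interval midpoint
dat$time_lower <- ifelse(dat$status == 2, dat$t_lower, dat$exact_time)
dat$time_mid   <- ifelse(dat$status == 2, (dat$t_lower + dat$t_upper) / 2, dat$exact_time)
```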