Study cohort
The E3N (Etude Epidémiologique auprès de femmes de la Mutuelle Générale de l’Education Nationale (MGEN)) cohort study was set up in 1990 and included 98,995 women born between 1925 and 1950 and affiliated with the MGEN, a health insurance plan for workers in the education system and their spouses. The objective was to identify risk factors for cancer and chronic conditions in women. In 1990, participants signed an informed consent form in accordance with the requirements of the French National Commission for Data Protection and Privacy. Part of this study is the French component of the European Prospective Investigation into Cancer (EPIC) 12.
Participants were asked every 2–3 years to complete self-administered questionnaires to provide and update information on medical events and lifestyle (1990, 1992, 1993, 1995, 1997, 2000, 2002, 2005, 2008, and 2011). The baseline for the current study was the date on which participants answered the 1992 questionnaire. Participants were excluded from this study if, before 1992, they had experienced stroke, coronary heart disease, or cancer, or had died (n = 6,452), or had experienced venous thromboembolism (VTE) (n = 390). We also excluded 721 participants with no reported baseline body mass index (BMI). The final study population included 91,707 women.
Assessment of venous thromboembolism cases
Non-fatal incident VTE cases were identified from self-reports in the follow-up questionnaires sent to the study participants after baseline. Before the 2008 questionnaire, participants were asked to report VTE events (without distinguishing between deep and superficial VTE) as yes/no, with the corresponding date, or pulmonary embolism (PE), with the corresponding date. From 2008 onwards, questionnaires asked specifically about deep vein thrombosis (DVT) or PE and asked participants not to report superficial VTE. A flow chart of case identification is presented in the supplementary material (supplementary Fig. 1). All cases included were validated, either by imaging procedures or by reimbursement for antithrombotic medications following the event.
When participants reported VTE or PE events in the 2005 or preceding questionnaires, they were contacted by a further letter asking them to provide medical documents relating to the event. In addition, a questionnaire covering potential predisposing factors for thrombosis and the characteristics of the event was sent to the physicians who followed these participants, permitting classification of the events as primary or secondary. To be validated, clinical events had to be diagnosed using an imaging procedure. PE was defined as the presence of a positive pulmonary angiography, a positive helical computed tomography, or a high-probability ventilation/perfusion lung scan. DVT had to be diagnosed by compression ultrasonography or venography.
For DVT/PE identified from the 2008 and 2011 questionnaires, we used the MGEN database to confirm that each case was associated with a relevant antithrombotic prescription (Heparin, Tinzaparin, Fragmin, Fraxiparin, Enoxaparin, Acenocoumarol, Fluindione, Warfarin, Dabigatran, Rivaroxaban, or Apixaban) recorded at least twice in the year following the VTE. Self-reported cases that could not be validated by antithrombotic medication records were not considered valid cases. We cross-referenced self-reported VTE with self-reported hospital admissions to determine whether there was any associated cancer, immobilisation, hospitalisation, surgery (e.g. hip and knee surgeries), or trauma, which would indicate a secondary VTE. All other cases were considered primary events with no obvious cause.
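The prescription-based validation rule can be illustrated with a minimal sketch. This is not the study's code: the function name, the 365-day window, and the choice to count prescriptions dated on the event day itself are assumptions made for illustration only.

```python
from datetime import date, timedelta

def is_validated(event_date, prescription_dates, min_count=2, window_days=365):
    """Return True if at least `min_count` prescriptions fall in the window after the event.

    Assumptions (not from the study): the "year following" is taken as 365 days,
    and a prescription dated on the event day counts.
    """
    window_end = event_date + timedelta(days=window_days)
    in_window = [d for d in prescription_dates if event_date <= d <= window_end]
    return len(in_window) >= min_count

# Hypothetical participant: two prescriptions inside the window, one well outside it
event = date(2009, 3, 10)
claims = [date(2009, 3, 12), date(2009, 6, 1), date(2011, 1, 5)]
validated = is_validated(event, claims)
```

A self-reported case with fewer than two qualifying prescriptions would be returned as not validated and, under the rule described above, dropped from the case set.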
Fatal cases were identified from national death registers using International Classification of Diseases (ICD)-9 codes 415.1 and 453.9 and ICD-10 codes I26.0 and I26.9.
Amongst the 91,711 participants in the study, 1,443 incident first cases of VTE were identified during follow-up (1992–2005) as previously described 13, and a further 1,277 were identified from the 2008 and 2011 questionnaires. Cases of superficial and upper extremity VT (n = 848) and cases for which the type could not be determined (n = 223) were not considered. As a result, 1,649 incident cases of VTE, consisting of 505 cases of PE and 1,144 cases of DVT with no evidence of PE, were considered.
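As a quick arithmetic check, the case counts reported in this paragraph can be reconciled as follows. The variable names are illustrative; the numbers are those given in the text.

```python
# Counts taken from the paragraph above
cases_1992_2005 = 1443          # identified during 1992-2005 follow-up
cases_2008_2011 = 1277          # identified from the 2008 and 2011 questionnaires
excluded_superficial_upper = 848
excluded_undetermined = 223

retained = (cases_1992_2005 + cases_2008_2011
            - excluded_superficial_upper - excluded_undetermined)

pe_cases, dvt_cases = 505, 1144

# The retained total should equal the sum of PE and DVT cases
assert retained == pe_cases + dvt_cases
print(retained)  # 1649
```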
Assessment of cardiovascular risk factors
All cardiovascular risk factors considered were updated at each questionnaire cycle for this study.
As in previous studies 14, treated hypertension was self-reported as yes/no in the baseline questionnaire and in all follow-up questionnaires. Beginning in 2004, a participant was defined as a case of hypertension if she self-reported hypertension or was reimbursed for anti-hypertensive medications (Anatomical Therapeutic Chemical Classification System codes C02, C03, C07, C08, and C09). We have previously observed a positive predictive value of 82% for self-reports when cross-checked against the drug reimbursement database for anti-hypertensive medications.
As in previous studies 15,16, type-2 diabetes cases were based on self-reports, which were then validated through a specific questionnaire mailed to women who had reported type-2 diabetes, confirming elevated glucose concentration at diagnosis (fasting ≥ 7.0 mmol/l or random glucose ≥ 11.1 mmol/l), treatment with diabetes drugs, and/or fasting glucose or HbA1c ≥ 7% (53.0 mmol/mol), respectively 17. Cases occurring after 2004 were identified using the MGEN reimbursement database. All women reimbursed for glucose-lowering medications at least twice in a given year were classified as having type-2 diabetes.
Dyslipidaemia was self-reported at baseline and at all follow-up questionnaires. Participants were asked whether they had received a diagnosis of abnormal cholesterol from their doctor, or whether they required treatment to control their cholesterol. The use of lipid-lowering medications was accounted for from 2004 onwards, using the MGEN database. Participants who did not provide information on dyslipidaemia were classified as unknown. Prior to 2004, statin use was considered unknown. In a secondary analysis, we classified participants with both self-reported dyslipidaemia and statin reimbursement as ‘treated dyslipidaemia’. Separate classes were also created for ‘reported dyslipidaemia and no statin reimbursement’ and ‘no reported dyslipidaemia and statin reimbursement’.
We assessed usual physical activity from questionnaires that included questions on the time spent walking (to work, shopping, and leisure time), cycling (to work, shopping, and leisure time), doing housework, and in sports activities. Weekly energy expenditure in metabolic equivalents (METs) was estimated by multiplying the average hourly MET value for each item, taken from the Compendium of Physical Activities 18, by the reported duration of that activity. Physical activity was categorised into tertiles of the population distribution. ABO blood group and smoking status were based on self-reports; at each questionnaire, participants were classified as smoker, ex-smoker, or never smoker.
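The physical activity score and its tertile categorisation can be sketched as below. The MET values per hour are placeholders invented for illustration, not the Compendium values used in the study, and the cut-point method (the default of `statistics.quantiles`) is likewise an assumption.

```python
import statistics

# Hypothetical MET values per hour; the study used Compendium of Physical Activities values
MET_PER_HOUR = {"walking": 3.0, "cycling": 6.0, "housework": 3.0, "sports": 6.0}

def met_hours_per_week(hours_per_week):
    """Sum of (MET value x weekly hours) over the reported activities."""
    return sum(MET_PER_HOUR[activity] * hours for activity, hours in hours_per_week.items())

def tertile_labels(values):
    """Label each value 1, 2, or 3 according to the tertile cut points of `values`."""
    # With n=3, statistics.quantiles returns the two cut points
    low_cut, high_cut = statistics.quantiles(values, n=3)
    return [1 if v <= low_cut else 2 if v <= high_cut else 3 for v in values]

# Hypothetical participant: 2 h walking and 1 h sports per week
score = met_hours_per_week({"walking": 2, "sports": 1})
```

In practice the cut points would be computed once on the whole study population and then applied to each participant's score.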
Assessment of adjustment variables
Self-reported height and weight were used to calculate body mass index (BMI), defined as weight (kg) divided by squared height (m²). In this cohort, self-reported anthropometry has been shown to be reliable in a validation study 19.
Use of menopausal hormone therapy (MHT) was assessed using a booklet containing photos of all types of oestrogens and progestogens, as previously described 13. Age at menopause was defined as (in decreasing order of priority) age at last menstrual period, age at bilateral oophorectomy, self-reported age at menopause, age at start of MHT, or age at the start of menopausal symptoms. If none of these was available, the median age at menopause for the cohort (51 years for natural menopause, 47 years for artificial menopause) was imputed. Women were considered menopausal at baseline if any of these events occurred before the start of follow-up. Parity was based on self-reports. Family history of cardiovascular disease (stroke or coronary disease in either parent) was based on self-reports. Incident fractures, cancers, heart attacks, and strokes occurring during follow-up were based on cases validated by specific questionnaires sent to the women and their practitioners.
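The priority rule for age at menopause amounts to taking the first available value from an ordered list, with the cohort median as a fallback. A minimal sketch follows; the field names are hypothetical, while the priority order and the median ages are those stated above.

```python
# Cohort medians stated in the text, used when no source of information is available
MEDIAN_AGE = {"natural": 51, "artificial": 47}

# Hypothetical field names, listed in decreasing order of priority
PRIORITY = [
    "last_menstrual_period",
    "bilateral_oophorectomy",
    "self_reported_menopause",
    "mht_start",
    "symptom_start",
]

def age_at_menopause(record, menopause_type):
    """Return the highest-priority available age, else the cohort median for the type."""
    for field in PRIORITY:
        age = record.get(field)
        if age is not None:
            return age
    return MEDIAN_AGE[menopause_type]

# Hypothetical record: self-reported age takes priority over age at start of MHT
example = age_at_menopause({"mht_start": 52, "self_reported_menopause": 50}, "natural")
```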
Statistical analysis
As risk-factor status can change over time, we updated values in the models using information from follow-up questionnaires, similar to the method used by Wattanakit et al. 11, i.e. the dataset contained multiple rows for each participant. If a participant reported hypertension, diabetes, or dyslipidaemia in one questionnaire, she was considered to have this condition for the remainder of follow-up.
The outcomes considered were, first, any first VTE; then first PE and first DVT separately; and finally the status of the event, i.e. primary VTE and secondary VTE. To account for competing events when considering specific types of VTE, participants who experienced another type of VTE were censored at the time of that competing event.
Potential confounders were selected with the help of directed acyclic graphs (supplementary Fig. 2). Risk factors were assessed one at a time using Cox proportional hazards models with age as the time scale in order to account for the effect of age; we did not mutually adjust for hypertension, diabetes, and dyslipidaemia, in order to reduce the likelihood of introducing collider bias. Models were first fitted with age as the time scale only (model 1), then adjusted for statin use (yes/no), education level (high-school/no high-school/university), parity (0, 1, > 1), menopausal status (yes/no), ever use of MHT (yes/no), and type of menopause (natural/artificial) (model 2), and finally additionally adjusted for BMI (model 3). Time at entry was the age at the beginning of follow-up (i.e. the age when the participant answered the questionnaire sent out in 1992); exit time was the age when the participant was diagnosed with VTE, died (dates of death were obtained from the participants’ medical insurance records), was lost to follow-up, or reached the end of the follow-up period (December 31, 2011), whichever occurred first.
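The "once reported, always present" rule applied to the multi-row dataset can be sketched as follows. This is an illustration only: the row layout and field names are hypothetical, and rows are assumed to be sorted by questionnaire year for a single participant.

```python
def carry_forward_conditions(rows, conditions=("hypertension", "diabetes", "dyslipidaemia")):
    """Return copies of `rows` in which a condition, once reported, stays True.

    `rows` is a list of dicts for ONE participant, sorted by questionnaire year
    (hypothetical layout). A missing or falsy entry is treated as "not reported".
    """
    ever_reported = {c: False for c in conditions}
    updated_rows = []
    for row in rows:
        updated = dict(row)  # do not modify the input rows
        for c in conditions:
            ever_reported[c] = ever_reported[c] or bool(row.get(c))
            updated[c] = ever_reported[c]
        updated_rows.append(updated)
    return updated_rows

# Hypothetical participant reporting hypertension in 1995 only
rows = [
    {"year": 1992, "hypertension": False},
    {"year": 1995, "hypertension": True},
    {"year": 1997, "hypertension": False},
]
result = carry_forward_conditions(rows)
```

Each output row then corresponds to one follow-up interval with the covariate values in force during that interval.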
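With age as the time scale, each participant contributes an interval from age at baseline to age at the earliest of VTE, death, loss to follow-up, or end of follow-up. The sketch below illustrates that rule; the function names, the 365.25-day year, and the tie-breaking choice are assumptions, not details from the study.

```python
from datetime import date

END_OF_FOLLOW_UP = date(2011, 12, 31)  # end of follow-up stated in the text

def age_in_years(birth, when):
    # Assumption: a year is approximated as 365.25 days
    return (when - birth).days / 365.25

def follow_up_interval(birth, baseline, vte=None, death=None, lost=None):
    """Return (entry_age, exit_age, had_vte) for one participant."""
    # VTE is listed first so that it wins ties (assumption), e.g. an event dated on the day of death
    candidates = [(vte, True), (death, False), (lost, False), (END_OF_FOLLOW_UP, False)]
    observed = [(d, is_event) for d, is_event in candidates if d is not None]
    exit_date, had_vte = min(observed, key=lambda pair: pair[0])
    return age_in_years(birth, baseline), age_in_years(birth, exit_date), had_vte

# Hypothetical participant with a VTE in 2000 and death in 2005
entry_age, exit_age, had_vte = follow_up_interval(
    date(1940, 1, 1), date(1992, 6, 1), vte=date(2000, 6, 1), death=date(2005, 1, 1)
)
```

The resulting triple is the (start, stop, event) form that a Cox model with delayed entry expects.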
As VTE can be provoked by bone fractures, other cardiovascular diseases, or cancer, we also considered, as a sensitivity analysis, a model controlled for incident fractures (time dependent, yes/no), cancers (time dependent, yes/no), and both heart attack and stroke (time dependent, yes/no) during follow-up. A further sensitivity analysis excluded cases occurring after 2005, which were not part of the mailing conducted to validate the VTE cases. Finally, where associations were unexpected, we considered models using only baseline variables to determine whether the finding was due to the inclusion of collider variables during follow-up.
As a hypothesis-generating exercise, we wished to determine whether the associations of these risk factors with VTE were consistent across blood groups, which are a major risk factor for VTE. As blood group is a determinant of blood lipid levels 20, hypertension 21, diabetes 22, and coagulation factors including von Willebrand factor 23, we considered effect modification to be plausible for dyslipidaemia, hypertension, and diabetes. For this analysis, 7,834 participants were excluded because of missing data on blood group.
Missing values (occurring in less than 5% of data) were imputed as the mean for continuous variables and the median for categorical variables. During follow-up, if a value was missing, the previous value was carried forward.
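The two imputation rules for continuous variables can be illustrated with a short sketch. The function names are illustrative, missing values are represented here as `None`, and the handling of a value missing before any observation (left as missing) is an assumption.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def carry_last_value_forward(values):
    """Replace a missing follow-up value with the most recent observed value.

    Assumption: a value missing before any observation is left as None.
    """
    filled, last_seen = [], None
    for v in values:
        if v is not None:
            last_seen = v
        filled.append(last_seen)
    return filled

# Hypothetical BMI values across participants, and one participant's follow-up series
bmi_across_participants = impute_mean([22.0, None, 26.0])
series_over_follow_up = carry_last_value_forward([1, None, None, 2, None])
```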
All statistical analyses were performed using R and RStudio, with the ‘survival’ package. The Bonferroni-corrected p-value threshold for statistical significance was 0.01. The proportional hazards assumption was assessed using the cox.zph function in R, which generates a p-value for the Pearson product-moment correlation between the scaled Schoenfeld residuals and the time transformation for each variable 24.