Participants and data access
We used the UK Biobank database as the data source (application number 51671, approved August 2019). The UK Biobank is a large prospective cohort study of over 500,000 participants and has provided reliable population data for numerous epidemiological studies since participants' health information was first collected in 2006[25, 26]. All participants in the UK Biobank provided written informed consent at the time of inclusion in the cohort, and all information was available for scientific research. We first selected a cohort of 501,109 participants aged 37-73 years (female: 272,632; male: 228,477). At baseline, we excluded participants who had incomplete data (n=74,697), were lost to follow-up (n=1,297), or had a diagnosis of OA at any site or a self-reported history of OA (n=54,804). Finally, 370,311 participants (female: 195,700; male: 174,611) were included in this study; their baseline characteristics are shown in Table 1.
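The cohort flow described above can be checked with a few lines of arithmetic. The following Python sketch is purely illustrative (the study itself used R); the counts are taken from the text and the variable names are hypothetical.

```python
# Counts copied from the text; dictionary keys are illustrative names only.
initial = {"female": 272_632, "male": 228_477}
exclusions = {
    "incomplete_data": 74_697,
    "lost_to_follow_up": 1_297,
    "prevalent_or_self_reported_oa": 54_804,
}
final = {"female": 195_700, "male": 174_611}

n_initial = sum(initial.values())      # participants initially selected
n_excluded = sum(exclusions.values())  # participants removed by the three criteria
n_final = n_initial - n_excluded       # participants entering the analysis

# The sex-specific final counts should add up to the analysed cohort.
assert n_final == sum(final.values())
print(n_initial, n_excluded, n_final)  # 501109 130798 370311
```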
Measurements
Our previous work described the detailed measurement methods and quality control strategies[12, 13]. Briefly, all participants were invited to a physical examination centre for the collection of physical data and metabolic specimens. Waist circumference was measured twice consecutively at the level of the umbilicus using a tape measure during calm breathing. To minimize error, blood pressure was measured twice, 5 minutes apart, using an automated sphygmomanometer (HEM-7015IT; Omron, Kyoto, Japan). Fasting blood specimens were drawn by trained physicians, and blood glucose, HDL, triglyceride, and CRP concentrations were measured (Beckman Coulter (UK)). In addition, socio-demographic characteristics (age, gender, ethnicity, Index of Multiple Deprivation), lifestyle (smoking, alcohol consumption, physical activity participation), medical history (diabetes, osteoarthritis, hypertension, surgical history), and diet and medication (fruit and vegetable intake, dietary supplements, prescription drugs) were collected using a field questionnaire. Physical activity was assessed and categorized using questions adapted from the short International Physical Activity Questionnaire (IPAQ).
Outcome ascertainment
Disease diagnoses in the UK Biobank database were coded by professionals using ICD-10 codes and structured spreadsheets. We queried the database for the ICD-10 codes of OA events registered in 2006-2022 and identified most OA events (excluding spinal OA, polyosteoarthritis of unknown origin, other infectious OA, etc.). The diagnostic information came primarily from primary care records, hospital admission data, and self-report. Some participants had multiple diagnostic records; in these cases we used the first diagnosis as the outcome event. The ICD-10 codes used were: hand OA (M18, M18.0, M18.1, M18.2, M18.3, M18.4, M18.5 and M18.9); hip OA (M16, M16.0, M16.1, M16.2, M16.3, M16.4, M16.5, M16.6, M16.7 and M16.9); knee OA (M17, M17.0, M17.1, M17.2, M17.3, M17.4, M17.5 and M17.9); and polyarthrosis (M15.1, M15.2). Participants were followed from initial recruitment until the first diagnosis of OA, death, loss to follow-up, or the end of follow-up (December 30, 2021), whichever came first.
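As an illustration of the outcome definition above, the following minimal Python sketch maps the listed ICD-10 codes to OA sites, picks the first qualifying diagnosis, and computes the end of follow-up. It is not the authors' code (the analyses were run in R); the function names, the record format (date, code) and the example records are assumptions made for illustration.

```python
from datetime import date

def _codes(category, suffixes):
    """Build the code set for one ICD-10 category, e.g. M18, M18.0, ..., M18.9."""
    return {category} | {f"{category}.{s}" for s in suffixes}

# Code lists exactly as given in the text.
OA_CODES = {
    "hand": _codes("M18", "0123459"),
    "hip": _codes("M16", "012345679"),
    "knee": _codes("M17", "0123459"),
    "polyarthrosis": {"M15.1", "M15.2"},
}
STUDY_END = date(2021, 12, 30)  # end of follow-up stated in the text

def oa_site(code):
    """Return the OA site for an ICD-10 code, or None if the code is not included."""
    code = code.strip().upper()
    for site, codes in OA_CODES.items():
        if code in codes:
            return site
    return None

def first_oa_event(records):
    """records: iterable of (date, icd10_code). Return the earliest (date, site) or None."""
    events = []
    for when, code in records:
        site = oa_site(code)
        if site is not None:
            events.append((when, site))
    return min(events) if events else None

def follow_up_end(first_event=None, death=None, lost=None):
    """Earliest of first OA diagnosis, death, loss to follow-up, and study end."""
    return min(d for d in (first_event, death, lost, STUDY_END) if d is not None)

# Hypothetical participant: M47.8 (a spinal code) is ignored, hip OA comes first.
records = [
    (date(2015, 3, 2), "M17.1"),
    (date(2012, 6, 9), "M16.9"),
    (date(2010, 1, 5), "M47.8"),
]
event = first_oa_event(records)
print(event, follow_up_end(event[0]))
```

A participant with no qualifying code and no death or loss to follow-up is censored at the study end date.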
Definition of MetS and its components
MetS and its components were defined following the International Diabetes Federation (IDF) criteria[11, 27]. Central obesity was defined according to waist circumference (≥94 cm in men or ≥80 cm in women). Hypertension was defined as systolic blood pressure (SBP) ≥130 mmHg or diastolic blood pressure (DBP) ≥85 mmHg, or a previous diagnosis of or ongoing treatment for hypertension. Elevated triglycerides were defined as a plasma triglyceride level ≥1.7 mmol/L (150 mg/dL), a prior diagnosis of elevated triglycerides, or ongoing use of triglyceride-lowering medication. Reduced HDL was defined as plasma HDL <1.04 mmol/L (40 mg/dL) in men or <1.29 mmol/L (50 mg/dL) in women, or ongoing treatment for reduced HDL. Hyperglycemia was defined as fasting blood glucose ≥5.6 mmol/L (100 mg/dL), a prior diagnosis of type 2 diabetes, or ongoing treatment for type 2 diabetes. These five conditions are the MetS components. MetS was defined as central obesity plus any two or more of the other four components.
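The definition above translates directly into a small classification rule. The Python sketch below is illustrative only; the function and argument names are hypothetical, and prior diagnosis and treatment are collapsed into a single boolean flag per component. The blood pressure criterion is coded as SBP ≥130 or DBP ≥85 mmHg, as in the IDF definition.

```python
def mets_components(sex, waist_cm, sbp, dbp, tg, hdl, glucose,
                    known_hypertension=False, known_high_tg=False,
                    treated_low_hdl=False, known_diabetes=False):
    """Return the five MetS components as booleans.

    Units follow the text: waist in cm, blood pressure in mmHg,
    triglycerides, HDL and fasting glucose in mmol/L.
    The known_*/treated_* flags stand for prior diagnosis or ongoing treatment.
    """
    male = sex == "male"
    return {
        "central_obesity": waist_cm >= (94 if male else 80),
        "hypertension": sbp >= 130 or dbp >= 85 or known_hypertension,
        "elevated_triglycerides": tg >= 1.7 or known_high_tg,
        "reduced_hdl": hdl < (1.04 if male else 1.29) or treated_low_hdl,
        "hyperglycemia": glucose >= 5.6 or known_diabetes,
    }

def has_mets(components):
    """Central obesity plus at least two of the other four components."""
    others = sum(v for name, v in components.items() if name != "central_obesity")
    return components["central_obesity"] and others >= 2

# Hypothetical man: central obesity, raised DBP and raised triglycerides.
example = mets_components("male", waist_cm=100, sbp=128, dbp=88,
                          tg=1.9, hdl=1.2, glucose=5.2)
print(sum(example.values()), has_mets(example))  # 3 True
```

Returning the individual components as well as the overall status makes it straightforward to count the number of components per participant.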
Statistical analysis
For baseline characteristics, categorical variables were expressed as frequencies and percentages, while continuous variables were presented as mean (standard deviation, SD) for normally distributed variables and median (interquartile range) for skewed variables. Cox proportional hazards models with age as the time scale were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) for the associations of MetS and its components with the risk of OA. The proportional hazards assumption was tested using Schoenfeld residuals. All models were adjusted for age and sex. In the basic model (model 1), we adjusted for baseline age and sex only. In the lifestyle model (model 2), we further adjusted for body mass index (BMI), the Index of Multiple Deprivation (IMD), alcohol consumption, smoking, and physical activity. In the full model (model 3), we further adjusted for use of non-steroidal anti-inflammatory drugs (NSAIDs) and aspirin (ASP), vitamin and mineral intake, and fruit and vegetable intake. The lifestyle covariates were categorized as follows: alcohol consumption (daily or almost daily, 1-4 times a week, 1-3 times a month, and special occasions only/never), smoking (current, previous, and never), and fruit and vegetable intake (<5 portions per day, ≥5 portions per day, or unknown/missing). To assess the association of MetS with OA risk according to inflammatory state, we stratified by CRP level. Based on previous studies of OA, we dichotomized serum CRP as low to moderate (≤3 mg/L) or elevated (>3 mg/L)[28, 29]. We then defined four risk groups by combining CRP category and MetS status: CRP ≤3 mg/L without MetS, CRP ≤3 mg/L with MetS, CRP >3 mg/L without MetS, and CRP >3 mg/L with MetS.
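The four CRP-by-MetS risk groups can be derived as in the minimal Python sketch below. This is an illustration rather than the authors' R code; the function name, group labels and example values are hypothetical, and a CRP of exactly 3 mg/L is assigned to the low-to-moderate group (an assumption about how the boundary value is handled).

```python
from collections import Counter

CRP_CUTOFF_MG_L = 3.0  # threshold from the text; values above it count as elevated

def crp_mets_group(crp_mg_l, has_mets):
    """Combine dichotomized CRP and MetS status into one of four group labels."""
    crp_label = "elevated CRP" if crp_mg_l > CRP_CUTOFF_MG_L else "low-to-moderate CRP"
    mets_label = "MetS" if has_mets else "no MetS"
    return f"{crp_label}, {mets_label}"

# Hypothetical participants as (CRP in mg/L, MetS status).
participants = [(1.2, False), (3.0, True), (4.5, False), (7.8, True), (0.6, False)]
counts = Counter(crp_mets_group(crp, mets) for crp, mets in participants)
for group, n in sorted(counts.items()):
    print(group, n)
```

In a regression model, the group with low-to-moderate CRP and no MetS would be the natural reference category.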
In addition, to check the robustness of the model and results, we performed extensive sensitivity analyses. First, to reduce the effects of selection bias and covariate imbalance, we used propensity scores in our preliminary analyses. We weighted each confounding factor and then performed nearest-neighbour matching with a variable one-to-many ratio (1:2) within a caliper, using a caliper width of 0.2 standard deviations of the propensity score on the logarithmic scale. Second, to evaluate potential interactions of MetS with sociodemographic and lifestyle factors (gender, age, alcohol consumption, smoking, and drug and dietary supplement intake), we performed subgroup analyses and fitted interaction terms for these factors in the model. Third, we used restricted cubic splines (RCS) to reveal potential nonlinear associations between MetS components and OA risk in the fully adjusted model, with three knots placed at the 10th, 50th, and 90th percentiles of each MetS component. Fourth, we estimated the cumulative incidence of OA according to the number of MetS components using Kaplan-Meier curves and compared the curves using the log-rank test. Fifth, to minimize reverse causality, we introduced a 4-year lag period for OA onset; in brief, only OA events occurring at least four years after the start of follow-up were considered eligible. We used R software (version 3.5.0, R Foundation for Statistical Computing, Vienna, Austria) for all data analyses. All statistical tests were two-tailed, with p<0.05 considered statistically significant.
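To make the propensity score matching step concrete, the following Python sketch shows one simple way to do greedy nearest-neighbour matching with up to two controls per exposed participant within a caliper. It is an illustration under stated assumptions, not the authors' procedure: the function names and example scores are hypothetical, matching is done without replacement, and the caliper is computed as 0.2 standard deviations of the logit of the propensity score (a common convention; the text above describes the scale as logarithmic).

```python
import math
from statistics import stdev

def logit(p):
    """Log-odds of a propensity score strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def match_within_caliper(exposed, unexposed, ratio=2, caliper_sd=0.2):
    """Greedy nearest-neighbour matching without replacement.

    exposed, unexposed: dicts mapping participant id -> propensity score.
    Returns a dict mapping each exposed id to a list of up to `ratio`
    unexposed ids whose logit score lies within the caliper.
    """
    lp_exposed = {i: logit(p) for i, p in exposed.items()}
    available = {i: logit(p) for i, p in unexposed.items()}
    # Caliper: 0.2 SD of the logit score over all participants (assumption).
    caliper = caliper_sd * stdev(list(lp_exposed.values()) + list(available.values()))
    matches = {}
    for eid, lp in lp_exposed.items():
        # Rank the remaining controls by distance to this exposed participant.
        nearest = sorted(available, key=lambda uid: abs(available[uid] - lp))
        chosen = [uid for uid in nearest[:ratio] if abs(available[uid] - lp) <= caliper]
        for uid in chosen:
            del available[uid]  # each control is used at most once
        matches[eid] = chosen
    return matches

# Hypothetical propensity scores.
exposed = {"e1": 0.30, "e2": 0.60}
unexposed = {"u1": 0.31, "u2": 0.29, "u3": 0.62, "u4": 0.90, "u5": 0.10}
matches = match_within_caliper(exposed, unexposed)
print(matches)
```

In this example e1 receives two controls, whereas e2 receives only u3 because no second control falls within the caliper, which is what a variable matching ratio means in practice.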