Study population
The UK Biobank is a large, prospective, observational cohort study that recruited over 500,000 community-dwelling adults aged 40–69 years between 2006 and 2010 (Visit 1) [22]. The participants attended one of 22 assessment centers located in England, Wales, and Scotland, where they completed multiple touchscreen computer-based questionnaires, underwent physical measurements, and provided biological samples. The UK Biobank study protocol details are publicly available online [23]. Two repeat assessment visits were subsequently completed during follow-up (Visit 2, 2012–2013; Visit 3, since 2014). The UK Biobank study received ethical approval from the Northwest Multicenter Research Ethics Committee (16/NW/0274), and all participants provided informed consent.
In our study, participants with prevalent diabetes (including type 1 and 2 diabetes, gestational diabetes, and potential undiagnosed diabetes) at baseline (n=30,648) and those with missing values on social isolation and loneliness at baseline (n=32,520) were excluded; hence, 439,337 participants were included in the main analysis. In the genetic analysis, 409,782 participants of European descent with genetic data were included. When analyzing the associations of change in social isolation and loneliness with incident type 2 diabetes, we further restricted the analysis to include only participants without type 2 diabetes before the most recent follow-up assessment and with complete data on social isolation and loneliness available at both Visit 1 and the most recent follow-up assessment (n=52,498). eFigure 1 in Supplement shows the flowchart of this study’s sample selection criteria.
Social isolation and loneliness
Social isolation
We constructed a social isolation index, similar to the validated Berkman–Syme social network index [24], based on the items “contact with family/friends/groups” and “live alone.” “Contact with family/friends/groups” was constructed from the following two questions: (a) “How often do you visit friends or family or have them visit you?” (friends and family visit less than once a month, 1 point); and (b) “Which of the following (sports club or gym, pub or social club, religious group, adult education class, other group activities) do you engage in once a week or more often?” (none of the above, 1 point). “Live alone” was assessed using the following question: “Including yourself, how many people are living together in your household?” (living alone, 1 point). Social isolation was calculated as the sum of these points (range, 0–3), with a higher score indicating a greater degree of social isolation; a score of 0 indicated least isolation, 1 indicated moderate isolation, and 2–3 indicated most isolation. The presence of social isolation was defined as having moderate or most social isolation (a score of 1 or higher).
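The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the stated point system; the function and argument names are hypothetical and do not correspond to UK Biobank field names.

```python
def social_isolation_score(visits_less_than_monthly: bool,
                           no_weekly_group_activity: bool,
                           lives_alone: bool) -> int:
    """Sum the three 1-point items described in the text (range 0-3)."""
    return (int(visits_less_than_monthly)
            + int(no_weekly_group_activity)
            + int(lives_alone))


def isolation_category(score: int) -> str:
    """Map the 0-3 score to the three categories used in the text."""
    if score not in (0, 1, 2, 3):
        raise ValueError("score must be an integer between 0 and 3")
    if score == 0:
        return "least"
    if score == 1:
        return "moderate"
    return "most"  # scores of 2 or 3


def is_socially_isolated(score: int) -> bool:
    """Presence of social isolation: moderate or most isolation."""
    return isolation_category(score) != "least"
```

For example, a participant who lives alone and joins no weekly group activity, but sees friends or family at least monthly, scores 2 and falls in the "most" category.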
The change in social isolation was classified into four patterns based on status at the two assessments: (1) never social isolation, the absence of social isolation at both assessments; (2) transient social isolation, the presence of social isolation at Visit 1 but its absence at the follow-up assessment; (3) incident social isolation, the absence of social isolation at Visit 1 but its presence at the follow-up assessment; and (4) persistent social isolation, the presence of social isolation at both assessments.
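The four patterns amount to a two-by-two classification of the binary exposure at Visit 1 and at follow-up. A minimal sketch, with a hypothetical function name:

```python
def change_pattern(present_at_visit1: bool, present_at_follow_up: bool) -> str:
    """Classify change in a binary exposure between two assessments."""
    if not present_at_visit1 and not present_at_follow_up:
        return "never"
    if present_at_visit1 and not present_at_follow_up:
        return "transient"
    if not present_at_visit1 and present_at_follow_up:
        return "incident"
    return "persistent"  # present at both assessments
```

The function takes only the two binary statuses, so it does not depend on how the exposure itself was scored.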
Loneliness
Loneliness was assessed using the following two questions derived from the UCLA loneliness scale [25]: (a) “Do you often feel lonely?” (yes, 1 point) and (b) “How often are you able to confide in someone close to you?” (never or rarely, 1 point). This resulted in a sum score ranging from 0 to 2, with a score of 2 points indicating loneliness.
The four patterns of change in loneliness were defined in the same way as those for social isolation: (1) never loneliness; (2) transient loneliness; (3) incident loneliness; and (4) persistent loneliness.
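As an illustration, the two-item loneliness score and its cut-off can be written as follows; the names are hypothetical and chosen only for readability.

```python
def loneliness_score(often_feels_lonely: bool,
                     never_or_rarely_confides: bool) -> int:
    """Sum the two 1-point items described in the text (range 0-2)."""
    return int(often_feels_lonely) + int(never_or_rarely_confides)


def is_lonely(score: int) -> bool:
    """Loneliness requires both items to be endorsed (score of 2)."""
    if score not in (0, 1, 2):
        raise ValueError("score must be 0, 1, or 2")
    return score == 2
```

Under this rule, a participant who often feels lonely but can confide in someone scores 1 and is not classified as lonely.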
Outcomes
Prevalent diabetes at baseline was determined from the following items: (a) type of diabetes (type 1 diabetes, type 2 diabetes, or gestational diabetes), identified using a validated algorithm based on self-reported disease, medication, and diabetes diagnosis recorded in the medical history [26] (eText 1 in Supplement), and (b) potential undiagnosed diabetes, indicated by a hemoglobin A1c (HbA1c) level ≥6.5% (48 mmol/mol). Incident type 2 diabetes was defined by the International Classification of Diseases, Tenth Revision (ICD-10) code E11, recorded as a primary/secondary diagnosis in hospital admission records (Hospital Episode Statistics in England and Wales and Scottish Morbidity Records in Scotland) or as an underlying/contributory cause of death in death registry records linked to the UK Biobank. In the main analysis, participants were followed up prospectively from Visit 1 (2006–2010) to the onset of incident type 2 diabetes, date of death, or censoring date (November 12th, 2021), whichever came first. When investigating the longitudinal association of change in social isolation and loneliness with the subsequent risk of type 2 diabetes, follow-up started from the date of the most recent follow-up assessment of social isolation and loneliness (Visit 2, 2012–2013; or Visit 3, since 2014).
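The "whichever came first" rule can be sketched as below. The function name, the use of 365.25 days per year, and the tie-breaking choice (a diagnosis dated on the day of death counts as an event) are assumptions made for illustration, not details given in the text.

```python
from datetime import date
from typing import Optional, Tuple

CENSORING_DATE = date(2021, 11, 12)  # censoring date stated in the text


def follow_up(start: date,
              t2d_date: Optional[date],
              death_date: Optional[date],
              censoring_date: date = CENSORING_DATE) -> Tuple[float, bool]:
    """Return (years of follow-up, event indicator) for one participant."""
    candidates = [(censoring_date, False)]
    if death_date is not None:
        candidates.append((death_date, False))
    if t2d_date is not None:
        candidates.append((t2d_date, True))
    # Earliest date wins; on a tie, the diabetes event takes precedence
    # (assumption: "not event" sorts False before True).
    end, event = min(candidates, key=lambda c: (c[0], not c[1]))
    years = (end - start).days / 365.25  # assumed year length
    return years, event
```

For the change analysis, the same function would be called with `start` set to the date of the most recent follow-up assessment rather than the Visit 1 date.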
Genetic risk score
The genotyping process, quality control, and imputation of the genetic data in the UK Biobank have been described elsewhere [27]. We selected 34 single-nucleotide polymorphisms (SNPs) identified as associated with type 2 diabetes in a previous genome-wide association study [28] (eTable 1 in Supplement). Each selected SNP in the UK Biobank was coded as 0, 1, or 2 according to the number of risk alleles. Next, we calculated the genetic risk score using the following formula from a previous study [29]: weighted genetic risk score = (β1 × SNP1 + β2 × SNP2 + … + βn × SNPn) × (n / sum of the β coefficients), where n is the number of SNPs. The score was then classified into low (tertile 1), intermediate (tertile 2), and high (tertile 3) genetic risk according to its distribution.
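A minimal sketch of the weighted score and the tertile grouping follows. The function names are hypothetical, the β values in the usage note are made up, and the handling of scores that fall exactly on a tertile boundary is an assumption, since the text does not specify it.

```python
from statistics import quantiles
from typing import List, Sequence


def weighted_grs(betas: Sequence[float], allele_counts: Sequence[int]) -> float:
    """Weighted genetic risk score: sum(beta_i * SNP_i) * (n / sum(beta_i))."""
    if not betas or len(betas) != len(allele_counts):
        raise ValueError("betas and allele_counts must be non-empty and equal in length")
    if any(count not in (0, 1, 2) for count in allele_counts):
        raise ValueError("allele counts must be 0, 1, or 2")
    n = len(betas)
    weighted_sum = sum(b * c for b, c in zip(betas, allele_counts))
    return weighted_sum * (n / sum(betas))


def tertile_labels(scores: Sequence[float]) -> List[str]:
    """Label each score as low, intermediate, or high genetic risk."""
    low_cut, high_cut = quantiles(scores, n=3)  # two tertile cut points
    # Assumption: a score equal to a cut point goes to the lower group.
    return ["low" if s <= low_cut
            else "intermediate" if s <= high_cut
            else "high"
            for s in scores]
```

With the illustrative weights `[0.1, 0.2, 0.3]` and allele counts `[1, 2, 0]`, the weighted sum is 0.5 and the scaling factor is 3 / 0.6 = 5, giving a score of about 2.5. The rescaling keeps the score on the 0 to 2n scale of a simple allele count.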
Assessment of covariates
Demographic and socioeconomic factors, lifestyle behaviors, health conditions, and medication use were the covariates used in the analysis of the current study. Age (continuous, calculated from the date of birth), sex (male/female), assessment center (England/Wales/Scotland), and Townsend Deprivation Index (continuous, a score representing the deprivation of the participant's neighborhood to reflect their socioeconomic status) were known before arrival at the assessment center. Information on ethnicity (white/others), current employment status (employed/unemployed), educational level (college or university degree/non-college or university degree), smoking status (never/current/past), alcohol consumption frequency (not current/less than three times a week/three or more times a week), healthy diet score (continuous, 0–5 points, calculated from the intake amount and frequency of tablespoons of fruit, fish, unprocessed red meat and processed meat), time spent watching TV (continuous, hours/day), ever seeking help from physicians due to anxiety or depressive symptoms (yes/no), antihypertensive medication use (yes/no), cholesterol-lowering medication use (yes/no), and family history of diabetes (yes/no) was obtained using touchscreen questionnaires or verbal interviews. Body mass index (continuous) was calculated as weight in kilograms divided by the square of height in meters. The prevalence of hypertension (yes/no) was defined as a mean blood pressure >140/90 mmHg or a self-reported hypertension history. The prevalence of hyperlipidemia (yes/no) was obtained from the hospital records and death registries. Detailed information is provided in eText 1 and eTable 2 in Supplement.
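Two of the derived covariates follow directly from the stated definitions and can be sketched as below. The function names are hypothetical, and reading ">140/90 mmHg" as "systolic above 140 or diastolic above 90" is an assumption about how the threshold was applied.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Weight in kilograms divided by the square of height in meters."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def has_hypertension(mean_systolic: float,
                     mean_diastolic: float,
                     self_reported_history: bool) -> bool:
    """Mean blood pressure above 140/90 mmHg or self-reported hypertension.

    Assumption: exceeding either the systolic or the diastolic
    threshold is sufficient.
    """
    return mean_systolic > 140 or mean_diastolic > 90 or self_reported_history
```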
Statistical analysis
The baseline characteristics were summarized as number (percentage) for categorical variables and mean [standard deviation (SD)] or median [interquartile range (IQR)] for continuous variables.
The associations of social isolation and loneliness (at baseline or as longitudinal change patterns) with incident type 2 diabetes were investigated using Cox proportional hazards models. Adjustments were made in the following three steps: model 1, adjusted for age and sex; model 2, adjusted for the covariates in model 1 plus socioeconomic factors and lifestyle behaviors, including ethnicity, assessment center, current employment status, educational level, Townsend Deprivation Index, smoking status, alcohol consumption frequency, physical activity, time spent watching TV, and healthy diet score; and model 3, adjusted for the covariates in model 2 plus health conditions and medication use, including body mass index, hypertension, hyperlipidemia, ever seeking help from physicians due to anxiety or depressive symptoms, antihypertensive medication use, cholesterol-lowering medication use, and family history of diabetes. Where covariate information was missing, multiple imputation was used to minimize the potential for inferential bias. Tests based on Schoenfeld residuals were conducted to check the proportional hazards assumption, and no violations were observed. To estimate the proportion of incident type 2 diabetes that hypothetically would have been prevented if all participants had been least isolated and not lonely, we also calculated the population-attributable fractions (PAFs), assuming a causal relationship.
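The text does not state which PAF estimator was used. As one common choice, the sketch below uses Miettinen's formula, which combines the proportion of cases that were exposed with the adjusted hazard ratio; the function name and the numbers in the usage note are illustrative assumptions, not study results.

```python
def population_attributable_fraction(proportion_of_cases_exposed: float,
                                     hazard_ratio: float) -> float:
    """Miettinen's formula: PAF = p_c * (HR - 1) / HR.

    p_c is the proportion of incident cases that were exposed, and HR is
    the adjusted hazard ratio for exposed versus unexposed participants.
    """
    if not 0.0 <= proportion_of_cases_exposed <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    return proportion_of_cases_exposed * (hazard_ratio - 1.0) / hazard_ratio
```

For instance, if half of all cases were exposed and the hazard ratio were 2.0, the formula gives 0.5 × (2 − 1) / 2 = 0.25, meaning a quarter of cases would be attributed to the exposure under a causal interpretation.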
The interaction between social isolation and loneliness in relation to incident type 2 diabetes was tested using a cross-product term. Moreover, we performed stratified analyses according to the status of social isolation or loneliness. In the joint analyses of social isolation and loneliness, the referent group comprised participants with the least social isolation and without loneliness. Similar analyses were conducted to test gene–environment interactions.
Several sensitivity analyses were conducted to confirm the robustness of our results: an analysis excluding participants with missing covariate data; repeated analyses in the sample with incomplete social isolation and loneliness data imputed; an analysis excluding events of type 2 diabetes occurring in the first 2 years of follow-up to minimize the potential influence of reverse causality; and an analysis using Fine–Gray sub-distribution hazard models to account for the competing risk of death. Additionally, we conducted subgroup analyses examining potential effect modification by age (<65/≥65 years), sex (male/female), Townsend Deprivation Index (tertiles), smoking status (never/current/past), alcohol consumption frequency (not current/less than three times a week/three or more times a week), physical activity (tertiles), and obesity status (underweight or normal weight/overweight/obese), and evaluated potential interactions using cross-product terms.
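The reverse-causality sensitivity analysis can be illustrated with a simple filter. The record layout (dictionaries with `years` and `event` keys) is hypothetical. Dropping the whole participant, and keeping events at exactly 2 years, are assumptions about how the exclusion was applied.

```python
from typing import Dict, List


def exclude_early_events(records: List[Dict], min_years: float = 2.0) -> List[Dict]:
    """Drop participants whose type 2 diabetes event occurred before min_years.

    Participants without an event are kept regardless of follow-up length.
    """
    return [r for r in records
            if not (r["event"] and r["years"] < min_years)]
```

Applied to three records with an event at 1.2 years, an event at 5.0 years, and no event at 1.0 year, the filter removes only the first.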
All statistical analyses were performed using the R software version 3.6.0 (R Development Core Team, Vienna, Austria), with a two-sided P<.05 threshold indicating statistical significance.