The cohort for this study was selected from All of Us participants who have EHR data available and who fell outside the exclusion range for T2D, as defined by the phecode exclusion scheme (Supplementary Figure 1). The cohort was restricted to individuals whose survey responses placed them in one of four SIRE groups – Asian, Black, Hispanic, and White – which represent the four largest racial or ethnic categories in the United States. To better understand the roles that genetic ancestry, socioeconomic deprivation, and sex play in T2D risk, the study cohort was further restricted to individuals for whom genomic and socioeconomic data were available and whose sex at birth was either male or female. Our final cohort consisted of 86,488 individuals, with a mean age of 54.3 years, of whom 64.78% were female (Table 1).
All of Us participant genomic variant data were analyzed together with variant data from global reference populations to infer participants’ genetic ancestry fractions for six continental ancestry groups: African, Asian, European, Native American, Oceanian, and West Asian (Supplementary Table 1 and Supplementary Figure 2). Participant genetic ancestry fractions, stratified by SIRE group, are shown in Figure 1A. SIRE groups show characteristic ancestry patterns, together with continua of ancestry fractions within and between groups. Those who self-identified as belonging to the Asian SIRE group were predominantly of Asian ancestry, the Black SIRE group of African ancestry, and the White SIRE group of European ancestry (Figure 1B). Some exceptions to this pattern can be noted, however: certain individuals belonging to the White SIRE group were mostly of West Asian ancestry, and certain individuals belonging to the Black and Asian SIRE groups were mostly of European ancestry. In comparison to the Asian, Black, and White SIRE groups, individuals belonging to the Hispanic SIRE group showed much greater heterogeneity in their ancestry fractions, with substantial European, Native American, and African components. This is consistent with the fact that participants with Hispanic ethnicity can identify with any race under the current OMB standards.
All of Us participant socioeconomic deprivation was measured using a composite, place-based index (zSDI) that includes information on income, education, housing, and public assistance. We observe a clear disparity in zSDI across the four SIRE groups, with the Black and Hispanic groups exhibiting the highest median zSDI (0.35), followed by the Asian (0.31) and White (0.30) groups (Figure 1C; ANOVA, F = 4793, p < 2e-16).
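The ANOVA reported above compares between-group and within-group variation in zSDI. A minimal sketch of the F statistic follows, assuming a plain one-way design; the function name and the toy values are hypothetical and are not study data.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric samples."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    # Between-group sum of squares: each group mean's squared distance
    # from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each value
    # from its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    # Ratio of the two mean squares, with k - 1 and n - k degrees of freedom.
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical toy index values for three groups (not study data).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_anova_f(toy_groups)
print(f_stat)
```

With very large samples, as in this cohort, even small differences in group means can produce a large F value, so the statistic is best read alongside the medians themselves.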
Participant T2D diagnoses drawn from EHR data were used to calculate prevalence values for each of the four SIRE groups (Figure 2A). Of these four groups, the Black SIRE group showed the highest adjusted prevalence (21.87%, CI: 0.60), with the Hispanic SIRE group following closely behind (19.92%, CI: 0.58). The Asian (15.14%, CI: 1.37) and White (14.80%, CI: 0.32) SIRE groups showed the lowest adjusted prevalence. The relative T2D prevalence values among SIRE groups are similar to those obtained when different methods are used to create the cohort from All of Us data, and they resemble the pattern of T2D disparities reported by the Centers for Disease Control and Prevention (CDC; Supplementary Table 2).
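For orientation, a prevalence percentage and an accompanying margin can be sketched as follows. This is only an illustration under stated assumptions: it computes a crude (unadjusted) prevalence with a normal-approximation 95% margin, whereas the text reports adjusted values and does not specify how the CI figures were derived. The function name and the counts are hypothetical.

```python
import math

def prevalence_with_margin(cases, total, z=1.96):
    """Return (prevalence %, margin %) using a normal-approximation interval.

    Assumption: the margin is the half-width of a 95% interval; the source
    does not state which interval method was used.
    """
    p = cases / total
    margin = z * math.sqrt(p * (1.0 - p) / total)
    return 100.0 * p, 100.0 * margin

# Hypothetical counts (not study data): 50 cases among 200 participants.
pct, margin = prevalence_with_margin(50, 200)
print(f"{pct:.2f}% (+/- {margin:.2f})")
```

Because the margin shrinks with the square root of the group size, smaller groups have wider margins at a similar prevalence.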
To further investigate the association between T2D risk and SIRE, we modeled T2D case / control status as a function of SIRE, using age and sex as covariates (Figure 2B). In this model, the White SIRE group was used as the reference group. Relative to the White SIRE group, belonging to the Hispanic SIRE group was associated with the greatest increase in the odds of T2D (OR: 2.46, CI: 2.35-2.58), followed by the Black SIRE group (OR: 2.42, CI: 2.32-2.53) and the Asian SIRE group (OR: 1.31, CI: 1.16-1.48). Additionally, increasing age (OR: 1.04, CI: 1.04-1.05) was associated with greater predicted T2D risk, and being female (OR: 0.81, CI: 0.78-0.84) was associated with lower predicted T2D risk.
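The odds ratios above are exponentiated logistic regression coefficients. The sketch below shows the conversion in both directions, assuming Wald intervals that are symmetric on the log-odds scale (the text does not state the interval method) and assuming that age is measured in years. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def odds_ratio_with_ci(beta, se, z=Z_95):
    """Convert a logit coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coefficient_from_ci(lower, upper, z=Z_95):
    """Recover (beta, se) from an interval on the odds-ratio scale.

    Assumption: the interval is a Wald interval, symmetric in log-odds.
    """
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se

# Interval reported in the text for the Hispanic SIRE term: 2.35-2.58.
beta, se = coefficient_from_ci(2.35, 2.58)
or_hat, lo, hi = odds_ratio_with_ci(beta, se)
print(f"beta={beta:.3f}, se={se:.3f}, OR={or_hat:.2f} ({lo:.2f}-{hi:.2f})")

# The reported age OR of 1.04 applies per unit of age (assumed to be one
# year), so it compounds multiplicatively over a decade.
or_per_decade = 1.04 ** 10
print(f"OR per 10 years: {or_per_decade:.2f}")
```

The recovered point estimate lands close to the reported OR of 2.46, which is what one would expect if the interval was built on the log-odds scale.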
A similar analysis was performed to elucidate the association between T2D risk and genetic ancestry. In this analysis, we modeled T2D case / control status as a function of a particular genetic ancestry fraction, using age and sex as covariates (Table 2). A model with these specifications was generated for each of the four genetic ancestry groups most closely associated with our four SIRE groups of interest: Asian (Asian SIRE), African (Black SIRE), European (White SIRE), and Native American (Hispanic SIRE). African ancestry has the highest positive coefficient (0.21), suggesting increased T2D risk among those with greater African ancestry fractions. Native American ancestry has the second-highest coefficient (0.14). Asian (-0.10) and European (-0.19) ancestry have negative coefficients, suggesting lower T2D risk among those with greater fractions of these ancestries. All of these coefficients are significant at an alpha of 0.05. These patterns largely remained, and were amplified, when SDI was controlled for (Supplementary Table 3).
Additional models were created to investigate the association between T2D risk and socioeconomic deprivation; these modeled T2D case / control status as a function of area-based zSDI and individual-level iSDI, using age and sex as covariates. Participant area-based (zSDI) and individual-level (iSDI) socioeconomic deprivation are significantly correlated (r = 0.26, p < 2.2e-16). As would be expected, the models returned high, positive coefficients for zSDI (2.52) and iSDI (1.99), indicating greater odds of T2D at greater levels of both area-based and individual-level socioeconomic deprivation. Multilevel modeling with iSDI as a fixed effect and zSDI as the random effect returned a high, positive coefficient for iSDI (1.98), suggesting that individual-level socioeconomic deprivation remains tightly associated with T2D risk when controlling for zip code clustering (Table 2).
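A modest correlation can still yield a very small p-value at this sample size. The sketch below computes the usual t statistic for testing a Pearson correlation against zero, assuming that all 86,488 cohort members contributed to the correlation (the text does not state the exact n for this test). The function name is hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))

# Reported correlation between zSDI and iSDI, with the full cohort size
# used as an assumed n.
t_stat = correlation_t_statistic(0.26, 86488)
print(f"t = {t_stat:.1f}")
```

A t statistic of this size lies far beyond any conventional significance threshold, which is consistent with the p-value being reported only as a bound.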
Additional multivariable logistic regression models were created to investigate how SIRE, genetic ancestry, and SDI interact to modify predicted T2D risk. As part of this analysis, we modeled T2D case / control status as a function of either the SIRE-zSDI or GA-zSDI interaction terms, using age and sex as covariates. The SIRE-zSDI models returned significant and negative interaction coefficients for the Black-zSDI (-1.67) and Hispanic-zSDI (-1.40) interaction terms, suggesting that greater socioeconomic deprivation is associated with reduced risk of T2D for individuals belonging to either the Black or Hispanic SIRE groups (Table 3 and Figure 3A). However, when the cohort is restricted to native-born participants, the Hispanic-zSDI interaction term is no longer significant (Supplementary Table 4). Relative excess risk due to interaction (RERI) values for the Black (-4.02) and Hispanic (-3.66) groups are also negative, indicating subadditive effects of SIRE and zSDI. The opposite trend is observed for individuals belonging to the Asian and White SIRE groups, in which greater socioeconomic deprivation is associated with a greater risk of T2D. The negative interactions observed between Black and Hispanic SIRE and socioeconomic deprivation can also be seen when individual-level iSDI is used to model T2D outcomes (Supplementary Table 5).
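RERI measures interaction on the additive scale and can be computed from the coefficients of a logistic model with a product term. The sketch below uses the standard definition, RERI = OR(both) - OR(first only) - OR(second only) + 1, with odds ratios standing in for relative risks. The function name and the coefficient values are hypothetical, since the main-effect coefficients are not given in this section.

```python
import math

def reri_from_logit(b_first, b_second, b_interaction):
    """RERI from logit coefficients for two exposures and their product term."""
    or_first = math.exp(b_first)    # first exposure alone
    or_second = math.exp(b_second)  # second exposure alone
    or_both = math.exp(b_first + b_second + b_interaction)  # joint exposure
    return or_both - or_first - or_second + 1.0

# Hypothetical coefficients (not from the study's tables).
no_product_term = reri_from_logit(math.log(2), math.log(3), 0.0)
negative_product_term = reri_from_logit(math.log(2), math.log(3), -math.log(6))
print(no_product_term, negative_product_term)
```

A negative RERI indicates that the joint effect is smaller than the sum of the separate effects, which is the subadditive pattern described above. Note that RERI can be positive even when the product-term coefficient is zero, because additive and multiplicative interaction are assessed on different scales.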
Similarly significant and negative interaction coefficients were returned by the GA-zSDI models, specifically for the African-zSDI (-3.59), Asian-zSDI (-2.90), and Native American-zSDI (-4.84) interaction terms (Table 3). In contrast, a significant positive interaction coefficient was reported for the European-zSDI (1.34) interaction term. RERI values show the same trends, with negative values for African-zSDI (-7.76), Asian-zSDI (-6.23), and Native American-zSDI (-10.48), compared to a positive RERI for European-zSDI (2.18). Visualization of the GA-zSDI interactions shows that high zSDI is a risk factor for T2D at low levels of non-European ancestry, and that this trend reverses at high levels of non-European ancestry, where low zSDI has higher predicted risk (Figure 3B-E). This pattern is particularly pronounced for African and Native American ancestry. The pattern of all of these interaction effects remains when the cohort is stratified by sex (Supplementary Tables 6 and 7; Supplementary Figures 3 and 4).
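The reversal described above follows directly from the form of an interaction model. If the log-odds include the terms b_sdi * zSDI + b_int * ancestry * zSDI, then the slope of the log-odds with respect to zSDI at a given ancestry fraction is b_sdi + b_int * ancestry, which changes sign at ancestry = -b_sdi / b_int. The sketch below illustrates this; the interaction coefficient is the reported African-zSDI value, while the zSDI main effect of 2.0 is a hypothetical placeholder and the function names are assumptions.

```python
def zsdi_slope(ancestry_fraction, b_sdi, b_int):
    """Slope of the log-odds with respect to zSDI at a given ancestry fraction."""
    return b_sdi + b_int * ancestry_fraction

def crossover_fraction(b_sdi, b_int):
    """Ancestry fraction at which the zSDI slope changes sign."""
    return -b_sdi / b_int

B_SDI = 2.0    # hypothetical main effect of zSDI (not from the study's tables)
B_INT = -3.59  # African-zSDI interaction coefficient reported in the text

for fraction in (0.0, 0.5, 1.0):
    print(fraction, zsdi_slope(fraction, B_SDI, B_INT))
print("crossover at", crossover_fraction(B_SDI, B_INT))
```

Under these illustrative values, zSDI raises predicted risk at low ancestry fractions and lowers it at high fractions. The actual crossover point depends on the fitted main effect.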
These results were validated with another set of models that modeled T2D risk as a function of zSDI, using age and sex as covariates (Table 4). Each of these models was run on a different subset of the study cohort, consisting exclusively of individuals from one of the four SIRE groups under investigation. The models corresponding to the Black and Hispanic SIRE cohorts returned negative coefficients for zSDI (-0.70 and -0.41, respectively). The model corresponding to the White SIRE cohort returned a significant and positive coefficient for zSDI (0.77), consistent with the coefficients observed for the SIRE-zSDI interaction terms.
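The stratified approach can be sketched as fitting one logistic model per group and comparing the sign of the zSDI coefficient. The example below is a simplified illustration only: it uses a hand-written Newton-Raphson fit with a single predictor, omits the age and sex covariates, and runs on hypothetical toy data with hypothetical group labels. It is not the software or the data used in the study.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson and return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of the negative Hessian (information matrix).
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy strata (not study data): deprivation index and case status.
index_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
strata = {
    "group_a": (index_values, [0, 0, 1, 0, 1, 0, 1, 1]),
    "group_b": (index_values, [1, 1, 0, 1, 0, 1, 0, 0]),
}

# Fit one model per stratum and keep only the slope on the index.
slopes = {name: fit_logistic(x, y)[1] for name, (x, y) in strata.items()}
for name, slope in slopes.items():
    print(name, round(slope, 2))
```

Fitting each group separately avoids relying on a shared reference group, so agreement in sign between the stratified coefficients and the interaction terms supports the interpretation given above.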