Socioeconomic inequalities in young adulthood disrupt the immune transcriptomic landscape via upstream regulators

Abstract Disparities in socioeconomic status (SES) predict many immune system-related diseases, and previous research documents relationships between SES and the immune cell transcriptome. Drawing on a bioinformatically informed network approach, we situate these findings in a broader molecular framework by examining the upstream regulators of SES-associated transcriptional alterations. Data come from the National Longitudinal Study of Adolescent to Adult Health (Add Health), a nationally representative sample of 4,543 adults in the United States. Results reveal a network of differentially expressed genes, transcription factors, and protein neighbors of transcription factors that shows widespread SES-related dysregulation of the immune system. Mediational models suggest that body mass index plays a key role in accounting for many of these associations. Overall, the results reveal the central role of upstream regulators in socioeconomic differences in the molecular basis of immunity, which propagate to increase risk of chronic health conditions in later life.


Introduction
A considerable body of evidence suggests that disparities in SES (reflecting education, income, occupational prestige, and subjective status) play a critical role in shaping the health trajectories of people, with lower SES associated with elevated morbidity and mortality rates [1][2][3][4][5][6]. Many of these diseases (including, for example, asthma 7, atopic dermatitis 8, food allergies 9, systemic lupus erythematosus 10, and periodontal disease 11) vary widely in their pathologies but share a common etiological pathway involving immune dysregulation, and they are more common in lower socioeconomic strata than among people with higher SES 12. Three strands of evidence also document associations between SES and biomarkers of the immune system. First, many studies report associations between childhood SES and pro-inflammatory markers in circulating peripheral blood (such as interleukin-6 and C-reactive protein (CRP) 13) that, if chronically activated, presage a wide range of diseases including, for example, type 2 diabetes, some cancers, and cardiovascular disease. Second, studies have also examined white blood cell composition, finding that low SES is associated with increased development and circulation of pro-inflammatory immune cells (monocytes and neutrophils 14), whereas parental education is associated with a higher proportion of lymphocytes and a lower proportion of neutrophils and, among older adults, that SES is related to shifts in cell composition indicative of immunosenescence [15][16][17]. And finally, a limited number of studies have examined functional assays of immune response, sometimes ex vivo, and show that, once again, childhood socioeconomic status is a risk factor for immune dysregulation, possibly more so among boys [18][19][20][21]. Nevertheless, despite the abundance of evidence connecting socioeconomic inequalities to immune-related diseases, the molecular etiology of SES-mediated immune alterations remains less explored.
A growing number of studies have examined SES and transcriptional patterns indicative of immune functioning. Research consistently shows that people from low SES backgrounds have greater proinflammatory activity 14,[21][22][23][24][25]. Additionally, SES is associated with the expression of genes regulated by the glucocorticoid receptor and interferon response factors, suggesting a suppression of adaptive immunity and innate antiviral immunity 21,26. This signature pattern, which involves the upregulation of proinflammatory genes and the downregulation of Type I interferon innate antiviral response genes among the lower strata of status and is called the "Conserved Transcriptional Response to Adversity" (CTRA), has been observed in numerous populations with a range of research designs. Essentially, SES is negatively associated with CTRA activation, which in turn is associated with the molecular underpinnings of immune-related and inflammatory diseases 21,[27][28][29].
The current study seeks to expand on these findings by providing a broader mapping of associations between SES and the molecular signaling pathways that regulate immunity. Understanding the impact of socioeconomic status on immune gene expression requires a systems-oriented approach that extends beyond the examination of individually differentially expressed genes and transcription factor binding motifs. In this paper, we detail upstream transcriptional factors and protein-protein interactors as a networked system that responds to SES. Such an approach offers a comprehensive understanding of how SES is associated with the molecular mechanisms that drive biological processes such as gene expression, cell signaling, and cell fate, which ultimately lead to disease in individuals of lower SES 30,31.
We focus on American adults in their late 30s, who are ostensibly healthy but nevertheless at risk for later health challenges. We leverage the mRNA data from 4,543 young adults participating in the National Longitudinal Study of Adolescent to Adult Health (Add Health) 32. First, we identify cell functional pathways and their directionality in SES-related dysregulation of the immune system. To this end, we capitalize on publicly available pathway ontologies to functionally annotate genes that show changes in expression and that cluster together. Second, we identify upstream modulators and regulators of the differentially expressed genes to provide a systems perspective on SES and immunity. Such a view also isolates potential targets for remediation. Finally, we consider the behavioral and health-related factors that may explain associations between SES and the immune cell transcriptome. Results reveal that SES is associated with widespread dysregulation of immunity involving intricately interrelated differentially expressed genes, transcription factors, and protein-protein regulators. Additionally, body mass index is a likely, potent mechanism driving these patterns.

Add Health and differential gene expression
The National Longitudinal Study of Adolescent to Adult Health (Add Health) is a representative study of adolescents in the United States who were followed into adulthood over five waves of data collection 32. Study participants provided informed written consent with respect to all aspects of the Add Health study in accordance with the University of North Carolina School of Public Health Institutional Review Board (IRB). Transcriptomic profiles of the consenting participants were collected during Wave V of the Add Health Study (2016-2017) via an intravenous blood draw (ages of subjects ranged from 33 to 43 years). Access to the restricted-use Add Health transcriptomic data was obtained by completing a contractual and data use agreement. Additional detailed information on the study design, interview procedures, consent procedures, demographic assessments, collection, sequencing and quality control of the blood sample, and derivation of the analytical samples is reported in previous studies [33][34][35]. Furthermore, the data analysis and all methods presented in this work were carried out in accordance with the relevant ethical guidelines and regulations. We draw on the mRNA-seq data of 4,015 subjects with complete information on the models' variables. Socioeconomic status composite scores were calculated as the sum of standardized indicators of education, income, occupation, and subjective socioeconomic status of the young adult subjects [35][36][37].
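As a rough illustration of the composite described above, the following sketch standardizes each indicator and sums the resulting z-scores per participant. The function names, the toy values, and the choice of the population standard deviation are assumptions made for illustration and are not taken from the Add Health procedures.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one indicator to mean 0 and unit (population) SD."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ses_composite(indicators):
    """Sum standardized indicators per participant.

    indicators: dict mapping indicator name -> list of values,
    one value per participant, all lists in the same participant order.
    """
    standardized = [zscores(col) for col in indicators.values()]
    return [sum(per_person) for per_person in zip(*standardized)]

# Hypothetical toy data for three participants (not Add Health values).
toy = {
    "education": [12, 16, 20],
    "income": [30, 50, 70],
    "occupation": [1, 2, 3],
    "subjective_ses": [4, 5, 6],
}
scores = ses_composite(toy)
```

Because every indicator is centered before summing, the composite is itself centered at zero, and each indicator contributes on the same scale regardless of its original units.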
Genes with low counts were excluded from the analysis. After normalizing the raw mRNA-seq counts using a weighted trimmed mean of log expression ratios (TMM normalization) 38 with the edgeR 39 package in R, we analyzed genes whose expression varied significantly by the young adulthood socioeconomic composite score using a linear model analysis 40,41. We controlled for covariates that could influence mRNA abundance levels: sex, self-described race, age, pregnancy status, sample analysis plate, number of hours fasting prior to blood sample collection, use of anti-inflammatory medication (e.g., NSAIDs, COX-2 inhibitors, inhaled corticosteroids), instances of common subclinical symptoms (e.g., colds, flu), and common infectious or inflammatory diseases (e.g., infection, allergies) in the 4 weeks prior to blood sample collection. We also corrected for batch effects using the ComBat function in the sva package 42 in R.
Our overall analytic strategy is to (1) estimate clusters of genes across the whole genome and, within these clusters, identify genes that are differentially expressed (DE) by SES (hereafter, SES-DEG); (2) characterize the biological function of these DE genes and of gene clusters that are likely to have DE genes; (3) identify transcription factors and their protein neighbors that are associated with these DE genes; and, finally, (4) identify behavioral mediators that may account for associations between SES and DE genes and their upstream regulators.

Whole-genome clusters and cluster-SES relationship
Processed gene expression data from 14,251 transcripts in 4,015 individuals were subjected to unsupervised clustering using Weighted Gene Coexpression Network Analysis (WGCNA) 43. We identified a total of 19 clusters; the number of genes in each cluster and the clusters' overlaps with the SES-DE genes are shown in Supplementary Figure S1. To identify the clusters that have a significant relationship to SES, we modelled the cluster eigengenes (a summarized expression vector of each cluster) as a linear function of SES, as in the differential expression analysis. Additionally, we performed a Fisher exact test to identify clusters that show an enrichment for SES-DEG. Together, the two tests resulted in clusters that (1) have a significant cluster-SES relationship and (2) are enriched for SES-up- or downregulated genes (see Supplementary Figure S2). Four clusters (Clusters 7, 11, 13 and 17) had eigengenes that are significantly differentially expressed by SES. Of the 4 clusters, Cluster 11 showed an overrepresentation of SES-downregulated genes, while Clusters 7, 13 and 17 showed an overrepresentation of SES-upregulated genes. In this context, upregulation refers to a positive association between SES and mRNA abundance levels.
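The enrichment step can be illustrated with a minimal sketch of a one-sided Fisher exact test, computed as the upper tail of the hypergeometric distribution. Treating the test as one-sided, the function name, and the toy counts are assumptions for illustration; the analysis in the paper may have used different settings.

```python
from math import comb

def enrichment_p(k, cluster_size, n_deg, n_genes):
    """One-sided Fisher exact p-value for over-representation.

    Probability of seeing at least k SES-DE genes in a cluster of
    cluster_size genes, when n_deg of the n_genes background genes
    are SES-DE (hypergeometric upper tail).
    """
    upper = min(cluster_size, n_deg)
    total = comb(n_genes, cluster_size)
    tail = sum(
        comb(n_deg, i) * comb(n_genes - n_deg, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return tail / total

# Hypothetical toy example: 10 genes, 3 of them DE, a cluster of 4 genes
# that contains all 3 DE genes.
p_toy = enrichment_p(k=3, cluster_size=4, n_deg=3, n_genes=10)
```

A cluster would then be flagged when this p-value, after multiple-testing adjustment across clusters, falls below the chosen threshold.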

Functional enrichment analysis of the differentially expressed genes and significant clusters
Functional enrichment analysis for the SES-differentially expressed genes (see Supplementary Figure S3 and Supplementary Dataset S1) and the WGCNA-identified cluster genes (see Supplementary Figure S4 and Supplementary Dataset S2) was performed using the R Bioconductor package ReactomePA 44 to identify the biological function of the genes (FDR p < 0.05). The Reactome results are organized in a hierarchical structure of biological pathways, with each biological pathway being a node that shows parent-child relationships 45. We relied on this parent-child relational database to pool together multiple pathways under the same parent node in order to better understand the large-scale changes (up to 3 hierarchical levels). The significance of the parent node was determined by its most significant child.
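The pooling rule above can be sketched as follows: each enriched pathway is walked up the hierarchy and the parent inherits the smallest p-value among its pooled children. Reading "up to 3 hierarchical levels" as climbing at most three parent links is an assumption, as are the function names, the small hierarchy fragment, and the p-values, which are purely illustrative.

```python
def roll_up(pathway, parent_of, levels=3):
    """Climb at most `levels` parent links from a pathway."""
    for _ in range(levels):
        if pathway not in parent_of:
            break  # reached a top-level node
        pathway = parent_of[pathway]
    return pathway

def parent_significance(child_pvalues, parent_of, levels=3):
    """Assign each parent the p-value of its most significant child."""
    result = {}
    for pathway, p in child_pvalues.items():
        top = roll_up(pathway, parent_of, levels)
        result[top] = min(p, result.get(top, 1.0))
    return result

# Illustrative hierarchy fragment and hypothetical p-values.
parent_of = {
    "Interferon gamma signaling": "Interferon Signaling",
    "Interferon Signaling": "Cytokine Signaling in Immune system",
    "Cytokine Signaling in Immune system": "Immune System",
    "Neutrophil degranulation": "Innate Immune System",
    "Innate Immune System": "Immune System",
}
child_pvalues = {
    "Interferon gamma signaling": 0.004,
    "Neutrophil degranulation": 0.01,
}
pooled = parent_significance(child_pvalues, parent_of)
```

In this toy case both enriched pathways pool under the same parent, which takes the smaller of the two p-values.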
Functional enrichment analysis of the SES-differentially expressed genes and the upstream regulators was performed using the ClueGO 46 plugin in Cytoscape 47. This plugin allows for the combined analysis of multiple gene lists using a preselected ontology. We analyzed the SES-differentially expressed genes (up- and downregulated gene lists) along with their upstream regulators (transcription factors and protein neighbors) with the Reactome ontology.

Identifying key controllers of genes exhibiting differential expression by SES
Upstream regulators of the SES-DEG (Set A; see Supplementary Figure S5) were categorized into (i) transcription factors that are themselves differentially expressed (Set B; see Supplementary Figure S5), (ii) protein neighbors of the transcription factors that are differentially expressed (Set C; see Supplementary Figure S5), and (iii) transcription factors that putatively modulate the expression of differentially expressed genes (Set D; see Supplementary Figure S5). Marbach et al. 48 constructed tissue-specific regulatory networks that linked transcription factors and genes with a score based on a curated collection of sequence binding motifs. Those transcription factors that had a medium or greater confidence (> 0.4) of modulating the expression of the differentially expressed genes in blood tissue were included in the set of upstream regulators (Sets B and D). Protein neighbors of differentially regulated transcription factors were obtained using the STRING database 49. Each protein-protein interaction (PPI) in STRING is annotated with a score that indicates the confidence of the interaction. Only neighbors with scores of at least high confidence (> 0.7) were included in the set of upstream regulators (Set C). Thus, Set A represents the DE genes, and Sets B, C and D together constitute their upstream regulators.
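One way to read the set construction above is sketched below. It assumes that regulatory and PPI edges are available as (source, target, score) tuples with scores on a 0 to 1 scale, that Set B is the subset of qualifying transcription factors that are also in Set A, and that Set D is the remainder; these readings, the function name, and all identifiers in the toy data are hypothetical.

```python
TF_THRESHOLD = 0.4   # medium confidence, as stated in the text
PPI_THRESHOLD = 0.7  # high confidence, as stated in the text

def upstream_sets(de_genes, tf_edges, ppi_edges):
    """Derive Sets B, C and D from a set of DE genes (Set A).

    tf_edges:  iterable of (tf, target_gene, score) in the blood network
    ppi_edges: iterable of (protein_1, protein_2, score) from a PPI source
    """
    set_a = set(de_genes)
    regulators = {
        tf for tf, gene, score in tf_edges
        if gene in set_a and score > TF_THRESHOLD
    }
    set_b = regulators & set_a   # TFs that are themselves DE
    set_d = regulators - set_a   # TFs that putatively regulate DE genes
    set_c = set()
    for p1, p2, score in ppi_edges:
        if score > PPI_THRESHOLD:
            if p1 in set_b:
                set_c.add(p2)
            if p2 in set_b:
                set_c.add(p1)
    set_c -= set_b               # neighbors only, not the TFs themselves
    return set_b, set_c, set_d

# Hypothetical toy network.
de = {"GENE1", "GENE2", "TF_X"}
tf_edges = [("TF_X", "GENE1", 0.6), ("TF_Y", "GENE2", 0.5), ("TF_Z", "GENE1", 0.2)]
ppi_edges = [("TF_X", "PROT_P", 0.9), ("TF_X", "PROT_Q", 0.5), ("PROT_R", "TF_Y", 0.95)]
set_b, set_c, set_d = upstream_sets(de, tf_edges, ppi_edges)
```

Here TF_Z is dropped by the 0.4 cut-off, PROT_Q by the 0.7 cut-off, and PROT_R is excluded because its partner TF_Y is not differentially expressed.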

Possible mediators of SES and DE genes and upstream regulators
We examined behavioral and psychobiological processes that might mediate associations between Wave V young adulthood SES and the expression of the genes and upstream regulators using a counterfactual mediational framework 50. The mediators included body mass index (BMI), perceived stress (based on Cohen's Perceived Stress Scale 51), current self-reported smoking status, consumption of alcoholic drinks (days drank over past 30 days; categorized as 0 drinks, 1-2 drinks, 3-5 drinks, and more than 5 drinks per occasion), financial stress (self-reported difficulty in paying bills), and access to health insurance.

Randomization test of differentially expressed genes and upstream regulators
We quantified the statistical significance of the observed results by performing randomization tests based on 1000 randomly generated sets of differentially expressed genes. Random samples were drawn from the entire genome to obtain a set of genes equal in number to Set A (see Supplementary Figure S5). Sets B, C, and D were derived from every randomly generated Set A using the same procedure used with the SES-DE genes. We then computed the significance (empirical p-value) of every actual gene in Set A (DE genes) and Sets B, C, and D (upstream regulators) 52 by comparing it to the 1000 randomly generated sets. We obtained p-values for each of the sets of genes by combining the p-values of every gene in the set using Fisher's method.
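The two ingredients of this test, an empirical p-value and Fisher's method for combining p-values, can be sketched as follows. The per-gene statistic being compared is not specified in the text, so `observed` and `null_values` are generic placeholders; the add-one correction and the function names are likewise assumptions. Fisher's statistic is X = -2 Σ ln p, which follows a chi-square distribution with 2k degrees of freedom under the null, and for even degrees of freedom its upper tail has the closed form used here.

```python
from math import exp, log

def empirical_p(observed, null_values):
    """Share of random draws at least as extreme as the observed value.

    The add-one correction keeps the p-value strictly above zero.
    """
    hits = sum(1 for v in null_values if v >= observed)
    return (hits + 1) / (len(null_values) + 1)

def fisher_combined_p(pvalues):
    """Combine k independent p-values with Fisher's method.

    Uses the closed-form chi-square survival function for 2k degrees
    of freedom: exp(-x/2) * sum_{i<k} (x/2)^i / i!, with x = -2*sum(ln p).
    """
    k = len(pvalues)
    half_x = -sum(log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return exp(-half_x) * total

# Hypothetical toy values.
p_gene = empirical_p(5, [1, 2, 3, 6])
p_set = fisher_combined_p([0.05, 0.05])
```

Fisher's method assumes the combined p-values are independent, which is an approximation when genes in a set are co-regulated.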

Transcriptional alterations with SES are characterized by organism-wide dysregulation
We performed a differential gene expression analysis followed by an enrichment analysis of the resulting SES-differentially expressed genes (see Supplementary Dataset S2 and Supplementary Figure S3). Upregulated genes indicate a significant association between high SES and high expression. Functional enrichment of the SES-differentially expressed genes (423 upregulated genes and 389 downregulated genes) showed that the majority were upregulated in pathways involving metabolism, signal transduction and cellular response to stress, driven by a core of ribosomal and translational genes. Interestingly, these cytosolic ribosomal genes (RPL and RPS genes) were found to be downregulated with aging in an analysis of human peripheral blood and were previously linked with SES 35. Indeed, a combined WGCNA and SES-differential expression analysis (see Fig. 1) showed a tight clustering of the ribosomal and transcriptional activity genes (Cluster 11 in Fig. 1) that are responsible for the SES-upregulated pathways. One cluster of SES-DEGs (Cluster 7) displayed dysregulation in immune system and response, hemostasis and cell death that was predominantly driven by downregulated genes, while another cluster (Cluster 11), largely comprising upregulated ribosomal genes, affected transcriptional events in several cellular functions. Cluster 13 consisted of genes involved in cell division and cell cycle control, dysregulating a relatively small number of pathways in signal transduction and the immune system, while Cluster 17 comprised too few genes for a meaningful enrichment interpretation.
An inspection of enriched pathways reveals that SES-upregulated pathways include interferon innate immune response and neutrophil degranulation (see Fig. 2). Curiously, type II interferon (IFN-γ) signaling is also upregulated. Despite the lack of direct evidence for their involvement, the genes that are tied to the upregulation of type I and type II interferon signaling do share HLA genes that regulate the antiviral immune response, which may explain the overrepresentation of both classes of immunity. Figure 2 also shows an attenuation of proinflammatory pathways with higher SES (i.e., upregulation of proinflammatory pathways with low SES) via pathways in the proinflammatory nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) and proinflammatory toll-like receptors (TLR).

Upstream regulators structure immune dysregulation with socioeconomic disparities
The biological pathways and molecular mechanisms associated with socioeconomic disparity were also examined via an analysis of upstream regulators (transcriptional factors and protein partners) of the DE genes. A combined functional enrichment analysis of the resulting upstream regulators and SES-differentially expressed genes (see Fig. 3 and Supplementary Dataset S3) indicates a significant role of the upstream regulators in structuring the overrepresented pathways in the immune system. The upstream regulators include tissue-specific transcription factors that are (i) differentially expressed (Set B, see Supplementary Figure S5), (ii) potentially regulating the expression of a differentially expressed gene (Set D, see Supplementary Figure S5), and (iii) PPI neighbors of differentially expressed transcription factors (Set C, see Supplementary Figure S5). Figure 3 indicates that the immune pathways that are enriched have a larger proportion of upstream targets. Importantly, many of these pathways were also enriched by SES-DEG, signifying that the patterns of immune dysregulation observed in Fig. 3 do not reflect the inclusion of upstream regulators per se, but rather reflect the significant mechanistic role played by these transcription factors and protein partners.
Figure 4 shows the upstream regulators along with the SES-DEG represented as layers (4 in total, where the innermost circle of genes and upstream regulators is labelled Layer 1, proceeding sequentially to the outermost group of genes and regulators, labelled Layer 4) based on their interaction scores derived from the STRING database and the number of times each gene or upstream regulator is involved in the functional immune pathways that are enriched in Fig. 3. The genes that are responsible for the enrichment of each immune pathway were identified. The innermost layer in Fig. 4 depicts genes and upstream regulators that are involved in more than ten functional pathways, whereas the outermost layer consists of genes and upstream regulators that are involved in only one enriched biological process. The most essential regulators involved in the dysregulation of the immune system related to SES disparities thus occupy the center of the diagram. Significantly, variations in young adult SES are prominently linked to alterations in cytokine signaling in the immune system involving interleukin and interferon gamma signaling, the Toll-like receptor signaling cascade, and TNF pathways (Fig. 3 and innermost layer in Fig. 4).
Proteins most deeply linked to SES inequalities invariably revolve around the cyclic 3'-5' adenosine monophosphate response element-binding protein (CREB) and NF-κB pathway signaling. Variations in the activity of CREB and NF-κB selectively upregulate the transcription of the interferon response factor family while simultaneously inhibiting the activity of proinflammatory interleukins in subjects with high SES. Genes and transcription factors (via the upstream analysis) that are central to the functional response to SES disparities in functional gene regulation (shown in Fig. 3) are also shown in Fig. 4. Molecules are placed in layers depending on their contribution (instances of enrichment of a pathway) to the functional immune enrichment. The innermost layer consists of transcriptional factors such as CREB1, TP53, RELA, REL, BTRC, BTK and the MAPK family. CREB and REL proteins, among other important functions, play a crucial role in the activation of the fight-or-flight signaling pathways that are directly responsible for eliciting the CTRA gene expression profiles. Although a large fraction of the proteins is derived from the set of upregulated genes, it is noteworthy that these regulators can have far-reaching impact and that their effects are not always in the expected direction. This is particularly true for the proinflammatory toll-like receptor (TLR) pathways (see Fig. 3), which have a larger proportion of downregulated genes than upregulated genes. However, they also functionally interact with the upstream regulators connected to upregulated genes. The greater influence of the upstream regulators connected to upregulated genes suggests that the downregulation of certain genes could transpire as a consequence of inhibitory activity of the transcription factors.

Social/behavioral mediators of SES-transcriptome associations
Figure 5 reports the median percentage mediated ratio for key SES-related social/behavioral processes in every layer of SES-related gene regulation. Intriguingly, the innermost group of genes, which are most centrally implicated in SES-associated dysregulation, are also mediated the least (lowest median percentage mediated ratio) by every behavioral risk factor. However, this could reflect the fact that the innermost group of genes are not themselves differentially expressed despite potentially inducing larger mediated changes in the outer layers of genes. Body mass index presented the strongest explanation of the association between the transcriptional response of the gene groups and SES, followed by smoking tobacco (also see Supplementary Figures S6 and S7). No significant mediation was observed for financial stress or access to health insurance.
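The summary statistic reported here can be sketched in a few lines. The figure legend defines the percent mediated ratio as the average causal mediation effect (ACME) divided by the total effect; the sketch below takes per-gene (ACME, total effect) pairs as given, since estimating them requires the fitted mediator and outcome models. The function names and the toy estimates are hypothetical.

```python
from statistics import median

def percent_mediated(acme, total_effect):
    """Percent of the total SES effect carried by the mediator."""
    return 100.0 * acme / total_effect

def median_percent_mediated(estimates):
    """Median percent mediated across the genes in one layer.

    estimates: iterable of (acme, total_effect) pairs, one per gene.
    """
    return median(percent_mediated(a, t) for a, t in estimates)

# Hypothetical per-gene estimates for one layer and one mediator.
layer_estimates = [(0.02, 0.10), (0.01, 0.10), (-0.005, 0.10)]
layer_median = median_percent_mediated(layer_estimates)
```

A negative ratio arises when the ACME and the total effect have opposite signs, which is the situation usually described as suppression.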

Discussion
The present analyses expand the scope of prior studies of SES-related alteration in transcriptomic profiles of human immune response genes by identifying new genomic functional impacts (e.g., ribosomal biology) and new features of the gene regulatory architecture of SES (e.g., TP53, BTRC, BTK and MAPK transcriptional control pathways). Consistent with prior research, we find that high SES is negatively associated with pro-inflammatory pathways in the nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) and proinflammatory toll-like receptors (TLR). Our findings also link high SES to elevated type II interferon (IFN-γ) signaling and identify a related upregulation of HLA genes, which may underpin both type I and type II interferon effects.
The analyses extend previous research by mapping networks of upstream blood-specific transcriptional factors and protein interactions that could play a vital role in structuring the observed transcriptional landscape. The complexity of the impact of socioeconomic inequalities on the immune system and its association with diseases with widely varying pathologies warrants a systems-oriented approach to comprehensively analyze the SES-related perturbations in the immune transcriptome. To our knowledge, most prior research on SES focuses solely on transcriptomic alterations, with a particular focus on proinflammatory action 29. Here, we include the upstream regulators to depict an enhanced view of dysregulation with SES, shedding light on a tightly knit group of transcription factors that play a central role in modulating the transcriptomic alterations.
Our findings map a central network of upstream regulators that vary as a function of young adult SES (central positions in Fig. 4). Given the lack of change in the expression of the genes that encode these transcriptional factors, receptor-mediated post-translational modification of these transcriptional factors (e.g., receptor-mediated phosphorylation of CREB1, TP53, RELA, REL, BTRC, BTK and MAPK) appears to modulate the expression of their downstream targets.
Despite the congruence between our results and the CTRA model in terms of dysregulated pathways, the findings show that SES disparities in young adulthood, interestingly, do not alter the same set of genes. For example, the attenuated interferon innate response in the CTRA model is possible through the direct suppression of IFNA and IFNB. Here, we observed an upregulation of HLA genes that additionally regulate the antiviral innate response. These findings call for further study of SES-modulated genes in the immune system beyond the signaling pathways already implicated in structuring the conserved transcriptional response to adversity (CTRA) RNA profile 53.
The observed SES-related transcriptional perturbations are associated with both up- and downregulated functional pathways in the immune system. Social stress-induced gene alterations in humans are associated with diseases that include both upregulation of the immune system as well as suppressed immune responsiveness with lowered social status. This seemingly perplexing behavior has been explained by the selective characteristics of the immune transcriptome, i.e., the increase in expression of certain pro-inflammatory genes and the repression of groups of antiviral immune response genes [54][55][56][57]. It is, therefore, unsurprising that a similar pattern of dysregulated immune pathways is emergent in the blood transcriptomic landscape of subjects in Add Health, with contributions of enrichment from both up- and downregulated gene clusters. Such a finding calls for additional research that moves beyond the initial general finding that stressors upregulate proinflammatory genes and downregulate antiviral genes.
Lastly, among the common mediating (or possible explanatory) mechanisms studied here, BMI consistently emerged as a plausible mediator of the SES associations with immune cell gene regulation. Smoking also appears to play a significant role in the SES-related transcriptional alterations. These results underscore the importance of a gene regulatory network approach in formulating a comprehensive understanding of psychosocial stressors and their impact on biological mechanisms early in life. Studies have already established that changes caused by socioeconomic disparities in young adulthood could have far-reaching implications for chronic conditions in later adulthood. Identification of novel regulators of such perturbations is an important step in formulating a mitigating strategy.
We used tissue-specific regulatory networks to link transcription factors to differentially expressed genes, and subsequently the STRING database to find protein interaction partners. There were 643 transcription factors identified, with a smaller number (304) having a confidence score above the threshold used (0.4). Given the central role of many of the transcription factors and protein partners, one could argue that the set of upstream regulators found from the SES-DE genes could be equal to a set of upstream regulators derived from a random set of genes. To examine this possibility, we performed randomized trials by starting with randomly selected sets of "DE" genes and then deriving the upstream regulators of the random sets. We subsequently compared the upstream regulators of each random set of "DE" genes to our observed results (see Supplementary Figures S8 and S9). While some of the individual transcription factors may not reach statistical significance (Supplementary Figure S8), the entire set of the upstream regulators is highly significant (Supplementary Figure S9).

Limitations
Several limitations are noteworthy. First, the hypotheses and the subsequent results are driven by the mRNA abundance data collected once from every participating subject. The repeated collection of transcriptomic data would be essential to address their highly transient nature, which is likely associated with considerable noise. Second, the identification of upstream transcriptional regulators of gene expression is performed with the aid of tissue-specific gene regulatory networks. These networks link genes to transcription factors based on experimental evidence and assign a confidence score to every identified transcription factor. It is, therefore, possible that transcription factors that play a central role in cell maintenance and cell cycle may be implicated without having a substantive role in the etiology or progression of dysfunction. We tried to account for such effects using a randomization experiment. However, direct measurements of protein abundance are required to concretely determine the role of every transcription factor.
Nevertheless, results suggest that a network of transcription factors and protein partners plays a pivotal role in modulating the SES-related transcriptional response that precipitates dysregulated immune system response in terms of inflammation and interferon innate immunity. These central actors are important targets for future research connecting health disparities and socioeconomic inequalities. The results thus highlight the need for system-oriented analyses to comprehensively map the biological impact of SES disparities and represent an important step in identifying targets for possible mitigating strategies.
Enrichment analysis of the WGCNA clusters exhibiting significant cluster-SES relationships. Eigengenes from individual clusters were identified using WGCNA and then examined for significant associations with SES (adjusted p < 0.05). Enriched biological pathways for significant clusters (adjusted p < 0.05) were then examined using Reactome. Here, upregulation refers to a significant association between high [...] the SES differentially expressed genes. The size of the circle signifies the number of genes that contribute to the significant enrichment in a pathway, and the color of the circle indicates Cramer's V, a measure of the magnitude of association. The overrepresentation analysis shows a predominantly upregulated immune response and downregulation in metabolism and cellular stress response associated with high SES. A majority of the downregulated pathways are driven by a cluster of ribosomal genes that affect cellular transcription.
An onion diagram depicting the importance of the different gene classes to the immune system dysregulation by young adult SES. The figure shows the upstream regulators along with the SES differentially expressed genes represented as layers (4 in total). The innermost layer represents genes and upstream regulators that are involved in the enrichment of more than 10 functional pathways (in Figure 3) and thus are pivotal in immune dysregulation, whereas the outermost layer consists of genes and upstream regulators that are involved in only 1 enriched biological process.
Mediation analysis for common behavioral risk factors and the gene classes identified to be essential toward the SES-induced immune system dysfunction. The median percent mediated ratio (ACME/total effect) is shown for mediational models for the risk factors and the layers of SES-DE genes and upstream regulators in Figure 4. All the mediational models were significant (aggregated adjusted p-value < 0.05) (also see Supplementary Figures S6 and S7). Negative values suggest a pattern of suppression.