Disentangling race, environment and the microbiome in a study of preterm birth risk

Previous studies have investigated the associations between the vaginal microbiome and preterm birth (PTB), with the aim of determining whether differences in community patterns meaningfully alter risk and could therefore be the target of intervention. We report on vaginal microbial analysis of a subset of the Pregnancy, Infection, and Nutrition (PIN) Study, a prospectively enrolled cohort of women in central North Carolina between 1995 and 2001. We selected a nested case-control subset of this cohort, including 464 White women (375 term birth and 89 spontaneous PTB, sPTB) and 360 Black women (276 term birth and 84 sPTB). Microbial DNA was extracted from genital tract swabs collected mid-pregnancy and subjected to 16S rRNA taxonomic profiling. We found that microbial community structure is associated with race and sPTB, although the influence of race is stronger than the influence of sPTB. The microbiome of Black women has higher alpha-diversity, higher abundance of Lactobacillus iners and lower abundance of Lactobacillus crispatus. These differences were obscured once maternal douching behavior was considered: specifically, among women who douche, there were no significant differences in microbiome by race. The sPTB-associated microbiome exhibited a lower abundance of L. crispatus, while alpha diversity and L. iners were not significantly different. Associations between the microbiome and sPTB were only significant in women who do not douche. While race was a strong predictor of microbial community structure, we also observed strong intercorrelations between a range of maternal factors, including poverty, education, marital status, age, douching and race, with microbiome effect sizes in the range of 1.8-5.2% in univariate models. Therefore, race may simply be a proxy for other socially driven factors that differentiate microbiome community structures. Future work will continue to refine reliable microbial biomarkers for preterm birth across diverse cohorts.


Introduction
Over 10% of all pregnancies in the US are preterm, almost 16% among Black women 1 . Intrauterine infection is widely speculated to underlie some portion of preterm deliveries; however, the benefit of prophylactic antibiotic therapy for the prevention of preterm birth (PTB) is not universal, and appears to depend at least in part on timing of treatment in pregnancy, route (oral versus IV) of antibiotic administration, and clinical presentation (e.g., preterm premature rupture of membranes versus intact membranes) [2][3][4][5] , which may reflect etiologic heterogeneity among preterm births. One pathway through which pathogenic microorganisms may gain access to the amniotic cavity is by ascending from the vagina and the cervix 6 .
The vaginal microbiome represents a physiological barrier to this route, generally through "healthy" bacteria (mostly but not always of the Lactobacillus genus) producing lactic acid and lowering the vaginal pH 7 . However, there is a tremendous diversity in the species of Lactobacillus present in the vagina, and these species may produce varying levels of lactic acid and have different tolerance for anaerobic members of the microbial community 8 . Moreover, some relatively common vaginal microbiome profiles contain few Lactobacillus spp. 9,10 . Vaginal microbiomes lacking Lactobacillus species (spp.) of any type tend to have higher pH, more taxon diversity, and often prominently represent organisms that make up the bacterial vaginosis (BV) diagnosis by Nugent Score 7,10,11 . This profile, which is associated with adverse outcomes, tends to occur more frequently in Black women [10][11][12] .
In a recent large study, Fettweis et al. described vaginal microbial community differences among 1,268 African American and 416 European American women 11 . They report significant differences in the microbiome profiles between these groups, with African American women having greater microbial diversity and dominance by BV-associated organisms. In a longitudinal study of pregnant women, Fettweis et al. found that the abundance of L. crispatus was reduced in PTB groups and the abundances of Candidatus Lachnocurva vaginae (BVAB1), Prevotella cluster 2, Sneathia amnii and several additional taxa were increased 13 . Interestingly, these signals were strongest earlier in gestation and were largely driven by samples from women of African ancestry. Callahan et al. analyzed the associations between vaginal microbiome and PTB in two racially distinct cohorts, one a low-risk, predominantly Caucasian cohort (n = 39) and the other a high-risk, predominantly African American cohort (n = 96) 14 . They found that L. crispatus but not L. iners was associated with low PTB risk in both cohorts. Lower abundance of L. jensenii and L. gasseri was associated with PTB only in the high-risk cohort, while higher abundance of G. vaginalis was associated with PTB only in the low-risk cohort. Kindinger et al. studied the vaginal microbiome in a cohort of women at risk of preterm birth (total n = 161: Black n = 30, Caucasian n = 104, and Asian n = 27) and found that L. iners dominance is associated with higher PTB risk and L. crispatus is associated with lower PTB risk 15 . However, no significant difference was identified when stratified by race, perhaps due to the small sample size.
The present literature appears to support that Black women often have more diverse vaginal microbial community patterns 10,11 and that women with higher abundance of L. crispatus have overall lower risk of PTB [13][14][15] . What is not yet clear, given the known racial disparity of PTB 16,17 , is whether microbial community patterns are associated with PTB independent of racial differences in microbial community structure. To answer this question, we utilized the Pregnancy, Infection, and Nutrition (PIN) Study, a prospectively enrolled pregnancy cohort of women in central North Carolina, to investigate the relationship between second trimester vaginal microbial community patterns and PTB, and to disentangle differences in association by race. This cohort of low-risk women represents the ideal setting to answer this question given the rich social, behavioral, demographic and clinical data assembled that characterize in detail known determinants of PTB 18 .

Study Population
The Pregnancy, Infection and Nutrition (PIN) study enrolled pregnant women with singleton pregnancies in central North Carolina from August 1995 to February 2001. Women were recruited from prenatal clinics at the University of North Carolina Hospitals, Wake County Human Services and the Wake Area Health Education Center. Eligibility criteria included gestational age at enrollment between 24 and 29 weeks, ability to communicate in English, age 16 years or older, access to telephone, and plans to deliver at the recruitment site 19 . This study was approved by the Institutional Review Board at UNC Chapel Hill.
At enrollment, women provided blood, urine and genital tract specimens. They were also randomly assigned to a "subcohort" for future nested case-cohort studies intending to employ detailed biological measurements that would be infeasible to conduct in the entire population. Because women were randomly assigned to this subset irrespective of pregnancy outcome, this subcohort should reflect the exposure distribution in the cohort as a whole. In the two weeks following enrollment, women completed a telephone interview that collected information on social, demographic, medical and behavioral risk factors for adverse pregnancy outcomes. In total, 3,163 women were recruited into the PIN study during this period. The current study was restricted to spontaneous preterm birth cases and subcohort members who self-classified as Black or White race (n = 824), including 375 White women with term birth, 89 with spontaneous preterm birth, 276 Black women with term birth and 84 with spontaneous preterm birth. Complete selection criteria for the current analysis are described in Fig. S1.

Preterm Birth Clinical Presentation
Gestational age at delivery was assigned by early ultrasound (completed prior to 22 weeks' gestation) in 90% of the population, or by last menstrual period date if ultrasound was unavailable 20 . Preterm birth was defined as < 37 completed weeks' gestation. Preterm clinical presentation was determined by obstetrician review, and classified as preterm labor (PTL), preterm premature rupture of amniotic membranes (PPROM) in which membranes ruptured four or more hours before the onset of labor, or medically indicated. For the current study, we combined PTL and PPROM into a single clinical presentation of spontaneous preterm birth (sPTB). Among sPTB cases, gestational ages varied from 26 to 36 completed weeks.

Covariates
We selected covariates for consideration as predictors of the microbiome or confounders of the microbiome-PTB association based on prior literature. These included maternal age at enrollment, maternal education, marital status, pre-pregnancy weight and height to calculate body mass index (BMI), parity, any smoking during pregnancy, maternal household percent of poverty based on the 1996 census, douching before pregnancy, maternal self-reported depressive symptoms, and the number of negative life events. Maternal depressive symptoms were measured based on the Center for Epidemiological Studies Depression scale (CES-D) 21 . Negative life events were assessed by a modified Life-Events Inventory (LEI) 22 .

DNA extraction and Sequencing
Swabs were collected between 24 and 29 weeks' gestation from the posterior vaginal apex, and stored at -70 °C. For the current study, swabs were thawed on ice and processed essentially as previously described 13 . In brief, DNA was extracted using the PowerSoil DNA Isolation Kit (Qiagen), eluted in 100 µL water and quantified using PicoGreen. Extracted DNA was amplified with barcoded primers targeting the V1-V3 hypervariable regions of the bacterial 16S rRNA gene using protocols established in the Vaginal Human Microbiome Project at VCU 11 . Samples were multiplexed (384 samples/run) using a sample-specific dual-index strategy and sequenced on Illumina MiSeq sequencers (2 x 300 base paired-end protocol). The paired-end quality-aware raw sequence files were demultiplexed into sample-specific data, then merged and quality-filtered using MeFiT 23 . Samples with fewer than 1,000 high-quality reads were excluded. The sequences generated can be accessed at NCBI with BioProject ID PRJNA694098.

Bioinformatics and Statistical Analysis Approach
Comprehensive 16S rRNA gene-based taxonomic survey of the vaginal microbial profiles yielded a mean count of 43,276 reads/sample with minimum and maximum read counts of 1,824 and 186,784, respectively. Over 99.9% of the high-quality single reads generated overlapping paired-end reads. High-quality sequences were assigned species-level taxonomy for vaginal samples using STIRRUPS 24 , an analysis platform that employs the USEARCH algorithm 25 combined with a curated 16S rRNA sequence database. Paired reads which did not align to the same reference sequence were discarded as chimeras. Analyses with DADA2 and SILVA 132 release were used as alternative pipelines.
The PCoA ordinations were calculated based on the Bray-Curtis dissimilarity between samples with function 'capscale' and visualized with 'ordiplot' in R package 'vegan'. PERMANOVA tests were used to analyze the associations between microbiome and host factors with function 'adonis' in the same package. Shannon index was used to calculate the alpha-diversity of microbial communities.
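The two quantities named above have simple closed forms, and a minimal Python sketch may make them concrete. This is an illustration only, not the authors' R code: the function names are hypothetical, the natural logarithm is an assumed choice of base for the Shannon index, and the text does not state whether counts or relative abundances were the input.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero abundance.

    Assumption: natural logarithm; `counts` is one sample's taxon abundances.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i).

    `u` and `v` are abundance vectors over the same ordered list of taxa.
    """
    if len(u) != len(v):
        raise ValueError("samples must cover the same taxa in the same order")
    denom = sum(u) + sum(v)
    if denom == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denom
```

A sample dominated by a single taxon gives a Shannon index near 0, and two samples that share no taxa give a Bray-Curtis dissimilarity of 1.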
Two methods were applied to determine the vaginal microbial community states or vagitypes based on the taxonomic composition. First, vagitypes were determined by the dominant species with a relative abundance > 30%. The microbiome was characterized as 'no type' when the relative abundance of all species was lower than 30%. We also used the hierarchical cluster analysis with R function 'hclust' to confirm the existence of vagitypes. The associations between preterm/term birth, vagitypes and host factors were determined with Fisher's exact test. The associations between species and preterm/term birth, race and douching were primarily analyzed with Wilcoxon tests. P-values were adjusted for multiple testing using the Benjamini-Hochberg method 26 .
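The dominance rule and the multiple-testing adjustment described above can be sketched in a few lines of Python. This is a hedged illustration, not the study's R scripts: the function names and the dictionary input format are assumptions, and the 30% threshold is taken from the text.

```python
def assign_vagitype(rel_abund, threshold=0.30):
    """Return the dominant taxon if its relative abundance exceeds `threshold`.

    `rel_abund` maps taxon name -> relative abundance as a fraction (assumed format).
    Returns 'no type' when no taxon exceeds the threshold.
    """
    if not rel_abund:
        return "no type"
    taxon, top = max(rel_abund.items(), key=lambda kv: kv[1])
    return taxon if top > threshold else "no type"


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, a sample with 60% L. iners would be labelled with the L. iners vagitype, while a sample split evenly across four taxa would be labelled 'no type'.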

Maternal characteristics, vaginal microbiome and spontaneous preterm birth
The vaginal microbiome profiles differed significantly by maternal race and spontaneous preterm birth (Fig. 1a and b). PCoA ordination of the microbiome showed a separation of the 95% confidence limits of Black and White women (Fig. 1a), with a PERMANOVA R² of 1.8%. However, PCoA ordination of the microbiome showed only a modest separation of the 95% confidence limits of spontaneous preterm and term birth groups (Fig. 1b), with a relatively small PERMANOVA R² (0.45%). A number of maternal features were significantly associated with the vaginal microbiome, including percent of poverty level, years of education, marital status, age at mid-pregnancy, douching, self-reported depression, negative life events, and parity (Fig. 1c). These features are intercorrelated and differentially distributed by maternal race (Fig. S2); therefore, identifying the underlying causal attribute is challenging. Because of the existing literature documenting differences in community patterns across racial and ethnic populations 27 , we orient our results according to maternal self-reported race; however, these patterns likely reflect a complex interplay between social and environmental factors for which race is a marker but not a causal factor. The microbiome of Black women has higher alpha-diversity, higher abundance of L. iners and lower abundance of L. crispatus (Fig. 1d). The spontaneous preterm birth associated microbiome has lower abundance of L. crispatus, while alpha diversity and L. iners were not significantly different (Fig. 1e).

Vagitypes, spontaneous preterm birth and race
The taxonomic composition suggested that most of the vaginal microbiomes were dominated by a single taxon, with the most prevalent species being L. iners, L. crispatus, L. gasseri, L. jensenii, Lachnospiraceae member BVAB1 (BVAB1 has been named provisionally: "Candidatus Lachnocurva vaginae") and Gardnerella spp. (Fig. 2a). Because of the discrete community structures of the vaginal microbiome, we classified the microbiome into vagitypes based on the dominant species with relative abundance > 30% following previously reported methods 13 . The PCoA ordination of the microbiome showed that different vagitypes generally formed distinguishable clusters, especially those dominated by L. iners and L. crispatus (Fig. 2b). The same PCoA ordination colored by race and spontaneous preterm birth showed that the microbiomes of White women, especially those who went on to experience term birth, are more likely to fall in the L. crispatus cluster (Fig. 2c).
To determine whether the vagitypes were associated with spontaneous preterm birth, we calculated and compared the percentage of spontaneous preterm birth cases in each vagitype (Fig. 2d). In this analysis, the vagitypes dominated by non-Lactobacillus were grouped as Others. The Lactobacillus vagitypes except L. iners and L. crispatus were grouped as Lacto_other in order to simplify the model and for sample size considerations. We found that the percentage of spontaneous preterm birth cases with the L. crispatus vagitype was significantly lower than in the other three types, with a spontaneous preterm birth percentage of 13% compared to 22%, 25% and 26% for L. iners, Lacto_other and Others respectively (Fisher's exact test) (Fig. 2d). Because this study oversampled the underlying cohort for preterm cases, the percentage of preterm birth above does not reflect the underlying risk in the population. When the oversampling of preterm birth was accounted for, the risk of spontaneous preterm birth across vagitypes followed the same pattern: 3.5%, 5.9%, 8.8% and 9.1% for vagitypes L. crispatus, L. iners, Lacto_other and Others respectively. The Shannon diversity of the L. crispatus cluster was significantly lower than that of the other three vagitypes (Fig. 2e).
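One common way to account for oversampling of cases is inverse-probability weighting: each sampled woman is weighted by the reciprocal of her sampling fraction before the risk is computed. The sketch below illustrates that idea only; it is not necessarily the procedure used in this study, the function name is hypothetical, and the sampling fractions in the usage note are invented for illustration because they are not reported in this section.

```python
def population_risk(n_cases, n_noncases, case_frac, noncase_frac):
    """Inverse-probability-weighted risk within one vagitype.

    n_cases, n_noncases: sampled counts of sPTB cases and non-cases.
    case_frac, noncase_frac: assumed probabilities that a case / non-case
    from the full cohort was included in the sample (hypothetical inputs).
    """
    if not (0 < case_frac <= 1 and 0 < noncase_frac <= 1):
        raise ValueError("sampling fractions must be in (0, 1]")
    weighted_cases = n_cases / case_frac
    weighted_noncases = n_noncases / noncase_frac
    total = weighted_cases + weighted_noncases
    return weighted_cases / total if total > 0 else float("nan")
```

With purely hypothetical inputs, `population_risk(13, 87, 1.0, 0.25)` weights each non-case four times as heavily as each case, so a crude 13% case percentage shrinks to 13/361, or about 3.6%. When both fractions are equal, the function returns the crude proportion unchanged.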
We analyzed whether the percentages of vagitypes were different between Black and White women, and found Black women had a higher percentage of the L. iners vagitype and lower percentages of L. crispatus and Lacto_other vagitypes as compared to White women. We next examined whether maternal race modified the association between the L. crispatus vagitype and preterm birth and found no difference between Black and White women (Fisher's exact test, OR = 1.11, CI = 0.41-2.88). Likewise, there was no difference between Black and White women for L. iners (Fisher's exact test, OR = 0.94, CI = 0.56-1.57).

Microbiome, douching and spontaneous preterm birth
To better understand the relationship between maternal race, douching, and vaginal microbiome, we created 4 distinct groups for a subset of participants with douching information (n = 489): Black, No Douching (B_N, n = 78); Black, Douching (B_D, n = 110); White, No Douching (W_N, n = 199); and White, Douching (W_D, n = 102). PCoA ordination showed that the White non-douching group formed a separate cluster from the other three groups (Fig. 3a). This was supported by the PERMANOVA tests, which indicated that the White non-douching group is significantly different from the Black non-douching group (R² = 0.0284, P = 0.001), while the two douching groups were not significantly different from each other (P = 0.401). This is consistent with the vagitype composition of these 4 groups, with non-douching White women associated with a higher percentage of L. crispatus and Lacto_other vagitypes and a lower percentage of L. iners vagitypes (Fig. 3b). Additionally, the alpha diversity and the abundance of L. iners and L. crispatus (Fig. 3c) showed similarly that the White non-douching group was significantly different from the others (Wilcoxon test, P < 0.05).
At the individual species level, 23 taxa showed significant associations with race in the non-douching participants, but none were significant by race in the douching group (Fig. 3d). Likewise, 22 taxa were associated with douching in White participants, but no taxa were associated with douching in Black participants (Fig. 3e). Taken together, these data show that the association of douching and the microbiome was much stronger for White participants, while race was significantly associated with the variation of the vaginal microbiome only for non-douching participants.
We next examined the relationship of spontaneous preterm birth and the microbiome in the 4 groups. With PERMANOVA tests, the vaginal microbiome was significantly associated with spontaneous preterm birth only in the non-douching participants, and this held for both race groups, while the association was not significant for the douching participants (Fig. 3f). Moreover, compared to the PERMANOVA test for pregnancy outcomes without stratification by race and douching, the effect sizes (PERMANOVA R²) here are much larger, rising from 0.45% to 3% in Black women and 1% in White women.

Discussion
In this large, prospective pregnancy cohort, we analyzed the association between mid-pregnancy vaginal microbiome, race, and sPTB. We found that vaginal microbiomes were significantly associated with sPTB, race, douching and other maternal factors. Many of these maternal factors, like poverty, education, marital status, age, douching and race, have stronger associations with the vaginal microbiome than the vaginal microbiome has with sPTB (Fig. 1c). Consistent with previous studies 10, 27 , we found that the vaginal microbiomes of Black and White women were significantly different, with higher alpha diversity, higher abundance of L. iners and lower abundance of L. crispatus for Black women. The microbial difference between sPTB and term controls is mainly driven by a higher L. crispatus abundance in term controls, similar to previous reports 13,14 . Because of the strong intercorrelations between maternal factors such as race, poverty, education, marital status and douching (Fig. S2), we stratified the dataset by race and douching with the aim of uncovering potentially stronger sPTB microbial signatures that are independent of race and douching.
With the community state types assigned based on the most abundant taxon, the sPTB risk associated with the L. crispatus dominated community state is about 60% of that for the L. iners dominated microbiome (Fig. 2d, 13% and 22% in all participants, and 3.5% and 5.9% when oversampling of cases in this nested case-control design is accounted for). The alpha diversity of the L. crispatus dominated microbiome is significantly lower than that of the L. iners dominated microbiome, indicating that L. crispatus may suppress the colonization and development of a BV-like microbiome while L. iners does not. Compared to L. crispatus, vaginal microbiomes dominated by L. iners also more often shift towards a diverse community 28 . For example, L. iners enhances the adhesion of Gardnerella spp. to cervical epithelial cells, and Gardnerella spp. displaced adherent L. crispatus but not L. iners from epithelial cells 29 . Previous research also suggests that L. crispatus and Gardnerella were exclusive while L. iners and Gardnerella often coexist 14 . This may explain the higher sPTB risk associated with the L. iners dominated microbiome compared to L. crispatus in our population. The sPTB risk associated with community state "Lacto_other", dominated by other Lactobacillus species (mostly L. gasseri and L. jensenii/fornicalis/psittaci), is also significantly higher than that of the L. crispatus dominated community state, indicating that these species were also not as protective as L. crispatus.
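The "about 60%" figure can be checked directly from the percentages quoted above. The short Python sketch below does only that arithmetic; the function name is hypothetical and no confidence interval is attempted, since the underlying counts are not given in this section.

```python
def risk_ratio(risk_group, risk_reference):
    """Ratio of two risks (or case percentages) expressed on the same scale."""
    if risk_reference <= 0:
        raise ValueError("reference risk must be positive")
    return risk_group / risk_reference


# Percentages quoted in the text: L. crispatus versus L. iners vagitypes.
crude_ratio = risk_ratio(0.13, 0.22)        # case percentages in the sampled data
weighted_ratio = risk_ratio(0.035, 0.059)   # after accounting for oversampling
print(round(crude_ratio, 2), round(weighted_ratio, 2))
```

Both ratios come out at roughly 0.59, which is the sense in which the risk for the L. crispatus dominated state is about 60% of that for the L. iners dominated state, whether or not oversampling is accounted for.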
With a similar number of Black and White participants, we analyzed the associations between microbiome and the sPTB risk in each race separately and found that the risks of sPTB associated with L. crispatus and L. iners are similar for Black and White women (Fig. 2g). Although at the US population level Black women have substantially higher risk of PTB 17 , in the PIN study specifically, Black race is only marginally associated with PTB (OR 1.3, 95% CI 1.0, 1.6) 18 . Our findings, that race does not modify the association between L. crispatus and L. iners and sPTB, may suggest that the disparity in PTB rates at the population level may in part be due to the lower prevalence of L. crispatus dominated microbiome among Black women. While the associations between L. crispatus, L. iners, and sPTB are independent of race, we could not verify whether risks associated with other community patterns are also consistent between races, because of their relatively low prevalence among participants. Future studies with a larger number of participants are needed to further investigate other taxa.
Douching is often associated with BV 30-32 , although it is difficult to determine whether douching increases the risk of BV or BV leads to douching. In this study, we found that douching played an important role in the structure of the vaginal microbiome. Among women who did not douche, Black and White women had different microbiomes. Specifically, White women had a notably higher abundance of L. crispatus, lower abundance of L. iners and higher abundance of other Lactobacillus species. However, among women who did douche, the microbiomes of Black and White women were similar, and featured lower abundance of L. crispatus and higher abundance of L. iners. It was reported that the genome of L. iners AB-1 contains genes that could contribute to its survival in an environment of fluctuating conditions, including Fe-S cluster protein for oxidative stress, alkaline shock and universal stress proteins 28 . L. iners also has a stronger ability to adhere to human fibronectin than other vaginal bacterial strains such as L. crispatus ATCC 3800 33 . Thus, it is possible that douching habits influence the L. iners dominated microbiome less than the L. crispatus dominated microbiome. At the same time, the percentage of sPTB cases was higher in douching groups in both Black and White women, although causality cannot be directly inferred. It is possible that a pre-existing dysbiotic state caused both douching behavior and sPTB, or alternatively, that douching disrupted the healthier L. crispatus dominated microbiome, which then shifted to a higher risk microbiome, ultimately leading to sPTB. Future studies with longitudinal vaginal tract sampling, and longitudinal information on douching behavior, would be required to disentangle these interconnected features. Regardless, our results suggest that douching behavior is significantly associated with the vaginal microbiome, and that race-related differences in vaginal microbiome are erased in the population of women who report douching.
In summary, in this prospective study of mid-pregnancy microbiome and sPTB in a well characterized cohort of Black and White women, we found that the vaginal microbiome of Black women was characterized by higher diversity, lower abundance of L. crispatus, and higher abundance of L. iners. These differences were obscured once maternal douching behavior was considered: specifically, among women who douche, there were no material differences in microbiome by race. Additionally, we found that women with microbiome dominated by L. crispatus had lower risk of sPTB, and women with microbiome dominated by L. iners had higher risk of sPTB, and these associations were the same for Black and White women. To our knowledge, this is the first study of the vaginal microbiome and sPTB to consider the influence of douching, and we found that douching has a significant influence on the vaginal microbiome that should be considered in future studies.
Finally, it is important to note that while we present differences in microbial community patterns by race to be consistent with the prior literature 10,11 , we observed strong inter-correlations across a number of maternal factors whose effects cannot easily be separated. These intercorrelated factors include race, poverty level, psychosocial stress, education, marital status, and maternal age. In this study, as in many other medical research studies, maternally self-classified race only crudely captures complex social determinants of health 34 , and thus disparities in microbial community patterns that we observe in relation to race may actually result from factors such as but not limited to her diet, her access to high quality medical care, her social support and life experiences, her psychosocial stress, and her experiences of discrimination. These pathways, which may explain differences in vaginal microbial community patterns by race, need further investigation and elucidation. Although this is one of the largest studies of the associations between vaginal microbiome and sPTB, it still lacks power for analyzing less abundant microbes and whether the combination of race or other social factors and douching influences the consistency of microbial signatures. Pooled studies across cohorts with similar metagenomics data may enable a more precise investigation of rare species as well as the influence of maternal factors that may explain or modify effects of vaginal microbiome on sPTB.

Declarations

Funding
This study was funded in part by grants from NIH/NIMHD (R01MD011504) and NIH/NIEHS (P30 ES010126). ER was supported by T32ES007018. The Pregnancy, Infection, and Nutrition study was supported by Grants HD37584, HD39373, and DK61981. The General Clinic Research Center was supported by the National Institutes of Health General Clinical Research Centers program of the Division of Research Resources Grant RR00046.

Ethics approval and consent to participate
This study was approved by the Institutional Review Board at UNC Chapel Hill.

Availability of data and material
The datasets analyzed in this study are available at NCBI with BioProject PRJNA694098. R scripts used in this study are available at GitHub (https://github.com/ssun6/PINmicrobiome). Additional requests and questions can be addressed to SS.

Authors' contributions
SME, AAF and GAB contributed to all aspects of the study, including conception, design, data acquisition, analysis, and supervision. MGS, JMF, PB, KL, ND, JT, AMSR contributed to acquisition of data and interpretation of results. SS, JMF, ER, AAS, ICB, MCW, AAF and SME contributed to analysis and interpretation of data. All authors contributed to writing, review, and/or revision of the manuscript, and approved the final manuscript.