Do Different Samples From Pregnant Women and Their Neonates Share the Common Microbiome: A Prospective Cohort Study

We characterized the microbiome composition from various samples in pregnant women and their neonates and evaluated the association between the distribution of microbiome and obstetric factors. A prospective study was performed in live births delivered at a single tertiary center between March 2020 and January 2021. Samples collected from pregnant women and their newborns included maternal cervicovaginal discharge (VD), amniotic fluid (AF), umbilical cord blood (CB), neonatal gastric liquid (GL), and meconium (M). Next-generation sequencing was used to analyze the composition of the microbiome from the samples. We identified 19,597,239 bacterial sequences from the 641 samples. The alpha diversity of the negative control group was slightly higher than that of the VD group. The samples were separated well based on the microbiome pattern unique to their body sites despite the significant batch effects present within our dataset. We observed a higher relative abundance of Lactobacillus at the genus level but lower abundances of Aerococcus, Fusobacterium, Gardnerella, Peptoniphilus, Porphyromonas, and Prevotella in VD. In evaluating the association with clinical data, Staphylococcus in M showed a strong association with preterm birth. Several bacterial taxa were associated with newborn birth weight in the CB group. The microbiome pattern of GL was similar, but not identical, to that of AF. Twins did not demonstrate significant differences in the composition of microbiomes from neonate samples.


Introduction
The human microbiome plays an important role in maintaining homeostasis in health and is associated with numerous diseases [1,2].
Microbiome development starts in the in-utero environment and changes throughout life, continuously affecting the immune system and metabolism. Many studies have demonstrated that pregnancy itself modifies the microbial populations in multiple sites within the maternal body, and this alteration might influence maternal, fetal, and neonatal health conditions [3]. Since pregnancy is a unique immune condition for a human body that allows a temporary tolerance for a foreign body, microbiome remodeling during pregnancy to facilitate immunological and metabolic adaptations seems necessary [4]. Some microbiome studies in pregnancy have proposed that fetal environments, including placenta and amniotic fluid, traditionally known as sterile, contain several characteristic microbiotas not identified by routinely performed culture techniques. However, their biomass is small, and association with medical conditions has not yet been proven [5,6].
The most commonly studied site of the microbiome is the vagina. Anatomically, the vagina is the most distally exposed area of the reproductive organ and is connected to the uterus. Aagaard et al. reported that the vaginal microbiome differed during pregnancy by gestational age and that Lactobacillus species played a role in preventing the growth of pathogenic bacteria [7]. Pregnancy causes several changes in the vaginal microbiome, such as decreased overall diversity, increased proportion of Lactobacillus species, and higher stability [8,9]. Because preterm birth is a critical problem in obstetrics and a well-known cause of preterm birth is intra-amniotic inflammation/infection, the relationship between preterm birth and the vaginal microbiome has been studied by various groups. However, no significant association has been identified yet [10-14]. Other sites used to evaluate microbiome in pregnancy are maternal gut [15], oral cavity [16], placenta [5], amniotic fluid [17,18], and neonatal gut [19], but the characterization of microbiomes from samples systematically collected from a refined cohort composed of pregnant women and their newborns has not yet been performed.
Here, we aimed to evaluate the association between obstetrical conditions and microbiome from various sites of pregnant women and their neonates and determine how the maternal microbiome affects the microbiome of the fetal environment and infants simultaneously.

Methods
Study design and sample collection
A prospective study was performed on live births delivered between March 2020 and January 2021. Samples were collected from women who delivered at Seoul National University Bundang Hospital and their newborns. Women with unstable vital signs or those requiring urgent management such as transfusion and neonates admitted to the neonatal intensive care unit (NICU) or who had unstable vital signs after birth were excluded from the study. Samples for microbiome analysis included maternal cervicovaginal discharge (VD), amniotic fluid (AF), umbilical cord blood (CB), neonatal gastric liquid (GL), and meconium (M).
When a pregnant woman was hospitalized in expectation of delivery, VD was obtained using a polyester swab inserted into the posterior fornix of the vagina, assisted by sterile speculum examination. For those who had undergone cesarean section for delivery or amniocentesis for specific indications (i.e., for detection of intra-amniotic inflammation/infection), approximately 10 cc of AF was obtained through a syringe. During both cesarean section and vaginal delivery, approximately 20 cc of CB was taken through a syringe from the vein of the umbilical cord immediately after clamping. Since removing amniotic fluid or other liquid from the newborn's mouth and stomach after birth is a part of initial management to help the airway and to stimulate spontaneous breathing, most newborns received suctioning procedures, and the liquid collected in the suction bottle (approximately 15 ml) was transferred into a conical tube for analysis of GL. Sample M, the newborn's stool, was carefully obtained 24 h after birth using a polyester swab from the anus once the newborn had stabilized after initial management. We tried to collect all five different samples from each woman and neonate(s). However, some samples could not be obtained or were missed because of clinical circumstances.
The primary outcome was the distribution and composition of the microbiome of the above samples from pregnant women and their neonates. To determine the association between the microbiome from different compartments and obstetric factors, medical records were collected. Data included maternal age, gestational age at delivery, delivery mode (vaginal delivery or cesarean section), the use of assisted reproductive technology (ART), other obstetric complications, and neonatal outcomes such as sex and birth weight. This study was performed with the informed consent of appropriate participants in compliance with the Declaration of Helsinki. The study protocol was approved by the Institutional Review Board of the Seoul National University Bundang Hospital (B-1606/350-003).

Microbial DNA isolation
Microbial deoxyribonucleic acid (DNA) was extracted from VD, GL, AF, and CB with the ZymoBIOMICS DNA Miniprep Kit (Zymo Research, Irvine, CA) and from sample M using the DNeasy PowerSoil Pro Kit (Qiagen, Germantown, MD) according to the manufacturer's instructions. Briefly, samples were enzymatically and mechanically lysed by bead beating, followed by washing and filtering in the provided column.
Extracted DNA concentrations were measured using a Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA). For each box of the DNA extraction kit used, a blank extraction containing no input material served as a negative control. The blanks were processed through the entire protocol and analyzed.
16S rRNA gene amplification
The 16S ribosomal ribonucleic acid (rRNA) gene was amplified using the two-step polymerase chain reaction (PCR) protocol in the 16S Metagenomic Sequencing Library Preparation (Illumina, San Diego, CA). In the first PCR step, the V3-V4 hypervariable region of the 16S rRNA gene was amplified using 341F/785R primers and Herculase II fusion DNA polymerase (Agilent, Santa Clara, CA). In the primer sequences below, 'N' is any base, 'W' is A or T, 'H' is A, C, or T, and 'V' is A, C, or G. 341F: PCR cycling was performed with an initial cycle at 95°C for 3 min, followed by 25 cycles of 95°C for 30 s, 55°C for 30 s, 72°C for 30 s, and a final extension cycle at 72°C for 5 min. The amplicons were cleaned with AMPure XP beads (Beckman Coulter, Brea, CA, USA). In the second PCR, index primers from the Nextera DNA CD Index Kit (Illumina, San Diego, CA) were added to the ends of the amplicons generated in the first PCR. PCR cycling was performed with an initial cycle at 95°C for 3 min, followed by ten cycles of 95°C for 30 s, 55°C for 30 s, 72°C for 30 s, and a final extension cycle at 72°C for 5 min. Each sample was cleaned with AMPure XP beads (Beckman Coulter, Brea, CA, USA) and eluted in UltraPure DNase/RNase-Free Water (Thermo Fisher Scientific, Waltham, MA). The amplified DNA was checked on a 2100 Bioanalyzer system using an Agilent DNA 1000 Kit (Agilent, Santa Clara, CA, USA). For each library production, a reaction without template was used as a negative control.
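The degenerate base codes described above can be illustrated with a minimal Python sketch that expands a degenerate primer into every concrete sequence it represents. The function name `expand_degenerate` and the toy sequence are hypothetical and used only for illustration; the toy sequence is not one of the study's primers.

```python
from itertools import product

# Degenerate base codes as described in the text (IUPAC nucleotide codes).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT",    # A or T
    "H": "ACT",   # A, C, or T
    "V": "ACG",   # A, C, or G
    "N": "ACGT",  # any base
}


def expand_degenerate(primer):
    """Return every concrete sequence a degenerate primer can represent."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical toy sequence, NOT one of the study's primers.
toy = "ACWGN"
variants = expand_degenerate(toy)
print(len(variants))  # 1 * 1 * 2 * 1 * 4 = 8 concrete sequences
```

Each degenerate position multiplies the number of concrete oligonucleotides in the primer pool, which is why a handful of such positions is enough to cover sequence variation across bacterial taxa.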

16S rRNA gene sequencing and analysis
Based on the DNA size and concentration, the amplicons were pooled in equimolar amounts and spiked with 30% PhiX (Illumina, San Diego, CA). These were then sequenced on the Illumina MiSeq platform using paired-end 250 cycle MiSeq Reagent Kit V2 (Illumina, San Diego, CA) and a 300 cycle MiSeq Reagent Kit V3 (Illumina, San Diego, CA). Negative controls from the DNA extraction and library were sequenced.

Sequencing data generation
We divided the samples into nine batches (Runs 1-9) and sequenced the V3-V4 region of the 16S rRNA gene using Illumina MiSeq machines with a target depth of 100,000 reads per sample (Supplementary Figure 1). Sequencing was performed with 250 bp paired-end reads for all of the sequencing runs except for the last one (Run 9), where sequencing was performed with 300 bp paired-end reads for practical reasons. The read quality scores for each sequencing run are shown in Supplementary Figure 2. The bcl2fastq program of Illumina was used to demultiplex raw sequencing data (BCL files) and output forward and reverse FASTQ files for each sample. Of note, some samples were sequenced more than once to assess the impact of batch effects. These included "sequencing duplicates," in which the identical next-generation sequencing (NGS) library of one sample was sequenced in separate runs, and "library duplicates," in which multiple NGS libraries were prepared from the identical sample on different dates and then sequenced separately.

Data analysis and visualization
Unless stated otherwise, all analyses were carried out using the QIIME 2 platform, a community-developed platform for microbiome bioinformatics [20]. For each sequencing run, FASTQ files were imported into QIIME 2, and the DADA2 plugin [21] was used to identify amplicon sequence variants (ASVs) by trimming low-quality parts of sequence reads, denoising trimmed reads, and then merging the forward and reverse reads (Supplementary Figure 1). The observed ASVs from individual sequencing runs were then merged into one ASV table. To detect and remove potential contaminants, we ran the decontam program on our samples, which looked for ASVs per sequencing batch that appeared at higher frequencies in low-concentration samples or were repeatedly found in the negative controls [22]. Taxonomy classification was performed using a naive Bayes classifier trained on the SILVA database [23]. To visualize the outputs from QIIME 2, we developed the Dokdo program (https://github.com/sbslee/dokdo), an open-source and MIT-licensed Python package for microbiome sequencing analysis using QIIME 2.

Diversity analysis
We used the QIIME 2 command "qiime diversity core-metrics-phylogenetic" to compute the alpha and beta diversity metrics of our samples.
When running the command, to normalize for the difference in read depth across the samples, we used the "--p-sampling-depth" option to rarefy our samples to 5,000 sequence reads so that all samples had an equal depth of coverage. We also ensured that all samples were sequenced to a sufficient depth of coverage for diversity analysis by creating rarefaction curves (Supplementary Figure 3). Additionally, we used the "--i-phylogeny" option to provide a rooted phylogenetic tree of observed ASVs, which is required for performing PCoA based on the weighted UniFrac distance [24].
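The idea behind rarefying to a fixed depth can be sketched in plain Python as random subsampling of reads without replacement. This is a simplified illustration, not the QIIME 2 implementation; the function name `rarefy`, the ASV labels, and the example counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample reads without replacement to a fixed depth.

    counts maps an ASV label to its read count for one sample.
    Returns None when the sample has fewer reads than `depth`,
    mirroring the usual practice of dropping such samples.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand the table into one entry per read, then draw `depth` reads.
    reads = [asv for asv, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical sample with 10,000 reads, rarefied to the depth used in the text.
sample = {"ASV1": 6000, "ASV2": 3000, "ASV3": 1000}
rarefied = rarefy(sample, depth=5000)
print(sum(rarefied.values()))  # total reads after rarefaction equals the depth
```

Because every retained sample ends up with the same number of reads, diversity metrics computed afterwards are not driven by differences in sequencing depth, at the cost of discarding reads from deeply sequenced samples.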

Statistical analysis
To assess the differential abundance of the microbiome in the context of clinical information such as preterm birth, we used the QIIME 2 command "qiime composition ancom" to perform the analysis of the composition of microbiomes (ANCOM), which compares the centered log-ratio (CLR) of relative abundance between two or more groups of samples [25]. To determine whether groups of samples are significantly different from one another in beta diversity, we carried out permutational multivariate analysis of variance (PERMANOVA) using the QIIME 2 command "qiime diversity adonis," which fits linear models to a distance matrix (e.g., weighted UniFrac) with the chosen variables. We performed bootstrapping hypothesis testing by building a 95% confidence interval with the "scipy.stats.t.interval" method in the scipy package to compare similarities in microbiome composition between twins and randomly chosen samples [26].
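The centered log-ratio transformation mentioned above can be written out in a few lines of Python. This is a minimal sketch of the transform itself, not of the ANCOM procedure; the function name `clr` and the pseudocount of 1, added here so that zero counts have a defined logarithm, are assumptions.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's count vector.

    Each value becomes log(x_i) minus the mean of log(x) over all taxa,
    which equals log(x_i / geometric_mean(x)).
    The pseudocount is an assumption made to handle zero counts.
    """
    log_vals = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(log_vals) / len(log_vals)
    return [v - mean_log for v in log_vals]


# Hypothetical counts for four taxa in one sample.
values = clr([10, 0, 5, 85])
print([round(v, 3) for v in values])
```

CLR values within a sample sum to zero, so each taxon is expressed relative to the sample's geometric mean rather than as a raw proportion, which is what makes comparisons between groups less sensitive to the compositional nature of sequencing data.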

Description of the study populations and clinical characteristics
All women were of Asian ethnicity (Korean, specifically), and the median age of the study population was 34 (interquartile range 31-37) years (Table 1). Most participants had low-risk pregnancies. Nulliparous women made up more than half of the population (67%), and the median values of height, weight, and body mass index (BMI) were 162 cm, 70 kg, and 27 kg/m², respectively. About 30% of pregnancies were conceived by ART, including intrauterine insemination (IUI) and in vitro fertilization with embryo transfer (IVF-ET). Twin pregnancies accounted for approximately one-fourth of the total population, and among them, 19% were monochorionic. The median gestational age at delivery was 37.7 weeks (interquartile range 36.9-38.6), and the rate of preterm birth before 37 weeks of gestation was 26.2% (37/141). The rate of cesarean section was 55% (77/141). Seven neonates had congenital structural anomalies (atrioventricular septal defect, absence of corpus callosum in the brain, achondroplasia, polydactyly in two cases, and syndactyly). The frequencies of other obstetric complications or underlying maternal diseases are described in Table 1.

Maternal and neonatal microbiome landscape during delivery
We identified 19,597,239 bacterial sequences (22,412 unique ASVs) in 641 samples, including all biological samples and negative controls. These ASVs were taxonomically annotated, but we found evidence of batch effects in our sequencing data from all sample types except VD samples, which were likely introduced during NGS library construction and not during the sequencing itself (Supplementary Figures 4 and 5). However, this was expected because our samples were collected from body sites with low-biomass specimens, making our samples prone to contamination [27]. Therefore, we expected to find many false positives and applied a series of filters, as outlined in Supplementary Figure 6. Notably, we found and removed 203 ASVs that were statistically determined to be contaminants because they were highly prevalent in negative controls (Supplementary Figure 7) or they showed higher frequencies in low-concentration samples (Supplementary Figure 8).
We measured the alpha diversity of the samples by calculating their individual Shannon indices. As shown in Figure 2, the median alpha diversity for each sample type decreased in the following order: GL, AF, M, CB, and VD. The negative control group showed a slightly higher alpha diversity than that of the VD group, suggesting that with 16S amplicon sequencing, negative controls can have microbiome diversity as rich as real biological specimens. Next, we estimated the beta diversity of our samples by computing the weighted UniFrac distance between them. As shown in Figure 3, the samples were moderately well separated by their sample type when projected using principal coordinates analysis (PCoA).
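For readers less familiar with the Shannon index used above, the following Python sketch computes it from a vector of ASV counts. The function name `shannon` and the example counts are hypothetical, and the logarithm base of 2 is an assumption; other bases only rescale the index.

```python
import math


def shannon(counts, base=2.0):
    """Shannon diversity index H = -sum(p_i * log(p_i)) over observed taxa.

    The log base (2 here) is an assumption; changing it rescales H.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)


# Hypothetical samples: one dominated by a single taxon, one evenly spread.
dominated = [970, 10, 10, 10]
even = [250, 250, 250, 250]
print(round(shannon(dominated), 3), round(shannon(even), 3))
```

A community dominated by one taxon, as is typical of a Lactobacillus-rich vaginal sample, yields a low index, whereas an even spread of low-abundance taxa, as can occur in a contaminated blank, yields a high one. This is consistent with the observation that negative controls can show alpha diversity comparable to that of biological specimens.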

Clinical relevance of microbiome in pregnancy
To better understand the sources of variation seen in the beta diversity of our samples, we carried out PERMANOVA using various factors, including clinical information. As shown in Table 2, when all sample types were included in the analysis, the variable "Site" explained 17.2% of variation (p-value = 0.001) and the variable "LibraryMonth," 7.4% (p-value = 0.002). This result indicates that the samples could still be separated well based on the microbiome pattern unique to their body site, despite the significant batch effects present within our dataset. When the analysis was restricted to each sample type, the variable "LibraryMonth" was found to be significant for all sample types except the VD group. The explanatory power increased to a range between 24.5% and 48.9%. These results align with the hypothesis that our samples are predominantly low-biomass specimens and prone to contamination.
Additionally, the variable "DeliveryMethod" was returned as significant for the VD group, the variables "PretermBirth37" and "AntibioticsUse" for the M group, and the variable "Weight" for the CB group (Figure 4). We explored the significant variables in each group using PCoA with weighted UniFrac distance. The top seven bacterial taxa that led to different coordinates are shown in Figure 4. Several ASVs of Lactobacillus and one ASV of Gardnerella were found in the VD group. In the M group, Staphylococcus showed a strong association with preterm birth. Lastly, several bacterial taxa were associated with the weights of newborn babies in the CB group. Table 3 shows the ANCOM results relating various clinical data to bacterial taxa in multiple sample types.

The resemblance of twin microbiome in delivery
To test the hypothesis that samples from twins, both monochorionic and dichorionic, have higher similarity in microbiome composition than randomly chosen samples, we compared the mean of weighted UniFrac distance between twin samples and randomly selected samples. More specifically, for each AF, CB, GL, and M group, we performed bootstrapping hypothesis testing by randomly sampling pairwise distances with replacement from all samples 1,000 times to build a 95% confidence interval with the means of the sampled distances. We rejected the null hypothesis that there was no difference between the twin samples and randomly selected samples for all four sample types because the mean pairwise distance for twin samples was below the confidence interval (Figure 5). Next, we divided the twins into monochorionic and dichorionic twins and repeated hypothesis testing. We found that we could still reject the null hypothesis for all four sample types for dichorionic twins. For monochorionic twins, however, only the CB and M groups passed the test.
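The logic of this test can be sketched in Python using only the standard library. Note that this sketch builds a percentile interval from the bootstrap means rather than calling "scipy.stats.t.interval," whose exact arguments are not given in the text; the function name `bootstrap_mean_ci`, the resample size, and all distance values are hypothetical and generated synthetically for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(distances, k, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for the mean of k resampled distances.

    distances: pairwise distances between all samples of one type.
    k: number of distances drawn per resample (assumed here to match
       the number of twin pairs being compared).
    """
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(distances, k=k)) for _ in range(n_boot)
    )
    lower = means[int(round((alpha / 2) * n_boot))]
    upper = means[int(round((1 - alpha / 2) * n_boot)) - 1]
    return lower, upper


# Synthetic, hypothetical distances (not data from the study).
gen = random.Random(42)
all_pairs = [gen.uniform(0.4, 0.9) for _ in range(200)]   # random sample pairs
twin_pairs = [gen.uniform(0.1, 0.3) for _ in range(20)]   # co-twin pairs

lower, upper = bootstrap_mean_ci(all_pairs, k=len(twin_pairs))
twin_mean = statistics.fmean(twin_pairs)

# Twins are judged more similar than random pairs when their mean
# distance falls below the lower bound of the interval.
print(twin_mean < lower)
```

With very few twin pairs, as in the monochorionic AF and GL groups, the resampled means vary widely and the interval becomes broad, so failing to fall below it is weak evidence either way.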

Characterization of the vaginal health-related microbiome
Several pathogenic and commensal vaginal microbiota have been shown to have important consequences for a woman's reproductive and general health. To establish reference ranges of vaginal microbiota with known clinical associations in generally healthy pregnant women, we searched for bacterial targets commonly tested for assessing vaginal health within VD samples. More specifically, we focused on 31 bacterial targets (15 genera and 16 species) that are tested by the "SmartJane" assay from uBiome Inc., including Lactobacillus, Sneathia, and Gardnerella [28]. Of the 31 bacterial taxa of clinical importance, 12 were identified in our samples (Figure 6).
We observed a higher relative abundance of Lactobacillus at the genus level but lower abundances of Aerococcus, Fusobacterium, Gardnerella, Peptoniphilus, Porphyromonas, and Prevotella. Most of our patients did not have any severe pregnancy-related complications. In addition, most preterm births occurred in the late preterm period, from 34+0 weeks to 36+6 weeks. Therefore, the "SmartJane" targets captured almost no pathogenic bacteria. Species-level classification was also examined and is listed in the figure. We found Lactobacillus iners and Lactobacillus jensenii from the assay list, but Lactobacillus crispatus was not commonly found in the vaginal microbiome. This could be simply because the SILVA reference database we used omitted Lactobacillus crispatus. Using the National Center for Biotechnology Information (NCBI) database, we confirmed that sequences classified only to the Lactobacillus genus level matched Lactobacillus crispatus (data not shown).

Discussion
The study population consisted exclusively of Asian women and represented general or low-risk pregnancies. The maternal age range was 20-45 years, which can be described as the general reproductive age. The proportions of nulliparity, cesarean section, and neonatal sex were each approximately half. Signs of fetal distress, such as low Apgar scores and the presence of meconium staining, were infrequent. Extreme pathologic conditions that might influence or modify the microbiome, such as very preterm birth and premature infants requiring careful treatment in the NICU, were excluded. Therefore, the microbiome analyzed in this study population is likely to reflect the features of general pregnancy.
The important implication of this study is that certain bacteria might be present in intrauterine environments or fetal compartments, which have traditionally been considered sterile, although contamination of samples remains possible because current detection methods are far more sensitive [9]. In this study, several methods were used to assess the biomass or the presence of contamination, and we found that the taxa in the samples, especially AF and CB, were highly prevalent in negative controls. There is still controversy over whether the intrauterine environment is originally sterile, and this study suggests that it could be closer to being sterile. Moreover, these bacteria do not seem to lead to inflammatory responses during pregnancy. A few reports have demonstrated the presence of the microbiome in normal amniotic fluid, umbilical cord blood, and placenta [5,6]. The detection of bacteria in meconium, the very early stool of newborns, supports the idea of microbial colonization of the intrauterine environment during a normal pregnancy period [29,30]. Low-risk pregnant women and their normal neonates should be studied first, with the detailed exclusion of pathologic conditions, to characterize the microbiome composition in the relevant compartments during pregnancy.
We designed this study to evaluate whether samples from various body compartments of pregnant women and their infants would share a similar microbiome. However, we found that the samples grouped according to the body compartment from which they had been obtained, not according to the individuals or the mother-fetus/infant pairs. The microbiome from GL was similar to that of AF, which is to be expected since fetuses swallow AF in utero and their urine contributes to AF again under normal physiologic circulation. Interestingly, the microbiome from GL was not completely identical to that of AF, which might imply that there is a technologically unidentifiable mechanism of flora formation in the oral cavity or proximal gastrointestinal tract, such as the esophagus, of the fetus in the intrauterine environment.
A comparison of the results of the analysis with the clinical database revealed several associations. First, Lactobacillus and Gardnerella found in the VD group are well-known indicators of the microbiome during pregnancy. Lactobacillus protects the maternal microbiome during pregnancy, whereas Gardnerella plays the role of a pathogen and is highly related to preterm birth or pregnancy complications [11,13,31]. One interesting bacterium from CB was Faecalibacterium, which is usually depleted in gestational diabetes mellitus [32].
Staphylococcus showed a strong association with preterm birth in the stools of neonates. Bacterial infection has been reported in many studies, indicating that Staphylococcus may lead to preterm birth [33,34]. Tormo-Badia et al. reported that antibiotics altered the gut microbiota of offspring in pregnant mice [35]. As many studies suggest, if a "healthy microbiome" exists and plays an important role in maintaining a normal pregnancy, one can easily assume the possibility of adverse effects from antibiotic administration during pregnancy. We analyzed the relevance of antibiotic use in the M group, but the analysis appears to be limited by the small sample size. Nevertheless, it identified pregnancy-related taxa such as Lactobacillus, Staphylococcus, and Ureaplasma [11,36]. Dominguez-Bello et al. reported differences in bacterial communities in infants' guts according to the delivery methods [37]. Vaginally delivered neonates showed microbiomes resembling their mothers' vaginal microbiota, dominated by Lactobacillus. In contrast, infants born by cesarean section had Staphylococcus, Corynebacterium, and Propionibacterium spp., which were dominant on the skin surfaces of their mothers. We sought to determine the relationship between delivery modes and the microbiome from samples but found no significant differences in composition or diversity.
This study is the first to explore microbiomes in twin pregnancies. When all twins were analyzed together, co-twins had microbiome compositions that were more similar to each other than those of randomly selected sample pairs. For dichorionic twins, this held for all four types of neonatal samples. For monochorionic twins, only the CB and M samples reached statistical significance, but this could be due to the small sample sizes of the AF and GL groups in monochorionic twins (one and three twins, respectively). Since twins are different individuals sharing the same intrauterine environment and samples from twins are relatively rare compared to singletons, future studies focusing on twins or other higher-order pregnancies are essential.
Because of our complicated sample characteristics, including low microbial biomass and the difficulty of controlling groups such as pregnant women, the ANCOM data used to determine the relationship between various clinical conditions and bacteria from different samples might contain many false-positive results. Regardless, the bacteria listed in Table 3 warrant attention. Finegoldia and Bifidobacterium were previously demonstrated in healthier pregnancies, which was also confirmed by our data [38,39]. In addition, many other taxa in the table are relevant to inflammation and pregnancy complications such as gestational diabetes, preeclampsia, and preterm birth. For example, the findings for Campylobacter and Lachnospiraceae in the VD group agree with previous studies showing that these bacterial infections cause inflammation and even preterm birth [31,40]. We will continue to investigate the role of the microbiome in a normal pregnancy and the pathophysiology of microbiome modification during the development of obstetric complications by establishing a microbiota library of various compartments with a large sample size of pregnant women and their infants. This study could be considered the very first step in building the basic database since it contains samples from the general pregnant population and normal infants.

Conclusions
The exploration of microbiologic features in the compartments related to pregnancy has challenged researchers for many decades and remains controversial. Of note, abnormal microbial invasion of the gestational cavity, such as AF or placenta, seems to engender serious obstetric complications including preterm birth and severe neonatal morbidities that might persist throughout life. Despite the significance of research on the microbiome in pregnancy, progress has been relatively slow due to several limitations specific to pregnancy, such as the ethical vulnerability of study subjects and the difficulty of obtaining samples. We collected various samples from pregnant women and their neonates with a standardized protocol and established a microbiome database that will be used as a baseline library for samples from diverse pathologic conditions. As a result, samples traditionally considered sterile, such as AF and CB, demonstrated several bacterial clusters, although most of them were extremely close to negative controls. The microbiome patterns could be grouped according to the body compartments, not according to the mother-fetus/infant pairs. Further studies that lower the possibility of contamination and compensate for the low-biomass nature of the specimens are needed to evaluate the association between microbiomes and a number of diseases during pregnancy.

Availability of data and material
Data that support the findings detailed in this study are available in the supplementary information and this article. Any other source data perceived as pertinent are available, on reasonable request, from the corresponding author.

Competing interests
All authors have declared that no competing interests exist.

Figure 1
Batch effect detection in 16S rRNA amplicon sequencing data. Center log-ratio transformation was used to normalize the filtered ASV table before generating a hierarchically clustered heatmap based on correlation coefficients. AF, amniotic fluid; CB, umbilical cord blood; GL, gastric liquid; M, meconium; VD, cervicovaginal discharge; NC, negative control.

Figure 3
Beta diversity of the Korean maternal neonatal microbiome. The filtered ASV table was rarefied before the samples were projected into 2D space with principal coordinates analysis using the weighted UniFrac distance.

Figure 4
Beta diversity results of the PERMANOVA analysis. Principal coordinates analysis using weighted UniFrac distance is shown for A the cervicovaginal discharge samples, B and C the meconium samples, and D the umbilical cord blood samples.