Sample collection. The GSMs of the Zhouzhi National Nature Reserve (ZNNR) in the Qinling Mountains of Shaanxi Province have been studied for the last 15 years [23]. During the winters of 2012 and 2013, we found and dissected the bodies of five GSM individuals that had each recently died of injuries incurred from fighting and/or falling out of a tree (for details see Supplementary Material). Immediately upon discovery, we carefully opened each body to sample the contents of different regions of the digestive tract. We took a total of 39 samples from the five individuals, covering the foregut (Saccus gastricus I, Saccus gastricus II, proximal gastric, distal gastric, pylorus sinus, and gastric tube) and the hindgut (caecum, colon and rectum) (Table S1). Each sample was placed into a 2 ml centrifuge tube, taken to the laboratory and stored in liquid nitrogen prior to DNA extraction.
Comparative morphology of the major components of the GSM digestive system.
We measured the length and mass of each body in the field. Each body was then taken to the laboratory, and the digestive tract was removed. We then measured the volumes of the stomach, caecum and colon using previously described methods [24]. These methods allow organ volumes to be estimated from surface areas and then compared [11]. The small intestine, caecum and colon were treated as cylinders, with \(V=\pi {\left(b/2\pi \right)}^{2}\times l\), and the stomach as a sphere, with \(V=\frac{4}{3}\pi {\left(\sqrt{A/4\pi }\right)}^{3}\), where \(A=b\times l\). To estimate surface areas, we cut the inner surface of each organ into fragments that could be flattened. The flattened fragments were arranged adjacent to each other and organized into either a triangle or a rectangle. We first quantitatively described the gut morphology of the GSM and then compared our findings to a database of 47 species from all major primate subfamilies [11] (for details see Supplementary Methods; Table S2).
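The two volume formulas above can be sketched as a short calculation. This is a minimal illustration, not the authors' script: it assumes that b is the width of the flattened organ wall (taken as the circumference of the cylinder) and that the second dimension of the flattened rectangle is its length; the function and parameter names are hypothetical.

```python
import math

def cylinder_volume(width, length):
    """Volume of a cylinder whose flattened wall is a width x length rectangle.

    Assumption: `width` is the circumference, so radius = width / (2 * pi).
    """
    radius = width / (2 * math.pi)
    return math.pi * radius ** 2 * length

def sphere_volume(area):
    """Volume of a sphere with surface area `area` (A = 4 * pi * r**2)."""
    radius = math.sqrt(area / (4 * math.pi))
    return 4 / 3 * math.pi * radius ** 3

# Illustrative values only (cm), not measurements from this study.
colon_volume = cylinder_volume(width=12.0, length=60.0)
stomach_volume = sphere_volume(area=12.0 * 60.0)
```

Both functions return a volume in the cube of whatever length unit is supplied, so measurements in cm give cm³.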
DNA extraction and sequencing. To identify the mechanisms by which carbohydrates are digested in the foregut and the hindgut, we examined the microbial composition and functional potential of both compartments in the five individuals. Genomic DNA was extracted from each sample using a QIAamp DNA Stool Mini Kit (Qiagen, Germany). To describe the microbial composition of each sample, we amplified the V3-V4 region of the 16S rRNA gene (Supplemental Material). The resulting amplicons were sequenced on an Illumina MiSeq using 2x300 paired-end reads. To describe the microbial functional potential in each digestive compartment, we used shotgun metagenomic sequencing. We pooled the DNA from all of the samples collected from the same digestive compartment for each individual, resulting in five foregut sample pools and five hindgut sample pools. Metagenome libraries for each pool were constructed following the manufacturer’s protocol (NEBNext® Ultra™ II DNA Library Prep Kit for Illumina®; Supplemental Material). DNA sequencing was performed on an Illumina HiSeq using 2x150 paired-end reads.
Data analysis of 16S rRNA. We sequenced 47 samples and produced a total of 9,298,270 reads. After quality filtering, 3,732,400 reads (79,413 reads per sample) were retained, with an average length of 448 bp. DNA sequences were demultiplexed and quality filtered using MiSeq Control Software. We used the search function for chimera checking and removal of low-quality sequences, FLASH for merging paired reads, and Trimmomatic for quality control [25]. Sequences were clustered into amplicon sequence variants (ASVs) using the DADA2 wrapper in QIIME2 [26] (https://benjjneb.github.io/dada2/tutorial.html). Taxonomy was assigned using a pre-trained Bayesian classifier in QIIME2 and the Silva 138 database (https://www.arb-silva.de/documentation/release-138/).
Sequences were rarefied to 46,256 reads prior to the calculation of alpha and beta diversity statistics. Alpha diversity was calculated in QIIME2 using the Shannon index for diversity and the Chao1 index for richness. Principal coordinate analysis (PCoA) and an Unweighted Pair Group Method with Arithmetic mean (UPGMA) tree were used to visualize the data based on both weighted and unweighted UniFrac distances [27]. PERMANOVAs on weighted and unweighted UniFrac distances were used to test for overall differences in composition. We tested for differences in microbial diversity between the foregut and hindgut using a linear mixed effects model (nlme, R v. 3.5.4) with gut compartment as the fixed effect and individual GSM as the random effect. Using the same model structure (nlme, R v. 3.5.4), we also tested for differences between the foregut and hindgut in the relative abundances of microbial ASVs present in at least ten samples. We tested each ASV in a loop and corrected the resulting p values for multiple testing (fdrtool, R v. 3.5.4). Because only 22 ASVs were present in at least ten samples, we repeated this process at both the genus and family levels.
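The rarefaction step and the two alpha diversity indices named above can be illustrated with a small, standard-library sketch. It is not the QIIME2 implementation: the function names are hypothetical, the logarithm base for the Shannon index is left as a parameter, and Chao1 is written in its bias-corrected form, which is an assumption about the variant used.

```python
import math
import random

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a vector of ASV counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the count vector into one entry per read, labelled by ASV index.
    reads = [i for i, c in enumerate(counts) for _ in range(c)]
    picked = random.Random(seed).sample(reads, depth)
    rarefied = [0] * len(counts)
    for i in picked:
        rarefied[i] += 1
    return rarefied

def shannon(counts, base=2.0):
    """Shannon diversity: -sum(p * log(p)) over ASVs with non-zero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)

def chao1(counts):
    """Bias-corrected Chao1 richness from singleton and doubleton counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

# Toy sample: rarefy to a fixed depth, then compute both indices.
sample = rarefy([50, 30, 20], depth=40, seed=1)
diversity, richness = shannon(sample), chao1(sample)
```

Rarefying every sample to the same depth before computing the indices keeps differences in sequencing effort from being read as differences in diversity.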
Data analysis of metagenomic data. We obtained a total of 928,965,472 raw reads (150 bp) across ten samples from the foreguts and hindguts of the five individuals. Raw shotgun sequencing reads were trimmed using cutadapt (v1.9.1). Specifically, low-quality reads, N-rich reads, adapter-contaminated reads and host-derived reads were removed. After filtering, we had a total of 417,333,904 reads with an average length of 146 bp for the ten samples (Table S3). Sequences from each sample were assembled de novo separately using SOAPdenovo (v2) with a range of k-mer sizes. For each sample, the scaffold assembly with the largest N50 was selected for subsequent analysis. CD-HIT was used to cluster the scaftigs derived from assembly with a default identity of 0.95.
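The N50 criterion used to choose among assemblies can be written out as follows. This is a minimal sketch with hypothetical function names and toy scaffold lengths; it uses the usual definition of N50 as the length L such that scaffolds of length L or longer contain at least half of the assembled bases.

```python
def n50(lengths):
    """N50 of a list of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths given")

def best_assembly(assemblies):
    """Return the k-mer size whose assembly has the largest N50.

    `assemblies` maps a k-mer size to the scaffold lengths of that assembly.
    """
    return max(assemblies, key=lambda k: n50(assemblies[k]))

# Toy example with two k-mer sizes; the lengths are illustrative only.
chosen_k = best_assembly({31: [2, 3, 4, 5, 6], 63: [10, 5, 5]})
```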
To analyse the relative abundance of scaftigs in each sample, paired-end clean reads were mapped to the assembled scaftigs using the Burrows-Wheeler Aligner (BWA version 0.7.12) to generate read coverage information. Paired forward and reverse read alignments were generated in the SAM format using the BWA-SAMPE algorithm with default parameters. The mapped read counts were extracted using SAMtools 0.1.17. The corresponding scaftigs were mapped to bacterial sequences extracted from the NCBI NT database. The lowest common ancestor (LCA) algorithm, as implemented in MEGAN, was used to ensure the reliability of the annotation by assigning each scaftig to the lowest classified ancestor shared by its hits.
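The idea behind the LCA assignment can be sketched in a few lines. This is a simplified illustration rather than MEGAN's implementation: it omits MEGAN's filtering of hits by score, and the function name and example lineages are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Shared root-to-leaf prefix of the lineages; its last entry is the LCA.

    Each lineage is a list of taxon names ordered from root to leaf.
    """
    if not lineages:
        return []
    common = []
    for ranks in zip(*lineages):
        # Stop at the first rank where the hits disagree.
        if len(set(ranks)) != 1:
            break
        common.append(ranks[0])
    return common

# Three hypothetical database hits for one scaftig.
hits = [
    ["Bacteria", "Firmicutes", "Clostridia", "Ruminococcaceae"],
    ["Bacteria", "Firmicutes", "Clostridia", "Lachnospiraceae"],
    ["Bacteria", "Firmicutes", "Bacilli"],
]
assignment = lowest_common_ancestor(hits)
```

Here the scaftig would be assigned at the phylum level, because the three hits agree only down to Firmicutes.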
We examined carbohydrate digestion in detail using the functional genes detected in the foregut and hindgut metagenomes (Figure S3A), and used an online database to identify carbohydrate-active enzymes (CAZymes; Fig. 2C). Genes were predicted using MetaGeneMark, and BLASTP was used to search the protein sequences of the predicted genes against the NR, CAZy, eggNOG and KEGG (Kyoto Encyclopedia of Genes and Genomes) databases with E < 1e-5. For glycosyl hydrolase (GH) families, representative sequences selected from the CAZy website were used in BLAST searches of the metagenomic data with an E-value cutoff of 1e-5 (Table S7). We used Welch’s two-sample t-test to test for differences in the functional potential of the microbiome (overall gene families as well as CAZymes) between the foregut and the hindgut. P-values were corrected for multiple tests [28]. We further applied correspondence analysis, a form of multidimensional scaling, to determine whether variation in the abundance of GH families distinguished the gut regions [29].
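The per-gene-family comparison can be illustrated with a standard-library sketch of Welch's statistic followed by a multiple-testing correction. The correction shown is the Benjamini-Hochberg procedure, which is an assumption made for illustration; the source cites [28] without naming the method. Function names and the example values are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy abundances of one gene family in five foregut and five hindgut pools.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The p-value for each test would be obtained from a t distribution with the returned degrees of freedom; that step is left out here to keep the sketch free of third-party dependencies.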
Enzyme activity of fiber digestion in the gut. Dietary fiber contains cellulose, hemicellulose and lignin. Microbial pathways for the degradation of cellulose and hemicellulose are relatively common, while pathways for lignin degradation are rare. To test whether the foregut and hindgut microbiomes are simultaneously involved in fiber digestion, we measured the activity of enzymes that digest cellulose (β-glucosidase) and hemicellulose (xylanase) in the foreguts and hindguts of three individuals. In each of the three individuals, we took samples from three sections of the foregut (Presaccus, Saccus, Tubus gastricus) and four sections of the hindgut (Caecum, Ascending colon, Transverse colon, Descending colon).
Endo-cellulase (endo-β-1,4-glucanase) and hemi-cellulase (endo-β-1,4-xylanase) activities were assayed by measuring the amount of reducing sugar released from 2% CMC sodium salt (Sigma-Aldrich, USA) and 2% xylan (Sigma-Aldrich, USA), respectively, using the DNS method [30]. β-glucosidase activity was assayed on p-nitrophenyl-β-D-glucopyranoside (pNPG; Sigma) [31]. The contents of the foregut and the hindgut were each diluted with NaAc buffer (50 mM, pH 6.0) to a concentration of 0.5 g/mL, and the enzyme activities were assayed immediately after dilution. One unit (U) of enzyme activity was defined as the amount of enzyme that releases 1 µmol of reducing sugar (or p-nitrophenol) per minute. All data shown are the means of triplicate experiments. Wilcoxon rank-sum tests were used to compare enzyme activity between the foregut and hindgut (Table S9).
Digestion ratio. Using sterile spoons, we took fresh samples (> 5 g) of gut contents from the three sections of the foregut (Presaccus, Saccus, Tubus gastricus) and the four sections of the hindgut (Caecum, Ascending colon, Transverse colon, Descending colon) of the same three monkeys used for the enzyme activity measurements (Table S10). We directly measured crude protein (CP), lipids (CL), ash, acid detergent fiber (ADF), neutral detergent fiber (NDF) and acid detergent lignin (ADL) from 0.5 g dry mass of each sample [21, 32] (for details see Supplementary Materials). All analyses were repeated three times. Cellulose was determined as the difference between ADF and ADL; hemicellulose was determined as the difference between NDF and ADF. For wild monkeys, it is not possible to add artificial markers to the diet to track dry matter. We therefore used faecal ADL as an internal marker to estimate the digestibility of hemicellulose and cellulose [15, 33]. We assumed that the monkeys had stable daily nutrient intakes because they mainly fed on tree bark during the sampling period [23]. To estimate the digestion of hemicellulose and cellulose in the foregut, we measured the reduction of hemicellulose and cellulose from the first stomach chamber (the presaccus) to the small intestine. To estimate the digestion of hemicellulose and cellulose in the hindgut, we measured the reduction of hemicellulose and cellulose from the small intestine to the end of the descending colon [15, 33]. Either Wilcoxon rank-sum or paired t-tests were used to test for differences in the mean reduction of hemicellulose and cellulose between the foregut (presaccus to small intestine) and the hindgut (small intestine to colon).
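The fiber fractions and the internal-marker estimate can be sketched as follows. The difference formulas come directly from the text; the digestibility function uses the standard indigestible-marker ratio, which is an assumption about how the reduction between two gut sites was computed and may differ from the exact calculation in [15, 33]. Function names and the example values are hypothetical.

```python
def fiber_fractions(ndf, adf, adl):
    """Cellulose and hemicellulose by difference (same units as the inputs)."""
    return {"cellulose": adf - adl, "hemicellulose": ndf - adf}

def marker_digestibility(nutrient_in, marker_in, nutrient_out, marker_out):
    """Fraction of a nutrient lost between an upstream and a downstream gut site.

    All arguments are concentrations in dry matter (e.g. % DM). The marker
    (here ADL) is assumed to be indigestible, so its concentration rises as
    other dry matter is digested and absorbed.
    """
    return 1 - (marker_in / marker_out) * (nutrient_out / nutrient_in)

# Illustrative values only (% DM), not data from this study.
upstream = fiber_fractions(ndf=60.0, adf=40.0, adl=10.0)
downstream = fiber_fractions(ndf=62.0, adf=50.0, adl=20.0)
cellulose_loss = marker_digestibility(
    upstream["cellulose"], 10.0, downstream["cellulose"], 20.0
)
```

In this toy example the cellulose concentration is unchanged between the two sites while the lignin concentration doubles, which implies that half of the cellulose was digested in that segment.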