Metagenomic and Metabonomic Proling provide insights into AIDS resistance in African Green Monkey

As a natural host of simian immunodeciency virus (SIV), African green monkeys (AGM) do not develop AIDS. AGM has recently been shown to rapidly activate and maintain evolutionarily conserved regenerative wound healing mechanisms in mucosal tissue in case of SIV acute infection, suggesting a role of mucosal integrity in AIDS resistance.However, little is known about the underlying mechanisms, especially of which the gut microbiome as well as the metabolites participate in. This study aims to characterize the proles of gut microbiome and metabolites of AGM with chronic SIV infection (AGM_P), AGM without SIV infection (AGM_N), Cynomolgus macaque (CM), Rhesus macaque (RM). One-tailed Wilcoxon rank-sum test was performed for all the KOs that occured in more than ve samples and adjusted for multiple testing using the Benjamin-Hochberg procedure. Calculations of the Z-score, the aggregated Z-score for a KEGG pathway, the adjusted Z-score for a KEGG pathway were performed as described [31]. A reporter score of ≥ 1.6 (90% condence according to normal distribution) could be used as a detection threshold for signicantly differentiating pathways. Rarefaction analysis was performed to assess the gene richness. of differential metabolites were ploted by Pheatmap package in R language. The correlation between differential metabolites were analyzed by cor () in R language with the pearson method. Statistically signicant of correlation between differential metabolites were calculated by cor.mtest in R language. P-value < 0.05 statistically signicant and correlation plots were ploted by corrplot R The functions of these metabolites and metabolic pathways were studied KEGG metabolic pathways enrichment of differential metabolites performed. When When feces samples from SIV- AGM (n=20) and nCM (n=20) via quantitative RT-PCR. b Comparison analysis of the relative bacterial abundance of feces samples from SIV- AGM (n=8) and SIV+ AGM (n=8) via quantitative RT-PCR.

Although rodent models are most frequently used to establish causality between a given variation of the gut microbiota and disease, several studies have demonstrated that the mouse gut microbiome differs greatly from that of humans [11][12][13]. Non-human primates (NHPs) have been vital for medical research due to their close evolutionary relationship, similar behavioral and physiological characteristics to humans. Three of the most commonly used NHPs are Rhesus macaque (Macaca muluta), Cynomolgus macaque (Macaca fascicularis) and African green monkey (Chlorocebus Sabaeus), which are widely used in studies of neuroscience, infectious diseases and drug safety testing [14][15][16][17][18]. Differences also exist between these species. For example, AGM has been known to exhibit AIDS resistance in case of simian immunode ciency virus (SIV) infection [19,20]. Also, AGM develops spontaneous hypertension with pathophysiological changes that mimic those of patients with essential hypertension [21,22]. Research has also shown that genetic factors in uence the atherogenic response of lipoproteins to dietary fat and cholesterol in nonhuman primates which re ects different capacities in lipid metabolism between AGM and CM [23]. A recent study found that AGM rapidly activate and maintain evolutionarily conserved regenerative wound healing mechanisms in mucosal tissue, which preserves mucosal integrity and prevents in ammatory insults that underlie immune exhaustion in RMs [24]. Role of the gut microbiome and metabolites in shaping these differences as well as the underlying mechanisms needs to be investigated.
Previous studies have almost explored the gut microbiota of different monkey species using 16S rRNA gene amplicon sequencing, of which little information on gene identity and function of the gut microbiome was revealed [25][26][27][28]. This study aims to compare pro les the gut microbiome and metabolites of AGM_P, AGM_N, CM and RM through metagenomic and metabonomic sequencing.

Animals, sample collection, and transportation
A colony of captive AGM were obtained from the Republic of Uganda and raised in our center. Animals were housed in troop enclosures with an outdoor facility and fed nonhuman primate chow per day and a combination of fresh bananas, apples, carrots 3 days per week. All protocols were in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals. Fresh feces from 8 SIV-AGM, 8 SIV + AGM, 10 SIV-CM, and 10 SIV-RM with the age between 5-10 years old were chosen for metagenomic and metabonomic sequencing. Fresh feces were collected, immediately frozen, kept on dry ice during transportation to NovoGene (Tianjin, China) for further processing. DNA extraction, library construction and sequencing DNA extraction of a frozen aliquot (200 mg) of each fecal sample was performed by MagPure Stool DNA KF kit B (Magen, China) following manufacturer's introduction. The concentration of fecal DNA was measured by a Qubit Fluorometer by using Qubit dsDNA BR Assay kit (Invitrogen, USA) and further estimated by agarose gel electrophoresis. A total amount of 1 µg DNA per sample was used as input material for the DNA sample preparations. Sequencing libraries were generated using NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA) following manufacturer's recommendations. The DNA sample was fragmented by sonication to a size of 350 bp, end-polished, A-tailed, and ligated with the full length adaptor for Illumina sequencing with further PCR ampli cation. The PCR products were puri ed (AMPure XP system) and libraries were analysed for size distribution by Agilent 2100 Bioanalyzer. The clustering of the index-coded samples was performed on a cBot Cluster Generation System according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced on an Illunima HiSeq platform and paired-end reads were generated.
Taxonomy prediction DIAMOND software [30] (V0.9.9, http://github.com/bbuch nk/diamond/), was used to blast the Unigenes to the sequences of Bacteria, Fungi, Archaea and Viruses which were all extracted from the NR database (Version:2018-01-02, http://www.ncbi.nlm.nih.gov) of NCBI. Krona analysis, the exhibition of generation situation of relative abundance, the exhibition of abundance cluster heat map, PCA (R ade4 package, Version 2.15.3) and NMDS (R vegan package, Version 2.15.3) decrease-dimension analysis were based on the abundance table of each taxonomic hierarchy. Differences between groups were tested by Anosim analysis (R vegan package, Version 2.15.3). Metastats and LEfSe analysis were used to look for the different species between groups. Permutation tests between groups were used in Metastats analysis for each taxonomy and Benjamini and Hochberg False Discovery Rate was used for correction of the P value and acquire q value. LEfSe analysis was conducted by LEfSe software. Finally, random forest (R pROC and randomForest packages, Version 2.15.3) was used to construct a random forest model.

Common functional database annotations
The nucleotide sequences of the gene catalog were translated into amino acid sequences and aligned against the proteins or domains in eggNOG database (2015-10), CAZy database (2017-09), COG database (2014-11), Swiss-prot database (2017-07), the KEGG database (89.1) and CARD database (4.0) by DIAMOND with an E value cutoff of 1e − 5. KEGG annotation was performed using an in-house pipeline, where each protein was assigned to a KO when the highest-scoring annotated hit contained at least one alignment over 60 hits.Resistance gene annotation was performed by aligning the Unigenes to CARD database (http://card.mcmaster.ca/).

KEGG pathway enrichment analysis and rarefaction curve analysis
One-tailed Wilcoxon rank-sum test was performed for all the KOs that occured in more than ve samples and adjusted for multiple testing using the Benjamin-Hochberg procedure. Calculations of the Z-score, the aggregated Z-score for a KEGG pathway, the adjusted Z-score for a KEGG pathway were performed as described [31]. A reporter score of ≥ 1.6 (90% con dence according to normal distribution) could be used as a detection threshold for signi cantly differentiating pathways. Rarefaction analysis was performed to assess the gene richness.

Within sample diversity
Based on the species pro le, within sample (alpha & beta) diversity was calculated to estimate the species richness of a sample. Shannon index was used for quantizing the alpha diversity with the formula as described [32]. The beta diversity was calculated using Bray-Curtis distance [33].

Analysis of the microbiome through Real-Time Quantitative Reverse-Transcription Polymerase Chain Reaction
Fresh feces from 20 SIV-AGM and 20 SIV-CM as well as those of 8 SIV-AGM and 8 SIV + AGM were extracted by QIAamp Fast DNA Stool Mini Kit (Qiagen). Relative abundance of the selected bacterial population were analysed through SYBR-based quantitative PCR with an CFX real-time PCR systems (Biorad). The primer sets speci c for each bacteria are shown in Table 1. Table 1 Sequences of the primers used in the SYBR-green-based quantitative RT-PCR validation.

Metabolites Extraction
Feces (100 mg) were individuallly grounded with liquid nitrogen and the homogenate was resuspended with prechilled 80% methanol and 0.1% formic acid by well vortexing. The samples were incubated on ice for 5 min and then centrifuged at 15000 rpm, 4℃ for 5 min. Some of the supernatant was diluted to nal concentration containing 53% methanol by LC-MS grade water. The samples were subsequently transfered to a fresh Eppendorf tube and then centrifuged at 15000 g, 4 ℃ for 10 min. Finally, the supernatant was injected into the LC-MS/MS system analysis.

Data processing and metabolite identi cation
The raw data les generated by UHPLC-MS/MS were processed using the Compound Discoverer 3.1 (CD3.1, Thermo Fisher) to perform peak alignment, peak picking, and quantitation for each metabolite.
The main parameters were set as follows: retention time tolerance, 0.2 min; actual mass tolerance, 5 ppm; signal intensity tolerance, 30%; signal/noise ratio, 3; minimum intensity, 100,000. After that, peak intensities were normalized to the total spectral intensity. The normalized data was used to predict the molecular formula based on additive ions, molecular ion peaks and fragment ions. And then peaks were matched with the mzCloud (http://www.mzcloud.org), mzVault and MassList database to obtain the accurate qualitative and relative quantitative results. Statistical analyses were performed using the statistical software R (R version R-3.4.3), Python (Python 2.7.6 version) and CentOS (CentOS release 6.6).
When data were not normally distributed, normal transformations were attempted using area normalization method.

Data Analysis of the Untargeted Metabolomics
These metabolites were annotated using the KEGG database (http://www.genome,jp/kegg/pathway.html), HMDB database (http://hmdb.ca/metabolites). Principal components analysis (PCA) and Partial least squares discriminant analysis (PLS-DA) were performed at metaX (a exible and comprehensive software for processing metabolomics data). We applied univariate analysis (t-test) to calculate the statistical signi cance (P-value). The metabolites with VIP > 1 and Pvalue < 0.05 and fold change (FC) ≥ 2 or FC ≦ 0.5 were considered to be differential metabolites.
For clustering heat map, the data were normalized using z-scores of the intensity areas of differential metabolites and were ploted by Pheatmap package in R language. The correlation between differential metabolites were analyzed by cor () in R language with the pearson method. Statistically signi cant of correlation between differential metabolites were calculated by cor.mtest () in R language. P-value < 0.05 was considered as statistically signi cant and correlation plots were ploted by corrplot package in R language. The functions of these metabolites and metabolic pathways were studied using the KEGG database. The metabolic pathways enrichment of differential metabolites was performed. When ratio were satis ed by x/n > y/n, metabolic pathway were considered as enrichment. When P-value of metabolic pathway < 0.05, metabolic pathway were considered as statistically signi cant enrichment.

Statistical Analysis
Values were expressed as mean ± SD. The comparative CT method was used in real-time qRT-PCR assay according to the delta-delta CT method. Statistical analyses were performed using GraphPad Prism version 5.01. t test and Wilcoxon rank sum test were used to compared statistical differences. The data were considered statistically signi cant at P < 0.05.

Results
Comparison of the gut microbiota pro les between AGM_N, CM, RM Rarefaction analysis based on the core and pan gene numbers of the samples was performed which both demonstrated a curve approaching saturation (Fig.S1). Based on taxonomical annotation, composition of the microbiota at the phylum level varied with a higher abundance of Bacteroidetes in CM, a higher abundance of Spirochaetes in RM, and a higher abundance of Actinobacteria in AGM_N (Fig.S2). Composition of the microbiota at the genus level also varied between different groups (Fig. 1&S3). Compared with CM and RM, signi cant decrease in the abundances of Streptococcus, Alistipes, Treponema, Bacteoides, Methanobrevibacter, Methanobrevibacter (P < 0.01), as well as signi cant increase in the abundances of Clostridium, Eubacterium, Blautia, Roseburia, Faecalibacterium, Dialister (P < 0.01) were demonstrated in AGM_N (Fig. 1&S3). Cluster analysis of the relative abundances of the microbiota species at the genus level by the Bray-Curtis distances was shown in Fig. 2.
Principal component analysis based on the gene level (Fig. 3a) and KEGG pro le (Fig. 3b) demonstrated that the gut microbiome of AGM_P and AGM_N can be clearly distinguished from that of CM and RM. And the CM gut microbiome is closer to the RM microbiome than the AGM microbiome (Fig. 3). The distribution of CAZy classes between different groups were shown in Fig. 4a. Compared with CM, the alpha-galactosidase (EC 3.2.1.22), which is involved in lipid metabolism, was signi cantly downregulated in AGM_N (P < 0.01) (Fig.S4&S5). Compared with RM, the beta-glucosylceramidase (EC 3.2.1.45), which is also involved in lipid metabolism, was signi cantly upregulated in AGM_N (P < 0.05) (Fig.S4&S5). The signi cantly enriched terms by KEGG Orthology between different groups were shown in which the 3oxoacyl-[acyl-carrier protein] reductase (K00059) and beta-galactosidase (K01190) was related to fatty acid biosynthesis/metabolism and lipid metabolism, respectively (Fig.S6). Differences in the ARO (Antibiotic Resistance Ontology) distribution were shown in Fig. 4b.

Validation of the differential gut microbiota pro le of AGM
The most studied microbial metabolites that in uence immune system homeostasis are short-chain fatty acids (SCFAs), such as acetate, propionate and butyrate. Among SCFAs, butyrate has multiple regulatory roles at the gut level, exerting an anti-in ammatory effect on both intestinal epithelial cells and immune cells, and in uence host gut health [34,35]. Several bacteria, such as Coprococcus catus, Eubacterium rectale, Eubacterium hallii, Faecalibacterium prausnitzii, Roseburia spp, produce butyrate [36].

Comparison of the metabonomic pro les between AGM_N and CM
There were altogether 255 positive and 176 negtive metabolites identi ed with signi cantly different levels between AGM_N and CM group (Fig. 6, Table 2).The KEGG pathway annotation of the negtive metabolites and positive metabolites between AGM_N and CM demonstrated that the top 2 annotated KEGG pathways were amino acid metabolism and lipid metabolism (Fig.S7). The lipidmaps annotation of the negtive metabolites and positive metabolites between AGM_N and CM indicated that fatty acids and conjugates, avonoids, steroids were the top 3 annotated lipids (Fig.S8). The most enriched terms of by KEGG Orthology between AGM_N and CM are biosynthesis of unsaturated fatty acids, carbon metabolism, biosynthesis of amino acids, galactose metabolism for the negtive metabolites as well as metabolic pathways, steroid hormone biosynthesis, vitamin digestion and absorption, glycerophospholipid metabolism for the positive metabolites (Fig. 7). Next, correlation between metabolites, as well as correlation between metabolite and microbiota species, was analysed. Among the negtive metabolites, the gentisic acid (Com 1389), which is related to pyruvate metabolism and butanoate metabolism, was found to be at higher levels and positively correlated to  (Fig.S9). All the above metabolites were negatively correlated to the abundances of Haloarchaeobius, Methanobrevibacter, and Methanothermococcus in AGM_N (Fig. 9). These results indicated decreased capacity in lipid metabolism as well as increased capacity in butanoate metabolism in AGM_N compared with CM.

Comparison of the gut microbiota and metabonomic pro les between AGM_P and AGM_N
Clear changes in the gut microbiome composition have been reported between HIV-infected and uninfected individuals both in the early and chronic infection stage [37]. Although the fecal microbiota of healthy humans and macaques shares many similarities, there was not such strong clustering of bacterial communities associated with SIV infection as that in HIV infection [38]. Compared with CM, our data demonstrated an signi cant enrichment in Prevotella and Ruminococcaceae and decreased abundances of Bacteroides, Alistipes, Lactobacillus and Streptococcus in the gut microbiome composition of AGM_N, resembling a trend as found in early SIV infection [39] as well as in chronic HIV infection [41][42][43]. Compared with AGM_N, a trend in the increased abundances of Streptococcus and Roseburia were shown in AGM_P (Fig.S3). Compared with AGM_N, most of the lipid metabolites were found to be at lower levels in AGM_P and positively correlated to each other (Fig.S11, S12, S13, S14), indicating an increased capacity in lipid metabolism in AGM during chronic SIV infection. LPE 20:1 (Com 6182) and PE 16:2e/16:1 (Com 15213) were positively correlate to the microbiota species ,such as Methanoregula, in AGM_P (Fig.S15).

Discussion
NHPs share similar characteristics in many aspects to humans when they are used in models of neuroscience, infectious diseases and drug safety testing. Differences also exist between NHP species such as CM and AGM. AGM are resistant to AIDS progression caused by simian immunode ciency virus and can be used as a model of spontaneous hypertension. Besides, genetic differences between CM and AGM in uence the diet responsiveness. However, the mechanisms in shaping their speci c biological features remain not clearly understood.
Natural hosts have co-evolved with SIV and are capable of avoiding disease progression. The mechanisms of this phenotype may diverge. Unlike AGM, sooty mangabey (SM) maintains healthy frequencied of CD4 + T cells and genome sequencing has identi ed two gene products (ICAM-2 and TLR-4), which show structural differences that may in uence cell-surface expression (ICAM-2) and downstream signalling (TLR-4) [44]. In AGM, the co-evolution with SIVagm may be accounted for partially by the development of CD4 − CD8a dim T-cells from memory CD4 + T-cells. Like other natural hosts of SIV, CD4-like immunological functions can be elicited by CD4 − T-cells in AGM [45] and preservation of CD4 + Tcell function may contribute to the lack of immune activation in AGM and SM [46,47]. By assessing gene expression pro les from acutely SIV infected AGM and RM, a recent study found that AGM rapidly activate and maintain evolutionarily conserved regenerative wound healing mechanisms in mucosal tissue, which preserves mucosal integrity and prevents in ammatory insults that underlie immune exhaustion in RM [24]. The most studied microbial metabolites that in uence immune system homeostasis are short-chain fatty acids (SCFAs), such as acetate, propionate and butyrate. Among SCFAs, butyrate has multiple regulatory roles at the gut level, exerting an anti-in ammatory effect on both intestinal epithelial cells and immune cells, and in uence host gut health [30,31]. Several bacteria, such as Coprococcus catus, Eubacterium rectale, Eubacterium hallii, Faecalibacterium prausnitzii, Roseburia spp, produce butyrate [32]. In this study, pro les and possible roles of the AGM gut microbiome and metabolites in shaping these differences are investigated.
Our data con rmed that AGM possessed a higher abundance of the butyrate-producing Clostridium, Ruminococcus, Eubacterium, Roseburia, and Faecalibacterium, indicating an enhanced ability in facilitating immune system homeostasis and mucosal integrity. Also, AGM possessed a higher abundance of Blautia which belongs to the family of Lachnospiraceae and classi ed as bene cial of which administration can attennuates obesity, in ammation and dysbiosis [48]. Upon SIV infection, AGM exhibited an signi cant enrichment in Prevotella and Ruminococcaceae and decreased abundances of Bacteroides, Alistipes, Lactobacillus and Streptococcus in the gut microbiome composition, resembling a trend as found in early SIV infection [39] as well as in chronic HIV infection [41][42][43]. Upregulation of several metabolites, such as the butanoate metabolism-related gentisic acid, the anti-in ammatary FAHFA (fatty acid hydroxy fatty acids), in AGM also correlates with ndings in the gut microbiome composition. Differences in metabolites involved in lipid metabolism between AGM and CM suggest their different capacities in lipid metabolism. Intersetingly, AGM with chronic SIV infection demonstrated enhanced activity in lipid metabolism compared with SIV-AGM, suggesting a possible role of lipid metabolism in adaption to SIV infection. All these results re ect a natural selection pressure of SIV infection and evolutionary adaption of the AGM as a natural host of SIV to avoid AIDS progression.

Conclusions
Compared with CM and RM, AGM possesses different capacities of lipid metabolism and butanoate metabolism, which correlates with gut homeostasis and mucosal integrity. These ndings may provide insight into AIDS resistance in AGM.

Declarations
Funding This study was funded by the National Projects of Infectious Disease under Grant No.2017ZX10304402003.

Avaliability of data and materials
Please contact author for data requests.
Authors' contributions     Quantitative RT-PCR analysis of the relative bacterial abundance of feces samples =20from AGM and nCM. a Validation analysis of the relative bacterial abundance of feces samples from SIV-AGM (n=20) and nCM (n=20) via quantitative RT-PCR. b Comparison analysis of the relative bacterial abundance of feces samples from SIV-AGM (n=8) and SIV+ AGM (n=8) via quantitative RT-PCR.  The KEGG enrichment bubble map based on the relative levels of the differential metabolites between AGM_N and CM group. a KEGG map based on the negative metabolites with signi cantly different levels between AGM_N and CM. b KEGG map based on the positive metabolites with signi cantly different levels between AGM_N and CM. P<0.05 was considered statistically signi cant.

Figure 8
Correlation analysis of the differential levels between the negative metabolites from AGM_N and CM group. P<0.05 was considered statistically signi cant.

Figure 9
Correlation analysis of negative metabolites with differential levels and the microbiota at genus level between AGM_N and CM group. P<0.05 was considered statistically signi cant.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.