Animals
Twenty adult rhesus macaques at the ages of 6-15 years old were provided by State Key Laboratory of Primate Biomedical Research of Kunming University of Science and Technology. Adult rhesus macaquesat were individually caged, which included ten macaques diagnosed as spontaneous osteoarthritis and ten health rhesus macaque as normal control. which included ten macaques diagnosed as spontaneous osteoarthritis and ten health rhesus macaque as normal control. All of the animals were maintained in a 12 hours light: 12 hours darkness cycle, temperature was kept at 18–26℃ and humidity from 40 to 70%. All procedures were approved by the Institutional Animal Care and Use Committee of Kunming University of Science and Technology (protocol number: LPBR20170201), and were carried out in accordance with the Guide for the Care and Use of Laboratory Animals (8th edition).
Identification of Spontaneous Osteoarthritis in Rhesus Macaques
The animals were anesthetized by intramuscular injection of ketamine with a volume of 5mg/kg. The magnetic resonance imaging (MRI) scan was conducted on the 3T machine (Siemens). The MRI scanning of the knees was performed on the rhesus monkeys. Articular cartilage was quantitatively assessed based on T1 rho (TR: 420.0 TE: 12.0) and T2 (TR: 3090.0 TE: 12.0) relaxation times. Cartilage thickness and signal intensity of the surfaces of the patella, medial and lateral femoral were measured.
Fecal Sample Collection and DNA Extraction
Fresh fecal samples were collected in sterile tubes from the twenty rhesus macaques. Then, the fecal samples were transferred to the laboratory immediately in an ice bath and stored at −80°C (not more than 3 months). The isolation of purified microbial genomic DNA was performed from each fecal sample using a MoBioPowerSoil® DNA Extraction Kit (Arlsbad, CA, USA) according to the manufacturer’s recommendation. The DNA concentration was measured using Qubit® DNA Assay Kit in Qubit® 2.0 Flurometer (Life Technologies, CA, USA).
Library Preparation for Sequencing
Each sample needed a total amount of 700ng DNA to be used as input material for the DNA sample preparations. According to the manufacturer’s recommendation, sequencing libraries were generated using NEB Next® Ultra DNA Library Prep Kit for Illumina® (NEB, USA), and index codes were added to attribute sequences for each sample.
Clustering and Sequencing
In the cBot Cluster Generation System, the clustering of the index-coded samples was performed by HiSeq 4000 PE Cluster Kit (Illumina) according to the manufacturer’s instructions. After cluster generation, the library preparations were sequenced on an Illumina Hiseq 4000 platform and 150 bp paired-end reads were generated.
Metagenome data analyses
Assembly of the metagenome and construction of the gene catalog
Raw paired-end reads were processed to exclude: (1) adaptor sequences ; (2) low-quality reads that have more than 40% of bases with a quality score < 5, (3) reads containing more than 10% unknown bases; (4) reads mapped to host genome (NCBI Reference genome: Mmul_8.0.1/rheMac8, Macaca mulatta) by BWA-MEM [36]. Finally, paired reads longer than 75 bp were selected as high-quality-reads. For each sample, Megahit v1.0.6 [37] was used to assemble the high-quality-reads under pair-end mode with default parameters, respectively. Then, Prodigal v2.6.3 [38] was used to perform gene prediction using contigs (a length threshold of 500bp) with parameter “-p meta”.Then, the non-redundant gene catalog was constructed using cd-hit-est v4.6.6 [39] based on the predicted ORFs (length longer than 100bp were selected), and the redundant genes were removed using a sequence identity cut-off of 0.95. Additionally, the functional assignments of the non-redundant proteins were performed based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.
Taxonomic annotation
Taxonomic annotation of protein sequences generated by Prodigal were performed by DIAMOND v0.8.28.90 [40] alignment against the NCBI-NR database using CARMA3 [41] with the default parameters. To obtain the relative gene abundance, the high-quality-reads from each sample were aligned against the non-redundant gene catalog by BWA-MEM using the criteria of length ³50 bp and identity > 0.95. The sequence-based relative abundance calculation referred to a previously described method [42]. The relative abundances of Phylum, genus, species and KO were calculated by summing the abundance of genes belonging to each category for each sample. Metastats analysis was conducted to investigate the difference of the relative abundance for each species and gene between the two groups [43]. A multi-comparison adjusted Q < 0.05 was used to define significant differences.
Microbial composition analysis
For microbial diversity analysis, Shannon index and Simpson index were used to describe the α-diversity (intergroup diversity), using R package “vegan”. A PCoA analysis were performed to describe the β-diversity (intragroup diversity) by the R package “vegan” (vegdist was used to calculate the Bray-Curtis dissimilarity values), ggplot2 was used to do visualization.
Discovery of biomarkers
The genomic features(organisms and clades) were identified by a metagenomic biomarker discovery approach called Linear discriminant analysis Effect Size (LEfSe: https://huttenhower.sph.harvard.edu/galaxy/) [44].Kruskal-Wallis and pairwise Wilcoxon tests were implemented, followed by a Linear discriminant analysis (LDA) to evaluate the effect size determined by LEfSe of each differentially abundant taxon. Bacteria with considerably increased values were defined as those with an LDA score (log10) of over 2. By way of class comparison, tests of biological consistency and effect size estimation to address the differences between multi microbial communities.