Demographics and Clinical Data
A total of 46 PD patients and their paired healthy spouses were recruited in this study (Table 1). There was no significant difference in sex ratio or age between the PD and Spouse groups. PD patients had an average age of onset of 60.0 ± 6.5 years, a disease duration of 3.0 (1.0-5.0) years, a Hoehn and Yahr (H&Y) stage of 1.8 (1.0-2.5), and a Unified Parkinson’s Disease Rating Scale (UPDRS) total score of 30.5 (20.8-43.0). According to clinical phenotype, patients were divided into 3 subgroups: tremor dominant (TD, n = 20), postural instability and gait difficulty (PIGD, n = 20) and indeterminate (n = 6). Hereditary factors play a role in PD, and 4 (9.7%) patients in our study had a family history of the disease. Pesticide exposure is a risk factor for PD, and 4 (9.7%) patients had a history of pesticide exposure (Additional file 1). The prevalence of constipation differed significantly between the two groups (63.0% vs 10.7%, P < 0.001).
Sequencing data and taxonomic composition of all samples
A total of 5812686124 raw reads were obtained from 92 samples. The average number (mean ± SD) of raw reads per sample was 62387159 ± 17744069 in the PD group and 63975582 ± 24281624 in the Spouse group. After filtering, the number of clean reads per sample was 59484557 ± 16536993 and 62907844 ± 24719821 in the PD and Spouse groups, respectively (Additional file 2). GraPhlAn was used to construct the classification tree (Fig. 1), and the gut microbiota were annotated to Archaea, Bacteria, Eukaryota and Viruses. The average relative abundances of these kingdoms were 0.108%, 99.762%, 0.005% and 0.125% in PD patients, and 0.036%, 99.184%, 0.002% and 0.778% in spouses, respectively. The relative abundance of Viruses was significantly lower in patients than in spouses (P = 0.002). Bacteroidetes, Firmicutes, Proteobacteria and Actinobacteria were the dominant bacterial phyla, together accounting for more than 98% of the relative abundance. Among the remaining phyla, each with a relative abundance of less than 1%, Viruses_noname in Viruses accounted for 0.45% and Euryarchaeota in Archaea accounted for 0.07%. At the species level, the relative abundances of Prevotella_copri, Bacteroides_stercoris, Faecalibacterium_prausnitzii, Escherichia_coli and Bacteroides_uniformis were more than 3% in both groups.
Composition of gut microbiota between PD and Spouse groups
The relative abundance of gut microbiota was compared between the two groups at the phylum, family and species levels. No significant difference was found in the top 5 phyla. A total of 70 families were annotated; Bacteroidaceae was significantly increased in PD, while Prevotellaceae was significantly decreased. Among the top 10 of all 484 species, Prevotella_copri and Bacteroides_fragilis were significantly decreased in patients, while Bacteroides_stercoris and Escherichia_coli were significantly increased. A heatmap (Fig. 2a) shows the composition of the top 50 species in all samples, and a boxplot (Fig. 2b) shows the abundance of the top 15 species in the two groups. In total, 23 species differed significantly between the two groups, and principal coordinates analysis (PCoA) (Fig. 2c) based on the Bray-Curtis distance matrix showed significantly different beta diversity (analysis of similarities, ANOSIM: R = 0.035, P = 0.036).
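The ANOSIM R statistic reported above can be illustrated with a minimal sketch. The code below is not the pipeline used in this study; it assumes the standard definitions of Bray-Curtis dissimilarity and of ANOSIM R (difference between mean between-group and mean within-group rank, divided by half the number of sample pairs). The toy profiles and the function names are hypothetical, and the permutation test that yields the P value is omitted.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(profiles, labels):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (M / 2),
    where M is the number of sample pairs."""
    pairs = list(combinations(range(len(profiles)), 2))
    dists = [bray_curtis(profiles[i], profiles[j]) for i, j in pairs]
    ranks = average_ranks(dists)
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    return (sum(between) / len(between) - sum(within) / len(within)) / (len(pairs) / 2)


# Hypothetical toy data: relative abundances of three species in four samples.
toy_profiles = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.2, 0.1, 0.7]]
toy_labels = ["PD", "PD", "Spouse", "Spouse"]
toy_r = anosim_r(toy_profiles, toy_labels)
```

In this toy example the two groups are perfectly separated, so R reaches its maximum; an R close to 0, as in the comparison reported here, indicates that between-group dissimilarities are only slightly larger than within-group ones.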
Correlation between gut microbiota and clinical features of PD
All samples were divided into < 60, 60-70 and > 70 years subgroups to further analyze the differences in Prevotella_copri (Fig. 3a). Prevotella_copri was significantly decreased in patients of the 60-70 and > 70 subgroups compared with the paired spouse subgroups. Moreover, the relative abundance of Prevotella_copri in PD decreased significantly with increasing age, whereas this trend was not observed in the spouse subgroups.
The structure of the gut microbiota in patients was analyzed, and significant differences were found among the 3 age subgroups (Fig. 3b). We then selected 22 species that were significantly correlated with group factors (Spearman's test, P < 0.05), and used a generalized linear model (GLM) to calculate the correlation coefficients between microbiota and clinical features of the disease, such as age, disease duration and severity (H&Y stage, UPDRS score). Most of the identified species were negatively correlated with clinical features of the disease (Table 2). Among the 7 species with an average relative abundance of more than 0.1%, Prevotella_copri was significantly negatively correlated with age and UPDRS Ⅲ score. Parabacteroides_merdae was negatively correlated with disease duration, UPDRS total score, and UPDRS Ⅰ and Ⅳ scores. Alistipes_onderdonkii was negatively correlated with age, H&Y stage, and UPDRS Ⅱ and Ⅲ scores.
The gut microbiota of patients with family history, dysosmia, constipation, pesticide exposure and sleep disorder were further studied. The results indicated that altered microbiota were significantly correlated with these features (Additional file 3).
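As an illustration of the Spearman correlation used for species selection, the following minimal sketch computes the rank correlation coefficient between a clinical feature and a species abundance. It assumes the standard definition (Pearson correlation of average ranks); the data and function names are hypothetical, and the P value and the subsequent GLM step are not reproduced here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: abundance of one species falls as age rises.
toy_age = [55, 62, 68, 74, 80]
toy_abundance = [0.30, 0.22, 0.15, 0.09, 0.02]
toy_rho = spearman_rho(toy_age, toy_abundance)
```

A negative rho, as in this toy example, corresponds to the direction of association reported for most species in Table 2.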
Prediction models for PD based on gut microbiota biomarkers
To evaluate the predictive value of gut microbiota for the disease, we first identified 23 species whose abundance differed significantly between the PD and Spouse groups using the Wilcoxon rank-sum test. We then selected 6 important species using Boruta (Fig. 4a) and constructed a random forest (RF) classification model. The relative abundances of Prevotella_copri, Parabacteroides_merdae, Alistipes_onderdonkii, Bacteroides_fragilis, Lachaceae__3_1_57 and Providencia_rettgeri were used to predict disease status. The area under the curve (AUC) of the random forest model (Fig. 4b) was 0.772 (95% CI: 0.559-0.985; sensitivity: 0.875; specificity: 0.500).
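The performance metrics reported for the classifier can be illustrated with a minimal sketch. The code assumes the standard definitions of sensitivity, specificity and AUC (the latter computed as the probability that a randomly chosen case scores higher than a randomly chosen control); the scores, the 0.5 threshold and the function names are hypothetical and do not come from the model in this study.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities of PD (label 1 = PD, 0 = spouse).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, 0.5)
```

Because AUC summarizes ranking across all thresholds while sensitivity and specificity depend on a single cutoff, a model can show high sensitivity with low specificity at one cutoff and still have a moderate AUC.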
Altered functional pathways of gut microbiota in PD patients
Statistical Analysis of Metagenomic Profiles (STAMP) software and LDA Effect Size (LEfSe) were used to compare microbial functional pathways between the PD and Spouse groups, and we focused on the pathways identified by both methods. By mapping to the MetaCyc database, STAMP and LEfSe identified 15 and 42 PWY pathways, respectively, that changed significantly between the PD and Spouse groups (Fig. 5a,b; Additional file 4). Pathways associated with aromatic amino acid degradation/chorismate metabolism were significantly increased, while biosynthesis-related pathways were significantly decreased. Patients also showed a significant increase in γ-aminobutyric acid (GABA) degradation, carbohydrate metabolism, and methylphosphonate degradation pathways. With regard to vitamin metabolism, pathways in vitamin B1 synthesis were increased in both groups, while the synthetic pathway in patients was mainly contributed by Escherichia_coli. In addition, the vitamin B6 synthesis pathway was significantly increased in patients. An advantage of metagenomic sequencing is that functional annotation can be resolved to the species level. Seven functional pathways of Prevotella_copri were investigated, and the pathways involving UMP biosynthesis I, S-adenosyl-L-methionine cycle I and guanosine ribonucleotides de novo biosynthesis differed significantly between the PD and Spouse groups.
Clean data were also mapped to the Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology database, and the 50 KEGG orthologs (KOs) with the highest relative abundance are shown in a heatmap (Fig. 5c). STAMP and LEfSe identified 61 and 200 KOs, respectively, and the 30 overlapping KOs were summarized and annotated (Additional file 5). There were 163 overlapping Clusters of Orthologous Groups of proteins (COGs) and 32 overlapping gene ontology (GO) terms (Additional files 6 and 7). Bray-Curtis distance matrices based on these genes showed that gene composition differed between the two groups (Fig. 5d,e).
The relationships between clinical features of PD and the differential functional pathways were also analyzed, and clinical features were found to be related to many of the functional pathways (Fig. 6). For example, aerobactin biosynthesis was positively associated with UPDRS Ⅳ score, while gluconeogenesis I and L-methionine biosynthesis III were negatively associated with H&Y stage.