Bacterial community and composition of soil in world’s longevity township Jiaoling, China by 16S rRNA high-throughput sequencing


 Healthy longevity is a complicated process, however, the underlying mechanisms between longevity and microbiota warrant investigation. To address this, we characterized a longevity trajectory of environmental microbiota in a longevity township. We used high-throughput sequencing of the 16S rRNA gene to analyse the composition and function of soil microbiota. The composition and diversity of soil microbiota significantly differed between towns. The dominant bacteria at the phylum level included Proteobacteria, Firmicutes, and Acidobacteria. At the genus level, Chujaibacter, Acidipila, and Lactobacillus were dominant. However, Steroidobacter, Comamonas, and Pseudoxanthomonas were only dominant in Xinpu with high centenarian population. Twelve biomarkers were responsible for significant differences between towns, including Lactobacillus, Muribaculaceae, Ruminococcaceae, Lachnospiraceae, and Chitinophagales, etc. The main species contributing to the differences of towns were Chujaibacter, Acidipila, Lactobacillus, Rhodanobacter, Lysobacter, Bryobacter, Granulicella, Flavobacterium, and Mizugakiibacter. The function of exosome, cysteine and methionine metabolism, amino acid-related enzymes, peptidases, starch and sucrose metabolism, etc., were predicted. Thus, we have revealed significant differences in the composition and diversity of soil microbiota in the world’s longevity township, the relationship between soil microbiota and long-lived people. These findings provide a research foundation for the role of soil microbiota in healthy longevity.


Introduction
Throughout the ages, humans have explored the mystery of health and longevity. Longevity is a complex feature. It is related to many factors, such as geography and meteorological, economic, medical, and health conditions. It is also affected by genetic, environmental, and other factors, and ideally would mainly depend on the rate of ageing.
In the biomedical eld, the research has focused on searching for molecular and biological factors that promote healthy ageing and longevity. Considerable attention has been paid to analyzing the role of genetic factors in determining healthy ageing and longevity. The main factors that affect longevity and ageing are growth hormone (GH) and insulin/insulin-like growth factor (insulin/IGF-1) pathways 1 in various organisms 2,3 . Further, overexpression of the Forkhead box O3 gene (FOXO3) in model organisms is related to prolonged lifespan [4][5][6] . Another protein bene cial for longevity and metabolic regulation is AMP-activated protein kinase (AMPK) 7 . Besides, deacetylase family genes (Sirtuins) 8,9 , the apolipoprotein E gene (APOE) 10 , Telomerase 11 , the mammalian target of rapamycin (mTOR) signalling pathway 12,13 , the tumour suppressor gene P53 14 , the transcription factor NF-κB 15,16 , the autophagy-lysosomal signalling pathway [17][18][19] , long-chain non-coding RNAs 20,21 play important roles between health and longevity.
Finally, methionine sulfoxide is considered to be a marker of biological ageing 22 . Methionine sulfoxide reductase is a speci c antioxidant enzyme that removes this modi cation of proteins, and at the same time, acts as a general cellular antioxidant to scavenge free radicals and protect the cell from biological oxidative stress 23 .
The environment also plays a very important role in determining longevity, as people are sensitive to the environment 24,25 . For example, dietary restriction, i.e., food intake control to achieve balanced nutrition 26 , can delay the rate of ageing. Further, appropriate exercise strengthens the body, enhances immunity resistance, and delays ageing. Age positively correlates with free radical levels in humans. Moderate exercise increases the levels of free radical-scavenging enzymes and their activity to delay ageing 27 .
Among the above factors, the natural environment is key to longevity. For human beings to live a long and healthy life, they must establish and maintain a harmonious relationship with the natural environment and living environment. Topography and landforms, climate, soil, water, and other natural geographical environmental factors are the main determinants of longevity 28 . Soil is the material basis on which organisms rely for survival. The most essential trace elements from the soil affect water quality, plant, and human health through the food chain 29,30 .
The rapid development of science and technology in the 21st century has enabled researchers to explore the relationship between health and longevity, and the environment. Metagenomics with high-throughput sequencing, combined with multi-omics technologies, is used to analyse the gut microbiota of the elderly, and the correlation and contribution of microorganisms to health and longevity from the perspective of the microbiota. The microbiota co-evolve with human and are vital to human health. However, changes in environmental microbiota composition and function in longevity areas have not been fully studied. To address this question and improve the understanding of whether and how the environment support health and longevity, it is necessary to analyse the environmental microbiota in such areas. To investigate the association between the soil microbiota and longevity in counties of China, we studied Jiaoling, one of the world's longevity townships. The obtained data will help identify commonalities and differences between regional longevity factors; enable correlation analyses between the soil microbiota, health and longevity; and provide a theoretical basis for guiding future research on health and longevity.

Results
Characteristics of the long-lived population and sample sequencing Jiaoling County is a liated to Meizhou City, Guangdong Province. It is located in the northeast of Guangdong Province In eight towns in the Jiaoling County (T1, Guangfu; T2, Nanzhai; T3, Wenfu; T4, Changtan; T5, Jiaocheng; T6, Lanfang; T7, Sanzhen; and T8, Xinpu), the proportion of the population over the age of 80 in the total population over the age of 80 (approximately 8365 people) is 6.91%, 10.57%, 9.07%, 9.12%, 22.30%, 11.73%, 8.67%, and 21.64%, respectively. The elderly population is the largest in Jiaocheng, and the smallest in Guangfu (Fig. 1). According to the proportion of the population aged 80-89 years old, 90-99 years old, and over 100 years old to the total population over-80, Xinpu with the highest proportion of centenarians, followed by Changtan, and the lowest in Nanzhai (Fig. 1). The proportion of centenarians shows a decreasing distribution trend along the Shiku River, from south to north.
Soil samples were collected in the eight towns, bacterial genomic DNA isolated, and the 16S rRNA gene sequenced using the Ion S5 TM XL sequencing platform (by single-end sequencing), generating smallfragment libraries. By shearing and ltering the reads, an average of 83,564 reads per sample was obtained, with an average of 78,271 valid reads after quality control. The e ciency of quality control was 93.74%. The sequences were then clustered into OTUs, at 97% identity, yielding 10,528 OTUs, which were nally annotated using the Silva132 database.
Composition and structure of the soil microbiota Based on the OTU analysis and clustering, shared and unique OTUs were identi ed in different groups.
Overall, 1827 OTUs were shared in the T1-T8 samples, and the number of unique OTUs in each group is 58, 96, 422, 351, 129, 401, 51, 153, respectively (Supplementary Figure S1A). Based on species annotation, the top 10 most abundant bacteria at each classi cation level (phylum, class, order, family, and genus) were identi ed in each group, and a cumulative bar map of species relative abundance was generated to allow visual assessment of the relative bacterial abundance and their proportions at the different classi cation levels in each sample. The relative abundance at phylum levels is presented in a histogram ( Fig. 2A). The major abundant phyla were Proteobacteria, Firmicutes, Acidobacteria, Bacteroidetes, Actinobacteria, Chloro exi, Rokubacteria, Planctomycetes, and Gemmatimonadetes. Among these, bacteria from Proteobacteria, Acidobacteria, and Bacteroidetes were dominant at all taxonomic levels. Based on species annotation and abundance at the genus level, 35 most abundant genera were identi ed, clustered at the species and sample levels, and a heat map was generated (Fig. 2B) To further analyse the phylogenetic relationship of species at the genus level, representative sequences of the top 100 genera were obtained using multiple sequence alignment. A phylogenetic tree constructed using representative sequences of species at the genus level is shown in Fig. 2C. The analysis revealed that Lactobacillus, unidenti ed Acidobacteria, Flavobacterium, Lysobacter, Chujaibacter, Acidibacter, unidenti ed Gammaproteobacteria, and Bryobacter were distributed in the eight groups. For each classi cation result, species representing the top 10 genera with the highest relative abundance were selected for speci c species classi cation tree analysis (Fig. 2D). The analysis revealed that Lactobacillus, a species of special interest, accounted for 1.332% of all species and 15.38% of the selected species. The percentages of Lactobacillus animalis, Lactobacillus delbrueckii, Lactobacillus gasseri, Lactobacillus mucosae, and Lactobacillus reuteri among the selected species were 7.44%, 0.01%, 4.27%, 0.03%, and 2.76%, respectively. Alpha diversity and beta diversity analyses Species diversity curves (dilution curves and hierarchical clustering curves) and species accumulation box plots were used to evaluate the differences in species richness and diversity of microbial communities in each sample. The rarefaction curve ( Fig. 3A) tended to be at, indicating that the sequencing data was su cient for the analysis. Additional data would generate only a small number of new species (OTUs), and indirectly re ect the abundance of species in the sample. For the rank abundance curve (Fig. 3B) in the horizontal dimension, the span of the curve gradually increased, indicating a higher richness of species; in the vertical dimension, the curve gradually attened, indicating a more uniform species distribution. The box plot of species accumulation (Fig. 3C), the sample size increased, and the box plot position tended to be at. This meant that the species richness in the environment would not signi cantly increase with increasing sample size, indicating that the sample size was su cient for data analysis. The alpha diversity, assessed by the Chao1 index, was signi cantly different between the T2 group and the T1, T3, T4, and T5 groups (p < 0.05) (Fig. 3D). Wilcoxon rank-sum analysis to test the alpha diversity between any two groups revealed that the alpha diversity in the T2 group was signi cantly lower than that in the other groups.
Beta diversity is a comparative analysis of microbial community composition in different samples. Beta diversity analysis based on the weighted Unifrac beta-Wilcox rank-sum test index (Fig. 3E) revealed statistically signi cant differences between the T2 group and the T3, T4, T5, and T6 groups (p < 0.05).
Signi cant differences between the T5 group and the T7 and T8 groups were also apparent (p < 0.05). To analyse the similarity between samples, the weighted Unifrac distance matrix was used for UPGMA cluster analysis, and the clustering results were integrated and displayed with the relative abundance of each sample at the phylum level (Fig. 3F). The relatively small distance between the T6 and T8 groups, and the T3 and T4 groups, indicated a similar species composition structure. Therefore, samples with similar community structures tended to cluster together, while samples with very different communities were further apart. The T2 and T8 groups had the farthest distance and the largest community difference.
The communities in each group exhibit certain inter-group differences, indicating the accuracy of grouping. In the beta diversity analysis, the weighted Unifrac distance index was used to measure the coe cient of difference between the two samples. The smaller the coe cient value, the smaller the difference in species diversity between the two samples. The distance matrix heat map constructed based on the weighted Unifrac distance (Fig. 3G) revealed a big difference between the T2 group and other groups. Signi cance analysis of species differences between groups Alpha diversity index analysis of differences between groups is used to determine whether the overall community structure in different groups is signi cantly different. The species responsible for this difference are then identi ed using LEfSe and ternary plot analysis. The identi ed species are a group biomarker. Accordingly, we used LEfSe analysis to detect the species differentiating different groups by the rank-sum test of species abundance. A histogram of the different species LDA value distribution is shown in Fig. 4A, and an evolutionary branch diagram of the different species is shown in Fig. 4B. The former revealed biomarker species with LDA score greater than 4. The LEfSe analysis revealed 12 biomarkers, namely, Lactobacillus, Muribaculaceae, Ruminococcaceae, Lachnospiraceae, Chitinophagales, and others ( Fig. 4C-G).
Next, to identify the differences in the dominant species in three groups of samples at each classi cation level (phylum, class, order, family, genus, and species), the top 10 species with the average abundance in the three groups of samples at the genus level were used to generate a ternary plot (ternary phase diagram) (Supplementary Figure S2). Difference analysis of the dominant species in the T1-T3 samples revealed that Lactobacillus was dominant in the T2 group, and Chujaibacter and Granulicella were dominant in the T1 group (Supplementary Figure S2A). Further, for the T2-T4 samples, Lactobacillus was dominant in the T2 group; and Chujaibacter, Acidipila, Mizugakiibacter, and Rhodanobacter were dominant in the T4 group (Supplementary Figure S2B). For the T3-T5 samples, unidenti ed Prevotellaceae was dominant in the T3 group, and Chujaibacter and Acidipila were dominant in the T4 group (Supplementary Figure S2C). For the T4-T6 samples, Chujaibacter and Acidipila were dominant in the T4 group, and Lactobacillus and unidenti ed Prevotellaceae were dominant in the T6 group (Supplementary Figure S2D). For the T5-T7 samples, Chujaibacter was dominant in the T7 group, and Lactobacillus and unidenti ed Prevotellaceae were dominant in the T6 group (Supplementary Figure  S2E). Finally, for the T6-T8 samples, Flavobacterium was dominant in the T8 group, and Chujaibacter was dominant in the T7 group (Supplementary Figure S2F). Overall, each group had a different dominant species. This species may constitute the characteristic of each group and should be analysed further.
Next, to determine individual species contribution to the above differences, Similarity percentage (Simper) analysis was performed. Simper is a decomposition of the Bray-Curtis difference index, used to quantify each species contribution to the difference between two groups. The analysis revealed the top 10 species and their abundances, contributing to the difference between the two groups (Supplementary Figure S3). The top 10 species contributing to the difference between two groups and their abundance rankings, from high to low, were: Chujaibacter, Acidipila, Lactobacillus, Rhodanobacter, unidenti ed Prevotellaceae, Lysobacter, Bryobacter, Granulicella, Flavobacterium, and Mizugakiibacter. The contribution of the species to different groups differed, with the highest contribution from Chujaibacter, followed by Acidipila and Lactobacillus.
Co-occurrence network of the soil microbiota The species co-occurrence network diagram enables intuitive visualisation of the impact of different environmental factors on microbial adaptability and the dominant species and closely interacting populations of species that occupy a dominant position in a speci c environment. These dominant species and populations often play a unique and important role in maintaining a stable structure and function of the environmental microbial community. The analysis of the signi cance of the differences between groups and their contribution revealed that Lactobacillus was important for differentiating the groups. Therefore, the co-occurrence network of Lactobacillus was next analysed. In the soil microbiota co-occurrence network (Fig. 5) Hence, in addition to exerting a probiotic effect, Lactobacillus could also exert a synergistic effect via positive and negative regulation of the strains mentioned above.
Predicted functions of the soil microbiota Using the annotation data (Supplementary Figure S4B), the top 10 functional annotations were retrieved for the most abundant bacteria at each annotation level, and a histogram of relative functional abundance was generated to visualise the relatively most abundant features at different annotation levels (Supplementary Figure S4A). The identi ed KEGG pathway annotations mainly focused on cellular processes, environmental information processing, genetic information processing, human diseases, metabolism, organic systems. Among them, metabolism had the highest proportion, and organic systems had the lowest proportion. Based on the functional annotation and sample abundance information, the top 35 abundance features and their abundances in each sample were used to generate a heat map, and clustered functional differences. On the horizontal level 1 of the clustering heat map (Fig. 6A), the T2 group had the highest proportion of genetic information processing. On the horizontal level 2 of the clustering heat map (Fig. 6B), immune system, glycan biosynthesis and metabolism, and carbohydrate metabolism had a higher level of function in the T2 group than in others. On the horizontal level 3 of the clustering heat map (Fig. 6C), the abundances of exosome, cysteine and methionine Metabolism, glycolysis/gluconeogenesis, alanine, aspartate and glutamate metabolism, amino acid-related enzymes, peptidases, starch and sucrose metabolism, amino sugar and nucleotide sugar metabolism, etc. were only relatively high in the T2 group. The T2 group had unique features also in the horizontal level k of clustering heat map (Fig. 6D)

Discussion
In this study, we aimed to reveal the composition, diversity, and function of soil microbial community in the world's longevity township, Jiaoling (China), to compare microbiota composition differences between different towns. Prediction of local community compositions and functions 31  This nding might emphasise the in uence of geographical changes on the composition of the local microbial community. Based on the Chao1 index and Wilcoxon rank-sum analysis of the alpha diversity of samples, the alpha diversity of the T2 group was signi cantly lower than that of the other groups. The beta diversity analysis based on the weighted Unifrac beta-Wilcox rank-sum test index revealed statistically signi cant differences between the T2 and other groups (p < 0.05). Perhaps the diversity of the T2 group was signi cantly different from other groups was precisely linked to the different geographical localisation of the above species.
The analysis of the differences between the alpha and beta diversity indices revealed that different groups' overall community structures were signi cantly different. The LEfSe analysis of T1-T8 samples revealed biomarkers, including Lactobacillus, Muribaculaceae, Ruminococcaceae, Lachnospiraceae, and Chitinophagales. Different species have different contributions to different groups; the highest contribution was that of Chujaibacter, followed by Acidipila and Lactobacillus. The environmental microbiota is complex and diverse. We observed that the dominant strain in the T2 group was Lactobacillus. This prompts the urgent questions of how its survival among thousands of microorganisms and in the colony affects the local microbiota and is affected by local microbiota. Using the species co-occurrence network diagram, we intuitively visualise the dominant species and closely interacting populations that are dominant in speci c environments. These dominant species and populations often play a unique and important role in maintaining the stable structure and function of the microbial community in the environment 36 . In the soil microbial co-occurrence network, approximately 20 strains were positively correlated with Lactobacillus, including Anaerostipes, Blautia, Dialister, and Streptococcus; and approximately 35 strains negatively correlated with Lactobacillus, including Minicystis, Rhizorhapis, Phycicoccus, and Oligo exus. In addition to its good viability under various conditions, Lactobacillus can also exert a synergistic effect on other microbes by regulating the growth of the above-mentioned impacted strains. Lactobacillus, as a general probiotic, is widely distributed, e.g., large numbers of Lactobacillus have been isolated from the environment (soil and water), food (fermented dairy products, fermented soy products, etc.) 37 , and various organisms (human, animals, etc.) 38 . Their presence greatly affects human life and health. Probiotics are currently used to solve problems in various elds, including agriculture, aquaculture, food processing industry, medical treatment, etc., especially in the food and medicine, and, recently, cosmetics industries.
Molecular methods, such as metagenomics (estimating microbial composition and genome capacity), metatranscriptomics (estimating gene expression), and metaproteomics (estimating protein synthesis), provide insights into the functional pro le of the entire soil communities. The expression of most functional genes in the soil is different in different communities, and is driven by climate and soil conditions 39 . Clustering based on the functional annotation and published abundance information can determine the functional difference levels. On the horizontal level 2 of the clustering heat map, immune system, glycan biosynthesis and metabolism, and carbohydrate metabolism had a higher level of function in the T2 group than in others. The abundances of exosome, cysteine and methionine metabolism, glycolysis/gluconeogenesis, alanine, aspartate and glutamate metabolism, amino acidrelated enzymes, peptidases, starch and sucrose metabolism, amino sugar and nucleotide sugar metabolism, etc. were only relatively high in the T2 group on the horizontal level 3 of the clustering heat map. Collectively, T2 group samples have unique functional characteristics at different levels. This may be related to its unique geographical environment, and the underpinning factors require further research.
Longevity is comprehensively in uenced by environment (soil, water, air, diet, lifestyle, exposome, medication, development, economy), host (genetics, gender, age, stress, psychological factors), microbiome (homeostasis, composition, diversity, function), among other factors. (Fig. 7). The microbiome may play a key role in the connection between the environment and the host. Thus, we may need to ensure a healthy gut microbiota in order to be longevous. The speci c bacterial groups found in this study may be related to host ageing and used to promote a healthy microbiota and longevity.
There are some limitations to this study. First, the functions of the soil microbiota were only predicted using high-throughput sequencing. It is di cult to isolate as many types of microorganisms as those analyzed using high-throughput sequencing data. In the future, in vivo and in vitro functional veri cation tests on isolated strains will need to be conducted to ensure the accuracy of the high-throughput sequencing analysis results. Second, seasonal changes in the environmental microbiota were not considered, and must be taken into account in future studies. Third, we did not carry out a in-depth longitudinal study of the host microbiota, which should be considered in future studies.

Conclusion
In the current study, 64 bacterial phyla, 74 classes, 148 orders, 276 families, 664 genera, and 513 species were identi ed in all soil samples. We found that considering the composition and diversity of the soil microbial community in Jiaoling, China, the world's longevity township, at the phylum level, the dominant bacteria were Proteobacteria, Firmicutes, and Acidobacteria. The dominant bacteria at the genus level were Chujaibacter, Acidipila, and Lactobacillus. However, Bacteroides, Lactobacillus, Ralstonia, Collimonas, Candidatus Solibacter, Candidatus Koribacter were only dominant in the T2 group. The alpha diversity and beta diversity of the T2 group were signi cantly different from those of other groups. That was because of the different geographic locations of these species. LEfSe and ternary plot analysis revealed 12 biomarker microbes that were responsible for signi cant differences between the groups, mainly including Lactobacillus, Muribaculaceae, Ruminococcaceae, Lachnospiraceae, Chitinophagales, etc. Simper analysis indicated the species that contributed to the above differences between the groups and their contribution size. The contribution from high to low was Chujaibacter, Acidipila, Lactobacillus, Rhodanobacter, unidenti ed Prevotellaceae, Lysobacter, Bryobacter, Granulicella, Flavobacterium, and Mizugakiibacter. The contributions of different species were different in different groups, with the highest contribution of Chujaibacter, followed by Acidipila and Lactobacillus.The microbiome may play a key role in the connection between the environment and the host. Regional environmental factor soil microbiota as a potential bridge between environment and long-lived People. These data provide a basis for understanding the signi cant differences in the composition and diversity of soil microbial communities in the long-lived people in longevity township Jiaoling, China.  40 was rst used to remove low-quality fragments of reads. Next, the data were ltered for preliminary quality control of the raw reads. Then, to remove the chimera sequences and obtain clean reads, they were processed by comparing with an annotation database as described elsewhere 41  . QIIme (v1.9.1) was used to calculate the Shannon index and the Unifrac distance, and to construct the sample-clustering tree using the unweighted pair group method with arithmetic mean (UPGMA). R (v2.15.3) was used for dilution curve, rank abundance curve, and species accumulation curve calculations, and to analyze the differences in alpha and beta diversity indeces between the groups using the Wilcoxon test in the agricolae package for parametric test. Linear discriminant analysis effect size (LEfSe) software was used to nd biomarkers using a default score setting of linear discriminant analysis (LDA) of 4. Based on species abundance, the correlation coe cient (Spearman correlation coe cient or Pearson correlation coe cient) was calculated between the genera, to obtain a correlation coe cient matrix. For edges, graphviz (v2.38.0) was used to draw the network graph. The Tax4Fun function prediction was executed using the nearest-neighbor method based on the minimum 16S rRNA sequence similarity. The speci c method involved extracting the 16S rRNA gene sequences from the whole prokaryotic genome of the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and comparing them to those available in the SILVA SSU Ref NR database (BLAST bitscore > 1,500) using the BLASTN algorithm to establish a correlation matrix.