2.1 Data Quality Assessment
Raw reads obtained from Contig sequencing were assembled with the software MEGAHIT (Gurevich et al., 2013) in default parameter. Contig sequences shorter than 300bp were discarded, and the assembly results were evaluated by QUAST (Zhu et al., 2013) with default parameter. Results demonstrated that the number of Contig in different experimental groups was about 600,000, and greatly varied in length. The N50 length was about 1000 bp-1200 bp with the alignment rate exceeding 95% (Table 1).
Table 1
Assembled data assessment
Sample | Total Length (bp) | Contig Number | Largest Length (bp) | GC (%) | N50 (bp) | Mapped (%) |
F1 | 713374913 | 679554 | 332933 | 41.78 | 1115 | 97.07 |
F2 | 601410151 | 594324 | 264723 | 41.58 | 1061 | 95.68 |
F3 | 621800601 | 613040 | 315309 | 41.63 | 1065 | 95.84 |
T1 | 573217820 | 519415 | 184590 | 41.85 | 1194 | 96.05 |
T2 | 716521041 | 613618 | 721453 | 41.86 | 1290 | 97.17 |
T3 | 649395570 | 557814 | 186681 | 42.01 | 1287 | 96.87 |
W1 | 604749023 | 581793 | 363719 | 42.28 | 1072 | 95.93 |
W2 | 712670354 | 668870 | 399518 | 42.11 | 1116 | 96.38 |
W3 | 680722887 | 639170 | 744843 | 42.37 | 1108 | 96.35 |
Notes: Sample: sample number; Total Length: sum of the base numbers of all contigs; Contig Numbers: number of contigs after assembly; Largest Length: number of bases in the longest contig; N50: contigs were sorted from long to short and the cumulative length was counted. When a contig was added and the cumulative length was equal to half of the sum of the lengths of all contigs, the length of the contig was N50. Mapped: alignment rate between sequencing reads and assembled contigs. |
MetaGeneMark (Fu et al., 2012) was used to identify coding regions in the genome with default parameters. Gene prediction was conducted according to the assembled Contigs. The number of genes in different samples was about 500,000. The average gene length measured in each sample was in the range of 260 to 340 bp. Cd-hit software (Version 4.6.6) was used to remove the redundancy with default parameters, the similarity threshold was set at 95% and the coverage threshold at 90% (Fu et al., 2012). It was concluded that there were 2168041 non-redundant genes obtained in this sequence process, with an average length of 280 bp (Table 2).
Table 2
Sample ID | Gene number | Total length (bp) | Average (bp) | Max length (bp) | Min length (bp) |
F1 | 550158 | 152574384 | 277 | 7536 | 102 |
F2 | 464889 | 125423319 | 269 | 9972 | 102 |
F3 | 485384 | 132226104 | 272 | 7536 | 102 |
T1 | 426498 | 117609837 | 275 | 8286 | 102 |
T2 | 531807 | 152545194 | 286 | 8175 | 102 |
T3 | 472169 | 132783018 | 281 | 8904 | 102 |
W1 | 506309 | 164800824 | 325 | 12033 | 102 |
W2 | 579598 | 184957656 | 319 | 10866 | 102 |
W3 | 576423 | 195543927 | 339 | 11607 | 102 |
Gene set | 2168041 | 608038053 | 280 | 12033 | 102 |
Notes: Samples: sample number; Genes Numbers: the number of predicted genes; Total Length: the bases sum of predicted genes; Average Length: the average bases of predicted genes. |
2.2 Species Information Statistics
The species composition and relative abundance of the samples were obtained by comparing the above non-redundant genes with the species information of the sequence in the Nr database (Ashburner et al., 2000). The intestinal microbial species of the black soldier Fly larvae in different groups were counted in taxa of kingdom, phylum, class, order, family, genus, and species. The gut larvae microbiota was abundant, reaching more than 2300 genera and 11,000 species. The number of microbial species on different taxa in different groups were relatively similar, but a significant difference was observed in genus between groups W and T. In general, the intestinal microbe species of group W were higher, but not in a significant level than those of groups F and T (Table 3).
Table 3
Microbial species statistics
Group | kingdom | phylum | class | order | Family | genus | species |
F | 6 | 115.33 ± 1.15 | 124.67 ± 1.15 | 269.67 ± 1.53 | 623.33 ± 0.58 | 2309.67 ± 9.61 ab | 11510.33 ± 25.50 |
T | 6 | 115.33 ± 1.53 | 125.33 ± 0.58 | 269.33 ± 2.08 | 619 ± 5.20 | 2283 ± 7.81 a | 11432.33 ± 68.72 |
W | 6 | 111.67 ± 1.15 | 125.33 ± 1.15 | 269.67 ± 0.58 | 622 ± 3.46 | 2321.67 ± 10.11 b | 11880 ± 17.09 |
Note: Data in Table 3 are indicate means ± standard error. Nonparametric ANOVA was used to analyze the data in the same column. Different letter in the same column indicate a significant difference(p < 0.05)between groups.
2.4 Similarity Analysis of Microbial and Functional Genes between Groups
The microbiota and functional gene composition of the samples were hierarchically clustered through R and the unweighted paired average method (UPGMA). Based on this, the similarity of species composition and functional gene composition of each sample was determined. The sample distance in the sample hierarchy clustering indicates the similarity of the species composition of the two samples. Results showed that samples in same groups are closer and the branches are shorter, therefore the microbial species structure and functional gene composition of the samples between the same treatment groups were comparable (Figs. 2a, and 2c).
Principal Component Analysis (PCA) decomposes the differences of multiple sets of data on the two-dimensional coordinate chart through processing complex data into two eigenvalues. Composition analysis of different samples (97% similarity) reflects differences and distances between samples. Composition of the species in the two samples are similar, when they come closer on a PCA diagram. The results showed that the same sample in same group were closer to each other, which demonstrated that the microbial composition and functional gene composition were comparable. The sum of the first dimension (98.4%) and the second dimension (1.13%) reached 99.53%, which explain the difference between different groups to a great extent (Fig. 2b, Fig. 2d).
2.5 Analysis of Differential Microbes
About 100 species were selected (p < 0.05) to draw differential heatmaps. According to the heatmap, there exists 35 high relative abundance species in group W (Faecalicatena contorta, Stenotrophomonas acidaminiphila, Sphingobacterium cellulitidis, Salana multivorans, Enterococcus sp. Gos25 − 1, Lachnospiraceae bacterium OF09-33XD, Pusillimonas caeni, Hungatella hathewayi, Sphingobacterium sp. 30C10-4-7, Stenotrophomonas sp. Leaf70, Pusillimonas sp. T7-7, uncultured Stenotrophomonas sp., Frischella perrara, Myroides sp. N17-2, Microbacterium sp. CH12i, Ochrobactrum sp. A44, Microbacterium ginsengiterrae, Kluyvera georgiana, Sphingobacterium gobiense, Myroides odoratus, Klebsiella pneumoniae, Enterococcus pallens, Sphingobacterium lactis, Candidatus Schmidhempelia bombi, Aequorivita soesokkakensis, Vitellibacter aquimaris, Enterococcus sp. 9D6 DIV0238, Providencia stuartii, Miniimonas sp. PCH200, Gilliamella apicola, Orbus hercynius, Sphingobacterium mizutaii, Sphingobacterium sp. 1.A.4, Microbacterium profundi, Thermus filiformis), 34 in group T(Dysgonomonas capnocytophagoides, Allomyces macrogynus, Enterococcus sp. 9E7 DIV0242, Candidatus Erwinia dacicola, Enterococcus sp. 4G2 DIV0659, Propionibacteriaceae bacterium 16Sb5-5, Desulfovibrio sp. DS-1, Marinifilum breve, Tatumella sp. UCD-D suzukii, Bacillus velezensis, Staphylococcus gallinarum, Staphylococcus sciuri, Dysgonomonas sp. Marseille-P4361, Klebsiella aerogenes, Providencia rettgeri, Mycolicibacterium mucogenicum, Corynebacterium nuruki, Ruaniaceae bacterium KH17, Corynebacterium stationis, Enterococcus sp. 6C8 DIV0013, Enterococcus faecium, Providencia rustigianii, Wohlfahrtiimonas populi, Weissella jogaejeotgali, Bacillus amyloliquefaciens, Enterococcus casseliflavus, Bacteroides thetaiotaomicron CAG:40, Weissella thailandensis, Enterobacter hormaechei, Staphylococcus xylosus, Enterococcus saccharolyticus, Carnobacterium maltaromaticum, Spizellomyces punctatus), and 22 in group F(Citrobacter sp. MH181794, Schaalia canis, Metarhizium majus, Candida maltose, Nosocomiicoccus massiliensis, Lactobacillus sp. 54 − 5, Dysgonomonas gadei, Dorea longicatena, Prevotella sp. 10(H), Leminorella grimontii, Bacteroides caccae, Variovorax sp. EL159, Solibacillus isronensis, Proteus mirabilis, Actinomyces sp., Staphylococcus lentus, Rozella allomycis, Ignatzschineria sp. F8392, Leucobacter triazinivorans, Campylobacter concisus, Bacteroides sp. 1 1 14, Azospira oryzae). The relative high abundance species of the different groups concentrate on different species without obvious overlap in the heatmap. Although the relative high abundance species of the differential species in Group F was the lowest, the relatively low abundance species were also less than the other two groups (Fig. 3).
2.6 Analysis of Differential Functional Gene
The heatmap of different functional genes was obtained through a parametric test. The heatmap of Kegg (Kanehisa et al., 2004) metabolic pathways in differential abundance gene shows that groups possess differences in polysaccharide biosynthesis and metabolism, translation, membrane transport and energy metabolism. Group W results showed highest relative abundance in four biological process, group F maintained middle level abundance in those four-biological process, and group T showed low abundance in those four-biological process (Fig. 4a). The EggNOG (Powell et al., 2014) heatmap evidences the differences of cytoskeleton, extracellular structures, inorganic ion transport and metabolism, nucleotide transport and metabolism, and coenzyme transport and metabolism in different groups. Group W showed high abundance in extracellular structure, inorganic ion transport and metabolism, nucleotide metabolism and transport, coenzyme transport and metabolism, whereas the related functional genes of cytoskeleton in group T showed relatively high abundance. The extracellular structures of group F were comparable to those of group W, whereas the other categories were less than those of group W (Fig. 4b).