3.1. Sequencing and quality control
DNA and RNA extracted from the intestinal contents (fecal samples) of 82 forest rodents collected in the forest area of Hengdaohezi Town, Hailin City, Heilongjiang Province, were sequenced. The insert size was 350 bp. Bases with overlapping information, low-quality bases, and undetermined bases were excluded. The amounts of clean data obtained from the six samples were: 2,384.93 (MR), 2,090.33 (AP), 1,994.32 (AA), 2,391.77 (TS), 2,007.30 (SV), and 2,148.70 (CT). Sequencing data quality was assessed using the Q20 quality score to ensure that the subsequent advanced analyses could proceed normally. The clean sequence tags were dereplicated using the Mothur software to obtain unique sequence tags. The percentages of effective sequences in the six samples were: 95.242% (MR), 93.561% (AP), 97.509% (AA), 97.258% (TS), 93.121% (SV), and 94.622% (CT) (Table 1).
Table 1. Fecal samples used for metagenomic analysis and data generation
Sample | InsertSize (bp) | SeqStrategy | RawData | CleanData | Clean_Q20 | Clean_Q30 | Clean_GC (%) | Effective (%)
MR | 350 | (150:150) | 2,504.06 | 2,384.93 | 76.53 | 65.45 | 58.69 | 95.242
AP | 350 | (150:150) | 2,234.18 | 2,090.33 | 75.94 | 64.76 | 57.70 | 93.561
AA | 350 | (150:150) | 2,045.27 | 1,994.32 | 87.12 | 78.68 | 52.56 | 97.509
TS | 350 | (150:150) | 2,459.21 | 2,391.77 | 83.63 | 73.78 | 52.87 | 97.258
SV | 350 | (150:150) | 2,155.58 | 2,007.30 | 73.76 | 62.15 | 59.05 | 93.121
CT | 350 | (150:150) | 2,270.82 | 2,148.70 | 78.59 | 67.93 | 55.53 | 94.622
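The Effective (%) column of Table 1 is consistent with the ratio of clean data to raw data. The source does not state this definition explicitly, so the short sketch below treats it as an assumption and simply checks it against the reported values; the function name `effective_rate` is introduced here for illustration only.

```python
# Values copied from Table 1: (RawData, CleanData, reported Effective %).
TABLE_1 = {
    "MR": (2504.06, 2384.93, 95.242),
    "AP": (2234.18, 2090.33, 93.561),
    "AA": (2045.27, 1994.32, 97.509),
    "TS": (2459.21, 2391.77, 97.258),
    "SV": (2155.58, 2007.30, 93.121),
    "CT": (2270.82, 2148.70, 94.622),
}


def effective_rate(raw: float, clean: float) -> float:
    """Percentage of raw data retained after filtering.

    Assumed definition: 100 * CleanData / RawData (not stated in the source).
    """
    if raw <= 0:
        raise ValueError("raw data volume must be positive")
    return 100.0 * clean / raw


if __name__ == "__main__":
    for sample, (raw, clean, reported) in TABLE_1.items():
        computed = effective_rate(raw, clean)
        print(f"{sample}: computed {computed:.3f}%, reported {reported:.3f}%")
```

Under this assumption the computed values agree with the reported column to within rounding of the tabulated inputs.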
3.2. Viral Communities in fecal samples, based on family-level classifications
The pre-treated clean data for all samples and the assembled sequences were compared to the reference genomes of viruses in the NCBI database using the BLASTX and BLASTN tools in the BLAST+ software package to obtain the virus annotation results. In total, 82 families of mammalian viruses, plant viruses, phages, insect viruses, and fungal viruses were identified. An overview of the reads of the top 35 virus families in each sample is shown in Figure 1. In addition, the classification of families, genera, and species for each sample is visualized in Figure 2A-C.
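As an illustration of how BLAST hits can be turned into family-level proportions, the following minimal sketch keeps the best hit per read from BLAST+ tabular output (the default 12-column `-outfmt 6` layout) and tallies families. It is not the authors' pipeline: the subject-to-family lookup table, the accession names, and the function names are hypothetical, and assigning each read to its highest-bitscore hit is an assumed rule.

```python
from collections import Counter


def best_hits(blast_lines):
    """Return {query: subject} keeping the highest-bitscore hit per query.

    Expects BLAST+ tabular lines with the default outfmt 6 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def family_percentages(hits, subject_to_family):
    """Percentage of annotated reads per family; unmapped subjects are
    counted as 'unclassified'."""
    counts = Counter(subject_to_family.get(s, "unclassified") for s in hits.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: 100.0 * n / total for family, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical reads and accessions, for illustration only.
    demo = [
        "read1\tACC_A\t98.0\t150\t3\t0\t1\t150\t10\t159\t1e-70\t270",
        "read1\tACC_B\t90.0\t150\t15\t0\t1\t150\t10\t159\t1e-50\t200",
        "read2\tACC_B\t95.0\t150\t7\t0\t1\t150\t20\t169\t1e-60\t240",
        "read3\tACC_C\t92.0\t150\t12\t0\t1\t150\t5\t154\t1e-55\t220",
        "read4\tACC_A\t99.0\t150\t1\t0\t1\t150\t30\t179\t1e-72\t275",
    ]
    families = {"ACC_A": "Orthomyxoviridae", "ACC_B": "Herpesviridae"}
    print(family_percentages(best_hits(demo), families))
```

In practice the lookup table would be built from the taxonomy of the reference database, and reads without any hit would need to be tracked separately if proportions of total reads are required.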
(1) Single-stranded RNA viruses (Orthomyxoviridae, Picobirnaviridae, Bunyaviridae, and Arenaviridae)
Members of the family Orthomyxoviridae can cause cyclical pandemics throughout the world in various species [13]. In this study, reads from this family were assigned to the genus Influenzavirus A and the species influenza A virus. Reads related to the family Orthomyxoviridae accounted for the largest proportion of viral reads. The percentages of this family in each sample were: 45.04% (MR), 51.57% (AP), 41.08% (AA), 41.9% (TS), 27.59% (SV), and 22.1% (CT) (Figure 2A).
Members of the family Picobirnaviridae cause a wide variety of mucocutaneous, encephalic, cardiac, hepatic, neurological, and respiratory diseases in vertebrate hosts [14]. Viruses of the family Picobirnaviridae were found in all six samples. They were assigned to the genus Picobirnavirus and the species human picobirnavirus, Microtus picobirnavirus V-111_USA_2008, and fox picobirnavirus. Notably, human picobirnavirus accounted for a larger proportion of viruses in the SV sample (2.64%) (Figure 2C).
Members of the family Bunyaviridae are highly infectious, widely distributed, associated with a high fatality rate, and can cause serious infectious diseases in humans and animals [15]. Most members of this family, such as Rift Valley fever virus, Crimean-Congo hemorrhagic fever virus, La Crosse encephalitis virus, and Hantavirus, cause deadly diseases in humans. The natural hosts of Hantavirus are rodents, and it can cause hemorrhagic fever with renal syndrome. Viruses of this family were present in the samples of all six forest rodent species. Among these, the abundance in the feces of SV (3.49%) was higher than that in the other samples. The viruses of this family were assigned to the genus Orthobunyavirus and the species Shamonda virus (Figure 2B, C).
Arenaviridae is a family of enveloped RNA viruses found worldwide. Lassa fever virus, Junin virus, and Machupo virus can cause severe diseases with a high mortality rate [16]. Because these viruses are mainly rodent-borne, the prevalence of the associated infectious diseases is closely related to the local dynamic distribution of rodents. In this study, viruses of this family were detected in the feces of all six rodent species. They were assigned to the genus Mammarenavirus and the species Lassa mammarenavirus.
(2) DNA viruses (Herpesviridae)
Viruses of the family Herpesviridae are enveloped, double-stranded DNA viruses, divided into three subfamilies based on phylogenetic clustering: α-herpesviruses, β-herpesviruses, and γ-herpesviruses [17]. This family was detected in the fecal samples of all six rodent species. The viruses in this family were assigned to the genera Cytomegalovirus, Varicellovirus, and Mardivirus, and to the species Cercopithecine herpesvirus 5 and Gallid herpesvirus 2 (Figure 2B, C).
(3) Other rare viruses (Nodaviridae, Baculoviridae, Tombusviridae, Myoviridae)
Insect viruses (Nodaviridae, Baculoviridae), plant viruses (Tombusviridae), and phages (Myoviridae) were identified in the fecal samples. The viruses in the family Nodaviridae were assigned to the genera Alphanodavirus and Betanodavirus and to the species Pariacoto virus and Barfin flounder nervous necrosis virus, respectively. The viruses in the family Tombusviridae were assigned to the genus Tombusvirus, but none were assigned to a species among the top 10 most widely distributed. Notably, AA and CT did not contain any members of Tombusviridae. No viruses in the families Baculoviridae or Myoviridae were assigned to genera or species among the top 10 most widely distributed.
(4) Unclassified viruses
Currently, there is little information about unclassified viruses and their evolution in forest rodents. In our data, many reads in all samples were classified as "unclassified virus sequences" (Figure 2A); these are likely to represent previously unidentified viruses that have not been studied. The identification and characterization of these unclassified viruses will provide insight into the evolutionary histories of other clinically important viruses, as well as the genetic basis of their infectivity and virulence in humans and other animals. Such information is important for the development of future treatment options and for vaccine research.
3.3. Genome features of the novel astrovirus
Using mNGS, we identified a novel astrovirus in the feces of Myodes rufocanus, which was tentatively named Rodent Astrovirus CHNDB/2019. Rapid Annotation using Subsystem Technology (RAST, http://rast.nmpdr.org) was used to annotate the complete genome sequence. The results revealed that the Rodent Astrovirus CHNDB/2019 genome consists of 6,192 nt and contains two open reading frames: ORF1, located at nucleotide (nt) positions 35–3750, encodes a polyprotein, and ORF2 is located at nt positions 3719–6178.
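The reported coordinates imply that the two open reading frames overlap by a short stretch. The sketch below works through that arithmetic, assuming the positions are 1-based and inclusive (the usual convention for annotation coordinates, though not stated in the source); the `Orf` class and `overlap_length` helper are introduced here for illustration.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)

    @property
    def length(self) -> int:
        """Number of nucleotides spanned by the ORF."""
        return self.end - self.start + 1


def overlap_length(a: Orf, b: Orf) -> int:
    """Number of positions shared by two ORFs (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


GENOME_LENGTH = 6192              # nt, as reported
ORF1 = Orf("ORF1", 35, 3750)      # coordinates as reported
ORF2 = Orf("ORF2", 3719, 6178)    # coordinates as reported

if __name__ == "__main__":
    for orf in (ORF1, ORF2):
        print(f"{orf.name}: {orf.start}-{orf.end} ({orf.length} nt)")
    print(f"Overlap: {overlap_length(ORF1, ORF2)} nt")
```

With these inputs, ORF1 spans 3,716 nt, ORF2 spans 2,460 nt, and the two overlap by 32 nt (positions 3719–3750).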
3.4. Phylogenetic Analysis
The phylogenetic analysis showed that ORF1 (Figure 3B), ORF2 (Figure 3C), and the complete genome (Figure 3A) had the highest similarity and the shortest genetic distance to the rodent astrovirus isolate CHN/100, which was isolated in Jiaxiang County, Shandong Province. The nt sequence similarity between the complete genome and CHN/100 was 84.39%, whereas the sequence similarity with other rodent astroviruses was 67.74%–78.08%. The nucleotide sequence similarity between ORF1 and orf1b of rodent astrovirus isolate CHN/100 was 86.2%, and the amino acid similarity was 74.45%. The nucleotide sequence similarity between ORF2 and orf2 of rodent astrovirus isolate CHN/100 was 81.27%, while the amino acid sequence similarity was 63.61%. These results indicate that the virus is a rodent astrovirus (GenBank submission number MW927503).
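Similarity percentages of this kind are typically derived from pairwise alignments. The source does not say which tool or definition was used, so the following is only a minimal sketch of one common convention: percent identity computed over aligned columns in which neither sequence has a gap. The function name and the toy sequences are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two already-aligned sequences.

    Columns where either sequence has a gap are ignored (one of several
    conventions in use; others count gaps in the denominator).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 8 ungapped columns, 7 of them identical.
    print(percent_identity("ACGT-ACGT", "ACGTTACGA"))  # 87.5
```

Because the choice of denominator changes the value, reported similarities are only comparable when the same alignment method and definition are applied to every pair.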
3.5. Infectivity of Astroviruses in BHK Cells
The fecal samples containing the novel astrovirus were inoculated into BHK-21 cells to detect the presence of infectious virus. After four to five blind passages, a cytopathic effect (CPE) appeared in the BHK-21 cells. IFA and qRT-PCR were used to test for astrovirus, which was detected in the inoculated cells. IFA using antibodies specific to the astrovirus capsid spike protein VP27 confirmed the presence of the virus (Figure 4A). In addition, to determine whether astrovirus was present in the inoculated BHK-21 cells, qRT-PCR was performed using the astrovirus ORF1 gene-specific primers ORF1-F (5'-CAGTCCTTGGGATTTCTC-3') and ORF1-R (5'-TATTCTTTCGCACCATTAG-3') (Figure 4B).
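For readers who want to sanity-check the primer pair, the sketch below computes basic properties of the two sequences given above: length, GC content, and a rough melting temperature from the Wallace rule, Tm = 2(A+T) + 4(G+C). The Wallace rule is a coarse approximation for short oligonucleotides and is used here as an assumption; the source does not report how, or whether, primer Tm was estimated.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "ORF1-F": "CAGTCCTTGGGATTTCTC",
    "ORF1-R": "TATTCTTTCGCACCATTAG",
}


def _validated(seq: str) -> str:
    """Upper-case the sequence and reject non-ACGT characters."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return seq


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = _validated(seq)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees Celsius by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = _validated(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, "
              f"Wallace Tm ~{wallace_tm(seq)} C")
```

By this estimate ORF1-F (18 nt, 50.0% GC) has a Tm of about 54 °C and ORF1-R (19 nt, about 36.8% GC) about 52 °C; nearest-neighbor methods would give different, generally more reliable, values.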