Metagenomic Analysis of the Virome in Lung Tissues and Guts of Wild Mice

Background: Mice are hosts for a variety of pathogens and can spread 60 kinds of human diseases, including those caused by more than ten families of viruses such as Poxviridae and Herpesviridae.
Methods: In this study, lung tissues and gut samples of 7-week-old mice from outdoor environments were sequenced using metagenomics, and abundant virome information was acquired.
Results: A total of 82 families of mammalian viruses, plant viruses, insect viruses, and phages were detected. The top 10 most abundant families included the RNA viruses Orthomyxoviridae, Picornaviridae, Bunyaviridae, and Arenaviridae, the DNA virus Herpesviridae, the insect viruses Nodaviridae and Baculoviridae, the plant virus Tombusviridae, and the phage Myoviridae. Except for Myoviridae, whose abundance in guts was higher than in lung tissues, the abundance of viruses in the lung tissues and guts showed no significant difference. The data obtained in this study provide an overview of the viral community present in these mouse samples, revealing some mouse-associated viruses closely related to known human or animal pathogens. Strengthening our understanding of unclassified viruses in mice in the natural environment could provide scientific guidance for the prevention and control of new viral outbreaks that can spread via rodents.


Introduction
Mice are hosts for a variety of pathogens and can spread 60 kinds of human diseases, including those caused by more than ten families of viruses such as Poxviridae and Herpesviridae.
For instance, rodents are the natural host of Hantavirus, which commonly causes hemorrhagic fever with renal syndrome (HFRS) in Asia and Europe and hantavirus pulmonary syndrome (HPS) in North and South America [1]. Rodents are also the natural host of Arenavirus, which causes Lassa fever, a condition with high mortality in humans [2]. Arenavirus belongs to the family Arenaviridae, whose representative virus is lymphocytic choriomeningitis virus, which is distributed globally [3]. The main reason for this worldwide distribution is that Mus musculus is the primary host of this virus. Mice therefore carry many disease-causing viruses and are a cause of increasing concern. Because mice carrying these viruses do not show any clinical symptoms, the potential threat of naturally occurring viruses to humans and animals is easily overlooked. Therefore, strengthening etiological research in wildlife to understand the presence and epidemic status of viruses in nature is important for the prevention and control of new viral epidemics and outbreaks.
Traditional virology research methods are limited to tissue pathology and virus culture, which makes it difficult to study viruses that cannot be cultured. Metagenomic sequencing technologies make it possible to discover new viruses at the genome level. Within a few years, metagenomics research has reached all kinds of environments where viruses may occur, including the ocean, soil, hot springs, the human oral cavity, and the gastrointestinal tract [4][5][6][7][8][9]. Surprisingly, in nearshore marine environments, 65% of the detected virus sequences were previously unknown, and genotype data revealed a total of 5000 viral species [7]. When horse fecal samples were sequenced using high-throughput sequencing technologies, 68% of the virus sequences identified were previously unknown, and the genotype data identified up to 1000 different viral species [8]. Finally, human fecal samples contained as many as 1200 unique viral genotypes identified through metagenomic sequencing, and rare and new intestinal viruses were found [10][11][12].
Zhang et al. used BLAST to search 36,769 RNA virus sequences from samples of the healthy human intestinal tract and found that most sequences were similar to plant RNA viruses [13]. Day et al. used metagenomics to analyze the virome of turkeys suffering from enterovirus syndrome and found many new unidentified viral species [14]. Bats are the natural host of many zoonotic viruses; Li and Donaldson [15,16] collected samples from the intestines of North American bat species and analyzed them using metagenomics. They found that the bat intestines contained a rich pool of viruses, including not only viruses that can infect animals but also many new plant and insect viruses. In addition, the study by Donaldson identified three new strains of genetic type I coronaviruses. He also used viral metagenomics to analyze the viral community of fecal samples of bats from different areas in China and showed that bacteria and viruses together accounted for 60% of the species in the feces, with insect viruses accounting for 35% and vertebrate, plant, and protozoan viruses accounting for about 5% of all viruses.
In this study, gut and lung tissues were collected from three Clethrionomys rufocanus individuals, which are wild representatives of mice, to assess the variety of viruses they carry. Metagenomic analysis was then conducted to screen the viromes of these samples. Herein, we outline the viral spectrum within these mouse samples. These data offer new clues for tracing the sources of important viral pathogens that can cause human and animal disease.

[...] and the mixtures were incubated at 37 °C for 60 min to prepare synthetic double-stranded cDNA, and then at 75 °C for 10 min. Then, 2 U of shrimp alkaline phosphatase and 2.5 U of Exonuclease I were added to the system, followed by incubation at 37 °C for 60 min to remove redundant primers and free nucleotides. The mixture was then incubated at 72 °C for 15 min. Next, 10 µL of the template, 2 µL of 10 µmol/L anchor sequence primers, 5 µL of 10× AccuPrime buffer, and 1 µL of AccuPrime Taq DNA Polymerase were added, and ddH2O was used to bring the total volume to 50 µL for sequence-independent single primer amplification (SISPA). The thermal cycler profile was as follows: 72 °C for [...]

Based on the numbers of reads and the abundance information in each sample at each classification level (phylum, class, order, family, and genus), statistical analysis and visualization were performed.
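The SISPA reaction mix described above (10 µL template, 2 µL of 10 µmol/L primer, 5 µL of 10× buffer, 1 µL of polymerase, water to 50 µL) implies a water volume and final reagent concentrations that the text does not state explicitly. The following Python sketch works through that arithmetic; the variable names are illustrative, and only the volumes and stock concentrations are taken from the protocol.

```python
# Worked arithmetic for the 50 µL SISPA reaction described in the text.
# Variable names are illustrative; only the volumes and stock
# concentrations come from the protocol.

TOTAL_VOLUME_UL = 50.0

component_volumes_ul = {
    "template": 10.0,
    "anchor primer (10 µmol/L stock)": 2.0,
    "10x AccuPrime buffer": 5.0,
    "AccuPrime Taq DNA polymerase": 1.0,
}

# Water makes up whatever volume the listed components leave free.
water_ul = TOTAL_VOLUME_UL - sum(component_volumes_ul.values())

# Dilution formula: C_final = C_stock * V_stock / V_total.
primer_final_umol_per_l = 10.0 * 2.0 / TOTAL_VOLUME_UL
buffer_final_fold = 10.0 * 5.0 / TOTAL_VOLUME_UL

print(f"ddH2O to add: {water_ul:.0f} µL")
print(f"Final primer concentration: {primer_final_umol_per_l:.1f} µmol/L")
print(f"Final buffer strength: {buffer_final_fold:.0f}x")
```

Under these volumes, 32 µL of ddH2O completes the reaction, the primer ends up at 0.4 µmol/L, and the buffer at 1× working strength.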

Statistical analysis
Metastats was used to compare the 10 most abundant taxonomic sequence tags of the three samples. Differences were considered statistically significant when the p-value was less than 0.05.
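To illustrate the decision rule (p < 0.05) in isolation, the sketch below applies an exact two-sided permutation test to the difference in mean abundance between two groups. This is a simplified stand-in chosen for illustration, not the procedure implemented by the differential-abundance tool used in the study. The function name, the abundance values, and the group sizes are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


ALPHA = 0.05  # significance threshold stated in the text

# Hypothetical relative abundances (percent) of one taxon in two groups.
gut = [10, 11, 12, 13]
lung = [1, 2, 3, 4]

p = permutation_p_value(gut, lung)
print(f"p = {p:.4f}, significant: {p < ALPHA}")
```

With these made-up values, only 2 of the 70 possible relabellings are as extreme as the observed split, so p = 2/70 and the difference would be called significant at the 0.05 level.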

Sequencing and quality control
Sequencing of DNA and RNA extracted from lung tissues and gut samples of three wild Myodes rufocanus individuals was performed. The insert size was 350 bp. Overlapping bases, low-quality bases, and undetermined bases were excluded, and the numbers of clean sequence tags obtained from the six samples (three lung and three gut samples) were 238493, 209033, 199432, 239177, 200730, and 214870, respectively. The sequencing data quality reached the Q20 quality score, ensuring that the subsequent analyses could proceed normally. The clean sequence tags were subjected to redundancy processing using the mothur software to obtain unique sequence tags.
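The quantities in this paragraph can be made concrete with a short Python sketch. It totals the six clean-data counts reported above, converts a Phred score such as Q20 into its base-call error probability using the standard definition Q = -10 log10(P), and shows the idea behind collapsing identical tags into unique ones. The function names and the toy sequences are hypothetical, and the deduplication shown is a plain illustration rather than the mothur implementation.

```python
# Clean-data counts for the six samples, in the order given in the text.
clean_tag_counts = [238493, 209033, 199432, 239177, 200730, 214870]

total_tags = sum(clean_tag_counts)
mean_tags = total_tags / len(clean_tag_counts)


def phred_to_error_probability(q):
    """Convert a Phred quality score Q to a base-call error probability."""
    return 10 ** (-q / 10)


def unique_tags(sequences):
    """Collapse identical sequences, keeping first-seen order."""
    return list(dict.fromkeys(sequences))


print(f"Total clean tags: {total_tags}, mean per sample: {mean_tags:.0f}")
print(f"Error probability at Q20: {phred_to_error_probability(20)}")
# Toy sequences, for illustration only.
print(unique_tags(["ACGT", "ACGT", "TTGA", "ACGT", "GGCC"]))
```

The six counts sum to 1,301,735 tags, and a Q20 base has a 1 in 100 chance of being called incorrectly.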

Distribution of the samples based on family-level classification
The pretreated clean data of all samples were compared with viral reference genomes in the NCBI database, NT database, and ACLAME database to obtain annotations at each level (from kingdom to species).

In this study, we conducted a viral metagenomic analysis of fecal and lung tissue samples from mice using the Solexa sequencing technique (Illumina). The data analysis indicated that the most abundant sequences were related to mammalian viruses, insect viruses, plant viruses, and phages.
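Family-level profiles of this kind rest on converting per-read taxonomic assignments into relative abundances and grouping the less abundant taxa together. The sketch below shows one minimal way to do that in Python, assuming each read has already been assigned to a family. The function names and the read assignments are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Fraction of reads assigned to each taxon."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def top_n_with_others(abundance, n):
    """Keep the n most abundant taxa and pool the remainder as 'Others'."""
    ranked = sorted(abundance.items(), key=lambda kv: kv[1], reverse=True)
    top = dict(ranked[:n])
    rest = sum(value for _, value in ranked[n:])
    if rest > 0:
        top["Others"] = rest
    return top


# Hypothetical family-level assignments for the reads of one sample.
reads = (["Orthomyxoviridae"] * 5 + ["Picornaviridae"] * 3
         + ["Myoviridae"] * 1 + ["Nodaviridae"] * 1)

profile = top_n_with_others(relative_abundance(reads), n=2)
print(profile)
```

Pooling the tail into a single "Others" entry keeps the fractions summing to one, which is what a stacked relative-abundance plot needs.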
This report suggests that mice harbor a large spectrum of mammalian viruses, especially influenza A virus from the family Orthomyxoviridae, in both feces and lung tissues.
Additionally, there was no significant difference between the two tissues in terms of viral species, implying that humans in close contact with rodents may be infected with influenza virus or other viruses. Mice are the natural reservoir of Hantavirus and Arenavirus, which can cause serious infectious diseases in humans and animals.
In lung tissues and gut samples from wild mice, insect viruses (Nodaviridae, Baculoviridae) and plant viruses (Tombusviridae) were found. The presence of these viruses may be related to the living environment and diet of the mice. In addition, phages (Myoviridae) were also detected, and the abundance of Myoviridae in the lungs was significantly higher than in the feces (p = 0.037). Notably, this virus was not detected in the lung1 and lung2 samples, which may reflect differences in the viral species present in the lungs and guts. The presence of a group of unclassified viruses implies that many of the viruses in rodents are unknown and require further exploration.

Conclusion
The metagenomics approach can greatly improve our understanding of the diversity of viruses in mice. Using metagenomics technology, this study analyzed the composition and abundance of viral genomes in lung tissues and guts from three wild mice. This strategy could be extended to other wildlife or livestock samples worldwide, ultimately increasing our knowledge of the viral population and ecological community, and thus minimizing the impact of potential wildlife-associated viruses on public health by providing meaningful basic data.

Availability of data and materials
The datasets analyzed during the current study are available from the corresponding author on reasonable request.

Ethics approval and consent to participate
The study was approved by Harbin Veterinary Research Institute and performed in accordance with animal ethics guidelines and approved protocols. The animal Ethics Committee approval number is Heilongjiang SYXK-2006-032.

Consent for publication
Not applicable

(a-f) Overview of the classification of the identified mouse viruses from each sample in this study, from kingdom to species. "Others" indicates the sum of the relative abundances of all the other levels (from kingdom to species) and is labeled in a pink box. "a-f" refer to gut1, gut2, gut3, lung1, lung2, and lung3, respectively.