SARS-CoV-2 and Influenza A/B virus infection trends during the epidemic
By Feb 1, 2020, 649 fever patients in Hunan were suspected of SARS-CoV-2 infection. The patients were aged 2 to 65 years, and the median age was 46 years (IQR 29.25-45.5). During the COVID-19 epidemic, all fever patients admitted to the hospital were tested for SARS-CoV-2 and Flu A/B nucleic acid, using kits from bioPerfectus and Sansure Biotech, respectively. Among fever patients, 3.96% (29/733) were infected with Influenza A virus and 5.46% (40/733) with Influenza B virus. No patient with SARS-CoV-2 was co-infected with Influenza A or B virus. Newly diagnosed SARS-CoV-2 patients accounted for only 2.05% of all fever patients (Tab. 1). The number of confirmed COVID-19 cases on these six days (Jan 23, Jan 26, Jan 28, Jan 29, Jan 31, and Feb 1) was relatively stable. The number of new fever patients infected with Influenza A/B virus fluctuated greatly and was significantly higher than the number of COVID-19 patients (P<0.01) (Fig. 2).
Changes in Influenza A/B virus before, during and after the COVID-19 epidemic
Before the epidemic, most patients with fever were tested only for Influenza A/B virus antigen. If a patient had a negative antigen test but was highly suspected of Influenza A/B virus infection, or if an antigen-positive patient sought further diagnosis, the doctor recommended a nucleic acid test. Therefore, the Influenza A/B virus results before the epidemic came from the colloidal gold antigen method. During the epidemic, however, to ensure high detection efficiency and timely treatment of COVID-19 patients, all fever patients in this teaching hospital were tested for Influenza A/B virus and SARS-CoV-2 nucleic acid. The positive rate in each period is shown in Tab. 2. The detection rate of Influenza A/B virus in fever patients from 2019.8.1 to 8.16 was 8.62% (151/1751) by colloidal gold immunochromatography. The nucleic acid positive rate was 7.48% (8/107), similar to that of the immunoassay. During the epidemic (2020.1-2020.2), the infection rate of Influenza A/B virus increased to 12.55% (92/733), showing the same trend as COVID-19. After the epidemic, however, the detection rate of COVID-19 among fever patients in this teaching hospital dropped to zero and the infection rate of Influenza A/B virus became extremely low (0.00%) (Fig. 3).
Infections of pathogens in fever patients
We tested patients for ten respiratory pathogens by nucleic acid detection. The results are as follows (Tab. 3). The most common pathogen at onset of illness was Respiratory syncytial virus (RSV) [16 (2.47%) of 649 patients]. The other pathogens were Influenza A (FluA) [9 (1.39%) of 649 patients], Influenza B (FluB) [14 (2.16%) of 649 patients], Mycoplasma pneumoniae (MP) [11 (1.69%) of 649 patients], Adenovirus (ADV) [10 (1.54%) of 649 patients], Parainfluenza virus I (PIV1) [0 (0.00%) of 649 patients], Parainfluenza virus II (PIV2) [1 (0.15%) of 649 patients], Parainfluenza virus III (PIV3) [0 (0.00%) of 649 patients], Human coronavirus 229E (HCoV-229E) [5 (0.77%) of 649 patients], Human metapneumovirus (HMPV) [13 (2.00%) of 649 patients], Human coronavirus OC43 (HCoV-OC43) [1 (0.15%) of 649 patients], and Human bocavirus (HBoV) [0 (0.00%) of 649 patients]. The proportion of patients with SARS-CoV-2 among fever patients was 2.05% (15/733). The detection rate of other respiratory pathogens among fever patients was higher than that of SARS-CoV-2 (P<0.01) (Fig. 4).
Basic statistics of bioinformatics analysis data
The samples were first screened rapidly for SARS-CoV-2. The mean genome coverage and sequencing depth of SARS-CoV-2 in positive samples were 98.03% (44.80%-99.87%) and 12.9X (2.8X-167.8X), respectively. The statistics are as follows (Tab. 4):
Analysis of variation and intra-host variation
SARS-CoV-2 mutations were analyzed with samtools mpileup and in-house scripts. All mutation sites with a frequency above 90% and a site depth >=10X were extracted. A total of 18 loci were detected in 9 samples. Samples 209001889R and 209001909R shared 2 identical mutations (positions 27925 and 28377), and samples 209001897R and 209001903R shared the C28854T mutation (Tab. 5).
All sites with a mutation frequency above 5% and a mutant genotype depth >5X were then extracted, and sites with a mutation frequency above 95% were removed. A total of 42 intra-host mutations were obtained. They were sporadic: no mutation was shared by any two samples. For details, see Supplementary Table 1.
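The two filters described above can be sketched in a few lines of Python. This is a minimal illustration, not the in-house script: the `Site` record, its field names, and the example values are hypothetical, and the per-site counts are assumed to have already been parsed from the samtools mpileup output.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical per-site summary derived from a pileup."""
    sample: str
    pos: int
    ref: str
    alt: str
    depth: int      # total reads covering the site
    alt_depth: int  # reads supporting the mutant allele

    @property
    def freq(self) -> float:
        # Mutant allele frequency; 0.0 when the site is uncovered.
        return self.alt_depth / self.depth if self.depth else 0.0


def consensus_mutations(sites):
    """Sites with frequency above 90% and site depth >= 10X."""
    return [s for s in sites if s.freq > 0.90 and s.depth >= 10]


def intra_host_mutations(sites):
    """Sites with frequency above 5%, mutant depth > 5X, and frequency not above 95%."""
    return [s for s in sites if 0.05 < s.freq <= 0.95 and s.alt_depth > 5]


# Made-up example values, for illustration only.
example_sites = [
    Site("sample_A", 100, "C", "T", depth=40, alt_depth=39),   # near-fixed, well covered
    Site("sample_A", 200, "A", "G", depth=100, alt_depth=12),  # minor allele, well supported
    Site("sample_A", 300, "G", "A", depth=8, alt_depth=8),     # fixed but too shallow
    Site("sample_A", 400, "T", "C", depth=50, alt_depth=3),    # minor allele, too few reads
]

consensus_positions = [s.pos for s in consensus_mutations(example_sites)]
intra_host_positions = [s.pos for s in intra_host_mutations(example_sites)]
```

As the thresholds are written, a site with a frequency between 90% and 95% passes both filters. Whether the original scripts treated that overlap differently is not stated.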
Genotypes at positions 8782 and 28144 were extracted from all samples. The L/S type could be determined in 5 samples: 4 were S type and 1 was L type (Tab. 6).
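A minimal sketch of the typing step follows. It assumes the commonly used definition of the two lineages, which the text does not spell out: L type carries C at 8782 and T at 28144, and S type carries T at 8782 and C at 28144. The function name `ls_type` is hypothetical. Samples with a missing or discordant genotype are left untyped, which may be why only some samples could be classified.

```python
def ls_type(base_8782, base_28144):
    """Return "L", "S", or None from the bases at positions 8782 and 28144.

    Assumed definition: L = (C, T), S = (T, C). Any other combination,
    including missing calls (None or "N"), is reported as untyped.
    """
    if not base_8782 or not base_28144:
        return None
    pair = (base_8782.upper(), base_28144.upper())
    if pair == ("C", "T"):
        return "L"
    if pair == ("T", "C"):
        return "S"
    return None


# Made-up genotype calls, for illustration only.
example_calls = {
    "sample_A": ("T", "C"),
    "sample_B": ("C", "T"),
    "sample_C": ("N", "C"),   # no reliable call at 8782
    "sample_D": (None, "T"),  # position not covered
}
example_types = {name: ls_type(*bases) for name, bases in example_calls.items()}
```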
SARS-CoV-2 and human genome integration analysis
For sample 209001902R, which had the highest viral depth, reads that aligned completely neither to the human genome nor to SARS-CoV-2 were re-analyzed, and no reads that reliably supported fusion were found (Supplementary Table 2).
Analysis of mixed infection with other pathogens
Four (25%) positive samples were co-infected with Human adenovirus. Sample 209001894R had Human respiratory syncytial virus B infection (Supplementary Table 3).
Metatranscriptome analysis
Reads from the 8 throat swabs and 6 stool samples were assembled with SPAdes and de-duplicated with cd-hit to obtain the gene set. Assembly quality was evaluated with QUAST. All samples were aligned to the reference gene set (throat swabs and stool separately), and expression levels were calculated. Differentially expressed genes (DEGs) between the negative and positive groups were identified with edgeR. The results are shown in Supplementary Table 4. Volcano plots (ggplot2) of the differentially expressed transcripts between negative and positive samples, for swabs and stool, are shown in Figure 5.
MetaPhlAn was used to assign the mRNA sequences to their species of origin. The Shannon, Simpson, and inverse Simpson indices were calculated (R package vegan) at the species and genus levels and compared between groups with the t-test and Wilcoxon test (python3 scipy). UniFrac distances were calculated at the species level (R package rbiom), and PCoA plots were drawn (ape, ggplot2). See the "Diversity" sheet of Supplementary Table 4. Fig. 6 is a PCoA plot based on UniFrac distance. HUMAnN3 was used for functional annotation of this part of the metatranscriptome data, and the differences between the positive and negative groups were calculated. Pathways with Wilcoxon test p-value<0.05, t-test p-value<0.05 and Equal_Vari_P>0.05 (Levene test for homogeneity of variance) were selected (Tab. 7 and Supplementary Table 5).
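The three-part pathway screen can be illustrated with scipy. This is a sketch under assumptions rather than the script actually used. The text does not say which scipy functions were called, so the Wilcoxon test is taken here to be the rank-sum (Mann-Whitney U) test for two independent groups, and the function name `screen_pathway` and the abundance values are hypothetical.

```python
from scipy import stats


def screen_pathway(pos, neg, alpha=0.05):
    """Apply the three criteria to one pathway's abundances in two groups.

    Assumption: "Wilcoxon test" means the two-sided rank-sum test,
    implemented here with scipy.stats.mannwhitneyu.
    """
    wilcox_p = stats.mannwhitneyu(pos, neg, alternative="two-sided").pvalue
    t_p = stats.ttest_ind(pos, neg).pvalue  # Student t-test, equal variances assumed
    equal_vari_p = stats.levene(pos, neg).pvalue  # homogeneity of variance
    keep = bool(wilcox_p < alpha and t_p < alpha and equal_vari_p > alpha)
    return {
        "wilcox_p": float(wilcox_p),
        "t_p": float(t_p),
        "Equal_Vari_P": float(equal_vari_p),
        "keep": keep,
    }


# Made-up abundances, for illustration only.
shifted = screen_pathway([10, 11, 12, 13, 14, 15], [1, 2, 3, 4, 5, 6])
unchanged = screen_pathway([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6])
```

Requiring Equal_Vari_P>0.05 keeps only pathways for which the equal-variance assumption of the t-test is not rejected, which is presumably why the Levene test is part of the screen.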
For the rRNA part, Kraken2 was used for taxonomic classification, and the Shannon, Simpson, and inverse Simpson indices were calculated. There was no significant difference at the species or genus level (Wilcoxon test, p=0.1824 and p=0.1654, respectively).
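For reference, the three diversity indices used in both the mRNA and rRNA analyses can be written out as below. The sketch follows the definitions documented for the `diversity` function in vegan (Shannon with the natural logarithm, Simpson as 1 - sum(p^2), inverse Simpson as 1 / sum(p^2)), reimplemented in Python for illustration only. The function name and the example counts are hypothetical.

```python
import numpy as np


def diversity_indices(counts):
    """Shannon, Simpson, and inverse Simpson indices from taxon counts."""
    counts = np.asarray(counts, dtype=float)
    counts = counts[counts > 0]          # absent taxa contribute nothing
    p = counts / counts.sum()            # relative abundances
    sum_p2 = np.sum(p ** 2)
    return {
        "shannon": float(-np.sum(p * np.log(p))),
        "simpson": float(1.0 - sum_p2),
        "invsimpson": float(1.0 / sum_p2),
    }


# Made-up community with four equally abundant taxa and one absent taxon.
even_community = diversity_indices([25, 25, 25, 25, 0])
```

For a perfectly even community of four taxa, the Shannon index equals ln 4, the Simpson index 0.75, and the inverse Simpson index 4, which gives a quick sanity check on the implementation.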