Recombinant SARS-CoV-2 Delta/Omicron BA.5 emerging in an immunocompromised long-term infected COVID-19 patient

doi:10.21203/rs.3.rs-3787764/v1

Research Article

https://doi.org/10.21203/rs.3.rs-3787764/v1

This work is licensed under a CC BY 4.0 License

Version 1

Background

The emergence of the SARS-CoV-2 virus led to a global pandemic, prompting extensive research efforts to understand its molecular biology, transmission dynamics, and pathogenesis. Recombination events have been increasingly recognized as a significant contributor to the virus's diversity and evolution, potentially leading to the emergence of novel strains with altered biological properties. Indeed, recombinant lineages such as the XBB variant and its descendants have subsequently dominated globally. Therefore, continued surveillance and monitoring of viral genome diversity is crucial to identify and understand the emergence and spread of novel strains.

Methods

The case was discovered through routine genomic surveillance of SARS-CoV-2 cases in Norway. Samples were whole-genome sequenced on the Illumina NovaSeq platform, and SARS-CoV-2 lineage assignment was performed using Pangolin and Nextclade. Mutations were classified as lineage-specific based on their frequency in the AY.98.1 and BA.5 lineages.

Results

In this study, we report and investigate a SARS-CoV-2 recombination event in a long-term infected immunocompromised COVID-19 patient. Several recombination events between two distinct lineages of the virus, namely AY.98.1 and BA.5, were identified, resulting in a single novel recombinant viral strain with a unique genetic signature.

Conclusions

The presence of several concomitant recombinants in the patient suggests that these events occur frequently in vivo and can provide insight into the fitness associated with the different combinations of mutations. This study underscores the importance of continued tracking of viral diversity and the potential impact of recombination events on the evolution of the SARS-CoV-2 virus.

Trial registration

Retrospectively registered

SARS-CoV-2

recombinant

immunocompromised

in-patient recombination event

Delta

Omicron

The emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a global pandemic that affected millions of people worldwide, with significant impacts on public health (World Health Organization, 2023), the economy (World Development Report, 2022), and social welfare (OECD, 2021). The high rate of transmission (Meyerowitz et al., 2021) and the ability of the virus to cause severe respiratory illness (Zhou et al., 2020) have prompted extensive research efforts to understand its biology and evolution. One crucial aspect of this research is investigating the genetic variability of the virus and the potential for recombination events that can lead to the emergence of new viral strains with altered virulence and transmission characteristics.

Recombination occurs when two or more different viral strains infect the same host cell, allowing for the exchange of genetic material between the viruses. In coronaviruses, these events are primarily driven by the RNA-dependent RNA polymerase, which can switch between viral templates during genome replication (Bentley & Evans, 2018). This can result in the formation of chimeric viruses that contain genetic material from two or more viral strains (Focosi & Maggi, 2022). The emergence of recombinant viruses has been increasingly recognized as a significant contributor to SARS-CoV-2 diversity and evolution.

While recombinant SARS-CoV-2 viruses were observed during the first years of the COVID-19 pandemic, these variants did not circulate widely in the population (Burel et al., 2022; Sekizuka et al., 2022). However, as the number of infections rose with the spread of the Omicron variant, there was also an increase in observed recombinant strains, including the emergence of the XBB lineage, a recombination between the two BA.2.75 subvariants, BJ.1.1 and BM.1.1.1 (Parums, 2023), which initially resulted in extensive transmission in Singapore, India, and elsewhere in the fall of 2022 (World Health Organization, 2022). By the spring of 2023, subvariants of XBB had become dominant globally, demonstrating how recombination events can contribute to viral fitness and transmissibility.

Chronic SARS-CoV-2 infections in immunocompromised individuals are known to accelerate viral mutagenesis, and significant mutations within the spike protein have been observed in these patients (Harari et al., 2022; Li et al., 2022). Moreover, the prolonged persistence of the infection in these patients provides a favourable time window for recombination to occur if the patient is exposed to other variants. Indeed, several recombinants have been identified in immunocompromised patients (Burel et al., 2022; Zannoli et al., 2023).

In this study, we report the identification of a recombinant SARS-CoV-2 virus in a long-term infected COVID-19 patient. During our surveillance of SARS-CoV-2, we identified the emergence of a recombinant strain between two distinct lineages, AY.98.1 and BA.5, resulting in a novel viral strain. We gathered and characterized additional samples from the same patient before and after the recombination event. Deep sequencing of all the samples suggests that recombination events occur frequently in vivo, providing further evidence of the need for continued surveillance and monitoring of viral diversity in immunocompromised patients. Our findings underscore the importance of understanding the molecular mechanisms of recombination and the potential impacts of recombination events on the evolution and emergence of novel strains of SARS-CoV-2.

Sample Extraction and Sequencing.

All the samples were extracted and processed using the Swift Amplicon SARS-CoV-2 Panel (Swift Biosciences). The samples were sequenced on an Illumina NovaSeq platform at the Norwegian Sequencing Centre (NSC) NorSeq.

Generation of SARS-CoV-2 consensus sequences.

SARS-CoV-2 consensus sequences were generated using the “Covid-seq” pipeline developed by the NSC (https://github.com/nsc-norway/covid-seq). Briefly, PCR primers used during library preparation were removed using NSCtrim (https://github.com/nsc-norway/NSCtrim). Then, sequencing adapters, poorly called nucleotides and overall low-quality reads were removed using fastp (Chen et al., 2018).

Next, the high-quality trimmed reads were mapped to the Wuhan-Hu-1 reference genome (NC_045512.2) using Bowtie2 (Langmead & Salzberg, 2012). Consensus sequences were generated from the resulting mapping files using samtools mpileup (Danecek et al., 2021) and iVar (Grubaugh et al., 2019) with a minimum depth threshold of 10 for calling a nucleotide.
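The depth rule in the consensus step can be illustrated with a short sketch. This is not the iVar implementation: it assumes that per-position base counts have already been extracted from the mapping file, and the function name `call_consensus` and the use of `N` for positions below the threshold are choices made for this example only.

```python
MIN_DEPTH = 10  # minimum depth for calling a nucleotide, as stated in the text


def call_consensus(base_counts, min_depth=MIN_DEPTH):
    """Return a consensus string from per-position base counts.

    base_counts: list of dicts such as {"A": 12, "C": 0, "G": 1, "T": 0}.
    Positions covered by fewer than min_depth reads are written as "N"
    (the masking symbol is an assumption of this sketch).
    """
    consensus = []
    for counts in base_counts:
        depth = sum(counts.values())
        if depth < min_depth:
            consensus.append("N")
        else:
            # Most frequent base at this position.
            consensus.append(max(counts, key=counts.get))
    return "".join(consensus)


example = [
    {"A": 50, "C": 1, "G": 0, "T": 0},   # well covered, clear call
    {"A": 2, "C": 3, "G": 0, "T": 0},    # depth 5, below the threshold
    {"A": 0, "C": 0, "G": 30, "T": 10},  # mixed position, G is most frequent
]
print(call_consensus(example))
```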

Noise calculation.

We define the noise at a genome position as the sum of the frequencies of all the nucleotides other than the most frequent one (i.e., the one called in the consensus sequence). To calculate the noise of the samples we developed a tool called NoisExtractor (https://github.com/garcia-nacho/NoisExtractor). NoisExtractor takes indexed BAM files as input and, for each position of the genome, outputs the noise, the depth, the nucleotides with the highest and second highest frequency, and their respective frequencies.
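As a minimal sketch of this definition (not the NoisExtractor code itself), the per-position values can be computed from base counts as follows. The input format (a dictionary of counts for the four nucleotides) and the function name `position_noise` are assumptions made for illustration.

```python
def position_noise(counts):
    """Compute noise and related values for one genome position.

    counts: dict of read counts for at least two bases,
            e.g. {"A": 80, "C": 15, "G": 5, "T": 0} (assumed input format).
    Returns None for positions without coverage.
    """
    depth = sum(counts.values())
    if depth == 0:
        return None
    # Rank bases from most to least frequent.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    (major, major_count), (minor, minor_count) = ranked[0], ranked[1]
    return {
        # Sum of the frequencies of every base except the most frequent one.
        "noise": (depth - major_count) / depth,
        "depth": depth,
        "major": major,
        "major_freq": major_count / depth,
        "minor": minor,
        "minor_freq": minor_count / depth,
    }


result = position_noise({"A": 80, "C": 15, "G": 5, "T": 0})
print(result["noise"], result["major"], result["minor"])
```

With 80 of 100 reads supporting A, the noise at this position is 0.2, the major nucleotide is A and the minor nucleotide is C.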

Identification of coinfections/contaminations.

As a part of the sequencing routines at NIPH, quality control is performed for each sample. In this analysis, low-quality samples and individual samples containing more than one virus are flagged, as the latter could indicate a contaminated sample or a coinfection at the patient level. To perform this analysis, we developed a machine learning model. The model is based on linear regression in which noise-related parameters (e.g., mean and standard deviation of noise across the genome, binned number of positions with noise, etc.) and depth- and coverage-related parameters (e.g., binned number of missing positions, average depth, etc.) were used to classify a sample as low-quality, high-quality or contaminant. To train the classification model, we used a subset of 1846 manually curated samples that were assigned to four different classes: high-quality-high-contamination, high-quality-low-contamination, high-quality-no-contamination and low-quality. The code to perform the quality control and the trained model are available here: https://github.com/folkehelseinstituttet/FHI_SC2_Pipeline_Illumina.
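The kind of per-sample summary that such a model could take as input can be sketched as follows. This is only an illustration of the parameter families named above, not the feature set of the published pipeline: the bin edges, the feature names and the function `noise_features` are all hypothetical.

```python
from statistics import mean, pstdev


def noise_features(noise, depth, bin_edges=(0.05, 0.1, 0.2, 0.5)):
    """Summarise per-position noise and depth into sample-level features.

    noise, depth: equal-length lists with one value per genome position.
    bin_edges: hypothetical lower edges of the noise bins.
    """
    # Only positions with coverage carry a meaningful noise value.
    covered = [n for n, d in zip(noise, depth) if d > 0]
    features = {
        "mean_noise": mean(covered) if covered else 0.0,
        "sd_noise": pstdev(covered) if covered else 0.0,
        "missing_positions": sum(1 for d in depth if d == 0),
        "mean_depth": mean(depth) if depth else 0.0,
    }
    # Count positions whose noise falls in each half-open bin [low, high).
    edges = list(bin_edges) + [float("inf")]
    for low, high in zip(edges[:-1], edges[1:]):
        features[f"noise_{low}_to_{high}"] = sum(
            1 for n in covered if low <= n < high
        )
    return features


features = noise_features(
    noise=[0.0, 0.12, 0.3, 0.0],
    depth=[100, 50, 40, 0],
)
print(features)
```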

Extraction of sequences for the major and minor variants.

Once a possible contamination or coinfection was identified, the sequence of the major variant (the most abundant variant) was generated by concatenating the nucleotide with the highest frequency at each position of the genome. To generate the sequence of the minor variant (the second most abundant variant), we started from the major sequence and, at every position where the noise was higher than 0.1, replaced the nucleotide with the highest frequency (major) with the nucleotide with the second highest frequency (minor). We implemented the extraction of sequences by parsing the output of NoisExtractor in R.
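The extraction rule can be sketched in a few lines. The original implementation is in R; the Python version below is an illustrative equivalent that assumes each position is described by its major nucleotide, minor nucleotide and noise, and the function name `major_minor_sequences` is hypothetical.

```python
NOISE_THRESHOLD = 0.1  # noise cut-off stated in the text


def major_minor_sequences(positions, threshold=NOISE_THRESHOLD):
    """Build the major and minor sequences from per-position records.

    positions: list of dicts with keys "major", "minor" and "noise"
               (assumed layout of the parsed per-position table).
    """
    major = []
    minor = []
    for record in positions:
        major.append(record["major"])
        # The minor sequence differs from the major one only where the
        # noise exceeds the threshold.
        if record["noise"] > threshold:
            minor.append(record["minor"])
        else:
            minor.append(record["major"])
    return "".join(major), "".join(minor)


positions = [
    {"major": "A", "minor": "G", "noise": 0.02},
    {"major": "C", "minor": "T", "noise": 0.35},
    {"major": "G", "minor": "A", "noise": 0.10},
]
print(major_minor_sequences(positions))
```

In this toy input only the second position exceeds the threshold, so the two sequences differ at that position alone.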

Identification of recombinant sequences.

To identify recombinant sequences, we developed PrecFinder (https://github.com/garcia-nacho/Precfinder). For each single mutation in a sequence, PrecFinder calculates the Bayes’ probability of the virus belonging to a particular Pangolin lineage based on the distribution of mutations in different lineages. Because the frequencies of the different mutations in the virus continue to change as it evolves, the probability is calculated from a database of sequences that is regularly updated.
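The per-mutation calculation can be illustrated with Bayes' rule. The sketch below is not taken from PrecFinder: it assumes a uniform prior over lineages unless one is supplied, and the mutation frequencies in the example are invented for illustration.

```python
def lineage_posterior(mutation_freq, prior=None):
    """Posterior probability of each lineage given one observed mutation.

    mutation_freq: dict mapping lineage -> fraction of database sequences
                   of that lineage carrying the mutation, P(mutation | lineage).
    prior: dict mapping lineage -> P(lineage); uniform if omitted
           (an assumption of this sketch).
    """
    lineages = list(mutation_freq)
    if prior is None:
        prior = {lineage: 1.0 / len(lineages) for lineage in lineages}
    joint = {lineage: mutation_freq[lineage] * prior[lineage] for lineage in lineages}
    total = sum(joint.values())
    if total == 0:
        # The mutation is not observed in any lineage of the database.
        return {lineage: 0.0 for lineage in lineages}
    return {lineage: joint[lineage] / total for lineage in lineages}


# Hypothetical frequencies of one mutation in two lineages.
posterior = lineage_posterior({"AY.98.1": 0.98, "BA.5": 0.02})
print(posterior)
```

Computing this for every mutation of a sequence and every lineage in the database gives a lineage-by-mutation matrix of the kind used as model input.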

To find which sequences are recombinants, PrecFinder uses a 1D-convolutional neural network model. The model consists of three blocks, each made of a 1D-convolutional layer (1D-CNN) followed by a 1D-MaxPooling layer. The three 1D-CNN layers have 64, 32 and 12 filters and kernel sizes of 5, 3 and 3 nucleotides respectively. The pool size of all the 1D-MaxPooling layers was set to 2. Then, two feed-forward layers with 24 and 12 units respectively were included. Finally, a softmax classification layer outputs the score used to classify the sample. The input of the model is a Bayes’ probability matrix of n by m dimensions, where n is the number of unique lineages present in the database and m is the largest number of mutations found in any single sequence of the database.

To train the model we used binary cross-entropy as the loss function, Adagrad as the optimizer and a batch size of 128. Training was scheduled for 60 epochs but was stopped early if the accuracy did not improve for eight epochs. The weights of the model with the highest accuracy were saved, and F1, precision and recall were also calculated. The training set consisted of the sequences present in the database and a synthetic set of recombinant sequences generated from them: sequences assigned to different lineages were recombined in silico at one, two or three breakpoints randomly selected in the genome. Moreover, we augmented the dataset by reordering the n rows of the training set. The model was implemented and trained using Keras (Chollet et al., 2015) and TensorFlow v2.8 (Abadi et al., 2015) in R.
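The in silico recombination used to build the synthetic training set can be sketched as follows. For simplicity the example operates on two aligned sequences of equal length represented as strings, which is an assumption of this sketch, and the function name `recombine` is hypothetical.

```python
import random


def recombine(parent_a, parent_b, n_breakpoints, rng=random):
    """Create a synthetic recombinant by alternating between two parents.

    parent_a, parent_b: aligned sequences of equal length (assumed input).
    n_breakpoints: number of randomly placed breakpoints (1, 2 or 3 in the text).
    rng: random module or random.Random instance, to allow reproducibility.
    Returns the recombinant sequence and the sorted breakpoints.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    if not 1 <= n_breakpoints < len(parent_a):
        raise ValueError("invalid number of breakpoints")
    # Distinct breakpoints strictly inside the sequence.
    breakpoints = sorted(rng.sample(range(1, len(parent_a)), n_breakpoints))
    parents = (parent_a, parent_b)
    pieces = []
    start = 0
    for index, end in enumerate(breakpoints + [len(parent_a)]):
        # Segments alternate: parent A, parent B, parent A, ...
        pieces.append(parents[index % 2][start:end])
        start = end
    return "".join(pieces), breakpoints


rng = random.Random(0)
child, breakpoints = recombine("A" * 30, "C" * 30, 2, rng)
print(child, breakpoints)
```

Two breakpoints yield an A-B-A mosaic, comparable in structure to an AY.98.1-BA.5-AY.98.1 recombinant.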

Recombinant sequences were also identified using the program sc2rf (https://github.com/lenaschimmel/sc2rf).

Lineage assignments.

SARS-CoV-2 lineage assignment was performed using Pangolin (O’Toole et al., 2021) and Nextclade (Aksamentov et al., 2021) and the mutations at nucleotide and amino acid levels were identified using Nextclade (Aksamentov et al., 2021).

To identify the AY.98.1- and BA.5-specific mutations, 2000 AY.98.1 and BA.5 sequences were downloaded from NCBI GenBank using covSampler (Cheng et al., 2022). Sequences of low quality and/or with a wrong lineage assignment according to Nextclade were removed, and the mutations present in the remaining sequences were extracted using Nextclade (Aksamentov et al., 2021). Based on the frequency of the mutations in the AY.98.1 and BA.5 lineages, the mutations present in our sequences were classified as AY.98.1-specific, BA.5-specific or other, where other means that the mutation is found in neither lineage or in both. All plots to visualize Pangolin lineages were generated in R (R Core Team, 2022) using the library ggplot2 (Wickham, 2016).
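The three-way classification can be sketched as a simple rule on the two lineage frequencies. The frequency cut-off is not stated in the text, so the value of 0.5 used below is purely an assumption, as is the function name `classify_mutation`.

```python
def classify_mutation(freq_ay98_1, freq_ba5, threshold=0.5):
    """Classify a mutation from its frequency in the two reference sets.

    freq_ay98_1, freq_ba5: fraction of reference sequences of each lineage
                           carrying the mutation.
    threshold: hypothetical cut-off for calling a mutation characteristic
               of a lineage (not given in the text).
    """
    in_delta = freq_ay98_1 >= threshold
    in_omicron = freq_ba5 >= threshold
    if in_delta and not in_omicron:
        return "AY.98.1-specific"
    if in_omicron and not in_delta:
        return "BA.5-specific"
    # Found in neither lineage, or found in both.
    return "other"


print(classify_mutation(0.99, 0.00))
print(classify_mutation(0.01, 0.97))
print(classify_mutation(0.90, 0.90))
```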

Cultivation of recombinant virus.

Vero E6/TMPRSS2 cells (NIBSC #100978) were cultivated in complete Dulbecco’s Modified Eagle Medium (cDMEM) supplemented with 10% fetal bovine serum (FBS) and 1 mg/ml G418. In a biosafety level 3 (BSL3) laboratory, clinical samples collected from the patient were added to the cells at approx. 60% confluency in a T-25 flask for 1 h at 37°C. The inoculum was subsequently removed and replaced with fresh viral culture medium (DMEM supplemented with 2% FBS, 100 units/ml penicillin, 100 µg/ml streptomycin and 25 mM HEPES). The infected cells were incubated for 3–4 days at 37°C, and the supernatant was then diluted 1:1000 and passaged onto fresh cells for a second passage. After 3–4 more days the second passage of virus was harvested. Both the first and the second passage of the virus were sequenced.

Fitness estimation.

To estimate the fitness of the different virus strains, we identified the amino acid substitutions that they carried using Nextclade. We then matched those mutations to the fitness effects estimated by Bloom and Neher (2023). The fitness of each variant was computed as the sum of the fitness effects of the individual mutations present in the sample. Mutations absent from the fitness database were not assigned a fitness value and did not contribute to the sum.
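A minimal sketch of this additive estimate is shown below. The fitness values in the example are invented placeholders, not values from Bloom and Neher (2023), and the function name `estimate_fitness` is hypothetical.

```python
def estimate_fitness(mutations, fitness_effects):
    """Sum the fitness effects of the mutations carried by one variant.

    mutations: list of amino acid substitutions, e.g. "S:H49Y".
    fitness_effects: dict mapping substitution -> estimated fitness effect.
    Returns the summed fitness and the list of mutations that were absent
    from the table and therefore did not contribute.
    """
    total = 0.0
    missing = []
    for mutation in mutations:
        if mutation in fitness_effects:
            total += fitness_effects[mutation]
        else:
            missing.append(mutation)
    return total, missing


# Placeholder effects for illustration only.
effects = {"S:H49Y": 0.5, "N:P326L": -1.25}
print(estimate_fitness(["S:H49Y", "N:P326L", "ORF7a:E22D"], effects))
```

Because the estimate is a plain sum, it ignores interactions between mutations.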

Identification of a co-infection.

As part of our SARS-CoV-2 surveillance at the Norwegian Institute of Public Health (NIPH), the purity and consistency of the sequence data are monitored. Through this quality control, we identified a sample with high levels of noise, or sequence variation, after the mapping of the reads, which typically indicates either contamination or co-infection (Fig. 1A). To rule out contamination or other sequencing artifacts as causes for the observation, we repeated the entire analysis, from RNA extraction, cDNA generation, PCR amplification and library preparation to sequencing. The re-processed sample showed exactly the same noise pattern (Fig. 1A and 2A “day 0”).

Strikingly, the sequence variation was restricted mainly to the first two-thirds of the genome (Fig. 1A), which warranted further investigation. We attempted to reconstruct the genomic sequences of a potential major and minor variant in the sample (i.e., co-occurring strains of different abundances; see Methods for details on the generation of the major and minor sequences). The major sequence was classified as a Delta variant (Pangolin lineage AY.98.1; Nextclade clade 21J) and the minor sequence as an Omicron variant (BA.5; 22B) (Fig. 1A and Fig. 1B).

Identification of delta/omicron recombinant strains.

Fine-grained mutation-profile analyses of the two strains (i.e., major and minor variants) showed that both sequences were actually recombinants. The major strain had AY.98.1-specific mutations in the first 15 kb of the genome (flanked by G210T and G15451A) (Fig. 1B), followed by an 8 kb region overlapping the Spike gene that contained only BA.5 mutations (flanked by C17410T and C25584T). The minor strain, in contrast, had only BA.5-specific mutations in the first 15.7 kb of the genome (flanked by the BA.5-specific mutations C44T and C15714T), followed by a 3 kb region with a mixture of Delta and Omicron mutations and then a 4 kb region covering the Spike gene with BA.5 mutations only (flanked by the BA.5 mutations C21618T and C25584T) (Fig. 1B). The final part of the genome was identical in both strains and carried only AY.98.1-specific mutations (Fig. 1B).

This suggests that the sample isolated on day 0 was actually a mixture of at least two recombinant strains, one AY.98.1-BA.5-AY.98.1 recombinant and one BA.5-AY.98.1 recombinant.

Two independent recombinant-detection tools, sc2rf and PrecFinder, classified both strains as recombinants (Fig. 1B, Fig. S1A and Fig. S1B).
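The way a mosaic structure can be read off lineage-labelled mutations is sketched below. This is an illustration only and does not reproduce sc2rf or PrecFinder: it collapses consecutive mutations with the same label into blocks, using just the flanking mutations of the major strain named above. The function name `lineage_blocks` is hypothetical.

```python
def lineage_blocks(labelled_mutations):
    """Collapse lineage-labelled mutations into contiguous genome blocks.

    labelled_mutations: list of (position, label) tuples, where label is
                        "AY.98.1", "BA.5" or "other".
    Returns a list of (label, first_position, last_position) tuples.
    """
    blocks = []
    for position, label in sorted(labelled_mutations):
        if label == "other":
            # Uninformative mutations do not start or end a block.
            continue
        if blocks and blocks[-1][0] == label:
            blocks[-1][2] = position  # extend the current block
        else:
            blocks.append([label, position, position])
    return [tuple(block) for block in blocks]


# Flanking mutations of the first two regions of the major strain.
major_strain = [
    (210, "AY.98.1"),    # G210T
    (15451, "AY.98.1"),  # G15451A
    (17410, "BA.5"),     # C17410T
    (25584, "BA.5"),     # C25584T
]
print(lineage_blocks(major_strain))
```

Each change of label between neighbouring blocks marks an interval that contains a putative breakpoint, here between positions 15451 and 17410.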

Virus evolution in an immunocompromised long-term infected COVID-19 patient.

Driven by these results, we became interested in tracing the sample's origin, and we found that it had been obtained from a long-term infected COVID-19 patient. Since the patient was already being monitored at the hospital, we could obtain five additional samples collected between 288 and 10 days before the day 0 sample (Fig. 2A and 2B). All these samples had much lower levels of noise compared to day 0, suggesting that the patient was infected by only a single viral strain at these time points. Some degree of noise was observed on day −171 and day −130, but analysis of any major and minor strains in these samples, as well as in all the other samples taken before day 0, showed only AY.98.1 strains without any BA.5-specific mutations (Fig. 2C). This suggests that the recombination events happened sometime in the ten days before day 0.

To gain information on the relative fitness of the two recombinants, we decided to follow up the patient to see the viruses competing in vivo. We collected five extra samples from the patient at 22, 69, 70, 93 and 103 days after day 0 (Fig. 2A and Fig. 2B). We found that the new samples had low noise levels, consistent with the presence of just a single lineage (Fig. 2A). Indeed, when we extracted major and minor variants from these new samples, we found that all of them belonged to a new recombinant lineage. This surviving lineage was the recombinant with an AY.98.1 backbone and a BA.5 Spike. We noticed that the lineage was similar but not identical to the major lineage found in the sample collected on day 0 (Fig. 2C).

By looking at the mutation profiles (Fig. 2C) together with the noise across the genome (Fig. 2A), it seems unlikely that the minor lineage found on day 0 was the one that outcompeted the other lineages. We therefore hypothesized two scenarios that could lead to the results observed 22 days after day 0. (i) The two recombinant lineages found on day 0 recombined again, so that the lineage that became dominant afterwards lost four BA.5-specific mutations (i.e., C17410T, A18163G, C19955T, and A20055G) and acquired three AY.98.1-specific mutations (i.e., C16466T, C19220T, C19524T). (ii) Alternatively, the sample taken on day 0 contained a mixture of at least three recombinants: a BA.5-AY.98.1 recombinant similar to the minor variant on day 0, an AY.98.1-BA.5-AY.98.1 recombinant similar to the major variant on day 0, and another AY.98.1-BA.5-AY.98.1 recombinant similar to the major variant present in the sample taken 22 days after day 0. A mixture of three such recombinants with approximate proportions of 65%, 10% and 25% respectively would produce a noise pattern like the one observed on day 0 (Fig. S1C versus Fig. 2A “day 0”). Although it is impossible to distinguish between these scenarios a posteriori, we believe that the second scenario is more plausible. In either case, multiple recombination events are required, suggesting that recombination between different SARS-CoV-2 lineages may occur frequently during co-infection.
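The reasoning behind the three-strain scenario can be made concrete with a toy calculation of the noise expected from a mixture. The sequences below are invented four-nucleotide examples, not the patient's recombinants, and the function name `mixture_noise` is hypothetical; only the proportions 65%, 10% and 25% come from the text.

```python
def mixture_noise(strains, proportions):
    """Expected per-position noise for a mixture of aligned strains.

    strains: list of aligned sequences of equal length (toy data here).
    proportions: relative abundance of each strain, summing to 1.
    Noise at a position is one minus the frequency of the most abundant base.
    """
    noise = []
    for bases in zip(*strains):
        frequency = {}
        for base, proportion in zip(bases, proportions):
            frequency[base] = frequency.get(base, 0.0) + proportion
        noise.append(1.0 - max(frequency.values()))
    return noise


# Three hypothetical strains that differ at the second and third positions.
toy_strains = ["AAGT", "ACGT", "ACTT"]
toy_noise = mixture_noise(toy_strains, (0.65, 0.10, 0.25))
print([round(value, 2) for value in toy_noise])
```

In this toy mixture the variable positions show two different noise levels (0.35 and 0.25) depending on which strains disagree, whereas a mixture of only two strains would give the same noise level at every position where they differ.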

Cultivation of the recombinant lineage.

To investigate the ability of the recombinant strain to propagate in vitro, we cultivated the virus extracted from the sample taken 70 days after day 0. We found that the virus was able to replicate and that after two passages the sequence was the same as the original recombinant (Fig. 2C, “day +70 in-vitro”). Interestingly, we found that all the noisy positions present in the original sample were also noisy in the cultured samples (Fig. S1D), suggesting that the sample contained a mix of strains with different mutations at these positions. However, none of them seemed to provide a strong fitness advantage, at least in vitro.

Fitness advantage of the recombinant.

We hypothesized that the competition of two similar viruses inside a patient would be the perfect arena to infer which mutations provide fitness advantages in vivo. After excluding the mutations gained or lost because of the recombination on day 0, we found 21 mutations with presence/absence patterns suggesting that they had been gained and/or lost during the evolution of the virus within the patient (Fig. 3). Thirteen mutations were incorporated into the genome at some point during the infection (Fig. 3B). Seven mutations were gained and subsequently lost, and one mutation (S:H49Y) had a pattern suggesting that it might have been gained by the virus at two different timepoints (day −171 and day 93).

Then, we analysed the fitness difference associated with these 21 mutations by matching them to the fitness differences calculated by Bloom and Neher (2023). Surprisingly, we found that seven of the thirteen mutations gained and fixed in the genome are estimated to yield a negative fitness difference; conversely, six of the eight mutations gained and lost had positive fitness differences.

Finally, we estimated the fitness of all the variants identified by adding the fitness associated with all the mutations present in their genomes. We found that although the virus has gained fitness during the infection, the main driver for the fitness increase was the recombination event (Fig. 3C). Moreover, the presence of a significant number of gained-and-lost mutations (Fig. 3B) together with the differences in the estimated fitness between major and minor variants (Fig. 3C) suggests the presence of different subvariants with different mutations competing within the patient.

Homologous recombination in coronaviruses is thought to occur when the enzyme RNA-dependent RNA polymerase (RdRp) separates from one RNA template while keeping the nascent RNA and then continues building the strand at the same position using a different template molecule (Focosi & Maggi, 2022). Although coronaviruses have evolved to use recombination as part of their replication processes to produce a pool of recombined RNA molecules, the role of this viral molecular mechanism in generating novel recombinant lineages remains uncertain.

To our knowledge, this is the first report of a recombinant SARS-CoV-2 virus between the Omicron BA.5 and Delta AY.98.1 lineages and the first time that we have witnessed consecutive sequencing snapshots of the competition of several SARS-CoV-2 lineages in one infected individual. Although other studies have found recombinant viruses in sequential samples acquired from long-term infected patients (Burel et al., 2022), this is the first time that we have obtained and analysed samples in which at least two recombinant lineages were competing with each other in vivo. Moreover, we have developed and released a set of tools to detect and analyse this type of event in the future (i.e., PrecFinder, NoisExtractor and the co-infection detection tool).

The results of this study reveal the emergence of a recombinant virus with an AY.98.1 backbone and a BA.5 Spike gene isolated from a long-term infected COVID-19 patient in Norway. The most likely scenario for this recombinant to arise is that, while at the hospital, the long-term patient infected with an AY.98.1 virus came in contact with another person infected by a BA.5 virus leading to a coinfection and that shortly after the two lineages recombined. The recombined strain that eventually became the dominant strain in the patient probably arose within 10 days prior to the first detection of the recombinant lineage. However, our observations suggest that there were actually multiple recombination events within the patient, both between the omicron and delta variants but also secondary events between different recombinants. These results suggest that recombination can occur frequently during coinfection, and they highlight the importance of close monitoring and early detection of such events.

Moreover, our findings suggest that several recombinant viruses may have been competing in the patient. The fact that some apparently harmful mutations were retained over time, while beneficial mutations were lost, suggests that the evolution of the virus within the patient might be affected by clonal interference, a phenomenon in which beneficial mutations may disappear from a population because of competition between sub-variants carrying different mutations (Strelkowa & Lässig, 2012). By analysing the fitness associated with each of the observed mutations in the viral population in the long-term infected patient, we found that recombination had a major impact on the fitness increase of the virus. Indeed, the fitness gained through mutations acquired or lost during the infection seems to be lower than the fitness gained through the incorporation of an Omicron Spike into a Delta backbone via recombination. Clonal interference is strongest in asexual organisms, or when there is strong linkage disequilibrium. Recombination in betacoronaviruses may therefore serve as a powerful mechanism to overcome clonal interference and ensure mixing of genetic material between lineages. Indeed, one hypothesis that might explain the success of RNA viruses is that their high recombination rates help them to overcome the burdens of clonality.

However, it is possible that our fitness estimations differ from the actual fitness of the virus for three reasons. First, the database that we used to estimate the fitness was constructed using epidemiological data gathered from viral databases, and it is possible that the mutations important for the fitness of the virus at the population level differ from the mutations important for its transmission between cells within the patient. This might be especially relevant for patients with a weakened immune system who are unable to clear the infection for months. Second, when we computed the overall fitness of each lineage, we did not account for epistatic relationships between mutations, and it is possible that genetic interactions between mutations are important determinants of the overall fitness of the virus. Third, the fitness associated with deletions of amino acids was not taken into account, since the dataset does not have information about the fitness changes due to sequence deletions or insertions.

Therefore, further research is needed to investigate the potential implications of the mutations gained by the virus during its evolution within the patient (i.e., ORF7a:E22D, ORF1ab:V86F, ORF1ab:V4102A, ORF1ab:N1080I, ORF1ab:P1427S, ORF1ab:C1889Y, ORF1ab:S1272G, ORF1ab:Q4100H, ORF1ab:M1156I, ORF1ab:A2909V, N:P326L, ORF1ab:I3619V, ORF1ab:T1538I) in terms of fitness, transmissibility, virulence, and vaccine effectiveness.

The identification of recombinant viruses in a long-term infected COVID-19 patient raises questions about the potential for similar events to occur in other patients and populations, as well as the implications for ongoing efforts to control the spread of the virus.

Our findings highlight the importance of continued surveillance and monitoring of SARS-CoV-2 genomes, particularly in high-risk populations such as long-term infected immunocompromised COVID-19 patients, to detect and respond to potential recombination events and other evolutionary changes in the virus. These patients may be among the most likely sources of novel recombinant SARS-CoV-2 variants.

Overall, our study provides important insights into the genetic diversity and evolution of SARS-CoV-2 and underscores the need for ongoing research and surveillance efforts to better understand and combat this global health threat.

SARS-CoV-2: Severe Acute Respiratory Syndrome coronavirus 2

ENA: European Nucleotide Archive

NIPH: Norwegian Institute of Public Health

Data availability

Sequences in fastq and fasta format are stored in the European Nucleotide Archive (ENA) under the Project ID PRJEB71327.

Acknowledgments

We acknowledge all the hard work carried out in the clinic and its contribution to the infection control monitoring, which has led to the discovery of important single cases such as this one. We also acknowledge the contributions of providers of publicly available sequences. We are also very grateful to the whole team of highly skilled technicians involved in whole genome sequencing of the samples at the NIPH, and especially Rasmus Kopperud Riis. We also sincerely thank the Norwegian Sequencing Centre (NSC) NorSeq for partnering during the pandemic to achieve large-volume sequencing capacity.

Ethics Approval

Ethical approval has not been sought for these analyses since the work has been carried out as part of the monitoring of infectious diseases at the national public health institute covered by the national infection control act. The ethics committee/scientific department at the local hospital, Levanger Hospital, has nevertheless been consulted and approval has been given to publish the results in current form.

Consent for publication

The ethics committee/scientific advice department, at the local hospital, Levanger Hospital, has been consulted and approval has been given to publish the results in current form without consent from the actual patient.

Competing interests

The authors declare that they have no competing interests.

Funding

Not applicable

Author contributions

IG, OH and KB conceptualized the study. OF and KZ provided the samples and the clinical information. IG and JB analysed the data. EF performed the cultivation of the virus in vitro. IG, JB, EF, LM, AR, OH and KB wrote the manuscript. All authors reviewed and edited the manuscript, and approved the final version.

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jozefowicz, R., Jia, Y., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Schuster, M., & others (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.
Aksamentov, I., Roemer, C., Hodcroft, E., & Neher, R. (2021). Nextclade: clade assignment, mutation calling and quality control for viral genomes. Journal of Open Source Software, 6, 3773. https://doi.org/10.21105/joss.03773
Bentley, K., & Evans, D. J. (2018). Mechanisms and consequences of positive-strand RNA virus recombination. Journal of General Virology, 99(10), 1345-1356. https://doi.org/10.1099/jgv.0.001142
Bloom, J. D., & Neher, R. A. (2023). Fitness effects of mutations to SARS-CoV-2 proteins. Virus Evolution, 9(2), vead055. https://doi.org/10.1093/ve/vead055
Burel, E., Colson, P., Lagier, J.-C., Levasseur, A., Bedotto, M., Lavrard-Meyer, P., Fournier, P.-E., La Scola, B., & Raoult, D. (2022). Sequential Appearance and Isolation of a SARS-CoV-2 Recombinant between Two Major SARS-CoV-2 Variants in a Chronically Infected Immunocompromised Patient. Viruses, 14(6), 1266. https://www.mdpi.com/1999-4915/14/6/1266
Carabelli, A. M., Peacock, T. P., Thorne, L. G., Harvey, W. T., Hughes, J., COVID-19 Genomics UK Consortium, Peacock, S. J., Barclay, W. S., de Silva, T. I., & Towers, G. J. (2023). SARS-CoV-2 variant biology: immune escape, transmission and fitness. Nature Reviews Microbiology, 1-16.
Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34(17), i884-i890. https://doi.org/10.1093/bioinformatics/bty560
Cheng, Y., Ji, C., Han, N., Li, J., Xu, L., Chen, Z., Yang, R., Zhou, H.-Y., & Wu, A. (2022). covSampler: A subsampling method with balanced genetic diversity for large-scale SARS-CoV-2 genome data sets. Virus Evolution, 8(2). https://doi.org/10.1093/ve/veac071
Chollet, F., & others. (2015). Keras. https://keras.io
Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., McCarthy, S. A., Davies, R. M., & Li, H. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2). https://doi.org/10.1093/gigascience/giab008
Focosi, D., & Maggi, F. (2022). Recombination in Coronaviruses, with a Focus on SARS-CoV-2. Viruses, 14(6). https://doi.org/10.3390/v14061239
Grubaugh, N. D., Gangavarapu, K., Quick, J., Matteson, N. L., De Jesus, J. G., Main, B. J., Tan, A. L., Paul, L. M., Brackney, D. E., Grewal, S., Gurfield, N., Van Rompay, K. K. A., Isern, S., Michael, S. F., Coffey, L. L., Loman, N. J., & Andersen, K. G. (2019). An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biology, 20(1), 8. https://doi.org/10.1186/s13059-018-1618-7
Harari, S., Tahor, M., Rutsinsky, N., Meijer, S., Miller, D., Henig, O., Halutz, O., Levytskyi, K., Ben-Ami, R., Adler, A., Paran, R., & Stern, A. (2022). Drivers of adaptive evolution during chronic SARS-CoV-2 infections. Nature Medicine, 28, 1501-1508. https://doi.org/10.1038/s41591-022-01882-4
Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357-359. https://doi.org/10.1038/nmeth.1923
Li, P., de Vries, A. C., Kamar, N., Peppelenbosch, M. P., & Pan, Q. (2022). Monitoring and managing SARS-CoV-2 evolution in immunocompromised populations. Lancet Microbe, 3(5), e325-e326. https://doi.org/10.1016/S2666-5247(22)00061-1
Meyerowitz, E. A., Richterman, A., Gandhi, R. T., & Sax, P. E. (2021). Transmission of SARS-CoV-2: A Review of Viral, Host, and Environmental Factors. Ann Intern Med, 174(1), 69-79. https://doi.org/10.7326/m20-5008
O’Toole, Á., Scher, E., Underwood, A., Jackson, B., Hill, V., McCrone, J. T., Colquhoun, R., Ruis, C., Abu-Dahab, K., Taylor, B., Yeats, C., du Plessis, L., Maloney, D., Medd, N., Attwood, S. W., Aanensen, D. M., Holmes, E. C., Pybus, O. G., & Rambaut, A. (2021). Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evolution, 7(2). https://doi.org/10.1093/ve/veab064
OECD. (2021). Risks that matter 2020: The long reach of COVID-19. https://doi.org/10.1787/44932654-en
World Health Organization. (2023). WHO Coronavirus (COVID-19) Dashboard. https://covid19.who.int/
Parums, D. V. (2023). Editorial: The XBB.1.5 ('Kraken') Subvariant of Omicron SARS-CoV-2 and its Rapid Global Spread. Med Sci Monit, 29, e939580. https://doi.org/10.12659/msm.939580
R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org
Sekizuka, T., Saito, M., Itokawa, K., Sasaki, N., Tanaka, R., Eto, S., Someno, R., Ogamino, A., Yokota, E., Saito, T., & Kuroda, M. (2022). Recombination between SARS-CoV-2 Omicron BA.1 and BA.2 variants identified in a traveller from Nepal at the airport quarantine facility in Japan. Journal of Travel Medicine, 29(6). https://doi.org/10.1093/jtm/taac051
Strelkowa, N., & Lässig, M. (2012). Clonal interference in the evolution of influenza. Genetics, 192(2), 671-682. https://doi.org/10.1534/genetics.112.143396
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org
World Development Report 2022: Finance for an Equitable Recovery. (2022). https://doi.org/10.1596/978-1-4648-1730-4
World Health Organization. (2022). COVID-19 weekly epidemiological update, edition 115, 26 October 2022. https://apps.who.int/iris/handle/10665/363853
Zannoli, S., Brandolini, M., Marino, M. M., Denicolò, A., Mancini, A., Taddei, F., Arfilli, V., Manera, M., Gatti, G., Battisti, A., Grumiro, L., Scalcione, A., Dirani, G., & Sambri, V. (2023). SARS-CoV-2 coinfection in immunocompromised host leads to the generation of recombinant strain. International Journal of Infectious Diseases, 131, 65-70. https://doi.org/10.1016/j.ijid.2023.03.014
Zhou, P., Yang, X.-L., Wang, X.-G., Hu, B., Zhang, L., Zhang, W., Si, H.-R., Zhu, Y., Li, B., Huang, C.-L., Chen, H.-D., Chen, J., Luo, Y., Guo, H., Jiang, R.-D., Liu, M.-Q., Chen, Y., Shen, X.-R., Wang, X., . . . Shi, Z.-L. (2020). A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature, 579(7798), 270-273. https://doi.org/10.1038/s41586-020-2012-7

No competing interests reported.

FigS121112023.pdf
Figure S1. Confirmation of recombinant viruses. A. Classification of the major and minor sequences obtained on day 0 according to PrecFinder. B. Classification of the major and minor sequences obtained on day 0 according to sc2rf. C. Simulated noise for a scenario in which three distinct recombinants were mixed at ratios of 65%, 25% and 10%. The parental lineage of each genome fragment is shown in red (BA.5) or blue (AY.98.1). D. Noise ratio of the sample obtained from the patient on day 70 (left) and of the extracted virus cultivated for one (middle) or two (right) passages. Noise outliers and missing positions are labeled with red and blue dots, respectively, as described in Fig. 1A. Nucleotide positions of bases with high noise are also labeled.
