Recombination Drives Emergence of Orf Virus Diversity: Evidence From the First Complete Genome of Indian Orf Virus and Comparative Genomic Analysis


Contagious pustular dermatitis is a zoonotic disease that primarily infects small ruminants and is caused by a Parapoxvirus, Orf virus (ORFV). This study evaluated an ORFV outbreak in goats that occurred in Madhya Pradesh, a state of central India, during 2017, using phylogenetic trees to assess its transboundary potential. The complete genome of an ORFV strain named Ind/MP comprises 139,807 bp, with a GC content of 63.7% and 132 open reading frames (ORFs) flanked by inverted terminal repeats (ITRs) of 3,910 bp. Evolutionary parameters such as selection pressure (θ = dN/dS) and nucleotide diversity (π) indicate that ORFV is under purifying selection. A total of forty recombination events were observed, of which the Ind/MP strain was involved in twenty-one, indicating that this strain can recombine to generate new variants.

animals' wool and feces for several years [2]. The infection presents as intensified skin wounds with exacerbated blisters throughout the buccal cavity, often leading to weight loss and anorexia. The morbidity rate often reaches up to 100%, resulting in emaciation in adults and kids and thereby negatively affecting the herding economy [2]. Animal handlers are prone to zoonotic infection, which often manifests as unbearable pustules on the hands that can spread to other body parts such as the genitals and face [3]. The ORFV genome consists of double-stranded DNA (dsDNA) encoding about 130 distinct genes. Genes in the central region are relatively more conserved and are involved in mature virion formation and virus replication. In contrast, genes in the terminal regions are more variable and are often associated with virulence and immune modulation [4]. Despite the global distribution of the virus, only fourteen complete genome sequences are available so far. The absence of a complete genome sequence for Indian isolates makes genetic analysis difficult and thus hinders further functional studies. Therefore, we performed molecular detection of ORFV isolates prevalent in central India infecting the Black Bengal goat breed, followed by complete genome sequencing for the first time on a next-generation sequencing (NGS) platform, and used comparative genomics approaches for phylogenetic, evolutionary, and recombination analyses.
The study area is in the Dhar district of Madhya Pradesh, a central Indian state (75.30°E, 22.59°N).
Samples (n=10) were collected during 2017 from naturally infected goats aged between one and eleven months that showed typical Orf skin lesions on their lips. The scab samples were stored at −80 °C for virus isolation and further analysis. Total genomic DNA was isolated from the skin tissue according to the protocol described by Sarker et al. 2017 using the DNeasy Blood and Tissue purification kit (QIAGEN, Germany) [5]. The presence of the virus was confirmed by PCR with four sets of primers targeting ORFV011, ORFV020, ORFV059, and ORFV108, commonly known as B2L, E3L, F1L, and A32L (Supplementary Table 1). The PCR-amplified DNA was purified using the MinElute gel extraction kit (QIAGEN, Germany) and sent for Sanger sequencing. Subsequently, gene-specific phylogenetic trees were constructed to infer the genetic relationship and transboundary potential of the circulating strain by comparing it with isolates from within the country and with other global isolates. The trees were built in MEGA 6.0 using the general time reversible (GTR) substitution model for maximum likelihood (ML) phylogeny with 1,000 bootstrap replicates (Figure 1A-2D). Nucleotide BLAST and phylogenetic analysis based on these four genes confirmed that the present isolate has 99%-100% … At the global level, the maximum similarity was … isolation was attempted by passaging clinical samples in African green monkey kidney (Vero) cells and primary lamb testicle cells [7]. However, the virus could not be recovered even after the sixth blind passage.
We therefore proceeded with NGS, isolating viral DNA directly from the clinical samples and sequencing it on the NextSeq 500 platform [8,9]. The assembled genome is 139,807 bp long and has been assigned NCBI accession number MT332357. Like other parapoxvirus (PPV) genomes, it has a high G+C content (63.7%). The inverted terminal repeats (ITRs), which span ORFV001 and ORFV134, have a total length of 3,910 bp. Each ITR contains a terminal BamHI site. The telomere resolution motif consists of TAAAT, followed by a spacer sequence, ACCCGACC, and six T residues, which form the terminal hairpin loop (Figure 2). Using NCBI's ORF Finder and BLAST tools, we identified 132 ORFs corresponding to a distinct set of genes. Comparison with the reference genome Chi/GO using the BioEdit and ExPASy tools showed nearly 488 unique mutations in the current isolate (Supplementary Table 2). These mutations led to both synonymous and non-synonymous substitutions. The highest numbers of synonymous substitutions were recorded in RNA helicase NPH-II, RNA polymerase subunit RPO147, virion core protein P4a precursor, and EEV maturation protein, whereas the highest numbers of non-synonymous substitutions were recorded in poly(A) polymerase catalytic subunit PAPL, NF-κB pathway inhibitor, DNA-binding protein, and ankyrin/F-box protein. Notably, the highest numbers of non-synonymous substitutions fell within immune regulatory genes such as the NF-κB pathway inhibitor and the ankyrin/F-box protein. These mutations might be responsible for maintaining the heterogeneity of this pathogen and shaping its virulence. Taking into account all fourteen available complete genome sequences of ORFV, the nucleotide diversity (π) and haplotype diversity (Hd) calculated with DnaSP were 0.02815 and 1.000, respectively.
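To make the two diversity statistics concrete, the following is a minimal Python sketch of how π (average pairwise differences per site) and Hd (Nei's haplotype diversity) can be computed from a set of aligned sequences. It is not DnaSP's implementation: the function names are hypothetical, and gaps are handled by simple pairwise exclusion, which is an assumption that may differ from the settings used in the study.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(aligned):
    """Average proportion of differing sites over all sequence pairs (pi).

    Assumes equal-length, pre-aligned sequences. Columns where either
    sequence has a gap or ambiguity code are skipped for that pair
    (a simplification, not necessarily DnaSP's gap handling).
    """
    seqs = [s.upper() for s in aligned]
    n = len(seqs)
    if n < 2 or any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("need at least two sequences of equal aligned length")
    total = 0.0
    for a, b in combinations(seqs, 2):
        pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
        if pairs:
            total += sum(x != y for x, y in pairs) / len(pairs)
    return total / (n * (n - 1) / 2)


def haplotype_diversity(aligned):
    """Nei's haplotype diversity: Hd = n/(n-1) * (1 - sum(p_i^2))."""
    seqs = [s.upper() for s in aligned]
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


if __name__ == "__main__":
    # Toy alignment for illustration only; not ORFV data.
    toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
    print(round(nucleotide_diversity(toy), 4), round(haplotype_diversity(toy), 4))
```

Under this formula, Hd equals 1 whenever every sequence in the sample is a distinct haplotype, which is consistent with the value of 1.000 reported for the fourteen complete genomes.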
Selection pressure analysis (θ = dN/dS) gave a value of 0.02911, indicating that ORFV is under purifying selection. Tajima's D test of neutrality yielded a significant negative value (-0.14928), suggesting that this virus might be undergoing a period of evolutionary expansion. Similar θ values, ranging from 0.065 to 0.200, were obtained in recent studies of avipoxvirus and ORFV partial genes [10,11]. This supports the view that the balance between dN and dS reflects the selection pressure that alters the rate of evolution.
Perfect mono-, di-, tri-, tetra-, penta-, and hexanucleotide microsatellites (simple sequence repeats, SSRs), as well as compound microsatellites (cSSRs), were identified with the IMEx software [12] using the following parameters: type of repeat, perfect; repeat size, all; minimum repeat number, 6, 3, 3, 3, 3, 3 for mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats, respectively; and maximum distance between two SSRs (dMAX), ten nucleotides.
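The search parameters above can be illustrated with a small Python sketch that scans a sequence for perfect repeats using the same minimum repeat numbers and groups neighbouring repeats into compound SSRs using dMAX. This is not the IMEx algorithm. The function names and the tuple layout are hypothetical, and overlapping hits between motif sizes are not reconciled, so counts from this sketch would not necessarily match those reported by IMEx.

```python
# Minimum repeat numbers from the text: 6 for mono-, 3 for di- to hexanucleotides.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'GCGC')."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples, 0-based, end exclusive."""
    seq = seq.upper()
    hits = []
    for size, min_n in min_repeats.items():
        i = 0
        while i + size * min_n <= len(seq):
            motif = seq[i:i + size]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                n = 1
                # Extend the run while the next unit is an exact copy of the motif.
                while seq[i + n * size:i + (n + 1) * size] == motif:
                    n += 1
                if n >= min_n:
                    hits.append((i, i + n * size, motif, n))
                    i += n * size  # resume scanning after this repeat tract
                    continue
            i += 1
    return sorted(hits)


def find_compound_ssrs(ssrs, dmax=10):
    """Group SSRs whose separation from the previous SSR is at most dmax bases."""
    compounds, current = [], []
    for ssr in sorted(ssrs):
        if current and ssr[0] - current[-1][1] <= dmax:
            current.append(ssr)
        else:
            if len(current) > 1:
                compounds.append(current)
            current = [ssr]
    if len(current) > 1:
        compounds.append(current)
    return compounds


if __name__ == "__main__":
    # Toy sequence for illustration only; not ORFV data.
    ssrs = find_perfect_ssrs("CATGCGCGCTCTAAAAAAACTG")
    print(ssrs)
    print(find_compound_ssrs(ssrs))
```

On the toy sequence, the scan reports a (GC)3 tract and an (A)7 tract separated by three bases, which the dMAX rule then merges into a single compound SSR.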
Our study revealed 1,108 SSRs and 94 cSSRs scattered throughout the ORFV genome. The genome is richest in dinucleotide repeats (76.5%), followed by trinucleotide (18.14%) and mononucleotide repeats (5.14%). Hexanucleotide microsatellites were the scarcest, constituting only 0.18% of the repeats (Supplementary Figure 2). The distribution of classified repeats suggests that the dinucleotide GC/CG is the most prevalent motif in most ORFV genomes, similar to other DNA viruses such as human papillomavirus [13]. Dinucleotide repeats can adopt the Z-conformation or other alternative secondary DNA structures that facilitate recombination [14]. Using a single mononucleotide repeat, Houng et al. were able to follow the transmission dynamics of a human adenovirus during an epidemic [15]. Therefore, these microsatellites could potentially serve as a powerful tool for epidemiological and evolutionary studies of ORFV.
To create a phylogenetic tree, we retrieved the fourteen available ORFV complete genome sequences from the GenBank database along with five sequences of Parapoxvirus (PPV) and Orthopoxvirus, consisting of two Pseudocowpox virus (PCPV), one Bovine papular stomatitis virus (BPSV), and two Monkeypox virus (MPV) sequences, respectively. The tree revealed that six ORFV strains originating in goats and eight strains from sheep formed two separate clades, except for Ger/D1701, with 61-100% bootstrap support. The present ORFV strain showed a close relationship with the Chi/GO and USA/ORFD isolates. Our analysis also showed that all ORFVs were more closely related to PPVs (PCPVs and BPSV) than to Orthopoxvirus (MPVs) (Figure 3). However, compared with the previous study, our analysis showed increased heterogeneity and a less strictly maintained host-specific clade structure. This kind of ambiguity was also observed during phylodynamic analysis of the parapoxvirus genus in Mexico (2007-2011), where Ger/D1701 and several other isolates formed a separate clade rather than a host-specific clade [11]. To understand the source of genetic variation among the ORFV complete genomes, we looked for evidence of recombination using the RDP, GENECONV, Bootscan, MaxChi, Chimaera, SiScan, PhylPro, LARD, and 3Seq methods implemented in the RDP4 program [16]. We detected a total of 40 potential recombination events with significant P-values across the ORFV genomes (Supplementary Table 3). Viruses undergo genetic recombination to form new variants in the population by deleting many of their non-essential genes or by acquiring new host genes [17]. Viral genome sequencing has shown that recombination plays a vital role in the evolution of human and animal pathogens, including Vaccinia and Variola viruses [18,19].
In this study, we identified forty potential recombination events, and Ind/MP participated in more than 50% of them, either as the recombinant or as a major or minor parent. Thus, the Ind/MP strain has the potential to evolve via recombination and can act as a major or minor parent to form new variants.
In conclusion, we report the complete genome of a circulating ORFV isolate from central India. Based on an in-depth comparative genomic analysis, we propose that recombination events may be responsible for ORFV evolution and the generation of new strain types. We hope that the current genomic information will be useful for further understanding of ORFV biology and epidemiology and for research toward diagnosis and vaccine development.