Approaches to identify canine distemper virus with neurological symptoms on the basis of molecular characterization of hemagglutinin and fusion genes

Canine distemper virus (CDV), which causes severe infections in all domestic and wild carnivores, is transmitted by all secretions and excretions of infected animals. Despite regular vaccination, CDV has continued to circulate in nature for many years and remains a worldwide problem in dogs. The current investigation aims to identify and characterize CDV in dogs with neurological symptoms and to determine whether central nervous system (CNS) symptoms and phylogenetic data might be used to differentiate between CDV strains. The medical records of 35 dogs with CNS symptoms were examined. An ELISA kit was used to identify CDV-specific IgG antibodies in all of the dogs' serum samples. RT-PCR confirmed the presence of CDV nucleic acid in 30 of these dogs. Of the RT-PCR-positive samples, 6 were randomly chosen for further sequencing, sequence comparisons, and phylogenetic reconstructions. Genes encoding the hemagglutinin (H) and fusion (F) proteins were partly sequenced and compared to other CDVs from throughout the world, including vaccine strains. The maximum likelihood method was used to build phylogenetic trees from CDV H and F gene nucleotide sequences. According to phylogenetic analysis of partial H and F gene nucleotide sequences, the field CDVs in this investigation were unique and different from the vaccine strain. The phylogenetic analysis indicated that all Turkish CDV strains that induced CNS symptoms belonged to the European CDV clade. While the intricacy of the CNS and the complexities of glycosylation pathways may pose significant challenges, future research will bring significant benefits by identifying evolutionarily conserved activities of N-glycosylation in CDV-infected dogs.


Introduction
Canine distemper virus (CDV) is an enveloped, negative-sense, single-stranded RNA virus that belongs to the genus Morbillivirus of the family Paramyxoviridae [1]. CDV can cause infections in the families Canidae, Felidae, Hyaenidae, Mustelidae, Procyonidae, Ursidae, and Viverridae [2,3]. Despite this broad range of susceptible animals, the main reservoir for CDV has been reported to be dogs, which belong to the Canidae family [4]. Commonly, the viral infection caused by CDV is associated with respiratory, gastrointestinal, and nervous system disorders in carnivores. Several studies on the clinical and pathological aspects of different kinds of CNS symptoms in dogs have been published over the years [5-7]. There have been few attempts to analyze how effectively CNS illnesses may be distinguished from each other based on clinical and molecular data. The genome of CDV consists of genes for the C and V non-structural proteins and six structural proteins: nucleocapsid protein (N), hemagglutinin (H), fusion protein (F), phosphoprotein (P), matrix protein (M), and large protein (L). The non-structural proteins C and V are produced from an alternative open reading frame in the P gene [8,9]. Previous studies have revealed that the gene regions encoding the H and F proteins are subject to higher variability, while the genes encoding the N, P, M, and L proteins are highly conserved [10,11]. The H gene is 1824 bp long and encodes a protein of 607 amino acids (aa), in which twelve putative N-glycosylation sites have been reported [12]. It encodes a type II integral membrane glycoprotein that helps in the attachment of the virus to the host cell receptors [8,11,13]. The involvement of the H protein's glycans is one component that has received little attention. Glycans are primarily involved in protein folding and post-translational modification. N-linked glycosylation is one of the most prevalent types. 
Within the conserved pattern Asn-X-Ser/Thr, a high mannose core is linked to the amide nitrogen of asparagine (N). The CDV F gene open reading frame is 1989 bp long and encodes 662 amino acids (aa), comprising the regions Fsp (aa 1-135), F2 (aa 136-224), and F1 (aa 225-662). It encodes a type I integral membrane protein, excluding the 135-aa N-terminal signal peptide [14]. The membrane protein plays an essential role in fusion between the virus and the infected cells and in the spread of the virus within the host. Therefore, the genes encoding the H and F proteins are suitable for genetic analysis [15]. CDV genotypes have been classified into Africa-1, 2, Asia-1, 2, 3, 4, North America-1, 2, 3, South/North America-4, Europe/South America-1, South America-2, 3, European Wildlife, Rockborn-like, Arctic-like, and Vaccine clusters based on nucleotide alignment of the H and F genes [14,16-21]. The hemagglutinin (H) and fusion (F) glycoproteins constitute the viral envelope and are essential for cellular infection by binding to the signaling lymphocyte activation molecule (SLAM) and CD46 cellular receptors [22]. The H protein binds to one or more receptors, causing cellular attachment and activation of the F protein by tissue-specific proteases, terminating in cellular infection. Hence, these proteins play a vital role in determining host range and tropism [23]. Complete H gene sequence studies have found widely distant groups of CDV isolates, although some connections among these lineages remain uncertain [7]. To our knowledge, the patterns of selection or recombination, as well as their relationship to the many cases of emergence in dogs with neurological symptoms, have not been investigated in any study. In the present study, the medical records of 35 dogs with CNS symptoms were evaluated, and the H and F genes of the virus were sequenced. Analyses of the CDV H protein have identified twelve N-glycosylation sites. 
The importance of glycans in the CDV H protein is likely to extend beyond protein modification and folding. Possible roles include involvement in the pathogenesis of neural cell adhesion, axonal targeting, neural stimulation, viral receptor binding, and virus-cell fusion [24]. The purpose was to determine whether CDV strains can be detected by the prevalence of their symptoms as a group and to search for criteria that would be helpful for recognizing CDV strains.

Materials and methods
Sampling

CDV-specific IgG ELISA detection
Dogs with CNS symptoms were sampled for further study based on an abnormal neurologic examination and were  [14]. All primers are shown in Table 1 together with their nucleotide positions and amplicon sizes. In all tests, cDNA was synthesized using the RevertAid™ First Strand cDNA Synthesis Kit (Thermo Scientific™, Germany) according to the manufacturer's instructions. Electrophoresis of 5 µl of each PCR product was performed on a 2% agarose gel. Sterile purified water and CDV RNA (extracted from a commercial vaccine) were used as negative and positive controls, respectively.

Sequence and phylogenetic analyses
All positive PCR products of the expected size (1046 bp for the H gene and 797 bp for the F gene) were sequenced in both directions by a commercial company (Macrogen; BMLabosis, Turkey). The obtained partial H and F gene sequences of CDV were subjected to BLASTn to compare sequence identities and variations with sequences of other CDV strains from around the world in the GenBank (NCBI) database [26,27]. Further, amino acid sequences were subjected to multiple alignment with BioEdit software (v.7.2) using the Clustal-W method [28,29]. The deduced amino acid sequences of the genes, along with those of other CDV strains from different geographical regions, were used to construct the phylogenetic trees with bootstrap values calculated from 1000 replicates (Figs. 4, 5) by using the maximum likelihood (ML) method in MEGA 11 software [30]. Sequences were submitted to GenBank through the BankIt interface to receive accession numbers. Similarity and identity rates between sequences were calculated in MatGAT 2.0 [31].
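As an illustration of what an identity rate between two sequences means, the following minimal Python sketch computes percent identity over the ungapped columns of two already-aligned sequences. It is not MatGAT's algorithm (MatGAT performs its own pairwise alignments before scoring). The function names and the toy sequences are hypothetical and are not data from this study.

```python
from itertools import combinations


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_table(aligned: dict) -> dict:
    """Pairwise percent identity for every pair of names in an alignment."""
    return {
        (m, n): percent_identity(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


# Toy aligned fragments (hypothetical, for illustration only).
toy = {"field_1": "ACGTAC", "field_2": "ACGTAA", "vaccine": "AC-TTC"}
for pair, value in identity_table(toy).items():
    print(pair, round(value, 1))
```

Ignoring gapped columns is one of several possible conventions; counting gaps as mismatches would give lower values for the same alignment.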

Nucleotide sequence accession numbers
The GenBank accession numbers for the H and F gene sequences used in this study are listed in Supplementary Table 1.

Prediction of putative glycosylation sites
The presence of different N-glycosylation sites was predicted to characterize the H protein. Using artificial neural networks, we predicted putative N-glycosylation sites by examining the sequence context of Asn-Xaa-Ser/Thr sequons. Amino acid (aa) sequences for the glycoprotein precursor were analyzed using the online NetNGlyc v1.0 Server (www.cbs.dtu.dk/services/NetNGlyc) to determine the presence of N-linked glycosylation sites.
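The sequon search that precedes any such prediction can be sketched in a few lines of Python. This is only a plain pattern scan for Asn-Xaa-Ser/Thr and does not reproduce the neural-network scoring of NetNGlyc. The function name, the `exclude_proline` flag, and the toy peptide are hypothetical choices made for illustration.

```python
def find_sequons(protein: str, exclude_proline: bool = True) -> list:
    """Return 1-based (start, end) positions of Asn-Xaa-Ser/Thr sequons.

    With exclude_proline=True, Asn-Pro-Ser/Thr is skipped; set it to False
    to list every Asn-Xaa-Ser/Thr occurrence, including Asn-Pro-Ser/Thr.
    """
    seq = protein.upper()
    hits = []
    for i in range(len(seq) - 2):
        if seq[i] != "N" or seq[i + 2] not in "ST":
            continue
        if exclude_proline and seq[i + 1] == "P":
            continue
        hits.append((i + 1, i + 3))  # convert 0-based index to 1-based span
    return hits


# Toy peptide (hypothetical, not a CDV sequence).
toy_peptide = "MNGTANPSNNTS"
print(find_sequons(toy_peptide))
print(find_sequons(toy_peptide, exclude_proline=False))
```

A sequon found this way is only a candidate site; whether the asparagine is actually glycosylated depends on the sequence context, which is what the prediction server estimates.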

N-linked glycosylation sites of the H protein of the Turkish strains
Glycosylation plays a crucial role in the antigenicity of many proteins. The H protein was predicted to have twelve possible N-glycosylation sites at positions 19-21, 149-151, 309-311, 339-341, 391-393, 422-424, 456-458, 517-519, 542-544, 584-586, 587-589, and 603-605. A comparative analysis was done on the H protein sequences of the six CDV field strains examined in this study. In addition to the already known sites, other N-glycosylation sites were found in the Turkish CDV strains at positions 433-435, 464-466, and 489-491 (Table 4). The glycosylation potential and threshold are illustrated in Fig. 3.

Phylogenetic analysis
Phylogenetic analysis was performed using the MEGA 11 software package. The evolutionary history was inferred by using the Maximum Likelihood method and the Tamura 3-parameter model (Tamura et al. [30]). The bootstrap consensus was inferred from 1000 replicates to represent the evolutionary history of the taxa analyzed, and branch lengths were indicative of genetic distances between the sequences. The tree with the highest log likelihood (−2932.58) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. The best DNA/protein model for the construction of phylogenetic trees was selected with the model test program integrated in MEGA 11. Phylogenetic trees were constructed based on the nucleotide sequences of the H and F genes. In the maximum likelihood tree, the sequences in this study constituted separate clusters. CDV genotypes were classified into Africa-1, 2, Asia-1, 2, 3, 4, North America-1, 2, 3, South/North America-4, Europe/South America-1, South America-2, 3, European Wildlife, Rockborn-like, Arctic-like, and Vaccine clusters based on nucleotide alignment. The Turkish strains joined a tight cluster of CDV strains in the Europe/South America-1 clade that was separated from the other known CDV clades (Figs. 4, 5). The pathogenicity of a CDV strain and its interaction with host cellular receptors are closely linked to the progress of disease, and the H and F genes play key roles in both. In our study, phylogenetic classification of these genes demonstrated that the circulating CDV strains in Turkey that cause neurological symptoms are exclusively located in the Europe/South America-1 clade.
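The resampling step behind a bootstrap analysis can be sketched as follows, assuming an alignment stored as a dictionary of equal-length strings. This is not MEGA's implementation and it omits the tree inference that would be run on every replicate. The function names, the fixed seed, and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(aligned: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping the alignment length."""
    if not aligned:
        raise ValueError("alignment is empty")
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(aligned[name]) != length for name in names):
        raise ValueError("all sequences must have the same aligned length")
    # The same sampled columns are used for every sequence, so each
    # resampled column is still a column of the original alignment.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(aligned[name][c] for c in columns) for name in names}


def bootstrap_replicates(aligned: dict, n_replicates: int = 1000, seed: int = 0) -> list:
    """Generate pseudo-alignments; a tree would then be inferred from each one."""
    rng = random.Random(seed)
    return [bootstrap_alignment(aligned, rng) for _ in range(n_replicates)]


# Toy alignment (hypothetical, for illustration only).
toy = {"strain_a": "ACGTAC", "strain_b": "ACGTAA", "strain_c": "ATGTTC"}
replicates = bootstrap_replicates(toy, n_replicates=3, seed=1)
for replicate in replicates:
    print(replicate)
```

The bootstrap value next to a branch is then the percentage of replicate trees in which the same group of taxa clusters together.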

Discussion
CDV is a highly contagious pathogen that circulates widely across the world. Infected dogs pose a risk to other dogs that have not been vaccinated against this agent. In this study, the CDV N gene was successfully detected by RT-PCR with specific primers. Since the N protein is required for viral RNA replication, the mRNA encoding the N gene is highly expressed in infected cells. As a result, our study focused on the N gene region, which is known to be highly conserved, in order to detect the infection in the blood [32,33]. It is crucial to remember that serum IgG titers do not necessarily indicate whether an infection is acute or convalescent; rather, IgG titers represent previous exposure to a pathogen [34,35]. It has been reported that previous CDV infection can lead to the development of persistent infection in neural tissues [36]. Therefore, dogs with CNS symptoms that were CDV IgG-positive were used to conduct the research. In the cited study, persistent CDV infection was demonstrated by infecting primary canine brain cell cultures with a recombinant red fluorescent protein (RFP)-expressing wild-type morbillivirus strain (rA75/17red) to show the mechanism of persistent CDV infection. Furthermore, the infected dog brain cell cultures (DBCC) were examined by electron microscopy and immunofluorescence analysis. As a result, we used blood samples from living animals as an indicator of all systems in our investigation, and we had a very high positive rate. We recommend using uncoagulated blood samples from live animals for diagnostic purposes in the future, except in the host system and tissues where the virus persists. The H and F protein genes, which facilitate receptor binding and are significantly more variable than other morbillivirus genes, are mainly used in molecular epidemiology research for CDV [14,37]. 
Molecular analysis of CDV's H and F proteins has been used to investigate neurovirulence and the pathophysiology of CNS symptoms [38,39]. The most prevalent symptoms observed in our study were myoclonus, ataxia, and tremors, which were all quite typical of the neurological manifestations reported in CDV-infected dogs. Such symptoms in infected dogs have been specifically noted in previous studies [5,40]. The primary focus of the research was CDV-infected dogs with CNS symptoms. Doubts remain regarding whether viral persistence plays a role in the development of neurological symptoms in CDV infection. In a study comparing highly neurovirulent and non-cytolytic CDV strains that cause persistence, it was claimed that non-cytolytic CDV strains follow a different path in dog brain cells. On the other hand, in cases of distemper-induced demyelination, CDV has been shown to infect mostly astrocytes [36,41]. Despite the successful clearance of CDV from white matter lesions in infected dogs, prior research has shown that the virus has the capacity to migrate to other parts of the CNS, resulting in new lesions every time re-infection occurs [42,43]. CDV, on the other hand, may generate a persistent infection in the CNS, the mechanism of which is still mostly unknown.
There are strains that have effectively adapted to each host and may lead to infections, and there are strains that manage to cross the species barrier without causing severe clinical symptoms. A number of epidemiological situations may result in the prevalence of CDV in wild populations despite the lack of infection in domestic populations. All of these CDV strains might be responsible for distemper (Duque-Valencia et al. [44]). In recent years, there has been research on the transmission of neurotropic CDV from wild carnivores [41,45]. In Turkey, wild carnivores may come into contact with domestic dogs on rare occasions. CDV transmission from domestic dogs to wild carnivores has already been described in a previous report. According to that research, a mink with clinical symptoms (such as lethargy, oculo-nasal discharge, and footpad hyperkeratosis) but no neurological signs was CDV positive by RT-PCR. In that study, Oguzolu et al. [46] analyzed substitutions at aa positions 530 and 549 of the H gene to determine the host specificity of CDV strains. The Turkish mink distemper strain, accession number KT588923, has the closest phylogenetic homology with the H gene of our strains (Fig. 4). Interestingly, no alterations were found in the H genes of our strains at positions 530 (glycine, G) and 549 (tyrosine, Y). Based on the findings at aa positions 530 and 549 of the H gene sequences in our study, the CDV strains are of dog origin. Accordingly, there are no aa substitutions in our study that would associate host specificity with wild animal interaction. In recent years, the lack of effectiveness of the old CDV strains, which are still used in vaccines and are only distantly related to the new CDVs, has made it necessary to redesign vaccines with newly discovered field strains [11]. The Snyder Hill strain was isolated from a dog's brain in the 1950s and propagated in vivo before being modified for cell culture [47]. The Onderstepoort strain was developed in North America in the 1930s and is now widely used across the world [48]. The old CDV Onderstepoort strain, in lineage America-1, is related to most CDV strains [20,49]. These vaccine strains are used worldwide in distemper vaccination programs [50]. Despite the introduction of an effective vaccination, canine distemper is still a serious dog infection and has lately spread to wild animal populations, including five orders and two families of nonhuman primates [51]. While all of our strains were shown to be genetically distant from the vaccine strains, only the HSS1_1H strain had a similarity to the Rockborn and Cani-shotk5 strains (Table 2). Different levels of pathogenicity or tropism could be a possible reason why our other strains are not similar to the vaccine strains. Our phylogenetic data are important in ruling out the hypothesis that clinical sickness was caused by the vaccine strain's residual pathogenicity. The cause of the vaccination failure is unknown; however, given that the proper vaccination strategy was followed, individual immunity-associated variables or a lack of cross-reactivity between the vaccine and the wild-type strain are thought to be likely reasons. Martella et al. [11] consider two strains that are close to each other in the phylogeny and have an H amino acid difference of less than 4% to be from the same lineage. CDV is an RNA virus and therefore evolves rapidly, which explains why the Turkish strains have an amino acid divergence of more than 4% compared to the vaccine strains. If CDV variations occurred prior to the appearance of current viruses, our results imply that these "ancient" strains have either been distinguished from new variants or remain unidentified in certain host species [52].

N-Glyc
Asn-Xaa-Ser/Thr sequons (including Asn-Pro-Ser/Thr) are shown in green. Asparagines predicted to be N-glycosylated are shown in red


Fig. 4
Phylogenetic tree of the H gene sequences of Turkish CDV strains HSS1_1H, HSS2_3H, HSS3_6H, HSS4_11H, HSS5_27H, and HSS6_33H compared with strains around the world obtained from the NCBI database. Bar: number of base substitutions per site

Glycosylation is a common post-translational modification that affects protein structure, localization and trafficking, protein solubility, antigenicity, biological activity and half-life, and cell-cell interactions [24]. We studied the association between identified and putative N-glycosylation sites in the H gene and neurological findings. Amino acid variations in currently circulating wild-type CDV should be recognized as a potential cause contributing to a recurrence of distemper cases in well-vaccinated dog populations across the world. The function of additional potential N-glycosylation sites predicted in current wild-type CDV H proteins has to be determined [49]. Similar to the previous study, our findings show that the predicted putative N-glycosylation sites and amino acid changes accumulated in the H protein of circulating wild-type CDV have antigenic and perhaps neuropathic consequences [53]. In this work, we predicted additional potential glycosylation sites with consequences for the viral life cycle and pathogenicity that were lacking in CDV and should be considered carefully when designing a new CDV vaccine. Increasing overall vaccination rates with effective vaccines that generate broad, lasting immunity must remain the top aim in distemper management, particularly in regions with high dog populations and a high level of exposure to carnivores. In addition to the mutations we showed at these positions, the newly found N-glycosylation regions may be responsible for the neurological findings. It is probable that this represents just the tip of the iceberg in terms of the various and crucial functions of N-glycosylation in the development of the nervous system that are as yet unknown. 
The nervous system is governed by a diverse range of N-glycosylated proteins, including glycoproteins found on the cell surface and in the extracellular matrix, which are involved in cell adhesion and signal transduction. Surprisingly, N-glycosylation has an effect on the activities of a number of these glycoproteins even when they are not located in the nervous system [54,55].
Studies of CDV show that the distribution of CDV lineages across the world is variable. Thus, it is vital to perform ongoing molecular epidemiological surveillance and data analysis in order to discover emerging CDV variants that may be resistant to the host immune system. The accumulation of point mutations in the viral genome has resulted in the continuous evolution of CDV, with the following implications. If novel variants with distinct molecular and antigenic features are discovered, a comprehensive redesign of vaccination programs, as well as an update of the virus strains contained in commercially available products, may be required. The emergence of point mutations may also have compromised advanced molecular approaches for the detection and analysis of CDV strains that rely on perfect matching between viral RNA and test nucleotide sequences. As a result of these challenges, it is possible that these approaches will need to be modified.

Fig. 5
Nucleotide sequences and phylogenetic tree of the 797 bp F gene sequences of Turkish CDV strains HSS1_1H, HSS2_3H, HSS3_6H, HSS4_11H, HSS5_27H, and HSS6_33H compared with strains around the world obtained from the NCBI database. Bar: number of base substitutions per site