Molecular analysis of goose parvovirus eld strains from the Derzsy’s disease outbreak revealed the local European-associated variants

Since the rst recognition in the early '60s, Derzsy’s disease has occasioned signicant economic losses in the goose meat industry through the world. Today, Derzsy’s disease still maintains its importance for small-scale waterfowl farming, despite not having a remarkable impact on public health. In the present study, we investigated the distribution of goose parvovirus (GPV) strains and its potential variants from the 2019 outbreak in Turkey. The tissue samples were obtained from the infected eggs and the goslings, which were raised in the distinct farming areas of the various provinces. For this purpose, a novel primer set which amplies the 630 bp of VP3 region was designed to conrm the GPV infection by conventional PCR method. After the diagnosis, 4709 base nucleotide data including structural, non-structural and 5' inverted terminal repeat regions were obtained from the three samples in the Middle Anatolian region. The multiple comparison and phylogenetic analyses together demonstrated that the eld strains clustered with European group 2 and presented a series of unique amino acid substitutions which could determine the virulence. These results conrmed the European-related eld strains caused the outbreak in minor Asia, which could assist to understand the GPV circulation between Asia and Europe.

GPV has different pathogenicity and host ranges and is distinguishable serologically through neutralization tests. GPV has been observed in geese, Muscovy ducks, crossbreed ducks and wild ducks; however, MDPV strains have only been discovered in Muscovy ducks so far [9][10][11]. The progression of Derzsy's disease may be acute, subacute or chronic. Acutely infected geese present with symptoms include anorexia, nasal and ocular discharge, conjunctivitis, severe diarrhea, hepatitis, hydropericardium, myocarditis and, sometimes, mucosal necrosis in the oral cavity [12,13]. Furthermore, one-to three-weekold geese may develop prolonged form of disease, upon recovery from the initial infection. This is often called "short beak and dwar sm syndrome" (SBDS) and is characterized by growth restriction and feather loss. It has been demonstrated that West-European cluster strains of GPV are capable of establishing SBDS in geese [14][15][16].
Incidences of Derzsy's Disease have been frequently reported from domestic and wild geese in European countries including Germany [17], the United Kingdom [18], Poland [19] and Sweden [20]. Vaccination programs include attenuated or subunit vaccine formulations, which have been broadly administered to geese ocks in Europe to immunize them against the disease [1]. On the other hand, Turkey lacks a national control and prevention program for GPV, since the raising of geese is an insigni cant local and family-controlled enterprise. In this study, we aimed to provide comparative analysis and characterization GPV based on a known epidemic of Derzsy's disease cases to provide a better understanding of a potential viral reservoir in Minor Asia territory.

Clinical Samples and DNA extraction
Beginning from April 2019, thirty-two Lindovskaya breed goslings from the various provinces were presented to the small animal clinic of the Cumhuriyet University. These goslings were severely dehydrated and anorectic and showed listless and depression. The acute diarrhoea was commonly observed in most of the patients. Sixteen goose embryonated eggs from same farms were also subjected for this study (for the details of the samples, see Supplementary Table 1 and Supplementary Fig. 2).
Samples of the internal organs including intestine, liver, heart, spleen and brain were collected at the postmortem examination and submitted to the Virology laboratory for further analysis. Tissue homogenates were prepared from the specimens for each individual gosling separately. Brie y, 5 ml sterile PBS was added in 1 g for each tissue and grinded with sterile pestle. The same process was applied for the embryonated eggs to obtain tissue homogenates. These homogenates were centrifuged at 3000 x g 4°C for 5 minutes with Rotanta 460R refrigerated centrifuge (Hettich, Germany). Supernatants were harvested for the DNA extraction and for this purpose, commercial viral DNA extraction kit (GF-1 Vivantis, Malaysia) was used as per the manufacturer's instructions. DNA purities were interpreted according to the ratio of absorbance at 260 nm and 280 nm in UV-Vis spectrophotometry (Denovix DS-11, USA).

Primer design and PCR Ampli cation of partial GPV sequences
After reaction, positive samples were detected by running these PCR products through 1% agarose gel and displayed in imaging system (Quantum ST4 Vilber Lourmat, France) based on the observation of 630 bp amplicon according to VC 1000 bp DNA ladder (Vivantis, Malaysia). Table 1 Primer sets adopted to amplify the nearly whole genome and the characteristics (position, tm, product size) of these sets. Positions of the primer set were determined according to the GPV reference strain (NC_001701.1). Gray color indicates the primer set used for the detection of GPV.  Table 1 and the supplementary Fig. 1) and the expected size of products were observed in standard gel electrophoresis. These amplicons were further excised from the gel and puri ed using GF-1 AmbiClean Kit (Vivantis, Malaysia), then sequenced twice bidirectionally using the BigDye Terminator Cycle Sequencing Kit (Applied Biosystems, USA) on an ABI 3100 automated sequencer (Applied Biosystems, USA).
The initial evaluation of the primary sequencing data quality was processed in Geneious Prime® 2020.0.3 software [22]. The four-colour nucleotide chromatogram values were interpreted and the data with poor quality were re-sequenced for achieving the most accurate results. Contiguous sub-sequences were mapped to a reference sequence accessed from GenBank (Accession no.:NC_001701.1). Therefore, positions of the structural and non-structural genes; intergenic regions and terminal repeat regions were established in the sequencing data.
Once the coding sequences were revealed, both nucleotide and deduced amino acid sequences were subjected to multiple sequence alignment (MSA) to determine sequence identity percentages and the similarity scores. For this purpose, VP1-VP2 partial, VP3 partial, VP1 complete sequences, rep complete sequences and all-length fragment data were individually aligned with other available sequences obtained from the GenBank using Clustal (mBed algorithm) [24]. The optimal substitution models were determined using MEGA X software [25]. Bayesian information criterion (BIC) was used to select the bestt models for each dataset. Datasets were also investigated for the potential recombination events via RDP4 v4.100 software [26]. For this purpose, both structural and non-structural genes were analyzed for possible recombination events using available algorithms (RDP, GENECONV, Chimaera, MaxChi, BootScan, SiScan and 3Seq) with their default parameters (P-value cutoff = 0.05). The events detected with at least three algorithms were considered as a recombinant strain. Selection pressures for both VP1 and rep genes were estimated as the mean ratio of the number of nonsynonymous substitutions to the number of synonymous substitutions (dN/dS). Overall scores were calculated using MEGA 7 [27].
Phylogenetic analyses were performed for the nearly full length sequences, VP1 and the rep, using the PhyML 3.3.20180621 module [28] in Geneious Prime® 2021.0.3 [22] and are bootstrapped for 100 times. For this purpose, ninety-two complete genomic data including GPV, MDPV and AAAV (Avian adenoassociated virus) strains were selected for the construction of trees. Tamura-Nei (TN93), Hasegawa-Kishino-Yano (HKY85) or Kimura model were used as nucleotide substitution model. For the phylogeny of the VP1-VP2 and the VP3 partial sequences, further partial genomic data of the European-originated virus were also included.

Molecular detection of virus
We screened fty samples for the presence of GPV by using conventional PCR with GoPV-3313F and GoPV-3942R primer sets targeting the 630 nt fragments of the VP3 partial sequence. The results showed that all of the samples were positive for GPV. Next, three of the PCR products were selected for bidirectional sequencing. At this point, the priority was given to the samples from farms located in the Middle Anatolian region, where the goose production is intense. Thus, sample 2 from "farm G" in Yozgat, a sample from "farm A" in Corum, and a third sample from "farm I" in Konya were determined for further study (See supplementary table 1).

Nearly full-length genomic sequence analysis
We successfully obtained 4709 base sequencing data from our three samples with the lack of 3' UTR. Data from three samples included 5' UTR between the 1st and 444th nt, a replication associated gene (rep) between 537th and 2.420th nt, and the gene coding sequences of the structural VPs between the 2439 and 4637th nt. One-fourth of the rep cds consisted of SF3 helicase, positioned between 1461st and 1931st nt, whereas the structural ORF encoded three VPs-VP1 (2439-4637), VP2 (2874-4637) and VP3 (3033-4637).
A pairwise comparison of multiple sequence alignment according to 4709 base-length data demonstrated that Turkish strains closely related to each other and shared 99.81-99.91% nucleotide identity. These strains also presented the highest identity with three European originated strains-virulent B (NC_001701.1), VG32/1 (EU583392.1) and GPV GER (KU684472.1)-varying between 98.00% and 98.90%. The nucleotide identity percentages for other chosen strains were shown in Table 2. A complete phylogenetic analysis of nearly full-length data indicated that the Turkish strains grouped with VG32/1 with good bootstrap value, while the Virulent B and GPV GER strains diverged from these group, despite bearing high identities (Fig. 1).

Analysis of Structural Genomic Sequences
The structural ORF gene coding sequences were 2.199 nt in length and had 732 deduced amino acids. We compared both the nucleotide and predicted amino acid sequences of VP1 polyprotein sequences with other available data. Konya/19 had only a single amino acid substitution-asparagine to aspartic acid at the 703rd position (N703D) differentiating it from other Turkish strains. On the other hand, the Turkish strains shared a series of amino acid substitution separating them from virulent B; these were K47R, V144I, G177S, S498N Y521H, L529I, E558D and G615W. Most of these mutations were also identi ed in the other European-originated strains, for instance, the VG32/1 and strain GPV GER (KU684472.1). G177S substitution was detected in the partial genomic data of European-originated strains, such as the D441/04 (DQ862008.1) and D462/1/04 (DQ862011.1) strains. Exceptions were found for the S498N and N703D mutations, which were carried by strains detected in South Asia. Multiple comparison analysis of VP1 protein was shown in Supplementary Fig. 3.
The phylogenetic analysis using the complete VP1 nucleotide sequences revealed that the Middle Anatolian strains were grouped with the European-originated strains: virulent B (NC_001701.1), VG32/1 and the GPV GER with good bootstrap value ( Fig. 2A). Despite having complete data for VP1, we restricted our phylogenetic analysis of both 443 nt partial sequences of VP1/VP2 (Fragment A) and 495 nt partial sequences of VP3 gene (Fragment B) to include the partial genomic data of other European originated strains. Phylogenetic analysis based on fragment A revealed that three Turkish strains clustered with the Swedish strains (DQ862008 -DQ862012) in the European group 2, while the Hungarian strains including virulent B strain fell into the European group 1 (Fig. 3A). Notably, The West-European group included a highly pathogenic strain GPV-GER (KU684472) identi ed in ornamental ducks. The SBDS-related group mostly consisting of French isolates (EU938702-EU938706; AY496906 and AY496907) separated into a single branch with good bootstrap support value. This group presented high identity to the Asian strains. On the other hand, fragment B-based analysis revealed a European cluster, in which three Turkish strains generated a sub-branch (Fig. 3B). Notably, according to this analysis, strain DB3 was the only Asian originated strain within the European genogroup again.

Analysis of Non-Structural Genomic Sequences
The 1884 bases and 627 derived amino acid sequencing data of the complete rep gene were elicited from  Supplementary Fig. 4.
Phylogenetic analysis of the nucleotide sequences showed that Turkish strains sub-clustered in a group that included an Asian-originated DY strain (EF515837.1) and also a European strain, VG32/1 (EU583392.1). Notably, there was a weak delineation between the Asian and European strains as observed in the VP1 based phylogeny (Fig. 2B).

Recombination and Molecular Evolution Analyses
To detect potential recombination events, seven available tests using RDP4 software were implemented on nearly full-length genomes, structural and non-structural complete genomes and fragment A and fragment B individually. We could not nd any possible recombination events for the obtained Middle Anatolian strains.
We further calculated the dN/dS ratio to investigate the mode and strength of selection on parvoviral genomes using VP1 and rep gene coding sequences. The results showed that overall dN/dS ratios were below one for both coding sequences (0.209 for VP1 and 0.207 for rep) indicating negative selection.

Discussion
In 2019, multiple Derzsy's disease cases occurred in geese farms located in various provinces in Turkey.
In this study, we aimed to detect the presence of goose parvovirus infection in Turkish geese farms and contribute to the understanding of the epizootiology of this virus. For this purpose, we diagnosed the disease based on clinical and necropsy ndings and performed conventional PCR by using our novel primer set. We identi ed the GPV from both infected egg and the tissue samples retrieved from goose ocks. Having con rmed the disease, we sought to perform molecular characterization of the viral genome, including structural and non-structural genes. A recent retrospective study also revealed an outbreak during 2018 in this area, which was in accordance with our ndings [29].
Zadori et al. (1995) previously de ned a primer set for the VP3 position [8], which has been extensively used to identify and classify GPV elsewhere [6,30,31]. Since then, additional methods have been developed such as Real Time PCR [32], LAMP [33] and immunochromatographic assay [34]. We de ned a novel degenerated primer set for the detection of GPV amplifying VP3 coding sequence. Since we con rmed that our novel degenerated primer set matched most of the retrieved sequence data, it might also be considered as simple alternative approach for diagnostic purposes.
Both nearly full-length and VP1 coding sequence and data were individually utilized for phylogenetic analysis to evaluate initial results from partial genomic data. These two analyses showed that our eld strains clustered with European-originated strains and vaccine strains. VG32/1 and virulent B strains have been utilized as parental vaccine strains in Asian and European countries, respectively [6,35]. The structural genes of GPV is highly stable in the environment and has presented only a 0.3% variation rate in the past 20 years [36]. Furthermore, these variations accumulate mostly in the C-terminus of VP1, where the VP3 gene is located. We subsequently implemented a series of phylogenetic analyses based on VP1-VP2 fragment A (443 bp) and partial VP3 fragment B (495 bp) coding sequences. Fragment A region is a well-conserved region and is frequently used for classi cation [13,16]. Our fragment A-based phylogenic analysis showed that eld strains had the closest relationship with European group 2, which consisted of Swedish strains (DQ862008-DQ862012) [20]. Apart from this group, two more groups occurred in the European cluster according to fragment A: Hungarian strains and Low pathogenic strains and vaccines. Next, we performed phylogenetic analysis using 495 nt partial "fragment B" sequence data, which revealed that the Minor Asian strains were similarly clustered with European strains in a single genogroup. Notably, several European strains sub-branched together with Cherry duck-associated GPV strains. Since we did not detect any recombinant events between strains, our ndings evidenced the existence of endemic goose parvovirus strains that are analogous to European strains, speci cally to Swedish strains (European group 2), circulating in geese ocks in the Middle Anatolian regions. These results suggested that the nearly-complete genomic data based phylogeny method allowed more detailed classi cation results, since it gave unambiguously better contrasts within European strains and exhibited signi cant diversity than the genomic sequences. Thus, we particularly recommend analyzing full data in molecular epidemiology studies of goose parvovirus. In this point, our developed primer sets were broader in range and therefore might be good alternative for genomic investigation.
Mutations in the nucleotides and predicted amino acid sequences of VP1 were investigated and analyzed according to available data in GenBank. We detected several substitution mutations in the coding sequence homogeneously distributed between the N-and C-terminus halves of VP1; however, non-synonymous mutations were predominantly on the VP3. The C-terminus part of VP1 overlaps with VP2 and VP3 and is considered to be a determinant for GPV assembly and attenuation [36]. Eight mutual amino acid variations were shared between the Turkish eld strains and several Asian-originated strains. N706D was unique for Konya/19 and, is also one of the nine substitutions that distinguish the vaccine and wild type of the 82-0321 strain (EU583390.1) [36]; It is still common in with some of eld strains in Asia. On the other hand, the structural genes of GPV include several motifs and regions altering the pathogenesis such as the nuclear import motif [37] or the VP1-VP2 cleavage motif [36,38]. Additionally, Yu et al. (2012) identi ed seven immunodominant regions (IDs) on the structural proteins of GPV [39]. In the present study, no mutation was detected on the known motifs, which could relate to virulence.
However, K47R, G177S and E558D substitutions were found on the ID1, ID2 and ID3, respectively. In addition, ve amino acid substitutions (S498N, Y521H, L529I, E558D and G615W) were detected between the 478th and the 623rd aa residue in the VP1 gene, where the receptor binding site (RBS) is positioned [31]. Genetic diversity between the structural genes of parental and vaccine strains were predominantly accumulated in this motif as well. These results together proved the competent GPV strains with distinctive variations on the surface protein-coding genes. Thus, the immunogenic features of Minor Asian eld strains deserve to be comparatively assessed.
The rep protein of GPV is a nuclear phosphoprotein (NS), and owing to its structural domains, has multifunctional features including target-speci c DNA binding, endonuclease and helicase [40]. Our multiple comparison analysis demonstrated a series of synonymous and non-synonymous mutations through NS coding sequences of goose parvovirus. We brie y detected six mutations in total; however, four of them (V153G, Y157N, K550N and F572S) were unique in Corum/19. The two remaining mutations -S22P and M575I-are shared with Asian-originated strains. Previously, immunodominant regions were de ned on the C-terminus of the deduced amino acid sequences [41,42]. Despite the fact that we could not detect any amino acid substitution on the functional motifs, K550N, F572S and M575I were observed in the epitope-coding sequences of the rep gene. The rst two of these substitutions were in Corum/19, while the latter was found in all the Turkish strains. On the other hand, comprehensive recombination analysis did not address any events between the Turkish strains and other known GPV, MDPV and avian adeno-associated virus genomes. Taken together, we concluded the existence of European-classi ed GPV strains with low levels of diversity causing infections in the Middle Anatolia.
The evolutionary relationship between GPV strains and other known strains was also analyzed according to the complete sequences of the rep gene. Phylogenetic analysis demonstrated that the Turkish strains were closely related to both DY (EF515837.1) and the VG32/1 (EU583392.1) strains, which were isolated in China and Europe, respectively. Wan et al., (2015) suggested that the DY strain was a recombinant strain, which was also con rmed by our recombination detection analyses [43]. However, no recombination events were detected between Middle Anatolian and DY strains. Notably, nonstructural gene-based phylogeny revealed the heterogeneity between genogroups regarding the origins of the strains, which seemed to contradict VP1 phylogeny.
We looked for recombination events in the genomic data by limiting the capability of the RDP4 software. Our ndings indicated no evidence of recombination in the Turkish strains. Recombinant GPV strains with GPV and MDPV ancestors have been reported elsewhere [44,45]. The results of our bioinformatic analysis demonstrated again the newly emerged variants, which con rmed the validity of our approach. However, unidenti ed GPV and MDPV strains could potentially have contributed to the evolution of Minor Asian strains; therefore, further studies are required.
We calculated the dN/dS ratio for structural and nonstructural genes as 0.209 and 0.207, respectively. This ratio is considered to be an indicator re ecting the selective pressure on the coding sequences [46] and, in our case, the results pointed to negative selection. Evidence of strong negative selection has previously been reported in the gene coding sequences of other parvoviruses (e.g. parvovirus B19 [47] and canine parvovirus [48]). As the ratio of both genes was closer, we conjectured that the evolutional mechanisms might affect at the same rate for both genes. However, multiple factors contribute to viral evolution; therefore, the dN/dS ratio calculation alone may not be an appropriate method to infer selection pressures [49]. Indeed, further data must be generated for the non-structural genes from the European strains to validate this hypothesis.
We are aware that this study has a few limitations that might impact our ndings. First, the collection of specimens was performed from some of the Turkish provinces where the disease reported. Thus, our limited number of samples obviously could not represent the whole Anatolian territory. The second, only a few numbers of sequencing data have been characterized from Europe, most of which depend on the partial sequences of VP1 only. These limitations underlined the need for more comprehensive studies in both Turkey and Europe.
In conclusion, in this study we focused on a the Derzsy's disease outbreak in geese ocks in several provinces in Turkey. We evidenced European strain-related, virulent variants of GPV circulating in the territory of Asia Minor. This study also comparatively assessed the complete structural and non-structural speci cations of goose parvovirus strains from an outbreak with multiple foci. Multiple sequence comparison and phylogenetic analysis of the obtained eld strains demonstrated the existence of genetically related and authentic strains with unique mutations. These results provided substantial data to determine the basic characteristics and geographical distribution of variants between Asia and Europe. Furthermore, phylogenetic analyses based on the partial sequence of structural genes demonstrated the genome constellation of local strains, while complete genomic analyses proved this methodology to be accurate. We inferred that the VP1-VP2 and VP3 sequences would allow the researchers su cient data to classify GPV.

COMPLIANCE
This article does not contain any studies with human or animal subjects performed by any of the authors. The authors declare that they have no con ict of interest.