Molecular characteristics and phylogenetic analysis of novel goose parvovirus strains associated with short beak and dwarfism syndrome

Short beak and dwarfism syndrome (SBDS) emerged in Cherry Valley duck flocks in China in 2015, and novel goose parvovirus (NGPV) was shown to be the etiological agent of SBDS. To date, it is not known whether SBDS-related NGPV isolates possess common molecular characteristics. In this study, three new NGPV strains (namely, SDHT16, SDJN19, and SDLC19) were isolated from diseased ducks showing typical signs of SBDS and successfully passaged in embryonated goose or Cherry Valley duck eggs. The complete genome sequences of these NGPV strains were 98.9%–99.7% identical to each other but showed slightly less similarity (95.2%–96.1% identity) to classical GPV strains. A total of 16 common amino acid substitutions were present in the VP1 proteins of six NGPV strains (SDHT16, SDJN19, SDLC19, QH15, JS1, and SDLC01) compared with the classical Chinese GPV strains, nine of which were identical to those found in European GPV strain B. The non-structural protein Rep1 of the six NGPV strains had 12 common amino acid substitutions compared with the classical GPV strains. Phylogenetic analysis indicated that the Chinese NGPV strains clustered with the European SBDS-related NGPV strains, forming a separate branch that was distinct from the group formed by the classical GPV strains. The present study shows the common molecular characteristics of NGPV isolates and suggests that the Chinese NGPV isolates probably share a common ancestor with European SBDS-related NGPV strains.


Introduction
Short beak and dwarfism syndrome (SBDS) emerged in Cherry Valley duck flocks in China in 2015 with a morbidity rate ranging from 5% to 25% [1,2,11]. Ducks with this disease are characterized by a short beak, protruding tongue, fragile bones, and retarded growth [2,12]. Poor body weight gain and heterogeneity within the flock lead to economic losses to the duck meat industry. SBDS was first reported in the 1970s in France in mule ducks and was also reported in Poland in 1995 [14].
The causative agent of SBDS is closely related to classical goose parvovirus (GPV) and is referred to as "novel GPV" or "GPV variant" [20]. GPV and Muscovy duck parvovirus are members of the genus Dependoparvovirus in the family Parvoviridae [3]. The viral genome is a ~5.1-kb single-stranded linear DNA, with equal amounts of positive- and negative-strand DNA encapsidated in icosahedral capsids. The genome is flanked by identical inverted terminal repeats (ITRs) and contains two ORFs, with the left ORF encoding the Rep protein and the right ORF encoding the capsid protein [21]. The Rep ORF produces the largest protein Rep1 and several low-molecular-weight Rep proteins from spliced mRNAs, which are involved in genome replication, interaction with the ITRs, and modulation of the downstream P41 promoter [10,13,16,18]. By alternative mRNA splicing and protease cleavage, the right ORF produces three structural proteins (VP1, VP2, and VP3), which assemble into the viral capsid in a ratio of 1:1:8 [7,16]. GPV DNA can be detected by PCR in the internal organs of diseased ducks showing SBDS; however, viral isolation from tissues (liver, spleen, and kidney) is not always successful [1,20]. Viral isolation can be achieved in susceptible duck embryos by inoculation with tissue homogenate. Blind passages in embryonated duck eggs are necessary to establish good adaptation of primary novel goose parvovirus (NGPV) isolates [1]. Since the outbreak of SBDS in China, several genome sequences of NGPV isolates have been determined [1,12,20], but whether these isolates possess common molecular characteristics and similar propagation capability in avian embryos remains unclear. Moreover, the evolutionary relationship between NGPV isolates, classical GPV strains, and European SBDS-related GPV isolates remains to be elucidated.
In the present study, three NGPV strains were isolated from SBDS cases during disease outbreaks in Cherry Valley duck flocks between 2016 and 2019, and their genomes were sequenced. Sequence alignments revealed that all Chinese NGPV strains harbored common amino acid mutations in the coding proteins when compared to classical GPV strains. Moreover, two NGPV strains isolated in different years exhibited obvious differences in their ability to be propagated in embryonated duck eggs. Phylogenetic analysis indicated that the Chinese NGPV isolates probably originated from the European SBDS-related NGPV.

Materials and methods

Viral isolation and passage
Internal organ tissues, including liver, kidney, and spleen, were sampled between 2016 and 2019 from diseased Cherry Valley ducks showing typical SBDS in Shandong Province, China. The tissue samples were placed in sterile saline for homogenization. The homogenate was clarified by centrifugation at 3000 × g, and the supernatant was transferred and passed through a filter with a 0.22-µm pore diameter. The filtrate was supplemented with penicillin (1000 IU/mL) and streptomycin (1000 µg/mL) and then used to inoculate susceptible 9-day-old embryonated goose eggs via the chorioallantoic membrane route. The eggs were incubated at 38 °C for 12 days and candled three times each day. Dead embryos were kept at 4 °C for 6 h, and the allantoic fluid was pooled. The embryos were further examined for pathogenic lesions. The first-generation allantoic fluid containing virus was diluted 1:10 with sterile saline, and this dilution was used to inoculate another five 9-day-old goose eggs or Cherry Valley duck eggs for passage. The mean death time (MDT) was determined for isolates SDJN19 and SDHT16. Briefly, sixteen 9-day-old embryonated duck eggs were inoculated with each isolate, the time of each embryo death was recorded, and the mean value was calculated for each isolate.

Determination of the median egg lethal dose
To evaluate whether the NGPV strains isolated in 2016 and 2019 differ in their ability to be propagated in embryonated duck eggs, the median embryo lethal dose (ELD50) of two NGPV strains was calculated by the method of Reed and Muench [17]. Briefly, the viral stock in the form of allantoic fluid was diluted in sterile saline to give a tenfold dilution series from 10^-1 to 10^-7. For each dilution, 0.2 mL of the virus suspension was inoculated into 9-day-old susceptible Cherry Valley duck eggs (five eggs per dilution) via the allantoic cavity. The inoculated eggs were incubated continuously at 37 °C and examined three times daily for seven days. The times of any embryo deaths were recorded (hours post-inoculation). The ELD50 experiments were repeated three times, and the mean value was calculated.
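The Reed and Muench endpoint calculation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' worksheet: the function names and the death counts in the usage example are hypothetical, and the only values taken from the text are the tenfold series (10^-1 to 10^-7), five eggs per dilution, and the 0.2-mL inoculum.

```python
import math


def reed_muench_log10_titer(log10_dilutions, deaths, totals):
    """Return log10 of the 50% endpoint titer per inoculum volume.

    log10_dilutions must run from the most concentrated to the most
    dilute sample, e.g. [-1, -2, ..., -7].
    """
    n = len(log10_dilutions)
    survivors = [t - d for d, t in zip(deaths, totals)]
    # An embryo killed at a given dilution is assumed to die at every more
    # concentrated dilution, so deaths accumulate from the dilute end.
    cum_dead = [sum(deaths[i:]) for i in range(n)]
    # A survivor is assumed to survive every more dilute sample, so
    # survivors accumulate from the concentrated end.
    cum_alive = [sum(survivors[: i + 1]) for i in range(n)]
    mortality = [cd / (cd + ca) for cd, ca in zip(cum_dead, cum_alive)]
    for i in range(n - 1):
        above, below = mortality[i], mortality[i + 1]
        if above >= 0.5 > below:
            proportionate_distance = (above - 0.5) / (above - below)
            step = log10_dilutions[i] - log10_dilutions[i + 1]
            endpoint = log10_dilutions[i] - proportionate_distance * step
            return -endpoint
    raise ValueError("50% endpoint is not bracketed by the dilution series")


def log10_titer_per_ml(log10_titer_per_inoculum, inoculum_ml=0.2):
    """Convert a titer per inoculum volume to a titer per mL."""
    return log10_titer_per_inoculum + math.log10(1.0 / inoculum_ml)


# Hypothetical death counts, for illustration only.
example_log10 = reed_muench_log10_titer(
    [-1, -2, -3, -4, -5, -6, -7],
    deaths=[5, 5, 4, 2, 1, 0, 0],
    totals=[5] * 7,
)
```

With these made-up counts the endpoint falls between the 10^-3 and 10^-4 dilutions. Dividing by the 0.2-mL inoculum multiplies the titer by 5, which would explain the leading factor of 5 in the ELD50 values reported per mL.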

Whole-genome sequencing
Seven sets of primers were designed on the basis of the genomic sequence of the classical GPV strain LH (accession number KM272560, Table 1). These primers and PrimeSTAR Max DNA polymerase (Takara, Dalian, China) were used to amplify overlapping fragments. Apart from the 200-bp terminal fragments containing half of the ITR, the five internal DNA fragments were gel-purified using a TIANgel Midi Purification Kit (Tiangen, Beijing, China) and then subjected to direct sequencing using the amplification primers. An "A" tail was added to the 200-bp terminal fragments using Taq DNA polymerase. The fragments were then ligated into the pMD19T vector (Takara, Dalian, China), and competent DH5α cells were transformed with the resulting constructs. Positive clones were screened by sequencing, and the resulting sequences were combined to assemble the whole genome sequence using the SeqMan program packaged in Lasergene 7.0 (DNASTAR, Madison, USA).
In addition, the ITR nucleotide sequences of three strains were confirmed by sequence analysis of recombinant plasmids carrying the viral DNAs purified directly from the virions without PCR amplification. Virus purification and concentration from the allantoic fluid and extraction of genomic DNA were performed as described previously [19]. The extracted single-stranded DNA (ssDNA) was suspended in STE buffer (10 mM Tris, 1 mM EDTA, 100 mM NaCl, pH 8.0) and annealed by heating to 95 °C for 5 min, followed by slow cooling to 55 °C. The double-stranded DNA (dsDNA) was digested with NcoI, separated by electrophoresis, and purified using a gel extraction kit (Tiangen, Beijing, China). The 1.3-kb DNA fragment containing the 3′-end ITR was ligated into the NcoI-SmaI site of the pBSKNB plasmid [19] and used to transform competent cells of the Sure strain of E. coli. Positive clones were chosen for sequencing. To overcome difficulties in sequencing ITR regions with hairpin structures, the ITR was digested by SphI, which cuts the ITR in the middle loop region, and the resulting fragments were subcloned into the pUC18 plasmid for sequencing.

Sequence alignment and phylogenetic analysis
Fourteen strains, including six classical GPV strains and eight SBDS-related NGPV strains, were used in this study for sequence alignment and phylogenetic tree construction. JS1, QH15, and SDLC01 are representative NGPV strains that were isolated at the early stage of the SBDS outbreak in China (Table 2). Except for D146/02 and D697/3/06, both of which are European SBDS-related NGPV isolates [14] for which only partial VP1 gene sequences are available in the GenBank database, complete genome sequences were analyzed. Sequence comparisons of the encoded proteins of the classical GPV strains and SBDS-related NGPV strains were performed using the MegAlign program packaged in Lasergene 7.0. Given that the European SBDS-related NGPV strains have only partial VP1 gene sequences (at the N-terminus) available in the GenBank database, a phylogenetic tree was constructed using a 427-nt VP1 gene fragment to investigate the possible evolutionary origin of the Chinese SBDS-related NGPV strains. The tree was constructed using MEGA 7.0 [9] by the maximum-likelihood method based on the Kimura 2-parameter model according to the Bayesian information criterion (BIC) score. Bootstrap values were calculated based on 1000 replicates.
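For readers unfamiliar with the Kimura 2-parameter model, the sketch below computes its pairwise distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It illustrates only the substitution model. It is not the maximum-likelihood tree search performed in MEGA, and the function name and the handling of gaps (columns with gaps or ambiguous bases are skipped) are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences (not GPV data): one transition in ten sites.
toy_distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

The model weights transitions and transversions separately, which is why a single A-to-G change in ten sites gives a distance slightly above the raw 0.1 proportion of differing sites.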

Statistical analysis
Normality tests indicated that the mean death time (MDT) values for duck embryos after inoculation with the SDHT16 or SDJN19 isolate formed a Gaussian distribution. Hence, the two-tailed Student's t-test was used to compare the MDT values between SDHT16 and SDJN19, using GraphPad Prism software version 6.01 (GraphPad Software Inc., San Diego, CA, USA). P < 0.05 was considered statistically significant. The data are expressed as the mean ± standard deviation (SD).
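The statistic behind this comparison can be sketched with the Python standard library. The example assumes the equal-variance (pooled) form of Student's t-test, which is an assumption about the test variant rather than something stated above, and the death times used are hypothetical placeholders, not the recorded data.

```python
import math
import statistics


def students_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for the pooled-variance Student's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical death times in hours, for illustration only.
t_value, dof = students_t([10, 12, 14, 16], [4, 6, 8, 10])
```

The two-tailed P value is then read from the t distribution with the returned degrees of freedom. For two groups of 16 embryos (30 degrees of freedom), an absolute t value above about 2.75 corresponds to P < 0.01.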

Results

Virus isolation and passage in embryonated goose eggs
Hemorrhagic lesions in the head, neck, and legs typical of parvoviral infection were observed (Fig. 1). The three NGPV isolates were named SDHT16, SDJN19, and SDLC19.

NGPV propagation in Cherry Valley duck embryos and ELD50 calculation
Strains SDHT16 and SDJN19 were successfully passaged in 9-day-old embryonated Cherry Valley duck eggs (Fig. 1). Strain SDHT16 killed 16 duck embryos between 106 h and 197 h postinfection (average, 150 h), and strain SDJN19 killed 16 duck embryos between 69 h and 163 h postinfection (average, 104.6 h). Statistical analysis indicated that the differences in the mean death times were extremely significant between the two NGPV strains (P < 0.01) (Fig. 2). The median egg lethal dose (ELD50) in embryonated Cherry Valley duck eggs was calculated for strains SDHT16 and SDJN19 and found to be 5 × 10^2.5/mL and 5 × 10^5.4/mL, respectively, demonstrating that strain SDJN19 is propagated more efficiently in embryonated Cherry Valley duck eggs than strain SDHT16.
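To put the two titers on a common scale, the short calculation below converts the difference in exponents into a fold difference. It assumes that both values are expressed in the same units (ELD50 per mL), so that the shared factor of 5 cancels. The function name is ours.

```python
def fold_difference(log10_titer_a, log10_titer_b):
    """Ratio of two titers given as log10 exponents with identical prefactors."""
    return 10 ** (log10_titer_a - log10_titer_b)


# Exponents reported above for SDJN19 (5.4) and SDHT16 (2.5).
ratio = fold_difference(5.4, 2.5)  # 10^2.9, roughly 794-fold
```

Under that assumption, the SDJN19 stock was nearly three orders of magnitude more infectious for duck embryos than the SDHT16 stock.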

Genome sequencing and sequence comparisons
The whole genomes of three NGPV isolates were amplified by PCR and sequenced. The genome of strain SDHT16 is 5050 bases long, which is four bases shorter than that of strains SDJN19 and SDLC19. All nucleotide deletions and insertions occurred in the ITR without exception. The three NGPV strains shared 98.9%-99.7% sequence identity at the genomic level but displayed slightly less similarity (95.2%-96.1% identity) to the classical GPV strains (strain B, 82-0321, 06-0329, YZ99-6, and LH). The genome sequences of three NGPV strains were deposited in the GenBank database with the accession numbers MN356043, MN356044, and MN356045.
Amino acid sequence alignments of the VP1 proteins of the six NGPV strains (SDHT16, SDJN19, SDLC19, SDLC01, JS1, and QH15) and the classical GPV strains showed that, compared with the Chinese GPV strains, the six NGPV strains harbored 16 common amino acid alterations, nine of which were identical to those found in European GPV strain B.
Fig. 2 Comparison of the mean death times of Cherry Valley duck embryos after infection with strain SDHT16 or SDJN19. **, Extremely significant difference between the SDHT16- and SDJN19-infected groups (P < 0.01).
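The search for substitutions common to all NGPV strains amounts to scanning the alignment column by column. The sketch below shows one way to do this. It assumes that a "common" alteration is a position where every reference sequence carries one residue and every NGPV sequence carries a different one, and the strain labels and peptide sequences are toy placeholders, not VP1 data.

```python
def common_substitutions(reference_group, variant_group):
    """List (position, reference_residue, variant_residue) for alignment
    columns that are uniform within each group but differ between groups.

    Both arguments map a strain label to an aligned protein sequence.
    Positions are 1-based.
    """
    sequences = list(reference_group.values()) + list(variant_group.values())
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must be aligned to the same length")
    hits = []
    for i in range(length):
        ref_residues = {s[i] for s in reference_group.values()}
        var_residues = {s[i] for s in variant_group.values()}
        if len(ref_residues) == 1 and len(var_residues) == 1 \
                and ref_residues != var_residues:
            hits.append((i + 1, next(iter(ref_residues)), next(iter(var_residues))))
    return hits


# Toy aligned peptides with hypothetical labels.
classical = {"classical_a": "MSQLTA", "classical_b": "MSQLTA"}
novel = {"novel_a": "MSLLTS", "novel_b": "MSLLTS", "novel_c": "MSLITS"}
shared = common_substitutions(classical, novel)
```

In the toy alignment, position 4 is not reported because the variant group is not uniform there, which mirrors the requirement that an alteration be present in all six NGPV strains.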

Comparison of ITR sequences
The ITRs of the three NGPV strains (SDHT16, SDJN19, and SDLC19) shared 98.1%-100.0% nucleotide sequence identity with each other but only 93.0%-94.0% identity with the classical GPV strain LH. Alignments of ITR sequences revealed that strains SDHT16, SDJN19, and SDLC19 have two, four, and four nucleotide insertions, respectively, in the stems of the palindrome region compared with the classical GPV strain LH (Fig. 4).
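Percent identity values of this kind are computed from a pairwise alignment. The sketch below shows one common convention, in which a gap opposite a base counts as a mismatch, so that insertions lower the identity. MegAlign may treat gaps differently, so this convention, the function name, and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences, counting gaps as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column is empty in both sequences
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / columns


# Toy alignment (not ITR data): one insertion and one substitution in ten columns.
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACGTT")
```

Under this convention, a four-nucleotide insertion in a short region such as the ITR noticeably lowers the identity to a reference strain, even when the surrounding sequence is unchanged.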

Phylogenetic analysis
Because only partial sequences of the European SBDS-related NGPV strains were available, a 427-nt fragment from the N-terminus of the VP1 gene was used to construct a phylogenetic tree. The results revealed that the Chinese SBDS-related NGPV strains SDHT16, SDJN19, SDLC19, SDLC01, JS1, and QH15 clustered together and grouped with the European SBDS-related NGPV strains (D697/3/06 and D146/02), forming a separate branch (Fig. 5). The four Chinese GPV strains (LH, YZ99-6, 82-0321, and 06-0329) clustered closely together and grouped with two European GPV strains (VG32/1 and B), forming another separate branch. On the basis of the results of phylogenetic analysis, it can be concluded that the Chinese SBDS-related NGPV and the European SBDS-related NGPV probably evolved from a common ancestor.
Table 3 Amino acid sequence alignment of the Rep1 proteins of the six NGPV strains and the classical virulent strains of GPV
Table 4 Amino acid sequence alignment of the VP1 proteins of the six NGPV strains and the classical virulent strains of GPV

Discussion
Virus isolation and PCR characterization are the preferred methods of diagnosing SBDS cases. In contrast to the parvoviral disease of geese (Derzsy's disease), for which isolation of GPV is easily achieved using embryonated goose eggs [5,6,15], virus isolation in cases of SBDS is not always successful [1,14,20]. The primary virus isolate may be lost during subsequent blind passages in duck embryos; hence, the isolation rate from positive samples is comparatively low. In this study, goose embryos were used for virus isolation from SBDS cases. Although only a limited number of goose embryos died after inoculation with positive samples, primary isolates were able to kill goose embryos in a shorter time in subsequent passages. These results demonstrate that embryonated goose eggs are still a suitable platform for isolating NGPV from SBDS cases. However, it should be noted that classical GPV usually kills 12-day-old goose embryos between 3 and 7 days after inoculation with tissue samples from cases of Derzsy's disease [6], suggesting that classical GPV replicates more efficiently in goose embryos than NGPV.
Strains JS1, QH15, and SDLC01 were isolated previously by other groups in China in 2015 [1,12,20]. The isolation year of SDHT16 was closer to that of strains JS1, QH15, and SDLC01 than to those of SDJN19 and SDLC19. The VP1 and Rep1 proteins of SDHT16 displayed 99.8%-99.9% nucleotide sequence identity to those of strains JS1, QH15, and SDLC01, but only 99.1%-99.4% and 99.3%-99.5% identity to those of strains SDJN19 and SDLC19, respectively. This indicates that strain SDHT16 is more closely related to the early NGPV isolates than to the 2019 NGPV isolates.
Strains SDJN19 and SDLC19 share higher genome sequence identity (99.8%) with each other than with SDHT16 (99.1% and 99.0%, respectively). To evaluate whether the 2019 NGPV isolates differ from SDHT16 in their replication ability, further comparisons were conducted between strains SDJN19 and SDHT16. We found that both strains could be passaged in embryonated Cherry Valley duck eggs, but on the basis of the ELD50 and MDT values, strain SDJN19 was found to be propagated more efficiently in embryonated Cherry Valley duck eggs than SDHT16. Given that classical GPV cannot be propagated in embryonated Cherry Valley duck eggs, we infer that the minor nucleotide sequence differences that were observed probably contribute to the observed differences in propagation efficiency. Whether strains SDJN19 and SDHT16 differ in their pathogenicity in Cherry Valley ducks deserves to be investigated.
The ITR region serves as the replication origin of the genome, containing a number of transcription factor binding sites and short nucleotide repeats (TCCGGT) [19,21]. Compared with the classical GPV strain LH, nucleotide mutations were present in the ITR region of the three NGPV isolates. The ITR sequences of two 2019 NGPV isolates (SDJN19 and SDLC19) were 99.5% identical, but only 97.6%-98.1% identical to that of SDHT16. In addition, three NGPV isolates harbored one or two additional nucleotide insertions in the stem region of the ITR. However, all of these nucleotide mutations and insertions were found to be located in the base-pairing positions; hence, the correct hairpin structure formation of the ITR was not influenced. Altogether, the comparison of the ITR sequences of the three NGPV strains demonstrated a closer relationship between two 2019 NGPV isolates, which is in agreement with the results of comparisons of the protein coding regions.
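The argument that compensated changes in the stem leave the hairpin intact can be illustrated with a simple complementarity check. The sketch below tests whether the two arms of a stem are exact reverse complements. It considers Watson-Crick pairs only and uses short hypothetical arms, not the GPV ITR sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stem_is_paired(left_arm, right_arm):
    """True if the two stem arms can base-pair along their full length."""
    return reverse_complement(left_arm) == right_arm.upper()


# Hypothetical stem arms, for illustration only.
original = stem_is_paired("GACCT", "AGGTC")
# An insertion in the left arm matched by the complementary insertion in the right arm.
compensated = stem_is_paired("GACGCT", "AGCGTC")
# The same insertion without its partner leaves an unpaired base.
uncompensated = stem_is_paired("GACGCT", "AGGTC")
```

An insertion that appears in both arms at complementary positions keeps the stem fully paired, whereas an insertion in one arm alone would introduce a bulge.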
Characteristic amino acid variations common to the NGPV isolates were identified in this study. These alterations presumably played a key role in the host transition from goose to Cherry Valley duck. These amino acid alterations were not evenly distributed throughout the protein sequences. In the Rep1 protein, eight of the 12 site substitutions were located within the one-third of the peptide sequence adjacent to the carboxyl terminus. The C-terminus of the Rep1 protein contains a domain involved in transactivation of the P41 promoter, which modulates VP1 gene transcription [13]. In the VP1 protein, half of the 16 alterations were located in the VP1u region. VP1u contains the phospholipase A2 (PLA2) domain, which is required for parvovirus entry and infectivity [8,22]. Further analysis showed only one amino acid alteration (Q89L) in the PLA2 motif (53-111 aa) of VP1u of GPV [22], and seven other mutated sites in the VP1u lay outside of the PLA2 motif. The amino acid residues outside the PLA2 motif are also important for maintaining the proper three-dimensional structure for the PLA2 activity [4]. Together, the common amino acid mutations identified in this study are likely to have contributed to the host range shift of NGPV.
Sequence alignments and comparisons revealed that the six Chinese NGPV isolates have a closer relationship to European strain B than to the Chinese GPV strains. In the phylogenetic tree, the Chinese NGPV isolates grouped with the European SBDS-related NGPV strains and formed a separate branch, distinct from the branch formed by the classical GPV strains. This indicates that the Chinese NGPV isolates and their European counterparts probably originated from a common ancestor. The European SBDS-related NGPV was isolated from diseased mule ducks, which are a sterile intergeneric cross of Pekin and Muscovy ducks [14]. Therefore, whether a characteristic difference exists at the genome level between the European and Chinese SBDS-related NGPV isolates needs to be elucidated. Whole-genome sequencing of the European SBDS-related NGPV isolates may be able to shed light on this matter.
In summary, three new SBDS-related NGPV isolates were obtained in this study, and their genomes were sequenced. Two NGPV strains isolated in different years differed in their ability to be propagated in embryonated Cherry Valley duck eggs. Characteristic amino acid differences between the NGPV strains and classical GPV strains were identified, which presumably contributed to the host range shift. Further phylogenetic analysis supported the conclusion that the Chinese NGPV isolates and the European SBDS-related NGPV probably originated from a common ancestor.
Finally, the VP1 protein of NGPV shares almost 95% amino acid sequence identity with classical GPV despite the characteristic amino acid differences. The structural proteins VP1, VP2, and VP3 commonly constitute the protective antigens of GPV [7]. Hence, whether the traditional GPV vaccine currently in use against Derzsy's disease can be applied in Cherry Valley ducks to prevent SBDS deserves to be thoroughly investigated.