First Detection of Lake Sinai Virus in the Czech Republic: Possibility of Novel LSV Species

Lake Sinai virus (LSV) is one of more than twenty honeybee viruses. Officially, there are two species: LSV 1 and LSV 2. However, only a limited number of whole-genome sequences is currently available, and the genetic variability of the virus indicates that more than two species exist. Nucleic acids extracted from honeybee samples were screened by PCR for the presence of selected honeybee viruses. LSV was the third most abundant virus (detected in 36.9% of samples) after Apis mellifera filamentous virus (72.2%) and Deformed wing virus (52.5%). LSV-positive samples underwent an additional PCR with primers targeting the region coding for the RNA-dependent RNA polymerase of the virus. The PCR products were sequenced and the acquired sequences were used for a first phylogenetic analysis. Based on the results, several of the isolates were selected for whole-genome sequencing, and the sequences obtained were used for additional phylogenetic analyses and construction of dendrograms. The results indicate the presence of at least three genetically distinct groups of LSV in the Czech Republic, the major one being related to LSV 2 but too distinct to be considered the LSV 2 species. Two sequences from the major Czech cluster of LSV strains were successfully acquired and are the first Czech LSV strains published to date.


Introduction
The western honeybee (Apis mellifera L.) is a globally distributed domesticated insect, valued mainly for honey production and crop pollination services. It therefore represents a key component of biodiversity preservation. Honeybees are super-generalists and, as such, versatile and convenient pollinators that contribute to the maintenance of a wide variety of both agricultural crops and wild plants [1][2][3]. In general, insect pollinators are affected by a variety of abiotic and biotic factors, which act as potential drivers of pollinator loss. Habitat loss and climate change are examples of abiotic factors, while biotic factors comprise a diverse range of parasites and pathogens [4][5][6].
Single-strand positive-sense RNA viruses represent a significant portion of honeybee-infecting agents. Viral infections often go unnoticed because they lack characteristic clinical symptoms, so viruses were originally given little importance in the context of honeybee health. However, honeybee-infecting viruses received more attention with the global spread of Varroa destructor, a parasitic mite of honeybees that can facilitate symptomatic viral infection in honeybees [6][7][8][9][10]. With the concerns surrounding the causes of global pollinator decline [4] and with the description of colony collapse disorder, a complex phenomenon in which the majority of worker bees are lost without a link to one single cause [11], honeybee viruses became a widely studied topic.
Lake Sinai virus (LSV) is a recently discovered monophyletic viral complex with a single-strand positive-sense RNA genome. The virus was first detected and described by Runckel et al. in 2011, when it was discovered along with three other novel honeybee viruses. The discovery was facilitated by ultra-deep sequencing of honeybee samples originating from migratory colonies. Two strains of the virus were identified at the time, which are now recognized by ICTV as the species Lake Sinai virus 1 (LSV 1) and Lake Sinai virus 2 (LSV 2) within the genus Sinaivirus [12,13]. The viral complex is genetically diverse; Sinaivirus can be separated into 4 main phylogenetic clades [14,15]. It was previously reported that a single honeybee can be co-infected by several LSV strains belonging to different, genetically distinct "groups" [16]. LSV was recently fully classified within the Riboviria realm and placed into the newly formed family Sinhaliviridae within the order Nodamuvirales. The classification now reflects the genetic closeness to nodaviruses, as the two families, Nodaviridae and Sinhaliviridae, now share the order [17].
LSV is also related to two currently unclassified viruses, including Halictus scabiosae Adlikon virus (HsAV), isolated from bees of the genus Halictus.
The role of the virus in the pathological process and its epidemiological significance have not yet been completely clarified. The virus has not been linked to any visible symptoms of infection; however, LSV 1 and LSV 2 were the most frequent pathogens (together with Black queen cell virus and Nosema ceranae) in weak or collapsed colonies in the USA when compared to healthy or recovered colonies [19]. Similar results come from a study in Spain, where LSV, along with N. ceranae, was omnipresent in the examined samples of collapsing honeybee colonies, while other detected pathogens and parasites were much less prevalent [20]. Another study, focused on a Spanish honeybee colony with symptoms of Colony Collapse Disorder (CCD), found viral loads of LSV along with Israeli acute paralysis virus and Aphid lethal paralysis virus in the examined samples of worker bees from the collapsing colony [21]. However, Ravoet et al. did not find a significant difference in LSV prevalence between collapsed and surviving colonies [22]. Recently published work by Faurot-Daniels et al. suggests an inverse relationship between LSV 2 prevalence and honeybee colony health [23].
The genetic closeness of viral strains isolated from honeybees and bumblebees at the same location suggests the possibility of horizontal transmission between different pollinators [15]. This could be facilitated by contaminated pollen pellets. The virus was also detected in the parasitic mite V. destructor; however, it does not actively replicate in the mite, which potentially acts as a mechanical vector at most [17]. Previous detection of the virus in bee eggs hints at the possibility of vertical transmission [26].
As far as is known, the virus is not a major threat to honeybee health, and its phylogeny has been the focus of only a few previously published papers. Thus, the aim of our study was to screen honeybee samples from the Czech Republic for the presence of Lake Sinai virus and, additionally, to characterize the Czech LSV strains.

Sample collection
Initial screening for honeybee-infecting viruses was carried out on a collection of 209 samples during 2015. Each sample consisted of six bees from one colony. Worker bees were collected from experimental apiaries of the Bee Research Institute at Dol, which are located in different parts of the Czech Republic. Samples were collected from honeybee colonies with varying health conditions. Sampled bees were frozen and stored at -80 °C until nucleic acid extraction.
Homogenates were centrifuged for 1 min at 13,000 rpm, and viral DNA and RNA were then extracted from 100 µl of the supernatant using TRI Reagent (Sigma Aldrich, USA) according to the manufacturer's instructions. The nucleic acids obtained were stored at -80 °C until further use.

Molecular detection of viruses
The detection of RNA and DNA honeybee viruses by RT-PCR or PCR was described previously [31]. The presence of genomic nucleic acid of the following honeybee viruses was surveyed:

Whole-genome sequencing and phylogenetic analysis
Sequences of the LSV PCR products from the RdRp region were obtained via commercial Sanger sequencing (Eurofins, Germany). These sequences were deposited in GenBank under accession numbers OK245389-OK245413. Along with the LSV sequences available in the database, our sequences were used to construct a dendrogram in MEGA X software (http://www.megasoftware.net/home). The neighbour-joining clustering method was selected to construct the phylogenetic tree. The Tamura 3-parameter model [32] was employed to calculate the evolutionary distances, and a bootstrap test of 1000 replicates was performed to determine the reliability of the dendrogram. The evolutionary relationships illustrated in the dendrogram were used to select genetically variable strains for whole-genome sequencing.
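As an illustration of the distance model named above, the following Python sketch computes the Tamura 3-parameter distance between two aligned nucleotide sequences. It is not the MEGA X implementation used in this study. The function name `tamura3_distance`, the pooling of G+C content over the two compared sequences, and the skipping of gaps and ambiguous bases are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}


def tamura3_distance(seq_a, seq_b):
    """Tamura 3-parameter distance between two aligned nucleotide sequences.

    d = -h*ln(1 - P/h - Q) - 0.5*(1 - h)*ln(1 - 2Q), with h = 2*theta*(1 - theta),
    where P and Q are the proportions of transitions and transversions and
    theta is the G+C content (assumption: pooled over the two sequences).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Assumption: columns with gaps or ambiguous bases are ignored.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")

    # A substitution within purines or within pyrimidines is a transition.
    transitions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) == (b in PURINES)
    )
    transversions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) != (b in PURINES)
    )
    p = transitions / n
    q = transversions / n

    gc = sum((a in "GC") + (b in "GC") for a, b in pairs) / (2 * n)
    h = 2 * gc * (1 - gc)
    if h == 0:
        raise ValueError("G+C content of 0 or 1: distance is undefined")

    arg_ts = 1 - p / h - q
    arg_tv = 1 - 2 * q
    if arg_ts <= 0 or arg_tv <= 0:
        raise ValueError("sequences too divergent for this model")

    return -h * math.log(arg_ts) - 0.5 * (1 - h) * math.log(arg_tv)
```

When the G+C content is exactly 0.5, h equals 0.5 and the expression reduces to the Kimura 2-parameter distance, which is a convenient sanity check for any implementation.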
We next aimed to obtain whole-genome sequences of Lake Sinai virus. Four LSV strains from different clusters of the RdRp dendrogram, each with RNA of sufficient quality, were selected for whole-genome sequencing on the Illumina NovaSeq 6000 platform.
The library preparation, sequencing and mapping of nucleotide sequences to honeybee RNA virus reference sequences were performed by a commercial provider (SeqMe, Dobříš, Czech Republic). The average number of 150 bp paired-end reads per sample was 151,273,000; the percentage of properly mapped reads ranged from 0.61% to 46.33%, with an average of 13.29%. Nucleotide sequences of the Czech LSV strains were assembled into contigs using the high-performance graphical viewer Tablet [33].
Only two whole LSV sequences were obtained. They were analysed in MEGA X; the dendrograms were prepared with the neighbour-joining method and the evolutionary distances were calculated with the Tamura 3-parameter model [32]. To assess the reliability of the constructed phylogenetic trees, a bootstrap test with 1000 replicates was used. A bootstrap value >75% indicates satisfactory topology of the phylogenetic tree branches; a bootstrap value of 95-100% is very good. The sequences of LSV strains 88/15 (three whole genes) and 587/15 (four whole genes) described in this study were submitted to GenBank under accession numbers MZ773494 and MZ773495. Accession numbers of the described Czech LSV strains, as well as of the other analysed strains, are listed in a table in Online Resource 1.
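The resampling step behind a bootstrap test can be sketched in a few lines of Python. The sketch below only generates the pseudo-replicate alignments by drawing alignment columns with replacement; building a tree from each replicate and counting how often each branch recurs is left out. The function names, the dictionary-based alignment format and the `seed` parameter are hypothetical choices made for this illustration, not part of MEGA X.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: columns drawn with replacement.

    `alignment` maps a sequence name to an aligned sequence string
    (hypothetical format chosen for this sketch).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()

    # The same sampled column indices are applied to every sequence,
    # so each resampled column keeps its original site pattern.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate `n_replicates` pseudo-replicate alignments reproducibly."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

The bootstrap value of a branch is then the percentage of replicate trees in which the same group of taxa clusters together.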

Prevalence of honeybee viruses
Honeybee viruses, together with other honeybee pathogens, are suggested to be the main causes of honeybee colony losses in modern beekeeping. In our virological survey of honeybee samples, nine of the eleven examined viruses were detected, as presented in Table 1. Only seven samples (3.3%) were negative for all screened viruses, and most of the samples contained more than one virus. Similar results have come from virological surveillance of apiaries in other European countries, e.g. Austria [34], Spain [35] and France [36]. The most prevalent viruses, each detected in more than half of the samples, were AmFV (72.2% positive samples) and DWV (52.5% positive), which is in concordance with previously published research [37,9]. LSV was detected in 36.9% of samples, making it the third most frequently detected virus in our sample pool. LSV was first described in the USA, where it is widespread [10]; however, it has also been detected in Europe in great numbers [15][16][17].

Phylogenetic analyses of LSV
We successfully obtained partial (603 bp) LSV RdRp sequences from 26 samples. The sequences were used to construct the dendrogram shown in Fig. 1. As many as 19 (73.1%) of the sequences clustered close to the lineage of the LSV 2 reference sequence. However, the Czech isolates formed a distinct group within the dendrogram, which makes them more likely a close relative of LSV 2 than the LSV 2 species itself, as was also supported by the percentage similarity between the strains.
We were able to obtain sequences of sufficient length and quality from two of the four LSV strains selected for whole-genome sequencing. Our two isolates are very similar to each other, sharing 96.08% identity in nucleotide sequence and 97.16% in translated amino acid sequence. According to the dendrogram based on the whole coding nucleotide sequences (Fig. 2), ... [39]. When compared with the geographically closest available strain, M92/2010 (MG918125.1), obtained in Slovenia by Šimenc et al. [27], the Czech LSV strains showed an amino acid sequence identity of 66.51% (isolate 88/15) and 66.17% (isolate 587/15). The Slovenian strain was described as Lake Sinai virus lineage 3, an independent lineage genetically close to LSV 1 [40]. The relatively low amino acid sequence similarity between the Slovenian and Czech strains is not surprising, given that the Czech strains are more similar to the other of the two LSV species, LSV 2. The geographic closeness of the countries of origin is therefore not reflected in the genetic closeness of the available LSV strains. However, strain M92/2010 is also the only other Central European strain available in the database, so the relationship between geographic and genetic distance cannot be properly determined.
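Percentage identity values such as those reported above are computed from a pairwise alignment. A minimal Python sketch follows, assuming that the two sequences are already aligned and that columns containing a gap in either sequence are excluded from the comparison. The source does not state how gaps were treated, so this convention and the function name `percent_identity` are assumptions.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical positions between two aligned sequences.

    Works for nucleotide or amino acid strings. Assumption: columns with
    a gap in either sequence are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

Counting gapped columns as mismatches instead would lower the reported identity, so the convention should be stated whenever values from different studies are compared.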
Aside from the whole coding sequence dendrogram, further dendrograms were constructed using only specific regions of the coding sequence: the ORF1 and RdRp region (Fig. 3a), the capsid region (Fig. 3b) and the ORF4 region (Fig. 3c). The ORF1 and RdRp coding sequences were merged, as the two regions overlap. The amino acid similarity of the strains used in the phylogenetic analysis ranged from 57.92% to 96.92% in the region coding ORF1.
Some degree of difference between even closely related strains is expected, especially in single-stranded RNA viruses such as LSV. Due to high mutation rates, the virus exists in the form of quasispecies, a genetically diverse spectrum of viral variants forming around a central mutant [40][41][42]. As already mentioned, while the Czech strains 88/15 and 587/15 are genetically closer to LSV 2 than to LSV 1, they belong to a separate group within the clade, together with strain SA1 (and strain VBP256), as the nucleotide similarity between these strains exceeds 95%. The similarity between other strains of the LSV 2 species and the Czech strains is significantly lower (less than 75%). According to the quasispecies model of RNA viruses, the group formed by 88/15 and 587/15 could have arisen from the same master virus variant as LSV 2, or it could be a progeny of the LSV 2 master variant that accumulated enough mutations to create a separate group within the lineage [40][41][42]. This variant is favoured by natural selection, as the sequences found in the Czech Republic were not unique to the country but have close relatives in geographically distant areas, such as South Africa. Even though there are currently only two official species of the virus, the dendrograms make it apparent that more genetically divergent groups exist within the genus, and with a higher number of available sequences, more official LSV species may be recognized in the future.
In conclusion, we successfully obtained two sequences of LSV and identified the important parts of their coding regions: ORF1, the RdRp coding region and the capsid protein coding region. The two sequences we acquired are the first Czech LSV strains in the database to date. The strains form a distinct genetic group with previously published strains. We also determined that the obtained strains belong to the most prevalent variant of the virus in the Czech Republic. In addition, we found two more distinct variants of the virus with minor prevalence. The overall number of complete LSV sequences is very limited, especially for Central European strains. Our contribution to the collection of LSV sequences could help in future genetic analyses and provide data for refining the taxonomy of the genus Sinaivirus.
Online Resource: Unrooted phylogenetic tree based on complete coding sequences of Lake Sinai virus. The Czech isolates are indicated by a square (■) symbol. The rest of the sequences were retrieved from the nucleotide database. The three main branches are marked by labelled brackets. The evolutionary history was inferred using the Neighbour-Joining method. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches; values below 70% are hidden. The evolutionary distances were computed using the Tamura 3-parameter method. Evolutionary analyses were conducted in MEGA X.

Supplementary Files
This is a list of supplementary files associated with this preprint: OnlineResource1.xlsx