Genomic Diversity and Spatiotemporal Distributions of Lassa Virus Outbreaks in Nigeria

Abstract Background Lassa virus (LASV) is a single-stranded RNA arenavirus (genus Mammarenavirus) with an ambisense genome, coded in both negative and positive senses. Because of the rising fatality rate of the disease it causes (Lassa fever), the spread of LASV in Nigeria has become a subject of interest. Following the upsurge of LASV cases in 2012, another marked incidence was recorded in Nigeria in 2018, with 394 confirmed cases in 19 states, an estimated 25% of which were fatal. This study aimed to characterize the genetic variation of LASV, from its ancestral evolution to the emergence of new strains in different lineages, and its geographical distribution over specific outbreak periods through Bayesian inference, using genomic sequences from affected states in Nigeria. Results We established the relationship between Lassa mammarenavirus and other arenaviruses by classifying them into distinct monophyletic groups, i.e., the Old World arenaviruses, New World arenaviruses, and reptarenaviruses. Promoter sites for expression of the viral genome were analyzed based on the transcription start site (TSS): the predicted promoter lies at approximately positions 2917–2947 bp on the S segment (MK291249.1) and 1863–1894 bp on the L segment (MH157036.1). LASV sequences obtained from different parts of Nigeria were genetically related. Benue, Imo, and Bauchi states represent the host etiology of LASV. Spread to neighboring states was traced through genetic pedigree to previous outbreaks dated from 2008 to 2012. Phylogeography of recent transmission from 2017 to 2019 indicates that vectors were rapidly spreading LASV from Ondo state to Delta, Edo, and Kogi states, while the spread across northeastern states suggests a vector origin in Bauchi state. Conclusions The study outlined the path of transmission based on the genetic homology of the sampled LASV sequences in affected geographical locations.
We suggest that the federal government initiate a vector surveillance program to curtail further spread of LASV, especially in states bordering northwestern and north-central Nigeria.

RNA, oriented in both negative and positive senses (ambisense gene coding) on the two RNA segments [4]. The LASV genome comprises two genetic segments, L (7.3 kb) and S (3.4 kb), which encode four proteins: Z protein, L protein, nucleoprotein, and glycoprotein [5]. The RNA polymerase L protein (200 kDa) and a RING finger Z protein (11 kDa) are encoded on the L segment, while the S segment codes for the nucleoprotein (64 kDa) and two cleaved glycoproteins, GPC1 (42 kDa) and GPC2 (38 kDa) [6].
Currently, 35 species, including Lassa mammarenavirus, are carried by mammalian hosts and are categorized into two main groups, Old World and New World viruses [7]. These zoonotic viruses, which pose a great threat to public health, are further classified into subgroups based on phylogenetic, serological, and geographical characteristics: the Old World viruses comprise LASV and Lujo virus (LUJV), while the New World viruses include Machupo and Chapare viruses from Bolivia, and Junín, Sabiá, and Guanarito viruses found in Argentina, Brazil, and Venezuela, respectively [8]. Genomic studies of LASV can support data-driven study of the distribution and determinants of Lassa fever by defining genetic variations and virus-specific lineages [9].
Of all the LASV proteins present in virions and infected host cells, the nucleoprotein is the most abundant polypeptide; it encapsidates the genomic RNA to protect it from degradation in infected host cells [10]. The LASV nucleoprotein is a peptide of 569 amino acid residues [11] and consists of separate amine (N-terminal) and carboxyl (C-terminal) domains [12]. The N-terminal domain has been proposed to bind cellular cap structures through a deep binding cavity for the synthesis of viral mRNA [13], while the C-terminal domain contains a site that functions as an exoribonuclease, suppressing type I IFN production by interfering with IFN regulatory factor 3 (IRF-3) activation [12,14,15]. This pathway plays an important role in the transcription and replication of Lassa virus and in immunosuppression of the infected host [11,16]. The LASV glycoprotein (GP) is a trimeric protein synthesized as a single polypeptide chain, glycosylated cotranslationally, and cleaved into the subunits GP-1 and GP-2 by a peptidase in the endoplasmic reticulum (ER); it plays an important role in virus-host infection [17,18]. The glycoprotein facilitates entry of the virion into the host cell through receptor binding and fusion with the cell membrane, and it is the only protein on the virion surface relevant to humoral immunity [17]. Before GP-1 attaches to the host cell, the immune response is weak, which poses a major problem for the host because the viral spike is shielded from protective antibodies. This weak response is due to extensive protection by N-linked glycans, a phenomenon also seen in other viral infections such as those caused by human immunodeficiency virus and hepatitis C virus [19]. Unlike GP-1, GP-2 mediates fusion through an internal fusion loop (I-FP) that triggers membrane fusion in response to acidic pH [20].
It has been observed that the peptidase subtilase subtilisin kexin isozyme-1 (SKI-1)/site 1 protease (S1P), which mediates glycoprotein cleavage, offers a rationale for LASV vaccine and therapeutic development, because in its absence the virion lacks the surface functionality essential for infectivity [21]. The L and Z proteins are encoded by the L segment of the RNA, whereas the S segment encodes the nucleoprotein and glycoprotein [22]. The L protein consists largely of an RNA-dependent RNA polymerase in its C domain, which is associated with the viral nucleocapsid [23], while its N domain acts in viral transcription through its endonuclease activity [24].
The causative agent of Lassa fever, a virulent acute hemorrhagic fever, was first discovered in Borno state, Nigeria, in 1969 [5]. The disease has an incubation period of 6-21 days and is characterized by symptoms such as fever, general weakness, and malaise. A few days after infection, it is often accompanied by headache, sore throat, muscle pain, chest pain, nausea, vomiting, diarrhea, and cough, followed by abdominal pain. Severe cases usually involve facial swelling, pleural effusion, bleeding from body orifices and the gastrointestinal tract, and hypotension; in fatal cases, death follows after 14 days [25]. The spread of LASV in Nigeria has been a subject of interest because of the rise of this deadly disease. According to Agbonlahor et al., 2012 saw the widest spread and a higher incidence of Lassa hemorrhagic fever across states in Nigeria. The affected states were Edo, Delta, Ondo, Rivers, Ebonyi, Kano, Yobe, Benue, Kaduna, Kogi, Bauchi, Adamawa, Abia, Anambra, and Imo, together with the Federal Capital Territory, Abuja [26].
Following the surge of LASV cases in 2012, another marked incidence was recorded in Nigeria in 2018, with 394 confirmed cases in 19 states, an estimated 25% of which were fatal [9]. The increasing spread of Lassa fever in Nigeria has posed a serious threat to public health, with recent outbreaks from January to February 2019. A total of 324 confirmed cases with 72 deaths were reported by the Nigeria Centre for Disease Control (NCDC); separately, 68 confirmed cases were reported in week 5 and 37 cases in week 6, along with ten deaths [27].
This analysis aimed to characterize the genomic variation of Lassa virus, from its ancestral evolution to the emergence of new strains in different lineages, and its geographic distribution through Bayesian inference, using genomic sequences from affected states in Nigeria and endemic countries in West Africa.
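As a quick arithmetic check on the 2019 figures quoted above, the case fatality rate implied by 72 deaths among 324 confirmed cases can be computed directly. This is only an illustrative sketch; the function name is hypothetical, and the calculation uses confirmed cases as the denominator, which is an assumption about how the rate is defined.

```python
def case_fatality_rate(deaths, confirmed_cases):
    """Return the case fatality rate as a percentage of confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases

# Figures quoted in the text for January to February 2019 (NCDC) [27].
cfr_2019 = case_fatality_rate(deaths=72, confirmed_cases=324)
print(f"Case fatality rate: {cfr_2019:.1f}%")
```

The result, roughly 22%, is close to the estimated 25% fatality reported for the 2018 outbreak.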

Homologous classification and genomic sequence flow of LASV in West Africa.
The evolutionary relationship between Lassa mammarenavirus and other arenaviruses has revealed the distribution of different viruses across the globe. The phylogenetic analysis helped classify the viruses into 3 distinct monophyletic groups, i.e. the Old World arenaviruses, which were [...] to 2947 and L segments from 1863 to 1894, as predicted by the promHG promoter prediction tool. [...] Ondo=182, and Rivers=4, as shown in Figure 3. Importantly, several reassortant studies using the Mopeia virus for the development of attenuated LASV vaccines were reported by Johnson et al. [29] and Moshkoff et al. [30]. Among the New World arenaviruses, the genomic sequence distribution across different West African endemic countries is shown in Figure 2. Nigeria has the highest number of sampled sequences in GenBank, followed by [...] [35,38]. The predicted small interfering RNAs (siRNAs) showed potential to down-regulate some human genes through specific base pairing. Müller and Günther reported that siRNAs targeting the upstream regions of the S and L segments can downregulate reporter gene expression from LASV mRNA constructs and replicons [31].
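The base-pairing principle behind siRNA targeting can be illustrated with a minimal sketch: the antisense (guide) strand is the reverse complement of the target site on the mRNA. The target sequence below is a made-up toy example, not a LASV sequence, and the function name is hypothetical; this is not the prediction method used in the study.

```python
# Watson-Crick complement table for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def antisense_guide(target_site):
    """Return the reverse complement of an RNA target site (5'->3')."""
    site = target_site.upper()
    if set(site) - set("ACGU"):
        raise ValueError("target_site must contain only A, C, G, U")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return site.translate(RNA_COMPLEMENT)[::-1]

# Toy 19-nt target site (hypothetical, for illustration only).
toy_target = "AUGGCUAACGGAUCCUUGA"
guide = antisense_guide(toy_target)
print(guide)
```

Applying the function twice returns the original site, which is a simple way to check that the guide pairs with its target along its full length.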

Transcription Starting Site (Promoter) and TATA Box in LASV Genome
Importantly, the LASV genetic makeup is characterized by genotypic differences arising from ancestral evolution across the varying times of outbreaks. This is consistent with the report of Kafetzopoulou et al. [32], which divided the lineages into six groups (I to VI), as shown in Figure 4. Benue also borders Kogi and Nasarawa states on the east and north, respectively, and Taraba state on the northeast. Ingestion of cooked rat meat, but not raw or undercooked meat, is a common practice among the people of Benue state, according to Olusi et al. [33]. In contrast to the previous outbreaks, phylogeography of the recent strains identified spontaneous spread of LASV from the southwest to the south-south and north-central regions through states that share a common border. This finding agrees with the recent study by Ehichioya et al. [34], which confirmed that the majority of the sampled strains in lineage II evolved from Ondo state, a south-western state bordering Kogi

LASV homologs and genomic sequence distribution
To determine the homologous recombination of LASV and other arenaviruses through DNA sequence phylogeny, we collected homologous sequences of arenaviruses from the orthology database OrthoDB v10 [35] and the viral genome database viruSITE [36]. Following collection, the sequences were manually edited, aligned with the multiple sequence alignment program ClustalW, and used to construct a phylogenetic tree in MEGA X [37]. The online tool Interactive Tree of Life (iTOL) was used to annotate and display the tree [38]. According to the Centers for Disease Control and Prevention (CDC), LASV is endemic in the West African countries of Nigeria, Guinea, Sierra Leone, Liberia, Mali, and Côte d'Ivoire, and new strains were found in Togo in 2016 [3]. To determine the genomic diversity of LASV strains, a distribution test was used to measure the sequence flow among the endemic countries and the states in Nigeria. This distribution was estimated using the Virus Pathogen Resource database, ViPR [39], and GenBank [40].
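The sequence distribution described above amounts to tallying sampled sequences by country and, within Nigeria, by state. The sketch below shows one way such a tally could be done once sequence metadata has been exported from a database. The records, accession placeholders, and function names are all hypothetical; the actual counts and the export format used in the study are not reproduced here.

```python
from collections import Counter

# Hypothetical metadata records: (accession, country, state).
# These are placeholders, not real GenBank or ViPR entries.
records = [
    ("SEQ001", "Nigeria", "Ondo"),
    ("SEQ002", "Nigeria", "Ondo"),
    ("SEQ003", "Nigeria", "Edo"),
    ("SEQ004", "Sierra Leone", None),
    ("SEQ005", "Guinea", None),
    ("SEQ006", "Nigeria", "Bauchi"),
]

def tally_by_country(rows):
    """Count sequences per country."""
    return Counter(country for _, country, _ in rows)

def tally_by_state(rows, country):
    """Count sequences per state within one country, skipping unknown states."""
    return Counter(
        state for _, c, state in rows if c == country and state is not None
    )

country_counts = tally_by_country(records)
state_counts = tally_by_state(records, "Nigeria")

# Report countries from most to least sampled.
for name, n in country_counts.most_common():
    print(f"{name}: {n}")
```

A tally of this kind only reflects sampling effort in the databases, so a high count for one state does not by itself indicate higher incidence there.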

Genomic Variation and Phylogeographic Distribution in Nigeria
Genomic variation and phylogeographic distribution of LASV were analyzed using the S segment of the