Maize streak virus (MSV), the type member of the genus Mastrevirus in the family Geminiviridae, is a single-stranded DNA virus with a major economic impact on maize cultivation in sub-Saharan Africa (Martin et al., 2008; Roumagnac et al., 2022). MSV seriously constrains maize production in the region, causing substantial economic losses and low yields, especially for smallholder farmers who have few resources and limited access to improved maize cultivars (Bediako E et al., 2017; Charles, 2014; Oppong et al., 2013). Infected maize plants often show chlorotic lesions, leaf striation, yellowing, stunting, reduced yield and, in severe cases, death (Martin et al., 1999; Martin & Rybicki, 2002; Oyeniran et al., 2021). Of the 11 known MSV strains (A through K), only the A strain causes severe maize streak disease (MSD), characterized by stunting and leaf striation. As the major causal agent of MSD, MSV-A is believed to have become adapted to maize (Ketsela et al., 2022).
The MSV genome encodes three major genes. The virion-sense strand carries the movement protein (mp) and coat protein (cp) genes (Muñoz-Martín et al., 2003; Owor et al., 2007), while the complementary-sense strand carries the replication-associated protein (rep/repA) genes, which initiate and moderate replication of the viral genome (Shepherd et al., 2007). These genes perform functions vital to the survival, spread and replication of the virus in susceptible hosts and are therefore important drivers of its evolution (Boulton, 2002; Davies et al., 1997). Consequently, they are likely targets of natural selection in the evolutionary arms race between host and pathogen (Denes et al., 2022; Wang et al., 2020). Because MSV is spread chiefly by its leafhopper vectors, the virus must also cope with constantly changing environments, which are likely to affect its genes. Thus, signatures of positive selection for the amino acid changes responsible for host adaptation in most pathogens should be detectable within an evolutionary analytical framework (Antonides et al., 2019).
The MSV coat protein (cp) gene is a virion-sense gene expressed from long intergenic region (LIR) transcripts. About 735 nucleotides (nt) long, cp has encapsidation functions and also plays a key role in systemic spread, especially by leafhoppers (H. Liu et al., 2001). The movement protein (mp) gene is another virion-sense gene, of about 310 nt, that is expressed alongside cp from the bidirectionally transcribed LIR; its main function is to mediate viral movement within infected host cells (Boulton, 2002; Wright et al., 1997). Both virion-sense genes are thus linked to MSV movement between and within hosts: cp through spread by the intermediate leafhopper vector, and mp through cell-to-cell movement within infected tissues. Further, because of the binding capability of the coat protein and its accompanying nuclear signal, which facilitate the cell entry of partially uncoated single-stranded DNA (ssDNA) (Davies et al., 1997; Owor et al., 2007), continuous interaction of cp with constantly changing host conditions may subject it to persistent, stimulus-driven molecular evolution.
The non-structural, complementary-sense repA/rep proteins are expressed as either spliced rep or unspliced repA (Ruschhaupt et al., 2013; Shepherd et al., 2007). The rep protein plays a key role in replication initiation, while repA acts as a regulator of host and viral gene transcription. MSV rep, a spliced product of C1:C2, is also believed to activate the virion-sense promoter, specifically the coat protein promoter (Horváth et al., 1998; Nikovics et al., 2001). The repA protein moderates rep activity through coordinated checks and balances. This is necessary because MSV, unlike some begomoviruses, cannot suppress its own promoter (Nikovics et al., 2001). Regulating the expression of these proteins at different stages of infection and of the host cell cycle may therefore not only play a pivotal role in coordinating the virus life cycle but also make these regulatory genes targets of evolutionary selection.
The key mechanisms of evolution are natural selection and genetic change, and natural selection is central to diversifying evolution. Driven by competition and environmental change, natural selection acts on genetic variation, alters the gene pool, and results in selective survival, host expansion, and adaptation (Aguadé, 1999; Deom et al., 2021; Li et al., 2018). Fitness-based selection guides the course of evolution because individuals carrying useful traits are more likely to pass them on to the next generation. Natural selection can thus be likened to a giant sieve, driven by differential fitness, that separates undesirable traits from desirable ones and ultimately yields fitter and healthier descendants (Acosta-Leal et al., 2011; Oyeniran & Oyediran, 2024; Spielman et al., 2019).
Here, we aim to detect selection in the virion-strand cp and mp genes of the economically important MSV-A lineages that have spread across sub-Saharan Africa, using publicly available sequence data. Because natural selection leaves signatures that are detectable in sequence data, it is possible to estimate which sites and branches within these genes are evolving under selection pressure, down to the amino acid level. Such estimates could give further insight into how these genes evolve as they interact with changing host conditions.
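To illustrate what a selection signature at the amino acid level means in practice, the following minimal Python sketch labels each codon difference between two aligned coding sequences as synonymous (amino acid unchanged) or nonsynonymous (amino acid changed). It is an illustration only and not the analytical method of this study: the function name `classify_codon_changes` and the toy sequences are hypothetical, the standard genetic code is assumed, and raw counts of this kind are not corrected for multiple substitutions or for the number of available synonymous and nonsynonymous sites, as model-based estimates of selection would be.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(seq_a, seq_b):
    """Label each aligned codon pair (hypothetical helper for illustration).

    Both inputs must be in-frame, aligned coding sequences of equal length.
    Codons containing gaps or ambiguous bases are labelled "skipped".
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")

    labels = []
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            labels.append("skipped")          # gap or ambiguous base
        elif codon_a == codon_b:
            labels.append("identical")
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            labels.append("synonymous")       # same amino acid
        else:
            labels.append("nonsynonymous")    # amino acid replaced
    return labels


# Toy sequences invented for this example (not MSV data).
toy_a = "ATGGCTAAACTG"
toy_b = "ATGGCCAGACTA"
print(classify_codon_changes(toy_a, toy_b))
# ['identical', 'synonymous', 'nonsynonymous', 'synonymous']
```

An excess of nonsynonymous over synonymous change at a site, once properly normalized and tested in a phylogenetic framework, is the kind of pattern usually interpreted as evidence of positive selection; the sketch only shows the underlying classification step.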