Putative Local Adaptive SNPs in the Genus Avicennia

The genus Avicennia, with eight species, grows in the intertidal zones of tropical and temperate regions, with a distribution ranging from West Asia to Australia and Latin America. These mangroves have several medicinal applications. Many genetic and phylogenetic studies have been carried out on mangroves, but none has addressed the geographical adaptation of SNPs. We therefore used ITS sequences of about 120 Avicennia taxa growing in different parts of the world and undertook computational analyses to identify SNPs that discriminate among these species and to study their association with geographical variables. A combination of multivariate and Bayesian approaches, such as CCA, RDA, and LFMM, was used to identify SNPs with potential adaptation to geographical and ecological variables. The Manhattan plot revealed that many of these SNPs are significantly associated with these variables. The genetic changes accompanying local and geographical adaptation were illustrated by a skyline plot. These genetic changes did not occur under a molecular clock model of evolution and probably occurred under positive selection pressure imposed in the different geographical regions in which these plants grow.


Introduction
Mangroves are woody tropical plants and shrubs that grow well in the intertidal zones of tropical to subtropical latitudes (Faridah-Hanum et al. 2019). Mangrove forests are highly productive ecosystems that provide a broad range of valuable ecosystem services (Barbier et al. 2011; Lee et al. 2014). These plants are highly specialized, flourishing under inhospitable environmental conditions of extreme tides, high salinity, high temperature, strong winds, and anaerobic soil (Mazda et al. 2005). As a physical structure, mangroves also protect coastlines and coastal communities against erosion, floods, and storms by reducing hydrokinetic energy (Danielsen et al. 2005; Zhang et al. 2012) and have been considered a more sustainable, cost-effective, and ecological alternative to conventional coastal defense engineering (Cheong et al. 2013; Temmerman et al. 2013). Structurally complex and often densely rooted, mangroves provide a spawning and nursery habitat for marine species and play a vital role in sustaining production in coastal fisheries (Manson et al. 2005). Mangroves also have a large capacity for carbon sequestration and play a major role in the oceanic and global carbon cycle (Duarte et al. 2005; Alongi 2014; Ezcurra et al. 2016).
Mangrove loss and fragmentation affect human livelihoods, biodiversity, and ecosystem functioning (Polidoro et al. 2010; Carugati et al. 2018; Estoque et al. 2018). Mangroves are threatened by climate change, including changes in precipitation and temperature regimes, increased storm frequency and intensity, and fluctuations in sea level (Ward et al. 2016; Lovelock et al. 2015, 2017).
Recent pharmacological investigations have also reported diverse medicinal properties of plants belonging to the genus Avicennia against cancer, HIV, hepatitis, diabetes, inflammation, diarrhea, oxidative stress-related diseases, and so on (Rege et al. 2010; Sharief and Umamaheswararao 2011; Shafie et al. 2013). The total area of mangroves across geological history is unknown (Schneider 2011; Plaziat et al. 2001; Duke 1995; Duke et al. 2002).
Mangrove ancestors were presumably submerged by sea-level rise during that warm period and became adapted to intertidal conditions. Mangrove vegetation lineages increased steadily over the Tertiary Period (Ricklefs et al. 2006). Fossil evidence of major mangrove taxa has been reported to originate around the ancient Tethys Sea, and populations may have become divided as this sea closed following continental drift. Continental drift later helped some mangrove taxa extend their range (Duke et al. 2002). Transport by drifting continental plates may explain why some mangrove genera have similar global distributions despite large variations in individual dispersal capability (Duke 2017). A range of mangrove plant families diverged independently from their terrestrial ancestors during the late Cretaceous-Paleocene epoch, with the most ancient confirmed fossils belonging to the mangrove palm Nypa, aged at 75 million years (Gee 2001). The first record of mangroves in the Red Sea dates to 323 BC (Schneider 2011; Flenley 1998).
Mangrove distribution probably became more extensive during the warm Eocene epoch (Plaziat et al. 2001); for example, Avicennia pollen was present in Siberia (at latitudes above 72 °N) during the early-middle Eocene (Suan et al. 2017). Fossil pollen of Avicennia has been recorded by Gruas-Cavagnetto et al. (1988) in France during the Lower Eocene, by Churchill (1973) in W. Australia during the Upper Eocene, by Thanikaimoni (1987) in the Mediterranean during the Lower Miocene, by Leopold (1969) in the Pacific Islands during the Middle Miocene, and by Muller (1964) in Borneo during the Upper Miocene and Pliocene. Avicennia is the lone mangrove genus that occurs throughout the world, comprising eight species: A. germinans (L.) Stearn, A. schaueriana Stapf & Leechm. ex Moldenke, A. bicolor Standley, A. marina (Forssk.) Vierh., A. alba Blume, A. officinalis L., A. integra N. C. Duke, and A. rumphiana Hallier f. (Duke 1991). Among them, the first three are endemic to the Atlantic-East Pacific (AEP) region; the remaining five occur in the Indo-West Pacific (IWP) region (Duke 1991). The existence of hybrids in Avicennia has been examined in AEP populations by Mori et al. (2015), revealing new phylogeographic patterns.
Several studies have been carried out to identify putative adaptive SNPs in different plant species (see, for example, Abebe et al. 2015; Garot et al. 2021). These landscape genetic studies screen genomes to identify differentiated regions (i.e., outlier loci) that are putatively under natural selection and test for associations between putative adaptive loci (e.g., SNPs) and environmental variables of the species habitat while accounting for neutral patterns that affect allelic frequencies, such as genetic structure and demographic history. Therefore, these studies identify not only the candidate loci for adaptation but also the ecological selective pressures responsible for local adaptation (Garot et al. 2021).
Landscape genetic studies utilize different statistical and bioinformatics methods. For example, MDS (Multidimensional scaling) and PCA (Principal components analysis) have been used to study population divergence, while a combination of RDA (Redundancy analysis) and LFMM (Latent factor mixed model) has been used to identify adaptive genetic regions (see, for example, Abebe et al. 2015; Garot et al. 2021). To our knowledge, there has been no report on putative local adaptive SNPs in the genus Avicennia; therefore, the present study was performed with the following aims: (1) to identify the nucleotides that can differentiate Avicennia species and geographical populations from each other, (2) to reveal the association between DNA sequences and geographical coordinates, and (3) to identify the SNPs that have a phylogenetic signal.
We used both PCA (Principal components analysis) and DAPC (Discriminant analysis of principal components), which is suitable for SNP sequences, to identify discriminating sequences. For association studies, we used CCA (Canonical correspondence analysis) and RDA (Redundancy analysis), followed by LFMM (Latent factor mixed model) analysis to test the significance of nucleotide association with geographical and ecological variables. The phylogenetic signal of the sequences was investigated by character mapping based on the parsimony criterion.

Materials and Methods
In this study, we used published ITS sequence data for Avicennia species reported from different parts of the world and deposited in NCBI (the National Center for Biotechnology Information) (Table 1).

Data Analyses
The DNA sequences obtained were initially aligned with the MUSCLE program implemented in MEGA 7 and curated accordingly. The aligned sequences were used to construct a Maximum likelihood phylogenetic tree (ML tree), based on the Kimura two-parameter distance. We used the following analytical methods to identify the SNPs that discriminate Avicennia species and geographical regions and to reveal the association between SNPs and geographical coordinates. These analytical approaches have different assumptions and may differ to some extent in their results. Therefore, comparing the results obtained is important for drawing a solid conclusion.
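As an illustration of the Kimura two-parameter distance mentioned above, the following minimal Python sketch computes it for one pair of aligned sequences. It is not the MEGA 7 implementation; the function name `k2p_distance` and the pairwise-deletion handling of gaps are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: skip gaps and ambiguous bases (pairwise deletion).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; all other changes are transversions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared      # proportion of transitional differences
    q = transversions / compared    # proportion of transversional differences
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q); undefined for saturated pairs.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Calling `k2p_distance` on every pair of aligned sequences would give the distance matrix; highly divergent pairs can make the logarithm undefined, in which case the function raises a `ValueError` from `math.log`.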

Canonical Correspondence Analysis (CCA)
CCA (Canonical correspondence analysis) is based on regression analysis between SNPs and ecological features. It uses an approach similar to principal components analysis (PCA) but is suited to discrete characters such as SNPs (Podani 2000). Whereas PCA maximizes the variance of the data, CCA maximizes the association of the data (SNPs) with the geographical variables (Podani 2000). CCA was performed in PAST ver. 4 (Hammer et al. 2012).

Redundancy Analysis (RDA)
RDA is a form of constrained ordination that is suited to genomic data sets in which we are interested in understanding how multivariate environmental factors shape the patterns of genomic composition across geographical areas. RDA is a multivariate regression method based on a model in which linear combinations of the environmental predictors explain linear combinations of the SNPs. This method effectively identifies loci associated with the multivariate geographical variables (Legendre and Legendre 2012). RDA was performed in PAST ver. 4.
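The two steps described above (multivariate regression of the SNPs on the predictors, followed by ordination of the fitted values) can be sketched in Python with NumPy. This is a minimal illustration and not the PAST implementation; the function name `rda`, the returned dictionary keys, and the absence of any scaling of scores are assumptions of this sketch.

```python
import numpy as np


def rda(snps, env):
    """Redundancy analysis sketch: regress centred SNPs on centred predictors,
    then run a PCA (via SVD) on the fitted values."""
    Y = np.asarray(snps, dtype=float)
    X = np.asarray(env, dtype=float)
    Y = Y - Y.mean(axis=0)
    X = X - X.mean(axis=0)

    # Multivariate least squares: one column of coefficients per SNP.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_fit = X @ B

    # PCA of the fitted values gives the constrained (canonical) axes.
    U, s, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    n_axes = min(X.shape[1], len(s))

    n = Y.shape[0]
    eigenvalues = s[:n_axes] ** 2 / (n - 1)
    total_variance = (Y ** 2).sum() / (n - 1)
    return {
        "site_scores": U[:, :n_axes] * s[:n_axes],   # samples on the RDA axes
        "snp_loadings": Vt[:n_axes].T,               # SNP contributions per axis
        "constrained_fraction": eigenvalues.sum() / total_variance,
    }
```

Here `snps` would be a samples-by-SNPs numeric matrix and `env` a samples-by-variables matrix (for example latitude and longitude). SNPs with large absolute loadings on an axis are the candidates for association with the predictors that drive that axis.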

Latent Factor Mixed Model (LFMM)
LFMM is a Bayesian method used for testing associations between loci and geographical gradients using latent factor mixed models. It performs a regression analysis in which the confounding variables are modeled with unobserved (latent) factors. The program estimates correlations between geographical and ecological variables and allelic frequencies, and simultaneously infers the background levels of population structure (Frichot et al. 2013; Frichot and Francois 2015). LFMM was performed with the LFMM package in R 4.2.

Phylogenetic Analyses
Phylogenetically important SNPs were determined by character mapping of the 110 SNPs obtained, based on the parsimony criterion, as implemented in Mesquite 3.6. We performed Tajima's D test (Tajima 1989) to reveal whether the DNA sequences of Avicennia species evolved randomly ("neutrally") or under a non-random process, including directional or balancing selection, demographic expansion, or contraction. This was followed by a Neighbor-Net network analysis to reveal the presence of a positive selective force acting on the sequences in Avicennia species.
Moreover, we carried out the molecular clock test to show whether SNP changes occurred in accordance with a uniform clock rate model of evolution during speciation events in the genus Avicennia. These tests were performed in MEGA 7. Skyline analysis, performed in R 4.2, was used to study population size changes in different geographical regions. Detrended correspondence analysis (DCA) of the sequences was performed to identify the most variable ITS nucleotides among the Avicennia genotypes studied. DCA was performed in PAST ver. 4 (Hammer et al. 2012).
Discriminant analysis of principal components (DAPC) was used to study the genetic affinity of Avicennia species and to identify discriminating SNPs. This analysis was performed with the adegenet package in R.
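For readers unfamiliar with the Tajima's D statistic used above, the following Python sketch computes it from a set of aligned sequences using the standard formula of Tajima (1989). It is an illustration only and not the MEGA 7 routine; the function name `tajimas_d` and the assumption of gap-free, equal-length sequences are choices made for this example.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("Tajima's D needs at least four sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        return math.nan  # D is undefined without variation

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D compares pi with Watterson's estimate S / a1, scaled by its standard deviation.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)
```

An excess of intermediate-frequency variants pushes D above zero and an excess of rare variants pushes it below zero; in practice the value is compared against critical values before conclusions about selection or demography are drawn.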

Geographical Grouping of Avicennia Species
We obtained DNA sequences 540 nucleotides in length after alignment and curation, with 116 polymorphic nucleotides differing among Avicennia species.
The ML phylogenetic tree (Fig. 1) placed some of the studied species and geographical regions in specific clades, so these can be differentiated by the studied sequences. For example, the taxa studied from Brazil, Bangladesh, and India, as well as Saudi Arabia and Egypt, largely formed distinct clades. However, the studied species and genotypes from Mexico, Costa Rica, and Panama were intermixed but together also formed a distinct clade. These results suggest that we may find some private sequences for these particular geographical regions.
Detrended correspondence analysis (DCA) of the sequences (Fig. 2) revealed the most variable ITS nucleotides among the Avicennia genotypes studied. Some of the ITS nucleotides are correlated and are placed close to each other in a particular direction of the DCA plot. For example, nucleotides number 21, 29, 49, and 109 are all placed close to each other on the left side of the DCA plot. The same holds true for nucleotides number 37, 97, 102, and 104, which are placed close to one another but in the opposite direction. However, out of 110 polymorphic nucleotides, about 25-30 show a higher degree of discriminating power. These nucleotides are scattered throughout the ITS region, as they do not form clusters in the DCA plot.
DAPC analysis (Fig. 3) revealed that the individuals studied from Bangladesh and India show some degree of genetic affinity. The same holds true for the samples studied from Brazil and the other Latin American countries. DAPC analysis also revealed that the first three axes account for about 80% of the total variance, and based on these axes, several SNPs were identified that can differentiate Avicennia taxa. Four nucleotides, namely 48, 88, 89, and 90, showed the highest degree of discriminating power.
CCA and RDA analyses produced similar results. Moreover, RDA produced a significant association (at the 0.05 level) between SNPs and geographical as well as ecological variables (Fig. 4). Therefore, the RDA results are presented here.
The RDA plot (Fig. 4) revealed that some of the discriminating nucleotides are associated with the longitudinal distribution of Avicennia species, while others are associated with the latitudinal distribution. This plot also shows that some of these nucleotides are potentially associated with a particular geographical region, such as South America or East Asia. For example, nucleotides number 52, 55, 67, 72, and 76 are associated with the longitudinal distribution of Avicennia species. The other nucleotides that are associated with latitude and with a particular geographical area are also indicated in this figure. LFMM analysis (Fig. 5) revealed a highly significant association for some of the discriminating nucleotides suggested by the CCA and RDA analyses. These nucleotides show -log10(p) values of 2 and higher and are highly significantly associated with Avicennia geographical regions.
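The -log10(p) cutoff of 2 used above corresponds to p ≤ 0.01. A minimal Python sketch of this filtering step is given below; the function name `significant_loci` and the locus labels and p-values in the usage note are hypothetical and are not results of this study.

```python
import math


def significant_loci(p_values, min_neg_log10=2.0):
    """Return {locus: -log10(p)} for loci at or above the Manhattan-plot cutoff."""
    hits = {}
    for locus, p in p_values.items():
        if not 0 < p <= 1:
            raise ValueError(f"invalid p-value for {locus}: {p}")
        score = -math.log10(p)
        if score >= min_neg_log10:
            hits[locus] = score
    return hits
```

For instance, with the made-up input `{"site_a": 0.2, "site_b": 0.001, "site_c": 0.005}`, only `site_b` and `site_c` would be retained, since their p-values lie below 0.01.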
Detailed inspection of these nucleotides by parsimony analysis revealed that they are mostly private or restricted to a particular country. For example, nucleotide number 80, which is a C in all the studied species, differs only in the Madagascar samples, where it is an A. Similarly, nucleotide positions 24 and 27 are C in all geographical regions except Brazil, where they are T.
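The idea of a private nucleotide described above (a base found in only one geographical region at a given alignment position) can be sketched in Python as follows. This is a simple scan and not the parsimony character mapping performed in Mesquite; the function name `private_alleles` and the toy sequences in the usage note are hypothetical.

```python
from collections import defaultdict


def private_alleles(records):
    """records: iterable of (region, aligned_sequence) pairs.
    Returns {region: [(position, base), ...]} with 1-based positions."""
    records = list(records)
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to the same length")

    result = defaultdict(list)
    for pos in range(length):
        # Which regions carry each base at this position?
        regions_by_base = defaultdict(set)
        for region, seq in records:
            regions_by_base[seq[pos].upper()].add(region)
        if len(regions_by_base) < 2:
            continue  # monomorphic site
        for base, regions in regions_by_base.items():
            # Private: a real base (not a gap) seen in exactly one region.
            if base in "ACGT" and len(regions) == 1:
                (region,) = regions
                result[region].append((pos + 1, base))
    return dict(result)
```

With the toy records `[("Brazil", "TAT"), ("Brazil", "TAT"), ("India", "CAC"), ("Egypt", "CAC"), ("Madagascar", "CCC")]`, the scan reports T at positions 1 and 3 as private to Brazil and C at position 2 as private to Madagascar.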
Tajima's D was positive (D = 0.32), which we interpret as evidence of positive selection acting on the sequences. Similarly, when the data were subjected to Neighbor-Net network analysis (Fig. 6), two major groups were largely formed, which is consistent with the Tajima's D result and with the presence of a positive selective force acting on the sequences in Avicennia species.
The molecular clock test showed that SNP changes within the genus Avicennia did not occur under a uniform rate of evolution and that different phylogenetic clades differed in their genetic changes. This result agrees with the skyline plot (Fig. 7), which shows a deep and sudden change in SNP substitution and population size in Avicennia species in different geographical regions.

Discussion
Speciation within the genus Avicennia is complex. It has been shown that the interplay of human activity, climate change, and disturbance events composes the environmental history of mangrove vegetation (Figueroa-Rangel et al. 2016). The Indo-West Pacific (IWP) is considered the center of origin of mangroves, as it shows the maximum species diversity, and mangroves dispersed from this region to other parts of the world (Ellison et al. 1999). Recently (unpublished data), we provided DNA barcodes of the ITS region in Avicennia species, which illustrate genetic differentiation between taxa growing in different geographical regions.
In spite of the many studies performed on the phylogeny and genetic diversity of the genus Avicennia, there has been no report on the geographical association of DNA sequences in this genus. The present study revealed that nucleotides of the ITS region can efficiently differentiate geographical taxa in the genus Avicennia. Moreover, some of these sequences may be significantly associated with the geographical distribution of these species.
Tajima's test of these sequences produced a positive Tajima's D, which indicates balancing selection related to the speciation events (Tamura and Nei 1993). We observed an almost continuous and gradual nucleotide substitution in some of the species, but a sudden deep change in DNA sequences in some other geographical regions. This was supported by the Neighbor-Net plot.
The different approaches used to identify the nucleotides associated with geographical and ecological variables were all in agreement, and the LFMM Manhattan plot showed a significant association between some of the SNPs and ecological as well as geographical variables.
Therefore, the present study shows that using different analytical approaches may improve our understanding of the association between SNPs and geographical and ecological variables. Moreover, such a combined data evaluation gives us an insight into contemporary evolutionary processes and may explain how environmental factors influence selective and neutral genomic diversity within and among related species or different geographical populations within a single species (Segovia et al. 2020).
The presence of heterogeneous environmental conditions brings about changes in the genetic diversity of plant species, which in turn result in local adaptations (Segovia et al. 2020). Therefore, studies concerned with the genetic basis of local adaptation can improve knowledge of the genetic mechanism of local adaptation and probably of species diversification within a genus (Zhang et al. 2019). Moreover, these studies try to answer two major questions: (1) which environmental variables play a key role in the adaptive genetic divergence of a species, or of different species within a particular genus, and shape its landscape genetic structure, and (2) which genes or loci undergo adaptive genetic differentiation (Li et al. 2017; Zhang et al. 2019).
The present study revealed that both the longitudinal and the latitudinal distribution of Avicennia species, as well as ecological variables such as temperature and rainfall, exert selective pressure on the studied SNPs and play a role in genetic changes within this genus.
In some plant species, we may encounter the influence of only one of these geographical variables on sequence adaptation. For example, Ingvarsson et al. (2006) characterized patterns of DNA sequence variation at the putative candidate gene phyB2 in four populations of European aspen (Populus tremula) and scored single-nucleotide polymorphisms in an additional twelve populations collected along a latitudinal gradient in Sweden. They utilized a sliding-window scan of phyB2 and identified six putative regions with enhanced population differentiation and four SNPs with significant clinal variation. Therefore, they suggested that the clinal variation at individual SNPs is an adaptive response in phyB2 to local photoperiodic conditions. Similar studies suggest that divergent selection enhances the levels of genetic differentiation not only for the sites that are the direct target of selection but also for neutral sites in the vicinity of the site(s) under selection (Charlesworth et al. 1997; Nordborg and Innan 2003).
Biochemical Genetics (2023) 61:2260-2275
In conclusion, the present study provides data on DNA sequence changes in association with geographical and ecological variables in the genus Avicennia and suggests that these variables play a role in causing genetic changes within this genus. Some of these SNPs were significantly associated with geographical and ecological variables.

Fig. 1 ML phylogenetic tree of Avicennia genotypes based on their geographical regions

Fig. 2 DCA plot of ITS sequences showing nucleotides with a higher degree of discriminating power among Avicennia genotypes

Table 1 Voucher information and GenBank accession numbers of taxa sampled for the genus Avicennia based on ITS data (https://www.ncbi.nlm.nih.gov/nuccore/?term=Avicennia+spacer)