Development of EST-SSR markers and population genetic structure and genetic diversity of the Malus transitoria (Batalin) C. K. Schneider in Qinghai-Tibetan Plateau

Malus transitoria (Batalin) C. K. Schneider is a shrub or small tree species native to the Qinghai-Tibetan Plateau (QTP) and its surroundings. Using 18 pairs of EST-SSR markers, we examined 142 samples from 8 wild populations of M. transitoria on the QTP. Genetic diversity was high at the species level: the mean expected heterozygosity (He) and Shannon's information index (I) per population were 0.573 and 0.921, respectively. The WD population had the highest genetic diversity (He = 0.595, I = 0.974), whereas the BS population had the lowest (He = 0.541, I = 0.870). The mean gene flow (Nm) was 6.076, and the MKH population had the lowest gene flow (Nm = 5.034). Analysis of molecular variance (AMOVA) indicated moderate genetic differentiation among populations; variation within populations (95%) was considerably higher than variation among populations (5%), which is consistent with the genetic differentiation coefficient (Fst = 0.057). STRUCTURE analysis grouped the populations into two clusters, in agreement with the unweighted pair group method with arithmetic mean (UPGMA) clustering and the principal coordinates analysis (PCoA). We speculate that long-term orogeny hinders gene exchange among populations, leading to limited gene flow and loss of genetic resources. This study investigated the genetic diversity and structure of natural populations of M. transitoria to provide a scientific basis for the conservation, breeding, and rational utilization of the germplasm resources of this species on the QTP. The population with the lowest gene flow (MKH) may be preserved by combining in-situ and ex-situ conservation. The smallest population (XQ) could be preserved by expanding the number of populations and by ex-situ conservation. The other populations could be conserved in situ to protect genetic resources.


Introduction
Malus transitoria (Batalin) C. K. Schneider, native to China, is an important shrub or small tree species in the crabapple genus Malus of the family Rosaceae. It is mostly found at higher elevations, such as the Qinghai-Tibetan Plateau and Shaanxi, Sichuan, Gansu, and Inner Mongolia (Li et al. 1982). Malus transitoria is regarded as the "ESe" and has been used to reduce blood lipids, blood pressure, and blood glucose in Tibetan areas; it is also used as a substitute for tea (Xia et al. 2014). The leaves of M. transitoria have reportedly been used as medicine to treat liver diseases on the Qinghai-Tibetan Plateau (Tao 1955; Ga 1995). Modern phytochemical studies have demonstrated that the leaves of M. transitoria contain a variety of vitamins, trace elements, β-carotene, and other components (Wang et al. 2010), as well as various polyphenols, flavonoids, amino acids, and other nutrients (Zhang 2018). Thus, the leaves have been used as a tea substitute to help control cardiovascular disease, blood lipids, blood pressure, and blood glucose (Chen et al. 2011). The fruit of M. transitoria is rich in protein, a variety of amino acids, and minerals; as a result, beverages, medicines, and other products are frequently made from it (Kim et al. 2012; Xia et al. 2014).
M. transitoria has been used as a tree species for barren-hill afforestation and ecological restoration because of its excellent environmental adaptability and extensive geographic distribution (1500-3900 m) in Sichuan, Qinghai, Gansu, Tibet, and other places. Moreover, it is cultivated as an ornamental tree in Shaanxi and Sichuan owing to its graceful form, blossoms, and fruits (Bao et al. 2018). It is also used in horticulture as a rootstock for other apples (Chen 2001).
Only a few studies on M. transitoria have been performed, despite the plant's outstanding medicinal potential and ecological benefits. This shortage has constrained research on the origin, evolution, and population genetics of the species and has in turn hampered genetic breeding.
The core of the M. transitoria distribution region is Lanzhou (Gansu province), with elevations ranging from 1290 to 4000 m. The natural environment across this distribution is highly complex and diverse, covering the Eastern Central Asia region of the Asian desert, the Tanggute region of the QTP, the Hengduan Mountains region of the Himalayas, and the Loess Plateau of the HuaBei region (Cheng and Li 2000a, b). The Hengduan Mountains region, situated at the southeast edge of the QTP, is the primary gathering site and the core of Malus development. It has an unstable geological structure and frequently develops microhabitats owing to the QTP's protracted orogeny. Along with the impact of climate change, M. transitoria has started declining in its evolutionary center and shows the distribution characteristics of a residual species (Cheng et al. 2004).
The ecological risk of the QTP has grown somewhat as a result of global temperature change, which has also affected the habitats of native organisms and changed the population structure of indigenous plants (Gao et al. 2016; Wang et al. 2020; Yuan et al. 2020). One of the core issues of population genetics in China is the diversity, genetic study, and conservation of indigenous organisms on the QTP (Li and Song 2021).
With the increasing development and utilization of M. transitoria resources, its wild resources have been destroyed in various regions. The construction of water conservancy facilities has also destroyed populations of M. transitoria. The species therefore urgently needs to be managed, conserved, and used responsibly. For genetic resource management, successful conservation, and breeding programs, a complete understanding and scientific assessment of the level of genetic diversity and the population structure of a plant are essential, especially for endangered species (Wallace 2002; Kwon et al. 2012; Helmstetter et al. 2020; Sun et al. 2021). Hence, information on the population genetic structure and genetic diversity of M. transitoria is important for its preservation and rational utilization.
A few polymorphic SSR markers have been developed in Malus (Coart et al. 2003; Ha et al. 2021). Based on these SSR markers, genetic diversity and population structure analysis (Urrestarazu et al. 2016), germplasm identification (Liu et al. 2014), fruit quality improvement, fingerprint construction, and gene mapping have been carried out in several Malus species (Erdin et al. 2006). Nevertheless, few studies of molecular markers or population evolution of M. transitoria have been published. Hence, to speed up basic research on the species (including population evolution and species protection) and marker-assisted selection breeding programs, it is necessary to develop species-specific SSR markers for examining the genetic diversity and population structure of M. transitoria.
We developed EST-SSR markers in this study. Based on transcriptome sequencing of M. transitoria, we analyzed the distribution characteristics of SSRs. Using the developed SSR markers, we then determined the amount of genetic variation, the gene flow among populations, the population structure, and the genetic diversity of 8 natural populations on the QTP. We hope that our research will serve as a guide for the conservation, breeding, and rational use of this species (Fig. 1).

Plant sampling
Plant samples were collected from 8 sites on the QTP. At each site, samples were collected at random, and sampled individuals were at least 50 m apart to avoid sampling clones. The 8 populations in our study covered the majority of the natural distribution of M. transitoria on the QTP. The local forestry bureau approved the sample collection.
Information on the sampling sites and the number of samples is shown in Table 1. Fresh leaves were dried and stored in silica gel. Genomic DNA was extracted using the CTAB method, and DNA quality was checked by 1.0% agarose gel electrophoresis. Fresh leaves for RNA isolation were stored in liquid nitrogen and brought back to the laboratory.
The map of the geographical distribution of the M. transitoria populations was generated using ArcGIS 10.4 (Esri 2016) and MAPGIS (Zondy 2000).
RNA isolation, transcriptome sequencing, and SSR marker development
Total RNA was extracted using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. A Nanodrop 2000, agarose gel electrophoresis, and an Agilent 2100 were used to assess the concentration, purity, and integrity of the RNA. The Oligotex mRNA Midi Kit (Qiagen) was used for mRNA isolation. SSRs were detected with the MISA tool 1.0 (http://pgrc.ipk-gatersleben.de/misa/); the parameters were set to identify di-nucleotide motifs with a minimum of six repeats and tri-, tetra-, penta-, and hexa-nucleotide motifs with a minimum of five repeats (Long et al. 2015). Primer3 (version 2.3.5, default parameters) (http://Frodo.wi.mit.edu/cgi-bin/primer3/primer3) was used for primer design (Koressaar and Remm 2007; Untergasser et al. 2012).
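The repeat thresholds described above can be illustrated with a small regular-expression scan. This is only a minimal sketch of perfect-SSR detection under the stated thresholds (six repeats for di-nucleotide motifs, five for tri- to hexa-nucleotide motifs); it is not the MISA tool itself and does not handle compound SSRs. The function names `find_ssrs` and `is_primitive` and the example sequence are hypothetical.

```python
import re

# Minimum repeat counts taken from the text: 6 for di-nucleotide motifs,
# 5 for tri- to hexa-nucleotide motifs.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ACAC')."""
    for size in range(1, len(motif)):
        if len(motif) % size == 0 and motif[:size] * (len(motif) // size) == motif:
            return False
    return True


def find_ssrs(sequence):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of the given length followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. 'AA' repeats are mono-nucleotide runs, not di-SSRs
            repeats = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Hypothetical example sequence: (AG)6 and (ATG)5 qualify, (AC)5 does not.
example = "TT" + "AG" * 6 + "CCT" + "ATG" * 5 + "C" + "AC" * 5 + "T"
print(find_ssrs(example))
```

In this example the `(AC)5` stretch is not reported because it falls below the six-repeat threshold for di-nucleotide motifs.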
PCR amplification, data analyses of genetic diversity and population structure
PCR amplification was performed on a thermal cycler (Eppendorf Mastercycler nexus SX1/GSX1; Bio-Rad S1000) in a total volume of 10 µL containing 50 ng of DNA template, 10 µmol/L forward and reverse primers, and 2 × Taq PCR MasterMix (Takara, Dalian, China), under the following conditions: 94 °C for 5 min; 25 cycles of 94 °C for 30 s, 58 °C for 30 s, and 72 °C for 45 s; and a final extension at 72 °C for 10 min.
Polyacrylamide gel electrophoresis was used to separate and identify the PCR products. Products were run on a 6% polyacrylamide gel for 1 h at 2000 V; the gel was then fixed for 20 min, stained for 20 min, developed for 20 min, and photographed for the record.

Data analysis
The SSR results were recorded in binary format: DNA bands were scored as 1 (present) or 0 (absent). For each locus, the genotype of each sample was recorded as a pair of alleles, with the alleles labelled A, B, and C in turn. Total band number, polymorphic bands, polymorphism ratio, fragment range, and percentage of polymorphic loci (PPL) were calculated directly from the polyacrylamide gel results.
GenAlEx 6.503 was used to calculate the number of alleles (Na), the number of effective alleles (Ne), Shannon's diversity index (I), expected heterozygosity (He), and observed heterozygosity (Ho) (Wright 1943). GenAlEx 6.503 was also used to perform the principal coordinates analysis (PCoA) and the analysis of molecular variance (AMOVA) (Peakall and Smouse 2012).
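For readers unfamiliar with these indices, the sketch below computes them for a single locus in a single population from diploid genotypes. It assumes the standard textbook definitions (Ne = 1/Σp², He = 1 − Σp², I = −Σ p ln p, Ho = proportion of heterozygotes) and codominant genotype calls; the function name `locus_diversity` and the example genotypes are hypothetical, and the values reported in this study come from GenAlEx, not from this code.

```python
from collections import Counter
from math import log


def locus_diversity(genotypes):
    """Diversity indices for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    """
    n_individuals = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n_individuals
    freqs = [count / total_alleles for count in allele_counts.values()]

    sum_p_squared = sum(p * p for p in freqs)
    return {
        "Na": len(freqs),                          # number of observed alleles
        "Ne": 1.0 / sum_p_squared,                 # effective number of alleles
        "I": -sum(p * log(p) for p in freqs),      # Shannon's information index
        "He": 1.0 - sum_p_squared,                 # expected heterozygosity
        "Ho": sum(1 for a, b in genotypes if a != b) / n_individuals,
    }


# Hypothetical example: four individuals scored at one locus with alleles A and B.
stats = locus_diversity([("A", "A"), ("A", "B"), ("A", "B"), ("B", "B")])
print(stats)
```

Population-level values such as those in Table 4 would then be averages of these per-locus statistics over the 18 loci.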
Popgene32 V1.31 was used to calculate Nei's genetic distance (GD) and genetic identity (GI). The unweighted pair group method with arithmetic mean (UPGMA) was used to cluster the populations by genetic distance, and the dendrogram was drawn with MEGA X (Kumar et al. 2018). STRUCTURE 2.3.4 was used to infer the number of genetic clusters, with K = 1 to 10 (ten runs for each value of K), a burn-in of 100,000, and 100,000 MCMC repetitions (Falush et al. 2003). The results were submitted to the online software STRUCTURE HARVESTER 0.6.94 (Earl and Vonholdt 2012), which applies the ΔK method (Evanno et al. 2005), to determine the optimal K value, taking into account how the slope of LnP(K) changed as K increased. CLUMPP was used to align the replicate runs of the STRUCTURE analysis and to obtain the Q-matrix for the best K value (Jakobsson and Rosenberg 2007). The CLUMPP result was passed to Distruct 1.1 to draw the structure graph (Rosenberg 2004).
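The ΔK criterion can be sketched as follows. This is a simplified illustration, assuming ΔK(K) = |mean L(K+1) − 2·mean L(K) + mean L(K−1)| / sd(L(K)), where L(K) are the LnP(K) values of the replicate runs; the function name `delta_k` and the LnP values in the example are invented for illustration and are not data from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of LnP(K) values from replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # Delta K is undefined at the ends of the K range
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when all runs are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / sd
    return result


# Hypothetical LnP(K) values for three replicate runs at K = 1..4.
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -880.0, -900.0],
    4: [-885.0, -875.0, -895.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

With these invented values the largest ΔK falls at K = 2, mirroring the kind of peak used to choose K in the Results.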

Polymorphism levels of SSR loci
A total of 80 primer pairs were designed from the detected SSR sequences. After assessing PCR amplification efficiency and polymorphism, 18 primer pairs successfully amplified the DNA samples, generated clear fragments, and detected polymorphism. These 18 primer pairs were therefore used to evaluate the diversity and structure of the 142 samples from 8 populations. Each SSR locus detected 2-3 alleles, with an average of 2.944, and the fragments ranged from 100 to 500 bp. The sequences of the 18 primer pairs, the total band number, polymorphic bands, polymorphism ratio, and fragment range are shown in Table 3.

Genetic diversity
For the 8 wild populations of M. transitoria, Na ranged from 2.611 (BS) to 2.944 (WD), with an average of 2.799. Ne ranged from 2.282 (XQ) to 2.530 (WD), with an average of 2.412. Ho ranged from 0.628 (WD) to 0.778 (QJ), with an average of 0.716. He ranged from 0.541 (XQ) to 0.595 (WD), with an average of 0.573. Shannon's diversity index ranged from 0.870 (BS) to 0.974 (WD), with an average of 0.921 (Table 4).
The Fst values of all populations were less than 0.05, ranging from 0.032 (QJ) to 0.047 (MKH), with an average of 0.040. Nm ranged from 5.034 (MKH) to 7.670 (QJ), with an average of 6.076 (Table 5). Among the 8 populations, genetic identity ranged from 0.825 (between MKH and HY) to 0.975 (between MH and QJ), and genetic distance ranged from 0.026 (between MH and QJ) to 0.193 (between MKH and HY, and between MKH and XQ) (Table 6).
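The relationship between the Fst and Nm values can be checked with the common island-model approximation Nm = (1 − Fst) / (4 Fst). That this is the formula behind Table 5 is an assumption; the function name `nm_from_fst` is hypothetical. Applied to the rounded Fst values above, it gives numbers close to, but not identical with, the reported Nm values, which is what one would expect if Nm was computed from unrounded Fst values.

```python
def nm_from_fst(fst):
    """Estimate gene flow (Nm) from Fst using Nm = (1 - Fst) / (4 * Fst).

    Assumes Wright's island model; valid only for 0 < Fst < 1.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


# Rounded Fst values quoted in the text (QJ, mean, MKH).
for label, fst in [("QJ", 0.032), ("mean", 0.040), ("MKH", 0.047)]:
    print(label, round(nm_from_fst(fst), 3))
```

For example, the rounded mean Fst of 0.040 gives Nm = 6.0 under this approximation, compared with the reported mean of 6.076.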

Genetic relationships of populations
The AMOVA results showed moderate genetic differentiation among populations (Fst = 0.057, P = 0.001), but not within populations. Of the total variance, 95% was found between individuals, 5% was found among populations, and no variation was found within populations (Table 7). The  (Fig. 2). The UPGMA dendrogram based on Nei's genetic distance showed that the 8 populations were classified into two clusters: the first included QJ, MH, BS, XQ, and HY, with HY separating first from the rest of the cluster, and the second included TR, WD, and MKH (Fig. 3). This outcome was in line with the PCoA findings.
The PCoA of geographical distance showed that the first principal coordinate (Coord. 1) accounted for 55.07% of the total variation and the second principal coordinate (Coord. 2) accounted for 16.40%. The 8 populations were classified into  (Fig. 4). According to the UPGMA dendrogram based on geographical distance, MKH was separated from the other populations, whereas XQ, QJ, HY, WD, TR, BS, and MH were grouped together (Fig. 5). The STRUCTURE analysis revealed the genetic structure of the populations: the maximum ΔK occurred at K = 2 (Fig. 6b, c), which suggests that the 8 populations can be split into two clusters (Fig. 6a), as also indicated by the PCoA and the UPGMA dendrograms based on Nei's genetic distance and on geographical distance (Figs. 3, 5).

Discussion
Understanding speciation, adaptation, and genetic divergence in plant populations requires research on genetic diversity and genetic structure. This knowledge also serves as a guide for the conservation of plants and their rational use (Ross-Ibarra et al. 2007; Cho et al. 2011; Willi et al. 2022).
As a significant index for evaluating genetic diversity, the value of Na (with an average of 2.799), Ne
Genetic differentiation among the 8 populations was moderate at most and not pronounced; hence, the populations have not diverged into separate species. In addition, 95% of the variation occurred within individuals, and only 5% occurred among populations. Gene flow (Nm) among the populations was not high, ranging from 5.034 to 7.670; this is much higher than the interspecific gene flow between M. toringoides and M. transitoria (Nm = 0.0926-1.3516) but much lower than that within M. toringoides (Nm = 15.7688) (Tang et al. 2006; Chen et al. 2008; Zhao et al. 2009).
The XQ population has the lowest Shannon's information index (0.871), He (0.541), and Ne (2.282) of all the populations. We found fewer than 20 individuals in this population, and its distribution range is relatively concentrated. Because the plants grow close together, some individuals may have arisen by asexual reproduction, so only 8 samples were collected. According to Sun (1996) and Johes et al. (2001), the genetic diversity of a population is positively associated with population size. Therefore, we speculate that the low genetic diversity of the XQ population is linked to its low effective population size and narrow distribution range.
In population genetics, Nm represents the degree of gene exchange between populations. Nm values greater than 1 imply that gene flow between populations is sufficient to effectively prevent genetic differentiation between populations in different regions (Wright 1931; Slatkin 1987). According to Wright (1949), Fst is the accepted metric for quantifying the degree of divergence among pre-specified sub-populations. If Fst is between 0 and 0.05, the genetic differentiation between populations is very small and can be neglected. All populations under study had Fst values of less than 0.05, ranging from 0.032 (QJ) to 0.047 (MKH), and Nm values ranged from 5.034 (MKH) to 7.670 (QJ). Together, the Nm and Fst values show that the studied populations have not diverged and that gene exchange occurs between them. Nevertheless, among the 8 populations, the MKH population has the highest genetic divergence (Fst = 0.047) and the lowest gene flow (Nm = 5.034) relative to the other populations. We hypothesize that there are two reasons for the low gene flow and the differentiation between this population and the others. The first is that this population is geographically isolated from the other populations (Figs. 4, 5) and that population diversity is strongly influenced by the distance between populations (Nybom 2004). The linear distance from the MKH population to the nearest population (TR) is approximately 400 km, with Animaqing Mountain in between, so geographic distance as well as  (Irwin 2010), and lead to MKH population divergent from other populations. The second possible reason is the high altitude of the MKH population: it is located at 3205 m, and gene exchange between populations is blocked by increasing altitude (Huang et al. 2015; Xu et al. 2015). The MKH population has thus diverged from the other groups as a result of considerable distance and high altitude.
The MKH population is situated in the primeval forest of the upper reaches of the Daduhe River, where the high mountains and valleys of the southeastern QTP transition to the plateau. This area, at the southeast edge of the QTP, has an unstable geological structure and often produces microhabitats because of the long-lasting orogeny of the QTP, and M. transitoria has started declining in its evolutionary center, with the distribution characteristics of a residual species (Cheng et al. 2004). On the one hand, the complex and diverse microhabitats increase genetic diversity at the species level; on the other hand, they hinder gene exchange between populations and influence gene flow and genetic differentiation among populations (Ortego et al. 2015). Thus, we deduce that the geographical and geological conditions and the ecosystem of this population caused both the high genetic diversity of the MKH population and its differentiation from the other populations.
The WD population has the highest Na (2.944), Ne (2.530), He (0.595), and Shannon's diversity index (0.974) of all the populations. The STRUCTURE and PCoA results revealed that the WD and MKH populations have a closer genetic structure, and the UPGMA dendrogram showed that WD and MKH cluster with TR. The WD population is distributed between the MKH population and the other populations and is geographically farther from MKH. The highest genetic diversity in the WD population supports the center-edge hypothesis, which holds that the geographical location of a population affects its genetic diversity: populations at the geographical center have higher genetic diversity, and those at the geographical edge have lower genetic diversity (Pironon et al. 2016).
On the basis of the above results, we speculate that there are two possible explanations. The first is that WD is one of the diversity centers of M. transitoria and that the population spread towards MKH and the other populations at a later evolutionary stage, resulting in differentiation of diversity. The second is that MKH and the other populations differentiated genetically because of geographical isolation and formed two genetic units. In that case, the WD population would be the offspring of hybridization between the two genetic units and would therefore carry the genetic characteristics of both.
In this work, we developed EST-SSR markers and assessed the genetic diversity of M. transitoria on the Qinghai-Tibetan Plateau. All of the developed EST-SSR markers can be used in genetic linkage mapping, candidate gene localization, and other research. Genetically diverse individuals can be used as parents in the genetic breeding of M. transitoria or preserved as excellent germplasm through asexual reproduction.
It is noteworthy that, with urbanization and the intensification of human activities, various existing populations have been impeded and constantly disturbed by humans (for instance, the TR population has been submerged owing to the construction of a reservoir). As a result, the number of such populations is shrinking. In the long run, gene exchange and genetic diversity will both be reduced. Therefore, it is important to prevent the loss of certain genes and genetic resources from marginal populations throughout the core collection preservation procedure. Because the MKH population differs genetically from the other populations and there is limited gene flow between it and the others, in-situ and ex-situ conservation may be combined to protect its genetic resources. XQ is the smallest population, and its limited number of individuals leads to lower gene flow; thus, expanding the number of populations and ex-situ conservation could be adopted. The other populations could be conserved in situ to protect genetic resources.