In this study, 10 pairs of codominant EST-SSR primers related to drought stress in P. euphratica and P. pruinosa were used to analyze the genetic diversity of 693 samples from 47 natural P. euphratica populations and 235 samples from 17 natural P. pruinosa populations distributed in Xinjiang, Gansu and Inner Mongolia. The mean observed number of alleles (Na) is an important index for evaluating the polymorphism of SSR loci and the degree of population variation. Mean Na (Tab. S3) was 1.743 in the P. euphratica populations and 1.712 in the P. pruinosa populations, lower than the values reported for 20 natural Pinus yunnanensis populations analyzed with SSR markers (Na = 3.7) (Xu et al. 2015) and for 27 Rosa odorata var. gigantea (Rosaceae) populations analyzed with 7 pairs of SSR markers (Na = 3.9) (Meng et al. 2016). The effective number of alleles (Ne), Shannon’s information index (I), observed heterozygosity (Ho) and expected heterozygosity (He) are important parameters for evaluating the genetic diversity of populations. In the natural P. euphratica (Ne = 1.407, I = 0.34, Ho = 0.238, He = 0.215) (Tab. S3) and P. pruinosa (Ne = 1.382, I = 0.322, Ho = 0.204, He = 0.205) (Tab. S4) populations, Shannon’s information index was lower than that of wild apricot populations (I = 0.46) (He et al. 2007) and Malus sieversii populations (I = 0.41) (Zhang et al. 2007), and expected heterozygosity was lower than that of North American Pinus strobus (He = 0.531) (Mandák et al. 2013), Populus alba (He = 0.368), Populus tremula (He = 0.490) (Lexer et al. 2005) and Populus euphratica (He = 0.787) (Wang et al. 2011a). This might be because the primers were derived from drought stress-related sequences of P. euphratica and P. pruinosa, and transcribed regions are generally more conserved, so the polymorphism detected by the codominant EST-SSR markers was lower than that of genomic SSRs (Powell et al. 1996). Previous studies have also shown that genomic SSRs are more polymorphic than EST-SSR markers. For example, Eujayl et al. (2001) studied the genetic diversity of wheat using genomic SSR and EST-SSR markers and found that the genomic SSRs were more polymorphic. Li et al. (2007) also found that genomic SSRs were more polymorphic, but that EST-SSR markers could still accurately reflect the genetic differences and genetic relationships among wheat genotypes.
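As an illustration of how the indices compared above are defined, the following minimal Python sketch computes Na, Ne, I, Ho and He for a single locus from diploid genotypes. It assumes the standard frequency-based definitions (Ne = 1/sum(p^2), I = -sum(p ln p), He = 1 - sum(p^2), with no small-sample correction). The function name `locus_diversity` and the example genotypes are hypothetical and are not data from this study; the software used to produce Tab. S3 and S4 may apply different corrections.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Diversity indices for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Uses uncorrected, frequency-based definitions (an assumption of this sketch).
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    total = len(alleles)
    freqs = [c / total for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)  # expected homozygosity, sum of p_i^2
    return {
        "Na": len(counts),                                # observed number of alleles
        "Ne": 1.0 / homozygosity,                         # effective number of alleles
        "I": -sum(p * math.log(p) for p in freqs),        # Shannon's information index
        "Ho": sum(1 for a, b in genotypes if a != b) / len(genotypes),
        "He": 1.0 - homozygosity,                         # expected heterozygosity
    }


# Hypothetical genotypes (allele sizes in bp) for four individuals at one locus.
example = [(150, 150), (150, 154), (150, 150), (154, 154)]
result = locus_diversity(example)
print(result)
```

Population-level values such as those discussed above would then be averages of these per-locus values over the 10 loci and over populations.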
Analysis of molecular variance (AMOVA) of the P. euphratica populations showed that variation among populations and within populations accounted for 23% and 77% of the total variation, respectively (Table 1). In the natural P. pruinosa populations (Tab. S5), variation among populations accounted for 24% of the total variation and variation within populations for 76%. Thus, in both P. euphratica and P. pruinosa, variation within populations was the main source of total variation. The within-population variation found here (77% and 76%) was lower than that reported in several previous studies. For example, Wei (2010) analyzed 16 Populus simonii populations with 20 pairs of SSR primers and found that within-population variation accounted for 85.97% of the total variation. Dong and Bai (1998) analyzed 8 natural P. euphratica populations with four isozymes and found that 93.7% of the variation occurred within populations. Wang et al. (2011a) analyzed P. euphratica populations in Northwest China using 8 pairs of SSR markers and found that 94.79% of the variation occurred within populations. However, the within-population variation of P. euphratica and P. pruinosa in this study was higher than that reported by Sheng et al. (2008), who found that 60.51% of the variation occurred within populations in five natural P. euphratica populations using 10 pairs of RAPD primers. These results show that the percentage of within-population variation differs depending on the molecular marker used, but the overall conclusion is the same: at the molecular level, variation within populations is greater than variation among populations.
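To make the among/within percentages concrete, the sketch below partitions variation in the way a single-level AMOVA does: sums of squared distances are split into within-population and among-population parts, converted to mean squares, and then to variance components. It is a simplified illustration under stated assumptions (squared Euclidean distances between numeric genotype vectors, one hierarchical level, no permutation test); the function name `amova_one_level` and the toy data are hypothetical and do not reproduce Table 1 or Tab. S5.

```python
def amova_one_level(groups):
    """Single-level AMOVA-style partition.

    groups: dict mapping population name -> list of numeric genotype vectors.
    Returns (percent_among, percent_within) of the total variance.
    """
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    def sum_sq_pairs(items):
        # Sum of squared distances over all unordered pairs of individuals.
        return sum(
            sqdist(items[i], items[j])
            for i in range(len(items))
            for j in range(i + 1, len(items))
        )

    everyone = [ind for members in groups.values() for ind in members]
    n_total = len(everyone)
    n_pops = len(groups)

    ssd_total = sum_sq_pairs(everyone) / n_total
    ssd_within = sum(sum_sq_pairs(m) / len(m) for m in groups.values())
    ssd_among = ssd_total - ssd_within

    ms_among = ssd_among / (n_pops - 1)
    ms_within = ssd_within / (n_total - n_pops)

    # Average sample-size coefficient for unequal population sizes.
    n0 = (n_total - sum(len(m) ** 2 for m in groups.values()) / n_total) / (n_pops - 1)

    var_within = ms_within
    var_among = (ms_among - ms_within) / n0  # can be negative in small samples
    total = var_among + var_within
    return 100 * var_among / total, 100 * var_within / total


# Hypothetical toy data: two populations, one biallelic marker coded 0/1.
toy = {
    "A": [(0,), (0,), (0,), (1,)],
    "B": [(1,), (1,), (1,), (0,)],
}
pct_among, pct_within = amova_one_level(toy)
print(pct_among, pct_within)
```

In this toy case most of the variance lies within populations, the same qualitative pattern as reported above for both species.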
UPGMA cluster analysis of the 47 natural P. euphratica populations and the 17 natural P. pruinosa populations (Fig. S1, S2) showed that, at a genetic distance among populations of about 0.1, both the P. euphratica and the P. pruinosa populations were divided into subgroups represented by the populations of southern Xinjiang and of northern Xinjiang, respectively, which is consistent with previous clustering results for P. euphratica populations (Chen et al. 2021). This indicates that, for both P. euphratica and P. pruinosa, differentiation was greatest between the populations of northern and southern Xinjiang. Zeng et al. (2018) studied the population structure and phylogeography of P. euphratica populations in Northwest China using chloroplast DNA (cpDNA) sequences and nuclear microsatellites. Their results showed that the uplift of the Tianshan Mountains directly hindered gene flow between P. euphratica from northern and southern Xinjiang, and that the range contraction caused by climate oscillation further accelerated divergence between these areas. Therefore, there was obvious differentiation between P. euphratica populations from southern and northern Xinjiang. Similarly, Jia et al. (2020) analyzed the genetic diversity of 252 individuals from 27 natural populations by whole-genome resequencing and found a high degree of genetic differentiation between populations in southern and northern Xinjiang.
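For readers unfamiliar with UPGMA, the following minimal sketch shows how populations are merged step by step from a pairwise genetic distance matrix, and why two regional subgroups appear when within-region distances are small relative to between-region distances. The function name `upgma`, the population labels (N1, N2, S1, S2) and all distance values are hypothetical and are not taken from Fig. S1 or S2.

```python
def upgma(names, dist):
    """UPGMA clustering.

    names: list of population labels.
    dist: dict mapping (label_a, label_b) -> genetic distance, one entry per pair.
    Returns a list of merges (cluster_a, cluster_b, distance_at_merge).
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of members
    d = {frozenset(pair): value for pair, value in dist.items()}
    merges = []
    while len(sizes) > 1:
        pair = min(d, key=d.get)         # closest pair of current clusters
        a, b = sorted(pair)
        height = d[pair]
        new = f"({a},{b})"
        new_size = sizes[a] + sizes[b]
        for other in sizes:
            if other in (a, b):
                continue
            # Size-weighted mean = arithmetic mean over all member pairs.
            d[frozenset((new, other))] = (
                sizes[a] * d[frozenset((a, other))]
                + sizes[b] * d[frozenset((b, other))]
            ) / new_size
        # Remove every distance that involves one of the merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        del sizes[a], sizes[b]
        sizes[new] = new_size
        merges.append((a, b, height))
    return merges


# Hypothetical distances: two "northern" (N) and two "southern" (S) populations.
labels = ["N1", "N2", "S1", "S2"]
distances = {
    ("N1", "N2"): 0.04, ("S1", "S2"): 0.06,
    ("N1", "S1"): 0.20, ("N1", "S2"): 0.22,
    ("N2", "S1"): 0.18, ("N2", "S2"): 0.24,
}
merges = upgma(labels, distances)
for step in merges:
    print(step)
```

Cutting such a tree at a distance between the within-region and between-region merge levels yields one subgroup per region, which is the kind of pattern described above at a genetic distance of about 0.1.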
Bayesian cluster analysis was carried out on the 692 samples from 47 P. euphratica populations and the 237 samples from 17 P. pruinosa populations with STRUCTURE software. ΔK reached its maximum at K = 4 (Evanno et al. 2005) (Fig. S4, S5). At this K value, populations of the same variant clustered into one group in both natural P. euphratica and P. pruinosa (Fig. 1, S3). The 47 P. euphratica populations and 17 P. pruinosa populations were also analyzed by principal coordinate analysis (Fig. S6, S7). The results showed that the P. euphratica and P. pruinosa populations were likewise clustered into four clades, with populations of high genetic similarity grouped into the same clade. Specifically, the difference between populations from northern and southern Xinjiang was relatively obvious, whereas the differences among populations within northern Xinjiang or within southern Xinjiang were relatively small and their genetic similarity was high. A possible reason is that the similarity among populations is often related to the ecological environment, the reproductive ability of the species and the dispersal media available in its distribution area. For example, species with biological characteristics such as wide distribution, long seed longevity in vitro and wind pollination overcome geographical isolation among populations relatively easily, which favors hybridization among individuals and maintains high similarity among populations (Hamrick et al. 1979). P. euphratica and P. pruinosa have biological characteristics such as a long life span, propagation by root tillers, wind pollination and seed dispersal, which might be the main reason for the high genetic similarity among populations in the same region (Wang et al. 2011b). In addition, the seeds of P. euphratica and P. pruinosa populations in southern Xinjiang often mature during the flood period (Eusemann et al. 2013). These unique environmental conditions favor the propagation of P. euphratica and P. pruinosa and the establishment of large effective populations, making southern Xinjiang the most densely distributed area of these species in the world (Wang 1996).
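The ΔK criterion used above to choose K can be sketched as follows. For each K, ΔK is the absolute second-order difference of ln P(D) across successive K values divided by the standard deviation of ln P(D) over replicate runs at that K. This sketch takes the second-order difference of the replicate means, which is one common way to compute it and is an assumption here; the function name `evanno_delta_k` and all ln P(D) values are hypothetical and are not the STRUCTURE output behind Fig. S4 or S5.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    lnp: dict mapping K -> list of ln P(D) values from replicate runs.
    Requires at least two replicates per K and non-zero spread at each K.
    """
    means = {k: mean(values) for k, values in lnp.items()}
    delta = {}
    for k in sorted(lnp):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            delta[k] = second_diff / stdev(lnp[k])
    return delta


# Hypothetical replicate ln P(D) values for K = 2..6 (three runs each).
runs = {
    2: [-5600, -5620, -5580],
    3: [-5200, -5210, -5190],
    4: [-4900, -4905, -4895],
    5: [-4880, -4890, -4870],
    6: [-4870, -4850, -4860],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

With these illustrative numbers the likelihood rises steeply up to K = 4 and then flattens, so ΔK peaks at K = 4. The first and last K tested never receive a ΔK value because they lack one neighbour.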
Constructing fingerprints for species from different regions helps to identify different varieties or strains quickly and accurately, and facilitates the later protection and breeding management of varieties in different regions (Feng et al. 2022). In this study, 10 pairs of EST-SSR primers with good polymorphism were selected from 133 pairs of primers to construct fingerprints for the different samples of the P. euphratica and P. pruinosa populations (Tab. S6, S7). The P. euphratica and P. pruinosa populations distributed along riparian forests have a relatively narrow genetic background and relatively close genetic relationships, which makes it somewhat difficult to distinguish samples from different populations. Nevertheless, the EST-SSR markers developed at the transcription level in this study were highly accurate and were suitable for constructing fingerprints of P. euphratica and P. pruinosa populations and for identifying samples from different populations.