Sample distribution and plant materials
Briefly, in this study, we chose 6 representative regions for sampling in Beijing, which have abundant oak resources (see Supplementary Table S1 for details). Q. variabilis, Q. aliena, Q.dentata and Q. mongolica were studied in this work. Sampling sites are mostly forest parks rich in forest resources. Except for the Q. variabilis and Q. aliena populations in Jiufeng, other populations are natural populations. These areas are the north border of the North China Plain and belong to warm temperate semi-humid continental monsoon climate zone, with 4 distinct seasons and abundant rainfall.
According to the earlier investigation on the spot, all individuals in good growth condition were labeled and numbered. After photographing and recording the position, young leaves or dormant buds were collected. All the samples were frozen in liquid nitrogen and stored at −80 °C until used for DNA extraction and isolation. Numbers of samples in each region are shown in Supplementary Table S1.
DNA isolation and SSR amplification
Genomic DNA of 400 samples was extracted using Plant DNA Isolation Mini Kit (Vazyme, China), according to the user manual and modified cetyltrimethyl ammonium bromide method was also used as previously described (Yue et al. 2014; Wang et al. 2020). The DNA was eluted into 50 μL Tris-EDTA buffer as stock solution. Subsequently the quality and the concentration of the genomic DNA were quantified using 1% agarose gel electrophoresis and a NanoDrop 2000 spectrophotometer (Thermo Scientific, USA) respectively. The genomic DNA was diluted and separated to 20 ng/μL aliquots and stored at -20℃ for polymerase chain reactions (PCR) amplification. Nine polymorphic SSR primer pairs were selected as previously reported (Kampfer et al. 1998), namely QrZAG96, QrZAG102, QrZAG112, QrZAG7, Qden03011, Qden03021, Qden03032, Qden05011 and Qden05031. The information of primers were listed in Supplementary Table S2.
The nine pairs of verified SSR primers with M13 (-21) tail at their 5’ end were used for PCR amplification together with M13 (-21) tailed fluorescent primers (M13-FAM, M13-ROX, M13-HEX) as previously described (Schuelke 2000; Wang et al. 2020). PCRs were conducted in 15μL reaction volumes containing 1 μL of genomic DNA (20 ng/μL), 7.5μL of 2×Rapid Taq Master Mix (Vazyme), 0.4 μL of fluorescent primer, 0.1 μL of forward SSR primer, 0.5 μL of reverse primer and 5.5 μL of ddH2O. PCR procedures were as follows: 95 °C (5 min), 32 cycles at 95 °C (30 s) / 55-59 °C (30 s) (depending on the Tm value of SSR primers) / 72 °C (30 s), followed by 8 cycles 95 °C (30 s) / 53 °C (30 s) / 72 °C (15 s) and a final extension at 72 °C for 5 min. PCR products were sent to Ruibo BioTech (Beijing, China) for capillary fluorescence electrophoresis detection to read the length of the PCR products.
Data analysis
Characteristics and genetic diversity of SSR primers
GeneMarker 2.2.0 software was used to read the polymorphic loci and obtain the genotypes of all samples. Linkage disequilibrium analysis was tested for all locus pairs in each population by randomization using Arlequin 3.1 (Excoffier et al. 2005). Polymorphism parameters of 9 pairs of SSR markers were calculated by GenAlEx6.5 (Peakall and Smouse 2012), respectively by the number of alleles (Na), effective alleles (Ne), Shannon’s information index (I), observed heterozygosity (Ho), expected heterozygosity (He), Fixation Index (F) and so on.
Genetic differentiation among oak populations and optimal population classification
The analysis of the population structure was assessed in STRUCTURE 2.3.4 (Pritchard et al. 2000) using a Bayesian clustering approach setting parameters with a burn-in period of 100,000 iterations and 100,000 MCMC iterations after burn-in. In this study, we used “admixture model” and “allele frequencies correlated” during the modeling process. This approach revealed genetic structure by assigning individuals or predefined groups to K clusters. Different K values that ranged from 1 to 10 were used to infer the number of clusters for 10 replicate runs. ΔK (ΔK=mean (|L’’(K)|)/sd[L(K)]) (EVANNO,REGNAUTandGOUDET 2005,2611-2620) and lnPPK were showed in STRUCUTRE Harvester (Earl and vonHoldt 2011) using the results of STRUCTURE.
Different results could be produced in the same conditions by different 10 replicate cluster analyses of the same data. Thus, CLUMPP 1.1.2 (Jakobsson and Rosenberg 2007) and R 4.2.0 were used to average these 10 replicate results and visualize the outcomes separately. Proportion of gene pools distributed in 10 oaks populations were plotted on maps in the form of pie charts. The clustering tree of both individuals (NJ) and populations (UPGMA) were made using PowerMarker 3.25 (Liu and Muse 2005). Trees were then coloured and modified using Adobe Illustrator CC 2018.
Genetic diversity of 11 oak population and PCoA analysis
Na, Ne, Ho, He and fixation index (F), Fst and gene flow (Nm) among 11 populations were calculated by GenAlEx6.5. The software was used to conduct the PCoA of the four oak species and analyze the length range of the amplified alleles among different species. According to the length range of the amplified alleles, histogram was showed by using SigmaPlot 12.5.