Genome-wide association analysis of fiber fineness and yield in ramie (Boehmeria nivea) using SLAF-seq

Ramie (Boehmeria nivea L. Gaudich) is a major natural fiber crop cultivated in East Asia. The improvement of its fiber yield and fineness are important breeding goals. Fiber yield is a complex quantitative trait comprising ramet number, stem diameter, plant height, skin thickness, and fiber percentage. The fiber fineness is a crucial trait for ramie fiber quality. However, there are few association studies for fiber yield traits and fiber fineness on ramie, and lack high-density SNP maps in natural ramie population. Here, a panel of 112 core ramie germplasms were genotyped by 215,376 consistent single-nucleotide polymorphisms (SNPs) from specific-locus amplified-fragment sequencing (SLAF-seq), and used for genome-wide association study of fiber fineness and yield. Subsequently, the genetic diversity, linkage disequilibrium (LD), and population structure was conducted based on 215,376 SNPs. Population cluster analysis disclosed five subpopulations. Neighbor-joining (NJ) analysis revealed three major clusters. No obvious relationships were identified between them and their geographic origins. The genome-wide LD decayed to r2 = 0.1 was ~ 11.75 kbp in physical distance. One, seven, one, seven, and twenty-seven significant SNP marker associations were detected for fiber fineness (third season), stem diameter (third season), stem skin thickness (third season), fiber percentage (second season), and fiber percentage (third season), respectively. Two promising candidate genes, whole_GLEAN_10029622 and whole_GLEAN_10029638 resided in the significant trait-SNP association for fiber fineness (third season), which annotated as a cotton fiber-expressed protein and an Arabidopsis thaliana homebox protein ATH1, respectively and validated by qPCR. The identified loci or genes for fiber fineness and yield may provide the basis for future research on fiber fineness and yield and marker-assisted selection breeding for ramie.


Introduction
Ramie (Boehmeria nivea L. Gaudich) is a perennial fiber crop and native to eastern Asia (Luan et al. 2015). About 87,620 ha of ramie were cultivated in China in 2016 (http://data.stats.gov.cn). Ramie is the second largest fiber crop in China (Luan et al. 2017). Fiber yield and quality limit ramie cultivation in China (Chen et al. 2016;Luan et al. 2009). While, fiber yield is a complex quantitative trait comprising ramet number, stem diameter, plant height, skin thickness, and fiber percentage (Luan et al. 2017). Qualitative fiber characteristics include length, strength, and fineness. The length and strength of both cotton and ramie fiber are suitable for textile manufacture. Unlike cotton, however, the fineness of ramie fiber is too lower for fabric production. Thus, it's a crucial trait for fiber quality. Ramie fiber traits may be quantitative and regulated by numerous genes with small phenotypic effects (Xiong et al. 1998;Lu et al. 2016). Thus, understanding the genetic basis of these traits were critical for molecular designing breeding of the traits.
Owing to the next-generation sequencing (NGS), large number of molecular markers detected and used for marker-assisted selection. Recently, 76 SSR markers were developed from 320 expressed sequence tags (ESTs) of ramie from the National Center for Biotechnology Information (NCBI), of these, twentyseven SSR loci show polymorphisms among 62 ramie individuals (Chen et al. 2011). Liu et al. (2013) identified 1827 EST-SSR markers. Chen et al. (2016) identified 4230 SSR loci and developed 2431 SSR markers from the ramie transcriptome at different developmental stages. Using SSR markers, Liu et al. (2014) and Luan et al. (2017) identified several agronomic trait-related quantitative trait loci (QTLs) in ramie. Nevertheless, it is difficult to use them for cloning or breeding programs due to these QTLs with large intervals. Compared with SSR markers, SNPs are more abundant and stable in plant genomes and their frequency is 1 in 100-300 bp (Edwards et al. 2007). With the application of the NGS technologies, it is possible to generated thousands of genome-wide SNPs markers for high-density genetic maps (Davey et al. 2011;Qi et al. 2014;Zhang et al. 2016a). Several sequencing-based technologies were developed for SNP discovery and high-throughput genotyping, such as two-enzyme genotyping-by-sequencing (GBS) (Poland et al. 2012), restriction site-associated sequencing (RADseq) (Miller et al. 2007), and double-digest RADseq (Peterson et al. 2012). Recently, another high-throughput DNA sequencing technology, specific-locus amplified-fragment sequencing (SLAF-seq) was developed Sun et al. (2013) SLAFseq had successfully create a high-density SNP maps for soybean (Glycine max L.) , cucumber (Cucumis sativus L.) (Wei et al. 2014), sesame (Sesamum indicum L.) ) and upland cotton (Gossypium hirsutum L.) (Zhang et al. 2016b). However, there are few reports of highdensity SNP maps for ramie.
Using the genotyping-by-sequencing technique, Liu et al. (2017) and Chen et al. (2018) discovered numerous SNPs and detected QTLs associated with ramie fiber yield. However, the QTLs of Liu et al. (2017) were based on the F2 agamous line population of the varieties Qingdaye and Zhongzhu 1. Chen et al. (2018) focused only on the ramet number and made no reference to the ramie genome. There are still lack studies assess the genetic variation and GWAS for fiber yield and fineness in natural ramie population. Therefore, research is needed to assess the genetic variation and association analysis of fiber yield and fineness on natural ramie accessions using the Boehmeria nivea.
Genome-wide association studies (GWAS) are an efficient strategy for detect genomic regions controlling complex traits within unstructured crop germplasms (Kang et al. 2015). In the present study, we used SLAF-seq technology perform a GWAS of 112 ramie accessions (107 core and 5 cultivar accessions). The aims of this study were to (1) develop highdensity SNP map and genetic variants; (2) assess the genetic diversity of the B. nivea germplasm pool; (3) investigate the population structure and estimate the linkage disequilibrium (LD) across the accessions, and (4) identify single-nucleotide polymorphisms (SNPs) and candidate genes associated with fiber fineness and yield.

Plant materials and phenotyping
A set of 112 ramie accessions from the National Infrastructure for Ramie Germplasm Resources of China were selected for this study (Table S1), including 106 accessions originally collected from 11 provinces of China (96 landrace accessions and 10 cultivated varieties), one landrace accession from Brazil, one landrace accession from India, one landrace accession from Cuba, and three landrace accessions from Indonesia. The germplasm was selected to represent variations in fiber fineness and yield. Briefly, the accessions were vegetatively propagated from the National Infrastructure for Ramie Germplasm Resources of China and planted in our scientific research field (Wangcheng Experimental Station of the Institute of Bast Fiber Crops, Chinese Academy of Agricultural Sciences, Changsha, China) at 2014, with two replicates, randomized block design. Each plot consisted of one accession and five clones planted 1 m apart in a single row (3 m long 9 1 m wide). Standard cultural practices were performed according to Luan et al. (2017). For plant height and stem diameter were measured in three successive seasons of 2015 from March to June (environment 1), June to August (environment 2), and August to October (environment 3). While, ramie skin thickness and fiber percentage were measured in two successive seasons of 2015 from June to August (environment 2), and August to October (environment 3) in 2015. For the fiber fineness, three successive harvest seasons were August to October in 2014 (environment 1), June to August (environment 2) and August to October (environment 3) in 2015. All ramie plants were planted under the same field conditions and were under conventional agricultural management.
The three harvest periods represented different phases of phenological development, so they were analyzed separately. Thirteen plant traits were measured and the data were applied to the association mapping (Table 1). Measurements were made according to the methods of Luan et al. (2017). Briefly, 15 ramets per plot were randomly selected to measure plant height, stem diameter, skin thickness, and fiber percentage using tape measure, Vernier caliper, and screw micrometer, respectively. Fiber fineness was examined with optical fiber diameter analyzer systems (OFDA 100 and OFDA 2000; HORNIK FIBER-TECH, Wald, Switzerland) according to Wang and Wang (2004 (Sun et al. 2013) as described in Zhou et al. (2017). Briefly, genomic DNA was digested with restriction enzyme RsaI and HaeIII (New England Biolabs Inc., Ipswich, MA, USA), followed by fragment-end repair, duplex tag-labeled adapter ligation, PCR amplification, and size-selected for SLAF libraries. Finally, the SLAF libraries were sequencing using an Illumina HiseqTM 2500 (Illumina, San Diego, CA, USA) at Biomarker Technologies Corporation in Beijing, China. The average sequencing depth was 10.89 9 . After sequencing, the low-quality reads (quality score \ 20) were filtered out using dual-index software (Kozich et al. 2013). All high-quality reads were mapped onto the Boehmeria nivea reference genome ) using the Burrows-Wheeler alignment tool (BWA) (Li and Durbin 2009) with default parameters.

SNP calling
The software GATK (McKenna et al. 2010) and SAMtools (Li et al. 2009a) were used to discover SNPs, and SNPs detected by both methods were treated as reliable. After SNPs with minor allele frequency (MAF) of less than 5% and missing data of more than 20% were filtered out, a total of 215,376 consistent SNPs remained and were used for analysis.

Phylogenetic tree and principal component analysis
Based on the 215,376 consistent SNPs identified from the 112 ramie samples, a phylogenetic tree was constructed in MEGA 5 (Saitou and Nei 1987) using the neighbor-joining algorithm (Eickmeyer et al. 2008). The genetic structure of the panel was assessment using principal components analysis (PCA) with GAPIT software (Lipka et al. 2012).
Deducing population structure and estimating linkage disequilibrium (LD) Population structure was investigated using Admixture software (Alexander et al. 2009) based on those consistent SNPs. The number of subgroups (K) was varied from 1 to 10, which represented the assumed groups of simulated population in ancient times. The best value of K was determined by the cross-validation (CV). The PLINK 1.9 BETA (Purcell et al. 2007;Shaun et al. 2007) was used to obtain the required data files for ADMIXTURE. LD was estimated between SNP pairs located on the same scaffold based on the alignment of reference genome Boehmeria nivea ). The ramie chromosome sequence was not publicly available at that time. The 215,376 consistent SNPs were used to infer LD in TASSEL (Bradbury et al. 2007). The r 2 were obtained for all SNP pairs using Plink (Purcell et al. 2007).

Association mapping
Three Genome-wide association studies (GWAS) models were used to association analysis. The models GLM-Q and compressed MLM-Q ? K were analysis in TASSEL (Bradbury et al. 2007), kinship (K) matrix was estimated using SPAGe Di v. 1.4b (Hardy and Vekemans 2002) and the best admixture results (Q) representing population membership served as covariates in the model. The EMMAX model were analysis using identify-by-state (IBS) relationship matrix according to Kang et al. (2015).
Based on the number of SNPs analyzed (n = 215,376), the significance threshold was estimated use the Bonferroni correction method, it was P = 4.64 9 10 -7 (0.1/215,376). SAMtools (Li et al. 2009a) was used for the manual verification of regions of the aligned sequencing reads with significant association relative to the reference genome of Boehmeria nivea ).

Identification of candidate genes and quantitative PCR (qRT-PCR) analysis
Functionally characterized genes harboring the detected SNPs or within a 100-kbp interval were sought in the reference genome of Boehmeria nivea. Then qRT-PCR was used to determine differential expression of the genes putatively controlling fiber fineness, stem diameter, and skin thickness. Based on our phenotypic data of field experiment, we selected six accessions (1-26 (kk), 2-34 (ct), 2-35 (bo), 1-43 (aw), 2-41 (kr), and 2-7 (ba)) for RT-qPCR. Bast samples were collected from those six accessions at the vegetative stage (* 30 cm height) and freezing in liquid nitrogen immediately. Total RNA was extracted with a TaKaRa MinBEST plant RNA extraction kit (TaKaRa Bio Inc., Shiga, Japan) and reverse-transcribed with a Thermo Scientific RevertAid first- strand cDNA synthesis kit (Thermo Scientific, Vilnius, Lithuania) according to the manufacturer's instructions. The qRT-PCR was performed in a Monad selected q225 real-time PCR system (Monad Biotech Co. Ltd., Shanghai, China) in a 10 lL reaction volume containing 0.5 lL cDNA sample, 5 lL MonAmpTM SYBR Green qPCR mix (Monad Biotech Co. Ltd., Shanghai, China), 0.5 lL each forward and reverse primers (10 lM each), and 3.5 lL nuclease-free water. The PCR parameters were 95°C for 1 min followed by 40 cycles of 95°C for 5 s and 55°C for 30 s. Each sample was analyzed in triplicate. Relative expression were calculated as previously described (Livak and Schmittgen 2001). All qRT-PCR primers (Table S2) were designed by the Primer3 online tool (http:// primer3.ut.ee/) and adjusted with Oligo v. 7.56 (Rychlik 2007). The internal control of choice was 18S based on preliminary experiments (data not shown).
The qRT-PCR data were analyzed in Microsoft Excel and reported as mean ± SD.
Validation SNP marker of fiber fineness through polymerase chain reaction (PCR) In order to verified the obtained SNP (PHNS01005842.1 :2705313) can be used for molecular marker research, we selected another 133 ramie accessions for genomic PCR amplification. The fiber fineness of those 133 ramie accessions was shows in Table S3, which from our field experiment of our laboratory. We design a pair of PCR primers (Fprimer: AAGCTTCGAATAACGATCTCCGA; R-primer: CAAATACGCTTCTGAATGAACCTGT; the 3 0 end of F-primer was on the location of SNP PHNS01005842.1:2705313) and followed by a base that manually introduced mismatch using the Primer3 online tool (http://primer3.ut.ee/). The PCR was performed in a 25 lL reaction volume containing 1 lL genome DNA, 12.5 lL PCR mix (Beijing GreatOcean Biotech Co. Ltd., Beijing, China), 0.25 lL each forward and reverse primers (10 lM each), and 11 lL nuclease-free water. The PCR parameters were 94°C for 3 min followed by 32 cycles of 94°C for 30 s, 58°C for 30 s and 72°C for 15 s, and terminal elongation at 72°C for 5 min. The PCR products (8 lL/sample) were analyzed on 1.5% agarose gel and visualized after ethidium bromide staining under UV light. One fragment of 199 bp will be amplified by the accessions with A allele. Correlation analysis using R software with ''psych'' package was performed to determine if the fiber fineness had any correlation with the SNPs.

Results
Specific-locus, amplified-fragment sequencing and SNP calling The restriction enzymes RsaI and HaeIII were selected based on predictions from in silico digestion. There were 205,334 predicted SLAF tags 264-394 bp in length. The digestion efficiency was 93.30%. In this experiment, 465.13 Gbp of sequences were generated from the 112 ramie accessions. There were 502,578 SLAF tags identified throughout the genome. The average depth per sample was 12.34 9 . There were 369,997 SLAF tags with polymorphisms. All SLAF tags aligned to the reference genome using BWA software (Li and Durbin 2009). The PLINK software (Purcell et al. 2007) performed quality control on the data. A total of 2,955,337 SNP loci were identified by GATK and SAMtools. More than 85.9% of the SNP variant effects were in the intergenic, intron, downstream, and upstream areas of the coding region whereas only 13.58% were in the exons (Table 2). After filtration for integrity ([ 80%) and low minor allele frequency ([ 5%), 215,376 highly consistent SNPs were identified and used in the subsequent analyses.

Population genetics analysis
Genetic relationships among the 112 samples were calculated using the 215,376 highly consistent SNPs identified in the present study. Among these 12,432 pairwise combinations, 11,644 (93.662%) with genetic relationship coefficients \ 0.05 (Fig. 1a).
Only four pairwise combinations (0.032%) (two pairs accessions: cd and br, ce and dr) (Table S1), which with higher kinship value between 0.6 to 0.65, suggesting the two pairs accessions may have closely relationship. Thus, most of accessions were very weak or no relationships in this study.
Neighbor-joining (NJ) clustering show this set of germplasm formed three major groups (Fig. 3). Based on the Admixture cross-validation (CV) figure (Fig. 2), the second choice consisted of three subpopulations (K) when the cross-validation error was the second lowest. Another selection comprised three subgroups for the 112 core ramie germplasms. Principal component analysis (PCA) showed genotype dispersal among the various components, and the first three PCs collectively explained only about 11.13% of the total variance, suggesting that the population structure of the 112 accessions was very low (Fig. 1b). No significant correlation was detected between genetic relationships and geographic origins. Linkage disequilibrium across the whole ramie genome LD decayed was estimated between SNP pairs along each scaffolds using the 215,376 highly consistent SNPs. The genome-wide LD decayed to r 2 = 0.1 was * 11.75 kbp in physical distance (Fig. 4), which is lower than that reported for most other crops. There are major difference among the various scaffolds. LD ranged from 1.09 kbp (Scaffold11 (PHNS01006314.1)) to 20.95 kbp (Scaffold2) ( Table S4).

GWAS of phenotypic traits and quantitative realtime PCR analysis of candidate genes
The GWAS models tested included GLM-Q, compressed MLM-Q ? K, and EMMAX. According to the quantile-quantile (Q-Q) plot results, the GLM-Q model was strongly skewed towards significance for most traits (Fig. S1). Therefore, the Q matrix did not adequately account for the population structure or cryptic relatedness. The MLM-Q ? K used both the Q and kinship matrices overcorrected these confounders (Fig. S2). The EMMAX model use the identity-bystate (IBS) relationship matrix and effectively reduced the false positive rate. It had sufficient statistical The accessions were divided into five subgroups. There were minimum cross-validation errors when K = 5. Within each subgroup, the accessions were ordered according to their genetic component. Each line indicates the subgroup value. Each accession is represented by a vertical line partitioned into K colored components which represent the inferred memberships in K genetic clusters power to identify significant marker-trait associations (Fig. S3). Therefore, the EMMAX-Q model was applied for GWAS in this study.
Variations in all 13 traits were detected among the genotypes (Table 1). We tested the associations between the 215,376 SNPs and each trait using the EMMAX model. Based on the chosen P \ 4.64 9 10 -7 , we identified 44 trait-SNP associations for five traits, namely, FF (third season), RSD (third season), SKT (third season), FP (second season), and FP (third season). These were designated the top QTL candidates (Table S5). The association pattern across the genome scaffolds visualized with Manhattan plots showed that association trait-SNPs were scattered throughout the genome (Fig. 5, Fig. S4-S6). We arbitrarily searched 100 kbp upstream and downstream of these trait-SNP association loci already known for candidate genes in the Boehmeria nivea Fig. 3 Neighbor-joining (NJ) tree of 112 ramie core germplasms cultivar Zhongzhu No. 1 genome. We predicted the number of candidate genes associated with the five aforementioned traits. We found 30 candidate genes from the significant SNP regions (PHNS01005842.1-2705313) associated with FF of the third season (Table  S6). Whole_GLEAN_10029631 and  whole_GLEAN_10029638 located at 66.7 kbp and 60.4 kbp from the trait-SNP, respectively, were promising candidate genes encoding a cotton fiberexpressed protein of unknown function and an Arabidopsis thaliana homeobox protein ATH, respectively. In order to validate the obtained SNP (PHNS01005842.1:2705313) can be used for molecular marker research, we design a pair of allelespecific PCR primer and use the primer to identify the different fiber fineness accessions. Correlation analysis shows the fiber fineness was significantly associated (P \ 0.01) with the amplification results, suggests that the SNP (PHNS01005842.1:2705313) was associated with fiber fineness in ramie (Fig. 6, Table S3).
One SNP marker (PHNS01006314.1:1911874) was associated with SKT of the third season (Table S8). The transcript of whole_GLEAN_10021907 was annotated as Arabidopsis thaliana G-box-binding factor 1 and was located 31.5 kbp from the marker. The transcript of whole_GLEAN_10021909 was annotated as Arabidopsis thaliana homeobox-leucine zipper protein ATHB-40 and was located 13 kbp from the marker. The transcript of whole_-GLEAN_10021913 was annotated as Arabidopsis thaliana BES1/BZR1 homolog protein 2 and was located 22.4 kbp from the marker. All of these were promising candidate genes.

Population structure and linkage disequilibrium
The phylogenetic tree constructed using neighborjoining algorithm based on the genetic similarity coefficients among individuals to reflects genetic relationships. In contrast, the population structure analysis using Admixture software, which estimates  Table S3). The target product size was 199 bp individual ancestries by efficiently computing maximum likelihood in a parametric model. In the present study, three major clusters were defined in a dendrogram. The second lowest cross-validation error was obtained for K = 3. The 112 accessions may be partitioned into three subgroups. No obvious relationship was detected between the accessions and their geographic origins. Indicate these lines might have undergone introgression or gene flow during ramie breeding in China. This finding was consistent with those in previous reports on Boehmeria nivea (Luan et al. 2014(Luan et al. , 2015Zhu et al. 2017).
In the present study, we used the reference genome of Boehmeria nivea zhongzhu1 based on SLAF-seq and identified 215,376 highly consistent SNPs which was * 2 9 the number of SNPs (108,888) reported by Chen et al. (2018). This abundance of SNPs will be useful for constructed a precise haplotype map and LD studies (Buckler and Gore 2007;Gore et al. 2009).
It is crucial to the LD decay of the genome for identifying the number and density of molecular markers for GWAS and their analysis methods (Myles et al. 2009). Ramie is predominantly a cross-pollinated or clonally propagated species. Thus, it should present with lower LD levels than those of self-pollinated crops. Despite several earlier association studies on ramie (Luan et al. 2017;Liu et al. 2017;Chen et al. 2018), none of them reported any LD decay values. Here, the genome-wide LD decayed to r 2 = 0.1 was 11.75 kbp in physical distance. This value was substantially lower than those for most other crops such as rice (Oryza sativa L.) (167 kbp) (Huang et al. 2010), soybean (420 kbp) (Zhou et al. 2015), and cultivated maize (Zea mays L.) (30 kbp) (Hufford et al. 2012), but higher than that of cultivated cassava (Manihot esculenta L.) (8 kbp) , inbred maize (Zea mays L.) (1.5 kbp) (Remington et al. 2001), and inbred potato (Solanum tuberosum L.) (1 kbp) (Simko 2004). Therefore, the 112 ramie accessions of the present study should have abundant allelic diversity based on the rapid LD decay in most of the scaffolds. In this way, the search for candidate genes is facilitated through efficient narrowing of the putative QTL regions.

Genome-wide association study (GWAS)
The selection of best statistical models can accurately evaluate the associations between markers and (1455 m g -1 ), 1-43 (1526 m g -1 ), 2-35 (2068 m g -1 ), 2-34 (2199 m g -1 ), and 1-26 (2237.5 m g -1 ), respectively phenotypes. As the availability of genotype data increases, comprehensive statistical models are required to distinguish true biological associations from the false positives arising from population structure and LD (Wang and Qin 2017). Numerous statistical models are available to calculate the significance of the associations between SNPs and traits (Ogura and Busch 2015). Luan et al. (2017) detected 16 stable molecular markers from an association analysis of fiber yield traits in ramie based on the MLM-Q ? K model. However, this model considers the effects of Q for the population structure, analyzes K for the genetic relationships, and Bonferroni correction is required for multiple tests. When marker number is very large, this test is too strict (Hou et al. 2018). In terms of the Q-Q for our results, MLM-Q ? K overcorrected the confounders while GLM-Q was strongly skewed towards significance for every trait. EMMAX used the identity-by-state (IBS) relationship matrix, effectively controlled the false positive rate, and provided suitable statistical power to identify significant marker-trait associations. Therefore, the identity-by-state (IBS) relationship matrix was appropriate for ramie GWAS in natural populations.
The discovery of QTL for agronomically and economically important crop traits was of vital importance for marker-assisted breeding. To our best knowledge, this is the first report on the QTL for fiber yield and fineness based on the reference genome of Boehmeria nivea ). We found one, seven, one, seven, and twenty-seven significant SNP marker associations for FF (third season), RSD (third season), RST (third season), FP (second season), and FP (third season), respectively. The associated trait-SNP (PHNS01005842.1:2705313) for fiber fineness was validated by genomic DNA PCR, indicated this SNP can be serve as unique tools for molecularassisted breeding.
Two candidate genes in the significant trait-SNP associations were identified for FF (third season). The whole_GLEAN_10029622, Pfam annotation encodes a cotton fiber-expressed protein of unknown function. The head of the protein includes the domain DUF4408 and the tail is the cotton fiber-expressed protein domain DUF761. The whole_GLEAN_10029638 is an ortholog of the Arabidopsis thaliana homeobox protein ATH1. In Arabidopsis, the TALE homeobox gene ATH1 positive regulation of the floral repressor FLC, controls floral competency (Proveniers et al. 2007). The qPCR validation disclosed that these genes were upregulated in three fineness accessions compared to their expression levels in three fiber thickness accessions. Therefore, these two may be candidates for fiber fineness. The transcript of the whole_-GLEAN_10021913 homolog of Arabidopsis thaliana BES1/BZR1 protein 2 was associated with SKT of the third season. It was a downstream target gene of the brassinosteroid (BR) signal. BRs regulate multiple biological processes including vasculature differentiation, cell elongation, senescence, photomorphogenesis, and stress response (Yu et al. 2011). In Arabidopsis, AtMYB30 is a direct target of BES1 and co-regulates brassinosteroid-induced gene expression with BES1 (Li et al. 2009b). Chai et al. (2014) reported that PdMYB10 and PdMYB128 in Arabidopsis increased stem fiber cell wall thickness and delayed flowering. Overexpression of PdMYB90, PdMYB167, PdMYB92, and PdMYB125 in Arabidopsis decreased stem fiber and vessel cell wall thickness and promoted flowering. The qPCR validation showed that whole_-GLEAN_10021913 expression was relatively lower in thick SKT varieties. Thus, it is a promising candidate gene associated with SKT of the third season. Another candidate gene, whole_GLEAN_10015241, was associated with RSD of the third season. It was annotated as a zinc finger protein CONSTANS-LIKE 3 gene. The Arabidopsis CONSTANS gene promotes flowering and its overexpression results in earlier flowering than the wild type (Putterill et al. 1995). Early flowering reduces the time for vegetative growth and results in comparatively smaller plants. In contrast, late flowering prolongs vegetative growth which culminates in relatively larger plants (Lenser and Theissen 2013). The candidate gene whole_-GLEAN_10016511 was associated with PC of the second season. It is a homolog of the Arabidopsis thaliana transcription factor KAN2 gene. In Arabidopsis thaliana, the transcription factor KAN2 regulates lateral organ polarity and determines abaxial cell fate during lateral organ formation (McAbee et al. 2006). KAN genes participate in ovule development. The growth of the outer integument observed is retarded in certain kan mutant combinations (Eshed et al. 2001). The qPCR indicates that the relative expression levels of these two genes did not correlate with the RSD or PC phenotypes for six varieties (Fig. S7-S9). This anomaly merits further investigation. In the present study, the GWAS results were limited duo to the small population size. Comprehensive phenotypic studies are currently in progress, and the future studies on association mapping in a wide range of samples may be performed.

Conclusions
This study provided information for understanding the population structure, LD decay and genetic variation in ramie natural population. The identified SNPs or genes for fiber fineness and yield may provide the basis for future research on fiber fineness and yield and marker-assisted selection breeding for ramie.