Coleoptile length in barley accessions
Coleoptile lengths recorded in two independent experiments were significantly correlated (R² = 0.87). Therefore, the mean of the two experiments was used to represent each variety’s coleoptile length. Coleoptile length varied significantly among the investigated barleys, and its heritability was estimated at 0.54. The mean length across the 328 barley accessions was 5.40 cm, ranging from 3.27 cm in the shortest variety (CDC Unity) to 7.51 cm in the longest (Russia24). The vast majority of varieties in the panel (292 varieties, 88.8%) had coleoptile lengths of 4 to 6.51 cm; only 27 varieties (8.2%) ranged from 6.50 to 7.51 cm and 10 varieties (3.0%) from 3.27 to 3.99 cm (Figure 1 (A)). Barleys from different origins, of different row types and with different growth habits were compared (Figure 1 (B), (C) and (D)). Coleoptile length separated the origins into two groups: group 1 comprised Australia (mean 5.60 cm), Africa (mean 5.74 cm) and Asia (mean 5.86 cm); group 2 comprised Europe (mean 5.26 cm) and America (mean 5.23 cm). Barleys of Australian origin had longer coleoptiles than European (p = 3.79×10⁻⁴) and North and South American (p = 1.26×10⁻³) varieties; African barleys had longer coleoptiles than European (p = 2.51×10⁻²) and North and South American (p = 4.26×10⁻²) varieties; and Asian barleys had longer coleoptiles than European (p = 1.93×10⁻⁴) and North and South American (p = 8.87×10⁻⁴) varieties (Figure 1 (B)). No significant difference was found among the members within either group. Two-row and six-row barleys had similar coleoptile lengths (means of 5.40 cm and 5.37 cm, respectively), and no significant difference was detected. Similarly, no significant difference was found between growth habits (spring and winter barleys both had a mean of 5.38 cm).
In conclusion, coleoptile length was associated with breeding origin but not with row type or growth habit. The varieties with relatively long coleoptiles (>6.50 cm) are listed in Table S1, together with their origins, row types, growth habits and coleoptile lengths.
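The agreement between the two experiments and the per-variety averaging described above can be sketched as follows. This is a minimal illustration, assuming that R² is the squared Pearson correlation between the two experiments; the accession names and lengths are invented placeholders, not data from this study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical coleoptile lengths (cm) from two experiments (placeholder values).
exp1 = {"acc_A": 5.1, "acc_B": 6.9, "acc_C": 3.4, "acc_D": 5.6}
exp2 = {"acc_A": 5.5, "acc_B": 7.3, "acc_C": 3.2, "acc_D": 5.2}

names = sorted(exp1)
r = pearson_r([exp1[n] for n in names], [exp2[n] for n in names])
r_squared = r ** 2  # assumed to correspond to the reported R²

# Mean of the two experiments, used as the phenotype of each accession.
mean_length = {n: (exp1[n] + exp2[n]) / 2 for n in names}
```

With real data, `mean_length` would hold one value per accession and serve as the phenotype in the comparisons between origins, row types and growth habits.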
Population structure and linkage disequilibrium (LD) analysis
A neighbor-joining (NJ) tree based on genetic distances was constructed for the 328 barley accessions in this study and annotated with their origins, row types and growth habits (Figure 2 (A)). The collection showed high genetic diversity and covered accessions from 35 countries worldwide, with different row types (two-row and six-row) and growth habits (winter, spring and facultative) (Figure 2 (A) (B); Supplementary Figure S1). The results of the PCA, with separation based on row type, growth habit or geographic location, are presented in Figure 2 (B).
The optimal K value (number of subpopulations) for the barley germplasm collection was estimated using ADMIXTURE v.1.3.0 [1]. The optimal number of subpopulations was K = 7 according to the Δ cross-validation error (Figure 2 (C) and Supplementary Figure S2). The population structures of the 328 barley accessions for K values from 2 to 12 are shown in Supplementary Figure S3. Pairwise LD (r²) was analysed on each chromosome and decayed with increasing physical distance (Supplementary Figure S4).
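For reference, the pairwise LD statistic r² between two biallelic markers can be computed as in the sketch below. It assumes fully homozygous (inbred) accessions, so each accession carries a single allele code per marker (0 = reference, 1 = alternative); the software and settings actually used for the LD analysis are not stated in this section, and the genotype vectors are invented.

```python
def ld_r2(marker_a, marker_b):
    """r² between two biallelic markers coded 0/1 across the same accessions.

    r² = D² / (pA (1 - pA) pB (1 - pB)), where D = pAB - pA * pB.
    """
    n = len(marker_a)
    p_a = sum(marker_a) / n   # frequency of allele 1 at marker A
    p_b = sum(marker_b) / n   # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in zip(marker_a, marker_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b      # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Placeholder genotypes for four accessions.
complete_ld = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])   # identical markers: r² = 1
no_ld = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])         # independent markers: r² = 0
```

Plotting r² for marker pairs against their physical distance gives the decay curve summarized above; monomorphic markers must be excluded beforehand because the denominator would be zero.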
Association analysis
The association study was performed using two subsets of markers: subset MAF01 (MAF > 0.01) with 33,146 markers and subset MAF05 (MAF > 0.05) with 23,193 markers. Significant marker-trait associations (MTAs; qFDR < 0.05) were identified on all seven chromosomes (Supplementary Figure S5 (A) and (B)). The quantile–quantile (QQ) plots indicated that the GLM was suitable and efficient for this study (Supplementary Figure S5 (C) and (D)). All significant MTAs identified with the two marker cut-offs are listed in Supplementary Table S2. In total, 128 markers were significantly associated with coleoptile length (qFDR < 0.05) in both marker subsets, representing 53 genic loci (within genes) and 54 intergenic loci (between genes) (Supplementary Table S2) and explaining 4.07–6.81% of the phenotypic variation (r²). Beyond these common markers, MAF01 identified 49 additional markers, representing a further 19 genic and 9 intergenic loci (Supplementary Table S2) and explaining 3.37–5.54% of the phenotypic variation (r²), whereas MAF05 identified 21 additional markers, representing a further 7 genic and 9 intergenic loci (Supplementary Table S2) and explaining 3.49–4.07% of the phenotypic variation (r²).
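The two marker subsets and the FDR-adjusted significance threshold can be illustrated with the sketch below. It assumes biallelic markers coded 0/1 in inbred accessions and assumes that qFDR is a Benjamini-Hochberg adjusted p-value; the exact FDR procedure is not stated in this section. The marker names, genotypes and p-values are invented.

```python
def minor_allele_frequency(calls):
    """MAF of one marker; calls are 0 (reference), 1 (alternative) or None (missing)."""
    observed = [c for c in calls if c is not None]
    alt_freq = sum(observed) / len(observed)
    return min(alt_freq, 1.0 - alt_freq)

def maf_subset(genotypes, threshold):
    """Names of the markers whose MAF exceeds the threshold."""
    return {name for name, calls in genotypes.items()
            if minor_allele_frequency(calls) > threshold}

def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Placeholder genotypes for 100 accessions.
genotypes = {
    "rare_marker": [1] * 2 + [0] * 98,     # MAF = 0.02
    "common_marker": [1] * 30 + [0] * 70,  # MAF = 0.30
    "fixed_marker": [0] * 100,             # MAF = 0.00
}
maf01 = maf_subset(genotypes, 0.01)  # contains rare_marker and common_marker
maf05 = maf_subset(genotypes, 0.05)  # contains common_marker only

qvalues = bh_qvalues([0.01, 0.04, 0.03, 0.005])
```

Under this assumption, a q-value depends on the number of tests: a p-value of 0.005 receives q = 0.02 in the four-test example above but q = 0.01 when only the two smallest p-values are tested. If qFDR was computed separately for each subset, this may explain how MAF05 can flag markers that MAF01 does not, even though every MAF05 marker also belongs to MAF01.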
The highly significant loci (qFDR < 0.01, −log₁₀(q) > 2.0) identified by either MAF01 or MAF05 are listed in Table 1. Five genic loci were identified by both MAF01 and MAF05: HORVU1Hr1G073010 (unknown function), HORVU1Hr1G076430 (FT), HORVU1Hr1G077230 (CSLC6), HORVU5Hr1G058300 (TPPB) and HORVU5Hr1G007340 (LRR-RLK). One genic locus, within HORVU6Hr1G019700 (SPL3), was detected only in MAF01. Two genic loci, within HORVU6Hr1G022770 (VIN3) and HORVU6Hr1G022500 (BTBD), were detected only in MAF05. The effects on coleoptile length of the most significant marker (lowest qFDR) for each candidate gene are summarized in Figure 3. For SPL3, marker C6H53910826 was associated with a 1.42 cm longer coleoptile (p = 2.85×10⁻⁶) when the alternative allele T was present, explaining 5.54% of the phenotypic variation. For the gene of unknown function HORVU1Hr1G073010, marker D1H500582726 was associated with a 0.24 cm shorter coleoptile (p = 1.52×10⁻²) when the alternative allele T was present, explaining 6.52% of the phenotypic variation. For FT, marker C1H514098702 was associated with a 0.34 cm longer coleoptile (p = 4.81×10⁻⁴) when the alternative allele C was present, explaining 6.68% of the phenotypic variation. For CSLC6, marker D1H516785422 was associated with a 0.33 cm shorter coleoptile (p = 1.96×10⁻³) when the alternative allele G was present, explaining 6.28% of the phenotypic variation. For TPPB, marker D5H456061421 was associated with a 0.99 cm longer coleoptile (p = 4.74×10⁻⁹) when the alternative allele C was present, explaining 6.45% of the phenotypic variation. For LRR-RLK, marker D5H014097066 was associated with a 0.49 cm shorter coleoptile (p = 1.05×10⁻³) when the alternative allele G was present, explaining 6.10% of the phenotypic variation. For VIN3, marker C6H72969182 was associated with a 0.51 cm shorter coleoptile (p = 1.71×10⁻⁹) when the alternative allele A was present, explaining 5.91% of the phenotypic variation. For BTBD, marker D6H071745828 was associated with a 0.52 cm longer coleoptile (p = 1.73×10⁻⁹) when the alternative allele G was present, explaining 5.87% of the phenotypic variation.
In conclusion, all the associated markers within genes had strong effects on coleoptile length. Twelve intergenic loci were identified by MAF01, MAF05 or both; genes near these loci that may influence coleoptile length are listed in Supplementary Table S3.
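The two quantities reported for each marker, the allele effect in cm and the proportion of phenotypic variation explained, can be illustrated with a single-marker sketch. It assumes the effect is the difference between the mean lengths of the two allele groups and computes r² as the between-group share of the total sum of squares. The GLM used in the study may include additional terms, such as population structure, so its r² values need not match this simplified calculation. All values below are invented.

```python
def allele_effect(lengths, alleles, ref, alt):
    """Return (effect, r2) for one biallelic marker.

    effect: mean length with the alternative allele minus mean with the reference.
    r2: between-group sum of squares divided by the total sum of squares.
    """
    ref_vals = [x for x, a in zip(lengths, alleles) if a == ref]
    alt_vals = [x for x, a in zip(lengths, alleles) if a == alt]
    mean_ref = sum(ref_vals) / len(ref_vals)
    mean_alt = sum(alt_vals) / len(alt_vals)
    all_vals = ref_vals + alt_vals
    grand_mean = sum(all_vals) / len(all_vals)
    ss_total = sum((x - grand_mean) ** 2 for x in all_vals)
    ss_between = (len(ref_vals) * (mean_ref - grand_mean) ** 2
                  + len(alt_vals) * (mean_alt - grand_mean) ** 2)
    return mean_alt - mean_ref, ss_between / ss_total

# Placeholder phenotypes (cm) and allele calls for six accessions.
lengths = [5.0, 5.2, 5.4, 6.6, 6.8, 7.0]
alleles = ["C", "C", "C", "T", "T", "T"]
effect, r2 = allele_effect(lengths, alleles, ref="C", alt="T")
```

A positive `effect` corresponds to a longer coleoptile when the alternative allele is present, and a negative value to a shorter one.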
Major candidate gene for coleoptile length
Markers within SPL3 were associated with coleoptile length in both MAF01 and MAF05. In MAF01, 5 markers in SPL3 were significant (0.01 < qFDR < 0.05) and 11 markers showed the highest significance of all associations (qFDR = 5.4×10⁻³). In MAF05, 2 markers in SPL3 were significant (0.01 < qFDR < 0.05) (Supplementary Table S2). Although the significant markers within all eight candidate genes had strong effects on coleoptile length, marker C6H53910826 (and the 10 other SPL3 markers with qFDR = 5.4×10⁻³ in MAF01) showed the largest effect: the mean length was 5.37 cm when allele C was present and 6.79 cm when allele T was present (p = 2.85×10⁻⁶) (Figure 3 (A)). In conclusion, multiple markers in SPL3 were significantly associated with coleoptile length in both marker subsets, and some of them had the highest significance (qFDR = 5.4×10⁻³) and the strongest effect on the phenotype. Therefore, SPL3 was considered the major candidate gene associated with coleoptile length in this study.
SPL3 is a transcription factor gene located on chromosome 6H from 53,909,817 to 53,916,886 bp (7,070 bp), consisting of a 5′ UTR, a 3′ UTR, four exons and three introns (Figure 4 (A)). The gene encodes a protein of 474 amino acids (aa). The conserved SBP domain is the central functional region of this transcription factor and contains a plant-specific DNA-binding domain. All variants and amino acid substitutions are summarized in Figure 4 (B). Five variants were located in exons: four missense variants and one synonymous variant. The missense variant at position 53,913,050 replaced serine with alanine in the SBP domain, likely affecting its DNA-binding activity. Furthermore, the corresponding marker, C6H53913050, was among the markers with the highest significance in this study (qFDR = 5.4×10⁻³). The other three missense variants were a glutamic acid-to-lysine substitution at position 53,913,549 and alanine-to-valine substitutions at positions 53,913,335 and 53,910,588. Five variants were found in the 5′ or 3′ UTR and eight in the introns (Figure 4 (B)). Figure 4 (C) shows the positions of all detected variants, both significantly and non-significantly associated with coleoptile length, and the LD plot surrounding these markers.
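The gene coordinates and marker positions given above can be cross-checked with a short sketch. It assumes that marker names follow the pattern of a one-letter prefix, the chromosome (for example 6H) and the physical position in bp; this pattern is inferred from the names in the text and is not documented here. The helper names are hypothetical.

```python
import re

# SPL3 coordinates as reported in the text: chromosome, start (bp), end (bp).
SPL3 = ("6H", 53_909_817, 53_916_886)

def parse_marker(marker_id):
    """Split a marker name such as 'C6H53910826' into (chromosome, position).

    Assumed naming pattern: one letter, chromosome number, 'H', position.
    """
    match = re.fullmatch(r"[A-Z](\d)H(\d+)", marker_id)
    if match is None:
        raise ValueError(f"unexpected marker name: {marker_id}")
    return f"{match.group(1)}H", int(match.group(2))

def in_gene(marker_id, gene):
    """True if the marker lies within the gene interval (inclusive)."""
    chrom, pos = parse_marker(marker_id)
    gene_chrom, start, end = gene
    return chrom == gene_chrom and start <= pos <= end

# Inclusive interval length: end - start + 1.
gene_length = SPL3[2] - SPL3[1] + 1

spl3_markers = [m for m in ("C6H53910826", "C6H53913050", "D5H456061421")
                if in_gene(m, SPL3)]
```

Here `gene_length` evaluates to 7,070 bp, matching the reported size, and only the two 6H markers named in the text fall within the SPL3 interval.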
To further understand the role of SPL3 in barley coleoptile growth, we measured its expression in coleoptile tissue of two varieties, CDC Unity and CI5791, representing two major SPL3 haplotypes (C/T at marker C6H53910826) in the population. SPL3 expression in CDC Unity and CI5791, in the dark and under natural daylight, is compared in Supplementary Figure S6. The data represent actin-normalized target gene expression relative to the control (SPL3 expression in the coleoptile of CDC Unity in the dark, set to 1). There was no statistically significant difference between CDC Unity and CI5791, either in the dark or under natural daylight. However, in CDC Unity, expression under light was 43% lower than in the dark; similarly, in CI5791, it was 39% lower. In conclusion, the difference in coleoptile length between the two SPL3 alleles was not due to differences in gene expression, but for both alleles expression declined when the coleoptile was exposed to daylight.
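Actin-normalized expression relative to a control is commonly calculated with the 2^(−ΔΔCt) method. The sketch below assumes that method, which is not named in this section, together with 100% amplification efficiency. The Ct values are invented for illustration and are not measurements from this study.

```python
def relative_expression(ct_target, ct_actin, ct_target_control, ct_actin_control):
    """Relative expression by the 2^(-ΔΔCt) method (assumed, not confirmed by the text)."""
    delta_sample = ct_target - ct_actin                   # ΔCt of the sample
    delta_control = ct_target_control - ct_actin_control  # ΔCt of the control
    return 2.0 ** -(delta_sample - delta_control)

# Placeholder Ct values; the control is the dark-grown sample itself.
dark = relative_expression(25.0, 18.0, 25.0, 18.0)   # control, equals 1 by definition
light = relative_expression(26.0, 18.0, 25.0, 18.0)  # one extra cycle: half the expression

percent_decrease = (1.0 - light / dark) * 100.0
```

In this toy case the expression under light is 0.5 relative to the dark control, a 50% decrease; the 43% and 39% decreases reported above would correspond to relative expression values of 0.57 and 0.61.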