GWAS unveils features between early- and late-flowering pearl millets

doi:10.21203/rs.3.rs-25381/v1

Research article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 10 Nov, 2020

Read the published version in BMC Genomics →

Background

Pearl millet, a staple food for around 100 million people in Africa and India, is highly diverse owing to extensive genetic diversity combined with a high degree of admixture with wild relatives. In Senegal, two major morphotypes are distinguished: early-flowering and late-flowering millets. Phenotypic variability in flowering time plays an important role in pearl millet adaptation to climate variability. A better understanding of the genetic makeup of this variability would allow breeding of pearl millet suited to different climatic areas. In this study, we aimed to characterize the genetic basis of these phenotypic differences.

Results

We defined a core collection capturing most of the diversity of cultivated pearl millet of Senegal, which includes 60 early-flowering Souna and 31 late-flowering Sanio. This panel was evaluated during the 2016 and 2017 rainy seasons at Nioro for 16 agro-morphological traits. Phenological and phenotypic traits linked with yield, flowering time, and biomass helped differentiate early- and late-flowering millets. Further, using genotyping-by-sequencing (GBS), 21,663 single nucleotide polymorphisms (SNPs) with minor allele frequencies of more than 5% were identified. Sparse Non-Negative Matrix Factorization (sNMF) analysis confirmed the genetic structure in 2 gene pools associated with flowering time differences. Moreover, 2 chromosomal regions on linkage groups 3 (~ 89.7 Mb) and 6 (~ 68.1 Mb) differentiated the early-flowering millets into 2 clusters. A genome-wide association study (GWAS) was used to associate phenotypic variation with the SNPs, and 18 genes were linked to flowering time, plant height, nodal tiller number, and biomass (P-value < 2.3E-06).

Conclusions

The diversity of early- and late-flowering pearl millet landraces of Senegal was captured using a heuristic approach. Key phenology and phenotypic traits, SNPs, and candidate genes underlying flowering time, tillering, biomass and plant height of pearl millet were identified. Chromosome rearrangements in LG 3 and 6 were implicated as a source of variation in early-flowering morphotypes. Using candidate genes underlying these differences between pearl millet morphotypes would be of paramount importance in breeding strategies under climate change scenarios.

Epigenetics & Genomics

Senegal

millet

morphotypes

flowering

diversity

GWAS

Pearl millet [Pennisetum glaucum (L.) R. Br., syn. Cenchrus americanus] is a dietary food for people, particularly in Africa and Asia. It has a highly nutritious composition (high protein and fiber content) and is richer in energy and essential minerals (such as iron and zinc) than other cereals (1). Gluten-free, with a low glycemic index and hypoallergenic properties, pearl millet could therefore be promoted as a nutrient-rich food for health and food security (2).

Pearl millet is an annual C4 crop cultivated in the driest environments. Pearl millet domestication occurred at least 4,900 years ago (3). Owing to extensive genetic diversity combined with a high degree of admixture with wild relatives, pearl millet shows large morphological and genetic diversity (3, 4). For example, late-flowering pearl millet morphotypes, which are also grown to cope with the hunger gap, are often more sensitive to photoperiod than early-flowering ones (5). Genetic studies have already identified polymorphisms within PgPhyC and PgMADS11 associated with yield and earliness in flowering time (6–8). Such variants could be useful in breeding strategies to cope with the recurrent drought periods that have been prevalent in the western Sahel over the last three decades. Indeed, flowering time plays an important role in pearl millet adaptation to climate variability, since synchronizing flowering with the rainfall period allows optimal development of varieties. A better understanding of the genetic features of flowering time would be useful for breeding pearl millet to fit different climatic areas.

In Senegal, two major varieties of pearl millet are cultivated: Souna morphotypes, which flower early (50 to 60 days), and Sanio morphotypes, which flower late (80 to 110 days) (9, 10). Between 1992 and 2014, a nationwide collection of landraces was assembled, capturing the Senegalese cultivated pearl millet varieties. Using agro-morphological traits, microsatellites, and single nucleotide polymorphism markers, studies have highlighted the genetic differentiation between the 2 types of varieties (8, 9, 11). The genetic structure of pearl millet landraces in Senegal correlates with a north-south gradient of rainfall (8), with low geographic structuration (11). However, we still do not know the genomic regions associated with the major difference in flowering time and, more broadly, with the morphological differences between these two genetic pools.

To better characterize the genetic diversity of a large pearl millet collection, one strategy is to define and use a subset of accessions representing the panel (12). Moreover, several approaches have been used to characterize pearl millet diversity, using genotyping-by-sequencing (11) or even full genome resequencing (13). Markers generated by such approaches could be used in association studies of adaptation to stressful environmental conditions.

In this study, a core collection of cultivated pearl millets from Senegal was first defined. Then, the accessions were field-evaluated, GBS-sequenced, and used in GWAS to identify QTLs associated with traits that differentiate early-flowering (Souna) and late-flowering (Sanio) accessions. Key phenology and phenotypic traits and SNPs, as well as candidate genes underlying flowering time, tillering, biomass and plant height of pearl millet, were identified. Their uses in breeding strategies under traditional production systems, addressing environmental changes and farmers' preferences, are discussed.

Establishment of a core germplasm

To capture the diversity of cultivated pearl millet in Senegal, 392 accessions collected nationwide were analyzed. Early- and late-flowering morphotypes differed in plant height, tillering, biomass, nodal tiller number, and heading and flowering time (Fig. 1a). Using a heuristic approach, a core collection of 91 accessions was established, including 60 early-flowering Souna and 31 late-flowering Sanio. This panel is geographically well distributed and covers 22.5% of the germplasm (Fig. 1b, Table S1). The coincidence rate (%CR) was 97% in the Souna subset and 92.5% in the Sanio subset. All statistical consistency criteria (%MD, %VD, and %VR) showed high scores, except for the variance difference percentage in the early-flowering core collection (Table S2). Likewise, the core collection captures most of the alleles present in the germplasm. The Nei diversity index is 0.62 among the 91 accessions and 0.56 within the entire collection; reducing the number of accessions therefore did not cause a significant loss of diversity (P-value = 0.49). Similarly, the Principal Component Analysis (Fig. 1c) and the Neighbor-joining (NJ) analysis based on Nei genetic distances (Fig. 1d) support the 2 genetic pools of pearl millet morphotypes.
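The four consistency criteria quoted above can be illustrated with a short Python sketch. It assumes the definitions commonly used for core-collection evaluation (mean difference, variance difference, coincidence rate of ranges, and variable rate of coefficients of variation, each averaged over traits); the source does not spell out the formulas, so the function name, the input layout, and the toy values are hypothetical.

```python
from statistics import mean, variance

def core_collection_stats(entire, core):
    """Compare a core subset with the entire collection, trait by trait.

    entire, core: dicts mapping trait name -> list of values.
    Assumed definitions (not stated in the source), averaged over m traits:
      MD% = |Me - Mc| / Mc      (means)
      VD% = |Ve - Vc| / Vc      (variances)
      CR% = Rc / Re             (ranges)
      VR% = CVc / CVe           (coefficients of variation)
    """
    traits = list(entire)
    m = len(traits)
    md = vd = cr = vr = 0.0
    for t in traits:
        e, c = entire[t], core[t]
        me, mc = mean(e), mean(c)
        ve, vc = variance(e), variance(c)      # sample variances
        re_, rc = max(e) - min(e), max(c) - min(c)
        cve, cvc = ve ** 0.5 / me, vc ** 0.5 / mc
        md += abs(me - mc) / mc
        vd += abs(ve - vc) / vc
        cr += rc / re_
        vr += cvc / cve
    return {
        "MD%": 100 * md / m,
        "VD%": 100 * vd / m,
        "CR%": 100 * cr / m,
        "VR%": 100 * vr / m,
    }

# Toy data (hypothetical): one trait, five accessions, three kept in the core.
stats = core_collection_stats({"PH": [1, 2, 3, 4, 5]}, {"PH": [1, 3, 5]})
print(stats)
```

In this toy case the core keeps both extremes, so the range is fully retained (CR% = 100) while the variance is inflated relative to the entire collection.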

Phenotypic traits discriminating Souna and Sanio

To identify phenotypic traits that differ between early- and late-flowering millets, field evaluations of 16 traits were conducted in a common environment (Nioro) during the 2016 and 2017 rainy seasons. Analysis of variance showed large phenotypic variability among the 91 individuals for almost all traits, except for 1000-grain weight. This indicates a highly significant genotype effect on the traits. The interactions between genotypes and years are significant to different degrees for all agro-morphological characters, except for downy mildew damage score, tillering and panicle exertion (Table 1).

Table 1

Analysis of variance and heritability (h²) of the 16 phenotypic traits in the core collection. DM = Downy mildew, HT = 50% Heading time, FLO = 50% Flowering time, NTN = Nodal tiller number, IL = Internode length, MSD = Main stem diameter, PH = Plant height, NPT = Number of productive tillers, FLL = Flag leaf length, FLW = Flag leaf width, SL = Spike length, ST = Spike thickness, SW = 1000 seed weight, PE = Panicle exertion, and GY = Grain yield. Significance: * P-value < 0.05, ** P-value < 0.01 and *** P-value < 0.001, NS = Not significant
	DF	DM	HT	FLO	NPT	NTN	IL	FLL	FLW	MSD	PH	SL	ST	PE	SW	Biomass	GY
Genotype effect	90	***	***	***	***	***	***	***	**	NS	***	***	***	***	NS	*	**
Year effect	1	NS	NS	NS	**	***	NS	*	NS	***	NS	NS	***	NS	***	***	**
Genotype x Year		NS	***	***	NS	*	**	*	*	***	***	*	*	NS	***	***	***
Means		0.03	56.37	60.68	6.36	9.33	21.11	46.09	4.59	4.19	241.62	48.9	8.43	3.15	5.84	1528.39	1637.49
Standard deviation		0.07	12.27	12.68	2.16	2.19	1.93	6.53	0.66	1.6	39.11	9.35	1.21	3.16	1.44	1114.7	1112.15
h²		0.69	0.97	0.98	0.86	0.96	0.68	0.62	0.6	0.23	0.94	0.9	0.84	0.71	0	0.42	0.55

The first two axes of the principal component analysis of all assessed phenotypic traits explained 63.7% of the variance and showed a strong agro-morphological structure separating early- and late-flowering millets (Fig. 1c). Further, the discriminant analysis revealed 13 traits that were significantly different between early- and late-flowering millets (P-value < 0.0001) (Table S2). Six traits showed a high correlation with the differentiation axis (≥ 0.7) and were thus considered highly discriminating characters. All of these highly discriminating characters have high heritability (Table 1). They can be classified into two types: phenological characters, which group the heading and flowering times, and phenotypic characters, which include biomass, plant height, nodal tiller number and the number of productive tillers (Fig. 2).

Genomic variation between the morphotypes

A total of 21,663 filtered high-quality SNPs were identified and used for the diversity study. The data contained an average of 3,095 SNPs per chromosome, and the average minor allele frequency (MAF) was 0.213. Genome-wide structure analysis revealed 2 genetic pools that corresponded to the early- and late-flowering millets: the sNMF analysis had its lowest cross-entropy value (0.724) at K = 2. Similarly, sNMF analysis of each of the 7 pearl millet linkage groups showed that the lowest cross-entropy values were at K = 2 for LG1 (0.717), LG2 (0.715), LG4 (0.736), LG5 (0.718) and LG7 (0.727), with the two groups corresponding to the 2 morphotypes. However, LG3 and LG6 were structured into 3 groups, with the lowest cross-entropy value at K = 3 (0.733 and 0.723, respectively): one group representing the late-flowering genetic pool and 2 others representing clusters of the early-flowering genetic pool (Figure S1). Further, the individuals differentiated at LG3 differ from those differentiated at LG6. The DAPC analyses confirmed the sNMF results, detecting two genetic groups matching the two morphotypes at the genome-wide level and at the chromosome level, except at LG3 and LG6, where three genetic groups were obtained. There, the late-flowering genetic pool separated along the first discriminant axis (DA) of the DAPC, and the early-flowering millets split into two clusters along the second DA. The differentiation of the early-flowering millets at LG3 is mainly due to a local region from position 3-85314430 to position 3-174943092, spanning approximately 89.7 Mb of chromosome 3. Similarly, the differentiation of early-flowering millets at LG6 is carried by the region between positions 6-85314430 and 6-17494309, covering 68.1 Mb of chromosome 6 (Fig. 3b-c, Figure S1).

Identification of SNP markers and candidate genes linked to morphotypes

The Quantile-Quantile (Q-Q) plots showed that the mixed linear model (MLM), which accounts for population structure and the kinship matrix, was appropriate, with strong statistical power. The GWAS analysis detected eighteen SNPs at the threshold of α = 2.3 × 10^−6 (Bonferroni correction). These SNPs were associated with biomass, flowering time, plant height and nodal tiller number. Biomass-associated SNPs do not form a peak of P-values but are scattered along the genome as isolated SNPs with very low P-values. Flowering time, plant height, and nodal tiller number are each tightly linked with a single SNP (Fig. 4, Table 2). For plant height, one SNP is located in the PgAAO1 gene, which encodes an Indole-3-AcetAldehyde Oxidase protein. For nodal tiller number, one SNP is located in the PgHK4 gene, which encodes a Histidine Kinase. Flowering time is associated with one SNP in the PgPPR gene, which encodes a Pentatricopeptide-Repeat Protein that belongs to the family of ABC DNA-binding cassette involved in plant resistance and defense.
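The Bonferroni threshold quoted above can be reproduced with a few lines of Python. This sketch assumes a family-wise α of 0.05 divided by the 21,663 filtered SNPs; the source states the threshold and the correction but not the α used, so that value is an assumption, and the function name is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value threshold that controls the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

n_snps = 21_663                      # filtered SNPs reported in the text
threshold = bonferroni_threshold(0.05, n_snps)   # 0.05 is an assumed alpha
print(f"{threshold:.2e}")            # 2.31e-06

# P-values taken from Table 2 for the three single-SNP traits
p_values = {
    "S5_143317884 (flowering time)": 1.94e-06,
    "S2_16892168 (nodal tiller number)": 4.91e-07,
    "S7_66044005 (plant height)": 1.04e-06,
}
significant = {snp: p < threshold for snp, p in p_values.items()}
print(significant)
```

Under this assumption, all three trait-specific SNPs fall below the threshold, in line with their inclusion in Table 2.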

Table 2

Significantly associated SNPs from GWAS mapped in the pearl millet genome, and their associated phenotypic traits
Marker	Chromosome	Associated phenotypic trait	P-values	FDR	Region	Distance to the nearest gene (bp)	Gene ID	Gene name	Definition	Number of proteins	Specific Functions	General Functions
S2_1673638	2	Biomass	4.52E-49	2.94E-45	Intergenic	400	Pgl_GLEAN_10023350	LTA3	Dihydrolipoyllysine-residue acetyltransferase component 1 of pyruvate dehydrogenase complex	3	metabolic process, transferase activity, transferring acyl groups	Biological Process, Molecular Function
S2_182434549	2	Biomass	2.01E-34	7.25E-31	Intergenic	15131	Pgl_GLEAN_10013800	MFDR	NADPH:adrenodoxin oxidoreductase	Unknown	Unknown
S2_222890810	2	Biomass	2.63E-07	0.000437794	Genic	Inside	Pgl_GLEAN_10027846	ALDH5F1	Succinate-semialdehyde dehydrogenase, mitochondrial	2	metabolic process, oxidoreductase activity; oxidation-reduction process	Biological Process, Molecular Function,
S2_222920292	2	Biomass	7.61E-14	2.06E-10	Intergenic	-169	Pgl_GLEAN_10027850	CCT1	Choline-phosphate cytidylyltransferase 1	1	biosynthetic process, nucleotidyltransferase activity;	Biological Process, Molecular Function
S3_202496771	3	Biomass	2.22E-21	6.86E-18	Genic	Inside	Pgl_GLEAN_10033123	Os04g0379900	Stearoyl-[acyl-carrier-protein] 9-desaturase 5	2	fatty acid metabolic process, fatty acid biosynthetic process, oxidoreductase activity, acyl-[acyl-carrier-protein] desaturase activity, oxidation-reduction process	Biological Process, Molecular Function,
S3_286185314	3	Biomass	3.77E-35	1.63E-31	Intergenic	-657	Pgl_GLEAN_10023831	SCAMP6	Secretory carrier-associated membrane protein 6	1	protein transport, integral to membrane	Biological Process, Cellular Component
S3_58698346	3	Biomass	2.03E-07	0.000366077	Intergenic	333	Pgl_GLEAN_10012689	PHYLLO	Protein PHYLLO	4	catalytic activity; cellular amino acid catabolic process; thiamine pyrophosphate binding;	Molecular Function, Biological Process,
S3_61879906	3	Biomass	3.98E-07	0.000615194	Genic	Inside	Pgl_GLEAN_10014062			Unknown	Unknown
S3_97334843	3	Biomass	1.58E-06	0.002279413	Intergenic	-10696	Pgl_GLEAN_10000315	UGT91C1	UDP-glycosyltransferase 91C1	1	metabolic process; transferase activity, transferring hexosyl groups	Biological Process, ; Molecular Function
S4_183522008	4	Biomass	3.45E-08	8.30E-05	Intergenic	12181	Pgl_GLEAN_10004825			1	Unknown
S4_32445778	4	Biomass	1.22E-07	0.000240007	Genic	Inside	Pgl_GLEAN_10002428	ABCF1	ABC transporter F family member 1	3	nucleotide binding; ATP binding, ATPase activity, nucleoside-triphosphatase activity	Molecular Function,
S4_45574746	4	Biomass	5.44E-49	2.94E-45	Genic	Inside	Pgl_GLEAN_10027450	CPLC4	Chaperone protein ClpC4	7	nucleotide binding, DNA binding; nuclease activity, ATP binding, nucleotide-excision repair, nucleoside-triphosphatase activity	Molecular Function, Biological Process,
S5_78613055	5	Biomass	5.56E-08	0.000120318	Intergenic	17163	Pgl_GLEAN_10038431		Unknown	Unknown	Unknown
S7_128335177	7	Biomass	1.97E-49	2.13E-45	Intergenic	-14	Pgl_GLEAN_10007334		Unknown	Unknown	Unknown
S7_88324424	7	Biomass	9.29E-50	2.01E-45	Genic	Inside	Pgl_GLEAN_10028316	DDB1B	DNA damage-binding protein 1b	1	nucleic acid binding, nucleus	Molecular Function, Cellular Component
S5_143317884	5	Flowering time	1.94E-06	0.0419816	Genic	Inside	Pgl_GLEAN_10013349	PPR	Pentatricopeptide repeat	1	Unknown
S2_16892168	2	Nodal tiller number	4.91E-07	0.01062524	Genic	Inside	Pgl_GLEAN_10013465	HK4	Histidine kinase 4	6	two-component sensor activity; two-component response regulator activity, two-component signal transduction system (phosphorelay); protein histidine kinase activity, ATP binding, regulation of transcription, DNA-dependent, signal transduction, membrane; phosphorylation, transferase activity, transferring phosphorus-containing groups, peptidyl-histidine phosphorylation	Molecular Function, Biological Process, Cellular Component,
S7_66044005	7	Plant height	1.04E-06	0.020333247	Genic	Inside	Pgl_GLEAN_10031969	AAO1	Indole-3-acetaldehyde oxidase	8	catalytic activity; electron carrier activity, oxidoreductase activity, acting on CH-OH group of donors, metal ion binding, flavin adenine dinucleotide binding, iron-sulfur cluster binding, 2 iron, 2 sulfur cluster binding, oxidation-reduction process;	Molecular Function, Biological Process

Building a Senegalese pearl millet landraces core collection

Using a heuristic approach applied to phenotypic and genetic data, we built a core collection of Senegalese varieties. This subset of accessions is geographically well distributed and captures the maximum genetic diversity and phenotypic variation of pearl millet landraces in Senegal. The spatial pattern of the diversity is correlated with the genetic structure, as previously reported (8), and fits the cultivation areas of the morphotypes across the country.

Allogamous species often have fewer redundancies due to gene flow between individuals. The great diversity of pearl millet has been associated with dynamic gene flow and admixture between wild and cultivated pearl millet (3). This great diversity probably explains why the core collection defined here represents 22% of the germplasm, a higher proportion than in self-pollinating species such as wheat (Triticum aestivum) (14) or rice (Oryza sativa) (15). For representativeness, there is no single or fixed fraction of accessions that defines a core collection. Indeed, the optimal fraction depends largely on the degree of genetic redundancy of the samples, the available resources, and the frequency of regeneration of entries (16). Our results showed no genetic redundancy among accessions in Senegal's pearl millet germplasm. However, they confirmed two genetic pools, as previously reported (8–10): an early-flowering pool subdivided into 3 clusters differentiated by yield, spike thickness, nodal tiller number, and flag leaf length, and a late-flowering pool subdivided into 3 clusters differentiated by yield and nodal tiller number (Figure S2).

Differentiation of early- and late-flowering millets may have occurred after the domestication of pearl millet, as a consequence of a center of specialization across the Sahel belt. This assumption is supported by previous observations reporting migrations, exchanges and gene flow that led to wider genetic diversity as an adaptive mechanism (3, 17). These findings, together with this assumption, make the subgroups identified within early-flowering Souna and late-flowering Sanio genuine heterotic groups for breeding for earliness or biomass, respectively.

Phenology and phenotypic traits featuring early- and late-flowering millets

Among the 16 quantitative agro-morphological characters evaluated across sites and years, 6 traits with high heritability were highly discriminating between early- and late-flowering millets. Two of them are phenological features, i.e., heading and flowering time, which are correlated and related to photoperiod sensitivity in pearl millet. The late morphotype is more sensitive to photoperiod. These features constitute two mechanisms by which pearl millet adapts to climate variability. Earliness of flowering in pearl millet has been associated with a population adaptation mechanism (7), while photoperiod sensitivity has been linked with an individual adaptation mechanism (5).

In parallel, four characters expressed during the vegetative stage, i.e., biomass, plant height, nodal tiller number and the number of productive tillers, differentiated early- and late-flowering millets with high heritability. Biomass, plant height, and nodal tiller number contribute to stover yield, while the number of productive tillers is a yield component. This is consistent with observations made by farmers, who grow both morphotypes based on preferences and agro-systems. For example, some Senegalese farmers (Niakhar, Bambey) intercrop early-flowering Souna and late-flowering Sanio during the rainy season to cope with long dry spells.

Chromosome rearrangements occurring as a source of diversity

Genome-wide sequencing revealed distinctive features on chromosomes 3 and 6 within Souna morphotypes, suggesting specific, independent rearrangements in these regions related to heading and flowering earliness. Chromosomes 3 and 6 might have undergone variations over regions of 89.7 Mb and 68.1 Mb, respectively, large enough to warrant a closer look at gene insertion, deletion or order (Fig. 3b-c). Indeed, chromosome rearrangements can act as a source of diversity through (i) standing rearrangements, which play a role in evolutionary change and adaptive evolution; (ii) transposable element rearrangements, which are a major mechanism of rearrangement and could catalyze changes in the expression of genes altered in association with rearrangements; and (iii) transcript variation (de novo, or in expression level via tandem duplication) (18). In allogamous crops such as pearl millet, extensive chromosomal rearrangements have occurred in the genome since it diverged from a common ancestor (11, 13). The latter two possible sources of variation would require more investigation at the gene expression level to test the assumption of independent rearrangements on these chromosomes. Instead, we favor an explanation based on standing rearrangements, since such variation provides the genetic diversity a population needs to adapt quickly to its environment. This is consistent with the evidence for rearrangements in the pearl millet genome revealed by synteny analysis with foxtail millet and sorghum (13).

Genes underlying allelic and traits diversity

The assembled panel is strongly structured between early-flowering Souna and late-flowering Sanio, with very few intermediate genotypes. Consequently, because corrections for population structure and kinship are applied, the power of the GWAS lies mainly in identifying SNPs and phenotypic variation within each group. A panel with more admixed genotypes would have been better suited to pinpointing phenotypic differences between early- and late-flowering millets. Biomass led to the identification of a high number of scattered SNPs without the classic peak of P-values around the most significant markers. It is unclear why such a pattern is observed for biomass only. A more classical pattern is observed for other traits such as plant height. The associated genes, PgAAO1 and PgHK4, are involved in regulating the plant development response (plant height and nodal tiller number) induced by abscisic acid (19) and in the synthesis of phylloquinone, which is essential in photosynthesis (20), respectively. This suggests that during the vegetative phase, Sanio millet produces more tillers, grows taller and captures more light through the hormonal and photosynthesis pathways than Souna millet does. Allelic variation in these genes might enable these specific phenotypes. On the other hand, during the transition to the flowering phase, gene expression may be at the origin of the phenotype, through repression or activation of the signaling pathways leading to these features. Orthologues of the PPR gene function in delaying flowering, by mediating the expression of several genes during plant growth or by repressing genes involved in the transition to the flowering phase (21). The PgPPR gene identified here could be a candidate gene with a role in the control of flowering time between early- and late-flowering pearl millet.

Key features for breeding

The main characters that differentiate the accessions in the core collection are the yield-related components, biomass, and flowering earliness. A north-south structure of early- and late-flowering millets across Senegal has been highlighted (22): there are more early-flowering millets in the north of the country and more late-flowering millets in the south. This also follows a rainfall gradient in which the central part was drier (average of 500–600 mm) than the south (average of 1200 mm) between the 1990s and 2014. In addition, certain diseases such as downy mildew are more prevalent in the agro-ecological zones from the south to the center than from the center towards the north of the country (23). In some areas, early maturity is suitable for coping with drought, while late maturity helps cope with bird damage and pests. Therefore, harnessing the diversity in flowering earliness to address climate variability in agro-ecosystems would be a step towards breeding early-maturing varieties. Our results showed that flag leaf length and the thinness of the middle axis of panicles, both with high heritability, differentiate the subgroups of early-flowering morphotypes. They could therefore be advantageous traits to target under hotter and drier conditions. The nodal tiller number, a character consistently associated with fodder yield in pearl millet (24), also has high heritability and is involved in differentiating subgroups within both early- and late-flowering morphotypes. From our results, traits featuring early- or late-flowering millets could be targeted when breeding dual-purpose varieties (grain yield and fodder). In summary, most of the characters that differentiate the genetic pools are involved in pearl millet performance under the agro-systems of Senegal.

The cultivated diversity of early- and late-flowering pearl millet landraces from Senegal was captured and a representative core set was defined using an effective heuristic approach. GWAS confirmed the genetic structure of the two morphotypes while identifying key phenology and phenotypic traits and SNPs on genes underlying flowering time, tillering, biomass and plant height of pearl millet. Identification of two subgroups within early-flowering morphotypes on LG 3 and 6 suggests chromosome rearrangements as a source of variation for flowering earliness features. Moreover, at least 18 genes were significantly linked with trait differences within the core set and could be targeted in breeding programs for pearl millet improvement under erratic climatic conditions.

Core collection definition and field evaluation

A total of 541 Senegalese pearl millet landraces were collected between 1992 and 2014 (9, 11, 22) from farmers with their approval and according to institutional, national, or international guidelines. SSR markers were used to genotype the germplasm, consisting of 429 early-flowering morphotypes (Souna) and 112 late-flowering morphotypes (Sanio). During 2014 and 2015, we evaluated the phenotype of 392 of these accessions in ISRA research stations. A total of 306 Souna accessions were evaluated at Bambey (N14°32'12" W16°36'41") and at Nioro (N13°45'00" W15°45'00"), while 86 Sanio accessions were evaluated at Senthiou Maleme (N13°49'01" W13°55'03") and at Kolda (N12°53'02" W14°57'05"). At each site, the experiment was a randomized complete block design with 3 replications. Each accession was grown in a single row of 8 hills. The distance between rows and between plants within a row was 90 cm. For the different trials, the following phenotypes were measured: downy mildew incidence (DM), 50% flowering time (FLO), nodal tiller number (NTN), plant height (PH), number of productive tillers (NPT), flag leaf length (FLL), flag leaf width (FLW), spike length (SL), spike thickness (ST), 1000 seed weight (SW), grain yield (GY), and panicle yield (PY) (25). To establish a core collection from this panel, an advanced maximization sampling, called heuristic, was performed based on phenotypic and genotypic data from these 392 accessions using the PowerCore v 1.0 software (26). From this heuristic approach, 91 accessions were retained, consisting of 60 early-flowering Souna and 31 late-flowering Sanio, and field-evaluated at Nioro during the 2016 and 2017 rainy seasons. The experimental design for each trial was a 7 × 13 alpha lattice with three repetitions. Each tested accession was grown in a single row of eight hills in each repetition, and the measurements were taken on three hills.
The genetic variability of the panel was assessed by testing the difference in the Nei genetic diversity index between the core collection and the entire collection with a Student t-test at α = 0.05 (27). To assess whether the core collection captured the diversity of the whole dataset, we calculated the percentage of mean difference and of variance difference (%MD, %VD), the coincidence rate (%CR), and the variable rate (%VR) according to (26). These parameters indicate how the core collection differs from the entire collection in average and in distribution, respectively. We built the core collection based on the statistical values of these parameters. Notably, the core collection is considered representative of the entire collection when no more than 20% of the traits have different means (significant at α = 0.05) between the core collection and the entire collection, and the coincidence rate (%CR) retained by the core collection is no less than 80%. Analysis of variance was performed on the different phenotypic parameters using the following model:

P = µ + G + Y + GY + R + B + 𝜺

where P is the phenotype; µ, the mean; G, the genetic effect; Y, the year effect; GY, the interaction between genotype and year; R, the replication effect; B, the incomplete block effect; and 𝜺, the residual effect. The heritability of agro-morphological characters was calculated from a mixed linear model with random effects for individuals, using the Plant Breeding Tools v 1.3 software, according to the following formula, where ${\sigma }_{G}^{2}$ is the genotypic variance, ${\sigma }_{GxY}^{2}$ the genotype-by-year variance, ${\sigma }_{\epsilon }^{2}$ the residual variance, r the number of replicates, and y the number of years:

$${h}^{2}=\frac{{\sigma }_{G}^{2}}{{\sigma }_{G}^{2}+\frac{{\sigma }_{GxY}^{2}}{y}+\frac{{\sigma }_{\epsilon }^{2}}{ry}}$$
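The heritability calculation described above can be sketched in Python. This is a minimal illustration that assumes the usual entry-mean form, in which the genotype-by-year variance is divided by the number of years and the residual variance by the number of replicates times the number of years. The variance components would come from the fitted mixed model; the values used here are hypothetical, as is the function name.

```python
def entry_mean_heritability(var_g, var_gxy, var_e, n_years, n_reps):
    """Heritability on an entry-mean basis (assumed form).

    var_g   : genotypic variance
    var_gxy : genotype-by-year interaction variance
    var_e   : residual variance
    n_years : number of years (y)
    n_reps  : number of replicates per year (r)
    """
    if n_years <= 0 or n_reps <= 0:
        raise ValueError("n_years and n_reps must be positive")
    phenotypic = var_g + var_gxy / n_years + var_e / (n_reps * n_years)
    if phenotypic == 0:
        raise ValueError("total phenotypic variance is zero")
    return var_g / phenotypic

# Hypothetical variance components; 2 years and 3 replicates as in the trial design.
h2 = entry_mean_heritability(var_g=10.0, var_gxy=2.0, var_e=6.0, n_years=2, n_reps=3)
print(round(h2, 3))  # 0.833
```

With a genotypic variance estimated at zero the function returns 0, which matches how a trait such as 1000 seed weight can show h² = 0 in Table 1.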

For each trait, the adjusted mean from the 2016 and 2017 trials of each individual was calculated with fixed-effects for individuals, using the Plant Breeding Tools v 1.3 software and considered as the value of the individual for the trait.

A principal component analysis was performed on the adjusted means of each individual for each phenotypic trait, centered and scaled, using the adegenet v2.1.1 package (28) in R v3.5.1 (29). A discriminant analysis (DA) between early- and late-flowering millets was then performed. Characters that were significant with a P-value < 0.001 and that showed a high correlation (r ≥ 0.7) with the axis of differentiation were identified as discriminating characters between early- and late-flowering millets. This analysis was performed using the XLStat 2014 software (http://www.xlstat.com), and the distribution of discriminating agro-morphological characters was plotted using R v3.5.1 (29).

DNA extraction, libraries construction, and sequencing

Genotyping-by-sequencing (GBS) was performed on genomic DNA extracted as previously described (30) from a single plant, sampled at the 5-leaf stage, of each of the 91 accessions grown in 2016 at Nioro. The DNA was of good quality, with 260/280 and 260/230 ratios between 1.8 and 2. Extracted DNA was stored in a Tris-HCl solution and sent for sequencing to the Next Generation Sequencing Platform of the CHU Research Center, University of Laval, Quebec. The libraries were generated in 2 multiplexes of 45 and 46 samples. PstI-MspI double digestion was applied, and adapters were ligated to each sample, followed by pooling and amplification. The libraries were sequenced on an Illumina HiSeq2500.

SNPs calling, filtering, and data analysis

Read quality was evaluated using FastQC v0.72 and MultiQC v1.6, and the sequences were cleaned with FastQ Trimmer v1.0.0. Only sequences with an average quality (Q) ≥ 30 (Sanger format) were retained, and the first 7 bases (5' side) of each read were removed. The sequences were aligned to the pearl millet reference genome (GenBank accession number GCA_002174835.2) using BWA v1.2.3, then realigned around insertions and deletions using RealignerTargetCreator v0.0.4 and IndelAligner v0.0.6. The resulting BAM files were merged using the MergeBAM v1.2.0 tool, and SNPs were called using UnifiedGenotyper v0.0.6. A total of 545,834 variants were called, including 502,382 SNPs. Filtering was first done on mapping quality and depth, by hard filtering with the Variant filtration v0.0.5 tool (MQ ≥ 40 and MQ divided by the depth of unfiltered samples greater than 0.1). A second filter retained markers with a minor allele frequency (MAF) greater than 0.05 and allowed a maximum proportion of missing data of 0.05 for markers and 0.1 for individuals, using Plink v1.9. Multi-allelic markers were then removed using Tassel v5.2.48. The final output file contained 21,663 SNP markers and 78 individuals. Subsequent analyses were performed using this dataset. All bioinformatics analyses were carried out on the Galaxy v18.0.5 platform (31), implemented in the Bio-Linux 8 operating system (32).
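The logic of the second filter (MAF and missing data) can be sketched in plain Python. This is not the Plink command used in the study. The function name, the 0/1/2 genotype coding with `None` for missing calls, and the order in which individuals and markers are filtered are all assumptions made for illustration.

```python
def filter_genotypes(genotypes, maf_min=0.05,
                     max_missing_marker=0.05, max_missing_individual=0.10):
    """Return (kept_marker_indices, kept_individual_indices).

    genotypes: list of rows, one row per SNP marker, one column per
    individual; each call is 0, 1 or 2 copies of the alternate allele,
    or None when missing (hypothetical coding).
    """
    n_markers = len(genotypes)
    n_individuals = len(genotypes[0]) if genotypes else 0

    # Step 1 (assumed order): drop individuals with too many missing calls.
    kept_individuals = []
    for j in range(n_individuals):
        missing = sum(1 for row in genotypes if row[j] is None)
        if missing / n_markers <= max_missing_individual:
            kept_individuals.append(j)

    # Step 2: among retained individuals, drop markers with too many
    # missing calls or with a minor allele frequency not above maf_min.
    kept_markers = []
    for i, row in enumerate(genotypes):
        calls = [row[j] for j in kept_individuals]
        observed = [g for g in calls if g is not None]
        if not observed:
            continue
        missing_rate = 1 - len(observed) / len(calls)
        if missing_rate > max_missing_marker:
            continue
        alt_freq = sum(observed) / (2 * len(observed))
        maf = min(alt_freq, 1 - alt_freq)
        if maf > maf_min:
            kept_markers.append(i)

    return kept_markers, kept_individuals


# Toy data: 3 markers x 4 individuals.
toy = [
    [0, 1, 2, 1],     # polymorphic, fully called
    [0, 0, 0, 0],     # monomorphic, fails the MAF filter
    [2, 1, 0, None],  # last individual has a missing call
]
markers, individuals = filter_genotypes(toy)
```

In the toy data the last individual is dropped for missing data and the monomorphic marker is dropped for its MAF, which mirrors on a small scale how the panel was reduced to 21,663 SNPs and 78 individuals.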

Genetic structure

The genetic structure was evaluated using the sNMF algorithm, through the LEA v2.2.0 package implemented in R. For this analysis, we tested numbers of ancestral populations ranging from K = 1 to K = 10, with 10 repetitions for each K value. Discriminant Analysis of Principal Components (DAPC) was also performed with the adegenet v2.1.1 package (28). The number of principal components (PCs) retained for the DAPC was chosen by cross-validation, with the data set subdivided into a training set of 90% and a validation set of 10% (33). A first test comprising 30 repetitions was performed to preselect a limited range of PC numbers on which to refine the search. A second test comprising 1000 repetitions was then carried out on this range to retain the number of PCs giving the highest proportion of correct predictions with the lowest error.

GWAS

Association analyses were conducted with a mixed linear model (MLM) correcting for population structure and kinship, using the Tassel v5.2.48 software. Q-Q and Manhattan plots illustrating the GWAS results were produced using the qqman package v0.1.4 in R (34). The significance threshold (α) for the association of SNP markers with the different traits was calculated using the Bonferroni correction (35). SNP markers significantly associated with agro-morphological traits were located within pearl millet genome intervals using the valr package v0.5.0 (36) in R. The pearl millet genome annotation was then used to identify the genes in these intervals.
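The Bonferroni threshold can be illustrated with a short Python sketch. The family-wise error rate of 0.05 is an assumption; it is consistent with the threshold of 2.3E-06 reported for the 21,663 SNPs tested. The function names and the example P-values are hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def significant_snps(p_values, n_tests, alpha=0.05):
    """Keep the SNPs whose P-value is below the Bonferroni threshold."""
    threshold = bonferroni_threshold(n_tests, alpha)
    return {snp: p for snp, p in p_values.items() if p < threshold}


N_SNPS = 21663  # number of SNPs retained after filtering
threshold = bonferroni_threshold(N_SNPS)
print(f"{threshold:.2e}")  # 2.31e-06

# Hypothetical P-values for two markers.
hits = significant_snps({"snp_a": 1e-7, "snp_b": 1e-3}, N_SNPS)
```

Dividing α by the number of tests controls the probability of at least one false positive across all SNPs, at the cost of being conservative when markers are in linkage disequilibrium.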

Abbreviations

GWAS: Genome-Wide Association Study; Chr: Chromosome; LG: Linkage group; SNP: Single Nucleotide Polymorphism; SSR: Simple Sequence Repeat; GBS: Genotyping-By-Sequencing; Mb: Megabases; Gb: Gigabases; QTL: Quantitative Trait Locus; DAPC: Discriminant Analysis of Principal Components; PCs: Principal Components; PCA: Principal Component Analysis; MAF: Minor allele frequency; NJ: Neighbor-joining.

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

All the raw sequencing reads for all the accessions are available in additional files. The SNPs generated in this study are included as additional files.

Funding

We are thankful to the funders who supported these activities: the West Africa Agricultural Productivity Program (51350SE), under which plant materials were collected from 2012 to 2014; the AMMA2050 project (NE/M020126/I), for field evaluation and genotyping of the material; and KAFACI (2017146), for the management of genetic resources and data.

Authors' contributions

NAK, YV and AF designed the study. MCG, LZ, OS, AF and OD collected accessions. OD, GK, OS, HT and AF conducted field experiments. OD performed DNA extraction, bioinformatic and statistical analyses. OD, CBS, YV and NAK discussed the GWAS data. OD, DD, DDS and NAK drafted the MS. All authors contributed to and approved the final version of the MS.

Acknowledgments

We recognize the contribution of farmers in sharing seeds of the accessions used in this study. We are grateful for the outstanding contribution of our colleague Amadou Fofana to the pearl millet program since the late 1970s and wish him the best for his retirement.

Author information

NAK is a plant geneticist and researcher at the Senegalese Institute for Agricultural Research (ISRA), Director of the Regional Center for the Improvement of Adaptation to Drought (CERAAS), and Co-director of the international mixed laboratory for the adaptation of plants and associated microorganisms to environmental stresses. His research focuses on identifying genetic traits governing crop performance in dry environments and on exploiting genetic diversity for breeding, in anticipation of climate change and to meet the needs and health of a growing population.

References

Krishnan R, Meera MS. Pearl millet minerals: effect of processing on bioaccessibility. J Food Sci Technol [Internet]. 2018 Sep;55(9):3362–72. Available from: https://pubmed.ncbi.nlm.nih.gov/30150794.
Kane NA, Berthouly-Salazar C. Population Genomics of Pearl Millet. In: Rajora OP, editor. Population Genomics: Crop Plants. Switzerland: Springer Nature AG; 2020.
Burgarella C, Cubry P, Kane NA, Varshney RK, Mariac C, Liu X, et al. A western Sahara centre of domestication inferred from pearl millet genomes. Nat Ecol Evol. 2018;2(9).
Burgarella C, Barnaud A, Kane NA, Jankowski F, Scarcelli N, Billot C, et al. Adaptive Introgression: An Untapped Evolutionary Mechanism for Crop Adaptation. Front Plant Sci. 2019.
Haussmann BIG, Boureima SS, Kassari IA, Moumouni KH, Boubacar A. Mechanisms of adaptation to climate variability in West African pearl millet landraces: a preliminary assessment. 2007.
Saidou AA, Mariac C, Luong V, Pham JL, Bezancon G, Vigouroux Y. Association studies identify natural variation at PHYC linked to flowering time and morphological variation in pearl millet. Genetics [Internet]. 2009;182. Available from: https://doi.org/10.1534/genetics.109.102756.
Vigouroux Y, Mariac C, de Mita S, Pham JL, Gérard B, Kapran I, et al. Selection for earlier flowering crop associated with climatic variations in the Sahel. PLoS One. 2011;6.
Diack O, Kane NA, Berthouly-Salazar C, Gueye MC, Diop BM, Fofana A, et al. New genetic insights into pearl millet diversity as revealed by characterization of early- and late-flowering landraces from Senegal. Front Plant Sci. 2017;8.
Ousmane SY, Fofana AT, Cissé N, Noba K, Diouf D, Ndoye I, et al. Étude de la variabilité agromorphologique de la collection nationale de mils locaux du Sénégal. In 2015.
Tostain S. Isozymic classification of pearl-millet (Pennisetum glaucum, poaceae) landraces from Niger (West-Africa). Plant Syst Evol [Internet]. 1994;193. Available from: https://doi.org/10.1007/BF00983542.
Hu Z, Mbacké B, Perumal R, Guèye MC, Sy O, Bouchet S, et al. Population genomics of pearl millet (Pennisetum glaucum (L.) R. Br.): Comparative analysis of global accessions and Senegalese landraces. BMC Genomics [Internet]. 2015;16(1):1048. Available from: http://www.biomedcentral.com/1471-2164/16/1048.
Frankel O, Brown A. Plant genetic resources today: a critical appraisal. In: Crop genetic resources: conservation and evaluation. 1984. p. 249–57.
Varshney RK, Shi C, Thudi M, Mariac C, Wallace J, Qi P, et al. Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments. Nat Biotechnol. 2017.
Bordes J, Branlard G, Oury FX, Charmet G, Balfourier F. Agronomic characteristics, grain quality and flour rheology of 372 bread wheats in a worldwide core collection. Journal of Cereal Science. 2008.
Yan W, Rutger JN, Bryant RJ, Bockelman HE, Fjellstrom RG, Chen M-H, et al. Development and Evaluation of a Core Subset of the USDA Rice Germplasm Collection. Crop Sci [Internet]. 2007 Mar 1;47(2):869–76. Available from: https://doi.org/10.2135/cropsci2006.07.0444.
van Hintum TJL, Brown AHD, Spillane C, Hodgkin T. Core collections of plant genetic resources. 2000.
Burgarella C, Barnaud A, Kane NA, Jankowski F, Scarcelli N, Billot C, et al. Adaptive introgression: an untapped evolutionary mechanism for crop adaptation. 2018.
Stewart NB, Rogers RL. Chromosomal rearrangements as a source of new gene formation in Drosophila yakuba. PLoS Genet. 2019.
Abdelgawwad M. Analysis of DNA Damage-Binding Proteins (DDBs) in Arabidopsis thaliana and their Protection of the Plant from UV Radiation. Curr Proteomics. 2016 Nov 28;14.
Berens M, Berry H, Mine A, Argueso C, Tsuda K. Evolution of Hormone Signaling Networks in Plant Defense. Annu Rev Phytopathol. 2017 Jun 23;55.
Manna S. An overview of pentatricopeptide repeat proteins and their applications. Biochimie. 2015 Apr 14;113:93–9.
Diack O, Kane NA, Berthouly-Salazar C, Gueye MC, Diop BM, Fofana A, et al. New Genetic Insights into Pearl Millet Diversity As Revealed by Characterization of Early- and Late-Flowering Landraces from Senegal. Front Plant Sci [Internet]. 2017;8(May):1–9. Available from: http://journal.frontiersin.org/article/10.3389/fpls.2017.00818/full.
Zoclanclounon YAB, Kanfany G, Kane A, Fonceka D, Ehemba GL, Ly F. Current Status of Pearl Millet Downy Mildew Prevalence across Agroecological Zones of Senegal. Sci World J. 2019.
Kumar A, Arya R, Kumar S, Kumar D, Kumar S, Panchta R. Advances in pearl millet fodder yield and quality improvement through breeding and management practices. Forage Res. 2015.
IBPGR ICRISAT. Descriptors for pearl millet [Pennisetum glaucum (L.) R. Br.]. 1993.
Kim K-W, Chung H-K, Cho G-T, Ma K-H, Chandrabalan D, Gwag J-G, et al. PowerCore: A program applying the advanced M strategy with a heuristic search for establishing core sets. Bioinformatics. 2007 Sep 1;23:2155–62.
Hu J-J, Zhu J, Xu H. Methods of constructing core collections by stepwise clustering with three sampling strategies based on genotypic values of crops. Theor Appl Genet. 2000 Jul 1;101:264–8.
Jombart T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics. 2008 Jul 1;24:1403–5.
R Development Core Team. R: A Language and Environment for Statistical Computing [Internet]. R Foundation for Statistical Computing; 2011. Available from: http://www.r-project.org.
Mariac C, Luong V, Kapran I, Mamadou A, Sagnard F, Deu M, et al. Diversity of wild and cultivated pearl millet accessions (Pennisetum glaucum [L.] R. Br.) in Niger assessed by microsatellite markers. Theor Appl Genet. 2006;114:49–58.
Afgan E, Baker D, Batut B, Beek M, Bouvier D, Cech M, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res. 2018 May 22;46.
Field D, Tiwari B, Booth T, Houten S, Swan D, Bertrand N, et al. Open Software for Biologists: from famine to feast. Nat Biotechnol. 2006;24:801–3.
Jombart T, Collins C. A tutorial for Discriminant Analysis of Principal Components (DAPC) using adegenet 2.0-0. 2015.
Turner S. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. J Open Source Softw. 2018 May 19;3:731.
Haynes W. Bonferroni Correction. In: Encyclopedia of Systems Biology. 2013.
Riemondy KA, Sheridan RM, Gillen A, Yu Y, Bennett CG, Hesselberth JR. valr: Reproducible genome interval analysis in R. F1000Research [Internet]. 2017 Jun 29;6:1025. Available from: https://pubmed.ncbi.nlm.nih.gov/28751969.

Supplementarymaterial.docx

Editorial decision: Major revision
13 Jul, 2020
Review #2 received at journal
10 Jul, 2020
Reviewer #2 agreed at journal
29 Jun, 2020
Review #1 received at journal
19 Jun, 2020
Reviewer #1 agreed at journal
02 Jun, 2020
Reviewers invited by journal
19 May, 2020
Editor assigned by journal
29 Apr, 2020
First submitted to journal
28 Apr, 2020
Submission checks completed at journal
28 Apr, 2020
Editor invited by journal
28 Apr, 2020
