AFLP-based assessment of genetic variation in Indian elite cultivars of Cymbopogon species

Cymbopogon, one of the major aromatic genera of the family Poaceae, is mainly cultivated for its essential oils, which hold promising medicinal and commercial value. In India, several improved cultivars of Cymbopogon species have been developed for commercial cultivation and are recognized by means of agronomic and chemotypic traits. Information on genetic variability in these commercial cultivars is still limited, which poses a problem for further improvement of oil grasses. The present study aimed to investigate the genetic variation among certain improved Indian cultivars of three commercially important Cymbopogon species using amplified fragment length polymorphism (AFLP) markers. A total of 94 robust loci were amplified using 8 AFLP primer combinations. Among the total amplified fragments, 72 were polymorphic, corresponding to 76.60% polymorphism across all the germplasm. Polymorphic information content (PIC) and marker index (MI) values showed a significant positive correlation (r² = 0.94, p < 0.005) in determining the discriminatory power of the AFLP primer combinations. Higher effective multiplex ratio (EMR) and MI values were recorded for the primer combinations that generated higher polymorphism across the genotypes. To obtain the genetic information at the intra- and inter species level,


Introduction
The genus Cymbopogon belongs to the family Poaceae, and its members are commonly known as aromatic grasses. Odoriferous species of this genus contain aromatic essential oils comprising mixtures of cyclic and acyclic monoterpenes. Essential oil constituents have immense impact in the medicinal, perfumery and beverage industries. Although more than 50 species of the genus have been reported from India [1], only three Cymbopogon species, viz., C. flexuosus (lemongrass), C. winterianus (citronella) and C. martinii (palmarosa), are cultivated at the commercial scale and have emerged as modern cash crops. Depending on the dominant essential oil constituents, Cymbopogon species are classified into three chemotaxonomic series, viz., Schoenanthi, Rusae and Citrati [2]. In India, several improved cultivars/clones from the same or different species have been developed with emphasis on the quality and quantity of the essential oil. Morphological variation, oil characteristics and herbage yield were taken into consideration for identification and classification of these cultivars. Nevertheless, these parameters are highly influenced by the environment and are not sufficient to define the relationships among the morphotypes and chemotypes. The heterozygous nature of the plants due to high cross-pollination, introgression of various traits through natural hybridization and sporadic mutations, and final selection through human intervention lead to the development of morphological or chemotypic intermediates. These constraints seriously hamper the genetic improvement of this aromatic grass through conventional breeding. Hence, there is an urgent need to develop a reliable molecular fingerprinting system that can efficiently uncover the genetic variation partitioned within and among genotypes, especially for the popular cultivars, and to use the resulting molecular signatures in plant improvement and breeding programmes.
With the advent of RFLP (restriction fragment length polymorphism) markers, DNA-based molecular markers have become the choice of most scientists for the estimation of genetic heterogeneity and interrelationships in germplasm. The introduction of AFLP (amplified fragment length polymorphism) markers by Zabeau and Vos (1993) [3] has enriched the genotyping techniques, and AFLP is considered to be much more effective than other conventional markers such as RFLP, RAPD (random amplified polymorphic DNA) and microsatellites. High reproducibility, genome-wide coverage, no need for prior sequence knowledge and amenability to semi-automated genotyping are some attractive features of the AFLP technique that broaden its applicability in studying population structure, QTL (quantitative trait loci) mapping, cultivar identification, gene tagging and isolation [4]. So far, single primer amplification reaction (SPAR) based approaches have been extensively applied for the detection of ancestors and cultivars [5][6][7][8], identification of somaclonal variants [9], analysis of population structure [10] and assessment of genomic diversity at both the inter- and intra-specific level [11][12][13][14] in Cymbopogon. Information on the effectiveness of AFLP markers in delineating the molecular diversity of Cymbopogon germplasm is still limited [15]. Considering the lack of genomic information on the genus and the potential of the AFLP marker system, a maiden attempt was made to assess the efficiency and fidelity of AFLP markers for surveying Cymbopogon genetic diversity using certain improved/selected Indian elite cultivars. The molecular passport data generated in the present investigation would help to initiate an effective genetic conservation programme and assist targeted breeding for improvement of this aromatic grass.

Results and discussion
Cymbopogon aromatic grasses are cultivated as highly remunerative alternative cash crops whose essential oils strengthen the export economies of many developing agrarian nations. Despite considerable disparities among phenotypic traits at the inter-species level, circumscription of individual species is largely based on morphology [2]. Morphological differences often become blurred when considered at the intra-species level. Poor characterization and exploitation of the available germplasm limit the genetic resources that support cultivar development in these non-model grasses, whereas ample knowledge of genetic resources and saturated linkage maps has been gathered for related model grass species. Prior information on the existing genetic variation within the available genetic resources is essential for initiating a genetic improvement programme in any plant species. To our knowledge, this is the first attempt at dissecting the genetic variation of elite Indian cultivars of commercially cultivated species of Cymbopogon with AFLP markers.

Detection of polymorphisms and evaluation of AFLP markers
The efficacy of the AFLP markers in discerning the genetic variation among the studied germplasm was evaluated by monitoring two aspects, i.e., (a) marker informativeness (frequency of polymorphism) and (b) marker performance (discriminatory power). Eight selective AFLP primer combinations amplified a total of 94 reproducible loci, with an average of 11.75 loci per primer set (Table 1). Representative gel pictures depicting the banding patterns produced by the AFLP primers are shown in Fig. 1 (A, B). The size of the amplified fragments ranged from 100 bp to 3500 bp (Supplementary Table 1). Out of 94 loci, 72 (76.60%) were polymorphic for the entire data set. Based on all the amplicons, a reliable AFLP fingerprinting key for each cultivar is represented in the form of a barcode diagram in Fig. 2. The maximum number of polymorphic loci (11) was obtained with the primer combinations E-ACC /M-CAA and E-AGC /M-CTA, with an average of 9 polymorphic loci per primer set. Polymorphism partitioning within the individual groups of cultivars differed substantially and was approximately 66.29% (59/89) in palmarosa, 16.90% (12/71) in citronella and 54.67% (41/75) in lemongrass. Although the total number of loci obtained exceeded that reported in our previous studies examining genetic diversity within the same set of germplasm using RAPD [13] and ISSR [7] markers, the overall polymorphism percentage was lower. On the other hand, the level of polymorphism is comparable with that of studies considering other representatives of the Cymbopogon species [8]. These results indicate that the frequency of amplified loci and their polymorphism index depend on the number of markers applied, marker distribution in the genome and, most importantly, the genetic background of the studied plant population. Across all genotypes, frequencies of polymorphic loci for a given AFLP primer set ranged from 0.1 to 0.9.
The frequency class 0.4-0.5 contained the maximum number (20) of polymorphic loci, accounting for 27.78% of the total polymorphism (Fig. 3A). Interestingly, when PIC values were correlated with the frequency of polymorphism of individual fragments, it was noticed that a large portion (∼60%) of the polymorphic loci are highly informative (average PIC value 0.50) (Fig. 3B). The maximum PIC value for any dominant biallelic marker such as AFLP is 0.5 [16]. Considering this, polymorphic loci amplified in the frequency range of 0.3-0.6 appeared to be the most informative (PIC>0.48), followed by the classes 0.2-0.3 and 0.6-0.7 (Fig. 3C). Our observation is not significantly different from the outcomes of genetic diversity analyses conducted in other crops [17,18]. Based on the analysis, we propose that fragments occurring in 40-60% of accessions with PIC values > 0.48 could be used for diversity analysis when a particular primer set generates a large number of scorable fragments. We averaged the PIC values of the individual loci of a primer combination to calculate the corresponding PIC value of the primer set. Across the AFLP primer sets, the E-ACC /M-CAA primer combination, with the highest PIC value (0.412), is suggested for germplasm characterization of Cymbopogon sp. (Table 2). Marker index (MI) has been used as an important parameter in selecting informative markers for diversity studies in several crops [19,20]. The highest MI value (4.53) was scored with the primer pair E-ACC /M-CAA (Table 2). In the present investigation, the existence of a positive correlation (r² = 0.94, p < 0.005) between PIC and MI values strongly indicates that either of the two parameters could be used to screen for informative primer combinations. Resolving power (RP) describes the efficacy of a primer in distinguishing a specific genotype unequivocally in diversity/marker studies [21,22].
The RP of the primers varied from 4.0 (E-AGG /M-CAT) to 8.2 (E-ACC /M-CTC) with an average of 6.3, suggesting that E-ACC /M-CTC should be the most useful primer combination for discriminating the cultivars. However, positive correlation of RP with PIC and MI value was found to be insignificant. This lack of consistency in the correlation implies that all these indices are more or less equally important in evaluating the efficacy of whole set of AFLP primer combinations, at least in this study. Absence of clear relationship between the informativeness of AFLP primers and evaluating parameters (PIC, MI, and RP) was also observed in other plant species [23] .
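The marker indices discussed above can be illustrated with a short Python sketch. The formulas used here are the ones commonly applied to dominant markers and are assumptions, since the text does not spell them out: PIC per locus is taken as 2f(1 - f), with f the band frequency (maximum 0.5 at f = 0.5); the primer-level PIC is averaged over polymorphic loci only; EMR is the number of polymorphic loci multiplied by the polymorphic fraction; MI = PIC × EMR; and RP is the sum of band informativeness 1 - 2|0.5 - f| over all loci. The function name `primer_indices` and the toy matrix are hypothetical.

```python
def primer_indices(bands):
    """Compute PIC, EMR, MI and RP for one primer combination.

    bands: list of loci, each a list of 0/1 scores (one per genotype).
    All formulas are assumed conventions for dominant markers,
    not necessarily the exact ones used by the authors.
    """
    n_total = len(bands)
    # Band frequency of each locus across genotypes
    freqs = [sum(row) / len(row) for row in bands]
    # A locus is polymorphic if the band is neither fixed nor absent
    poly = [f for f in freqs if 0 < f < 1]
    n_poly = len(poly)
    # Assumed PIC for a dominant biallelic locus: 2f(1 - f)
    pic = sum(2 * f * (1 - f) for f in poly) / n_poly if n_poly else 0.0
    # Effective multiplex ratio: polymorphic loci x polymorphic fraction
    emr = n_poly * n_poly / n_total
    # Marker index combines informativeness and multiplex ratio
    mi = pic * emr
    # Resolving power: sum of band informativeness over all loci
    rp = sum(1 - 2 * abs(0.5 - f) for f in freqs)
    return {"PIC": pic, "EMR": emr, "MI": mi, "RP": rp}


# Hypothetical toy data: 3 loci scored in 4 genotypes
toy = [[1, 0, 1, 0], [1, 1, 1, 1], [1, 0, 0, 0]]
print(primer_indices(toy))
```

Under these assumed formulas a primer set whose loci are all polymorphic has an EMR equal to its number of loci, so MI then scales directly with the average PIC.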

Assessment of genetic variability and scoring of diagnostic markers
AFLP banding patterns uncovered more allelic variability per locus in palmarosa (1.6277) than in lemongrass (1.4362) and citronella (1.1277) (Table 3). Besides measuring allelic richness, Shannon's information index (I) and Nei's gene diversity (h) were calculated, as they serve as important benchmarks in determining the genetic structure of a plant species. The estimated gene diversity (h) differed at the intra-specific level and was much lower in citronella (0.0529) than in lemongrass (0.1948) and palmarosa (0.2580). Shannon's information index (I) followed the same order of genetic heterogeneity (palmarosa > lemongrass > citronella). Across all the genotypes, the values of the genetic diversity parameters computed for AFLP markers are comparable to previous reports on Cymbopogon species [7,10,13,14]. Besides calculating the genetic variability parameters and diversity indices, the development of molecular descriptors capable of discriminating the genotypes at the intra- and inter-specific level, ideally as stand-alone markers, would be the most promising way to evaluate the suitability of any marker system in a diversity analysis. AFLP fingerprints discriminated only two cultivars of palmarosa by the presence of a single amplicon, i.e., E-ACA /M-CAT(900bp) for Tripta and E-AGG /M-CAT(850bp) for Dhanwantari Acc 01, which were absent in the rest of the cultivars. However, no such promising marker was obtained for the lemongrass and citronella cultivars. At the species level, lemongrass generated three unique fragments, E-ACC /M-CAA(800, 900bp) and E-AGC /M-CTA(350bp), whereas palmarosa was distinguished by one marker (E-AGC /M-CTA(1000bp)). Besides amplifying species-specific markers, the E-AGC /M-CTA primer combination facilitated molecular distinction of the 'Citrati' series by two unique amplicons (E-AGC /M-CTA(100, 900bp)).
These cultivar- and biotype-specific AFLP markers could be converted to locus-specific polymorphic sequence-tagged-site (STS) markers for high-throughput fingerprinting of genotypes, breeding programmes and marker-assisted selection (MAS).
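As an illustration of the diversity parameters reported above (na, ne, h and I), the following Python sketch computes them for a single dominant locus. It assumes the convention often used for dominant markers, in which the null-allele frequency q is estimated as the square root of the band-absent fraction under Hardy-Weinberg equilibrium; whether this matches the exact POPGENE settings used in the study is not stated in the text, so the sketch is illustrative only. The function name `locus_diversity` is hypothetical.

```python
import math


def locus_diversity(band_presence):
    """Diversity parameters for one dominant locus.

    band_presence: list of 0/1 scores across the individuals of a group.
    Assumes Hardy-Weinberg equilibrium to estimate the null-allele
    frequency q from the fraction of band-absent individuals.
    """
    n = len(band_presence)
    absent = n - sum(band_presence)
    q = math.sqrt(absent / n)  # estimated null (band-absent) allele frequency
    p = 1 - q                  # estimated band-present allele frequency
    na = 2 if 0 < q < 1 else 1               # observed number of alleles
    ne = 1 / (p * p + q * q)                 # effective number of alleles
    h = 1 - p * p - q * q                    # Nei's gene diversity
    # Shannon's information index, skipping zero-frequency alleles
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0)
    return na, ne, h, shannon


# Hypothetical locus: band present in 3 of 4 individuals
print(locus_diversity([1, 1, 1, 0]))
```

Group-level values such as those in Table 3 would then be obtained by averaging each parameter over all scored loci.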

Genetic relationships and PCA analysis
Genetic relationships between the cultivars of the three oil trade types of Cymbopogon (palmarosa, lemongrass and citronella) were established based on pairwise similarity estimates calculated from Jaccard's coefficient. Jaccard's similarity coefficient ranged from 0.407 to 0.831 (Supplementary Table 2). The genetic similarity values were higher than those reported by previous studies on the basis of phytochemicals (3.12 to 75%) in these species [24]. This discrepancy is probably because the genomic regions targeted by the AFLP markers do not directly influence the biosynthesis of phytochemicals. In contrast, the observed percentage of similarity was lower than in previous reports applying functional markers, i.e., EST-SSR (64 to 87%), in discerning the genetic variation of Cymbopogon germplasm [25]. This might be due to the higher sequence conservation of the functional regions of the genome. The UPGMA-based dendrogram constructed from the genetic similarity data of the AFLP markers clearly separated the cultivars into two major clusters, i.e., A and B (Fig. 4). Cluster A included all three cultivars of lemongrass (OD-19, Pragati and Nima). Cluster B consisted of three sub-clusters (BI, BII and BIII) containing the cultivars of both the 'Citrati' and 'Rusae' series. The largest sub-cluster was BI, which included four genotypes of palmarosa, i.e., Dhanwantari Acc-01, Dhanwantari Acc-02, Trishna and PRC-1. Sub-cluster BII included the cultivars of citronella, i.e., Manjusha and CIM Jeeva. BIII included one cultivar of palmarosa, i.e., Tripta, and is regarded as a single-membered cluster. For further differentiation of the cultivars, principal coordinate analysis (PCA) was carried out (Supplementary Fig. 1A,B). Overall, the 2-D and 3-D representations of the PCA results supported the cluster analysis. The maximum value of Jaccard's similarity coefficient (0.831) was noticed among the citronella cultivars.
Similar to our results, the citronella cultivars CIM Jeeva and Manjusha displayed maximum genetic similarity after molecular characterization of Cymbopogon accessions through RAPD and ISSR markers [26]. The lower degree of variation in citronella can be explained by its vegetative propagation. Further, rare and scanty flowering followed by the absence of viable seed setting in citronella often limits outbreeding-mediated enhancement of heterozygosity. In comparison with other Cymbopogon species, citronella also displays less chemotypic diversity [27]. The tendency of lemongrass cultivars to stay within a single cluster is understandable. Maximum homogeneity (69.6%) was observed between OD-19 and Pragati. This indicates the consistency of the AFLP marker system, since Pragati was clonally developed from OD-19 and both are enriched with citral. Nima has been developed as a distinct variety from the open-pollinated offspring of OD-19. Although Nima has the highest citral content (85 to 90%) among the lemongrass genotypes and is stable for commercial cultivation, emphasis has not yet been given to its molecular characterization. Unlike our previous studies, the AFLP-based phenogram distinguished the lemongrass cultivars from the others belonging to the Citrati series. An anomaly in cluster differentiation was observed in the placement of the most genetically diverged palmarosa cultivars belonging to the 'Rusae' series. Sub-cluster BI grouped all the palmarosa cultivars except Tripta. Maximum genetic homogeneity (77.6%) was observed between Trishna and PRC-1. Despite the common origin and the open-pollinated improved bulk or composite nature of these three cultivars [28,29], Tripta was carved out from the cluster and positioned separately. This might be ascribed to large fragments that were present in Trishna and PRC-1 but not scored in Tripta. Common origin and a shared gene pool grouped Dhanwantari Acc 01 and Acc 02 together.
Sangwan et al. (2003) [30] found similar discrepancies in cluster analysis when they examined the palmarosa cultivars through profiling of genomic (RAPD) and expressed (isozyme and protein) markers. The higher genetic diversity observed in palmarosa in comparison to lemongrass and citronella appears to be meaningful, as palmarosa displays enormous chemotypic diversity (mostly quantitative) in its cultivated and wild forms [27,29,31].
In conclusion, the present study on the AFLP-based genetic characterization of Indian commercial/improved Cymbopogon cultivars has provided convenient, efficient and reliable fingerprinting keys that could be used for the exploitation of genetic resources, the management of genebanks and their sustainable commercial utilization. Besides the conventional breeding efforts applied to the improvement of the cultivars of several species of Cymbopogon, systematic molecular characterization of the parents and gene introgression from the wild counterparts should be emphasized to minimize the imbalance of recessive alleles in the heterozygous state. Diagnostic AFLP markers developed at the intra- and inter-species level might contain important gene sequences for improved agronomic traits, which in combination with phenotypic and chemotypic attributes could define promising germplasm for the development or improvement of cultivars of these aromatic grasses.

Supplementary Table 3. This study complies with the local and national regulations. Genomic DNA extraction from the young leaves of each cultivar was carried out following our established protocol [32]. DNA isolation from the same line of germplasm was repeated three times. The quality of the extracted DNA was checked on a 1% agarose gel and the quantity was estimated using a NanoDrop spectrophotometer (Thermo Scientific).

AFLP analysis
AFLP fingerprinting was performed as described by Vos et al. [33] with minor modifications. Briefly, 300 ng of genomic DNA was double-digested with the EcoRI and Tru9I (isoschizomer of MseI) restriction enzymes (Fermentas) at 37°C for 3 h, followed by enzyme inactivation at 70°C for 15 min. Ligation of the EcoRI and MseI adapters to both ends of the digested genomic fragments was carried out using T4 DNA ligase (Fermentas) overnight at 15°C. Adapter-ligated DNA fragments were used as the template for PCR amplification. Pre-amplification was performed in a thermal cycler (Applied Biosystems Veriti, USA) using EcoRI+A and MseI+C primers. Diluted (1:9) pre-amplified DNA samples were then used for selective amplification with eight primer combinations. All the adapter and primer sequences are given in Supplementary Table 4. Amplified products were separated on a 2.5% agarose gel followed by ethidium bromide staining, and images were captured using a Gel Documentation System (Biorad, USA). All experiments were repeated at least three times to ensure reproducibility and consistency.

Data scoring and analysis
AFLP banding patterns generated by the eight primer combinations were scored manually, with homologous fragments assigned '1' for presence and '0' for absence. (1) Polymorphic information content (PIC) [34], (2) effective multiplex ratio (EMR), (3) marker index (MI) [17] and (4) resolving power (RP) [21] were calculated to compute the discriminatory power of all the AFLP primer combinations. Genetic diversity estimates at the intra- and inter-species level were obtained with POPGENE version 1.32 [35] by considering the parameters (1) observed number of alleles (na), (2) effective number of alleles (ne), (3) Nei's diversity index (h) [36] and (4) Shannon's information index (I) [37]. Jaccard's coefficient [38] values were used for generating the pairwise similarity matrix and the UPGMA dendrogram with the aid of NTSYS-pc (Numerical Taxonomy System version 2.1) [39]. For clear differentiation of the clusters, principal coordinate analysis (PCA) was done with the NTSYS software.
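The similarity and clustering steps described above can be sketched in plain Python. This is a minimal illustration rather than a reproduction of the NTSYS-pc workflow: the function names `jaccard` and `upgma` and the toy profiles (cv1, cv2, cv3) are hypothetical, and the clustering simply merges, at each step, the two clusters with the highest average pairwise Jaccard similarity, which is the UPGMA criterion.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles (shared absences ignored)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def upgma(profiles):
    """Average-linkage (UPGMA) clustering on Jaccard similarity.

    profiles: dict mapping a cultivar name to its 0/1 band profile.
    Returns a list of merges as (cluster_1, cluster_2, similarity).
    """
    # Pairwise similarity between all original cultivars
    sim = {frozenset(pair): jaccard(profiles[pair[0]], profiles[pair[1]])
           for pair in combinations(profiles, 2)}

    def avg_sim(c1, c2):
        # Unweighted mean over all member pairs of the two clusters
        total = sum(sim[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the most similar pair of clusters
        c1, c2 = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        merges.append((c1, c2, avg_sim(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band profiles for three cultivars
toy = {"cv1": [1, 1, 1, 0, 0], "cv2": [1, 1, 0, 0, 0], "cv3": [0, 0, 1, 1, 1]}
for left, right, s in upgma(toy):
    print(left, right, round(s, 3))
```

The merge list can be read bottom-up as a dendrogram: early merges with high similarity correspond to tight groups such as clonally related cultivars, while late merges join the major clusters.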
The authors declare no competing interests.

Table 1 Primer-wise score of PCR amplification products in the cultivars of major oil trade types of Cymbopogons. Columns: EcoRI/MseI primer combination; number of PCR amplification fragments generated.