Disentangling potential genotypes for macro and micro nutrients and polymorphic markers in Chickpea

The present investigation was conducted to assess nutritional diversity and to identify novel genetic resources for use in chickpea breeding for macro and micro nutrients. The plants were grown in a randomized block design, and the nutritional and phytochemical properties of nine chickpea genotypes were estimated. EST sequences were downloaded from the NCBI database in FASTA format, clustered into contigs using CAP3 and mined for novel SSRs using TROLL analysis; primer pairs were designed using Primer 3 software. Jaccard's similarity coefficients were used to compare the nutritional and molecular indexes, followed by dendrogram construction employing the UPGMA approach. The genotypes PUSA-1103, K-850, PUSA-1108 and PUSA-1053, together with ten EST-SSR markers, namely ICCeM0012, ICCeM0049, ICCeM0067, ICCeM0070 and ICCeM0078 and the five newly designed markers SVP55, SVP95, SVP96, SVP146 and SVP217, were identified as potential donor/marker resources for the macro–micro nutrients. The genotypes differed (p < 0.05) in nutritional properties. Amongst the newly designed primers, 6 were polymorphic with a median PIC of 0.46. The number of alleles per primer ranged from 1 to 8. Cluster analyses based on nutritional and molecular diversity partially matched each other. The identified novel genetic resources may be used to widen the germplasm base, prepare a maintainable catalogue and identify systematic blueprints for future chickpea breeding strategies targeting macro–micro nutrients.


Introduction
Chickpea is a self-pollinating diploid (2n=2x=16) with a genome size of 1C=740 Mbp [1]. It has remarkable attributes such as wide climatic adaptation, low production cost and suitability for crop rotation and atmospheric nitrogen fixation. Chickpea is a noteworthy legume for the sustainability of agricultural systems [2]. Despite its low productivity, due especially to lime-induced Fe deficiency, chickpea is cultivated over large areas of the world [2]. It is the second most significant pulse crop (after dry beans), grown mainly in arid and semi-arid regions in over 40 countries representing all the continents, with a total harvested area of 13.72 million hectares (MHa), a yield of 1038.4 kg per hectare (Kg/Ha) and a total production of 14.25 million tonnes (MT) [3]. Developing countries hold the largest share (95%) of the area, production and consumption of chickpeas. Over the last 30 years (1989-2019), worldwide chickpea area increased by 138.56%, yield by 143.29% and production by 198.53% [3]. Presently, it is cultivated in several countries, with the largest harvested area in India (9.55 MHa), followed by Pakistan, the Russian Federation, Turkey and Myanmar, among others [3]. India is currently the principal chickpea producer, contributing around 69.76% of global production, followed by Turkey, the Russian Federation, Myanmar and Pakistan; together these are the top five world producers [3]. The main pulse crops, i.e., beans, peas and chickpeas, account for around 64.17% of global pulse production, with chickpea accounting for nearly 16.12% [3]. In India during 2018-19, chickpea was cultivated on 9.44 MHa with a total production of 10.13 MT and a yield of 1073 Kg/Ha. Madhya Pradesh ranked first with the highest acreage of 3.43 MHa, followed by Rajasthan, Maharashtra, Karnataka and Uttar Pradesh. The highest production, 4.61 MT, was contributed by Madhya Pradesh, followed by Rajasthan, Maharashtra and Uttar Pradesh. The highest yield, 1344 Kg/Ha, was recorded in Madhya Pradesh, followed by Gujarat (1324), Uttar Pradesh (1272) and Rajasthan (1103) [4]. However, as per the recently released 3rd advance estimates, India expects a total chickpea production of 12.63 MT during 2020-21 [5].
Owing to variation in quality and quantity traits, chickpea harbours large variability, which helps breeders to develop advanced lines and release better-quality varieties [6]. Chickpea is one of the earliest cultivated edible grain legumes [7]. Remains about 7,500 years old have been found in the Middle East [7]. Chickpea is an ideal crop for human consumption owing to its high content of protein (17-24%), carbohydrates (41-50.8%), minerals and unsaturated fatty acids such as linoleic and oleic acid [8]. Identifying potential genotypes is vital when large collections of crop germplasm are being considered. Likewise, a newly developed cultivar has to be registered and the purity of the variety ascertained. DNA markers offer efficient and well-grounded techniques for assessing genomic variability and affiliations among germplasm lines, and are therefore considered effective tools for assessing genomic variation and studying evolutionary relationships [9]. In plant genomes, PCR-based techniques and microsatellite sequences facilitate the analysis of genomic diversity. Genomic analysis procedures using DNA polymorphism have been progressively used to describe and classify novel germplasm for use in the crop breeding process [10]. Environmental factors and growth practices affect morphological and nutritional markers, whereas DNA markers remain unaffected by environmental conditions. The present study utilized EST-SSR markers to assess the pattern and extent of genomic variability and congruence among the genotypes. Thus, the findings would be helpful in identifying and differentiating genotypes for local consumption or export, selecting diverse parents, devising competent approaches for the efficient management of genetic resources and widening the germplasm base for use in forthcoming nutrient-rich chickpea breeding programmes.

Macro and Micro Nutrients' based Similarity Vs Dissimilarity Analysis
The nutritional profile was used for macro and micro nutrient-based similarity analysis (Table-II). The similarity coefficients were used to cluster the data as per the UPGMA algorithm. The resulting phenogram grouped the 9 genotypes into four distinct clusters with several sub-clusters (Figure 1). Cluster 1 comprises 5 cultivars, which are further arranged into three sub-clusters, viz. 1A, 1B and 1C. Sub-cluster 1A is represented by two genotypes, PUSA-362 and PUSA-1103; sub-cluster 1B by two genotypes, K-850 and JG-62; and sub-cluster 1C by a single genotype, JG-74. Cluster 2 is represented by a single genotype, PUSA-1088. Cluster 3 is represented by two genotypes, PUSA-1105 and PUSA-1053. Cluster 4 is represented by a single genotype, PUSA-1108, which remains isolated at the end of the dendrogram.
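The UPGMA grouping described above can be illustrated with a minimal Python sketch. It is not the NTSYS-PC implementation used in the study; the function name `upgma`, the labels G1 to G4 and the similarity values are hypothetical and chosen only to show how average-linkage merging proceeds on a similarity matrix.

```python
from itertools import combinations


def upgma(names, sim):
    """Average-linkage (UPGMA) clustering on a symmetric similarity matrix.

    names: list of labels; sim[i][j]: similarity between names[i] and names[j].
    Returns the merges in order as (members_a, members_b, mean_similarity).
    """
    index = {name: i for i, name in enumerate(names)}
    clusters = {i: (name,) for i, name in enumerate(names)}
    next_id = len(names)
    merges = []

    def mean_similarity(a, b):
        # Unweighted mean over all between-cluster pairs (the "UPGMA" average).
        values = [sim[index[x]][index[y]] for x in a for y in b]
        return sum(values) / len(values)

    while len(clusters) > 1:
        # Merge the two clusters with the highest mean similarity.
        i, j = max(
            combinations(clusters, 2),
            key=lambda p: mean_similarity(clusters[p[0]], clusters[p[1]]),
        )
        a, b = clusters.pop(i), clusters.pop(j)
        merges.append((a, b, mean_similarity(a, b)))
        clusters[next_id] = a + b
        next_id += 1
    return merges


# Hypothetical similarity matrix for four illustrative genotypes.
toy_names = ["G1", "G2", "G3", "G4"]
toy_sim = [
    [1.0, 0.9, 0.4, 0.3],
    [0.9, 1.0, 0.5, 0.2],
    [0.4, 0.5, 1.0, 0.8],
    [0.3, 0.2, 0.8, 1.0],
]
for left, right, s in upgma(toy_names, toy_sim):
    print(left, "+", right, "at similarity", round(s, 2))
```

The order and height of the merges are what a dendrogram such as Figure 1 displays; genotypes that join late, at low similarity, appear isolated.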

Molecular Analysis
Genetic markers are extensively used to detect heritable variation at single or multiple gene loci of individual plants within a population or between plant populations. In recent times, owing to the availability of a large number of published expressed sequence tags (ESTs), many SSRs have been developed from them and are referred to as EST-SSRs [11]. In our study, a total of 73 primers were used to dissect the molecular signatures of 9 chickpea genotypes, of which 12 primers, with an average PIC value of 0.45, showed polymorphism (Supplementary Table-I).

The novel EST-SSR frequency
Since EST sequences are usually partial-length cDNAs, it may be impossible to identify sufficient and suitable sequences to define flanking priming sites for the harboured SSRs. The use of CAP3 software facilitated the identification of overlapping sequences among ESTs and the generation of consensus contiguous sequences, improving the chance of identifying sufficient flanking sequences. Of the assessed ESTs, 18.4% fell into shared and contiguous sequences (1,178), indicating a relatively high level of redundancy within and between the chickpea EST databases. The type and length of an SSR motif are important factors in determining its usefulness as a marker, since some motifs are more common than others and the larger the repeat, the higher the probability that it will be polymorphic [12]. Within the 27 EST-derived SSR markers or constructed contigs (Table-III), the most common repeats were di (CT, TA), tri (AGA, TCA, TGG), tetra (CCAC, ANTC), penta (AAANA, TCTCN) and hexa (AATATT) nucleotide motifs, varying in length from 2 to 10 units.

Putative functional categorization of the new EST-SSR markers
ESTs are currently the most widely sequenced nucleotides derived from plant genomes in terms of the number of sequences and available nucleotide counts. Following functional characterization, the identified novel SSR loci may be useful for mapping and possible co-localization with QTLs for desirable traits, and for future validation as possible candidate genes [13]. In particular, there is an urgent need to uncover sequences that are physically and functionally associated with traits of interest [14]. Following comparison with sequences within the databases, such as those from the existing EST library [15], functional annotation of the identified EST-SSRs showed homology with proteins associated with various biological processes, molecular functions and cellular components. Of the 27 SSR markers optimized for amplification, four showed gene ontology for proteins involved in drought stress, one for protein folding and one for molecular function. Sequences encoding ABA-specific (SVP2), BTB domain (SVP 134), ribosomal protein (SVP 146, SVP 204) and dehydrin (SVP 213, SVP 285) genes were also identified (Table-IV).

Number of alleles and molecular polymorphism
The highest number of alleles was observed for the primers SVP 95 and ICCeM0059 (three alleles), followed by SVP 55, SVP 96, SVP146, SVP213, SVP217, ICCeM0002, ICCeM0023, ICCeM0049, ICCeM0067, ICCeM0070 and ICCeM0078 (two alleles); the other primers produced only one allele. Among the primers showing polymorphism, ICCeM0059 had the maximum number of shared alleles (27) and ICCeM0025 the minimum (4). On the basis of shared alleles, the allele frequencies of primer ICCeM0059 are 0.33, 0.33 and 0.33, while that of ICCeM0025 is 1.0. Based on the allele frequencies, the PIC values were estimated for the different EST-SSR primers. The PIC values for the 12 polymorphic markers ICCeM012, ICCeM0049, ICCeM0059, ICCeM0067, ICCeM0070, ICCeM0078, SVP55, SVP95, SVP96, SVP146, SVP213 and SVP217 ranged from 0.28 to 0.68. Amongst the polymorphic markers, SVP 213 showed the lowest PIC value (0.28) and ICCeM0059 the highest (0.68), the latter because of the even distribution of its three alleles among the genotypes of C. arietinum (Supplementary Table-I).

Molecular Similarity Vs Dissimilarity Analysis
EST-SSR data were used to compare genotypes pairwise on the basis of shared and unshared products with NTSYS-PC version 2.11s (Table-II). The suitability of SSRs for detecting intraspecific variation in chickpea has been demonstrated by applying polymorphic SSR markers to investigate intraspecific genetic variation amongst geographically distant Cicer genotypes [16]. Genomic similarity amongst genotypes was assessed by a similarity matrix based on Jaccard's coefficients, which ranged from 0.76 to 1.00.
The molecular profile-based similarity coefficients were used to cluster the data as per the UPGMA algorithm, which produced 3 clusters (Figure-

Discussion
Chickpeas offer nutritional benefits and are recommended for sustainable diets. Proximate analysis of the selected chickpea genotypes revealed high macro and micro nutrient contents and great phytochemical potential. The proximate compositions agree with earlier studies on legumes [17,18,19,20,21,22]. As regards total polyphenols and antioxidant activity, our results showed significantly high levels of TPC and antiradical activity, suggesting that these genotypes have substantial phytochemical properties that can be utilized in product development to combat inflammation and malnutrition. Our TPC results are consistent with earlier reports [21,23] and are similar to those for certain underutilized legumes in Korea such as pigeon pea (248-300 mg), groundnut (140-358 mg), kidney bean (250-320 mg) and groundnut species (100-289 mg) [24]. In general, all desi genotypes had greater antioxidant activity than kabuli chickpeas. Such differences in antioxidant activity amongst genotypes have also been found in several studies and can arise from genetic variation, the extraction method and environmental factors such as rainfall and temperature [25].
Our results on the frequency and characterization of novel EST-SSRs showed that the most frequent repeat type was trinucleotide (35.29%), followed by tetranucleotide (23.5%) and dinucleotide (18%) motifs. The abundance of trinucleotide motifs in the chickpea coding sequences (35.29%) agrees with observations in monocots and dicots [26], reflecting the need of coding regions to preserve codons [27]. In total, 348 of the 1,778 contigs contained SSRs (19.6%), of which 27 had sufficient flanking sequence to design primer pairs (Table-I). A similar study observed a relatively higher level of EST-SSRs (11.5%) among the assessed ESTs in Cicer arietinum [28] as compared to SSRs (3.2%) in cereals [29]. However, it should be kept in mind that the number of SSRs mined from a sequence database depends on the SSR discovery criteria, the size of the dataset and the database mining tools used [30]. The 27 SSR-flanking primer pairs designed in the current study amplified products in the expected size range in each of the assessed chickpea genotypes, and 6 of these produced polymorphisms, with a median PIC value of 0.46 for the 9 genotypes (Supplementary Table-I).
Regarding the putative functional categorization of the novel EST-SSRs, joint mapping and expression studies will determine the potential usefulness of the markers for traits of interest. Future approaches will integrate transcriptomics and marker development in a single step. Although the level of polymorphism within EST-derived SSR markers is generally lower than within SSRs derived from genomic libraries [31], the markers in our study were polymorphic across several accessions. The use of SSCP analysis may further disclose internal single nucleotide polymorphisms [32]. In future, the SSRs developed from ESTs will be mapped to determine whether they co-segregate with the genetic variation explained by the trait loci, as an initial step towards identifying potential candidate genes.
The high PIC value observed in our study is also supported by an earlier report [33]. Careful examination of primer amplification, number of alleles, repeat motifs, product size, polymorphism level and PIC values indicated that ten primers, namely ICCeM012, ICCeM0049, ICCeM0070, ICCeM0078, SVP55, SVP95, SVP96, SVP146, SVP213 and SVP217, are efficient potential markers for macro-micro nutritional trait association and polymorphism studies.
We applied an integrated approach of macro-micro nutrient and molecular diversity analysis across nine chickpea genotypes. Close examination of the nutritional observations revealed the overall superiority of PUSA-1103 and K-850 over PUSA-362, in line with earlier studies [17]. The genotype PUSA-1103 has also been reported to be a resource donor for nickel and drought resistance [41].
Thus, the identified novel potential resources are the chickpea genotypes PUSA-1103 for higher carbohydrate and zinc, K-850 for higher antiradical activity and fibre, PUSA-1108 for protein, and PUSA-1053 for higher iron and zinc and lower TPC and phytate contents, together with the 10 EST-SSR markers ICCeM012, ICCeM0049, ICCeM0070, ICCeM0078, SVP55, SVP95, SVP96, SVP146, SVP213 and SVP217. These may be utilized as potential donor/marker resources for the development of mapping populations specific to macro-micro nutritional traits, the construction of genetic maps, marker-trait associations and the localization of genes/QTLs for useful nutritional traits in chickpea. Further, the identified genotypes, being agronomically adopted varieties, may also be utilized by food technologists and government-sponsored product-oriented schemes for the amelioration of malnutrition amongst infants, children and pregnant women.

Experimental plots
The experimental research and field studies on chickpea were carried out in a randomized block design, observing the national and legislative guidelines, in the experimental field (MB 6 B) of the Division of Genetics, IARI, New Delhi. The experimental plot was topographically uniform and situated at an altitude of 225 m above mean sea level, between 28°38'0" N and 28°38'30" N latitude and 77°9'0" E and 77°9'15" E longitude. The field soil was a mildly alkaline sandy loam (pH about 7.5-8.5) with low EC (about 0.4-0.6 dS/m), low organic content (<0.5%), low nitrogen (<280 kg/ha), high phosphorus (24-50 kg/ha), high potassium (>280 kg/ha), medium sulphur (10-20 mg/kg), adequate zinc (1-5 mg/kg), adequate iron (5.8-10 mg/kg), adequate manganese (10-25 mg/kg) and adequate copper (0.5-10 mg/kg). The genotypes from distant regions (Table-II) were chosen on the basis of previously generated 'passport' data as well as field examinations recorded over a decade in the experimental fields of IARI, New Delhi. Healthy seeds of each genotype were cultivated in a randomized block design with three replications under all suitable agronomic practices during 2020-21. The leaves were used for molecular studies and the seeds for nutritional studies.

Macro and Micro Nutrients' Estimation Analysis
Ash content was estimated from 2 g seed samples on a dry weight basis for each variety as per the procedure in [42].
Moisture content was assessed from 3 g seed samples for each genotype as per the procedure described in [42].
Protein content of the seed samples was evaluated using the Kjeldahl method. A 0.5 g sample was placed in a Kjeldahl digestion flask for digestion and percent nitrogen was calculated as per AOAC [42].
The following equation was used to calculate nitrogen percent:
$$\text{Nitrogen \%} = \frac{(\text{Sample titre} - \text{Blank titre}) \times N_{\text{HCl}} \times 14 \times 100}{\text{Weight of sample} \times 1000} \times 100$$
Thereafter, protein content was obtained from the equation:
$$\text{Protein \%} = 6.25 \times \text{Nitrogen \%}$$
Fat content was determined by the Soxhlet method by extracting 2 g of seed sample with petroleum ether as per AOAC [42]. Fat content in percent was calculated after complete extraction of the sample using the following equation:
$$\text{Fat \%} = \frac{\text{Weight of beaker with oil} - \text{Weight of pre-weighed constant (blank) beaker}}{\text{Weight of sample}} \times 100$$
Carbohydrate was determined by the difference method [43] and calculated by the following formula:
$$\text{Carbohydrate (\%)} = 100 - (\text{protein} + \text{fat} + \text{moisture} + \text{ash} + \text{crude fibre})$$
where each component is expressed in grams per 100 g of the food sample.
For crude fibre, 2 g of sample was defatted with petroleum ether and the residual fat-free sample was used for fibre estimation as per AOAC [42]. The percent loss in weight was expressed as crude fibre.
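The proximate calculations described in this section (protein from nitrogen, fat by Soxhlet weight gain and carbohydrate by difference) can be sketched in Python as follows. The function names and the sample values are illustrative assumptions, not data from this study; nitrogen percent is taken as an input rather than recomputed from titre values.

```python
def protein_percent(nitrogen_percent, factor=6.25):
    """Crude protein from Kjeldahl nitrogen using the 6.25 conversion factor."""
    return factor * nitrogen_percent


def fat_percent(beaker_with_oil_g, blank_beaker_g, sample_g):
    """Fat as the weight gain of the pre-weighed beaker relative to sample weight."""
    return (beaker_with_oil_g - blank_beaker_g) / sample_g * 100


def carbohydrate_percent(protein, fat, moisture, ash, crude_fibre):
    """Carbohydrate by difference; all inputs in g per 100 g of sample."""
    return 100 - (protein + fat + moisture + ash + crude_fibre)


# Hypothetical values for a single seed sample (not measured data).
protein = protein_percent(3.2)            # 3.2 % nitrogen
fat = fat_percent(50.10, 50.00, 2.0)      # 0.10 g oil from a 2 g sample
carbs = carbohydrate_percent(protein, fat, moisture=8.0, ash=3.0, crude_fibre=4.0)
print(round(protein, 2), round(fat, 2), round(carbs, 2))
```

Because carbohydrate is obtained by difference, any error in the other five components propagates directly into it, so the components should be reported on the same weight basis.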
Phytate content of the selected samples was determined by the ferric nitrate method [45]. A standard curve prepared with ferric nitrate was used to calculate micrograms of iron.
The antiradical activities of the sample extracts were assessed by the DPPH (2,2-diphenyl-1-picrylhydrazyl radical) method [46] with slight modification, and the percent antiradical activity was then calculated.
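As an illustration of the DPPH calculation, the sketch below uses the widely used percent-inhibition form, (A_control - A_sample) / A_control × 100. This form is an assumption: the exact expression applied in this study, including the modification to the method of [46], is not reproduced in the text. The function name and absorbance values are hypothetical.

```python
def antiradical_activity_percent(abs_control, abs_sample):
    """Percent DPPH radical scavenging from control and sample absorbances.

    Assumes the common percent-inhibition formula; not necessarily the exact
    expression used by the authors.
    """
    if abs_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (abs_control - abs_sample) / abs_control * 100


# Hypothetical absorbance readings.
print(round(antiradical_activity_percent(0.80, 0.20), 1))
```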
Iron and zinc concentrations in the samples were determined according to the standard procedure [42] using an atomic absorption spectrophotometer (AAS). Standard curves were prepared with iron and zinc standards (NIST), and mineral concentrations were expressed as mg/100 g.

Statistical data analysis for Macro and Micro Nutrients
SPSS version 7.5 software was used for all analyses of the nutritional evaluations, which are stated as means of three replications. The results were examined by one-way analysis of variance (ANOVA), followed by Duncan's multiple range test to compare the significance of means at p<0.05.
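The one-way ANOVA step can be sketched as follows. This is a minimal illustration of how the F ratio is formed from between-group and within-group variation, not a reproduction of the SPSS analysis; Duncan's multiple range test and the p-value lookup are not included, and the replicate values are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical trait values: three genotypes, three replications each.
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_ratio, df_b, df_w)
```

The resulting F ratio would then be compared with the critical value for (df_between, df_within) at the chosen significance level.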

Designing of New EST-SSR Markers
The chickpea EST sequences available in the NCBI database [47] were downloaded in FASTA format (Accession No. CDO 38847-GR 394575). These EST sequences were clustered into contigs and singletons using CAP 3 software. The resultant 348 contigs were mined for novel SSRs using tandem repeat occurrence locator (TROLL) analysis to search for dinucleotide, trinucleotide, tetranucleotide, pentanucleotide and hexanucleotide repeat motifs. The EST-SSR markers/primer pairs were designed using Primer 3 software [48,49] with the following optimal parameters: 40-80% GC content, 50-60 °C Tm, 15-25 bases primer length and 100-280 bp product length. The primers were synthesized by Imperial Life Sciences, USA and designated as SVP with a numerical identification.
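The idea behind SSR mining and primer screening can be sketched in Python with regular expressions. This is not the TROLL or Primer 3 algorithm; the minimum repeat counts in `MIN_REPEATS`, the function names and the example sequence are assumptions made for illustration, while the GC-content and primer-length ranges are those stated above.

```python
import re

# Assumed minimum number of repeat units per motif length (di- to hexanucleotide).
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    sequence = sequence.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(sequence):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "CTCT" or "AAA".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def gc_percent(primer):
    """GC content of a primer in percent."""
    primer = primer.upper()
    return 100.0 * sum(primer.count(base) for base in "GC") / len(primer)


def primer_in_range(primer, gc_range=(40.0, 80.0), length_range=(15, 25)):
    """Check primer length and GC content against the stated design ranges."""
    return (length_range[0] <= len(primer) <= length_range[1]
            and gc_range[0] <= gc_percent(primer) <= gc_range[1])


# Hypothetical contig fragment containing a (CT)6 and an (AGA)4 repeat.
contig = "GGATCC" + "CT" * 6 + "GATTACC" + "AGA" * 4 + "CCGG"
print(find_ssrs(contig))
print(primer_in_range("ATGCATGCATGCATGCAT"))
```

Melting temperature and product length are left to dedicated primer design software, since Tm depends on the thermodynamic model chosen.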

DNA isolation, amplification and detection of microsatellite alleles
Genomic DNA was isolated from young leaves as per the CTAB method [50] with a few modifications. Fresh leaves (3 g) were ground to a fine powder with a mortar and pestle in liquid nitrogen. The fine powder was then transferred to a centrifuge tube containing 15 ml of warmed (65°C) extraction buffer. The samples were shaken occasionally to mix them thoroughly and then incubated for 60 minutes at 65°C. An equal volume of chloroform:isoamyl alcohol (24:1) was added to each tube and mixed gently for 15-20 min. After mixing, the tubes were centrifuged for 10 min at 8,000 rpm (CPR-24, Remi India). The aqueous phases were transferred to fresh tubes and extracted again with chloroform:isoamyl alcohol. After that, chilled isopropanol (0.6 ml) was added and the tubes were kept at -20°C for 2 hours. The tubes were again centrifuged for 15 min at 4°C at 10,000 rpm. After centrifugation, the supernatants were discarded and the pellets were washed with 70% ethanol. Finally, the pellets were air dried and dissolved in 100 µl of TE buffer.
PCR analyses were carried out in an Eppendorf Master Cycler gradient in a total volume of 10 µl containing 10x buffer with 2.5 mM MgCl2, 25 ng of template DNA, 10 mM dNTPs, 10 µM primers and 0.5 U/µl Taq DNA polymerase (Bangalore Genei). A total of 73 EST-SSR markers were used for amplification, comprising the 27 EST-SSR markers newly designed in this study and 39 and 7 markers designed previously [12,51]. After initial denaturation for 2 min at 95°C, 35 cycles were run, each consisting of denaturation at 95°C for 20 s, annealing at 52-70°C for 50 s and elongation at 72°C for 50 s. Final extension was done at 72°C for 7 min. Amplicons were resolved on a 1.2% agarose gel containing ethidium bromide (0.5 µg/ml) in 1x TBE buffer, run at 60 volts. The amplified products were checked twice for reproducibility for each polymorphic primer. The frequencies of all polymorphic alleles were calculated to determine the polymorphic information content.
The gel was photographed with a CCD camera attached to a gel documentation system with Quantity One software (Alpha Innotech). Scoring was done manually for each gel section and alleles were determined on the basis of band positions. The band pattern for each microsatellite marker was documented for each genotype by assigning a letter to each band. All the alleles were numbered as 'a1', 'a2' etc. In the data matrix, the presence of a band was denoted as '1' and the absence of a band as '0'. The efficacy of the 73 markers was measured by the polymorphic information content (PIC) as per the assessment procedure in [52].
$$\mathrm{PIC}_i = 1 - \sum_{j} P_{ij}^{2}$$
where $P_{ij}$ is the frequency of the j-th allele for the i-th locus, summed across all alleles in the locus.
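A minimal Python sketch of the PIC calculation follows, assuming the commonly used form in which PIC is one minus the sum of squared allele frequencies at a locus. The function name `pic_from_counts` is hypothetical; allele frequencies are derived from allele counts so that they always sum to one.

```python
def pic_from_counts(allele_counts):
    """PIC for one locus, assuming PIC = 1 - sum of squared allele frequencies."""
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("at least one allele must be observed")
    frequencies = [count / total for count in allele_counts]
    return 1 - sum(p * p for p in frequencies)


# Three evenly distributed alleles, as described for ICCeM0059 (27 shared alleles).
print(round(pic_from_counts([9, 9, 9]), 3))
# A single allele (monomorphic locus) gives a PIC of zero.
print(pic_from_counts([4]))
```

With three equally frequent alleles the sketch gives about 0.67, close to the value of 0.68 reported for ICCeM0059; the small difference may reflect rounding or details of the procedure in [52].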
The pairwise genomic similarities for all the genotypes were assessed using Jaccard's coefficient [53], and all statistical analyses were carried out with the software NTSYS-PC (version 2.11s) [54].
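Jaccard's coefficient on the binary band matrix (1 for presence, 0 for absence) can be sketched as below. The function names and the band profiles are hypothetical and serve only to show the calculation, which in the study was performed with NTSYS-PC.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity: bands shared by both / bands present in either."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    present = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    # Two profiles with no bands at all are treated as identical.
    return shared / present if present else 1.0


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities for a dict of name -> 0/1 band profile."""
    names = list(profiles)
    return {
        (x, y): jaccard(profiles[x], profiles[y])
        for x in names
        for y in names
    }


# Hypothetical band profiles for three genotypes scored at four band positions.
bands = {"G1": [1, 1, 0, 1], "G2": [1, 0, 0, 1], "G3": [0, 0, 1, 0]}
matrix = similarity_matrix(bands)
print(round(matrix[("G1", "G2")], 2), matrix[("G1", "G3")])
```

Shared absences (0, 0) do not contribute to the coefficient, which is why Jaccard's measure is often preferred for band data where the absence of a band is less informative than its presence.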

Functional annotation of the new EST-SSR markers
Functional annotation of the newly designed markers was obtained from GenBank using the BLASTX algorithm against the nr database [55]. The contigs employed for marker development were translated using TranSeq [56]. The derived putative amino acid sequences were submitted for a domain search in Gene Ontology [56] and GO terms were retrieved from the topmost identical hits [57]. The AmiGO term browser [56] (http://amigo.geneontology.org/cgi-bin/amigo/search.cgi) was used to find molecular function, cellular compartmentalization and inferred biological process ontology. The Pfam database was used to infer gene function [58].

Cluster analysis for measurement of distances
The software NTSYS-PC version 2.11s was employed to categorize genotypes into discrete clusters.

Table I: Macro-micro nutritional analysis for 12 traits in 9 chickpea varieties

Figure 1: Macro-micro nutrient-based dendrogram of 9 chickpea varieties constructed by UPGMA cluster analysis based on nutritional similarity indexes