Investigation of Cotton Germplasm for Genetic Divergence in Yield-Related Traits Using Principal Component Analysis

Background: Cotton is a vital fiber and cash crop in Pakistan. The genetic diversity of germplasm plays an important role in cotton breeding. One hundred and two upland cotton germplasm were investigated for genetic divergence in yield-related attributes using principal component analysis. The experiment was laid out in a randomized complete block (RCB) design with two replications. Data were recorded on various qualitative and quantitative parameters and subjected to principal component analysis (PCA) and cluster analysis. Results: PCA retained four components with eigenvalues greater than 1, which together accounted for 65% of the total variability. In the score plot, Suncrop-6, tipu-9, TJ-max, Deebal, CRIS-543, TH-20, Tahafuz-7, Eagle, BS-80, IUB-69, BH-221, NIAB-1048, and NIAB BT-2 lay at the vertices of the polygon and were therefore the most divergent germplasm. Cluster analysis likewise grouped the germplasm into five main clusters on the basis of yield-related traits. Cluster I contained 20 germplasm, cluster II contained 16, and clusters III, IV, and V comprised 13, 37, and 16 germplasm, respectively. Conclusion: Based on these results, the genetically diverse germplasm identified here are recommended as potential parents for upcoming breeding programs.


Introduction
Cotton is a very important non-food industrial fiber crop grown in more than eighty countries (Dutt et al. 2004; Shakeel et al. 2015). Because of its importance to agriculture and the textile industry, it is also known as "white gold". The genus Gossypium includes 50 species described so far, of which 45 are diploid and five are allotetraploid. Four species are cultivated, and they fall into two groups. The first comprises Gossypium hirsutum and Gossypium barbadense (2n = 52), allotetraploids from the Americas that are commonly termed New World cotton. The second comprises G. arboreum and G. herbaceum (2n = 26), known as Old World or Asiatic cotton because they are cultivated mainly in Asia (Wilkins et al. 2000).
The cotton crop is very sensitive, and the performance of its cultivars varies with location and environmental conditions. Knowledge of the genetic potential and heritability of genotypes, as expressed in their morphological parameters, is therefore needed to screen high-potential strains for breeding programs (Khan et al. 2010). Progress in cotton improvement programs depends mainly on the genetic diversity for metric traits in the base population (Jamil et al. 2020). Assessment of genetic variability among germplasm is important not only for their protection and documentation but also for preserving genotypes and breeding resources (Sunseri et al. 2010). Genetic variability in several agronomic and morphological features, and its relationship with surrounding biotic and abiotic factors, has been exploited for progress in cotton breeding. The amount and nature of the genetic variability available among genotypes offer adequate scope for successful breeding programs aimed at improving various attributes (Ahsan et al. 2015).
Principal component analysis (PCA) is used to analyze the variability present in germplasm, to identify the plant characters that generate this diversity, and to determine the relative contribution of each character to the total variability. Researchers have used PCA to uncover similarities among germplasm for the characters studied and to assign germplasm to clusters. PCA is also used to evaluate the relationships and variability among germplasm for use in future cotton improvement programs (Saeed et al. 2014; Rehman et al. 2015). Because cotton varies widely in quality and quantity parameters, analysis of its genetics is essential. The present study was therefore designed with the following goals: (i) to assess the genetic diversity among the accessions, (ii) to determine the magnitude of that diversity, and (iii) to evaluate and select the best accessions on the basis of morphology and genetic variability.

Materials and Methods
The field experiment was conducted using 102 accessions obtained from the Cotton Research Station, Dera Ismail Khan, Pakistan.

Experimental layout and crop management
One hundred and two germplasm were raised in a randomized complete block design with two replications. Plant-to-plant and row-to-row distances were 30 and 75 cm, respectively. Seed was sown in hills, with about 3 to 4 seeds per hill, and the field was irrigated after sowing. About 60 kg ha⁻¹ of phosphate fertilizer as single super phosphate (18% P₂O₅) and 50 kg ha⁻¹ of nitrogenous fertilizer as urea (46% N) were applied as the initial dose before sowing. The remaining nitrogen was applied as urea, 50 kg ha⁻¹ at flowering and 50 kg ha⁻¹ at the boll formation stage. Thinning was carried out every 20 days to ensure a single plant per hill. Bolls were picked at regular intervals, in about 2-3 pickings, and the seed cotton was stored in bags. All recommended management practices for good crop growth were followed during the experiment.
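As a rough check on the layout, the spacing given above implies a nominal stand density. The short Python sketch below works it out, assuming one plant per hill after thinning and ignoring borders and paths; the function name is illustrative and not part of the original analysis.

```python
CM2_PER_HECTARE = 100_000_000  # 1 ha = 10,000 m^2 = 1e8 cm^2

def hills_per_hectare(plant_spacing_cm, row_spacing_cm):
    """Nominal number of hills per hectare on a rectangular planting grid."""
    return CM2_PER_HECTARE / (plant_spacing_cm * row_spacing_cm)

# Spacing reported in the text: 30 cm between plants, 75 cm between rows.
density = hills_per_hectare(30, 75)
print(round(density))
```

At 30 cm by 75 cm each hill occupies 2,250 cm², which corresponds to roughly 44,444 hills per hectare.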

Data recording and statistical analysis
Data were recorded on yield-related traits: two qualitative (boll shape and stigma position) and eleven quantitative (plant height, sympodial internodal length, number of monopodial branches per plant, number of sympodial branches per plant, number of fruiting positions per sympodium, number of nodes to the first sympodium, plant population, bolls per plant, boll weight, seed cotton yield, and ginning out-turn). Descriptive statistics were computed for all the studied traits, and the trait means were subjected to cluster analysis and principal component analysis in Minitab 18.
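The preparation step described above, averaging the replications for each accession before multivariate analysis, can be sketched in Python as follows. The accession labels and values are made up for illustration, and standardizing the means to z-scores is an assumption; the paper does not state which scaling was applied in Minitab.

```python
from statistics import mean, pstdev

# Hypothetical plant heights (cm) from two replications; not data from the study.
plant_height = {
    "acc_A": (100.0, 110.0),
    "acc_B": (120.0, 130.0),
    "acc_C": (140.0, 150.0),
}

def replicate_means(observations):
    """Average the replicate values for each accession."""
    return {name: mean(values) for name, values in observations.items()}

def standardize(trait_means):
    """Convert accession means to z-scores (mean 0, unit population SD)."""
    values = list(trait_means.values())
    mu = mean(values)
    sigma = pstdev(values)
    return {name: (value - mu) / sigma for name, value in trait_means.items()}

means = replicate_means(plant_height)
z_scores = standardize(means)
print(means)
print(z_scores)
```

Standardizing puts traits measured in different units (for example centimetres and grams) on a common scale, so that no single trait dominates the distance or variance calculations.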

Principal component analysis (PCA)
To explore the variation among the one hundred and two upland cotton germplasm, principal component analysis was applied simultaneously to the mean data for fiber quality, yield, and yield-related attributes (Table 1). From the 11 quantitative characters, four principal components accounted for 65% of the total phenotypic variability among the studied germplasm. The first principal component (PC) had an eigenvalue of 3.58 and explained 32.6% of the total variability (Table 1).
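The link between an eigenvalue and the share of variance it explains can be checked with a few lines of Python. The sketch assumes the PCA was run on the correlation matrix, so that the eigenvalues sum to the number of traits (11); the paper does not state this explicitly. The function names and all eigenvalues other than the first are illustrative.

```python
N_TRAITS = 11  # quantitative traits entered into the PCA

def explained_share(eigenvalue, n_traits=N_TRAITS):
    """Fraction of total variance explained by one PC (correlation-matrix PCA)."""
    return eigenvalue / n_traits

def kaiser_retained(eigenvalues):
    """Keep components whose eigenvalue exceeds 1 (Kaiser criterion)."""
    return [ev for ev in eigenvalues if ev > 1.0]

pc1_share = explained_share(3.58)  # eigenvalue reported for the first PC
print(f"PC1 share: {pc1_share:.1%}")

# Hypothetical eigenvalues for illustration only; just the first is from the paper.
example = [3.58, 1.4, 1.2, 1.05, 0.9, 0.7]
print(kaiser_retained(example))
```

With the reported eigenvalue of 3.58 this gives about 32.5%, in line with the 32.6% in Table 1 once rounding of the eigenvalue is allowed for.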

Cluster analysis
The one hundred and two germplasm were grouped into five clusters on the basis of several yield-related traits (Table 2). This clustering pattern indicates significant genetic diversity among the germplasm (Table 2). A dendrogram based on Euclidean distance was constructed to separate the 102 germplasm into five main clusters (Fig. 3). High Euclidean distances were observed among the upland cotton germplasm. Cluster I contained twenty germplasm and was further divided into three sub-clusters (I, II, and III) comprising 4, 8, and 8 germplasm, respectively (Fig. 3).
Cluster II contained sixteen germplasm and was further divided into two sub-clusters (I and II) containing 8 germplasm each (Fig. 3). Cluster III grouped thirteen germplasm and was divided into two sub-clusters (I and II) comprising 5 and 8 germplasm, respectively (Fig. 3). Cluster IV was the largest, with 37 germplasm, and was divided into five sub-clusters (I to V) comprising 7, 8, 10, 8, and 4 germplasm, respectively. It was followed by Cluster V, which consisted of 16 germplasm divided into two sub-clusters (I and II) containing 10 and 6 germplasm, respectively (Fig. 3).
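The cluster and sub-cluster sizes reported above can be tallied to confirm that they account for all 102 germplasm. A minimal Python sketch, using only the counts given in the text:

```python
# Sub-cluster sizes as reported in the text for each main cluster.
sub_cluster_sizes = {
    "I": [4, 8, 8],
    "II": [8, 8],
    "III": [5, 8],
    "IV": [7, 8, 10, 8, 4],
    "V": [10, 6],
}

# Size of each main cluster is the sum of its sub-clusters.
cluster_totals = {name: sum(sizes) for name, sizes in sub_cluster_sizes.items()}
grand_total = sum(cluster_totals.values())

print(cluster_totals)
print(grand_total)
```

The main-cluster totals come to 20, 16, 13, 37, and 16, which sum to 102.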

Discussion
Classification of crop germplasm is a prerequisite for the protection of plant genetic resources and the management of gene banks. Such information is useful for monitoring large populations of plant resources for desirable traits and for recognizing superior genotypes, i.e. those that are high yielding, early maturing, resistant to biotic and abiotic stresses, and eco-friendly. Plant breeders can use the same information in crop improvement programs (Ghafoor 1999). The use of biometrical techniques, including PCA and cluster analysis, to classify germplasm into distinct clusters on the basis of their performance for yield-related traits has been reported in earlier studies (Qiaoling and Zhe 2011).
Assessment and characterization of diverse crop species by yield-related traits is therefore highly significant for plant breeders (Martins et al. 2006). It is essential to estimate the genetic differences among plant genotypes, examine each of their characteristics, and preserve them for use in future breeding programs. Evaluating the overall genetic divergence present in crop germplasm also helps to estimate the magnitude of the genetic load and stress faced by a population, which helps experts devise conservation strategies in advance for threatened species.
The present investigation therefore involved morphological as well as biochemical characterization of 102 upland cotton germplasm from diverse eco-geographical backgrounds. The population was screened for 2 qualitative and 11 quantitative yield and morphological traits to measure the genetic divergence present in the germplasm.

Principal Component Analysis based on yield related traits
Upland cotton germplasm were examined through PCA and then grouped on the basis of similarities in their morphological traits, irrespective of whether they originated from the same or different ecological zones. PCA is frequently used in crop science to reduce the number of variables and to classify germplasm. Among the biometrical procedures, the key benefit of PCA is that each germplasm can be assigned to only one group; it also reflects the importance of the largest contributor to the total variability on each axis of differentiation (Sharma 2006). Genetic diversity for yield-related traits has been assessed by principal component analysis, which helps to identify variability in upland cotton (Li et al. 2008) ... (Matus et al. 1999). The number of common factors to be retained should equal the number of PCs with eigenvalues greater than 1 (Kaiser 1960).

Cluster Analysis based on yield and morphological traits
Numerous plant breeders and plant scientists have previously applied two complementary methods, cluster analysis and PCA, and obtained notable results on the diversity of yield and morphological parameters among cotton genotypes. These techniques have been applied successfully in cotton. The results of these earlier workers support the present investigation in that the two techniques are very helpful for estimating, more clearly, the relationships among individuals originating from different environments. The classification of the upland cotton germplasm studied here into groups was not due to their origin in diverse ecological zones of the country and the world, but to their diversity at the agronomic and morphological levels. Results for qualitative and quantitative traits of Iberian pea genotypes were also in agreement with ours (Amurrio et al. 1995).

Conclusion
Genetic diversity was estimated through PCA and cluster analysis, which revealed that, of the eleven principal components, four had eigenvalues > 1 and together accounted for 65% of the variability across all the studied attributes. PC I and PC II together contributed about 44.7% of the variability. Suncrop-6, tipu-9, TJ-max, and Deebal were found to be the most genetically diverse germplasm. These germplasm could be used in future cotton improvement programs.