Phylogenetics of rice (Oryza sativa L.) genotypes from Pakistan based on diversity of biochemical markers

Haseena Gulzar, Department of Biotechnology, Shaheed Benazir Bhutto University, Sheringal Dir Upper, P.O Box 18000 Khyber Pakhtunkhwa, Pakistan
Sidra Pervez, Department of Biochemistry, Shaheed Benazir Bhutto Women University, Peshawar, Khyber Pakhtunkhwa, Pakistan
Rehana Masood, Department of Biochemistry, Shaheed Benazir Bhutto Women University, Peshawar
Asad Jan, Institute of Biotechnology and Genetic Engineering, Agriculture University Peshawar, P.O. Box 706 Khyber Pakhtunkhwa, Pakistan
Faheeda Soomro, International Center for Chemical and Biological Sciences, University of Karachi, Karachi 75270, Pakistan
Muhammad Asif Nawaz (asif_biotech33@yahoo.com), Department of Biotechnology, Shaheed BB University, Sheringal Dir.


Introduction
Rice, a member of the family Poaceae, is the staple food for more than half of the world's population, with South Asia being the biggest rice producer and consumer. It is cultivated all over the world on about 154 million hectares every year, occupying 11% of the world's cultivated area, and 10% in Pakistan alone. It adds 5.9% of value in agriculture and 1.3% in GDP in Pakistan, a value larger than in other South Asian countries, and is therefore also called the 'Golden Grain of Pakistan' (Simmonds 1986; Shah et al. 2011).
Among the cereal crops, rice has the smallest genome, which is well characterized and sequenced, making it a model plant for genomic research; it has 24 chromosomes (Eckardt 2000; Swain et al. 2017). Pakistan is one of the few South Asian countries that is self-sufficient in rice production, with yield increasing since 2001, and is the second largest exporter, with a wealth of diverse traditional landraces still cultivated in villages (Shah et al. 2011). More than 2,500 accessions collected from different areas of Pakistan are preserved in the national gene bank, IABGRI-NARC, Islamabad. Moreover, 100,000 traditional rice varieties are stored in gene banks throughout the world (Pervaiz et al. 2011; Bishwajit et al. 2013). In the last two decades, many researchers have worked to assess the genetic variability present in the rice germplasm of Pakistan (Khan et al. 2013; Tahir 2014; Bibi et al. 2017).
Landraces are valuable genetic resources containing huge genetic variability. They carry several unwanted agronomic traits, such as tall plant stature and long crop duration, but also exhibit strong characteristics such as resistance to biotic and abiotic stresses. In the last decade of the 20th century, Pakistan released a number of superior cultivars by crossing advanced breeding lines and traditional varieties, achieving high grain quality and resistance to biotic and abiotic stresses (Pervaiz et al. 2011; Haque et al. 2015). SDS-PAGE is a useful, easy, low-cost and quick way of studying genetic diversity in numerous genotypes based on polymorphic endosperm storage proteins, which, unlike morphological traits, are not under the influence of environment (Ali et al. 2007; Gulzar et al. 2015). Therefore, the information provided by protein profiling can be beneficial in screening these germplasm resources for desirable traits, for efficient utilization in advanced crop improvement, genetic engineering and breeding strategies (Conrad et al. 2013; Gul et al. 2015).
This work is one such effort: to ensure food security, systematic evaluation of valuable genetic resources is a prerequisite for their use in crop improvement programs (Bibi et al. 2017).

Materials And Methods
This study was conducted at the Institute of Biotechnology and Genetic Engineering (IBGE), Agriculture University Peshawar, KPK, Pakistan. Rice germplasm was provided by IABGRI-NARC, Islamabad, Pakistan.
The work consisted of two experiments: optimization of the protein extraction protocol and biochemical evaluation of 150 rice accessions originating from Pakistan. The accessions belonging to KPK province (126), including the check variety Fakhr-e-Malakand, were also analyzed separately based on SDS-PAGE of endosperm storage proteins.

Optimization of protein extraction protocol
For optimization, three different compositions of protein extraction buffer (PEB) were used to extract proteins from rice endosperms in order to obtain clear bands.

Biochemical characterization using SDS-PAGE
Total rice seed proteins were evaluated by a one-dimensional discontinuous SDS-PAGE system with some modifications of Laemmli's method (1970). Extracted proteins (12 µl each) were loaded in a 5% stacking and 12% separating gel and run at 25 mA and 100 V for 60 to 90 min. Bio-Rad dual precision protein ladder was used as the molecular weight marker (10 to 250 kDa). The gels were stained overnight in 0.02% Coomassie Brilliant Blue (CBB) with continuous shaking and destained in distilled water until the background was clear (Khan et al. 2013).

Data analysis for phylogenetics and diversity
Each electrophoregram was scored for the presence/absence (1/0) of polypeptide bands in each accession. The resulting binary data were used to generate similarity and dissimilarity matrices, which were further processed to construct a phylogenetic tree by the unweighted pair group method with arithmetic means (UPGMA; Sneath and Sokal 1973) and Ward's method, using the Statistical Package for the Social Sciences (SPSS) (Rahman et al. 2010; Gulzar et al. 2015).
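The scoring-to-tree workflow described above can be illustrated with a minimal sketch in plain Python. It is not the SPSS procedure used in this study: it assumes Jaccard's coefficient for the binary band data and average linkage (UPGMA) for clustering, and the accession names and band profiles are hypothetical toy data.

```python
from itertools import combinations

def jaccard_similarity(a, b):
    """Jaccard similarity of two equal-length 0/1 band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0

def dissimilarity_matrix(profiles):
    """Map each unordered pair of accession names to 1 - Jaccard similarity."""
    return {
        frozenset((p, q)): 1.0 - jaccard_similarity(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }

def upgma(profiles):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    dist = dissimilarity_matrix(profiles)
    # Each cluster is (subtree, list of member accession names).
    clusters = [(name, [name]) for name in sorted(profiles)]

    def avg_dist(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        pairs = [(m, n) for m in c1[1] for n in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical 1/0 band scores for four accessions (six band positions).
profiles = {
    "acc1": [1, 1, 1, 0, 1, 0],
    "acc2": [1, 1, 1, 0, 1, 1],
    "acc3": [1, 0, 0, 1, 1, 0],
    "acc4": [1, 0, 0, 1, 0, 0],
}
tree = upgma(profiles)
print(tree)
```

In this toy example the two accessions sharing most bands are joined first, and the tree groups acc1 with acc2 and acc3 with acc4. Ward's method uses a different merging criterion (minimum increase in within-cluster variance) and would need its own implementation.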

Results
In this paper, 150 local rice (Oryza sativa L.) accessions originating from different regions of Pakistan, including the north-western regions (Province of KPK, including Parachinar and Kurrum agency), the northern areas (Gilgit-Baltistan) and Punjab province, were analyzed for phylogenetic relationship and genetic diversity using SDS-PAGE of total seed proteins (Table 1). Most of these (126) were from KPK Province, including the check variety Fakhr-e-Malakand, and only 24 were from the other regions of the country.

Optimization of protein extraction method
Three different protein extraction protocols were used and the procedure was optimized. The electrophoregrams generated by the first protocol and the third (Ranjan et al. 2012) showed clear bands; however, we recommend the use of the second protocol with PBS pH 7.4 (Walczyk et al. 2017). We found that defatting the seed meal with n-hexane improved the protein extraction process, leaving no floating fats on the surface, and gave clear bands.

SDS-PAGE based Polymorphism
In this study, the polymorphism present in 150 local rice (Oryza sativa L.) accessions of Pakistan was analyzed biochemically for seed storage proteins using SDS-PAGE. A large amount of genetic variability was seen in the polypeptide banding pattern, band size (10 to 250 kDa) and band number (14 to 27). Both major and minor bands were considered; the results showed comparatively significant variation in minor bands, although there was also some variation in major bands (Figs. 1 and 2) (Tahir et al. 2014). For ease of scoring, the entire banding pattern was divided into four zones: zone A (60 to 250 kDa), B (30 to 60 kDa), C (18 to 30 kDa) and D (10 to 18 kDa), as shown in Figs. 1 and 2. The maximum number of bands was found in zones A and B (11 bands each), followed by C (8) and D (4) in different accessions.
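As an illustration of the zone-based scoring, the sketch below assigns bands to zones A to D by molecular weight and counts them per zone. The zone limits are those given in the text; treating each zone as a half-open interval (lower limit included, upper limit excluded, with 250 kDa kept in zone A) is an assumption, since the handling of boundary values is not stated, and the example band weights are hypothetical.

```python
# Zone limits in kDa as given in the text. How a band lying exactly on a
# boundary (e.g. 60 kDa) is scored is an assumption made for this sketch.
ZONES = [("A", 60, 250), ("B", 30, 60), ("C", 18, 30), ("D", 10, 18)]

def assign_zone(mw_kda):
    """Return the zone letter for a band of the given molecular weight."""
    for name, lower, upper in ZONES:
        if lower <= mw_kda < upper or (name == "A" and mw_kda == upper):
            return name
    return None  # outside the 10 to 250 kDa marker range

def count_by_zone(band_weights):
    """Count how many bands of one accession fall in each zone."""
    counts = {name: 0 for name, _, _ in ZONES}
    for mw in band_weights:
        zone = assign_zone(mw)
        if zone is not None:
            counts[zone] += 1
    return counts

# Hypothetical band weights (kDa) for a single accession.
example_bands = [150, 75, 60, 45, 30, 25, 18, 12]
zone_counts = count_by_zone(example_bands)
print(zone_counts)
```

Summing such per-zone counts over all scored band positions gives zone totals of the kind reported here.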
In total, 34 bands were observed during analysis, with a minimum of 14 and a maximum of 27 bands in individual accessions (Table 2). Out of the 34 bands, 10 (29.4%) were monomorphic and 24 (70.6%) were polymorphic.
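The monomorphic/polymorphic split can be derived directly from the binary score matrix. The sketch below assumes the usual definition (a band present in every accession is monomorphic; a band present in some but not all is polymorphic); the toy matrix is hypothetical and the function names are illustrative only.

```python
def classify_bands(matrix):
    """Split band positions into monomorphic and polymorphic.

    matrix: list of accession rows, each a list of 0/1 scores per band.
    A band present in every accession is monomorphic; a band present in
    some but not all accessions is polymorphic. Bands absent everywhere
    are ignored.
    """
    mono, poly = [], []
    for b in range(len(matrix[0])):
        column = [row[b] for row in matrix]
        if all(v == 1 for v in column):
            mono.append(b)
        elif any(v == 1 for v in column):
            poly.append(b)
    return mono, poly

def percent(count, total):
    """Share of `count` in `total`, as a percentage."""
    return 100.0 * count / total

# Hypothetical scores: three accessions, four band positions.
toy_matrix = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
]
mono, poly = classify_bands(toy_matrix)
print(mono, poly)
print(round(percent(len(poly), len(mono) + len(poly)), 1))
```

Applied to the counts reported here, `percent(10, 34)` and `percent(24, 34)` give the monomorphic and polymorphic shares of the 34 scored bands.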
In zone A, nine bands showed polymorphism, while eight, six and one bands were polymorphic in zones B, C and D respectively. In the present investigation, the maximum number of bands was seen in the accessions

Hierarchical Clustering

Cluster analysis for the whole germplasm
Cluster analysis (Ward's method; Ward and Amer 1963) for the whole germplasm, consisting of 150 rice landraces of Pakistan, was carried out using binary data processed by SPSS software. A dendrogram was developed which divided the whole germplasm into two major clusters: cluster A consisted of 58 accessions, while the remaining 92 genotypes grouped in cluster B. Each was further classified into two sub-clusters (A1, A2 and B1, B2) having 42, 16, 48 and 44 accessions respectively. The check variety Fakhr-e-Malakand (FM) was grouped in cluster B2, as shown in Table 4. Jaccard's dissimilarity coefficient ranged from 0.09 (lowest) to 0.94 (highest), and the similarity coefficient from 0.009 to 0.635, revealing high genetic diversity among the rice landraces of Pakistan.

Cluster analysis for the accessions of Province KPK
In addition to the polymorphism study of the whole germplasm, a phylogenetic study of the 126 accessions of KPK was carried out and analyzed separately by cluster analysis.
A dendrogram was developed which divided the KPK germplasm into two major clusters: cluster A consisted of 105 accessions and cluster B of 21 accessions. Cluster A was further divided into two sub-clusters (A1, A2) with 59 (A1a: 26, A1b: 33) and 46 (A2a: 16, A2b: 30) accessions, while cluster B was divided into two sub-clusters (B1, B2) with 6 and 15 accessions respectively. The check variety Fakhr-e-Malakand (FM) was grouped in A2b (Fig. 5 and Tables 5 and 6). Pair-wise genetic distance, calculated to assess the genetic diversity, ranged from 0.185 (lowest) to 0.965 (highest), and the similarity coefficient from 0.018 to 0.632, thus revealing intraspecific genetic diversity within the accessions of KPK.

Discussion
In this study, 150 diverse local rice (Oryza sativa L.) accessions originating from different regions of Pakistan were analyzed for genetic diversity using SDS-PAGE. The largest number of accessions (126) was from the province of KPK, including the check variety Fakhr-e-Malakand. Fakhr-e-Malakand, a cold-tolerant and early-maturing japonica rice variety, was introduced in 2003 by the government of KPK, Pakistan; it is cultivated at altitudes above 1000 meters and irrigated by snow-fed rivers. It gives higher head rice (56.8%) on milling, a higher quality index (1.33) and the best results in comparison with other rice varieties under system of rice intensification (SRI) practices (Ahmad and Khan 2016). Fakhr-e-Malakand produced heavier grains and higher yield than other varieties in a study of sowing dates in relation to bacterial leaf blight (Rafi et al. 2013). The outstanding features of Fakhr-e-Malakand, such as higher yield, lodging and pest resistance and cold tolerance, along with a higher Benefit Cost Ratio of 3.24, have made it the most popular and economical commercial variety among the rice farming community (Ahmad et al. 2015).
To obtain clear bands on the polyacrylamide gel, the protein extraction method was first optimized. Three different protocols gave different results. One protocol gave denser bands and another sharper bands, on the basis of which we propose the use of the second protocol with PBS pH 7.4 (Ranjan et al. 2012; Walczyk et al. 2017). Defatting the seed meal with n-hexane improved the protein extraction process and gave clear bands. Hirano et al. (2000) and Walczyk et al. (2017) reported the same results and found that defatting peanut meal with n-hexane resulted in significantly higher yields of crude proteins and allergens than other methods. Isotonic solutions such as PBS are extensively used in the extraction of lectin proteins and food allergens, and have resulted in greater sensitivity of molecular analysis than Tris-Borate EDTA (TBS) or Tris. The yields of the compounds of interest were also affected by the composition of the different buffers, their pH and other conditions.
The 150 local rice (Oryza sativa L.) accessions of Pakistan analyzed for seed storage proteins using SDS-PAGE showed significant polymorphism. A total of 14 to 27 bands were found in each accession, with sizes ranging from 10 to 250 kDa. Pervaiz et al. (2010-11) made one of the efforts to explore the genetic polymorphism of some of Pakistan's landraces based on biochemical and molecular markers, including Wx gene products (60 kDa), glutelin, albumin, globulin and prolamine polypeptide subunits, as well as RAPD fragments. Shah et al. (2011) also reported 34 bands in different species of rice including Oryza sativa L. However, we did not see 34 polypeptide bands in any single accession; the maximum we observed was 27 bands.
Rice accessions originating from district Chitral showed more genetic variation than those from other districts of the country (Bibi et al. 2017). Pakistan's rice germplasm has proved so diverse that more variation may exist than has been explored to date. After Shah et al. (2011), the highest numbers of polypeptide bands in Pakistani landraces were reported by Habib et al. (2000), with 32, and Pervaiz et al. (2011), with 25, followed by Asghar et al. (2004), who observed 22 bands, and Bibi et al. (2017), who reported 18 bands overall. Sharief et al. (2005) reported a minimum of 12 bands. The primary reason for the variation in the number of polypeptide subunits reported by other authors is the diversity of the rice germplasm of Pakistan from different geographical locations. It may also be due to differences in protein extraction buffer (we used three different compositions), in other laboratory protocols and in gel composition for the evaluation of rice seed proteins (Asghar et al. 2004).
No major relationship between geographical location and variation in seed protein profile was observed. The rice genotypes of China have been found to be relatively low in genetic diversity compared with the landraces of the South Asian countries (Swain et al. 2017; Warusawithana et al. 2017).
A phylogenetic tree was developed for the 150 rice accessions, which divided the whole germplasm into two major groups with 58 and 92 genotypes respectively, with the check variety Fakhr-e-Malakand in the second group (Ward and Amer 1963). Jaccard's dissimilarity coefficient (0.09 to 0.94) and similarity coefficient (0.009 to 0.635) revealed high genetic diversity among the rice genotypes of Pakistan. On the basis of hierarchical clustering, the beneficial characteristics of FM are expected to be present in the accessions grouped with it, owing to their low genetic distance from it (Ahmad et al. 2016). Similar work on Oryza sativa L. has been reported (Bibi et al. 2017). Accessions from different geographical zones were clustered together in the same groups. This suggests that the protein expression profile is little affected by environment, as reported by a range of studies on rice and other cereal crops. Some work has also linked protein markers to agronomic traits clustering in the same group, which needs to be explored further (Pervaiz et al. 2011). Cluster analysis of the 126 accessions of KPK was also carried out, and a cladogram was developed which divided the KPK germplasm into two major groups of 105 and 21 accessions respectively, with the check variety Fakhr-e-Malakand in the first group (A2b). Pair-wise genetic distance (0.185 to 0.965) and similarity coefficient (0.018 to 0.632) also indicated intraspecific genetic diversity within the KPK accessions.
As FM is an improved variety with desirable characteristics, the same traits are also expected to be present in group A2b, owing to the low genetic distance of its accessions from FM (Ahmad et al. 2015; Ahmad et al. 2016). Pervaiz et al. (2011) reported a similarity coefficient of 0.81; this low genetic distance may be attributed to a shared genetic background, long-established farming practices and consumer preferences.

Conclusion
The findings of this work reveal a considerable amount of genetic diversity in 150 landraces of O. sativa L. from Pakistan, and in those of the Province of Khyber Pakhtunkhwa in particular. This diversity offers an opportunity for marker-aided selection, breeding strategies and the improvement of modern rice cultivars, which are very low in genetic variability. The information obtained from the clustering behaviour of the accessions can help in designing plans for their management in the gene bank. The desirable traits in these landraces can also be identified and exploited to ensure food security and to cope with changing environmental conditions, enabling efficient utilization and emphasizing the conservation of the germplasm resources of Pakistan for the betterment of humanity.