Identification and Expression Analysis of DUF4228 Domain Containing (DDP) Genes in Potato Under Abiotic and Phytohormone Stress

DUF4228 (domain of unknown function 4228) proteins are widely distributed in plants and play a vital role in abiotic stress response. In potato (Solanum tuberosum), the DUF4228 gene family has been little studied. To understand the role of DUF4228 in potato, a comprehensive genome-wide analysis of the potato genome was carried out, and the relative expression of StDUF4228 genes was evaluated in various plant tissues. In the present study, 31 StDUF4228 genes were identified and clustered into six groups. Promoter analysis of the StDUF4228 genes revealed various cis-acting regulatory elements responsive to abiotic and phytohormone stresses. Analysis of selection pressure showed that duplicated StDUF4228 genes experienced both positive and purifying selection. Expression profiling indicated that StDUF4228 genes were broadly expressed in various tissues and organs of potato. Moreover, the expression profiles of StDUF4228 genes under salt and heat treatments were characterized using RNA-seq data. StDUF4228-4, StDUF4228-14, StDUF4228-21, and StDUF4228-30 were highly expressed under salt and heat stress. Furthermore, 23 genes were expressed under IAA treatment and 18 under ABA treatment, suggesting that IAA and ABA play a vital role in stress responses. RT-qPCR and expression profiling showed high expression in root tissues. Moreover, multiple miRNAs target StDUF4228 genes in potato. These expression results provide primary reference data and functional analysis data for other commercial crops.


Introduction
Plants often experience various environmental stresses, including biotic stresses (diseases, insect pests, weeds, etc.) and abiotic stresses such as high and low temperature, drought, and salinity (Zaynab et al., 2018; Zaynab et al., 2021b). These factors impair plant development and growth, severely affecting plant productivity and yield quality (Zaynab et al., 2018). Therefore, it is important to understand stress damage and stress response mechanisms in plants in order to improve their resistance (Chen et al., 2012; Zaynab et al., 2020). Abiotic stress factors are considered the chief limiting factors affecting plant yield and growth, causing significant economic and agricultural losses worldwide. The estimated annual yield loss due to unfavorable chemical and physical environments is about 70%.
DUF proteins have various functions in plants, and recent studies have revealed that some DUFs play vital roles in the response of different plants to abiotic stress. In rice, the DUF1644 gene OsSIDP366 is significantly regulated under salinity and drought stress, and transgenic plants overexpressing OsSIDP366 showed higher tolerance to salinity and drought. In transgenic A. thaliana and O. sativa, OsSGL confers improved drought tolerance, and the expression of various stress-responsive genes was significantly increased in the transgenic rice (Cui et al., 2016). Moreover, other DUF genes associated with abiotic stress have been identified in O. sativa, for example OsDUF810 (Li et al., 2018), OsDSR2 (DUF966) (Luo et al., 2014), and SIDP361 (DUF1644). In A. thaliana, the DUF231 gene family member ESK1 (AT3g55990) was identified as a novel negative regulator of cold adaptation (Xin et al., 2007). Further analysis in A. thaliana showed that suppressing the expression of ATRDUF1 and ATRDUF2 (both RING-DUF1117 E3 ubiquitin ligases) reduced tolerance to ABA-mediated drought stress (Kim et al., 2012). Overexpression of the MsDUF (Medicago sativa DUF) gene in tobacco plants increased the MDA (malondialdehyde) content and significantly reduced the soluble sugar and chlorophyll contents. MsDUF expression was significantly lowered under different treatments, including gibberellic acid (GA), abscisic acid (ABA), PEG6000, and NaCl, indicating that MsDUF acts as a negative regulator of stress resistance in Medicago sativa (Wang et al., 2018). Expression of the CiDUF4228-3 (Caragana intermedia DUF4228-3) gene was up-regulated under drought, dehydration, and low temperature, indicating its involvement in stress responses (Leng et al., 2021). DUF4228 domain-containing proteins are found exclusively in plant genomes.
As of March 2019, the Pfam database listed a total of 2882 DUF4228 (Pfam accession: PF14009) family genes from 80 different species. So far, however, only one DUF4228 gene has been functionally described in plants. The novel stress-responsive gene MsDUF, a DUF4228 family member from Medicago sativa, plays a negative regulatory role in osmotic stress and seed vigor (Wang et al., 2018). Recent studies have shown that DUF4228 genes are involved in cadmium tolerance and abiotic stress responses. Analysis of AtDUF4228 gene expression under various stress treatments, together with co-expression networks of the DUF4228 gene family in various plant species, indicated a synergistic role of DUF4228 genes in the plant immune response (Didelon et al., 2020). These results suggest that the DUF4228 gene family may also be associated with the abiotic stress response in soybean.
Potato is an important economic crop and is widely used as food throughout the world (Zaynab et al., 2021a). However, as with other plants, potato yield is vulnerable to biotic and abiotic stresses (Dahal et al., 2019; Sattar et al., 2021). Until now, very little has been known about DUF4228 genes in the potato genome. Therefore, the current study involved a genome-wide screening of DUF4228 genes in the potato genome. In addition, several bioinformatics analyses were carried out to explore the basic and advanced features of the DUF4228 genes, including gene structure, chromosome localization, phylogenetic relationships, conserved protein domains, regulatory networks, expression profiles in different tissues, and relative expression measured by real-time qPCR. The outcomes of this study will provide a basis for further functional analysis of potato DUF4228 genes and contribute to a better understanding of their molecular mechanisms.

Materials And Methods
Identification of DUF4228 genes in the potato genome
In the present study, Solanum tuberosum genome data were obtained from Phytozome (https://phytozome.jgi.doe.gov/) and Arabidopsis genome data were retrieved from TAIR (http://www.arabidopsis.org/). To find the StDUF4228 protein sequences, the Pfam profile PF14009 was searched with HMMER 3.0. StDUF4228 sequences were also obtained with the BLASTP program, using the Arabidopsis protein sequences as queries. The StDUF4228 protein sequences were then verified on the basis of their conserved domains using SMART (http://smart.embl-heidelberg.de). Physicochemical properties of each protein, such as molecular weight (kDa) and isoelectric point (pI), were determined using ExPASy (http://web.expasy.org/protparam/).

Multiple sequence alignment and phylogenetic analysis
All identified protein sequences were aligned using MUSCLE with 16 iterations. The aligned sequences were then used to construct a phylogenetic tree with a maximum-likelihood approach and 1000 bootstrap replicates.

Gene structure and motif analysis
The Gene Structure Display Server was used to identify the exon-intron characteristics of the StDUF4228 gene family. Protein motifs were identified with the Multiple Expectation Maximization for Motif Elicitation (MEME) software (http://meme-suite.org/). The maximum number of motifs was set to 10, and all other parameters were left at their default values. The results were displayed graphically with the Amazing Optional Gene Viewer module of the TBtools software.

Cis-elements Analysis and Chromosomal locations
To study the cis-elements in the promoter sequences of the 31 StDUF4228 genes, the genomic sequence 2,000 bp upstream of the start codon of each StDUF4228 gene was submitted to the PlantCARE website (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/). Information on the chromosomal location of each identified StDUF4228 gene was retrieved from the PGSC website.
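As an illustration of the promoter extraction step, the following minimal Python sketch cuts out the region upstream of a start codon from a chromosome sequence. It is not the pipeline used in this study: the function names, the 1-based inclusive coordinate convention, and the toy sequence are assumptions made for the example.

```python
# Illustrative sketch only; names and coordinate conventions are assumptions.
# Coordinates are 1-based and inclusive, as in GFF3 annotation files.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_sequence(chrom_seq: str, cds_start: int, cds_end: int,
                      strand: str, length: int = 2000) -> str:
    """Return up to `length` bp upstream of the start codon.

    On the "+" strand the start codon begins at cds_start, so the promoter
    lies at lower coordinates. On the "-" strand the start codon sits at
    cds_end, so the promoter lies at higher coordinates and is
    reverse-complemented to read 5'->3' relative to the gene.
    """
    if strand == "+":
        up_start = max(1, cds_start - length)
        up_end = cds_start - 1
        return chrom_seq[up_start - 1:up_end]
    if strand == "-":
        up_start = cds_end + 1
        up_end = min(len(chrom_seq), cds_end + length)
        return reverse_complement(chrom_seq[up_start - 1:up_end])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_chromosome = "ACGTACGTTTGGCCAA"  # hypothetical 16 bp sequence
    print(promoter_sequence(toy_chromosome, 9, 12, "+", length=4))
    print(promoter_sequence(toy_chromosome, 5, 8, "-", length=4))
```

The region is truncated at the chromosome ends, so genes close to a chromosome boundary yield a promoter shorter than the requested length.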

Synteny and Selective Pressure Analysis
For visualization of genome conservation, synteny analysis was performed with the Circoletto tool (tools.bat.infspire.org/circoletto/). The coding sequences of the duplicated genes were aligned in MEGA7 using the MUSCLE (codon) method. The nonsynonymous and synonymous substitution rates (Ka, the number of nonsynonymous substitutions per nonsynonymous site, and Ks, the number of synonymous substitutions per synonymous site) were computed with the KaKs-Calculator 2.0 software using the MYN method. Further, the divergence time was computed as t = Ks/(2r), with a substitution rate of r = 2.6 × 10^-9 [31].
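The divergence-time formula can be checked with a short calculation. The Python sketch below is illustrative only: the function name is hypothetical, the rate is assumed to be expressed in substitutions per synonymous site per year, and the Ks inputs are example values rather than a reproduction of the study's calculations.

```python
# Illustrative sketch; the function name and the unit of the rate
# (substitutions per synonymous site per year) are assumptions.

SUBSTITUTION_RATE = 2.6e-9  # r, as quoted in the text


def divergence_time_mya(ks: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Estimate divergence time in million years from t = Ks / (2r)."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate <= 0:
        raise ValueError("rate must be positive")
    years = ks / (2.0 * rate)
    return years / 1e6


if __name__ == "__main__":
    for ks in (0.52, 2.35):  # example Ks values
        print(f"Ks = {ks:.2f} -> {divergence_time_mya(ks):.1f} MYA")
```

Under these assumptions, a Ks of 0.52 corresponds to roughly 100 million years, and each additional 0.1 in Ks adds about 19 million years.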

Identification of miRNA
The coding sequences (CDS) of the StDUF4228 genes were used to identify possible targeting miRNAs in the psRNATarget database (http://plantgrn.noble.org/psRNATarget/) with default parameters.

Expression analysis of StDUF4228 Genes
For expression analysis of the StDUF4228 genes, FPKM (fragments per kilobase of transcript per million mapped reads) values were examined in different tissues, namely roots, stems, and leaves. The FPKM values were used to construct a heatmap with TBtools [32]. Moreover, gene expression was measured by RT-qPCR in potato. Biosystem. The qRT-PCR thermal profile was 95 °C for 10 min, followed by 40 cycles of 94 °C for 30 s and 58 °C. Elongation factor 1-alpha was used as the housekeeping gene. The analysis was carried out with three biological replicates, and the data were analyzed using the 2^-ΔΔCt method [33]. The standard errors of the replicates are shown graphically.
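The 2^-ΔΔCt calculation can be sketched in a few lines of Python. The sketch is illustrative only: the function names are hypothetical, the Ct values are invented for the example, and the choice of calibrator sample is an assumption, since the text does not state which sample served as the baseline.

```python
# Illustrative sketch of the 2^-ddCt calculation; all Ct values below are
# invented, and the calibrator sample is an assumption.
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float,
                        ct_ref_calibrator: float) -> float:
    """Return the fold change 2^-ddCt of a target gene.

    dCt = Ct(target) - Ct(reference gene, e.g. elongation factor 1-alpha)
    ddCt = dCt(sample) - dCt(calibrator)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


def mean_and_standard_error(values: list[float]) -> tuple[float, float]:
    """Return the mean and standard error of replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Three hypothetical biological replicates: (target Ct, reference Ct).
    root = [(24.1, 18.0), (23.9, 18.1), (24.0, 17.9)]
    leaf = [(26.0, 18.0), (26.2, 18.1), (25.9, 18.0)]
    folds = [relative_expression(rt, rr, lt, lr)
             for (rt, rr), (lt, lr) in zip(root, leaf)]
    avg, se = mean_and_standard_error(folds)
    print(f"fold change root vs leaf: {avg:.2f} +/- {se:.2f}")
```

The method assumes that the amplification efficiencies of the target and reference genes are close to 100%, so that each cycle difference corresponds to a two-fold change in template.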

Identification of StDUF4228 Genes
To identify S. tuberosum DUF4228 genes, the S. tuberosum protein database was searched with HMMER using the Pfam profile PF14009. In addition, local BLASTP searches were carried out to query for StDUF4228 genes. The conserved domain of each candidate gene was predicted with the SMART database. A total of 31 StDUF4228 genes were identified. For the S. tuberosum StDUF4228 proteins, basic features including gene location, gene number, molecular weight (MW), and isoelectric point (pI) were examined (Table 1). The identified StDUF4228 proteins ranged in length from 104 to 318 amino acids, with isoelectric points from 5.49 to 10.25 and molecular weights from 11,513.26 Da to 35,578.61 Da.

Phylogenetic analysis of StDUF4228
The evolutionary relationships of the 31 StDUF4228 proteins were investigated by constructing a phylogenetic tree in MEGA 7.0 using the neighbor-joining method. The multiple sequence alignment revealed critical conserved residues in DUF4228 (Fig. 1). The phylogenetic tree was divided into six groups (Group 1 to Group 6), of which Group 4 was the largest. The StDUF4228 genes were distributed across all six groups (Fig. 2).

Gene structure organization and motif analysis
To support the phylogenetic reconstruction, exon-intron structure analysis was performed by comparing the genomic and coding DNA sequences. The distribution, length, and number of exons and introns varied among the genes (Fig. 2a); StDUF4228-9 was the longest and StDUF4228-5 the shortest of all StDUF4228 genes (Fig. 3B). To characterize the architecture of the StDUF4228 proteins in potato, the StDUF4228 amino acid sequences were submitted to the MEME software for online motif analysis. The analysis identified 10 conserved motifs (Fig. 3A). All StDUF4228 genes contained motif 1, and motif 2 was present in all genes except StDUF4228-8. Motif 1 has 29 amino acids, motifs 6 and 10 have 41 each, motif 2 has 21, and motif 9 has 39.

Cis-Element Analysis and Chromosomal distribution
For the cis-regulatory element analysis, the 2,000 bp promoter region of the StDUF4228 genes shown in Fig. 2 was used. The cis-element analysis with PlantCARE predicted that most cis-acting sites belonged to three groups: phytohormone-responsive, growth- and development-related, and stress-responsive (Lescot et al., 2002). For instance, MYB binding sites and light-responsive elements were the main growth- and development-related elements. Among the stress-responsive elements, anaerobic induction elements and defense and stress-responsive elements were enriched in the promoters. Among the phytohormone-responsive elements, GBRE, ABRE, and MeJRE were highly enriched. These results indicate that the expression of StDUF4228 genes is regulated by different cis-regulatory elements (Fig. 4). The chromosomal distribution showed that the StDUF4228 genes are unevenly distributed across the chromosomes. The highest number of genes, three each, was found on Chr2 and Chr4, while Chr3 had two genes and Chr1 had only one gene (Table 1).

Gene Duplications of DUF4228 Genes
To study the rate of molecular evolution, Ka/Ks values were computed for all duplicated gene pairs. Ka/Ks > 1 was taken to indicate positive selection, Ka/Ks < 1 purifying selection, and Ka/Ks = 1 neutral selection (Yang and Bielawski, 2000). Our results show that most of the duplicated DUF4228 genes underwent purifying selection during the duplication process, implying that the function of the duplicated DUF4228 genes might not have changed significantly during subsequent evolution. In addition, the divergence time between the members of each duplicated gene pair was estimated. The vast majority of duplicated DUF4228 genes showed a Ks value > 0.52, corresponding to a divergence time that may exceed 100 MYA (million years ago). Interestingly, the Ks value for the duplicated pair StDUF4228-19/StDUF4228-23 was 2.35, corresponding to an estimated duplication time of 452.41 MYA (Table 2).
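The Ka/Ks decision rule described above can be written as a small Python function. This is an illustrative sketch only: the function name, the tolerance used to treat a ratio as equal to 1, and the example values are assumptions, not values from Table 2.

```python
# Illustrative sketch of the Ka/Ks decision rule; the tolerance and the
# example values are assumptions made for the example.

def classify_selection(ka: float, ks: float, tolerance: float = 1e-9) -> str:
    """Classify a duplicated gene pair by its Ka/Ks ratio.

    Ka/Ks > 1 -> positive selection
    Ka/Ks < 1 -> purifying selection
    Ka/Ks = 1 -> neutral selection (within `tolerance`)
    """
    if ka < 0:
        raise ValueError("Ka must be non-negative")
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


if __name__ == "__main__":
    hypothetical_pairs = {"pair A": (0.20, 0.80), "pair B": (1.50, 0.50)}
    for name, (ka, ks) in hypothetical_pairs.items():
        print(name, classify_selection(ka, ks))
```

In practice an exact ratio of 1 is rare, so statistical tests of neutrality are usually preferred over a fixed tolerance.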
Notably, StDUF4228-3, StDUF4228-7, StDUF4228-19, StDUF4228-20, and StDUF4228-30 were predicted to be targeted by a greater number of miRNAs. The expression levels of these miRNAs and their target genes require validation in further research to determine their biological roles in the potato genome.

Tissue-specific expression analysis of StDUF4228 genes
To examine the expression of StDUF4228 genes in different tissues, including root, stem, and leaf, publicly available potato RNA-seq data were analyzed. Our results showed that most StDUF4228 genes had relatively high transcript abundance in these three tissues (Fig. 6). Furthermore, some members were expressed in only one tissue and not in the others. StDUF4228-3, StDUF4228-4, and StDUF4228-30 were expressed in all tissues (root, stem, and leaf). The largest number of StDUF4228 genes was expressed in the root (15), followed by the stem (14) and the leaves (8). StDUF4228-20 showed high expression in root tissue, StDUF4228-30 showed high expression in leaf tissue, and StDUF4228-3 expression was highest in the stem. The expression profiling data of all potato StDUF4228 genes were used to construct the heatmap shown in Fig. 6. Overall, our results showed that the genes were differentially expressed in the three organs and highly expressed in stem and root tissues.

StDUF4228 Genes Expression Patterns in Response to Heat, Salt, and Phytohormones
From the transcriptional expression patterns of StDUF4228 genes in potato during heat stress, we deduced that a few StDUF4228 genes might be involved in the heat stress response. Under heat stress, StDUF4228-4 and StDUF4228-21 were strongly up-regulated (Fig. 7), with StDUF4228-4 showing higher expression than StDUF4228-21. In total, 16 StDUF4228 genes were expressed in response to heat treatment. Under salt stress, StDUF4228-28 and StDUF4228-21 were strongly up-regulated (Fig. 7), with StDUF4228-28 showing higher expression than StDUF4228-21. In total, 23 StDUF4228 genes were expressed in response to salt treatment.
Indole acetic acid (IAA) and abscisic acid (ABA) were chosen to investigate the transcriptional responses of StDUF4228 genes to hormone treatments. When leaf tissue was treated with ABA, 18 StDUF4228 genes were expressed; of these, StDUF4228-21 and StDUF4228-30 showed higher expression. Likewise, when leaf tissue was treated with IAA, 23 genes were expressed; StDUF4228-21 and StDUF4228-30 again showed higher expression, with StDUF4228-30 more highly expressed than StDUF4228-21 (Fig. 8). Furthermore, when leaf tissue was treated with GA3, 23 genes were expressed; StDUF4228-28 and StDUF4228-30 showed higher expression, with StDUF4228-30 more highly expressed than StDUF4228-28.

mRNA expression of StDUF4228 Genes
In potato, the transcriptional expression profiles of StDUF4228 genes were validated by qRT-PCR in three tissue types (stem, root, and leaf) (Fig. 9). The validation was carried out using the StDUF4228-3, StDUF4228-4, and StDUF4228-30 genes. The comparative expression pattern showed that StDUF4228-4 had higher expression in the root, while the relative expression of StDUF4228-30 was higher in leaf than in root and stem (Fig. 9). However, little information is available for the potato DUF4228 gene members. In this article, a total of 31 StDUF4228 genes were identified from the potato genome sequence.

Discussion
All identified StDUF4228 genes were classified into six groups, namely Group 1, Group 2, Group 3, Group 4, Group 5, and Group 6. The comparative phylogenetic analysis revealed that the organization of St, Sl, and At proteins was relatively similar in all six groups, indicating that the StDUF4228 genes in these groups may have descended from a common ancestor. Structural analysis of the StDUF4228 proteins will inform their functional analysis. Exon-intron arrangement is an evolutionary relic that shapes the evolution of gene families (Flagel and Wendel, 2009; Moore and Purugganan, 2005). This is consistent with previous findings that, among the genes retained in plants during evolution, some lack introns and others have only short introns (Mattick and Gagen, 2001). Genes with few or no introns show lower expression levels in plants (Mattick and Gagen, 2001). A microarray experiment on cotton revealed that miR827 plays a crucial role in salt stress responses (Covarrubias and Reyes, 2010). miR167 was previously reported to have key roles in abiotic stress responses (Khraiwesh et al., 2012). In short, these reports support our results and suggest that miRNAs might play pivotal roles against several stresses by altering the transcript levels of StDUF4228 genes in potato.
Genome duplication, gene distribution, and genome size are the chief features of genetic diversity among land plants. Gene duplication has long been recognized as a factor in the evolutionary origin, complexity, and expression of gene families. Some duplication events were also detected among the StDUF4228 genes, and these play an important role in the expansion of the StDUF4228 family. Gene duplication is an important factor in the diversification, expansion, and neofunctionalization of gene families (Lavin et al., 2005); similarly, the mapping and chromosomal distribution of StDUF4228 genes will help potato breeders develop novel potato varieties with desired traits. To determine whether the DUF4228 genes respond to hormonal signaling in potato, potato leaves were treated with ABA and IAA and gene expression was examined. After IAA and ABA treatment, genes were induced, indicating different functions of different StDUF4228 family members in IAA- and ABA-induced immune responses. In total, 18 StDUF4228 genes were up-regulated under ABA treatment and 23 under IAA and GA3 treatment, suggesting that IAA and ABA play a vital role in this immune response. Phytohormones have also been reported to affect the AtDUF4228 genes, which supports our results. The expression of genes and their clusters also indicated a strong correlation between gene clusters and gene expression in different tissues under different stress factors. This co-expression and co-occurrence point to their putative function in the adaptation of plants to diverse environmental stress factors.

Conclusion
Overall, a total of 31 DUF4228 genes were identified in potato. Comparative evolutionary analysis revealed six major groups in the DUF4228 family. Conserved functional and structural motifs were present in each StDUF4228 protein, with slight differences among members and groups. The presented results provide a deeper understanding of the major challenges potato plants face under abiotic stresses. In potato, StDUF4228-21 and StDUF4228-4 showed higher expression in response to heat and salt stress. The expression of 23 genes under IAA treatment and 18 genes under ABA treatment suggests that IAA and ABA play a vital role in the immune response. These results advance our understanding of the function of the StDUF4228 gene family and provide a foundation for further analysis of the molecular role of StDUF4228 genes in the abiotic stress response of potato.
Numbers on the nodes represent bootstrap values.
Cis-acting elements' distribution in the regulatory regions of StDUF4228.
Expression profiling of StDUF4228 genes in response to heat and salt.

Figure 8
Expression profiling of StDUF4228 genes in response to IAA, GA3, and ABA.

Figure 9