Comprehensive Genome-Wide Analysis of the Catalase Enzyme Toolbox in Potato (Solanum tuberosum L.)

As antioxidant enzymes, catalases (CAT) protect organisms from oxidative stress caused by excessive production of reactive oxygen species (ROS). These enzymes play important roles in diverse biological processes. However, little is known about the CAT genes of S. tuberosum (StCAT) despite the crop's economic importance worldwide. Abiotic and biotic stresses severely hinder plant growth and development, which reduces crop production and quality. To identify the possible roles of CAT genes, especially under salt stress, a genome-wide analysis of the CAT gene family was performed in potato. In this study, the gene structure, secondary and 3D protein structure, physicochemical properties, synteny, phylogeny and expression profiles under various developmental and environmental cues of the StCAT genes were predicted using bioinformatics tools. Expression analysis by RT-PCR was performed on a commercial potato cultivar. Three StCAT genes, each interrupted by seven introns and each encoding a protein of 492 aa, were identified in potato. StCAT proteins were predicted to localize to the peroxisome, which is considered the main site of cellular H2O2 production during different processes. Many cis-regulatory elements related to stress responses and plant hormone signaling were found in the promoter sequence of each gene. The analysis of motifs and phylogenetic trees showed that StCATs are closest to their homologues in S. lycopersicum and share 41–95% identity with CATs from other plants. Expression profiling revealed that StCAT1 is the constitutively expressed member, while StCAT2 and StCAT3 are the stress-responsive members.


Introduction
The yield of a crop is influenced not only by genetic factors but also by environmental factors that could significantly affect crop growth by inducing morphological and physiological alterations in plants (Tuteja 2007). Abiotic and biotic stresses negatively influence survival, biomass production and crop yield (Hirayama and Shinozaki 2010).
In general, stress leads to the formation of radicals and to the induction of secondary oxidative stress (Iqbal et al. 2014). Reactive oxygen species (ROS) are produced during aerobic cellular processes (Kim et al. 2003). However, an imbalance between antioxidants and pro-oxidants increases free radicals in tissues (Apel and Hirt 2004; Vellosillo et al. 2010). In fact, ROS are continually produced by different metabolic pathways in different cellular compartments, especially the chloroplasts, mitochondria, peroxisomes and the endoplasmic reticulum. About 1 to 2% of the oxygen consumed by the plant leads to the formation of ROS (Bhattacharjee 2005; Sharma et al. 2012; Bose et al. 2014).
ROS accumulation, at high concentrations, causes damage to cells, whereas at low concentrations, they essentially serve as second messengers in intracellular signaling cascades of stress responses (Karuppanapandian et al. 2011;Bhattacharjee 2012). Plants grown under optimal conditions can control ROS level via efficient antioxidant systems.
Under various stress conditions, this redox balance is upset by excessive levels of ROS or depletion of antioxidant defence systems or both, leading to the collapse and cell death of plants (Karuppanapandian et al. 2011). When the level of ROS exceeds the defence capacity, the resulting oxidative stress status can cause damage to biomolecules such as DNA, proteins and lipids (Sharma et al. 2012).
The activation of a variety of enzymatic and non-enzymatic antioxidant systems constitutes an important plant adaptation strategy to overcome oxidative stress (Parida and Das 2005; Noctor et al. 2012). Indeed, a wide range of enzymes can be induced to cope with the stress condition. These include superoxide dismutase (SOD), the enzymes of the ascorbate-glutathione cycle and catalase (CAT).
Catalase activity in plant and animal tissues was first observed in 1818 by Thenard, who noted that such tissues readily degraded hydrogen peroxide, a substance he had also discovered some years earlier (Aebi and Suter 1971). Catalase, as an antioxidant enzyme, is found in practically all aerobic organisms. It decomposes hydrogen peroxide (H2O2) into water and oxygen in peroxisomes (Shangari and O'Brien 2006). H2O2 represents a two-electron reduction state of molecular oxygen and originates primarily from the enzymatic dismutation catalyzed by SOD isoforms. Despite its low reactivity, H2O2 can readily diffuse across biological membranes and generate hydroxyl radicals (·OH), which can react with biomolecules and cause damage (Novo and Parola 2008; Freire et al. 2017; Palma et al. 2020). Catalases have been purified and structurally characterized from various microbial (Borges et al. 2014; Loewen et al. 2015) and animal (Gouet et al. 1995) sources. However, understanding of plant catalase function based on structural and functional data remains limited to a few species such as rice (Alam and Ghosh 2018), wheat (Matsumura et al. 2002), rapeseed (Raza et al. 2021), cucumber (Hu et al. 2016), cotton (Wang et al. 2019b) and Phyllanthus emblica (Sharma and Hooda 2018).
A number of studies reported that the overexpression of genes encoding catalase enzymes in plants improved their salt tolerance (Chaudhry et al. 2021), as well as their tolerance to other abiotic (Noctor et al. 2014; Su et al. 2018; Alam and Ghosh 2018) and biotic (Morkunas et al. 2013; Su et al. 2014) stresses. Plant adaptation to abiotic stress might positively correlate with CAT transcriptional activation in maize (Scandalios et al. 2000; Zang et al. 2018), cucumber (Gao et al. 2009), olive (Cansev et al. 2011), broccoli (Lin et al. 2010), banana (Figueroa-Yáñez et al. 2011) and potato (Jbir-Koubaa et al. 2015). Potato (Solanum tuberosum L.), which originated in the Andean regions of Peru and Bolivia (Spooner et al. 2005), is now grown worldwide as the third most consumed food crop after wheat and rice (FAO 2019). Potato is vulnerable to all kinds of biotic and abiotic stresses during its growth and development, often resulting in lower tuber quality and yield (Handayani et al. 2019; Hill et al. 2020; Chourasia et al. 2021). Therefore, potato research involving the mining and exploitation of stress response genes has acquired a new urgency for enhancing production and breeding.
Considering the potential value of catalase in improving stress tolerance, we screened the annotation of Solanum tuberosum group phureja DM1-3 version 2.1.10 (http://potato.plantbiology.msu.edu/cgi-bin/gbrowse/potato/) and identified 3 CAT genes. Detailed analyses of catalase in potato, including gene classification, multiple alignment, gene phylogeny, conserved motif composition and synteny analysis, were performed based on sequence similarity with their Arabidopsis and tomato counterparts. Moreover, the structural properties and functional components of potato catalase were examined using computational tools. Finally, RT-PCR expression analysis was carried out for the potato catalase genes under salt stress.
Our results from the genome-wide survey of potato catalase family offer a useful basis for further and targeted molecular analysis and functional characterization of potential catalase candidate genes.

Database search and identification of genes encoding catalase enzymes in potato
For the identification of putative genes encoding catalase enzymes in Solanum tuberosum, a BLASTp search was performed against the potato genome database (http://phytozome.jgi.doe.gov/pz/portal.html). The corresponding Arabidopsis thaliana proteins, collected from the TAIR database (https://www.arabidopsis.org/), were used as queries to identify their counterparts in potato, with a sequence similarity cutoff of >70% and an E-value cutoff of 0.001. To further verify the reliability of these candidate sequences, the Pfam database (http://pfam.sanger.ac.uk/search) was used to confirm the presence of conserved domains. All the confirmed potato antioxidant enzyme proteins were named with the prefix "St" for Solanum tuberosum followed by the sub-class identifier (StCAT) and were numbered progressively (StCAT1 to StCAT3).
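The hit-filtering step described above can be sketched as follows. This is a minimal illustration only, assuming the BLASTp results were exported in the standard 12-column tabular format (`-outfmt 6`); the function name `filter_hits` and the example identifiers are hypothetical, and the percent-identity column is used here as a stand-in for the similarity cutoff.

```python
import csv
import io

def filter_hits(tabular_text, min_identity=70.0, max_evalue=1e-3):
    """Keep BLAST tabular (outfmt 6) hits passing the identity and E-value cutoffs."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity = float(row[2])   # column 3: percent identity
        evalue = float(row[10])    # column 11: E-value
        if identity > min_identity and evalue <= max_evalue:
            kept.append((query, subject, identity, evalue))
    return kept

# Hypothetical hits, for illustration only (not data from this study).
example = "\n".join([
    "AtCAT_query\tcandidate_A\t85.2\t492\t73\t0\t1\t492\t1\t492\t0.0\t900",
    "AtCAT_query\tcandidate_B\t45.0\t200\t110\t3\t5\t204\t9\t208\t2e-10\t80",
    "AtCAT_query\tcandidate_C\t72.0\t50\t14\t0\t1\t50\t1\t50\t0.5\t30",
])
hits = filter_hits(example)
```

In this toy input only `candidate_A` passes both cutoffs; candidates retained this way would still need the conserved-domain check before being accepted.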

Phylogenetic relationships and identification of conserved protein motifs
The protein sequences of the catalase enzymes identified in potato (St) were aligned with those of Arabidopsis (At) and tomato (Sl) using ClustalW (Larkin et al. 2007). Gene structure analysis was performed using the Gene Structure Display Server GSDS 2.0 (http://gsds.cbi.pku.edu.cn/; Hu et al. 2015) by aligning the mRNA sequences with the corresponding genomic sequences. Phylogenetic and molecular evolutionary analyses of AtCAT, StCAT and SlCAT protein sequences were conducted with the Neighbor-Joining (NJ) method (1000 bootstrap replicates), and evolutionary distances were computed with the Jones-Taylor-Thornton (JTT) model (Jones et al. 1992). To assess the structural divergence of the CAT family, conserved motifs in the encoded proteins were identified with Multiple Expectation Maximization for Motif Elicitation (MEME version 4.10.2) (http://meme.sdsc.edu/meme/intro.html; Bailey et al. 2009). The evolutionary relationships between CAT proteins from potato and other species were studied through phylogenetic analysis using the NJ method and the JTT model in MEGA6 (Tamura et al. 2013).
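To make the distance and bootstrap steps more concrete, the sketch below computes a pairwise distance from aligned sequences and draws one bootstrap pseudo-replicate by resampling alignment columns. It deliberately substitutes the simple p-distance (fraction of differing sites) for the JTT model used in the study, and the toy sequences and function names are hypothetical.

```python
import random

def p_distance(a, b):
    """Fraction of aligned, ungapped sites at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

# Toy aligned fragments (hypothetical, not real catalase sequences).
toy = {"seqA": "MDPYKHRPSS", "seqB": "MDPYKYRPSS", "seqC": "MDP-KHRPAS"}
d_ab = p_distance(toy["seqA"], toy["seqB"])  # 1 difference over 10 sites
d_ac = p_distance(toy["seqA"], toy["seqC"])  # 1 difference over 9 ungapped sites
replicate = bootstrap_alignment(toy, random.Random(0))
```

In a full analysis, a tree would be rebuilt from each of the 1000 pseudo-replicates, and the bootstrap support of a clade is the fraction of those trees in which it appears.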

Chromosomal localization and synteny analysis of potato CATs
The information on the chromosomal localization of StCAT, AtCAT and SlCAT was retrieved from the Phytozome database. Syntenic blocks between potato, Arabidopsis and tomato genomes were downloaded from the Plant Genome Duplication Database (http://chibba.agtec.uga.edu/duplication/). Syntenic blocks were drawn using Circos Tool (Krzywinski et al. 2009).

In silico analysis of promoter sequences
The promoter sequence of each StCAT gene (1500 bp upstream of the ATG start codon) was extracted from NCBI. The search for putative cis-regulatory elements in the putative promoter sequence of each StCAT gene was performed with the PLACE software (http://www.dna.affrc.go.jp/PLACE/signalscan.html; Higo et al. 1999). The prevalence of each motif was visualized as a word cloud in R (v4.0.4).
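The kind of motif scan performed by PLACE can be approximated with a few lines of code. The sketch below is only an illustration, not the PLACE implementation: it counts overlapping occurrences of a motif on both strands, and the core sequences in `MOTIFS` are the ones commonly listed for these PLACE entries, which is an assumption to be checked against the database.

```python
import re

# IUPAC nucleotide codes needed for the motifs below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "W": "[AT]", "N": "[ACGT]"}

# Core sequences assumed for these PLACE entries (verify before real use).
MOTIFS = {"DOFCOREZM": "AAAG", "CAATBOX1": "CAAT",
          "ARR1AT": "NGATT", "GT1CONSENSUS": "GRWAAW"}

def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_motif(seq, name):
    """Count overlapping matches of a named motif on both strands."""
    core = "".join(IUPAC[base] for base in MOTIFS[name])
    pattern = re.compile("(?=" + core + ")")  # lookahead allows overlaps
    seq = seq.upper()
    return sum(len(pattern.findall(s)) for s in (seq, revcomp(seq)))

# Hypothetical promoter fragment, for illustration only.
fragment = "TTAAAGCCAATGATTAAAG"
counts = {name: count_motif(fragment, name) for name in MOTIFS}
```

Applied to a 1500 bp upstream region, the resulting per-motif counts are the quantities that a word cloud would then scale by frequency.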

Structural features' characterization
The isoelectric point, the molecular weight and the amino acid composition of the deduced potato CAT proteins were calculated using the ProtParam tool (http://us.expasy.org/tools/protparam.html). Subcellular localization was predicted using the TargetP software (http://www.psort.nibb.ac.jp/form.html). The identity and the similarity between these proteins were calculated with Sequence Identity And Similarity (SIAS) (http://imed.med.ucm.es/Tools/sias.html).
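As a rough illustration of how a molecular weight is derived from a protein sequence, the sketch below sums average residue masses and adds one water molecule for the free termini. The mass table holds standard textbook values; the function name is hypothetical, and ProtParam's own implementation may differ in detail.

```python
# Average residue masses in Da (amino acid minus one water); textbook values.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # mass of H2O in Da

def molecular_weight(seq):
    """Average molecular weight (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER

# Sanity check on a small, well-known molecule: free glycine is about 75.07 Da.
glycine_mw = molecular_weight("G")
```

Post-translational modifications and bound cofactors such as haem are not accounted for in this simple sum.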
The secondary structure of the CAT proteins was predicted by the NPS secondary structure prediction methods (SOPMA, GORIV, HNN, MRLC) (Combet et al. 2000). The predicted 3D structures of StCATs were generated using the PHYRE2 server (Protein Homology/analogY Recognition Engine Version 2.0) (http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index; Kelley et al. 2015). The FASTA sequences of the query proteins were entered and the intensive mode was selected to obtain 3D models. Molecular graphics and analyses were performed with the PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC.

Clustering analyses of transcriptomic data
To analyse the expression of potato catalase genes at different developmental stages, in various organs and in response to biotic and abiotic stress conditions, expression values were retrieved from the Spud DB Potato Genomics Resource database (http://solanaceae.plantbiology.msu.edu/). Spud DB provides the expression information for locus IDs from the S. tuberosum group phureja DM1-3 genome browser. The expression values, expressed in FPKM (Fragments Per Kilobase of transcript per Million fragments mapped), were then log-transformed. Finally, the heat map was generated by hierarchical clustering with the Euclidean distance metric using the Multiple Array Viewer software MeV v4.4.1 (Saeed et al. 2006).
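The transformation and distance computation underlying such a heat map can be sketched as follows. This is a minimal illustration, not the MeV procedure itself: the pseudocount of 1 added before taking log2 is an assumption (the text does not state how zero values were handled), and the example profiles are hypothetical.

```python
import math

def log2_transform(fpkm_values, pseudocount=1.0):
    """log2(FPKM + pseudocount); the pseudocount avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in fpkm_values]

def euclidean(u, v):
    """Euclidean distance between two expression profiles of equal length."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical FPKM profiles across three tissues (not data from this study).
profiles = {"geneX": [0.0, 1.0, 3.0], "geneY": [7.0, 1.0, 3.0]}
logged = {gene: log2_transform(vals) for gene, vals in profiles.items()}
distance = euclidean(logged["geneX"], logged["geneY"])
```

Hierarchical clustering then repeatedly merges the genes (or samples) with the smallest such distance to build the dendrogram drawn beside the heat map.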

Plant materials and stress treatments
A commercial potato cultivar, Nicola, was analyzed. Plants grown in vitro were propagated in tubes containing 15 ml of solid MS basal medium (Murashige and Skoog 1962) with 8 g/l agar, supplemented with vitamins (Morel and Wetmore 1951) and 30 g/l sucrose. Plants were cultivated in a growth chamber (12-h photoperiod, 24°C and an irradiance of 62 μmol m−2 s−1). After 10 days of culture, six plantlets were transferred to MS medium supplemented with 0 or 100 mM NaCl.

Expression analyses by semi-quantitative RT-PCR
Total RNA was isolated from leaves as described by Verwoerd et al. (1989). The RNA concentration was determined by absorbance at 260 nm. The primers for semiquantitative RT-PCR analyses (Additional file 1) were designed with melting temperatures of 60°C using the Primer3 v.0.4.0 software. Semi-quantitative RT-PCR was carried out as described by Degenhardt et al. (2005).
Synthesis of the first-strand cDNA was performed using the Access RT-PCR System kit (Promega) according to the supplier's recommendations, with 2 μg of total RNA and 3 μl of oligo dT (10 mM). The reaction mixture, in a final volume of 15 μl, was incubated at 75°C for 10 min and then placed immediately on ice. Then, 5 μl RT buffer (5×); 2 μl dNTPs 10 mM; 1 μl DTT 100 mM; 1 μl RNAs and 1 μl (200 U) AMV reverse transcriptase (Invitrogen) were added. A first incubation at 42°C was carried out for 60 min, followed by a second incubation at 70°C for 10 min to stop the reaction. The subsequent PCR amplification was performed in a final volume of 25 μl. The reaction mixture contained the 5× reaction buffer supplied with the enzyme; 2 mM MgCl2; 10 mM dNTP; 10 mM of each primer; 1 U Taq DNA polymerase (Invitrogen) and 1 μl of cDNA. The amplification was carried out in a 2700 thermocycler (Applied Biosystems). The program included an initial denaturation of 5 min at 95°C, followed by 35 cycles of amplification (30 s denaturation at 95°C, 1 min 30 s annealing at 60°C, 1 min elongation at 72°C) and a final elongation of 10 min at 72°C.
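The thermocycling program can be written down as a small configuration, which also makes its total hold time easy to check. The values below are taken from the protocol above; the dictionary layout and function name are illustrative choices, and ramping time between temperatures is ignored.

```python
# PCR program as described in the protocol (durations in seconds).
PROGRAM = {
    "initial_denaturation_s": 5 * 60,        # 5 min at 95 °C
    "cycles": 35,
    "cycle_steps_s": {
        "denaturation": 30,                   # 30 s at 95 °C
        "annealing": 90,                      # 1 min 30 s at 60 °C
        "elongation": 60,                     # 1 min at 72 °C
    },
    "final_elongation_s": 10 * 60,           # 10 min at 72 °C
}

def total_runtime_s(program):
    """Total hold time of the program in seconds (ramping time excluded)."""
    per_cycle = sum(program["cycle_steps_s"].values())
    return (program["initial_denaturation_s"]
            + program["cycles"] * per_cycle
            + program["final_elongation_s"])

runtime_min = total_runtime_s(PROGRAM) / 60
```

Each cycle lasts 180 s, so the 35 cycles account for 105 min and the whole program for 120 min of hold time.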

Results
I. Genome-wide identification of the catalase gene family in potato
1. Identification of genes encoding catalase enzymes in potato
To identify the potato genes encoding catalase antioxidant enzymes, a BLASTp search of the potato genome database was conducted using the catalase genes of Arabidopsis thaliana as queries. The Pfam and NCBI-CDD databases were then used to verify the presence of conserved domains. Sequence homology revealed 3 putative non-redundant genes encoding the different CAT enzymes in potato (Table 1). All these genes consist of 8 exons. The full length of the StCAT genes ranges from 3471 to 5368 bp, and each encodes a protein of 492 aa. Their CDS and mRNA sizes nevertheless showed the same profile (Fig. 1).

Chromosomal localization and synteny analysis of catalase genes
The chromosomal positions of the genes encoding catalase enzymes were determined from the Phytozome database. The StCAT genes are distributed on three different potato chromosomes (Fig. 2): StCAT2, StCAT3 and StCAT1 are located on chromosomes II, IV and XII, respectively. They are phylogenetically distinct, which suggests that they do not correspond to duplicates.
To further explore the phylogenetic relationships within the catalase gene family, a comparative synteny analysis was carried out in which the physically mapped StCATs were compared with those of Arabidopsis and tomato on their respective chromosomes. A synteny block involving 3 potato, 3 Arabidopsis and 3 tomato CAT genes was generated (Fig. 2). Almost all potato StCAT orthologs displayed the same syntenic location as in Arabidopsis or tomato.

Cis-elements involved in transcriptional activity regulation
A region of 1500 bp upstream of the ATG start codon of each StCAT gene was selected and searched for putative regulatory motifs. Several cis-acting regulatory elements associated with cellular growth, tissue-specific expression, transcription factors, hormonal and light regulation, and abiotic and biotic stress responses were found (Table 2).
To explore the distribution of the cis-regulatory elements in the StCAT promoters, a word cloud was generated (Fig. 3), highlighting the most frequent motifs such as DOFCOREZM, TATABOX and CAATBOX. Overall, five main motifs were identified: the DOF transcription factor binding site (DOFCOREZM), the consensus GT-1 binding site found in many light-regulated genes (GT1CONSENSUS), the CAAT promoter consensus sequence (CAAT box), the root expression motif (ROOTMOTIFTAPOX1) and the regulatory element responsible for pollen-specific activation (POLLEN1LELAT52). We also identified GATABOX, ARR1AT (responsible for cytokinin expression), CACTFTPPCA1 (responsible for mesophyll expression) and EBOXBNNAPA (responsible for seed expression) (Fig. 3). Various phytohormone-responsive elements were also found in the putative StCAT promoter sequences, including ABA- and cytokinin-responsive elements. The latter were the most abundant hormone-responsive elements, with 4 to 8 copies in each catalase gene (Table 2), which implies that StCAT expression might be strongly induced and regulated by cytokinin in potato plants. In particular, the DOF (DOFCOREZM) motif, known for enhancing transcriptional activity, was the most abundant in all StCAT gene promoters. The Dof proteins are a family of plant-specific transcription factors that include Dof1, Dof2, Dof3, and PBF (Yanagisawa 2000). Maize Dof1 was suggested to be a regulator of the expression of the C4 photosynthetic phosphoenolpyruvate carboxylase (C4PEPC) gene (Yanagisawa 2004). Dof1 also enhances the transcription of the cytosolic orthophosphate di-kinase (cyPPDK) genes and the non-photosynthetic PEPC gene (Yanagisawa 2000).
Table 2 (excerpt) Copy number of putative cis-regulatory elements in each of the three StCAT promoters
Function; Motif; Copy number in each promoter
Light-responsive element; SORLIP1AT; 1, 0, 0
Heat-responsive element; CAATBOX1; 10, 17, 6
Heat-responsive element; CCAATBOX1; 0, 2, 2
Low temperature-responsive element; LTRECOREATCOR15; 0, 0, 1
Pathogen- and salt-induced gene; GT1GMSCAM4; 6, 5, 5
Responsible for cytokinin expression; ARR1AT; 8, 7, 4
Transcription factor binding sites; DOFCOREZM; 18, 18, 11
The putative recognition site for MYC, which functions as a transcriptional activator upon dehydration, was identified in most StCAT genes, with 1 to 6 copies in each. Other stress-related motifs, such as salt-responsive elements (Park 2004), light-responsive elements (Terzaghi and Cashmore 2003), disease resistance response elements (Luo et al. 2005) and ABA-responsive elements (Kaplan et al. 2006), were present as well in the StCAT promoters, suggesting the probable implication of the catalase family in the response to these stresses in potato through the ABA signaling pathway. However, no ABRE (ABA-responsive element) was detected within the StCAT3 promoter, suggesting regulatory mechanisms other than ABA responsiveness. In addition, GATABOX is required for light-dependent and nitrate-dependent control of transcription in plants (Reyes et al. 2004). The GATA motif has been found in the promoter of the Cab22 gene that encodes the Petunia chlorophyll a/b binding protein; this motif is the specific binding site of activating sequence factor-2 (ASF-2; Lam and Chua 1989).
All of the putative cis-regulatory elements mentioned here suggest that StCAT family members are implicated in varied cellular processes and that they might respond to environmental stresses through the signaling of different phytohormones.

II. Phylogenetic relationships and motif analysis of catalase potato enzymes
The phylogenetic tree constructed from the catalase protein sequences of S. tuberosum, A. thaliana and S. lycopersicum (Fig. 4) shows that all the StCAT proteins are closer to their homologues in S. lycopersicum (SlCAT) than to those of Arabidopsis. Furthermore, in order to identify the conserved motifs and consensus domains of the CAT proteins, the online MEME Suite (v4.8.2) program was used (Fig. 4a). The sequence details of each motif are shown in Additional file 2. Analogous motifs were shared between the 3 members of StCAT, suggesting common conserved functions within the catalase family. In addition, a phylogenetic tree of 39 protein sequences including several CATs from different origins showed two main groups that seem to be associated according to their family, species, and systematic method (Fig. 4b).
III. Structural characterization of StCAT proteins
1. Primary structure
The identified StCAT proteins had the same size of 492 amino acids (aa), and their corresponding predicted molecular masses ranged from 44.9 to 59.9 kDa (Table 1). The computed pI of these proteins was ∼6.54 on average, indicating that they are likely to precipitate in either acidic or basic buffers and can be maintained within a neutral buffer.
The amino acid sequence analyses showed that the CAT protein sequences from S. tuberosum, A. thaliana and S. lycopersicum exhibit a high level of identity with each other (Additional file 4). Similarity percentages within members of the CAT family ranged between 82.55 and 100%, whereas the identity percentages varied from 75.25 to 99.39%. StCAT2 showed the highest similarity and identity levels (100 and 99.39%, respectively) with SlCAT2 of S. lycopersicum. The lowest similarity and identity percentages were observed between SlCAT1 and AtCAT3, the values being 82.55 and 75.5%, respectively.
The most abundant amino acids (Fig. 5) of the StCAT members are proline (7.3%), arginine (7.1%), aspartate (6.9%), leucine (7.7%), valine (6.5%) and alanine (6.1%). Leucine, alanine and valine are hydrophobic, aliphatic and non-polar amino acids and are thus expected to be found inside the protein or within lipid membranes. The least common amino acid residues were cysteine, methionine and tryptophan, which accounted for ∼1% of the protein's primary structure. The low number of cysteine residues indicates that the chance of disulfide bond formation is low. The secondary structure features of the StCATs, as predicted by the NPS secondary structure prediction methods (SOPMA, GORIV, HNN, MRLC) (Combet et al. 2000), show that random coils (50%) dominated among secondary structure elements, followed by alpha helices (25%) and extended strands (15%) (Table 3). The predominance of coils suggests that catalase from S. tuberosum might not be a very stable enzyme (Perticaroli et al. 2014).
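Amino acid percentages such as those quoted above follow directly from residue counts. The sketch below shows the calculation on a toy peptide; the function names and the example sequence are hypothetical and are not taken from the StCAT proteins.

```python
from collections import Counter

def composition(seq):
    """Percentage of each amino acid in a protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}

def fraction_of(seq, residues):
    """Combined percentage of a chosen set of residues (e.g. L, A and V)."""
    comp = composition(seq)
    return sum(comp.get(aa, 0.0) for aa in residues)

# Toy peptide, for illustration only.
toy = "LLAVCPRD"
toy_comp = composition(toy)
aliphatic_pct = fraction_of(toy, "LAV")  # leucine + alanine + valine
```

For a 492-residue protein, a residue reported at about 1% corresponds to roughly five occurrences in the whole sequence.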
The tertiary structure of each StCAT was predicted by template-based modelling with PHYRE2 (Kelley et al. 2015). Six templates for each StCAT were chosen based on heuristics to maximize confidence, percentage identity and alignment coverage (Table 4). The threading templates were selected by the PHYRE2 server from the PDB database on the basis of a normalized Z-score of >1.0.
Furthermore, Clustal Omega (1.2.4) was used for multiple sequence alignment and active site identification. An alignment of the translated CDS of S. tuberosum catalase with the selected template catalases is shown in Additional file 4. Conserved residues of the catalase sequence involved in H2O2 binding (V2, H3, V44, D56, N76, F81, F82, F89) were identified after a careful study of the alignment. The results were found to be consistent with the experimentally determined crystallographic structure of human erythrocyte catalase (1QQW) (Ko et al. 2000). However, a few substitutions, such as isoleucine (I) by alanine (A), methionine (M) by phenylalanine (F), valine (V) by isoleucine (I) and glutamine (Q) by leucine (L), were also observed in the StCAT proteins (Additional file 4).
The quality of the 3D model was assessed on the basis of the confidence score. Validation with various programs, such as the Ramachandran plot and the energy plot, confirmed the reliability of the model. All validation parameters were within the acceptable range, showing the compatibility of the model with its sequence and indicating a model of excellent quality (Fig. 6A).
Using PyMOL, the residues involved in hydrogen peroxide binding were found to be conserved among the 3 StCATs, confirming their key role in hydrogen peroxide binding as identified by the Prosite-ProRule annotation. Figure 6B shows that the ligand (H2O2) sits in the area lined by the predicted active site residues. Since the catalytic site is actively involved in the charge transfer reactions required for the formation and breaking of bonds, it is expected to have a high electron density (Vivekanand and Balakrishnan 2009).
IV. In silico study of the expression profile of genes encoding catalase enzymes
1. Tissue-specific expression analysis of potato catalase genes
To check the role of the different potato CAT family members, gene expression was first analyzed in different organs and tissues at various developmental stages of S. tuberosum group phureja using the microarray data available in the Spud DB database (Fig. 7). The expression data of the 3 StCAT genes were available and retrieved for analysis. Fluorescence intensity values were analyzed to generate a clustered heat map based on the average Euclidean distance. As shown in Fig. 7, StCAT family members exhibited spatial variations in transcript abundance, with high transcript abundance in one or a few tissues and low transcript abundance in others.
Fig. 7 Expression data were retrieved from the Spud DB Potato Genomics Resource database (http://solanaceae.plantbiology.msu.edu/) for S. tuberosum group phureja DM1-3. The log2-transformed data were used to generate the heat map with hierarchical clustering based on Euclidean distance using the MeV software package. A colour scale is provided along with the heat map to show the differential pattern of expression. Yellow indicates a high level of expression, black a medium level and blue a low level of expression

Expression analysis of StCAT genes under different abiotic stresses, biotic stress elicitors and hormonal treatments
The expression of StCAT genes was analyzed in response to a variety of stress agents, to check their specific functions. To evaluate the effect of stress on StCAT gene expression, various abiotic stresses such as salinity, drought, wounding and heat were evaluated on potato plants (Fig. 8a). StCAT1 showed upregulation in all stress conditions, while StCAT2 showed downregulation in response to salt and mannitol stresses. The StCAT3 showed a completely opposite pattern of expression in comparison to StCAT1. Indeed, it was downregulated in response to almost all stresses.
Then, transcript abundance was analyzed in response to three biotic stress conditions: the elicitors benzothiadiazole (BTH) and β-aminobutyric acid (BABA), and pathogen attack (Fig. 8b). The transcript abundance of all three StCAT genes was found to be modified in response to the pathogen infection. StCAT1 seems to be upregulated in response to all biotic stresses and showed the highest activation levels in response to all hormonal treatments except for BAP.
However, StCAT2 was downregulated only in response to the BABA elicitor, and StCAT3 showed downregulation in response to almost all the applied stresses (Fig. 8c).
Fig. 8 Analysis of StCAT gene expression under various abiotic stresses, biotic stress elicitors and hormonal treatments. Expression of the StCAT genes was analyzed in response to abiotic stresses (a), biotic stresses (b) and hormone treatments (c). Expression data of S. tuberosum group phureja DM1-3 were retrieved for four abiotic stresses (salinity, drought, wounding and heat) (a); three biotic stress treatments (BABA, BTH and pathogen attack) (b); and four plant hormone treatments (6-benzylaminopurine (BAP), indole-3-acetic acid (IAA), abscisic acid (ABA) and gibberellic acid (GA3)) (c), along with their respective controls. The log2-transformed fold change in expression was used to generate the heat maps with hierarchical clustering based on Euclidean distance using the MeV software package. The colour scale at the bottom of the figure represents expression variations. Stress-induced upregulation or downregulation of StCAT transcripts is indicated by yellow or blue, respectively
Fig. 9 Agarose gel electrophoresis of RT-PCR products of the catalase genes from leaves of in vitro potato plants (var. Nicola) cultivated in the absence (-) or presence (+) of 100 mM NaCl. A Agarose gel (2%) of RNA extracted from potato leaves (var. Nicola). B Agarose gel (1%) of cDNA. C Agarose gel (1%) of RT-PCR products of the catalase genes of potato leaves. MT, size marker; T-, negative control without RT

V. Functional study of genes encoding catalase in Nicola potato variety
The expression profile of the 3 StCAT genes was investigated by semi-quantitative RT-PCR on RNA from the potato cultivar Nicola grown in the presence or absence of 100 mM NaCl (Fig. 9). Oligonucleotides were designed and used as primers in RT-PCR reactions, allowing the amplification of an internal region of each gene (Additional file 1). The RT-PCR profiles showed that salt stress activated the expression of the StCAT1 gene in potato leaves. However, the expression of the StCAT2 and StCAT3 genes did not seem to be affected by salt stress. In conclusion, the StCAT gene expression observed in the leaves of Nicola potato plants was consistent with the transcriptome data.

Discussion
Excess ROS are generated when plants are exposed to adverse environmental conditions (Kwak et al. 2006; Caverzan et al. 2016). To neutralize this excess, plants can activate several types of machinery. As one of the main antioxidant enzymes, catalase (EC 1.11.1.6) serves as an efficient scavenger of ROS and catalyzes the dismutation of two molecules of hydrogen peroxide (H2O2) into oxygen (O2) and water. Considering its significant role, a genome-wide analysis was performed in potato to identify and characterize the catalase gene family and to check its expression profile in response to various developmental and environmental cues using computational tools.
The current study showed that potato plant contains three CAT genes (StCAT1-StCAT3). The analysis of corresponding protein revealed that both tomato and Arabidopsis possess a similar number of catalase proteins (three each) through alternative splicing (Table 1; Fig. 1). The presence of multiple catalase genes suggested multiple functions for catalases in a variety of plant tissues at various developmental stages and under constantly changing environments from which plants cannot readily escape.
The potato catalase genes are located on three different chromosomes, like those of tomato, whereas in Arabidopsis AtCAT1 and AtCAT3 reside on the same chromosome (Fig. 2). The orthologous gene pairs clustered in the same phylogenetic class. Moreover, the synteny analysis suggests that the loci of the StCAT genes are conserved. In recent studies, synteny and phylogeny analyses showed the conservation of CAT genes in bread wheat and cotton, suggesting a role for polyploidy in the expansion of the catalase gene family (Wang et al. 2019b; Tyagi et al. 2021).
The sequence data show that each of the three StCAT genes is interrupted by seven introns. The number of introns in catalase genes, and their positions, were found to be extremely conserved among 13 monocot and dicot plants. This suggests that they derive from an ancestral catalase gene, common to monocots and dicots, that contained seven introns (Frugoli et al. 1998). Although AtCAT2 has seven introns, both AtCAT1 and AtCAT3 have six introns in positions conserved with their homologue, but each has lost a different intron (Fig. 1). The presence of large introns in StCAT transcripts could improve the recombination frequency as well as maintain the counterbalance of mutation bias (Alam and Ghosh 2018). The putative promoter regions of the three StCAT genes do not share any sequence identity, possibly explaining the differential regulation observed for each of these genes. The exploration of the upstream promoter regions of the StCAT genes revealed numerous cis-acting elements, which were broadly categorized as (i) light-responsive, such as IBOX, GT1CONSENSUS and SORLIP1AT; (ii) stress-responsive, such as ARE, LTRECOREATCOR, GT1GMSCAM4 and CAATBOX; (iii) hormone-responsive, such as ARR1AT, ABRELATERD1 and ACGTATERD1; and (iv) transcription factor binding sites, such as DOFCOREZM, MYBCORE and WBOXATNPR1 (Fig. 3). Similar cis-acting elements have been reported in the CATs of other plant species such as A. thaliana and O. sativa (Alam and Ghosh 2018), as well as in other antioxidant gene families of S. tuberosum, for instance glutathione S-transferase (StGST) (Islam et al. 2018).
The analysis of conserved motifs and the phylogenetic tree built from the catalase protein sequences of S. tuberosum, A. thaliana and S. lycopersicum (Fig. 4) show that all the StCATs are closer to their homologues in S. lycopersicum (SlCAT) than to those of Arabidopsis. They also share significant sequence homology with other known plant CATs (Zang et al. 2018). Previously, Sheoran and collaborators (2013) reported that most of the dicot and monocot catalases are closely related to each other, which revealed the divergence between dicot and monocot catalase genes during evolution. A similar evolutionary pattern was shown for glutathione peroxidase proteins (Wang et al. 2019a).
Based on the subcellular localization analysis, the StCAT proteins are predicted to localize to the peroxisome. Considering the prominent role played by catalase in H2O2 metabolism, peroxisomes are judged to be the main cellular site of H2O2 production during different processes (Mhamdi et al. 2012; Corpas et al. 2019). This tight link between catalase and peroxisomes is supported by the composition of the polypeptide C-terminal sequence and by the diverse methods used to localize this enzyme within the cell, which has led the scientific community to consider catalase the typical marker enzyme for these organelles (Palma et al. 2020).
The StCAT polypeptides all have the same size of 492 aa, with an average MW of 52 kDa and an average predicted pI of 6.54. Comparable physicochemical features of catalases have been reported in other plant species, for instance a range of 377-492 aa and 44-56 kDa in A. thaliana, 285-493 aa and 32-56 kDa in O. sativa, 409-494 aa and 47-57 kDa in G. hirsutum, 459-601 aa and 53-67 kDa in G. barbadense, and 458-744 aa and 53-82 kDa in Cucumis sativus L. The length of 492 aa and the MW of 56.9 kDa are the most frequent (Hu et al. 2016; Alam and Ghosh 2018; Wang et al. 2019b).
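As a rough illustration of how polypeptide length relates to molecular weight, the following Python sketch sums standard average residue masses and adds one water molecule for the free termini. It is a simplified stand-in for the physicochemical prediction tools, not the method used in this study. The function names are illustrative, and the 110 Da mean residue mass is only a common rule of thumb.

```python
# Standard average residue masses in Da (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # mass of one water molecule, added once per chain


def molecular_weight(seq):
    """Average molecular weight (Da) of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER


def rough_kda(length, mean_residue_mass=110.0):
    """Rule-of-thumb estimate (kDa) from length alone; 110 Da is an assumed mean."""
    return length * mean_residue_mass / 1000.0


print(round(molecular_weight("G"), 3))  # free glycine
print(rough_kda(492))                   # rough estimate for a 492 aa chain
```

For a 492 aa chain the rule of thumb gives a value in the low to mid 50s of kDa, which is in the same range as the catalase masses listed above. The exact figure depends on the amino acid composition of each isoform.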
Each of the potato catalase isozymes is structurally similar to those found in other organisms (Chandler and Dodds 1983). All catalases have a conserved catalase domain involved in H2O2 binding (Alam and Ghosh 2018; Wang et al. 2019a). Accordingly, the 3D model presents a complete active site able to bind the H2O2 ligand (Additional file 5; Fig. 6). Based on structural homology modelling, the StCATs shared the same best PDB hit, the catalase from E. coli (c1p81A), and were predicted to contain ~13 α-helices (including both long and short helices) and two 4-stranded antiparallel β-sheets (Table 4; Fig. 6). These results are consistent with the experimentally determined crystallographic structure of Deinococcus radiodurans catalase (4CAB) (Borges et al. 2014).
According to the predicted tissue-specific expression, the StCAT genes are expressed in different organs and tissues such as leaves, roots and flowers. The strongest expression was observed for StCAT1, particularly in the tubers and leaves, whereas StCAT3 had the lowest expression in all tissues (Fig. 7). In the model plant Arabidopsis thaliana, AtCAT3 is expressed in both leaves and vascular tissues, AtCAT1 shows a marked change in its expression profile during floral organ and seed development, and AtCAT2 is associated with photosynthetic tissues and also roots (Mhamdi et al. 2012). Furthermore, the three catalase genes of maize showed diverse tissue specificity: ZmCAT1 and ZmCAT3 showed kernel-specific expression, whereas ZmCAT2 was expressed after kernel development (Wadsworth and Scandalios 1989; Guan and Scandalios 1993; Guan et al. 1996).
In addition, the expression analysis of StCAT genes under different biotic and abiotic stresses in potato demonstrated that StCAT1 is the most highly expressed member, followed by StCAT2 and StCAT3, suggesting that these genes are likely able to respond to several developmental stimuli. Consistent with the regulatory motifs found in the promoter of the StCAT1 gene, the expression analyses based on transcriptome studies and semi-quantitative RT-PCR revealed a constitutive and ubiquitous expression profile in response to abiotic, biotic and hormonal stimuli at different developmental stages, meaning that this gene plays various roles in potato development and adaptation. Previously, Jbir-Koubaa et al. (2019) showed that the StCAT1 and StCAT2 genes were the most upregulated in potato somatic hybrid lines under salt stress conditions. Similarly, AtCAT3, the orthologue of StCAT1, was the most highly expressed member in Arabidopsis. Moreover, AtCAT2 and AtCAT3 display opposite circadian profiles, while AtCAT1 scarcely varies under the circadian clock (Li et al. 2015, 2017; Su et al. 2018). Chen et al. (2012) concluded that the expression of the sweet potato catalase SPCAT1 contributes to H2O2 homeostasis under natural conditions. ZmCAT, a member of the plant CAT family in Zostera marina, was implicated in minimizing oxidative damage under temperature stress (Zang et al. 2018).
The overexpression of the Escherichia coli KatE gene in potato affects the multiplication rate of in vitro plants, as well as vegetative and physiological growth parameters, under salt stress conditions. In addition, the overexpression of IbCAT2, a gene from sweet potato cv. Xushu 18, conferred salt and drought tolerance in Escherichia coli and Saccharomyces cerevisiae (Yong et al. 2017).
These results support the contribution of these enzymes to plant tolerance of salinity stress. Indeed, several studies have demonstrated that the overexpression of genes encoding antioxidant enzymes in plants improved their tolerance to salt (Su et al. 2014; Yong et al. 2017), abiotic (Rasoulnia et al. 2011; Zhang and Shi 2013; Noctor et al. 2014) and biotic (Morkunas et al. 2013) stresses.

Conclusion
In summary, this study offers a genome-wide annotation and nomenclature of the potato CAT gene family, together with the gene structure, the secondary and 3D protein structures, the physicochemical properties and the phylogenetic tree. The expression profiles of the CAT genes across developmental stages and in response to abiotic stress, biotic stress and hormones were also examined. Furthermore, an expression analysis was carried out by semi-quantitative RT-PCR. We concluded that StCAT1 is the constitutively expressed member, while StCAT2 and StCAT3 are the stress-responsive members. These results provide a new starting point for distinguishing the functions of CAT family members in S. tuberosum. Ultimately, this study will not only advance our knowledge of the mechanisms of stress tolerance in plants, but will also help to develop transgenic crop plants with better yield potential and stress tolerance.