Evolutionary and Expression Analysis of CAMTA Gene Family in Three Species (Arabidopsis, Maize and Tomato), and Gene Expression in Response to Developmental Stages

The calmodulin-binding transcriptional activator (CAMTA) family is known as one of the fast-responding families of stress proteins. In this study, 17 CAMTA genes were selected in Arabidopsis, tomato and maize. The aim of the study was to identify and characterize CAMTA genes in the three species, for the first time, via an in silico genome-wide analysis approach. The chromosomal distributions, gene structures, duplication patterns, phylogenetic relationships, and developmental-stage expression of the 17 CAMTA genes in the three species were analyzed to further investigate their functions. According to the synteny analysis, the CAMTA genes of maize and tomato showed higher similarity with each other than with those of Arabidopsis. An identity higher than 90 percent was observed between maize CAMTA genes (ZmCAMTA2 and ZmCAMTA3) and tomato CAMTA genes (SlCAMTA4 and SlCAMTA4.1). To detect expression levels in different plant tissues, mRNA analysis of CAMTA genes was performed using publicly available expression data in Genevestigator. The AtCAMTA1, AtCAMTA2, SlCAMTA2, ZmCAMTA1, and ZmCAMTA2 genes were up-regulated during all developmental stages. The conserved motifs and gene structures of most proteins in each group were similar, validating the CAMTA phylogenetic classification. This study could serve as a useful source for future comparative studies of CAMTA genes in different plant species.


Introduction
Plants are exposed to a variety of abiotic and biotic stresses during their life cycle. These environmental conditions may activate signaling pathways or mechanisms [1]. To identify responsive genes and transcription factors (TFs) involved in environmental stress adaptation, the interaction of TFs with cis-elements has been surveyed [2]. Owing to the importance of TFs in the regulation of stress-related genes, an increasing number of studies focus on their functions. Several studies have shown that TFs affecting the life cycles of organisms play a crucial role in plant and cell signaling [3].
In general, over 90 TFs are known to be CaM-binding proteins (CBPs), including CAMTAs, MYBs, WRKYs, NACs, and MADS box proteins [4]. Among the most important of these TFs are the calmodulin-binding transcriptional activators (CAMTAs), which contain conserved functional domains known as CG-1 (a sequence-specific DNA-binding domain), IPT/TIG (transcription factor immunoglobulin), ankyrin (ANK) repeats, and calmodulin-binding regions (a CaMBD and a varying number of IQ motifs). CAMTAs, the well-studied CaM-binding transcription factors, are found in animals, fungi, and plants [5]. The binding sites for CAMTAs are present in the promoters of downstream genes and are designated as (A/C/G)CGCG(T/C/G) or (A/C)CGTGT, which helps to regulate the expression of these genes [6].
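As an illustration of how the consensus motifs given above can be located in a promoter, the following minimal Python sketch scans one strand of a sequence with a regular expression. The function name and the example sequence are hypothetical and not part of the study; only the two motif patterns are taken from the text.

```python
import re

# CAMTA-binding consensus motifs as written in the text:
# (A/C/G)CGCG(T/C/G) or (A/C)CGTGT.
# The lookahead makes overlapping sites visible to finditer.
CAMTA_MOTIF = re.compile(r"(?=([ACG]CGCG[TCG]|[AC]CGTGT))")


def find_camta_sites(promoter):
    """Return (0-based start, matched motif) for each site on the given strand."""
    seq = promoter.upper()
    return [(m.start(), m.group(1)) for m in CAMTA_MOTIF.finditer(seq)]


# Hypothetical promoter fragment, for illustration only.
example_promoter = "ttACGCGTaaCCGTGTgg"
for start, motif in find_camta_sites(example_promoter):
    print(start, motif)
```

The sketch scans only the strand that is passed in; scanning the reverse complement as well would be a natural extension.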
The first CAMTA gene member, NtER1, was identified in tobacco while surveying CaM-binding proteins [7]. In Arabidopsis, six CAMTA genes, termed AtCAMTA1 to AtCAMTA6, have been detected. These AtCAMTAs are involved in biotic, abiotic, and hormonal regulation and in developmental stages [8]. For example, AtCAMTA1 and AtCAMTA3 play an important role in the response to cold and drought stress and in the regulation of auxin and salicylic acid in plants. AtCAMTA6 is expressed in response to salt stress, while AtCAMTA4 plays a pivotal role in the defense responses of plants against pathogens and in the response to ethylene, jasmonic acid (JA), and abscisic acid (ABA) [9]. However, comprehensive analyses of CAMTA proteins from a variety of plant species with diverse phylogenetic relationships are still lacking.
Loss-of-function CAMTA3/AtSR1 mutants exhibited chlorosis, autonomous lesions, and elevated resistance to pathogens. Similarly, a mutant of the rice CAMTA member OsCBT showed significant resistance to pathogens, indicating that OsCBT might also act as a negative regulator of plant defense.
CAMTA3 also plays important roles in plant defense against insect herbivores, the regulation of glucose metabolism, and ethylene-induced senescence in Arabidopsis [10]. Recently, CAMTA1, CAMTA2, and CAMTA3 were reported to function together in suppressing SA biosynthesis and to be involved in cold/freezing tolerance through the induction of CBF transcription [11].
CAMTAs from tomato were identified as differentially expressed genes (DEGs) during fruit development and ripening, indicating that calcium signaling is involved in the regulation of fruit development and ripening through calcium/calmodulin/CAMTA interactions [12]. Recently, 15 CAMTA genes were identified in soybean, and their expression profiles revealed that they were responsive to various stresses and hormone signals [13]. According to one study, TaCAMTA4 may function as a negative regulator in the response to Puccinia triticina, since gene silencing-based knockdown of TaCAMTA4 resulted in enhanced resistance to P. triticina race 165 [14].
There has been little comparative analysis of CAMTA transcription factors in Arabidopsis, tomato and maize. Therefore, this study focuses on the systematic bioinformatics analysis and expression pattern analysis of CAMTA transcription factors in these three species. We report the identification and a comprehensive analysis of the CAMTA gene family in A. thaliana, Z. mays, and S. lycopersicum. Specifically, comprehensive information is provided on the gene structures, chromosomal locations, phylogenetic relationships, gene duplications, gene expression during developmental stages, and promoter cis-elements of 17 CAMTA genes in the three species.

Material And Methods
The sequences of the six CAMTA family members from Arabidopsis thaliana were downloaded from The Arabidopsis Information Resource (TAIR) database. A BLASTP search (E-value ≤ 1e-7) was made against the genome of flax in the NCBI database using the Arabidopsis CAMTA proteins as queries.
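The E-value cutoff described above can be applied programmatically to BLAST tabular output. The sketch below assumes the standard 12-column tabular format (BLAST+ `-outfmt 6`), in which the E-value is the eleventh column; the study does not state which output format was used, and all identifiers in the example are hypothetical.

```python
E_VALUE_CUTOFF = 1e-7  # threshold stated in the text


def filter_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Keep, for each subject sequence, the best hit with E-value <= cutoff.

    Assumes 12-column tabular BLAST output: query id is column 1,
    subject id is column 2, and E-value is column 11.
    Returns {subject_id: (query_id, evalue)}.
    """
    hits = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > cutoff:
            continue
        best = hits.get(subject)
        if best is None or evalue < best[1]:
            hits[subject] = (query, evalue)
    return hits


# Hypothetical BLASTP rows, for illustration only.
rows = [
    "AtCAMTA1\tsubjA\t65.0\t900\t10\t2\t1\t900\t1\t900\t1e-50\t800",
    "AtCAMTA2\tsubjA\t70.0\t950\t8\t1\t1\t950\t1\t950\t1e-80\t990",
    "AtCAMTA1\tsubjB\t30.0\t100\t60\t5\t1\t100\t1\t100\t0.001\t40",
]
print(filter_hits(rows))
```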
ExPASy server (https://www.expasy.org/) was employed to predict the theoretical isoelectric point (pI) and the molecular weight (Mw) of each CAMTA protein.

Phylogenetic Analysis and Gene Structure of CAMTA Proteins
The CAMTA protein sequences of Arabidopsis, maize, and tomato were aligned using the ClustalW function of MEGA 7.0, and a phylogenetic tree was constructed in MEGA 7 by applying the neighbor-joining algorithm with 1000 bootstrap replicates. The Gene Structure Display Server v2.0 (GSDS, http://gsds.cbi.pku.edu.cn/) was used to obtain information on the exon-intron structure of the CAMTA genes [15]. Both the genome sequences and the coding sequences were used to predict the positional information (chromosomal locations) of the CAMTA genes using Ensembl Plants. The sizes (bp) and intron numbers of the CAMTA genes were identified.

Analysis of transcription factor binding sites in the CAMTA promoters
The 1500 bp region upstream of each CAMTA gene was retrieved from the NCBI database as the promoter region. The sequences obtained were analyzed using the PlantPAN database (http://plantpan.itps.ncku.edu.tw/) with default settings to identify the key transcription factor binding sites (TFBSs) with respect to stress response.
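For readers who wish to reproduce the promoter retrieval locally rather than through the NCBI web interface, the following minimal sketch shows one way to cut an upstream region from a chromosome sequence. It is not the procedure used in the study: the function name, the coordinate convention (1-based, inclusive, forward-strand coordinates), and the toy sequence are assumptions made for illustration.

```python
# Translation table used to build the reverse complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def upstream_region(chrom_seq, gene_start, gene_end, strand, length=1500):
    """Return up to `length` bp upstream of a gene, in the gene's orientation.

    gene_start and gene_end are 1-based inclusive forward-strand coordinates.
    For a minus-strand gene, the upstream region lies after gene_end on the
    forward strand and is returned as its reverse complement.
    """
    if strand == "+":
        begin = max(0, gene_start - 1 - length)  # truncate at chromosome start
        return chrom_seq[begin:gene_start - 1]
    region = chrom_seq[gene_end:gene_end + length]  # slicing truncates at the end
    return region.translate(COMPLEMENT)[::-1]


# Toy chromosome and gene coordinates, for illustration only.
toy_chrom = "AAACCCGGTTAC"
print(upstream_region(toy_chrom, 4, 6, "+", length=3))
print(upstream_region(toy_chrom, 4, 6, "-", length=3))
```

Keeping the window length as a parameter makes it easy to compare different promoter lengths.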

Chromosomal Distribution and Conserved motif of the CAMTA Gene
To locate the CAMTA members on the Arabidopsis, maize and tomato chromosomes, the CAMTA genes were placed on each chromosome according to their physical locations. For exon/intron structural analysis, the genomic DNA and CDS sequences corresponding to each predicted Arabidopsis, tomato and maize CAMTA gene were downloaded from the NCBI database. The CAMTA genes were drawn on the chromosomes and presented with MapChart [16]. The conserved motifs of the 17 CAMTA proteins were identified with the online MEME program (http://meme.nbcr.net/meme/cgi-bin/meme.cgi; http://meme.sdsc.edu/meme/meme.html) using the full-length protein sequences, with the following parameters: a maximum of 6 motifs and a motif width of 6-100 amino acids.

Expression patterns of CAMTA genes in developmental stages
To find DEGs across developmental stages, CAMTA gene expression data were extracted with Genevestigator from the A. thaliana, Z. mays, and S. lycopersicum databases using the Affymetrix Arabidopsis/maize/tomato Genome Array platforms and the 'Perturbations' tool. Genes with p-values < 0.05 and log fold-change values ≥ 2 or ≤ −2 were selected as DEGs. The fold changes in the expression of CAMTA genes across developmental stages were used to generate gene expression heatmaps in Genevestigator (https://genevestigator.com/gv/) with a purple/white color scheme, where purple and white represent up- and down-regulation of the respective genes.
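The selection rule above can be written down compactly. The following minimal sketch is not Genevestigator's implementation; it only restates the thresholds from the text (p-value < 0.05 and log fold-change ≥ 2 or ≤ −2). The function name and the example values are hypothetical.

```python
P_VALUE_CUTOFF = 0.05  # threshold stated in the text
LOG_FC_CUTOFF = 2.0    # absolute log fold-change threshold stated in the text


def classify_expression(log_fc, p_value):
    """Label a gene as 'up', 'down', or 'not significant'."""
    if p_value >= P_VALUE_CUTOFF or abs(log_fc) < LOG_FC_CUTOFF:
        return "not significant"
    return "up" if log_fc > 0 else "down"


# Hypothetical (gene, log fold-change, p-value) records, for illustration only.
records = [("geneA", 2.5, 0.01), ("geneB", -3.1, 0.001), ("geneC", 1.2, 0.2)]
for gene, log_fc, p_value in records:
    print(gene, classify_expression(log_fc, p_value))
```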

Identification of orthologous and paralogous CAMTA genes between three species
The orthologous genes in all three species were detected with EnsemblPlants (https://plants.ensembl.org/index.html). CAMTA genes were selected as orthologous when the identity exceeded 70%, whereas paralogous genes were selected when the identity was more than 85%, according to EnsemblPlants. The orthologous and paralogous CAMTA gene relationships were visualized using the Circos program (http://mkweb.bcgsc.ca/tableviewer/).
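The two identity thresholds can be illustrated with a short sketch. It assumes, as a simplification, that a pair of genes from different species is a candidate ortholog pair and a pair from the same species is a candidate paralog pair; in the study itself the homology type was taken from EnsemblPlants. The function name and the example identities are hypothetical.

```python
ORTHOLOG_MIN_IDENTITY = 70.0  # percent, threshold stated in the text
PARALOG_MIN_IDENTITY = 85.0   # percent, threshold stated in the text


def classify_pair(species_a, species_b, identity):
    """Return 'ortholog', 'paralog', or None for a gene pair.

    identity is the percent identity between the two sequences.
    """
    if species_a != species_b:
        return "ortholog" if identity > ORTHOLOG_MIN_IDENTITY else None
    return "paralog" if identity > PARALOG_MIN_IDENTITY else None


# Hypothetical identities, for illustration only.
print(classify_pair("A. thaliana", "S. lycopersicum", 72.0))
print(classify_pair("Z. mays", "Z. mays", 91.0))
print(classify_pair("Z. mays", "Z. mays", 80.0))
```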

Results And Discussion
In the present study, we identified 17 members of the CAMTA gene family in Arabidopsis, maize and tomato, and named them according to their positions on the chromosomes (Table 1). Bioinformatics analyses covering phylogenetic relationships, domains, gene structures, protein motifs, chromosomal locations, and the detection of homologous and orthologous CAMTA genes were performed. The expression profiles at different developmental stages were also analyzed. All SlCAMTA, ZmCAMTA, and AtCAMTA genes are listed in Table 1 along with their gene nomenclature, gene details, amino acid length, molecular weight, and isoelectric point. The SlCAMTA TFs vary in amino acid length from 916 to 1041, with a predicted isoelectric point (pI) ranging from 5.50 to 8.89 and a molecular weight ranging from 103.94 to 117.42 kDa. The AtCAMTA TFs vary in amino acid length from 845 to 1050, with a predicted pI ranging from 5.17 to 8.05 and a molecular weight ranging from 96.18 to 117.26 kDa. The ZmCAMTA TFs vary in amino acid length from 842 to 1025, with a predicted pI ranging from 6.38 to 8.43 and a molecular weight ranging from 94.65 to 114.42 kDa. The three genome sequences and the coding sequences were used to predict the positional information of the CAMTA genes using the PLAZA database v12.1.

Phylogenetic relationships, conserved motifs and gene structures of the CAMTA factor family genes in Arabidopsis, tomato and maize
To determine the phylogenetic relationships among the different members of the CAMTA proteins in Arabidopsis, tomato, and maize, a phylogenetic analysis based on alignments of the 17 CAMTA protein sequences was performed. As shown in Fig. 1, the neighbor-joining phylogenetic tree divided the 17 CAMTA genes into three clusters. Cluster I grouped seven CAMTA factors: AtCAMTA1, AtCAMTA2, AtCAMTA3, SlCAMTA2, SlCAMTA3, SlCAMTA3.1 and ZmCAMTA4. In cluster II, one tomato CAMTA factor (SlCAMTA6) clustered with AtCAMTA5, AtCAMTA6, and ZmCAMTA2. In cluster III, two tomato CAMTA factors (SlCAMTA4 and SlCAMTA4.1) clustered with four other genes: AtCAMTA4, ZmCAMTA1, ZmCAMTA3, and ZmCAMTA5. In contrast, CAMTA genes have been divided into seven subfamilies in Arabidopsis and rice [17]. It has been reported that subfamilies I, II and III of the CAMTA genes occur in both dicotyledons and monocots, but that subfamily IV genes do not exist in monocots [18]. Another study showed that CAMTA genes were divided into four subfamilies (I, II, III, and IV) in soybean [18]. The genes within each cluster showed similar exon/intron structures and conserved motifs. To reveal the evolutionary relationship between Arabidopsis, tomato and maize, we comprehensively analyzed the phylogeny of the orthologous and paralogous CAMTAs of the three species. Our clustering analysis of CAMTA factors in dicots and monocots showed three subfamilies in the phylogenetic tree, similar to the results obtained in soybean [18]. Most of the AtCAMTA, SlCAMTA and ZmCAMTA genes were classified in the same subfamily. The number of IQ motifs in the CAMTAs varied from one to three. All CAMTAs contain two IQ motifs at the C-terminus (Fig. 2a). This study revealed that CAMTAs share the same domain organization as reported previously [18]. Using the MEME tool, a total of eight conserved motifs were detected in the CAMTA proteins, and the lengths of these conserved motifs varied from 6 to 50 amino acids. Among them, motifs 2, 3, 4, 5, 7 and 8 were widely identified in all CAMTAs (Fig. 2b). In general, the CAMTA factors in the same clade may have similar functions. The composition of conserved motifs in most proteins of each group was similar, which validated the phylogenetic classification of the CAMTA transcription factors.

Gene Structure
CAMTA genes of group II were disrupted by the highest numbers of introns, i.e. 9-12, while group I and group III were disrupted by 1-12 introns. The fixed number of introns and exons is a conserved character of CAMTAs among Arabidopsis, maize and tomato [19].
The study of protein structure is important for understanding a protein's mode of action. Using MEME, the conserved motifs of all CAMTA proteins were analyzed. The plant CAMTA-encoded proteins are characterized by the presence of four functional domains, known as CG-1 (a sequence-specific DNA-binding domain), ANK (ankyrin repeats), IPT/TIG (transcription factor immunoglobulin), and IQ motifs (calmodulin-binding), which are highly conserved among different plant species. All four domains (CG-1, ANK, IPT/TIG, and IQ) were detected within each of the SlCAMTA, ZmCAMTA, and AtCAMTA proteins.
It can be concluded from Fig. 4 that there is tandem duplication in Arabidopsis, tomato, and maize. Gene duplication can assist plants to adjust to different environments during their development and growth [20].

Prediction of TFBS in the CAMTA genes
To understand the transcriptional, hormonal, and developmental regulation in response to stress, the PlantPAN database was used to predict transcription factor binding sites in the 1000 bp upstream promoter region of the CAMTA genes. The results showed that different known stress-related cis-elements exist in the promoter regions of the 17 CAMTA genes.
The CAMTA promoters have various cis-regulatory elements that are believed to be involved in responses to environmental and hormonal regulation (Fig. 5). In this study, the elements identified in response to various stresses included bZIP, CAMTA, MYB, bHLH, NF-YA, NF-YB, NF-YC, and AP2/ERF.
Analysis of the promoters showed that the bZIP and G-box cis-elements are associated with the response to ABA [20]. WRKY cis-elements are regulated in response to the hormone auxin. Our results identified AT-hook, YABBY, and EIL/EIN3 elements in the promoter regions of the CAMTA genes. Among the surveyed CAMTA genes, the ZmCAMTA genes had the most cis-elements in their promoter regions, with ZmCAMTA2, ZmCAMTA4 and ZmCAMTA5 having the highest numbers. The promoter sequences contained approximately 25 types of elements (Fig. 5).
Within the promoter regions of each of the three species, the abiotic stress-responsive cis-elements AP2/ERF, CAMTA, WRKY and MYB were surveyed. A previous study has shown that the occurrence of WRKY cis-elements is often synergistically linked to that of responsive bZIP TFs [22]. Several AP2/ERFs play key roles in the response to specific stresses, and their diverse DNA-binding preferences enable these TFs to integrate responses to multiple stimuli under stress conditions [23].
Researchers have proposed that AP2/ERF TFs work in tandem with bZIPs and MYBs to bring about synergistic regulation of cold stress tolerance by controlling ABA mediated gene expression in Arabidopsis [24][25]. Therefore, it can be suggested that a network of TFs is involved in co-regulating diverse stress-responsive genes, which potentially form the missing molecular link between primary and specialized metabolism genes under stress conditions [26].
Among the cis-elements surveyed, dehydrin was the most abundant cis-element found in the SlCAMTA/AtCAMTA2/ZmCAMTA2 genes. Owing to the presence of dehydrin, plants are found to respond better to drought stress. The enrichment of dehydrin in the promoter regions of most ZmCAMTA/AtCAMTA/SlCAMTA family genes suggested comprehensive transcriptional regulation by the CAMTAs themselves and indicated a complicated regulatory network between them. Our findings showed that the surveyed SlCAMTA genes, such as SlCAMTA7, SlCAMTA2, SlCAMTA17, and SlCAMTA11, were up-regulated under drought stress. Physiologically, the WRKY TFs that bind to W-boxes regulate various developmental activities (controlling senescence) and defense-associated processes (regulating responses to pathogen infestation and other abiotic stresses) [27]. Similarly, MYB-binding sites are present in the promoters of the genes (StCAMTA26a/26b and SlCAMTA19/5/18) and MYB (StCAMTA26a/26b and SlCAMTA4/5/8/8A/16a/19/26a/26b). It has been recognized that MYB TFs binding to their respective cis-elements control changes in various processes such as hormonal signaling, specialized metabolism (phenylpropanoid and anthocyanin biosynthesis), cellular morphogenesis, and meristem formation [28][29].
Next, we analyzed the hormone-responsive elements in the Arabidopsis, maize, and tomato CAMTA promoters. The most dominant hormone-responsive element in the CAMTAs is ABRE, which recognizes the ABA signal and was detected in the promoters of the 17 CAMTAs. Another founding member is LOB (corresponding to ASL3), which is intriguing in that it is expressed specifically at the base region of all lateral organs formed from the shoot apical meristem (SAM) and the floral meristem [2].
The maximum number of WRKY elements was found in AtCAMTA3/6, SlCAMTA4 and ZmCAMTA5/6. Therefore, WRKY elements in the promoter regions of these genes may have important roles in the response to pathogens. The ethylene-insensitive3-like/ethylene-insensitive3 (EIL/EIN3) protein family can serve as a vital factor for plant growth and development under different environmental conditions. EIL/EIN3 proteins are nuclear-localized proteins with DNA-binding activity that contribute to the complex network of primary and secondary metabolic pathways of plants [30]. AT-hook DNA-binding proteins are known to contribute to a functional nuclear architecture by binding to the nuclear matrix. AT-hook motifs bind to the minor grooves in duplex DNA of matrix attachment regions (MARs) of target DNA sequences, which contain characteristic AT-rich DNA sequences. Many plant AT-hook motif proteins have a plant and prokaryote conserved (PPC) domain of unknown function. AT-hook motifs are associated with functional domains in chromatin proteins and in DNA-binding proteins, such as homeodomains and zinc fingers.
Our findings showed that YABBY, WRKY, and GATA are involved in the response to developmental stages and abiotic stresses. The NAC and YABBY transcription factors are known to be involved in numerous biological processes [32]. YABBY transcription factors play a critical role in determining organ polarity and are involved in the establishment of abaxial-adaxial polarity in lateral organs [32]. YABBY family transcription factors contain a zinc-finger domain in the amino-terminal region and a YABBY domain in the carboxyl-terminal region.

Orthologous and Paralogous Gene Study in CAMTA Transcription Factors
In the present study, a comparative analysis was performed to identify orthologs of CAMTA genes between the Arabidopsis, maize and tomato genomes (Fig. 6). Based on the results, AtCAMTA5 with AtCAMTA6, AtCAMTA2 with SlCAMTA3.1, and SlCAMTA4 with SlCAMTA4.1 showed high similarity (identity above 70%), resulting in a total of three orthologous gene pairs. The synteny analysis showed that ZmCAMTA1 and ZmCAMTA3 were paralogous. The synteny analysis also suggested segmental and tandem duplication as a major force for the diversity of the CAMTA gene family, revealing the structural and functional conservation of the genes underlying the origins of evolutionary novelty. Further, orthologs with similar functions have conserved domains [31].
No duplicated CAMTA genes were identified from the synteny analysis in maize, implying different expansion types of this gene family between maize and the dicots Arabidopsis and tomato, but AtCAMTA and SlCAMTA3.1 showed high similarity. The orthologous genes among Arabidopsis, maize, and tomato suggested that whole-genome duplication (polyploidy) plays an important role in the expansion of CAMTA genes. In this pathway, gene duplication consisted of chromosomal duplications.
In addition, chromosomal/segmental duplication was detected in the Arabidopsis, maize and tomato genomes. Orthologous genes often have a similar function among different species; therefore, the study of evolutionary genomics can shed light on gene function. Analysis of the CAMTA genes showed that whole-genome duplication, tandem duplication, and chromosomal/segmental duplication play an important role in tomato genome expansion. However, the number of tandem/segmental duplications indicates that they are the main factors in the evolution of CAMTA genes. Given the main role of these three species as model plants, their genomes provide a new resource for use in breeding. Also, three orthologous gene pairs were identified between A. thaliana and Z. mays.

Analysis of Expression Pattern of CAMTA Transcription Factors during developmental stages
The expression patterns of CAMTA genes in Arabidopsis, tomato and maize during developmental stages were analyzed (Fig. 7a, 7b and 7c, respectively). Among the six AtCAMTA transcription factors, AtCAMTA5 and AtCAMTA6 were up-regulated in almost all developmental stages of Arabidopsis, such as senescence and young rosette (Fig. 7a). In tomato (Fig. 7b), SlCAMTA3 was mostly down-regulated, while up-regulation of SlCAMTA2, SlCAMTA3, SlCAMTA3.1, SlCAMTA4, SlCAMTA4.1, and SlCAMTA6 was observed during developmental stages. CAMTA TFs are regulated in many processes of plant growth and development, especially during light-mediated processes such as flowering, maturation, embryo development, and differentiation and expansion [33]. In maize, ZmCAMTA1 and ZmCAMTA2 were up-regulated in all developmental stages. It has been reported that some ZmCAMTA genes are expressed in different tissues. CAMTA genes that are highly expressed in particular plant organs are crucial for the functioning or development of those specific organs. The StCAMTA11 and StCAMTA28 genes were up-regulated in root, shoot, and inflorescence. It has been reported that fewer differentially expressed CAMTA genes were found in soybean roots than in soybean leaves [35]. CAMTAs have been shown to be extensively involved in plant growth and developmental regulation, as well as in biotic and abiotic stress tolerance [34]. In Arabidopsis, CAMTA1 and CAMTA2 function in concert with CAMTA3 to bind directly to the promoter of C-repeat binding factor 2 (CBF2) and induce its expression, leading to increased freezing tolerance. AtCAMTA3 can act as a negative regulator of plant immunity, modulating pathogen defense responses through EDS1-mediated salicylic acid (SA) signaling.
A recent study showed that TaCAMTA4 may function as a negative regulator of the defense response against Puccinia triticina, since the virus-induced gene silencing (VIGS)-based knockdown of TaCAMTA4 resulted in the enhanced resistance to P. triticina [18]. Our results suggest that one CAMTA member usually participates in multiple signaling pathways, while multiple CAMTA members often work together to participate in one signaling pathway.

Conclusions
In conclusion, 17 CAMTA genes were identified in Arabidopsis, tomato, and maize in the present study. Analysis of the gene structures, protein domains, biochemical properties, and the phylogenetic tree indicated that the CAMTA gene family was highly conserved during plant evolution. Expression analysis showed that all 17 CAMTA genes were expressed in multiple tissues at different expression levels, suggesting that the various CAMTA gene members maintain different functions in growth and development.
The prediction of cis-elements showed that AP2/ERF elements are present in all the CAMTA genes, suggesting that these genes could respond to at least one abiotic stress or to multiple stresses and implying different regulation and functions of CAMTA gene members in coping with various abiotic stresses. CAMTA genes in the Arabidopsis, tomato, and maize genomes were predicted to be potential target genes of CAMTA, demonstrating that CAMTA can be widely involved in plant development and growth, as well as in coping with stresses. Our findings provide new insight into the CAMTA gene family in the three species as well as a foundation for further studies on the roles of CAMTA genes in plant development, growth and stress response. This study will help to identify novel CAMTA genes for future breeding programs to improve plant production, quality and stress resistance, and opens a new way to further elucidate their roles in signal transduction among these plant species.

The exon-intron structure of A. thaliana, Z. mays, and S. lycopersicum CAMTA genes according to their phylogenetic relationships. Yellow and blue colors represent exons and introns, respectively.

Orthologous and paralogous relationships of CAMTA genes in the Arabidopsis, maize, and tomato genomes, visualized with Circos.