Genome-wide identification, characterization and expression profiles of heavy metal ATPase3 (HMA3) in plants

HMA (heavy metal associated) proteins are members of the ATPase protein family involved in metal transport in plants. This study characterizes several HMA3 homologs and infers their molecular functions in different plant species. Arabidopsis AtHMA3 (AT4G30120) was used as a reference to retrieve 11 HMA3 homologs with 97-100% query cover, 535-542 residues, a molecular weight of 56983 to 58642 Da, a pI of 5.74 to 8.16, an instability index of 29.10 to 33.89, and a grand average of hydropathicity of 0.222 to 0.380. Topological analyses showed 4 transmembrane domains in these HMA3 homologs, positioned similarly in terms of cytoplasmic and non-cytoplasmic regions, along with ~22-28% α-helices, ~22-28% extended strands, and ~50% random coils. The HMA3 homologs of Arabidopsis lyrata subsp. lyrata and Eutrema salsugineum are located on chromosome 2, while the others are positioned on chromosome 4. All these HMA3 homologs are predicted to localize to the plasma membrane and share a few common biological and molecular functions. Besides, these HMA3 genes contain 8-9 exons, and the predicted promoter positions vary among the homologs. The cis-acting elements of HMA3 genes were predicted to be involved in stress response, anaerobic induction, and light-responsive regulation in plants. Three out of five motifs encode E1-E2_ATPase, which is involved in proton pumping in the plasma membrane. The Arabidopsis thaliana HMA3 protein clustered with those of Camelina sativa and Capsella rubella, indicating a close phylogenetic relationship. Also, AtHMA3 exhibits a close association with MTPA2, ZAT, NRAMP3, IRT2, and NRAMP2, and the local network of AtHMA3 is linked to metal transport. Further, AtHMA3 shows the most expression potential during senescence, germinating seed, seedling, young rosette, bolting, and young flower stages. In addition, AtHMA3 showed a significant upregulation (>6.0 fold) under Fe-deficiency. These findings may provide essential background for wet-lab experiments to understand the role of HMA3 in metal homeostasis.


Introduction
Heavy metals are abundant in nature due to natural and anthropogenic causes. Humans take up heavy metals through water and food, which may cause serious health problems. Many metals, such as Fe, Cu, and Zn, are essential for plants but need to be kept at an optimal level. In contrast, some heavy metals (Pb, Cd) are highly toxic to plants, hampering photosynthesis, nutrient uptake, and yield 1 . In Arabidopsis, HMA1-HMA4 and HMA5-HMA8 proteins belong to the cluster 1 (Zn/Cd/Co/Pb) and cluster 2 (Cu/Ag) groups, respectively 3 . In particular, HMA3 proteins participate in heavy metal ion transport and detoxification in plants. In Arabidopsis thaliana, AtHMA3, localized in the tonoplast, is involved in the vacuolar storage of Cd 5 . Furthermore, Arabidopsis overexpressing AtHMA3 showed increased tolerance to Cd, Zn, Pb, and Co, while an AtHMA3-knockout mutant exhibited sensitivity to Cd and Zn 6 . Similarly, the overexpression of SaHMA3h in tobacco improved the Cd accumulation and tolerance of transgenic plants 7 . Further, BjHMA3 is shown to be associated with the varied Cd accumulation in leaves of Brassica rapa 8 . Again, ectopic overexpression of OsHMA3 resulted in increased Cd tolerance and lower Cd concentration in leaves and grains but increased Cd concentration in rice roots 9,10 .
Features of protein topology, such as transmembrane helices, domains, and binding sites, are crucial for metal binding capacity in plant systems. As a result, the identification of metal sites, along with the components of transcriptional and post-translational regulation, eventually determines the functions of a protein in response to metals. One of the identified candidate genes in Arabidopsis halleri was AhHMA3, which is highly similar to HMA3 in Arabidopsis thaliana (AT4G30120) 11 . HMA3, located in the vacuolar membrane, participates in vacuolar sequestration of Zn, Cd, Co, and Pb in Arabidopsis 12 . However, the function of HMA3 in hyperaccumulators remains unclear in planta. Although the molecular functions of Arabidopsis HMA3 are relatively well established, HMA3 homologs and their interactions with other transporters/genes have barely been studied.
The characterization of HMA3 has considerable potential to improve our understanding of metal homeostasis in plants.
The in silico characterization of HMA3 homologs may provide in-depth insight into these genes/proteins. In this study, we searched for HMA3 homologs across different plant species using Arabidopsis heavy metal ATPase 3 (AtHMA3, AT4G30120) as the reference. The CDS, mRNA, and protein sequences of these HMA3 homologs were analyzed computationally with bioinformatics software and online platforms.
Secondary structure prediction showed that all HMA3 proteins contain ~22-28% α-helices, ~22-28% extended strands, and ~50% random coils (Table 1).

Retrieval
Localization and functional annotation of HMA3 proteins
The HMA3 homologs of Arabidopsis lyrata subsp. lyrata (XP_020886284.1) and Eutrema salsugineum (XP_006409084.2) are located on chromosome 2; the rest of the HMA3 homologs are positioned on chromosome 4 (Table 2). All of these HMA3 protein homologs are associated with E1-E2 ATPase (PF00122). The CELLO localization predictor showed that these HMA3 proteins are localized in the plasma membrane of roots in all 11 plant species (Table 2). Ontology analysis demonstrated that the HMA3 proteins of Arabidopsis thaliana, Camelina sativa, Capsella rubella, Eutrema salsugineum, Brassica oleracea var. oleracea, Raphanus sativus, Brassica napus and Brassica rapa share several cellular components (vacuolar membrane, plasma membrane, and membrane) and are involved in the same biological processes (cation transport, metal ion transport) and molecular functions (nucleotide-binding, ATP binding, ATPase activity, hydrolase activity, metal ion binding). In addition, the HMA3 proteins of Arabidopsis lyrata subsp. lyrata and Eutrema salsugineum showed the same cellular components (vacuolar membrane, plasma membrane, membrane), biological processes (transition metal ion transport, cation transport, zinc ion transport, cadmium ion transport, response to cadmium ion) and molecular functions (nucleotide-binding, ATP binding, ATPase activity, hydrolase activity, metal ion binding, metal ion transmembrane transporter activity, cadmium-transporting ATPase activity). Lastly, the HMA3 protein of Tarenaya hassleriana showed unique cellular components (plasma membrane, membrane, integral to membrane) and biological processes (ATP biosynthetic process, cation transport, metabolic process, metal ion transport, zinc ion homeostasis) but molecular functions similar to those of Arabidopsis lyrata subsp. lyrata and Eutrema salsugineum (Table 2).
Gene organization
ARAMEMNON analysis showed the presence of 8-9 exons among the HMA3 gene homologs, located at different positions within a gene range of 1-3535 base pairs (Fig. 1, Table 3). Promoter analysis yielded both marginal and highly likely predictions of promoter position in the gene sequences of the HMA3 homologs across the 11 plant species. The Arabidopsis thaliana HMA3 showed three marginally predicted promoter positions at 200, 700, and 1800 bp (Table 3). Highly likely promoter positions are located at 2200 bp, 2100 bp and 2500 bp in XM_006303850.2 (Capsella rubella), XM_006412685.2 (Eutrema salsugineum), and XM_010550291.1 (Tarenaya hassleriana), respectively (Table 3). The position of the TSS, where found, varied from 22-36 bp. Also, the PolA, where found, was positioned after the coding region in all HMA3 genes, at 2369-4074 bp (Table 3). The identified cis-acting elements were stress, hormone, and other responsive factors. Stress-responsive, anaerobic induction, and light-responsive regulators were the most numerous and most common cis-acting elements in HMA3 genes, which suggests that these genes are involved in these activities (Table 4).

Conserved motif, Sequence similarities, and phylogenetic analysis
We used the MEME tool to search for the five most conserved motifs in the 11 identified HMA3 homologs (Fig. 2). All five motifs are 50 residues long and were found at 11 sites. These motifs are as follows: motif 1 (INLNGYIKVKTTALARDCVVAKMTKLVEEAQKSQTKTQRFIDKCSRYYTP), motif 2 (HPMAAALIDYARSVSVEPKPDMVENFQNFPGEGVYGRIDGQDIYIGNKRI), motif 3 (NLSHWFHLALVVLVSGCPCGLILSTPVATFCALTKAATSGFLIKTGDCLE), motif 4 (KALNQARLEASVRPYGETSLKSQWPSPFAVVSGVLLALSFLKYFYSPLEW) and motif 5 (CMZDYTEAATIVFLFSVADWLESSAAHKASTVMSSLMSLAPRKAVIAETG). Motifs 1, 3 and 5 encode pfam_fs: E1-E2_ATPase. Motif 4 is linked to freq_pat:PKC_PHOSPHO_Site, while motif 2 shows no information (Fig. 2). The HMA3 protein homologs were aligned to check the sequence similarities across the plant species. The HMA3 proteins showed 71.7% to 100% similarity among the different plant species, with the consensus sequence ranging from 70% to 100% (Supplementary Fig. S.2). The phylogenetic tree was clustered into four groups (A, B, C, D) based on tree topologies (Fig. 3). In cluster A, HMA3 of Arabidopsis thaliana formed a cluster with those of Camelina sativa and Capsella rubella, while group B consists of the HMA3 protein homologs of Brassica oleracea var. oleracea, Raphanus sativus, Brassica napus and Brassica rapa. The HMA3 of Eutrema salsugineum clustered alone, at a distance from the Arabidopsis thaliana homolog. Cluster D consists of Arabidopsis lyrata subsp. lyrata HMA4, Eutrema salsugineum HMA4 and Tarenaya hassleriana HMA3 (Fig. 3). In this phylogenetic tree, HMA3 of Arabidopsis thaliana, Brassica oleracea var. oleracea, Raphanus sativus, Brassica napus and Brassica rapa showed the highest bootstrap value of 100% (Fig. 3).
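The similarity percentages reported for the aligned homologs can be illustrated with a minimal sketch of pairwise percent identity. It assumes one common convention (matches divided by the number of alignment columns in which neither sequence has a gap), which may differ from the convention used by the alignment tool in this study. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (hypothetical; not taken from the HMA3 alignment).
# Nine of the ten columns match.
print(percent_identity("INLNGYIKVK", "INLNGYLKVK"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different tools are not always directly comparable.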

Predicted interaction partner analysis
Interactome analysis was performed for AtHMA3 (AT4G30120) on the STRING server. STRING showed five closely associated putative interaction partners of AtHMA3. These include MTPA2 (metal tolerance protein A2), ZAT (a member of the zinc transporter and cation diffusion facilitator), NRAMP3 (natural resistance-associated macrophage protein 3), IRT1 (iron-regulated transporter 2) and NRAMP4 (natural resistance-associated macrophage protein 4) genes (Fig. 4). Further, the analysis showed four local network clusters, including CL:28166 (nickel transport and cation efflux protein), CL:28164 (manganese ion transport and nickel transport), CL:28176 (ion influx/efflux at the host-pathogen interface), and CL:28126 (transition metal ion transmembrane transporter). Lastly, reactome pathways of AtHMA3 include ion influx/efflux at host-pathogen interface, zinc efflux and compartmentalization by the SLC30 family, peptide hormone metabolism, insulin processing, and metal ion SLC transporters (Fig. 4).

Expression profiles of HMA3
The Genevestigator analysis against the Affymetrix Array Platforms showed the expression potential and co-expression data of HMA3 in different anatomical parts, developmental stages, and perturbations. Among the anatomical parts, lateral roots, cauline leaf, silique inflorescence, and radicle in shoot apex showed high potential for HMA3 expression (Fig. 5a). Further, giant root cells and sperm cells in cell culture have the potential for HMA3 expression (Fig. 5a). Besides, HMA3 has expression potential during the senescence, germinated seed, seedling, young rosette, bolting, and young flower stages of development (Fig. 5a). Among the selected stresses, HMA3 showed significant upregulation only under Fe deficiency, while its expression did not vary notably under other stresses, such as anoxia, cold, drought, gamma irradiation, genotoxicity, heat, osmotic stress, salt stress, shift cold stress, submergence stress, and wounding stress (Fig. 5c).
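The fold-change threshold used for the perturbation data can be sketched as follows, assuming that responses are expressed as log2 ratios of treated versus control signal. The function names and the log2 values below are hypothetical placeholders, not data from this study.

```python
def fold_change(log2_ratio: float) -> float:
    """Convert a log2 expression ratio (treated / control) to a linear fold change."""
    return 2.0 ** log2_ratio


def is_upregulated(log2_ratio: float, threshold_fold: float = 6.0) -> bool:
    """True when the linear fold change exceeds the chosen threshold."""
    return fold_change(log2_ratio) > threshold_fold


# Hypothetical log2 ratios for three perturbations (illustrative only).
perturbations = {"Fe deficiency": 2.8, "cold": 0.2, "drought": -0.1}
for name, ratio in perturbations.items():
    print(name, round(fold_change(ratio), 2), is_upregulated(ratio))
```

On this scale, a 6-fold increase corresponds to a log2 ratio of about 2.58, so small differences in the log2 value near the threshold change the classification.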

Discussion
In recent years, the characterization of membrane transporters involved in heavy metal scavenging in plants has been an emerging field. Prior to wet-lab experiments, in silico analysis is of utmost interest for narrowing down the targets of study. The role of AtHMA3 in the vacuolar storage of a few metals is documented, although the involvement of other metals, the dynamic network of associated genes, and the gene/protein properties are yet to be studied extensively. This in silico characterization and expression profiling of AtHMA3, its homologs, and closely associated genes unveils significant regulatory findings that can contribute to downstream genome-editing or biotechnological approaches to heavy metal studies.
In this study, a selective BLAST search with the AtHMA3 sequences yielded 11 different HMA3 protein homologs with 71.73-100% identity. The similarities in protein size, pI, instability index and hydropathicity suggest that these HMA3 proteins are biochemically related. Domain analysis further revealed the association of these HMA3 protein homologs with E1-E2 ATPase (PF00122) localized in the plasma membrane. The proton-pumping ATPase (H+-ATPase) in the plasma membrane produces the proton motive force across the plasma membrane that is required for much of the transport of ions and metabolites 13 . Although the HMA3 homologs showed diverse cellular components, the shared components were limited to membrane, vacuolar membrane and plasma membrane. These cellular components are crucial in mineral absorption and metal homeostasis, along with salt tolerance, intracellular pH regulation and cellular expansion in plants 13,14 . However, HMA3 proteins are predominantly associated with cation transport, metal ion transport, zinc ion transport and cadmium ion transport, as evident from our ontology analysis. The HMA3 gene is involved in cadmium and lead transport, with potential for vacuolar sequestration, in a heterologous system, but not in zinc transport. Vacuolar sequestration may have a detoxification function 15 .
In predicting evolutionary relationships and functional genomics possibilities, knowledge of the position and organization of the coding sequence of a gene is considered a critical factor. In this study, all of the identified HMA3 protein sequences demonstrated 4 transmembrane helices, confirming the similar hydropathy of these HMA3 protein homologs. The metal specificity of each subclade is determined by specific amino acids in the three transmembrane helices closest to the C-terminus 16 . In this study, all HMA3 gene homologs belonging to the 11 plant species showed 8-9 exons, suggesting that these HMA3 genes are evolutionarily close to each other. Although promoter analysis predicted several promoter regions for each HMA3 gene, the highly likely promoter predictions were made at 2100-2500 bp in several plant species. The localization of exons and promoters plays an essential part in CRISPR-Cas9 and other genome editing studies in plant science. In addition, the identification of the TSS and PolA in HMA3 homologs will be crucial in understanding the transcriptional and translational genomics. Besides, promoter analysis reveals the involvement of cis-acting elements associated with stress response, hormone, anaerobic induction, and light-responsive regulators in HMA3 genes.
Conserved motifs are identical sequences across species that are maintained by natural selection. A highly conserved sequence is likely to have functional roles in plants and can be a useful starting point for research on a particular topic of interest 17 . Out of the five motifs, three mainly matched the E1-E2_ATPase associated with H+ pumping. P-type proton ATPase is found in the plasma membranes of plants, where it drives secondary active transport processes across the membrane 18 . One of the motifs is also linked to the protein kinase C phosphorylation site, which may play roles in controlling the catalytic activity, stability and intracellular localization of the enzyme 19 . Further, the phosphorylation site may be attributed to the release of Zn from intracellular stores, leading to the phosphorylation of kinases and the activation of signaling pathways 20 . The presence of common and long-preserved residues suggests that HMA3 homologs between species may have highly conserved structures. Additionally, this information can be targeted for sequence-specific binding site and transcription factor analysis. In the phylogenetic analysis, the HMA3 protein of A. thaliana was positioned in the same cluster as those of C. sativa and C. rubella, suggesting a close evolutionary relationship. Consistently, the HMA3 protein homologs of Brassica sp. and Raphanus sp. clustered within group B, suggesting a close evolutionary emergence from a common ancestor within the Brassicaceae family. It appears that the HMA3 of E. salsugineum is relatively distantly related to that of A. thaliana in evolutionary terms. Thus, our results might infer a functional relationship of HMA3 sequences in metal uptake across different plant species.
The interaction network of a specific gene provides information on all physical associations that can occur among family members. Global gene co-expression analysis is an emerging tool to identify the tissues and the conditions in which significant interactions occur. The interactome map analyzed on the STRING platform showed the closest association with MTPA2, ZAT, NRAMP3, IRT2 and NRAMP2, which are mainly linked to metal transport in plants. Consistently, the local network of AtHMA3 implies an involvement with metal transporters. As a result, these findings might be useful to characterize HMA3 and to interpret the interactions of multiple genes linked to a particular stress of interest in plants. Studies reported that Zn homeostasis is closely associated with P-type ATPase heavy metal transporters (HMA). Again, both HMA2 and HMA4 were reported to be involved in Zn homeostasis in Arabidopsis 21 . Besides, AtHMA3 showed some reactome pathways, among which ion influx, zinc influx and metal ion SLC transporters may be attributed to the metal transporter properties of this gene. Overall, this interactome finding might provide essential background for functional genomics studies of metal uptake and transport in plants.
The expression potential of a gene in different conditions is a crucial factor in determining its involvement in a particular trait. The in silico expression analysis on the Genevestigator platform showed interesting outputs concerning the expression of AtHMA3 (AT4G30120) in different anatomical parts, perturbations, and developmental stages. Consistent with the AtHMA3 gene ontology, Genevestigator showed that the root is the major location where this gene has expression potential. In a wet-lab experiment, root-specific expression of HMA3 was reported in rice 22 . Further, AtHMA3 is most potentially expressed during senescence, but germinated seeds, seedlings, young rosette, bolting and young flower stages also possess significant potential for AtHMA3 expression. Interestingly, AtHMA3 showed a significant upregulation (>6.0 fold) in response to Fe-deficiency. Until now, HMA3 has been known to be induced by heavy metals in several plant species 23,24 . Nevertheless, our results suggest that AtHMA3 is a potential gene that could contribute to Fe-deficiency tolerance in plants.

Conclusion
This in silico work identifies and characterizes 11 HMA3 homologs, one from each of 11 plant species. The analysis showed similar physicochemical properties, gene organization, and conserved motifs related to metal transport. The identified cis-acting elements were linked to stress response, hormone, and other responsive factors. Sequence homology and the phylogenetic tree showed that Arabidopsis HMA3 has its closest evolutionary relationship with Camelina sativa and Capsella rubella. In addition, the interactome map displayed some partner genes of AtHMA3 involved in metal transport in plants. It was also predicted that AtHMA3 is expressed in root tissue and during senescence and is significantly upregulated in response to Fe-deficiency. These findings will provide basic theoretical knowledge for downstream studies on HMA3 function and characterization related to metal homeostasis in various plants.

Materials And Methods
Retrieval of HMA3 genes/proteins
The AtHMA3 gene, named AT4G30120 in the UniProt/Aramene database (protein accession: NP_194741.2 and gene accession: NM_119158.4), was obtained from NCBI to use as a reference for the homology search 25 . The search was filtered to match records with an expect value between 0 and 0. The corresponding FASTA sequences of the genes and proteins were retrieved from the NCBI database. During filtering, one accession for each species was selected for analysis.
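The per-species filtering step can be sketched as below. This is only an illustration under stated assumptions: the record fields, the accession labels, and the tie-breaking rule (keep the hit with the highest percent identity) are hypothetical, since the text does not specify how the single accession per species was chosen. The 97% query-cover cutoff mirrors the range reported for the retrieved homologs.

```python
from typing import Dict, List


def pick_one_per_species(hits: List[dict], min_query_cover: float = 97.0) -> Dict[str, dict]:
    """Keep a single hit per species among hits passing the query-cover cutoff.

    Assumption: when a species has several hits, the one with the highest
    percent identity is kept (the source does not state the actual rule).
    """
    best: Dict[str, dict] = {}
    for hit in hits:
        if hit["query_cover"] < min_query_cover:
            continue  # discard partial matches
        current = best.get(hit["species"])
        if current is None or hit["identity"] > current["identity"]:
            best[hit["species"]] = hit
    return best


# Hypothetical hit table; accessions and values are placeholders.
hits = [
    {"species": "Camelina sativa", "accession": "ACC_1", "query_cover": 100.0, "identity": 90.1},
    {"species": "Camelina sativa", "accession": "ACC_2", "query_cover": 99.0, "identity": 88.4},
    {"species": "Brassica rapa", "accession": "ACC_3", "query_cover": 80.0, "identity": 95.0},
    {"species": "Brassica rapa", "accession": "ACC_4", "query_cover": 98.0, "identity": 75.2},
]
selected = pick_one_per_species(hits)
for species, hit in selected.items():
    print(species, hit["accession"])
```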

Analyses of HMA3 genes/proteins
Physico-chemical features of HMA3 protein sequences were analyzed by the ProtParam tool (https://web.expasy.org/protparam) as previously described 26 . Multiple sequence alignments of HMA3 proteins were performed with Clustal Omega to identify conserved residues. Furthermore, the five conserved protein motifs were characterized by MEME Suite 5.1.1 (http://meme-suite.org/tools/meme) with default parameters, except that the maximum number of motifs to find was set to five 33 . Motifs were further scanned with the MyHits (https://myhits.sib.swiss/cgi-bin/motif_scan) web tool to identify matches with different domains 34 . The phylogenetic tree was constructed in MEGA (v. 6.0) with the maximum likelihood (ML) method and 1000 bootstrap replicates, using 11 HMA3 homologs from 11 plant species 35 .
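As an illustration of one of the physico-chemical features above, the grand average of hydropathicity (GRAVY) is the sum of the Kyte-Doolittle hydropathy values of all residues divided by the sequence length. The sketch below is a plain re-implementation for illustration, not the ProtParam code itself. The function name is hypothetical, and the example input is motif 1 from this study rather than a full-length HMA3 protein.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        # Ambiguity codes such as B, Z or X have no defined hydropathy value.
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Motif 1 from this study, used only as a short example input.
motif_1 = "INLNGYIKVKTTALARDCVVAKMTKLVEEAQKSQTKTQRFIDKCSRYYTP"
print(round(gravy(motif_1), 3))
```

A positive GRAVY value indicates a predominantly hydrophobic sequence and a negative value a hydrophilic one. Sequences containing ambiguity codes (for example the Z in motif 5) are rejected by this sketch rather than scored.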
Interactions and co-expression of HMA3 protein
The interactome network of HMA3 protein was generated using the STRING server (http://string-db.org) and visualized in Cytoscape 36 . Additionally, the expression data of Arabidopsis HMA3 were retrieved from the Genevestigator software. Expression and co-expression associations of HMA3 were analyzed across different anatomical parts, developmental stages, and perturbations based on the Affymetrix Array Platforms (AT_AFFY-ATH1-0).
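For readers who prefer a scripted query over the web interface, the following sketch builds a request URL for the public STRING REST API. It is an assumption that this route would give results comparable to the web server analysis used here. The endpoint layout and parameter names follow the STRING API documentation as we understand it and should be checked against the current version. The helper name is hypothetical, and 3702 is the NCBI taxonomy identifier for Arabidopsis thaliana.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"


def string_partners_url(identifier: str, species: int = 3702,
                        limit: int = 5, fmt: str = "tsv") -> str:
    """Build a STRING 'interaction_partners' request URL (no network call is made)."""
    params = {
        "identifiers": identifier,  # query gene or protein, e.g. a locus ID
        "species": species,         # NCBI taxonomy ID (3702 = Arabidopsis thaliana)
        "limit": limit,             # maximum number of partners to return
    }
    return f"{STRING_API}/{fmt}/interaction_partners?{urlencode(params)}"


url = string_partners_url("AT4G30120")
print(url)
```

The URL can then be fetched with any HTTP client, and the tab-separated response lists the predicted partners with their combined scores.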

Declarations
Ethics approval and consent to participate
We confirm that our study does not involve human subjects.

Availability of data and materials
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Competing interests
The author(s) declare no competing interests.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.