Retrieval of HMA3 transporter genes/proteins
Arabidopsis AtHMA3 was searched in the NCBI database to obtain the FASTA sequences of the protein (NP_194741.2) and mRNA (NM_119158.4). BLAST analysis of the AtHMA3 protein identified 11 homologs of the heavy metal ATPase 3 family after filtering (E-value: 0.0; query cover: 97-100%; percentage identity: 71.73-100%) in 11 plant species, which include Arabidopsis thaliana, Camelina sativa, Capsella rubella, Eutrema salsugineum, Brassica oleracea var. oleracea, Raphanus sativus, Brassica napus, Brassica rapa, Arabidopsis lyrata subsp. lyrata, Eutrema salsugineum and Tarenaya hassleriana (Table 1).
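The filtering step can be illustrated with a minimal sketch. It assumes that the ranges reported above (E-value 0.0, query cover of at least 97%, identity of at least 71.73%) were applied as cut-offs, which is an interpretation rather than a documented setting. The `BlastHit` record, the `filter_hits` helper, and the two placeholder hits are hypothetical; only the accession NP_194741.2 comes from the text.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One row of a BLAST hit table (hypothetical structure for illustration)."""
    accession: str
    evalue: float
    query_cover: float  # percent
    identity: float     # percent


def filter_hits(hits, max_evalue=0.0, min_cover=97.0, min_identity=71.73):
    """Keep hits that satisfy all three thresholds (assumed cut-offs)."""
    return [
        h for h in hits
        if h.evalue <= max_evalue
        and h.query_cover >= min_cover
        and h.identity >= min_identity
    ]


# Illustrative input: the numeric values of the placeholder hits are invented.
hits = [
    BlastHit("NP_194741.2", 0.0, 100.0, 100.0),
    BlastHit("PLACEHOLDER_1", 0.0, 98.0, 85.2),
    BlastHit("PLACEHOLDER_2", 1e-120, 90.0, 60.0),
]

kept = filter_hits(hits)
print([h.accession for h in kept])
```

In this sketch the third placeholder hit is discarded because it fails all three criteria, whereas a hit must pass every criterion to be retained.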
Physicochemical features and localization of HMA3 proteins
The 11 HMA3 protein homologs ranged from 525 to 542 amino acids in length, with molecular weights of 56983.36 to 58642.37 Da, pI values of 5.74 to 8.16, instability indices of 29.10 to 33.89, and grand average of hydropathicity (GRAVY) values of 0.222 to 0.380 (Table 1). Topological prediction of transmembrane (TM) domains showed 4 transmembrane domains in the representative HMA3 protein from each plant species (Supplementary Fig. S.1). None of the HMA3 protein homologs contains a signal peptide. These HMA3 proteins showed similar positioning of cytoplasmic and non-cytoplasmic regions (Supplementary Fig. S.1). In addition, secondary structure prediction showed that all HMA3 proteins contain ~22-28% α-helices, ~22-28% extended strands, and ~50% random coils (Table 1).
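For readers unfamiliar with these indices, the sketch below shows how a grand average of hydropathicity is conventionally obtained: the Kyte-Doolittle hydropathy values of all residues are summed and divided by the sequence length, so that positive values indicate an overall hydrophobic protein. It also applies the commonly used convention that an instability index below 40 predicts a stable protein. The function names and the example peptide are hypothetical and are not taken from the tools used in this study.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues.

    Raises KeyError for non-standard residue codes and ValueError
    for an empty sequence.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def is_predicted_stable(instability_index: float, threshold: float = 40.0) -> bool:
    """Apply the usual convention: an index below 40 predicts a stable protein."""
    return instability_index < threshold


# Hypothetical short peptide, used only to show the call.
example_peptide = "MAVLIK"
print(round(gravy(example_peptide), 3))

# Both ends of the instability index range reported above fall below 40.
print(is_predicted_stable(29.10), is_predicted_stable(33.89))
```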
Localization and functional annotation of HMA3 proteins
HMA3 of Arabidopsis lyrata subsp. lyrata (XP_020886284.1) and Eutrema salsugineum (XP_006409084.2) is located on chromosome 2, whereas the rest of the HMA3 homologs are positioned on chromosome 4 (Table 2). All of these HMA3 protein homologs are associated with the E1-E2 ATPase domain (PF00122). The CELLO localization predictor showed that these HMA3 proteins are localized in the plasma membrane of roots in all 11 plant species (Table 2). Ontology analysis demonstrated that the HMA3 proteins of Arabidopsis thaliana, Camelina sativa, Capsella rubella, Eutrema salsugineum, Brassica oleracea var. oleracea, Raphanus sativus, Brassica napus and Brassica rapa are assigned to several cellular components (vacuolar membrane, plasma membrane, and membrane) and share the same biological processes (cation transport, metal ion transport) and molecular functions (nucleotide-binding, ATP binding, ATPase activity, hydrolase activity, metal ion binding). In addition, HMA3 of Arabidopsis lyrata subsp. lyrata and Eutrema salsugineum showed the same cellular components (vacuolar membrane, plasma membrane, membrane), biological processes (transition metal ion transport, cation transport, zinc ion transport, cadmium ion transport, response to cadmium ion) and molecular functions (nucleotide-binding, ATP binding, ATPase activity, hydrolase activity, metal ion binding, metal ion transmembrane transporter activity, cadmium-transporting ATPase activity). Lastly, HMA3 of Tarenaya hassleriana showed unique cellular components (plasma membrane, membrane, integral to membrane) and biological processes (ATP biosynthetic process, cation transport, metabolic process, metal ion transport, zinc ion homeostasis), but molecular functions similar to those of Arabidopsis lyrata subsp. lyrata and Eutrema salsugineum (Table 2).
ARAMEMNON analysis showed the presence of 8-9 exons among the HMA3 gene homologs, located at different positions within the gene, ranging from 1 to 3535 base pairs (Fig. 1, Table 3). Promoter analysis yielded marginal and highly likely predictions of promoter position in the gene sequences of the different HMA3 homologs across the 11 plant species. Arabidopsis thaliana HMA3 showed three marginally predicted promoter positions, at 200, 700, and 1800 bp (Table 3). Highly likely promoter positions were located at 2200 bp, 2100 bp and 2500 bp in XM_006303850.2 (Capsella rubella), XM_006412685.2 (Eutrema salsugineum), and XM_010550291.1 (Tarenaya hassleriana), respectively (Table 3). Where found, the position of the TSS varied from 22 to 36 bp. Also, the PolA was positioned after the coding region in all HMA3 genes, at 2369-4074 bp where found (Table 3). The identified cis-acting elements comprised stress-, hormone-, and other responsive factors. Stress-responsive, anaerobic-induction, and light-responsive regulators were the most numerous and most common cis-acting elements in HMA3 genes, suggesting the involvement of HMA3 in these responses (Table 4).
Conserved motifs, sequence similarities, and phylogenetic analysis
We used the MEME tool to search for the five most conserved motifs in the 11 identified HMA3 homologs (Fig. 2). All five motifs are 50 residues long and were found at 11 sites. These motifs are as follows: motif 1 (INLNGYIKVKTTALARDCVVAKMTKLVEEAQKSQTKTQRFIDKCSRYYTP), motif 2 (HPMAAALIDYARSVSVEPKPDMVENFQNFPGEGVYGRIDGQDIYIGNKRI), motif 3 (NLSHWFHLALVVLVSGCPCGLILSTPVATFCALTKAATSGFLIKTGDCLE), motif 4 (KALNQARLEASVRPYGETSLKSQWPSPFAVVSGVLLALSFLKYFYSPLEW) and motif 5 (CMZDYTEAATIVFLFSVADWLESSAAHKASTVMSSLMSLAPRKAVIAETG). Motifs 1, 3 and 5 correspond to pfam_fs: E1-E2_ATPase. Motif 4 is linked to freq_pat:PKC_PHOSPHO_Site, while motif 2 returned no annotation (Fig. 2). The HMA3 protein homologs were aligned to check sequence similarity across the plant species. The HMA3 proteins showed 71.7% to 100% similarity among the different plant species, with the consensus sequence ranging from 70% to 100% (Supplementary Fig. S.2). The phylogenetic tree was clustered into four groups (A, B, C, D) based on tree topology (Fig. 3). In cluster A, HMA3 of Arabidopsis thaliana grouped with those of Camelina sativa and Capsella rubella, while group B consists of the HMA3 protein homologs of Brassica oleracea var. oleracea, Raphanus sativus, Brassica napus and Brassica rapa. The HMA3 of Eutrema salsugineum clustered alone, at a distance from the Arabidopsis thaliana homolog. Cluster D consists of Arabidopsis lyrata subsp. lyrata HMA4, Eutrema salsugineum HMA4 and Tarenaya hassleriana HMA3 (Fig. 3). In this phylogenetic tree, HMA3 of Arabidopsis thaliana, Brassica oleracea var. oleracea, Raphanus sativus, Brassica napus and Brassica rapa showed the highest bootstrap value (100%) (Fig. 3).
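The pairwise percentages quoted for the alignment can be understood through a minimal sketch of how identity is typically computed from two rows of an alignment. Definitions differ between tools; the version below assumes that columns containing a gap in either sequence are excluded from the denominator, which may not match the convention of the alignment software used here. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither row has a gap.

    Both inputs must be rows of the same alignment and therefore equal in length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment rows, invented for illustration only.
row_a = "MKT-AL"
row_b = "MKTSAI"
print(percent_identity(row_a, row_b))
```

Counting gapped columns as mismatches instead would lower the reported percentage, so the chosen convention should be stated whenever identity values are compared across studies.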
Predicted interaction partner analysis
Interactome analysis was performed for AtHMA3 (AT4G30120) on the STRING server. STRING showed five closely associated putative interaction partners of AtHMA3. These include MTPA2 (metal tolerance protein A2), ZAT (a member of the zinc transporter and cation diffusion facilitator family), NRAMP3 (natural-resistance-associated macrophage protein 3), IRT1 (iron-regulated transporter 2) and NRAMP4 (natural-resistance-associated macrophage protein 4) (Fig. 4). Further, the analysis showed four local network clusters: CL:28166 (nickel transport and cation efflux protein), CL:28164 (manganese ion transport and nickel transport), CL:28176 (ion influx/efflux at the host-pathogen interface), and CL:28126 (transition metal ion transmembrane transporter). Lastly, the Reactome pathways of AtHMA3 include ion influx/efflux at the host-pathogen interface, zinc efflux and compartmentalization by the SLC30 family, peptide hormone metabolism, insulin processing, and metal ion SLC transporters (Fig. 4).
Expression profiles of HMA3
The Genevestigator analysis against the Affymetrix array platforms showed the expression potential and co-expression data of HMA3 across anatomical parts, developmental stages, and perturbations. Among anatomical parts, lateral roots, cauline leaf, silique inflorescence, and radicle in shoot apex showed high potential for HMA3 expression (Fig. 5a). Further, giant root cell and sperm cells in cell culture have the potential for HMA3 expression (Fig. 5a). In addition, HMA3 has expression potential during the senescence, germinated seed, seedling, young rosette, bolting, and young flower stages of development (Fig. 5a). Among the selected stresses, HMA3 showed significant upregulation only under Fe deficiency, while its expression did not vary notably under other stresses, such as anoxia, cold, drought, gamma irradiation, genotoxicity, heat, osmotic stress, salt stress, shift cold stress, submerge stress, and wounding stress (Fig. 5c).
Co-expression analysis was filtered to the five most closely associated genes in each of the anatomical parts, developmental stages and perturbations (Fig. 6). Across anatomical parts, the AtHMA3 gene is closely co-expressed with AT1G30560 (putative glycerol-3-phosphate transporter), AT1G63550 (cysteine-rich repeat secretory protein 9), AT3G60270 (cupredoxin superfamily protein), AT5G43370 (probable inorganic phosphate transporter 1-2) and AT3G12900 (2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein). Across developmental stages, AT3G56891 (heavy metal transport/detoxification superfamily protein), AT5G19040 (adenylate isopentenyltransferase 5, chloroplastic), AT3G45410 (L-type lectin-domain containing receptor kinase I.3), AT3G29250 (short-chain dehydrogenase reductase 4) and AT2G25260 (unknown protein) are co-expressed with AtHMA3 (Fig. 6). Under perturbations, the top five genes co-expressed with AtHMA3 are AT1G64480 (calcineurin B-like protein 8), AT5G17100 (cystatin/monellin superfamily protein), AT5G65980 (auxin efflux carrier family protein), AT2G28690 (protein of unknown function, DUF1635), and AT2G12190 (cytochrome P450 superfamily protein) (Fig. 6).