Identification and sequence analysis of OSCA genes
The BLASTp homology search identified 19, 22, 20, and 13 putative OSCA proteins in Cicer arietinum, Cajanus cajan, Vigna radiata, and Phaseolus vulgaris, respectively. After removal of redundant sequences and isoforms, and confirmation of essential domains by InterProScan, 12 non-redundant OSCA proteins were retained in C. arietinum and 13 in each of C. cajan, V. radiata, and P. vulgaris (Table 1). The OSCA genes in each legume were named according to their homology with the previously identified Arabidopsis OSCA proteins in each clade (CaOSCA1.1 to CaOSCA4.1, CcOSCA1.1 to CcOSCA4.1, VrOSCA1.1 to VrOSCA4.1, PvOSCA1.1 to PvOSCA4.1). All three major protein domains, namely RSN1_TM (PF13967), PHM7_cyt (PF14703), and RSN1_7TM (PF02714), are present in every OSCA protein sequence identified in the four plants in the current study, supporting the reliability of this methodology.
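The domain-confirmation step can be illustrated with a minimal Python sketch. It assumes the Pfam hits have already been reduced to (protein ID, Pfam accession) pairs, for example from the first and fifth columns of an InterProScan TSV file; the candidate IDs and the function name are hypothetical and are not part of the original pipeline.

```python
# Pfam accessions of the three domains required in the text.
REQUIRED_PFAM = {"PF13967", "PF14703", "PF02714"}


def proteins_with_all_domains(hits):
    """Return the sorted IDs of proteins whose hits cover every required domain.

    `hits` is an iterable of (protein_id, pfam_accession) pairs.
    """
    domains_by_protein = {}
    for protein_id, pfam_acc in hits:
        domains_by_protein.setdefault(protein_id, set()).add(pfam_acc)
    return sorted(
        pid for pid, found in domains_by_protein.items()
        if REQUIRED_PFAM <= found  # subset test: all three domains present
    )


# Toy input: "cand_A" and "cand_B" are hypothetical candidate IDs.
toy_hits = [
    ("cand_A", "PF13967"), ("cand_A", "PF14703"), ("cand_A", "PF02714"),
    ("cand_B", "PF13967"), ("cand_B", "PF02714"),  # lacks PHM7_cyt
]
kept = proteins_with_all_domains(toy_hits)
print(kept)
```

Only candidates carrying all three accessions survive the filter; redundancy and isoform removal would be a separate step.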
Table 1
Detailed features of OSCA genes in legumes.
Gene name | NCBI ID | Chromosomal location | mRNA ID | Protein ID | Length (amino acids) | Molecular weight (kDa) | Isoelectric point (pI) | Subcellular localization |
CaOSCA1.1 | LOC101492354 | 7 | XM_004507607.3 | XP_004507664.1 | 773 | 87.99303 | 9.42 | Plasma Membrane |
CaOSCA1.2 | LOC101499643 | 2 | XM_004491080.3 | XP_004491137.1 | 766 | 87.67938 | 8.94 | Plasma Membrane |
CaOSCA1.3 | LOC101499661 | 3 | XM_004493594.3 | XP_004493651.1 | 759 | 86.75613 | 8.57 | Plasma Membrane |
CaOSCA1.4 | LOC101503032 | 8 | XM_004512392.3 | XP_004512449.1 | 798 | 91.88662 | 9.4 | Plasma Membrane |
CaOSCA2.1 | LOC101492212 | 4 | XM_027333420.1 | XP_027189221.1 | 656 | 73.95405 | 8.8 | Plasma Membrane |
CaOSCA2.2 | LOC101508680 | 5 | XM_012716118.2 | XP_012571572.1 | 730 | 82.85833 | 8.65 | Plasma Membrane |
CaOSCA2.3 | LOC101494974 | 3 | XM_004491793.3 | XP_004491850.1 | 727 | 83.14091 | 8.68 | Plasma Membrane |
CaOSCA2.4 | LOC101488482 | 2 | XM_004490060.3 | XP_004490117.1 | 724 | 82.6153 | 8.77 | Plasma Membrane |
CaOSCA2.5 | LOC101503385 | Un | XM_027330896.1 | XP_027186697.1 | 714 | 81.29166 | 8.87 | Plasma Membrane |
CaOSCA2.6 | LOC101497011 | 2 | XM_004489366.3 | XP_004489423.1 | 713 | 81.14172 | 8.71 | Plasma Membrane |
CaOSCA3.1 | LOC101502683 | 1 | XM_004485926.3 | XP_004485983.1 | 722 | 81.80332 | 9.25 | Plasma Membrane |
CaOSCA4.1 | LOC101506041 | 7 | XM_004509346.3 | XP_004509403.1 | 804 | 90.34103 | 7.48 | Plasma Membrane |
CcOSCA1.1 | LOC109809699 | 11 | XM_020373078.2 | XP_020228667.1 | 773 | 87.99864 | 9.13 | Plasma Membrane |
CcOSCA1.2 | LOC109796123 | 3 | XM_020355765.2 | XP_020211354.1 | 764 | 87.50729 | 9.19 | Plasma Membrane |
CcOSCA1.3 | LOC109795678 | 3 | XM_020355168.2 | XP_020210757.1 | 760 | 87.49183 | 8.95 | Plasma Membrane |
CcOSCA1.4 | LOC109797430 | 3 | XM_020357469.2 | XP_020213058.1 | 757 | 85.92665 | 6.98 | Plasma Membrane |
CcOSCA1.5 | LOC109809553 | 11 | XM_020372892.2 | XP_020228481.1 | 805 | 92.39892 | 9.37 | Plasma Membrane |
CcOSCA2.1 | LOC109804967 | 9 | XM_020366896.2 | XP_020222485.1 | 746 | 84.70582 | 8.93 | Plasma Membrane |
CcOSCA2.2 | LOC109797923 | 3 | XM_020358090.2 | XP_020213679.1 | 743 | 83.92439 | 8.95 | Plasma Membrane |
CcOSCA2.3 | LOC109802752 | 7 | XM_020364155.2 | XP_020219744.1 | 706 | 80.30207 | 8.73 | Plasma Membrane |
CcOSCA2.4 | LOC109802953 | 7 | XM_020364386.2 | XP_020219975.1 | 714 | 81.15399 | 9.21 | Plasma Membrane |
CcOSCA2.5 | LOC109805632 | 10 | XM_020367800.2 | XP_020223389.1 | 711 | 81.07141 | 8.93 | Plasma Membrane |
CcOSCA2.6 | LOC109805206 | 9 | XM_020367217.2 | XP_020222806.1 | 711 | 81.31669 | 8.92 | Plasma Membrane |
CcOSCA3.1 | LOC109816407 | Un | XM_020381418.2 | XP_020237007.1 | 722 | 81.48376 | 9.2 | Plasma Membrane |
CcOSCA4.1 | LOC109802334 | 7 | XM_020363655.2 | XP_020219244.1 | 807 | 90.66487 | 6.59 | Plasma Membrane |
VrOSCA1.1 | LOC106775774 | 10 | XM_014662957.2 | XP_014518443.1 | 773 | 88.6626 | 9.22 | Plasma Membrane |
VrOSCA1.2 | LOC106768429 | 7 | XM_014653592.2 | XP_014509078.1 | 775 | 88.11553 | 8.91 | Plasma Membrane |
VrOSCA1.3 | LOC106765145 | 6 | XM_014649655.2 | XP_014505141.1 | 760 | 87.42297 | 9.08 | Plasma Membrane |
VrOSCA1.4 | LOC106757508 | 3 | XM_014640178.2 | XP_014495664.1 | 754 | 85.81765 | 7.26 | Plasma Membrane |
VrOSCA1.5 | LOC106778040 | 11 | XM_014665927.2 | XP_014521413.1 | 863 | 98.64191 | 9.06 | Plasma Membrane |
VrOSCA2.1 | LOC106764810 | 6 | XM_014649210.2 | XP_014504696.1 | 748 | 84.89885 | 8.77 | Plasma Membrane |
VrOSCA2.2 | LOC106762316 | 5 | XM_014646156.2 | XP_014501642.1 | 740 | 83.83188 | 8.65 | Plasma Membrane |
VrOSCA2.3 | LOC106759156 | 4 | XM_014642169.2 | XP_014497655.1 | 715 | 81.90525 | 7.54 | Plasma Membrane |
VrOSCA2.4 | LOC106755328 | Un | XM_014637466.2 | XP_014492952.1 | 714 | 81.64816 | 9.34 | Plasma Membrane |
VrOSCA2.5 | LOC106760545 | 5 | XM_022781245.1 | XP_022636966.1 | 726 | 82.02396 | 9.02 | Plasma Membrane |
VrOSCA2.6 | LOC106759492 | 1 | XM_014642721.2 | XP_014498207.1 | 710 | 80.84627 | 8.84 | Plasma Membrane |
VrOSCA3.1 | LOC106775773 | 10 | XM_014662956.2 | XP_014518442.1 | 728 | 82.38412 | 9.36 | Plasma Membrane |
VrOSCA4.1 | LOC106769200 | 7 | XM_014654710.2 | XP_014510196.1 | 802 | 90.60487 | 6.59 | Plasma Membrane |
PvOSCA1.1 | PHAVU_006G000300g | 6 | XM_007145888.1 | XP_007145950.1 | 773 | 88.73381 | 9.01 | Plasma Membrane |
PvOSCA1.2 | PHAVU_003G081500g | 3 | XM_007153925.1 | XP_007153987.1 | 774 | 88.11558 | 8.83 | Plasma Membrane |
PvOSCA1.3 | PHAVU_008G210700g | 8 | XM_007141550.1 | XP_007141612.1 | 755 | 86.92946 | 8.86 | Plasma Membrane |
PvOSCA1.4 | PHAVU_001G148800g | 1 | XM_007162337.1 | XP_007162399.1 | 753 | 85.72953 | 8.51 | Plasma Membrane |
PvOSCA1.5 | PHAVU_002G133000g | 2 | XM_007158143.1 | XP_007158205.1 | 857 | 97.98081 | 9.16 | Plasma Membrane |
PvOSCA2.1 | PHAVU_001G031900g | 1 | XM_007160900.1 | XP_007160962.1 | 751 | 85.05617 | 8.8 | Plasma Membrane |
PvOSCA2.2 | PHAVU_009G074900g | 9 | XM_007136732.1 | XP_007136794.1 | 678 | 77.04401 | 8.54 | Plasma Membrane |
PvOSCA2.3 | PHAVU_008G039200g | 8 | XM_007139485.1 | XP_007139547.1 | 714 | 81.48652 | 7.81 | Plasma Membrane |
PvOSCA2.4 | PHAVU_002G092500g | 2 | XM_007157656.1 | XP_007157718.1 | 714 | 81.79599 | 9.23 | Plasma Membrane |
PvOSCA2.5 | PHAVU_005G030600g | 5 | XM_007148918.1 | XP_007148980.1 | 713 | 80.50232 | 9.01 | Plasma Membrane |
PvOSCA2.6 | PHAVU_004G031700g | 4 | XM_007151199.1 | XP_007151261.1 | 711 | 80.74029 | 8.83 | Plasma Membrane |
PvOSCA3.1 | PHAVU_006G167700g | 6 | XM_007147882.1 | XP_007147944.1 | 728 | 81.90899 | 9.42 | Plasma Membrane |
PvOSCA4.1 | PHAVU_003G266800g | 3 | XM_007156139.1 | XP_007156201.1 | 802 | 90.28965 | 6.69 | Plasma Membrane |
In all four legumes, clade IV proteins are among the longest, ranging from 802 to 807 amino acids, with molecular weights of ~ 90 kDa. CaOSCA2.1 is the shortest OSCA protein in the study (656 amino acids, ~ 74 kDa), whereas VrOSCA1.5 is the longest (863 amino acids) and has the highest molecular weight (98.64 kDa) (Table 1). The CaOSCA proteins range from 656 to 804 amino acids in length and from 73.95 to 91.89 kDa in molecular weight. In C. cajan, the OSCA proteins range from 706 to 807 aa and from 80.3 to 92.4 kDa. In V. radiata, length and molecular weight vary between 710 and 863 amino acids and between 80.8 and 98.6 kDa. The PvOSCA proteins are 678 to 857 aa long, with molecular weights between 77 and 98 kDa (Table 1). The theoretical isoelectric point (pI) of most OSCA proteins is above 7, indicating their basic nature. Interestingly, the clade IV OSCA proteins (CaOSCA4.1, CcOSCA4.1, VrOSCA4.1 and PvOSCA4.1) deviate markedly from the other three clades in their pI. Their isoelectric points are 7.48, 6.59, 6.59, and 6.69 in C. arietinum, C. cajan, V. radiata and P. vulgaris, respectively, suggesting that the clade IV proteins are near-neutral or mildly acidic in nature (Table 1). Hence, most OSCAs, with the exception of OSCA4.1, may function in similarly alkaline subcellular surroundings, whereas OSCA4.1 possibly requires a different microenvironment to be functional.
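The per-species length ranges can be recomputed directly from Table 1. The following Python sketch copies the "Length (amino acids)" column for each species prefix; the variable names are illustrative only.

```python
# Protein lengths (amino acids) copied from Table 1, in table order.
LENGTHS = {
    "Ca": [773, 766, 759, 798, 656, 730, 727, 724, 714, 713, 722, 804],
    "Cc": [773, 764, 760, 757, 805, 746, 743, 706, 714, 711, 711, 722, 807],
    "Vr": [773, 775, 760, 754, 863, 748, 740, 715, 714, 726, 710, 728, 802],
    "Pv": [773, 774, 755, 753, 857, 751, 678, 714, 714, 713, 711, 728, 802],
}

# (shortest, longest) protein per species.
length_ranges = {sp: (min(v), max(v)) for sp, v in LENGTHS.items()}
total_proteins = sum(len(v) for v in LENGTHS.values())

for species, (shortest, longest) in length_ranges.items():
    print(f"{species}OSCA: {shortest}-{longest} aa")
print("proteins:", total_proteins)
```

The same pattern applies to the molecular weight and pI columns.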
Phylogenetic analysis of OSCA protein
To explore the phylogenetic relationship between the OSCA proteins in the legumes, a phylogenetic tree was constructed from a multiple sequence alignment of 77 OSCA proteins: 15 from Arabidopsis thaliana, 11 from Oryza sativa, 12 from Cicer arietinum, and 13 from each of Cajanus cajan, Vigna radiata and Phaseolus vulgaris (Fig. 1). All 77 OSCA proteins segregated into four distinct groups. Clades I and II are larger, and clades III and IV are comparatively smaller. Clade I contains 31 members, consisting of eight members from Arabidopsis (AtOSCA1.1-1.8), four members from O. sativa (OsOSCA1.1-1.4), four members from C. arietinum (CaOSCA1.1-1.4), and five members from each of C. cajan, V. radiata and P. vulgaris (CcOSCA1.1-1.5, VrOSCA1.1-1.5, PvOSCA1.1-1.5). Clade II contains 34 members, consisting of five members from Arabidopsis (AtOSCA2.1-2.5), five members from rice (OsOSCA2.1-2.5), and six members each from Cicer (CaOSCA2.1-2.6), Cajanus (CcOSCA2.1-2.6), Vigna (VrOSCA2.1-2.6), and Phaseolus (PvOSCA2.1-2.6). Clades III and IV comprise six members each: AtOSCA3.1, OsOSCA3.1, CaOSCA3.1, CcOSCA3.1, VrOSCA3.1, and PvOSCA3.1 in clade III, and AtOSCA4.1, OsOSCA4.1, CaOSCA4.1, CcOSCA4.1, VrOSCA4.1, and PvOSCA4.1 in clade IV.
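Clade sizes follow from the per-species membership enumerated above. A short Python tally, with the per-species counts transcribed from the member lists in the text, is given below as a cross-check.

```python
from collections import Counter

# Members per species and clade, as enumerated in the text
# (e.g. AtOSCA2.1-2.5 gives five Arabidopsis members in clade II).
MEMBERS = {
    "At": {"I": 8, "II": 5, "III": 1, "IV": 1},
    "Os": {"I": 4, "II": 5, "III": 1, "IV": 1},
    "Ca": {"I": 4, "II": 6, "III": 1, "IV": 1},
    "Cc": {"I": 5, "II": 6, "III": 1, "IV": 1},
    "Vr": {"I": 5, "II": 6, "III": 1, "IV": 1},
    "Pv": {"I": 5, "II": 6, "III": 1, "IV": 1},
}

clade_totals = Counter()
for per_clade in MEMBERS.values():
    clade_totals.update(per_clade)  # adds the counts clade by clade

print(dict(clade_totals))
print("total:", sum(clade_totals.values()))
```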
Gene and domain structure analysis
Full-length cDNA sequences and their corresponding genomic DNA sequences of all OSCAs were analysed to determine the number and position of exons and introns. The OSCA4.1 gene in each of the legume species is devoid of introns. All four OSCA3.1 genes have five introns and six exons, showing a highly conserved structure. In the OSCAs of clades I and II (CaOSCA1.1-1.4, CaOSCA2.1-2.6, CcOSCA1.1-2.6, VrOSCA1.1-2.6, PvOSCA1.1-2.6), the number of exons varies from 10 to 11 (Fig. 2a). In general, clade I genes consist of 11 exons and clade II genes of 10 exons, but this trend has a few exceptions. In C. arietinum, CaOSCA1.3 is composed of 10 exons, indicating an exon loss. Exon gain events are noticeable in six members of clade II (CaOSCA2.2, CcOSCA2.1 and -2.2, VrOSCA2.1 and -2.2, and PvOSCA2.1), each of which consists of 11 exons. The consistency in the number and structure of exons and introns within the same clade is evident, which further implies structural and functional resemblance between the OSCA proteins in each clade. Similar patterns of exon-intron distribution have been observed in the OSCA family in other plants like rice and cotton6,14.
Three-dimensional structural modeling, domain analysis, and subcellular localization of OSCA proteins
Structural features of the OSCA proteins in the four legumes reveal that these proteins contain distinct regions, including disordered (18%), alpha helix (64%), beta strand (3%), and TM helix (34%) regions. Three-dimensional structure analysis of CaOSCA, CcOSCA, VrOSCA, and PvOSCA proteins reveals that OSCA has a dimeric structure, with each monomer consisting of 11 distinct helices, similar to previously characterised proteins in Arabidopsis and rice. The two monomers form a V-shaped groove, giving rise to a dimer cavity. This cavity is 8 to 20 Å in diameter and its surface is observed to be hydrophobic. The inner part of the cavity may interact with the head groups of certain lipids, though the exact function of the dimer cavity needs further investigation 41 (Supplementary Fig. 1).
The structure and functional roles of proteins are strongly influenced by the presence of domains. All the OSCA proteins of the four aforementioned legume species contain three crucial domains, viz. RSN1_TM (Calcium permeable stress-gated cation channel 1, PF13967), PHM7_cyt (Cytosolic domain of 10TM putative phosphate region, PF14703), and RSN1_7TM (calcium-dependent channel, 7TM putative phosphate region, PF02714). The SMART tool detected the N-terminal domain as RSN1_TM. According to Pfam, the RSN1_TM family represents the first three transmembrane regions of 11-TM proteins found to be involved in vesicle transport7,42. Similarly, the multi-TM region at the C-terminal end, RSN1_7TM, is a seven-transmembrane domain region annotated as a putative phosphate transporter. The PHM7_cyt domain is located between RSN1_TM and RSN1_7TM. The C-terminal RSN1_7TM is predicted to be the DUF221 domain. Our analysis and previous studies indicate that DUF221 (Domain of Unknown Function 221), which contains hypothetical TM proteins, is an essential part of any OSCA protein and consists of an aligned region of approximately 500 amino acid residues7,43. All 51 OSCA proteins exhibit a largely conserved pattern with respect to the presence of essential domains. CaOSCA2.1 possesses a comparatively smaller RSN1_TM domain. Interestingly, all clade IV members contain two DUF221 (RSN1_7TM) domains (Fig. 2b).
The OSCA proteins in each of the four plant species were predicted to localize to the plasma membrane of the cell (Supplementary Fig. 2). One of the most distinctive features of the OSCA proteins is the presence of transmembrane helices. CaOSCA proteins are composed of 8–11 transmembrane helices, CcOSCA proteins have 6–10, and both VrOSCA and PvOSCA proteins consist of 7–11 transmembrane helices. The presence of the transmembrane helices further supports the predicted subcellular localization of OSCA proteins (Supplementary Fig. 3).
Motif structure analysis of OSCA protein
Motif structure analysis detected a total of 15 unique motif patterns in the 51 OSCA proteins of the four legumes. Motif composition is consistent within each clade in each species, suggesting the existence of clade-specific motifs. In clade I, motifs 3, 7, 4, 13, 6, 12, 2, 9, 5, 1, and 8 are typically observed. Most members of clade II contain motifs 3, 7, 4, 13, 6, 12, 2, 9, 5, 15, 10, and 11. Motif patterns in clade III are highly consistent: each member consists of motifs 3, 4, 13, 12, 2, 9, 5, 15, and 10. Clade IV proteins (CaOSCA4.1, CcOSCA4.1, VrOSCA4.1, and PvOSCA4.1) show an exceptional motif structure and composition while maintaining an identical motif pattern among themselves. They are composed of only three motifs, viz. motifs 14, 2, and 13. Motif 14 is unique to clade IV.
These observations show that motif 13 is present in all 51 OSCA proteins studied and is therefore likely to be part of an essential domain; it maps to the RSN1_TM domain. Additionally, the RSN1_7TM domain partly comprises motif 1 and motif 9. Motif 6 is conserved in the majority of proteins and is a component of the PHM7_cyt domain. Motif 1 is found only in clade I, in most members except CcOSCA1.1, PvOSCA1.4, and -1.5. Motif 8 is unique to clade I and present in all of its members. Motif 15 is consistently present in clades II and III but absent from all members of clade IV. It is rarely observed in clade I, occurring only in CaOSCA1.1, CcOSCA1.1 and -1.2, VrOSCA1.1 and -1.2, and PvOSCA1.1, -1.2, -1.4, and -1.5, which may indicate functional resemblance among these proteins.
Furthermore, motif patterns in clades I and II are closely similar. CaOSCA2.5, CcOSCA2.5, VrOSCA2.5, and PvOSCA2.5 all lack motifs 7, 6, and 12, which are otherwise present in most members of clade II. Similarly, CaOSCA2.2, CcOSCA2.2, VrOSCA2.2, and PvOSCA2.2 all lack motif 12. These shared motif losses among corresponding genes of the four species may indicate functional similarity, and the variation within clade II suggests that it may have the widest variety of functions. Clades III and IV exhibit recurring motif patterns among their members, pointing to distinct roles and evolutionary conservation.
Interestingly, CaOSCA2.1 has a particularly irregular motif pattern, consisting only of motifs 13, 3, 2, 9, 5, 15, 10, and 11. This is consistent with its 3D structure, which differs from that of the other 50 proteins in the study because helix 2 is only partially present. Among the clade I CcOSCA proteins of C. cajan, only CcOSCA1.1 contains motif 10. In P. vulgaris, two PvOSCA proteins (PvOSCA1.4 and -1.5) possess motif 10, which is usually absent in clade I (Supplementary Fig. 4).
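The typical clade-level motif patterns reported above can be compared with simple set operations. The Python sketch below uses only the typical motif lists given in the text for each clade; individual proteins deviate from these patterns (for example, motif 15 in some clade I members), so the result describes the typical patterns rather than every protein.

```python
# Typical motif composition per clade, as listed in the text.
CLADE_MOTIFS = {
    "I": {3, 7, 4, 13, 6, 12, 2, 9, 5, 1, 8},
    "II": {3, 7, 4, 13, 6, 12, 2, 9, 5, 15, 10, 11},
    "III": {3, 4, 13, 12, 2, 9, 5, 15, 10},
    "IV": {14, 2, 13},
}

# Motifs shared by the typical pattern of every clade.
shared = set.intersection(*CLADE_MOTIFS.values())

# Motifs found in one clade's typical pattern and in no other.
unique = {}
for clade, motifs in CLADE_MOTIFS.items():
    others = set().union(
        *(m for c, m in CLADE_MOTIFS.items() if c != clade)
    )
    unique[clade] = motifs - others

print("shared:", sorted(shared))
for clade, motifs in unique.items():
    print(clade, sorted(motifs))
```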
Chromosomal location and gene duplication analysis
The chromosomal locations of the CaOSCA, CcOSCA, VrOSCA, and PvOSCA genes were examined to determine how they are distributed within the genome. A total of 48 genes were mapped according to their position on the chromosomes (Supplementary Fig. 5). Three genes, viz. CaOSCA2.5, CcOSCA3.1, and VrOSCA2.4, are present on unplaced scaffolds and hence could not be mapped. In C. arietinum, CaOSCA1.2, -2.4, and -2.6 are present on chromosome two. Chromosome three (CaOSCA2.3 and -1.3) and chromosome seven (CaOSCA1.1 and -4.1) contain two genes each. Chromosome one (CaOSCA3.1), chromosome four (CaOSCA2.1), chromosome five (CaOSCA2.2), and chromosome eight (CaOSCA1.4) contain one gene each.
In C. cajan, the mapped CcOSCA genes are present only on chromosomes three, seven, nine, 10, and 11. Chromosome three contains the highest number of genes (CcOSCA1.4, -1.2, -1.3, and -2.2), followed by chromosome seven, which contains three genes (CcOSCA4.1, -2.4, and -2.3). Chromosome nine (CcOSCA2.1 and -2.6) and chromosome 11 (CcOSCA1.1 and -1.5) contain two genes each, whereas chromosome 10 (CcOSCA2.5) contains one gene.
In V. radiata, the VrOSCA genes are present on chromosomes one, three, four, five, six, seven, 10, and 11. Here, chromosome five (VrOSCA2.5 and -2.2), chromosome six (VrOSCA1.3 and -2.1), chromosome seven (VrOSCA4.1 and -1.2), and chromosome 10 (VrOSCA3.1 and -1.1) contain two genes each. Chromosome one (VrOSCA2.6), chromosome three (VrOSCA1.4), chromosome four (VrOSCA2.3), and chromosome 11 (VrOSCA1.5) contain one gene each.
In P. vulgaris, the PvOSCA genes are distributed on all chromosomes from one to nine except chromosome seven. Chromosome one (PvOSCA1.4 and -2.1), chromosome two (PvOSCA1.5 and -2.4), chromosome three (PvOSCA4.1 and -1.2), chromosome six (PvOSCA3.1 and -1.1) and chromosome eight (PvOSCA1.3 and -2.3) contain two genes each. Chromosome four (PvOSCA2.6), chromosome five (PvOSCA2.5), and chromosome nine (PvOSCA2.2) contain one gene each.
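The per-chromosome gene counts can be derived from the "Chromosomal location" column of Table 1. The following Python sketch does this for C. arietinum; the same approach applies to the other three species.

```python
from collections import Counter

# Chromosomal locations of the CaOSCA genes, copied from Table 1.
# "Un" marks a gene on an unplaced scaffold.
CA_CHROMOSOMES = {
    "CaOSCA1.1": "7", "CaOSCA1.2": "2", "CaOSCA1.3": "3", "CaOSCA1.4": "8",
    "CaOSCA2.1": "4", "CaOSCA2.2": "5", "CaOSCA2.3": "3", "CaOSCA2.4": "2",
    "CaOSCA2.5": "Un", "CaOSCA2.6": "2", "CaOSCA3.1": "1", "CaOSCA4.1": "7",
}

genes_per_chromosome = Counter(CA_CHROMOSOMES.values())
mapped = sum(n for chrom, n in genes_per_chromosome.items() if chrom != "Un")

for chrom, n in sorted(genes_per_chromosome.items()):
    print(f"chromosome {chrom}: {n} gene(s)")
print("mapped genes:", mapped)
```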
Gene duplication events in the OSCA gene family
Evidence of gene duplication is noticeably present in the OSCA gene family in C. arietinum, C. cajan, V. radiata, and P. vulgaris. Investigation of the genomes of the four plant species reveals that all four have undergone segmental duplication events, giving rise to both interspecific and intraspecific gene pairs. A total of 79 pairs of interspecifically duplicated genes were identified: 13 between C. arietinum and C. cajan, 14 between C. arietinum and P. vulgaris, 12 between C. arietinum and V. radiata, 14 between C. cajan and P. vulgaris, 12 between V. radiata and C. cajan, and 14 between P. vulgaris and V. radiata. Moreover, C. arietinum contains three (CaOSCA2.4-2.6, CaOSCA2.4-2.3, and CaOSCA2.3-2.6), C. cajan contains two (CcOSCA2.2-2.1 and CcOSCA2.4-2.3), V. radiata contains two (VrOSCA2.1-2.2 and VrOSCA2.3-2.6) and P. vulgaris contains one (PvOSCA2.3-2.6) intraspecific gene duplication pairs (Fig. 3). To gauge the pressure of evolutionary selection on the OSCA genes, Ka (nonsynonymous) and Ks (synonymous) substitution rates were calculated. The Ka/Ks ratio was calculated for each of the 79 interspecific and the eight intraspecific collinear gene pairs. All 79 interspecific duplicated gene pairs and all eight pairs of paralogous genes exhibited a Ka/Ks ratio of less than one (Supplementary table 1). Therefore, all the segmentally duplicated OSCA genes have been subjected to purifying selection pressure7,29.
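The interpretation of the Ka/Ks ratio used here (a ratio below one indicating purifying selection) can be expressed as a small Python helper. The function name and the example Ka and Ks values are hypothetical; the pair counts are those stated in the text.

```python
def selection_mode(ka, ks):
    """Classify selection pressure from substitution rates Ka and Ks."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the Ka/Ks ratio")
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# Interspecific duplicated pairs per species combination, from the text.
INTERSPECIFIC_PAIRS = {
    ("Ca", "Cc"): 13, ("Ca", "Pv"): 14, ("Ca", "Vr"): 12,
    ("Cc", "Pv"): 14, ("Vr", "Cc"): 12, ("Pv", "Vr"): 14,
}
# Intraspecific duplicated pairs per species, from the text.
INTRASPECIFIC_PAIRS = {"Ca": 3, "Cc": 2, "Vr": 2, "Pv": 1}

total_inter = sum(INTERSPECIFIC_PAIRS.values())
total_intra = sum(INTRASPECIFIC_PAIRS.values())

# Hypothetical example values, not taken from Supplementary table 1.
example_mode = selection_mode(ka=0.1, ks=0.5)
print(total_inter, total_intra, example_mode)
```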
Analysis of cis-regulatory elements
A total of 16 consensus cis-regulatory elements were identified in the 2 kb upstream region of the OSCA genes in C. arietinum, C. cajan, V. radiata, and P. vulgaris (Supplementary Fig. 6 and Supplementary table 2). They include six stress-responsive elements, namely G-box (pathogen inducible element/positive regulator of senescence), LTR (low temperature-responsive element), MBS (MYB binding site involved in drought inducibility), WUN-motif (wound-responsive element), STRE (stress-responsive element), and TC-rich repeats (defence and stress response); six hormone-responsive elements, namely ABRE (abscisic acid-responsive element), CGTCA-motif and TGACG-motif (methyl jasmonate-responsive elements), TGA-element (auxin-responsive element), TCA-element (salicylic acid-responsive element) and ERE (ethylene-responsive element); and four growth- and development-related elements, namely O2-site (anaerobic induction), GCN4_motif (endosperm expression), ARE (anaerobic induction) and Box-4 (element related to light-responsiveness)44,45.
All four legumes contain abundant ABRE, G-box, ARE, and Box-4 elements. STRE is rarely present in C. arietinum but is consistently present in C. cajan, V. radiata, and P. vulgaris. The two plant growth-related elements ARE and Box-4 are found abundantly in the promoters of most OSCA genes.
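As a rough illustration of how element abundance in a promoter can be counted, the Python sketch below scans both strands of a sequence for a core motif. This is a simplified assumption, not the procedure used in the study: the core strings (CACGTG for the G-box, ACGTG for the ABRE core) are commonly cited consensus cores, and the toy sequence is invented; a database search uses its own curated element definitions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sites(promoter, core):
    """Count overlapping matches of `core` on both strands of `promoter`."""
    promoter = promoter.upper()
    # A set, so a palindromic core is not counted twice.
    strands = {core, reverse_complement(core)}
    total = 0
    for motif in strands:
        start = promoter.find(motif)
        while start != -1:
            total += 1
            start = promoter.find(motif, start + 1)
    return total


# Invented toy promoter fragment, for illustration only.
toy_promoter = "TTCACGTGAAACGTGTT"
gbox_sites = count_sites(toy_promoter, "CACGTG")
abre_core_sites = count_sites(toy_promoter, "ACGTG")
print(gbox_sites, abre_core_sites)
```

Because the G-box core contains the ABRE core, the two counts overlap; a real analysis would need to decide how to report such nested hits.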
Gene ontology (GO) analysis
We conducted a GO analysis of the 51 OSCA proteins. A total of 21 GO terms were identified, including six biological processes, 12 molecular functions, and three cellular components. With all 51 proteins displaying this activity, cation transmembrane transport is predicted to be the most prevalent biological process undertaken by the OSCA gene family. CaOSCA2.2 is predicted to have RNA biosynthetic properties, which are absent from the other 50 OSCA genes studied. Interestingly, members of clade IV in each legume are observed to have the most varied functions. They were predicted to execute roles such as protein targeting to vacuole, hydrolysis of RNA phosphodiester bonds, endonucleolytic processes, and proteolysis, along with cation transmembrane transport activity, indicating a very diverse range of functions. CcOSCA2.1, VrOSCA2.1, and PvOSCA2.1 are predicted to regulate other biosynthetic processes. The most distinct molecular function of the OSCAs was calcium-activated cation channel activity, displayed by all 51 OSCA proteins. They are also predicted to perform various other molecular functions. RNA-DNA hybrid ribonuclease activity, aspartic-type endopeptidase activity, and nucleic acid binding activity are common to, and found only in, the clade IV members of all four legumes. Only CaOSCA1.1 and PvOSCA1.1 were predicted to have mechanosensitive ion channel activity and identical protein binding activity. Pyridoxal phosphate binding activity is prominent in CcOSCA2.1, VrOSCA2.1, and PvOSCA2.1 but is absent in CaOSCA2.1. Intriguingly, enzymatic activities are also predicted for the OSCAs. Similar results were observed in a recent study in which OSCAs were found to be associated with NAD(P)-dependent dehydrogenase in Zea mays46.
As for cellular composition, all 51 OSCA proteins are predicted to be an integral component of the plasma membrane, whereas CaOSCA2.2 might also be a part of the DNA-directed RNA polymerase complex.
Apart from that, in C. arietinum, CaOSCA1.1, -1.3, -2.4, and -3.1 are predicted to exhibit beta-maltose 4-alpha-glucanotransferase activity and 4-alpha-glucanotransferase activity. Other functions of the CaOSCAs include oxidoreductase activity, exhibited by CaOSCA1.1, -1.2, -2.2, -2.3, -2.4, -2.6, and -4.1, and hydrolase activity, displayed by CaOSCA2.2. Unlike in the other three legumes, pyridoxal phosphate binding activity is not observed in chickpea. DNA-directed 5'-3' RNA polymerase activity is unique to CaOSCA2.2 and absent in all other OSCA genes in the study.
In C. cajan, beta-maltose 4-alpha-glucanotransferase activity and 4-alpha-glucanotransferase activity are observed in CcOSCA1.1, -1.2, -1.4, -2.4, and -3.1, and hydrolase activity is predicted in CcOSCA2.1 and -2.2. Oxidoreductase activity is exhibited by CcOSCA1.1, -1.2, -1.3, -2.2, -2.3, -2.4, -2.6, and -4.1. Mechanosensitive ion channel activity and identical protein binding activity are absent in pigeon pea.
In V. radiata, VrOSCA1.1, -1.2, -1.4, -2.4, and -3.1 display beta-maltose 4-alpha-glucanotransferase activity and 4-alpha-glucanotransferase activity. As in pigeon pea, oxidoreductase activity is exhibited by VrOSCA1.1, -1.2, -1.3, -2.2, -2.3, -2.4, -2.6, and -4.1 in mungbean. VrOSCA2.1 and VrOSCA2.2 perform hydrolase activities.
In P. vulgaris, PvOSCA1.1, -1.2, -1.4, -2.4, and -3.1 exhibit beta-maltose 4-alpha-glucanotransferase activity and 4-alpha-glucanotransferase activity, and oxidoreductase activity is demonstrated by PvOSCA1.1, -1.2, -1.3, -2.2, -2.3, -2.4, -2.6 and -4.1. No hydrolase activity is observed in PvOSCA genes (Supplementary Fig. 7).
Expression profile of OSCA genes in different developmental stages
Expression profile analysis was carried out in C. arietinum, C. cajan and P. vulgaris with regard to developmental stages and responses to both abiotic and biotic stress conditions. In V. radiata, expression was analysed under stress conditions but not across developmental stages, owing to insufficient data. In C. arietinum, 27 tissues from BioProject PRJNA413872 were considered at the following developmental stages: germinal stage (radicle, plumule, embryo), seedling stage (epicotyl, primary root), vegetative stage (petiole, stem, leaf, root), reproductive stage (leaf, petiole, stem, nodules, root, flower, bud, pod, immature seed), and senescence stage (leaf, leaf-yellow, immature seed, mature seed, seed coat, stem, petiole, root, nodule). Most CaOSCA genes are considerably upregulated, with CaOSCA3.1 showing the highest level of upregulation. However, CaOSCA2.6 and CaOSCA4.1 show no significant change across the tissues (Fig. 4a and Supplementary table 3a).
In C. cajan, RNA-Seq data of 33 tissues from BioProjects PRJNA344973 and PRJNA354681 were considered for expression analysis at different developmental stages, namely germinal stage (embryo, hypocotyl, radicle, and cotyledon), seedling stage (root and shoot), vegetative stage (leaf, root, nodule, and shoot apical meristem), reproductive stage (embryo sac, seed, pod wall, mature seed, immature seed, mature pod, immature pod, shoot apical meristem, sepal, petal, petiole, stem, nodule, root, pistil, stamen, flower, bud, and leaf), and senescence stage (root, petiole, leaf, and stem). Most genes are upregulated, and CcOSCA1.2, -2.3, -3.1 and -4.1 show consistently elevated expression in most tissues (Fig. 4b and Supplementary table 3b).
In P. vulgaris, OSCA gene expression at different developmental stages was studied in 34 tissues sampled as follows: embryo + cotyledons, hypocotyl, and radicle at V0 (48 h); cotyledons, hypocotyl, primary leaf, epicotyl and primary root at V1 (6 days); hypocotyl, primary leaf, root and first trifoliate leaf at V2 (10 days); root, neck of the root, trifoliate leaf, and stem at V3 (14 days); hypocotyl, root and trifoliate leaf at V4a (29 days); trifoliate leaf, stem, axial meristem and stem node at V4b (35 days); root, trifoliate leaf, axial meristem and stem node at R5 (43 days); flower bud at R5 (50 days); flower at R6 (53 days); small pod at R7 (60 days); medium pod at R8 (64 days); mature pod without seeds and immature seeds at R9 (79 days); and mature pod at R9 (86 days). PvOSCA1.1, -1.2, -2.1, -2.4, -3.1, and -4.1 are upregulated in every tissue, whereas PvOSCA1.4 shows diminished expression in all tissues. The NCBI SRA data were obtained from BioProject PRJNA221782 (Fig. 4c and Supplementary table 3c).
Expression profile of CaOSCA under abiotic stress
In C. arietinum, abiotic stress regulation was studied in root and shoot tissues of cultivar ICC 4958 under desiccation, salinity, and cold stress using RNA-Seq data from BioProject PRJNA232700. CaOSCA2.6 is strongly upregulated under all three stress conditions in both root and shoot, except under cold stress in shoot tissue, where it is downregulated. CaOSCA1.1 is considerably upregulated under desiccation stress in the shoot. CaOSCA2.3 is upregulated under salt stress in root tissue (Fig. 5a and Supplementary table 4a.1).
The regulatory roles of CaOSCA are also investigated under mild and severe drought conditions in cultivars ICC 8058 and ICC 14778, utilizing RNA-Seq data obtained from BioProject PRJNA436616. CaOSCA2.2 shows upregulated activity under severe drought conditions in cultivar ICC 8058 (Fig. 5b and Supplementary table 4a.2).
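Up- and downregulation in RNA-Seq comparisons such as these are commonly summarized as a log2 fold change between stressed and control samples. The Python sketch below shows that calculation under stated assumptions: the text does not specify the normalization or thresholds used, and the function name, pseudocount, and expression values are hypothetical.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of treated to control expression.

    The pseudocount (an assumption of this sketch) avoids division by
    zero when a gene has no reads in one condition.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical normalized expression values for one gene.
lfc_up = log2_fold_change(treated=31.0, control=7.0)    # (32 / 8) = 4-fold
lfc_down = log2_fold_change(treated=7.0, control=31.0)  # (8 / 32) = 0.25-fold
print(lfc_up, lfc_down)
```

Positive values indicate higher expression under stress and negative values lower expression; a study would normally also apply a significance test before calling a gene up- or downregulated.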
qRT-PCR expression analysis under abiotic stress (desiccation, salinity, and cold stress)
To better understand the role of OSCA genes under abiotic stress, qRT-PCR analysis was performed for all 12 CaOSCA genes under desiccation, cold, and salinity stress. Expression data were obtained from 21-day-old seedlings exposed to these stress factors. All the clade I CaOSCA genes are downregulated under desiccation stress, whereas CaOSCA2.1 is highly upregulated, followed by CaOSCA2.2, -2.5, and -3.1. Exposure to cold stress results in the upregulation of CaOSCA1.2, -2.1, -2.2, and -3.1. Three genes, viz. CaOSCA1.2, -2.3, and -2.6, are upregulated under salt stress, whereas all the other genes are downregulated (Fig. 6).
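Relative expression from qRT-PCR data is often computed with the comparative Ct (2^-ΔΔCt) approach. The sketch below assumes that approach for illustration only; this section does not state which quantification method or reference gene was used, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct method.

    Each Ct of the target gene is first normalized to a reference gene
    (delta Ct); the treated sample is then compared with the control
    (delta-delta Ct), assuming roughly 100% amplification efficiency.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target amplifies two cycles earlier
# (relative to the reference gene) in the stressed sample.
fold = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold)
```

A fold change above one corresponds to upregulation under stress and a value below one to downregulation.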
Expression profile of CcOSCA in abiotic stress
In C. cajan, abiotic stress regulation was studied in 4-week-old seedlings of ICP 7 (salt-tolerant pigeon pea genotype) and ICP 1071 (salt-susceptible pigeon pea genotype) subjected to salinity stress, using transcriptomic data from BioProjects PRJNA382795 and PRJNA343064. Three genes, namely CcOSCA1.3, -1.4, and -1.5, are upregulated in the shoot of both genotypes. CcOSCA3.1 shows elevated expression in the ICP 7 shoot. CcOSCA2.6 is upregulated in the root of the tolerant genotype. OSCA genes are more commonly downregulated in the roots of the salt-susceptible genotype than in those of the salt-tolerant genotype (Fig. 5c and Supplementary table 4b).
Expression profile of PvOSCA in abiotic stresses (drought and salt stress)
The role of the PvOSCA genes in drought stress regulation was investigated in two genotypes, BAT 477 (drought-tolerant) and Perola (drought-susceptible), using publicly available data from BioProject PRJNA327176. The study was conducted in leaf and root tissues at two time points, i.e., 75 minutes and 150 minutes after the onset of stress. Most PvOSCA genes are expressed in both tissues of each genotype at both time points. PvOSCA2.4 is downregulated in the root of both genotypes at both time points but is upregulated in the leaves of both genotypes at both time points. PvOSCA2.6 is upregulated in the first 75 minutes but is downregulated after 150 minutes in the root of BAT 477. Conversely, PvOSCA2.5 is moderately expressed after 75 minutes but is strongly upregulated at 150 minutes in BAT 477 root. PvOSCA2.1 and PvOSCA3.1 are consistently expressed in both tissues of the two genotypes at both time points, with PvOSCA3.1 showing the highest expression in BAT 477 leaf and PvOSCA2.1 being highly upregulated in BAT 477 root after 150 minutes of drought stress. In genotype Perola, PvOSCA1.1 and PvOSCA1.2 are upregulated in the leaf after 150 minutes of stress (Fig. 5d and Supplementary table 4c.1).
In P. vulgaris, expression under salinity stress was analysed in the leaf and root tissue of two genotypes, Ispir (tolerant) and TR43477 (susceptible), using RNA-Seq data from BioProject PRJNA656794. Most PvOSCA genes are considerably upregulated under salinity stress. PvOSCA1.3, -1.5, -2.3, and -2.5 are expressed in the leaf of TR43477. PvOSCA2.6 is upregulated in both leaf and root tissues of TR43477. PvOSCA1.1 and -3.1 are upregulated in the root of the Ispir genotype, and PvOSCA2.1 is upregulated in both tissues of the Ispir genotype. Two genes, PvOSCA1.3 and -2.3, are upregulated in the leaf tissue of both genotypes but, in contrast, are downregulated in the roots. Conversely, PvOSCA1.1 and -3.1 are downregulated in the leaf tissue of both genotypes but are substantially upregulated in the root (Fig. 5e and Supplementary table 4c.2).
Expression profile of VrOSCA in abiotic stress
To explore the possible role of the VrOSCA genes in abiotic stress regulation, the expression levels of 13 VrOSCA genes were examined using NCBI SRA data from BioProject PRJNA327304. Seeds of mungbean variety Zhonglu 1 were subjected to desiccation following imbibition for 3 (SY3H), 6 (SY6H), 18 (SY18H), and 24 hours (SY24H), and their transcriptional dynamics were compared. At least four genes, namely VrOSCA1.1, VrOSCA2.1, VrOSCA3.1, and VrOSCA2.6, are upregulated in seeds imbibed for 6, 18, and 24 hours, with the highest degree of upregulation observed in SY24H seeds. VrOSCA2.3 is upregulated in SY18H seeds, shows no expression in SY3H and SY24H seeds, and is downregulated in SY6H seeds (Fig. 5f and Supplementary table 4d).
Expression analysis of CaOSCA genes in biotic stress conditions
CaOSCA genes are largely not significantly expressed in response to Ascochyta rabiei infection. Leaf and stem tissues were considered from two susceptible genotypes (Pb 7 and C 214), two resistant genotypes (ICCV 05530 and ILC 3279), and BC3F6, an A. rabiei-resistant introgression line. Expression was studied three and seven days post-inoculation with A. rabiei using data from BioProject PRJNA479940. CaOSCA1.1 is highly expressed in genotype BC3F6 seven days post-inoculation. CaOSCA2.5 is highly upregulated in genotype ILC 3279 three days post-inoculation. CaOSCA1.4 has the most variable expression: it is upregulated in Pb 7 seven days post-inoculation, whereas in C 214, ICCV 05530, and ILC 3279 it shows elevated expression three days post-inoculation. The other CaOSCA genes are mostly downregulated (Fig. 7a and Supplementary table 5a.1).
CaOSCA gene expression under biotic stress was also explored under Helicoverpa infestation using NCBI SRA data from BioProject PRJNA328302. Leaves from eight-week-old chickpea plants were exposed to bollworm infestation, and transcript abundance was then measured. CaOSCA1.3, -2.3, and -2.5 show no expression. Four genes, CaOSCA1.1, CaOSCA1.2, CaOSCA1.4, and CaOSCA3.1, are significantly upregulated (Fig. 7a and Supplementary table 5a.2).
Expression analysis of VrOSCA genes in biotic stress conditions
The possible role of VrOSCA genes in biotic stress was investigated using NCBI repository data available in BioProject PRJNA715596. Mung bean lines Zheng8-4 (resistant) and Zheng8-20 (susceptible) were subjected to Fusarium oxysporum infection, and the expression levels of the VrOSCA genes were compared at 0, 0.5, 1, 2, and 4 days post-inoculation (dpi). In the resistant line, VrOSCA2.2 is upregulated at 1 and 2 dpi but shows no expression at 4 dpi, whereas it is mostly downregulated in the susceptible line. VrOSCA3.1 is expressed in both lines at most time points and is upregulated in Zheng8-20 at 0.5 dpi, but is downregulated after 24 days of exposure to infection. Overall, VrOSCA genes are downregulated more often in the susceptible line Zheng8-20 than in the resistant line Zheng8-4 (Fig. 7b and Supplementary table 5b).
Expression profile of PvOSCA in cumulative abiotic and biotic stress conditions
Expression of PvOSCA genes was studied under combined abiotic and biotic stress using transcriptomic data from BioProject PRJNA311998. Roots of genotype BAT 477 were inoculated with arbuscular mycorrhizal fungi (AMF) and subjected to drought stress, and PvOSCA expression was studied after 42 days. More PvOSCA genes are expressed in plants not inoculated with AMF. PvOSCA1.1, -1.3, -2.1, -2.2, -2.3, and -3.1 are expressed in both AMF-inoculated and non-inoculated conditions, with varying levels of expression (Fig. 7c and Supplementary table 5c).
Expression analysis of PvOSCA genes in biotic stress conditions
Expression of PvOSCA genes under biotic stress was investigated using transcriptomic data from BioProject PRJNA482464. Transgenic roots of P. vulgaris cv. Negro Jamapa were inoculated with Rhizobium tropici and with an AMF, Rhizophagus irregularis, and expression was studied seven days after inoculation. PvOSCA genes are mostly downregulated upon inoculation with rhizobia and mycorrhizal fungi (Fig. 7c and Supplementary table 5d).