Identification and physicochemical analysis of the BAHD gene family
To systematically identify the BAHD acyltransferase gene family in L. japonicus, the whole genome was searched with the Hidden Markov Model (HMM) profile of the BAHD domain (PF02458). This search initially yielded 87 candidate genes. Subsequent domain validation using the SMART web tool and HMM-based filtering led to the identification of 47 LjBAHD genes, each featuring an intact BAHD domain and the conserved HXXXD and DFGWG motifs (Fig. 1A). These genes were named LjBAHD1 to LjBAHD47 according to their chromosomal location (Fig. 1B). Chromosomal distribution analysis revealed an uneven distribution of LjBAHD genes across the genome, with a higher frequency on chromosomes 1, 3, 6, 7, and 9 (8, 6, 6, 6, and 7 genes, respectively) and a lower frequency on chromosomes 2, 4, and 8 (2, 1, and 1 genes, respectively). Additionally, clustering of certain LjBAHD genes was observed, suggesting potential functional relatedness, shared regulatory mechanisms, and a collaborative role in the biosynthesis of specific metabolites in L. japonicus[25].
A detailed physicochemical analysis of the 47 deduced LjBAHD proteins is presented in Table S1. This analysis highlighted a considerable variation in the number of amino acids, ranging from 311 in LjBAHD46 to 874 in LjBAHD38. Correspondingly, the relative molecular weights varied from approximately 33.8 kDa for LjBAHD46 to 96.6 kDa for LjBAHD38, with an overall average of 52.75 kDa. The theoretical isoelectric points (pI) spanned a range from 5.14 for LjBAHD46 to 8.41 for LjBAHD18. Subcellular localization predictions indicated that 46 of the LjBAHD proteins are predominantly cytoplasmic, while LjBAHD33 was predicted to be localized to the cell membrane. This distribution is in concordance with the findings of D'Auria et al.[6], who reported the absence of signal peptides in nearly all BAHD family members, consistent with a cytoplasmic localization. In summary, the sequence diversity and the physicochemical properties of the BAHD acyltransferase family in L. japonicus underscore its potential for diverse biological functions and highlight the complexity of the regulatory mechanisms underlying the synthesis of secondary metabolites in this species.
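The relationship between sequence length and molecular weight reported above can be illustrated with a minimal Python sketch. It assumes an unmodified linear polypeptide and uses standard average residue masses; the source does not state which tool produced the values in Table S1, and the function name `molecular_weight_kda` and the example peptide are hypothetical.

```python
# Average residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is retained at the chain termini


def molecular_weight_kda(seq: str) -> float:
    """Approximate average molecular weight (kDa) of an unmodified peptide."""
    seq = seq.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


# Hypothetical short peptide, not an LjBAHD sequence.
print(round(molecular_weight_kda("MKLHTAADGS"), 3))
```

With an average residue mass of roughly 110 Da, a 311-residue protein is expected near 34 kDa and an 874-residue protein near 96 kDa, in line with the reported range.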
Phylogenetic analysis and motif composition of BAHD genes in L. japonicus
Exon and intron analysis is pivotal for elucidating gene structures, organizational patterns, and the functional implications of proteins. This structural information is also instrumental to phylogenetic studies, which can shed light on the evolutionary dynamics of gene families, including the gains, losses, and modifications of gene structures[26]. In this context, the functional analysis of the 47 LjBAHD genes was extended by examining their gene structures, identifying characteristic motifs, and constructing a phylogenetic tree. The exon counts within the BAHD family genes of L. japonicus vary from one to four, with 15 genes being intronless and possessing a single exon (Fig. S1).
To delineate the evolutionary relationships and potential functions of the LjBAHDs, a phylogenetic tree (Fig. 2A) was constructed using the maximum likelihood method based on 64 BAHD amino acid sequences, comprising 47 typical sequences from L. japonicus (LjBAHDs) and 17 from Arabidopsis thaliana (AtBAHDs). Following the classification by D'Auria et al., BAHD family members are categorized into five distinct clades[6]. The BAHD members in L. japonicus fall into four of these clades, with Clade V notably absent. Clade I in L. japonicus comprises 18 members, 7 in Clade I-a and 11 in Clade I-b, predominantly involved in the modification of phenolic glucosides[27]. Clade II includes 18 members, 9 each in Clade II-a and Clade II-b, whose enzymes are primarily engaged in the elongation of long-chain epidermal waxes, which are crucial for reducing water loss and enhancing plant resistance to pathogens[6]. Clade III consists of six members capable of accepting a diverse array of alcohol substrates[28]. Clade IV encompasses five members, predominantly characterized as agmatine coumaroyl transferases[6].
A total of ten conserved motifs (Motifs 1–10) within the BAHD proteins were identified using the MEME web tool (Fig. 2B). These BAHD genes are marked by the conserved Motif 1 (HXXXD) and Motif 3 (DFGWG). The presence of the characteristic -YFGNC- motif in BAHD members is often correlated with roles in the biosynthesis of anthocyanins or flavonoids[29]. Enumeration of the -YFGNC- motif among the 47 LjBAHDs showed that nine proteins contain it (Table S1).
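Counting the proteins that carry a short sequence motif, as done for -YFGNC-, can be sketched as follows. This is a minimal illustration and not the procedure used in the study: the helper names and the toy sequences are hypothetical, and `X` in a motif string is assumed to mean "any residue".

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif such as 'DFGWG'; an 'X' matches any single residue."""
    parts = ("." if ch == "X" else re.escape(ch) for ch in motif.upper())
    return re.compile("".join(parts))


def proteins_with_motif(proteins: dict[str, str], motif: str) -> list[str]:
    """Return the names of proteins whose sequence contains the motif."""
    pattern = motif_to_regex(motif)
    return [name for name, seq in proteins.items() if pattern.search(seq.upper())]


# Invented toy sequences, not real LjBAHD proteins.
toy = {
    "protA": "MKLHTAADGSYFGNCLLDFGWGKP",
    "protB": "MSSDFGWGRRA",
    "protC": "MAAAPLLK",
}
print(proteins_with_motif(toy, "YFGNC"))  # ['protA']
print(proteins_with_motif(toy, "DFGWG"))  # ['protA', 'protB']
```

Applying the same function to the 47 deduced protein sequences would reproduce the kind of count summarized in Table S1.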
BAHD gene duplication and synteny analyses
Duplicated genes, which often share similar gene structures and biological functions, provide critical insights into the evolution and expansion of gene families[30]. The MCScanX algorithm was used to investigate gene duplication events within the BAHD gene family in L. japonicus. Syntenic regions, represented by red lines, were observed between LjBAHD11 and LjBAHD18 and between LjBAHD9 and LjBAHD38 (Fig. 3A). Because the genes in each pair reside on distinct chromosomes, they are likely products of segmental duplication events. Notably, LjBAHD6 and LjBAHD18 are classified within Clade II-a, while LjBAHD9 and LjBAHD38 fall under Clade II-b in the phylogenetic tree. Because genes in these clades are implicated in environmental adaptation and survival, their duplication may suggest that L. japonicus encountered adverse environmental conditions at certain stages of its evolutionary history. To assess the selective pressure on these duplicated LjBAHD genes, the non-synonymous substitution rate (Ka), the synonymous substitution rate (Ks), and the Ka/Ks ratio were calculated. The Ka/Ks ratios of the two duplicated gene pairs were 0.1718 and 0.1727 (Table S2). Furthermore, all collinear LjBAHD gene pairs exhibited Ka/Ks ratios below 1, indicative of purifying selection acting on these gene pairs.
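The interpretation of Ka/Ks follows the usual convention: a ratio below 1 points to purifying selection, a ratio near 1 to neutral evolution, and a ratio above 1 to positive selection. A minimal sketch of that rule is shown here; the function names are hypothetical, and only the two ratios quoted from Table S2 come from the text.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of non-synonymous to synonymous substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def selection_mode(ratio: float) -> str:
    """Conventional reading of a Ka/Ks ratio."""
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# The two ratios reported for the duplicated LjBAHD pairs.
for ratio in (0.1718, 0.1727):
    print(ratio, selection_mode(ratio))
```

Both reported values are far below 1, so the classification does not depend on where exactly a neutrality threshold is drawn.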
To elucidate the evolutionary relationships among BAHD family members across plant species, collinearity analysis was conducted between L. japonicus and two other species: Arabidopsis thaliana (A. thaliana) and Salvia bowleyana (S. bowleyana), a fellow member of the Lamiaceae (Labiatae) family (Fig. 3B). The analysis revealed nine homologous BAHD genes between L. japonicus and A. thaliana, located predominantly on chromosome 1 of L. japonicus. In contrast, no homologous BAHD genes were identified on chromosomes 2, 4, 6, and 10. Moreover, twenty homologous BAHD genes were identified between L. japonicus and S. bowleyana, with a higher concentration on chromosome 5 of L. japonicus and none on chromosome 4. These findings provide a comprehensive view of the evolutionary dynamics within the BAHD gene family of L. japonicus, highlighting the interplay between gene duplication, selective pressure, and the adaptive responses of this species to its environment.
Analysis of cis-acting elements in the promoters of LjBAHD genes
To enhance our understanding of the functions and expression patterns of the LjBAHD gene family in L. japonicus, an in-depth analysis of the cis-acting elements within the promoter regions of the 47 LjBAHD genes was conducted using the PlantCARE database. The predictive analysis identified a total of 6,414 cis-acting elements across these promoter regions, which were predominantly categorized into eight distinct types (Fig. 4A). Notably, TATA-box and CAAT-box elements were found to be the most prevalent, constituting 41.16% and 23.10% of the total elements identified, respectively.
Based on their regulatory functions, these cis-acting elements can be classified into four main groups: light responsiveness, plant hormone responsiveness, stress responsiveness, and plant growth and development (Fig. 4B). Visualization of the cis-acting elements within the promoters (Fig. S2) revealed that all LjBAHD promoter regions are enriched with multiple light-responsive elements. The next most abundant group was hormone-responsive elements, including those responsive to jasmonic acid (CGTCA and TGACG motifs), gibberellin (CCTTTTG and TATCCCA motifs), and abscisic acid (ABRE and AAGAA motifs). Additionally, stress-responsive elements were identified in the LjBAHD promoters, such as the low-temperature-responsive element (LTR), the stress-responsive element (STRE), and the drought-responsive element (MBS). Plant development-related elements and binding sites for MYB and MYC transcription factors were also observed.
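Locating short cis-element core sequences such as CGTCA in a promoter amounts to a string search on both DNA strands. The sketch below illustrates the idea only; it is not how PlantCARE works internally, and the function names and the 16-bp toy promoter are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(promoter: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every occurrence of the motif.

    Overlapping hits are reported. A palindromic motif is reported once
    per strand at the same position.
    """
    promoter = promoter.upper()
    sites = []
    for strand, query in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = promoter.find(query)
        while start != -1:
            sites.append((start, strand))
            start = promoter.find(query, start + 1)
    return sorted(sites)


# Invented toy promoter, not an LjBAHD upstream region.
print(find_sites("AACGTCATTTGACGCC", "CGTCA"))  # [(2, '+'), (9, '-')]
```

Note that CGTCA and TGACG are reverse complements of each other, so a single site can be reported under either name depending on the strand considered.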
To further investigate the transcription factors that may regulate LjBAHD gene expression, predictions were made based on the identified cis-acting elements within the promoter regions. A regulatory network of transcription factors implicated in the modulation of LjBAHD expression was constructed (Fig. 4C). The analysis revealed that the 47 LjBAHD genes are potentially regulated by 27 distinct transcription factors. These transcription factors could be categorized based on their functions into three groups: growth-related transcription factors (depicted in blue), abiotic stress-related transcription factors (in red), and those with dual functions in growth and stress responses (in pink). The regulatory relationships within this network suggest that the BAHD genes in L. japonicus are predominantly under the control of growth-related transcription factors.
Functional annotation of LjBAHD genes
Gene Ontology (GO) analysis was employed to investigate functional enrichment among the 47 LjBAHD genes in L. japonicus. The analysis assigned the LjBAHD genes to the three primary GO domains: biological process, cellular component, and molecular function (Fig. S3). Within the biological process domain, most LjBAHD genes were implicated in metabolic process (GO:0008152) and cellular process (GO:0009987). For cellular component, the LjBAHD genes were predominantly associated with cellular anatomical entity (GO:0110165). For molecular function, the LjBAHD genes were mainly enriched in catalytic activity (GO:0003824). A detailed examination of the molecular function terms revealed significant enrichment in acyltransferase activity, transferring groups other than amino-acyl groups (GO:0016747), and transferase activity (GO:0016740) (Fig. 5A).
Furthermore, enrichment analysis using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (Fig. 5B) demonstrated that the LjBAHD genes are significantly enriched in several key biosynthetic pathways: stilbenoid, diarylheptanoid and gingerol biosynthesis (ko00945), flavonoid biosynthesis (ko00941), and phenylpropanoid biosynthesis (ko00940). Together, the GO and KEGG enrichment analyses provide an overview of the potential roles and metabolic pathways associated with the LjBAHD genes in L. japonicus and underscore their involvement in secondary metabolism.
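Term or pathway enrichment of this kind is commonly assessed with a one-sided hypergeometric test, although the source does not state which statistic was used. The following sketch shows that test under this assumption; the function name and all counts in the example are invented for illustration.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes carrying the term,
    n: genes in the query set, k: query genes carrying the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Invented counts: 20 of 47 query genes carry a term that annotates
# 300 of 30,000 background genes.
p = hypergeom_enrichment_p(k=20, n=47, K=300, N=30000)
print(f"{p:.3e}")
```

In practice the resulting p-values would also be corrected for multiple testing across all terms or pathways examined.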
Expression analysis of LjBAHDs in different organs
Gene expression patterns are closely linked to the biological functions of genes and can provide insights into their roles in plant growth and adaptation to environmental stress. To investigate the functional roles of LjBAHD genes in L. japonicus, RNA sequencing (RNA-seq) was used to examine their organ-specific expression profiles in four organs: stem, leaf, flower, and root. The expression heatmap for LjBAHD genes was generated from Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values, normalized by row (Fig. 6). Of the 47 LjBAHD genes identified, 41 were transcriptionally active in the examined organs, whereas expression of LjBAHD6, LjBAHD11, LjBAHD12, LjBAHD19, LjBAHD26, and LjBAHD35 was not detected. The expressed LjBAHD genes exhibited diverse expression patterns. Three genes were highly expressed in leaves and belonged to Clade I (LjBAHD17 and LjBAHD25) and Clade II (LjBAHD41). Twelve genes were highly expressed in stems and were distributed across all four clades. Eight genes were highly expressed in flowers and were found in all clades except Clade IV. Eighteen genes were highly expressed in roots; these were distributed across all clades except Clade I-a and were most prevalent in Clades I-b and II-b. These findings offer a detailed view of the organ-specific expression patterns of LjBAHD genes in L. japonicus, highlighting their potential roles in various aspects of plant biology and suggesting a complex regulatory network that may respond to developmental cues and environmental stimuli.
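Row normalization of a heatmap means that each gene is scaled across its own organs, so that the organ of highest relative expression stands out regardless of absolute abundance. The sketch below assumes a log2(FPKM + 1) transform followed by a per-row z-score; the source only says "normalized by row", so both steps are assumptions, and the FPKM values are invented.

```python
from math import log2
from statistics import mean, pstdev

ORGANS = ["stem", "leaf", "flower", "root"]


def row_zscores(fpkm_row: list[float]) -> list[float]:
    """Z-score one gene's log2(FPKM + 1) values across organs."""
    logged = [log2(value + 1) for value in fpkm_row]
    mu = mean(logged)
    sd = pstdev(logged)
    if sd == 0:  # flat profile: no organ stands out
        return [0.0 for _ in logged]
    return [(value - mu) / sd for value in logged]


# Invented FPKM values for one gene, in the order given by ORGANS.
scores = row_zscores([1.0, 20.0, 3.0, 0.0])
top_organ = ORGANS[scores.index(max(scores))]
print(top_organ)  # leaf
```

Assigning each expressed gene to the organ with its highest row-normalized score gives the kind of per-organ gene counts reported above.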
Characterization and functional analysis of HCT genes in L. japonicus
The enzymatic activities of HCTs are integral to various biosynthetic pathways in plants. In L. japonicus, five HCT genes were identified: LjBAHD4, LjBAHD23, LjBAHD24, LjBAHD25, and LjBAHD36 (Fig. S4). Expression analysis revealed that LjBAHD4 and LjBAHD36 were predominantly expressed in roots, while LjBAHD23, LjBAHD24, and LjBAHD25 exhibited higher expression levels in the aerial parts of the plant. To elucidate the roles of these HCT genes in the biosynthesis of diverse compounds in L. japonicus, a gene-phenotype association analysis was performed, in which the expression of the five HCT genes and other BAHD genes was correlated with the levels of 22 p-coumaroyl analogues detected in the metabolome of L. japonicus (Fig. 7). The 22 metabolites were categorized into four major groups: phenols, flavonoids, terpenoids, and alkaloids, which are known to be the most prevalent classes of compounds in L. japonicus. Among the HCT genes, LjBAHD25 showed the highest expression in leaves and the lowest in roots. Notably, its expression was significantly and positively correlated with 10 p-coumaroyl compounds, suggesting a potential role in the biosynthesis of a wide array of compounds in the aerial parts of L. japonicus.
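A gene-metabolite association of this kind reduces to a correlation coefficient computed across samples for each gene and metabolite pair. The sketch below uses the Pearson coefficient as an assumption, since the source does not name the correlation method; the function name and the data are invented.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented values across stem, leaf, flower, and root.
expression = [12.0, 30.0, 9.0, 1.0]     # hypothetical gene expression
metabolite = [150.0, 410.0, 120.0, 20.0]  # hypothetical metabolite level
print(round(pearson_r(expression, metabolite), 3))
```

With only a handful of organs as data points, significance testing of each coefficient is essential before a correlation is read as evidence of a biosynthetic role.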
To investigate the regulation of LjBAHD25 expression by external factors, a promoter analysis was conducted. This analysis identified two gibberellin-responsive elements and three auxin-responsive elements upstream of the LjBAHD25 gene. Based on these findings, we hypothesized that LjBAHD25 expression is regulated by gibberellin and auxin. The expression levels of LjBAHD25 were quantified using reverse transcription quantitative polymerase chain reaction (RT-qPCR) following the external application of gibberellin at concentrations of 1 mg/L and 2 mg/L, and auxin at concentrations of 0.5 mg/L and 1 mg/L (Fig. 8). Leaves sprayed with water served as controls. Relative quantification indicated that LjBAHD25 expression peaked after 2 days of treatment with 2 mg/L gibberellin, at a level fourfold higher than that of the control. With auxin, LjBAHD25 expression was highest after 2 days of treatment at 1 mg/L, reaching 21 times the level of the control. These results suggest that LjBAHD25 expression is hormone-inducible, with auxin exerting a more pronounced effect on its upregulation than gibberellin.
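Fold changes from RT-qPCR are often derived with the 2^(-ΔΔCt) method, which compares the target gene against a reference gene in treated and control samples. The source does not state which quantification method was used, so the sketch below is an assumption; the function name and all Ct values are invented, and amplification efficiency is taken to be close to 100%.

```python
def relative_expression(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Fold change of the target gene by the 2^(-ddCt) method."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Invented Ct values: the target crosses the threshold two cycles
# earlier (relative to the reference gene) in the treated sample.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold)  # 4.0
```

Under this method a fourfold induction corresponds to a ΔΔCt of -2, and a 21-fold induction to a ΔΔCt of about -4.4.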
To further understand the role of HCT in the phenylpropanoid biosynthesis pathway of L. japonicus and its relationship to metabolite content, a phenylpropanoid biosynthesis pathway was constructed for L. japonicus (Fig. 9). The genes involved in this pathway showed higher expression levels in the aerial parts than in the roots, and metabolite content was likewise higher in the aerial parts than in root tissues, consistent with the gene expression data.