Molecular cloning and bioinformatics analysis of HpDGAT2 genes
Based on the H. lacustris transcriptome database, five putative DGAT2 genes were predicted by BLAST using DGAT2s from other algal species (Additional file 1: Table S1) as queries. The full-length mRNA sequences of the five genes were obtained by rapid amplification of cDNA ends (RACE), and the initiation codon, termination codon, 5′-untranslated region (5′-UTR), 3′-untranslated region (3′-UTR), and poly(A) tail were determined. The five putative DGAT2 genes were designated HpDGAT2A, HpDGAT2B, HpDGAT2C, HpDGAT2D, and HpDGAT2E on the basis of multiple sequence alignment with CrDGAT2s. Four of them (HpDGAT2A, HpDGAT2B, HpDGAT2D, and HpDGAT2E) contained a full-length open reading frame (ORF), whereas HpDGAT2C was a partial sequence (Additional file 2: Table S2 and Additional file 3: Table S3). The full-length ORFs were then amplified by PCR with specific primers (Additional file 4: Table S4), cloned, and sequenced, and the sequences were deposited in NCBI GenBank (HpDGAT2A: MT875161; HpDGAT2B: MT875162; HpDGAT2C: MT875163; HpDGAT2D: MT875164; HpDGAT2E: MT875165). This is so far the highest number of DGAT2 genes reported in the green alga H. lacustris. In comparison with the HpDGAT2 gene models reported by Nguyen et al., our results confirmed that there are five HpDGAT2 members in H. lacustris. Generally, only one or two alleles of DGAT1 are identified in a number of microalgae, whereas multiple alleles of DGAT2 are typically present.
To gain insight into the biochemical characteristics of the HpDGAT2s, the molecular weight (MW), isoelectric point (pI), subcellular location, transmembrane domains (TMs), signal peptide (SP), chloroplast transit peptide (CTP), and phosphorylation sites (Phos) were analyzed. No SP or CTP was present in the HpDGAT2 protein sequences, except for a CTP in HpDGAT2C (Additional file 2: Table S2). All HpDGAT2 protein sequences contained two TMs, except for HpDGAT2B, which contained three (Additional file 2: Table S2 and Additional file 5: Figure S1); this is consistent with the membrane-bound forms of DGAT1 and DGAT2. In addition, 14-30 phosphorylation sites were predicted in the HpDGAT2 protein sequences (Additional file 2: Table S2 and Additional file 6: Figure S2). This suggests that phosphorylation may play a role in regulating DGAT2 enzyme activity, because DGAT1 enzyme activity has been shown to be affected by serine phosphorylation sites in mouse DGAT1, TmDGAT1, and BnDGAT1. It remains to be determined whether these phosphorylation sites are important for the functional regulation of HpDGAT2 in vivo.
To further analyze the conserved domains (CDs) and the evolutionary relationship between HpDGAT2s and other algal DGAT2s, a multiple sequence alignment was performed and a phylogenetic tree was reconstructed. CD analysis showed that the HpDGAT2s contained seven CDs [26, 47, 48]: the YF/YFP block (CD1), which is essential for DGAT2 activity; the HPHG/EPHS block (CD2), which is proposed to form part of the active site; the PxxR block (CD3; x = any amino acid); the xGGxAE block (CD4); the RxGFx(K/R)xAxxxGxx(L/V)VPxxxFG block (CD5), which is the longest conserved sequence in plants and animals; the PxxxVVGxPIxVP block (CD6); and the RHK block (CD7) (Additional file 7: Figure S3). As shown in Additional file 7: Figure S3, two amino acid residues (proline, P, and phenylalanine, F) were completely conserved among all DGAT2s, which is consistent with previous reports that these two highly conserved residues may be located at the active sites of the enzymes and contribute significantly to their enzymatic activities. The phylogenetic analysis of the HpDGAT2s and other DGAT orthologs from eukaryotic algae and plants is illustrated in Additional file 8: Figure S4 and is consistent with most previous results [20-26]. Briefly, all HpDGAT2s clustered with the algal DGAT2 orthologs, which are distinct from the other DGAT subfamilies, including DGAT1, DGAT3, and DGAT/WSD. Of the five HpDGAT2s, HpDGAT2A formed a monophyletic subgroup (BS: 100%) with CrDGAT2A, CzDGAT2A, CzDGAT2B, LiDGAT2A, and LiDGAT2B. HpDGAT2B and HpDGAT2E were closely related (BS: 98%) to CrDGAT2B, CzDGAT2E, and CrDGAT2C. HpDGAT2C was evolutionarily close (BS: 100%) to CzDGAT2C and LiDGAT2C. HpDGAT2D formed a monophyletic subgroup (BS: 73%) with CrDGAT2D and CzDGAT2D.
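The conserved-block notation used above maps directly onto regular expressions, which gives a quick way to locate the blocks in a candidate protein sequence. The following is a minimal Python sketch under stated assumptions, not the alignment-based procedure used in this study: the function names and the toy sequence are hypothetical, and only the single-pattern blocks CD3 to CD7 are included.

```python
import re

# Conserved-block patterns as written in the text ('x' = any residue).
CD_PATTERNS = {
    "CD3": "PxxR",
    "CD4": "xGGxAE",
    "CD5": "RxGFx(K/R)xAxxxGxx(L/V)VPxxxFG",
    "CD6": "PxxxVVGxPIxVP",
    "CD7": "RHK",
}


def motif_to_regex(pattern):
    """Translate the block notation into a regular expression string."""
    # '(K/R)' -> '[KR]': alternative residues at a single position
    regex = re.sub(
        r"\(([A-Z](?:/[A-Z])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        pattern,
    )
    # lower-case 'x' -> '.': any amino acid
    return regex.replace("x", ".")


def scan_motifs(sequence, patterns=CD_PATTERNS):
    """Return {block name: [1-based start positions]} for every match."""
    hits = {}
    for name, pattern in patterns.items():
        # A zero-width lookahead reports overlapping matches as well
        regex = re.compile("(?=" + motif_to_regex(pattern) + ")")
        hits[name] = [m.start() + 1 for m in regex.finditer(sequence)]
    return hits


# Hypothetical toy sequence (not a real DGAT2), for illustration only
toy_sequence = "MAAPLKRTGGSAERHK"
print(scan_motifs(toy_sequence))
```

A pattern match only flags a candidate block; whether it corresponds to the conserved domain still has to be judged from the alignment.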
AST and TAG accumulation and HpDGAT2 gene expression under high light and nitrogen deficient stresses
High light (HL) and nitrogen deficient (nitrogen-free, ND) stresses can effectively promote the accumulation of AST and TAG in H. lacustris [32-34, 50-53]. However, under such conditions, algal growth is completely restricted [51-53]. Recently, our team examined the effects of different degrees of nitrogen deficiency (nitrogen content relative to that of the control BBM medium, e.g., 0, 1/4N, 1/2N, and 3/4N) on algal growth and on AST and TAG accumulation. The results indicated that the highest AST productivity was obtained under 1/4N stress, because a certain level of algal growth was retained (data not shown). Therefore, in the present study, the 1/4N condition was selected as the nitrogen deficient stress for further experiments. To understand the relationship between HpDGAT2 transcript levels and TAG and AST biosynthesis, the time-course patterns of algal biomass, transcript levels, total AST (T-AST) content, and total TAG (T-TAG) content in photoautotrophic cultures of H. lacustris under HL, 1/4N, and double HL-1/4N stresses were studied (Fig. 1).
As shown in Fig. 1a, compared with the control, HL, 1/4N, and double HL-1/4N stresses inhibited algal growth. The T-AST production and composition are summarized in Fig. 1b-1e. From these results we could conclude that (1) M-AST is the main form, (2) compared with 1/4N stress, HL is more effective in inducing AST accumulation, especially under the high blue light (HLB) condition, and (3) coupled HL and 1/4N dual stimulation might be a better choice for improving AST accumulation. Moreover, T-TAG contents increased slowly from day 1 to day 4 and reached maximum values of 29.5%, 28.7%, 26.8%, 25.2%, and 24.8% under HLB-1/4N, HLW-1/4N, HLB, 1/4N, and HLW conditions, respectively, corresponding to 159.5%, 155.1%, 144.9%, 136.2%, and 134.1% of the control value (Fig. 1f). The effects of HL, 1/4N, and double HL-1/4N stresses on TAG and AST accumulation were largely consistent with previous studies showing that AST and lipid biosynthesis are enhanced, and that the former is coordinated with the latter, under HL and ND conditions [34, 41]. Previous studies have indicated that DGAT enzymes are probably responsible for both AST esterification and TAG biosynthesis in H. lacustris [33, 34]. As revealed by the qRT-PCR results (Fig. 2), the transcript levels of the HpDGAT2 genes exhibited distinct patterns under the HL, 1/4N, and double HL-1/4N stresses. Of the five HpDGAT2s, the expression levels of HpDGAT2B and HpDGAT2C decreased and then remained constant (Fig. 2b and 2c). The expression levels of HpDGAT2A and HpDGAT2E increased and reached their maxima after 4 d of exposure, in an HL-dependent and a 1/4N-dependent manner, respectively (Fig. 2a and 2e), whereas the expression level of HpDGAT2D increased in response to both stresses (Fig. 2d). These results suggested that HpDGAT2A, HpDGAT2D, and HpDGAT2E are jointly involved in AST and TAG biosynthesis under stress.
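The T-TAG percentages reported above can be checked for internal consistency. The sketch below, which is an arithmetic check added here and not part of the original analysis, back-calculates the control T-TAG content implied by each pair of values under two readings of the relative percentages ("percent of control" and "percent increase over control"). It assumes that all five maxima are compared with a single control value; the function names are illustrative.

```python
# Maximum T-TAG contents (%) and the accompanying relative percentages,
# both copied from the text (Fig. 1f).
t_tag_max = {"HLB-1/4N": 29.5, "HLW-1/4N": 28.7, "HLB": 26.8,
             "1/4N": 25.2, "HLW": 24.8}
relative_pct = {"HLB-1/4N": 159.5, "HLW-1/4N": 155.1, "HLB": 144.9,
                "1/4N": 136.2, "HLW": 134.1}


def implied_control(value, pct, reading):
    """Back-calculate the control content implied by one condition."""
    if reading == "percent_of_control":
        # value = control * pct / 100
        return value / (pct / 100.0)
    if reading == "percent_increase":
        # value = control * (1 + pct / 100)
        return value / (1.0 + pct / 100.0)
    raise ValueError("unknown reading: " + reading)


def control_range(reading):
    """Smallest and largest implied control value across all conditions."""
    controls = [implied_control(t_tag_max[k], relative_pct[k], reading)
                for k in t_tag_max]
    return min(controls), max(controls)


for reading in ("percent_of_control", "percent_increase"):
    low, high = control_range(reading)
    print(f"{reading}: implied control {low:.2f}% to {high:.2f}%")
```

Under the percent-of-control reading, all five conditions imply nearly the same control content (about 18.5%), whereas the percent-increase reading implies control values that differ by more than 0.7 percentage points between conditions.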
Functional complementation of HpDGAT2s in yeast
To verify the function of the putative HpDGAT2 enzymes, the ORF sequences were cloned (Additional file 4: Table S4) into the pYES2.0 plasmid and individually expressed in the quadruple mutant yeast strain S. cerevisiae H1246 (∆dga1∆lro1∆are1∆are2), which lacks TAG synthesis activity. The H1246 mutant can form TAG when at least one of these four genes is expressed. Furthermore, the wild type (INVSc1) and H1246-EV (H1246 harboring the empty vector pYES2.0) yeast strains were used as positive and negative controls, respectively.
The expression of HpDGAT2A, HpDGAT2D, and HpDGAT2E restored TAG biosynthesis to different levels in H1246 cells, as indicated by the prominent TAG spot on a TLC plate (Fig. 3a). By contrast, HpDGAT2B expression in H1246 cells produced only an inconspicuous TAG spot, suggesting that the encoded protein may be nonfunctional, considering its low transcript level in H1246 cells (Fig. 3b) and H. lacustris cells (Fig. 2b). Nevertheless, the limited FA composition of Saccharomyces cerevisiae might also account for the low TAG content obtained with HpDGAT2B. The ability of HpDGAT2s to restore TAG biosynthesis in yeast led us to examine the FA substrate specificity of HpDGAT2A, HpDGAT2B, HpDGAT2D, and HpDGAT2E. As indicated in Fig. 3b and 3c, the four genes were heterologously expressed in H1246 and INVSc1 cells. The changes in TAG content and in the FA composition of TAG extracted from the transformed H1246 and INVSc1 cells were similar (data not shown). As shown in Fig. 3d, the TAG contents of H1246 cells expressing HpDGAT2A and HpDGAT2B were 78.3% and 56.5% lower, respectively, than those of the controls (INVSc1 and INVSc1+EV), whereas the TAG contents of cells expressing HpDGAT2D and HpDGAT2E were 108.7% and 122.7% higher, respectively, than the control. To further test the FA substrate specificity, FAs from the transformed H1246 and INVSc1 cells were analyzed by GC. As shown in Fig. 3d, compared with the control, the abundance of the MUFAs palmitoleic acid (C16:1) and oleic acid (C18:1) increased in H1246 cells expressing HpDGAT2A, HpDGAT2D, and HpDGAT2E at the expense of saturated fatty acids (SFAs), including palmitic acid (C16:0) and stearic acid (C18:0). A similar tendency, although to different degrees, has been observed for almost all transformed H1246 lines expressing various DGAT enzymes [20, 23-28].
Considering the limited FA composition of yeast strains (C16:0, C18:0, C16:1, and C18:1), several PUFAs that are abundant in H. lacustris, including linoleic acid (C18:2n6), α-linolenic acid (C18:3n3), γ-linolenic acid (C18:3n6), and parinaric acid (C18:4n3), were used to test the substrate specificity of the HpDGAT2A, HpDGAT2B, HpDGAT2D, and HpDGAT2E enzymes by a feeding strategy. HpDGAT2A, HpDGAT2D, and HpDGAT2E showed a similar tendency: these PUFAs were incorporated into TAG at the expense of C16:1 and C18:1 in the order C18:2n6 > C18:3n3 > C18:3n6 > C18:4n3 (Fig. 3e). Considering that C18:2n6 and C18:3n3 are abundant in H. lacustris, it is reasonable to speculate that these HpDGAT2s may have potential for the production of TAG enriched in C18:2n6 and C18:3n3 [32-34]. The HpDGAT2A, HpDGAT2D, and HpDGAT2E enzymes showed a stronger preference for PUFAs than for MUFAs; alternatively, this may be due to the higher content of the fed PUFAs relative to the endogenous MUFAs. A similar phenomenon was reported by Zienkiewicz et al. (2018), who found that some PUFAs were incorporated into TAG at the expense of 16:1 and 18:1 in yeast expressing LiDGAT1, LiDGAT2.1, LiDGAT2.2, and LiDGAT2.3, and it has also been observed in feeding tests with yeast mutant H1246 cells expressing CzDGAT2C. However, the FA profiles of the TAG fraction from yeast cells expressing HpDGAT2B showed no obvious changes, implying a nonfunctional protein (Fig. 3e).
HpDGAT2D overexpression promotes TAG biosynthesis and its relative MUFAs and PUFAs abundance in C. reinhardtii
To investigate the possible biological role of HpDGAT2s and their engineering potential for modulating TAG biosynthesis in algae, we generated HpDGAT2D overexpression lines in the evolutionarily close green alga C. reinhardtii CC849. HpDGAT2D was selected for further experiments because of its relatively strong TAG biosynthetic activity in yeast cells (Fig. 3) and its high transcript level in H. lacustris under stress conditions (Fig. 2d).
The nuclear transformation expression vector pDB124 (Additional file 9: Figure S5), which has been characterized in C. reinhardtii CC849 and was kindly provided by Professor Zhangli Hu from Shenzhen University, was used in this study after modification. The modified vector contained overexpression cassettes for the HpDGAT2D-His fusion gene and the bleomycin resistance gene Ble under the control of the verified endogenous promoter and terminator of the PsaD and RBCS2 genes, respectively (Fig. 4a). The codon usage of HpDGAT2D was optimized for C. reinhardtii (Additional file 10: Figure S6) before the expression vector was constructed. Transformants (over 20 putative transformants were screened) were selected on TAP plates supplemented with bleomycin and confirmed by genomic PCR. Integration of the exogenous HpDGAT2D-His fusion gene into the algal chromosome was indicated by a clear band amplified with HpDGAT2D gene-specific primers in the transformed lines, whereas no signal was detected in WT cells (Fig. 4b). Three overexpression lines, HpDGAT2D-4, HpDGAT2D-7, and HpDGAT2D-9, exhibited the maximum increase in transcript levels (~ 5.5-fold higher than the control) under the ND condition in a 4-day batch culture, with no significant difference in cell growth between the transgenic lines and the control (Fig. 4c and 4d). Furthermore, in vivo overexpression of the HpDGAT2D protein was validated by western blot using His-tag antibodies to detect the HpDGAT2D-His fusion protein. The bands were present in the membrane protein fractions of the three overexpression lines (HpDGAT2D-4, HpDGAT2D-7, and HpDGAT2D-9) but absent from the soluble protein fractions, which is consistent with HpDGAT2D being a transmembrane enzyme (Fig. 4e). HpDGAT2D overexpression led to a considerable increase (~ 1.4-fold) in TAG content under the ND condition (Fig. 4f). HpDGAT2D overexpression also affected the FA profiles of TAG (Fig. 4f).
A significant increase was observed in the relative abundance of MUFAs (C16:1 and C18:1) and PUFAs (C18:2n6 and C18:3n3), accompanied by a significant decrease in SFAs (C16:0 and C18:0) and some PUFAs (C16:2, C16:3, C18:3n6, and C18:4n3). These results indicated that (1) HpDGAT2D showed a stronger preference for MUFAs and PUFAs than for SFAs, (2) of all PUFAs, HpDGAT2D preferentially used C18:2n6 and C18:3n3 rather than C16:2, C16:3, C18:3n6, and C18:4n3, and (3) these preferred substrates happen to be the types that are enriched in C. reinhardtii. This trend was consistent with the results of the feeding tests in yeast cells (Fig. 3d and 3e) and with previous studies of NoDGAT1A expression in C. reinhardtii UVM4 and CzDGAT1A expression in the oleaginous alga N. oceanica by Wei et al. (2017) and Mao et al. (2019), respectively [20, 22].
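The comparison of relative FA abundance described above reduces to normalizing each profile to 100% and subtracting the control from the transgenic line. The following Python sketch illustrates that calculation; it is not the analysis pipeline used in this study, the function names are illustrative, and the peak amounts in the usage example are hypothetical placeholders rather than measured data.

```python
# Fatty acid classes for the species named in the text
FA_CLASS = {
    "C16:0": "SFA", "C18:0": "SFA",
    "C16:1": "MUFA", "C18:1": "MUFA",
    "C16:2": "PUFA", "C16:3": "PUFA", "C18:2n6": "PUFA",
    "C18:3n3": "PUFA", "C18:3n6": "PUFA", "C18:4n3": "PUFA",
}


def relative_abundance(amounts):
    """Convert absolute amounts (e.g. GC peak areas) to % of total FAs."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total fatty acid amount must be positive")
    return {fa: 100.0 * amount / total for fa, amount in amounts.items()}


def class_totals(relative):
    """Sum relative abundances within the SFA, MUFA, and PUFA classes."""
    totals = {"SFA": 0.0, "MUFA": 0.0, "PUFA": 0.0}
    for fa, pct in relative.items():
        totals[FA_CLASS[fa]] += pct
    return totals


def abundance_shift(control, transgenic):
    """Percentage-point change (transgenic minus control) per fatty acid."""
    rel_control = relative_abundance(control)
    rel_transgenic = relative_abundance(transgenic)
    return {fa: rel_transgenic[fa] - rel_control[fa] for fa in rel_control}


# Hypothetical placeholder amounts, for illustration only
control = {"C16:0": 40.0, "C18:1": 40.0, "C18:2n6": 20.0}
transgenic = {"C16:0": 30.0, "C18:1": 45.0, "C18:2n6": 25.0}
print(abundance_shift(control, transgenic))
print(class_totals(relative_abundance(transgenic)))
```

Because relative abundances always sum to 100%, an increase in one class necessarily lowers the others, so shifts in relative abundance are best interpreted together with the absolute TAG content.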
HpDGAT2D overexpression enhances seed oil content and its relative MUFAs and PUFAs abundance in A. thaliana
To explore HpDGAT2s as a tool for manipulating acyl-CoA pools and engineering TAG biosynthesis in higher plants, HpDGAT2D was overexpressed in Arabidopsis thaliana. Three independent A. thaliana T2 generation expression lines (At-HpDGAT2D-3, At-HpDGAT2D-6, and At-HpDGAT2D-8) were selected for further detailed analysis. The transgenic lines did not show any visible morphological difference from the untransformed control A. thaliana, for example in 1000-seed weight (Fig. 5a). The qRT-PCR results showed that the HpDGAT2D transcript was expressed to different extents in different tissues and organs of the transgenic lines, including roots, tubers, leaves, siliques, and seeds (Fig. 5b). Transformation of wild type A. thaliana with HpDGAT2D resulted in a higher (120.0-126.4%) seed TAG content than in the control (Fig. 5c). Again, further GC analysis of the FA profiles from TAG revealed that PUFAs and MUFAs increased significantly, accompanied by a significant decrease in SFAs (Fig. 5c). However, the exact pattern of change was much more complicated than that in yeast and C. reinhardtii cells. Specifically, among the SFAs, C16:0 and C22:0 decreased, whereas C18:0 and C20:0 remained stable. Among the MUFAs and PUFAs, HpDGAT2D preferred C18:1, C18:2n6, and C18:3n3 over C20:1, C20:2, and C22:1 in TAG biosynthesis. These results were largely in agreement with those in yeast cells (Figs. 3d and 3e) and C. reinhardtii cells (Fig. 4c). Guo et al. (2017) indicated that the CeDGAT1 gene can stimulate FA biosynthesis and enhance seed weight and oil content when expressed in A. thaliana and B. napus.
Molecular docking reveals the binding sites between HpDGAT2s and AST structure
Although some studies have indicated that DGATs are likely to be the crucial enzymes involved in EAST biosynthesis in H. lacustris, there is so far no direct biochemical evidence. Homology modeling is a useful tool for predicting the 3D structure of proteins, and AutoDock Tools is a powerful means of identifying potential binding sites between 3D structures and ligands. In this study, docking was used to explore the binding sites between the AST structure and 3D models of the HpDGAT2s. The SWISS-MODEL server successfully generated 3D structures for HpDGAT2A, HpDGAT2B, HpDGAT2D, and HpDGAT2E. All four HpDGAT2 protein structures contained possible AST binding sites (Additional file 11-13: Figure S7-S9). The results for HpDGAT2D are described in detail (Fig. 6). The symmetrical half of the AST molecule (C20) was used in the docking process for two reasons: the full C40 structure is oversized compared with C16-C22 fatty acids, and AST esterification occurs mainly on the hydroxyl group of the six-membered ring at each end of the molecule (Fig. 6b). The symmetrical half of the AST molecule (C20) docked onto the predicted 3D model of HpDGAT2D as shown in Fig. 6c. Further molecular interaction analysis showed that the 3D model of HpDGAT2D had several potential AST binding sites (amino acids) involving van der Waals forces, conventional hydrogen bonds, and alkyl, Pi-alkyl, and Pi-sigma interactions (Additional file 14: Figure S10). Meanwhile, binding sites between fatty acids (C16:1 and C18:1) and the 3D model of HpDGAT2D were also predicted (Fig. 6d and 6e), which supports the reliability of the AutoDock analysis, because DGAT2 enzymes are expected to contain fatty acid binding sites.