De novo transcriptome assembly and annotation
From the de novo transcriptome assembly, a total of 210,210 transcripts with a mean length of 1,021 bp and an N50 of 2,350 bp were obtained. The assembly statistics for P. micromegethes are presented in Table 1. BUSCO (Benchmarking Universal Single-Copy Orthologs) analysis of the non-redundant transcripts against the Actinopterygii gene set revealed a high percentage of complete orthologues (93.4%), present either as single-copy or duplicated sequences (Fig. S1a). The protein-coding sequences (CDS) within the transcripts were predicted using TransDecoder. Four ORF categories were obtained: complete ORFs (44,083), 5' partial ORFs (10,602), 3' partial ORFs (3,975), and internal ORFs (2,326) (Fig. S1b). Based on the homology search, 87.98%, 82.72%, 86.82%, 81.64%, and 74.63% of the peptide sequences showed significant matches to the Nr, SwissProt, KEGG, KOG, and Pfam databases, respectively (Fig. S1c). In addition, 71.11% of the sequences were assigned GO terms in three independent categories: biological process, molecular function, and cellular component (Fig. S2). Within the biological process category, the largest numbers of P. micromegethes transcripts were assigned to cellular process, biological regulation, and response to stimulus. In the molecular function category, the most frequently assigned terms were catalytic, transcription regulator, and molecular transducer activities, whereas in the cellular component category most transcripts were assigned to cellular anatomical entity and protein-containing complex. The KEGG Automatic Annotation Server (KAAS) assigned KO IDs to 22,003 of the transcripts, of which 8,673 were assigned as complete, with involvement in 403 pathways (Table S2).
Table 1
General statistics for the de novo Trinity assembly of P. micromegethes
Assembly characteristics | Value |
Before filtering | |
Raw reads (PE) | 235,289,694 |
Total assembled bases (bp) | 518,275,680 |
Total assembled contigs | 388,294 |
Average contig length (bp) | 1,334 |
Median contig length (bp) | 599 |
Contig N50 length (bp) | 2,715 |
Contig N90 length (bp) | 483 |
GC content (%) | 48.07 |
After filtering* | |
Total assembled bases (bp) | 214,789,660 |
Total assembled contigs | 210,210 |
Average contig length (bp) | 1,021 |
Median contig length (bp) | 413 |
Contig N50 length (bp) | 2,350 |
Contig N90 length (bp) | 337 |
GC content (%) | 46.90 |
* Redundant transcripts were eliminated using CD-HIT. |
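For readers unfamiliar with the N50 and N90 statistics in Table 1, the following is a minimal sketch of how such values are commonly computed from a list of contig lengths. The function name `nx_length` and the toy contig lengths are hypothetical and are not taken from the assembly; only the base and contig totals come from Table 1.

```python
def nx_length(lengths, fraction=0.5):
    """Return the Nx length: the contig length at which the running total
    of contigs, sorted longest first, reaches `fraction` of all bases."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # guard only; not reached for 0 < fraction <= 1


# Hypothetical toy contig lengths (bp); these are NOT from the assembly.
toy_contigs = [100, 200, 300, 400, 500]
n50 = nx_length(toy_contigs, 0.5)
n90 = nx_length(toy_contigs, 0.9)

# Totals reported in Table 1 (before and after CD-HIT filtering).
mean_before = 518_275_680 // 388_294
mean_after = 214_789_660 // 210_210
print(n50, n90, mean_before, mean_after)
```

The integer quotients of total assembled bases over contig counts agree with the average contig lengths reported in Table 1 (1,334 bp before and 1,021 bp after filtering), which suggests the reported averages were truncated rather than rounded.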
Sequence and phylogenetic analyses of the Fads and Elovl enzymes in P. micromegethes
A total of five transcripts corresponding to the Fads and Elovl involved in LC-PUFA biosynthesis were mined from the P. micromegethes transcriptome. These comprise a putative fads2 and four elovls: elovl5, elovl2 and two elovl4 paralogs, elovl4a and elovl4b (Table 2). The ORF lengths of the P. micromegethes fads2, elovl5, elovl2, elovl4a, and elovl4b transcripts are 1,335 bp, 810 bp, 849 bp, 933 bp, and 903 bp, respectively, encoding protein sequences of 445 aa, 270 aa, 283 aa, 311 aa, and 301 aa.
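The reported ORF and protein lengths can be cross-checked with a short sketch, since each codon spans three nucleotides. The helper name `codon_count` is hypothetical, and the text does not state whether the stop codon is included in the ORF length, so the sketch only checks that each ORF length divided by three equals the reported protein length.

```python
# ORF lengths (bp) and protein lengths (aa) as reported in the text.
ORF_BP = {"fads2": 1335, "elovl5": 810, "elovl2": 849,
          "elovl4a": 933, "elovl4b": 903}
REPORTED_AA = {"fads2": 445, "elovl5": 270, "elovl2": 283,
               "elovl4a": 311, "elovl4b": 301}


def codon_count(orf_length_bp):
    """Number of codons in an ORF; the length must be a multiple of 3."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_length_bp // 3


# True for every gene whose ORF length / 3 equals the reported aa count.
consistent = {gene: codon_count(bp) == REPORTED_AA[gene]
              for gene, bp in ORF_BP.items()}
print(consistent)
```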
Table 2
List of putative Fads and Elovl transcripts involved in the biosynthesis of polyunsaturated fatty acids mined from the P. micromegethes transcriptome
Annotated complete sequence_ID | Sequence description |
TRINITY_DN36236_c1_g1_i1.p1 | Delta-6 desaturase (Fads2) |
TRINITY_DN38107_c1_g5_i10.p1 | Elongation of very long chain fatty acids 5 (Elovl5) |
TRINITY_DN33464_c1_g2_i2.p1 | Elongation of very long chain fatty acids 2 (Elovl2) |
TRINITY_DN30298_c0_g2_i1.p1 | Elongation of very long chain fatty acids 4a (Elovl4a) |
TRINITY_DN36298_c3_g2_i1.p1 | Elongation of very long chain fatty acids 4b (Elovl4b) |
The BLASTp searches showed that the P. micromegethes Fads2 shares high identity (>85%) with functionally characterized orthologs from cyprinids, including Tinca tinca (QIA97820), Tor tambroides (AZL94116), Danio rerio (Q9DEX7), and Barbonymus gonionotus (AXF92413). Alignment of the putative P. micromegethes Fads2 with six functionally characterized Fads protein sequences from Homo sapiens, Clarias gariepinus, B. gonionotus, D. rerio, Siganus canaliculatus (∆4), and S. canaliculatus (∆6/∆5) showed the typical sequence features of microsomal fatty acyl front-end desaturases (Fig. 1). These include an N-terminal cytochrome b5-like domain, a heme-binding motif (HPGG), three conserved histidine boxes (HXXXH, HXXHH and QXXHH), and four putative transmembrane domains. A region (FQFQ) speculated to be important for the regioselectivity of Fads2 is also present. The tree topology of the Fads protein sequences showed two well-supported distinct clades, Fads1 and Fads2, with the P. micromegethes ortholog placed within the latter (Fig. 2). Within the Fads2 clade, the various Cypriniformes members are grouped together with orthologs from other Otomorpha fish, including Siluriformes and Characiformes. The remainder of the Fads2 clade consists of a branch containing Salmoniformes and a larger branch comprising Fads2 from various Acanthomorpha.
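As an illustration of how the conserved Fads motifs named above could be located in a protein sequence, the following minimal sketch scans for the heme-binding motif and the three histidine boxes with regular expressions, treating X as any residue. The toy sequence and the function name `find_motifs` are hypothetical; the sequence is not the P. micromegethes Fads2.

```python
import re

# Motifs named in the text; "X" (any residue) is written as "." in the regex.
MOTIFS = {
    "HPGG": "HPGG",
    "HXXXH": "H.{3}H",
    "HXXHH": "H.{2}HH",
    "QXXHH": "Q.{2}HH",
}


def find_motifs(protein, motifs=MOTIFS):
    """Return 1-based start positions of every (possibly overlapping) hit."""
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead finds matches that overlap one another.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() + 1 for m in lookahead.finditer(protein)]
    return hits


# Hypothetical toy sequence, NOT the P. micromegethes Fads2.
toy_protein = "MAAHPGGKLLHDFGHLSVFHRWHHAAQIEHHLF"
print(find_motifs(toy_protein))
```

In this toy sequence the stretch HRWHH satisfies both HXXXH and HXXHH, which shows that the box patterns are not mutually exclusive; assigning each hit to a specific histidine box therefore also depends on its position in the alignment.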
As for the Elovls, BLASTp analysis matched the P. micromegethes Elovl5 sequence (79.0-77.6%) with functionally characterized Elovl5 from other cyprinids, including T. tambroides, B. gonionotus and D. rerio. Likewise, the P. micromegethes Elovl2 shares high identity (86.8-84.8%) with those of B. gonionotus, T. tinca and T. tambroides. The P. micromegethes Elovl4a and Elovl4b were 92.0-81.5% similar to several teleost Elovl4 elongases. The alignment of all the above elongases (Fig. 3) showed the typical characteristics of microsomal fatty acyl elongases: four conserved motifs (KXXEXXDT, QXXFLHXXHH, NXXXHXXMYXYY, and TXXQXX), seven putative transmembrane domains, and a histidine-rich box (HXXHH). Putative endoplasmic reticulum (ER) retrieval signals, KXRXX in Elovl5, KXKRX in Elovl2 and R(K)XKXX in Elovl4, were also detected at the carboxyl terminus. The ML tree topology showed the Elovl2 and Elovl5 orthologs sharing a common ancestor, whereas all Elovl4 orthologs form a separate cluster (Fig. 4). The Elovl4a and Elovl4b isoforms are separated into two respective clades.
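The C-terminal ER retrieval signals described above can be checked with a minimal sketch that tests the last five residues of a sequence against each pattern. It assumes that R(K)XKXX means arginine or lysine at the first position; the toy sequences and the function name `matching_signals` are hypothetical.

```python
import re

# Five-residue C-terminal patterns quoted in the text; X is any residue.
# Assumption: "R(K)" is read as "R or K" at the first position.
ER_SIGNALS = {
    "Elovl5": re.compile(r"K.R.."),    # KXRXX
    "Elovl2": re.compile(r"K.KR."),    # KXKRX
    "Elovl4": re.compile(r"[RK].K.."), # R(K)XKXX
}


def matching_signals(protein):
    """Return the names of all patterns matched by the last five residues."""
    tail = protein[-5:]
    return [name for name, rx in ER_SIGNALS.items() if rx.fullmatch(tail)]


# Hypothetical toy sequences, NOT the P. micromegethes elongases.
print(matching_signals("MSTAAKHRQD"))
print(matching_signals("MSTAAKEKRE"))
print(matching_signals("MSTAARSKDD"))
```

Under this reading, a tail that fits KXKRX also fits R(K)XKXX, so the patterns alone do not discriminate between Elovl2 and Elovl4; the family assignment rests on the homology and phylogenetic evidence.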
Functional characterization of P. micromegethes Fads2 desaturase and Elovl elongases
The FA composition of S. cerevisiae yeast transformed with the P. micromegethes fads2 ORF indicates Δ6 desaturation activity towards C18 PUFA substrates (Table 3). A low and probably negligible Δ8 activity towards 20:3n-3 was detected. Furthermore, the P. micromegethes Fads2 was able to catalyze the biosynthesis of EPA and ARA through Δ5 desaturation of the C20:4n-3 and C20:3n-6 substrates, respectively. A single peak corresponding to the non-methylene-interrupted (NMI) product 20:4Δ5,11,14,17 was also detected. Incubation of the transformed yeast with either of the C22 PUFA substrates did not yield any products, which implies a lack of Δ4 desaturation activity. Incubation with C24:4n-6 or C24:5n-3 resulted in their respective desaturation products, which suggests a role for this Fads2 in the 'Sprecher' pathway for the biosynthesis of DHA. Overall, the results indicate that the P. micromegethes Fads is a Fads2 with bifunctional Δ6/Δ5 activity.
Table 3
Functional characterization of P. micromegethes Fads2 through heterologous expression of the respective transcript ORF in yeast (Saccharomyces cerevisiae), followed by incubation with PUFA substrate to assay the desaturation rate leading to PUFA product
PUFA Substrate | PUFA Product | Conversion rate (%) | | Activity |
| | Fads2 | | |
18:3n-3 | 18:4n-3 | 26.8 ± 1.7 | | Δ6 |
18:2n-6 | 18:3n-6 | 13.8 ± 0.4 | | Δ6 |
20:3n-3* | 20:4n-3 | 0.9 ± 0.1 | | Δ8 |
20:2n-6 | 20:3n-6 | 0 | | Δ8 |
20:3n-3** | 20:4Δ5,11,14,17 | 4.8 ± 0.5 | | Δ5 |
20:4n-3 | 20:5n-3 | 18.6 ± 1.0 | | Δ5 |
20:3n-6 | 20:4n-6 | 7.34 ± 1.0 | | Δ5 |
22:5n-3 | 22:6n-3 | 0 | | Δ4 |
22:4n-6 | 22:5n-6 | 0 | | Δ4 |
24:5n-3 | 24:6n-3 | 7.5 ± 0.6 | | Δ6 |
24:4n-6 | 24:5n-6 | 10.0 ± 2.0 | | Δ6 |
Results are expressed as the percentage of the PUFA substrate converted to PUFA product with the following calculation: [individual product area / (all products area + substrate area)] × 100. Value is presented as the mean ± SEM (n = 3). |
*Conversion of 20:3n-3 involved step wise reactions of Δ8 desaturation, followed by a Δ5 desaturation of 20:4n-3 to EPA. |
** 20:3n-3 was also converted to the non-methylene-interrupted fatty acid 20:4Δ5,11,14,17. |
ND = not detected; PUFA= Polyunsaturated fatty acid |
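The conversion formula in the table footnote, [individual product area / (all products area + substrate area)] × 100, together with the mean ± SEM summary, can be sketched as follows. The peak areas and triplicate values are hypothetical, and the function names `conversion_rates` and `mean_sem` are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def conversion_rates(substrate_area, product_areas):
    """Percent conversion of each product:
    100 * product area / (sum of all product areas + substrate area)."""
    total = substrate_area + sum(product_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in product_areas.items()}


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical peak areas (arbitrary units), not measured data.
rates = conversion_rates(70.0, {"20:4n-3": 20.0, "20:5n-3": 10.0})

# Hypothetical triplicate conversion rates (%), n = 3.
avg, sem = mean_sem([26.0, 27.0, 28.0])
print(rates, avg, round(sem, 2))
```

Because the denominator includes every product peak, a substrate that undergoes stepwise reactions (as described for 20:3n-3) yields a separate rate for each product from the same chromatogram.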
Transgenic yeast harboring the ORF of P. micromegethes elovl5 was able to elongate C18 and C20 PUFA substrates, with the highest activities towards the C18:4n-3 and C18:3n-6 substrates. There was no elongation of C22 substrates, implying that the P. micromegethes Elovl5 is a C18-C20 PUFA elongase (Table 4). In comparison, the P. micromegethes Elovl2 catalyzed the elongation of all tested C18, C20 and C22 PUFA substrates. Notably, this ortholog elongated the 22:5n-3 and 22:4n-6 substrates to C24 products, the former being a significant step in the production of DHA. As for Elovl4, both P. micromegethes Elovl4 paralogs exhibited elongation capacity towards the saturated fatty acid (SFA) C24:0, with very-long-chain saturated fatty acid (VLC-SFA) products detected up to C30:0 (Table 5). Both paralogs also elongated C18 and C20 PUFA substrates and showed low elongation of C22 PUFA. Between the two isoforms, Elovl4a displayed higher elongation activities towards C18 PUFA. In contrast, Elovl4b showed higher activities towards the C20 substrates, with further elongation to PUFA products of up to C36. All chromatograms of the yeast FA profiles are presented in the supplementary section (Fig. S3-S7).
Table 4
Functional characterization of P. micromegethes Elovl5 and Elovl2 through heterologous expression of the respective transcript ORF in yeast (Saccharomyces cerevisiae), followed by incubation with PUFA substrate to assay the elongation rate leading to PUFA product
PUFA substrate | PUFA product | Conversion rate (%) | | | Activity |
| | pYES2 vector | Elovl5 | Elovl2 | |
18:3n-3 | 20:3n-3 | 1.6 ± 0.0 | 6.8 ± 0.1 | 18.2 ± 0.8 | C18 → 20 |
| 22:3n-3 | ND | ND | 2.9 ± 0.1 | C20 → 22 |
| 24:3n-3 | ND | ND | ND | C22 → 24 |
| Total | 1.6 ± 0.0 | 6.8 ± 0.1 | 21.1 ± 0.9 | |
18:2n-6 | 20:2n-6 | 1.3 ± 0.3 | 2.3 ± 0.0 | 7.2 ± 0.6 | C18 → 20 |
| 22:2n-6 | 1.3 ± 0.1 | 0.6 ± 0.1 | 1.6 ± 0.0 | C20 → 22 |
| 24:2n-6 | ND | ND | ND | C22 → 24 |
| Total | 2.6 ± 0.4 | 2.8 ± 0.1 | 8.8 ± 0.6 | |
18:4n-3 | 20:4n-3 | 1.8 ± 0.1 | 39.4 ± 3.0 | 17.3 ± 0.4 | C18 → 20 |
| 22:4n-3 | ND | 0.3 ± 0.0 | 8.6 ± 0.3 | C20 → 22 |
| 24:4n-3 | ND | ND | 1.8 ± 0.1 | C22 → 24 |
| Total | 1.8 ± 0.1 | 39.7 ± 3.0 | 27.8 ± 0.6 | |
18:3n-6 | 20:3n-6 | 1.4 ± 0.1 | 32.0 ± 2.2 | 15.8 ± 0.7 | C18 → 20 |
| 22:3n-6 | ND | ND | 7.0 ± 0.4 | C20 → 22 |
| 24:3n-6 | ND | ND | 2.8 ± 0.4 | C22 → 24 |
| Total | 1.4 ± 0.1 | 32.0 ± 2.2 | 25.5 ± 1.0 | |
20:5n-3 | 22:5n-3 | ND | 12.5 ± 2.6 | 9.2 ± 0.4 | C20 → 22 |
| 24:5n-3 | ND | ND | 35.3 ± 1.1 | C22 → 24 |
| Total | | 12.5 ± 2.6 | 44.5 ± 1.4 | |
20:4n-6 | 22:4n-6 | ND | 4.4 ± 0.7 | 3.4 ± 0.2 | C20 → 22 |
| 24:4n-6 | ND | ND | 20.4 ± 1.8 | C22 → 24 |
| Total | | 4.4 ± 0.7 | 23.8 ± 2.0 | |
22:5n-3 | 24:5n-3 | ND | ND | 10.8 ± 0.5 | C22 → 24 |
22:4n-6 | 24:4n-6 | ND | ND | 8.4 ± 0.5 | C22 → 24 |
Results are expressed as the percentage of the fatty acid substrate converted to elongated fatty acid products with the following calculation: [individual product area / (all products area + substrate area)] × 100. Value is presented as the mean ± SEM (n = 3). |
Total conversion is calculated based on summation of all of the elongated products. |
ND = not detected; PUFA= Polyunsaturated fatty acid |
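The footnote on total conversion (the sum of all elongated products) can be illustrated with the Elovl2 mean values from Table 4. This is a minimal sketch; the dictionary and function names are hypothetical, and the small differences from the reported totals are assumed to arise because the reported totals were computed from unrounded replicate data.

```python
# Elovl2 stepwise mean conversion rates (%) copied from Table 4.
ELOVL2_STEPS = {
    "18:3n-3": [18.2, 2.9],
    "18:2n-6": [7.2, 1.6],
    "18:4n-3": [17.3, 8.6, 1.8],
    "18:3n-6": [15.8, 7.0, 2.8],
    "20:5n-3": [9.2, 35.3],
    "20:4n-6": [3.4, 20.4],
}
# Elovl2 totals as reported in Table 4.
REPORTED_TOTAL = {
    "18:3n-3": 21.1, "18:2n-6": 8.8, "18:4n-3": 27.8,
    "18:3n-6": 25.5, "20:5n-3": 44.5, "20:4n-6": 23.8,
}


def total_conversion(step_rates):
    """Sum of the stepwise conversion rates, rounded to one decimal."""
    return round(sum(step_rates), 1)


# Recomputed minus reported total, per substrate.
differences = {
    substrate: round(total_conversion(steps) - REPORTED_TOTAL[substrate], 1)
    for substrate, steps in ELOVL2_STEPS.items()
}
print(differences)
```

Summing the rounded means reproduces each reported total to within 0.1 percentage points.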
Table 5
Functional characterization of P. micromegethes Elovl4a and Elovl4b through heterologous expression of the respective transcript ORF in yeast (Saccharomyces cerevisiae), followed by incubation with SFA/PUFA substrate to assay the elongation rate leading to SFA/PUFA product
Substrate | Product | Conversion rate (%) | | | Activity |
| | pYES2 vector | Elovl4a | Elovl4b | |
SFA | | | | | |
24:0 | 26:0 | 3.25 ± 0.7 | 5.2 ± 0.3 | 3.05 ± 0.3 | C24 → 26 |
| 28:0 | ND | 2.8 ± 0.1 | 1.6 ± 0.3 | C26 → 28 |
| 30:0 | ND | 1.9 ± 0.1 | 0.4 ± 0.1 | C28 → 30 |
| 32:0 | ND | ND | ND | C30 → 32 |
PUFA | | | | | |
18:3n-3 | 20:3n-3 | 1.3 ± 0.3 | 15.1 ± 0.8 | 8.3 ± 0.1 | C18 → 20 |
| 22:3n-3 | ND | 1.7 ± 0.1 | 1.0 ± 0.1 | C20 → 22 |
| 24:3n-3 | ND | ND | ND | C22 → 24 |
| Total | 1.3 ± 0.3 | 16.9 ± 0.8 | 9.3 ± 0.2 | |
18:2n-6 | 20:2n-6 | 1.2 ± 0.0 | 14.8 ± 0.9 | 5.8 ± 0.2 | C18 → 20 |
| 22:2n-6 | 1.4 ± 0.0 | 1.8 ± 0.3 | 1.5 ± 0.0 | C20 → 22 |
| 24:2n-6 | ND | ND | ND | C22 → 24 |
| Total | 2.6 ± 0.0 | 16.5 ± 1.0 | 7.3 ± 0.3 | |
18:4n-3 | 20:4n-3 | 1.4 ± 0.3 | 5.5 ± 0.3 | 4.0 ± 0.2 | C18 → 20 |
| 22:4n-3 | ND | 0.6 ± 0.1 | 0.5 ± 0.0 | C20 → 22 |
| 24:4n-3 | ND | ND | ND | C22 → 24 |
| Total | 1.4 ± 0.3 | 6.0 ± 0.3 | 4.5 ± 0.2 | |
18:3n-6 | 20:3n-6 | ND | 7.2 ± 0.7 | 5.0 ± 0.1 | C18 → 20 |
| 22:3n-6 | ND | 1.2 ± 0.4 | 0.7 ± 0.1 | C20 → 22 |
| 24:3n-6 | ND | ND | ND | C22 → 24 |
| Total | | 8.3 ± 0.3 | 5.7 ± 0.2 | |
20:5n-3 | 22:5n-3 | ND | 3.9 ± 0.2 | 5.7 ± 0.8 | C20 → 22 |
| 24:5n-3 | ND | 0.5 ± 0.1 | 1.2 ± 0.2 | C22 → 24 |
| 26:5n-3 | ND | ND | 0.2 ± 0.0 | C24 → 26 |
| 28:5n-3 | ND | ND | 0.1 ± 0.0 | C26 → 28 |
| 30:5n-3 | ND | ND | 0.2 ± 0.0 | C28 → 30 |
| 32:5n-3 | ND | ND | 1.1 ± 0.1 | C30 → 32 |
| Total | | 4.4 ± 0.3 | 8.3 ± 1.1 | |
20:4n-6 | 22:4n-6 | ND | 4.4 ± 0.2 | 10.5 ± 2.0 | C20 → 22 |
| 24:4n-6 | ND | 0.7 ± 0.1 | 2.8 ± 0.5 | C22 → 24 |
| 26:4n-6 | ND | 0.6 ± 0.1 | 0.6 ± 0.1 | C24 → 26 |
| 28:4n-6 | ND | ND | 0.3 ± 0.1 | C26 → 28 |
| 30:4n-6 | ND | ND | 2.4 ± 0.3 | C28 → 30 |
| 32:4n-6 | ND | ND | 3.8 ± 0.5 | C30 → 32 |
| Total | | 5.1 ± 0.3 | 20.4 ± 2.8 | |
22:5n-3 | 24:5n-3 | ND | 1.5 ± 0.1 | 3.1 ± 0.1 | C22 → 24 |
| 26:5n-3 | ND | ND | 0.2 ± 0.1 | C24 → 26 |
| 28:5n-3 | ND | ND | 0.2 ± 0.1 | C26 → 28 |
| 30:5n-3 | ND | ND | 0.5 ± 0.1 | C28 → 30 |
| 32:5n-3 | ND | ND | 2.2 ± 0.2 | C30 → 32 |
| Total | | 1.5 ± 0.1 | 6.1 ± 0.2 | |
22:4n-6 | 24:4n-6 | ND | 1.7 ± 0.2 | 2.4 ± 0.4 | C22 → 24 |
| 26:4n-6 | ND | ND | 0.3 ± 0.1 | C24 → 26 |
| 28:4n-6 | ND | ND | 0.2 ± 0.0 | C26 → 28 |
| 30:4n-6 | ND | ND | 1.6 ± 0.3 | C28 → 30 |
| 32:4n-6 | ND | ND | 2.6 ± 0.5 | C30 → 32 |
| Total | | 1.7 ± 0.2 | 7.1 ± 1.1 | |
22:6n-3 | 24:6n-3 | ND | ND | ND | C22 → 24 |
Results are expressed as the percentage of the fatty acid substrate converted to elongated fatty acid products with the following calculation: [individual product area / (all products area + substrate area)] × 100. Value is presented as the mean ± SEM (n = 3). |
Total conversion is calculated based on summation of all of the elongated products. |
ND = not detected; SFA= Saturated fatty acid; PUFA= Polyunsaturated fatty acid |
Fatty acid composition of P. micromegethes
The FA composition of the P. micromegethes whole body was dominated by SFAs (Table 6). Among these, C16:0 was the predominant FA, while among the monounsaturated fatty acids (MUFAs), C18:1n-7 was the most abundant. As for PUFA, the proportion of total n-6 PUFA was more than twice that of n-3 PUFA, with major contributions from LA and ARA. Within the n-3 PUFA, the percentage of DHA was higher than that of EPA.
Table 6
Whole body fatty acid composition of P. micromegethes
Fatty acid | Total fatty acid detected (%) |
C14:0 | 2.0 ± 0.1 |
C15:0 | 1.6 ± 0.2 |
C16:0 | 29.8 ± 1.3 |
C17:0 | 3.0 ± 0.4 |
C18:0 | 16.9 ± 0.6 |
∑ SFAs | 53.4 ± 1.7 |
C14:1 | 1.3 ± 0.1 |
C15:1 | 0.7 ± 0.1 |
C16:1 | 3.8 ± 0.4 |
C17:1 | 0.5 ± 0.0 |
C18:1n-9 | 0.9 ± 0.3 |
C18:1n-7 | 15.8 ± 0.3 |
C20:1n-9 | 0.4 ± 0.0 |
∑ MUFAs | 23.4 ± 0.5 |
C18:3n-3 (ALA) | 0.4 ± 0.1 |
C18:4n-3 | 0.2 ± 0.0 |
C20:3n-3 | - |
C20:4n-3 | 0.2 ± 0.0 |
C20:5n-3 (EPA) | 0.3 ± 0.0 |
C22:5n-3 | 0.3 ± 0.0 |
C22:6n-3 (DHA) | 4.7 ± 0.8 |
C18:2n-6 (LA) | 8.0 ± 0.6 |
C18:3n-6 | 0.4 ± 0.0 |
C20:3n-6 | 0.5 ± 0.1 |
C20:4n-6 (ARA) | 6.3 ± 0.7 |
∑ PUFAs | 21.4 ± 2.0 |
∑ n-3 | 6.2 ± 0.8 |
∑ n-6 | 15.2 ± 1.2 |
n-3/n-6 | 0.4 ± 0.0 |
SFA = Saturated fatty acid; MUFA = Monounsaturated fatty acid; PUFA = Polyunsaturated fatty acid; ALA = α-Linolenic acid; EPA = Eicosapentaenoic acid; DHA = Docosahexaenoic acid; LA = Linoleic acid; ARA = Arachidonic acid. Values are presented as the mean ± SEM of triplicate measurements. |
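The class sums and the n-3/n-6 ratio in Table 6 can be recomputed from the individual mean values with a minimal sketch such as the one below. The variable and function names are hypothetical; C20:3n-3 is omitted because no value is reported for it, and sums of rounded means may differ from the reported sums by 0.1.

```python
# Mean percentages copied from Table 6 (C20:3n-3 has no reported value).
N3 = {"C18:3n-3": 0.4, "C18:4n-3": 0.2, "C20:4n-3": 0.2,
      "C20:5n-3": 0.3, "C22:5n-3": 0.3, "C22:6n-3": 4.7}
N6 = {"C18:2n-6": 8.0, "C18:3n-6": 0.4, "C20:3n-6": 0.5, "C20:4n-6": 6.3}


def class_total(fatty_acids):
    """Sum of the listed percentages, rounded to one decimal."""
    return round(sum(fatty_acids.values()), 1)


n3_total = class_total(N3)                    # reported in Table 6 as 6.2
n6_total = class_total(N6)                    # reported in Table 6 as 15.2
n3_n6_ratio = round(n3_total / n6_total, 1)   # reported in Table 6 as 0.4
n6_fold = n6_total / n3_total                 # how many times n-6 exceeds n-3
print(n3_total, n6_total, n3_n6_ratio, round(n6_fold, 2))
```

The recomputed n-6 total is roughly 2.5 times the n-3 total, consistent with the n-3/n-6 ratio of 0.4 given in the table.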