A novel, unique four-member protein family involved in extracellular fatty acid binding in Yarrowia lipolytica

doi:10.21203/rs.3.rs-1949552/v1

Background: Yarrowia lipolytica, a non-conventional oleaginous yeast species, has attracted attention due to its high lipid degradation and accumulation capacity. Y. lipolytica is used as a chassis for the production of usual and unusual lipids and lipid derivatives. While genes involved in the intracellular transport and activation of fatty acids in the different cellular compartments have been characterized, no genes involved in fatty acid transport from the extracellular medium into the cell have been identified so far. In this study, we have identified secreted proteins involved in extracellular fatty acid binding.

Results: The recent analysis of the Y. lipolytica secretome led to the identification of a multi-gene family composed of four secreted proteins, hereafter named UP1 to UP4. The protein products were efficiently over-expressed individually in native and multi-deletant strain (Q4: Δup1Δup2Δup3Δup4) backgrounds. Phenotype analysis demonstrated the involvement of these proteins in the binding of extracellular fatty acids. In addition, deletion of these genes could prevent octanoic acid (C8) toxicity, while their individual over-expression increased sensitivity to its toxic action. The results suggested binding in an aliphatic chain length- and fatty acid concentration-dependent manner. 3D structure modelling supports, at a molecular level, their role in fatty acid accommodation.

Conclusions: Extracellular fatty acid binding proteins were identified for the first time in Y. lipolytica. The names eFbp1 to eFbp4 are proposed for the members of the new gene family. The exact mode of action of the eFbps, individually and synergistically, remains to be deciphered. Nevertheless, the proteins are expected to be relevant in lipid biotechnology, for example in improving fatty acid production and/or bioconversion.

Yarrowia lipolytica

lipid

fatty acid binding protein

fatty acid transport

secretion

This is the first report on a multi-gene family of secretory proteins involved in the binding of extracellular fatty acid, with different chain length preferences, having potential practical application in fatty acid bioconversions.

Yarrowia lipolytica is a yeast species known for its high capacity for assimilation, de novo synthesis and storage of lipids and lipid derivatives. By comparison to the yeast model Saccharomyces cerevisiae, which accumulates lipids above 20% of its dry weight at a TAG:SE (triacylglycerol to sterol esters) ratio of 1:1, Y. lipolytica can natively accumulate more than 50% of lipids in its dry weight, at a ratio of 3:1, a trait that is highly desirable in a biotechnological species. Indeed, owing to such characteristics, Y. lipolytica has become a model yeast species in research on lipid metabolism and turnover [1, 2].

The process of hydrophobic substrate (HS) assimilation is highly complex. It involves multiple metabolic pathways localized in different cellular compartments. Depending on the environmental stimuli, the onset of the assimilation process may involve emulsification and/or enzymatic hydrolysis. Emulsification is executed by a small, extracellular glycoprotein termed liposan, whose synthesis is stimulated upon growth on HS [3, 4]. Its chemical composition has not been definitively settled, but the estimates range from 5–50% of protein, 10–75% of lipids and 20–83% of carbohydrates [4–6]. Its action substantially decreases the size of lipid droplets formed in an aqueous medium, increasing accessibility of the non-miscible HSs. Growth on TAGs requires an additional step of enzymatic hydrolysis, which is catalyzed by members of the lipase/esterase families [4, 7, 8]. Y. lipolytica genome mining revealed a gene family of lipases (GL3R0084) containing sixteen members, including Lip2 (YALI0A20350g), Lip4 (YALI0E08492g), Lip5 (YALI0E02640g) and Lip7 to Lip19 (for details see [7]). In addition, a four-member lipase/esterase family (GL3C3695) has been identified. That protein family is composed of Lip1 (YALI0E10659g), Lip3 (YALI0B08030g), Lip6 (YALI0C00231g) and Lip20 (YALI0E05995g). The enzymes differ in terms of substrate specificity, activity and expression profile [4, 9–12]. In particular, it has been estimated that the secreted Lip2 is responsible for the vast majority of extracellular lipolytic activity. Overall, the presence of such expanded multi-gene families is a hallmark of the high-level adaptation of this yeast to growth on HS.

The inducible production of the surfactant and the lipases is part of the so-called surface-mediated transport mechanism [4]. The second mechanism enabling efficient utilization of HS is direct interfacial transport, which relies on the binding of HS droplets onto the cell surface [4]. That mechanism is realized by an HS-inducible decrease of the cell surface polarity, and exposure of specific protrusions or hydrophobic outgrowths that collectively dock the HS droplets on the cell surface [13]. These protrusions were observed as electron-dense channels connecting the exposed terminus of the protrusion with the cell interior [1, 13, 14]. Subsequently, the HS (fatty acid, FA, or alkane) is passed through the cell membrane and incorporated into the cell's metabolism, either by being degraded via β-oxidation for the production of energy, incorporated into membrane structures, or stored in specialized lipid bodies (LB) for further use. The downstream compartments containing enzymes involved in the degradation of HS are the endoplasmic reticulum, mitochondria, LBs and, foremost, peroxisomes. The compartment in which a respective alkane/FA will be processed depends on the aliphatic chain length [8, 15–17]. Long-chain alkanes/FAs are activated in the cytoplasm by fatty-acyl–CoA-synthetase (FAA1), which is followed by transport into the peroxisome by the action of the transporters PXA1/PXA2. On the other hand, medium- and short-chain FAs are not activated in the cytoplasm, but in the peroxisomes, by the action of peroxisomal fatty-acyl–CoA-synthetases (AAL genes). The mode of their internalization into the peroxisomes remains elusive.

The high capacity of Y. lipolytica for assimilation, de novo synthesis and accumulation of lipids is correlated with an expanded number of genes involved in lipid turnover. Apart from the above-mentioned multi-gene families of lipases/esterases, such amplification was also observed for other lipid-related genes. For example, acyl-CoA oxidase (AOX), which catalyzes the first and rate-limiting step of peroxisomal β-oxidation, is present in a single copy in the S. cerevisiae genome, but Y. lipolytica contains six Aox isoenzymes (encoded by the POX1 to POX6 genes), with different substrate specificities and different activity levels [18, 19]. Amongst them, Aox2 preferentially oxidizes long-chain acyl-CoAs [20], Aox3 is specific for short-chain acyl-CoAs [21], Aox4 and Aox5 both act independently of the substrate chain length [19], while Aox1 and Aox6 are specific to dicarboxylic acid degradation [22]. Moreover, Y. lipolytica genome surveys revealed that, for the transformation of alkane to alcohol, there is a single gene coding for the NADPH-cytochrome P450 reductase, but as many as twelve genes coding for cytochrome P450 isoforms. The multigene family of cytochrome P450s (ALK genes) includes ALK1, ALK5 and ALK11, which are specific for short-chain alkanes (C10), and ALK2, which is specific for long-chain alkanes (C16) [23, 24]. Evidence for the specificity of ALK3, ALK5 and ALK7 towards short-chain FAs, and of the ALK2, ALK5, ALK7 or ALK10 gene products towards long-chain FAs, has also been published [1, 2, 4, 11]. Correspondingly, the transcription factors that activate expression of alkane-inducible genes belong to a three-member family of proteins, Yas1 to Yas3, binding to the alkane-responsive element (ARE1) [25–27]. Likewise, in S. cerevisiae, a single fatty-acyl–CoA-synthetase (ScFaa2) catalyzes cytoplasmic activation of FA prior to its oxidation in peroxisomes, but as many as ten genes encoding isozymes of the Aal proteins were identified in Y. lipolytica [28].
Eight of the Aals were upregulated by HS in the medium, and all of them contained a peroxisomal targeting signal, PTS (SKL). Complementation tests conducted in a faa1Δant1Δ background (Ant1 is a peroxisomal ATP transporter, [15, 16]) showed that overexpression of a cytoplasmic version of Aal1 partly restored growth of the mutant on short-, medium- and long-chain FA media, while complementation with Aal2 to Aal10 enabled growth only in short-chain FA-based medium. Additional analyses indicated that Aal4 and Aal6 show substrate specificity towards C16:1 and/or C18:0 [28]. Collectively, those previous reports demonstrate the high complexity and multiplicity of genes involved in lipid metabolism in Y. lipolytica.

While much is known about lipid metabolism in Y. lipolytica, the specific, individual step of the passage of the aliphatic moiety through the plasma membrane remains largely unknown. Early studies on the kinetics of uptake of different FAs provided the first insight into this process [29]. It was suggested that FA is internalized into Y. lipolytica cells by a substrate concentration-dependent, saturable, carrier-mediated and energy-independent mechanism, as the process was operable irrespective of metabolic energy provision and membrane potential formation. In addition, it was suggested that at least two individual transport systems of different specificity co-exist in Y. lipolytica. The specificity of the two systems was inferred from competition experiments, which clearly demonstrated that one of the systems is specific towards C12-C14, while the other is specific towards C16-C18. It was also shown that Y. lipolytica is unable to internalize C8 and C10, which, in addition to the high toxicity of short-chain FAs and alkanes [30, 31], is the reason for its inability to grow efficiently on these substrates. Some other studies suggested that transporters belonging to the ABC1 family may play a role in the permeation of alkanes through the plasma membrane [22]. Four genes highly homologous to the ABC1 gene were identified in the Y. lipolytica genome: ABC1 (YALI0E14729g), ABC2 (YALI0C20265g), ABC3 (YALI0B02544g) and ABC4 (YALI0B12980g). That previous research suggested that Abc1 may be involved in the transport of C16 alkane, and Abc2 in that of C10 alkane. Another study showed that transcription of the ABC2 and ABC3 transporters was enhanced upon exposure to a range of alkanes in Y. lipolytica [31]. Their overexpression in S. cerevisiae revealed that Abc2 and Abc3 act as efflux pumps, leading to improved tolerance of the host against C9 and C10 alkanes.
In addition, Fat1 (YALI0E16016g), which bears the activity of a very-long-chain fatty-acyl-CoA synthetase, was also suggested to be involved in FA transport across the cell membrane [17]. Nevertheless, the exact mechanism by which hydrophobic compounds pass through a membrane is still highly controversial. Correspondingly, the genes involved in that process remain unidentified. Systematic screening of insertional mutants [15, 32, 33] did not allow for the identification of such individual genes. Therefore, based on that previous systematic screening of insertional mutants and the knowledge gained on other multi-gene families involved in lipid metabolism in Y. lipolytica, we hypothesized that the genes involved in FA/alkane internalization may belong to a multi-gene family with overlapping specificity.

In this study, based on secretory proteome data mining, we identified a new, unique protein family composed of four members (working names UP1 to UP4, for Unknown Proteins), with no similarity to any other protein sequence. Further examination of this uncharacterized gene family led us to the hypothesis that its members could be involved in FA fixation. By phenotypic characterization of a quadruple knock-out mutant and individual over-expression strains, we concluded that the biological function of the family members is FA internalization. The growth profiles of the generated strains suggested that the family members possess overlapping substrate specificities in terms of aliphatic chain length. Bioinformatic modelling of the 3D structures confirmed their structural adaptation to FA binding. Altogether, these results demonstrate that we identified for the first time a new, unique gene family of FA binding proteins in Y. lipolytica. The proposed name of the novel protein family is eFbp (for extracellular fatty acid binding protein).

Novel protein family found in Y. lipolytica secretome – basic analysis of amino acid sequences

Recently, the total secretome of Y. lipolytica W29 strains producing heterologous proteins under industrial fermentation conditions (10-liter fermenter in fed-batch mode) has been determined by high-throughput proteomics (Onésime et al, to be published). Secretome data mining with X!TandemPipeline allowed for identification of three proteins of unknown function, encoded by YALI0C05687g, YALI0D03245g and YALI0F04620g, with a coverage of 7.08 (7.66)%, 11.66 (12.62)%, and 19.03 (20.57)% (for the full length and the mature form), with an E-value of 50.199, 62.872 and 23.919, respectively. Blast analysis against the Y. lipolytica genome from the GRYC database showed that the proteins belong to a multi-gene family of four members (plus YALI0F04598g) of unknown function (Table 1). Sequence-based predictions indicated that the polypeptide chains are composed of 223 to 226 amino acids (complete form) / 206 to 209 amino acids (mature forms), and show a molecular weight from 21678 Da to 22725 Da and an isoelectric point (PI) ranging from 5.1 to 8.1. Systematic gene names from the genomes of Y. lipolytica strains E150 and W29, and abbreviated names used hereafter for convenience, are given in Table 1.

Table 1

Nomenclature and basic biochemical characteristics of the newly identified proteins. Systematic gene names are from the E150 and W29 wild-type strain genomes. Number of amino acids in the polypeptide chain, predicted molecular weight (in Da) and predicted isoelectric point (PI) of the mature form. Molecular weight prediction was based on the mature form (monoisotopic mass from Expasy).

E150 strain genome reference	W29 strain genome reference	working name	Proposed new name	AA number*	Molecular Weight*	Predicted PI*
YALI0D03245g	YALI1_D04128g	UP1	eFbp1	223 (206)	21717.43	5.40
YALI0F04598g	YALI1_F07042g	UP2	eFbp2	224 (207)	21678.16	5.10
YALI0C05687g	YALI1_C07288g	UP3	eFbp3	226 (209)	21950.50	6.47
YALI0F04620g	YALI1_F07093g	UP4	eFbp4	226 (209)	22725.72	8.09

* For the mature form, AA: Amino acid
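The paired coverage values reported above (full length versus mature form) can be cross-checked against the chain lengths in Table 1. The following is a minimal sketch, assuming that no identified peptide falls within the signal peptide, so that the same number of covered residues is simply divided by a shorter chain. The function name is illustrative and not part of the original analysis.

```python
def mature_coverage(full_cov_pct: float, full_len: int, mature_len: int) -> float:
    """Rescale a sequence coverage (%) from the full-length to the mature chain.

    Assumption (not stated in the source): every covered residue lies in the
    mature region, i.e. no peptide maps to the signal peptide.
    """
    covered_residues = full_cov_pct / 100.0 * full_len
    return 100.0 * covered_residues / mature_len


# (full-length coverage %, full length, mature length) from the text and Table 1
proteins = {
    "UP3 (YALI0C05687g)": (7.08, 226, 209),
    "UP1 (YALI0D03245g)": (11.66, 223, 206),
    "UP4 (YALI0F04620g)": (19.03, 226, 209),
}

for name, (cov, full_len, mature_len) in proteins.items():
    rescaled = mature_coverage(cov, full_len, mature_len)
    print(f"{name}: {rescaled:.2f}% of the mature form")
```

Under this assumption, the rescaled values agree with the reported mature-form coverages (7.66, 12.62 and 20.57%) to within rounding.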

Comparison of the amino acid sequences showed that the four proteins are highly similar, as they share between 50 and 70% sequence identity (Fig. 1). In addition, all the proteins carry a predicted 17-amino-acid signal peptide, with a probability of 0.9946 for D03245g (UP1) (MKFSHVTLAVVAATAIA), 0.9991 for F04598g (UP2) (MQFSTLALVTFAATAMA), 0.9926 for C05687g (UP3) (MKFSAVAVAAVASSALA), and 0.9995 for F04620g (UP4) (MKLSAVTFIALSAVCLA). For each protein, a similar 3D fold composed of a six-helix bundle was predicted (Fig. 1 and section 3D structure modelling).
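As a quick consistency check, the signal peptide sequences quoted above can be compared with the chain lengths in Table 1: removing a 17-residue signal peptide from the complete form should give the mature length. The sketch below uses only values stated in the text; the variable names are illustrative.

```python
# Predicted signal peptides as quoted in the text
signal_peptides = {
    "UP1": "MKFSHVTLAVVAATAIA",
    "UP2": "MQFSTLALVTFAATAMA",
    "UP3": "MKFSAVAVAAVASSALA",
    "UP4": "MKLSAVTFIALSAVCLA",
}

# Complete-form lengths from Table 1
full_lengths = {"UP1": 223, "UP2": 224, "UP3": 226, "UP4": 226}

# Mature length = complete length minus the signal peptide length
mature_lengths = {
    name: full_lengths[name] - len(peptide)
    for name, peptide in signal_peptides.items()
}

for name, peptide in signal_peptides.items():
    print(f"{name}: signal peptide {len(peptide)} aa, mature form {mature_lengths[name]} aa")
```

The computed mature lengths (206, 207, 209 and 209 residues) match the values given in parentheses in Table 1.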

Uniqueness of the UPs due to lack of similarity with other protein sequences

The complete UP sequences were subsequently blasted against available protein sequence databases. Strikingly, the only similarity was found with homologous sequences from other Y. lipolytica strains (apart from E150 and W29, they are also present in the German H222 and Polish A101 strains; data not shown). Since screening of the protein databases using the complete polypeptide sequences as queries failed to retrieve any significant hits beyond Y. lipolytica homologs, the conserved stretches of amino acid sequence found in the multiple alignment were used instead. Seven conserved motifs, numbered from 1 to 7, were identified within each UP sequence: AAP[TS], APV[FY][TS]LAPxxFA, GFLDFSGY, GT[KR]FD[KQ]AVY[EA]F[IL][VI]NSGx[KS]DFL, [IF]LxSPLL, W[IL]FGxKQTVQ and [TS]GF[DN]RA. They served as queries to search the NCBI and UniProt protein sequence libraries. As expected, motif 1, localized to the signal peptide (at its C-terminus), was present in the highest number of protein sequences stored in the NCBI and UniProt databases (3192524 and 2106782 hits, respectively). Similarly, motifs 5 and 7 were found relatively frequently (18712/13174 and 7511/5086 hits). On the other hand, motifs 2, 4 and 6 were identified in only five (NCBI database) or seven (UniProt) protein sequences within the respective library. Strikingly, irrespective of the queried database, the four UPs were among the identified motif-bearing proteins. Such an outcome highlights that each of these conserved motifs is always accompanied by the other two, and that the motifs are specific and unique to the newly identified protein family.
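The motifs above are written in a PROSITE-like shorthand, in which square brackets list alternative residues and x stands for any residue. A minimal sketch of how such patterns could be converted into regular expressions and scanned against a sequence is given below. The helper names and the toy sequence are hypothetical and are not taken from the UP proteins, and this is not the database search procedure used in the study.

```python
import re

# The seven conserved motifs as listed in the text
MOTIFS = {
    1: "AAP[TS]",
    2: "APV[FY][TS]LAPxxFA",
    3: "GFLDFSGY",
    4: "GT[KR]FD[KQ]AVY[EA]F[IL][VI]NSGx[KS]DFL",
    5: "[IF]LxSPLL",
    6: "W[IL]FGxKQTVQ",
    7: "[TS]GF[DN]RA",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Convert the shorthand to a regex: 'x' becomes '.', brackets stay as is."""
    return re.compile(motif.replace("x", "."))


def find_motifs(sequence: str) -> dict:
    """Return {motif number: [0-based start positions]} for motifs found."""
    hits = {}
    for number, motif in MOTIFS.items():
        positions = [m.start() for m in motif_to_regex(motif).finditer(sequence)]
        if positions:
            hits[number] = positions
    return hits


# Hypothetical toy sequence built only to exercise motifs 1, 3 and 6
toy_sequence = "MMAAPTGGGFLDFSGYKKWIFGAKQTVQ"
print(find_motifs(toy_sequence))
```

Replacing x with a dot is safe here because the residue letters in the motifs are all upper case, so the lower-case x only ever denotes the wildcard position.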

Cloning and overexpression of the UP genes

As no indication of the biological function of the UPs could be inferred from similarity searches of protein databases, overexpression strains were designed and constructed. Each UP gene was amplified with its specific primer pair (additional Table 1) from W29 wild-type genomic DNA, and cloned as a BamHI/AvrII fragment into JMP4230 (additional Fig. 1), giving rise to plasmids JMP4440, JMP4442, JMP4444 and JMP4448 for UP1, UP2, UP3 and UP4 overexpression, respectively (Table 2). The individual genes were overexpressed under the control of a strong erythritol-inducible (ERY) promoter [34].

Table 2

Plasmids used in this study.

Plasmid Name	Characteristics	Use	Reference
JMP4230	JMP62 pHu8EYK URA3ex	overexpression	[35]
JMP4440	UP1; D03245g cloned in JMP4230	overexpression	This work
JMP4442	UP2; F04598g cloned in JMP4230	overexpression	This work
JMP4444	UP3; C05687g cloned in JMP4230	overexpression	This work
JMP4448	UP4; F04620g cloned in JMP4230	overexpression	This work
JMP4472	GGA-URA3ex_CRISPRrCas9-yl_RFP	gRNA for gene disruption	[36]
JMP4393	GGA-LYS5ex_CRISPRrCas9-yl_RFP	gRNA for gene disruption	[36]

The expression cassettes were released from the plasmids by NotI digestion and transformed into strain JMY7126, which is deleted for the three genes encoding the main secreted lipases (Lip2, Lip7 and Lip8) and for the EYK1 gene for optimal erythritol induction [35]. This cloning strategy gave rise to strains JMY7283 (overexpressing UP1), JMY7287 (overexpressing UP2), JMY7291 (overexpressing UP3) and JMY7295 (overexpressing UP4) (Table 3).

Table 3

Strains used in this study

Name	Genotype	Auxotrophy	Reference
JMY399	W29 wild-type French strain	no	[3]
JMY7126	MATA ura3-302 leu2-270-LEU2-Zeta, xpr2-322, lip2Δ, lip7Δ, lip8Δ, lys5Δ, eyk1Δ	Ura^-, Lys^-	[35]
JMY7283	JMY7126+jmp4440 D03245g (UP1^oe)	Lys^-	This study
JMY7287	JMY7126+jmp4442 F04598g (UP2^oe)	Lys^-	This study
JMY7291	JMY7126+jmp4444 C05687g (UP3^oe)	Lys^-	This study
JMY7295	JMY7126+jmp4446 F04620g (UP4^oe)	Lys^-	This study
JMY8651	JMY7126+ GGE114 +YALI0B21582gΔ* (fil- ; mhy1Δ)	Lys^-	This study
JMY8673	JMY7126 +YALI0C05687gΔ (Q1-up3Δ)	Ura^-, Lys^-	This study
JMY8674	JMY7126 +YALI0D03245gΔ (Q1-up1Δ)	Ura^-, Lys^-	This study
JMY8675	JMY7126 +YALI0F04598gΔ (Q1-up2Δ)	Ura^-, Lys^-	This study
JMY8683	JMY7126 +YALI0F04620gΔ (Q1-up4Δ)	Ura^-, Lys^-	This study
JMY8684	JMY8674 + YALI0F04598gΔ (Q2-up1Δ up2Δ)	Ura^-, Lys^-	This study
JMY8700	JMY8684 + YALI0F04620gΔ (Q3-up1Δ up2Δ up4Δ)	Ura^-, Lys^-	This study
JMY8748	JMY8700 + YALI0C05687gΔ (Q4-up1Δ up2Δ up4Δ up3Δ)	Ura^-, Lys^-	This study
JMY8761	JMY8748 + YALI0B21582gΔ (Q4, fil- ; Q4 mhy1Δ)	Ura^-, Lys^-	This study
JMY8777	JMY8761 + jmp4230 (Q4 URA3)	Lys^-	This study
JMY8778	JMY8761 + jmp4230 (Q4 URA3)	Lys^-	This study
JMY8779	JMY8761+jmp4444 C05687g (Q4-mhy1Δ UP3^oe)	Lys^-	This study
JMY8780	JMY8761+jmp4444 C05687g (Q4- mhy1Δ UP3^oe)	Lys^-	This study
JMY8781	JMY8761+jmp4440 D03245g (Q4- mhy1Δ UP1^oe)	Lys^-	This study
JMY8782	JMY8761+jmp4440 D03245g (Q4- mhy1Δ UP1^oe)	Lys^-	This study
JMY8783	JMY8761+jmp4442 F04598g (Q4- mhy1Δ UP2^oe)	Lys^-	This study
JMY8784	JMY8761+jmp4442 F04598g (Q4- mhy1Δ UP2^oe)	Lys^-	This study
JMY8785	JMY8761+ jmp4446 F04620g (Q4- mhy1Δ UP4^oe)	Lys^-	This study
JMY8786	JMY8761+ jmp4446 F04620g (Q4- mhy1Δ UP4^oe)	Lys^-	This study

oe: overexpression, *GGE114: pSBA-U-Z_NDV_Acceptor vector: pSB1A3-Zeta up Not-Ura Bsa-RFP-Bsa Zeta down Not.

Synthesis and secretion of UP proteins in erythritol inducible strains

The overexpressing strains were subjected to shake flask batch cultivations to study the UP overproduction pattern by proteomic analysis in non-induced (YNBD2) and induced (YNBD2E) media (Fig. 2). The band pattern observed in SDS-PAGE gels was unexpected, as the most intense bands appearing in the erythritol-induced cultures migrated below the anticipated area (red arrow in Fig. 2).

Thus, three regions were excised from each lane and subjected to proteomic analysis: region 1, around the expected size of the UP proteins (about 22 kDa), and regions 2 and 3, containing intense protein bands (boxed in Fig. 2). The aim was to determine the number of identifying peptides in each of the bands and the protein coverage under both non-induced (G) and induced (E) conditions. As shown in Table 4, the numbers of identifying peptides were higher in bands formed in the concentrated supernatants from induced cultures (E), but migrating at < 14 kDa, thus indicating that the UP proteins did not migrate at the expected size. The sequence coverage by the peptides identified under the inducing condition was high, ranging from 33.6 to 57% of the mature forms. Abundance was variable between UPs, as shown by the number of identifying peptides detected under the induced condition: 19 to 11 spectra (Table 4).

Table 4

Proteomic analysis of secreted UP proteins. The number of peptides identifying respective UP protein in the most intensive bands migrating at < 14 kDa. The bands were excised from the SDS-PAGE gel (see Fig. 2) and analyzed by high resolution mass spectrometry. Coded names A2 – concentrated supernatant from UP3 C05687g, B2 - UP1 D03245g, C2 - UP2 F04598g, D2 - UP4 F04620g under non-induced (G) and induced (E) conditions.

UP genes	A2G	A2E	B2G	B2E	C2G	C2E	D2G	D2E
UP1 D03245g	5	17	5	5	4	7	4	4
UP2 F04598g	2	3	2	66	8	3	3	2
UP3 C05687g	1	2	9	3	1	111	6	2
UP4 F04620g	2	5	3	4	0	10	19	32
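The counts in Table 4 can be summarized by asking, for each UP, in which sample the largest number of identifying peptides was detected. The following is a small sketch that only re-reads the table values; the helper name is illustrative and no normalization of the spectra is attempted.

```python
# Sample columns and peptide counts transcribed from Table 4
columns = ["A2G", "A2E", "B2G", "B2E", "C2G", "C2E", "D2G", "D2E"]
counts = {
    "UP1 D03245g": [5, 17, 5, 5, 4, 7, 4, 4],
    "UP2 F04598g": [2, 3, 2, 66, 8, 3, 3, 2],
    "UP3 C05687g": [1, 2, 9, 3, 1, 111, 6, 2],
    "UP4 F04620g": [2, 5, 3, 4, 0, 10, 19, 32],
}


def top_sample(row):
    """Return (sample name, count) for the column with the highest count."""
    best_index = max(range(len(row)), key=row.__getitem__)
    return columns[best_index], row[best_index]


summary = {protein: top_sample(row) for protein, row in counts.items()}

for protein, (sample, count) in summary.items():
    condition = "induced" if sample.endswith("E") else "non-induced"
    print(f"{protein}: highest count {count} in {sample} ({condition})")
```

In every row the highest count falls in an induced (E) sample, which is the pattern the text describes.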

While the proteomic analysis confirmed the correct synthesis and secretion of the UPs in the over-expressing strains, which was much increased upon induction, it also indicated that the other UPs were constitutively expressed from their native promoters in these media (identified at lower abundance, typically based on 1-10 identifying peptides). Such an outcome could impair adequate phenotype analysis. Therefore, it was necessary to first construct a quadruple deletant strain (Q4), and then to construct Q4 derivatives individually overexpressing each UP protein in that background.

Overexpression of UPs in a quadruple deletant strain (Q4-mhy1Δ) and phenotype analysis

Since accurate phenotype analysis could be impaired by unintended co-secretion of UPs other than the targeted one in the overexpressing strains, a quadruple deletant strain (Q4) was constructed. The strategy comprised successive gene deletions using the CRISPR-Cas9 method [36], as illustrated in Figure 3. First, replicative plasmids CRISPR-Cas9-gRNA-UPs-URA3 and CRISPR-Cas9-gRNA-UPs-LYS5 were constructed using the gRNA primer pairs designed for the corresponding target sites (additional Table 1). The plasmids were co-transformed into strain JMY7126, and prototrophic transformants were selected on minimal medium YNBD2. After transformant selection, the corresponding UP locus was amplified, screened for deletion and sequenced. After confirmation of the gene deletion, the strains were grown in YPD to cure the replicative CRISPR-Cas9 plasmid. Strains bearing the expected deletion were stored (Table 3). The UP1 to UP4 single deletants (Q1) were named JMY8674 (up1Δ), JMY8675 (up2Δ), JMY8673 (up3Δ) and JMY8683 (up4Δ) (Fig. 3). Then, multiple gene deletion was initiated from Q1-up1Δ by co-transformation with the CRISPR-Cas9-gRNA-UPs plasmids together with a PCR fragment amplified from the corresponding deleted strain, resulting in the Q4 strain JMY8748. In addition, since filamentation is known to affect HS phenotype analysis, the MHY1 gene deletion, previously shown to abolish hyphae formation [37], was also introduced into the Q4 deletant using a CRISPR-Cas9-gRNA-MHY1-LYS5 vector. The resulting strain was then transformed with the UP overexpression cassettes, resulting in the overexpressing strains Q4-mhy1Δ-UP1^OE, Q4-mhy1Δ-UP2^OE, Q4-mhy1Δ-UP3^OE and Q4-mhy1Δ-UP4^OE. Strain JMY8761 was also transformed with an empty vector containing URA3, giving rise to JMY8777 (Q4-mhy1Δ-URA3), which was used as a control (Fig. 3).

Phenotype studies on HS of different aliphatic chain length

Assuming the involvement of the UPs in hydrophobic substrate utilization, we screened the quadruple deletant strain (Q4), with all four loci knocked out, and the overexpressing strains for their growth on solid plates containing HS of different aliphatic chain lengths. Strain JMY8777, a derivative of strain JMY8761 transformed with the empty vector, was used as a control (Fig. 4).

As depicted in Figure 4, growth inhibition was observed for all strains grown on short-chain FAs (mC10 to mC14), except for the strain overexpressing UP3, for which growth was still observed up to the 10^-3 dilution. In contrast, on FAs of longer chain length, mC16 and C18:1 (triolein), growth was observed up to the 10^-3 dilution. This demonstrates that the deletion of the four UP genes abolishes growth of the Q4 strain on short-chain fatty acids, particularly on mC10, which implies their specific involvement in the fixation and internalization of short-chain FAs. As growth of Q4 was impaired neither on mC16 nor on triolein, these HS must be fixed and internalized by some other mechanism. Furthermore, overexpression of UP3 alone alleviated the growth inhibition, particularly on mC10, but also on mC12 and mC14. This directly implicates UP3 in the transport of short-chain FAs, an effect that was unique to UP3 on mC10. Both UP2 and UP4 appeared to slightly alleviate the growth retardation of the Q4 strains on mC12 and mC14, suggesting their preference towards these FAs. Overexpression of UP1 in the Q4 background had a minor positive impact on the strain's growth, mainly observed on mC12 plates, where it grew up to the 10^-2 dilution, vs 10^-1 for Q4. Based on this evidence, it was postulated that the UPs are involved in the fixation and internalization of short/medium-chain FAs. It is suggested that they operate in a chain length-dependent manner (UP3 as the sole protein acting on mC10; UP2 and UP4 mainly on mC12 and mC14), but with overlapping specificity (UP1 acts on mC10 to mC14). Interestingly, those differences are consistent with the sequence alignment, which orders the proteins as UP1, UP3, then UP2 and UP4.

Octanoic acid (C8) toxicity in Q4 and the overexpressing strains

Octanoic acid (C8) is known to be very toxic for Y. lipolytica [29, 30]. Presuming the involvement of the UPs in FA transport (based on drop test data; Fig. 4), we aimed to investigate the effects of the Q4 deletion and of individual UP overexpression on C8 toxicity. The Q4 and Q4 derivative strains were grown in minimal media supplemented with different concentrations of C8 (0% to 0.2%; Fig. 5). The growth was monitored in the absence and in the presence of the inducer (erythritol). No major differences could be observed in the absence of C8 (Fig. 5A). As shown in Figure 5B and 5C, deletion of the four UP genes (Q4 strain) increases C8 tolerance upon erythritol induction compared to the control strain (JMY8651). Overexpression of UP3 and UP4 increases C8 toxicity at 0.1% (Fig. 5B), while overexpression of any of the UPs increases C8 toxicity at 0.2% (Fig. 5C). Based on these observations, we postulate that the UPs are involved in the internalization of short-chain FAs. In addition, indications on the substrate specificity of the UPs could be inferred from this assay, as UP3 and UP4 both showed higher affinity towards C8.

3D structure modelling

The four sequences of UP1 to UP4 share nearly 40% amino acid residue identity and hence are clearly homologous. Therefore, they are expected to fold into a similar 3D structure. The similarity of the primary structures of UP1 to 4 to known proteins of the protein database is too low to support consistent homology modelling. We therefore submitted the sequences of UP1-4 to the AlphaFold2 computational tool. AlphaFold is a family of structure prediction tools based on deep learning, which produced high-quality predictions in a blind test of structure prediction (CASP14), including when no clear homologs are known [38]. The 3D structures of UP1-4, as modelled by AlphaFold, are highly similar to each other (Fig. 6).

Their 3D structures are highly similar from position 70 (50 if numbered from the mature proteins) up to position 221 at the end of the sequence (UP1 numbering) (Fig. 6A). Consistently, the core of UP1-4 is predicted to fold into a single domain composed of five helices, on average thirty amino acid residues long, that assemble into a helix bundle (helices 2 to 6 in Fig. 1). Markedly, helix 4 is locally disordered at the same position in all the models, centered on the conserved sequence stretch F[IL][VI]NSGx[KS]DFL. This results in a topological kink that could help packing of the helix bundle. A helix bundle is expected to form an inner cavity, which is observed here for the four proteins. Interestingly, the N-terminal part (20-70 in Fig. 1, or 1-50 in the mature protein) forms an extension outside of the main helical bundle. The predicted structure of this N-terminal part is roughly similar for UP1 to 3 (Fig. 6B), with a single α-helix, but the position of this N-terminal part relative to the main helical domain is not accurately predicted, possibly due to structural flexibility at this terminal end. For UP4, the N-terminal part could also be predicted. In the rank 1 prediction, the N-terminus forms a single α-helix with an additional β-sheet of two strands. However, the rank 2 prediction does not contain these β-strands but instead a structure very similar to the N-terminal part of UP1 to 3. The AlphaFold predictions of the helical domain in UP1 to 4 appear reliable (Fig. 6C), based on the following criteria: i) the pLDDT (predicted local distance difference test), a per-residue confidence metric computed by AlphaFold, is in the 60-80 zone (0-100 scale) for the main helical domain of the rank 1 model of each UP, which suggests that the overall fold of this domain is highly probable, whereas the structure of the N-terminal stretch and its relative position with respect to the helical domain are less confidently predicted; ii) the predicted structures for the 4 independent sequences (UP1 to 4) are highly similar, as expected for these clearly homologous proteins; iii) the independent predictions of the same sequence consistently yield the helical domain. As an example, the predictions rank 1 to 5 of UP1 are shown in Additional Fig. 2.
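AlphaFold writes the per-residue pLDDT into the B-factor column of its PDB output, so the confidence of a region such as the helical domain can be summarized directly from a model file. The following is a minimal sketch using only the standard library. The synthetic five-residue model and the helper names are hypothetical and stand in for a real UP model file, which is not reproduced here.

```python
def make_ca_line(serial: int, res_seq: int, plddt: float) -> str:
    """Build a fixed-column PDB ATOM record for a CA atom (toy data only)."""
    return (
        "ATOM  {:5d}  CA  ALA A{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2s}"
        .format(serial, res_seq, 0.0, 0.0, 0.0, 1.0, plddt, "C")
    )


def mean_plddt(pdb_text: str, first_res: int, last_res: int) -> float:
    """Average the B-factor column (pLDDT) over CA atoms in a residue range."""
    values = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name, 23-26 the residue number,
        # and 61-66 the B-factor, which AlphaFold uses for pLDDT.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            res_seq = int(line[22:26])
            if first_res <= res_seq <= last_res:
                values.append(float(line[60:66]))
    if not values:
        raise ValueError("no CA atoms in the requested residue range")
    return sum(values) / len(values)


# Hypothetical toy model: residues 1-5 with made-up pLDDT values
toy_scores = [50.0, 60.0, 70.0, 80.0, 90.0]
toy_model = "\n".join(
    make_ca_line(i, i, score) for i, score in enumerate(toy_scores, start=1)
)

print(mean_plddt(toy_model, 2, 4))
```

With a real model, the residue range would be set to the region of interest, for example the helical domain versus the N-terminal extension, to compare their average confidence.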

Search of known protein with related 3D structure

The coordinates of the predicted UP1 rank 1 model were submitted to a systematic comparison against the full PDB using the DALI server [39]. A list of non-redundant structures detected as structurally similar to UP1 is shown in Table 5. The closest structures are the ligand binding domains 1 and 2 of protein Mp1 from Talaromyces marneffei (previously known as Penicillium marneffei) and the ligand binding domain of a protein from Aspergillus fumigatus, which show an rms deviation of 2.6 and 2.7 Å, respectively.

Table 5

Relevant protein structures similar to UP1 detected by DALI. Abbreviations: Z: Z score as computed by DALI (a score above 2 is considered significant); rmsd: average deviation of equivalent Cα atoms between the UP1 model and the PDB protein detected as structurally similar; D lali: length of structural alignment; nres: number of residues = length of chain; %id: % identity between the two proteins. The 3 PDB structures indicated in bold are shown in Figure 7.

PDB	Z	rmsd	D lali	nres	%id	Species	Protein and Domain	Bound Molecules
5fb7-B	15.4	2.7	143	151	10	Talaromyces marneffei	MP1 ligand binding domain 1	Arachidonic acid (2 molecules)
5csd-A	15.2	3.2	147	158	10	Talaromyces marneffei	MP1 ligand binding domain 2	Arachidonic acid
5j5K-A	14.8	2.6	142	151	11	Aspergilus fumigatus	AFMP4P ligand binding domain	Palmitic acid
5csd-D	15.1	3.3	151	159	10	Talaromyces marneffei	MP1 ligand binding domain 2	Arachidonic acid (2 molecules)
5ecf-B	15.3	2.8	142	150	9	Talaromyces marneffei	MP1 ligand binding domain 1	Arachidonic acid
5e7x-A	15.2	3.1	147	155	9	Talaromyces marneffei	MP1 ligand binding domain 1	Palmitic acid
6zpp-A	13.4	2.9	144	157	7	Drechmania coniospora	Virulence factor
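As a reading aid for Table 5, the following minimal Python sketch applies the significance threshold quoted in the caption (Z above 2), ranks the hits by Z-score, and reports how much of each PDB chain is covered by the structural alignment. The numeric values are copied from the table; the function names and the coverage measure (alignment length divided by chain length) are illustrative choices, not part of the DALI output.

```python
# Rows copied from Table 5: (PDB chain, Z, rmsd in Å, alignment length, nres, %id)
DALI_HITS = [
    ("5fb7-B", 15.4, 2.7, 143, 151, 10),
    ("5csd-A", 15.2, 3.2, 147, 158, 10),
    ("5j5K-A", 14.8, 2.6, 142, 151, 11),
    ("5csd-D", 15.1, 3.3, 151, 159, 10),
    ("5ecf-B", 15.3, 2.8, 142, 150, 9),
    ("5e7x-A", 15.2, 3.1, 147, 155, 9),
    ("6zpp-A", 13.4, 2.9, 144, 157, 7),
]
Z_CUTOFF = 2.0  # significance threshold stated in the table caption

def significant_hits(hits, z_cutoff=Z_CUTOFF):
    """Keep hits with Z above the cutoff, sorted from highest to lowest Z."""
    kept = [h for h in hits if h[1] > z_cutoff]
    return sorted(kept, key=lambda h: h[1], reverse=True)

def alignment_coverage(hit):
    """Fraction of the PDB chain covered by the structural alignment."""
    return hit[3] / hit[4]

ranked = significant_hits(DALI_HITS)
for hit in ranked:
    print(hit[0], "Z =", hit[1], "coverage =", round(alignment_coverage(hit), 2))

identities = [h[5] for h in DALI_HITS]
print("sequence identity range (%):", min(identities), "to", max(identities))
```

All seven hits lie far above the cutoff while sharing only 7 to 11% sequence identity with UP1, which is why the similarity is detectable from structure but not from sequence.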

Remarkably, these structures are FA binding domains and were solved bound to arachidonic acid (1 or 2 molecules) or palmitic acid [40]. A similar protein in apo form (i.e., without a bound FA chain) was previously solved for Drechmeria coniospora. With the exception of the additional N-terminal extension, which seems to be specific to the Y. lipolytica proteins, all these proteins appear structurally similar, as their core domains share the helix bundle. They are therefore strongly suspected to be functionally similar, as the topology of the helix bundle could support a comparable binding of FAs (Fig. 7). In the known complexes, the FAs are always bound in an elongated hydrophobic pocket between the helices. Such a binding mode would also be possible in the predicted models of UP1-4.

In the known FA complex structures, the orientation of the bound molecule and the position of the carboxylate group are variable. Indeed, in these FA binding proteins, the polar side chains of Q138 in domain 1 of Mp1 from T. marneffei (5E7X), of N105 and S165 in domain 1 of Mp1 from T. marneffei (5ECF chain B), and of S136 and S140 in domain 2 of Mp1 from T. marneffei (5FB7) form hydrogen bonds with the carboxylate group of either palmitic or arachidonic acid (Table 5).

It is therefore not possible to infer conserved positions for carboxylate binding in the UPs. Nevertheless, UP1 to UP4 should all display a putative binding site on the inner faces of the helices. The models display apolar residues there, which are particularly suited to binding alkyl moieties (Fig. 8). Among them, L61, F101, F127, V131, W161 and F182 are strictly conserved in the four proteins (UP1 mature protein numbering). Another strictly conserved residue, Y116, could be a hydrogen-bonding partner of the FA carboxylate. Additional minimization to relax the structures, followed by docking of FAs into UP1-4 with either one or two molecules bound, could be envisaged to profile FA binding capacity and specificity, but this is beyond the scope of this paper.

In this study, we report the identification and preliminary characterization of a previously undescribed gene family of four members. The genes were identified during data mining and analysis of the Y. lipolytica secretory proteome. The newly identified proteins are highly similar to each other in primary structure; hence they are also strongly suspected to be related in function. No biological process, cellular compartment or molecular function had been assigned to any of these proteins. The only indication was the high-confidence prediction of signal peptides at the N-termini of the polypeptide chains, suggesting that all the proteins are secreted to the extracellular space. When the entire amino acid sequences of the four proteins (assigned the working names UP1 to UP4) were used in BLAST searches, no significant homologs could be identified beyond this group of proteins in different Y. lipolytica strains, with no hit outside this group/species. Likewise, the identification of conserved motifs and the subsequent screening of databases for similar patterns, which were expected to increase the probability of finding structural homologs, were unsuccessful. Based on these analyses, we concluded that the newly identified four-member protein family is unique to the Y. lipolytica species.

To reveal the biological function of proteins UP1-4, we generated a series of Y. lipolytica strains individually overexpressing the UP-encoding genes, including in a quadruple deletant (Q4) background. The genes were cloned under a strong inducible promoter to ensure high-level synthesis of the UPs and their unambiguous identification within the secretomes of the modified strains (Fig. 2 and Table 4). Because of the constitutive basal synthesis of the UPs, only overexpression strains constructed in the Q4 background were considered relevant for functional studies of the UPs. As none of the proteins deposited in the databases exhibited similarity to the UPs, it was not possible to make any supported assumption about their putative function. Considering the biology, physiology and genomic structure of Y. lipolytica, in particular i) its high predisposition to thrive in the presence of hydrophobic substrates (HS; lipids, triglycerides, fatty acids) and ii) the expansion of multi-gene families involved in HS utilization, we hypothesized that the UP proteins might be involved in HS utilization. We therefore tested the constructed strains (Fig. 3) for growth on different fatty acid methyl esters. The results of the drop tests (Fig. 4) suggested that the UPs might indeed be involved in the assimilation of HS. Specifically, we observed that the Q4 strain was unable to grow in a minimal medium with methyl esters of C10 to C14. No such effect was seen when longer FAs were used. Strikingly, overexpression of UP3 alleviated the Q4 growth limitation on mC10, mC12 and mC14. Overexpression of UP2 and UP4 relieved the observed growth limitation on mC12 and mC14.

Interpretation of the observed growth pattern was straightforward for mC12 and mC14. Complementation of the Q4 phenotype by overexpression of one of the missing UPs restored the initially disrupted biological process of FA fixation and internalization. Hence, assimilable, non-toxic FAs could be metabolized by the growing cells. On the other hand, it has long been known that short-chain FAs, including C8 and C10, are not assimilated by transport systems in Y. lipolytica [29], and C8 is highly toxic to the cells [30]. Hence, to gain more insight into why the Q4-UP3 strain grew so efficiently on mC10-containing plates, we conducted a growth kinetics analysis in the presence of the toxic FA (Fig. 5). We observed that the quadruple deletant was not subject to growth inhibition by C8 at 0.1–0.2%, as it grew better than the control strains with a basal level of UP synthesis. Complementation with the UPs increased the sensitivity of the strains. This was particularly visible at 0.1% C8 for UP3 and UP4, and at 0.2% C8 for UP1 and UP4 overexpression, suggesting their specificity towards this FA. Notably, the differentiation between sensitive and resistant strains was concentration-dependent, as all the strains, irrespective of the constructed genotype, were sensitive to C8 at > 0.3%. On the other hand, the drop test was conducted with FAs (C10 – C18) at a concentration of 0.4%, which severely limited the growth of Q4 and all its derivatives except Q4-UP3, which grew very well. Such observations are consistent with the previous report on FA internalization in Y. lipolytica [29], which clearly stated that the system’s operation is concentration-dependent and exhibits specificity towards the length of the aliphatic chain. Hence, it seems that UP3 enhances the toxicity of C8 at > 0.1% through its enhanced delivery to the plasma membrane surface and the subsequent flip-flop transport, whereas by binding C10 at 0.4% it reduces its availability in the medium.
This hypothesis could be supported by previous studies showing that deletion of the ABC1 gene, which is involved in alkane export, abolished growth on C10 alkane, as it enhanced the toxic effect exerted by the compound [22]. Growth on mC10 reflects an equilibrium between: i) hydrolysis of the methyl ester (mC10) to the free FA (C10) by external lipases/esterases, ii) the fraction of the free FA trapped by the UPs, and iii) the transported fraction going through, for example, the flip-flop/transporter pathways. Consequently, if the concentration of free FA is too high (excessive activity of the extracellular lipases/esterases), growth will be inhibited, as shown previously in a comparative analysis of growth and lipase production on mC10 by strains from the Yarrowia clade [8]. Conversely, low enzyme activity limits the liberation of the toxic free FA; hence cells can grow and retain the capacity to metabolize the FA through the beta-oxidation pathway.

The phenotypes of wild-type and mutant strains on both mC10 and C8 strongly support that the UP genes are involved in fatty acid utilization.

The sequence similarity between the UPs and other proteins was too low (between 7 and 11% sequence identity) to identify structural analogs from sequence alone; however, the structures predicted by AlphaFold were consistent with each other and revealed clear structural homology with FA binding proteins (Fig. 7). The UP structures are highly similar to each other, as expected from their primary structure similarity. Sequence conservation projected onto the predicted structures of the UPs indicates that the part of the barrel domain located near the N-terminal end has a higher fraction of conserved side chains, a feature that extends to the N-terminal extremity. This may indicate a functional role and could suggest a possible interaction surface with as yet unidentified partners. The N-terminal part of the sequence was the most difficult to predict. It forms an extension to the helical domain that is unique and of unknown function. The core sequence is modeled as a helical barrel with a hydrophilic surface and a hydrophobic internal pocket that could bind FAs. Most of the residues that shape the inner side are also strictly conserved within UP1-4, which could ensure a conserved binding mode. In line with this, the ligand binding domains of protein Mp1, detected as structurally similar to the UPs, are known to act as a virulence factor, as they trap the proinflammatory lipid mediator arachidonic acid with high affinity and consequently alter the host response to infection [40, 41]. In Y. lipolytica, which is a non-pathogenic yeast species [42], such a biological process seems irrelevant. Nevertheless, the molecular function of aliphatic chain recognition appears both useful and relevant to biological processes such as FA transport or internalization. This discovery is particularly striking in the metabolic context of Y. lipolytica, especially because no such molecules have been described previously.
Also, redundancy among the four closely similar UPs cannot be ruled out; their coexistence points to both an expected fine-tuning of specificity towards distinct FAs and the possible importance of these proteins in the Y. lipolytica life cycle. Altogether, this discovery fills a substantial knowledge gap and needs to be deciphered more precisely in the near future. Based on these results, we concluded that the UPs are involved in the binding of free FAs in the medium and their delivery to the cell surface, according to the proposed model (Fig. 9).

Understanding how FAs enter cells is of great interest. Until now, no extracellular proteins able to bind fatty acids had been identified in Y. lipolytica. Here, we identify a new multi-gene family present only in Yarrowia lipolytica. Through binding to these secreted proteins, these hydrophobic compounds are solubilized, sequestered and transported into the cells.

Further research is needed to determine the binding specificity of these eFbps. For example, in vitro binding tests could reveal the specificity of these proteins towards different types of fatty acids, such as polyunsaturated fatty acids, hydroxylated fatty acids and fatty alcohols. A further challenge will be to identify the role of the N-terminal region, which seems to be “floppy”, in docking the protein either to the cell surface or to a transport channel, for an Fbp-facilitated flip-flop or an Fbp-facilitated channel transport mechanism, depending on fatty acid chain length.

The phenotypes of the eFbp mutants and of the eFbp-overexpressing strains on HS strongly support that proteins UP1 to UP4 are FA binding proteins. Structure prediction using a deep learning procedure, combined with systematic structure comparison, further supports that this unique protein family is involved in the binding, solubilization and sequestration of fatty acids, and probably in their transport into the cells. It is expected that the proteins may be relevant in lipid biotechnology, for example for improving fatty acid secretion/production by yeast and reducing toxicity in strains secreting short-chain fatty acids. These proteins could also be relevant for improving the bioconversion of fatty acids, such as the bioconversion of linoleic acid into conjugated linoleic acid.

Strains and cloning strategy

The first set of yeast strains used in this study was constructed in the background of the Y. lipolytica JMY7126 host strain, developed previously [34, 35]. This strain is unable to utilize erythritol (ERY; Δeyk1), which, in combination with an ERY-inducible promoter, makes it an efficient host for inducible overproduction of cloned genes. Routine cultivation was conducted according to standard protocols [3]. The second set of yeast strains was constructed by successive gene deletions, resulting in the quadruple gene deletant JMY8761 (Q4). The Q4 strain was used as a background for the construction of strains overexpressing the UPs individually, without interference from the native UPs. All the strains used in this study are listed in Table 3.

Vector construction and subcloning were conducted in the Escherichia coli DH5α strain, which was routinely maintained according to standard protocols [43]. All oligonucleotides used for cloning are listed in Additional Table 1. Plasmids are listed in Table 3.

Cloning procedures followed standard molecular biology protocols [43]. Restriction enzymes and T4/T7 DNA ligases were obtained from New England Biolabs (MA, USA). PCR amplifications were performed in an Applied Biosystems 2720 thermal cycler with GoTaq DNA polymerase (Promega, WI, USA) or Q5 High-Fidelity DNA Polymerase (New England Biolabs). PCR fragments were purified with a QIAgen Purification kit (Qiagen, Hilden, Germany) and plasmid DNA was isolated with a QIAprep Spin Miniprep kit (Qiagen). The four target genes were amplified from a Y. lipolytica genomic DNA template and cloned into the JMP4230 vector using BamHI/AvrII restriction digestion. The destination vector JMP4230 is a variant of the JMP62 shuttle vector series [44], bearing the strong ERY-inducible promoter pHU8EYK1 [45], the tLip2 terminator, the excisable URA3ex auxotrophy selection marker, and zeta integration sites flanking the expression cassette (Additional Fig. 1). Gene expression cassettes were obtained by NotI digestion of the corresponding plasmids and used to transform Y. lipolytica strains by the lithium acetate method, as described previously [3]. Two positive sub-clones bearing each of the four JMP62-based overexpression cassettes were stored for further studies.

The quadruple mutant (Q4), in which all four genes of interest are deleted, was generated using the CRISPR-Cas9 system as described previously [36]. Correct gene deletion was verified by colony PCR, identifying the strains that contained the expected deletion between the two guides. The expected PCR products were 1128/615, 1381/939, 999/4643 and 1005/457 bp (fragment size in the WT/fragment size in the deletant) for UP1 to UP4, respectively. Correct genomic integration of the JMP62-based overexpression cassettes and correct gene disruption were verified by PCR and sequencing.
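The colony PCR readout can be expressed as a simple lookup. The sketch below is only an illustration of the logic: the expected sizes are copied as printed in the text, whereas the function name, the 10% size tolerance and the three output labels are assumptions made for the example.

```python
# Expected amplicon sizes in bp, copied as printed in the text:
# gene -> (size in the WT, size in the deletant)
EXPECTED_BP = {
    "UP1": (1128, 615),
    "UP2": (1381, 939),
    "UP3": (999, 4643),
    "UP4": (1005, 457),
}

def call_genotype(gene, observed_bp, tolerance=0.10):
    """Classify a colony PCR band as 'WT', 'deletant' or 'unclear'.

    A band matches an expected size when it lies within the given
    relative tolerance (an assumed value; gel sizing is approximate)."""
    wt_bp, deletant_bp = EXPECTED_BP[gene]
    matches_wt = abs(observed_bp - wt_bp) <= tolerance * wt_bp
    matches_deletant = abs(observed_bp - deletant_bp) <= tolerance * deletant_bp
    if matches_wt and not matches_deletant:
        return "WT"
    if matches_deletant and not matches_wt:
        return "deletant"
    return "unclear"

print(call_genotype("UP1", 600))   # band close to the 615 bp deletant product
print(call_genotype("UP4", 1000))  # band close to the 1005 bp WT product
```

A band that matches neither expected size is reported as unclear rather than forced into one class, which mirrors the additional verification by PCR and sequencing mentioned in the text.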

Overexpression of UP proteins

To investigate the expression level of the four target proteins, the overexpressing strains were grown in liquid batch cultures in YNBD2 medium (non-induced) and in YNBD2E medium (erythritol-induced) at 28°C and 150 rpm for 48 h. The minimal YNB medium contained 0.17% (w/v) yeast nitrogen base (without amino acids and ammonium sulfate, YNBww), 0.5% (w/v) NH4Cl and 0.2% (w/v) glucose, and was supplemented with 0.5% (w/v) erythritol for induction. Media were buffered with 50 mM phosphate buffer at pH 6.8. Samples were collected at 48 h, biomass was separated by centrifugation (5000 g for 5 min), and the supernatants were used for further analyses.
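For bench use, the percentages above convert directly into masses, since 1% (w/v) corresponds to 10 g/L. The following minimal sketch computes the amounts needed for a given volume; the dictionary keys and function names are illustrative, and the phosphate buffer is left out because its preparation is not detailed in the text.

```python
# Composition of the minimal medium in % (w/v), as given in the text.
YNBD2E_PERCENT_WV = {
    "yeast nitrogen base (YNBww)": 0.17,
    "NH4Cl": 0.5,
    "glucose": 0.2,
    "erythritol": 0.5,  # added only for induction (YNBD2E)
}

def percent_wv_to_g_per_l(percent):
    """% (w/v) is grams per 100 mL, hence a factor of 10 to obtain g/L."""
    return percent * 10.0

def grams_needed(recipe, volume_l, induced=True):
    """Mass of each component (g) for the requested volume in litres."""
    amounts = {}
    for name, percent in recipe.items():
        if name == "erythritol" and not induced:
            continue  # YNBD2 (non-induced) contains no erythritol
        amounts[name] = round(percent_wv_to_g_per_l(percent) * volume_l, 3)
    return amounts

print(grams_needed(YNBD2E_PERCENT_WV, 1.0))                 # induced, 1 L
print(grams_needed(YNBD2E_PERCENT_WV, 0.5, induced=False))  # non-induced, 0.5 L
```

The resulting 2 g/L glucose and 5 g/L erythritol agree with the concentrations quoted for the microplate cultures.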

Drop tests on agar plates

Precultures were grown overnight (180 rpm, 28°C) in YPD. Cells were centrifuged, washed with YNB and resuspended to an OD600 of 1. Successive 10-fold dilutions were performed (10⁰ to 10⁻⁴), and 10 µl of each dilution was spotted onto YNB plates containing the fatty acids and lipids as indicated. The following substrates were used in our study: mC10, methyl decanoate (SAFC, 99%); mC12, methyl laurate (Sigma Aldrich, 98%); mC14, methyl myristate (SAFC, 98%); mC16, methyl palmitate (SAFC, 97%); tributyrin (ACROS, 98%); triolein (Fluka, 65%); C8, octanoic acid (Aldrich, 98%). The minimal YNB medium contained 0.17% (w/v) yeast nitrogen base (without amino acids and ammonium sulfate, YNBww), 0.5% (w/v) NH4Cl and 0.2% (w/v) glucose, and was supplemented with 0.5% (w/v) erythritol for induction. To complement strain auxotrophies, uracil (0.1 g/L) and lysine (0.2 g/L) were added as required. Media were buffered with 50 mM phosphate buffer at pH 6.8. Stock solutions of methyl FAs and lipids were sonicated three times for 1 min in the presence of Tween 40 (Sigma) and used at a final concentration of 0.4%. Solid media were prepared by adding 1.6% agar. The plates were incubated at 28°C. Pictures were taken every 24 h.

Growth in microplates

Overnight precultures in YPD were centrifuged and washed with YNB. Cell suspensions were standardized to an OD600 of 0.1. Yeast strains were grown in 96-well plates in 200 µL of minimal YNB medium containing glucose (2 g/L) and different concentrations of FAs. Media were supplemented with erythritol (5 g/L) under induction conditions, and with ethanol (8 g/L) for the control without fatty acid addition. An ethanol solution of octanoic acid (C8) was added to the medium to give a final concentration of 0.1–0.2% octanoic acid. Cultures were maintained at 28°C under constant agitation in a Biotek Synergy MX microplate reader (Biotek Instruments, Colmar, France). Growth was followed by measuring the optical density of the cultures at 600 nm every 30 min for 72 h.
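One common way to compare such growth curves between strains is to estimate the maximum specific growth rate from the steepest stretch of ln(OD600) against time. The sketch below illustrates that calculation on a synthetic curve sampled every 30 min for 72 h; it is not the analysis used in this study, and the window length, the logistic demo curve and its parameters are assumptions. Blank subtraction is also assumed to have been done beforehand.

```python
import math

def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def max_specific_growth_rate(times_h, od600, window=8):
    """Largest slope of ln(OD600) vs time (per hour) over a sliding window.

    window=8 points spans 3.5 h at one read every 30 min (assumed choice)."""
    if len(times_h) != len(od600) or len(od600) < window:
        raise ValueError("need equally long series with at least `window` points")
    log_od = [math.log(v) for v in od600]
    best = float("-inf")
    for i in range(len(od600) - window + 1):
        best = max(best, _slope(times_h[i:i + window], log_od[i:i + window]))
    return best

# Synthetic logistic curve (assumed parameters): starts at OD600 0.1,
# plateaus at 2.0, intrinsic rate 0.3 per hour.
times = [0.5 * i for i in range(145)]  # one read every 30 min for 72 h
od = [2.0 / (1.0 + 19.0 * math.exp(-0.3 * t)) for t in times]
print("max specific growth rate (1/h):", round(max_specific_growth_rate(times, od), 3))
```

Applied per well, the same function would give one rate per strain and C8 concentration, which makes the sensitive and resistant phenotypes directly comparable.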

Electrophoresis SDS-PAGE and identification of polypeptides via mass spectrometry

Samples withdrawn from the shake flask cultures, run either with or without erythritol induction, were concentrated 10-fold on Amicon Ultracel-10 kDa units (Millipore, Molsheim, France) and subjected to SDS-PAGE according to a standard methodology [43, 46]. The concentrated supernatants were denatured by boiling for 5 min in Laemmli buffer and resolved on Novex 4–12% gradient SDS-PAGE gels in Tris-glycine buffer (Life Technologies). The molecular mass (MM) standard was SeeBlue® Plus2 Pre-Stained Standard (Thermo Fisher Scientific, Villebon sur Yvette, France), containing standard proteins ranging from 3 to 198 kDa.

Bands of approximately the expected size were excised from the gels. The bands were washed twice with 30 µL of 50% acetonitrile (ACN)/50 mM ammonium bicarbonate (NH₄HCO₃), followed by dehydration with 30 µL of ACN. Disulfide bonds were reduced with 100 mM dithiothreitol (DTT, Sigma) for 30 min at 56°C. Cysteine residues were alkylated with 50 mM iodoacetamide by incubation for 45 min in the dark at room temperature. Digestion was conducted overnight at 37°C with 10 ng of trypsin (Promega). Peptides were extracted with 30 µL of 40% ACN/0.1% trifluoroacetic acid (TFA), followed by treatment with 30 µL of ACN. Samples were vacuum-dried (SpeedVac, Savant™ SPD121D, Thermo Fisher Scientific), then resuspended in 20 µL of loading buffer (2% ACN/0.1% TFA) and injected for LC-MS/MS.

Mass spectrometry was performed on the PAPPSO platform (MICALIS, INRA, Jouy-en-Josas, France; http://pappso.inrae.fr/), using an Orbitrap Discovery (Thermo Fisher Scientific) coupled to an UltiMate™ 3000 RSLCnano System (Thermo Fisher Scientific, San José, USA). A 4 µL aliquot of treated sample was loaded at 20 µL min⁻¹ on a precolumn (µ-Precolumn, 300 µm i.d. x 5 mm, C18 PepMap100, 5 µm, 100 Å, Thermo Fisher Scientific). After 3 min, the precolumn cartridge was connected to the separating column Acclaim PepMap RSLC nanoViper (C18, 3 µm particle size, 500 mm length, 75 µm i.d., Thermo Fisher Scientific). Loading buffer: 2% ACN/0.1% TFA; resolution buffer A: 0.1% formic acid/98% H₂O; resolution buffer B: 0.1% formic acid/80% ACN. The runs were executed at 300 nL min⁻¹ with a linear gradient from 0 to 35% buffer B over 25 min, followed by regeneration (98% buffer B). One run took 54 min. Data-dependent acquisition in Top 8 mode was performed with CID fragmentation.

MS data analyses. The four protein sequences of Y. lipolytica were added to a bovine database and a contaminant database (keratins). Protein identification was performed as described previously, using X!TandemPipeline version 0.2.10, run with a precursor mass tolerance of 10 ppm and a fragment mass tolerance of 0.5 Da [47]. Enzymatic cleavage rules were set to trypsin digestion (after Arg and Lys, unless Pro follows directly), and no semi-enzymatic cleavage was allowed. Cysteine carbamidomethylation was set as a fixed modification, and methionine oxidation was considered as a potential modification. The identified proteins were filtered as follows: 1) a peptide E-value < 10⁻² with a minimum of 2 peptides per protein and 2) a protein E-value < 10⁻⁴.
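The cleavage rule and the two-step filter stated above can be written out explicitly. The sketch below is a plain-Python illustration and does not reproduce X!TandemPipeline: the function names and the example sequence are hypothetical, missed cleavages are ignored, and only the thresholds are taken from the text.

```python
def trypsin_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides

def passes_filter(protein_evalue, peptide_evalues,
                  max_peptide_evalue=1e-2, min_peptides=2,
                  max_protein_evalue=1e-4):
    """Apply the stated criteria: at least two peptides with E-value < 1e-2
    and a protein E-value < 1e-4."""
    good_peptides = [e for e in peptide_evalues if e < max_peptide_evalue]
    return (len(good_peptides) >= min_peptides
            and protein_evalue < max_protein_evalue)

# Hypothetical sequence: the K in "KP" is protected, the R and final K are cleaved.
print(trypsin_digest("AAKPGGRTTK"))
print(passes_filter(1e-6, [1e-3, 5e-4, 0.5]))
```

The second call passes because two of the three peptide E-values fall below the 10⁻² threshold and the protein E-value is below 10⁻⁴.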

Bioinformatics

Blast – database search for similar sequences, Signal Peptide prediction

First, the FASTA sequences of UP1, UP2, UP3 and UP4 of Y. lipolytica strain CLIB122 were retrieved from GRYC (http://gryc.inra.fr/) and analyzed using the Phobius tool from EBI, which predicts transmembrane topology (https://www.ebi.ac.uk/Tools/pfa/phobius/). Secretory potential, as well as the primary and secondary structure of the signal peptides (SPs), was predicted using the SignalP [48], TargetP [49] and PrediSi [50] tools.

Then, the sequences were aligned using the ClustalW algorithm with default parameters [51]. The following motifs were identified as conserved: AAP[TS], GT[KR]FD[KQ]AVY[EA]F[IL][VI]NSGx[KS]DFL, GFLDFSGY, [IF]LxSPLL, [TS]GF[DN]RA, W[IL]FGxKQTVQ, APV[FY][TS]LAPxxFA; residues in brackets are allowed substitutions and x stands for any residue. The UniProt database was screened with the EMBOSS fuzzpro tool to search extensively for sequences containing a combination of those motifs, and hence to identify putative homologs. In parallel, to obtain functional information, the protein sequences were scanned against the PROSITE collection of motifs using the ExPASy/PROSITE web server (https://prosite.expasy.org/scanprosite/).
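The motif notation used above maps almost directly onto regular expressions: bracketed alternatives keep their meaning, and the lowercase x (any residue) becomes a dot. The sketch below uses Python's `re` module as a stand-in for a pattern search; it is not the fuzzpro tool, and the function names and the demo sequence are made up for the example.

```python
import re

# Conserved motifs as listed in the text (x = any residue).
MOTIFS = [
    "AAP[TS]",
    "GT[KR]FD[KQ]AVY[EA]F[IL][VI]NSGx[KS]DFL",
    "GFLDFSGY",
    "[IF]LxSPLL",
    "[TS]GF[DN]RA",
    "W[IL]FGxKQTVQ",
    "APV[FY][TS]LAPxxFA",
]

def motif_to_regex(motif):
    """Convert the motif notation to a compiled regular expression.

    Residues are uppercase, so the lowercase wildcard x can be replaced safely."""
    return re.compile(motif.replace("x", "."))

def find_motifs(sequence, motifs=MOTIFS):
    """Return {motif: 1-based position of the first match} for motifs found."""
    hits = {}
    for motif in motifs:
        match = motif_to_regex(motif).search(sequence)
        if match:
            hits[motif] = match.start() + 1
    return hits

# Made-up test sequence containing three of the seven motifs.
demo_sequence = "MKAAPTQQGFLDFSGYWLFGAKQTVQ"
print(find_motifs(demo_sequence))
```

Requiring several motifs to co-occur in one sequence, as done in the database screen, would amount to checking that `find_motifs` returns all the chosen motifs for that sequence.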

Blast and 3D structure analysis.

HHpred was used to search for structural homologs in the protein database [52]. Since no structural homologs could be retrieved from the PDB at this stage, we proceeded with AlphaFold structure prediction, which is based on artificial intelligence [38]. The sequence of YALI0D03245g1 (UP1) was submitted to AlphaFold2 using the ColabFold server [53]. The 3D structure predicted by AlphaFold was then compared to all known protein structures using DALI [39].

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Availability of data and materials

All data generated or analysed during this study are included in this published article and its additional information files.

Competing interests

The authors declare no conflict of interest.

Funding

Jean-Marc Nicaud received a grant from the project YaLiOl, supported by grant "ANR-20-CE43-0007" of the French National Research Agency (ANR). Jean-Marc Nicaud and Lea Vidal received grants from the project Val2O, supported by UPSaclay in the frame of the Investments for the Future managed by the Agence Nationale de la Recherche under the “Investissements d’avenir” program with the reference Poc in Labs 2020-1011. This study was also supported in part by a subvention (506.771.09.00 B.) from the Ministry of Education and Science in Poland to Poznan University of Life Sciences, awarded to Ewelina Celinska.

Authors' contributions

J.-M.N. and D.O. defined the concept of the study; J.-M.N. and E.C. acquired the funding; D.O. and J.-M.N. supervised the experimental work; L.V. and S.T. constructed the strains and plasmids; S.T., P.K. and E.C. performed protein production; D.O. and J.-M.N. performed fatty acid tests; C.H. performed the proteomic analysis; V.M. and G.A. performed the protein analysis; G.A. and P.M. analyzed the protein structure; J.-M.N. and E.C. wrote the original draft; D.O., C.H., G.A., P.M., E.C. and J.-M.N. revised the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We thank Esteban Lebrun for designing the sgRNA target sequence for the MHY1 CRISPR-Cas9 vector and Camilla Pires de Souza for helpful comments and C8 data analysis.

Beopoulos A, Cescut J, Haddouche R, Uribelarrea JL, Molina-Jouve C, Nicaud JM. Yarrowia lipolytica as a model for bio-oil production. Prog Lipid Res. 2009;48:375–87.
Beopoulos A, Chardot T, Nicaud JM. Yarrowia lipolytica: A model and a tool to understand the mechanisms implicated in lipid accumulation. Biochimie. 2009;91:692–6.
Barth G, Gaillardin C. Yarrowia lipolytica. In: Wolf K, editor. Nonconv Yeasts Biotechnol A Handb. Springer Berlin Heidelberg; 1996. p. 313–88.
Thevenieau F, Beopoulos A, Desfougeres T, Sabirova J, Albertin K, Zinjarde S, et al. Handbook of Hydrocarbon and Lipid Microbiology. Handb Hydrocarb Lipid Microbiol. 2010. p. 1514–27.
Amaral PFF, da Silva JM, Lehocky M, Barros-Timmons AMV, Coelho MAZ, Marrucho IM, et al. Production and characterization of a bioemulsifier from Yarrowia lipolytica. Process Biochem. 2006;41:1894–8.
Cirigliano MC, Carman GM. Purification and characterization of liposan, a bioemulsifier from Candida lipolytica. Appl Environ Microbiol. 1985;50:846–50.
Fickers P, Marty A, Nicaud JM. The lipases from Yarrowia lipolytica: Genetics, production, regulation, biochemical characterization and biotechnological applications. Biotechnol Adv [Internet]. Elsevier Inc.; 2011;29:632–44. Available from: http://dx.doi.org/10.1016/j.biotechadv.2011.04.005
Michely S, Gaillardin C, Nicaud JM, Neuvéglise C. Comparative Physiology of Oleaginous Species from the Yarrowia Clade. PLoS One. 2013;8:1–10.
Pignede G, Wang H, Fudalej F, Gaillardin C, Seman M, Nicaud J-M. Characterization of all extracellular lipase encoded by LIP2 in Yarrowia lipolytica. J Bacteriol. 2000;182:2802–10.
Fickers P, Fudalej F, Le Dall MT, Casaregola S, Gaillardin C, Thonart P, et al. Identification and characterisation of LIP7 and LIP8 genes encoding two extracellular triacylglycerol lipases in the yeast Yarrowia lipolytica. Fungal Genet Biol. Academic Press Inc.; 2005;42:264–74.
Fickers P, Benetti PH, Waché Y, Marty A, Mauersberger S, Smit MS, et al. Hydrophobic substrate utilisation by the yeast Yarrowia lipolytica, and its potential applications. FEMS Yeast Res. Elsevier; 2005. p. 527–43.
Meunchan M, Michely S, Devillers H, Nicaud JM, Marty A, Neuvéglise C. Comprehensive analysis of a yeast lipase family in the Yarrowia clade. PLoS One [Internet]. 2015;10:1–22. Available from: http://dx.doi.org/10.1371/journal.pone.0143096
Mlíčková K, Roux E, Athenstaedt K, D’Andrea S, Daum G, Chardot T, et al. Lipid accumulation, lipid body formation, and acyl coenzyme A oxidases of the yeast Yarrowia lipolytica. Appl Environ Microbiol. 2004;70:3918–24.
Lasserre JP, Nicaud JM, Pagot Y, Joubert-Caron R, Caron M, Hardouin J. First complexomic study of alkane-binding protein complexes in the yeast Yarrowia lipolytica. Talanta. Elsevier; 2010;80:1576–85.
Thevenieau F, Le Dall MT, Nthangeni B, Mauersberger S, Marchal R, Nicaud JM. Characterization of Yarrowia lipolytica mutants affected in hydrophobic substrate utilization. Fungal Genet Biol. 2007;44:531–42.
Dulermo R, Gamboa-Meléndez H, Ledesma-Amaro R, Thévenieau F, Nicaud JM. Unraveling fatty acid transport and activation mechanisms in Yarrowia lipolytica. Biochim Biophys Acta - Mol Cell Biol Lipids [Internet]. Elsevier B.V.; 2015;1851:1202–17. Available from: http://dx.doi.org/10.1016/j.bbalip.2015.04.004
Michely S. Dynamique des génomes et évolution du métabolisme lipidique chez les levures du clade Yarrowia. 2014.
Wang H, Le Clainche A, Le Dall MT, Wache Y, Pagot Y, Belin JM, et al. Cloning and characterization of the peroxisomal acyl CoA oxidase ACO3 gene from the alkane-utilizing yeast Yarrowia lipolytica. Yeast. England; 1998;14:1373–86.
Wang H, Le Dall MT, Waché Y, Laroche C, Belin JM, Nicaud JM. Cloning, sequencing, and characterization of five genes coding for acyl-CoA oxidase isozymes in the yeast Yarrowia lipolytica. Cell Biochem Biophys. United States; 1999;31:165–74.
Luo YS, Nicaud JM, Van Veldhoven PP, Chardot T. The acyl-CoA oxidases from the yeast Yarrowia lipolytica: characterization of Aox2p. Arch Biochem Biophys. United States; 2002;407:32–8.
Luo YS, Wang HJ, Gopalan K V, Srivastava DK, Nicaud JM, Chardot T. Purification and characterization of the recombinant form of Acyl CoA oxidase 3 from the yeast Yarrowia lipolytica. Arch Biochem Biophys. United States; 2000;384:1–8.
Thevenieau F. Ingénierie métabolique de la levure Yarrowia lipolytica pour la production d’acides dicarboxyliques à partir d’huiles végétales. 2006.
Iida T, Sumita T, Ohta A, Takagi M. The cytochrome P450ALK multigene family of an n-alkane-assimilating yeast, Yarrowia lipolytica: cloning and characterization of genes coding for new CYP52 family members. Yeast. England; 2000;16:1077–87.
Iida T, Ohta A, Takagi M. Cloning and characterization of an n-alkane-inducible cytochrome P450 gene essential for n-decane assimilation by Yarrowia lipolytica. Yeast. England; 1998;14:1387–97.
Endoh-Yamagami S, Hirakawa K, Morioka D, Fukuda R, Ohta A. Basic helix-loop-helix transcription factor heterocomplex of Yas1p and Yas2p regulates cytochrome P450 expression in response to alkanes in the yeast Yarrowia lipolytica. Eukaryot Cell. 2007;6:734–43.
Hirakawa K, Kobayashi S, Inoue T, Endoh-Yamagami S, Fukuda R, Ohta A. Yas3p, an opi1 family transcription factor, regulates Cytochrome P450 expression in response to n-alkanes in Yarrowia lipolytica. J Biol Chem. 2009;284:7126–37.
Yamagami S, Morioka D, Fukuda R, Ohta A. A basic helix-loop-helix transcription factor essential for cytochrome P450 induction in response to alkanes in yeast Yarrowia lipolytica. J Biol Chem [Internet]. 2004;279:22183–9. Available from: http://dx.doi.org/10.1074/jbc.M313313200
Dulermo R, Gamboa-Meléndez H, Ledesma-Amaro R, Thevenieau F, Nicaud J-M. Yarrowia lipolytica AAL genes are involved in peroxisomal fatty acid activation. Biochim Biophys Acta. Netherlands; 2016;1861:555–65.
Kohlwein SD, Paltauf F. Uptake of fatty acids by the yeasts, Saccharomyces uvarum and Saccharomyces lipolytica. Biochim Biophys Acta. 1983;792:310–7.
Park Y-K, Nicaud J-M. Screening a genomic library for genes involved in propionate tolerance in Yarrowia lipolytica. Yeast [Internet]. England; 2019;yea.3431. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/yea.3431
Chen B, Ling H, Chang MW. Transporter engineering for improved tolerance against alkane biofuels in Saccharomyces cerevisiae. Biotechnol Biofuels. 2013;6:1.
Mauersberger S, Nicaud J-M. Tagging of Genes by Insertional Mutagenesis in the Yeast Yarrowia lipolytica. In: Wolf K, Breunig K, Barth G, editors. Non-Conventional Yeasts Genet Biochem Biotechnol. Springer-Verlag Berlin Heidelberg; 2003. p. 3098–107.
Mauersberger S, Wang HJ, Gaillardin C, Barth G, Nicaud J-M. Insertional Mutagenesis in the n-Alkane-Assimilating Yeast Yarrowia lipolytica: Generation of Tagged Mutations in Genes Involved in Hydrophobic Substrate Utilization. J Bacteriol. 2001;183:5102–9.
Trassaert M, Vandermies M, Carly F, Denies O, Thomas S, Fickers P, et al. New inducible promoter for gene expression and synthetic biology in Yarrowia lipolytica. Microb Cell Fact. BioMed Central Ltd.; 2017;16.
Park YK, Vandermies M, Soudier P, Telek S, Thomas S, Nicaud JM, et al. Efficient expression vectors and host strain for the production of recombinant proteins by Yarrowia lipolytica in process conditions. Microb Cell Fact [Internet]. BioMed Central; 2019;18:167. Available from: https://doi.org/10.1186/s12934-019-1218-6
Larroude M, Trabelsi H, Nicaud JM, Rossignol T. A set of Yarrowia lipolytica CRISPR/Cas9 vectors for exploiting wild-type strain diversity. Biotechnol Lett [Internet]. Springer Netherlands; 2020;42:773–85. Available from: https://doi.org/10.1007/s10529-020-02805-4
Konzock O, Norbeck J. Deletion of MHY1 abolishes hyphae formation in Yarrowia lipolytica without negative effects on stress tolerance. PLoS One [Internet]. 2020;15:1–11. Available from: http://dx.doi.org/10.1371/journal.pone.0231161
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–9.
Holm L. Using Dali for Protein Structure Comparison. Methods Mol Biol. United States; 2020;2112:29–42.
Sze K-H, Lam W-H, Zhang H, Ke Y-H, Tse M-K, Woo PCY, et al. Talaromyces marneffei Mp1p Is a Virulence Factor that Binds and Sequesters a Key Proinflammatory Lipid to Dampen Host Innate Immune Response. Cell Chem Biol [Internet]. United States: Elsevier Ltd.; 2017;24:182–94. Available from: http://dx.doi.org/10.1016/j.chembiol.2016.12.014
Lam W-H, Sze K-H, Ke Y, Tse M-K, Zhang H, Woo P, et al. Talaromyces marneffei Mp1 Protein, a Novel Virulence Factor, Carries Two Arachidonic Acid-Binding Domains To Suppress Inflammatory Responses in Hosts. Infect Immun. 2019;87:1–17.
Groenewald M, Boekhout T, Neuvéglise C, Gaillardin C, Van Dijck PWM, Wyss M. Yarrowia lipolytica: Safety assessment of an oleaginous yeast with a great industrial potential. Crit Rev Microbiol. 2014;40:187–206.
Sambrook J, Russell D. Molecular Cloning: A Laboratory Manual. 3rd ed. Cold Spring Harbor Laboratory Press, New York; 2001.
Nicaud JM, Madzak C, Van Den Broek P, Gysler C, Duboc P, Niederberger P, et al. Protein expression and secretion in the yeast Yarrowia lipolytica. FEMS Yeast Res. Oxford University Press (OUP); 2002;2:371–9.
Park Y-K, Korpys P, Kubiak M, Celińska E, Soudier P, Trébulle P, et al. Engineering the architecture of erythritol-inducible promoters for regulated and enhanced gene expression in Yarrowia lipolytica. FEMS Yeast Res. England; 2019;19:1.
Laemmli UK. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature. England; 1970;227:680–5.
Langella O, Valot B, Balliau T, Blein-Nicolas M, Bonhomme L, Zivy M. X!TandemPipeline: A Tool to Manage Sequence Redundancy for Protein Inference and Phosphosite Identification. J Proteome Res. United States; 2017;16:494–503.
Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods [Internet]. 2011;8:785–6. Available from: https://doi.org/10.1038/nmeth.1701
Almagro Armenteros JJ, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson A, et al. Detecting sequence signals in targeting peptides using deep learning. Life Sci Alliance. 2019;2.
Hiller K, Grote A, Scheer M, Münch R, Jahn D. PrediSi: Prediction of signal peptides and their cleavage positions. Nucleic Acids Res. 2004;32.
Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–80.
Söding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005;33:W244-8.
Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold - Making protein folding accessible to all. bioRxiv [Internet]. Cold Spring Harbor Laboratory; 2022; Available from: https://www.biorxiv.org/content/early/2022/02/08/2021.08.15.456425

No competing interests reported.
