CAZyome and Sugar Transportome Unveil Singular Aspects of the Lignocellulolytic Enzyme System of Penicillium echinulatum 2HH

Background: The production of bioethanol from lignocellulosic feedstocks has proved to be a cutting-edge technology. However, this technology faces limitations such as the cost and yield of the enzymatic systems required to efficiently degrade the polysaccharides of plant cell walls. Penicillium echinulatum 2HH is a widely studied ascomycete, known for its efficient cellulolytic cocktails. One strategy to improve saccharification yields for commercial exploitation is the design of hypersecreting strains. However, molecular knowledge about the lignocellulolytic system of this specific fungus is scarce. Understanding both the lignocellulolytic and sugar uptake systems is essential to obtain industrial strains with adequate efficiency for bioethanol production.
Results: We report a comprehensive in silico characterization of the CAZymes and sugar transporters of P. echinulatum 2HH. The CAZyome reveals an outstanding repertoire of enzymes involved in plant biomass degradation. Among them, we highlight the cellulolytic enzyme system, whose genes are predominantly orthologous to those of P. oxalicum 114-2, demonstrating the high similarity of these phylogenetically related enzyme producers. We also report an LPMO-type enzyme of the AA16 family, described for the first time in these fungi. In addition to the well-known high activity of β-glucosidases, we found that the coding genes for the AA16, GH5-4 and GH45 families account for the main differences in the cellulolytic complexes of P. echinulatum 2HH and P. oxalicum 114-2 when both are compared to commercial producers. Our phylogenetic analysis of the sugar transportome suggests that the STs of P. echinulatum 2HH fall into eight major families with specificity for different groups of sugars. Finally, our phylogenetic analyses enabled the identification of several iBGLs and STs potentially involved in the accumulation of intracellular cellodextrins.
Conclusions: Overall, both the CAZyome and the sugar transportome of P. echinulatum revealed new insights into the mechanisms underlying a flexible and highly functional metabolism for degrading plant biomass. Peculiarities found in our study help to highlight the cellulolytic complex of P. echinulatum 2HH, contributing to the commercial ascendance of Penicillium spp. as cellulolytic enzyme producers. Furthermore, the first phylogenetic classification of STs and iBGLs sheds new light on the role of these genes regarding the preferred carbon source during fungal growth. Along these lines, these iBGLs and STs comprise valuable gene targets to understand the regulatory mechanisms underlying cellulolytic enzymes and to design hypersecreting strains with adequate efficiency for large-scale bioethanol production.

Second-generation bioethanol produced from lignocellulosic feedstock has proved to be a cutting-edge technology for biofuel production. The renewable sources used to generate bioethanol are particularly rich in lignocellulose. The high cost of bioethanol production from second-generation feedstocks results from the pretreatment and enzymatic hydrolysis processes required to convert this lignocellulosic biomass into monomeric sugars more rapidly and with greater yields [1]. Some Penicillium species stand out over existing enzyme producers, especially because they produce balanced cellulolytic cocktails rich in β-glucosidases. Consequently, these enzymatic mixtures improve enzymatic hydrolysis yields [2]. P. echinulatum is widely studied for biofuel production from lignocellulosic biomass, mainly agricultural residues such as sugarcane bagasse and elephant grass [3,4,5,6]. P. echinulatum has been studied for about 40 years, since the isolation of the 2HH wild-type. Long-term strain improvement resulted in the S1M29 mutant, which shows a substantial increase in cellulase titers and provides better hydrolysis of lignocellulosic biomass [7,8].
Filamentous fungi genome, secretome and transcriptome analyses are quite relevant for enzymatic cocktails design, particularly for biofuel production. The design of hypersecreting strains is essential to achieve adequate efficiency of enzyme mixtures, considering the established potential of P. echinulatum 2HH for the production of cellulolytic enzymes. In summary, it is necessary to understand the different strategies employed by P. echinulatum 2HH to degrade lignocellulosic biomass, enabling the enhancement of enzyme production. The ability of fungi to grow, to transport, and to ferment different types of sugars remains a major challenge for biofuels production from lignocellulosic biomass [9]. The plant lignocellulosic biomass is primarily made up of the glucohomopolysaccharide cellulose (20-50%, w/w), hemicellulose (15-35%, w/w) and lignin. The polysaccharides that constitute the hemicellulose include xylan, glucuronoxylan, xyloglucan, glucomannan and arabinoxylan backbones with heterogeneous side-chains. The use of monosaccharides that constitute plant biomass polymers implies their efficient hydrolysis, which is still a major technical challenge because of its recalcitrance and heterogeneity [10].
P. echinulatum 2HH was first isolated from the digestive tract of Coleoptera larvae (Anobium punctatum), commonly known as furniture beetles. As the name suggests, this insect feeds on wood and can reduce wooden objects to fine dust. Moreover, A. punctatum larvae normally live in coarse wood debris, that is, weakened and predigested fallen tree trunks, which allows the larvae to move through the cracks [11]. Given that lignin, cellulose and hemicellulose work together to provide a structural function in plants, and that lignin is responsible for stiffness and rigidity [12], the larval diet is probably composed mainly of cellulose and hemicellulose and may contain some lignin residues. In this sense, the first enzyme production experiments with the 2HH strain raised evolutionary speculations, suggesting a possible long-term mutualistic symbiosis between the 2HH strain and A. punctatum larvae.
Cellulose degradation is mediated by the cellulolytic enzyme system, widely used for biofuel production [13]. In addition, different biopolymers require specific enzymatic systems for their degradation, such as starch-degrading enzymes for starch and xylan-degrading enzymes for depolymerization of the major hemicellulose component. Enzymes that degrade, modify or create glycosidic bonds are known as CAZymes and are classified by the Carbohydrate-Active Enzyme (CAZy) database. These enzymes are organized into different families according to their amino acid sequence and structural similarity: i) Glycoside Hydrolases (GHs) are responsible for hydrolysis and/or rearrangement of glycosidic bonds; ii) Glycosyl Transferases (GTs) are responsible for the formation of glycosidic bonds; iii) Polysaccharide Lyases (PLs) perform non-hydrolytic cleavage of glycosidic bonds; iv) Carbohydrate Esterases (CEs) hydrolyze carbohydrate esters; v) Auxiliary Activities (AAs) are redox enzymes that act synergistically with other CAZymes; and vi) Carbohydrate Binding Modules (CBMs) promote the adhesion of the enzyme to the carbohydrate [14].
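The family labels used throughout this study (for example GH5-4, AA16 or CBM1) combine a class prefix, a family number and an optional subfamily number. As an illustration only, the following minimal Python sketch splits such a label into those parts; the function name `parse_family` and the assumption that every label follows the `PREFIX` + number (+ `-subfamily`) pattern are ours and are not part of the CAZy database tooling.

```python
import re

# Class prefixes as listed in the paragraph above.
CAZY_CLASSES = {
    "GH": "Glycoside Hydrolases",
    "GT": "Glycosyl Transferases",
    "PL": "Polysaccharide Lyases",
    "CE": "Carbohydrate Esterases",
    "AA": "Auxiliary Activities",
    "CBM": "Carbohydrate Binding Modules",
}

# Assumed label shape: "GH7", "AA9" or, with a subfamily, "GH5-4".
_LABEL = re.compile(r"^(CBM|GH|GT|PL|CE|AA)(\d+)(?:-(\d+))?$")


def parse_family(label):
    """Split a family label into (class name, family number, subfamily or None)."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a recognised CAZy family label: {label!r}")
    prefix, family, subfamily = match.groups()
    return CAZY_CLASSES[prefix], int(family), int(subfamily) if subfamily else None


for label in ("GH5-4", "AA16", "CBM1"):
    print(label, "->", parse_family(label))
```

Keeping the subfamily separate from the family number makes it straightforward to group, for instance, GH5-4, GH5-5 and GH5-22 under GH5 while still distinguishing them when needed.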
The production of plant cell wall degrading enzymes (cellulases, hemicellulases, ligninases and pectinases) is regulated mainly at the transcriptional level in filamentous fungi. Gene expression of these enzymes is regulated by various environmental and cellular factors, some of which are common while others are species-specific or enzyme class-specific. These genes are induced in the presence of the carbon source or of molecules derived from it, whereas repression occurs under growth conditions in which the production of these enzymes is not necessary, such as in the presence of glucose. Along these lines, it has been shown that cellulolytic enzyme expression is induced by cellobiose in many species of fungi, given that cellobiose is the primary end product generated from cellulose degradation by cellulases [18]. Recent results support that the accumulation of intracellular cellodextrins (mainly cellobiose) may raise cellulase secretion through a signaling cascade in P. oxalicum [19] and Neurospora crassa [20].
Filamentous fungi are able to transport a wide diversity of sugars through transmembrane proteins. The vast majority of Sugar Transporters (STs) characterized so far belong to the subfamily PF00083 of the major facilitator superfamily (MFS). Members of this subfamily include various STs, which are responsible for the binding and transport of various carbohydrates, organic alcohols and acids [21,22]. Among these sugars, filamentous fungi are able to transport disaccharides such as cellobiose into the cell through specific transporters. Cellobiose and other cellodextrins can act as signal transducers in two ways: i) cellodextrins are transported into cells, activating intracellular sensors; and ii) extracellular cellodextrins activate plasma membrane sensors, such as transporter-like proteins or G protein-coupled receptors [23].
Recently, the whole genome sequences (WGS) of both 2HH wild-type and S1M29 mutant were deposited at GenBank, allowing studies to discover novel features of the lignocellulolytic enzyme system of P. echinulatum. These WGS provided evidence that 2HH wild-type strain is closely related to P. oxalicum, leading to a taxonomic revision study of this fungus. In this study, we explore the genomic content of this notable fungus to discover novel features of the lignocellulolytic expression mechanisms, including the characterization of CAZymes and STs involved in the accumulation of intracellular cellodextrins. We also link experimental results of cellulase enhancement secretion from established cellulase producers to P. echinulatum 2HH, by analyzing amino acid sequences similarities and their likely roles.
Here, we analyzed the genes that constitute the cellulolytic enzyme system of P. echinulatum 2HH using a comparative genomic approach. In addition, the phylogenetic proximity of P. echinulatum 2HH to P. oxalicum 114-2 enabled genomic comparisons to identify singularities in the lignocellulolytic enzyme systems of these two well-known enzyme producers. Characterization of CAZymes and STs also provided new evidence to elucidate the adaptation of P. echinulatum 2HH to the Coleoptera larval diet during a potential long-term mutualistic symbiosis. Additionally, our in silico approach allowed the discovery of new gene targets and suggests a path to engineer P. echinulatum 2HH for industrial use. Finally, our results point to a large number of genes involved in cellulolytic expression mechanisms, revealing new targets for the design of hypersecreting strains.

P. echinulatum CAZyome
Among all CAZymes annotated in the P. echinulatum proteome and detailed in Additional file A01.1, we highlight the protein-encoding genes of the cellulolytic enzyme system in Table 1. Cellulase mixtures of Penicillium spp. are known to be rich in β-glucosidases [2]; in P. echinulatum 2HH these enzymes belong to the GH1 and GH3 families. The GH1 proteins are probably intracellular enzymes, whereas five of the nine GH3 β-glucosidases contain a signal peptide and are probably secreted into the medium. Furthermore, all cellobiohydrolases, one of the GH6 family and two of the GH7 family, also contain a signal peptide. In addition, endo-1,4-β-D-glucanases are found in the GH5-4, GH5-5, GH5-22, GH7, GH12 and GH45 families; the GH5-22 enzyme is probably intracellular, whereas all others contain a signal peptide. Among the AA enzymes, the four LPMOs of the AA9 family, the LPMO of the AA16 family and the CDH enzyme of the AA3-1 family contain a signal peptide, whereas the CDH enzyme of the AA8 family does not. These AAs act synergistically with the GHs, playing a crucial role in the cellulose degradation system. The LPMO-type enzyme of the AA16 family acts on cellulose by oxidative cleavage at the C1 position of the glucose unit [16]. In this study we identified this enzyme for the first time in P. echinulatum 2HH and P. oxalicum 114-2.
Among the putative cellulases, only EGL1 has been cloned, characterized and heterologously expressed [24]. This characterization showed that the optimal temperature of EGL1 is 60°C and that optimal activity occurs over a broad pH range (5-9). Furthermore, the EGL1 secreted by a recombinant Pichia pastoris also showed high thermostability (84% of residual activity after 1 h of pre-incubation at 70°C), and calcium exerted a strong stimulatory effect on EGL1 activity [24].
Several CAZymes contain accessory non-catalytic domains (e.g., CBMs). InterPro, PROSITE, Pfam and dbCAN2 were used to refine CBM predictions, and the results are presented in Table A01.2. Specifically, we found 58 proteins containing at least one CBM domain, including 24 proteins with a CBM1 domain targeting cellulose. The main CBM1-containing cellulases are featured in Table 1. In addition to the cellulolytic enzymes of GH families 5, 6, 7 and 45 and of the AA9 family, CBM1 was also observed in association with different types of catalytic domains from the CE2 family and from GH families 10, 11, 26, 30, 43 and 62. Furthermore, CBM1 was observed in association with a CBM63 domain, which also targets cellulose, in a swollenin-encoding gene. We also found an expansin-encoding gene containing a CBM63 domain. Many expansin-like proteins have been reported and shown to bind to and act on cellulosic networks. Some of them have been shown to act synergistically with cellulases and xylanases [26].
In summary, the CAZyome characterization confirms the natural potential of P. echinulatum 2HH for the production of cellulolytic mixtures, consistent with previous secretome analyses showing that P. echinulatum 2HH produces a vast range of CAZymes, primarily cellulases and xylanases [7]. These results provide an important step in the molecular understanding of this microorganism, allowing strain improvement using advanced techniques and further elevating the importance of the genus Penicillium in biotechnology for biofuels.

CAZymes as evolutionary markers
Whole genome sequences of the 2HH wild-type strain, recently deposited at GenBank, provided evidence that 2HH is closely related to P. oxalicum, leading to a taxonomic revision study of this fungus. The phylogenetic proximity between P. echinulatum 2HH and P. oxalicum 114-2 and the orthology of genes belonging to the cellulolytic system (detailed in Additional file A01.1) indicate that the cellulolytic systems of P. echinulatum 2HH and P. oxalicum 114-2 are highly similar. We found the respective orthologs in P. oxalicum 114-2 for almost all cellulolytic genes of P. echinulatum 2HH.
The isolation source of the 2HH strain suggests a potential natural adaptation for the secretion of cellulolytic enzymes, possibly in response to the A. punctatum larval diet as the only growth condition available to the fungus. To date, this evolutionary hypothesis has been supported only by the mixture of cellulases secreted by the 2HH strain, which provides an effective enzymatic formulation for complete saccharification of plant residues rich in cellulose and hemicellulose [7]. This hypothesis encourages the search for new insights into the uptake of carbon sources available in the environment.
In order to understand evolutionary relationships, we analyzed the differences in CAZyome composition between P. echinulatum 2HH and P. oxalicum 114-2. First, we analyzed CAZymes that showed low transcription levels in P. oxalicum 114-2 [27] and their respective orthologs in P. echinulatum. In P. oxalicum, PDE 05193 is an endo-1,4-β-D-glucanase with a signal peptide, and it is orthologous to (PECH 006981/PECM 005047) in P. echinulatum. The secretion profile of P. oxalicum does not show a secreted protein ratio for PDE 05193, and its transcription level in CW medium is very low [27]. When we aligned the nucleotide sequences of P. oxalicum 114-2 and P. echinulatum 2HH, we observed an insertion in P. echinulatum 2HH exactly where the correct start codon of this gene should be, suggesting that this gene may have been disabled by mutation.
In the same way, the extracellular β-glucosidase BGL3 (PDE 01277) of P. oxalicum 114-2 showed no activity on either pNPG or salicin in vitro, and its role is not yet known [19]. In P. echinulatum 2HH we did not find the ortholog of this β-glucosidase of the GH1 family. To confirm the absence of this ortholog, we performed a syntenic comparison of the genome region expected to contain the ortholog of PDE 01277 in P. echinulatum 2HH. We observed that this part of the sequence was missing, comprising the location of both PDE 01277 and PDE 01278 orthologs. However, nearby synteny is preserved, both before (PDE 01276) and after (PDE 01278), with their respective orthologs (PECH 007174 and PECH 007175) present in P. echinulatum 2HH.
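The logic of such a syntenic check can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the procedure used in the study: gene order in the reference genome is given as a plain list, orthology as a dictionary, and all identifiers in the example (`refA`, `qry1`, and so on) are hypothetical placeholders.

```python
def gaps_between_conserved_flanks(reference_order, orthologs):
    """Find runs of reference genes without orthologs that are flanked,
    on both sides, by reference genes that do have an ortholog.

    reference_order: gene IDs in their order along the reference genome.
    orthologs: dict mapping reference gene ID -> ortholog ID in the query.
    Returns a list of (left_flank, [missing genes], right_flank) tuples.
    """
    gaps = []
    i, n = 0, len(reference_order)
    while i < n:
        if reference_order[i] in orthologs:
            i += 1
            continue
        start = i
        # Extend the run while genes keep lacking an ortholog.
        while i < n and reference_order[i] not in orthologs:
            i += 1
        # Keep the run only if conserved genes exist on both sides.
        if start > 0 and i < n:
            gaps.append(
                (reference_order[start - 1], reference_order[start:i], reference_order[i])
            )
    return gaps


# Hypothetical example: refB has no ortholog, its neighbours do.
reference = ["refA", "refB", "refC", "refD"]
ortholog_map = {"refA": "qry1", "refC": "qry2", "refD": "qry3"}
print(gaps_between_conserved_flanks(reference, ortholog_map))
```

A fuller analysis would also confirm that the flanking orthologs are adjacent in the query genome, which is what distinguishes a local deletion from a rearrangement.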
In contrast, by searching for orthologs in P. brasilianum, P. subrubescens and P. chrysogenum, we found the respective orthologs of both the endo-1,4-β-D-glucanase and the β-glucosidase of P. oxalicum 114-2. These orthologs in related Penicillium spp. suggest that the absence of both genes is a particular evolutionary characteristic of P. echinulatum 2HH. It is remarkable that these two genes are also the only differences among the proteins containing a signal peptide when we compare the cellulolytic complexes of P. echinulatum 2HH and P. oxalicum 114-2. An adaptive characteristic, such as the ability to survive within a specific host, may culminate in "conditionally dispensable" genes, reflecting their importance in some, but not all, growth conditions. Gene loss during evolution can be an adaptive evolutionary force that is especially effective when organisms are faced with abrupt environmental challenges [28].
The Coleoptera larval diet, mainly composed of cellulose, hemicellulose and residues of lignin from decayed wood [11], led us to hypothesize the presence of environment-specific adaptations in P. echinulatum 2HH to degrade these biopolymers. Along these lines, we investigated a specific set of enzymes required to degrade wood, comparing P. echinulatum 2HH and P. oxalicum 114-2. We included 18 families of peroxidases and CAZymes in our analyses, as previously suggested [29]. Table 2 shows the number of encoding genes of each enzyme family, organized into three major groups: oxidoreductases related to the degradation of lignin or lignin-like compounds; CAZys active on polysaccharide main chains; and other CAZys related to wood decay.
The lower number of genes encoding oxidoreductases of the AA1 family and of the HTP type could be explained by the larval diet, which probably does not comprise unaffected lignin but contains lignin-related compounds. Surprisingly, a gene encoding a dye-decolorizing peroxidase (DyP) was found in P. echinulatum 2HH. This family of heme-containing peroxidases is active on lignin-related compounds and has important properties for lignocellulose biorefineries [30]. Apparently, this enzyme is not a particular adaptation of P. echinulatum 2HH, considering that orthologs of this DyP peroxidase were found in P. brasilianum, P. subrubescens and P. chrysogenum, but not in P. oxalicum 114-2.
In summary, the major differences of P. echinulatum 2HH in relation to P. oxalicum 114-2 include: i) fewer genes encoding oxidoreductases of the AA1 family and of the HTP type, contrasting with an additional gene encoding a DyP peroxidase, all of which are related to the degradation of lignin compounds; ii) one less endo-1,4-β-glucanase of the GH5 family and one less extracellular β-glucosidase of the GH1 family, both of which showed low transcription levels in P. oxalicum 114-2; iii) four additional α-L-rhamnosidases of the GH78 family, comprising a remarkable range of enzymes related to pectin degradation; iv) one less β-xylosidase and one additional iBGL of the GH3 family, which is an ortholog of Cel3C in T. reesei; and v) one less acetyl-xylan esterase of the CE16 family, one additional endomannanase of the GH5-7 family and one additional α-fucosidase of the GH29 family. Our results reinforce the hypothesis that a potential host-symbiont association may lead to environment-specific adaptations in the symbiont (fungus), particularly due to the diet of the host (insect larvae), although this remains a hypothesis. P. echinulatum 2HH and P. oxalicum 114-2 may share a very close common ancestor; this is a remarkable finding that might raise the standing of P. echinulatum 2HH in the global market of enzyme producers, particularly owing to its potential natural evolution for cellulase production.

Comparative analysis of Plant Biomass Degrading CAZymes (PBDC) in related fungi
We performed a comparative analysis of the number of CAZymes that are related to the degradation of plant biomass and that also contain a signal peptide. First, we identified putative CAZymes; then each one was classified according to its connection to plant biomass degradation, as previously suggested [31]. Finally, we crossed the PBDC and signal peptide predictions. Our results provide a comprehensive comparative analysis of the plant biomass degradation profiles of twelve filamentous fungal species. The stacked barplot (Fig. ) shows the distribution of potentially extracellular CAZymes (number of proteins) involved in the degradation of plant biopolymers. Complementary information about the PBDC predictions is available in Additional file A02. Comparative studies like this one contribute to identifying the nature and peculiarities of each species and how each one can be used for commercial enzyme production.
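The crossing step described above amounts to keeping only the proteins that satisfy both predictions and then counting them per family. A minimal Python sketch of that idea follows; it assumes, for simplicity, one CAZy family per protein, and the function name, protein identifiers and family sets are hypothetical placeholders rather than data from this study.

```python
from collections import Counter


def secreted_pbdc_counts(cazy_family, pbdc_families, signal_peptide_proteins):
    """Count, per CAZy family, the proteins that are both classified as
    plant biomass degrading (PBDC) and predicted to carry a signal peptide.

    cazy_family: dict protein ID -> CAZy family label (one family per protein).
    pbdc_families: set of family labels considered PBDC.
    signal_peptide_proteins: set of protein IDs with a predicted signal peptide.
    """
    counts = Counter()
    for protein, family in cazy_family.items():
        if family in pbdc_families and protein in signal_peptide_proteins:
            counts[family] += 1
    return counts


# Hypothetical toy input.
annotations = {"prot1": "GH7", "prot2": "GH7", "prot3": "GT2", "prot4": "AA9"}
pbdc = {"GH7", "AA9"}
secreted = {"prot1", "prot3", "prot4"}

per_family = secreted_pbdc_counts(annotations, pbdc, secreted)
print(dict(per_family), "total:", sum(per_family.values()))
```

Summing the per-family counts for each species gives the totals that a stacked barplot of this kind displays.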
As can be observed in the stacked barplot (Fig. ), when we sum the number of proteins in all CAZy classes, the total number of PBDC varies greatly among the analyzed Ascomycetes and also among the Penicillium species. Our results revealed a markedly higher number of PBDC for P. subrubescens, totaling 181 proteins. Aspergillus oryzae also displays a wide range of PBDC; these are the only two fungi that exceed 180 proteins, while P. digitatum includes only 71 proteins. In contrast, P. echinulatum and P. oxalicum include an intermediate number of PBDC, comprising 125 and 121 proteins, respectively. Although the number of PBDC is not a key factor for efficient breakdown of plant biomass, our comparison shows the potential for enzyme secretion in relevant filamentous fungi.
A previous study observed CAZy families that were present in P. chrysogenum and Aspergillus niger but not in T. reesei [32]. These comprise the CE8 and CE12 families, whose proteins have pectin methylesterase and rhamnose acetylesterase activities, respectively; the PL1-7 and PL4-1 families, with pectate lyase and rhamnogalacturonan lyase activities, respectively; and the endo-α-1,5-L-arabinosidase of the GH43-6 family. In our study we observed that the same families were present in almost all analyzed species but not in T. reesei, indicating that T. reesei does not include enzymes with these activities. The exception is N. crassa, which also lacks GH43-6 enzymes. In summary, T. reesei provides a narrow range of enzymes for pectin degradation when compared to the other fungi.
A larger difference was observed in P. echinulatum 2HH when compared to other Penicillium species. Most of this difference can be attributed to GH11, GH43 and AA9, which account for 19 PBDC in P. echinulatum, 16 in P. oxalicum, 12 in P. chrysogenum, 16 in P. brasilianum, 29 in P. subrubescens and only 2 in P. digitatum. Enzymes of these families are mostly related to the degradation of hemicellulose and cellulose, including xylanases, xylosidases, arabinofuranosidases and LPMOs. As already mentioned, P. echinulatum 2HH and P. oxalicum 114-2 showed quite similar profiles of putative cellulolytic enzymes, which led us to compare them with other filamentous fungi to discover peculiarities of their cellulolytic enzyme profiles. In this context, we highlight the ability of these two fungi to produce effective cellulolytic cocktails. Reducing the scope of the analysis to putative proteins of the cellulolytic system that carry a signal peptide revealed several differences (Fig. ).
In T. reesei, despite the regular number of putative β-glucosidase proteins, it is known that the low β-glucosidase activity of the cellulolytic complex leads to inefficiency in biomass degradation, requiring genetic engineering to enhance secretion of this enzyme [33]. Additionally, the conservation of the LPMO-type enzyme (AA16) in the Aspergilli and Penicillia analyzed is notable, while this enzyme family was not found in the Neurospora and Trichoderma genomes. Another interesting finding concerns putative cellulases of the GH45 family: only four of the twelve filamentous fungi investigated possess GH45 proteins containing a signal peptide. Lastly, the typical GH5-4 catalytic domain was not found in the cellulolytic complexes of T. reesei, N. crassa and A. niger.
In summary, our results showed that P. echinulatum 2HH and P. oxalicum 114-2 have quite similar PBDC profiles. The cellulolytic activities of AA16, GH5-4 and GH45 deserve attention when it comes to understanding their roles in the cellulose degradation systems of P. echinulatum 2HH and P. oxalicum 114-2, because these enzymes account for the main differences in the cellulolytic complexes when both fungi are compared to commercial producers. The peculiarities found in our study may help to highlight the cellulolytic complexes of P. echinulatum 2HH and P. oxalicum 114-2, contributing to the commercial ascendance of these two fungi. In addition to the high level of extracellular β-glucosidase activity, our results help to support the commercial application of P. echinulatum 2HH for the production of cellulolytic enzymes.

Cellulolytic system expression induced by cellodextrins
Cellodextrins are saccharide polymers of varying length (two or more glucose monomers) resulting from cellulolysis, the breakdown of cellulose. The primary end product generated from cellulose degradation is the disaccharide cellobiose. Cellodextrins are classified by their degree of polymerization (DP), for example cellobiose (DP2), cellotriose (DP3), cellotetraose (DP4) and so forth. Research results support the existence of a signaling cascade conserved in filamentous fungi [18]. This pathway acts when cellobiose or other cellodextrins accumulate in the cell, raising the secretion of cellulases. In this context, intracellular accumulation of cellodextrins can occur in two ways: i) through low activity of iBGLs, which reduces the hydrolysis of cellodextrins to D-glucose [19,34]; and ii) through high expression of STs, which are able to transport cellodextrins from the medium into the cell [20,35]. In order to discover the features behind this cellodextrin induction system in P. echinulatum 2HH, we performed several genomic analyses including iBGLs and STs, as presented below.

Intracellular β-glucosidases (iBGLs)
Phylogenetic analyses were performed using two datasets containing 19 and 32 iBGLs sequences, respectively for GH1 and GH3 families. The GH1 dataset includes 3 putative iBGLs of P. echinulatum 2HH, 13 putative iBGLs from related fungi, as well as, 3 BGLs of Arabidopsis thaliana used as outgroup. The GH3 dataset includes 4 putative iBGLs of P. echinulatum 2HH, 26 putative iBGL sequences of the related fungi, as well as, 2 BGLs of A. thaliana used as outgroup. Detailed information of iBGLs is available in the Additional file A03. Fig. shows the phylogenetic classification of iBGLs, comprising enzymes of GH1 (a) and GH3 (b) families. The roles of proteins highlighted in bold in both trees were verified and may provide evidence of likely conserved roles in filamentous fungi.
In T. reesei, the iBGLs CEL1a and CEL1b may not participate directly in cellobiose hydrolysis; however, they may contribute to the accumulation of cellobiose as a signal inducer. CEL1a plays an important role in cellulase induction in T. reesei, since a single-nucleotide mutation in cel1a in strain PC-3-7 resulted in high cellulase expression on cellobiose [36,37,34]. In T. reesei, CEL1a and CEL1b were also functionally equivalent in mediating the induction of cellulase genes by lactose, and the simultaneous absence of these iBGLs abolished cbh1 gene expression. Also in T. reesei, the CEL1a protein and its glycoside hydrolytic activity were indispensable for cellulase induction by lactose. Intracellular BGL-mediated lactose induction is further conveyed to XYR1 to ensure the efficient induced expression of cellulase genes [37].
Moreover, several studies have been carried out on iBGLs of the GH3 family in T. reesei. Deletion of the bgl3i gene significantly increased cellulase activities, although it had no influence on fungal growth. Deletion of bgl3i also enhanced the transcription levels of CEL1a, CEL1b and the XYR1 regulator, which are all crucial for lactose induction in T. reesei [38]. Also in T. reesei, the ∆cel3c mutation had no significant influence on the expression of secreted proteins [39], while dysfunction of cel3d resulted in higher secretion of cellulases [33].
In P. oxalicum 114-2, BGL2 (PDE 00579) is the major iBGL and was found to be dependent on ClrB at the transcription level. The deletion of bgl2 facilitates the synergistic expression of cellulase genes. Lack of this iBGL facilitates the accumulation of intracellular cellodextrins, which can trigger signaling cascades that include expression of cellulase genes [42,19].
In P. echinulatum, protein sequence alignment of the BGL2 orthologs of the S1M29 mutant (PECM 002864) and the 2HH wild-type (PECH 005648) revealed a single amino acid substitution (D194P) in the major iBGL of the GH1 family. Although several mutations have been identified, this mutation is probably the major source of cellulase hyperproduction by the S1M29 mutant. Although this substitution does not affect the BGL2 catalytic domain in P. echinulatum S1M29, a single amino acid substitution could impair the enzyme or reduce its activity, as occurred in the BGL2 ortholog of T. reesei [34].
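Substitutions such as D194P are conventionally written as wild-type residue, position in the wild-type protein, and mutant residue. As an illustration of how such labels can be derived from a pairwise alignment, the Python sketch below walks two pre-aligned sequences of equal length; the function name and the toy sequences are hypothetical and are not the BGL2 sequences analyzed here.

```python
def point_substitutions(wild_type_aln, mutant_aln, gap="-"):
    """List substitutions between two aligned protein sequences.

    Both inputs must be rows of the same alignment (equal length, gaps as "-").
    Positions are 1-based and counted on the ungapped wild-type sequence.
    Columns with a gap in either row are indels, not substitutions, and are skipped.
    """
    if len(wild_type_aln) != len(mutant_aln):
        raise ValueError("aligned sequences must have the same length")
    changes = []
    position = 0
    for wt, mt in zip(wild_type_aln, mutant_aln):
        if wt != gap:
            position += 1  # advance only on real wild-type residues
        if wt != mt and wt != gap and mt != gap:
            changes.append(f"{wt}{position}{mt}")
    return changes


# Toy example: the third residue changes from D to P.
print(point_substitutions("MKDLT", "MKPLT"))
```

Counting positions on the ungapped wild-type row keeps the numbering comparable with the annotated protein even when the alignment contains insertions.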
In summary, the influence of iBGLs on the induction of cellulolytic enzyme systems in filamentous fungi is undeniable. However, related fungi results report the complexity and specificities of each species. All iBGLs found in P. echinulatum 2HH are highlighted with blue stars in the phylogenetic trees, which allows opportunities to figure out molecular mechanisms underlying the regulation of cellulolytic enzymes secretion. Therefore, we suggest these highlighted genes as potential engineering targets, aiming to improve the expression of cellulolytic enzymes in P. echinulatum 2HH.

Sugar transporters (STs)
The sugar transportome of P. echinulatum 2HH includes 64 putative ST-encoding genes, found by searching the proteome for the conserved ST domain (PF00083). We also found considerable diversity in the numbers of PF00083-containing proteins in the fungi investigated, with more than three-fold differences between related species. Our results are consistent with the Aspergillus phylogenetic study of STs [43]. The largest and smallest numbers of PF00083-containing proteins correspond to P. subrubescens (116) and N. crassa (35) [22,43]. The putative sugar specificity of each clade was suggested based on the previously reported properties of the STs included in the phylogenetic tree.
With the exception of the unknown STs highlighted in red, we numbered the clades proceeding counterclockwise around the tree. The first clade contains 11 STs of P. echinulatum 2HH and 9 known cellodextrin/lactose transporters from related fungi. This clade represents the most important group of STs for understanding the cellodextrin induction system in P. echinulatum 2HH. Previous studies and established functions of these STs in related fungi, particularly in P. oxalicum, help to clarify the influence of cellodextrin transporters on the cellulolytic induction system. In P. oxalicum, overlapping activity was observed among the cellodextrin transporters cdtC, cdtD and cdtG. Deletion of a single gene resulted in no observable effect on cellulase expression. Nonetheless, simultaneous deletion of cdtC and cdtD resulted in a remarkable decrease in cellobiose consumption and poor growth on cellulose, as well as lower extracellular cellulase activity. In addition, overexpression of the cellodextrin transporter genes (cdtC, cdtD or cdtG) improved cellulase production in P. oxalicum mutants, with the highest fold changes in the cdtG-overexpressing mutant [35]. Orthologs of these three cellodextrin transporters (cdtC, cdtD and cdtG) were found in P. echinulatum 2HH: PECH 005610, PECH 006659 and PECH 005330, respectively.
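The domain-based selection and the fold difference quoted above can be illustrated with a short Python sketch. It assumes that domain predictions have already been parsed into a dictionary of protein ID to Pfam accessions; the function name and the toy protein IDs are hypothetical, while the three counts are those reported in the text.

```python
def proteins_with_domain(domain_hits, accession="PF00083"):
    """Return the sorted IDs of proteins annotated with a given Pfam accession.

    domain_hits: dict protein ID -> set of Pfam accessions predicted for it.
    """
    return sorted(p for p, accessions in domain_hits.items() if accession in accessions)


# Hypothetical toy annotations: only two of the three proteins carry PF00083.
hits = {
    "protA": {"PF00083"},
    "protB": {"PF07690"},
    "protC": {"PF00083", "PF07690"},
}
print(proteins_with_domain(hits))

# Numbers of PF00083-containing proteins reported in the text.
st_counts = {"P. subrubescens": 116, "P. echinulatum 2HH": 64, "N. crassa": 35}
fold = max(st_counts.values()) / min(st_counts.values())
print(f"largest / smallest = {fold:.2f}-fold")
```

With the reported extremes, the ratio is 116/35, which is consistent with the "more than three-fold" difference stated above.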
In Aspergillus nidulans, CltA is a cellobiose-specific transporter, while CltB/LacpB is able to transport cellobiose and lactose. However, the latter protein has been described as a cellulose signaling sensor rather than a cellobiose transporter [23]. Also in A. nidulans, deletion of cltB/lacpB resulted in reduced growth and extracellular cellulase activity, indicating that the cellulose and lactose catabolic systems operate with common components. In contrast, deletion of cltA showed no significant effect on cellulase expression in the presence of cellobiose [44]. In P. echinulatum 2HH, cltA was identified as orthologous to PECH 006659 and cltB/lacpB as orthologous to PECH 005610. P. echinulatum 2HH appears to lack orthologs of lacpA (high-affinity lactose permease) of A. nidulans, reinforcing the hypothesis that lactose is not among the preferred carbon sources of the 2HH strain. Previous experiments have shown reduced extracellular cellulase activity in lactose medium for the 9A02S1 mutant, derived from the 2HH strain [45]. In T. reesei, Crt1 plays a crucial role in lactose induction of cellulase genes, either as a lactose transporter or as a cellulose sensor [46]. The absence of crt1 abolished cellulase gene expression, indicating that Crt1 is essential for cellulase gene induction independently of intracellular sugar delivery [47]. In P. echinulatum 2HH, crt1 was also identified as orthologous to PECH 005610.
In N. crassa, CDT1 and CDT2 have a dual function, acting as cellodextrin transporters and also playing a key role as cellulose signaling sensors involved in the induction of cellulases. Also in N. crassa, CLP1 is a putative cellodextrin transporter-like protein that is a critical component of the cellulase induction pathway. Although the CLP1 protein cannot transport cellodextrin, this signaling sensor may suppress cellulase induction. The co-disruption of cdt2 and clp1 enhanced cellulase production 6.9-fold under cellobiose induction in the ∆3βG strain [20]. In P. echinulatum 2HH, clp1 was identified as orthologous to PECH 007291, while cdt1 is orthologous to PECH 005610 and cdt2 is orthologous to PECH 004467. In addition to the five orthologs listed so far, six more putative cellodextrin transporters and/or sensors were mapped in P. echinulatum 2HH: PECH 007978, PECH 001261, PECH 002010, PECH 008603, PECH 005239 and PECH 003597. These transporters lack reviewed orthologs in related fungi, so experimental studies are needed to clarify their roles in P. echinulatum 2HH cellulase induction.
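The Clade 1 ortholog assignments reported above can be tallied in a few lines. The sketch below is illustrative only: the dictionary restates the gene-to-locus pairs given in the text, and the variable names are our own, not part of the original analysis.

```python
from collections import defaultdict

# Characterized transporters/sensors and the P. echinulatum 2HH locus
# reported in the text as their ortholog (Clade 1 only).
reported_orthologs = {
    "cdtC (P. oxalicum)": "PECH 005610",
    "cdtD (P. oxalicum)": "PECH 006659",
    "cdtG (P. oxalicum)": "PECH 005330",
    "cltA (A. nidulans)": "PECH 006659",
    "cltB/lacpB (A. nidulans)": "PECH 005610",
    "crt1 (T. reesei)": "PECH 005610",
    "cdt1 (N. crassa)": "PECH 005610",
    "cdt2 (N. crassa)": "PECH 004467",
    "clp1 (N. crassa)": "PECH 007291",
}

# Clade 1 members without a reviewed ortholog, as listed in the text.
uncharacterized = [
    "PECH 007978", "PECH 001261", "PECH 002010",
    "PECH 008603", "PECH 005239", "PECH 003597",
]

# Invert the mapping: one 2HH locus may match several characterized genes.
by_locus = defaultdict(list)
for gene, locus in reported_orthologs.items():
    by_locus[locus].append(gene)

for locus in sorted(by_locus):
    print(locus, "<-", ", ".join(by_locus[locus]))

clade1_total = len(by_locus) + len(uncharacterized)
print("distinct loci with reported orthologs:", len(by_locus))
print("Clade 1 STs in P. echinulatum 2HH:", clade1_total)
```

Grouping by locus makes explicit that the nine characterized genes collapse onto five distinct 2HH loci, which together with the six uncharacterized loci give the 11 STs of Clade 1.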
Following the tree counterclockwise, Clade 2 carries mainly pentose and glycerol transporters. It contains ten STs of P. echinulatum 2HH and twenty-one STs of A. niger, as well as two pentose transporters of N. crassa, XAT1 and AN25: the first has specificity for D-xylose and L-arabinose and the second is a D-xylose-specific transporter [48]. It also carries the glycerol transporter MfsA of Aspergillus fumigatus [49].
Transporters expressed by filamentous fungi can often transport more than one type of sugar. For example, the A. nidulans transporter XtrD is able to transport xylose, glucose and several other monosaccharides, whereas T. reesei STP1 is involved in glucose and cellobiose uptake. In T. reesei, disruption of stp1 affected the induction of major cellulase and hemicellulase genes on cellobiose but not on sophorose [47]. In P. echinulatum 2HH, stp1 was identified as orthologous to PECH 001072.
In summary, our phylogenetic analysis, which includes the sugar transportome of P. echinulatum 2HH, follows the same classification observed in A. niger [22]. Eight major families of STs with specificity to different groups of sugar molecules were identified, involving hexoses, pentoses, di-/oligosaccharides, and galacturonic/quinic acid. Furthermore, we identified 11 STs of P. echinulatum 2HH and 9 known cellodextrin and lactose transporters from related fungi, which are grouped in a specific clade in our phylogenetic analysis. Of these 11 STs of P. echinulatum 2HH, 5 correspond to orthologs of transporters reported in the literature for related fungi, which have been shown to affect the induction of cellulolytic enzymes. These results suggest that the diversity and specificity of P. echinulatum 2HH STs are consistent with those of other cellulase producers. The putative STs provide new insights into the metabolism and nutritional behavior of P. echinulatum 2HH. Finally, the genes highlighted with blue stars in the tree form the basis for understanding the role of cellobiose induction in the cellulolytic expression mechanisms of P. echinulatum 2HH. These gene targets can be applied to different industrial processes and represent an important tool to engineer P. echinulatum 2HH for the biofuel industry.
In the genomic analyses, we found various novel features of the lignocellulolytic enzyme system of P. echinulatum 2HH. The CAZyome characterization reveals the outstanding repertoire of enzymes involved in the degradation of lignocellulosic biomass offered by P. echinulatum 2HH. In fact, the genes that constitute the cellulolytic enzyme system of the 2HH strain are predominantly orthologous to those of the cellulolytic enzyme system of P. oxalicum 114-2, reflecting the phylogenetic proximity of these filamentous fungi. The cellulolytic activities of AA16, GH5-4 and GH45 deserve attention for understanding the cellulose degradation system of P. echinulatum 2HH and P. oxalicum 114-2, considering that the genes encoding these families account for the main differences in the cellulolytic complexes when both are compared to the commercial producers. Both cellulolytic systems include an LPMO-type enzyme of the AA16 family, which acts on cellulose with oxidative cleavage at the C1 position of the glucose unit, described for the first time in these fungi.
Besides the similarities, we also highlight the singularities in the lignocellulolytic enzyme system of P. echinulatum 2HH. Considering CAZymes as evolutionary markers, we compared P. echinulatum 2HH to P. oxalicum 114-2, reinforcing the previously reported hypothesis of environment-specific adaptations in P. echinulatum 2HH during a potential long-term mutualistic symbiosis with A. punctatum larvae. We suggest that adaptations to the symbiotic environment associated with the restricted diet of the larvae could explain some differences in the composition of genes encoding enzymes required to degrade wood. Major differences include: i) fewer genes encoding oxidoreductases related to lignin degradation in P. echinulatum 2HH; and ii) functional genes with low transcription levels corresponding to cellulolytic enzymes in P. oxalicum 114-2, whose orthologs were absent or identified as pseudogenes in P. echinulatum 2HH. However, our results are not conclusive and the potential long-term mutualistic symbiosis remains a hypothesis.
In addition to the CAZyome, we also characterized the sugar transportome of P. echinulatum 2HH. Our phylogenetic analysis suggests that the diversity and specificity of P. echinulatum 2HH STs are consistent with those of other enzyme producers, including eight major families of STs with specificity to different groups of sugars. Phylogenetic classification of STs helps to clarify their role with respect to the preferred carbon source of P. echinulatum 2HH. Furthermore, the phylogenetic analyses of iBGLs and STs enabled the identification of several iBGLs and STs potentially involved in the accumulation of intracellular cellodextrins, yielding a few candidate target genes for rational engineering of industrial strains. These candidates are relevant because intracellular cellodextrin accumulation plays a key role in inducing the expression of cellulolytic enzymes in filamentous fungi through a signaling cascade. Overall, a significant number of putative iBGLs and STs of P. echinulatum 2HH correspond to orthologs reported in the literature for related fungi, which have been shown to affect the induction of cellulolytic enzymes. These iBGLs and STs are valuable gene targets for understanding the mechanisms underlying the regulation of cellulolytic enzymes and for designing hypersecreting strains of P. echinulatum 2HH.

CONCLUSIONS
Our results provide important steps toward building a molecular understanding of P. echinulatum 2HH, revealing surprising features related to PBDC, particularly in the cellulolytic enzyme system. Knowledge of CAZymes and STs provides valuable instruments for strain improvement for ethanol production, with regard to an adequate enzymatic balance for the attributes of second-generation feedstocks, such as corn crop residues or sugarcane bagasse. Finally, we highlight the distinguished cellulolytic system of P. echinulatum 2HH, which promotes this species as an important biotechnological ally for lignocellulosic biofuel production.

CAZyome annotation
Recently, the systematic analysis of bacterial [63] and fungal [64] genomes highlighted the distribution and variability of enzymes involved in polysaccharide degradation. These approaches provide a framework to investigate CAZyme diversity and to identify new enzymes with potential for the biopolymer degradation industry.
In addition to dbCAN2 predictions, InterPro [67], PROSITE [68] and Pfam [69] were also used to improve the accuracy of carbohydrate-binding module assignment. Performing a reliable CAZyome annotation is not a simple task, especially when dealing with a novel species. According to the dbCAN authors [66], reliable assignments are those predicted by more than one tool. Our experience in this annotation suggests that the two approaches used in this study are complementary and relevant for assigning CAZy families, although they are not yet sufficient to assign putative protein products. Accordingly, all putative CAZyme sequences were also subjected to manual curation, which involved BLASTp searches against UniProtKB/Swiss-Prot/TrEMBL [70] and inspection of orthologs from related fungi to assign a function to each CAZyme. Orthologous groups were found by searches using ProteinOrtho (V5.16b) [71].

Comparison of Plant Biomass Degrading CAZymes (PBDC)
To explore the ability of the twelve investigated filamentous fungi (P. echinulatum and the eleven related fungi) to degrade plant biomass, all encoded protein sequences (Additional file A02) were first submitted to the dbCAN2 server [66] and searched against HMMdb release 8.0, using the specific HMM models for each CAZy module family with default cut-off values (as previously detailed for the P. echinulatum CAZyome annotation). Next, the CAZy families involved in the degradation of plant biomass, previously described in [31], were used to classify all putative CAZymes predicted by dbCAN2. Then, all putative CAZyme sequences were submitted to the SignalP server (v5.0) [72] to predict the presence and location of signal peptide cleavage sites. Finally, ggplot2 [73] was used to plot all charts.
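The classification and counting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the record layout, the function names and the example rows are hypothetical, and subfamily labels are assumed to follow the underscore notation (for example GH5_4).

```python
import re
from collections import Counter

CAZY_CLASSES = ("GH", "GT", "PL", "CE", "AA", "CBM")

def cazy_class(family: str) -> str:
    """Return the CAZy class prefix of a family label such as 'GH5_4' or 'AA16'."""
    match = re.match(r"[A-Z]+", family)
    if match is None or match.group(0) not in CAZY_CLASSES:
        raise ValueError(f"unrecognized CAZy family: {family!r}")
    return match.group(0)

def count_secreted_by_class(records, pbd_families):
    """Count proteins per CAZy class, keeping only PBD families with a signal peptide.

    records: iterable of (protein_id, family, has_signal_peptide) tuples.
    pbd_families: set of family labels regarded as plant-biomass degrading.
    """
    counts = Counter()
    for _protein_id, family, has_signal_peptide in records:
        if has_signal_peptide and family in pbd_families:
            counts[cazy_class(family)] += 1
    return counts

# Made-up records for illustration only; not data from this study.
example_records = [
    ("prot1", "GH5_4", True),
    ("prot2", "GH45", True),
    ("prot3", "AA16", True),
    ("prot4", "GH3", False),   # no signal peptide: excluded
    ("prot5", "GT2", True),    # not in the PBD family set: excluded
]
example_pbd = {"GH5_4", "GH45", "AA16", "GH3"}
example_counts = count_secreted_by_class(example_records, example_pbd)
print(example_counts)
```

Running the same function once per proteome would give the per-class totals needed for a stacked bar chart of secreted PBDC.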

CAZymes as evolutionary markers
All proteomes detailed in the section Fungal proteomes analyzed were used to find orthologous groups in genome-wide searches using ProteinOrtho (V5.16b) [71].
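As an illustration of how such orthologous groups can be queried, the sketch below parses ProteinOrtho-style tabular output. It assumes the usual layout of that output (a header line starting with "#", three leading summary columns, "*" for a species without a member, and commas between co-orthologs); the species names and protein IDs are hypothetical.

```python
def parse_proteinortho(lines):
    """Parse ProteinOrtho-style tabular output into a list of ortholog groups.

    Each group is a dict mapping a species column name to a list of protein IDs.
    """
    species = None
    groups = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("#"):
            if species is None:
                species = fields[3:]  # columns after the three summary fields
            continue
        if species is None:
            raise ValueError("no header line found before data rows")
        group = {}
        for name, cell in zip(species, fields[3:]):
            if cell != "*":  # '*' marks a species with no member in this group
                group[name] = cell.split(",")
        groups.append(group)
    return groups

def orthologs_of(groups, query_species, query_id, target_species):
    """Return the target-species members of the group containing query_id."""
    for group in groups:
        if query_id in group.get(query_species, []):
            return group.get(target_species, [])
    return []

# Hypothetical example table; IDs and species names are invented.
example = [
    "# Species\tGenes\tAlg.-Conn.\tfungusA.faa\tfungusB.faa\tfungusC.faa",
    "3\t3\t1\tA_0001\tB_0107\tC_0042",
    "2\t3\t0.5\tA_0002,A_0003\t*\tC_0100",
]
groups = parse_proteinortho(example)
print(orthologs_of(groups, "fungusA.faa", "A_0001", "fungusB.faa"))
```

An empty list from orthologs_of indicates that no ortholog was placed in the same group, which is how absent genes would show up in a pairwise comparison of two species.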

Phylogenetic analysis of iBGLs
The proteomes of P. echinulatum and the other eleven related fungi were investigated to find iBGL protein sequences. Those belonging to the GH1 and GH3 families were added to separate datasets. BGLs from A. thaliana [74,75] were used as outgroups in the phylogenetic analyses. Detailed information about these datasets is available in Additional file A03.

Identification of STs
To assess the diversity of STs, the ST domain (PF00083) profile extracted from the Pfam database [69] was used to search the proteomes of P. echinulatum and the other eleven related fungi with hmmsearch (v3.1b2) [80], using an hmmsearch score ≥ 238 as the cutoff [22].
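The score cutoff can be applied to hmmsearch tabular output as sketched below. The sketch assumes that the search was run with the `--tblout` option and that the threshold refers to the full-sequence bit score (the sixth whitespace-delimited column); the text does not specify either point. The function name and the example rows are invented for illustration.

```python
def filter_tblout(lines, min_score=238.0):
    """Return {target_name: full-sequence bit score} for hits scoring >= min_score.

    Expects lines in the whitespace-delimited layout written by
    `hmmsearch --tblout`: target name, target accession, query name,
    query accession, full-sequence E-value, full-sequence score, ...
    """
    hits = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, score = fields[0], float(fields[5])
        if score >= min_score:
            # keep the best score if a target appears more than once
            hits[target] = max(score, hits.get(target, score))
    return hits

# Invented rows that only mimic the tblout layout; not results from this study.
example_tblout = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "protA - Sugar_tr PF00083 1.2e-90 301.4 12.3 1.5e-90 301.1 12.3 1.0 1 0 0 1 1 1 1 invented hit",
    "protB - Sugar_tr PF00083 3.0e-40 140.2 5.0 4.1e-40 139.8 5.0 1.0 1 0 0 1 1 1 1 invented hit",
    "protC - Sugar_tr PF00083 2.2e-70 238.0 8.1 2.9e-70 237.6 8.1 1.0 1 0 0 1 1 1 1 invented hit",
]
putative_sts = filter_tblout(example_tblout)
print(sorted(putative_sts))
```

Counting the keys of the returned dictionary for each proteome would give the number of PF00083-containing proteins per species under this cutoff.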

Phylogenetic analysis of STs
A dataset was created including 64 putative STs of P. echinulatum, 85 putative ST sequences of A. niger CBS 513.88 [22], and 44 ST protein sequences of related fungi (only protein sequences reviewed or referenced in previous studies were included, for better tree visualization). In addition, 7 STs from A. thaliana [81] were used as an outgroup in the phylogenetic analysis. Detailed information about this dataset, containing 200 ST protein sequences, is available in Additional file A03.
The collected ST sequences were aligned using TM-Coffee [82], a transmembrane protein alignment tool. The sequence type parameter was set to transmembrane and the homology extension parameter was set to UniRef100. The CIPRES Science Gateway (v3.3) [77] was used to run RAxML-HPC2 (v8.2.8) [78]. The analysis workflow for the ST dataset was performed to obtain bootstrap support (BS), setting the PROTGAMMAWAG model and executing a Maximum Likelihood search followed by a thorough bootstrap with 500 iterations. The resulting tree was visualized and configured using iTOL [79].
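For readers who want to reproduce a comparable analysis outside CIPRES, the sketch below builds RAxML 8 command lines for the three stages named above. The exact invocation used through the CIPRES interface is not given in the text, so this is an assumption: "thorough bootstrap" is taken to mean the standard (non-rapid) bootstrap selected with `-b`, and the executable name, seed, file names and run names are hypothetical.

```python
def raxml_commands(alignment, run_name, model="PROTGAMMAWAG",
                   n_bootstraps=500, seed=12345, exe="raxmlHPC"):
    """Build argv lists for an ML search, a standard bootstrap, and support mapping."""
    # 1) Maximum Likelihood search (default hill-climbing algorithm, -f d).
    ml_search = [exe, "-f", "d", "-m", model, "-p", str(seed),
                 "-s", alignment, "-n", f"{run_name}_ml"]
    # 2) Standard bootstrap: -b sets the bootstrap seed, -# the replicate count.
    bootstrap = [exe, "-m", model, "-p", str(seed), "-b", str(seed),
                 "-#", str(n_bootstraps), "-s", alignment, "-n", f"{run_name}_bs"]
    # 3) Draw bootstrap bipartitions onto the best ML tree (-f b with -t and -z).
    map_support = [exe, "-f", "b", "-m", model,
                   "-t", f"RAxML_bestTree.{run_name}_ml",
                   "-z", f"RAxML_bootstrap.{run_name}_bs",
                   "-n", f"{run_name}_support"]
    return [ml_search, bootstrap, map_support]

# Hypothetical file and run names, for illustration only.
commands = raxml_commands("st_alignment.phy", "st_tree")
for argv in commands:
    print(" ".join(argv))
```

Returning argument lists rather than running the program keeps the sketch side-effect free; each list could be passed to subprocess.run on a machine where RAxML is installed.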

Ethics declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Funding
We are grateful to the Coordination for the Improve-

Authors' contributions
The authors contributed equally to this work. All authors read and approved the final manuscript.

Availability of data and material
All data generated or analyzed during this study are included in additional files or available in public databases. Genomic data of P. echinulatum are available in the NCBI database. The 2HH wild-type data were deposited under the accession numbers PRJNA520890 (BioProject); SRX6631912, SRX6631913 and SRX6631914 (SRA); and WIWU00000000 (WGS). The S1M29 mutant data were deposited under the accession numbers PRJNA521489 (BioProject); SRX6631956, SRX6631957 and SRX6631958 (SRA); and WIWV00000000 (WGS).

Figure 1 Stacked barplot comparing the number of putative proteins classified as PBDC in twelve filamentous fungi. Only proteins containing a signal peptide were counted. PBDC totals were grouped according to CAZy classes.

Additional file A02 - PBD enzyme assignments in P. echinulatum and eleven related fungi.
Additional file A03 - Identification and phylogenetic analyses of iBGLs and STs in P. echinulatum and eleven related fungi.