103 CsDof proteins were identified in C. sativa
To identify the Dof proteins encoded in the camelina genome, all 36 AtDof proteins from Arabidopsis were used as query sequences for a genome-wide BLASTp (Basic Local Alignment Search Tool) search of the publicly available C. sativa genome database. All candidate sequences were then validated against the hidden Markov model (HMM) profile of the Dof domain (PF02701). In total, 103 CsDof proteins were obtained (Additional file 1). These proteins were named CsDof1 to CsDof103, and their characteristics were examined, including gene locus ID, gene start and end positions on chromosomes, protein sequence length (SL), molecular weight (MW), isoelectric point (pI), and member classification. CsDof protein length ranged from 174 (CsDof36 and CsDof86) to 587 amino acids (CsDof56), with an average of 380 amino acid residues. The MWs ranged from 19.24 kDa (CsDof36) to 63.06 kDa (CsDof56), with an average of 41.15 kDa. The pIs ranged from 4.47 (CsDof1 and CsDof11) to 10.06 (CsDof55), with an average of 8.37, indicating that most CsDof proteins are basic. Notably, compared with the numbers of Dof family members reported in other plant species, the CsDof family, with 103 members, is one of the larger Dof families reported so far, suggesting that this TF family expanded extensively in the C. sativa genome during evolution.
Multiple sequence alignment and phylogenetic analysis of CsDof proteins
To ascertain the evolutionary relationships among CsDof proteins, a phylogenetic tree containing 36 AtDofs and 103 CsDofs was constructed with MEGA7.0 using the neighbor-joining method (Fig. 1). The CsDofs were classified into four groups (A, B, C and D), with 9 CsDofs in group A, 21 in group B, 43 in group C, and 30 in group D. Groups B, C and D were further divided into subgroups (B1, B2, C1, C2.1, C2.2, C3, D1 and D2). CsDof sequences randomly selected from groups A, B, C and D were then used for multiple sequence alignment. As shown in Fig. 2, all of these sequences contained a CX2C-X21-CX2C-type zinc finger domain, the hallmark of Dof proteins. The conserved domain consisted of 52 amino acid residues, of which four cysteine residues in the N-terminal region coordinate zinc.
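The CX2C-X21-CX2C spacing described above can be checked directly on a protein sequence. The following is a minimal sketch, not the procedure used in this study: the function name `find_zinc_finger` and the toy sequence are hypothetical, and the pattern tests only the cysteine spacing, not the full 52-residue Dof domain.

```python
import re

# CX2C-X21-CX2C: Cys, any 2 residues, Cys, any 21 residues, Cys, any 2 residues, Cys.
# This checks only the cysteine spacing, not the complete Dof domain.
DOF_ZINC_FINGER = re.compile(r"C.{2}C.{21}C.{2}C")


def find_zinc_finger(protein_seq):
    """Return (start, segment) of the first CX2C-X21-CX2C match, or None."""
    match = DOF_ZINC_FINGER.search(protein_seq.upper())
    if match is None:
        return None
    return match.start(), match.group()


if __name__ == "__main__":
    # Hypothetical toy sequence, built only to illustrate the spacing.
    toy = "MA" + "CAAC" + "G" * 21 + "CAAC" + "KK"
    print(find_zinc_finger(toy))
```

A sequence that returns `None` lacks the canonical cysteine spacing and would need manual inspection before being accepted as a Dof protein.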
Conserved motif and gene structure analysis of CsDof proteins
To reveal the functional regions of CsDof proteins, conserved motifs were predicted with the MEME (Multiple Em for Motif Elicitation) program, and a total of 10 conserved motifs (denoted Motif 1 to Motif 10) were detected across the 103 CsDof proteins (Fig. 3B and Additional file 2). Among these, Motif 1 (Fig. 3D and Additional file 2) corresponds to the conserved Dof domain and was found in all CsDof proteins. Two copies of Motif 1 were detected in each of CsDof91 and CsDof75. Motif 9 was predicted in most subgroups except B2 and C3; for example, the D2 subgroup contained both Motif 9 and Motif 1. Motifs 6 and 8 were detected only in the D1 subgroup, while Motifs 3 and 5 were found only in the C3 subgroup. In addition, Motifs 2 and 4 were mainly found in the C3 and D1 subgroups, respectively.
Analysis of CsDof gene structure (Fig. 3C) showed that most CsDof genes contained both coding regions (CDS) and untranslated regions (UTRs), whereas 17 CsDof genes, such as CsDof65, CsDof32 and CsDof58, lacked UTRs. The number of exons in CsDof genes ranged from one to seven, and the number of introns ranged from one to six. A total of 41 CsDof genes had only one exon. Introns occurred in 41 CsDof genes, such as CsDof8, CsDof66 and CsDof49, accounting for 39.8% of all CsDof genes. Furthermore, several CsDof genes, including CsDof39, CsDof65, CsDof70 and CsDof75, showed the characteristics of split genes.
Uneven distribution of 103 CsDof genes on 20 chromosomes
To determine the chromosomal distribution of CsDof genes, all identified CsDof genes were mapped onto chromosomes using TBtools. A total of 102 CsDof ORFs were located on the 20 chromosomes of camelina (Fig. 4), including two alternatively spliced variants (CsDof39 and CsDof40) produced by a single gene locus (Csa07g036310). One Dof gene (CsDof1) did not map to any chromosome but was located on an unplaced scaffold (Scaffold00734). Chromosome 11 carried the most CsDof genes (13), whereas Chromosomes 1, 15, 19 and 20 each carried only one. The remaining chromosomes each carried two to eight CsDof genes. Thus, CsDof genes are unevenly distributed across the chromosomes.
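The per-chromosome tally described above can be illustrated with a short script. This is a sketch under one stated assumption: the two digits after `Csa` in a locus ID (for example `Csa07g036310`) are taken to give the chromosome number, which is inferred from the IDs quoted in this paper and is not a documented rule. The function name `chromosome_counts` is hypothetical, and any identifier that does not match the pattern is counted as unplaced.

```python
import re
from collections import Counter

# Assumption: "CsaNNg..." encodes the chromosome number in NN.
LOCUS_PATTERN = re.compile(r"^Csa(\d{2})g\d+$")


def chromosome_counts(locus_ids):
    """Count ORFs per chromosome; IDs that do not match are 'unplaced'."""
    counts = Counter()
    for locus in locus_ids:
        match = LOCUS_PATTERN.match(locus)
        key = f"Chr{int(match.group(1))}" if match else "unplaced"
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Locus IDs quoted in the text; Csa07g036310 appears twice because it
    # yields two splice variants (CsDof39 and CsDof40). "Scaffold00734"
    # stands in for the unplaced CsDof1, whose locus ID is not given here.
    example = ["Csa07g036310", "Csa07g036310", "Csa10g022470",
               "Csa12g037530", "Scaffold00734"]
    for chrom, n in sorted(chromosome_counts(example).items()):
        print(chrom, n)
```

Counting ORFs rather than unique loci mirrors the way the 102 mapped ORFs are reported above; deduplicating the input list first would give a per-locus count instead.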
Segmental duplication and comparative syntenic maps
Gene family expansion is generally attributed to duplication events affecting the entire family or some of its members. Compared with other species, including the model plants Arabidopsis and rice, C. sativa contains more Dof genes, indicating that the CsDof family has expanded greatly. To elucidate the mechanism responsible for this expansion, TBtools and MCScanX were used to examine the duplication patterns of CsDof genes, including tandem and segmental duplication. A total of 83 segmentally duplicated gene pairs (Fig. 5 and Additional file 3) were identified in the CsDof gene family, whereas no tandem duplication was detected. These results suggest that segmental duplication may be the main driving force behind the evolution of the CsDof gene family.
To investigate which type of selection pressure (purifying or positive) has acted on the CsDof gene family, the synonymous substitution rate (Ks), the non-synonymous substitution rate (Ka) and the Ka/Ks ratio of the duplicated CsDof gene pairs were calculated using KaKs Calculator 2.0 (Additional file 3). Only one duplicated gene pair, Csa10g022470/Csa12g037530, had a Ka/Ks value greater than 1 (1.061), whereas all other duplicated CsDof gene pairs had Ka/Ks values below 0.8, indicating that the CsDof gene family has evolved mainly under purifying (negative) selection.
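The interpretation of Ka/Ks used above follows the usual convention: a ratio above 1 points to positive selection, a ratio below 1 to purifying selection, and a ratio of 1 to neutral evolution. The sketch below applies that convention to already computed Ka and Ks values; it does not reproduce KaKs Calculator 2.0. The function name `classify_selection` and the Ka and Ks inputs in the example are hypothetical, chosen only so that the ratio equals the 1.061 reported for Csa10g022470/Csa12g037530.

```python
def classify_selection(ka, ks):
    """Return (Ka/Ks, label) using the conventional thresholds around 1."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if ratio > 1:
        label = "positive selection"
    elif ratio < 1:
        label = "purifying selection"
    else:
        label = "neutral evolution"
    return ratio, label


if __name__ == "__main__":
    # Hypothetical Ka and Ks values; only the resulting ratio (1.061)
    # comes from the text.
    ratio, label = classify_selection(ka=0.1061, ks=0.1)
    print(f"Ka/Ks = {ratio:.3f}: {label}")
```

In practice a ratio only slightly above 1, such as 1.061, is weak evidence on its own, which is why the family-level conclusion rests on the many pairs with ratios well below 1.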
To further explore the origin and evolution of CsDof genes, comparative synteny analysis was performed between C. sativa and two closely related species, Arabidopsis thaliana and Brassica napus. As shown in the comparative synteny maps (C. sativa vs. A. thaliana and C. sativa vs. B. napus), orthologous relationships were detected between 84 CsDof genes and 31 AtDof genes, giving 84 orthologous Dof gene pairs (Fig. 6A and Additional file 4), all of which were located in syntenic regions of the Arabidopsis and C. sativa chromosomes. Remarkably, multiple CsDof genes were identified as putative orthologs of a single AtDof gene; for example, Csa14g009010 and Csa03g011080 were both identified as orthologs of AT1G07640. These results suggest that the expansion of the CsDof gene family may have occurred after the divergence of C. sativa from Arabidopsis. In addition, 71 BnaDof genes, located on chromosomes A01-A10 and C01-C09 of B. napus (Additional file 5), were selected for collinearity analysis with C. sativa. Orthologous relationships were detected between 59 CsDof genes and 44 BnaDof genes, giving 115 orthologous Dof gene pairs between the two species (Fig. 6B and Additional file 6).
Multiple response elements exist in the CsDof promoter region
To better understand the transcriptional regulation of CsDof genes, the 2000-bp promoter region upstream of the start codon of each CsDof gene was analyzed for cis-elements using the PlantCARE database. A variety of cis-acting elements were identified (Table 1 and Additional file 7), including elements related to growth and development, stress response, light response and hormone response. Strikingly, 70% and 98% of the CsDof genes contained light- and hormone-responsive elements in their promoters, respectively, indicating that most CsDof genes may take part in light- or hormone-induced processes. Moreover, several MYB binding sites, including MBS, MBSI, MRE and CCAAT-box, which are associated with drought, photoresponse and flavonoid biosynthesis, were identified in many CsDof gene promoters, suggesting that these CsDof proteins may interact with MYB TFs to participate in the regulation of drought stress response, photoperiod and flavonoid biosynthesis. In addition, 62% of the CsDof genes had stress response elements in their promoters, suggesting that CsDof genes are likely to be involved in stress responses. Interestingly, nine CsDof genes possessed seed-specific regulatory elements (RY-element, CATGCA(TG)): CsDof10, CsDof12, CsDof19, CsDof32, CsDof47, CsDof51, CsDof52, CsDof55 and CsDof89. Except for CsDof19 and CsDof47, the remaining seven genes were expressed to varying degrees during seed development (Fig. 7B). It is therefore assumed that these CsDof genes play an important role in seed formation and the accumulation of storage compounds. Notably, the RY-element is an important target of the seed-specific TFs ABI3 and FUS3, and it is hypothesized that ABI3 and FUS3 can interact with Dof members.
Expression patterns of CsDof genes in various C. sativa tissues
Gene expression patterns can provide important clues to the biological function of a gene of interest. To investigate the expression patterns of CsDof genes, publicly available expression data (FPKM values) were analyzed for twelve C. sativa tissues (Additional file 8): root (R), stem (S), young leaf (YL), mature leaf (OL), flower (F), inflorescence (IF), early seed development (ESD), early-mid seed development (EMSD), mid-late seed development (LMSD), late seed development (LSD), germinating seed (GS) and cotyledons (C). As shown in the heat map in Fig. 7A, most CsDof genes (up to 90%) were expressed in multiple organs, although at different levels, with particularly high expression in roots and stems. Tissue-specific expression was also detected for a number of CsDof genes. For example, several CsDof genes, including CsDof12, CsDof80 and CsDof91, were highly expressed only during seed germination. Moreover, during seed development (Fig. 7B), most CsDof genes, including CsDof52, CsDof57, CsDof58 and CsDof71, showed specifically high expression at the early stage, whereas a small number, such as CsDof9 and CsDof67, were highly expressed at the middle and late stages. These expression analyses indicate that most CsDof genes may function constitutively during plant growth and development, while a number of them may act differentially in various tissues/organs and at different stages of seed development.
CsDof TFs may regulate expressions of the genes related to lipid biosynthesis
Dof TFs specifically recognize and bind the A/TAAAG motif (CTTTA/T on the complementary strand) to activate or regulate transcription of their target genes. To determine whether CsDof TFs may regulate the expression of lipid-related genes, the 2000-bp upstream promoter sequences of a number of lipid-related genes were scanned for cis-elements. As shown in Table 2, seven enzyme genes related to oil biosynthesis contained the Dof-binding motif in their promoters. These seven lipid-related genes included three fatty acid desaturases (ROD1, FAD3 and SAD6), a long-chain acyl-CoA synthetase (LACS3), and two acyltransferases (PDAT1 and DGAT1). Further analysis of the expression patterns of these enzyme genes and several CsDof genes revealed that the expression of the enzyme genes was strongly positively or negatively correlated with the FPKM values of the CsDof genes (Fig. 8). It is therefore hypothesized that a few CsDof TFs may directly regulate the expression of these seven lipid-related enzyme genes in C. sativa and consequently mediate oil biosynthesis and accumulation.
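A promoter scan for the Dof-binding core described above can be sketched as a search for A/TAAAG on the given strand and for its reverse complement, CTTTA/T, which marks a site on the opposite strand. This is an illustration only and not the tool used to produce Table 2; the function name `find_dof_sites` and the example sequence are hypothetical.

```python
import re

# Lookahead groups allow overlapping occurrences to be reported.
DOF_CORE_FORWARD = re.compile(r"(?=([AT]AAAG))")   # motif on the given strand
DOF_CORE_REVERSE = re.compile(r"(?=(CTTT[AT]))")   # motif on the opposite strand


def find_dof_sites(promoter):
    """Return sorted (position, strand, sequence) tuples, 0-based positions."""
    seq = promoter.upper()
    sites = []
    for match in DOF_CORE_FORWARD.finditer(seq):
        sites.append((match.start(), "+", match.group(1)))
    for match in DOF_CORE_REVERSE.finditer(seq):
        sites.append((match.start(), "-", match.group(1)))
    return sorted(sites)


if __name__ == "__main__":
    # Hypothetical 16-bp fragment, not taken from any CsDof target gene.
    for site in find_dof_sites("GGTAAAGCCCTTTTGG"):
        print(site)
```

Because the core motif is only five bases long, matches are expected by chance in any 2000-bp promoter, so a hit supports but does not establish regulation by a Dof TF.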
Several CsDof genes may mediate plant responses to salt stress
To identify CsDof TFs that may mediate stress responses, six candidate CsDof genes were selected and their expression profiles were examined in C. sativa seedlings under salt stress (150 mM NaCl). The six genes were chosen based on transcriptome data (FPKM values) from salt-treated C. sativa seedlings (Additional file 9). Their temporal expression profiles in shoots and roots under salt treatment were then determined by quantitative RT-PCR (qRT-PCR) (Fig. 9). Overall, the six CsDof genes were more responsive in shoots than in roots. In shoots, the relative expression of all six genes tended to increase first and then decrease. With the exception of CsDof54, which reached its maximum relative expression at 6 h after salt treatment, the other five genes (CsDof27, 60, 63, 83 and 95) all peaked at 12 h. In roots, four genes (CsDof27, 60, 83 and 95) showed expression trends similar to those in shoots, also peaking at 12 h after salt treatment, but at much lower levels than in shoots. The other two genes (CsDof54 and 63) showed an increase at 12 h and a decrease at 24 h after salt treatment, and then rose to their maximum relative expression at 48 h. These results suggest that the six CsDof genes may positively regulate the plant response to salt stress, with stronger effects in shoots than in roots.