3.1 Evolution of cucumber SBP family genes
The SBP gene family plays an important role in various stages of plant development, including stem growth[22], leaf development[35], flowering[14], fruit ripening[36] and signal transduction[21]. Phylogenetic analyses of the SBP-box gene family have been reported for multiple plant species, including Arabidopsis (16 members), rice (19 members), Betula luminifera (18 members)[25], common bean (23 members)[37], grape (19 members)[38], Ziziphus jujuba (16 members)[39] and Petunia (21 members)[40]. In this study, we identified 15 SBP family genes in cucumber, all of which contain a conserved SBP domain. To investigate the homology of SBP family genes among cucumber, Arabidopsis and rice, we constructed a phylogenetic tree, in which the SBP genes clustered into six groups (Group I to Group VI) (Figure 2). Preston divided the SBP family genes of 9 species into eight clades[13]. In the present study, splitting Group IV and Group VI at their first node into Group IV-1, Group IV-2, Group VI-1 and Group VI-2 yields the same eight clades as that earlier study. However, Group IV-2 contains no CuSBP gene.
A recent study proposed that the SBP-box genes of land plants arose through duplication events from one common ancestor. Based on the timing of these duplications, SBP-box genes are divided into group 1, group 2-1 and group 2-2, among which group 2-2 contributed most to the expansion of the SBP-box family[13]. Group I in the present study corresponds to group 1, the earliest-formed lineage. Group V corresponds to group 2-1, the subgroup of the second group that retains evolutionary features more similar to group 1 than group 2-2 does. We further constructed a phylogenetic tree of the cucumber SBP genes and analyzed their conserved motifs and introns (Figures S1 and S2). The number of introns in the CuSBP genes ranged from 2 to 10, and the number of motifs from 4 to 13. Genes assigned to the same subgroup have similar numbers, types and distributions of motifs and introns; for example, CuSBP14 and CuSBP1a in Group V have identical conserved motifs, which further supports the grouping in the phylogenetic tree. Some genes have different isoforms, which may result from paralogous genes on different chromosomes or from alternative splicing. For example, CuSBP13a, CuSBP13b and CuSBP13c are assigned to the same subgroup, whereas CuSBP1a, CuSBP1b and CuSBP1c are assigned to different subgroups.
3.2 Expression pattern and potential function analysis of CuSBP genes
SBP-box genes encode plant-specific transcription factors that play important roles in plant growth and development. They usually show high expression in panicles, flower buds and shoot apices in the species studied[41-43]. Some SBP family genes are also involved in biotic and abiotic stress responses[4, 25]. Currently, few functional data are available for SBP-box genes in cucumber.
To explore the potential functions of CuSBPs in cucumber, we used two sets of transcriptome data (flowering and powdery mildew, PM). Flowering marks the transition from the vegetative phase to the reproductive phase and is therefore crucial for plant reproduction. It is thus important to study the expression patterns of CuSBP family genes during cucumber flowering and to uncover their potential functions in this process, which could provide resources for future breeding to improve cucumber yield. Based on their expression patterns across flower development stages, the CuSBP genes could be divided into two groups: one with lower expression during flowering (CuSBP8/6/1b/13a/7b/13b) and one with higher expression (CuSBP3/12/16/7a/9/1c/14/13c/1a). We speculate that genes with low expression during flowering may be involved in vegetative growth. CuSBP8, CuSBP1b and OsSBP10 are members of Group III, and their expression patterns are similar. OsSBP10 is highly expressed in rice seedlings and young spikelets[44]. AtSPL13, a homolog of CuSBP13, is highly expressed in the hypocotyl, shoot apical meristem, leaf primordia and developing inflorescence, and disruption of AtSPL13 regulation delays the post-germinative transition from the cotyledon stage to the vegetative-leaf stage[35]. In contrast, CuSBP14 and CuSBP1a, both in Group V, are highly expressed throughout flowering and may play a key role in this process. The fbr6 mutant of AtSBP14, their Arabidopsis homolog, showed a later transition to flowering than the wild type[23]. Group II contains CuSBP1c, AtSPL3/4/5 and OsSPL13. In Arabidopsis, recognition of the SPL3 3' UTR by miR156/157 prevents early flowering[16]. SOC1 (SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1) integrates photoperiod and GA signals to promote flowering via SPL3/4/5[19]. AmSBP1, a homolog of CuSBP1c, also acts in the initiation of flower development[45]. OsSPL13 was associated with spikelet size in cultivated rice in a GWAS analysis[46]. CuSBP1c expression decreases as the flower buds develop, suggesting a role similar to that of AtSPL3/4/5. In Group V, CuSBP9 expression also declines as flowering progresses. AtSPL2 has been found to regulate ASYMMETRIC LEAVES 2 (AS2) to control floral organ development[36]. OsSPL14 controls shoot branching at the vegetative stage under miRNA regulation[26], pointing to a potential role for CuSBP9 in the juvenile-to-adult vegetative transition.
Powdery mildew (PM), caused mainly by Podosphaera fusca[47], is one of the most damaging diseases of cucumber worldwide. It seriously impairs photosynthesis and disturbs metabolism, resulting in premature senescence and reduced yield. It is therefore important to study the roles of CuSBP family genes in powdery mildew resistance. In Group V, CuSBP14 and CuSBP1a maintained high expression levels in both cultivars. After PM treatment, the expression levels of both genes were lower in SSL508-28 than in D8. AtSBP14, the Arabidopsis homolog of CuSBP14, is associated with sensitivity to the programmed cell death (PCD)-inducing fungal toxin FB1. This suggests that CuSBP14 and CuSBP1a may be involved in the negative regulation of biotic stress responses. CuSBP7a has a lower expression level in SSL508-28 than in D8; AtSBP7, OsSBP9 and CuSBP7a all belong to Group I. AtSPL7 and OsSPL9 are considered functional genes in copper regulation pathways, and loss-of-function mutations in SPL9 enhance plant resistance to rice stripe virus[35, 48]. Both CuSBP1b and CuSBP13a were strongly up-regulated in SSL508-28 after PM treatment, suggesting that these two genes may positively regulate powdery mildew resistance. CuSBP12 was down-regulated in both cultivars after PM treatment, whereas CuSBP6, its closest homolog, was not detected in this dataset. AtSBP6, a homolog of CuSBP6/12, positively regulates defense genes and the plant innate immune system. NbSPL6 is essential for N-mediated resistance to Tobacco mosaic virus[24]. These observations also suggest that homologous genes in different species may have diverged in function, although their specific functions remain to be studied.
3.3 SBP-box genes had high codon usage bias in cucumber
Codon usage bias reflects genetic information hidden in the RNA sequence and offers insight into the evolutionary history of genes in an organism. In the present work, we found high codon usage bias in cucumber SBP-box genes.
ENC (effective number of codons) is a widely accepted measure of the degree of codon bias. Previous studies demonstrated that gene expression is negatively correlated with ENC[49, 50]; in other words, important, highly expressed genes tend to have low ENC values. The ENC values of the CuSBPs range from 49 to 61, with a mean of 53.2, which indicates high codon bias. The lowest ENC value among them is 49, suggesting that the corresponding genes are under strong selection and are well functionalized, consistent with the discussion above. Group III members have the three highest ENC values, suggesting that their roles may be replaceable. In the ENC plot, notably, the CuSBPs have ENC values lower than expected, rejecting the null hypothesis of no selection. All gene points except one lie below the expected curve, and the genes are not narrowly distributed in the plot, which shows that both mutation pressure and selection shape the codon usage pattern, supporting the theory of an equilibrium between mutation and selection[51].
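As a minimal sketch of how the expected curve of an ENC plot can be computed, the Python snippet below uses the standard expectation ENC = 2 + s + 29/(s^2 + (1 - s)^2), where s is the GC3s fraction of a gene. The function names `expected_enc` and `below_curve` and the example values are illustrative assumptions, not the pipeline used in this study.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage is shaped by GC3s composition alone.

    gc3s is a fraction between 0 and 1 (assumed input, e.g. from a codon
    usage tool). The formula is the standard ENC-plot expectation.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def below_curve(enc_observed, gc3s):
    """True when an observed ENC value lies below the expected curve."""
    return enc_observed < expected_enc(gc3s)


if __name__ == "__main__":
    # Hypothetical gene: observed ENC of 49 at GC3s = 0.5 (illustrative only).
    print(expected_enc(0.5), below_curve(49.0, 0.5))
```

A gene whose observed ENC falls below `expected_enc(gc3s)` is one of the points lying below the curve in the plot described above.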
In the neutrality plot, the correlation between GC12 and GC3s is not significant (P > 0.05) and the slope of the regression line is close to 0. The weak correlation between GC12 and GC3s suggests high mutation bias or low conservation of GC content, which means that mutation pressure rather than translational selection plays the major role.
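The quantities behind a neutrality plot can be sketched as follows. This is a simplified illustration under stated assumptions: the third-position GC content is computed over all codons (true GC3s would exclude Met, Trp and stop codons), the regression is an ordinary least-squares fit, and the helper names `gc_by_position` and `neutrality_fit` are hypothetical.

```python
def gc_by_position(cds):
    """Return (GC12, GC3) for one in-frame coding sequence.

    GC12 is the mean GC fraction of codon positions 1 and 2;
    GC3 is the GC fraction at position 3 (simplified stand-in for GC3s).
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence has no complete codon")

    def gc(pos):
        return sum(codon[pos] in "GC" for codon in codons) / len(codons)

    return (gc(0) + gc(1)) / 2.0, gc(2)


def neutrality_fit(points):
    """Least-squares slope and intercept of GC12 (y) regressed on GC3 (x).

    points is a list of (GC12, GC3) tuples, one per gene.
    """
    xs = [gc3 for _, gc3 in points]
    ys = [gc12 for gc12, _ in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x
```

Applying `gc_by_position` to each coding sequence and passing the resulting list to `neutrality_fit` gives the regression slope referred to in the text; a significance test for the correlation would need an additional statistics routine.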
RSCU (relative synonymous codon usage) is commonly used to analyze synonymous codon usage (Table S4). The most abundantly used codons are A/T-ended, as a result of compositional constraints (i.e., A and T)[52], which are part of the mutation pressure. For subsequent research, if transgenic SBP-box genes are to be expressed in cucumber, the inserted sequence can be optimized according to the preferred codon patterns identified in this study.
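For readers who wish to reproduce this kind of table, RSCU can be sketched in a few lines of Python. RSCU for a codon is its observed count divided by the count expected if all synonymous codons of that amino acid were used equally. The snippet assumes the standard genetic code and an in-frame coding sequence; the function name `rscu` and the toy sequence are illustrative and are not taken from this study.

```python
from collections import Counter

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for amino acids observed in the sequence.

    Stop codons and single-codon amino acids (Met, Trp) are skipped,
    as are amino acids that do not occur in the sequence.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    # Toy alanine-only sequence: GCT x3, GCC x1 (illustrative only).
    print(rscu("GCTGCTGCTGCC"))
```

Codons with RSCU above 1 are used more often than expected under equal usage and are the natural candidates when adapting an inserted sequence to a host's preferred codons.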