Genome-wide characterization of SPL gene family in Codonopsis pilosula reveals the novel roles of CpSPL2 and CpSPL10 in promoting the accumulation of secondary metabolites and growth of C. pilosula hairy root

Background SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) transcription factors play critical roles in regulating diverse aspects of plant growth and development, including vegetative phase change, plant architecture, anthocyanin accumulation, and lateral root growth. Codonopsis pilosula is a well-known medicinal plant, and its dried root, named Dangshen, is one of the most widely used traditional Chinese medicines. However, little information about SPL genes in this species has been reported.


Background
Transcription factors (TFs) function in various physiological and developmental processes by activating and/or repressing the transcription of multiple target genes [1]. They are usually divided into families according to the sequences of their DNA-binding domains and other conserved motifs [2]. SQUAMOSA-promoter binding protein-like (SPL or SBP) TFs are exclusive to plants and are characterized by a highly conserved SBP domain and a nuclear localization signal (NLS) at the C-terminus. The SBP domain is approximately 76 amino acids long and includes two zinc-binding sites (one zinc finger is C3H or C4, and the other is C2HC) essential for DNA binding, and the NLS partially overlaps with the second zinc finger [3][4][5][6]. AmSBP1 and AmSBP2 from Antirrhinum majus were the first SBP-domain proteins discovered in plants; they were found to bind to the promoter of the floral meristem identity gene SQUAMOSA, hence the name [3]. Since then, SPL genes have been identified in many plant lineages, including single-cell algae, mosses, gymnosperms, and angiosperms [7]. With the rapid development of high-throughput sequencing technology, more and more plant genome data have been released, and the SPL gene family has been identified genome-wide in model and non-model plants, including Arabidopsis thaliana [8], Oryza sativa [9], Glycine max [10], Solanum lycopersicum [11], Malus domestica [12], Salvia miltiorrhiza [13], Vitis vinifera [14], Phyllostachys edulis [15], Capsicum annuum [16], and Ricinus communis [17].
The functions of SPL genes have been well characterized in the model plant Arabidopsis, where they play important regulatory roles in diverse developmental processes, including the vegetative-to-reproductive phase transition, the cotyledon-to-vegetative-leaf transition, micro- and megasporogenesis, trichome formation, stamen filament elongation, axillary bud formation, and lateral root growth [18][19][20][21][22][23]. In addition, they are involved in copper homeostasis, abiotic stress response, immune response, and the production of secondary metabolites [24][25][26][27]. The functions of SPL genes from other species have also been identified. In rice, OsSPL14 promotes panicle branching and grain productivity, and OsSPL16 regulates grain yield and quality [28,29]. FvSPL10 from strawberry (Fragaria vesca) not only promotes early flowering but also increases organ size, resulting in longer roots and larger floral organs and seeds [30]. As members of a plant-specific gene family, some SPL genes are important candidates for improving plant agronomic traits by genetic engineering.
Codonopsis pilosula is a member of the Campanulaceae family. Its dried root, named "Dangshen" in Chinese, is one of the most widely used traditional Chinese medicines for replenishing qi (vital energy), strengthening body immunity, improving appetite, promoting gastrointestinal function, reducing blood pressure, and curing gastric ulcers [31]. In addition, Dangshen is a well-known health-care food in China and is listed in the "Food and Drug Homology Catalogue" approved by the National Health Commission of the People's Republic of China. Consequently, the demand for Dangshen is growing, and the yield and accumulation of bioactive metabolites are attracting more and more attention in the cultivation field [32,33]. Lobetyolin, alkaloids, polysaccharides, and saponins are the major active ingredients in Dangshen and are responsible for most of the pharmacological functions of the medicine [34]. Lobetyolin, a general marker compound in Dangshen, has been widely reported to exert multiple bioactivities, including anti-cancer, antiviral, anti-inflammatory, anti-oxidative, mucosal protective, and xanthine oxidase inhibiting properties [35,36].
Although the chemical constituents of C. pilosula and their pharmacological activities have received great attention, studies of this species at the genetic level are lagging behind, and only a few reports have involved genes in C. pilosula [37][38][39][40]. Until now, the SPL gene family has not been reported in C. pilosula. Most recently, we developed an efficient Agrobacterium rhizogenes-mediated transformation approach for producing transgenic hairy roots in this species [39], which lays a good foundation for its genetic engineering. Here, we identified 15 SPL genes based on the genome sequence of C. pilosula (data unpublished). The gene structure, conserved motifs, predicted miR156 targets, and cis-acting elements of the 15 CpSPLs were systematically analyzed. Their spatiotemporal expression profiles in different tissues and their expression patterns under various conditions (NaCl, MeJA, and ABA treatment) were analyzed by qRT-PCR. Furthermore, we obtained transgenic hairy roots overexpressing CpSPL2 or CpSPL10 and observed a significant increase in biomass and in the concentrations of total saponins and lobetyolin. As far as we know, this is the first experimental research on gene function in this species. These findings demonstrate that CpSPL2 and CpSPL10 positively regulate the growth of hairy roots and the accumulation of active ingredients, and thus have great potential for improving the yield and quality of Dangshen.

Identification of SPL genes in C. pilosula and bioinformatic analysis
The sequences of the SBP domain (ID: PF03110), which were downloaded from the Pfam database (http://pfam.xfam.org/), were used to search for possible SPL genes in the C. pilosula genome sequences (data unpublished) by HMMER (http://hmmer.org) with an E-value < 1e-10. A total of 15 CpSPL genes containing a complete SBP domain were identified.
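The E-value cutoff used in the HMMER search can be illustrated with a minimal Python sketch. It assumes the search results were saved in HMMER's tabular per-sequence format (`hmmsearch --tblout`), where the fifth whitespace-separated column is the full-sequence E-value; the function name, gene identifiers, and numbers in the sample are hypothetical and are not taken from the study.

```python
def filter_hits(tblout_text, max_evalue=1e-10):
    """Return (target, E-value) pairs whose full-sequence E-value is below the cutoff.

    Assumes HMMER --tblout layout: target name, target accession,
    query name, query accession, full-sequence E-value, ...
    """
    hits = []
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target = fields[0]
        evalue = float(fields[4])
        if evalue < max_evalue:
            hits.append((target, evalue))
    return hits


# Hypothetical example rows (identifiers and scores are made up).
SAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
Cp_gene_001    -          SBP         PF03110.17 3.2e-35  120.1  0.3
Cp_gene_002    -          SBP         PF03110.17 4.0e-05   22.4  0.1
Cp_gene_003    -          SBP         PF03110.17 1.5e-12   45.9  0.0
"""

if __name__ == "__main__":
    for name, evalue in filter_hits(SAMPLE):
        print(name, evalue)
```

In this sketch, only the first and third hypothetical rows pass the 1e-10 threshold; candidates retained this way would still need to be checked for a complete SBP domain, as described above.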
The online analysis software psRNATarget (http://plantgrn.noble.org/psRNATarget/analysis?function=3) was used to predict CpSPL genes directly targeted by miR156, with a maximum expectation value of 3.0 and a UPE value of 16. MEGA X software (https://mega.nz/) was used to construct the phylogenetic tree of 31 full-length SPL amino acid sequences, 15 from C. pilosula and 16 from A. thaliana, using the maximum-likelihood (ML) method with 1000 bootstrap replicates. Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/) was used for gene structure analysis. The MEME program (http://meme-suite.org/) was used to identify the conserved motifs. The cis-acting elements in the promoter regions of the 15 CpSPL genes (2000 bp upstream of the translation initiation codon "ATG") were analyzed online (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/).
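Extracting a promoter region upstream of the start codon depends on the strand of the gene, which the following minimal Python sketch illustrates. The function name, the 0-based coordinate convention, and the toy sequences are assumptions made for illustration; the study does not describe how the 2000 bp regions were extracted.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_region(chrom_seq, atg_pos, strand, length=2000):
    """Return up to `length` bp immediately upstream of the start codon.

    chrom_seq : forward-strand sequence of the chromosome or scaffold
    atg_pos   : 0-based forward-strand index of the 'A' of the ATG
                (for a minus-strand gene this is the highest coordinate
                of the codon)
    strand    : '+' or '-'
    The result is always written 5'->3' relative to the gene.
    """
    if strand == "+":
        start = max(0, atg_pos - length)
        return chrom_seq[start:atg_pos]
    if strand == "-":
        # Upstream of a minus-strand gene lies at higher forward coordinates.
        upstream = chrom_seq[atg_pos + 1:atg_pos + 1 + length]
        return reverse_complement(upstream)
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    # Toy sequences; a real run would use length=2000 on genome scaffolds.
    print(promoter_region("CCCCGGATGAAA", 6, "+", length=4))
    print(promoter_region("TTTCATGACC", 5, "-", length=4))
```

Slicing clips the region automatically when a gene lies closer than `length` bp to the end of a scaffold, so such promoters come back shorter than 2000 bp rather than raising an error.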

Plant materials and treatments
Seeds of C. pilosula were collected from Gansu Province, China. The botanical origin of the materials was identified by Professor ZheZhi Wang of Shaanxi Normal University. Specimens of the seeds were deposited in the herbarium of the National Engineering Laboratory for Resource Development of Endangered Crude Drugs in Northwest of China, Shaanxi Normal University, Xi'an, China. The seed owner permitted the collection. The seeds of C. pilosula were germinated and incubated according to the method that we described previously [39].
For spatiotemporal gene expression analysis, the leaves, stems, and roots were collected separately from one-, two-, and three-month-old seedlings, and the flower and calyx were collected from plants at the flowering stage. To test the responses of CpSPLs to hormonal and stress treatments, two-week-old seedlings were treated with 200 mmol NaCl, 200 µmol MeJA, or 100 µmol ABA, as we have described previously [40]. The control group was treated with the same amount of ddH2O, and all the samples were collected 6 h after treatment.

Gene expression analysis
For qRT-PCR analysis, total RNA was extracted and then reverse transcribed into cDNA as we described previously [40]. All the primer sequences used for qRT-PCR are listed in Table S1, and CpGAPDH was used as the internal control [40]. The relative expression levels of the 15 CpSPLs were calculated according to the method described by Livak and Schmittgen (2001) [41]. All the experiments included three biological and three technical replicates.
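The method of Livak and Schmittgen is the 2^(-ΔΔCt) calculation, which can be sketched in Python as follows. The function and variable names and the Ct values are illustrative assumptions, not data from this study; the sketch also assumes that the technical replicates are averaged before the calculation.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a mean Ct value:
      target gene vs. reference gene (e.g. CpGAPDH),
      in the treated/test sample vs. the control sample.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt, test sample
    delta_control = ct_target_control - ct_ref_control   # dCt, control
    delta_delta = delta_sample - delta_control           # ddCt
    return 2 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical technical-replicate Ct values for one biological replicate.
    target_treated = mean([22.1, 21.9, 22.0])
    ref_treated = mean([18.0, 18.1, 17.9])
    target_control = mean([24.0, 24.1, 23.9])
    ref_control = mean([18.0, 17.9, 18.1])
    fold = relative_expression(target_treated, ref_treated,
                               target_control, ref_control)
    print(f"fold change relative to control: {fold:.2f}")
```

A value above 1 indicates higher expression than in the control and a value below 1 indicates lower expression; the calculation assumes roughly equal amplification efficiencies for the target and reference genes.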

Vector construction and hairy root transformation
The complete open reading frames of CpSPL2 and CpSPL10 were amplified by PCR using the specific primer pairs CpSPL2-F/R and CpSPL10-F/R (Table S1), respectively, with the following PCR conditions: 98 ℃, 3 min; 30 cycles of 98 ℃, 10 s, 58 ℃, 30 s, 72 ℃, 45 s; 72 ℃, 5 min. The products were then digested with Pac I and Asc I and ligated into pMDC85 to generate the overexpression (OE) vectors pMDC85-CpSPL2 and pMDC85-CpSPL10.
Transgenic hairy roots overexpressing CpSPL2 or CpSPL10 were obtained by the Agrobacterium-mediated method according to the protocol established in our lab and selected on MS medium containing 2 mg/L hygromycin [39]. In parallel, pMDC85 was introduced into C. pilosula as the empty vector control (EV). Every transgenic line was excised and sub-cultured separately as we described previously [39]. Five independent CpSPL2-OE lines, seven CpSPL10-OE lines, and four EV lines were obtained and then confirmed by genomic DNA PCR using primers hptII-F/R (Table S1) for the hygromycin phosphotransferase II gene (hptII), followed by expression analysis of CpSPL2 or CpSPL10 by qRT-PCR.

Determination of lobetyolin and total saponins
Transgenic hairy roots sub-cultured for one month were used for determination of lobetyolin and total saponins.
To determine the concentration of lobetyolin, we ground the dried hairy roots into powder and extracted the powder three times with 10 mL methanol by sonication (50-150 W) in an ultrasonic bath (Kunshan Instrument Co., Ltd., China) for 30 min, 20 min, and 15 min, respectively. The extracts were combined, the methanol was evaporated, and the residue was redissolved in methanol and made up to volume in a 5 mL volumetric flask. After filtration through a 0.22 µm microporous membrane, the solution was used for HPLC analysis on a Shimadzu LC-20A instrument (Shimadzu, Japan) equipped with an Agilent 5 TC-C18 column (250 mm × 4.6 mm, 5 µm). The mobile phase consisted of ultrapure water (A) and methanol (B), and the gradient condition was 0-5 min, 20-40% B; 5-10 min, 40-70% B; 10-12 min, 79-90% B; 12-25 min, 90% B. The separation was performed at 30 ℃, with a flow rate of 1.0 mL/min and a UV detection wavelength of 220 nm.
The concentration of total saponins in transgenic hairy roots was determined as we described previously [39].

Statistical analysis
All the experiments and data presented here involved at least three biological replicates. SPSS version 20.0 software (SPSS Inc., Chicago, IL, USA) was used for statistical evaluation. The error bars indicate standard deviation. Differences between mean values were considered significant at P < 0.05.

Genome-wide identification and sequence feature analysis of CpSPLs
To identify possible SPL genes in the C. pilosula genome sequences, we used the SBP domain (PF03110) to search the databases by HMMER. A total of 15 SPLs containing a complete SBP domain were identified based on the genome sequence of C. pilosula, and their cDNA sequences are listed in Table S2. With reference to the homologous AtSPLs in Arabidopsis, the 15 CpSPLs were named CpSPL1 to CpSPL15. The deduced CpSPL proteins varied greatly in molecular weight (MW), ranging from 17.98 kDa (CpSPL5) to 119.80 kDa (CpSPL14). Similarly, the lengths of the CDS varied among the CpSPLs, from 480 bp (CpSPL5) to 3276 bp (CpSPL14). The detailed information, including the gene length, intron number, protein length, predicted MW, and theoretical isoelectric point (pI), is listed in Table 1.
miR156, one of the most conserved miRNA families, plays very important roles in plant growth and development by direct cleavage of SPL transcripts [42]. In the model plant Arabidopsis, 10 of 16 AtSPLs are direct targets of AtmiR156 [42]. In rice, 11 of 19 OsSPL genes are targeted by OsmiR156 [9]. We predicted the CpSPLs targeted by AtmiR156 using the online plant target prediction tool. The prediction indicated that a total of 10 CpSPLs were targets of AtmiR156, eight of which (CpSPL3, CpSPL6, CpSPL9-13, and CpSPL15) were targeted in the coding region, while two (CpSPL4 and CpSPL5) were targeted in the 3′ UTR (Fig. 1).

Gene structure and conserved motif analysis
To clarify the structural diversity of the 15 CpSPLs, we performed gene exon/intron structure analysis. The number of introns varied widely, ranging from one to ten (Fig. 3). Interestingly, we found that most CpSPLs in the same group share a similar structure. For instance, CpSPL1 and CpSPL14, which belong to G1, each have ten introns, whereas CpSPL4 and CpSPL5, members of G6, each have only one intron (Fig. 3).
To explore the conserved motifs, the 15 CpSPLs were analyzed with the MEME program. Among the 12 conserved motifs identified (Fig. 4, Table S3), motif 1, motif 2, and motif 3 were present in all 15 CpSPLs and formed the conserved SBP domain.
Members of the same group had similar motif compositions. For example, CpSPL2 and CpSPL10 in G4 both consisted of five conserved motifs (motif 1/2/3/10/11). The motif composition of CpSPL9 was identical to that of CpSPL15, suggesting that CpSPL9 and CpSPL15 probably have similar and redundant functions in plant development.

Spatiotemporal expression analysis of CpSPL genes
We investigated the expression patterns of the 15 CpSPLs in the leaves, stems, and roots from one-, two-, and three-month-old seedlings, and in the flower and calyx from plants at the flowering stage, by qRT-PCR. The results showed that most CpSPLs were expressed in almost all the tissues (Fig. 5). Compared with the other genes, the expression level of CpSPL7 was more constant across all the tissues tested. CpSPL8 showed its highest level in the calyx. CpSPL5 was expressed at relatively higher levels in the leaf and calyx. The expression levels of CpSPL3, CpSPL8, CpSPL10, CpSPL12, and CpSPL13 in the stems gradually decreased as the seedlings matured. CpSPL1 and CpSPL14, two members of G1, showed similar expression patterns, and their expression levels in the root increased gradually as the seedlings matured. In addition, the expression patterns of CpSPL9 and CpSPL15 were highly similar, with higher levels in flowers and 3-month-old roots. In summary, the spatiotemporal expression analysis indicated that CpSPL genes exhibit various expression patterns, which provides preliminary information for understanding their potential functions in the development of C. pilosula.

Overexpression of CpSPL2 or CpSPL10 promotes the growth of C. pilosula hairy root
To investigate the function of CpSPL2 and CpSPL10 in root development, we generated CpSPL2-overexpressing and CpSPL10-overexpressing transgenic hairy roots. The expression level of CpSPL2 or CpSPL10 in the transgenic lines was examined by qRT-PCR (Fig. 7A, B). Two independent CpSPL2-overexpressing lines (CpSPL2-OE3 and CpSPL2-OE5) and two independent CpSPL10-overexpressing lines (CpSPL10-OE2 and CpSPL10-OE3) with dramatically elevated CpSPL2 or CpSPL10 expression were selected for further analysis.
In comparison with the control, the hairy roots overexpressing CpSPL2 or CpSPL10 grew faster (Fig. 7C, D, and E). When transgenic hairy roots about 1.0 cm in length were cultured for one month, the biomass of CpSPL2-OE3, CpSPL2-OE5, CpSPL10-OE2, and CpSPL10-OE3 was 2.19, 1.98, 3.15, and 2.83 times that of the control (EV2), respectively (Fig. 7F). Our results indicated that both CpSPL2 and CpSPL10 promote the growth of hairy roots.

Overexpression of CpSPL2 or CpSPL10 promotes accumulation of lobetyolin and total saponins in C. pilosula hairy root
To evaluate the impact of CpSPL2 and CpSPL10 on active ingredients, HPLC and a UV spectrophotometer were used to determine the concentrations of lobetyolin and total saponins, respectively, in those transgenic lines. Surprisingly, the production of both lobetyolin and total saponins was greatly increased in the CpSPL2-OE and CpSPL10-OE lines. The concentration of lobetyolin in CpSPL2-OE3, CpSPL2-OE5, CpSPL10-OE2, and CpSPL10-OE3 was 6.43, 6.25, 6.29, and 7.03 times that of the control (EV2), respectively (Fig. 8A). The concentration of total saponins in CpSPL2-OE3, CpSPL2-OE5, CpSPL10-OE2, and CpSPL10-OE3 was 3.18, 2.72, 1.81, and 1.94 times that of the control (EV2), respectively (Fig. 8B). In summary, CpSPL2 and CpSPL10 promote not only the growth of hairy roots but also the accumulation of lobetyolin and total saponins.

Discussion
Identification of SPL genes in C. pilosula
SPLs are plant-specific TFs characterized by a highly conserved SBP domain [5,6]. They play critical roles in regulating diverse aspects of plant growth and development, including vegetative phase change, plant architecture, anthocyanin accumulation, and lateral root growth [18][19][20][21][22][23][24]. Since its first discovery in A. majus [3], the SPL gene family has been isolated and identified in various plants. For instance, there are 16 SPL gene family members in Arabidopsis thaliana [8], 19 in Oryza sativa [9], 15 in Solanum lycopersicum [11], 15 in Salvia miltiorrhiza [13], and 15 in Ricinus communis [17]. However, little is known about the SPL gene family in C. pilosula, a well-known species with important medicinal and edible value. Here, we identified 15 CpSPL genes in the C. pilosula genome. Since most SPL genes are targets of miR156 [42], we predicted miR156-targeted CpSPL genes with psRNATarget. The prediction showed that ten CpSPL genes were targeted by miR156 (Fig. 1), which is consistent with the miR156-SPL module being widespread in plants.
We constructed a phylogenetic tree of the 16 AtSPLs from A. thaliana and the 15 CpSPLs from C. pilosula. The 31 SPL genes were divided into seven groups, and each group had at least one CpSPL and one AtSPL (Fig. 2). All the non-miR156-targeted AtSPLs and CpSPLs were grouped into G1, G3, and G7. CpSPL family members in the same group showed similar gene structure and motif composition (Fig. 3; Fig. 4), which is consistent with a previous report [17]. In Arabidopsis, members of the same group often have the same or similar functions. For instance, AtSPL3, AtSPL4, and AtSPL5, clustered in G6, synergistically induce flowering under a long-day photoperiod [44]. Most recently, AtSPL2, AtSPL10, and AtSPL11, members of G4, have been reported to inhibit root regeneration by dampening auxin biosynthesis [43]. We speculate that CpSPLs in the same group may have the same function, for example CpSPL2 and CpSPL10 in G4, and CpSPL4 and CpSPL5 in G6.
CpSPL genes expression patterns in C. pilosula
The expression pattern of a gene provides, to a large extent, valuable information about its potential function [45]. In this study, the spatiotemporal expression patterns of the 15 CpSPLs in the leaves, stems, and roots from one-, two-, and three-month-old seedlings, and in the flower and calyx from plants at the flowering stage, were detected by qRT-PCR (Fig. 5). The results showed that CpSPL1 and CpSPL14 in G1 exhibited similar expression patterns, and the expression patterns of the paralogous CpSPL9 and CpSPL15 in G5 were highly similar. Our results are consistent with the previous conclusion that paralogous SPL genes in the same group often show similar expression profiles [46,47]. AtSPL9, AtSPL10, and AtSPL15 contribute to the vegetative-to-reproductive phase transition [8]. Here, CpSPL9, CpSPL10, and CpSPL15 were expressed predominantly in the flower, suggesting that they might function in flower development in C. pilosula. The expression level of the non-miR156-targeted CpSPL7 was more constant across all the tissues tested, which is consistent with the result for SmSPL7 from Salvia miltiorrhiza [13].
Some SPL genes have been shown to be involved in abiotic stress responses. For example, in Arabidopsis, AtSPL1 and AtSPL12 function redundantly in thermotolerance, and overexpression of AtSPL1 or AtSPL12 increased plant thermotolerance [48]. In alfalfa, silencing MsSPL13 enhanced tolerance to drought and heat stress (40 °C) [26,49], and down-regulation of MsSPL8 led to enhanced salt and drought tolerance [50]. In the present study, we investigated the expression levels of the 15 CpSPLs under NaCl, MeJA, or ABA treatment. We found that the expression levels of most CpSPL genes changed significantly under NaCl, MeJA, and ABA treatment (Fig. 6). Among the genes with significant changes, CpSPL4, CpSPL6, CpSPL14, and CpSPL15 responded positively to all the treatments, while CpSPL5, CpSPL8, and CpSPL10 responded negatively to all the treatments. Compared with the other genes, CpSPL5 and CpSPL8 showed higher fold changes under the different treatments. We speculate that these two genes are potential candidates involved in abiotic stress responses.
Functional study of the CpSPL2 and CpSPL10 genes
Since the medicinal and edible part of C. pilosula is the root, increasing root yield is one of the main breeding goals for this species. In Arabidopsis, AtSPL2, AtSPL10, and AtSPL11 inhibit root regeneration by dampening auxin biosynthesis [43]. The miR156-targeted SPL10 is involved in regulating not only lateral root growth but also primary root growth [21,23]. Recently, it was reported that overexpression of FvSPL10, an SPL gene from Fragaria vesca, resulted in increased organ size, including longer roots and larger floral organs and seeds [30]. We speculated that CpSPL2 and CpSPL10, two members clustered in the same group as AtSPL2/10/11 (Fig. 2), were probably involved in the regulation of root development. To investigate the function of CpSPL2 and CpSPL10, we generated transgenic hairy roots overexpressing CpSPL2 or CpSPL10. Compared with the control, transgenic lines overexpressing CpSPL2 or CpSPL10 grew faster, and the biomass of CpSPL2-OE3, CpSPL2-OE5, CpSPL10-OE2, and CpSPL10-OE3 was 2.19, 1.98, 3.15, and 2.83 times that of the control when transgenic hairy roots about 1.0 cm in length were cultured for one month (Fig. 7). Our results indicated that overexpression of CpSPL2 or CpSPL10 significantly promotes the growth of hairy roots.
Furthermore, we determined the concentrations of lobetyolin and total saponins in those transgenic lines. Unexpectedly, we found that overexpressing CpSPL2 or CpSPL10 dramatically promoted the accumulation of lobetyolin and total saponins in the hairy roots (Fig. 8). Among the 16 AtSPLs in Arabidopsis, AtSPL9 is the only one that has been reported to regulate the biosynthesis of secondary metabolites [24,25]. AtSPL9 negatively regulates anthocyanin accumulation by preventing the formation of the MBW complex [24], and it positively regulates the formation of (E)-β-caryophyllene by binding to the promoter of the sesquiterpene synthase gene TPS21 and activating its expression [25]. Our results indicated that CpSPL2 and CpSPL10 are potential candidates for the genetic improvement of C. pilosula because they significantly promote not only the growth of hairy roots but also the accumulation of lobetyolin and total saponins. The molecular mechanism by which CpSPL2 and CpSPL10 function in hairy roots needs to be addressed in the future.

Conclusions
In this study, we identified 15 CpSPL genes, each confirmed to contain the SBP domain, based on the genome data of C. pilosula. Ten of the 15 CpSPLs were predicted to be directly targeted by miR156, including CpSPL3-6, CpSPL9-13, and CpSPL15. All CpSPLs were clustered into seven groups, and members of the same group share similar gene structure and conserved motif composition. The spatiotemporal expression analysis of the 15 CpSPLs showed that the CpSPL gene family had

The analysis of conserved motifs of SPLs in Codonopsis pilosula.
Concentrations of lobetyolin and total saponins in Codonopsis pilosula transgenic hairy roots. One-way ANOVA (followed by Tukey's comparisons) was used to test for significant differences among means (indicated by different letters at p < 0.05).


Table 1
The information of 15 SPL genes in Codonopsis pilosula