‘QY1’ and ‘QY3’ are H. tuberosus cultivars bred by the Qinghai Academy of Agricultural and Forestry Sciences (Xining 810000, China). The tuber epidermis of ‘QY1’ is red, whereas that of ‘QY3’ is white (Fig. 1A). All materials were planted and stored at the Institute of Horticulture, Qinghai Academy of Agricultural and Forestry Sciences (E101°45′08.15″, N36°43′32.06″). The library labels of these samples are recorded in Table S4. The Nicotiana tabacum cultivar Samsun was chosen as the host plant for transformation. It was provided by Professor Cathie Martin of the John Innes Centre and is now maintained at the Northwest Plateau Institute of Biology, Chinese Academy of Sciences. No permission was required to collect the plants. In this study, Yuan Zong was responsible for planting and identifying these samples.
Tuber epidermis samples of ‘QY1’ and ‘QY3’ were collected in triplicate and used as the source material for transcriptome sequencing. For each cultivar, each of the three transcriptomes was generated from a different sample. The cDNA libraries of tuber epidermis were constructed according to the mRNA-Seq sample preparation requirements (Illumina Inc., San Diego, CA, USA). The libraries were sequenced by Novogene on the Illumina HiSeq 2000 platform using paired-end technology with a read length of 150 bp, with three replicates. Before assembly, the raw reads were filtered to obtain high-quality clean reads. Reads with ambiguous bases (> 5% ‘N’ in the sequence trace), low-quality reads (more than 20% of bases with a quality value ≤ 10) and reads containing adapters were removed. After filtering, Trinity was used with default parameters to assemble the high-quality reads into unique consensus sequences. The expression level of each unigene was calculated as its FPKM (fragments per kilobase of transcript per million mapped reads) value. Unigenes that differed between the ‘QY1’ and ‘QY3’ sample transcriptomes were identified by the Chi-square test using IDEG6 software. The False Discovery Rate (FDR) method was used to determine the threshold p-value at FDR ≤ 0.001, and |log2Ratio| ≥ 1 was used as the threshold for significant differential expression of unigenes. All genes related to anthocyanin biosynthesis in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) databases were collected and aligned to the unigenes of the transcriptome using BlastX with an e-value < 1e-5. To compare the relative expression levels of unigenes, the FPKM values of all unigenes aligned to the same gene of the anthocyanin biosynthesis pathway were summed.
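The expression and threshold calculations described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which relied on IDEG6); the function names, the pseudocount added before taking the log ratio, and the example numbers are assumptions made for illustration only.

```python
import math


def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def is_differential(fpkm_a, fpkm_b, fdr,
                    fdr_cutoff=0.001, min_abs_log2=1.0, pseudo=0.01):
    """Apply the two thresholds from the text: FDR <= 0.001 and |log2Ratio| >= 1.

    `pseudo` is an assumed pseudocount that avoids division by zero;
    it is not specified in the source.
    """
    log2_ratio = math.log2((fpkm_a + pseudo) / (fpkm_b + pseudo))
    return fdr <= fdr_cutoff and abs(log2_ratio) >= min_abs_log2


def pathway_expression(unigene_fpkms):
    """Sum the FPKM values of all unigenes aligned to one pathway gene."""
    return sum(unigene_fpkms)


if __name__ == "__main__":
    # Hypothetical unigene: 100 fragments, 1000 bp, 1 million mapped fragments.
    print(fpkm(100, 1000, 1_000_000))
    print(is_differential(40.0, 10.0, fdr=0.0005))
```

A unigene is called differentially expressed only when both conditions hold, so a large fold change with FDR above 0.001 is still rejected.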
DNA and cDNA preparation
Genomic DNA of Jerusalem artichoke was extracted from 1 g (fresh weight) of tuber. Total RNA was extracted from the roots, stems, leaves, flowers and tuber epidermis of Jerusalem artichoke using the Trizol method. First-strand cDNA was synthesized with the FastKing gDNA Dispelling RT SuperMix first-strand synthesis kit (TIANGEN, Beijing, China) according to the manufacturer’s instructions. The DNA and synthesized cDNA were stored at −20 °C prior to gene cloning and qPCR analysis.
PCR and qPCR analysis
The primers were designed with PRIMER 6.0 (Palo Alto, CA, USA) and synthesized by BGI Biological Technology Co., Ltd (BGI Company, Beijing, China). The 50 μL reaction volume included 25 μL 2× Unique HiQ™ PCR Buffer, 0.5 μL Pfu DNA Polymerase (Thermo Fisher Scientific, Beijing, China), 0.5 μL of each primer (20 pmol) and 0.5 μL cDNA, and was made up to volume with ddH2O. The PCR program was: 98 °C for 2 min, followed by 30 cycles of 98 °C for 10 s, 53 °C for 30 s and 72 °C for 2 min, then a final extension at 72 °C for 10 min and storage at 4 °C. The PCR products were detected by 1% agarose gel electrophoresis and photographed with a gel imaging analyzer (Tanon, Shanghai, China). All primers used in this research are listed in Table S5.
To analyze the transcription level of genes related to anthocyanin synthesis, real-time fluorescence quantitative PCR (qPCR) was performed on an Applied Biosystems QuantStudio® 3 Real-Time PCR System (Thermo Fisher Scientific, Beijing, China). The melting curve was analyzed to confirm the specificity of the amplification. The 20 μL reaction mixture contained 10 μL 2× SYBR Green, 7.8 μL ddH2O, 0.6 μL of each primer and 1 μL cDNA template (about 100 ng/μL). The thermal program consisted of pre-denaturation at 95 °C for 15 min, followed by 40 cycles of denaturation at 95 °C for 10 s, annealing at 60 °C for 20 s and extension at 72 °C for 30 s. Fluorescence signals were collected at the 60 °C annealing stage to obtain cycle threshold (CT) values for the different genes. The data were analyzed using the 2^−ΔΔCT method.
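The relative quantification step can be sketched as follows. This is an illustrative calculation only: the text does not name the reference gene or the calibrator sample, so the argument names and CT values below are hypothetical, and the formula assumes an amplification efficiency close to 100% for both target and reference genes.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression computed as 2 raised to the power of -ddCT.

    dCT  = CT(target gene) - CT(reference gene), within one sample
    ddCT = dCT(test sample) - dCT(calibrator sample)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    ddct = delta_sample - delta_calibrator
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical CT values: the target gene crosses the threshold
    # 3 cycles earlier in the test sample than in the calibrator.
    print(relative_expression(22.0, 18.0, 25.0, 18.0))
```

With these hypothetical values, the target gene crosses the threshold three cycles earlier in the test sample, which corresponds to an eightfold higher relative expression.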
The online tool ExPASy Translate (https://web.expasy.org/translate/) was used to predict the protein sequence. BlastP (https://blast.ncbi.nlm.nih.gov/blast.cgi) at NCBI was used to predict the conserved protein regions. Phylogenetic trees were constructed by the neighbor-joining method with default parameters in MEGA6 (http://www.megasoftware.net/mega6/faq.html). BDGP (http://www.fruitfly.org/seq_tools/promoter.html) was used to predict the functional domains in the promoter.
Overexpression of HtMYB2 in tobacco
The overexpression vector for tobacco transformation was based on the pJAM1502 binary vector, which contains a double CaMV35S promoter. The pJAM1502:HtMYB2 construct was generated using the Gateway cloning kit (Invitrogen, Carlsbad, CA, USA). Binary vectors were electroporated into Agrobacterium tumefaciens strain GV3101. Tobacco (Nicotiana tabacum) was transformed using a leaf disc transformation method. Transgenic shoots were grown on selective medium containing 3% (w/v) sucrose, 0.7% (w/v) MS (Murashige and Skoog), 0.7% (w/v) agar, 1.0 mg/mL 6-benzylaminopurine (6-BA), 1.0 mg/mL 1-naphthaleneacetic acid (NAA), 300 mg/L hygromycin and 150 mg/L kanamycin. After one month, these transgenic shoots were transferred to the greenhouse under long-day conditions (16 h light/8 h dark). Significant differences were determined using analysis of variance (ANOVA) and Tukey’s honestly significant difference (HSD) test, with P < 0.05 considered significant. All data were analyzed using SPSS software (IBM, USA).
Anthocyanins were extracted by the method for "total monomeric anthocyanin pigment content of fruit juice, beverages, natural colors, and wines" (AOAC Official Method 2005.02). The absorbances (A) at 530 nm and 657 nm were measured using a spectrophotometer (Beijing General Analysis Company, Beijing, China), and the results were expressed as ΔA g−1 fresh weight. The relative content of anthocyanin in the extract was calculated as ΔA = A530 − (0.25 × A657), which corrects the absorbance for the effects of chlorophyll and its degradation products [29, 30].
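As a worked illustration of this correction, the following minimal Python sketch computes ΔA per gram of fresh weight. The function name and the example readings are hypothetical and are not measurements from this study.

```python
def relative_anthocyanin(a530, a657, fresh_weight_g):
    """Relative anthocyanin content as delta-A per gram fresh weight.

    delta-A = A530 - 0.25 * A657, where the A657 term corrects for
    chlorophyll and its degradation products.
    """
    delta_a = a530 - 0.25 * a657
    return delta_a / fresh_weight_g


if __name__ == "__main__":
    # Hypothetical readings from 0.5 g of tissue.
    print(relative_anthocyanin(a530=0.80, a657=0.20, fresh_weight_g=0.5))
```

For the hypothetical readings above, ΔA = 0.80 − 0.25 × 0.20 = 0.75, which gives 1.5 ΔA g−1 for 0.5 g of tissue.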
Genotyping of a natural population of Helianthus tuberosus
The promoter sequences of HtMYB2 were isolated from ‘QY1’ and ‘QY3’ by thermal asymmetric interlaced (TAIL)-PCR. Based on the nucleotide sequence differences between the HtMYB2 promoters of ‘QY1’ and ‘QY3’, a polymorphic PCR marker, HtproS, was designed to distinguish between the two cultivars (Table S5). The allelic variation in HtMYB2 was then examined in a natural population of H. tuberosus. A total of 180 Jerusalem artichoke materials from different regions were collected, and their DNA was extracted and stored for later use (Table S4).