Plant materials
More than 1000 5-year-old shrubs of Shu Chazao tea (Camellia sinensis var. sinensis cv. “Shu Chazao”) are grown at the research station of Anhui Agricultural University for various research purposes. In mid-spring, shoot buds; the 1st, 2nd, and 3rd young leaves on new sprouts; fully expanded mature leaves; young stems; and roots were collected, frozen in liquid nitrogen, and stored at -80 °C until use. Nicotiana tabacum cv. “G28” grown in the greenhouse was used for genetic transformation, and N. benthamiana grown in the growth chamber was used for subcellular protein localization experiments. The growth chamber was set to a constant temperature of 28±3 °C and a constant 12/12-h (light/dark) photoperiod with a light intensity of 150-200 μmol m⁻² s⁻¹.
Gene cloning and phylogenetic analysis
Total RNA was isolated from plant tissues using an RNAiso-mate for Plant Tissue Kit (Takara, Dalian, China) according to the manufacturer’s protocol. First-strand cDNAs were synthesized using a PrimeScript® RT Reagent Kit (Takara, Dalian, China). Based on the available cDNA sequence of Cs4CL1, forward and reverse primers covering the open reading frame (ORF) were designed with Primer Premier 5 software (Supplementary Table S1). To clone the full-length Cs4CL2, primer pairs (Supplementary Table S1) were designed from an EST sequence of Cs4CL2 for RACE PCR. The ORFs of Cs4CL1 and Cs4CL2 were amplified with a polymerase chain reaction (PCR) program consisting of 98 °C for 30 s; 29 cycles of 98 °C for 10 s, 60 °C for 20 s, and 72 °C for 30 s; and a final 10 min extension at 72 °C. The amplified PCR products were ligated into the pMD19-T vector and then sequenced at BGI (http://www.genomics.cn/index) to confirm their accuracy.
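As a quick check of the thermal program described above, the programmed hold times can be summed in a few lines. This is only a sketch: the `Step` class, the `program_seconds` helper, and the `ORF_PROGRAM` name are illustrative and not part of the original protocol, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in a thermal program (illustrative helper, not from the protocol)."""
    temp_c: float
    seconds: int


def program_seconds(initial, cycle_steps, n_cycles, final):
    """Return the total programmed hold time in seconds; ramp times are ignored."""
    per_cycle = sum(step.seconds for step in cycle_steps)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# ORF amplification program as stated in the text.
ORF_PROGRAM = dict(
    initial=Step(98, 30),
    cycle_steps=[Step(98, 10), Step(60, 20), Step(72, 30)],
    n_cycles=29,
    final=Step(72, 600),
)

total = program_seconds(**ORF_PROGRAM)
print(f"{total} s = {total / 60:.1f} min")
```

Under these assumptions the program holds for 2370 s (39.5 min), excluding the time the cycler spends ramping.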
The amino acid sequences deduced from the resulting ORFs were aligned using DNAMAN software (Zhao et al. 2013). A phylogenetic tree was constructed from Cs4CL1, Cs4CL2, and 21 other 4CL sequences using MEGA6.1 software. The GenBank IDs of the 4CLs are listed in Supplementary Table S2. The tree nodes were evaluated by bootstrap analysis with 1000 replicates, and the evolutionary distances were estimated with the p-distance method.
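The p-distance is simply the proportion of aligned sites at which two sequences differ. The sketch below illustrates the calculation for one pair of aligned sequences; the function name, the toy sequences, and the choice to skip gapped sites pairwise are assumptions made for illustration and are not taken from the MEGA settings used in this study.

```python
def p_distance(seq_a, seq_b, gap_chars="-?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or missing-data symbol are
    skipped (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned peptides (not real 4CL sequences): 1 difference in 7 compared sites.
print(p_distance("MKTAYIAK", "MKTGYI-K"))
```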
Cloning and sequence analysis of Cs4CL1 and Cs4CL2 promoters
The promoters were cloned by genome walking following the protocol of the Genome-Walker Kit (Clontech). Primers (Table S1) were designed to amplify 1000 nucleotides of each promoter sequence by PCR. The amplified PCR products were cloned into the plasmid pEASY-T1 and then sequenced at BGI to confirm their accuracy. Binding elements in the promoter sequences were identified using the online promoter analysis tools PLACE (http://www.dna.affrc.go.jp/htdocs/PLACE/) and PlantPan (http://plantpan.itps.ncku.edu.tw/).
Construction of binary vectors and plant transformation
The ORFs of Cs4CL1 and Cs4CL2 were cloned into the binary vector pCB2004 under the control of a 35S promoter using the Gateway® Cloning System (Invitrogen, Carlsbad, CA) (Lei et al. 2007). PCR primers were designed with attB sites added at both ends. The PCR program was as follows: 98 °C for 30 s; 30 cycles of 98 °C for 10 s, 60 °C for 20 s, and 72 °C for 30 s; and a final 10 min extension at 72 °C. The PCR products were purified and then cloned into the entry vector pDONR207 using the Gateway® BP Clonase® Enzyme mix (Invitrogen, Carlsbad, CA). After sequencing confirmed the accuracy of the inserts in the recombinant pDONR207 plasmids, the ORFs were transferred to the pCB2004 binary vector via an exchange reaction using Gateway® LR Clonase™ enzyme (Invitrogen, Carlsbad, CA). The two recombinant plasmids, pCB2004-Cs4CL1 and pCB2004-Cs4CL2, were introduced into Agrobacterium tumefaciens strain EHA105 for tobacco transformation. Positive EHA105 colonies were selected on agar-solidified medium containing 50 mg/L kanamycin. Tobacco plants were transformed by the leaf disc method (Li et al. 2017). Positive transgenic plants were screened on agar-solidified MS medium supplemented with 25 mg/L phosphinothricin. Multiple phosphinothricin-resistant plants were verified as transgenic by genomic DNA-based PCR and RT-PCR, and then planted in potting soil in the greenhouse.
Subcellular localization
The stop codons of the Cs4CL1 and Cs4CL2 ORFs were removed by PCR, and the fragments were cloned into the pGWB5 vector, in which each was fused to the 5′ end of an EGFP reporter gene using the Gateway cloning system as described above. The two recombinant constructs, pGWB5-Cs4CL1-EGFP and pGWB5-Cs4CL2-EGFP, were introduced into A. tumefaciens EHA105. Positive colonies were activated and then used to infect leaves of 45-day-old N. benthamiana. At 48 h after infection, leaves were examined for EGFP fluorescence under an Olympus FV1000 confocal microscope (Olympus, Tokyo, Japan).
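For a C-terminal EGFP fusion the ORF must end without a stop codon and stay in frame. In the study this was done by PCR; the sketch below only illustrates the corresponding in silico check on a sequence. The function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop_codon(orf):
    """Return the ORF without its terminal stop codon, keeping the reading frame.

    Raises ValueError if the ORF length is not a multiple of three, since the
    downstream EGFP would then be out of frame.
    """
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    if seq[-3:] in STOP_CODONS:
        return seq[:-3]
    return seq


# Toy ORF (not the Cs4CL sequence): ATG GCT TAA -> ATG GCT
print(strip_stop_codon("ATGGCTTAA"))
```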
Quantitative reverse transcription PCR (RT-qPCR) and semi-quantitative RT-PCR analysis for gene expression in different tea tissues, treatments, and transgenic tobacco plants
RT-qPCR was performed for gene expression profiling in different tissues using SYBR-Green PCR Mastermix (Invitrogen, Carlsbad, CA) on a CFX96™ instrument (Bio-Rad, California, USA) following the manufacturer’s instructions. Gene-specific primer pairs and thermal programs were designed for Cs4CL1 and Cs4CL2 (Table S1). ACTIN was used as the reference gene. Amplification was monitored using the optical reaction module, and expression values were normalized against ACTIN expression (Pang et al. 2007). For identification of positive transgenic tobacco lines, semi-quantitative RT-PCR was performed as previously described (Zhao et al. 2013).
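Normalization against a reference gene is commonly expressed as 2^-ΔCt, where ΔCt is the Ct of the target gene minus the Ct of ACTIN. The text does not state the exact formula used, so the sketch below is an assumption: it presumes roughly 100% amplification efficiency for both genes, and the Ct values are invented for illustration.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Relative expression as 2^-(Ct_target - Ct_reference).

    Each argument is a list of technical-replicate Ct values; replicates
    are averaged before the difference is taken. Assumes ~100% efficiency.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)


# Invented Ct values: mean target Ct 24.2, mean ACTIN Ct 18.2, so delta Ct = 6.
print(relative_expression([24.1, 24.3, 24.2], [18.2, 18.1, 18.3]))
```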
Heterologous expression in Escherichia coli and recombinant protein purification
The ORFs of Cs4CL1 and Cs4CL2 were ligated into a pET-SUMO vector (Life Technologies, USA) with T4 ligase according to the manufacturer’s protocol. The resulting positive Cs4CL1-pET-SUMO and Cs4CL2-pET-SUMO plasmids were transformed into competent E. coli BL21 cells. Positive colonies were obtained after screening on agar-solidified Luria-Bertani (LB) medium supplemented with 100 μg/ml kanamycin. One positive colony for each construct was inoculated into 200 ml of liquid LB medium supplemented with 100 μg/ml kanamycin in a 500 ml Erlenmeyer flask. The flasks were placed on a rotary shaker at 250 rpm and 37 °C. After the optical density of the cultures at 600 nm reached 1.0, IPTG was added to a final concentration of 0.2 mM. The temperature was then changed to 28 °C to induce protein expression. After 24 h of induction, the cultures were transferred to tubes and centrifuged for 10 min at 5000 rpm to harvest the E. coli cells. Crude proteins were extracted as described previously (Zhao et al. 2017). Recombinant Cs4CL1 and Cs4CL2 were purified by affinity chromatography on Ni resin (New England Biolabs, MA, USA). Two buffers were used for affinity chromatography. Buffer A consisted of 20 mM Tris-HCl (pH 7.4) and 200 mM NaCl. Buffer B was buffer A containing 100 mM imidazole. The column was first washed and equilibrated with buffer A. After the crude protein extracts were loaded onto the column, buffer A was used to wash off all unbound proteins. Buffer B was then used to elute recombinant Cs4CL1 and Cs4CL2, which were examined by electrophoresis on a 12% SDS polyacrylamide gel followed by Coomassie brilliant blue staining.
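The volume of IPTG stock needed to reach the stated final concentration follows from C1·V1 = C2·V2. The sketch below assumes a hypothetical 1 M (1000 mM) stock, which is not specified in the text, and neglects the small volume the stock itself adds to the culture.

```python
def stock_volume_ul(final_mM, culture_ml, stock_mM):
    """Volume of stock (in microlitres) giving final_mM in culture_ml of culture.

    Uses C1*V1 = C2*V2 and ignores the volume contributed by the stock.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_mM * culture_ml / stock_mM * 1000.0


# 0.2 mM final in a 200 ml culture from an assumed 1 M stock.
print(f"{stock_volume_ul(0.2, 200, 1000):.0f} µl")
```

With these assumed values, about 40 µl of stock would be added per 200 ml culture.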
Enzymatic assays
Enzymatic assays were performed to characterize the catalytic activity of Cs4CL1 and Cs4CL2 according to a published protocol (Knobloch and Hahlbrock, 1975) with minor modifications for these two enzymes. Briefly, prior to optimizing the reaction conditions, we tested enzymatic activity with three different substrates in a 500 μl reaction volume. The reaction mixture contained 0.5 M phosphate buffer (pH 7.5), 0.3 mM coenzyme A (CoA), 5 mM ATP, 5 mM MgCl2, 5 μg purified recombinant enzyme, and 0.4 mM substrate (p-coumaric acid, caffeic acid, or ferulic acid) in a 1.5 ml tube. The reactions were incubated at 30 °C for 30 min and then terminated by adding 50 μl methanol. After the reactions were centrifuged for 10 min at 12,000 rpm, the supernatants were transferred to new tubes for immediate HPLC analysis. To determine the optimal pH and temperature for Cs4CL1 and Cs4CL2, p-coumaric acid (4-coumaric acid) was used as the substrate, while the other conditions were unchanged. To estimate Km and Vmax values, reactions were carried out in a 2 ml reaction volume containing 0.5 M PBS (pH 7.5), 0.03 to 0.3 mM CoA, 5 mM ATP, 0.04-0.4 mM substrate (p-coumaric acid, caffeic acid, or ferulic acid), and 2-20 μg recombinant Cs4CL1 or Cs4CL2 protein. Finally, the Km and Vmax values of Cs4CL1 and Cs4CL2 were calculated from Lineweaver-Burk plots.
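A Lineweaver-Burk plot linearizes the Michaelis-Menten equation as 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, so a straight-line fit of 1/v against 1/[S] gives Vmax from the intercept and Km from the slope divided by the intercept. The sketch below shows that calculation with an ordinary least-squares fit; the function name is illustrative, and the rates are synthetic values generated from assumed constants (Km = 0.1 mM, Vmax = 2.0 in arbitrary units), not measured data.

```python
def lineweaver_burk(substrate_mM, rates):
    """Estimate (Km, Vmax) from a least-squares line through (1/[S], 1/v)."""
    if len(substrate_mM) != len(rates) or len(rates) < 2:
        raise ValueError("need at least two paired observations")
    xs = [1.0 / s for s in substrate_mM]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # Km / Vmax
    intercept = mean_y - slope * mean_x   # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope / intercept
    return km, vmax


# Synthetic rates from assumed constants, over the 0.04-0.4 mM substrate range.
TRUE_KM, TRUE_VMAX = 0.1, 2.0
conc = [0.04, 0.08, 0.16, 0.24, 0.4]
rate = [TRUE_VMAX * s / (TRUE_KM + s) for s in conc]

km, vmax = lineweaver_burk(conc, rate)
print(f"Km = {km:.3f} mM, Vmax = {vmax:.3f}")
```

Because the reciprocal transform gives the most weight to the lowest substrate concentrations, estimates from real, noisy data are sensitive to error in those points.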
HPLC analysis of enzymatic reaction products
The enzymatic reaction products were analyzed using a Shimadzu LC20-AT system with full-wavelength detection. The elution of products was monitored at 333 nm for p-coumaroyl-CoA, 350 nm for caffeoyl-CoA, and 346 nm for feruloyl-CoA. The mobile phase consisted of solvent A (0.5% acetic acid in double-deionized water) and solvent B (100% HPLC-grade acetonitrile). The gradient elution program consisted of 30–90% solvent B from 0 to 23 min and 90–30% solvent B from 23 to 29 min, followed by a 10 min column wash with 30% solvent B.
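The gradient can be written as a short table of time points and interpolated to give the proportion of solvent B at any moment of the run. The sketch below assumes that each ramp is linear, which the text does not state explicitly; the `GRADIENT` table and `percent_b` function are illustrative names.

```python
# (time in min, % solvent B) break points taken from the program in the text;
# the final 10 min wash at 30% B runs from 29 to 39 min.
GRADIENT = [(0.0, 30.0), (23.0, 90.0), (29.0, 30.0), (39.0, 30.0)]


def percent_b(t_min):
    """Percent solvent B at time t_min, assuming linear ramps between points."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: break points cover the whole run")


for t in (0, 11.5, 23, 26, 35):
    print(t, percent_b(t))
```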
Analysis of total lignin in transgenic and wild-type tobacco plants
Lignin analysis was carried out following the Syros method (Halpin et al. 1994). Briefly, roots, stems, and leaves were harvested from 90-day-old plants grown in potting soil, pooled, and then ground into a fine powder in liquid nitrogen. One hundred mg of powder was suspended in 1 ml of 50% ethanol (in water) in a 2 ml tube. After 3 h of extraction at 80 °C, 1 ml of methanol was added to the mixture, which was incubated for 1 h at 80 °C. The tubes were centrifuged at 12,000 rpm for 10 min to separate the supernatant and residue phases. The supernatant was discarded. The remaining residues were fully dried at 60 °C in an oven. Ten mg of dry powder was weighed into a 1.5 ml tube, suspended in 5 ml of 25% (w/w) acetyl bromide in acetic acid, and treated for 30 min at 70 °C. Then 0.2 ml of 70% perchloric acid in water was added, and the mixtures were mixed thoroughly and treated at 70 °C for a further 30 min. After cooling to room temperature, the tubes were centrifuged at 3000 × g for 15 min to separate supernatants and residues. The supernatant of each treatment was pipetted into a new 50 ml tube, and 5 ml of 2 M NaOH was added. Glacial acetic acid was then immediately added to bring the final volume to 25 ml, followed by thorough mixing. One ml of the mixture was used to measure the absorbance at 280 nm on a UV-visible Hitachi U-5100 spectrophotometer (Hitachi, Tokyo, Japan). The absorbance values were converted to the lignin content of the samples according to a previously published method (Piquemal et al. 2002).
Extraction of polyphenolic compounds from transgenic and wild-type tobacco plants
Leaves from 90-day-old wild-type, Cs4CL1, and Cs4CL2 transgenic tobacco plants were collected for polyphenol analysis. Phenolic compounds were extracted using the following procedure. Samples were ground into a fine powder in liquid nitrogen. Powdered samples (150 mg) were suspended in 1 ml of 80% methanol: 1% hydrochloric acid in a 1.5 ml tube. The tubes were vortexed thoroughly and then kept at room temperature for 20 min, followed by 10 min of centrifugation at 12,000 rpm. The supernatant was transferred to a new tube. This extraction was repeated once to obtain a final volume of 2 ml of extract for each sample. All extracts were then filtered through a 0.22 µm membrane into a new tube prior to the UPLC-MS analysis described below.
Analysis of metabolites by UPLC-MS/MS
Products from the Cs4CL1 and Cs4CL2 assays and polyphenolic metabolites extracted from wild-type and Cs4CL1 and Cs4CL2 transgenic tobacco plants were analyzed by ultra-high-performance liquid chromatography (UPLC)-MS/MS on an Agilent LC-MS system (Palo Alto, CA, USA). Compounds were separated on an Agilent ZORBAX RRHD Eclipse Plus C18 column (particle size: 1.8 µm, length: 100 mm, internal diameter: 2.1 mm; Palo Alto, CA, USA). The column oven, mobile-phase gradient, and electrospray ionization conditions were as described previously (Jiang et al. 2013a).