Identification of three target genes
As shown in Fig. S2a, three fragments of approximately 1200, 1000, and 1000 bp were obtained by PCR. Sequencing confirmed that the full lengths of the three PCR products were 1140, 930, and 993 bp, respectively. The 1140-bp product shared 86.71% identity with Abrus precatorius COMT (GenBank accession No. XM_027512828.1), the 930-bp product shared 89.82% identity with Glycine max CRTZ (NM_001254504.1), and the 993-bp product shared 81.82% identity with Cajanus cajan F3’H (XM_020357268.2), indicating that the COMT, CRTZ, and F3’H sequences had been obtained from G. uralensis. We deposited these sequences in GenBank under the following accession numbers: COMT (MZ169549), CRTZ (MZ169550), and F3’H (MZ169551). This is the first identification of COMT, CRTZ, and F3’H from G. uralensis. To evaluate the evolution of these genes, we then constructed phylogenetic trees based on the COMT, CRTZ, and F3’H homologous sequences registered in NCBI. As shown in Fig. 2, the COMT and F3’H homologs each clustered into three major independent clades (Dicotyledoneae, Monocotyledoneae, and Pteridophyta), whereas the CRTZ homologs clustered into four (Dicotyledoneae, Monocotyledoneae, Pteridophyta, and Bryophyta). These findings suggest that the three enzymes are evolutionarily conserved and that their evolution parallels that of the species.
Construction of recombinant vectors
As shown in Fig. S2b, three fragments were amplified from the plant binary expression vectors pCA-COMT, pCA-CRTZ, and pCA-F3’H, respectively. Sequencing confirmed that these PCR products were completely identical to the COMT (MZ169549), CRTZ (MZ169550), and F3’H (MZ169551) sequences that we cloned from G. uralensis. These findings demonstrate that the plant binary expression vectors for overexpressing COMT, CRTZ, or F3’H were constructed correctly. As shown in Fig. S2c, PCR products of about 400 bp were obtained from the plant CRISPR/Cas9 vectors pHSE-COMT, pHSE-CRTZ, and pHSE-F3’H. Sequencing confirmed that these fragments contained the sgRNA sequences corresponding to the target genes, indicating that the CRISPR/Cas9 vectors for knocking out COMT, CRTZ, or F3’H were constructed successfully.
Identification of G. uralensis hairy root lines
We generated nine types of G. uralensis hairy root lines: WT, NC-PCA, NC-PHSE, COMT+, CRTZ+, F3’H+, COMT-, CRTZ-, and F3’H-. Fig. 3 shows the growth of these hairy roots 10, 20, and 30 days after the onset of induction. All of them grew well. To identify these hairy root lines, we first amplified rolC, the signature gene of hairy roots, from all samples and obtained 600-bp fragments by PCR (Fig. S3a), which showed 100% identity with rolC (GenBank accession No. DQ160187.1). The fragments amplified from COMT+, CRTZ+, and F3’H+ (Fig. S3b) were likewise confirmed to have the correct sequences. Fig. S3c and S3d show the third exons of COMT and CRTZ amplified from COMT- and CRTZ-, respectively, and Fig. S3e shows the first exon of F3’H amplified from F3’H-. As described in the next section, these fragments were used to determine the exact editing sites of the target genes in the COMT-, CRTZ-, and F3’H- hairy root lines through further cloning and sequencing. In total, one WT, five COMT+, five CRTZ+, seven F3’H+, three COMT-, five CRTZ-, four F3’H-, one NC-PCA, and one NC-PHSE samples were obtained.
Characterization of gene knockout and overexpression in transgenic G. uralensis hairy root lines
Further cloning and sequencing analysis confirmed that the COMT gene was edited in four of nine hairy root lines (COMT--2, -5, -6, and -8), giving a gene editing efficiency of 44.4%. The editing details are shown in Fig. 4a, and the mutation sites in the COMT amino acid sequence are listed in Table 1. A synonymous mutation was observed in COMT--2, whereas missense mutations were present in the other three samples. The most effective edit was a fragment deletion in COMT--5, which introduced a “TAA” termination codon at this position. The CRTZ gene was edited in eight of ten lines (CRTZ--1, -2, -3, -4, -5, -6, -7, and -8), giving an editing efficiency of 80%. Among them, homozygous mutations were present in four samples, CRTZ--1, -3, -4, and -6 (Fig. 4b), and heterozygous mutations in the other four (Fig. S4a). Frameshifts were observed in all lines. The mutation sites in the CRTZ amino acid sequence are listed in Table 2. CRTZ--1 and CRTZ--6 carry the same two mutation sites, while CRTZ--3 and CRTZ--4 share one similar mutation site in the CRTZ sequence. All mutation sites in these four samples were located within the functional domain of CRTZ. The F3’H gene was edited in seven of eleven lines (F3’H--2, -4, -5, -7, -8, -9, and -10), giving a gene editing efficiency of 63.6%. Four hairy root lines, F3’H--2, -7, -8, and -10, were confirmed to carry homozygous mutations (Fig. 4c), while the other three lines carried heterozygous mutations (Fig. S4b). The mutation sites in the F3’H amino acid sequence are listed in Table 3. F3’H--2 and F3’H--10 share 16 similar mutation sites in the F3’H sequence. All of these mutations are located within the functional domain of F3’H. We next examined the expression levels of the three target genes in the newly constructed COMT+, CRTZ+, and F3’H+ hairy root lines. As illustrated in Fig. 4d, the relative expression levels of COMT, CRTZ, and F3’H in the COMT+, CRTZ+, and F3’H+ hairy root lines were all markedly higher than those in WT. In particular, samples COMT+-4, CRTZ+-4, and F3’H+-6 showed the highest expression levels in their respective groups.
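The editing efficiencies quoted above follow directly from the counts of edited and screened lines. A minimal sketch of that arithmetic is given here for convenience; the dictionary layout and the function name `editing_efficiency` are illustrative choices and not part of the original analysis.

```python
# Counts taken from the text: (edited lines, screened lines) per target gene.
screened = {
    "COMT": (4, 9),
    "CRTZ": (8, 10),
    "F3'H": (7, 11),
}


def editing_efficiency(edited: int, total: int) -> float:
    """Return the percentage of screened lines that carry an edit."""
    if total <= 0:
        raise ValueError("total must be a positive number of screened lines")
    return 100.0 * edited / total


# Round to one decimal place, matching the precision reported in the text.
efficiencies = {
    gene: round(editing_efficiency(edited, total), 1)
    for gene, (edited, total) in screened.items()
}

for gene, pct in efficiencies.items():
    print(f"{gene}: {pct:.1f}%")
```

The rounded values agree with the 44.4%, 80%, and 63.6% reported for COMT, CRTZ, and F3’H, respectively.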
GA content assay by UPLC in G. uralensis hairy root samples
Fig. 5a shows the healthy and luxuriant hairy root samples cultured in liquid 6,7-V medium. Fig. 5b shows the collected hairy root samples prepared for UPLC analysis. It is worth noting that samples F3’H- and CRTZ+ appear partially white and CRTZ- appears reddish, whereas most of the hairy root lines are yellowish white. The UPLC chromatograms of the GA reference substance, WT, NC-PCA, NC-PHSE, COMT+, CRTZ+, F3’H+, COMT-, CRTZ-, and F3’H- are shown in Fig. 6a. The UPLC retention time of GA was 6.532 min, and the standard curve was Y = 2,683,455.37 X - 1,660.17 (R² = 0.9999). The GA contents of all hairy root samples were calculated and are listed in Table 4. Differences in GA content among samples are shown in Fig. 6b, and those among groups in Fig. 6c. The GA contents in the negative controls (NC-PCA and NC-PHSE) were equivalent to that in WT, and no significant differences were detected among these samples. Interestingly, the expression levels of the three target genes were negatively correlated with GA content in the hairy root samples when compared with WT and the negative controls. For example, the GA contents in all seven F3’H+ samples were significantly lower than those in WT and NC-PCA, and those in three F3’H- samples (F3’H--2, -8, and -10) were markedly higher than those in WT and NC-PHSE. The only exception in group F3’H- was F3’H--7, whose GA content was higher than that in WT and NC-PHSE, but not significantly so. Similar results were observed in the COMT+, CRTZ+, COMT-, and CRTZ- samples. Next, we analyzed the differences in GA content among groups (Table 4) and found that the GA contents in groups F3’H+ (average value: 2.3654 mg·g⁻¹), CRTZ+ (2.2050 mg·g⁻¹), and COMT+ (2.3509 mg·g⁻¹) were significantly lower than those in both WT (3.3242 mg·g⁻¹) and NC-PCA (3.5188 mg·g⁻¹). In contrast, the GA contents in groups F3’H- (5.4744 mg·g⁻¹), CRTZ- (5.3651 mg·g⁻¹), and COMT- (4.2746 mg·g⁻¹) were significantly higher than those in WT and NC-PHSE (3.3459 mg·g⁻¹). These findings suggest that the expression of the three target genes is negatively correlated with GA production.
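For readers who wish to retrace the quantification, the sketch below inverts the reported standard curve and expresses the group means as ratios to WT. It rests on several assumptions that this section does not state: Y is taken to be the UPLC peak area and X the GA concentration of the injected solution (units not given here), and the function names and the peak area used in the check are hypothetical. Converting a solution concentration to mg·g⁻¹ of root material would additionally require the extraction volume and sample mass, which are not reported in this section.

```python
# Standard curve reported in the text: Y = 2,683,455.37 X - 1,660.17
SLOPE = 2_683_455.37
INTERCEPT = -1_660.17


def concentration_from_peak_area(peak_area: float) -> float:
    """Invert the linear standard curve: X = (Y - intercept) / slope.

    Assumes Y is peak area and X is GA concentration (units not
    specified in this section).
    """
    return (peak_area - INTERCEPT) / SLOPE


# Group mean GA contents (mg/g) as reported in the text.
group_means = {
    "WT": 3.3242,
    "NC-PCA": 3.5188,
    "NC-PHSE": 3.3459,
    "F3'H+": 2.3654,
    "CRTZ+": 2.2050,
    "COMT+": 2.3509,
    "F3'H-": 5.4744,
    "CRTZ-": 5.3651,
    "COMT-": 4.2746,
}


def ratio_to_wt(means: dict) -> dict:
    """Return each group's mean GA content divided by the WT mean."""
    wt = means["WT"]
    return {group: value / wt for group, value in means.items()}


ratios = ratio_to_wt(group_means)

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f} x WT")
```

Ratios below 1 for the overexpression groups and above 1 for the knockout groups restate the negative relationship described above; they do not replace the significance tests summarized in Table 4 and Fig. 6.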