Vector construction and Agrobacterium-mediated transformation of soybean
All cloning steps were carried out using the binary vector pTF101. The 1413-bp seed-specific promoter of the β-conglycinin alpha subunit (BCSP) was amplified from soybean as previously described and sub-cloned into the HindIII/XbaI sites of the compatible pTF101 plasmid. The genes encoding maize 27 kDa γ-zein (GenBank No. AF371261.1) and 13.56 kDa ELP proteins were synthesized and optimized based on soybean codon usage (GenScript USA Inc., NJ, USA) to remove cryptic splice sites and rare codons, and then inserted into the binary vector under the control of the BCSP promoter. The coding sequence of the green fluorescent protein (GFP) was then cloned upstream of the ELP sequence or downstream of the γ-zein sequence to generate the transformation vectors pTF101-GFP-ELP and pTF101-zein-GFP. In construct pTF101-GFP-ELP, the ER localization signal KDEL was added at the C-terminus of ELP, whereas in construct pTF101-zein-GFP, an enterokinase cleavage site (DDDDK) was placed between γ-zein and GFP (Fig. 1A). For the control vector pTF101-GFP, the GFP gene was cloned into the XbaI/SacI sites between BCSP and the nopaline synthase (NOS) terminator. All three vector constructs contained a phosphinothricin acetyltransferase selectable marker gene (bar) for the selection of glufosinate-resistant transgenic events. The soybean genotype P03, provided by Prof. Bao Peng from the Jilin Academy of Agricultural Sciences, China, was used for Agrobacterium-mediated transformation as previously described by Yang et al. (2017). The primary transformants were transferred into a greenhouse for subsequent molecular screening.
Molecular analysis of the transgenic plants
The T0 transgenic plants were first screened using the LibertyLink® strip according to the instructions of the manufacturer (EnviroLogix Inc., ME, USA). Polymerase chain reaction (PCR) was performed to confirm the presence of the transgenes using the primers BG-F1/BG-R1 (5′-CTCACCATCGCTCAACACATTTC-3′, 5′-TGTCTTGTAGTTCCCGTCGTCC-3′), which anneal with GFP and its upstream BCSP sequences, respectively. For the subsequent generations, the transgenic plants were further screened by both glufosinate (1500 mg/L) spraying and PCR analysis. Southern hybridization was conducted to determine transgene copy numbers in soybean plants. Total genomic DNA was extracted from the young leaves of T3 transgenic plants with a modified procedure using cetyltrimethylammonium bromide at a high salt concentration as described previously. Approximately 30 μg of genomic DNA was digested overnight with XbaI, separated on a 1% agarose gel, and transferred onto a positively charged nylon membrane (GE Amersham, USA) according to standard protocols. The 714-bp GFP probe was obtained by PCR amplification using the primers GFP-F1/GFP-R1 (5′-ATGAGTAAAGGAGAAGAACTTTTCA-3′, 5′-TTTGTATAGTTCATCCATGCCATGT-3′) and labeled using digoxigenin (DIG)-high prime (Roche, Germany). Hybridization, membrane washing, and subsequent chemical staining with BCIP/NBT were conducted as described previously.
Quantitative reverse transcription (qRT)-PCR
Forty-five days after flowering, immature transgenic soybean seeds were sampled to determine the relative expression of the GFP transcripts by qRT-PCR. Total RNA was extracted from the immature seeds (~10 mm in length) of T3 transgenic plants using the EasyPure Plant RNA Kit (TransGen Biotech, Beijing, China), according to the manufacturer’s protocol. Genomic DNA removal and first-strand cDNA synthesis were carried out using TransScript® One-Step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech). Amplifications were performed on an ABI 7900HT (Applied Biosystems, USA) using TransStart® Top Green qPCR SuperMix (TransGen Biotech). The reaction conditions were as follows: 50°C for 2 min, 95°C for 10 min, and 35 cycles of 95°C for 2 min, 60°C for 30 s, and 72°C for 30 s. The specific primers GFP-F2/GFP-R2 (5′-TACGTGCAGGAGAGGACCAT-3′, 5′-ACTTGTGGCCGAGGATGTTT-3′) were designed based on the GFP sequence. The constitutively expressed native soybean gene GmACT6 (GenBank No. NM_001289231) was amplified with the primers GmACT-F1/GmACT-R1 (5′-ATCTTGACTGAGCGTGGTTATTCC-3′, 5′-GCTGGTCCTGGCTGTCTCC-3′). The expression levels were calculated using the relative quantification (2^(−ΔCt)) method, with ΔCt taken as the Ct of GFP minus the Ct of the internal control GmACT6. All experiments were conducted with three biological replicates.
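The relative quantification step can be illustrated with a minimal Python sketch. It assumes ΔCt = Ct(GFP) − Ct(GmACT6) for each replicate; the function name and the Ct values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev

def relative_expression(ct_target, ct_reference):
    """Return 2^-(Ct_target - Ct_reference) for a single sample."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)

# Hypothetical Ct values for three biological replicates (illustration only).
gfp_ct = [22.0, 23.0, 21.0]   # target: GFP
act_ct = [20.0, 20.0, 20.0]   # internal control: GmACT6

# One relative expression value per replicate, then mean and SD across replicates.
rel = [relative_expression(t, r) for t, r in zip(gfp_ct, act_ct)]
rel_mean, rel_sd = mean(rel), stdev(rel)
```

Because the value is expressed relative to the reference gene only, no calibrator sample is needed, which is what distinguishes 2^(−ΔCt) from the 2^(−ΔΔCt) variant.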
Western blotting and GFP quantification analysis
The T3 transgenic lines carrying a low-copy (single or double) transgene insertion and showing similar transgene mRNA expression in immature seeds at 45 days after flowering were used for western blotting and enzyme-linked immunosorbent assay (ELISA). Immature seeds of similar size (~10 mm in length) were harvested from the transgenic plants expressing zein-GFP, GFP-ELP, or unfused GFP. Total soluble protein was extracted with extraction buffer (100 mM NaCl, 10 mM EDTA, 200 mM Tris-HCl, 0.05% Tween-20, 0.1% SDS, 14 mM β-mercaptoethanol, 400 mM sucrose, and 2 mM phenylmethanesulfonyl fluoride). Protein quantification was performed using the Bradford method. The protein samples were then separated by 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis and electrotransferred onto a PVDF membrane (GE Healthcare, USA). After blocking with 5% (w/v) fat-free milk powder, the membranes were incubated with monoclonal anti-GFP antibodies (1:500 dilution, Abcam, UK) for 2 h at room temperature. Antibody binding was then detected by the addition of horseradish peroxidase (HRP)-conjugated goat-anti-rabbit IgG (1:5000 dilution, Abcam). After extensive washing, the signal was visualized using the Biodlight™ Western Chemiluminescent HRP substrate (Bioworld Technology, Inc., USA). To quantify GFP, the protein samples from immature or mature soybean seeds were subjected to ELISA with purified GFP (Abcam) as the standard protein. The wells of the ELISA plates were coated with diluted protein extracts and standard protein in the coating buffer (15 mM Na2CO3, 35 mM NaHCO3, and 3 mM NaN3; pH 9.6). After blocking with 5% dried skimmed milk in phosphate buffer containing 0.1% Tween 20, the monoclonal anti-GFP antibodies (1:500 dilution) were added to the wells and incubated for 1 h at room temperature. The HRP-conjugated secondary antibodies were then added at a dilution of 1:5000 and incubated for 1 h. The optical density (OD) of each reaction mixture was measured at 450 nm (ELx800; BioTek, USA) with tetramethylbenzidine (TMB) as the substrate.
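Converting OD450 readings into GFP amounts relies on the purified GFP standard. The text does not describe the curve-fitting model, so the following Python sketch assumes a simple linear standard curve within the working range of the assay; the function names, standard amounts, and OD values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def gfp_from_od(od, slope, intercept):
    """Invert the standard curve to estimate GFP (ng per well) from OD450."""
    return (od - intercept) / slope

# Hypothetical purified-GFP standards (ng per well) and their OD450 readings.
std_ng = [0.0, 5.0, 10.0, 20.0, 40.0]
std_od = [0.05, 0.15, 0.25, 0.45, 0.85]

slope, intercept = fit_line(std_ng, std_od)

# Estimate the GFP amount in a sample well from its OD450 reading.
sample_ng = gfp_from_od(0.35, slope, intercept)
```

If the standards deviate from linearity at the high end, a sigmoidal fit (for example, a four-parameter logistic model) would be the usual alternative; only readings that fall within the range of the standards should be interpolated.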
Confocal laser scanning
Immature seeds were collected at 45 days after flowering and tissue sections were prepared by hand. The samples were immediately imaged using a Leica TCS SP2 inverted confocal laser scanning microscope (Leica Microsystems, Germany) equipped with a 63× water immersion objective. GFP fluorescence was excited with a 488-nm argon laser, and the emission was detected at 500–530 nm. The collected images were analyzed using Leica Application Suite for Advanced Fluorescence (LAS AF, V2.3.5; Leica Microsystems).
Protein and oil content analyses in transgenic seeds
Protein and oil contents in the transgenic and non-transformed soybean seeds were analyzed as described previously. Briefly, mature soybean seeds (200 mg) were ground in a grinder after drying at 60°C. The level of total nitrogen was determined using a LECO CHN 2000 analyzer (LECO, St. Joseph, MI) as described by Hwang et al. (2014). The seed protein content was calculated on a dry weight basis. The seed oil content was determined by pulsed NMR (Maran Pulsed NMR, Resonance Instruments, Oxfordshire, UK), followed by the field induction decay-spin echo procedure.
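The conversion from total nitrogen to protein content is not spelled out in the text. The Python sketch below assumes the conventional nitrogen-to-protein factor of 6.25 and an optional correction for residual moisture; the factor, the function name, and the example values are assumptions rather than details taken from the cited protocol.

```python
# Assumed conventional nitrogen-to-protein factor; not stated in the text.
N_TO_PROTEIN = 6.25

def protein_percent_dry_basis(nitrogen_percent, moisture_percent=0.0):
    """Convert measured % nitrogen to % crude protein on a dry weight basis.

    moisture_percent is the residual moisture of the ground sample; with
    fully dried material it is 0 and no correction is applied.
    """
    dry_fraction = 1.0 - moisture_percent / 100.0
    return nitrogen_percent * N_TO_PROTEIN / dry_fraction

# Hypothetical reading: 6.4% nitrogen in a fully dried sample.
protein_dry = protein_percent_dry_basis(6.4)

# Same reading, assuming 5% residual moisture in the ground sample.
protein_corrected = protein_percent_dry_basis(6.4, moisture_percent=5.0)
```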
Statistical analysis
All statistical analyses were performed using SPSS software v.17.0 (SPSS Inc., Chicago, IL, USA). Differences among sample means were tested by analysis of variance (ANOVA).
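The analyses themselves were run in SPSS. As an illustration of what a one-way ANOVA computes, the following Python sketch derives the F statistic from between-group and within-group sums of squares. It is a stand-in for explanation only, and the group values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical seed protein contents (%), three replicates per line.
groups = [
    [40.0, 41.0, 42.0],   # non-transformed control
    [41.0, 42.0, 43.0],   # transgenic line A (hypothetical)
    [44.0, 45.0, 46.0],   # transgenic line B (hypothetical)
]
f_stat, df_b, df_w = one_way_anova_f(groups)
```

The p-value follows from comparing F against the F distribution with (df_between, df_within) degrees of freedom; statistical packages report it directly (in Python, `scipy.stats.f_oneway` returns both the statistic and the p-value).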