Evaluation of three different vectors on human iLRP expression in Escherichia coli

iLRP (immature laminin receptor protein) is a tumor-associated antigen that is over-expressed on the surface of most human cancer cells and plays an important role in tumorigenesis and tumor development. It is strongly auto-immunogenic in patients and is a good target protein for tumor immunotherapy. To find a practical and efficient method for producing recombinant iLRP, three expression vectors based on pET-30a(+) from the pET expression system were constructed. The first, pET-His-iLRP, adds a 6xHis-Tag to the N-terminus of natural iLRP. The second, pET-His-S-iLRP, includes an S-Tag between the C-terminus of the 6xHis-Tag and the N-terminus of natural iLRP. The third, pET-His-Opt-iLRP, contains only the 6xHis-Tag in front of the N-terminus of iLRP, but the natural iLRP gene was first codon-optimized for the bacterial host. The expression of the three vectors in Escherichia coli (E. coli) BL21 (DE3) was then analyzed. iLRP expression was highest with pET-His-Opt-iLRP, followed by pET-His-S-iLRP, and lowest with pET-His-iLRP. This implies that combining the codon-optimized iLRP gene with a 6xHis-Tag gave the best expression of iLRP, and that expression from vectors carrying the natural iLRP gene depends on the leader sequence in the expression frame. It can therefore be concluded that the heterologous expression of human iLRP in a bacterial host can be greatly increased by modifying the gene composition and using proper protein tags, which lays the foundation for future research on its applications.


Introduction
Human immature laminin receptor protein (iLRP), the precursor of laminin receptor protein (LRP), is abundantly expressed in the early stages of embryonic development but gradually disappears after the middle and late stages. Its cDNA sequence originates from the highly conserved gene RPSA (40S ribosomal protein SA), which produces one mRNA transcript that expresses iLRP (Brassart et al. 2019). In normal adult tissues, its expression can hardly be detected (Barsoum and Schwarzenberger 2014). However, studies have confirmed that iLRP is re-expressed in most tumor tissues and plays an important role in the occurrence, development and metastasis of tumors, and that its expression is closely related to the degree of cancer progression (Coggin et al. 2004). Since its discovery, iLRP has been regarded as an important target in cancer immunotherapy research (Pesapane et al. 2017).
However, it is difficult to obtain enough pure protein to study its functions and clinical applications, because the natural tissues available for purification are limited. To overcome this obstacle, large-scale production by genetic engineering is urgently needed.
There are many factors to be considered for expressing recombinant proteins in nonnative hosts. The first consideration is to choose an appropriate expression system.
Usually, a eukaryotic expression system is the best choice for target genes from eukaryotes, and a prokaryotic expression system is the best choice for target genes from prokaryotes.
However, eukaryotic expression systems are more complex and more difficult to operate than prokaryotic ones for producing exogenous proteins, and their low yield and high cost make them unsuitable for rapid large-scale production (Yousefi-Rad et al. 2015). Therefore, most current methods for large-scale production of exogenous proteins use prokaryotic expression systems. Furthermore, prokaryotic expression is a mature technology with many advantages, such as simple operation, high yield, a short production cycle, readily available expression vectors, and simpler purification methods (Rosano and Ceccarelli 2014). Among the prokaryotic systems, the pET system developed by Novagen is the most successful; products made with this system are already on the market and have generated great economic value (Hou et al. 2006; Kalim et al. 2017). In this study, the expression of iLRP was analyzed in detail using this system, and the expression from three different vectors was compared. The three vectors differ as follows: pET-His-iLRP adds a 6xHis-Tag to the N-terminus of natural iLRP; pET-His-S-iLRP adds an S-Tag between the C-terminus of the 6xHis-Tag and the N-terminus of natural iLRP; and pET-His-Opt-iLRP adds a 6xHis-Tag to the N-terminus of codon-optimized iLRP. The results showed that the most suitable expression vector for mass production of recombinant iLRP is pET-His-Opt-iLRP, the fusion of codon-optimized iLRP and a 6xHis-Tag. With this vector, the product is easy to express in large quantities and to purify on a Ni2+ column, and the tag has little effect on the structure and function of the protein itself.

Chemicals, strains and plasmids
Terrific Broth (TB) and kanamycin for culturing bacterial strains were purchased from ThermoFisher Scientific, Chengdu, China. The PCR kit, restriction enzymes, and gel extraction kit were purchased from TaKaRa Co. Ltd., Beijing, China. The primary antibody, mouse anti-iLRP, and the secondary antibody, goat anti-mouse IgG conjugated with horseradish peroxidase (HRP), were bought from Abcam, Shanghai, China. Other molecular biology reagents were from Solarbio Science & Technology Co. Ltd., Beijing, China. All chemicals used in this study were of analytical grade or higher purity.
Escherichia coli DH5α was used as the host for recombinant DNA manipulation, and BL21(DE3) was used for gene expression. Competent cells of bacterial strains were purchased from ThermoFisher Scientific, Chengdu, China. Plasmids pUC57 and pET30a(+) for cloning and expression were from ChinaPeptides, Shanghai, China. The plasmid pENTER containing the natural human iLRP cDNA was bought from Vigene, Shandong, China. All hosts and plasmids were stored at -80 °C in our lab.

Codon optimization
Codon optimization has become a valuable tool for producing proteins as therapeutic agents or research reagents in heterologous hosts. With the availability of large amounts of genomic data and increased knowledge of the relationships among protein expression, function and structure, gene expression levels can be significantly improved by this technique (Gao et al. 2015; Song et al. 2014). When applying the technique, a variety of key elements involved in different stages of protein expression, such as GC content, codon adaptability, mRNA structure, and various cis-elements in transcription and translation, should be carefully taken into consideration (Gvritishvili et al. 2010; Hanson and Coller 2018). In this study, the human iLRP gene (NM_002295) was optimized with respect to these elements using a commercial proprietary algorithm, NG® Codon Optimization Technology (Synbio Technologies, NJ, USA), in order to analyze its expression in Escherichia coli.
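Two of the quantities named above, GC content and codon adaptability, can be illustrated with a short calculation. The sketch below is not the proprietary NG® algorithm; it assumes the usual definition of the codon adaptation index (CAI) as the geometric mean of per-codon relative adaptiveness weights, and the weight table and the two six-base sequences are hypothetical values chosen only for illustration.

```python
import math

def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def cai(seq, weights):
    """Geometric mean of the relative adaptiveness weights of the codons.

    Codons that are absent from `weights` (for example Met, Trp or stop
    codons) are skipped, as is any incomplete trailing codon.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))

# Hypothetical weights (1.0 = preferred codon); NOT measured E. coli values.
TOY_WEIGHTS = {"GGT": 1.0, "GGA": 0.25, "CTG": 1.0, "CTT": 0.25}

native = "GGACTT"     # Gly-Leu written with less preferred codons (toy example)
optimized = "GGTCTG"  # the same dipeptide written with preferred codons

print(gc_content(native), cai(native, TOY_WEIGHTS))
print(gc_content(optimized), cai(optimized, TOY_WEIGHTS))
```

In this toy case the amino acid sequence is unchanged while the CAI rises, which is the behaviour codon optimization aims for; a real analysis would use weights derived from highly expressed genes of the host.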

Construction of expression vectors
Plasmid pET-30a(+) (Novagen) was used as the backbone for constructing the vectors in this study. The vector pET-His-iLRP was constructed by inserting the natural iLRP gene between the NdeI and XhoI restriction sites. The natural iLRP gene was amplified by PCR using the plasmid pENTER (Vigene, Shandong, China) containing the human iLRP gene (Vania et al. 2018) as the template. To add an NdeI restriction site and a 6xHis-Tag to the N-terminus of iLRP, the 5'-primer, synthesized by Sangon Biotech, Shanghai, China, had the sequence 5'-ATACATATGCACCATCATCATCATCATTCCGGAGCCCTTGATG-3'. Similarly, an XhoI restriction site was added to the 3'-primer, which had the sequence 5'-CCGCTCGAGTTAAGACCAGTCAGTGGTTGC-3'. To construct the vector pET-His-S-iLRP, the BglII and XhoI restriction sites were used for cloning so that the original 6xHis-Tag and S-Tag of the plasmid were retained. The 5'-primer for amplifying the iLRP gene from pENTER was therefore designed as 5'-ACGAGATCTTCCGGAGCCCTTGATG-3', and the 3'-primer was the same as above. Both PCRs were performed under the following conditions: initial denaturation at 98°C for 2 min; 30 cycles of denaturation at 98°C for 40 s, annealing at 58°C for 40 s, and extension at 72°C for 1 min; a final extension at 72°C for 4 min; and a hold at 4°C. For pET-His-Opt-iLRP, the codon-optimized iLRP gene, with an NdeI restriction site and the 6xHis-Tag at its 5'-end and an EcoRV site at its 3'-end, was synthesized by ChinaPeptides, Shanghai, China, and then cloned into pET-30a(+) between NdeI and EcoRV. All constructed vectors were verified by sequencing and restriction enzyme digestion.
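The primer design can be checked with a few lines of code. The sketch below uses the three primer sequences given above, the standard genetic code, and the well-known recognition sequences of NdeI (CATATG), XhoI (CTCGAG) and BglII (AGATCT); the function and variable names are illustrative and are not part of the original work.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG", "BglII": "AGATCT"}

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons from position 0, stopping at a stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

FWD_HIS = "ATACATATGCACCATCATCATCATCATTCCGGAGCCCTTGATG"  # pET-His-iLRP 5'-primer
FWD_S = "ACGAGATCTTCCGGAGCCCTTGATG"                      # pET-His-S-iLRP 5'-primer
REV = "CCGCTCGAGTTAAGACCAGTCAGTGGTTGC"                   # shared 3'-primer

# Reading frame that starts at the ATG inside the NdeI site.
his_orf = translate(FWD_HIS[FWD_HIS.index("ATG"):])
# Coding-strand view of the 3'-primer: last codons of iLRP, then the stop codon.
c_term = translate(revcomp(REV))

print(his_orf)  # MHHHHHHSGALD
print(c_term)   # ATTDWS
```

The first translation shows the start codon followed by six histidine codons and then the first residues of iLRP, and the second shows that the 3'-primer ends the reading frame with a stop codon placed directly before the XhoI site.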

Protein expression
The above vectors were transformed into Escherichia coli BL21 (DE3) (Novagen) for small-scale analysis of protein expression. Briefly, on day 1, a fresh TB agar plate containing 50 μg/ml kanamycin was inoculated with the bacterial stock for each vector. On day 2, well-isolated single colonies were picked from the plates and cultured overnight in test tubes with 3 ml TB containing 50 μg/ml kanamycin, shaking at 200 rpm and 37°C. On the morning of day 3, the overnight cultures were transferred to 500 ml flasks containing 100 ml fresh TB with 50 μg/ml kanamycin and shaken at 200 rpm and 37°C until the optical density at 600 nm (OD600) reached 1.0. At this point, a small aliquot of each culture was collected as a negative control. The rest of each culture was then induced by adding isopropyl-thio-β-D-galactoside (IPTG) to a final concentration of 1.0 mM and shaken under the same conditions until the OD600 of every culture reached 6.0. Cells were harvested by centrifugation at 5000 rpm for 10 min at 4°C and stored at -80°C.
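The additions in this protocol follow the usual dilution relation C1 × V1 = C2 × V2. As a rough illustration, the sketch below computes how much stock solution gives the stated final concentrations in a 100 ml culture. The stock concentrations (1 M IPTG and 50 mg/ml kanamycin) are assumptions for the example and are not reported in this study, and the small volume of added stock is neglected.

```python
def stock_volume_ul(final_conc, stock_conc, culture_ml):
    """Volume of stock in microlitres from C1 * V1 = C2 * V2.

    `final_conc` and `stock_conc` must share one unit; the volume of the
    added stock itself is neglected, which is a close approximation when
    the stock is far more concentrated than the final solution.
    """
    return final_conc * culture_ml * 1000.0 / stock_conc

# 1.0 mM IPTG in 100 ml from an ASSUMED 1 M (1000 mM) stock.
iptg_ul = stock_volume_ul(final_conc=1.0, stock_conc=1000.0, culture_ml=100.0)

# 50 ug/ml kanamycin in 100 ml from an ASSUMED 50 mg/ml (50000 ug/ml) stock.
kan_ul = stock_volume_ul(final_conc=50.0, stock_conc=50000.0, culture_ml=100.0)

print(iptg_ul, kan_ul)  # 100.0 100.0
```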

SDS-PAGE and Western Blotting
The cell pellets harvested above were divided into two parts. One part was lysed directly to prepare whole-cell lysate samples for SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis), and the other was used to prepare inclusion bodies.
The cell pellets were washed twice with ice-cold phosphate-buffered saline (PBS, pH 7.2) and resuspended in 5 ml of PBS. The suspended cells were disrupted by sonication with alternating 20 s pulse-on and 20 s pulse-off intervals using an ultrasonicator (Ningbo Scientz Biotechnology Co. LTD, China). The inclusion bodies were then isolated by centrifugation at 12,000 rpm for 20 min at 4°C. After two washes with 20 ml ice-cold PBS, the inclusion bodies were dissolved in 5 ml of solubilization buffer (10 mM Tris-HCl, 100 mM sodium phosphate, 6 M guanidine-HCl, 2 mM 2-mercaptoethanol, pH 8.0). After centrifugation at 12,000 rpm for 20 min, the total protein in the supernatant was determined with a BCA protein assay kit (ThermoFisher Scientific, Chengdu, China), and the supernatant was then used to prepare samples for SDS-PAGE, with an equal amount of protein loaded onto the gel for each sample. All samples for SDS-PAGE were heated in 1x reducing loading buffer at 98°C for 10 min and subjected to 12% denaturing SDS-PAGE. Gels were stained with Coomassie Brilliant Blue R-250 (ThermoFisher Scientific, Chengdu, China) to identify the expected protein bands. For Western Blotting, proteins were transferred from the gels to a nitrocellulose membrane, which was blocked with 2.5% (w/v) skimmed milk powder in PBS (pH 7.2) for 2 h at room temperature (RT) and then incubated with mouse anti-iLRP monoclonal antibody (Abcam, Shanghai, China) at a dilution of 1:5,000 at 4°C overnight. The next day, the membrane was washed 3 times and then incubated with a 1:10,000 dilution of horseradish peroxidase (HRP)-conjugated anti-mouse IgG secondary antibody at RT for 2 h. After this incubation, the membrane was washed 5 times and then incubated with enhanced chemiluminescence (ECL) reagent (ThermoFisher Scientific, Chengdu, China) for 5 min in the dark. Protein bands were detected with a ChemiDoc™ MP Imaging System (Bio-Rad Laboratories Co. LTD, Shanghai, China).
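Loading an equal amount of protein per lane amounts to dividing the target mass by each sample's measured concentration. The sketch below shows that calculation; the BCA concentrations and the 20 μg target are hypothetical values for illustration only and are not data from this study.

```python
def loading_volumes(conc_mg_per_ml, target_ug):
    """Volume (in microlitres) of each sample that contains `target_ug` of protein.

    A concentration in mg/ml is numerically equal to ug/ul, so the volume in
    microlitres is simply mass (ug) divided by concentration (ug/ul).
    """
    return {name: target_ug / conc for name, conc in conc_mg_per_ml.items()}

# HYPOTHETICAL BCA readings in mg/ml, one per solubilized inclusion-body sample.
bca = {"pET-His-iLRP": 2.0, "pET-His-S-iLRP": 4.0, "pET-His-Opt-iLRP": 5.0}

volumes = loading_volumes(bca, target_ug=20.0)
for name, ul in volumes.items():
    print(f"{name}: load {ul:.1f} ul")
```

Because every lane then carries the same total protein, differences in band intensity between lanes can be read as differences in the share of the target protein.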

Gene source
The iLRP DNA sequence used for vector construction and gene optimization was obtained from the NCBI Nucleotide database (www.ncbi.nlm.nih.gov) under accession number NM_002295; the record also includes the coding sequence for the amino acids. The optimized iLRP DNA sequence has been deposited in the publicly available GenBank database under accession number MN927188.

Codon optimization
Codon optimization is frequently used to avoid rarely used codons, simplify the secondary structures of the transcribed mRNA, remove motifs that hinder efficient expression while adding useful ones, and adjust GC content (Gao et al. 2015; Hanson and Coller 2018). To meet the requirements for expressing the human iLRP gene in Escherichia coli (E. coli), the GC content, codon adaptation index (CAI), mRNA structure and cis-acting elements of the gene were optimized with NG® Codon Optimization Technology. Optimization produced a smoother GC content curve (Fig. 1a) and reduced the number of stable hairpin structures from 11 to 1, which could promote more efficient translation by ribosomes. In addition, the codon usage frequency and distribution became more consistent with those of the E. coli genome. After rare codons were replaced, the CAI increased from 0.60 to 0.84 (Fig. 1b). Moreover, negative cis-acting elements and killer sequences were eliminated. Lastly, the gene optimization model computed an integrated score that is negatively related to the gene expression level. The score of the wild-type iLRP gene was 1632435 and that of the optimized gene was 776954, so the optimized score was about 48% of the wild-type score (a decrease of about 52%). As a result of codon optimization, the human iLRP gene sequence was changed to meet the needs of expression in E. coli, while the amino acid sequence remained unchanged. The alignment of the wild-type and optimized sequences clearly showed that the two sequences encode the same protein (Fig. 1c).
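The relative changes implied by the reported scores and CAI values can be worked out directly. The short calculation below uses only the numbers given in this section; the variable names are illustrative.

```python
# Integrated scores reported by the optimization model (lower is better).
wild_type_score = 1632435
optimized_score = 776954

ratio = optimized_score / wild_type_score  # optimized score as a fraction of wild type
decrease = 1 - ratio                       # relative decrease in the score

# Codon adaptation index before and after optimization.
cai_fold = 0.84 / 0.60

print(f"optimized / wild-type score: {ratio:.3f}")   # 0.476
print(f"relative decrease:           {decrease:.3f}")  # 0.524
print(f"CAI fold change:             {cai_fold:.2f}")  # 1.40
```

Read this way, the optimized score is a little under half of the wild-type score, and the CAI rises by a factor of 1.4.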

Construction of expression vectors
The vectors used to express iLRP in this study are all derived from plasmid pET30a(+), a member of the pET System, which is the most powerful system yet developed for the cloning and expression of recombinant proteins in E. coli. Target genes are cloned into pET plasmids under the control of strong bacteriophage T7 transcription and translation signals. The native human iLRP gene with a 6xHis-Tag was cloned between NdeI and XhoI to form the plasmid pET-His-iLRP. The optimized human iLRP gene with a 6xHis-Tag, however, was cloned between NdeI and EcoRV to form the plasmid pET-His-Opt-iLRP, because the optimized gene sequence contains an XhoI site but no EcoRV site. For the plasmid pET-His-S-iLRP, the BglII and XhoI restriction sites were used so that the original His-Tag, thrombin protease cleavage site and S-Tag were included in the expression frame. The resulting vectors were confirmed by both restriction digestion and sequencing. The structures of the vectors and the differences among them are shown in Fig. 2.
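The choice of EcoRV over XhoI for the optimized gene comes down to checking which candidate enzymes cut inside the insert. A minimal sketch of such a check is given below. The recognition sequences are the standard ones for these enzymes, but the insert is a short hypothetical sequence that merely contains an XhoI site; it is not the deposited optimized iLRP sequence (MN927188).

```python
# Recognition sequences; all four are palindromic, so scanning one strand suffices.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG", "EcoRV": "GATATC", "BglII": "AGATCT"}

def internal_cutters(insert, sites=SITES):
    """Names of enzymes whose recognition site occurs inside the insert."""
    insert = insert.upper()
    return sorted(name for name, site in sites.items() if site in insert)

def usable_enzymes(insert, sites=SITES):
    """Enzymes that leave the insert intact and can therefore flank it."""
    cutters = set(internal_cutters(insert, sites))
    return sorted(name for name in sites if name not in cutters)

# HYPOTHETICAL insert fragment containing an internal XhoI site (CTCGAG).
toy_insert = "ATGTCTGGTGCTCTCGAGGTTCTGCAA"

print(internal_cutters(toy_insert))  # ['XhoI']
print(usable_enzymes(toy_insert))    # ['BglII', 'EcoRV', 'NdeI']
```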

Analysis of protein expression by SDS-PAGE and Western Blotting
After transformation with the constructed vectors, transformants were cultured in TB (Terrific Broth) with kanamycin, and expression of the recombinant proteins was induced with IPTG. First, SDS-PAGE was performed to check expression in the whole-cell lysates. The vector pET-His-Opt-iLRP gave the highest level of recombinant protein and pET-His-iLRP the lowest, while pET-His-S-iLRP gave a high level that was slightly lower than that of pET-His-Opt-iLRP (Fig. 3a).
According to a published report (Barsoum et al. 2009) and our preliminary data, the human iLRP gene is expressed in E. coli as inclusion bodies. Inclusion bodies were therefore prepared for analysis of protein expression by SDS-PAGE, and the same results were obtained (Fig. 3b). To verify the identity of the expressed proteins, Western Blotting was carried out with the anti-iLRP monoclonal antibody purchased from Abcam, Shanghai, China. The target bands appeared on the nitrocellulose membrane as expected, with the darkest band from the induced vector pET-His-Opt-iLRP, a lighter band from the induced vector pET-His-S-iLRP, and the weakest band from the induced vector pET-His-iLRP (Fig. 3c).

Discussion
It has been reported that high expression of the 37 kDa iLRP in human cancers is closely related to poor prognosis. iLRP is also a key factor in tumor cell proliferation, survival and protein translation, and plays an important role in tumor invasion and metastasis (Poon et al. 2011). More importantly, other researchers have reported that iLRP induces a strong, specific immune response in cancer patients with iLRP overexpression, which implies that iLRP could be an ideal target for tumor immunotherapy (Friedrichs et al. 2008; Rohrer et al. 2006). To advance research on this tumor antigen, it must be produced on a large scale. To this end, the pET system, the most powerful system yet developed for heterologous protein expression, was used to explore a practical method for iLRP production. Three different expression vectors were constructed, and their expression was compared and analyzed.
In designing these three vectors, the main factors taken into account were the possible influence of protein tags on expression of the target gene and the structure of the target gene itself. The 6xHis-Tag, one of the most frequently used protein tags, is included in every vector in this study. The tag is simple in structure, consisting of only six histidines, and does not need to be removed from the final product; it is also very convenient for follow-up research on the engineered protein. It has a strong affinity for Ni2+, which makes purification of the overexpressed protein simple on a Ni2+-affinity column (Zhao and Huang 2016). In addition, the expressed product can be easily verified and detected with monoclonal antibodies against the 6xHis-Tag, which are widely available commercially (Yin et al. 2014). Therefore, each vector was constructed with the His-Tag at the N-terminus of the fusion protein. In certain cases, it is practical to add other kinds of protein tags for heterologous protein expression in E. coli, such as the S-Tag. The S-Tag is an oligopeptide derived from bovine pancreatic ribonuclease A; it consists of 15 amino acids and is rich in charged and polar residues (Raines et al. 2000). This tag has been studied extensively: it can increase the expression yield and improve the solubility of the target protein (Kobayashi et al. 2012), it has little effect on the structure and function of the recombinant protein, and it is only weakly immunogenic (Zhang et al. 2005). It binds to its ligand, S-protein, with high specificity and affinity, which has led to simple protein purification techniques using commercially available S-protein affinity columns (Zhao et al. 2013). In view of these merits, a vector containing both the His-Tag and the S-Tag was designed for comparison with the other expression vectors.
Comparing pET-His-S-iLRP with pET-His-iLRP, the former clearly had a much higher expression level than the latter (Fig. 3). Since the only difference between the two vectors is the presence or absence of the S-Tag, this result confirmed that the S-Tag had a critical influence on the expression of iLRP, as expected. However, the extended sequence in front of the target protein could remain a concern for future applications, and in most cases it must be removed even though it greatly enhances expression (Zhang et al. 2018). This extra step adds uncertainty to the final production of the recombinant protein. Other methods with similar or even better effects on expression of the target gene are therefore worth developing, and accordingly a vector carrying the optimized iLRP gene was constructed. To the best of our knowledge, gene structure is, besides the addition of protein tags, another crucial factor in the expression of an engineered gene in a host. This was borne out by the large differences in expression level among the three vectors (Fig. 3). Although pET-His-iLRP and pET-His-Opt-iLRP encode the same amino acid sequence for the target protein iLRP, their expression levels were very different. The vector pET-His-iLRP expressed too little recombinant iLRP to be detected by Coomassie Brilliant Blue staining after SDS-PAGE, and the protein was barely detectable even by Western Blotting. In contrast, expression increased substantially with the vector pET-His-Opt-iLRP, which carries the iLRP gene sequence optimized for the E. coli host.
Clearly, the composition of the target gene sequence had a tremendous impact on the expression level. According to Hanson and Coller (2018), heterologous expression of genes containing rare codons is likely to exhaust the endogenous pools of the corresponding tRNAs, leading to growth inhibition, premature termination of transcription and/or translation, decreased mRNA stability, and increased frameshifts, deletions and misincorporations. Through gene optimization, these factors can be readily adjusted to meet the expression requirements of the host.
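A simple way to see where rare codons sit in a coding sequence is to scan it in frame. The sketch below does this for the first codons of native iLRP as they appear in the 5'-primers of this study, with a start codon prepended. The set of rare codons is an assumption: it is a commonly cited list for E. coli, not one taken from this study, and the exact set depends on the reference used.

```python
# Codons commonly cited as rare in E. coli (ASSUMED list for illustration).
RARE_ECOLI = {"AGA", "AGG", "ATA", "CTA", "CGA", "CGG", "CCC", "GGA"}

def rare_codon_positions(cds, rare=RARE_ECOLI):
    """Return (codon index, codon) for every in-frame rare codon in `cds`."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    return [(i, codon) for i, codon in enumerate(codons) if codon in rare]

# ATG followed by the iLRP-specific part of the 5'-primers (TCC GGA GCC CTT GAT).
native_start = "ATGTCCGGAGCCCTTGAT"

print(rare_codon_positions(native_start))  # [(2, 'GGA')]
```

Only in-frame codons are counted: the out-of-frame CCC that spans GCC and CTT is ignored, because the ribosome never reads it as a codon.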

In this case, the genetic composition of iLRP was changed considerably by adjusting the GC content, improving the CAI, simplifying the mRNA structure, and lowering the integrated score that is negatively correlated with protein expression (Fig. 1). Taken together, these changes reasonably account for the great enhancement of iLRP expression from pET-His-Opt-iLRP compared with the native gene. As for the vector pET-His-S-iLRP, its high level of expression might be attributed to the fusion sequence in front of iLRP. This sequence consists of a 6xHis-Tag, a thrombin protease cleavage site and an S-Tag, about 84 base pairs in total, which is close to 10% of the length of the iLRP gene. The constituent sequences of the pET system are generally understood to have been optimized to ensure the best expression of heterologous proteins. It therefore seems logical that the longer the host-adapted DNA sequence included in the expression frame, the higher the expression. This reasoning agrees well with the results of this study (Fig. 3). In pET-His-iLRP, the His-Tag is only 18 base pairs, about 2% of the length of the iLRP gene, which has hardly any influence on expression, so the expression level was not improved. This is also consistent with data from other researchers (Barsoum et al. 2009). In their study, a Tobacco Etch Virus (TEV) protease site was added to the 5'-end of the human iLRP gene; together with the His-Tag, about 51 extra base pairs optimized for expression in E. coli were included in the expression frame. They therefore also obtained good expression of recombinant iLRP even with the native iLRP gene. From this point of view, it is to be expected that the optimized iLRP gene in pET-His-Opt-iLRP gave the highest expression level, followed by pET-His-S-iLRP, with the lowest expression from pET-His-iLRP.
Comparison of the three constructed vectors with different components for iLRP production showed that the construct with an entirely optimized expression frame performed best, and that the construct with a longer optimized fragment gave more product than the construct with a shorter optimized fragment.
In addition, proper protein tags can also have an important influence on the expression level of the target gene. These findings illustrate that the yield of a recombinant protein in a heterologous host can be significantly improved by adapting the gene composition to the host genome, either through codon optimization or with a proper leader sequence. Broadening research on the functions and clinical applications of iLRP requires large amounts of recombinant product, and this study provides a new way to improve the production of iLRP in E. coli.
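The leader-length percentages quoted in this paragraph can be reproduced with a short calculation. The leader lengths are those stated in the text; the length of the iLRP coding sequence is an assumption (295 codons plus a stop codon, 888 base pairs), since the text does not give it explicitly.

```python
# ASSUMPTION: iLRP coding sequence of 295 codons plus a stop codon.
ILRP_CDS_BP = 888

# Host-adapted leader lengths in base pairs, as stated in the text.
leaders_bp = {
    "pET-His-iLRP (6xHis-Tag)": 18,
    "pET-His-S-iLRP (His-Tag, thrombin site, S-Tag)": 84,
    "Barsoum et al. 2009 (His-Tag and TEV site)": 51,
}

# Leader length as a percentage of the iLRP gene length.
percent = {name: 100.0 * bp / ILRP_CDS_BP for name, bp in leaders_bp.items()}

for name, value in percent.items():
    print(f"{name}: {value:.1f}%")
```

Under this assumption the three leaders come to about 2.0%, 9.5% and 5.7% of the gene length, in line with the "close to 2%" and "close to 10%" figures used in the argument.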

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The dataset supporting the conclusions of this article is available in the GenBank repository under accession number MN927188 (https://www.ncbi.nlm.nih.gov/WebSub/?tool=genbank&form=history).

Comparison among constructed vectors. The vector pET-His-iLRP contains the native human iLRP gene sequence, which was inserted into pET30a(+) between NdeI and XhoI. The His-Tag sequence is from the parental vector pET30a(+); therefore, the percentage of inherent composites from the host is close to 2% in the iLRP expression frame. The vector pET-His-Opt-iLRP harbors the optimized iLRP gene sequence, which was inserted into the same plasmid between NdeI and EcoRV. In this vector, the full length of the expression structure matches the host for expressing the target gene. The vector pET-His-S-iLRP was constructed by inserting the native iLRP gene sequence into the plasmid between BglII and XhoI in order to make use of its intrinsic His-Tag and S-Tag.
Therefore, the percentage of inherent composites from the host increased up to