Optimization of Expression, Purification and Secretion of Functional Recombinant Human Growth Hormone in Prokaryotic Hosts Using Modified Staphylococcal Protein A Signal Peptide

Garshasb Rigi, Shahrekord University
Amin Rostami, National Institute of Genetic Engineering and Biotechnology
Habib Ghomi, National Institute of Genetic Engineering and Biotechnology
Gholamreza Ahmadian (ahmadian@nigeb.ac.ir), National Institute of Genetic Engineering and Biotechnology
Vasiqe Sadat Mirbagheri, National Institute of Genetic Engineering and Biotechnology
Meisam Jeiranikhamenh, National Institute of Genetic Engineering and Biotechnology
Majid Vahed, Shahid Beheshti University of Medical Sciences
Sahel Rahimi, National Institute of Genetic Engineering and Biotechnology


Introduction
Human growth hormone (hGH), or somatotropin, is a peptide hormone released from the pituitary gland. Several forms of this hormone exist, of which the 191-amino-acid form is predominant [1]. As a multifunctional hormone, hGH plays various roles in cells. For example, it can inhibit glycolysis and subsequently exert a direct effect on protein synthesis [2], and it increases the absorption and retention of calcium, magnesium and phosphate ions in the body. Disorders of hGH secretion in childhood or adolescence can lead to a variety of diseases, including gigantism, acromegaly, and dwarfism [2,3]. hGH has been used clinically since 1985 to treat a variety of hGH-related disorders in children and adults, including Prader-Willi syndrome, chronic renal insufficiency, Turner syndrome, AIDS-related wasting, and fat accumulation associated with lipodystrophy in adults [4,5]. It may also be associated with some metabolic complications and even diabetes mellitus [6]. Recombinant therapeutic proteins have received a great deal of attention in recent years because of their advantages, including few side effects, minimal cytotoxicity, high selectivity, and very low non-specific interactions [7,8]. These products must be sufficiently stable, both physically and chemically. Physical instability refers to changes in the three-dimensional structure [9], while chemical instability refers to any chemical change in a protein that involves the formation of a bond and yields a new chemical entity [10]. Proper folding of therapeutic proteins is essential for their function [10]. Recent advances in genetic engineering have made it possible to use microorganisms for the expression of heterologous proteins.
In this approach, heterologous proteins are either expressed intracellularly in the cytoplasm or secreted extracellularly [13]. In the first method, a methionine residue, encoded by the start codon, is necessarily added to the amino terminus. This methionine has been shown to stimulate the immune system against heterologous proteins, which is a serious concern for pharmaceutical proteins. Additional methods, such as treatment with different peptidases, must therefore be used to remove it, and this imposes an additional step on the system [14].
Furthermore, intracellular expression can lead to the formation of inclusion bodies that lack function [11]. Therefore, secretion of heterologous proteins into the extracellular space could be a more appropriate choice for obtaining an active, endotoxin-free form of the protein. This can be achieved using secretory signal peptides, which play a vital role in directing the target protein into the periplasm and the extracellular medium [12]. Various strategies have been used to increase the extracellular secretion of hGH in E. coli, including physical and chemical methods such as osmotic shock, freeze-thaw cycles, lysozyme treatment, and chloroform shock, each with its own problems. In the present study, recombinant hGH was expressed and purified from a prokaryotic host with a very low level of endotoxin contamination and without an N-terminal methionine. For this purpose, the signal peptide from Staphylococcus aureus protein A (SpA) was modified, redesigned based on the Sec secretory system, and linked to the coding region of the mature form of hGH. Finally, expression and secretion were optimized using Response Surface Methodology.

Optimization of codon usage
The sequence of the optimized hGH gene was deposited in GenBank under accession number MT321110. Rare codons that could reduce translation efficiency were adapted to the codon usage of the E. coli expression system. The Codon Adaptation Index (CAI) was improved from 0.85 to 0.87. GC content was also optimized to prolong the mRNA half-life and increase its stability, and the stem-loop structures that blocked ribosome binding were removed (Fig. 1A and Fig. 1B). A frequency of optimal codons (FOP) of 68 was attained after optimization (Fig. 1C and Fig. 1D). The ideal range of GC content is 30-70%. GC content adjustment resulted in an average of 50.59% after optimization (Fig. 1E and Fig. 1F).
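The GC-content criterion described above can be checked with a few lines of code. The following Python sketch is illustrative only: the function names, the window size, and the short test sequence are assumptions made for this example, not the deposited MT321110 sequence or the software used in this study.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


def gc_windows(seq: str, window: int = 60, step: int = 30) -> list:
    """GC percentage in sliding windows, to spot local GC-rich or GC-poor stretches."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    last_start = max(len(seq) - window, 0)
    return [gc_percent(seq[i:i + window]) for i in range(0, last_start + 1, step)]


def within_range(value: float, low: float = 30.0, high: float = 70.0) -> bool:
    """Check a GC percentage against the 30-70% range quoted in the text."""
    return low <= value <= high


if __name__ == "__main__":
    toy_sequence = "ATGGCTCAGGCGTTCCCGACCATTCCGCTGTCTCGT"  # placeholder, not hGH
    overall = gc_percent(toy_sequence)
    print(f"overall GC: {overall:.2f}%  in range: {within_range(overall)}")
    print("windowed GC:", gc_windows(toy_sequence, window=12, step=6))
```

A windowed profile is included because an acceptable overall average can still hide short GC-rich stretches, which are the regions most likely to form stable stem-loops.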

Design of secretory signal peptide
The designed secretory signal peptide includes three domains. Domain A consists of a region of positively charged amino acids. Domain B consists of a stretch of hydrophobic amino acids, and domain C contains the conserved site for signal peptidase cleavage. The sequence of the natural signal and that of the signal designed according to the general structure of natural Sec-dependent signal sequences are shown in Fig. 2. The altered amino acids are underlined, and the alignment of the two signals shows their amino acid differences.
Our initial studies showed that the natural signal peptide of protein A is unable to secrete hGH. Therefore, the natural secretory signal was modified and optimized. Our results show that, in contrast to the natural signal peptide, the modified one is functional in secreting hGH. The modified signal peptide is able to secrete recombinant hGH into both the periplasmic space and the culture medium. As can be seen in Fig. 2, a series of amino acid changes were made in different regions of the signal peptide, including the N, H, and C regions, which increased the ability of the altered signal peptide to secrete growth hormone out of the cell.
For example, replacing leucine with arginine in the modified signal peptide has given the N domain a more positive charge and increased its ability to enter the negatively charged cell membrane. Another important change is seen in the H domain, in which isoleucine replaces threonine and alanine replaces serine, which, as shown in Fig. 5, increases the hydrophobicity of the H region.

Designed gene and tertiary structure
The schematic view of the pET26 plasmid and the designed gene structure is shown in supplementary data file 1. The mutant signal peptide (jei36c) has two arginine residues, R10 and R12, while the native signal peptide has one arginine residue, R10 (Fig. 3).

Signal peptide functional analysis
The results of likelihood score-based predictions for the native and modified (jei36c) signal peptides are shown in Table 3. For prediction with signal peptide (SP) and tail anchor (TA) datasets, scores were calculated from the probability values of the TA and SP models (Table 3 and Fig. 4).

Hydrophobicity characteristics of the designed signal peptide
The ProtScale online software was used to evaluate the hydrophobicity of the designed signal peptide in comparison with the native signal peptide [13] (Fig. 5). The hydrophobicity and hydrophilicity profiles of the two signal peptides, and the changes in their different domains, are shown in Fig. 5 (panels E and B). Comparison of the profiles shows that they differ significantly between the two signal peptides.
The changes in the mutant signal peptide increased the hydrophilicity of the N region and, conversely, the hydrophobicity of the H region, and ultimately facilitated the secretion of hGH out of the cell. Taken together, these changes increase the efficiency of this signal peptide in secreting hGH and optimize it for the secretion of growth hormone through the Sec pathway.
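The kind of profile compared above can be sketched as a sliding-window average over a hydropathy scale. The example below assumes the Kyte-Doolittle scale, which is one of the scales ProtScale offers; the scale and window actually used for Fig. 5 are not stated here. The window size and the short toy peptides are placeholders, not the native or jei36c sequences.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(peptide: str, window: int = 5) -> list:
    """Average hydropathy in each sliding window along the peptide."""
    peptide = peptide.upper()
    if window < 1 or window > len(peptide):
        raise ValueError("window must be between 1 and the peptide length")
    scores = [KYTE_DOOLITTLE[aa] for aa in peptide]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def mean_hydropathy(peptide: str) -> float:
    """Mean hydropathy over the whole peptide (a GRAVY-style score)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


def net_charge(peptide: str) -> int:
    """Crude net charge: +1 for K/R, -1 for D/E (histidine and termini ignored)."""
    peptide = peptide.upper()
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


if __name__ == "__main__":
    # Hypothetical fragments mimicking the substitution types named in the text
    print("N region, L -> R:", net_charge("KKLN"), "->", net_charge("KKRN"))
    print("H region, T -> I and S -> A:",
          round(mean_hydropathy("LTLSV"), 2), "->", round(mean_hydropathy("LILAV"), 2))
```

On this scale, threonine-to-isoleucine and serine-to-alanine substitutions both raise the local hydropathy, and a leucine-to-arginine substitution adds one positive charge, which matches the direction of the changes described for the H and N regions.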

Stability of mRNA
To evaluate the stability of the transcribed mRNAs, the mRNA sequences of hGH fused to the native and to the designed signal peptide were submitted to the mfold online server (http://unafold.rna.albany.edu/?q=mfold/RNA-Folding-Form) [14]. During codon optimization, an attempt was made to remove secondary structures that attenuate or stop mRNA translation and, as shown in Fig. 6, no translation-inhibiting structure is seen in the start region.
Test condition B consisted of induction of expression with 7% lactose and 1% glycine once the bacterial culture reached an OD of 1.5, followed by incubation at 25 °C. In this experiment the sampling time was also reduced to eight hours. Under these conditions, the supernatant from the bacterial culture was centrifuged at 11000×g for 5 min, and the proteins in the resulting supernatant were precipitated with TCA. As shown in Fig. 8, hGH bands are present in all supernatants except No. 3.
The expression of recombinant hGH was confirmed by Western blotting using a specific polyclonal antibody raised against hGH (Fig. 9A and 9B). Arrows in Figures 6-8 indicate the expressed recombinant hGH that is processed by the cell secretory system and secreted into the environment.

Protein purification
The expressed recombinant hGH was then purified by affinity chromatography, and purification was evaluated by SDS-PAGE. As shown in Fig. 10, hGH appears as a single, pure band (Lane 6, panel B), in contrast to its pre-purification state (Lanes 1-5, panel A).

Optimization results of the periplasmic hGH expression and secretion
To investigate possible interactions between the factors affecting the production of periplasmic and cytoplasmic hGH, along with their optimal levels, a central composite design (CCD) was used. The design matrix and the responses are shown in Table 2. Analysis of variance (ANOVA) based on the response surface model was calculated for periplasmic hGH (Table 4). For the expression of periplasmic hGH, the model F value of 3.23 implies that the model is significant. Values of "Prob > F" less than 0.05 indicate that the corresponding model terms are significant. In this case, C (cell density at induction time, OD600) and D (post-induction time, h) are significant. Values greater than 0.1 indicate that the model terms are not significant. The "Lack of Fit F value" of 1.42 implies that the lack of fit is not significant relative to the pure error. A non-significant lack of fit is desirable, since it indicates that the model fits the data adequately (Table 4).
Multiple regression analysis of the data was performed, and a first-order polynomial equation for periplasmic hGH (µg/mL) (Y) was derived in terms of the coded factors. The design matrix and the responses are shown in Table 2. Analysis of variance (ANOVA) based on the response surface model was also performed for cytoplasmic hGH (Table 5).
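For orientation, a first-order model in coded factors has the general form Y = b0 + b1·x1 + ... + bk·xk. The Python sketch below generates the coded points of a central composite design and estimates such a model. It is a minimal illustration under stated assumptions: four factors are assumed because the text labels factors up to D, the number of center points and the rotatable axial distance are arbitrary choices, and the responses are synthetic. None of the values come from Table 2, Table 4, or Table 5.

```python
from itertools import product


def central_composite(k: int, n_center: int = 1, alpha: float = None) -> list:
    """Coded CCD points: 2^k factorial, 2k axial at +/-alpha, n_center center points."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # rotatable design (an assumption of this sketch)
    factorial = [list(point) for point in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def fit_first_order(design: list, y: list) -> tuple:
    """Least-squares fit of Y = b0 + sum(b_i * x_i).

    In a CCD the coded factor columns sum to zero and are mutually orthogonal,
    so each coefficient can be estimated independently of the others.
    """
    n = len(design)
    b0 = sum(y) / n
    slopes = []
    for i in range(len(design[0])):
        sxy = sum(row[i] * yi for row, yi in zip(design, y))
        sxx = sum(row[i] ** 2 for row in design)
        slopes.append(sxy / sxx)
    return b0, slopes


if __name__ == "__main__":
    design = central_composite(k=4, n_center=6)
    # Synthetic response in which only the third and fourth factors matter
    y = [10.0 + 2.0 * row[2] - 1.0 * row[3] for row in design]
    b0, slopes = fit_first_order(design, y)
    print("runs:", len(design), "intercept:", b0, "slopes:", slopes)
```

The closed-form estimates hold only because of the orthogonality of the coded design; with missing runs or uncoded factor levels, a general least-squares solver would be needed instead.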

Determination of functional hGH concentration
After generating the standard curve for each test by plotting the absorbance against the concentration of each control, the concentrations of the active form of hGH were obtained from the standard curve (all raw data are reported in supplementary data file 2). The results showed that the active form of hGH produced in the periplasm accounted for about 28.12% of the total protein, whereas it accounted for 10% in the cytoplasmic fraction. These results indicate that secretion of hGH into the periplasmic space by this method allows the protein to fold into its proper structural conformation.
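Reading a concentration off a standard curve can be sketched as follows. This example assumes a linear relationship between absorbance and concentration over the range of the standards, which is a simplification (immunoassay curves are often fitted with sigmoidal models). The standards, absorbances, and function names are hypothetical and are not the kit controls or the data in supplementary data file 2.

```python
def fit_line(x: list, y: list) -> tuple:
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all concentrations are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate concentration from absorbance."""
    if slope == 0:
        raise ValueError("flat standard curve cannot be inverted")
    return (absorbance - intercept) / slope


def percent_of_total(active: float, total: float) -> float:
    """Active hGH as a percentage of total protein (same units for both)."""
    return 100.0 * active / total


if __name__ == "__main__":
    standards_conc = [0.0, 5.0, 10.0, 20.0]   # hypothetical standards
    standards_abs = [0.10, 0.60, 1.10, 2.10]  # hypothetical A450 readings
    slope, intercept = fit_line(standards_conc, standards_abs)
    sample = concentration_from_absorbance(1.60, slope, intercept)
    print(f"slope={slope:.3f} intercept={intercept:.3f} sample={sample:.2f}")
```

Samples whose absorbance falls outside the range of the standards should be diluted and re-measured rather than extrapolated from the fitted line.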

Discussion
Since its FDA approval in 1985, hGH has been used in clinical applications for more than three decades [15]. A variety of expression hosts have been used for the production of hGH, including mammalian, yeast and bacterial cells [16][17][18]. The use of E. coli as an expression host has several advantages, including ease of genetic manipulation, low growth cost, rapid cell growth and the possibility of high-density cell culture, which together make it an ideal system, especially from an economic point of view, compared with other expression systems [19].
During cytoplasmic expression of recombinant proteins in E. coli, a single methionine is necessarily added to the N-terminus. This may not be a problem for industrial enzymes, but for recombinant proteins intended for pharmaceutical use it can cause unwanted side effects and has been shown to stimulate the immune system to produce antibodies against these proteins [31]. Although this methionine can be removed enzymatically, its removal adds complications and costs to mass production, which is not desirable [20].
A major solution to this problem in E. coli is the secretion of recombinant proteins into the culture medium using a suitable secretory signal peptide. Extracellular production of recombinant proteins has many advantages. Releasing the recombinant proteins into the culture medium reduces industrial costs and eliminates the need to disrupt the host cell to extract and purify the proteins. As a result, proteases and endotoxins are not released into the environment. In addition, because the host produces the target protein continuously, more recombinant protein can be obtained in the fermenter. Although periplasmic and cytoplasmic expression each have advantages, each also has disadvantages that limit its use, such as the formation of insoluble inclusion bodies in the cytoplasm, or low protein yields in the periplasmic method because of the limited capacity of the E. coli periplasm. Several classes of proteins, such as some toxins, are naturally secreted by E. coli, and others can enter the environment from the periplasmic space, possibly because of increased cell membrane permeability during long incubation periods [12].
In this study, hGH was secreted into the medium through the type II secretion system, known as the Sec pathway. The Sec-dependent system in gram-negative bacteria consists of a channel called the SecYEG complex and a translocation protein known as SecA, which has ATPase activity. Signal peptide sequences in the N-terminal region of proteins direct these proteins to the Sec system, which eventually secretes them out of the cell [21]. These signal peptides have a three-domain structure with a positively charged N region, a hydrophobic H region, and a C region in which the Ala-X-Ala motif is recognized and cleaved by cellular signal peptidases [21,22].
The natural signal sequence and the signal designed according to the general structure of Sec-dependent signal sequences, including all three domains introduced above, are shown in Figs. 2 and 3. The probabilities shown on the graph, SP (Sec/SPI), LIPO (Sec/SPII) and TAT (Tat/SPI), indicate that the mutated signal peptide has improved the likelihood of recombinant hGH secretion (Fig. 3). The Sec and twin-arginine translocation (Tat) pathways function in parallel to transport proteins across the cytoplasmic membranes of prokaryotes in unfolded and folded form, respectively (Palmer et al. 2012). So far, several proteins have been secreted extracellularly with this strategy using modified forms of signal peptides [23][24][25][26][27][28][29][30][31][32][33].
In this study, a series of amino acid changes were made in different regions of the signal peptide, including the N, H and C domains, to evaluate and compare the efficiency of the signal peptides in directing recombinant hGH into the periplasmic and extracellular space. These changes increased the hydrophilicity of the N domain and, conversely, the hydrophobicity of the H region, and ultimately increased the efficiency of this signal and optimized it for the secretion of hGH through the Sec pathway. Analysis of the hydropathy plots and examination of the individual domains showed clear differences in hydrophobicity and hydrophilicity between the two signal peptides. For example, the replacement of leucine with arginine in the N region has created a more positive charge in this area, which may have increased the ability of the signal peptide to enter the negatively charged membrane. Another important change, in the H region, is the replacement of threonine with isoleucine and of serine with alanine, which has increased the hydrophobicity of this domain. Although the changes placed arginine residues at positions 10 and 12 of the altered signal, this pattern does not match the known motif in the signal peptides of the Tat pathway, which is RRKR.
A search of domestic and foreign patents found no similar work, which highlights the novelty and strength of our approach. The cytoplasmic or periplasmic expression of recombinant hGH has many benefits, and therefore much research is being done in this field, which has led to the identification of the mechanisms and pathways involved in protein secretion in microorganisms and of the relevant protein-protein interactions. Based on these findings, many efforts have been made to develop an efficient method for the secretion of recombinant proteins. One of the most important and effective approaches is the use of secretory signal peptides, which play an important role in targeting proteins to the periplasmic and extracellular space [11,12,34-46]. Because the mandatory addition of methionine to the protein during intracellular expression may elicit an immune response, this methionine must be removed in later steps. In addition, cytoplasmic production of hGH can lead to the formation of inclusion bodies, in which proteins usually lack proper structure and function. Therefore, secretion of hGH to the extracellular environment is the most appropriate method for obtaining an active, endotoxin-free protein.
We first tried to use the native SpA signal peptide, which has previously been used successfully in our laboratory to secrete a number of proteins. However, both bioinformatics analysis and experimental results showed that the native signal peptide is not effective in hGH secretion. The original signal peptide is derived from protein A of the gram-positive bacterium Staphylococcus aureus, whereas the altered signal peptide was redesigned for secretion of hGH in the gram-negative bacterium E. coli. Nevertheless, both signal peptides direct secretion through the Sec pathway.
One of the challenges of using different signal peptides is that a signal peptide from one protein may not function properly when fused to another protein. There is currently no specific method for finding an appropriate signal peptide for protein secretion, and in many cases the choice is based on trial and error.

Conclusions
In the present study, because the native signal peptide of S. aureus protein A was not able to deliver hGH to the extracellular space, it was modified using bioinformatics tools and fused to the N-terminal region of hGH to show that the redesigned signal peptide was functional. The efficiency of the redesigned signal peptide in secreting recombinant hGH into the periplasmic space as well as the extracellular environment was confirmed by evaluating the presence of hGH in different cell fractions.
Because the H and N domains of signal peptides play a key role in protein secretion, the hydrophobicity of the H region and the positive charge of the N region, which play an important role in cell membrane fusion, were increased in this study. Although the length of the H region has been shown to affect protein secretion, and the H region is longer in gram-positive than in gram-negative signal peptides, the length of the signal peptide was not changed in this study. A wide range of environmental factors affect gene expression and regulation.
Optimization of hGH production was performed using response surface methodology (RSM). RSM was chosen to overcome the disadvantages of the classical one-factor-at-a-time experimental design and to evaluate several parameters with a smaller number of experiments. In RSM, input variables are changed to achieve the desired output [47][48][49]. We also defined equations to identify the best conditions. Determination of the functional hGH concentration using a quantitative ELISA assay confirmed that the concentration of the active form of hGH in the periplasm is higher than that produced in the cytoplasm of bacterial cells. This demonstrates the importance of secreting recombinant pharmaceutical proteins to maintain their function. The optimization results showed that culture temperature, induction rate and time after induction have a positive effect on hGH production. The use of glycine was also shown to improve hGH secretion into the culture medium, although the mechanism is not yet known.
In addition, given the eukaryotic origin of the hGH gene, rare codons that reduce translation efficiency were replaced, using an appropriate algorithm, with codons preferred by the E. coli expression system to achieve the highest possible level of gene expression. Codon optimization raised the CAI from 0.85 to 0.87. GC content was also optimized, using related software, to increase mRNA half-life and stability. Stable stem-loop structures that prevent ribosome binding and the initiation of translation were removed as far as possible.
Overall, our results show that, with further improvement of its efficiency, this prokaryotic secretory expression system can be used in the pharmaceutical industry to produce recombinant proteins. Because these experiments were performed in shake-flask cultures, they should also be repeated in a fermenter to investigate the possibility of scaling up protein production. Genetic manipulation of the host strain may also be required to make it more suitable for the expression and secretion of recombinant therapeutic proteins.

Bioinformatics analyses

Gene optimization
Gene optimization was carried out to remove structures that may inhibit high-level protein expression in E. coli or ribosome binding, such as stem-loop structures, sequences that cause mRNA instability, cis-acting elements and repetitive signals with negative effects on gene expression.

Investigating the efficiency of the secretory signal peptides
The secretory signal peptide was designed using the web-based bioinformatics tool SignalP (version 5.0). The secretory signal peptide of Staphylococcus aureus protein A was modified by replacing different amino acids in its three domains [41], and the ability of the native and modified signal peptides to secrete hGH was investigated.

Tertiary structure analysis
The sequence of the optimized hGH gene was obtained from GenBank under accession number MT321110. The sequences of the native SpA signal peptide, residues 1-36 (ID: TYO48081.1), and of the mutant signal peptide (jei36c) (ID: QKG82153.1) were also used in this study. The reading frame of hGH was fused to artificial secretory signal peptides designed on the basis of Sec-dependent signal peptides. Competent E. coli BL21 (DE3) cells were prepared by chemical treatment with calcium chloride and magnesium chloride and transformed with the recombinant constructs by heat shock. The cells were cultured on LB agar medium [56] containing 100 μg/ml ampicillin and incubated at 37 °C for 16 hours. The designed constructs, containing the native and modified signal peptides fused to the coding region of hGH, were codon optimized and synthesized for expression in the E. coli host.

Optimization of gene codons
Codon optimization was used to adjust several factors that are critical for the efficiency of hGH gene expression in E. coli, such as GC content and codon usage bias, the latter assessed with the codon adaptation index (CAI) and the frequency of optimal codons (FOP).
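The two codon-bias parameters named above can be illustrated with a short calculation. In the usual definition, each codon receives a relative adaptiveness weight (its frequency divided by that of the most frequent synonymous codon in a reference set of highly expressed genes), CAI is the geometric mean of these weights over the gene, and FOP is the fraction of codons that are the optimal choice for their amino acid. The reference counts below are invented for two amino acids only, purely for illustration; they are not E. coli codon usage data and not the table used in this study.

```python
import math

# Hypothetical codon counts in a reference set of highly expressed genes
REFERENCE_COUNTS = {
    "Leu": {"CTG": 50, "CTT": 10, "TTA": 5},
    "Lys": {"AAA": 30, "AAG": 10},
}


def relative_adaptiveness(reference_counts: dict) -> dict:
    """Weight of each codon = its count / count of the best synonymous codon."""
    weights = {}
    for codons in reference_counts.values():
        best = max(codons.values())
        for codon, count in codons.items():
            weights[codon] = count / best
    return weights


def split_codons(seq: str) -> list:
    """Split a coding sequence into complete codons, ignoring a trailing remainder."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def cai(seq: str, weights: dict) -> float:
    """Geometric mean of codon weights; codons without a weight are skipped."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


def fop(seq: str, weights: dict) -> float:
    """Fraction of scorable codons that are the optimal synonymous codon."""
    scored = [weights[c] for c in split_codons(seq) if c in weights]
    if not scored:
        raise ValueError("no scorable codons in sequence")
    return sum(1 for w in scored if w == 1.0) / len(scored)


if __name__ == "__main__":
    w = relative_adaptiveness(REFERENCE_COUNTS)
    for label, gene in (("before", "CTTAAGTTA"), ("after", "CTGAAACTG")):
        print(label, "CAI:", round(cai(gene, w), 3), "FOP:", round(fop(gene, w), 3))
```

Because CAI is a geometric mean, a single very rare codon lowers the score more than it would lower an arithmetic average, which is why replacing a few rare codons can move the index noticeably.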
Evaluation of recombinant protein expression in culture medium, periplasm and cytoplasm
SDS-PAGE and Western blot analysis were used to evaluate the expression and secretion of hGH in E. coli BL21 (DE3) cells. For this purpose, 24 clones containing the hGH gene were cultured separately in Luria-Bertani medium. Once the optical density (OD) of the culture reached about 1.2-1.5, expression was induced with 5% lactose and 0.5% glycine, and the cultures were incubated at 30 °C. Samples were taken 5 and 16 hours after induction. Expression of hGH in each of the 24 clones was investigated separately.

Fermentation of recombinant clones
A single colony of E. coli BL21 (DE3) carrying the hGH gene was transferred to a 1000 ml flask containing 200 ml of F1 medium (Table 1) and incubated at 37 °C and 200 rpm for 16 hours in an incubator shaker.
A New Brunswick fermenter (model BioFlo 3, with a 4-liter reservoir) was used for the production of hGH. F1 medium components were prepared individually and sterilized prior to use. The prepared culture medium was inoculated with 200 ml of pre-culture.
After inoculation, the fermenter operating conditions were set at 37 °C, a stirrer speed of 400 rpm and aeration of 1 liter per minute. The starting pH was set at 7.0. After 4 hours, lactose was added to the culture as an inducer and the process was continued under the same conditions for 10 hours. The process was terminated before the carbon source was consumed.

Protein expression confirmation using SDS-PAGE and Western blot analysis
E. coli BL21 (DE3) grown in LB medium with 50μg/μl kanamycin (Merck, Germany) and 7% lactose (Merck, Germany) as an inducer was used for protein expression. Cells were collected and lysis buffer (50 mM Tris base, 10% glycerol, 0.1% Triton X-100) (Merck, Germany) was added to the cells. Total protein was extracted and analyzed on a polyacrylamide gel. Transformed bacteria were induced with lactose and samples were taken at different time points: T0, T5 and overnight (To/n). At each stage, the bacteria were pelleted at 5000 rpm for 10 min and the proteins in the supernatant medium were precipitated by adding 100% TCA. The bacterial pellets were dissolved in sample buffer as follows: T0 in 120 µl, T5 in 250 µl, and To/n in 350 µl of sample buffer.
Resolved proteins were transferred from the SDS-PAGE gel to nitrocellulose paper (Whatman, UK). TBS buffer (Sigma, USA) was used for blocking the membrane. A polyclonal antibody raised against hGH was used to confirm its expression. HRP-conjugated anti-rabbit antibody was used as the secondary antibody, and the bands were visualized by adding the 4-chloronaphthol substrate.

Protein purification
Total protein was extracted from the bacteria and hGH was purified by affinity chromatography on Ni-NTA Agarose (Qiagen, USA) according to the manufacturer's instructions. Ni-NTA Agarose is an affinity chromatography matrix for purifying proteins carrying a 6XHis-tag. Histidine residues in the 6XHis-tag bind to the immobilized nickel ions with high specificity and affinity.

Determination of functional hGH concentration using an ELISA-based method
In this experiment, an hGH quantitative test kit (company-country name) based on the solid-phase enzyme immunoassay (EIA) method was used. The kit contains two mouse monoclonal antibodies that recognize antigenic determinants on the surface of the active form of hGH and is therefore able to distinguish active hGH from its inactive form. Both standard hGH and the hGH produced in our laboratory bound to the anti-hGH antibodies, resulting in the development of a blue color. The intensity of the developed color is proportional to the amount of hGH in the sample. The optical density was measured in a 96-well plate with an ELISA reader at 450 nm. Standard curves were created for each test by plotting the absorbance against the concentration of each control. The hGH concentrations of the samples were then read from the standard curve.