CD171 multi‑epitope peptide design based on immunoinformatics approach as a cancer vaccine candidate for glioblastoma

Glioblastoma (GB) is a common primary malignancy of the central nervous system and one of the most lethal brain tumors. GB cells can promote therapeutic resistance and tumor angiogenesis. CD171 is a neuronal cell adhesion molecule that is expressed in glioma cells and regulates brain development. It belongs to the immunoglobulin-like cell adhesion molecule (CAM) family and can be associated with prognosis in a variety of human tumors. Multi-epitope peptide vaccines are based on synthetic peptides that combine B-cell and T-cell epitopes and can induce specific humoral or cellular immune responses. In the present study, several immunoinformatics tools were used to analyze the CD171 sequence and to study the important characteristics of a designed vaccine. The results included the prediction and validation of the secondary and tertiary structure, physicochemical properties, solubility, conservancy, toxicity, antigenicity and allergenicity of the promising vaccine candidate against CD171. The immunoinformatic analysis suggested 12 predicted multi-epitope peptides, which were combined into a construct 582 residues long. The designed vaccine was then adapted for cloning, and the sequence was inserted into the pET30a (+) vector for anti-glioblastoma vaccine development.


Introduction:
Glioblastoma (GB) is the most aggressive type of glioma and accounts for the majority of primary central nervous system (CNS) malignancies in adults. GB is thought to account for more than 50% of all intracranial malignancies (1). CD171 plays a regulatory role in neural cell development, tumor cell survival and migration (2). Overexpression of CD171 has been shown in solid tumors, such as gliomas and colorectal cancer, as a prognostic factor (3).
CD171 is a 200 to 220 kDa multidomain type 1 membrane glycoprotein containing a cytoplasmic intracellular domain, a transmembrane domain, and an extracellular domain with six immunoglobulin-like domains and five fibronectin repeats. This glycoprotein has a regulatory function in cell adhesion, development, survival, and metastasis of tumor cells (4,5). The CD171 ectodomain is abnormal in tumor cells due to cleavage by the ADAM10 protease and consequent autostimulation, resulting in cellular motility and proliferation (6,7,8). Multi-epitope peptide-based vaccines offer several advantages, including high specificity, good safety, stability, and ease of production and storage. They have therefore become an area of increasing interest in vaccine research, which is even more promising in light of advances in immunoinformatics and vaccinology (9). Multi-epitope peptide vaccines are based on synthetic peptides that combine many B-cell and T-cell epitopes and can induce specific humoral and/or cellular immune responses. Prediction of B- and T-cell epitopes has been the focus of computational vaccinology and, given the potential translational implications, several bioinformatics tools have been developed (10,11). The aim of the present study is to analyze, using computational methods, the sequence and structure of CD171 and to predict potential linear epitopes of CD171 that may be targets of B and T cells. This multi-epitope design will provide information for a promising epitope-based peptide vaccine for cancer therapy.

CD171 sequence retrieval and structural prediction
The amino acid sequence of CD171 (accession number NP_001265045.1) was retrieved from the NCBI database in FASTA format (12). The TMHMM tool was used to determine whether the peptides lie in transmembrane regions. TMHMM is based on a hidden Markov model (HMM) method, which specializes in modeling globular domains, helix caps and other regions of membrane proteins (http://www.cbs.dtu.dk/services/TMHMM/) (13).

B-cell epitope prediction
The objective of B cell epitope prediction is to find a potential antigen that would interact with B lymphocytes and initiate an immune response (14). Linear B cell epitopes have variable peptide lengths, from 2 to 85 residues. The BepiPred-2.0 webserver was used to predict linear B cell epitopes (http://www.cbs.dtu.dk/services/BepiPred/). This method is based on a random forest algorithm trained on epitopes annotated from antibody-antigen protein structures (15).
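As an illustration of how per-residue prediction scores can be turned into linear epitopes, the following minimal sketch groups consecutive residues whose score reaches a threshold. The function name, the default threshold of 0.5 and the minimum length of 2 are assumptions made for this example; they are not taken from the study's actual settings.

```python
from typing import List, Tuple


def linear_epitopes(sequence: str, scores: List[float],
                    threshold: float = 0.5,
                    min_len: int = 2) -> List[Tuple[int, str]]:
    """Return (1-based start, peptide) for each run of residues scoring >= threshold.

    The threshold and min_len defaults are illustrative assumptions.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    epitopes = []
    start = None  # index where the current above-threshold run began
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run just ended; keep it only if it is long enough.
            if i - start >= min_len:
                epitopes.append((start + 1, sequence[start:i]))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_len:
        epitopes.append((start + 1, sequence[start:]))
    return epitopes
```

Runs shorter than `min_len` are discarded, which mirrors the lower bound of 2 residues mentioned for linear B cell epitopes.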

T-cell epitope prediction
The Immune Epitope Database (IEDB), HLA allele frequencies and reference sets with maximum population coverage were used for the selected epitopes. The most probable epitopes interacting with different MHC class I and II alleles were chosen using percentile rank cut-offs of 0.5 and 1 for MHC class I and II, respectively. In both cases, to select for good binding affinity and obtain a better level of confidence in the prediction, the IC50 cut-off was set below 50 nM and 150 nM for MHC class I and II, respectively, and the antigenicity score was also considered. The candidate epitopes for MHC class I were determined by both the IEDB and NetMHCpan EL 4.0 methods, and for MHC class II by the IEDB 2.22 recommended method of the binding prediction tool (http://tools.iedb.org/mhci/), (http://tools.iedb.org/mhcii/) (16).
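The cut-offs described above can be expressed as a simple filter over prediction records. The sketch below is illustrative only: the `Prediction` record, its field names and the `select_binders` helper are hypothetical and do not correspond to the IEDB output format or API, and treating the percentile cut-off as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    """Hypothetical record for one peptide/allele prediction."""
    peptide: str
    allele: str
    mhc_class: int          # 1 or 2
    percentile_rank: float
    ic50_nm: float


# (maximum percentile rank, IC50 upper bound in nM) per MHC class, as stated in the text.
CUTOFFS = {1: (0.5, 50.0), 2: (1.0, 150.0)}


def select_binders(predictions: List[Prediction]) -> List[Prediction]:
    """Keep predictions that pass both the percentile and the IC50 cut-off."""
    selected = []
    for p in predictions:
        max_rank, max_ic50 = CUTOFFS[p.mhc_class]
        if p.percentile_rank <= max_rank and p.ic50_nm < max_ic50:
            selected.append(p)
    return selected
```

Applying both criteria together is stricter than either alone, so a peptide with a good percentile rank but weak predicted affinity is still excluded.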

Construction of multi-epitope vaccine candidate sequence
The candidate vaccine sequence was generated from the overlapping predicted B-cell and T-cell epitopes; the predicted peptides containing linear B- and T-cell epitopes were fused using AAY linkers (22). The cholera toxin subunit B (CTB), accession no. CAA53976.1, was chosen as an adjuvant and attached at the N-terminal of the final vaccine model through a DPRVPSS linker, in order to potentiate the immunogenic capacity of our peptide by stimulating innate immunity. The adjuvant potential of CTB has been reported in several animal models, indicating that it would be scalable to complex species (23).

Physicochemical properties and solubility prediction
Various physicochemical features of the candidate peptides, including theoretical pI, aliphatic index, instability index, estimated half-life in mammalian reticulocytes in vitro, extinction coefficient, molecular weight, and grand average of hydropathicity (GRAVY), were determined using the online web server ProtParam (http://web.expasy.org/protparam/) (24). The estimated solubility in water of the multi-epitope vaccine peptide was evaluated using Pepcalc (http://pepcalc.com/) (25).
In addition, the built model was validated, a vital step for detecting potential errors in predicted 3D models (30). A Ramachandran plot was then generated using the RAMPAGE server (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php). The Ramachandran plot visualizes the energetically allowed and disallowed dihedral angles psi (ψ) and phi (ϕ) of each amino acid and is calculated based on the van der Waals radii of the side chains. The RAMPAGE results give the percentage of residues in allowed and disallowed regions, which defines the quality of the modeled structure (31). The overall quality of the final vaccine model was assessed with ProSA-web (https://prosa.services.came.sbg.ac.at/prosa.php).
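Two of the listed properties follow simple published formulas and can be sketched directly: GRAVY is the mean Kyte-Doolittle hydropathy of the residues, and the aliphatic index is computed from the mole percentages of Ala, Val, Ile and Leu. The function names below are chosen for this example, and the code is a minimal illustration rather than a replacement for the ProtParam server.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index: X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    where X is the mole percentage of each residue."""
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

A negative GRAVY value indicates an overall hydrophilic peptide, while a higher aliphatic index is generally associated with greater thermostability.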
ProSA-web provides a protein tertiary structure validation and calculates an overall 3D-structure quality score; if the calculated score is outside the characteristic range of native proteins, the structure is likely to have errors (32).

In silico cloning adaptation of designed vaccine
The Java Codon Adaptation Tool (JCat) server (http://www.prodoric.de/JCat/) was used for reverse translation and codon optimization, in order to construct the multi-epitope vaccine in a selected expression vector (33). Codon optimization leads to a higher expression rate of the final vaccine in E. coli K12 as the expression host, because codon usage differs between human and the selected host. Three additional options were applied to avoid rho-independent transcription terminators, prokaryotic ribosome binding sites and restriction enzyme cleavage sites, in order to increase the efficiency of translation. The JCat result includes the codon adaptation index (CAI) and the percentage of GC content, which can be used to evaluate protein expression levels. CAI provides information on codon usage bias; the ideal CAI score is 1.0, but a score greater than 0.8 is considered good (34). The GC content of a sequence should range between 30% and 70%; values outside this range are considered unfavorable for translational and transcriptional efficiency (35). To clone the optimized sequence, the final vaccine sequence was reverse-translated and Nde I and Xho I restriction sites were added at the N- and C-terminal sites of the final construct, respectively. Finally, the optimized sequence (with restriction sites) was inserted into the pET-30a (+) vector using the SnapGene restriction cloning module to ensure vaccine expression.
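The two metrics reported by JCat can be illustrated with a short sketch. GC content is the percentage of G and C bases, and CAI is the geometric mean of the relative adaptiveness weights of the codons in the sequence. The `weights` table passed to `cai` is a hypothetical input: real weights are derived from the codon usage of highly expressed genes in the host and are not reproduced here.

```python
import math
from typing import Dict


def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def cai(dna: str, weights: Dict[str, float]) -> float:
    """Codon adaptation index: geometric mean of per-codon weights.

    `weights` maps codons to relative adaptiveness values in (0, 1];
    the table is a hypothetical input for this sketch. Codons absent
    from the table are skipped.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon of the sequence is present in the weight table")
    return math.exp(sum(logs) / len(logs))
```

Because CAI is a geometric mean, a few rare codons with very low weights pull the score down more strongly than they would in an arithmetic mean.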

CD171 sequence analysis and structural prediction
The amino acid sequence of CD171 contains 1257 amino acids; the ectodomain, spanning residues 19 to 1120 (the signal peptide was removed), was selected using TMHMM (Figure 1).

Prediction of B-cell epitopes
Results from the ABCpred server, together with VaxiJen scores, showed 29 linear B cell epitopes, varying in length from 6 to 37 residues. Among these 29 predicted epitopes, 4 epitopes ("EASGKPEV", "WREGSQRKH", "PLDEGGKGQ" and "VPKEGQ") had a VaxiJen score above 1, which indicates high antigenicity; they can be considered the most promising antigenic B cell epitopes, as shown in Supplementary Table 1.

Prediction of the physicochemical properties and solubility of the vaccine candidate
The multiepitope vaccine had an estimated molecular weight of 50kDa, a theoretical isoelectric

Multi-epitope vaccine design
After the prediction of B- and T-cell epitopes, we fused them using suitable linkers to generate a multi-epitope peptide. To produce sequences with minimized junctional immunogenicity, AAY linkers were inserted between the predicted epitopes. The cholera toxin subunit B (CTB) sequence was added as an adjuvant at the N-terminal of the final vaccine model, using a DPRVPSS linker, to improve the immunogenic capacity of the peptides by stimulating the innate immune response. Moreover, to aid in protein purification and identification, a 6xHis tag was added at the C-terminal. The final vaccine peptide was constructed with 582 residues derived from 12 merged peptide sequences. A schematic diagram of the vaccine is displayed in Figure 2.
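The assembly order described above (adjuvant, DPRVPSS linker, epitopes joined by AAY, C-terminal 6xHis tag) can be sketched as plain string concatenation. The function name and the inputs used in the usage note are illustrative assumptions; the actual 12 merged peptide sequences and the full CTB sequence are not reproduced here.

```python
from typing import List


def build_construct(adjuvant: str, epitopes: List[str],
                    adjuvant_linker: str = "DPRVPSS",
                    epitope_linker: str = "AAY",
                    his_tag: str = "HHHHHH") -> str:
    """Assemble adjuvant + linker + linker-joined epitopes + C-terminal His tag."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    return adjuvant + adjuvant_linker + epitope_linker.join(epitopes) + his_tag
```

For example, calling `build_construct("MKK", ["EASGKPEV", "WREGSQRKH"])` with a placeholder adjuvant `"MKK"` places one AAY linker between the two epitopes and none after the last one, so the His tag follows the final epitope directly.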

Secondary structure prediction
The PSIPRED prediction method, which analyzes output obtained from PSI-BLAST, was used to predict the secondary structure of the final chimeric peptide, submitted in FASTA format. The secondary structure prediction revealed that the protein contained 2% alpha helix, 47% beta strand, and 49% coil (Figure 3A). Regarding the solvent accessibility of the amino acids, 50% were predicted to be exposed, 32% medium exposed, and 17% buried. The RaptorX Property server predicted a total of 39 residues (6%) to be located in disordered domains.
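Percentages like those above can be derived from a three-state prediction string, in which each residue is labelled H (helix), E (strand) or C (coil). The following sketch assumes such a string is already available; the function name is chosen for this example.

```python
from collections import Counter
from typing import Dict


def ss_composition(states: str) -> Dict[str, float]:
    """Percentage of helix (H), strand (E) and coil (C) in a three-state string."""
    if not states:
        raise ValueError("empty secondary structure string")
    counts = Counter(states)
    return {s: round(100.0 * counts.get(s, 0) / len(states), 1) for s in "HEC"}
```

Because each percentage is rounded independently, the three values may not sum to exactly 100.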

Tertiary structure modelling and validation
In total, five models of the tertiary structure of the designed chimeric protein were predicted. RAMPAGE was used to evaluate the reliability of the model, generate the Ramachandran plot and determine the energetically stable conformations of the psi (ψ) and phi (Φ) dihedral angles for each amino acid. The Ramachandran plot analysis showed that 69.5% of the residues were in favoured regions, 20.2% in allowed regions, and 10.3% in the outlier region. The total percentage of residues in favoured and allowed regions was 89.7%, whereas more than 90% is the ideal result for a convincing model (Figure 4A). The quality and potential errors of the crude 3D model were verified by ProSA-web, which gave a Z-score of -2.25 for the chosen model of the input vaccine protein (Figure 4B). Both the Ramachandran plot and the ProSA-web score supported the quality of the CD171 3D model.

In silico cloning adaptation of designed vaccine
The Java Codon Adaptation Tool (JCat) was used to optimize the codon usage of the vaccine construct for E. coli K12, to maximize protein expression given the difference between the human and E. coli expression systems. The optimized codon sequence is 1746 nucleotides long. The Codon Adaptation Index (CAI) of the optimized sequence was 0.96. CAI values range from 0 to 1, and this value indicates probable successful expression of the target gene. The GC content of the sequence was 53.3%, which is also satisfactory because it indicates the possibility of good expression of the vaccine candidate in the E. coli host.
The ideal GC content ranges between 30% and 70%. Finally, using SnapGene software for restriction cloning, the recombinant plasmid sequence was constructed by inserting the adapted codon sequence into the pET30a (+) vector between the Nde I and Xho I restriction sites (Figure 5).
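The reported length is consistent with the construct size, since 582 residues × 3 nucleotides per codon = 1746 nucleotides. The sketch below illustrates the flanking step, using the known recognition sequences of Nde I (CATATG) and Xho I (CTCGAG). The function name is hypothetical, and details such as start codon overlap with the Nde I site, stop codons and reading frame relative to the vector are not described in the text and are deliberately left out.

```python
NDE_I = "CATATG"  # Nde I recognition sequence
XHO_I = "CTCGAG"  # Xho I recognition sequence


def add_cloning_sites(coding_sequence: str) -> str:
    """Flank a coding sequence with Nde I (5') and Xho I (3') sites.

    Raises ValueError if the length is not a multiple of three or if
    either site already occurs inside the coding sequence, since an
    internal site would be cut during restriction cloning.
    """
    cds = coding_sequence.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for name, site in (("Nde I", NDE_I), ("Xho I", XHO_I)):
        if site in cds:
            raise ValueError(f"internal {name} site found in coding sequence")
    return NDE_I + cds + XHO_I
```

Checking for internal sites before flanking corresponds to the JCat option of avoiding restriction enzyme cleavage sites in the optimized sequence.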

Discussion
Malignant gliomas are rare and account for about 2.5% of cancer deaths (36). Many vaccines against glioblastoma have been examined in cell cultures and animal models, but none have been used for therapeutic application in humans so far (37,38). Epitope-based vaccines have been considered a promising approach to generate a specific immune response and avoid responses against other unfavorable epitopes on the complete antigen (39).
The advantages of this approach include increased safety, the opportunity to rationally engineer the epitopes for increased potency and breadth, and the ability to focus immune responses on conserved epitopes (40). In this research, several bioinformatics tools were applied to gather information about the candidate vaccine; the first step was the prediction of B- and T-cell epitopes in the protein sequence and, in order to design a multi-epitope peptide vaccine, the selected epitopes were linked using suitable linker sequences (41). To produce sequences with minimized junctional immunogenicity, AAY linkers (30) were inserted between the predicted epitopes, leading to a rational multi-epitope vaccine design (41). Also, in order to attain a high level of expression and improved bioactivity of the fusion peptide, the DPRVPSS linker was inserted between the adjuvant protein sequence and the designed epitopes. To validate these results, expression of the peptide in a bacterial system and various immunological assays are essential.

Conclusions
In the formulation of novel drugs or vaccines, immunoinformatics approaches provide new insights and act as interdisciplinary links that bring together different resources to overcome the long development times and high costs of new therapies. Herein, several immunoinformatics tools were applied to design a promising vaccine peptide; the suggested vaccine candidate could potentially be used to eliminate glioblastoma cells.