Translation of chimeric transcripts (CTs) in ovarian cancer generates tumor-specific (TS) peptides
A necessary first step was to identify peptides generated from chimeric transcript (CT) sequences around the fusion point identified in the TCGA ovarian cancer RNA-sequencing datasets. We consolidated all spanning reads of each CT to derive long read (LR) sequences for focused analyses rather than full-length transcript sequences, since our working hypothesis was that chimeric peptides harboring residues of both parental proteins were likely to be tumor-specific and perceived as non-self by the cellular machinery. Thus, 997 unique chimeric LR-derived peptides (cLRPs) were identified in ovarian tumor mass spectra datasets generated by the National Cancer Institute Clinical Proteomic Tumor Analysis Consortium26 (CPTAC) using an X! Tandem search engine-based pipeline and a targeted cLRP database27–28 (Fig. S1a; Methods) on the Seven Bridges Cancer Genomics Cloud29. Each tumor displayed a distinct profile of 28–132 cLRPs with a relative peptide expression range of 3.5–8 orders of magnitude (Fig. 1a-i). Since false-positive rates for peptide-spectrum match (PSM) subgroups can vary (different peptide sizes, charge state of precursor ions, missed enzymatic cleavage, etc.), we affirmed chimerism by examining the charge state of PSMs in terms of b- and y-ion fragmentation profiles (Fig. S1b). While most peptides were translated from a single reading frame (RF) without preferential usage of sense or antisense strands, RF isoforms of the same CT were also identified in some cases (Figs. S1c,S1d).
Tumor and tissue specificity was further inferred by probing expression of cLRPs in ovarian controls (OvCtrls; CPTAC data analyzed in parallel with tumor data) and human lymphoblastoid cell line data30 (LBL; PXD001406). OvCtrls and LBLs displayed distinct cLRP profiles at frequencies of 29–49 and 0–9 per sample respectively, with corresponding relative peptide expression between 3.5–6.7 and 2–4.08 orders of magnitude. cLRP frequency as well as expression levels were lower in OvCtrls than in primary tumors and even lower in LBL cultures (Figs. 1a-i,a-ii,a-iii). Peptide profiling designated 843 cLRPs as Ovarian Tumor-Specific (TS), while the remainder were Tumor-Associated (TA). Effectively, translation potential could be assigned to only 21.12% of the identified CTs, as affirmed by the specificity imposed by ion fragmentation profiling. Such lower coding capability compared with that of parental genes, despite the enhanced peptide diversity via RF isoforms of a single CT, may indicate either a non-coding regulatory nature of CTs or elimination of newly formed proteins by cell surveillance and homeostatic mechanisms.
Tumor associated Chimeric Transcript-Protein (TcTP) burden may define MHC-I allele restriction by cLRPs and patient prognosis in a personalized manner
While oncogenic roles have recently been assigned to CTs31, it is likely that several chimeric proteins fail to achieve a stable conformation and are thereby targeted for proteasomal degradation. Interestingly, recent findings suggest that cryptic neoepitopes generated through proteasomal degradation bind to MHC class I molecules and are presented to CD8+ cells, eliciting anti-tumor responses32–35. To evaluate a similar possible fate for the 844 tumor-specific cLRPs in our study, we first predicted their constitutive and immunoproteasome degradation products using the Proteasomal Cleavage Prediction Server, and then identified good, moderate and strong binding peptides to the transporter associated with antigen processing protein complex (TAP; binding scores 6–7, 7–8 and 8–9.5 respectively) on the TAPPred server (Methods). Presentation of these 526 9-mer peptides by MHC Class I molecules was then screened using NetMHCpan 4.1 in a set of 769 high frequency restricted MHC Class-I alleles representing Indian and Caucasian populations derived from the HLA Allele Frequency Net Database36 (Supplementary Information 1). This identified 369 peptides that restrict 98.9% of selected MHC alleles with a Binding Affinity (BA) Rank ≤ 0.5%, restriction being between either single or multiple peptide-alleles (Fig. S2; Tables S1, S2, S3). We also determined the Patient Harmonic Mean Best Rank (PHBR) and Allele Harmonic Binding Rank (AHBR) scores for these 369 putatively antigenic peptides, and correlated them with cLRP frequencies and relative expression in patients along with non-antigenic cLRP-derived peptides (Fig. 1b; Methods). This winnowed out a subset of 248 peptides (PHBR and AHBR < 0.3), whose wider distribution across patients and higher binding rank within the restricted HLA alleles indicated a higher probability of antigenicity (Fig. 1c; Table S4).
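The PHBR score referred to above can be illustrated with a minimal Python sketch. It assumes that PHBR is the harmonic mean, across a patient's MHC-I alleles, of the best (lowest) percentile rank obtained for any peptide on each allele; the exact scoring used in this study is described in Methods and may differ. The function names, peptide labels and rank values below are hypothetical and serve only as an illustration. AHBR is not sketched here.

```python
from statistics import harmonic_mean


def best_rank_per_allele(ranks_by_allele):
    """Return the lowest (best) percentile rank seen on each allele.

    ranks_by_allele maps allele name -> {peptide: percentile rank}.
    """
    return {
        allele: min(peptide_ranks.values())
        for allele, peptide_ranks in ranks_by_allele.items()
    }


def harmonic_best_rank(ranks_by_allele):
    """Harmonic mean of the per-allele best ranks (assumed PHBR definition)."""
    best = best_rank_per_allele(ranks_by_allele)
    return harmonic_mean(list(best.values()))


# Hypothetical patient with two alleles and two candidate 9-mers.
example_patient = {
    "HLA-A*11:01": {"PEPTIDE_1": 0.2, "PEPTIDE_2": 1.5},
    "HLA-B*27:05": {"PEPTIDE_1": 0.5, "PEPTIDE_2": 0.4},
}
example_phbr = harmonic_best_rank(example_patient)
```

Because the harmonic mean is dominated by small values, a single well-presented peptide on one allele pulls the score down, which is why lower scores are read as better presentation.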
We further examined the association of CTs and cLRPs, expressed as a tumor-associated Chimeric Transcript-Protein (TcTP) burden, with overall patient survival (OS) in the cohort. This identified 2 distinct groups: patients in Cluster 1 with lower average OS presented with a lower TcTP burden (< 60 CTs, < 40 cLRPs), while Cluster 2 patients with significantly higher average OS presented with a higher TcTP burden (> 65 CTs, > 55 cLRPs; Fig. 2a). As a corollary, differences in outcomes between patient groups in the TCGA cohort, demarcated on the basis of median CT-cLRP values in the cohort as either Group 1 (lower TcTP burden) or Group 2 (higher TcTP burden), also indicated a chimerism-based survival advantage (Fig. 2b). Importantly, the variance between PHBR and AHBR within both groups was significant, whereas between the 2 groups only that of AHBR was significant (Fig. 2c), emphasizing possible improved presentation and specificity in binding of cLRPs to MHC-I alleles. Consolidation of all such parameters accentuated the differences between the 2 groups of patients demarcated by their TcTP burden (Fig. 2d-I). Lower TcTP (Group 1) tumors were associated with fewer antigenic peptides of lower relative abundance, and hence are likely to restrict fewer MHC-I alleles than Group 2 tumors with higher TcTP (Figs. 2d-II to 2d-VI). Most importantly, cLRP-derived neoantigens, as a determinant of patient allele(s), are suggested to influence prognosis; this allele-specificity of cLRPs could serve as a protective factor in a personalized manner and effectively impart a survival advantage in at least a subset of ovarian cancer patients (Fig. 2d-VII). Overall, a certain threshold of TcTP burden may be essential to assign a CT-derived survival advantage, especially against the background of rare outliers in Group 1 that could imply other, unknown protective mechanisms.
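The median-based demarcation into Group 1 and Group 2 can be sketched as follows. This is an illustration only: the text does not state how the CT and cLRP counts were combined, so the sketch assumes that a patient is assigned to Group 2 when both counts exceed the respective cohort medians and to Group 1 otherwise. The function name, patient identifiers and counts are hypothetical.

```python
from statistics import median


def split_by_median_burden(burden):
    """Assign each patient to a TcTP burden group.

    burden maps patient id -> (number of CTs, number of cLRPs).
    Assumption: Group 2 requires BOTH counts to exceed the cohort median.
    """
    ct_median = median(ct for ct, _ in burden.values())
    clrp_median = median(clrp for _, clrp in burden.values())
    groups = {}
    for patient_id, (ct, clrp) in burden.items():
        is_high = ct > ct_median and clrp > clrp_median
        groups[patient_id] = "Group 2" if is_high else "Group 1"
    return groups


# Hypothetical three-patient cohort (not data from this study).
example_cohort = {"P1": (50, 30), "P2": (70, 60), "P3": (60, 45)}
example_groups = split_by_median_burden(example_cohort)
```

Other rules, such as thresholding a combined score or requiring either count to exceed its median, would give different group sizes; the survival comparison should state which rule was applied.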
Positional amino acid binding preferences in cLRP-derived epitope-MHC complexes exhibit homology with binding pocket preferences of reference peptide:MHC structures
We increased the stringency of our prediction pipeline through haplotype-based determination of the stability and affinity of these potential neoantigens (NetMHCstabpan 1.0; Methods), identifying a subset of 55 peptides restricting 117 HLA alleles at high confidence and significant BA rank (p:MHC interactions; Thalf > 2; IC50 < 100 nM; Table S5; Figs. 3a,3b; Fig. S3). HLA-C alleles were strikingly absent in this subset, which represents the cLRP derivatives most likely to harbor neoantigenic potential. It includes a highly stable ASCSVAWSW:HLA-B*57:26 complex (Thalf = 31.44 h), while ATIRTVSSW:HLA-B*58:19 presented with the highest affinity (IC50 = 3.12 nM).
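The stability and affinity cut-offs above amount to a simple joint filter over predicted p:MHC pairs, sketched here in Python. The record type, field names and example values are hypothetical; the sketch assumes Thalf is expressed in hours and IC50 in nM, and it does not reproduce the output format of any prediction tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideMHC:
    peptide: str
    allele: str
    thalf_h: float   # predicted complex half-life, assumed to be in hours
    ic50_nm: float   # predicted binding affinity, assumed to be in nM


def high_confidence(pairs, min_thalf_h=2.0, max_ic50_nm=100.0):
    """Keep pairs that are both stable (Thalf > 2) and high affinity (IC50 < 100 nM)."""
    return [
        p for p in pairs
        if p.thalf_h > min_thalf_h and p.ic50_nm < max_ic50_nm
    ]


# Hypothetical predictions (not values from this study).
example_pairs = [
    PeptideMHC("PEPTIDE_1", "ALLELE_1", thalf_h=5.0, ic50_nm=20.0),    # passes both
    PeptideMHC("PEPTIDE_2", "ALLELE_1", thalf_h=0.5, ic50_nm=10.0),    # unstable
    PeptideMHC("PEPTIDE_3", "ALLELE_2", thalf_h=12.0, ic50_nm=450.0),  # weak binder
]
example_kept = high_confidence(example_pairs)
```

Both inequalities are strict, matching the thresholds as written, so a pair sitting exactly on a cut-off is excluded.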
Further molecular modelling of the p:MHC interactions was performed using reported PDB allele structures (http://www.IMGT/3Dstructure-DB). Twelve MHC allele structures were thus available within our list of restricted alleles, which interact with 27 peptides in 39 complexes, 27 of which display significant polar, electrostatic, and hydrophobic interactions (Figs. S4a,S4b). Within these, we performed Gibbs motif analyses for 7 complexes (for which binding pocket details were available in elucidated crystal p:MHC structures; Methods). This guided the assignment of positional residue specificity within cLR-p:MHC molecule interactions (Fig. 3c). Further comparison of the polar, electrostatic and hydrophobic interactions within these 7 complexes against the reference structures (RP:MHC) available for each allele revealed some identical interactions, alongside others in which a different peptide residue interacts with the same MHC residue at a specific position (we termed these MHC interactive residues; Table S6, Supplementary Information 4 Video 1).
Thus, the high probability positional residues identified through Gibbs Cluster analysis anchor ATQGRSWRK in HLA-A*11:01 through interactions of A1 with T171 and E63 (pocket A, identical to reference), and K9 with Y99 (pocket D, MHC interactive residue); this complex is additionally stabilized through binding of T2 and Q3 with MHC interactive residues in pockets B and C respectively (Fig. 3d-i). Our observation that the KRMLASFSF:HLA-B*27:09 complex is stabilized through 3 polar bonds and 1 electrostatic bond between R2 and B pocket residues (E45, E63, T24, identical to reference) is supported by an earlier report indicating that pocket B of HLA-B*27:09 is sterically and electrostatically suited to bind arginine37. This complex displays additional bonds between K1 and MHC interactive residues in pocket A that create an outward directed kink in the peptide and expose the remaining residues (Fig. 3d-ii). A high probability positional Y9 residue anchors RQRQKRIAY in pocket F of HLA-B*15:01 identically to the reference, along with additional bonds between R1 and MHC interactive residues in pockets A and B (Fig. 3d-iii). Pocket B of HLA-B*58:01 interacts with several residues of KQLLHSWKW including the high probability positional residues W9, S6 and K8; a predicted Y3-R97 salt bridge may however have a destabilizing effect (Fig. 3d-iv). The LAARPGPRW:HLA-B*57:03 and IFWDIFCRF:HLA-A*24:02 models, however, did not compare favourably either with Gibbs analyses or with the reference complexes (Figs. S4c-i,c-ii). The high probability positional R2 of RRTERAPRF:HLA-B*27:05 displayed interactions with MHC interactive residues in pockets E and F, with additional stabilization through interactions of Y3, P7 and F9 with MHC interactive residues (Fig. S4c-iii).
In conclusion, the commonalities identified between cLRP- and R-p:MHC structures, in terms of the involvement and positional preferences of specific peptide and MHC residues, will contribute to their recognition as neoepitopes.
cLR-p:MHC:TCR complex stabilization involves dynamic interactions similar to reference structures
We next performed molecular modeling to examine the interactions involved in recognition of cLR-p:MHC complexes by the α and β chains of the TCR, which is the defining prelude to MHC-I mediated T-cell cytotoxicity. cLR-p:MHC:TCR complexes were generated considering the reported complementary CDR3 chain sequences available for 3 alleles (HLA-A*24:02:4, HLA-A*11:01:2, HLA-B*27:05:4). Comparison of these models with reference epitope interactions displayed relatively similar ERGO II scores across all complexes, with those for ATQGRSWRK:HLA-A*11:01 and KIINPIIRK:HLA-A*11:01 being indicated at a higher confidence (Fig. S6a; Table S9). Interactions of ATQGRSWRK:HLA-A*11:01 with 2 reported TCRs were examined and compared for stability and affinity with those of the 2 reference peptides (one each for MHC and TCR binding) viz. AIMPARFYPK:HLA-A*11:0138 (RP:MHC) and HLA-A*11:01-TCR: IVTDFSVIK 39 (RP:TCR). The ATQGRSWRK:HLA-A*11:01:TRAV21*01/J50*01-TRBV6.6*01/J2.3*01 complex displayed a strong binding score with 4 polar interactions similar to those in the RP:TCR (R5-S98; peptide:TCR-β; Figs. 4a-i,4a-ii) and RP:MHC (A1-Y7, A1-Y159 in A pocket; T2-E63 in B pocket; Fig. 4a-iii), which were further stabilized by de novo interactions including K9-D116 (MHC-F pocket; Fig. 4a-iv; Table S10). Alignment of crossing and incident angles (61.368° and 11.3246° respectively) with a smaller θ (x-axis angle of projection of the TCR:CoM–MHC:CoM vector; -33.41°) leads to a diagonal placement of the TCR on the cLR-p:MHC structure and facilitates the R5-S98 interaction (Fig. 4b, Fig. S6b, Supplementary Information 4 Video 2). Visualization of polar, hydrophobic and electrostatic clouds indicated evasion of steric hindrance; specifically, presentation of R5 towards the outside facing plane (surface) of the MHC groove, along with a lateral shift of its amino terminal within the same pocket, makes it available for interaction with the TCR-β chain (Figs. S6c-i,S6c-ii, left panels).
TCR recognition and the R5-S98 interaction also achieve a robust anchoring of the peptide deeper into the hydrophobic MHC pocket (Figs. S6c-i,S6c-ii, right panels), enhancing stabilization of the entire complex.
On the other hand, no interactions between peptide and TCR residues were evident in the ATQGRSWRK:HLA-A*11:01:TRAV35*01/J49*01-TRBV11-2*01/J1-2*01 complex, although most of the polar, electrostatic and hydrophobic p:MHC interactions were retained and the geometrical alignment of crossing and incident angles was similar to that within the ATQGRSWRK:HLA-A*11:01:TRAV21*01/J50*01-TRBV6.6*01/J2.3*01 complex (Table 11; Fig. S6d). This is likely due to the steric hindrance generated by an altered TCR:CoM–MHC:CoM vector (θ: -143.15°), which in turn results in a positional shift of S6 deeper into the MHC complex that collaterally decreases the possibility of R5 interacting with TCR-β. Comparative modeling by superimposing ATQGRSWRK:HLA-A*11:01 with the 2 reported TCRs further emphasized the planar shift and steric hindrance that reduce the chances of p:TCR interactions within the A*11:01:TRAV35*01/J49*01-TRBV11-2*01/J1-2*01 complex (Fig. 4c). Effectively, this highlights that prioritization of neoepitopes through their recognition and binding by the TCR depends on (i) the orientation and conformation of a firmly anchored peptide in the MHC molecule with specific exposed residues, and (ii) the extent of perturbations within the p:MHC-I complex caused by the proximity of a TCR through steric hindrance.
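The θ values contrasted above (-33.41° versus -143.15°) describe the direction of the vector joining the MHC and TCR centres of mass. A minimal Python sketch of such a calculation is given below under stated assumptions: atom coordinates are already expressed in a frame whose x-y plane is the reference plane of the MHC groove, an unweighted centroid stands in for the centre of mass, and θ is taken as the signed angle between the x-axis and the projection of the MHC-to-TCR vector onto that plane. The function names and coordinates are hypothetical, and the docking software used in this study may define the angle differently.

```python
import math


def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def projection_angle_deg(mhc_coords, tcr_coords):
    """Signed angle (degrees) between the x-axis and the x-y projection
    of the vector from the MHC centroid to the TCR centroid."""
    mx, my, _ = centroid(mhc_coords)
    tx, ty, _ = centroid(tcr_coords)
    return math.degrees(math.atan2(ty - my, tx - mx))


# Hypothetical coordinates (not taken from any structure in this study).
example_mhc = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
example_tcr = [(2.0, -1.0, 5.0), (2.0, -1.0, 7.0)]
example_theta = projection_angle_deg(example_mhc, example_tcr)
```

Using `atan2` keeps the sign and quadrant of the angle, which matters here because the two complexes differ mainly in where the TCR sits relative to the groove axis rather than in its distance from the MHC.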
Similar assessment of earlier reported immunizing mutated neoantigens affirms prediction analyses
We further compared our evaluation with reported, tumor-specific mutated immunizing neoantigens in melanoma40 and glioblastoma41. Processing of the immunizing mutated peptides reported in these studies through our stability- and affinity-based prediction identified 6 and 24 antigenic peptides in melanoma and glioblastoma respectively (Figs. 5a-i,5b-i; Table S11; Figs. S7a,S7b,S7c). Molecular modeling of some of these, for example NEVSEVTVF:HLA-B*18:01 (melanoma), revealed polar interactions (V8-R97; F9-W147) similar to those in the reference p:MHC complex (Fig. 5a-ii; Table S12). Following TCR recognition and binding, the NEVSEVTEF:HLA-B*18:01:TRAV1-2*01-J32*01:BV18*01-J1-4*01 complex displayed a polar p:MHC interaction (E2-S24) identical to the reference (DELEIKAY:HLA-B*18:01), and unique p:TCR-α (polar: S4-G92) and p:TCR-β (electrostatic: E8-K51; Fig. 5a-iii) interactions. A second complex, NEVSEVTEF:HLA-B*18:01:TRAV19*01-J3*01:BV20-1*01-J2-7*01, was characterized by a similar polar p:MHC interaction (E8-W147) and unique p:TCR-α (polar: S4-R96) and p:TCR-β (electrostatic: E8-R99) interactions as compared with reference complexes (Fig. 5b-iv).
The glioblastoma peptides AAHRARYFW and VSAAHRARY restricting HLA-B*27:05 display an overlap with the reported peptide ARYFWGNLA. AAHRARYFW presents 3 polar p:MHC interactions (A2-E63, A2-R62, A2-Y159; Fig. S7d-i), and complexes with TRAV14/DV4/J21-TRBV6-5/J1-1 through three similar polar, and single electrostatic and hydrophobic p:MHC interactions (A1-Y171, A2-E63, A2-Y159, W9-D77, A1-W167 respectively) and exclusive polar (R4-P28, R6-N98) and hydrophobic (A5-Y52) interactions with TCR-α (Fig. S7d-ii,iii; Table S12). VSAAHRARY displayed 6 polar p:MHC interactions (V1-Y171, V1-Y159, S2-E63, S2-R62, Y9-D77, Y9-W147) and a single hydrophobic interaction (V1-W167) similar to those of the reported peptide, and interacts with TCR TRAV14/DV4/J21-TRBV6-5/J1-1 through five similar polar p:MHC interactions (V1-Y171, V1-Y7, V1-Y159, S2-E63, A3-Y99), exclusive polar interactions with TCR-β (A7-G98) and TCR-α (H5-Y52), and two exclusive electrostatic interactions with TCR-α (R6-E95, R8-E30; Figs. 5b-ii,5b-iii,5b-iv). A few more peptides in both tumor datasets were indicated to be antigenic through Gibbs cluster motif analysis and docking (data not shown). Taken together, our prediction of neoepitopes generated from cLRPs represents a robust approach and suggests that chimeric transcript-derived proteins and their reading frame isoforms present cryptic epitopes in tumors.