CRISPR-guided reversion reveals the immunogenicity of a “non-MHC binding” cancer neoepitope in vivo

A high-affinity MHC I-peptide interaction is considered essential for immunogenicity. However, some neoepitopes with low affinities for MHC have been reported to elicit CD8-dependent tumor rejection in immunization-challenge studies. Here, we ask whether a non-binder, tumor-rejection-mediating neoepitope influences the natural immunogenicity of a tumor in vivo, in the absence of artificial immunization. A mutation in tumor MUT1 was edited to its WT counterpart; the mutation was then re-introduced into the WT tumor, recapitulating the mutation in a tumor MUT2. TILs from all three tumors show T cell activation. However, TILs of MUT1 and MUT2 show significantly stronger transcriptional signatures of cytotoxicity and TCR engagement, as well as a greater breadth of TCR reactivity, than those of WT. Structural modeling of the K^d-neoepitope complex suggests increased hydrophobicity of the neoepitope surface, consistent with higher TCR reactivity. These results reveal the immunogenicity in vivo of low-affinity or “non-binding” epitopes that do not follow the canonical view of MHC I-peptide recognition. A mutation that results in tumor rejection activity in a neoepitope (which is a poor binder of K^d) influences the immunogenicity of the tumor as a whole. Our results demonstrate the activity in vivo of a poorly-MHC I-binding cancer neoepitope.


Introduction
Antigen presentation by MHC molecules is fundamental to adaptive immunity. In the case of MHC I molecules, such presentation involves a complex series of steps that result in the proteolytic processing of whole or partially synthesized proteins, chaperoning of the peptides through the cytosol and the endoplasmic reticulum, and their rendezvous with MHC I molecules into a tri-molecular MHC I-β2 microglobulin-peptide (pMHC) complex 1. Based on extensive analyses of peptides recognized by mouse and human T cells against viral antigens, it has been clear that a high affinity (IC50 values < 500 nM, but preferably < 50 nM) of peptides for MHC I is essential for antigen presentation 2. This premise has been abundantly validated in its ability to predict the epitopes that can elicit a CD8+ T cell response measurable in vitro 3.
Subsequent to the advances in genomics and bioinformatics and the consequent ability to identify somatic mutations in cancers, identification of epitopes that can act as cancer vaccines has recently become a large area of enquiry. Since affinity of peptides to MHC I has withstood the test of time as a key criterion for predicting immunogenicity, this has been applied to the discovery of cancer neoepitopes as well, and a number of high affinity neoepitopes that can elicit tumor rejection as well as CD8 T cell responses measurable in vitro have been identified [4][5][6]. In many more instances, a measurable CD8 response has been considered a valid surrogate for tumor rejection, and CD8 T cell response-eliciting neoepitopes, which have a high affinity for MHC I, have been identified [7][8][9]. Indeed, the affinity of a peptide for MHC I has become so entrenched in immunological thought that peptides with a low affinity (IC50 of > 500 nM) are routinely excluded from consideration as candidates for vaccines, and are even often referred to as "non binders" to reinforce their irrelevance.
A small number of recent reports have examined the question of immunogenicity of mouse cancer neoepitopes from a vantage point agnostic to peptide-MHC I affinity. Such studies have reported a number of neoepitopes which bind MHC I with low affinity, and mediate CD8-dependent tumor rejection 10,11. At the same time, two retrospective human studies analyzing the genomic and clinical outcome data from nearly 7,000 patients with 27 cancer types have shown that better clinical outcomes and T cell infiltration of tumors are associated with the presence of cancer neoepitopes with low affinities for HLA I molecules, and not with the presence of high affinity HLA I-binding neoepitopes 12,13. Consistent with the lack of association between high affinity of neoepitope to MHC I and anti-tumor activity, all high affinity binding neoepitopes failed to elicit tumor rejection in a mouse model of ovarian cancer 14. Human clinical trials with high affinity neoepitopes have also failed to elicit significant CD8 T cell responses even when high affinity MHC I binding algorithms were used to predict the immunizing neoepitopes [15][16][17]. Such clinical trials have also not shown convincing evidence of anti-tumor activity of the immunizing neoepitopes.
Since the immunogenicity (and the ability to mediate CD8-dependent tumor rejection) of a neoepitope with poor affinity for MHC I molecules runs contrary to our dominant conception of MHC I-peptide interaction, it deserves critical scrutiny. The evidence for the immunological activity of low affinity or "non-MHC I-binding" neoepitopes has come thus far purely from studies where such neoepitope peptides are used to immunize mice followed by a tumor challenge. There is no evidence thus far that such neoepitopes are functional physiologically and in vivo. Here, we have asked whether the presence or absence of a low affinity MHC I-binding neoepitope in the tumor influences the spontaneous immunogenicity of the tumor in vivo. We have used CRISPR-mediated editing of the mutation to edit the tumor and have evaluated the consequences of such editing on the CD8 T cell immunogenicity of the tumor.

Results
Definition of the neoepitope Ccdc85c MUT. The Ccdc85c gene encodes a gap junction protein expressed mostly in the brain, colon, lung, kidney and testes in adult mice. The protein has no known oncogenic (driver) function. A non-synonymous (leucine to phenylalanine, Chromosome 12-108221754) somatic SNV in Ccdc85c was detected in the BALB/cJ Meth A fibrosarcoma (Fig. 1a). The mutation is heterozygous and the un-mutated as well as the mutated reads are detected in the transcripts. BALB/cJ bone marrow derived dendritic cells (BMDCs) pulsed with an 18-mer peptide with the mutant amino acid near the center (DPSSTYIRPFETKVKLLD), or un-pulsed BMDCs, were used to immunize BALB/cJ mice as described in Methods. All mice (n = 225) were challenged with the Meth A cells, and tumor rejection was monitored. A tumor rejection score (TRS) (with a maximum score of 5 indicating near 100% tumor rejection) was used to quantitate the extent of tumor rejection as described in Methods. The 18-mer peptide elicited a perfect TRS of 5.0 (Fig. 1b). Various truncated versions of this peptide, as indicated in Fig. 1b, were similarly tested for tumor rejection. The most and the least effective peptides along with their TRS values are shown in Fig. 1c. Since the 10 amino acid peptide YIRPFETKVK was the shortest peptide active in tumor rejection, we consider this the precise epitope.
Evidence of presentation of YIRPFETKVK was sought by analyzing the peptides eluted from MHC I molecules purified from the Meth A cells by mass spectrometry (MS), as described in Methods. No Ccdc85c-derived peptides were detected, as expected from the low abundance of expression of this protein. In order to identify the precise peptide derived from the mutant Ccdc85c that could be cross-presented by the DCs, BMDCs were pulsed in vitro with the 18-mer peptide as previously described 11. The BMDCs were extensively washed and MHC I molecules eluted. Targeted-MS analysis of the eluted peptides in the presence of spiked-in heavy labeled synthetic peptides showed the presence of two Ccdc85c-derived peptides, TYIRPFETKVK and YIRPFETKVK (Fig. 1d). These two peptides detected by cross-presentation of the 18-mer peptide were identical to the two truncated versions of the 18-mer peptide that were observed to be the most effective in tumor rejection (Fig. 1c). These peptides had very low or undetectable predicted as well as measured affinities for K^d, D^d and L^d, as shown for K^d in Fig. 1e.
With such low affinities, these neoepitopes would normally be considered non-binders.
The 18-mer sequence was queried for the presence of predicted K^d-, D^d- or L^d-binding peptides. No D^d- or L^d-binding peptides were predicted; three peptides were predicted to bind K^d, albeit with poor affinity (IC50 values between 692 and 864 nM) (Fig. 1e). Ironically, none of these three peptides were detected by MS among peptides eluted from MHC I of BMDCs pulsed with the 18-mer long peptide.
Immunization of BALB/cJ mice with the un-mutated Ccdc85c 18-mer peptide failed to elicit protection from tumor growth (Fig. 2a). The anti-tumor activity of Ccdc85c MUT was abrogated by depleting the mice of CD8 cells by treating the mice with the anti-CD8 antibody but not by a control antibody during the priming phase as previously described 11 .
In order to determine if any peptides within Ccdc85c MUT could be presented by MHC II molecules, we analyzed the interaction of H2-A^d and H2-E^d with TYIRPFETKVK, YIRPFETKVK and IRPFETKVK as well as their wild type counterparts, using a cell-surface density assay. In this assay, β-chains of H2-Ab1^d or H2-Eb1^d are expressed in fusion with the peptide of interest and the amount of cell-surface MHC II, as a measure of intrinsic stability of MHC II, is quantified in engineered conditions 18. No significant difference was observed in binding of Ccdc85c MUT and Ccdc85c WT to H2-IA or H2-IE (Supplementary Table 1 and Supplementary Fig. 1).
Analysis of immune response against Ccdc85c MUT. The tumor microenvironment (TME) of Ccdc85c MUT-immunized mice was examined using single cell RNA sequencing (scRNA seq). BMDC-immunized mice were used as controls. As a peptide control, mice immunized with a neoepitope Alms1 MUT were used.
During comparison of the sequences of Meth A exomes with the normal BALB/cJ exomes, we identified Alms1 MUT (LYLDSKSDTTV), which was also identified among the peptides eluted from MHC I molecules from BMDCs pulsed with the 18-mer Alms1 MUT peptide. Although this neoepitope has a high affinity for the mouse MHC I K^d (IC50 62.25 nM), and it can be processed and presented, immunization of mice with the 18-mer Alms1 MUT failed to elicit tumor rejection (Supplementary Fig. 2). Hence, this peptide was chosen as a control peptide. Mice were immunized with Ccdc85c MUT, Alms1 MUT (peptide control) or BMDCs (control) as described in Methods, and were challenged with Meth A. RNA from CD45+ cells (estimated 15,925 cells for the four libraries after QC, with an average coverage of 32,663 reads per cell and median 1,339 genes per cell) was sequenced. Data from the four libraries were pooled and clustered based on the gene expression pattern of each as described in Methods. Annotation of the clusters was informed by both differentially expressed genes (DE Genes) and per cluster highly expressed genes identified by the TF-IDF analysis as described in Methods. Two major and distinct clusters were identified, namely myeloid and lymphoid. Upregulated genes used to identify the myeloid cluster included, but were not limited to, Itgam (CD11b), Adgre1 (F4/80), Arg1 (Arginase 1), Nos2 (Nitric oxide synthase 2), Ms4a4c (Membrane-spanning 4-domains, subfamily A, member 4C) and C1qa (Complement component 1). Upregulated genes used to identify the lymphoid cluster included, but were not limited to, Cd3 (CD3), Ptprcap (CD45-AP), Nkg7 (Protein NKG7), Cd28 (CD28), Gzma (Granzyme a) and Prf1 (Perforin1) (Fig. 2b, left panel). The proportion of lymphoid versus myeloid compartments in the Ccdc85c MUT library was very different compared to the Alms1 MUT or control groups.
The Ccdc85c MUT library was mostly composed of lymphoid cells (62.68% lymphoid and 37.32% myeloid), while the control groups and mice immunized with Alms1 MUT were mostly composed of the myeloid compartment (~35% lymphoid and ~64% myeloid) (Fig. 2b, right panel). In order to study different cell types, lymphoid and myeloid clusters were re-clustered into 6 and 9 sub-clusters, respectively (Fig. 2b, bottom panel). The six identified lymphoid clusters were: CD4 T cells (CD4(1)), NKC(1), naive/early activated CD4 T cells (CD4(2), defined by a high expression of Sell, Il7r, Tcf7 and Ccr7 genes, low expression of Il2ra and lack of expression of effector and cytotoxicity genes), NKC(2) (less cytotoxic and active than NKC(1)), CD8 T cells (CD8) and proliferating CD4/CD8 T cells (Pr. CD4/CD8, defined by higher expression of Stmn1 and Mki67 genes and cell cycle gene expression analysis, described in Methods). The selected genes used as markers to annotate each lymphocyte cluster are listed in the summary heat map (Fig. 2c, right panel).
The majority of NK cells (~80%) in the Ccdc85c MUT library were from the NKC(1) cluster, which was the more cytotoxic and active cluster (defined by higher expression of Cd44, Tnfa and Il7r), while the fraction of active NK cells in other libraries was about 55%.
To pinpoint differences in T cells of the four libraries, clusters 1 and 5 (activated CD4 and CD8 T cells) were computationally pooled and the expression of cytotoxicity and other effector function genes was compared between libraries. Proliferating CD4/CD8 T cells were excluded from further analysis because the gene expression levels in these cells could be influenced by the cell cycle effect prominent in this cluster.
Interestingly, the Ccdc85c MUT library contributed the most to the aforementioned pooled cells (31% Ccdc85c MUT, 25% Alms1 MUT, 21% Ctrl1, 20% Ctrl2). Also, the normalized average gene expression (described in Methods) of cytotoxicity (Gzmb, Prf1 and Nkg7) and other effector function (Ifng) genes was significantly higher in T cells derived from the Ccdc85c MUT library compared to the control or Alms1 MUT libraries. Similarly, T cells of the Ccdc85c MUT library had a significantly higher expression of genes involved in TCR engagement (Nr4a1 and Irf4). A transcription factor involved in transcription of cytotoxicity genes, Eomes, had a significantly higher expression in T cells of the Ccdc85c MUT library (Fig. 2d).
In the myeloid compartment, nine distinct clusters were identified. These are: macrophage1 (Mφ1), Mφ2 (defined by a moderate expression of Arg1 and lower expression of Cd302, Ccl5, Ccl8 and C1qa), monocyte1 (Mo1), Mφ3 (defined by a lower expression of Ccl8 and a higher expression of Ly6c, Cxcl9, Il1b, H2-Ab1, H2-DMb2, Mmp14 and Cd38), Mφ4 (defined by a higher expression of Nos2, Mrc1, Itgam, Pf4, C1qa, C1qb and C1qc), DC1, Mo2 (defined by a higher expression of Itgax, Tlr7, Ace, and Adgre4), neutrophil (Ne) and DC2 (defined by a higher expression of Ccr7, Ccl5, Samsn1, Pcgf5, Gyg, Net1 and was compared between the three libraries (Fig. 3a). CD8 T cells of all three libraries showed CD8 activation markers including but not limited to: Cd69 and Cd44 as activation markers, Lamp1 (CD107a) as a measure of degranulation and Tbx21 (Tbet), a transcription factor involved in transcription of cytotoxicity-associated genes (Fig. 3b, upper panel). We then compared the gene expression patterns of the total immune cell population in TILs of MUT1, REV and MUT2. The gene expression patterns in TILs of MUT1 and MUT2 showed a higher similarity to each other than to the REV: a simple hierarchical clustering (using Euclidean distance and complete linkage) of the MUT1, REV and MUT2 libraries, represented by the normalized average expression vector of top informative genes (selected by highest average TF-IDF score, see Methods), showed that MUT1 and MUT2 are closer to each other (distance 1.097) than to the REV (distance 1.820) (Fig. 3a, upper panel). In Fig. 3a, bottom panel, where genes with variability among the three libraries are juxtaposed, it is clear that the difference between REV and MUT1/MUT2 is more pronounced than the difference between MUT1 and MUT2.
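The library-level comparison above (Euclidean distance, complete linkage) can be illustrated with a minimal sketch. The expression vectors below are invented toy values, not the study's data, and the function names are hypothetical; the sketch only shows how complete linkage decides which libraries merge first.

```python
import math
from itertools import combinations

def euclidean(u, v):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    profiles: dict mapping a library name to its expression vector.
    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of library names.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            d = max(euclidean(profiles[x], profiles[y]) for x in a for y in b)
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b, d))
    return merges

# Toy vectors (assumed values for illustration only).
toy = {
    "MUT1": [1.0, 0.9, 0.2],
    "REV": [0.1, 0.2, 0.9],
    "MUT2": [0.9, 1.0, 0.3],
}
merges = complete_linkage(toy)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

With these toy values the two mutant libraries merge first and the REV library joins last, which is the same qualitative pattern reported for the real data.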
To identify differences in CD8 T cells of the three libraries, the RNA sequencing data of the combined libraries were clustered based on the gene expression pattern of each cell type as described in Methods.
Annotation of the clusters was informed by both differentially expressed genes (DE Genes) and per cluster highly expressed genes identified by the TF-IDF analysis. T cells were re-clustered into 7 clusters by unsupervised clustering as described in Methods and enriched CD8 T cell populations were further analyzed (Supplementary Fig. 5a-b). Some cytotoxicity and effector function genes (Tnf (TNF), Gzma (Granzyme a) and Ifng (Interferon gamma)) had a similar expression pattern in CD8 T cells of all three libraries; however, the normalized average gene expression of other cytotoxicity genes (Fasl (Fas ligand), Gzmb (Granzyme b) and Prf1 (Perforin1)) as well as other effector function genes (Pdcd1 (PD1) and Tbx21 (Tbet)) was significantly higher in CD8 T cells derived from MUT1 and MUT2 libraries compared to the REV library (P value < 0.001). Similarly, CD8 T cells of MUT1 and MUT2 libraries had a significantly higher expression than the REV library of early response genes which are involved in TCR engagement (Nr4a1, Nr4a2 and Nr4a3, P value < 0.001) (Fig. 3b, bottom panel).
T cell receptors (TCRs) in the TILs of MUT1, REV and MUT2 tumors. T cell receptors (TCRs) in the TILs of the three libraries were characterized using Grouping of Lymphocyte Interactions by Paratope Hotspots (GLIPH) analysis, which groups the TCRs into specificity groups based on the global and local similarities of the CDR3 regions of the TCRs 19. Based on the GLIPH algorithm, 40 to 42.9% of all distinct clonotypes contributed to forming network/similarity-based specificity groups in each of the libraries, while the rest were standalone clonotypes (with no similarity to other clonotypes). The similar percentage of network-based specificity groups (40-42.9%) in all three libraries was expected because of the existence of other mutations (other than Ccdc85c MUT) in all three libraries. To further analyze the networks, we performed Louvain graph-based clustering of the specificity networks and calculated the modularity scores of the identified communities for each of the libraries (a score of zero means the communities are the same and a score of one refers to a perfect separation between communities). The modularity of a graph with respect to its division into communities measures how well separated (diverse) the different nodes (clonotypes) forming the communities are from each other (see Methods). In the TILs, the TCR clonotypes that form communities/specificity groups are almost identical in frequency (42.9% for REV, and 42% and 40% for MUT1 and MUT2). However, the average modularity score of the communities/specificity groups including the most frequent (expanded) clonotypes is 0.53 for the REV, 0.73 for MUT1 and 0.77 for MUT2, indicating lower diversity of TCR clonotypes in the TILs of REV than in those of MUT1 and MUT2.
Using GLIPH analysis, the top ten clonally expanded CD8 T cells were computationally pooled and further analyzed for gene expression patterns of their cytotoxic and effector functions. The normalized average gene expression of cytotoxicity-associated genes (Fasl or Fas ligand, Gzmb or Granzyme b, Prf1 or Perforin1, Nkg7 or Protein NKG7) as well as other effector function genes (Tbx21 or Tbet, Pdcd1 or PD1 and Ifng or Interferon gamma) was significantly higher in the clonally expanded CD8 T cells derived from MUT1 and MUT2 libraries compared to the REV library (Fasl P value < 0.001, Gzmb P value < 0.001, Prf1 P value < 0.001, Nkg7 P value < 0.001, Tbx21 P value < 0.001, Pdcd1 P value < 0.001 and Ifng P value < 0.001). Similarly, the top 10 clonally expanded CD8 T cells of MUT1 and MUT2 libraries had a significantly higher expression of early response genes which are involved in TCR engagement (Nr4a1 or NUR/77 P value < 0.001, Nr4a2 or NUR-related factor 1 P value < 0.001 and Nr4a3 or Orphan nuclear receptor TEC P value < 0.001) (Fig. 3c). Interestingly, Ifng and Nkg7, which had a similar expression pattern in the pooled CD8 T cells of all three libraries (Supplementary Fig. 5c), had significantly higher expression in the top 10 clonally expanded CD8 T cells derived from MUT1 and MUT2 libraries compared to the REV library.
Molecular modeling of Ccdc85c MUT. To gain insight into how the leucine to phenylalanine mutation leads to immunogenic epitopes, we modeled the structures of the 11-mer TYIRPFETKVK neoepitope and the 10-mer YIRPFETKVK neoepitope bound to K^d. We modeled each corresponding WT peptide as well, to assess possible changes resulting from the mutation and thus infer how the neoepitopes might differ from self. We used the same stochastic, template-based modeling procedure previously applied to murine neoepitopes 11. For the TYIRPFETKVK 11-mer, the phenylalanine at position 6 is predicted to extend up from the peptide near the MHC α2 helix, increasing the amount of exposed hydrophobic surface by 5% over the wild type peptide and potentially allowing the aromatic phenylalanine to interact with T cell receptors (Fig. 4a). Other than the side chain replacement, no conformational changes are predicted to occur in the peptide. For the YIRPFETKVK 10-mer, the new phenylalanine at position 5 is predicted to pack between the peptide and the α2 helix, in this case reducing exposed hydrophobic surface area (Fig. 4b). Subtle structural changes are predicted for the exposed side chains at positions 6 and 9, which could be suggestive of changes not captured by static structural modeling, such as changes in peptide flexibility that lead to altered TCR recognition 10.

Discussion
A high binding affinity of peptides to MHC I is generally considered essential for immunogenicity 2,3. However, some reports with cancer neoepitopes show that even peptides with very low affinities for MHC I elicit CD8-dependent tumor rejection 10,11. These reports have used immunization with peptides to demonstrate immunogenicity. Here, we have asked whether low affinity neoepitopes actually influence the natural immunogenicity of a tumor in vivo in the absence of artificial immunization. The answer is a clear affirmative. Using CRISPR to edit the cancer genome, our results show that introduction of a single point mutation into the Meth A tumor results in strong transcriptomic signatures of TCR engagement and cytotoxic functions in the CD8 T cells infiltrating the tumor. Extinction of this mutation eliminates that signature. Remarkably, the Ccdc85c MUT neoepitopes used here have very low affinities (IC50 values of 1,434 and 39,661 nM) for K^d. These results have been obtained during examination of the natural growth of a tumor in the absence of any immunization and indicate that the low affinity MHC I-binding neoepitopes have a functional role in the immunogenicity of a tumor in vivo.
These findings are the most detailed yet on the activity of a neoepitope that would be considered a non-MHC I binding epitope. Under the canonical view of MHC I-peptide interaction, epitopes with such low affinities are typically considered to be non-immunogenic and are routinely eliminated from further study. Our results show that such non-canonical neoepitopes indeed behave in a manner similar to the traditional high affinity MHC I-binding epitopes, and in ignoring them, we run the risk of ignoring a significant proportion of the cancer immunome. Studies with several thousand cancer patients with a wide array of cancers have also noted the strong correlation between presence of low affinity neoepitopes and good clinical outcomes 12,13,20.

Purification of MHC I eluted peptides
MHC-I peptides were immunoaffinity purified as described before 11. MethA cells or BMDCs were lysed and MHC-I molecules were immunoaffinity purified from cleared lysates with HIB antibodies cross-linked to Protein A-Sepharose 4B beads at 4 °C. MHC-I complexes and the bound peptides were eluted with 1% trifluoroacetic acid (TFA). Elutions containing MHC-I molecules were loaded in pre-conditioned Sep-Pak tC18 96-well plates (Waters). MHC-I peptides were eluted with 28% ACN in 0.1% TFA. Recovered peptides were dried using vacuum centrifugation (Thermo Fisher Scientific) and stored at -20 °C.

LC-MS/MS analyses for the discovery of neoepitopes
The LC-MS system consists of an Easy-nLC 1200 (Thermo Scientific, Bremen, Germany) coupled on-line to a Q Exactive HF and/or HF-X mass spectrometer (Thermo Scientific, Bremen, Germany). The LC-MS/MS parameters used for detection of the TYIRPFETKVK and YIRPFETKVK peptides were previously reported 11; the parameters used for the detection of the peptide LYLDSKSDTTV are reported here. The analytical separation of the peptides was performed on a 500 mm homemade column of 75 µm inner diameter packed with ReproSil Pur C18-AQ 1.9 µm resin (Dr. Maisch GmbH, Ammerbuch Entringen, Germany) during 120 min using a gradient of H2O/FA 99.9%/0.1% and ACN/FA 95%/0.1%. For discovery, MS spectra were acquired in the Orbitrap from m/z = 300-1650 at a resolution of 60,000 (m/z = 200) with a maximum injection time of 20 ms. The auto gain control (AGC) target value was set to 3e6 ions. MS/MS spectra were acquired at a resolution of 15,000 (m/z = 200) using a 'top 10' data-dependent acquisition method. Each precursor ion was sequentially isolated with an isolation window of 1.2 m/z and activated by higher-energy collision dissociation (HCD) with a normalized collision energy (NCE) of 27. Ions were accumulated to an AGC target value of 1e5 with a maximum injection time of 120 ms. In the case of an assigned precursor ion charge state of 4 and above, no fragmentation was performed. Selected ions were dynamically excluded from additional fragmentation for 20 seconds and the peptide match option was disabled.

Identification of peptides by MS
We employed the MaxQuant platform 21 version 1.5.5.1 to search the peak lists against a fasta file containing the mouse proteome (Mus musculus_UP000000589_10090, the reviewed part of UniProt, with no isoforms, including 24,907 entries downloaded in June 2016) concatenated to a list of 3,783 long peptides (up to 31 aa) encompassing the non-synonymous somatic mutations described above. The second peptide identification option in Andromeda was enabled. The enzyme specificity was set as unspecific. An FDR of 1% was required for peptides and no protein FDR was set. Peptides with a length between 8 and 25 amino acids were allowed. The initial allowed mass deviation of the precursor ion was set to 6 ppm and the maximum fragment mass deviation was set to 20 ppm. Methionine oxidation and N-terminal acetylation were set as variable modifications.
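As an illustration of the mass-deviation settings above, the following minimal sketch shows how a deviation in parts per million (ppm) is conventionally computed and compared against a tolerance. It is not MaxQuant's internal code; the function names and the example m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, tol_ppm):
    """True if the absolute deviation does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm

# Tolerances quoted in the text: 6 ppm (precursor), 20 ppm (fragment).
PRECURSOR_TOL_PPM = 6.0
FRAGMENT_TOL_PPM = 20.0

# Hypothetical m/z values chosen only to exercise the check.
print(within_tolerance(500.0025, 500.0, PRECURSOR_TOL_PPM))  # about 5 ppm
print(within_tolerance(500.0050, 500.0, PRECURSOR_TOL_PPM))  # about 10 ppm
print(within_tolerance(500.0050, 500.0, FRAGMENT_TOL_PPM))
```

A deviation of 0.0025 at m/z 500 corresponds to about 5 ppm, so it would pass the precursor tolerance, whereas 0.005 (about 10 ppm) would pass only the fragment tolerance.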

Validation of neoepitopes with Parallel Reaction Monitoring (PRM)
Synthetic peptides labeled with heavy isotopes were purchased as crude (PEPotec SRM Custom peptide libraries grade 3) from ThermoFisher Scientific (Paisley, PA49RE, UK). For quality control, before spiking the peptides, we confirmed the absence of residual interferences of 'light' peptides by measuring each peptide separately with the method described below. The peptides were spiked into each of the peptidomic samples at a concentration of 100 fmol/µl. The PRM parameters used for detection of the TYIRPFETKVK and YIRPFETKVK peptides were previously reported 11; the parameters used for the detection of the peptide LYLDSKSDTTV are reported here. The mass spectrometer was operated at a resolution of 120,000 (at m/z = 200) for full scan MS, scanning a mass range from 300-1,650 m/z with a maximum ion injection time of 120 ms and an AGC target value of 3e6. Then each peptide was isolated with an isolation window of 1.2 m/z prior to ion activation by HCD (NCE = 27). Targeted MS/MS spectra were acquired at a resolution of 60,000 (at m/z = 200) with a maximum ion injection time of 180 ms and an AGC target value of 1e6. The data were processed and analyzed by Skyline (MacCoss Lab, Skyline v19.1.0.193, Seattle, USA). An ion mass tolerance of 0.055 m/z was used to extract fragment ion chromatograms. Peptides with a precursor charge state z ≤ 3+ and fragment ions with z ≤ 2+ were used to monitor multiple transitions corresponding to b and y ion types. We plotted y ion type transitions with z = 1+. We then enabled synchronization of isotope labels for a proper alignment of transitions between heavy and endogenous peptides. Raw data were converted into Mascot generic format (mgf) by MSConvert (Proteowizard, Palo Alto, CA 94304, USA) in order to extract matched peak lists for the heavy peptide and its light counterpart for visualization of the MS/MS spectra. The assessment of MS/MS matching was done by pLabel (Version 2.4.0.8, pFind studio, Sci. Ac., China) and Skyline.
For the data shown in Fig. 1D, targeted MS-based detection of TYIRPFETKVK and YIRPFETKVK among MHC I peptides eluted from BMDCs pulsed with the 18-mer Ccdc85c MUT. Heavy labeled synthetic peptides were spiked into the peptide samples; the labeled amino acid is marked with a bold character and the mutation is in red. Matched peak lists for the "heavy" and "light" ions were extracted and monitored, while only single charge y ions were plotted. First, the absence of "light" peptide and the presence of the "heavy" peptide were confirmed by Parallel Reaction Monitoring (PRM) as a quality control measure in the synthetic peptide samples (upper left and lower left, respectively). Then, the co-elution of the synthetic "heavy" and endogenous "light" fragment ions was measured by PRM in Ccdc85c MUT

Tumor growth analysis
Area under the curve (AUC) has been described previously as a tool to measure tumor growth 25. Briefly, AUC was calculated by selecting "Curves & Regression" and then "Area under curve" from the "analyze" tool, using Prism 5.0 (GraphPad).

Immunization
Fifty microliters of TiterMax (CytRx Corporation) or Day 7 granulocyte-macrophage colony-stimulating factor-derived BMDCs (GM-CSF-BMDCs) were pulsed with 40 µg of the neoepitope. The pulsed BMDCs/TiterMax were used to immunize a single mouse. All immunizations were performed in the presence of CTLA4 blockade, using the IgG2b isotype (Clone: 9D9, Bio X Cell), administered with the second immunization and every 3 days after tumor challenge. Peptides were synthesized by JPT Peptide

Single cell RNA sequencing alignment, barcode assignment, and UMI counting
The Cell Ranger v.3.0 count pipeline was used to process the FASTQ files for each sample. The mm10 genome and transcriptome were used to align, filter, and quantify samples.
The "cellranger aggr" pipeline was used to aggregate the analysis files for each sample into a combined set by performing between-sample normalization (samples are subsampled for an equal number of confidently mapped reads per cell). The Cell Ranger pipeline output, the 'feature (gene) vs cell' count matrix, is then used for the secondary scRNA-Seq analysis in SC1 as described below 27.

Single cell data analysis
Samples from the libraries were analyzed using the SC1 tool available at sc1.engr.uconn.edu. Preprocessing quality control was conducted to exclude outlier and low quality cells based on the data distribution: of the 20,422 cells from the 10x pipeline, 15,925 cells met our QC criteria (cells with over 30,000 total UMIs, expressing fewer than 500 genes or more than 5,000 genes, with more than 10% mitochondrial genes, or with less than 5% ribosomal genes were excluded from the analysis).
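The exclusion thresholds listed above can be written as a simple per-cell filter. This is a minimal sketch, not the SC1 implementation; the dictionary keys (`total_umis`, `n_genes`, `pct_mito`, `pct_ribo`) and the example cells are hypothetical.

```python
def passes_qc(cell):
    """Apply the exclusion thresholds quoted in the text to one cell.

    cell: dict with hypothetical keys
      total_umis - total UMI count
      n_genes    - number of expressed genes
      pct_mito   - percent of counts from mitochondrial genes
      pct_ribo   - percent of counts from ribosomal genes
    """
    if cell["total_umis"] > 30000:
        return False  # likely multiplet or outlier
    if cell["n_genes"] < 500 or cell["n_genes"] > 5000:
        return False  # too few or too many expressed genes
    if cell["pct_mito"] > 10.0:
        return False  # high mitochondrial fraction
    if cell["pct_ribo"] < 5.0:
        return False  # low ribosomal fraction
    return True

# Invented example cells, for illustration only.
cells = [
    {"id": "c1", "total_umis": 8000, "n_genes": 1339, "pct_mito": 4.0, "pct_ribo": 12.0},
    {"id": "c2", "total_umis": 45000, "n_genes": 4800, "pct_mito": 3.0, "pct_ribo": 15.0},
    {"id": "c3", "total_umis": 2000, "n_genes": 300, "pct_mito": 2.0, "pct_ribo": 9.0},
    {"id": "c4", "total_umis": 9000, "n_genes": 2000, "pct_mito": 18.0, "pct_ribo": 9.0},
]
kept = [c["id"] for c in cells if passes_qc(c)]
print(kept)
```

Only the first example cell survives all four checks; the others fail on total UMIs, gene count, and mitochondrial fraction, respectively.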
Cells were then clustered using Ward's Hierarchical Agglomerative Clustering algorithm using the top average TF-IDF genes as features 27, after log2(x + 1) transformation of the data. Clusters were annotated based on one-versus-all differential expression analysis between clusters, with differential expression determined by p < 0.01 and an absolute log2 fold change of > 1.
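The transformation and the differential-expression cutoffs named above are simple enough to state in code. The following is a minimal sketch with hypothetical function names and invented gene statistics; it does not reproduce the SC1 tool, the TF-IDF feature selection, or the Ward clustering step.

```python
import math

def log2p1(counts):
    """log2(x + 1) transform of a list of raw counts."""
    return [math.log2(c + 1) for c in counts]

def is_differentially_expressed(p_value, log2_fold_change):
    """Cutoffs quoted in the text: p < 0.01 and |log2 fold change| > 1."""
    return p_value < 0.01 and abs(log2_fold_change) > 1

print(log2p1([0, 1, 3, 7]))

# Invented (gene, p-value, log2 fold change) triples for illustration only.
stats = [
    ("geneA", 0.0004, 2.3),   # passes both cutoffs
    ("geneB", 0.0004, 0.6),   # significant, but fold change too small
    ("geneC", 0.2, -3.1),     # large change, but not significant
    ("geneD", 0.001, -1.8),   # passes (downregulated)
]
de_genes = [g for g, p, lfc in stats if is_differentially_expressed(p, lfc)]
print(de_genes)
```

Using the absolute value of the log2 fold change means that both up- and downregulated genes can qualify, which matches the "absolute value" wording in the text.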

Lymphoid population
The lymphoid compartment was re-clustered into six subclusters. Based on the differentially expressed genes, different subclusters were annotated as follows: CD8 T cells (assigned by their expression of Cd3e, Cd8b1 and Cd8a), CD4 T cells (assigned by their expression of Cd3e and Cd8b), naive/early activated CD4 T cells (assigned by a high expression of Sell, Il7r and Ccr7 genes and lack of expression of Il2ra, effector and cytotoxicity genes), proliferating CD4/CD8 T cells (defined by higher expression of Stmn1 and Mki67 genes and cell cycle gene expression analysis), and NKC (defined by a high expression of Ncr1, Nkg7 and Fcer1g).

Myeloid population
The myeloid compartment was re-clustered into nine subclusters. Using the DE gene list for the myeloid compartment, the subclusters were annotated as follows: four clusters of macrophages (assigned by their expression of Mrc1, Adgre1 and Itgam), two clusters of DCs (assigned by their expression of Zbtb46, Flt3 and H2-Oa), two clusters of monocytes (assigned by their expression of Itgam, Ly6c, Ms4a4c and Il1b) and a neutrophil cluster (assigned by expression of S100a8, S100a9 and Itgam). Furthermore, for the cell types with more than one cluster (macrophages, DCs and monocytes), DE gene lists were generated to determine the main differences between the clusters of each cell type.
Cell cycle gene expression analysis: Using the SC1 tool, each cluster was examined against the cell cycle gene list obtained from the GO term "GO:0007049". Clusters that showed high expression of cell cycle genes and were dominated by the cell cycle effect were excluded from further analysis.

TCR sequencing analysis
Specificity groups/clusters in the TCR repertoire were identified via computational analysis using the grouping of lymphocyte interactions by paratope hotspots (GLIPH) algorithm from Glanville et al. 19 . GLIPH searches for global and local CDR3 motif similarity in TCR CDR regions with high contact probability. Each specificity group is analyzed in GLIPH for enrichment (of common V-genes, CDR3 lengths, clonal expansions, motif significance, and cluster size). Global similarity identifies CDR3s differing by up to one amino acid, and local similarity identifies shared enriched CDR3 amino acid motifs with at least 10-fold enrichment and a probability of less than 0.001. More details about the algorithm can be found in Glanville et al. 19 . Supplementary Tables 2-4 show the enriched CDR3 motifs of TCRs from TILs of the MUT1, REV and MUT2 libraries.
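To make the global-similarity criterion concrete, the sketch below groups CDR3 sequences that differ by at most one amino acid. This is not the GLIPH code: it assumes that "differing by up to one amino acid" means a single substitution between equal-length CDR3s, it ignores V-gene usage and all enrichment statistics, and the CDR3 strings are invented for illustration.

```python
from itertools import combinations


def differ_by_at_most_one(a, b):
    """True if two equal-length CDR3s differ at no more than one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) <= 1


def global_similarity_groups(cdr3s):
    """Group CDR3s into connected components of the 'differ by <= 1' graph."""
    parent = list(range(len(cdr3s)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cdr3s)), 2):
        if differ_by_at_most_one(cdr3s[i], cdr3s[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, seq in enumerate(cdr3s):
        groups.setdefault(find(i), []).append(seq)
    return sorted(sorted(g) for g in groups.values())


# Invented CDR3 sequences (not from the study).
toy_cdr3s = ["CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASSPDRGNTLYF", "CASSLGQAYEQF"]
groups = global_similarity_groups(toy_cdr3s)
```

In this toy input only the first two sequences fall into a shared group; the shorter fourth sequence stays separate because the sketch compares equal-length CDR3s only.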
Modularity Score (as defined in igraph R package)
The modularity of a graph with respect to some division (or vertex types) measures how good the division is, or how separated the different vertex types are from each other. It is defined as

$$Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)$$

where $m$ is the number of edges, $A_{ij}$ is the element of the adjacency matrix $A$ in row $i$ and column $j$, $k_i$ is the degree of $i$, $k_j$ is the degree of $j$, $c_i$ is the type (or component) of $i$, $c_j$ that of $j$, the sum goes over all $i$ and $j$ pairs of vertices, and $\delta(x, y)$ is 1 if $x = y$ and 0 otherwise.
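The definition can be checked on a small graph with a direct, unoptimized evaluation of the double sum. This is a sketch for an undirected, unweighted graph without self-loops, not the igraph implementation; the function name, the edge list, and the membership labels are illustrative assumptions.

```python
def modularity(edges, membership):
    """Evaluate Q = 1/(2m) * sum_ij (A_ij - k_i*k_j/(2m)) * delta(c_i, c_j).

    edges: list of (u, v) pairs of an undirected graph without self-loops.
    membership: dict mapping every vertex to its type/community label.
    """
    m = len(edges)
    degree = {v: 0 for v in membership}
    adjacency = {}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        # Store both directions so A is symmetric.
        adjacency[(u, v)] = adjacency.get((u, v), 0) + 1
        adjacency[(v, u)] = adjacency.get((v, u), 0) + 1

    total = 0.0
    for i in membership:
        for j in membership:
            if membership[i] == membership[j]:  # delta(c_i, c_j) = 1
                total += adjacency.get((i, j), 0) - degree[i] * degree[j] / (2 * m)
    return total / (2 * m)


# Toy graph: two triangles joined by a single bridging edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "a" for v in range(6)}

q_split = modularity(toy_edges, two_groups)   # 5/14, about 0.357
q_single = modularity(toy_edges, one_group)   # 0 when all vertices share a type
```

A well-separated division gives a positive score, while placing every vertex in one type gives zero, which matches the interpretation of modularity as a measure of how well the vertex types are separated.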

Clustered regularly interspaced short palindromic repeats (CRISPR)
A guide RNA was designed that included the C -> A Ccdc85c mutation in its seed region. The seed region refers to the 8-12 nucleotides proximal to the PAM; mismatches in this region significantly limit Cas9's ability to cleave target DNA. A single-stranded oligodeoxynucleotide (donor ssODN) template was designed that contained 50 bases of homology to the endogenous sequence on either side of the target base. The donor ssODN was resuspended in TE buffer and stored according to the manufacturer's recommendation upon receipt (IDT). Custom TrueGuide sgRNAs (Synthego) were resuspended to a concentration of 4 µg/µl in TE, aliquoted and stored at -20 °C. Prior to transfection, Cas9 ribonucleoprotein complexes (Cas9 RNP) were formed by incubating 10 µg of Alt-R HiFi Cas9 Nuclease v3 (IDT) with 8 µg of sgRNA in 0.3 M NaCl for 30 minutes at room temperature. To produce revertant clones, Cas9 RNPs complexed as described above were mixed with 200 pmol of donor ssODN and delivered into 10^6 MethA cells via electroporation with a Lonza 4D Nucleofector X Kit using program DS-150 and Cell Line Solution SG (Lonza) according to the manufacturer's protocols. Two days post-delivery, cells were split via limiting dilution into single-cell clones in a 96-well plate, allowed to expand for 21 days, and genotyped by PCR and Sanger sequencing at the Ccdc85c mutation locus. Clones that were successfully edited by CRISPR were identified by Sanger sequencing.
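The donor layout described above (50 bases of homology, the target base, then 50 more bases of homology) can be sketched as a string operation. This is only an illustration of the layout: the function name is hypothetical, the reference sequence is an artificial repeat rather than the Ccdc85c locus, and strand choice and any additional design considerations are not modeled.

```python
def design_ssodn(reference, target_index, new_base, arm_length=50):
    """Build a donor: left homology arm + edited base + right homology arm.

    reference: sequence containing the target base (hypothetical input).
    target_index: 0-based position of the base to be replaced.
    """
    if target_index < arm_length or target_index + arm_length >= len(reference):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = reference[target_index - arm_length:target_index]
    right_arm = reference[target_index + 1:target_index + 1 + arm_length]
    return left_arm + new_base + right_arm


# Artificial 120-nt sequence; position 60 carries an 'A' standing in for the
# mutant base, which the donor replaces with 'C'.
toy_reference = "ACGT" * 30
toy_donor = design_ssodn(toy_reference, 60, "C")
```

With 50-base arms the resulting donor is 101 nt long, with the edited base at its center.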
Structural modeling of wild type/neoepitope peptide-MHC pairs
Structural modeling of the 9-mer, 10-mer, and 11-mer wild type and neoepitope peptide/MHC pairs was conducted as previously described 11 . Briefly, modeling utilized Rosetta 28,29 and the ref2015 energy function 30 . The structures used as templates for modeling were PDB 5T7G 31 for the 9-mer peptides and 5GSV 32 for the 10-mer and 11-mer peptides. The templates were energy minimized via Rosetta FastRelax 33 . As there was no 11-mer peptide/MHC structure containing H-2K d as of November 2019, the 11-mer was approximated by interpolating a glycine between residues 5 and 6 of the template. Subsequently, the desired peptide sequence was introduced via mutation of the template peptide. Structures were first modeled with a low resolution centroid kinematic closure protocol 34 then with a high resolution atomistic protocol. To sufficiently sample the available conformational space, we modeled 10,000 decoys for each peptide/MHC. The lowest scoring decoy of each was retained for further analysis. Root-mean-square deviation of atomic positions (RMSD) of peptide common or backbone heavy atoms between wild type and mutant peptides was calculated and models were inspected visually for differences in structural features. Solvent-accessible surface areas were calculated in Rosetta using a probe radius of 1.4 Å.
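For reference, the RMSD between two matched sets of atoms can be computed as below. This sketch is not the Rosetta calculation: it assumes the two atom lists are already in one-to-one correspondence and in the same coordinate frame, so no superposition step is performed, and the coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates (in Å) for two atoms in each of two models.
model_wt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_mut = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
deviation = rmsd(model_wt, model_mut)
```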
Statistical analysis
P values for comparisons of MHC/GFP MFI and AUC scores were calculated using a t-test and a 1-way ANOVA test, respectively. P values were adjusted for multiple comparisons by the False Discovery Rate method or "Dunnett's multiple comparison test". P < 0.05 was considered statistically significant. Differential expression (DE) analysis was done by performing a t-test to compare clusters/libraries. The t-test uses the Welch (or Satterthwaite) approximation with a 0.95 confidence interval by calling the t-test available in the R stats package. The log2 fold change and P value from the analysis are reported using a 1.5-fold change cutoff and a P value cutoff of 0.05.

Declarations
Competing interests statement