Deciphering the immunopeptidome in vivo reveals new tumour antigens

Immunosurveillance of cancer requires the presentation of peptide antigens on major histocompatibility complex class I (MHC-I) molecules 1–5. Current approaches to profiling of MHC-I-associated peptides, collectively known as the immunopeptidome, are limited to in vitro investigation or bulk tumour lysates, which limits our understanding of cancer-specific patterns of antigen presentation in vivo 6. To overcome these limitations, we engineered an inducible affinity tag into the mouse MHC-I gene (H2-K1) and targeted this allele to the KrasLSL-G12D/+Trp53fl/fl mouse model (KP/KbStrep) 7. This approach enabled us to precisely isolate MHC-I peptides from autochthonous pancreatic ductal adenocarcinoma and from lung adenocarcinoma (LUAD) in vivo. In addition, we profiled the LUAD immunopeptidome from the alveolar type 2 cell of origin up to late-stage disease. Differential peptide presentation in LUAD was not predictable by mRNA expression or translation efficiency and is probably driven by post-translational mechanisms. Vaccination with peptides presented by LUAD in vivo induced CD8+ T cell responses in naive mice and tumour-bearing mice. Many peptides specific to LUAD, including immunogenic peptides, exhibited minimal expression of the cognate mRNA, which prompts the reconsideration of antigen prediction pipelines that triage peptides according to transcript abundance 8. Beyond cancer, the KbStrep allele is compatible with other Cre-driver lines to explore antigen presentation in vivo in the pursuit of understanding basic immunology, infectious disease and autoimmunity. A newly developed genetically engineered mouse model enables the analysis of specific antigen presentation in vivo, providing insights into the tumour immunopeptidome and cancer progression.

The success of cancer immunotherapy has led to an explosion of interest in understanding immune recognition of cancer 1,4 . Studies using preclinical models and patient samples have revealed that responses to immunotherapy depend on the presentation of peptide antigens on MHC-I 4,5 . MHC-I is a heterotrimeric complex that consists of a heavy chain (H2-K and H2-D in C57BL/6 mice, HLA-A, HLA-B and HLA-C in humans), a light chain (β2 microglobulin) and a peptide (generally 8-11 amino acids in length). Peptides presented by MHC-I are derived from the proteolysis of intracellular proteins, giving rise to a diverse array of peptide MHC-I complexes (pMHCs), known collectively as the immunopeptidome 6 . Expression of non-self proteins, such as those present in cancer or virally infected cells, results in the presentation of foreign peptides or neoantigens that are recognized by CD8 + T cells to drive antitumour immune responses 9 .
Numerous groups have contributed to the understanding of MHC-I trafficking, peptide loading onto MHC-I and the biochemical features of the immunopeptidome 1,8,10-13 . Notably, mass spectrometry analyses have characterized endogenously presented peptides in mouse and human cells and tissues, which has resulted in improved prediction algorithms for peptide-MHC binding and designs of personalized, neoantigen vaccines 10,11,14-16 . Although these advances have improved the design of antigen-specific therapies, our understanding of the dynamic, context-dependent immunopeptidome is still lacking.
Proteomics studies interrogating the immunopeptidome have been carried out almost universally using antibody or affinity immunoprecipitation of MHC-I peptides from cells grown in culture, which lack the microenvironmental and/or tissue-specific stimuli that are certain to affect the repertoire of peptides presented by cancer cells 10,14,15 . Considerably less attention has been paid to understanding the immunopeptidome in vivo, for which existing studies generally profile bulk tissue or tumour lysates and are obscured by heterogeneous cell mixtures.
A more precise comparison of tumour and normal immunopeptidomes is likely to uncover new cancer-specific epitopes. Unfortunately, the field currently lacks specialized tools to precisely isolate MHC-I peptides from cells of interest in vivo. Genetically engineered mouse models (GEMMs), such as the Kras LSL-G12D/+ Trp53 fl/fl (KP) model, recapitulate the histopathological features of human cancer and are tractable systems for studying tumour progression in the native tissue microenvironment 7 . Therefore, GEMMs represent an underappreciated tool to interrogate the tumour immunopeptidome at distinct stages of tumour progression and to uncover features of tumour antigen presentation in vivo that have thus far remained elusive 17 .
Here we report a new GEMM that enables the specific purification of pMHCs from cells of interest in vivo. Using this tool, we show that the cancer immunopeptidome reflects a loss of cell identity through tumour evolution and provide compelling evidence that patterns of cancer-specific antigen presentation are not driven by changes in RNA abundance or translational efficiency. Finally, we identify immunogenic epitopes that are presented on cancer cells in vivo and provide evidence that the universe of targetable antigens in cancer is potentially broader than currently appreciated.

Inducible affinity tagged MHC-I GEMM
We engineered a Cre-inducible exon that encodes the highly specific affinity tag StrepTagII into intron 1 of the H2-K1 locus (which encodes the H2-K b alloantigen of MHC-I) (K b Strep; Fig. 1a). This design results in the Cre-dependent incorporation of the StrepTagII onto the mature H2-K b protein. We targeted the K b Strep allele to embryonic stem cells derived from mice harbouring the KP genotype. Thus, Cre recombinase activates oncogenic Kras G12D with simultaneous biallelic deletion of Trp53 and activation of the StrepTagII, which enables the specific purification of MHC-I complexes from autochthonous tumours in vivo ( Fig. 1b and Extended Data Fig. 1a-c).
To validate this system, we derived pancreatic organoids from KP mice and KP/K b Strep mice and confirmed that the K b Strep allele was Cre-inducible, presented on the cell surface and enabled the purification of intact MHC-I complexes (Extended Data Fig. 1d-j). We next validated our system in vivo and confirmed that the StrepTagII was specifically detected on cancer cell nests in the tumour microenvironment (Extended Data Fig. 2a-c).
For immunopeptidome profiling, we isolated peptides from cells grown in a monolayer (two-dimensional) and from orthotopically transplanted KP/K b Strep organoids in vivo (Ortho). We also isolated peptides from autochthonous tumours initiated through retrograde pancreatic ductal instillation of adenovirus expressing Cre recombinase into KP/K b Strep mice (Auto) and into KP/K b WT mice (Auto-WT) 18 . We extracted peptides with the expected characteristics 12 of H2-K b binders in length, predicted affinity and amino acid content from all samples, except for the negative control Auto-WT samples (Extended Data Fig. 2d-j).
We next compared peptides isolated from orthotopic and autochthonous pancreatic ductal adenocarcinoma (PDAC) using our K b Strep system with those identified in normal pancreas tissue from a previous study 19 . Antibody-based immunoprecipitation of H2-K b revealed numerous peptides specific to PDAC tissue (Extended Data Fig. 2k). Moreover, analysis of source protein gene expression in single-cell RNA sequencing (scRNA-seq) data indicated that normal pancreas MHC-I peptides were nonspecifically sampled across all cell types in the pancreas. By contrast, peptides identified in orthotopic and autochthonous samples were enriched for ductal cell features that reflected the expected cellular phenotype of malignant cells in PDAC (Tabula Muris; Extended Data Fig. 2l-m). Collectively, these results validate that the KP/K b Strep system enables high-resolution interrogation of cell-type-specific immunopeptidomes in vivo.

Profiling the LUAD immunopeptidome in vivo
We next applied the KP/K b Strep model to autochthonous LUAD (Fig. 1c-h and Extended Data Fig. 3). Using antibody-based immunoprecipitation, we isolated H2-K b peptides from healthy lung (Normal-Ab) and 16-week tumour-bearing lung (Tumour-Ab) and compared the peptides to those captured by cancer-cell-specific StrepTactin affinity purification from 16-week KP/K b Strep tumours (Tumour-Strep) or negative control KP/K b WT tumours (Tumour-WT) (Fig. 1d). Peptides isolated from all samples had length distributions, predicted affinities and amino acid motifs that reflect K b binding, except for the negative control Tumour-WT samples ( Fig. 1e-g and Extended Data Fig. 3j-m).
In vitro biochemical experiments also demonstrated that StrepTagII incorporation did not affect MHC-I stability, trafficking or recognition by T cells (Extended Data Fig. 4).
A comparison of the identities of peptides derived from the Normal-Ab, Tumour-Ab and Tumour-Strep samples uncovered 438 peptides that were specific to tumours (Fig. 1h, red outline). Although 246 of these peptides were also identified using the traditional antibody approach in tumour-bearing lungs, 192 were specific to Tumour-Strep samples, which suggests that cell-type-specific isolation of MHC-I peptides can provide deeper coverage of the immunopeptidome for cells of interest. A total of 346 peptides were specific to the Tumour-Ab samples, but given the lack of specificity with this method, we could not conclude whether these peptides were derived from neoplastic cancer cells.
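The comparison above amounts to set operations on peptide sequences. The following is a minimal sketch, assuming each sample type has already been collapsed to a set of unique peptide sequences. The function name, the dictionary keys and the toy sequences are hypothetical and are not taken from the study's data or code.

```python
def partition_tumour_peptides(normal_ab, tumour_ab, tumour_strep):
    """Split peptide sets into the categories used in the comparison above.

    All three arguments are sets of peptide sequences (strings).
    """
    # Peptides captured from cancer cells that were never seen in healthy lung
    strep_tumour_specific = tumour_strep - normal_ab
    return {
        # Tumour-specific peptides also recovered by bulk antibody pull-down
        "strep_and_ab": strep_tumour_specific & tumour_ab,
        # Tumour-specific peptides seen only with the cell-type-specific tag
        "strep_only": strep_tumour_specific - tumour_ab,
        # Peptides seen only in bulk tumour-bearing lung (cell of origin unknown)
        "ab_only": tumour_ab - normal_ab - tumour_strep,
    }


# Toy sequences for illustration only (not real identifications)
normal_ab = {"AAAAAAAL", "CCCCCCCL"}
tumour_ab = {"AAAAAAAL", "DDDDDDDL", "EEEEEEEL"}
tumour_strep = {"CCCCCCCL", "DDDDDDDL", "FFFFFFFL"}

groups = partition_tumour_peptides(normal_ab, tumour_ab, tumour_strep)
for name, peptides in groups.items():
    print(name, sorted(peptides))
```

In the notation of this sketch, the 438 tumour-specific peptides correspond to `strep_and_ab` (246) plus `strep_only` (192), and the 346 Tumour-Ab-specific peptides correspond to `ab_only`.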

The immunopeptidome reflects cell identity
We next evaluated whether in vivo immunopeptidomes capture specific cellular identities or cell states within native tissue microenvironments. We applied gene signatures derived from peptides identified from healthy lung (Normal), antibody purification from tumour-bearing lung (Ab) or affinity purification of MHC-I specifically from cancer cells (Strep) to scRNA-seq data from healthy mouse lung 20 (Tabula Muris; Fig. 2a and Extended Data Fig. 5a,b). Direct comparison of the Strep and Normal signatures across all cell types revealed a highly significant enrichment for an alveolar type 2 (AT2) phenotype in the Strep signature. This result was consistent with tumour initiation driven by Cre recombinase expressed from an SPC promoter specific for AT2 cells (Fig. 1d and Fig. 2b, left). AT2 enrichment was not as pronounced when comparing the Ab and Normal signatures (Fig. 2b, middle). Notably, comparison of the Strep and Ab signatures also resulted in strong enrichment for an AT2 phenotype in Strep samples, even though both datasets were derived from identical tumour-bearing lung tissues (Fig. 2b, right). This result suggests that peptides isolated using nonspecific antibodies from MHC-I complexes obtained from bulk tumour lysates are contaminated by peptides from tumour-infiltrating immune cells and other stromal cells in the microenvironment.
Recent data have demonstrated that KP LUAD tumours progressively lose AT2 identity, which prompted us to directly compare LUAD and AT2 immunopeptidomes 21-23 . To accomplish this, we crossed mice harbouring the K b Strep allele to Sftpc creERT2 mice 24 , which enabled tamoxifen-inducible incorporation of the StrepTagII specifically on AT2 cells in healthy lung tissue (Fig. 2c and Extended Data Fig. 5c-e). MHC-I peptides isolated from AT2 cells in vivo strongly reflected an AT2 cell identity (Extended Data Fig. 5f,g). We next evaluated the LUAD immunopeptidome with respect to tumour progression at 8 weeks (early), 12 weeks (mid), and 16 weeks (late) and compared these data with bulk lung and AT2 peptides (Fig. 2c and Extended Data Fig. 5c-e). Comparison of MHC-I peptides isolated from early-stage, mid-stage and late-stage tumours to those found in normal AT2 cells or lung tissue revealed that the tumour immunopeptidome progressively diverges from normal (Fig. 2d,e). We next derived signatures from genes encoding peptides identified in early-stage, mid-stage and late-stage tumours and applied them to AT2, club and bronchioalveolar stem cell (club/BASC), and basal cell subsets from the healthy lung scRNA-seq data. The immunopeptidome signatures exhibited progressive decline in signal within the AT2 compartment throughout tumour progression (Fig. 2f). By contrast, club/BASC cells and basal cells exhibited slightly increased association with mid-stage and late-stage signatures, which indicates that as tumour cells lose AT2 identity and adopt alternative cellular phenotypes, a parallel alteration is observed in the tumour immunopeptidome 25 (Fig. 2f).
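To illustrate what applying a peptide-derived gene signature to scRNA-seq data can look like, the sketch below scores each cell by the mean z-scored expression of the signature genes and then averages the scores within each annotated cell type. This is one simple scoring scheme chosen for illustration; the scoring method actually used in the study is not described in this section. The function names and the toy expression matrix are hypothetical, and Ptprc is included only as a generic immune-cell marker for the toy data.

```python
from statistics import mean, pstdev


def signature_scores(expr, genes, signature):
    """Mean z-scored expression of the signature genes for each cell.

    expr: list of rows (one per cell) of normalized expression values.
    genes: gene names matching the columns of expr.
    signature: gene names encoding the peptides detected in a sample type.
    """
    wanted = set(signature)
    cols = [i for i, gene in enumerate(genes) if gene in wanted]
    if not cols:
        raise ValueError("no signature genes found in the expression matrix")
    scores = [0.0] * len(expr)
    for col in cols:
        column = [row[col] for row in expr]
        mu = mean(column)
        sd = pstdev(column) or 1.0  # guard against constant genes
        for cell, value in enumerate(column):
            scores[cell] += (value - mu) / sd / len(cols)
    return scores


def mean_score_by_cell_type(scores, cell_types):
    """Average the per-cell scores within each annotated cell type."""
    grouped = {}
    for score, cell_type in zip(scores, cell_types):
        grouped.setdefault(cell_type, []).append(score)
    return {cell_type: mean(vals) for cell_type, vals in grouped.items()}


# Toy data: two AT2-like cells and two immune-like cells (illustrative values)
genes = ["Sftpc", "Sftpb", "Ptprc"]
expr = [
    [9.0, 8.0, 0.0],
    [8.0, 9.0, 0.0],
    [0.0, 1.0, 7.0],
    [1.0, 0.0, 8.0],
]
cell_types = ["AT2", "AT2", "immune", "immune"]

scores = signature_scores(expr, genes, ["Sftpc", "Sftpb"])
by_type = mean_score_by_cell_type(scores, cell_types)
print(by_type)
```

With a signature built from AT2 identity genes, the AT2 cells in this toy matrix score above average and the immune cells below, which is the kind of cell-type enrichment the comparison in the text relies on.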

Tumour evolution alters the immunopeptidome
To further understand the LUAD immunopeptidome in the context of tumour evolution, we applied AT2, early, mid and late peptide-derived gene signatures to published scRNA-seq data from normal AT2 cells (T 0 w) and KP tumour cells throughout progression (KP 2 w, KP 12 w, KP 18 w and KP 30 w) 22 (Fig. 2g). We observed dynamic changes in the correlation of peptide signatures with previously described expression modules in the KP model 22 (Fig. 2g). Notably, the late-stage signature exhibited decreased correlation to the AT2 module and increased correlation to the gastric epithelial module and the highly mixed transcriptional module, which were previously found to be associated with phenotypic plasticity in late-stage KP tumour evolution ( Fig. 2h and Extended Data Fig. 5j-l). Moreover, gene set enrichment analysis revealed that inflammatory cytokine signalling was highly correlated with the late peptide signature, whereas Myc signalling, metabolic processes and epithelial-to-mesenchymal transition (EMT) were inversely correlated with the late signature (Fig. 2i). Additionally, the metastatic cluster of KP tumour cells (cluster 10, low Nkx2-1, high Hmga2) exhibited low expression of H2-K1 and was the cluster with the lowest expression of the late signature (Fig. 2j). Heterogeneity of antigen presentation across cancer cell states was further supported by multiplexed immunofluorescence of late-stage tumours, which revealed significant intratumour and intertumour heterogeneity in MHC-I presentation (Fig. 2k). Consistent with this notion, dampening tumour inflammation with CD8a + immune cell depletion decreased the number of unique MHC-I peptides we recovered. By contrast, inducing CD8 + T cell infiltration with an agonistic CD40 antibody and a FLT-3 ligand (CD40/FLT3L) increased previously unknown peptide identification, highlighting that antigen presentation is responsive to inflammatory cues in the tumour microenvironment 26 (Extended Data Fig. 5m,n). 
Collectively, these results demonstrate that antigen presentation by cancer cells is a highly integrative and complex process, subject to the evolution of cell identity and to microenvironmental cues in vivo. Importantly, our system captures each of these important features.

Features of LUAD-unique peptides
Although we noted that gene signatures derived from the totality of the immunopeptidome reflected broad cellular identities as measured by RNA-seq, we next sought to understand the features of peptides that were specifically presented in KP tumours compared to healthy bulk lung or AT2 cells (Fig. 3a, LUAD-unique). To understand the relationship between transcription and presentation of LUAD-unique peptides, we used published bulk RNA-seq data from sorted normal AT2 cells or from cells sorted from early-stage, mid-stage or late-stage KP tumours 27 (Fig. 3b). Genes encoding LUAD-unique peptides exhibited varied patterns of RNA expression across timepoints, which did not correlate with either mean mRNA expression levels or predicted peptide affinity. This result indicates that changes in mRNA expression cannot exclusively explain LUAD-unique presentation of individual peptides. To corroborate this finding, we identified genes that were upregulated at any stage of tumour progression and predicted the affinity of all possible 8-mer and 9-mer peptides from this list (Fig. 3c). This analysis identified 17,130 peptides predicted to be enriched in tumour presentation based on mRNA expression and predicted affinity alone. However, this pipeline only identified 39 out of 312 total 8-mers and 9-mers in our LUAD-unique peptide list, which reinforces the importance of evaluating cell-type-specific and tissue-specific MHC-I presentation with empirical, mass spectrometry analysis (Fig. 3c). We next assessed whether protein synthesis in tumour or normal cells could better predict the presentation of LUAD-unique peptides. To this end, we generated organoids from AT2 cells and LUAD cells and performed ribosome profiling (Ribo-seq) and RNA-seq ex vivo 28-30 (Fig. 3d and Extended Data Fig. 6a-d).
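The in silico pipeline described above (enumerate every 8-mer and 9-mer from the proteins encoded by upregulated genes, keep the predicted binders, then check how many empirically observed peptides are recovered) can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: `predict_affinity` is a hypothetical callable standing in for whichever MHC-binding predictor is used, the 500 nM cutoff is an assumed placeholder, and the toy protein sequence and toy predictor are invented for the example.

```python
def enumerate_kmers(protein, lengths=(8, 9)):
    """Return every contiguous peptide of the requested lengths."""
    return {
        protein[start:start + k]
        for k in lengths
        for start in range(len(protein) - k + 1)
    }


def predicted_binders(proteins, predict_affinity, threshold_nm=500.0):
    """Keep k-mers whose predicted affinity is at or below the threshold.

    predict_affinity: hypothetical callable mapping a peptide to an
    affinity in nM (lower means stronger predicted binding).
    """
    binders = set()
    for protein in proteins:
        for peptide in enumerate_kmers(protein):
            if predict_affinity(peptide) <= threshold_nm:
                binders.add(peptide)
    return binders


def recovery(predicted, observed):
    """Count observed peptides that the prediction pipeline also nominated."""
    return len(predicted & observed), len(observed)


def toy_affinity(peptide):
    # Toy stand-in for a real predictor: favours a C-terminal leucine
    return 50.0 if peptide.endswith("L") else 5000.0


proteins = ["MSVAHFINLK"]  # hypothetical sequence, illustration only
observed = {"SVAHFINL", "QQQQQQQL"}  # toy "mass spectrometry" peptides

predicted = predicted_binders(proteins, toy_affinity)
hits, total = recovery(predicted, observed)
print(f"{hits} of {total} observed peptides were predicted")
```

The quantity reported in the text (39 out of 312) is the analogue of `hits` and `total` here, with the 17,130 predicted peptides playing the role of `predicted`.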
Genes associated with AT2 identity (Sftpc, Sftpb, Ager, Nkx2-1 and Cxcl15) exhibited higher translation rates in the AT2 samples, whereas genes associated with KP LUAD progression (Onecut2, Duox2, Hmga2, Porcn, Cldn6 and Hnf4a) were translated at higher rates in LUAD samples (Fig. 3e). Integrated analysis of RNA-seq and Ribo-seq data revealed numerous genes that were coordinately upregulated or downregulated at the mRNA and ribosome protected fragment (RPF) levels ( Fig. 3f, blue) or exhibited differential translation efficiency (TE) (Fig. 3f, green, and Extended Data Fig. 6e-g). However, genes encoding LUAD-unique peptides generally exhibited no difference in mRNA or RPF abundance (Fig. 3f, red), and prediction of peptides based on differential TE resulted in only 4 out of 312 empirically identified, LUAD-unique peptides (Fig. 3g). The inability of mRNA abundance or TE to fully describe the LUAD-unique immunopeptidome prompted us to consider post-translational features of tumour-specific presentation 31 . Although source protein length and thermal stability were similar in normal and LUAD-unique immunopeptidomes, we observed differences in source protein localization and a trend towards decreased protein half-life in source proteins giving rise to LUAD-unique peptides (Extended Data Fig. 7). We next sought to perturb post-translational processes through inhibition of the molecular chaperone heat shock protein 90 kDa (HSP90), which can reshape the tumour immunopeptidome and stimulate antitumour immune responses 32 . Treatment with an HSP90 inhibitor (HSP90i) increased the number of specific MHC-I peptides we identified, including those derived from HSP90 client proteins with lower thermal stability (Extended Data Fig. 8a-k). 
Taken together, these data demonstrate that the immunopeptidome can be shaped through post-translational mechanisms and, more broadly, underscore the potential of the KP/K b Strep model to discover treatment-induced changes in the tumour immunopeptidome in vivo (Extended Data Fig. 8l).
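For readers less familiar with Ribo-seq, translation efficiency is conventionally taken as the ratio of ribosome-protected fragment (RPF) abundance to mRNA abundance, and differential TE between two conditions as the difference of the log ratios. The sketch below shows that arithmetic only. The pseudocount, the field names and the toy values are assumptions made for illustration, and dedicated statistical tools would be needed to call differential TE with replicates.

```python
import math


def log2_te(rpf_tpm, rna_tpm, pseudocount=1.0):
    """log2 translation efficiency: RPF abundance relative to mRNA abundance.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no detected reads.
    """
    return math.log2((rpf_tpm + pseudocount) / (rna_tpm + pseudocount))


def delta_log2_te(gene):
    """Difference in log2 TE between LUAD and AT2 samples for one gene."""
    te_luad = log2_te(gene["rpf_luad"], gene["rna_luad"])
    te_at2 = log2_te(gene["rpf_at2"], gene["rna_at2"])
    return te_luad - te_at2


# Toy values (TPM), illustration only; gene names are placeholders
genes = {
    # Same mRNA level in both conditions, more ribosome occupancy in LUAD
    "geneA": {"rna_at2": 15.0, "rpf_at2": 15.0, "rna_luad": 15.0, "rpf_luad": 63.0},
    # No change at either level
    "geneB": {"rna_at2": 15.0, "rpf_at2": 15.0, "rna_luad": 15.0, "rpf_luad": 15.0},
}

changes = {name: delta_log2_te(values) for name, values in genes.items()}
print(changes)
```

In these terms, genes encoding LUAD-unique peptides mostly resemble `geneB`: little change in mRNA, RPF or TE, which is why the text turns to post-translational explanations.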

Discovery of tumour antigens in KP LUAD
Given our ability to discern the LUAD immunopeptidome from that of normal tissues in vivo, we evaluated new tumour epitopes in the KP/K b Strep model. We found 135 peptides that were recurrently presented on LUAD in vivo but not observed in healthy tissues from our study or previously published data 19 , which we termed putative non-mutated tumour-specific antigens (TSAs) (Fig. 4a and Extended Data Fig. 9a). In addition, we nominated an additional 147 peptides that were only found on one tissue (not lung) and determined these peptides to be putative tumour-associated antigens (TAAs) (Fig. 4a and Extended Data Fig. 9b). Seeking to evaluate the immunogenicity of both classes of antigens, we vaccinated naive mice with a pooled peptide vaccine (three TSAs, five TAAs) using bone-marrow-derived dendritic cells 33 (Fig. 4b). Interferon-γ (IFNγ) enzyme-linked immunosorbent spot (ELISPOT) assays of splenocytes from control and vaccinated mice revealed three immunogenic peptides (Fig. 4c and Extended Data Fig. 10). Notably, two out of three peptides were not presented by KP/K b Strep cells in vitro, and comparative RNA or RPF expression in LUAD compared to AT2 cells would not have led to their prioritization (Fig. 4d and Extended Data Fig. 9c-f). Collectively, these data reinforce the importance of examining the immunopeptidome in vivo and challenge the notion that differential expression is required for tumour-specific antigen presentation.
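The TSA and TAA definitions above can be read as a simple decision rule on where each peptide has been observed. The sketch below encodes that rule. The threshold used for "recurrently presented" (at least two tumour samples) is an assumption, since the exact criterion is not given in this section, and the function name, labels and example inputs are hypothetical.

```python
def classify_peptide(n_tumour_samples, healthy_tissues, min_tumour_samples=2):
    """Classify a LUAD peptide by its presentation in healthy tissues.

    n_tumour_samples: number of tumour samples in which the peptide was seen.
    healthy_tissues: names of healthy tissues in which it was also observed.
    min_tumour_samples: assumed cutoff for "recurrent" presentation.
    """
    if n_tumour_samples < min_tumour_samples:
        return "not recurrent"
    tissues = set(healthy_tissues)
    if not tissues:
        # Never observed on healthy tissue
        return "putative TSA"
    if len(tissues) == 1 and "lung" not in tissues:
        # Restricted to a single healthy tissue other than lung
        return "putative TAA"
    return "excluded"


# Illustrative inputs only
examples = {
    "tumour only": classify_peptide(4, []),
    "one non-lung tissue": classify_peptide(3, ["testis"]),
    "found in lung": classify_peptide(5, ["lung"]),
    "broadly presented": classify_peptide(5, ["liver", "spleen"]),
    "seen once": classify_peptide(1, []),
}
print(examples)
```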
To further explore the immunogenicity of these epitopes, we vaccinated 5-week KP tumour-bearing animals and measured CD8 + T cell reactivity with custom peptide-MHC-I tetramers 34 (Fig. 4e). We detected CD8 + T cells that recognized a peptide derived from the TAA PRDM15 (SVAHFINL) in tumour-bearing lung tissue in vaccinated mice, but not control mice (Fig. 4f and Extended Data Fig. 9h). Seeking to understand the lack of immunogenicity of the remaining five out of eight vaccine peptides, we cross-referenced our AT2 immunopeptidome data and found that three out of five were presented in AT2 but not bulk lung tissue (Fig. 4g). RNA expression patterns across healthy mouse tissue (Mouse Encode Project) for genes encoding peptides presented by AT2 cells would incorrectly predict potential cancer testis antigens (Ccdc158) or oncofetal antigens (Gpn3 and Znf462). These analyses demonstrate that in vivo, cell-type-specific immunopeptidomics provides an opportunity to empirically evaluate cell-specific and tissue-specific presentation patterns and more accurately classify potential antigens compared to in vitro or in silico approaches (Fig. 4h and Extended Data Fig. 9j,k).

Fig. 3 | … b, Heatmap showing the relative mRNA expression (red, white, blue), mean mRNA expression (gold) and predicted affinity (green) of genes encoding LUAD-unique peptides throughout tumour progression compared with normal AT2 cells (data from ref. 27). Met, metastatic; NonMet, non-metastatic. c, Workflow depicting the in silico approach used to predict tumour-specific peptides based on RNA expression and predicted affinity. Histograms show the number of LUAD-unique peptides according to whether they were predicted (grey) or not predicted (red) by RNA and affinity analysis. d, Experiment outline for Ribo-seq using RiboLace. e, Heatmap showing the relative translation intensity (TPM) for AT2 identity genes and genes associated with KP tumour progression. f, Comparison of RNA-seq abundance (x axis) and Ribo-seq abundance (y axis) in tumour organoids versus AT2 organoids. Genes coordinately upregulated or downregulated in both RNA-seq and Ribo-seq are shown in blue, genes exhibiting differential TE are shown in green, and genes encoding for LUAD-unique peptides are shown in red. g, Workflow depicting the approach used to predict LUAD-unique peptides based on differential TE in LUAD versus AT2 and predicted affinity. Histograms show the number of LUAD-unique peptides according to whether they were predicted (grey) or not predicted (red) by differential TE.

Fig. 4 | Discovery of new tumour antigens in LUAD. a, Workflow for identifying putative non-mutated TSAs or TAAs. b, Schematic of the pooled peptide vaccine strategy in naive mice. c, Quantification of IFNγ ELISPOT data of splenocytes from naive mice or mice vaccinated (Vax) with pooled peptides. Each peptide in the pool was used individually to stimulate splenocytes before ELISPOT. Data are mean ± s.d., n = 3 mice per group. d, Comparison of in vivo bulk RNA-seq (data from ref. 27), scRNA-seq (data from ref. 22) and ex vivo RNA-seq and Ribo-seq for genes encoding immunogenic peptides. Identification of peptides by in vitro and in vivo immunopeptidomics is also indicated. For in vivo peptide identification, the fraction of late-stage tumour samples for which the peptide was identified is indicated. Boxes depict first, second and third quartiles and whiskers show range excluding outliers. Histograms depict mean ± s.e.m.; ex vivo organoid data include n = 3 for AT2 and n = 4 for LUAD. e, Pooled vaccine strategy for KP tumour-bearing animals. f, Flow cytometry plots depicting pMHC-I tetramer staining for a representative TAA (SVAHFINL, PRDM15) in the lung tissue of naive and vaccinated, tumour-bearing mice. P values calculated using one-sided Mann-Whitney test. Data are mean ± s.d., n = 5 mice in each group. g, Transcript abundance across healthy mouse tissues for peptides included in the pooled vaccine. Detection of the peptide on AT2 cells or bulk lung tissue is also indicated. E14, embryonic day 14. h, Model depicting the incongruence between the immunopeptidome derived from in silico prediction methods, in vitro mass spectrometry and tumour-specific in vivo immunopeptidomics.

Discussion
We developed a mouse model that enabled the specific isolation of MHC-I complexes from discrete cell populations of interest in vivo. Application of this system to AT2 cells and LUAD revealed that the immunopeptidome is highly dynamic through tumour progression and evolves with cellular states adopted by tumour cells. This raises the possibility that distinct stages of tumour development are associated with an ever-changing landscape of targets for antigen-specific T cells through tumour progression.
Notably, our results suggest that neither mRNA expression nor TE fully explain LUAD-specific antigen presentation in vivo. Although many neoantigen prediction pipelines include mRNA abundance as a method to triage potential neoantigens, our data indicate that many peptides are recurrently presented in vivo despite apparently low transcript levels, which highlights the importance of empirical mass spectrometry evidence to evaluate MHC-I presentation. Further study is needed to elucidate post-translational mechanisms that shape tumour-specific antigen presentation in vivo, which may ultimately improve neoantigen prediction.
TSAs and TAAs are known to elicit antigen-specific responses against tumours 35-38 . Consistent with recent work from our laboratory 26,39-41 , we found that vaccination was required to induce responses against antigens identified in vivo, which is probably due to suboptimal priming of CD8 + T cells in our model. Systematic interrogation of epitopes identified with this mouse model will probably uncover additional immunogenic epitopes. These strategies will ultimately expand our understanding of the 'altered-self' presented by cancer cells, which can be used to design and evaluate new cancer immunotherapies in preclinical models 42 .
We also envision that this model system can identify peptides derived from cryptic translation events 29,43-45 , post-translational modifications 46 , transposable elements 47 and the microbiome 48 , as many of these events will be influenced by physiological cues from the in vivo microenvironment.
Given the prevalence of cell-specific and tissue-specific Cre-driver mouse strains, the K b Strep allele presents an opportunity to map a high-resolution in vivo immunopeptidome atlas 19 , as we demonstrated with AT2 cells in this study. Cross-referencing these data to other omics-based tissue atlases can elucidate the relationship between cellular phenotype and the immunopeptidome, which cannot be accomplished using bulk tissue measurements.
In summary, we uncovered important aspects of the tumour immunopeptidome in vivo that should promote further investigation into context-specific antigen presentation and will improve our understanding of tumour-immune interactions. In turn, we described a versatile tool for the broader research community to interrogate mechanisms of antigen presentation at high resolution in health and disease.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-022-04839-2.

Mice
All animal studies described in this study were approved by the MIT Institutional Animal Care and Use Committee. All animals were maintained on a pure C57BL/6J genetic background, except for embryonic stem cell (ESC) chimeras, which were a mix of C57BL/6J and albino C57BL/6J. Generation of Kras LSL-G12D/+ and Trp53 fl/fl (KP) mice has previously been described 49,50 . Sftpc creERT2/creERT2 mice on a C57BL/6J background were purchased from the Jackson Laboratory (strain 028054) 24 and bred to H2-K1 Strep/Strep mice to produce the Sftpc creERT2/creERT2 H2-K1 Strep/Strep strain.

Mouse ESC culture and CRISPR-assisted gene targeting
Targeted insertion of the Cre-invertible StrepTag allele (K b Strep) into the endogenous H2-K1 locus was performed in KP*1 ESCs, which were generated by crossing a hormone-primed C57BL/6J Trp53 fl/fl female with a C57BL/6J Kras LSL-G12D/+ Trp53 fl/fl male. At 3.5 days post coitum, blastocysts were flushed from the uterus, isolated and cultured on a mouse embryonic fibroblast (MEF) feeder layer in ESCM+LIF+2i (knockout DMEM (Gibco), 15% FBS (Hyclone), 1% NEAA (Sigma), 2 mM glutamine (Gibco), 0.1 mM β-mercaptoethanol (Sigma-Aldrich), 50 IU penicillin, 50 IU streptomycin, 1,000 U ml -1 LIF (Amsbio), 3 µM CHIR99021 (AbMole) and 1 µM PD0325901 (AbMole)). After 5-7 days in culture, the outgrown inner cell mass was isolated, trypsinized and re-plated on a fresh MEF layer. ESC lines were genotyped for Kras LSL-G12D/+ Trp53 fl/fl and Zfy (Y-chromosome specific). Primer sequences are available upon request. ESC lines were tested for pluripotency by injection into host blastocysts from albino mice to generate chimeric mice. DNA mixes containing a 3:1 ratio of H2-K1-Strep targeting vector to U6-sgH2-K1-eCas9v1.1-T2A-BlastR were ethanol precipitated before Lipofectamine 2000 (Thermo Fisher) transfection of approximately 3 × 10 5 KP*1 mouse ESCs according to the manufacturer's instructions. Transfected mouse ESCs were plated on MEF feeder cells and selected with 4 µg ml -1 blasticidin for 48 h. Cells were then trypsinized and plated at low density onto a fresh plate with MEF feeder cells. After 5-6 days, large colonies were picked using a dissecting microscope and replated into a 96-well plate containing MEF feeder cells. After 4-5 days, each well of the 96-well plate was trypsinized, and 1:2 of the material was frozen in ESCM plus 10% DMSO and the remainder of the cells were transferred to a 96-well PCR plate for PCR-based integration screening.
Clones containing homozygous targeting events were then thawed and expanded into 24-well dishes, which were then subjected to genomic DNA extraction and overnight restriction digest with SapI. Digestions were then electrophoresed on a 0.7% agarose gel before transfer onto an Amersham Hybond XL nylon membrane (GE Healthcare). Blots were then probed with 32 P-labelled DNA probes comprising an internal sequence homologous to the genomic insertion containing the StrepTag exon.
Correctly targeted clones were injected into albino C57BL/6 blastocysts. Coat colour was used as a surrogate marker for chimerism. Low-degree chimeras were chosen for pancreatic organoid generation and high-degree chimeras were bred to KP* mice for germline transmission.

Pancreatic organoid isolation
Low-degree KP;K b Strep chimeras were chosen for organoid isolation using previously described methods 41 . In brief, the pancreas was manually dissected, transferred to a Petri dish and thoroughly minced with a razor blade. Minced tissue was then transferred to a 1.5-ml microcentrifuge tube with 1 ml of PBS supplemented with 125 U ml -1 collagenase IV (Worthington) and incubated with rotation at 37 °C for 20-30 min. Cell suspensions were diluted with 9 ml of PBS and centrifuged at 2,000 r.p.m. for 2 min. Cell pellets were then washed with 10 ml of PBS and centrifuged at 2,000 r.p.m. for 2 min. The resulting cell pellet was then resuspended in 100% Matrigel (Corning) and plated as 50 µl domes and solidified at 37 °C. Organoids were then cultured in organoid complete medium. Purification of cells derived from the targeted ESCs was accomplished with puromycin selection at 6 µg ml -1 (a puromycin resistance gene is encoded within the LSL cassette upstream of the Kras G12D allele).
Pancreatic organoids were serially passaged with TrypLE Express (Life Technologies). After four passages, KP or KP;K b Strep organoids were then subjected to ex vivo transformation by dissociation, mixing with adenoviral Cre (Ad-CMV-Cre; multiplicity of infection of 500) and re-embedding in Matrigel. After 72 h, transformants were selected with Nutlin-3a (10 µM, Sigma Aldrich) to select for loss of p53.

Orthotopic transplantation of pancreatic organoids
Orthotopic transplantation of organoids was performed as previously described 41 . In brief, animals were anaesthetized using isoflurane, the left subcostal region was depilated (using clippers or Nair) and the surgical area was disinfected with alternating betadine and isopropyl alcohol. A small (approximately 2 cm) skin incision was made in the left subcostal area and the spleen was visualized through the peritoneum. A small incision (about 2 cm) was made through the peritoneum overlying the spleen and the spleen and pancreas were exteriorized using ring forceps. A 30-gauge needle was inserted into the pancreatic parenchyma parallel to the main pancreatic artery and 100 µl (containing 1.25 × 10 5 organoid cells in 50% PBS plus 50% Matrigel) was injected into the pancreatic parenchyma. Successful injection was visualized by the formation of a fluid-filled region within the pancreatic parenchyma without leakage. The pancreas and spleen were gently internalized, and the peritoneal and skin layers were sutured independently using 5-0 Vicryl sutures. All mice received pre-operative analgesia with sustained-release buprenorphine and were followed post-operatively for any signs of discomfort or distress. Organoid-Matrigel mixes were kept on ice throughout the entirety of the procedure to prevent solidification before injection. For orthotopic transplantation, syngeneic C57BL/6J mice (aged 4-12 weeks) were transplanted. Male pancreatic organoids were only transplanted back into male recipients.

Retrograde pancreatic duct delivery
Retrograde pancreatic duct instillation of lentivirus has been previously described 18,41 . In brief, the ventral abdomen was depilated (using clippers or Nair) 1-2 days before surgery. Animals were anaesthetized with isoflurane and the surgical area was disinfected with alternating betadine and isopropyl alcohol. A small skin incision was made in the anterior abdomen (2-3 cm midline incision extending caudally from the xiphoid process). A subsequent incision was made through the linea alba and incision edges were secured in place with a Colibri retractor. The remainder of the procedure was conducted under a Nikon stereomicroscope. A moistened (with sterile 0.9% saline) sterile cotton swab was used to gently move the left lobe of the liver cranially towards the diaphragm. A second moistened sterile cotton swab was used to gently reposition the colon and small intestine into the right lower abdominal quadrant until the duodenum was visualized. The duodenum was gently repositioned (still in the abdominal cavity) using moistened cotton swabs until the pancreas, common bile duct and sphincter of Oddi were clearly visualized. The common bile duct and cystic duct were gently separated from the portal vein and the hepatic artery using blunt dissection with Moria forceps. A microclip was placed over the common bile duct (cranial to pancreatic duct branching) to prevent influx of the viral particles into the liver or gallbladder, forcing the viral vector retrograde through the pancreatic duct. To infuse the viral vector, the common bile duct was cannulated with a 30-gauge needle at the level of the sphincter of Oddi, and 150 µl of virus was injected over the course of 30 s. Gentle pressure was applied at the sphincter of Oddi after needle exit to prevent leakage into the abdominal cavity. Subsequently, the microclip and Colibri retractor were removed. The peritoneum was closed using running 5-0 Vicryl sutures. 
The cutis and fascia were closed using simple interrupted 5-0 Vicryl sutures. The entire procedure was conducted on a circulating warm water heating blanket to prevent intra-operative hypothermia. All mice received pre-operative analgesia with sustained-release buprenorphine and were followed post-operatively for any signs of discomfort or distress. For retrograde pancreatic ductal instillation, male mice (aged 3-6 weeks) and female mice (aged 3-8 weeks) were transduced with 7 × 10 8 p.f.u. of Ad5-Ptf1a-Cre (University of Iowa Viral Core) in serum-free medium (Opti-MEM, Gibco).

Intratracheal administration of adenovirus
Adenovirus expressing Cre recombinase from an SPC promoter (Ad5-SPC-Cre, University of Iowa) was prepared by diluting virus stocks to 400 p.f.u. per µl into OptiMEM (Gibco) followed by the addition of CaCl 2 to a final concentration of 10 µM (ref. 7 ). Viral suspensions were then mixed and incubated at room temperature for 20 min before placing on ice. Mice were anaesthetized with isoflurane before setting on a custom wire platform to open the mouth. The trachea was then cannulated with 22-gauge catheters (Exel), and 50 µl of viral suspension was added to the catheters. Once a mouse had aspirated the viral suspension, it was transferred to a pre-warmed cage to recover from anaesthesia and monitored every day for 3 days to ensure recovery from the procedure. Dilutions of virus were used within 1 h of preparation.
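The dilution step above is a standard C1V1 = C2V2 calculation. The following is a minimal sketch, not the authors' worksheet: the target concentration (400 p.f.u. per µl) and dose volume (50 µl) come from the text, whereas the stock titre, the number of mice and the 20% overage are hypothetical values chosen only for illustration. The small volume of CaCl 2 added afterwards is ignored.

```python
def dilution_plan(stock_pfu_per_ul, target_pfu_per_ul, final_volume_ul):
    """Return (stock_ul, diluent_ul) for a single-step C1*V1 = C2*V2 dilution."""
    if target_pfu_per_ul > stock_pfu_per_ul:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_pfu_per_ul * final_volume_ul / stock_pfu_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Values stated in the text.
TARGET_PFU_PER_UL = 400
DOSE_VOLUME_UL = 50
pfu_per_mouse = TARGET_PFU_PER_UL * DOSE_VOLUME_UL  # 20,000 p.f.u. per mouse

# Hypothetical values (assumptions, not from the source).
stock_pfu_per_ul = 1e5   # assumed stock titre
n_mice = 10              # assumed cohort size
overage = 1.2            # assumed 20% extra volume for pipetting losses

final_volume_ul = n_mice * DOSE_VOLUME_UL * overage
stock_ul, optimem_ul = dilution_plan(stock_pfu_per_ul, TARGET_PFU_PER_UL, final_volume_ul)
print(f"{stock_ul:.2f} ul stock + {optimem_ul:.2f} ul OptiMEM; {pfu_per_mouse} p.f.u. per mouse")
```

With a much more concentrated stock, the single-step stock volume becomes too small to pipette accurately and a serial dilution would be needed instead.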

Flow cytometry of cultured cells and lung tissue
For staining pancreatic organoids, cells were plated into 4 × 20 µl domes per well in a 12-well plate. Cells were either left untreated or incubated with 20 ng ml -1 IFNγ for 48 h. At the time of collection, organoid medium was aspirated, and the Matrigel domes were mechanically disrupted by vigorous pipetting in PBS. Cell suspensions were then transferred to 15-ml conical tubes passivated with 0.1% BSA and centrifuged at 2,000 r.p.m. for 2 min. The resulting cell pellet was then incubated with 1 ml TrypLE Express for 10-15 min. The dissociation reaction was quenched with 10 ml PBS followed by centrifugation at 2,000 r.p.m. for 2 min. Cell pellets were then resuspended in 200 µl PBS and transferred to a 96-well U-bottom plate for staining. Cells were then incubated with Zombie Aqua fixable viability stain (1:1,000, BioLegend) for 15 min before staining with primary antibodies diluted in PBS plus 2% heat-inactivated FBS (FACS buffer) for 30 min on ice (Supplementary Table 11). Cells were then washed twice with FACS buffer before analysis on an LSR II analytical flow cytometer.
For FACS analysis of tumour-bearing lung tissue, KP or KP/K b Strep mice (n = 3 each) were euthanized 12 weeks after tumour initiation. Two minutes before euthanasia, an intravascular staining antibody (anti-CD45-APC-eFluor786, BioLegend) was injected retro-orbitally to stain circulating immune cells. Mice were then euthanized by cervical dislocation, and the lungs were removed and placed on ice. Approximately 100-200 mg of tumour-bearing lung tissue was then thoroughly minced with Noyes scissors before incubation with digestion buffer (HBSS supplemented with 5% HI FBS, 125 U ml -1 collagenase IV (Worthington) and DNase (Roche)) with rotation for 30 min at 37 °C. Cell suspensions were then macerated through 70-µm cell strainers (Corning) and centrifuged at 500g for 5 min. Cell pellets were resuspended in 1 ml ACK lysis buffer (Gibco) and incubated at room temperature for 5 min before quenching with 10 ml RPMI and 10% HI FBS and centrifugation at 500g for 5 min. Cell pellets were then resuspended in 200 µl FACS buffer and transferred to 96-well U-bottom plates for staining with Fc block (BD Biosciences) for 20 min, Zombie fixable viability stain for 20 min and primary antibody cocktail for 30 min on ice. Cells were then washed twice with FACS buffer before fixation overnight with FOXP3/transcription factor staining buffer set (eBioscience) according to the manufacturer's instructions before analysis on a Fortessa flow cytometer. All flow cytometry data were analysed using FlowJo v.10.

Affinity purification of H2-K b with StrepTactin
Whenever possible, great care was taken to keep samples ice cold at all times to maintain MHC-I complex stability. For cultured cells, about 6 × 10 7 cells were used for each replicate (4 × 15 cm dishes). In the culture dish, cells were washed twice with PBS before lysis with 2 ml MHC extraction buffer (MEB; 20 mM Tris pH 8.0, 100 mM NaCl, 1 mM EDTA, 1% Triton X-100, 60 mM octyl-glucopyranoside, 6 mM MgCl 2 and 1× HALT protease inhibitors (Pierce)) 10 . Cell lysates were then transferred to 2-ml microcentrifuge tubes and supplemented with 20 U benzonase and 10 U avidin (to block endogenously biotinylated protein) before incubating with rotation at 4 °C. Lysates were then cleared by centrifugation at 16,000g for 15 min before incubation with MagStrep Type 3 StrepTactin XT beads (IBA Biosciences).
For pancreatic and lung tissues, tumour-bearing tissue was dissected and immediately lysed or flash frozen in liquid nitrogen for later processing. For lysis, fresh or frozen tissue was quickly minced with Noyes scissors and transferred to a 7-ml glass Dounce homogenizer (Sigma), precooled on ice. MEB (4 ml) was then added, and the tissue was thoroughly homogenized with 10-20 passes of a loose-fitting pestle followed by 5-10 passes of a tight-fitting pestle. Tissue homogenates were transferred to 5-ml centrifuge tubes (Eppendorf) and supplemented with 20 U benzonase and 10 U avidin before incubating with rotation at 4 °C for 20 min and subsequent removal of debris by centrifugation at 16,000g for 15 min.
Before incubating with cleared lysate, StrepTactin beads were equilibrated by magnetizing and washing once with MEB. For each cell culture sample or tissue sample, 1 ml of bead suspension (50 µl bed volume) was used. Equilibrated beads were then added to cleared lysates and incubated with rotation at 4 °C for 1-3 h. After incubation, beads were washed twice with MEB, twice with TBS and twice with 20 mM Tris. On the last wash, suspended beads were transferred to a new Lo-Bind microcentrifuge tube before elution. Strep-tagged H2-K b was then eluted from the StrepTactin resin by adding 400 µl of 0.5× buffer BXT (IBA Biosciences) and incubating on ice for 20-30 min with occasional flicking of the tube to maintain the beads in suspension. Beads were then magnetized, and the supernatant was transferred to a new Lo-Bind microcentrifuge tube. Biotin, H2-K b heavy chain and B2M light chain were then precipitated by adding 1% trifluoroacetic acid slowly while gently vortexing the elution. The fluffy white precipitate was then pelleted by centrifugation at 20,000g for 10 min. Supernatants containing liberated MHC-I peptides were then directly aspirated into 8 µg binding capacity C18 solid-phase extraction pipette tips (Pierce). Tips were preequilibrated with 50% acetonitrile (ACN) and 0.1 % formic acid (FA) according to the manufacturer's instructions. The 400 µl of eluted material was loaded, 20 µl at a time, until the entire volume of the elution was passed over the C18 sorbent. Tips were then washed twice with 5% ACN and 0.1% FA before elution in 10 µl of 30% ACN and 0.1% FA. Desalted peptides were then dried down before reconstitution in 3% ACN and 0.1% FA in an autosampler vial for liquid chromatography and tandem mass spectrometry (LC-MS/MS) analysis.

Antibody immunoprecipitation of MHC-I
Peptide MHC isolation was performed as previously described 33 . Healthy lung tissue or tumour-bearing lung tissue was homogenized and cleared as described for 'Affinity purification of H2-K b with StrepTactin'. Per sample, 1 mg of anti-H2-K b (clone Y3, BioXCell) was bound to 20 µl (bed volume) FastFlow Protein A sepharose beads (GE Healthcare) by incubating for 1 h at 4 °C. Beads were then washed with lysis buffer and samples were incubated for 2-4 h with rotation at 4 °C. Beads were then centrifuged at 1,000 r.p.m., washed twice with MEB, twice with 1× TBS and eluted with 10% acetic acid at room temperature. Eluate was then filtered using 10 kDa MWCO spin filters (PALL Life Science), which were passivated with 0.1% BSA and acidified with 10% acetic acid before filtration. Filtered peptides were then further purified with 8 µg binding capacity C18 tips (Pierce) before LC-MS/MS analysis. For multiplexing, lyophilized pMHCs were resuspended in 33 µl of labelling buffer (50% ethanol, 150 mM TEAB) and mixed with 40 µg of pre-aliquoted TMT 6plex (Thermo Scientific) resuspended in 10 µl of anhydrous ACN. The labelling reaction was carried out on a shaker for 1 h at room temperature and quenched with 0.3% hydroxylamine. Samples were combined and dried in a SpeedVac before being cleaned up using the SP3 protocol as previously described 15 .
Standard MS parameters were as follows: spray voltage, 2.0-2.5 kV; no sheath or auxiliary gas flow; and heated capillary temperature, 275 °C. The Exploris was operated in data-dependent acquisition (DDA) mode. Full scan mass spectra (350-1,200 m/z, 60,000 resolution) were detected in the orbitrap analyser after accumulation of 3 × 10 6 ions (normalized automatic gain control (AGC) target of 300%), with automatic maximum injection time (IT). For every full scan, MS 2 spectra were collected during a 3 s cycle time. Ions were isolated (0.4 m/z isolation width) for a maximum IT of 150/250 ms or 1 × 10 5 to 7.5 × 10 4 ions (100%/75% normalized AGC target) and fragmented by higher-energy collisional dissociation (HCD) with 30% collision energy (CI) at a resolution of 60,000. Charge states <2 and >4 were excluded, and precursors were excluded from selection for 30 s if fragmented n = 2 times within a 20 s window.
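The dynamic exclusion rule stated above (exclude a precursor for 30 s once it has been fragmented twice within a 20 s window) can be illustrated with a short sketch. This is only a conceptual model, not the instrument's implementation: the class and method names are hypothetical, precursors are identified by an arbitrary key, and mass tolerance, isotopes and charge-state handling are not modelled.

```python
from collections import defaultdict, deque


class DynamicExclusion:
    """Toy model of a repeat-count dynamic exclusion list (hypothetical API)."""

    def __init__(self, repeat_count=2, repeat_window_s=20.0, exclusion_s=30.0):
        self.repeat_count = repeat_count
        self.repeat_window_s = repeat_window_s
        self.exclusion_s = exclusion_s
        self._hits = defaultdict(deque)   # precursor key -> recent fragmentation times
        self._excluded_until = {}         # precursor key -> time at which exclusion ends

    def is_excluded(self, key, t):
        """True if the precursor may not be selected at time t (seconds)."""
        return t < self._excluded_until.get(key, float("-inf"))

    def record_fragmentation(self, key, t):
        """Register an MS2 event and start an exclusion if the repeat count is reached."""
        hits = self._hits[key]
        hits.append(t)
        # Discard events that fall outside the repeat window.
        while hits and t - hits[0] > self.repeat_window_s:
            hits.popleft()
        if len(hits) >= self.repeat_count:
            self._excluded_until[key] = t + self.exclusion_s
            hits.clear()


exclusion = DynamicExclusion()
exclusion.record_fragmentation("precursor_A", 0.0)
exclusion.record_fragmentation("precursor_A", 5.0)   # second event within 20 s
print(exclusion.is_excluded("precursor_A", 10.0))    # excluded until t = 35 s
```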
Synthetic peptides were ordered from Genscript at ≥85% purity, with no trifluoroacetate removal, and stock solutions were dissolved in DMSO at a peptide concentration of 10 mM. Synthetic peptide analysis was also performed on the Exploris 480 coupled to an Agilent 1260 LC system using a custom analytical chromatography column, prepared as described above. Peptides were eluted using a 105-min gradient (3.4% for 10 min, 3.4-6.9% for 2 min, 6.9-29% for 53 min, 51-100% for 3 min, hold for 1 min, 100-3.4% for 1 min) with 70% ACN in 0.2 M acetic acid at a flow rate of 0.2 ml min -1 and a pre-column split of 2,000:1. The Exploris was operated with a precursor scan range of 440-544 m/z and a targeted inclusion list for synthetic peptides (Supplementary Table 10). Additional acquisition parameters were as follows: full scan spectra were collected at 60,000 resolution, 300% normalized AGC, automatic IT; MS 2 spectra were collected at 60,000 resolution, isolation width 0.4 m/z, maximum IT of 250 ms, 100% normalized AGC target, fragmented by HCD with 30% CI with a 3 s cycle time.
TMT-labelled samples were analysed on a Q Exactive Plus mass spectrometer coupled to an Agilent 1260 LC. Samples were resuspended in 3% ACN, 0.1% FA and one-third of the labelled mixture was loaded onto a pre-column (100 µm ID × 10 cm) packed in-house with 10 µm C18 beads (YMC gel, ODS-A, AA12S11) connected in tandem to an in-house packed analytical column (50 µm ID × 15 cm and 5 µm C18 beads). Peptides were separated by a 105-min LC gradient with a flow rate of 0.2 ml min -1 and a similar pre-column split ratio. MS 1 scans were performed with the following settings: 350-1,200 m/z range, resolution of 70,000, AGC target of 3 × 10 6 and maximum IT of 50 ms. The 15 most abundant ions were isolated and fragmented with HCD (31% CI) with the following parameters: resolution of 70,000, AGC target of 1 × 10 5 , maximum IT of 350 ms and isolation window of 0.4 m/z. Charge states <2 and >4 were excluded. TMT channels were as follows: 126, KP; 127, KP; 128, KP/KbStrep; 129, KP/KbStrep.

MS data analysis
All mass spectra were analysed with Proteome Discoverer (v.2.5) and searched using Mascot (v.2.4) against the mouse SwissProt database (2021_03; 2021_02 for label-free quantification). Peptides were searched with no enzyme and variable methionine oxidation. Peptide spectrum matches were further filtered according to the following criteria: ions score ≥ 15 and search engine rank = 1. Results from technical replicates of each sample analysis were combined. Median retention time was calculated using the retention time values of filtered peptide spectrum matches from all replicates of a given sample. Label-free quantitation was done using the Minora Feature Detector (precursor abundance values measured based on area under the curve) in Proteome Discoverer with match between runs enabled and filtered for peptides with ions score ≥ 15 and search engine rank = 1. Abundances were averaged across technical and biological replicates. All data were processed in RStudio (R v.4.1.0) and Microsoft Excel (v.16.57). Endogenous and synthetic peptide spectra were compared by head-to-tail plots using the MSnbase R package 52 . Pearson's correlations and dot products were calculated using all the detected fragment ions in both spectra. Spectra that had the highest ions score in tumour samples (endogenous) or synthetic peptide sample were used (details listed in Supplementary Table 10) for head-to-tail plots and visualization using Interactive Peptide Spectral Annotator 53 . For Interactive Peptide Spectral Annotator, all fragment ions were plotted.
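The peptide spectrum match (PSM) filter and median retention time calculation described above can be sketched as follows. This is an illustration only and not the authors' R or Excel workflow: the record layout and field names (`peptide`, `ions_score`, `rank`, `rt_min`) are hypothetical, the example values are invented, and grouping the median by peptide sequence is an assumption about how the calculation was organized.

```python
from statistics import median


def filter_psms(psms, min_ions_score=15.0, required_rank=1):
    """Keep PSMs with ions score >= threshold and the required search engine rank."""
    return [
        p for p in psms
        if p["ions_score"] >= min_ions_score and p["rank"] == required_rank
    ]


def median_retention_times(psms):
    """Median retention time per peptide across all supplied (pooled) replicates."""
    by_peptide = {}
    for p in psms:
        by_peptide.setdefault(p["peptide"], []).append(p["rt_min"])
    return {peptide: median(rts) for peptide, rts in by_peptide.items()}


# Invented example records pooled from technical replicates of one sample.
example_psms = [
    {"peptide": "SIINFEKL", "ions_score": 32.0, "rank": 1, "rt_min": 41.2},
    {"peptide": "SIINFEKL", "ions_score": 18.5, "rank": 1, "rt_min": 41.6},
    {"peptide": "SIINFEKL", "ions_score": 20.0, "rank": 1, "rt_min": 41.4},
    {"peptide": "SIINFEKL", "ions_score": 12.0, "rank": 1, "rt_min": 55.0},  # fails score
    {"peptide": "VNVDYSKL", "ions_score": 25.0, "rank": 2, "rt_min": 30.0},  # fails rank
]

kept = filter_psms(example_psms)
print(len(kept), median_retention_times(kept))
```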

Immunofluorescence of lung and pancreas tissue
Tumour-bearing lung or pancreas tissue was manually dissected and embedded in optimal cutting temperature (OCT) compound and slowly frozen on dry ice. Frozen tissue blocks were stored at −80 °C until sectioning. On the day of sectioning, frozen tissue was allowed to equilibrate to −20 °C in the cryostat for at least 1 h. Sections (8 µm) were then cut and transferred to microscope slides (Fisher) before fixation in 100% acetone at −20 °C for 10 min and dried for 15 min at room temperature. Slides were then stored at −20 °C until staining.
Tissue sections were circled with a hydrophobic pen and then rehydrated with PBS for 5-10 min and then blocked in PBS supplemented with 5% BSA for 45 min. Primary antibodies were then added at the indicated dilutions (Supplementary Table 11) for 1 h at 25 °C. Slides were then washed four times with PBS and BSA, and incubated with fluorescently labelled secondary antibodies where indicated for 1 h at 25 °C. Stained sections were then washed three times with PBS, incubated with DAPI for 5 min, washed once with PBS and then mounted with Prolong Diamond AntiFade Mountant (Thermo). Slides were scanned on a VERSA 8 slide scanner (Leica) before analysis in ImageScope v64 and ImageJ.

Immunoblotting
For organoids, cells were dissociated with TrypLE, washed once with PBS, and then lysed in cell lysis buffer (CLB; 50 mM Tris pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% SDS and 1× HALT protease inhibitors). For monolayer cultures, medium was aspirated and cells were washed twice with PBS before lysis with CLB. Protein concentration was quantified with BCA (Pierce) and samples were loaded onto 4-12% Bis-Tris SDS-PAGE gels (Invitrogen) and electrophoresed at 150 V until the loading dye reached the bottom of the gel. Gels were then transferred onto 0.45 µm nitrocellulose membranes overnight at 20 V in a cold room. Blots were blocked with PBST (PBS plus 0.5% Tween20) and 5% milk for 30 min at room temperature, incubated with primary antibody for 1 h at room temperature, washed four times in PBST, incubated with HRP-conjugated secondary antibody for 1 h at room temperature, and then washed four times in PBST. Blots were developed with Clarity or Clarity-Max ECL substrate (Bio-Rad) and imaged on a ChemiDoc Gel Imaging System (Bio-Rad).

PCR with reverse transcription
RNA was isolated from organoid cultures by aspirating medium and then resuspending the Matrigel domes in 1 ml TRIzol (Ambion). RNA was then extracted from the homogenate using a PureLink RNA Mini kit (Ambion) according to the manufacturer's instructions with the final elution step in 40 µl of nuclease-free water. Isolated RNA was then quantified on a NanoDrop 2000 (Thermo). Total RNA (1 µg) was then added to a cDNA synthesis reaction using a High Capacity cDNA Reverse Transcription kit (Applied Biosystems) according to the manufacturer's instructions, with random hexamers as primers and including RNase inhibitor in the reaction. cDNA was then diluted 1:10 with nuclease-free water before adding 1 µl to a PCR reaction containing 500 µM forward and reverse primer and 1× Q5 master mix (NEB). Reactions were cycled 35 times with annealing temperatures calculated by NEB T m calculator before loading on 2% agarose gels. DNA bands were imaged by staining with ethidium bromide and ultraviolet transillumination.

Ribosome-protected fragment isolation with RiboLace
Ribosome profiling was performed using a previously described method using puromycin-conjugated magnetic beads to isolate actively translating ribosomes 28 . To obtain cellular material for Ribo-seq, organoid cultures of normal AT2 cells or organoids derived from 12-week KP tumours were transiently adapted to monolayer culture by washing tissue culture dishes with a 10% Matrigel solution before plating cells in either complete lung organoid medium for AT2 cells or lung organoid base medium for tumour organoids 30 . After 48 h, cells were treated with 100 µM cycloheximide for 5 min and immediately transferred to ice. Plates were then washed once with 10 ml PBS and 100 µM cycloheximide followed by direct lysis and thorough mechanical disruption in the plate according to the RiboLace Protocol (Immagina Biotech). Lysates were then cleared by centrifugation at 20,000g for 15 min. Nucleic acid content was quantified with a NanoDrop and each lysate was normalized to 0.3 a.u. 260 nm in 150 µl of W-buffer. SS solution (0.3 µl) and diluted RNAse Nux solution (5 µl of 1:66.7 dilution) were then added, and RNase digestion was stopped with 0.5 µl SUPERaseIn for 10 min on ice.
RiboLace beads were prepared according to RiboLace kit instructions. In brief, per sample, 90 µl of beads was magnetized, washed once with buffer OH, once with nuclease-free water, twice with B buffer and then resuspended in 30 µl RiboLace probe followed by incubation shaking for 1 h at room temperature. Beads were then passivated with 3 µl PEG for 15 min at room temperature and washed twice with 500 µl of buffer W.
Prepared beads were then resuspended in RNAse digested lysates and incubated with slow rotation for 70 min at 4 °C. Beads were then washed twice with 500 µl of buffer W before protease digestion by addition of 20 µl of SDS and 5 µl of proteinase K and incubating for 75 min at 37 °C. RNA was then extracted with phenol-chloroform, precipitated overnight with isopropanol and then electrophoresed on denaturing 15% TBE-urea gels. Gel pieces were excised from the region of 20-40 nucleotides to enrich for ribosome-protected fragments (RPFs). Gel pieces were then crushed by centrifugation through a punctured 0.5-ml tube placed in a 1.5 ml microcentrifuge tube. Gel debris was then incubated overnight rotating in 400 µl of gel elution buffer (20 mM Tris pH 7.5, 250 mM sodium acetate, 1 mM EDTA and 0.25% SDS). RNA was precipitated from the elutions by adding 700 µl of isopropanol and 1.5 µl of GlycoBlue (Thermo) followed by overnight incubation at −80 °C. RNA was then pelleted by centrifugation at 20,000g for 30 min at 4 °C. The RNA pellet was then washed with 70% ethanol, air dried for 2-3 min and resuspended in 10 µl of nuclease-free water.

RPF sequencing library preparation
Illumina sequencing libraries of isolated RPFs were prepared using a LACEseq kit (Immagina Biotechnology). Library preparation was performed according to the manufacturer's protocol with the following parameters. RPFs (5 ng) were 5′ phosphorylated in a 50 µl reaction containing 5 µl buffer BPK, 5 µl ATP, 1 µl PK and 34 µl nuclease-free water. The reaction was incubated for 1 h at 37 °C. RNA was then purified with an RNA Clean and Concentrator -5 kit (Zymo) and eluted in 6 µl. A linker was then ligated with 6 µl of RNA from the previous step, 1 µl buffer BA, 0.5 µl GTP, 0.6 µl MnCl 2 , 1 µl enzyme mix A, 0.25 µl linker MC and 0.75 µl nuclease-free water. The reaction was then incubated for 1 h at 37 °C and purified with an RNA Clean and Concentrator kit with 8 µl elution in nuclease-free water. Circularization was then performed with 8 µl of the eluted RNA, 2 µl buffer BLB, 1 µl ATP, 8 µl PEG8000 and 1 µl enzyme mix B. The resulting reaction was then incubated for 2 h at 25 °C. The reaction was then purified with an RNA Clean and Concentrator kit with 10 µl elution in nuclease-free water. The 10 µl of circularized RNAs were then primed by adding 1 µl of dNTPs, 1 µl of RT T primer and 2 µl of nuclease-free water and incubating at 70 °C for 5 min. Next, a master mix containing 4 µl buffer BRT, 1 µl DTT and 1 µl RT enzyme was added followed by an incubation at 50 °C for 40 min and heat inactivation for 5 min at 80 °C. cDNA was then amplified by adding 50 µl amplification mix, 0.8 µl forward primer, 0.8 µl reverse primer and 28.4 µl nuclease-free water. The reactions were then cycled seven times according to the manufacturer's PCR parameters. PCRs were cleaned up using AMPURE XP beads at a 1.6× ratio and elution in 40 µl of nuclease-free water. A second PCR amplification was performed using 20 µl from the previous PCR, 50 µl amplification mix, 1 µl LACEseq UDIs and 29 µl nuclease-free water. Reactions were cycled six times according to the manufacturer's parameters. 
The final libraries were size selected using PAGE purification and cleaned up using a DNA Clean and Concentrator (Zymo) before Illumina sequencing.

RNA-seq library preparation
RNA was extracted from the remaining AT2 or tumour lysates that were used for Ribo-seq using TRIzol according to the manufacturer's instructions. The aqueous phase containing RNA was precipitated with 2 volumes of isopropanol and washed once with 70% ethanol and air dried. Pellets were resuspended in nuclease-free water before polyA+ selective mRNA library preparation at the MIT BioMicroCenter.

Ribo-seq data analysis
Raw fastq files from each Ribo-seq library were trimmed according to recommendations in the LACE-seq protocol (Immagina Biosciences). Fastq files from each sample type were then merged before submission to the RiboToolKit server (http://rnabioinfor.tch.harvard.edu/RiboToolkit/) using a 26-38 nt size filter and a predicted open reading frame cut-off of P < 0.05. Data were then downloaded and visualized in R. For analysis of translation efficiency (TE), raw RNA-seq and Ribo-seq data were aligned to the transcriptome with kallisto 0.46.0. Aligned reads were then imported into R for DESeq2 analysis using the tximport package. Differential expression at the RNA and RPF levels was calculated using standard DESeq2 workflows. For TE analysis, the DESeq2 results object was generated incorporating both sequence type (Ribo versus RNA) and sample (LUAD versus AT2) as independent variables to identify genes that were differentially translated in LUAD versus AT2.
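To make the quantity behind the TE comparison concrete: translation efficiency is conventionally the ratio of ribosome-protected fragment (RPF) signal to mRNA signal, and a change in TE between LUAD and AT2 is a ratio of ratios. The sketch below computes that quantity on plain numbers. It is not the DESeq2 analysis used here (no library-size normalization, dispersion estimation or significance testing), and the function names, pseudocount and example values are assumptions for illustration.

```python
import math


def log2_te(rpf, rna, pseudocount=1.0):
    """log2 translation efficiency: RPF abundance relative to mRNA abundance."""
    return math.log2((rpf + pseudocount) / (rna + pseudocount))


def delta_log2_te(rpf_luad, rna_luad, rpf_at2, rna_at2, pseudocount=1.0):
    """Change in log2 TE in LUAD relative to AT2 (a ratio of ratios)."""
    return (
        log2_te(rpf_luad, rna_luad, pseudocount)
        - log2_te(rpf_at2, rna_at2, pseudocount)
    )


# Invented, already-normalized abundances for a single gene.
# mRNA is unchanged while RPF signal rises fourfold, so log2 TE rises by 2.
change = delta_log2_te(rpf_luad=400, rna_luad=100, rpf_at2=100, rna_at2=100, pseudocount=0.0)
print(change)
```

A gene whose RPF and mRNA signals rise by the same factor gives a change of zero, which is why TE separates translational regulation from changes in transcript abundance.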

HSP90i treatment
The HSP90i (NVP-HSP990, Selleck Chemical) was administered in the drinking water 32 . The average water consumption of the mice was calculated by measuring water bottle weight before and after 72 h of housing to determine the average consumption per mouse per day. Across all experiments, C57BL/6 mice consumed approximately 4 ml every day. Using these water consumption values and mouse weight, a 4 mg ml -1 stock solution of NVP-HSP990 (in 100% PEG400) was diluted directly into the drinking water to achieve a target dose of 0.5 mg kg -1 day -1 . HSP90i treatment began 8 weeks after tumour induction, and water was replaced twice per week for 4 weeks before euthanasia.
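The drinking-water dilution follows directly from the target dose, body weight and daily water intake. A minimal sketch of that arithmetic is given below; the dose (0.5 mg kg -1 day -1), water intake (4 ml per day) and stock concentration (4 mg ml -1) are taken from the text, whereas the 25 g body weight is a hypothetical example value and the function name is illustrative.

```python
def stock_ul_per_ml_water(dose_mg_per_kg_day, body_weight_g, water_ml_per_day, stock_mg_per_ml):
    """Volume of drug stock (ul) to add per ml of drinking water."""
    daily_drug_mg = dose_mg_per_kg_day * body_weight_g / 1000.0   # mg per mouse per day
    water_conc_mg_per_ml = daily_drug_mg / water_ml_per_day       # required concentration
    return water_conc_mg_per_ml / stock_mg_per_ml * 1000.0        # ml of stock -> ul


# 25 g is an assumed body weight; the other values are stated in the text.
ul_per_ml = stock_ul_per_ml_water(
    dose_mg_per_kg_day=0.5,
    body_weight_g=25.0,
    water_ml_per_day=4.0,
    stock_mg_per_ml=4.0,
)
print(f"{ul_per_ml:.3f} ul of stock per ml of drinking water")
```

Under these assumptions a 250 ml bottle would need roughly 195 µl of stock; heavier mice or lower water intake scale the volume proportionally.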

CD8 depletion and CD40/FLT3L treatment
KP/K b Strep tumour-bearing mice at 2 weeks (chronic depletion) or 14 weeks (acute depletion) after tumour initiation were treated with 100 µg of anti-CD8a-depleting antibody (BioXCell, BE0061) every 3 or 4 days until animals reached 16 weeks after tumour initiation. At necropsy, a small portion of tumour-bearing lung tissue (about 100 mg) was taken for flow cytometry confirmation of CD8a + immune cell depletion. The remaining tissue was flash frozen and processed for pMHC isolation.
KP/K b Strep tumour-bearing mice at 15 weeks after tumour initiation were treated for 7 days with 10 µg FLT3L-Ig (BioXCell, BE0342) by intraperitoneal injection 26 . On day 3 of FLT3L-Ig treatment, mice were also dosed with 100 µg of agonistic CD40 antibody (BioXCell, BP0016-2) by intraperitoneal injection. On day 7 of treatment (16 weeks after tumour initiation), mice were euthanized and a small portion of tumour-bearing lung tissue (about 100 mg) was taken for flow cytometry analysis of tumour-infiltrating CD8 + T cells, and the remaining tissue was flash frozen for pMHC isolation.

OT-I killing assay
OT-I transgenic mice on a Rag2 −/− background were euthanized by CO 2 inhalation and spleens were dissected and macerated through a 70-µm cell strainer. Splenocytes were then pelleted by centrifugation at 500g for 5 min, followed by red blood cell lysis using ACK lysis buffer for 2 min at room temperature. ACK was quenched with RPMI 1640 and 10% HI FBS, and splenocytes were pelleted by centrifugation at 500g for 5 min. The resulting cell pellet was then resuspended in 10 ml of T cell medium (RPMI 1640 supplemented with 10% HI FBS, 1× penicillin-streptomycin (Gibco), 1× non-essential amino acids (Gibco), 1× l-glutamine (Gibco), 1× HEPES (Gibco) and recombinant IL-2 (Peprotech)). At the time of plating, 1 µM SIINFEKL peptide was added to stimulate T cell activation and differentiation. At 18 h after plating cells, the SIINFEKL peptide was removed and cells were replated in T cell medium. At 72 h after addition of SIINFEKL, cells were prepped for target killing assays.
Target cells were KP or KP/K b Strep cell lines transduced with either lentivirus encoding mScarlet or mScarlet-SIINFEKL. On the day of the assay, target cells were trypsinized, washed and counted. KP-mScarlet cells (1 × 10 6 ) were incubated with 100 nM CFSE (KP CFSE lo ) and 1 × 10 6 KP-mScarlet-SIINFEKL cells were incubated with 5 µM CFSE (KP CFSE hi ) for 20 min at room temperature in PBS 54 . In parallel, 1 × 10 6 KP/K b Strep-mScarlet cells were incubated with 100 nM CFSE (KP/K b Strep CFSE lo ), and 1 × 10 6 KP/K b Strep mScarlet-SIINFEKL cells were incubated with 5 µM CFSE (KP/K b Strep CFSE hi ) for 20 min at room temperature in PBS. Labelling reactions were quenched with DMEM and 10% FBS and cells pelleted at 500g for 5 min. KP CFSE hi or CFSE lo and KP/K b Strep CFSE hi or CFSE lo were mixed in a 1:1 ratio and plated into round-bottom 96-well plates at 20,000 cells per well (10,000 CFSE hi and 10,000 CFSE lo ). Transgenic OT-I T cells were then counted, washed and plated into wells containing target cells at 0:1, 5:1 and 1:1 effector:target ratios. After 4 h, cells were processed for flow cytometry. Cells were gated on live/dead viability stain, and CD45 + cells were excluded. Specific killing was calculated with the following formula: 100 − ((CFSE hi /CFSE lo with effector cells)/(CFSE hi /CFSE lo without effector cells)) × 100.
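The specific-killing calculation can be expressed as a short function. This sketch assumes the formula is read as 100 minus the ratio of the CFSE hi /CFSE lo ratio with effector cells to the same ratio without effector cells, multiplied by 100; the function name and the event counts in the example are invented for illustration.

```python
def specific_killing(hi_with, lo_with, hi_without, lo_without):
    """Percent specific killing of CFSE-hi (antigen-positive) target cells.

    Arguments are event counts of CFSE-hi and CFSE-lo targets in wells
    with and without effector T cells.
    """
    ratio_with = hi_with / lo_with            # CFSE hi / CFSE lo with effectors
    ratio_without = hi_without / lo_without   # CFSE hi / CFSE lo without effectors
    return 100.0 - (ratio_with / ratio_without) * 100.0


# Invented counts: antigen-positive targets fall from 10,000 to 2,500 events
# while antigen-negative targets are unchanged.
print(specific_killing(hi_with=2500, lo_with=10000, hi_without=10000, lo_without=10000))
```

Normalizing to the no-effector well corrects for any deviation of the plated mixture from an exact 1:1 ratio.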

Bone-marrow-derived dendritic cell vaccination
Peptides were prioritized for vaccination as described in Fig. 4a. Candidates were then further prioritized through manual inspection of RNA expression patterns across healthy mouse tissues. We selected peptides that have very low RNA expression in healthy tissue (Slc26a4 and Prdm15), peptides that resemble cancer testis antigens (Ift74, Ccdc158 and Arrdc5) or oncofetal antigens (Gpn3 and Znf462), and a peptide with widespread expression despite tumour-specific presentation (Csf2ra). This effort was not comprehensive of all potential antigens, and continued interrogation of peptides from our model is warranted.
Bone-marrow-derived dendritic cells (BMDCs) were isolated by euthanizing mice and dissecting the femur and tibia into PBS 33 . The ends of the bones were then clipped, and the open end of each bone was placed facing down into a 0.5-ml tube with a hole in the bottom, which was placed into a 1.5-ml tube. The tube assembly was then quickly centrifuged on a tabletop microcentrifuge. Red blood cells were removed from the pelleted cell mass with ACK lysis buffer (Gibco), and the resulting cell pellet was resuspended at 1.5 × 10^6 cells ml^-1 in BMDC medium (RPMI 1640 supplemented with 10% FBS, penicillin-streptomycin, 50 µM β-mercaptoethanol, 600 ng ml^-1 FLT3L (BioXCell) and 5 ng ml^-1 GM-CSF (R&D Systems)) and cultured in non-tissue-culture-treated dishes. After 5 days, cells were washed, recounted and replated in BMDC medium. On day 10, cells were activated with 10 µg CpG ODN 1826 (Abeomics) for 18-24 h.
On day 11, activated BMDCs were gently washed with BMDC medium and pulsed with 10 µg of each peptide in a 15-ml conical tube with the lid not fully tightened for 2 h in a 37 °C tissue culture incubator. Pulsed cells were then washed once with BMDC medium and twice with PBS. For injection, 1.5 × 10^6 peptide-loaded BMDCs were subcutaneously injected into the flank. As a negative control, mice received BMDCs that were not pulsed with peptide.
Ten days after BMDC injection, mice received a booster of 6 µg of each peptide with 25 µg of c-di-GMP as adjuvant. At 14 days after the first boost (24 days after BMDC injection), mice received a second booster of 6 µg of each peptide with 1 nmol of LipoCpG as adjuvant (provided by the Irvine Laboratory at MIT). Seven days after the second boost, splenocytes were collected for ELISPOT analysis or, for tumour-bearing animals, tetramer staining.

Custom pMHC tetramer generation
H2-Kb tetramers were generated in-house using previously described protocols 34 . Specifically, we used disulfide-bridged H2-Kb (Y84C, A139C). In brief, recombinant MHC-I heavy chain (DS-Kb) and light chain (Homo sapiens B2M) were expressed in Escherichia coli using the lac-operon-regulated pET-15b plasmid. Inclusion bodies were prepared as previously described 55 . Inclusion bodies were solubilized in 8 M urea, 100 mM Tris-HCl pH 8.0 and added to a refolding buffer (100 mM Tris-HCl pH 8.0, 400 mM L-arginine (Sigma), 5 mM reduced glutathione (Sigma), 0.5 mM oxidized glutathione (Sigma), 2 mM EDTA (Gibco), 1× HALT protease inhibitor (Roche)). At the time of refolding, 1 mM Gly-Leu dipeptide was added to the refolding buffer, followed by 2 µM B2M, 1 µM DS-Kb and 1 mM PMSF for three consecutive days. After 72 h, the reaction was concentrated to about 2 ml by applying it to a 10,000 MWCO filter under nitrogen flow. The resulting concentrated protein solution was then purified by gel filtration on an s200 Sephadex column in HBS (100 mM HEPES pH 8.0, 150 mM NaCl). Relevant fractions were collected and concentrated to approximately 8 ml. Purified monomers were then biotinylated using BirA ligase according to the manufacturer's instructions (Avidity). Biotinylated monomers were then further purified with another round of gel filtration on an s200 Sephadex column in HBS. The relevant fractions were collected and concentrated to 2 mg ml^-1, and 50-µl aliquots were flash frozen in liquid nitrogen until use.
To generate tetramers, 15.9 µl of Streptavidin-PE (SA-PE, Molecular Probes) was added to 50 µl of DS-Kb monomers and incubated for 10 min at room temperature, protected from light. Addition of SA-PE was repeated 9 more times, for a total of 10 additions, resulting in a final volume of 209 µl of tetramerized DS-Kb. At the time of staining, 20 µl of tetramer was incubated with 50 µM of the peptide of interest for 1 h on ice. The resulting solution was used at a 1:200 dilution in flow cytometry staining solutions.
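As a quick check of the volumes in the stepwise tetramerization, the running volume after each SA-PE addition can be tabulated. The sketch below uses only the quantities stated in the text (50 µl of monomer, ten additions of 15.9 µl); the function name is hypothetical.

```python
from decimal import Decimal


def stepwise_volumes(start_ul, addition_ul, n_additions):
    """Return the running volume (µl) after each successive addition."""
    volumes = []
    volume = Decimal(start_ul)
    for _ in range(n_additions):
        volume += Decimal(addition_ul)
        volumes.append(volume)
    return volumes


# Decimal strings avoid binary floating-point rounding of 15.9.
volumes = stepwise_volumes("50", "15.9", 10)
for step, volume in enumerate(volumes, start=1):
    print(f"after addition {step}: {volume} µl")
```

The last entry corresponds to the final volume of 209 µl reported above.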

ELISPOT assay
Before plating cells, wells of a precoated ELISPOT plate were loaded with 100 µl of DMEM with 10% FBS and penicillin-streptomycin containing 10 µM of individual peptides, medium only or PMA/ionomycin as a positive control. Spleens from control or vaccinated mice were dissected and macerated through a 70-µm filter. Cells were then centrifuged for 5 min at 1,500 r.p.m., pellets were lysed with ACK lysis buffer, and the remaining splenocytes were counted with a haemocytometer. Cells were then plated across control and peptide stimulation conditions at 750,000 cells per well and gently mixed before incubation for 20 h. After overnight incubation, cells were decanted, the plates were washed, and IFNγ spots were detected using the procedure outlined in the Mouse IFNγ ELISPOT kit (ImmunoSpot). After development, plates were dried overnight before automated spot counting on an ImmunoSpot scanner.

Data analysis, statistics and visualization
All quantitative data were processed and visualized in R (R v.4.0.2, RStudio v.1.3.959). Peptides from PDAC or LUAD samples were concatenated and organized in Microsoft Excel with predicted affinity values calculated using NetMHCpan 4.1. Following import into R, peptide lists were filtered for length (8-11 amino acids) and predicted affinity (<1,000 nM) before further analysis. Non-metric multidimensional scaling (NMDS) analysis was performed using a previously published R script 14 and applied to 8-mer and 9-mer peptides separately. Peptide clusters were empirically chosen through hierarchical clustering of the calculated NMDS matrix. Peptide motifs were generated with the ggseqlogo package in R. All boxplots are in the Tukey style, with the horizontal mark at the median, the box extending from the first quartile to the third quartile, and whiskers extending 1.5× the interquartile range from the box.
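The length and affinity filter described above can be illustrated with a small sketch. The analysis itself was carried out in R; the Python below only illustrates the filtering logic, and the record type, function name and affinity values are hypothetical.

```python
from typing import List, NamedTuple


class Peptide(NamedTuple):
    sequence: str
    affinity_nm: float  # predicted affinity in nM (lower means stronger binding)


def filter_peptides(peptides: List[Peptide],
                    min_len: int = 8,
                    max_len: int = 11,
                    max_affinity_nm: float = 1000.0) -> List[Peptide]:
    """Keep peptides of 8-11 residues with predicted affinity below 1,000 nM."""
    return [
        p for p in peptides
        if min_len <= len(p.sequence) <= max_len
        and p.affinity_nm < max_affinity_nm
    ]


# Sequences are for illustration; affinity values are made up.
example = [
    Peptide("SIINFEKL", 20.0),       # kept: 8-mer, below threshold
    Peptide("SIINFEK", 20.0),        # dropped: 7 residues
    Peptide("VNVYFALL", 1500.0),     # dropped: affinity above threshold
    Peptide("AVLLYEKLAAAA", 50.0),   # dropped: 12 residues
]
kept = filter_peptides(example)
print([p.sequence for p in kept])
```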
Bulk RNA-seq data were adapted from ref. 27 by taking the mean expression of each gene across all normal, early, non-metastatic and metastatic primary tumour samples before cross-comparison with peptide data.
scRNA-seq data were analysed using Seurat (v.4.0.1). Cell-by-gene matrices were downloaded from the Tabula Muris (healthy lung and pancreas) or from the Broad Institute's Single Cell Portal (KP scRNA-seq data). Data were filtered with a gene count cut-off of 1,100 before normalization, scaling and dimensionality reduction with uniform manifold approximation and projection according to standard Seurat pipelines (https://satijalab.org/seurat/). Custom modules were added with the AddModuleScore function according to gene lists from peptide data or published reports. For volcano plots comparing signatures across cell types in the healthy lung data, P values were calculated with two-tailed Student's t-tests for all pairwise comparisons of Normal, Ab and Strep signatures within each cell type, with a Bonferroni adjustment for multiple comparisons. The median score for each signature was then used to calculate the log2(fold change), and P values were transformed to -log10(P value) for plotting. Pearson's correlations between gene signatures were calculated by scoring signatures with AddModuleScore in Seurat, extracting the metadata, converting the data to a cells × signatures matrix and calculating Pearson's correlation for signatures across cells. For correlation of peptide signatures to all genes, the scaled data from Seurat were extracted to a matrix, and the late-stage or Tumour-unique peptide signature was appended to the matrix. Pearson's correlation was then calculated between the peptide signature and each individual gene. Genes were then ranked by correlation and analysed with pre-ranked gene set enrichment in GSEA (Broad Institute).
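The final step above, correlating a per-cell peptide signature with every gene and ranking genes for pre-ranked GSEA, can be sketched as follows. This is an illustration of the idea rather than the code used in the study (which was run in R with Seurat); the function names, gene labels and numbers are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def rank_genes(signature: Sequence[float],
               expression: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Rank genes by correlation with the signature, highest first.

    Genes with constant expression across cells are skipped because their
    correlation is undefined.
    """
    ranked = []
    for gene, values in expression.items():
        if len(set(values)) > 1:
            ranked.append((gene, pearson(signature, values)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical signature scores and scaled expression for four cells.
signature = [1.0, 2.0, 3.0, 4.0]
expression = {
    "GeneA": [2.0, 4.0, 6.0, 8.0],
    "GeneB": [4.0, 3.0, 2.0, 1.0],
    "GeneC": [1.0, 3.0, 2.0, 4.0],
    "GeneD": [5.0, 5.0, 5.0, 5.0],  # constant, skipped
}
ranking = rank_genes(signature, expression)
for gene, r in ranking:
    print(f"{gene}\t{r:.2f}")
```

The resulting gene-by-score list has the shape of a pre-ranked input: one row per gene, ordered by the ranking metric.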
Gene ontology analysis was performed on gene lists in StringDb (http://www.stringdb.org) with high-confidence interaction cut-offs. Gene set enrichment analysis (Fig. 2) was performed with the GSEA software package using a gene list pre-ranked according to correlation to the late-stage or Tumour-unique peptide signature. Subcellular compartment analysis was performed by extracting compartment locations from UniProt for source proteins that had available localization data. Subcellular distributions were compared using Fisher's exact test with Monte Carlo simulation. Protein length information was also extracted from UniProt for each source protein. Thermal stability and protein half-life were obtained from published datasets 30 .
Statistical analyses were performed in R. To compare median fluorescence intensities from flow cytometry data, two-tailed Student's t-tests were performed for all pairwise comparisons. For comparisons of gene expression or signature scores, unpaired two-sample Wilcoxon tests (Mann-Whitney U) were performed. For analysis of peptide abundance distributions in vehicle-treated and HSP90i-treated samples, Kolmogorov-Smirnov tests were used.
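To make the last comparison concrete, the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the empirical cumulative distributions of the two samples. The sketch below computes only that statistic (not the P value) and is not the code used in the study; the function name and data are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic D (maximum ECDF difference)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those is enough.
    for value in sorted(set(a) | set(b)):
        ecdf_a = bisect_right(a, value) / len(a)
        ecdf_b = bisect_right(b, value) / len(b)
        d = max(d, abs(ecdf_a - ecdf_b))
    return d


# Hypothetical peptide abundance values for two treatment groups.
vehicle = [1.0, 2.0, 3.0, 4.0]
treated = [3.0, 4.0, 5.0, 6.0]
print(f"D = {ks_statistic(vehicle, treated):.2f}")
```

Because D depends on the whole distribution rather than only its centre, the test is sensitive to shifts in shape or spread as well as in location.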

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability
All MS data have been deposited to the Proteomics Identifications Database (PRIDE) repository with the dataset identifier PXD033232. Raw RNA-seq and Ribo-seq data have been submitted to the Gene Expression Omnibus with the dataset identifier GEO178944. Source data are provided with this paper.

Extended Data Fig. 3 | (Related to Fig. 1).

(Related to Fig. 2). Analysis of the LUAD immunopeptidome throughout tumor evolution. a) UMAP embedding of clusters used for signature expression analysis in Fig. 2a.

Extended Data Fig. 9 | (Related to Fig. 4). Expression and presentation of putative tumor specific and tumor associated antigens. a) Correlogram and heatmap depicting transcript abundance (transcripts per million, TPM) of putative TSA genes across mouse tissues. b) Correlogram and heatmap depicting TPM abundance of putative TAA genes across mouse tissues. c) Experimental schematic showing the derivation of samples for 2D immunopeptidomics. d) Venn diagram depicting the relationship between peptides identified from KP tumors in vivo and those identified in vitro. e) Boxplot showing the predicted affinity distributions of peptides isolated in vivo and in vitro. f) Distribution of source protein subcellular compartments for peptides identified in vivo (gray) and in vitro (green). P calculated with Fisher's exact test with Monte Carlo simulation. g) Volcano plot indicating differentially expressed genes between EPCAM+ cells from embryonic day 16.5 and post-natal day 28 mouse lung (adapted from Lung Map Project). Data analyzed and P calculated with DESeq2. All genes detected are shown in grey and genes encoding LUAD-unique peptides are indicated with black dots. h) Flow cytometry analysis of tumor-bearing lung tissue from naïve and vaccinated mice stained with control pMHC-I tetramer (SIINFEKL) or TAA tetramer (SVAHFINL). i) Peptides identified in A549 cells (Javitt et al.) with and without IFN-γ/TNFα treatment. Peptides derived from source proteins homologous to those used in the pooled vaccine are indicated in red. j) Heatmap depicting expression of the human homologs of putative TSAs and TAAs from this study and whether or not peptides derived from those genes were found to be presented on A549 cells in Javitt et al. k) Heatmap depicting the RNA expression of homologs of potential TSA and TAA genes as found in Fig. 4a across all individual human tissues and 33 cancer types within TCGA.

(Related to Fig. 4). Mass spectrometry validation of immunogenic epitopes with synthetic peptides. a) Mass spectrometry comparison of spectra from endogenously identified VNVYFALL peptide (Slc26a4) and a synthetic standard. b) Mass spectrometry comparison of spectra from endogenously identified SVAHFINL peptide (Prdm15) and a synthetic standard. c) Mass spectrometry comparison of spectra from endogenously identified AVLLYEKL peptide (Ift74) and a synthetic standard.
In the left panel, y-, b-, and a-ions are colored in bold. In the right panel, common peaks are drawn in darker colour.