Genomic and Immunological Features of Micropapillary Pattern in Lung Adenocarcinoma

Background: Lung adenocarcinoma (LUAD) usually contains heterogeneous histological subtypes, among which the micropapillary (MIP) subtype is associated with poor prognosis while the lepidic (LEP) subtype has the most favorable outcome. A more comprehensive analysis of the two subtypes, involving discovery and public validation cohorts, could better decipher the key biological and evolutionary mechanisms. Methods: We first retrospectively studied the survival status of 286 LUAD patients with different subtypes. MIP and LEP components were micro-dissected for whole-exome sequencing (WES). Shared and private alterations, as well as the genomic alteration characteristics of the two components, were investigated. Four public cohorts containing LEP and MIP samples were further selected for genomic profile comparison, novel therapeutic target investigation and immune infiltration quantification. Results: The LEP and MIP subtypes exhibited the largest disease-free survival (DFS) difference in our patients. A total of 2035 SNVs and 2757 InDels were identified in the sequenced LEP and MIP components. EGFR had the highest mutation frequency. Distinct biological processes or pathways were involved in the evolution of the two components. In addition, analyses of copy number variation (CNV) and intratumor heterogeneity (ITH) revealed possible immunosurveillance escape, a discrepancy between mutation-level and CNV-level ITH, and pervasive alterations of DNA Damage Response and WNT pathway genes in the MIP component. Phylogenetic analysis of 5 pairs of LEP and MIP components further confirmed the presence of ancestral EGFR mutations. Through comprehensive analysis of our samples and public cohorts, PTP4A3, NAPRT and RECQL4 were identified as novel therapeutic and diagnostic targets in the MIP subtype. The prevalence of immunosuppression in the MIP component was finally confirmed by multi-omics data. Conclusion: We identified genetic differences responsible for the varied prognosis. The subtype evolution trajectory was additionally unraveled. The novel gene targets and the immunological analyses also provide therapeutic suggestions for the MIP subtype.


Introduction
Lung cancer is the leading cause of cancer-related death worldwide, and adenocarcinoma is the most common histological type of non-small-cell lung cancer (NSCLC) (1). Most adenocarcinomas are composed of heterogeneous histological subtypes rather than a single one. In 2015, the World Health Organization (WHO) proposed a novel definition of five lung adenocarcinoma (LUAD) subtypes to address this histologic heterogeneity: the lepidic (LEP), acinar (ACI), papillary (PAP), micropapillary (MIP), and solid (SOL) patterns. In particular, as a poorly differentiated, high-grade tumor, the MIP histological subtype has been repeatedly reported to be a negative prognostic factor. Patients presenting with MIP are prone to lymphovascular invasion and pleural invasion, as well as lymph node or intrapulmonary metastasis after surgical resection (2). Meanwhile, previous studies indicated that patients with the LEP growth pattern exhibited less aggressive behavior and had the most favorable outcome among the predefined subtypes (3). Both the presence of LEP and the absence of MIP growth patterns served as predictors of favorable disease-free survival (DFS).
To elucidate the mechanisms behind tumorigenesis and the discrepancy in malignancy, several studies have evaluated the molecular and genetic features of LUAD subtypes, especially MIP and LEP. For the MIP subtype, although positive staining of E-cadherin and beta-catenin was found, a recent study observed disruption of the catenin-cadherin complex in MIP (4), which possibly contributed to its poor intercellular adherence. In addition, the cytoplasmic accumulation of beta-catenin induced the canonical WNT-beta-catenin signaling pathway in MIP. The cell proliferation and migration induced by the canonical WNT pathway may account for the histological characteristics of MIP patients. At the genetic level, MIP/SOL tumors had a significantly higher tumor mutation burden (TMB) and fraction of genome altered than other LUAD subtypes. The key oncogenes BRAF and EGFR were found to have a higher mutation frequency in LUAD with MIP in multiregional and multiracial cohorts (5). The gene and protein levels of c-MET were also elevated in MIP and in patients with poor prognosis (5,6). Although dysregulated oncogenes associated with the poorer prognosis of MIP-predominant LUAD have been identified, key mechanisms remain uncharacterized. For example, the genetic association between subtypes and the evolutionary trajectory of the relatively malignant MIP subtype have scarcely been discussed.
Additionally, therapeutic options including surgical resection, chemotherapy, and targeted therapy have recently been proposed for LUAD with MIP dominance. However, these regimens still have non-negligible limitations. Emerging evidence indicates that limited resection may not be the optimal surgical approach for MIP patients because of observed postoperative recurrence, whereas adjuvant chemotherapy mostly benefits early-stage MIP-positive patients (6). Moreover, although sporadic mutational targets were identified in previous studies, developing effective therapies remains difficult, not to mention the inescapable acquired resistance to tyrosine kinase inhibitors (TKIs) such as EGFR TKIs. With the recent emergence of lung cancer immunotherapy, studies assessing the efficacy of immune-related therapies in MIP-predominant LUAD have appeared. Because the abundance of programmed death-ligand 1 (PD-L1) and programmed cell death protein 1 (PD-1), as well as the tumor immunological microenvironment, crucially influence immunotherapy effectiveness, Francois et al. found significant differences in PD-L1 expression level between LUAD histological patterns (7), while Zhang et al. detected higher CD4+ and CD8+ T cell infiltration as well as increased PD-L1 abundance through immunohistochemistry staining (8). Given that both studies focused on restricted components of the tumor microenvironment (TME), a more comprehensive analysis of TME variation in specific LUAD subtypes could fill the gap in determining optimal treatment, especially for MIP patients.
To address the limitations mentioned above, we retrospectively reviewed 286 patients with different predominant histological subtypes and compared their survival. Patients who simultaneously possessed MIP and LEP components were further selected for whole-exome sequencing (WES) of both components, and the genetic differences responsible for the varied prognosis, as well as the subtype-level genetic association, were investigated. Multi-cohort analyses further identified genes specifically altered in MIP or LEP as novel therapeutic targets and characterized the extent of immune infiltration. Our results expand the understanding of the evolutionary relationship between LUAD subtypes and offer therapeutic suggestions for MIP patients.

Patient selection and histopathologic subtyping
We retrospectively reviewed patients diagnosed with LUAD at Tianjin Cancer Hospital from 2011 to 2014. Patients who underwent tumor resection and in whom the MIP component exceeded 5% of the tumor area were initially selected. Those who had received pre-surgery anticancer treatment, or who had stage IV disease or another malignancy, were then excluded. A total of 286 patients passed the selection criteria, and the resected tumors were restaged according to the eighth edition of the American Joint Committee on Cancer TNM staging system for lung cancer. For LUAD histological subtyping, the formalin-fixed paraffin-embedded (FFPE) samples were first stained with hematoxylin and eosin (H&E) and reviewed by 2 pathologists.

Intratumor heterogeneity measurement and SNV/SCNA clonal architecture inference
We measured the intratumor heterogeneity (ITH) of samples on both SNVs and SCNAs. For filtered SNVs, the mutant-allele tumor heterogeneity (MATH) score (14) was calculated using VAF values. The ABSOLUTE (15) tool further estimated the cancer ploidy, tumor purity, rescaled copy ratio and cancer cell fraction (CCF) by combining SNV and SCNA data. The clonal architecture of SNVs was derived from a higher clonal mutation probability and an upper bound of the CCF 95% confidence interval larger than 1. For SCNAs, copy-neutral LOH (CNLOH) segments were initially discarded, and clonal architectures were further annotated using the allelic subclonal information from the ABSOLUTE outputs.
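As an illustration of the MATH score mentioned above, the following minimal sketch uses the commonly cited definition, 100 × MAD / median of the variant allele frequencies, with the median absolute deviation (MAD) scaled by 1.4826. The function name `math_score` and the example VAF values are hypothetical and are not taken from the study's pipeline.

```python
from statistics import median


def math_score(vafs):
    """Mutant-allele tumor heterogeneity score for one sample.

    Assumes the common definition MATH = 100 * MAD / median(VAF),
    where MAD is the median absolute deviation scaled by 1.4826.
    """
    vafs = list(vafs)
    if not vafs:
        raise ValueError("at least one VAF value is required")
    center = median(vafs)
    if center == 0:
        raise ValueError("median VAF is zero; MATH is undefined")
    # Scaled median absolute deviation from the median VAF.
    mad = 1.4826 * median([abs(v - center) for v in vafs])
    return 100.0 * mad / center


# Hypothetical VAF values for a single tumor component.
example_vafs = [0.1, 0.2, 0.3, 0.4, 0.5]
print(round(math_score(example_vafs), 2))
```

A wider spread of VAF values around the median gives a larger score, which is why the score is read as a proxy for mutation-level ITH.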

DDR pathway gene analysis
DNA damage repair (DDR) related genes were collected from a previous publication (16) for the integrative analysis of SNVs and SCNAs.

Inference of the clonal population structures
The clonal population structures were identified from filtered SNVs and annotated SCNAs by PyClone-VI (17). These clone clusters were then visualized with ClonEvol (18).

Pathway annotation and Gene Ontology analysis
The Enrichr (19) tool was used for pathway enrichment and Gene Ontology (GO) analysis. Enriched GO Biological Processes and Reactome (20) pathway entries were reported with P-values.
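To make the idea of an enrichment P-value concrete, the sketch below computes a one-sided hypergeometric over-representation test, a common form of gene set enrichment testing. It is an assumption-laden illustration rather than a reproduction of Enrichr's scoring; the function name `enrichment_p_value` and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_query, n_overlap):
    """One-sided hypergeometric P-value for over-representation.

    Probability of seeing at least n_overlap term genes when n_query
    genes are drawn at random from n_background genes, of which
    n_in_term belong to the pathway or GO term.
    """
    upper = min(n_in_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 background genes, 150 genes in the term,
# 40 mutated genes in the query list, 5 of which fall in the term.
print(enrichment_p_value(20000, 150, 40, 5))
```

Because many terms are tested at once, such P-values are normally adjusted for multiple comparisons before interpretation.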

Public data curation for comparisons
For multi-omics data comparison, SNV, SCNA, transcriptomic and proteomics data were retrieved from multiple LUAD datasets. More specifically, SNV and SCNA data from four datasets (Lung-Broad (21), Lung-MSKCC, Lung-OncoSG (22) and TCGA-LUAD) were downloaded from the cBioPortal for Cancer Genomics database (23) or the UCSC Xena database (24), and only non-synonymous mutations were retained. Survival information was also downloaded if available. Additionally, transcriptomic data from Lung-OncoSG, TCGA-LUAD and one GEO dataset, GSE148801 (25), containing good-prognosis (e.g. LEP, ACI and PAP histological subtypes) and poor-prognosis (e.g. MIP and SOL) samples were collected, while proteomics data from TCGA-LUAD were similarly curated. Only data from the LEP and MIP subtypes were used for further comparisons.

Immune infiltration analysis
Immune infiltration analysis was conducted on the curated transcriptomic samples for comparisons between LEP and MIP components. The RSEM (26) normalized expression as well as FPKM values were subjected to the xCell (28) method in the R package Immunedeconv (27) to derive the density of immune cells in the tumor microenvironment (TME). Additionally, because the recognition of, response to and killing of cancer cells by the immune system proceed in a step-wise manner, i.e. the cancer immunity cycle (29), the activity of each step in the cycle was assessed by single-sample Gene Set Enrichment Analysis (ssGSEA) from the GSVA R package (http://bioconductor.org/packages/release/bioc/html/GSVA.html).

Statistical methods
A two-sided Mann-Whitney test was used to evaluate group-level differences between LEP and MIP components. For multiple comparisons, P-values were adjusted by the Benjamini-Hochberg method. When comparisons were conducted on categorical data, Fisher's exact test was used. For the protein expression data, a one-sided Student's t-test was used. For all tests, a P-value < 0.05 was considered statistically significant. The Kaplan-Meier (K-M) survival curves were generated by the survminer package (30).

[...] Figure 1A), and the LEP and MIP subtypes exhibited the largest DFS difference, while SOL-predominant patients showed significantly worse OS (P<0.05, Supplementary Figure 1B).
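The Benjamini-Hochberg adjustment named in the statistical methods can be sketched in a few lines. This is a generic illustration of the standard step-up procedure, not the authors' code; the function name `benjamini_hochberg` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Each P-value is multiplied by m / rank (rank in ascending order),
    then a running minimum from the largest rank downwards keeps the
    adjusted values monotone and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P-values from four group-level comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A comparison would then be called significant when its adjusted value falls below the chosen threshold, 0.05 in the text above.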

Mutational landscape exhibits the involvement of distinct biological processes in LEP and MIP lesions
Among 8 micro-dissected samples, 6 cases passed quality control, but the quantity of the LEP component in one case was insufficient. Finally, 6 MIP and 5 LEP components were sequenced (Supplementary Table 3 and Figure 1A). A total of 2035 and 2757 SNVs and InDels were identified, while 684/791 and 257/284 mutations were retained after quality and cancer-related gene filtering (Supplementary Figure 2A and data quality metrics in Supplementary Table 4). Genes with the highest mutation frequency after quality filtering are shown in Figure 1B. EGFR was identified as the most frequently mutated driver gene, in line with the finding that LEP and MIP components possess significantly higher EGFR mutation frequency (31). In addition, several cancer-associated genes including TP53, TRIO, CEBPA, PCLO and PDE4DIP were concomitantly mutated (Figure 1B), indicating that the p53, WNT-beta-catenin signaling, PI3K/AKT/mTOR signaling and DNA repair pathways were affected. Interestingly, shared mutations were observed between paired LEP and MIP components from single patients (Figure 1B), raising the presumption that the paired LEP and MIP components could be homogeneous. We also compared the mutation frequency of these genes with public cohorts, including Lung-Broad, Lung-OncoSG and TCGA-LUAD. Several cancer-associated genes including EGFR, TP53, TRIO, PCLO and PDE4DIP were found to be recurrently mutated (Supplementary Figure 2B).
Additionally, mutation signature analysis was conducted on unfiltered mutations separately for LEP and MIP components. The point substitution spectrum displayed no significant difference between the two histological subtypes (Supplementary Figure 2C). Similarly, the SBS, InDel and DBS signatures mapped to the COSMIC database (accessed in March 2021) were similar between the two subtypes (Figure 1C-D), suggesting that the histological differences between LEP and MIP components could be caused by alterations in specific key genes.
Of particular interest, mutated tumor suppressor genes (TSGs) were enriched in distinct pathways in the LEP and MIP subtypes (Figure 1E). Mutated TSGs from LEP components were enriched in DNA repair and TP53-related pathways, while mutated TSGs in MIP components were enriched in pathways associated with the beta-catenin destruction complex, AXIN mutation and WNT signaling, which is highly consistent with a previous report (4). For mutated oncogenes, 7 of the top 10 enriched Reactome pathways were identical between the two groups and were mainly associated with EGFR and PI3K signaling (Supplementary Figure 2D). By further inspecting the mutated TSG pathway enrichment pattern in Lung-Broad (Supplementary Figure 3A [...] Figure 5C), which was consistent with a previous report (33). To further pinpoint the recurrent SCNAs at the focal level, we identified 1116 genes with somatic copy number alterations through statistical testing on read coverage from all samples, among which 159/80 genes were uniquely amplified/deleted in the LEP component while 34/11 genes were uniquely amplified/deleted in the MIP component. Annotating the enriched pathways for these genes, we found 27 pathways overlapping between the enrichment results of genes uniquely amplified in LEP and genes uniquely deleted in MIP; these could be further categorized into immune system, innate immune system, interleukin signaling, SHC1 events, ERK activation and FRS-mediated signaling pathways (Figure 2A). In terms of gene numbers, pathways related to the immune system and innate immune system had the highest number of altered genes (37 genes amplified in LEP and 4 genes deleted in the MIP subtype), indicating that MIP LUADs tend to exhibit immunosurveillance escape.
Additionally, two pathways were shared between the enrichment results of genes uniquely deleted in LEP and genes uniquely amplified in MIP (Figure 2B). They were associated with Homology Directed Repair (HDR) and mRNA fate regulation, but the number of altered genes was limited (6 genes for the LEP group and 4 genes for the MIP group).
Intratumor heterogeneity (ITH) reflects the genetic and epigenetic diversity within a tumor and has been shown to be closely related to cancer progression, therapeutic resistance and recurrence. To compare the ITH of the two histological subtypes at both the mutational and copy number levels, we annotated the mutations/SCNAs with clonality. As shown in Figure 2C, the MIP group had a higher clonal tumor mutation burden (cTMB) and a lower subclonal mutation proportion (Figure 2D), indicating lower mutation-level ITH in the MIP subtype. The MATH score, which is widely used to measure mutational ITH, exhibited a similar trend (Supplementary Figure 5D). For copy number variations, the MIP group possessed a significantly higher proportion of subclonal SCNAs (Figure 2E) as well as a higher subclonal genome fraction (Figure 2F). Interestingly, the frequency of clonal mutations in DNA Damage Response (DDR) and WNT pathway genes was higher in the MIP subtype (Supplementary Figure 5E), which possibly contributes in part to the subclonal genomic alterations of immune-related genes, since the association of canonical WNT-beta-catenin signaling with carcinogenesis and immune suppression is well established (34). Indeed, the 6 MIP components had a higher percentage of subclonal SCNAs (Figure 2G) as well as a higher number of focal deletions (Figure 2H) in the genes related to the two immune pathways.

Evolutionary pattern exploration on the paired LEP and MIP components
To elaborate the possible evolutionary process between the LEP and MIP subtypes, we delineated phylogenetic trees for each patient based on mutations as well as focal-level SCNAs. As shown in Figure 3, all 5 patients possessed truncal mutations shared between paired LEP and MIP components, while no obvious bias in private mutation burden after truncal divergence was observed. Clonal mutations in cancer drivers including EGFR, TP53 and CEBPA were identified, and EGFR was the only gene shared by all 5 pairs, which confirmed the presence of ancestral mutations. Additionally, the driver mutations private to LEP were enriched in chromatin organization, TP53-related and DNA double-strand repair pathways (Supplementary Figure 6A), while mutations private to MIP were enriched in cellular signaling and beta-catenin-related pathways (Supplementary Figure 6B). We further annotated the shared mutations in Figure 3 with clonality to explore the clonal-subclonal transitions between the LEP and MIP subtypes. For genes possessing mutations with increased clonality in MIP, GO terms related to neurogenesis were enriched (Supplementary Figure 6C), suggesting that tumor-induced neurogenesis and nerve-cancer crosstalk may account for the aggressiveness of the MIP subtype. Conversely, genes with mutations that switched from subclonal to clonal in LEP were associated with cell-cycle-related GO biological processes (Supplementary Figure 6D). For the truncal focal SCNAs, several driver genes including CSMD3, SPTAN1, BCORL1, CAMTA1, GRIN2A, MED12 and TRAF7 were concurrently amplified in the two subtypes (Figure 3); these genes were associated with developmental biology (R-HSA-1266738) and EGFR-related Reactome pathways (R-HSA-179812 and R-HSA-180336). Moreover, deletions of TP53, MUC4, ARID5B, ANK1, PTEN, SFPQ, FANCA, MAF and ZFHX3 were observed in the two subtypes.
Interestingly, no driver gene showed concordant copy number variation in all 5 pairs of samples, possibly because of the elevated SCNA-level ITH in the MIP group. We also scrutinized the genes with shared copy number variation in the paired samples. As shown in Supplementary Figure 7A, most shared deletions were in immune-related genes, while the signal transduction and PI3K/AKT pathways, whose abnormality is highly associated with tumor progression and therapeutic resistance, were uniformly amplified (Supplementary Figure 7B). To further derive the mutational transitions and evolutionary trajectory, we used PyClone-VI to infer the mutational populations and their evolution from the paired components. As shown in Supplementary Figure 8A, numerous clone clusters were identified in the 5 patients, and these exhibited dynamic changes in variant allele frequency (VAF). Clusters with drastically increased VAF in the LEP subtype were mainly enriched in mRNA splicing pathways (Supplementary Figure 8B), while clusters with increased VAF in the MIP subtype were associated with ERBB2 functions (Supplementary Figure 8C). These data imply that the LEP and MIP components from one patient were derived from the same initiating cells and that the pathway-specific mutations acquired after the EGFR clonal mutation eventually shaped the subtype specificity.

Group-wise comparison discovered possible novel therapeutic targets for MIP histological subtype
To derive histological subtype-specific therapeutic targets, we next gathered SNV and SCNA data from the LEP/MIP (both LEP and MIP) groups and identified the genes with the highest alteration frequency difference across all samples. As shown in Figure 4A, a large mutation frequency difference was observed for 9 genes, with 3 genes specifically mutated in the LEP group. Additionally, 5 genes were found to have distinct copy number alteration patterns (Figure 4B), with 1 gene specifically amplified in the LEP group. Similarly, the mutation frequencies of the 9 genes plus EGFR were compared between 4 public cohorts (Figure 4C), and EGFR was the only gene with a significant mutational difference in the non-East Asian public cohorts. Moreover, all 5 SCNA group-specific genes showed a relative alteration frequency difference in the 3 non-East Asian cohorts (Figure 4D). Interestingly, the altered sample proportion, or alteration frequency, of the MIP-group-amplified genes PTP4A3, NAPRT and RECQL4 was highly similar in our samples (Figure 4B) and in public cohorts (Lung-Broad, Lung-OncoSG and TCGA-LUAD, Figure 4D), implying that they may function cooperatively through duplication in the MIP component. Spearman's correlation coefficient (SCC) on the SCNA pattern of our 11 samples confirmed the association of the co-amplified genes (Supplementary Figure 9A). Such a strong association was also observed in the LEP and MIP SCNA data from the Lung-Broad (Supplementary Figure 9B), Lung-OncoSG (Supplementary Figure 9C, left) and TCGA-LUAD (Supplementary Figure 9D, left) cohorts. Because SCNA is highly related to the consequent alteration of gene expression, we calculated the expression-level SCC between the 5 group-specific genes in cohorts with available transcriptomic data.
Unsurprisingly, when compared to all samples (Supplementary Figure 9C-D, middle), the expression-level associations between PTP4A3, NAPRT and RECQL4 shifted to a more synergistic state for LEP/MIP samples in both the Lung-OncoSG (Supplementary Figure 9C, right) and TCGA-LUAD (Supplementary Figure 9D, right) cohorts. More explicitly, the correlation between SCNA and RNA expression was higher for the three genes in two public cohorts (Figure 4E), and NAPRT as well as PTP4A3 exhibited significantly higher LEP/MIP group-specificity. As an exemplification, the correlation between [...] Figure 10B,D). In addition, patients with an increased SCNA level of the co-amplified genes exhibited a trend similar to that of the expression-based stratification strategy, both for all LUAD samples (Supplementary Figure 10E) and for the LEP/MIP subset (Supplementary Figure 10F) in the TCGA-LUAD cohort. To conclude, our comprehensive analyses identified PTP4A3, NAPRT and RECQL4 as possible novel therapeutic and diagnostic targets for the MIP histological subtype; these genes were co-amplified and co-expressed specifically in LEP/MIP components and were upregulated at the expression level in the MIP group.
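The Spearman's correlation coefficient (SCC) used in this section to relate the copy number or expression of two genes across samples can be sketched as follows: rank both vectors (ties receive their average rank) and take the Pearson correlation of the ranks. The function names and the per-sample values below are hypothetical placeholders, not data from the cohorts.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample copy number values for two co-amplified genes.
gene_a = [2.0, 2.4, 3.1, 3.8, 2.2]
gene_b = [2.1, 2.6, 3.0, 4.2, 2.0]
print(round(spearman(gene_a, gene_b), 3))
```

Working on ranks makes the coefficient insensitive to the scale of the copy number or expression values, which suits comparisons across cohorts processed with different pipelines.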

Immune-related analyses uncovered elevated immunosuppression in MIP subtype
In connection with the 3 co-amplified genes and the possibly exacerbated immunosuppression in the MIP subtype, we quantified the immune infiltration in the tumor microenvironment (TME) of three public LUAD cohorts to investigate the possible roles of these genes. For the Lung-OncoSG cohort, some differences were observed, but the discrepancy did not reach statistical significance (Supplementary Figure 11A). We next correlated the SCNA level of the 3 co-amplified genes with the immune cell infiltration levels. When only correlations with P-value<0.01 were selected, entries including "Granulocyte-monocyte progenitor", "immune score" and "microenvironment score" were negatively correlated (Supplementary Figure 11B), all indicating a reduced infiltration of general immune and stromal cells in the MIP subtype. Similarly, when correlations between RNA expression and immune infiltration with P-value<0.01 were selected, PTP4A3 was negatively correlated with CD4+ T memory cell infiltration, while NAPRT expression was negatively associated with the amount of cancer-associated fibroblasts (CAFs) (Supplementary Figure 11C-D). For the GEO dataset, significant between-group differences were observed for 7 cell types (Supplementary Figure 12A). Levels of hematopoietic stem cells (HSCs) were elevated in the relatively aggressive subtypes (MIP/SOL) with adjusted P-value<0.01, while quantities of activated myeloid dendritic cells and mast cells decreased in the aggressive subtypes. Naïve B cell, neutrophil and CD4+ T-helper 1 cell infiltration were further found to be associated with NAPRT and RECQL4 expression (Supplementary Figure 12B). As in the Lung-OncoSG cohort, the differences between the LEP and MIP subtypes were not statistically significant in the TCGA-LUAD cohort (Supplementary Figure 13A). Similar cell types were also observed in the correlation analysis of SCNA-level (Supplementary Figure 13B) and RNA-level (Supplementary Figure 13C) data.
Moreover, the disparity in cancer immunity cycle activity was inspected in the Lung-OncoSG, GEO and TCGA-LUAD cohorts. Activities of recurrent steps including release of cancer cell antigen, CD8+ T cell recruiting, dendritic cell recruiting, macrophage recruiting, T-helper 17 (Th17) cell recruiting, T cell infiltration into tumors and killing of cancer cells were significantly higher in the MIP subtype (Supplementary Figure 14A-C). Interestingly, the expression of PD-1 and PD-L1 was generally higher in the MIP group in the Lung-OncoSG (Supplementary Figure 15A, D), GEO (Supplementary Figure 15B, E) and TCGA-LUAD (Supplementary Figure 15C, F) datasets. On further examination of the differentially expressed proteins between the LEP and MIP subtypes from TCGA-LUAD, the proteins with MIP-specific elevation (Supplementary Figure 15G) were significantly enriched in the PD-L1 and PD-1 checkpoint pathway in cancer (Supplementary Figure 15H), and the protein-level expression of PD-L1 was positively correlated with the summed transcriptional expression of the co-amplified genes PTP4A3, NAPRT and RECQL4 (Supplementary Figure 15I). Notably, associations between the co-amplified genes and chemotherapy resistance have previously been reported (35). In summary, our analyses confirmed the prevalence of immunosuppression in the MIP subtype and provided therapeutic suggestions for MIP-predominant LUAD patients. Given the higher T cell infiltration level, possible chemotherapy resistance and immunosuppression, immune checkpoint inhibitor treatment could maximize the therapeutic benefit for MIP-predominant LUAD patients.

Discussion
LUAD exhibits high internal heterogeneity, comprising 5 subtypes: lepidic, acinar, papillary, micropapillary and solid. Consistent with other studies, we confirmed the survival disparity between LUAD subtypes. By performing WES on micro-dissected LUAD tissue samples of MIP and LEP components, we explored the genetic features related to the LEP/MIP growth patterns and the evolutionary connection between LUAD subtypes. Our results revealed that the LEP and MIP subtypes could be derived from the same initiating cells carrying an EGFR mutation, and that the ultimate histological dissimilarity was shaped by the pathway-specific mutations acquired along evolution. Previously, Wang et al. revealed that trunk mutations of common oncogenes such as EGFR and TP53 played a dominant role in early invasive LUAD (36), while Xin et al. observed truncal EGFR and KRAS mutations between preneoplasia and invasive LUAD in multi-region multifocal pulmonary nodules from the same patients (37). Our results showed that EGFR trunk mutations arose between pre-invasive and invasive LUAD components and that LEP/MIP components evolved according to a branched evolution model.
Through comprehensive comparisons of genetic alterations, the biological characteristics of the two LUAD subtypes were further elucidated. In the mutational comparisons, TSG mutations in LEP were associated with DNA repair and TP53 regulation, while genes related to WNT signaling and the beta-catenin destruction complex had both higher mutational frequency and higher clonality in MIP. Driver mutations private to MIP were also enriched in cellular signaling and beta-catenin-related pathways, while genes with lower mutational heterogeneity in MIP were associated with neurogenesis and ERBB2 functions. Aberrant WNT signaling pathway activation caused by mutations in genes encoding intracellular components is associated with a higher rate of recurrence in early-stage NSCLC. In addition, as the critical downstream effector in the canonical WNT pathway, excessive intracellular beta-catenin promotes lung cancer aggression. Liang et al. further confirmed that intracellular beta-catenin expression in MIP-predominant LUAD was higher than in LEP-predominant LUAD (4). Furthermore, tumor-induced neurogenesis shapes an immunosuppressive microenvironment (38). Although cancer-related neurogenesis has been considered to be associated with solid tumor metastasis, its role in LUAD remains poorly understood. Our results suggest an intrinsic association between the aggressive MIP phenotype and neurogenesis. The activation of signaling by the well-known proto-oncogene ERBB2 was associated with poor outcomes in NSCLC (39), which again coincides with the MIP characteristics. Additionally, the copy number of genes related to immune system, innate immune system, interleukin signaling, SHC1 events, ERK activation and FRS-mediated signaling was also found to be decreased in MIP. With the highest proportion of immune genes affected, the immunosuppression status of the MIP subtype was confirmed. Apart from specific genomic alterations, ITH provides crucial information on drug responsiveness and clinical prognosis.
Discordance between SNV-level and SCNA-level ITH was particularly observed in the MIP subtype. Subclonal genetic instability possibly facilitated MIP neoplastic cell proliferation (40), and the clonal mutations in key MIP-specific pathways contributed to its aggressive behavior.
The current diagnosis of LUAD growth patterns is based on morphologic assessment, which mainly relies on the subjective experience of pathologists. We discovered three potential diagnostic and therapeutic target genes with a co-amplification tendency in MIP, both in our discovery cohort and in three public validation cohorts. Previous studies showed that knockdown of PTP4A3 inhibited cell migration and invasion of lung cancer cell lines (41). PTP4A3 also induced microvascular and lymphatic vessel formation by facilitating VEGF and VEGF-C expression in lung cancer tissues, which is in accordance with the clinical observation that the MIP component in LUAD increases the risk of distant and lymph node metastasis. A previous study found that the loss of NAPRT promoted the Epithelial-Mesenchymal Transition (EMT) by stabilizing beta-catenin (42). The elevated expression of NAPRT was conceivably associated with the disruption of the catenin-cadherin complex in MIP. In addition, RECQL4 could coordinate and regulate cell proliferation and cell cycle progression by protecting chromosome stability (43), and its protein expression was remarkably higher in LUAD (44). These biological mechanisms of the three genes further support our findings.
Surgical resection, chemotherapy, and targeted therapy are common therapeutic options for MIP-predominant LUAD, but each still has limitations. Given the increasing enthusiasm for lung cancer immunotherapy, we assessed and compared the TME landscape between MIP and LEP components. The three co-amplified genes showed associations with the infiltration levels of general immune/stromal cells, CD4+ T memory cells and CAFs. Reduced infiltration of general immune and stromal cells in the MIP subtype indicated its lower systemic immunity status; a lower amount of CD4+ T memory cells was associated with worse survival in another tumor type (45), while CAFs were connected to a decreased number of CD8+ T cells and possible immunotherapy resistance (46). Additionally, the MIP/SOL subtypes had elevated infiltration of HSCs, which was associated with immunosuppressive phenotypes and elevated PD-L1 expression in glioblastoma (47). The mast cell infiltration level decreased in MIP/SOL samples, and mast cell depletion improved the efficacy of anti-PD-1 therapy in a melanoma model (48). For the stepwise activity of the cancer immunity cycle, steps including CD8+ T cell recruiting, T-helper 17 (Th17) cell recruiting, T cell infiltration into tumors and killing of cancer cells were significantly higher in MIP samples. In addition to the TME analyses, proteomics analysis further confirmed that the immunosuppression status was specific to the MIP subtype.
Unsurprisingly, PD-L1 protein expression correlated with the three co-amplified genes, whose associations with chemotherapy resistance have been reported previously. In general, our work revealed the comprehensive TME landscape of LEP/MIP components and emphasized the suitability of MIP-predominant patients for anti-PD-1/anti-PD-L1 immunotherapies.
Our study has several limitations. Firstly, the cohort only included 5 pairs of LEP/MIP components dissected from 5 LUAD patients and a total of 11 samples. Further studies involving a larger number of patients could better decipher the evolutionary trajectory between LUAD histological subtypes and identify subtype-specific genetic changes. Moreover, the three potential diagnostic and therapeutic target genes with a co-amplification or co-expression tendency in MIP should be further validated experimentally, especially with respect to their protein expression status. Additionally, we portrayed the TME heterogeneity using bulk RNA-seq data. With the recent maturation of multiple advanced techniques, methods including single-cell RNA-seq, spatial transcriptomics and multiplexed immunohistochemistry could better dissect the TME in LUAD. Lastly, our analyses only focused on the MIP and LEP subtypes. A more integrated study incorporating other LUAD histologic subtypes could better decode the disease.
To conclude, by selecting patients with both MIP and LEP components in one lesion for WES, we identified subtype-specific genetic differences responsible for the varied prognosis. We also revealed the evolutionary trajectory of the MIP subtype. The subtype specificity was possibly shaped by pathway-specific mutations acquired after the EGFR clonal mutation. In addition, using data from multiple cohorts, genes specifically altered in the MIP subtype were identified as novel diagnostic and therapeutic targets. Tumor microenvironment and proteomics analyses also revealed the prevalence of immunosuppression in MIP. Immune checkpoint inhibitor treatments such as anti-PD-1/anti-PD-L1 could possibly maximize the therapeutic benefit for MIP-predominant LUAD patients.

Consent for publication
All authors have approved the final manuscript for publication.