Landscape of somatic mutations in primary lung cancer and brain metastasis
We collected and quantified DNA from FFPE samples of the original 12 non-small cell lung cancer (NSCLC) patients and from the matched brain metastasis samples. The average coverage depths for tumor and normal cells were 194× and 120×, respectively. Detailed clinicopathological information is summarized in Supplementary Table 1. We identified 1,702 SNVs and 6,131 mutation events in 1,220 genes from the 12 paired lung cancer (LC) and brain metastasis (BM) samples, including mutations in the driver genes most frequently mutated in LC, such as TP53, EGFR, BRCA1, BRCA2, and BRAF. We identified a mean of 3.1 driver gene mutation events per tumor, with a dN/dS of 2.13, which is slightly higher than that of non-metastatic lung cancer samples in The Cancer Genome Atlas (TCGA), indicating a significant enrichment for cancer driver gene mutations. We did not find any difference in the dN/dS ratio among primary tumors (dN/dS = 2.20), brain metastases (dN/dS = 2.06), and mutations shared between lung cancer and brain metastasis (dN/dS = 2.25).
We found more somatic mutations in BM lesions (median 71, range 23–180) than in LC lesions (median 48.5, range 13–187), although the difference was not statistically significant (p = 0.069, Student's t test) (Supplementary Fig. 1B). The tumor mutation burden (TMB) of LC correlated with that of BM (Pearson coefficient 0.65, p = 0.02) (Supplementary Fig. 1C, 1D), suggesting that the TMB of BM may be estimated from that of the primary lung cancer when brain tissue is not available, which could help screen patients most likely to benefit from PD-L1 immunotherapy. Of all SNVs, 18.2% (0.5–35.9%) were shared between LC and BM, suggesting a common ancestral truncal clone, while 30.0% (9.3–60.8%) were LC-specific and 51.8% (19.9–79.7%) were BM-specific (Fig. 1B). Although metastases had more private SNVs than the primary tumors, these were not enriched for the pan-cancer driver genes (31) (Fig. 1C). This suggests that few additional private driver gene alterations are required for metastasis when the primary cancer is already advanced.
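The shared, LC-specific and BM-specific percentages reported above can be illustrated with a small set computation. The following is a minimal sketch only: it assumes each SNV is identified by a (chromosome, position, ref, alt) tuple and that the denominator is the union of calls from both lesions of one patient, neither of which is stated in the text. The function name `partition_snvs` and the toy calls are hypothetical and are not study data.

```python
def partition_snvs(lc_snvs, bm_snvs):
    """Split one patient's SNVs into shared, LC-private and BM-private fractions.

    Assumption: fractions are taken over the union of LC and BM calls.
    """
    lc, bm = set(lc_snvs), set(bm_snvs)
    union = lc | bm
    if not union:
        raise ValueError("no SNVs supplied")
    total = len(union)
    return {
        "shared": len(lc & bm) / total,      # present in both lesions
        "lc_private": len(lc - bm) / total,  # only in the primary tumor
        "bm_private": len(bm - lc) / total,  # only in the brain metastasis
    }


# Hypothetical toy calls keyed by (chromosome, position, ref, alt).
lc_calls = [("chr7", 101, "C", "T"), ("chr17", 202, "G", "A"), ("chr1", 303, "C", "A")]
bm_calls = [
    ("chr7", 101, "C", "T"),
    ("chr3", 404, "T", "C"),
    ("chr12", 505, "C", "T"),
    ("chr2", 606, "G", "T"),
]
fractions = partition_snvs(lc_calls, bm_calls)
print(fractions)
```

In this toy case one of six distinct SNVs is shared, two are LC-private and three are BM-private, so the three fractions sum to 1 by construction.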
We identified several genes associated with lung cancer metastasis (KMT2C, AHNAK2, PDE4DIP, ANKRD36C, and BAGE2), whose mutations were distributed differently among the LC and BM samples (Fig. 1A and 1D). KMT2C mutations were found in 25% of LC samples, whereas the mutation frequency in BM reached 50%, suggesting positive selection of KMT2C mutations during metastasis. According to the TCGA dataset, AHNAK2 mutations are significantly enriched in LC, with a mutation rate of 18.8% in lung cancer versus 9.98% in pan-cancer (p = 7.2 × 10⁻⁹, Chi-square test). The mutation frequency of AHNAK2 in our dataset was higher still, at 26.9%, which is 1.43 times that of the LC population in the TCGA dataset (p = 0.02, Chi-square test). We also observed that all EGFR mutations were shared between LC and BM, suggesting that EGFR mutations are drivers and are likely to be an early event preceding BM.
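The frequency comparisons above rest on a Chi-square test of a 2 × 2 table (mutated versus wild-type, in two cohorts). The sketch below shows that calculation under stated assumptions: it uses the Pearson statistic without continuity correction, which the text does not specify, and the counts in the example are hypothetical because only percentages are reported. The p-value for one degree of freedom is obtained from the standard library's complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are cohorts; columns are mutated / wild-type counts.
    Returns (chi2, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (not study data): 30/100 mutated vs 10/100 mutated.
chi2, p_value = chi_square_2x2(30, 70, 10, 90)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")

# Fold difference between the two AHNAK2 frequencies quoted in the text.
fold = 26.9 / 18.8
print(f"fold = {fold:.2f}")
```

The last line reproduces the 1.43-fold figure directly from the two reported percentages.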
To place the mutations identified in our study in a broader context, we performed a pathway analysis of the most frequently mutated genes (mutation frequency > 5%) (Supplementary Table 2). The most significant tumor pathways were chronic myeloid leukemia (p = 0.002), the ErbB signaling pathway (p = 0.0014) and the glioma pathway (p = 0.05). Keyword enrichment pointed to important metabolic abnormalities in the metastatic lung cancers, including EGF-like domain and tyrosine-specific phosphatase (Supplementary Table 3).
Copy Number Variations
To further explore BM-related molecular events, we analyzed genomic copy number variations (CNVs). 8q21.2, 6p22.1, 12p13.33 and 5q35.3 were the most commonly deleted chromosomal regions in both LC and BM, and 8q24.13 was the most commonly gained region in both LC and BM (Supplementary Fig. 2A). Loss of 6p22.1, which harbors HLA-A, HLA-G and HLA-H, was most frequent in both LC and BM of patient 9 and in BM of patient 11 (Supplementary Fig. 2B). Interestingly, these samples also had high TMB (Supplementary Fig. 1C). This may be because loss of HLA function is associated with a higher overall mutation burden and a larger fraction of HLA-binding neoantigens (32). The recurrent HLA deletion was detected as an early event, indicating an important role of the immune system in LCBM and suggesting that these patients may benefit from immunotherapy.
Differences between LC and BM included gain of 7q35 and losses of 7q22.1 and 7q36.3, which were more frequent in metastasis samples, and gain of 11q13.2 and losses of 7q11.23 and 2q13, which were less frequent. Most recurrent CNV regions were shared between LC and BM samples, indicating that CNVs are early molecular events in tumorigenesis and metastasis.
Spectrum And Signatures
To determine the relationship between mutational spectra and tumor organ sites, we compared the spectra of LC and BM from our study with those of primary lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), low grade glioma (LGG) and glioblastoma multiforme (GBM) from the TCGA dataset. C > T was the most common base substitution in our LC and BM samples, a pattern much closer to that of primary brain cancer (LGG and GBM) and significantly different from that of primary lung cancer (LUAD and LUSC), which shows more C > A transversions (Fig. 2A). This evidence is consistent with our hypothesis that the mutations identified in our study are more likely to be associated with brain metastasis. The mutational spectra of LC and BM samples from the same individual were more similar to each other than to those from different patients, implying that different mutational processes were involved in the development of metastasis in different patients (Fig. 2B).
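A mutational spectrum of the kind compared above is commonly summarized as the proportion of SNVs in each of six substitution classes, with purine reference bases folded onto the pyrimidine strand (so that G > A is counted as C > T). The sketch below illustrates this convention; it assumes that the six-class representation was used, which the text implies by reporting C > T and C > A but does not state explicitly. The function names and the toy SNV list are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref, alt):
    """Return the pyrimidine-normalized class of a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Report the change on the opposite strand so the reference is C or T.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(snvs):
    """Proportion of SNVs in each of the six classes, given (ref, alt) pairs."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no SNVs supplied")
    return {cls: counts[cls] / total for cls in CLASSES}


# Hypothetical toy SNVs (not study data).
toy_snvs = [("C", "T"), ("G", "A"), ("G", "T"), ("A", "G")]
toy_spectrum = spectrum(toy_snvs)
print(toy_spectrum)
```

In the toy input, C > T and G > A both fall into the C > T class, G > T is counted as C > A, and A > G is counted as T > C.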
We further analyzed mutational signatures in BM and LC. Signatures 1, 3 and 4, which have been linked to aging, BRCA1/2 mutations and smoking, respectively, were identified as dominant in either BM or LC samples (33) (Fig. 2C). There was no significant difference between the signature levels of LC and BM tissues (Wilcoxon rank sum test, p > 0.05), indicating that changes in mutational signatures occurred before metastasis and therefore may not differ greatly between the two groups.
Clonal Evolution During The Development Of LCBM
Phylogenetic trees give a clear overview of the order of mutation events, allowing the emergence and movement of clones from LC to BM to be tracked (Fig. 3). Phylogenetic trees of the 9 patients showed that mutations on the trunk were probably earlier genetic alterations, whereas mutations on the branches occurred later during tumorigenesis and BM development. Clonal evolution analyses revealed that LC and BM tumors had the same evolutionary process in three patients (P02, P07 and P10), that LC tumors harbored a cluster of LC-private clones in three other patients (P05 and P08), and that BM tumors harbored clones absent from the matched LC tumors in another three patients (P06, P13 and P15), indicating that the mutations in BM-private clones may contribute to metastatic progression.
Overall Survival By Genotype
To identify independent prognostic factors, we performed survival analysis on several candidate factors. Patients with aberrations in 18 genes in LC (Supplementary Fig. 3) and 15 genes in BM (Supplementary Fig. 4) had significantly worse overall survival (OS) than those without these aberrations (p < 0.05). Among these genes, mutation of ERF was significantly associated with survival, a finding confirmed in both TCGA (p = 0.01) (Fig. 4A) and our dataset (p = 0.012) (Fig. 4C). Furthermore, high ERF expression in TCGA was a significant risk factor for overall survival time (HR = 1.46, p < 1.2 × 10⁻²², Fig. 4B). Taken together, our findings reveal an important role for ERF in the prognostic prediction of lung cancer.
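Survival comparisons between patients with and without a given aberration are typically displayed as survival curves. The sketch below implements a Kaplan-Meier product-limit estimator as one plausible way to build such a curve for a single group; the text does not state which estimator or test was used, so this is an assumption. The function name `kaplan_meier` and the follow-up data are hypothetical, and the statistical comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of survival at each distinct event time.

    times:  follow-up durations (any consistent unit).
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) tuples.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data in months (not study data).
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
print(km_curve)
```

In practice one curve would be computed for carriers of the aberration and one for non-carriers, and the two would then be compared with an appropriate test.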
Multivariate analysis demonstrated that gender (p = 2.02 × 10⁻¹¹⁹), smoking status (p = 1.21 × 10⁻²⁶⁹), metastatic tumor size (p = 0), and the ratio of shared mutations in lung and brain cancers (p = 0.019) were significantly associated with overall survival time, whereas no significant association was found for drinking status (p = 0.996), the number of metastatic tumors (p = 0.746), or the number of mutations in the primary tumor (p = 0.840) or metastatic tumor (p = 0.248) (Fig. 4D).