Patient characteristics
The patients in this study comprised 39 males and 33 females, with a mean age of 62.35 years, of whom 61 had ADC, 10 had SqCC, and 1 had combined ADC and SqCC. Stage I disease was identified in 38 patients, stage II in 12 patients, stage III in 10 patients, and stage IV in 6 patients (Table 1).
Genomic alterations
Among the 72 samples, 32 contained driver mutations in well-known cancer genes in NSCLC, such as EGFR (n=26; E709G, T790M, L858R and non-frameshift deletions of exon 19), and PIK3CA (n=4; E542K and G1049R). Besides EGFR and PIK3CA, other known mutations were detected in KRAS (n=4; G12V, G12A, G12D, and Q61H), which have all been reported as driver mutations in lung cancer. In addition, four samples carried known activating mutations in the well-known oncogenes CTNNB1 (n=3; S33F, S37C and S37F) and MET (n=1; R1004X and c.3028+1G>T). Overall, 34 specimens harbored driver mutations in five cancer genes (EGFR, PIK3CA, KRAS, CTNNB1, and MET), which are canonical driver mutations (Table S1). These mutations were mutually exclusive, except for four cases of double mutations (n=2; EGFR and CTNNB1 and n=2; EGFR and PIK3CA). TP53 was the most frequently mutated gene after EGFR (n=18; S95fs, K120X, T125T, W146X, 152_153del, V173A, F212fs, G245C, G245D, R248L, R248W, R273H, R273C, V274D, E286K, and c.673-1G>T) (Table S1). Among the 16 TP53 variants, two were novel (S95fs and F212fs).
Comparisons between Taiwanese and Caucasian patients with NSCLC
To compare the frequencies of driver mutations in NSCLC between Taiwanese and Caucasian patients, we obtained all available lung cancer cases (560 ADC and 489 SqCC) from the TCGA dataset. Notable differences from the TCGA data included the frequencies of mutations in EGFR (36.11% vs. 9.82%, p<0.0001), KRAS (5.56% vs. 15.92%, p=0.0165), and TP53 (25.00% vs. 69.69%, p<0.0001). A full comparison of the frequencies of selected gene alterations between the two cohorts is depicted in Figure 1 and Table S2.
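The cohort comparison above can be illustrated with a small sketch. The text does not state which statistical test produced the reported p-values, so the two-sided Fisher's exact test used here is an assumption, and the resulting p-values need not match the reported ones exactly. The TCGA mutation counts are also assumptions: they are back-calculated by rounding the reported percentages of the 1,049 TCGA cases. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of x mutated cases in the first cohort.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# (mutated in this cohort, cohort size, mutated in TCGA, TCGA size).
# TCGA counts are ASSUMED: rounded back from the reported percentages.
cohorts = {
    "EGFR": (26, 72, 103, 1049),
    "KRAS": (4, 72, 167, 1049),
    "TP53": (18, 72, 731, 1049),
}

p_values = {}
for gene, (mut_a, n_a, mut_b, n_b) in cohorts.items():
    p_values[gene] = fisher_exact_two_sided(mut_a, n_a - mut_a, mut_b, n_b - mut_b)
    print(f"{gene}: {mut_a / n_a:.2%} vs. {mut_b / n_b:.2%}, p = {p_values[gene]:.3g}")
```

An exact test is a cautious choice here because one cohort is small (72 patients) and some cells hold only a few cases; a chi-square test would be a reasonable alternative if that is what the original analysis used.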
Clinically relevant genomic alterations
Based on the latest NSCLC guidelines published by the National Comprehensive Cancer Network, clinically relevant genomic alterations were identified in 34 (47.22%) patients (Table 2). As shown in Table 2, the clinically relevant alterations included those in EGFR (26, 36.11%), ERBB2 (2, 2.78%), KRAS (4, 5.56%), MET (1, 1.38%), and NTRK1 (1, 1.38%).
Among the 26 patients with EGFR mutations, only 8 had an additional TP53 mutation, of whom 1 died of a cause unrelated to NSCLC. Among the 7 remaining patients, 3 had good survival outcomes, and 4 did not. We compared the genetic differences between the patients with good and those with poor survival outcomes. In addition to the EGFR and TP53 mutations, one patient with poor survival harbored a MYC non-frameshift deletion (p.48_48del, rs776629119). One patient with good survival had an AR non-frameshift insertion (p.L57delinsLQQQ, rs4045402) and an FBXW7 non-frameshift deletion (p.117_117del, rs781154022), and another patient with good survival had a CTNNB1 mutation (p.S37F, rs121913403).
Correlations between driver mutations and clinicopathological characteristics
Correlations between the genotypes and clinicopathological characteristics are listed in Table 3. The EGFR mutation rate was higher in patients with ADC than in those with SqCC (41.0% vs. 10.0%), although the difference was of borderline significance (p=0.059). No association was found between the EGFR mutation status and sex, age, or tumor stage of the patients. In contrast, the PIK3CA mutation rate was significantly higher in patients with SqCC than in those with ADC (30.0% vs. 1.6%, p=0.008), and the TP53 mutation rate showed a similar, borderline trend (50.0% vs. 21.3%, p=0.053). No association was found between KRAS or CTNNB1 mutations and any clinicopathological characteristic.
Identification of trunk or branch driver mutations
To determine whether driver genes carry trunk or branch mutations, we identified potential driver mutations among the 299 known cancer driver genes [20]. All variants classified as pathogenic in the ClinVar database were trunk mutations, present in both tumor regions (Table S3). Among the 61 predicted pathogenic variants identified from four patients (15, 11, 16, and 19, respectively), 37 were classified as trunk mutations (11, 4, 7, and 15, respectively) (Figure 2 A, Table S4).
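As an illustration of the trunk/branch distinction, the following minimal sketch treats a variant found in both sampled regions of a tumor as a trunk mutation and a variant found in only one region as a branch mutation, which is our reading of the description above. The function name and the variant labels are hypothetical placeholders; only the per-patient counts in `reported` come from the text.

```python
def classify_trunk_branch(region_a, region_b):
    """Split variants into trunk (shared by both regions) and branch (private to one)."""
    trunk = region_a & region_b   # present in both regions
    branch = region_a ^ region_b  # present in exactly one region
    return trunk, branch


# Hypothetical variant labels for one tumor sampled at two regions.
region_1 = {"GENE1:p.A10V", "GENE2:p.R50X", "GENE3:p.G12D"}
region_2 = {"GENE1:p.A10V", "GENE3:p.G12D", "GENE4:p.L99fs"}

trunk, branch = classify_trunk_branch(region_1, region_2)
print("trunk:", sorted(trunk))
print("branch:", sorted(branch))

# Per-patient counts reported in the text: (pathogenic variants, trunk mutations).
reported = [(15, 11), (11, 4), (16, 7), (19, 15)]
total_variants = sum(total for total, _ in reported)
total_trunk = sum(trunk_count for _, trunk_count in reported)
total_branch = total_variants - total_trunk
print(total_variants, total_trunk, total_branch)
```

The reported counts sum to 61 variants and 37 trunk mutations, which leaves 24 branch mutations across the four patients.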
We further analyzed the variant allele frequencies of the trunk and branch mutations. Generally, in the four paired samples, trunk mutations had higher variant allele frequencies (median: 0.23-0.34%) than branch mutations (median: 0.12-0.15%) (Figure 2 B).
We also analyzed the expression of the driver genes that carried trunk or branch mutations. Gene expression profiles revealed no differences in driver genes harboring trunk or branch mutations between the two different tumor regions of the four paired samples (Figure 2 C).
Intratumoral heterogeneity of 299 driver genes
We determined the intratumoral expression of 299 driver genes, which were derived from 33 cancer types in the PanCancer dataset [20]. We used Spearman’s rank correlation to calculate the gene expression correlations between the two regions sampled from each of the four tumors. For each tumor, the highest correlation coefficient was observed between its own two regions (Figure 3).
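For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranks of the two expression vectors, with tied values sharing their average rank. The sketch below is a plain-Python illustration, not the analysis pipeline used in this study; the function names and the expression values for five genes are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for the same five genes in three samples.
tumor1_region_a = [5.0, 120.0, 33.0, 0.8, 61.0]
tumor1_region_b = [6.5, 98.0, 40.0, 1.1, 70.0]
tumor2_region_a = [50.0, 3.0, 33.0, 12.0, 7.0]

same_tumor = spearman(tumor1_region_a, tumor1_region_b)
different_tumor = spearman(tumor1_region_a, tumor2_region_a)
print(f"same tumor: {same_tumor:.2f}, different tumor: {different_tumor:.2f}")
```

Because only the ordering of the genes matters, the coefficient is insensitive to differences in scale between regions, which makes it a natural choice for comparing expression profiles.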
The numbers of differentially expressed genes with a fold change in expression >4 in the four paired samples were 4 (FGFR2, PRKAR1A, MYC, and MYD88), 1 (MYD88), 4 (KMT2C, GNA11, ALB, and B2M), and 2 (KLF5 and CDKN2A), respectively (Table S5). Most of the genes showed consistent expression (fold change ≤4). There were few differences between the different regions within the same tumor, and we suggest that any differences were due to branch mutations. Thus, the 299-driver gene signature may correctly predict cancer etiology if assessed from a single tumor region.
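The fold-change filter described above can be sketched as follows. The text does not specify how expression was normalized or whether a pseudocount was applied, so the `pseudocount` parameter, the function name, and all gene labels and values below are assumptions made for illustration only.

```python
def genes_exceeding_fold_change(region_a, region_b, threshold=4.0, pseudocount=1.0):
    """Return genes whose expression differs by more than `threshold`-fold
    between two regions, in either direction."""
    flagged = []
    for gene in sorted(region_a.keys() & region_b.keys()):
        # ASSUMPTION: a pseudocount guards against division by zero.
        high = max(region_a[gene], region_b[gene]) + pseudocount
        low = min(region_a[gene], region_b[gene]) + pseudocount
        if high / low > threshold:
            flagged.append(gene)
    return flagged


# Hypothetical expression values for two regions of one tumor.
region_a = {"GENE_A": 99.0, "GENE_B": 9.0, "GENE_C": 30.0, "GENE_D": 39.0}
region_b = {"GENE_A": 19.0, "GENE_B": 49.0, "GENE_C": 24.0, "GENE_D": 9.0}

flagged = genes_exceeding_fold_change(region_a, region_b)
print(flagged)
```

In this toy example GENE_D differs by exactly 4-fold and is therefore not flagged, matching the strict ">4" criterion in the text.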