Lung Adenocarcinoma: Next Generation Sequencing and Outcome in Regional Community Oncology Setting

PURPOSE: To analyze lung adenocarcinomas (LUAD) from a geographically unique population of rural Maine by next generation sequencing (NGS), and to correlate mutational findings with clinical features, patient outcomes and published data from other populations. METHODS: 210 consecutive LUADs diagnosed in 2017-2018 were analyzed for 50 oncogene/tumor suppressor gene hot spots by NGS. ALK, ROS-1, RET and MET were assessed by FISH, and PD-L1 by immunohistochemistry. Findings were correlated with age, gender, smoking history, stage, overall survival (OS) and progression-free survival (PFS), and compared to published literature. RESULTS: The cohort included 113 (54%) women and 97 (46%) men, ages 33 to 91 (mean: 67.4 years); 52% were active and 41% former smokers, and 79 (38%) had advanced (stage IV) disease. The most frequently detected mutations included TP53 (47.6%), KRAS (38.1%), EGFR (10%), STK11 (8.6%), BRAF (4.8%), MET (3.8%), ABL-1, ATM, CDKN2A, PIK3CA (all 2.9%), RB-1 and NRAS (2.4%), APC, ERBB4, PTPN11, SMAD4 (all 1.9%), CTNNB1 and ERBB2 (both 1.4%). MET amplification occurred in 3.3%, RET and ALK/ROS-1 rearrangements in 1.4% and 0.5%, and high PD-L1 expression in 35.2%. Treatment included surgery/radiation/adjuvant chemotherapy for stages I-II, definitive chemo/radiation therapy and immunotherapy for stage III, and immunotherapy, chemo-immunotherapy, targeted therapy or palliative radiation for stage IV. At a median follow-up of 26 months (minimum 21 months for surviving patients), OS/PFS were 44.3%/39.5%. Stage, male gender, TP53 mutation and KRAS/STK11 co-mutations correlated with adverse OS. In stages I-II, KRAS/TP53 co-mutation was unfavorable. CONCLUSION: NGS testing in a regional oncology setting identified established prognostic/therapeutic markers, as well as additional molecular features correlating with outcome. Our findings support prognostic stratification of LUAD based on the presence of gene mutations outside of the current NCCN guidelines: TP53, KRAS and STK11.


Introduction
Lung cancer is the second most common cancer and the leading cause of cancer-related death in the U.S. [1]. Non-small cell lung carcinoma (NSCLC) accounts for approximately 85% of cases, with lung adenocarcinoma (LUAD) being the most common subtype, accounting for approximately 40% of NSCLC cases [2]. There is a known urban-rural disparity in lung cancer incidence within the United States, with the most rural counties having an annual lung cancer incidence almost twice that of the largest metropolitan areas [3]. This disparity is largely attributed to a higher prevalence of cigarette smoking, as the increase in lung cancer incidence closely tracks rural smoking rates [4]. Such urban/rural smoking rate differences were found to be most significant in the New England region [5]. The state of Maine, with its quintessential New England rural communities, has rates of lung cancer 30% higher than the national average [6]. The lung cancer incidence in Maine is even higher than would be expected for its smoking rate, suggesting exposure to other risk factors besides smoking, such as residential radon exposure or air pollution [7]. In fact, 12 of 16 Maine counties are classified as zones with predicted average indoor radon levels greater than those recommended as safe by the EPA [8]. Whether such potential factors contribute to the molecular genetic makeup of lung cancer from this region, and whether its mutational profiles differ from those reported in other parts of the country, is unknown, as to date no molecular profiling studies from this geographically unique region have been published.
The recent standardization of next-generation sequencing (NGS) methods has allowed its gradual implementation in molecular pathology laboratories outside of larger academic institutions [9]. This trend provides an opportunity to profile tumors diagnosed locally in more homogeneous rural populations. In addition to markers endorsed by current clinical practice guidelines, NGS provides molecular data that can be utilized for investigational and clinical trial-based treatment purposes [10]. Routine local NGS testing of lung cancer samples with a short turn-around time can improve the efficiency of regional oncology care by providing timely results for genomic-based therapies, avoiding the delay that often results from sending samples to remote reference centers [11]. Our single-center institution and laboratory collects and analyzes samples and treats patients from the entire central and northern Maine region, thus concentrating the vast majority of lung carcinomas from the rural parts of the state. As a result, the cases in our series represent a geographically homogeneous lung cancer patient population from an area with a lung cancer prevalence ranking among the highest in the nation.
Our retrospective cohort consists of consecutive LUAD cases analyzed by NGS, fluorescence in situ hybridization (FISH) and immunohistochemistry (IHC). In addition to the standard markers endorsed by the National Comprehensive Cancer Network (NCCN) [12], our analysis includes a panel of tumor suppressor genes and oncogenes, as well as correlation with clinical features and outcomes. The relative geographic isolation and ethnic uniformity of our patient population provided us with an opportunity to compare the mutational profiles of "rural" LUAD to the data in the literature, generated predominantly by academic centers and reflecting patients from predominantly larger metropolitan areas. Our work also highlights the benefits of "reflexive testing" in community oncology practice. We describe the utilization of a "standing order" for molecular pathology and the integration of upfront sample preparation for molecular testing as part of the routine diagnostic process, and assess their impact on tissue adequacy and turnaround time as important contributors to efficient oncology care in a regional setting.

Methods
Patient selection. 210 consecutive LUAD cases from patients with available clinical follow-up, tested by NGS during the period of January 2017-June 2018, were retrieved from the NGS database at Dahl-Chase Pathology Laboratory and Northern Light Oncology Institute. This work was performed in concordance with an Institutional Review Board approval (19-1-A-001). LUADs arising in the lung, or involving metastatic sites in patients where the primary tumor was unavailable, were utilized for testing. A molecular pathologist confirmed the diagnosis and ensured a minimum of 20% tumor content in each sample. Staging of patients was performed according to the 8th edition of the tumor, node and metastasis (TNM) classification for lung cancer [13].
Reflexive molecular testing utilizing a standing order.
In order to perform molecular testing within a short turnaround time, avoid the need for sending individual test requests to pathology, and preserve tissue, our institution utilizes a "reflexive" molecular standing order.
Updated regularly to ensure it includes all NCCN-endorsed biomarkers and meets the current standards of care, this standing order is approved by the institutional molecular tumor board in conjunction with the hospital medical staff and initiated automatically by the pathologist, as soon as the diagnosis of LUAD is established.

DNA extraction and sample preparation.
A pathologist confirmed the presence of sufficient tumor tissue for NGS analysis by direct microscopic visualization of tissue on glass slides stained with hematoxylin-eosin. Ten 4-5 μm formalin-fixed, paraffin-embedded sections were marked for manual microdissection of genomic DNA. 15 ng of DNA was amplified, fragmented, ligated to adapters, barcoded, clonally amplified onto beads to create DNA The resulting DNA sequencing reads were aligned to the hg19 (GRCh37) reference genome and data analysis was performed using the Ion Torrent Suite (5.8.0) and Golden Helix VarSeq (2.0.2) software. The lower limit of detection of this assay has been established at a 5% variant allele frequency (VAF), with the normal range (mutation not detected) being a VAF <5%, and a VAF ≥5% indicating that a mutation is present. The previously established minimum read depth cut-off was set at 100X coverage with a minimum of 25 variant allele observations, with a 250X read depth minimum for a 5% sensitivity (based on prior validation studies, data not shown). Previously reported false-positive mutations and other potential artifacts due to nonspecific mispriming events were removed according to McCall et al [14].
Genomic alterations were reviewed by a molecular pathologist, tiered according to their clinical relevance, and reported with accompanying NCCN therapeutic guidelines.
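The reporting thresholds described above can be illustrated with a short sketch. It is not the laboratory's pipeline: the `VariantCall` record and `is_reportable` function are hypothetical names, and the assumption that the depth, variant-observation and VAF criteria must all be met at once is ours, since the text does not state how they combine.

```python
from dataclasses import dataclass

# Thresholds quoted in the Methods; combining them with a logical AND
# is an assumption of this sketch, not a statement of the validated assay.
MIN_VAF = 0.05        # lower limit of detection (5% variant allele frequency)
MIN_DEPTH = 100       # minimum read depth (100X)
MIN_ALT_READS = 25    # minimum number of variant allele observations


@dataclass
class VariantCall:
    """Hypothetical minimal record for one candidate variant."""
    gene: str
    depth: int       # total reads covering the position
    alt_reads: int   # reads supporting the variant allele

    @property
    def vaf(self) -> float:
        # Variant allele frequency = variant reads / total reads.
        return self.alt_reads / self.depth if self.depth else 0.0


def is_reportable(call: VariantCall) -> bool:
    """Return True if the call meets all three illustrative thresholds."""
    return (
        call.depth >= MIN_DEPTH
        and call.alt_reads >= MIN_ALT_READS
        and call.vaf >= MIN_VAF
    )


# Made-up example calls, for illustration only.
calls = [
    VariantCall("KRAS", depth=2800, alt_reads=896),  # VAF 32%, well covered
    VariantCall("TP53", depth=400, alt_reads=12),    # VAF 3%, below 5%
    VariantCall("EGFR", depth=80, alt_reads=30),     # depth below 100X
]
reportable = [c.gene for c in calls if is_reportable(c)]
```

In this made-up example only the KRAS call passes. The other two fail on VAF and read depth, respectively.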
Additional testing.
Fluorescence in situ hybridization (FISH) was performed according to an established protocol, using probes for ALK, ROS-1, RET and MET. Immunohistochemistry (IHC) for PD-L1 was performed according to the manufacturer's protocol using the 22C3 antibody (Dako, Carpinteria, CA) and the FDA-approved scoring approach, reporting the percentage of tumor cells with membranous positivity (tumor proportion score; TPS) [15]. MET exon 14 skipping mutation was evaluated via send-out testing, since this test was not available in-house during the study period.

Statistical analysis.
All statistical analyses were performed using SPSS for Windows (v19.0, IBM, Armonk, NY). Comparisons of proportions across different groups were evaluated using Fisher's exact test or the Chi-square test where appropriate. The presence of molecular abnormalities was correlated with patients' age, gender, smoking history (including pack-years) and tumor stage. Progression-free survival (PFS) was measured from the date of diagnosis to tumor recurrence or death, while overall survival (OS) was measured from the date of diagnosis to the date of death. Patients were censored on September 30, 2020 if alive and disease-free for OS and PFS, respectively. A log-rank test was used to compare survival curves in Kaplan-Meier analysis. Cox regression analysis was performed to evaluate the significance and calculate relative risks in a time-dependent multivariate model, with risk expressed as hazard ratios (HR) with 95% confidence intervals (CI). All statistical analyses were two-sided with p-values at a significance level of 0.05.

Results
208 of the 210 patients (99%) were Caucasian, 1 (0.5%) Asian and 1 (0.5%) of Native American (Penobscot) ancestry. 14 patients (6.7%) were never smokers, 109 (51.9%) were active smokers at the time of diagnosis and 87 (41.4%) were former smokers (quit 1 or more years prior to diagnosis). The mean number of pack-years among the smokers was 47 (range: 1-122), with 79 (42.9%) having at least a 50 pack-year history. 131 (62.4%) patients had limited stage disease: 71 (33.8%) stage I, 25 (11.9%) stage II and 35 (16.6%) stage III. 79 (37.6%) patients had advanced (stage IV) disease. Treatment was administered according to the NCCN guidelines [16], dictated by stage and the presence of predictive/therapeutic biomarkers, and included: surgery/radiation plus/minus adjuvant chemotherapy for stages I-II, definitive chemo/radiation therapy followed by immunotherapy for stage III, and immunotherapy, combination chemoimmunotherapy, targeted therapy or palliative radiation for stage IV.
At a median follow-up of 26 months (range: 1-46 months, SD 13.9), with a minimum of 21 months follow-up for surviving patients, the overall survival (OS) and progression-free survival (PFS) were 44.3% and 39.5%, respectively.
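As an illustration of the Kaplan-Meier method named in the statistical analysis, the product-limit estimate can be sketched in a few lines. The study used SPSS; the function below is a hypothetical, minimal re-implementation, and the follow-up times in the example are invented rather than taken from the cohort.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  -- follow-up time for each patient (e.g. months from diagnosis)
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve


# Invented example: six patients, three deaths, three censored.
example_curve = kaplan_meier(
    times=[5, 12, 12, 20, 26, 30],
    events=[1, 1, 0, 1, 0, 0],
)
```

Censored patients contribute to the number at risk up to their last follow-up but never reduce the survival estimate themselves. This is why the minimum follow-up of surviving patients matters when quoting a survival rate at a fixed time point.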

NGS Performance and mutations detected in the studied cases
The tissue used for analysis contained a minimum of 500 cancer cells in the micro-dissected area of 10 consecutive unstained sections, which were obtained during initial sectioning of the tissue for diagnosis. The mean tumor cell content per sample was 40% (range 20-80%). Overall, the variant allelic frequency (VAF) of the mutations detected in this study ranged from 6% to 48% (mean: 32%). Read depths ranged from a minimum of 350X to over 12,000X (mean 2,800X), indicating a satisfactory level of coverage for the regions of interest (ROI).
Table 1.
Most commonly mutated genes (occurring in greater than 5% of cases)
TP53 mutations.
TP53 mutation was the most commonly detected, occurring in 100 (47.6%) cases. The detected mutations were highly diverse; 77% occurred only in a single case in this cohort. The most common recurring mutations affected residue 273 (p.R273L, p.R273C or p.R273H), seen in 7% of the mutated cases, followed by residue 179 (p.H179R or p.H179Y), accounting for 5.5% of cases, 248 (p.R248L or p.R248W; 4.4% of cases), 154 (p.G154V; 3.3% of cases) and 282 (p.R282G; 2.2% of cases). Most missense, frameshift, and nonsense mutations were located in known "hot spots", annotated as functional protein domains for DNA-binding (amino acid residues 100-292) and tetramerization (residues 325-356) [17][18]. Such functional domain mutations were seen in 94 (44.8%) cases. The remaining rare mutations were seen outside of these functional domains, including variants located in the transactivation domain (residues 6-29) or splice site variants. The distribution of TP53 mutations in the studied cases across protein domains is depicted in Fig. 2.
Overall mutational rates per tumor and co-mutation rates of the most commonly mutated genes.
PD-L1 expression and its correlation with molecular genetic abnormalities.

Reflexive molecular standing order experience
Utilizing a reflexive standing order for molecular testing ensured completion of testing within 10 working days after the biopsy/excision date in all of the cases studied (mean: 8.5 days; range 7-10 working days after the biopsy date). Obtaining unstained ribbons of tissue for molecular testing at the time of cutting the initial hematoxylin-eosin sections for diagnosis made it possible to perform molecular studies on tissue from small biopsies and cytology cell blocks. It also avoided "refacing" the paraffin blocks, which is necessary when the blocks need to be cut for molecular testing at a later date. During the study period, only two cases (less than 1%) had no residual tumor tissue at diagnosis and could thus not be included in this cohort (they never progressed to molecular testing). Prior to introducing reflexive testing in 2017, our turn-around time exceeded this by up to 5 working days, and not obtaining sections for molecular testing at diagnosis resulted in an approximately 8% testing failure rate due to lack of residual tissue in the paraffin blocks (data not shown).

Molecular abnormalities, clinical features and survival
Overall survival/progression-free survival and clinical stage.
The overall survival and progression-free survival rates across all stages at the median follow-up of 26 months (with a minimum of 21 months for surviving patients) were 44.3% and 39.5%, respectively.
KRAS/STK11 co-mutation was associated with worse OS (p=0.018; Fig. 4D). This was independent of stage, gender, or TP53 mutation in the above multivariate model. In stage I, as well as in stages I-II, after excluding cases with the favorable prognostic effect of EGFR mutation, KRAS/TP53 co-mutation was predictive of adverse OS (p=0.005 and p=0.02; Fig. 4E).
Mutually exclusive mutations.
EGFR mutations were entirely mutually exclusive with BRAF mutations, as well as nearly mutually exclusive with KRAS and NRAS mutations (in only one case, EGFR mutation co-occurred with mutation in either KRAS or NRAS; both p=0.001).

PD-L1 status and survival.
High PD-L1 expression appeared as an adverse factor for both OS and PFS on univariate analysis (p=0.03 and 0.04, respectively), but this was shown to be due to a correlation with advanced stage, where PD-L1 was more often expressed (p=0.012). A multivariate Cox regression model controlled for age, gender, stage and smoking history showed that PD-L1 status was not independently associated with OS or PFS.
Gene mutations and smoking.
As mentioned earlier, KRAS G12C mutation was more commonly seen in smokers (p=0.04), with G12V being second most common. In non-smokers this order was reversed (Fig. 3B). An inverse association was seen between EGFR mutation and smoking, with 50% of non-smokers harboring an EGFR mutation, in contrast to only 7% of smokers (p<0.001). Among patients with positive smoking history, EGFR mutation occurred in 18% of prior smokers, in contrast to only 2.7% of current smokers (p<0.001), and in 17% of patients with lower than 50 pack-year history, in contrast to only 2.5% of smokers with 50 or greater pack-year history (p=0.001). Considering the positive correlation of male gender with heavy smoking (greater than 50 pack-years; p=0.007) and a female non-smoker status with EGFR mutation (p<0.001), a relative survival advantage of women non-smokers was seen by Kaplan-Meier survival analysis (p=0.05; log rank). However, unlike male gender, such "female non-smoker" status was not an independent survival factor in a multivariate Cox Regression model with stage, TP53 mutation and KRAS/STK11 co-mutation.
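To make the comparison of proportions concrete, the sketch below computes a two-sided Fisher's exact test for a 2x2 table. The function is a hypothetical stand-alone implementation, not the SPSS procedure used in the study. The counts in the example are reconstructed by us from the reported percentages (14 never-smokers with 50% EGFR-mutated, and roughly 7% of the 196 ever-smokers), so they are an assumption rather than source data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Assumed counts reconstructed from the reported percentages:
# never-smokers: 7 EGFR-mutated, 7 wild-type
# ever-smokers: 14 EGFR-mutated, 182 wild-type
p_value = fisher_exact_two_sided(7, 7, 14, 182)
```

With these assumed counts the p-value falls below 0.001, in line with the reported inverse association between EGFR mutation and smoking.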

PD-L1 expression, gene mutations and survival.
High PD-L1 expression occurred significantly more commonly in tumors with mutations in the MAPK pathway (KRAS, BRAF or NRAS; p<0.001) as well as in those with MET gene amplification (p=0.009). A similar trend was seen for tumors with KRAS/TP53 co-mutation, but did not reach significance (p=0.08). In contrast, PD-L1 expression was more commonly absent in tumors with STK11 mutations (p=0.009), even more so in those with KRAS/STK11 co-mutations (p=0.002).

Discussion
In the present study, we report mutational profiles and clinical outcomes of lung adenocarcinomas from patients residing in the rural parts of the state of Maine, a region of the US with a particularly high incidence of this disease. As could be expected from clinical experience and established knowledge [19], clinical stage at presentation was a strong factor determining the outcome in our patients. In addition, several mutations and co-mutations were found to strongly affect survival independently of stage, both in the entire cohort and in select stage subgroups. The 44.3% overall survival rate of our cohort within the studied time frame is comparable to the approximately 40% 2-year survival rate across all stages of LUAD in the national cancer databases [20]. TP53 mutations affecting the DNA-binding and tetramerization domains are known to dysregulate the transactivation of TP53-dependent genes and promote tumorigenesis [21][22]. TP53 mutations in these two domains occurred in 41% of our patients and showed a strong, stage-independent prognostic effect on OS with a hazard ratio of 1.7. The stage-independent prognostic significance of these TP53 mutations was further demonstrated in a separate analysis of stage IV tumors, showing a similar hazard ratio for adverse OS. A high prevalence of TP53 mutations in our lung cancer population was not unexpected. TP53 mutation has been identified in a number of prior studies as consistently the most frequent mutation in LUAD, with a prevalence reported around 45%, ranging from the mid-30% range to 50% of cases [23][24]. The adverse prognostic effect of TP53 mutations has been reported less consistently among studies, but has likewise not been an infrequent finding [25]. The TP53 mutations identified in our tumors were very diverse, with the same mutation only rarely seen in more than a single patient.
This is also a common occurrence, as there is typically no consistent pattern in the spectrum of TP53 mutations, even in cancers of the same type [26]. Six point mutations (R175, G245, R248, R249, R273 and R282), seen in 14% of our cases, typically account for up to 30% of all mutations in human cancers [22]. Since they are located in the DNA-binding region, these mutations disrupt the tumor suppressive activity of the p53 protein.
This finding is similar to that of Zahra et al, who utilized the same NGS platform used in our study and detected 11% of TP53 mutations in their lung cancers in these "hotspots" [23].
TP53 mutations at seven residues (R157, R158, R179, G245, R248, R249 and R273) known to be susceptible to damage caused by benzopyrene diol epoxide, a potent cigarette smoke carcinogen, occurred in 25% of our mutated patients (all smokers), similar to what has been reported in lung cancer patients previously [27]. A TP53 transversion mutation at residue 249, which is rarely seen outside of radon exposure, occurred in only one of our patients, suggesting that radon exposure may not have played a major role in our cohort, although other studies have questioned the sensitivity of this mutation as a marker of radon exposure in LUAD [28].
The spectrum of KRAS mutations detected in our study was similar to that of previously reported lung cancer studies [29][30]. The most common mutations affected codon 12 (predominantly G12C and G12V), followed by codons 13, 61 and rarely others. The most common mutation types differed between smokers and non-smokers, similar to what was reported previously by Yu et al [31].
While KRAS mutations by themselves did not show an impact on prognosis, when co-occurring with other mutations KRAS had a very significant prognostic impact. KRAS/STK11 co-mutation identified an adverse group with 18% OS at 40 months, which contrasted with a 40% OS rate seen in patients without such co-mutation over the same timeframe. Similarly, KRAS/TP53 co-mutations delineated an adverse disease subset within the low-stage (I-II) cancers, when excluding the prognostically positive effect of EGFR mutation [32]. This difference was marked, with 65% of low-stage patients without such co-mutation being alive at 40 months, in contrast to only 30% of those with KRAS/TP53 co-mutation. In other words, less than half of the patients with these co-mutations achieved the overall survival seen in patients with the same stage of disease who lacked these adverse mutational signatures. Prior reports have suggested the existence of such molecular LUAD subsets, defined by the presence of co-existing mutations in the tumor, such as the wild-type group, isolated TP53 group, KRAS group, KRAS/TP53 group and KRAS/STK11 group [33]. Others, however, could not confirm the effect of such genomic co-alterations on survival [34].
Among the remaining mutations detected most frequently in our patients, of note is the relatively low prevalence of EGFR mutations compared to the literature, which typically reports it to range from 17% to 52% [35]. These cumulative averages reflect data influenced by extrapolations from East Asian data, as well as data derived from populations with a larger contribution of women and non-smokers [36]. In contrast, lung cancer patient populations more similar in composition to that of rural Maine (majority Caucasian of predominantly northern European/often German or Scandinavian ancestry, with a large proportion of smokers) have shown a prevalence of EGFR mutations closer to 10%, similar to our study [37][38]. In fact, even in The Cancer Genome Atlas (TCGA), the overall rate of EGFR mutations is only 14% [39].
The prevalence of other mutations in our cohort was similar to previous reports [40]. We observed that tumors with mutations affecting the MAPK pathway (KRAS, BRAF and NRAS), as well as those with MET gene amplification, more frequently show high expression of PD-L1, with a similar trend for KRAS/TP53 co-mutated tumors. In contrast, tumors with STK11 mutations, as well as those with KRAS/STK11 co-mutations, were significantly less commonly PD-L1 positive in our experience. This supports the notion that the previously proposed co-mutation subtypes not only exist and vary in terms of biologic aggressiveness, but may also differ in their responsiveness to immunotherapy, as has been recently suggested [41][42].
From a methodological standpoint, utilizing a reflexive standing order for local molecular testing, initiated by the pathologist at the time the diagnosis is established, resulted in testing being completed within a short enough turn-around time for the results to be available at the patient's first encounter with the oncologist (generally 2 weeks after diagnosis). Such completion times are out of the reach of most send-out tests. In addition, the reflexive procedure, which includes obtaining sections for molecular testing during the initial sectioning, prevents the loss of material typically associated with later procurement of sections for any subsequently requested testing. This in turn leads to a very low unsatisfactory rate for molecular testing (generally within the single percentage range in our experience across many different tumor types).
In conclusion, our analysis of oncogenes and tumor suppressor genes in LUAD showed the distribution of mutations in rural Maine tumors to be similar to what has been reported from other regions of related ancestry, supporting the recently proposed subclassification of LUAD into different co-mutational subsets. Importantly, in this study we show that such subclassification can be accomplished using a smaller gene panel in a regional oncology care setting, with the only prerequisite being sufficient coverage of the TP53, KRAS and STK11 genes. In particular, our study adds to the recently emerged data emphasizing the importance of detecting therapeutically and prognostically significant mutations in early stage tumors [43]. The recent FDA approval of targeted therapies such as EGFR inhibitors for lung carcinomas of stages IB-IIIA provides strong support for mutational testing of early stage lung tumors [44], and may constitute a tipping point for laboratories and hospitals to adopt reflexive molecular testing strategies similar to those described herein. Our results contribute to the so far elusive efforts to develop risk stratification models for early stage lung cancer utilizing their molecular characteristics [45]. In comparison to previously published studies derived predominantly from academic institutions or large commercial laboratories, we show that NGS performed in a regional medical center setting yields molecular genetic information of equal value for patient risk stratification and management. While utilizing a smaller panel can prove to be a technical advantage in a regional community oncology setting due to its lower complexity, it also represents a limitation of our study. For example, we could not assess the recently reported KEAP1/NFE2L2 pathway alterations, predicted to associate with therapy resistance and rapid progression [46].
Despite this limitation, the results we report complement the predominantly more urban data in the literature by providing region-specific mutational profiles from a geographically unique lung cancer population with high disease prevalence and a strong known risk factor, addressing what is often referred to as an "urban-rural disparity" in oncology. Identification of high-risk groups amenable to targeted intervention available in a regional setting (such as, most recently, targeting the specific G12C KRAS mutation [47]) is essential for achieving sustainable improvement in rural cancer survival, which is especially true for lung cancer [48]. Our future work utilizing a larger mutation panel will allow further expansion of the current profiles and continue our efforts of mapping the molecular lung cancer landscape in our region.

Declarations
Funding:
Figure 1. Molecular alterations detected in the lung adenocarcinomas in this study. Mutational frequency depicted for each gene as percentage of cases with mutation.