In-Depth Genomic Characterization of Programmed Cell Death Ligand 1 (PD-L1) Expression in Chinese Lung Adenocarcinoma Patients

Background: Immune checkpoint blockades have revolutionized anticancer treatment for lung cancer, and PD-L1 expression has served as a crucial predictive biomarker for immunotherapy. However, PD-L1 expression has been found to be influenced by extrinsic or intrinsic factors in vitro and to be associated with other biomarkers in vivo.
Methods: A sufficient amount of tumor tissue was taken by biopsy or surgery for PD-L1 expression assay and genomic profiling in Chinese lung adenocarcinoma patients. In-depth retrospective analyses of clinical features, gene alterations, signaling pathways and immune signatures were conducted in the negative group (TPS < 1%), the intermediate group (1% ≤ TPS < 50%), and the high group (TPS ≥ 50%). Clinical responses associated with selected mutations were also evaluated using public databases such as TCGA and MSKCC.
Results: A total of 248 Chinese patients with lung adenocarcinoma were retrospectively identified and included in this study, consisting of the negative group (n=124, 50%), the intermediate group (n=93, 38%), and the high group (n=38, 12%). High tumor mutation burden was significantly associated with high PD-L1 expression. In addition, PD-L1 expression was strongly related to alterations in genes such as PRKDC, KMT2D, ERBB2 and SETD2. Moreover, the DDR, TP53, cell cycle and NOTCH signaling pathways were also clearly related to PD-L1 expression. Furthermore, most patients in the high PD-L1 group were classified as high-grade immune subtypes (C2-C4), showing significantly higher levels of IFN-gamma, CD8+ T-cells, NK cells, NK CD56 dim cells, Th1 cells and Th2 cells (P < 0.0001). Moreover, SETD2 mutation showed a slight positive association with overall survival in the MSKCC cohort (HR 1.92 [95%CI 0.90-4.10], P = 0.085), and the percentage of IFN-gamma was significantly higher in the SETD2 mutant group than in the wild-type subgroup (P This illustrated in-depth genomic profiles and immune signatures of PD-L1 expression might interpret potential molecular mechanisms for immunotherapy in


Background
Lung cancer is the major cause of cancer-related death and has a poor 5-year survival rate; non-small cell lung cancer (NSCLC) accounts for 80% of cases [1]. Platinum-based chemotherapy and targeted therapy against specific driver genes used to be the standard anti-cancer therapies for NSCLC, but resistance to these treatments has become a major challenge [2][3][4][5]. Immune checkpoint blockades (ICBs), including programmed death 1 (PD-1) blockade and cytotoxic T lymphocyte antigen-4 (CTLA-4) blockade, have recently revolutionized the treatment of NSCLC and have emerged as a promising therapeutic strategy for NSCLC patients [6,7].
Because a certain number of patients still do not benefit from ICBs, predictive biomarkers of clinical response to immunotherapy, such as programmed death-ligand 1 (PD-L1) expression, have helped clinicians select responders early and implement therapeutic regimens in a timely manner [6,7]. Some studies have demonstrated that positive PD-L1 expression correlates significantly with an improved response to therapy in NSCLC [8,9]. Based on several clinical trials, the PD-1 inhibitor pembrolizumab has been approved by the FDA as the front-line therapy for advanced lung cancer patients who present with high PD-L1 expression (TPS ≥ 50%) and are EGFR and ALK negative [10], and as the front-line single-agent immunotherapy for metastatic NSCLC patients who have positive PD-L1 expression (TPS ≥ 1%) and have progressed on or after platinum-based chemotherapy [11].
However, other studies showed that low PD-L1 expression in NSCLC (< 10%) cannot predict treatment response [8,9]. Some patients with negative PD-L1 expression, despite high tumor heterogeneity, can also benefit from ICBs [12]. As a result, the prognostic value of this biomarker for ICBs has recently been challenged. Apart from detection methodology and tumor heterogeneity, PD-L1 expression has been found to be influenced by extrinsic or intrinsic factors in NSCLC. For example, alterations involving TP53, KRAS, EGFR, ALK, STK11 and PTEN can affect PD-L1 expression [13][14][15][16]. In addition, activation of oncogenic signaling pathways related to these genes, including the PI3K-AKT-mTOR, JAK-STAT and KRAS-ERK pathways, can also induce PD-L1 expression in NSCLC and in other cancer types [17][18][19]. Additionally, other biomarkers, such as tumor mutation burden (TMB) and tumor-infiltrating lymphocytes (TILs), have been validated for predicting the efficacy of ICBs [20,21].
With advances in next-generation sequencing (NGS) techniques, we conducted an in-depth retrospective analysis to characterize the factors associated with PD-L1 expression in Chinese lung adenocarcinoma patients, which might help in selecting the optimal treatment for patients with lung adenocarcinoma and in illustrating potential molecular mechanisms of therapeutic response to immunotherapy in NSCLC.

Sample collection
Data from patients with lung adenocarcinoma examined using an NGS panel (YuceOne™ Plus, Yucebio, China) were collected from January 2019 to May 2020. Basic demographic data and pathological diagnoses were verified against each patient's medical record. A sufficient amount of formalin-fixed paraffin-embedded (FFPE) tumor tissue or fresh tissue was taken from each patient by biopsy or surgery for PD-L1 expression assay and genomic profiling. The study was approved by the institutional review boards at the local sites. Written informed consent was obtained from each patient.

Next generation sequencing and mutation analysis
Genomic profiling was performed on tumor tissue and matched peripheral blood samples. Genomic DNA was extracted from tumor specimens and blood using the GeneRead DNA FFPE Kit (Qiagen) and the Qiagen DNA blood mini kit (Qiagen), respectively. The extracted DNA was then amplified, purified, and analyzed using the NGS panel (YuceOne™ Plus, Yucebio, China).
Sequencing reads with an N rate > 10% and/or with > 10% of bases having a quality score < 20 were filtered out using SOAPnuke (Version 1.5.6). Somatic single nucleotide variants (SNVs) and insertions and deletions (InDels) were detected using VarScan (Version 2.4), and an in-house method was then applied to filter out possible false-positive mutations. SnpEff (Version 4.3) was then used for functional annotation of the mutations detected in the tumor sample. Tumor mutation burden (TMB) was calculated from nonsilent somatic mutations, including coding base substitutions and indels.
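The read-level quality criteria above can be illustrated with a minimal sketch. This is not SOAPnuke's implementation; the function names, the Phred+33 quality encoding, and the treatment of values exactly at the 10% boundary as passing are assumptions made for illustration only.

```python
def phred33_to_quals(qual_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(c) - 33 for c in qual_string]


def passes_read_filter(seq, quals, max_n_rate=0.10, min_q=20, max_low_q_rate=0.10):
    """Return True if a read is kept under the two criteria in the text.

    A read is discarded when its fraction of N bases exceeds max_n_rate,
    or when the fraction of bases with quality below min_q exceeds
    max_low_q_rate. Hypothetical helper, not part of SOAPnuke.
    """
    if not seq or len(seq) != len(quals):
        return False  # malformed record: treat as failing
    n_rate = seq.upper().count("N") / len(seq)
    low_q_rate = sum(q < min_q for q in quals) / len(quals)
    return n_rate <= max_n_rate and low_q_rate <= max_low_q_rate


# Example: a 20-base read with three N bases (15%) is discarded.
good = passes_read_filter("ACGT" * 5, phred33_to_quals("I" * 20))
bad = passes_read_filter("NNN" + "ACGTACGTACGTACGTA", phred33_to_quals("I" * 20))
```

In practice the filtering is done by the dedicated tool on whole FASTQ files; the sketch only makes the two thresholds explicit.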
HLA typing of tumor and matched control samples was performed with OptiType (Version 1.3.2). Loss of heterozygosity (LOH) at the HLA loci was detected with LOHHLA [22]. Neoantigen prediction was performed as previously described [23]. Tumor neoantigen burden (TNB) was measured as the number of mutations per megabase that could generate neoantigens.

Copy number variations analysis
Somatic copy number alteration (SCNA) analysis was performed using Allele-Specific Copy number Analysis of Tumours (ASCAT) with default parameters and the FACETS algorithm. GISTIC2.0 was then used to identify significant driver somatic CNVs by evaluating the frequencies and amplitudes of observed events. Chromosomal instability (CIN) was estimated using the weighted chromosomal instability (wCIN) score [24]. The score was computed as follows: the ploidy of the sample was generated by the ASCAT algorithm. For each of the 22 autosomal chromosomes, the percentage of gained and lost genomic material was calculated relative to the ploidy of the sample. The wCIN score of a sample was defined as the average of this percentage value over the 22 autosomal chromosomes.
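The wCIN computation described above can be sketched as follows. This is an illustrative reading of the description, not the authors' code: the segment tuple layout, the comparison of each segment's total copy number with the ploidy rounded to the nearest integer, and averaging only over autosomes that have segment data are all assumptions.

```python
from collections import defaultdict

AUTOSOMES = [str(i) for i in range(1, 23)]


def wcin_score(segments, ploidy):
    """Average, over autosomes, of the percentage of gained or lost material.

    segments: iterable of (chrom, start, end, total_copy_number) tuples
              (hypothetical layout, e.g. derived from ASCAT output).
    ploidy:   sample ploidy; a segment counts as gained or lost when its
              copy number differs from the rounded ploidy (assumption).
    """
    baseline = round(ploidy)
    altered = defaultdict(int)  # altered bases per chromosome
    covered = defaultdict(int)  # bases with a copy number call per chromosome
    for chrom, start, end, copy_number in segments:
        chrom = str(chrom)
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        if chrom not in AUTOSOMES or end <= start:
            continue  # skip sex chromosomes and empty segments
        length = end - start
        covered[chrom] += length
        if copy_number != baseline:
            altered[chrom] += length
    percentages = [
        100.0 * altered[c] / covered[c] for c in AUTOSOMES if covered[c] > 0
    ]
    if not percentages:
        return float("nan")
    return sum(percentages) / len(percentages)


# Toy example: half of chr1 is gained, chr2 is unaltered.
toy_segments = [
    ("chr1", 0, 100, 2),
    ("chr1", 100, 200, 3),
    ("chr2", 0, 100, 2),
    ("chrX", 0, 100, 1),
]
toy_score = wcin_score(toy_segments, ploidy=2.1)
```

Averaging per-chromosome percentages, rather than pooling all altered bases, keeps large chromosomes from dominating the score, which is the point of the weighting.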

Pathways and immune signatures analysis
Genes for pathway analysis were curated by intersecting previously reported gene lists [25,26] with the genes covered by the YuceOne™ Plus panel. The genes of the IFN-gamma signature and of infiltrating immune cells were curated using previously described gene sets [27,28]. The immune signature scores were calculated using the ssGSEA method implemented in the R package GSVA [29].
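To make the idea behind a single-sample enrichment score concrete, the following is a simplified ssGSEA-style sketch for one sample and one gene set. It is not the GSVA implementation used in the study: the rank-based weights, the exponent of 0.25, and the absence of any normalization across samples are assumptions, and the gene names are placeholders.

```python
def ssgsea_style_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one gene set.

    expression: dict mapping gene name to expression value in ONE sample.
    gene_set:   iterable of gene names (for example, a signature gene list).
    Genes are ordered from highest to lowest expression. The score is the
    sum over the ordered list of the difference between the rank-weighted
    cumulative fraction of set genes and the cumulative fraction of
    non-set genes.
    """
    genes = sorted(expression, key=expression.get, reverse=True)
    n = len(genes)
    members = set(gene_set) & set(genes)
    if not members or len(members) == n:
        raise ValueError("gene set must overlap the sample without covering it")
    # Rank weight: the most expressed gene gets weight n**alpha, the least 1.
    weights = {g: (n - i) ** alpha for i, g in enumerate(genes)}
    hit_total = sum(weights[g] for g in members)
    miss_step = 1.0 / (n - len(members))
    p_hit = 0.0
    p_miss = 0.0
    score = 0.0
    for g in genes:
        if g in members:
            p_hit += weights[g] / hit_total
        else:
            p_miss += miss_step
        score += p_hit - p_miss
    return score


# Placeholder genes: a set made of the top-expressed genes scores positive,
# a set made of the lowest-expressed genes scores negative.
sample = {"A": 10.0, "B": 8.0, "C": 6.0, "D": 4.0, "E": 2.0, "F": 1.0}
top_score = ssgsea_style_score(sample, ["A", "B"])
bottom_score = ssgsea_style_score(sample, ["E", "F"])
```

Because the score depends only on the ordering of genes within a sample, it can be computed per patient without a reference cohort, which is why this family of methods suits signature scoring.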

Statistical analysis
Correlations between PD-L1 expression and clinical parameters were analyzed using Fisher's exact test for categorical variables. Kruskal-Wallis rank sum tests were used to compare continuous variables across multiple groups, and Wilcoxon rank sum tests were used to compare continuous variables between two groups. Q values were calculated by FDR correction for multiple comparisons. Survival analysis was performed using Kaplan-Meier survival plots, and the log-rank test P value was calculated. P < 0.05 or Q < 0.25 was considered statistically significant. All statistical analyses were performed in the R Statistical Computing environment v3.6.1 (http://www.r-project.org).
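As an illustration of how Q values can be derived from a list of P values, the sketch below applies the Benjamini-Hochberg procedure. The text states only that an FDR correction was used, so the choice of Benjamini-Hochberg is an assumption, and the function name and example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (Q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values are monotone in the original P values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical P values from four gene-level comparisons.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = benjamini_hochberg(example_p)
significant = [q < 0.25 for q in example_q]
```

With the Q < 0.25 threshold stated above, all four hypothetical comparisons would be retained, which shows how permissive that cutoff is compared with P < 0.05 alone.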

Major immune signatures linked to PD-L1 expression in lung adenocarcinoma patients from TCGA database
Based on the TCGA-LUAD database, we primarily characterized immune signatures in lung adenocarcinoma patients across the PD-L1 expression groups (Fig. 3A-F) and found that most patients in the high PD-L1 group were classified as high-grade immune subtypes (C2-C4). Compared with the PD-L1 negative group, the PD-L1 high group showed higher levels of IFN-gamma, CD8+ T cells, NK cells, NK CD56 dim cells, Th1 cells and Th2 cells (P < 0.0001) and lower percentages of NK CD56 bright cells and Th17 cells (P < 0.05), suggesting that a high PD-L1 expression level can be a prognostic marker for the clinical response of immune cells during anti-cancer immunotherapy.

Potential therapeutic response correlated to SETD2 mutation from immunotherapy cohort
As shown in Fig. 3G-J, SETD2 mutation showed a slight positive association with overall survival in the MSKCC cohort (HR 1.92 [95%CI 0.90-4.10], P = 0.085), but not with progression-free survival among patients from the Rizvi cohort (HR 1.35 [95%CI 0.70-2.57], P = 0.37). Furthermore, the percentage of IFN-gamma was significantly higher in the SETD2 mutant group than in the wild-type subgroup (P < 0.01), although there was no significant difference in CD8+ T-cells.

Discussion
Taking into consideration the results of clinical trials of single-agent immunotherapy or immunotherapy-based combination therapy in NSCLC, PD-L1 expression can help clinicians choose single-agent immunotherapy for patients with high PD-L1 expression or combined chemoimmunotherapy for patients with low PD-L1 expression. However, because conflicting results continue to emerge, the prognostic value of PD-L1 expression for ICBs has recently been challenged [8,9].
Although variability in immunohistochemical staining antibodies and heterogeneous expression may cause broad inconsistency in this biomarker, PD-L1 expression has also been found to be influenced by extrinsic or intrinsic factors in NSCLC. In this study, we conducted an in-depth retrospective analysis to characterize the factors associated with PD-L1 expression in Chinese lung adenocarcinoma patients. Clinical features such as age and gender were not associated with PD-L1 expression. High TMB was significantly associated with high PD-L1 expression in lung adenocarcinoma (P < 0.05), consistent with findings from multicenter studies [14,15]. Similarly, it was recently reported that 75% of lung cancer patients with mutant PRKDC can respond to immunotherapy, suggesting that PRKDC can be explored as both a predictive biomarker and a therapeutic target for ICBs [31].
Recently, NSCLC patients with driver gene mutations in DDR pathways were reported to present significantly higher TMB values, a higher objective response rate and longer median PFS, indicating improved clinical responses to anti-cancer immunotherapy in NSCLC [32]. We also found that the DDR, TP53, cell cycle and NOTCH pathways were highly correlated with high PD-L1 expression in Chinese lung adenocarcinoma patients (P < 0.05), which might provide more evidence toward a clearer molecular mechanism for PD-L1 expression in Chinese patients with lung adenocarcinoma.
Owing to the complexity of tumor immunity mechanisms, immuno-genomic analyses of tumors and TILs in the tumor microenvironment might be important for uncovering potential factors that promote tumor immunogenicity and immunotherapy efficacy. Patients classified as immune type I are those with high PD-L1 expression and CD8+ TILs in the tumor microenvironment, and can benefit from ICBs [33,34]. These patients are also likely to have increased numbers of somatic driver mutations, tumor neoantigens, PD-L1 amplification, and infection with Epstein-Barr virus [33,34]. Accordingly, we primarily characterized immune signatures associated with PD-L1 expression in patients with lung adenocarcinoma from the TCGA-LUAD database and found that the percentage of high-grade immune subtypes (C2-C4) was higher in the PD-L1 high group than in the PD-L1 low group. Significantly higher levels of IFN-gamma, CD8+ T-cells, NK cells, NK CD56 dim cells, Th1 cells and Th2 cells were found in the PD-L1 high group (P < 0.0001), whereas substantially lower percentages of NK CD56 bright cells and Th17 cells were observed (P < 0.05). We then found that SETD2 mutation showed a slight positive correlation with overall survival in the MSKCC cohort (HR 1.92 [95%CI 0.90-4.10], P = 0.085), and that the percentages of IFN-gamma (P < 0.01) and CD8+ T-cells (P < 0.05) were higher in the SETD2 mutant group than in the wild-type subgroup.
This study has several limitations. First, most of the patients in our study were treatment-naïve, and might therefore present lower PD-L1 expression levels than later-line patients. Second, clinical diagnostic data such as cancer stage and tumor site were not collected, which precluded a detailed analysis of the clinical impact on PD-L1 expression. Third, owing to the lack of clinical survival data such as PFS and OS, we used TCGA data to evaluate the influence on clinical response. However, patients were stratified by TPS in our study but by a quartile method in the TCGA database, which is inconsistent. In addition, we did not exclude patients positive for driver genes such as ALK and EGFR when investigating the roles of gene mutations in PD-L1 expression. These factors may introduce some statistical bias.
In conclusion, our study illustrated a clearer genomic landscape and relevant immune signatures of PD-L1 expression in Chinese lung adenocarcinoma patients, highlighting the potential of specific gene alterations and signaling pathways as both predictive biomarkers and therapeutic targets for ICBs. The study was conducted after the acquisition of written informed consent from the patients.

Consent for publication
All contributing authors agree to the publication of this article.

Availability of data and materials
The datasets used and analyzed during the current study are available upon reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
None

Authors' contributions
K.L., J.L., L.W. and Y.X. contributed to research design, manuscript drafting and methodology conceptualization; J.L. and H.D. contributed to patient recruitment and sample collection; Z.Z., X.L. and D.W. contributed to data analysis and interpretation; Z.Z. and B.C. contributed to study supervision and manuscript revision.