Predictive performance of PD-L1 tumor proportion score for nivolumab response evaluated using archived specimens in patients with non-small cell lung cancer experiencing a postoperative recurrence

Postoperative recurrence in patients with non-small-cell lung carcinoma (NSCLC) is a major issue for life expectancy. Programmed cell death ligand 1 (PD-L1) expression on tumor cells is important in the prognosis of NSCLC. However, the predictive ability of PD-L1 evaluated with archived surgical specimens for nivolumab treatment has remained unknown. This study aimed to analyze the predictive ability of the PD-L1 tumor proportion score (TPS) for nivolumab response in patients with NSCLC experiencing a postoperative recurrence using archived surgical specimens. This retrospective cohort study involved patients with advanced NSCLC (N = 78) treated with nivolumab between April 2016 and September 2018. They were categorized into postoperative recurrence (N = 24) and non-postoperative recurrence (N = 54) groups. The predictive ability of PD-L1 TPS for response to nivolumab treatment in these two groups was determined using receiver operating characteristic (ROC) analysis. Additionally, we evaluated the predictive ability of PD-L1 TPS using rebiopsy specimens collected from the recurrent lesions in six patients of the postoperative recurrence group. PD-L1 TPS exhibited lower predictive performance in the postoperative recurrence group (area under the curve [AUC] = 0.58) than in the non-postoperative recurrence group (AUC = 0.81). Furthermore, PD-L1 TPS was significantly increased in rebiopsy specimens. The predictive performance of PD-L1 TPS in these specimens was higher (AUC = 0.90) than that in the archived surgical specimens. The study revealed that archived surgical specimens are inadequate for assessing the predictive ability of PD-L1 for nivolumab response, whereas rebiopsy specimens are adequate.

obtained from the surgery are used for biomarker analysis. Importantly, in previous clinical trials, archived specimens stored for more than six months were not permitted [4,5]. Therefore, the prognostic accuracy of PD-L1 tumor proportion score (TPS) evaluated using long-stored specimens, such as archived surgical specimens, has not been explored.
The aim of this study was to investigate the predictive ability of PD-L1 TPS using archived specimens in patients with NSCLC experiencing postoperative recurrence. The findings can assist in managing advanced NSCLC in patients experiencing postoperative recurrence.

Patients, specimens, and assessments
We retrospectively analyzed patients with advanced NSCLC previously treated with nivolumab monotherapy at Kobe City Medical Center General Hospital, Japan, between April 2016 and September 2018. A total of 116 patients were evaluated, of whom 78 were included and divided into postoperative recurrence (N = 24) and non-postoperative recurrence (N = 54) groups (Supplementary Fig. 1 available in Online Resource 1). In the postoperative recurrence group, PD-L1 TPS was evaluated with archived surgical specimens after recurrence was found. In some cases (N = 6), rebiopsy from recurrent lesions was conducted, and the reevaluation of PD-L1 TPS was added for this study. In the non-postoperative recurrence group, where the definitive surgery was not essential, we investigated the specimens obtained by biopsy or surgical resection before nivolumab therapy. Samples were considered adequate for interpretation if 100 or more tumor cells (TCs) were present in hematoxylin and eosin (H&E) stained sections. All patients were classified based on their clinical stage according to the eighth edition of the TNM classification [6]. Progression-free survival (PFS) was defined as the period from the day of commencement of nivolumab treatment until progression of lung cancer or death due to any cause or the end of the follow-up period. The data cutoff date was November 1, 2019. All patients underwent computed tomography within 30 days before initiating nivolumab treatment. Clinical tumor assessment was performed using Response Evaluation Criteria in Solid Tumors (RECIST) version 1.1 [7]. The tumor response was assessed every 6 to 12 weeks by performing a computed tomography scan.
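The PFS definition above can be illustrated with a minimal sketch. It assumes that patients without progression or death by the data cutoff are treated as censored at that date, and it converts days to months using an average month length of 30.4375 days; the function name `pfs_months` and the month conversion are illustrative assumptions, not details reported in the study.

```python
from datetime import date
from typing import Optional, Tuple

# Data cutoff date stated in the text.
DATA_CUTOFF = date(2019, 11, 1)

# Assumption: average month length used only for illustration.
DAYS_PER_MONTH = 30.4375


def pfs_months(
    treatment_start: date,
    event_date: Optional[date] = None,
    cutoff: date = DATA_CUTOFF,
) -> Tuple[float, bool]:
    """Return (PFS in months, event observed?) for one patient.

    event_date is the date of progression or death from any cause.
    If it is missing or falls after the cutoff, the patient is
    censored at the cutoff date (hypothetical handling).
    """
    observed = event_date is not None and event_date <= cutoff
    end = event_date if observed else cutoff
    days = (end - treatment_start).days
    if days < 0:
        raise ValueError("end of observation precedes treatment start")
    return days / DAYS_PER_MONTH, observed


# Hypothetical patient: progression about two months after starting nivolumab.
months, observed = pfs_months(date(2019, 1, 1), date(2019, 3, 1))
print(round(months, 2), observed)
```

The event flag returned alongside the duration is what a Kaplan-Meier estimate needs in order to distinguish observed progression from censored follow-up.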
This study was approved by the Kobe City Medical Center General Hospital Ethics Committee (zn180632). All tumor specimens for the pathological analyses were collected after obtaining informed consent from all patients. However, informed consent for the use of clinical information of the patients was waived because of the retrospective study design.

Tissue processing and IHC
PD-L1 expression in NSCLC specimens was analyzed by IHC staining using the 22C3 assay kit (Dako, North America). Four-micron-thick sections were cut from formalin-fixed paraffin-embedded (FFPE) tumor blocks and then routinely deparaffinized and rehydrated. The cut sections were stained for 10 days with anti-PD-L1 22C3 mouse monoclonal primary antibodies, following the procedure described in the PD-L1 IHC 22C3 pharmDx assay kit [8,9].

Pathological analyses
The samples were anonymized, and experienced pathologists scored all immunostained slides according to the scoring algorithms. Tumor proportion scores (TPSs) of PD-L1 in TCs were expressed as the percentage of PD-L1-positive TCs in the overall tumor sections and estimated in increments of 5%, except for 1% positivity of TCs. The cutoffs for PD-L1 expression were set to 1% and 50%.
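As a minimal sketch, the two cutoffs can be expressed as a three-way classification. The category labels follow the wording used in the Results (strongly positive, TPS ≥ 50%; weakly positive, TPS 1-49%; negative, TPS < 1%); the function name `tps_category` is hypothetical.

```python
def tps_category(tps_percent: float) -> str:
    """Classify a PD-L1 tumor proportion score (in percent) by the 1% and 50% cutoffs."""
    if not 0 <= tps_percent <= 100:
        raise ValueError("TPS must be between 0 and 100 percent")
    if tps_percent >= 50:
        return "strongly positive"
    if tps_percent >= 1:
        return "weakly positive"
    return "negative"


# Scores are estimated in 5% increments, with 1% as the only finer step.
for score in (0, 1, 5, 45, 50, 100):
    print(score, tps_category(score))
```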

Statistical analyses
Continuous variables were analyzed using the Student's t-test. Dichotomous variables were analyzed using Fisher's exact test. The Kaplan-Meier method was used to estimate PFS, and the groups were compared using the log-rank test. The performance of PD-L1 TPS in predicting response to nivolumab in each group was determined using receiver operating characteristic (ROC) analysis [10]. A test with an area under the curve (AUC) of 0.9 or higher was considered highly accurate, 0.7-0.9 was considered moderately accurate, and 0.5-0.7 was considered poorly accurate. The results are expressed as hazard ratios (HRs) with 95% confidence intervals (CIs). The difference in PD-L1 TPS between the surgical and rebiopsy specimens was estimated, and the 95% CI for the difference was also calculated. A two-tailed P-value of < 0.05 indicated statistical significance. All statistical analyses were performed using the JMP 16 software (SAS Institute, Cary, NC, USA).
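To make the AUC thresholds concrete, the sketch below computes the AUC as the probability that a randomly chosen responder has a higher PD-L1 TPS than a randomly chosen non-responder (ties counted as one half), which is equivalent to the area under the empirical ROC curve. This is not the JMP procedure used in the study; the function names and the example data are hypothetical, and values falling exactly on a boundary (0.7 or 0.9) are assigned to the higher band as an assumption.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], responded: Sequence[bool]) -> float:
    """AUC via pairwise comparison of responders and non-responders."""
    pos = [s for s, r in zip(scores, responded) if r]
    neg = [s for s, r in zip(scores, responded) if not r]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    # Each responder/non-responder pair scores 1 (win), 0.5 (tie), or 0 (loss).
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy_label(auc: float) -> str:
    """Map an AUC to the accuracy bands described in the text."""
    if auc >= 0.9:
        return "highly accurate"
    if auc >= 0.7:
        return "moderately accurate"
    if auc >= 0.5:
        return "poorly accurate"
    return "below chance"  # not covered by the bands in the text


# Hypothetical TPS values (%) and response flags, for illustration only.
tps = [60, 5, 0, 80, 10, 0]
resp = [True, False, False, True, False, True]
auc = roc_auc(tps, resp)
print(round(auc, 3), accuracy_label(auc))
```

Applied to the AUCs reported in this study, `accuracy_label` returns "moderately accurate" for 0.81, "poorly accurate" for 0.58, and "highly accurate" for 0.90.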
Furthermore, the specimens (N = 78) were dichotomized according to their median storage time (327 days) to analyze the effects of storage time on the efficacy of nivolumab treatment and PD-L1 expression. These two groups [long (N = 46) and short storage (N = 32) groups] were further divided into three subgroups based on the PD-L1 TPS. The PFS of each group was evaluated using the Kaplan-Meier method.
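The storage-time analysis can be sketched as a median split followed by a Kaplan-Meier (product-limit) estimate per subgroup. The code below is a simplified illustration, not the JMP analysis used in the study; assigning specimens whose storage time equals the median to the short group is an assumption, and all names and data are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(storage_days: Sequence[float]) -> List[str]:
    """Label each specimen 'long' or 'short' relative to the median storage time.

    Assumption: values equal to the median are labeled 'short'.
    """
    cut = median(storage_days)
    return ["long" if d > cut else "short" for d in storage_days]


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with an observed event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing this time (events and censored alike).
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical storage times (days) and PFS data (months, event flag).
print(split_by_median([100, 250, 327, 400, 900]))
print(kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False]))
```

Running the estimator separately for each PD-L1 TPS subgroup within the long and short storage groups yields the curves that the log-rank test then compares.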

Patient characteristics
The patient characteristics and comparisons are summarized in Table 1. All patients had stage IIIB or IV NSCLC. The median age was 67.4 years (standard deviation (SD), 9.8), 54 (69%) patients had adenocarcinoma histology, and 68 (87%) patients had an Eastern Cooperative Oncology Group Performance Status (ECOG PS) of 0 or 1. The overall response rate to nivolumab was 25%, and the median PFS was 2.1 (95% CI: 1.6-3.9) months. In the postoperative recurrence group, the interval from surgery or biopsy to PD-L1 evaluation was longer than that in the non-postoperative recurrence group (514 days vs. 233 days, p = 0.0019). Other characteristics did not differ significantly between the two groups. Table 2 shows the organs analyzed for PD-L1 TPS and the distribution of PD-L1 TPS in each group. In the non-postoperative recurrence group, 76% of patients were diagnosed by lung biopsy. Eleven percent of these patients underwent surgery for diagnosis and were treated for advanced lung cancer. The analysis of TPS distributions stratified by the two cutoffs (1% and 50%) demonstrated no significant difference in the frequencies of PD-L1 intensities between the two groups. However, patients with negative PD-L1 were common in the non-postoperative recurrence group. The treatment outcomes showed similar efficacy in both groups.

PD-L1 staining of TCs and comparisons by 1% and 50% cutoffs
The relationships between PD-L1 TPS and response rates (RRs), disease control rates (DCRs), and PFS are summarized in Table 3. In the non-postoperative recurrence group, patients with strongly positive PD-L1 had better RR (67%), DCR (83%), and PFS (7.7 months). In contrast, patients with weakly positive or negative PD-L1 had worse RRs (33% and 8%, respectively), DCRs (53% and 24%, respectively), and PFS (1.6 and 8.5 months, respectively). In the postoperative recurrence group, patients who were strongly positive for PD-L1 had a similar RR (20%) but a lower DCR (40%) and shorter PFS (1.8 months) than patients who were weakly positive or negative (RRs, 17% and 14%; DCRs, 83% and 57%; PFS, 2.2 and 4.0 months, respectively). The Kaplan-Meier curves for each group are shown in Fig. 1. The PFS of patients who were strongly positive for PD-L1 in the non-postoperative recurrence group was significantly longer than that of patients with negative PD-L1 expression. However, in the postoperative recurrence group, patients who were strongly positive for PD-L1 had a numerically shorter PFS than those negative for PD-L1, although the difference was not statistically significant (p = 0.34).
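For readers less familiar with these endpoints, the sketch below shows how RR and DCR are conventionally derived from RECIST best overall responses: RR counts complete and partial responses, and DCR additionally counts stable disease. The text does not spell out these formulas, so this conventional definition is an assumption here; the function name and the example data are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def response_rates(best_responses: Sequence[str]) -> Tuple[float, float]:
    """Return (RR, DCR) from best responses coded as CR, PR, SD, or PD.

    Conventional definitions (assumed): RR = (CR + PR) / N,
    DCR = (CR + PR + SD) / N.
    """
    n = len(best_responses)
    if n == 0:
        raise ValueError("no patients in this subgroup")
    counts = Counter(best_responses)
    responders = counts["CR"] + counts["PR"]
    rr = responders / n
    dcr = (responders + counts["SD"]) / n
    return rr, dcr


# Hypothetical subgroup of five patients, for illustration only.
rr, dcr = response_rates(["PR", "SD", "PD", "PD", "PD"])
print(f"RR = {rr:.0%}, DCR = {dcr:.0%}")
```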

Comparison of predictive performance by ROC analysis
For each group, we conducted ROC analysis to predict the response to nivolumab to compare the predictive performance (Fig. 2). In the non-postoperative recurrence group, PD-L1 TPS showed moderate accuracy (AUC: 0.81), whereas, in the postoperative recurrence group, it showed low accuracy (AUC: 0.58) (Fig. 2a, b).

Analysis of PD-L1 IHC in the sample collected by rebiopsy
Of the 24 patients with postoperative recurrence, 6 patients underwent a biopsy at the time of recurrence. PD-L1 TPS was increased in the rebiopsy samples (p = 0.03; Supplementary Table 1 in Online Resource 2). For instance, 17% (one out of six) of these patients evaluated as weakly positive for PD-L1 using archived specimens were identified as highly positive using the specimen obtained from rebiopsy. Further, ROC analysis of PD-L1 TPS from the rebiopsy specimens for predicting the response to nivolumab revealed high accuracy (AUC = 0.9) (Fig. 2c). Representative IHC data for these specimens are shown in Supplementary Fig. 2, which is available in Online Resource 3.

Comparison of PFS according to the storage time
Analysis of PFS according to specimen storage time revealed no significant difference in PFS among the three PD-L1 TPS subgroups [strongly positive (TPS ≥ 50%; N = 19), weakly positive (TPS 1-49%; N = 26), and negative (TPS < 1%; N = 32)] in the long storage group. Conversely, in the short storage group, PFS was significantly shorter in the negative PD-L1 subgroup than in the strongly and weakly positive PD-L1 subgroups (Supplementary Fig. 3-(a),(b) in Online Resource 4).

Discussion
To the best of our knowledge, this is the first study to demonstrate the predictive ability of PD-L1 TPS using archived specimens for patients with postoperative recurrence. The study demonstrates that specimens obtained by definitive surgery show lower predictive performance for the nivolumab response. The first possible explanation for this finding is that PD-L1 TPS might have been altered by a longer storage time in paraffin sections. In a previous report, when the paraffin storage time was long, the PD-L1 TPS was likely underestimated, possibly due to antigenicity loss [11]. In our cohort, PD-L1 expression in archived surgical specimens was lower than that in specimens obtained by biopsy at recurrence (Supplementary Table 1 in Online Resource 2). Additionally, in the short storage group, the negative PD-L1 TPS subgroup had significantly shorter PFS on nivolumab treatment than the positive subgroups, whereas no significant difference was observed in PFS among the three PD-L1 TPS subgroups in the long storage group (Supplementary Fig. 3-(a),(b) in Online Resource 4). These results indicate that longer storage time may lead to the underestimation of PD-L1 TPS because of antigenicity loss.
The second possible reason could be the PD-L1 heterogeneity among specimens. Several studies have reported the difference between surgical specimens and small biopsies, and some argue that there are substantial differences [12-14]. However, theoretically, the whole section specimens provide more comprehensive information about PD-L1 status than small biopsy specimens; therefore, we speculated that the surgical specimen is not inferior to the small biopsy [15-17]. Another possible explanation for lower predictive ability in archival surgical specimens is the spatiotemporal heterogeneity of PD-L1 expression. Moreover, the immune status of the recurrence lesion differs from that of the primary lesion [18]. For example, approximately 14% disagreement of PD-L1 expression on tumor cells has been reported in specimens collected from paired primary lung cancers and brain metastases (mostly collected six or more months apart). That study identified significant differences between the tumor microenvironment of paired primary lung cancers and brain metastases, which could be attributed to the immunological diversity in metastatic lesions or the difference of PD-L1 status in metastatic lesions. In concordance, the spatial and temporal heterogeneity of the tumor microenvironment might account for the disagreement in PD-L1 TPS between archived surgical and rebiopsy samples in the postoperative recurrence group.
In this study, the PD-L1 TPS was significantly increased when the biopsy was performed at recurrence. Further, ROC analysis demonstrated that samples taken at the diagnosis of recurrence had a high predictive performance (AUC = 0.90). Following the results of the KEYNOTE-024 clinical trial, clinicians generally chose pembrolizumab monotherapy for patients with PD-L1 ≥ 50% [4]. However, in our study, 17% (one out of six) of the patients were classified as weakly positive using archived specimens, although the PD-L1 TPS at rebiopsy was highly positive. A previous study suggested that pembrolizumab monotherapy has limited efficacy in patients with lower positive PD-L1 expression [19]. Collectively, it can be inferred that an inaccurate PD-L1 TPS obtained from archived surgical specimens may misguide therapeutic strategies. Therefore, PD-L1 status should be carefully considered to estimate the treatment efficacy. This study suggests that conducting a biopsy at recurrence, which yields a more accurate PD-L1 TPS, is useful for planning an appropriate treatment strategy.
The present study has several limitations. First, it was a retrospective study with a small number of subjects from a single institution. Furthermore, the imbalance in the background of patients with high and low PD-L1 expression levels cannot be ruled out. Therefore, the influences of the confounding factors on the analysis of the association between PD-L1 expression and the efficacy of immune checkpoint inhibitor therapy cannot be excluded. Nevertheless, it is difficult to adjust these variables using multivariate analysis due to the small sample size. Second, only six patients underwent rebiopsy. Therefore, to ascertain the PD-L1 status change, further large-scale analysis is warranted.
In conclusion, this study demonstrates that archived specimens are inadequate for PD-L1 analysis in patients with postoperative recurrence, whereas PD-L1 TPS evaluated using the biopsy specimens from recurrent sites can be an accurate indicator of nivolumab response. These findings can assist in establishing appropriate management plans for patients experiencing postoperative recurrence.

Table 1 notes. ECOG PS: Eastern Cooperative Oncology Group Performance Status; CI: confidence interval; PD-L1: programmed cell death ligand 1; IQR: interquartile range. (a) Patients who reported never having smoked were classified as never-smokers. Those who had smoked within 1 year of the diagnosis were categorized as current smokers, and the remaining patients were considered former smokers. (b) Comparison between adenocarcinoma and non-adenocarcinoma.