CT radiomics for predicting PD-L1 expression on tumor cells in gastric cancer


Purpose: To explore CT radiomics for predicting programmed cell death ligand 1 (PD-L1) status on tumor cells in gastric cancer (GC).
Methods: From March 2019 to July 2020, 358 patients with GC who underwent surgery at our hospital were retrospectively enrolled in this study. All patients were divided into primary (n=239) and validation (n=119) cohorts based on the time of surgery at a ratio of 3:1. Radiomic features were extracted from regions of interest manually drawn on venous CT images. In addition, preoperative tumor markers of all patients were collected and analyzed. Radiomics-based signatures were built using Support Vector Machine (SVM) and Random Forest (RF) algorithms. Receiver operating characteristic (ROC) curve analysis was performed to assess diagnostic efficiency. Decision curve analysis was used to assess clinical utility.
Results: Numerous radiomic features (all p<0.05) differed significantly between GCs with different PD-L1 status. The model developed with the SVM algorithm achieved the better performance, with AUCs of 0.704 and 0.799 in the primary and validation cohorts, respectively.
Conclusion: It was promising to predict PD-L1 status in GCs noninvasively using CT radiomics. It might help to improve clinical decision making with regard to immunotherapy.


Introduction
Gastric cancer (GC) is a common malignancy and the third leading cause of cancer-associated mortality worldwide [1]. GC is treated in a variety of ways, among which surgery is the preferred treatment. However, for unresectable locally advanced, recurrent, or metastatic GC, palliative management including systemic treatment is recommended [2].
Immunotherapy has been a focus of clinical research since it was first proposed. It includes treatments using monoclonal antibodies, cytokines, etc., which are expected to boost the host anti-tumor response or to increase the immunogenicity and treatment sensitivity of tumor cells [3,4]. Recently, the introduction of immunotherapy, particularly the use of immune checkpoint inhibitors, has improved the prognosis of many cancers [5,6]. Numerous clinical studies have demonstrated that programmed cell death protein-1 (PD-1), a classic immune checkpoint protein, and programmed cell death ligand 1 (PD-L1) are ideal targets for immunotherapy [7,8]. National Comprehensive Cancer Network guidelines for GC also include treatment of PD-L1-positive tumors with the monoclonal antibody pembrolizumab [2]. Moreover, a recent study indicated that nivolumab (a PD-1 inhibitor) in combination with chemotherapy provided superior overall survival and progression-free survival benefit compared with chemotherapy alone for advanced GC [9]. Thus, it is crucial to determine PD-L1 status.
Currently, the detection of PD-L1 is mainly based on endoscopic biopsy or resected specimens.
Endoscopic examination is an invasive procedure with limited samples and possible selection bias, and resected specimens cannot be obtained from unresectable GCs [10]. Hence, a new method is needed to determine PD-L1 status in GC in a simple, non-invasive, and dynamic manner. In recent years, with the increasingly extensive study of PD-L1 in tumors, medical imaging methods have been used to predict the PD-L1 expression level [11,12]. Contrast-enhanced computed tomography (CT) is the conventional imaging modality for the assessment of GC and plays a pivotal role in staging and follow-up [2,13]. Nevertheless, conventional CT images mainly provide simple morphological characteristics rather than complex quantitative parameters.
Radiomics, an emerging image analysis tool, allows quantitative features to be extracted noninvasively from digital medical images, yielding mineable high-dimensional data that can be applied in oncological practice for histological classification, lymph node metastasis, treatment response, and prognosis [14][15][16]. As previous studies revealed, radiomics-based signatures from CT and positron emission tomography (PET)/CT were able to achieve significant and robust individualized estimation of specific PD-L1 status in non-small cell lung cancer (NSCLC) and advanced lung adenocarcinoma, respectively [17,18].
With regard to the evaluation of PD-L1 in GC, it has been reported that GC with positive PD-L1 expression had elevated 18F-fluorodeoxyglucose (18F-FDG) accumulation, and that 18F-FDG PET/CT had the capacity to predict the status of PD-L1 [19]. However, PET/CT involves a higher radiation dose and cost than CT and is not a routine examination. To our limited knowledge, no study has been conducted to predict PD-L1 expression in GC based on CT radiomics. In addition, clinical data, including demographic information and preoperative serum tumor markers, could also be effectively used. Moreover, there is substantial interest in the use of machine learning algorithms for selecting optimal radiomic features from medical images and applying them to tumor evaluation, as well as in the improvement of diagnostic efficacy [20,21]. Therefore, we sought to explore the capability of signatures based on CT radiomics, complemented by clinical data, to predict PD-L1 status in GC.

Patients
We consecutively searched for patients who underwent surgery at our hospital between March 2019 and July 2020, and 491 patients with GC were identified. The inclusion criteria were: (1) postoperative pathological confirmation of GC and (2) availability of tumor markers and abdominal contrast-enhanced CT within 2 weeks prior to surgery. The exclusion criteria were: (1) a history of GC treatment before surgery (n=20); (2) insufficient distention of the stomach (n=40); (3) no definite information on PD-L1 (n=12); (4) poor imaging quality due to respiratory or peristaltic motion (n=16); (5) lesion hardly visible on CT images due to its small size (n=37); and (6) incomplete information on tumor markers (n=8). The flow chart of patient selection is shown in Fig. 1. Our Institutional Review Board approved the current study, which followed the regulations outlined in the Declaration of Helsinki.
A total of 358 patients (male, 258; female, 100; median age, 60 years; age range, 29-97 years) met the criteria. Patients were divided into a primary cohort (n=239) and a validation cohort (n=119) at a ratio of 3:1 according to the time of surgery.
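The patient flow described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python: the counts are taken from the text, while the dictionary labels, the `split_by_surgery_date` helper, and the `surgery_date` field are hypothetical names introduced only for illustration and do not represent the authors' actual procedure.

```python
# Counts reported in the text; the labels are illustrative only.
identified = 491
exclusions = {
    "prior GC treatment": 20,
    "insufficient gastric distention": 40,
    "no definite PD-L1 information": 12,
    "poor image quality (motion)": 16,
    "lesion hardly visible on CT": 37,
    "incomplete tumor markers": 8,
}
excluded = sum(exclusions.values())
enrolled = identified - excluded

# Cohort sizes reported in the text.
primary_n, validation_n = 239, 119


def split_by_surgery_date(patients, n_primary):
    """Hypothetical chronological split: the earliest n_primary surgeries
    form the primary cohort, the remainder the validation cohort."""
    ordered = sorted(patients, key=lambda p: p["surgery_date"])
    return ordered[:n_primary], ordered[n_primary:]


print(f"excluded={excluded}, enrolled={enrolled}")
print(f"primary + validation = {primary_n + validation_n}")
```

Splitting by time of surgery rather than at random means the validation cohort consists of later patients than those used for model development.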

CT image acquisition
CT examinations were performed on 64-row scanners (VCT, Discovery HD 750, GE Healthcare, and uCT 780, United Imaging). All patients were requested to fast for at least 6 h and to drink 600-1000 mL of warm water to distend the stomach before examination. All patients were in the supine position, and the scan covered the upper or entire abdomen. The patients were trained to hold their breath during CT scans.
Following the unenhanced scan, 1.5 mL/kg of iodinated contrast agent (Omnipaque 350 mg I/mL, GE Healthcare) was injected intravenously at a flow rate of 3.0 mL/s using a high-pressure syringe (Medrad Stellant CT Injector System, Medrad Inc.). Images were acquired at 30-40 s and 70 s after initiation of contrast material injection, corresponding to the arterial and venous phases, respectively. CT scan parameters were as follows: tube voltage 100-120 kV, tube current 150-250 mA, slice thickness 5 mm, slice interval 5 mm, field of view 35-50 cm, matrix 512 × 512, rotation time 0.7 s, and pitch 1.0875.
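As a worked illustration of the injection protocol, the sketch below converts the weight-based dose and flow rate stated above into a contrast volume and injection duration. The 70 kg body weight is a hypothetical example value, and the function name is introduced here only for illustration.

```python
DOSE_ML_PER_KG = 1.5       # contrast dose stated in the text
FLOW_RATE_ML_PER_S = 3.0   # injection flow rate stated in the text
VENOUS_DELAY_S = 70        # venous-phase delay after start of injection


def contrast_plan(weight_kg):
    """Return (volume in mL, injection time in s, time from end of
    injection to the venous-phase acquisition in s)."""
    volume_ml = DOSE_ML_PER_KG * weight_kg
    injection_s = volume_ml / FLOW_RATE_ML_PER_S
    return volume_ml, injection_s, VENOUS_DELAY_S - injection_s


# Hypothetical 70 kg patient (not a value from the study).
volume, duration, gap = contrast_plan(70)
print(volume, duration, gap)
```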

Image analysis
Axial venous CT images of all patients were downloaded through a picture archiving and communication system and uploaded into Imaging Biomarker Explorer software. A polygonal region of interest (ROI) was manually drawn along the margin of the tumor on the maximal transverse slice, as illustrated in Fig. 2, carefully avoiding the normal gastric wall tissue and gastric cavity contents. ROI segmentations were performed manually by reader 1 (X.X. with 8 years of experience in abdominal imaging), who was unaware of the clinicopathological information of the patients. The reader was informed of the general location of the tumors (cardia, body, or antrum). To evaluate the interobserver reproducibility, CT images of 35 cases were randomly selected for a second ROI segmentation and feature extraction, performed as above by reader 2 (X.X. with 8 years of experience in abdominal imaging). In total, 744 radiomic features were generated automatically from the ROIs. The detailed explanations and formulas of the radiomic features are provided in the supplementary material.
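Interobserver agreement of this kind is commonly summarised per feature with an intraclass correlation coefficient (ICC). The sketch below computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement), in plain Python. The choice of this ICC form is an assumption, since the text does not state which form was used, and the feature values are invented toy numbers rather than study data.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per case and one column per reader."""
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy values of one feature measured by two readers on five cases.
feature_values = [[10.1, 10.4], [12.0, 11.7], [9.5, 9.9], [14.2, 14.0], [11.1, 11.3]]
icc = icc_2_1(feature_values)
print(f"ICC(2,1) = {icc:.3f}, reproducible = {icc > 0.8}")
```

A feature would then be retained only if its ICC exceeds the chosen cut-off; repeating the calculation for every feature gives the reproducible subset.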

Development and performance of signatures
As depicted in Fig. 2, first, the intraclass correlation coefficient (ICC) was calculated to evaluate the interobserver variability of radiomic feature extraction using the "irr" package (version 0.84). Radiomic features with ICC values >0.8 were regarded as highly reproducible and were initially selected. Second, the Mann-Whitney U test was used to select radiomic features that differed significantly between PD-L1 status groups. The least absolute shrinkage and selection operator (LASSO) was then used for dimension reduction of the radiomic features. The optimal variables were put into our in-house software programmed with the R software package (version 3.5.2: http://www.R-project.org), and the Support Vector Machine (SVM) and Random Forest (RF) algorithms were applied to generate signatures in the primary cohort. The ratio of the training and testing sets was 4:1. In the training phase, the Synthetic Minority Oversampling Technique, a popular data-preprocessing method in machine learning, was applied to handle the class imbalance problem. The models were evaluated by repeated stratified (K=5) cross-validation. The developed models were also applied to the validation cohort. In addition, to evaluate the clinical usefulness of the developed model, a decision curve analysis (DCA) was performed by plotting the net benefit over a range of threshold probabilities in the validation cohort.
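To make the decision curve analysis concrete, the sketch below computes the net benefit at a given threshold probability using the standard definition, net benefit = TP/n - (FP/n) × pt/(1 - pt), together with the "treat all" reference strategy. This is a generic illustration in Python, not the authors' R implementation; the labels and predicted probabilities are invented toy values.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions with probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient as PD-L1 positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 1 = PD-L1 positive, 0 = PD-L1 negative (not study data).
labels = [1, 1, 0, 0]
probs = [0.9, 0.6, 0.4, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(labels, probs, pt), net_benefit_treat_all(labels, pt))
```

Plotting these values across a range of thresholds gives the decision curve; a model is considered useful where its curve lies above both the "treat all" and "treat none" (net benefit of zero) strategies.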

Detection of PD-L1 Expression Status
The PD-L1 expression status was measured by immunohistochemistry testing of paraffin-embedded tumor tissues in our study. The markers cytokeratin and the lymphocyte common antigen were used to differentiate tumor cells. The positivity for PD-L1 was assessed by one pathologist using SP142 abcam staining. The expression of PD-L1 was scored according to the tumor cell / tumor infiltrating lymphocyte proportion, which was defined as the percentage of tumor cells / tumor infiltrating lymphocytes with complete or partial membranous staining at any intensity.

Statistical analysis
The normality of the radiomic feature distributions was evaluated with the Shapiro-Wilk test. Based on the normality test results, differences between groups were analyzed with the Mann-Whitney U test. Interobserver

Qualitative Data Analysis
The demographic data and pathological information of the included patients are summarized in Table 1.

Quantitative Data Analysis
Tumor marker
There were no significant differences in six tumor markers between GCs with different PD-L1 status in the primary cohort (Table 2).

Radiomic feature
After assessing the reproducibility, the data derived from the tumor ROI on venous CT images were reduced to 665 robust features (interobserver ICC values >0.8) for the subsequent analysis. In addition, 83 radiomic features derived from venous CT images differed significantly between GCs with negative and positive PD-L1 expression in the primary cohort. The diagnostic performance of those features ranged from 0.596 to 0.627.
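If the per-feature diagnostic performance above is read as an area under the ROC curve (an assumption, since the metric is not named at this point), it is directly related to the Mann-Whitney U statistic: AUC = U / (n_pos × n_neg). The sketch below shows this relationship in plain Python with invented toy values; it is an illustration rather than the authors' analysis code.

```python
def auc_from_pairs(pos_values, neg_values):
    """AUC of a single feature: the fraction of (positive, negative) pairs in
    which the positive case has the larger value, counting ties as one half.
    The numerator is the Mann-Whitney U statistic of the positive group."""
    u = 0.0
    for a in pos_values:
        for b in neg_values:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u / (len(pos_values) * len(neg_values))


# Toy feature values for PD-L1 positive and negative tumors (not study data).
positive = [3, 4, 5]
negative = [1, 2, 3]
auc = auc_from_pairs(positive, negative)
# A feature that is lower in positive cases gives AUC < 0.5; its
# discriminative ability is then 1 - AUC.
print(round(max(auc, 1 - auc), 3))
```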

Diagnostic Performance of Signatures
All the significantly different variables were placed into LASSO for dimension reduction (Fig. 3). Ultimately, 10 optimal variables (F140Percentile, F145PercentileArea, F2135.7Homogeneity, F2315.7Homogeneity, F245.4InverseVariance, F290.4InverseVariance, F2225.4InverseVariance, F2270.4InverseVariance, F4MedianAbsoluteDeviation, F5MedianAbsoluteDeviation) were retained to develop the models using the SVM and RF algorithms in the primary cohort. The model developed by SVM achieved the better performance, with an AUC of 0.704. The developed model was also applied to the validation cohort, with an AUC of 0.799. The DCA for the developed model based on the validation cohort is plotted in Fig. 4.

Discussion
PD-1 is a classic immune checkpoint protein, and the use of PD-1 targeted therapy (depending on the status of PD-L1) is beneficial for improving the prognosis of many cancers [3]. In the current study, we developed and validated signatures based on CT radiomics, complemented by clinical data, to predict PD-L1 status in GC. Clinical data, including demographic information and six serum tumor markers, were collected. In addition, 744 radiomic features of GCs were extracted from venous CT images. Many radiomic features differed significantly between GCs with different PD-L1 status. After feature selection, 10 robust features were retained, and the SVM-based signature built on them was able to predict the status of PD-L1 in GCs. However, serum tumor markers and demographic data, including age and gender, showed no significant differences between GCs with negative and positive PD-L1 expression in the primary cohort.
Nowadays, CT radiomics is widely used in tumor assessment [14][15][16], yet only a few studies have focused on PD-L1. Our study found that many radiomic features differed with statistical significance between GCs with different PD-L1 expression. After these features were placed into LASSO for dimension reduction, the SVM model achieved AUCs of 0.704 and 0.799 in the primary and validation cohorts, while the AUCs achieved with RF were relatively lower. This indicated that SVM might be a more suitable algorithm for evaluating PD-L1 expression in GC. To our limited knowledge, no previous study has focused on CT radiomics to predict PD-L1 status of GC, but Chen R, et al. applied SUVmax based on PET to investigate PD-L1 expression in GC, with an AUC of 0.822 [19]. Although that AUC was higher than ours, the sample size was small, and the study lacked a validation group for further verification. Besides, for some patients in their study, the specimens for the detection of PD-L1 were obtained only by endoscopic biopsy.
Serum tumor markers, which indirectly indicate tumor burden, are commonly used for malignant tumor diagnosis, efficacy evaluation, and prognosis [22][23][24]. Lang D et al. demonstrated that the early change of serum tumor markers is predictive of progression-free and overall survival in patients with NSCLC treated by PD-1 targeted therapy [25]. In this study, six preoperative tumor markers were also collected and analyzed, though there were no significant differences in tumor markers between GCs with different PD-L1 status in the primary cohort. In our literature search, no study on the prediction of PD-L1 status in GC by serum tumor markers was found. Therefore, applying tumor markers in the prediction of PD-L1 status in GC needs further exploration.
In this study, we developed and validated radiomic models to assess PD-L1 status in GC preoperatively using a relatively large sample, and the results indicated that the models could predict PD-L1 status. Decision curve analysis showed that the radiomic model is clinically useful. Although the serum tumor markers showed no predictive value for PD-L1 status in GC, other clinicopathological information, including hematological parameters and endoscopic biopsy, remains to be effectively utilized.
Certain limitations of our study deserve consideration. First, it was a single-centre retrospective study with a relatively small sample size. Prospective multicentre studies with large samples need to be carried out to further confirm our results. Second, our feature extraction was restricted to lesions on the venous phase, because a prior study [26] suggested that radiomic features derived from venous phase images had superior discrimination of tumor tissue from the adjacent normal gastrointestinal wall. Third, ROIs were delineated on the maximal slice images because the tumor is clearest on that section, which might not reflect the overall information of the tumor.
In conclusion, it was promising to predict PD-L1 status in GCs noninvasively using CT radiomics. It might help to improve clinical decision making with regard to immunotherapy.

Data availability
Data are available upon request to the corresponding author.

(b) LASSO coefficient profiles of the 83 selected features. A coefficient profile plot was generated versus the selected log(λ) value using fivefold cross-validation; ten selected features with nonzero coefficients were retained.