Study design and patient population
This prospective study was reviewed by the Institutional Review Board of Jinshan Hospital, Fudan University (No. JIEC2021S47). All patients provided signed informed consent. All methods were carried out in accordance with relevant guidelines and regulations.
From December 2018 to October 2021, 60 consecutive patients were enrolled in this study. All patients had pathologically confirmed stage IV NSCLC and received first-line immunotherapy with Toripalimab (200 mg per dose), Camrelizumab (200 mg per dose) or Sintilimab (200 mg per dose) intravenously every 21 days for 4-6 cycles until disease progression. The exclusion criteria were: (1) symptoms associated with infection, such as increased leukocytes and neutrophils, or inflammation indicated by lung CT (n=5); (2) failure to complete two cycles of immunotherapy (n=4); (3) loss to follow-up (n=5). After these 14 exclusions, 46 patients were included in the metabolomics analysis. They were divided into a disease control (DC) group (including partial response [PR] and stable disease [SD]) and a progressive disease (PD) group. All serum samples were collected before the first cycle of immunotherapy. Tumor response to immunotherapy was evaluated according to the Response Evaluation Criteria in Solid Tumors, version 1.1 (RECIST v1.1).
Clinical information and serum sample collection
Clinical information, including age, gender, tumor location, sites of metastasis, pathological subtype, disease stage and metabolic syndrome status (clinically diagnosed diabetes, hypertension, or hyperlipidemia), was collected. Fasting peripheral blood (2-4 mL) was collected in a serum separator tube within one week before the first cycle of immunotherapy. The blood was centrifuged at 1,200 g for 10 min at 4°C within 30 min after collection, and the serum was stored at -80°C.
Metabolite extraction and profiling analysis
Serum samples were thawed and vortexed thoroughly on ice. For hydrophilic metabolite extraction, 100 µL of serum was mixed with 400 µL of methanol. The mixture was vortexed for 30 s and incubated for 6-8 hours at -80°C. After centrifugation at 14,000 g at 4°C for 15 min, 250 µL of supernatant was transferred to a fresh tube and lyophilized under vacuum. The dried samples were reconstituted in 50 µL 80% methanol, vortexed for 60 s, and incubated at 4°C for 15 min. The samples were then centrifuged at 12,000 g at 4°C for 30 min. Finally, 60 µL of supernatant was used for GC-MS analysis. For hydrophobic metabolite extraction, 100 µL of serum was mixed with 400 µL of chloroform/methanol (2:1, vol/vol). The mixture was vortexed for 30 s and centrifuged at 10,000 g for 10 min at room temperature. The lower organic phase (200 µL) was then transferred to a fresh tube and lyophilized under vacuum. The dried samples were dissolved in 150 µL of dichloromethane/methanol (2:1, vol/vol), vortexed for 30 s, and centrifuged at 12,000 g for 15 min at room temperature. Finally, 60 µL of supernatant was used for GC-MS analysis[12].
The analyses were performed on a GC-MS system (7890B-5977A, Agilent Technologies, Waldbronn, Germany). Metabolites were identified by comparing their retention times (Rt) with those of pure compounds (Sigma-Aldrich, Shanghai, China). Pooled quality control (QC) samples were analyzed at the beginning of the sample queue, and one QC sample was then inserted after every ten samples.
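The QC scheme described above can be illustrated with a short Python sketch of the injection queue. It is not the authors' acquisition worklist: the function name, the sample identifiers, and the choice of a single leading QC injection are assumptions made for illustration only.

```python
def build_injection_queue(sample_ids, qc_every=10, qc_label="QC"):
    """Return the run order: a leading QC, then one QC after every `qc_every` samples."""
    queue = [qc_label]  # assumption: a single QC injection opens the queue
    for position, sample_id in enumerate(sample_ids, start=1):
        queue.append(sample_id)
        if position % qc_every == 0:
            queue.append(qc_label)  # QC inserted after each block of ten samples
    return queue


# Hypothetical identifiers for the 46 analyzed serum samples.
samples = [f"S{i:02d}" for i in range(1, 47)]
queue = build_injection_queue(samples)
print(len(queue), queue.count("QC"))
```

Under these assumptions, 46 samples give four in-run QC injections (after samples 10, 20, 30 and 40) in addition to the leading one.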
Data processing
Peak extraction and alignment were performed using Seahorse Analytics (Agilent). Features present in at least 80% of the samples in one group were retained. First, each metabolomics feature was normalized to the summed peak area of its sample. Second, metabolomics features with Pearson correlation coefficients greater than 0.9 were identified as redundant: if two features had a Pearson correlation coefficient > 0.9, the feature with the larger mean absolute correlation was removed. Third, a binary least absolute shrinkage and selection operator (LASSO) logistic regression analysis with 10-fold cross-validation was used to select the metabolomics features that make up the metabolomics signature. A metabolomics score for each patient was calculated as a linear combination of the selected features weighted by their respective coefficients derived from linear regression[13].
Multivariate binary logistic regression analysis was used to select significant predictors of response to immunotherapy from the collected clinical information.
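As a rough illustration of the normalization, redundancy filtering and scoring steps, the following Python sketch uses only the standard library. It is a simplified stand-in and not the authors' R code: the pairwise filter only approximates the behavior of a correlation-based filter such as the one in "caret", the LASSO selection step is omitted, and all function names, feature names, values and coefficients are hypothetical.

```python
from math import sqrt


def normalize_to_total(peak_areas):
    """Divide each peak area by the summed peak area of the same sample."""
    total = sum(peak_areas)
    return [area / total for area in peak_areas]


def pearson(x, y):
    """Pearson correlation coefficient (assumes non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_redundant(features, cutoff=0.9):
    """features: dict mapping feature name -> values across samples.

    For each pair with |r| > cutoff, remove the member with the larger
    mean absolute correlation to all other features.
    """
    names = list(features)
    corr = {(a, b): abs(pearson(features[a], features[b]))
            for a in names for b in names if a != b}
    mean_abs = {a: sum(corr[(a, b)] for b in names if b != a) / (len(names) - 1)
                for a in names}
    removed = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in removed or b in removed:
                continue
            if corr[(a, b)] > cutoff:
                removed.add(a if mean_abs[a] >= mean_abs[b] else b)
    return [name for name in names if name not in removed]


def metabolomics_score(values, coefficients, intercept=0.0):
    """Linear combination of selected features and their coefficients."""
    return intercept + sum(v * c for v, c in zip(values, coefficients))


# Hypothetical toy data: f1 and f2 are nearly collinear, f3 is not.
toy = {
    "f1": [1, 2, 3, 4, 5],
    "f2": [2, 4, 6, 8, 10.5],
    "f3": [5, 1, 4, 2, 3],
}
kept = drop_redundant(toy)
print(kept)
```

In this toy example f1 and f2 correlate above the cutoff, and f1 is dropped because its mean absolute correlation with the remaining features is slightly larger.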
Metabolomics nomogram discrimination
The metabolomics nomogram was developed by combining the metabolomics score with independent clinical predictors using multivariable logistic regression. The Akaike Information Criterion (AIC) was used to select the optimal combination: the model with the lowest AIC score was selected as the metabolomics nomogram. A heatmap was generated to analyze the correlation between the metabolomics features and the independent clinical predictors. The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were used to evaluate the accuracy of the metabolomics score, the clinical predictors and the metabolomics nomogram in predicting the efficacy of immunotherapy in NSCLC.
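The two criteria used here can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the predicted probabilities, model names and parameter counts are hypothetical, and coding disease control as 1 and progressive disease as 0 is an assumption. AIC is computed as 2k - 2 ln L from the Bernoulli log-likelihood, and AUC as the fraction of positive/negative pairs ranked correctly (ties count one half).

```python
from math import log


def log_likelihood(probabilities, labels):
    """Bernoulli log-likelihood of binary labels given predicted probabilities."""
    return sum(log(p) if y == 1 else log(1 - p)
               for p, y in zip(probabilities, labels))


def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * loglik


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (assumed coding: 1 = disease control, 0 = progressive disease).
labels = [1, 1, 0, 1, 0]

# Hypothetical candidate models: (predicted probabilities, number of parameters).
candidates = {
    "score_only": ([0.9, 0.8, 0.4, 0.3, 0.2], 2),
    "score_plus_clinical": ([0.9, 0.8, 0.3, 0.6, 0.2], 3),
}

aic_by_model = {name: aic(log_likelihood(probs, labels), k)
                for name, (probs, k) in candidates.items()}
best = min(aic_by_model, key=aic_by_model.get)
print(best, {name: round(auc(probs, labels), 3)
             for name, (probs, _) in candidates.items()})
```

The toy numbers are chosen so that the larger model discriminates better yet has the higher AIC, which shows how the parameter penalty can outweigh a modest gain in fit.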
Statistical analysis
All statistical analyses were performed in R (Version 4.0.2; http://www.r-project.org/). Sample size was estimated from the between-group and within-group standard deviations of the metabolomics score. The Shapiro-Wilk test and the Bartlett test were used to assess normality and homogeneity of variance, respectively. The metabolomics score and clinical parameters were compared by one-way ANOVA followed by Bonferroni correction (when normality and variance homogeneity were met) or by the Mann-Whitney U test (when either assumption was not met). Survival was analyzed with Kaplan-Meier curves and compared using log-rank tests. The ROC curve and AUC were used to evaluate the predictive accuracy of the metabolomics features and clinical predictors. The "caret" package was used for redundant feature elimination; the "glmnet" package was used for binary LASSO logistic regression, linear regression, and multivariate binary logistic regression; the "rms" package was used for nomogram and calibration curve plotting; the "pROC" package was used for AUC calculation; the "FELLA" package was used for enrichment and pathway analyses; and the "pwr" package was used for sample size estimation. All statistical tests were two-sided, and P < .05 was considered statistically significant unless otherwise stated.
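For readers unfamiliar with the Kaplan-Meier estimator mentioned above, a minimal Python sketch of the product-limit calculation follows. It is an illustration only and not the R code used in the study; the function name and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up data (arbitrary time units): two patients are censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients contribute to the number at risk until their censoring time but never reduce the survival estimate themselves, which is why the curve only steps down at observed event times.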