Demographic characteristics of study participants
We enrolled a total of 141 study participants: 76 treatment-naïve patients with early-stage NSCLC, 12 patients with non-malignant pulmonary disease, and 53 healthy individuals. The median age of the early-stage NSCLC patients was 67 years [range 41-81]. Fifty percent of patients were female. All patients had an ECOG performance status of 0-1. Histology showed adenocarcinoma in nearly all patients (98.7%). Pathological stage I disease accounted for 70% of cases. All patients underwent surgery with curative intent, including lobectomy and bilobectomy. The median ages of the non-malignant pulmonary disease patients and healthy controls were 38 years [range 16-68] and 62 years [range 53-64], respectively. Ninety-five percent of non-malignant pulmonary disease patients presented with pulmonary nodules mimicking lung cancer and underwent operative procedures. The majority (70%) of resected lung tissue showed infection and inflammation on histology. Details of patient demographics and disease characteristics at study enrollment are presented in Table 1.
Quality control assessment and storage time effect
We assessed CD3-positive T-cell expression in each sample by flow cytometry, with detection of more than 5,000 viable PBMCs as the minimum acceptance threshold for quality control (QC). QC was passed by 44 healthy controls (83%), 10 non-malignant pulmonary disease patients (83%), and 63 early-stage NSCLC patients (81%). The median storage time from PBMC freezing to flow cytometry detection was 30.6 months [range 2.3-49.2]. Although the storage time of specimens passing QC was shorter in healthy volunteers than in early-stage NSCLC patients, with a median of 10.8 months [range 2.35-10.96] vs. 40.6 months [range 28.71-49.16] (p<0.0001), the proportion passing QC was consistent between groups. Among specimens passing QC, the storage time of PBMCs did not differ significantly between early-stage NSCLC and non-malignant pulmonary disease; the median storage time of PBMCs from non-malignant pulmonary disease patients was 31.7 months [range 29.61-48.45] (p=0.08). We concluded that long-term storage at -196 °C did not affect the quality of protein detection.
Candidate PBMC protein expression discovery
We sought to identify proteins that are potentially differentially expressed on PBMCs in early-stage NSCLC patients compared to healthy controls. The datasets analyzed during the current study are available in the Gene Expression Omnibus repository (available at https://www.ncbi.nlm.nih.gov/geo/) and include GSE12771 (10), GSE13255 (14), GSE20189 (15), and GSE39345 (16). Demographic characteristics of patients from the 4 gene expression datasets are shown in Table S1. Candidate up- and down-regulated genes from these four independent microarray experiments were identified using CU-DREAM (Connection Up- and Down-Regulation Expression Analysis of Microarrays) (17). A total of 1,885 significant genes, with p-values and odds ratios > 1 indicating a strong association between two independent studies, were retrieved (Table S2). Overlapping candidate genes from each gene expression microarray dataset are illustrated in Figure S1. Significantly up-regulated genes that overlapped in at least 3 datasets were analyzed for biological function using the PANTHER (Protein ANalysis THrough Evolutionary Relationships) classification system (available at http://www.pantherdb.org) (18) (Figure S2). Seventy-five genes involved in immune system processes were identified and then mapped to pathology-based protein expression profiles in the Human Protein Atlas (available at https://www.proteinatlas.org) (19) (Table S3). Three significantly up-regulated genes with antibody-specific, immunohistochemistry-based protein expression on tumor-infiltrating lymphocytes (TILs) in tumor specimens, but not in normal lymphoid tissue, were selected: CLEC4A, C5AR1, and NLRP3 (Figure S3). Details of the selected gene ontology terms in Homo sapiens based on PANTHER are summarized in Table S4.
Protein expression on PBMCs in early-stage lung cancer patients and healthy controls
We first explored whether a biomarker could distinguish patients with early-stage lung cancer from healthy controls. The ratio of specific antibody staining to CD3-positive cells was calculated by dividing the percentage of cells stained by the specific antibody by the percentage of CD3-positive cells. The median ratios of C5AR1, CLEC4A and NLRP3 expression in early-stage NSCLC patients compared to healthy volunteers were 0.014 [range 0-0.37] vs. 0.01 [range 0-0.07] (p=0.13), 0.03 [range 0-0.87] vs. 0.02 [range 0-0.13] (p=0.10) and 0.19 [range 0-0.6] vs. 0.09 [range 0.02-0.31] (p<0.0001), respectively (Figure 1a, Table 2). In addition to the number of positively stained cells, flow cytometry also enabled us to determine the fluorescence intensity on the cell surface of each specific antibody-stained cell. The median fluorescence intensities (MFI) of CD3+C5AR1+, CD3+CLEC4A+ and CD3+NLRP3+ cells in early-stage NSCLC patients compared to healthy volunteers were 185 [range 64.2-480] vs. 107.5 [range 27-229] (p<0.0001), 91.2 [range 42.4-2355] vs. 71.2 [range 46.2-103] (p=0.0005) and 1585 [range 478-5224] vs. 758.5 [range 318-1976] (p<0.0001), respectively (Figure 1b, Table 2). The adjusted fluorescence intensity of each protein scaled the MFI by the ratio of specific antibody staining to CD3-positive cells, so that it equals the MFI when this ratio is 1. It was calculated as the ratio of specific antibody staining to CD3-positive cells multiplied by the MFI. The median adjusted fluorescence intensities of CD3+C5AR1+, CD3+CLEC4A+ and CD3+NLRP3+ cells in early-stage NSCLC patients compared to healthy volunteers were 2.80 [range 0.63-1758.44] vs. 1.01 [range 0.03-15.21] (p=0.007), 2.49 [range 0.15-2037.22] vs. 1.39 [range 0.27-12.97] (p=0.02), and 286.79 [range 0.41-3138.03] vs. 64.67 [range 11.09-517.74] (p<0.0001), respectively.
The fluorescence intensity of C5AR1, CLEC4A and NLRP3 expression on CD3-positive cells was significantly higher in early-stage NSCLC patients than in healthy controls. These results suggest that the MFI and the adjusted expression of C5AR1, CLEC4A and NLRP3 on circulating T-lymphocytes change in the presence of cancer. The ratio of NLRP3 expression on CD3 lymphocytes was also significantly higher in early-stage NSCLC patients than in healthy controls. These results show the feasibility of using either the NLRP3 ratio or MFI in CD3-positive cells as a biomarker for distinguishing early-stage NSCLC patients from healthy controls.
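The ratio and adjusted fluorescence intensity described above reduce to two arithmetic steps, which can be sketched as follows. This is a minimal illustration only: the function names and input values are hypothetical and are not taken from the study's analysis pipeline or data.

```python
def staining_ratio(pct_marker_positive, pct_cd3_positive):
    """Ratio of marker-stained cells to CD3-positive cells (both given as percentages)."""
    if pct_cd3_positive <= 0:
        raise ValueError("CD3-positive percentage must be greater than zero")
    return pct_marker_positive / pct_cd3_positive


def adjusted_intensity(ratio, mfi):
    """Adjusted fluorescence intensity: staining ratio multiplied by the MFI."""
    return ratio * mfi


# Made-up example values for illustration, not study data.
ratio = staining_ratio(pct_marker_positive=12.0, pct_cd3_positive=60.0)
adjusted = adjusted_intensity(ratio, mfi=1500.0)
print(f"ratio = {ratio:.2f}, adjusted intensity = {adjusted:.1f}")
```

When the ratio is 1, the adjusted intensity equals the MFI itself, which matches the definition given in the text.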
Protein expression on PBMC as a potential candidate to discriminate early-stage non-small cell lung cancer patients from healthy controls
To assess the clinical use of a potential biomarker for discriminating early-stage NSCLC patients from healthy controls, we calculated cutoffs for the ratio and MFI of the specific antibody staining that was significantly expressed on CD3+ cells. An NLRP3 ratio of more than 0.12 significantly discriminated early-stage NSCLC patients from healthy volunteers, with an area under the ROC curve of 0.72 (p<0.0001). This cutoff provided 60% sensitivity and 75% specificity (Figure 1c). The CD3+C5AR1+, CD3+CLEC4A+ and CD3+NLRP3+ MFI also significantly discriminated between early-stage NSCLC patients and healthy volunteers, with areas under the ROC curve of 0.74 (p<0.0001), 0.69 (p=0.0006) and 0.76 (p<0.0001), respectively (Figure 1d). A CD3+C5AR1+ MFI of more than 139 units distinguished early-stage NSCLC patients from healthy controls with 62% sensitivity and 70% specificity. A CD3+CLEC4A+ MFI of more than 81.5 units distinguished early-stage NSCLC patients from healthy controls with 60% sensitivity and 75% specificity. Lastly, a CD3+NLRP3+ MFI of more than 1054 units provided the best sensitivity, at 71.5%, with 70% specificity for distinguishing early-stage NSCLC patients from healthy controls.
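The way a cutoff translates into sensitivity, specificity, and an area under the ROC curve can be sketched as below. This is an illustrative sketch under stated assumptions, not the statistical procedure used in the study: the function names are hypothetical, the MFI values are made up, a sample is called positive when its value is strictly greater than the cutoff ("more than", as in the text), and the AUC is computed with the standard rank-based (Mann-Whitney) formulation.

```python
def sensitivity_specificity(cases, controls, cutoff):
    """Sensitivity and specificity when values strictly above the cutoff are called positive."""
    true_pos = sum(1 for v in cases if v > cutoff)
    true_neg = sum(1 for v in controls if v <= cutoff)
    return true_pos / len(cases), true_neg / len(controls)


def roc_auc(cases, controls):
    """AUC as the probability that a random case exceeds a random control (ties count 0.5)."""
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up MFI values for illustration only, not study data.
cases = [1800, 950, 1200, 2400, 700, 1500, 1100, 3000, 1300, 880]
controls = [600, 758, 1100, 450, 900, 1250, 520, 830, 980, 700]

sens, spec = sensitivity_specificity(cases, controls, cutoff=1054)
auc = roc_auc(cases, controls)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, AUC = {auc:.2f}")
```

Raising the cutoff trades sensitivity for specificity, which is why each marker in the text is reported with both values at its chosen threshold.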
Protein expression on PBMC in early-stage lung cancer and non-malignant pulmonary disease
An important clinical issue in biomarker discovery is the ability to discriminate non-malignant pulmonary nodules from early-stage NSCLC. We used specific antibodies to stain the candidate proteins on CD3-positive cells in PBMCs from non-malignant pulmonary disease patients and compared them with those from early-stage NSCLC patients. The median ratios of C5AR1, CLEC4A and NLRP3 expression in CD3+ cells of non-malignant pulmonary disease patients were 0.01 [range 0-0.02] (p=0.12), 0.02 [range 0.01-0.08] (p=0.42) and 0.13 [range 0.07-0.22] (p=0.14), respectively. The median fluorescence intensities of CD3+C5AR1+, CD3+CLEC4A+ and CD3+NLRP3+ cells were 177.5 [range 44.90-263] (p=0.23), 84.80 [range 55.20-161] (p=0.75) and 899 [range 354-1888] (p=0.01), respectively. The median adjusted expression of CD3+C5AR1+, CD3+CLEC4A+ and CD3+NLRP3+ was 2.14 [range 0.21-3.58] (p=0.14), 2.30 [range 0.77-12.11] (p=0.51) and 114.39 [range 24.32-413.16] (p=0.07), respectively. The NLRP3 MFI on CD3-positive cells was higher in early-stage NSCLC patients than in non-malignant pulmonary disease patients, whereas C5AR1 and CLEC4A expression levels did not differ (Figure 2a-2c). Despite the limited number of non-malignant pulmonary disease patients in our study, the higher CD3+NLRP3+ MFI in early-stage NSCLC patients compared to non-malignant pulmonary disease patients raises the possibility of using these PBMC protein expressions as a biomarker to discriminate between these two conditions.