Global proteome characterization of LUAD and healthy plasma
Mass spectrometry recorded the mass and signal intensity of peptide ions and of the fragment ions produced by peptide fragmentation. Information at the peptide level is called the primary spectrum, and information at the fragment-ion level is called the secondary spectrum. Secondary MS data were searched with Proteome Discoverer (v2.4.1.15) against the Homo_sapiens_9606_PR_20201214.fasta database. In this study, proteomic analysis was performed on peripheral blood from 10 LUAD patients (7 female and 3 male) and 10 healthy subjects. A total of 1,181,604 secondary mass spectra were collected (Figure 1a), of which 340,834 matched theoretical secondary spectra in the database, a utilization ratio of 28.85%. A total of 11,936 peptide sequences were identified from the matching results, including 10,922 unique peptide sequences. During quantification, one protein could correspond to multiple specific peptides; 2,094 proteins were identified from specific peptides, resulting in 1,772 proteins. Principal component analysis (PCA) was used to show the general pattern of protein abundance within and between the two groups and to assess the similarities and differences between samples. As shown in Figure 1b, the LUAD samples clustered together, whereas the healthy samples were more dispersed. The Pearson correlation coefficients between samples are shown as a heatmap in Figure 1c. The coefficients ranged from 0.916 to 0.947 among LUAD samples and from 0.890 to 0.945 among healthy samples, indicating high correlation within each group.
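The utilization ratio quoted above is the number of matched spectra divided by the number of collected spectra. A minimal arithmetic check, using only the counts reported in this section (the variable names are ours, not part of the original analysis):

```python
# Counts reported in the text above.
total_spectra = 1_181_604    # secondary (MS/MS) spectra collected
matched_spectra = 340_834    # spectra matched to theoretical spectra
total_peptides = 11_936      # identified peptide sequences
unique_peptides = 10_922     # unique peptide sequences

# Utilization ratio: share of collected spectra with a database match.
utilization = matched_spectra / total_spectra

# Share of identified peptide sequences that are unique.
unique_fraction = unique_peptides / total_peptides

print(f"Spectral utilization: {utilization:.2%}")
print(f"Unique peptide fraction: {unique_fraction:.2%}")
```

The first value reproduces the 28.85% stated in the text; the second is a derived figure that the text does not report directly.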
Proteomic features of DEPs in LUAD
To assess the significance of differences in protein expression, we performed a t-test for each protein in the LUAD group compared with the NL group. Proteins with P ≤ 0.05 and a fold change (FC) > 1.5 were considered up-regulated DEPs, and those with P ≤ 0.05 and FC < 1/1.5 were considered down-regulated DEPs; 317 DEPs were finally obtained. Compared with healthy controls, 208 of the quantifiable proteins were significantly up-regulated and 109 were significantly down-regulated in the LUAD group, including MYH6, POSTN, NAP1L1, EXOC2 and NOTCH1. A volcano plot shows the DEPs in LUAD relative to the NL group (Figure 1d), and a heatmap shows the hierarchical clustering of the DEPs (Figure 2a). Gene Ontology (GO) functional classification assigned the 317 DEPs to three categories and 22 terms: 12 biological processes, three cellular components, and seven molecular functions (Figure 2b). These proteins are involved in cellular processes, binding and catalytic activity. Figure 2c shows that about 36% of the DEPs were located in the cytoplasm (115 proteins), 23% in the extracellular space (74 proteins), 13% in mitochondria (42 proteins), and 11% in the nucleus (36 proteins). This suggests that DEPs of LUAD may be secreted to carry out signal transduction, participate in tumor energy metabolism through mitochondria, and regulate gene expression in the nucleus. The Clusters of Orthologous Groups of proteins (COG) categories indicated that the DEPs were related to energy metabolism, carbohydrate metabolism, signal transduction mechanisms, the cytoskeleton, and post-translational modification, protein transport and chaperones (Figure 2d).
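The DEP selection rule can be written as a small classifier. The sketch below treats 1.5 and 1/1.5 as cutoffs on the LUAD/NL fold change, which is our reading of the criterion; on a log2 scale these correspond to roughly ±0.585. The protein names and values are hypothetical placeholders, not data from this study, and the P values are assumed to come from a t-test computed elsewhere.

```python
import math

FC_UP = 1.5          # fold-change cutoff for up-regulation (from the text)
FC_DOWN = 1 / 1.5    # fold-change cutoff for down-regulation (from the text)
P_CUTOFF = 0.05      # t-test P value cutoff (from the text)


def classify_protein(fold_change: float, p_value: float) -> str:
    """Label a protein 'up', 'down', or 'ns' (not significant)."""
    if p_value > P_CUTOFF:
        return "ns"
    if fold_change > FC_UP:
        return "up"
    if fold_change < FC_DOWN:
        return "down"
    return "ns"


# Hypothetical (fold change, P value) pairs, for illustration only.
examples = {
    "protein_A": (2.10, 0.003),
    "protein_B": (0.50, 0.020),
    "protein_C": (1.80, 0.300),
    "protein_D": (1.20, 0.010),
}
for name, (fc, p) in examples.items():
    print(name, classify_protein(fc, p), f"log2FC={math.log2(fc):+.2f}")
```

A protein must pass both the P value filter and one of the fold-change cutoffs to count as a DEP, which is why `protein_C` (large change, P > 0.05) and `protein_D` (small change, P ≤ 0.05) are both labeled `ns`.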
KEGG and GO enrichment analyses of DEPs in plasma
To further predict the possible roles of the DEPs, we conducted functional enrichment analysis using Fisher’s exact test, with P < 0.05 considered significant. The GO analysis covered biological process, cellular component, and molecular function annotations, describing the function of the DEPs from multiple angles. The up-regulated proteins in the plasma of LUAD patients compared with healthy subjects were significantly enriched in neurogenesis, negative regulation of cell communication and signal transduction, positive regulation of cell death and other biological processes (Figure 2e). According to the cellular component annotation, the majority of these DEPs originated from the endoplasmic reticulum, extracellular space, Golgi apparatus, and endoplasmic reticulum lumen and are involved in protein processing and transport. The molecular function analysis revealed enrichment in ion binding activity and protein heterodimerization activity (Figure 2f), which play a pivotal role in transmembrane transport and biological information transfer. The down-regulated proteins were enriched in phosphorylation, regulation of organelle organization, and ribose and nucleoside phosphate metabolic processes. Their cellular component annotations centered on the cytoskeleton, mitochondrion, microtubule cytoskeleton, and other regions related to cell proliferation. Their molecular functions were enriched in anion binding, small molecule binding and nucleotide binding, which is consistent with the up-regulated proteins.
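For readers unfamiliar with enrichment testing, the sketch below shows how a Fisher's exact test P value for over-representation of one annotation term can be computed from four counts. It uses the one-sided (hypergeometric tail) form, which is an assumption; the text does not state whether a one- or two-sided test was used. Using the 1,772 proteins as the background set is likewise an assumption, and the term counts `K` and `k` are hypothetical.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact test P value for over-representation.

    k: DEPs annotated with the term
    n: total number of DEPs
    K: background proteins annotated with the term
    N: total number of background proteins
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., upper annotated DEPs.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# n is taken from the text; N is an assumed background; K and k are hypothetical.
p = enrichment_p_value(k=30, n=317, K=100, N=1772)
print(f"P = {p:.3g}; enriched at P < 0.05: {p < 0.05}")
```

In practice the test is repeated for every GO or KEGG term, so a multiple-testing correction is often applied on top of the raw P values; the text does not say whether one was used here.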
Development of targeted protein assays using PRM
To extend the findings toward clinical application and prognostic evaluation, we carried out targeted validation by parallel reaction monitoring (PRM). The 317 DEPs obtained previously were retrieved from the database, and the subsequent work focused on 40 proteins closely related to prognosis. PRM testing was performed on peripheral blood from an additional 10 LUAD patients and 10 healthy subjects. Of the 40 proteins, 35 could be quantified; the remainder were limited by protein characteristics and expression abundance. Twenty-seven proteins were statistically significant, comprising 20 up-regulated and seven down-regulated DEPs. The data were processed with Skyline (v.3.6) to calculate relative protein abundance (Figure 3a). The DEPs were involved in platelet activation, the VEGFA-VEGFR2 signaling pathway, and the carbohydrate catabolic process (Figure 3b).
We submitted the altered proteins to STRING (v.11.5) and obtained the PPI network (Figure 3c); these proteins may have a specific role in the early occurrence of LUAD. GAPDH, TUBA4A, WDR1, and LDHA are the central proteins in the network. In STRING, the candidate proteins GAPDH and RAC1 showed the highest connectivity with other proteins differentially expressed between LUAD and NL. The expression levels of the top 10 significant DEPs (RAC1, ACTR2, PFKP, FHL1, UQCRC1, POSTN, RAB27B, ARPC2, RAP1B, and PNP), calculated as the LUAD/NL ratio, are shown in Figure 3d.
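One common way to quantify "connectivity" in a PPI network is node degree, the number of distinct interaction partners of each protein. The text does not say which measure was used, so the sketch below is only an illustration of that reading. The edge list uses placeholder node names and does not represent the STRING network from this study.

```python
# Placeholder edge list; real edges would come from a STRING export.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]


def node_degrees(edge_list):
    """Count the distinct partners of each node in an undirected network."""
    neighbours = {}
    for a, b in edge_list:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


degrees = node_degrees(edges)
# Rank nodes by descending degree, breaking ties alphabetically.
hubs = sorted(degrees, key=lambda node: (-degrees[node], node))
print(hubs[:2])
```

Storing partners in sets means that a duplicated edge is counted once, which matters when an interaction table lists both directions of the same pair.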
KM-Plotter was used to evaluate the prognostic performance of each DEP. The Kaplan-Meier (KM) analysis showed that LUAD patients with positive ACTR2, FHL1, RAB27B, and RAP1B expression had significantly longer overall survival (OS) than patients with negative expression (P<0.05; Figure 4). High expression of ARPC2, PFKP, PNP, RAC1, GAPDH, and TUBA4A was significantly negatively correlated with OS (P<0.05). ARPC2 is involved in the control of intracellular actin dynamics and facilitates cell migration and tumor metastasis in lung, colon, and breast cancer. PFKP is the platelet-type phosphofructokinase; it plays a critical role in metabolic reprogramming in certain cancers, including bladder cancer, breast cancer, and lung cancer, and is a potential driver gene in the GEO (Gene Expression Omnibus) database[21–25]. RAC1 (Rac family small GTPase 1) is a driver gene that regulates many cellular events, particularly cell growth and division, cytoskeletal and synaptic remodeling, autophagy, and tumor metastasis. These 10 genes were analyzed for combined KM prognosis (Figure 4)[26–28].
Overall survival analyses of ACTR2, FHL1, RAB27B, RAP1B, UQCRC1, ARPC2, PFKP, PNP, POSTN, RAC1, GAPDH and TUBA4A, performed with the KM-Plotter online survival analysis tool.
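The survival curves behind such an analysis are Kaplan-Meier (product-limit) estimates. The sketch below is a generic implementation of that estimator and is not a description of KM-Plotter's internals; the follow-up times and event flags are hypothetical. The log-rank test that yields the P values is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with >= 1 death.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event flags, illustration only.
times = [5, 8, 12, 12, 20, 25]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored patients contribute to the risk set up to their last follow-up and then drop out without lowering the curve, which is what distinguishes this estimator from a simple fraction of survivors.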
Potential diagnostic markers in LUAD
The samples used in this study were mainly from early-stage LUAD, so the next step was to draw ROC curves to evaluate each DEP for the diagnosis of LUAD. A protein with an AUC > 0.70 was regarded as a potential independent diagnostic factor. The mean plasma protein expression of the 10 healthy subjects was taken as negative, and ROC analysis was performed on the 27 differentially expressed proteins with MedCalc software. Seventeen of the 27 proteins showed a high AUC (>0.80) between the LUAD group and the NL group. Among these, CCT7 had an AUC of 0.960, and five proteins had an AUC from 0.90 to 0.95, indicating that these central proteins might have discriminative capacity in LUAD (P<0.05; Figure 5a). Logistic regression analysis was performed on the six plasma proteins with the higher AUCs (CCT7, UQCRC1, PGD, GAPDH, LDHA, and GP5), resulting in a detection rate of 92% (Figure 5b). UQCRC1 is a subunit of the ubiquinol-cytochrome c reductase complex in the mitochondrial electron transport chain; it has been studied extensively in Alzheimer's disease and may play an important role in the targeted therapy of pancreatic cancer [29–31].
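The AUC used as the screening criterion has a simple interpretation: it is the probability that a randomly chosen LUAD sample scores higher than a randomly chosen NL sample. The sketch below computes it directly from that definition. The abundance values are hypothetical, and the code assumes the protein is higher in LUAD; for a down-regulated protein the scores would be negated first. It does not reproduce MedCalc's output.

```python
AUC_CUTOFF = 0.70  # screening threshold stated in the text


def roc_auc(positive_scores, negative_scores):
    """AUC as the probability that a positive outranks a negative.

    Ties count as half, matching the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical relative abundances of one protein, illustration only.
luad = [2.1, 1.8, 2.5, 1.2, 3.0]
nl = [1.0, 1.5, 0.9, 1.9, 1.1]

auc = roc_auc(luad, nl)
print(f"AUC = {auc:.2f}; candidate marker: {auc > AUC_CUTOFF}")
```

With only 10 samples per group the AUC can take just 101 distinct values (multiples of 0.01, or 0.005 with ties), so confidence intervals around such estimates are wide.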