Exploration of specific co-expression modules associated with lung adenocarcinoma recurrence
To establish a co-expression network associated with postoperative recurrence of lung adenocarcinoma, we used WGCNA to analyze gene expression profiles from patients with recurrent lung adenocarcinoma in the TCGA database. After screening, transcriptome data from 24 relapsed patients (14 males and 10 females) were included. Based on pairwise gene correlations, the expression data of these patients were grouped into 39 gene modules by unsupervised average-linkage hierarchical clustering, and each module was labelled with a different color in the heat map (Figure 1A). The modules were mutually exclusive, so each co-expressed gene belonged to only one module. Genes that could not be assigned to any module were placed in the gray module. Because WGCNA can relate gene modules to phenotypes, we used it to analyze the correlation between the gene modules of patients with postoperative recurrent early lung adenocarcinoma and a series of phenotypes, including age, gender, survival, recurrence, recurrence type and pathological stage. Module partitioning was performed without any phenotypic or genetic preferences, and the purple module showed a significant relationship with survival and recurrence, with correlation coefficients greater than 0.7 (Figure 1B). We therefore considered that these genes and their co-expression patterns may be associated with the recurrence of lung adenocarcinoma.
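The module-trait correlation step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: WGCNA summarizes each module by its eigengene (the first principal component), whereas this sketch swaps in the mean of z-scored gene expression as a simpler stand-in. The function names and any input data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def zscore(values):
    """Standardize one gene's expression across samples (sample SD, n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def module_profile(module_expr):
    """Per-sample mean of z-scored genes.

    Assumption: used here as a simple stand-in for the WGCNA module
    eigengene, which is the first principal component of the module.
    """
    z = [zscore(gene) for gene in module_expr]
    n_samples = len(z[0])
    return [sum(gene[s] for gene in z) / len(z) for s in range(n_samples)]


def module_trait_correlations(modules, traits):
    """Correlate every module profile with every trait vector.

    modules: dict of module color -> list of per-gene expression lists
    traits:  dict of trait name -> list of per-sample trait values
    """
    result = {}
    for color, expr in modules.items():
        profile = module_profile(expr)
        result[color] = {name: pearson(profile, vals) for name, vals in traits.items()}
    return result
```

Modules whose absolute correlation with a trait such as recurrence exceeds a chosen cutoff (0.7 in the text) would then be carried forward for closer inspection.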
Biological insights from the purple module
WGCNA classifies the co-expressed genes of all patient samples into specific modules that are related to a series of traits and regulated by the same mechanism. In the previous section, we identified the purple module as the one most relevant to postoperative recurrence. To verify the relationship between the co-expressed genes in the purple module and lung adenocarcinoma, we constructed a heat map of their expression in 24 recurrent tumor tissues and 53 paracancerous tissues. The expression pattern of the purple module differed significantly between paracancerous tissues and recurrent tumor tissues (Figure 2A). However, the expression pattern of the purple module genes was less consistent in recurrent tumors than in paracancerous tissues: the recurrent tumors exhibited three expression patterns (light red, light blue, and deep red), suggesting different mechanisms of relapse. Of the 117 genes in the purple module, 68 were significantly differentially expressed between recurrent tumor tissues and paracancerous tissues (|logFC| > 0.6, FDR < 0.05) (Figure 2B). When these 68 genes were used to draw the expression heat map of the two tissue types (recurrent tumor tissues and paracancerous tissues), the expression patterns were clearly more uniform than in the heat map constructed with all 117 genes. To further analyze the biological function of the purple module, its 117 genes were subjected to GO analysis in DAVID, and the most significant GO term was "Cytosol" (p value = 0.0356) (Figure 2C).
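The differential-expression filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it reads the threshold as |logFC| > 0.6 with FDR < 0.05, and it assumes Benjamini-Hochberg adjustment, since the text does not say how the FDR was computed. Gene names and values passed to it are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_de_genes(genes, log_fc, p_values, lfc_cutoff=0.6, fdr_cutoff=0.05):
    """Keep genes with |logFC| above the cutoff and FDR below the cutoff.

    Assumption: the cutoffs mirror the thresholds quoted in the text;
    the FDR method (Benjamini-Hochberg) is our choice for illustration.
    """
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log_fc, fdr)
        if abs(lfc) > lfc_cutoff and q < fdr_cutoff
    ]
```

Both up-regulated and down-regulated genes pass the filter because the absolute value of logFC is used.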
Further clarification of key genes associated with adenocarcinoma recurrence in the purple module
Gene significance (GS) is highly correlated with gene connectivity, which means that nodes with higher connectivity in the co-expression network also play important roles in biological functions. We therefore constructed a gene co-expression network for lung adenocarcinoma recurrence and obtained a total of 2840 edges and 879 nodes (power = 8) (Figure 3A). Four genes linked to more nodes in the co-expression network, UPK2, KLHDC3, GALR2, and TYRP1, belonged to the purple module, which was highly correlated with survival and recurrence (Table 1). Among these genes, the expression levels of UPK2, KLHDC3 and GALR2 were higher in tumor tissues than in paracancerous tissues, whereas the expression level of TYRP1 was lower in tumor tissues than in paracancerous tissues. We then further analyzed the function of these four genes in lung adenocarcinoma using more clinical data from the Oncomine database. The survival outcomes of patients with low expression of UPK2, KLHDC3 and GALR2 were significantly better than those of patients with high expression (Figure 3B, C and D), with P values of 4.9e-05, 0.009, and 1.7e-05, respectively, whereas patients with high expression of TYRP1 had a significantly better prognosis than those with low expression, with a P value of 2e-07 (Figure 3E).
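To make the notion of connectivity concrete, the following sketch computes a soft-thresholded adjacency and ranks genes by how strongly they are connected. It assumes an unsigned network, where adjacency is the absolute Pearson correlation raised to the soft-threshold power (8, as quoted above); the text does not state the network type, and the function names and inputs are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def connectivity(expr, power=8):
    """Sum of soft-thresholded adjacencies for each gene.

    expr: dict of gene name -> list of expression values across samples.
    Assumption: unsigned adjacency a_ij = |cor(i, j)| ** power.
    """
    genes = list(expr)
    k = {gene: 0.0 for gene in genes}
    for i, gene_i in enumerate(genes):
        for gene_j in genes[i + 1:]:
            a = abs(pearson(expr[gene_i], expr[gene_j])) ** power
            k[gene_i] += a
            k[gene_j] += a
    return k


def top_hubs(expr, n=4, power=8):
    """Return the n genes with the highest connectivity."""
    k = connectivity(expr, power)
    return sorted(k, key=k.get, reverse=True)[:n]
```

Raising correlations to a high power suppresses weak correlations, so only strongly co-expressed gene pairs contribute appreciably to a gene's connectivity.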
Demographic information and clinical characteristics of patients with surgically treated early-stage lung adenocarcinoma receiving UPK2 plasma free mRNA testing
Recent studies have shown that plasma free mRNA has the potential to act as a tumor marker. Table 2 shows the demographic information and clinical characteristics of the 105 patients who met the study criteria out of 132 patients with early-stage lung adenocarcinoma admitted to our hospital. Of these ADC patients, 58 were male (55%) and 47 were female (45%), and the average age was 58 years (range, 39-83 years), indicating no obvious age or gender bias among the admitted patients. The pathological stage of most patients was stage I or stage II (83%), and that of the remaining patients was stage IIIa (17%). After surgery, 43 patients received adjuvant therapy (41%), including 39 patients receiving radiation therapy and 8 patients receiving adjuvant chemotherapy.
Diagnostic performance of UPK2
Starting from the first follow-up examination, on the 90th day after surgery, we collected each patient's blood, measured the plasma free UPK2 mRNA level, and then performed an imaging examination. If imaging indicated that the patient had relapsed, he or she was assigned to the relapsed group, and the relative expression level of UPK2 mRNA detected at that time was recorded. If the patient had no recurrence during the follow-up period, the mean value of the repeated tests was recorded as the corresponding UPK2 expression level. There were no significant differences in UPK2 between lung adenocarcinoma patients of different ages or genders (Figure 4A and B). Interestingly, UPK2 was maintained at a lower expression level in non-relapsed patients. The expression level of UPK2 relative to GAPDH was 0.2763 in relapsed patients, whereas the average UPK2 expression level in non-relapsed patients was 0.1623, which was significantly lower (P < 0.001; Figure 4C). More interestingly, within the same patient, the level of UPK2 expression at relapse was higher than that measured when there was no recurrence (Figure 4D). In addition, we plotted the ROC curve and calculated the AUC to determine whether the plasma UPK2 expression level could distinguish relapsed from non-relapsed patients (Figure 4E). When the plasma UPK2 expression level was used alone as a diagnostic biomarker, the AUC was 0.767, with a 95% confidence interval of 0.675-0.858. Moreover, the ADC patients included in the study were divided into a UPK2 high-expression group and a low-expression group, and their survival curves were plotted. Patients with high plasma UPK2 mRNA expression had poorer survival, whereas those with low plasma UPK2 mRNA expression had a better prognosis (Figure 4F).
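The ROC analysis above can be illustrated with a short sketch. It computes the AUC as the probability that a relapsed patient has a higher marker level than a non-relapsed patient (ties counted as one half), which equals the area under the empirical ROC curve. The function names are hypothetical and any input values are illustrative, not the study data; the confidence interval reported above is not reproduced here.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via pairwise comparison of positive and negative scores.

    scores_pos: marker levels in relapsed patients
    scores_neg: marker levels in non-relapsed patients
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


def roc_points(scores_pos, scores_neg):
    """(FPR, TPR) pairs obtained by sweeping the threshold from high to low."""
    thresholds = sorted(set(scores_pos) | set(scores_neg), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= t for s in scores_neg) / len(scores_neg)
        points.append((fpr, tpr))
    return points
```

An AUC of 0.5 corresponds to a marker with no discriminating power, and 1.0 to perfect separation of the two groups.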