WGCNA and key module analysis
The initial quality was assessed using the average linkage method. Two outlier samples were removed after the clustering. The remaining 339 HNSCC and 44 normal tissue samples with clinical information pertaining to LOI were used for subsequent analyses. In total, 2,601 genes showed the highest variance via the average linkage/hierarchical clustering method.
To establish a scale-free network, the scale‑free index (Fig. 1a) and mean connectivity (Fig. 1b, 1c) were calculated. We found that when the power value of β = 7, the scale‑free topology for the fitting index reached 0.85 (Fig. 1d). Different genes were subsequently grouped into modules according to the association of expression. Moreover, genes with similar expression patterns were placed into different modules via average linkage clustering. Finally, a total of 10 modules were identified (Fig. 2). On exploring the correlation between the MEs and LOI clinical phenotype, we found that 10 co-expression modules were correlated with the LOI clinical phenotype (Fig. 3a) and were associated with cancer status, particularly turquoise and pink key modules (Fig. 3b). Then, scatter diagrams were constructed for correlation analyses between gene significance for LOI status and module membership in the turquoise (Fig. 3c) and pink (Fig. 3d) modules, which revealed that genes in the two modules were significantly related with LOI status. The correlation and P values (Fig. 3c, 3d) indicated that the turquoise and pink modules showed high correlations with LOI status.
Enrichment analysis of key co-expression modules
To determine the function of genes in the key co-expression modules, GO function and KEGG pathway analyses were performed. GO function analysis showed that the turquoise module was associated with DNA replication, mitotic nuclear division, chromosome segregation, nuclear division, and DNA-dependent DNA replication, whereas KEGG pathway analysis indicated that it was associated with cell cycle, DNA replication, mismatch repair, and p53 signaling pathway (P < 0.05) (Fig. 4a, 4b). Similarly, GO function analysis indicated that the pink module was involved in not only squamous cell functions, such as epidermal cell differentiation, keratinocyte differentiation, skin development, epidermis development, and cornification (P < 0.05), but also regulation of protein secretion, for example, via the negative regulation of proteolysis, peptidase activity, and endopeptidase activity (P < 0.05) (Fig. 4c). These results indicated that the turquoise and pink modules played a pivotal role in LOI in patients with HNSCC.
PPI network analysis and hub genes
To identify hub genes in the key modules, PPI network analysis was performed using the STRING database. Connection threshold was used to define the hub genes; 89 genes, including the top five genes KIF18B, BUB1, BUB1B, KIF4A, and EXO1, in the turquoise module (connect threshold > 0.25) and 38 genes, including the top five genes KRT78, CNFN, SLURP1, PRSS27, and CRCT1, in the pink module (connect threshold > 0.10) were screened as candidate hub genes (Fig. 5, Additional file 4: Table S4, Additional file 5: Table S5). In addition, connect degree (>6) was used to define the hub genes, which led to the identification of 24 hub genes (18 in the turquoise module and 6 in the pink module).
Prognostic value and expression analysis of hub genes
After excluding samples with no survival information or survival duration < 1 month, 339 HNSCC samples were used to evaluate the prognosis of the 24 hub genes. We found that HNSCC samples with LOI showed a poor clinical outcome than those without LOI (P < 0.05), indicating that LOI is a key histological characteristic in HNSCC (Fig. 6a). Univariate survival analysis was then performed using the R-package survival, and the results indicated that CNFN was associated good survival (Fig. 6a) but KIF18B, KIF23, PRC1, CCNA2, DEPDC1, and TTK were associated with poor survival in HNSCC samples with LOI (P < 0.05; Fig. 6c–h).
To determine mRNA expression levels of the seven hub genes (CNFN, KIF18B, KIF23, PRC1, CCNA2, DEPDC1, and TTK), we used GEPIA and found that CNFN was significantly downregulated but KIF18B, KIF23, PRC1, CCNA2, DEPDC1, and TTK were significantly upregulated in HNSCC (P < 0.05; Fig. 6i). To assess protein expression levels of the seven hub genes, we performed protein expression analyses using the HPA database (Fig. 6j). The expression level of CNFN was low and thus could not be detected (100%, n = 4), whereas that of KIF18B (66.7%, n = 3), KIF23 (100%, n = 4), PRC1 (75.0%, n = 4), CCNA2 (66.7%, n = 3), DEPDC1 (100%, n = 3), and TTK (66.7%, n = 3) was either moderate or high (Fig. 6k).
Establishment of the prognostic risk score formula
Using the LASSO method and multivariable Cox regression analysis, two mRNAs (CNFN and DEPDC1) were identified as integrated prognostic biomarkers in patients with HNSCC. We then established a prognostic risk score formula based on the expression profiles of CNFN and DEPDC1 and their regression coefficients. The prognostic risk score formula was as follows: risk score = EXPDEPDC1 * 0.32636 + EXPCNFN * (−0.07544). The risk score was calculated for all patients, classifying patients into the high-risk (n = 165) and low-risk (n = 165) group using the median risk score as the cutoff value (Additional file 6: Table S6). The distribution of risk scores and survival status of patients are shown in Fig. 7a and 7b, respectively.
We then assessed the prognostic value of the aforementioned formula using Kaplan–Meier analysis. Patients in the low-risk group showed better OS than those in the high-risk group (P < 0.001; Fig. 7c). Moreover, time-dependent ROC analysis was utilized to evaluate the prognostic capacity of the formula. The areas under the ROC curve for 1-, 3-, and 5-year OS were 0.582, 0.634, and 0.636, respectively, implying that the integrated two-mRNA signature was much better at predicting the risk of LOI in patients with HNSCC (Fig. 7d).
Identification of small molecular agents
To determine which small molecular agents in the turquoise and pink modules could target LOI, we searched all drug–gene interactions in the DrugBank database. |Connectivity score| > 2 and P < 0.05 were used for screening; we found that five drug–module interactions (XL844, AT7519, AT9283, alvocidib, and nelarabine) in the turquoise module and three drug–module interactions (benzamidine, L-glutamine, and zinc) in the pink module could be used to target LOI (P < 0.05; Table 2). To investigate the clinical application of the eight small molecular agents in head and neck cancer or solid tumor, we examined the clinical trial registration of these agents using ClinicalTrials.gov (https://clinicaltrials.gov/ct2/home). Although a study on benzamidine remains to be conducted, three clinical trials of L-glutamine (NCT03015077, NCT02282839, NCT00006994) and zinc (NCT00036881, NCT03531190, NCT02868151) in head and neck cancer have been conducted (Additional file 7: Table S7). Moreover, XL844 (NCT00475917), AT7519 (NCT00390117, NCT02503709), AT9283 (NCT00443976, NCT00985868), alvocidib (NCT00080990), and nelarabine (NCT01376115) have been explored in the context of solid tumor/cancer. These findings indicated that XL844, AT7519, AT9283, alvocidib, nelarabine, benzamidine, L-glutamine, and zinc appear promising for treating LOI.