Sample repeatability and analysis of COL11A1
In the GSE41613 dataset, all samples were strongly correlated with one another, indicating good repeatability of the data (Figure 1A). COL11A1 expression was higher in the tumor group than in the control group (P<0.05, Figure 1B). In the CTD database, COL11A1 had high inference scores for mouth neoplasms, head and neck neoplasms, and squamous cell carcinoma of the head and neck (Figure 1C). Higher COL11A1 expression was associated with a more severe pathological stage of OSCC (Figure 1D). The tumor grade of OSCC could be classified based on the expression of the DEGs (Figure 1E). OSCC patients with higher COL11A1 expression had poorer overall survival than those with lower COL11A1 expression (HR=1.32, P=0.047) (Figure 1F).
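As an illustration of the repeatability check, the following minimal sketch computes pairwise correlations between samples. It assumes that Pearson correlation is the measure of the "relationship" between samples, which the text does not state. The sample names and expression values are hypothetical toy data, not values from GSE41613.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def sample_correlation_matrix(samples):
    """Map each pair of sample names to the correlation of their expression vectors."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}

# Hypothetical toy expression vectors (one value per gene), NOT GSE41613 data.
toy_samples = {
    "s1": [2.1, 5.0, 7.9, 3.2],
    "s2": [2.0, 5.2, 8.1, 3.0],
    "s3": [2.3, 4.8, 7.7, 3.5],
}

corr = sample_correlation_matrix(toy_samples)
# The weakest off-diagonal correlation summarizes how repeatable the samples are.
min_r = min(corr[a][b] for a in corr for b in corr if a != b)
```

A heat map of such a matrix would correspond to the kind of display referred to as Figure 1A, under the stated assumption.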
GO enrichment analysis for OSCC
The butterfly plot showed that high-grade and low-grade OSCC could be separated based on the signal-to-noise score (Figure 2A). The OSCC-related DEGs were mainly enriched in phosphatidylinositol binding, which was higher in high-grade OSCC (Figure 2B). The DEGs were also enriched in epidermal cell differentiation (Figure 2C) and establishment or maintenance of epithelial cell apical/basal polarity (Figure 2D), both of which were down-regulated in high-grade OSCC. The NES versus significance plot showed the relationship between the FDR q-value (or P-value) and the NES (Figure 2E). The ranked gene list correlation profile showed an area bias to high of 49.3% and a zero crossing at rank 11548 (55.9%) (Figure 2F).
The random ES distributions of phosphatidylinositol binding, epidermal cell differentiation, and establishment or maintenance of epithelial cell apical/basal polarity could distinguish low-grade from high-grade OSCC (Figure 3A-C). In addition, numerous DEGs were identified, and the expression heat map could classify OSCC (Figure 3D-E).
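The signal-to-noise score used for ranking can be sketched as follows. This is the basic form of the metric, the difference of the group means divided by the sum of the group standard deviations. The GSEA software applies additional adjustments to the standard deviations that are omitted here. The gene names and expression values are hypothetical.

```python
import statistics

def signal_to_noise(high, low):
    """Basic signal-to-noise score: (mean_high - mean_low) / (sd_high + sd_low)."""
    mu_h, mu_l = statistics.mean(high), statistics.mean(low)
    sd_h, sd_l = statistics.stdev(high), statistics.stdev(low)
    return (mu_h - mu_l) / (sd_h + sd_l)

# Hypothetical toy data: gene -> (high-grade values, low-grade values).
toy_expr = {
    "gene_up": ([8.0, 9.0, 10.0], [4.0, 5.0, 6.0]),
    "gene_down": ([2.0, 3.0, 4.0], [7.0, 8.0, 9.0]),
    "gene_flat": ([5.0, 6.0, 7.0], [5.0, 6.0, 7.0]),
}

scores = {gene: signal_to_noise(h, l) for gene, (h, l) in toy_expr.items()}
# Rank genes from most up-regulated to most down-regulated in the high-grade group.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Genes with positive scores fall on the high-grade side of the ranked list and genes with negative scores on the low-grade side, which is the separation a butterfly plot displays.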
KEGG enrichment analysis for OSCC
GSEA showed that the DEGs were mainly enriched in the P53 signaling pathway, alpha-linolenic acid metabolism, amino sugar and nucleotide sugar metabolism, arachidonic acid metabolism, beta-alanine metabolism, and cysteine and methionine metabolism (Figure 4A-F). The random ES distributions of these KEGG pathways could distinguish low-grade from high-grade OSCC (Figure 5A-F).
Verification of COL11A1 expression in OSCC clinical samples
Immunofluorescence showed that COL11A1 expression was higher in high-grade OSCC than in low-grade OSCC (P<0.05) (Figure 6A). This result was also verified by PCR: the mRNA expression of COL11A1 was higher in high-grade OSCC than in low-grade OSCC (Figure 6B).
Strong predictive value of COL11A1 for OSCC survival time
With the BP neural network, the best training performance was 0.0027083 at epoch 3000, and the training correlation coefficient (R) between COL11A1 and OSCC survival time was 0.99409. The percentage errors of the BP neural network were also small (Figure 6C-E). In addition, the strong predictive value of COL11A1 for OSCC survival time was verified by the SVM model (y=0.1834x+42.4856, R=0.6809) (Figure 7).
High sensitivity and specificity of COL11A1 for diagnosing OSCC
COL11A1 expression may serve as a molecular target for the diagnosis of OSCC, with high sensitivity and specificity (AUC=0.781, P<0.05). In contrast, tumor grade (AUC=0.581) and tumor size (AUC=0.540) had low sensitivity and specificity for diagnosing OSCC. Combining all factors gave strong diagnostic performance (AUC=0.815, P<0.05) (Figure 8).
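For readers unfamiliar with the AUC, a minimal sketch of its rank-based definition follows: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The scores are hypothetical, and this is not necessarily how the values above were computed.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical marker scores for positive and negative cases.
toy_pos = [0.9, 0.8, 0.6, 0.4]
toy_neg = [0.7, 0.5, 0.3, 0.2]

auc = roc_auc(toy_pos, toy_neg)  # 13 of 16 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why values such as 0.581 and 0.540 indicate weak diagnostic performance.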
Overall survival analysis for OSCC
In the clinical samples, OSCC patients with higher COL11A1 expression had poorer overall survival than those with lower COL11A1 expression (HR=1.645, P=0.005). The other factors were not related to the survival time of OSCC patients (P>0.05, Figure 9).
Pearson's chi-square analysis of the correlation between prognosis and related factors of oral squamous cell carcinoma
Pearson's chi-square test was used to summarize the relationship between the prognosis of oral squamous cell carcinoma and related clinical factors. Age (P = 0.011), tumor grade (P = 0.023), and COL11A1 (P < 0.001) were significantly correlated with the prognosis of oral squamous cell carcinoma. However, gender (P = 0.887), tumor size (P = 0.261), family history (P = 0.418), and tumor stage (P = 0.429) were not significantly correlated with the prognosis of oral squamous cell carcinoma (Table 1).
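A minimal sketch of Pearson's chi-square test for a 2x2 table is given below. It assumes each factor is dichotomized against a binary prognosis and that no continuity correction is applied; neither detail is stated in the text. The counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for the 2x2 table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: rows = high/low marker expression, columns = poor/good prognosis.
chi2, p_value = chi_square_2x2(30, 10, 15, 25)
```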
Spearman correlation analysis of prognosis and related factors of oral squamous cell carcinoma
Spearman correlation analysis further showed that the prognosis of oral squamous cell carcinoma was significantly correlated with age (ρ = 0.179, P = 0.011), tumor grade (ρ = 0.161, P = 0.023), and COL11A1 (ρ = 0.561, P < 0.001). However, gender (ρ = 0.010, P = 0.888), tumor size (ρ = 0.079, P = 0.263), family history (ρ = 0.057, P = 0.420), and tumor stage (ρ = 0.056, P = 0.432) had no significant correlation with the prognosis of oral squamous cell carcinoma (Table 2).
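The Spearman coefficient is the Pearson correlation of the ranks of the two variables. The following minimal sketch shows this, with tied values given their average rank. The input values are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical paired observations.
toy_marker = [1, 2, 3, 4, 5]
toy_outcome = [2, 1, 4, 3, 5]
rho = spearman_rho(toy_marker, toy_outcome)
```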
Univariate logistic regression analysis of prognosis and related factors of oral squamous cell carcinoma
Logistic regression was used to estimate the odds ratio (OR) and 95% confidence interval (95% CI) relating each parameter to the prognosis of oral squamous cell carcinoma. In the univariate logistic regression, age (OR = 2.102, 95% CI: 1.180-3.746, P = 0.012), tumor grade (OR = 1.919, 95% CI: 1.093-3.372, P = 0.023), and COL11A1 (OR = 12.775, 95% CI: 6.509-25.071, P < 0.001) were significantly correlated with the prognosis of oral squamous cell carcinoma. However, gender (OR = 1.041, 95% CI: 0.597-1.815, P = 0.887), tumor size (OR = 1.376, 95% CI: 0.788-2.404, P = 0.261), family history (OR = 1.264, 95% CI: 0.717-2.229, P = 0.418), and tumor stage (OR = 1.253, 95% CI: 0.716-2.190, P = 0.429) had no significant correlation with the prognosis of oral squamous cell carcinoma (Table 3).
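The link between a reported ratio, its 95% CI, and its P-value can be sketched as follows. The sketch assumes the intervals are Wald intervals, exp(β ± 1.96·SE), which the text does not state. The function names are illustrative. The example uses the univariate result for age reported above.

```python
import math

Z_CRIT = 1.96  # normal quantile for a two-sided 95% interval

def wald_summary(ratio, lower, upper):
    """Recover beta, SE, z and a two-sided p-value from a ratio and its 95% CI.
    ASSUMPTION: the CI is a Wald interval, exp(beta +/- 1.96 * SE)."""
    beta = math.log(ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_CRIT)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p_value

def ratio_with_ci(beta, se):
    """Turn a coefficient and its standard error back into a ratio and 95% CI."""
    return (math.exp(beta),
            math.exp(beta - Z_CRIT * se),
            math.exp(beta + Z_CRIT * se))

# Age in the univariate logistic model: OR = 2.102, 95% CI 1.180-3.746.
beta, se, z, p_value = wald_summary(2.102, 1.180, 3.746)
or_hat, ci_low, ci_high = ratio_with_ci(beta, se)
```

Under this assumption the recovered P-value is close to the reported 0.012, and the same calculation applies to hazard ratios from a Cox model.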
Multivariate logistic regression analysis of prognostic factors of oral squamous cell carcinoma
Multivariate logistic regression was used to estimate the OR and 95% CI for each factor at the multivariate level. COL11A1 (OR = 12.066, 95% CI: 6.042-24.096, P < 0.001) was significantly correlated with the prognosis of oral squamous cell carcinoma. However, gender (OR = 0.847, 95% CI: 0.422-1.699, P = 0.640), age (OR = 1.774, 95% CI: 0.874-3.603, P = 0.113), tumor size (OR = 1.465, 95% CI: 0.738-2.906, P = 0.275), family history (OR = 1.112, 95% CI: 0.552-2.240, P = 0.766), tumor grade (OR = 1.561, 95% CI: 0.786-3.098, P = 0.203), and tumor stage (OR = 1.009, 95% CI: 0.503-2.022, P = 0.981) were not significantly correlated with the prognosis of oral squamous cell carcinoma (Table 4).
Univariate Cox regression analysis
Table 5 shows the hazard ratios (HRs) and 95% confidence intervals (95% CI) for the prognosis of oral squamous cell carcinoma. Age (HR = 1.592, 95% CI: 1.150-2.205, P = 0.005), tumor grade (HR = 1.460, 95% CI: 1.067-1.999, P = 0.018), and COL11A1 (HR = 1.848, 95% CI: 1.340-2.548, P < 0.001) were significantly associated with patient survival time. However, gender (HR = 1.275, 95% CI: 0.932-1.743, P = 0.129), tumor size (HR = 1.066, 95% CI: 0.781-1.454, P = 0.687), family history (HR = 0.929, 95% CI: 0.675-1.278, P = 0.650), and tumor stage (HR = 1.252, 95% CI: 0.915-1.714, P = 0.160) had no significant correlation with patient survival time.
Multivariate Cox regression analysis
To control for confounding factors, all factors were included in the multivariate Cox regression model. Multivariate Cox proportional hazards regression showed that tumor grade (HR = 1.466, 95% CI: 1.064-2.020, P = 0.019) and COL11A1 (HR = 1.645, 95% CI: 1.164-2.325, P = 0.005) were significantly correlated with patient survival time. However, gender (HR = 1.097, 95% CI: 0.785-1.535, P = 0.588), age (HR = 1.379, 95% CI: 0.976-1.949, P = 0.068), tumor size (HR = 1.068, 95% CI: 0.779-1.464, P = 0.682), family history (HR = 0.959, 95% CI: 0.694-1.326, P = 0.801), and tumor stage (HR = 1.242, 95% CI: 0.902-1.711, P = 0.184) had no significant correlation with patient survival time (Table 6).