Data collection
lncRNA and mRNA expression and clinical profiles of LSCC samples from the Cancer Genome Atlas (TCGA) database (https://tcga-data.nci.nih.gov/tcga/), including LSCC (n=110) and normal samples (n=12). Patients with survival time less than 30 days were omitted. Then, the immune associated genes (IRGs) list was obtained via the Immunology Database and Analysis Portal (ImmPort) database [12], and IRGs list was used to screen immune associated lncRNAs (IALs) by a co-expression strategy.
Identification of differentially expressed genes and lncRNAs
“limma” package in R software was used to screen the differentially expressed genes and lncRNAs, which were screened using the threshold of a log |fold change| > 1 and an adjusted P-value < 0.05. Correlation analysis was performed between IRGs and differentially expressed lncRNAs. Those with immune gene correlation coefficients greater than 0.4 and p value less than 0.001 were considered as IALs. The ‘‘gplots’’ package in R software were used to draw volcano maps.
Immune associated lncRNA based prognostic signature (IALPS)
The survival data of included patients were downloaded from TCGA database. Then, univariate cox regression was used to select survival-associated IALs (SAIALs) through the R software “survival” package and P < 0.01 was considered significant. SAIALs were submitted for multivariate Cox regression analysis. The IALPS was constructed based on expression data multiplied by the Cox regression coefficient.
Validation of the Constructed Risk Signature
The LSCC patients were divided into high and low risk groups based on the median risk score. Then, the Kaplan–Meier method with log-rank test was used to establish survival curve of the two groups. The one-year, three-year, and five-year survival receiver operating characteristic (ROC) curves predicted by the IALPS were drawn. The area under the curve (AUC) of the ROC curve was calculated to validate the performance of IALPS. Then, the correlation between the expression of the selected immune-associated lncRNAs and clinicopathological characteristics were analyzed using student's t-test.
Gene Set Enrichment Analysis
The Gene set enrichment analysis (GSEA 4.0.1) was performed to identify different functional enrichment of the low-risk and high-risk groups. Firstly, the mRNA expression profiles of LSCC from the TCGA dataset were divided into the low-risk and high-risk groups, which were performed on “IMMUNE RESPONSE” and “IMMUNE SYSTEM PROCESS” gene sets.
Evaluation of Infiltrating Immune Cells
The CIBERSORT algorithm, coupled with 22 immune cell types, was applied to analysis the differences in the abundance of infiltrating immune cells of between the low-risk and high-risk groups. Student's t-test was used to compare significant immune cell types in between the low-risk and high-risk groups. to chart violin plot by boxplot in the R package.
Construction of the lncRNA–miRNA–mRNA Regulatory Network
The DIANA-LncBase v2 tool was used to explore the target miRNAs of lncRNA [13]. Three miRNA databases, including miRDB [14], miRTarBase [15], and TargetScan (http://www.targetscan.org/vert_71/) were used to predict the target mRNAs of miRNAs. The threshold of the expression correlation between lncRNAs and miRNAs was set at 0.85. Then, the lncRNA-miRNA-mRNA regulatory network was visualized using Cytoscape (version 3.6.1).
Statistical analysis
R (version 4.1.0), Perl (version 5.26.3) and software were used for all data analyses. Both univariable and multivariable Cox regression analyses were generated by the R package ‘survival’. The ROC curve was performed using the R package ‘survivalROC’. P < 0.05 was considered statistically significant.