The 2022 Clinical Practice Guidelines for non-small cell lung cancer (NSCLC) state that NSCLC is a unique disease and that its main causative factors are active and passive (second-hand) smoking(18). Evidence suggests that second-hand smoke exposure from living with a smoker increases the risk of developing NSCLC by 20–30%(19). Because approximately 75% of patients with NSCLC are already at an intermediate or advanced stage at detection, when the 5-year survival rate is very low, lung cancer screening is recommended for high-risk smokers and people with a history of smoking(20; 21).
Messenger ribonucleic acid (mRNA) is a single-stranded RNA that is transcribed from a strand of deoxyribonucleic acid (DNA) as a template and carries the genetic information that guides protein synthesis(22). MicroRNAs (miRNAs) are small non-coding RNA molecules approximately 18–22 nucleotides in length that regulate the expression of their target genes primarily by interacting with the 3' untranslated region (3'UTR) of the target mRNA(23). There is growing evidence that mRNAs and miRNAs are important players in tissue physiology and disease pathology, including cancer, where large numbers of dysregulated mRNAs and miRNAs have been identified, implying oncogenic or tumour-suppressive roles(23–25). The development of NSCLC is also regulated by mRNAs and miRNAs(26). We therefore used 519 NSCLC-related samples from The Cancer Genome Atlas (TCGA) database to screen for differentially expressed mRNAs and miRNAs by bioinformatic analysis of NSCLC-related mutation groupings. For mRNAs, Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis was performed to select the significant differentially expressed mRNAs, and a multigene combination model (mRNA risk score, hereafter the risk model) was then constructed. The samples were divided into high- and low-risk groups at the median risk score, and the association of the risk model with NSCLC survival events and clinical information (age, sex, risk level and stage) was assessed. Finally, the prognostic value of the risk model was evaluated by Kaplan-Meier analysis, and its predictive accuracy was assessed by plotting the receiver operating characteristic (ROC) curve in R.
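The scoring and grouping steps described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the analysis itself was carried out in R), and it makes several assumptions: the gene names and coefficients are hypothetical placeholders standing in for LASSO-derived weights, not the model reported here; the risk score is taken to be the coefficient-weighted sum of expression values, which is the usual form of such models; and the area under the ROC curve is computed for a binary event indicator, which is simpler than a time-dependent ROC analysis of survival data.

```python
from statistics import median

# Hypothetical coefficients standing in for LASSO-derived weights.
# Gene names and values are placeholders, not the model from this study.
COEFFICIENTS = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Weighted sum of expression values: sum(coef_i * expr_i)."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label each sample 'high' if its score exceeds the median, else 'low'."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


def auc(scores, events):
    """Area under the ROC curve for a binary event indicator (1 = event).

    Computed as the fraction of (event, non-event) pairs in which the
    event sample has the higher score; ties count as half.
    """
    positives = [s for s, e in zip(scores, events) if e == 1]
    negatives = [s for s, e in zip(scores, events) if e == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy expression profiles for four hypothetical patients.
patients = [
    {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 2.0},
    {"GENE_A": 0.0, "GENE_B": 4.0, "GENE_C": 3.0},
    {"GENE_A": 6.0, "GENE_B": 8.0, "GENE_C": 2.0},
]
events = [0, 1, 1, 0]  # toy survival events (1 = death observed)

scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)
print(scores, groups, auc(scores, events))
```

In this sketch a sample whose score equals the median is assigned to the low-risk group; that tie rule is a design choice of the example, not something specified in the text.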
Understanding the causes of certain diseases is extremely difficult and complex, but the use of bioinformatics to find differential genes and predict the risk of developing disease under different conditions was ranked among the most ground-breaking methods of 2021(11; 27; 28). Polygenic risk scores have clear benefits for predicting disease risk; for example, women with a higher polygenic risk score for breast cancer would receive more mammograms than women with a lower score(29). We used bioinformatics to screen for differential genes in NSCLC and identified a combination model of 22 genes for the prognosis of NSCLC. Multigene risk scoring can be scientifically useful in predicting the course of disease, but it is also open to misuse and questionable applications, so the technology needs to be used wisely(30).
miR-21-5p has been reported to act as an oncogene that promotes the progression of many cancers and to affect the development of lung cancer, colorectal cancer and chondrosarcoma. Recent studies have shown that miR-21-5p is highly expressed in lung cancer cells and that miR-21-5p inhibitors significantly inhibit the proliferation, invasion and migration of these cells(31). In colorectal cancer, miR-21-5p induces cell death by regulating TGFBI, which provides an important mechanism for its antitumour effects and expands its clinical potential(32). In chondrosarcoma, miR-21-5p targets CCR7 expression and thereby inhibits STAT3 and NF-κB signalling, which in turn inhibits the proliferation, migration and invasion of chondrosarcoma cells(33). In patients with NSCLC, miR-21-5p was highly expressed in tumour tissues and plasma and was positively correlated with the development of NSCLC; it could serve as an independent prognostic indicator and, in combination with other miRNAs, may be used in the clinical diagnosis of NSCLC(34; 35). Cellular assays showed that miR-21-5p overexpression could promote the proliferation of NSCLC cells. Through bioinformatic analysis, we found that miR-21-5p was significantly upregulated in patients with NSCLC, a finding that adds to the theoretical basis for using miR-21-5p as an indicator for the clinical diagnosis and prognosis of NSCLC.
In screening for the multigene combination model, we found that the model has some limitations for predicting the prognosis of NSCLC. In addition, we have not validated the model in a clinical setting: it rests on bioinformatic analysis alone and lacks clinical and basic experimental validation, which may be the focus of our next work. Overall, it is feasible to use bioinformatics to screen for a prognostic multigene combination model for NSCLC.