Many studies worldwide have investigated genes and biomarkers related to the occurrence and progression of CIN. However, most report single genes or single biomarkers associated with CIN, and only a few have combined multiple factors to predict CIN occurrence and progression. Mei Sze Tan et al. used bioinformatics tools to screen 9 genes that were differentially expressed between cervical cancer and normal tissues. However, that study only screened genes within a data set: it provided no experimental verification, did not demonstrate the expression of these genes in tissue specimens, and did not analyze their diagnostic value for CIN. Petra Biewenga et al. performed experiments on clinical cervical cancer and normal cervical tissue specimens and identified 9,313 significant genes, but did not analyze these genes in further detail. Nevertheless, these studies suggest that a large number of genes are significantly differentially expressed among normal cervical tissue, CIN tissue and SCC, which lays the foundation for multifactor combined diagnosis.
(1) The genes and pathways related to the occurrence and progression of CIN
The HPV infection pathway summarizes the mechanism of HPV infection and the carcinogenic process. It includes 11 subpathways: the Wnt signaling pathway, mTOR signaling pathway, apoptosis pathway, NFKB signaling pathway, P53 signaling pathway, JAK/STAT signaling pathway, Notch signaling pathway, PI3K/Akt signaling pathway, Toll-like receptor signaling pathway, focal adhesion pathway and antigen processing and presentation pathway. In this study, FOXO1, CSNK1A1 and CTBP2 were significantly differentially expressed genes located in the HPV infection pathway. Of these, CSNK1A1 and CTBP2 are located in the Wnt signaling pathway. HPV E6 can activate the Wnt signaling pathway, thereby causing immortalization of cervical epithelial cells. In addition, HPV E6 acts on the gene Dvl, which is located upstream in the Wnt signaling pathway. The Dvl gene is overexpressed in cervical squamous carcinoma cells and plays a key role in the carcinogenesis of cervical epithelial cells. Experimental results indicated that CSNK1A1 is located downstream of Dvl, so it is speculated that during the progression of CIN, CSNK1A1 is affected by HPV E6 and the cells thereby acquire immortality (figure 3). CTBP2 has not been reported to be related to cervical diseases, and its role in the HPV infection pathway is unknown. In studies of gynecological tumors, L. Barroilhet et al. pointed out that CTBP2 is overexpressed in ovarian cancer cells and that CTBP2 can downregulate target genes of the Wnt signaling pathway and promote the carcinogenesis of ovarian epithelium, but its role in cervical cancer needs further study. FOXO1 is located in the PI3K/Akt signaling pathway. The PI3K/Akt signaling pathway can be activated by HPV E7, which can inactivate Rb and promote the occurrence of HSIL.
HPV E7 can upregulate the expression of FOXO1, which serves as the upstream gene of Akt, but Akt can inhibit the expression of FOXO1, so HPV E7 can indirectly inhibit the expression of FOXO1. In this study, FOXO1 expression was significantly lower in cervical cancer tissues than it was in normal tissues, and the FOXO1 gene was located upstream of the Rb gene in the PI3K/Akt signaling pathway. Therefore, low expression of the FOXO1 gene may be related to Rb inactivation (figure 4).
The main function of the Hippo signaling pathway is to control organ size. During cervical carcinogenesis, the expression of YAP, the core gene of this pathway, is upregulated as cervical lesions progress. Excessive activation of YAP increases the susceptibility of cervical epithelial cells to HPV, and YAP and HPV work together to promote carcinogenesis of cervical epithelial cells. In this study, PRKCI and TGFBR2, which are located in the Hippo signaling pathway, were significantly differentially expressed genes. TGFBR2 is located upstream of YAP and inhibits the formation of apoptotic precursor proteins (figure 5). In the SCC group, the expression of TGFBR2 was significantly higher than it was in the CIN group. Based on the experimental results, it is speculated that overexpression of TGFBR2 inhibited the apoptosis of cervical epithelial cells and, together with the synergistic effect of HPV, promoted carcinogenesis of cervical epithelial cells. Compared with normal cervical tissue, the expression of TGFB2 in CIN tissue is significantly lower, and it decreases with the progression of CIN. TGFBR2 is a receptor protein of TGFB2, and decreased expression of TGFB2 is likely to cause a similar change in TGFBR2. Previous studies have revealed that cervical cancer cases with low expression of TGFBR2 have a poor prognosis and have confirmed that TGFBR2 can inhibit the cell cycle at the G1/S stage through the TGFB/Smad pathway, while low expression of TGFBR2 can relieve this inhibition, thereby speeding up the progression of cervical cancer cells from the G1 phase to the S phase and resulting in cell proliferation. TGFBR2 therefore appears to work via different pathways during the initiation and the progression of CIN. Kyung-Hee Kim et al. reported that overexpression of the YAP gene in lung adenocarcinoma can result in the phosphorylation of PRKCI, which upregulates the expression of PRKCI, suggesting a high pathological grade and an unfavorable prognosis. PRKCI likely inhibits the recruitment of immune cells in the microenvironment of ovarian cancer by regulating the activity of YAP1 through the Hippo signaling pathway, resulting in immunosuppression and promoting tumor growth. There are few reports on PRKCI and its role in the carcinogenic mechanisms of cervical cancer. Femi OF et al. demonstrated that a PRKCI mutation is related to the occurrence of cervical cancer, but the specific mechanism remains unclear (figure 6).
(2) Clinical factors related to the occurrence and progression of CIN
In this study, the proportion of premenopausal patients was significantly higher among CIN cases than among SCC cases, and logistic regression analysis found that premenopause was one of the independent risk factors for the progression of CIN. Chen et al. studied patients with CIN who relapsed after cervical conization or LEEP treatment and found that the recurrence rate in premenopausal patients was significantly higher than that in postmenopausal patients, which is consistent with this study. However, Renata B et al. reported that postmenopausal CIN patients were more prone to interstitial infiltration and progression to invasive cervical cancer. Therefore, it is still unclear whether menopause has any effect on the progression of CIN. Based on the results of this study, it could be speculated that premenopausal patients were younger, were more sexually active and were more likely to have persistent HPV infection. At the same time, the level of endogenous estrogen in premenopausal women is higher, and a high level of estrogen promotes the transcription and integration of HPV and the degradation of the host cell P53 protein, thereby causing cervical epithelial cells to become cancerous. Moreover, young premenopausal women are more likely to take oral hormonal contraceptives, which are also one of the risk factors for the progression of CIN.
SCC patients were significantly older on average than CIN patients, and their parity was significantly higher. For women younger than 25 years old, regardless of the grade of cervical lesions, the rate of spontaneous regression was 1.4 times higher than that of women older than 50 years old. Christine Bekos obtained similar results: the proportion of women over 40 years old who experienced CIN progression was significantly higher than that of women younger than 40 years, and for every additional 5 years of age, regardless of cervical lesion grade, the rate of spontaneous regression decreased by 21%. The age difference observed in this study suggests that age is likely to be related to the progression of CIN. As age increases, immune function declines, leading to persistent HPV infection. Among women with persistent HPV infection, the greater the number of deliveries, the greater the risk of developing high-grade cervical lesions. High parity is a risk factor for cervical cancer. HSIL is especially likely to progress in older women with high parity. The results of this study are consistent with those reported in previous research.
The proportions of HPV positivity and of CINII+ TCT results were significantly higher in CIN cases than in the normal cervical group. In model 12, the TCT results had a large impact on the prediction, which shows that TCT examination plays an important role in the diagnosis of CIN. HPV testing and TCT both play an important role in diagnosing CIN and in identifying CIN and SCC. Although TCT can produce false negatives because of differences in operator technique, its accuracy in diagnosing cervical diseases is significantly better than that of traditional cervical smears. Among HPV-negative women with normal TCT results and cervical biopsies, only 4.8% developed CINII+ over 15 years of follow-up, whereas 46.2% of women with TCT results of HSIL+ experienced disease progression. Moreover, HPV is an important factor in the occurrence of CIN and cervical cancer, and TCT combined with HPV detection has greatly promoted the early diagnosis of cervical disease. Hence, patients with HPV infection and CINII+ TCT results should undergo further examination and follow-up to prevent the occurrence and progression of cervical lesions.
(3) The predictive random forest models
A random forest model consists of multiple decision trees that are built independently of one another. When a new input sample is presented, each decision tree classifies it, and the forest combines the individual judgments, typically by majority vote, into a final prediction. The random forest model is resistant to overfitting, places few requirements on the data set, and is highly adaptable, making it suitable for nonlinear data. In this study, random forest models were built, and the best models were then chosen according to the accuracy, AUC value and out-of-bag (OOB) error value.
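The mechanism described above can be illustrated with a minimal sketch. It is not the implementation used in this study, whose software and tree settings are not stated here. As simplifying assumptions, each tree is reduced to a single-split stump, the two feature columns and six samples are invented toy values, and the function names (`fit_stump`, `fit_forest`, `predict`, `oob_error`) are hypothetical. The sketch shows bootstrap sampling, random feature selection, majority voting, and how the OOB error is estimated from the samples each tree did not see.

```python
import random
from collections import Counter

def majority(values):
    """Return the most common value in a non-empty list."""
    return Counter(values).most_common(1)[0][0]

def fit_stump(rows, labels, feature_ids):
    """Fit a depth-1 tree: the (feature, threshold) split with the fewest errors."""
    overall = majority(labels)
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            left_pred = majority(left)  # never empty: t is an observed value
            right_pred = majority(right) if right else overall
            errors = (sum(y != left_pred for y in left)
                      + sum(y != right_pred for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_pred, right_pred)
    _, f, t, left_pred, right_pred = best
    return lambda row: left_pred if row[f] <= t else right_pred

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on its own bootstrap sample and random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    k = max(1, int(n_features ** 0.5))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        tree = fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        forest.append((tree, set(idx)))  # remember which samples were in-bag
    return forest

def predict(forest, row):
    """Every tree judges the sample; the majority vote is the prediction."""
    return majority([tree(row) for tree, _ in forest])

def oob_error(forest, rows, labels):
    """Score each sample using only the trees that never saw it in training."""
    wrong = counted = 0
    for i, (row, y) in enumerate(zip(rows, labels)):
        votes = [tree(row) for tree, in_bag in forest if i not in in_bag]
        if votes:
            counted += 1
            wrong += majority(votes) != y
    return wrong / counted if counted else float("nan")

# Invented toy data: two hypothetical expression features, label 0 or 1.
rows = [[0.1, 1.0], [0.2, 1.2], [0.3, 0.9], [0.9, 3.1], [0.8, 2.9], [0.7, 3.3]]
labels = [0, 0, 0, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict(forest, [0.05, 0.5]), predict(forest, [0.95, 3.5]))
print(oob_error(forest, rows, labels))
```

Because the OOB error is computed from samples left out of each bootstrap draw, it gives an internal estimate of generalization error without a separate validation set, which may be why it is used here alongside accuracy and AUC.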
Regarding the random forest models of CIN progression, model 3 had the highest accuracy and AUC value, and its OOB error value was relatively small. Therefore, model 3 was chosen as the predictive model for CIN progression. In model 3, CSNK1A1 and PRKCI had a great impact on the result. Moreover, these two genes were also significantly differentially expressed during the progression of CIN. In the HPV infection signaling pathway, CSNK1A1 can cause cell polarity loss through the action of HPV E6. However, there is no research on the relationship between CSNK1A1 expression and cervical diseases; most research on CSNK1A1 focuses on hematological malignancies. Overexpression of CSNK1A1 can promote the proliferation and survival of tumor cells by downregulating the expression of CTNNB1 in myeloma; CSNK1A1 and CTNNB1 both function in the classic Wnt/β-catenin signaling pathway. CSNK1A1 inhibits the canonical Wnt/β-catenin signaling pathway by promoting the degradation of CTNNB1, thereby promoting tumor cell growth. However, in this study, the expression of CSNK1A1 in cervical cancer tissue was significantly higher than it was in CIN tissue, whereas the expression of CTNNB1 did not differ significantly between CIN and SCC tissues. Based on the experimental results, it is speculated that overexpression of CSNK1A1 has no effect on CTNNB1 during the progression of CIN, so it may promote cell proliferation or even malignant transformation through pathways other than the Wnt/β-catenin signaling pathway (figure 7). PRKCI is in the Hippo signaling pathway, but the mechanism by which it leads to CIN and cervical cancer is unknown. According to previous studies, overexpression of YAP in this pathway may lead to upregulation of PRKCI, which eventually results in carcinogenesis. PRKCI has been confirmed to be overexpressed in many solid tumors. In studies of gynecological tumors, the expression of PRKCI in ovarian cancer tissues was significantly higher than it was in normal tissues, and PRKCI enhanced the invasion and proliferation ability of ovarian cancer cells. The experimental results of this study showed that the expression of PRKCI in cervical cancer tissue was significantly higher than it was in CIN tissue, which may be related to the progression of CIN. However, more research is needed to uncover the mechanism.
For the random forest models of CIN occurrence, model 12 had the highest accuracy and AUC value and the smallest OOB error value. Therefore, model 12 was chosen as the predictive model for the occurrence of CIN. CTBP2 had the largest impact on model 12. CTBP2 is in the classic Wnt/β-catenin signaling pathway, which is one of the subpathways of the HPV infection signaling pathway; however, the role of CTBP2 in the HPV infection signaling pathway is unknown. In this study, the expression of CTBP2 in normal cervical tissue was higher than it was in CIN tissue. Surprisingly, compared with normal cervical tissue, SCC tissue expressed CTBP2 at much higher levels. Overexpression of CTBP2 might therefore be linked to the progression of CIN. CTBP2 has not yet been reported in cervical diseases, but some studies have found that CTBP2 is associated with a variety of solid tumors. CTBP2 is overexpressed in non-small cell lung cancer and promotes tumor cell invasion and proliferation through the classic Wnt/β-catenin signaling pathway. Additionally, CTBP2 is related to angiogenesis in prostate cancer cells, and silencing CTBP2 can promote prostate cancer cell apoptosis. Similar results were obtained in this study, in which CTBP2 was overexpressed in SCC tissues. Overexpression of CTBP2 can also inhibit the expression of genes located downstream in the classic Wnt/β-catenin signaling pathway, such as CTNNB1; however, in this experiment, CTNNB1 was not inhibited by overexpressed CTBP2, indicating that overexpression of CTBP2 might not promote the malignant transformation of cervical cells through the Wnt/β-catenin signaling pathway and might instead act through other pathways, leading to the occurrence and progression of CIN. Another study suggested that CTBP2 can promote epithelial-mesenchymal transition (EMT), and EMT is a key process in epithelial cell carcinogenesis. In addition, CTBP2 can promote the replication and proliferation of adenovirus in 293T cells. Since most CIN and SCC cases involve persistent HPV infection, it can be inferred that overexpression of CTBP2 may be related to HPV infection and may promote the malignant transformation of cells by promoting the replication and proliferation of HPV in cells.
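Both model choices above rest on the same three criteria: accuracy, AUC value and OOB error value. The following sketch shows one way such a comparison could be computed. It is an illustration only. The scores, labels and candidate metrics are invented, the model names "A" and "B" are hypothetical, and the tie-breaking order in `pick_best` (accuracy first, then AUC, then OOB error) is an assumption, since the text does not state how the three criteria were weighed against each other.

```python
def auc(scores, labels):
    """AUC as the chance that a random positive case scores above a random
    negative case; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(scores, labels, threshold=0.5):
    """Fraction of cases whose thresholded score matches the true label."""
    hits = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return hits / len(labels)

def pick_best(candidates):
    """Assumed ordering: higher accuracy, then higher AUC, then lower OOB error."""
    return max(candidates,
               key=lambda m: (m["accuracy"], m["auc"], -m["oob_error"]))

# Invented predicted probabilities and true labels for six cases.
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
labels = [0, 0, 1, 1, 1, 0]
print(auc(scores, labels), accuracy(scores, labels))

# Hypothetical summary metrics for two candidate models.
candidates = [
    {"name": "A", "accuracy": 0.80, "auc": 0.85, "oob_error": 0.22},
    {"name": "B", "accuracy": 0.86, "auc": 0.91, "oob_error": 0.18},
]
print(pick_best(candidates)["name"])
```

Accuracy depends on the chosen probability threshold, whereas AUC summarizes ranking quality across all thresholds, so reporting both gives a fuller picture than either one alone.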
TGFBR2 and FOXO1 were two other significant genes that might be related to the occurrence and progression of CIN. Among the tested models, model 3 incorporates TGFBR2 and FOXO1, and model 12 incorporates FOXO1. These two genes may therefore influence the occurrence and progression of CIN. There are few reports on the correlation between TGFBR2 and cervical diseases. Cai et al. pointed out that miR-17-5p promotes the development and metastasis of cervical cancer by upregulating TGFBR2. In contrast, Yang et al. reported that the downregulation of TGFBR2 suggests a poor prognosis for cervical cancer. In the early stages of carcinogenesis, TGF-β plays an inhibitory role, while in the later stages, tumor cells lose sensitivity to TGF-β signaling and take advantage of it to promote cellular EMT. This study found that the expression of TGFBR2 in CIN tissue was significantly lower than it was in normal cervical tissue and SCC tissue. It is speculated that when precancerous lesions of cervical epithelial cells occur, the expression of TGFBR2 is downregulated, and TGF-β signaling does not function as an inhibitor of cancerous cells. When CIN progresses to SCC, cells and tissues become resistant to TGF-β signaling, and TGFBR2 is upregulated and promotes the progression of CIN. In addition, several studies have described a close relationship between TGFBR2 and EMT during carcinogenesis. In immortalized cervical epithelial cells, overexpression of TGFBR2 promotes EMT and malignant transformation. TGFBR2 may also accelerate the malignant transformation of cervical epithelial cells and the initiation and progression of CIN by promoting EMT.
FOXO1 plays an important role in the occurrence and progression of CIN. FOXO1 is considered to be a tumor suppressor gene. Overexpression of FOXO1 in vitro can inhibit the growth and proliferation of cervical cancer cells. Moreover, the prognosis of cervical cancer with FOXO1 overexpression is more favorable. In contrast, the downregulation of FOXO1 promotes the invasion and metastasis of cervical cancer cells. Nevertheless, the role of FOXO1 in the development of cervical cancer is still controversial. The results of this study showed that the expression of FOXO1 in SCC tissue was significantly lower than it was in CIN tissue, but the expression of FOXO1 in CIN tissue was significantly higher than it was in normal cervical tissue. According to the experimental results and pathway information, during the progression of CIN to SCC, FOXO1 functions in the PI3K/Akt signaling pathway, which is a part of the HPV infection signaling pathway. This pathway is related to cell proliferation, and upstream genes can inhibit the expression of FOXO1 to promote cell proliferation. Thus, the low expression of FOXO1 may be linked to the excessive proliferation of cells, which may promote the progression of CIN to cervical cancer. However, some studies reported that inhibiting FOXO1 expression could inhibit cervical cancer growth. Chay et al. reported that the expression of FOXO1 in cervical cancer tissue and CIN tissue is significantly higher than it is in normal cervical tissue and that FOXO1 overexpression is an independent risk factor for poor prognosis of cervical cancer. Regarding the occurrence of CIN, FOXO1 might tend to be overexpressed as a response that inhibits the progression of lesions to cervical cancer. The overexpression of FOXO1 in CIN might therefore be a warning sign of CIN progression.
This study has some limitations. First, it is a single-center retrospective analysis with a small sample size. Second, the experimental results contain some errors. In the future, it will be necessary to expand the sample size and improve the experimental methods in order to fully assess the risk factors related to the occurrence and progression of CIN.