Most patients with PTC have a relatively good prognosis. However, persistent disease or recurrence is observed in 5%-20% of patients and can be associated with severe complications following re-operation[31]. For patients with a low risk of recurrence, prolonged thyroid stimulating hormone (TSH) suppression therapy may cause multiple adverse effects, such as osteoporosis or osteopenia and cardiac comorbidities like atrial fibrillation[32]. Considering the excellent prognosis yet considerable recurrence rate, the development of novel diagnostic tools with high sensitivity and specificity seems to have greater clinical significance than the exploration of neoadjuvant therapies. Traditional staging systems such as the ATA risk stratification system evaluate recurrence risk for a stratified population rather than for the individual, which implies that a group of patients sharing the same clinical and pathological characteristics would have the same chance of recurrence[33]. However, the biological mechanisms underlying PTC progression are highly complex and heterogeneous and call for a more accurate and personalized prediction model based on biomarkers at the molecular level. Therefore, specific gene signatures could help predict the metastatic and recurrent potential of tumors more effectively.
The incidence of PTC has been increasing continually; however, the mortality rate has not changed substantially, probably because the majority of incidentally diagnosed PTCs are low-risk papillary thyroid microcarcinomas (PTMCs). Except for tumors with high-risk features such as extrathyroidal extension, clinically evident LNM(+), and special aggressive types, active surveillance appears to be safe[34] and can replace immediate surgery for low-risk PTC[35]. In general, active surveillance begins when patients are diagnosed with low-risk PTC by ultrasound examination and fine-needle aspiration biopsy (FNAB). Because PTC progression is driven by complex biological mechanisms, the decision to pursue active surveillance could instead be based on gene signatures determined by biological tests performed after FNAB. This approach would be safer than assessments based on clinical and pathological characteristics alone, since patients with a high gene risk score but low-risk clinical features would be treated more rationally.
The incidence of PTC has increased rapidly in recent years, and PTC patients with cervical LNM are usually at high risk of recurrence and have a poorer prognosis[36]. LNM is therefore a significant cause of locally advanced and recurrent disease, which motivated us to focus on differentially expressed MTGs derived from HCMDB, a database that annotates about 2,000 potential MTGs based on more than 7,000 published articles. We identified 33 candidate DE-MTGs and then reduced these to 14 DE-MTGs that were related to PFI in PTC. A novel 14-gene signature was then established and shown to be an independent prognostic factor in PTC. High-risk patients had a significantly shorter PFI than low-risk patients. Among the 14 genes, CRABP2, EZH2, KISS1R, and S100A4 were upregulated and associated with shorter PFI (HR > 1), whereas AGTR1, ALDH1A1, DEPTOR, FAM3B, FBLN5, LIFR, SDPR, SOD3, and TFF3 were downregulated and associated with better PFI (HR < 1). Several of the identified genes have previously been shown experimentally to be associated with PTC progression. For example, extracellular S100A4 mediates human TC cell migration through the RAGE/Dia-1 signaling system[37]; overexpression of cancer stem cell markers, including ALDH1A1, in PTC was associated with a shorter PFI during follow-up[38]; and ablation of estrogen receptor β suppresses PTC tumor growth, while the estrogen receptor β-H19 positive feedback loop plays an influential role in PTC stem cell preservation[39]. Thus, these genes have the potential to predict metastasis and recurrence in PTC. GO enrichment analysis showed that DE-MTGs were enriched in cell adhesion, cellular matrix organization, and cell-substrate junction, consistent with the definition of MTGs as genes that have been shown to be associated with cancer metastasis.
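To illustrate how a gene signature of this kind is typically turned into a per-patient risk group, the following minimal sketch assumes that the risk score is a linear combination of gene expression values weighted by Cox regression coefficients and that patients are split at the cohort median; both are common conventions and are assumptions here. The coefficients, expression values, and patient labels are hypothetical placeholders, not results of this study, and only four of the signature genes are shown for brevity.

```python
from statistics import median

# Hypothetical Cox coefficients (log hazard ratios), NOT the values from this study.
# Signs follow the text: HR > 1 gives a positive weight, HR < 1 a negative weight.
HYPOTHETICAL_COEFS = {
    "S100A4": 0.30,
    "EZH2": 0.25,
    "TFF3": -0.20,
    "SOD3": -0.15,
}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Return the sum of coefficient * expression over the signature genes."""
    return sum(beta * expression[gene] for gene, beta in coefs.items())


def stratify(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up normalized expression values for three illustrative patients.
patients = {
    "P1": {"S100A4": 5.0, "EZH2": 3.0, "TFF3": 1.0, "SOD3": 1.0},
    "P2": {"S100A4": 1.0, "EZH2": 1.0, "TFF3": 6.0, "SOD3": 4.0},
    "P3": {"S100A4": 3.0, "EZH2": 2.0, "TFF3": 3.0, "SOD3": 2.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)
```

In this toy cohort, the patient with high expression of the HR > 1 genes and low expression of the HR < 1 genes receives the largest score and falls into the high-risk group.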
To our knowledge, patients with ATC and PDTC have a mean survival after diagnosis of only 0.5 and 3.2 years, respectively, and dedifferentiation is a major reason for their high degree of malignancy[40]. The significantly higher gene risk scores in ATC samples partly confirm our conjecture. In addition, we explored the potential molecular alterations associated with the 14-gene signature using GSEA. Because GSEA considers the contribution of all differentially expressed genes, it can reveal the complex behavior of genes in health and disease more accurately, whereas traditional strategies such as KEGG or GO focus on identifying individual genes that exhibit differences[41]. Multiple alterations of gene expression in the high-risk group were involved in tumor biology processes, such as homologous recombination, the cell cycle, and the p53 signaling pathway. These findings may help elucidate the mechanisms underlying the poor PFI of patients in the high-risk group. However, further exploration is needed.
Nomograms are widely used because of their ability to present the numerical probability of a particular clinical event by integrating prognostic variables[42]. Nomograms that include a risk score based on gene signatures together with clinicopathological parameters can predict prognosis after surgery more precisely. Moreover, numerical results are easier for patients to understand than the traditional staging system. As described above, traditional staging systems cannot provide an individualized risk, which is consistent with our finding that the novel nomogram outperformed ATA risk stratification in predicting PFI in PTC. To our knowledge, the prognostic gene signature based on these 14 MTGs and the corresponding nomogram have not been reported before. The limited number of genes makes the signature more practical and economically feasible than whole-genome sequencing.
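As a rough illustration of how a nomogram maps integrated variables to a numerical probability, the sketch below assumes an underlying Cox proportional hazards model, in which the probability of an event by time t is 1 - S0(t)^exp(lp), where lp is the weighted sum of covariates and S0(t) is the baseline event-free probability. The weights, the baseline value, and the covariate names are hypothetical and are not taken from this study.

```python
import math

# Hypothetical model weights and baseline 5-year event-free probability.
# These are placeholders for illustration, NOT estimates from this study.
HYPOTHETICAL_WEIGHTS = {"gene_risk_score": 0.8, "clinical_factor": 0.4}
HYPOTHETICAL_S0_5YR = 0.90


def linear_predictor(covariates, weights=HYPOTHETICAL_WEIGHTS):
    """Return the weighted sum of covariate values (the Cox linear predictor)."""
    return sum(weights[name] * value for name, value in covariates.items())


def event_probability(lp, baseline_survival):
    """Return P(event by time t) = 1 - S0(t) ** exp(lp) under a Cox model."""
    return 1.0 - baseline_survival ** math.exp(lp)


# Two illustrative patients: a reference patient and a higher-risk patient.
low = event_probability(
    linear_predictor({"gene_risk_score": 0.0, "clinical_factor": 0}),
    HYPOTHETICAL_S0_5YR,
)
high = event_probability(
    linear_predictor({"gene_risk_score": 1.5, "clinical_factor": 1}),
    HYPOTHETICAL_S0_5YR,
)
```

A patient whose covariates are all zero receives exactly the baseline event probability, and larger linear predictors yield higher individualized probabilities, which is the per-patient output that a stratified staging system does not provide.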
Our study has several limitations. First, the primary source of RNA sequencing data and clinical information was the TCGA program, in which the samples were obtained from North American patients. When applying the model to patients from other countries or regions, possible deviations or biases should be taken into account. Second, owing to the lack of a large independent PTC dataset, we validated the nomogram's performance on the TCGA dataset itself. Future validation on external datasets with complete follow-up data is necessary. Furthermore, some essential clinical information, such as N stage, was missing or uncertain (Nx), which would attenuate the predictive power of the prognostic model. Finally, we compared the nomogram with the 2009 ATA risk guideline to evaluate its predictive power, since the available TCGA data did not include the latest ATA risk stratification. Further comparison with the newest ATA risk system is required to validate the nomogram's efficacy.