Objectives
This study aimed to apply a new machine-learning framework, based on max-linear competing risk factor models, to identify a parsimonious set of differentially expressed genes (DEGs) that play a pivotal role in the development of colorectal cancer (CRC).
Methods
Transcriptome data from six public datasets were analyzed, and a new Chinese cohort was collected to validate the findings.
Results
The study identified a set of four critical DEGs (CXCL8, PSMC2, APP, and SLC20A1) that detect CRC with high accuracy across diverse populations and ethnicities. Notably, PSMC2 and CXCL8 appear to play a central role in CRC, and CXCL8 alone could potentially serve as an early-stage marker for CRC.
Conclusions
This work represents a pioneering effort in applying the max-linear competing risk factor model to identify critical genes for human malignancies. The reproducibility of the results across diverse populations suggests that the four identified DEGs can provide a comprehensive description of the transcriptomic features of CRC. Practical implications include the potential for personalized risk assessment and tailored treatment plans for patients.