Background Functional disruption of susceptibility genes by large germline genomic structural variants is a known risk factor for cancer. Few studies have used deletion structural variants (DSVs) to predict cancer risk with neural networks or examined the relationship between DSVs and immune gene expression to stratify prognosis.
Methods Whole-genome sequencing (WGS) data from the blood samples of 192 cancer and 499 noncancer subjects, with or without a family cancer history (FCH), were analyzed. Ninety-nine colorectal cancer (CRC) patients also had immune response gene expression data. To build the cancer risk prediction model and identify DSVs in familial cancer, we used joint calling tools and an attention-weighted model. A survival support vector machine (survival-SVM) was used to select prognostic DSVs.
Results We identified 671 DSVs that could predict cancer risk. The area under the receiver operating characteristic (ROC) curve (AUC) of the attention-weighted model was 0.71. The three DSV genes most frequently observed in cancer patients were ADCY9, AURKAPS1, and RAB3GAP2 (P < 0.05). We identified 65 immune-associated DSV markers for assessing cancer prognosis (P < 0.05). According to the STRING database, the functional protein of the MUC4 DSV gene interacted with MAGE1 expression. The causal inference model showed that deleting the CEP72 DSV gene could affect the recurrence-free survival (RFS) associated with IFIT1 expression.
Conclusions We established an explainable attention-weighted model for cancer risk prediction and used a survival-SVM for prognostic stratification based on DSV and immune gene expression datasets. This approach can describe the genetic landscape of cancer patients and help predict clinical outcomes.
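For readers less familiar with the AUC reported in the Results, the following minimal sketch shows one standard way to compute it: as the fraction of (cancer, noncancer) pairs in which the cancer subject receives the higher risk score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not the study's data or the authors' implementation.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the probability that a randomly chosen
    positive (label 1) scores higher than a randomly chosen negative
    (label 0); tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = cancer, 0 = noncancer; scores are made-up
# model outputs used only to exercise the function.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported value of 0.71 means that, under this pairwise reading, the model ranks a cancer subject above a noncancer subject in roughly 71% of such pairs.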