3.1 Expression and correlation of pre-mRNA 3′ end processing factors in pan-cancer tissues
We first surveyed the expression of pre-mRNA 3′ end processing factors across 33 cancer types (Fig. 1A). Several genes in this family, including CPSF1, CPSF3, CPSF4, NUDT21, PAPOLA, SYMPK, CSTF2, CSTF3, CSTF2T, and CSTF1, showed elevated expression across these cancers. We then examined the pairwise relationships among the genes encoding these factors (Fig. 1B) and found that the expression profiles of most of them were positively correlated. We next examined the expression of each factor within the 33 cancer types (Fig. 1C). Notably, CPSF6 was highly expressed in CHOL, whereas CSTF2T expression was low in pan-cancer tissues, especially in KICH (Fig. 1C).
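The pairwise analysis behind a figure such as Fig. 1B can be illustrated with a minimal sketch. The text does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the function names and toy expression values are hypothetical and not taken from the study data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("samples must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(expression):
    """Map each gene pair to its Pearson coefficient (nested dict)."""
    genes = sorted(expression)
    return {g: {h: pearson(expression[g], expression[h]) for h in genes}
            for g in genes}

# Hypothetical expression values for five samples (illustration only).
toy_expression = {
    "CPSF1": [5.1, 6.3, 7.0, 8.2, 9.1],
    "CPSF3": [4.0, 5.2, 5.9, 7.1, 8.3],
    "CSTF2T": [9.0, 8.1, 7.2, 6.0, 5.5],
}
corr = correlation_matrix(toy_expression)
```

In this toy input, CPSF1 and CPSF3 rise together and therefore correlate positively, while CSTF2T falls as CPSF1 rises and correlates negatively; the real matrix would be computed over all factors and all tumor samples.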
We next used RNA sequencing data from the TCGA database, processed in R, to assess the differential expression of pre-mRNA 3′ end processing factors across cancer types. CPSF2 expression was increased in several cancers, such as BRCA, BLCA, and CHOL, but decreased in KIRC (Fig. 2A). CPSF3 expression was elevated in numerous cancers but reduced in kidney chromophobe (KICH) (Fig. 2B). CSTF2 and SYMPK showed elevated expression in many cancer types, including bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), and cholangiocarcinoma (CHOL) (Fig. 2C and 2D). The remaining genes, from CPSF4 to PCF11, showed varied expression patterns across cancers (Figures S1A–M).
To examine the expression of pre-mRNA 3′ end processing factors across cancer cell lines, we obtained data from the CCLE database and analyzed them statistically. The expression of NUDT21, CPSF1, and PAPOLA was positively correlated with a wide range of cancer cell lines, whereas the expression of PCF11, WDR33, CLP1, and CSTF2T was negatively associated with them (Fig. 3A). In specific cellular contexts, CPSF1, CPSF6, FIP1L1, and NUDT21 showed elevated expression in cell lines derived from breast and other tissues, whereas CLP1, CSTF2T, PCF11, and WDR33 showed reduced expression in these cell lines (Fig. 3B). We also observed significant mutation events in specific genes across distinct cancer types (Fig. 3C).
3.2 Prognostic value of pre-mRNA 3′ end processing factors in pan-cancer
We next examined the prognostic relevance of pre-mRNA 3′ end processing factors across cancers. Using Cox regression analysis (Fig. 4A), we assessed the prognostic risk associated with these genes in a pan-cancer setting. Cox analysis results for the other pre-mRNA 3′ end processing factors are shown in Figure S2.
To assess the prognostic significance of differentially expressed pre-mRNA 3′ end processing factors in patients with various tumors, we used Kaplan-Meier survival curves, which relate the expression of individual factors to clinical outcomes. Overall, elevated expression of pre-mRNA 3′ end processing factors was associated with improved patient survival, whereas reduced expression was associated with poorer survival (Fig. 5).
The direction of the association varied by gene and cancer type. CPSF2 was associated with adverse outcomes in ACC, LIHC, and UVM but was protective in LGG (Fig. 5A). CPSF3 was detrimental in ACC, KIRC, LIHC, and MESO but beneficial in THYM (Fig. 5B). Similarly, CSTF2 was associated with unfavorable outcomes in LAML and LIHC (Fig. 5C). SYMPK was detrimental in ACC and KIRC but protective in UVM and PAAD (Fig. 5D). The remaining factors (CPSF1, CPSF4, NUDT21, CPSF6, CSTF1, CSTF3, CSTF2T, CLP1, WDR33, FIP1L1, RBBP6, PAPOLA, and PCF11) likewise showed prognostic associations that varied across cancer types (Figures S3A–M).
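As an illustration of how the Kaplan-Meier comparisons above can be computed, the following minimal sketch splits patients into high- and low-expression groups and estimates a survival curve for each. The text does not state how patients were grouped, so the median split is an assumption, and the function names and toy cohort are hypothetical.

```python
import statistics

def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, survival) at each event time.

    events[i] is 1 if the outcome was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def split_by_median(expression, times, events):
    """Split a cohort into (high, low) groups at the median expression."""
    cutoff = statistics.median(expression)
    high = ([], [])
    low = ([], [])
    for value, t, e in zip(expression, times, events):
        group = high if value > cutoff else low
        group[0].append(t)
        group[1].append(e)
    return high, low

# Hypothetical cohort: expression, follow-up in months, event indicator.
expression = [7.9, 3.1, 8.4, 2.5, 6.8, 3.9]
times = [30, 5, 42, 8, 25, 12]
events = [0, 1, 1, 1, 0, 1]
high, low = split_by_median(expression, times, events)
high_curve = kaplan_meier(*high)
low_curve = kaplan_meier(*low)
```

A full analysis would additionally compare the two curves with a log-rank test, which is omitted here for brevity.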
In the CCLE cancer cell line data, NUDT21, CPSF1, and PAPOLA showed positive correlations with several cancer cell lines, whereas PCF11, WDR33, CLP1, and CSTF2T showed negative associations. In specific cellular contexts, certain genes showed elevated or reduced expression, which may influence prognosis in different ways. In summary, these findings provide an overview of the prognostic implications of pre-mRNA 3′ end processing factors across cancers and may inform future research and potential therapeutic interventions.
3.3 Association of pre-mRNA 3′ end processing factors with TME and stemness score in pan-cancer tissues
The tumor microenvironment (TME) plays a crucial role in driving cancer cell diversity, enhancing drug resistance, and promoting cancer progression and metastasis. Our previous research confirmed the predictive potential of pre-mRNA 3′ end processing factors across various cancers. It is therefore important to understand the relationship between the expression of pre-mRNA 3′ end processing factors and the TME in pan-cancer tissues.
Using the ESTIMATE algorithm, we calculated immune and stromal scores across pan-cancer tissues, as shown in Fig. 6. Notably, both scores were strongly and positively correlated with the expression of CSTF2T and CLP1 (Figs. 6A and 6B). In addition, the expression of pre-mRNA 3′ end processing factors showed significant positive or negative correlations with RNA stemness signatures (Fig. 6C) and DNA stemness signatures (Fig. 6D). Correlation coefficients and p-values for all pre-mRNA 3′ end processing factors in pan-cancer tissues are provided in Table S5.
We further explored the relationship between the expression of pre-mRNA 3′ end processing factors and the immune, stromal, ESTIMATE, and stemness scores in specific cancers, including KIRC and LIHC (Figs. 8–9). In BLCA and COAD, pre-mRNA 3′ end processing factors showed significant correlations with the TME, as well as with DNAss and RNAss (Figures S4–6). Together, these findings indicate a close connection between pre-mRNA 3′ end processing factors and the TME.
3.4 Association of pre-mRNA 3′ end processing factors with immune subtypes in pan-cancer tissues
Previous research identified six unique immune subtypes, labeled C1–C6 [29], through an in-depth immunogenomic analysis. These subtypes have shown significant correlations with prognosis and genetic and immunomodulatory changes in tumors. Building on this, we explored the relationship between pre-mRNA 3′ end processing factors and these immune subtypes.
Distinct expression patterns of pre-mRNA 3′ end processing factors were observed across pan-cancer tissues (Fig. 7A). In particular, CPSF4, CPSF2, CPSF3, PCF11, CLP1, and CSTF2 showed marked differential expression in bladder urothelial carcinoma (BLCA) (Fig. 7B). In breast invasive carcinoma (BRCA), a range of pre-mRNA 3′ end processing factors, including CPSF1, WDR33, FIP1L1, and others, showed significant variation in expression (Fig. 7C). In liver hepatocellular carcinoma (LIHC), genes such as CPSF1, WDR33, FIP1L1, and CPSF4, among others, showed notable differences in expression, with CPSF1 being especially elevated (Fig. 7D). Lastly, in kidney renal clear cell carcinoma (KIRC), distinct expression levels were observed for WDR33, CPSF3, NUDT21, and several other genes (Fig. 7E). Together, these findings point to a close relationship between pre-mRNA 3′ end processing factors and immune subtypes that merits further study.
3.5 Association of pre-mRNA 3′ end processing factors with pan-cancer drug sensitivity gene therapy treatments
To investigate the relationship between the expression of pre-mRNA 3′ end processing factors and the sensitivity of human cancer cell lines to different drugs, as recorded in the CellMiner™ database, we conducted a correlation analysis. The expression profiles of these cell lines and their associated drug sensitivities are provided in Table S6. The 17 pre-mRNA 3′ end processing factors are summarized in Fig. 8 and Table S6, each showing a distinct association with certain drugs. NUDT21 expression was positively correlated with sensitivity to pyrazoloacridine, amonafide, and chelerythrine (Fig. 10A, D, G) and negatively correlated with sensitivity to okadaic acid (Fig. 10L). FIP1L1 was positively correlated with chelerythrine sensitivity (Fig. 10B), and CSTF2T was positively correlated with nelarabine sensitivity (Fig. 10C). CPSF6 was positively correlated with sensitivity to both chelerythrine and nelarabine (Fig. 10E, M), while CPSF1 was positively associated with sensitivity to fludarabine and cladribine (Fig. 10G, H). Similarly, CPSF3 was positively correlated with chelerythrine sensitivity (Fig. 10I), and PCF11 was positively associated with sensitivity to nelarabine, chelerythrine, and PX-316 (Fig. 10J, O, P). Finally, RBBP6 was positively correlated with sensitivity to both chelerythrine and PX-316 (Fig. 10K, N). Considering the differential expression of pre-mRNA 3′ end processing factors in tumor tissues compared with adjacent non-tumor tissues, together with their RNAss and DNAss profiles and their prognostic significance, we identified CPSF2, CPSF3, CSTF2, and SYMPK as the most notable members of this family.
3.6 Association of pre-mRNA 3′ end processing factors with pan-cancer immune microenvironment
Tumor mutational burden (TMB) has been recognized as a valuable biomarker for predicting the outcomes of immunotherapy. Importantly, a higher TMB suggests increased effectiveness of tumor immunotherapies [30–31]. Moreover, microsatellite instability (MSI) has been linked to tumor progression [32–33]. Given these insights, we obtained TMB and MSI data from the TCGA database to explore the relationship between TMB/MSI and the expression of CPSF2, CPSF3, CSTF2, and SYMPK (Fig. 11A, B, C, D). A significant correlation was observed between CPSF2 expression and TMB in various cancers, including LUAD, LUSC, STAD, THYM, and UCEC (Fig. 11A). Similarly, CPSF3 expression was significantly associated with TMB in cancers such as BLCA, BRCA, and HNSC (Fig. 11B). A significant relationship was also noted between CSTF2 expression and TMB in cancers such as BLCA, LGG, LUAD, SARC, SKCM, STAD, and UCEC (Fig. 11C). In addition, CPSF2 expression was strongly correlated with MSI in several cancers, including BRCA, COAD, DLBC, HNSC, OV, PRAD, SKCM, THCA, and UCEC (Fig. 11E). CSTF2 expression also correlated significantly with MSI in cancers such as ACC, BRCA, COAD, KIRC, and UCEC (Fig. 11G). Similarly, SYMPK expression was significantly associated with MSI in cancers such as BRCA, HNSC, LIHC, LUAD, LUSC, PRAD, and READ (Fig. 11H).
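The expression-versus-TMB and expression-versus-MSI comparisons above amount to a per-cancer correlation between two per-sample quantities. The text does not name the coefficient used; the sketch below assumes a Spearman rank correlation, which is a common choice for skewed quantities such as TMB. The function names and toy values are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-sample gene expression and TMB within one cancer type.
gene_expression = [2.1, 3.4, 1.0, 5.6, 4.2]
tmb = [1.5, 2.0, 0.7, 9.8, 3.3]
rho = spearman(gene_expression, tmb)
```

Because the toy TMB values increase monotonically with expression, the coefficient here is at its maximum; in practice the same calculation would be repeated for each gene and cancer type, with a significance test applied to each coefficient.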