Landscape of N(4)-acetylcytidine modified genes in triple-negative breast cancer Revealing the potential markers of N(4)-acetylcytidine through acRIP-seq in triple-negative breast cancer

doi:10.21203/rs.3.rs-1633136/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Background: Triple-negative breast cancer (TNBC) still lacks effective treatments, and understanding the causes of its tumorigenesis and progression is helpful for prognostic evaluation. N(4)-acetylcytidine (ac4C) is an epigenetic modification of RNA whose regulatory mechanism in TNBC is still unclear, and it may provide a new perspective for understanding the development of TNBC.

Methods: Two pairs of TNBC patient tissues were sequenced genome-wide using acRIP-seq, and differential ac4C genes were identified by Robust Rank Aggregation (RRA). Then, by integrating genome-wide lncRNA expression profiles from TCGA, we constructed a co-expression network containing 180 ac4C genes and mined 1080 lncRNA-gene relationship pairs. Cox regression analysis was used to screen genes associated with prognosis. Additionally, drug sensitivity was predicted using a drug-target network of differential ac4C genes.

Results: We screened a total of 180 differential ac4C genes, of which 6 were significantly associated with prognosis. In addition, we screened out three lncRNAs interacting with prognostic risk genes; the corresponding interaction pairs were "LINC01614-COL3A1", "OIP5-AS1-USP8" and "RP5-908M14.9-TRIR". The significant ac4C genes COL3A1, USP8 and TRIR accurately predicted sensitivity to Doxorubicin and Paclitaxel, the main chemotherapy drugs in TNBC.

Conclusions: This study suggests that RNA ac4C acetylation may be a potential marker of prognosis and drug sensitivity/target through the involvement of lncRNA. These results will be used as a guide for experiments.

N(4)-acetylcytidine (ac4C)

TNBC

acRIP-seq

prognostic risk marker

drug target prediction

Breast cancer (BC) is a complex heterogeneous disease with an increasing incidence [1]. TNBC is an aggressive form of BC in which cells do not express the estrogen receptor (ER), progesterone receptor (PR) or HER2, and for which current treatments are largely ineffective. Several molecular targets are being explored for TNBC, including the androgen receptor, epidermal growth factor receptor (EGFR), poly (ADP-ribose) polymerase (PARP), and vascular endothelial growth factor (VEGF) [2]. TNBC accounts for approximately 15–20% of BC. In contrast to hormone receptor-positive or HER2-positive disease, TNBC has a highly aggressive clinical course, earlier age of onset, greater metastatic potential, and worse clinical outcome, as evidenced by higher recurrence rates and lower survival rates [3].

N4-acetylcytidine (ac4C) is generally considered a conserved, chemically modified nucleoside. However, recent studies have found extensive ac4C modifications in human and yeast mRNA that contribute to translation efficiency and mRNA stability [4]. Previously, studies by Thomale et al. [5] and Liebich et al. [6] found a significant increase in modified nucleosides (including ac4C) in the urine of tumor-bearing mice and cancer patients. Most importantly, recent studies have shown that modified nucleosides such as ac4C frequently exhibit significant changes in urine samples from patients with multiple cancers [7–9]. Compared with the control group, ac4C in urine samples of BC patients was significantly increased, and the area under the receiver operating characteristic (ROC) curve of ac4C in BC diagnosis was 0.825 [9]. Several other studies have also shown that ac4C levels are associated with inflammatory responses [10, 11]. These findings suggest that ac4C can be used as a potential biomarker for cancer and is important in the diagnosis and treatment of cancer.

NAT10 (also known as hALP, human N-acetyltransferase-like protein), a protein with histone acetylation activity, is the first acetylation regulator shown to maintain efficient translation and stabilize mRNA by forming ac4C on mRNA [12]. Duan et al. showed for the first time that NAT10 is associated with cancer, demonstrating that it significantly promotes cell growth in epithelial ovarian cancer and BC [13, 14]; in addition, NAT10 has a potential role in increasing melanogenesis and melanoma growth [15]. These findings suggest a multifaceted functional role for NAT10 in cancer. Nine of the thirty-three cancer types in the TCGA database were found to show a significant relationship between NAT10 expression levels and OS, seven showed a significant association with PFS, five with DFS, and seven with DSS. Overall, NAT10 expression was an independent risk factor for poor prognosis in these cancers [16].

Long non-coding RNAs (lncRNAs) are endogenous single-stranded RNAs that do not encode proteins and have transcript lengths ranging from 200 bp to 1000 kb [17]. LncRNAs can regulate DNA methylation, histone modification, and chromosome remodeling through epigenetic, transcriptional, and post-transcriptional regulation [18]. As a new class of biomarkers, lncRNAs have been widely studied in the pathogenesis, diagnosis and drug screening of various diseases [19]. Their dysregulation and abnormal expression patterns lead to various cancer types, including TNBC [20, 21]. LncRNA SNHG14 induces BC resistance to trastuzumab through H3K27 acetylation-mediated regulation of PABPC1 expression; LncRNA GHSROS significantly promotes the growth and migration of BC; LncRNA NONHSAT101069 regulates Twist1 by targeting microRNA miR-129-5p to induce epirubicin resistance and promote BC cell migration and invasion [22]. In addition, in TNBC studies, researchers found that a novel lncRNA named RP11-22N19.2 was highly overexpressed in TNBC compared with non-TNBC tissues, and overexpression of RP11-22N19.2 predicted poor overall survival (OS) and recurrence-free survival (RFS) in TNBC [23]. In conclusion, lncRNAs play a key role in the development of breast cancer.

The absence of key biomarkers and targeted drugs greatly affects the prognosis of TNBC patients. As one of the modifications on human mRNA, ac4C plays a key role in transcription and translation. Abnormality of its regulator, the NAT10 protein, is also closely related to the occurrence and prognosis of various cancers. In this study, we combined the TCGA database and acRIP-seq sequencing technology to mine the lncRNAs involved in RNA ac4C acetylation in the development of TNBC. These lncRNAs are significantly related to the prognosis of TNBC patients and can help in mining TNBC-specific drug targets and predicting chemosensitivity.

ac4C detection and quantification of clinical samples

This study was approved by the Ethics Committee of Harbin Medical University, and written informed consent was obtained from all participants prior to inclusion. Tumor and paracancerous samples were obtained surgically from two patients with TNBC at Harbin Medical University Cancer Hospital. Both were first-time patients who had received no prior neoadjuvant or other therapy.

After adding 1 mL of TRIzol™ Reagent per 50-100 mg of tissue sample, the tissue samples were ground by high-frequency reciprocating vibration, impact, and shear of grinding beads in a cryogenic grinding environment, and total RNA was then isolated and purified by the phenol-chloroform method. DNA was digested from 150 μg of total RNA, and the RNA was fragmented using the Epi™ ac4C immunoprecipitation kit according to the manufacturer's instructions. The Zymo RNA Clean and Concentrator-25 kit was used to purify and recover the fragmented RNA. An anti-N4-acetylcytidine (ac4C) antibody was used for immunoprecipitation of ac4C modification sites on RNA, and the immunoprecipitated RNA was recovered with the HiPure cell miRNA Kit after high-salt washing. The library was prepared using the Epi™ mini longRNA-seq kit, and the Bioptic Qsep100 Analyzer was used to check whether the library size distribution conformed to the theoretical size. Finally, the TNBC and paracancerous samples were sequenced on the NovaSeq high-throughput sequencing platform in PE150 mode.

Analysis of acRIP-seq sequencing and peak calling

The total RNA before and after treatment was sequenced by acRIP-seq, and the sequence data were obtained by base calling and error filtering. The quality of the sequencing data was analyzed with FastQC to obtain the sequencing quality distribution, base content distribution, proportion of duplicated reads, etc. Clean data for subsequent analysis were obtained by adapter trimming and quality control of the raw data. The clean reads were then aligned to the reference genome of the sample species using HISAT2 to obtain uniquely mapped reads for the next step of analysis. Peak calling was performed with exomePeak, an exon-based software package for peak calling and differential RNA modification analysis. Finally, we used HOMER (http://homer.ucsd.edu/homer/ngs/peakMotifs.html) to perform ac4C motif analysis on the peaks.

Difference analysis of ac4C modification

By finding differences between samples, or between disease (or treatment) groups and controls, we can obtain the differential ac4C modification levels for a particular disease (or treatment condition) and thus explain the role of the epitranscriptome in disease onset and progression at the ac4C level. In this paper, we used the exomePeak R package to reveal the dynamics of post-transcriptional RNA acetylation and to report the sites of differential acetylation under two experimental conditions. The differential analysis was performed in two steps: first, peaks were detected; then, a statistical test was used to determine whether these sites were differentially acetylated between the two experimental conditions, and the fold change was used to quantify the degree of difference. Genes that were significantly differentially acetylated were screened for subsequent analysis using p < 0.05 as the criterion.

Functional enrichment analysis

In organisms, different genes coordinate with each other to perform their biological functions. Functional enrichment analysis was therefore carried out on the ac4C genes. GO (Gene Ontology) is a comprehensive database describing gene functions, divided into three parts: molecular functions, biological processes and cellular components. KEGG is a comprehensive database integrating genomic, chemical and systemic functional information. Significant pathway enrichment allows the identification of the most important biochemical metabolic pathways and signal transduction pathways in which candidate target genes are involved. The ac4C differential genes were functionally annotated at the GO and KEGG levels based on the DAVID database (https://david.ncifcrf.gov/), and the significance level (P-value) of each GO term was calculated using Fisher's exact test to screen out the pathways and functions with padj < 0.05.
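The enrichment statistic described above can be sketched in Python. This is a minimal illustration, not the DAVID implementation: the function names are hypothetical, and the Benjamini-Hochberg adjustment is an assumption, since the text does not state which correction produced padj.

```python
from math import comb


def enrichment_p_value(hits_in_term, list_size, term_size, background_size):
    """One-sided Fisher's exact (hypergeometric) p-value for over-representation.

    Probability of drawing at least `hits_in_term` annotated genes when
    `list_size` genes are sampled from a background of `background_size`
    genes, of which `term_size` carry the annotation.
    """
    max_hits = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits_in_term, max_hits + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Adjusted p-values (assumed correction; the source only says 'padj')."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be kept when its adjusted value falls below 0.05, matching the threshold stated in the text.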

Expression and immune infiltration of differentially expressed genes affected by NAT10 in TCGA breast cancer cohort

To further explore how NAT10 influences the development of TNBC, we obtained RNA-seq data for TNBC from the TCGA database (https://portal.gdc.cancer.gov/). Gene expression information from 116 TNBC patient samples was included, and the limma package [24] was used to screen for differential lncRNAs between TNBC with high NAT10 expression and TNBC with low NAT10 expression (p<0.05&|FC|>2). Meanwhile, we assessed the proportion of immune cell infiltration in TCGA TNBC samples using ImmuCellAI (http://bioinfo.life.hust.edu.cn/ImmuCellAI) in order to explore the differences in immune cell infiltration between the different clusters.

Merging of differential ac4C genes

In this study, acRIP-seq data from different platforms were integrated using the robust rank aggregation (RRA) method to obtain the integrated differential ac4C genes. The RRA method uses a probabilistic model for aggregation, which is robust to noise and gives a significance probability for every gene in the final ranking. Robust Rank Aggregation is an R package that combines the differential analysis results from different platforms with the RRA algorithm to obtain a comprehensive significance ranking. The sets of differential genes from the different platforms are intersected while also taking their ranks into account. Overall, genes that showed differences in multiple datasets and ranked high in each were selected as the merged result. We finally selected genes with significant differences (p<0.05) in the integrated dataset, together with genes differentially modified in both sequencing runs, as the final set of ac4C differential genes.
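The idea behind rank aggregation can be illustrated with a short Python sketch. It follows the general RRA recipe (order statistics of normalized ranks compared against a uniform null, with a correction for taking the minimum), but it is a simplified assumption-laden version rather than the R package itself; the function names, the handling of missing genes, and the toy inputs are hypothetical.

```python
from math import comb


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p).

    This equals the probability that the k-th smallest of n independent
    uniform ranks is at most p.
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def rra_score(normalized_ranks):
    """Score one gene from its normalized ranks (rank / list length) per list."""
    n = len(normalized_ranks)
    ordered = sorted(normalized_ranks)
    best = min(
        binomial_upper_tail(n, k, r) for k, r in enumerate(ordered, start=1)
    )
    # Multiply by n to correct for choosing the minimum, capped at 1.
    return min(1.0, best * n)


def aggregate(ranked_lists):
    """Aggregate several ranked gene lists; smaller scores are more consistent.

    Assumption: a gene missing from a list is given the worst rank (1.0).
    """
    genes = set().union(*ranked_lists)
    scores = {}
    for gene in genes:
        ranks = [
            (lst.index(gene) + 1) / len(lst) if gene in lst else 1.0
            for lst in ranked_lists
        ]
        scores[gene] = rra_score(ranks)
    return sorted(scores.items(), key=lambda item: item[1])
```

With two input lists, as in this study, a gene only scores well when it ranks near the top of both, which is the behaviour the paragraph describes.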

Constructing a network to screen the ac4C-related prognostic risk lncRNAs

To screen ac4C-related prognostic risk lncRNAs, this study used Pearson correlation analysis to construct a co-expression network of lncRNAs and ac4C differential genes. In the TCGA database, lncRNAs with significantly different expression between the TNBC and normal groups were screened by t-test (p.adj<0.05). Pearson correlation analysis was performed on the expression levels of differential ac4C genes and differentially expressed lncRNAs in TNBC samples, and the co-expression network was constructed from the significant ac4C gene-lncRNA correlations (R>0.5, p<0.05). One-step neighbor lncRNAs of ac4C prognostic risk genes were mined in the co-expression network, and these lncRNAs were intersected with the list of prognostic risk lncRNAs affected by NAT10 to obtain ac4C-related prognostic risk lncRNAs, as well as their co-expressed ac4C genes.
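A minimal Python sketch of this network construction is given here for illustration. The function names and data layout (dictionaries mapping a gene or lncRNA name to its expression vector across the same samples) are assumptions, and only the correlation cutoff is applied; the p-value filter mentioned in the text is omitted.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length vectors (non-constant inputs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(gene_expr, lncrna_expr, r_cutoff=0.5):
    """Return (lncRNA, gene, r) triples whose correlation exceeds the cutoff."""
    edges = []
    for lnc, lnc_values in lncrna_expr.items():
        for gene, gene_values in gene_expr.items():
            r = pearson_r(lnc_values, gene_values)
            if r > r_cutoff:
                edges.append((lnc, gene, r))
    return edges


def one_step_neighbors(edges, genes_of_interest):
    """lncRNAs directly connected to any of the given genes in the network."""
    return {lnc for lnc, gene, _ in edges if gene in genes_of_interest}
```

Intersecting the result of `one_step_neighbors` with a second set of lncRNAs (for example with Python's `&` operator on sets) corresponds to the final intersection step in the paragraph.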

Prediction of drug targets and sensitivity

The data related to drugs and their drug targets were obtained from the DRUGBANK database, and the intersection was taken with ac4C differential genes to construct a drug-target network from which drug targets for prognostic risk ac4C differential genes were mined.

Given the lack of efficacious targeted drugs for TNBC, we used the ac4C-related lncRNA signature to perform drug sensitivity prediction with the R package "pRRophetic" [25]. In line with the latest TNBC dosing consensus, Doxorubicin and Paclitaxel were selected for drug sensitivity prediction in this study [26]. Drug response data were taken from the pharmacogenomics database "The Genomics of Drug Sensitivity in Cancer" (GDSC) (https://www.cancerrxgene.org/) [27]. The half-maximal inhibitory concentration (IC50) of TCGA samples was estimated by ridge regression against the GDSC training set [28]. Tenfold cross-validation was used to evaluate the prediction accuracy of the estimated IC50. The Mann-Whitney-Wilcoxon test was used to test whether the IC50 distributions of the high-risk and low-risk groups were identical.
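The group comparison at the end of this step can be sketched in Python as follows. This is an illustrative version of the Mann-Whitney U test using a normal approximation without tie correction in the variance; the function names are hypothetical, and the approximation is an assumption that may differ from the exact procedure used in the study.

```python
from math import erfc, sqrt


def average_ranks(values):
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U statistic for group_a, two-sided p-value, normal approximation)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u_a, p_value
```

Here `group_a` and `group_b` would hold the estimated IC50 values of the high-risk and low-risk samples, respectively.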

Assessment of NAT10 characteristics in TNBC based on TCGA database

RNA-seq data from breast cancer patients, comprising 1104 cancer tissue samples and 113 paracancer tissue samples, were obtained from the TCGA database, and the sequencing reads were re-annotated to NAT10. Comparing NAT10 expression in cancer tissues with that in paracancer tissues showed significantly higher expression in each PAM50 subtype, with the most pronounced difference between the basal-like subtype and paracancer tissues (Figure 1A, t-test, p<0.05); significantly higher NAT10 expression was also found in the 116 TNBC cases (Figure 1B, t-test, p<0.05). The present study therefore explored TNBC in more depth.

Given the correlation between tumorigenesis and the immune microenvironment, the relationship between NAT10 expression and immune cell infiltration in TNBC was investigated using the immune cell abundance prediction database ImmuCellAI [29] (http://bioinfo.life.hust.edu.cn/ImmuCellAI) to predict immune cell abundance in TNBC. TNBC samples were divided into high and low expression groups according to NAT10 expression. Box plots of immune cell infiltration abundance in the high and low expression groups showed that the infiltration abundance of immune cells such as iTreg, Th2, Th17, DC, B cells and monocytes in TNBC was related to NAT10 expression (Figure 1C, t-test, p<0.05). The abundance of iTreg, Th2 and DC cells was higher in the high NAT10 expression group, whereas the abundance of Th17, B cells and monocytes was higher in the low NAT10 expression group. These results suggest that NAT10 is involved in the pathogenesis of TNBC by affecting the tumor immune microenvironment.

The changes in mRNA and lncRNA expression between the high and low NAT10 expression groups in TNBC were also examined. Using the limma package, 731 significantly up-regulated mRNAs (Figure 1D) and 730 significantly up-regulated lncRNAs (Figure 1E) were screened (p<0.05&|FC|>2). Pearson correlation analysis of expression between the 730 differential lncRNAs and 731 differential mRNAs was performed, and 1157 relationship pairs containing 470 lncRNAs and 591 mRNAs were retained (r>0.5) (Figure 1F). These 470 lncRNAs were significantly differentially expressed between the high and low NAT10 expression groups in TNBC.

To find the most prognostically relevant lncRNA features, univariate and multivariate Cox regression analyses were carried out, incorporating age, gender, stage, TNM stage and other clinical factors of TNBC patients. Finally, 12 lncRNAs that were significantly associated with prognosis were screened (p<0.05). The sum of the products of their multivariate Cox risk coefficients and expression values was used as the prognostic risk score, the median risk score was used to divide the TNBC samples into two groups, and a log-rank test was performed between the high-risk and low-risk groups. The results showed that prognostic outcomes differed significantly between the high- and low-risk groups (Figure 1G, p=0.00092), and the high-risk group corresponded to a relatively poor prognosis. The above results suggest that lncRNAs are involved in the process by which NAT10 promotes acetylation.
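The risk-score construction described above amounts to a weighted sum followed by a median split, which can be sketched in Python as below. The function names, sample identifiers and coefficient values in any usage are hypothetical placeholders, and assigning samples exactly at the median to the low-risk group is an assumption.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of (Cox coefficient * expression value) over the signature features."""
    return sum(coefficients[name] * expression[name] for name in coefficients)


def split_by_median_risk(samples, coefficients):
    """Score every sample and label it 'high' or 'low' relative to the median.

    `samples` maps a sample ID to a dict of feature name -> expression value.
    Samples exactly at the median are labelled 'low' (an assumption).
    """
    scores = {
        sample_id: risk_score(expr, coefficients)
        for sample_id, expr in samples.items()
    }
    cutoff = median(scores.values())
    groups = {
        sample_id: ("high" if score > cutoff else "low")
        for sample_id, score in scores.items()
    }
    return scores, groups
```

The two resulting groups are the inputs to the log-rank comparison mentioned in the text.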

Screening of key ac4C-modified genes

Prof. Shalini Oberdoerffer of the NIH published a study in Cell [30] that revealed for the first time that a large number of ac4C modifications exist on mRNA and affect its stability and translation efficiency. To investigate the role of ac4C acetylation in TNBC, the genome-wide acetylation levels of cancer and paracancer tissues from two patients were examined by acRIP-seq. The cumulative distribution of ac4C-modified peaks across RNA structures was consistent between the two sequencing runs, with the enriched regions of ac4C modification in the disease and normal groups mainly concentrated in the CDS region (Figure 2A-B). Peak annotation analysis showed that the enriched regions of ac4C modification in cancer and paracancerous tissues from both patients were mainly concentrated in the CDS, 5'UTR and 3'UTR regions, with the least distribution in exonic regions (Supplementary Figure 1). Motif enrichment analysis was also performed on the ac4C-modified peaks from both sequencing runs using HOMER; the enrichment of each motif was determined by scanning all sequenced sequences, and its significance was calculated from the hypergeometric distribution. The top three predicted motifs, ordered by enrichment p-value, are shown (p<1e-15), and the results show that the motifs closely match the pattern "CXXCXXCXX..." (Figure 2C).

The acRIP-seq results obtained from the two experimental conditions were screened separately for genes with differential ac4C peaks. The first screening yielded a total of 350 differential ac4C genes, including 174 up-regulated genes and 176 down-regulated genes (Supplementary Figure 2). A total of 1242 differential ac4C genes were obtained in the second screening, of which 858 were up-regulated and 384 were down-regulated (Supplementary Figure 3). The significantly different ac4C genes from the two conditions were sorted by difference level (log2FC) and combined using the RRA tool. The genes with p<0.05 in the integrated dataset and those differentially modified in both sequencing runs were selected as the final set. 180 differential ac4C genes were eventually obtained, of which 30 showed significantly different ac4C-modified peaks relative to normal tissue in both patients, and more than half of these genes showed the same direction of change (Figure 2D). The mean values of these genes relative to the normal modification peaks in both patients, as well as the mean expression levels, are shown in Figure 2E. Further, the gene expression data from TCGA and the sequenced gene expression data were corrected for batch effects using ComBat in R. In the larger TCGA TNBC cohort, the expression profiles of the 180 differential ac4C genes showed the same trend (Supplementary Figure 4).

Biological function of the ac4C differential genes in TNBC

To explore the biological functions of the 180 differential ac4C genes, GO annotation based on the DAVID database was performed at three levels: biological process (BP), molecular function (MF), and cellular component (CC). Fisher's exact test was used to calculate the significance level (P-value<0.01) of each analysis. The results showed enrichment in BP terms such as mRNA transcription-translation processing and cell-cell adhesion, CC terms such as extracellular exosome and focal adhesion, and MF terms related to gene expression such as protein binding, RNA binding, and translation (Figure 3A). Pathway annotation of the screened differential ac4C genes was performed based on KEGG, and the significance level of each pathway was calculated using Fisher's exact test (P-value<0.05). The results showed that genes with altered ac4C modification were enriched in pathways such as spliceosome, RNA transport, lysosome and hepatitis B (Figure 3B).

Further, the expression levels of the 180 differential ac4C genes were combined with clinical factors such as age, gender, stage, and TNM stage of TNBC patients to perform univariate and multivariate Cox regression analyses and find the most prognostically relevant features. Six ac4C genes that were significantly associated with prognosis were finally screened (p<0.05): COL3A1, CYFIP1, SFN, SMOC2, TRIR, and USP8. Prognostic risk scores were obtained by summing the products of the multivariate Cox coefficients and gene expression values of the six prognostic risk ac4C genes. The median risk score was used to classify TNBC samples into high and low risk groups. A log-rank test between the two groups showed that prognosis differed significantly between the high and low risk groups (p<0.05) (Figure 3C), and the high risk group corresponded to a relatively poor prognosis. In addition, when the mean expression level of the six prognostic risk genes was used to classify the samples into high and low groups, the overall infiltration score of all immune cells differed between the two groups (Figure 3D) (p<0.05).
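The log-rank comparison between the high and low risk groups can be illustrated with a small Python sketch. It implements the standard two-group log-rank statistic with a chi-square test on one degree of freedom; the function name and input layout are hypothetical, and the study itself presumably relied on an established survival analysis package.

```python
from math import erfc, sqrt


def log_rank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    `times_*` are follow-up times; `events_*` are 1 for an observed death and
    0 for censoring. Returns (chi-square statistic, p-value with 1 df).
    Assumes at least one event time at which both groups are still at risk.
    """
    records = [(t, e, 0) for t, e in zip(times_a, events_a)]
    records += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in records if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [r for r in records if r[0] >= t]
        n = len(at_risk)
        n_a = sum(1 for r in at_risk if r[2] == 0)
        deaths = sum(1 for r in records if r[0] == t and r[1] == 1)
        deaths_a = sum(
            1 for r in records if r[0] == t and r[1] == 1 and r[2] == 0
        )
        # Expected deaths in group A under the null of equal hazards.
        observed_minus_expected += deaths_a - deaths * n_a / n
        if n > 1:
            variance += (
                deaths * (n_a / n) * (1 - n_a / n) * (n - deaths) / (n - 1)
            )

    chi_square = observed_minus_expected**2 / variance
    p_value = erfc(sqrt(chi_square / 2))  # chi-square survival function, 1 df
    return chi_square, p_value
```

A p-value below 0.05 from this test corresponds to the significant separation of survival curves reported for the two risk groups.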

The ac4C-related prognostic risk lncRNAs in TNBC

To explore TNBC-specific lncRNAs, t-tests were performed between the TNBC and normal groups in TCGA, yielding 2544 lncRNAs with significant differences in expression (p.adj<0.05). Pearson correlation analysis was then used to screen for significant correlations between the 180 ac4C differential genes and the 2544 differential lncRNAs (Pearson, r>0.5, p<0.05). A co-expression network (Figure 4A) was constructed, in which a total of 1080 lncRNA-gene relationship pairs were screened. The network included 436 lncRNAs that differed between TNBC and normal tissue and 116 ac4C differential genes. In particular, 5 of the 6 ac4C prognostic risk genes were found in the network, and 44 lncRNAs were co-expressed with them, including 3 NAT10-influenced prognostic risk lncRNAs: LINC01614, OIP5-AS1 and RP5-908M14.9 (Figure 4B). The one-step nearest neighbors of the three lncRNAs were extracted from the co-expression network (Figure 4C), and LINC01614-COL3A1, OIP5-AS1-USP8 and RP5-908M14.9-TRIR showed strong correlation in expression in TNBC. The expression levels of COL3A1 and TRIR showed an upregulation trend in TCGA, whereas USP8 showed a downregulation trend (Figure 4D). All 3 genes also showed differential acetylation, which was higher in cancer than in paracancerous tissues (Figure 4E). Analysis of the TCGA data also suggests that the NAT10 protein may regulate the expression of ac4C genes by regulating the expression of related lncRNAs, thereby affecting acetylation modification and participating in TNBC prognosis.

Acetylating ac4C genes predict drug sensitivity in TNBC

TNBC is insensitive to endocrine therapy because of ER and PR negativity, and insensitive to targeted therapy (Herceptin) because of HER-2 negativity. The main treatment option is chemotherapy, but TNBC responds poorly to conventional chemotherapy and readily develops resistance. It is therefore important to explore markers that can be used to predict drug sensitivity. To further determine whether differential ac4C genes could help predict effective drug targets in TNBC, the 180 differential ac4C genes were screened against the DRUGBANK database; 79 of them could be used as drug targets, corresponding to 35 drugs. Among them, USP8, one of the three prognostic risk ac4C genes obtained from the previous analysis, is a drug target, and its corresponding drug is Zn(II). Zn(II) was reported to be antiproliferative in human cancer cells [31], to have stronger cytotoxic activity than cisplatin in inducing morphological changes in breast cancer cell lines, and to be non-toxic to fibroblasts [32]. The network of differential ac4C genes and drug targets is shown in Figure 5A, in which the orange and purple nodes are the 79 differential ac4C genes, the green nodes are drug names, and the purple nodes are prognostic risk genes related to NAT10.

Furthermore, to explore how well the three screened prognostic risk ac4C genes regulated by the NAT10 protein (USP8, COL3A1 and TRIR) predict drug response, the sum of the products of the Cox risk coefficients of the 3 genes and their expression values was used as the prognostic risk score, and the median risk score was used to divide TNBC samples into high and low groups. Drug sensitivity prediction was performed using the R package "pRRophetic". According to the latest expert consensus, the drugs commonly used in TNBC are Doxorubicin and Paclitaxel (Figure 5B, 5E), and the IC50 data of the two drugs were obtained from the GDSC database. QQ plots were used to check whether the data for each drug met the assumptions of a linear model; both drugs did, so ridge regression could be used to predict the IC50 values of the samples. Next, ridge regression was used to estimate the half-maximal inhibitory concentration (IC50) of the TCGA samples, and tenfold cross-validation was used to assess the predictive accuracy of the estimated IC50. The Mann-Whitney-Wilcoxon test was used to test whether the IC50 distribution was the same in the high-risk and low-risk groups.

Doxorubicin is a highly effective broad-spectrum anticancer drug widely used in TNBC, an anthracycline antibiotic that intercalates into the DNA double helix and inhibits replication after strand unwinding. Sensitivity to Doxorubicin was significantly predicted in both the high and low risk groups, allowing particularly accurate estimation of IC50 values in patients (Figure 5C, 5D). Paclitaxel inhibits mitosis in cancer cells by inhibiting microtubule depolymerization, and the PD-L1 antibody atezolizumab in combination with albumin-paclitaxel was approved by the FDA as standard therapy for PD-L1-positive TNBC. In TNBC samples from TCGA, sensitivity to Paclitaxel was significantly predicted in both the high and low risk groups classified by the prognostic risk ac4C genes regulated by the NAT10 protein, and the IC50 value could be accurately estimated (Figure 5F, 5G). However, there was no significant difference in sensitivity to the two drugs between the high and low risk groups (Supplementary Figure 5), which also reflects the high risk of TNBC and the lack of effective targeted drugs. In particular, the lncRNAs (LINC01614, OIP5-AS1 and RP5-908M14.9) interacting with the three prognostic risk ac4C genes can also significantly predict drug sensitivity in TNBC patients, but with relatively low accuracy (Supplementary Figure 6).
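The combination of ridge regression and k-fold cross-validation can be sketched in plain Python as follows. This is not the pRRophetic implementation: it is a minimal closed-form ridge fit on centered data with a fixed penalty, all function names are hypothetical, and folds are assigned by sample index purely for simplicity.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ridge_fit(features, targets, penalty):
    """Closed-form ridge: w = (X^T X + penalty * I)^-1 X^T y on centered data."""
    n, p = len(features), len(features[0])
    x_mean = [sum(row[j] for row in features) / n for j in range(p)]
    y_mean = sum(targets) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in features]
    yc = [y - y_mean for y in targets]
    gram = [
        [sum(r[i] * r[j] for r in xc) + (penalty if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]
    rhs = [sum(r[i] * y for r, y in zip(xc, yc)) for i in range(p)]
    weights = solve_linear_system(gram, rhs)
    intercept = y_mean - sum(w * m for w, m in zip(weights, x_mean))
    return weights, intercept


def ridge_predict(model, row):
    weights, intercept = model
    return intercept + sum(w * v for w, v in zip(weights, row))


def k_fold_predictions(features, targets, penalty, k=10):
    """Out-of-fold predictions; sample i is held out in fold i % k."""
    n = len(targets)
    predictions = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = ridge_fit(
            [features[i] for i in train], [targets[i] for i in train], penalty
        )
        for i in range(n):
            if i % k == fold:
                predictions[i] = ridge_predict(model, features[i])
    return predictions
```

Comparing the out-of-fold predictions with the measured IC50 values, for example by correlation, gives one way to quantify the predictive accuracy referred to in the text.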

Acetylcytidine is an ancient, evolutionarily conserved modification. There is increasing evidence to support a strong link between acetylation imbalance and carcinogenesis. Mutations in the gene encoding acetyltransferase lead to the development of various solid tumors, such as colorectal cancer [33] and gastric cancer [34]. In this study, acRIP-seq technology was used to detect the genome-wide acetylation levels of cancer and paracancerous tissues from 2 patients, and 180 shared differentially acetylated genes were found, which showed similar expression differences in TNBC samples from TCGA. The functions of these ac4C genes are significantly enriched in biological processes (mRNA transcription, translation and intercellular adhesion) and molecular functions related to gene expression (protein binding, RNA binding and translation). Among them, intercellular adhesion is significantly associated with cancer metastasis [35]. Combined with TCGA expression and clinical data, 6 ac4C genes significantly associated with prognosis (p<0.05) were screened out of the 180 differentially acetylated genes, namely "COL3A1", "CYFIP1", "SFN", "SMOC2", "TRIR" and "USP8". Their expression levels significantly affected the survival time of TNBC patients.

NAT10 is an acetyltransferase belonging to the GCN5-associated N-acetyltransferase (GNAT) family [35]. It can acetylate target proteins, regulate the DNA damage response [36] and affect cancer development [37]. NAT10 catalyzes N4-acetylcytidine (ac4C) deposition on different RNA substrates and is involved in colon cancer invasion and metastasis [38]. Previous studies found that lncRNAs can be activated by acetylation and act as miRNA sponges in retinoblastoma to activate signaling pathways and induce cancer [39]. In this study, to explore the sponge function of acetylation genes in TNBC, 2544 lncRNAs with significantly different expression between the TNBC and normal groups were screened from the TCGA database. Using Pearson correlation analysis, a co-expression network of the 180 differential ac4C genes and the 2544 differentially expressed lncRNAs was constructed, containing 1080 lncRNA-gene relationship pairs. Three lncRNAs in the co-expression network were prognostic risk lncRNAs affected by NAT10, namely LINC01614, OIP5-AS1 and RP5-908M14.9. Their interaction pairs in the network were LINC01614-COL3A1, OIP5-AS1-USP8 and RP5-908M14.9-TRIR. COL3A1 was found to be significantly associated with breast cancer brain metastases in multiple studies [40, 41]. Inhibition of ubiquitin-specific protease 8 (USP8), a novel deubiquitylase of the Notch1 intracellular domain (NICD), downregulated the Notch signaling pathway and retarded the cellular growth and colony-forming ability of breast cancer cell lines [42]. The relevance of USP8 in patients with breast cancer has also been reported: high USP8 expression was correlated with better clinical features [43]. At present, there are few reports about the role of TRIR in breast cancer, but in melanoma TRIR has been found to inhibit angiogenesis and to be related to the activity of cancer cells [44].
This study found that NAT10-related acetylation genes play an important role in cancer, but their molecular mechanism in TNBC has not been fully confirmed. In the future, these genes may serve as markers for predicting the survival time of TNBC patients and as potential therapeutic targets for developing effective treatments.
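
The co-expression step described above, Pearson correlation between lncRNA and ac4C gene expression followed by selection of correlated pairs, can be sketched as follows. This is a minimal illustration only: the function names and the cut-off `min_abs_r` are assumptions, since the threshold actually used to obtain the 1080 pairs is not stated in this section.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length expression vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_a * var_b)


def coexpression_pairs(lnc_expr, gene_expr, min_abs_r=0.5):
    """Return (lncRNA, gene, r) for every pair with |r| >= min_abs_r.

    lnc_expr / gene_expr map a name to its expression values across the
    same ordered set of samples. min_abs_r is an assumed cut-off.
    """
    pairs = []
    for lnc, lnc_values in lnc_expr.items():
        for gene, gene_values in gene_expr.items():
            r = pearson_r(lnc_values, gene_values)
            if abs(r) >= min_abs_r:
                pairs.append((lnc, gene, r))
    return pairs
```

Each returned tuple corresponds to one edge of the lncRNA-gene network; in practice a significance filter on the correlation would usually be applied alongside the magnitude cut-off.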

Because TNBC is negative for ER, PR and HER-2, it is not sensitive to endocrine therapy or targeted therapy. It is also resistant to chemotherapy, so identifying markers that predict drug response is of great importance. We selected Doxorubicin and Paclitaxel, which are commonly used in TNBC, and downloaded the sensitivity data for these two drugs from the GDSC database. Three prognostic risk ac4C genes (USP8, COL3A1 and TRIR), which were screened in this study and are regulated by the NAT10 protein, were used to predict drug sensitivity. In the high- and low-risk groups defined by the three ac4C genes, their expression levels significantly predicted drug sensitivity and accurately estimated the IC50. However, there was no significant difference in sensitivity to the two drugs between the high- and low-risk groups, which also suggests that effective drugs for the treatment of TNBC patients are lacking. Therefore, it is urgent to find targeted drugs for TNBC. A small-molecule inhibitor of NAT10 named "remodelin" was discovered in 2014; by inhibiting NAT10, it can repair nuclear defects and improve chromatin structure in laminopathic cells and progeria [45]. It is therefore highly desirable to develop a NAT10-based drug for the treatment of cancer. To explore potential drug targets for TNBC, data on drugs and drug targets were obtained from the DRUGBANK database and intersected with the 180 differential ac4C genes. The results showed that 79 of the 180 differential ac4C genes were drug targets, corresponding to 35 drugs. Among them, the prognostic risk ac4C differential gene USP8 is a drug target, and its corresponding drug is Zn(II). Zinc (Zn) provides structural integrity or flexibility for many proteins, such as zinc finger proteins. The National Cancer Institute (NCI) found that certain tumor cell groups are sensitive to specific metal agents. Zn complexes can induce cancer cells to trigger adaptive tumor immunity [46].
In 2021, Prihantono et al. found that the Zn(II) arginine dithiocarbamate complex has active cytotoxicity, although compared with cisplatin it was less cytotoxic in inducing morphological changes in the T47D breast cancer cell line [32]. This supports the feasibility of Zn(II) as a targeted drug for TNBC.
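
The intersection of drug-target annotations with the differential ac4C gene list amounts to a set operation, sketched below in Python. The function name and the input layout (a mapping from drug name to its set of target gene symbols) are assumptions for illustration; they do not reflect the actual DRUGBANK file format, which would need to be parsed into this shape first.

```python
def targeted_genes(drug_targets, candidate_genes):
    """Map each candidate gene that is a known drug target to its drugs.

    drug_targets    : dict of drug name -> set of target gene symbols (assumed layout)
    candidate_genes : iterable of gene symbols, e.g. the differential ac4C genes
    """
    candidates = set(candidate_genes)
    hits = {}
    for drug, targets in drug_targets.items():
        # Keep only targets that are also in the candidate gene list.
        for gene in candidates.intersection(targets):
            hits.setdefault(gene, set()).add(drug)
    return hits
```

The number of keys in the result gives the count of candidate genes that are drug targets, and the union of the values gives the corresponding drugs, which is how figures such as "79 genes, 35 drugs" would be derived.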

This study started from the NAT10 protein, which promotes RNA ac4C acetylation, combined it with genome-wide acRIP-seq sequencing, and used lncRNAs as a bridge to explore the ac4C genes that significantly affect the prognosis of TNBC patients. The results indicate that RNA ac4C acetylation can significantly affect the prognosis of TNBC and, through the involvement of lncRNAs, predict its sensitivity to chemotherapeutic drugs. The drug targets predicted on the basis of RNA ac4C acetylation are credible and may help address the lack of targeted drugs for TNBC. However, the specific mechanism of RNA ac4C acetylation still requires extensive experimental validation, and drug development also needs in-depth research.

Abbreviations

TNBC: triple-negative breast cancer; ac4C: N4-acetylcytidine; RRA: Robust Rank Aggregation; BC: breast cancer; EGFR: epidermal growth factor receptor; PARP: poly(ADP-ribose) polymerase; VEGF: vascular endothelial growth factor; ROC: receiver operating characteristic; NAT10: human N-acetyltransferase-like protein; lncRNAs: long non-coding RNAs; OS: overall survival; RFS: recurrence-free survival; GO: Gene Ontology; GDSC: Genomics of Drug Sensitivity in Cancer; IC50: half-maximal inhibitory concentration; BP: biological processes; MF: molecular functions; CC: cellular components; COL3A1: alpha 1 chain of type III collagen; GNAT: GCN5-associated N-acetyltransferase; USP8: ubiquitin-specific protease 8; NICD: Notch1 intracellular domain; NCI: National Cancer Institute; TCGA: The Cancer Genome Atlas.

Acknowledgements

We thank Guangzhou Epibiotek Co., Ltd (http://www.epbiotek.com/) for providing the acRIP-seq sequencing service for the TNBC samples.

Funding

This work was supported by funding from the Project Nn10 of Harbin Medical University Cancer Hospital (Grant Number Nn102017-02), the National Natural Science Foundation of China (Grant Numbers 61972116, 62102120, 81872149, 82072904) and Heilongjiang Postdoctoral Fund (Grant Number LBH-Z20158).

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Authors’ contributions

D.P., Y.G. and X.D.Z. designed the study and performed the data analysis. J.Y.W., Y.Z.H. and X.Z. performed tissue sample collection and processing. J.Q.Z., S.G., H.H.L. and G.Z.L. performed the bioinformatic analysis. D.P., Y.G. and X.D.Z. wrote the manuscript. All authors have read and approved the manuscript.

Ethics approval and consent to participate

All patients consented to an institutional review board-approved protocol that allows comprehensive analysis of tumor samples (Ethics committee of Harbin Medical University). This study conforms to the Declaration of Helsinki.

Consent for publication

Written informed consent for publication was obtained from the patients. All authors have agreed to publish this manuscript.

Competing interests

The authors declare that they have no competing interests.

Conflict of interest statement.

None declared.

Author details

1. Department of Breast Surgery, Harbin Medical University Cancer Hospital, 150 Haping Road, Harbin 150081, China.

2. School of Life Science and Technology, Computational Biology Research Center, Harbin Institute of Technology, Harbin 150001, China.

References

Liang, Zhang, Song, Yang: Metastatic heterogeneity of breast cancer: Molecular mechanism and potential therapeutic targets. Seminars in cancer biology 2020; 60:14-27.
Nagini: Breast Cancer: Current Molecular Therapeutic Targets and New Players. Anti-cancer agents in medicinal chemistry 2017; 17:152-163.
Garrido-Castro, Lin, Polyak: Insights into Molecular Classifications of Triple-Negative Breast Cancer: Improving Patient Selection for Treatment. Cancer discovery 2019; 9:176-198.
Zachau, Dutting, Feldmann: The structures of two serine transfer ribonucleic acids. Hoppe-Seyler's Zeitschrift fur physiologische Chemie 1966; 347:212-235.
Thomale, Nass: Elevated urinary excretion of RNA catabolites as an early signal of tumor development in mice. Cancer letters 1982; 15:149-159.
Liebich, Lehmann, Xu, Wahl, Haring: Application of capillary electrophoresis in clinical chemistry: the clinical value of urinary modified nucleosides. Journal of chromatography B, Biomedical sciences and applications 2000; 745:189-196.
Szymanska, Markuszewski, Markuszewski, Kaliszan: Altered levels of nucleoside metabolite profiles in urogenital tract cancer measured by capillary electrophoresis. Journal of pharmaceutical and biomedical analysis 2010; 53:1305-1312.
Zhang, Wu, Ke, Yin, Li, Fan et al: Identification of potential biomarkers for ovarian cancer by urinary metabolomic profiling. Journal of proteome research 2013; 12:505-512.
Li, Qin, Shi, He, Xu: Modified metabolites mapping by liquid chromatography-high resolution mass spectrometry using full scan/all ion fragmentation/neutral loss acquisition. Journal of chromatography A 2019; 1583:80-87.
Duan, Zhang, Hu, Lu, Yu, Bai: N(4)-acetylcytidine is required for sustained NLRP3 inflammasome activation via HMGB1 pathway in microglia. Cellular signalling 2019; 58:44-52.
Doskocil, Holy: Inhibition of nucleoside-binding sites by nucleoside analogues in Escherichia coli. Nucleic acids research 1974; 1:491-502.
Dominissini, Rechavi: N(4)-acetylation of Cytidine in mRNA by NAT10 Regulates Stability and Translation. Cell 2018; 175:1725-1727.
Liu, Liu, Yang, Zhang, Zhang, Hu et al: Acetylation of MORC2 by NAT10 regulates cell-cycle checkpoint control and resistance to DNA-damaging chemotherapy and radiotherapy in breast cancer. Nucleic acids research 2020; 48:3638-3656.
Tan, Miow, Huang, Wong, Ye, Lau et al: Functional genomics identifies five distinct molecular subtypes with clinical relevance and pathways for growth control in epithelial ovarian cancer. EMBO molecular medicine 2013; 5:1051-1066.
Oh, Lee, Lim, Lim: Inhibition of NAT10 Suppresses Melanogenesis and Melanoma Growth by Attenuating Microphthalmia-Associated Transcription Factor (MITF) Expression. International journal of molecular sciences 2017; 18.
Yang, Wu, Zhang, Liu, Zhao, Sun et al: Prognostic and Immunological Role of mRNA ac4C Regulator NAT10 in Pan-Cancer: New Territory for Cancer Research? Front Oncol 2021; 11:630417.
Quinn, Chang: Unique features of long non-coding RNA biogenesis and function. Nature reviews Genetics 2016; 17:47-62.
Dykes, Emanueli: Transcriptional and Post-transcriptional Gene Regulation by Long Non-coding RNA. Genomics, proteomics & bioinformatics 2017; 15:177-186.
Kumar, Goyal: LncRNA as a Therapeutic Target for Angiogenesis. Current topics in medicinal chemistry 2017; 17:1750-1757.
Jiang, Liu, Xu, Jin, Hu, Yu et al: Transcriptome Analysis of Triple-Negative Breast Cancer Reveals an Integrated mRNA-lncRNA Signature with Predictive and Prognostic Value. Cancer research 2016; 76:2105-2114.
Batista, Chang: Long noncoding RNAs: cellular address codes in development and disease. Cell 2013; 152:1298-1307.
Guo, Lian, Liu, Dong, Guo, Yang et al: Integrated analyses of long noncoding RNAs and mRNAs in the progression of breast cancer. The Journal of international medical research 2021; 49:300060520973137.
Zhang, Zhang, Liu, Su, Liang, Li et al: Epigenetic Regulation of NAMPT by NAMPT-AS Drives Metastatic Progression in Triple-Negative Breast Cancer. Cancer research 2019; 79:3347-3359.
Ritchie, Phipson, Wu, Hu, Law, Shi et al: limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015; 43:e47.
Geeleher, Cox, Huang: pRRophetic: an R package for prediction of clinical chemotherapeutic response from tumor gene expression levels. PLoS One 2014; 9:e107468.
Elghazaly, Rugo, Azim, Swain, Arun, Aapro et al: Breast-Gynaecological & Immuno-Oncology International Cancer Conference (BGICC) Consensus and Recommendations for the Management of Triple-Negative Breast Cancer. Cancers (Basel) 2021; 13.
Yang, Soares, Greninger, Edelman, Lightfoot, Forbes et al: Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res 2013; 41:D955-961.
Sen, Zhan, Jing, Yi, Wanqi: Chemosensitizing activities of cyclotides from Clitoria ternatea in paclitaxel-resistant lung cancer cells. Oncol Lett 2013; 5:641-644.
Miao, Zhang, Lei, Luo, Xie, Wang et al: ImmuCellAI: A Unique Method for Comprehensive T-Cell Subsets Abundance Prediction and its Application in Cancer Immunotherapy. Adv Sci (Weinh) 2020; 7:1902880.
Arango, Sturgill, Alhusaini, Dillman, Sweet, Hanson et al: Acetylation of Cytidine in mRNA Promotes Translation Efficiency. Cell 2018; 175:1872-1886 e1824.
Das, Datta, Frontera, Wen, Roma-Rodrigues, Raposo et al: Zn(II) and Co(II) derivatives anchored with scorpionate precursor: Antiproliferative evaluation in human cancer cell lines. J Inorg Biochem 2020; 202:110881.
Prihantono, Irfandi, Raya: The comparison of Zn(II) arginine dithiocarbamate cytotoxicity in T47D breast cancer and fibroblast cells. Breast Dis 2021; 40:S55-S61.
Janknecht, Wells, Hunter: TGF-beta-stimulated cooperation of smad proteins with the coactivators CBP/p300. Genes Dev 1998; 12:2114-2119.
Kim, Lee, Yoo, Lee: Frameshift mutations of tumor suppressor gene EP300 in gastric and colorectal cancers with high microsatellite instability. Hum Pathol 2013; 44:2064-2070.
Paulitschke, Berger, Paulitschke, Hofstatter, Knapp, Dingelmaier-Hovorka et al: Vemurafenib resistance signature by proteome analysis offers new strategies and rational therapeutic concepts. Mol Cancer Ther 2015; 14:757-768.
Liu, Ling, Gong, Sun, Hou, Zhang: DNA damage induces N-acetyltransferase NAT10 gene expression through transcriptional activation. Mol Cell Biochem 2007; 300:249-258.
Liu, Tan, Zhang, Zhang, Zhang, Ren et al: NAT10 regulates p53 activation through acetylating p53 at K120 and ubiquitinating Mdm2. EMBO Rep 2016; 17:349-366.
Zhang, Hou, Wang, Liu, Jia, Zheng et al: GSK-3beta-regulated N-acetyltransferase 10 is involved in colorectal cancer invasion. Clin Cancer Res 2014; 20:4717-4729.
Gao, Luo, Zhang: LincRNA-ROR is activated by H3K27 acetylation and induces EMT in retinoblastoma by acting as a sponge of miR-32 to activate the Notch signaling pathway. Cancer Gene Ther 2021; 28:42-54.
Zhang, Wang, Yang, Li, Fang: Identification of potential genes related to breast cancer brain metastasis in breast cancer patients. Biosci Rep 2021; 41.
Zeng, Lin, Jin, Zhang: Identification of Key Genes Associated with Brain Metastasis from Breast Cancer: A Bioinformatics Analysis. Med Sci Monit 2022; 28:e935071.
Shin, Kim, Kim, Ylaya, Do, Hewitt et al: Deubiquitylation and stabilization of Notch1 intracellular domain by ubiquitin-specific protease 8 enhance tumorigenesis in breast cancer. Cell Death Differ 2020; 27:1341-1354.
Qiu, Kong, Cheng, Li: The expression of ubiquitin-specific peptidase 8 and its prognostic role in patients with breast cancer. J Cell Biochem 2018; 119:10051-10058.
Isenberg, Yu, Roberts: Differential effects of ABT-510 and a CD36-binding peptide derived from the type 1 repeats of thrombospondin-1 on fatty acid uptake, nitric oxide signaling, and caspase activation in vascular cells. Biochem Pharmacol 2008; 75:875-882.
Larrieu, Britton, Demir, Rodriguez, Jackson: Chemical inhibition of NAT10 corrects defects of laminopathic cells. Science 2014; 344:527-532.
Huang, Wallqvist, Covell: Anticancer metal compounds in NCI's tumor-screening database: putative mode of action. Biochem Pharmacol 2005; 69:1009-1039.

Supplementary Files

Additionalfiles.pdf
Figure S1. Peak annotation analysis of ac4C modification-enriched regions in cancer and primary paraneoplastic tissues in sequencing results. Figure S2. Volcano plot of differential ac4C genes obtained from the first samples. Figure S3. Volcano plot of differential ac4C genes obtained from the second samples. Figure S4. Expression of differential ac4C genes in TCGA. Figure S5. Sensitivity to doxorubicin and paclitaxel in high- and low-risk groups classified by prognostic risk ac4C genes regulated by NAT10 protein. Figure S6. Association of lncRNAs interacting with the prognostic risk ac4C genes and drug sensitivity in TNBC patients.
