Identification of High-risk Genes in Triple-negative Breast Cancer by Bioinformatics
Background: No target gene has yet been identified for triple-negative breast cancer (TNBC), and as a result treatment for TNBC is less effective than that for other types of breast cancer. Identifying high-risk genes for TNBC through bioinformatics may help to find target genes for TNBC.
Methods: Gene expression data from 4 chips (GSE7904, GSE31448, GSE45827, GSE65194) containing normal breast tissue and TNBC tissue were obtained from the Gene Expression Omnibus. The differentially expressed genes (DEGs) between normal breast tissue and TNBC tissue were identified. Gene Ontology (GO) functional annotation analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the DEGs were performed with the DAVID website. Protein-protein interaction network analysis of the DEGs was carried out with the STRING website, and the results were imported into Cytoscape. Module analysis was then carried out with the MCODE app. The online tool of the Kaplan-Meier Plotter website was used to analyse associations between relapse-free survival (RFS) and the expression of the genes obtained by MCODE, and the metastasis-free survival (MFS) data from GSE58812 were used for survival verification. Differences in the expression of the identified genes were verified with the online tool of the UALCAN website.
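The DEG step can be illustrated with a minimal sketch, assuming (as Table S2 indicates) that the final DEG list is the intersection of the per-chip DEG lists. The cutoffs `LOG2FC_CUTOFF` and `ADJ_P_CUTOFF`, the function names, and the placeholder gene names and values are all hypothetical; the abstract does not state the thresholds or the software used for this step.

```python
from typing import Dict, Set, Tuple

# Hypothetical cutoffs: the abstract does not state the thresholds used.
LOG2FC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05

# One table per chip: gene -> (log2 fold change, adjusted p-value).
ChipResult = Dict[str, Tuple[float, float]]


def split_degs(result: ChipResult) -> Tuple[Set[str], Set[str]]:
    """Split one chip's results into upregulated and downregulated gene sets."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, (log2fc, adj_p) in result.items():
        if adj_p >= ADJ_P_CUTOFF:
            continue  # not significant on this chip
        if log2fc >= LOG2FC_CUTOFF:
            up.add(gene)
        elif log2fc <= -LOG2FC_CUTOFF:
            down.add(gene)
    return up, down


def shared_degs(chips: Dict[str, ChipResult]) -> Tuple[Set[str], Set[str]]:
    """Keep only genes that change in the same direction on every chip."""
    if not chips:
        return set(), set()
    per_chip = [split_degs(result) for result in chips.values()]
    up = set.intersection(*(u for u, _ in per_chip))
    down = set.intersection(*(d for _, d in per_chip))
    return up, down


# Toy input with placeholder gene names and made-up values (not study data).
toy_chips = {
    "chip_a": {"GENE_A": (2.0, 0.001), "GENE_B": (-1.5, 0.01), "GENE_C": (1.2, 0.2)},
    "chip_b": {"GENE_A": (1.4, 0.03), "GENE_B": (-2.1, 0.001), "GENE_C": (1.8, 0.001)},
}
up_genes, down_genes = shared_degs(toy_chips)
print(sorted(up_genes), sorted(down_genes))  # prints ['GENE_A'] ['GENE_B']
```

Requiring agreement across every chip is a conservative choice: a gene that misses the cutoff on a single chip (here `GENE_C`) is dropped from the shared list.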
Results: The DEGs comprised 127 upregulated and 293 downregulated genes. The GO and KEGG analyses showed that the DEGs were particularly enriched in mitotic nuclear division, extracellular space, heparin binding, and ECM-receptor interaction. MCODE identified a total of 47 genes in 4 gene clusters, 29 of which were related to RFS. Survival verification indicated that 14 of these 29 genes were also related to MFS, namely, CCNB1, AURKB, KIF20A, BUB1B, DLGAP5, CXCL11, CXCL9, CXCL10, CXCL12, IGF1, FN1, CFD, SGO2 and CDCA5.
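The survival curves behind the RFS and MFS comparisons are Kaplan-Meier estimates. The following is a minimal sketch of the product-limit estimator itself, not the implementation used by the Kaplan-Meier Plotter website; the function name `kaplan_meier` and the toy follow-up times are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is True if the event (e.g. relapse) was observed at times[i],
    and False if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events observed exactly at time t
        leaving = 0   # patients leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            leaving += 1
            i += 1
        if observed:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy follow-up data (made up): the second patient is censored at t = 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

In a gene-level analysis, one such curve would be estimated per expression group (for example, high versus low expression) and the two curves compared with a log-rank test.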
Conclusions: We identified 14 genes as high-risk genes for TNBC. Further research on these genes may identify target genes for TNBC.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
This is a list of supplementary files associated with this preprint.
Additional file 1: Table S1. The list of normal tissue samples and TNBC tissue samples selected from GSE7904, GSE31448, GSE45827, and GSE65194.
Additional file 2: Table S2. The DEGs common to all 4 chips (GSE7904, GSE31448, GSE45827 and GSE65194), comprising 127 upregulated genes and 293 downregulated genes in normal breast tissues compared to TNBC tissues.
Additional file 3: Fig. S1. Kaplan–Meier curves for 14 genes in MCODE 1 that were related to RFS. Fig. S2. Kaplan–Meier curves for 15 genes in MCODE 2–4 that were related to RFS.
Posted 31 Dec, 2020
Invitations sent on 11 Jan, 2021