2.1 Data and immune cell expression estimation
In this study, I applied the CIBERSORT algorithm, with 1000 permutations, to the RNA-seq gene expression data (n = 424) of the LIHC cohort in the TCGA database, obtained from UCSC Xena, and retained only the samples with a significant result (P value < .05). In this way, I estimated the relative proportions of 22 immune cell types in each patient (Fig. 1A). I then matched the samples to the clinical data (n = 469) and grouped them by stage into stage I, stage II and stage III.
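The filtering and grouping step described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the record layout (`patient`, `p_value`, `Tregs`), the stage labels and the toy values are all assumptions made for the example.

```python
def filter_and_group(cibersort_rows, stage_by_patient, p_cutoff=0.05):
    """Keep samples with a CIBERSORT P value below the cutoff and group them by stage.

    The field names and stage labels are hypothetical; adapt them to the real tables.
    """
    groups = {"Stage I": [], "Stage II": [], "Stage III": []}
    for row in cibersort_rows:
        if row["p_value"] >= p_cutoff:
            continue  # deconvolution not significant for this sample
        stage = stage_by_patient.get(row["patient"])
        if stage in groups:  # samples without a usable stage are dropped
            groups[stage].append(row)
    return groups


# Toy data, invented for illustration only.
rows = [
    {"patient": "P1", "p_value": 0.01, "Tregs": 0.05},
    {"patient": "P2", "p_value": 0.20, "Tregs": 0.02},
    {"patient": "P3", "p_value": 0.03, "Tregs": 0.01},
    {"patient": "P4", "p_value": 0.04, "Tregs": 0.03},
]
stages = {"P1": "Stage I", "P2": "Stage I", "P3": "Stage III"}
grouped = filter_and_group(rows, stages)
```

In this toy example P2 is removed by the P value filter and P4 is removed because it has no stage annotation, which mirrors why the number of analyzed samples can be smaller than either input table.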
2.2 Further analysis of immune cells
On this basis, I performed heatmap analysis (Fig. 1B) and correlation analysis (Fig. 1C) of the immune cells and compared their content across patient stages, especially between stage I and stage III (Fig. 2C). Several immune cell types differed markedly: T cells follicular helper, T cells regulatory (Tregs), T cells gamma delta, Macrophages M2 and Dendritic cells resting. I therefore analyzed patient survival based on the content of these significantly different immune cells (Fig. 3). T cells follicular helper (Fig. 3A) and Tregs (Fig. 3B) showed obvious differences across the stages of hepatocellular carcinoma. I then extracted the content of these significantly different immune cells in stage I, stage II and stage III and compared them (Fig. 4). The results show that Tregs content differs significantly between stages, with a significant decrease from stage I to stage III (Fig. 4B), and that its association with survival and prognosis is also significant. On this basis, I consider that Tregs have a more pronounced effect on hepatocellular carcinoma: a high Tregs content is harmful to survival and prognosis, whereas a low Tregs content is beneficial.
2.3 Screening and determination of differential genes
I therefore looked for differential genes that may regulate Tregs and that have a significant impact on the survival and prognosis of patients with hepatocellular carcinoma. Differential genes were screened with the thresholds log2 fold change (log2 FC) > 1 and Padj < .05. After this screening, I used the survival data in the TCGA database (n = 463) to build a Cox regression model and screen the top genes, requiring a P value < .05 and a 95% confidence interval of the hazard ratio that does not include 1. Because a high Tregs content has an adverse effect on the survival and prognosis of hepatocellular carcinoma, the selected genes fall into two types. The first type is risk-factor genes: when such a gene is highly expressed, Tregs content should increase, and when its expression is low, Tregs content should decrease; that is, these genes should be positively correlated with Tregs. The second type is protective-factor genes: when such a gene is highly expressed, Tregs content should be low, and when its expression is low, Tregs content should increase; that is, these genes should be negatively correlated with Tregs. With these criteria, I identified the gene CENPO.
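The two screening steps above can be sketched as a pair of filters. This is an illustrative sketch under stated assumptions: the record fields (`log2fc`, `padj`, `hr`, `ci_low`, `ci_high`, `p`) and the gene names are hypothetical, and the log2 FC threshold is applied literally as written (> 1) rather than to the absolute value.

```python
def screen_genes(deg_rows, cox_rows, lfc_cutoff=1.0, padj_cutoff=0.05, p_cutoff=0.05):
    """Return {gene: "risk" | "protective"} for genes passing both screens."""
    # Step 1: differential expression thresholds (log2 FC > 1 and Padj < .05).
    deg = {
        r["gene"]
        for r in deg_rows
        if r["log2fc"] > lfc_cutoff and r["padj"] < padj_cutoff
    }
    # Step 2: Cox regression, P < .05 and 95% CI of the hazard ratio excluding 1.
    selected = {}
    for r in cox_rows:
        if r["gene"] not in deg or r["p"] >= p_cutoff:
            continue
        if r["ci_low"] > 1:
            selected[r["gene"]] = "risk"        # hazard ratio clearly above 1
        elif r["ci_high"] < 1:
            selected[r["gene"]] = "protective"  # hazard ratio clearly below 1
    return selected


# Toy values, invented for illustration only.
deg_rows = [
    {"gene": "GENE_A", "log2fc": 2.0, "padj": 0.01},
    {"gene": "GENE_B", "log2fc": 0.5, "padj": 0.001},
    {"gene": "GENE_D", "log2fc": 3.0, "padj": 0.001},
    {"gene": "GENE_E", "log2fc": 1.2, "padj": 0.01},
]
cox_rows = [
    {"gene": "GENE_A", "hr": 1.8, "ci_low": 1.2, "ci_high": 2.7, "p": 0.004},
    {"gene": "GENE_B", "hr": 2.0, "ci_low": 1.5, "ci_high": 3.0, "p": 0.001},
    {"gene": "GENE_D", "hr": 0.6, "ci_low": 0.4, "ci_high": 0.9, "p": 0.01},
    {"gene": "GENE_E", "hr": 1.3, "ci_low": 0.9, "ci_high": 1.9, "p": 0.04},
]
result = screen_genes(deg_rows, cox_rows)
```

Here GENE_B fails the differential expression screen and GENE_E is dropped because its confidence interval spans 1, even though its P value is below .05.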
2.4 Analysis of differential genes
According to the Cox regression analysis, CENPO is a risk-factor gene, and I drew a forest plot from the Cox regression results (Fig. 5A). According to the above inference, CENPO should be negatively correlated with the expression of Tregs (Fig. 5B). In addition, to determine whether CENPO expression is significantly associated with the survival and prognosis of hepatocellular carcinoma, I analyzed the survival data and drew survival curves (Fig. 5C). CENPO expression also differs by stage, with an upward trend from stage I to stage III (Fig. 5D). I then calculated and plotted the correlation between CENPO expression and the content of the 22 immune cell types, especially Tregs (Fig. 5E). The results show that CENPO expression was significantly and negatively correlated with Tregs content, and that it was also significantly associated with the survival and prognosis of hepatocellular carcinoma. Finally, I used two GEO datasets, GSE25097 and GSE36376, to verify whether CENPO expression differs significantly between hepatocellular carcinoma and adjacent tissue (Fig. 6).
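The text does not state which correlation coefficient was used for the CENPO and Tregs comparison. As one possible sketch, assuming a Spearman rank correlation, the calculation can be written with the standard library alone; the input values below are invented for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # undefined (division by zero) if either input is constant


# Toy values: expression rises while Tregs content falls.
cenpo_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
tregs_content = [0.09, 0.07, 0.06, 0.03, 0.01]
rho = spearman(cenpo_expression, tregs_content)
```

A negative `rho` corresponds to the negative correlation reported in Fig. 5E; a significance test on the coefficient would be a separate step.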
2.5 GO and KEGG Enrichment Analysis
To better understand the role of these genes in hepatocellular carcinoma, I performed GO and KEGG enrichment analysis to identify the metabolic pathways and hepatocellular carcinoma pathways in which CENPO is enriched. According to the GO analysis, CENPO is mainly enriched in organelle fission, nuclear division and response to xenobiotic stimulus in biological process; in apical part of cell, synaptic membrane and apical plasma membrane in cellular component; and in channel activity and passive transporter activity in molecular function (Fig. 7A). According to the KEGG analysis, the pathway enrichment of CENPO in hepatocellular carcinoma is mainly in neuroactive ligand-receptor interaction (Fig. 7B).
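The text does not name the enrichment tool or test. GO and KEGG over-representation analyses are commonly based on a hypergeometric test, so the following sketch shows that calculation under this assumption; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement.

    n_universe: background genes; n_term: genes annotated to the term;
    n_selected: genes in the query list; n_overlap: query genes in the term.
    """
    total = comb(n_universe, n_selected)
    upper = min(n_term, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 3 in the term, 3 selected, all 3 overlap.
p_value = hypergeom_enrichment_p(10, 3, 3, 3)
```

In practice each term's P value would then be adjusted for multiple testing across all tested terms before reporting enrichment.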
2.6 Gene Set Enrichment Analysis (GSEA)
To explore which pathways are up-regulated and down-regulated in association with this gene, I performed GSEA and extracted the top five pathways and the immune-related pathways. The results show that CENPO down-regulates these pathways (Fig. 7C, 7D).
2.7 PPI Network
First, I used the cBioPortal database (http://www.cbioportal.org/) [11] to analyze the genes co-expressed with CENPO and selected the genes with a correlation greater than 0.7 with CENPO. I then used the STRING tool to build the PPI network of these genes and analyze the direct relationships between CENPO and other genes, selected the genes known to interact with CENPO, and drew the PPI network (Fig. 7E). The redder the color of a node, the higher its degree.
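The co-expression filter and the node degree used for coloring can be sketched as follows. This is a minimal illustration: the gene names, correlation values and edge list are invented, and in practice the edges would come from a STRING export rather than being typed by hand.

```python
from collections import Counter


def coexpressed_genes(correlations, cutoff=0.7):
    """Genes whose correlation with CENPO exceeds the cutoff."""
    return sorted(g for g, r in correlations.items() if r > cutoff)


def node_degrees(edges, keep):
    """Count interaction partners per gene, restricted to the kept genes."""
    keep = set(keep)
    # Use unordered pairs so a duplicated or reversed edge is counted once.
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter()
    for pair in unique:
        if pair <= keep:
            for gene in pair:
                degree[gene] += 1
    return degree


# Toy inputs, invented for illustration only.
correlations = {"GENE_A": 0.82, "GENE_B": 0.75, "GENE_C": 0.55}
kept = coexpressed_genes(correlations) + ["CENPO"]
edges = [
    ("CENPO", "GENE_A"),
    ("CENPO", "GENE_B"),
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_A"),  # reversed duplicate, counted once
    ("GENE_A", "GENE_C"),  # dropped: GENE_C is below the 0.7 cutoff
]
degrees = node_degrees(edges, kept)
```

The resulting degree values are what a color scale such as the one in Fig. 7E would map to node color.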