A Pan-Cancer Analysis of the Tumorigenic Role of Yin-Yang1 (YY1) in Human Tumors

Background: Yin-Yang1 (YY1) is a nuclear transcription factor with dual transcriptional activity that is differentially expressed in a variety of tumor tissues. However, the role of YY1 in most tumors and its association with immune cell infiltration remain unclear.
Methods: YY1 expression was analyzed in pan-cancer data downloaded from The Cancer Genome Atlas (TCGA) database. Clinical survival data downloaded from TCGA were used to analyze the effect of YY1 on clinical prognosis. Enrichment analysis of YY1 was performed with the R package “clusterProfiler”. Immune cell infiltration scores of TCGA samples were downloaded from published articles, and the correlation between YY1 expression and immune cell infiltration was analyzed.
Results: YY1 was highly expressed in 25 tumors and was strongly associated with clinical stage. In most tumor types, over-expression of YY1 was associated with worse prognostic indicators, such as overall survival (OS), progression-free survival (PFS), disease-specific survival (DSS) and disease-free survival (DFS). Moreover, YY1 expression correlated with tumor mutation burden (TMB). Nearly all immune-related genes were co-expressed with YY1, and almost all of them correlated positively with YY1 across tumor types. Notably, the levels of B cells and T cells were lower in the group with high YY1 expression. In addition, 22 m6A methylation-related genes, such as METTL3, YTHDC1 and FTO, were co-expressed with YY1.
Conclusions: Our study suggests that YY1 may be a marker of poor prognosis and that high YY1 expression may influence immune infiltration and be connected to m6A methylation.


Introduction
YY1 is a member of the zinc finger transcription factor family. It is a nuclear transcription factor with dual transcriptional activity. It is widely expressed in human tissues, regulates the transcription of many genes and takes part in a variety of biological processes such as embryogenesis, apoptosis and chromatin remodeling [1]. Since the discovery of YY1, questions have been raised about how it exerts its biological role in the development and progression of tumors, given its regulatory activity on a variety of cancer-related proteins and signaling pathways in the majority of cancers [2].
Currently, YY1 is known to function mainly by regulating target genes in two ways: 1) Transcriptional activation. In metastatic renal cell carcinoma, siRNA interference experiments showed that down-regulation of the PTEN protein could promote YY1 expression, thereby promoting the proliferation of tumor cells [3]; 2) Transcriptional inhibition. YY1 inhibits proliferating cell nuclear antigen (PCNA) expression and phosphorylation of the retinoblastoma suppressor protein (Rb) in breast cancer and glioma, thereby inhibiting the proliferation of tumor cells [4].
Although a large body of cell- and animal-based experimental evidence supports the association between YY1 and cancer [5][6][7][8][9][10], pan-cancer evidence based on large clinical datasets for the relationship between YY1 and diverse tumor types is still lacking. Therefore, data from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) were used in this study. We also included factors such as survival status, immune infiltration, RNA methylation, genetic alteration and related cellular pathways to explore the potential molecular mechanisms of YY1 in the pathogenesis or clinical prognosis of diverse cancers.

Results
In Several Tumor Types YY1 was Highly Expressed and Associated With Clinical Stage

YY1 Expression in Tumor Tissue Samples Differed from That in Normal Tissue Samples
To assess YY1 expression at the protein level, the immunohistochemistry (IHC) results provided by the Human Protein Atlas (HPA) were analyzed and compared with the YY1 gene expression data from TCGA (Figure 3A-L). The results from the two databases were consistent: IHC staining of YY1 was moderate in normal kidney, prostate, liver and breast tissues and strong in the corresponding tumor tissues.

High Expression of YY1 was Associated with Poor Cancer Prognosis
To evaluate the value of YY1 in predicting patient prognosis, the relationship between YY1 expression and prognostic indicators (OS, PFS, DSS and DFS) was analyzed in the TCGA cohort. First, we studied the correlation between YY1 expression and OS. Cox proportional hazards model analysis revealed that YY1 expression was associated with OS in LUAD (P = 0.023), KIRP (P = 0.023), UCES (P = 0.007), LIHC (P = 0.014), ACC (P = 0.037), KIRC (P = 0.006) and THYM (P = 0.035) (Figure 4A). Kaplan-Meier survival analysis showed that high YY1 expression was significantly associated with worse OS in LUAD, KIRP, UCES, LIHC and ACC, but with better OS in KIRC and THYM (Figure 4B-H).
Moreover, we studied the correlation between YY1 expression and DSS. Forest plots revealed that YY1 expression was correlated with DSS in ACC, PAAD and OV (Figure).
Finally, forest plots from the Cox proportional hazards model analysis showed that YY1 expression was significantly associated with DSS in BLCA, ACC, UVM, LUAD, KIRC, OV and THYM (Figure 7A, all p<0.05). Specifically, higher YY1 expression was significantly correlated with decreased DSS in BLCA (P = 0.0029), ACC (P = 0.017), UVM (P = 0.048) and LUAD (P = 0.025), but with increased DSS in KIRC (P = 0.0063), OV (P = 0.036) and THYM (P = 0.026) (Figure 7B-H).

Enrichment Analysis of YY1-related Cooperators
To further study the molecular mechanism of the YY1 gene in tumorigenesis, we screened for YY1-binding proteins and YY1 expression-correlated genes for a series of pathway enrichment analyses. Using the STRING tool, a total of 50 experimentally supported YY1-binding proteins were obtained, and their interaction network is displayed in Figure 8A. The GEPIA2 tool was then used with the expression data of all TCGA tumors to obtain the top 100 genes correlated with YY1 expression. The GO and KEGG analyses performed with Metascape online are shown as bar charts in Figure 6B,C. In addition, the hub genes among the YY1-binding and YY1-correlated genes, together with the connections and distribution of their functions, were displayed by PPI network and MCODE analysis (Figure 8D,E).

YY1 Expression Correlated With Tumor Mutation Burden and Tumor Microsatellite Instability
We then examined whether YY1 expression correlated with TMB and MSI. YY1 expression was associated with TMB in nine cancer types: ACC, PAAD, LUAD, LUSC, SKCM, ESCA, THCA, DLBC and UVM (p<0.05). The association was positive in ACC, PAAD, LUAD, LUSC and SKCM, and negative in ESCA, THCA, DLBC and UVM (Figure 9A). Genetic variations of YY1 in TCGA tumors were also examined using the cBioPortal database, and the mutation features of YY1 across TCGA tumors are displayed in Figure 9B. We then examined the genetic alteration status of YY1 in the tumor samples of the TCGA cohort. As displayed in Figure 9C, the highest alteration frequency of YY1 (>3%) occurred in patients with UCEC, in whom "mutation" was the dominant type. The CNA "amplified" type predominated in LUSC cases, with an alteration frequency of about 2.5% (Figure 8C). Notably, all cholangiocarcinoma (CHOL) cases with genetic alterations (~3% frequency) had YY1 copy number deletion (Figure).

YY1 Expression Levels Had a Correlation with Tumor Immune Cell Infiltration
A gene co-expression analysis was conducted to examine the relationship between YY1 expression and immune-related genes in 33 tumor types. The genes analyzed included immune-activation genes and genes encoding chemokines and chemokine receptors. The heat map showed that nearly all immune-related genes were co-expressed with YY1 (Figure 10). Notably, almost all of these genes correlated positively with YY1 in every tumor type analyzed (P < 0.05).
The impact of YY1 on the tumor immune microenvironment was then analyzed further. We examined the relationship between YY1 expression and immune cell infiltration levels, especially those of T cells and B cells. From the heat map of YY1 expression and immune cell infiltration level, we selected the 12 tumors with the highest correlation (Figure 11). In most of these tumor types, the levels of T cells and B cells were lower in the group with high YY1 expression. In THCA, CESC, BRCA and LUSC, the difference in enrichment scores was more pronounced (P < 0.001), indicating that YY1 expression correlated negatively with T cell and B cell levels. In THYM, however, T cell levels were higher in the high YY1 expression group (P<0.05).

Correlation of YY1 Expression With RNA Methylation
In addition, a co-expression analysis of YY1 and m6A methylation-related genes was conducted to explore the relationship between YY1 expression and RNA methylation. In PRAD, LUSC, KIRC, BLCA, LIHC and BRCA tumors, 22 m6A methylation-related genes were co-expressed with YY1 (Figure 12). Notably, in LIHC all m6A methylation-related genes were significantly related to YY1 (p<0.001).

Discussion
In recent years, studies have shown that YY1 is highly expressed in various tumor tissues, including breast [15], prostate [16], ovarian [17], brain [18], osteosarcoma [19], colon [20], esophagus [21], pancreas [22,23] and melanoma [24]. YY1 can promote the emergence and growth of tumors by regulating oncogenes, tumor suppressor genes and angiogenesis-related factors and by inhibiting tumor cell apoptosis, indicating that YY1 could play a significant part in prognosis assessment and targeted therapy of tumors [25-28]. However, our literature search did not find any pan-cancer analysis of YY1 across tumors as a whole. Therefore, we comprehensively examined YY1 in 33 different tumors based on the TCGA and GEO databases.
In our study, we used TCGA pan-cancer data to examine the expression level and prognostic value of YY1. Our results showed that YY1 was highly expressed in 25 tumors compared with normal tissues.
TMB is a promising pan-cancer predictive biomarker [29], and it can also help predict prognosis in pan-cancer patients who have received immunotherapy [31]. MSI is another significant biomarker for immune-checkpoint inhibitors (ICI) [30,32]. Our results showed that YY1 expression was associated with TMB in nine types of cancer and with MSI in two types of cancer. Although the direction of the relationship between YY1 expression and TMB was not consistent, meaning that high YY1 expression was associated with higher TMB in some cancers and lower TMB in others, this finding still suggests that YY1 expression level could influence the TMB and MSI of cancer and thus the response of patients to immune checkpoint inhibitor therapy, which provides a fresh reference for immunotherapy prognosis.
The tumor microenvironment, especially the immune microenvironment, is an important component of tumor biology, and a growing body of evidence reveals its clinicopathological importance in predicting outcomes and therapeutic responses [33,34]. Our research shows that YY1 is associated with the infiltration level of immune cells, especially T cells and B cells. High YY1 expression was associated with lower levels of T cells and B cells in most tumor types, which suggests that high YY1 expression has a certain inhibitory effect on immune cell infiltration. A gene co-expression analysis was then conducted to examine the relationship between YY1 expression and immune-related genes in 33 tumor types. The heat map showed that nearly all immune-related genes were co-expressed with YY1. This result further supports the view that YY1 may affect tumors by regulating immune-related genes and immune cells. Together, our findings indicate that YY1 could be a valuable prognostic biomarker and a potential target for immunotherapy.
The morphological and functional diversity of RNA is based on extensive modification of its four canonical bases. Since the 1960s, more than 150 chemical modifications of RNA have been discovered [35], among which m6A modification, namely the addition of a methyl group to the N-6 position of an adenosine residue, is the most common post-transcriptional modification in eukaryotes. Therefore, to explore the correlation between YY1 expression and m6A methylation, we performed a co-expression analysis of YY1 and m6A methylation-related genes. To our knowledge, this is the first exploration of the correlation between methylation of the YY1 promoter and cancer. We found that in PRAD, LUSC, KIRC, BLCA, LIHC and BRCA tumors, 22 m6A methylation-related genes, such as METTL3, YTHDC1, FTO and RBM15, were co-expressed with YY1, which means that YY1 expression was associated with RNA methylation. Therefore, YY1 methylation levels may be employed as a prognostic biomarker in cancer patients in the future. However, the precise relationship between YY1 and RNA methylation and the specific molecular mechanism of their interaction require further experiments.
Taken together, our pan-cancer analysis verified that YY1 was highly expressed in several tumor types and was associated with clinical stage. Our study further revealed statistical correlations of YY1 expression with clinical prognosis, immune cell infiltration, m6A methylation, tumor mutational burden and microsatellite instability, which contributes to understanding the role YY1 plays in tumorigenesis. Nevertheless, more basic research and large clinical trials are needed to validate these findings.

Conclusion
Our study suggests that YY1 may be a marker of poor prognosis and that high YY1 expression may influence immune infiltration and be connected to m6A methylation.

Data acquisition and YY1 Expression Analysis
Profiles of YY1 expression and clinical pan-cancer data from TCGA (11069 samples from 33 cancer types) were downloaded from the UCSC Xena database (https://xenabrowser.net/datapages/), an online tool for exploring gene expression and phenotype information. To avoid analysis errors caused by small sample sizes, only cancer types with at least 5 normal samples were selected for follow-up analysis. For standardization, all data underwent log2 transformation and quantile normalization. YY1 expression was compared between normal and tumor samples using the Wilcoxon test.
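The tumor-versus-normal comparison described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the +1 pseudocount in the log2 transform, the toy expression values and the use of a normal approximation without continuity correction are all assumptions introduced here.

```python
import math
from statistics import NormalDist

MIN_NORMAL_SAMPLES = 5  # threshold stated in the text


def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumption to avoid log2(0)."""
    return [math.log2(v + pseudocount) for v in values]


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(tumor, normal):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected)."""
    n1, n2 = len(tumor), len(normal)
    n = n1 + n2
    combined = list(tumor) + list(normal)
    ranks = average_ranks(combined)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_counts = {}
    for v in combined:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all values identical: no evidence of a difference
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical expression values for one cancer type (not data from the study).
tumor_expr = log2_transform([40.0, 55.0, 61.0, 72.0, 80.0, 95.0])
normal_expr = log2_transform([10.0, 14.0, 18.0, 21.0, 25.0])

if len(normal_expr) >= MIN_NORMAL_SAMPLES:
    u_stat, p_value = wilcoxon_rank_sum(tumor_expr, normal_expr)
    print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```

For small samples an exact test may give a somewhat different p-value than this approximation, so the sketch is best read as an illustration of the rank-based logic only.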

Survival prognosis analysis
For Kaplan-Meier survival analysis, patients from TCGA were split into high and low YY1 expression groups at the optimal cut-off value, and overall survival (OS), progression-free survival (PFS), disease-specific survival (DSS) and disease-free survival (DFS) were compared between the two groups. The optimal cut-off value was determined using the surv_cutpoint function in the survminer R package. Survival curves for the two groups were drawn with the survival R package. Cox regression analysis took YY1 expression as a continuous variable to analyze the correlation between YY1 expression and patient survival. Cases without prognostic follow-up were excluded from the survival analysis. Hazard ratios (HRs) were calculated based on the Cox proportional hazards model and the Kaplan-Meier model, with p < 0.05 considered statistically significant.
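The grouping and survival-curve steps can be illustrated with a small Kaplan-Meier estimator in Python. This is a sketch under stated assumptions: the patient values are invented, the function names are hypothetical, and the cut-off is fixed by hand here rather than optimized as surv_cutpoint does in the study.

```python
def split_by_cutoff(expression, cutoff):
    """Return index lists (high, low): expression > cutoff vs <= cutoff."""
    high = [i for i, v in enumerate(expression) if v > cutoff]
    low = [i for i, v in enumerate(expression) if v <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: [(time, survival)] at each distinct event time.

    events: 1 = event observed (e.g. death), 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve


# Hypothetical patients (not data from the study).
expression = [7.9, 8.4, 6.1, 5.8, 8.8, 6.5, 9.1, 5.2, 8.0, 6.0]
months = [5, 8, 30, 42, 3, 25, 12, 50, 8, 36]
died = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
cutoff = 7.0  # assumed value for illustration only

high_idx, low_idx = split_by_cutoff(expression, cutoff)
km_high = kaplan_meier([months[i] for i in high_idx], [died[i] for i in high_idx])
km_low = kaplan_meier([months[i] for i in low_idx], [died[i] for i in low_idx])

for label, curve in (("high YY1", km_high), ("low YY1", km_low)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Comparing the two curves formally would additionally require a log-rank test, which is omitted here for brevity.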

Genetic variation analysis
Genetic alterations of YY1 were analyzed on the cBioPortal website (https://www.cbioportal.org/) [11,12]. We selected "TCGA PanCancer Atlas Studies" in the "Quick select" section and entered "YY1" to query its genetic alteration characteristics. The alteration frequency, mutation type and copy number alteration (CNA) results across all TCGA tumors were obtained from the "Cancer Types Summary" module.

Correlation of YY1 Expression With Tumor Mutation Burden (TMB) and Tumor Microsatellite Instability (MSI)
Tumor mutation burden (TMB) is a quantifiable immune response biomarker that reflects the number of mutations in tumor cells [13]. A Perl script was used to calculate the TMB score, which was then corrected by the total exon length. The MSI score of all samples was determined from somatic mutation data downloaded from TCGA (https://tcga.xenahubs.net), and the Spearman rank correlation coefficient was used to analyze the relationship between YY1 expression and TMB and MSI.
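As a rough illustration of these two steps, the Python sketch below computes TMB as mutations per megabase and then a Spearman rank correlation with expression. The study used a Perl script whose details are not given, so everything here is an assumption: the function names, the 38 Mb exome length, dividing the mutation count by exon length as the "correction", and the toy values.

```python
import math

ASSUMED_EXOME_LENGTH_BP = 38_000_000  # hypothetical exon length, for illustration


def tmb_per_mb(mutation_count, exome_length_bp=ASSUMED_EXOME_LENGTH_BP):
    """Mutations per megabase of exonic sequence."""
    return mutation_count / (exome_length_bp / 1_000_000)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical samples (not data from the study).
yy1_expression = [2.1, 3.5, 4.0, 5.2, 6.8]
mutation_counts = [30, 55, 50, 120, 200]
tmb_scores = [tmb_per_mb(n) for n in mutation_counts]

rho = spearman(yy1_expression, tmb_scores)
print(f"Spearman rho = {rho:.2f}")
```

Because Spearman's coefficient depends only on ranks, rescaling TMB by a constant exon length does not change rho; the normalization matters when TMB values are compared across cohorts or panels.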

Immune infiltration analysis
Tumor Immune Estimation Resource (TIMER) is a comprehensive database (https://cistrome.shinyapps.io/timer/) used to estimate the relationship between gene expression and the infiltration level of each immune cell type [14]. The correlation between YY1 expression and immune infiltrating cells, including CD8+ T cells, CD4+ T cells, B cells, neutrophils, monocytes and dendritic cells, was analyzed online.

YY1-related gene enrichment analysis
We first queried STRING with the protein name and species (YY1; Homo sapiens) and set the following parameters: minimum required interaction score, low confidence (0.150); meaning of network edges, type of interaction evidence; maximum number of interactors to show, no more than 50; active interaction sources, experiments. In this way, 50 experimentally verified YY1-binding proteins were obtained. GEPIA2 was then used for correlation analysis. Based on the TCGA and GTEx datasets, the top 100 genes correlated with YY1 were selected. Pairwise Pearson correlation analysis between YY1 and the selected genes was performed, with log2 TPM used for the scatter plots, and the P value and correlation coefficient (R) were reported. A correlation heat map between the selected genes and YY1 was generated for further screening. Next, Jvenn, an interactive Venn diagram viewer, was used to perform an intersection analysis of the 100 YY1-correlated genes and the 50 YY1-interacting proteins. In addition, the two datasets were combined for Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis: the gene list was uploaded to DAVID (Database for Annotation, Visualization, and Integrated Discovery), with OFFICIAL_GENE_SYMBOL selected as the gene identifier and Homo sapiens as the species, and the functional annotation data were obtained. Finally, an online platform for data analysis and visualization (http://www.bioinformatics.com.cn) was used to draw bubble charts of the enriched pathways. In addition, Gene Ontology (GO) enrichment analysis was carried out for the biological process (BP), cellular component (CC) and molecular function (MF) categories, and the results were visualized.
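The intersection step and the idea behind pathway over-representation can be sketched in Python as below. This is only an illustration: the gene identifiers are placeholders rather than real YY1 partners, the function names are hypothetical, and the plain hypergeometric test shown here is a generic stand-in, not the exact statistic computed by DAVID or Metascape.

```python
from math import comb


def intersect_gene_sets(correlated_genes, interacting_genes):
    """Genes present in both lists (what a two-set Venn diagram highlights)."""
    return sorted(set(correlated_genes) & set(interacting_genes))


def hypergeom_enrichment_p(n_background, n_pathway, n_query, n_overlap):
    """P(X >= n_overlap) when n_query genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    total = comb(n_background, n_query)
    upper = min(n_pathway, n_query)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Placeholder gene identifiers (hypothetical, not results from the study).
correlated = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
interacting = ["GENE_C", "GENE_D", "GENE_E"]
shared = intersect_gene_sets(correlated, interacting)
print("shared genes:", shared)

# Toy enrichment: 20 background genes, 5 in a pathway, 4 query genes, all 4 in the pathway.
p_value = hypergeom_enrichment_p(n_background=20, n_pathway=5, n_query=4, n_overlap=4)
print(f"enrichment p = {p_value:.5f}")
```

In practice the background is the annotated genome and p-values are adjusted for the many pathways tested, which this sketch does not attempt.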

Statistical analysis
Univariate and multivariate Cox regression analyses were conducted with the R package "survival" [32], and the hazard ratios (HRs) and 95% confidence intervals (CIs) were calculated in the same way. Additionally, differences in clinical factors were compared using the independent t test. P<0.05 indicated statistical significance.
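For readers unfamiliar with how an HR and its 95% CI follow from a fitted Cox model, the Python sketch below applies the standard Wald formulas to a regression coefficient and its standard error. The coefficient and standard error are invented for illustration, the function name is hypothetical, and the model fitting itself (done with the R package "survival" in the study) is not reproduced.

```python
import math
from statistics import NormalDist


def hazard_ratio_summary(beta, se, level=0.95):
    """Return (HR, CI lower, CI upper, two-sided Wald p-value).

    beta: Cox regression coefficient (log hazard ratio)
    se:   standard error of beta
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    hr = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    p_value = 2 * (1 - NormalDist().cdf(abs(beta / se)))
    return hr, lower, upper, p_value


# Hypothetical coefficient and standard error (not results from the study).
hr, ci_low, ci_high, p = hazard_ratio_summary(beta=math.log(2), se=0.2)
print(f"HR = {hr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), p = {p:.4f}")
```

An HR above 1 with a CI that excludes 1 corresponds to higher expression being associated with higher risk; an HR below 1 indicates the opposite.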