RNA-Seq revealing the tumor immunity regulation mechanism of circular RNA in human laryngeal squamous cell carcinomas

Motivation: Laryngeal squamous cell carcinoma (LSCC) is one of the relatively common malignant tumors occurring in the epithelial tissue of the larynx. Recently study showed that tumor immunity mechanisms have highly correlation with the development and progression of LSCC. With the development of research on circular RNA (circRNA) in recent years, a large number of abnormal expressions of circRNA in various tumors have been reported. There is a large number of evidences indicated that circRNA is widely involved in stage development of tumors and it also plays an important role as a biomarker in diagnosis and treatment. Therefore, the study of the pathways involved in the development of tumor by circRNA is a necessary process for humans to further understand cancer at the transcriptional level. The purpose of this study was to identify the circRNAs in tumor tissues of patients with laryngeal squamous cell carcinoma (LSCC) by NGS technique and to nd the circRNAs that were signicantly different from normal adjacent tissues. Finally, the mechanism of these differentially expressed tumor immunity circRNAs affecting LSCC development was further analyzed. differential expressed circRNAs from those identied candidates. Of the 39 circRNAs, 21 were detected to be signicantly less expressed in LSCC tissues than in adjacent normal tissues, and other 18 were signicantly higher. Through the interaction analysis of circRNA-miRNA-mRNA, we found that these differential expressed circRNAs were related to tumor immunity. After screening, four circRNAs with the strongest correlation with tumor immunity were obtained. We found that they play an important role in the membrane receptor tyrosine protein kinase signaling pathway pathway (Ras-Raf-MAPK pathway) and p53 signaling pathway.


Conclusion
According to the results, the method of detecting expression of four selected circRNAs has a bright future to be a kind of new LSCC diagnose approach.

Background
Laryngeal squamous cell carcinoma (LSCC) is one of the relatively common malignant tumors occurring in the epithelial tissue of the larynx. Worldwide, about 5% of cancer patients are diagnosed with LSCC each year [1]. Moreover, the number of LSCC patients is increasing every year. LSCC is triggered by many factors, including smoking, heavy drinking and severe air pollution. Although a number of encouraging advances have been made in LSCC in recent years, the pathogenesis of the disease and tumor growth are still not fully known at the transcriptional level. Although much work has been done worldwide over the past few years to identify biomarkers associated with LSCC in an attempt to reduce the mortality rate among LSCC patients, studies have shown that more than half of those diagnosed with LSCC die within ve years [2]. Therefore, the research on LSCC from the perspective of High throughput sequencing may nd new potential biomarkers from a completely new perspective. It may also provide a reference for the development of new targeted therapies.
In recent years, long non-coding RNA (LncRNA) has become a research hotspot, and several studies have shown that some LncRNAs expressions in some cancerous tissues are signi cantly different from those in normal tissues [3]. In addition, some LncRNAs have been proved to be related to the functions of avoiding growth inhibitors, inhibiting cell apoptosis and maintaining cell growth and proliferation.
Therefore, for the research direction of targeted therapy, the search for LncRNA biomarkers is currently a meaningful direction. circRNA is a special type of LncRNA. Different from common linear RNA with 3' end and 5' end, circRNA molecule has a closed ring structure, it is not easily degraded by RNA exonuclease.. Therefore, the expression of circRNA is more stable [4]. Functionally, the literature indicated that microRNA (miRNA) binding sites are abundant in the circRNA ring structure and play the role of miRNA sponge in cells. By competitively binding to mRNA, circRNA molecules can reduce the intervention of miRNA in the transcription process and indirectly remove the inhibition of miRNA on gene expression. CircRNA have an important regulatory function in disease by interacting with disease-related miRNA [5]. According to a paper published in 2015, circRNA can already be used as a reliable biological signal for cancer diagnosis [6]. Moreover, in recent years, there have been literatures showing that the expression of some circRNA in LSCC patients is signi cantly different from that in normal adjacent tissues, and it is believed that there are potential new biomarkers in these signi cantly different circRNA [7].
Recent studies have shown that the development of LSCC tumors is highly correlated with tumor immunity [8]. Antigenic proteins on some immune cells, such as CD3, CD4 and CD8, show different expression levels at different stages of LSCC development [9]. It can be said that tumor immune mechanisms have highly correlation with the development and progression of LSCC. Therefore, this study focused on the circRNAs related to tumor immunity in LSCC samples.
Restricted by the limitation that microarray technology can only detect known circRNAs, previous studies based on this technology to nd circRNA biomarkers related to LSCC tumor growth and development may have missed many potential circRNAs that have not been added to the database.Therefore, this study used RNA-Seq technology to detect all possible circRNAs and some potential pathway. In addition, this study also takes into account the RNA interactions predicted by different algorithms, and maximizes the reduction of false positive rate in the results by taking the intersection of results obtained by different algorithms. The ultimate goal of this project is to identify potential novel circRNA biomarkers associated with LSCC tumor growth and development by investigating circRNAs that are signi cantly differentially expressed in lesion samples and adjacent normal samples of LSCC patients and the miRNA-mRNA interactions associated with these differential circRNAs, which showing that hsa_circ:chr1:8525985-8601377 circRNA may involved in the regulation of Ras-Raf-MAPK and p53 signaling pathway to accelerate the LSCC development.

Material And Methods
Samples source and description Sample sequencing pre-treatment The process of isolating and purifying RNA strictly follows standard procedures and all the reagents and lab equipment were RNase/DNase-free. Firstly, total RNA in the tissue sample was extracted with TRIzol agent (Invitrogen, Carlsbad, CA, USA). The RNA pellet was cleaned by 75% ethanol (1 ml) two times, then natural drying was implemented and redissolve in ultra-pure water (20 µl). The RNA solution obtained after the above steps also contained a small amount of DNA, and by using Turbo DNase Kit (Ambion) treatment, the remaining DNA in the sample was degraded, with only RNA remaining. The puri ed RNA solution was obtained by using RNeasy MinElute Cleanup kit (Qiagen, Valencia, CA, USA). A NanoDrop ND1000 spectrophotometer was then used to assess the quality of the concentrated RNA. The ratio of OD260/OD280 is between 1.8 and 2.1 and the ratio of OD260/OD230 greater than 1.8 is considered a reliable result. Denatured agarose gel electrophoresis was then used to assess RNA integrity and DNA contamination.Finally, we enriched circRNA with RNase R (Epicentre, Inc.) and removed linear RNA from the samples.

Sequencing
The pretreated RNA samples were ampli ed into small fragments by bivalent cations at high temperature.
These small fragments were reverse-transcribed into cDNA, which was puri ed by PCR ampli cation after terminal repair and opening primer, and then the appropriate library was conducted by NEBNext UltraTM Directional RNA Library Prep Kit for Illumina. Then the cDNA libraries were sequenced according to Illumina HiSeqTM 3000 system guidelines.

Differential expression analysis
The original data were removed with Cutadapt [10] (Version: 2.6) software to remove sequencing connectors and low quality reads, and the clean data were compared with human Hg19 reference genome using Tophat2 [11] (Version: 2.1.0) software. The sequences not matched to the reference genome were further compared by Tophat-Fusion [12] (Version: 2.1.0) to obtain candidate circRNAs.
These candidate circRNAs were further screened and annotated by Circexplorer2 [13] (Version: 2.3.8) software. The annotated circRNAs were quanti ed by subprogram featureCounts of Subread [14] (Version: 2.0.0) package, and nally the read count expression matrix of each circRNA in different samples was obtained. We used DESeq2 [15] (Version: 1.26.0) package to analyze the differential expression of read count, and set the difference threshold as log 2 FoldChange greater than 2.0 or less than − 2.0. Wald test was used to calculate the difference signi cance, and the signi cance threshold was set as pValue less than 0.01.

Prediction circRNA-miRNA interaction
Miranda [16] (Version: 3.3a) algorithm and RNAhybrid [17] (Version: 2.1.2) algorithm were applied to predict the interaction between differential expression circRNAs and known miRNAs in screened LSCC tumor tissues and adjacent normal tissues. Since neither of the two tool algorithms can completely avoid false positive results, we take the intersection of the results obtained by the two tools to minimize the number of false positive results. The threshold are the minimum free energy of the binding process is less than − 30 kcal/mol and seed type between nucleotides 2-7. Cytoscape [18] (Version: 3.7.2) was applied to visualize the interaction between circRNA and miRNA.
Prediction of circRNA-miRNA-target gene associations Two algorithms, TarPmiR [19] (Version: 3.0.0) and MirTarget [20] (Version: 4.0.0), were used to predict the target genes of miRNAs that interacted with the screened circRNA. In addition, to eliminate false positive results, we introduced data collected from the miRTarBase [21] (Release 7.0) public database, which were experimentally supported. By combining the data results from these three sources, we obtained the interaction relationship between the miRNA and its target genes with high reliability.

CircRNA functional analysis
The functions involved in differentially expressed circRNAs were explored through our circRNA-miRNAtarget gene network. First of all, we analyzed all the target genes involved in this network by KEGG, and obtained the pathways related to cancer. Then, based on those target genes contained in these screened pathways, the miRNAs related to them in the network are screened reversely, and then the corresponding circRNAs is found. After the above steps are completed, the more interactive circRNAs were considered as potential biomarkers, and the corresponding miRNA interactions of these potential circRNAs biomarkers are intersected for further analysis. At last, a KEGG analysis of the target genes corresponding to the miRNAs in the interaction was carried out to explore the regulatory mechanism of these potential circRNAs biomarkers for tumor development.
QRT-PCR quanti cation for circRNA The extracted RNA from LSCC tumor and normal adjacent tissues is retro-transcribed into cDNA by SuperScript First-Strand Synthesis System (Invitrogen, Carlsbad, CA, USA). We select SYBR Green PCR Mastermix kit (Applied Biosystems, Foster City, CA, USA) for amplifying circRNA in samples. According to the instructions of the kit, the reaction system has a total of 20 microliters, including 10 microliters of SYBR mixture, 1.6 microliters of ultra-pure water, 0.8 microliters of positive primer ampli cation, 0.8 microliter of negative primer ampli cation and 6.8 microliter of cDNA library. ABI 7300 PCR System then was applied to conduct the whole qPCR reaction. The qPCR settings are as follows: a cycle of 50 ° C for 2 min and a cycle of 95 ° C for 5 min was set, followed by 40 cycles at 95 ° C and one cycle at 60 ° C.
Repeat the above steps 3 times. Finally, the circRNA expression was calculate by 2 − ΔCt method. The tools used for data calculation was R programming language (Version: 3.6.2). The statistical test method is two-sided t-student test, and the threshold of signi cance p value is set to 0.05, which is considered signi cant if it is less than this threshold.

Identi cation of circRNA
After comparing and identifying the data set containing 4 LSCC tumor sample data and 4 adjacent normal tissue sample data with the human reference genome of hg19 version. All of detected circRNAs are detailed in additional le 1. Visualization of related results is shown in Fig. 1.
The raw data were ltered and aligned with the human reference genome, then all sequences that could not be mapped to the human genome were aligned again to nd potential junction reads which were characteristic of circRNA. A total of 75931 circRNAs were found and total of 39 signi cant expressed circRNAs were ltered (Additional File 2) according to the threshold of pValue < 0.01, |log2FoldChange| > 2.0, Among those different expression circRNAs, 23 terms (58.97%) were identi ed as locating on promoter on whole human genome; 7 terms (17.95%) were on exon; another 6 and 2 terms were positioned at 5'UTR and 3'UTR respectively; and then the last one was positioned at distal intergenic area (Fig. 1A). Principal component analysis (PCA) revealed the differentiation between LSCC tumor samples and adjacent normal tissue samples (Fig. 1B). Compared with the normal adjacent tissue samples, 18 of these screened differential expression circRNAs were up-regulated and 21 were down-regulated in LSCC tumor samples (Fig. 1C). Then the expression pattern of differentially expressed circRNA among samples was shown by hierarchical clustering (Fig. 1D).
Establish the interaction network between circRNA and miRNA CircRNAs combine miRNA by miRNA response elements (MREs). Based on this binding pattern, we introduced Miranda and RNAhybrid tools to predict the binding sites of circRNA and miRNA. After the intersection of the two software prediction results, we nally obtained a total of 2,166 combinations of circRNA and miRNA (Additional File 3, Fig. 2A). However, some circRNAs have a small number of interactions (edge < 20), and these small number of interactions are not only relatively isolated in the whole network but also tend to have a negative impact on subsequent analysis. Therefore, we selected only circRNA with binding of more than 20 miRNAs to be included in the network. After ltering, 2097 relationships are left in the network (Additional File 4). The entire network is visualized through Cytoscape (Fig. 2B).

Predicted CircRNA related miRNA and target gene screening and KEGG analysis
There are a total of 727 miRNAs involved in the network constructed in the previous step. TarPmiR and MirTarget were used to predict the target genes corresponding to these miRNAs. In order to reduce false positive entries, experimental data supported by miTarBase database were introduced. After that, the data from these three sources were intersected and 10,835 combinations were nally obtained (Fig. 3A). Of these miRNA and gene combinations, we obtained a total of 3,831 genes.
The KEGG pathways involved in the obtained genes were analyzed and screened by the in-house R language program, and there were 129 KEGG pathways that met the screening threshold (pValue < 0.05, qValue < 0.2) (Fig. 3B). Among these pathways, we sorted them by gene ratio and visualized by pictures. These genes were analyzed to obtain details of the KEGG pathway in additional le 5. In the 129 KEGG pathways obtained, according to the number of gene mapped to pathway, we showed the rst 30 pathways with pictures. Surprisingly, four of these pathway were annotated relating with immunity pathway, which indicated they are involved in tumor immunity. Therefore, including hsa05161 (Hepatitis B), hsa05167 (Kaposi sarcoma-associated herpesvirus infection), hsa05170 (Human immunode ciency virus 1 infection), hsa05166 (Human T-cell leukemia virus 1 infection) pathway were selected for further subsequent analysis.
These four circRNAs have different miRNAs to bind to each other. After analysis, we found that 33 common miRNAs can bind to all these four circRNAs. We selected the target genes of these 33 miRNAs for further KEGG analysis (Additional File 6). (Fig. 4C/D) KEGG results showed that these four circRNAs were involved in hsa04151 (PI3K-Akt signaling pathway), hsa04010 (MAPK signaling pathway), hsa04014 (Ras signaling pathway) and hsa04115 (p53 signaling pathway). Since these pathways are highly correlated with tumor formation, we believe that these four circRNAs are used as biomarkers to participate in the regulation of tumor immunity and further affect LSCC tumor growth and development.

QRT-PCR validation
Since it was di cult to construct speci c primers to detect hsa_circ:chr2:168920010-168986268, hsa_circ:chr13:24164289-24200931 and hsa_circ:chr2:240078348-240085619, only hsa_circ:chr1:8525985-8601377 was selected for validation. We collected tumor samples and adjacent normal tissue samples from 10 LSCC patients for QRT-PCR veri cation (Fig. 5). The results showed that in most of the control group (adjacent normal tissue samples), the expression level of hsa_circ:chr1:8525985-8601377 was generally much higher than that of the treat group (LSCC tumor samples). The experimental results were consistent with the sequencing results.

Discussion
In this study, through high-throughput sequencing of circRNA in 4 LSCC tumor samples and 4 adjacent normal tissue samples, 18 up-regulated circRNA and 21 down-regulated circRNA were found in LSCC tumor tissues. These 39 differentially expressed miRNAs were used to predict the binding ability to all known miRNAs. The interaction network of circRNA and miRNA was constructed by the intersection of predicted results of Miranda and RNAhybrid. After screening, 2097 interactions were found.Then the target genes of miRNA in the network are predicted by three tools and databases and KEGG analysis is carried out for these genes. We screened four tumor immune related pathways from 129 KEGG results.
Four miRNAs with the strongest correlation with these four pathways were screened out from the previously constructed network, and then the target genes of three miRNAs with interaction relationship with these miRNAs were analyzed by KEGG again. We found that these four screened circRNAs were related to p53 signaling pathway. Because this pathway is very important in the process of tumor growth and development, we believe that these four circRNAs are potential new biomarkers for judging tumor growth status.
So far, clinical judgment LSCC biomarkers of tumor growth and development status of squamous cell carcinoma antigen (SCCA), carcino-embryonic antigen (CEA) and other compounds [22]. However, none of these biomarkers can re ect the growth and development of tumor in a highly speci c way. Taking the biomarker SCCA as an example, it is negative in benign esophageal tumors, while the positive rate in patients with esophageal cancer is 40-52%, and this rate increases with the further development of lesions. However, SCCA positivity is rare in the early stages of cancer [23]. Therefore, the purpose of this study is to identify potential circRNA biomarkers based on tumor immune-related pathways, so as to provide a more speci c and accurate method for determining the growth and development of LSCC tumors.
According to our previous conclusions, the four potential circRNA biomarker related genes are corresponding to four signaling pathways. The most signi cant is the PI3K-Akt signaling pathway. Studies have shown that the PI3K-Akt signaling pathway is associated with the development of a variety of tumors, such as lung, breast, and laryngeal cancers [24]. Structural changes in key molecules in the PI3K-Akt signaling pathway, such as p110, p85, Akt and PTEN, are related to cell transformation. These genes have been con rmed to be proto-oncogenes or tumor suppressor genes [25]. Therefore, It can be speculated that these circRNAs affect the normal regulation of the pathway and thus lead to the development of tumors.
In addition to the PI3K-Akt signaling pathway, Ras-Raf-MAPK signaling pathway are also signi cantly affected by the four potential circRNA biomarkers. Raf signaling pathway and MAPK signaling pathway are closely related in function. The activation of Ras-Raf-MAPK cascade regulates the transcription of related genes and promotes the proliferation and spread of cancer cells [26]. These four circRNAs may be involved in the regulation of tumor proliferation.
Except for the above three pathways, the p53 pathway is another primary pathways affecting tumor development. The related genes are TP53, CDKN1A, MDM2, CDK2, BCL2L1 and MDM4. Among these genes, TP53 and MDM2 genes are widely regarded as the key genes in the p53 signaling pathway. The MDM2-TP53 regulatory pathway is an important part of the p53 signaling pathway [27]. According to the report, any abnormality in one of the genes in this pathway can induce the development of tumors [28]. MDM2 is the main regulator of tumor inhibition function of TP53. MDM2 can promote its degradation by binding to TP53. The consequence of the imbalance of MDM2-TP53 relationship caused by various factors is the inactivation of TP53 and the occurrence of tumors [29]. Therefore, it can be seen that the four circRNAs screened in this study are highly correlated with the occurrence of tumors and may be a relatively stable class of LSCC biomarkers.
By combing these pathways, we found that the potential circRNA biomarkers we screened not only participate in the development of tumor in the early stage, but also have a certain regulatory effect on the spread of tumor cells. Therefore, the method of detecting the expression levels of the four circRNAs has a forecast bright future as a strong support for the diagnosis of LSCC tumors. However, these study have two limitations. The rst one is the number of sample is relevantly small, which could has an negative impact on statistic analysis. Secondly, the study only analyzed circRNA, multi-omics integration analysis is required to explore the role of circRNA in the growth and development of LSCC tumors.
Compared with traditional methods, this study developed a higher accuracy and wider coverage method based on high-throughput technologies to screen potential LSCC tumor-related biomarkers according to differential expression circRNAs. In addition, the conclusion can be a theoretical basis for the development of new LSCC diagnostic techniques.  Heatmap. This gure shows the results of hierarchical clustering of all differentially expressed circRNAs.
According to different expression levels, circRNAs and samples are also divided into different groups. Target gene KEGG analysis. A. Three tools predicted miRNA target gene results. After the intersection of the three results, 10835 miRNA and gene interactions were obtained. B. KEGG bubble plot. In the bubble diagram, we selected the 30 pathways with the largest number of mapped genes from the results. The Xaxis represents the number of mapped gene to the pathway, and the Y-axis represents pathway name.
The entries in the red rectangle are the four tumor immune-related pathways we selected.

Figure 4
Tumor immunity pathway related circRNAs KEGG analysis. A. In general, this network is identical to the previous one, with the only difference being that some miRNAs are marked blue, meaning that they are associated with four previously screened tumor-immune-related pathways. B. Of the four circRNAs with the most tumor immunological associations and the miRNAs associated with them respectively, we found that 33 miRNAs were commonly associated. C. The bubble map of the gene KEGG corresponding to 33 commonly related miRNAs. D. This circos diagram shows the relationship between the KEGG pathway and the corresponding gene, and different colored lines correspond to different pathway.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.