Genetic Landscapes and Prognostic Implications of Circadian Rhythm in Non-small Cell Lung Cancer

Background: Circadian rhythm plays an influential role in most aspects of behavior and physiology, including tumorigenesis and tumor development. In this study, we explored the prognostic implications of circadian clock-related genes (CCRGs) in non-small cell lung cancer (NSCLC) based on their expression profiles, and described the changes in immune infiltration and cell functions related to the circadian rhythm. Methods: Univariate and multivariate Cox proportional hazards regression were performed to construct a CCRG risk-score significantly correlated with overall survival (OS) in the training set and the validation set. GO, KEGG, and GSVA were used to identify changes in cellular processes and signaling pathways associated with these CCRGs. Further, immune cell infiltration, mutation rates, and related drugs and pathways were investigated using online analysis platforms and published algorithms. Results: The risk-score based on ten-gene signatures independently predicted OS in both TCGA lung adenocarcinoma (LUAD; p < 0.0001, HR: 2.117, 95% CI: 1.546 to 2.900) and lung squamous cell carcinoma (LUSC; p < 0.0001, HR: 2.066, 95% CI: 1.552 to 2.751). The risk-score also had superior accuracy and predictability (LUAD: AUC 0.788; LUSC: AUC 0.738). The prognostic performance of the ten-gene signature was finally validated in a validation set based on GEO cohorts. The circadian oscillations driven by CCRGs could disturb the metabolism and cellular functions of cancer cells. We also found that, in the high-risk group, the infiltration level of cells critical to specific anti-tumor immunity was markedly suppressed, whereas the infiltration of inflammatory cells and immune cells with negative regulatory effects was promoted. CCRGs were evolutionarily conserved with low mutation rates, which makes it difficult to identify therapeutic targets. Conclusions: We described the landscapes and prognostic implications of CCRGs. The risk-score based on CCRGs was an independent predictor of prognosis and could significantly stratify patient outcomes and immune cell infiltration levels. CCRGs were evolutionarily conserved with low mutation rates, which makes it difficult to identify therapeutic targets. Our study suggests that circadian rhythms might play an influential role in NSCLC.


Background
Circadian rhythm exists endogenously in almost all organisms. The expression of circadian clock genes drives oscillatory changes in innumerable behavioral and physiological processes, including tumorigenesis and tumor development (1,2). Depending on the organism and cell type, the circadian clock drives the rhythmic expression of 1% to over 60% of the genome, serving as the molecular basis for rhythmic control at the systems level (3). In recent years, increasing attention has been paid to the role of the circadian clock in tumorigenesis, cancer hallmarks, and therapeutic options, and to how circadian clock genes could add a new dimension to future medicine (4,5). Accumulating evidence indicates a tight association between circadian disruption and cancer in terms of curative effect and prognosis, and the core circadian transcripts are generally altered in many kinds of cancers (6-8). Nevertheless, the regulatory mechanisms of circadian clock genes and their effects on clinical prognosis remain unclear.
Lung cancer is currently the leading cause of cancer death (18.4% of total cancer deaths) and has the highest incidence (11.6% of total cases) worldwide (9). Non-small cell lung cancer (NSCLC) is the most common type and accounts for about 85% of total cases (10). Previous studies have shown that dysregulation of circadian rhythms can influence cancer development by regulating tumor cell apoptosis, autophagy, immune infiltration, and tumor cell-host interactions via the oscillatory and differential expression of related genes; this has been studied in many kinds of cancers, including breast cancer, colorectal cancer, and head and neck squamous cell carcinoma (11-13). However, the landscapes and implications of differentially expressed circadian clock-related genes (CCRGs) in lung cancer remain poorly defined. The potential role of the differential expression of these genes between cancer cells and normal cells cannot be ignored. In the present study, we aimed to develop a risk-score as a "classifier" to predict patient prognosis based on genomic expression profiles from public databases. In total, 1,382 CCRGs with oscillatory transcripts experimentally validated by techniques including RT-PCR, Northern blot, in situ hybridization, and microarray or RNA-seq were analyzed (14). Among them, we identified 290 and 447 differentially expressed CCRGs in lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) tissues, respectively, based on The Cancer Genome Atlas (TCGA) cohort. Then, univariate and multivariate Cox proportional hazards regression analyses were used to screen the differentially expressed CCRGs associated with overall survival. Subsequently, we established two optimal risk-score models to divide LUAD and LUSC patients into high and low risk-score groups, followed by verification in combined Gene Expression Omnibus (GEO) validation sets.
Further, we assessed the prognostic value and the complementary value of molecular and clinical characteristics by survival analysis, receiver operating characteristic (ROC) curve analysis, and correlation analysis. We also identified differences in critical signaling pathways among the differentially expressed CCRGs using Gene Ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Set Variation Analysis (GSVA). Finally, the correlated immune infiltrates in LUAD and LUSC, the genetic alterations of the CCRGs in the risk-score models, and the related drugs and signaling pathways in cancer cells were explored to provide novel ideas for the clinical translation of circadian genes.

Patient information and databases in training and validation sets
A list of 1,382 Homo sapiens (human) CCRGs validated by experiments including RT-PCR, Northern blot, and in situ hybridization was obtained from the Circadian Gene Database (CGDB: http://cgdb.biocuckoo.org/index.php/) (14). For the training set, both the gene expression profiles (HTSeq-FPKM) and the clinical information of 594 LUAD samples (normal count: 59; tumor count: 535) and 551 LUSC samples (normal count: 49; tumor count: 502) were downloaded from the TCGA database (https://portal.gdc.cancer.gov/). Patients who lacked follow-up information were excluded from the survival analysis. For the testing set, microarray expression profiles and clinical information were obtained from the GEO database (ncbi.nlm.nih.gov/geo/) using the accession numbers GSE30219, GSE31210, GSE3141, GSE37745, GSE50081, and GSE68465, which together contained more than a thousand samples from patients with lung cancer. All samples were then classified as LUAD or LUSC according to histological criteria so that each risk-score model could be verified separately.

Data processing
To reduce heterogeneity among datasets and ensure a unified standard, the RNA-seq profiles were transformed using the formula log2(x + 1) and normalized. All data processing, normalization, analysis, and plotting in the present study were done using R version 3.6.1 (https://www.r-project.org/) and the Perl programming language version 5.28.1 (https://www.perl.org/).
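As an illustration of the transformation described above, the following minimal Python sketch applies log2(x + 1) to a vector of expression values and then standardizes it. The study itself used R; the normalization method is not specified in this section, so the per-gene z-score used here is an assumption (it matches the standardization mentioned for the risk-score model), and the input values are made up.

```python
import math
from statistics import mean, pstdev

def log2_plus_one(values):
    """Apply the log2(x + 1) transform to raw expression values (e.g., FPKM)."""
    return [math.log2(v + 1.0) for v in values]

def z_score(values):
    """Standardize values to mean 0 and unit (population) standard deviation.

    Assumption: z-scoring stands in for the unspecified normalization step.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant gene carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical FPKM values of one gene across four samples.
fpkm = [0.0, 1.0, 3.0, 7.0]
logged = log2_plus_one(fpkm)
scaled = z_score(logged)
print(logged)
print(scaled)
```

The pseudocount of 1 keeps zero-expression values defined under the logarithm, and the standardized values are centered on zero by construction.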

Functional enrichment analysis
To explore the pathways and interactions affected by the differentially expressed CCRGs, functional annotation (GO), including biological process, cellular component, and molecular function, and KEGG pathway enrichment analysis were performed and visualized using the R package "clusterProfiler" (http://www.bioconductor.org/). Finally, we used GSVA, a gene set enrichment method, to estimate the variation of pathway activity between the high risk-score and low risk-score groups in an unsupervised manner (15).

Risk-score model construction
We performed univariate and multivariate Cox proportional hazards regression to screen the differentially expressed CCRGs significantly associated with prognosis in the training cohort. Then, a risk score was calculated for each patient, separately in the LUAD and LUSC sets, from the regression coefficients of the individual CCRGs retained in the multivariate Cox regression model and the expression value of each selected CCRG. The formula used for this analysis was risk-score = $h(t, X) = h_0(t)\exp(\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)$. Here, $\beta_i$ is the regression coefficient of the i-th CCRG, and the corresponding hazard ratio (HR) is obtained by exponentiation, $\mathrm{HR}_i = \exp(\beta_i)$. $h_0(t)$ is the baseline hazard function, and $h(t, X)$ is the hazard function associated with the covariates X at time t. Each covariate $X_i$ represents the relative expression of one CCRG, standardized by z-score. After fitting the multivariate Cox regression, the risk score returned by the function "predict()" is $h(t, X)$. All patients were then divided into a high-risk or low-risk group, using the median risk score as the cutoff to dichotomize the LUAD and LUSC training sets separately; a low risk score indicates a superior prognosis. Finally, the prognostic risk-score signature was verified in the testing sets, which were integrated from the GEO cohorts (GSE30219, GSE31210, GSE3141, GSE37745, GSE50081, GSE68465).
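The scoring and median split described above can be sketched as follows. This is a minimal Python illustration, not the R workflow used in the study: the gene names, coefficients, and expression values are hypothetical. It also assumes the score is computed relative to the baseline hazard, which is shared by all patients and therefore does not change their ranking.

```python
import math
from statistics import median

def linear_predictor(expr, coefs):
    """Sum of beta_i * X_i over the genes in the signature."""
    return sum(coefs[gene] * expr[gene] for gene in coefs)

def relative_risk(expr, coefs):
    """exp(linear predictor): the hazard relative to the baseline h0(t)."""
    return math.exp(linear_predictor(expr, coefs))

def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical coefficients from a fitted multivariate Cox model.
coefs = {"GENE_A": 0.5, "GENE_B": -0.3}

# Hypothetical z-scored expression values for four patients.
patients = {
    "P1": {"GENE_A": 1.0, "GENE_B": 0.0},
    "P2": {"GENE_A": -1.0, "GENE_B": 1.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 2.0, "GENE_B": -1.0},
}

scores = {pid: relative_risk(x, coefs) for pid, x in patients.items()}
groups = split_by_median(scores)
print(groups)
```

A gene with a positive coefficient (HR above 1) raises the score as its expression increases, whereas a gene with a negative coefficient lowers it, which is how risk genes and protective genes enter the same score.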

Immune infiltration analysis
We analyzed the correlation between the CCRGs included in the risk-score model and the abundance of immune infiltrates, including B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells, in LUAD and LUSC via TIMER (https://cistrome.shinyapps.io/timer/). TIMER, which includes 10,897 samples across 32 cancer types from TCGA, is a comprehensive resource for systematic analysis of immune infiltrates across diverse cancer types; it relies on a deconvolution method to infer the abundance of tumor-infiltrating immune cells from RNA-Seq expression profiles (16). Gene expression levels against tumor purity are displayed in the left-most panel. All gene expression levels are displayed as log2 RSEM.
Further, we compared immune cell infiltration between the high-risk and low-risk groups using CIBERSORTx (https://cibersortx.stanford.edu/). CIBERSORTx is an extension of CIBERSORT that provides an analytical method to infer cell-type-specific gene expression profiles without physical cell isolation (17). Using a signature matrix built by a machine learning method from the transcriptome profiles of single cells or sorted cell subpopulations, CIBERSORTx can be applied to bulk tissue expression profiles to infer cell-type proportions and cell-type expression signatures. On this basis, we estimated the abundances and distribution differences of 22 immune cell types in the mixed cell populations of high-risk and low-risk group samples, using CCRG expression data in LUAD and LUSC, respectively. During the analysis, gene expression was normalized using the R package "limma," and samples were filtered after abundance estimation by p-value (< 0.05). When within-group data were merged, we combined the data by global averages.
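The filtering and within-group averaging step can be illustrated with a short Python sketch. The record layout (a group label, a deconvolution p-value, and a dictionary of cell-type fractions per sample) and all values are hypothetical; this is not the CIBERSORTx output format.

```python
from collections import defaultdict

def group_mean_fractions(samples, alpha=0.05):
    """Drop samples with p-value >= alpha, then average fractions per risk group."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for s in samples:
        if s["p_value"] >= alpha:
            continue  # unreliable deconvolution estimate, excluded
        counts[s["group"]] += 1
        for cell, frac in s["fractions"].items():
            sums[s["group"]][cell] += frac
    return {
        group: {cell: total / counts[group] for cell, total in cells.items()}
        for group, cells in sums.items()
    }

# Hypothetical samples with two of the 22 immune cell types.
samples = [
    {"group": "high", "p_value": 0.01, "fractions": {"T cells CD8": 0.10, "Tregs": 0.30}},
    {"group": "high", "p_value": 0.20, "fractions": {"T cells CD8": 0.50, "Tregs": 0.50}},
    {"group": "high", "p_value": 0.03, "fractions": {"T cells CD8": 0.20, "Tregs": 0.10}},
    {"group": "low", "p_value": 0.02, "fractions": {"T cells CD8": 0.40, "Tregs": 0.05}},
]

means = group_mean_fractions(samples)
print(means)
```

In this toy input the second high-risk sample is discarded because its p-value exceeds 0.05, so the high-risk averages are computed from two samples only.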

Mutation analysis of CCRGs in risk-score model
The cBioPortal for Cancer Genomics (http://cbioportal.org) is a Web resource for visualizing, exploring, and analyzing multidimensional cancer genomics data. It is developed and maintained by a multi-institutional team that includes the Memorial Sloan Kettering Cancer Center, the Dana Farber Cancer Institute, Princess Margaret Cancer Centre of Toronto, and Children's Hospital of Philadelphia. In this portal, molecular profiling data from cancer tissues and cell lines are reduced to readily understandable genetic, epigenetic, gene expression, and proteomic events, together with clinically relevant events. Using cBioPortal, we investigated the mutations and expression of the CCRGs in the risk-score models in Lung Adenocarcinoma (TCGA PanCancer Atlas) (566 samples) and Lung Squamous Cell Carcinoma (TCGA PanCancer Atlas) (487 samples), respectively.

Related drugs and pathways analysis
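We also analyzed the anti-tumor drugs and pathways related to the CCRGs included in the risk-score model. All relevant information was retrieved from GeneCards and supplemented with relevant literature from PubMed.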

Statistical analysis
All statistical analyses in the present study were performed using R version 3.6.1 (https://www.r-project.org/), and a p-value < 0.05 was regarded as statistically significant for all analyses. The Kruskal-Wallis test and one-way ANOVA were used to check expression differences among the genes and the association of the risk-score with clinical signatures. OS was analyzed using Kaplan-Meier survival curves, and the log-rank test was used to check for significant differences between the high-risk and low-risk groups. Univariate and multivariate Cox proportional hazards regression models were used to identify the key CCRGs that affect the prognosis of NSCLC patients. ROC curve analysis was used to evaluate the sensitivity and specificity of the prognostic prediction of the CCRG signature risk-score model. Prognostic accuracy was expressed as the area under the ROC curve (AUC). All tests were two-sided.
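To make the AUC concrete, the following minimal Python sketch computes it as the probability that a randomly chosen patient with the event has a higher risk score than a randomly chosen patient without it (the Mann-Whitney formulation). This section does not state how the outcome was binarized for the ROC analysis, so the binary labels below are an assumption, and the scores are made up.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties count as half a correct pair. Labels: 1 = event, 0 = no event.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores and binary outcomes for four patients.
risk_scores = [0.9, 0.8, 0.4, 0.3]
events = [1, 0, 1, 0]
auc = roc_auc(risk_scores, events)
print(auc)  # 3 of 4 pairs are ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading values such as 0.788 and 0.738.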

Results
Identification and screening of differentially expressed genes

Functional enrichment analysis of differentially expressed genes
First, we analyzed the association of the differentially expressed CCRGs with GO terms in the biological process (BP) and cellular component (CC) categories. For LUAD samples, the top five enriched BP terms were 'regulation of hemopoiesis,' 'neutrophil degranulation,' 'neutrophil activation involved in immune response,' 'neutrophil-mediated immunity,' and 'neutrophil activation' (Fig. 2A-D). The top two enriched CC terms were 'chromatin' and 'secretory granule membrane' (Fig. 2A-D). For LUSC samples, the top five enriched BP terms were 'neutrophil activation involved in immune response,' 'neutrophil activation,' 'neutrophil degranulation,' 'neutrophil-mediated immunity,' and 'negative regulation of immune system process,' which was similar to the LUAD results but more inclined towards immunomodulation (Fig. 2E-H). For CC, the top two enriched terms were the same as in LUAD (Fig. 2E-H). KEGG analysis indicated that the altered CCRGs were mainly involved in systemic lupus erythematosus and osteoclast differentiation in LUAD samples, and in systemic lupus erythematosus, apoptosis, and the B cell receptor signaling pathway in LUSC samples (Fig. 3).
Finally, we used GSVA to investigate the differential distribution of signaling pathway enrichment of the differentially expressed CCRG set between the high risk-score and low risk-score groups. In LUAD samples, compared with the low-risk group, CCRGs of the high-risk group were mainly enriched in the G2M_CHECKPOINT,
Subsequently, results indicated that the expression of 10 CCRGs correlated with the OS of LUAD and LUSC patients, respectively (Table 1). Accordingly, we constructed risk-score models for predicting the prognosis of LUAD and LUSC patients using the formula described in the Methods. CDA, POU2AF1, TUBB6, SPAG8, NT5E, ARRB1, DDIT4, HAL, PHLDB2, and AGMAT were included as risk genes in the LUAD risk-score model, and ALOX5AP, RALGAPA2, TIGD3, PNPLA6, ALPL, TREM1, VSIG4, CD300C, HIST1H2BH, and WNT10A as risk genes in the LUSC risk-score model (Fig. 5C-F). Survival analysis revealed a significant difference in OS between the high-risk and low-risk groups, with patients in the high-risk group having a significantly inferior prognosis (LUAD: p < 0.0001, HR: 2.117, 95% CI: 1.546 to 2.900; LUSC: p < 0.0001, HR: 2.066, 95% CI: 1.552 to 2.751) (Fig. 6A-B). We also ranked the risk scores of LUAD and LUSC patients for OS and explored their distribution (Fig. 6C-D). The dot plots show the status of each patient in the training sets (Fig. 6E-F). The heat maps show the differential expression of the signature CCRGs in the high-risk and low-risk groups (Fig. 6G-H). In LUAD patients, the high-risk group showed upregulation of the high-risk genes HAL, PHLDB2, AGMAT, DDIT4, CDA, NT5E, and TUBB6 and downregulation of the protective genes POU2AF1, SPAG8, and ARRB1 compared with the low-risk group. Moreover, samples with high-risk scores of
Table 1 Descriptions of CCRGs of the risk-score model in Cox proportional hazard regression analysis.

Validation of the prognostic risk-score model in the testing set
We next validated the stability and accuracy of the prognostic risk-score model in the testing sets, which included LUAD and LUSC cohorts from the GEO database. OS was selected as the key indicator for comparing the groups, and samples were divided into low and high risk-score groups based on the risk score calculated with the formula described above. For the LUAD testing set, 519 and 544 samples were assigned to the low and high risk-score groups, respectively. Survival analysis showed a significant difference between the high and low risk-score groups (p < 0.0001, HR: 1.493, 95% CI: 1.248 to 1.787) (Fig. 7A). Similarly, 88 samples in the low-risk group and 89 samples in the high-risk group were included in the survival analysis of the LUSC testing set. The results also showed a significant difference between the high and low risk-score groups (p = 0.0486, HR: 1.453, 95% CI: 1.002 to 2.105) (Fig. 7B). In summary, these results confirmed that both risk-score models based on CCRG signatures were stable and accurate in predicting patient prognosis.
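As a plausibility check on how a hazard ratio, its confidence interval, and its p-value relate, the sketch below back-calculates the standard error of log(HR) from a reported 95% CI and derives a two-sided Wald p-value. It assumes the interval is a Wald interval that is symmetric on the log scale; the reported p-values may instead come from a log-rank or likelihood-based test, so only approximate agreement should be expected.

```python
import math

def wald_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the Wald z statistic, and a two-sided p-value.

    Assumption: (lower, upper) is a 95% Wald interval, i.e.
    exp(log(hr) -/+ z_crit * se).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return se, z, p

# Values reported for the LUSC testing set.
se, z, p = wald_from_hr_ci(1.453, 1.002, 2.105)
print(round(se, 4), round(z, 3), round(p, 4))
```

Under this assumption the derived p-value lies just below 0.05, in line with the reported p = 0.0486 and with a lower confidence bound that only slightly exceeds 1.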

Clinical characteristics correlation analysis
In this section, we further explored the stability and reliability of the risk score as a clinical indicator. Seven  (Fig. 8A-D).
Our analysis showed that the risk-score was an independent prognostic indicator in both LUAD and LUSC patients. We then constructed ROC curves for the different variables to evaluate the risk-score as a classifier, using the AUC as the basis for evaluation (LUAD: AUC 0.788; LUSC: AUC 0.738) (Fig. 8E-F). The results indicated that the risk-score had superior accuracy and predictability compared with the other clinical characteristics in both LUAD and LUSC samples.

Immune correlation and infiltration analysis
First, we used TIMER to assess, gene by gene, the correlation between the expression of each CCRG in the risk-score model and immune infiltration levels in LUAD and LUSC samples. We found that the expression of almost all CCRGs in the risk-score model correlated significantly with the infiltrating levels of B cells, CD8+ T cells, CD4+ Fig. 1-2). Further, we used CIBERSORTx to explore the infiltration levels of diverse immune cells in the high-risk and low-risk groups. We first calculated the abundances of 22 immune cell types in each sample using standardized CCRG expression profiles in LUAD and LUSC, respectively. Then we filtered the samples according to the p-value to eliminate the bias caused by inaccurate estimation and grouped the samples based on

Genetic alteration analysis
We investigated the genetic alterations of the CCRGs in the risk-score model using cBioPortal to further understand their contributions to carcinogenesis. We found that these risk-associated CCRGs were relatively conserved, with mutation rates all below 3% (mostly around 1%-2%) in both LUAD and LUSC (Fig. 10). The low frequency of genetic alterations suggested the high stability and conservation of these CCRGs and their crucial roles in the genetics, epigenetics, and development of lung cancer.

Related drugs and related pathways
We analyzed the drugs and signaling pathways related to the CCRGs of the risk-score model in tumor cell lines. All relevant information was retrieved from GeneCards and supplemented with relevant literature from PubMed. Although some anti-tumor drugs targeting these CCRGs have been applied in clinical practice or are in clinical trials, many gaps remain in the relevant research, especially for LUSC patients (Fig. 11). Most of the existing drugs act in cancer through classic signaling pathways, such as the EGFR, NF-kB, and PI3K-Akt signaling pathways.

Discussion
Circadian biology is being considered the fourth dimension of modern medicine (18). There is clear evidence for the differential expression of CCRGs in a variety of diseases, and cancers are no exception (19). These self-perpetuating expression patterns are also synchronized by external environmental cues (light, temperature, ingestion, and activity), known as zeitgebers. The expression changes follow a 24-hour periodicity and might also be affected by the seasons. The implications of CCRG expression regulation for epigenetic control mechanisms during tumor initiation and progression are widely recognized; they include circadian metabolic changes and the tumor-derived macroenvironment, as reported in mouse models of breast cancer and LUAD (20). These implications have also been indicated by epidemiological studies (21). Changes in and disruptions of circadian rhythms in humans significantly increase the risk of tumorigenesis (22,23). Circadian biology is thus becoming critical to understanding the molecular mechanisms at work in cancer cells. Nevertheless, its importance has rarely been recognized in clinical studies and practice, and even less in translation to the bedside.
On this basis, we attempted a substantial step in that direction by describing the landscapes and implications of the differentially expressed CCRGs and investigating the connection between disruption of circadian rhythms and prognostic significance in the most common and malignant tumor. We integrated and systematically analyzed the expression profiles of 1,382 human CCRGs in NSCLC using the CGDB, TCGA, and GEO databases. We proposed a pioneering CCRG-based risk-score model to better assess the effects of circadian rhythm on prognosis. In addition, based on the score, we examined the infiltration changes of immune cells, the genetic alterations, and the potential of these genes as pharmacological targets in these samples.
Recently, a study integrating and analyzing TCGA data investigated the association between 14 clock genes and prognostic signatures in NSCLC patients, and also showed that differentially expressed clock genes constitute characteristic asynchronous circadian rhythms (24). To date, thousands of genes and proteins are considered to be related to circadian oscillation. Given the significance of circadian rhythms in lung cancer, it is reasonable to speculate that CCRGs hold promise for prognostic prediction and that a risk score based on a multiple-gene signature derived from dependable algorithms would be more reliable than any single molecule in predicting the prognosis of NSCLC. We therefore put forward a risk-score model in which ten-gene signatures were selected to evaluate prognostic risk in the LUAD and LUSC training sets, and the predictive validity of the model was validated in several GEO NSCLC cohorts. The risk scores significantly stratified patient outcomes and immune cell infiltration levels between the high-risk and low-risk groups. Further, the risk-score showed stability and accuracy as a classifier in Cox proportional hazards regression that included the risk-score and other clinical variables. In our analysis, the high-risk group had an inferior prognosis and a reduced anti-tumor immune response. Through systematic functional analysis, we found that the ten genes in each of the LUAD and LUSC risk-score models are involved in many biological processes, such as cell cycle control, metabolism, immune modulation, inflammatory reaction, cytoskeletal reorganization, chromatin remodeling, apoptosis in response to DNA damage repair, and protein synthesis and transport.
We concluded that the above processes in tumor cells might be affected by the circadian rhythm. Several studies have demonstrated that a wide range of core circadian clock components are epigenetically altered, and that this perturbation could promote tumorigenesis, progression, and decreased survival in lung cancer, which also suggests that circadian homeostasis has an essential tumor-suppressive role (25,26).
Interestingly, we found that in the high-risk group the infiltration levels of cells critical to specific anti-tumor immunity, such as CD4+ T cells, CD8+ T cells, and dendritic cells, were markedly suppressed, while the activity and infiltration of inflammatory cells and Tregs with negative regulatory ability were promoted. This indicates that circadian rhythms and related genes play a vital role in tumor immunity and the tumor-associated inflammatory response. Recent studies support our results. The current view is that CCRGs are expressed in most immune cells and oscillate with a fixed circadian rhythm, performing essential roles in a wide range of immunomodulatory processes, including phagocytosis, apoptosis, the synthesis and release of cytokines, chemokines, and cytolytic factors, and the responses mediated by pattern recognition receptors (27). Differential expression of CCRGs also plays a vital role in the development and specification of immune cell lineages (28). This view is also reflected in our analysis. For instance, the infiltration levels of resting memory CD4+ T cells and naive CD4+ T cells were decreased, while the level of activated memory CD4+ T cells was increased in the high-risk group. Consequently, alterations in circadian rhythms due to differential gene expression in cancer cells may disturb immune responses, and these changes may be caused by clock gene mutation, environmental disruption, age, or the tumor itself. A study of circadian rhythm reprogramming during lung inflammation suggested that early events in lung injury may produce a complex reorganization of cellular and molecular circadian rhythms and further regulate the immune responses of the host (29).
It will be essential to determine the mechanism and causality of the oscillations driven by CCRGs in cellular function, metabolism, and immunity, and whether the critical drivers of these oscillations depend on the time of day or the season. If so, this might strengthen our fundamental understanding of how the circadian rhythm adjusts metabolism and immune functions to anticipate changes in the environment, and provide a bridge between circadian rhythms and novel insights that facilitate the development of chronotherapies for fighting cancer and other diseases.
Besides, our genetic alteration analysis also suggested the low mutation rates of these CCRGs in the risk-score model, which was also in line with the current view that CCRGs were evolutionarily conserved in eukaryotes (18).
Moreover, this conservation would affect many critical cell functions, such as immunomodulation. Studies over the last decade indicate that immune responses related to the circadian oscillators are a consequence of this Darwinian selection process, and that the circadian rhythm could minimize the costs and maximize the benefits of immunity to optimize organismal fitness in a given environment (30). Thus, disruption of the normal circadian rhythm may result in differential CCRG expression and altered metabolic rhythms, which might support host immunity but also increase the probability of tissue damage and catastrophic vulnerability (31).
Furthermore, because of this conservation, our analysis of anti-tumor drugs related to the CCRGs in the risk-score model found few direct targets among these genes. We also found subtle differences in CCRG differential expression and immune cell infiltration between LUAD and LUSC, which might result from the specific contexts of different cancer types. These questions still urgently need to be studied and resolved.

Conclusions
All organisms on Earth are exposed to regular environmental cycles generated by the rotation and revolution of the Earth. This, in turn, has led to the evolution of circadian rhythms driven by CCRGs, which help living organisms anticipate and adapt to internal and external changes in their environment. We preliminarily explored a risk-score based on ten-CCRG signatures in LUAD and LUSC, respectively, using the TCGA and GEO databases. This risk-score was an independent predictor of prognosis. Cell functions and immune infiltration in the high-risk and low-risk groups, as well as the genetic alterations of these CCRGs, were also investigated in our study.
Differential expression of CCRGs also regulated the immune cell infiltration level in NSCLC. These CCRGs were evolutionarily conserved with low mutation rates. Further studies and experimental confirmation are needed to explore potential drug targets for therapy.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
There are no conflicts of interest to declare.

Figure 10
Genetic alteration of ten CCRGs in the risk-score model. (A) For ten CCRGs in the risk-score model of LUAD patients. (B) For ten CCRGs in the risk-score model of LUSC patients.