Comprehensive Analysis of the Expression and Prognosis for IRXs in Non-small Cell Lung Cancer

The Iroquois homeobox transcription factor family (IRXs) has been increasingly reported to play roles in suppressing or promoting a variety of cancers; however, little is known about their expression and prognostic value in human lung cancer. In this study, the Oncomine, GEPIA, Kaplan-Meier plotter, and cBioPortal databases were used to analyze the expression patterns and prognostic values of the six IRXs in non-small cell lung cancer (NSCLC), and their related functions and pathways were examined using GO enrichment. Compared with normal lung tissues, the expression of IRX1 and IRX2 in NSCLC tissues was significantly lower, and their expression was positively correlated with the 10-year survival rate of patients. Higher expression of IRX4 was related to advanced tumor stage and suggested a poor prognosis. IRXs may also play a tumor-suppressive role in NSCLC when localized to the cytoplasm, while localization to the nucleus suggests more malignant behavior. Together, these results suggest that IRX1 and IRX2 may be prognostic indicators of lung adenocarcinoma (LUAD), and that IRX4 could be a potential target for LUAD treatment.


Introduction
Cancer is the second leading cause of death worldwide, as reported by the World Health Organization (WHO) (Bray, et al. 2018). Global statistics show that lung cancer is still the most common type of cancer, accounting for 11.6% of all cancer cases diagnosed and 18.4% of all cancer deaths (Bray, et al. 2018). More than half of lung cancer patients die within one year of diagnosis, and the 5-year survival rate is only 17.8% (Zappa, Mousa 2016). Lung cancer cells originate from respiratory epithelial cells, and the most common type of lung cancer is non-small cell lung cancer (NSCLC), which accounts for 85% of lung cancer patients. NSCLC can be divided into three main pathological subtypes: adenocarcinoma (LUAD), squamous cell carcinoma (LUSC), and large cell carcinoma (Dela Cruz, et al. 2011). LUAD is the most common subtype, accounting for 40% of NSCLC cases, followed by LUSC (25-30%) and large cell (undifferentiated) carcinoma (5-10%) (Zappa, Mousa 2016). The discovery and recognition of biomarkers related to the development of NSCLC, patient survival, and response to treatment is of great significance for the development of early detection tools and treatment options.
IRX genes were first identified in the neuro-developmental system of Drosophila melanogaster, and encode highly conserved transcription factors (TFs) (Gomez-Skarmeta, Modolell 2002). Genes from this family are expressed in the embryonic tissues of invertebrates and vertebrates, and participate in the development of various organs as well as other functions crucial for survival (Gomez-Skarmeta, Modolell 1996; Leyns, et al. 1996; Netter, et al. 1998). In vertebrates, six IRX genes have been identified and divided into two groups, IRXA (IRX1, IRX2, IRX4) and IRXB (IRX3, IRX5, IRX6), in which IRX1 and IRX3, IRX2 and IRX5, and IRX4 and IRX6 are the respective homologous pairs (Houweling, et al. 2001; McDonald, et al. 2010; Ogura, et al. 2001). In the human genome, IRXA and IRXB are located at 5p15.33  Recently, it has been reported that IRX genes play important roles in the occurrence and development of a variety of tumors.
Overexpression of IRX1 may lead to pulmonary dysplasia by inducing pulmonary interstitial thickening (Doi, et al. 2011). Hypomethylation of IRX1 is related to a high risk of lung metastasis of tumors (Lu, et al. 2015), and IRX1 methylation is negatively correlated with LUSC prognosis (Gao, et al. 2019). In LUAD, the DNA methylation level of IRX2 was found to be decreased and negatively correlated with invasion (Sato, et al. 2014). Whole genome analysis indicated that IRX4 and IRX5 are potential oncogenes in NSCLC, and that IRX5 expression is negatively correlated with the survival rate of nonsmokers with lung adenocarcinoma. Silencing IRX5 can inhibit tumor formation and cause G1 phase cell arrest, as CyclinD1 is its downstream target D. L.  .
Despite this understanding, the function of IRXs in NSCLC and the basic mechanisms by which they promote or inhibit cancer have not been fully elucidated. The abnormal expression of IRXs and its relationship with clinicopathological features and prognosis have been reported in human NSCLC; however, as far as we know, bioinformatics has not been used to explore the role of IRXs in NSCLC. RNA and DNA analyses have become increasingly important tools in biomedical research for understanding pathological processes.
Based on thousands of published gene expression and copy number variation datasets, we analyzed the differential expression and mutations of IRX factors in NSCLC patients in order to determine the expression patterns, potential functions, and prognostic values of IRXs in NSCLC.

Oncomine Analysis
The transcriptional levels of IRXs in various tumor tissues were analyzed using the tumor datasets in the Oncomine database (https://www.Oncomine.org/resource/login.html), and the mRNA expression levels of IRXs in normal and tumor tissues were compared statistically. The cutoffs for statistical significance were set to a P value of 0.05 and a fold change of 2.0.

GEPIA Dataset
GEPIA (Tang, et al. 2017) (http://gepia.cancer-pku.cn/) is an interactive web server that can be used to analyze the whole-transcriptome data of 9736 tumors and 8587 normal samples from the TCGA database and the GTEx project. It provides differential expression analysis of genes in tumor tissues relative to normal tissues, across tumor types, or across pathological stages; it performs patient survival analysis in combination with clinical information; and it can analyze the correlation between genes.
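The Oncomine cutoffs stated above (P value 0.05, fold change 2.0) can be illustrated with a short Python sketch. This is not the Oncomine implementation: the `Comparison` record, the `classify` function, the representation of fold change as a tumor/normal ratio, and the use of a strict `<` comparison for the P value are all assumptions made for illustration, and the example records are toy values rather than results from this study.

```python
from dataclasses import dataclass

P_CUTOFF = 0.05   # P value cutoff stated in the text
FC_CUTOFF = 2.0   # fold change cutoff stated in the text


@dataclass
class Comparison:
    """One tumor-vs-normal comparison (hypothetical record layout)."""
    gene: str
    dataset: str        # hypothetical dataset label
    fold_change: float  # assumed tumor/normal ratio; < 1 means lower in tumor
    p_value: float


def classify(c: Comparison) -> str:
    """Label a comparison using both cutoffs (assumed decision rule)."""
    if c.p_value >= P_CUTOFF:
        return "not significant"
    if c.fold_change >= FC_CUTOFF:
        return "over-expressed"
    if c.fold_change <= 1.0 / FC_CUTOFF:
        return "under-expressed"
    return "not significant"


# Toy records only; they do not reproduce any dataset from the study.
toy = [
    Comparison("IRX1", "toy_dataset_A", 0.4, 0.001),
    Comparison("IRX4", "toy_dataset_B", 2.5, 0.01),
]
labels = {c.gene: classify(c) for c in toy}
```

A comparison is reported only when it passes both cutoffs, so a large fold change with a P value above 0.05 is still treated as not significant.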

Prognosis Analysis
To analyze the overall survival of patients with NSCLC, patients were divided into high-expression and low-expression groups according to the median expression, and Kaplan-Meier survival curves were generated with the Kaplan-Meier plotter (http://kmplot.com/analysis/index.php?p=service&cancer=lung). Hazard ratios (HRs), 95% confidence intervals (CIs), and log-rank P values are indicated in the figures. The number of samples is shown at the bottom of the main plot.
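The median split and survival curve described above can be sketched in plain Python as follows. This is a minimal illustration, not the Kaplan-Meier plotter's code: the function names are hypothetical, assigning values equal to the median to the low group is an assumption, and the hazard ratio and log-rank test are not implemented here.

```python
from statistics import median


def split_by_median(expression):
    """Return (high, low) index lists; ties at the median go to 'low' (assumption)."""
    m = median(expression)
    high = [i for i, x in enumerate(expression) if x > m]
    low = [i for i, x in enumerate(expression) if x <= m]
    return high, low


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data only: expression values, follow-up times, and event flags.
high_idx, low_idx = split_by_median([1.0, 5.0, 3.0, 7.0])
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice, one curve is estimated per expression group and the two curves are then compared, for example with a log-rank test.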

The Cancer Genome Atlas Data and cBioPortal
All NSCLC datasets in the cBioPortal database were selected, and data from 3854 pathological reports were included. The cBioPortal online tool was used to analyze the genomic profiles of IRXs, including mutations, putative copy number alterations (CNAs) from the genomic identification of significant targets in cancer (GISTIC), and mRNA expression Z-scores.
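As a rough illustration of what an mRNA expression Z-score expresses, the sketch below standardizes expression values against a reference set of samples. It is not cBioPortal's code: the function names, the use of the sample standard deviation, and the threshold of 2.0 for calling a sample altered are assumptions chosen for illustration, and the numbers are toy values.

```python
from statistics import mean, stdev


def z_scores(values, reference=None):
    """Standardize values against a reference population.

    If no reference is given, the values themselves are used (assumption).
    """
    ref = values if reference is None else reference
    mu = mean(ref)
    sigma = stdev(ref)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("reference expression has zero variance")
    return [(v - mu) / sigma for v in values]


def is_altered(z, threshold=2.0):
    """Flag a sample whose expression deviates beyond +/- threshold (assumed cutoff)."""
    return abs(z) > threshold


# Toy expression values for one gene, scored against a toy reference set.
scores = z_scores([10.0, 2.0], reference=[1.0, 2.0, 3.0])
flags = [is_altered(z) for z in scores]
```

The choice of reference population matters: the same expression value can yield different Z-scores depending on which samples define the mean and standard deviation.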

Expression of IRXs in Different Types of Cancer and Transcriptional Levels in NSCLC
Pan-cancer expression data from the TCGA database (Figure 1) showed that IRX1, IRX4, and IRX6 are expressed in certain cancers, while IRX2, IRX3, and IRX5 are generally expressed across most cancers. In NSCLC, the overall expression of IRX1 and IRX2 was moderate in both LUAD and LUSC, with no significant difference between them; in contrast, the expression of IRX3 and IRX5 was higher in LUAD than in LUSC, and the expression of IRX4 and IRX6 was significantly higher in LUSC than in LUAD (especially IRX4). The transcriptional levels of IRXs in NSCLC compared with those in normal tissues were also analyzed with Oncomine (Figure 2A). IRX1 and IRX2 were significantly under-expressed in NSCLC: the mRNA expression level of IRX1 was low in two datasets, and that of IRX2 was low in three datasets. IRX4 and IRX5 were over-expressed in NSCLC, with high mRNA expression levels in one dataset. There were no significant results for differential IRX3 and IRX6 expression in NSCLC. The statistics for the individual NSCLC datasets (Table 1) show that the expression of IRX1 and IRX2 in LUAD, LUSC, and large cell lung cancer is significantly lower than in normal lung tissue, with a more than two-fold change. In two datasets, IRX3 expression did not differ significantly between cancer and adjacent tissue in LUAD, but it was lower than in normal lung tissue in LUSC. Interestingly, the expression changes of IRX4, IRX5, and IRX6 were each concentrated in one lung cancer subtype: IRX4 expression was significantly higher in LUSC, IRX5 expression was significantly higher in LUAD, and IRX6 expression was low in large cell lung cancer.

Relationship between mRNA Expression Levels of IRXs and Clinicopathological Factors in NSCLC
The GEPIA database was used to compare the mRNA expression levels of IRXs between NSCLC and normal lung tissues (Figure 2B, C). The expression of IRX1 and IRX2 in LUAD and LUSC was lower than in normal tissues. Expression of IRX4 in LUSC was significantly higher than in normal tissues, but almost no expression was seen in LUAD. The expression of IRX3 and IRX5 was higher than normal in LUAD, but lower than normal in LUSC. IRX6 expression was lower than normal in LUAD. The expression of IRXs at different stages of NSCLC was also examined: IRX1, IRX2, and IRX4 showed significant differences in expression between stages of LUAD, while IRX3 and IRX5 were differentially expressed between stages of LUSC. The expression of IRX6 did not differ significantly between stages of either LUAD or LUSC.
Immunohistochemical information for IRX1, IRX2, and IRX6 in lung cancer was obtained from the Human Protein Atlas database; unfortunately, no immunohistochemical results were available for IRX3, IRX4, and IRX5 (Figure 4). From these data, the positive expression rate of IRX1, IRX2, and IRX6 in lung cancer tissue was found to be relatively low, and positive staining was mainly found in the cytoplasm and membrane. Due to the lack of immunohistochemical results for normal pulmonary bronchial epithelium and alveolar epithelium, the expression trend of IRXs in NSCLC tissues could not be fully determined from this analysis.

The Effect of mRNA Expression Levels of IRXs on the Prognosis of NSCLC
The relationship between IRX mRNA expression and survival in NSCLC patients was analyzed using the Kaplan-Meier plotter and GEPIA databases (Figure 5). The results showed that the expression of IRX1, IRX2, IRX3, and IRX6 was positively correlated with the prognosis of LUAD patients, while the expression of IRX4 was negatively correlated. Interestingly, survival analysis showed that IRXs expression had a more significant effect on the prognosis of patients with LUAD (whether for high expression of IRX1, IRX2, IRX3, or IRX6, or for low expression of IRX4). In the analysis of LUSC prognosis, only IRX2 expression was associated with survival: patients with high IRX2 expression had a lower survival rate, in contrast to the prognostic correlation of IRX2 in LUAD.

Analysis of Gene Alteration of IRXs in NSCLC Patients
The genetic alteration of IRXs in NSCLC was analyzed with the cBioPortal for Cancer Genomics (Figure 6 and Supplementary Table 1). In 3854 samples from 3797 NSCLC patients, 49% of IRXs were altered, among which 15% of IRX1, IRX2, and IRX4 were altered, with less alteration observed for IRX3, IRX5, and IRX6. The alterations of IRX1, IRX2, and IRX4 in NSCLC were mainly gene amplifications.

Predicted Functions and Pathways of IRXs in NSCLC
We further analyzed the proteins that may interact with IRXs using the STRING database, and constructed an interaction network of the top 30 proteins ranked by combined score (Figure 7A and Supplementary Table 2). The g:Profiler online analysis program was then used for functional enrichment of these interacting genes (Figure 7B-D). GO enrichment analysis predicts the function of IRXs in three aspects: biological process, molecular function, and cellular component. In terms of molecular function, IRXs were mainly related to DNA binding and transcriptional activation, and could also participate in Smad binding and BMP (bone morphogenetic protein) receptor binding (Supplementary Table 3). In terms of biological process, IRXs were mainly related to organ development and cell differentiation. KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analysis of IRX-interacting genes identified seven pathways related to IRXs in NSCLC (Figure 7E and Supplementary Table 4), including the TGF-β (transforming growth factor-β) signaling pathway (hsa04350) and the Hippo signaling pathway (hsa04390).
(Bennett, et al. 2009). In this study, expression of IRX1 mRNA in both LUAD and LUSC was significantly lower than in normal tissues. We also found that patients with high expression of IRX1 had a higher 10-year survival rate, and that this increased expression was negatively correlated with tumor stage, though there was no significant correlation between IRX1 and the prognosis or tumor stage of patients with LUSC.

Discussion
Current research on IRX2 shows that it regulates MMP-2 and MMP-9 by activating the AKT pathway to promote tumor migration and invasion in osteosarcoma (Liu, et al.; Liu, et al. 2014). In soft tissue sarcoma, IRX2 affects development through the Wnt pathway (Adamowicz, et al. 2006). Loss of IRX2 expression can also lead to early hematogenous spread in breast cancer (Werner, et al. 2015). In NSCLC, it has been reported that methylation of IRX2 CpG islands in LUAD may be related to malignancy (Rauch, et al. 2012). In this study, IRX2, like IRX1, showed low expression in both LUAD and LUSC cancer tissues. Interestingly, when patient prognosis was analyzed, LUAD patients with high IRX2 expression had higher survival rates, while LUSC patients with similar expression of IRX2 had lower survival rates. Immunohistochemical information indicated that positive expression of IRX2 was localized inside the nucleus in LUSC, but mainly in the cytoplasm in LUAD. It has been reported that IRX1 is localized in the cytoplasm in normal brain tissue and low-grade gliomas, but in the nucleus in higher-grade gliomas (Zhang, Liu, et al. 2018).
IRX5 is able to promote cancer cell proliferation through cyclinD1 in LUAD, and immunohistochemical data show that it is localized in the nucleus in LUAD (Zhang, Qu, et al. 2018). From this, we can speculate that the effect of IRXs on the tumor process is related to subcellular localization, such that expression of IRX2 in the nucleus of LUSC cells indicates a poor prognosis.
Studies of IRX3 in nephroblastoma have shown that it can inhibit tumor growth through the classic Wnt/β-catenin pathway (Holmquist Mengelbier, et al. 2019). There are also reports that IRX3 interacts with the NOTCH pathway and Rho signaling during the formation of renal tumors (Mengelbier, et al.; Scarlett, et al. 2015). IRX3, as the target of mir-377, can also promote the occurrence of liver cancer (Wang, et al. 2016); however, there is no related research on the role of IRX3 in lung cancer. Here, the expression of IRX3 in LUAD tumor tissue was found to be higher than in normal tissue, while expression in the tumor tissue of LUSC patients was slightly lower. In the analysis of expression by tumor stage, the expression of IRX3 in LUSC patients differed significantly between tumor stages. However, prognostic statistics show that IRX3 was positively correlated with the prognosis of LUAD patients, which appears to contradict the expression pattern data. To determine why these differences exist, the composition of the IRX3 protein structure was examined (Supplementary Figure 1). The structures of IRX3 and IRX1 were found to be very similar: both have a homeobox TALE-type domain and a homeobox KN domain in the same location. The difference between the two is that the IRX3 protein is 501 aa long, while IRX1 is 480 aa long; in addition, a leucine-to-proline change exists at amino acid 422 of the IRX3 protein, along with a glutamine-to-histidine change at amino acid 479. From this, we can speculate that the different roles of IRX1 and IRX3 in NSCLC are not related to their homeobox TALE-type and homeobox KN domains, but should be related to their different N-terminal structures, though the detailed mechanism behind this differential behavior must be studied further.
IRX4 has been reported to inhibit the growth of prostate cancer cells by interacting with vitamin D receptors (VDR) (Nguyen, et al. 2012).
Differential gene analysis of NSCLC showed that IRX4 may also act as a tumor-promoting gene in NSCLC (Zhang, Qu, et al. 2018). Here, IRX4 was scarcely expressed in LUAD patients, but its expression was significantly higher in the tumor tissues of LUSC patients. Interestingly, the expression of IRX4 was correlated with a poor prognosis in LUAD, but was not significantly correlated with prognosis in LUSC. Therefore, although the expression of IRX4 in LUAD patients is low, it may still act as a useful prognostic indicator. It is important to note, however, that the above results were all based on mRNA expression, as protein expression information was lacking; because of this, the expression and function of IRX4 in NSCLC still need further study.
2019 , and has also been reported to promote hepatocarcinogenesis (Zhu, et al. 2018). In LUAD, silencing IRX5 can cause G1 phase cell arrest, and IRX5 promotes cell proliferation through cyclinD1 (Zhang, Qu, et al. 2018). In this study, the expression of IRX5 in LUAD tumor tissues was found to be higher than in normal tissues, which is consistent with known reports. Interestingly, IRX5 expression differed significantly between tumor stages in LUSC patients, with a significant increase seen in stage IV patients, further suggesting that high expression of IRX5 indicates a malignant outcome in LUSC. IRX5 is similar in structure to IRX2, whose role in cancer is mostly related to cell migration and distant spread, so it is reasonable to speculate that high expression of IRX5 in patients with stage IV LUSC may be related to migration and invasion as well.
IRX6 has rarely been studied in tumors. It has been reported, however, that IRX6 acts as an oncogene in colorectal cancer and that its expression suggests a poor prognosis (Zuo, et al. 2019). Here, IRX6 expression in the tumor tissues of LUAD patients was found to be lower than in normal tissues, and patients with high expression had better prognoses, suggesting that it plays a tumor-suppressive role in LUAD. The available immunohistochemical data also show that IRX6 is expressed in the cytoplasm of tumor cells in lung cancer patients, which is consistent with our conjecture above: members of the IRX family seem to play a suppressive role in cancer when located in the cytoplasm, but may promote cancer when localized to the nucleus.
This study provides insights into the expression patterns and prognostic value of IRX members in NSCLC. We believe that IRX1 and IRX2 play a tumor-suppressive role in LUAD and may become prognostic indicators. IRX4 seems to have an oncogenic role in LUAD, and may become a potential target for LUAD treatment. Differential localization of IRXs in lung cancer cells was also found to have opposing effects on tumor development. This phenomenon needs more study in order to gain a deeper understanding of the mechanism of IRX action and of nuclear access, which would help in understanding how to regulate these proteins to change the malignant biological behavior of tumor cells.

Declarations
Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Data Availability Statement
The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.