Irinotecan Resistance – In Search of New Markers in CRC.

Colorectal cancer (CRC) is one of the most prominent causes of cancer death worldwide. Chemotherapeutic regimens consisting of different drugs combinations such as 5-uorouracil, and oxaliplatin (FOLFOX) or irinotecan (FOLFIRI) have been proven successful in the treatment of CRC. However, chemotherapy often leads to the acquisition of cancer drug resistance followed by metastasis and in the aftermath therapeutic failure. The molecular mechanism responsible for drug resistance is still unclear. The systemic search for new biomarkers of this phenomenon may identify new genes and pathways. To understand the drug resistance mechanism in CRC, the in vitro study based on the molecular analysis of drug-sensitive cells lines vs drug-resistant cells lines has been used. In our study to bridge the gap between in vitro and in vivo study, we compared the expression proles of cell lines and patient samples from the publicly available database to select the new candidate genes for irinotecan resistance. Using The Gene Expression Omnibus (GEO) database of CRC cell lines (HT29, HTC116, LoVo, and their respective irinotecan-resistant variants) and patient samples (GSE42387, GSE62080, and GSE18105) we compared the changes in the mRNA expression prole of the main genes involved in irinotecan body’s processing, such as transport out of the cells and metabolism. Furthermore, using a protein-protein interaction network of differently expressed genes between FOLFIRI resistant and sensitive CRC patients, we have selected top networking proteins (upregulated: NDUFA2, SDHD, LSM5, DCAF4, and COX10, downregulated: RBM8A, TIMP1, QKI, TGOLN2, and PTGS2). Our analysis provided several potential irinotecan resistance markers, previously not described as such.


Introduction
Colorectal cancer (CRC) is the third most common cancer worldwide and the second leading cause of cancer death [1]. Currently, the main therapeutic approaches to treat CRC are surgery, chemotherapy, irradiation, and targeted therapy [2]. The rst-line chemotherapy strategy is usually a combination of 5-uorouracil, leucovorin, and either oxaliplatin (FOLFOX) and irinotecan (FOLFIRI) [3].
Irinotecan (also known as Camptothecin-11, CPT-11) is a semisynthetic analog of plant alkaloid, camptothecin (CPT), [4,5] and was rst approved for the treatment of metastatic CRC refractory to  in the United States in 1996 [6]. Irinotecan is a pro-drug that is metabolically activated in the body to 7-ethyl-10hydroxycamptothecin (SN-38) which inhibits topoisomerase-I (Topo 1) activity, an enzyme involved in DNA replication, and can affect the proliferation of tumor cells. The central role of Topo1 in the mechanism of irinotecan action has been largely demonstrated, however, its Topo1-independent activities were reported. Irinotecan inhibits binding of the oncoprotein, FUBP11 (Far Upstream Element (FUSE) Binding Protein 1), to its target sequence FUSE on single-strand DNA. FUBP1 is required for hepatocellular carcinoma tumorigenesis, and its downregulation sensitizes hepatocellular carcinoma cells for apoptosis-inducing chemotherapeutic drugs [7]. Irinotecan was also proven to interact directly with the E3 ubiquitin ligase MDM2 and the anti-apoptotic protein Bcl-xL [8]. Those ndings may demonstrate the new mechanism of action for irinotecan as a selective protein binding inhibitor.
After more than two decades of clinical usage, irinotecan turned out to be a versatile chemotherapeutic agent which combines well both with other cytotoxic agents like 5-uorouracil and oxaliplatin and with monoclonal antibodies, such as cetuximab and bevacizumab. Experimental and clinical studies also indicated that irinotecan can be combined with kinase inhibitors, such as fruquintinib, apatinib, dasatinib, regorafenib, and sunitinib or with cell-cycle checkpoint inhibitors [9,10]. Although irinotecan still represents the backbone of CRC chemotherapy several serious disadvantages of its administration such as poor bioavailability, lack of tumor speci city, and most of all the susceptibility to multidrug resistance (MDR) have been revealed [11]. To date, the mechanism of irinotecan resistance is still inconclusive and requires further investigation. The several possible mechanisms that may lead to irinotecan resistance, were suggested: 1) modi cation of structure or changes in expression level of Topo 1, 2) activation of anti-apoptotic signaling pathway, 3) mutations in enzymes which process irinotecan, and 4) reduction intracellular drug accumulation by active drug e ux. Two latter mechanisms are strictly connected to the complicated metabolism of irinotecan which can be described in three major steps, (1) the conversion into its active metabolite, SN38, (2) the detoxi cation of SN38 to generate the inactive metabolite SN38-glucuronide, (3) the reactivation of SN38-G in the intestine by bacterial β-glucuronidase (βG) [12]. These three metabolic steps and irinotecan transport have been widely exploited to identify biomarkers of therapeutic effectiveness.
Irinotecan metabolism is fairly complex and takes place in several different organs, thus its transport, and transport of its metabolites is a very crucial step executed by transporters belonging to the ATP-binding cassette (ABC) proteins superfamily. They are active transporters with a broad range of substrate spectra and several members of ABCB, ABCC, and ABCG proteins are involved in the elimination of harmful xenobiotics including anticancer drugs and their metabolites [18]. Irinotecan or/and SN-38 are transported into blood or bile from hepatocytes by ABCB1, ABCC1, ABCC2, and ABCG2, whereas OATP transporter (SLCO1B1) enable its in ux from blood ABCB1. ABCC2 and ABCG2 are present on the bile canalicular membrane [19,20], contrary to ABCC1, which is strongly expressed on parenchymal cells, resulting in directed transport from the tissue into the blood [21]. The active e ux of irinotecan and its metabolites by ABC proteins is also one of the mechanisms responsible for acquired MDR resistance. ABC proteins, such as ABCB1, ABCB5, ABCC1, ABCC2, ABCC4, ABCC5, ABCG2, were identi ed to be involved in irinotecan/SN-38 transport out of the cytosol, to reduce its intracellular concentration and e cacy in cancer cells [22,23].
Irinotecan resistance due to the complicated nature of its metabolism and transport is very composed. This study used a microarray gene expression pro le from an open database to identify potential biomarkers for irinotecan resistance acquired by colorectal cancer cells. We aimed to investigate mRNA pro le differences between several CRC cell lines, commonly used in vitro study on irinotecan resistance, such as HT29, LoVo, and HTC116, and their irinotecan (SN-38) resistant variants and patients with metastatic colon cancer sensitive or resistant to rst-line treatment of FOLFIRI.

Microarray data processing and analysis
The Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) gene expression pro le with accession numbers: GSE42387, GSE62080, and GSE18105 were downloaded. GSE42387 -HCT116, HT29, and LoVo cells were exposed in vitro to gradually increasing SN-38 concentrations for about nine months, generating sub-cell  lines with acquired resistance. Gene expression pro les of the parental and resistant cell lines were obtained after 2-3 weeks cultured in drug-free medium (each in triplicate) using GPL16297 Agilent-014850 Whole Human Genome Microarray 4x44K G4112F platform (Agilent Systematic Name, collapsed probe, version). GSE62080 database is composed of 21 patients with advanced colorectal cancer treated using FOLFIRI scheme and classi ed according to the WHO criteria as the responders (sensitive S) and non-responders (resistant R) group. Gene expression pro le was obtained using Human Genome GeneChip arrays U133. GSE18105 database is composed of mRNA pro les of normal tissue and CRC patients primary tumors with and without distant metastasis. 111 microarray datasets (77 for LCM samples, and 17 pairs for homogenized samples from tumor and adjacent tissues) were normalized using a robust multi-array average (RMA) method under R 2.6.2 statistical software together with BioConductor package. Next, the gene expression levels were log2-transformed. For our analysis, we have excluded samples named: "metastatic recurrence" obtained from homogenized normal tissue. All data were processed using the GEO2R online analytical tool which uses the R language as described in (https://doi.org/10.1371/journal.pone.0180616 ). Differently Expressed Genes -DEGs were further calculated and visualized using JASP 0.14.1.0 software (https://jasp-stats.org/) using Normality test (Shapiro-Wilk`s) followed by Mann-Whitney U test (for not normally distributed data) or T-test (for normally distributed data). Additionally, in the case of small n size, we have used JASP 0.14.1.0 software (https://jasp-stats.org/) to perform a Bayesian Manna-Whitney U test based on a data augmentation algorithm with 5 chains of 1000 iterations to further verify our data.

Hierarchical clustering analysis
After extracting the expression values from the gene expression pro le, a bidirectional hierarchical clustering heatmap was constructed using an open-source machine learning and data visualization platform Orange (https://orangedatamining.com/).

Results
We analyzed the mRNA pro le of 3 CRC cell lines, HT29, LoVo, and HTC116, and their respective SN-38 resistant variants using the GSE42387 database (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42387). Gene expression pro les of the parental and resistant cell lines were obtained from cells cultured 2-3 weeks in a drug-free medium (each in triplicate). In our analysis rst, we have focused on ABC proteins, as their expression level is considered to have a high impact on overall SN-38 intracellular concentration, and their expression level is mostly analyzed in CRC cells lines to establish the mechanism of acquired irinotecan resistance [24,25] Fig. 1A. Since our previous studies showed that the expression of ABC transporters in CRC was correlated to cell phenotype and shifts during ongoing EMT [22], and considering that many other reports showed that irinotecan resistance is often related to advanced EMT, we additionally compared the expression level of the major EMT markers, such as E-cadherin Our analysis proved the high heterogeneity among all tested cell lines. In general, all tested SN38 resistant variants of CRC cell lines presented the elevated expression of several ABC proteins, however, their pro les were different among all cell lines, Fig. 1A. The expression of prominent SN38/irinotecan transporters, i.e. ABCB1, ABCC1, and ABCC2 was signi cantly upregulated in the majority of SN-38 resistant variants except for HCT116 cells. We noticed an interesting correlation between ABCC4 and ABCG2 expression. In HT29 and LoVo SN38 resistant variants mRNA level of ABCG2 is signi cantly upregulated (compared to parental cell lines), whereas in HCT116 mRNA of ABCC4 is upregulated. Our previous studies showed, that in CRC ABCG2/ABCC4 expression shift is correlated with EMT and epithelial HT29 cells presents a high ABCG2 expression level that is switched for ABCC4 in HT29 cells that acquired mesenchymal-like phenotype through Snail overexpression [22]. In this study LoVo SN38 resistant cells, which present a more mesenchymal phenotype than the parental cell line (vimentin and CDH2 upregulation), exhibit higher mRNA expression of ABCG2. Interestingly, HT29 SN38 resistant cells show the upregulation of ABCC5 -that bears a strong resemblance in structure and substrates speci city to ABCC4 [27]. Regarding the expression pro le of main enzymes involved in irinotecan processing, we noticed that HT29 SN38 resistant variants demonstrate an increase in CES1 expression and a decrease in CES2 expression. Surprisingly, HCT116 SN38 resistant variants cells present upregulated mRNA level of CES2 and downregulated mRNA level of CYP3A5 that metabolizes SN-38 into NPC. The mRNA level of UGT1 enzymes that inactivate SN38 into SN38G also presents ambiguous results. UGT1A6 is upregulated in HT29 and LoVo SN38 resistant variants, UGT1A8 is upregulated in HT29 SN38 but downregulated in LoVo SN38 cells.
To summarize changes between different cell lines and their differences in SN-38 acquired resistance, we performed clustering analysis using mRNA levels of above-analyzed gens Fig. 2. Our results indicated that all SN-38 resistant variants presented signi cantly different pro les compared to the respective parental cell lines, however, the HT29 cell line and its SN-38 resistant variant are signi cantly different than other tested cell lines, and clusters separately. Interestingly LoVo cell line is more closely related to the HCT116 and HTC116 SN38 than to the LoVo SN38 variant, thus we can assume that obtained resistance for SN-38 triggered the biggest changes in mRNA expression of tested genes.
Next, we analyzed samples from 21 patients with advanced colorectal tumor (GSE62080 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62080), that at the end of the rst-line treatment of FOLFIRI scheme were de ne to the responders (sensitive S) and non-responders (resistant R) group according to the WHO criteria [25]. As expected, the expression pro le related to chemotherapy resistance observed in CRC tissue samples differs substantially from those present in the resistant cell lines. In patient samples no statistically signi cant changes were observed in mRNA expression levels of neither ABC transporters nor irinotecan processing enzymes UGT1 variants between sensitive and resistant patient group. To better characterized the patients cohort we analyzed the expression of Leucine-rich repeat-containing G-protein coupled receptor 5 (LGR5). LGR5 potentiates canonical Wnt/β-catenin signaling and is considered a marker of advance CRC state and unfavorable prognostic marker [28]. Bayesian Manna-Whitney U test BF 10 suggests "anecdotal evidence" of mRNA upregulation in resistant group. Surprisingly, mRNA level of vimentin (EMT marker) and TWIST-1 (EMT related transcription factor) were signi cantly downregulated in resistant patient group (supplementary les), The epithelial phenotype may be related to the 5FU resistance [29] as patients were treated with FOLFIRI that contains 5FU.
Next, we analyzed top 250 differently expressed genes between resistant and sensitive patient groups using proteinprotein interaction (PPI) network by STRING version 11.0 (https://string-db.org/) Fig. 3. PPI network was enhanced by known proteins that lls the gaps in proper networking as described in [30]. Upgraded networking for up and down regulated gens respectively contained of: number of nodes: 198; number of edges: 2141; average node degree: 1.6; avg. local clustering coe cient: 0.64; expected number of edges: 488; PPI enrichment p-value: < 1.0e-16; and: number of nodes: 203; number of edges: 2989; average node degree: 29.4; avg. local clustering coe cient: 0.656; expected number of edges: 756; PPI enrichment p-value: < 1.0e-16. KEEG pathway analysis using KEGG PATHWAY Database (https://www.kegg.jp/kegg/pathway.html), STRING version 11.0 and DAVID online tool indicated several signaling pathways that reached False Discovery Rate (FDR) < 0.05 and p < 0.05 as shown in Table.1. Top 5, respectively, up and down regulated gens with highest networking score are shown in Table.2 and their respective KEGG pathways are presented in Table.3. Even though, chosen genes presented the highest networking score, DCAF4 QKI and TGOLN2 shows no KEGG involvement. On the other hand, NDUAF2, SDHD and PTGS2 are involved in high number of pathways. Interestingly, COX10, one of the highest networking protein places on number 1 spot among all upregulated genes in resistant patient group compared to sensitive group. Finally, we have investigate chosen DEGs in uence on CRC patients survival using The Human Protein Atlas data base (https://www.proteinatlas.org/) [31]. 5 year survival time of colon adenocarcinoma patients expressing high or low level of chosen DEGs were compared to the rectum adenocarcinoma, Table.4. We have noticed that 5 year survival time substantially depend on cancer locations. Importantly, LSM5 and DACF4 upregulated in resistant patient samples seems to be unfavorable factor in colon adenocarcinoma patients leading to higher mortality. On the other hand in rectum adenocarcinoma upregulation of NDUFA2 is highly unfavorable factor presenting 19% 5-year survival rate in comparison to 67% for NDUFA2 low expression. Furthermore, for DEGs downregulated in resistant patients, low expression of PTGS2 is unfavorable factor for both colon and rectum adenocarcinoma whereas QKI and RBM8A only for rectum adenocarcinoma. Surprisingly, NDUFA2, SDHD and COX10 that are upregulated in resistant patients presents positive in uence on colon adenocarcinoma patient survival.
Tab.1 KEEG pathway analysis of PPI network. Visualization of top dysregulated pathways by up and down-regulated genes in resistant patients samples from GSE62080 database. Tab.4 The prognostic relationship between high and low expression of speci c genes involved in drug resistance to the overall survival of colon and rectum adenocarcinoma. 5-year survival factor obtained from The Human Protein Atlas database [35]. To better characterize CRC cell lines and their respective SN-38 resistant variants, we have analyzed changes in mRNA expression of top-up/down-regulated best networking proteins that were identi ed in patients samples. The obtained data presented in Fig. 4A,B clearly shows that the data from CRC cell lines analysis partially overlapped with data obtained from CRC patients samples analysis. SN-38 -resistance-related changes in LoVo lines presents signi cant resemblance in changes of mRNA levels of NDUAF2, DCAF4, and PTGS2. Interestingly NDUAF2 and PTGS2 are two of the three most in uential tested proteins involved in several important pathways (Table 3). HCT116 and their SN-38 resistant variant, in comparison to the patient samples, present similar (statistically signi cant) changes in mRNA expression of multifunctional PTGS2 and LSM5, that are involved in RNA degradation and spliceosome mechanism. The above-mentioned cell lines also present a similar tendency observed in CRC patients samples (yet not statistically signi cant change) in mRNA expression of RBM8A that is also involved in spliceosome and RNA transport. The same expression tendency was observed in the case of NDUFA2. On the other hand, the mRNA expression pro le of SDHD presents inverse changes. In HT29 similar to patient samples mRNA expression of TIPM1 and TGOLN2 was downregulated in the SN38-resistant variant. The expression of NDUFA2 was also downregulated in the SN38-resistant variant however, its expression was upregulated in resistant patients. Surprisingly, COX10 which is the number one among up-regulated genes and one of the highest networking proteins, is not upregulated in any of SN-38 resistant variants.
Using mRNA expression pro les of selected genes, we performed clustering analysis, Fig. 5. HT29 and HTC116 with their respective SN-38 resistant variants differ enough to establish their phylogenetic group but are closely located in two respective arms. Interestingly, LoVo and LoVo SN38 cell lines present the highest diversity not only inside the resistant/parental group but also among all tested cell lines, located separately, on different arms of a phylogenetic tree.
To analyze the mutual changes between different DEGs, we decided to correlate mRNA expression pro les of previously described top 5 up and top 5 down-regulated, and the highest networking proteins and we performed a correlation matrix using Pearson correlation, Fig. 6. LoVo (pooled parental and their respective SN-38 resistant variant), similarly to patient samples, presented a positive correlation (yet not statistically signi cant in all cases) of the majority of top 5 upregulated genes of the best networking proteins: NDUFA2, SDHD, LSM5, and DCAF4 (COX10 positively correlates only with NDUFA2 and DCAF4). A negative correlation, similarly to patient samples, was observed among RBM8A, TGOLN2, and PTGS2. In the case of HT29 and HTC116 cell lines data were more ambiguous, as NDUFA2, SDHD, LSM5, DCAF4, COX10, RBM8A, TIMP1, QKI, TGOLN2, and PTGS2 presents mixed correlation to one another.
EMT was proven to play a role in the acquisition of chemoresistance on several anticancer drugs [32]. Observed in LoVo cell line shift towards mesenchymal phenotype upon acquisition of SN38 resistance with changes in EMT markers, vimentin and CDH2 encouraged us for analysis of EMT phenotype in FOLFIRI sensitive and resistant patients cohorts (GSE62080, n=21). First, using the mRNA level of 4 main EMT markers -E-cadherin (CDH1), Ncadherin (CDH2), vimentin (VIM), and bronectin (FN) we divided patient samples into 3 groups: strongly epithelial (E), partially mesenchymal (E/M), and mesenchymal (M). However, all subgroups presented elevated expression levels of CHD1, suggesting a strongly epithelial, well-differentiated phenotype. This observation can be explained by the fact that GSE62080 data set is composed of cancer samples diagnosed at an early stage, with no metastatic occurrence. Thus, we decided to assign the patient samples to E or E/M group. (Fig. 7A). Interestingly, samples distribution proved, that FLOFIRI resistant E/M CRC samples present upregulation of FN and CDH2 rather than VIM (Fig. 7A). Nevertheless, in the resistant cohort (n=12) 7 samples were quali ed into the E/M group and 5 into E while the sensitive cohort (n=9) E/M group was composed of 3 and E of 6 samples (Fig. 7B). These results showed the tendency of the resistant cohort to acquire a more mesenchymal phenotype. Furthermore, this heterogeneous composition of the E and E/M group, which consists of both resistant and sensitive patients samples in each phenotypical group, partially explained why no statistically signi cant upregulation of EMT markers between resistant and sensitive cohorts was observed in GSE62080 data set .
Since GSE62080 data sets include a low number of patients samples we have decided to apply the GSE18105 data set to analyze the mRNA expression level of previously selected top networking DEGs (up-regulated: NDUFA2, SDHD, LSM5, DCAF4, COX10 and down-regulated: RBM8A, TIMP1, QKI, TGOLN2, and PTGS2) in CRC patient samples. GSE18105 (https://www.ncbi.nlm.nih.gov/geo/geo2r/?acc=GSE18105) database composed of mRNA pro les (n=111) obtained using an oligonucleotide microarray of normal tissue and CRC patients primary tumors with and without distant metastasis. Similar to previous analysis samples were divided into E, E/M, and M groups using mRNA levels of 4 main EMT markers -E-cadherin, N-cadherin, vimentin, and bronectin. Only six out of ten tested proteins,i.e. NDUFA2, SDHD, LSM5, COX10, QKI, and TGOLN2, presented statistically signi cant changes in mRNA expression among E, E/M, and M patient groups (Fig. 8A). We observed that in M and E/M group similar to the resistant patients' group NDUFA2, SDHD, LSM5, COX10 and TGOLN2 was upregulated, additionally, the mRNA level of QKI was upregulated in E/M and M phenotype group unlike to the one observed in the resistant patients' group. Mesenchymal phenotype correlated upregulation of NDUFA2, SDHD, and COX10 that are involved (among the vast amount of different processes) in oxidative phosphorylation (OXPHOS) is extremely interesting. Several experiments proved that human CRC exhibit higher rates of oxidative phosphorylation than large intestine normal cells [33]. EMT regulates ABC proteins expression [35]. Previously we proved that ABCC4 in CRC correlates with EMT and CRC cells with advanced EMT phenotype presents a higher level of ABCC4 contrary to ABCG2 level [22]. In the current study, we con rmed our previous observation that ABCC4 is upregulated in more mesenchymal phenotype presenting cells (M and E/M groups) whereas expression of ABCG2 is slightly (yet statistically signi cant) downregulated in advanced EMT (Fig. 8B). We also showed that ABCC1 correlates with mesenchymal phenotype, however to our biggest surprise, ABCC5 that was upregulated in SN-38 resistant variants of CRC cell line HT29 (and partially in LoVo) presents downregulated mRNA levels in both E/M and M patient groups. Finally, to prove that our E, E/M, and M groups are composed correctly, we have also tested the mRNA level of LGR5 [28], which was signi cantly upregulated in both the E/M and M group compared to E.

Discussion
Approximately 50-60% of patients who are diagnosed with CRC will eventually develop metastatic disease. Most often, metastases develop after rst-line therapy treatments for local disease thus the drug resistance is still the major problem in CRC treatment [36,37]. Gene pro ling, among other methods, has been used to identify genes involved in the progression and the prognosis of CRC or to establish current CRC classi cation based on four molecular subtypes (CMS1-CMS4) [38,39]. Previously, we identi ed NMU as a predictive marker of CRC invasiveness [40]. In the current study to bridge the gap between in vitro and in vivo study, we combined and compared the expression pro les of cell lines and patient samples from the publicly available database to select the new candidate genes for irinotecan resistance. Many attempts to identify chemotherapy predictive biomarkers of treatment response and resistance in CRC were made. However, most of these biomarkers, except for the KRAS and BRAF genes, are not accurately predicting treatment response. As such, additional studies are urgently needed to identify and validate novel biomarkers to improve the therapy for CRC patients [39].
CRC cell lines have previously been shown to recapitulate the mutational and transcriptional heterogeneity of primary tumors [41,42]. To understand the irinotecan resistance mechanism various CRC cell lines resistant to SN-38 were selected. By studying the parental drug-sensitive cells to the drug-resistant cells by molecular biology and cellular methods, particularly acquired MDR-associated molecules have been studied [43]. In our study, we analyzed the expression pro le of three cell lines, HT29, HCT116, LoVo, and their SN38 resistant variants. We applied those cell lines as they present various genetic, epigenetic, and molecular subtypes, and to some extent, they may re ect the molecular heterogeneity of CRC [41]. Since it has been demonstrated that assessment of multiple biomarkers provides an accurate prediction of drug response than a single biomarker in our study we analyzed the main enzymes involved in irinotecan transport and processing and additionally main EMT players as the mesenchymal phenotype is associated with irinotecan resistance [22,39]. First, we concluded that SN-38 treatment changed the expression pro le of all tested cells lines irrespective of their molecular background and origin. The expression of prominent SN38/irinotecan transporters, i.e. ABCB1, ABCC1, and ABCC2 has signi cantly upregulated in the majority of SN-38 resistant variants and the expression pro le of ABC transporters was speci c for each cell line. This observation is in line with many other reports on in vitro studies concerning ABC transporters' function in cancer [44]. We also observed various changes in the expression pro le of the main EMT markers in each of the tested cell lines. EMT is the best-known case of tumor cell plasticity which appears to in uence sensitivity to various chemotherapeutic drugs and EMT is an important regulator of ABC transporters. It was demonstrated that the promoters of ABC transporters carry several binding sites for EMT-inducing transcription factors [45]. According to our previous data, the phenotype conversion induced by the Snail transcription factor decreased ABCG2 and increased the ABCC4 expression level in HT29 cells [40]. In the current study, we observed that LoVo SN-38 resistant variant which exhibited mesenchymal phenotype showed the upregulation of ABCG2 expression. The possible explanation for the inconsistency may be the fact that LoVo and HT29 present different genetic pro les and belong to various CRC molecular subtypes, LoVo to CMS1, HT29 -CMS3, or that LoVo SN-38 resistant variants acquired mesenchymal feature through SN-38 treatment. Our analysis of the main enzyme responsible for the irinotecan activation (CES1/2) and deactivation indicated the different expression pro les among the tested lines. The changes in their expression are relevant in the case of irinotecan administration, however, in the case of the analysis of CRC cell lines their expression pro les are less interesting since the majority of their activity related to irinotecan in the body is located in the liver [13]. In our tested CRC cell lines, they may process inactive irinotecan metabolite -NPC into active SN-38. The clustering analysis con rmed that all SN-38 resistant variants presented signi cantly different pro les compared to the respective parental cell lines, and among three tested CRC lines, LoVo showed the most evident changes in the mRNA pro le of tested genes under SN38 treatment. To verify our above observation we analyzed the expression pro le of the patients with advanced colon cancer sensitive or resistant to rst-line treatment of FOLFIRI. We observed that in the patients' samples no statistically signi cant changes were observed in mRNA expression levels of neither ABC transporters nor irinotecan processing enzymes between the sensitive and resistant patients' groups. We con rmed the upregulation of Lrg5 that is a marker of advanced CRC, in the resistant patients' group concomitantly the expression pro les of EMT players from this group demonstrated the epithelial phenotype. The epithelial phenotype may be related to the 5FU resistance [46] as patients were treated with 5FU within FOLFIRI scheme.
To select the new candidate genes for irinotecan resistance we have analyzed the top 250 differently expressed genes between resistant and sensitive patient groups using protein-protein interaction (PPI) network and then we selected ve up, such as NDUFA2, SDHD, LSM5, DCAF4, COX10, and down-regulated genes, RBM8A, TIMP1, QKI, TGOLN2, and PTGS2 respectively, with highest networking score. Those genes were not reported as irinotecan resistance-associated thus we check their utility as the prognostic markers. We showed using the KEGG database the involvement of selected DEGs in signi cant cellular pathways. We selected the biological processes that might be changed in response to chemotherapy treatment. The oxidative phosphorylation was the most upregulated pathway while the processes related to RNA processing (RNA spliceosome, surveillance, transport, and degradation) were the most downregulated. These ndings are consistent with the previous observation that cells to survive under chemotherapy treatment reduce transcription rate and increase energy demand. Several experiments proved that human CRC cells exhibit even higher rates of oxidative phosphorylation than large intestine normal cells [35]. The elevated production of ATP may be connected to the increase in drug e ux operated by ABC transporters and/or to the increased migration of advanced CRC cells. The irinotecan-resistant Non-Small Cell Lung Cancer (NSCLC) cells are characterized by increased oxidative phosphorylation and treatment with gossypol (pan-ALDH inhibitor) and phenformin (OXPHOS inhibitor) reverses irinotecan resistance in tested xenograft model of human NSCLC [33]. CRC patient 5-year survival analysis showed that the prognostic value of selected DEGs depends on cancer location, i.e. colon vs. rectum, and additionally, they may be favorable or unfavorable factors. These results are di cult to discuss, however, they are in line with current knowledge that in the case of CRC, actual tumor location is one of the most important prognostic factors [39].
To integrate the obtained data and facilitate the selection of cell lines as appropriate research models, we reversed the standard data ow axis from "cell lines to patients" into "patients derived data to cell lines". Although in vivo irinotecan resistance mechanisms are more complicated than in vitro, investigating such mechanisms in vitro can provide strategies to overcome acquired drug resistance in CRC [47]. Expression of the top networking up and downregulated genes (NDUFA2, SDHD, LSM5, DCAF4, COX10, RBM8A, TIMP1, QKI, TGOLN2, and PTGS2) was differently expressed in the majority of irinotecan resistant variants of CRC cell lines, and often presented the reversed tendency than observed in vivo. The obtained data demonstrated that in vivo irinotecan resistance mechanisms are more complicated than in vitro and suggest that not one, but many different mechanisms, signaling pathways, and other factors such as other drugs or tumor microenvironment, may act simultaneously or complementarily in the development of irinotecan resistance in CRC. We can assume that our results may need to be veri ed by a speci c irinotecan treatment patient cohort. In the case of CRC irinotecan is commonly applied to the patients in FOLFIRI scheme. Our current results may also imply why plenty of promising in vitro studies fail the clinical implementation and why the mechanisms leading to irinotecan resistance are still poorly characterized. However, cell lines still represent a mainstay to functionalize molecular data as they allow experimental manipulation, global and detailed mechanistic studies, and high-throughput applications.
In our study, we showed that by integration of the expression pro le data each CRC cell line is a resource to select relevant models for studies of irinotecan resistance mechanisms.

Conclusion
Irinotecan represents the backbone of CRC chemotherapy. Currently, after more than two decades of clinical usage, the mechanism of irinotecan resistance remains still the subject of investigation. Although in vivo irinotecan resistance mechanisms are more complicated than in vitro, investigating of latter can provide strategies to overcome acquired drug resistance in CRC. However, CRC cell lines are a limited model often presenting different mechanisms of utilizing drug resistance, thus usage of at least two different cells lines is essential. Importantly, identi cation of irinotecan processing enzymes as potential game-changing factors in acquired irinotecan resistance seems to be pointless, as the process of irinotecan activation in vivo is located mainly in the liver but not in the tumor -contrary to in vitro cell line-based experiments. The signi cance of ABC proteins seems to be less obvious, as patients' samples presented no signi cant changes between resistant and sensitive cohorts. FOLFIRI resistant patients' cohort present a tendency for activation of EMT, that enchases both drug resistance and migration, and this observation require further analysis. Nevertheless, our analysis provided several potential irinotecan resistance markers, previously not described as such.  Hierarchical clustering analysis of CRC cell lines using mRNA expression pro le of irinotecan resistance-related proteins. A bidirectional hierarchical clustering heat map was visualized using the data visualization platform Orange. The expression values between SN38-resistant variants and corresponding parental cell lines are presented a grayscale, black represents low and white -high expression values (6.1-17.23).

Figure 3
Protein-protein interaction (PPI) network of differentially expressed genes (DEGs). PPI network was created from 250 differently expressed genes obtained from the GSE62080 database using STRING version 11.0 online software, The PPI pairs were imported into Cytoscape software as described in the Materials and Methods section. The lines represent the interaction relationship between nodes. Figure 4 mRNA expression level in CRC cell lines of 5 top networking up (A) and down (B) regulated DEGs from FOLFIRI resistant patients cohort. Data were obtained using GOE database: GSE42387 (GPL16297 Agilent-014850 Whole Human Genome Microarray 4x44K G4112F platform). Normality test (Shapiro-Wilk`s) followed by Mann-Whitney U test (for not normally distributed data) or T-test (for normally distributed data). * p>0.05, ** p> 0.005, ***p>0.001 Figure 5 Hierarchical clustering analysis of CRC cell lines using mRNA expression of top networking DEGs from FOLFIRI resistant patients cohort. A bidirectional hierarchical clustering heat map was visualized using data visualization platform Orange. The expression values between corresponding irinotecan-resistant and parental CRC cell lines are presented a gray scale, black represents low and white high expression values (6. .02). To analyze the mutual changes between different DEGs, we decided to correlate mRNA expression pro les of previously described top 5 up and top 5 down-regulated, and the highest networking proteins and we performed a correlation matrix using Pearson correlation, Fig.6. LoVo (pooled parental and their respective SN-38 resistant variant), similarly to patient samples, presented a positive correlation (yet not statistically signi cant in all cases) of the majority of top 5 upregulated genes of the best networking proteins: NDUFA2, SDHD, LSM5, and DCAF4 (COX10 positively correlates only with NDUFA2 and DCAF4). A negative correlation, similarly to patient samples, was observed among RBM8A, TGOLN2, and PTGS2. In the case of HT29 and HTC116 cell lines data were more ambiguous, as NDUFA2, SDHD, LSM5, DCAF4, COX10, RBM8A, TIMP1, QKI, TGOLN2, and PTGS2 presents mixed correlation to one another.