New MicroRNAs Candidates to Treat Human Colorectal Cancer; Molecular Dynamic Simulations Found a Major Down-Regulator of CLCA4 Tumor Suppressor Gene


 IntroductionColorectal cancer (CRC) is one of the most common malignancies worldwide. The expression of CLCA4, a tumor suppressor gene, decreases significantly in cancer cells of CRC. In this study, we identified miRNAs target the mRNA of the CLCA4 gene. ObjectiveThe aim of this study was the identification of miRNAs involved in CRC.Material and methodsWe predicted miRNA(s) that target CLCA4 mRNA applying TargetScan v.7. Then through analysis of Gene Expression Omnibus (GEO) datasets, among them, miRNA(s) over-expressed in CRC cells were determined. To identify miRNAs with the highest potential to down-regulate CLCA4 through binding, we calculated the binding free energies of the candidate miRNA- mRNA complexes using the molecular mechanics energies combined with several solvation models: The Poisson–Boltzmann (MM/PBSA), the generalized Born (MM/GBSA), and the three-dimensional reference interaction site model with Kovalenko–Hirata closure relation (3D-RISM-KH). ResultsOur TargetScan analysis predicted that 106 miRNAs could bind to CLCA4 3' UTR mRNA. Hsa-miR-934, hsa-miR-574-5p, hsa-miR-377-3p, hsa-miR-5580-3p, hsa-miR-4775, hsa-miR-590-3p and hsa-miR-501-5p showed increased expression in CRC samples compared to normal cells. MD results found the lowest free energy changes in three hsa-miR-377-3p, hsa-miR-574-5p and hsa-miR-501-5p miRNAs. ConclusionThis research beside introducing a new fast and low cost plan to find best candidate of miRNAs to bind their targets, suggested miR-501-5p as a biomarker for early diagnosis of CRC. As well, preventing of down regulation of the CLCA4 expression through interrupting in the expression of miR-574-5p and miR-377-3p and more effectively miR-501-5p probably treat or slow down the development of colorectal cancer.


Introduction
Colorectal cancer (CRC) is the third most common cancer in the world and the third diagnosed cancer among men and women in the United States [1]. This tumor is usually diagnosed at the average age of 69 years, in which 60% are over the age of 65 and 36% are over 75 years [2]. A Mortality from this cancer has been reduced due to the knowledge of risk factors, early detection, and prevention. However, CRC is still one of the leading health problems all over the world.
CLCA4 signi cantly downregulated in colon and breast cancer cells. The protein encoded by CLCA4 belongs to the family of calcium-sensitive chloride conductance proteins expressed mainly in the colon and inhibits the proliferation of cancer cells, so plays a critical role in suppression of CRC [3,4]. The decreased expression of this protein could change intracellular and extracellular acidity. Due to the effects of extracellular pH on the function of the immune system, and the weakening of many cellular responses at lowered extracellular pH [5], it can be suggested that the decreased expression of CLCA4 prevents the immune system from killing cancer cells. Messenger RNA (mRNA) destabilization and translational silencing are among the most factors leading to the repression of protein production.
Up to third of the genes encoding proteins in the human genome regulate by miRNAs through binding to the 3'-UTR (untranslated region) of target mRNA. More than 50 percent of these genes are in the sensitive and vulnerable genomic cancer-related regions. miRNAs are short non-coding RNAs, approximately 19-22 nucleotides long. They regulate gene expression at the post-transcriptional level [6,7]. Genes regulated by miRNAs controls important biological processes including response to stress, cell proliferation, apoptosis, cell division, differentiation, and cancer [8]. Some miRNAs are oncomirs while others are tumor suppressors. The overexpression of the former and underexpression of the latter leads to cancer development [9].
The role of miRNAs in tumor suppressing and oncogenesis of colorectal cancer has been studied and proven. For instance, up-regulation of miR-378 suppress proliferation, migration and invasion of colon cancer cells by inhibiting SDAD1 [10] and inversely down-regulation of miR-21 decreased tumor grow through reduction in β-catenin, SOX9, expression of some proin ammatory and procarcinogenic cytokines [11]. Therefore, miRNAs scanning (especially circulating ones) are recommended as highly sensitive and speci c noninvasive test to diagnose and control cancers including colorectal cancer [12].
So nding appropriate miRNAs associating in carcinogenicity is a crucial step not only to early diagnosis of cancer but also to treat that through blocking or overexpression related miRNA.
Given the importance of the CLCA4 gene in CRC and the serious role of miRNAs in gene expression regulation, in this study we focused on identifying miRNAs that could be the cause of decreased CLCA4 expression in CRC patients. Identifying such miRNAs along with other identi ed miRNAs will be an important step to design a treatment for CRC. Increasing the number of miRNAs as well as nding the best ones can decrease false negatives diagnosis. For nding these relevant miRNAs, molecular dynamics (MD) simulations of the miRNA-mRNA (binary) complexes were performed. The purpose of these extensive simulations is to better understand the miRNA molecular recognition mechanism by providing atomic-level details that are unreachable in experiments because of resolution restrictions. Moreover, since the stability of miRNA-mRNA duplex structure can be predicted by calculating thermodynamic parameters such as binding a nities of nucleic acids strands [13] thermodynamic analyses were also done to profoundly investigate the interactions between miRNA and its target applying MM/PBSA (molecular me chanics Poisson-Boltzmann surface area), MM/GBSA (Molecular Mechanics/Generalized Born Surface Area) and 3DRISM-KH (three-dimensional reference interaction site model with the Kovalenko-Hirata).

Material And Methods:
2.1. Finding target for miRNAs Target sequences of miRNA in this study were predicted using TargetScan v.7, 1 web server [14]. The TargetScan is one of the most reliable tools for predicting biological targets of miRNAs. This tool searches the existence of conserved 8mer, 7mer, and 6mer sites that match the seed region of each miRNA using an extensive set of 14 robust features related to miRNA, mRNA and site [14].

Analysis of Gene Expression Omnibus (GEO) datasets
Although a reduction in the level of CLCA4 gene expression in CRC has been proven [15], we downloaded GEO databases to con rm this decline. We studied ve GEO datasets of GSE21510 [16], GSE25071 [17], GSE15781 [18], GSE32323 [19] and GSE8671 [20] related to the mRNA and protein expression of genes in CRC and control pro les by GEO2R analyzer. Also, in order to investigate whether the miRNAs detected by the Targetscan program were upregulated or downregulated in CRC, at rst, we carried out library studies, then downloaded and analyzed the miRNAs expression data of four GEO datasets of GSE35602 [21], GSE35982 [22], GSE39833 [23] and GSE77380 [24] in CRC.

Detecting miRNAs targeting CLCA4 mRNA gene
To detect miRNA targets, a computational prediction is one of the most e cient methods [25]. To predict the miRNAs that have conserved sites on CLCA4, we used TargetScan v.7.1 [14]. The TargetScan is one of the most robust and reliable tools for predicting biological targets of miRNAs [14]. While the basic features of the miRNA-mRNA interactions, such as seed match, site accessibility, free energy, and conservation, are considered in all miRNA target prediction tools [26], TargetScan implements an extensive set of 14 features related to miRNA, mRNA, and site [14]. It also deliberates the degree of repression and the cooperative miRNA function [14]. According to the identi er of CLCA4 corresponds, we used the most prevalent transcript (ENST00000370563.3) with 409nt, 3' UTR length for CLCA4 (ENSG00000016602.8) (Fig. 1).

Molecular models of miRNA-mRNA complexes
Because crystallographic structures of miRNA-mRNA complexes are unavailable, we constructed their 3D models. First we created unimolecular complexes by joining the strands of miRNAs and their mRNA targets by a GAAA tetramer loop (Fig. 2).
Then we used these single-stranded RNA sequences in SimRNAWeb [27], an RNA modeling tool, to generate 3D structure models. Afterward, the GAAA tetraloop was removed to disperse the two strands (Fig. 2).

MD simulations
All molecular dynamics simulations were carried out using the Amber18 [28] package with the ff99bsc0_chiOL3 force eld [29]. To neutralize the system, sodium counterions were used. After neutralization Na + and Cl − ions were added to mimic 0.15M salt solution. All model systems of miRNA-mRNA complexes were solvated in a box of TIP3P water molecules with a spacing between the edge of the box and the RNA molecules of 15 Å°. Particle Mesh Ewald (PME) was engaged to treat electrostatic interactions with the default settings. The steepest descent energy minimization with 500 steps was carried out to relax any structural bad contacts in the solvated systems. It was followed by 50ps of heating to 300K in the NVT ensemble with restrained RNA heavy atoms (constraint force constant = 2 kcal/(mol* Å 2 ) ), 50 ps of density equilibration in NPT ensemble at 300K with the same constraints, and 10 ns equilibration without restraints in the NPT ensemble at 300K with PMEMD program from Amber18.
All simulations were performed with a 2 fs time step, Langevin thermostat and SHAKE algorithm to constrain bonds with hydrogen atoms. Each of the eight solvated systems was simulated for 200 ns.
We used the CPPTRAJ module of Amber18 [30] to analyze MD trajectory les. Hydrogen bonds were identi ed with a distance cutoff of 3.5 Å and an angle cutoff of 135°.

Binding free energy calculation
Hybridization of miRNA with mRNA makes miRNA-mRNA complex, accordingly the free energy difference between complex and two dissociated RNA strands can be calculated as: Where the angular brackets denote average over all MD trajectory frames.
The binding free energy is composed of the gas-phase MM energetic contribution ΔE MM, , entropic contribution, and the solvation free energy: ΔG bind = < ΔE MM > + < ΔG sol > − T < ΔS MM > Molecular mechanics and solvation contributions can be further decomposed into the following terms: Here E MM (the total gas phase energy) which is composed of E int (internal energy of bonded interactions), E ele (electrostatic energy) and E vdW (van der Waals energy) between the strands of miRNA-mRNA complex was acquired using Amber18.
The calculation of ΔGsol for all miRNA-mRNA complexes was carried out by three methods (MM/GBSA, MM/PBSA and MM/3D-RISM-KH) on 200 ns MD trajectories using single-trajectory protocol.
The solvation free energy ΔG sol is comprised of ΔG pol (polar) and ΔG np (nonpolar) which were calculated using methods of Poisson-Boltzmann and generalized Born continuum electrostatics molecular theory of solvation as implemented in Amber18. In 3D-RISM-KH method, thermodynamic calculations, electronic properties and molecular solvent structure of a soluble molecule in a solvent with medium computational cost are possible and the solvent and soluble chemical properties such as hydrogen bonds and hydrophobic forces are provided by a three-dimensional site of solvent density distribution [31]. We have not decomposed solvation free energy into polar and nonpolar terms in calculations using 3D-RISM-KH method due to high computational cost.
Entropic contribution -T < ΔS MM > composed of the vibrational, rotational and translational components was calculated by normal mode analysis (nmode module of the Amber18). We used several solvation models to assess robustness of free energy calculations.
All simulations were carried out on Compute Canada (www.computecanada.ca) high-performance clusters.

Down-regulation of CLCA4 in CRC
Although downregulation of CLCA4 in CRC was proved in previous studies, our analysis on the data of CLCA4 mRNA expression in ve GEO datasets (GSE21510, GSE25071, GSE15781, GSE32323 and GSE8671), veri ed the downregulation. According to the analysis of this data, we respectively found an 80-fold, 180-fold, 19-fold, 22-fold and 38-fold downregulation of CLCA4 mRNA expression in CRC tissues in comparison with normal tissues (Table 1). 3.2. Prediction of miRNAs targeting CLCA4.
Applying TargetScan v.7,1 to predict miRNAs which could bind to CLCA4 3' UTR [14], we identi ed 106 miRNAs ( Table 2). 3.4. Analysis of GEO datasets for identifying upregulated miRNAs 106 miRNAs were investigated for the extent of their expression in accordance with experimental data of the gene expression of GEO. So, we analyzed the downloaded GEO datasets of GSE35602 [21], GSE35982 [22], GSE39833 [23] and GSE77380 [24]. The results showed that hsa-miR-501-5p, which is among the miRNAs with 8mer site type seeds (potentially are expected to be the most forceful [38] and has the high value of repression [38,39] ) is more highly expressed in CRC than normal (Table 3). This miRNA could bind to the two sequences of CLCA4 mRNA; so, CLCA4 mRNA with two binding sites for this miRNA could have more chance to be adhered. According to analysis of GEO datasets of CRC samples and library studies of experimental results, eight miRNAs (hsa-miR-934, hsa-miR-574-5p, hsa-miR-377-3p, hsa-miR-5580-3p, hsa-miR-4775, hsa-miR-590-3p, hsa-miR-501-5p with mRNA target position of 3' UTR 69-76 and hsa-miR-501-5p with mRNA target position of 3' UTR 144-151) were chosen to be analyzed with MD simulations. There are ve site types in total: 8mer, 7mer-m8, 7mer-A1, 6mer, and 6mer offset (listed in the order from the strongest binding to the weakest). All 106 miRNAs with the ability to bind to the mRNA sequence of the CLCA4 gene predicted by TargetScan are binding to 8mer, 7mer-m8, or 7mer-A1 site types. Thus miRNAs selected for MD simulations belong to these three groups as shown in Fig. 3. To compare simulated structures with the initial structure of the duplex, we plotted the root-mean-square deviation (RMSD) of RNA backbone atoms C3', C4', C5', O5' and P as a function of time (Fig. 4). The results show that all complexes except miR-590 reached equilibrium state during the rst 10 ns of simulation. RMSD of miR-590 exhibited a modest increase at about 120 ns of simulation which persisted up to 200 ns. A longer simulation is required to determine whether it is a temporary uctuation or persistent conformational change. The largest uctuations were observed in miR-501(144) complex. Also to study the differences of structure, we drew snapshots during simulation time of 50, 100, 150, and 200 ns (Fig. 5).

Lifetime of hydrogen bonds
Reports on the lifetime of hydrogen bonds were extracted and summarized in the table below. These results indicate that hsa-miR-501-5p, hsa-miR-377-3p, and hsa-miR-574-5p complexes have 65 (two sites), 37 and 34 hydrogen bond respectively (27) in the seed region. These miRNAs not only have more cases of Hbonds in the seed region but also these bonds are more stable and lasting for the most time of simulation. As well miR-501(144) and miR-574 showed longer HB lifetime in comparison to others for the most time of simulation. Inversely miR-5580 complex presented both the least number of HB and HB life time. All data has been presented in Table 4.

Binding Free energy calculations
The data from the calculation of binding free energy of all miRNA-mRNA complexes by MM / GBSA) demonstrate that hsa-miR-377-3p, hsa-miR-574-5p and hsa-miR-501-5p, respectively, have the lowest ΔE total and ΔGbind in the 8 studied complexes with more than − 100 and − 76 Kcal / mol respectively (Fig. 6). Also, the results of data analysis of free energy and their physical components by the three studied methods are summarized in Table 5. The data in the table shows that -T < ΔS MM ,>, the average contribution of entropy to the binding free energy obtained by normal mode analysis is the highest in hsa-miR-574-5p, hsa-miR-501-5p and hsa-miR-377-3p. The most negative ΔE total calculated in all three methods are related to the hsa-miR-574-5p, hsa-miR-501-5p and hsa-miR-377-3p complexes. Also free binding energy for these three miRNAs are − 81.1, -80.5 and − 76.8.
Although there is a little difference in the results of ΔG bind in the order of the most negative with three methods, miRNAs of 377, 574, 501(144), 934, 4775, 5580, 501(69), overall considering entropy contribution binding energies of most complexes are similar and 501(69) seems to be slightly better than others and these miRNAs seem to be almost equally good candidates to bind with their CLCA4 mRNA target and miR-590 cannot be considered in this case.

Discussion
CRC is a major health problem and the third cause of death due to cancer. Altered expression of some genes, including tumor-suppressors or oncogenes, are the most important factors in tumorigenicity and cancer development [41]. CLCA4 is one of the tumor suppressor genes and its encoded protein is a calcium-dependent chloride channel. Because of its importance in regulation of cell acidity [42,43] and immune system [4,44], we checked CLCA4 expression in cancerous cells. Using ve GEO datasets, our results showed down expression of CLCA4 mRNA in CRC tissues from 19 to 80 fold in comparison with normal tissues. These data is in accordance to previous data which reported down-regulation CLCA4 expression in CRC [15] MiRNAs through binding to the target mRNA act as an epigenetic mechanism of gene regulation and are one of effective factors at gene down-expression [45,46], so the decrease of expression level of CLCA4 in CRC prompted us to investigate miRNAs could target CLCA4 and determine the mechanism of their action. The role of some miRNAs has been reported in developing and repressing colorectal cancer. In this study, we considered identifying miRNAs which could repress CLCA4 expression. Applying TargetScan v.7, it was predicted that 106 miRNAs could bind to CLCA4 3' UTR MR. miRNAs bind to their mRNA target through 5 types of miRNA binding sites, including 8mer site (the highest a nity), 7mer-m8 (matched at position 8), 7mer-A1 site (adenosine at position 1), 6mer site and offset 6mer site (the least a nity). All determined miRNAs were binding to 8mer, 7mer m8, or 7mer-A1 site types. The 8mer site type (a precise match to positions 2-8 of the mature miRNA, followed by an 'A') of miRNA binding site, has more effects on gene repression and advises the robust suppression [38], probably because of the increased binding site complementarity [39].
Of this 106 recognized miRNAs, laboratory studies and our analysis through GEO datasets of CRC samples de ned seven miRNAs (with eight sites) are highly expressed, so were selected to be analyzed with MD simulations. These miRNAs included hsa-miR-934, hsa-miR-574-5p, hsa-miR-377-3p, hsa-miR-5580-3p, hsa-miR-4775, hsa-miR-590-3p, hsa-miR-501-5p with mRNA target position of 3' UTR 69-76 and hsa-miR-501-5p with mRNA target position of 3' UTR 144-151. Snapshot structures obtained from the MD trajectory of system was used to calculate the average interaction energies of the miRNA and mRNA [47]. Results showed that three hsa-miR-377-3p, hsa-miR-574-5p and hsa-miR-501-5p miRNA, respectively, have the lowest entropy and free energy changes which is in line with the process of changes of in observed hydrogen bond lifetime. These features along with the higher number of H bonds in the seed area, has made these tree complexes more stable and lasting. The lower free energy represents higher binding a nity that are more effective in gene suppression [48]. Therefore, hsa-miR-377-3p, hsa-miR-574-5p and hsa-miR-501-5p have major role in CLCA4 down-regulation and causing CRC. Considering these nding, these three miRNAs are recommended as potential targets to treat CRC. Inhibiting their binding to CLCA4 mRNA can prevent its down-regulation. Of this miRNAs, hsa-miR-501-5p bind to two sequences of CLCA4 mRNA with the highest number of H bonding. Also, as it was mentioned hsa-miR-501-5p is among the miRNAs with 8mer site type seeds which is the potentially most forceful and high value of repression miRNA [38,39]. Therefore, CLCA4 mRNA with two binding sites for this miRNA have more chance to make hsa-miR-501-5p-CLCA4 mRNA complex. As a miRNA with high ability to bind to CLCA4 mRNA and reduce its expression, which up-regulates in CRC, miR-501-5p can be a biomarker for early detection of this cancer. As well, interrupting in conforming of hsa-miR-501-5p-CLCA4 mRNA complex may be an effective way to prevent or slow-down development of CRC. These ndings shed light on future experimental studies.
In conclusion, with the help of bioinformatics algorithms and prediction tools, we identi ed miRNAs which could target CLCA4 and found that their expression was upregulated in CRC tissues (according to library studies and by analyzing the GEO database). By studying and analyzing the structural condition and thermodynamic data of these eight miRNAs using molecular dynamics simulations, we have shown that seven miRNAs including miR-4775, miR-934, miR-5580 and mostly miR-574-5p, miR-377-3p and specially miR-501-5p with two binding positions can play the oncomir role in regulating of CLCA4 gene expression in CRC. Although more research is needed to understand whether downregulation of these miRNAs led to increasing CLCA4 gene expression, elevating intracellular pH and decreasing extracellular pH, these miRNAs could be potential candidates for colorectal cancer diagnosis and therapy.

Declarations
The sequences of 409 nt of 3' UTR of CLCA4 and highly upregulated miRNAs in CRC tissues in comparison with normal tissues which could be bound to.

Figure 2
Schematic representation of introducing a GAAA tetraloop between miRNA and mRNA strands. Root-mean-square deviation (RMSD) of miRNAs-mRNA complex backbone atoms with respect to the initial conformation as a function of time. RMSDs of miRNAs-mRNA complexes in three binding site groups (8mer, 7mer-A1, and 7mer-m8) are shown in panels A, B, and C respectively. Time starts from the beginning of production run after 10 ns long-equilibration.