Use of MethHC and SurvExpress database in the course of molecular biology for medical students: Understanding of epigenetic modifications in gene expression with problem-based learning

Background: Current teaching methods help students understand clinical conditions, pathologies and hereditary diseases, mainly through problem-based learning (PBL), simulation-based learning (SBL), image analysis, case studies, and the use of electronic platforms and mobile applications. However, the use of databases and PBL for the comprehension of molecular biology applied to research has not been well investigated. The scientific literature has reported that combinations of pedagogical strategies give positive results compared with the traditional reading-based method. Therefore, this study aimed to design a pedagogic method that uses databases applied in scientific research to understand the effect of methylation on gene expression and its relationship with the appearance of diseases such as cancer. Methods: Three databases were used: Genetics Home Reference (GHR) for the theoretical understanding of the study gene, MethHC for the analysis of the effect of methylation on gene expression, and SurvExpress for the validation of gene expression biomarkers in cancer. All databases draw their information from The Cancer Genome Atlas (TCGA) program of the National Cancer Institute (NCI). The guidelines of each platform for the analysis of the information were followed. The method was exemplified with the analysis of the POT1 gene, which is involved in the maintenance of telomere length and has been associated with breast cancer. Results: The function of the POT1 gene, its chromosomal and molecular position, the number of exons and its transcriptional variants in invasive breast carcinoma were described. The expression of POT1 was found to depend on where methylation occurs within the structures that comprise the gene in healthy and tumor tissue. In addition, a survival analysis showed that the gene is more highly expressed in the low-risk group than in the high-risk group.
Conclusion: The use of databases allows students to take their theoretical knowledge into clinical practice and research. The results obtained from GHR, MethHC and SurvExpress allow molecular biology to be applied in clinical research and promote problem-based learning, developing analysis and interpretation skills in students. This pedagogical method allows the understanding of epigenetic effects on the expression of cancer-related genes.

Background
Medical training is characterized by constant reading as a method of study, and most students find biochemistry and molecular biology difficult to understand because of the amount of information in topics that involve metabolic pathways and molecular mechanisms, especially during the first years [1]. Theoretical reading provides valuable information and basic knowledge; however, this study habit limits interaction among students and between teacher and student [2].
Students of molecular biology often find it difficult to differentiate biological mechanisms such as DNA replication, transcription and translation through mental categorization [3], or to understand a dynamic sequence of events through the physical organization of the molecules in an image, model or mental scheme. Comprehending the molecular mechanisms and interactions in their baseline state helps the student to compare, establish and elucidate the biological consequences of possible alterations [1,3,4]. An important topic in the molecular biology course is "gene silencing by DNA methylation", an epigenetic mechanism that can control gene expression without altering the DNA sequence and that is categorized as a risk biomarker for cancer development [5][6][7]. A dynamic way for students to understand the effect of an epigenetic mechanism on the expression of a gene, and its biological consequences, is to include bioinformatics and databases as pedagogical tools [8]. These tools contribute to the development of three fundamental skills: 1) the process of science, 2) communication and understanding of science and 3) practical aspects in the science community [9]. Databases allow students to apply theoretical knowledge to the interpretation of data in their laboratory practices and in the development of research projects [8][9][10].
In this context, MethHC (http://MethHC.mbc.nctu.edu.tw) is a database that integrates gene expression, methylation and microRNA expression data from The Pan-Cancer Project. It is useful for studying the effect of methylation on the expression level of a gene, including mRNA and microRNA expression, through statistical correlations associated with various types of cancer [7,11].
Similarly, SurvExpress (http://bioinformatica.mty.itesm.mx/SurvExpress) is a bioinformatics tool for the validation of multigene biomarkers related to cancer through gene expression analysis and survival plots [12,13]. MethHC and SurvExpress draw their data from The Cancer Genome Atlas (TCGA) program of the National Cancer Institute (NCI) (https://www.cancer.gov/).
With the use of bioinformatics tools, the student may develop essential research skills, including the interpretation of molecular processes, statistics, biochemistry and molecular biology applied to the clinic [14]. In addition, these tools strengthen the ability to produce and/or interpret graphs, considered a basic skill for students [15], generate interest in the field of biology and its involvement in clinical research, and promote enthusiasm for learning science [16]. This study aimed to design a pedagogic method that uses databases applied in scientific research to understand the effect of methylation on gene expression and its relationship with the appearance of diseases such as cancer.

Design and organization
Students should already have studied in class the concepts of gene structure, DNA methylation, gene silencing by histone-DNA modification, epigenetics and genomic imprinting, within molecular biology and/or cellular and molecular biology, in order to understand epigenetic modifications and gene expression. They should also have basic knowledge of computer science and statistics. The activity can be carried out in teams or individually, as homework, as a laboratory practice and/or as a classroom demonstration.

Information search
Students search for information on one or more genes of interest, chosen by them or proposed by the teacher, on the Genetics Home Reference (GHR) page of the US National Library of Medicine, National Institutes of Health (NIH) (https://ghr.nlm.nih.gov/gene). The aim is to understand the function of the gene's normal expression, its chromosomal location and its synonyms in humans. The page also describes genetic changes and their health implications. The search follows these steps:

Gene > Search > Select gene and analyze the information.
To exemplify this procedure, we analyzed the function of the POT1 gene, which is associated with breast cancer, encodes a nuclear protein involved in the maintenance of telomere length, and shows transcriptional expression associated with carcinogenesis [11,17-22]. In this example, the expression of POT1 was analyzed specifically in invasive breast carcinoma. For this method to work, information on the chosen gene(s) must be available in both the MethHC and SurvExpress databases.

Methylation analysis: Use of MethHC
Using this database, the student reinforces knowledge about methylation processes in CpG islands, gene structure, gene expression and biostatistics. The student enters the MethHC database through the link http://MethHC.mbc.nctu.edu.tw to evaluate the gene expression and the methylation level of the example gene, POT1, by the following procedure:

Option Genes Search > Select cancers > Breast Invasive Carcinoma > Select a gene region (Promoter and Gene body) > Methylation level evaluation method > load interest gene in the box (POT1) > Search.
With these steps students will be able to analyze how POT1 gene expression behaves at different methylation levels. In addition, they can evaluate the relationship between expression and methylation in different parts of the gene, such as the promoter region, first exon, gene body and CpG islands, among others, comparing healthy and tumor tissue. The results of the analysis are reported as statistical data: the tissue groups are compared with the t-test, and the association between methylation levels and gene expression is measured with Pearson's correlation coefficient, with probability values (p) used to assess statistical significance.
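To make the two statistics named above concrete for students, the following minimal sketch computes a Pearson correlation coefficient and a t statistic by hand. It is an illustration under stated assumptions, not MethHC's implementation: all numbers are hypothetical, the function names are ours, and the Welch (unequal-variance) form of the t-test is assumed because the text does not say which variant the database applies. The p-values are not computed here.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical values for illustration only (not MethHC output).
promoter_methylation = [0.10, 0.20, 0.30, 0.40, 0.50]  # methylation fraction
expression = [9.0, 8.1, 7.2, 5.9, 5.0]                 # expression level
normal_methylation = [0.10, 0.12, 0.15, 0.11, 0.14]
tumor_methylation = [0.30, 0.35, 0.28, 0.40, 0.33]

r = pearson_r(promoter_methylation, expression)
t = welch_t(tumor_methylation, normal_methylation)
print(f"Pearson r = {r:.3f}")  # negative r: more methylation, less expression
print(f"Welch t = {t:.2f}")    # positive t: tumor mean above normal mean
```

A negative r in such a plot corresponds to the silencing pattern expected for promoter methylation, while a positive r would correspond to the pattern described for the gene body.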

Genetic expression analysis: Use of SurvExpress
With the SurvExpress online tool, students can develop the ability to validate gene expression biomarkers in cancer, reinforcing knowledge in the clinical, genetic and biostatistical areas. Through the link http://bioinformatica.mty.itesm.mx/SurvExpress students can enter the database to analyze, validate and compare the expression of a gene or group of genes as possible biomarkers, assessing their prognostic performance as a gene expression signature through a survival and risk analysis, following the guidelines of the platform. With the results obtained, students will be able to evaluate and study the behavior of gene expression and its possible use in research as a genetic biomarker for the diagnosis of cancer, according to the evidence recorded in the SurvExpress database and supported by the TCGA.
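The survival curves that students read in this step can be understood through the Kaplan-Meier estimator. The sketch below is a minimal teaching example, assuming hypothetical follow-up times and event indicators for two risk groups; it does not reproduce SurvExpress code, and how the tool assigns patients to risk groups is outside its scope.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Events observed exactly at time t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        # Everyone whose follow-up ends at t (event or censored).
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data in months, for illustration only.
low_risk = kaplan_meier([12, 24, 36, 48, 60], [0, 1, 0, 0, 1])
high_risk = kaplan_meier([6, 12, 18, 24, 30], [1, 1, 0, 1, 1])

for label, curve in (("low risk", low_risk), ("high risk", high_risk)):
    print(label, [(t, round(s, 2)) for t, s in curve])
```

Censored patients reduce the number at risk without lowering the curve, which is why a step appears only at times where an event was observed.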

Results
The information search was carried out on the GHR page of the NIH, analyzing the main characteristics of the POT1 gene such as its function, chromosomal location and molecular location. A summary of the information is presented in Table 1, which students can reproduce in their reports, presentations and assignments. In addition, through the GHR page the student can deepen their knowledge by following the links on the portal; the information on GHR is extracted automatically from, and supported by, online scientific databases that include NCBI Gene and UniProt. GHR was updated on August 20, 2019.
Using the correlation graphs, the student assesses the level of methylation and its effect on the expression of the POT1 gene. Figure 2a shows the percentage of methylation in the promoter region of the POT1 gene in samples of healthy and tumor tissue, with a tendency towards decreased POT1 expression in invasive breast carcinoma. Figure 2b shows that methylation in the promoter region is associated with a decrease in the expression of POT1. In contrast, the body of the POT1 gene shows higher levels of methylation, together with increased expression in invasive breast carcinoma (Fig. 2c and 2d).

Gene expression by survival analysis
The SurvExpress result for the POT1 gene in invasive breast cancer is examined through a survival plot that evaluates the effect of POT1 expression in invasive breast carcinoma, as reported in previous research. Figure 3a shows ... Curves p = 0.117 (Fig. 3c). The Cox model reported n = 502 with 65 events, exp(coef) = 0.607 (p = 0.038), and a concordance of 0.557 (Fig. 3d).
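Students often need help reading the Cox model output. The short sketch below shows how the reported exp(coef) = 0.607 relates to the underlying coefficient and to a relative change in hazard. It assumes that the covariate is POT1 expression and that exp(coef) is the hazard ratio per unit increase of that covariate; the variable names are ours.

```python
import math

hazard_ratio = 0.607  # exp(coef) reported for the Cox model in the text
concordance = 0.557   # concordance reported in the text

# The Cox coefficient is the natural logarithm of the hazard ratio.
coef = math.log(hazard_ratio)

# A hazard ratio below 1 means a lower hazard per unit increase
# of the covariate (assumed here to be POT1 expression).
relative_reduction = 1 - hazard_ratio

print(f"coef = {coef:.3f}")
print(f"relative hazard reduction = {relative_reduction:.1%}")
print(f"concordance above chance (0.5): {concordance > 0.5}")
```

Under these assumptions, a hazard ratio below 1 is consistent with higher POT1 expression in the low-risk group, while a concordance close to 0.5 indicates only modest discrimination between patients.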

Discussion
In our study, we present a teaching method for medical students that offers a dynamic way to understand how gene expression is regulated by epigenetic modifications such as methylation, with a practical approach directed at research in human health and based on databases of great scientific value. We believe that, with a solid base in molecular biology, the student can begin to apply their knowledge through the use of databases. The student should use the vast resources of the genomic revolution and the computational speeds that are currently available, as described by Bell [23].
Genetics Home Reference was developed by Mitchell [24] to provide a bridge between the clinical questions of the public and the wealth of data emerging from the Human Genome Project, making the information accessible to the public.
In line with the GHR approach, we take it as the basis of theoretical knowledge for the analysis of the normal function, location, size and synonyms of a single gene or group of genes, as it is a dynamic and credible website that explains the effects of genetic variation on human health in lay language [25], which medical students will be able to understand quickly. We firmly believe that reinforcing the theoretical area and the emerging concepts of complex processes will help establish clear communication in the language of the molecular life sciences [26]. Genetics Home Reference could be proactive in providing education on precision medicine for the public while this new approach to health progress is implemented [27]. Erdmann and Stains [28] describe that the use of genomic and bioinformatics tools can improve the theoretical explanation in the classroom, since data by themselves have only limited value: they merely express the value of a variable, taken as a feature and not as a mistake. As students deepen their understanding of the functioning of the human body, the comprehension of molecular biology must move towards systematic approaches that promise to transform the understanding of the mechanisms of biological systems into the practice of research. Data integration is therefore not the end, but the beginning of new knowledge, hypotheses and feedback [29]. Bioinformatics tools and databases enable medical students to think critically and to generate hypotheses about how the state of DNA methylation correlates with cancer, how the level of methylation is inversely associated with mRNA expression levels [30][31][32], and how it relates to changes in telomere length [33][34][35].
The MethHC database is a validated tool that has supported the selection of candidate genes based on their level of gene expression in studies of multiple genes and, as in our example, of the POT1 gene [11, 17-21, 36, 37]. In our study, we used MethHC so that students can evaluate gene methylation, taking advantage of its ability to store information on 18 types of cancer in more than 600 samples, 6548 microarrays and 12 567 RNA sequencing data, and to show group differences and correlations in simple graphs, which develops the students' ability to interpret graphs [15]. On the other hand, the SurvExpress online bioinformatics tool for multigene biomarker validation presents its results in tables, survival plots and boxplots to validate the expression of candidate genes as genetic markers in different types of cancer. This tool has been used in research, and various reports support the validity of its results [12,38-41].
Our study proposes a practical teaching method in which students put into practice their theoretical knowledge of molecular biology together with computer science, research, databases, graph interpretation and statistical tests. White [9] describes the skills that molecular biology and biochemistry students must master: the scientific method, communication and understanding of science, and practical aspects in the science community. These come in addition to the essential concepts and underlying theories of physics, chemistry and mathematics, fundamental skills that undergraduate and molecular biology students must understand to complete their courses [42]. However, for this activity to be carried out successfully, the teacher must first have explained the basic concepts of molecular biology and resolved the doubts raised in the classroom, with the help of image analysis as a form of dialogue in class [43].
Previous research has reported that students prefer to learn by performing image analysis rather than only by traditional reading [44]. In this sense, pedagogical practices that develop self-efficacy and a sense of belonging help to reinforce the climate of inclusion in the classroom by improving the skills of the students [45].

Conclusion
In conclusion, we believe that this methodology should be used as a practical complement in the classroom. It will foster student-teacher interaction, creating a harmonious environment in which students are able to raise questions that resolve their doubts about molecular biology in the life sciences.

Availability of data and materials
The datasets generated and/or analyzed during the current study are available in the