Deep reinforcement learning for diagnosing various types of cancer by TP53 mutation patterns

One of the key challenges in classifying multiple cancer types is the complexity of Tumor Protein p53 (TP53) mutation patterns and their individual effects on tumors. However, far too little attention has been paid to deep reinforcement learning (DRL) on TP53 mutation patterns, because its results are extremely difficult to interpret. We introduce a critic network built on a long short-term memory (LSTM), which is suited to discriminating the noise samples from a Feedback Generative Adversarial Network (FGAN) and to analyzing the actor network. The correlation and analysis of the results in a belief network demonstrate significant relations between mutations and disease risk in cancer subtype identification. In other words, the results indicate statistically significant differences between the primary and secondary subtype groups of the most probable tumor.


Introduction
Recent developments in cancer therapy 1 have heightened the need for cancer subtype identification 2 . Data on p53 mutations are an increasingly important area for therapeutic targets 1 . Despite a large amount of research on measuring mutations 3 , very few studies have investigated the probability of mutational processes 4 with Deep Learning (DL) 5 . This paper focuses on diagnosing the most probable tumor from p53 mutations in human cancer 6 . The aim of this study is to identify the tumor subtypes of the most likely cancer. There is an interesting, statistically significant difference among the subtypes of breast cancer, which is the most likely cancer and has a significant mutational signature with variant effects on disease risk 7 .
This research focuses on designing and training a DL framework with the NeuroEvolution of Augmenting Topologies (NEAT) algorithm, which is one of the more practical ways of finding optimal architectures in DL 8 . Here, it is necessary to clarify exactly what is meant by NEAT. This research uses the metaheuristic of an Adaptive Genetic Algorithm (AGA), which operates on two levels 8 , as shown in figure 4. The upper-level problem generates candidate mutations. The lower level generates candidate exons and codons, with their probabilities, for the preferable mutations generated at the upper level. In our opinion, this approach is useful for deep genomics problems 9 . According to recent studies, the Markov Decision Process (MDP) is the most striking issue for mapping states in Reinforcement Learning (RL) 10 .
A DRL network involves a number of topics; broadly speaking, the three factors are behavior, decision, and feedback 11 . We found that the single most striking observation about behavior is an attention-based decision together with the evaluative feedback 12 . It should be noted that a creative DRL framework, as shown in figure 3, can be a combination of DL frameworks 13 ; for instance, the Feedback Generative Adversarial Network (FGAN) 14 that we present in this article.
The DRL framework is constantly being improved by neuromorphic approaches 8 , in which the prefrontal cortex (PFC) is modeled using the Long Short-Term Memory (LSTM) 15 . Although the critic network in meta-Reinforcement Learning (RL) is an LSTM, here the critic network is a bidirectional array-based LSTM 16 , which plays the discriminator role in an FGAN 17 . In this study, we experimented with an FGAN with an analyzer to improve active learning in the GAN 14 . Considering the open-ended algorithms 8 , some of the new samples from the generator, namely those that satisfy the threshold value condition, are fed specifically to the LSTM as its newest training samples 14 . In this model, the training phase is dropout-based and has low polynomial complexity. The main conclusion to be drawn from this discussion is that training the LSTM as a discriminator is the best approach when the generator is unable to discover any patterns.
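The threshold condition on generated samples can be illustrated with a minimal sketch. The paper does not state the threshold value, the scoring function, or the sample encoding, so the function name `select_feedback_samples`, the threshold of 0.8, the scores, and the codon-change strings below are all hypothetical and serve only to show the filtering step.

```python
def select_feedback_samples(samples, scores, threshold=0.8):
    """Keep the generated samples whose score meets the threshold.

    The threshold of 0.8 is an assumed value, not one taken from the paper.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    return [s for s, score in zip(samples, scores) if score >= threshold]


# Hypothetical generated samples (codon changes) and hypothetical analyzer scores.
generated = ["CGT>CAT", "GGC>GAC", "TGC>TAC"]
analyzer_scores = [0.91, 0.42, 0.80]

# Samples that pass the threshold are admitted as new training samples
# for the LSTM discriminator; the list below stands in for its training set.
admitted = select_feedback_samples(generated, analyzer_scores)
discriminator_training_set = []
discriminator_training_set.extend(admitted)
```

In this sketch, samples that fail the condition are simply ignored, which mirrors the distinction between admitted and ignored sequences during training.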
Our DRL framework gives new insight into the relationship between artificial agents and cognitive environments in on-behavior subjects. Moreover, the great advantage of DRL is that there is no credit-assignment problem in the process 12 . The purpose of this study is to specify the effect of the critic network on making better-informed decisions. In addition, it sets out to assess the importance of the synapses between neurons in the actor network, which acts as the nonstationary environment. Although most studies in the field of DRL have focused only on the weights of neurons, this research focuses on the weights of synapses. It follows that the most probable synapse is important in the actor network.
The precision measure, used as the evaluative feedback for a recurrent reward factor, accentuates the consciously chosen alternative actions. In our opinion, this method is the most valuable approach to analyzing p53 mutation patterns in deep genomics, because it helps the neural network to learn from the input instead of only from particular p53 mutation patterns. One should also not overlook the fact that a strong relationship between the most likely tumors and DNA-destructive mutations has been reported in p53 mutation patterns. In our opinion, this type of research makes a major difference in developing deep genomics for drug targets.
In light of these points highlighted by previous research, this study was guided by the following question: "How can the DRL network diagnose various types of multiple cancer and their tumor subtypes by p53 mutation patterns?" In this connection, based on the interpretation of the results, we present a belief network of the most likely cancer and its tumor subtypes, with the most probable signature, under distinct probabilistic hypotheses. To evaluate the performance of our DRL network, we apply several classification measures: accuracy, precision, error rate, recall, sensitivity, and specificity 18 .
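The classification measures named above have standard definitions in terms of a confusion matrix, which can be sketched as follows. The container `ConfusionCounts`, the helper names, and the example counts are hypothetical; the paper does not report its confusion matrices, so the numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """One-vs-rest counts for a single tumor class (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_measures(c):
    """Compute the standard measures from one set of confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    recall = safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": safe_div(c.tp + c.tn, total),
        "error_rate": safe_div(c.fp + c.fn, total),
        "precision": safe_div(c.tp, c.tp + c.fp),
        "recall": recall,
        "sensitivity": recall,  # sensitivity is another name for recall
        "specificity": safe_div(c.tn, c.tn + c.fp),
    }


# Illustrative counts only; these are not results from the study.
measures = classification_measures(ConfusionCounts(tp=40, fp=10, tn=45, fn=5))
```

Note that recall and sensitivity are the same quantity under these definitions, and that the error rate is the complement of accuracy whenever at least one sample is counted.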
Turning now to the DRL architecture, we need to be explicit about exactly what is meant by DRL. Here, the architecture may be defined as a cognitive systematic process consisting of two networks, the FGAN and the actor-critic network, with four types of layers: an input layer, hidden layers, a SoftMax layer, and a recurrent dense layer. More precisely, the LSTM plays the critic and discriminator roles in the embedded layer. With this in mind, each neuron of the input layer represents a Wild type codon. The hidden layers are Wild type mutant, Exon intron, and Codon number, respectively.
The SoftMax layer contains all of the disease classes and calculates the probability of each tumor risk, because we treat the diagnosis of multiple cancer as a binary classification problem. This layer is usually placed before the output layer. The dense layer is a recurrent output layer with a single neuron that computes the precision measure of the most probable disease in the SoftMax layer, which is its preceding layer. It then gives this evaluative feedback to the LSTM as the estimated reward.
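A minimal sketch of this step, assuming a one-vs-rest reading of the precision measure, is given below. The class names, the raw scores, the true labels, and the choice of target class are hypothetical, and the functions are plain-Python stand-ins for the SoftMax and dense layers rather than the trained network.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def most_probable(probs):
    """Index of the class with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


def precision_feedback(predicted, actual, target):
    """One-vs-rest precision for the target class: TP / (TP + FP)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == target and a == target)
    fp = sum(1 for p, a in zip(predicted, actual) if p == target and a != target)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical tumor classes, raw scores for three samples, and true labels.
CLASSES = ["breast", "lung", "brain"]
batch_logits = [[2.0, 1.0, 0.1], [0.2, 1.5, 0.3], [1.8, 0.4, 0.9]]
true_labels = [0, 1, 2]

# SoftMax step: pick the most probable class for each sample.
predicted = [most_probable(softmax(row)) for row in batch_logits]

# Dense-layer step (assumed form): precision of one class, used as the reward.
reward = precision_feedback(predicted, true_labels, target=0)
```

Here the single value `reward` plays the part of the evaluative feedback passed to the LSTM; how the network aggregates it over epochs is not specified in the text and is left out of the sketch.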
It could reasonably be argued that modern creative DL techniques provide new, comprehensible results in cancer genomics and genetics 19 . An effective classification approach can separate out the most likely cancer in p53 mutant data. Unlike previous work, which focused only on statistical models expressing the probability of p53 mutational activities as a function, this article concerns the relative importance of the p53 pathway in diagnosing cancer and its subtypes. Thus, representing the subtypes of the most likely cancer in p53 mutations draws attention to the fact that some p53 mutants are more equal than others 20 . There is reliable evidence that measuring mutations plays a significant role in the disease risk related to human genomics; however, among the various types of mutation, p53 mutants receive the widest attention 21 . Accordingly, mutation pattern data are well suited to categorizing and understanding cancer subtypes. Furthermore, previous works have investigated cancer therapy according to the p53 mutation pathway 22 . The present study confirms previous findings and gives additional evidence that is essential for categorizing various types of breast cancer metastasis 23 .
In this article, we attempt to draw a comparison between studies that focus on signatures in exon and codon numbers. This research demonstrated that the p53 mutant is the major factor in breast cancer and that lung cancer is its most probable subtype. In light of these new creative approaches, we believe that deep genomics still has its weaknesses 13 . The fact remains, however, that DL is still the best framework; it is constantly being improved, which highlights the need for more research on cancer sequencing data 24 .

A deep reinforcement-learning network for P53 mutations in human cancer
Although previous studies of p53 mutants have not dealt with DRL, this research has classified the tumor subtypes for therapeutic targets 20 . Most studies in the field of p53 mutants have focused only on structural, functional, and mutational processes 25 . As mentioned before, the present study is designed to determine the effect of signatures in mutations for the most likely cancer and its subtypes. Several studies have produced estimation functions for p53 mutants 25 , but there is still no comprehensive method for diagnosing related multi-region tumors 26 . The DRL method is one of the most practical approaches to quantitative measurement and analysis 12 .
Regarding the question of the most likely tumor and its subtypes, this study indicates that breast cancer provides the largest set of significant subtype categories. The belief network in figure 1 shows not only that the results are statistically remarkable, but also that there are significant probability differences between the primary and other subtypes. Our findings agree with previous studies that described breast cancer in p53 mutations 27 ; additionally, they appear to lend further support to the idea of its subtypes 28 . Moreover, compared with former findings on the evaluative classification measures, no reported precision exceeds ours for breast cancer metastases 23 . Since this difference has not been found elsewhere, it is probably due to the DRL framework.

A deep reinforcement-learning network for variant cancer classification
This technique may explain the relatively good correlation between decision and diagnosis. A possible explanation is attention-based learning and the potential interference of the LSTM. This combination of findings provides some support for the conceptual premise of breast cancer with G:C>A:T mutations in mutant p53. An implication is that this signature is more likely to occur in breast cancer than in brain and colorectal cancers 29 . Another outcome that emerges from these findings is that liver and bladder tumors are shared secondary subtypes of breast and colorectal cancer, with distinct likelihoods 30 .
There are, however, no other subtypes in brain cancer among the 9 selected tumors. The primary subtypes observed in breast cancer are lung, esophagus, and ovary, in that order. The present study confirms previous findings and gives additional evidence that suggests ideal therapeutic targets. The current findings add substantially to our understanding of cancer subtype identification. The method used in this research may be applied to other studies on diagnosing various cancers. The empirical findings of this study provide a new understanding of mutation data. The most obvious limitation concerns other forms of mutation data that could usefully have been obtained.
These findings cannot be extrapolated to all patients, because the signature must be interpreted cautiously. Moreover, since only p53 mutation patterns were used, caution must be applied, as the findings might not be transferable to other types of mutational processes and pathways. Diagnostic problems are an intriguing area that could usefully be explored on mutation data in further research. Future research should therefore concentrate on investigating the relations among multiple cancer subtypes. We suggest that the association of these factors be investigated in future studies. It would also be interesting to distinguish the effects of multiple cancers.
Interestingly, we anticipated only a small number of mutations for liver tumor as a subtype of lung tumor; most of those mutations belong to the esophagus and colon tumor subtypes, with the highest probabilities reaching 0.7. Another remarkable observation to emerge from the data comparison concerns ovary and brain tumors, which are not related to multiple cancers. The more surprising observation concerns the brain tumor: it is neither a multiple cancer nor a subtype of other tumors. Although ovary and bladder tumors are subtypes of multiple cancers and their correlations are significant, they are not themselves multiple cancers.
The most important result is a significant difference between the two groups in the preliminary analysis of the generated-sequence and ignored-sequence charts in the training phase. It is clear from the Training Phase Generating bar chart that the admitted sequences amount to more than half of the generated sequences in all epochs. The results shown in figure 2, the Evaluation Measures bar chart, indicate that several evaluative measurements are strongly correlated in our test phase; the 15 epochs in the test step were selected specifically from the data on the nine selected tumors.

Discussion
A relationship between p53 mutants and multiple cancer has been reported. So far, our presented approach has not only been applied to cancer genomics. The present study found that lung cancer is the most probable tumor among the breast cancer subtypes. In addition, one unanticipated finding is that only lung and esophagus tumors had the same subtypes. This finding is unexpected and suggests multiple cancer subtypes in mutation patterns.
The present study produces results that corroborate the findings of much of the previous work in this field. These findings support previous research in cancer genomics linking signatures and exons. The present findings appear consistent with other research that found breast cancer in p53 mutants. However, the tumor subtypes have not previously been described in detail in p53 mutation patterns. The cancer subtype topics that emerge from these findings are important for therapeutic targeting in cancer genomics.
This study is especially concerned with mutation data for cancer subtype identification, a classification task based on mutation patterns. Cancer subtype classification is a major current focus in efficient therapeutic targeting for cancer therapy, and it has generated considerable research interest in recent decades.

Method
The samples were from patients with p53 mutants and were fully included in previous research analyses. According to the studies that used these samples, they were approximately matched for all of the individual structures of p53 mutations. These samples were chosen because of the expected value of deep genomics in cancer diagnosis. To understand how the belief network is regulated and divided into subtype groups in the signature, we controlled the interpretation of the results in each epoch. Once breast cancer had been established as the most probable cancer, the exclusive signature was identified from the process and placed in the belief network. Following this, the tumors were placed at their own levels in the belief network with their own probabilistic hypotheses.
The LSTM training analysis was checked when initially performed and checked again at the end of FGAN training. Many reinforcement-learning agents can act in the actor network; in this study, we used the SARSA method. DRL methods are constantly being improved. One of the big advantages of the DRL framework is that it is smaller and lighter than others. Despite a few problems with the design, such as nonstationary and adversarial environments, its advantages, such as model comprehensibility, clearly outweigh its disadvantages. Although a generally accepted definition of the RL agent in the actor network is lacking, we are responsible if the DRL method fails. We are still working on a solution for the DRL agent acting in the actor network.
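For reference, the standard tabular SARSA update can be sketched as follows. This is the textbook on-policy rule, Q(s, a) ← Q(s, a) + α [r + γ Q(s', a') − Q(s, a)], and not the study's implementation: the state and action labels, the learning rate `alpha`, the discount factor `gamma`, and the reward value are all assumptions, since the text does not specify them.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one on-policy SARSA step to the action-value table q in place.

    alpha and gamma are assumed hyperparameters, not values from the paper.
    Returns the temporal-difference error used for the update.
    """
    td_target = reward + gamma * q[(next_state, next_action)]
    td_error = td_target - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical states and actions; unseen pairs default to a value of 0.0.
q_table = defaultdict(float)
q_table[("s1", "a1")] = 1.0

# The reward stands in for an evaluative feedback signal such as precision.
err = sarsa_update(q_table, "s0", "a0", reward=0.5,
                   next_state="s1", next_action="a1")
```

Because SARSA bootstraps from the action actually taken in the next state, the update depends on the agent's own behavior policy, which is what distinguishes it from off-policy rules such as Q-learning.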
Although extensive research has been carried out on evolutionary methods for designing neural networks and on types of RL in biological systems, no single study adequately covers the relationship between the DRL network and evolutionary methods. Nevertheless, the current study found an association between the evolutionary method for designing neural networks and the DRL behaviors based on evaluative feedback. There are several possible explanations for these results. Since these results have not been found elsewhere, they are probably due to the DRL framework. These findings have important implications for developing the RL field. It would be interesting to assess the effect of the LSTM network, and its performance can be used to develop targeted. Throughout this research, the term deep genomics is used to refer to deep learning in biomedical application frameworks 5 . A generally accepted definition of a selective-attention-based and active-inference 31 method in deep genomics is lacking. In this paper, the term used to describe this technique is actor-critic network.

Competing Interests
The authors declare that there are no competing interests.

Author contributions
Armaqan Rahmani-Saraji was responsible for study design, interpretation of results, and manuscript writing, and performed the statistical analyses, summarized the results, and drafted the manuscript. Behrouz Minaei-Bidgoli recruited study subjects and managed the respective project. Meysam Ahangaran coordinated the project. All of the authors reviewed, approved, and contributed to the final version of the manuscript. All methods were carried out in accordance with relevant guidelines and regulations, and all experimental protocols were under the supervision of Prof. Behrouz Minaei-Bidgoli, faculty member of the Computer Engineering Department of Iran University of Science and Technology, and Dr. Meysam Ahangaran, faculty member of the Computer Engineering Department of the University of Science and Technology of Mazandaran.

Data Availability
The datasets analyzed during the current study are available in the IARC repository, https://p53.iarc.fr. No new datasets were generated or analyzed during the current study. According to IARC reports, informed consent was obtained from all the patients and researchers who participated in the dataset.