Developing an algorithm for the application of Bayesian method to software using artificial immune systems

This paper develops a new algorithm that applies the Bayesian method to software using artificial immune systems. An artificial immune system is an adaptive computing system that uses models, principles, mechanisms, and functions described in theoretical immunology to solve problems. Its application to various fields of science is studied, and its important role in software is discussed. Methods for detecting malware are explored. Some works in the field of artificial immune systems are analyzed, and issues to be addressed are identified. The Bayesian method calculates the probability of an event occurring under given conditions; it is therefore applied to software using artificial immune systems. By applying this method, fast software performance can be achieved. For this purpose, a new algorithm is developed and experiments are conducted. The experimental results show good performance.


Introduction
The expansion of information technology has led to the development of many fields of science. The technologies based on the working principles of the human body have gained wide popularity: artificial intelligence, visual processing tools, artificial retina, all types of genetic algorithms and so on. The information systems based on the working principles of immune systems have great potential in many areas (Gavrilyuk et al. 2010).
The immune system in medicine is a complex protection mechanism, the main function of which is to protect the body against harmful external substances, toxins, microorganisms and harmful cells.
Protecting the living organism from the persistent impact of dangerous external and internal factors promotes the development of the immune system. The immune system responds destructively to foreign substances without adversely affecting the body's own tissues (De Castro 2002).
Most immunological reactions are short-term and are controlled by regulatory mechanisms that prevent excessively strong reactions.
Tolerance is a set of mechanisms by which the immune system prevents destructive reactions against the body's own tissues.
Two mechanisms of tolerance are shown in Fig. 1. Most lymphocytes in the primary lymphoid organs that react against the body's own antigens are destroyed by central tolerance mechanisms.
Immunology is the science of the structure and working rules of the immune system, its diseases, and immunotherapy. It studies the biological mechanisms by which the body protects itself against any foreign substance.
An Artificial Immune System (AIS) is an adaptive computing system that uses the models, principles, mechanisms and functions described in theoretical immunology to solve problems.
Although the natural immune systems have not been completely studied so far, today, there are at least three theories explaining the functioning of the immune system and its interactions (Fig. 2).
AIS originated in works on immune networks in the mid-1980s, in articles by Farmer, Packard and Perelson. Kephart published the first article on AIS in 1994 (Kephart 1994), and Dasgupta conducted extensive research on the theory of negative selection. Hunt and Cooke started working on the theory of the immune network in 1995. Timmis and Neal continued and improved this work. The first book on artificial immune systems was edited by Dasgupta in 1999 (Dasgupta 1999).
Technologies based on the same principles as the human body are now widely used: neural networks based on artificial intelligence, visual image processing, artificial retinas, and various genetic algorithms. Information systems based on the principles of immunity are very promising for many areas. Currently, artificial immune systems are mainly used as a type of artificial intelligence to protect against computer viruses, to detect network intrusions, and so on.
Here, another process also arises, in which natural selection of antibodies occurs during clonal selection: only those that react to the identified foreign body survive. At the same time, the information about the emerging antibodies is entered into the gene library mentioned above. Thus, the gene database contains only information about the most serious threats. The vital features of the immune system are highlighted below (Fig. 3).
Unlike the available protection systems, the abovementioned system does not have a central control system; it is a decentralized, highly parallel, distributed data processing and analysis system, and is particularly beneficial for the protection of distributed computing environments. AIS comprises the following algorithms:
• Clonal Selection Algorithm: a class of algorithms based on the clonal selection theory of acquired immunity, which explains how B and T lymphocytes improve their response over time. These algorithms build on the Darwinian attributes of the theory: selection is based on the affinity of the antigen-antibody interaction, reproduction on the principles of cell division, and variation on somatic hypermutation;
• Negative Selection Algorithm;
• Immune Network Algorithm.
AIS is a class of automated computing systems based on the principles and processes of the immune system. Typically, such algorithms use the memory and learning capability of the immune system to solve the problem.
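The selection, cloning and hypermutation steps of clonal selection described above can be illustrated with a minimal Python sketch. This is only a toy illustration under stated assumptions, not the algorithm developed in this paper: the real-valued antibody encoding, the function names (clonal_selection, example_affinity), all parameter values and the affinity function are hypothetical choices made for the example.

```python
import random


def clonal_selection(affinity, dim, pop_size=20, n_select=5, clone_factor=4,
                     generations=50, seed=0):
    """Toy clonal selection loop that searches for a high-affinity antibody."""
    rng = random.Random(seed)
    # Initial repertoire: random real-valued "antibodies" in [-1, 1]^dim.
    population = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the antibodies with the highest affinity.
        population.sort(key=affinity, reverse=True)
        selected = population[:n_select]
        clones = []
        for rank, antibody in enumerate(selected):
            # Cloning: better-ranked antibodies receive more clones.
            n_clones = max(1, clone_factor * (n_select - rank))
            # Hypermutation: worse-ranked antibodies mutate more strongly.
            step = 0.05 * (rank + 1)
            for _ in range(n_clones):
                clones.append([x + rng.gauss(0.0, step) for x in antibody])
        # Replacement: keep the selected parents and the best clones, then
        # refill the remaining slots with fresh random antibodies.
        clones.sort(key=affinity, reverse=True)
        survivors = selected + clones[:pop_size - n_select - 2]
        newcomers = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                     for _ in range(pop_size - len(survivors))]
        population = survivors + newcomers
    return max(population, key=affinity)


def example_affinity(antibody):
    # Hypothetical affinity: closeness to a fixed "antigen" at (0.3, -0.2).
    target = (0.3, -0.2)
    return -sum((a - t) ** 2 for a, t in zip(antibody, target))


best = clonal_selection(example_affinity, dim=2)
```

Keeping the selected parents in every generation means the best affinity found never decreases, while the random newcomers preserve some diversity in the repertoire.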

Related works
Some of the available studies in the field of AIS are reviewed below.
1. Many studies have been aimed at solving complex technological problems with methods inspired by immune systems. These problems include anomaly detection, pattern recognition, system security, data collection, etc.
2. Large-scale software systems are often difficult to manage and control. In many cases, unexpected events occur in these systems, especially after upgrades or changes in their environment (e.g., when updating the operating system). Therefore, to cope with a changing environment, there is a need for self-adaptive methods to detect errors and monitor performance (Ligeiro 2014). There are similarities between detecting program errors and detecting pathogenic microorganisms in natural immune systems. Inspired by the vaccination and the negative and clonal selection observed in these systems, an effective adaptive model for software monitoring is developed by analyzing system resource indicators.
3. Artificial Intelligence (AI) is the intelligence exhibited by machines and software. It is an academic field in high demand in modern research. Leading researchers in AI define this area as "the study and design of intelligent agents." The term was coined by McCarthy in 1955, who defined it as "the science and engineering of making intelligent machines." The main objectives of AI research are reasoning, knowledge, planning, learning, natural language processing (communication), perception, and the ability to move and manipulate objects. In fact, AI is a broad interdisciplinary field that includes many sciences and professions, including computer science, psychology, linguistics, philosophy, neuroscience, and so on. The field was founded on the claim that intelligence, a central property of Homo sapiens, can be described so precisely that a machine can be created to model it. AI has been a subject of great optimism; however, it has also suffered surprising setbacks (Sniecinski et al. 2018).
4. Several traffic signal control systems have been developed for intersection traffic control based on optimization techniques and artificial intelligence. The proposed system is applied to a modeled intersection using the traffic simulation software VISSIM. The obtained results suggest that the proposed system is suitable for use in extreme conditions associated with blocked approaches and high traffic, and that it is competitive and capable of controlling various traffic scenarios (Louati et al. 2018).
5. The immune system of the human body has great potential for protecting it against many harmful viruses and foreign objects. Throughout history, people have been infected by microorganisms, and humans have the ability to limit the nature, extent and intensity of these microbial invasions. The human immune system is capable of protecting the body, skin, cells and tissues from external effects. [8] presents a study of the human immune system through mathematical models of adaptive immune systems. Extensive simulations are performed to study the effects of foreign particles on the recovery mechanism of the body. The results confirm the validity of the immunological mathematical model of a human. The strong security and confidentiality mechanisms of the human body can help to build a strong networking system (Rathore et al. 2018).
6. Dynamic risk identification is used to predict future risks based on data about possible risks. A training model of risk identification technology based on the Fuzzy Support Vector Machine (FSVM) is able to identify potential risks fully and automatically, which makes it a key method for dynamic risk identification. The selection of FSVM parameters is crucial for improving recognition efficiency and accuracy. The artificial immune algorithm (AIA) is an effective technology for stochastic global optimization and has the benefits of high precision and convergence.
As the review of related studies shows, the use of artificial immune systems in various areas is very important and provides effective performance.
The advantages and disadvantages of protection systems based on artificial immune systems are shown in Fig. 4.

Methods for malware detection
One of the most serious threats to the data security of automated systems is malware. Malware can steal confidential information from a computer, infect files, spread throughout a computer or an entire local network, cause loss of confidential data, encrypt all stored data for malicious purposes, cause economic damage to the data owners, and so on. Malware should be detected and removed as soon as possible to prevent malicious activity (Tokarev and Sychugov 2017).
Malware broadly includes viruses, worms, Trojans, spyware, adware, etc. (Yevdokimov 2012).
1. Hiding methods. These methods alter the syntax of the program without changing its semantics, making it more difficult to analyze and comprehend malicious code (Venkatesan 2008). This can be achieved by changing the registers in which variables are placed while the behavior of the program remains unchanged, by changing the order in which instructions are executed, by replacing instructions with equivalent ones, etc.
2. Code encryption. Encrypted malware consists of the encryption/decryption routines, the encryption keys, and the encrypted malicious code itself.
3. Encrypted malware may include the following types:
• Oligomorphic programs, which make small changes, such as to the encryption and decryption routines, that make it difficult for signature-based malware detection systems to detect some features of the code;
• Polymorphic programs, which use a large number of encryption algorithms and keys and encrypt themselves differently with each new infection, so that their detection is very complex and time-consuming;
• Metamorphic programs, which can completely rewrite themselves so that they no longer resemble the original version and are the most difficult to detect.
Malware detection systems are often the first line of defense of a protected computer system. Most detection methods can be classified into the following three classes.
1. Signature-based methods compare a program against a database of known malware signatures. Although they are very effective, even a very small change in the code changes the signature, and the method becomes ineffective. In addition, this method requires regular updating of the signature database to detect new malware (Biryukov 2012).
2. Behavioral methods provide continuous monitoring of the behavior of a program to determine whether it is malicious. Experiments show that these methods have high levels of false positives.
3. Heuristic methods primarily use machine learning and data mining to determine software behavior (Kris 2006).
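The sensitivity of signatures to small code changes, noted above, can be illustrated with a short Python sketch. It is a simplified illustration only: it assumes that a signature is the SHA-256 digest of the whole byte sequence, whereas practical scanners typically match byte patterns, and the byte strings and the names signature, signature_db and is_flagged are hypothetical.

```python
import hashlib


def signature(code: bytes) -> str:
    # Toy "signature": the SHA-256 digest of the whole byte sequence.
    return hashlib.sha256(code).hexdigest()


# Hypothetical byte strings standing in for a known sample and a variant.
known_sample = b"\x90\x90PAYLOAD-EXAMPLE\xc3"
variant = b"\x90\x90\x90PAYLOAD-EXAMPLE\xc3"  # one extra filler byte

# Signature database containing only the known sample.
signature_db = {signature(known_sample)}


def is_flagged(code: bytes) -> bool:
    # Signature-based check: exact match against the database.
    return signature(code) in signature_db


flagged_known = is_flagged(known_sample)  # the known sample matches
flagged_variant = is_flagged(variant)     # the one-byte variant does not
```

A single inserted byte is enough for the variant to pass the exact-match check, which is why the signature database has to be updated regularly.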

Development of an algorithm using Bayesian method
The significance of artificial immune systems for software was discussed in the previous sections. Using the Bayesian method, it is possible to determine how well the immune system protects software. The Bayesian method, briefly explained below, calculates the probability of an event occurring under given conditions. Therefore, the Bayesian method is applied to software using artificial immune systems.
Bayes' theorem (or the Bayesian formula) is one of the basic theorems of elementary probability theory; it enables a precise calculation of the probability of an event given that a certain condition holds (Ye 2005). The Bayesian formula can be derived from the main axioms of probability theory, in particular from the definition of conditional probability. Applying Bayes' theorem in practice requires a large number of calculations. Therefore, Bayesian estimates began to be actively used only after the revolution in computer and network technologies (Daniel 2005).
The Bayesian formula is shown below:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Here, $P(A)$ is the prior probability of hypothesis $A$; $P(A \mid B)$ is the probability of hypothesis $A$ given that event $B$ has occurred (the posterior probability); $P(B \mid A)$ is the probability of event $B$ if hypothesis $A$ is true; and $P(B)$ is the total probability of event $B$.
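In practice the total probability of event B is usually not given directly. As a worked step, and assuming only two mutually exclusive hypotheses, A and its complement (written here as \bar{A}, a symbol that does not appear in the original text), it can be expanded by the law of total probability, which gives the form of the formula that is convenient for calculation:

```latex
P(B) = P(B \mid A)\,P(A) + P(B \mid \bar{A})\,P(\bar{A}),
\qquad
P(A \mid B) = \frac{P(B \mid A)\,P(A)}
                   {P(B \mid A)\,P(A) + P(B \mid \bar{A})\,P(\bar{A})}
```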

Experiment
Examples may include temperature maintenance, bridge-building scenarios, or virtual entities used purely in software applications. Suppose the frequency of virus infection among software is 0.001, and the immune system protection method detects infected software with probability 0.9. The probability of a false positive result is 0.01. The task is to find the probability that the immune system alert is false, that is, the probability that software which the verification reports as infected is actually uninfected.
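Let I denote the event that the software is infected, (I) the event that the software is verified to be infected, and H the event that the software is uninfected. Then, the given conditions are written as follows:
$$P(I) = 0.001,\quad P(H) = 0.999,\quad P((I) \mid I) = 0.9,\quad P((I) \mid H) = 0.01$$
By the Bayesian formula, the probability that software verified to be infected is actually uninfected is
$$P(H \mid (I)) = \frac{P((I) \mid H)\,P(H)}{P((I) \mid H)\,P(H) + P((I) \mid I)\,P(I)} = \frac{0.01 \cdot 0.999}{0.01 \cdot 0.999 + 0.9 \cdot 0.001} \approx 0.917$$
Thus, software that the examination reports as infected is in fact uninfected with a probability of about 91.7%.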
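The 91.7% figure can be checked numerically. The following Python sketch is a direct application of the Bayesian formula to the three values given in the experiment; the function and parameter names are illustrative and are not taken from the authors' software.

```python
def posterior_uninfected(prevalence, detection_rate, false_positive_rate):
    """P(uninfected | verified as infected), computed with Bayes' formula."""
    p_infected = prevalence
    p_uninfected = 1.0 - prevalence
    # Total probability of an "infected" verdict over both hypotheses.
    p_flagged = (detection_rate * p_infected
                 + false_positive_rate * p_uninfected)
    return false_positive_rate * p_uninfected / p_flagged


p = posterior_uninfected(prevalence=0.001, detection_rate=0.9,
                         false_positive_rate=0.01)
print(f"{p:.3f}")  # 0.917
```

The result is dominated by the low prevalence: uninfected software is so much more common that its false alarms outnumber the true detections.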
If erroneous examination results can be regarded as random errors, then a re-examination of the same software will be independent of the first result. In this case, software that is estimated to be infected should be re-examined in order to reduce the probability of a false positive. The probability that the software is uninfected when the re-examination also reports it as infected can be calculated with the Bayesian formula as follows:
$$P(H \mid (I),(I)) = \frac{P((I) \mid H)^2\,P(H)}{P((I) \mid H)^2\,P(H) + P((I) \mid I)^2\,P(I)} = \frac{0.01^2 \cdot 0.999}{0.01^2 \cdot 0.999 + 0.9^2 \cdot 0.001} \approx 0.110$$
Here, three software systems are used for the experiment. The presented rule is applied to each of them and the results are summarized in Table 1. Note that the software for these systems was developed by the author of the article (Table 1, Fig. 5) (Alguliyev 2005).
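Re-examination can be viewed as a sequential Bayesian update in which the posterior of one examination becomes the prior of the next. The sketch below illustrates this under the independence assumption stated above, using the values 0.001, 0.9 and 0.01 from the experiment; it is not the authors' implementation, and the names update and examine_repeatedly are hypothetical.

```python
def update(p_infected, flagged, detection_rate=0.9, false_positive_rate=0.01):
    """One Bayesian update of P(infected) after a single examination."""
    if flagged:
        like_infected, like_clean = detection_rate, false_positive_rate
    else:
        like_infected = 1.0 - detection_rate
        like_clean = 1.0 - false_positive_rate
    numerator = like_infected * p_infected
    return numerator / (numerator + like_clean * (1.0 - p_infected))


def examine_repeatedly(verdicts, prior=0.001):
    """Fold a sequence of independent verdicts into P(infected)."""
    p = prior
    history = []
    for flagged in verdicts:
        p = update(p, flagged)  # the posterior becomes the next prior
        history.append(p)
    return history


# Two consecutive "infected" verdicts for the same software.
history = examine_repeatedly([True, True])
p_uninfected_after_two = 1.0 - history[-1]
print(f"{p_uninfected_after_two:.3f}")  # 0.110
```

After one positive verdict the software is still most likely uninfected (about 0.917), while a second independent positive verdict lowers that probability to about 0.110.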

Conclusion
The research highlighted the important role of AIS in many fields of science. AIS uses different algorithms and can be applied to solve various problems. The use of AIS in programming is of particular importance. AIS is used to "clean" malware (viruses, worms, Trojans, spyware, etc.). In this article, a new algorithm for applying the Bayesian method to software using AIS was developed and tested experimentally. The advantages and disadvantages of AIS were shown. To eliminate the disadvantages, improved AISs should be developed to make software more efficient and effective. In practice, applying the Bayesian method in this way does not require extensive computation, so fast software performance can be achieved. Note that the proposed method could be further optimized using any optimizer.