An Approach of CA with M-RIPPER for Heart Disease Prediction

Abstract: Mobile computing devices are now popular and present in every aspect of life. This integration of computing with the physical world is not restricted to everyday life: it also affects the medical field, where care is delivered across a wide range of areas and conditions. The medical domain is continually absorbing new kinds of innovation, including context-aware systems and applications. In this research, a context-aware healthcare model based on an IoT application is proposed. Smart medical devices measure data from patients and store it in a database. From the database, the patient's information and medical records are treated as context-aware data. For analyzing and classifying the data, the MRIPPER (Modified Repeated Incremental Pruning to Produce Error Reduction) algorithm is used. This is a rule-based machine learning algorithm, and the rules it produces are used to analyze the dataset for the prediction of heart disease. The performance of the proposed model is evaluated in the MATLAB simulation tool and compared with existing models such as J48, random forest, CART, OneR, and JRip. The proposed algorithm achieves 98.89% accuracy, 96.76% precision, 99.05% recall (sensitivity), 94.35% specificity, and a 97.60% F-score. Overall, the proposed model obtains 97.38% accuracy in predicting normal-class subjects and 97.93% in predicting abnormal-class subjects.


INTRODUCTION
Context awareness originated as a term in ubiquitous computing, which emphasizes the integration of the data space and the physical space. With its assistance, individuals can receive and process data anytime and anywhere through any device that can connect to the internet. It can therefore reduce the difficulty of using devices and make people's lives simpler and more efficient. In ubiquitous computing the user's environment, for example the location or the terminal equipment, is continually changing; this changing environment is called context. As one of the central areas of ubiquitous computing, context-aware computing has become increasingly well known. Numerous authors have defined context according to their own understanding in an effort to arrive at a comprehensive basic notion of context [1]. Schilit and Theimer used the term context-aware in 1994 and referred to context as location, the identities of nearby people and objects, and changes to those objects. Context has been divided into two classes, physical and logical [2]. Physical context can be determined through hardware sensors, whereas logical context is either provided through the user's feedback or obtained by observing the user's interactions with the available services, for instance by monitoring the user's profile, working schedule, activities, writing activity, and so on. Most research in this area uses physical sensors for movement, sound, touch, temperature, light, and of course location. Logical sensors, by contrast, provide associated data by reading the user's data from public web pages and other archives; they also review the user's interactions and use them to target advertising [2].
Context can comprise various elements or factors such as location, user identity, time, activity, current task, environment, and hardware. Context awareness means the ability to make use of context data. A system is context-aware if it can extract, interpret, and use context data and adapt its behaviour to the current context of use. The term context-aware computing became widely recognized through those working in the field, where context was regarded as a key resource in the effort to distribute and seamlessly integrate computing technology into our lives. Context-aware systems can modify their behaviour based on the present context, which also increases effectiveness by taking the environmental context into account. A context-aware system (CAS) observes conditions continuously and offers suitable recommendations to users, on which they can take appropriate action [3].
A context-aware healthcare system helps hospitals improve operational efficiency by incorporating real-time contextual data, such as the locations and current conditions of medical devices and employees, into the actual workflow. It also allows access to environmental data in order to deliver the best possible patient experience. The solution supports both real-time asset monitoring and event-driven tracking: real-time tracking follows movement across the surroundings, while event-driven tracking registers exits from and entries into regions. A primary goal of healthcare systems must be to safeguard and maintain the security of patient data. In addition, the use of context is critical in interactive frameworks where the user's data changes frequently, such as in portable and ubiquitous computing (Figure 1). The combination of the internet and the medical domain is a progressive innovation, with present research focusing on the use of computing to support training in the medical sector. The smart clinical devices market was predicted to exceed 126 billion dollars by the year 2028, and smart wearable devices are intended to be used extensively to improve people's health, quality of life, and safety. Beyond their ability to support continuous real-time monitoring of patient data, such devices also make context-aware mobility significant for improving the overall quality of medical care. A context-aware system (CAS) is a system that can adapt its behaviour to context changes without explicit user intervention. The CAS platform should explicitly expose its components' functionality, the context data, and the control actions, and it provides services to users using context data, where relevance depends on the user's task.
In this way, a context-aware domain can be designed as middleware that allows environmental data to be passed from the low-level infrastructure to a higher level for interpretation and decision-making. This multi-layered design is common in cloud computing, which allows the middleware layer to be placed as part of a sensor-cloud interface in the PaaS (Platform as a Service) layer [4].
Context-aware systems also play an important role in healthcare, where automatically distinguishing a patient from the surroundings, recording the events associated with a specific patient, keeping track of the services provided at a specific location, and producing the necessary documentation are among the important functions the system must provide. The system also has the additional obligation to keep patients' and healthcare professionals' information secure, and security is likewise required for any equipment used by healthcare institutions. As a result, healthcare systems cannot be viewed in isolation from other technological systems; rather, they are sociotechnical systems that depend on the combined outcome of interactions between technology and users. Context awareness aids more precise diagnosis of the observed patient's health problems. It can recognize behavioural patterns and so draw more exact conclusions about people and their surroundings. Adaptation, personalization, and proactivity are the three most essential advantages of context awareness, as described below.
Adaptation is concerned with tailoring a service or information to the user's present situation. For example, the system may adjust the data it delivers based on network and device context, such as connection speed and display resolution.
Personalization is the process of customizing a system to individual users so that each user sees the system differently at the same time. Personalization is based on the individual user's choices, habits, abilities, duties, and so on. In healthcare monitoring systems, for example, the information supplied to doctors, or more precisely its level of detail, is plainly different from that delivered to patients or caregivers.
Proactivity is concerned with providing services to users based on forecasts of future circumstances. In healthcare monitoring systems, proactivity is critical for producing genuinely useful solutions. For example, anticipating health issues helps to discover them at an earlier stage, which frequently increases the likelihood of averting, or at the very least minimizing, the harm they cause. Another use case is the prediction of disease-causing mutations, that is, genetic alterations in the genome that are likely to have a molecular effect.
Over the last decade, the focus of CAS has shifted from web applications and desktop computing to the Internet of Things (IoT). Because of advances in sensor technology, sensors are becoming more powerful, cheaper, and smaller. Numerous sensors are now deployed, and together they generate a large amount of information, that is, big data. Unless the collected information is analyzed, interpreted, and understood, it may not yield useful knowledge. Context-aware computing has played a significant part in handling this task in, for example, mobile and pervasive computing, and it should be effective in the IoT paradigm as well. It allows the context data associated with sensor information to be stored so that interpretation can be done more easily and meaningfully. Context also makes it simpler to carry out machine-to-machine interaction, which is a core element of the IoT environment [5].
Heart disease is one of the most critical and difficult health problems in the modern world. Heart disease reduces blood vessel function and causes coronary artery infections, both of which weaken the patient's body, especially in adults and the elderly. According to the WHO, heart diseases are the leading cause of death globally. In 2019, an estimated 17.9 million people died from heart diseases, accounting for 32% of total worldwide mortality. Stroke or heart attack caused 85 percent of these fatalities. More than three-quarters of all heart disease deaths occur in low-and middle-income countries. Heart diseases are responsible for 38% of the 17 million premature deaths (before the age of 70) caused by noncommunicable disease in 2019. Most heart disease can be prevented by addressing behavioural risk factors such as cigarette use, poor diet and obesity, inactivity, and excessive alcohol use. It is critical to detect heart disease as soon as possible so that therapy with counselling and medicines may begin. Heart diseases are a kind of heart and blood vessel disease. Among these include cerebrovascular disease, coronary heart diseases, peripheral artery diseases, congenital heart diseases, rheumatic heart diseases, deep vein thrombosis, and pulmonary embolism [https://bit.ly/35qpAGG].
Context awareness plays a significant role in the Internet of Things, as it provides rich contextual knowledge that can make the system perform more effectively. Since every healthcare context is different, it is important to determine an adequate context-aware architecture for IoT healthcare applications. In this research, a context-aware heart disease prediction model based on an IoT application is proposed. From the dataset, the patient's information and medical records are treated as context-aware data. For analyzing and classifying the data, the modified RIPPER (Repeated Incremental Pruning to Produce Error Reduction) algorithm is used. The rules produced by this algorithm are used to analyze the dataset for the prediction of heart disease. The remainder of this paper is organized as follows: Section II discusses related work, Section III presents the proposed methodology, Section IV presents the performance analysis, and Section V presents the conclusion and future extensions of the research.

RELATED WORKS
Yousef A presented an analysis of healthcare monitoring framework and its offerings on the IoT platforms. Many functions that exist in healthcare systems have been described and modelled. In addition, this work aimed to establish and propose a general framework for the development and design of context-aware healthcare monitoring framework in the IoT domain. The essential elements of healthcare monitoring framework, as well as their relationships, were discovered and modelled in such a model. The work also emphasized the importance of the AI sectors in tackling robust context aware healthcare monitoring. This framework was built on a distributed layer architecture, with distinct components implemented across the physical layers, cloud platform, and fog platform [1].
Mohamed A B et al. presented a novel decision-making paradigm based on an IoT method for identifying and tracking type 2 diabetes patients. A wireless BAN was used to track changes in the user's body symptoms, and a smartphone interface was used to record social interactions. Because the decision support process needed to be improved for accurate prediction of type 2 diabetes, a hybrid approach combining type 2 neutrosophic with the VIKOR method was proposed. The performance of this model was satisfactory, and its accuracy could be improved by using a more advanced approach [6].
Abdur RMF et al. proposed a knowledge discovery-based approach that enabled a context-aware system to change its behaviour in real time by analyzing large volumes of data produced in ambient assisted living frameworks and stored in cloud databases. The proposed model allowed big data analysis within a cloud setting. It first analyzed the trends and patterns in a particular patient's records, along with the associated probabilities, and then used this information to learn the relevant abnormal conditions. The results of this learning approach were then used in a context-aware decision-making scheme for the patient. This model could be improved with more context domains [7].
Deeba K and RA. K. Saravanaguru proposed a model for monitoring the vital signs and health conditions of elderly people. Caregivers observed the data from the system through IoT in order to identify daily activities. A fuzzy logic controller was designed to cover data collection, data processing, filtering, aggregation into contextual data, and reasoning to identify the elderly people's health conditions [8].
Jalil N-K et al. proposed a novel hybrid technique for heart disease diagnosis using an optimization method for feature selection. This work mainly focused on improving feature selection and reducing the number of features. An imperialist competitive algorithm, a meta-heuristic technique, was proposed to choose the essential features of heart disease, and the K-nearest neighbour technique was used for classification. This model could be extended to improve feature selection for missing and incomplete data [9].
Daniel A D et al. designed a context-aware system to assist health care providers in home-based caring environments. A reliable NFC authentication scheme was used, which creates a secure channel by encoding sensitive contextual data during data transmissions. Using a context-aware gateway node, this system performs authentications and authorization for accessing a specific patient's data. The proposed solution aimed to improve health care data access and safe data delivery while protecting users' privacy. This research provided a foundation for physicians to develop different smart treatment alternatives, as well as for home-based care [10]. Deeba K and RA. K. Saravanaguru proposed a Smart Home Caregivers System (SHCS) capable of collecting real-time patient's heart rate, oxygen leakage in, abnormal and normal patient's condition observed through MQ6 sensor. The data sensed was transmitted to the base station, where it was controlled by caregivers via PC or mobile device. This method was carried out by either wired or remote users using REST web services [11].
Based on fuzzy logic, Byung-K L et al. presented a context-aware healthcare model for disease reasoning. It was made up of two modules: the fuzzy-based disease reasoning model (FDRM) and the fuzzy-based context-aware model (FCAM). The FCAM calculated the correlation coefficients and supports between the conditional attributes and the decision attributes, and produced fuzzy rules based solely on the conditional attributes with the highest correlation coefficients and supports. Based on experiments performed with the SIPINA mining tool, the average accuracy of the fuzzy rules based on correlation coefficients and supports (FRCS) and of the enhanced C4.5 was 0.84 and 0.81, respectively. That is, compared with the enhanced C4.5, the FRCS reduced the number of rules produced while improving the accuracy of the rules [12].
In this research, a context-aware heart disease model based on an IoT application is proposed. Smart medical devices are used to measure data from the patients and store it in a database. From the database, the patient's information and medical records are treated as context-aware information. For analyzing and classifying the data, the modified RIPPER algorithm, a rule-based machine learning algorithm, is used. The rules produced by this algorithm are used to analyze the dataset for the classification and prediction of heart disease. Based on the classification results, the prediction of heart disease is performed (Figure 2).
Smart wearable medical devices based on a body sensor network are used to measure the patient's physiological data (e.g., heart rate and temperature). The data from these devices are stored on a cloud platform. The stored data can then be managed or used by the user or by medical centres to analyze the patient's health condition. Using a dedicated application, the patient can be monitored remotely over the internet through smartphones or PCs. For analyzing and classifying the data, the modified RIPPER algorithm is used, and the rules it produces are used to analyze the data for the prediction of heart disease. The block diagram of the proposed model is shown in Figure 2. The proposed algorithm is discussed as follows.
IoMT devices and wearable devices are treated as the IoT devices. They are equipped to collect patient data from remote areas. These data are collected as patient information by IoT devices connected to or worn on the human body.

RIPPER ALGORITHM
Rule-based machine learning can be described as learning a concept description in the form of rules, and the RIPPER algorithm is one of the most widely used methods of this kind. Compared with many other algorithms, its main benefit is that it is easy to understand: the rules are generated in IF-THEN form, so the model is entirely interpretable. RIPPER is a rule-based classification algorithm that produces a classifier consisting of a collection of IF-THEN rules derived directly from the training dataset, which is why it is called a direct method. It can be used for both binary and multi-class classification. The core framework of the RIPPER algorithm is split into two phases: rule generation and rule optimization. The generation phase consists of two nested loops: the outer loop produces a rule and adds it to the rule base after pruning, while the inner loop adds one antecedent to the rule at a time. The optimization phase creates alternative rules based on the rules in the rule base, and the minimum description length (MDL) criterion is used to pick the best rule and add it to the rule base [13]. The algorithm goes through four stages:
Growth: A rule is created by greedily adding features to the rule until it meets the stopping criterion.
Pruning: Every rule is pruned and made shorter by eliminating redundancy and removing earlier-added conditions, which allows the rule to generalize better.
Optimization: The initial growth and pruning process creates rules starting from an empty rule set. The optimization phase takes the rules created during the initial growth and pruning stages and attempts to create new rules from the rule set. The rules can be further optimized by adding features to the initial rule using the greedy approach (i.e., depth-first search). Following the growth and pruning process, a new rule set is created.
Selection: The best rules are kept and the remaining rules are removed from the system.
The specifics of the algorithm are as follows, with the dataset D used as input (equations 1 and 2).


Step 1: Split the dataset D into a growth set Gro and a pruning set Pru.
Step 2: The growth set Gro is used as the dataset at this point. The rule being grown starts with no antecedents, and each time a suitable combination of candidate feature and threshold is chosen as an antecedent, it is added to the rule. The information gain is used as the evaluation criterion, where Cover is the number of positive instances covered after adding the antecedent to the rule, rt' is the proportion of positive instances among the data covered by the rule after adding the antecedent, and rt is the same proportion before adding it. The iteration of adding antecedents continues until Gro is empty.
Step 3: The pruning process uses the pruning set Pru to measure the rule's generalization capacity. Beginning with the antecedent added last, antecedents are removed from the rule one at a time. The pruning metric is computed from p, the number of positive instances covered by the rule in Pru, and n, the number of negative instances covered by the rule in Pru; the aim of the calculation is to increase the precision on the pruning set.
Step 4: After the rule has been pruned, an attempt is made to add it to the rule base. The addition fails if the number of instances covered by the rule is too small or its precision is too poor. If the rule is added successfully, the instances it covers are removed from D [14].
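Steps 2 and 3 can be illustrated with a small sketch. It assumes the standard RIPPER formulations, which agree with the symbol definitions given in the steps: the FOIL-style information gain $Gain = Cover \times (\log_2 rt' - \log_2 rt)$ for growing and the metric $(p - n)/(p + n)$ for pruning. These forms, the function names, the restriction to "attribute >= threshold" conditions, and the toy records are all assumptions made for illustration, not the implementation used in this research.

```python
import math

def foil_gain(pos_before, neg_before, pos_after, neg_after):
    """Assumed growth criterion: Cover * (log2(rt') - log2(rt)), Cover = pos_after."""
    if pos_after == 0:
        return 0.0  # a condition that covers no positives is never useful
    rt = pos_before / (pos_before + neg_before)    # positive rate before adding
    rt_new = pos_after / (pos_after + neg_after)   # positive rate after adding
    return pos_after * (math.log2(rt_new) - math.log2(rt))

def prune_value(p, n):
    """Assumed pruning metric (p - n) / (p + n), evaluated on the pruning set."""
    return (p - n) / (p + n) if p + n else 0.0

def covers(rule, record):
    """A rule is a list of (attribute, threshold) pairs meaning value >= threshold."""
    return all(record[attr] >= thr for attr, thr in rule)

def counts(rule, data):
    """Return (positives, negatives) covered by the rule in data."""
    labels = [y for x, y in data if covers(rule, x)]
    return sum(labels), len(labels) - sum(labels)

def grow_rule(grow, candidates):
    """Step 2: greedily add the antecedent with the highest gain."""
    rule = []
    pos, neg = counts(rule, grow)
    while neg > 0:  # stop once the rule covers no negative instances
        scored = [(foil_gain(pos, neg, *counts(rule + [c], grow)), c)
                  for c in candidates if c not in rule]
        gain, best = max(scored, default=(0.0, None))
        if gain <= 0:
            break
        rule.append(best)
        pos, neg = counts(rule, grow)
    return rule

def prune_rule(rule, prune):
    """Step 3: drop the last-added antecedent while the metric does not get worse."""
    best = list(rule)
    while len(best) > 1 and (prune_value(*counts(best[:-1], prune))
                             >= prune_value(*counts(best, prune))):
        best = best[:-1]
    return best

# Toy records (hypothetical values); label 1 = positive class, 0 = negative class.
grow = [({"age": 60, "chol": 250}, 1), ({"age": 65, "chol": 260}, 1),
        ({"age": 40, "chol": 255}, 0), ({"age": 45, "chol": 180}, 0)]
prune = [({"age": 70, "chol": 200}, 1), ({"age": 58, "chol": 245}, 1),
         ({"age": 30, "chol": 250}, 0)]
candidates = [("age", 55), ("chol", 240)]

rule = prune_rule(grow_rule(grow, candidates), prune)
```

On this toy data the condition on age separates the two positives from both negatives, so it receives the larger gain and the grown rule already covers no negatives after one antecedent.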

MRIPPER ALGORITHM
The RIPPER algorithm for rule induction was introduced as a successor to the Incremental Reduced Error Pruning (IREP) algorithm. Although the fundamental ideas remain the same, Modified-RIPPER (MRIPPER) improves on IREP in several details and is also capable of dealing with multi-class problems. A single MRIPPER rule is made up of an antecedent part and a consequent part. The antecedent part is a conjunction of predicates (selectors), and the consequent part is a class assignment. MRIPPER learns such rules greedily, using a separate-and-conquer approach. Prior to the learning process, the training data are sorted by class label in increasing order of class frequency. The rules for the first m-1 classes are then learned, beginning with the smallest class. Once a rule has been created, the instances covered by that rule are removed from the training data, and this process is repeated until no instances from the target class remain. After that, the algorithm moves on to the next class. Finally, when MRIPPER finds that there are no more rules to learn, a default rule (with an empty antecedent) is added for the last class. Rules for a single class are learned until either all positive instances are covered or the last rule added is too complicated. The latter property is implemented in terms of the total description length: the stopping criterion is met if the description length of the rule set R is substantially longer than the shortest description length found so far [15]. The rule set for the algorithm is framed on the basis of the patient's medical condition with regard to the prediction of heart disease. It is built from the conditions related to the causes of heart disease, as shown in Table 1. Each attribute in the table has a range based on the described condition. According to these ranges, the disease condition is predicted with the rule set framed for the context-aware system.
The context database collects the context-aware data from the input, which consists of the medical records, personal information, and physiological data of the patient. Based on the context analyzer and context reasoning, the rule set is framed and analyzed. The data from the context database are processed by the proposed algorithm for classification. For the experimental evaluation, the Cleveland dataset is used in this research.
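The class ordering and the default rule described above can be sketched as follows. The sketch only shows how classes are ordered by frequency and how an ordered rule set with range conditions and a default rule is applied to one record; it does not learn rules. The attribute names, the ranges, and the helper names are hypothetical placeholders and are not the rule set of Table 1.

```python
from collections import Counter

def order_classes(labels):
    """Sort class labels by increasing frequency (ties broken by label name)."""
    freq = Counter(labels)
    return sorted(freq, key=lambda c: (freq[c], str(c)))

def matches(conditions, record):
    """conditions maps attribute -> (low, high); every range must contain the value."""
    return all(lo <= record[attr] <= hi for attr, (lo, hi) in conditions.items())

def classify(ruleset, default_class, record):
    """Return the class of the first matching rule, else the default class."""
    for conditions, label in ruleset:
        if matches(conditions, record):
            return label
    return default_class  # default rule: empty antecedent, most frequent class

# Rules are learned for the first m-1 classes; the last class gets the default rule.
labels = ["normal"] * 6 + ["abnormal"] * 4
class_order = order_classes(labels)
default_class = class_order[-1]

INF = float("inf")
# Hypothetical ranges, for illustration only (not clinical thresholds).
ruleset = [
    ({"chol": (240, INF), "trestbps": (140, INF)}, "abnormal"),
    ({"thalach": (0, 110)}, "abnormal"),
]

patient = {"chol": 260, "trestbps": 150, "thalach": 150}
prediction = classify(ruleset, default_class, patient)
```

Keeping the rules in an ordered list makes the result depend on the first rule that fires, which mirrors the order in which the classes were learned.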

PERFORMANCE ANALYSIS
The performance of the proposed model is evaluated in the MATLAB simulation tool. The experiments were carried out on a 64-bit i5 processor running Windows 10 with 8 GB RAM, using MATLAB & Simulink R2017a. Data classification is central to this analysis. The proposed classifier classifies the data for the prediction of heart disease, and the result takes the form of the absence or presence of disease. The results are obtained on the dataset using classification measures such as accuracy, recall, precision, and F-measure. The benchmark dataset is classified using the rule-based machine learning classifier called the modified RIPPER algorithm. Heart disease prediction is the main focus of this analysis, but the prediction model can also be applied to other serious diseases by using a different dataset.

DATASET DESCRIPTION
For heart disease prediction and classification, the publicly available Cleveland dataset from the UCI repository is used in this research. The Cleveland dataset, which is used for training, has 76 attributes and 303 records. However, only 13 attributes of the dataset are used in this analysis and experiment, as shown in Table 2 [16].
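As a rough illustration of how such records might be prepared, the sketch below parses one comma-separated row into 13 attributes plus a binary label. It assumes the layout of the commonly distributed processed Cleveland file (13 attributes followed by a diagnosis field, with "?" marking a missing value) and assumes that any diagnosis value above 0 means presence of disease; the column names and the two sample rows are illustrative and are not taken from Table 2.

```python
# Assumed column names of the processed Cleveland file (13 attributes + diagnosis).
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

def parse_row(line):
    """Parse one comma-separated row into (attributes, label)."""
    fields = [f.strip() for f in line.strip().split(",")]
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    # "?" is treated as a missing value and stored as None.
    values = [None if f == "?" else float(f) for f in fields]
    record = dict(zip(COLUMNS, values))
    diagnosis = record.pop("num")
    label = 1 if diagnosis > 0 else 0  # assumed: 0 = absence, >0 = presence
    return record, label

# Illustrative rows in the assumed format.
healthy_record, healthy_label = parse_row("63,1,1,145,233,1,2,150,0,2.3,3,0,6,0")
sick_record, sick_label = parse_row("57,0,4,140,241,0,0,123,1,0.2,2,?,7,1")
```

Rows with missing values are kept here rather than dropped; how they were handled in the experiments is not stated in the text.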
Recall, or sensitivity, measures the ability to detect a patient at risk of heart disease and is given by equation 5.

$$\text{Recall} = \frac{T_{pos}}{T_{pos} + F_{neg}} \quad (5)$$

The F-measure, which is defined as the weighted harmonic mean of precision and recall, assesses test accuracy. Accuracy alone does not take into account how the data are distributed across the classes, so the F-measure is used to handle the distribution problem properly (equation 7).

$$\text{F-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (7)$$

Here Tpos (true positive) is a correct prediction on the healthy class; Tneg (true negative) is a correct prediction on the abnormal class; Fpos (false positive) is an incorrect prediction on the healthy class; and Fneg (false negative) is an incorrect prediction on the abnormal class.
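The evaluation measures can be computed directly from the four counts Tpos, Tneg, Fpos, and Fneg. The sketch below uses the standard definitions of accuracy, precision, recall, specificity, and the balanced F-score; the function name and the example counts are hypothetical and are not results from this research.

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a denominator is zero."""
    return numerator / denominator if denominator else 0.0

def classification_metrics(tpos, tneg, fpos, fneg):
    """Compute the five evaluation measures from confusion-matrix counts."""
    precision = safe_div(tpos, tpos + fpos)
    recall = safe_div(tpos, tpos + fneg)          # also called sensitivity
    return {
        "accuracy": safe_div(tpos + tneg, tpos + tneg + fpos + fneg),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tneg, tneg + fpos),
        "f_score": safe_div(2 * precision * recall, precision + recall),
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tpos=40, tneg=45, fpos=5, fneg=10)
```

With these counts the accuracy is 0.85 while the recall is 0.80 and the specificity is 0.90, which shows how a single accuracy figure can hide different error rates on the two classes.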
The selected data from the dataset are analyzed and classified by the proposed MRIPPER algorithm. Five healthy samples and five abnormal (heart disease) samples are used for the experiment in this research. Subjects 1-5 belong to the healthy class and subjects 6-10 to the abnormal class. The performance measures for these ten subjects are computed and tabulated in Table 4. Figures 3 and 4 show the performance analysis for the normal and abnormal subjects listed in Table 4. Accuracy, precision, recall, specificity, and F-score are the measures used in this research for performance evaluation.
Table 5 compares the performance of the proposed algorithm with that of existing algorithms. The proposed MRIPPER algorithm is compared with the J48, random forest, CART, OneR, and JRip algorithms. The proposed algorithm achieves 98.89% accuracy, which is 1.2% to 4.8% higher than the other algorithms. The precision of the MRIPPER algorithm is 96.76%, which is 1.7% to 3.5% higher; recall (sensitivity) is 99.05%, which is 1.2% to 4.8% higher; specificity is 94.35%, which is 0.8% to 4% higher; and the F-score is 97.60%, which is 1.4% to 6.4% higher than the other algorithms. Overall, the proposed model obtains 97.38% accuracy in predicting normal-class subjects and 97.93% in predicting abnormal-class subjects (Figures 5 and 6).
Context awareness in the medical field is combined with other domains such as IoT and cloud computing. Through integration with these technologies, an application built on a context-aware system has many advantages over other methods. Applications such as health monitoring, disease analysis, and medication assistance can be carried out remotely by combining these technologies. In this research, a context-aware healthcare model based on an IoT application was proposed. Smart medical devices were used to measure data from the patients and store it in a database. From the database, the patient's information and medical records were treated as context-aware data. For analyzing and classifying the data, a rule-based machine learning algorithm, the modified RIPPER algorithm, was used. The rules produced by this algorithm were used to analyze the data for the prediction of heart disease.