In this work, a context-aware health care method based on the IoT is proposed. Smart medical devices collect patient data, which are stored in a database. The database contains context-aware data, such as the names, addresses, and medical histories of the patients. A rule-based machine learning technique, a modified RIPPER algorithm, is used to analyze and classify the data. The rules for analyzing data for heart disease prediction are developed using this algorithm, and heart disease is predicted from the classification results (Fig. 2).
Smart wearable medical devices based on a body sensor network measure the patient's physiological data (e.g., heart rate and temperature). The data from these devices are stored on a cloud platform, where they can be managed or used by the user or by medical centres to analyze the patient's health condition. Using a dedicated application, the patient can be monitored remotely over the internet from smartphones or PCs. The block diagram of the model is shown in Fig. 2. The proposed algorithm is discussed as follows.
IoMT devices and wearable devices are both treated as IoT devices. They are equipped to collect data from patients in remote areas; the patient information is gathered by IoT devices attached to or worn on the human body.
3.1. RIPPER ALGORITHM
Rule-based machine learning can be viewed as a form of concept description, and the RIPPER algorithm is among the most widely used techniques of this kind. Compared with other algorithms, its main benefit is that it is easy to understand: the generated rules are in If-Then form, so the model is entirely interpretable. RIPPER is a rule-based classification algorithm that produces a classifier model consisting of a collection of IF-THEN rules derived directly from the training data set, hence the name "direct method." It can be used for binary and multi-class classification. The core framework of RIPPER is split into two phases: rule generation and rule optimization. The generation phase is a two-level loop in which the outer loop produces a rule and, after pruning, adds it to the rule base, while the inner loop adds one antecedent to the rule at a time. The optimization phase creates alternative rules based on the rules in the rule base, and the minimum description length (MDL) criterion is used to pick the best rule and add it to the rule base [13]. The algorithm goes through four stages:
Growth
During this stage, a rule is created by greedily adding features to the rule until it meets the stopping criterion.
Pruning
During this stage, every rule is pruned and shortened by eliminating redundancy and reducing the length of the previously grown rules, which allows the rule to generalize better.
Optimization
The initial growth and pruning stages build rules starting from an empty ruleset. The optimization stage takes the rules created during those stages and attempts to derive new rules from the ruleset. The rules can be further optimized by:
Adding features to the original rule using the greedy approach (i.e., depth-first search).
Creating a new rule by repeating the growth and pruning process.
Selection
At the selection stage, the best rules are kept and the remaining rules are removed from the system.
The specification of this algorithm is as follows, with the dataset D as input (Eqs. 1 and 2).
Step 1
Split the dataset D into a growth set Gro and a pruning set Pru.
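Step 1 can be illustrated with a minimal Python sketch. The 2/3 to 1/3 ratio, the function name `split_grow_prune`, and the toy records are assumptions made for illustration; the text above does not state the ratio used.

```python
import random


def split_grow_prune(dataset, grow_fraction=2 / 3, seed=0):
    """Return (Gro, Pru): a shuffled copy of `dataset` cut at `grow_fraction`.

    The 2/3 default is an assumption, not a value taken from the text.
    """
    rows = list(dataset)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    cut = round(len(rows) * grow_fraction)
    return rows[:cut], rows[cut:]


# Toy dataset D: (patient_id, label) pairs, label 1 = heart disease.
D = [(i, i % 2) for i in range(9)]
gro, pru = split_grow_prune(D)
print(len(gro), len(pru))  # 6 3
```

Every record ends up in exactly one of the two sets, so rules grown on Gro are always evaluated on data they have not seen.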
Step 2
At this point, the growth set Gro is used as the dataset. Rule growth starts from an empty rule; at each step, the most suitable combination of candidate feature and threshold is chosen as an antecedent and added to the rule. Information gain is used as the evaluation criterion:
$$\mathit{IGN}=\mathit{cover}\left(\log_{2}\mathit{rt}'-\log_{2}\mathit{rt}\right) \tag{1}$$
Here, cover is the number of positive instances covered by the rule after the antecedent is added, rt' is the proportion of positive instances among the data covered by the rule after the antecedent is added, and rt is the same proportion before the antecedent is added. Antecedents continue to be added until Gro is empty.
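As a worked illustration of Eq. 1, the following Python sketch computes the gain from the positive and negative counts covered before and after an antecedent is added. The function name, the argument names, and the choice to return 0 when no positives remain covered are assumptions made for this example.

```python
import math


def information_gain(p_before, n_before, p_after, n_after):
    """IGN = cover * (log2(rt') - log2(rt)), following Eq. 1.

    p_*/n_* are the positive/negative instances covered by the rule
    before and after the candidate antecedent is added.
    """
    if p_after == 0:
        # Assumption: a candidate that covers no positives is given zero gain.
        return 0.0
    rt = p_before / (p_before + n_before)      # precision before
    rt_new = p_after / (p_after + n_after)     # precision after (rt')
    cover = p_after                            # positives still covered
    return cover * (math.log2(rt_new) - math.log2(rt))


# The rule covers 10 positives and 10 negatives; the new antecedent
# narrows this to 8 positives and 2 negatives.
gain = information_gain(10, 10, 8, 2)
print(round(gain, 4))  # 5.4246
```

The gain is large when the antecedent raises the proportion of positives while still covering many of them, which is why it is a natural criterion for greedy growth.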
Step 3
The pruning process uses the pruning set Pru to measure the rule's generalization capacity. Starting with the most recently added antecedent, antecedents are removed from the rule one at a time. The pruning decision is based on a metric (Eq. 2) computed from p and n.
Table 1. Description of data used in the ruleset

| Type | Range | Description |
|---|---|---|
| Chest Pain | 1 to 4 | Typical angina; Atypical angina; Non angina; Asymptomatic |
| Cholesterol | < 197; 188–250; 217–307; > 281 | Lower; Medium; Higher; Very higher |
| BP | < 134; 124–153; 142–172; > 154 | Lower; Medium; Higher; Very higher |
| Blood Sugar | < 120; >= 120 | No; Yes |
| ECG | < 0.4; 0.4–1.8; > 1.8 | Normal; Abnormal; Hypertrophy |
| Thallium | 3; 6; 7 | Normal; Fixed Defect; Reversible Defect |
| Age | < 35; 35–45; 40–58; > 58 | Younger; Middle; Older; Very older |
| Gender | 1; 0 | Male; Female |
| Smoking (in years) | <= 10; > 10 | Lower; Higher |
| Drinking | 0; 1 | No; Yes |
| Family history (diabetes, hypertension, ...) | < 1; >= 1 | No; Yes |
| Medical records (diabetes, hypertension, ...) | < 1; >= 1 | No; Yes |

In the pruning metric, p is the number of positive instances covered by the rule in Pru and n is the number of negative instances covered by the rule in Pru. The aim of the calculation is to increase precision on the pruning set.
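Step 3 can be sketched as follows. The sketch assumes the pruning metric (p - n)/(p + n) from Cohen's original RIPPER formulation, which may differ from the metric intended in Eq. 2. The rule representation (a list of attribute/value tests), the helper names, and the toy data are likewise hypothetical.

```python
def covers(rule, x):
    """A rule is a list of (attribute, value) equality tests on record x."""
    return all(x.get(attr) == val for attr, val in rule)


def prune_metric(rule, pru):
    """Assumed metric (p - n) / (p + n), computed on the pruning set."""
    p = sum(1 for x, y in pru if y == 1 and covers(rule, x))
    n = sum(1 for x, y in pru if y == 0 and covers(rule, x))
    # Assumption: a rule that covers nothing gets the worst possible score.
    return (p - n) / (p + n) if p + n else -1.0


def prune_rule(rule, pru):
    """Drop trailing antecedents, last added first, keeping the best prefix."""
    best, best_score = rule, prune_metric(rule, pru)
    for k in range(len(rule) - 1, 0, -1):
        candidate = rule[:k]
        score = prune_metric(candidate, pru)
        if score >= best_score:  # ties go to the shorter rule
            best, best_score = candidate, score
    return best


rule = [("thal", "high"), ("cp", "asymptomatic"), ("sex", 1)]
pru = [
    ({"thal": "high", "cp": "asymptomatic", "sex": 1}, 1),
    ({"thal": "high", "cp": "asymptomatic", "sex": 1}, 0),
    ({"thal": "high", "cp": "asymptomatic", "sex": 0}, 1),
    ({"thal": "high", "cp": "asymptomatic", "sex": 0}, 1),
    ({"thal": "high", "cp": "typical", "sex": 1}, 0),
    ({"thal": "high", "cp": "typical", "sex": 0}, 0),
    ({"thal": "low", "cp": "asymptomatic", "sex": 1}, 0),
]
pruned = prune_rule(rule, pru)
print(pruned)  # [('thal', 'high'), ('cp', 'asymptomatic')]
```

In this toy case the last antecedent lowers the score on Pru, so it is removed, while removing a second antecedent would admit too many negatives.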
Step 4
After pruning, an attempt is made to add the rule to the rule base. The addition fails if the number of instances covered by the rule is too small or its precision is too low. If the rule is added successfully, the instances it covers are removed from D [14].
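The acceptance test of Step 4 can be sketched in Python as below. The thresholds `min_cover` and `min_precision`, their default values, and the function names are hypothetical; the text only states that a rule is rejected when its coverage is too small or its precision too low.

```python
def covers(rule, x):
    """A rule is a list of (attribute, value) equality tests on record x."""
    return all(x.get(attr) == val for attr, val in rule)


def try_add_rule(rule, rule_base, D, min_cover=2, min_precision=0.5):
    """Add `rule` to `rule_base` if it passes both checks.

    Returns (accepted, remaining_D). Threshold values are illustrative only.
    """
    covered = [(x, y) for x, y in D if covers(rule, x)]
    positives = sum(1 for _, y in covered if y == 1)
    too_small = not covered or len(covered) < min_cover
    if too_small or positives / len(covered) < min_precision:
        return False, D  # rejected: D is left unchanged
    rule_base.append(rule)
    remaining = [(x, y) for x, y in D if not covers(rule, x)]
    return True, remaining


D = [
    ({"thal": "high"}, 1),
    ({"thal": "high"}, 1),
    ({"thal": "high"}, 0),
    ({"thal": "low"}, 0),
    ({"thal": "low"}, 1),
]
rule_base = []
accepted, D_after = try_add_rule([("thal", "high")], rule_base, D)
print(accepted, len(D_after))  # True 2
```

Removing the covered instances is what forces the next rule to explain the examples that remain.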
3.2. MRIPPER ALGORITHM
The RIPPER algorithm for rule induction was introduced as a successor to the Incremental Reduced Error Pruning (IREP) algorithm. Although the fundamental ideas remain the same, Modified-RIPPER (MRIPPER) improves on IREP in several details and is also capable of handling multiclass problems. A single MRIPPER rule consists of an antecedent part and a consequent part. The antecedent is a conjunction of predicates (selectors), and the consequent is a class assignment. MRIPPER learns these rules greedily, using a separate-and-conquer approach. Before learning, the training data are sorted by class label in increasing order of class frequency. Rules are then learned for the first m-1 of the m classes, beginning with the smallest. Once a rule has been created, the instances covered by that rule are removed from the training data, and this process is repeated until no instances of the target class remain. The algorithm then moves on to the next class. Finally, when MRIPPER finds no more rules to learn, a default rule (with an empty antecedent) is added for the last class. Rules for a single class are learned until either all positive instances are covered or the last rule added is "too complicated." The latter criterion is expressed in terms of total description length: the stopping criterion is met if the description length of the rule R is considerably longer than the shortest description length found so far, as represented in the algorithm [15].
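The class ordering described above can be sketched in a few lines of Python. The function name `class_order` and the class labels in the example are hypothetical, and ties in frequency are broken alphabetically only to make the result deterministic.

```python
from collections import Counter


def class_order(labels):
    """Sort classes by increasing frequency.

    Returns (target_classes, default_class): rules are learned for the
    first m-1 classes, and the most frequent class gets the default rule.
    """
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda c: (counts[c], str(c)))
    return ordered[:-1], ordered[-1]


labels = ["none"] * 5 + ["hd_1"] * 3 + ["hd_2"] * 1
targets, default = class_order(labels)
print(targets, default)  # ['hd_2', 'hd_1'] none
```

Learning the rare classes first means the default rule, which has an empty antecedent, only needs to catch what is left for the most frequent class.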
3.3. PROPOSED ALGORITHM
procedure BUILDRULESET (P, N)
    // P = positive samples, N = negative samples
    RuleSet = {}
    DL = DescriptionLength (RuleSet, P, N)
    while P ≠ {}
        // Grow and prune a new rule
        split (P, N) into (GroP, GroN) and (PruP, PruN)
        Rule = GroRule (GroP, GroN)
        Rule = PruRule (Rule, PruP, PruN)
        add Rule to RuleSet
        if DescriptionLength (RuleSet, P, N) > DL + 11 then
            // Prune the whole ruleset and exit
            for every rule R in RuleSet (considered in reverse order)
                if DescriptionLength (RuleSet \ {R}, P, N) < DL then
                    delete R from RuleSet
                    DL = DescriptionLength (RuleSet, P, N)
                end if
            end for
            return (RuleSet)
        end if
        DL = DescriptionLength (RuleSet, P, N)
        delete from P and N all instances covered by Rule
    end while
    return (RuleSet)
end BUILDRULESET

procedure OPTIMIZERULESET (RuleSet, P, N)
    for every rule R in RuleSet
        delete R from RuleSet
        // UP, UN = positive and negative instances left uncovered
        UP = instances in P not covered by RuleSet
        UN = instances in N not covered by RuleSet
        split (UP, UN) into (GroP, GroN) and (PruP, PruN)
        // Replacement rule: grown from scratch
        RepRule = GroRule (GroP, GroN)
        RepRule = PruRule (RepRule, PruP, PruN)
        // Revision rule: grown starting from R
        RevRule = GroRule (GroP, GroN, R)
        RevRule = PruRule (RevRule, PruP, PruN)
        select best of RevRule and RepRule and add to RuleSet
    end for
    return (RuleSet)
end OPTIMIZERULESET

procedure RIPPER (P, N, k)
    RuleSet = BUILDRULESET (P, N)
    repeat k times
        RuleSet = OPTIMIZERULESET (RuleSet, P, N)
    return (RuleSet)
end RIPPER
3.4. RULESET
Rule 1: if (Thallium is less) and (Chest pain is typical angina) then (No heart disease)
Rule 2: if (Thallium is less) and (Chest pain is atypical angina) then (No heart disease)
Rule 3: if (Thallium is less) and (Chest pain is non angina) then (No heart disease)
Rule 4: if (Thallium is less) and (Chest pain is asymptomatic) and (Vessel is less) then (No heart disease)
Rule 5: if (Thallium is less) and (Chest pain is asymptomatic) and (Vessel is higher) then (Presence of heart disease_2)
Rule 6: if (Thallium is higher) and (Vessel is less) and (No angina) then (No heart disease)
Rule 7: if (Thallium is higher) and (Vessel is less) and (Has angina) then (Presence of heart disease)
Rule 8: if (Thallium is higher) and (Vessel is less) then (Presence of heart disease_2)
Rule 9: if (Family history is true) then (Presence of heart disease_1)
Rule 10: if (Medical record is true) then (Presence of heart disease_1)
Rule 11: if (Family history is true) and (Medical record is true) then (Presence of heart disease_1)
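One way to apply such a ruleset is sketched below in Python. The field names (`thallium`, `chest_pain`, `vessel`, `angina`, `family_history`, `medical_record`), the modelling of angina as a boolean, the first-match evaluation order, and the `None` result when no rule fires are all assumptions made for illustration; the text does not specify how overlapping rules are resolved.

```python
# Each entry: (conditions that must all hold, predicted outcome).
RULES = [
    ({"thallium": "less", "chest_pain": "typical angina"}, "No heart disease"),
    ({"thallium": "less", "chest_pain": "atypical angina"}, "No heart disease"),
    ({"thallium": "less", "chest_pain": "non angina"}, "No heart disease"),
    ({"thallium": "less", "chest_pain": "asymptomatic", "vessel": "less"},
     "No heart disease"),
    ({"thallium": "less", "chest_pain": "asymptomatic", "vessel": "higher"},
     "Presence of heart disease_2"),
    ({"thallium": "higher", "vessel": "less", "angina": False},
     "No heart disease"),
    ({"thallium": "higher", "vessel": "less", "angina": True},
     "Presence of heart disease"),
    ({"thallium": "higher", "vessel": "less"}, "Presence of heart disease_2"),
    ({"family_history": True}, "Presence of heart disease_1"),
    ({"medical_record": True}, "Presence of heart disease_1"),
    ({"family_history": True, "medical_record": True},
     "Presence of heart disease_1"),
]


def classify(patient, rules=RULES, default=None):
    """Return (rule_number, outcome) for the first rule whose conditions hold.

    First-match semantics is an assumption of this sketch.
    """
    for number, (conditions, outcome) in enumerate(rules, start=1):
        if all(patient.get(key) == value for key, value in conditions.items()):
            return number, outcome
    return None, default


patient = {"thallium": "less", "chest_pain": "asymptomatic", "vessel": "higher"}
print(classify(patient))  # (5, 'Presence of heart disease_2')
```

Under first-match evaluation, Rule 8 only fires when the angina status is unknown, and Rule 11 is never reached because Rule 9 already matches any patient with a family history.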
The ruleset for the algorithm discussed above is framed from the patient's medical condition for the purpose of heart disease prediction. It is built from the conditions related to the causes of heart disease shown in Table 1. Each attribute in the table has ranges that correspond to its descriptions. According to these ranges, the disease condition is predicted with the ruleset framed for the context-aware system. The context database collects context-aware data from the input, namely the medical records, personal information, and physiological data of the patient. The ruleset is framed and analyzed on the basis of the context analyzer and context reasoning. The data from the context database are processed by the proposed algorithm for classification. For experimental analysis, the Cleveland dataset is used for evaluation in this research.