The fitness of a constructed classifier model is determined by the sensitivity and specificity of its predictions [12]. These two factors can be calculated from the true positive (TP), false positive (FP), true negative (TN) and false negative (FN) instances of the classified dataset. Sensitivity is the proportion of liver-disordered patients with diabetes mellitus who receive a positive result, and it is defined as TP/(TP+FN). Specificity is the proportion of liver-disordered patients without diabetes mellitus who fall in the negative result group, and it is calculated as TN/(TN+FP). These two indicators are taken as the fitness factors of the classifier.
Classifier fitness = sensitivity × specificity
Table IV
Confusion Matrix of the CHAID classifier
| Actual patients | Predicted AF | Predicted NAFm | Predicted NAFf |
|---|---|---|---|
| AF | 391 | 0 | 0 |
| NAFm | 1 | 179 | 0 |
| NAFf | 0 | 0 | 179 |
The values from the confusion matrix show that:
Sensitivity of AF= 391/ (391+0) =1
Specificity of AF= 359/ (359+0) =1
Fitness of AF = 1*1 =1
Sensitivity of NAFm= 179/ (179+0) =1
Specificity of NAFm= 571/ (571+1) =0.998
Fitness of NAFm = 1*0.998 =0.998
Sensitivity of NAFf= 179/ (179+0) =1
Specificity of NAFf= 571/ (571+0) =1
Fitness of NAFf = 1*1 =1
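The per-class calculation can also be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes that the rows of Table IV are the actual classes and the columns the predicted classes, and it treats each class one-vs-rest. The names `LABELS`, `MATRIX` and `one_vs_rest_metrics` are introduced here for illustration only.

```python
# Confusion matrix from Table IV.
# Assumption: rows are actual classes, columns are predicted classes.
LABELS = ["AF", "NAFm", "NAFf"]
MATRIX = [
    [391, 0, 0],
    [1, 179, 0],
    [0, 0, 179],
]


def one_vs_rest_metrics(matrix, index):
    """Return (sensitivity, specificity, fitness) for the class at `index`."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                 # actual class, predicted as another
    fp = sum(row[index] for row in matrix) - tp  # another class, predicted as this one
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, sensitivity * specificity


for i, label in enumerate(LABELS):
    sens, spec, fit = one_vs_rest_metrics(MATRIX, i)
    print(f"{label}: sensitivity={sens:.3f} specificity={spec:.3f} fitness={fit:.3f}")
```

Under this row and column assumption, the single misclassified NAFm instance counts as a false negative for NAFm and a false positive for AF, so the computed values can differ in the third decimal place from hand calculations that count it differently.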
A. ROC Curve
Receiver operating characteristic (ROC) curves are a graphically effective way to differentiate positive and negative result groups with a clear cut-off value, or decision threshold, for a class [13]. An ROC curve is always drawn within a unit square and passes through two points, (0,0) and (1,1).
The curve begins at (0,0), where the classifier has no sensitivity, and the classifier shows its maximum sensitivity when the curve reaches (1,1) [14]. The ROC curve is formed with 1 - specificity on the X axis and sensitivity on the Y axis. The curves for AF and NAFf therefore lie on the Y axis up to 0.9988 and 1, respectively. The ROC curve for NAFm starts at 0.002 on the X axis and rises in the Y direction until it reaches 0.9968. According to the ROC curves, the model can be considered a good fit for classifying the given data.
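How such a curve is traced can be sketched as follows. This is a generic illustration under stated assumptions, not the procedure used in the study: it assumes the classifier gives each instance a confidence score for the positive class, plots the false positive rate (1 - specificity) on the X axis as is conventional, and uses made-up labels and scores. The function names `roc_points` and `area_under_curve` are hypothetical.

```python
def roc_points(labels, scores):
    """Return ROC points (false positive rate, sensitivity), one per threshold.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier confidence that each instance is positive.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points ordered by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels and scores, for illustration only (not from the study).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve[0], curve[-1])      # the curve starts at (0, 0) and ends at (1, 1)
print(area_under_curve(curve))
```

A curve that hugs the Y axis and the top edge of the unit square gives an area close to 1, which is the pattern the text describes for the three classes.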
Table V
Mean and standard deviation of P1 and P2 in liver disordered patients data
| Factors | AF S.D. | AF Mean | NAFm S.D. | NAFm Mean | NAFf S.D. | NAFf Mean |
|---|---|---|---|---|---|---|
| P1 | 0.443 | 0.75 | 0.343 | 0.87 | 0.451 | 0.72 |
| P2 | 0.402 | 0.80 | 0.412 | 0.78 | 0.335 | 0.87 |
Table VI
Correlation factors of P1 and P2 in AF patients disorder data
| Attribute | HPC1 | GPC5 | NPC0 | ZC0 | HNC1 | GNC5 | NNC0 |
|---|---|---|---|---|---|---|---|
| P1 | B2, S2, A2, B1 | | It, Sm | | Ob, S1, P2 | B1, To | Ag, B2 |
| P2 | OS, S1, S2, A2, To, G1, P1, B1 | B2 | It | | Ag, B2 | Sm, B1 | |
Table VII
Correlation factors of P1 and P2 in NAFm patients disorder data
| Attribute | HPC1 | GPC5 | NPC0 | ZC0 | HNC1 | GNC5 | NNC0 |
|---|---|---|---|---|---|---|---|
| P1 | Ag, Sm, He, Ac, S1, S2, To, P2, B2 | B1 | It, B2, Ga | | A2, A3 | | A1 |
| P2 | Ag, Sm, He, Ac, S1, To, P1, B1 | B2 | It, A1, B2 | | A3 | Ga | S2, A2 |
Table VIII
Correlation factors of P1 and P2 in NAFf patients disorder data
| Attribute | HPC1 | GPC5 | NPC0 | ZC0 | HNC1 | GNC5 | NNC0 |
|---|---|---|---|---|---|---|---|
| P1 | | | Ag, It, Sm, B2, A2, B1, B2 | | | A3 | A1, He, B1, S1, S2, Ga, To, P2 |
| P2 | | | A1, Sm, He, B2, A2, To, B2 | | | | Ag, It, B1, S1, S2, Ga, A3, P1, B1 |
HPC1 - High Positive Correlation with 1% of significance.
GPC5 - Good Positive Correlation with 5% of significance.
NPC0 - Normal Positive Correlation without any significance.
ZC0 - Zero Correlation (without any correlation).
HNC1 - High Negative Correlation with 1% of significance.
GNC5 - Good Negative Correlation with 5% of significance.
NNC0 - Normal Negative Correlation without any significance.
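The legend can be read as a simple decision rule on the sign of a correlation coefficient and its significance level. The sketch below is one possible reading, not the authors' procedure: it assumes that "1% of significance" means a p-value below 0.01 and "5% of significance" a p-value below 0.05, and that the coefficient and p-value have already been computed elsewhere. The function name `correlation_category` and the example values are hypothetical.

```python
def correlation_category(r, p_value):
    """Map a correlation coefficient and its p-value to a legend code.

    Assumed thresholds: p < 0.01 for the 1% level, p < 0.05 for the 5% level.
    """
    if r == 0:
        return "ZC0"                   # zero correlation
    sign = "P" if r > 0 else "N"       # positive or negative correlation
    if p_value < 0.01:
        return f"H{sign}C1"            # high correlation, 1% significance
    if p_value < 0.05:
        return f"G{sign}C5"            # good correlation, 5% significance
    return f"N{sign}C0"                # normal correlation, not significant


# Hypothetical coefficients and p-values, for illustration only.
print(correlation_category(0.62, 0.003))   # HPC1
print(correlation_category(-0.20, 0.03))   # GNC5
print(correlation_category(0.05, 0.40))    # NPC0
```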
If symptoms X and Y are correlated, it does not necessarily follow that X causes Y or that Y causes X [15]. For example, age is correlated with plasma glucose F and R, but we cannot blindly conclude that age is the cause of diabetes or that diabetes is the cause of aging. The two might instead be related through another factor, such as a liver problem, or be correlated with multiple symptoms that have a solid relation with the disease [16, 17]. The above tables show that most of the collected attributes have a positive or negative correlation with P1 and P2.