Background According to the 2017 WHO fact sheet, Coronary Artery Disease (CAD) is the leading cause of death worldwide and accounts for 31% of total fatalities. The 17.6 million deaths caused by CAD in 2016 underscore the urgent need for earlier, pre-emptive diagnosis. This study applied K Nearest Neighbor (k-NN) and ensemble Random Forest Machine Learning algorithms to classify individuals as “At Risk” for CAD. To improve generalizability, k-fold cross validation, hyperparameter tuning and statistical significance testing (p<.05) were employed. The classification is also distinctive in incorporating 35 cytokines as biomarkers within the predictive feature space of the Machine Learning algorithms.
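The combination of k-fold cross validation and hyperparameter tuning described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the sample size, the grid of candidate k values, the five folds, and the feature scaling step are all assumptions, and scikit-learn is used only because it is a common choice.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a 35-cytokine panel with two classes.
# Sample size and number of informative features are assumptions.
X, y = make_classification(
    n_samples=120,
    n_features=35,
    n_informative=9,
    n_redundant=0,
    random_state=0,
)

# Scale features before k-NN, since distances depend on feature units.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hypothetical grid of neighbor counts to tune over.
candidate_k = [1, 3, 5, 7, 9]
param_grid = {"knn__n_neighbors": candidate_k}

# Stratified 5-fold cross validation keeps the class ratio in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

best_k = search.best_params_["knn__n_neighbors"]
best_cv_accuracy = search.best_score_
print(f"best k = {best_k}, mean cross-validated accuracy = {best_cv_accuracy:.3f}")
```

Placing the scaler inside the pipeline means it is refit on each training fold, so the held-out fold never influences the preprocessing.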
Results A total of seven classifiers were developed: four built using all 35 cytokines as predictive features, and three built using the 9 cytokines that differed significantly (p<.05) between the CAD and Control groups, as determined by independent two-sample t tests. The best prediction accuracy, 100%, was achieved by the Random Forest ensemble using the nine significant cytokines. Restricting the features to the significant cytokines was intended to reduce noise in the data and so improve classification.
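A t-test filter followed by a Random Forest, as described above, might look like the sketch below. The data are synthetic, with nine features deliberately shifted in the hypothetical CAD group; group sizes, the shift, and the forest settings are assumptions. The sketch runs the t tests on a training split only, which is one way to keep the held-out data out of feature selection; the abstract does not state how the authors ordered these steps.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic panel: 35 features, 40 samples per group (assumed sizes).
n_per_group, n_features = 40, 35
X = rng.normal(size=(2 * n_per_group, n_features))
y = np.repeat([0, 1], n_per_group)  # 0 = Control, 1 = CAD (hypothetical labels)

# Make the first nine features lower in the CAD group so they carry signal.
X[y == 1, :9] -= 2.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Independent two-sample t test per feature, CAD versus Control,
# computed on the training split only.
t_stat, p_values = ttest_ind(X_train[y_train == 1], X_train[y_train == 0], axis=0)
selected = np.flatnonzero(p_values < 0.05)

# Random Forest restricted to the selected features.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train[:, selected], y_train)
test_accuracy = forest.score(X_test[:, selected], y_test)

print("selected feature indices:", selected.tolist())
print(f"held-out accuracy: {test_accuracy:.3f}")
```

With 35 features tested at p<.05, a few noise features can pass the filter by chance, so the selected set may be slightly larger than the nine that were shifted.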
Additionally, from the biomedical perspective, it was informative to observe the interplay of the cytokines empirically. The cytokines “IL1-β” and “IL-10” were moderately correlated (correlation coefficient r=.5), and both were significantly down regulated in the CAD group compared to Controls. These two cytokines were primarily responsible for the 100% classification accuracy achieved by the Random Forest. Alongside the Machine Learning (ML) algorithms, traditional statistical techniques such as correlation and t tests were used to obtain insights that point to a role for cytokines in the investigation of CAD risk.
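For readers unfamiliar with the two statistics mentioned above, the sketch below computes a Pearson correlation coefficient between two markers and checks whether a marker's mean is lower in one group than another. The function names and all numeric values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def is_down_regulated(case_levels, control_levels):
    """True when the mean level in cases is below the mean level in controls."""
    case_mean = sum(case_levels) / len(case_levels)
    control_mean = sum(control_levels) / len(control_levels)
    return case_mean < control_mean


# Hypothetical levels of two markers measured in the same six samples.
marker_a = [4.1, 3.8, 5.0, 4.6, 3.2, 4.9]
marker_b = [2.0, 2.4, 2.9, 2.1, 1.8, 2.7]
print(f"r = {pearson_r(marker_a, marker_b):.2f}")

# Hypothetical levels of one marker in a case group and a control group.
case_group = [1.9, 2.1, 1.7, 2.0]
control_group = [2.8, 3.1, 2.6, 3.0]
print("down regulated in cases:", is_down_regulated(case_group, control_group))
```

A difference in means alone does not establish significance; in the workflow described in the text, that judgment comes from the t test.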
Conclusions As large-scale efforts to detect individuals at risk for CAD early using novel and powerful ML algorithms gain momentum, detection can be further improved by incorporating additional biomarkers. Investigating the emerging role of cytokines in CAD can materially enhance the detection of risk and the discovery of disease mechanisms that may lead to new therapeutic approaches.

Figures 1 to 15
The full text of this article is available to read as a PDF.
On 07 Feb, 2021
Received 06 Feb, 2021
Invitations sent on 24 Jan, 2021
On 24 Jan, 2021
On 23 Jan, 2021
On 23 Jan, 2021
On 23 Jan, 2021
On 28 Nov, 2020
Received 27 Nov, 2020
On 09 Nov, 2020
Invitations sent on 03 Nov, 2020
On 31 Oct, 2020
On 31 Oct, 2020
On 31 Oct, 2020
Posted 15 Jun, 2020
On 27 Jul, 2020
Received 12 Jul, 2020
On 08 Jul, 2020
Received 04 Jul, 2020
On 21 Jun, 2020
Invitations sent on 18 Jun, 2020
On 10 Jun, 2020
On 09 Jun, 2020
On 09 Jun, 2020
On 09 Jun, 2020