Development of Chronic Kidney Disease Risk Prediction and Management System- Research study

doi:10.21203/rs.3.rs-2692488/v1


Research Article


https://doi.org/10.21203/rs.3.rs-2692488/v1

This work is licensed under a CC BY 4.0 License

Version 1



Background: Chronic kidney disease (CKD) is a major global public health issue, affecting over 10% of the population worldwide. It ranked 16th among the leading causes of death in 2016 and is expected to rise to 5th by 2040. Consequently, tools to identify patients at high risk of CKD and to manage their risk factors are needed, particularly in resource-limited settings where laboratory facilities are scarce. This study aimed to develop a risk prediction and management system using data from JUMC, SPHMMC and MTUTH.

Objective: To develop a chronic kidney disease risk prediction and management system using an expert system.

Method: General chronic kidney disease risk factors were collected from expert knowledge. These general risk factors were assessed in data from 384 patients collected from three hospitals, and the risk factors relevant in Ethiopia were identified by statistical analysis. Risk factor management techniques were then identified from expert knowledge. The knowledge gained from the experts and from the statistical analyses was combined and implemented as a rule-based expert system.

Main outcome measure: Accuracy, precision and recall of the developed system, evaluated using a confusion matrix.

Result: The system showed 63.3%, 65.3% and 77.5% accuracy at cutoff probabilities of 14%, 24% and 34%, respectively, in estimating the probability of CKD.

Conclusion: This study will have significance in preventing chronic kidney disease at an early stage and in creating awareness.

Funding Statement: The authors received no specific funding for this study.

Chronic kidney disease

expert system

Chronic kidney disease (CKD) is a major global public health issue, affecting over 10% of the population worldwide, and is defined as an abnormality of kidney function or structure persisting for more than 3 months. It ranked 16th among the leading causes of death in 2016 and is expected to rise to 5th by 2040 [1]. In developed countries, the prevalence of moderate-to-severe CKD (stages G3–G5) in population-representative surveys is estimated at 5–6% [2]. Owing to the significant increase in non-infectious diseases (particularly diabetes and hypertension), CKD is rapidly increasing in developing countries. Furthermore, infectious diseases such as HIV, schistosomiasis, and leishmaniasis, which also contribute to CKD, are highly prevalent in low- and middle-income countries [3]. Although there are no generalized data on the prevalence of CKD in Ethiopia, a study of CKD in public hospitals in Addis Ababa reported a prevalence of 27.2% for stages 1–2 (15.6% and 11.6%, respectively) and 34.1% for stages 3–4 (19.4% and 14.7%, respectively) [4]. CKD prevalence was 26% among patients with diabetes mellitus and hypertension at Jimma University Medical Center [5] and 21.8% among diabetic adults at the University of Gondar Hospital [6]. The leading causes of CKD in all developed countries are diabetes and hypertension, whereas glomerulonephritis and unknown causes are more common in Asia and sub-Saharan Africa. These differences relate mainly to the shift in disease burden in developed countries away from infections and toward chronic lifestyle-related diseases, together with decreased birth rates and increased life expectancy [3].

However, the onset and progression of CKD are frequently preventable. Primary prevention requires awareness of modifiable CKD risk factors and efforts to focus healthcare resources on the patients at highest risk of new-onset disease. Populations or individuals at risk of CKD must be diagnosed and treated early to prevent the onset and delay the progression of kidney disease. Management of CKD risk factors is also needed to prevent CKD [7]. In addition, aggressive risk factor reduction should be carried out in individuals at increased risk of CKD even when the disease is not clinically apparent [8]. Effective prevention policies depend on an accurate understanding of the incidence and prevalence of CKD in a given setting, as well as the distribution and burden of risk factors [9]. Despite the known adverse consequences of CKD, the vast majority of people remain unaware of the disease and of the fact that it can be diagnosed with simple laboratory procedures. People's practice of getting tested (screened) to learn their CKD status is also very low. Further, awareness of CKD remains unacceptably low among care providers [5]. A study of awareness of CKD among adult diabetic outpatients in Northeast Ethiopia showed a high prevalence but low awareness of CKD among diabetic outpatients attending the health center.

Prediction models are important for identifying subgroups at high risk of chronic kidney disease. These models enhance the ability of health care providers to prevent or delay serious sequelae, including kidney failure [10]. Different models have been developed in different countries to identify high-risk individuals. A study in a Peruvian population developed risk scores using risk factors and laboratory-based variables, with sex, age, hypertension, and cholesterol level as independent variables. Two risk scores for prevalent undiagnosed CKD were developed: a complete score and a laboratory-free score. The two scores performed similarly well, with ROC areas of 76.2% and 76.0%, respectively [11].

Another study, by Heejung Bang et al., developed a simple prediction model using adult participants in the National Health and Nutrition Examination Surveys in the United States. Socio-demographic factors and health conditions were included in the statistical analysis [12]. Similarly, another study of an American population developed a prediction system that estimates the probability of CKD. This model includes socio-demographic factors, laboratory tests, health status and lifestyle. Evaluation of the final model yielded 86% sensitivity, 85% specificity, 18% positive predictive value, and 99% negative predictive value [10].

A study using two independent Canadian cohorts of patients with CKD stages 3 to 5 (estimated GFR, 10–59 mL/min/1.73 m²) developed a predictive model for progression of CKD to kidney failure from demographic, clinical, and laboratory data. Statistical analysis was used to identify significant factors. The study showed that a model using routinely obtained laboratory tests can accurately predict progression to kidney failure; in the validation cohort, this model was more accurate than a simpler model that included age, sex, estimated GFR, and albuminuria [13]. Another study established and validated a risk prediction model for end-stage renal disease (ESRD) in patients with type 2 diabetes, based on a retrospective cohort of 24,104 Chinese patients with type 2 diabetes. It developed 3-, 5-, and 8-year ESRD risk scores with good prediction accuracy and discriminatory ability. A further study developed a predictive model for progression of CKD to kidney failure using a large administrative claims database of patients ≥ 18 years of age who were continuously enrolled for 36 months. Multivariate logistic regression was used to develop the model. The strongest predictors of progression to kidney failure were CKD stage (4 and 3), HTN, and DM. The model is intended to identify patients with CKD stage 3 or 4 in a large administrative claims database who are at high risk of progression to kidney failure [14].

Similarly, another study combined renal risk factors into scoring systems that can be used to assess renal risk in individual patients. It used decision-tree simulation and Bayesian modeling to determine the individual risk of ESRD in a hypothetical population of USA patients from blood pressure, a measure of proteinuria, and basic demographic data [15]. Masaki Makino et al. showed that artificial intelligence can predict the progression of diabetic kidney disease (DKD) using big-data machine learning. This predictive model could detect progression of DKD, which may contribute to more effective and accurate intervention to reduce hemodialysis and cardiovascular events [16]. These risk prediction models identified risk factors and used them for risk prediction in the form of scores.

The existing models each focus solely on one specific country and cannot be applied directly to Ethiopia because of differences in risk factors. Additionally, they do not recommend how to manage risk factors. Consequently, this work aimed to develop a CKD risk prediction and management system using an expert system. General CKD risk factors were collected from expert knowledge. These were assessed in data from 384 patients collected from JUMC, SPHMMC and MTUTH, and the risk factors relevant in Ethiopia were identified by statistical analysis. Risk factor management techniques were then identified from expert knowledge. A rule-based expert system was developed from the knowledge gained from the experts and the statistical analyses. The system was validated using a confusion matrix and showed 63.3%, 65.3% and 77.5% accuracy at cutoff probabilities of 14%, 24% and 34%, respectively, in estimating the probability of having CKD.

The overall methodology for the achievement of the research included knowledge acquisition, data collection, data analysis, risk factor management technique identification, expert system development, and performance evaluation. Figure (1) shows general methodology of the study.

3.1 Knowledge Acquisition

The study variables were obtained from expert knowledge. Candidate variables were identified from CKD guidelines and from articles on risk factor identification and diagnostic methods. Many studies have examined CKD risk factors, and nephrologists have published lists of common risk factors. In this study, research articles and guidelines were reviewed and the variables applicable to Ethiopia were identified. The identified risk factors were older age, sex, diabetes mellitus, hypertension, body mass index, kidney injury, family history of the disease, cigarette use, alcohol consumption, previous hospitalization, and related kidney diseases [5, 17–32].

3.2 Data Collection

To identify the risk factors in the general population in Ethiopia, data were collected from a selected population using a questionnaire prepared from the variables identified from expert knowledge, with the format of previous studies as a reference [20, 23]. Data collection involved several steps; Figure (2) shows the steps undertaken.

3.3 Sample Size Determination

The required sample size was calculated using the single proportion formula in equation (1) [33, 34], with a 95% confidence level, standard deviation = 5, and a 5% margin of error. The values were selected to fulfil the criteria for performing logistic regression analysis.
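As a rough check on the sample size, the standard single proportion formula n = z²p(1 − p)/d² can be evaluated directly. The sketch below assumes z = 1.96 for the 95% confidence level, a margin of error d = 0.05, and an expected proportion p = 0.5; the value of p is an assumption, since the inputs to equation (1) are not fully stated in the text. With these inputs the result is about 384, which matches the number of patients reported. The function name is illustrative only.

```python
def single_proportion_sample_size(z=1.96, p=0.5, margin=0.05):
    """Return n = z^2 * p * (1 - p) / margin^2 (unrounded).

    z      : standard normal quantile for the confidence level (1.96 for 95%)
    p      : expected proportion (0.5 is an assumption, the most conservative choice)
    margin : margin of error as a fraction (0.05 for 5%)
    """
    return z ** 2 * p * (1 - p) / margin ** 2


n = single_proportion_sample_size()
print(f"Required sample size: about {round(n)}")
```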

3.4 Ethical Consideration

The research data collection was approved by the Institutional Review Boards of the College of Medicine and Health Sciences, Jimma University, and St. Paul's Hospital Millennium Medical College. The purpose of the study was explained to the study participants.

3.5 Setting and Population

Data collection was conducted from March 4 to June 14, 2021 in the inpatient settings of Jimma University Medical Center, St. Paul's Hospital Millennium Medical College, and Mizan-Tepi University Teaching Hospital. The hospitals were selected by purposive sampling, in which the researcher decides which particular groups to select. Purposive sampling is used when it is difficult to reach every area, household or individual member of the population and dependable information about population locations and numbers is not available [35]. It is also used when there is insufficient time to visit the number of households or individuals needed. Because of limited time and budget, this sampling method was used to select the hospitals.

As humans were involved in the study, the study protocol was performed in accordance with the relevant guidelines. All subjects were invited to participate on a voluntary basis, and written informed consent was obtained from all participants. Patients aged 18 and above who were suspected of having CKD and admitted to the hospital were eligible for the study. The age limit was chosen because participants under 18 cannot provide consent or the required information without the help of their parents, and because some of the questions relate to addiction (smoking and chat consumption), which they may not answer accurately [36, 37]. A total of 384 patients who fulfilled these criteria were consecutively included in the final analysis. Socio-demographic and some risk factor variables were collected by five nurses using a structured questionnaire. Patient histories were reviewed to identify the presence or absence of CKD, HTN, DM, kidney-related disease and other diseases. Creatinine data were obtained from the patient history, the glomerular filtration rate (GFR) was estimated using the Modification of Diet in Renal Disease (MDRD) study equation shown in equation (2) [6], and the stage of CKD was identified using the KDIGO guideline [2].
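The GFR estimation and staging step can be sketched as follows. This is a minimal illustration, not the study's own code: it assumes the four-variable, IDMS-traceable form of the MDRD equation (leading constant 175, serum creatinine in mg/dL), whereas the original MDRD form uses 186, and the text does not say which variant was applied. The function names are hypothetical. The category boundaries are the KDIGO GFR categories.

```python
def egfr_mdrd(serum_creatinine_mg_dl, age_years, female, black=False):
    """Estimate GFR in mL/min/1.73 m^2 with the 4-variable MDRD equation.

    Assumption: IDMS-traceable form with constant 175.
    """
    egfr = 175.0 * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def kdigo_gfr_category(egfr):
    """Map an eGFR value to a KDIGO GFR category.

    Note: G1 and G2 indicate CKD only when other markers of kidney
    damage are also present.
    """
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


# Example with made-up values: a 50-year-old woman, creatinine 1.0 mg/dL.
value = egfr_mdrd(1.0, 50, female=True)
print(round(value, 1), kdigo_gfr_category(value))
```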

3.6 Measures

Height and weight were measured at the time of the interview and used to calculate BMI. Participants were categorized by BMI as underweight (<18.5), normal (18.5–24.9), overweight (25.0–29.9), obese (30.0–39.9), or morbidly obese (≥40.0) according to guidelines [38]. Participants were considered to have diabetes mellitus if they had previously been diagnosed with DM by a doctor, had any documents indicating DM, reported taking insulin or an oral anti-diabetic drug, or had a random plasma glucose ≥11.1 mmol/L with symptoms. Blood pressure readings were obtained by a qualified nurse using an electronic sphygmomanometer. Hypertension was defined as systolic BP ≥ 140 mmHg or diastolic BP ≥ 90 mmHg, or use of medication for hypertension irrespective of the blood pressure.
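The measures above translate directly into simple classification rules. The following is a small sketch of the BMI categories and the hypertension definition as stated in the text; the function and parameter names are illustrative and are not taken from the study's software.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Categorize BMI using the cut-offs given in the text."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    if value < 40.0:
        return "obese"
    return "morbidly obese"


def has_hypertension(systolic_mmhg, diastolic_mmhg, on_medication=False):
    """Systolic >= 140 or diastolic >= 90, or on antihypertensive medication."""
    return systolic_mmhg >= 140 or diastolic_mmhg >= 90 or on_medication


# Example with made-up values.
print(bmi_category(bmi(85, 1.75)), has_hypertension(138, 92))
```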

3.7 Statistical Analysis

Statistics is a set of methods used to analyze data. It is used in all areas of science that involve collecting, handling and sorting data, giving insight into a specific phenomenon and making it possible to infer new results from that knowledge [39]. One goal of statistics is to extract information from data to better understand the situations they represent; statistics can thus be thought of as the science of learning from data. Put another way, statistics, based on the theory of probability, provides techniques and methods for data analysis that support decision-making in problems involving uncertainty [39].

As shown in Figure (3), statistical analysis was performed using SPSS version 26. The first step was entering and editing the data, which were entered as numbers and strings. Missing values were identified by frequency analysis, and patient records with incomplete information were excluded or corrected before the analysis. After data entry and missing value analysis, descriptive analyses, bivariate analyses, and multivariable logistic regression were performed. Descriptive analysis was used to identify participants' socio-demographic status and CKD stage.

Bivariate analysis using the chi-square test was performed to identify differences in patient characteristics and risk factors for CKD. The chi-square test is used to determine whether the association between two qualitative variables is statistically significant [40]. In addition, multivariable logistic regression was performed to estimate the unique relationship between each included variable and CKD status. The multivariable logistic analysis (MVLA) model was selected because it is an efficient method for analyses with one outcome (dependent) variable and multiple independent variables [41]. In this study, CKD status was the dependent variable and the other factors were independent variables. The P value was used to assess the significance of each independent variable. P stands for probability and measures how likely it is that any observed difference between groups is due to chance. Being a probability, P can take any value between 0 and 1. Values close to 0 indicate that the observed difference is unlikely to be due to chance, whereas a P value close to 1 suggests no difference between the groups other than that due to chance. A result is conventionally considered "statistically significant" when the P value is below 0.05 (5%) [42]. In this work, P < 0.05 was used as the significance threshold for both the MVLA and the bivariate analysis.
In the multivariable analysis, male sex (AOR = 2.297; 95% CI: 1.407–3.753), hypertension (AOR = 3.095; 95% CI: 1.882–5.089), family history of kidney disease (AOR = 4.128; 95% CI: 2.302–7.402), diabetes duration of ten years or more (AOR = 30.986; 95% CI: 3.972–241.744), diabetes duration below ten years (AOR = 3.011; 95% CI: 1.212–7.483), smoking for more than 4 years (AOR = 2.226; 95% CI: 1.014–4.883), being overweight (AOR = 1.904; 95% CI: 1.119–3.239), and injury to the kidney (AOR = 2.362; 95% CI: 1.016–5.491) were independently associated with the presence of CKD. Tables (1) and (2) show the relations between the variables from the bivariate analysis and the MVLA.
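To make the bivariate step concrete, the sketch below computes a crude odds ratio with a 95% confidence interval (Woolf's log method) and a Pearson chi-square test for a single 2×2 table. It is an illustration only: the study used SPSS, the counts in the example are hypothetical rather than taken from the study data, and the function name is invented. For one degree of freedom the chi-square P value can be obtained from the complementary error function, which keeps the example within the Python standard library.

```python
import math


def two_by_two_summary(a, b, c, d, z=1.96):
    """Summarize a 2x2 table.

    a, b : exposed subjects with / without the disease
    c, d : unexposed subjects with / without the disease
    Returns (odds_ratio, (ci_low, ci_high), chi_square, p_value).
    """
    odds_ratio = (a * d) / (b * c)

    # Woolf's method: standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci = (math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or))

    # Pearson chi-square for a 2x2 table (no continuity correction).
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # With 1 degree of freedom, P(X > chi_square) = erfc(sqrt(chi_square / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, ci, chi_square, p_value


# Hypothetical counts, for illustration only.
odds_ratio, ci, chi_square, p_value = two_by_two_summary(60, 40, 30, 70)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), "
      f"chi-square = {chi_square:.2f}, significant = {p_value < 0.05}")
```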

Table 1: Crude odds ratios of factors associated with CKD among respondents, from bivariate analysis

Variables	Category	Frequency	CA (%)	NCA (%)	COR (95% CI)	P value
History of HTN	Yes	151	51	49	1.861 (1.419, 2.440)	0.000
History of HTN	No	233	27	72	0.675 (0.570, 0.798)	0.000
Diabetes duration >= 10	Yes	30	96.7	3.3	27.528 (3.788, 200.057)	0.000
Diabetes duration >= 10	No	354	47.5	52.5	0.857 (0.808, 0.909)	0.000
Diabetes duration < 10	Yes	38	78.9	21.1	3.559 (1.675, 7.561)	0.00
Diabetes duration < 10	No	346	48.3	51.7	0.885 (0.828, 0.946)	0.00
Sex	Female	169	38.5	61.5	0.590 (0.464, 0.749)	0.00
Sex	Male	215	61.4	38.6	1.509 (1.251, 1.820)	0.00
Age	Age >= 60	64	54	46	1.113 (0.895, 1.385)	0.335
Age	Age < 60	320	49	51	0.913 (0.760, 1.098)	0.335
History of smoking	Duration > 4	51	74.5	25.5	2.775 (1.527, 5.041)	0.001
History of smoking	Duration < 4	333	47.7	53.3	0.867 (0.802, 0.938)	0.001
Chat consumption	Yes	54	46.3	53.7	0.818 (0.498, 1.344)	0.427
Chat consumption	No	330	52.1	47.9	1.033 (0.953, 1.121)	0.427
Hospitalized before	Yes	115	44.3	55.7	0.756 (0.556, 1.030)	0.075
Hospitalized before	No	269	54.3	45.8	1.127 (0.987, 1.286)	0.075
Experienced injury	Yes	40	72.5	27.5	2.503 (1.288, 4.864)	0.005
Experienced injury	No	344	48.8	52.2	0.906 (0.846, 0.970)	0.005
Presence of family with kidney disease	Yes	92	69.9	30.4	2.170 (1.460, 3.225)	0.00
Presence of family with kidney disease	No	292	45.5	54.5	0.794 (0.708, 0.890)	0.00
Affected with CVD	Yes	49	40.8	59.2	0.655 (0.384, 1.116)	0.116
Affected with CVD	No	335	42.8	42.8	0.670 (0.566, 0.793)	0.116
BMI: Normal	Yes	85	44.7	55.3	0.767 (0.526, 1.120)	0.167
BMI: Normal	No	299	52.2	44.8	1.078 (0.968, 1.200)	0.167
BMI: Underweight	Yes	48	54.2	36.8	1.122 (0.659, 1.908)	0.671
BMI: Underweight	No	336	50.9	49.1	0.984 (0.912, 1.061)	0.671
BMI: Overweight	Yes	127	61.1	38.9	1.875 (1.372, 2.563)	0.00
BMI: Overweight	No	257	44	56	0.745 (0.646, 0.859)	0.00
BMI: Obese	Yes	6	50	50	0.949 (0.194, 4.644)	0.886
BMI: Obese	No	378	51.4	48.6	1.001 (0.949, 1.462)	0.886
Other kidney disease presence	Yes	8	25	75	0.316 (0.065, 1.548)	0.133
Other kidney disease presence	No	376	51.9	48.1	1.023 (0.993, 1.053)	
Alcohol consumption	Yes	18	72.5	27.5	2.468 (0.897, 6.788)	0.069
Alcohol consumption	No	366	50.3	49.7	0.960 (0.918, 1.003)	0.069

Table 2: Adjusted odds ratios and P values of factors associated with CKD among respondents, from multivariable logistic regression analysis

Risk Factor	P value	Final model AOR (95% CI)	Final model B coefficient (intercept B₀ = -1.836)	Low risk	High risk
Diabetes duration >= 10	0.00	30.986 (3.972, 241.744)	3.434	Has no diabetes	Diabetes above 9 years
Presence of family with kidney disease	0.000	4.128 (2.302, 7.402)	1.14	No family member with kidney disease	Has family member with kidney disease
Hypertension	0.000	3.095 (1.882, 5.089)	1.13	Has no hypertension	Has hypertension
Diabetes duration < 10	0.018	3.011 (1.212, 7.483)	1.102	Has no diabetes	Diabetes between 0 and 9 years
Experienced injury around kidney	0.046	2.362 (1.016, 5.491)	0.86	Has not experienced injury	Experienced injury
Sex	0.001	2.297 (1.407, 3.753)	0.832	Female	Male
Smoking	0.046	2.226 (1.014, 4.883)	0.800	Below 4 years	Above 4 years
Overweight	0.018	1.904 (1.119, 3.239)	0.644	Underweight and normal	Overweight

3.8 Expert System Development

An expert system (ES) is a computer system that emulates the decision-making ability of a human expert in a limited domain. Expert systems are among the leading artificial intelligence (AI) techniques adopted to handle such tasks. They provide powerful and flexible means of solving a variety of problems that often cannot be dealt with by other, more traditional methods [43]. In this research, a rule-based expert system was developed to predict the risk of individuals and suggest ways to manage it. Figure (4) shows the general framework of the developed expert system.

3.9 Rule Based Expert System

Rule-based expert systems use rules as a knowledge representation technique. Rules are expressed as if-then statements: the "if" part is called the premise and the "then" part is called the conclusion. The data and associated conditions are the fact elements. Facts interact with data directly to determine whether an event is of interest. The rule component of the expert system relates facts to actions; in other words, it constructs an if-then rule by putting the facts under the "if" part and the set of actions under the "then" part. Complex rules can be formed by joining rules with logical operators: the AND and OR operators are used to form the premise part of a rule. A rule can also activate multiple sets of actions, and these can likewise be joined by logical operators when there are multiple sets of facts to be checked individually [44].

The basic structure of an expert system comprises a knowledge base, an inference engine and a user interface [45]. The knowledge base contains domain-specific, high-quality knowledge. The inference engine retrieves and uses the knowledge in the knowledge base to arrive at a specific solution. It applies rules repeatedly to the facts, including facts obtained from earlier rule applications, and adds new knowledge to the knowledge base if required. It also resolves rule conflicts when multiple rules are applicable to a particular case. Efficient procedures and rules in the inference engine are essential for deducing a correct solution. To recommend a solution, the inference engine uses forward chaining and backward chaining [46].

Forward chaining systems are data-driven rule-based systems that trigger actions based on the facts in the premise part of a rule. They start from the known data and add each newly derived fact to the knowledge base if it is not already there. The disadvantage of forward chaining is that many rules may be executed even though they have nothing to do with the established goal, so it is not efficient when only one fact is to be inferred. Forward chaining systems perform well when the goal is not known, and they can trigger sound actions if adequate information is gathered [43]. The user interface provides the interaction between the user and the ES [47].
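Forward chaining as described above can be sketched in a few lines. This is a generic, minimal illustration and not the study's inference engine: the rule set, the fact names and the `forward_chain` function are all hypothetical. Each rule pairs a set of premises (joined by AND) with one conclusion, and an OR is expressed by writing two rules with the same conclusion.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no rule can add a new fact.

    facts : iterable of known facts (strings)
    rules : list of (premises, conclusion); a rule fires when every
            premise is already known (logical AND).
    Returns the set of all known and derived facts.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)  # new fact joins the fact base
                changed = True
    return known


# Hypothetical rules for illustration.
RULES = [
    (frozenset({"hypertension"}), "has_risk_factor"),
    (frozenset({"diabetes"}), "has_risk_factor"),  # OR via a second rule
    (frozenset({"has_risk_factor"}), "high_risk"),
    (frozenset({"high_risk", "hypertension"}), "suggest_hypertension_management"),
]

print(sorted(forward_chain({"hypertension"}, RULES)))
```

Note that the loop checks every rule on every pass, which reflects the inefficiency mentioned above: rules unrelated to the fact of interest are still evaluated.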

3.10 Knowledge Acquisition for Expert system

3.10.1 Risk prediction knowledge acquisition

To build a system that can predict the risk of CKD, knowledge must be acquired and stored as a set of rules. For risk prediction, the knowledge from the literature and from experts was tested against the patient data to identify the relationship between the disease and significant factors. The risk factors identified by the MVLA are used for risk prediction: they are used to estimate the probability of disease with the logistic regression equation [48], shown in equation (3), and to identify the risk level. For a factor that increases risk, the probability of disease when the factor is present exceeds that in its absence. Logistic regression models can account for the joint effects of multiple factors on the occurrence of disease, and the multivariable logistic model provides an estimate of the risk of subsequent disease [48, 49]. The risk level is determined by the presence or absence of risk factors and is classified as low risk or high risk [10].

$$P = \frac{1}{1 + e^{-(B_0 + B_1 x_1 + B_2 x_2 + \dots + B_n x_n)}} \tag{3}$$

where P is the probability of CKD during a stipulated period of observation, B₀ is the intercept, B₁ is the regression coefficient for the first independent variable (x₁), B₂ is the regression coefficient for the second independent variable (x₂), and so forth for each of the variables.
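The logistic regression formula can be evaluated directly with the intercept and B coefficients reported in Table 2. The sketch below is illustrative and is not the study's implementation: the dictionary keys and the function name are hypothetical, and each risk factor is treated as a binary indicator (1 if present, 0 if absent). The two diabetes-duration categories are mutually exclusive, so at most one of them should be supplied.

```python
import math

INTERCEPT = -1.836  # B0 from Table 2

# B coefficients from Table 2; the key names are illustrative.
COEFFICIENTS = {
    "diabetes_10_years_or_more": 3.434,
    "diabetes_under_10_years": 1.102,
    "family_history": 1.14,
    "hypertension": 1.13,
    "kidney_injury": 0.86,
    "male": 0.832,
    "smoking_over_4_years": 0.800,
    "overweight": 0.644,
}


def ckd_probability(present_factors):
    """Return P = 1 / (1 + exp(-(B0 + sum of B_i for present factors)))."""
    present = set(present_factors)
    unknown = present - set(COEFFICIENTS)
    if unknown:
        raise ValueError(f"Unknown risk factors: {sorted(unknown)}")
    logit = INTERCEPT + sum(COEFFICIENTS[name] for name in present)
    return 1.0 / (1.0 + math.exp(-logit))


# Example: a man with hypertension and no other listed risk factor.
print(f"{ckd_probability(['male', 'hypertension']):.1%}")
```

A predicted probability is turned into a positive or negative prediction by comparing it with a cutoff, which is how the 14%, 24% and 34% cutoffs are used in the evaluation.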

3.10.2 Risk factor management technique identification

Identifiable risk factors for CKD can be classified as modifiable or non-modifiable [50]. Some of the risk factors found in this study are modifiable. Hypertension, diabetes, and BMI are components of metabolic syndrome. Although cigarette smoking is not a component of metabolic syndrome, it is also a known modifiable risk factor. Interventions that delay or prevent the onset of diabetes mellitus, reduce overweight, support smoking cessation, and control hypertension should be considered to prevent or delay CKD [51]. Genetic factors, gender, and injury are not modifiable [50, 51]. After the significant risk factors were identified by the statistical analysis, management approaches for the modifiable risk factors (DM, HTN, smoking and being overweight) were sought in different guidelines [28, 39, 52–64]. The non-modifiable risk factors (male gender, experiencing injury, and family history of kidney disease) were excluded because they cannot be modified.

3.10.3 Knowledge Representation

In this research work, a rule-based expert system was developed. The knowledge gained from guidelines, books and the statistical analysis was stored as facts and implemented as rules by means of "if" and "then" cases. The expert system reasons over health conditions, socio-demographic status, and habits, and generates three types of results. Taking the answers to its questions as input, the system triggers the inference engine to produce the probability of the disease, the risk level and management suggestions. The questions cover the health conditions, socio-demographic status and habits found significant in the MVLA. As shown in Figure (5), Q₀ represents age, Q₁–Q₉ represent the questions asked of the user, and B represents the value assigned to each question. The questions include presence of DM, duration of diabetes, presence of HTN, smoking habit, duration of smoking, height, weight, family history of CKD, and experience of injury. For questions that need a duration, the system asks for the duration only if the person has the condition; otherwise it takes the answer as "No". If the answer to a question is "Yes", the B value is set to the corresponding coefficient from the logistic regression (a number greater than zero); otherwise it is set to zero. The probability for the person is then calculated using the logistic regression formula. If the person has at least one risk factor, the system classifies the person as a high-risk individual; if no risk factor is found, it classifies the person as low risk. In addition, if the person has a modifiable risk factor, the system reports which risk factor the person has and how it should be managed.
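The risk-level and management rules described in this section can be sketched as plain if-then logic. This is a simplified illustration, not the study's code: the factor names, the `assess` function and the shortened advice strings (condensed from the management techniques listed in Table 4) are assumptions made for the example.

```python
# Shortened advice per modifiable risk factor (condensed, illustrative wording).
MODIFIABLE_ADVICE = {
    "hypertension": "Stop smoking, reduce calorie and alcohol intake, "
                    "exercise regularly, take prescribed medication properly.",
    "diabetes": "Reduce alcohol, limit carbohydrates, salt, sugar and fat, "
                "keep to the eating/activity/medication schedule, monitor blood sugar.",
    "overweight": "Limit fats and sugars, eat more fruit and vegetables, "
                  "be physically active 150 minutes per week, reduce portion sizes.",
    "smoking": "Make a quit plan, set a quit date, tell friends and family, "
               "remove tobacco products from the environment.",
}


def assess(answers):
    """Apply the risk-level and management rules.

    answers : dict mapping a risk factor name to True ("Yes") or False ("No").
    Returns (risk_level, advice) where advice maps each modifiable factor
    that is present to its management suggestion.
    """
    present = [name for name, yes in answers.items() if yes]

    # IF at least one risk factor is present THEN high risk, ELSE low risk.
    risk_level = "high" if present else "low"

    # IF a present factor is modifiable THEN suggest how to manage it.
    advice = {name: MODIFIABLE_ADVICE[name]
              for name in present if name in MODIFIABLE_ADVICE}
    return risk_level, advice


level, advice = assess({"male": True, "smoking": True, "hypertension": False})
print(level, list(advice))
```

Non-modifiable factors such as sex, family history and injury still raise the risk level in this sketch but produce no management suggestion, mirroring the rule described above.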

3.10.4 Graphical User Interface (GUI)

The Tkinter library with the Azure theme (GUI styling) was used to create the application windows and all other elements of the graphical user interface. The code was written in Python.

Performance Evaluation Metrics

After the system was built, its performance had to be evaluated. The system was evaluated using data extracted from the medical records of patients at JUMC. Medical records of 120 patients, 60 with CKD and 60 without CKD, were collected for evaluation, and the patients were interviewed. To evaluate the system, all data were organized and entered into the developed system, and the system output was compared with the diagnosis recorded in the medical records. The system was evaluated using a confusion matrix, a square table that describes the performance of a classification model on a test dataset by cross-tabulating the actual (column) and predicted (row) classes. It makes the performance of the model easy to see. A number of performance metrics can be derived from the confusion matrix; those used here are accuracy, precision and recall [65], defined as Accuracy = (TP + TN) / (TP + TN + FP + FN), Precision = TP / (TP + FP), and Recall = TP / (TP + FN). The evaluation was performed at three cutoff percentages.

A true positive (TP) is a case predicted as positive that is actually positive; a true negative (TN) is a case predicted as negative that is actually negative. A false positive (FP) is a case predicted as belonging to the class when it does not, and a false negative (FN) is a case predicted as not belonging to the class when it does [66].
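The three metrics follow directly from the four confusion matrix counts. In the sketch below, the function name is illustrative, and the example counts are not reported in the text: they are back-calculated, as an assumption, from the 34% cutoff results and the 60/60 split of patients with and without CKD (TP = 50, FN = 10, FP = 17, TN = 43), and they reproduce the reported recall, precision and accuracy at that cutoff.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from confusion matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all cases
        "precision": tp / (tp + fp),     # predicted positives that are correct
        "recall": tp / (tp + fn),        # actual positives that are found
    }


# Counts inferred (assumption) from the 34% cutoff results; not reported directly.
metrics = classification_metrics(tp=50, fp=17, tn=43, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```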

Table (3) shows the results of the evaluation of the risk prediction system at three different cutoff percentages.

Table 3: Performance evaluation result

Cut off probability percent	Recall	Precision	Accuracy
14	96.6	58	63.3
24	98.3	59.5	65.3
34	83.3	74.6	77.5

The risk factor management techniques identified from the reviewed guidelines and articles are shown in Table (4) [28, 39, 52–64].

Table 4: Risk Factor management techniques

Risk Factors	Management Ways
HTN	Stopping smoking; reducing calorie intake; reducing alcohol consumption; performing regular dynamic exercise (such as brisk walking, swimming, cycling); taking prescribed medications properly
DM	Reducing alcohol consumption; avoiding eating too much carbohydrate-rich food; choosing and preparing food and drinks with less salt, sugar, fat and oil; staying as close as possible to the schedule of eating, activity, and medication; setting goals with the health care team for weight, activity, blood sugar level, and A1C level; checking blood sugar as directed and sharing tracking records with the health care team; ceasing smoking if a smoker
Overweight	Limiting energy intake from total fats and sugars; increasing consumption of fruit and vegetables; engaging in regular physical activity, 150 minutes spread through the week for adults; decreasing the energy density of foods and drinks; decreasing the size of food portions; avoiding snacking between meals; not skipping breakfast and avoiding eating at night; managing and reducing episodes of loss of control or binge eating
Smoking	Developing a plan to quit; setting a quit date; telling friends, family, and coworkers (it is important to share the goal to quit with those the smoker interacts with frequently); anticipating challenges to the upcoming quit attempt; removing tobacco products from the environment

4.1 Graphical user interface (GUI) Implementation

The developed GUI was tested for response time and ease of use. It was found to be easy to use and convenient for users; once initialized, a result is returned within 2 seconds. The GUI offers three reasoning options through the “Probability Estimation”, “Risk Level” and “Risk Management” buttons. “Probability Estimation” asks 9 socio-demographic, habit and health-related questions and estimates the probability of having the disease. “Risk Level” and “Risk Management” use the same questions; if the user has not already answered them, the user is asked to answer them first. “Risk Level” then displays the risk level, and “Risk Management” displays management techniques for different cases. Figure 6 shows a snapshot of the general layout of the developed GUI.

Discussion

This study assessed the risk factors for CKD among patients at JUMC, SPHMMC and MTUTH and developed a risk prediction and management system. In the evaluation of probability estimation, the system achieved 96.6% recall, 58% precision and 63.3% accuracy at a cut-off of 14%; 98.3% recall, 59.5% precision and 65.3% accuracy at a cut-off of 24%; and 83.3% recall, 74.6% precision and 77.5% accuracy at a cut-off of 34%. The accuracy is modest because the system estimates the probability of having the disease without using laboratory measurements.
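
The effect of the cut-off on these results can be sketched in code. This is a minimal illustration, assuming the system outputs a probability in percent for each patient. The function name `classify_at_cutoff`, the rule that a probability at or above the cut-off counts as a positive prediction, and the toy data are all assumptions and are not taken from the study.

```python
def classify_at_cutoff(probabilities, actual, cutoff):
    """Count TP, FP, FN, TN when probability >= cutoff is predicted positive.

    probabilities: estimated probability of CKD per patient, in percent
    actual: True if the medical record shows CKD, else False
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for prob, has_ckd in zip(probabilities, actual):
        predicted = prob >= cutoff  # assumed decision rule
        if predicted and has_ckd:
            counts["tp"] += 1
        elif predicted and not has_ckd:
            counts["fp"] += 1
        elif has_ckd:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Toy data, invented for illustration only.
toy_probs = [10, 20, 30, 40, 50, 60]
toy_actual = [False, False, True, False, True, True]

for cutoff in (14, 24, 34):
    print(cutoff, classify_at_cutoff(toy_probs, toy_actual, cutoff))
```

In this toy example, raising the cut-off turns some false positives into true negatives and eventually misses a true case. The reported results show the same direction of change between the 24% and 34% cut-offs, where precision rises and recall falls.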

Previous models have included different sets of variables. The developed system differs from these models: they did not include family history, being overweight or smoking, whereas in this study these factors were found to be significant and were included[11, 67, 68]. This variation in risk factors might be due to differences in lifestyle, such as dietary habits, sedentary living and physical activity.

There are several potential implications of this work. First, the prediction system combines expert knowledge with statistical analysis, which helps to identify the actual effects of the risk factors. Second, by allowing physicians to determine an individual’s estimated risk of chronic kidney disease, the prediction may inform clinical counseling and decision-making. For example, a higher chronic kidney disease risk may weigh against a decision to use potentially harmful interventions, favor increased intensity and frequency of follow-up testing, and assist in the decision to institute primary renal prevention measures. In addition, the system can identify an individual’s risk without any laboratory tests, prompting individuals to seek diagnosis and to be aware of their risk factors. Third, the system provides management techniques that can help patients reduce their risk. To our knowledge, no existing prediction system incorporates both risk prediction and management, which is the key contribution of this study.

Finally, because the prediction system requires no prior laboratory tests, it could be used for focused renal screening, identifying individuals in whom creatinine measurement should be considered. A limitation of this study is that it is based on patient data from only three hospitals.

Conclusion

Chronic kidney disease is clinically silent in its early stages, so most patients are detected shortly before, or at, the onset of symptomatic disease. Identifying high-risk groups can help clinicians and patients suspect the disease early.

The research combined expert-based risk identification with statistical analysis to provide a system that can identify an individual’s risk. It also identified related risk factors and showed their significance in risk prediction. In addition, the study provided risk management techniques that enable users to identify their risk and work on managing it. The system showed 96.6% recall, 58% precision and 63.3% accuracy at a 14% cut-off for probability estimation. This may help individuals to know their probability of having the disease and to seek screening. Furthermore, clinicians can use the system to identify high-risk individuals and suspect the presence of CKD.

The key contribution of the study is that it was performed on the general population and identified the risk factors. A further contribution is that the system uses the identified risk factors for self-risk prediction for the first time in Ethiopia and adds a management component, which is new. To our knowledge, there is no other system that incorporates both risk prediction and management.

Abbreviations

AOR: Adjusted Odds Ratio

ACR: Albumin Creatinine Ratio

BMI: Body Mass Index

BUN: Blood Urea Nitrogen

CA: Chronic Kidney Affected

CKD: Chronic Kidney Disease

COR: Crude Odds Ratio

CVD: Cardio Vascular Disease

DKD: Diabetic Kidney Disease

DM: Diabetes Mellitus

eGFR: estimated Glomerular Filtration Rate

ESRD: End Stage Renal Disease

FP: False Positive

FN: False Negative

ROC: Receiver Operating Characteristic

GFR: Glomerular Filtration Rate

HDL: High Density Lipoprotein

HTN: Hypertension

JUMC: Jimma University Medical College

KFRE: Kidney Failure Risk Equation

KDIGO: Kidney Disease Improving Global Outcomes

MDRD: Modification of Diet in Renal Disease

MVLR: Multi Variable Logistic Regression

MTUTH: Mizan-Tepi University Teaching Hospital

NC: Not Affected

RL: Risk Level

RM: Risk Management

SPHMMC: Saint Paul’s Hospital Millennium Medical College

SPSS: Statistical Package for the Social Sciences

SBP: Systolic Blood Pressure

TP: True Positive

TN: True Negative

UACR: Urine Albumin Creatinine Ratio

WC: Waist Circumference

Declarations

Ethics approval and consent to participate

Permission was obtained from two institutions: the Institutional Review Board (IRB) of St. Paul’s Hospital Millennium Medical College (SPHMMC), reference no. PM23/34 dated 11.6.2021, and the Jimma University Institute of Health Institutional Review Board, reference no. JHRPEY/23/26 dated 20.4.2021. The consenting adults participating in this study were at minimal risk, and participants were informed of the purpose and duration of the research. A 100% waiver was obtained for this study since it was for research purposes. We confirm that all methods were performed in accordance with the relevant guidelines and regulations.

Consent for publication

Not applicable

Availability of data and materials

The data sets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests

The corresponding author declares that none of the authors has a conflict of interest.

Funding

The authors received no specific funding for this study.

Authors’ contributions

Abeba Getachew Asmare: Conceptualization, Data curation, Analysis and Interpretation of data, Methodology, Software, Validation, Visualization, Writing original Draft, Writing- review & Editing.

Bheema Lingaiah Thamineni: Conceptualization, Formal Analysis, Methodology, Supervision, Software, Validation, Visualization, Writing original Draft, Writing- review & Editing.

Hanumesh Kumar Dasari: Formal analysis, Critical revision of the manuscript, Investigation, Resources, Writing- review & Editing.

Solomon Woldetsadik: Data Analysis, Resources, Investigation, Validation, Writing- review & Editing.

Acknowledgement

The authors of this study would like to acknowledge Jimma University Medical College, Saint Paul’s Hospital Millennium Medical College and Mizan-Tepi University Teaching Hospital for giving us the opportunity to collect data from participants. Moreover, the authors would like to extend special thanks to Nephrologist Meakel Belay, Dr. Melaku Tsediwu and Nurse Kebede Temesgen for their guidance, support and advice. Finally, the authors gratefully acknowledge the staff of Jimma University School of Biomedical Engineering for their support and advice.

References

Id SE, Cockwell P, Maxwell AP, Griffin M, Brien TO, Neill CO. “The impact of chronic kidney disease on developed countries from a health economics perspective: A systematic scoping review,” pp.1–19, 2020, doi: 10.1371/journal.pone.0230512.
Fraser SDS, Blakeman T. “Chronic kidney disease: identification and management in primary care,” pp.21–32, 2016.
Lv J, Zhang L. Prevalence and Disease Burden of Chronic Kidney Disease. Springer Singapore. doi: 10.1007/978-981-13-8871-2.
Kore C, Hm Y. “Prevalence of Chronic Kidney Disease and Associated factors among Patients with Kidney Problems Public Hospitals in Addis Ababa, Ethiopia,” vol. 4, no. 1, pp.1–5, 2018, doi: 10.4172/2472-1220.1000162.
Goro KK, et al. Patient Awareness, Prevalence, and Risk Factors of Chronic Kidney Disease among Diabetes Mellitus and Hypertensive Patients. at Jimma University Medical Center, Ethiopia; 2019.
Shewaneh D et al. “Chronic Kidney Disease and Associated Risk Factors Assessment among Diabetes Mellitus Patients at A Tertiary Hospital, Northwest Ethiopia.”
Li PK et al. “Kidney health for everyone everywhere – from prevention to detection and equitable access to care,” vol. 53, pp.1–10, 2020, doi: 10.1590/1414-431X20209614.
Recommendations CP, Providers H.“Chronic Kidney Disease (CKD)”.
Luyckx VA, et al. Reducing major risk factors for chronic kidney disease. Kidney Int Suppl. 2017;7(2):71–87. 10.1016/j.kisu.2017.07.003.
Fisher MA, Taylor GW. “A Prediction Model for Chronic Kidney Disease Includes Periodontal Disease,” vol. 80, no. 1, 2009, doi: 10.1902/jop.2009.080226.
Carrillo-larco RM et al. “Risk score for first-screening of prevalent undiagnosed chronic kidney disease in Peru: the CRONICAS-CKD risk score,” pp.1–11, 2017, doi: 10.1186/s12882-017-0758-4.
Examination N. “SCreening for Occult REnal Disease (SCORED),” vol. 167, 2007.
Tangri N, Stevens LA, Griffith J, Naimark D, Levin A, Levey AS. “A Predictive Model for Progression of Chronic Kidney Disease to Kidney Failure,” vol. 305, no. 15, pp.1553–1559, 2015, doi: 10.1001/jama.2011.451.
Dai D, Alvarez PJ. “A Predictive Model for Progression of Chronic Kidney Disease to Kidney Failure Using a Large Administrative Claims Database,” pp.475–486, 2021.
Taal MW. “Predicting Renal Risk in the General Population: Do We Have the Right Formula ?,” pp.1523–1525, 2011, doi: 10.2215/CJN.04200511.
Makino M, Yoshimoto R, Ono M, Itoko T, Katsuki T. “Artificial intelligence predicts the progression of diabetic kidney disease using big data machine learning,” no. April, pp.1–9, 2019, doi: 10.1038/s41598-019-48263-5.
Raghavan D, Holley JL. Conservative Care of the Elderly CKD Patient: A Practical Guide. Adv Chronic Kidney Dis. 2016;23(1):51–6. 10.1053/j.ackd.2015.08.003.
Chronic Kidney Disease (CKD) Management in Primary Care. 2020. [Online]. Available: www.kidney.org.au
Cabrera VJ, Hansson J, Kliger AS, Finkelstein FO. “Evidence-Based Nephrology Symptom Management of the Patient with CKD: The Role of Dialysis,” vol. 12, no. 3, pp.687–693, 2017.
Disease R, Chala G, Sisay T, Teshome Y. “Chronic Kidney Disease And Associated Risk Factors Among Cardiovascular Patients,” 2019.
Intercollegiate S, Network G. “Diagnosis and management of chronic kidney disease.(SIGN Guideline No103),” 2008.
Hunegnaw A, Mekonnen HS, Techane MA, Agegnehu CD. “Prevalence and Associated Factors of Chronic Kidney Disease among Adult Hypertensive Patients at Northwest Amhara Referral Hospitals, Northwest Ethiopia, 2020,” vol. 2021, 2021.
Fiseha T, Tamir Z. “Prevalence and awareness of chronic kidney disease among adult diabetic outpatients in Northeast Ethiopia,” pp.1–7, 2020.
Kelly JT et al. “Modifiable Lifestyle Factors for Primary Prevention of CKD: A Systematic Review and Meta-Analysis,” pp.239–253, 2021, doi: 10.1681/ASN.2020030384.
Ji A et al. “Prevalence and Associated Risk Factors of Chronic Kidney Disease in an Elderly Population from Eastern China,” vol. 000.
Kazanciog R. “Risk factors for chronic kidney disease: an update,” pp.368–371, 2013, doi: 10.1038/kisup.2013.79.
Indrayanti S, Ramadaniati H, Anggriani Y, Sarnianto P, Andayani N. “Risk Factors for Chronic Kidney Disease: A Case- Control Study in a District Hospital in Indonesia,” vol. 11, no. 7, pp.2549–2554, 2019.
Naiker IP, Chb MB, Uk M, Lond F, Sa FCP, Assounga AG. “Diagnostic approach to chronic kidney disease,” vol. 105, no. 3, pp.2–4, 2015, doi: 10.7196/SAMJ.9414.
Ngendahayo F, Mukamana D, Ndateba I, Nkurunziza A, Adejumo O. “Chronic Kidney Disease (CKD): Knowledge of Risk Factors and Preventive Practices of CKD Among Students at a University in Rwanda,” vol. 2, no. 2, pp. 185–193, 2019.
Eisenhower DD, Medical A, Gordon F. “Chronic Kidney Disease:Detection and Evaluation,” 2017.
Death E. “Chronic Kidney Disease in the United States, 2021,” 2021.
Recommendations K. “Chronic Kidney Disease in Adults – Identification, Evaluation and Management,” pp. 1–11, 2019.
KIBUACHA F, “No Title. ” 2021. https://www.geopoll.com/blog/sample-size-research/
Charan J, Biswas T. “Review Article How to Calculate Sample Size for Different Study Designs in Medical Research ?,” vol. 35, no. 2, 2013, doi: 10.4103/0253-7176.116232.
Guidance T et al. “2.7 Sampling,” no. 1, pp.95–113, 2017.
Greig A. “Guidance Note 1: Research Involving Children,” pp. 2–4, 1999.
Noret N. “Guidelines on Research Ethics for Projects with Children and Young People,” pp.1–5, 2009.
Fsrh A, Overweight “FSRHG. Obesity & Contraception,” no. April, 2019.
Sarmento RP. “An Overview of Statistical Data Analysis,” no. August, 2019.
Mindrila D, Ph D, Balentyne P, Ed M.“The Chi Square Test,” no. 2013.
Ebrahimi M, Ms K, Ms RJ, Ms EZ. “Letter Distinction Between Two Statistical Terms: Multivariable and Multivariate Logistic Regression,” pp.1446–1447, 2021, doi: 10.1093/ntr/ntaa055.
Gonzalez-chica DA, Duquia RP. “Test of association: which one is the most appropriate for my study ? *,” vol. 90, no. 4, pp.523–528, 2015.
Prasetyo DB, Simon R. “Car Problem Diagnosis Using Rule-Based Expert System,” no. March 2017, 2019.
Misgna H, Ahmed M, Kumar A. “MatES: Web-based Forward Chaining Expert System for Maternal Care,” pp.1–16.
Dr JV. “Expert sytem and knoweledge representation,” p.8.
Negnevitsky M, Intelligence A, Systems I, Wesley A.Rule-Based Expert Systems. 2004.
Dath A, Balakrishnan M. “Expert System on Coconut Disease Management and Variety Selection,” vol. 5, no. 4, pp.242–246, 2016, doi: 10.17148/IJARCCE.2016.5462.
Vilaça M, Macedo E, Tafidis P, Coelho MC, Vilac M. Multinomial logistic regression for prediction of vulnerable road users risk injuries based on spatial and temporal assessment. Int J Inj Contr Saf Promot. 2019;0(0):1–12. 10.1080/17457300.2019.1645185.
Connor GTO et al. “Multivariate Prediction of In-Hospital Mortality Associated With Coronary Artery Bypass Graft Surgery,” pp.2110–2118, 1989.
Levin A. “Identification of patients and risk factors in chronic kidney disease - Evaluating risk factors and therapeutic strategies,” Nephrol. Dial. Transplant., vol. 16, no. SUPPL. 7, pp. 57–60, 2001, doi: 10.1093/ndt/16.suppl_7.57.
Manuscript A, Access “NIHP. ” vol. 37, no. 2, pp.133–142, 2011.
Brig AM, editor, editor. National Guidline for management of Bangladesh. 2013.
Galanti LM. “Tobacco smoking cessation management: integrating varenicline in current practice,” vol. 4, no. 4, pp.837–845, 2008.
Somasundaram NP, Katulanda P, Wickramasinghe P. “Management of obesity,” no. January, 2014.
Guideline O. “Key points,” no. June, pp. 1–17, 2020.
National T, Prevention D, Program C. “DIABETES PREVENTION AND MANAGEMENT,” no. March, 2012.
Noël PH, Pugh JA. “Clinical review Management of overweight and obese adults,” pp.757–761.
Yumuk VD, Tsigos C, Schindler K, Busetto L. European Guidelines for Obesity Management in Adults. no Dec. 2015. 10.1159/000442721.
Organisation WH. “CLINICAL GUIDELINES FOR THE MANAGEMENT OF”.
Health D, Team C, Numbers T, List M. “Daily Diabetes Management Book”.
Ababa A. “Guidelines on Clinical and Programmatic Management of Major Non Communicable Diseases”.
Stel VS, Bru K, Fraser S, Zoccali C, Massy ZA, Jager KJ. “Full Review International differences in chronic kidney disease prevalence: a key public health and epidemiologic research issue,” no. February, pp. 129–135, 2017, doi: 10.1093/ndt/gfw420.
Charge T. “Your Diabetes Care and Management Plan,” p.20, [Online]. Available: diabetes.org
WHO., A guide for tobacco users. 2014.
Yerokun OM, Onyesolu MO. “Developing and Evaluating a Neuro-Fuzzy Expert System for Improved Food and Nutrition in Nigeria,” vol. 8, pp.1–21, 2021, doi: 10.4236/oalib.1107315.
Santra AK, Christy CJ. “Genetic Algorithm and Confusion Matrix for Document Clustering 1,” vol. 9, no. 1, pp.322–328, 2012.
Wen J et al. “Risk scores for predicting incident chronic kidney disease among rural Chinese people: a village-based cohort study,” pp.1–10, 2020.
Lee C et al. “Framingham risk score and risk of incident chronic kidney disease: A community-based prospective cohort study,” vol. 2019, no. 1, pp. 49–59, 2019.
