3.1 DSS methodology
Fig. 1 shows a schema of the proposed methodology for designing DSS that identify and monitor CKD in Brazilian communities. Three actors interact with a DSS built following the methodology: Physician (internal), Patient (internal), and Government Health System (external). The methodology is relevant because developing countries such as Brazil often suffer from precarious primary health care in specific settings, e.g., hard-to-reach and rural settings.
3.1.1 System for patients
We divided this methodological step into two main tasks, which may be conducted simultaneously: designing the front-end and the back-end of the web-based system used by patients. The system should contain personal health record (PHR) and risk assessment functionalities. The risk assessment of the monitored chronic disease is based on decision tree analysis, a machine learning technique. A decision tree is suitable for this type of DSS because it is a white-box approach, enabling physicians to quickly double-check the risk assessments produced by the patient's system. In a previous study [4], we compared existing risk assessment models, showing the suitability of decision tree models for the context of developing countries. Once the patient's system evaluates the user's clinical situation, it sends a clinical document, structured using the HL7 CDA, to the physician responsible for monitoring the patient. The HL7 CDA document is an XML file that contains the risk analysis data, the risk analysis decision tree, and the PHR. Because the system is web-based, patients can use it in remote and hard-to-reach settings on different devices, such as desktop computers, smartphones, and tablets. The computer-based risk assessment addresses precarious public health services, the lack of EHRs, and the lack of primary care physicians in remote and hard-to-reach settings in developing countries, e.g., reaching people who live in the Brazilian state of Amazonas. The PHR is continuously sent to a central data server to update the patient's medical records.
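As a rough illustration of the document the patient's system sends, the sketch below assembles a simplified XML payload with the three parts named above (risk analysis data, decision tree, and PHR). The function `build_risk_document`, the child element names, and the sample values are hypothetical placeholders. The output is not schema-valid HL7 CDA, which defines its own header, sections, and coded entries.

```python
import xml.etree.ElementTree as ET


def build_risk_document(patient_id, risk_level, tree_text, phr):
    """Assemble a simplified, CDA-inspired XML payload (not schema-valid CDA)."""
    root = ET.Element("ClinicalDocument")
    ET.SubElement(root, "patient", {"id": patient_id})

    # Part 1: the outcome of the risk analysis.
    risk = ET.SubElement(root, "riskAnalysis")
    ET.SubElement(risk, "riskLevel").text = risk_level

    # Part 2: a textual rendering of the decision tree, so the physician
    # can retrace the classification step by step.
    ET.SubElement(root, "decisionTree").text = tree_text

    # Part 3: the personal health record entries (name -> value).
    record = ET.SubElement(root, "personalHealthRecord")
    for name, value in phr.items():
        ET.SubElement(record, "entry", {"name": name}).text = str(value)

    return ET.tostring(root, encoding="unicode")


# Illustrative values only; they are not taken from the study's dataset.
xml_text = build_risk_document(
    patient_id="patient-001",
    risk_level="high risk",
    tree_text="(decision tree rendering goes here)",
    phr={"creatinine": 1.8, "urea": 60},
)
```

Keeping the three parts in separate elements mirrors the description above: the physician's system can read the risk level, display the tree, and update the PHR independently.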
3.1.2 System for physicians
We defined two main tasks for this methodological step, which may be conducted simultaneously: designing the front-end and the back-end of the web-based system used by physicians. The system should enable physicians to receive CDA documents from the patient's system, double-check the chronic disease risk assessment, and conduct the final diagnosis using the risk analysis data, the risk analysis decision tree, and the PHR. The decision tree allows physicians to perform a step-by-step verification of the initial risk assessment. In case of uncertainty about the diagnosis, the physician may use the system to include more specific tests in the CDA document and send it to other physicians for second opinions until a more precise diagnosis is reached. When the physician concludes the evaluation, the system updates the patient's medical records maintained on the clinical data server. This system enables the remote evaluation (i.e., a simple analysis of the risk assessment results provided by the model) of people who live in remote and hard-to-reach settings.
3.1.3 Government Health System
Considering the maintenance of PHR, the continuous usage of the system by patients and physicians results in a large, centralized dataset of users under monitoring in remote and hard-to-reach settings. The subsystems handled by patients and physicians use the web services of a server subsystem to upload local data to the centralized dataset. External government health systems can benefit from the centralized dataset by applying machine learning techniques to generate relevant information for planning public policies, e.g., conducting disease awareness campaigns for preventing chronic disease, focusing on settings that present high incidence.
3.2 WebMultCare
The proposed system is composed of three main subsystems: Patient, Medical, and Server. The Patient subsystem provides functionalities to handle, among other elements, glucose and blood pressure sensors, acquiring data related to DM and AH. These data are recorded locally and sent to a database by the Server subsystem. When a CKD risk is identified, alerts and the patient's data are sent to the Medical subsystem, which is used by a physician in a healthcare environment. The Medical subsystem thus enables physicians to analyze the risk analysis data and the patient's PHR, updating or confirming the patient's clinical condition under monitoring using the Server subsystem. As an example of a health system, the Brazilian SUS can reuse the patient's central data for planning public policies.
The architecture of WebMultCare was defined following the attribute-driven design method [35], guided by the following architectural drivers: modifiability, portability, scalability, availability, and interoperability. To achieve modifiability and portability, the system is based on the model-view-controller (MVC) pattern and on the architectural tactics of semantic coherence and information hiding. To improve scalability, availability, and interoperability, we use the client-server architectural pattern and web services.
3.2.1 Patient subsystem
A previous Android version of the Patient subsystem was presented in [19], including formal specifications, effectiveness evaluation, and usability tests. The usability tests showed some limitations that motivated the re-engineering of the subsystem based on web technologies. Additionally, the version presented in this article improves the CKD risk analysis using machine learning and knowledge-based system concepts. For instance, the system provides a new feature that uses a knowledge base to refer patients with specific emergencies (i.e., hyperglycemia, hypoglycemia, hypokalemia, and hyperkalemia) to an adequate healthcare facility when they are visiting an unfamiliar location.
The back-end of the Patient subsystem was implemented using Java and web services. The subsystem comprises the following main features: access control, management of ingested drugs, management of allergies, management of examinations, monitoring of AH and DM, execution of risk analysis, generation and sharing of CDA documents, and analysis of emergencies. The front-end of the Patient subsystem is implemented using HTML 5, Bootstrap, JavaScript, and Vue.js. Fig. 2a illustrates the graphical user interface (GUI) for recording a new CKD test result (the main inputs for the risk assessment model). The user can also upload an XML file containing the test results to avoid a large number of manual inputs. Once the patient provides the current test results, the main GUI of the Patient subsystem is updated, showing the test results available for the risk assessment.
Fig. 2b illustrates the main GUI of the Patient subsystem, showing creatinine, urea, albuminuria, and GFR (i.e., the main features used by the risk assessment model). Compared to the previously published research [19], this study reduces the number of test results required to conduct the CKD risk analysis from 5 to 4, which is critical for low-income populations using the Patient subsystem. The subsystem provides a new CKD risk analysis once the patient has entered all CKD features.
During the CKD risk analysis (conducted when all test results are available), the J48 decision tree algorithm classifies the patient's situation into four classes (low risk, moderate risk, high risk, and very high risk), also considering the presence/absence of DM, the presence/absence of AH, age, and gender. In case of moderate, high, or very high risk, the subsystem packages the classification results as a CDA document, along with a graphical representation of the decision tree and the patient's general data. The Patient subsystem then alerts the physician responsible for the patient and sends the complete CDA document (i.e., the main output of the DSS) for further clinical analysis. In case of low risk, the Patient subsystem only records the risk analysis results to keep track of the patient's clinical situation and does not alert the physician. This automates the risk analysis and sharing, which users previously had to trigger through button events [19]. In this article, 114 records, available in the same CKD dataset used in [4], were used to train the J48 decision tree algorithm and define the final risk assessment model embedded in the proposed DSS. We experimented with the parameters of the J48 decision tree algorithm to improve accuracy. As a result, we configured the split point to prevent scanning the entire dataset for the closest data value (relocation). For the remaining parameters, we used the default values of the J48 Weka© package.
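The routing that follows the classification can be sketched as below. This is a minimal illustration, not the subsystem's Java implementation: the class `PatientSubsystem`, its fields, and the dictionary used in place of a CDA document are hypothetical. The sketch also assumes that every result is recorded locally, including those that trigger an alert.

```python
from dataclasses import dataclass, field

RISK_LEVELS = ("low risk", "moderate risk", "high risk", "very high risk")
ALERT_LEVELS = {"moderate risk", "high risk", "very high risk"}


@dataclass
class PatientSubsystem:
    history: list = field(default_factory=list)  # locally recorded results
    outbox: list = field(default_factory=list)   # documents sent to the physician

    def handle_risk(self, patient_id, risk_level):
        """Record the result and alert the physician unless the risk is low."""
        if risk_level not in RISK_LEVELS:
            raise ValueError(f"unknown risk level: {risk_level}")
        # Assumption: results are recorded for every class, not only low risk.
        self.history.append((patient_id, risk_level))
        if risk_level in ALERT_LEVELS:
            # Stand-in for packaging and sending the CDA document.
            self.outbox.append({"patient": patient_id, "risk": risk_level})
            return True
        return False


subsystem = PatientSubsystem()
alerted_low = subsystem.handle_risk("patient-001", "low risk")
alerted_high = subsystem.handle_risk("patient-001", "high risk")
```

No user action appears in this flow, which reflects the automation described above: the decision to share depends only on the predicted class.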
Results presented in a previous study [4] justify the usage of the J48 decision tree algorithm and features (i.e., presence/absence of DM, presence/absence of AH, creatinine, urea, albuminuria, age, gender, and GFR) to conduct risk analyses in developing countries. The physician responsible for the healthcare of a specific patient can remotely access the CDA document through the Medical subsystem, re-evaluate or confirm the risk analysis (i.e., preliminary diagnosis) provided by the Patient subsystem, and share the data with other physicians to get second opinions. If the physician confirms the preliminary diagnosis, the patient can continue using the Patient subsystem to prevent CKD progression, including the monitoring of risk factors (DM and AH), CKD stage, and risk level.
In addition, the Patient subsystem includes a knowledge-based system that refers a patient with CKD and risk factors to an adequate healthcare facility during an emergency, another new contribution relative to [19]. This feature targets the scenario in which the patient is outside his/her home county and does not know the correct facility for treatment given the current health situation. Based on semi-structured interviews with an experienced nephrologist who has treated patients in Brazil for more than 30 years, we addressed the following topics: (i) possible emergency care locations; (ii) pathology to be identified; (iii) symptoms; and (iv) associated drugs. For hyperglycemia, hypoglycemia, hyperkalemia, and hypokalemia, the system can refer the patient to an emergency care unit (ECU) or a hospital emergency department, based on the patient's current health condition.
The knowledge base defined for the knowledge-based system comprises data collected from medical guidelines and semi-structured interviews with the nephrologist. The data relate to the symptoms that patients may present and the risk factors that can cause the health conditions (e.g., specific drugs). Fig. 3 describes the first decisions used to identify the risk of hyperglycemia, hypoglycemia, hyperkalemia, or hypokalemia. Nausea is a symptom shared by all clinical conditions, and each additional symptom helps identify a specific condition.
In addition to the symptoms, excessive consumption of alcohol and an excessive quantity of insulin may increase the risk of hypoglycemia. The use of specific drugs may also lead to any of the clinical conditions considered, so the possible ingestion of a drug is a relevant indication of the risk of a specific clinical condition. Fig. 4 describes the commonly ingested drugs that may lead to hyperglycemia, hypoglycemia, hyperkalemia, and hypokalemia.
Fig. 5 illustrates, as a tree, a summary of the relationships between the questions presented in the DSS that are considered during the identification of hyperglycemia (left side of Fig. 3). We generated the tree from the knowledge base to present an overview of a sample of the knowledge-based system that composes the DSS. Hyperglycemia is a common clinical condition in patients who have DM. The rule bases for hypoglycemia, hyperkalemia, and hypokalemia are defined similarly to the example of Fig. 5, differing in the specific tests, symptoms, and ingested drugs. Finally, Fig. 6 shows a view of the GUI of the knowledge-based system in a risk scenario of hyperglycemia. Whenever facing an emergency, the patient can provide information about his/her current clinical condition, enabling the DSS to identify the emergency and recommend a healthcare unit (another example of the DSS). In this case, after answering questions about specific symptoms, the patient is asked whether he/she ingested specific drugs to increase confidence in the evaluation, following the relationships presented in Fig. 5.
3.2.2 Medical subsystem
The back-end of the Medical subsystem is implemented using Java, the Spring MVC framework, and Drools (a business rules management system). The subsystem comprises the following main features: (i) access control; (ii) management of CDA documents; (iii) version control of CDA documents; (iv) sharing of CDA documents; (v) history of CDA document versions; and (vi) re-evaluation of risk analysis. The front-end of the Medical subsystem is implemented using HTML 5, CSS, JavaScript, Bootstrap, and Java server pages. After validating the physician's credentials, the system directs him/her to the main GUI, which displays a brief presentation of the available features to handle clinical documents.
Two scenarios guide the usage of the Medical subsystem: creating a new clinical document and evaluating an existing clinical document. Fig. 7a illustrates the feature for creating a new clinical document, which enables physicians to start evaluating a patient without depending on data received from the Patient subsystem. Fig. 7b shows that physicians are asked to provide the risk assessment for patients guided by the classifications proposed in well-accepted international medical guidelines. In contrast, the evaluation of an existing clinical document relies on data received from the Patient subsystem, which performs the risk assessments of patients. The remote monitoring feature is relevant to address precarious public health services, the absence of EHRs, and the lack of primary care physicians in Brazilian communities. If the Patient subsystem identifies a moderate, high, or very high risk, physicians receive general data and the risk assessment conducted using the J48 decision tree algorithm, enabling the final evaluation or the interaction with other physicians to improve confidence in a suspicious clinical situation. When clinical documents are available, physicians can use version control to access current and past documents, which helps keep track of the history of clinical evaluations of patients. Re-evaluating only a subset of patients (those referred by the patient's system) can reduce the burden on (or inefficiency of) the public health system.
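One simple way to picture the version control of clinical documents is an append-only history per patient, sketched below. The class `DocumentHistory`, its method names, and the sample authors are hypothetical. The sketch does not describe how the Medical subsystem actually stores versions.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentHistory:
    """Append-only list of versions for one patient's clinical document."""
    versions: list = field(default_factory=list)

    def add_version(self, author, content):
        """Store a new version and return its 1-based version number."""
        number = len(self.versions) + 1
        self.versions.append(
            {"version": number, "author": author, "content": content}
        )
        return number

    def current(self):
        """Return the latest version, or None if no version exists yet."""
        return self.versions[-1] if self.versions else None

    def get(self, number):
        """Return a past version by its 1-based number."""
        if not 1 <= number <= len(self.versions):
            raise KeyError(f"no version {number}")
        return self.versions[number - 1]


history = DocumentHistory()
history.add_version("patient-subsystem", "preliminary assessment: high risk")
history.add_version("physician-a", "confirmed after re-evaluation")
```

Because earlier versions are never overwritten, a physician asked for a second opinion can compare the preliminary assessment with each later revision.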
3.2.3 Server subsystem
A real-time database supports the central data server, assisting data analysis by patients, physicians, and the government. The Patient and Medical subsystems use web services provided by the Server subsystem to update the PHR of patients as part of the medical records available in a healthcare facility. As a result, the government can conduct data mining over a large amount of data to support the planning and execution of public health policies. For example, it is possible to identify locations that require educational activities to prevent worsening mortality rates.
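As one possible example of such an analysis, the sketch below counts high-risk and very-high-risk assessments per location so that locations can be ranked for educational activities. The record fields `location` and `risk`, the function `high_risk_counts`, and the sample data are hypothetical. The actual schema of the central dataset is not described here.

```python
from collections import Counter


def high_risk_counts(records, levels=("high risk", "very high risk")):
    """Rank locations by the number of assessments at the given risk levels."""
    counts = Counter(r["location"] for r in records if r["risk"] in levels)
    return counts.most_common()


# Illustrative records only; they are not taken from the study's data.
sample_records = [
    {"location": "A", "risk": "high risk"},
    {"location": "A", "risk": "very high risk"},
    {"location": "B", "risk": "low risk"},
    {"location": "B", "risk": "high risk"},
    {"location": "C", "risk": "moderate risk"},
]
ranking = high_risk_counts(sample_records)
```

A raw count favors populous locations. Dividing by the number of monitored users per location would give a rate that is easier to compare across settings.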
3.3 Evaluation
3.3.1 Machine Learning Model
A CKD dataset guided the evaluation of the patient's subsystem of the DSS when conducting the CKD risk assessment according to four classes: low risk, moderate risk, high risk, and very high risk. The data collection was approved by the Brazilian ethics committee of the Federal University of Alagoas, approval number 47350313.9.0000.5013. The evaluation relied on 114 records, including 60 real-world and 54 augmented records (the latter only for the training set), as detailed in Section 2.1. As highlighted, we used 108 records for training and 6 for testing. Table 3 presents the evaluation results of the CKD risk assessment conducted using the patient's subsystem. With 10-fold cross-validation, the model presented high accuracy (i.e., 95.00%). The 10-fold cross-validation was executed 5 times, showing stability. The J48 decision tree presented a precision of 0.97, ROC area of 0.96, and PRC area of 0.94.
Table 3. Evaluation results, using the performance metrics, of the CKD risk assessment conducted by the patient's subsystem.
Metric | Result
Correctly classified instances (%) | 95.00
Incorrectly classified instances (%) | 5.00
Precision | 0.97
Precision-recall curve area | 0.94
Receiver operating characteristic area | 0.96
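For readers who want to see how the first rows of Table 3 are derived from predictions, the sketch below computes the share of correctly classified instances and a support-weighted precision from two label lists. The function names and the toy labels are hypothetical. It is also an assumption that the reported precision is a class-weighted average.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_precision(y_true, y_pred):
    """Per-class precision averaged with weights equal to class support."""
    support = Counter(y_true)
    predicted = Counter(y_pred)
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = 0.0
    for label, n in support.items():
        # A class that is never predicted contributes a precision of 0.
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        total += precision * n
    return total / len(y_true)


# Toy labels for illustration; they are not the study's predictions.
y_true = ["low risk", "low risk", "high risk", "high risk"]
y_pred = ["low risk", "high risk", "high risk", "high risk"]
acc = accuracy(y_true, y_pred)            # 3 of 4 correct
prec = weighted_precision(y_true, y_pred)
```

In the toy example, precision is 1/1 for low risk and 2/3 for high risk, so the weighted value is (2 × 1 + 2 × 2/3) / 4 ≈ 0.83, while accuracy is 0.75.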
3.3.2 Knowledge-based system
The complete system was presented to the experienced nephrologist, who confirmed the completeness of the requirements. In particular, the knowledge base and questions were presented to a nephrologist with more than 30 years of experience teaching and treating patients with CKD and DM, to evaluate the knowledge-based system as part of the DSS. The nephrologist reviewed the knowledge base and questions, validating the final version of the DSS. The nephrologist also analyzed simulated data to evaluate the risk of hyperglycemia, hypoglycemia, hyperkalemia, and hypokalemia, and the same data were analyzed using the DSS to compare the risk assessments. Table 4 presents a sample of the 112 simulated scenarios used to evaluate the knowledge-based system, covering 7 of the 21 paths of Fig. 5. This sample relates to fictitious subjects with a risk of hyperglycemia.
Table 4. Sample of hyperglycemia simulated scenarios used to evaluate the knowledge-based system.
ID | Symptoms | Drugs | DM | Glucose | Refer
1 | All symptoms | No | - | - | ECU
2 | All symptoms | Yes | No | - | ECU
3 | All symptoms | Yes | No | 120 | ECU
4 | All symptoms | Yes | No | 126 | Hospital
5 | All symptoms | Yes | Yes | - | ECU
6 | All symptoms | Yes | Yes | 60 | ECU
7 | All symptoms | Yes | Yes | 250 | Hospital
To conduct the evaluation, we measured agreement using Cohen's kappa statistic, calculating both the gross agreement and the kappa concordance. The evaluation consisted of two steps with the experienced nephrologist. In the first step (Table 5), the knowledge-based system achieved only substantial agreement (k = 0.6821) and moderate agreement (k = 0.5962) with the nephrologist for risk classification and referral, respectively. The main cause of the disagreement was that the nephrologist considered some of the scenarios of hyperkalemia and hypokalemia risks as inconclusive (Table 7, column 2). In the second step, the knowledge base was corrected, and the evaluation resulted in 100% concordance with the opinion of the experienced nephrologist.
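Cohen's kappa compares the observed agreement between two raters, p_o, with the agreement expected by chance, p_e, as k = (p_o - p_e) / (1 - p_e). The sketch below is a minimal illustration of that formula for two lists of labels. The function name `cohen_kappa` and the toy referral labels are hypothetical, and the sketch is not the tool used in the evaluation.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same, non-empty set of items")
    n = len(rater_a)
    # Observed (gross) agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Toy referral labels for illustration; they are not the study's scenarios.
system = ["ECU", "ECU", "Hospital", "Hospital"]
expert = ["ECU", "Hospital", "Hospital", "Hospital"]
kappa = cohen_kappa(system, expert)
```

In the toy example the raters agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = (2 × 1 + 2 × 3) / 16 = 0.5, giving k = 0.5. On the commonly used Landis and Koch scale, values from 0.41 to 0.60 are labelled moderate and values from 0.61 to 0.80 substantial, which matches the wording used above.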
Table 5. Results of the first evaluation step of the knowledge-based system by comparing it with the opinion of an experienced nephrologist using the kappa statistic.
Risk | k By Risk | Refer | k By Refer
Hyperglycemia | 0.9361 | ECU | 0.5880
Hypoglycemia | 0.8886 | Hospital | 0.5547
Hyperkalemia | 0.5089 | Inconclusive | 1.0000
Hypokalemia | 0.7404 | |
Inconclusive | 0.0290 | |
Global Kappa | 0.6903 | | 0.5962