Development and validation of an equation to predict the incidence of coronary heart disease in patients with type 2 diabetes in Japan

Objective Under the diabetes treatment policy adopted after the Kumamoto Declaration 2013, it is difficult to accurately predict the incidence of complications in patients using the JJ risk engine. This study was conducted to develop a prediction equation suitable for the current diabetes treatment policy using patient data from Kitasato University Kitasato Institute Hospital (Hospital A) and to externally validate the developed equation using patient data from Kitasato University Hospital (Hospital B). Outlier tests were performed on the patient data from Hospital A to exclude the outliers. A prediction equation was developed using the patient data excluding the outliers and was subjected to external validation.
Results By excluding outlier data, we could develop a new prediction equation for the incidence of coronary heart disease (CHD) as a complication of type 2 diabetes, incorporating the use of antidiabetic drugs with a high risk of hypoglycemia. This is the first prediction equation in Japan that incorporates the use of antidiabetic drugs. We believe that it will be useful in preventive medicine for treating people at high risk of CHD as a complication of diabetes or other diseases. In the future, we would like to confirm the accuracy of this equation at other facilities.
Supplementary Information The online version contains supplementary material available at 10.1186/s13104-021-05844-w.


Introduction
It is important to prevent the development of diabetic complications during diabetes treatment [1]. One method of predicting the incidence of diabetic complications is a risk engine, which is used to personalize medicine for patients. Several risk engines have been developed to predict the incidence of diabetic complications [1][2][3][4][5]. In 2012, the Japan Diabetes Complications Study (JDCS)/The Japanese Elderly Diabetes Intervention Trial (J-EDIT) risk engine (JJ risk engine) was developed to accurately predict macro- and microvascular complications in Japanese patients with type 2 diabetes [1]. However, this risk engine was evaluated only through internal validation and does not treat hypoglycemia prevention as a priority, as has been emphasized since the Kumamoto Declaration 2013, and external validation has remained a challenge [1,6]. In our previous study, we externally validated the prediction accuracy of the JJ risk engine using data from patients with type 2 diabetes at Hospital A. The results showed that the prediction of the JJ risk engine and the actual frequency of diabetic complications in Hospital A diverged [7]. Although the cause of this discrepancy is unknown, one reason may be the change in diabetes treatment to one that emphasizes hypoglycemia prevention [6,8,9,10]. Therefore, we concluded that it is difficult to accurately predict the complication rates in patients using the JJ risk engine under the diabetes treatment policies adopted after the Kumamoto Declaration of 2013 [7].
Besides hypoglycemia, other risk factors for coronary heart disease (CHD) include aging, hypertension, hyperlipidemia, obesity, and chronic kidney disease (CKD) [11,12]. Outlier tests were conducted for each risk factor, because prediction without outliers is more accurate than prediction with outliers [13,14]. Therefore, in this study, we developed a new prediction equation that is more accurate and suitable for the patient population of Hospital A and externally validated the new prediction equation using patient data from Hospital B.

Hospital (target facility)
Kitasato University Kitasato Institute Hospital (Hospital A). Kitasato University Hospital (Hospital B).

Selection criteria
The subjects were patients with type 2 diabetes who visited Hospital A or Hospital B from January 2013 to December 2013 and continued treatment for the following 5 years until 2018.

Exclusion criteria
Patients who refused to participate in the study or had a history of any of the following diseases were excluded: angina, myocardial infarction, stroke, peripheral arterial disease, familial hypercholesterolemia, familial type III hyperlipidemia, nephrotic syndrome, renal diseases other than diabetic nephropathy, microhematuria, preproliferative and proliferative retinopathy, or major ocular diseases (e.g., glaucoma, dense cataract, or a history of cataract surgery).
This study was conducted in accordance with the Ethical Guidelines for Medical and Health Research Involving Human Subjects. The Research Ethics Committee of Kitasato University Kitasato Institute Hospital approved the study (Control Number: 20051 and 20051-2) and provided permission to review patient records and use the corresponding data. The option to opt out of the study was provided to the patients at the start of the study (2021).

Statistical analysis
We developed a prediction equation based on the Cox proportional hazard model using patient data from Hospital A [15]. The backward stepwise method was used for the selection of variables [16].
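As an illustration of how a Cox proportional hazard model with a single binary covariate can be fitted, the following minimal sketch maximizes the partial likelihood by Newton-Raphson iteration. It is not the authors' R code; the function name `fit_cox_single`, the Breslow-style handling of risk sets, and the toy follow-up data are assumptions made for this example only.

```python
import math

def fit_cox_single(times, events, x, iters=25):
    """Estimate beta for a one-covariate Cox model by Newton-Raphson.

    times  : follow-up time for each subject
    events : 1 if the event occurred, 0 if the subject was censored
    x      : covariate value (for example 1 = medicine user, 0 = non-user)
    """
    beta = 0.0
    for _ in range(iters):
        grad, hess = 0.0, 0.0
        for i, (t_i, d_i) in enumerate(zip(times, events)):
            if not d_i:
                continue  # censored subjects contribute only through risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
            s0 = sum(math.exp(beta * x[j]) for j in risk)
            s1 = sum(x[j] * math.exp(beta * x[j]) for j in risk)
            s2 = sum(x[j] ** 2 * math.exp(beta * x[j]) for j in risk)
            grad += x[i] - s1 / s0                # first derivative of the log partial likelihood
            hess -= s2 / s0 - (s1 / s0) ** 2      # second derivative (always <= 0)
        if hess == 0.0:
            break
        step = grad / hess
        beta -= step
        if abs(step) < 1e-10:
            break
    return beta

# Hypothetical toy data, not taken from the study.
toy_times = [2, 3, 5, 7, 8, 10, 12, 15]
toy_events = [1, 1, 0, 1, 1, 0, 1, 0]
toy_medicine = [1, 1, 1, 0, 1, 0, 0, 0]
toy_beta = fit_cox_single(toy_times, toy_events, toy_medicine)
toy_hazard_ratio = math.exp(toy_beta)
```

In this sketch, a positive `toy_beta` means that medicine users have a higher hazard than non-users, and `exp(beta)` is the corresponding hazard ratio. Backward stepwise selection would repeat such fits while removing non-significant variables, which is not shown here.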

Discrimination
Discrimination is an index that evaluates how accurately a prediction model can predict the presence or absence of an event. The C-statistic, which is calculated from the receiver operating characteristic (ROC) curve, is used as the criterion for measuring predictive accuracy [17,18].
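For a binary outcome, the C-statistic equals the area under the ROC curve and can be computed as the proportion of event/non-event pairs in which the subject with the event received the higher predicted score, with ties counted as one half. The sketch below illustrates this definition; the function name `c_statistic` and the example scores are hypothetical and are not taken from the study.

```python
def c_statistic(scores, outcomes):
    """Proportion of concordant (event, non-event) pairs; ties count 0.5."""
    with_event = [s for s, y in zip(scores, outcomes) if y == 1]
    without_event = [s for s, y in zip(scores, outcomes) if y == 0]
    if not with_event or not without_event:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p in with_event:
        for n in without_event:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))

# Hypothetical predicted risks and observed outcomes (1 = event).
example_scores = [0.9, 0.8, 0.3, 0.3, 0.1]
example_outcomes = [1, 0, 1, 0, 0]
example_c = c_statistic(example_scores, example_outcomes)  # 4.5 concordant of 6 pairs = 0.75
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination. When the model contains a single binary predictor, the predicted score takes only two values, so many pairs are tied and contribute one half each.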

Calibration
Calibration is an index that measures the degree of agreement between the prediction by the model and the actual outcome. The significance probability calculated using the Hosmer-Lemeshow test is used as the criterion for predictability. The significance level was set at 0.05 [18,19].
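The Hosmer-Lemeshow statistic can be sketched as follows: subjects are sorted by predicted risk, split into groups, and the observed and expected numbers of events are compared in each group. This sketch assumes that the outcome is treated as binary (event or no event within the follow-up period), uses groups minus 2 degrees of freedom, and computes the p-value with a closed form that is valid only for even degrees of freedom. All names and numbers are hypothetical.

```python
import math

def hosmer_lemeshow(pred, outcomes, groups=10):
    """Return (chi-square statistic, degrees of freedom)."""
    pairs = sorted(zip(pred, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # sum of predicted risks
        size = len(chunk)
        mean_p = expected / size
        if 0.0 < mean_p < 1.0:                # skip degenerate groups
            stat += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return stat, groups - 2

def chi2_sf_even_df(x, df):
    """Upper tail probability of the chi-square distribution for even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Hypothetical predicted risks and outcomes.
hl_pred = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
hl_obs = [0, 0, 0, 1, 0, 1, 1, 1]
hl_stat, hl_df = hosmer_lemeshow(hl_pred, hl_obs, groups=4)
hl_p = chi2_sf_even_df(hl_stat, hl_df)
```

A p-value above the significance level, as in this toy example, is read as no significant difference between the measured and predicted values.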

External validation
We developed prediction equations using Hospital A data and then performed external validation using Hospital B data.

Outlier testing using box plots
Outlier tests with box plots were performed to reduce the impact of outliers of each risk factor on the prediction accuracy.
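A box plot conventionally flags values lying more than 1.5 times the interquartile range beyond the first or third quartile. The sketch below applies that rule with Python's standard library. The quartile convention is an assumption and may differ slightly from the hinges used by the box plot function in R, and the sample values are hypothetical.

```python
import statistics

def boxplot_fences(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """Values that fall outside the box plot fences."""
    low, high = boxplot_fences(values, k)
    return [v for v in values if v < low or v > high]

# Hypothetical measurements of one risk factor.
sample = [20, 22, 23, 24, 25, 26, 27, 28, 60]
sample_fences = boxplot_fences(sample)   # Q1 = 23, Q3 = 27, IQR = 4
sample_outliers = find_outliers(sample)  # only 60 lies beyond the upper fence
```

Applying such a rule separately to each risk factor yields one pair of cutoffs per factor.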
Although blood pressure was measured at the time of medical examination, it was excluded from the risk factors in this study because the time of measurement varied among subjects [21]. R version 2.5.1 (http://www.r-project.org; libraries Design, Hmisc, and ROCR) was used to evaluate discrimination and calibration, using the ROC curve, the Hosmer-Lemeshow test, and box plots [18][19][20].

Results
There were 572 and 285 patients in Hospitals A and B, respectively. The baseline characteristics of the patients are presented in Table 1.
Patients who used either sulfonylurea (SU) drugs or insulin were considered medicine users. Among the variables, only medicine had a p-value below 0.05 (p = 0.03) (Additional file 1: Table S1). Therefore, only medicine was included as a variable in the prediction equation, and the prediction equation developed is as follows: $\lambda_t = \lambda_{0t} \exp(\beta \times \mathrm{medicine})$, where $\lambda_t$ is the incidence rate by time t; $\lambda_{0t}$ is the baseline hazard for time t; $\beta$ is the partial regression coefficient; and medicine (0, 1) is 1 for patients who used either SU or insulin and 0 for patients who used neither.
The prediction equation using data from Hospital A resulted in a C-statistic of 0.734 and a calibration of p > 0.05, indicating no significant difference between the measured and predicted values. In contrast, external validation using data from Hospital B resulted in a C-statistic of 0.809 and a calibration of p < 0.05, indicating a significant difference between the measured and predicted values (Table 2).
Therefore, an outlier test using a box-and-whisker diagram was performed to improve the prediction accuracy. The outliers were age: 39 years or less; total cholesterol: ≥ 266 or ≤ 96; HDL cholesterol: ≥ 105.5; BMI: ≥ 35; and urine albumin: ≥ 78.7. A total of 120 patients from Hospital A and 84 patients from Hospital B were excluded.
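The exclusion step can be sketched as a simple filter over patient records using the cutoffs listed above. The dictionary keys are hypothetical field names, the values are assumed to be in the same units as the study data (units are not stated in the text), and every field is assumed to be present for every patient.

```python
def is_outlier(patient):
    """True if any risk factor falls in the outlier range reported above."""
    return (
        patient["age"] <= 39
        or patient["total_cholesterol"] >= 266
        or patient["total_cholesterol"] <= 96
        or patient["hdl_cholesterol"] >= 105.5
        or patient["bmi"] >= 35
        or patient["urine_albumin"] >= 78.7
    )

def exclude_outliers(patients):
    """Keep only patients with no risk factor in the outlier range."""
    return [p for p in patients if not is_outlier(p)]

# Hypothetical records, not taken from the study.
records = [
    {"age": 55, "total_cholesterol": 200, "hdl_cholesterol": 60, "bmi": 24, "urine_albumin": 10},
    {"age": 35, "total_cholesterol": 200, "hdl_cholesterol": 60, "bmi": 24, "urine_albumin": 10},
    {"age": 60, "total_cholesterol": 270, "hdl_cholesterol": 60, "bmi": 24, "urine_albumin": 10},
    {"age": 60, "total_cholesterol": 200, "hdl_cholesterol": 60, "bmi": 24, "urine_albumin": 78.7},
]
kept = exclude_outliers(records)  # only the first record remains
```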
In the analysis of the variables after exclusion, the p-value for medicine was < 0.05, indicating a significant difference.
The developed prediction equation is as follows: $\lambda_{1825}$: incidence rate within 5 years; medicine (0, 1): 1 for patients who used either SU or insulin, 0 for patients who used neither.
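One common way to turn a fitted Cox model into an incidence within a fixed period is to raise the baseline survival at that time to the power exp(β × medicine) and subtract the result from 1. The sketch below illustrates this general relation only; it is not confirmed to be the form the authors used, and the coefficient and baseline survival are placeholder values because the fitted numbers are not reproduced in this text.

```python
import math

def five_year_incidence(medicine, beta, baseline_survival_5y):
    """Cumulative incidence by 1825 days under a proportional hazards model.

    medicine             : 1 for SU or insulin users, 0 otherwise
    beta                 : partial regression coefficient (placeholder here)
    baseline_survival_5y : event-free probability at 1825 days when medicine = 0
    """
    if medicine not in (0, 1):
        raise ValueError("medicine must be 0 or 1")
    if not 0.0 < baseline_survival_5y <= 1.0:
        raise ValueError("baseline survival must be in (0, 1]")
    return 1.0 - baseline_survival_5y ** math.exp(beta * medicine)

# Placeholder inputs, NOT the study's fitted values.
PLACEHOLDER_BETA = 0.7
PLACEHOLDER_S0 = 0.95
risk_non_user = five_year_incidence(0, PLACEHOLDER_BETA, PLACEHOLDER_S0)
risk_user = five_year_incidence(1, PLACEHOLDER_BETA, PLACEHOLDER_S0)
```

With a positive coefficient, the predicted incidence for medicine users is higher than that for non-users, which is the qualitative behavior described in this section.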
The C-statistic was 0.644 and the calibration was p > 0.05, indicating no significant difference between the measured and predicted values. A total of 201 patients from Hospital B who remained after exclusion were used for external validation; the C-statistic was 0.750, and the calibration was p > 0.05, again indicating no significant difference between the measured and predicted values (Table 3).

Discussion
After the outliers were excluded, a prediction equation (K-medicine equation) was developed for the incidence of CHD in patients with type 2 diabetes using patient data from Hospital A. Furthermore, after excluding outliers, the external validation using data from Hospital B showed that the C-statistic was moderate and the calibration was not significantly different, indicating a correct prediction. Exclusion of outlier data for age, total cholesterol, HDL cholesterol, BMI, and urinary albumin levels was a condition used in this prediction equation.
Unlike the JJ risk engine, our prediction equation incorporates the use of antidiabetic drugs. In addition, while risk engines developed in other countries have incorporated therapeutic drugs (dyslipidemia drugs) into the prediction equation [22], our study is the first to incorporate therapeutic drugs (diabetes drugs) into the prediction equation in Japan. This prediction equation is intended for the primary prevention of CHD in Japanese patients with type 2 diabetes. Compared with other prediction formulae, this equation has the advantage that the variables can be selected according to the characteristics of each institution; the disadvantage is that the risk of developing CHD is calculated simultaneously prior to the administration of SU drugs and insulin, because SU drugs and insulin are introduced much later in diabetes treatment.
In contrast, before the outliers were excluded, external validation using Hospital B patient data showed a significant difference in the calibration, indicating an incorrect prediction. As per previous studies [11][12][13][14], the incorrect prediction might be owing to outliers in the risk factors that influence the development of CHD, which affect the prediction accuracy.
In a previous study, the risk of developing CHD was higher in patients who used SU drugs and insulin than in those who received dipeptidyl peptidase-4 inhibitors [23], indicating that SU drugs and insulin are risk factors for CHD.
In the selection of variables, there was a significant difference in the presence or absence of SU drugs or insulin use, and it was reasonable to include it in the variables of the prediction equation. The risk of CHD associated with the use of SU drugs and insulin is consistent with the results of previous studies [23]. This prediction equation indicates that the use of diabetes medications with a high risk of hypoglycemia influences the development of CHD as a complication of type 2 diabetes.
Because of outliers, about 21% and 30% of patients were excluded in Hospital A and Hospital B, respectively. As more than 70% of the patients in each institution remained after exclusion, we believe that our prediction formula is applicable to many patients with type 2 diabetes.
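These proportions follow directly from the patient counts given in the Results, as the short check below shows. The helper name `excluded_share` is hypothetical.

```python
def excluded_share(excluded, total):
    """Return (fraction excluded, number of patients remaining)."""
    return excluded / total, total - excluded

share_a, remaining_a = excluded_share(120, 572)  # Hospital A: about 21.0% excluded, 452 remain
share_b, remaining_b = excluded_share(84, 285)   # Hospital B: about 29.5% excluded, 201 remain
```

The remaining fractions are therefore about 79% for Hospital A and about 70.5% for Hospital B.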
In the variable analysis, none of the other risk factors for CHD showed a significant difference. Therefore, in this study, these risk factors were not included as variables in the prediction equation; however, for patients with elevated laboratory values for CHD risk factors, these values will need to be considered as risks in the future. The column for LDL cholesterol under Hospital B (Table 1) is blank because the LDL cholesterol values could not be obtained at this hospital. Because the blood pressure data in this study comprise only measurements collected at the clinic, it is necessary to collect blood pressure data multiple times at the time of consultation and conduct additional analysis using more reliable blood pressure data [21].
The unique feature of this study is that the prediction equation was developed using patient data from a medium-sized hospital in Japan, rather than large-scale clinical data. Furthermore, we were able to create a prediction equation and validate it externally. Although this prediction equation focuses on drugs that tend to cause hypoglycemia, which is a risk factor for the development of CHD, some hypoglycemic drugs, such as SGLT2 inhibitors and GLP-1 receptor agonists, reduce the risk of CHD [24]. In Hospital A, we have been using these drugs at full scale since 2016. Because fewer than 9% of patients use these drugs, it is difficult to consider their impact in this study. The frequency of use of these drugs is expected to increase in the future, and it will be necessary to consider developing a prediction formula that includes them.

Conclusion
We developed an equation to predict the incidence of CHD in patients with type 2 diabetes. Based on the prediction equation developed in this study, we believe that the use of antidiabetic drugs with a high risk of hypoglycemia influences the incidence of CHD as a complication of type 2 diabetes. Because this prediction equation is based on the patient population of Hospital A, we would like to confirm the accuracy of our prediction equation in other institutions in the future.

Limitations
The analysis of this study included patients who were using medications to prevent cardiovascular disease, but the effects of these medications were not considered in the analysis. Consequently, the possibility that they may affect the outcome cannot be excluded. A history of CHD in the family could influence a patient's risk of developing CHD. However, in this study, we were unable to investigate family history, which could affect the results of the prediction equation. Therefore, family history should be considered in future studies. The relatively small number of people who developed CHD may have affected the reliability of the analysis [25].