Bone Metabolic Biomarkers-Based Diagnosis of Osteoporosis Caused by Diabetes Mellitus using Support Vector Machine

Background: Diabetes has significant effects on bone metabolism, and both type 1 and type 2 diabetes can cause osteoporotic fracture. However, it remains challenging to diagnose osteoporosis in type 2 diabetes by bone mineral density, which does not change in a regular pattern in these patients. From another perspective, osteoporosis can be ascribed to an imbalance of bone metabolism, which is also closely related to diabetes.
Method: To assist clinicians in diagnosing osteoporosis in type 2 diabetes, an efficient and simple SVM model was established based on different combinations of biochemical indices, including bone turnover markers, calcium and phosphorus. The classification performance was measured using several evaluation metrics.
Results: The prediction accuracy of the final model is above 88%, with the feature combination of Sex, Age, BMI, TP1NP and OSTEOC.
Conclusion: Experimental results show that the model achieves the anticipated performance for early detection and daily monitoring of type 2 diabetic osteoporosis.
Keywords: Bone Turnover Markers, Support Vector Machine, Type 2 Diabetes, Osteoporosis.
Running title: AI (artificial intelligence) for disease diagnosis


Background
Diabetes and osteoporosis are two of the most common diseases of the modern age, and osteoporosis is often a complication of diabetes [1]. Traditionally, the dual-energy x-ray absorptiometry (DEXA) test is the gold standard for evaluating and diagnosing osteoporosis by measuring bone mineral density (BMD). However, the BMD test has been found to be not always suitable for diagnosing osteoporosis caused by diabetes. Type 1 diabetes mellitus (T1DM) inhibits the formation of new bone so that the BMD decreases, which resembles common osteoporosis [2]. In contrast with T1DM, the BMD changes in type 2 diabetes mellitus (T2DM) are often irregular [3], and the BMD sometimes appears normal or even higher [4]. Nevertheless, for T2DM patients, the risk of osteoporotic fracture is higher than clinically expected and the therapeutic plan is different [5,6]. Thus, accurate evaluation of bone health by BMD in T2DM patients remains a critical problem in clinical practice. An effective method for diagnosing osteoporosis that can be used routinely for T2DM patients is required.
The serological test is common in routine physical examinations. Because both diabetes and osteoporosis are metabolic diseases, the serological test should be sensitive to the associated changes in the body. Moreover, studies have revealed that the glucose level is closely associated with bone health as well [7][8][9]. Diabetes can result in vitamin-D deficiency [10], and hyperglycemia can suppress osteoblastic formation [11]. Thus, testing bone turnover markers (BTMs) may be able to indicate osteoporosis in T2DM patients. At present, BTM testing is not as widely used clinically as DEXA [12] because of the lack of specific biomarkers for osteoporosis [13]. To solve this issue, an alternative is to use computers to analyze large amounts of data so that the accuracy of the diagnostic result can be enhanced. With the development of artificial intelligence, this strategy has been applied in diverse areas of healthcare [14][15][16].
Here, we utilized a support vector machine (SVM) to analyze a database of T2DM patients, and the algorithm can effectively predict osteoporosis in T2DM patients. SVM, also called a large-margin classifier, seeks the classification hyperplane that maximizes the margin, i.e. the distance between the hyperplane and the support vectors, which are the training points closest to it. With robust classification ability and excellent generalization performance, SVM requires only a few parameters to be tuned and can be trained on only hundreds of samples [17]. We propose an SVM-based method to diagnose osteoporosis in T2DM based on the BTMs of serological testing. Different combinations were generated as input sets according to the importance of the testing items (an introduction to the common BTMs is given in Supporting Information, Note S1). Multiple SVM models with different input sets were established, among which the combination of TP1NP (total procollagen I n-terminal propeptide), OSTEOC (osteocalcin), gender, age and BMI (body mass index) showed the best performance. The diagnostic accuracy can reach 88%. Surprisingly, ALP (alkaline phosphatase), a common biomarker for osteogenesis, was found to have an insignificant effect on the classification model. These results demonstrate that computer science can boost traditional means of diagnosis and play an increasingly important role in the diagnosis of chronic diseases.

Datasets
The data used in this study were collected from the Department of Endocrine, Zhongda Hospital affiliated to Southeast University. The dataset distribution is shown in Figure 1. For the modeling dataset, 202 qualified samples were collected from patients from Jan. 2016 to Mar. 2018, and each sample consists of 10 attributes: gender, age, BMI, and the levels of Ca, P, ALP, TP1NP, PICP, OSTEOC and VIT-D.

Implementation design
After the modeling dataset was established, the SVM algorithm was used for the classification task. The flowchart of data processing is shown in Figure 2. The classification was implemented with scikit-learn, a machine learning software package for Python. The detailed steps are given as follows.

1) Data Preprocessing
Every sample whose diagnosis was T2DM complicated by osteoporosis was labeled as positive (1) and assigned to the positive class. If the diagnosis was T2DM only, the sample was labeled as negative (−1) and assigned to the negative class. Gender was encoded as 1 for female and 0 otherwise. Because age has a significant influence on the risk of osteoporosis for both men and women, the ages were grouped and a weight was set for each group, as shown in Supporting Information, Table S1. The weight of each group depended on the number of samples in that group.
Because the attributes have different ranges, the data were normalized to keep attributes with large numeric values from dominating the calculation results, as given by Eq. (1):

$$y' = \mathrm{lower} + (\mathrm{upper} - \mathrm{lower}) \cdot \frac{y - y_{\min}}{y_{\max} - y_{\min}} \tag{1}$$

This formula converts each feature value to a specific interval, where y is the data before scaling, y' is the scaled data, y_min and y_max are the minimum and maximum values of the attribute, and lower and upper are the lower bound and upper bound of the given interval, respectively. In this study, all attributes were initially considered equally important, and the data were scaled into [0, 1].
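The scaling into [0, 1] described above can be sketched as a per-attribute min-max transformation. This is a minimal illustration rather than the authors' code: the function name `scale_column` and the BMI values are hypothetical, and it assumes the minimum and maximum are taken over each attribute separately.

```python
def scale_column(values, lower=0.0, upper=1.0):
    """Linearly rescale a list of numbers into [lower, upper]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant attribute has no spread; map every value to the lower bound.
        return [lower for _ in values]
    span = v_max - v_min
    return [lower + (upper - lower) * (v - v_min) / span for v in values]


# Hypothetical BMI values, for illustration only (not taken from the study)
bmi = [18.5, 22.0, 27.5, 31.0]
scaled_bmi = scale_column(bmi)
print(scaled_bmi)
```

In scikit-learn, `MinMaxScaler` from `sklearn.preprocessing` performs the same kind of transformation and can be fitted on the training data only, which avoids leaking information from held-out samples.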

2) Modification of imbalanced data
Real-world data are often imbalanced, especially data collected directly from the clinic. Classifiers trained on such data tend to be biased toward the majority class, because predicting the majority class yields a higher overall accuracy. In our dataset, there were 40 samples (19.8%) in the positive class and 162 samples (80.2%) in the negative class. The Synthetic Minority Oversampling Technique (SMOTE) was adopted because of the limited sample size in this experiment [18]. After SMOTE, the size of the minority class was increased from 40 to 162, so the final dataset contained 324 samples in total.
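The idea behind SMOTE can be sketched in a few lines: each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The sketch below is a simplified, SMOTE-style illustration on random toy data, not the implementation used in the study; the function name `smote_like_oversample`, the choice of k = 5 and the two-dimensional toy points are all assumptions.

```python
import random


def smote_like_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy stand-in for the 40 positive samples (random values, not study data)
toy_rng = random.Random(1)
positives = [[toy_rng.random(), toy_rng.random()] for _ in range(40)]

# Grow the minority class from 40 to 162 samples, matching the majority class
new_points = smote_like_oversample(positives, n_new=162 - 40)
balanced_positives = positives + new_points
print(len(balanced_positives))  # 162
```

For real use, the `SMOTE` class in the imbalanced-learn package (`imblearn.over_sampling`) offers a maintained implementation; whether the study used it is not stated in the text.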

3) Selection of important features
Each feature has a different impact on the classification result. Therefore, based on the original data, the importance of the features was estimated using tree-based estimators. The features were ranked in order of importance as shown in Figure 3; a more important feature received a larger weight. To maintain classification accuracy while reducing the computing cost, the data dimensions were reduced by ignoring less important features. Six combinations of the attributes, called Test 1-6, were tested, as shown in Supporting Information, Table S2. In each round, 323 samples were used as the training set and 1 sample was held out for testing (leave-one-out validation). After repeating this 202 times in each test, the classification performance was obtained.
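The ranking and leave-one-out steps above might look like the following with scikit-learn. This is a sketch under stated assumptions: the text only says "tree-based estimators", so `ExtraTreesClassifier` is one possible choice; the data are synthetic, the labelling rule is arbitrary, and keeping the top five features is purely illustrative.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.svm import SVC

feature_names = ["gender", "age", "BMI", "Ca", "P",
                 "ALP", "TP1NP", "PICP", "OSTEOC", "VIT-D"]

# Synthetic stand-in data: 60 samples, 10 attributes already scaled to [0, 1]
rng = np.random.default_rng(0)
X = rng.random((60, len(feature_names)))
# Arbitrary labelling rule so that some columns carry signal (not a clinical rule)
y = np.where(X[:, 6] + 0.5 * X[:, 1] > 0.8, 1, -1)

# Rank features by impurity-based importance from a tree ensemble
forest = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = forest.feature_importances_
order = np.argsort(importances)[::-1]
ranking = [feature_names[i] for i in order]
print(ranking)

# Leave-one-out evaluation of an SVM restricted to the top-5 features:
# each sample is predicted by a model trained on all the other samples
top_k = order[:5]
pred = cross_val_predict(SVC(kernel="rbf"), X[:, top_k], y, cv=LeaveOneOut())
loo_accuracy = float(np.mean(pred == y))
print(loo_accuracy)
```

Comparing `loo_accuracy` across several column subsets mirrors the idea of Test 1-6, where different attribute combinations are scored with the same validation scheme.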

4) Parameter optimization
To map the original low-dimensional space into a high-dimensional feature space, the training set was modeled with various kernel functions, including the radial basis function, polynomial and sigmoid kernels. To improve the generalization ability, a soft margin was introduced and controlled through the penalty coefficient C. Parameter C sets the trade-off between maximizing the margin and tolerating noise, i.e. misclassified training samples. A larger C penalizes misclassified training samples more heavily, so the classification is more rigorous and yields fewer training errors, at the cost of a narrower margin. In addition, when the Gaussian (radial basis function) kernel was selected, the complexity of the model could be adjusted by changing the parameter gamma. A larger gamma narrows the influence of each training sample, so the classification boundary becomes more complex. To obtain the best parameters for each model, cross validation was used. As mentioned above, the 323 samples in the training set were divided into 5 sub-sample sets. One sub-sample set was selected randomly for validation while the other sets were used for training. After multiple rounds of training and validation, an average training score was obtained. The model with the highest score was considered the best one. By completing the above operations, the SVM model was established.
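A parameter search of this kind can be sketched with scikit-learn's `GridSearchCV`, which scores every combination of kernel, C and gamma by 5-fold cross validation and keeps the best one. The grid values, the synthetic data and the use of `StratifiedKFold` are assumptions for illustration; the text does not report the ranges that were actually searched.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in data: 100 samples, 5 attributes scaled to [0, 1]
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = np.where(X[:, 0] + X[:, 1] > 1.0, 1, -1)  # arbitrary labelling rule

# Hypothetical search grid over the kernels and parameters named in the text
param_grid = {
    "kernel": ["rbf", "poly", "sigmoid"],
    "C": [0.1, 1, 10, 100],
    "gamma": [0.01, 0.1, 1],
}

# 5-fold cross validation: each fold serves once as the validation set
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(SVC(), param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_  # mean validation accuracy of the best setting
print(best_params, best_score)
```

Stratified folds keep the class ratio similar in every fold, which is a common precaution when the classes were imbalanced before oversampling.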

5) Evaluation of the classification performance
The classification performance was evaluated using four metrics based on the confusion matrix: accuracy, precision, recall and the area under the receiver operating characteristic curve (ROC-AUC value). The accuracy, the precision and the recall were calculated by Eq. (2)-(4):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{3}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{4}$$

where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, respectively. For the ROC-AUC value, values above 0.9 indicate excellent prediction, between 0.7 and 0.9 good, between 0.5 and 0.7 poor, and any value below 0.5 is considered no better than a random guess [19].

Based on Figure 3, the top 5 attributes were TP1NP, age, P content, gender and OSTEOC, which are of great guiding significance in the diagnosis of T2DM complicated with osteoporosis. Interestingly, VIT-D, BMI and especially ALP were found to be of little importance. Moreover, both TP1NP and PICP have been reported to indicate bone formation [20]. However, the AI results showed that TP1NP is more sensitive than PICP in the BTM-based diagnosis of T2DM complicated with osteoporosis. Surprisingly, ALP, the commonly preferred biomarker of osteogenesis, was at the bottom of the ranking. Also, BMI seems less closely associated with osteoporosis than is commonly thought [21]. Besides, Ca, PICP and VIT-D also showed less importance than expected. These results will be helpful for physicians in the clinical diagnosis of T2DM complicated with osteoporosis.

Classification results
SVM-based classification algorithms are often evaluated using confusion matrices, as shown in Supporting Information, Fig. S1. For evaluation and comprehensive analysis of each classifier, the classification performances of the 6 tests are listed in Supporting Information, Table 3 and plotted in Figure 4. Tests 1, 2 and 3 achieved over 85% accuracy and over 50% precision. There was a positive correlation between the number of attributes and the classification accuracy. The accuracy of Test 1, which included 10 attributes, was remarkably higher than that of Tests 4, 5 and 6, which included 4 or 5 attributes. At the same time, it should be noted that the precision in all tests was relatively low because of the imbalanced data used in verification. As the number of features decreased, the truly positive samples became more difficult to distinguish from the samples predicted as positive.
One valuable conclusion is that not all of the testing items are needed. Compared with Test 1, Test 2 with 7 attributes showed nearly the same classification accuracy and ROC-AUC value. The recall of Test 2 was even higher than that of Test 1, indicating that it is feasible to use a few of the most influential testing items for diagnosis. With the same number of dimensions, Test 3 obtained a higher classification accuracy and ROC-AUC value than Tests 4 and 5, which demonstrates that TP1NP from Test 3 is a better attribute than PICP and ALP from Tests 4 and 5. This may suggest that TP1NP is a more specific indicator of bone metabolism in BTM testing.
In addition, too few attributes are inadequate to yield correct results with the SVM algorithm. Test 6, which retained only the most important attributes, showed poor classification performance. Reducing the number of testing items is beneficial as long as the performance does not decrease. In fact, because of the complex interactions among organs and systems, the biochemical information from clinical tests may be redundant. With AI technology, some diseases can be diagnosed using relatively simple testing items, which significantly reduces the cost. Furthermore, AI can establish connections between the phenotype observed in serological testing and the development of disease. This is important for the diagnosis of degenerative diseases such as osteoporosis, because there remains no highly specific biomarker for these diseases.
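To make the comparison of accuracy, precision and recall concrete, the standard confusion-matrix definitions can be computed as in the sketch below. The helper names and the toy label vectors are hypothetical and are not taken from the study; the positive class is labelled 1 and the negative class −1, as in the preprocessing step.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification result."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and recall from the confusion-matrix counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of actual positives that are found
    return accuracy, precision, recall


# Toy labels: 3 positives and 5 negatives (illustrative only)
y_true = [1, 1, -1, -1, -1, -1, 1, -1]
y_pred = [1, -1, -1, 1, -1, -1, 1, -1]
counts = confusion_counts(y_true, y_pred)
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(counts)  # (2, 4, 1, 1)
print(accuracy, precision, recall)
```

With few positive samples, a single false positive lowers the precision far more than it lowers the accuracy, which is consistent with the relatively low precision reported for the imbalanced verification data.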

Conclusion
In this paper, an SVM algorithm was used to identify osteoporosis in T2DM patients based on several serological items and personal information. The accuracy can exceed 85%, showing promising potential for the clinical diagnosis of T2DM complicated with osteoporosis. This method is cheap, safe and extensible. Interestingly, some findings contrary to common expectations were obtained, such as ALP playing an insignificant role in the AI-based diagnosis. These results will be helpful for the clinical and point-of-care testing (POCT) diagnosis of osteoporosis and for deepening the investigation of its pathological mechanism.