Background: Type 2 diabetes is often diagnosed at a late stage because attendance at health examinations is low, especially among demographic groups that face financial constraints or differ in self-care routines and medical habits. Reliable predictive tools for type 2 diabetes are therefore increasingly needed.
Methods: To address this, we collected routine health checkup data from patients with diabetes and used their clinical symptoms, demographics, and diabetes knowledge to predict disease onset. Data from 444 Nigerian patients were used to develop a predictive model, with 80% of the dataset used for training and the remaining 20% for testing.
Results: Using multivariable penalised logistic regression, we predicted type 2 diabetes with an AUC of 99% (95% confidence interval [CI] = 97% - 100%) on the training set and 94% (95% CI = 89% - 99%) on the test set, with waist-hip ratio (WHR), triglycerides (TG), catalase, and atherogenic index of plasma (AIP) as informative features. WHR and AIP were associated with significantly higher adjusted odds of type 2 diabetes (WHR: adjusted odds ratio [AOR] = 70.35; 95% CI = 10.04 - 493.1, p-value < 0.0001; AIP: AOR = 4.55; 95% CI = 1.48 - 13.95, p-value = 0.0081). TG had an adjusted odds ratio only slightly above 1 (AOR = 1.04; 95% CI = 0.40 - 2.71), and this association was not statistically significant (p-value = 0.9377). Conversely, catalase was associated with significantly lower adjusted odds of type 2 diabetes (AOR = 0.33; 95% CI = 0.22 - 0.49, p-value < 0.0001).
Conclusion: Our study revealed associations between clinical symptoms and type 2 diabetes, emphasising the crucial role of early diagnosis. A web application is provided to aid the early identification of at-risk individuals, reduce health complications, and address the gaps left by inconsistent checkups.
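
For readers who want to relate the adjusted odds ratios, confidence intervals, and p-values reported in the Results, the sketch below back-calculates the log-odds coefficient, its standard error, and a two-sided p-value from each reported AOR and 95% CI. It assumes the intervals are Wald intervals, symmetric on the log-odds scale; the abstract does not state how the intervals were obtained, so this is an illustrative consistency check and not the authors' procedure. The function name `wald_summary` and the dictionary `reported` are hypothetical.

```python
import math

def wald_summary(aor, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, SE, z and a two-sided p-value from an odds ratio and its 95% CI.

    Assumption: the CI is a Wald interval, exp(beta +/- z_crit * SE).
    """
    beta = math.log(aor)                                   # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se                                          # Wald z statistic
    p = math.erfc(abs(z) / math.sqrt(2))                   # two-sided normal tail
    return beta, se, z, p

# AOR, lower and upper 95% CI limits as reported in the Results.
reported = {
    "WHR": (70.35, 10.04, 493.1),
    "AIP": (4.55, 1.48, 13.95),
    "TG": (1.04, 0.40, 2.71),
    "catalase": (0.33, 0.22, 0.49),
}

for name, (aor, low, high) in reported.items():
    beta, se, z, p = wald_summary(aor, low, high)
    print(f"{name:9s} beta={beta:+.3f} SE={se:.3f} z={z:+.2f} p={p:.4f}")
```

Under the Wald assumption, the back-calculated p-values land close to the reported ones (about 0.008 for AIP and about 0.94 for TG), and an AOR below 1, as for catalase, corresponds to a negative coefficient.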