The carbohydrates we eat break down into glucose, one of the primary sources of energy used by cells. Insulin, a hormone produced by the pancreas, acts as a kind of key that allows blood glucose to enter cells, where it is used to produce energy. Diabetes mellitus (DM) is a chronic metabolic disorder caused by a deficiency in insulin production or by the inability of the body's cells to use insulin properly. Over time, this raises blood glucose levels, a condition known as hyperglycemia, which can cause numerous health complications [1].
There are three main classifications of DM: type 1, type 2, and gestational. Type 1 diabetes can occur at any age, being more frequent in children and adolescents and corresponding to 10–20% of cases. It is characterised by little or no insulin production due to the destruction of pancreatic β cells and requires daily insulin injections to keep glucose levels under control [1].
Type 2 diabetes, on the other hand, accounts for more than 90% of cases, usually occurring in older individuals (over 40 years of age), although it can also occur in young people and children [2]. Its main characteristic is tissue resistance to insulin action, the causes of which have yet to be fully clarified, but it is strongly related to behavioural factors, such as eating habits, physical inactivity, and obesity [1]. The diagnosis relies on laboratory tests or the appearance of chronic complications when the disease is already advanced [1].
According to the 8th edition of the IDF Diabetes Atlas [1], approximately 8.8% of the world's population aged 20 to 79, or 425 million people, have DM. By 2045, the total is predicted to reach 693 million people aged between 18 and 99 years. About 50% of those affected have not been diagnosed and do not know they have the disease, which delays treatment and increases health costs [1]. Worldwide, DM accounts for about one-eighth of total health spending; it is also among the diseases with the highest death tolls, accounting for more than 10% of deaths worldwide [1].
Thus, early diagnosis is essential to avoid further complications and reduce treatment costs. However, early diagnosis is uncommon, as almost half of those affected are unaware that they have the disease [1].
Diabetes is diagnosed through laboratory tests, such as fasting plasma glucose (FPG) or plasma glucose measured 2 h after ingestion of 75 g of glucose (2 h-PG, obtained in the oral glucose tolerance test, OGTT), in addition to glycated haemoglobin (HbA1c). All of these tests can be used to diagnose diabetes mellitus, but their accuracy may differ [3]. Studies show that, compared to the cut-off points for FPG and HbA1c, the 2 h-PG value diagnoses more people with diabetes [3]. However, HbA1c has advantages, such as international standardisation of assays, lower biological variability, being unaffected by acute stress, and no need for fasting, among others [4]. Thus, given the characteristics of the methods presented, HbA1c has been increasingly indicated for screening and diagnosis of diabetes [5].
For the fasting plasma glucose test (FPG), patients with glucose levels below 100 mg/dL are considered healthy. Patients with glucose levels between 100 and 125 mg/dL are considered pre-diabetic, and patients with glucose equal to or above 126 mg/dL are considered diabetic. However, this test requires the patient to fast for at least eight hours [3]. If glucose is above 200 mg/dL at any time (regardless of fasting or glucose intake) and the patient presents symptoms, that patient is considered diabetic [3].
For the 2 h glucose test, performed after intake of 75 g of glucose, patients are considered healthy if their glucose level is below 140 mg/dL, pre-diabetic if it is between 140 and 199 mg/dL, and diabetic if it is greater than or equal to 200 mg/dL [3]. When using HbA1c, patients are classified as healthy if HbA1c is below 5.7%, pre-diabetic if it is between 5.7% and 6.4%, and diabetic if it is equal to or greater than 6.5% [3].
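The cut-off points quoted above for FPG, 2 h-PG, and HbA1c can be summarised as a small lookup, sketched here in Python for illustration only. The names `THRESHOLDS` and `classify` are hypothetical and do not come from the cited guidelines; the sketch assumes that a value exactly at a lower bound belongs to the higher category (so an FPG of exactly 100 mg/dL is treated as pre-diabetic), and it omits the rule for glucose above 200 mg/dL with symptoms, which depends on clinical information.

```python
# Illustrative sketch only, not a clinical tool.
# Each entry maps a test to (lower bound of pre-diabetes, lower bound of diabetes),
# using the cut-off points quoted in the text [3].
THRESHOLDS = {
    "fpg": (100.0, 126.0),      # fasting plasma glucose, mg/dL
    "ogtt_2h": (140.0, 200.0),  # plasma glucose 2 h after 75 g of glucose, mg/dL
    "hba1c": (5.7, 6.5),        # glycated haemoglobin, %
}


def classify(test: str, value: float) -> str:
    """Return 'healthy', 'pre-diabetes', or 'diabetes' for a single test result.

    Assumption: a value equal to a lower bound falls in the higher category.
    """
    prediabetes_from, diabetes_from = THRESHOLDS[test]
    if value >= diabetes_from:
        return "diabetes"
    if value >= prediabetes_from:
        return "pre-diabetes"
    return "healthy"


if __name__ == "__main__":
    # Hypothetical example values, one per test.
    for test, value in [("fpg", 110), ("ogtt_2h", 205), ("hba1c", 5.4)]:
        print(test, value, classify(test, value))
```

Because the three tests can disagree for the same patient, a sketch like this classifies each result separately rather than combining them into a single diagnosis.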
Considering that approximately 60% of patients present no symptoms in the initial phase of the disease [6], patients must undergo some of these tests for DM to be detected, but most people without symptoms do not pursue them. Among the tests performed by a laboratory in Florianópolis, Santa Catarina, Brazil, in 2017, the blood count with analytes was the most frequently performed exam. That year, FPG was the third most performed test, HbA1c occupied the 52nd position, and the measurement of glucose 2 h after ingestion of 75 g of glucose was in the 409th position.
These circumstances hinder the early diagnosis of diseases such as type 2 diabetes. Patients often undergo several exams throughout their lives that may be useful in analysing their health; however, physicians may overlook relevant results or fail to notice patterns in the laboratory dataset, because valuable information related to a diagnosis may be too subtle for a human to identify without adequate computational support [7].
To interpret these results correctly, clinicians must evaluate many tests together with other clinical data while considering the patient's history. Although this manual approach to exam interpretation is standard in most cases, computational approaches to laboratory data integration and analysis offer great potential in the search for diagnoses [8].
Clinical laboratories present most test results as individual numerical values. However, the results of these tests viewed in isolation usually have limited usefulness in obtaining a diagnosis. Luo [8], in his ferritin study, found that laboratory tests often include redundant information. Thus, through machine learning-based models, it was possible to predict the results of ferritin laboratory tests from the result sets of other laboratory tests from each patient, providing additional information to refine the diagnosis.
In the same study, Luo [8] reported a high false-negative rate for measured ferritin results when they were compared to the computational model. This illustrates that, with access to large databases, intelligent systems can improve the interpretation of laboratory test results.
Similarly, Gunčar [9] found that machine learning models can be used to predict hematological diseases using blood tests only. In that study, Gunčar reports that laboratory tests contain more information than is commonly considered by health professionals.
Having evolved significantly in recent years [10], machine learning methods are powerful tools for supporting medical diagnoses. Studies [9, 11, 12] have shown that these methods are capable of predicting and identifying diseases based on laboratory tests and clinical data with accuracy similar to that of a human specialist. Other studies [13, 14, 15] have used machine learning techniques to assist in the diagnosis of diabetes.
Given the facts presented, this study uses a database of laboratory tests to predict possible diseases in individual patients. The main goal is to predict, or assist in the diagnosis of, diabetes mellitus using routine examinations and machine learning techniques.