SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), which caused the COVID-19 pandemic, exerts its effect by infecting pulmonary epithelial cells through the ACE-2 (angiotensin-converting enzyme-2) receptor [1]. While SARS-CoV-2 causes damage to the pulmonary epithelium, it also infects macrophages using ACE-2 receptors and leads to their activation [2]. Macrophages, neutrophils, and T cells can be activated through sustained elevation of important cytokines such as interleukin (IL)-1, IL-6, and tumor necrosis factor (TNF) alpha. Eventually, type 2 pneumocyte apoptosis may be induced and patients may directly face acute respiratory distress syndrome (ARDS) [2]. At this point, host responses can be enhanced in some cases by the overwhelming expression of proinflammatory cytokines. This "cytokine storm" can be blamed for some of the serious events of COVID-19, such as ARDS [3,4].
Vitamin D (Vit-D) is a steroid hormone produced in the body when human skin is exposed to ultraviolet (UV) B rays, and it has important roles in calcium and phosphorus metabolism as well as bone mineralization [5,6]. Cutaneous production is the major source of Vit-D, although foods such as dairy products, fish oil, and liver also contain noticeable amounts [6,7]. Synthesis and availability of Vit-D depend on many factors, including exposure to sunlight, latitude, season, time of day, skin pigmentation, age, sex, and body mass index. Within the cell, Vit-D binds to nuclear Vit-D receptors (VDRs), and the activated VDR in turn binds to Vit-D response elements (VDREs) in the DNA. In particular, the activated VDR can attach to the VDRE in the cathelicidin gene promoter and thus cause the host to initiate defense against some viral infections. Vit-D can also directly affect the innate immune system through the expression of lysosomal enzymes and the release of nitric oxide. In both cases, Vit-D directly contributes to the fight against infection. Thus, Vit-D plays an immunomodulatory role in SARS-CoV-2 infection by suppressing the adaptive immune response in respiratory epithelial cells [5,8,9].
Vit-D deficiency is a recognized risk factor for many diseases, such as autoimmune diseases [10], cardiovascular diseases [11], obesity [12], and cancer [13]. Vit-D deficiency has been clearly demonstrated in several studies conducted in many countries, including Turkey [14,15]. Furthermore, with the curfews imposed as part of the measures taken during the COVID-19 pandemic, the amount of daylight to which individuals are exposed has been decreasing, which contributes to Vit-D deficiency.
In recent years, interest in studies on Vit-D has increased. Most of these studies focus on disease pathogenesis and the etiology of Vit-D deficiency [18-23]. Degerud et al. (2016) investigated seasonal Vit-D levels in cardiovascular diseases [40]. In another study, Zgaga et al. (2022) evaluated statistical power in randomized controlled trials investigating Vit-D [41]. However, no study has aimed at the mathematical modeling of Vit-D. This study aimed to examine Vit-D deficiency not only biologically but also mathematically, and to produce a mathematical model to determine the level of Vit-D if the COVID-19 measures continued.
Materials and methods
This study takes two different approaches: one is to create an empirical model, and the other is to carry out a statistical analysis of the data. This retrospective, cross-sectional, methodological study included 125,643 patients (18–75 years) who were admitted to the Dokuz Eylul University Hospital for various reasons and whose Vit-D levels were measured in the Biochemistry unit between 2019-2020 and 2020-2021 (before and after the COVID-19 outbreak). The year 2020 was taken to reflect the period of the COVID-19 pandemic, and 2019 the non-pandemic period. The dependent variable of the study was Vit-D level, and the independent variables were age, sex, season, and month. This study was approved by the Dokuz Eylul University Non-Interventional Research Ethics Committee (Date: 05.01.2022, Approval No: 2022/01-26). Due to the retrospective observational nature of the study, written informed consent was not required, and all procedures followed the Declaration of Helsinki.
Vitamin D total analytical properties
Total 25(OH) Vit-D was determined quantitatively in vitro in human serum and plasma (EDTA, lithium-heparin, sodium-heparin) using ADVIA Centaur XP systems. The ADVIA Centaur Vit-D test is based on the chemiluminescence immunoassay method. It is a competitive immunoassay using an anti-25(OH) Vit-D monoclonal mouse antibody labeled with acridinium ester (AE) and a fluorescently labeled Vit-D analog. Analysis can be monitored with the REF 10493589 ADVIA Centaur Vit-D Calibrator. Three levels of REF 10632229 ADVIA Centaur Vit-D Control are used as internal quality control.
Performance characteristics of the analysis
Measuring range: The ADVIA Centaur Vit-D test measures 25(OH) Vit-D in the concentration range from 4.2 to 150 ng/ml (10.5 to 375 nmol/l). Sensitivity: The functional sensitivity of the ADVIA Centaur Vit-D test is 3.33 ng/ml (8.33 nmol/l). Repeatability (precision): 4.93% within-run, 5.44% between-run. Method comparison: ADVIA Centaur Vit-D = 1.15 (LC/MS/MS) + 0.70 ng/ml, r = 0.91. In the verification study performed in the laboratory, the total error was found to be 12.97%.
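The figures above can be checked with a few lines of Python. This is a minimal sketch: the factor of 2.5 between ng/ml and nmol/l is inferred from the stated ranges, and the function names are hypothetical and not part of any assay software.

```python
# Conversion factor inferred from the stated ranges (4.2 ng/ml ~ 10.5 nmol/l).
NG_ML_TO_NMOL_L = 2.5


def ng_ml_to_nmol_l(value_ng_ml: float) -> float:
    """Convert a 25(OH) Vit-D concentration from ng/ml to nmol/l."""
    return value_ng_ml * NG_ML_TO_NMOL_L


def advia_from_lcms(lcms_ng_ml: float) -> float:
    """Reported method-comparison line: ADVIA = 1.15 * (LC/MS/MS) + 0.70 ng/ml."""
    return 1.15 * lcms_ng_ml + 0.70


def in_measuring_range(value_ng_ml: float, low: float = 4.2, high: float = 150.0) -> bool:
    """True if the value lies inside the stated measuring range (ng/ml)."""
    return low <= value_ng_ml <= high
```

For instance, `advia_from_lcms(20.0)` gives the ADVIA value the regression line predicts for an LC/MS/MS result of 20 ng/ml, and `in_measuring_range` flags results that fall outside 4.2 to 150 ng/ml.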
Description of the dataset
As mentioned earlier, the dataset was provided by "The Biochemical Laboratory of Dokuz Eylul University". The raw data include 125,643 analysis results. The features of the dataset are age, date of birth, sex, date of analysis, and the Vit-D score of the patient. Among these features, "sex" is categorical, the date of analysis and the date of birth are date-valued, and the rest are numerical.
The raw data were cleaned carefully with the consensus of the authors. First, the unnecessary columns were removed. The study was then limited to the age range of 18-75 years to avoid a potential age-related bias. Next, both the incidental values and the null values were removed, followed by the outliers. To investigate the seasonal effects of COVID-19 on Vit-D levels, the 'Date of Analysis' field was converted into "Year" and "Month". Finally, the data were grouped by year, taking the mean of the Vit-D Score.
After cleaning, the dataset contained 86,772 samples and 5 features: "age", "sex", "Vit-D Score", "year", and "month".
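The cleaning steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' pipeline: the field names (`vit_d`, `analysis_date`) are hypothetical, the sample records are invented, and the 1.5 × IQR outlier rule is an assumption because the text does not say which outlier criterion was used.

```python
from datetime import date
from statistics import mean, quantiles

REQUIRED = ("age", "sex", "vit_d", "analysis_date")  # hypothetical field names


def clean_records(records, min_age=18, max_age=75):
    """Drop nulls, restrict the age range, remove outliers, derive year/month."""
    # 1. Remove rows with missing values.
    rows = [r for r in records if all(r.get(k) is not None for k in REQUIRED)]
    # 2. Restrict to the studied age range.
    rows = [r for r in rows if min_age <= r["age"] <= max_age]
    # 3. Remove outliers with a 1.5 * IQR rule (ASSUMPTION: rule not given in the text).
    if len(rows) >= 2:
        q1, _, q3 = quantiles([r["vit_d"] for r in rows], n=4)
        spread = 1.5 * (q3 - q1)
        rows = [r for r in rows if q1 - spread <= r["vit_d"] <= q3 + spread]
    # 4. Split the analysis date into year and month.
    return [
        {"age": r["age"], "sex": r["sex"], "vit_d": r["vit_d"],
         "year": r["analysis_date"].year, "month": r["analysis_date"].month}
        for r in rows
    ]


def mean_by_year_month(rows):
    """Mean Vit-D score for each (year, month) pair."""
    groups = {}
    for r in rows:
        groups.setdefault((r["year"], r["month"]), []).append(r["vit_d"])
    return {key: mean(values) for key, values in sorted(groups.items())}


# Invented sample records, for illustration only.
sample = [
    {"age": 34, "sex": "F", "vit_d": 18.0, "analysis_date": date(2019, 1, 10)},
    {"age": 51, "sex": "M", "vit_d": 20.0, "analysis_date": date(2019, 1, 22)},
    {"age": 45, "sex": "F", "vit_d": 22.0, "analysis_date": date(2019, 7, 3)},
    {"age": 29, "sex": "M", "vit_d": 24.0, "analysis_date": date(2019, 7, 15)},
    {"age": 62, "sex": "F", "vit_d": 26.0, "analysis_date": date(2020, 1, 8)},
    {"age": 38, "sex": "M", "vit_d": 21.0, "analysis_date": date(2020, 1, 19)},
    {"age": 57, "sex": "F", "vit_d": 23.0, "analysis_date": date(2020, 7, 9)},
    {"age": 40, "sex": "M", "vit_d": 500.0, "analysis_date": date(2020, 7, 21)},  # outlier
    {"age": 80, "sex": "F", "vit_d": 19.0, "analysis_date": date(2020, 2, 2)},    # outside age range
    {"age": 33, "sex": "M", "vit_d": None, "analysis_date": date(2019, 3, 5)},    # missing value
]
cleaned = clean_records(sample)
monthly = mean_by_year_month(cleaned)
```

Grouping by (year, month) rather than by year alone keeps the seasonal signal available for the curve fitting that follows.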
Mathematical model
Understanding the mechanisms behind a disease well enough to interpret clinical approaches is not simple. Mathematical models can be instrumental in understanding the biological behavior of such a system. A suitable mathematical model depends on many choices, such as the selection of the crucial features of the system, the choice of specific parameters, the use of the correct mathematical tools, and physical compatibility. Together with the complexity of human anatomy, these issues make developing a mathematical model a challenging problem.
Various types of mathematical models can be used for different purposes in the applied sciences. Among them, empirical models stand out because they are built from experimental data rather than physical theory. In an empirical model, statistical inference is used to select the model parameters. However, the construction process has its own difficulties and limitations. One of the greatest challenges in building an empirical model is that categorical values in the dataset cannot be used directly. This makes the task more difficult but also more attractive.
In our study, the collected dataset contains both numerical and categorical values. Therefore, the categorical values were studied first. For this purpose, the significance of the difference in mean Vit-D score between the sexes was tested for each month. Moreover, to obtain a continuous mathematical model, the age column was excluded. Thereafter, for the seasonal study, the mean levels of Vit-D were grouped by year.
Vit-D levels follow a wave-like curve over time. It is worth noting that the results can vary from dataset to dataset. The following functions can be utilized to describe a model for such a curve:
where \({a}_{i},{b}_{i},{c}_{i}\) and \(\omega\) are constants. Moreover, \({\widehat{L}}_{Vit-D}\left(t\right)\) represents the approximated value of Vit-D levels where \(t\) stands for time (in months). Furthermore, \({N}_{e}\) denotes the number of terms of the approximation.
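The symbols defined here match two standard periodic forms, a truncated Fourier series and a sum of sines. As an illustration only, and under the assumption that forms of this kind are intended, they can be written as:

$${\widehat{L}}_{Vit-D}\left(t\right)={a}_{0}+\sum_{i=1}^{{N}_{e}}\left[{a}_{i}\cos\left(i\omega t\right)+{b}_{i}\sin\left(i\omega t\right)\right]$$

$${\widehat{L}}_{Vit-D}\left(t\right)=\sum_{i=1}^{{N}_{e}}{a}_{i}\sin\left({b}_{i}t+{c}_{i}\right)$$

In the first form, \(\omega\) sets the fundamental frequency shared by all terms. In the second, each term carries its own amplitude \({a}_{i}\), frequency \({b}_{i}\), and phase \({c}_{i}\).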
The rearranged data were randomly divided into training (70%) and test (30%) sets. The goodness of fit of the models was assessed by \({R}^{2}\) and Adjusted-\({R}^{2}\) (A_\({R}^{2}\)) such that:
$${R}^{2}=1-\frac{SSE}{SST} \tag{3}$$
$$A\_{R}^{2}=1-\left(\frac{n-1}{n-p}\right)\frac{SSE}{SST} \tag{4}$$
Here, SSE and SST denote the sum of squared errors and the total sum of squares, respectively. Additionally, n is the number of observations, and p is the number of regression coefficients.
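The two goodness-of-fit measures defined above translate directly into code. The following is a minimal sketch with hypothetical function names and invented numbers, intended only to show how SSE, SST, n, and p enter each formula.

```python
def _sse_sst(observed, predicted):
    """Return (SSE, SST) for paired observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    sst = sum((y - mean_obs) ** 2 for y in observed)
    return sse, sst


def r_squared(observed, predicted):
    """R^2 = 1 - SSE / SST."""
    sse, sst = _sse_sst(observed, predicted)
    return 1.0 - sse / sst


def adjusted_r_squared(observed, predicted, p):
    """Adjusted R^2 = 1 - ((n - 1) / (n - p)) * SSE / SST, with p regression coefficients."""
    n = len(observed)
    sse, sst = _sse_sst(observed, predicted)
    return 1.0 - ((n - 1) / (n - p)) * sse / sst


# Invented values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2 = r_squared(observed, predicted)
adj_r2 = adjusted_r_squared(observed, predicted, p=2)
```

Because the factor (n - 1)/(n - p) exceeds 1 whenever p > 1, the adjusted value is never larger than the plain value, which penalizes models with more coefficients.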
Statistical analysis
In the evaluation of the data, descriptive statistics, including means, median values (percentiles), and standard deviations (sd), were calculated. Conformity to the normal distribution was checked with the Kolmogorov-Smirnov test. Student's t-test and ANOVA were used to examine differences in continuous variables between groups. All computational studies were performed using MATLAB software (version 2021b). The model was fitted with the MATLAB Curve Fitting Toolbox. In the fit, SSE was minimized with the Trust-Region algorithm, and all other parameters were left at their default values. SPSS (version 24.0) and R (version 4.1.3) were also used for the analysis and visualization of statistical data. Statistical significance was set at a 2-sided p < .05.
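The fitting step can be illustrated in Python. The sketch below is not the Curve Fitting Toolbox procedure: it assumes a single-harmonic model with the period fixed at 12 months, which makes the problem linear, so SSE is minimized by ordinary least squares instead of the Trust-Region algorithm. The data are synthetic, and all names are hypothetical.

```python
import math
import random

OMEGA = 2 * math.pi / 12  # ASSUMPTION: one cycle per 12 months, held fixed


def design_row(t):
    """Basis functions of the assumed model a0 + a1*cos(wt) + b1*sin(wt)."""
    return [1.0, math.cos(OMEGA * t), math.sin(OMEGA * t)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_one_harmonic(ts, ys):
    """Least-squares coefficients [a0, a1, b1] from the normal equations."""
    rows = [design_row(t) for t in ts]
    k = 3
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(ata, aty)


def predict(coeffs, t):
    return sum(c * x for c, x in zip(coeffs, design_row(t)))


# Synthetic monthly means over two years (NOT study data).
rng = random.Random(0)
months = list(range(1, 25))
levels = [20.0 + 4.0 * math.sin(OMEGA * t - 2.0) + rng.gauss(0.0, 0.3) for t in months]

# Random 70% / 30% split into training and test sets.
pairs = list(zip(months, levels))
rng.shuffle(pairs)
cut = round(0.7 * len(pairs))
train, test = pairs[:cut], pairs[cut:]

coeffs = fit_one_harmonic([t for t, _ in train], [y for _, y in train])
amplitude = math.hypot(coeffs[1], coeffs[2])
test_sse = sum((y - predict(coeffs, t)) ** 2 for t, y in test)
```

Fixing the frequency keeps the example dependency-free. Fitting the frequency as well, or using a sum-of-sines form, would require a nonlinear solver such as the one used in the study.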