Clinical and anthropometric characteristics of the study population: The clinical and anthropometric characteristics of the study groups are presented in Table 1. There was no significant difference between the groups in terms of age. Serum TG, TC and LDL levels were higher in CAD patients than in non-CAD patients (p<0.05 for each), whereas serum HDL levels were significantly higher in non-CAD patients (p<0.05). The mean fasting plasma glucose of the CAD patients was significantly higher than that of non-CAD patients (p<0.05). CAD patients also had higher systolic and diastolic blood pressure. The prevalence of obesity, hypertension, diabetes, hyperlipidemia and smoking was higher in CAD patients than in non-CAD patients (p<0.05). In addition, male gender was predominant among CAD patients (76%). Hypertension was the most common associated clinical condition in CAD patients (92%), while hyperlipidemia was the least common (56%). Further data analysis demonstrated that male gender, obesity, hypertension, diabetes mellitus, smoking and hyperlipidemia were the predominant risk factors for CAD in the Tanzanian population.
Table 1: Baseline characteristics of the study groups
| Variables | CAD (n=200) | non-CAD (n=220) | P Value* |
| --- | --- | --- | --- |
| Age (year) | 60.04±7.67 | 58.64±10.27 | 0.580 |
| Sex/ male (%) | 19 (76%) | 10 (40%) | 0.010 |
| Weight (kg) | 84.56±11.58 | 67.52±14.56 | 0.001 |
| Height (m) | 1.60±0.088 | 1.65±0.073 | 0.291 |
| BMI (kg/m²) | 32.36±4.70 | 23.84±3.98 | 0.001 |
| Systolic BP (mmHg) | 133.88±28.89 | 122.64±24.56 | 0.037 |
| Diastolic BP (mmHg) | 91.40±15.54 | 77.08±16.33 | 0.003 |
| Glucose (mmol/L) | 8.01±2.14 | 4.13±0.78 | 0.001 |
| Cholesterol (mmol/L) | 5.88±0.95 | 3.95±0.72 | 0.001 |
| HDL (mmol/L) | 0.84±0.28 | 1.36±0.30 | 0.001 |
| LDL (mmol/L) | 3.97±0.76 | 2.56±0.66 | 0.001 |
| VLDL (mmol/L) | 1.19±0.44 | 0.52±0.28 | 0.001 |
| TG (mmol/L) | 2.63±1.31 | 0.98±0.41 | 0.001 |
| Obesity (%) | 18 (72%) | 0 | 0.001 |
| Hypertension (%) | 23 (92%) | 11 (44%) | 0.001 |
| Diabetes (%) | 17 (68%) | 1 (4%) | 0.001 |
| Hyperlipidemia (%) | 14 (56%) | 0 | 0.001 |
| Smoking (%) | 17 (68%) | 7 (28%) | 0.005 |
Data: mean ± SD. *Differences between the mean values of the two groups were compared using the unpaired Student's t-test.
Allele and genotypic association of different SNPs: The allele and genotype frequencies of five SNPs (rs10757274, rs2383207, rs2383206, rs10811656, and rs10757278) were analyzed for all participants, comprising 200 CAD patients and 220 non-CAD patients, and the results are presented in Supplementary Table 1. All SNPs were in Hardy-Weinberg equilibrium in both groups (all p values >0.05). Significant differences were observed in the genotype and allele frequencies of the rs10757274, rs2383206, rs10811656, and rs10757278 variants between CAD and non-CAD patients (p<0.005).
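As an illustration of the Hardy-Weinberg check mentioned above, the following is a minimal sketch of the standard chi-square goodness-of-fit test with one degree of freedom. The function name and the genotype counts are hypothetical and are not taken from the study data; the exact procedure used in the analysis may differ.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    n_aa, n_ab, n_bb are observed counts of the three genotypes
    (hypothetical inputs; not study data).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts for one SNP in one group
chi2, p_value = hwe_chi_square(60, 100, 40)
print(f"chi2={chi2:.3f}, p={p_value:.3f}")
```

A p value above 0.05 from such a test is conventionally read as no detectable deviation from equilibrium.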
The differences in genotype frequencies of the rs10757274, rs2383206, rs10811656, and rs10757278 SNPs remained significant when analyzed in the subgroup of 25 CAD patients undergoing CABG operation and 25 non-CAD patients undergoing heart valve operation (Table 2). Furthermore, the risk alleles (rs10757274 G allele, rs2383206 G allele, rs10811656 T allele, and rs10757278 G allele) were significantly associated with CAD (OR=5.79, 95% CI=2.32-14.43; OR=3.80, 95% CI=1.65-8.74; OR=6.16, 95% CI=2.41-15.75; and OR=5.67, 95% CI=2.14-14.99, respectively) in CAD patients compared to non-CAD patients of the subgroup (Table 2). The allelic and genotypic analyses showed that the rs2383207 SNP was not associated with CAD risk in either group.
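For readers who wish to see how an allelic odds ratio and its 95% confidence interval can be derived from a 2x2 table, a minimal sketch follows. It assumes the common log-based (Woolf) interval, which the text does not confirm was the method used, and the allele counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a: risk alleles in cases      b: other alleles in cases
    c: risk alleles in controls   d: other alleles in controls
    All four counts must be non-zero for this formula.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 25 cases and 25 controls give 50 alleles per group
odds_ratio, lower, upper = odds_ratio_ci(30, 20, 15, 35)
print(f"OR={odds_ratio:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to a significant association at the 5% level under this approximation.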
ANRIL and ANRIL splice variants expression levels in adipose tissues and PBMCs from CAD patients and non-CAD patients: For expression analysis, we included 25 CAD patients undergoing CABG and 25 non-CAD patients undergoing valve replacement. The expression levels of ANRIL and its transcripts (circANRIL, NR003529, EU741058 and DQ485454) were studied in EAT, MAT, SAT and PBMCs (Figure 1). ANRIL expression levels were significantly up-regulated in PBMCs of the CAD patients compared to non-CAD patients (fold change=1.6, p<0.001). Although ANRIL expression levels in EAT, MAT and SAT were higher in CAD patients than in non-CAD patients, the differences were not significant (Figure 1A).
circANRIL was significantly down-regulated in PBMCs of CAD patients compared to non-CAD patients (fold change=5.3, p<0.001) (Figure 1E), while no significant differences were observed in EAT, MAT and SAT. The associations of ANRIL and circANRIL expression with CAD severity (double stenotic vessel disease (n=13) versus triple stenotic vessel disease (n=12)) were also evaluated. A statistically significant difference in circANRIL expression levels was found between the two groups (p<0.05) (Figure 2), but no statistically significant difference was observed for ANRIL expression (p>0.05).
Moreover, the expression levels of NR003529, EU741058 and DQ485454 in EAT, MAT, SAT and PBMCs showed no statistically significant differences between CAD patients and non-CAD patients (p>0.05 for all). The expression level of EU741058 in PBMCs was lower in CAD patients than in non-CAD patients (fold change≈0.8), but the difference was not significant.
Table 2: The genotypic and allelic frequency distributions of SNPs on chromosome 9p21.3 in the subgroup

The subgroup includes 25 CAD patients undergoing CABG and 25 non-CAD patients undergoing valve placement. OR: Odds Ratio, CI: Confidence Interval. *The genotypic and allelic frequency distributions of the polymorphisms were compared between the groups using the χ² test and the HWE test. In all cases, differences were considered significant at p<0.05.
Associations between 9p21.3 risk locus genotypes and ANRIL and ANRIL transcripts expression levels: To better understand the relationship between ANRIL and CAD, we next evaluated the potential effects of the 9p21.3 risk locus SNPs on the expression levels of ANRIL and ANRIL transcript variants in PBMCs and adipose tissues of the subgroup of 25 CAD patients undergoing CABG and 25 non-CAD patients undergoing valve placement. In CAD patients, the expression levels of ANRIL in PBMCs were significantly higher in risk genotype carriers of rs10757278 and rs10811656 (GA and GG for rs10757278; CT and TT for rs10811656) than in wild type carriers (p=0.004 and p=0.013, respectively). In addition, the expression levels of the ANRIL transcript variants NR003529 and EU741058 in EAT, MAT and PBMCs were significantly higher in CAD patients carrying the risk genotypes of rs10757278 and rs10811656 than in wild type carriers, while there was no difference in DQ485454 (p=0.001, p=0.007, p=0.028, p=0.006, p=0.002, p=0.019 and p>0.05, respectively). In contrast, the expression levels of circANRIL in PBMCs were significantly down-regulated in rs10757278 and rs10811656 risk genotype carriers compared to wild type carriers (p=0.001 and p=0.01, respectively).
Impacts of CAD risk factors on the expression levels of ANRIL transcripts: Correlations between the expression levels of candidate genes and CAD risk factors were assessed with the Spearman correlation test. ANRIL expression in PBMCs was positively correlated with BMI, glucose level, total cholesterol, TG and LDL (r=0.362, p=0.01; r=0.325, p=0.021; r=0.323, p=0.02; r=0.444, p=0.001; and r=0.460, p=0.001, respectively) but negatively associated with HDL (r=0.304, p=0.032).
The circANRIL expression levels in PBMCs were negatively correlated with BMI, glucose level, total cholesterol, TG, LDL, systolic BP and diastolic BP (r=0.531, p=0.001; r=0.547, p=0.001; r=0.599, p=0.001; r=0.558, p=0.001; r=0.535, p=0.001; r=0.363, p=0.009; and r=0.469, p=0.001, respectively) but positively associated with HDL (r=0.583, p=0.001).
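The Spearman coefficient used in these analyses is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below illustrates that calculation in plain Python; the function names and the example values are hypothetical and do not reproduce study measurements.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression values and HDL levels for five subjects
expression = [1.2, 0.8, 2.5, 1.9, 3.1]
hdl = [1.4, 1.6, 0.9, 1.1, 0.7]
print(spearman_rho(expression, hdl))
```

The sign of the coefficient carries the direction of the association: values near +1 indicate that the two variables rise together, and values near -1 indicate that one falls as the other rises.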
ANRIL and circANRIL were co-regulated with most of the risk factors of CAD, such as lipid levels, blood pressure and glucose levels. The positive and negative correlations with the risk factors described above suggest that ANRIL and circANRIL expression regulates the risk factors leading to CAD development, and that ANRIL and circANRIL may serve as indicator genes in CAD patients.
Feature importance of the variables: Machine learning (ML) is a highly effective approach for disease prediction. ML techniques are able to capture the complex interactions between predictors and outcomes during data processing, and they can provide novel insight into the disease [39]. Recent studies showed that random forest (RF) is more efficient than other algorithms for the prediction of CAD and consistently gives better prediction accuracy [40-42]. We measured the importance of the CAD risk factors, together with the expression levels of ANRIL and its transcript variants, by the mean decrease in impurity (Gini importance) across all decision trees in a tuned RF model. The importance of the included variables obtained from the tuned RF model is presented in Figure 3. As expected, age, systolic BP, BMI and smoking were among the top risk factors. In addition, we observed that the expression levels of ANRIL and circANRIL in PBMCs were also among the top risk factors.
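To make the Gini importance measure concrete, the sketch below computes the impurity decrease produced by a single threshold split on one feature. In a random forest, a feature's importance is obtained by accumulating such decreases over every node that splits on that feature and averaging over the trees. The functions, feature values and labels here are hypothetical and serve only to illustrate the quantity; they are not the tuned model used in the study.

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def impurity_decrease(values, labels, threshold):
    """Decrease in Gini impurity when splitting samples at values <= threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children

# Hypothetical feature values (e.g. a BMI-like measure) and CAD labels (1 = CAD)
feature = [22.0, 24.5, 31.0, 33.5]
cad = [0, 0, 1, 1]
print(impurity_decrease(feature, cad, threshold=28.0))  # perfect split
print(impurity_decrease(feature, cad, threshold=23.0))  # weaker split
```

A split that separates the two classes cleanly removes all impurity at that node, which is why strongly discriminating variables rise to the top of the importance ranking.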
Improvement in the diagnostic value: Multivariate logistic regression analyses revealed that weight (OR=0.91, 95%CI=0.85-0.96), height (OR=3.54, 95%CI=1.60-7.81), BMI (OR=0.61, 95%CI=0.46-0.80), systolic BP (OR=0.98, 95%CI=0.96-1.0), smoking (OR=5.46, 95%CI=1.62-18.35), diastolic BP (OR=0.95, 95%CI=0.90-0.99), sex (OR=0.22, 95%CI=0.06-0.71) and the expression levels of ANRIL (OR=2.05, 95%CI=1.74-2.35) and circANRIL (OR=13.63, 95%CI=3.74-49.60) in PBMCs were potential biomarkers for CAD.
To compare the diagnostic value of ANRIL and circANRIL expression levels with that of the top clinical risk factors identified in the RF model, receiver operating characteristic (ROC) curve analysis was performed and the area under the curve (AUC) was calculated.
Three models for CAD prediction based on clinical features and the expression of ANRIL and circANRIL were built. The first model (clinical model) consisted of the CAD risk factors weight, height, BMI, systolic BP, diastolic BP, smoking and age. The second model (clinical+ANRIL expression model) added ANRIL expression to the clinical model, and the third model (clinical+circANRIL expression model) added circANRIL expression to the clinical model.
The AUC was 0.844 for the clinical model (95% CI: 0.724-0.963, optimal cut-off: 0.55, specificity: 0.88, sensitivity: 0.80). Adding ANRIL expression increased the AUC from 0.844 to 0.912 (95% CI: 0.821-1.0, optimal cut-off: 0.61, specificity: 0.96, sensitivity: 0.84, p=0.02) (Figure 4A), while adding circANRIL expression to the clinical model including the CAD risk factors (age, weight, height, BMI, systolic BP and smoking) significantly increased the AUC from 0.844 to 0.980 (95% CI: 0.953-1.0, optimal cut-off: 0.33, specificity: 0.88, sensitivity: 1.0, p=0.009) (Figure 4B).
Finally, ROC analyses suggested that the detection of ANRIL expression and circANRIL expression together with the risk factors of CAD exhibited a higher diagnostic performance than the detection of the risk factors alone. This result implies that the combination of ANRIL and circANRIL expression in PBMCs has great potential as a sensitive and reliable biomarker, possibly with a higher diagnostic value for CAD.
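The ROC quantities reported above can be illustrated with a short sketch that computes the AUC from predicted probabilities and picks a cut-off. The text does not state how the optimal cut-off was chosen; the sketch assumes Youden's J statistic (sensitivity + specificity - 1), which is one common choice. The function names, scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def best_cutoff_youden(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A sample is called positive when its score is >= cutoff.
    On ties in J, the lowest cutoff is kept.
    """
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= cutoff)
        fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < cutoff)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < cutoff)
        fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= cutoff)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]

# Hypothetical model probabilities and true labels (1 = CAD, 0 = non-CAD)
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(roc_auc(scores, labels))
print(best_cutoff_youden(scores, labels))
```

Comparing the AUC of a clinical-only score with that of a score that also includes an expression level follows the same pattern: each model's predicted probabilities are passed to the same function and the resulting areas are compared.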