3.2. PLS-DA models
Three PLS-DA models were generated. The first was built from the absorbance data of the visible spectrum of the AuNPs. Figure 1A displays the visible spectrum of the AuNPs alone (blue line, NP), of urine from AH patients mixed with the AuNP solution (red line, AH), and of urine from HV patients mixed with the AuNP solution (green line, HV). When the AuNPs are mixed with urine, a new absorption band appears, which may be attributed to the interaction of the AuNPs with the metabolites present in urine, possibly through agglomeration in the sample. The cross-validation sensitivity and specificity of this model were 0.76 and 0.69, respectively; these values indicate acceptable discrimination and suggest that the concentrations and types of urinary metabolites differ between HV and AH patients. A second PLS-DA model was built from the size distribution of the AuNP clusters formed with urinary metabolites, as measured by DLS, and gave a sensitivity and specificity of 0.67 and 0.53, respectively.
A third PLS-DA model was generated using the second derivative of the SERS spectral data of urine samples from HV and AH patients. The first 6 PLS-DA models presented error values of 0.413, 0.357, 0.299, 0.237, 0.261, and 0.285; the lowest error (0.237) was obtained for model #4, with 4 latent variables, an RMSEC of 0.3269, and an RMSECV of 0.4275 (Table 1). The model's performance was evaluated through parameters derived from the confusion matrix, such as sensitivity and specificity. Sensitivities and specificities were calculated for the calibration (Cal), validation (Val), and prediction models28. The results indicated a sensitivity of cross-validation (SenCV) of 0.769 (76.9%), a sensitivity of prediction (SenPred) of 0.864 (86.4%), a specificity of cross-validation (SpeCV) of 0.772 (77.2%), a specificity of prediction of 0.778 (77.8%), an accuracy of cross-validation (AccCV) of 0.770 (77.0%), and an accuracy of prediction of 0.825 (82.5%) (Table 1). The same model was then used to discriminate only patients diagnosed with renal damage from controls, and a new confusion matrix was generated; the sensitivity and specificity were better than for the discrimination considering only hypertension (Table 1).
Table 1
PLS-DA classification confusion matrix for hypertensive models.
Hypertensive model |
Confusion Table (Cross Validation) |
| AH, RAH | HV | N | Sensitivity | Specificity | Accuracy |
Predicted as AH, RAH | 50 | 13 | 65 | 0.769 | 0.772 | 0.770 |
Predicted as HV | 15 | 44 | 57 | | | |
Confusion Table (Prediction) |
| AH, RAH | HV | N | Sensitivity | Specificity | Accuracy |
Predicted as AH, RAH | 19 | 4 | 22 | 0.864 | 0.778 | 0.825 |
Predicted as HV | 3 | 14 | 18 | | | |
Hypertensive diagnosis with kidney damage model |
Confusion Table (Cross Validation) |
| RAH | HV | N | Sensitivity | Specificity | Accuracy |
Predicted as RAH | 22 | 13 | 26 | 0.846 | 0.772 | 0.795 |
Predicted as HV | 4 | 44 | 57 | | | |
Confusion Table (Prediction) |
| RAH | HV | N | Sensitivity | Specificity | Accuracy |
Predicted as RAH | 9 | 4 | 9 | 1.000 | 0.778 | 0.852 |
Predicted as HV | 0 | 14 | 18 | | | |
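The sensitivity, specificity, and accuracy values in Table 1 can be recomputed from the confusion-matrix counts. The following is a minimal sketch, assuming that AH/RAH is the positive class, HV is the negative class, and that the N column gives the number of samples actually belonging to each class (an interpretation that is consistent with the reported values, not something stated in the table). The function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positives / all actual positives
    specificity = tn / (tn + fp)            # true negatives / all actual negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts read from Table 1 (assumed: positive = AH/RAH, negative = HV).
table_1 = {
    "hypertensive, cross-validation": dict(tp=50, fn=15, tn=44, fp=13),
    "hypertensive, prediction": dict(tp=19, fn=3, tn=14, fp=4),
    "kidney damage, cross-validation": dict(tp=22, fn=4, tn=44, fp=13),
    "kidney damage, prediction": dict(tp=9, fn=0, tn=14, fp=4),
}

for name, counts in table_1.items():
    sen, spe, acc = confusion_metrics(**counts)
    print(f"{name}: Sen={sen:.3f} Spe={spe:.3f} Acc={acc:.3f}")
```

Under these assumptions the computed values agree with the sensitivity, specificity, and accuracy reported in Table 1 to three decimal places.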
In addition, ROC curves of the third PLS-DA model were generated to assess how strongly the spectral variables contributed to the classification of HV, AH, and RAH patients. The ROC curves were evaluated with the specificity and sensitivity parameters of the validation and prediction models for AH (Fig. 2A) and RAH (Fig. 2B), giving areas under the curve of 0.835 (83.5%) for AUC (CV-AH), 0.907 (90.7%) for AUC (Pred-AH), 0.870 (87.0%) for AUC (CV-RAH), and 0.901 (90.1%) for AUC (Pred-RAH). Figure 2C and D show the values predicted for AH when patients are classified into two classes, HV and AH (with the RAH subclass), by the classification model for hypertension. In addition, a score plot using only the first 4 LVs displays two well-defined classes of HV and AH patients. Notably, even with the sensitivity and specificity obtained, it was necessary to conduct a validation to evaluate the clinical efficacy of the classification model for hypertensive patients.
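As an illustration of what the AUC values summarize, the area under a ROC curve equals the probability that a randomly chosen positive sample receives a higher predicted value than a randomly chosen negative sample. The sketch below computes it by direct pair counting; the function name `auc_from_scores` and the score lists are hypothetical and are not data from this study.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair (Mann-Whitney U / (n_pos * n_neg)).
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted values for illustration only (NOT study data).
ah_scores = [0.9, 0.8, 0.6, 0.4]
hv_scores = [0.7, 0.3, 0.2, 0.1]
print(auc_from_scores(ah_scores, hv_scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.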
To identify the significant spectral bands for the model, the variables with VIP scores > 1 were identified and compared with the average Raman spectra and with the signals generated by the second-derivative ([2ndD]) preprocessing (Fig. 3). The average spectra of the urine samples from HV and AH were compared (Fig. 3A). Considerable variations were observed in the presence and intensity of spectral peaks in the regions 400–570, 600–800, 900–1170, and 1200–1400 cm−1. Second-derivative preprocessing, shown in Fig. 3B, highlighted the different spectral bands, with the highest intensities observed in the spectral ranges 600–800, 970–1050, and 1300–1350 cm−1.
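The effect of second-derivative preprocessing can be illustrated with a small example. The text does not state which derivative algorithm was used, so the sketch below is an assumption: it uses a plain central finite difference, whereas in practice a smoothing derivative filter is often preferred for noisy spectra. The function name and the toy spectrum are hypothetical and are not study data.

```python
def second_derivative(y, step=1.0):
    """Central finite-difference second derivative of an evenly sampled signal.

    Returns len(y) - 2 values; the two end points are dropped.
    """
    return [
        (y[i + 1] - 2.0 * y[i] + y[i - 1]) / step ** 2
        for i in range(1, len(y) - 1)
    ]


# Hypothetical toy spectrum: one peak on a sloping linear baseline.
shifts = [1000 + 2 * i for i in range(9)]      # Raman shift axis, cm-1
baseline = [0.05 * i for i in range(9)]        # linear background
peak = [0, 0, 1, 4, 6, 4, 1, 0, 0]             # band centred at 1008 cm-1
spectrum = [b + p for b, p in zip(baseline, peak)]

d2 = second_derivative(spectrum, step=2.0)
centre = shifts[1 + d2.index(min(d2))]         # +1 because end points are dropped
print(centre)
```

The linear baseline contributes nothing to the second derivative, and the band centre appears as a sharp negative minimum, which is why overlapping bands on a varying background tend to be easier to distinguish after this preprocessing.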
VIP is a measure that indicates which variables are important and quantifies their contributions to the model, and it helps to select the variables used to develop the predictive model. The VIP of each variable is calculated as the weighted sum of the squared correlations between the PLS-DA components and the original variable, where the weights correspond to the percentage of variation explained by each PLS-DA component in the model29,30.
The VIP scores help identify the most important spectral regions that contribute to the optimal performance of the model30.
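For reference, a commonly used weight-based form of the VIP score is sketched below. It is given here as an assumption, since the exact expression used by the software is not stated in the text and may differ in detail from the correlation-based description above. The symbols are introduced for illustration: p is the number of spectral variables, A the number of latent variables, w_{ja} the weight of variable j on component a, and SS_a the sum of squares of the response explained by component a.

```latex
\mathrm{VIP}_j \;=\; \sqrt{\, p \,
  \frac{\displaystyle\sum_{a=1}^{A} SS_a \left( \frac{w_{ja}}{\lVert \mathbf{w}_a \rVert} \right)^{2}}
       {\displaystyle\sum_{a=1}^{A} SS_a} \,}
```

With this form, the mean of the squared VIP scores over all p variables equals 1, which is the usual motivation for treating VIP > 1 as the threshold for an above-average contribution.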
In the VIP scores of the Raman spectral matrix of urine samples for the classification model of hypertensive patients, 36 bands exceeded the threshold value, some contributing to the classification model with more weight than others. These 36 Raman bands were tentatively assigned according to the literature on the main Raman bands in urine samples from HV and AH (Table 2).
Table 2
VIP scores of the Raman spectral matrix of urine samples. n.r., not resolvable.
Signal | VIP score | Urine peak (cm−1) | Possible assignment |
1 | 1.68 | 302 | n.r. |
2 | 1.41 | 326 | n.r. |
3 | 1.32 | 432 | n.r. |
4 | 1.77 | 531 | S–S stretching protein11 |
5 | 1.21 | 607 | Creatinine, glycerol31,32 |
6 | 1.79 | 630 | Glycerol, C–S gauche amino acid methionine32 |
7 | 1.25 | 642 | Uric acid, C–C twisting mode of tyrosine12,32 |
8 | 2.07 | 660 | C–S stretching mode of cystine collagen type II32,14 |
9 | 1.91 | 706 | N–H Uric acid, C–S trans amino acid methionine13,32 |
10 | 2.35 | 726 | DNA/RNA bases, hypoxanthine, C–S protein, CH2 rocking adenine12,32 |
11 | 1.24 | 770 | Alanine16 |
12 | 1.97 | 786 | Ring vibration cytosine, DNA O–P–O uracil, thymine13,32 |
13 | 1.31 | 801 | Backbone geometry and phosphate ion interaction32 |
14 | 1.33 | 828 | Glutathione, tyrosine PO2 stretch DNA phosphodiester, O–P–O stretching DNA/RNA12,32 |
15 | 1.49 | 894 | Phosphodiester deoxyribose32 |
16 | 1.52 | 907 | Creatinine, creatine, hydroxybutyrate31 |
17 | 1.34 | 935 | C–C stretching mode of proline and valine and protein backbone31 |
18 | 2.16 | 993 | Ring vibration uric acid, phenylalanine13 |
19 | 6.53 | 1008 | N–C–N stretching urea, phenylalanine31,32 |
20 | 2.23 | 1026 | C–H stretching phenylalanine11 |
21 | 2.85 | 1037 | n.r. |
22 | 1.32 | 1064 | Skeletal C–C stretch of lipids32 |
23 | 1.25 | 1089 | PO2 stretch, phosphate, histidine, nucleic acid12,13,32,33 |
24 | 1.47 | 1100 | C–C vibration mode of the gauche-bonded chain, amide III32 |
25 | 1.97 | 1121 | C–N stretch protein backbone, vibrations C–O, C–C, C–N uric acid34,13 |
26 | 1.76 | 1146 | C–C lipids, fatty acid33,32 |
27 | 1.68 | 1271 | Amide III band in protein, amide III C–N stretch, C=C fatty acid, typical phospholipids32 |
28 | 1.35 | 1292 | Interring stretching, cytosine32 |
29 | 1.47 | 1306 | CH3/CH2 twisting or bending mode of lipid/collagen32 |
30 | 2.40 | 1327 | CH3CH2 wagging mode in purine bases of nucleic acids32 |
31 | 3.25 | 1342 | CH3, CH2 twisting nucleic acid, wagging protein, G(DNA/RNA), CH deformation protein and carbohydrates11,34,32 |
32 | 1.32 | 1378 | Ring breathing modes DNA, paraffin13,32 |
33 | 1.62 | 1404 | CH deformation32 |
34 | 1.71 | 1417 | C=C stretching in quinoid ring32 |
35 | 1.17 | 1428 | CH2 creatinine, valine12,14 |
36 | 1.20 | 1479 | Amide III32 |
Of the 36 identified bands that exceeded the threshold, 19 bands had a VIP score of > 1.5, and 8 bands had the highest VIP scores of > 2.0 (660, 726, 993, 1008, 1026, 1037, 1327, and 1342 cm−1).
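The band counts at each threshold can be checked directly against the values listed in Table 2. The following is a minimal sketch; the helper name `bands_above` is hypothetical, and the (peak position, VIP score) pairs are transcribed from Table 2.

```python
# (Raman shift in cm-1, VIP score) pairs transcribed from Table 2.
vip_table = [
    (302, 1.68), (326, 1.41), (432, 1.32), (531, 1.77), (607, 1.21), (630, 1.79),
    (642, 1.25), (660, 2.07), (706, 1.91), (726, 2.35), (770, 1.24), (786, 1.97),
    (801, 1.31), (828, 1.33), (894, 1.49), (907, 1.52), (935, 1.34), (993, 2.16),
    (1008, 6.53), (1026, 2.23), (1037, 2.85), (1064, 1.32), (1089, 1.25), (1100, 1.47),
    (1121, 1.97), (1146, 1.76), (1271, 1.68), (1292, 1.35), (1306, 1.47), (1327, 2.40),
    (1342, 3.25), (1378, 1.32), (1404, 1.62), (1417, 1.71), (1428, 1.17), (1479, 1.20),
]


def bands_above(table, threshold):
    """Return the peak positions whose VIP score strictly exceeds the threshold."""
    return [shift for shift, vip in table if vip > threshold]


for threshold in (1.0, 1.5, 2.0):
    selected = bands_above(vip_table, threshold)
    print(f"VIP > {threshold}: {len(selected)} bands")

print(bands_above(vip_table, 2.0))
```

The strongest contribution by a wide margin is the 1008 cm−1 band (VIP 6.53), tentatively assigned in Table 2 to N–C–N stretching of urea and to phenylalanine.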