3.1 Amino acid content in porcine, bovine and fish skin gelatines
This study investigated the distribution of AA content in porcine, bovine and fish skin gelatines. Table 1 shows the amino acid content in each skin gelatine. The presence of 17 AAs in the gelatine was confirmed with a retention time of SS. Glycine was dominant, while His was undetected in the porcine skin gelatine. The ranking of AA concentration in the porcine skin gelatine was as follows: Gly (33.66%) > Pro (12.16%) > Hyp (10.63%) > Ala (9.77%) > Glu (6.54%) > Arg (6.30%) > Lys (3.92%) > Asp (3.48%) > Ser (3.08%) > Leu (2.46%) > Val (2.37%) > Thr (1.79%) > Phe (1.56%) > Ile (1.05%) > Met (0.77%) > Tyr (0.45%) > His (0.00%). In comparison with Hafidz & Yaakob (2011), Azilawati et al. (2015) and Widyaninggar et al. (2012), our study had a similar AA distribution: Gly > Pro, Asp > Ser, and Ile > Met > Tyr. The AA distribution in porcine skin gelatine analysed by a validated and verified method by Abdullah Sani et al. (2021) showed a similar distribution. Although the porcine bone could also be used to produce gelatine, to the authors' knowledge, no report was found on the AA distribution from the porcine bone gelatine.
Based on the ranking of bovine skin gelatine, i.e. Gly (33.83%) > Pro (11.90%) > Hyp (10.89%) > Ala (9.95%) > Glu (6.72%) > Arg (5.95%) > Lys (3.84%) > Asp (3.65%) > Ser (3.11%) > Leu (2.46%) > Val (2.23%) > Thr (1.79%) > Phe (1.47%) > Ile (1.26%) > Met (0.65%) > Tyr (0.28%) > His (0.00%) (Table 1), the AA distribution was similar to that of porcine skin gelatine, probably because both porcine and bovine are mammals. This similarity may make it difficult to differentiate the porcine and bovine skin gelatines. This AA distribution contradicted the findings of Azilawati et al. (2015) and Hafidz & Yaakob (2011), except for the Gly > Pro, Asp > Ser and Ile > Met > Tyr > His distributions. Valipour et al. (2008) identified the AA distribution of bovine bone gelatine as follows: Gly (17.24%) > Glu (15.56%) > Asp (11.47%) > Pro (9.4%) > Ala (6.67%) > Lys (3.78%) > Thr (3.15%) > Phe (3.15%) > Ser (2.94%) > Arg (2.38%) > Leu (2.27%) > Val (2.09%) > Ile (1.15%) > Met (0.78%) > His (0.67%) > Tyr (0.66%). According to this distribution, both bovine skin and bone gelatines had Gly as the dominant AA and a similar Ile > Met distribution. In contrast, Valipour et al. (2008) identified 0.67% His in the bovine bone gelatine, whereas our result found no His in the bovine skin gelatine.
Table 1 also presents the AA distribution of fish skin gelatine, which followed this ranking: Gly (35.44%) > Ala (9.73%) > Pro (9.43%) > Arg (6.77%) > Glu (6.25%) > Hyp (6.22%) > Ser (6.02%) > Lys (3.81%) > Asp (3.79%) > Thr (2.76%) > Val (2.02%) > Leu (2.02%) > Met (1.81%) > Phe (1.43%) > Ile (1.20%) > His (0.96%) > Tyr (0.33%). The AA distribution of yellowfin tuna (Thunnus albacares) skin gelatine was in line with our finding for the Pro > Arg > Glu and Ile > His > Tyr rankings (Nurilmala et al., 2019). Nevertheless, Nawaz et al. (2020) stated that cold-water fish skin gelatine had lower Hyp than the skin gelatine of warm-water fish. This finding was supported by a higher Hyp in tilapia (Oreochromis mossambicus), yellowfin tuna (Thunnus albacares) and black carp (Mylopharyngodon piceus) than in cod (Gadus morhua), hake (Merluccius capensis) and Alaska pollock (Gadus chalcogrammus). Of the 17 AAs, our fish skin gelatine had a similar AA distribution of Met > Phe > Ile > His > Hyl > Tyr in cod; Gly > Ala > Pro, Glu > Hyp, Thr > Val and Met > Phe in hake; and Gly > Ala > Pro and Met > Phe > Ile > His > Hyl > Tyr (Derkach et al., 2020). The bone gelatine of Epinephelus sp. had a Gly > Pro > Glu > Ala > Arg > Asp > Leu > Ser > Lys > Thr > Val > Phe > Ile > His > Tyr distribution, of which only the Thr > Val and Phe > Ile > His distributions were similar to our study (Suprayitno, 2019). These findings indicated that each skin and bone gelatine of cold-water fish had its own AA distribution, although some similarities were recorded. For this reason, the differentiation of the porcine, bovine and fish skin gelatines may be carried out via statistical analysis.
The ANOVA test in Table 1 shows a significant difference in the mean AA values among the porcine, bovine and fish skin gelatines, where skin gelatines with different superscript letters were significantly different (p < 0.01). Arg, Pro, Tyr, Met, Val and Ile were significantly different among the three skin gelatines. Specifically, the porcine skin gelatine had the highest content of Pro, Tyr and Val, and the lowest Ile content. The bovine skin gelatine had the highest percentage of Ile but the lowest percentages of Arg, Tyr and Met. The fish skin gelatine had the highest Arg and Met content but the lowest Pro and Val content.
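As an illustration of the per-AA comparison described above, the following is a minimal sketch of a one-way ANOVA F-statistic for a single AA across the three gelatine sources. The function name and the Pro percentages are hypothetical placeholders, not the study's replicate data, and the p-value lookup and post hoc grouping (the superscript letters) are not included.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of sample groups."""
    k = len(groups)                              # number of gelatine sources
    n_total = sum(len(g) for g in groups)        # total number of samples
    grand_mean = mean([x for g in groups for x in g])
    # Between-group sum of squares: spread of the group means.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of samples around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical Pro percentages (illustrative values only).
pro = {
    "porcine": [12.1, 12.2, 12.3],
    "bovine": [11.8, 11.9, 12.0],
    "fish": [9.3, 9.4, 9.5],
}
f_pro = one_way_anova_f(list(pro.values()))
print(f"F-statistic for Pro: {f_pro:.1f}")
```

A large F-statistic means the group means differ by much more than the within-group scatter, which is the condition under which an AA is flagged as significantly different.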
Arg, Pro, Tyr, Met, Val and Ile could be utilised to differentiate porcine and bovine skin gelatines, although the AA distribution of these skin gelatines was similar. However, the content differences between porcine and fish skin gelatines were significant in all AAs except Ala and Lys. Likewise, our study observed a significant difference in all AAs except Asp, Ala, Lys and Phe between bovine and fish skin gelatines. Nevertheless, the ANOVA test alone was insufficient to discriminate the three gelatines since more than one AA characterised the gelatines; hence, Abdullah Sani et al. (2021) and Azilawati et al. (2015) proposed applying MDA to discriminate the skin gelatines.
3.2 Outlier treatment and dataset adequacy
Prior to the MDA, the skin gelatine datasets underwent pre-processing to ensure that they fulfilled the MDA prerequisites, including outlier treatment, dataset transformation and a dataset adequacy test (Ismail et al., 2021). The training dataset had 29, 12 and 21 outliers in the porcine, bovine and fish skin gelatines (Table 1), respectively, and our method replaced the outliers with the mean value of each AA (Abdullah Sani et al., 2021). Then, this training dataset was transformed via the standardise (n-1) method. Although few reports carried out dataset transformation in their works, our study performed the transformation to fulfil the prerequisite of MDA (Azilawati et al., 2015). Additionally, various dataset transformations are available for MDA, e.g., standardise (n), standard deviation⁻¹ (n-1), standard deviation⁻¹ (n), centre, 0 to 1 rescaling, 0 to 100 rescaling, Pareto and log methods (Ismail et al., 2021); however, our study adopted standardise (n-1), as proposed by Abdullah Sani et al. (2021) for the gelatine matrix, since a high number of AAs achieved normality after this transformation.
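The two pre-processing steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the outlier detection rule is not given in this section, so the outlier positions are passed in as already known; the replacement mean is assumed to be computed from the non-outlying values; and the function names and data are hypothetical.

```python
from statistics import mean, stdev

def replace_outliers_with_mean(values, outlier_idx):
    """Replace flagged outliers with the mean of the remaining values.

    Assumption: the mean is taken over the non-flagged values only.
    """
    kept = [v for i, v in enumerate(values) if i not in outlier_idx]
    m = mean(kept)
    return [m if i in outlier_idx else v for i, v in enumerate(values)]

def standardise_n_minus_1(values):
    """Centre on the mean and scale by the sample standard deviation (n-1)."""
    m = mean(values)
    s = stdev(values)  # statistics.stdev uses the n-1 denominator
    return [(v - m) / s for v in values]

# Hypothetical Gly percentages with one flagged outlier at position 3.
gly = [33.5, 33.7, 33.9, 40.0]
cleaned = replace_outliers_with_mean(gly, {3})
scaled = standardise_n_minus_1(cleaned)
print(cleaned)
print(scaled)
```

After standardisation each AA has a mean of zero and a sample standard deviation of one, so AAs with large absolute percentages such as Gly do not dominate those present at low levels such as Tyr.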
Table 1 shows the individual KMO value for each AA, where Met (0.9274) and Ile (0.6075) had the highest and lowest KMO values, respectively. Comparison of the average KMO value (0.7874) with the guideline of Williams and Brown (2012) indicated that the dataset adequacy fell within the good range (0.7 < KMO < 0.8). Yuswan et al. (2021) and Azilawati et al. (2015) employed MDA without reporting the dataset adequacy; hence, a comparison with their results is not possible. Nevertheless, other gelatine studies found that KMO > 0.7 signified that the dataset was adequate for MDA (Abdullah Sani et al., 2021). Our KMO value (0.7874) was higher than that of the gelatine study by Ismail et al. (2021) (KMO = 0.7542). Based on these comparisons, our KMO value indicated that the dataset was adequate for MDA.
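For readers unfamiliar with the measure, the following is a minimal sketch of how per-variable and overall KMO values can be computed from a correlation matrix, using the standard definition based on squared correlations and squared partial correlations. It is not the software routine used in this study; the helper names and the 3 × 3 correlation matrix are hypothetical, and the overall value here is the pooled KMO, which may differ slightly from a simple average of the per-variable values.

```python
import math

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Return (per-variable KMO list, overall KMO) for a correlation matrix."""
    n = len(corr)
    inv = invert(corr)
    # Partial correlation between i and j given all other variables.
    partial = [[-inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                for j in range(n)] for i in range(n)]
    per_variable, r_total, p_total = [], 0.0, 0.0
    for j in range(n):
        r2 = sum(corr[i][j] ** 2 for i in range(n) if i != j)
        p2 = sum(partial[i][j] ** 2 for i in range(n) if i != j)
        per_variable.append(r2 / (r2 + p2))
        r_total += r2
        p_total += p2
    return per_variable, r_total / (r_total + p_total)

# Hypothetical correlation matrix for three AAs (illustrative only).
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
per_aa, overall = kmo(corr)
print(per_aa, overall)
```

Values close to 1 indicate that the correlations among AAs are largely shared rather than pairwise-specific, which is the situation in which component-based methods are considered appropriate.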
3.3 Development of a partial least squares discriminant analysis model for skin gelatine sources
In this study, the PLS-DA model generated two components to explain the classification ability of the sources of skin gelatine. Table 2 shows two PLS-DA models that provided the classification ability to discriminate the porcine, bovine and fish skin gelatines. The first DM was partial least squares discriminant analysis (PLS-DA) for 17 AAs (PLS-DAAA), while the second DM was PLS-DA for AAs with a variable importance in the projection (VIP) score > 0.8 (PLS-DAVIPAA). Since too low a VIP cut-off may lead to the selection of variables unrelated to the gelatine sources, our study followed the VIP score > 0.8 threshold recommended by Sharin et al. (2021), where AAs with a high VIP score could explain most of the variance among the porcine, bovine and fish gelatines. The best PLS-DA model was selected by evaluating the performance of the PLS-DAAA and PLS-DAVIPAA models.
The quality of both PLS-DA models was evaluated via the R2Y cumulated (R2Y cum), R2X cumulated (R2X cum) and Q2 cumulated (Q2 cum) indices on each component. For the PLS-DAVIPAA, the R2Y cum (0.9356) and R2X cum (0.8650) were higher than those of the PLS-DAAA (R2Y cum = 0.9057 and R2X cum = 0.7186), indicating that the PLS-DAVIPAA was better at explaining the gelatine clusters and the AA contribution to the gelatine clusters, respectively. These findings follow from the definitions: the R2Y cum is the sum of the determination coefficients between the gelatine clusters and the two components, while the R2X cum is the sum of the determination coefficients between the AAs and the two components. The R2Y cum and R2X cum thus measured the power of the two components to explain the gelatine clusters and the AAs. Consequently, the Q2 cum (0.9320) of PLS-DAVIPAA was also higher than the Q2 cum (0.8961) of PLS-DAAA, signifying that the two components generated by the PLS-DAVIPAA model contributed significantly to the predictive quality for skin gelatine sources. Additionally, the AAs with VIP score > 0.8 proved to be the main contributors to the predictive quality for skin gelatine sources.
The PLS-DAAA identified 13 significant AAs with descending VIP score, i.e., Tyr (1.4149), Phe (1.3326), Arg (1.2440), Thr (1.0960), Ser (1.0936), Met (1.0924), His (1.0912), Val (1.0783), Gly (1.0783), Hyp (1.0754), Ile (0.9959), Pro (0.9853) and Leu (0.9550) where the Tyr and Leu were the most and least significant AA, respectively. Based on the VIP score, these AAs explain most of the variance among the porcine, bovine and fish gelatines. Hence, these AAs could be used to differentiate the gelatine sources. Also, all 13 AAs of the PLS-DAVIPAA yielded VIP scores > 0.8, confirming the AA significance in discriminating the gelatine sources (Table 2).
Fig. 1 (a) depicts the plot of the 17 AAs from PLS-DAAA with the individual correlation matrix value (CMV) of each AA. In component 1, His, Hyp, Ser, Thr, Met, Pro and Leu had a CMV of 0.95–0.88, followed by Val, Arg and Gly with a CMV of 0.69–0.63, while Ile, Phe, Tyr, Glu, Asp, Lys and Ala had the lowest CMV (0.47–0.064). For component 2, Tyr and Phe had the highest CMV (0.85–0.83); Arg, Gly, Val, Lys and Ala had a moderate CMV (0.70–0.51); and Glu, Asp, Pro, Leu, Ile, Met, Thr, Ser, His and Hyp had the lowest CMV (0.38–0.04). Figure 1 (b) of PLS-DAVIPAA shows the CMV of the 13 AAs, where the CMV of each AA in components 1 and 2 had a similar value. Jannat et al. (2018, 2020b) carried out PLS-DA analyses to distinguish porcine, bovine and fish gelatines, but none of these studies explained the CMV of the detected compounds or amino acids.
Nevertheless, Ismail et al. (2021) classified the factor loadings of AAs as strong (|CMV| ≥ 0.750), moderate (0.500 < |CMV| < 0.749) and weak (|CMV| ≤ 0.499) according to the CMV of AAs from principal component analysis (PCA), not PLS-DA. The CMV was used to delineate the relationships among the AAs and to assign the AAs to the gelatine sources (Ismail et al., 2021). Figure 1 (b) exhibits positive correlations based on the proximity of the AA directions: His, Ser, Met and Thr; Gly and Arg; Tyr and Phe; and Leu and Pro. On the contrary, negative correlations of AAs were observed based on their opposite directions: His, Ser, Met and Thr against Hyp; Tyr and Phe against Ile; Val against Ile; and Leu and Pro against Ile. Arg and Gly did not correlate with Ile since their directions were at 90°.
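The strong, moderate and weak bands attributed above to Ismail et al. (2021) can be expressed as a small helper. This is a sketch only: the function name is hypothetical, and the handling of values that fall exactly on or between the stated band limits (for example 0.500 or 0.7495) is an assumption, since the stated bands leave small gaps.

```python
def classify_loading(cmv):
    """Classify a correlation matrix value (CMV) by its absolute size.

    Assumption: boundary values are assigned to the higher band.
    """
    size = abs(cmv)
    if size >= 0.750:
        return "strong"
    if size >= 0.500:
        return "moderate"
    return "weak"

# Example CMVs taken from the component 1 ranges reported for PLS-DAAA.
for value in (0.95, 0.69, 0.47, -0.88):
    print(value, classify_loading(value))
```

The sign of the CMV is ignored for the band assignment; it only indicates the direction of the AA relative to the component.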
Ismail et al. (2021) proposed that the AA correlations were due to the polarity of the AA side chains; however, our study found that Met, which has a non-polar side chain, was nevertheless positively correlated with His, Ser and Thr. Further generic grouping of AAs, e.g. basic, carboxylic, hydroxylic and hydrophobic, based on the chemical characteristics described by Derkach et al. (2020), could not support the AA correlations either. The contrasting side chains of Gly and Arg, and of Tyr and Phe, also signified that the correlations of AAs were independent of their side-chain polarity and generic chemical characteristics. Nevertheless, the backbone of the chemical structure may explain the positive correlations among the AAs, and vice versa. For instance, Met, His, Ser and Thr, and Gly and Arg, share the HO-CO-CNH2- backbone; Tyr and Phe share the HO-CO-CNH2-CH2-benzene backbone; while Leu and Pro share the HO-CO- backbone. Furthermore, Hyp and Pro, and Leu and Val, were also positioned in close proximity and share the HO-CO-pyrrole and HO-CO-NH2 backbones, respectively.
To assign the AAs to porcine, bovine and fish skin gelatines, the skin gelatine and AA plots must be overlaid, which the PLS-DA feature of XLSTAT 2019 could not provide in this study. PCA is a preferable method for this purpose since both the AA and skin gelatine plots are available in the PCA feature, which serves as an exploratory MDA. Hence, our study carried out the AA assignment via PCA in Section 3.5.
Table 2 also exhibits the correct classification of PLS-DAAA on the porcine, bovine and fish gelatines. The training and validation datasets exhibited 100% total correct classification of the porcine, bovine and fish skin gelatines (Table 2), indicating that the PLS-DAAA was able to discriminate the gelatines at a 99% confidence level. This result was evident from the small p-value (p < 0.0001) of the Mahalanobis distance and the three distinct skin gelatine clusters, i.e. porcine, bovine and fish, in Fig. 1 (c). Further investigation of the predictive ability of the PLS-DAAA model on the testing dataset showed that it achieved 93.3% correct classification, where at least 90% of the porcine and bovine skin gelatines and 100% of the fish skin gelatine were correctly classified. This finding indicated that the PLS-DA with 17 AAs may facilitate the identification of skin gelatine sources, i.e. porcine, bovine and fish, in this study. The PLS-DAVIPAA exhibited similar correct classification for the training and validation datasets and also rendered 93.3% correct classification of its testing dataset, which was similar to the result of PLS-DAAA. Although both PLS-DA models had the same correct classification, Fig. 1 (d) of PLS-DAVIPAA showed that the porcine, bovine and fish skin gelatines were each located closer together within their clusters than in the skin gelatine plot for PLS-DAAA in Fig. 1 (c). This result was evident in the narrower ranges of the component 2 scores of PLS-DAVIPAA, i.e. porcine skin gelatine (-3 to 0), bovine skin gelatine (1 to 4) and fish skin gelatine (-1 to 1) in Fig. 1 (d). In contrast, the component 2 scores of the gelatine sources for PLS-DAAA were as follows: -4 to 0 for porcine skin gelatine, 0 to 5 for bovine skin gelatine, and -3 to 1 for fish skin gelatine (Fig. 1 (c)). This result also supported the finding of the higher R2Y cum and Q2 cum of PLS-DAVIPAA over PLS-DAAA.
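The per-source and total correct classification percentages discussed above are derived from a confusion matrix, as sketched below. The counts are hypothetical: they assume 10 testing samples per source (30 in total) and an arbitrary choice of which class the misclassified samples were assigned to, chosen only so that the arithmetic reproduces a 90%, 90% and 100% pattern.

```python
def correct_classification(confusion):
    """Return (per-class % correct, total % correct).

    confusion[actual][predicted] holds sample counts.
    """
    per_class = {
        actual: row[actual] / sum(row.values()) * 100
        for actual, row in confusion.items()
    }
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[actual][actual] for actual in confusion)
    return per_class, correct / total * 100

# Hypothetical testing-set confusion matrix (assumed 10 samples per source).
confusion = {
    "porcine": {"porcine": 9, "bovine": 1, "fish": 0},
    "bovine": {"porcine": 1, "bovine": 9, "fish": 0},
    "fish": {"porcine": 0, "bovine": 0, "fish": 10},
}
per_class, total_pct = correct_classification(confusion)
print(per_class)
print(f"Total correct classification: {total_pct:.1f}%")
```

Under these assumed counts, 28 of 30 samples are correctly assigned, which corresponds to a total of 93.3%.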
Hence, this study proposed selecting AAs with VIP scores > 0.8 and employing PLS-DAVIPAA to authenticate skin gelatine sources.
3.4 Development of a discriminant analysis model for skin gelatine sources
Table 3 presents two models of discriminant analysis (DA) i.e. discriminant analysis for 17 AAs (DAAA) and DA entailing AA with p-value < 0.01 (DAAAPV). The DAAA identified 15 AAs that significantly contributed (p < 0.01) to the discrimination of porcine, bovine and fish skin gelatines. The decreasing ranking of AAs according to the F-statistic value was as follows: His > Ser > Hyp > Thr > Met > Tyr > Arg > Phe > Pro > Leu > Val > Gly > Ile > Asp > Lys indicating that AA with high F-statistics were more significant than the lower ones in discriminating the sources of skin gelatine. Of the 15 AAs, the DAAAPV selected 13 AAs with a similar ranking of F-statistic value in DAAA, where Asp and Lys were removed from the list since these AAs were the least significant ones.
Figure 2 (a) shows the plot of the 17 AAs of DAAA, while Fig. 2 (b) depicts the AA plot of DAAAPV with p < 0.01. These figures depicted similar correlations among the AAs with a high F-statistic value and p < 0.01. Positive AA correlations appeared among His, Ser, Met and Thr; Gly and Arg; Tyr and Phe; and Leu and Pro. Negative correlations appeared between His, Ser, Met and Thr versus Hyp; Tyr and Phe versus Ile; Val versus Ile; and Leu and Pro versus Ile. Also, Arg and Gly showed no correlation with Val, as evidenced by their 90° direction. These correlation patterns were similar to the correlations shown by PLS-DAAA and PLS-DAVIPAA.
The DAAA was able to perform the discrimination at a 99% confidence level, as shown by the 100% correct classification of the sources of skin gelatines in the training and validation datasets. The low p-value (p < 0.0001) of the Fisher distance and the three distinct clusters of skin gelatine sources in Fig. 2 (c) supported this finding. Nevertheless, the testing dataset had a 96.7% correct classification of the sources of skin gelatines. The DAAAPV also correctly classified 100% of the 120 and 40 skin gelatines in the training and validation datasets, respectively. On the other hand, the DAAAPV classified the 30 skin gelatines of the testing dataset at 96.7% correct classification, which denoted that both DAAA and DAAAPV had similar classification capability.
However, the Fisher distances of DAAA between the clusters, i.e. porcine against bovine and porcine against fish skin clusters, were higher than those of DAAAPV, suggesting that the DAAA could yield more precise classification and a lower possibility of cluster overlapping. A more detailed investigation also showed that the DAAA had smaller cluster dispersion in porcine, bovine and fish skin gelatine (Fig. 2 (c)) as compared to DAAAPV in Fig. 2 (d). For DAAA, the porcine cluster had a range of 4.00 for component 1 and 4.19 for component 2; the bovine cluster had a range of 3.42 for component 1 and 4.98 for component 2; and the fish cluster had a range of 4.20 for component 1 and 2.53 for component 2. For DAAAPV, the porcine cluster had a range of 5.74 for component 1 and 4.03 for component 2; the bovine cluster had a range of 3.26 for component 1 and 3.68 for component 2; and the fish cluster had a range of 4.81 for component 1 and 4.27 for component 2. These ranges denoted that the DAAA had lower intra-cluster variance, which could reduce the possibility of incorrect classification compared to DAAAPV. Hence, the DAAA was a better DA model than the DAAAPV. This result signified that identification of significant AAs via the F-statistic value and the p-value is not needed if DA is employed to authenticate the source of skin gelatine.
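The range comparison above can be tabulated with a short calculation. The sketch below uses the component ranges reported in the text and, as an assumption introduced only for illustration, summarises each cluster by the area of its bounding box (range on component 1 multiplied by range on component 2); the study itself reports the ranges, not this area proxy. The helper names are hypothetical.

```python
def score_range(scores):
    """Range of component scores (max minus min) for one cluster."""
    return max(scores) - min(scores)

def box_area(ranges):
    """Bounding-box area of a cluster from its (component 1, component 2) ranges."""
    return ranges[0] * ranges[1]

# Component ranges as reported in the text: (component 1, component 2).
reported = {
    "DAAA": {"porcine": (4.00, 4.19), "bovine": (3.42, 4.98), "fish": (4.20, 2.53)},
    "DAAAPV": {"porcine": (5.74, 4.03), "bovine": (3.26, 3.68), "fish": (4.81, 4.27)},
}

areas = {
    model: {source: box_area(r) for source, r in clusters.items()}
    for model, clusters in reported.items()
}
totals = {model: sum(a.values()) for model, a in areas.items()}

for model in reported:
    for source, area in areas[model].items():
        print(f"{model:7s} {source:8s} area = {area:.2f}")
    print(f"{model:7s} total    area = {totals[model]:.2f}")
```

Under this proxy the summed area is lower for DAAA than for DAAAPV, driven by the porcine and fish clusters; the bovine ranges taken alone are smaller for DAAAPV.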
From the results of PLS-DAVIPAA and DAAA, it could be concluded that the DAAA was the best DM for authentication of porcine, bovine and fish skin gelatine sources since the DAAA had 96.7% correct classification of the skin gelatine sources as compared to 93.3% for the PLS-DAVIPAA. This finding was in line with Brereton & Lloyd (2014), who indicated that DA was more suitable than PLS-DA since the number of gelatine samples (observations) was higher than the number of AAs (variables), and there was no missing value in the dataset (Komsta et al., 2018).
3.5 Exploring amino acid profile in skin gelatines
The PCA application in this study aimed to explain the distribution of the AAs identified as significant by the DAAA in porcine, bovine and fish skin gelatines. The skin gelatine plots in Fig. 3 (a) – (d) had two principal components (PCs) with a cumulative variability (CV) of 78.54% and an eigenvalue (EV) of 3.97 that explained the distribution of the 13 AAs. However, our study could not achieve this aim when clusters of different sources were mixed, as shown in Fig. 3 (a). Hair et al. (2014) adopted the orthogonal Varimax rotation and its rotation value since it is superior to other orthogonal rotations, e.g. Equimax and Quartimax, in simplifying the PC structure and providing optimal clusters (Otto, 2017). This study applied Varimax rotation at rotation values of two, four and six to enhance the variance of the factor loadings (FLs) of the PCs, reducing dimensionality and facilitating the explanation of the distribution of the 13 AAs in each skin gelatine (de Almeida et al., 2020). Of these, the Varimax rotation with a value of four in Fig. 3 (c) was the optimal rotation since it repositioned all skin gelatines into their clusters.
Figure 3 (e) assigned the AAs to the three clusters by overlaying the skin gelatine and AA plots to investigate their distribution in each skin gelatine. Figure 3 (a) also depicted information absent from the PLS-DA, such as the dominant, moderate and low AA content in each cluster. The dominant AAs were as follows: Tyr, Phe and Val in porcine gelatine and Met, Thr, Ser, His, Arg and Gly in fish gelatine, since these AAs and the clusters were in the same direction. This finding was in line with Abdullah Sani et al. (2021). Our study had a similarity with Azilawati et al. (2015) on Tyr, Met, Thr and Ser in porcine and bovine skin gelatines, respectively. The Pro, Leu and Hyp contents were moderate in both porcine and bovine skin gelatines, while the Ile content was moderate in bovine and fish skin gelatines. This moderate content was due to the direction of these AAs lying between these clusters. Since Hyp was moderately distributed in porcine and bovine skin gelatines, our result may agree with Yuswan et al. (2021), who proposed Hyp as one of the biomarkers for halal authentication in gelatine products. The Arg and Gly, and Ile contents were low in porcine and bovine skin gelatines, respectively, since their directions were opposite to these clusters. Likewise, fish skin gelatine had low Pro, Leu and Hyp. This finding supports conducting PCA with a Varimax rotation value of four to (1) ensure that all skin gelatines are grouped in their specific clusters and (2) assign the dominant, moderate and low content of AAs in each cluster.