Anticoagulant Activity Analysis and Origin Identification of Panax notoginseng Using HPLC Chromatography and ATR-FTIR Spectroscopy

doi:10.21203/rs.3.rs-1601385/v1

At present, the correlation between saponin content and anticoagulant activity of Panax notoginseng from different origins, and the identification of those origins, have rarely been reported. Here, high-performance liquid chromatography (HPLC), pharmacological experiments, and attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy coupled with chemometrics were employed to analyze this correlation and to identify the origin of P. notoginseng. The HPLC and pharmacological experiments showed that the P. notoginseng saponin (PNS) content of main roots did not differ significantly among origins, although it was highest in samples from Wenshan Prefecture, at 9.86%. Wenshan samples also gave the longest prolongation of coagulation time, 1.92 s. Anticoagulant activity was positively correlated with the contents of the three saponins and PNS, with correlation coefficients of 0.298, 0.636**, 0.507**, and 0.597**, respectively; the content of ginsenoside Rg₁ had the greatest influence on the anticoagulant effect. Therefore, it is necessary to identify the origin of P. notoginseng. On this basis, variable selection was performed on the ATR-FTIR spectra by manual selection, competitive adaptive reweighted sampling (CARS), and variable importance in projection (VIP). A partial least squares discriminant analysis (PLS-DA) classification algorithm was then applied to identify the origin of P. notoginseng. The results show that variable selection can extract a small number of variables containing valid information and improve model performance, and that VIP gave the best ability to identify the origins of P. notoginseng. This approach could thus be a powerful analytical tool for the identification of other Chinese medicinal materials.

Panax notoginseng

HPLC

ATR-FTIR

Anticoagulation activity

Origin identification

Panax notoginseng (Burk.) F. H. Chen is one of the precious, bulk-traded traditional medicinal materials in China and a well-known traditional Chinese medicinal herb of Yunnan Province. It has a medicinal history of more than 600 years and is often sold as herbal medicine or made into herbal preparations, occupying a unique position in the regional market (Li et al, 2019; Xiong et al, 2019). As a material used as both medicine and food, it disperses blood stasis, promotes blood circulation, dilates blood vessels, reduces swelling, regulates blood lipids, and counteracts thrombosis, and it is popular among consumers (Wang et al, 2016; Shi et al, 2017). Saponins are important secondary metabolites and the pharmacodynamic basis of P. notoginseng (Wang et al, 2014; Kim 2012). These saponin components are often used as quality control indicators for P. notoginseng medicinal materials and preparations. Modern pharmacological studies have shown that the anticoagulant effect of P. notoginseng is mostly attributed to its saponin components. For example, Ren et al (2020) evaluated the anticoagulation activity of extracts by measuring and analyzing the saponin composition of P. notoginseng and other medicinal plants of the ginseng genus, in order to verify and clarify the possible mechanism of the anticoagulation effect. Du et al (2015) showed, through ultra-performance liquid chromatography (UPLC) combined with hierarchical clustering analysis (HCA) and multiple linear regression analysis (MLRA), that saponins have strong anticoagulation activity. In addition, a study that isolated, identified, and quantitatively analyzed 16 ginsenosides and two sapogenins indicated that some of these compounds are potential natural inhibitors of coagulation factor Xa (FXa) (Xiong et al, 2017).

However, in recent years, continuously increasing demand, continuous cropping obstacles in P. notoginseng cultivation, and the shortage of land resources in Wenshan Prefecture of Yunnan Province have led to its gradual introduction into surrounding areas such as Honghe, Qujing, and Kunming (Dong et al, 2020; Li et al, 2017). Environmental factors such as temperature, humidity, altitude, and soil type differ among origins and could lead to significant differences in the internal composition and quality of P. notoginseng, giving rise to shoddy and adulterated products in the market. The factor with the greatest impact on the quality of P. notoginseng is often its chemical composition (saponins), and differences in saponin content and composition change its main pharmacological effects and clinical efficacy. Therefore, differences among origins of P. notoginseng have become a key issue that deserves attention, and in-depth and comprehensive research on it is necessary.

Chemical fingerprints combined with chemometric methods are currently among the most commonly used approaches for evaluating the chemical characteristics and identifying the origins of Chinese medicinal materials. High-performance liquid chromatography (HPLC) dominates the field of drug analysis owing to its fast analysis speed and wide range of applications (Luo et al, 2019; Li et al, 2012). It provides tools for quality assessment mainly through the separation and identification of the chemical components of Chinese medicinal materials (Zhang et al, 2019; Guo et al, 2020; Xu et al, 2022), and it is widely used for similarity and difference analysis, determination of components, identification, and pharmacological analysis of Chinese herbal medicines (Peng et al, 2021; Sun et al, 2019; Wang et al, 2018; Nie et al, 2011). Infrared (IR) spectroscopy has been widely applied in the identification of Chinese herbal materials because it is non-destructive and fast and macroscopically reflects the overall structural information of the chemical composition of a sample (Casale et al, 2016; Biancolillo et al, 2020). It analyzes the overall chemical composition on the basis of compound structure information and is combined with chemometrics to identify and predict Chinese herbal materials and to evaluate their quality by analyzing model performance (Yue et al, 2021; Chen, Lin and Tan 2018; Liu et al, 2020). In qualitative and quantitative analysis based on IR, chemometric methods are often used to explore the linear correlation between the overlapping spectra and the chemical composition of the sample (Du et al, 2021). Variable selection is one of the important steps in chemometric methods; it is mainly applied to eliminate redundant and collinear information, which reduces the computational load and model dimensions and improves model performance (Liu et al, 2022).
Among the most frequently applied variable selection methods, variable importance in projection (VIP), interval partial least squares (iPLS), competitive adaptive reweighted sampling (CARS), and Monte Carlo uninformative variable elimination (MC-UVE) have been widely applied with IR as powerful tools to assess the quality of Chinese medicinal materials (Zhou et al, 2007; Pan et al, 2020; Liu et al, 2021; SL. Li et al, 2020). However, up to now, few studies have applied feature extraction methods for variable selection from IR spectra to identify the origin of P. notoginseng.

Therefore, the purpose of this work was to apply high-performance liquid chromatography (HPLC) and attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR) to analyze the influence of the saponin components of P. notoginseng from different origins on anticoagulant activity and to identify their origins, so as to evaluate quality comprehensively. The saponins of P. notoginseng from different origins were determined by HPLC, and the correlation between anticoagulation activity and saponin content was analyzed. The potential and feasibility of the VIP and CARS variable selection methods, based on ATR-FTIR data, for identifying P. notoginseng from different origins were then examined. In addition, to obtain the best identification method, different preprocessing methods were applied to the ATR-FTIR spectra, and the most influential spectral wavelengths in the classification process were manually selected.

Materials and Reagents

A total of 359 main-root samples of P. notoginseng were collected from four origins in Yunnan Province: Honghe (HH), Kunming (KM), Qujing (QJ), and Wenshan (WS). Detailed information on the P. notoginseng samples is shown in Fig. 1. All collected fresh samples were cleaned and dried naturally at room temperature, and the main roots, rhizomes, and fibrous roots were separated and sealed for storage. Before the experiment, the main-root samples were ground with a pulverizer (FW-100, Tianjin Huaxin Instrument Factory) and passed through a 90-mesh sieve. Finally, the powder was dried in an oven at 50 ℃ and kept at constant temperature for further analysis.

The standards, notoginsenoside R₁ and ginsenosides Rg₁ and Rb₁ (batch number 1028C022, content ≥ 98%), were purchased from Beijing Soledad Bao Technology Co. (Beijing, P.R. China). Chromatographically pure methanol and acetonitrile for HPLC were purchased from Sigma-Aldrich (Shanghai, China) and Merck & Co., Inc. (Kenilworth, NJ, USA), respectively. Analytical-grade ethanol was supplied by Tianjin Zhiyuan Chemical Reagent Co., Ltd. Deionized (ultra-pure) water for the HPLC system was prepared using a UPTL-II-40L system (Chengdu, China).

Phosphate-buffered saline (PBS) was purchased from Beijing Soledad Bao Technology Co. (Beijing, China). Rivaroxaban was purchased from Bayer Pharma AG (Berlin, Germany). The prothrombin time (PT) assay kit was purchased from Sysmex Corporation (Kobe, Japan).

Male rats (220–260 g) were obtained from Hunan Slake Jingda Experimental Animal Co., Ltd (License No. SCXK (Xiang) 2019-0004). The animal research was approved by Yunnan University of Traditional Chinese Medicine, and the animals were housed at the experimental animal center of the same university. They were kept in a dark, air-conditioned room with a controlled temperature of 23 ± 1 ℃ and a humidity of 30–70%, with unlimited access to food and water. In addition, the animals were acclimated for no less than one week before any experiments.

Pharmacological Analysis

HPLC Determination and Saponin Content Analysis

All samples were analyzed using an Agilent 1260 liquid chromatograph (Agilent Corporation, USA) coupled with a Zorbax Eclipse Plus C18 column (4.6 mm × 250 mm, 5 µm; Agilent Corporation, USA). Mobile phase A was water and mobile phase B was acetonitrile, used in gradient elution. The gradient conditions were as follows: 0–12 min, 19% B; 12–60 min, 19%→36% B; 60–77 min, 36% B; 77–80 min, 36%→19% B; 80 min, 19% B. The flow rate was 0.6 mL/min, the injection volume was 10 µL, the column temperature was 30 ℃, and the samples were detected by absorption at 203 nm. The method was validated according to the ICH Harmonised Tripartite Guideline (2006) for precision, repeatability, and stability. Similarity and fingerprint analysis were performed with the Similarity Evaluation System for Chromatographic Fingerprint of TCM (Version 2004 A, Chinese Pharmacopoeia Committee, Beijing, China).
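The gradient programme above can be written as a small lookup, which is convenient when checking the mobile-phase composition at a given retention time. The following is a minimal sketch, assuming linear ramps between the stated breakpoints (the text gives only the endpoints of each segment); the function name `percent_b` is illustrative.

```python
# Gradient breakpoints (time in min, % mobile phase B) transcribed from the text.
GRADIENT = [(0.0, 19.0), (12.0, 19.0), (60.0, 36.0), (77.0, 36.0), (80.0, 19.0)]


def percent_b(t_min):
    """Return the % acetonitrile (B) at time t_min.

    Assumption: composition changes linearly between breakpoints.
    """
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-80 min programme")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Composition halfway through the 12-60 min ramp.
print(percent_b(36.0))
```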

For each origin, an amount of sample equivalent to 0.2 g of crude drug was placed into a 10 mL centrifuge tube, to which 5 mL of methanol was added. The extract was filtered through a nylon membrane filter (0.22 µm) into a 2 mL sample bottle as the test solution. The mixed reference solution of notoginsenoside R₁, ginsenoside Rg₁, and ginsenoside Rb₁ was prepared according to the standard of the Chinese Pharmacopoeia, 2020 Edition (Commission C.P., 2020), and identified by HPLC. The content of the test solution was calculated from the standard curve of the reference solution concentrations.
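The standard-curve calculation can be illustrated as follows. This is a minimal sketch, assuming an external-standard linear calibration of peak area against concentration; the standard concentrations and peak areas are hypothetical values chosen for illustration, and only the 5 mL extraction volume and 0.2 g sample mass come from the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def content_percent(peak_area, slope, intercept, volume_ml, sample_mg):
    """Convert a peak area to saponin content (% w/w) of the crude drug."""
    conc_mg_per_ml = (peak_area - intercept) / slope
    return conc_mg_per_ml * volume_ml / sample_mg * 100.0


# Hypothetical calibration standards (mg/mL) and their peak areas.
std_conc = [0.1, 0.2, 0.4, 0.8]
std_area = [250.0, 500.0, 1000.0, 2000.0]
slope, intercept = fit_line(std_conc, std_area)

# 5 mL of methanol and 200 mg (0.2 g) of crude drug, as in the text.
print(content_percent(1000.0, slope, intercept, volume_ml=5.0, sample_mg=200.0))
```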

Preparation Methods

Powder (10 g) of each sample was mixed with 80 mL of 80% ethanol. The mixture was extracted ultrasonically for 30 min each time and suction-filtered; the extraction was repeated and the filtrates were combined. The solvent was removed by distillation under reduced pressure until dryness to obtain the total extract of P. notoginseng main roots. PBS (pH = 7.4) was prepared with 0.01 M dibasic sodium phosphate; the sample amount for each origin was converted to 0.2 g of raw drug, and 2 mL of PBS was added to the total extract. Plasma was obtained from male SD rats (220–260 g) anesthetized by intraperitoneal injection: 2 mL of blood was taken from the intraperitoneal vein (0.2 mL of sodium citrate + 1.8 mL of venous blood), gently inverted, and separated within 1 h. Rat plasma was prepared as the supernatant after centrifugation at 3000 r/min for 15 min. The positive control was prepared by diluting rivaroxaban (10 mg/tablet) to 0.001 mg·mL⁻¹.

Determination of the Prothrombin Time

Analysis was performed using a Sysmex CA-600 fully automatic blood coagulation analyzer (Sysmex Corporation, Kobe, Japan). Plasma (200 µL) was placed in a 2 mL blood collection tube; 100 µL of sample was added for the experimental group, 100 µL of PBS for the blank control group, and 100 µL of the rivaroxaban dilution for the positive control group. The prothrombin time was then determined on the analyzer. Each sample was measured three times in parallel.

Correlation Coefficient Analysis

SPSS Statistics 21.0 was used to process the experimental data and to test whether saponin content differed significantly among P. notoginseng main roots of different origins. Data were examined for normality and homogeneity of variances, and a one-way analysis of variance was conducted. In the homogeneity test, variances were considered homogeneous if P > 0.05; in the univariate analysis, differences were considered significant if P < 0.05.
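For readers without SPSS, the core of the one-way analysis of variance can be sketched in a few lines. The example below computes only the F statistic and its degrees of freedom from hypothetical group values; obtaining the P value additionally requires the F distribution (for instance via a statistics package), which is outside this sketch.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical saponin contents (%) for three groups of three samples.
f_stat, df_b, df_w = one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
print(f_stat, df_b, df_w)
```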

Spectral Analysis

ATR-FTIR Spectral Acquisition

FTIR spectra were collected with a Frontier FT spectrometer (PerkinElmer, USA) equipped with a deuterated triglycine sulfate (DTGS) detector and coupled with a ZnSe attenuated total reflectance accessory (PerkinElmer, Norwalk, CT, USA). The instrument was preheated at 65% relative humidity for 30 min prior to the analysis. The room temperature was kept at 25 ℃ and the relative humidity below 45% throughout the scanning experiments. The background was removed before scanning each sample to reduce the absorption of CO₂ and H₂O. Spectra were recorded in absorbance over the range 4000–400 cm⁻¹. Each sample was scanned 16 times at a resolution of 4 cm⁻¹ and measured three times in parallel. Finally, the average spectrum was calculated and used to establish the models in the subsequent analysis.

Data Preprocessing

The original spectra contain a great deal of chemical information, but they also contain peak overlaps and interferences such as stray light, noise, and baseline drift (Y. Li et al, 2020). Therefore, derivatives (Arndt et al, 2020), multiplicative scatter correction (MSC), standard normal variate (SNV) (Dhanoa et al, 1994), Savitzky-Golay (S-G) filtering (Savitzky and Golay 1964), and their combinations were applied as preprocessing to remove unnecessary signal variations. Before preprocessing, MATLAB 2017a (The MathWorks) was used to divide the data into a training set and a test set with the classic Kennard-Stone (K-S) algorithm, to eliminate human interference. Of the samples, 238 were used as the training set and the other 121 for external model verification. All preprocessing methods were carried out in SIMCA-P + 14.0 software (Umetrics, Umea, Sweden).
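The Kennard-Stone split mentioned above selects training samples so that they cover the spectral space uniformly. The following is a minimal sketch of the usual selection rule, assuming Euclidean distances between spectra; it is not the MATLAB script used in this work, and the five one-dimensional "spectra" are hypothetical.

```python
import math


def kennard_stone(X, n_train):
    """Return indices of n_train samples chosen by the Kennard-Stone rule."""
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_train:
        # Add the sample whose nearest selected neighbour is farthest away.
        nxt = max(remaining, key=lambda r: min(dist[r][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Hypothetical one-variable "spectra" for five samples.
X = [[0.0], [1.0], [2.0], [10.0], [5.0]]
train_idx = kennard_stone(X, 3)
test_idx = [i for i in range(len(X)) if i not in train_idx]
print(train_idx, test_idx)
```

In this work the same rule would be applied with 238 training samples out of 359, leaving 121 samples for external verification.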

Selection of Feature Variable

The main purpose of variable selection is to select relevant, information-rich data so as to reduce the dimensionality and redundancy of the data. It can also decrease the interference of irrelevant information with the model and improve the efficiency of the model and the accuracy and reliability of the predictions (Pei, Zhang and Wang 2020). Variable importance in projection (VIP) is the most commonly applied variable selection method in PLS models and has been widely used to select important variables in PLS/PLS-DA models to reduce model dimensions and enhance interpretability (Galindo-Prieto, Trygg and Geladi 2017). The confidence interval of VIP was 95%. A VIP value > 1 indicates that the variable is important; for values from 0.5 to 1, the importance of the variable should be judged according to the specific problem; and a VIP value < 0.5 indicates that the variable is not important (Liu et al, 2020). Therefore, the variables with VIP value > 1 in the 4000–400 cm⁻¹ range were screened in SIMCA-P + 14.0 software (Umetrics, Umea, Sweden) for further modeling analysis.
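The VIP threshold can be made concrete with a short calculation. The sketch below uses the commonly cited VIP formula for a PLS model with a single response, taking the weight vectors, score vectors, and y-loadings of an already fitted model as inputs. It is an assumption that this matches the SIMCA implementation used here, and the one-component model in the example is hypothetical.

```python
import math


def vip_scores(weights, scores, y_loadings):
    """VIP of each variable for a single-response PLS model.

    weights:    one weight vector per component (each of length p)
    scores:     one score vector per component (each of length n)
    y_loadings: one y-loading per component
    """
    p = len(weights[0])
    # Response variation explained by each component: q_a^2 * t_a' t_a.
    explained = [q * q * sum(t * t for t in t_a) for q, t_a in zip(y_loadings, scores)]
    total = sum(explained)
    vips = []
    for j in range(p):
        acc = 0.0
        for w_a, ss_a in zip(weights, explained):
            norm_sq = sum(w * w for w in w_a)
            acc += ss_a * w_a[j] ** 2 / norm_sq
        vips.append(math.sqrt(p * acc / total))
    return vips


# Hypothetical one-component model with three variables.
vips = vip_scores(weights=[[3.0, 4.0, 0.0]], scores=[[1.0, 2.0]], y_loadings=[0.5])
selected = [j for j, v in enumerate(vips) if v > 1.0]  # the VIP > 1 rule from the text
print(vips, selected)
```

By construction the squared VIP values sum to the number of variables, which is why 1 serves as a natural cut-off for "above-average" importance.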

The CARS algorithm is a recently established method for extracting the wavenumbers of characteristic variables. First, Monte Carlo sampling (random sampling) is used to select a subset of the samples in the calibration set. A PLS model is built on this subset, and adaptive reweighted sampling then retains the variables with large absolute regression coefficients, which serve as the measure of each variable's importance. Finally, a model is established for each variable subset through cross-validation, and the wavelength subset with the smallest root mean square error of cross-validation (RMSECV) is selected as the optimal subset (Li et al, 2009). The method was performed in MATLAB (version R2017b, MathWorks, USA). In this study, the number of Monte Carlo sampling runs was set to 500, 7-fold cross-validation was used, and centering was chosen as the pretreatment method.
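One ingredient of CARS that the description above leaves implicit is how many variables survive each sampling run. In the original formulation (Li et al, 2009) this is governed by an exponentially decreasing function that keeps all variables in the first run and two in the last. The sketch below illustrates only that schedule, not the full algorithm; the total of 1800 variables is a hypothetical value, while 500 runs is the setting reported in the text.

```python
import math


def edf_retained_counts(n_variables, n_runs):
    """Number of variables kept in each CARS run by the exponential schedule.

    The retained ratio is r_i = a * exp(-k * i), with a and k chosen so that
    r_1 = 1 (all variables) and r_N = 2 / p (two variables).
    """
    a = (n_variables / 2) ** (1 / (n_runs - 1))
    k = math.log(n_variables / 2) / (n_runs - 1)
    return [
        max(2, round(n_variables * a * math.exp(-k * i)))
        for i in range(1, n_runs + 1)
    ]


# 500 sampling runs as in the text; 1800 variables is an assumed value.
counts = edf_retained_counts(n_variables=1800, n_runs=500)
print(counts[0], counts[249], counts[-1])
```

The schedule removes variables quickly in the early runs and slowly in the later ones, so that coarse elimination is followed by refined selection.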

Pattern Recognition Technology

PLS-DA is a linear multivariate discrimination method widely used in chemometrics, with applications in the food and herb fields such as the authentication of origin, cultivation mode, processing method, and fraud (Lu et al, 2020; Ballabio et al, 2018; Walkowiak et al, 2019; Górski, Kowalcze and Jakubowska 2019). It calculates the probability of each class and assigns a sample to the class with the highest probability, and it copes with complicated data matrices through dimension reduction. Therefore, PLS-DA was used in this study to establish a discrimination model for identifying the origin of P. notoginseng. Classification performance was evaluated with a confusion matrix. The total numbers of true positive (TP), false positive (FP), true negative (TN), and false negative (FN) samples were summarized to calculate the sensitivity, specificity, and accuracy of the model. A good model must have high sensitivity and specificity; the closer the values are to 1, the better the model. PLS-DA models were established in SIMCA-P + 14.0 software (Umetrics, Sweden). The related equations are shown below:

$$\text{Sensitivity}=\frac{TP}{TP+FN}\times 100 \tag{1}$$

$$\text{Specificity}=\frac{TN}{TN+FP}\times 100 \tag{2}$$

$$\text{Accuracy}=\frac{TP+TN}{\text{Total samples}}\times 100 \tag{3}$$
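Equations 1 to 3 are evaluated per class in a one-vs-rest fashion. The following minimal sketch shows the calculation for a four-class confusion matrix; the counts are hypothetical and do not reproduce the matrices in Fig. 3.

```python
def one_vs_rest_metrics(confusion, labels):
    """Per-class sensitivity, specificity and accuracy (in %).

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for c, label in enumerate(labels):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(row[c] for row in confusion) - tp
        tn = total - tp - fn - fp
        metrics[label] = {
            "sensitivity": 100.0 * tp / (tp + fn),
            "specificity": 100.0 * tn / (tn + fp),
            "accuracy": 100.0 * (tp + tn) / total,
        }
    return metrics


# Hypothetical confusion matrix for 40 samples (rows: true, columns: predicted).
labels = ["HH", "KM", "QJ", "WS"]
confusion = [
    [9, 1, 0, 0],
    [0, 10, 0, 0],
    [0, 0, 8, 2],
    [1, 0, 0, 9],
]
metrics = one_vs_rest_metrics(confusion, labels)
print(metrics["HH"])
```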

Saponin Content and Prothrombin Time Analysis

The contents of the three saponins in P. notoginseng samples from different origins were calculated according to the content determination method. The positive control prolonged the prothrombin time by 12 s, and the prolonged prothrombin times of the P. notoginseng samples from 40 origins, shown in Table 1, differed significantly from that of the positive control. The saponin content of P. notoginseng main roots did not differ significantly among origins, in the order WS (9.86%) > HH (7.60%) > KM (7.56%) > QJ (7.53%). The prolongation of coagulation time followed the order WS (1.92 s) > QJ (1.81 s) > KM (1.72 s) > HH (1.40 s).

Table 1

Content of 3 saponins and prothrombin time in P. notoginseng.
No.	State/Origins	Notoginsenoside R₁(%)	Ginsenoside Rg₁(%)	Ginsenoside Rb₁(%)	Total of saponins (%)	Extension PT(s)
1	WS	1.13 ± 0.40^a	4.99 ± 2.15^a	3.89 ± 1.58^a	9.86 ± 3.72^a	1.92 ± 0.80^a
2	HH	0.93 ± 0.21^a	3.74 ± 0.49^a	2.92 ± 0.32^a	7.60 ± 0.82^a	1.40 ± 0.55^a
3	QJ	0.98 ± 0.17^a	3.79 ± 0.80^a	2.75 ± 0.51^a	7.53 ± 1.29^a	1.81 ± 0.26^a
4	KM	0.99 ± 0.15^a	3.79 ± 1.01^a	2.78 ± 0.44^a	7.56 ± 1.31^a	1.72 ± 0.52^a

The correlation between saponin content and the prolonged prothrombin time was analyzed using SPSS 21.0 software. The correlation coefficient between notoginsenoside R₁ and the prolonged coagulation time was 0.298, indicating a weak positive correlation. The correlation coefficients of ginsenoside Rg₁, ginsenoside Rb₁, and P. notoginseng saponin (PNS) with the prolonged coagulation time were 0.636**, 0.507**, and 0.597**, respectively, showing significant positive correlations (Table 2). That is, the higher the saponin content, the better the anticoagulant effect of P. notoginseng. Among the saponins, the content of ginsenoside Rg₁ had the greatest influence on the anticoagulant effect, which deserves further exploration in later experiments.

Table 2

Correlation coefficient of saponins.
Saponins components	Correlation coefficient
Notoginsenoside R₁	0.298
Ginsenoside Rg₁	0.636**
Ginsenoside Rb₁	0.507**
Total of saponins	0.597**
* indicates significance at the 0.05 level, and ** indicates significance at the 0.001 level.
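The coefficients in Table 2 can be reproduced in principle from paired content and prothrombin-time values. The sketch below assumes a Pearson correlation coefficient, which the text does not state explicitly, and uses hypothetical paired data rather than the measurements of this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ginsenoside Rg1 contents (%) and prolonged prothrombin times (s).
rg1_content = [3.0, 3.5, 4.0, 4.5, 5.0]
extended_pt = [1.2, 1.5, 1.4, 1.9, 2.0]
r = pearson_r(rg1_content, extended_pt)
print(round(r, 3))
```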

ATR-FTIR Spectra Analysis and Preprocessing

Figure 2 shows the average spectra and the preprocessed spectra of the different origins in the range of 4000 to 400 cm⁻¹. As seen in Fig. 2A, the samples from different origins have the same absorption peaks, with marginally different peak intensities. The band around 3285 cm⁻¹ was attributed to the symmetric and asymmetric stretching vibrations of O-H. The peak at 2922 cm⁻¹ was mainly assigned to the asymmetric stretching vibration of CH₂ (Li, Zhang and Wang et al, 2018). The peak at 1753 cm⁻¹ was assigned to the carbonyl group, and the peak at 1630 cm⁻¹ to the bending vibration of O-H. The broad band around 1411 cm⁻¹ was mainly assigned to the deformation vibration of C-H. The peak at 1368 cm⁻¹ corresponded to NO₃⁻, the signature of nitrate in the samples (Yang et al, 2018). The peak at 1236 cm⁻¹ showed the CH₂OH mode, and the bands at 1145 and 1072 cm⁻¹ arose from the symmetric vibrations of C-C and C-O. The peak at 1009 cm⁻¹ was mainly assigned to the bending vibration of C-O-H (Ma et al, 2016). The peaks at 923 and 862 cm⁻¹ represented the stretching vibration of C-C and the bending vibration of C-H (Yang et al, 2019). The different origins could not be discriminated directly from the original spectra by the naked eye, which may be due to the similar chemical composition of the P. notoginseng samples.

Each ATR-FTIR spectral dataset was preprocessed by different methods, and the results showed that the second-derivative (2nd) spectra gave the best effect (Table 3). The second derivative enhanced the rate of change across the entire spectrum and revealed slight differences. As circled in Fig. 2B, the main differences in the 2nd spectra were located in two spectral regions, consistent with the absorption peaks of the original spectra (Fig. 2A). The characteristic peaks in the 1800–500 cm⁻¹ and 3700–2800 cm⁻¹ regions were more obvious, and the absorption intensities of the characteristic peaks for KM and WS were higher than those of the other origins. That is, the types and contents of saponins, flavonoids, sugars, and other components vary among origins. Therefore, the supervised PLS-DA model was applied to the 2nd spectra for further analysis.

Table 3

The classification accuracy for each class and total accuracy.
Pretreatment methods	Training set					Test set
	1	2	3	4	ACC	1	2	3	4	ACC
None	0.800	0.384	0.717	0.550	0.613	0.600	0.621	0.867	0.733	0.705
1st	0.817	0.850	0.883	0.800	0.838	0.933	0.897	0.967	0.900	0.924
2nd	0.933	0.914	0.933	0.901	0.920	0.966	1.000	0.967	0.933	0.967
MSC	0.767	0.833	0.850	0.684	0.783	0.867	0.966	0.933	0.933	0.924
SNV	0.800	0.850	0.850	0.650	0.786	0.833	0.931	0.967	0.933	0.916
2nd + MSC	0.967	0.983	0.917	0.900	0.942	0.967	1.000	1.000	0.933	0.975
2nd + SNV	0.966	0.950	0.900	0.883	0.925	0.967	1.000	1.000	1.000	0.992
2nd + MSC + SG	0.900	0.900	0.816	0.850	0.867	1.000	0.966	0.933	0.866	0.941
2nd + SNV + SG	0.800	0.883	0.783	0.800	0.817	0.833	0.897	0.867	0.833	0.857
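The pretreatments compared in Table 3 are simple transformations of each spectrum. The sketch below gives minimal versions of SNV, MSC, and a second derivative. The second derivative here is a plain central finite difference used as a stand-in, since the exact derivative settings applied in SIMCA are not reported, and the short input vectors are hypothetical.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n_pts = len(spectra[0])
    ref = [sum(s[i] for s in spectra) / len(spectra) for i in range(n_pts)]
    mean_ref = sum(ref) / n_pts
    s_rr = sum((r - mean_ref) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        mean_s = sum(s) / n_pts
        # Regress the spectrum on the reference: s = offset + slope * ref.
        slope = sum((r - mean_ref) * (v - mean_s) for r, v in zip(ref, s)) / s_rr
        offset = mean_s - slope * mean_ref
        corrected.append([(v - offset) / slope for v in s])
    return corrected


def second_derivative(spectrum):
    """Central second difference; the two end points are dropped."""
    return [
        spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]
        for i in range(1, len(spectrum) - 1)
    ]


print(snv([1.0, 2.0, 3.0]))
print(msc([[1.0, 2.0, 4.0], [3.0, 5.0, 9.0]]))
print(second_derivative([0.0, 1.0, 4.0, 9.0, 16.0]))
```

Combinations such as "2nd + SNV" correspond to applying these functions in sequence to each spectrum.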

PLS-DA Analysis of the Manual Selected Wavelength Range

Some wavelengths in the whole spectral region may contain useless information, which can affect discrimination models built on the entire region. Selecting wavelengths that are strongly correlated with the target sample characteristics can therefore help eliminate redundant information and increase model performance (Moros et al, 2010). To overcome this drawback, the regions in which the main absorption peaks are distributed (see section 3.2), with noise removed, namely 3700–2800 cm⁻¹ and 1800–500 cm⁻¹, were manually selected.

The classification parameters and accuracies of the classification models established with the whole and the manually selected wavelengths are shown in Tables 4 and 5. The model with the highest classification accuracy was considered the best discriminant model. For the whole-wavenumber 2nd spectrum model, the accuracies of the training set and the test set were 0.920 and 0.967, respectively, which indicated that the model accuracy was high. By comparison, the performance of the manually selected wavenumber model decreased, with accuracies of 0.912 and 0.934, respectively. Table 5 shows that the accuracies of the classification parameters for the whole and manually selected wavenumbers were all higher than 90%; however, 19 of the 238 training set samples and 4 of the 121 test set samples were misclassified with the whole wavenumber range (Fig. 3A). These results indicate that the PLS-DA model based on the whole wavenumber range classifies P. notoginseng more effectively than the one based on manually selected wavenumbers. A possible explanation is that manually selecting the main absorption peak regions ignores some spectral information that is important for multivariate analysis.

Table 4

The classification parameters of PLS-DA models based on the original spectra and on variable selection methods.
Selection method	Class	Training set			Test set
		SEN	SPE	ACC	SEN	SPE	ACC
Whole	HH	0.933	0.961	0.954	0.996	1	0.992
	KM	0.914	0.972	0.958	1	1	1
	QJ	0.933	0.978	0.966	0.967	0.967	0.967
	WS	0.900	0.983	0.962	0.933	0.990	0.922
Select	HH	0.950	0.961	0.958	1	0.967	0.975
	KM	0.862	0.967	0.941	0.871	0.989	0.959
	QJ	0.933	0.972	0.962	0.935	0.978	0.975
	WS	0.900	0.983	0.962	0.931	0.978	0.959
VIP	HH	0.967	0.983	0.979	1	1	1
	KM	0.966	0.989	0.983	1	1	1
	QJ	0.983	0.994	0.992	0.983	1	0.983
	WS	1	1	1	1	0.978	0.983
CARS	HH	0.950	0.972	0.966	0.967	0.978	0.975
	KM	0.897	0.983	0.962	0.968	0.978	0.975
	QJ	0.933	0.972	0.962	0.933	1	0.933
	WS	0.933	0.978	0.966	1	1	1

Table 5

The classification accuracy for each class and total accuracy for different variable selection methods.
Variable selection method	Training set					Test set
	1	2	3	4	ACC	1	2	3	4	ACC
Whole	0.933	0.914	0.933	0.901	0.920	0.966	1.000	0.967	0.933	0.967
Select	0.950	0.862	0.933	0.900	0.912	1	0.871	0.967	0.900	0.934
VIP	0.967	0.967	0.983	0.983	0.975	1	1	0.933	1	0.984
CARS	0.950	0.897	0.933	0.933	0.929	0.967	0.968	0.933	1	0.967

PLS-DA Analysis Based on Variable Selection

As mentioned earlier, variable selection strategies can select the important variables that influence the model. We selected variables with VIP value > 1 as the important wavenumbers (Fig. 4). From the ATR-FTIR spectra of the P. notoginseng samples, 760 wavenumber variables were selected. These selected variables were then assembled for the 359 samples into a data matrix of 359 samples × 760 variables, which was analyzed with the PLS-DA model.

Subsequently, the CARS algorithm was used to extract the important variables from the ATR-FTIR spectra. One of the great advantages of the CARS algorithm is that its sampling strategy gives each variable a comparable probability of being sampled, which greatly raises the chance that important variables are selected. In the present work, the ATR-FTIR spectral data were input into the CARS script, and 125 wavenumber variables were finally screened. A data matrix of 359 samples × 125 variables was therefore built and further analyzed.

The classification parameters, classification accuracies, and confusion matrices of the PLS-DA models established with the two variable selection methods are shown in Table 4, Table 5, and Fig. 3, respectively. The discriminative accuracy of the models using variable selection was higher than that of the whole-wavenumber and manually selected wavenumber models. The VIP method had the highest accuracy, with the training set and test set reaching 97.5% and 98.4%, respectively (Table 5). The total classification accuracy of both the training set and the test set was greater than 0.95. In addition, the confusion matrix showed that 4 of the 238 samples in the training set and 2 of the 121 samples in the test set were misclassified (Fig. 3C), which indicated that the model could identify the origins of P. notoginseng samples with high accuracy. With the CARS algorithm, 17 of the 238 training set samples and 4 of the 121 test set samples were misclassified (Fig. 3D), and the total classification accuracy was relatively low. The identification accuracy of the CARS model was thus lower than that of VIP, possibly because the feature variables extracted by the algorithm still contain redundant information. Overall, variable selection methods for spectral data could improve the prediction accuracy and stability of the discriminant model.

In the present work, the three saponins in P. notoginseng from different origins were determined by HPLC, and the influence of origin on the anticoagulant activity of P. notoginseng was analyzed. The ATR-FTIR spectral data were analyzed by manual selection and two variable selection strategies to identify the different origins, which demonstrated the feasibility of variable selection strategies for identifying P. notoginseng from different origins. The results showed that the PNS content was highest in samples from Wenshan Prefecture, at 9.86%, that these samples gave the longest prolongation of coagulation time, 1.92 s, and that anticoagulant activity was positively correlated with the contents of the three saponins and PNS, with correlation coefficients of 0.298, 0.636**, 0.507**, and 0.597**, respectively. The PLS-DA model using variable selection provided a reliable technique for identifying P. notoginseng from different origins. VIP achieved the best performance, with accuracies of 0.975 and 0.984 for the training and test sets, respectively (Table 5), and was slightly better than CARS. To sum up, in terms of PNS content, the quality of P. notoginseng from Wenshan Prefecture was better. The higher the saponin content, the better the anticoagulant effect of P. notoginseng, and the content of ginsenoside Rg₁ had the greatest influence on the anticoagulant effect. In addition, the variable selection strategies performed better than manual selection and extracted a small number of variables containing effective information, providing better results for the identification of different origins and a scientific basis for the quality control and rational utilization of P. notoginseng.

CRediT authorship contribution statement

Zhiying Cui and Chunlu Liu contributed equally to this manuscript. Zhiying Cui: HPLC chromatography and pharmacological analysis data curation and analysis, Software, and Writing-review & editing. Chunlu Liu: ATR-FTIR spectroscopy data curation and analysis, Software, and Writing-review & editing. Dandan Li: Methodology and validation. Yuanzhong Wang: Supervision, Project administration. Furong Xu: Review-editing, Project administration, Funding Acquisition.

Funding This work was supported by the National Natural Science Foundation of China (Grant Number: 81460581) and the Key Project for Yunnan Provincial Traditional Chinese Medicine Joint (Grant Number: 2018FF001(-004)).

Compliance with Ethical Standards

Conflict of Interest The authors declare that they have no conflict of interest.

Data Availability Statement The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Ethical Approval This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent Not applicable.

Arndt M, Drees A, Ahlers C, Fischer M (2020) Determination of the Geographical Origin of Walnuts (Juglans regia L.) Using Near-Infrared Spectroscopy and Chemometrics. Foods 9: 1860. https://doi.org/10.3390/foods9121860.
Ballabio D, Robotti E, Grisoni F, Quasso F, Bobba M, Vercelli S, Gosetti, Calabrese G, Sangiorgi E, Orlandi M, Marengo E (2018) Chemical profiling and multivariate data fusion methods for the identification of the botanical origin of honey. Food Chem 266: 79–89.
Biancolillo A, Marini F, Ruckebusch C, Vitale R (2020) Chemometric strategies for Spectroscopy-Based food authentication. Applied Sciences 10: 6544. https://doi.org/10.3390/app10186544.
Casale M, Bagnasco L, Zotti M, Piazza SD, Sitta N, Oliveri PA (2016) NIR spectroscopy-based efficient approach to detect fraudulent additions within mixtures of dried porcini mushrooms. Talanta 160: 729–734. https://doi.org/10.1016/j.talanta.2016.08.004.
Chen H, Lin Z, Tan C (2018) Fast discrimination of the geographical origins of notoginseng by near-infrared spectroscopy and chemometrics. J Pharmaceut Biomed 161: 239–245. https://doi.org/10.1016/j.jpba.2018.08.052.
Dhanoa MS, Lister SJ, Sanderson R, Barnes RJ (1994) The link between multiplicative scatter correction (MSC) and standard normal variate (SNV) transformations of NIR spectra. J Near Infrared Spec 2: 43–47.
Dong JE, Wang Y, Zuo ZT, Wang YZ (2020) Deep learning for geographical discrimination of Panax notoginseng with directly near-infrared spectra image. Chemometr Intell Lab 197: 103913. https://doi.org/10.1016/j.chemolab.2019.103913.
Du QW, Zhu MT, Shi T, Luo X, Gan B, Tang LJ, Chen Y (2021) Adulteration detection of corn oil, rapeseed oil and sunflower oil in camellia oil by in situ diffuse reflectance near-infrared spectroscopy and chemometrics. Food Control 121: 107577. https://doi.org/10.1016/j.foodcont.2020.107577.
Du XH, Zhao YL, Yang DF, Liu Y, Fan K, Liang ZS, Han RL (2015) A correlation model of UPLC fingerprints and anticoagulant activity for quality assessment of Panax notoginseng by hierarchical clustering analysis and multiple linear regression analysis. Anal Methods-UK 7: 2985–2992. https://doi.org/10.1039/C4AY02277G.
Galindo-Prieto B, Trygg J, Geladi P (2017) A new approach for variable influence on projection (VIP) in O2PLS models. Chemometr Intell Lab 160: 110–124. https://doi.org/10.1016/j.chemolab.2016.11.005.
Górski L, Kowalcze M, Jakubowska M (2019) Classification of six herbal bioactive compositions employing LAPV and PLS-DA. J Chemometr 33: e3112. https://doi.org/10.1002/cem.3112.
Guo L, Gong MX, Wu S, Qiu F, Ma L (2020) Identification and quantification of the quality markers and anti-migraine active components in Chuanxiong Rhizoma and Cyperi Rhizoma herbal pair based on chemometric analysis between chemical constituents and pharmacological effects. J Ethnopharmacol 246: 112228. https://doi.org/10.1016/j.jep.2019.112228.
Kim DH (2012) Chemical Diversity of Panax ginseng, Panax quinquifolium, and Panax notoginseng. J Ginseng Res 36: 1–15. https://doi.org/10.5142/jgr.2012.36.1.1.
Li HD, Liang YZ, Xu QS, Cao DS (2009) Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Anal Chim Acta 648: 77–84. https://doi.org/10.1016/j.aca.2009.06.046.
Li J, Wang RF, Zhou Y, Hu HJ, Yang YB, Yang L, Wang ZT (2019) Dammarane-type triterpene oligoglycosides from the leaves and stems of Panax notoginseng and their antiinflammatory activities. J Ginseng Res 46: 377–384. https://doi.org/10.1016/j.jgr.2017.11.008.
Li SL, Xing BC, Lin D, Yi HJ, Shao QS (2020) Rapid detection of saffron (Crocus sativus L.) adulterated with lotus stamens and corn stigmas by near-infrared spectroscopy and chemometrics. Ind Crop Prod 152: 112539. https://doi.org/10.1016/j.indcrop.2020.112539.
Li X, Wang YR, Ma L, Cui JX, Shen WX (2012) Application of the quality evaluation of traditional Chinese herbal medicines using chromatography of fingerprint. J Biomed Eng 29: 192–196.
Li Y, Shen Y, Yao CL, Guo DA (2020) Quality assessment of herbal medicines based on chemical fingerprints combined with chemometrics approach: A review. J Pharmaceut Biomed 185: 113215. https://doi.org/10.1016/j.jpba.2020.113215.
Li Y, Zhang J, Xu FR, Wang YZ, Zhang JY (2017) Rapid prediction study of total flavonids content in panax notoginseng using infrared spectroscopy combined with chemometrics. Spectrosc Spect Anal 37: 70–74.
Li Y, Zhang JY, Wang YZ (2018) FT-MIR and NIR spectral data fusion: A synergetic strategy for the geographical traceability of Panax notoginseng. Anal Bioanal Chem 410: 91–103. https://doi.org/10.1007/s00216-017-0692-0.
Liu CL, Zuo ZT, Xu FR, Wang YZ (2022) Authentication of herbal medicines based on modern analytical technology combined with chemometrics approach: A review. Crit Rev Anal Chem https://doi.org/10.1080/10408347.2021.2023460.
Liu L, Zuo ZT, Wang YZ, Xu FR (2020) A fast multi-source information fusion strategy based on FTIR spectroscopy for geographical authentication of wild Gentiana rigescens. Microchem J 159: 105360. https://doi.org/10.1016/j.microc.2020.105360.
Liu ZM, Yang SB, Wang YZ, Zhang JY (2021) Multi-platform integration based on NIR and UV-Vis spectroscopies for the geographical traceability of the fruits of Amomum tsao-ko. Spectrochim Acta A 258: 119872. https://doi.org/10.1016/j.saa.2021.119872.
Lu XH, Xia ZY, Qu FF, Zhu ZM, Li SW (2020) Identification of authenticity, quality and origin of saffron using hyperspectral imaging and multivariate spectral analysis. Spectrosc Lett 53: 76–85. https://doi.org/10.1080/00387010.2019.1693403.
Luo JY, Chen GS, Liu DH, Wang Y, Qi Q, Hu HY, Li PY, Bai J, Du SY, Lu Y, Wang YM, Liu C (2019) Study on the material basis of houpo wenzhong decoction by HPLC fingerprint, UHPLC-ESI-LTQ-Orbitrap-MS, and network pharmacology. Molecules 24: 2561. https://doi.org/10.3390/molecules24142561.
Ma F, Chen JB, Wu XX, Zhou Q, Sun SQ (2016) Rapid discrimination of Panax notogeinseng of different grades by FT-IR and 2DCOS-IR. J Mol Struct 1124: 131–137. https://doi.org/10.1016/j.molstruc.2016.02.087.
Moros J, Garrigues S, Guardia MDL (2010) Vibrational spectroscopy provides a green tool for multi-component analysis. TrAC-Trend Anal Chem 29: 578–591. https://doi.org/10.1016/j.trac.2009.12.012.
Nie H, Zhang H, Zhang XQ, Luo Y, Meng LZ, Yin Z, Zhang JY, Li KY, Xiong AH, Zhong L, Huang HQ, Ye WC (2011) Relationship between HPLC fingerprints and in vivo pharmacological effects of a traditional Chinese medicine: Radix Angelicae Dahuricae. Nat Prod Res 25: 53–61. https://doi.org/10.1080/14786419.2010.490784.
Pan W, Wu M, Zheng ZZ, Guo LH, Lin ZY, Qiu B (2020) Rapid authentication of Pseudostellaria heterophylla (Taizishen) from different regions by near-infrared spectroscopy combined with chemometric methods. J Food Sci 85: 2004–2009. https://doi.org/10.1111/1750-3841.15171.
Pei YF, Zhang QZ, Wang YZ (2020) Application of authentication evaluation techniques of ethnobotanical medicinal plant genus paris: A review. Crit Rev Anal Chem 50: 405–423. https://doi.org/10.1080/10408347.2019.1642734.
Peng C, Zhu YL, Yan FL, Su Y, Zhu YQ, Zhang ZY, Zuo CJ, Wu H, Zhang YJ, Kan JY, Peng DY (2021) The difference of origin and extraction method significantly affects the intrinsic quality of licorice: A new method for quality evaluation of homologous materials of medicine and food. Food Chem 340: 127907. https://doi.org/10.1016/j.foodchem.2020.127907.
Ren YS, Ai J, Liu XQ, Liang S, Zheng Y, Deng X, Li Y, Wang J, Chen LL (2020) Anticoagulant active ingredients identification of total saponin extraction of different panax medicinal plants based on grey relational analysis combined with UPLC-MS and molecular docking. J Ethnopharmacol 260: 112955. https://doi.org/10.1016/j.jep.2020.112955.
Savitzky A, Golay MJE (1964) Smoothing and differentiation of data by simplified least squares procedure. Anal Chem 36: 1627–1639. https://doi.org/10.1021/ac60214a047.
Shi XW, Yu WJ, Liu LX, Liu W, Zhang XM, Yang TT, Chai LM, Lou LX, Gao YH, Zhu LQ (2017) Panax notoginseng saponins administration modulates pro- /anti-inflammatory factor expression and improves neurologic outcome following permanent MCAO in rats. Metab Brain Dis 32: 221–233. https://doi.org/10.1007/s11011-016-9901-3.
Sun SS, Li YC, Zhu LJ, Ma HY, Li LP, Liu YF (2019) Accurate discrimination of Gastrodia elata from different geographical origins using high-performance liquid chromatography fingerprint combined with boosting partial least‐squares discriminant analysis. J Sep Sci 42: 2875–2882. https://doi.org/10.1002/jssc.201900073.
Walkowiak A, Ledziński A, Zapadka M, Kupcewicz B (2019) Detection of adulterants in dietary supplements with Ginkgo biloba extract by attenuated total reflectance Fourier transform infrared spectroscopy and multivariate methods PLS-DA and PCA. Spectrochim Acta A 208: 222–228. https://doi.org/10.1016/j.saa.2018.10.008.
Wang JR, Yau LF, Gao WN, Liu Y, Yick PW, Liu L, Jiang ZH (2014) Quantitative comparison and metabolite profiling of saponins in different parts of the root of Panax notoginseng. J Agr Food Chem 62: 9024–9034. https://doi.org/10.1021/jf502214x.
Wang T, Guo RX, Zhou GH, Zhou XD, Kou ZZ, Sui F, Li C, Tang LY, Wang ZJ (2016) Traditional uses, botany, phytochemistry, pharmacology and toxicology of Panax notoginseng (Burk.) F. H. Chen: A review. J Ethnopharmacol 188: 234–258. https://doi.org/10.1016/j.jep.2016.05.005.
Wang Y, Shen T, Zhang J, Huang HY, Wang YZ (2018) Geographical authentication of gentiana rigescens by High-Performance liquid chromatography and infrared spectroscopy. Anal Lett 51: 2173–2191. https://doi.org/10.1080/00032719.2017.1416622.
Xiong LX, Qi Z, Zheng BZ, Li Z, Wang F, Liu JP, Li PY (2017) Inhibitory Effect of Triterpenoids from Panax ginseng on Coagulation Factor X. Molecules 22: 649. https://doi.org/10.3390/molecules22040649.
Xiong Y, Chen LJ, Man JH, Hu YP, Cui XM (2019) Chemical and bioactive comparison of Panax notoginseng root and rhizome in raw and steamed forms. J Ginseng Res 43: 385–393. https://doi.org/10.1016/j.jgr.2017.11.004.
Xu J, Zhou RR, Luo L, Dai Y, Feng YR, Dou ZH (2022) Quality evaluation of decoction pieces of gardeniae fructus based on qualitative analysis of the HPLC fingerprint and Triple-Q-TOF-MS/MS combined with quantitative analysis of 12 representative components. J Anal Methods Chem. https://doi.org/10.1155/2022/2219932.
Yang XD, Li GL, Song J, Gao MJ, Zhou SL (2018) Rapid discrimination of Notoginseng powder adulteration of different grades using FT-MIR spectroscopy combined with chemometrics. Spectrochim Acta A 205: 457–464. https://doi.org/10.1016/j.saa.2018.07.056.
Yang XD, Song J, Wu X, Xie L, Liu XW, Li GL (2019) Identification of unhealthy Panax notoginseng from different geographical origins by means of multi-label classification. Spectrochim Acta A 222: 117243. https://doi.org/10.1016/j.saa.2019.117243.
Yue JQ, Li ZM, Zuo ZT, Zhao YL, Zhang J, Wang YZ (2021) Study on the identification and evaluation of growth years for Paris polyphylla var. Yunnanensis using deep learning combined with 2DCOS. Spectrochim Acta A 261: 120033. https://doi.org/10.1016/j.saa.2021.120033.
Zhang XF, Zhang SJ, Gao BB, Qian Z, Liu JJ, Wu SH, Si JP (2019) Identification and quantitative analysis of phenolic glycosides with antioxidant activity in methanolic extract of Dendrobium catenatum flowers and selection of quality control herb-markers. Food Res Int 123: 732–745. https://doi.org/10.1016/j.foodres.2019.05.040.
Zhou XH, Xiang BR, Wang ZM, Zhang M (2007) Determination of quercetin in extracts of ginkgo biloba l. Leaves by near-infrared reflectance spectroscopy based on interval partial least-squares (iPLS) model. Anal Lett 40: 3383–3391. https://doi.org/10.1080/00032710701689081.
