DRIFT spectroscopic determination of clay and organic matter in sediment by mixed soil-sediment calibration approach

Bioavailability and movement of pollutants through land and underground flows are strongly related to sediment characteristics such as clay minerals and organic matter. Therefore, the determination of clay and organic matter content in sediment is of great importance for environmental monitoring. Clay and organic matter in sediment were determined using diffuse reflectance infrared Fourier transform (DRIFT) spectroscopy in combination with multivariate analysis methods. Sediment from different depths was used in combination with soil samples of different texture. Using multivariate methods and DRIFT spectra, sediments from different depths were successfully grouped according to their similarity to soils of different texture. A quantitative analysis of clay and organic matter content was also performed using a new calibration approach in which sediment samples were combined with soil samples for principal component regression (PCR) calibration. PCR models for the assessment of clay and organic matter were determined for a total of 57 sediment samples and 32 soil samples, and satisfactory determination coefficients were obtained for the linear models (0.7136 for clay and 0.7062 for organic matter). The RPD values obtained for both models were satisfactory: 1.9 for clay and 1.8 for organic matter.


Introduction
Sediment acts as a final sink for mixtures of contaminants exported to surface water, which originate mainly from land and river runoff soils and from anthropogenic organic and inorganic loads. These contaminants may pose a high risk to the environment on a large scale; once contaminated, sediment can become a source of secondary pollution when, due to changing conditions in the water system (floods, acidification), sorbed pollutants are desorbed and returned to the water phase, where they pose a danger again (Liu et al., 2023). Environmental monitoring is conducted as continuous or repeated measurements of certain physical parameters, together with observation of chosen characteristics and properties of selected environmental components. To monitor pollution behavior in sediment, it is necessary to develop a good monitoring plan, since monitoring raises a series of questions, such as which parameters and chemicals should be included and where and when they should be measured (Di Guardo, 2014). Different sediment characteristics, such as clay minerals and organic matter, tend to show a strong positive correlation with the partitioning of organic and inorganic pollutants due to their sorptive nature and large reactive surface area (Crawford & Liber, 2015). The sorptive nature of clay and organic matter depends on many factors, such as particle size, flow resistance and scale of surface runoff, slope, and weather conditions. For example, Lee et al. (2019) concluded that the organic matter composition and flocculation potential of river water were dynamic under different meteorological, hydrological, ecological, and anthropogenic conditions and were closely correlated with each other. Higher rainfall intensity and the associated higher kinetic energy of raindrop impacts on soil surfaces increase aggregate breakdown, causing micro-aggregates to wash away.
Additionally, the size of sediment particles is closely related to the process of their aggregation, to which organic matter of indigenous origin contributes significantly. The formation of aggregates also depends, among other things, on the physical (e.g., cohesion phenomena or the lack thereof), chemical, and electrochemical conditions characterizing the medium under analysis, as well as on the properties of elementary particles (including structure, shape, and grain size), the intensity of surface runoff, climate zone, and season (Cieśla et al., 2022; Schweizer et al., 2021).
The most frequently used FTIR spectroscopic approaches for sediment investigation in environmental studies are classical linear regression based on the Lambert-Beer law, using peak areas of a characteristic band, and the multivariate approach, using selected spectral ranges. Successful classification and identification of different minerals are possible using FTIR techniques and hierarchical cluster analysis (Hassaan & El Nemr, 2021). The significant potential of FTIR techniques has been shown in paleontological studies by identifying specific sources of organic matter (Maxson et al., 2021). The ATR technique has been applied very successfully to the determination of diffusion in sediments (Fang et al., 2008) and to the determination of carbonates or biogenic silicates in marine sediment samples (Melucci et al., 2019). Further, FTIR-DRIFT has frequently been used in sediment analysis, for example for rapid quantification of humic substances (Tremblay & Gagne, 2002), particle structure and interactions in biofilms (Gallé et al., 2004), and determination of carbonate minerals (So et al., 2020). There are also numerous examples of the application of DRIFT spectroscopic techniques in soil analysis, especially using the multivariate approach in quantitative analysis (Matamala et al., 2019; Silvero et al., 2020; Jović et al., 2019).
In this paper, the DRIFT spectroscopic technique with a multivariate approach was used to classify the sediment and establish a regression model for determining two very important sediment parameters, namely, clay and organic matter. To improve the classification model in this study, a combined approach was used for the first time by adding soil samples to the calibration and validation data set.

Sediment and soil sampling and preparation
Fifty-seven sediment samples and 32 soil samples used for this experiment were collected from the territory of the Autonomous Province of Vojvodina, Republic of Serbia (Figure S1). All sediment samples were taken from the canal network and connected rivers (from one or all defined depths, depending on the amount of sediment at that location). The soil samples were taken from agricultural land, given that soil properties play a crucial role in soil fertility. A detailed description of the selected study area is given in the supplementary information. Clay and organic matter analysis and sample preparation details are presented in the supplementary information.
DRIFT spectroscopy measurement and data analysis
Before the spectral recording, the samples were dried for 24 h at a temperature of 100 °C. A fraction greater than 2 mm was separated by sieving and milled to reduce the light scattering on the particles. Infrared spectra were obtained using the Thermo-Nicolet iS20 instrument. Spectra were obtained using the diffuse reflectance (DRIFT) technique. The spectral range was 4000-400 cm⁻¹, and a total of 40 scans per spectrum were recorded at a resolution of 4 cm⁻¹. Soil spectra were first reduced to 4000-700 cm⁻¹ to eliminate noise at the edge of each spectrum. After spectral treatment, spectra were reduced by averaging 4 successive wavelengths. The samples were recorded in triplicate without dilution with KBr.
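The range reduction and point averaging described above can be sketched as follows. This is only an illustration in Python with NumPy, not the authors' workflow (the study used Statistica); the function name `reduce_spectrum`, the 2 cm⁻¹ point spacing, and the synthetic spectrum are assumptions made for the example.

```python
import numpy as np

def reduce_spectrum(wavenumbers, intensities, lo=700.0, hi=4000.0, block=4):
    """Keep the lo-hi cm^-1 range, then average every `block` successive points."""
    wn = np.asarray(wavenumbers, dtype=float)
    y = np.asarray(intensities, dtype=float)
    mask = (wn >= lo) & (wn <= hi)          # drop the noisy edge below `lo`
    wn, y = wn[mask], y[mask]
    n = (wn.size // block) * block          # discard an incomplete trailing block
    wn_avg = wn[:n].reshape(-1, block).mean(axis=1)
    y_avg = y[:n].reshape(-1, block).mean(axis=1)
    return wn_avg, y_avg

# Synthetic spectrum on an assumed 2 cm^-1 grid from 4000 down to 400 cm^-1.
wn = np.arange(4000.0, 399.0, -2.0)
y = np.sin(wn / 300.0)
wn_r, y_r = reduce_spectrum(wn, y)
print(wn_r.size, wn_r[0])
```

Averaging successive points reduces the number of highly correlated variables passed to the multivariate models while smoothing point-to-point noise.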
Statistica software version 13 was used for all spectral data processing. To obtain the best possible results, the following pretreatments and their combinations were used: Savitzky-Golay filters, multiplicative scatter correction (MSC), and standard normal variate (SNV).
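For readers who want to reproduce the scatter-correction pretreatments outside Statistica, a minimal NumPy sketch of SNV and MSC is given below. It uses the common textbook definitions of both transforms; the function names, the choice of the mean spectrum as the MSC reference, and the synthetic data are assumptions, and the exact settings used in the study are not stated in the text.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Assumption: the mean spectrum is used as reference unless one is supplied.
    """
    X = np.asarray(spectra, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        # Fit row ~ intercept + slope * ref, then undo the offset and the gain.
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected

# Synthetic example: one band shape recorded with different offsets and gains.
base = np.linspace(0.0, 1.0, 50) ** 2
X = np.vstack([base, 2.0 * base + 0.5, 0.5 * base - 0.1])
X_snv = snv(X)
X_msc = msc(X)
```

Savitzky-Golay smoothing and derivatives are omitted here to keep the sketch NumPy-only; SciPy provides them as `scipy.signal.savgol_filter`.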

Results and discussion
Visual inspection of the spectra and band assignment
Figure 1 shows the average spectra for 57 sediment samples, 12 arenosol soil samples, and 20 mixed soil samples from Vojvodina. As can be seen from Fig. 1 and Table 1, certain spectral regions show a significant degree of variance and as such can provide a hint of their possible use in multivariate analysis based on spectroscopic data.
Although the bands of characteristic functional groups overlap in the mid-infrared (MIR) range, some spectral regions can be related to the investigated parameters. The MIR regions of interest were the region of stretching vibrations of alkyl residues of organic matter (3000-2800 cm⁻¹), the region characteristic of clay and carbonate minerals (2600-2400 cm⁻¹), and the 1200-850 cm⁻¹ region of the reststrahlen bands of Si-O-Si in quartz, characteristic of sandy samples (Fig. 1).

Sediment and soil grouping by statistical methods based on FTIR spectra
For preliminary determination of the similarity and potential grouping of the examined sediment and soil samples, hierarchical cluster analysis (HCA) and principal component analysis (PCA) of all tested samples were performed. Details about the statistical analysis are provided in the supplementary information. Figure 2 shows the obtained grouping results. The first principal component explains a very high 93% of the variance, while the second contributes 3%, which indicates very high collinearity of the wavelengths used.
The examined sediment samples were taken from depths of 20, 40, 60, and 100 cm. Based on the obtained grouping of samples according to similarity, it can be assumed that the sand fraction decreased progressively and significantly, while the clay fraction increased, with increasing sediment depth. Namely, sediment samples from the upper layers of 20 and 40 cm are distributed along the first principal component on the left (where sandy soil samples are also located), while sediment samples from deeper levels are on the right side, where the soil mixture samples are distributed. A similar grouping of samples from different depths can be observed in the dendrogram in Fig. 2b; namely, all samples of arenosol and 80% of samples from lower depths are in cluster B. This depth trend is in accordance with the study by Gul et al. (2011), who explained such results by the downward movement of clay particles with drainage waters in a high coefficient tile drainage system exerting internal downward pressure.
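The PCA step behind this grouping can be sketched with a singular value decomposition. The example below is a hedged illustration on synthetic data, not the study's spectra: the function name `pca_explained_variance`, the random generator seed, and the simulated matrix are all assumptions. It shows how strongly collinear variables lead to a first component that dominates the explained variance, as reported above.

```python
import numpy as np

def pca_explained_variance(X, n_components=2):
    """PCA via SVD of the column-centred data matrix.

    Returns the sample scores and the fraction of variance per component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    scores = U[:, :n_components] * s[:n_components]
    return scores, ratio[:n_components]

# Synthetic, highly collinear "spectra": one latent factor plus small noise.
rng = np.random.default_rng(0)
n_samples, n_points = 89, 200            # 89 = 57 sediment + 32 soil samples
latent = rng.normal(size=(n_samples, 1))
profile = np.linspace(0.5, 1.5, n_points)[None, :]
X = latent * profile + 0.05 * rng.normal(size=(n_samples, n_points))

scores, ratio = pca_explained_variance(X)
print(np.round(ratio, 3))
```

Plotting the two score columns against each other gives the kind of score plot used to compare sediment layers with the soil groups.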

Principal component regression (PCR) prediction of clay and organic matter
Since high multicollinearity was found among the examined wavelengths in the MIR region, principal component regression (PCR) was used for the quantitative assessment of clay and organic matter. Both test-set validation and cross-validation were used. The regression models were built with 7 to 10 principal components and were also evaluated by calculating the ratio of prediction to deviation (RPD). The parameters of the obtained models are given in Table 2. A model with an RPD value between 2 and 3 is characterized as good, one with a value from 1.5 to 2 as acceptable, and one with a value below 1.5 as weak (Janik et al., 1998).
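A minimal sketch of PCR with a test set and an RPD check is shown below, assuming the commonly used definition of RPD as the standard deviation of the reference values divided by the RMSE of prediction (the text does not spell the formula out). All function names, the synthetic data, and the calibration/test split are hypothetical; only the 7 components and the RPD thresholds come from the text.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit PCR: PCA on centred X, then least squares of centred y on the scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    _, _, vt = np.linalg.svd(X - x_mean, full_matrices=False)
    loadings = vt[:n_components].T                  # (n_variables, n_components)
    scores = (X - x_mean) @ loadings                # (n_samples, n_components)
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return {"x_mean": x_mean, "y_mean": y_mean, "coef": loadings @ gamma}

def pcr_predict(model, X):
    return (np.asarray(X, dtype=float) - model["x_mean"]) @ model["coef"] + model["y_mean"]

def rpd(y_true, y_pred):
    """Assumed definition: SD of reference values / RMSE of prediction."""
    y_true = np.asarray(y_true, dtype=float)
    rmse = np.sqrt(np.mean((y_true - np.asarray(y_pred, dtype=float)) ** 2))
    return y_true.std(ddof=1) / rmse

def rate_rpd(value):
    """Classes quoted in the text; a value of exactly 2 is treated as 'good' here."""
    if value < 1.5:
        return "weak"
    if value < 2.0:
        return "acceptable"
    if value <= 3.0:
        return "good"
    return "above the ranges quoted in the text"

# Synthetic data: 89 samples, 60 collinear variables driven by 3 latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(89, 3))
X = latent @ rng.normal(size=(3, 60)) + 0.01 * rng.normal(size=(89, 60))
y = latent @ np.array([1.0, -0.5, 0.3]) + 0.05 * rng.normal(size=89)

X_cal, y_cal, X_test, y_test = X[:60], y[:60], X[60:], y[60:]
model = pcr_fit(X_cal, y_cal, n_components=7)
y_hat = pcr_predict(model, X_test)
score = rpd(y_test, y_hat)
print(round(score, 2), rate_rpd(score))
```

Regressing on a few orthogonal scores instead of the original variables is what makes the approach stable when the wavelengths are strongly collinear.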
Based on the data given in Table 2, the obtained models for estimating selected sediment parameters show satisfactory accuracy. Due to the similar characteristics of soil and sediment (especially deeper layers), it is recommended to use samples from close locations for such mixed calibration models. In this way, important characteristics of soils and sediments can be connected.
The obtained RPD values are in the range of 1.5 to 1.9, which indicates that the models for organic matter and clay prediction are sufficiently accurate and reliable. A slightly weaker prediction model was obtained for organic matter, whereas a better model was established for clay prediction. Using synthetic sediment mixes, Meyer-Jacob et al. (2014) very successfully quantified biogenic silica by the DRIFT-PLS technique (R² = 0.97; RMSECV = 4.7). Successful FTIR-PLS analysis of organic and inorganic carbon in sediment was performed by Rosén et al. (2010); using selected wavelengths, a significantly better prediction model was obtained for inorganic carbon than for organic carbon (R²(OC) = 0.71; R²(IC) = 0.95). The DRIFT-PLS method has also proved very successful in quantifying humic substances, % C and N, and the C/N ratio, with model performance parameters similar and comparable to those in this research and in the previously mentioned references (R²(humic substances) = 0.89; R²(% C HS) = 0.603; Alaoui et al., 2011).

Conclusion
Using a combined calibration approach and adding soil samples to sediment samples, it is possible to improve multivariate models of classification and prediction of sediment parameters based on infrared spectroscopy. Classification models showed similarity of lower sediment layers with loamy soil type, while upper layers showed similarity with arenosol soil types. The obtained PCR models for the prediction of clay and organic matter content in sediments are of satisfactory accuracy (RPD values 1.5 for clay and 1.9 for OM). The combined calibration approach can certainly be improved by further expanding the calibration data set with more heterogeneous soil and sediment samples. The approach of mixed calibration in the DRIFT determination of clay and organic matter is practical for the rapid classification of sediment and soil. The obtained method is sufficiently accurate, very fast, and does not require chemicals or significant sample preparation. The resulting calibration approach is useful in the perspective of a green chemical way of sediment and soil investigation and can be used in both agricultural and environmental research.
Table 2 Obtained parameters for PCR regression models: root mean square error (RMSE), coefficient of determination (R²), and ratio prediction to deviation (RPD). A graphical presentation of the results is given in Fig. S2