There is growing interest in using novel technologies to monitor biodiversity over large spatial and temporal scales (Gonzalez et al., 2023). Much of the research on monitoring vocalizing animals, such as birds, has focused on passive acoustic monitoring methods. These methods involve collecting acoustic data with autonomous recording units; the recordings are then analyzed to extract ecological information about a specific study system. Advances in machine learning methods have enabled, in many cases, the accurate identification of species from acoustic data (Kahl et al., 2021; Stowell, 2022). However, species identification presents difficulties when monitoring entire communities, especially in biodiverse areas with large numbers of species and limited training data. Acoustic indices have emerged as a popular alternative for assessing biodiversity without needing to identify individual species (Pan et al., 2024; Stowell and Sueur, 2020). The underlying principle is that more biodiverse communities have more heterogeneous acoustic environments due to a larger variety of acoustic signals (Buxton et al., 2018a).
The effectiveness of acoustic indices in surveying biodiversity has been extensively investigated in a variety of settings (Alcocer et al., 2022; Pan et al., 2024), including their use as proxies for animal species diversity in temperate and tropical regions (Bicudo et al., 2023; Eldridge et al., 2018; Mammides et al., 2017) and as tools for rapid assessments (Rajan et al., 2019; Sueur et al., 2008). Most studies have focused on birds (Alcocer et al., 2022), using the indices to quantify metrics related to species richness, diversity, and abundance, as well as the abundance and diversity of animal vocalizations (Alcocer et al., 2022; Buxton et al., 2018a). For the most part, studies have tested the indices individually, yielding mixed results (Alcocer et al., 2022; Pan et al., 2024).
Various explanations and solutions have been proposed (Bradfer-Lawrence et al., 2019; Mammides et al., 2021; Metcalf et al., 2021; Pan et al., 2024), with a growing body of research suggesting that combining the available indices can improve accuracy (Allen-Ankins et al., 2023; Buxton et al., 2018b; Eldridge et al., 2018; Mammides et al., 2024). To accomplish this, researchers have turned to machine learning methods such as Random Forest regression (Buxton et al., 2018b; Mammides et al., 2024; Sethi et al., 2023), which are considered more robust than common parametric regression techniques in this context because they make fewer assumptions about the data, are better suited to high-dimensional settings, and can offer higher predictive power. However, most conventional machine learning regression techniques share a crucial drawback: they do not provide any information about the uncertainty associated with their predictions for each observation (e.g., the species richness estimated at each site when used for monitoring). Even in the few cases in which prediction uncertainty is provided, its estimation rests on unrealistic assumptions (Papadopoulos, 2023), yielding suboptimal results.
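The drawback described above can be illustrated with a minimal sketch, in which several acoustic indices are combined to predict species richness with a Random Forest. All variable names and the simulated data below are placeholders for illustration only, not the data or model configuration used in the studies cited.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_sites = 200

# Simulated acoustic indices for each site (columns = three hypothetical indices).
X = rng.uniform(0.0, 1.0, size=(n_sites, 3))
# Simulated species richness, loosely driven by the indices plus noise.
y = 10 + 8 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 1.5, n_sites)

# Combine the indices in a single Random Forest regression model.
model = RandomForestRegressor(n_estimators=500, random_state=0)
model.fit(X, y)

# The model returns point predictions only: no per-site uncertainty
# accompanies them, which is the limitation the CP framework addresses.
preds = model.predict(X)
print(preds[:3])
```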
A proposed solution to address this issue is the extension of conventional machine learning techniques through the Conformal Prediction (CP) framework. The CP framework, first developed in the mid-1990s, is a statistical approach that can be used in conjunction with machine learning techniques to quantify prediction uncertainty (i.e., to produce prediction intervals in the case of regression) with a guaranteed coverage probability of 1-α for any desired significance level α (Papadopoulos, 2023; Papadopoulos et al., 2011). For instance, a calculated 95% CP interval is guaranteed to contain the true outcome 95% of the time on average. Importantly, the validity of this coverage guarantee relies solely on the exchangeability assumption, which is weaker than the assumption of independent and identically distributed data (Papadopoulos, 2023). The CP framework has been successfully implemented in multiple fields and cases in which providing well-calibrated uncertainty measures associated with machine learning predictions is important (Angelopoulos and Bates, 2023). Examples include ovarian cancer detection (Gammerman et al., 2009), stroke risk assessment (Papadopoulos et al., 2017), and mobile malware detection (Papadopoulos et al., 2018). Despite the importance of obtaining well-calibrated prediction intervals when using machine learning techniques in ecological applications, researchers in ecology have yet to adopt this framework in their workflow.
In this short communication, we aim to introduce researchers in ecology to the utility of the CP framework (Angelopoulos and Bates, 2023; Vovk et al., 2005) by illustrating its use in assigning reliable prediction intervals to the predictions of machine learning regression models when combining acoustic indices to monitor animal diversity. To achieve this, we use previously published data collected through acoustic and bird surveys in Cyprus and Australia (Mammides et al., 2024) to demonstrate how CP intervals can be applied to improve biodiversity monitoring. We also discuss how the CP framework can be used in other ecological applications. We specifically focus on a CP method developed by the senior author of this study (Papadopoulos, 2023), which produces prediction intervals based on Gaussian Process Regression (GPR), a non-parametric, kernel-based, Bayesian approach to regression (Schulz et al., 2018).
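For readers unfamiliar with GPR, the sketch below shows the step that such a method builds on: GPR returns a predictive mean and standard deviation for each observation, from which per-observation intervals follow. This is a plain GPR illustration on simulated placeholder data, not the conformal calibration procedure of Papadopoulos (2023).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(100, 2))
y = 5 + 4 * X[:, 0] + rng.normal(0, 0.3, 100)

# RBF kernel for the signal plus a white-noise term for observation noise.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=2)
gpr.fit(X, y)

# Predictive mean and standard deviation at new inputs.
X_new = rng.uniform(0, 1, size=(5, 2))
mean, std = gpr.predict(X_new, return_std=True)

# Approximate 95% Bayesian predictive intervals (Gaussian assumption);
# a conformal method would instead calibrate these widths to guarantee
# coverage without relying on the Gaussian assumption.
lower, upper = mean - 1.96 * std, mean + 1.96 * std
print(np.column_stack([lower, upper]))
```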