Calibration and aggregation
Archaeological radiocarbon dates were collated from published studies that previously adopted null hypothesis significance testing (NHST) approaches towards prehistoric demography. Our literature search identified 16 studies that collectively span six continents and approximately 30,000 radiocarbon years (table S1). We applied a consistent protocol to the calibration of radiocarbon assays. The calibrate function within the R package rcarbon17 was used to convert 14C radiocarbon years to calendar years before present (cal BP). The IntCal2041 and SHCal2042 curves were applied to dates in the northern and southern hemispheres, respectively. Calibrated ages are reported as the 95.4% (two-sigma, 2σ) age range. Laboratory codes and Site Identification numbers were appended to each calibrated date range and post-calibration distributions were not normalised. To account for between-site variation in sampling intensity, multiple dates from a single site that are within 50 radiocarbon years of each other were pooled (‘binned’) before aggregation into regional summed probability distributions (SPDs).
Bayesian model fitting
Our protocol aims to replicate the results of the 16 identified case studies to the greatest extent possible. In order to reproduce statistically significant negative population events (‘downturns’) produced by rcarbon’s ‘modelTest()’ function in the original studies, while simultaneously addressing the well-known limitations of using summed probability distributions in NHST, we adopt an alternative, Bayesian modelling approach. Markov Chain Monte Carlo (MCMC) methods implemented in the nimbleCarbon package43 for radiocarbon data can obtain robust parameter estimates, as the Bayesian MCMC controls for radiocarbon measurement errors and sampling biases separately. Previously, this has been a major drawback of NHST approaches to aggregated radiocarbon data, with several alternatives proposed in the literature44,45. Using the MCMC-derived parameter estimates in posterior predictive checks enables us to detect periods where expected growth trajectories are lower than the fitted model parameters, and which are more robust than least-squares regression approaches. The protocol produces outputs that are analogous to those in prior NHST studies (Figure S).
We analyse regional SPDs separately by fitting identical bounded exponential growth models to each dataset. This common model is defined by three parameters: growth rate (r) and boundaries (a and b). A weakly informative exponentially distributed prior (λ = 1/000.4) was used for r to capture a broad range of potential growth rates among the cases. Parameters a and b were adopted from the original studies. Markov chain traceplots (Figure S-Figure S) evaluate chain mixing alongside model convergence (Gelman-Rubin Ȓ) and effective sample size (ESS) diagnostics (table S7). Three chains of 50,000 iterations were run per region, with a burn-in of 5000 iterations and a thinning interval of 2. To ensure comparability of results with published studies that used a logistic growth model as a null hypothesis, regional datasets were subset based on expert judgement at documented palaeodemographic transitions and treated as two separate exponential growth models. Subsetting was only performed on the Near East and Italy, Sicily, and Sardinia datasets. Downturns adjacent to transition points were removed from the sample to avoid including data points introduced by the subsetting. Posterior predictive checks were executed using samples of parameter estimates obtained by the MCMC approach to simulate and back-calibrate a number of radiocarbon dates equal to the sample size. The procedure was repeated 1000 times to derive critical envelopes.
Resilience metrics
The resilience metrics target periods where empirical SPD curves are below the 90% confidence envelopes of the fitted models, per the posterior predictive checks (periods termed ‘downturns’). Extraction was performed using a bespoke R function modified from Riris and De Souza12, available in the repository (https://dx.doi.org/10.5281/zenodo.10061467). The principal response metrics in our analysis are Resistance and Resilience (Figure S4). Respectively, these metrics quantify the normalised response to downturns and the rate of recovery relative to baseline conditions. Resistance is measured on SPDs using two parameters: SPD values at the start of a downturn (Gb) and at downturn minima (Gx), while resilience is measured across the entire period of decline until its end (Ge) relative to the minimum and baseline (table 1). Resistance ranges between 0 – 1, indicating a 100% change from baseline to no change. Resilience ranges between ±1, with 1 indicating full recovery by the end of the downturn. Negative values of resilience indicate that the baseline value has been exceeded, although remaining outside the expectations of the null model. Finally, zero indicates no recovery11,46. We also collected information on the start and ends of downturns, their duration, elapsed time until SPD minima were reached, and the cumulative number of recoveries (table S2).
Statistical modelling
The target of our comparative analysis is the resistance and resilience of human populations to disturbance as defined in each individual study. Our approach assumes that high values reflect resilient populations that successfully re-establish growth regimes after periods of decline related to disturbance events. We also assume that downturns are randomly distributed in time and geographical space. To account for the influence of inter-regional and -event cultural variation on outcomes, we drew on expert judgement and close readings of the published literature to record the broad category and specific type of disturbance during downturns, as well as the dominant land use type and the nature of resulting socio-cultural changes, if any (table S3). These variables provide a control on whether a given population within a cultural system retains its identity and function over time, or if system transformation and adaptive change is archaeologically evident.
Linear mixed-effect models were executed to evaluate the presence and strength of relationships between resistance, resilience, the recorded variables, and case study locations. This analysis was performed using the cAIC4 and lme4 R packages47,48, scripts are available in this repository: https://dx.doi.org/10.5281/zenodo.10061467. Initial models were defined with Resistance and Resilience as response variables, with only case identifiers (Region) as a random effect. As observed downturns are sequential within each case, the random effect controls for potential pseudoreplication while avoiding the need to weight the data by group size. A stepwise search using Akaike’s Information Criterion was implemented for investigating the information gain of including fixed effects in each model in turn. These candidate models were sequentially fitted using restricted maximum likelihood. Most fixed covariates (Results, table S5) were left out of the final models. Region was retained as a random effect in all cases, to produce two models:
Model output is summarised in table S5 and diagnostics in Figure S5-Figure S6. We present standardised residuals, by region and in full, as well as leverage and Cook’s Distance.
To further explore the relationship between rates of downturns and Resistance and Resilience, we performed an additional modelling exercise with the same random effect and full suite of fixed effects, with the frequency of downturn as the independent variable (table S6, Figure S7):
The effect sizes (standardised coefficients) of the significant model terms are plotted graphically in fig. 4C and reported in full in table S6. We report effect sizes in the text as η2 (eta-squared), that is, the total variance explained by differences between means.
METHODS REFERENCES
41. Reimer, P.J. et al. The IntCal20 Northern Hemisphere radiocarbon age calibration curve (0–55 cal kBP). Radiocarbon62, 725-57 (2020).
42. Hogg, A.G. et al. SHCal20 Southern Hemisphere calibration, 0–55,000 years cal BP. Radiocarbon62, 759-78 (2020).
43. Crema, E.R. nimbleCarbon (v.0.2.1): Models and Utility Functions for Bayesian Analyses of Radiocarbon Dates with NIMBLE (2022). https://github.com/ercrema/nimbleCarbon.
44. Carleton, W.C. Evaluating Bayesian Radiocarbon‐dated Event Count (REC) models for the study of long‐term human and environmental processes. J. Quat. Sci.36, 110-123 (2021).
45. Timpson, A., R. Barberena, M.G. Thomas, C. Méndez, K. Manning, Directly modelling population dynamics in the South American Arid Diagonal using 14C dates. Philos. Trans. R. Soc. Lond., B, Biol. Sci.376, 20190723 (2021).
46. Orwin, K.H., D.A. Wardle, New indices for quantifying the resistance and resilience of soil biota to exogenous disturbances. Soil Biol. Biochem. 36, 1907-1912 (2004).
47. Bates, D., M. Mächler, B. Bolker, S. Walker, Fitting Linear Mixed-Effects Models Using lme4. J. Stat. Softw.67, 1–48. (2015). doi:10.18637/jss.v067.i01.
48. Säfken, B., D. Rügamer, T. Kneib, S. Greven, Conditional Model Selection in Mixed-Effects Models with cAIC4. J. Stat. Softw, 99, 1–30 (2021). doi:10.18637/jss.v099.i08.