In a first step, the role of waterways as obstacles to movement will be considered, testing the hypothesis that, after arrival at the coast via boats, further penetration may have occurred overland. Next, the scenario in which rivers could potentially act as facilitators and accelerators for the group will be developed. Then, a third type of scenario will be modelled, assuming that the river did not play a specific role; it will be considered neither an aid nor an obstacle.
In the first case, a scenario was modelled where rivers hindered the movement of the penetrating groups. By applying the MOV function (1), the walking speed of the group was calculated. The pace measured in minutes per kilometre was derived as a variable to generate an effort surface for use as a friction map (Posluschny, 2010), representing a variable based on speed but expressed in terms of time. In order to integrate this surface with those of other parameters, the pace values underwent reclassification to obtain dimensionless values ranging from 1 to 10. Initially, the pace value at a 20% slope was calculated, based on Ganskopp and Vavra (1987), representing the limit within which cattle have been observed to move for grazing. Subsequently, the pace values were divided into 5 classes using the Jenks natural breaks classification method, ranging from the minimum to the maximum corresponding to the calculated limit, and values between 1 and 5 were assigned to them in ascending order. Pace values exceeding the calculated limit were assigned a weight of 10 due to cattle exhibiting avoidance behaviour for those slope percentages.
Two additional grids were added to this surface: one related to slope and another considering the presence of rivers. Regarding slope, to underscore its crucial role, a reclassification of cattle preferences was performed, and the following weights were assigned accordingly:
-
For slopes between 0 and 9%, a weight of 1 was assigned to indicate preference.
-
Values between 9 and 19% were given a weight of 2, indicating cattle indifference, albeit with a slight preference compared to the first class.
-
A weight of 3 was assigned to signify avoidance of slopes greater than 20%.
Given the initial assumption that waterways should be treated as barriers, a third surface was created to address this issue. Specifically, a weight of 4 was assigned to the pixels corresponding to the rivers and a buffer zone around the rivers with a distance parameter set at 500 meters. A weight of 1 was given to all the remaining pixels. The decision to incorporate a river buffer stems from the fact that the data used in this work are contemporary and not paleogeographic. Therefore, the weight was attributed to the entire buffer to minimize errors resulting from possible changes in the course of the rivers. Additionally, an extra weight of 1 was assigned to the remaining land areas.
By summing up all the grids, the obtained output is a cost surface (Figs. 2, 3, 4), indicating areas with lower or higher costs to traverse. In the analysis of a migratory process, it is crucial to consider the cumulative cost. Hence, the subsequent step involved computing cumulative cost grids3, storing the costs of movement radiating from the origin to every other cell in the raster grid (Herzog, 2020). Cumulative cost surfaces are derived by applying a spreading function to minimize the cumulative cost from one or multiple cells with an initial value of 0 (Conolly & Lake, 2006).
In each basin, the starting points were selected based on the two sites with the oldest dating near the shores, except for the Ebro basin, where a starting point from the Pyrenees was also considered (Gassiot-Ballbè, 2021). In the case of the Rhône basin the two points on the coast with the oldest dates were initially set as starting points: Pont de Roque-Haute (Median 7840 cal BP) and Caucade (Median 7656 cal BP). The GRASS r.cost algorithm (r.cost, 2023) was then utilized to compute the cumulative cost (Manen et al., 2019). Similarly, for the Ebro basin, the same raster grid was generated, with Guixeres de Vilobi (Median 7528 cal BP) on the coast and Balma Margineda (Median 7548 cal BP) in the Pyrenees selected as starting points (Oms et al., 2018; Manen et al., 2019). The Po basin followed a similar workflow, with the two chosen sites being Riparo Ala Le Corone (Median cal 7512) and Fiorano Modenese (Median cal 7453) (Starnini et al., 2018).
In the second scenario, the role of the river is considered a support rather than an obstacle to movement, offering a quicker and more convenient pathway for the group of humans and animals. Building upon this premise, cost maps were generated, and subsequently, cumulative cost maps were derived in a similar manner.
Initially, movement through the river was regarded as isotropic, without accounting for the potential effects of downstream or upstream navigation on penetration speed. In contrast to the cost map computed in the previous section, a low value of 1 was assigned to the buffer of the river layer in this case, favouring water passage and indicating its supportive role. For the terrestrial areas of the basin, the pace values calculated using the previous workflow were retained. The only additional grid introduced in this scenario was the slope grid reclassified based on cattle preferences. Subsequently, cumulative cost surfaces were computed for the same sites mentioned earlier.
Subsequently, an attempt was made to model fluvial movement by considering space anisotropically (see Supplementary Materials for the workflow). Although this type of analysis yields outputs with greater approximation compared to those computed in an isotropic space, theoretically, they better reflect reality. Indeed, one of the most crucial factors, if not the most significant, when modelling river movement using vessels, is the current of waterways. The latter can influence the effort involved in movement, doubling the navigation speed when moving downstream, namely, in the same direction as the current, and halving it for upstream navigation (Livingood, White & Surface-Evans, 2012).
All functions describing anisotropic movement must take its direction into account; otherwise, they revert to isotropy (Van Leusen, 2002). To incorporate the direction of movement into the calculation of upstream and downstream speeds, the formula (2) developed by Krist (2001a, b) for computing an adjusted slope was employed:
(2)
$$\:Sa\:=\:S\:*\:cos(At\:-\:A)$$
where S and A are the original slope and aspect values, and At is the direction to the starting point (Van Leusen, 2002). The At values have been obtained using the r.cost algorithm. After computing the adjusted slope for the raster grid of rivers with Strahler order greater than 4, a reclassification was conducted to assign different velocity values, specifically:
-
A speed of 4 km/h, equivalent to the average measured during the Monoxylon II experimental expedition (Tichý, 2016), was considered for sections unaffected by current flow.
-
Downstream sections were assigned a speed value of 6 km/h.
-
Upstream navigation was modelled with a speed value of 2 km/h.
Using these velocity values, pace values were derived and added to those already calculated for terrestrial movement, which were given a higher weight in this scenario. Once again, cost surfaces were generated by summing information related to the reclassified slope according to cattle preferences. Finally, cumulative cost maps were computed for the three basins from the same sites.
In the final scenario, rivers were not considered to play any significant role. Cost maps were computed by combining the reclassified layer of pace with the surface containing the values of the reclassified slope. No additional information was included, resulting in a fictional scenario where waterways do not exist.
Although highly improbable, a scenario depicting the complete absence of waterways is useful for simulating the total indifference of Neolithic people towards rivers. Following the workflow described in the preceding sections, cumulative cost maps were obtained.
In the final stage of the research, several statistical calculations were conducted to better understand the outputs of the cost-based analyses and to compare them with the archaeological data. However, due to the scattered nature of the dataset, the most practical approach was to employ a grid of predicted chronological values obtained through Kriging interpolation of the initial 14C datings. Universal Kriging was used across all three basins, as it employs a distinct model for analysing the first-order pattern associated with the directional trend (Lloyd & Atkinson, 2020; Fort, 2015). This trend corresponds to the migration direction from the coast to the hinterland. The interpolation was calculated on GRASS GIS and resulted in three different chronological maps, one for each basin (Figs. 5, 6, 7).
All statistical analyses were performed using RStudio (for the scripts, refer to the GitHub repository: https://github.com/giadapirrone/Sailing-West.git). The input raster grids included Universal Kriging, the cumulative cost maps with rivers considered as an obstacle to movement, the cumulative cost maps with rivers considered as an aid to movement (both isotropic and anisotropic), and those where rivers are not considered. The first calculations employed the stats package (R Core Team, 2022). At first,correlation was performed between Kriging and each cumulative cost map, aiming to understand whether and how the chronology is correlated with the direction and speed of cost accumulation across the territories. The correlation values obtained are summarized in Table 1.
Table 1 Correlation values.
RHÔNE |
Obstacle | Aid Isotropic | Aid Anisotropic | No River |
-0.9307318 | -0.9300138 | -0.9300645 | -0.9307632 |
EBRO |
Obstacle | Aid Isotropic | Aid Anisotropic | No River |
-0.5786837 | -0.5926106 | -0.5962887 | -0.5724935 |
PO |
Obstacle | Aid Isotropic | Aid Anisotropic | No River |
-0.0005386776 | -0.5769551 | -0.5257057 | − 0.0005385455 |
All results indicate a negative correlation between variables in all cases. Specifically, as the values of the Kriging maps decrease, those of the cumulative cost maps tend to increase, and viceversa. Additionally, there are variations in the strength of correlation, with the Rhône case exhibiting values close to -1, indicating a strong correlation. In contrast, the Ebro and Po basins show lower values but still suggest a relatively robust correlation. Two exceptions are noted in the Cumulative Cost map: the version that considers the Po River as an obstacle, and the version that excludes the river, both displaying coefficients close to zero, suggesting an apparent lack of correlation with the Universal Kriging map.
After confirming the correlation between variables, linear regression was utilized to quantify the relationship, by modelling the associations between variables and determining the best-fitting straight line to the dataset (Hacigüzeller, 2020). Let us consider the linear regression equation\(\:\:y\:=\:a\:+\:bx\), where y represents the estimated dependent value (cumulative cost), a is the average value of y when x is zero, b is the average change in y associated with a one-unit increase in x, and x is the predictor variable's value. The focus is on the slope value (b), for it provides quantitative insights into how the cost variable changes in relation to the temporal variable.
In the Rhône basin, significant p-values were observed in all three cases, allowing for the rejection of the null hypothesis of the slope being 0. The slope value for the regression between Kriging and the Cost with the river as an obstacle was − 0.04001, indicating that for every unit decrease in Kriging, there's a 0.04001 increase in the Cost. When considering the river as an aid, the slope value (b) was − 0.05555 in an isotropic environment and − 0.03314 in an anisotropic environment. Finally, when the river was not considered, the slope value was − 0.06474.
For the Ebro basin, significant p-values were observed in all three cases as well. The slope values for the regression between Kriging and the Cost with the river as an obstacle were − 0.02914. When the river aids movement, the slope values were − 0.03895 in an isotropic environment and − 0.02225 in an anisotropic environment. In the fourth case, the slope value was − 0.04391.
In the Po basin, linear regression between Universal Kriging and the cumulative cost map with the river as an obstacle and between Universal Kriging and cumulative cost map without the river were not computed due to their initial lack of correlation. However, in cases where the river aids movement, linear regression was conducted. This resulted in a slope value (b) of -0.03780 in the isotropic case and − 0.02356 in the anisotropic case.
The final statistical analysis conducted on the cost maps entailed a model comparison. The objective of model comparison is to assess the adequacy of the considered models for the data (Kuha, 2004). Various search methods exist for model selection, with common approaches including the computation of AIC (Akaike Information Criterion) or BIC (Bayesian Information Criterion). Although AIC and BIC share the goal of evaluating model fit, they differ in their precise definitions of what constitutes a 'good model' (Kuha, 2004): BIC tends to favour simpler models, whereas AIC favours models with greater complexity. For this analysis, a stepwise BIC model-selection approach was employed. It comprises two components: the likelihood term, indicating how well the model fits the data, and the complexity term, penalizing models with high complexity. The model selection is based on the lower BIC, indicating a model that is both efficient and parsimonious (Brandolini & Carrer, 2021). To conduct the selection, the MASS package (Ripley et al., 2013) was utilized, with linear regressions between the various cost maps and the Universal Kriging discussed previously. Additionally, BIC weights were calculated, offering a normalized estimation of the relative performance of the models (Basile & Campus, 2024). A higher weight signifies better model performance.
Tables 2 and 3 present the obtained results, considering all three basins:
-
model0 represents the linear regression between Universal Kriging and the cost map where the river is regarded as an obstacle.
-
model1 corresponds to the linear regression between Universal Kriging and the cost map where the river serves as an aid, and the spatial consideration is isotropic.
-
model2 entails the regression between Universal Kriging and the cost map where the river acts as an aid within an anisotropic spatial context.
-
model3 denotes the regression computed between Universal Kriging and the cost map where rivers are not taken into consideration.
As shown in Table 2, for the Rhône basin, model3 exhibits the lowest BIC value, for the Ebro basin, model2 has the lowest BIC, and, finally, for the Po basin, it is model1. When examining BIC weights (Table 3), in all three cases, the model with the lowest BIC value demonstrates significantly better performance compared to the other two, displaying a maximum weight of 1, while the weights of the other two are equal to zero.