In the case study we quantified that, at a one-hour resolution, resource availability contributed 37% to cow movements (consisting of movement through the landscape, body part movement, and emergent patterns such as group characteristics and displayed activities) and time since milking contributed 33%, while wind speed did not contribute noticeably. This supports our expectations that both resource availability and time since milking are important in shaping the movement of cows, but that wind speed (during relatively mild conditions) is not. Furthermore, it seems that the moderate correlation between resource availability and wind speed was indeed spurious. The framework proved to be insensitive to this spurious correlation, as it quantified the contribution of wind speed to cow movement to be 0%. The Support Vector Regression (SVR) models also performed better overall than the Random Forest Regression (RFR) models, especially when confronted with a dataset containing both GPS and accelerometer variables, but the qualitative patterns when comparing the three environmental contributions to single-sensor movement datasets were the same for both algorithms. Because of the SVR's higher performance, we consider it the better alternative to RFR for this analytical framework when dealing with hyperdimensional datasets, especially when variables from multiple sensors are mixed. Moreover, we found that resource availability contributed more to accelerometer variables (29%) than to GPS variables (8%), but this contribution to GPS variables was still largely independent of the accelerometer variables (less than 1% of the total variation was shared).
This indicates that, at this temporal scale and with these computed movement variables, the individual movement of cows through the landscape and the spatial group characteristics hardly contained any signature of resource availability, and that almost all of the contribution of resource availability to cow movements became apparent from the acceleration variables of the cows’ neck during grazing. These acceleration variables, being descriptive of bite frequency and bite force (Table 2, Fig. 1 and Table 5), probably link more explicitly to grazing behaviour than GPS variables do. As grazing behaviour in cows is closely linked to resource availability (28), this is probably why the accelerometer variables contain a larger signature of resource availability than the GPS variables. For time since milking the opposite was found: it contributed more to GPS variables (29%) than to accelerometer variables (21%), with much of their explained variation being shared (17% of the total variation). This fits our previous argument that the accelerometer variables are shaped for a large part by the cows’ neck movement during grazing, which is intuitively more heavily influenced by grass biomass than by time since milking. Previous studies also found that lactation stage, a variable that we expected to be linked to time since milking in its effect on cow behaviour, influences the relative distribution of cow activity patterns and cow movement through the landscape (20,21). This supports our finding of a higher contribution to GPS variables with a large shared contribution with accelerometer variables, because movement through the landscape is measured by GPS variables and activity patterns are measured by both GPS and accelerometer variables.
Finally, the estimated model parameters were the same for all cows, indicating that the cows responded to changes in resource availability and time since milking in the same way. However, it should be noted that all the results that are presented above are of course context dependent. With a different experimental setup, e.g., indoor instead of pasture housing or different ranges of environmental variable values, the quantified contributions can change.
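The relation between the single-sensor contributions, the combined contribution, and the shared variation reported above can be checked with a few lines of arithmetic. The sketch below assumes that the reported percentages are R2 values of a GPS-only model, an accelerometer-only model, and a combined model, and that the shared fraction follows the standard two-set partitioning identity (shared = R2_A + R2_B - R2_AB). The function name `partition_two_sets` is hypothetical and not part of the original analysis.

```python
def partition_two_sets(r2_a, r2_b, r2_ab):
    """Split explained variation between two sets of movement variables.

    r2_a, r2_b : explained variation (%) using only set A or only set B.
    r2_ab      : explained variation (%) using both sets together.
    Returns (unique_a, unique_b, shared) in the same units.
    """
    shared = r2_a + r2_b - r2_ab   # variation explained by either set alone
    unique_a = r2_ab - r2_b        # variation only set A can explain
    unique_b = r2_ab - r2_a        # variation only set B can explain
    return unique_a, unique_b, shared


# Rounded percentages as reported in the text (A = GPS, B = accelerometer).
milking = partition_two_sets(29, 21, 33)    # time since milking
resource = partition_two_sets(8, 29, 37)    # resource availability

print("time since milking:", milking)       # (12, 4, 17)
print("resource availability:", resource)   # (8, 29, 0)
```

With the rounded values, the shared fraction is 17% for time since milking and 0% for resource availability, which is consistent with the reported 17% and "less than 1%".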
Our case study illustrates how the proposed analytic framework can quantify the contribution of an ecological variable to animal movement. With this quantification as the goal of the analytic framework, human interpretation and understanding of the correlative relationships within the model is initially of lesser importance. The goal of the framework is to build a model that can explain as much of the variation of the measured environmental variable as possible, without making concessions to the model’s complexity to facilitate human interpretation. Only then can the overall contribution of the environmental variable to animal movement be quantified. This analysis could be followed by a stage in which the researcher is selective in the choice of movement variables, to study which movement variables are mainly influenced by the environmental variable. Because of the way the framework is set up, the environmental contribution to multivariate animal movement will by definition always be higher than or equal to the environmental contribution to a subset of the animal movement variables. Thus, using this framework to first determine the environmental contribution to multivariate animal movement, and afterwards the contribution to specific subsets of movement variables, allows for an analysis that shows in which movement variables the environmental contribution is most or least visible. This is demonstrated in our case study, where resource availability mainly contributed to accelerometer variables and much less to GPS variables, indicating that resource availability was more tightly linked to the cows’ movement of body parts than to their movement through the landscape. The opposite was true for time since milking, where the variation explained by the accelerometer data was also largely shared with the GPS data.
Furthermore, this framework allows for a comparison between the contributions of multiple environmental variables to animal movement whilst being insensitive to moderate spurious correlations between environmental variables, which is also shown in our case study with regard to the contribution of wind speed. Therefore, this framework could be well suited for exploratory analyses of the link between environment and animal movement. However, it should be noted that the environmental contribution to animal movement (i.e., the variation in an environmental variable that is traceable in animal movement data) is not the same as the environmental dependency of animal movement (i.e., the variation in animal movement that is dependent on an environmental variable): the environmental contribution can potentially be large while the dependency is small, or vice versa. To accommodate a multivariate analysis of animal movement, we determine the environmental contribution instead of following the route of causal inference. In movement ecology the environmental dependency of animal movement is usually the focus of analyses, as this allows for the determination of the direction and strength of the environmental influence on an animal movement variable. Therefore, post hoc analyses that link environmental variables to a simplified animal movement descriptor can supplement our proposed multivariate analytic framework in order to follow the route of causal inference (2,3).
Various factors in the relationship between the environment and animal movement influence the quantification of the environmental contribution to animal movement (Fig. 5). First, many environmental variables are correlated and interact with each other in their influence on the animal’s decision making and, thus, movement (1). When the contribution of a single environmental variable to animal movement is under scrutiny, these correlations and interactions with other environmental variables need to be taken into consideration. In the proposed analytic framework we do not distinguish between the independent, shared, and interaction contributions of environmental variables to animal movement (9), which is different from the independent and shared contributions to multiple subsets of the movement variables as described in our case study. As a consequence, both the direct and indirect contributions of an environmental variable to animal movement are combined into a single metric. Future research could be aimed at distinguishing between these contribution types of multiple environmental variables to multivariate animal movement, e.g. by using multi-target (Support Vector) regression and variation partitioning procedures (29,30). Furthermore, when the contribution of an environmental variable to animal movement is quantified, it is important that the movement itself does not also directly influence the environmental variable at that point in space and time. Social proximity, for example, is an important variable in the shaping of individual animal movement, but individual movement parameters also directly shape collective movement patterns (31). The fit of a model with social proximity as response variable and individual movement variables as input data would then no longer be solely the contribution of an environmental variable. This could consequently yield unrealistically large values of the explained variance, which should be prevented.
In the relationship between the environment and animal movement, the animal’s internal state (“why move?”), motion capacity (“how to move?”), and navigation capacity (“where to move?”) are also involved (1). The animal’s internal state is composed of many different factors, e.g., physiological “need” (hunger, fear, etc.), physical characteristics (age, sex, body condition, etc.), and personality differences (laziness, level of sociality, etc.), that combined result in a certain response by the animal when confronted with a set of environmental variables at a certain moment in time (1). We translate this combined net effect of the internal state factors into the willingness of the animal to respond to the environment (Fig. 5). The motion and navigation capacity can be translated into the ability of the animal to respond. Another factor that is involved, even before the animal can decide whether it is willing and able to respond, is the animal’s perception of the environment (15). Only when an animal can observe changes or differences in an environmental variable can it decide to respond in a certain way. Because of the aforementioned latent variables (perception, willingness, and ability), the movement of the animal is not purely a deterministic function of a fixed set of environmental variables (1). These latent variables can thus cause the environmental contribution to animal movement to be only partial. Furthermore, these latent variables are in part individual-specific (1), which is why differences between individuals should be taken into consideration by standardizing the movement variables per individual and/or adding individual identifiers as variables to the model.
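As an illustration of standardizing movement variables per individual, the sketch below computes z-scores within each animal, so that each animal's values are expressed relative to its own mean and spread. It is a minimal example and not the preprocessing used in the case study: the record layout, the identifiers `cow_a` and `cow_b`, the variable name `step_length`, and the use of the population standard deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_per_individual(records, variable):
    """Return z-scores of `variable`, computed within each individual.

    records: list of dicts with an "id" key and a numeric `variable` key.
    The output list has the same order as `records`.
    """
    values_by_id = defaultdict(list)
    for rec in records:
        values_by_id[rec["id"]].append(rec[variable])

    # Per-individual mean and (population) standard deviation.
    stats = {i: (mean(v), pstdev(v)) for i, v in values_by_id.items()}

    z_scores = []
    for rec in records:
        mu, sd = stats[rec["id"]]
        # A constant series has sd == 0; map it to 0.0 instead of dividing by zero.
        z_scores.append((rec[variable] - mu) / sd if sd > 0 else 0.0)
    return z_scores


# Hypothetical hourly step lengths (m) for two animals with different baselines.
records = [
    {"id": "cow_a", "step_length": 10.0},
    {"id": "cow_a", "step_length": 14.0},
    {"id": "cow_b", "step_length": 30.0},
    {"id": "cow_b", "step_length": 50.0},
]
z = standardize_per_individual(records, "step_length")
print(z)
```

After standardization both animals span the same range of z-scores, so a model fitted on the pooled data responds to within-individual changes rather than to baseline differences between individuals.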
Other factors, which are more data-related, also influence the quantification of the environmental contribution to animal movement (Fig. 5). First, environment and animal movement are linked through sensor measurements, which influence the outcome of the analysis through varying sensor types, resolution, extent, and precision. Second, the movement variables that are computed from the animal movement data to describe the movement process determine how much of the environmental contribution to animal movement is traceable in the data. It is therefore key in the proposed framework to extract as many predictive movement variables from the animal movement data as possible, because only then can the total environmental contribution be quantified and the contributions of different environmental variables be compared on an equal footing. Third, the temporal scale at which these variables are computed determines the temporal scale for which the contribution of the environmental variable to animal movement is quantified. As the effect of an environmental variable on animal movement data varies with temporal scale, the choice of the temporal scale of the variables is relevant (32). Finally, the algorithm that is used to predict an environmental variable from animal movement data influences the level of fit that can be attained, which is demonstrated in our case study with SVR outperforming RFR on all occasions. Algorithms that can model complex interactions between variables (e.g. RFR, SVR, and Neural Network Regression) are often able to make better predictions of the response variable, as are algorithms that take into account the sequence of time series data (e.g. Recurrent Neural Networks). Quantitative comparisons between the contributions of different environmental variables to animal movement can thus only be done reliably when the same algorithm is used on the same underlying animal movement dataset.
Besides using the R2 of the model predictions to acquire ecological insights, the patterns in the observed vs. predicted plots can also potentially generate insight. For an environmental variable to influence animal movement, the animal’s perception, willingness, and ability are preconditions (Fig. 5). Therefore, certain parts of the environmental variable’s range might be better predicted by the model than other parts. It could be argued that this explains the better SVR predictions at intermediate grass biomass compared to low and high biomass levels, which created a lower overall slope of the predictions compared to the observations (Fig. 2). However, apart from animal perception, willingness, and ability, other factors might also influence patterns in the observed vs. predicted plot (Fig. 5). In this case the algorithm might be the underlying cause of the lower overall slope of the SVR biomass predictions, due to a “regression toward the mean” characteristic (see Additional file 6). Furthermore, the overall gradient of the time since milking predictions follows the measurements quite accurately for both models from 0.5 to 6.5 hours, but after 6.5 hours it levels off (Fig. 2). This suggests that until 6.5 hours cows continue to change their movement in response to the time since they were last milked, but that after 6.5 hours there is no longer a noticeable change in movement. Besides a potential behavioural ecological cause for this pattern, it could also be (partially) caused by correlations with other time variables due to our experimental setup, in which the cows were milked twice a day around the same time of day. Follow-up studies could focus on these predicted time since milking patterns, with an experimental setup that varies the time of day at which the cows are milked. Finally, apart from concluding that wind speed probably has no noticeable effect on cow movement in this study (Fig. 2), it becomes clear that the model performance suffered from some higher wind speed values in the test set compared to the train set (thereby generating an R2 lower than 0).
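To see how an R2 lower than 0 can arise on a test set, the sketch below computes R2 as 1 - SS_res / SS_tot, which is an assumption about the definition used here (it is the common definition that permits negative values). The wind speed values are hypothetical and chosen only to mimic a test set that contains higher values than the model encountered during training.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical test-set wind speeds (m/s) that extend above the training range.
test_observed = [2.0, 4.0, 6.0, 8.0]

# A model without any signal, predicting roughly the training-set mean.
predicted_train_mean = [3.0, 3.0, 3.0, 3.0]

# Reference: predicting the mean of the test set itself gives R2 = 0.
predicted_test_mean = [5.0, 5.0, 5.0, 5.0]

print(r_squared(test_observed, predicted_train_mean))  # -0.8
print(r_squared(test_observed, predicted_test_mean))   # 0.0
```

A model that falls back to values near the training mean is penalized when the test-set mean is shifted, so a negative R2 in this situation indicates the absence of a traceable signal combined with a distribution shift, not a negative contribution.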