Predictive control of aerial swarms in cluttered environments
Enrica Soria, Fabrizio Schiano and Dario Floreano

Classical models of aerial swarms often describe global coordinated motion as the combination of local interactions that happen at the individual level. Mathematically, these interactions are represented with potential fields. Despite their explanatory success, these models fail to guarantee rapid and safe collective motion when applied to aerial robotic swarms flying in cluttered environments of the real world, such as forests and urban areas. Moreover, these models necessitate a tight coupling with the deployment scenarios to induce consistent swarm behaviours. Here, we propose a predictive model that incorporates the local principles of potential field models in an objective function and optimizes those principles under the knowledge of the agents' dynamics and environment. We show that our approach improves the speed, order and safety of the swarm, it is independent of the environment layout and is scalable in the swarm speed and inter-agent distance. Our model is validated with a swarm of five quadrotors that can successfully navigate in a real-world indoor environment populated with obstacles.

The movement of drone swarms can be coordinated using virtual potential fields to reach a global goal and avoid local collisions. Soria et al. propose here to extend potential fields with a predictive model that takes into account the agents' flight dynamics to improve the speed and safety of the swarm.

Potential fields (PFs) encode the desired behaviours of the swarm. They regulate the inter-agent distance among neighbouring individuals in a similar manner to a spring-mass system, adjust the velocity of the agents, steer them towards a common direction and regulate their distance to obstacles 23 .
The advantage of PF swarm models is that they are purely reactive, meaning that their decisions are solely based on the current sensory information and thus have low computational complexity 20,23 . For this reason, PF models are convenient for implementation on real robotic systems, either in free environments 21,25 or in environments with convex obstacles 24 . In the latter case, collision avoidance is obtained by defining virtual repulsive agents (called shill agents) located along the obstacles' boundaries. However, these shill agents present the inconvenience of slowing down the swarm as it approaches obstacles 20,26 . This effect becomes prominent in environments with high obstacle densities, where PF swarms can slow down notably. The slowdown can be attenuated by weakening the repulsion potentials, albeit at the expense of the swarm safety, because some agents may collide. Moreover, to account for the idiosyncrasies of the real world, these models often include a large number of parameters that have complex interdependencies 2,24 . As a consequence, they often require the adoption of optimization techniques such as evolutionary algorithms to identify a viable instantiation of the parameters, and each instantiation is specific to the swarm's preferred speed and inter-agent distance and to the environmental layout 21,24,27 .
In this Article, we propose a method that removes those difficulties by endowing swarming agents with prediction-based control. Specifically, we show that aerial swarms with predictive control display faster flight while guaranteeing safe navigation in cluttered environments, they can adapt to diverse obstacle densities, and they are scalable to changes in the inter-agent distance and swarm speed. Recently, it has been advocated that some form of predictive control, in the form of an internal model of the actions of their conspecifics, may also be leveraged by biological swarms, where the apparent synchronization of coordinated manoeuvres, such as those of a flock of starlings or a school of fish, cannot be explained by

a purely reactive system 19 . Inspired by this hypothesis, the method proposed in this Article endows flying agents with a model of swarm behaviour based on nonlinear model predictive control (NMPC).
Model predictive control (MPC) is a method that computes the control action of a system as the solution of a constrained optimization problem 28,29 . MPC leverages a mathematical representation of the system to predict and optimize its future behaviour in an iterative process. Unlike PF control, MPC can explicitly handle constraints such as physical limitations (for example, the flight speed and acceleration ranges of a drone) [30][31][32] and environmental restrictions (for example, no-flight zones) [32][33][34] . However, the recursive online solution of constrained optimization problems is associated with higher computational costs, so the adoption of predictive controllers in robotics has spread only recently 35 .
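As an illustration of the receding-horizon principle described above, the following sketch applies MPC to a one-dimensional velocity-tracking problem. All values, the constant-input parameterization and the grid-search solver are illustrative assumptions, not the controller used in this Article:

```python
import numpy as np

# Minimal receding-horizon sketch: 1D velocity tracking with the
# dynamics v(k+1) = v(k) + u(k)*dt and a bounded acceleration input.
dt, P = 0.1, 10            # time step and prediction horizon (assumed values)
v_ref, u_max = 1.0, 2.0    # preferred speed and acceleration bound (assumed)

def mpc_step(v0, w_u=0.01):
    """Pick the input that minimizes sum_l (v_l - v_ref)^2 + w_u*u^2
    over the horizon, subject to |u| <= u_max (coarse grid search)."""
    best_u, best_cost = 0.0, np.inf
    for u0 in np.linspace(-u_max, u_max, 81):
        v, cost = v0, 0.0
        for _ in range(P):          # hold u0 constant over the horizon
            v = v + u0 * dt
            cost += (v - v_ref) ** 2 + w_u * u0 ** 2
        if cost < best_cost:
            best_u, best_cost = u0, cost
    return best_u

# Closed loop: apply only the first input, then re-plan at each step.
v = 0.0
for _ in range(50):
    v += mpc_step(v) * dt
```

The key point is the receding horizon: the optimizer plans over P steps but only the first input is applied before re-planning, which is what lets MPC react to disturbances while respecting the input bounds.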
MPC has shown promising results in simulations on multi-vehicle systems. Examples include the stabilization of multiple agents in obstacle-free environments 36,37 , in the presence of obstacles 33 , and in the generation of collision-free trajectories for groups of robots with known target locations [38][39][40] . NMPC is a variant of MPC that can handle the nonlinearities of a system or its constraints 29 . This advantage comes at the cost of being more computationally demanding. In the simulation, NMPC has been used to control leader-follower formations of drones without obstacles 41 , and to control two-dimensional (2D) quadrotor formations in the presence of convex obstacles 34 .
Less work has been done on the use of MPC with multiple real drones, notably due to the difficulty of real-time implementation. Linear MPC has been used for trajectory planning in the presence of virtual obstacles in a leader-follower configuration, where a drone (the follower) has to keep a constant distance from a virtual agent (the leader) 42 . However, in leader-follower approaches, the leader has the extra knowledge of the group trajectory, which is either preprogrammed or provided by an external source. This aspect introduces an asymmetry in the agents' roles and adds a single point of failure in the swarm 43 . MPC has been used for the online generation of collision-free trajectories for a group of drones in environments with obstacles, where every drone is individually assigned an initial position and a target destination 32 . Instead, the model presented here is meant to coordinate the navigation of the swarm as a unique entity and guarantee internal order, in lieu of generating the trajectories separately. Concurrently, we avoid imposing a rigid formation or a fixed topology to the swarm, which may impact the freedom and fluidity of the agents' movements. Finally, NMPC has been shown to be capable of dealing with non-convex collision avoidance constraints in real multi-drone systems when the agents are assigned intersecting paths, although they were flying in empty environments 44 .
In the proposed NMPC model, the objective function to be optimized is made of three components inspired by PF swarm models: (1) separation, which drives the inter-agent distances to a preferred value, (2) propulsion, which propels the agents with a preferred speed value, and (3) direction, which steers the swarm along a preferred direction. The separation component incorporates the effects of cohesion and repulsion in a single rule, in such a way that the inter-agent distance between neighbouring individuals appears as an explicit parameter. The propulsion and direction components taken together in the NMPC swarm model replicate the effect of the migration term in the PF swarm models. However, keeping two terms with two independent parameters allows separate adjustments of their relative strengths. A fourth rule, (4) control effort, is added to minimize the agents' accelerations, thereby smoothing flight trajectories and increasing energy efficiency. Each drone regulates its flight based on the knowledge of its neighbours and its own state and predicts its own trajectory and those of its neighbours thanks to a linearized dynamical model. The drones' neighbours are selected within a topological range; that is, for every drone only a constant number of nearest neighbours are considered 3,7 . The proposed NMPC model integrates a set of constraints to ensure safety distances among drones and with obstacles. We implement a centralized version of our NMPC model and we compare simulation results to a PF model. We show that predictive controllers can safely fly the swarm in cluttered environments while notably increasing the flight speed and synchronization of the swarm. We also show that the performance of the proposed NMPC model is independent of the obstacle density and environmental layout, unlike PF models. Additionally, we test the scalability of the proposed model to variations of desired inter-agent distance and swarm speed.
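The four components can be sketched as a single cost evaluation for one agent. The quadratic forms, the weights and the variable names below are illustrative assumptions rather than the article's exact formulation (which is given in its Methods):

```python
import numpy as np

# Assumed parameter values for illustration only.
d_ref, v_ref = 0.8, 1.0                 # preferred inter-agent distance and speed
u_ref = np.array([1.0, 0.0, 0.0])       # preferred migration direction (unit vector)
w_sep, w_prop, w_dir, w_u = 1.0, 1.0, 1.0, 0.1

def swarm_cost(p_i, v_i, acc_i, neighbour_positions):
    """Four-term cost for one agent (assumed quadratic forms)."""
    # (1) separation: drive inter-agent distances towards d_ref
    sep = sum((np.linalg.norm(p_j - p_i) - d_ref) ** 2
              for p_j in neighbour_positions) / len(neighbour_positions)
    # (2) propulsion: drive the speed magnitude towards v_ref
    prop = (np.linalg.norm(v_i) - v_ref) ** 2
    # (3) direction: penalize deviation from the migration direction
    direc = np.linalg.norm(v_i - np.linalg.norm(v_i) * u_ref) ** 2
    # (4) control effort: penalize the acceleration command
    effort = np.dot(acc_i, acc_i)
    return w_sep * sep + w_prop * prop + w_dir * direc + w_u * effort
```

A configuration that satisfies all four preferences (neighbour at distance d_ref, speed v_ref along u_ref, zero acceleration) has zero cost; any deviation increases it.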
We perform systematic experiments in simulation and validate the results with a swarm of five palm-sized quadrotors.

Results
For the performance assessment of the swarm models, we set up a forest-like environment that consists of a rectangular flight region populated with cylindrical obstacles (Fig. 1a). At the experiment onset, we place five drones at random positions within a predefined start area on one side of the region (Fig. 1a, red zone) and let the swarm fly through the region along the migration direction (Fig.  1a, orange arrow). The mission is completed when all drones cross the arrival plane (Fig. 1a, orange plane) on the opposite side of the region.
We assess the quality of the aerial swarm's flight considering eight different metrics. The mission completion time T measures the time that the swarm requires to cross the region. The inter-agent distance error E d measures the agents' deviation from the preferred distance d ref , and the inter-agent distance range R d measures the range in which the inter-agent distances vary (defined by the minimum and maximum inter-agent distance over time). The speed error E v measures the deviation of the agents' speeds from the preferred migration speed v ref , and the speed range R v measures the range in which the agents' speeds vary. E d , R d , E v and R v take values greater than or equal to 0 (ideal case). We determine the swarm's level of synchronization by calculating the directional correlation of the agents' movements, expressed by the so-called order Φ order . Φ order takes values between −1 (complete disorder) and 1 (perfect order). Finally, the agent-agent safety Φ agent-safety assesses the ability of the swarm's agents to avoid collisions among themselves, and the agent-obstacle safety Φ obs-safety assesses the ability of the agents to avoid collisions with the obstacles. Φ agent-safety and Φ obs-safety take values between 0 (complete unsafety) and 1 (perfect safety, that is, zero collisions) (Supplementary Table 1 provides the mathematical formulation). To evaluate the overall performance of the swarm during a mission, we compute the average and standard deviation of these metrics. For the instantaneous evaluation of the swarm over time, we additionally plot the inter-agent distance and speed, and the distance to obstacles, from which we can appreciate their respective errors and ranges, and the occurrence of collisions.
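As a rough sketch, the order and inter-agent distance metrics can be computed as follows. These are assumed formulations consistent with the descriptions above; the exact definitions are in Supplementary Table 1:

```python
import numpy as np

def order_metric(velocities):
    """Mean pairwise directional correlation of agent velocities.
    Returns a value in [-1, 1]; 1 means all agents move the same way."""
    units = [v / np.linalg.norm(v) for v in velocities]
    n = len(units)
    corr = [np.dot(units[i], units[j])
            for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(corr))

def distance_error(positions, d_ref):
    """Mean absolute deviation of pairwise distances from d_ref."""
    n = len(positions)
    dists = [np.linalg.norm(positions[i] - positions[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean([abs(d - d_ref) for d in dists]))
```

Both quantities can be evaluated at every time step to produce the instantaneous plots, or averaged over a mission to produce the aggregate statistics.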
We extensively tested the proposed NMPC swarm model in a simulation and compared it to a reactive PF model that has been recently described and validated on 30 real drones 24 . In addition to the repulsion and obstacle avoidance rules, the PF model includes a friction rule to reduce velocity oscillations. To ensure cohesive goal-directed flight in open environments, we added the rules of cohesion and migration to the PF model. As in previous work 21,24,27 , we used evolutionary optimization to search the large parameter space of the PF swarm model, and favoured swarms with highly ordered flight (Φ order = 1) and a low number of agent-agent and agent-obstacle collisions (Φ agent-safety = 1, Φ obs-safety = 1) (Supplementary Table 3). The purpose of the experimental comparison between NMPC swarming and PF swarming is to emphasize the behavioural differences and performance advantages of the proposed NMPC swarm model. However, the choice of a swarm model for deployment on physical drones should also consider computational resources, which are substantially larger for NMPC swarming.
In the following we present three sets of simulation experiments: (1) we compare the performance metrics of the two models in the same environmental conditions, (2) we investigate the adaptability of the PF and NMPC swarm models to environments with different obstacle density and (3) we study the scalability of the NMPC swarm model at different preferred speeds and inter-drone distances. Finally, we experimentally validate the NMPC swarm model with five palm-sized drones (Fig. 1c) flying through a room with cylindrical obstacles (Fig. 1b).
Comparison of PF and NMPC aerial swarms. Both PF and NMPC swarms navigated around the obstacles without collisions (Fig. 2e), but the NMPC swarm completed the mission 57% faster than the PF swarm. The reduced mission time is due to the ability of the NMPC swarm to track the preferred speed v ref more closely (Fig. 2c). The NMPC swarm also generated a smaller inter-agent distance error (E d = 0.11 ± 0.02) and range (R d = 0.55 ± 0.18) than the PF swarm (E d = 0.26 ± 0.15, R d = 0.90 ± 0.26) (Fig. 2b). The NMPC model generated almost perfectly ordered flight manoeuvres throughout the entire flight (Φ order = 0.98 ± 0.02), whereas the PF model displayed lower and more variable order (Φ order = 0.78 ± 0.17; Fig. 2d). Neither the NMPC nor the PF swarm presented agent-agent or agent-obstacle collisions (Φ agent-safety = 1 ± 0, Φ obs-safety = 1 ± 0; Fig. 2e). While optimizing the swarm's objectives, the NMPC model reduced the minimum distance to obstacles to 0.03 m. In comparison, the PF swarm achieved a minimum distance to obstacles of 0.14 m. This difference is due to the fact that in the PF model the obstacles apply a repulsion force on the agents in their proximity, while in the NMPC model there is no penalty for approaching the obstacles. As a consequence, when implementing the NMPC model on a real-world swarm, the user should carefully choose a safety margin.
Environments with different obstacle densities. We tested the PF and the NMPC swarm models for three different obstacle densities (case A, 0.06; case B, 0.12; case C, 0.20) to quantify the impact on the swarms' performance (Table 1). The obstacles occupy random positions on the map, but they have a homogeneous distribution (Fig. 3a,d). The initial positions of the drones are random. In Fig. 3, we show the evolution of the inter-agent distance and speed for the scenario with the highest obstacle density (case C) and for both models. The results show that the inter-agent distance error is smaller with NMPC swarms (E d = 0.11 ± 0.02) than with PF swarms (E d = 0.27 ± 0.12), and the inter-agent distance range is shorter for NMPC swarms (R d = 0.56 ± 0.18) than for PF swarms (R d = 0.90 ± 0.26) (Fig. 3b,e). The NMPC swarms also tracked the preferred speed more closely than the PF swarms, and the speed range was shorter (R v = 0.08 ± 0.07 and 0.47 ± 0.15, respectively) (Fig. 3c,f). The faster speed of the NMPC swarms resulted in a faster mission completion time than for the PF swarms (T = 21.5 s and 34.1 s, respectively).
To assess the reproducibility of the results, we performed 10 stochastic simulations for each of the three obstacle densities and for the two swarm models, and we report here the aggregated performance results (Fig. 3g). Although the speed error in the NMPC swarm is small and constant for all obstacle densities (case A, E v = 0.01 ± 0.01; case B, 0.01 ± 0.01; case C, 0.01 ± 0.01), it is larger and increases with larger obstacle densities in the PF swarm (case A, E v = 0.08 ± 0.07; case B, 0.21 ± 0.11; case C, 0.25 ± 0.11). As a consequence, the mission completion time of the PF swarm increases when the obstacle density is increased. For the experiments on the scalability in inter-agent distance (Fig. 4b), the swarm's speed error is almost zero in the three cases (Fig. 4c), and it resulted in similar mission times (case A, T = 20 s; case B, 21 s; case C, 21.2 s). We did not observe collisions.
For the experiments on the scalability in speed, the speed error E v was close to zero in the three cases (Fig. 4e), while the mission times decreased with increasing speed (case A, T = 20.2 s; case B, 10.5 s; case C, 7.5 s). However, the variability of the inter-agent distance in case C is higher (R d = 0.46 ± 0.05) than in cases A (R d = 0.13 ± 0.11) and B (R d = 0.19 ± 0.03) (Fig. 4f). Indeed, when the agents turn around the obstacle in the middle of the scene, they rearrange and increase their distance. In these experiments, we also did not observe collisions. Comparative results for the PF swarm are shown in Supplementary Fig. 1. The aggregate results for the stochastic simulations for each of the preferred inter-agent distance and speed values, and for both the PF and NMPC models, are provided in Supplementary Fig. 2 and in Supplementary Tables 6 and 7.
Validation with real drones. We validated the NMPC swarm on five commercial quadrotors in an indoor motion capture arena where we reconstructed the environment described in the section 'Comparison of PF and NMPC aerial swarms' (Fig. 5a). We measured the real flight performance and compared it with the simulation performance. The real drones achieve the preferred inter-agent distance d ref = 0.8 m with an error (E d = 0.12 ± 0.02) comparable to the simulation error (E d = 0.11 ± 0.02) (Fig. 5c).
However, the speed error is slightly higher (E v = 0.07 ± 0.03) than in the simulation (E v = 0.02 ± 0.2) (Fig. 5d). The higher speed error in the real swarm can be explained by small communication delays and air turbulence due to the proximity of the drones to each other and obstacles. The order of the real swarm (Φ order = 0.97 ± 0.04) is comparable to the simulated swarm (Φ order = 0.98 ± 0.02) (Fig. 5e), and in both cases we did not observe collisions (Φ agent-safety = 1 ± 0, Φ obs-safety = 1 ± 0; Fig. 5f).

Discussion
This Article shows that an NMPC model achieves a faster and more synchronized flight in cluttered environments than state-of-the-art models based on PFs. NMPC swarms report no collisions in cluttered environments, they better attain and maintain target speeds, and they remain more ordered and cohesive. The benefits brought by predictive controllers to robotic aerial swarms confirm a parallel with biological systems, where individuals are thought to enhance their synchronization by future state projection 19 .
In robotics, the advantages of the NMPC method are promising for applications that require navigation in crowded scenarios, such as the exploration of urban environments, collapsed buildings or forests 45,46 . Also, vision-based swarms could benefit from all these features because the reliability of reciprocal visual detection of the drones strongly depends on their distance, and the NMPC swarms showed that they can better maintain target inter-agent distances 22,47 . Overall, predictive methods can improve the autonomy of swarm operations as well as the safety of the swarm and the environment, which are both essential elements to build public confidence in the use of swarms 48 .
For the present experiments we relied on a central computing node that generates the motion of the agents at run time according to local interactions only. This assumption simplifies the implementation because it requires only one computer, acting as a ground control station, instead of several onboard computers that the agents would carry. However, the NMPC model requires a higher amount of computational resources than the PF model and scales worse with swarm size. It will be interesting to develop a decentralized NMPC model where the computational costs are independent of the number of agents. Work in this direction will allow us to scale our approach to swarms of larger size.
Finally, our results motivate future work on the design of robust swarm models in dynamic environments. Thanks to their recursive structure, MPC controllers offer a promising method to allow navigation in scenarios with moving obstacles. However, a generalization of the proposed model to dynamic environments would require theoretical and numerical investigation of the conditions for stability, as well as a reliable estimation of the obstacles' motion 49 .

[Fig. 4 caption fragment: b,c, Inter-agent distance and speed for the experiment on inter-agent distance scalability. e,f, Inter-agent distance and speed for the experiment on speed scalability. The obstacle size and density are the same for the six cases. See also Video 1.]

[Fig. 5 caption fragment: c, Average inter-agent distance and range with the real swarm (solid line and shaded region, respectively) and the simulated swarm (dashed and dotted lines, respectively). d, Average speed and range with the real swarm and the simulated swarm (same conventions). e, The swarm's order for the real (solid line) and simulated (dashed line) swarms. f, Swarm distance to obstacles; the offset in the real data (solid line) with respect to the simulated data (dashed line) is due to the safety margin.]

Methods
In this work, we consider a swarm of N agents labelled i ∈ {1, …, N}. The position, velocity and control input of the ith agent are denoted by pi, vi, ui ∈ R 3 , respectively. Let dij = ∥ pj − pi ∥ represent the distance between the centres of two agents i and j, where ∥ · ∥ denotes the Euclidean norm. We model the swarm with a directed sensing graph G = (V, E), where the vertex set V = {1, …, N} represents the agents, and the edge set E ⊆ V × V contains the pairs of agents (i, j) ∈ E for which agent i can sense agent j. We denote as Ni = {j ∈ V|(i, j) ∈ E} ⊂ V the set of neighbours of an agent i in G, and | · | indicates the cardinality of a set. We define the neighbour set utilizing a topological range; that is, the set Ni contains the |Ni| nearest neighbours of agent i. This choice is convenient for keeping the cardinality of the neighbour set constant, and it has also been shown to hold true for biological swarms 7 . Other studies have investigated different neighbourhood approaches based on the Voronoi partition or ad hoc attraction topologies 50,51 . However, we discarded those methods because they would introduce discontinuities in the objective function of our predictive swarm model, or they would constrain the swarm to a fixed formation. To reproduce a forest-like environment, we introduce M cylindrical obstacles labelled m ∈ {1, …, M}. We denote as d im the distance between an agent i and the symmetry axis of cylinder m. In our simulations, the dynamics of the agents are reproduced in discrete time. We let pi(k), vi(k), ui(k) ∈ R 3 be the position, velocity and control input of the ith agent at time t(k) = kdt, respectively. For brevity, in the following we will omit the time dependency when clear from the context.
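A minimal sketch of the topological neighbour selection described above, assuming plain Euclidean distances and a fixed neighbour count k:

```python
import numpy as np

def topological_neighbours(positions, k):
    """Return the neighbour set N_i of each agent as a list of index
    lists: the k nearest other agents, so |N_i| = k regardless of the
    local agent density (topological rather than metric range)."""
    n = len(positions)
    sets = []
    for i in range(n):
        # distance to every other agent, paired with its index
        d = [(np.linalg.norm(positions[j] - positions[i]), j)
             for j in range(n) if j != i]
        d.sort()                      # nearest first
        sets.append([j for _, j in d[:k]])
    return sets
```

Because the neighbour count is fixed, the per-agent cost evaluation has constant size, which is convenient for the optimization problem; a metric range would make |N_i| vary with density.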
PF swarm model. The PF model we present is inspired by a state-of-the-art model that allows drone swarm navigation in confined environments 24 . From the original model, we include the rule of repulsion to prevent inter-drone collisions, friction to reduce velocity oscillations, and obstacle avoidance to avoid collisions with obstacles. For the mathematical definition of these rules, we refer the reader to ref. 24 . We choose the pairwise gain of cohesion equal to the repulsion gain, c coh = c rep , and the cutoff for the minimum cohesion range equal to the repulsion range d ref .
The total cohesion effect calculated for agent i with respect to its neighbours is the sum of the pairwise cohesion contributions over the neighbour set Ni. At any instant, the candidate velocity ṽi for agent i results from summing the contributions of cohesion, repulsion, friction, obstacle avoidance and migration. After summing the contributions, we apply a cutoff on the acceleration at a max : whenever ∥ã i(k + 1)∥ > a max , where ã i(k + 1) = (ṽi(k + 1) − ṽi(k))/dt, the velocity change is rescaled so that the commanded acceleration magnitude equals a max . Then, we apply a cutoff on the speed at v max and get the velocity command v i of the ith agent: if ∥ṽi∥ > v max , then v i = v max ṽi/∥ṽi∥; otherwise, v i = ṽi. To search the large parameter space of the PF swarm model, we used evolutionary optimization for the highest-order flight and the lowest number of collisions. The evaluation of the swarm behaviour is based on a single fitness function that sums three independent values (Φ order , Φ agent-safety and Φ obs-safety ), each smaller than or equal to 1 (ideal case). The fitness is determined in simulations where the swarm is initialized with random positions in an environment where obstacles are randomly placed. The parameter values and their descriptions are detailed in the Supplementary Information.
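The two cutoffs can be sketched as a post-processing step on the candidate velocity. The rescaling forms below are standard assumptions consistent with the definitions above, not a verbatim transcription of ref. 24:

```python
import numpy as np

def apply_cutoffs(v_prev, v_tilde, dt, a_max, v_max):
    """Cap the implied acceleration at a_max, then the speed at v_max."""
    a = (v_tilde - v_prev) / dt                  # implied acceleration
    a_norm = np.linalg.norm(a)
    if a_norm > a_max:                           # acceleration cutoff
        v_tilde = v_prev + a_max * dt * a / a_norm
    speed = np.linalg.norm(v_tilde)
    if speed > v_max:                            # speed cutoff
        v_tilde = v_max * v_tilde / speed
    return v_tilde
```

Applying the acceleration cutoff before the speed cutoff keeps both bounds satisfied by the final command, since rescaling the speed can only shrink the velocity change.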
Agents' dynamics. The NMPC swarm model supposes the availability of the agents' dynamic model. We assume that every drone of the swarm obeys a discrete linear system, given by

xi(k + 1) = A i xi(k) + B i ui(k) (7)

where A i and B i are constant matrices. In this Article, we consider the system to represent a quadrotor with an underlying acceleration controller. The input u i is an acceleration command and the state xi = [pi, vi] ∈ R 6 is a vector containing the position and velocity. We assume that the velocities and acceleration inputs of the agents are bounded by constant vectors v min , v max and u min , u max , respectively. This translates into the inequalities

vmin ≤ vi(k) ≤ vmax (8)
umin ≤ ui(k) ≤ umax (9)

Let x = [x1, x2, …, xN] ∈ R 6N be the positions and velocities of the agents of the swarm, and u = [u1, u2, …, uN] ∈ R 3N . The system defining the motion of the swarm can be written as

x(k + 1) = Ax(k) + Bu(k) (10)

where A and B collect the agents' matrices A i and B i in block-diagonal form. At each time step, each agent updates its neighbour set and computes the cost associated with the four swarm rules, considering only neighbouring agents. All agents' contributions are then summed in a global cost function that defines our centralized model. The dynamic model of the agents introduced in the section 'Agents' dynamics' determines the evolution of the agents' state over a constant time window, called the prediction horizon. These predictions are optimized by the global cost function, whose solution gives the control inputs for the swarm over the so-called control horizon (Supplementary Fig. 3). The prediction and control horizons are finite and shift forward at every time step. In the following, they will be denoted as T P = Pdt and T C = Cdt, respectively, with P ≥ C and P, C ∈ N + . We let (·)(k + l|k) represent the predicted value of (·)(k + l) with the information available at time t(k) and l ∈ {0, …, P}.
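A minimal sketch of the prediction step for one agent, assuming a standard double-integrator discretization for A and B (the exact A_i and B_i used in the article are not reproduced here):

```python
import numpy as np

# Double-integrator model: state x = [p, v] in R^6, input u = acceleration.
dt = 0.1                                   # assumed time step
I3 = np.eye(3)
A = np.block([[I3, dt * I3],
              [np.zeros((3, 3)), I3]])     # p <- p + v*dt (+ 0.5*dt^2*u), v <- v + u*dt
B = np.vstack([0.5 * dt ** 2 * I3, dt * I3])

def predict(x0, inputs):
    """Roll the linear model forward over the prediction horizon."""
    traj, x = [x0], x0
    for u in inputs:
        x = A @ x + B @ u
        traj.append(x)
    return traj
```

In the NMPC setting, a roll-out of this kind gives the predicted states over the horizon; the optimizer then adjusts the input sequence to minimize the swarm cost subject to the bounds and collision constraints.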
Then, the continuity condition on the swarm state is written as

x(k|k) = x(k) (11)

The separation term for agent i and time t(k) (equation (12)) penalizes the deviation of the predicted inter-agent distances d ij (k + l|k), with j ∈ Ni , from the preferred value d ref over the prediction horizon. The separation component incorporates the effects of cohesion and repulsion, and drives the inter-agent distances to the preferred value d ref .
It is important to notice that commanding inter-agent distances instead of relative positions, as done in problems of formation keeping 33,34,36,37 , introduces a non-convexity in the separation term.
The propulsion term (equation (13)) penalizes the deviation of each agent's predicted speed from the preferred value v ref . The direction term (equation (14)) penalizes the deviation of each agent's predicted velocity direction from the preferred migration direction. The combined action of the propulsion (equation (13)) and direction (equation (14)) terms contributes to the so-called migration behaviour of the swarm, which regulates the two components of the swarm's velocity, that is, magnitude and direction, respectively. The choice of two separate terms that independently act on the swarm's velocity components is necessary to maintain constant flight speed during obstacle avoidance manoeuvres. The control effort term (equation (15)) penalizes the agents' acceleration commands. The total cost is the weighted sum of the four terms, where w sep , w prop , w dir and w u represent the constant weights associated with the cost function terms.
To prevent the agents from colliding with their neighbours or the obstacles, we associated with the cost function two sets of collision avoidance constraints:

d ij (k + l|k) ≥ d agent-safety , ∀ j ∈ Ni (16)
d im (k + l|k) ≥ d obs-safety , ∀ m ∈ {1, …, M} (17)

where d agent-safety is the safety distance between two agents' positions and d obs-safety is the safety distance that an agent should keep from the obstacle's position.
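A post-hoc feasibility check over both constraint sets can be sketched as follows. This is an illustrative check on predicted positions, not the solver-side formulation (which imposes the constraints inside the optimization):

```python
import numpy as np

def safe(p_pred, obstacles_xy, d_agent_safety, d_obs_safety):
    """p_pred: predicted agent positions, a list of R^3 vectors.
    obstacles_xy: cylinder axis positions, shape (M, 2).
    Returns True if every pairwise and agent-obstacle distance
    respects its safety margin."""
    n = len(p_pred)
    for i in range(n):
        for j in range(i + 1, n):                      # agent-agent distances
            if np.linalg.norm(p_pred[i] - p_pred[j]) < d_agent_safety:
                return False
        for c in obstacles_xy:                         # horizontal distance to axis
            if np.linalg.norm(p_pred[i][:2] - c) < d_obs_safety:
                return False
    return True
```

Using the horizontal distance to the cylinder axis matches the obstacle geometry of the forest-like environment, where the cylinders are vertical.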
Although in this study we consider collision avoidance with all obstacles (equation (17)), the model does not necessarily require it. Indeed, the obstacles that do not interfere with the agents' predicted trajectories are discarded by the optimization process. A strategy for reducing the number of constraints in the model consists of considering only the subset of obstacles that are on a collision course with the agents. Although this strategy would be an approximation of the more comprehensive modelling of all obstacles presented here, the advantages of the model-based approach described in this Article would still hold.
Simulation set-up. We implemented our NMPC model in MATLAB with the help of acados 52 , an open-source library for fast nonlinear optimal control. This software relies on C code generation to speed up the computation in real-time applications. The system dynamics and the constraints of the problem are discretized by the library over the prediction horizon to obtain a structured nonlinear program (NLP). Then, the NLP is approximated through sequential quadratic programming (SQP), which iteratively solves convex quadratic program (QP) sub-problems. After applying a condensing step, a linear algebra solver, HPIPM, based on the interior point (IP) method, finds the solution of the sub-problems 53 . We run our simulations on a DELL Precision Tower with a 3.6-GHz Intel Core i7-7700 processor and 16-GB 2,400-MHz RAM, where we set the maximum number of SQP iterations to 7 and the maximum number of QP iterations to 7.
Drone experimental set-up. In our experiments, we used five Bitcraze Crazyflie 2.1 quadrotors (Fig. 1c). Each quadrotor is equipped with a three-axis accelerometer, a three-axis gyroscope, a pressure sensor and a marker deck for hosting passive reflective markers. We used an STM32F4 microcontroller running at 168 MHz, on which both state estimation and low-level control run. An OptiTrack motion capture system was used to track the positions of the robots. All the acceleration commands for the drones were computed on a single computer with our NMPC model, integrated into position and velocity commands, and broadcast to the swarm through a radio link alongside the estimated position of each drone. The estimated positions were used by the drones to perform the lower-level control loops and track the commands sent. The positions and velocities used by the swarm model were predicted with the agents' dynamic model. To guarantee the transferability of the NMPC swarm model to hardware experiments, we decreased the maximum number of SQP iterations to 4. This was sufficient to compute converging solutions of the NLP in less than 0.1 s.

Data availability
Complementary data for reproducing the experiments are available in the Supplementary Information. Simulation and hardware experimental data that support the findings of this study can be downloaded from ref. 54 .

Code availability
The code that supports the findings of this study can be downloaded from ref. 55 .