Emergence of Urban Growth Patterns from Human Movements

Cities grow in a bottom-up manner, leading to fractal-like urban morphology characterized by scaling laws. Correlated percolation has succeeded in modeling urban geometries by imposing strong spatial correlations. However, the origin of such correlations remains largely unknown. Very recently, our understanding of human movements has been revolutionized thanks to the increasing availability of large-scale human mobility data. This paper proposes a novel human movement model that offers a micro-foundation for the dynamics of urban growth. We compare the proposed model with three empirical datasets; the comparison provides evidence that strong social coupling and long-memory effects are two fundamental principles responsible for the mysterious spatial correlations. The model not only accounts for the empirically observed scaling laws but also allows us to understand city evolution dynamically. Over a

urban data suggests that cities grow in a bottom-up manner, calling for an understanding of their micro-foundation [2][3][4]. Later, three fundamental empirical laws were discovered 2,5,6. First, the distribution of city size follows a power law with a scaling exponent around two, implying that large cities are much rarer than small towns 5. Second, the population grows super-linearly with the urban area, which previous studies attribute to intense competition for space 7,8. Finally, the density of occupied urban areas decreases exponentially with the radial distance to the city center [9][10][11]. Physicists have applied diffusion-limited aggregation (DLA) to model urban growth as an aggregation of physical particles 7,12. Further work showed that correlated percolation (CP) is a better alternative for explaining the emergence of the aforementioned laws 5. A key observation of the CP model is that a strong geographical correlation is required to reproduce the correct scaling relations 13. While the CP model successfully explains urban morphology, it has little connection with human activities at the micro-level. The micro-foundation of such geographical correlation remains a mystery. Here, we develop a novel urban growth model based on human movements, suggesting that strong social coupling and long memory are two fundamental principles governing urban growth.
Thanks to the availability of large-scale movement datasets, our understanding of human movements has been revolutionized over the past decade [14][15][16]. Existing human movement models fall into three classes, as depicted in Fig. 1. Class A models treat human movements as randomly moving particles without interactions. Brownian motion is one such prototype model, in which an individual's displacements are normally distributed 17. Unlike the motion of physical particles, human movements are, according to empirical data, characterized by large jumps between two consecutive steps, satisfying a power law,
$$P(\mathbf{r}\,|\,\mathbf{r}') \sim \frac{1}{|\mathbf{r}-\mathbf{r}'|^{d+\alpha}}, \tag{1}$$
where $P(\mathbf{r}\,|\,\mathbf{r}')$ is the transition probability from location $\mathbf{r}'$ to $\mathbf{r}$, with $d = 2$ for two-dimensional space 14,18. The exponent $\alpha$ is observed to be around 0.55 ± 0.05 19. The decrease of the transition probability with distance reflects the cost of travel in human movements, i.e., most of the time people travel only over short distances, whereas occasionally they take longer trips.
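To make the fat-tailed jump statistics concrete, the following is a minimal sketch of sampling Lévy-flight steps in two dimensions. It assumes the jump kernel decays as $|\mathbf{r}-\mathbf{r}'|^{-(d+\alpha)}$ with $d = 2$, so that the step length $\ell$ has the radial density $p(\ell) \sim \ell^{-(1+\alpha)}$ above a lower cutoff. The cutoff `l_min`, the function names, and the use of inverse-transform sampling are illustrative assumptions, not details taken from the paper.

```python
import math
import random

def sample_levy_step(alpha, l_min=1.0, rng=random):
    """Draw one 2-D displacement with step length l >= l_min and p(l) ~ l^-(1 + alpha).

    l_min is a hypothetical lower cutoff needed to normalize the power law.
    """
    u = 1.0 - rng.random()                    # uniform in (0, 1], never zero
    length = l_min * u ** (-1.0 / alpha)      # inverse-transform sampling of the tail
    angle = rng.uniform(0.0, 2.0 * math.pi)   # isotropic direction
    return length * math.cos(angle), length * math.sin(angle)

def levy_flight(n_steps, alpha=0.55, l_min=1.0, seed=0):
    """Trajectory of a single non-interacting, memoryless walker (Class A)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        dx, dy = sample_levy_step(alpha, l_min, rng)
        x, y = x + dx, y + dy
        path.append((x, y))
    return path
```

Most sampled steps stay close to `l_min`, while a few are orders of magnitude longer, which is the qualitative behavior described for human trips.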
Neglecting social interactions and memory effects, Eq. (1) suggests that human movements follow a Lévy flight, with the population density $\rho(\mathbf{r}, t)$ satisfying the fractional diffusion equation,
$$\frac{\partial \rho(\mathbf{r},t)}{\partial t} = -D\,(-\Delta)^{\alpha/2}\rho(\mathbf{r},t), \tag{2}$$
where $D$ is the diffusion constant (see Methods section for details). Nevertheless, both Brownian motion and Lévy flight predict a uniform population distribution as time $t$ approaches infinity, in contrast to empirical observations 9.
Class B models, such as the Gravity model 8 and the Radiation model 20, originate from the study of migration, where the traffic flow between two locations depends on their populations. For instance, the Gravity model suggests the transition probability,
$$P(\mathbf{r}\,|\,\mathbf{r}') \sim \frac{\rho(\mathbf{r}) + \rho_0}{|\mathbf{r}-\mathbf{r}'|^{d+\alpha}}, \tag{3}$$
where $\rho_0$ is the (inverse) coupling constant. In addition to the fat-tailed jump size distribution (1), the Gravity model (3) also requires that the transition probability increase linearly with the population at the destination $\mathbf{r}$ 21. This mechanism accounts for a mean-field background attractiveness rooted in social interaction, e.g., highly populated locations often offer more social opportunities 8. One would hope that this social attractiveness is responsible for the mysterious geographical correlation in the CP model. Unfortunately, we find that the diffusion process of the Gravity model follows the same fractional diffusion Eq. (2) as Lévy flights (see Methods section for details), i.e., it predicts uniformly distributed urban patterns at large $t$.
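As an illustration of how population attraction enters the Gravity model, the sketch below computes normalized transition probabilities on a small lattice. It assumes each destination is weighted by (population + $\rho_0$) divided by the distance to the power $d + \alpha$; the dictionary-based lattice, the function name, and the exclusion of self-transitions are hypothetical choices made only for this example.

```python
import math

def gravity_transition_probs(density, origin, alpha=0.55, rho0=1.0, d=2):
    """Return {site: probability} for one jump from `origin` on a 2-D lattice.

    density: dict mapping (x, y) lattice sites to population counts.
    Assumed weight of destination r: (density[r] + rho0) / |r - origin|^(d + alpha).
    """
    weights = {}
    for site, rho in density.items():
        if site == origin:
            continue  # self-transitions are excluded in this sketch
        dist = math.hypot(site[0] - origin[0], site[1] - origin[1])
        weights[site] = (rho + rho0) / dist ** (d + alpha)
    total = sum(weights.values())
    return {site: w / total for site, w in weights.items()}

# Two sites at the same distance from the origin, one populated and one empty.
density = {(x, y): 0 for x in range(5) for y in range(5)}
density[(3, 2)] = 10
probs = gravity_transition_probs(density, origin=(2, 2), rho0=1.0)
ratio = probs[(3, 2)] / probs[(1, 2)]  # (10 + rho0) / (0 + rho0)
```

Raising `rho0` pushes `ratio` toward one, which matches the statement that a larger $\rho_0$ weakens the influence of population density.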

Class C models have been developed during the recent study of human mobility. Unlike Class A, where individuals move freely, empirical data reveal notable recurrent-visitation patterns in human movements. Consequently, individuals show ultra-slow diffusion, in contrast to the regular power-law diffusion of Brownian motion and Lévy flight. To explain these findings, the Individual mobility model (IMM) treats human movement as a two-stage return-exploration process that accounts for long-memory effects. In particular, a preferential return mechanism is imposed, i.e., the probability of returning to a previously visited location $\mathbf{r}_i$ is proportional to its historic visitation frequency $f(\mathbf{r}_i)$. Such a long-memory return process slows human diffusion drastically. In particular, IMM predicts that the traveling distance $l$ follows
$$l \sim \log A, \tag{5}$$
where $A$ is the total visitation area 22. The logarithmic growth is one of the key ingredients of human movements, characterizing the anomalous ultra-slow diffusion and the home-range effect 23. Unfortunately, since IMM only models individual movements without social interactions, it fails to capture urban growth patterns (see Results section).
The coupling constant, $\rho_0^{-1}$, controls the strength of population attraction, i.e., increasing $\rho_0$ reduces the impact of population density and, consequently, the strength of social interactions. For $\rho_0^{-1} \to 0$, the proposed Collective Mobility Model (CMM) is effectively equivalent to IMM. Inspired by the strong geometrical correlation in CP models 5, we are interested in the strong coupling limit $\rho_0^{-1} \to \infty$, where CMM describes a strongly correlated many-body system.
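The preferential return mechanism described above for IMM can be sketched as a weighted draw over previously visited sites. The code below is a minimal illustration, not the authors' implementation: the exploration callable `explore` and the fixed exploration probability `p_explore` are hypothetical placeholders, since the text here does not specify how the exploration stage is chosen.

```python
import random
from collections import Counter

def preferential_return(visit_counts, rng=random):
    """Pick a previously visited site with probability proportional to its visit count."""
    sites = list(visit_counts)
    weights = [visit_counts[s] for s in sites]
    return rng.choices(sites, weights=weights, k=1)[0]

def imm_step(visit_counts, explore, p_explore, rng=random):
    """One return-exploration step of a single walker.

    visit_counts: Counter of visits per site (the walker's memory).
    explore: hypothetical callable returning a new site, e.g. a Levy-flight jump.
    p_explore: hypothetical exploration probability (assumed constant here).
    """
    if not visit_counts or rng.random() < p_explore:
        site = explore()                                   # exploration stage
    else:
        site = preferential_return(visit_counts, rng)      # memory-driven return
    visit_counts[site] += 1
    return site
```

Because frequently visited sites are more likely to be chosen again, a walker with this memory keeps revisiting a small set of locations, which is the qualitative origin of the slow diffusion discussed in the text.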

Results
To compare the morphology of the emulated urban systems with the empirical observations, we plot population distributions in Fig. 2A-D for all models, together with the empirical distribution of London in Fig. 2E. While the real-world geometry is affected by geographical features, e.g., lakes and rivers, London still exhibits the prominent features of a compact city center and a fractal perimeter. These observations echo previous studies on the fractal geometry of urban areas [24][25][26].
The urban population distribution for the Lévy flight and the Gravity model follows the fractional diffusion process (2), implying that individuals gradually diffuse away from their initial positions over time. The emulation verifies this prediction, with the urban population distributed uniformly in urban space when the systems converge (see Fig. 2A-B). This indicates that these two models fail to reproduce compact and stable city centers. On the other hand, IMM predicts that urban systems grow homogeneously along the perimeter. The emulation result shows that the perimeter of the urban area is a regular circle, and that urban areas at a similar radial distance from the city center have similar population density (see Fig. 2C), consistent with the theoretical prediction. Therefore, the emulation suggests that IMM cannot reproduce the fractal morphology of urban areas. In contrast, CMM successfully reproduces a compact city center, where the population density is significantly higher than in the peripheral urban area (see Fig. 2D). Moreover, the perimeter of the city center exhibits prominent fractal geometry, and numerous sub-clusters form around it (see Supplementary Material S4 for details). These observations agree with the empirical observations of London, which indicates that CMM can reproduce the morphology of urban areas.
To compare the models with real-world urban growth quantitatively, we focus on three fundamental empirical laws, each of which has been validated on multiple cities around the globe 2,5,6:
(A) City size distribution: The number of cities $N(A)$ decreases with their area $A$, following a power law,
$$N(A) \sim A^{-\tau}, \tag{6}$$
where the exponent $\tau$ has been reported to be around 2.0 5. Percolation theory is the prevalent explanation for this observation, with each site occupied as an urban area with a certain probability 13. It predicts the scaling law (6) with the exponent ranging between 2 and 2.5, where $\tau = 2$ corresponds to a strong correlation between different sites and $\tau = 2.5$ corresponds to mean-field theory 13.
The empirical urban datasets show that city size distributions are well approximated by Eq. (6) (see Fig. 3A), with $\tau = 2.09 \pm 0.09$ for the U.S.A., $\tau = 2.01 \pm 0.08$ for G.B., and $\tau = 1.91 \pm 0.16$ for Berlin. These findings echo the theoretical predictions of site percolation theory and the empirical observations of previous research 5,13.
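One common way to estimate a power-law exponent such as $\tau$ from a list of city areas is the maximum-likelihood estimator for a continuous power law with a lower cutoff. The sketch below illustrates that estimator; it is a swapped-in standard technique, and the text does not state which fitting procedure the authors used. The cutoff `a_min` and the synthetic-data helper are assumptions for demonstration only.

```python
import math
import random

def fit_power_law_exponent(areas, a_min):
    """Maximum-likelihood estimate of tau for p(A) ~ A^-tau with A >= a_min.

    Returns (tau, standard_error). a_min is a user-chosen lower cutoff.
    """
    tail = [a for a in areas if a >= a_min]
    n = len(tail)
    log_sum = sum(math.log(a / a_min) for a in tail)
    tau = 1.0 + n / log_sum
    stderr = (tau - 1.0) / math.sqrt(n)
    return tau, stderr

def sample_power_law(n, tau, a_min=1.0, seed=0):
    """Synthetic areas drawn from p(A) ~ A^-tau, used only to check the estimator."""
    rng = random.Random(seed)
    return [a_min * (1.0 - rng.random()) ** (-1.0 / (tau - 1.0)) for _ in range(n)]

areas = sample_power_law(50000, tau=2.0, seed=42)
tau_hat, tau_err = fit_power_law_exponent(areas, a_min=1.0)
```

On synthetic data generated with $\tau = 2$, the estimate should land close to 2, with an uncertainty that shrinks as the number of cities grows.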
Lévy flight characterizes movement as an individual diffusion process. The urban population will distribute uniformly in urban space as the urban system reaches a stable state. On the other hand, although the Gravity model introduces correlation among individuals through Eq. (3), our analysis shows that its population distribution also follows the fractional diffusion Eq. (2). When time $t \to \infty$, the population will distribute uniformly in urban space, $\rho(\mathbf{r}) = c$, independent of the coupling constant $\rho_0^{-1}$. Therefore, both the Lévy flight and the Gravity model are equivalent to uncorrelated percolation. The emulated urban systems of the Lévy flight and the Gravity model reproduce the scaling law with slopes of −2.55 ± 0.15 and −2.58 ± 0.17, i.e., $\tau \approx 2.55$ and $\tau \approx 2.58$ (see Fig. 3D), consistent with the theoretical prediction for completely uncorrelated percolation. While the analytical prediction of the memory-aware and socially independent IMM is unclear, the emulation shows that it also satisfies the scaling law, yet with an exponent $\tau = 2.98 \pm 0.51$. The large exponent implies that individuals are localized within their own home range, since IMM is equivalent to the non-interacting limit of CMM with $\rho_0^{-1} \to 0$. In contrast, when the coupling constant $\rho_0^{-1} \to \infty$, CMM becomes strongly correlated. As a result, it reproduces the scaling law with $\tau = 2.02 \pm 0.13$, which agrees with the theoretical predictions and the empirical patterns observed in real-world data. These results suggest that both principles, social interaction and memory, are essential for reproducing the empirical city size distribution, and that CMM successfully integrates them into a unified movement model.

(B) Super-linear relation between population and city size:
The positive allometric growth of population with urban area is widely observed in cities around the globe 27,28. Larger cities tend to have a higher urban population density, $\rho_A \equiv N(A)/A$, because they are developing into the third dimension 24,29,30. Recent research suggested that the balance between the cost and gain of concentrating population in urban areas would explain the observed super-linear growth 6. This socio-economic hypothesis consists of two assumptions: i) the average gain from intense social interaction is proportional to the population density $\rho_A$; ii) the average living cost is proportional to the typical travel distance $l \sim \log A$ (see Eq. (5)) needed to explore the city. Their balance leads to
$$\rho_A \sim \log A. \tag{7}$$
Assumption i) agrees with the social interaction in Eq. (3), whereas assumption ii) is rooted in the memory effect in Eq. (4).
Fig. 3B plots the population density $\rho_A$ against city area $A$ across different cities for both the U.S.A. and G.B., finding that the empirical observations agree closely with the predicted logarithmic law (7). It is worth noting that previous studies reported a power-law fit, i.e., $\rho_A \sim A^{\delta}$ with a tiny exponent $\delta \approx 0.1$ 6. However, within the range of magnitudes covered by the empirical data, the logarithmic function is indistinguishable from a small-exponent power law. Fig. 3E compares the emulation results for the four prototype models, finding that the proposed CMM reproduces the logarithmic law, whereas the population density $\rho_A$ shows no area dependence for the other three models.
This result demonstrates that both social interaction and memory are necessary for the observed scaling law (7).
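The remark that a logarithm and a small-exponent power law are hard to tell apart over a limited range can be checked numerically. The sketch below compares $\log A$ with $A^{0.1}$ over an assumed range of three decades in area; the range, the sampling grid, and the use of the Pearson correlation as a similarity measure are illustrative assumptions rather than details from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical range: 100 log-spaced areas from 10 to 10^4 (arbitrary units).
areas = [10 ** (1 + 3 * k / 99) for k in range(100)]
log_law = [math.log(a) for a in areas]     # rho_A ~ log A
power_law = [a ** 0.1 for a in areas]      # rho_A ~ A^0.1
r = pearson(log_law, power_law)
```

Over this range the two curves are almost perfectly linearly related (`r` is above 0.99), so a fit on noisy data cannot easily separate them.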
(C) Exponential occupation profile: The urban occupation profile $\varphi(r)$ is defined as the probability of finding an inhabited area at distance $r$ from the city center. Empirical studies suggested an exponential profile 31,
$$\varphi(r) \sim e^{-\lambda r}. \tag{8}$$
We also observe the same exponential law for all three empirical datasets, as shown in Fig. 3C, indicating that the city center attracts most of the population, whereas the occupation probability decreases rapidly with the radial distance. However, such a rapid decline appears to contradict the fat-tailed nature of human movement (1), which suggests that human travel can reach areas far away from the initial location 18,24.
This paradox can also be resolved by jointly introducing social interaction and memory into human movement. Indeed, the emulation results in Fig. 3F show that the occupation profile in CMM agrees very well with the exponential law (8). In contrast, $\varphi(r)$ is independent of $r$ for the Lévy flight and the Gravity model, whereas IMM shows a non-exponential decrease. Moreover, it has been suggested that the decay rate $\lambda$ should decrease as the city evolves, because the frontiers of cities are constantly pushed outward 5, in line with the observations in the Berlin dataset, where $\varphi(r)$ has been measured at three different times. Fig. 4A shows that $\lambda$ decreases gradually from 0.050 to 0.031.
The emulation results of CMM closely reproduce the evolution of the occupation profile during urban development (see Fig. 4B).
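A decay rate such as $\lambda$ can be estimated from an occupancy map in two steps: measure the fraction of occupied sites in concentric rings, then fit a straight line to the logarithm of that fraction. The following is a minimal sketch under the assumptions of a square lattice, unit-width rings, and an ordinary least-squares fit; none of these choices is taken from the paper.

```python
import math

def occupation_profile(occupied, center, r_max):
    """phi[k]: fraction of lattice sites with k <= distance < k + 1 that are occupied."""
    total = [0] * r_max
    filled = [0] * r_max
    cx, cy = center
    for x in range(cx - r_max, cx + r_max + 1):
        for y in range(cy - r_max, cy + r_max + 1):
            ring = int(math.hypot(x - cx, y - cy))
            if ring < r_max:
                total[ring] += 1
                filled[ring] += (x, y) in occupied
    return [f / t for f, t in zip(filled, total)]

def fit_decay_rate(phi):
    """Least-squares slope of ln(phi) versus r; returns lambda for phi ~ exp(-lambda * r)."""
    points = [(r, math.log(p)) for r, p in enumerate(phi) if p > 0]
    n = len(points)
    mean_r = sum(r for r, _ in points) / n
    mean_l = sum(l for _, l in points) / n
    num = sum((r - mean_r) * (l - mean_l) for r, l in points)
    den = sum((r - mean_r) ** 2 for r, _ in points)
    return -num / den
```

Applying `fit_decay_rate` to profiles measured at successive times would give the kind of trend in $\lambda$ that the text reports for Berlin.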

Discussion
Rapid urbanization calls for a more comprehensive understanding of the patterns of urban growth 32,33. The correlated percolation (CP) model has successfully reproduced urban morphology by introducing a strong geographical correlation into percolation theory, leaving the origin of such correlations a mystery 5. In this paper, we propose a novel urban growth model rooted in human movements, providing a micro-foundation for the mysterious geographical correlation in the CP model. It offers a bottom-up approach towards understanding the observed urban morphology and scaling laws. Two principles, namely strong social interaction and memory of historical movements, are shown to be the key ingredients governing human migration and, consequently, urban development. Unlike existing human movement models, in which individual movements are either uncorrelated or memoryless and which fail to capture urban growth, the proposed Collective mobility model (CMM) demonstrates that both principles play essential roles. Theoretical analysis and emulation results show that the memory principle is essential for reproducing compact and stable city centers in urban systems. In particular, CMM reproduces three major empirical laws: the city size distribution, the super-linear population-area relation, and the exponential occupation profile, consistent with the established CP model and socio-economic model at the macroscopic level. Unlike the CP model, which is purely static and has to take the exponential occupation profile as input, the proposed CMM predicts the occupation profile (see Fig. 3F) and its evolution (see Fig. 4B) in a self-contained manner, without imposing additional assumptions (see supplementary material S5 for more details about model parameters). In all, CMM not only fills the gap in the paradigm of existing movement models (Fig. 1), but also complements previous urban growth models, shedding light on the underlying mechanisms of urban growth at the macroscopic level.
In addition to the scientific findings, our research may also have direct implications for a wide range of downstream applications [34][35][36][37], e.g., city planning, resource allocation, and disease control. First, previous research has shown that conventional top-down city planning strategies are ineffective in governing urban growth 2. We propose an alternative bottom-up urban growth model that predicts the urban growth process from the perspective of human movement. These findings may inspire novel city planning policies that leverage the principles of human movement 38. For example, the development of an urban region could be promoted by reducing the cost of traveling to it.
Second, the proposed urban model enables us to emulate the patterns of urban dynamics with high accuracy, which may shed light on the preemptive allocation of critical resources, such as transportation infrastructure 39 and medical supplies 40. Third, the proposed model reveals the underlying correlations between urban structures and human movement. Therefore, the findings improve our understanding of urban movement, which plays a crucial role in controlling contagious diseases 41,42.

Methods
Diffusion Equation of Gravity Model
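In this section, we derive the diffusion equation (2) for the Gravity model. We start with a lattice model to work out the master equation and then obtain the equation in the continuum limit by letting the lattice spacing approach zero.
Equation (3) indicates that the transition matrix $W_{ij}$ from lattice site $j$ to site $i$ follows
$$W_{ij} = \lambda_g(l)\,\frac{\rho_i + \rho_0}{|i-j|^{d+\alpha}},$$
where $\rho_i$ is the population at site $i$, $|i-j|$ is the distance between the two sites in units of the lattice spacing, $\lambda_g(l)$ is the transition rate with an appropriate scaling with the lattice spacing $l$, and $\alpha \in (0, 2]$.
The corresponding master equation reads
$$\frac{\partial \rho_i}{\partial t} = \sum_{j \neq i}\left(W_{ij}\rho_j - W_{ji}\rho_i\right) = \lambda_g(l)\,\rho_0 \sum_{j \neq i}\frac{\rho_j - \rho_i}{|i-j|^{d+\alpha}},$$
where the terms bilinear in the population cancel. Assuming a small lattice spacing $l$, we approximate the summation with a continuous integral,
$$\frac{\partial \rho(\mathbf{r},t)}{\partial t} \approx \lambda_g(l)\,l^{\alpha}\rho_0 \int \frac{\rho(\mathbf{r}',t) - \rho(\mathbf{r},t)}{|\mathbf{r}-\mathbf{r}'|^{d+\alpha}}\,d^d\mathbf{r}' = -c_{d,\alpha}\,\rho_0\,\lambda_g(l)\,l^{\alpha}\,(-\Delta)^{\alpha/2}\rho(\mathbf{r},t),$$
where $(-\Delta)^{\alpha/2}$ is the fractional Laplacian satisfying
$$(-\Delta)^{\alpha/2} f(\mathbf{r}) = \frac{1}{c_{d,\alpha}} \int \frac{f(\mathbf{r}) - f(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^{d+\alpha}}\,d^d\mathbf{r}',$$
and $c_{d,\alpha}$ is a normalization constant that depends only on $d$ and $\alpha$. In the limit $l \to 0$, the continuous equation satisfies the fractional diffusion equation,
$$\frac{\partial \rho(\mathbf{r},t)}{\partial t} = -D\,(-\Delta)^{\alpha/2}\rho(\mathbf{r},t),$$
where the diffusion constant $D \equiv c_{d,\alpha}\,\rho_0 \lim_{l \to 0}\lambda_g(l)\,l^{\alpha}$, in line with Eq. (2). For $\alpha = 2$ we recover the standard diffusion equation.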
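The claim that the Gravity model diffuses like a Lévy flight, with a diffusion constant proportional to $\rho_0$, can be checked numerically on a small lattice. The sketch below assumes unnormalized rates of the form $W_{ij} = (\rho_i + \rho_0)/|i-j|^{d+\alpha}$ on a one-dimensional chain with the rate prefactor set to one; the lattice, the function names, and the test densities are hypothetical choices for illustration.

```python
def kernel(i, j, alpha, d=1):
    """Distance-decay factor 1 / |i - j|^(d + alpha) between lattice sites i and j."""
    return 1.0 / abs(i - j) ** (d + alpha)

def gravity_rate_of_change(rho, rho0, alpha):
    """d(rho_i)/dt from the master equation with assumed rate W_ij = (rho_i + rho0) * kernel."""
    n = len(rho)
    out = []
    for i in range(n):
        gain = sum((rho[i] + rho0) * kernel(i, j, alpha) * rho[j] for j in range(n) if j != i)
        loss = sum((rho[j] + rho0) * kernel(j, i, alpha) * rho[i] for j in range(n) if j != i)
        out.append(gain - loss)
    return out

def levy_rate_of_change(rho, rho0, alpha):
    """rho0 * sum_j kernel(i, j) * (rho_j - rho_i): the linear, Levy-flight form."""
    n = len(rho)
    return [
        rho0 * sum(kernel(i, j, alpha) * (rho[j] - rho[i]) for j in range(n) if j != i)
        for i in range(n)
    ]

rho = [5.0, 1.0, 0.0, 3.0, 2.0, 7.0]
gravity = gravity_rate_of_change(rho, rho0=2.0, alpha=0.55)
levy = levy_rate_of_change(rho, rho0=2.0, alpha=0.55)
```

The two lists agree up to rounding because the products $\rho_i \rho_j$ appear in both the gain and the loss term and cancel, leaving only the part proportional to $\rho_0$.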

Model Emulation
We evaluate the effectiveness of the movement models in predicting urban growth patterns by emulating urban systems whose citizens are governed by them. Specifically, the urban system is emulated in a two-dimensional square space with $l$ sites in width and length, where each site represents a unit area. The urban system consists of $N$ citizens whose movement trajectories are generated by the corresponding urban movement model. The emulated citizens are initially located at the center of the urban system and then iteratively sample their next visit sites at each epoch until the system reaches a stable state. After the system has converged, we consider the occupied sites as populated urban areas and examine the morphology and growth patterns of the emulated urban system. Emulating agent-based urban systems according to urban movement models is time-consuming 43,44.
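Sampling each citizen's next site by scanning all candidate sites costs time proportional to the number of sites per draw. The alias method is a standard technique that, after linear-time preprocessing, draws from a fixed discrete distribution in constant time. The sketch below shows a generic version (Vose's construction); it is not the authors' implementation, and how such a table would be kept up to date as site weights change is not addressed here.

```python
import random

def build_alias_table(weights):
    """Preprocess non-negative weights into (prob, alias) tables in O(n) time."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]   # mean of the scaled weights is 1
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        big = large.pop()
        prob[s] = scaled[s]          # keep column s with this probability
        alias[s] = big               # otherwise redirect to the larger entry
        scaled[big] = scaled[big] + scaled[s] - 1.0
        (small if scaled[big] < 1.0 else large).append(big)
    for i in small + large:          # leftovers equal 1 up to rounding
        prob[i] = 1.0
    return prob, alias

def alias_draw(prob, alias, rng=random):
    """Draw one index in O(1): pick a column uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

weights = [1.0, 2.0, 3.0, 4.0]
prob, alias = build_alias_table(weights)
```

Each draw uses one uniform integer and one uniform float regardless of the number of sites, which is the property that makes this kind of table attractive for large lattices.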


We collect three publicly available urban development datasets, including the population and urban area of cities in i) the United States of America (U.S.A.) in 2000 and ii) Great Britain (G.B.) in 1991, and iii) the distribution of urban area in the Berlin region in 1910, 1920, and 1945. For comparison, we emulate human migration with four typical movement models for Classes A-D, respectively, namely the Lévy flight, the Gravity model, IMM, and the proposed CMM. A naive emulation of large urban systems is impractical due to the high time complexity, which is $O(Ml^2)$ for each epoch, with $M$ and $l$ denoting the number of citizens and the size of the urban area, respectively. We address this problem by designing improved sampling techniques to effectively reduce the complexity to

Figure 2: The morphology of urban area generated by four different human movement mod-

Figure 3: The comparison of reproducing empirical urban growth patterns with urban sys-

Figure 4: The qualitative examination on reproducing the populated area density profile in

Figure 1: The paradigm of human movement models. Existing human movement models, classified based on whether they account for the memory of historic movements or for social interactions, are summarized as a paradigm with four classes: (A) Brownian motion and Lévy flight belong to this class, where movements are independent and memoryless; (B) the Gravity model and the Radiation model are the typical models where movements are socially correlated and memoryless; (C) the Individual mobility model belongs to this class, where movements are independent and memory-aware; (D) the missing corner of the current paradigm. The proposed Collective Mobility Model fills in this class, where movements are both socially correlated and memory-aware.

To address these challenges, we propose two improved sampling methods, alias sampling and sorted array sampling, to accelerate the emulation (see supplementary material S3 for details).

Data Availability
All the empirical urban data we examine are publicly available on government websites and in previous research (see supplementary material S1 for details). Specifically, the data on U.S.A. cities can be accessed through the website of the U.S. Census Bureau 45. The data on cities in G.B. are available on the website of the Statistical Office of the European Union 46. Both datasets provide information on urban area coverage and population distribution. On the other hand, the dataset of the Berlin region is collected from telemetry images of the urban area distribution in 1910, 1920, and 1945, released in previous research 47. Note that the telemetry images only provide information on the coverage area of cities but not the urban