Demand-Driven Spreading Patterns of African Swine Fever in China

African Swine Fever (ASF) is a highly contagious hemorrhagic viral disease of domestic and wild pigs. ASF has led to huge economic losses and social impact worldwide. The biological mechanism of ASF infection is still not fully understood, and the lack of preventative options at the individual level further complicates this major global health challenge. In this paper, we propose a novel method to model the spread of ASF in China by integrating data on pork import/export, transportation networks, and pork distribution centers. We first empirically analyze the overall patterns of ASF spread and perform extensive experiments to evaluate the efficacy of a number of distance measures. These empirical analyses show that the arrival of ASF is not purely based on the geographic distance from existing infected regions. Pork supply-demand patterns have clearly influenced the spread of ASF, which cannot be well explained by conventional geographical distance or by the recent effective distance methods. Predictions based on the new distance measure achieve better performance in predicting the spread of the disease among Chinese provinces and thus have the potential to enable more proactive and accurate deployment of interventions.


Introduction
African Swine Fever (ASF) is a highly contagious and usually deadly hemorrhagic viral disease of domestic and wild pigs. It is mainly transmitted through close contact with or ingestion of contaminated material, through contaminated fomites, and through biological vectors (1,2). Although ASF has not been found to affect human health, it has had a substantial social and economic impact on the swine trade, pig by-products, and food security worldwide, especially in countries like China, where pork is a major source of dietary protein (3,4). Since it was first reported in Kenya in 1921 (5), ASF has spread across countries in Africa, Europe, and Asia (6,7). To date, the biological mechanism of ASF infection is still not fully understood, and the lack of preventative options at the individual level further complicates this major global health challenge (8,9).
Since the first ASF outbreak in China was reported on August 1, 2018, in Shenyang, Liaoning Province (7), a total of 157 cases were reported in China, with approximately 1.2 million pigs culled, before the novel coronavirus (COVID-19) pandemic hit the world in 2020 (10). As a result, the pork price surged by 48% in just one year (between October 2018 and October 2019). Because China has over half of the world's pig population, the ASF outbreak in China represents a serious threat to animal health, pig production, and the quality of life of people in China and neighboring countries (3,11,12).
Several studies have examined the ASF epidemics from various perspectives, including transboundary location predictions (13), molecular characterization (14), major capsid protein determination (15), pig movement simulation (16,17), and analysis of carriers of ASF viruses (18). However, to date, little is known about the underlying transmission mechanism in China, which is the key to designing effective containment and mitigation strategies. This task is particularly challenging for ASF and other livestock infectious diseases due to the complexity of human and livestock mobility patterns (19-24).
Conventional approaches to modeling the spread of epidemics mainly rely on spatiotemporal patterns based on geographic distance (25,26). Recent studies proposed replacing geographic distance with traffic-mediated effective distance measures, which take into account the transportation infrastructure that enables human mobility (19). Such effective distance was demonstrated to provide better predictions of disease arrival times for a number of human infectious diseases (19,27). However, it cannot be immediately applied to the modeling of livestock infectious diseases such as ASF, because the transportation pattern of livestock differs from human mobility in a number of ways (24-26). First, human movement between two regions is in general symmetric (e.g., people arrive and then leave) (19,25,26), but livestock movement is not (e.g., pork is imported into a region and consumed there). Second, livestock has unique trade and logistical patterns: animals are centrally processed and distributed at slaughterhouses (28,29). Third, it is usually hard to estimate the geographical distribution of livestock demand. Another problem is the difficulty of estimating the actual ASF outbreaks in China, given the potential for late, missed, and under-reported cases among pig farmers and meat producers. The actual spread patterns could differ from what the officially reported data indicate. This further highlights the need for a reliable, quantitative, data-driven method to characterize the underlying transmission mechanism and to predict the possible spread paths.
Furthermore, the spread of ASF in China depends on the huge demand for pork and on the country's distinctive farming model and logistics (30). First, pig farming in China is dominated by small-scale farms: 97% of farms have fewer than 100 pigs, while fewer than 0.01% of farms have over 10,000 pigs. Second, there is a long tradition of feeding pigs swill, which worsens the cross-infection of ASF. Third, people in China (and some neighboring countries) usually demand fresh pork, which results in the need to move live animals for local slaughter rather than to transport chilled meat. Fourth, demand and logistics patterns are diverse: certain provinces (such as Henan) export pork to others, while certain provinces (such as Jiangsu) mainly import pork. As such, cross-regional pork transport over the well-established ground transportation network is prominent and often spans thousands of kilometers. Note that most provinces have a population of over ten million, and there is little inspection of such cross-province pork transport. The main means of ASF transmission include cross-regional transport of live pigs and their products (14.4%), swill feeding (44.1%), and viruses carried by humans or vehicles (41.5%) (30).
To address these challenges, we propose a novel method to model the spread of ASF in China. By integrating data on reported ASF cases, pork production and consumption, the locations of slaughterhouses, and the detailed ground transportation network, our new model adjusts the effective distance between regions based on the estimated demand for and transport of pork. This paper first empirically analyzes the overall patterns of ASF spread in China and performs extensive experiments to evaluate the efficacy of a number of distance measures. We then evaluate the performance of the proposed demand- and transport-adjusted distance in predicting the disease arrival sequence among Chinese provinces.

Empirical Analysis
On August 3, 2018, the Chinese Ministry of Agriculture and Rural Affairs (MARA) announced that the first African swine fever outbreak had occurred in Shenyang, Liaoning Province, China. The disease then spread rapidly to all 31 provinces in mainland China within 260 days, across a territory that spans more than 3,000 kilometers from north to south and from east to west. As of October 3, 2019, a total of 149 ASF outbreaks had been reported. The total number of infected pigs, the number of morbidity cases, and the number of deaths on the outbreak farms approximately followed log-normal distributions, with expected mean values of 2,367 (1,360, 4,823), 131 (86, 298), and 89 (53, 186), respectively. The average incidence was 25.9% (21.93%, 30.77%), and the average mortality was 70.0% (65.31%, 75.09%).
Figure 1 sorts all 31 provinces in mainland China by the chronological sequence of ASF outbreaks and compares the pork supply and demand rates of each province. At the macro level, ASF first broke out in two pork-exporting provinces, then spread to pork-importing provinces, and finally invaded the other provinces. The ASF epidemic can be divided into seven waves, named wave 1 through wave 7 in chronological order; their spatial distribution is shown in Figure 2. Waves 2, 3, and 4 are the key to understanding the spread of the ASF epidemic: these three waves led to the invasion of 23 provinces (74.2%). From wave 1 to wave 2, the diffusion was mainly in the north-south direction, and the sources may have been Liaoning and Henan provinces; from wave 2 to wave 3, the diffusion was mainly in the east-west direction, and the source may have been Anhui Province; from wave 3 to wave 4, the diffusion was also in the east-west direction, and the sources may have been Shanxi and Neimeng (Inner Mongolia). The direction of these three waves of epidemic spread is in line with China's "five vertical and seven horizontal" expressway network.
Among the provinces that have been involved in the inter-province pork trade, only three (Neimeng, Jilin, and Fujian) do not follow the trend of such wave-like advancement. Among them, the first two provinces border the Russian Far East, which is also experiencing ASF epidemics, and the third one is a coastal province with heavy international trade.
In addition, the 14 provinces that exported pork had more and longer outbreaks (an average of 6 ...).
To further evaluate the relationship between geographic distance and the risk of infection, we sort provinces by the temporal order of their outbreaks: only 80% of newly infected provinces are geographically adjacent to already infected provinces.
These empirical analyses show that the arrival of ASF is not purely based on the geographic distance from existing infected regions. Pork supply-demand patterns have clearly influenced the spread of ASF, which cannot be well explained by conventional geographical distance or by the recent effective distance methods.
Figure 3 presents the predicted ASF spreading process based on which province is predicted to be infected next. Arrows represent the predicted infection paths between provinces, and the arrow colors represent the order of the actual ASF outbreaks. If the province predicted to be infected next is within the top 3 in the actual order of outbreak time, the arrow is navy blue. Similarly, top 4-6 is colored light blue, 7-10 is colored green, and 10+ is colored red. Generally, more blue arrows on the map indicate a more accurate model; the more green and red arrows, the less accurate the model is.
Table 1 presents the accumulated ranks of the ASF spreading predictions from the three models: natural distance spreading (NDS), transportation distance spreading (TDS), and demand-adjusted transport distance spreading (DaTDS); see Materials and Methods for details. We also present the results of 1000 random guesses, in which each prediction is drawn from a uniform distribution. All methods are statistically better than random guesses (mean=221, standard deviation=27.12; the 25% and 75% quantiles are 203 and 241, respectively). DaTDS outperforms TDS, while the classic NDS has the lowest performance. We used Student's t-test to assess the difference in performance between methods. DaTDS significantly outperforms TDS (t-score=-4.35, p-value<0.001) and NDS (t-score=-4.49, p-value<0.001).
These results demonstrate that (1) travel distance is a more appropriate measure than geographic distance for modeling the spread of ASF; and (2) the actual demand, estimated from the distribution of slaughterhouses and the import/export rates, further helps predict where ASF outbreaks will occur next.

Demand-Adjusted Transport Distance
In addition, comparing NDS and TDS, we find that the majority of their predictions are the same. TDS outperforms NDS mainly in the mountainous Southwestern area, where road-network distance can differ greatly from geographical distance due to terrain constraints. More specifically, DaTDS predicts that the average sum arrival order of provinces in the Southwestern area is 7.3, which is smaller than the 9.5 predicted by TDS. Furthermore, compared with TDS, DaTDS successfully captures some irregular spreading patterns, in which two temporally consecutive infected provinces are geographically separated by a susceptible province. For instance, the long-distance transmission from Sichuan to Shanghai was captured by both the TDS and DaTDS methods. The skip from Hubei to Fujian was not captured by any method, but DaTDS's prediction has the smallest error.
The improvement in prediction performance achieved by the proposed DaTDS also has important implications for interventions. On average, if authorities use the conventional NDS or TDS methods, five provinces must be set as priority targets for interventions in order to fully cover the next possible infected province. By contrast, the DaTDS method reduces this number to four. Even though the difference is only one province, such a reduction can lead to significant savings in disease containment and mitigation expenditure.

Discussion
In summary, the spread of ASF in China exhibits irregular patterns that cannot be clearly explained by conventional geographical distance or by the recently proposed transportation distance methods. Therefore, we propose a novel distance measure that integrates data on pork import/export, transportation networks, and pork distribution centers. Predictions based on the new distance measure achieve better performance in predicting the spread of the disease among Chinese provinces. By improving the prediction of the next infected province, the proposed method can help to better identify the future spread paths of ASF and thus enable more proactive and accurate deployment of interventions (e.g., disease prevention and control). The present study sheds light on the importance of transportation networks and demand information in modeling the spread of livestock infectious diseases.
A limitation is that data quality may have limited the performance gain. Some predicted paths that disagree with the reported sequence may in fact be the actual spread paths, because there are late, missed, and under-reported cases in certain provinces. New data are needed to further verify the efficacy of the proposed methods. With additional data of higher granularity, the performance may improve, and the method may help authorities identify potentially late, missed, or under-reported cases.

Materials and Methods
Data on ASF outbreaks in China (including report time and address) were collected from the official daily reports of MARA on Oct 3, 2019.
To estimate the movement of livestock, we extracted the major road network (as of Oct 3, 2019) from OpenStreetMap. We calculated the distance between two provinces as the shortest travel distance between their capital cities on the road network using the classic Dijkstra algorithm with a 2-kilometer resolution.
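The shortest-travel-distance step can be illustrated with a minimal Dijkstra sketch. The graph layout (an adjacency dictionary keyed by node name), the function name `shortest_travel_distance`, and the toy distances below are assumptions made for illustration only; they are not the road network or the implementation used in this study.

```python
import heapq

def shortest_travel_distance(graph, source, target):
    """Dijkstra's algorithm on a weighted road graph.

    graph maps each node to a list of (neighbor, distance_km) pairs.
    Returns the shortest travel distance in km, or inf if unreachable.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, km in graph.get(node, []):
            candidate = dist + km
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")

# Hypothetical toy network: three "capital cities" with made-up distances.
toy_roads = {
    "A": [("B", 100.0), ("C", 400.0)],
    "B": [("A", 100.0), ("C", 150.0)],
    "C": [("A", 400.0), ("B", 150.0)],
}

print(shortest_travel_distance(toy_roads, "A", "C"))  # 250.0 (via B)
```

In this toy case the direct A-C road (400 km) is longer than the detour through B (250 km), which is the kind of difference between road distance and straight-line distance that the TDS model is meant to capture.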
To estimate the actual demand for and inter-province transportation of pork, we obtained slaughterhouse data from the open API of AMap, the largest Chinese provider of web mapping, navigation, and location-based services. We assume that slaughterhouses close to major roads serve as distribution centers for local communities and also for intra-provincial livestock transportation; we therefore only consider slaughterhouses within 1 kilometer of major roads in the road network.
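One simple way to apply the 1-kilometer filter is sketched below. It approximates each road by a list of sampled (latitude, longitude) points and keeps a slaughterhouse if any sampled point lies within the threshold, using the haversine great-circle distance. The point-sampling approximation, the function names, and the coordinates are assumptions for illustration; the text does not specify how the proximity test was carried out.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def near_major_road(site, road_points, max_km=1.0):
    """True if the site is within max_km of any sampled road point.

    Assumption: roads are represented by sampled points, so this is only
    as accurate as the sampling density of the road geometry.
    """
    lat, lon = site
    return any(haversine_km(lat, lon, rlat, rlon) <= max_km
               for rlat, rlon in road_points)

# Hypothetical coordinates, not real slaughterhouse or road locations.
road = [(0.0, 0.0), (0.0, 0.01)]
sites = {"near": (0.0, 0.005), "far": (0.0, 0.05)}
kept = [name for name, pos in sites.items() if near_major_road(pos, road)]
print(kept)  # ['near']
```

A production version would measure the distance to the road segments themselves rather than to sampled vertices.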
Population data and data on pork production and consumption were collected from the official 2018 Yearbook (a collection of census and economic data) of each province and from the 2018 Chinese National Yearbook. Such data allow us to calculate the rate of pork import/export and the per capita consumption of pork in each province.
We propose three spread mechanisms of ASF in China. The first mechanism, natural distance spreading (NDS), is the classic and commonly adopted model. It hypothesizes that the arrival of ASF in a susceptible region is dependent on the geographical distance from already infected regions.
The second mechanism, transportation distance spreading (TDS), is based on the hypothesis that the arrival of ASF in a susceptible region is associated with the actual transportation distance from infected regions. This model is inspired by the fact that pigs are usually transported along major roads in the road network.
The third mechanism, demand-adjusted transport distance spreading (DaTDS), incorporates the demand for pork in each province and the geographical distribution of slaughterhouses along the road network. The addition of pork demand, in the form of import and export data, is driven by the assumption that provinces that export pork are more likely to spread ASF to provinces that import pork. The slaughterhouses serve as logistical centers, where newly arrived pigs are processed and pork is distributed to other locations in the same region of a city or county. Specifically, we take the logarithm of the number of slaughterhouses within 1 kilometer of the major roads connecting two regions and then divide it by the travel distance between them.
Note that a province may not remain "infected" at all times. It may recover to "susceptible" status because of interventions (such as isolation) by local or national authorities. Therefore, we focus on the latest province that became infected. The time interval between two consecutive outbreaks is also considered: if the time interval between two consecutive provinces was too short, it was unlikely that one infected the other. Instead, they were more likely to have been infected from the previously infected province.
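The slaughterhouse adjustment described for DaTDS can be written as a small helper. This is only a sketch of the stated ratio (logarithm of the slaughterhouse count divided by travel distance). The function name, the use of the natural logarithm, and the reading that a larger value indicates a more strongly connected pair of regions are assumptions; the text does not specify the logarithm base or how the ratio is combined with import/export rates to form the final distance.

```python
import math

def slaughterhouse_adjusted_score(n_slaughterhouses, travel_km):
    """log(number of slaughterhouses near connecting roads) / travel distance.

    Assumptions: natural logarithm; at least one slaughterhouse; positive
    travel distance. How this ratio enters the final DaTDS distance is not
    specified in the text and is left out of this sketch.
    """
    if n_slaughterhouses < 1:
        raise ValueError("need at least one slaughterhouse for the logarithm")
    if travel_km <= 0:
        raise ValueError("travel distance must be positive")
    return math.log(n_slaughterhouses) / travel_km

# Hypothetical numbers: the same slaughterhouse count at two travel distances.
close_pair = slaughterhouse_adjusted_score(100, 200.0)
distant_pair = slaughterhouse_adjusted_score(100, 500.0)
print(close_pair > distant_pair)  # True
```

The logarithm dampens the effect of very large slaughterhouse counts, so a region with ten times as many slaughterhouses does not receive ten times the weight.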
To evaluate these models, we consider both the prediction of which province will be infected next and the timing of the actual infection of this province. The models were evaluated by calculating the rank, within the actual sequence of arrivals (i.e., infections from real data), of the province that each model predicted to be the next infected one. Specifically, in each step, after a new province is infected, we use the three models NDS, TDS, and DaTDS to calculate the distances between the latest infected province and all other susceptible provinces. Then we predict that the susceptible province with the shortest distance will be the next province to be infected. We also rank all susceptible provinces based on their actual arrival (i.e., infection) time. The earlier the predicted province actually became infected, the higher (i.e., numerically smaller) its rank. Finally, we accumulate the ranks of the provinces predicted to be the next ones. For example, in step 1, a model predicts that province A will be the next infected one, while A is actually the 5th to be infected among all the susceptible provinces, leading to a rank of 5. Then in step 2, the model predicts that province B will be infected next, and the actual arrival order of B is 3. This leads to an accumulated rank of 5+3=8 for this model. Therefore, the smaller the accumulated rank, the better the model performs in prediction.
Figure 1. The sequence of African swine fever invading China's 31 provinces and the supply and demand of pork in each province, ordered from left to right by the chronological order of each province's ASF outbreaks. The red dotted line reflects the wave-like advance pattern of the African swine fever invasion.
Figure 3. If the predicted to-be-infected province is actually within the top 3 in the actual order of arrival time, the arrow is in navy blue. Similarly, top 4-6 is colored in light blue; 7-10 is colored in green; 10+ is colored in red.
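As a rough illustration of the accumulated-rank evaluation described in Materials and Methods, the sketch below walks through the actual infection sequence, asks a model for its next prediction at each step, and sums the actual arrival ranks of the predicted provinces. The function names, the toy provinces, and the toy distances are hypothetical, and the sketch omits the time-interval rule for closely spaced outbreaks.

```python
def accumulated_rank(actual_order, predict_next):
    """Sum, over all steps, the actual arrival rank of the predicted province.

    actual_order: provinces listed in their actual order of infection.
    predict_next(latest, susceptible): returns the province a model predicts
    to be infected next, given the latest infected province.
    """
    total = 0
    for step in range(len(actual_order) - 1):
        latest = actual_order[step]
        # Remaining provinces, already sorted by actual arrival time.
        susceptible = actual_order[step + 1:]
        predicted = predict_next(latest, susceptible)
        total += susceptible.index(predicted) + 1  # rank 1 = infected next
    return total

# Hypothetical toy data: four provinces and symmetric pairwise distances.
toy_order = ["A", "B", "C", "D"]
toy_dist = {
    frozenset(("A", "B")): 1, frozenset(("A", "C")): 2, frozenset(("A", "D")): 3,
    frozenset(("B", "C")): 5, frozenset(("B", "D")): 1, frozenset(("C", "D")): 1,
}

def nearest_first(latest, susceptible):
    """Predict the susceptible province with the shortest distance."""
    return min(susceptible, key=lambda p: toy_dist[frozenset((latest, p))])

print(accumulated_rank(toy_order, nearest_first))  # 4
```

In the toy run, the model is correct in steps 1 and 3 (rank 1 each) but in step 2 predicts D, which is actually second among the remaining provinces (rank 2), giving 1+2+1=4. A perfect predictor would score 3, the number of steps.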