Model description
CystiAgent is a spatially explicit ABM that is able to simulate endemic transmission of T. solium and test a variety of population-level interventions designed to control or eliminate T. solium. CystiAgent was developed in NetLogo 6.0.4 (Northwestern University, Evanston, IL), an open-access ABM software that was chosen for its ability to represent spatial data and display simulations through a graphical interface. A basic version of the model, complete with the model code, graphical user interface, and supplemental data can be downloaded at http://modelingcommons.org/browse/one_model/6268. The model description adheres to the ODD (Overview, Design concepts, Details) protocol for describing ABMs [26].
Purpose
The purpose of CystiAgent is to deliver a model for T. solium that is able to accurately represent key spatial and behavioral aspects of transmission. This model structure has been designed with the flexibility to be applied to a variety of endemic settings and intervention types, which will facilitate validation against data from prospective trials, a key benchmark needed to test model accuracy. The ultimate objective of CystiAgent is to have a model that can be used to compare available control and elimination strategies and provide evidence to support important policy decisions.
Entities, state variables, and scales
In CystiAgent, there are two agent classes – humans and pigs – that represent the primary and intermediate hosts of T. solium, respectively. All humans and pigs are assigned to discrete household units that are distributed across a simulation village. Currently, CystiAgent is designed to simulate transmission in one village at a time (pop. up to ~2,000), while all agents and processes are contained within the modeled village.
Each human and pig agent has an infection state, which is assigned at baseline and may change as they are exposed to infection risk throughout the simulation (Fig 1). Humans may either be susceptible (S) or infected (I) with the adult-stage intestinal tapeworm (i.e., T. solium taeniasis). Human cysticercosis, including NCC or NCC-related seizure disorders, is not included in this model as it does not contribute to transmission.
Pigs may be either susceptible (S) or infected with larval-stage metacestodes (i.e., porcine cysticercosis). Pig infection is categorized as heavy (≥ 100 cysts) (IH) or light (< 100 cysts) (IL) cyst burden, while pig exposure (E) includes the possibility of serological response to allow comparison with serological assays used in field studies. Cyst infection and serological response are assumed to be lifelong with no possibility of natural recovery or immunity, unless treatment or vaccine is applied.
Other state variables for humans and pigs are assigned at either the household or the individual level. Household-level variables include the x-y coordinates of the household, pig-raising by the household (yes/no), ownership of a pig corral (yes/no), use of the pig corral (always/sometimes/never), ownership of a latrine (yes/no), use of the latrine (always/never), the distance from the household at which open defecation occurs when not using a latrine (log-normal distribution), and whether a member of the household travels regularly outside the village (yes/no). Human variables include whether an individual is a traveler (yes/no), and the frequency and duration of their travel to other endemic villages. There is no age or sex structure assigned to humans and there is no birth, death, or turnover of the human population. Individual pig variables include the current age of pigs (weeks), the age at which they will be slaughtered (weeks), the size of the roaming area (radius in meters), and whether an individual pig is corralled at a given time (yes/no).
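The entities and state variables described above can be pictured as a small set of record types. The following is a minimal Python sketch for illustration only; CystiAgent itself is written in NetLogo, and every class and field name here is hypothetical rather than taken from the model code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class HumanState(Enum):
    SUSCEPTIBLE = "S"
    INFECTED = "I"   # carrying an adult intestinal tapeworm


class PigState(Enum):
    SUSCEPTIBLE = "S"
    LIGHT = "IL"     # fewer than 100 cysts
    HEAVY = "IH"     # 100 cysts or more


@dataclass
class Human:
    state: HumanState = HumanState.SUSCEPTIBLE
    traveler: bool = False


@dataclass
class Pig:
    age_weeks: int
    slaughter_age_weeks: int
    roaming_radius_m: float
    corralled: bool = False
    seropositive: bool = False   # exposure (E), tracked separately from cyst burden
    state: PigState = PigState.SUSCEPTIBLE


@dataclass
class Household:
    x: float
    y: float
    raises_pigs: bool
    owns_corral: bool
    corral_use: str              # "always", "sometimes" or "never"
    owns_latrine: bool
    latrine_use: str             # "always" or "never"
    defecation_distance_m: float
    has_traveler: bool = False
    humans: List[Human] = field(default_factory=list)
    pigs: List[Pig] = field(default_factory=list)
```

Keeping seropositivity as a separate flag from the cyst-burden state mirrors the text, where a pig can show a serological response regardless of whether it carries a light or heavy infection.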
Each time-step of the model represents one week of cumulative activities and exposures. The one-week time step was the shortest period that could reasonably be achieved given computational limitations of the model while still providing enough accrued time for infections and other modeled behaviors to occur.
Process overview and scheduling
Model processes can be loosely categorized into seven steps that are depicted in Figure 1.
Design concepts: Basic principles
CystiAgent consists of seven core functions that loop continuously in order to simulate natural endemic transmission:
(1) Pig trade. Infected pigs that are due for slaughter may be butchered at home, sold within the village, or exported. Potentially infected pigs from external villages may also be imported into the village.
(2) Pork consumption. Infected pigs are slaughtered by their owners and the resulting pork meat is either consumed at home or sold to other households, where it may cause human tapeworm infection.
(3) Human infection. When consumed pork is infected with T. solium cysts, all members of the consuming households are exposed to potential tapeworm infection. If humans acquire a tapeworm infection, the intestinal tapeworm reaches maturity after 8 weeks [27,28], and begins expelling infectious eggs at that time. Tapeworm infections naturally clear after pre-determined infectious durations [27,28].
(4) Travel. Humans that are designated as travelers leave the community at regular intervals, may contract tapeworm infections while traveling in other endemic areas, and return to the village after travel. Upon return, infected travelers resume contamination of their environment if applicable. Travel outside of the village is approximated in the model by subsetting travelers and applying a different probability of infection without explicitly removing them from the simulation village.
(5) Open defecation. Human tapeworm carriers that do not own or use a latrine release T. solium eggs and proglottid segments into the environment surrounding their household location. When tapeworm infections clear, humans stop releasing proglottid segments, but contamination of the environment with eggs persists until the eggs naturally degrade [29].
(6) Foraging. Pigs that are designated as free-roaming (i.e., not contained in corrals) are exposed to T. solium proglottids and eggs that are present in their home-range areas.
(7) Pig infection. Pigs that are exposed to proglottid segments may develop heavy cyst infection, while pigs exposed to eggs in the environment may develop light cyst infection. Either may result in seropositivity. Free-roaming pigs are exposed to an additional risk of infection or seropositivity that is proportional to the number of tapeworm carriers in the village and independent of the pig’s location. This represents the exposure of pigs that results from roaming outside of the home area and consuming human feces from open defecation there.
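As an illustration of the timing in step (3), the sketch below tracks a single tapeworm through weekly time steps. The 8-week maturation period comes from the text; the class name, the field names, and the infectious duration used in any example are hypothetical, and the exact week at which a worm is counted as mature is an assumption of this sketch.

```python
from dataclasses import dataclass

MATURATION_WEEKS = 8  # from the text: egg release begins once the worm has matured


@dataclass
class Tapeworm:
    infectious_weeks: int  # hypothetical pre-determined infectious duration
    age_weeks: int = 0

    def step(self):
        """Advance the infection by one weekly time step."""
        self.age_weeks += 1

    @property
    def is_infectious(self):
        # Assumption: the worm sheds eggs from week 8 until the duration elapses.
        end = MATURATION_WEEKS + self.infectious_weeks
        return MATURATION_WEEKS <= self.age_weeks < end

    @property
    def has_cleared(self):
        return self.age_weeks >= MATURATION_WEEKS + self.infectious_weeks
```

In a full loop, a carrier would contribute to environmental contamination (step 5) only while `is_infectious` is true and would return to the susceptible state once `has_cleared` becomes true.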
Design concepts: Interaction and stochasticity
Each model process above is defined mathematically by one or more corresponding parameters that were derived from data collected in Peru or from other literature sources (Table 1). Depending on the model activity they represent, most parameters correspond to the central value (e.g., mean) and spread (e.g., variance) of a chosen probability distribution. During setup and running of the model, continuous features are assigned to agents based on random number generation from the designated probability distribution, while categorical features are randomly assigned from a binomial distribution. As a result of the inherent stochasticity of each model parameter, model behavior varies considerably between individual runs, but predictable patterns emerge through repeated simulations.
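The two kinds of random assignment described above can be sketched as follows. This is a minimal Python illustration, not the NetLogo implementation: the function name and all parameter values are hypothetical, a yes/no trait is drawn as a single Bernoulli trial (a binomial draw with n = 1), and the open-defecation distance is drawn from a log-normal distribution as stated for that variable.

```python
import random


def assign_household_traits(n_households, p_latrine, dist_mu, dist_sigma, rng):
    """Draw one categorical and one continuous trait per household."""
    households = []
    for _ in range(n_households):
        households.append({
            # Categorical trait: Bernoulli trial with success probability p_latrine.
            "owns_latrine": rng.random() < p_latrine,
            # Continuous trait: log-normal draw; mu and sigma are on the log scale.
            "defecation_distance_m": rng.lognormvariate(dist_mu, dist_sigma),
        })
    return households


# Hypothetical values, chosen only to make the example run.
rng = random.Random(2024)
village = assign_household_traits(200, p_latrine=0.4, dist_mu=3.0,
                                  dist_sigma=0.5, rng=rng)
share = sum(h["owns_latrine"] for h in village) / len(village)
print(f"latrine ownership in this draw: {share:.2f}")
```

Re-running with a different seed gives a different village, which is the run-to-run variability the text refers to; averaging over many seeds recovers the underlying probabilities.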
Design concepts: Emergence and observation
The emergent outputs of CystiAgent are the prevalence of human taeniasis and the prevalence of porcine cysticercosis, which includes the prevalence of pigs with heavy and light infection burdens, and pigs that are seropositive. These outputs are recorded at each weekly time step.
Design concepts: Collectives
Since pigs and humans belong to households that share traits and a spatial environment, clustering of behaviors and emergent patterns of infection occur among pigs and humans in the same households, and among households that are in close proximity.
Design concepts: Other
The agents in CystiAgent do not have adaptive traits, or the ability to learn from or sense features of their environment. Behaviors are determined strictly by the parameters and state variables that are defined at the initialization of a model run.
Input data
A variety of sources, including primary data, literature review, and expert opinion, were utilized to determine the values and distributions for model parameters. For the majority of parameters, we used data collected in the Piura region of northern Peru. A full description of the methods and data sources used to estimate each parameter value can be found in Additional file 1. For the purposes of sensitivity analyses, we designated a “plausible range” of values for each parameter in addition to its estimated central value. This is a range of values across which the model was evaluated to determine the impact of each parameter on model outputs. In some cases, the plausible range was determined by adopting the range of mean values observed across a group of endemic villages, and in other cases we manually widened the range to account for additional uncertainty and variability in the parameter.
For six parameters that could not be determined through primary data collection or experimentation, we estimated their values using an approximate Bayesian computation (ABC) algorithm [30]. These parameters (which will be referred to as “tuning parameters”) include two that define the probabilities of tapeworm infection after slaughter of heavily (“ph2h”) and lightly (“pl2h”) infected pigs; two that define the probability of heavy and light pig infection after exposure to proglottid segments (“heavy-inf”) and eggs (“light-inf”) present in the environment; and two that determine the probability of exposure to proglottid segments (“heavy-all”) or eggs (“light-all”) during pig-roaming outside of a pig’s home-range area.
Initialization
The NetLogo spatial environment is first populated with households by assigning an x-y coordinate to each household in a village (these can be based on real or fictitious villages). Pigs and humans are then assigned to households based on the population characteristics of the village, which can be done with census data from a real village or other data sources. State variables, including infection status, are randomly assigned to humans and pigs based on the probabilities defined by corresponding model parameters. The prevalence of initial infections in humans and pigs may be set by the user, or set to the level observed in a given dataset. Once the model begins to run, however, the prevalence levels will stabilize at a natural endemic equilibrium. CystiAgent utilizes the six tuning parameters described above to adjust transmission levels in the model to a desired level in a given village. Calibration of these tuning parameters is not a required step, but would be needed for validation of the model against a specific observed dataset.
The ABC method adapted for CystiAgent tuning follows a simple “rejection sampling” approach and is based on a variety of in-depth examples found in the literature [31–33]. Briefly, random values are sampled from a uniform distribution for each of the tuning parameters, and each combination of parameter values is run in the model without varying other model parameters. The average prevalence of human taeniasis and of porcine cysticercosis is measured for each run, and the Euclidean distance between these values and the target prevalence levels is calculated. Following a rejection sampling scheme, we select the top 1% of model runs that minimize the Euclidean distance and extract posterior distributions from the selected parameter sets. We then repeat the algorithm until a final set of parameter values is produced that adequately replicates the target prevalence levels.
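One round of the rejection scheme described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' calibration code: `toy_model` is a hypothetical stand-in for a CystiAgent run, the target prevalence levels are invented, only two of the six tuning parameters are shown, and summarizing the accepted draws by their minimum and maximum is just one simple way to carry a posterior into the next round.

```python
import math
import random


def abc_rejection_round(simulate, priors, targets, n_draws, keep_fraction, rng):
    """One round of ABC rejection sampling over uniform priors."""
    scored = []
    for _ in range(n_draws):
        theta = {name: rng.uniform(low, high)
                 for name, (low, high) in priors.items()}
        summary = simulate(theta, rng)       # (human prevalence, pig prevalence)
        scored.append((math.dist(summary, targets), theta))
    scored.sort(key=lambda pair: pair[0])    # smallest Euclidean distance first
    n_keep = max(1, round(n_draws * keep_fraction))
    return [theta for _, theta in scored[:n_keep]]


def posterior_ranges(accepted):
    """Min and max of each accepted parameter, usable as the next round's priors."""
    return {name: (min(t[name] for t in accepted),
                   max(t[name] for t in accepted))
            for name in accepted[0]}


def toy_model(theta, rng):
    """Hypothetical stand-in for a model run; NOT the real CystiAgent."""
    human_prev = 0.10 * theta["ph2h"] + rng.gauss(0, 0.001)
    pig_prev = 0.40 * theta["light-inf"] + rng.gauss(0, 0.001)
    return (human_prev, pig_prev)


rng = random.Random(42)
priors = {"ph2h": (0.0, 1.0), "light-inf": (0.0, 1.0)}
targets = (0.04, 0.20)                       # hypothetical target prevalence levels
accepted = abc_rejection_round(toy_model, priors, targets, 2000, 0.01, rng)
print(len(accepted), posterior_ranges(accepted))
```

Because the two prevalence values are on different scales, an unweighted Euclidean distance lets the larger one dominate the ranking; whether the summaries should be rescaled before computing the distance is a design choice that this sketch leaves open.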
Intervention tools
CystiAgent has the ability to simulate a variety of population-level interventions designed to control or eliminate T. solium transmission. A generic function is available to administer anti-helminthic treatment (e.g., niclosamide) for human taeniasis, either presumptively or after stool screening. Other functions include the treatment of pigs to cure cystic larval infection (e.g., oxfendazole), or vaccination to prevent infection. For each intervention type, user-controlled options allow for specification of participation levels, the sensitivity of screening tests, and the efficacy of drugs and vaccines used. These interventions can then be implemented through mass or targeted approaches, while varying the duration and frequency of intervention applications. Unique to this spatial model is the ability to simulate spatially targeted interventions. “Ring strategy” [25] can be applied by targeting treatment resources to households residing within a given distance of heavily infected pigs. Finally, behavioral and structural interventions such as improved access to corrals and latrines are available as stand-alone interventions or in combination with other approaches. While available in the model, not all intervention types were applied or evaluated in the current analysis.
Baseline model function and intervention application
In order to examine the stability and functionality of CystiAgent, we set up the model with observed data from Peru and applied three unique test scenarios: endemic equilibrium (no intervention), combined ring treatment strategy, and combined mass treatment strategy. The test village we used for these simulations is an endemic village in the northern Peruvian region of Piura that recently participated in a prospective trial testing a variety of T. solium control strategies (“Ring Strategy Trial”, in peer review, contact co-author Seth O’Neal). Household coordinates, input population characteristics, and prevalence of human taeniasis and porcine cysticercosis were estimated at baseline in the parent study and were made available for use in the model by the study authors.
To apply the test scenarios to CystiAgent, we first used the ABC algorithm to calibrate the model’s tuning parameters to observed transmission levels in the village, and then ran each of the scenarios across 500 Monte Carlo simulations. The first scenario (no intervention) consisted of 300 weeks without intervention. In the second scenario (combined ring treatment), we applied seven rounds of a combined human and porcine ring treatment over a two-year period. This included screening pigs for infection using the tongue inspection method every four months, and offering treatment to all humans and pigs that resided within 100 meters of the identified pig. In the third scenario (combined mass treatment), all humans and pigs were offered treatment, which was applied every six months for a total of five rounds. Details of each intervention application, including drug efficacy and treatment coverage for humans and pigs, are listed in the figure caption.
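The spatial targeting step of the ring scenario can be sketched as a distance query over household coordinates. This is a minimal Python illustration with hypothetical names and coordinates; it assumes that the ring is centered on the household of each tongue-positive pig and that coordinates are in meters, neither of which is specified in detail above.

```python
import math


def households_in_ring(coordinates, index_households, radius_m=100.0):
    """Return IDs of households within radius_m of any index household.

    coordinates: dict mapping household ID -> (x, y) in meters.
    index_households: IDs of households owning a tongue-positive pig.
    """
    selected = set()
    for household_id, (x, y) in coordinates.items():
        for index_id in index_households:
            ix, iy = coordinates[index_id]
            if math.hypot(x - ix, y - iy) <= radius_m:
                selected.add(household_id)
                break  # one index household in range is enough
    return selected


# Hypothetical village layout.
coordinates = {"A": (0.0, 0.0), "B": (30.0, 40.0), "C": (150.0, 0.0)}
print(sorted(households_in_ring(coordinates, ["A"])))
```

Humans and pigs in the returned households would then be offered treatment, with participation and drug efficacy applied as separate random draws.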
Sensitivity analysis of CystiAgent
We performed all sensitivity analyses in R version 3.5.1, using the “RNetLogo” package [34] to execute model simulations in NetLogo from R. Sensitivity analyses included the Latin hypercube sampling partial rank correlation coefficients (LHS-PRCC) and Sobol’ variance decomposition. Only the results of the LHS-PRCC will be presented here, however, as results were similar between the two methods. A description of the Sobol’ method and results are available in Additional file 2. Each of these methods was applied in three unique villages with different population sizes and housing densities. Household coordinates for the three test villages were based on real endemic villages in northern Peru that recently participated in a large prospective trial (“Ring Strategy Trial”, in peer review, contact Seth O’Neal). For evaluation of the CystiAgent model, sensitivity analyses were applied to two model versions: the crude model, in which all parameters (k = 33) were evaluated for their impact on model outcomes, and a calibrated model, for which village input characteristics and tuning parameters were fixed so that a smaller set of biological and behavioral parameters (k = 22) could be evaluated. For the calibrated model, fixed values for village input characteristics (i.e., humans and pigs per household, pig ownership, corral and latrine access) were based on data from the census applied in each village, while tuning parameters were estimated using the ABC algorithm [30], described above, to fit the model to observed levels of human taeniasis and porcine cysticercosis in each village. Each run of the model in sensitivity analyses consisted of 1000 weeks of stable endemic transmission with no interventions applied.
The summary statistics collected at the end of each run were defined as the incidence-density of human taeniasis (number of new infections / 100 person-years), and the lifetime cumulative incidence of porcine cysticercosis (cumulative number of infected pigs / cumulative pig population).
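The two summary statistics can be written out as short functions. This is a sketch under stated assumptions rather than the authors' code: it assumes 52 weekly time steps per year and, since the human population has no births or deaths, that every human contributes person-time for the full run. The function names are hypothetical.

```python
WEEKS_PER_YEAR = 52  # assumption: one model time step is 1/52 of a year


def taeniasis_incidence_density(new_infections, n_humans, n_weeks):
    """New tapeworm infections per 100 person-years."""
    person_years = n_humans * n_weeks / WEEKS_PER_YEAR
    return 100.0 * new_infections / person_years


def pig_cumulative_incidence(cumulative_infected_pigs, cumulative_pig_population):
    """Share of all pigs that ever lived in the run that became infected."""
    return cumulative_infected_pigs / cumulative_pig_population


# Hypothetical counts from a single run.
print(taeniasis_incidence_density(new_infections=26, n_humans=100, n_weeks=52))
print(pig_cumulative_incidence(30, 120))
```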
In order to obtain the computational resources needed to run the model through many thousands of simulations for each of these analyses, we executed all model simulations on the Amazon Web Services EC2 cloud computing platform. Model simulations were distributed across a 72-core parallel processor using the “parallel” R package [35] and executed on the EC2 cloud using the RStudio Shiny server [36].
Latin hypercube sampling-partial rank correlation coefficient (LHS-PRCC)
A detailed description of the LHS-PRCC method can be found elsewhere [37]. Briefly, LHS-PRCC provides a non-parametric measure of the strength of monotonic association between each parameter and each outcome of the model (human taeniasis and porcine cysticercosis incidence). For application of LHS-PRCC, we first determined the plausible ranges for each model parameter as described above, and sampled values from each parameter distribution using a Latin hypercube sample. This procedure involves dividing each parameter range into n equal segments, and selecting a random value from each segment, as described [38]. For LHS-PRCC analyses on both the crude (k = 33 parameters) and calibrated (k = 22 parameters) models, we chose equivalent sample sizes (n) of 175,000, 50,000, and 50,000 for low, medium, and high-density villages, respectively. We then ran the model through all parameter permutations and analyzed the results to determine partial rank correlation coefficients for each parameter using the “sensitivity” and “ppcor” R packages. For this, the PRCC formula calculates the linear correlation, ρ, between the residuals of the rank-transformed parameter input and rank-transformed model output, while accounting for correlations with all other parameter inputs [37]. Importantly, the final PRCC estimates provide measures of the strength, direction, and statistical significance of the association between parameter inputs and model outputs. P-values were obtained with a Student’s t distribution and were evaluated with a Bonferroni adjustment for 33 multiple comparisons (p < 0.0015 for statistical significance).
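The two stages of LHS-PRCC can be sketched in standard-library Python as follows. This is an illustration, not the R workflow used in the analysis: all function names are hypothetical, ties in the rank transform are not averaged, and the partial correlations are read off the inverse of the rank correlation matrix, which is algebraically equivalent to correlating the regression residuals described above.

```python
import math
import random


def latin_hypercube(ranges, n, rng):
    """n samples; each range is cut into n equal segments, each used once."""
    columns = []
    for low, high in ranges:
        width = (high - low) / n
        column = [low + (i + rng.random()) * width for i in range(n)]
        rng.shuffle(column)  # decouple the segment order across parameters
        columns.append(column)
    return [list(row) for row in zip(*columns)]


def ranks(values):
    """Rank transform (1 = smallest); ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        out[index] = float(rank)
    return out


def pearson(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    size = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row in range(size):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [row[size:] for row in aug]


def prcc(samples, outputs):
    """Partial rank correlation of each parameter with the model output."""
    columns = [ranks(list(col)) for col in zip(*samples)] + [ranks(outputs)]
    corr = [[pearson(a, b) for b in columns] for a in columns]
    prec = invert(corr)
    y = len(columns) - 1  # index of the output variable
    return [-prec[j][y] / math.sqrt(prec[j][j] * prec[y][y]) for j in range(y)]


# Toy response standing in for model output: rises with the first parameter,
# falls with the second, and ignores the third.
rng = random.Random(0)
design = latin_hypercube([(0.0, 1.0)] * 3, 200, rng)
outputs = [3 * a - 3 * b + 0.1 * rng.random() for a, b, _ in design]
print([round(c, 2) for c in prcc(design, outputs)])
```

The sign of each coefficient gives the direction of the monotonic association and its magnitude gives the strength after the other parameters are accounted for; the significance testing described above is not included in this sketch.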