Model description
Model overview
CystiAgent is a spatially explicit ABM that is able to simulate endemic transmission of T. solium and test a variety of population-level interventions designed to control or eliminate T. solium. CystiAgent was developed in NetLogo 6.0.4 (Northwestern University, Evanston, IL), an open-access ABM software that was chosen for its ability to represent spatial data and display simulations through a graphical interface.
In CystiAgent, there are two agent classes – humans and pigs – that represent the primary and intermediate hosts of T. solium, respectively. All humans and pigs are assigned to discrete household units that are distributed across the simulation village, and whose locations are given by a set of input coordinates that can represent real or fictitious villages. Currently, CystiAgent is designed to simulate transmission in one village at a time (pop. up to ~2,000), and can be applied to any population with corresponding input coordinates. Each time-step of the model represents one week of cumulative activities and exposures.
Model outcomes
Human and pig agents are randomly assigned an infection state at baseline. Susceptible humans (S) may be infected (I) with the adult-stage intestinal tapeworm (i.e., T. solium taeniasis), and susceptible pigs (S) may be infected with larval-stage metacestodes (i.e., porcine cysticercosis) (Fig 1). Pig infection is categorized as heavy (≥ 100 cysts) (IH) or light (< 100 cysts) (IL) cyst burden, while pig exposure (E) includes the possibility of serological response to allow comparison with serological assays used in field studies. Human cysticercosis, including NCC or NCC-related seizure disorders, is not included in this model as it does not contribute to transmission.
Model flow
Model processes can be loosely categorized into seven steps that loop continuously in order to simulate natural endemic transmission (see Fig 1):
(1) Pig trade. Infected pigs that are due for slaughter may be butchered at home, sold within the village, or exported. Potentially infected pigs from external villages may also be imported into the village.
(2) Pork consumption. Infected pigs are slaughtered by their owners and the resulting pork meat is either consumed at home or sold to other households, where it may cause human tapeworm infection.
(3) Human infection. When consumed pork is infected with T. solium cysts, all members of the consuming households are exposed to potential tapeworm infection. If humans acquire a tapeworm infection, the intestinal tapeworm reaches maturity after 8 weeks [31,32], and begins expelling infectious eggs at that time. Tapeworm infections naturally clear after pre-determined infectious durations [31,32].
(4) Travel. Humans that are designated as travelers leave the community at regular intervals, may contract tapeworm infections while traveling in other endemic areas, and return to the village after travel. Upon return, infected travelers resume contamination of their environment if applicable.
(5) Open defecation. Human tapeworm carriers that do not own or use a latrine release T. solium eggs and proglottid segments into the environment surrounding their household location. When tapeworm infections clear, humans stop releasing proglottid segments, but contamination of the environment with eggs persists until the eggs naturally degrade [33].
(6) Foraging. Pigs that are designated as free-roaming (i.e., not contained in corrals) are exposed to T. solium proglottids and eggs that are present in their home-range areas.
(7) Pig infection. Pigs that are exposed to proglottid segments may develop heavy cyst infection, while pigs exposed to eggs in the environment may develop light cyst infection. Either may result in seropositivity. Free-roaming pigs are exposed to an additional risk of infection or seropositivity that is proportional to the number of tapeworm carriers in the village and naïve to the pig’s location. This represents exposure to pigs that results from roaming and consumption of human feces from open defecation that occur outside of the home area.
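As an illustration of the timing described in step (3), the following Python sketch tracks one human tapeworm infection in weekly time-steps. It is not the NetLogo implementation. The class and attribute names are hypothetical, and the sketch assumes that the infectious duration is counted from the week the tapeworm reaches maturity, which the text does not specify.

```python
from dataclasses import dataclass

MATURATION_WEEKS = 8  # from the text: the tapeworm matures 8 weeks after infection


@dataclass
class Human:
    """Hypothetical human agent holding only tapeworm-related state."""
    infected: bool = False
    weeks_since_infection: int = 0
    infectious_duration: int = 0  # assumed: weeks of egg shedding once mature

    def infect(self, infectious_duration: int) -> None:
        """Start a new tapeworm infection with a pre-determined duration."""
        self.infected = True
        self.weeks_since_infection = 0
        self.infectious_duration = infectious_duration

    def is_shedding(self) -> bool:
        """True once the tapeworm is mature and has not yet cleared."""
        return self.infected and self.weeks_since_infection >= MATURATION_WEEKS

    def step(self) -> None:
        """Advance one weekly time-step and clear the infection when due."""
        if not self.infected:
            return
        self.weeks_since_infection += 1
        if self.weeks_since_infection >= MATURATION_WEEKS + self.infectious_duration:
            self.infected = False
            self.weeks_since_infection = 0
```

In a full model, each human's duration would be drawn from the corresponding parameter distribution at the time of infection rather than passed in by hand.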
Parameters
Each model process above is defined mathematically by one or more corresponding parameters that were derived from data collected in Peru or from other literature sources (Table 1). Depending on the model activity they represent, most parameters correspond to the central value (e.g., mean) and spread (e.g., variance) of a chosen probability distribution. During setup and running of the model, continuous features are assigned to participants based on random number generation from the designated probability distribution, while categorical features are randomly assigned from a binomial distribution. A variety of sources, including primary data, literature review, and expert opinion, were utilized to determine the values and distributions for model parameters. For the majority of parameters, we used data collected in the Piura region of northern Peru. A full description of the methods and data sources used to estimate each parameter value can be found in Additional file 1. For the purposes of sensitivity analyses, we designated a “plausible range” of values for each parameter in addition to its estimated central value. This is the range of values across which the model was evaluated to determine the impact of each parameter on model outputs. In some cases, the plausible range was determined by adopting the range of mean values observed across a group of endemic villages, and in other cases we manually widened the range to account for additional uncertainty and variability in the parameter.
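The assignment of agent features from parameter distributions can be sketched as follows. This is a minimal Python illustration, not the NetLogo code. The function names are hypothetical, and the choice of a normal distribution truncated at zero for the continuous feature is an assumption made only for the example, since the distribution used for each parameter is given in Table 1 and Additional file 1.

```python
import random


def assign_continuous(mean: float, sd: float, rng: random.Random) -> float:
    """Draw a continuous feature; a normal distribution truncated at 0 is assumed."""
    return max(0.0, rng.gauss(mean, sd))


def assign_categorical(probability: float, rng: random.Random) -> bool:
    """Draw a yes/no feature as a single binomial (Bernoulli) trial."""
    return rng.random() < probability
```

For example, `assign_categorical(0.3, rng)` would mark roughly 30% of households as having a given characteristic when applied across a large village.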
Tuning parameters
In addition to the above suite of biological, behavioral, and environmental parameters, CystiAgent utilizes a set of tuning parameters to adjust the model to different local conditions and endemic prevalence levels. When the model is applied to specific observed prevalence levels for validation, this set of tuning parameters must be calibrated independently for each unique village using an approximate Bayesian computation (ABC) algorithm [34]. For the purposes of this sensitivity analysis, we intentionally set wide plausible ranges for tuning parameters in order to represent a broad range of possible transmission levels and measure their impact on the model.
There are six tuning parameters that represent different probabilities of exposure or infection in the model. Two tuning parameters define the probabilities of tapeworm infection after slaughter of heavily (“ph2h”) and lightly (“pl2h”) infected pigs; two other tuning parameters define the probability of heavy and light pig infection after exposure to proglottid segments (“heavy-inf”) and eggs (“light-inf”) present in the environment; and the remaining two parameters determine the probability of exposure to proglottid segments (“heavy-all”) or eggs (“light-all”) during pig-roaming outside of a pig’s home-range area.
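To make the role of the “heavy-inf” and “light-inf” tuning parameters concrete, the following Python sketch resolves the infection state of one exposed pig. It is illustrative only: the function name is hypothetical, and the rule that a heavy infection takes precedence when a pig is exposed to both proglottids and eggs is an assumption not stated in the text.

```python
import random


def pig_infection_state(exposed_proglottids: bool, exposed_eggs: bool,
                        heavy_inf: float, light_inf: float,
                        rng: random.Random) -> str:
    """Return "IH", "IL", or "S" for a susceptible pig after one week of exposure."""
    # Proglottid exposure may lead to heavy infection (assumed to be checked first).
    if exposed_proglottids and rng.random() < heavy_inf:
        return "IH"
    # Egg exposure may lead to light infection.
    if exposed_eggs and rng.random() < light_inf:
        return "IL"
    return "S"
```

The “heavy-all” and “light-all” parameters would enter one step earlier, as the probabilities that a free-roaming pig is exposed outside its home range in the first place.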
Interventions
CystiAgent has the ability to simulate a variety of population-level interventions designed to control or eliminate T. solium transmission. A generic function is available to administer anti-helminthic treatment for human taeniasis, either presumptively or after stool screening. Other functions include the treatment of pigs to cure cystic larval infection, or vaccination to prevent infection. For each intervention type, user-controlled options allow for specification of participation levels, the sensitivity of screening tests, and the efficacy of drugs and vaccines used. These interventions can then be implemented through mass or targeted approaches, while varying the duration and frequency of intervention applications. Unique to this spatial model is the ability to simulate spatially targeted interventions. “Ring strategy” [28] can be applied by targeting treatment resources to households residing within a given distance of heavily infected pigs. Finally, behavioral and structural interventions such as improved access to corrals and latrines are available as stand-alone interventions or in combination with other approaches. While available in the model, interventions were not applied or evaluated in the current analysis.
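The spatial targeting behind the ring strategy can be illustrated with a short Python sketch that selects all households within a given distance of any household where a heavily infected pig was detected. The function name, the data layout, and the use of straight-line distance between household coordinates are assumptions made for the example and do not describe the NetLogo implementation.

```python
from math import dist


def ring_targets(households, index_ids, radius):
    """Return ids of households within `radius` of any index household.

    households: dict mapping household id -> (x, y) coordinates
    index_ids:  ids of households with a heavily infected pig (hypothetical input)
    radius:     ring radius, in the same units as the coordinates
    """
    index_points = [households[i] for i in index_ids]
    return {
        household_id
        for household_id, xy in households.items()
        if any(dist(xy, point) <= radius for point in index_points)
    }
```

With households at (0, 0), (30, 40), and (200, 0), a 100-unit ring around the first household includes the second (50 units away) but not the third.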
Sensitivity analysis of CystiAgent
We performed all sensitivity analyses in R version 3.5.1, using the “RNetLogo” package [35] to execute model simulations in NetLogo from R. Sensitivity analyses included the Latin hypercube sampling partial rank correlation coefficients (LHS-PRCC) and Sobol’ variance decomposition. Only the results of the LHS-PRCC are presented here, however, as results were similar between the two methods. A description of the Sobol’ method and its results is available in Additional file 2. Each of these methods was applied in three unique villages with different population sizes and housing densities. Household coordinates for the three test villages were based on real endemic villages in northern Peru that recently participated in a large prospective trial (“Ring Strategy Trial”, in peer review) [36]. For evaluation of the CystiAgent model, sensitivity analyses were applied to two model versions: the full model that contained all model parameters (k = 33 parameters), and a reduced model for which village input characteristics and tuning parameters were fixed (k = 22 parameters), allowing for a more in-depth evaluation of key biological and behavioral parameters. For the reduced model, fixed values for village input characteristics (i.e., humans and pigs per household, pig ownership, corral and latrine access) were based on data from the census applied in each village, while tuning parameters were estimated using an ABC algorithm [34] to fit the model to observed levels of transmission in each village (i.e., baseline prevalences of human taeniasis and porcine cysticercosis in the parent study). Each run of the model in sensitivity analyses consisted of 1000 weeks of stable endemic transmission with no interventions applied.
The summary statistics collected at the end of each run were defined as the incidence-density of human taeniasis (number of new infections / 100 person-years), and the lifetime cumulative incidence of porcine cysticercosis (cumulative number of infected pigs / cumulative pig population).
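The two summary statistics can be written out as a short Python sketch. The function names are hypothetical, and the example assumes that person-time is accumulated in weekly time-steps and converted to years using 52 weeks per year. How person-time at risk is counted in the model itself is not specified in the text.

```python
WEEKS_PER_YEAR = 52  # assumed conversion from weekly time-steps to years


def taeniasis_incidence_density(new_infections: int, person_weeks: float) -> float:
    """New human tapeworm infections per 100 person-years."""
    person_years = person_weeks / WEEKS_PER_YEAR
    return 100 * new_infections / person_years


def pig_cumulative_incidence(cumulative_infected: int, cumulative_pigs: int) -> float:
    """Share of all pigs that ever lived in the run that became infected."""
    return cumulative_infected / cumulative_pigs
```

For instance, 250 new infections among 500 people followed for 1000 weeks (500,000 person-weeks) corresponds to 2.6 infections per 100 person-years.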
To obtain the computational resources needed to run the model through many thousands of simulations for each of these analyses, we executed all model simulations on the Amazon Web Services EC2 cloud computing platform. Model simulations were distributed across a 72-core parallel processor using the “parallel” R package [37] and executed on the EC2 cloud using the RStudio Shiny Server [38].
Latin hypercube sampling-partial rank correlation coefficient (LHS-PRCC)
A detailed description of the LHS-PRCC method can be found elsewhere [39]. Briefly, LHS-PRCC provides a non-parametric measure of the strength of monotonic association between each parameter and each outcome of the model (human taeniasis and porcine cysticercosis incidence). For application of LHS-PRCC, we first determined the plausible ranges for each model parameter as described above, and sampled values from each parameter distribution using a Latin hypercube sample. This procedure involves dividing each parameter range into n equal segments, and selecting a random value from each segment, as described [40]. For LHS-PRCC analyses on both the full (k = 33 parameters) and reduced (k = 22 parameters) models, we chose equivalent sample sizes (n) of 175,000, 50,000, and 50,000 for low, medium, and high-density villages, respectively. We then ran the model through all parameter permutations and analyzed the results to determine partial-rank correlation coefficients for each parameter using the “sensitivity” and “ppcor” R packages. For this, the PRCC formula calculates the linear correlation, ρ, between the residuals of the rank-transformed parameter input and rank-transformed model output, while accounting for correlations with all other parameter inputs [39]. Importantly, the final PRCC estimates provide measures of the strength, direction, and statistical significance of the association between parameter inputs and model outputs. P-values were obtained with a Student’s t distribution and were evaluated with a Bonferroni adjustment for 33 multiple comparisons (p < 0.0015 for statistical significance).
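The LHS-PRCC procedure can be illustrated with a small self-contained Python sketch. It is not the analysis code, which used the “sensitivity” and “ppcor” R packages. All function names are hypothetical, parameters are assumed to be sampled uniformly within their plausible ranges, and a toy linear function stands in for a run of the simulation model. The sketch draws one value from each of n equal segments per parameter, then correlates the residuals of the rank-transformed parameter and output after regressing both on the ranks of the other parameters.

```python
import random
from math import sqrt

BONFERRONI_ALPHA = 0.05 / 33  # about 0.0015, the threshold quoted in the text


def latin_hypercube(ranges, n, rng):
    """Draw n parameter sets; `ranges` maps parameter name -> (low, high)."""
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n
        # One uniform draw inside each of the n equal segments.
        values = [low + (i + rng.random()) * width for i in range(n)]
        rng.shuffle(values)  # pair segments at random across parameters
        columns[name] = values
    return [{name: columns[name][i] for name in ranges} for i in range(n)]


def ranks(values):
    """Ranks from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            out[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def centered(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def project_out(v, basis):
    """Remove from v its components along mutually orthogonal basis vectors."""
    for b in basis:
        scale = dot(v, b) / dot(b, b)
        v = [vi - scale * bi for vi, bi in zip(v, b)]
    return v


def residuals(target, controls):
    """Residuals of a least-squares fit (with intercept) of target on controls."""
    basis = []
    for column in controls:  # modified Gram-Schmidt orthogonalisation
        c = project_out(centered(column), basis)
        if dot(c, c) > 1e-9:
            basis.append(c)
    return project_out(centered(target), basis)


def prcc(samples, outputs, name):
    """Partial rank correlation between parameter `name` and the model output."""
    ranked = {p: ranks([s[p] for s in samples]) for p in samples[0]}
    others = [ranked[p] for p in ranked if p != name]
    rx = residuals(ranked[name], others)
    ry = residuals(ranks(outputs), others)
    return dot(rx, ry) / sqrt(dot(rx, rx) * dot(ry, ry))


# Toy usage: parameter "c" has no effect on the stand-in output.
rng = random.Random(0)
samples = latin_hypercube({"a": (0, 1), "b": (0, 1), "c": (0, 1)}, 200, rng)
outputs = [3 * s["a"] - 2 * s["b"] for s in samples]
coefficients = {name: prcc(samples, outputs, name) for name in ("a", "b", "c")}
```

In this toy case the coefficient for "a" is strongly positive, that for "b" strongly negative, and that for "c" close to zero. In the actual analysis, each coefficient would additionally be tested against the Bonferroni-adjusted threshold.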