Calibrating a Discrete-Event Simulation for Quantification of Sex-Specific Colorectal Neoplasia Development

Background: Medical evidence collected from new observational studies can sometimes significantly alter our understanding of disease incidence and progression. This requires efficient and accurate calibration of disease models to help quantify the differences between observed cohorts. However, model calibration commonly encounters overfitting, with many model parameters but few observational outcomes. Additionally, evaluating fitting performance is difficult because of the large degree of outcome variation and the expensive computation required for even a single simulation run. Methods: We developed a two-phase calibration procedure to address the above challenges. As a proof-of-concept study, we verified the procedure with a discrete-event-simulation-based study on sex-specific colorectal neoplasia development. For the study, we estimated eight disease model parameters that govern colorectal adenoma incidence risk and growth rates at three distinct states: non-advanced adenoma, advanced adenoma, and adenoma becoming cancerous. For the calibration, we defined the likelihood measure as a relative weighted sum-of-squares difference between the three actual prevalence values reported in a recent publication and those predicted by a discrete-event colorectal cancer simulation. In phase I of the calibration procedure, we performed a series of low-dimensional sampling-based grid searches to identify reasonably good candidate parameter designs. In phase II, we performed a local search-based approach to further improve the model fit. Results: Overall, our two-phase procedure showed better goodness of fit than a straightforward implementation of the Nelder-Mead algorithm, yielding a 10-fold reduction in calibration error (0.0025 vs. 0.0251 for an all-white mixed-family-history male cohort on the likelihood measure defined above). Further, the two-phase procedure was more effective in calibrating a validated simulation model for a female cohort than for a male cohort. Finally, in phase II, performing local search on each of the parameters sequentially was more effective than searching the entire parameter space simultaneously. Conclusions: The proposed two-phase calibration procedure is effective for estimating computationally expensive stochastic dynamic disease models. In addition, initial parameter search range truncation and sensitivity analysis on various parameters can be computationally cost-effective.

adenomas can be removed, which contributes to CRC incidence reduction. However, colonoscopy is not universally accepted among the screen-eligible population. Although considered cost-effective, it is also the most costly screening test. These drawbacks give rise to the need for alternative screening tests, including stool-based tests (for occult blood with or without DNA mutations), flexible sigmoidoscopy, and virtual colonoscopy, which are less costly and less invasive. However, these methods are much less accurate than colonoscopy. The U.S. Preventive Services Task Force recommends these tests for CRC diagnosis with some distinction; see [4] for more detail.
On the other hand, there is a need to improve the prediction of precancerous polyp progression so that CRC diagnostic screening and surveillance may be better targeted to those at high risk for rapid progression. In clinical practice, patients are further classified by detection of advanced precancerous polyps, which include adenomas and sessile serrated polyps ≥ 10 mm and adenomas with villous histology or high-grade dysplasia [5], [6]. Individuals with an advanced adenoma are more likely to develop another advanced adenoma and CRC, as are persons with multiple non-advanced adenomas (4-6 mm) [7]. Thus, with improved prediction, we can conduct population-wide surveillance more effectively and cost-effectively, e.g., by differentiating individuals according to their subsequent risk for advanced neoplasia (the combination of advanced precancerous polyps and CRC).
It is well established that sex plays an important role in the risk for both advanced precancerous polyps and CRC. More men than women are diagnosed with CRC. While men and women have similar genetic predispositions in terms of the adenoma-carcinoma sequence, there are substantial differences in cancer incidence between the two sexes [8], e.g., the American Cancer Society reported a 30% higher annual incidence rate in men than in women in the United States [1]. Murphy et al. (2011) [9] found that cancer detection rates were higher for men than for women at almost all anatomic subsites, based on data from the U.S. National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program. Several studies suggested that female patients diagnosed with CRC have significantly longer survival than male patients [10], [11]. The sex disparities have been associated with exposure to behavioral risk factors, including smoking [12], red-meat diet [13], and other lifestyle-related risk factors [14]-[16].
Furthermore, men have a higher prevalence of adenomas than women by a ratio of nearly 2:1. Ferlitsch et al. (2011) [17] reported that adenoma prevalence was higher among men than women by an absolute difference of 10%, in a study of more than 44 thousand participants of a national screening colonoscopy program in Austria. Regula et al. (2006) [18] reported that advanced precancerous polyps were found in a significantly higher percentage of men than women, in a study of more than 50,000 Polish participants. Brenner et al. (2013) [19] reported that adenoma prevalence (both advanced and non-advanced) was substantially higher in men than in women for different age groups, in an observational study of more than 3.6 million German participants.
In summary, it is evident that men and women differ in the risk of adenoma progression. However, the current recommendations on diagnostic screening and surveillance have not taken the sex difference into account. Thus, our objective here is to quantify age- and sex-specific colorectal adenoma progression with a computational model, which will in turn help to inform the optimal age for each sex to initiate colonoscopy testing. In this paper, we conduct stochastic modeling of age- and sex-specific dwell durations at four states (i.e., adenoma before initiation, non-progressive non-advanced adenoma, progressive non-advanced adenoma, and advanced adenoma). Since clinical data do not provide the necessary information about age- and sex-specific adenoma progression, the above unknown quantities cannot be directly modeled from clinical data using conventional statistical methods such as regression.
In response, we have adapted a discrete-event simulation model and performed model calibration to estimate progression parameters against sex-specific prevalence data on key disease stages, i.e., non-advanced adenoma, advanced adenoma, and adenoma becoming cancerous.
Only a handful of papers in the current CRC disease modeling literature have reported their model calibration work in detail. In several studies [20]-[22], the authors conducted model calibration to estimate unobservable disease progression parameters against benchmark statistics (e.g., national CRC incidence). More recently, Erenay et al. (2011) [23] developed an individual-based event-driven state transition simulation that mimics the natural history of metachronous colorectal cancer (MCRC) for a 5-year period following the treatment of primary CRC. The model comprises five states, namely polyp free, polyp, MCRC, metastatic MCRC, and MCRC-related death. The authors estimated six unknown parameters of the natural history of MCRC by calibrating the simulation against two calibration targets, 5-year MCRC incidence and mortality rate, minimizing the sum of squared errors of the two targets. For the calibration, the authors simply ran the simulation model exhaustively with every possible combination of the unknown parameters and selected those with simulated outputs matching the benchmark statistics of a well-defined patient cohort derived from the SEER database. Rose et al. (2014) [24] proposed an individual-based state transition model consisting of two interacting submodels: a continuous-time disease-progression submodel and a discrete-time Markov submodel for surveillance and retreatment. The key components of disease progression are recurring transitions to unresectability and to the point of symptom onset, either of which is determined by the transition timing and modeled with an exponential distribution. The authors estimated seven unknown parameters of disease progression by calibrating the simulation against seven observable outcomes reported by Pietra et al. (1998) [25]. The authors developed an efficient calibration procedure that consists of several rounds of calibration with increasingly narrowed candidate parameter sets and against a series of specific calibration targets.
In this research, we adapted the Vanderbilt-NC State (V/NCS) simulation model, which is a discrete-event-based stochastic model emulating the age- and sex-dependent growth of each adenoma. To calibrate this computationally expensive simulator, we used a two-phase approach to estimate the values of eight unknown model parameters over an ample design space. The first phase used sampling to identify reasonably good candidate parameter designs, whereas the second phase used local search to further improve the model fit. We set the sum of squared deviations of the prevalence values as the loss function to minimize and used benchmark statistics of three disease states (i.e., non-advanced adenoma, advanced adenoma, and adenoma becoming cancerous) extracted from a German cohort study by Brenner et al. (2013) [19]. Brenner and colleagues analyzed national screening colonoscopy registry data from nearly 3.6 million German participants. To facilitate the calibration, we relied on subject matter expertise to select both model parameters and target responses for the calibration procedure. To adapt the local search idea efficiently, we compared two variants of the optimization procedure (axis-based over parameter subsets vs. global over the full space). Finally, we quantified the sex-specific adenoma-carcinoma sequence for different age groups.
Our approach may be applied to calibrating "black-box" disease models with many unknown input parameters, wide value ranges, and multiple target outcomes. Our main contributions are 1) development of an efficient calibration procedure for complex discrete-event disease simulation models; and 2) quantification of age-dependent sex differences in the adenoma-carcinoma sequence based on empirical observations.

Overview of approach
In this study, we used the Vanderbilt-NC State (V/NCS) discrete-event microsimulation model developed by researchers at Vanderbilt and North Carolina State University [20].
The V/NCS model mimics the natural history of colorectal neoplasia for each hypothetical entity in the cohort. In addition, the model can be used to evaluate colorectal neoplasia screening strategies. One can specify the input cohort by sex, birth year, and family history of each simulated entity. Then for each entity, discrete events are simulated along the adenoma-carcinoma sequence. In the V/NCS model, events trigger changes at discrete times to the states of each adenoma along the progression pathways and of the hypothetical person as a whole. These events lead to the collection of statistics and the creation of new events. Events that are relevant to our work include new person creation, natural death, cancer death, non-advanced adenoma incidence, advanced adenoma incidence, and cancer incidence from an adenoma. Other events in the model include regional cancer, distance cancer, cancer symptomatic, terminal cancer, colonoscopy, recover from cancer, terminal cancer charge, and age-based utility. For more information, we refer to Roberts et al. (2007) [20].
The purpose of this study is to estimate a set of model parameters that specify four quantities of cohort heterogeneity, which govern the progression of each adenoma created and consequently the natural history of colorectal neoplasia. These quantities cannot be directly observed in an observational study, and sex-specific estimates of these parameters are not available for the simulation models currently found in the literature. Further, it would be unethical to follow the continued progression of an adenoma once it is observed in clinical practice. Instead, polypectomy is recommended, and the natural progression is halted. Therefore, we resorted to calibrating the V/NCS model against prevalence data from the published study of Brenner et al. (2013) [19]. While the V/NCS model offers sufficient fidelity to the adenoma-carcinoma sequence and lends flexibility in candidate parameter design selection with a user-friendly interface, we faced a severe computational burden. It took 40-50 minutes to complete one simulation run with a cohort size of 10,000 on a regular personal computer. In response, we developed an efficient two-phase calibration procedure.

Disease Progression Modeling In The V/NCS Model
CRC begins as a non-visible, benign adenoma. Once such an adenoma appears, it transitions to the next stage, depending on the pathway to cancer it follows. The V/NCS model includes three types of progression (i.e., pathways to cancer): non-progressive (i.e., #1 in Fig. 1), slowly progressive (#2), and immediately progressive (#3). The first type is non-progressive. An adenoma of this type has no chance of ever becoming cancerous, but it can grow as a benign adenoma (i.e., into an advanced adenoma) to match the data on the portion of adenomas that can be detected. The second type is slowly progressive. A non-advanced adenoma of this type can either become an advanced adenoma defined by its histology or become cancerous directly, with the former being more common. The transition of this type is therefore modeled as a competing process between these two possibilities. The third type is immediately progressive, which implies that an adenoma of this type progresses to becoming cancerous immediately upon its initiation. For either the second or the third type, once an adenoma becomes cancerous, it follows the usual CRC pathway (i.e., from the pre-clinical phase to the clinical phase and through the cancer stages).
In the V/NCS model, there are several key assumptions on adenoma incidence and progression. First, for each individual, the incidence and progression of each adenoma are independent of other adenomas. Second, both the adenoma incidence rate and the progression time are influenced by an individualized risk (Liebsch 2003) [26]. Family history (i.e., presence of CRC in first-degree relatives) is an important characteristic affecting the risk. Sex and race, to a lesser degree, also affect the risk. Specific risks to individuals may further include factors like fibrous diet, lack of exercise, familial adenomatous polyposis, and hereditary nonpolyposis CRC, but the exact effect of these factors is not known and is therefore not included in the model. The risk is modeled with a JohnsonSB distribution for individuals with or without a family history. In both cases, the JohnsonSB distribution has a minimum of 0.0 and a maximum of 1.0, which places the absolute individual risk on a scale between 0.0 and 1.0. Further, both distributions are highly positively skewed with a mode close to zero, a mean of 0.11 for individuals with a family history, and a mean of 0.056 for individuals without any family history. These additional specifications match the published research findings of a meta-analysis (Johns and Houlston 2001) [27].
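To make the bounded risk scale concrete, the following is a minimal sketch of drawing a JohnsonSB variate by transforming a standard normal draw. The function name `sample_johnson_sb` and the shape values γ = 2.0 and δ = 1.0 are hypothetical placeholders chosen only to produce a right-skewed shape; they are not the V/NCS values.

```python
import math
import random


def sample_johnson_sb(gamma, delta, minimum, maximum, rng):
    """Draw one JohnsonSB variate on the open interval (minimum, maximum).

    Uses the standard relation z = gamma + delta * ln((x - min) / (max - x)),
    with z standard normal, solved for x.
    """
    z = rng.gauss(0.0, 1.0)
    u = 1.0 / (1.0 + math.exp(-(z - gamma) / delta))
    return minimum + (maximum - minimum) * u


rng = random.Random(2024)
# Hypothetical shape values (assumption): they only illustrate a risk
# distribution that is skewed toward zero on the scale [0, 1].
risks = [
    sample_johnson_sb(gamma=2.0, delta=1.0, minimum=0.0, maximum=1.0, rng=rng)
    for _ in range(10_000)
]
mean_risk = sum(risks) / len(risks)
```

The same function covers the transition-time variates by passing a minimum of 0.0 and a maximum of 60.0.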
In addition to the risk adjustment, the baseline adenoma incidence follows a non-homogeneous Poisson process with the incidence rate modeled by an age-dependent piecewise linear function. The baseline time taken to progress to an advanced adenoma either from a non-progressive non-advanced adenoma (i.e., NP_NON) or from a slowly progressive non-advanced adenoma (i.e., P_NON) follows a JohnsonSB distribution with a minimum of 0.0 and a maximum of 60.0. The baseline time for an advanced adenoma to become cancerous also follows a JohnsonSB distribution with a minimum of 0.0 and a maximum of 60.0. Given that an adenoma is unlikely to occur before the age of 40, the maximum of 60.0 implies that the longest allowable time for making a transition is 60 years and thus gives a sufficient time lapse during an individual's lifetime. For the above three quantities, the actual transition time is further adjusted by the personal risk.
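A non-homogeneous Poisson process with a piecewise linear rate can be simulated by thinning. The sketch below illustrates that general technique; it is not a description of how V/NCS implements it, and the knot values in `KNOTS` are hypothetical.

```python
import random


def piecewise_linear_rate(age, knots):
    """Linearly interpolate an incidence rate between (age, rate) knots."""
    if age <= knots[0][0]:
        return knots[0][1]
    for (a0, r0), (a1, r1) in zip(knots, knots[1:]):
        if age <= a1:
            return r0 + (r1 - r0) * (age - a0) / (a1 - a0)
    return knots[-1][1]


def sample_incidence_ages(knots, max_age, rng):
    """Simulate event ages on (0, max_age] by thinning a homogeneous process."""
    # A piecewise linear function attains its maximum at one of its knots.
    rate_cap = max(rate for _, rate in knots)
    ages, t = [], 0.0
    while True:
        t += rng.expovariate(rate_cap)  # candidate event from the capped process
        if t > max_age:
            return ages
        # Accept the candidate with probability rate(t) / rate_cap.
        if rng.random() * rate_cap < piecewise_linear_rate(t, knots):
            ages.append(t)


# Hypothetical knots (assumption): zero incidence before age 40, rising afterwards.
KNOTS = [(0.0, 0.0), (40.0, 0.0), (60.0, 0.05), (100.0, 0.08)]
ages = sample_incidence_ages(KNOTS, max_age=100.0, rng=random.Random(7))
```

With these knots no adenoma can be generated before age 40, because the acceptance probability there is zero.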
We list several additional assumptions on adenoma progression and cancer incidence in the V/NCS model that are only indirectly relevant to our calibration work. First, the distribution of adenomas across progression types depends on age. As the body's repair mechanism deteriorates, its ability to deal with abnormal cells begins to decline. Thus, it is assumed that the percentage of progressive adenomas increases as a person's age increases. Second, immediately progressive adenomas become cancerous as soon as they are initiated. Since these adenomas progress immediately, it is further assumed that they take no more than 10 years until cancer becomes symptomatic. The actual duration follows a JohnsonSB distribution as well. Third, the time to cancer incidence follows a JohnsonSB distribution with a mean of 20 years and a mode of 22 years. V/NCS model assumptions on cancer staging and mortality include (1) regional and distant metastasis rates are independent processes; (2) pre-clinical cancer stage progression and symptom development are lesion-specific and independent of personal risk; (3) times to clinical cancer at the regional and distant stages follow JohnsonSB distributions; and (4) the rate of progression to death and the potential survival (from CRC) are determined by the cancer stage at the time of symptoms.

V/NCS Model Adaptation For The Calibration
With the V/NCS model platform, one can input a realistic population that matches the U.S. census or a hypothetical population of any arbitrary size, and enter a specific simulation start year. The simulation can then trace colorectal neoplasia development of the cohort to a pre-determined end year. One beneficial feature of the V/NCS model platform is that it generates a trace statement that summarizes a sequence of time-stamped events capturing disease development. To process the trace statement, we developed a procedure to extract all the events that take place in the input population (see Fig. 2). For each individual, we created a state transition chart to record the times at which their events occur. By following each individual through the simulation duration, one can characterize the health state (i.e., NON, ADV, or CRC) of each individual at any specific point. The state NON includes individuals who have had at least one P_NON, NP_NON, or adenoma that immediately progresses to cancer, but no advanced adenoma. The state ADV includes individuals who have had at least one advanced adenoma, none of which has become cancerous. The state CRC includes individuals who have had cancerous adenomas or have developed CRC. For each of the five age groups (55-59, 60-64, 65-69, 70-74 and 75-79), we then counted its population at the end of the simulation horizon (i.e., at a particular year) and calculated the portion of the corresponding population subgroup in each of the three states as the three corresponding prevalence values.
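The prevalence calculation described above can be sketched as follows. This is an illustrative sketch only: the `(age, state)` record layout, the `"NONE"` label for individuals without any adenoma, and the function name are assumptions, not the format of the V/NCS trace statement.

```python
from collections import Counter

STATES = ("NON", "ADV", "CRC")


def prevalence_by_age_group(people, age_groups):
    """Return {(low, high): {state: share}} for the states NON, ADV and CRC.

    `people` is a list of (age, state) pairs at the end of the simulation
    horizon, where state is "NONE", "NON", "ADV" or "CRC" (hypothetical labels).
    """
    result = {}
    for low, high in age_groups:
        states = [state for age, state in people if low <= age <= high]
        counts = Counter(states)
        total = len(states)
        result[(low, high)] = {
            s: (counts[s] / total if total else 0.0) for s in STATES
        }
    return result


# Toy records (assumption) for two illustrative age groups.
people = [(61, "NONE"), (62, "NON"), (63, "ADV"), (64, "NON"),
          (66, "CRC"), (68, "NONE")]
prev = prevalence_by_age_group(people, [(60, 64), (65, 69)])
```

In the toy input, two of the four individuals aged 60-64 are in state NON, so the NON prevalence for that group is 0.5.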
To adapt the V/NCS model for our calibration study, we made further specifications on the model calibration. First, we set an all-white mixed-family-history cohort in the V/NCS model. Second, we set the base incidence of non-visible adenomas to be the same as in the original version of the V/NCS model. Note that there is no evidence of a significant difference in the proportions of individuals with and without a family history between an all-white U.S. cohort and the German cohort studied by Brenner et al. (2014) [28], which is believed to be predominantly white. Third, we assumed that the time for a non-advanced progressive adenoma (i.e., P_NON) to become cancerous directly does not change from the original version of V/NCS to our work. As it is quite rare for an adenoma to follow this pathway, we did not expect fixing this quantity to affect the sex-based comparative results significantly.
Consequently, our calibration efforts were focused on quantifying the following random variates: (1) individual risk; (2) if an adenoma created is progressive, its baseline dwell duration in the non-advanced adenoma state before transitioning to the advanced precancerous polyps state (termed transition time of P_NON to ADV); (3) if the adenoma created is non-progressive, its baseline dwell duration in the non-advanced adenoma state before transitioning to the advanced adenoma state (termed transition time of NP_NON to ADV); and (4) if the adenoma is in the advanced adenoma state, its baseline dwell duration before transitioning to the cancerous adenoma state (termed transition time of ADV to CRC). Further, as stated earlier, each of the above four random variates is modeled with a JohnsonSB distribution. The JohnsonSB distribution is a four-parameter distribution family in which the shape and location of the distribution are governed by two model parameters, δ and γ. The other two parameters, minimum and maximum, specify the scale. For these four random variates, we assumed the minimum and maximum to be unchanged for the German cohort. Hence, for either sex, we had eight calibration variables, i.e., four δ's and four γ's, to estimate based on three system responses, namely the sex-specific age-group-aggregate NON, ADV, and CRC percentages.

A Two-phase Calibration Procedure
We took a two-phase approach for the model calibration. In phase I (a preliminary phase), we performed a series of low-dimensional searches on subsets of model parameters (i.e., a pair of δ and γ) in an ad hoc manner. We conducted these searches progressively against varied calibration targets that are aggregated over age groups. The purpose was to identify a set of promising values for each model parameter, which would serve as the multiple starting points (parameter designs) in phase II. The sequence of these searches and the design inclusion criteria were developed from discussions with a domain expert. In phase II, we viewed the model calibration task as a nonlinear optimization problem and applied the Nelder-Mead algorithm (simplex search algorithm), one of the best-known algorithms for multidimensional unconstrained optimization without derivatives. Given the high dimensionality of the "black-box" optimization problem, we explored two variants of the search procedure, namely one-shot globally over the entire model parameter space and sequentially based on interconnections in subsets of model parameters. The design of the loss functions was also developed in consultation with the domain expert. As our calibration targets, we used age-specific prevalence data from Brenner et al. (2013) [19]. More specifically, we used the prevalence for both men and women in five age groups (55-59, 60-64, 65-69, 70-74 and 75-79).
Phase I (Preliminary Phase): Identify promising initial search points for Phase II.
In our study, it is computationally challenging to quantify the joint influence of the eight input model parameters (calibration variables) on the fifteen age-group-specific system responses. We thus performed low-dimensional searches on the four pairs of model parameters over respective pre-specified search ranges. Note that the original version of the V/NCS model was previously calibrated against the SEER (Surveillance, Epidemiology and End Results Program) data for U.S. populations. We consulted the domain expert to finalize each search range, which is centered at the previously calibrated value. Through our preliminary simulation analysis, we observed that in each pair of δ and γ, the prevalence values are much more sensitive to changes in δ than in γ. Thus, we set a larger range for each δ than for the paired γ. We divided the search subspace of (δ₀, γ₀) with a five-by-five grid and divided the range of each of δ₁, γ₁, δ₂, γ₂, δ₃, γ₃ into ten even intervals.
We followed the adenoma-carcinoma sequence to calibrate the model parameters progressively, i.e., first adenoma progression propensity, then the transition from NON to ADV, and finally the transition from ADV to CRC. In the first step, we performed a grid search on (δ₀, γ₀) with the other parameter values fixed, because the first task is to mimic the adenoma progression risk distribution of the simulated cohort. Our calibration targets were all three prevalence values aggregated over age groups. At the end of this step, we identified promising (δ₀, γ₀) values such that all three predicted prevalence values were reasonably close to the observations (less than 15% relative error). Next, we fixed (δ₀, γ₀) to the identified values and performed orthogonal sampling in the subspace formed by (δ₁, γ₁). We used a sampling-based search rather than a grid search because multiple promising (δ₀, γ₀) values were identified, so using all of them in the ensuing search would be computationally expensive. Our calibration targets were the aggregate prevalence values of NON and ADV over age groups. We then followed the same idea to search in the subspace formed by (δ₂, γ₂), again using the same calibration targets. We perturbed (δ₁, γ₁) first because there were many more transitions from P_NON to ADV than from NP_NON to ADV. At the end of this step, we identified promising (δ₁, γ₁) and (δ₂, γ₂) designs such that both predicted prevalence values (i.e., at states NON and ADV) were closer still to the observations (less than 10% relative error). Finally, we fixed (δ₀, γ₀), (δ₁, γ₁), and (δ₂, γ₂) to the identified values and performed orthogonal sampling in the subspace formed by (δ₃, γ₃). Our calibration targets were the aggregate prevalence values of NON, ADV, and CRC over age groups. At the end of this step, we identified promising (δ₃, γ₃) designs such that all three predicted prevalence values fell within a close range of the target values (less than 10% relative error on NON, less than 10% relative error on ADV, and less than 5% relative error on CRC).
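One screening step of phase I can be sketched as a grid evaluation followed by a relative-error filter. Everything named here is an assumption for illustration: `toy_simulate` is a cheap stand-in for a simulation run, and the target values and search ranges are invented. Only the five-by-five grid and the 15% tolerance come from the description above.

```python
from itertools import product


def linspace(low, high, n):
    """Return n evenly spaced values from low to high inclusive."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def screen_designs(simulate, targets, delta_range, gamma_range, n_points, tolerances):
    """Keep (delta, gamma) grid points whose predictions meet every tolerance."""
    kept = []
    grid = product(linspace(*delta_range, n_points), linspace(*gamma_range, n_points))
    for delta, gamma in grid:
        predicted = simulate(delta, gamma)
        within = all(
            abs(predicted[k] - targets[k]) / targets[k] < tol
            for k, tol in tolerances.items()
        )
        if within:
            kept.append((delta, gamma))
    return kept


def toy_simulate(delta, gamma):
    """Hypothetical stand-in for one expensive simulation run."""
    return {"NON": 0.20 * delta, "ADV": 0.08 * delta, "CRC": 0.01 * (delta + gamma)}


targets = {"NON": 0.20, "ADV": 0.08, "CRC": 0.01}    # invented target prevalences
tolerances = {"NON": 0.15, "ADV": 0.15, "CRC": 0.15}  # 15% relative error
kept = screen_designs(toy_simulate, targets, (0.5, 1.5), (-0.2, 0.2), 5, tolerances)
```

Later steps would reuse the same filter with tighter tolerances (10% or 5%) and with sampled rather than gridded designs.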
For any identified parameter values, we used a built-in interactive visual tool to graph the corresponding JohnsonSB distributions (mainly their shapes and locations) and checked them with our domain expert. We then discarded some of the parameter designs according to the domain expert's suggestions.

Phase II: Local-Search-Based Nonlinear Optimization
In this phase, we employed the Nelder-Mead algorithm for gradient-free nonlinear optimization to further improve the model fit. We set the parameter designs identified in phase I as the starting points for the Nelder-Mead algorithm. We used the weighted sum of squared relative errors on the three aggregate prevalence values as a similarity measure and as the objective function of the unconstrained nonlinear optimization problem (i.e., the loss function of the calibration variables). In consultation with our domain expert, we assigned a larger weight to the set of CRC similarities than to the set of ADV similarities and the set of NON similarities.
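A minimal sketch of such a loss function is given below. The weights and prevalence values are hypothetical; the text only states that CRC received a larger weight than ADV and NON.

```python
def weighted_relative_sse(predicted, observed, weights):
    """Weighted sum of squared relative errors over the prevalence targets."""
    return sum(
        weights[k] * ((predicted[k] - observed[k]) / observed[k]) ** 2
        for k in observed
    )


# Hypothetical values (assumption): CRC is weighted twice as heavily.
observed = {"NON": 0.20, "ADV": 0.08, "CRC": 0.010}
predicted = {"NON": 0.22, "ADV": 0.08, "CRC": 0.009}
weights = {"NON": 1.0, "ADV": 1.0, "CRC": 2.0}
loss = weighted_relative_sse(predicted, observed, weights)
```

Here the relative errors are 10% on NON, 0% on ADV, and 10% on CRC, so the loss is 1.0 × 0.01 + 0 + 2.0 × 0.01 = 0.03 up to floating-point rounding.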
Because the resulting optimization problem is computationally expensive (evaluating just one parameter design takes a long time), we designed two search paths that differ in the search space chosen along the solution process. In the base case algorithm, we considered an 8-dimensional search space at any moment of the solution (i.e., all eight model parameters may be varied). We termed this strategy the "full-space local search strategy." As an alternative, we considered four 2-dimensional subspaces progressively (i.e., (δ₀, γ₀) first, then (δ₁, γ₁), then (δ₂, γ₂), then (δ₃, γ₃)). We termed this strategy the "sequential local search strategy." When we performed Nelder-Mead in one subspace, the other parameters were fixed at their initial values. The order used in this alternative algorithm followed the same idea as in phase I. That is, the system responses are most sensitive to (δ₀, γ₀), then (δ₁, γ₁), then (δ₂, γ₂), and finally (δ₃, γ₃).
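The two strategies can be contrasted with the sketch below, which uses a compact pure-Python Nelder-Mead with textbook coefficients rather than the MATLAB routine used in the study. The quadratic `toy_loss` is a stand-in for the simulation-based loss. Carrying each optimized block forward to the next subspace is an assumption about how the sequential strategy chains its steps.

```python
def _step_from(centroid, worst, t):
    """Point on the line through the worst vertex and the centroid."""
    return [c + t * (c - w) for c, w in zip(centroid, worst)]


def nelder_mead(f, x0, step=0.1, max_iter=500, tol=1e-10):
    """Minimal Nelder-Mead search; returns (best_point, best_value)."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if values[-1] - values[0] < tol:
            break
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]
        xr = _step_from(centroid, worst, 1.0)  # reflection
        fr = f(xr)
        if fr < values[0]:
            xe = _step_from(centroid, worst, 2.0)  # expansion
            fe = f(xe)
            simplex[-1], values[-1] = (xe, fe) if fe < fr else (xr, fr)
        elif fr < values[-2]:
            simplex[-1], values[-1] = xr, fr
        else:
            t = 0.5 if fr < values[-1] else -0.5  # outside or inside contraction
            xc = _step_from(centroid, worst, t)
            fc = f(xc)
            if fc < min(fr, values[-1]):
                simplex[-1], values[-1] = xc, fc
            else:  # shrink every vertex toward the best one
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (v - b) for b, v in zip(best, p)] for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    i_best = min(range(n + 1), key=lambda i: values[i])
    return simplex[i_best], values[i_best]


def sequential_search(loss, x0, blocks):
    """Optimize one block of coordinates at a time, holding the rest fixed."""
    x = list(x0)
    for block in blocks:
        def sub_loss(sub, block=block):
            trial = list(x)
            for idx, value in zip(block, sub):
                trial[idx] = value
            return loss(trial)

        sub_best, _ = nelder_mead(sub_loss, [x[i] for i in block])
        for idx, value in zip(block, sub_best):
            x[idx] = value
    return x, loss(x)


# Hypothetical stand-in for the calibration loss over (d0, g0, ..., d3, g3).
TARGET = [1.0, -0.5, 2.0, 0.3, 1.5, -1.0, 0.8, 0.0]


def toy_loss(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


start = [0.0] * 8
blocks = [(0, 1), (2, 3), (4, 5), (6, 7)]  # the four (delta, gamma) pairs
seq_x, seq_loss = sequential_search(toy_loss, start, blocks)
full_x, full_loss = nelder_mead(toy_loss, start, max_iter=5000)
```

On a separable toy loss like this one, the sequential strategy only ever solves 2-dimensional problems. With a real simulator, the pairs interact, so the relative performance of the two strategies has to be checked empirically.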

Results
We set the simulated cohort in the V/NCS model to be a population of either 1000 white males or 1000 white females with no family history, born in 1949. In this way, we were able to utilize additional parameters previously made available in the V/NCS model, e.g., mortality rate. We first examined the efficiency of our two-phase approach in comparison with the straightforward execution of the Nelder-Mead algorithm. For any implementation of the Nelder-Mead algorithm, we simply called the corresponding MATLAB function and ran it to natural termination with default algorithm parameter settings. To make fair comparisons, we also used the same two search strategies, i.e., full-space and sequential. We report the comparative results in Table 1.
Overall, our two-phase approach showed better goodness of fit than the straightforward Nelder-Mead implementation. For example, with the sequential search strategy used for Nelder-Mead in both cases, the two-phase approach yielded a loss function value of 0.0025, whereas the straightforward calibration with Nelder-Mead yielded a loss function value of 0.0251 (a ten-fold reduction). Furthermore, when comparing the two local search strategies, the sequential search strategy yielded a lower loss function value than the full-space search strategy (male, 0.0025 vs. 0.0056; female, 0.0005 vs. 0.0008). Finally, these preliminary results suggested that the two-phase procedure was more effective in calibrating the V/NCS model for a female cohort than for a male cohort. With the above results, we conclude that our two-phase procedure with the sequential local search strategy is more effective. We next set the number of simulation replications to 10 to ensure statistical confidence in the stochastic dominance for each comparison. We collected prevalence statistics for five different age groups in the range of 55 to 79. Table 2 shows the percentage of people with advanced adenoma for men and women within each of the five age groups. Our results show that the model with calibrated variables underestimated male advanced adenoma prevalence and overestimated female advanced adenoma prevalence for younger age groups, whereas it overestimated male advanced adenoma prevalence and underestimated female advanced adenoma prevalence for older age groups. On the other hand, the comparative results on the prevalence of adenomas having become cancerous were just the opposite, except for the age group 55 to 59 years. Overall, our results supported our calibration of the V/NCS simulation for sex-specific colorectal neoplasia development modeling.

Discussion And Conclusions
In this paper, we introduce an efficient and effective two-phase calibration procedure to estimate unobservable parameters of the CRC natural history. This work showcases an essential step in assessing the population-level cost and effectiveness of CRC screening strategies for a population whose prevalence data were recently acquired. We took the adenoma-carcinoma sequence into consideration to calibrate the model parameters progressively, i.e., first adenoma progression propensity, then the transition from NON to ADV, and finally the transition from ADV to CRC. Moreover, we quantified the sex-specific adenoma-carcinoma sequence among different age groups based on observations from a large cohort study. Our results demonstrate that combining a sampling-based search in the first phase with local-search-based nonlinear optimization in the second phase increases the accuracy of model-based parameter estimation.
In addition, the proposed model-based parameter estimation approach is a general procedure that could be extended to other disease model calibration problems that involve the adaptation of some robust simulation platform (e.g., the V/NCS).
While we developed an efficient model-based parameter estimation procedure, our study has several inherent limitations. First, the simulation is computationally very expensive; therefore, we used a smaller cohort size, which is not comparable to the cohort studied by Brenner et al. (2014) [28]. Second, it is possible that when using Nelder-Mead in phase II, the algorithm converged to a local minimum rather than a global minimum. Third, calibration targets estimated from the German cohort may not reflect CRC natural history characteristics in the U.S., which can be crucial to further cost-effectiveness analysis of screening strategies for U.S. populations. Fourth, underweighted endpoints in the objective function may result in biased calibration outputs, thus undermining the validity of our parameter estimation. Fifth, while the progressive calibration procedure could bring valuable insights into efficient

27. L. E. Johns and R. S. Houlston, "A systematic review and meta-analysis of familial colorectal cancer

Figure 1