Probability of Transition of Normal Esophageal Mucosa and Precancerous Lesion to the Esophageal Cancer

Background: To estimate the transition probabilities of esophageal cancer (EC) and its precancerous lesions with a Markov model, which could provide important information for choosing reasonable screening and follow-up intervals in EC screening. Methods: The transition probabilities among pathological stages were estimated by establishing Markov models for the natural history of EC and repeatedly adjusting and calibrating them, comparing the modeled incidence and distributions of pathological stages (alone or combined) with data observed under real-world conditions. Results: The one-year transition probabilities were 0.024, 0.05, and 0.12 for progression from the healthy state to mild dysplasia (mD), from mD to moderate dysplasia (MD), and from MD to severe dysplasia/carcinoma in situ (SD/CIS), respectively. The age-specific transition probabilities were 0.08~0.18 for SD/CIS progressing to intramucosal carcinoma (IC), 0.4~0.87 for IC to submucosal carcinoma (T1N0M0) (SC), and 0.2~0.85 for SC to invasive carcinoma (INC). The progression probabilities increased with age and with the severity of the disease. Based on the estimated transition probabilities, we predicted the incidence of EC and the distributions of its pathological stages. Comparisons of the modeled results with observed data confirmed the validity of our transition probabilities. Conclusions: A valid transition model for esophageal cancer in China was established. It could be a point of reference for further economic evaluation and policy formulation of cancer


Background
Esophageal cancer (EC) is a tumor with high morbidity and mortality. According to the International Cancer Registry, there were an estimated 572,000 new cases and 509,000 deaths worldwide in 2018, ranking seventh and sixth among malignant tumors in incidence and mortality, respectively [1]. China is one of the countries with the highest incidence of EC. According to the latest data released by the National Cancer Center, the incidence of EC in China in 2015 was 17.87/100,000 and the death rate was 13.68/100,000, ranking sixth and fourth among malignant tumors, respectively [2]. EC affects a wide range of areas in China, covering about 200 million people, mainly in the Taihang mountain areas. In these high-risk areas the mortality of EC reaches 68.3/100,000 [3], far above the national level of 13.68/100,000 [2]. More than 95% of the patients treated for EC in the high-incidence areas were in the middle and advanced stages, and the 5-year survival rate of all cases was less than 10%. However, if early detection, early diagnosis, and early treatment can be achieved, the 5-year survival rate of early EC can exceed 95% [4]. Accordingly, early detection and treatment through screening programs can reduce the threat of EC to populations in high-risk areas [5].
Choosing reasonable screening intervals plays an important role in EC screening, and screening intervals are closely tied to the transition probabilities from one health state to another during esophageal carcinogenesis. There have been some follow-up studies in small cohorts of patients with precancerous lesions, and some short-term test-retest endoscopy studies. These could provide transition probabilities for some of the stages in the development of EC. However, the transition probabilities obtained from small populations varied significantly among studies. Two-fold differences were found in the transition probability of mild dysplasia (mD) progressing to moderate dysplasia (MD) in previous studies [6,7], and some probabilities, such as that of progression from intramucosal carcinoma (IC) to submucosal carcinoma (SC), were not available at all [8]. Thus, the applicability of previous transition probabilities is limited.
As far as we know, large population-based prospective randomized studies are ethically difficult and expensive to conduct, and their results would be obtained only after decades, whereas the Markov model has been successfully used in studies of other cancers to address similar questions [9]. The Markov model is considered a powerful tool for simulating the natural history of chronic diseases.
In Markov models, the health states that patients pass through are defined separately; then, by modeling with a system of transition probabilities among states within a cycle (usually one year), the development of a disease in a population (such as disease-specific incidence and mortality) can be estimated [10]. Conversely, we can estimate transition probabilities by repeatedly adjusting and calibrating Markov models, comparing the modeled incidence and mortality (alone or combined) with data observed under real-world conditions. This paper aims to estimate the transition probabilities for EC and its precancerous lesions by establishing Markov models based on the natural history of EC, which can provide important information for choosing reasonable screening and follow-up intervals in EC screening.
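The cycle-by-cycle logic described above can be illustrated with a minimal cohort sketch. It is not the authors' TreeAge model: it covers only the first four states, uses the one-year progression probabilities reported in the abstract, and omits regression and death purely to stay small. The names `STATES`, `PROGRESS`, `step`, and `simulate` are hypothetical.

```python
# Minimal cohort-style Markov sketch (illustrative only; not the authors' TreeAge model).
STATES = ["normal", "mD", "MD", "SD/CIS"]

# One-year progression probabilities reported in the abstract.
# Assumption: regression and death are left out to keep the sketch small.
PROGRESS = {"normal": 0.024, "mD": 0.05, "MD": 0.12}


def step(cohort):
    """Advance the cohort by one Markov cycle (one year)."""
    nxt = {s: 0.0 for s in STATES}
    for i, state in enumerate(STATES):
        p = PROGRESS.get(state, 0.0)  # the last state is absorbing in this sketch
        nxt[state] += cohort[state] * (1.0 - p)
        if p > 0.0:
            nxt[STATES[i + 1]] += cohort[state] * p
    return nxt


def simulate(cohort, cycles):
    """Apply `cycles` one-year steps and return the final state counts."""
    for _ in range(cycles):
        cohort = step(cohort)
    return cohort


# Hypothetical start: everyone in the normal state.
start = {"normal": 100_000.0, "mD": 0.0, "MD": 0.0, "SD/CIS": 0.0}
after_30 = simulate(start, 30)
```

Because every person either stays or moves to exactly one other state in each cycle, the cohort total is conserved, which is a convenient sanity check on any such model.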

Overview
The transition probabilities among health states of EC were estimated based on established Markov models for the natural history of EC. The natural history model can predict the incidence of EC from the EC mortality, all-cause mortality, the prevalence of each pathological stage of EC, and the transition probabilities. Conversely, because the incidence, mortality, and prevalence were available, we could estimate the transition probabilities.

Study design
This study is based on the project of early diagnosis and treatment of EC, which began in 2005 and is described in detail in our previous paper [11]. Through cluster sampling, Hejian Town in Linzhou County, Henan Province, was selected as the screening site, and the population aged 40-69 in this town as the target screening population. Those without contraindications for endoscopy were examined by endoscopy with iodine staining, and those with positive results underwent pathological examination; their pathological results, age, gender, and other basic information were recorded in detail. For precancerous lesions and esophageal cancer diagnosed by screening, the treatment principles were as follows: (1) for severe dysplasia, carcinoma in situ, and intramucosal carcinoma, endoscopic mucosal resection (EMR) or argon plasma coagulation (APC) was recommended, with endoscopic follow-up in the first year after treatment; (2) for submucosal carcinoma (T1N0M0), esophagectomy was recommended; (3) for invasive carcinoma, common treatment modalities were chosen depending on disease severity and could include surgery, radiotherapy, chemotherapy, or a combination. At the same time, the incidence of EC in each age group every 10 years from 40 to 69 years old in  [12,13]. The state-transition Markov model based on the natural history of EC has been described in the literature and is shown in Figure 1 [14]. At the start of the model, a hypothetical cohort is distributed among these health states, except the "death" state. During each Markov cycle (1 year), a person may remain in the same health state, progress to another state, regress to a lesser stage, or die from EC or other causes. The health state in the next year depends only on the health state in the current year and the corresponding transition probabilities [15].
Because the eligible screening age in the real world is 40-69 years and the average life expectancy is 73 years [16], we assumed in the Markov model that 100,000 participants enter the model at age 40 years and are followed up until age 70 years. TreeAge Pro 2009 Suite (TreeAge Software Inc.) was used for all analyses.

Parameters used in the Markov model
To establish the Markov model of natural history for EC, the following parameters were needed: initial probability, transition probability, and death probability. Initial probability refers to the prevalence of each health state for cohort members at the start of the modeling [17]. Transition probability denotes the likelihood of progression or regression from one health state to another in a Markov cycle. Death probability represents the probability that the population dies from EC or other causes in each model cycle. while only 5% of them were Barrett's esophagus. The initial distribution probabilities of 7 Markov states except death (from "normal" to "INC") were respectively 88.95%, 8.2%, 1.8%, 0.9%, 0.08%, 0.05%, and 0.02% in the 40-44 year age group [14].
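As a quick consistency check, the initial distribution quoted above can be turned into starting counts for the 100,000-person cohort. This is a minimal sketch; the variable names are hypothetical, and the state order (normal, mD, MD, SD/CIS, IC, SC, INC) is taken from the state list used elsewhere in the paper.

```python
# Initial distribution (percent) of the seven non-death states in the 40-44 age group [14].
INITIAL_PERCENT = {
    "normal": 88.95,
    "mD": 8.2,
    "MD": 1.8,
    "SD/CIS": 0.9,
    "IC": 0.08,
    "SC": 0.05,
    "INC": 0.02,
}
COHORT_SIZE = 100_000  # cohort size stated in the study design

# The percentages should account for the whole cohort.
total_percent = sum(INITIAL_PERCENT.values())

# Expected number of people starting in each state.
initial_counts = {s: COHORT_SIZE * p / 100.0 for s, p in INITIAL_PERCENT.items()}
```

The seven percentages add up to 100%, so no one starts in the "death" state, in line with the model description.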

Death probability
A previous study has shown that persons with SD/CIS or lesser lesions may not die from EC, that IC or SC cases may die from all causes including EC, and that INC patients can be assumed to die mainly from EC [14]. Therefore, in the natural-history model, the death probabilities for these three groups were converted from non-esophageal-cancer mortality, all-cause mortality, and the case fatality rate of EC, respectively. All of these were obtained from published data compiled from the Linzhou County Cancer Registry during 2004-2006, and from the results of our prospective cohort study based on the EC chemoprevention trial of selenomethionine and celecoxib in the "Early Detection of EC" (EDEC) program [14,19]. Table 1 shows the age-specific death probabilities of the different Markov states.
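The text does not say which formula was used to convert rates into per-cycle probabilities. One common choice, shown here only as an assumption, is the constant-rate conversion p = 1 - exp(-r t). The function name `rate_to_probability` is hypothetical, and the example rate is the high-risk-area EC mortality quoted in the Background, used purely for illustration rather than as a Table 1 value.

```python
import math


def rate_to_probability(rate_per_year, years=1.0):
    """Probability of at least one event in `years`, assuming a constant event rate."""
    if rate_per_year < 0:
        raise ValueError("rate must be non-negative")
    return 1.0 - math.exp(-rate_per_year * years)


# Illustration only: EC mortality in high-risk areas, 68.3 per 100,000 per year [3].
annual_rate = 68.3 / 100_000
annual_death_probability = rate_to_probability(annual_rate)
```

For rates this small the probability is almost equal to the rate itself; the distinction matters more for large rates such as the case fatality of advanced disease.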

Transition probability
The transition probabilities among health states were estimated using the approach taken by others [20-23]. First, transition probability ranges were determined from published studies [6,7,24-30], cohort data from the chemoprevention trial of the EDEC program mentioned previously [19], and experts' opinions, which served as an initial data set. Owing to limited sample sizes, the transition probabilities differed significantly among published studies, and some parameters for progression and regression of the disease were unavailable. Thus, in the second step, the transition probabilities were hierarchically calibrated so that the modeled age-specific EC incidence curves fit the empirical ones observed in real-world settings. In the third step, we further adjusted the transition probabilities to obtain a distribution of each pathological stage of EC similar to that of the surveillance data. Observed
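The calibration idea in the second step can be sketched as a grid search that picks the candidate probability whose modeled curve is closest to a target curve. This is a toy, single-parameter illustration under stated assumptions, not the authors' hierarchical procedure: the target is synthetic (generated from a known value so that recovery can be checked), and `new_cases_per_year`, `sse`, and `calibrate` are hypothetical names.

```python
def new_cases_per_year(p_progress, start_size=100_000.0, years=5):
    """Yearly counts leaving one at-risk state with a constant progression probability."""
    at_risk, cases = start_size, []
    for _ in range(years):
        new = at_risk * p_progress
        cases.append(new)
        at_risk -= new
    return cases


def sse(modeled, observed):
    """Sum of squared errors between the modeled and target curves."""
    return sum((m - o) ** 2 for m, o in zip(modeled, observed))


def calibrate(observed, candidates):
    """Return the candidate probability whose modeled curve best fits the target."""
    return min(
        candidates,
        key=lambda p: sse(new_cases_per_year(p, years=len(observed)), observed),
    )


# Synthetic target generated from a known value (not registry data).
TRUE_P = 0.05
synthetic_observed = new_cases_per_year(TRUE_P)

grid = [i / 1000 for i in range(1, 201)]  # candidates 0.001 ... 0.200
best_p = calibrate(synthetic_observed, grid)
```

In a real calibration the target would be the registry incidence by age group, several probabilities would be tuned together, and the candidate ranges would come from the literature and expert opinion as described above.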

Transition probabilities
The transition probabilities from a cancer-free state to an EC state and the age-specific transition probabilities from one preclinical cancer state to the next are presented in Table 2. In the Markov model, the transition probabilities out of each state, including the probability of remaining in that state, sum to 1. The transition probabilities of IC and SC vary greatly with age. Therefore, for IC and SC, we obtained transition probabilities by five-year age group through expert consultation and repeated fitting. All transition probabilities were within or very close to the related ranges from the literature [6,7,24-30]. The progression probability increased with the severity of the disease, and the related regression probability decreased.
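The sum-to-1 constraint means that the probability of remaining in a state is the complement of all the ways of leaving it. A minimal sketch follows, with a hypothetical helper `complete_row`. Only the MD progression value (0.12) comes from the paper; the regression and death values are placeholders invented for illustration.

```python
def complete_row(moves):
    """Add the 'stay' probability so that one row of the transition matrix sums to 1."""
    leaving = sum(moves.values())
    if leaving > 1.0 or any(p < 0.0 for p in moves.values()):
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    row = dict(moves)
    row["stay"] = 1.0 - leaving
    return row


# Row for the MD state. 0.12 is the reported MD -> SD/CIS progression probability;
# the regression and death values are hypothetical placeholders, not Table 2 values.
md_row = complete_row({"progress_to_SD/CIS": 0.12, "regress_to_mD": 0.10, "die": 0.01})
```

Checking every row this way after each calibration round guards against adjusted probabilities that no longer form a valid distribution.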
The age-specific progression probability for SD/CIS, IC, and SC cases increased with age.
Table 2 (surviving fragment): [31] 65-69 years old 0.18; remaining INC 0.2304 [32,33]. *: transition probabilities have been used in the health economic evaluation exploring preferable screening strategies for EC in high-risk areas of China [14].
The estimated pathological stage distribution of EC is shown in Table 3; it closely approximated the results of the EDETEC project during 2005-2008 and fell within the 99% confidence intervals of the screening data. The proportions declined with the severity of the disease: the mD stage ranked first (over 15%), followed by the MD and SD/CIS stages, and the proportion of the EC stage (including IC, SC, and INC) was the lowest (about 0.8%).
The age-specific distributions of mD, MD, SD/CIS, and EC (including IC, SC, and INC) predicted by modeling were similar to the corresponding screening results. Figure 2 shows that the proportions increased with age, with the 65-69 age group ranking first. This trend is consistent with the natural history of EC and with reports from other high-risk areas in China [34,35]. Table 4 compares the incidence of middle and advanced EC obtained by model fitting with the incidence of EC in Linzhou County from 2004 to 2006. The trend of age-specific incidence rates from 40 to 69 years estimated by the Markov model was similar to that reported in Linzhou County: the incidence rates increased with age. The values estimated by the model were approximately 97-107% of the observed values, so the incidence obtained by model fitting is highly consistent with the Cancer Registry Report of Linzhou County. "Estimated" refers to the incidences estimated through Markov model simulation.

Comparison with observed data on incidence rates
To our knowledge, our study is the first to comprehensively present transition probabilities for the natural history of EC based on a state-transition Markov model for further evaluation of EC screening projects. The transition probabilities among some health states estimated in our paper partly differed from published data. Most previous studies on the natural history of EC relied on repeated endoscopic screening or follow-up observation of precancerous lesions, but these methods require a long time, and the probabilities between some states might not be obtainable. Moreover, the natural transition probabilities for some precancerous lesions are ethically difficult to obtain. Although previous studies have reported transition probabilities for some pathological stages of EC, the data differed significantly among studies, mainly owing to small sample sizes [6,7,24-30]. In contrast, the Markov model has been successfully used to simulate the natural history of some malignant tumors [36,37], so we used it to simulate the natural history of EC and to estimate the transition probabilities between its various states.
The Markov model is a useful method for describing the process of individuals passing through a series of states in continuous time. Models of this type describe how patients move between long-term disease states and are popular for explaining the natural history of disease [38,39]. Moreover, the Markov model has the advantage that it can use data with limitations in quality and accuracy, which may have been collected retrospectively [40]. In the present study, we constructed a Markov model to estimate the 1-year transition probabilities of the various health states of EC (normal, mD, MD, SD/CIS, IC, SC, and INC). In particular, we fitted the age-specific proportion of EC at each pathological stage and predicted the incidence rates of EC in our model, which were finally compared with the screening data of Linzhou County.
In this study, we established the Markov model for the natural history of EC based on the transition probabilities. We allowed the disease to progress by only one stage per year, without considering the possibility that a few individuals could progress by two stages in a year, because the proportion of such individuals in the population is very small. We also tried a model that allowed such transitions; because their probabilities were very small, they had little effect on the outcome. Compared with the transition probabilities that can be found in Most of the data used in our model came from the Linzhou County Cancer Registries; extensive endoscopic screening has been conducted in Linzhou County since the 1980s, and systematic cancer incidence and death registries have been established there. Therefore, the relevant data from Linzhou County are reliable. The model did not use data from the most recent two years to estimate the transition probabilities of EC, because the model requires many parameters and the data from that period are very limited, with many items unavailable. However, the change in EC in Linzhou County over the most recent 10 years was not significant, the transition probabilities in this model were obtained after 30 cycles of estimation, and the modeled incidence of middle- and late-stage disease at ages 40-69 years was close to the monitoring results of Linzhou County.
The model was internally validated by comparing the model predictions with epidemiological data from the Linzhou County Cancer Registries. Model-predicted age-specific distributions of the pathological stages of EC matched the screening results of Linzhou County well, and the trends were similar to data reported in Ci County, another high-risk area for EC in China [41]. Because it included a younger age group, the Anyang trial yielded results that were not comparable to our predictions [35]. The age-specific incidence of EC estimated by modeling was quite close to the Linzhou County Cancer Registry data. However, further comparisons with other areas were not conducted because related reported data were unavailable. In general, the estimated transition probabilities in the study were reliable for high-risk areas in China.
Our study also has several limitations. First, the transition probabilities were estimated from limited published data and short-term cohort studies. Although model validation confirmed a relatively high internal validity, the external validity of the model could not be assessed because no other data sets were available. There is insufficient evidence to extend the results of this study to other regions. The results of the model depend to a large extent on the quality of the registration and screening data, which carries a potential risk of bias. Therefore, long-term cohort studies with large sample sizes are needed to test and improve our results; however, this kind of epidemiological research is currently absent in China. Second, the transition probabilities were estimated from data from high-risk areas in China. Differences in sociodemographic structure and in risk factors such as lifestyle and dietary habits across China make it necessary to be cautious in transferring our results to areas at low risk of EC. Finally, the transition probabilities of all health states should vary with age. However, owing to the small sample sizes in the literature and the unavailability of data, the transition probabilities of the normal, mD, and MD states were fixed, which would affect the results of the model to some extent [14]. Therefore, we should confirm the transition probabilities we obtained by following up the natural history of EC in a larger sample, and further improve our model of EC natural history and its transition probabilities.

Conclusion
An esophageal cancer transition model based on the natural history of esophageal cancer in high-risk areas in China was established, and the model is effective. It can provide a reference for economic evaluation and policy formulation of esophageal cancer screening.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent was obtained from all individual participants included in the study.

Consent for publication
All authors agreed on the final version of the manuscript.

Availability of data and materials
The datasets used and analyzed during the current study are available from the corresponding author (e-mail: yangchunxia@scu.edu.cn) on reasonable request.

Competing interests
All authors declare that there is no conflict of interest.

Funding
This work was supported by The National Science and Technology Pillar Program of the 11th National Five-Year Plan of China, No.2006BAI02A15. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Authors' contributions
WqW, YlQ, and CxY designed the research and revised the manuscript; YX and XjW participated in data collection and establishment of the esophageal cancer metastasis model; ZyC, YY, and LD reviewed the literature, analyzed the data, and prepared the paper.