Mapping Spatial Variation of the Stomach, Esophageal and Lung Cancers and Their Shared Risk Factors in Iran at a County Level

Background: Disease mapping has a long history in epidemiology. One of its applications is evaluating the spatial pattern of several diseases together with their shared and specific risk factors for mortality. Stomach, esophageal, and lung cancers are among the five most common cancers in both genders in Iran, but no study is available on the spatial distribution of their mortality rates in Iran. The present study aimed to investigate the geographical distribution of the relative risk of mortality and to define the spatial pattern of shared and specific risk factors for these three cancers by jointly modeling their mortality data at the county level in Iran.

Method: This study analyzed the mortality data of stomach, esophageal, and lung cancers in Iran from March 2013 to March 2015. The Besag, York, and Mollié (BYM) model and the shared component (SC) model were used to investigate the spatial variation of cancer mortality and to determine the spatial pattern of the shared and specific risk factors. Data analysis was conducted using R and OpenBUGS software.

Results: The number of deaths from esophageal, stomach, and lung cancers in Iran from March 2013 to March 2014 was 11,720, of which stomach and lung cancers accounted for 50% and 30%, respectively. The spatial patterns of stomach and esophageal cancer mortality were more similar to each other than to that of lung cancer, owing to the risk factors shared only between esophageal and stomach cancers.

Conclusion: The effect of smoking on lung cancer mortality was higher than on the other two cancers. The available data indicated that esophageal cancer mortality was more affected by nutritional factors than stomach cancer mortality in Iran. The effect of nutritional factors on stomach and esophageal cancer mortality was higher in the northern half of Iran than in the southern half. As a result, the relative risk of mortality from these cancers in the southern half was more affected by smoking than by nutritional factors.


With a long history in epidemiology, disease mapping can help identify risk factors and inform policies to reduce mortality by recognizing the spatial patterns and high-risk areas of disease in a population (1, 2). Over the last few decades, researchers have used univariate methods (a separate analysis for each disease) and multivariate methods (a combined analysis of several diseases) to estimate the spatial pattern of diseases more accurately (3)(4)(5).

The spatial variation of diseases may be related to differences in their risk factors. Disease mapping allows us to evaluate hypotheses about the causes of diseases (3). At first, only univariate methods were used for disease mapping. Researchers then turned to the simultaneous statistical modeling of several diseases, which identifies their shared and specific risk factors and gives more accurate results than single analyses (6).

Evaluating the spatial pattern of several diseases, as well as shared and specific risk factors in mortality, is regarded as one of the applications of disease mapping (3, 6, 7).

Cancer is one of the main public health problems in the world (8) and the second leading cause of mortality in Iran after cardiovascular diseases (9).

Identifying high-risk areas and the spatial distribution of risk factors is one of the strategies required for designing and implementing preventive policies to reduce cancer mortality (10). Stomach, esophageal, and lung cancers are among the five most common cancers in both genders in Iran (9, 11, 12), but no study is available on their mortality in Iran at the county level. The present study aimed to investigate the geographical distribution of the relative risk of mortality and to determine the spatial pattern of shared and specific risk factors for these three cancers by jointly modeling their mortality data at the county level in Iran. For this purpose, the model introduced by Besag, York and Mollié (BYM) (13) was used to analyze each cancer and determine its spatial pattern. The BYM model is one of the most widely used disease mapping models that account for the spatial correlation of neighboring areas. It rests on the hypothesis that areas close to each other behave similarly with respect to the disease (13). In addition, the shared component (SC) model (4) was used to highlight the similarities and differences in the spatial patterns of stomach, esophageal, and lung cancer mortality in the counties of Iran that arise from shared and specific risk factors. This model has been used in several studies to determine the spatial variation of risk factors for some diseases (7, 14, 15). Here, the counts of stomach, esophageal, and lung cancer deaths are the response variables of the model; one risk factor (smoking (16-18)) is shared by all three cancers, and the other (nutritional factors (16-18)) is shared only between esophageal and stomach cancers. In this model, latent variables are used as substitutes for the risk factors (4, 7). In addition, a random effect is included in the model predictor to account for other possible risk factors.

Based on previous studies of the risk factors for esophageal, stomach, and lung cancers, we treated smoking as a risk factor common to all three cancers and nutritional factors as a risk factor common only to esophageal and stomach cancers in the model. This study was approved by the Ethical Committee

In addition, assume that $y_{ij}$ has a Poisson distribution with parameter $E_{ij}\theta_{ij}$, where $E_{ij}$ indicates the expected number of deaths in the i-th county due to the j-th cancer and $\theta_{ij}$ represents the unknown true relative risk (RR) for the j-th cancer in the i-th county.
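The quantities in this Poisson model can be illustrated with a minimal sketch. It assumes that the expected counts are obtained by applying one overall death rate to each county's population; the text does not state how $E_{ij}$ was actually computed, and the county labels and numbers below are made up. The ratio of observed to expected deaths is only a crude estimate of $\theta_{ij}$, which is the quantity the spatial models then smooth.

```python
def expected_counts(deaths, population):
    """Expected deaths per county under one overall rate (assumed method)."""
    overall_rate = sum(deaths.values()) / sum(population.values())
    return {county: overall_rate * population[county] for county in population}


def crude_relative_risk(deaths, expected):
    """Crude estimate of theta_ij: observed deaths divided by expected deaths."""
    return {county: deaths[county] / expected[county] for county in deaths}


# Hypothetical data for a single cancer in three counties
deaths = {"A": 30, "B": 10, "C": 20}
population = {"A": 100_000, "B": 100_000, "C": 100_000}

expected = expected_counts(deaths, population)
crude_rr = crude_relative_risk(deaths, expected)

for county in sorted(crude_rr):
    print(county, round(expected[county], 2), round(crude_rr[county], 2))
```

A county with few expected deaths gives a very unstable crude ratio, which is the usual motivation for borrowing information from neighboring counties.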

The BYM model was used to fit the spatial pattern of each cancer. This model is one of the most widely used in disease mapping; it accounts for the spatial correlation structure of the data to obtain more reliable estimates. In this structure, information is shared between neighboring counties, and two counties that have at least one common border are considered neighbors. In the BYM model, the logarithm of the relative risk for the j-th cancer in the i-th county ($\theta_{ij}$) is modeled as follows:

$$\log(\theta_{ij}) = \alpha_j + u_i + v_i$$

where $\alpha_j$ is the intercept, representing the average mortality level across all counties for the j-th cancer. For each cancer, $u_i$ and $v_i$ are random effects included in the model to capture structured and unstructured spatial variation, respectively. It is assumed that $u_i$ follows a normal distribution with mean equal to the average of $u$ over the neighbors of county i and variance inversely proportional to the number of these neighbors, and that $v_i$ follows a normal distribution with mean zero and variance $\sigma_v^2$ (13).
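The neighbor-based prior described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' OpenBUGS code: the adjacency map, the effect values, the variance parameter `sigma2_u`, and the name `v_i` for the unstructured effect are all hypothetical.

```python
def car_conditional(u, neighbors, county, sigma2_u):
    """Conditional mean and variance of the structured effect for one county.

    The mean is the average of the neighbors' effects and the variance is
    sigma2_u divided by the number of neighbors.
    """
    nbrs = neighbors[county]
    if not nbrs:
        raise ValueError(f"county {county!r} has no neighbors")
    cond_mean = sum(u[k] for k in nbrs) / len(nbrs)
    cond_var = sigma2_u / len(nbrs)
    return cond_mean, cond_var


def log_relative_risk(alpha_j, u_i, v_i):
    """Log relative risk as intercept plus structured and unstructured effects."""
    return alpha_j + u_i + v_i


# Hypothetical map: county A borders B and C; B and C border only A
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
u = {"A": 0.0, "B": 0.2, "C": 0.4}

mean_a, var_a = car_conditional(u, neighbors, "A", sigma2_u=1.0)
print(mean_a, var_a)
```

Counties with more neighbors get a tighter conditional prior, so their estimates are pulled more strongly toward the local average.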

Then, the Bayesian shared component model was used to determine the spatial distribution of the risk factors (4).

Based on the previous studies on the risk factors of esophageal, stomach and lung cancers, smoking (16-

As in the BYM model, the logarithm of the relative risk is assumed to be a function of random components, where $\alpha_3$ is defined as in the BYM model. $\theta_{i1}$, $\theta_{i2}$, and $\theta_{i3}$ represent the relative risks of esophageal, stomach, and lung cancer in the i-th county, respectively. $us_i$ and $ua_i$ are latent random variables that stand in for the risk factor shared by all three cancers (smoking) and the risk factor shared by esophageal and stomach cancers (nutritional factors), respectively; both follow a conditional autoregressive (CAR) normal distribution to capture the spatial correlation of the data.
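As a hedged sketch, assuming the standard shared component formulation of reference (4) and the indexing used in this section (1 = esophageal, 2 = stomach, 3 = lung), the three log relative risks could take the form below. The subscripted weights $w_1, w_2, w_3$ and $\delta_1, \delta_2$ and the term $\varepsilon_{ij}$ (a hypothetical label for the unstructured random effect) are assumptions about notation rather than the authors' stated equations.

```latex
\begin{aligned}
\log\theta_{i1} &= \alpha_1 + w_1\,us_i + \delta_1\,ua_i + \varepsilon_{i1} \\
\log\theta_{i2} &= \alpha_2 + w_2\,us_i + \delta_2\,ua_i + \varepsilon_{i2} \\
\log\theta_{i3} &= \alpha_3 + w_3\,us_i + \varepsilon_{i3}
\end{aligned}
```

In this form the smoking component $us_i$ enters all three equations, while the nutrition component $ua_i$ enters only the esophageal and stomach equations, so a ratio such as $\delta_1/\delta_2$ compares the strength of the nutritional component between those two cancers.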

The adjacency weight equals 1 if counties i and j are adjacent and 0 otherwise. Adjacency is defined here as sharing at least one common border (13, 21). In addition, $\lambda_{us}$ and $\lambda_{ua}$ are precision parameters and are assumed to follow a Gamma(0.5, 0.0005) distribution (22). $w$ and $\delta$ are unknown parameters used to estimate the effect of each risk factor on the relative risk of each disease; their logarithms are assumed to be normally distributed.

Algorithm convergence was evaluated using the Gelman-Rubin test (23). Finally, the maps were drawn using R version 3.6.1.
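The Gelman-Rubin check can be illustrated with a minimal sketch of the basic potential scale reduction factor for one parameter. It assumes several chains of equal length and uses made-up draws; OpenBUGS reports its own version of this diagnostic, which may differ in detail.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic potential scale reduction factor (R-hat) for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    between = n * variance(chain_means)           # between-chain variance B
    within = mean([variance(c) for c in chains])  # within-chain variance W
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return (pooled / within) ** 0.5


# Hypothetical draws: two chains that agree, and two that sit in different regions
r_agree = gelman_rubin([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
r_apart = gelman_rubin([[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]])
print(round(r_agree, 3), round(r_apart, 3))
```

Values close to 1 suggest that the chains have mixed; values well above 1, as in the second example, indicate that the chains have not converged to the same distribution.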

The number of recorded deaths due to esophageal, stomach, and lung cancers in Iran during March 2013

of the relative risk of lung cancer mortality was higher than that of the other two cancers, so that other regions, except for the southeastern region, had close to average risk. The single-analysis maps indicated a shared spatial pattern in the relative risk of mortality from the three cancers, especially in the northwestern and southeastern regions, which can be related to the risk factor shared by the three diseases (smoking).

As observed, the spatial patterns of stomach and esophageal cancer mortality were more similar to each other than to that of lung cancer, owing to the risk factors shared between esophageal and stomach cancers.

The relative risk for esophageal and stomach cancers was significantly higher in the northern half of Iran than in the southern half. However, the dispersion of the relative risk of lung cancer was higher than that of the other two cancers.

In addition, the estimated effects of the shared and specific risk factors for the studied cancers are mapped in

cancer is slightly greater than on esophageal cancer, while its effect on lung cancer is greater than on stomach and esophageal cancers. In addition, the posterior means of the nutrition-related parameters for esophageal and stomach cancers were 1.82 and 0.70, respectively. The available data indicated that esophageal cancer mortality was more affected by nutrition than stomach cancer mortality in Iran ($\delta_1/\delta_2 = 2.6$).

Table 2 ranks the provinces by the mean RR of their counties (each province includes several counties, as shown in Figure 3) for each cancer. The highest mortality rates for esophageal, stomach, and lung cancers were in Ardabil, Zanjan, and West Azerbaijan provinces, respectively.
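The province ranking described above can be sketched as follows. This is an illustration only: the province and county labels and the RR values are made up, and an unweighted mean of county RRs is assumed, since the text does not say whether the counties were weighted.

```python
from statistics import mean


def rank_provinces(county_rr):
    """Return (province, mean county RR) pairs sorted from highest to lowest."""
    province_mean = {p: mean(rr.values()) for p, rr in county_rr.items()}
    return sorted(province_mean.items(), key=lambda item: item[1], reverse=True)


# Hypothetical county-level RR estimates for one cancer
county_rr = {
    "Province X": {"X1": 1.8, "X2": 1.6},
    "Province Y": {"Y1": 0.6, "Y2": 0.7, "Y3": 2.5},
    "Province Z": {"Z1": 0.9, "Z2": 1.0},
}

ranking = rank_provinces(county_rr)
for rank, (province, value) in enumerate(ranking, start=1):
    print(rank, province, round(value, 2))
```

In this made-up input, county Y3 has the highest RR of all, yet its province ranks second, which shows how a province mean can hide a single high-risk county.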

Furthermore, Table 3 indicates the ranking of provinces based on the mean estimated effect of the two risk

Estimating the effect of shared and specific risk factors on mortality without direct data on those factors is one of the features of the SC model, which uses latent variables as substitutes for the risk factors.

Based on the results, the effect of the shared risk factor (smoking) on lung cancer mortality was higher than on the other two cancers. Smoking has a higher prevalence than nutritional factors and has a high effect everywhere except southeastern Iran. The effect of nutritional factors, which were treated as a risk factor shared between stomach and esophageal cancer mortality in this study, was higher on esophageal cancer mortality than on stomach cancer mortality.

The results indicated that the effect of nutritional factors on stomach and esophageal cancer mortality was greater in the northern half of Iran than in the southern half, and that the relative risk of mortality in the southern half was more affected by smoking than by nutritional factors. East Azerbaijan and West Azerbaijan in northwestern Iran ranked high among the provinces in terms of the effect of both risk factors.

These two provinces had high rates of mortality from the three cancers, which can be attributed to the interaction of smoking and nutrition.

The results obtained in this study are consistent with the results of previous studies (7,

Regarding the limitations of the present study, data for Tehran province could not be accessed, so this province was excluded from the study.

Based on the available literature, no previous study was available on stomach, esophageal, and lung cancer mortality in Iran. Using data at the county level instead of the province level in a multivariate spatial model was one of the significant advantages of this study over other studies of the geographical distribution of diseases in Iran. Evaluating data at the county level provided more accurate and detailed information than the provincial level and could support more effective planning and policy making. Analyzing data at a larger scale can mask patterns at smaller scales such as the county level. In the present study, Isfahan province was identified as a low-risk province in terms of esophageal cancer mortality (Table 2), while Khor and Biyabank County in this province had very high esophageal cancer mortality.