Retrospective follow-up study of patients living in Mallorca diagnosed with bladder and urinary tract cancers between 2006 and 2011, identified through the Mallorca Cancer Registry.
Study population: cases with codes C65–C68 according to the ICD-O 3rd edition with any behaviour and histology except lymphomas (from 9590 to 9720 both included) were included, while cases identified exclusively through the death certificate (DCO cases) were excluded.
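The inclusion and exclusion criteria above can be sketched as a simple filter. This is a minimal illustration, not the registry's actual procedure; the function name `is_eligible` and its three arguments (ICD-O-3 topography code, four-digit morphology code, DCO flag) are hypothetical.

```python
# Hypothetical eligibility filter for the criteria described in the text.
ELIGIBLE_SITES = {"C65", "C66", "C67", "C68"}  # ICD-O-3 topography C65-C68


def is_eligible(topography: str, morphology: int, dco: bool) -> bool:
    """Return True if a registry case meets the stated study criteria."""
    in_site = topography[:3].upper() in ELIGIBLE_SITES
    # Lymphomas (morphology 9590 to 9720, both included) are excluded.
    is_lymphoma = 9590 <= morphology <= 9720
    # Cases known only from the death certificate (DCO) are excluded.
    return in_site and not is_lymphoma and not dco
```

For example, `is_eligible("C67.9", 8130, False)` keeps a papillary bladder tumour, while the same site with morphology 9590 is dropped.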
IACR/IARC rules for multiple cancers were used. Thus, only the first cancer was registered, whether it was of uncertain behaviour, in situ, or invasive. If, subsequently, there was a progression from non-invasive to invasive, the first registered cancer was not modified.
The following data were collected: sex, age, diagnostic method, site and sub-site; histology and behaviour according to the ICD-O 3rd edition; date of diagnosis; pathological or clinical tumour size (T), pathological or clinical regional lymph nodes (N), metastasis (M) and stage; date of last follow-up or date of death, and cause of death (bladder and urinary tract cancer or other causes).
Age was grouped as: 15–44 years old, 45–54, 55–64, 65–74, and 75 and over. Diagnostic method was recorded as clinical, pathological, or unknown. Site and sub-site were grouped as: urinary bladder, renal pelvis, overlapping and unspecified urinary organs, ureter, and urethra. Histology was recorded as: papillary transitional cell neoplasia (8130), solid transitional cell neoplasia (8120), microcytic carcinoma (8041, 8045), and other histology and unspecified (8000, 8001, 8010, 8020, 8033, 8070, 8071, 8082, 8140, 8310, 8480, 8490, 8255, 8900). Behaviour was registered as uncertain, in situ, and invasive.
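The grouping of age and histology described above could be coded as follows. This is a sketch under assumptions: the names `age_group` and `histology_group` are hypothetical, ages under 15 are treated as out of range, and any morphology code not listed for the first three groups falls into the "other and unspecified" category.

```python
# Hypothetical recoding helpers for the age and histology groups in the text.
HISTOLOGY_GROUPS = {
    8130: "papillary transitional cell neoplasia",
    8120: "solid transitional cell neoplasia",
    8041: "microcytic carcinoma",
    8045: "microcytic carcinoma",
}


def age_group(age: int) -> str:
    """Map age in years to the five age groups used in the study."""
    if age < 15:
        raise ValueError("ages under 15 are outside the stated groups")
    if age >= 75:
        return "75 and over"
    for low, high in ((15, 44), (45, 54), (55, 64), (65, 74)):
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError("unreachable for integer ages")


def histology_group(morphology: int) -> str:
    """Map an ICD-O-3 morphology code to one of the four histology groups."""
    # Assumption: unlisted codes default to the residual category.
    return HISTOLOGY_GROUPS.get(morphology, "other histology and unspecified")
```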
Stage was calculated according to the UICC 7th edition, but regrouped in the following categories: 0a, 0is, I, II, III, IV, no stage (uncertain behaviour). Pathological T or N status was prioritised over clinical. An integrated approach was used by combining pathological and clinical components to obtain the stage. The clinical records of cases with missing stage were reviewed in depth to minimise the number of lost values. We made the following assumptions: if T was 1 and N and M were missing, we assigned stage I; if T was 2 and N and M were missing, we assigned stage II, as some authors recommend for prostate cancer.
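The two rules stated above (pathological status taking priority over clinical, and the fallback stage when N and M are missing) can be sketched as below. This is an illustration only: it does not implement the full UICC 7th edition stage grouping, and the function names and the use of `None` for a missing component are assumptions.

```python
from typing import Optional


def integrated(pathological: Optional[str], clinical: Optional[str]) -> Optional[str]:
    """Prefer the pathological value of T or N; fall back to the clinical one."""
    return pathological if pathological is not None else clinical


def fallback_stage(t: Optional[str], n: Optional[str], m: Optional[str]) -> Optional[str]:
    """Apply only the two stated assumptions; return None if neither applies."""
    if n is None and m is None:
        if t == "1":
            return "I"   # T1 with N and M missing is assigned stage I
        if t == "2":
            return "II"  # T2 with N and M missing is assigned stage II
    return None  # stage stays missing and is left for multiple imputation
```

For example, a case with clinical T2, pathological T1 and no N or M information resolves to `integrated("1", "2") == "1"` and then to stage I.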
Time was calculated from date of diagnosis to date of death or date of the last follow-up. Vital status referred to the state (alive or dead from bladder or urinary tract cancer or from other causes) at the time of the last follow-up. The clinical records of deceased cases were reviewed in depth to establish precisely the cause of death. Cases that emigrated from Mallorca and lost cases were censored, as well as deaths from other causes for cause-specific survival. The starting point of follow-up was the date of diagnosis, and the end point was 31 December 2015.
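The follow-up time and the event indicator for cause-specific survival could be derived as sketched below. The status labels, the function name `survival_record` and the truncation of follow-up at the study end date are assumptions made for illustration.

```python
from datetime import date

STUDY_END = date(2015, 12, 31)  # end point of follow-up stated in the text


def survival_record(diagnosis: date, last_contact: date, status: str) -> tuple[int, int]:
    """Return (days of follow-up, event indicator) for cause-specific survival.

    status is one of the hypothetical labels: 'alive', 'dead_cancer',
    'dead_other', 'emigrated', 'lost'.
    """
    end = min(last_contact, STUDY_END)  # assumption: censor at study end
    days = (end - diagnosis).days
    # Only death from bladder or urinary tract cancer counts as an event;
    # deaths from other causes, emigration and loss to follow-up are censored.
    event = int(status == "dead_cancer" and last_contact <= STUDY_END)
    return days, event
```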
Statistical analysis: multiple imputation (MI) was used to obtain stage when this was unknown, following three main steps. First, we ran the imputation model and replaced each missing value with a set of five imputations by applying the multiple imputation by chained equations (MICE) procedure. The imputation model included the variables sex, age, site and sub-site, histology, vital status and survival time. Secondly, we analysed the resulting five imputed and complete data sets independently by applying the Cox regression model. Finally, we obtained a single Cox model using Rubin’s rules to combine the five estimates resulting from the previous Cox regression models. A more detailed description of the MICE procedure can be found in Ramos et al.
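The third step, pooling with Rubin's rules, can be illustrated with a short sketch. It assumes that each imputed data set yields a log hazard ratio and its squared standard error for a given covariate; the function name `pool_rubin` and the example numbers are hypothetical and are not results from this study.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates: list[float], variances: list[float]) -> tuple[float, float]:
    """Combine m per-imputation estimates into one estimate and total variance."""
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    within = mean(variances)     # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Illustrative log hazard ratios and variances from five imputed data sets.
log_hr = [0.52, 0.48, 0.55, 0.50, 0.45]
var_hr = [0.010, 0.011, 0.009, 0.010, 0.012]
pooled, total_var = pool_rubin(log_hr, var_hr)
hazard_ratio = math.exp(pooled)   # back-transform to the hazard ratio scale
std_error = math.sqrt(total_var)  # pooled standard error of the log hazard ratio
```

Pooling is done on the log hazard ratio scale because the Cox coefficients, not the hazard ratios, are approximately normally distributed.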
We applied cause-specific survival analysis, using the actuarial and Kaplan-Meier methods, to estimate the probability of survival and the risk of death; relative survival using the Ederer II method; the log-rank test to evaluate the statistical differences between the observed survival curves for each categorical variable; the log-rank test for trend to analyse the type of trend of the two variables that can be considered ordinal, age group and stage; and Cox regression models to identify prognostic factors for the risk of death. Cases with uncertain behaviour were excluded from the survival analysis, since they have no stage, our main study variable. We treated age as a continuous variable because our interest was in the effect of each unit increase on the risk of dying from bladder or urinary tract cancer. The proportional hazards assumption for each covariate was tested by introducing time-dependent variables. Since age and histology did not meet this assumption, we applied the extended Cox regression, which not only analyses the effect of covariates on the risk of dying, but also allows for the modelling of the time-dependent effect of the age and histology covariates. The procedure for selecting the variables in the final Cox model was based on the maximum likelihood criterion. Thus, initially, sex, age, site, histology and stage were introduced into the model, as well as the time-dependent variables of age and histology. To compare the effect of the imputation procedure on the hazard ratio estimation of covariates, the extended Cox regression was performed before and after MI.
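As an illustration of the Kaplan-Meier method mentioned above, the product-limit estimate can be computed as in the following sketch. It is a plain-Python illustration, not the SPSS procedure used in the study; the function name `kaplan_meier` and the input layout (follow-up times plus a 0/1 event indicator, where 0 marks a censored case) are assumptions.

```python
from collections import Counter


def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events and censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the share surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # censored cases leave after events at time t
    return curve
```

With four cases followed for 1, 2, 3 and 4 years, of which the second is censored, the estimate drops to 0.75 at year 1, to 0.375 at year 3 and to 0 at year 4.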
MI was carried out with Stata 13, cause-specific survival analysis with SPSS 23, and relative survival with the “relsurv” package for R.