An R package for an integrated evaluation of statistical approaches to cancer incidence projection
Background: Projecting future cancer incidence is an important task in cancer epidemiology, and the results are also of interest for biomedical research and public health policy. Age-Period-Cohort (APC) models, usually based on long-term cancer registry data (>20 years), are established for such projections. In many countries (including Germany), however, nationwide long-term data are not yet available. General guidance on statistical approaches for projections based on rather short-term data is difficult to give, and software that enables researchers to easily compare approaches is lacking.
Methods: To enable a comparative analysis of the performance of statistical approaches to cancer incidence projection, we developed an R package (incAnalysis) that supports, in particular, Bayesian models fitted by Integrated Nested Laplace Approximations (INLA). Its use is demonstrated by an extensive empirical evaluation of the operating characteristics (bias, coverage and precision) of potentially applicable models differing in complexity. Observed long-term data from three cancer registries (SEER-9, NORDCAN, Saarland) were used for benchmarking.
Results: Overall, coverage was high (mostly >90%) for Bayesian APC (BAPC) models, whereas coverage of less complex models varied with the projection period. Intercept-only models yielded coverage below 20%. Bias increased and precision decreased for longer projection periods (>15 years) for all except intercept-only models. Precision was lowest for complex models such as BAPC models, generalized additive models with multivariate smoothers, and generalized linear models with age × period interaction effects.
Conclusion: The incAnalysis R package allows a straightforward comparison of cancer incidence rate projection approaches. Further detailed and targeted investigations into model performance in addition to the presented empirical results are recommended to derive guidance on appropriate statistical projection methods in a given setting.
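To make the three operating characteristics named in the Methods concrete, the following is a minimal sketch of how bias, coverage and precision could be computed from projected and later observed incidence rates. It is not the incAnalysis API: the function name `operating_characteristics`, the use of relative bias, and the use of relative interval width as a precision measure are assumptions made for illustration, since the abstract does not give the exact definitions. The example rates are invented.

```python
from statistics import mean


def operating_characteristics(observed, projected, lower, upper):
    """Summarise projection performance (hypothetical helper, not from incAnalysis).

    observed  -- rates actually observed in the projection period
    projected -- point projections for the same strata
    lower, upper -- bounds of the projection intervals
    """
    n = len(observed)
    if n == 0 or not (n == len(projected) == len(lower) == len(upper)):
        raise ValueError("all inputs must be non-empty and of equal length")

    # Assumed definition: mean relative deviation of projection from observation.
    bias = mean((p - o) / o for o, p in zip(observed, projected))

    # Share of observed rates that fall inside their projection interval.
    coverage = mean(
        1.0 if lo <= o <= hi else 0.0
        for o, lo, hi in zip(observed, lower, upper)
    )

    # Assumed precision measure: mean interval width relative to the projection
    # (a larger value means a less precise projection).
    rel_width = mean((hi - lo) / p for p, lo, hi in zip(projected, lower, upper))

    return {"bias": bias, "coverage": coverage, "relative_width": rel_width}


# Invented example: rates per 100,000 in four age groups.
observed = [50.0, 52.0, 55.0, 60.0]
projected = [50.0, 52.0, 55.0, 66.0]
lower = [45.0, 47.0, 50.0, 61.0]
upper = [55.0, 57.0, 60.0, 71.0]

result = operating_characteristics(observed, projected, lower, upper)
print(result)
```

In this invented example only the last age group is over-projected and falls outside its interval, so three of four observations are covered. Comparing such summaries across models and projection horizons is the kind of evaluation the abstract describes.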
Posted 23 Sep, 2020