Data
In this study, we used the National Health Information Database (NHID) from 2013 to 2017. The NHID covers the entire population of Korea and is managed and provided by the National Health Insurance Service, Korea’s single health insurance provider. The NHID is composed of several databases (20). One of these, the eligibility database, contains sociodemographic information on the entire population of Korea, including sex, age, residence, and income-based insurance premiums (20). Death information is also collected at the individual level through linkage with death certificate data from Statistics Korea (20). In a previous study, the numbers of population and deaths at the district level (the administrative level in Korea above the dong/eup/myeon level) in the national statistics database and the NHID were highly correlated (21). Prior research also compared the NHID with the NAD of the Ministry of the Interior and Safety (MOIS) for calculating small-area level mortality (22). The numbers of population and deaths were nearly identical between the two databases, and the estimated SMRs were highly correlated in both sexes (22). Thus, using the NHID to estimate small-area mortality is considered valid. A substantive strength of using the NHID to calculate small-area mortality is the availability of age-specific mortality data in each small area (20), which were not available in previous studies that used the NAD and death certificate data (17, 18). This strength allowed us to measure small-area mortality not only with SMR, but also with CMF and LE.
We obtained the annual population of each small area (dong/eup/myeon) as of January 1 of each year, in 5-year age groups (0, 1–4, 5–9, 10–14, …, 85+), from the NHID as aggregated data. The subjects were followed for 1 year, and those who died by the end of the year were classified as deceased. Subjects who were foreigners or who lacked sex, age, or residence information were excluded from the analysis (1.4% of total NHID subjects); most of those excluded (99.8%) were foreigners.
Unit of analysis
The unit of analysis in this study was the dong/eup/myeon, which typically had between 3,850 and 21,886 inhabitants and between 46 and 109 deaths as of 2017; these units were regarded as representing neighborhoods. Previous studies have also used the dong/eup/myeon as the unit of analysis to calculate small-area mortality in Korea (17, 18). Because administrative districts changed over time, we adjusted the unit of analysis by treating merged or split small areas as one unit for the entire study period. Since it is known that more than 5,000 subjects are required to calculate a stable LE (6), areas with an average population of less than 1,000 per year were merged with adjacent areas. As a result, the 3,500 dong, eup, and myeon areas as of December 31, 2017 were reclassified into 3,377 units of analysis (23). Deidentified numbers were assigned to avoid stigmatizing small areas found to have high mortality rates (24).
Statistical analysis
We estimated the SMR, CMF, and LE in all small areas in Korea. In this study, only age was considered to be a confounder of the association between areas and mortality, and it was adjusted for in the calculation of the mortality metrics. We used equation (1) to calculate SMR by dividing the number of observed deaths in a small area by the expected number of deaths. The expected number of deaths was estimated by multiplying the age-specific population in the small area by the age-specific mortality rate of the standard population. The standard population was the total population of this study.
See Formula 1 in Supplemental Files
The terms in Formula 1 are, in order, the age-specific population of the standard population, the age-specific number of deaths of the standard population, the age-specific population of each small area, and the age-specific number of deaths of each small area. The index r denotes the small area, and i denotes the 5-year age group.
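As an illustration of the SMR calculation described above, the following Python sketch divides observed deaths by expected deaths for one small area. The function name, argument names, and the three-age-group numbers are hypothetical and are not taken from the study or its supplemental formulas; the SE and CI are not reproduced here.

```python
def smr(area_pop, area_deaths, std_pop, std_deaths):
    """Standardized mortality ratio (indirect standardization) for one area.

    All arguments are lists ordered by age group. This is a sketch under
    assumed inputs, not the authors' implementation.
    """
    # Age-specific mortality rates of the standard population.
    std_rates = [d / p for d, p in zip(std_deaths, std_pop)]
    # Expected deaths: area population times standard rates, summed over ages.
    expected = sum(p * m for p, m in zip(area_pop, std_rates))
    observed = sum(area_deaths)
    return observed / expected


# Hypothetical data with three age groups (illustrative numbers only).
std_pop = [1000, 2000, 1000]
std_deaths = [10, 40, 100]
area_pop = [100, 100, 100]
area_deaths = [2, 2, 8]

smr_value = smr(area_pop, area_deaths, std_pop, std_deaths)
print(round(smr_value, 3))
```

In this toy example the area has 12 observed deaths against 13 expected deaths, so the SMR is below 1.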
We followed the method presented in the previous study for calculating the standard error (SE) and 95% confidence interval (CI) of SMR (5).
See Formulas 2 and 3 in Supplemental Files
CMF was calculated by dividing the expected number of deaths in the standard population by the number of observed deaths in the standard population. The expected number of deaths in the standard population was calculated by multiplying the age-specific mortality rate of each small area by the age-specific population of the standard population. The standard population used in the calculation of CMF was also the total population of this study. Equation (4) was used to calculate CMF.
See Formula 4 in Supplemental Files
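The CMF calculation described above can be sketched in the same way. The Python snippet below applies the age-specific mortality rates of one small area to the standard population; the function name, argument names, and numbers are hypothetical and serve only as an illustration under those assumptions.

```python
def cmf(area_pop, area_deaths, std_pop, std_deaths):
    """Comparative mortality figure (direct standardization) for one area.

    All arguments are lists ordered by age group. Assumes every age group
    in the area has a non-zero population. Sketch only.
    """
    # Age-specific mortality rates of the small area.
    area_rates = [d / p for d, p in zip(area_deaths, area_pop)]
    # Deaths expected in the standard population under the area's rates.
    expected_in_std = sum(p * m for p, m in zip(std_pop, area_rates))
    return expected_in_std / sum(std_deaths)


# Hypothetical data with three age groups (illustrative numbers only).
std_pop = [1000, 2000, 1000]
std_deaths = [10, 40, 100]
area_pop = [100, 100, 100]
area_deaths = [2, 2, 8]

cmf_value = cmf(area_pop, area_deaths, std_pop, std_deaths)
print(round(cmf_value, 3))
```

Because CMF relies on the age-specific rates of the small area itself, it is more sensitive than SMR to age groups with very few people or deaths.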
We used equations (5), (6), and (7) to estimate the SE and 95% CI of CMF (5).
See Formulas 5, 6, and 7 in Supplemental Files
LE is often calculated by a deterministic approach (25). Sampling variation is not a major concern when calculating LE at the national or regional level (1). At the small-area level, however, sampling variation must be considered because death numbers are subject to stochastic variation over time (1, 16). Calculating the SE of LE also helps determine how many years of data must be combined to achieve an appropriate level of precision (1). Chiang assumed that death numbers were distributed binomially, calculated the SE of the probability of dying in each interval, and linked it to the LE calculation (as cited in (26)). Eayres and Williams reported that the binomial and Poisson assumptions for the distribution of deaths produced highly concordant results, but they recommended the binomial assumption for analyses of LE at the small-area level (27). We performed Monte Carlo simulations using the probability of dying from an abridged life table to generate binomially distributed death numbers (1, 26). The simulation was performed 10,000 times for each small area. LE was calculated from each simulated set of death numbers, yielding a distribution of LE for each small area. The mean of this distribution was defined as the LE of the small area, and the 2.5th and 97.5th percentiles were defined as the lower and upper limits of its 95% CI, respectively. No imputation was conducted even if the number of deaths in a specific age band was zero (27, 28). There was no small area where the number of deaths in the final age band (85+) was zero.
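The simulation procedure can be sketched as follows. This Python example is a minimal illustration under several assumptions that the text does not specify: the fraction of the interval lived by those who die (0.1 for age 0, 0.5 elsewhere), an effective number of binomial trials equal to observed deaths divided by the probability of dying (chosen so the variance matches Chiang's formula), a fixed mortality rate in the open-ended 85+ band, and hypothetical population and death counts. The function names are illustrative.

```python
import random


def life_expectancy(q_closed, widths, a, m_last):
    """LE at birth from an abridged life table with radix 1."""
    survivors = 1.0
    person_years = 0.0
    for qi, n, ai in zip(q_closed, widths, a):
        dying = survivors * qi
        # Survivors live the full interval; those dying live a fraction ai.
        person_years += n * (survivors - dying + ai * dying)
        survivors -= dying
    # Open-ended final band: person-years = survivors / mortality rate.
    person_years += survivors / m_last
    return person_years


def simulate_le(pop, deaths, widths, a, n_sim=10_000, seed=0):
    """Mean LE and approximate 2.5th/97.5th percentiles from simulation."""
    rng = random.Random(seed)
    rates = [d / p for d, p in zip(deaths, pop)]
    # Probability of dying in each closed interval (final band excluded).
    q = [n * m / (1 + n * (1 - ai) * m) for n, m, ai in zip(widths, rates, a)]
    # ASSUMPTION: effective binomial trials = observed deaths / q.
    trials = [round(d / qi) if d > 0 else 0 for d, qi in zip(deaths, q)]
    results = []
    for _ in range(n_sim):
        q_sim = []
        for n_trials, qi in zip(trials, q):
            if n_trials == 0:
                q_sim.append(0.0)  # zero observed deaths: no imputation
            else:
                d_sim = sum(rng.random() < qi for _ in range(n_trials))
                q_sim.append(d_sim / n_trials)
        # ASSUMPTION: the 85+ mortality rate is held at its observed value.
        results.append(life_expectancy(q_sim, widths, a, rates[-1]))
    results.sort()
    mean_le = sum(results) / n_sim
    lower = results[int(0.025 * n_sim)]
    upper = results[int(0.975 * n_sim) - 1]
    return mean_le, lower, upper


# Hypothetical small area; age bands 0, 1-4, 5-9, ..., 80-84, 85+.
widths = [1, 4] + [5] * 16      # widths of the 18 closed age bands
a = [0.1] + [0.5] * 17          # assumed fraction of interval lived
pop = [50, 200, 300, 300, 300, 350, 350, 400, 400, 400,
       400, 350, 300, 250, 200, 150, 100, 60, 40]
deaths = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 6, 8]

# Fewer runs than the 10,000 used in the study, to keep the example fast.
mean_le, lower, upper = simulate_le(pop, deaths, widths, a, n_sim=500)
print(round(mean_le, 1), round(lower, 1), round(upper, 1))
```

The width of the simulated interval shrinks as the population or the number of pooled years grows, which is one way to judge how much data a small area needs.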
We set up a hypothetical situation with the same age-specific mortality rates across all small areas, applying the national age-specific mortality rates in 2015 to calculate SMR and to compare its distributions by dong/eup/myeon status. We also compared the ranking of areas by SMR, CMF, and LE, from the highest to lowest and from the lowest to highest. Lastly, we examined the ratio of CMF to SMR stratified by dong/eup/myeon status.
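To make the hypothetical scenario concrete, the Python sketch below gives two areas identical age-specific mortality rates but different age structures, and then computes SMR and CMF for each. All rates, populations, and names are hypothetical; the scenario rates merely stand in for the national 2015 rates, which are not reproduced here.

```python
def expected_deaths(pop, rates):
    """Sum of age-specific population times age-specific rate."""
    return sum(p * m for p, m in zip(pop, rates))


# Hypothetical standard population with three age groups.
std_pop = [1000, 2000, 1000]
std_deaths = [10, 40, 100]
std_rates = [d / p for d, p in zip(std_deaths, std_pop)]

# Hypothetical rates shared by every area (stand-in for the 2015 rates).
scenario_rates = [0.005, 0.02, 0.12]

# Two hypothetical areas that differ only in age structure.
areas = {"young": [300, 150, 50], "old": [50, 150, 300]}

results = {}
for name, pop in areas.items():
    # Deaths implied by applying the shared rates to the area population.
    observed = expected_deaths(pop, scenario_rates)
    smr = observed / expected_deaths(pop, std_rates)
    # CMF depends only on the shared rates, not on the area age structure.
    cmf = expected_deaths(std_pop, scenario_rates) / sum(std_deaths)
    results[name] = (smr, cmf, cmf / smr)
    print(name, round(smr, 3), round(cmf, 3), round(cmf / smr, 3))
```

In this toy setting CMF is the same in both areas, whereas SMR differs because each area weights the shared rates by its own age structure, so the ratio of CMF to SMR also differs between the two areas.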