Study design and data resource
We conducted an ecological study using open data collected, compiled, and published as government statistics for 2010, 2013, and 2016 by prefecture. The data used for analysis were LE, three measures of HLE—population by age, number of deaths, and the number of unhealthy people. The values of the three measures of HLE and LE were obtained from the “Healthy Life Report from Japan.”3 All other variables were derived from the government statistics website, “e-Stat.”15 The population by age was derived from the 2010 population census and population estimates from 2013 and 2016. The number of deaths was derived from official vital statistics. There are three measures of “number of unhealthy people,” as they vary according to the method of HLE (Additional file1). In DFLE-AL, the people who answered “Yes” to the question “Do health problems currently affect your daily life in some way? (Yes/No)” were regarded as unhealthy, and in LE-SH, the people who answered “Not very good” or “Not good” to the question “What is your current state of health?”, which is rated on a 5-point scale (range, not good–very good), were regarded as unhealthy. The numbers of unhealthy persons in DFLE-AL and LE-SH were derived from a comprehensive survey of living conditions.16 In DFLE-CN, the people requiring long-term care (level 2 or higher) were determined to be unhealthy in a survey of long-term care benefit expenses.17
LE and three measures of HLE were used as outcome variables.
The predictor variables were converted to the following rates or ratios: aging rate, mortality, and proportion of unhealthy people. The proportion of unhealthy people was used as the main predictor. In this paper, we define the proportion of unhealthy people for three measures of HLE as the restriction rate for DFLE-AL, the subjective unhealthy rate for LE-SH, and the care need rate for DFLE-CN. The restriction rate and the subjective unhealthy rate were calculated by dividing the number of unhealthy people in the anonymous self-administered questionnaire by the number of each survey respondents. The care need rate was calculated by dividing the number of persons requiring long-term care (level 2 or higher) by the number of persons aged ≥40 years who were eligible for long-term care insurance.
Mortality, aging rate, and data year were used as the other predictor variables. Mortality was calculated by dividing the number of deaths by the total population. The aging rate was calculated by dividing the population aged ≥65 years by the total population. Data year was used after converting from 1 to 3 in the order of 2010, 2013, and 2016. To assist with interpretation, all variables except data year were expressed in “per 1000 persons.”
First, Spearman's rank correlation coefficients for LE and the three measures of HLE were calculated to confirm their similarity. Then, regression analysis was performed using a generalized linear mixed model (GLMM), in which three measures of HLE were used as dependent variables. Data year, aging rate, mortality, and proportion of unhealthy people were included as independent variables.
In the regression analysis according to the GLMM, data of 47 prefectures in 2010, 2013, and 2016 were combined for all variables and treated as variables of 140 samples (the data for Kumamoto Prefecture in 2016 are missing due to an earthquake). For the independent variables, the model was constructed after the evaluation of multicollinearity using a variance inflation factor (VIF). Models with both aging rate and mortality as independent variables had a high VIF value of 8.4–15.2, which could lead to multicollinearity problems. Therefore, Model 1, with data year, aging rate, restriction rate, subjective unhealthy rate, and care need rate as independent variables, and Model 2, with mortality as the independent variable in place of the aging rate, were developed. In these two models, regression analyses were performed by sex with each HLE as the dependent variable. For all regression analyses, we calculated a random slope for the data year variable and a random intercept by the prefecture variable (Figures 1 and 2).
For estimation of the parameters, simulated draws from the posterior were obtained for each parameter using the Markov chain Monte Carlo (MCMC) method.18-19 The simulated draws were preceded with 2500 “burn-in” draws, which were discarded from the analysis. The MCMC chain was thinned by including only every second draw, yielding 5000 simulated posterior observations. Then, Rhat was calculated to confirm the convergence of the simulation. Rhat is an index of the divergence between chains, and in the case of three or more chains, if it is 1.1 or less by convention, it is considered to have converged.
Analyses were performed using the open source statistical software R (ver. 3.6.2)20, and the Rstan19 package was used for the parameter estimation by MCMC.