Data source
The microdata used in this paper come from the death sample of the Chinese Longitudinal Healthy Longevity Survey (CLHLS), a follow-up survey of the elderly organized by the Center for Healthy Ageing and Development Research/National Development Research Institute of Peking University. The survey covers 23 provinces, autonomous regions and municipalities across China, with respondents aged 65 years and older and adult children aged 35–64 years. Two types of questionnaire were used: one for surviving respondents and one for the family members of deceased elderly respondents. After the baseline survey in 1998, follow-up waves were conducted in 2000, 2002, 2005, 2008–2009, 2011–2012, 2014, and 2017–2018. Because data on the death sample are missing before 2011, this paper mainly uses the questionnaire data from family members of older adults who died in 2011–2012, 2014, and 2017–2018, with sample sizes of 5585, 2759, and 2157, respectively, and pools the three periods into one mixed cross-sectional dataset. The provincial-level macro data used in this paper were obtained from the China Statistical Yearbook, which reports annual economic development and social service indicators for each province in China.
Variable Measurement
Dependent variable: the place of death
The place of death of the elderly was measured by the questionnaire item "Where did the elderly eventually die?" The response options were "at home", "in a hospital", and "in a nursing home". Since the proportion of the sample who died in a nursing home was very small (2.29%), we combined the options "in a hospital" and "in a nursing home" and defined the place of death as a dichotomous variable: it was coded "0 = dying at home" if the elderly person died at home and "1 = dying in hospital" if the elderly person died in a hospital or nursing home.
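The recoding described above can be sketched as follows. This is a minimal illustration only: the response labels and the function name `recode_place_of_death` are assumptions made for the example and are not the actual CLHLS variable codes.

```python
# Hypothetical response labels; the actual CLHLS codes may differ.
HOME = "at home"
HOSPITAL = "in a hospital"
NURSING_HOME = "in a nursing home"


def recode_place_of_death(response):
    """Return 0 for death at home, 1 for death in a hospital or nursing home.

    Missing or unrecognized responses return None so that they can be
    dropped explicitly instead of being silently coded.
    """
    if response == HOME:
        return 0
    if response in (HOSPITAL, NURSING_HOME):
        return 1
    return None


# Illustrative input, not survey data.
responses = [HOME, HOSPITAL, NURSING_HOME, None]
coded = [recode_place_of_death(r) for r in responses]
```

Returning `None` for unrecognized answers is a design choice for the sketch; it keeps missing values visible until the analyst decides how to handle them.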
Core independent variable: the level of urbanization
The core independent variable of interest is the level of urbanization, measured for the province in which the elderly person lived. We examine six provincial indicators: the urbanization rate (ratio of urban resident population to total population), the number of beds in medical institutions per 10,000 people, the number of practising assistant physicians per 10,000 people, the number of elderly beds per 1,000 elderly people, the number of community health service stations per 10,000 people, and the number of family general practitioners per 10,000 people. These six indicators capture three dimensions, namely the level of population urbanization, the level of regional medical care, and community medical resources, and all are continuous variables.
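As a minimal sketch of how such indicators are formed from yearbook counts, the two helper functions below compute a population share and a per-capita rate. The function names and the input figures are hypothetical and serve only to illustrate the arithmetic.

```python
def urbanization_rate(urban_residents, total_residents):
    """Ratio of urban resident population to total population."""
    if total_residents <= 0:
        raise ValueError("total_residents must be positive")
    return urban_residents / total_residents


def rate_per(count, population, per=10_000):
    """Number of units (beds, physicians, stations) per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * per / population


# Hypothetical province: 10 million residents, 6 million of them urban,
# 50,000 medical institution beds and 30,000 elderly beds for 1.5 million elderly.
urban_share = urbanization_rate(6_000_000, 10_000_000)
beds_per_10k = rate_per(50_000, 10_000_000)
elderly_beds_per_1k = rate_per(30_000, 1_500_000, per=1_000)
```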
Control Variables
The control variables in this paper mainly include the demographic, socioeconomic, and residential characteristics of elderly individuals. The demographic characteristics comprised three variables: gender (male = 1, female = 0), age, and whether the elderly person was bedridden before death (1 = yes, 0 = no). The socioeconomic characteristics comprised two variables: years of education and whether the elderly person had basic pension insurance (1 = yes, 0 = no). The residential characteristics comprised whether the elderly person lived alone before death (1 = yes, 0 = no), the primary caregiver before death (1 = spouse, 0 = other), annual per capita household income (10,000 yuan), the presence of a doctor in the community (1 = yes, 0 = no), and urban or rural residence (1 = urban, 0 = rural).
Model Setting
In the descriptive statistics, we used the chi-square test and the t-test to describe urban-rural differences in categorical and continuous variables, respectively. We also combined the data from the three surveys to graph trends in the place of death of elderly individuals. In the empirical model, we incorporated the core provincial-level urbanization variables to interpret the distribution of, and changes in, the place of death of the elderly from the perspective of provincial-level urbanization. The data are hierarchical: individuals are nested within provinces, so the place of death is assumed to be shaped by factors at both the individual and the provincial level. From a statistical point of view, the traditional binary logistic regression model rests on two assumptions, namely that the random error term is homoscedastic and that it is uncorrelated with the explanatory variables. Hierarchical data do not satisfy these assumptions, because elderly individuals in the same province share unobserved provincial characteristics, so the traditional logistic model is not appropriate here. We therefore used a hierarchical logistic model, which is designed for nested data, to analyse the distribution of and changes in the place of death of elderly individuals. The advantage of hierarchical models is that the effects of variables are estimated separately at each level of the data, which allows the effects of provincial macro variables on the dependent variable to be analysed while controlling for individual micro variables. This is in line with the purpose of this paper, which approaches the question from an urbanization perspective.
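For the descriptive comparison of a categorical variable between urban and rural residents, a Pearson chi-square test on a 2×2 table can be sketched as below. This is an illustration without continuity correction; the function name and the counts are hypothetical and are not results from the CLHLS data.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    `table` is [[a, b], [c, d]], e.g. rows = urban/rural and
    columns = died at home/died in hospital.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts for illustration only.
stat, p_value = chi_square_2x2([[30, 70], [60, 40]])
```

The closed-form p-value holds only for one degree of freedom, which is the case for a 2×2 table; larger tables would need the general chi-square distribution.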
The modelling steps of the hierarchical logistic model are as follows.
In the first step, a null model without any variables was built. This was used to check whether the data used are suitable for the hierarchical model.
First level, individual-level model.
$${y}_{ij}={\beta }_{0j}+{\epsilon }_{ij}\tag{1}$$
Second level, provincial-level model.
$${\beta }_{0j}={\gamma }_{00}+{\mu }_{0j}\tag{2}$$
Full model.
$${y}_{ij}={\gamma }_{00}+{\mu }_{0j}+{\epsilon }_{ij}\tag{3}$$
where i denotes the first-level unit, i.e., the individual elderly person; j denotes the second-level unit, i.e., the province of the elderly person; \({y}_{ij}\) is the dependent variable, the place of death of the elderly person; \({\beta }_{0j}\) and \({\gamma }_{00}\) denote intercept terms; \({\epsilon }_{ij}\) denotes the random effect at the individual level; and \({\mu }_{0j}\) denotes the random effect at the province level. The null model decomposes the random error into an individual-level component and a province-level component, whose variances are obtained by model estimation and then used to compute the intraclass correlation coefficient (ICC).
The ICC of 0.175 in this study implies that 17.5% of the variance in the place of death was attributable to provincial-level factors. According to Cohen's empirical criteria, a hierarchical model should be considered when the ICC exceeds 0.059. In addition, the individual-level variance is fixed at 3.29 (\({\pi }^{2}/3\), the variance of the standard logistic distribution) in logistic regression.
$$\text{ICC}=\frac{0.6976}{0.6976+3.29}\approx 0.175$$
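The ICC calculation can be reproduced with a few lines of code. The sketch below takes the province-level variance of 0.6976 from the equation above and uses the standard logistic variance π²/3; the function name `icc_logistic` is introduced here for illustration only.

```python
import math

# Variance of the standard logistic distribution, approximately 3.29.
LOGISTIC_VARIANCE = math.pi ** 2 / 3


def icc_logistic(province_variance):
    """Intraclass correlation for a two-level random-intercept logistic model."""
    if province_variance < 0:
        raise ValueError("variance must be non-negative")
    return province_variance / (province_variance + LOGISTIC_VARIANCE)


# Province-level variance reported for the null model in this study.
icc = icc_logistic(0.6976)
exceeds_threshold = icc > 0.059  # Cohen's empirical criterion cited in the text
```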
In the second step, the study variables were gradually added after the data were determined to be suitable for use in a hierarchical model.
First, an individual-level intercept model was constructed by adding the individual-level control variables that describe important influences on the place of death of older adults. The specific model is shown in the following equation, where \({W}_{ij}\) is the included individual control variable, and \({\beta }_{1j}\) is its corresponding coefficient.
$${y}_{ij}={\beta }_{0j}+{\beta }_{1j}\ast {W}_{ij}+{\epsilon }_{ij}\tag{4}$$
Second, provincial-level urbanization variables were introduced. Six provincial-level urbanization indicators were included in the model as the core independent variables: the population urbanization rate, the number of medical institution beds per 10,000 people, the number of practising assistant physicians per 10,000 people, the number of community health service stations per 10,000 people, the number of family general practitioners per 10,000 people, and the number of elderly beds per 1,000 elderly people. The specific model is shown in the following equation, where \({Z}_{j}\) is the core urbanization variable included at the provincial level, and \({\gamma }_{1j}\) is its corresponding coefficient.
$${\beta }_{0j}={\gamma }_{00}+{\gamma }_{1j}\ast {Z}_{j}+{\mu }_{0j}\tag{5}$$
The aggregated model is as follows.
$${y}_{ij}={\gamma }_{00}+{\beta }_{1j}\ast {W}_{ij}+{\gamma }_{1j}\ast {Z}_{j}+{\mu }_{0j}+{\epsilon }_{ij}\tag{6}$$
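To make the structure of the aggregated model concrete, the sketch below simulates data from it under a latent-variable reading, which is an assumption of this example: \({y}_{ij}\) is treated as a latent propensity with a standard logistic individual error (variance ≈ 3.29), and death in hospital is observed when the propensity is positive. All parameter values, the single standardized covariate at each level, and the function name are hypothetical.

```python
import math
import random


def simulate_place_of_death(n_provinces, n_per_province, gamma00, beta1,
                            gamma1, province_variance, seed=0):
    """Simulate (province, W_ij, Z_j, outcome) rows from a random-intercept
    logistic model with one individual and one provincial covariate."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_provinces):
        z_j = rng.gauss(0.0, 1.0)  # standardized provincial indicator (assumed)
        mu_0j = rng.gauss(0.0, math.sqrt(province_variance))  # province effect
        for _ in range(n_per_province):
            w_ij = rng.gauss(0.0, 1.0)  # standardized individual covariate
            u = rng.random()
            while u == 0.0:  # avoid log(0) in the inverse-CDF draw
                u = rng.random()
            eps_ij = math.log(u / (1.0 - u))  # standard logistic error
            latent = gamma00 + beta1 * w_ij + gamma1 * z_j + mu_0j + eps_ij
            rows.append((j, w_ij, z_j, 1 if latent > 0 else 0))
    return rows


# Hypothetical parameter values chosen only for illustration.
rows = simulate_place_of_death(n_provinces=23, n_per_province=100,
                               gamma00=-1.0, beta1=0.5, gamma1=0.8,
                               province_variance=0.6976)
share_hospital = sum(r[3] for r in rows) / len(rows)
```

Because every individual in province j shares the same draw of the province effect, outcomes within a province are correlated, which is the feature that motivates the hierarchical specification.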
Finally, cross-level interaction terms between provincial-level and individual-level variables were added to the aggregated model to examine whether provincial-level urbanization indicators moderate the effects of individual-level indicators on the place of death of older adults. The moderation effect model is shown in Eq. (7), where \({\alpha }_{1j}\) is the coefficient of the interaction term.
$${y}_{ij}={\gamma }_{00}+{\beta }_{1j}\ast {W}_{ij}+{\gamma }_{1j}\ast {Z}_{j}+{\alpha }_{1j}\ast {W}_{ij}\ast {Z}_{j}+{\mu }_{0j}+{\epsilon }_{ij}\tag{7}$$
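In the moderation model, the effect of an individual-level variable depends on the provincial indicator: collecting the terms in \({W}_{ij}\) gives a conditional slope of \({\beta }_{1j}+{\alpha }_{1j}\ast {Z}_{j}\). The sketch below evaluates this slope and the corresponding odds ratio, assuming the coefficients are on the log-odds scale; the coefficient values and function names are hypothetical.

```python
import math


def conditional_slope(beta1, alpha1, z_j):
    """Effect of W_ij on the log-odds at a given provincial value Z_j."""
    return beta1 + alpha1 * z_j


def conditional_odds_ratio(beta1, alpha1, z_j):
    """Odds ratio for a one-unit increase in W_ij at a given Z_j."""
    return math.exp(conditional_slope(beta1, alpha1, z_j))


# Hypothetical coefficients: a positive alpha1 means the individual-level
# effect becomes stronger as the provincial indicator increases.
slope_low = conditional_slope(beta1=0.5, alpha1=0.2, z_j=0.0)
slope_high = conditional_slope(beta1=0.5, alpha1=0.2, z_j=2.0)
or_high = conditional_odds_ratio(beta1=0.5, alpha1=0.2, z_j=2.0)
```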