2.1 Data
The data used here come from the national survey of the China Health and Retirement Longitudinal Study (CHARLS) collected from 2011 to 2015. The CHARLS is a biennial survey that was initiated in 2011 and is conducted by the National School of Development at Peking University. It is a nationally representative longitudinal survey that collects information on the social, economic, and health circumstances of Chinese residents aged 45 years and above and their spouses. To ensure cross-study comparability of the results, the CHARLS was harmonized with leading international research studies modelled on the Health and Retirement Study (HRS), and it is intended to provide a high-quality public micro-database with a wide range of information that serves the needs of scientific and policy research on ageing-related issues [21].
Based on multistage probability sampling, 10,257 households and 17,708 individuals were studied through face-to-face computer-assisted personal interviews. Ethical approval for this study was not required because it was based exclusively on publicly available data. All respondents were informed that their responses were legally protected by a guarantee of confidentiality.
Consistent with other studies, we find a large number of missing and extreme values in the income variables; for example, some respondents in the outpatient subsample had a household income of less than zero. In addition, income may be underestimated because of deliberate underreporting [22]. Following prior studies, we adopted the household total expenditure per capita (EPC) as a proxy for financial status, which is the sum of the household food EPC, household monthly EPC, and household yearly EPC [22, 23]. Using the international poverty line ($1.90 a day) and the Purchasing Power Parity (PPP) conversion factor of each year, we calculated the poverty line for urban and rural areas in RMB [24]. After excluding respondents with missing key variables or who did not meet the inclusion criteria, 3,760 respondents over 60 years old who lived on or below the poverty line were ultimately selected for this paper from the 2011 to 2015 data.
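The classification step described above can be illustrated with a minimal sketch. It is not the procedure used in this paper: the function names are hypothetical, the reporting periods (weekly food spending, monthly and yearly other spending) and the annualisation factors are assumptions, and the PPP factor of 3.5 is a placeholder rather than an official figure.

```python
# Illustrative sketch only: names, reporting periods and the PPP factor
# are assumptions, not values taken from CHARLS or from this paper.
DAYS_PER_YEAR = 365


def annual_epc(food_weekly, other_monthly, other_yearly, household_size):
    """Annual household expenditure per capita (EPC), in RMB.

    Assumes food spending is reported per week, regular spending per month
    and occasional spending per year; each is annualised before summing.
    """
    if household_size <= 0:
        raise ValueError("household_size must be positive")
    annual_total = food_weekly * 52 + other_monthly * 12 + other_yearly
    return annual_total / household_size


def annual_poverty_line(usd_per_day, ppp_rmb_per_usd):
    """Annual poverty line in RMB from a daily line in USD and a PPP factor."""
    return usd_per_day * ppp_rmb_per_usd * DAYS_PER_YEAR


def is_poor(epc, line):
    """A respondent counts as poor when living on or below the line."""
    return epc <= line


line = annual_poverty_line(1.90, 3.5)  # 3.5 is a placeholder PPP factor
epc = annual_epc(food_weekly=60, other_monthly=150, other_yearly=1000,
                 household_size=3)
print(is_poor(epc, line))
```

In practice the PPP factor would be looked up for each survey year, and separate lines would be derived for urban and rural areas as described in the text.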
2.2 Measurements
Dependent variables
To account for the observable differences in health needs, this study considers the one-month morbidity, chronic disease prevalence and self-reported health status of the poor elderly population [5, 6, 25]. The dependent variables in our analysis reflect the intensity of, and expenditure on, different types of healthcare utilization. We consider the following measures of health service utilization: (a) the probability of an outpatient visit during the month preceding the survey date; (b) the individual expenditure on outpatient visits during the past month; (c) the probability of being hospitalized during the year preceding the survey date; and (d) the individual expenditure on inpatient visits in the past year.
Independent variables
In this study, the independent variables are chosen based on the Andersen behavioural model (Andersen, 1968), which was introduced in the late 1960s to help understand the use of health services, define equitable access to healthcare, and assist in developing policies to equalize access to healthcare [26]. The original model related health service utilization to three groups of predictors: people's predisposition to use services, the factors that enable or impede their use of services, and their need for healthcare. An increasing number of studies have since employed this model and its variations to assess the utilization and outcomes of healthcare services for both general and vulnerable populations [20, 27, 28]. We use a modified Andersen behavioural model of health services as the theoretical framework to analyse the factors associated with health service utilization among the poor elderly. Our model includes four types of variables, namely, predisposing, enabling, need, and lifestyle variables [29].
Predisposing factors.
The predisposing component centres on the idea that some individuals have a propensity to use services more than other individuals, and this tendency can be predicted from individual characteristics prior to an illness episode. In the present paper, the predisposing factors include gender, age, education and marital status. Age has been divided into the three groups of 60 ~ 69, 70 ~ 79, and 80+ years (we labelled these three groups of elderly people as “the young-old”, “the mid-aged old” and “the eldest old”, respectively). Education has the following four categories: (1) illiterate; (2) primary school; (3) middle school; and (4) high school and above. Marital status has been divided into the two categories of (1) married (including cohabitating and the spouse being away for job purposes) and (2) unmarried (including separated, divorced or widowed).
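The age grouping above can be written as a small recoding function. This is a sketch with a hypothetical function name, and it assumes age is recorded in completed years.

```python
def age_group(age):
    """Map age in completed years to the three groups used in the text."""
    if age < 60:
        # The analytic sample is restricted to the elderly.
        raise ValueError("age must be 60 or above")
    if age <= 69:
        return "young-old"
    if age <= 79:
        return "mid-aged old"
    return "eldest old"


print([age_group(a) for a in (60, 69, 70, 79, 80)])
```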
Enabling factors.
The main idea behind this type of variable is that people may well be predisposed to using health services, but they also need some means of obtaining them. In the present paper, the enabling variables include whether the respondents have children, urban or rural residence, health insurance and an old-age pension, as well as region and travel time to health services. Region is coded numerically (1 = eastern, 2 = central, 3 = western). Health insurance is categorized as follows: uninsured = no insurance, UEMI = Urban Employee Medical Insurance, URMI = Urban Resident Medical Insurance, NCMS = New Rural Cooperative Medical Scheme, private MI = private commercial medical insurance, and other = other health insurance. The old-age pension variable is based on whether people receive benefits from any pension programme (no or yes).
Need factors.
This type of variable captures the need for healthcare and represents the most immediate cause of health service use. Generally, need includes individuals’ perceived and evaluated functional capacity, symptoms, and general state of health. In this study, the need variables include self-reported health status, physical disability, chronic diseases and limitations on activities of daily living (ADL). Self-reported health is obtained from the response to the question “Would you say your health is excellent, very good, good, fair and poor?” or “Would you say your health is very good, good, fair, poor and very poor?” We combined the answers to these two questions into the three categories of poor, fair and good. Physical disability is based on the respondents’ answer to the question “Do you have one of the following disabilities, physical disabilities?” Chronic diseases are assessed as the cumulative number of diagnosed conditions (0, 1 ~ 2 and ≥ 3). An ADL limitation indicates self-reported difficulty in any of the following activities of daily living: bathing/showering; eating; dressing; getting into or out of bed; using the toilet; or controlling urination and defecation.
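Two of the need variables above have coding rules that are explicit enough to sketch. The function names and the item keys are hypothetical labels chosen for illustration; they are not CHARLS variable names.

```python
# Hypothetical item keys for the six ADL activities listed in the text.
ADL_ITEMS = ("bathing", "eating", "dressing", "bed", "toilet", "continence")


def chronic_category(n_conditions):
    """Group the number of diagnosed chronic conditions into 0, 1 ~ 2, >= 3."""
    if n_conditions < 0:
        raise ValueError("n_conditions cannot be negative")
    if n_conditions == 0:
        return "0"
    if n_conditions <= 2:
        return "1 ~ 2"
    return ">= 3"


def has_adl_limitation(difficulties):
    """Return True if difficulty is reported for any of the six ADL items.

    `difficulties` maps an item key to True/False; missing items are
    treated as no reported difficulty (an assumption of this sketch).
    """
    return any(difficulties.get(item, False) for item in ADL_ITEMS)


print(chronic_category(2), has_adl_limitation({"dressing": True}))
```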
Health behaviour variables.
Lifestyle is measured by the following three variables: (1) smoking (No = never a smoker, Yes = smoker); (2) drinking (No = never, Yes = drinking alcohol, whether more than once a month or less than once a month); and (3) physical examination (No = not having a regular physical examination, Yes = having a regular physical examination).
2.3 Statistical analysis
A descriptive analysis is used for the demographic characteristics of the sample. Morbidity and outpatient and inpatient visits are presented as rates, and the differences between groups are examined by using the chi-square test. Subsequently, a two-part model is employed to further investigate the factors that affect the utilization of health services by the poor elderly. A two-sided p-value of < 0.05 is considered to indicate statistical significance. All statistical analyses are performed with Stata software, version 15.0.
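The chi-square comparison of rates between two groups can be sketched as follows. This is an illustration in plain Python rather than the Stata routine used for the analysis; the function name and the counts are hypothetical, and the sketch covers only the 2 × 2 case (one degree of freedom) without a continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Rows are groups; columns are (used service, did not use service).
    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 30 of 100 users in one group, 50 of 100 in the other.
stat, p = chi_square_2x2(30, 70, 50, 50)
print(stat, p, p < 0.05)
```

Tables with more than two categories need the general chi-square distribution, which is not covered by this shortcut.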
Previous studies have suggested that many individuals do not use any healthcare services during a given study period; medical cost data are therefore usually characterized by a substantial proportion of zero values and a right-skewed distribution, and they may exhibit heteroscedasticity [30]. A two-part model can address these data issues, so we use a two-part model (2PM) to analyse health service utilization in the present paper [31]. Collinearity among the independent variables is judged by the variance inflation factor (VIF): a high VIF is a sufficient condition for the presence of collinearity, and a VIF in excess of 30 is a cause for concern. Specifically, the first part of the model is a logistic model that predicts the probability of any use of health services. In Eq. (1), the dependent variable \(Z_{i}^{*}\) is the latent propensity for health service utilization, and \(\epsilon_{i} \sim \mathrm{N}(0,1)\). If \(X_{i}\alpha_{i} + \epsilon_{i} > 0\), then \(Z_{i} = 1\); otherwise, \(Z_{i} = 0\). In the second part, healthcare expenditure is analysed by a generalized linear model (GLM) with a gamma distribution and a log link, which is estimated only on the observations with positive spending [32–34]. In Eq. (3), the expected expenditure E(Y|X) is the probability of health service utilization multiplied by the expected cost conditional on being a user, and the sample average of E(Yi) gives the expected healthcare spending of the elderly. Since the second part is specified as a gamma GLM, the link function directly characterizes how the expectation of Yi is related to the regressors, which avoids the complications of a log-transformed Ordinary Least Squares model [30]. The 2PM can be explained as follows:
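As a numerical illustration only, the sketch below shows how the two parts combine into expected spending for one respondent. It assumes a logistic first part and a log-link second part, as described in the text; the function names, the covariate vector and all coefficient values are hypothetical and are not estimates from this study.

```python
import math


def _linear(x, coef):
    """Linear predictor: the sum of x_k * coef_k over all covariates."""
    if len(x) != len(coef):
        raise ValueError("x and coef must have the same length")
    return sum(xk * ck for xk, ck in zip(x, coef))


def prob_any_use(x, alpha):
    """Part 1: logistic probability of any use, Pr(Y > 0 | X)."""
    return 1.0 / (1.0 + math.exp(-_linear(x, alpha)))


def mean_cost_if_user(x, beta):
    """Part 2: log-link mean among users, E(Y | Y > 0, X) = exp(X beta)."""
    return math.exp(_linear(x, beta))


def expected_spending(x, alpha, beta):
    """Unconditional mean: Pr(Y > 0 | X) * E(Y | Y > 0, X)."""
    return prob_any_use(x, alpha) * mean_cost_if_user(x, beta)


# Hypothetical respondent: intercept, insured indicator, rural indicator.
x = [1.0, 1.0, 0.0]
alpha = [-1.0, 0.5, 0.3]   # placeholder first-part coefficients
beta = [6.0, 0.2, -0.1]    # placeholder second-part coefficients
print(expected_spending(x, alpha, beta))
```

Averaging `expected_spending` over all respondents in a sample would give the sample-level expected healthcare spending referred to in the text.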