Study sample
This study used three-wave longitudinal data from the China Health and Retirement Longitudinal Study (CHARLS), which Peking University conducted in representative regions of China in 2011, 2013, and 2015. The survey respondents were individuals aged 45 and older. The baseline wave included about 10,000 households and 17,500 individuals in 150 counties/districts and 450 villages/resident committees. CHARLS contains a rich set of individual-level information, such as demographic characteristics, family structure, household consumption, social participation, subjective and objective health status, and other related information.
This study focused on individuals who were aged 45 and older in the baseline survey and who remained in at least one of the two follow-up surveys. Furthermore, for the analysis of the onset of each diagnosed disease, individuals with an established diagnosis of that disease were removed. After further excluding respondents with missing values on key variables used in the statistical analysis, the number of individuals used in this study ranged from 5,986 to 7,009, depending on health outcomes and estimation models.
The CHARLS dataset used in this study is publicly available, and the Ethical Review Committee of Peking University in China approved its study protocol. Hence, ethical approval was not needed for this study.
Measures
The key independent variable was SP. As the dependent variables, four types of health outcomes were considered: (1) mental health, (2) SRH, (3) diseases, and (4) activities of daily living (ADL).
Social participation (SP)
Regarding SP, CHARLS asked respondents, ‘Have you done any of these activities in the last month?’, listing seven types of social activities: (a) interacting with friends, (b) playing Mah-jong, chess, cards, or going to the community club, (c) providing help to family, friends, or neighbours who do not live with you and did not pay for your help, (d) going to a sport, social, or other club activity, (e) participating in a community-related organization, (f) doing volunteer or charity work, and (g) caring for a sick or disabled adult who does not live with you and did not pay for your help. Seven binary variables, one for each SP activity, were constructed by allocating ‘1’ to the answer yes and ‘0’ otherwise. A binary variable of overall SP was constructed by allocating ‘1’ to those participating in at least one type of SP activity and ‘0’ to others.
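The coding rule above can be illustrated with a minimal Python sketch. The item keys (such as `friends` or `sp_any`) and the function name are hypothetical labels chosen for illustration, not CHARLS variable codes, and treating a missing answer as ‘0’ is an assumption based on the ‘otherwise’ wording.

```python
# Hypothetical keys for the seven SP activities (a)-(g); CHARLS uses its own codes.
SP_ITEMS = [
    "friends",            # (a) interacting with friends
    "games_or_club",      # (b) Mah-jong, chess, cards, or community club
    "informal_help",      # (c) unpaid help to non-co-resident family, friends, neighbours
    "sport_social_club",  # (d) sport, social, or other club activity
    "community_org",      # (e) community-related organization
    "volunteer",          # (f) volunteer or charity work
    "informal_care",      # (g) unpaid care for a non-co-resident sick or disabled adult
]


def build_sp_indicators(answers):
    """Return seven 0/1 activity indicators plus the overall SP indicator.

    `answers` maps an item key to True when the respondent answered yes.
    Assumption: a missing key is treated as 'otherwise' and coded 0.
    """
    indicators = {item: 1 if answers.get(item, False) else 0 for item in SP_ITEMS}
    # Overall SP equals 1 when at least one activity indicator equals 1.
    indicators["sp_any"] = 1 if any(indicators[item] for item in SP_ITEMS) else 0
    return indicators


example = build_sp_indicators({"friends": True, "volunteer": False})
print(example)
```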
Mental health
Two types of mental health scores, MH1 and MH2, were constructed as follows. First, based on the questionnaire item, ‘How would you rate your memory at the present time? Would you say it is excellent, very good, good, fair, or poor?’, MH1 was scored as follows: excellent = 5, very good = 4, good = 3, fair = 2, and poor = 1. Another mental health score, MH2, was constructed based on answers to ten questions of the Center for Epidemiologic Studies Depression Scale (CES-D), the validity of which has been confirmed among elderly Chinese [25]. Specifically, CHARLS provided ten items referring to feelings and behaviours related to mental health status during the previous week: (a) ‘I was bothered by things that don’t usually bother me’, (b) ‘I had trouble keeping my mind on what I was doing’, (c) ‘I felt depressed’, (d) ‘I felt everything I did was an effort’, (e) ‘I felt hopeful about the future’, (f) ‘I felt fearful’, (g) ‘My sleep was restless’, (h) ‘I felt lonely’, (i) ‘I could not get “going”’, and (j) ‘I was happy’. For each item, responses were scored as rarely or none of the time (< 1 day) = 7, some or a little of the time (1–2 days) = 5, occasionally or a moderate amount of the time (3–4 days) = 3, and most or all of the time (5–7 days) = 1, while the score was reversed for (e) and (j). The scores for the ten items were summed and defined as MH2 (range: 10–70). Cronbach’s α was 0.809 in the current study sample, indicating a reasonable level of internal consistency. For both MH1 and MH2, a higher value means better mental health.
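As an illustration of the MH2 scoring, the following Python sketch sums the ten item scores with items (e) and (j) reversed. The response labels and function names are hypothetical, and reading ‘reversed’ as 8 minus the score (so that 7 and 1, and 5 and 3, swap) is an assumption that is consistent with the stated range of 10–70.

```python
# Scores for the four response categories, in the order given in the text.
# The short labels are hypothetical stand-ins for the full response wording.
RESPONSE_SCORES = {"rarely": 7, "some": 5, "occasionally": 3, "most": 1}

ITEMS = "abcdefghij"
REVERSED_ITEMS = {"e", "j"}  # positively worded items: hopeful, happy


def score_item(item, response):
    """Score one CES-D item, reversing the positively worded items."""
    score = RESPONSE_SCORES[response]
    # Assumed reversal: 7 <-> 1 and 5 <-> 3, i.e. 8 - score.
    return 8 - score if item in REVERSED_ITEMS else score


def mh2(responses):
    """Sum the ten item scores; a higher total means better mental health."""
    return sum(score_item(item, responses[item]) for item in ITEMS)


# Best case: negative items 'rarely', positive items 'most'.
best = {i: ("most" if i in REVERSED_ITEMS else "rarely") for i in ITEMS}
# Worst case: negative items 'most', positive items 'rarely'.
worst = {i: ("rarely" if i in REVERSED_ITEMS else "most") for i in ITEMS}
print(mh2(best), mh2(worst))
```

The two extreme response patterns reproduce the endpoints of the stated range, which is a quick check that the reversal is applied to the intended items.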
Self-rated health (SRH)
Based on the respondents’ responses to the question about SRH, a five-point score variable of SHS was constructed as very good = 5, good = 4, fair = 3, poor = 2, and very poor = 1. A higher value means better SRH. Alternatively, a binary variable of SES was constructed as 1 = very good and good, 0 = poor and very poor; logistic regression models using it yielded similar results. Hence, only the results with the five-point score value of SRH are reported in what follows.
Activities of daily living (ADL)
CHARLS provided information about two types of ADL: the basic activities of daily living (BADL) and the instrumental activities of daily living (IADL). For BADL, CHARLS asked the respondents whether they had any difficulty in doing each of the following activities: (a) dressing, (b) bathing, (c) eating, (d) getting into or out of bed, (e) using the toilet, and (f) controlling urination and defecation. For IADL, they were asked about the level of difficulty when (a) doing household chores, (b) preparing hot meals, (c) shopping for groceries, (d) taking the right portion of medication on time, and (e) managing money. For both BADL and IADL, the responses were scored as: I don’t have any difficulty = 4, I have difficulty but can still do it = 3, I have difficulty and need help = 2, I cannot do it = 1. The scores were summed and defined as BADL (range: 6–24) and IADL (range: 5–20). Cronbach’s α was 0.833 for BADL and 0.816 for IADL in the current study sample, both indicating reasonable levels of internal consistency. For both variables, a higher value means fewer difficulties in daily living.
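The summed ADL scores and the internal-consistency statistic can be sketched in Python as follows. This is a minimal illustration using the standard formula for Cronbach’s α; the response labels, function names, and toy data are hypothetical and are not taken from CHARLS.

```python
from statistics import variance

# Hypothetical short labels for the four response categories.
ADL_SCORES = {
    "no_difficulty": 4,
    "difficulty_but_can_do": 3,
    "need_help": 2,
    "cannot_do": 1,
}


def adl_score(responses):
    """Sum item scores: six items for BADL (6-24), five for IADL (5-20)."""
    return sum(ADL_SCORES[r] for r in responses)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-score lists.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


badl_max = adl_score(["no_difficulty"] * 6)
iadl_min = adl_score(["cannot_do"] * 5)
# Toy data only: two perfectly consistent items.
alpha_toy = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
print(badl_max, iadl_min, alpha_toy)
```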
Diseases
Based on the questionnaire item, ‘Have you been diagnosed with diseases by a doctor?’, seven binary variables of diagnosed diseases were constructed: (a) hypertension or dyslipidaemia, (b) diabetes or high blood sugar, (c) heart attack or stroke, (d) cancer or malignant tumour, (e) emotional, nervous, psychiatric problems or memory-related disease, (f) stomach or other digestive disease, and (g) other disease.
Covariates
A set of covariates regarding (1) demographic and family structure, (2) socioeconomic status, (3) living environment, and (4) contextual/institutional background was included in the regression analysis. As demographic and family structure variables, (a) gender, (b) age and its square (to capture the possible nonlinearity of the relationship between age and health), (c) marital status (married, never married, or divorced/separated), (d) living with children, and (e) living with parents were considered. As socioeconomic status, (a) educational attainment (primary school or below, junior high school, senior high school [including vocational school], college or higher [including university and graduate school]), (b) Hukou (household registration; urban = 1), (c) work status (non-work, employed in public and private sectors, self-employed, and others [including working in the agricultural industry]), (d) whether having experienced retirement, and (e) household consumption per capita (quintile variables; as a proxy for household income) were considered. As for the living environment, binary variables of (a) house ownership and (b) having running water were constructed. As contextual/institutional background, (a) whether covered by public health insurance and (b) whether covered by private health insurance were considered. Finally, survey years (2011, 2013, and 2015) and regions (Eastern, Central, Western, and North-eastern; categorized by the National Bureau of Statistics of China [26]) were adjusted for by including binary variables of each.
Analytic strategy
In regression analysis, the following model was estimated: (see Equation 1 in the Supplemental Files)
Here, i and t denote an individual and wave, respectively. H and SP indicate health outcome and SP, respectively, and X indicates a vector of covariates. ui represents a set of time-invariant individual attributes, and εit is an error term. The one-wave-lagged value, Hit-1, was included in regression models for mental health, SRH, and ADL to adjust for the previous health status, while this term was not included in the regression model for each disease because the respondents who had been diagnosed with it in the previous wave were removed from the analysis. Using a one-wave-lagged variable (LV) of SP instead of its contemporaneous value is expected to mitigate biases due to reverse causality from health to SP. This model was referred to as LV1. By replacing SPit-1 with SPit-2 and Hit-1 with Hit-2, the longer-term effect of SP was examined. This model was referred to as LV2. In addition, biases due to individual heterogeneity were reduced by employing the random-effects (RE) model, which included time-invariant individual attributes (ui). This model was estimated only for LV1 and referred to as LV1+RE. Two things should be mentioned here. First, the RE model could be applied only to LV1, because only three waves of CHARLS data were available. Second, fixed-effects (FE) models were not employed, because the Hausman test did not reject the null hypothesis that ui is not correlated with the explanatory variables.
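Equation 1 itself appears only in the Supplemental Files. Based on the definitions above, the LV1+RE specification plausibly takes the following form; this is a sketch, not a reproduction of the published equation. The symbols α (intercept), γ (coefficient on lagged health), and δ (covariate coefficients) are assumed notation, as is the contemporaneous timing of X; only β, H, SP, X, ui, and εit are named in the text.

```latex
% Assumed form of the LV1+RE model (not copied from the Supplemental Files)
H_{it} = \alpha + \beta\, SP_{i,t-1} + \gamma\, H_{i,t-1}
         + X_{it}\,\boldsymbol{\delta} + u_i + \varepsilon_{it}
```

Under this reading, LV1 omits ui, LV2 replaces the subscript t-1 with t-2 for both SP and H, and the disease models drop the lagged health term.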
Regression models were further estimated separately for men and women, for three age groups (aged 45–59 years, 60–69 years, and 70 years and older), and for each SP type. The focus was on the estimated coefficient of SP (β). If SP has a positive impact on health, even after adjustment for previous health conditions and covariates, β is expected to be positive. The software package Stata (Release 16) was used for the statistical analysis [27].