3.1 Characteristics of the Respondents
Table 1 presents summary statistics for our sample, disaggregated by region of residence (rural/urban). For each variable we report four statistics: the mean, standard deviation, minimum and maximum. The final column reports the p-value from a test of the equality of means between the rural and urban samples.
We first look at demographic characteristics. Our sample is split almost evenly between rural and urban areas (48% are urban residents), and about half of the respondents are men. A typical respondent is aged 49, lives in a household with 3 members and reports an annual income of 86552 CNY ($12364).[1] In terms of education, about 9% of the respondents are illiterate, 17% finished high school, and 17% have a college degree or above. In terms of occupation, about 9% work in the public sector (including civil servants, medical workers, and teachers), 29% are farmers, 18% are manual labourers, and 17% work in the private sector. It should be noted that the socio-economic background of our respondents is likely to be better than the national average because Zhejiang province, to which Ningbo belongs, has a real GDP per capita above the national average.
We now turn to health literacy and chronic disease conditions. Table 1 shows that the rate of health literacy on CDP is 25.8%, meaning that out of 100 people living in Ningbo, about 26 are able to answer correctly 80% or more of the questions presented in Table A1 in the Appendix and are therefore considered to have adequate health literacy on CDP. This figure is much higher than the reported 2017 national level (15.7%) but similar to that for Shanghai (24.2%) [22].[1] The prevalence rate of chronic disease is 26%.[2] The most prevalent disease type is hypertension (19%), followed by diabetes (5%) and heart problems (2%). The prevalence rate of cerebrovascular diseases or cancer is low, at about 1%.[3]
Significant differences also arise between the rural and urban samples in terms of health literacy and chronic diseases. Urban residents have a higher level of health literacy on CDP and, at the same time, fewer chronic diseases. Urban residents are also significantly younger (46 vs 51 years), which we think is partly due to rural-urban migration, in which younger people from rural areas move to urban areas for better job opportunities. Urban residents tend to live with fewer household members (2.8 vs 2.9). They earn more annually (102359 vs 71462 CNY, or $14622 vs $10208) and are better educated (49% vs 19% in terms of the proportion with high school education or above). Not surprisingly, they are also more likely to work in the public sector and less likely to work as farmers.
3.2 Characteristics of Groups with Different Levels of Health Literacy
From Table 1, we find urban residents are significantly better-off: they are healthier and have a higher level of health literacy on CDP; and they are younger, better educated and wealthier. In order to investigate the relationship between chronic diseases and health literacy, we further group our respondents by their level of health literacy on CDP in Table 2 to examine their respective characteristics.
Not surprisingly, we find that the prevalence of chronic diseases is significantly lower among the group with adequate health literacy. In addition, members of this more ‘literate’ group are more likely to live in urban areas, are younger (45 vs 50), have a higher income, are better educated, and are more likely to work in the public sector or be employed in the private sector.[4] Similar patterns are observed in the rural and the urban samples (see Table A4 in the Appendix).
While we observe a lower prevalence of chronic diseases among residents with adequate health literacy, we also find that they are younger, better educated, and wealthier, all of which are factors associated with a lower likelihood of having chronic diseases. In other words, the negative relationship we observe between health literacy and chronic disease prevalence may not reflect a causal effect of health literacy on chronic disease occurrence, but may instead reflect the effect that observed characteristics, such as age and education, have on the occurrence of chronic diseases.[1] Next, we take these ‘confounders’ into account to untangle the relationship between health literacy and chronic diseases.
3.3 The Effect of Health Literacy on Chronic Diseases Occurrence
We predict the occurrence of chronic disease with a set of hierarchical equations in Table 3. In column (1) we include no covariates other than the binary health literacy variable. In columns (2)-(4), we sequentially add three blocks of variables to the equations, representing, in order of entry: region of residence, gender, income and household size; occupation; and age and education. This ordering lets us observe how far each block of variables accounts for the effect of health literacy shown in column (1). In column (5) we include the full set of covariates, which gives a ‘purer’ effect of health literacy with potential confounders partialled out.
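The logic of comparing a literacy-only equation with one that adds a confounder can be illustrated with a minimal sketch. The data below are entirely synthetic and are not drawn from the survey: the cell counts, the variable names and the single binary `old` control are assumptions chosen so that disease depends only on age group, while literacy is more common among the young. The `ols` helper is a hypothetical pure-Python least-squares routine, not the estimation code used for Table 3.

```python
def ols(X, y):
    """OLS coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# SYNTHETIC cells (assumption): (literate, old, disease, number of respondents).
cells = [
    (1, 0, 0, 6), (0, 0, 0, 4),  # younger group: nobody has a disease
    (1, 1, 1, 1), (1, 1, 0, 1),  # older and literate: 1 of 2 has a disease
    (0, 1, 1, 4), (0, 1, 0, 4),  # older and not literate: 4 of 8 have a disease
]
data = [(lit, old, dis) for lit, old, dis, n in cells for _ in range(n)]
y = [dis for _, _, dis in data]

# Literacy-only linear probability model (intercept + literacy).
naive = ols([[1, lit] for lit, _, _ in data], y)
# Same model with the age-group confounder added.
adjusted = ols([[1, lit, old] for lit, old, _ in data], y)

print(f"literacy coefficient, no controls:    {naive[1]:+.3f}")
print(f"literacy coefficient, age controlled: {adjusted[1]:+.3f}")
```

In this toy setting the literacy coefficient is clearly negative without controls and essentially zero once age group is included, because the raw gap is produced entirely by the age composition of the two literacy groups. The survey estimates behave differently in detail (the sign reverses), but the mechanism by which an added block absorbs the raw association is the same.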
The first equation in column (1) reveals that adequate health literacy on CDP is associated with a reduction of 4.8 percentage points in the likelihood of having a chronic disease. This merely restates what we observed in Table 2. The second equation in column (2), which adds gender, annual income and number of household members, shows that higher income is also associated with a lower likelihood of having chronic diseases, and the effect of health literacy remains negative despite a small reduction in magnitude. The effect of household size is also significant: respondents living in larger households are less likely to report having chronic diseases. Results in column (3) show that occupation is also a strong predictor of the respondent’s chronic condition. Compared to those working in the public sector, farmers are 24 percentage points more likely to have a chronic disease, and manual labourers 11 percentage points more likely. More importantly, with the inclusion of occupation, the effect of health literacy is half its previous size, implying that occupation explains away part of the negative effect of health literacy on chronic diseases. In column (4), we include age and education. The effect of health literacy changes sign and is significant at the 10% level, implying that a higher level of health literacy ‘increases’ rather than ‘decreases’ the likelihood of having chronic diseases. The size of this effect is not negligible, at about 1.8 percentage points.
The effects of age and education are as expected. Those who are younger and better educated are less likely to have chronic diseases. These effects are significant both statistically and economically, suggesting that age and education are important predictors of chronic disease. There is also a substantial increase in the R-squared reported at the bottom of the table in column (4) compared to columns (1)-(3), implying that age and education are the main confounders of the relationship between health literacy and chronic diseases observed in column (1). In column (5) we include the full set of covariates, and the estimate for health literacy is unaltered compared to column (4).[1] Similar patterns are observed in the separate rural and urban samples (Tables A5 and A6 in the Appendix).[2]
To explore further how our results vary with the age of the respondents, we split the sample by age group; the results are reported in Table A7. We find that the positive association between health literacy and chronic disease is present only among those aged 60-69 and is absent in the two younger age groups.[3]
3.4 The Effect of Chronic Disease on Health Literacy
In this section we explicitly estimate a model that predicts the probability that a respondent has adequate ‘health literacy on CDP’. Again, we use the LPM, and our main results are reported in Table 4. The results differ between the rural and urban samples: we first discuss the urban results in Panel A as a benchmark and then highlight the differences in the rural results in Panel B.
Controlling for a series of respondent characteristics (gender, annual income, household size, occupation, age and education), we find that those with at least one type of chronic disease are 3 percentage points more likely to be classified as having adequate health literacy, a statistically significant difference (column 1). Because our data are cross-sectional, we cannot say that chronic diseases help a respondent acquire health literacy on CDP; to do so, we would need to measure the change in health literacy before and after the diagnosis of chronic diseases. Although we do not have such retrospective data, we can compare the level of health literacy between those whose first chronic disease was diagnosed less than one year ago and those whose first chronic disease was diagnosed much earlier. This is what we do in the second equation, reported in column (2). It shows that respondents whose first chronic disease was diagnosed within the previous year are more likely to have adequate health literacy than those without chronic diseases. This effect, however, is absent among those whose first chronic disease was diagnosed 2-4 years ago or earlier. In addition, having more chronic conditions appears to increase the likelihood of having adequate health literacy, as shown in column (3), but this difference is not statistically significant.
Next, we examine whether this relationship is related to specific types of disease. We do this by replacing the number of chronic conditions with six dummy variables indicating the types of disease in column (4). We find that having hypertension is associated with an increase of 4 percentage points in the likelihood of having adequate health literacy (a 14% increase over the base rate of health literacy in urban areas, 28.6%). Insignificant results for other disease types are not reported. It is worth noting that effects for diseases with low prevalence, such as cancer (less than 1%), may not be detectable in this sample. For these variables there is insufficient variation, so standard errors are likely to be large and significant results are less likely to be found.
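The relative size of the hypertension effect follows from dividing the coefficient by the urban base rate. The short sketch below reproduces that arithmetic using only the two figures quoted in the text; the variable names are illustrative.

```python
base_rate = 0.286  # share of urban respondents with adequate health literacy
effect = 0.04      # hypertension coefficient, in probability units

# Relative increase implied by a 4 percentage point effect on a 28.6% base.
relative_increase = effect / base_rate
print(f"{relative_increase:.1%}")  # prints 14.0%
```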
Next, we move on to the results for the rural sample in Panel B. These differ significantly from the urban results in terms of the effects of duration and of disease type, as shown in columns (2) and (4). In column (2), rural respondents whose first chronic disease was diagnosed more than 5 years ago are significantly less likely to have adequate health literacy on CDP than those without any chronic disease. In column (4), heart problems are the only disease type that is significantly associated with having adequate health literacy on CDP among rural residents.
The effects of the other variables, reported in Table A8 in the Appendix, have the expected signs. For example, those who work in the public sector are more likely to have adequate health literacy than farmers; older respondents are less likely to have adequate health literacy (although this is significant only in rural areas); and higher education is associated with an increase in the likelihood of having adequate health literacy on CDP. For the urban sample in particular, we find a positive association between household size and having adequate health literacy on CDP.
3.5 The Interaction between Health Literacy and Chronic Diseases
We now return to the question we asked at the beginning, but in a slightly different form. If being diagnosed with a chronic disease also improves people’s health literacy on CDP, could this improvement reduce the risk of developing a new chronic disease? That is to say, does a higher level of health literacy reduce a patient’s likelihood of developing a comorbidity? For example, we might be interested in knowing whether having adequate health literacy reduces the likelihood of having another disease, such as hypertension, if the patient has been diagnosed with diabetes. We address this question by including an interaction term between health literacy and diabetes and estimating its effect on the occurrence of hypertension. A negative interaction implies that the association between health literacy and hypertension depends on whether the respondent has had diabetes: health literacy is associated with a larger reduction (or a smaller increase) in the likelihood of hypertension among those who have had diabetes than among those who have not.[1]
We estimated the above specification for every combination of predicted disease and explanatory disease (there are ten such pairs, given that we have five types of chronic disease of interest). We did this separately for the rural and urban samples. In the urban sample, we find five pairs of disease types with a non-negligible interaction effect, and we report these in Table 5. We find no such effects in the rural sample; separate results for the rural sample are available upon request.
In columns (1)-(2), we predict the probability of having comorbid cerebrovascular diseases. As expected, having heart problems raises the likelihood of cerebrovascular disease by 6 percentage points when an individual does not have adequate health literacy on CDP. The coefficient on health literacy is not significantly different from zero, meaning that health literacy has little role to play in preventing an individual from having cerebrovascular disease as the first chronic disease. However, if an individual has had heart problems, having adequate health literacy reduces the likelihood of having cerebrovascular disease by 7 percentage points. This interaction effect could more than offset the comorbid effect of having heart problems. In column (2), we replace heart problems with cancer and again predict the probability of having comorbid cerebrovascular diseases. Having cancer is associated with a higher probability of having cerebrovascular diseases (by 5 percentage points), and the interaction effect is a reduction of 6 percentage points, at borderline significance, which again could more than compensate for the positive comorbid disease effect.[1]
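How the interaction term offsets the comorbid effect can be made explicit with a small sketch based on the column (1) figures quoted above. The coefficient on health literacy itself is not given in the text (only that it is insignificant), so it is set to zero here as an assumption; the variable and function names are illustrative and the intercept is omitted because only differences in predicted probability are computed.

```python
# Column (1) of Table 5 as described in the text, in probability units.
b_heart = 0.06         # heart problems, without adequate health literacy
b_literacy = 0.0       # ASSUMPTION: value not reported, only that it is insignificant
b_interaction = -0.07  # heart problems x adequate health literacy


def heart_effect(literate: int) -> float:
    """Change in P(cerebrovascular disease) associated with heart problems."""
    return b_heart + b_interaction * literate


def literacy_effect(heart: int) -> float:
    """Change in P(cerebrovascular disease) associated with health literacy."""
    return b_literacy + b_interaction * heart


effect_not_literate = heart_effect(0)  # comorbid effect without literacy
effect_literate = heart_effect(1)      # comorbid effect with literacy

print(f"heart problems, not literate: {effect_not_literate:+.2f}")
print(f"heart problems, literate:     {effect_literate:+.2f}")
print(f"literacy, with heart problems: {literacy_effect(1):+.2f}")
```

Under these figures the comorbid effect of heart problems is 6 percentage points for respondents without adequate health literacy and about minus 1 percentage point for those with it, which is the sense in which the interaction more than offsets the comorbid effect.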
In column (3), we predict the probability of having comorbid heart problems given cerebrovascular disease (the reverse of the case in column 1). Having cerebrovascular disease is strongly associated with a respondent’s likelihood of having heart problems when the respondent does not have adequate health literacy on CDP. The interaction effect is large: if a respondent has had cerebrovascular disease, health literacy on CDP is associated with a reduction of 23.4 percentage points in the risk of having heart problems.
In columns (4)-(5), we predict the probability of having comorbid diabetes. The interaction effect is insignificant but sizable, showing health literacy reduces the likelihood of having diabetes by 4 percentage points if a respondent has heart problems. Similarly, health literacy reduces the likelihood of having diabetes by 16.4 percentage points if a respondent has cerebrovascular diseases.
3.6 Sensitivity Analyses
In this section, we examine the sensitivity of our main results. We added regional fixed effects (112 dummies indicating neighbourhood committees/villages) and re-estimated the results in Table 5 (see Panel A in Table A9 in the Appendix). The interaction effects become larger in size but their significance is unchanged, showing that our findings are not confounded by heterogeneity among respondents from different neighbourhood committees/villages.[2] Next, we apply the sample weights (see Panel B in Table A9 in the Appendix). The one noticeable difference is that the interaction for cancer falls in size and significance; all other results are similar.
In Section 3.5, we analysed the effect of health literacy on several chronic disease outcomes, so there is a risk of false positives arising from testing multiple hypotheses. If we treated the ten pairs of chronic diseases as independent of each other, each with a true interaction effect of zero, then for α = 0.1 the likelihood of finding at least one false positive would be 0.6513.[3] The likelihood that, as in this paper, five out of ten pairs turned out significant by chance would be a mere 0.00149.[4] However, the chronic disease outcomes should not be considered uncorrelated. We therefore tested our results in a seemingly unrelated regression (SUR) framework, which allows for correlation between the tested outcomes. First proposed by [23], the SUR model estimates a system of linear equations whose errors are correlated across equations for a given individual but uncorrelated across individuals. We find that the results are almost identical to those reported in Table 5. These results are not reported but are available upon request.
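The two probabilities quoted above follow from the binomial distribution under the stated simplification of ten independent tests at a 10% significance level. The sketch below reproduces them; the variable names are illustrative, and the last quantity (five or more significant pairs) is an additional figure not reported in the text.

```python
from math import comb

alpha = 0.10   # significance level of each test
n_tests = 10   # number of disease pairs, assumed independent


def binom_pmf(k: int) -> float:
    """P(exactly k of n_tests are significant by chance)."""
    return comb(n_tests, k) * alpha**k * (1 - alpha) ** (n_tests - k)


# Probability of at least one false positive among the ten tests.
p_at_least_one = 1 - (1 - alpha) ** n_tests
# Probability that exactly five tests are significant by chance.
p_exactly_five = binom_pmf(5)
# Probability that five or more are significant by chance (not in the text).
p_five_or_more = sum(binom_pmf(k) for k in range(5, n_tests + 1))

print(f"at least one false positive: {p_at_least_one:.4f}")  # 0.6513
print(f"exactly five by chance:      {p_exactly_five:.5f}")  # 0.00149
print(f"five or more by chance:      {p_five_or_more:.5f}")  # 0.00163
```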
Although the LPM is easier to interpret, it may suffer from several problems: the error terms are not normally distributed, they are heteroskedastic, and predicted values can fall outside the logical bounds of 0 and 1. We re-estimated Tables 3 and 5 using a logit model and found similar results (reported in Table A10 and Table A11).[5]
Although defining health literacy as a binary outcome makes interpretation easier and allows comparison with national statistics, we also explore treating the level of health literacy on CDP as continuous, with scores ranging from 0 to 12, and repeat the analyses in Tables 3, 4 and 5. Our key findings are unchanged. For example, the negative effect of health literacy on chronic disease occurrence changes sign and becomes insignificant after controlling for age and education (see Table A12). Hypertension (in the urban sample) and heart problems (in the rural sample) are found to be significantly associated with a higher score in health literacy on CDP (see Table A13). In particular, among the ten pairs of chronic diseases, we find that four have significantly negative interactions, suggesting that health literacy is negatively associated with having a comorbid condition (see Table A14).