Study design and setting
Data for this analysis come from the China Health and Retirement Longitudinal Study (CHARLS), a nationally representative survey of the elderly in China, designed by the National School of Development (China Center for Economic Research) together with the Institute for Social Science Survey at Peking University. The baseline wave of CHARLS was fielded in 2011 and included about 10000 households and 17500 individuals in 150 counties/districts and 450 communities. The multistage sample was drawn at each stage using probability-proportional-to-size random sampling. The survey collected detailed information on demographic background, socioeconomic status, health status and functioning. The analysis draws on data from Wave 3 (2015), because that wave collected reliable venous blood samples, which allow us to estimate the discrepancy between self-reported diseases and underlying biomarker levels among CHARLS respondents. Detailed information about the CHARLS blood sampling procedure and data quality management has been published previously [14].
We restricted the sample to people aged 40 to 85 years with valid self-reported disease information and biomedical tests. In Wave 3 (2015), 20284 respondents were asked to consent to a venous blood draw and blood pressure measurement; 13013 provided venous blood and 16406 provided blood pressure information. After further restriction to respondents who also provided detailed sociodemographic information, the final sample sizes for the hypertension and diabetes analyses were 14462 and 12189, respectively. Characteristics of the sample are summarized in Table 1[1].
Table 1 Characteristics of the sample, China Health and Retirement Longitudinal Study
Variable | Total (N=19292) | Hypertension (N=14462) | Diabetes (N=12189)
Education | | |
Illiterate | 24.4% (4707) | 25.4% (3673) | 25.6% (3120)
Primary education | 45.4% (8759) | 46.2% (6681) | 45.9% (5595)
Secondary education and above | 30.2% (5826) | 28.4% (4107) | 28.5% (3474)
Hukou | | |
Urban | 23.0% (4437) | 20.3% (2936) | 20.3% (2474)
Rural | 77.0% (14855) | 79.7% (11526) | 79.7% (9715)
Drinking | | |
None | 73.2% (14122) | 73.6% (10644) | 73.8% (8995)
Less than 3 days a month | 6.2% (1196) | 6.1% (882) | 5.9% (719)
Once or 2 to 3 days a week | 6.4% (1235) | 6.3% (911) | 6.2% (756)
4 to 6 days a week or daily | 8.8% (1698) | 8.5% (1229) | 8.5% (1036)
Twice a day or above | 5.4% (1042) | 5.5% (795) | 5.6% (683)
Number of cigarettes/day | 1.7 (19292) | 1.7 (14462) | 1.7 (12189)
Sex | | |
Female | 52.3% (10090) | 53.5% (7737) | 54.0% (6582)
Male | 47.7% (9202) | 46.5% (6725) | 46.0% (5607)
Age | | |
40-49 | 21.5% (4148) | 19.5% (2820) | 18.6% (2267)
50-59 | 32.1% (6193) | 31.8% (4599) | 32.0% (3900)
60-69 | 29.8% (5749) | 31.5% (4556) | 32.4% (3949)
70-79 | 13.6% (2624) | 14.3% (2068) | 14.4% (1755)
80 and above | 3.0% (579) | 2.9% (419) | 2.6% (317)
Marriage | | |
Unmarried | 12.2% (2354) | 12.2% (1764) | 12.2% (1487)
Married | 87.8% (16938) | 87.8% (12698) | 87.8% (10702)
Measurement
Self-reports and biomedical measurements of hypertension and diabetes
Self-reported data on hypertension and diabetes were obtained with the question, “Have you been diagnosed with [conditions listed below, read one by one] by a doctor?” The question offered 14 options: “Hypertension, Dyslipidemia, Diabetes, Cancer or malignant tumor, Chronic lung diseases, Liver disease, Stroke, Heart problems, Kidney disease, Stomach or other digestive disease, Emotional, nervous, or psychiatric problems, Memory-related disease, Arthritis or rheumatism and Asthma” [1]. If a respondent reported hypertension or diabetes, we coded self-reported hypertension or diabetes as 1, otherwise as 0.
Blood pressure was measured three times (approximately 45 seconds apart) on a single occasion, using an electronic monitor. We excluded the first reading, to limit the white-coat effect, and took the average of the last two readings. Biomedical hypertension was defined as a systolic blood pressure ≥140 mm Hg and/or a diastolic blood pressure ≥90 mm Hg and/or current use of antihypertensive medication, following the WHO guideline [15]. Biomedical diabetes was determined from venous blood data, which provided glycated hemoglobin (HbA1c). The diagnostic criterion for diabetes in our study was an HbA1c value ≥6.5%: if a respondent’s HbA1c was 6.5% or higher, we coded biomedical diabetes as 1, otherwise as 0. Although HbA1c may not be the most widely used screening test, it can be measured at any time of day regardless of the duration of fasting or the content of the previous meal. For this reason, HbA1c has been used in many surveys [16].
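The biomedical definitions above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the argument layout (three readings per respondent, in measurement order) and the choice to raise an error when a reading is missing are assumptions made for illustration. Only the cut-offs (140/90 mm Hg, HbA1c 6.5%) and the rule of averaging the last two readings come from the text.

```python
from statistics import mean

SBP_CUTOFF = 140.0   # mm Hg, systolic threshold stated in the text
DBP_CUTOFF = 90.0    # mm Hg, diastolic threshold stated in the text
HBA1C_CUTOFF = 6.5   # percent, HbA1c threshold stated in the text


def biomedical_hypertension(systolic, diastolic, on_medication):
    """Return 1 if the respondent meets the hypertension definition, else 0.

    systolic, diastolic: the three readings in measurement order (assumed layout).
    on_medication: True if the respondent currently uses antihypertensive drugs.
    """
    if len(systolic) != 3 or len(diastolic) != 3:
        # Assumption: incomplete measurements are rejected rather than imputed.
        raise ValueError("expected three readings per respondent")
    # Exclude the first reading and average the last two.
    sbp = mean(systolic[1:])
    dbp = mean(diastolic[1:])
    return int(sbp >= SBP_CUTOFF or dbp >= DBP_CUTOFF or on_medication)


def biomedical_diabetes(hba1c_percent):
    """Return 1 if HbA1c is at or above the 6.5% cut-off, else 0."""
    return int(hba1c_percent >= HBA1C_CUTOFF)
```

In this sketch a high first reading does not by itself classify a respondent as hypertensive, because only the second and third readings enter the average.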
Economic resources
Educational attainment was measured at three levels and indicated the highest level of education attained by the respondent at the time of the survey. The lowest category comprised individuals with no formal education (illiterate); the intermediate category ranged from unfinished primary school and home schooling to completed elementary school; and the highest category comprised respondents with a middle school education or above. Hukou was measured by whether the respondent held a rural hukou (0 = No, 1 = Yes).
Health behaviors
Drinking was a 5-category variable indicating the frequency of drinking in the past year: none (coded as 1), less than 3 days/month (coded as 2), once or 2 to 3 days/week (coded as 3), 4 to 6 days/week or daily (coded as 4), and twice a day or above (coded as 5). Smoking was a continuous variable indicating the number of cigarettes smoked per day, which ranged from 0 to 100.
Demographic characteristics
Gender was a binary variable: male (coded as 1), female (coded as 0). Age was a 5-category variable: 40-49, 50-59, 60-69, 70-79, and 80 and above. Marital status was a time-varying covariate indicating whether the respondent was currently married: separated, divorced, widowed and never married (coded as 0), married or partnered (coded as 1).
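As an illustration of the coding described above, the age and marital status variables could be derived as in the following Python sketch. The function names and the marital status labels are hypothetical; the actual CHARLS variable names and response codes are not given in this section.

```python
# Hypothetical status labels; the real survey codes may differ.
MARRIED_STATUSES = {"married", "partnered"}


def age_group(age):
    """Map age in years to one of the five categories used in Table 1."""
    if age < 40:
        # Assumption: respondents under 40 fall outside the analytic sample.
        raise ValueError("age below the analytic sample range")
    if age >= 80:
        return "80 and above"
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


def married_indicator(status):
    """Return 1 for married or partnered respondents, 0 for all other statuses."""
    return int(status.strip().lower() in MARRIED_STATUSES)
```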
Analytic strategy
Our first step was to assess the difference in prevalence estimates between the two data collection methods. The prevalence of hypertension and diabetes was calculated both from self-reported information and from the results of the biomedical measurements obtained in CHARLS. We use the formula to calculate the degree of underestimation as follows.
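The comparison of the two data sources can be sketched in Python as a cross-classification of self-reports against biomedical results. This is a minimal illustration, not the study's code: the function name and input format (two equal-length lists of 0/1 indicators) are assumptions, and the relative underestimation computed here, the gap between biomedical and self-reported prevalence divided by biomedical prevalence, is one plausible definition assumed for illustration and is not necessarily the formula used by the authors. The false negative and false positive rates are expressed conditional on biomedical status, which is also an assumption.

```python
def compare_reports(self_report, biomarker):
    """Cross-classify 0/1 self-reports against 0/1 biomedical results."""
    if len(self_report) != len(biomarker):
        raise ValueError("inputs must have the same length")
    tp = fn = fp = tn = 0
    for reported, measured in zip(self_report, biomarker):
        if measured == 1 and reported == 1:
            tp += 1   # disease present and reported
        elif measured == 1 and reported == 0:
            fn += 1   # disease present but not reported
        elif measured == 0 and reported == 1:
            fp += 1   # reported but not confirmed by the measurement
        else:
            tn += 1   # neither reported nor measured
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one measured case and one non-case")
    n = tp + fn + fp + tn
    self_prev = (tp + fp) / n
    bio_prev = (tp + fn) / n
    return {
        "self_reported_prevalence": self_prev,
        "biomedical_prevalence": bio_prev,
        # Assumed definition: relative gap between the two prevalence estimates.
        "relative_underestimation": (bio_prev - self_prev) / bio_prev,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_negative_rate": fn / (tp + fn),   # equals 1 - sensitivity
        "false_positive_rate": fp / (tn + fp),   # equals 1 - specificity
    }
```

With toy data such as `compare_reports([1, 1, 0, 0, 0, 0, 1, 0, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0, 0, 0])`, half of the measured cases are self-reported, so the sensitivity is 0.5.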
To assess the accuracy of self-reported data, we also calculated sensitivity, specificity, false negative reporting and false positive reporting. Sensitivity or specificity alone is of little practical use in helping the clinician estimate the probability of disease in individual patients [17]. In addition, sensitivity and specificity assess group-specific errors among people with diagnosed or undiagnosed disease, respectively, but not overall error. We therefore identified both the total error and the group-specific errors and assessed the sociodemographic characteristics correlated with misreporting (sensitivity, specificity, false negative reporting and false positive reporting). Controlling for education, hukou, drinking, number of cigarettes/day, age, gender and marital status, we applied binary and multinomial logistic regression analyses; the multinomial model was used because the total error outcome (correct reporting, false negative reporting and false positive reporting) has more than two categories. The model equations are set as follows[1]: