Trends in Cancer Diseases Prevalence by Different Socioeconomic Strata in the United States



Abstract
Background
Income disparity among different socioeconomic strata in the United States has widened sharply in recent decades. Given the well-established link between income and health, this widening income gap may provide insight into the dynamics of the cancer disease burden in American adults. We aimed to assess the temporal trends of the 20-year predicted absolute cancer risk in American adults across different socioeconomic classes.

Methods
The cross-sectional analyses used data from adults aged 20 to 85 years who participated in NHANES between 1999 and 2018. Socioeconomic status was divided into three groups based on the family income to poverty ratio (PIR): high income (PIR ≥ 4), middle income (PIR > 1 and < 4), and at or below the federal poverty level (PIR ≤ 1).

Results
The analysis included 49,720 participants. The prevalence of lung cancer was lower in high-income participants than in middle-income participants (0.15% vs 0.35%).

Conclusions
The study found that differences in the prevalence of cancer diseases among NHANES participants of different socioeconomic classes widened from 1999 to 2018. Further research is required on the dynamics and health impact of income inequality, as well as on public health policies and efforts to reduce these inequalities.

Background
Malignant tumors are a major public health problem that has attracted worldwide attention. According to the World Health Organization [1], 7 out of every 10 deaths in the world are due to non-communicable diseases, among which the leading cause of death is cardiovascular disease and the second is cancer. According to the International Agency for Research on Cancer, there were an estimated 19.29 million new cancer cases worldwide in 2020, and the number of new cases is projected to reach 28.4 million by 2040, an increase of 47% compared with 2020 [2]. Many studies have confirmed the role of socioeconomic status in cancer mortality and survival [3][4][5][6][7]; however, there are few studies on the relationship between socioeconomic status and cancer prevalence [8][9][10][11].
Over the past few decades, income inequality in the United States (US) has risen to its highest level [12]. The relationship between income and health is well established: higher income is associated with better health [13][14][15][16][17]. Health inequalities arise when individuals in a society have unequal access to rights and to the key determinants of health, including, but not limited to, freedom from discrimination, healthier food, clothing, better housing, education, health awareness and health care. Study data showed that the life expectancy of 65-year-old men in the highest-income group was estimated at 23.5 years, 7.9 years higher than that of men in the lowest-income group [18]. Similarly, 65-year-old women in the lowest-income group had a life expectancy of 17.9 years, 6.8 years lower than that of women in the highest-income group [18]. There are socioeconomic differences in health, and individuals with lower socioeconomic status (SES) have a higher risk of mortality and morbidity than individuals with higher SES [19].
To our knowledge, few studies have compared the prevalence of cancer risk factors across socioeconomic classes. According to GLOBOCAN 2020, lung cancer remained the leading cause of cancer death, with an estimated 1.8 million deaths (18%), followed by colorectal (9.4%), liver (8.3%), stomach (7.7%), female breast (6.9%) and esophageal (5.5%) cancers [2]. Therefore, the main aim of the study was to assess temporal trends in the 20-year predicted absolute risk of these six cancers in adults from three socioeconomic strata of the US: adults with high income, adults with middle income and adults with incomes at or below the federal poverty level.

Study Population
The National Center for Health Statistics (NCHS) established the National Health and Nutrition Examination Survey (NHANES), a series of cross-sectional surveys that use a complex multistage probability design to obtain representative samples of the non-institutionalized civilian population residing in the 50 states and the District of Columbia in the US. Details of these studies regarding sampling methods, survey instruments, and data collection have been published elsewhere [20][21][22][23]. The NHANES research was approved by the NCHS Research Ethics Review Committee. We analyzed data from the survey interview and physical examination within continuous NHANES (1999-2018, n = 102,956). For this analysis, the study population was limited to adults ≥ 20 years of age who had available data on family income to poverty ratio (PIR) and cancer or malignancy information (n = 49,720) (Fig. 1).

Covariate evaluation
The exposure variable was socioeconomic status, evaluated according to PIR. The PIR of each family is calculated from the relationship between self-reported family income and the poverty line, taking into account family size and calendar year. A PIR of 1 or less indicates income at or below the official poverty threshold, while a PIR higher than 1 indicates income above the poverty level. PIR is comparable across survey years because the income thresholds are updated annually for inflation [24]. We divided the participants into three groups: adults with high income (PIR ≥ 4), adults with middle income (PIR > 1 and < 4), and adults at or below the federal poverty level (PIR ≤ 1). We selected the cut-off between middle-income and high-income adults according to the thresholds used by the Patient Protection and Affordable Care Act, under which adults with a PIR between 1 and 4 are eligible for insurance subsidies, while adults with a PIR above 4 are not.
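The three-way grouping described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs, not the code used in the study; the function name `classify_income` and the group labels are hypothetical, and treating a missing PIR as "no group" is an assumption that mirrors the exclusion of participants without PIR data.

```python
from typing import Optional


def classify_income(pir: Optional[float]) -> Optional[str]:
    """Map a family income to poverty ratio (PIR) to one of three strata.

    Cut-offs follow the text: PIR <= 1, 1 < PIR < 4, PIR >= 4.
    Returns None when PIR is missing (assumption: such records are excluded).
    """
    if pir is None or pir != pir:  # None or NaN means income data are missing
        return None
    if pir < 0:
        raise ValueError("PIR cannot be negative")
    if pir <= 1:
        return "at_or_below_poverty"  # hypothetical label
    if pir < 4:
        return "middle_income"        # hypothetical label
    return "high_income"              # hypothetical label


# Illustrative values only, not study data.
for value in (0.8, 1.0, 2.5, 4.0, None):
    print(value, classify_income(value))
```

Note that the boundaries are asymmetric: a PIR of exactly 1 falls in the poverty group, while a PIR of exactly 4 falls in the high-income group.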
Information about age, race, marital status, insurance status, education level, citizenship status, alcohol consumption, smoking in the past month, physical activity and family income was self-reported. Participants received a medical examination in which weight, standing height and waist circumference were measured in a standardized way.
Race and ethnicity were divided into four categories: non-Hispanic white; non-Hispanic black; Mexican American; and other, including other Hispanic, Asian and multiracial participants. Body mass index (BMI) was defined as body weight (kg) divided by the square of height (m). We divided education level into less than high school, high school graduation or a general educational development certificate, and more than high school. Alcohol consumption was assessed by self-report and classified as non-drinker, less than 2 drinks per week, and 2 or more drinks per week. Smoking was coded as never smoker, former smoker and current smoker. Participants who had smoked fewer than 100 cigarettes over their lifetime were classified as never smokers. Former smokers were defined as people who had smoked more than 100 cigarettes in their lifetime but had given up smoking. Current smokers were defined as those who had smoked more than 100 cigarettes in their lifetime and still smoked. Physical activity was assessed by the number of moderate- to high-intensity activities (such as walking, jogging, running, swimming, cycling, dancing or yard work) per week, while lack of physical activity was defined as never doing moderate- or high-intensity activities.
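The BMI and smoking definitions above translate directly into code. The following Python sketch is illustrative only: the function names are hypothetical, height is assumed to be supplied in metres, and a participant with exactly 100 lifetime cigarettes is assumed to count as an ever-smoker, because the text specifies only "fewer than" and "more than" 100.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def smoking_status(lifetime_cigarettes: int, smokes_now: bool) -> str:
    """Classify smoking status from lifetime cigarette count and current smoking.

    Assumption: exactly 100 lifetime cigarettes is treated as ever-smoker.
    """
    if lifetime_cigarettes < 100:
        return "never"
    return "current" if smokes_now else "former"


# Illustrative values only, not study data.
print(round(bmi(70.0, 1.75), 1))
print(smoking_status(50, False), smoking_status(500, False), smoking_status(500, True))
```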

Statistical analysis
Because of the complex sampling design of NHANES, all analyses incorporated the NHANES survey weights, primary sampling units and strata [20]. A P value < 0.05 was considered statistically significant. Analyses were conducted using IBM SPSS statistical software (version 24, IBM, Armonk, NY, USA) and Stata statistical software (version 16.0; Stata Corp, College Station, TX, USA).
Continuous variables are expressed as mean ± standard deviation (SD), while categorical variables are expressed as numbers and proportions. We used the Chi-square test for categorical variables, one-way ANOVA for normally distributed continuous variables and the Kruskal-Wallis test for skewed continuous variables. To examine the prevalence differences across income groups, we performed descriptive statistics and the Chi-square test followed by Bonferroni correction to account for multiple comparisons. To confirm the prevalence changes in the six cancer diseases during consecutive surveys, we calculated the prevalence of each outcome by descriptive statistics.
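As an illustration of the pairwise comparison step, the Python sketch below runs a Pearson Chi-square test on each pair of income groups and applies a Bonferroni correction. It is a simplified, unweighted sketch: it ignores the NHANES survey weights and design that the actual analyses incorporated, it is not the SPSS or Stata procedure used in the study, and the counts are hypothetical. The function names are also assumptions introduced for the example.

```python
import math
from itertools import combinations
from typing import List, Tuple


def chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson Chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def bonferroni(p_values: List[float]) -> List[float]:
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unweighted counts per group: (reported cancer, did not report cancer).
counts = {"poverty": (30, 9970), "middle": (35, 9965), "high": (15, 9985)}

pairs = list(combinations(counts, 2))
raw = [chi2_2x2(*counts[g1], *counts[g2])[1] for g1, g2 in pairs]
for (g1, g2), p, p_adj in zip(pairs, raw, bonferroni(raw)):
    print(f"{g1} vs {g2}: P = {p:.4f}, Bonferroni-adjusted P = {p_adj:.4f}")
```

With three income groups there are three pairwise comparisons, so each raw P value is multiplied by 3 before being compared with the 0.05 threshold.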

Trends in Cancer Disease Prevalence
When cancer risk factors were included in the model, the risk trend did not change and the difference remained not statistically significant (Supplemental Tables 7-12).

Association Between Cancer Disease and Other Variables
Both logistic regression analysis models suggest that, in general, older age is associated with an increased likelihood of reporting cancer disease. The ORs of cancer disease ranged from 4.729 (95% CI,  Table 1-6). When cancer risk factors were included in the second model, both groups had a generally lower probability of reporting cancer disease with higher education (Supplemental Table 7-12).

Discussion
The prevalence of lung cancer in high-income participants was lower than that in middle-income participants (0.15% vs 0.35%). After controlling for demographic variables and cancer risk factors, the model suggested that individuals in the highest-income group were less likely to report lung cancer than those in the middle-income and low-income groups. According to a previous study, lung cancer was relatively more common in low-income communities [25]. It is well known that the inverse correlation between socioeconomic status and smoking prevalence at least partly explains the strong correlation between socioeconomic status and lung cancer incidence [26][27][28].
Some studies have shown a strong, statistically significant correlation between community income and survival in breast cancer [29,30]. Additionally, we found an association between income level and breast cancer: the odds of reporting breast cancer were higher in the middle-income and high-income strata. We are not sure whether the higher rates of breast cancer observed in high-income populations reflect real changes in the biological incidence of the disease. One hypothesis is that the significant gradient in cancer incidence stems from higher case detection rates in the more affluent sectors of the population. Income and education level are positively correlated with disease awareness, so the highest-income group is likely to participate in cancer screening more actively. Furthermore, many cancer cases are detected only through screening, possibly including breast cancer, although to a lesser extent, and screening may be used more frequently in more affluent communities [31][32][33][34][35]. Differences in total breast cancer incidence may therefore reflect differences in screening rates rather than actual differences in disease rates.
Results from a cross-sectional epidemiological study showed no significant correlation between community income and survival in stomach or colon cancer [25]. Other epidemiological studies showed a moderate to strong negative correlation between income level and the incidence of lung, gastrointestinal tract, and colon and rectum cancer [29][36][37][38]. In our study, we found no statistically significant relationship between income level and the prevalence of esophagus, stomach, colon and rectum, or liver cancer. The differences in the six cancers between 1999-2008 and 2009-2018 were not statistically significant across socioeconomic strata. The change was most evident for esophageal cancer prevalence, which increased approximately two-fold in the middle-income population. However, incidence data for esophagus, stomach, colon and rectum, and liver cancer are poor and sometimes do not exist for some places or time periods. It should be mentioned that, because of the small number of cancer cases, the multivariable-adjusted logistic regression comparisons among socioeconomic strata in this study were limited.
There are important limitations in our study. First, we analyzed several cross-sectional surveys and therefore could not determine a causal relationship between income and cancer disease. Second, the results depend on self-reported information. Any under-reporting of cancer leads to artificially low morbidity. If under-reporting were more prevalent in poor areas, it would produce an apparent increase in incidence with increasing income. However, previous analyses suggest that the self-reported results of NHANES are an effective tool for assessing prevalence [39]. Third, the sample size of this study is small and does not meet the requirements for events per variable (EPV). Therefore, the results may not be robust enough. However, considering that such patients are rare and the results are interpretable, they are still presented. The reliability of this result needs to be confirmed by further research.

Conclusions
The cross-sectional study found significant and increasing differences in cancer disease rates across socioeconomic strata in the United States. In the past 20 years, the decline in lung cancer prevalence has mainly occurred in high-income people, while the prevalence of breast cancer has increased in middle-income and high-income adults. Overall, recent progress in controlling cancer risk factors in the United States has not benefited adults of all socioeconomic strata equally. There is clearly a need for further efforts to decrease income disparities in controlling cancer risk factors. Importantly, these findings reinforce calls for action on policies that address socioeconomic inequalities in cancer disease.

Figure 1 Flow chart of the study population. Describes how the present sample of participants was composed. NHANES = National Health and Nutrition Examination Survey.