Does ethnicity influence dementia, stroke and mortality risk? Evidence from the UK Biobank

Introduction: The number of people with dementia and stroke is increasing worldwide. There is increasing evidence that there are clinically relevant genetic differences across ethnicities. This study aims to quantify risk factors of dementia, stroke, and mortality in Asian and black participants compared to white participants. Methods: 272,660 participants from the UK Biobank were included in the final analysis, among whom the vast majority were white (n = 266,671, 97.80%), followed by Asian (n = 3,790, 1.35%) and black (n = 2,358, 0.84%) participants. Cumulative incidence risk was calculated based on all incident cases occurring during follow-up among individuals without dementia and stroke at baseline. We compared the allele frequency of variants in Asian and black participants with that in the referent ethnicity, whites, by chi-square test. Hierarchical cluster analysis was used for clustering. A significance level corrected for the false discovery rate was considered. Results: After adjusting for risk factors, black participants have an increased risk of dementia and stroke compared to white participants, while Asian participants have similar odds to white participants. The risk of mortality does not differ between black and white participants, but Asian participants have a decreased risk. Discussion: The study provides important insights into the potential differences in the risk of dementia and stroke among different ethnic groups. Specifically, the study found that black individuals had a higher incidence of dementia and stroke compared to white individuals living in the UK. These findings are particularly significant as they suggest that there may be underlying factors that contribute to these differences, including genetic, environmental, and social factors. By identifying these differences, the study helps to inform interventions and policies aimed at reducing the risk of dementia and stroke, particularly among high-risk populations.


Introduction
OPEN ACCESS. Edited by: Jie Hu, The Ohio State University, United States
Dementia and stroke are major neurological diseases and leading causes of mortality in older adults (1)(2)(3). Our understanding of the genetic, lifestyle, and medical risk factors for these disorders has increased significantly (4). However, most of these studies have focused on individuals of white European descent (5,6), although dementia and stroke are a global crisis affecting aging populations and societies worldwide. Moreover, the number of non-white citizens in Europe and the world has increased faster than the white population. It is essential to include underrepresented ethno-racially diverse groups in research on dementia and stroke (7)(8)(9), as there is a significant knowledge gap in the genetic epidemiology of non-whites, particularly in European countries. This gap concerns the incidence, genetic determinants, lifestyle risk factors, and co-morbidity across ethnic groups (10-12).
There are significant differences in morbidity across ethnicities, such as a higher prevalence of hypertension (13) and dyslipidemia (14) in Africans and of diabetes in Asians, compared to Whites (15). Genetic analysis has become a major tool in the study of chronic and non-communicable diseases, as highlighted by the COVID-19 pandemic (16)(17)(18)(19)(20). Risk prediction typically includes risk factors such as age, sex, family history of the disease, and lifestyle (e.g., tobacco and alcohol consumption, physical activity) (21); however, in recent years, there has been increasing interest in incorporating genomic information into risk models (22,23). Polygenic risk scores (PRS) have been developed over the last two decades (24) and may lead to significant improvement in the prevention and management of diseases (e.g., selection of patients according to APOE status) (2).
Prediction of the risk of a disease is an essential part of preventative medicine, often guiding clinical management. PRS aggregate the effects of many genetic variants across the human genome into a single score and have recently been shown to have predictive value for multiple common diseases (25). However, most of this work was done in people of European descent, and little is known about other ethnicities. Current studies using well-powered genome-wide association studies (GWAS) to assess the predictive value of PRS across a range of traits and populations have made a consistent observation: PRS predict individual risk far more accurately in Europeans than in non-Europeans (26). Rather than reflecting chance or biology, this is a predictable consequence of the fact that genetic discovery efforts to date heavily underrepresent non-European populations globally (6).
This study aims to quantify the risk of dementia, stroke, and mortality stratified by ethnicity and identify the factors driving these differences.

Data sources
This study is part of UK Biobank project 54520. Complete descriptions of the UK Biobank have been presented elsewhere (27). Briefly, the UK Biobank is a large-scale population-based cohort study including 500,000 subjects aged 37 to 73 years at recruitment. The UK Biobank has approval from the North West Multi-center Research Ethics Committee (28). All included participants signed the informed consent form. All methods were carried out in accordance with relevant guidelines and regulations.

Clinical outcome and study variables
Participants with any prevalent dementia or stroke at baseline or younger than 55 years were excluded from their respective analysis, and participants without complete information about age, sex, and qualified genotype were also excluded. The population was divided into different ethnicities based on the touchscreen questionnaire at baseline. We studied dementia, stroke, and mortality in Whites, Blacks, and Asians. The flow of the study participants is presented in Supplementary Figure 1. The clinical outcomes are (1) all-cause dementia (29), including Alzheimer's disease (AD), vascular dementia, and a part of unspecified dementia; (2) stroke (30), including ischemic and hemorrhagic stroke; and (3) mortality. The diseases were ascertained from self-reported illness in the verbal interview at baseline, or from ICD codes in hospital admission electronic health records (primary or any secondary causes) and/or the death register. A Bonferroni correction was applied for the effective number of independent tests.
Previously identified risk factors of dementia (4) were used as explanatory variables, including low education, hearing loss, head injury, hypertension, alcohol consumption, obesity, smoking, major depression, social isolation, physical inactivity, diabetes, and air pollution measured as PM2.5. Other potential risk factors of dementia, i.e., age and sex, together with a family history of dementia, APOE genotype, and genetic risk score of dementia, were also of interest. For the stroke and mortality analyses, atrial fibrillation was added to the risk factors based on the criteria of the revised Framingham Stroke Risk Profile (5). The definitions of the variables and the thresholds used are presented in Supplementary Table 1.

Genetic variants
UK Biobank genotyping was conducted by Affymetrix using the BiLEVEL Axiom array for ~50,000 participants and the Affymetrix UK Biobank Axiom array for the remaining ~450,000 (31). Detailed information on the genotyping process and technical methods is available online. We followed the UK Biobank's recommendation to exclude participants who had failed quality control or had significant missing data or heterozygosity.
A genetic risk score (GRS) was explored in the current study. Thirty independent genetic determinants were selected from previous genome-wide association studies of AD in non-UK Biobank European populations (32)(33)(34). Their risk alleles and effect estimates on AD were extracted from the largest GWAS summary statistics (stage I) by Kunkle et al., in which UK Biobank was not included (Supplementary Table 2) (35). Considering the disparity in effect between APOE and other common variants, a GRS without the APOE variants was created for further analysis. Data from the largest stroke GWAS were used to create the GRS for stroke.
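To make the construction of a GRS concrete, the following is a minimal Python sketch of a weighted score: each participant's risk-allele dosage (0, 1, or 2) is multiplied by the variant's log odds ratio and summed over variants. The weighting scheme is the conventional one and is an assumption here, since the text only states that risk alleles and effect estimates were extracted. The function name `genetic_risk_score`, the odds ratios, and the dosages are hypothetical illustrations, not values from Kunkle et al. or the UK Biobank.

```python
import math


def genetic_risk_score(dosages, log_odds_ratios):
    """Weighted GRS: sum over variants of risk-allele dosage times log odds ratio."""
    if len(dosages) != len(log_odds_ratios):
        raise ValueError("dosages and log_odds_ratios must have the same length")
    return sum(d * b for d, b in zip(dosages, log_odds_ratios))


# Hypothetical odds ratios for three variants (illustrative only).
odds_ratios = [1.20, 1.10, 0.90]
betas = [math.log(o) for o in odds_ratios]

# Hypothetical risk-allele dosages (0, 1, or 2 copies per variant).
person_a = [2, 1, 0]
person_b = [0, 1, 2]

score_a = genetic_risk_score(person_a, betas)
score_b = genetic_risk_score(person_b, betas)
print(f"score A = {score_a:.3f}, score B = {score_b:.3f}")
```

Leaving APOE out of the score, as described above, amounts to dropping the APOE variants from both lists before summing.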

Statistical analysis
Cumulative incidence risk was calculated based on all incident cases occurring during follow-up among individuals without dementia and stroke at baseline, using the cumulative incidence function (package etm) (36). Participants were censored at the date of disease diagnosis, death, or the administrative censoring date, whichever came first. Mortality was accounted for as a competing event. The function estimates overall survival irrespective of cause of death by a modification of the Kaplan-Meier estimate, adapted for left truncation, and calculates age- and cause-specific risk estimates and corresponding 95% CIs for the different ethnicities. We compared the allele frequency of variants in Asian and black participants with that in the referent ethnicity, whites, by chi-square test. Hierarchical cluster analysis was used for clustering. A significance level corrected for the false discovery rate was considered.
Frontiers in Public Health, frontiersin.org
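The competing-risk logic described here can be sketched in a few lines of Python. This is not the etm implementation used in the study: it is a simplified Aalen-Johansen estimator that ignores left truncation and confidence intervals, and the function name `cumulative_incidence`, the event coding (0 = censored, 1 = disease, 2 = death), and the toy data are all assumptions made for illustration.

```python
def cumulative_incidence(times, events, cause=1):
    """Aalen-Johansen cumulative incidence for one cause with competing events.

    events: 0 = censored, any other integer = cause of the observed event.
    Returns a list of (time, cumulative incidence) at each event time.
    Simplification: no left truncation, unlike the analysis in the text.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0   # probability of being free of any event just before time t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = d_any = d_cause = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_here += 1
            if data[i][1] != 0:
                d_any += 1
                if data[i][1] == cause:
                    d_cause += 1
            i += 1
        if d_any > 0:
            cif += surv * d_cause / n_at_risk
            surv *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve


# Toy data: follow-up time in years and event type.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]

disease_curve = cumulative_incidence(toy_times, toy_events, cause=1)
death_curve = cumulative_incidence(toy_times, toy_events, cause=2)
print(disease_curve[-1], death_curve[-1])
```

Treating death as a competing event rather than as censoring matters because a naive one-minus-Kaplan-Meier estimate would overstate the disease risk when many participants die before diagnosis.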
All analyses were done in R (version 3.6.2) and the level of significance was set at p < 0.05 after Bonferroni's correction.
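As an illustration of the allele-frequency comparison and the multiple-testing corrections mentioned above, the sketch below runs a Pearson chi-square test on 2x2 allele-count tables and then adjusts the p-values. It is written in Python rather than R for illustration only. The Benjamini-Hochberg procedure is assumed for the false discovery rate, since the text does not name the method, and all allele counts and function names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def bonferroni(p_values):
    """Bonferroni-adjusted p-values."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (FDR), returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical allele counts per variant:
# (risk alleles in group, other alleles in group, risk alleles in reference, other in reference)
allele_tables = [
    (300, 700, 400, 600),
    (500, 500, 510, 490),
    (120, 880, 100, 900),
]
raw_p = [chi_square_2x2(*table)[1] for table in allele_tables]
fdr_p = benjamini_hochberg(raw_p)
bonf_p = bonferroni(raw_p)
for raw, fdr, bonf in zip(raw_p, fdr_p, bonf_p):
    print(f"raw={raw:.3g}  BH={fdr:.3g}  Bonferroni={bonf:.3g}")
```

Bonferroni controls the chance of any false positive and is therefore stricter, while the false discovery rate controls the expected share of false positives among the variants declared significant.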

Prevalence of risk factors of dementia, stroke, and mortality across ethnicities
After excluding participants who were younger than 55 years, had prevalent dementia or stroke, or had missing age, sex, or genotype values at baseline, 272,660 participants were included in the final analysis, among whom the vast majority were white (97.80%), followed by Asian (1.35%) and black (0.84%) participants. The mean duration of follow-up was 11.2 years, yielding 3,050,595 person-years in total. Baseline characteristics of included participants are presented in Table 1. On average, white participants were older than Asian and black participants. Of note, the GRS for AD was on average lower in black participants, yet the rare high-risk APOE*44 genotype was more than twice as common in black (5.1%) as in white (2.3%) or Asian (1.1%) participants (Supplementary Figure 2). On the other hand, the GRS for stroke was higher in black compared to white and Asian participants. White participants were almost twice as likely to have a family history of dementia compared to the other groups (14.9% for white, 6.0% for Asian, and 8.2% for black participants). Figure 1 shows the breakdown of lifestyle factors and comorbidities by sex across age groups by ethnicity. Across age groups, the proportion of people with low education is similar in white (70.7%) and black (72.4%) participants but lower in Asian participants (56.2%). Alcohol consumption is more frequent in white participants than in the other ethnicities. The proportion with obesity is highest in black participants, in particular women, while the proportion with physical inactivity is highest in Asian participants. Hypertension and diabetes are more prevalent in Asian (70.0% for hypertension and 29.4% for diabetes) and black participants (79.4% for hypertension and 23.1% for diabetes) compared to white participants (62.8% for hypertension and 7.1% for diabetes). White (29.6%) and Asian (23.3%) participants suffered more often from hearing loss than black participants (14.4%) across age groups.
Atrial fibrillation is more frequent in white (1.1%) compared to Asian (0.2%) and black (0.3%) participants (see Table 1 for complete results). Table 2 and Supplementary Figure 3 give a global overview of the risk estimates for each previously identified risk factor (4, 5) across different ethnicities for dementia, stroke, and mortality. To further compare the ethnicities, we plotted the beta of the different risk factors in Figure 2. Below we discuss the overall findings for dementia, stroke, and mortality.

Risk of dementia by ethnicity
During follow-up, 4,742 participants developed dementia. The incidence of dementia is 160 cases/100,000 person-years in white, 178 cases/100,000 person-years in Asian, and 274 cases/100,000 person-years in black participants. Figure 3A shows the cumulative incidence of dementia by age across ethnicities. The increased risk in black participants is significantly different from that of white participants (p = 0.003). Controlling for age and sex, we found an increased risk of dementia for black relative to white participants (Figure 4; HR = 2.03 [95% CI 1.62-2.53], p < 0.001) but not for Asian participants (HR = 1.13 [0.89-1.43], p = 0.288). Figure 4 shows that after adjusting further for APOE and the GRS of AD, the difference remains significant for black participants (HR = 1.90 [1.52-2.38], p < 0.001) and becomes significant for Asian participants (HR = 1.33 [1.06-1.68], p = 0.015). Finally, when further adjusting for all risk factors, including lifestyle factors and comorbidities, the HR in black participants decreased but remained significant (HR = 1.63 [1.22-2.19], p = 0.002), suggesting that differences in lifestyle and comorbidity explain part but not all of the increase in risk.
For Asians, the same trend is seen, but the HR is not significantly increased compared to white participants (Figure 4; HR = 1.14 [0.85-1.54], p = 0.205) when adjusting for lifestyle and comorbidity, suggesting these factors explain most of the differences. Most of the morbidity and lifestyle effect is explained by physical activity and diabetes (Figure 1 and Table 2). Finally, to understand the difference in the GRS of AD between black and white participants (Table 1), we examined the differences in the allele frequency of AD SNPs across ethnicities (Figure 5). More than half of the SNPs (63%) have a significantly lower frequency of the AD risk variant in the black population than in the white population. The differences between white and Asian participants (55%) are not as strong as those between white and black participants.
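Readers who want to relate a reported hazard ratio to its confidence interval can use the standard Wald approximation on the log scale, sketched below in Python. This is only a rough consistency check under the assumption that the interval is symmetric around log(HR); it does not reproduce the Cox model p-values, and rounding of the published estimates limits its precision. The function name `wald_from_hr` is hypothetical. The example uses the age- and sex-adjusted estimate for black relative to white participants quoted above.

```python
import math


def wald_from_hr(hr, lower, upper, z_crit=1.959964):
    """Approximate SE of log(HR), z statistic, and two-sided p from a 95% CI.

    Assumes a Wald interval that is symmetric on the log scale.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value


# HR = 2.03 (95% CI 1.62-2.53), as reported for dementia in black participants.
se, z, p_value = wald_from_hr(2.03, 1.62, 2.53)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p_value:.2g}")
```

For this estimate the approximation gives a z statistic a little above 6, in line with the reported p < 0.001.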

Risk of stroke by ethnicity
A total of 6,680 participants had at least one stroke during follow-up. The incidence of stroke is 203/100,000 person-years in white, 274/100,000 person-years in Asian, and 234/100,000 person-years in black participants. Figure 3B shows a significant difference in the incidence of stroke among ethnicities (p < 0.001). White participants had a slightly older age at stroke onset: 72.7 ± 4.3 years compared to 72.3 ± 4.4 in Asian (p compared to whites < 0.001) and 72.2 ± 4.6 in black participants (p compared to whites < 0.001). Black participants had an increased risk of stroke relative to white participants (Figure 4

Risk of mortality by ethnicity
Concerning mortality, 23,665 people died during follow-up. The incidence of mortality is 801/100,000 person-years in white, 727/100,000 person-years in Asian, and 716/100,000 person-years in black participants. There are no statistical differences in the age of death among the ethnicities (p = 0.65, Figure 3C). Figure 4 shows that the risk of

Discussion
Within the UK Biobank, we find that participants who identified themselves as black are at increased risk of dementia and stroke. The increased risk cannot be fully explained by our current knowledge of risk factors. After adjusting for genetic factors, lifestyle, and comorbidity, the risks of both dementia and stroke in Asian participants are similar to those in white participants. We do not find a difference in mortality in black compared to white participants. In Asian participants, the risk of mortality is significantly lower than in white participants after adjustment for lifestyle and morbidity factors.
The main finding of this study is the high risk of dementia and stroke in black people participating in the UK Biobank compared to white and Asian participants. These results are in line with the results of large observational studies in the United States of subjects older than 65, in which African Americans have the highest prevalence of dementia (14.7%), followed by Hispanics (12.9%) and non-Hispanic whites (11.3%) (37). As expected, black UK Biobank participants presented more comorbidities associated with dementia and stroke (i.e., obesity, diabetes, hypertension) (4) and are subject to higher levels of air pollution compared to white participants. Of note, the level of education is similar for white and black UK Biobank participants, and the risk of mortality is not increased in black compared to white people. These findings do not exclude inequalities between white and black participants, e.g., in schooling and healthcare infrastructure in the general population. However, in the setting of the UK Biobank, it is unlikely that these inequalities explain the higher risk of dementia in black participants. As we find that most risk factors have similar associations for the two ethnicities, the differences in the effect and frequencies of APOE may be relevant and raise the question of to what extent the observed risk difference is explained by genetic factors. Despite the small sample size, we find an unfavorable distribution of APOE genotypes in black participants. By contrast, the GRS values are significantly lower in black compared to white participants, which is a protective factor (38). Though the proportion of APOE*44 carriers is higher in the black (5.4%) than in the white population (2.3%, p < 0.001, Table 1), the effect of the APOE*44 variant on the risk of incident dementia is much smaller in the black population than in the white population, suggesting there are modifying variants in people of African descent in the UK, as has been found in African Americans (39).
Although this may seem at odds with the finding that a family history of dementia does not increase the risk of dementia in black and Asian participants, one may argue that family history reported by immigrants is less reliable than that reported by white people (40). Similarly, we find that black participants have an increased risk of stroke compared to white participants that is not explained by the genetic factors, lifestyle, and morbidity known to be involved in stroke incidence (41). For stroke, too, we found differences in allele frequencies (42). Compared to white participants, Asian participants do not have a significantly increased risk of dementia or stroke when we adjust for genetic factors (43), lifestyle, and morbidity (12,44,45). Although the risk of dementia is increased in Asian compared to white people, this is in large part explained by lifestyle and morbidity (15). The risk of stroke in Asian participants is not significantly increased and is very similar to that of white participants after adjustment for additional lifestyle and comorbidity factors (46).
The distribution of APOE in Asian participants is also different from that in the white population, with a higher proportion of APOE*33 carriers and fewer APOE*34 and APOE*44 carriers. The GRS in this group, however, does not differ from that of the white study population. A recent study suggests that genetic factors found predominantly through European GWASs may play a limited role in South Asians (45).

Correlations between risk factors of dementia (left columns), stroke (middle columns) and mortality (right columns) by ethnicity. Blue colour indicates the variables found to be significantly different between the two ethnicities.

FIGURE 3
Survival curves of the risk of dementia (A), stroke (B) and mortality (C), according to age and ethnicity. Red color is for White, green for Asian and blue for black. Colored area represents the 95%CI.
For dementia, we found that the PRS (excluding APOE) in black participants is lower than in white participants. A higher frequency of APOE*44 is found in black compared to white and Asian participants, but the effect estimates of APOE on incident dementia are much lower in black than in white participants. Differences in frequencies and effect have also been reported for ABCA7 (47). These inconsistencies suggest that the PRS calculated based on the SNPs identified from GWAS of the white population may not be generalizable to other populations (48). We further find that the allele frequencies of most of the SNPs included in the GRS are significantly different among ethnicities, especially between black and white populations. This finding highlights the importance of generating large GWAS of dementia in the African population; unique genetic loci associated with dementia are highly expected to emerge in such studies. In agreement with various studies in the United States, we found that for one-third of the SNPs previously reported to explain the differences in risk of AD between ethnicities, the frequency difference is in the opposite direction to that of the APOE*4 SNPs. That is to say, the two APOE SNPs have increased frequency in black people, but many other genetic variants surrounding the APOE*2/3/4 variants differ between white and black participants. These genetic variants may modify the effects seen in black and white participants. Similar, though weaker, findings are seen in Asian participants. It is of interest that we and others find this for dementia genes but not for genetic variants implicated in stroke (49). On the other hand, for stroke, we find that the PRS in black participants is higher, which explained the overall increased risk (50).
Compared with earlier observational studies (37,51), the strength of this study is that we adjusted our analysis for identified lifestyle and genetic risk factors. Despite those adjustments, we still found an increased risk of dementia in black people. Interestingly, the patterns of linkage between the ε4 allele of APOE and the '523 poly-T alleles in the adjacent gene, TOMM40, differ between white and African American individuals. Both genotypic and allelic data indicate that among African Americans the ε4-'523-L haplotype has a stronger effect on the risk of AD than other ε4-'523 haplotypes (53).
This study has a few limitations. The most important one is the unbalanced makeup of ethnicities in the UK Biobank relative to the general population. Another limitation is the relatively young age of the participants at inclusion (62.5 ± 3.8 years old) and the relatively short duration of follow-up (11.2 ± 1.8 years). These two factors imply that there have been few incident cases and therefore a reduction in study power. Another limitation is that socioeconomic level was not included in our analysis, although we know it is an important factor in dementia and that there are huge disparities across ethnicities (54). Selection bias is also a concern. Black participants in our study sample have a similar educational attainment to white participants, which demographic studies suggest is not the case in the UK general population (55). This would imply that black and white participants in our study are more similar in dementia, stroke, and mortality risk than is actually the case in the general population. Thus, the increase in risk seen in black participants for dementia or stroke is likely an underestimate. Finally, it has been shown that UK Biobank participants are not representative of the population, with evidence of a 'healthy volunteer' selection bias. Nonetheless, the valid assessment of exposure-disease relationships may be widely generalizable and does not require participants to be representative of the population at large (56).

FIGURE 4
Hazard ratios and 95% CIs for the risk of dementia, stroke and mortality according to ethnicity. Whites are used as the reference group.

Conclusion
An important finding of our study is that there are no major differences in mortality across ethnicities among UK Biobank participants that may bias the risk estimates for stroke and dementia. This study emphasizes the need for more ethnic diversity in large-scale hypothesis-free cohort studies to understand the differences in risk of major diseases such as dementia and stroke and how this relates to genetic, lifestyle, and morbidity factors. The inclusion of participants of different ethnic backgrounds will increase the available statistical power and could lead to more targeted prevention campaigns. The same argument can be made for clinical trials (57,58). This research is key for the future prevention of dementia and stroke in low- and middle-income countries (59). With the emergence of gene therapy and precision medicine, the question of health inequalities related to genetic and epidemiologic research becomes increasingly urgent. To close this gap in our knowledge, we need major investments in ethnically diverse biobanks in the UK and elsewhere.

Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: UK Biobank.

Ethics statement
The studies involving human participants were reviewed and approved by North West Multi-center Research Ethics Committee. The patients/participants provided their written informed consent to participate in this study.

Author contributions
The study was conceived by BB and CD. BB and JL performed the statistical analysis. NA and CD verified the analytical methods. BB, JL, AT, NA, and CD did the data interpretation. CD supervised the findings of this work. All authors contributed to the article and approved the submitted version.

The deviation of referent allele frequency of Asians and Blacks compared with Whites for the AD variants reported in previous GWAS studies (a chi-square test was performed to identify statistical differences between frequencies). Red: risk allele frequency larger than in the White population; blue: referent allele frequency smaller than in the White population. The depth of color ranges from +/− 2 to zero. *FDR < 0.001, FDR < 0.05.

Funding
This research has been conducted using data from UK Biobank, a major biomedical database. The current study was authorized under UK Biobank project 54520. This project was funded through the King Abdulaziz University & Oxford University Centre for Artificial Intelligence in Precision Medicines (KO-CAIPM, CMR0020).