Cross-national Comparisons of Cognitive and Physical Health Across China, Japan, and Korea: A Systematic Review

Background: Cross-national studies are an emerging research area in public health. Specifically, cross-national health comparisons are important for understanding the factors driving the success or failure of public health policies. Findings from cross-national studies can be used to improve the current health policies and practices in each country. Therefore, this study systematically analyzes studies that compared health status (physical health and cognition) using national panel data for three Northeast Asian countries: China, Japan, and South Korea. Methods: Google Scholar and PubMed were used for the literature search. The search strategy targeted papers published between 2005 and 2020, yielding a total of 205 studies, of which eight were selected for the review. Results: Two studies conducted cognitive comparisons, five undertook physical health comparisons, and one article demonstrated cross-national comparisons using national panel surveys. The cognitive comparison items were verbal memory, orientation, visuoconstruction, numeracy and numeric ability, and executive functioning. Health comparisons were conducted by measuring the proportion of the population with chronic conditions, such as heart disease, diabetes, stroke, hypertension, arthritis, and having difficulty in activities of daily living. None of the eight studies utilized a common measure for cognitive and physical health status across the three countries. Conclusion: While survey items measured cognitive function and general physical health status in each country, there was no common measure for undertaking cognitive and physical health status comparisons across the three countries. A valid cross-national outcome measure is needed to accurately compare the population-level health status across the three countries.
Health and Retirement Study (HRS); Korean Longitudinal Study of Aging (KLoSA); Japanese Study on Aging and Retirement (JSTAR); China Health and Retirement Longitudinal Study (CHARLS); Risk of Bias for Nonrandomized Studies (RoBANS); Activities of daily living (ADL); Comprehensive Survey of Living Conditions (CSLC); Korea Community Health Survey (KCHS)

compared factors related to cognitive disorders and depressive symptoms among older adults in Japan and South Korea. Similarly, Chen and Lin (6) examined the relationship between social identity factors and physical inactivity. These studies can inform public health policy interventions in the respective countries, but their results are difficult to generalize to other countries. There is an opportunity to expand upon this early research by considering cognitive and physical health simultaneously.
Most countries use panel surveys as comparative data. Examples of panel surveys include the Health and Retirement Study (HRS) in the US, the English Longitudinal Study of Aging (ELSA) in the UK, the Korean Longitudinal Study of Aging (KLoSA) in Korea, the Japanese Study on Aging and Retirement (JSTAR) in Japan, and the China Health and Retirement Longitudinal Study (CHARLS) in China. However, inconsistencies in the items compared across these countries can lead to inconsistent results (7). Another challenge is that the influence of culture cannot be excluded when results are interpreted and understood across multiple countries (8). Since each of the panel surveys mentioned above is based on a country-specific database, there are inherent limitations related to cross-country comparisons of cognitive and physical health.
Therefore, it is necessary to determine whether the panel survey databases can support cross-country comparisons of health, as well as to identify the limitations in such comparisons. The main purpose of this study is to determine the usefulness and limitations of each country's panel database through a systematic review of literature comparing the cognitive and physical health of older adults in three Northeast Asian countries: China, Japan, and South Korea. The reviewed studies used data from the CHARLS, JSTAR, KLoSA, and other national survey databases (CSLC, KCHS). This comparison of cognitive and physical health between China, Japan, and South Korea will inform health care services and public health policy. For example, data comparing the cognitive and physical health of China and South Korea make it possible to identify the areas of cognitive and physical health in which South Korea lags behind China. Gaps in health between countries can be used to reinforce health policies tailored to a specific country. We aim to understand how nationally representative panel survey databases have been used to make cross-country comparisons of cognitive and physical health.

Research Questions
This study answers the following research questions: 1) Is it possible to compare cognition and physical health across the three Asian countries using national panel databases? 2) Are there any limitations in comparing cognitive and physical health in three Asian countries using the national panel databases?

Methods
This study shortlisted articles comparing the cognition and physical health of older adults across the target countries using data from the CHARLS, JSTAR, and KLoSA. We undertook a systematic literature review to determine whether it is viable to compare cognition and physical health between the three countries. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (online supplement)21 and registered the review protocol on PROSPERO (CRD42020208200, 10/14/2020).

Information Sources
On 08/01/2020 we searched Google Scholar using the keywords "CHARLS AND JSTAR AND KLoSA" and "CHRLA AND JSTAR AND KLoSA AND cognition AND physical health". On 08/05/2020 we searched PubMed using the keywords "CHARLS AND JSTAR", "JSTAR AND KLoSA", and "China AND Korea AND Japan AND Cognition AND Physical health". A final literature search was conducted on 08/10/2020 using the Google Scholar and PubMed databases to update our results with any recent publications.

Literature Search
The search was limited to articles published from 2005 to 2020. The keywords used were: CHARLS, JSTAR, KLoSA, cognition, and physical health. The literature search was conducted by two reviewers, SN and IH.

Eligibility Criteria
The inclusion and exclusion criteria were determined by SN and IH. Studies were eligible for inclusion if they compared cognition and physical health between China, Japan, and South Korea; were comparative studies using the CHARLS, JSTAR, and KLoSA databases or questionnaires; were written in Korean or English; and were published from 2005 to 2020. Studies using the Comprehensive Survey of Living Conditions (CSLC) in Japan, the Korean Community Health Survey (KCHS) in South Korea, and the East Asian Social Survey (EASS) in South Korea, China, and Japan were also collected. Studies were excluded if they did not include comparisons between China, Japan, and South Korea; did not include cognitive and physical health comparisons; or were defined as grey literature, conference papers, or academic presentations.

Literature Collection and Screening
Two researchers participated in the search and screening of the literature. After collecting the data and excluding duplicate entries, relevant studies were selected based on the inclusion and exclusion criteria. We ensured consistency between reviewers and performed calibration exercises before starting the review.

Data Extraction
The review authors individually screened the titles and abstracts of the collected literature. Records whose titles and abstracts met the inclusion criteria, as well as those whose eligibility was uncertain, were carried forward as full reports. Two reviewers screened these reports and determined whether they met the inclusion criteria. To resolve any ambiguities about eligibility, additional information was sought from the authors of the literature. Disagreements between the two reviewers were resolved through discussion and external advice. As described under Information Sources, a total of 204 documents were identified. Among them, 11 duplicate documents were removed, and 179 documents that did not address cognition, physical health, and the three countries were removed. Finally, six comparative studies of Chinese, Japanese, and Korean people living abroad were removed, leaving a total of eight papers. The sequence of procedures for selecting the literature is shown in Fig. 1.
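The screening counts reported above can be checked with simple arithmetic. The following is a minimal Python sketch that replays the exclusion steps in the order given in the text; the stage labels are paraphrased for illustration, and the function name `screening_flow` is hypothetical rather than part of any review tool.

```python
# Counts are taken from the text; the stage labels are paraphrased.
IDENTIFIED = 204
EXCLUSIONS = [
    ("duplicates", 11),
    ("did not address cognition, physical health, and the three countries", 179),
    ("compared Chinese, Japanese, and Korean people living abroad", 6),
]


def screening_flow(start, steps):
    """Return (label, records remaining) after each exclusion step."""
    remaining = start
    flow = [("identified", remaining)]
    for reason, n_removed in steps:
        remaining -= n_removed
        flow.append((reason, remaining))
    return flow


flow = screening_flow(IDENTIFIED, EXCLUSIONS)
for label, remaining in flow:
    print(f"{remaining:>4}  after: {label}")

# Number of papers left for the review after all exclusions.
included = flow[-1][1]
```

Under these counts, the final value of `included` matches the eight papers selected for the review.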

Quality of the Literature
The level of evidence of each study was rated using the Centre for Evidence-Based Medicine (CEBM) Levels of Evidence, as discussed by Phillips, Ball, Sackett, Badenoch, Straus, Haynes, Dawes (9). We also used the Risk of Bias for Nonrandomized Studies (RoBANS) tool, version 2.0, to assess risk of bias (10,11). RoBANS is a study quality assessment tool developed in Korea to evaluate nonrandomized interventional research. We anticipated that our included articles would not be intervention studies because the CHARLS, JSTAR, and KLoSA are longitudinal panel surveys. The RoBANS covers a total of eight areas: target group comparability, selection of participants, confounding variables, measurement of intervention (exposure), blinding of outcome assessment, outcome assessment, incomplete outcome data, and selective outcome reporting. Each area can be judged as having a "low", "high", or "unclear" risk of bias. Two authors independently evaluated the quality of the selected papers using the RoBANS tool, and disagreements between their ratings were resolved by consulting the corresponding author.

Results
We extracted a total of eight papers. Based on the CEBM criteria, the levels of evidence comprised one level 2A paper (4), six level 2B papers (3,5,6,12,13,14), and one level 5 paper (15). The results are shown in Table 1. The risk of bias, evaluated using the RoBANS tool, was low overall in all studies. However, four of the eight studies showed a risk of bias from incomplete outcome data (4,12,14,16). Two of the eight studies showed a high risk of bias from confounding variables (5,12). One of the eight studies showed a risk of bias in target group comparability (14). The results are shown in Figs. 2 and 3.
To answer our first research question, Table 2 identifies which studies measured cognition, physical health, or both. Two studies assessed only cognition (5,13), five assessed only physical health (3,4,6,12,14), and one study assessed the cognitive items of the CHARLS, JSTAR, and KLoSA (15). With respect to comparing outcomes across the three Northeast Asian countries, there were three papers comparing South Korea and Japan, two papers comparing China and South Korea, and three papers comparing China, Japan, and South Korea. However, there was no paper comparing China and Japan.
Cognition was assessed based on the population's verbal memory, orientation, visuoconstruction, numeracy and numeric ability, and executive functioning in the three countries. The JSTAR lacked items related to visuoconstruction and executive functioning. The CHARLS data did not include items on executive functioning. The comparison of cognitive items is shown in Table 3.
Health status in the three countries was assessed based on the following parameters: chronic conditions, lifestyle (i.e., drinking and smoking), and activities of daily living (ADL). Of the four papers comparing physical health, two used data from the CHARLS, JSTAR, and KLoSA. These two papers (3,12) used the same physical health factors across the three countries, including ADL and chronic disease. The comparison of physical health items is shown in Table 3. The evidence summary for all eight studies can be found in Table 4. The data most frequently used for China, Japan, and South Korea came from the CHARLS, JSTAR, and KLoSA (12). Some studies instead used East Asian Social Survey (EASS), Korean Community Health Survey (KCHS), and Comprehensive Survey of Living Conditions (CSLC) data (6,14).
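The cognitive item coverage described in this section can be expressed as a set intersection. The sketch below is illustrative only: the five domains and the gaps for the JSTAR and CHARLS come from the text, while full coverage by the KLoSA is an assumption made here because no gaps are reported for it.

```python
# Cognitive domains named in the Results.
DOMAINS = {
    "verbal memory",
    "orientation",
    "visuoconstruction",
    "numeracy and numeric ability",
    "executive functioning",
}

# Domains covered by each survey. Gaps for JSTAR and CHARLS are from the text.
coverage = {
    "CHARLS": DOMAINS - {"executive functioning"},
    "JSTAR": DOMAINS - {"visuoconstruction", "executive functioning"},
    "KLoSA": set(DOMAINS),  # assumption: the text reports no missing domains
}

# Domains that all three surveys can compare directly.
common = set.intersection(*coverage.values())

# Domains shared by each pair of surveys.
surveys = sorted(coverage)
pairwise = {
    (a, b): coverage[a] & coverage[b]
    for i, a in enumerate(surveys)
    for b in surveys[i + 1:]
}

print("Common to all three:", sorted(common))
for (a, b), shared in pairwise.items():
    print(f"{a} and {b}: {len(shared)} shared domains")
```

Under these assumptions, only three domains remain comparable across all three surveys, while a two-country comparison between the CHARLS and KLoSA would retain four.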

Discussion
This study reviewed research using nationally representative databases from China, Japan, and South Korea to determine whether comparisons of cognitive and physical health could be made between older adults from these three countries. Eight papers met our inclusion criteria and used panel surveys of China, Japan, and South Korea. Overall, these studies exhibited good quality and low risks of methodological bias. These surveys are sister studies of the U.S. Health and Retirement Study and have been harmonized by the National Institute on Aging (https://g2aging.org/); therefore, the surveys included similar items on cognition and physical health that enable cross-national health status studies. While the current panel databases are promising for cross-national studies, there are technical limitations in each panel survey database.
There were several limitations in the process of comparing cognition and physical health between the three countries that may introduce information and selection biases. In Lee et al. (3), some survey items were adjusted, but it was noted that they were not the same across countries. Nakagawa et al. (4) mentioned that the survey periods in the three countries are different. Each panel survey was conducted in different geographical locations, such as urban areas and rural areas. Participants were over 45 years old in the CHARLS and KLoSA but were limited to ages 50 to under 75 in the JSTAR. Finally, French et al. (12) found that it was difficult to unify the language in a questionnaire, impeding the ability to convey the same meaning to all respondents.
First, there were limitations related to cognitive measures. Shih, Lee, and Das (15) analyzed the cognitive items used in each country-specific survey and found that the cognitive items did not match. The CHARLS, JSTAR, and KLoSA have only three cognitive evaluation items in common. This can be a problem because the results of the comparison may be inconsistent due to discrepancies in the items being compared (7). In their analysis, Lee and Shinkai (5) selected only those cognitive items that were similar between the JSTAR and KLoSA, to ensure semantic relevance. This is problematic because it reduces the number of possible cognitive measures for researchers. Mathematically, the number of test items has an inverse relationship with measurement error, so fewer items yield lower measurement precision (17). Additionally, there may be different nuances and meanings between each measure in a given language. When items are translated between the languages of different countries, measurement error may occur if the meaning is not conveyed correctly, even if the same item is measured (18). In addition, some items could be interpreted differently depending on the language and culture (19).
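The relationship between the number of items and measurement precision can be illustrated numerically. The sketch below assumes a Rasch model, which may differ from the formulation used in reference (17). Under that model, the standard error of a person's ability estimate is the inverse square root of the test information, which is the sum of p(1 - p) over items. The item difficulties and test lengths are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def standard_error(theta, difficulties):
    """Standard error of the ability estimate: 1 / sqrt(test information)."""
    information = 0.0
    for b in difficulties:
        p = rasch_probability(theta, b)
        information += p * (1.0 - p)  # Fisher information of one Rasch item
    return 1.0 / math.sqrt(information)


# Hypothetical tests whose items all match the person's ability (theta = 0),
# which is the most informative case for each item.
theta = 0.0
se_by_length = {n: standard_error(theta, [0.0] * n) for n in (3, 5, 10, 20)}

for n_items, se in se_by_length.items():
    print(f"{n_items:>2} items: SE = {se:.3f} logits")
```

In this idealized case the standard error equals 2 divided by the square root of the number of items, so a test restricted to three common items is considerably less precise than one with ten or twenty.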
Second, we also found limitations to measuring physical health. Data from the CHARLS, JSTAR, and KLoSA were used to compare the prevalence of chronic diseases, lifestyles, and ADL impairments among older adults in each country (3,4). The CHARLS, JSTAR, and KLoSA data had no problem with items comparing physical health. However, recent studies have suggested that cultural differences can lead to different health outcomes (20). For example, individuals living in rural cultures have higher mortality rates associated with poverty, adult smoking, lack of physical activity, and ischemic heart disease than individuals living in urban cultures (21). In addition, the study by Takahashi, Jang, Kino, and Kawachi (14) mentioned limitations in comparisons using the Comprehensive Survey of Living Conditions (CSLC) and Korean Community Health Survey (KCHS) data. The authors noted as a limitation that the physical health items differed between the two countries' databases and had to be matched as consistently as possible. This suggests that the physical health items of the CSLC and KCHS were not similar. In conclusion, our findings are promising because it seems possible to study physical health in Northeast Asia using data from the CHARLS, JSTAR, and KLoSA.
The quality of evidence varied. In the study by Takahashi et al. (14), the numbers of survey respondents in Japan and South Korea differed, with far fewer respondents in Japan. An important limitation of their study was that the coefficient estimates were not accurate because of the mismatched sample sizes (14). This problem occurred because the South Korean data came from the 2008 cross-sectional KCHS and the Japanese data came from the 2013 CSLC. There were 226,097 South Korean and 12,971 Japanese respondents enrolled in the respective country surveys. When the numbers of respondents differ this much between surveys, statistical comparison becomes difficult. In the study by Chen and Lin (6), the subjects' health measures were self-reported, which limited the accuracy of measuring specific activity levels. In the study by Nakagawa et al. (4), the survey periods differed between the three countries.
To summarize, we concluded there are important differences in the questions comparing cognition and physical health in the three countries. In addition, because the samples of each country participating in the survey varied by size, year of self-report, language, and culture, any comparison would have substantial measurement error. Since the survey period in each country is also different, it is difficult to generalize the comparison between countries. Future studies should develop a questionnaire that can facilitate comparison between the three countries. For example, using the common item linking method with the Rasch model and cross-country databases, assessment tools can be developed to compare cognitive and physical health across countries (22,23). These evaluation tools can be used in the future to compare national policies and to understand treatment trends in cognitive and physical health in other countries.
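As a rough illustration of common item linking, the sketch below applies mean-mean linking under a Rasch model. The difficulties of the items shared by two surveys are used to compute a single shift constant that places one survey's items on the other's scale. This is one simple linking approach and not necessarily the procedure used in references (22,23). The survey names, item names, and difficulty values are all hypothetical.

```python
from statistics import mean


def linking_constant(reference, new, common_items):
    """Shift that aligns the mean difficulty of the common items."""
    ref_mean = mean(reference[item] for item in common_items)
    new_mean = mean(new[item] for item in common_items)
    return ref_mean - new_mean


def link_to_reference(reference, new):
    """Place every item of `new` on the scale of `reference`."""
    common_items = sorted(set(reference) & set(new))
    if not common_items:
        raise ValueError("Linking requires at least one common item")
    shift = linking_constant(reference, new, common_items)
    linked = {item: b + shift for item, b in new.items()}
    return linked, shift


# Hypothetical Rasch item difficulties (logits), calibrated separately.
survey_a = {
    "verbal_memory": -0.5,
    "orientation": -1.0,
    "numeracy": 0.3,
    "visuoconstruction": 0.8,
}
survey_b = {"verbal_memory": -0.1, "orientation": -0.6, "numeracy": 0.7}

linked_b, shift = link_to_reference(survey_a, survey_b)
print(f"Linking constant: {shift:.2f} logits")
for item, difficulty in linked_b.items():
    print(f"{item}: {difficulty:.2f}")
```

After the shift, difficulties from the second survey are expressed on the first survey's scale, so person measures estimated from either item set become comparable. In practice, the stability of each common item across countries would also need to be checked before it is used for linking.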

Limitations
There were some limitations to this study. First, as a systematic review, this study should have used MeSH terms in its search. However, the terms CHARLS, JSTAR, KLoSA, and physical health, which correspond to the panel surveys in each country, did not fit MeSH. Therefore, we cast a wide net in our search strategy, looking for each country's panel surveys (CHARLS, JSTAR, and KLoSA), and manually filtered articles using our inclusion and exclusion criteria. Second, there were few papers comparing cognitive and physical health between countries using the CHARLS, JSTAR, and KLoSA. Finally, the range of publication years in the inclusion criteria was wide. This range was chosen to be broad because the items used to survey each country's cognitive and physical health may change over time.

Conclusion
We searched for papers comparing cognition and physical health in three Northeast Asian countries and conducted a systematic review. We found that there is an opportunity to use panel survey data for comparing cognitive and physical health between countries. However, few papers have compared cognitive and physical health between Northeast Asian countries using these data. There are numerous limitations in the existing research comparing these cross-national outcomes due to desynchronized survey years, cultural and language variation, and an overall heightened risk of measurement error. This systematic review reveals a need for robust linking methodologies to facilitate comparisons between the Northeast Asian countries. Rehabilitation care practitioners' research and practice areas will be geographically and culturally broadened by using evaluation tools that can compare a client's cognitive and physical health regardless of their country of origin. The global development of health care services can accelerate by examining the treatment trends of other countries.