Knowledge translation of scholarly publishing impacts on public health

BACKGROUND: Although scholarly publishing plays a key role in learning, the role of knowledge translation of scholarly publishing, together with education and income, in public health has not been well established. The objective was to describe how knowledge translation of scholarly publishing impacts public health. METHODS: The correlations between the input data and the target data were first calculated. After the input data that are not correlated with the target had been removed, principal component analysis was performed to avoid multicollinearity problems in the input data. Finally, multivariate regression was used to fit the relationship between the principal components and the target data. In this way, both dimensionality reduction and personalized optimization oriented toward a target can be achieved. RESULTS: With public health in China measured by Life expectancy and Death rate, the Pearson correlation coefficient, principal component analysis, and linear regression were applied. Some activities of knowledge translation of scholarly publishing with a focus on health and well-being were found to have the highest correlations with the first principal component. The first and the second principal components explain 99.3% of the variation (p < 0.01) in Life expectancy and 92.8% of the variation (p < 0.01) in Death rate, respectively. CONCLUSIONS: Scholarly publishing, education, income, health expenditure, and nurses and midwives appear to have a similarly important effect on public health.

knowledge to improve the health of populations, provide more effective health services and products and strengthen the health care system". The principles of knowledge translation can be described as 'dissemination', 'utilization', 'evidence into practice' and 'knowledge transfer' [14,15]. Understanding local Indigenous processes of knowledge creation, dissemination, and utilization is a necessary prerequisite to effective knowledge translation in Indigenous contexts [16]. The most important question is what is and what is not considered to be knowledge translation [17]. Knowledge translation in some circumstances has been found to be as effective as complex and multifaceted approaches [18]. Definitions of concepts in knowledge translation are unclear [19,20]. As a result, information retrieval is difficult [21]. Knowledge translation can be made to work by making knowledge relevant to the challenges that stakeholders face, to their capabilities, and to the expectations of the government [22]. Knowledge translation strategies in the health field have involved public or community prevention-oriented coalitions from a range of health and well-being disciplines [23], preventative adolescent substance abuse services [24], healthy body weight promotion [25], and immunization and cancer screening prevention [26].
The research utilization theory of knowledge translation suggests that knowledge is a changing set of understandings shaped by those who both generate and use research [27]. It implies that potential users are more likely to use research if there is an identified need or incentive [28]. This is similar to the diffusion of innovations theory [29]. Potential adopters of innovations can be categorized as innovators, early adopters, early majority, etc. [30]. The theories have varying objectives, which range from providing information to individuals or to large audiences to achieving behavior change through education or skills acquisition [31]. For example, the successes, challenges, and lessons learned from using social media within health research have been studied [32]. At the same time, the knowledge gap has been found to affect the dissemination of information in a social system [33,34]. More importantly, it is often difficult to measure directly the effects of knowledge creation and diffusion on society [35].

Methods
The data were obtained from the website of the World Bank (https://data.worldbank.org) and the website of SCImago (https://www.scimagojr.com). Note that the data used in this article do not include the influence of knowledge gaps. All data on scholarly publishing (e.g., articles indexed by SCImago, journals indexed by SCImago, etc.), education, health, etc. in this article refer to the entire Chinese community. This article aims to provide a pioneering investigation of the relationship between knowledge translation from research results published in scholarly journals and the health and well-being of Chinese society as a whole, without considering the differences caused by knowledge gaps. In detail, the relationship has been statistically analyzed based on data on the research results published in scholarly journals and on health and well-being in recent years.
Data are essential for testing theories and the hypotheses generated from them. To test most hypotheses, two variables (a proposed cause and a proposed outcome) need to be measured. Variables are things that can vary. Once the research data have been collected, they are analyzed both by examining the general trends in the data graphically and by fitting statistical models to the data.
Many researchers have studied issues in the social sciences by using mathematical methods and statistical models. For example, regression and component analysis are important and frequently used in social science research [36][37][38][39][40][41][42]. Principal component analysis, a mathematical method, can help find relationships between two sets of variables (a set of cause variables and a set of outcome variables) that have been collected for an issue in society. For example, principal component analysis has been used to assess socioeconomic impact [43]; to investigate the relationship between academic performance, substance use, sleep quality, and risk of anxiety and depression in young adults [44]; to analyze the performance of semiconductor devices [45]; to investigate the relationships between some pre- and post-slaughter traits of broilers [46]; for early disease detection [47]; and to predict ozone concentrations [48].
Multicollinearity is a linear association between two or more explanatory variables. A set of variables is perfectly multicollinear if it satisfies an exact linear relation of the form $\lambda_0 + \lambda_1 x_{1i} + \lambda_2 x_{2i} + \cdots + \lambda_n x_{ni} = 0$, where n is an integer, $\lambda_n$ is a constant, and $x_{ni}$ is the ith observation on the nth explanatory variable. Thus, when Eq. 2 is valid, there is multicollinearity among the explanatory variables.
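As a small illustration of this definition, the sketch below checks whether a given linear combination of explanatory variables vanishes for every observation. The function name `is_perfectly_multicollinear`, the coefficient values, and the toy data are assumptions made for illustration only; they are not taken from the data used in this article.

```python
from typing import Sequence


def is_perfectly_multicollinear(
    columns: Sequence[Sequence[float]],
    lambdas: Sequence[float],
    tol: float = 1e-9,
) -> bool:
    """Return True if lambda_0 + sum_k lambda_k * x_ki is zero for every observation i.

    lambdas[0] is the constant term; lambdas[k] multiplies columns[k - 1].
    At least one variable coefficient must be non-zero, otherwise the
    relation would be trivial.
    """
    if len(lambdas) != len(columns) + 1:
        raise ValueError("need one constant plus one coefficient per column")
    if all(lam == 0 for lam in lambdas[1:]):
        raise ValueError("at least one variable coefficient must be non-zero")
    n_obs = len(columns[0])
    for i in range(n_obs):
        total = lambdas[0] + sum(
            lam * col[i] for lam, col in zip(lambdas[1:], columns)
        )
        if abs(total) > tol:
            return False
    return True


# Toy data (hypothetical): x3 is an exact linear function of x1 and x2.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
x3 = [5.0 + 2.0 * a - b for a, b in zip(x1, x2)]

# 5 + 2*x1 - x2 - x3 = 0 holds for every observation.
exact = is_perfectly_multicollinear([x1, x2, x3], [5.0, 2.0, -1.0, -1.0])

# Changing a single observation breaks the exact relation.
x3_perturbed = x3[:-1] + [x3[-1] + 0.5]
broken = is_perfectly_multicollinear(
    [x1, x2, x3_perturbed], [5.0, 2.0, -1.0, -1.0]
)
```

In real data the relation is rarely exact, which is why near-multicollinearity is usually handled with a method such as principal component analysis rather than by searching for exact coefficients.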
Principal component analysis is one method to overcome the multicollinearity problem. For an $n \times p$ data matrix, the correlation coefficient can be calculated by Eq. 4. The eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, p$) and the corresponding eigenvectors, each with $p$ components, can be obtained by solving Eq. 6, with the eigenvalues ordered so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$. The ith principal component, its variance, the load, and the component scores for all principal components then follow from these eigenvalues and eigenvectors.
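For reference, the quantities named in this paragraph have standard textbook forms, sketched below. This is a reconstruction under the usual conventions of principal component analysis on a correlation matrix, not the article's own notation: the symbols $x_{ij}$, $\bar{x}_j$, $r_{jk}$, $R$, $e_{ij}$, $Z_j$, $F_i$, and $l_{ij}$ are assumptions, and only the correlation coefficient and the characteristic equation are tagged with the equation numbers that the text itself refers to.

$$
X = (x_{ij})_{n \times p}, \qquad
r_{jk} = \frac{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)}
{\sqrt{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2 \, \sum_{i=1}^{n} (x_{ik} - \bar{x}_k)^2}} \tag{4}
$$

$$
R = (r_{jk})_{p \times p}, \qquad |R - \lambda I| = 0 \tag{6}
$$

$$
F_i = e_{i1} Z_1 + e_{i2} Z_2 + \cdots + e_{ip} Z_p, \qquad
\operatorname{Var}(F_i) = \lambda_i, \qquad
l_{ij} = \sqrt{\lambda_i}\, e_{ij}
$$

Here $Z_j$ denotes the jth standardized input variable and $e_i = (e_{i1}, e_{i2}, \ldots, e_{ip})$ the unit eigenvector belonging to $\lambda_i$. Under these assumptions the component scores for all principal components are obtained by evaluating each $F_i$ on every row of the standardized data matrix.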

Results
Note that principal component analysis gives the same result for the same input data, regardless of the target, if the input data are not cleaned [49]. To increase interpretability and avoid the multicollinearity problem when principal component analysis is applied to input data oriented toward a target, a personalized optimization is performed first. In other words, the input data are first filtered based on their correlation with the target data. In this article, the correlation coefficient between the input data and the target data is calculated according to Eq. 4. After the input data that are not significantly correlated with the target data (p > 0.01) have been removed, principal component analysis is performed on the selected input data. Finally, multivariate regression is used to fit the relationship between the principal components and the target data. In this article, we focus on all authors in China and all journals published in China, rather than on particular types of authors in China. Narrowing down the list of authors by using citations in the Web of Science or ORCID is left for future work. [36][37][38][39][40][41][42]. Life expectancy at birth in China is found to increase nearly linearly with time since 1996, as shown in Figure 1. Figure 1 also shows that Death rate, School enrollment (primary), School enrollment (tertiary), Current health expenditure per capita, Adjusted net national income per capita, Adjusted savings: education expenditure, intellectual properties, and Nurses and midwives in China fluctuate but have an overall growth trend. School enrollment (primary) in particular is found to fluctuate. In this article, either Life expectancy at birth or Death rate is used to measure health and well-being in China.
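The filtering step described above can be sketched as follows. The sketch computes the Pearson correlation coefficient between each input variable and the target, converts it to a t statistic, and keeps only the inputs that are significant at p < 0.01. The function names, the toy series, and the hard-coded critical value (2.819, the two-sided 1% point of the t distribution with 22 degrees of freedom, as found in standard tables for 24 yearly observations) are assumptions for illustration; a statistics package would normally compute the exact p-value.

```python
import math
from typing import Dict, List, Sequence

# Two-sided critical t value for p = 0.01 with 22 degrees of freedom
# (24 observations minus 2). Hard-coded from standard t tables: an
# assumption of this sketch, not a value taken from the article.
T_CRIT_P01_DF22 = 2.819


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """Absolute t statistic for testing r = 0 with n observations."""
    if abs(r) >= 1.0:
        return math.inf
    return abs(r) * math.sqrt((n - 2) / (1.0 - r * r))


def filter_inputs(
    inputs: Dict[str, Sequence[float]],
    target: Sequence[float],
    t_crit: float = T_CRIT_P01_DF22,
) -> List[str]:
    """Names of the inputs whose correlation with the target is significant."""
    n = len(target)
    return [
        name
        for name, values in inputs.items()
        if t_statistic(pearson_r(values, target), n) > t_crit
    ]


# Toy data (hypothetical): 24 yearly values of a steadily rising target.
target = [float(t) for t in range(24)]
inputs = {
    # Rises with the target, with a small alternating disturbance.
    "trending": [t + (1.0 if t % 2 else -1.0) for t in range(24)],
    # Oscillates without any trend, so it is uncorrelated with the target.
    "flat": [1.0, -1.0, -1.0, 1.0] * 6,
}
kept = filter_inputs(inputs, target)
```

Only the inputs returned by `filter_inputs` would be passed on to the principal component analysis, which is what makes the subsequent components specific to the chosen target.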
2019  684048  199074   544310   164545  162050  182982  662  77
2018  605616  378798  2161615   681294  140381  155037  682  80
2017  538162  393539  3512243  1209703  120602  132280  683  76
2016  498325  382165  4308362  1569550  107339  107090  682  77
2015  460425  361010  4987119  1909799   95584   90105  648  67
2014  490859  353627  5257180  2093561   86391   74855  635  57
2013  456542  324831  5145005  2138702   76196   64235  626  51
2012  415082  296107  4913231  2124347   65084   47028  595  41
2011  393879  273191  4548474  2000236   57939   35724  612  37
2010  344017  246625  4229403  1917681   50295   23977  605  33
2009  308460  224807  3895162  1794314   44109   20173  595  27
2008  261264  194182  3419775  1587921   37961   15806  577  23
2007  223247  166440  2961272  1365609   32326   13193  538  21
2006  201159  147615  2550868  1186712   28222   11325  565  21
2005  171226  124369  2194990  1032205   23663    7756  537  16
2004  117131   89148  1752095   853695   19279    4907  501  14
2003   81740   62404  1303917   650898   15203    3261  473  13
2002   68633   51546   998871   495375   11049    2148  449  13
2001   65674   45734   794033   398386    8774    1648  449  12
2000   51443   36599   636868   326461    8364    1358  375  12
1999   43315   29580   495957   267703    6839    1030  367  11
1998   42555   26791   411332   228716    6736    1055
1997   36113   22884   353338   207839    6088     743
1996   30780   18698   282718   164565    5331

The multidimensional data collected above are likely to be related; in other words, they share common information. To see the utility of the various kinds of collected multidimensional data, this common information should be removed. However, the related data cannot simply be discarded, because reducing them would inevitably lose a lot of important information, which would undermine the effectiveness and reliability of the target information.
Besides, if the multidimensional data were simply processed into one-dimensional data, the results calculated for the target information from each dimension would be independent of one another. There would then be no way to compare the comprehensive conclusion drawn from the multidimensional input information with the input information of each single dimension. Factor analysis is a technique for extracting common factors from groups of variables, and it is well suited to solving the above problems.
The load of education expenditure in the first principal component, 0.886, is very high. It is comparable with those of Current health expenditure per capita (0.916) and net national income per capita (0.898). It has been concluded that the level of education seems to exert a very high impact on regional growth [51]. It has also been found that there is a significant relationship between regional growth and higher education within North European countries [52]. The high load of education expenditure in the principal component analysis agrees well with the conclusions drawn in references [51,52]. At the same time, it was found that school-based programs to promote health knowledge in an area characterized by low levels of income and education may have much smaller payoffs than programs that encourage the investments in time preference made by the more educated [12]. This viewpoint is partly supported by the fact that the load of education expenditure lies in the middle of all 14 factors, ranking eighth out of 14. Significant non-monetary returns to education concerning health outcomes, and not necessarily health-related behavior, have been found [13].
Schooling could be an important factor influencing nonmarket production processes associated with fertility and child health [53]. The loads shown in Table 5 support the above findings on health promotion in references [12,13,53]. Most loads of the numbers of documents and journals indexed by SCImago are comparable with those of education expenditure, Current health expenditure per capita, and Nurses and midwives. This agrees well with the conclusion that knowledge plays a crucial role in the process of economic development [54]. It has been found at the global level that information and knowledge exercise a decisive impact on the functionality and performance of organizations, assuring the sustainability of the economy in the long term [5]. Moreover, increasing and developing knowledge can improve public health. For example, increasing knowledge and awareness by training health professionals to communicate and deliver targeted preconception care interventions may be important [9], and it is necessary to develop sustainable strategies for collective health-promoting activities, in addition to strengthening multidisciplinary work and Continuing Education actions [10]. All the above conclusions demonstrate that education, income, knowledge, etc. can have an impact on public health. Most loads shown in Table 5 are comparable (the loads of 11 input variables range from 0.78 to 0.99). This means that none of these factors can be neglected in public health promotion.
The component score coefficient matrix is an output of principal component analysis [36][37][38][39][40][41][42]. A component score coefficient represents the weight given to a variable when the saved component variables are computed (Table 6). The component score coefficients of education expenditure, School enrollment, tertiary (% gross), Current health expenditure per capita (current US$), and Nurses and midwives in the first and second principal components are also very high. Most component score coefficients (9 input variables) vary from 0.09 to 0.18. These 9 variables include factors for education, net national income per capita, scholarly publishing, etc. This further supports the findings that the level of education exerts a very high impact on regional growth [51] and that there is a significant relationship between regional growth and higher education [52]. These results also agree well with the findings that school-based programs to promote health knowledge in an area characterized by low levels of income and education may have a small payoff [12], that there are significant non-monetary returns to education concerning health outcomes [13], and that schooling could be an important factor influencing fertility and child health [53]. Most component score coefficients of the documents and journals published in China are comparable with those of education expenditure, School enrollment, tertiary (% gross), Current health expenditure per capita (current US$), and Nurses and midwives. This might be because knowledge is the most powerful engine for economic development [54], knowledge can accelerate sustainable economic growth [5], and increasing and developing knowledge can improve public health [9,10]. The indication in Table 6 that all these input factors can improve health and well-being is consistent with the above conclusions drawn in the literature.
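To make the notion of a load concrete, the sketch below computes the loads of the first principal component from a small correlation matrix by power iteration. It is a minimal illustration under stated assumptions: the 3 x 3 correlation matrix is hypothetical, the helper name `first_eigenpair` is invented here, and the loads are unrotated, whereas the loads reported in Table 5 were obtained after varimax rotation on the article's own data.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def first_eigenpair(matrix: Matrix, iterations: int = 500) -> Tuple[float, List[float]]:
    """Dominant eigenvalue and unit eigenvector of a symmetric, positive
    semi-definite matrix, found by power iteration.

    The uniform start vector works for matrices with positive entries,
    such as the correlation matrix of positively related variables.
    """
    size = len(matrix)
    vector = [1.0 / math.sqrt(size)] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * vector[j] for j in range(size))
            for i in range(size)
        ]
        norm = math.sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
        eigenvalue = norm  # converges to the dominant eigenvalue
    return eigenvalue, vector


# Hypothetical correlation matrix of three strongly related input variables.
R: Matrix = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.7],
    [0.8, 0.7, 1.0],
]

eigenvalue, vector = first_eigenpair(R)

# Load of variable j on the first component: sqrt(eigenvalue) * e_j.
loadings = [math.sqrt(eigenvalue) * e for e in vector]
```

With strongly correlated inputs, as in this toy matrix, every variable receives a high load on the first component, which mirrors the pattern of comparable loads discussed above.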
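The relation between loads and component score coefficients can be illustrated with a two-variable example that has a closed-form solution. The sketch assumes the common convention for unrotated principal components in which a score coefficient equals the load divided by the eigenvalue, so that the resulting component score has unit variance. The correlation value 0.8 is hypothetical, and the coefficients in Table 6 come from the article's own data and may follow a different convention.

```python
import math

# Hypothetical correlation between two standardized input variables.
r = 0.8
correlation = [[1.0, r], [r, 1.0]]

# For a 2 x 2 correlation matrix with r > 0, the first eigenpair is known
# in closed form: eigenvalue 1 + r, unit eigenvector (1/sqrt(2), 1/sqrt(2)).
eigenvalue = 1.0 + r
eigenvector = [1.0 / math.sqrt(2.0)] * 2

# Load of each variable on the first component.
loadings = [math.sqrt(eigenvalue) * e for e in eigenvector]

# Score coefficient = load / eigenvalue (assumed convention, see lead-in).
score_coefficients = [load / eigenvalue for load in loadings]

# Variance of the resulting component score, b^T R b, for standardized inputs.
score_variance = sum(
    score_coefficients[i] * correlation[i][j] * score_coefficients[j]
    for i in range(2)
    for j in range(2)
)
```

Under this convention the score coefficients are smaller than the loads whenever the eigenvalue exceeds 1, which is one reason why coefficients of the order of 0.1 can accompany loads close to 1 when many correlated variables share a component.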

Discussion
According to Table 1, out of 14 components, only those whose eigenvalues are larger than 1 have been selected for the multiple linear regressions between Life expectancy at birth or Death rate and the principal components. Table 7 high impact on regional growth [51,52], and an impact on health promotion [12,13,53]. Such linear relationships between the principal components (which include scholarly publishing, education, income, health expenditure, etc.) and the target data (the data on public health in China) could arise because knowledge is necessary for health benefits and the knowledge needed for health and well-being is varied [6], either increasing knowledge or developing knowledge can improve public health [9,10], lack of knowledge can cause serious health problems [11], knowledge is needed for making proper decisions and realizing actions [5], making better health decisions leads to improved health outcomes [13], knowledge is the most powerful engine for economic development [54], and knowledge can accelerate sustainable economic growth [5]. In other words, both economic development and regional development can improve health and well-being.
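The selection and regression steps described here can be sketched as follows. The sketch keeps the components whose eigenvalues exceed 1 and then regresses the target on their scores. It relies on the fact that principal component scores are mutually uncorrelated, so each regression coefficient can be computed separately; this shortcut, the function names, the eigenvalues, and the toy scores are assumptions for illustration and are not the article's data or software.

```python
from typing import List, Sequence, Tuple


def kaiser_select(eigenvalues: Sequence[float]) -> List[int]:
    """Indices of the components whose eigenvalue is larger than 1."""
    return [k for k, value in enumerate(eigenvalues) if value > 1.0]


def regress_on_components(
    scores: Sequence[Sequence[float]], target: Sequence[float]
) -> Tuple[float, List[float], float]:
    """Least-squares fit of the target on mutually uncorrelated score columns.

    Returns the intercept, one slope per component, and R squared.
    The per-column formula is only valid because the columns are uncorrelated.
    """
    n = len(target)
    mean_y = sum(target) / n
    means = [sum(col) / n for col in scores]
    slopes = []
    for col, mean_c in zip(scores, means):
        sxy = sum((c - mean_c) * (y - mean_y) for c, y in zip(col, target))
        sxx = sum((c - mean_c) ** 2 for c in col)
        slopes.append(sxy / sxx)
    intercept = mean_y - sum(b * m for b, m in zip(slopes, means))
    fitted = [
        intercept + sum(b * col[i] for b, col in zip(slopes, scores))
        for i in range(n)
    ]
    sse = sum((y - f) ** 2 for y, f in zip(target, fitted))
    sst = sum((y - mean_y) ** 2 for y in target)
    return intercept, slopes, 1.0 - sse / sst


# Hypothetical eigenvalues: only the first two components pass the threshold.
selected = kaiser_select([2.1, 1.3, 0.4, 0.2])

# Hypothetical centered, mutually orthogonal scores of the two kept components.
f1 = [-1.0, -1.0, 1.0, 1.0]
f2 = [-1.0, 1.0, -1.0, 1.0]
# Toy target built as 70 + 2*f1 + 0.5*f2, so the fit should recover these values.
y = [67.5, 68.5, 71.5, 72.5]

intercept, slopes, r_squared = regress_on_components([f1, f2], y)
```

Because the components are uncorrelated, the regression on them does not suffer from the multicollinearity that a regression on the original, strongly correlated inputs would.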

Conclusions
Note that the combination method proposed in this article is not a definitive answer. Instead, the model is an example that illustrates our approach to studying the relationship between the input data (knowledge, education, etc.) and the target data (public health).
Besides, because the collected data on knowledge, education, etc. are limited and may contain errors, the relationship between the input data (knowledge, education, etc.) and the target data (public health) obtained with the proposed method may differ from the actual situation.
However, the proposed method can be used to study how the target data depend on the input data, because it provides a new way to examine the influence of knowledge in future studies.
The proposed method can be treated as a new method introduced to study the correlation between scholarly publishing and health and well-being.
Principal component analysis has been used to avoid the multicollinearity problem in the data used in this article. Through a case study of data on health and well-being, education, income, and knowledge translation of scholarly publishing in China, we explore the effects of the various activities of scholarly publishing on health and well-being. The results of the principal component analysis show that two principal components have eigenvalues larger than 1. This implies that scholarly publishing, especially open access publishing, could be an important factor in improving health and well-being. All results demonstrate that scholarly publishing can make an important contribution to health and well-being in China. The findings in this paper agree with earlier conclusions reported in the literature that varied knowledge is necessary for health and well-being [6], knowledge can promote public health [9,10], serious health problems can occur due to lack of knowledge [11], knowledge is necessary for making proper decisions and realizing actions [5], and health outcomes can be improved by making better health decisions [13]. In conclusion, the combination of correlation analysis, principal component analysis, and multivariate regression is valid for studying the correlation between scholarly publishing and health and well-being.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Table 1 Life expectancy at birth, Death rate, School enrollment (primary), School enrollment (tertiary), Current health expenditure per capita, Adjusted net national income per capita, Adjusted savings: education expenditure, intellectual properties, and Nurses and midwives in China. The data come from the website of the World Bank (https://data.worldbank.org).
Table 4 Eigenvalues, individual and cumulative, obtained by using the normalized data. The raw data come from the website of the World Bank (https://data.worldbank.org) and the website of SCImago (https://www.scimagojr.com).
Table 5 The loadings after varimax rotation, obtained by using the normalized data. The raw data come from the website of the World Bank (https://data.worldbank.org) and the website of SCImago (https://www.scimagojr.com).
Table 6 Component score coefficient matrix, obtained by using the normalized data. The raw data come from the website of the World Bank (https://data.worldbank.org) and the website of SCImago (https://www.scimagojr.com).