A Framework for the Evaluation of Healthcare Data’s Value from the Perspective of Open Data

Background: This study aimed to develop a framework for evaluating the value of healthcare data, so as to provide data managers with a tool for making decisions on data openness. Methods: The Delphi method was adopted. First, a rudimentary framework was constructed following a literature review and focus group interviews, and an inquiry letter was designed. After the inquiry letters were distributed to experts in related areas, the framework was modified according to their feedback. This process was repeated until a consensus was reached. Results: Fifteen experts, whose levels of activeness and authority were relatively high, were invited to take part in the inquiry. After two Delphi rounds, the research produced a framework for the evaluation of healthcare data with 2 primary criteria, 7 secondary criteria and 21 tertiary criteria. Conclusion: The framework established in this research lays a solid foundation for identifying and evaluating the value of healthcare data, and is expected to drive the opening of valuable healthcare data.

Identifying and prioritizing the opening-up of high-value data is a widely acknowledged principle in open data movements around the world. Both the Open Government Directive and the G8 Open Data Charter highlight prioritizing the opening-up of high-value data, and the World Bank has put forward the "80/20 principle", which holds that 20% of the data can contribute up to 80% of the public value and that these data should be identified and published first [4]. Some fast-moving countries and organizations have established principles for identifying and evaluating high-value data through practice. For instance, the Open Data Prioritization Toolkit published by the U.S. Federal CIO Council provides, through an aggregated list of questions, principles for evaluating the value, cost and risk of open data [5]. Similarly, in July 2016, Share-PSI of the EU published Best Practice: Dataset Criteria, which sets out a number of criteria that can be used to prioritize the publication of some datasets ahead of others [6].
In the healthcare and medical industry, data are highly concentrated, and the need for big data in healthcare has extended far beyond the scope of diagnosis and treatment for patients. The opening-up, sharing and application of data in the field of healthcare are expected to create significant economic and social value. However, because of characteristics inherent in healthcare data, including multiple users, privacy and complexity, opening up and sharing these data faces many more obstacles. In particular, when data managers are faced with requests for data from multiple parties, how should they make decisions about opening the data? This research therefore aims, while complying with laws and ethics, to establish a framework for the evaluation of healthcare data and to accelerate the prioritized opening-up of high-value healthcare data, in order to fully realize the potential of the data.
This research employed multiple qualitative research methods to establish a framework for the evaluation of healthcare data's value, which was completed through the following procedure:
1. A literature review was conducted to gain a basic understanding of the key factors determining the value of healthcare data. The author searched both Chinese and English databases (including CNKI and Web of Science) using keywords including "Data Asset Valuation" and "Data Value Assessment", and then extracted from the literature the key factors determining the value of healthcare data.
2. On the basis of the literature review, the author invited experts in the relevant areas to take part in focus group interviews, during which the factors were assessed, sorted and supplemented with new factors. A rudimentary evaluation framework was then established and an inquiry letter was designed.
3. The inquiry letters were sent to experts in the relevant areas via email, and the experts' feedback was collected, summarized and analyzed. The author then modified the evaluation framework, designed a new inquiry letter, and collected the experts' advice again, repeating this process until the experts' opinions reached a consensus.
4. Data statistics and analytics. The data in this research were entered in Excel and analyzed with SPSS 22.0.
(1) The activeness of the experts participating in the research was measured by the ratio of effective inquiry letters, with the following formula: Active coefficient = (effective letters collected / letters distributed) × 100%.
(2) The level of authority of the experts participating in the research was measured by the authority coefficient Cr, which is normally influenced by two factors: Ca (expert judgment criteria) and Cs (expert familiarity). The formula for Cr is as follows: Cr = (Ca + Cs) / 2. If Cr > 0.7, the authority of the expert is acceptable. The specific quantitative details of Cr are presented in Tables 1 and 2 [7].
(3) The results of the expert consultation were described by the mean, standard deviation, coefficient of variation (CV) and full score ratio. The higher the CV, the larger the divergence among the experts, indicating that the factor in question needs further modification. The higher the full score ratio (the proportion of experts giving a factor the full score, ranging from 0 to 1), the more important the factor.
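The indicators described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SPSS procedure used in the study: the function names are hypothetical, the full score is assumed to be 5 on a 5-level agreement scale, and the sample standard deviation (n − 1 denominator) is assumed for the CV.

```python
from statistics import mean, stdev

FULL_SCORE = 5  # assumption: 5-level agreement scale, 5 is the highest level


def active_coefficient(collected: int, distributed: int) -> float:
    """Effective letters collected / letters distributed, in percent."""
    return collected / distributed * 100


def authority_coefficient(ca: float, cs: float) -> float:
    """Cr = (Ca + Cs) / 2; the text treats Cr > 0.7 as acceptable."""
    return (ca + cs) / 2


def describe_criterion(scores: list[int]) -> dict[str, float]:
    """Mean, standard deviation, CV and full score ratio for one criterion."""
    m = mean(scores)
    sd = stdev(scores)  # assumption: sample standard deviation
    return {
        "mean": m,
        "sd": sd,
        "cv": sd / m,  # coefficient of variation
        "full_score_ratio": sum(s == FULL_SCORE for s in scores) / len(scores),
    }


if __name__ == "__main__":
    # Hypothetical ratings from five experts for a single criterion.
    print(describe_criterion([5, 5, 4, 4, 5]))
```

A criterion with a high CV or a low full score ratio would, following the description above, be a candidate for modification in the next round.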

Rudimentary framework and inquiry letter
After retrieval, 131 articles were found to be related to the research topic. Two researchers read and sorted the literature and extracted the key factors that affect the value of data. This was followed by small-scale focus group interviews involving 2 experts in healthcare management and 3 experts in computer information, which produced a rudimentary evaluation framework based on the results of the literature review (see Table 5 for details).
An inquiry letter was designed based on the rudimentary framework. It includes 3 segments: 1) an introduction to the research background, an overview of the framework and a guide on how to answer the letter; 2) a rating section collecting the level of agreement with each factor on 5 levels ("Disagree", "Slightly disagree", "Slightly agree", "", "Definitely agree"), with a column after each factor in which experts can write down their advice; and 3) a survey of the experts' basic information, their familiarity with the research content, and their judgement criteria. Familiarity is divided into 5 levels ranging from "unfamiliar" to "familiar"; the judgement criteria include theoretical analysis, practical experience, peer understanding and intuitive perception, and the weight of each criterion is rated as "big", "medium" or "small".

Basic information about the experts enrolled in the Delphi method
In this research, the Delphi method involved 15 experts, whose professions include hospital managers, directors and frontline workers in information departments, staff at governmental healthcare information bureaus, technicians at information companies, and university researchers. The titles of these experts range from senior (n=9) and sub-senior (n=4) to junior (n=1); their highest degrees range from doctorate (n=4) and master's (n=9) to bachelor's (n=2); their years of work experience are "≤10 years" (n=2), "10~20 years" (n=5), "20~30 years" (n=7) and "≥30 years" (n=1); and their specialties are computer information (n=9), health care management (n=5) and public health (n=1), as presented in Table 3.

The level of activeness and authority of the experts
The level of activeness of the experts is measured by the active coefficient. In this research, 2 rounds of inquiry letters were conducted, with 15 letters sent out in each round. All 15 letters were returned in both rounds, resulting in an active coefficient of 100%.
The level of authority of the experts is measured by the authority coefficient. The judgement criteria adopted by the experts are listed in Table 4, and the results from the two rounds were the same. The experts' authority was at a rather high level: according to the formula, the authority coefficient reached 0.81, with a Ca of 0.9 and a Cs of 0.72.
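As a quick check, the reported values can be substituted into the two formulas given in the methods (the symbols $C_r$, $C_a$ and $C_s$ simply typeset the paper's Cr, Ca and Cs):

```latex
\[
C_r = \frac{C_a + C_s}{2} = \frac{0.90 + 0.72}{2} = 0.81 > 0.7,
\qquad
\text{Active coefficient} = \frac{15}{15} \times 100\% = 100\%.
\]
```

Both results agree with the figures reported in this section, and the authority coefficient exceeds the 0.7 acceptability threshold.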

Results from the inquiry letters
After the first round of inquiry letters, the results of the experts' appraisal of the tentative criteria are listed in Table 5. According to the results, the average levels of agreement and of consistency regarding the primary and secondary criteria were relatively high. Subsequently, several changes were made to the tertiary criteria that had a low mean and full score ratio as well as a high CV: 1) C3, "conformity", was changed to "standard", meaning "Whether the data model, data elements, terminology are compliant with related national or regional standards"; 2) C4, "accessibility", was added, meaning "Whether there is a lag when accessing the data and volume capacity"; 3) C12, "professional title of the project leader", was removed; 4) C15, "the professional title of the decision maker", was changed to "the position of the decision maker"; 5) C18, "the data user's level of education", was changed to "the level of the individual's knowledge of healthcare", meaning "the ability of the individual to obtain and understand healthcare information, and the ability to sustain and promote health with the information".
The second round of inquiry was conducted by distributing the modified questionnaires, which were again collected, summarized and analyzed after being filled in. The results show that both the experts' level of agreement with the criteria and the consistency among their opinions improved. A consensus among the experts was considered to have been reached, given that the means of all tertiary criteria were larger than 4, the CVs were smaller than 0.2, and the experts offered suggestions on only a few details of the explanations of the criteria.
The specific purpose for using the data, such as commercial insurance, R&D by a pharmaceutical company, IT system optimization, or use by a medical equipment manufacturer.
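The consensus condition reported for the second round (all tertiary criteria with a mean larger than 4 and a CV smaller than 0.2) can be expressed as a simple check. The sketch below is illustrative only: the function names and example ratings are hypothetical, and the sample standard deviation is assumed when computing the CV.

```python
from statistics import mean, stdev

MEAN_THRESHOLD = 4.0  # from the text: means of all tertiary criteria larger than 4
CV_THRESHOLD = 0.2    # from the text: CVs smaller than 0.2


def needs_revision(scores: list[int]) -> bool:
    """True if a criterion fails the mean or CV condition."""
    m = mean(scores)
    cv = stdev(scores) / m  # assumption: sample standard deviation
    return not (m > MEAN_THRESHOLD and cv < CV_THRESHOLD)


def consensus_reached(ratings: dict[str, list[int]]) -> bool:
    """True if no criterion in the round needs revision."""
    return not any(needs_revision(s) for s in ratings.values())


if __name__ == "__main__":
    # Hypothetical ratings; the labels reuse the paper's criterion codes
    # purely for illustration and are not the study's data.
    round_ratings = {"C1": [5, 5, 4, 5, 4], "C12": [2, 4, 5, 1, 3]}
    for name, scores in round_ratings.items():
        print(name, "revise" if needs_revision(scores) else "keep")
    print("consensus:", consensus_reached(round_ratings))
```

In a Delphi workflow, criteria flagged for revision would be reworded, replaced or removed before the next inquiry round, as was done with C3, C12, C15 and C18 here.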

The scientificity of the method
The Delphi method is a research method used to appraise or predict with the help of experts' experience and knowledge. The opinions of a panel of experts are collected through multiple rounds of questionnaires, with controlled feedback between rounds, until a consensus is reached among the experts. The selection of experts and effective feedback lay the foundation for the scientific nature of the research, and the recommended number of experts is between 10 and 50 [8]. This research adopted the Delphi method to establish a framework for the evaluation of healthcare data's value. Fifteen experts in relevant areas were invited, all of whom work in areas related to computer information or health care, and 85% of whom hold senior or sub-senior titles, have a master's degree or higher, and have more than 10 years of work experience. The authority coefficient is 0.81, demonstrating that the experts were highly representative and had solid knowledge as well as rich practical experience. The experts were also highly active, given that the response rates of both rounds of questionnaires reached 100%, and that the criteria with lower agreement received specific suggestions for adjustment in the advice column. After the criteria with low agreement and inconsistent scores were modified based on the results of the questionnaires, the average degree of agreement and the full score ratio increased while the CV decreased to a relatively low level, which supports the scientific soundness of the final draft of the evaluation framework.

The framework and its significance
Following the two rounds of Delphi inquiry, this research formulated the finalized framework for the evaluation of healthcare data's value, which includes 2 primary criteria, 7 secondary criteria and 21 tertiary criteria. The 2 primary criteria are "inner value" and "value in use". "Inner value", which includes 3 secondary criteria ("usable", "easy to use" and "important"), means that the characteristics inherent in the data are related to the value of healthcare data. For instance, the secondary criterion B1, "usable", means that open data can more easily create greater value in practice if the data are complete, faithful to reality and compliant with national or regional standards. The other primary criterion, "value in use", means that the value created by opening up data is related to the scenarios in which the data are used. This research summarized four major scenarios in which healthcare data could be used: "for scientific use", "for managers to make decisions", "the individual's purpose for using the data" and "for commercial use". Taking

Limitations and future research
This research explored a new approach to the opening-up and sharing of healthcare data and established a framework for the evaluation of data. However, it has some limitations. First, this research did not assign weights to the tertiary criteria after establishing the framework; in future research the author will assign weights to the current criteria so as to better guide data evaluation in practice. Second, the value of data is not the only thing to be considered when opening up healthcare data, since the risk and cost of the process should not be ignored either. Building on this research, the author will conduct further research aimed at establishing an open-data evaluation model with three dimensions (value, risk and cost), so as to provide theoretical guidance for the open data movement in the healthcare industry.

Conclusions
The application of healthcare big data will bring profound changes to healthcare models, yet this application depends on opening data in an orderly fashion and in accordance with laws and ethics. Prioritizing high-value data has been one of the accepted principles of open data movements in many countries around the world. Drawing on peer understanding and professional expertise, a framework for the evaluation of healthcare data's value was developed using a scientific method, and it is expected to provide a theoretical tool with which data managers can implement this principle.