Establishment of the AI model
Based on deep learning, SCMC and YI TU Technology Company jointly developed a personalized inquiry and automatic diagnosis algorithm that mimics a consultation with a doctor. First, the electronic medical records (EMRs) were structured through natural language processing (NLP). We selected 59,041 high-quality medical records that had been hand-annotated by a team of professional doctors and informatics experts. The NLP model used deep learning techniques to convert free text from the EMRs into standardized clinical features, allowing further processing of the clinical information for diagnostic classification. Logistic regression classifiers were used to establish a hierarchical diagnostic system based primarily on anatomic divisions (for example, organ systems). Following automatic diagnosis based on the medical records, the matching examination or test items were generated. This artificial intelligence, which integrated the functions of inquiry, medical history collection, diagnosis, and ordering of tests or examinations, was put into use, and we named it XIAO YI. The algorithm was similar to that of Liang [11], except that our model had been updated and iterated based on data from our hospital information system. In addition, Liang’s study focused on using AI to diagnose pediatric diseases, whereas our study used AI to prescribe examinations and tests before the patient saw a doctor, in order to reduce the time patients spent waiting in hospital lines.
Considering guardians’ acceptance, although the AI algorithm could theoretically order most tests/examinations, our final client side only covered certain kinds of tests/examinations, namely those that were noninvasive (or less invasive) and low-cost. This restriction was configured in the back end, and no additional manual operation was needed. Thus, XIAO YI recommended only common items to patients. For example, if a 12-year-old child had passed blood in the urine and had lumbago for 1 day, the first diagnosis might be kidney stones. Based on the inquiry, XIAO YI would determine that the child needed a routine blood test, a routine urine test, and a urinary B-ultrasound. In some such cases, doctors might also ask the patient to have a CT scan. CT is more expensive, however, and B-ultrasound is sufficient for a preliminary diagnosis of kidney stones. In performance testing, most errors (85%) were missing items. This was the result of our deliberate choice, as we did not require XIAO YI to order all tests/examinations for patients. On the contrary, we only needed it to order the simplest and most common ones. The more complex and invasive ones were left to professional doctors.
At the same time, each department had dedicated back-end doctors responsible for reviewing every item ordered by XIAO YI. The doctors adjusted the tests/examinations manually according to the actual condition, for example, when parents wanted to add other tests/examinations that were not related to the disease, although this did not happen often. Only after the doctors’ approval could the patients pay for and complete the tests/examinations.
Procedure of the AI-assisted outpatient service
We first describe the standard outpatient service process and then the AI-based modifications to it. In the traditional process, patients register first and then wait in the waiting area. When it is their turn, they go to the consulting room to see their doctor. In most cases, a lab test or an imaging examination is needed to confirm the diagnosis. Patients then have to pay for these and go to the correct places to be examined or tested. After receiving the reports, patients wait again to see the doctor, who may recommend another examination/test or some medicines. In this study, we focus on the steps from registration to the examination or test.
The first step in the AI-assisted outpatient service is also registration. Next, patients open the WeChat application (a WhatsApp-like social application widely used in China) on their mobile phone. Patients’ unique outpatient numbers are linked to a WeChat-based mini program, the XIAO YI client. The XIAO YI client is the implementation of the algorithms discussed above and runs on both mobile phones and doctors’ work computers. It automatically reads the patient’s registration information. Depending on the chief complaint, XIAO YI asks the patient a series of questions, as a real doctor would. Each question is chosen based on the answer to the previous one. When XIAO YI judges that it has gathered enough information, the inquiry ends. XIAO YI then orders the tests or examinations needed to help the doctors make the clinical diagnosis. The tests and examinations “prescribed” by XIAO YI are basic, non-invasive, and relatively inexpensive (e.g., a routine blood test). Patients then pay for these tests and head to the testing rooms. Patients who disagree go through the traditional process of waiting in line to see a human doctor. When the test or examination is completed and the report is obtained, patients wait to be called to the doctor’s office for consultation. The traditional and AI-assisted workflows are shown in Fig. 1.
Selection of subjects
SCMC is one of the largest specialized pediatric hospitals in Shanghai and is affiliated with Shanghai Jiao Tong University School of Medicine. We collected patients’ registration information from August 1, 2019 to January 31, 2020. The dataset included patients from the internal department, gastroenterology department, and respiratory department who visited SCMC during that period. It included their gender, age (on the day of registration), registration code, registration time, time of meeting the doctor, time of examination/testing, time of prescription by the doctor, and time of receiving the medicines, among others. We protected patients’ privacy: in the dataset that we extracted and used for analysis, researchers could not see patients’ names or outpatient numbers. Each outpatient number was recoded into a registration code, mainly because a patient would sometimes register multiple times in one day, so the outpatient number had to be recoded to make each registration unique. This recoding also safeguarded patients’ information security.
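The recoding step can be illustrated with a minimal sketch. The actual recoding scheme is not described here, so the sequential code format and the names `recode_registrations`, `registration_code`, and `registration_time` are assumptions made purely for illustration. The sketch assigns one opaque code per registration event and drops the outpatient number from the output.

```python
def recode_registrations(registrations):
    """Assign a unique, opaque code to every registration event.

    `registrations` is a list of (outpatient_number, registration_time)
    tuples (hypothetical layout). The outpatient number is deliberately
    left out of the returned records.
    """
    # Sort by registration time so codes follow chronological order.
    ordered = sorted(registrations, key=lambda reg: reg[1])
    return [
        {"registration_code": f"R{idx:06d}", "registration_time": when}
        for idx, (_, when) in enumerate(ordered, start=1)
    ]


# The same patient registering twice in one day yields two distinct codes.
example = recode_registrations([
    ("OP-001", "2019-08-01 14:05"),
    ("OP-001", "2019-08-01 08:10"),
    ("OP-002", "2019-08-01 09:30"),
])
```

Because the code is tied to the registration event rather than to the patient, repeated visits on the same day remain distinguishable without exposing the identifier.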
During this period, uniformly trained volunteers and nurses publicized XIAO YI to the guardians of children in the internal department, gastroenterology department, and respiratory department and showed them how to use it. With the help of volunteers, some guardians used XIAO YI to order and complete tests/examinations before they went to see a doctor, while others stuck to the traditional way of seeing a doctor. Thus, patients were classified into two groups according to their own choices: the conventional outpatient group and the AI-assisted group (AI group). Because the outpatient service process selected by patients was equivalent to the exposure and the length of the waiting time was equivalent to the outcome, we conducted a retrospective cohort study. The two groups of patients were first matched according to registration time, mainly because, apart from the grouping, the time of registration was likely the most influential factor affecting an outpatient’s waiting time. Generally, there are more patients on holidays than on weekdays, and more patients in the morning than in the afternoon. Moreover, weather, traffic jams, and other external factors (e.g., the COVID-19 outbreak) could influence the time spent by outpatients in the hospital. To reduce the interference of these factors with the results, we paired patients who visited the hospital at almost the same time. Propensity score matching (PSM) was employed to balance this covariate [18].
We found that using only the paired dataset was insufficient. In our conceptual scenario, patients first registered, then signed in, and then queued in the waiting area to see the doctors. In practice, however, patients did not always sign in immediately after registration if they perceived that the wait would be long because of the number of patients. They (i.e., children accompanied by their guardians) might wait until there were fewer patients before signing in and waiting to see the doctor. As a result, such patients spent much more time waiting than others. In addition, some patients took advantage of the system’s features to make an appointment, especially in the AI group, as it was more convenient to make an appointment through the AI system. For example, if a patient came to register at 8 a.m. but was not available until 2 p.m., the patient would ask the nurse to schedule the appointment for 2 p.m. Such cases would lead to a large overestimate of the time spent in the hospital.
To avoid these issues, we cleaned the data according to the following criteria. We excluded patients who did not have a lab test, because the main function of the AI was to order a lab test before the consultation with the doctor. Patients who spent more than five hours from registration to consultation were also excluded, as were those who spent more than eight hours from registration to obtaining their medicines. According to the experience of many doctors in the hospital, such long waiting times usually occurred because the patients either had an appointment or were late for their appointment. Patients who spent less than five minutes waiting were also excluded, as these records were likely errors.
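The exclusion criteria above can be expressed as a simple filter. The following is a minimal sketch, not the procedure actually run on the hospital data: the field names (`registered`, `tested`, `consulted`, `medicine`) are hypothetical, and applying the five-minute threshold to the interval from registration to the test is an assumption, since the text does not state which interval it refers to.

```python
from datetime import datetime, timedelta

MAX_TO_CONSULT = timedelta(hours=5)    # registration -> consultation
MAX_TO_MEDICINE = timedelta(hours=8)   # registration -> receiving medicines
MIN_WAIT = timedelta(minutes=5)        # shorter waits are treated as errors


def keep_visit(visit):
    """Return True if a visit record passes all exclusion criteria."""
    registered = visit["registered"]
    tested = visit.get("tested")
    if tested is None:
        return False  # no lab test or examination was taken
    consulted = visit.get("consulted")
    if consulted is not None and consulted - registered > MAX_TO_CONSULT:
        return False
    medicine = visit.get("medicine")
    if medicine is not None and medicine - registered > MAX_TO_MEDICINE:
        return False
    # Assumption: the five-minute rule applies to registration -> test.
    if tested - registered < MIN_WAIT:
        return False
    return True


def clean(visits):
    """Keep only the visits that satisfy every criterion."""
    return [visit for visit in visits if keep_visit(visit)]
```

Records without a consultation or medicine timestamp are not excluded by the corresponding rule in this sketch; whether such records existed in the dataset is not stated.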
Outcomes
The primary outcome was the waiting time, defined as the time from registration to taking the laboratory test or examination. The secondary outcome was the patient’s expenses in the hospital. Thus, we evaluated the performance of the AI system along two dimensions. In addition, patients in the AI-assisted group and the conventional group were subdivided into six subgroups according to the three clinical departments (internal department, gastroenterology department, and respiratory department) to further analyze the waiting times. Patients were also subdivided according to the tests (routine blood test, routine urine test, and influenza A and B virus detection test) or examinations (abdominal ultrasound and chest radiograph) they underwent.
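As an illustration of the primary outcome and the subgroup breakdown, the sketch below computes the waiting time in minutes and its median per (group, department) subgroup. The record layout and the function names are hypothetical, and the example timestamps are made up for demonstration only; they are not study data.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median


def waiting_minutes(registered, tested):
    """Waiting time: registration to test/examination, in minutes."""
    return (tested - registered).total_seconds() / 60


def median_wait_by_subgroup(visits):
    """Median waiting time for each (group, department) subgroup."""
    waits = defaultdict(list)
    for visit in visits:
        key = (visit["group"], visit["department"])
        waits[key].append(waiting_minutes(visit["registered"], visit["tested"]))
    return {key: median(values) for key, values in waits.items()}


# Illustrative records only (not study data).
t0 = datetime(2019, 8, 1, 8, 0)
example_visits = [
    {"group": "AI", "department": "respiratory",
     "registered": t0, "tested": t0 + timedelta(minutes=10)},
    {"group": "AI", "department": "respiratory",
     "registered": t0, "tested": t0 + timedelta(minutes=30)},
    {"group": "conventional", "department": "respiratory",
     "registered": t0, "tested": t0 + timedelta(minutes=60)},
]
example_medians = median_wait_by_subgroup(example_visits)
```

The same grouping key could be swapped for the test or examination type to obtain the second subdivision.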
Statistical analysis
Stata 15 was used for statistical analysis and PSM. Continuous variables were expressed as means ± standard deviation (SD) or medians with interquartile range (IQR). Categorical variables were summarized as counts and percentages. Missing data were not imputed; records with missing data were deleted. All tests were two-sided, and P values of < 0.05 were considered significant. The skewness/kurtosis test for normality was used to test the assumption of normal distribution. Normally distributed continuous variables were expressed as mean ± SD and compared using a paired Student’s t-test. Otherwise, as was the case with almost all continuous variables, we used the nonparametric Wilcoxon signed-rank test.
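For readers unfamiliar with the Wilcoxon signed-rank test, the following sketch shows the textbook computation for matched pairs. It is not a reproduction of the Stata routine used in the analysis: it drops zero differences, uses average ranks for ties, and applies a plain normal approximation without tie or continuity corrections, so its p values may differ slightly from Stata output. The function name is hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W+, W-, z, two-sided p) for paired samples x and y."""
    # Paired differences; zero differences are discarded.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation under the null hypothesis.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, z, p_two_sided
```

With matched pairs, `x` and `y` would hold the waiting times of the AI-assisted and conventional patient in each pair, in the same order.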
Propensity scores were estimated using logistic regression. The covariate was the time of registration, selected because it might affect the time that the patient spent in the hospital. The time from registration to taking the test or examination was entered into the model as the dependent variable, and the group was defined as the independent variable. A 1:1 nearest-neighbor, case-control match without replacement was used [19]. Stata was used to test the balance between the two groups after PSM, and p > 0.05 indicated that the difference in registration time was not statistically significant. The chi-square test was used to compare the sex ratio and the ratio of visits in each department between the two groups.
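The matching step can be sketched as a greedy 1:1 nearest-neighbor match without replacement on precomputed propensity scores. This is an illustrative simplification rather than the Stata procedure used in the study: the function name and the score dictionaries are hypothetical, no caliper is applied, and patients are processed in input order, which is an assumption.

```python
def match_nearest(treated, controls):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `controls` map a patient identifier to a propensity
    score. Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet matched
    pairs = []
    for treated_id, score in treated.items():
        if not available:
            break  # no controls left to match
        # Pick the unmatched control with the closest propensity score.
        control_id = min(available, key=lambda c: abs(available[c] - score))
        pairs.append((treated_id, control_id))
        del available[control_id]  # without replacement
    return pairs


# Illustrative scores only (not study data).
ai_group = {"a": 0.30, "b": 0.32}
conventional_group = {"x": 0.31, "y": 0.50, "z": 0.33}
example_pairs = match_nearest(ai_group, conventional_group)
```

Because each control is removed once matched, a control patient can appear in at most one pair, which is what distinguishes matching without replacement from matching with replacement.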