When we compared the correct-answer rates of ChatGPT 3.5, ChatGPT 4.0 and the orthopedic specialists who took the TOTEK proficiency exam, the orthopedic specialists were more successful than both ChatGPT versions. The artificial intelligence application not only answered the exam questions but also provided a supporting literature review and presented its answers, together with explanations, in a logical way. However, these answers are not always accurate or up-to-date. GPT-4, which is a more comprehensive version than GPT-3.5, was more successful than GPT-3.5, as expected.
In our study, the percentages of correct answers to the 265 questions were 56%, 45% and 37% for the orthopedic specialists, GPT-4 and GPT-3.5, respectively. In the literature, it has been reported that ChatGPT achieved a near-passing score on the United States Medical Licensing Examination (USMLE). [3] It has also been reported that it achieved near-perfect success in the American university admissions exam (SAT) and was successful in the graduation exams of various university departments. [4, 5] In the study conducted by Lum et al.[6], in which performance on Orthopedics and Traumatology training exams was evaluated, ChatGPT answered 47% of the 193 questions correctly. It was more successful on questions that tested memory and direct knowledge, whereas its success decreased on more complex questions requiring comparison, interpretation, and use of information.
According to the literature, ChatGPT has a lower success rate in exams in which the question language is not English. For example, Kaneda et al.[7] reported that the percentage of questions ChatGPT answered correctly decreased by 10% in exams conducted in languages other than English.
The development of more modern and effective diagnostic and treatment methods in medicine, as well as the search for fast and effective solutions to worldwide diseases such as the COVID-19 pandemic, has pushed physicians to work with artificial intelligence. [2] Deep learning methods can perform advanced medical evaluations, such as interpreting radiological images and making diagnoses from photographs. [8, 9] ChatGPT is a chatbot application that has recently become popular because of its deep learning features. In our study, we aimed to evaluate the knowledge and analytical ability of artificial intelligence in the field of orthopedics using a proficiency exam prepared by TOTEK.
ChatGPT's uncontrolled access to information on the web may leave the application deficient in matters such as distinguishing real, correct and up-to-date information; acting in accordance with ethical and moral rules; and guiding people correctly. [11] Similar studies in medical fields other than orthopedics and traumatology have shown that ChatGPT can contribute to the training and exam success of physicians, although the validity of its answers has been reported to be controversial. [11, 12] From an orthopedic perspective, ChatGPT is not a source that can be completely trusted for accurate information. However, given a patient's history and clinical information, ChatGPT can provide patients and physicians with information about diagnosis and the treatment process. [13] The literature also includes articles with suggestions to help researchers and physicians use ChatGPT more effectively. [14]
Unlike our study and the studies mentioned above, Klang et al.[15] used ChatGPT to generate questions for medical qualification exams and then evaluated those questions. They concluded that ChatGPT can be used to prepare medical qualification exam questions, provided that the questions are checked by specialist physicians. Taken together, these studies on artificial intelligence suggest that, in the near future, exams may be prepared entirely by artificial intelligence and the competencies of physicians may be evaluated by artificial intelligence.