Medical School Performance as a Predictor of Scores in the National Medical Specialization Exam in Turkey

Abstract

Background
Medical educators and assessors like to include predictive validity in their validity arguments, but relevant evidence may be difficult to find. External standardized examinations may have a role in validating both the educational process of medical schools and their assessment results and outcomes. A strong correlation between medical school and external exam performances may also lend evidence of validity to the external examination. This work from one of Turkey's top medical schools explored the correlations between students' medical school performances and scores from the Specialization in Medicine Exam (TUS). The TUS is a post-graduate national ranking examination.

Methods
A total of 246 students from two different programs of a medical school, which have identical curricula but different admission scores, were studied retrospectively. Students' year-based Grade Point Averages (GPAs) and end-of-school (graduating) GPAs were calculated using a weighted mean method. Bivariate correlations were calculated between year-specific GPAs, graduating GPAs and TUS scores.

Results
Their graduating GPA also had a strong, significant correlation with TUS scores (r = 0.65, p < 0.001). Linear regression models showed a significant relation between medical school performance and postgraduation national exam performance.

Conclusion
Student success has a high degree of consistency throughout the medical school and students' performance across all domains of assessment in the undergraduate program might be a good predictor of cognitive skills in an external national examination in the early postgraduate phase.

Background
Postgraduate external examinations are used in different regions, some for licensing to practice and some for selection purposes. The Specialization in Medicine Exam (TUS) in Turkey is a ranking examination that matches medical school graduates to available residency programs. TUS is organized twice a year (in April and September) by a central examination council (abbreviated as OSYM in Turkish). There is no limit to the number of times a medical doctor can sit TUS, but students prefer to prepare earnestly for their first attempt just after graduation. As graduation is generally in July, the first attempt to sit TUS is in September. A minority of students, who graduate after September, prepare for the April TUS of the following year. Students generally sit the exam in April because their graduation from medical school has been delayed for some reason. Some students strategically delay their graduation in order to have more time to study for the exam. In order to be eligible to sit TUS, students must pass the foreign language test. English is the most popular foreign language, but candidates may also choose French or German.
TUS comprises two tests (basic medical sciences and clinical sciences), each composed of multiple-choice questions (MCQs). The basic medical sciences and clinical sciences tests are sat in two different sessions on the same day. TUS, as the external assessment of undergraduate studies, might provide evidence of the predictive validity of the schools' assessment results and outcomes. Specialty exit exams may also be used as a validation source, but as they are too far removed in time from the medical school exams, many other factors may influence their results. As an exam composed only of MCQs, with no assessment of observed clinical skills and competencies, the TUS is criticized by authorities on the grounds that it may not address the same learning outcomes as the school exams. In contrast, medical schools use a wide range of assessment methods that test clinical, personal and cognitive skills in an effort to assess the performance of students more accurately [1]. Thus, if there is a strong association between performance at medical school and TUS, this may also validate the latter as a tool to select the most suitable candidates for residency.
The aim of this study is to seek evidence of the predictive validity [2] of one school's assessment results using both results throughout the medical program and the specified external exam.

Setting
The study was conducted in Istanbul University-Cerrahpasa, Cerrahpasa Medical School. Undergraduate medical education in Turkey lasts six years. Students learn basic medical sciences in the first three years, attend clerkships and study clinical sciences in years 4 and 5, and have an internship without examinations in the final year. There are two medical education programs in Cerrahpasa Medical School, one in Turkish (program 1) and the other in English (program 2). Programs 1 and 2 are identical, with the same curriculum and delivered lectures. The only differences between the two programs are the language used and the fact that program 2 (English) requires a higher admission score. Program 2 of Cerrahpasa Medical School ranks top in Turkey for students' admission scores.

Students and TUS Dates
A total of 330 students (270 in program 1, 60 in program 2) who matriculated in 2007 were included in the study.
The total completion rate for program 1 was approximately 89% (240/270) across the six-year program. Twenty additional students were accepted into the program from different Turkish medical programs in different years through the process of lateral transfer. Nine students were unable to pass the foreign language examination and were ineligible to sit TUS. A total of 201 students from program 1 took the first TUS following their graduation. The vast majority of students from program 1 graduated from medical school before September, so their first attempt to sit TUS was in September 2013. A minority of students graduated from medical school after September, and their first attempt to sit TUS was in April 2014. This study included all students who took the first TUS following their graduation, independent of their graduation date. Fifty students from program 1 either delayed sitting TUS or decided not to take it.
All final graduating GPAs of the 201 program 1 students who sat TUS were available, but one or more interim GPAs were missing for 21 students.
The total completion rate for program 2 was around 92% when all six years are taken into account. The lateral transfer process added nine more students to program 2. No students failed the foreign language test. This was expected, as all students pass an English proficiency test before admission to the program and all lectures are delivered in English. TUS scores were available for the 45 students who took the first TUS after graduation. Nineteen students from program 2 either delayed sitting TUS or decided not to take it. All final graduating GPAs of the 45 program 2 students who sat TUS were available, but one or more interim GPAs were missing for 5 students.
As in program 1, the majority of students from program 2 graduated from medical school before September, and their first attempt to sit TUS was in September 2013. The small number of students whose first attempt was the April 2014 TUS were also taken into account.
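The cohort counts reported above can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the published figures. It assumes that lateral-transfer students are counted among the graduates and that the pool eligible for TUS equals completers plus transfers minus language-test failures; the text does not spell this out. The function and variable names are illustrative.

```python
# Consistency check on the reported cohort counts (assumed flow:
# eligible = completers + lateral transfers - language-test failures).

def eligible_pool(completed, transfers_in, failed_language):
    """Number of graduates who could sit TUS under the assumed flow."""
    return completed + transfers_in - failed_language

# Program 1: 240 of 270 completed, 20 transfers, 9 failed the language test.
program1_pool = eligible_pool(completed=240, transfers_in=20, failed_language=9)
program1_sat, program1_not_sat = 201, 50

# Program 2: 45 sat TUS, 19 did not, 9 transfers, no language-test failures.
program2_sat, program2_not_sat = 45, 19
program2_pool = program2_sat + program2_not_sat
program2_completed = program2_pool - 9  # implied completers among the 60 matriculants

print(program1_pool, program1_sat + program1_not_sat)
print(program2_completed, round(program2_completed / 60, 3))
print(program1_sat + program2_sat)  # total analysed
```

Under these assumptions the program 1 pool matches the sum of students who did and did not sit TUS, and the implied program 2 completion rate is close to the reported figure of around 92%.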

Collection of data and variables
Performance of a graduating student at the medical school under study is assessed using a variety of assessment methods, and the contributing scores are accumulated into a Grade Point Average (GPA). Interim GPAs were calculated on a 4.0 scale as credit-weighted means, $\mathrm{GPA} = \sum_i (\mathrm{grade}_i \times \mathrm{credits}_i) / \sum_i \mathrm{credits}_i$, and graduating GPAs were calculated from the interim GPAs. Multiple-choice questions (MCQs) constitute around 70% of the 1st year GPA. This ratio decreases gradually to almost 55% for the 5th year.
The contribution of the interim GPAs to the graduating GPA is almost equal for each year.
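The weighted-mean calculation can be illustrated with a short sketch. The course grades and credit values below are hypothetical, and combining interim GPAs by yearly credit totals is an assumption made for illustration; the text only states that graduating GPAs were calculated from interim GPAs with almost equal contributions from each year.

```python
def weighted_gpa(items):
    """Credit-weighted mean of (grade, credits) pairs on a 4.0 scale."""
    total_credits = sum(credits for _, credits in items)
    if total_credits == 0:
        raise ValueError("total credits must be positive")
    return sum(grade * credits for grade, credits in items) / total_credits

# Hypothetical courses for one year: (grade, credits).
year_courses = [(3.5, 10), (2.5, 6), (4.0, 4)]
interim_gpa = weighted_gpa(year_courses)

# Hypothetical interim GPAs for years 1 to 5, each year carrying 60 credits
# (assumed equal here, so every year contributes equally).
interim_gpas = [3.0, 2.9, 3.1, 2.8, 3.3]
graduating_gpa = weighted_gpa([(gpa, 60) for gpa in interim_gpas])

print(round(interim_gpa, 2), round(graduating_gpa, 2))
```

With equal yearly credits the graduating GPA reduces to the plain average of the interim GPAs, which is consistent with the near-equal contributions described above.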
School grades were obtained from the University student affairs office, and scores at the first attempt of TUS (at the end of medical studies in 2013) were provided by the central examination council (OSYM).

Ethics and Statistical Analysis
The study was approved by the institutional review board of Cerrahpasa Medical Faculty (Approval number: 2014/A-37). The researchers were blinded to the identity of students and their scores by having the examination council and student affairs teams apply codes to the data before handing them over. This secured the confidentiality of the data.
The association between school performance and TUS scores is the core of the study. Students from the different programs of the school were also compared to check whether the correlation is transferable and is not affected by the fact that program 2 accepts students with higher admission scores.
Statistical analysis was carried out with the R open-source package [3]. Descriptive statistics were calculated.
Correlations between year-specific GPAs were calculated in order to check the degree of multicollinearity that would arise if GPAs from different years were used together to predict the TUS score. This also helped to evaluate the consistency of students' performance across all years of medical studies. A linear regression model was applied to formulate a relation between graduating GPA and TUS scores. Students from both programs were included in the first model, which looked at the general predictive power of GPA for TUS. As a second step, linear regression models specific to each program were compared.
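The two analysis steps described above (bivariate correlation and simple linear regression) can be sketched as follows. The study itself used R; this Python version is only an illustration, and the GPA and TUS values are synthetic, not study data. For a regression with a single predictor, the proportion of variance explained (R²) equals the square of the Pearson correlation, which the sketch confirms numerically.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Synthetic illustration only: graduating GPAs and TUS scores.
gpa = [2.2, 2.6, 2.9, 3.1, 3.4, 3.8]
tus = [44.0, 50.0, 55.0, 57.0, 63.0, 70.0]

r = pearson_r(gpa, tus)
slope, intercept = fit_line(gpa, tus)
r2 = r_squared(gpa, tus, slope, intercept)
print(round(r, 3), round(slope, 2), round(intercept, 2), round(r2, 3))
```

The same `pearson_r` helper could be applied to each pair of year-specific GPAs to build the inter-year correlation matrix used as the multicollinearity check.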

Results
Medical school GPAs show a normal distribution for the entire medical school period (Graph 1). The mean GPA for the 1st year is 3.01 ± 0.43, the same as the mean graduating GPA (SD = 0.39). The mean GPA decreases to 2.95 ± 0.52 in the 2nd year, which, according to anecdotal reports from students, is the most difficult year of the program. Students' GPAs show the largest variation in this year, with a range as wide as 2.28. The mean GPA for the 3rd year is 3.04, which is slightly above the mean of the final GPAs. The mean GPA for the 4th year is the lowest of all (2.83 ± 0.45). The mean GPA for the 5th year is the highest of all, reaching 3.31. The lowest 5th-year GPA is 2.45 (SD for the year = 0.35) and the range narrows to 1.55 (Table 1). The mean TUS score for all 246 students is 56.8 ± 10.
GPAs of each year (from 1st to 5th year) showed a medium to large correlation with TUS scores (Table 2). Graph 2, with two different dashed lines, indicates that the pass/fail score of TUS (45) corresponds to a GPA of 2.75. A GPA of 2.75 may therefore serve as a cut-off to identify students at risk of failing TUS. No student with a 4th year GPA higher than 2.75 failed TUS. This potentially useful indicator needs to be tested further in different cohorts of students.
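One way to test the 2.75 cut-off on a further cohort is a simple cross-tabulation of 4th year GPA against TUS outcome. The sketch below uses hypothetical records and assumes that a TUS score of 45 or above counts as a pass; the function name and data are illustrative, not part of the study.

```python
def cross_tabulate(records, gpa_cutoff=2.75, tus_pass=45.0):
    """Count students by side of the GPA cut-off and TUS outcome.

    records: iterable of (year4_gpa, tus_score) pairs.
    Assumes a TUS score >= tus_pass is a pass.
    """
    table = {
        ("above", "pass"): 0,
        ("above", "fail"): 0,
        ("at_or_below", "pass"): 0,
        ("at_or_below", "fail"): 0,
    }
    for gpa, tus in records:
        side = "above" if gpa > gpa_cutoff else "at_or_below"
        outcome = "pass" if tus >= tus_pass else "fail"
        table[(side, outcome)] += 1
    return table

# Hypothetical cohort: (4th year GPA, TUS score).
cohort = [(3.4, 62.0), (2.9, 51.5), (2.6, 43.0),
          (2.7, 48.0), (3.1, 58.0), (2.4, 40.5)]

table = cross_tabulate(cohort)
# The finding holds in a cohort if nobody above the cut-off failed.
finding_holds = table[("above", "fail")] == 0
print(table, finding_holds)
```

A non-zero count in the ("above", "fail") cell for a new cohort would indicate that the cut-off does not generalize as cleanly as it did in this cohort.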
All 246 students were included in the linear model that regresses TUS scores on the final GPA. The final GPA accounted for 44% of the variance in TUS scores. The model was significant (p < 0.001). The graphical illustration of the linear model (least squares line) is given in Graph 3, which indicates the pass score for TUS (45) with a horizontal dashed line.
We analysed the correlations between medical school performance and TUS scores separately for each of the program cohorts. For the 201 students admitted to program 1 (lower university admission scores), the mean final GPA was 3.02 ± 0.40, while the mean TUS score was 56.61 ± 10.28. The model was significant (p < 0.001) in defining the association between GPA and TUS scores. The graduating GPA accounted for 42% of the variance in the TUS scores. The mean graduating GPA of students in program 2 (n = 45) was 3.09 ± 0.41, and they had a mean TUS score of 58.57 ± 9.60. According to the linear model, 51% of the variance in TUS scores could be explained by students' graduating GPAs. The linear model was significant (p < 0.001). Medical school performance, national exam scores and the correlation between them did not differ statistically between the two programs.
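The text does not state which test was used to compare the two programs. As one possible check, and not necessarily the authors' method, two independent correlations can be compared with Fisher's z-transformation. The sketch below assumes the correlations are positive, so that each r can be recovered as the square root of the reported variance explained (42% with n = 201, 51% with n = 45).

```python
import math

def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test for a difference between two independent correlations.

    Uses Fisher's z-transformation (atanh). Returns (z statistic, p value).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Assumption: positive correlations, so r = sqrt(R^2) from the reported models.
r_program1 = math.sqrt(0.42)  # program 1, n = 201
r_program2 = math.sqrt(0.51)  # program 2, n = 45

z, p = fisher_z_test(r_program1, 201, r_program2, 45)
print(round(z, 2), round(p, 2))
```

Under these assumptions the difference between the two correlations is not significant at the 0.05 level, which agrees with the statement that the programs did not differ.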

Discussion
In our study, TUS, a national, external ranking examination comprising MCQs only, is strongly correlated with students' grades at medical school, with medium to large effect sizes. Although TUS does not specifically assess the demonstration of clinical skills, its results are strongly correlated with the school results, which incorporate tests of practical competence as well as cognitive skills. This may be due to the underlying factors (intelligence, conscientiousness, etc.) that determine performance. Although this study demonstrates that a student's performance across all domains of assessment in the undergraduate program is a good predictor of cognitive skills in an external national examination in the early postgraduate phase, we cannot comment on the correlation with early postgraduate practical competence.
Had the TUS included tests of practical clinical competence, it might not have been surprising to find that, after a year-long internship, students' skills had accelerated. Whether that rate of development could be predicted by their total undergraduate performance remains unknown in our context and should be explored in future projects.
Previous similar studies, mainly from North America, use the United States Medical Licensing Examination (USMLE) as the criterion variable [4][5][6]. There are also examples from the Netherlands [7] and Australia [8]. All report findings similar to those from North America. This study helps to generalize internationally the findings about the correlation between undergraduate performance and an external exam.
We also found that students' performance across a medical program is fairly consistent, in keeping with the findings of Hope and Cameron [9] and McManus et al. [10]. McManus et al. identified that the continuity of academic success for medical students stretched from secondary school into the early years of their postgraduate careers and referred to this as 'the academic backbone'. Although GPAs from all years have moderate to large correlations with TUS, there is a trend for the strength of the correlation to increase from the 1st to the 4th year. Year 1 has the lowest correlation of all. This may be partly because some subjects in the 1st year curriculum, such as history, foreign language and Turkish language, are not directly related to medical sciences and are not reflected in the TUS. The low correlation may also reflect the unpredictable effect that the adaptation period has on some students.
The 4th year medical school GPA is the best predictor of TUS scores (r = 0.67). This may reflect the alignment between the medical school curriculum and the focus of the TUS; in this first clinical phase of education, year 4 students have the main clinical clerkships such as Internal Medicine, General Surgery, Gynecology/Obstetrics and Pediatrics. The high correlation with year 4 may also reflect the development of students' professional identity and thus their motivation to study. Fourth year may also be the year when students' study strategies become more focused as they plan their careers for specialization. In this study the 4th year GPAs correctly identified students at risk of failing TUS. If this finding is confirmed in further cohorts, it will provide an evidence-informed metric that the medical school can use to detect and support those students who may need remediation before they proceed further.
The 5th year GPAs are the highest and the grade range narrows to 1.55. This might be due to the checkpoint function of the 5th year. As students require a cumulative GPA over 2.00 to progress to the final year, they may be preparing hard for the exams in order to lift their grades.
What message should the findings of this study give to policy makers in Turkey and internationally? In the current system, TUS is the only assessment that determines which students enter specialization (residency) training. If TUS is strongly correlated with medical school performance, does this indicate that TUS is a valid and reliable method to select candidates for each residency? There were some students who outperformed their peers in TUS although they had a lower school performance. Such a final, end-of-school exam may be identifying some students who have accelerated their performance during their internship due to improved motivation, a preference for learning in the workplace, unknown factors or a combination of these. On the other hand, TUS may be criticized for its focus on MCQs only. In contrast, medical school performance could be considered a better measure of global achieved learning, as it uses a range of assessment tools that evaluate problem solving, communication and practical skills [11]. In addition, the medical school's use of multiple methods allows for compensation of each tool's weaknesses [1]. At this stage, while the TUS continues as an MCQ-only examination, it is difficult to determine whether the unexplained variance between school and TUS performance is due to the difference in learning outcomes addressed by the different assessment tools or to differences in achieved learning over the internship.
It is reassuring that there is a good correlation between year GPAs suggesting that they may each be useful additional determinants for selection to residency slots.
The results suggest that, for students in the medical school under study, adding undergraduate performance to the TUS results may increase the validity of decisions about the allocation of residency posts. However, it cannot be assumed to be so across medical schools. In order to be more confident about graduates' competence and the selection processes, studies looking at further cohorts and across a range of medical schools are required.
A major strength of this study is that all students with available data could be included in the analysis. As the student data were available longitudinally, the consistency of student success could also be evaluated. Given the strong inter-correlations of year-specific GPAs, we can claim that student success has a high degree of consistency. This study also has some limitations. It is based in one institution and analyzes student outcomes from one year of graduation. Further work should cross-validate the results by involving different institutions and graduates from different years. Another factor is that students make extensive preparations for TUS, which may limit what we can say about the direct effects of the school curriculum on TUS. We could not analyze the effect of gender on student performance, as assigning gender codes to unidentified students could jeopardize the confidentiality of the data due to the small numbers in the study. Although we have demonstrated that medical school performance is a good predictor of performance in the early graduate phase, our study was not designed to explore the causal mechanisms. Do hard-working successful students become hard-working successful graduates? Are the exam results in medical school or in the TUS a result of commercial preparation courses? Does learning in the internship year differ from that in the first 5 years? Why do students fail the TUS? These interesting questions, raised by this analysis, will require a more qualitative approach to address them.

Conclusion
This study demonstrates that medical school performances correlate well across the years of the program and with the Specialization in Medicine Exam (TUS), a national ranking examination.
Student success in medical school shows consistency, with a medium to large correlation between yearly GPAs, and 42-52% of the variance of TUS scores was attributed to the graduating GPA.
Year 4 GPAs showed the highest correlation with TUS scores thus suggesting a timely metric that the medical school can use to detect and support those students at risk of failing the TUS.
There is no compulsory medical licensing examination in Turkey for new graduates and thus no direct validating assessment. This study explored the national MCQ-based examination for selection to specialty and found a strong correlation between medical school performance and TUS for one cohort of students at one medical school. This provides evidence to validate the educational program of the medical school externally but requires further study to generalize the findings to other cohorts and other medical schools in the country.

Ethics Approval and consent to participate: The study was approved by the institutional review board of Cerrahpasa Medical Faculty (Approval number: 2014/A-37). Informed consent was not needed as the identities of all subjects were blinded to all researchers. This was also approved by the same review board.

Consent for publication: Not applicable
Authors' Contributions: AhM conceptualized the idea, did the literature search, collected the data, performed the analysis and drafted the manuscript. DH designed and performed the analysis. RO was involved in planning and supervision of the work. HC critically analysed the literature, aided in interpreting the results and worked on the manuscript. All authors discussed and commented on the final version of the manuscript.
Availability of Data and Materials: The data that support the findings of this study are available from the central examination council of Turkey (OSYM), but restrictions apply to the availability of these data, which were used under license for the current study, and so they are not publicly available. Data are, however, available from the authors upon reasonable request and with the permission of the central examination council of Turkey (OSYM).