The literature includes examples of ‘minimal essential requirement’ MCQ examinations designed to assess competence (8). However, there does not appear to be a study comparable to our own in which the MCQ items were generated solely by ‘non-academic’ clinicians, and in this regard this investigation introduces a novel concept.
With such a focus on the content of examination questions, there is a risk that one could lose sight of the bigger picture, which is to ensure that our medical institutions produce good quality physicians. A paper by Christensen et al in 2007 highlights the importance of an ‘outcome-based’ approach in medical education, compared to a process/content orientation. However, the authors express reservations about the taxonomy of learning in purely outcome-based medical education, in which student assessment can become a major determinant of the learning process, leaving control of the medical curriculum to medical examiners (9). In developing the MAC examination, we have produced an assessment that is outcome-based but written by a wide range of clinicians rather than by a minority of faculty members.
Another study with a similar methodology to our own showed that when students (as opposed to clinicians) wrote the MCQs and their peers sat them, the results correlated well with results in their official paediatrics examination, although the student-written questions were overall of a tougher standard (10). A recent study of surgical students showed that scores from a peer-written examination correlated well with other independent measures of knowledge, such as the United States Medical Licensing Examination (USMLE) and National Board of Medical Examiners (NBME) examinations, and also with the university's surgery clerkship examination (11). This is comparable with results from the MAC examination, which have been shown to correlate well with the same students' marks on the official RCSI final paediatric examinations.
Potts et al described a concept similar to our own MAC examination (12). They designed a summative assessment based on six core paediatric objectives. Passing all items was a requirement, and failure required remedial oral examination of any missed items. When ‘pre-warned’ of the curriculum change and of the emphasis on these aspects of the curriculum, students’ grades on this examination improved significantly compared with the previous year (control group). However, this same cohort of students performed worse on the NBME paediatric subject examination. In the Potts study, the students' poor performance in the NBME examination is likely to have arisen because their attention was drawn towards passing the new ‘in-house’ summative examination, with the consequence that they missed many key components of the curriculum as set out by the NBME. The study did not answer whether or not these students were poorly prepared to be paediatricians, but simply highlights the distinct difference between the two curricula. This raises two points worth noting. First, assessment drives what students learn and therefore needs to be reflected in the curriculum. Second, students are very capable of meeting an agreed learning objective when prepared for it. However, if the curriculum they are taught is not reflected in all of their summative testing, it can prove detrimental. This is comparable with our study, in that the students performed poorly in the MAC examination compared with the official RCSI examination, with whose curriculum they were familiar and for which they were specifically prepared. When they were set a different test, albeit on the same subject, their scores were significantly lower.
Standard setting of the MAC examination
The result of the standard-setting process was that the MAC examination was given a ‘passing score’ of 41.2% (in practice, 13 of 30 questions correct). This is relatively low for any high-stakes ‘finals’ examination, but particularly so when one considers that the initial intention was to design an examination with questions deemed ‘must know’, ‘basic knowledge’. One could argue that the method by which test questions were gathered (i.e. requesting ‘minimal required competency’) had ‘pre-standard set’ the MAC examination at close to 100% [it cannot be exactly 100% due to the standard error of measurement (13)]. Viewed in this way, 41.2% becomes even more remarkable.
Although cut scores in many institutions are between 50% and 70% (14), there is an argument that cut scores should be high. The higher the cut score, the smaller the chance of false positives (i.e. candidates able to pass the examination by guessing the answers). This is of particular importance when licensure is for a task in which failure would have serious consequences for the individual or for the society using the service (15), as is the case for final medical examinations.
Our vision of the MAC examination was one in which the questions would be relatively easy and appropriately standard set with a high passing score (for example, 80-90%). Candidates would therefore need a complete understanding of the more limited syllabus upon which the questions are based in order to pass the test.
The low standard-set passing score of 41.2% may be explained in one of two ways: either the providers of the questions (i.e. non-academic paediatricians) had a much higher expectation than the faculty, or the faculty greatly underestimated the knowledge level of the students. Either way, a passing score of 41.2% reflected a faculty opinion that this was a difficult test, in which a candidate need only answer 13 of a possible 30 questions correctly in order to pass.
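The relationship between the cut score and the chance of passing by guessing can be illustrated with a short calculation. The following is a minimal sketch rather than part of the study's analysis: it treats every answer as an independent blind guess and assumes five options per item with one correct answer (a format that is not stated in the text). The function names `min_correct` and `p_pass_by_guessing` are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb


def min_correct(cut_percent: str, n_items: int) -> int:
    """Smallest whole number of correct answers that meets a percentage cut score."""
    # Fraction parses the decimal string exactly, avoiding float rounding.
    return ceil(Fraction(cut_percent) * n_items / 100)


def p_pass_by_guessing(n_items: int, k_needed: int, p_guess: float) -> float:
    """P(at least k_needed correct) when every answer is an independent blind guess."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(k_needed, n_items + 1)
    )


N_ITEMS = 30   # length of the MAC examination (from the text)
P_GUESS = 0.2  # ASSUMPTION: five options per item, one correct

for cut in ("41.2", "50", "70", "80", "90"):
    k = min_correct(cut, N_ITEMS)
    prob = p_pass_by_guessing(N_ITEMS, k, P_GUESS)
    print(f"cut {cut}% -> {k}/{N_ITEMS} needed, P(pass by guessing) = {prob:.2e}")
```

Pure guessing is the extreme case. A candidate with partial knowledge has a higher effective chance per item, so the probabilities printed here are best read as lower bounds on the false-positive risk at each cut score.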
Analysis of results
We must consider why the students found the MAC questions so difficult and why so many did not achieve the passing score. Was the examination standard set at too high a level? This is unlikely, as the passing score is already below the cut scores used in many institutions. Alternatively, do the students simply not have the targeted knowledge required to pass the MAC examination? Put another way, their level of paediatric knowledge reflects the RCSI curriculum, as indicated by the high pass rates (96-97%) of the same students in the RCSI examination. Their relatively poor results in the MAC examination have therefore highlighted a significant gap between the RCSI curriculum and the knowledge required for the MAC examination (i.e. what the non-academic consultants expect them to know).
The poor results in the MAC examination do not indicate that these students will necessarily make poor paediatric doctors, but they do highlight a potential difference between the RCSI curriculum and the ‘hidden’ curriculum as determined by non-academic clinicians.
Neither the students nor the junior doctors appeared to know the “must know” questions. Does this reflect poor preparation or unrealistic expectations? Frontline non-academic paediatric clinicians were asked to provide clinical questions based around essential knowledge for practice. Despite the instruction to the question providers that MAC questions should reflect ‘must know’, ‘basic knowledge’ and a ‘minimum accepted competency’, this exercise yielded a relatively low passing score, which reflects the difficulty of the questions submitted. Each individual submitting questions would have described the set standard for their own questions as 100% (i.e. “must know”), but it is possible that this was an unrealistic expectation for undergraduate students. Contributors were not asked to assess the standard of other submitted questions. This would have been a useful exercise, but it was beyond the scope of this study to recruit and train non-academic clinicians in standard setting. When considered in the context of “must know” information, the average score of 45-46% for undergraduate students was considerably lower than would be expected by clinicians working at the ‘frontline’ of general paediatrics. This may reflect unrealistic expectations, or a curricular emphasis on alternative content.
Reassuringly, the paediatric SHOs about to embark on their paediatric career performed significantly better than the medical students. This is an important finding, as the MAC examination was designed as a test of the knowledge required for ‘on the ground’ clinical practice. In Ireland, paediatric training can commence at postgraduate year 2 (graduates complete a one-year ‘internship’ in general medicine/surgery, during which time they apply for subspecialty training to commence the following year). The majority, but not all, of the participating SHOs would therefore have had 2 more years of clinical experience (1 in their final year of undergraduate study and a 1-year internship). These participants appeared to have benefitted from the extra clinical experience, albeit not in paediatrics. However, their results still did not match the "must know" standard initially expected by the clinical paediatricians setting the questions.
Why was the MAC examination result standard set so low if the questions were meant to be ‘must know’, ‘basic’ knowledge? This reflects a difference of opinion on expected standards between faculty, who were considering undergraduate students, and non-academic clinicians, who were considering junior doctors in paediatrics, with the latter seemingly expecting a higher level of knowledge. However, perhaps rather than a ‘higher level’ of expected knowledge, non-academic clinicians expected a different type of knowledge. It is possible that an undergraduate focus on traditional ‘textbook’ facts did not align with the clinicians’ focus on practical aspects of the job, which are particularly relevant to everyday clinical practice. This potential difference in knowledge or focus warrants further investigation at undergraduate level and possibly intervention at early postgraduate level for those planning to practise in paediatrics. There is a move in some third-level institutions to revisit the structure of their undergraduate teaching to increase the focus on clinical practice and on the broader non-clinical skills required by physicians (16).
All of the universities on the island of Ireland have recently collaborated to develop a national undergraduate paediatric curriculum. This will go some way to standardising the knowledge acquired by graduates working in Ireland and is a great opportunity to revisit how undergraduate programmes are taught. This process should incorporate the views of a wide range of ‘non-academic’ paediatric clinicians to ensure that it can bridge the gap between what is taught and assessed at undergraduate level and what is practically important in the workplace. This study highlights the difficulty in attempting to deliver an undergraduate course that both establishes a core of basic paediatric knowledge and prepares a student for the postgraduate clinical environment. However, undergraduate medical education is not merely about transferring knowledge to future medical practitioners. It is also about developing the transferable general clinical and non-clinical skills required for good medical practice, including Human Factors, and engendering the skills for lifelong self-directed learning. It may be that bridging this ‘gap’ is not necessarily the responsibility of the university, which is preparing graduates to work as general physicians rather than subspecialists. Instead, the postgraduate training bodies could identify ways in which this type of knowledge is provided and assessed prior to entry to the training scheme. This could be delivered in a short induction course, and the transitional period of assistantship that many universities now have in place would seem a suitable time to do this. It is anticipated that the results of this study can inform the content of transition interventions to better prepare graduates for practice.
Did students perform differently from year to year? RCSI students have two paediatric examinations that contribute to their final marks. The first is a clinical examination taken immediately after the six-week paediatric rotation, when students are fresh from their paediatric clinical experience. The second is a written MCQ examination given to all students at the end of the academic year, when students have been focusing on knowledge acquisition. There was no significant difference in MAC examination results between the two years of RCSI students, despite the fact that one year had the assessment at the end of their paediatric rotation and the other at the end of the academic year. In addition, the fact that two large groups of students obtained such similar results suggests that this examination is reproducible from year to year.
The SHOs, with their increased clinical experience, performed significantly better than the students. This may reflect the clinical emphasis of the questions, or possibly that junior doctors specialising in paediatrics were likely to be more interested in the subject and so would be expected to do better, irrespective of when they were assessed.
Did students perform differently in their official RCSI end-of-year examinations compared with the MAC examination? Individual students’ performance in the MAC examination was compared with their performance in the official RCSI university paediatric examinations. A student’s rank within the class was calculated for each examination and compared with their rank in the other examination. This allowed us to determine whether an individual’s performance relative to their peers was consistent across the two types of examination (MAC or official RCSI examinations) or differed between them. A statistically significant positive correlation between an individual’s MAC score and their score from the official RCSI paediatric final assessments supports the convergent validity of this new type of assessment.
Did students from a different academic institution show a similar relationship between MAC and final results? In total, 54 QUB students sat the MAC examination. There was a statistically significant positive correlation (Spearman’s r=0.30 [p=0.029]) between QUB students’ ranking on the MAC examination and their ranked performance on the paediatric aspect of their official summative university written examination. This was similar to the correlation between the RCSI students’ MAC examination results and their paediatric examination results (r=0.44 [p<0.01]).
Overall, while the gross scores themselves may have been different for undergraduates taking both the MAC and official university examinations, both assessments ranked individuals in a similar way. This is reassuring, as examination results are often used as criteria for shortlisting and appointing junior doctors to training schemes and stand-alone posts.
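The rank-based comparison described here can be sketched in a few lines of code. This is an illustrative example only: the scores are invented for six hypothetical students, the function names `average_ranks` and `spearman_rho` are our own, and the software actually used for the study's analysis is not stated in the text. Spearman's coefficient is computed as the Pearson correlation of the two sets of within-class ranks, with tied scores sharing the mean of their ranks. No p-value is calculated.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# HYPOTHETICAL data: MAC marks (out of 30) and official examination marks (%).
mac_scores = [12, 15, 9, 20, 14, 11]
official_scores = [58, 61, 50, 72, 70, 55]

print(f"Spearman's rho = {spearman_rho(mac_scores, official_scores):.3f}")
```

Because only ranks enter the calculation, the coefficient is unaffected by the two examinations having different mean scores or marking scales, which is why it suits a comparison between a 30-item test and a longer official paper.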
Quality of university examinations
Concerns have been raised that the quality of university examinations may not always be sufficient for high-stakes decision-making (17) in clinical practice. Studies have shown that undergraduate medical examinations can be of relatively low quality (18) and that the quality of written examination questions can be significantly improved by providing question writers with formal training (19). It may be unrealistic to expect a large group of ‘non-academic’ clinicians to undertake extra training in examination writing. A potential solution to this problem would be to encourage our ‘non-academic’ colleagues to provide the question content, in whatever format they feel most comfortable with, and then to deploy a team of trained academics to revise these questions into a more suitable format and improve their psychometric properties. In fact, this is how the Royal College of Paediatrics and Child Health (RCPCH) generate their examination questions. They set up question-setting groups throughout the country, headed by a member of faculty but attended by non-academic consultant paediatricians and senior registrars. These questions are then reviewed by the theory examiner team at ‘board meetings’, which occur twice a year, at which point the questions are either excluded or revised for inclusion in a potential bank of ‘live’ questions for use in subsequent written examinations.
Study limitations
Fifteen consultant clinicians provided 71 questions for the MAC examination. It is possible that the questions would have shown even greater breadth and diversity had a larger number of paediatricians contributed. The results of this study may have been influenced by the fact that it relied on volunteers to provide questions. These consultants were therefore self-selected to a certain degree, and our sample may not accurately reflect the opinion of the ‘average’ paediatric clinician. However, their contribution is extremely valuable, as these individuals were sufficiently motivated to contribute to this work.
The official RCSI written examination has 150 test items and therefore the MAC examination, with only 30, is testing a smaller sample of knowledge. We appreciate that this has limited our results. However, the questions used covered a range of topics within paediatrics and represent a finite amount of ‘basic, must know knowledge.’
Both the undergraduate students and the SHOs who sat the examination did so voluntarily, and so the results may reflect a more motivated population than the cohort overall. In the SHO cohort, the 93% response rate makes it unlikely that this would have an important effect. In the undergraduate cohort, the proportion of possible candidates volunteering for the examination was lower, so the chance of selection bias is greater. However, there was a significant positive correlation between their MAC results and their official university results. As these rankings did not merely cluster at the top of the class, it is clear that it was not just the highest achieving students who volunteered to sit the examination.