Design and Validation of an integrated Objective Structured Clinical Examination (i-OSCE) for non-surgical aesthetics program

Introduction: The Objective Structured Clinical Examination (OSCE) is a popular and practical method for evaluating trainee physicians' competencies. To help assess non-surgical aesthetics students' critical thinking and relevant abilities, we developed a novel assessment tool based on the OSCE, the i-OSCE (Integrated Objective Structured Clinical Examination). Methods: Initially, an expert panel consisting of five aesthetic practitioners with over fifteen years of experience and a senior clinical academic was selected to develop a blueprint for the i-OSCE. Through this blueprint, essential qualities and skills were identified for the assessment. To ensure standardisation of the process, training workshops for examiners and simulated patients were organised. The final i-OSCE consisted of 12 stations (four clinical, four critical thinking, and four rest stations) lasting 180 minutes. Results: The intraclass correlation coefficient between the station checklist items was 0.946 (average measures; lower bound 0.916, upper bound 0.968; p < 0.00), considered to be significant. The Inter-Item Correlation Matrix among the clinical station checklist and critical thinking checklist items also showed statistical significance. The Pearson correlation coefficient (PCC) was used to ascertain the correlation between checklist rating and global rating, yielding a high correlation (0.80 to 0.934). Conclusion: The i-OSCE proved to be a useful and reliable assessment tool to evaluate clinical competence and critical thinking in non-surgical aesthetics education.


Introduction
Assessment and evaluation are critical steps in medical education and rely on selecting a proper and robust instrument. An appropriate assessment tool helps to determine the effectiveness of educational programmes and ensures that future clinicians are competent and suitable for independent clinical practice. However, the currently used assessment tools are insufficient to test the learners' knowledge, skills, behaviour, and critical thinking abilities holistically. In such cases, a 'test battery' approach, which uses a mix of assessment tools to measure an array of learning domains, becomes more practical (1,2).
Traditionally, clinical assessment strategies comprise a combination of 'short' and 'long case' evaluations. However, criticism of their low reliability (3) and modern-day constraints such as increased litigation and student appeals (4) have led institutions to focus on exams that produce trustworthy, more easily defensible outcomes. Accordingly, assessment strategy evolved to overcome the challenges of traditional methodologies, such as reliance on the patient's performance, the examiner's bias, the non-standardised grading scheme, and the candidate's actual performance. Consequently, the assessment process became standardised and the number of variables affecting students' performance was reduced, which paved the way for the introduction of the objective structured clinical examination (OSCE), the "gold standard" for clinical assessments globally. The OSCE aims to examine skills and abilities in such a way that the assessment results reflect the trainee physicians' day-to-day clinical performance in real-life scenarios.
Various research studies have found OSCEs to be reliable and valid (5). However, the long examination time is a cause of concern for trainee physicians, and the associated costs are a concern for program directors (6). The OSCE mainly focuses on assessing the affective, cognitive, and psychomotor learning domains.
Nonetheless, performance is affected by various other factors, such as the ability to apply knowledge in real-life scenarios; non-clinical skills (decision-making, teamwork, resource management, planning, and critical thinking); attitudes; environment; emotional state; physical state; and personality traits. The drawback of the OSCE is that it cannot easily be used to measure non-clinical skills (2). In clinical education, critical thinking skills are measured by high-fidelity patient simulations, the California Critical Thinking Skills Test, the California Critical Thinking Disposition Inventory, Del Bueno's Performance-Based Development System, the Health Science Reasoning Test, and the Watson-Glaser Critical Thinking Appraisal.
However, these tools are limited: they do not measure attributes specific to the health professions, cannot evaluate the practical reality of medical professionals, and do not effectively assess psychometric properties (7,8).
Critical appraisal is a subcategory of critical thinking that specifies the ability to make clinical decisions based on research evidence. Various studies have concluded that critical thinking can be refined and that, without this essential ability, there can be drastically negative ramifications for trainee physicians' decisions. There is evidence of a direct correlation between critical thinking and academic success; unfortunately, many trainee physicians struggle on tests that explicitly measure it (9). In evaluating critical thinking strategies, there is a prominent spotlight on evidence-based practice and its role in education. Numerous systematic reviews have inferred that clinically integrated assessment methods are needed to further improve evidence-based practice skills (10). Critical appraisal has been incorporated into some high-stakes professional and fellowship examinations, in which physicians are tested on their ability to judge a clinical paper in a short time based on its research design and results, and to decide whether it should change their clinical practice.
However, there is no evidence in the literature reporting the development and implementation of an evaluation tool for assessing clinical skills, analytical thinking, and non-clinical skills in a non-surgical aesthetics (NSA) educational program. Therefore, the current study aims to develop and validate an integrated objective structured clinical examination (i-OSCE) by integrating clinical and critical thinking stations for the NSA postgraduate program.

OSCE Station Blueprinting
Blueprinting is the standard process of mapping the intended learning outcomes, which comprise knowledge and understanding, intellectual, practical, affective, and psychomotor skills relating to the postgraduate curriculum on NSA, to the knowledge and skill competencies to be tested in individual stations. An 'expert panel' was formed consisting of five aesthetic practitioners with over fifteen years of experience and a senior clinical academic. A consensus-based, iterative approach was adopted to identify the tasks to be assessed, which are essential and relevant to NSA practice, thereby validating the content of the i-OSCE (Table 1).

Content development and Validation
A 2-day OSCE writing workshop was conducted for 15 aesthetic practitioners and clinical academicians, divided into three small groups, in the presence of 3 expert facilitators. After a brief and structured presentation on the OSCE, each group had a facilitator-led practice session to construct the case scenarios for each station. This was followed by critical feedback from the facilitator to the participants. The three working groups met regularly to construct the case scenarios, candidate instructions, standardised patient information sheets, and, most importantly, the marking sheets, in which the entire scenario was deconstructed into a performance checklist matching the blueprint theme.
Finally, the expert panel was reconvened to review the constructed cases together with their checklists and the parity of competencies across the cases. Fifty stations (25 clinical and 25 critical thinking) were selected for storage in a repository managed by an OSCE administrator.

Clinical Stations
These stations consisted of a simulated consultation scenario in facial aesthetics. Here, candidates had to take an appropriate history and perform clinical photography and a facial assessment to reach a specific, accurate diagnosis. Either standardised patients or patient actors were utilised within these stations. Any clinical examination skills relevant to facial aesthetics were condensed to fit the station's time limit. Candidates were given a brief history and asked to perform a clinical examination (either all or some aspects of it) and discuss it with the patient. At the end of the station, candidates were asked to summarise their findings or provide a brief management plan, including its justification, to the examiner. The other stations developed assessed candidates' professionalism and communication skills.

Critical Thinking (CT) Stations
Twenty-five CT stations were created, in which candidates were asked to critically appraise an article for its validity and reliability (formulating a PICO question, reviewing the methodology, and critically analysing the discussion), to establish whether the article was published in a peer-reviewed journal, and to decide whether to adopt its findings into clinical practice (applicability).

Marking scheme
Each checklist item score was weighted according to the significance of the allotted task as judged by the station author, and the weighting was subsequently agreed by the expert panel in a station review meeting. Finally, each station was standardised independently to set its pass mark using the borderline regression method, which combines the checklist score with the examiner's single three-point global rating (clear pass, borderline, or clear fail) (Table 2).

Assessor Training
All the examiners participated in an hour-long orientation program to familiarise themselves with the OSCE setting, competency testing, and scoring guidelines. Further, they were provided with a guide describing the definitions of the competencies, the checklists, and the global ratings.
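The borderline regression method named in the marking scheme can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: the numeric coding of the three-point global rating (0, 1, 2), the function name, and the example scores are all assumptions made for the sketch. The checklist scores of all candidates at a station are regressed on their global ratings, and the pass mark is the checklist score predicted at the "borderline" rating.

```python
from statistics import mean

# Assumed numeric coding of the three-point global rating (not given in the source).
GLOBAL_RATING = {"clear fail": 0, "borderline": 1, "clear pass": 2}


def borderline_regression_pass_mark(checklist_scores, global_ratings):
    """Return the station pass mark from a least-squares line of
    checklist score (y) on coded global rating (x), evaluated at 'borderline'."""
    xs = [GLOBAL_RATING[r] for r in global_ratings]
    ys = list(checklist_scores)
    if len(xs) != len(ys):
        raise ValueError("one global rating is needed per checklist score")
    if len(set(xs)) < 2:
        raise ValueError("at least two distinct global ratings are needed")

    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept + slope * GLOBAL_RATING["borderline"]


if __name__ == "__main__":
    # Hypothetical station data, for illustration only.
    scores = [8, 12, 14, 17, 19, 20]
    ratings = ["clear fail", "borderline", "borderline",
               "clear pass", "clear pass", "clear pass"]
    print(round(borderline_regression_pass_mark(scores, ratings), 2))
```

Because the pass mark is read off a line fitted to every candidate at the station, it does not depend only on the few candidates who happened to be rated borderline.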

Standardised Patient (SP) Training
Healthy volunteers were recruited with the help of a modelling agency to act as 'simulated' patients for all the stations. They were coached by professional medical actors and by clinical and communication skills experts. SPs were given a task particular to their station, and they practised until they played their roles consistently. As they were also responsible for completing their part of the checklist, a calibration video was shown so that they could practise marking and debriefing.

Pilot Study
A pilot study was conducted with ten aesthetic practitioners to examine the feasibility of the i-OSCE. The eight planned stations were run with one examiner and one observer each. Finally, the results, including the feedback from individual stations, were reviewed to amend the i-OSCE documents for final implementation during the summative examination.

Final i-OSCE
The final examination was conducted with forty trainee physicians and comprised four clinical and four critical thinking stations of 15 minutes each and four rest stations. The total running time of the exam was 180 minutes.

Statistical Analysis
For an assessment tool to be accepted as reliable and valid, the most widely used statistical measurement is Cronbach's alpha (11). However, some studies argue that it should not be used as the sole measurement of internal reliability, as it is directly proportionate to the examination length; it therefore indicates the station's stability, not the internal consistency (12,13). Concurrent use of the Pearson correlation coefficient (PCC) or Spearman's rank correlation helps overcome this issue (14). Therefore, the PCC was used to investigate the strength of the correlation between the checklist score and the global rating (clear pass, borderline, or clear fail), which helped provide a measure of the validity of the marking criteria used. To calculate the inter-rater reliability (IRR), Cronbach's alpha was obtained through two-way mixed-effects intraclass correlation coefficients (ICC) for consistency and internal reliability. For the interpretation of ICCs, Cicchetti's classification (IRR less than 0.40 is poor; 0.40-0.59 is fair; 0.60-0.74 is good; 0.75-1.00 is excellent) was used (15). Moreover, content validity was measured with the help of experts. Statistical analysis was carried out using IBM SPSS Statistics for Mac, Version 27.0 (IBM Corp., Armonk, NY, USA).
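As an illustration of the reliability statistics described above, the sketch below computes Cronbach's alpha from a candidates-by-items score matrix and labels the result with Cicchetti's bands. It is a minimal stand-in for the SPSS analysis, not a reproduction of it: the function names and the example scores are assumptions. For a two-way model with a consistency definition, the average-measures ICC takes the same value as Cronbach's alpha, which is why a single function serves both purposes here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with one row per candidate
    and one column per checklist item (or rater)."""
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least two candidates and two items")
    # Sample variance of each item across candidates.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of each candidate's total score.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


def cicchetti_label(icc):
    """Cicchetti's bands: <0.40 poor, 0.40-0.59 fair, 0.60-0.74 good, >=0.75 excellent."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Hypothetical scores: 5 candidates x 4 checklist items (illustration only).
    demo = [
        [4, 5, 4, 5],
        [2, 3, 2, 2],
        [3, 3, 4, 3],
        [5, 5, 5, 4],
        [1, 2, 1, 2],
    ]
    alpha = cronbach_alpha(demo)
    print(round(alpha, 3), cicchetti_label(alpha))
```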

Results
The intraclass correlation coefficient between the station checklist items was 0.946 (average measures; lower bound 0.916, upper bound 0.968; p<0.00), considered to be significant (Table 3). The Inter-Item Correlation Matrix among the clinical station checklist items and critical thinking checklist items also showed statistical significance (Table 4). The Pearson correlation coefficient (PCC) was used to ascertain the correlation between checklist rating and global rating (Table 5), yielding a high correlation (0.80 to 0.934).
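For readers who wish to reproduce the checklist-versus-global-rating comparison on their own data, the Pearson correlation coefficient can be computed as in the sketch below. The function name, the 0/1/2 coding of the global rating, and the example values are assumptions for illustration and are not the study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical station: checklist totals and coded global ratings
    # (0 = clear fail, 1 = borderline, 2 = clear pass; coding is assumed).
    checklist = [8, 12, 14, 17, 19, 20]
    global_rating = [0, 1, 1, 2, 2, 2]
    print(round(pearson_r(checklist, global_rating), 3))
```

A coefficient close to 1 indicates that candidates with higher checklist totals also tend to receive higher global ratings, which is the pattern the marking criteria are expected to show.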

Discussion
To the best of the author's knowledge, this is the first integrated OSCE validation study combining clinical and critical thinking skills for postgraduate NSA education. The OSCE is a flexible assessment method used to evaluate competence by direct observation based on objective assessment criteria. It is composed of several "stations" where examinees are required to perform a range of clinical tasks against the required clinical competencies, displaying their skills and attitudes over a given duration. The OSCE has been used to assess the skills most important to healthcare professionals' success, such as data acquisition, interpretation, troubleshooting, engagement, and management of erratic patient behaviour, which are otherwise difficult to assess during the classic clinical review (16). Miller's framework for clinical competency development recommends four stages: "knows the facts"; "knows how to elaborate and integrate the understanding"; "shows how" they apply knowledge, skills and attitude for the patient outcome; and finally "does" employ all the skills in their independent practice to serve the community. The framework has proven to work reasonably well in medical education settings (17,18). Evidence suggests that the OSCE helps assess the third stage, "shows how", by concentrating on clinical skills in a safe learning environment.
Critical thinking is considered to be a crucial cognitive method for the creation and utilisation of knowledge. It plays a functional role during problem-solving and decision-making in a social, clinical, or ethical context. Moreover, it is equally valuable for analysing complex data, assessing situations, and implementing the most suitable actions. In a recent article, "critical thinking is described as a cognitive process, purposeful, self-regulatory judgment that has two components of cognitive skills (interpretation, analysis, inference, evaluation, explanation, and self-regulation) and a motivational component (the disposition toward critical thinking)" (19).
In recent years, more focus has been placed on improving higher-level thinking skills (critical thinking and clinical reasoning) to help physicians retain clinical integrity and medical professionalism. More than two-thirds of reported diagnostic mistakes are currently linked to physicians' lack of critical thinking ability. Despite the belief that healthcare professionals must be logical thinkers, there is no consensus on the most successful model for teaching and evaluating critical thinking and clinical reasoning skills (8). Recent research, which evaluated a wide range of quantitative and qualitative competencies, including behavioural and communication skills, showed that the OSCE was valid, reliable, and essential for positive educational effects. Several authors have emphasised that one aim of OSCEs is to develop facility in critical thinking as a precursor to practice (20).
There is no valid assessment instrument combining clinical and critical appraisal skills to evaluate safe practice in non-surgical aesthetics. Therefore, using different stations to evaluate clinical skills and critical thinking ability is beneficial in this evaluation. The clinical skills measured were consultation skills pertaining to NSA, knowledge of the signs of ageing and the underlying anatomy, assessment of skin quality, full-face assessment to identify treatment needs for optimal results, clinical photography, development of an efficient and optimal treatment plan, safer injection techniques, post-treatment advice, complication management, and situational judgement. The critical appraisal skills measured included understanding of the research relating to facial assessment, botulinum toxin science, rheology of soft-tissue fillers, and complication management.
It has been shown that generalisability coefficients vary considerably, from 0.40 to 0.85, with the majority falling between 0.5 and 0.6 (6). In the present study, the average-measures intraclass correlation coefficient ranges from 0.916 to 0.968, which exceeds the reliability coefficient threshold of 0.8. The variability in the generalisability coefficients may be attributed to the examinees' variable performance on different OSCE stations (content specificity). The i-OSCE is shown to be robust and able to test applicants for their competence to carry out multiple component tasks.

Conclusion
The integrated OSCE has been shown to be a reliable and accurate assessment tool for examining trainee aesthetic physicians' professional competence. This tool objectively evaluated trainee physicians' critical thinking and clinical skills, including clinical reasoning. Program directors should consider deploying the i-OSCE along with OSPE as an assessment tool in the postgraduate curriculum for non-surgical aesthetics.