MEDICAL EDUCATION QUARTET

Assessment of clinical competence

Val Wass, Cees Van der Vleuten, John Shatzer, Roger Jones

Tests of clinical competence, which allow decisions to be made about medical qualification and fitness to practise, must be designed with respect to key issues including blueprinting, validity, reliability, and standard setting, as well as clarity about their formative or summative function. Multiple-choice questions, essays, and oral examinations can be used to test factual recall and applied knowledge, but more sophisticated methods are needed to assess clinical performance, including directly observed long and short cases, objective structured clinical examinations, and the use of standardised patients.
The goal of assessment in medical education remains the development of reliable measurements of student performance which, as well as having predictive value for subsequent clinical competence, also have a formative, educational role. Assessment drives learning. Many people argue that this statement is incorrect and that the curriculum is the key in any clinical course. In reality, students feel overloaded by work and respond by studying only for the parts of the course that are assessed. To promote learning, assessment should be educational and formative: students should learn from tests and receive feedback on which to build their knowledge and skills. Pragmatically, assessment is the most appropriate engine on which to harness the curriculum.

Additionally, with an increasing focus on the performance of doctors and on public demand for assurance that doctors are competent, assessment also needs to have a summative function. Tests of clinical competence, which allow a decision to be made about whether a doctor is fit to practise or not, are in demand. This demand raises a challenge for all involved in medical education. Tests that have both a formative and a summative function are hard to design. Yet, if assessment focuses only on certification and exclusion, the all-important influence on the learning process will be lost.

The panel shows the key measurement issues that should be addressed when designing assessments of clinical competence.

Blueprinting
If students focus on learning only what is assessed, assessment in medical education must validate the objectives set by the curriculum. Test content should be carefully planned against learning objectives, a process known as blueprinting. For undergraduate curricula, for which the definition of core content is now becoming a requirement,3 this process could be easier than for postgraduate examinations, where curriculum content remains more broadly defined. However, conceptual frameworks against which to plan assessments are essential and can be defined even for generalist college examinations. Assessment programmes must also match the competencies being learnt and the teaching formats being used.

Many medical curricula define objectives in terms of knowledge, skills, and attitudes. These cannot be properly assessed by a single test format. All tests should be checked to ensure that they are appropriate for the objective being tested. A multiple-choice examination, for example, could be a more valid test of knowledge than of communication skills, which might be best assessed with an interactive test. However, because of the complexity of clinical competence, many different tests should probably be used.

Panel: Key issues that underpin any test
Summative/formative: Be clear on the purpose of the test.
Blueprinting: Plan the test against the learning objectives of the course or the competencies essential to the speciality.
Validity: Select appropriate test formats for the competencies to be tested. This action invariably results in a composite examination.
Reliability: Sample adequately; clinical competencies are inconsistent across different tasks. Test length is crucial if high-stakes decisions are required. Use as many examiners as possible.
Standard setting: Define the endpoint of assessment. Set the appropriate standard, eg, minimum competence, in advance.

Lancet 2001; 357: 945-49. Department of General Practice and Primary Care, Guy's, King's and St Thomas' School of Medicine, Weston Education Centre, London SE5 9RJ, UK (V Wass FRCGP, Prof R Jones DM); Department of Educational Development and Research, University of Maastricht, Maastricht, Netherlands (Prof C Van der Vleuten PhD); and Johns Hopkins University School of Medicine, Baltimore, MD, USA (J Shatzer PhD). Correspondence to: Dr Val Wass.

Standard setting
Inferences about students' performance in tests are essential to any assessment of competence. When assessment is used for summative purposes, the score at which a student will pass or fail has also to be defined. Norm referencing, comparing one student with others, is frequently used in examination procedures if a specified number of candidates are required to pass, ie, in some college membership examinations. Performance is described relative to the positions of other candidates. As such, variation in the difficulty of the test is compensated for. However, differences in the abilities of student cohorts sitting the test are not accounted for. Therefore, if a group is above average in ability, those who might have passed in a poorer cohort of students will fail. Norm referencing is clearly unacceptable for clinical competency licensing tests, which aim to ensure that candidates are safe to practise. A clear standard needs to be defined, below which a doctor would not be judged fit to practise. Such standards are set by criterion referencing. In this case, the minimum standard acceptable is decided before the test. However, although differences in candidate ability are accounted for, variation in test difficulty becomes a key issue; standards should be set for each test, item by item. Various time-consuming but essential methods have been developed to do this, such as the techniques of Angoff and Ebel. The choice of method will depend on available resources and on the consequences of misclassifying examinees as having passed or failed.
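The Angoff technique mentioned above is, at its core, simple arithmetic: each judge estimates, item by item, the probability that a borderline (minimally competent) candidate would answer correctly, and the pass mark is the sum of the judges' mean estimates. A minimal sketch of that calculation follows; the judge panel and probability values are invented purely for illustration.

```python
def angoff_cut_score(judgements):
    """Compute an Angoff-style cut score.

    judgements: one list per judge, each containing the estimated
    probability that a borderline candidate answers each item correctly.
    The cut score is the sum over items of the mean estimate per item.
    """
    n_judges = len(judgements)
    n_items = len(judgements[0])
    cut = 0.0
    for item in range(n_items):
        item_mean = sum(judge[item] for judge in judgements) / n_judges
        cut += item_mean
    return cut

# Three judges rating a five-item test (illustrative figures only).
judges = [
    [0.6, 0.8, 0.5, 0.9, 0.7],
    [0.5, 0.7, 0.6, 0.8, 0.6],
    [0.7, 0.9, 0.4, 0.9, 0.8],
]
print(angoff_cut_score(judges))  # pass mark out of 5 items
```

The time-consuming part in practice is not this arithmetic but convening judges and reconciling their estimates, which is why resources weigh on the choice of method.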
[Figure 1: Reported reliability when 4 h testing times are used for different test formats (reliability, 0 to 1.0, plotted by examination method). MCQ=multiple-choice examination; PMP=patient management problem; OSCE=objective structured clinical examination.]

Validity versus reliability
Just as summative and formative elements of assessment need careful attention when planning clinical competence testing, so do the issues of reliability and validity. Reliability is a measure of the reproducibility or consistency of a test, and is affected by many factors such as examiner judgments, cases used, candidate nervousness, and test conditions. Two aspects of reliability have been well researched: inter-rater and inter-case (candidate) reliability. Inter-rater reliability measures the consistency of rating of performance by different examiners. The use of multiple examiners across different cases improves inter-rater reliability. In an oral examination, the average judgment of ten examiners, each assessing the candidate on one question,

...knows how (applied knowledge). These can be more easily assessed with basic written tests of clinical knowledge such as multiple-choice questions. Clearly, this test format cannot assess the more important facet of competency required by a qualifying doctor, ie, the "shows how". This factor is a behavioural rather than a cognitive function and involves "hands on", not "in the head", demonstration. A senior student about to start work with patients must be able to show an ability to assess individuals and carry out necessary procedures.
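The advice to sample adequately and to use as many examiners as possible reflects a standard psychometric result, not spelled out in the article: combining more independent judgments raises overall reliability. The classical Spearman-Brown prophecy formula predicts this gain when a test is lengthened, for example by adding examiners or cases; the single-judgment reliability of 0.3 used below is an invented figure for illustration.

```python
def spearman_brown(r, n):
    """Predicted reliability when n units (examiners, cases, items) of
    single-unit reliability r are combined, by the Spearman-Brown
    prophecy formula: R_n = n*r / (1 + (n - 1)*r)."""
    return n * r / (1 + (n - 1) * r)

# Illustrative: one oral-examination judgment with reliability 0.3,
# versus the average of 4 and of 10 independent judgments.
for n in (1, 4, 10):
    print(n, round(spearman_brown(0.3, n), 2))
```

Under these assumed numbers, ten examiners lift reliability from 0.3 to about 0.81, which is consistent with the article's point that test length and examiner numbers are crucial for high-stakes decisions.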