Objective structured clinical examination for pharmacy students in Qatar: cultural and contextual barriers to assessment

K.J. Wilby,¹ E.K. Black,² Z. Austin,³ B. Mukhalalati,¹ S. Aboulsoud ⁴ and S.I. Khalifa ¹

"الفحص السريري الموضوعي المنظَّم" لطلاب الصيدلة في قطر: العوائق الثقافية والسياقية لإجراء التقييم

كابل ويلبي، إميلي بلاك، زوبن أوستن، بنان مخللاتي، سمر أبو السعود، شريف خليفة

الخلاصة: هدفت هذه الدراسة إلى تقييم الجدوى والقدرة الدفاعية النفسية للتنفيذ الشامل "للفحص السريري الموضوعي المنظّم" OSCE للبرنامج الكامل للصيدلة في أحد سياقات الشرق الأوسط، وإلى التعرف على ميسِّرات ومعيقات تنفيذه في مواقع جديدة. وقد أعِدَّت ثمان حالات، ووُثِّقت مصدوقيتها، ووُضعت لها معايير وفقاً لمخطط محدد، وقيِّمت عن طريق طلاب صيدلة متخرجين. وجرى تقييم موثوقية المقيِّمين باستخدام معامِلات ما بين الصفوف. وقيِّمت المصدوقية المتزامنة بمقارنة نتائج "الفحص السريري الموضوعي المنظّم" بدرجات مقرَّر المهارات المهنية. وتم الاحتفاظ بالملاحظات الميدانية لوضع توصيات تنفَّذ في سياقات أخرى. كانت علامة النجاح في الامتحان 424 نقطة من أصل 700 (60.6%). وقد نجح جميع المشاركين الـ 23. وكان متوسط الإنجاز 74.6%. تم الحصول على موثوقية منخفضة إلى متوسطة ما بين المقيِّمين بشأن المكوِّنات التحليلية والعالمية (متوسط مُعامل ما بين الصفوف 0.77 و0.48 على التوالي). وفي الخلاصة: إن "الفحص السريري الموضوعي المنظّم" كان مجدياً في قطر، لكن المخاوف بشأن المصدوقية والموثوقية المتعلقة بالسياق يجب أن تعالج قبل التكرار مستقبلاً في قطر وغيرها.

ABSTRACT This study aimed to evaluate the feasibility and psychometric defensibility of implementing a comprehensive objective structured clinical examination (OSCE) on the complete pharmacy programme for pharmacy students in a Middle Eastern context, and to identify facilitators and barriers to implementation within new settings. Eight cases were developed, validated, and had standards set according to a blueprint, and were assessed with graduating pharmacy students. Assessor reliability was evaluated using intraclass correlation coefficients (ICCs). Concurrent validity was evaluated by comparing OSCE results to professional skills course grades. Field notes were maintained to generate recommendations for implementation in other contexts. The examination pass mark was 424 points out of 700 (60.6%). All 23 participants passed. Mean performance was 74.6%. Low to moderate inter-rater reliability was obtained for analytical and global components (average ICC 0.77 and 0.48, respectively). In conclusion, OSCE was feasible in Qatar but context-related validity and reliability concerns must be addressed prior to future iterations in Qatar and elsewhere.

Examen clinique objectif structuré complet au Qatar : évaluation des barrières culturelles et contextuelles

RÉSUMÉ La présente étude avait pour objectif d’évaluer la faisabilité et la solidité psychométrique de la mise en place d’un examen clinique objectif structuré (ECOS) du programme pharmaceutique complet pour les étudiants en pharmacie au Moyen-Orient, ainsi que d’identifier les leviers et les obstacles à sa mise en place dans les nouveaux établissements. Huit cas ont été élaborés, validés, se sont vus attribuer des normes en fonction d’un modèle, et ont ensuite été soumis à des étudiants en pharmacie pour évaluation. La fiabilité des examinateurs a été mesurée au moyen de coefficients intra-classe (CIC). La validité concourante a été évaluée en comparant les résultats de l’ECOS aux notes finales de cours sur les compétences professionnelles. Des notes d’observation ont été conservées en vue de la production de recommandations pour la mise en place du test dans d’autres contextes. La note de passage de l’examen était de 424 points sur 700 (soit 60,6 %). Les 23 participants ont tous réussi l’examen. La performance moyenne était de 74,6 %. Des taux de fiabilité intra-examinateur faible à moyen ont été obtenus pour les composantes analytiques et globales (CIC moyen de 0,77 et 0,48 respectivement). Pour conclure, l’ECOS était réalisable au Qatar, mais les questions de validité et de fiabilité dépendant du contexte doivent être prises en compte avant toute reproduction du test au Qatar et dans d’autres pays.

¹College of Pharmacy, Qatar University, Doha, Qatar (Correspondence to K.J. Wilby). ²College of Pharmacy, Faculty of Health Professions, Dalhousie University, Halifax, Canada. ³Leslie Dan Faculty of Pharmacy, University of Toronto, Canada. ⁴Qatar Council for Healthcare Practitioners, Supreme Council for Health, Doha, Qatar.

Introduction

Objective Structured Clinical Examinations (OSCEs) are commonly used to evaluate the competence of healthcare professionals (1). OSCEs aim to simulate practice settings and real-life patient care scenarios (2). They have been used to assess diverse competencies, including those noted within many health professions, such as patient care, management, advocacy, professionalism, collaboration, research and communication (3). OSCEs are used extensively within medical education, particularly in western countries, and have recently been adopted as part of the admissions process to courses for the health professions (4). Canada currently uses OSCEs as part of the pre-registration competency assessment system for entry-to-practice candidates in a variety of health professions, such as medicine and pharmacy (5–7).

There is a significant gap in knowledge regarding the feasibility and validity of adapting a traditional OSCE into non-western contexts. Concerns have been raised that the traditional methods used to create and implement an OSCE may not be valid in non-western settings (8). Case development, standard setting, standardized patient training and performance, and assessor training and performance may all be affected by contextual factors such as practice differences, cultural norms, and lack of past experience in competency-based examinations. Furthermore, adapted assessment tools may not be appropriate in different local contexts, especially in cultures that use more qualitative approaches to assessment, as well as those that differ in terms of verbal and nonverbal communication.

Many countries in the Middle East are taking significant steps towards improving healthcare practice and education. Despite attempts to adopt best practices in education programmes for health professionals, a literature review identified few reports of competency-based examinations, including OSCEs, within this region (9–11). Articles identified focused on perceptions and satisfaction, rather than validity and reliability. The cultural and practice context of the Middle East, in particular the countries of the Gulf Cooperation Council, is clearly different from that in western countries. This is evident across an array of practices, including gender relationships and communication, hierarchical healthcare structures, conceptions of teams and interprofessional collaboration, and verbal/nonverbal communication practices. Such significant contextual differences may have equally significant influences on assessments, in particular, performance-based assessments such as OSCEs.

The College of Pharmacy at Qatar University in Doha has pioneered pharmacy education in the Middle East by seeking and obtaining full accreditation for the BSc in Pharmacy and DPharm programmes from the Canadian Council for Accreditation of Pharmacy Programs (CCAPP) (12). Accreditation requires that instruction and assessment methods mirror Canadian standards, including use of OSCEs as part of the programme. However, as discussed above, it is unclear whether adopting these methods is valid and psychometrically defensible. During the 2013–2014 academic year, a pilot project between Qatar University, the Supreme Council of Health in Qatar, and the University of Toronto aimed to develop, implement and evaluate a cumulative competency-based OSCE for graduating pharmacy students in Qatar, according to Canadian standards and procedures. The results of this pilot project are described in the present report.

The primary objective of this pilot project was to evaluate the feasibility and psychometric defensibility of implementing a competency-based comprehensive OSCE, according to Canadian standards, for graduating pharmacy students in Qatar. Secondary objectives were to determine inter-rater reliability for assessment of student performance, identify facilitators and barriers to developing and implementing an OSCE within a Middle Eastern context, and to make recommendations for faculties in other settings attempting to do the same.

Methods

The design and implementation of this project were overseen by a steering committee that consisted of representatives from Qatar University and the Supreme Council for Health with consultation from the University of Toronto. The steering committee consisted of three chief administrators and two chief examiners. All procedures and assessment methods were adopted from current practices at the University of Toronto. Qatar University Institutional Review Board exempted the project from full review (QU-IRB 373-E/14).

Case development, validation and standard setting

A 2-day case-writing workshop was led by consultants from the University of Toronto and consisted of participants representing academia, government, hospital practice, primary care practice and community practice. Thirty-six participants were recruited and divided into groups of 6. Each group comprised a mixture of expertise from the practice settings listed in Table 1. At least one pharmacy faculty member was present in each group. Confidentiality agreements were signed by all participants and administrators. Each group was responsible for writing 2 OSCE cases, using standard case-writing templates widely used in Canadian pharmacy education and in accordance with a blueprint developed by the chief examiners.

The blueprint was developed to ensure relevant distribution of competencies across a broad range of disease states as addressed by the OSCE. Chief examiners developed the blueprint based on the Entry to Practice Blueprint of the Pharmacy Examination Board of Canada and the educational outcomes for the First Professional Degree Programs of the Association of Faculties of Pharmacy of Canada (6,13). In addition, chief examiners considered most common disease states encountered in Qatar and the amount of time dedicated to each disease state in the undergraduate curriculum. Complexity was determined using a matrix of both simple and complex patients and problems. An example of a complex patient is one with multiple co-morbidities or one presenting with a language barrier. An example of a complex problem is a patient presenting for self-medication but who requires referral for presence of alarm symptoms. Communication skills for each case were evaluated according to a standard global assessment scale adopted with permission from the University of Toronto.

Each case was subsequently reviewed and validated by a different (separate) group of 4–6 participants. The objective of this process was to ensure technical/pharmacotherapeutic accuracy, contextual appropriateness of the scenario and case details, and to evaluate feasibility of portrayal within the Qatari pharmacy education context. During this phase, the review/validation group was authorized to make agreed-upon changes to enhance the quality, rigor and defensibility of each case.

The 3rd step of the case development process involved standard setting. Each validated case was given to a 3rd group of 4–6 participants. Each group was provided with preliminary training on application of the Angoff method (14) for standard setting. Each group was then asked to set standards (i.e. establish minimal performance levels for each item on the checklist and an overall pass mark for the entire case). The standard setting group was also asked to determine a relative weighting of global assessment vis-à-vis analytical checklist for the unique circumstances of each case.
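The standard-setting step can be illustrated with a minimal sketch of a classic Angoff calculation. The source does not report the exact variant used, so the following is an assumption: each judge estimates, for every checklist item, the probability that a minimally competent candidate would perform it, the estimates are averaged across judges, and the case pass mark is the points-weighted sum of those averages. The function name, judge estimates and item points are hypothetical.

```python
from statistics import mean


def angoff_cut_score(judge_estimates, item_points):
    """Return a case pass mark from Angoff-style judge estimates.

    judge_estimates: one list per judge; each inner list holds, per checklist
        item, the judged probability (0-1) that a minimally competent
        candidate performs that item.
    item_points: points available for each checklist item.
    """
    n_items = len(item_points)
    for estimates in judge_estimates:
        if len(estimates) != n_items:
            raise ValueError("every judge must rate every checklist item")
    # Minimal performance level per item: mean estimate across judges.
    item_means = [mean(j[i] for j in judge_estimates) for i in range(n_items)]
    # Case pass mark: expected points of a minimally competent candidate.
    return sum(p * m for p, m in zip(item_points, item_means))


# Hypothetical data: 3 judges, 4 checklist items worth 10 points in total.
judges = [
    [0.9, 0.6, 0.5, 0.8],
    [0.8, 0.7, 0.4, 0.9],
    [1.0, 0.5, 0.6, 0.7],
]
points = [2, 3, 3, 2]
print(angoff_cut_score(judges, points))
```

Averaging across judges gives every judge equal influence on the numerical result, but it does not correct for one subgroup dominating the discussion that precedes the estimates.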

OSCE implementation

Of the 12 OSCE cases developed during the workshop, 8 were selected (by matching to blueprint specifications) for the actual examination. Forty-eight participants completed the OSCE (all 23 graduating students from the BSc Pharmacy programme in 2014, 24 volunteer pharmacists, and 1 standard control with no health professional background). To accommodate this number, the OSCE was designed to run over 2 cycles (morning and afternoon with seclusion during cycle overlap) and on 3 separate tracks (giving 24 stations running at any one time). As this was a pilot project to build capacity and assess validity and reliability of the examination, 2 standardized patients and 2 assessors were recruited for each case station. All assessors possessed a professional degree in pharmacy and were recruited from academia, hospital practice, primary care practice and community practice. All assessors were required to complete a 4-h training session run by the consultants from the University of Toronto to enhance reliability. Standardized patients were recruited from pools obtained from other health professional programmes, as well as amateur participants identified through internal and external advertisements. All standardized patients were also required to attend a 4-h training session to enhance consistency of portrayal. Examination centre staff (e.g. track coordinators, timekeepers, document collectors, runners and registration personnel) were recruited from the staff and faculty at Qatar University and the Supreme Council for Health and completed a 1-h training session.

On the day of the OSCE, all assessors and standardized patients were given a short orientation by the steering committee and consultants. Following this orientation, the assessors and standardized patients for each case (from all tracks) convened in rooms to spend 2 h completing a case “dry run” (e.g., reading and rehearsing the case as part of a standardization exercise, to enhance reliability). To ensure examination security, this was the 1st time assessors or standardized patients were exposed to the actual case used in the examination. During the dry run, chief examiners were available to answer content-related questions.

All participants and staff (standardized patients, assessors, examination centre staff, and students) signed confidentiality agreements upon arrival at the examination centre. An incident reporting system was developed that allowed assessors or examination centre staff to complete reports if any incident or unusual event occurred that might have unfairly influenced student performance.

Field notes were maintained by examination centre staff during the process to document observed facilitators and barriers to implementation of an OSCE.

Statistical analysis

A priori pass marks for each station were determined according to the standard setting process described above. Student performance for each station was calculated as the sum of weighted performances in both the analytical checklist and global assessment; overall examination performance was calculated as the sum of performances in all stations. Standard setting groups weighted the analytical and global assessments according to the general competencies addressed. A maximum of 60/40 weighting was allowed in favour of either assessment tool. To assess distribution of student performance, the Shapiro–Wilk test was used to determine normality and results were summarized using histograms. Intraclass correlation coefficients (ICCs; 2-way random model) were determined for inter-rater reliability in the analytical checklists and global assessments per station. Overall inter-rater reliability for global assessment as a whole was also calculated using ICCs.
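The analyses in this study were run in SPSS; the sketch below only illustrates how an ICC under a 2-way random model can be computed from a students-by-assessors score table. It assumes the single-measure, absolute-agreement form, often written ICC(2,1), which the source does not specify. The function name and the ratings are hypothetical.

```python
def icc_2_1(ratings):
    """Two-way random, absolute-agreement, single-measure ICC.

    ratings: list of rows, one row per student, one column per assessor.
    """
    n = len(ratings)      # number of students (subjects)
    k = len(ratings[0])   # number of assessors (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares for students, assessors and residual error.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical global-assessment scores: 5 students, 2 assessors per station.
station_ratings = [[4, 3], [5, 5], [2, 3], [3, 3], [4, 5]]
print(round(icc_2_1(station_ratings), 2))
```

Because this form penalizes systematic differences between assessors as well as random disagreement, an assessor who scores consistently higher than a partner lowers the coefficient even when both rank students identically.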

Concurrent validity was assessed using final grades from 2 courses in year 3 of professional study. The courses, Professional Skills VI and Integrated Case-Based Learning IV, were chosen because of the overlapping nature of knowledge and skills required for the OSCE. Also, the objectives of these courses closely mirrored the final blueprint of the OSCE. Validity was assessed by calculating quartiles for OSCE results and selected course final grades and determining percentage overlap of students in each quartile. All analyses were performed using SPSS version 22.
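The quartile comparison can be sketched as follows. The source does not state how quartile boundaries or ties were handled, so this sketch assumes rank-based groups cut by integer division; with 23 students that yields groups of 5, 6, 6 and 6 from lowest to highest. The function names and scores are hypothetical.

```python
def quartile_groups(scores):
    """Split students into 4 rank-based groups, lowest quartile first.

    scores: dict mapping student id -> score.
    """
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    # Assumed boundary rule: integer division of the ranked list.
    bounds = [(i * n) // 4 for i in range(5)]
    return [set(ranked[bounds[i]:bounds[i + 1]]) for i in range(4)]


def quartile_overlap(osce_scores, course_scores):
    """Percentage of each OSCE quartile also found in the same course quartile."""
    if set(osce_scores) != set(course_scores):
        raise ValueError("both score sets must cover the same students")
    osce_q = quartile_groups(osce_scores)
    course_q = quartile_groups(course_scores)
    return [
        100.0 * len(o & c) / len(o) if o else 0.0
        for o, c in zip(osce_q, course_q)
    ]


# Hypothetical scores for 8 students.
osce = {"s1": 50, "s2": 55, "s3": 60, "s4": 65,
        "s5": 70, "s6": 75, "s7": 80, "s8": 85}
course = {"s1": 52, "s2": 70, "s3": 58, "s4": 66,
          "s5": 60, "s6": 90, "s7": 85, "s8": 80}
print(quartile_overlap(osce, course))  # [50.0, 50.0, 0.0, 50.0]
```

With groups this small, moving a single student across a boundary shifts a quartile's overlap by roughly 17 to 20 percentage points, so such percentages are best read as coarse indicators.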

Results

Forty-seven assessors, 44 standardized patients, 23 student participants, 24 volunteer pharmacists, and 1 standard control participated in the OSCE. As only 47 assessors were available, 8 students had only 1 assessor for 1 station. Therefore, ICCs were not calculated for these 8 students on this specific station. After careful review of incident reports, 1 station was eliminated from analysis based on feedback from multiple assessors regarding reliability concerns between standardized patients. Based on the final 7 included stations and accounting for weighting differences between analytical and global assessments, the overall examination pass mark was 424 points out of a total possible 700 points (60.6%) across analytical checklists and global assessments.
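The scoring arithmetic can be sketched as follows. The sketch assumes that each of the 7 retained stations carries 100 points (the source reports only the 700-point total), that component scores are expressed as percentages, and that a total equal to the pass mark counts as a pass. The function names and candidate scores are hypothetical; the 60/40 weighting limit and the 424-point pass mark come from the text.

```python
def station_score(checklist_pct, global_pct, checklist_weight, station_points=100.0):
    """Combine analytical checklist and global assessment into a station score."""
    # The source allows at most a 60/40 split in favour of either tool.
    if not 0.4 <= checklist_weight <= 0.6:
        raise ValueError("checklist_weight must lie between 0.4 and 0.6")
    combined_pct = (
        checklist_weight * checklist_pct + (1 - checklist_weight) * global_pct
    )
    return station_points * combined_pct / 100.0


def exam_summary(station_scores, pass_mark, total_points):
    """Sum station scores and express totals as percentages."""
    total = sum(station_scores)
    return {
        "total": total,
        "percent": 100.0 * total / total_points,
        "pass_mark_percent": 100.0 * pass_mark / total_points,
        # Assumption: a total equal to the pass mark counts as a pass.
        "passed": total >= pass_mark,
    }


# Hypothetical candidate: (checklist %, global %, checklist weight) per station.
components = [(80, 70, 0.6), (75, 85, 0.4), (90, 60, 0.5), (70, 70, 0.6),
              (65, 80, 0.5), (85, 75, 0.6), (60, 90, 0.4)]
scores = [station_score(c, g, w) for c, g, w in components]
summary = exam_summary(scores, pass_mark=424, total_points=700)
print(round(summary["pass_mark_percent"], 1))  # 60.6
```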

Student performance results are given in Table 2. All 23 students achieved a pass score on the OSCE (i.e. combined station scores >424 points), indicating 100% success in meeting minimal competence standards as defined through this process. When converted to a percentage, the mean student performance on the OSCE was 74.6%. Results were normally distributed, as determined by a nonsignificant Shapiro–Wilk test and histogram (Figure 1). ICCs for inter-rater reliability for the analytical checklist and global assessment per station are given in Table 2. Low to moderate reliability was obtained with large variation documented among stations. Overall inter-rater reliability for the global assessment was considered low for a high-stakes examination, with a value of 0.64, 95% confidence interval (CI) 0.50–0.74.

Concurrent validity assessments showed poor results. For Professional Skills VI, 60% overlapped with the lowest quartile, 17% with the 2nd quartile, 33% with the 3rd quartile and 0% in the top quartile. Forty-two percent overlap occurred with the top 12 of 23 students and 45% occurred with the bottom 11 students. For Integrated Case-Based Learning IV, 33% overlapped with each quartile aside from the bottom quartile, which overlapped with 40%. When split into the top and bottom 50% ranges, 67% of students achieved the top 50% grades for both the course and the OSCE, while 45% achieved the bottom 50% scores for both.

Observations were recorded throughout the process of both OSCE development and implementation. First, steering committee members and consultants questioned the validity of the standard setting process in this context, as faculty members from Qatar University drove the process with limited input from practitioners. This finding may have biased standards in favour of passing, as faculty members are more aware of student competency within the classroom. Second, inconsistencies were identified within validated cases that required modification. This suggests flaws within the validation process, or lack of understanding on the part of case reviewers, that should be addressed in future cycles. Lastly, it was difficult to recruit standardized patients and those recruited had minimal experience as amateur participants. Many incident reports were written based on standardized-patient performance that may have threatened validity of student performance estimates. Recommendations for other programmes attempting to implement a cumulative OSCE in context are given in Table 3.

Discussion

This study assessed the feasibility, validity and reliability of a pilot OSCE in a Middle Eastern context, which was developed according to western processes and standards. According to this report, the OSCE was implemented successfully in Qatar and deemed to be a feasible component of the pharmacy programme curriculum. However, many issues were identified during implementation and evaluation that need to be analysed and addressed in future cycles. The following paragraphs discuss these points, which are of importance to other global centres attempting to implement a cumulative OSCE in highly contextualized settings.

All students passed the OSCE according to predefined standards set by faculty and practicing clinicians. This result is noteworthy given the high-stakes nature of the examination and suggests that standards were set falsely low. During the standard-setting process, it was observed that the faculty members within each group dominated the process without significant discussion or input from practicing clinicians. There are many possible explanations for this observation, including cultural sensitivity, in which faculty members may have assumed authoritative roles (15). The result of this imbalance could be standards that suit student strengths and minimize weaknesses, as faculty members have insight into student performance throughout the 4 years of study.

Conversely (and more likely), it is possible that the high pass rate resulted from a discrepancy between academic teaching and standards of practice. Pharmacy practice in Qatar is moving away from traditional dispensing and into more clinic-oriented functions. Current clinical practice is characterized primarily by recommendations regarding empiric dosing and adverse effect management, rather than evidence-based approaches (16,17). It is possible that content of cases, prompts delivered by standardized patients, and checklist components may have diluted standards and resulted in the high pass rate. As the Qatar University programme is linked to Canadian practice standards, it is likely that students are over-prepared for current practice in Qatar. As described in Table 3, future cycles should include a piloting phase to identify cases that may be designed and/or assessed according to standards set falsely low.

Many incident report forms were received during the OSCE for discrepancies and variations in standardized patient performance. Unfortunately, this examination was limited by the experience of the standardized patients available, which in many cases was no experience at all. However, we believe that other countries and settings will face similar barriers if attempts are made to implement a cumulative OSCE. Although standardized patient workshops and training were given, it is likely that the 4-h training session and 2-h standardization exercise were inadequate. Recommendations for future cycles (as well as other programmes) are to increase training of standardized patients through increased involvement in professional skills courses, and establishment of a professional pool of standardized patients that can be shared among different academic units specialized in training health care professionals, as variations in performance can significantly affect examination validity.

Recruited assessors came from a diverse demographic pool, varying in nationality, years of practice, practice setting, experience in assessing student performance, and experience as OSCE participants or assessors. As such, it was uncertain whether reliable assessments could be obtained. After calculation of ICCs for analytical and global assessments, low to moderate reliability was achieved, with average ICC estimates between 0.30 and 0.85. Reliability varied greatly between stations, with those stations focused on counselling points or education scoring higher. Because assessors came from diverse backgrounds and had limited experience assessing communication skills, the low to moderate inter-rater reliability obtained for global performance is not surprising. To strengthen inter-rater reliability for future cycles, greater assessment training and practice for assessors is recommended prior to OSCE implementation. In upcoming cycles, it is recommended that at least 2 assessors continue to be present for evaluation of each station to ensure that evaluation of student performance is not compromised by biased assessment.

While this study provides insight into contextual adoption of an OSCE, some limitations should be addressed. As discussed above, the greatest limitation of this project was the consistency and experience of the standardized patients. The value of this component should not be underestimated and efforts should be made to ensure standardized patients are well trained to reduce error arising from variations in performance. Also, data normality was not clear, as visual inspection of the histogram did not necessarily support the statistical test result. However, this is expected with small samples and we consider the assumption of a normal distribution to be reasonable. Future cycles should also attempt to enhance psychometric defensibility through piloting and case review, to identify content concerns prior to implementation.

This study investigated the adoption of an OSCE for pharmacy students in Qatar using Canadian assessment practices. We conclude that it is feasible to use this assessment method in the Qatari educational context, and that it adds benefit to traditional local assessment practices. However, analysis of reliability data suggests that adopting western methods for design and implementation of an OSCE increases sources of error and threatens examination validity. In future cycles, efforts should be made to minimize these errors by implementing recommendations generated from this pilot. Based on the success of this pilot, the OSCE has been adopted as a course requirement for all future students graduating from the College of Pharmacy, Qatar University.

Funding: Funding for this study was provided by internal sources from Qatar University and the Supreme Council for Health in Qatar.

Competing interests: None declared.

References

1. Epstein RM. Assessment in medical education. N Engl J Med. 2007 Jan 25;356(4):387–96. PMID:17251535
2. Harden RM, Gleeson FA. Assessment of clinical competence using an objective structured clinical examination (OSCE). Med Educ. 1979 Jan;13(1):41–54. PMID:763183
3. Jefferies A, Simmons B, Tabak D, McIlroy JH, Lee KS, Roukema H, et al. Using an objective structured clinical examination (OSCE) to assess multiple physician competencies in postgraduate training. Med Teach. 2007 Mar;29(2-3):183–91. PMID:17701631
4. Eva KW, Rosenfeld J, Reiter HI, Norman GR. An admissions OSCE: the multiple mini-interview. Med Educ. 2004 Mar;38(3):314–26. PMID:14996341
5. Austin Z, O’Byrne C, Pugsley J, Quero Munoz L. Development and validation processes for an Objective Structured Clinical Examination (OSCE) for entry-to-practice certification in pharmacy: the Canadian experience. Am J Pharm Educ. 2003;67(3): Article 76 (http://www.ajpe.org/doi/pdf/10.5688/aj670376, accessed 28 Apr 2016).
6. PEBC. The Pharmacy Examining Board of Canada; 2014 (http://www.pebc.ca, accessed 19 May 2015).
7. Medical Council of Canada Qualifying Examination Part II. Medical Council of Canada; 2014 (http://www.mcc.ca/examinations/mccqe-part-ii/, accessed 19 May 2015).
8. Lee YM, Ahn DS. The OSCE: a new challenge to the evaluation system in Korea. Med Teach. 2006 Jun;28(4):377–9. PMID:16807181
9. Amiri M, Nickbakht M. The objective structured clinical examination: A study on satisfaction of students, faculty members, and tutors. Life Sci J. 2012;9(4):4909–11.
10. Jahan F, Norrish M, Lim G, Vicente O, Ignacio G, Al-Shibil A, et al. Knowledge and perception regarding objective structured clinical examination (OSCE) and impact of OSCE workshop on nurse. Middle East J Nurs. 2013 Aug;7(4):3–9.
11. Erfanian F, Khadivzadeh T. Evaluation of midwifery students’ competency in providing intrauterine device services using objective structured clinical examination. Iran J Nurs Midwifery Res. 2011 Summer;16(3):191–6. PMID:22224105
12. Accredited Programs. Canadian Council for Accreditation of Pharmacy Programs (http://www.ccapp-accredit.ca, accessed 19 May 2015).
13. AFPC. Association of Faculties of Pharmacy of Canada (http://www.afpc.info, accessed 19 May 2014).
14. Ben-David MF. AMEE guide no. 18: standard setting in student assessment. Med Teach. 2000;22(2):120–30.
15. Hall ET. Beyond culture. New York: Anchor Books; 1976.
16. Wilby KJ, Mohamad AA, AlYafei SA. Evaluation of clinical pharmacy services offered for palliative care patients in Qatar. J Pain Palliat Care Pharmacother. 2014 Sep;28(3):212–5. PMID:25076019
17. Hooper R, Adam A, Kheir N. Pharmacist-documented interventions during the dispensing process in a primary health care facility in Qatar. Drug Healthc Patient Saf. 2009;1:73–80. PMID:21701611