Abstract
Objective:
We present reflections on student evaluation of teaching (SET) in the context of recent higher education research that assesses SET alongside concurrent and/or subsequent student performance.
Conclusions:
In a sense, there is in-built cynicism in SET, with more favourable SET for easier assessment. There is emerging evidence that SET is inversely related to the performance of students in subsequent courses, i.e. the higher the ratings, the more poorly students perform in subsequent studies. It is proposed that SET should be combined with contemporaneous formative and summative assessments of student performance in medical school settings, especially in psychiatry education.
Cecil Graham: What is a cynic?
Lord Darlington: A man who knows the price of everything, and the value of nothing.
In the context of limited teaching time and burgeoning curricula, all medical specialties are in the invidious position of being constrained in the medical school education of future doctors. In an increasingly commodified university sector, along with burgeoning student consumerism, 1 decisions around curricula involve consideration of student evaluation of teaching (SET) as a measure of quality and consumer satisfaction. However, given the considerable concerns about the validity of SET, we previously argued that SET must be balanced by objective assessment of student performance (ASP) in course skills and knowledge. 2 We acknowledge an imbalance between SET and ASP, in that only a subset of students (c.<25%) complete SET, whereas all students participate in ASP. Medical students typically have little or no course choice in comparison to general university students, and this may also contribute to lower SET ratings. This is especially important in medical specialties such as general practice and psychiatry that may be viewed as more challenging and of lesser status, in which SET may be unduly affected by negative sentiment, 2 or indeed, cynicism. Medical students have regarded psychiatry teaching as unscientific and emotionally draining, and have found clinical placements more challenging than those in other medical specialties. 3 Such impressions may impact dramatically on SET. Here, we seek to encourage discussion, based on research in higher education that evaluates SET specifically in correlation with concurrent as well as consequent course performance, and argue that ultimately similar evaluations are necessary in medical education, especially in psychiatry.
SET: concurrent and consequent performance of students
It is well known that SET is correlated with performance on concurrent courses, and a number of reasons, including but not limited to grade inflation, have been proposed as factors. 4 However, a number of studies demonstrate that more positive SET for specific university educators does not necessarily correspond to improved performance of students in subsequent courses. 4
In an excellent review, Stroebe 4 describes five studies that relate teaching ratings in a concurrent course to student performance in subsequent courses, which we summarise here. Johnson 5 found that some items of SET were negatively correlated with grades on subsequent courses. 4 Yunker and Yunker 6 studied teaching evaluations and student performance in introductory and intermediate accounting courses, and found a negative correlation between evaluation of the introductory course and performance in the subsequent course. 4 Weinberg et al. 7 investigated students undertaking introductory and intermediate economics courses at Ohio State University from 1995 to 2004, finding that course ratings were correlated with grades in the concurrent course, but not in subsequent courses. 4 Carrell and West 8 conducted a study at the US Air Force Academy, measuring student performance in mandatory follow-on classes, and found that professors who excelled on SET in concurrent courses taught in ways that improved SET, but at the cost of poorer performance of students in more advanced follow-on classes. In a study at Bocconi, a private Italian university, Braga et al. 9 found that students’ performance in follow-on coursework was negatively correlated with the SET of the professors.
Consequently, there is emerging evidence that SET is inversely related to the performance of students on subsequent courses, i.e. the higher the SET ratings, the more poorly students perform in subsequent studies. It has been posited that more experienced educators who effectively teach a foundation for later study receive lower SET scores as a consequence of the degree of challenge that students experience. 8 Conversely, educators who provide less challenging teaching, which perhaps reinforces the illusion of mastery of the course material, 10 while at the same time being potentially vulnerable to grade inflation due to easy assessments, are more likely to be rated highly on SET. In medicine, assessment drives learning. Accordingly, students are likely to rate highly any educator who helps them to pass the next barrier exam and reach the next stage of training. 11 Of course, this appears to be an easier process in medical specialties assessment, especially in more easily operationalised and circumscribed fields such as surgical anatomy. 11 However, there is a lack of research specific to SET and ASP in medical fields, and such research would better inform future medical curriculum development.
Addressing SET concurrent versus consequent performance
On the basis of the above, we propose that SET must be accompanied by longitudinal assessment of student performance (ASP), especially in medical education in so-called ‘Cinderella’ specialities such as psychiatry and general practice, in which the subjective component of SET, infused with negative sentiment, may be invalid if assessed without ASP. 2 However, the very considerable challenge is that, within Australian postgraduate medical school programmes, there is insufficient time, and no appropriate follow-on course, in which to assess performance. For example, there is frequently no directly related subsequent course to psychiatry, as it is often taught with surgery, general medicine and obstetrics/gynaecology. While psychiatric skills could be assessed in medical interns or junior medical officers, this would require a separate assessment process that would be confounded by not directly following on from the medical school teaching and by the presence of possible intervening factors.
A possible solution, which has the added benefit of enhancing learning, 12 is the development of progressive self-assessment tasks during the course of the term, which are graded for formative purposes, but separate from the summative formal course performance examinations. For example, during a typical 8-week psychiatry term, this may involve fortnightly or mid-term self-assessment tasks, with the grading provided to students to enable them to calibrate their learning. This would yield at least one to two data points per student to assess performance within the term, with a further data point at the formal summative examinations. Such self-assessment tasks should ideally be available on demand via computer marking, with detailed feedback to guide further learning. Additionally, a further assessment in the pre-internship program would result in three or more data points for ASP. Similarly, SET should be assessed contemporaneously with ASP throughout the medical education process.
Progressive assessment is increasingly regarded as the new norm in medical education and, although yet to be widely adopted, is gaining impetus as a means of achieving more effective learning. 13 We are proposing a model that involves appropriately graded formative examinations, which allow students to calibrate their knowledge against the final summative examinations, for assessment of student performance combined with SET at the end of each teaching term (see below). In our model, past summative examination questions (multiple choice and extended matching, at our university) will initially be administered as a self-assessment task, with feedback and discussion of results midway through an 8-week term (of which there are four per year). The use of an examination that facilitates testing and reflection upon knowledge is consistent with the cognitive science of learning, enhanced understanding and consolidation of learning. 12 The combined mid-term assessment results for the whole year cohort may then be compared with the results from the end-of-year summative examination. Further extension of assessment of performance into the pre-internship period and pre-specialisation junior medical officer postgraduate education will add real-world validity, but will necessarily be more temporally distant from, and thus less clearly linked to, SET.
Conclusion
Building contemporaneous SET as well as graded formative and summative ASP longitudinally into medical education may yield useful information on the relationship of the two types of feedback, more accurately informing on the effectiveness of teaching and student performance. In contrast to the cynical approach, we can thus assess the value as well as the price (or consumer sentiment) of medical education in psychiatry and related disciplines.
