Student evaluations of teaching (SET): implications for medical education in psychiatry and an approach to evaluating SET and student performance

Abstract

Objective:

We present reflections on student evaluation of teaching (SET) in the context of recent higher educational research that assesses SET, as well as concurrent and/or subsequent student performance.

Conclusions:

In a sense, there is in-built cynicism in SET, with more favourable SET for easier assessment. There is emerging evidence that SET is inversely related to the performance of students in subsequent courses, i.e. the higher the ratings, the more poorly students perform in subsequent studies. It is proposed that SET should be combined with contemporaneous formative and summative assessments of student performance in medical school settings, especially in psychiatry education.

Keywords

psychiatry, medical education, student evaluation of teaching, concurrent student performance, subsequent student performance

Cecil Graham: What is a cynic?

Lord Darlington: A man who knows the price of everything, and the value of nothing.

‘Lady Windermere’s Fan’, Oscar Wilde

In the context of limited teaching time and burgeoning curricula, all medical specialties are in the invidious position of being constrained in the medical school education of future doctors. In an increasingly commodified university sector, along with burgeoning student consumerism,¹ decisions around curricula involve consideration of student evaluation of teaching (SET), as a measure of quality and consumer satisfaction. However, given the considerable concerns about the validity of SET, we previously argued that SET must be balanced by objective assessment of student performance in the course skills and knowledge (ASP).² We acknowledge an imbalance between SET and ASP, in that only a subset of students (c.<25%) complete SET, whereas all students participate in ASP. Medical students typically have little or no course choice in comparison to general university students, and this may also contribute to lower SET ratings. This is especially important in medical specialties such as general practice and psychiatry that may be viewed as more challenging and of lesser status, in which SET may be unduly affected by negative sentiment,² or indeed, cynicism. Medical students have regarded psychiatry teaching as unscientific and emotionally draining, and have found the context of clinical placements more challenging than in other medical specialties.³ Such impressions may impact dramatically on SET. Here, we seek to encourage discussion, based on research in higher education that evaluates SET specifically in correlation with concurrent as well as consequent course performance, arguing that ultimately similar evaluations are necessary in medical education, especially in psychiatry.

SET: concurrent and consequent performance of students

It is well known that SET is correlated with performance on concurrent courses, and a number of reasons, including but not limited to grade inflation, have been proposed as factors.⁴ However, a number of studies demonstrate that more positive SET for specific university educators does not necessarily correspond to improved performance of students in subsequent courses.⁴

In an excellent review, Stroebe⁴ describes five studies that relate teaching ratings in a concurrent course to student performance in subsequent courses, which we summarise here. Johnson⁵ found that some items of SET were negatively correlated with grades on subsequent courses.⁴ Yunker and Yunker⁶ studied teaching evaluations and student performance in introductory and intermediate accounting courses, and found a negative correlation between evaluation of the introductory course and performance in the subsequent course.⁴ Weinberg et al.⁷ investigated students undertaking introductory and intermediate economics courses at Ohio State University from 1995 to 2004, finding that course ratings were correlated with grades in the concurrent course, but not in subsequent courses.⁴ Carrell and West⁸ conducted a study at the US Air Force Academy, measuring student performance in mandatory follow-on classes, finding that professors who excelled on SET in concurrent courses taught in ways that improved SET, but at the cost of poorer performance of students in more advanced follow-on classes. In a study at Bocconi University, a private Italian university, Braga et al.⁹ found that students’ performance in follow-on coursework was negatively correlated with the SET of the professors.

Consequently, there is emerging evidence that SET is inversely related to the performance of students on subsequent courses, i.e. the higher the SET ratings, the more poorly students perform in subsequent studies. It has been posited that more experienced educators who effectively teach a foundation for later study receive lower SET scores as a consequence of the degree of challenge that students experience.⁸ Conversely, educators who provide less challenging teaching, which perhaps reinforces the illusion of mastery of the course material,¹⁰ while at the same time being potentially vulnerable to grade inflation due to easy assessments, are more likely to be rated highly on SET. In medicine, assessment drives learning. Accordingly, students are likely to rate highly any educator who helps them to pass the next barrier exam and reach the next stage of training.¹¹ Of course, this appears to be an easier process in medical specialties assessment, especially in more easily operationalised and circumscribed fields such as surgical anatomy.¹¹ However, there is a lack of research specific to SET and ASP in medical fields, and such research would better inform future medical curriculum development.

Addressing SET concurrent versus consequent performance

On the basis of the above, we propose that SET must be accompanied by assessment of student performance (ASP) longitudinally, especially in medical education in so-called ‘Cinderella’ specialties such as psychiatry and general practice, in which the subjective component of SET, infused with negative sentiment, may be invalid if assessed without ASP.² However, the very considerable challenge is that, within Australian postgraduate medical school programmes, there is insufficient time and no appropriate follow-on course in which to assess performance. For example, there is frequently no directly related subsequent course to psychiatry, as it is often taught with surgery, general medicine and obstetrics/gynaecology. While psychiatric skills could be assessed in medical interns or junior medical officers, this would require a separate assessment process that would be confounded by not directly following on from the medical school teaching and by the presence of possible intervening factors.

A possible solution, which has the added benefit of enhancing learning,¹² is the development of progressive self-assessment tasks during the course of the term, which are graded for formative purposes, but separate from the summative formal course performance examinations. For example, this may involve fortnightly or mid-term self-assessment tasks during a typical 8-week psychiatry term, with the grading provided to students to enable them to calibrate their learning. This would yield at least one to two data points per student to assess performance within the term, with a further data point from the formal summative examinations. Such self-assessment tasks should ideally be available on demand via computer marking, with detailed feedback to guide further learning. Additionally, a further assessment in the pre-internship programme would result in three or more data points for ASP. Similarly, SET should be assessed contemporaneously with ASP throughout the medical education process.

Progressive assessment is emerging as a new norm in medical education and, although yet to be widely adopted, is gaining impetus as a means of achieving more effective learning.¹³ We are proposing a model that involves appropriately graded formative examinations, which allow students to calibrate their knowledge against the final summative examinations, for assessment of student performance combined with SET at the end of each teaching term (see below). In our model, past summative examination questions (multiple choice and extended matching, at our university) will initially be administered as a self-assessment task, with feedback and discussion of results midway in an 8-week term (of which there are four per year). The use of an examination that facilitates testing and reflection upon knowledge is consistent with the cognitive science of learning, supporting enhanced understanding and consolidation of learning.¹² The combined mid-term assessment results of the whole-year student cohort may then be compared with the results from the end-of-year summative examination. Further extension of assessment of performance into the pre-internship period and pre-specialisation junior medical officer postgraduate education will add real-world validity, but will necessarily be more temporally distant from, and thus less clearly linked to, SET.
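As a minimal sketch of how the comparison of SET with formative and summative ASP might be computed, the following Python example correlates term-level means across the four terms of a year. The record fields (`set_mean`, `formative_mean`, `summative_mean`), the function names and all numbers are hypothetical placeholders invented for illustration; they are not data from any study or from our university, and Pearson correlation is only one of several measures that could be used.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical term-level records: INVENTED placeholder values, not real data.
# set_mean: mean SET rating for the term (assumed 1-5 scale)
# formative_mean / summative_mean: mean percentage marks for the term cohort
TERMS = [
    {"term": 1, "set_mean": 4.4, "formative_mean": 58.0, "summative_mean": 61.0},
    {"term": 2, "set_mean": 3.6, "formative_mean": 66.0, "summative_mean": 70.0},
    {"term": 3, "set_mean": 4.1, "formative_mean": 60.0, "summative_mean": 64.0},
    {"term": 4, "set_mean": 3.2, "formative_mean": 69.0, "summative_mean": 73.0},
]


def summarise(records):
    """Correlate SET with formative and summative ASP across terms."""
    set_scores = [r["set_mean"] for r in records]
    formative = [r["formative_mean"] for r in records]
    summative = [r["summative_mean"] for r in records]
    return {
        "set_vs_formative": pearson(set_scores, formative),
        "set_vs_summative": pearson(set_scores, summative),
        "formative_vs_summative": pearson(formative, summative),
    }


if __name__ == "__main__":
    for name, r in summarise(TERMS).items():
        print(f"{name}: r = {r:+.2f}")
```

With only four terms per year, such coefficients would be descriptive at best; a student-level or multi-year analysis would be needed before drawing any inference about the relationship between SET and ASP.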

Conclusion

Building contemporaneous SET as well as graded formative and summative ASP longitudinally into medical education may yield useful information on the relationship of the two types of feedback, more accurately informing on the effectiveness of teaching and student performance. In contrast to the cynical approach, we can thus assess the value as well as the price (or consumer sentiment) of medical education in psychiatry and related disciplines.

Footnotes

Disclosure

The authors report no conflict of interest. The authors alone are responsible for the content and writing of the paper.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Jeffrey C L Looi

References

1. Naidoo, Jamieson. Empowering participants or corroding learning? Towards a research agenda on the impact of student consumerism in higher education. J Educ Policy 2005; 20: 267–281.

2. Looi, Anderson. Between SET and ASP: balancing the scales of student evaluation of teaching (SET) and teachers’ assessments of student performance (ASP) for medical school education in psychiatry. Australas Psychiatry 2018; 26: 659–661.

3. Curtis-Barton, Eagles. Factors that discourage medical students from pursuing a career in psychiatry. Psychiatrist 2011; 35: 425–429.

4. Stroebe. Why good teaching evaluations may reward bad teaching: on grade inflation and other unintended consequences of student evaluations. Perspect Psychol Sci 2016; 11: 800–816.

5. Johnson. Grade Inflation: A Crisis in College Education. New York, NY: Springer-Verlag New York, 2003.

6. Yunker, Yunker. Are student evaluations of teaching valid? Evidence from an analytic business core course. J Educ Business 2003; 78: 313–317.

7. Weinberg, Hashimoto, Fleisher. Evaluating teaching in higher education. J Econ Educ 2009; 40: 227–261.

8. Carrell, West. Does professor quality matter? Evidence from random assignment of students to professors. J Polit Econ 2010; 118: 409–432.

9. Braga, Paccagnella, Pellizzari. Evaluating students’ evaluations of professors. Econ Educ Rev 2014; 41: 71–88.

10. Epstein. Range: Why Generalists Triumph in a Specialized World. New York, NY: Riverhead Books, 2019.

11. Wormald, Schoeman, Somasunderam, et al. Assessment drives learning: an unavoidable truth? Anat Sci Educ 2009; 2: 199–204.

12. Brown, Roediger III, McDaniel. Make It Stick: The Science of Successful Learning. Cambridge, MA: Belknap Press, 2014.

13. Heeneman, Oudkerk Pool, Schuwirth, et al. The impact of programmatic assessment on student learning: theory versus practice. Med Educ 2015; 49: 487–498.