Abstract

This edited volume on assessing speaking constitutes a timely contribution to the field for language testing and assessment specialists, as well as for applied linguists and educational practitioners who may be interested more broadly in theoretical understandings and methodological considerations surrounding speaking as a dynamic and socially constructed endeavour.
The contents of the volume originate from a conference of the same name held in the United States in May 2018 which explored how existing approaches could be reconceptualised or adapted for assessing second/foreign languages. The 13 papers are grouped according to 4 key themes: conceptual and theoretical issues, collecting and rating speaking data, designing speaking assessment tests, and using new technologies to assess speaking. Despite this categorisation, there are a number of important threads running consistently throughout the volume relating to productive research methodologies and the application of theory in a practical context.
In Part 1, the editors set out their aims and aspirations for the volume, focusing on the importance of an expanded sociolinguistic definition of speaking that incorporates recent research findings on the interactional dynamics of conversation, interviews, etc. They trace some of the history of speaking assessment, highlighting key challenges such as the co-construction involved in interactional competence (IC), the role of nonverbal behaviour (NVB), and the multilingual capabilities of the learner/speaker. The other two chapters in this opening section pick up some of these threads in detail, surveying the current literature and reporting on empirical research for IC and NVB in particular.
Part 2 focuses on the collection and rating of actual speaking data, with three chapters addressing raters’ delivery of task instructions in OPI roleplays, raters’ scoring processes and strategies in paired speaking assessment, and the nature and impact of raters’ identities in L2 English oral assessment.
The four chapters in Part 3 examine the design of speaking assessment tools for observing and assessing various dimensions of interactional competence, including repair strategies. Differing learning contexts are represented in this section, which explores classroom-based assessment and alternative testing approaches, for example through social deduction board games.
The final section focuses on using new technologies to assess speaking. Part 4 contains two chapters: one explores the design and implementation of classroom-based virtual reality assessment, and the other considers the affordances of computer-mediated speaking tests for assessing interactional competence. A concluding chapter by the editors briefly considers the future of speaking assessment in the post-Covid-19 era.
The subtitle of this edited volume is “Expanding the Construct and its Applications.” This phrase accurately reflects what seems to have been happening to the construct of spoken language competence over recent years, i.e. a steady expansion from a previously fairly narrow characterisation of speaking towards an ever more complex and integrated conceptualisation of oral competence. The highly context-initiated and context-shaped nature of “talk-in-interaction” is increasingly well understood nowadays among applied linguists and language assessment specialists alike, forcing us to broaden the focus of our testing to include aspects of interactional competence as far as we can. The normally co-constructed dimension of such talk is something that many of us working in language testing have been grappling with for some time in an assessment tradition which typically assigns individualised scores to test takers. Against this background, this edited volume is a welcome addition to the bookshelf for speaking assessment specialists, bringing us up to date with recent empirical research in the field and helping us to refine the speaking constructs we are working with.
A key strength of the edited volume is the way it assembles a wealth of detailed empirical research studies covering multiple aspects of spoken language competence, such as turn-taking, repair strategies, roleplay, non-verbal behaviour and multilingual competence. The range of methodologies used in these empirical studies and the fine detail with which they are reported make this volume a particularly useful resource for graduate students and early career researchers who are seeking suitable approaches or tools for their own data analysis (such as conversation or discourse analysis, membership categorisation analysis), or who are looking to better understand how to interpret findings from such analyses. The cluster of studies exploring rater behaviour and perceptions helps to extend our understanding of rater persona and identity issues, and should bring valuable insights for those involved in rater selection, training and management. The chapters discussing alternative approaches to speaking test design and opportunities offered by new technologies open up some fresh avenues for innovative test development initiatives in the future in certain specific contexts.
However, this brings me to a concern about the overall nature of the edited volume, which I fear risks undermining its relevance and impact for the many language testing researchers and practitioners who work in the field of large-scale language assessment. The introductory chapter, and most of the chapters that follow, are premised on a powerful critique of the inadequacy of current large-scale, institutionalised approaches to L2 oral assessment, such as those offered by tests like the ACTFL OPI or IELTS. In fairness, this critique is not without justification if we compare the invariably narrow and selective operationalisation of speaking found in many large-scale commercial oral tests (lasting little more than 15–30 minutes) with the rich and sophisticated nature of spoken interaction in the “real world.” The title of this edited volume appears to promise readers interested in assessing spoken language not simply insights into a more sophisticated theoretical conceptualisation of the construct(s), underpinned by copious empirical evidence, but also insights into how this expanded understanding can be successfully applied and operationalised through actual assessment tools and practices. Many of the chapters then go on to describe relatively small-scale testing initiatives, often experimental in design, which seek to address the flaws or inadequacies perceived in large-scale oral testing operations, and this is encouraging in its own right. It may be that the insights to be gained from this volume have the greatest value when applied to the smaller-scale, but no less important, arena of language testing that is classroom-based assessment or testing within a localised institutional context, e.g. a school or university.
Few of the chapters, however, discuss (or even acknowledge) the very real-world challenges involved in scaling up a testing instrument or an assessment approach to be sustainable on a larger scale and over a longer timeframe (beyond the one-off experimental project). There is little consideration of the resource implications, or of the quality assurance measures, that might be required to ensure the reliability and fairness of an assessment approach and system as well as its “sociolinguistic” validity. The editors’ concluding chapter touches briefly on some of these issues, but it might have been good to invite and include somewhere in the volume reflections from some of the institutional language test providers who are mentioned (and critiqued), in order to engage in a constructive conversation about “expanding the construct and its applications” in high-stakes, institutionalised testing contexts. This could perhaps have been positioned as a sort of “real-world post-script”, serving as a counter-balance to the more detached or disengaged perspective that the world of academia often enjoys. As applied linguists and language assessment researchers, it is tempting for us to think of construct definition as being the only starting point for our assessment efforts. In reality, of course, what drives an assessment approach is often a constellation of contextual factors (political, social, educational, individual, etc.) that shape the purpose for testing and the constructs to be operationalised.
As a whole, the edited volume reads more like a proceedings volume from a research conference or symposium (which of course it is in one sense), but it sometimes feels a bit “patchwork-like” as a result. As a reader, I would have welcomed an overarching chapter or framework that pulled all of the studies together, proposing a comprehensive construct of IC for assessment purposes or explaining how IC is situated within the wider construct of speaking. Unfortunately, the chapters do not follow a standardised structure throughout the volume; this results in considerable repetition in the introductory sections of several chapters which proves a bit wearisome for the reader and undermines a sense of cohesion and coherence across the volume.
One final, slightly disappointing feature of the book is the occasional lack of accuracy or attention to detail on the part of the editors. One example is the reference on page 12 to the volume being divided into three parts when in fact it contains four. A more serious flaw is the reference on pages 7/8 to “the CEFR test” (there is no “CEFR test”), comparing it to the ACTFL OPI. This suggests a worrying lack of understanding of the nature and purpose of the CEFR, despite its widespread use around the world, and it risks perpetuating assertions that may be unhelpful as well as erroneous.
Notwithstanding the reservations expressed above, the editors are to be warmly commended for bringing this volume successfully to publication during what has been an exceptionally challenging period over the past 2 years due to the global Covid-19 pandemic. This edited volume remains a welcome addition to the field, helping to broaden our understanding of construct definition for speaking assessment.
