Good qualitative research

Abstract

This issue of Language Teaching Research (LTR) includes six articles, each examining a particular aspect of second language teaching and learning, ranging from the use of classroom resources to different kinds of instructional approaches to the development of different language skills. Three of the studies have used a quantitative approach while the other three have used a qualitative one. This suggests that the number of qualitative studies is increasing in second language (L2) research and that LTR welcomes publishing them. Given that half of the studies are qualitative, I would like to take the opportunity to discuss some of the characteristics of qualitative research and also the criteria and principles that could be used to assess its quality.

Qualitative research can be broadly defined as a kind of inquiry that is naturalistic and deals with non-numerical data. It seeks to understand and explore rather than to explain and manipulate variables. It is contextualized and interpretive, emphasizing the process or patterns of development rather than the product or outcome of the research. In L2 research, qualitative methods can be used to explore an array of questions for which a quantitative method may not be suitable. In qualitative research, data are collected through tools such as interviews, field notes, diaries, and observations. In pure qualitative research, data are both collected and analysed qualitatively. However, qualitative data can also be analysed quantitatively by assigning numerical values to the whole or sections of the data, which can then help to identify general patterns or, in some cases, to evaluate specific predictions. Two of the qualitative studies in this issue of LTR have also used quantification in their analysis.
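As a rough illustration of how coded qualitative data might be quantified, the following minimal Python sketch tallies how often each code was assigned and expresses each tally as a share of all coded segments. The segment labels, the code names, and the functions `code_frequencies` and `code_proportions` are hypothetical and invented for illustration; they are not taken from any study in this issue.

```python
from collections import Counter

# Hypothetical coded segments: (source, code) pairs that a researcher might
# produce after qualitatively coding interview excerpts. Illustrative only.
coded_segments = [
    ("interview_1", "motivation"),
    ("interview_1", "anxiety"),
    ("interview_2", "motivation"),
    ("interview_2", "motivation"),
    ("interview_3", "peer_support"),
    ("interview_3", "anxiety"),
]


def code_frequencies(segments):
    """Count how often each code was assigned across all segments."""
    return Counter(code for _, code in segments)


def code_proportions(segments):
    """Express each code's count as a share of all coded segments."""
    counts = code_frequencies(segments)
    total = sum(counts.values())
    return {code: count / total for code, count in counts.items()}


frequencies = code_frequencies(coded_segments)
proportions = code_proportions(coded_segments)

# List codes from most to least frequent, with their share of all segments.
for code, count in frequencies.most_common():
    print(f"{code}: {count} segments ({proportions[code]:.0%})")
```

Counts of this kind can point to general patterns in the data, but they supplement rather than replace the interpretive analysis of what each coded segment means in context.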

Good qualitative research is robust, well informed, and thoroughly documented. Although naturalistic and interpretive, qualitative research is, like quantitative research, systematic: it involves a careful process of identifying the problem and then collecting, analysing, explaining, evaluating, and interpreting the data. Thus, when doing qualitative research, it is essential to ensure its rigor and quality.

Various types of validity and reliability criteria have been discussed in the literature for assessing the soundness of quantitative research. Although qualitative research is different from quantitative research, researchers in the field of qualitative research have also developed quality standards to judge its rigor. The notions of reliability and validity in quantitative research have always been used in relation to the consistency or accuracy of the tests or measurements used. In qualitative research, they are defined in terms of the trustworthiness of the findings or, as Lincoln and Guba (1985, p. 290) put it, in terms of the question ‘How can an inquirer persuade his or her audiences that the research findings of an inquiry are worth paying attention to?’ Lincoln and Guba discussed four such trustworthiness principles, which have been accepted and considered important by many qualitative researchers. These are credibility, transferability, dependability, and confirmability, and they have been considered parallel substitutes for the conventional notions of internal validity, external validity, reliability, and objectivity used in quantitative research. I will briefly discuss these principles below and, when possible, explain how they are met in the qualitative studies in this issue of the journal.

The principle of credibility in qualitative research concerns the extent to which the research findings and conclusions can be viewed as believable. In other words, it concerns the truthfulness of the findings and the extent to which they reflect the reality of the phenomenon investigated. To achieve this, the researcher needs to ensure that his or her understanding of the research participants, context, and processes is as accurate and complete as possible and that the interpretations are inclusive. Depending on the data, one useful strategy is member checking or participant validation, which involves sharing the data and interpretations with the research participants to see if they agree. Another is triangulation, which involves using multiple data collection methods, sources, explanations, or perspectives. Triangulation helps to achieve a more accurate and complete understanding of the issue under investigation, thus increasing the validity and credibility of the findings.

Transferability concerns the extent to which the researchers’ interpretations or conclusions are transferable to other similar contexts. This requires a thorough and rich description of the research activities and assumptions. Transferability looks analogous to generalizability in quantitative research. However, since qualitative research is interpretive and the participants are often small in number and not representative of the population, the findings cannot be generalized in the sense used in quantitative research. Thus, as Lincoln and Guba (1985) noted, transferability does not mean that the researcher makes generalizable claims but that he or she provides sufficient detail to make transfer possible should readers wish to attempt it.

Dependability is an alternative notion to reliability in quantitative research. In quantitative research, reliability refers to the consistency of data collection tools or measures. In qualitative research, this principle indicates that the study should be reported in such a way that others could arrive at similar interpretations if they reviewed the data. This can be enhanced by carefully documenting all the research activities and the conclusions, as well as any changes that may occur as the research evolves. Such documentation can then be reviewed by an outside researcher to examine its accuracy and the extent to which the conclusions are grounded in the data.

Confirmability concerns the extent to which others can confirm the researcher’s interpretations and conclusions. This standard is considered a parallel to objectivity in quantitative research. While quantitative research seeks objectivity by dissociating the researcher from the research process, qualitative research emphasizes the researcher’s active role and engagement in the research. Confirmability also resembles replicability, which concerns the extent to which a study can be reproduced. In qualitative research, confirmability can be established by describing the data and the findings in such a way that their accuracy can be confirmed by others. One useful strategy is an ‘audit trail’, in which the researcher records and gives a rationale for all the steps taken and the decisions made regarding the data coding and analysis. These records then become available for any further evaluation and confirmation.

As noted earlier, three of the studies in this issue have used qualitative research methods. Having examined these studies, I can say that many of the criteria described above have been met, which can be taken as evidence for the robustness of the research reported. The first article, by van Batenburg et al., examines the interactional opportunities provided in Dutch course books for English as a foreign language (EFL). The research aims to discover the extent to which the course books’ activities provide opportunities for enhancing learners’ English knowledge and their ability to use that knowledge in real interactional contexts. The data are drawn from three course books used with third-year students in a four-year pre-vocational Business and Administration program in the Netherlands. The study is well conducted, with the topic, research procedures, context, and analysis thoroughly and sufficiently described, which enhances its credibility, accuracy, and rigor. For the analysis, a three-level coding scheme with well-defined categories was first developed, which allowed for both objective descriptions and experts’ subjective inferences about how the activities affected learning. The coding scheme was checked and rechecked against samples of the data and was then used by trained experts to identify the interaction opportunities in each chapter. The findings showed that, while all three books taught language forms in relation to language functions, they included few real interactional and task-specific activities. In addition, language forms were often practiced in decontextualized activities instead of being incorporated into extended spontaneous discourse. Based on these findings, the authors conclude that there is a gap between the content of the course books and current L2 theories regarding how to develop interactional abilities.

Henry and Thorsen explored disaffection and lack of participation in response to language activities in an EFL classroom. This study was based on the premise that language learning takes place as a result of social interaction and investment of the self. Lack of investment can thus lead to non-participation, which can then hamper language development. The study seeks to gain insights into why learners in language classrooms sometimes respond positively and sometimes negatively to a particular classroom activity. To examine these issues, the study analysed data from field notes and observations of lessons in a multiple case study conducted in language classrooms in Sweden. Field notes and observations are two data collection tools often used in qualitative research. For this particular article, the study focused on two cases that involved learners’ negative responses or disaffection: a Study Abroad project with 9th-grade students and a Visiting India activity with 7th-grade students. The research methodology and procedures are thoroughly described. Also, since the study used a case study approach, there was an in-depth and detailed analysis of the cases. The data were analysed using a framework that focused on students’ interactional identity. The findings revealed that disaffection derived mainly from students’ negative feelings towards the activities and from their perception that the activities threatened their authentic participation. For example, in the first project, dissatisfaction was seen to originate from students’ impression of being depicted as deficient speakers of English, and in the second, it was caused by feelings of performing an inauthentic activity. Based on these findings, the authors conclude that disaffection in language classrooms could be partly prevented by redesigning activities so that they provide opportunities for authentic engagement.

Spenader et al. explored the use of content-based instruction (CBI) in World Language teaching in the USA. The focus was on the planning stages and the techniques and resources the teachers used, the challenges and opportunities they faced, and the approach they took to meet those challenges. The study was conducted on 36 CBI units that the teachers developed for these courses. The analysis was thorough, involving a number of stages, including preliminary examination of selected units, developing and verifying guidelines for the analysis, fine-tuning categories, and coding lessons based on the categories. The analyses revealed that the lessons involved both content and language but that the focus was more on content than on language, and that, when choosing content, the teachers used more cultural than academic content. In addition, they relied on content that was related to their prior experience and training, which suggests that the teachers tended to draw on areas they were familiar with when designing CBI unit plans.

The other three studies were quantitative. Lai et al. investigated the usefulness of guided inductive instruction versus deductive instruction in helping learners develop semantic radical knowledge of Chinese characters. The participants were 46 learners of Chinese from two intact intermediate classes assigned to either a deductive or a guided inductive instruction condition. The study used a pretest/posttest experimental research design involving five intervention sessions over a three-week period. A t-test was initially used to compare the two groups and found no significant difference between their placement test scores. Semantic category judgment and lexical inference tests were used as pretests/posttests, with two levels of item complexity. The findings showed an advantage for the guided inductive instruction. However, these effects were mediated by the complexity of the radical characters. The learners in the inductive group performed better on both simple and more complex test items in the lexical inference test. However, they performed better only on more complex items in the semantic category judgment test. This finding was explained in terms of differences in the nature of processing involved in the two tests.

Karim and Nassaji examined the effects of direct and indirect corrective feedback on learners’ accuracy in revision and in new pieces of writing. The study used an experimental research design involving 53 intermediate-level learners of English as a second language (ESL), who were divided into three experimental groups (i.e. a direct feedback group, an indirect underlining-only group, and an indirect underlining plus metalinguistic feedback group) and a control group. Students produced several pieces of writing and revised them over a three-week period. To examine the effect of feedback, analyses of variance (ANOVAs) with post-hoc multiple comparison tests were used. The results showed a significant effect on revision for all three feedback groups and also some notable but non-significant effect on new pieces of writing for direct feedback and underlining plus metalinguistic feedback.

The last study, by Uchihara and Clenton, investigated the role of receptive vocabulary size in L2 oral proficiency. The data were collected individually from 46 advanced L2 learners who completed a test of receptive vocabulary size and also produced spontaneous speech samples elicited through picture narrative tasks. The learners’ speaking ability was rated by three experts in terms of vocabulary features using the vocabulary descriptor of the IELTS speaking band, and also measured in terms of lexical sophistication based on a word frequency index. The results showed a complex pattern. While a significant correlation was found between vocabulary size and the subjective measure of vocabulary rating, no such correlation was found between vocabulary size and measures of lexical sophistication. The former finding was taken to suggest that learners with a larger vocabulary size are more likely to be perceived as lexically proficient speakers, while the latter finding was taken to suggest that a larger vocabulary size may not necessarily lead to the production of more sophisticated words in speech.

In conclusion, the articles in this issue of LTR report findings from six original studies addressing a range of topics using both quantitative and qualitative research methods. All the studies are well conducted, including the three qualitative studies, which are of high quality and therefore offer excellent examples of robust qualitative research. Good qualitative research is important and needs to be encouraged, as it allows for in-depth exploration of topics, providing valuable and rich insights into the processes of teaching and learning as they occur in naturalistic settings. LTR has been publishing both quantitative and qualitative studies and continues to do so.

References

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage.