Concordance or Discordance? The Broader Context of English Tests for Australian Immigration

Abstract

At the invitation of the Editors, I am writing in response to the Viewpoint by Spiros Papageorgiou and Tony Clark, presenting their reflections on the concordance study that they conducted to equate scores on the Test of English as a Foreign Language (TOEFL) and International English Language Testing System (IELTS). This and similar studies were prompted by an Australian government requirement that tests to be recognized for immigration purposes should produce such evidence of score interpretation, as a prelude to the announcement of an expanded list of accepted tests in 2025. Those undertaking concordance studies needed to demonstrate that they had followed the guidelines for good practice published by Knoch and Fan (2024). The study by Papageorgiou and Clark (2026) is one of a number of projects of this kind that adhere to the guidelines and fulfil the government’s requirement with a high degree of internal validity, so that scores on the two tests can be meaningfully compared, while taking account of the inevitable error of measurement and the limits on the assessment literacy of test users.

However, there is a broader perspective that needs to be explored in considering how these largely technical exercises fit within the context of immigration policy in Australia. Papageorgiou and Clark acknowledge briefly the societal issues involved in the use of test scores for high-stakes policy decisions of this kind, but it is important to complement the narrow mandate of their research with a critical discussion of other considerations, both technical and qualitative.

One requirement of a good concordance study, as Knoch and Fan emphasize, is a preliminary investigation to show that the constructs underlying the two tests can be meaningfully compared. The concordance studies cited by Papageorgiou and Clark (including their own) have included this component, conducted independently of the main study, and have found that generally there was a high degree of overlap between the pairs of tests in this regard. However, it should be noted that most of these tests include “Academic” in their name, meaning that the tests were primarily, if not exclusively, designed as measures of academic language proficiency for international students pursuing higher education. These students represent a major category of visa applicants to Australia, but there are other important types of visa, for skilled employment, business and investment, where English test results are required but where academic proficiency may be much less relevant to the language needs of prospective immigrants.

A brief history of English language testing for immigration purposes in Australia is relevant here (Read, 2022). After a liberalization of immigration policy in the 1980s led to an influx of non-English-speaking migrants from non-traditional sources, the government commissioned a test called access: (or the Australian Assessment of Communicative English Skills), which was developed by Australian applied linguists specifically for migrant applicants. However, that was abandoned in 1998 in favour of requiring applicants to take IELTS at their own expense. Maintaining a high-stakes immigration test was an expensive enterprise for the Australian government, whereas the IELTS Partners had the assessment expertise and a worldwide network of test centres already in place. IELTS also had face validity as the only major international test at the time that included “direct” measures of all four macro language skills. In addition, the test programme included a General Training Module which, it could be argued, addressed to some extent the needs of migrants who were not planning to engage in academic study. However, there was no systematic process of what Fulcher and Davidson (2009) called “change retrofit,” whereby the construct definition and specifications of the test would be fundamentally redefined to be consistent with this broader use of the test for immigration purposes.

Over time, vigorous promotional efforts by those responsible for “professional partnerships” for the respective test publishers succeeded in gaining acceptance of other tests for Australian immigration purposes, such as the TOEFL Internet-Based Test (iBT) and the Pearson Test of English-Academic, even though these were arguably more purely academic proficiency tests than IELTS. Three more tests, the Cambridge C1 Advanced exam (C1 Advanced), the Occupational English Test (OET), and the Canadian English Language Proficiency Index Program (CELPIP), were on the approved list as well before the 2025 review. However, IELTS was in effect the reference test. For one thing, the Australian Department of Home Affairs had adopted its own designations for levels of immigrant proficiency – Functional English, Vocational English, Competent English, Proficient English, and Superior English – which simply represent a bureaucratic relabelling of Bands 4 to 8 on the IELTS reporting scale and appear to have no independent empirical basis. It is also worthy of note that all the concordance studies cited by Papageorgiou and Clark (2026) involve a comparison of another test with IELTS, thus highlighting its reference function in practice. Ideally, there would have been an independent framework to serve as a standard for all the prospective approved tests and to allow for multiple comparisons across tests rather than the two-at-a-time approach of concordance studies. The obvious candidate is the Common European Framework of Reference for Languages (CEFR), with which English and other language tests worldwide are now routinely (claimed to be) aligned, but the six broad bands of proficiency in the CEFR do not lend themselves to equating through equipercentile linking, the basic tool in concordance research.
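For readers less familiar with the technique, the following minimal Python sketch illustrates the core idea of equipercentile linking: a score on one test is mapped to the score on another test that has the same percentile rank in a common group of test-takers. The function names and the two score samples are hypothetical and invented purely for illustration. Operational concordance studies involve further steps not shown here, such as smoothing of the score distributions and estimation of standard errors.

```python
def percentile_rank(sample, x):
    """Mid-percentile rank of score x within a sample (a value from 0 to 1)."""
    below = sum(1 for s in sample if s < x)
    equal = sum(1 for s in sample if s == x)
    return (below + 0.5 * equal) / len(sample)


def score_at_rank(sample, p):
    """Score in the sample at percentile rank p, by linear interpolation."""
    ordered = sorted(sample)
    # Invert the mid-percentile rank, clamping to the ends of the sample.
    pos = min(max(p * len(ordered) - 0.5, 0.0), len(ordered) - 1.0)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def equipercentile_link(scores_a, scores_b, x):
    """Map score x on test A to the test B score with the same percentile rank."""
    return score_at_rank(scores_b, percentile_rank(scores_a, x))


# Hypothetical scores for the same seven test-takers on two tests (not real data).
HYPOTHETICAL_A = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
HYPOTHETICAL_B = [40, 50, 60, 70, 80, 90, 100]

print(equipercentile_link(HYPOTHETICAL_A, HYPOTHETICAL_B, 6.5))
```

The sketch depends on each scale having enough distinct score points for percentile ranks to be interpolated, which is one way of seeing why a framework with only six broad bands is a poor fit for this method.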

Since August 2025, two additional tests have been approved – LANGUAGECERT Academic and the Michigan English Test – while IELTS Academic and IELTS General Training are now listed as separate measures.¹ Thus, the approved list comprises a diverse array of nine English proficiency tests (Department of Home Affairs, Australian Government, 2026). As previously noted, several of them have been developed and validated for assessing the English of prospective international students, who certainly comprise a large proportion of the visa applicants to Australia. However, only two of the accepted tests have been designed specifically to cover other immigration purposes. One is the OET, which was originally commissioned by the Australian government around 1989 to assess the readiness of overseas-qualified health professionals to meet the language demands of practising their profession in Australia. The other is the CELPIP, a test based on the Canadian Language Benchmarks, a government-sponsored framework to address the language needs of immigrants to Canada across the full range of proficiency, incorporating social, academic and professional uses of language. (A generally favourable review of the CELPIP from a Canadian perspective has recently been published by Qin and Baker, 2025.) As noted above, the opportunity to develop such a comprehensive framework for Australian immigration was missed when the decision was made to discontinue the access: programme in favour of IELTS in 1998.

There is little guidance on the immigration website as to which English tests might be appropriate for which type of visa, apart from a note that the OET is designed for health professionals. The assumption appears to be that, once a concordance study has established equivalences with the IELTS-defined proficiency scale, the tests are largely interchangeable.² On the positive side, the list of approved tests represents a menu of options for migrant applicants to choose from. Numerous factors are likely to influence their choice. The availability of the tests, and of places within test centres, varies in different parts of the world. Test-takers may have preferences for particular tests in terms of their design and delivery: direct versus semi-direct assessment of speaking; human versus automated scoring of writing and speaking tasks; selected-response versus constructed-response item types for listening and reading assessment; and so on. There are also widespread claims on social media and elsewhere about the relative easiness of the different tests and the effectiveness of strategies to enhance one’s performance, particularly on tests with automated scoring.

Another consideration worth noting is how the minimum score requirements on the approved tests are specified. For the lowest proficiency level, Functional English, an average or overall score for the four skill components is acceptable, but for the four higher levels, the specified minimum score must be achieved in each of the four skills in a single test administration, regardless of the relative demands of, say, oral or written language in the applicant’s field of employment or social situation. This is a particular challenge at the highest levels of Proficient and Superior English (IELTS Bands 7 and 8 or equivalent), often leading migrant applicants to retake the whole test repeatedly at considerable cost in money and time. To some limited extent, this is mitigated by the recent acceptance for immigration purposes of One Skill Retake, a provision offered by IELTS and the Michigan test that allows test-takers to re-sit, for a reduced fee, just the one skill component in which they did not achieve the minimum score requirement (Lee et al., 2026).
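The effect of this rule can be made concrete with a minimal Python sketch. It assumes the band thresholds implied by the description of the five levels as IELTS Bands 4 to 8; these values, the function name, and the sample scores are illustrative assumptions rather than official cut scores, and the rounding conventions used for an overall IELTS band are not modelled.

```python
# Illustrative thresholds only: the five levels are treated as IELTS Bands 4 to 8.
# These are assumptions for the sketch, not official cut scores.
LEVEL_BANDS = [
    ("Superior", 8.0),
    ("Proficient", 7.0),
    ("Competent", 6.0),
    ("Vocational", 5.0),
]
FUNCTIONAL_BAND = 4.0


def highest_level(skill_scores):
    """Return the highest level met by one test sitting, or None.

    skill_scores maps each of the four skills to a band score.
    """
    scores = list(skill_scores.values())
    weakest = min(scores)
    # The four higher levels require the minimum in every skill,
    # so the weakest skill decides the outcome.
    for level, band in LEVEL_BANDS:
        if weakest >= band:
            return level
    # Functional English may be met on the average across the skills.
    if sum(scores) / len(scores) >= FUNCTIONAL_BAND:
        return "Functional"
    return None


# Hypothetical applicant: three skills at Band 8, writing at 6.5.
sitting = {"listening": 8.0, "reading": 8.0, "writing": 6.5, "speaking": 8.0}
print(highest_level(sitting))  # prints "Competent"
```

In this hypothetical case, a single weaker skill places the applicant two levels below what the other three skills would support, which is the situation that One Skill Retake is intended to ease.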

Proficient and Superior English feature prominently in the process of applying for a skilled migrant visa, where the Department of Home Affairs has used a points system since the 1980s to select successful applicants for permanent residence (Frost, 2025; Frost & McNamara, 2018). The rationale for the system is to evaluate the potential of prospective migrants to contribute substantially to the economic growth of the country. Thus, points are assigned for age, educational attainment, skilled employment experience in other countries and in Australia, and for English proficiency. A minimum of 65 points is required for eligibility, but quotas are set for the allocation of visas based on estimates of labour market needs in the various skill areas, and applicants are ranked according to their total number of points (Department of Home Affairs, Australian Government, 2024). This creates a highly competitive situation in some skill areas and a strong incentive for applicants to maximize their points total.

In the short term, since migrants on temporary visas in the country often face discrimination in accessing the skilled employment in their field that would allow them to enhance their Australian work experience, their only realistic option for improving their chances of success is to obtain more points for English (Frost, 2025). The minimum level is Competent English (Band 6), which is allocated 0 points, but 10 points are awarded for Proficient English (Band 7) and 20 for Superior English (Band 8). This is the stress-filled situation in which applicants take one or more of the approved tests multiple times and may even stop working to concentrate on test preparation (Frost, 2017). From a language development perspective, it represents an unproductive use of their time because none of the proficiency tests is primarily designed to assess ability reliably at that high level and in particular (with the arguable exception of the OET) to target the kind of sophisticated communicative abilities that are essential in contemporary skilled work environments. Thus, there is a real sense in which “Superior English” means having highly developed test-taking skills (and perhaps a measure of good fortune) rather than the ability to function, say, as an effective team member in the workplace. This calls into question the validity claims made for the tests, which on websites and in promotional materials often rely on the circular argument that the test is accepted or recognized by the immigration authorities as well as higher education institutions, employers, professional registration bodies, and other test users.
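The points arithmetic described above can be sketched in a few lines of Python. The English points and the 65-point minimum are taken from the discussion above; the function names and the figure of 55 points from other criteria are hypothetical, chosen only to show how much depends on the English score.

```python
# Points for English proficiency and the eligibility minimum, as described above.
ENGLISH_POINTS = {"Competent": 0, "Proficient": 10, "Superior": 20}
MINIMUM_POINTS = 65


def total_points(other_points, english_level):
    """Total points: those from other criteria plus those for English."""
    return other_points + ENGLISH_POINTS[english_level]


def is_eligible(other_points, english_level):
    """True if the total reaches the minimum required for eligibility."""
    return total_points(other_points, english_level) >= MINIMUM_POINTS


# Hypothetical applicant with 55 points from age, education, and experience.
for level in ("Competent", "Proficient", "Superior"):
    print(level, total_points(55, level), is_eligible(55, level))
```

For this hypothetical applicant, moving from Competent to Proficient English is the difference between ineligibility and eligibility. Because applicants are also ranked by total points within quotas, reaching the minimum does not by itself secure a visa, so the further 10 points for Superior English remain a strong incentive.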

Returning now to Papageorgiou and Clark’s (2026) Viewpoint: given the logic that multiple tests are acceptable to meet Australian immigration language requirements, it is important to ensure that rigorous concordance studies establish score equivalences which are as accurate as possible, within the inevitable tolerances that reflect differences in test design, error of measurement, and so on. This is an ongoing process, as there have been significant changes in some of the approved tests in recent years. However, as Papageorgiou and Clark acknowledge, it is also necessary to take a broader societal perspective and understand how English assessment is embedded in a national policy framework that prioritizes control of entry to the labour market, to the detriment of conventional validity concerns about what kinds of language ability the approved tests are actually measuring and the inferences that can be made about the meaning of the test scores.

John Read, University of Auckland, New Zealand

Footnotes

Author Contributions

John Read: Conceptualization; Investigation.

Notes

References

Department of Home Affairs, Australian Government. (2024). Points table for skilled independent visa (subclass 189). https://immi.homeaffairs.gov.au/visas/getting-a-visa/visa-listing/skilled-independent-189/points-table

Department of Home Affairs, Australian Government. (2026). English language visa requirements. https://immi.homeaffairs.gov.au/help-support/meeting-our-requirements/english-language

Frost (2017). Test impact as dynamic process: Individual experiences of the English test requirements for permanent skilled migration in Australia [Unpublished doctoral dissertation]. University of Melbourne.

Frost (2025). Language assessment for immigration in Australia: Test-policy-discourse entanglements and their ethical implications. In A. J. Kunnan (Ed.), The concise companion to language assessment (pp. 417–431). Wiley.

Frost & McNamara (2018). Language tests, language policy, and citizenship. In J. W. Tollefson & Pérez-Milans (Eds.), The Oxford handbook of language policy and planning (pp. 281–300). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780190458898.013.14

Fulcher & Davidson (2009). Test architecture, test retrofit. Language Testing, 26(1), 123–144. https://doi.org/10.1177/0265532208097339

Knoch & Fan (2024). Test score comparison tables: How well are they serving test users? Language Testing, 41(3), 681–693. https://doi.org/10.1177/02655322241239348

Lee, H.-W., Bruce, Langeslag, & Tasviri (2026). Outcomes of one-skill retakes in a four-skills proficiency test: Evidence from large-scale test data. Language Testing. https://doi.org/10.1177/02655322261459898

Papageorgiou & Clark (2026). Reflections on the practical implementation of Knoch and Fan (2024)’s good practice principles for score concordance studies. Language Testing. https://doi.org/10.1177/02655322261441918

Qin, C. Y., & Baker (2025). Review of the Canadian English Language Proficiency Index Program (CELPIP). Language Testing, 42(1), 100–113. https://doi.org/10.1177/02655322241291764

Read (2022). Test review: The International English Language Testing System (IELTS). Language Testing, 39(4), 679–694. https://doi.org/10.1177/02655322221086211