Abstract
This article examines how vocal performances of characters can contribute to sociocritical storytelling in video games. We argue that the vocal performances of video game characters–and in particular their accents–can “fill in” the fictional story worlds of video games through associations with real people and places. These associations allow video games to evoke such social themes as are connected with accent, including privilege, conflict, class, and ethnicity. So evoked, these themes can then be critically examined. We apply this perspective in a sociolinguistic analysis of Disco Elysium, an expansive role-playing game in which the characters' vocal performances come to support the player's sociomoral orientation in the game world. Finally, we discuss a result of our analysis that runs counter to previous scholarship, namely that vocal stereotyping can serve to enhance, rather than to undermine, the player's critical apprehension of game worlds.
In 2004, The New York Times ran an interview piece on the voice acting for the newly released Grand Theft Auto: San Andreas (2004). According to creative director Dan Houser, the game contained “something like 600 different speaking parts, well over 10 h of radio content and nearly 100,000 voice samples.” All of this voice work, including a noted performance by Hollywood actor Samuel L. Jackson, was ultimately included “to convince people this is a huge and vibrant world.” Houser believed that the recordings would help immerse the player into the world of San Andreas by expressing its characters’ diverse personalities and storied backgrounds (Gnatek, 2004).
Recent academic scholarship looks rather less optimistically than Houser at the actual implementation and impact of voice acting in video games. From reading this literature, one easily gets the impression that game developers use vocal performances not to inspire and expand the player’s experience of the story world, but to flatten it. Vocal performances are said to generate “Manichaean binaries” (e.g., Ensslin, 2010, p. 214) that neatly separate good and trustworthy characters, races, and communities from their evil and deceitful counterparts. This is achieved through the stereotyped use of normative (good) and non-normative (bad) accents, which come to designate corresponding character categories: an “us” and a “them.” The point is to simplify things for the player, who will not have to think very hard about the moral landscape that they are virtually traversing. For example, Fable’s (2004) distribution of accents embeds “an overarching ‘white’ linguistic matrix to functionalize, emotionalize and demonize [non-normative] characters and their moral outfits” (Ensslin, 2010, p. 217). In Dragon Age: Origins (2009), the dominant British and American English accents effect “a ‘warped world,’ whose inhabitants are reduced to only two primary accents restricted by their race and socioeconomic status” (Goorimoorthee et al., 2019, p. 283). In World of Warcraft (2004), primitive and dangerous races are stereotyped by means of accents and other forms of expression that index, for example, Jamaican culture in the case of the Trolls, who practice voodoo, dance Capoeira, and who like to say “greetings, mon.” The result of this stereotyping is to “cheapen rather than to enhance the game’s storyline” (Monson, 2012, p. 65, emphasis in original). 1 A recurrent suggestion in this literature is that these performed accents “reproduce language ideologies that perpetuate stereotypes in both the real and virtual world” (Villanueva & Ensslin, 2021, p. 205). 
They promote “stereotypical, hegemonic thinking about, as well as the naturalization of and willing submission to, ‘the dominant culture and prevailing power relationships’” (Goorimoorthee et al., 2019, p. 272). Or, in another version of the same view, “the convenience of [linguistic] pigeon-holing and straightforward othering helps channel players’ concentration onto motoric and cybernetic interaction with the game rather than triggering critical reflection and debate. This results in heightened degrees of susceptibility to underlying ideological content, which tends to go unnoticed yet continues to be re-implanted multimodally in players’ subconscious minds” (Ensslin, 2011, p. 233).
We agree with this research tradition that vocal stereotyping has been common in video games and that this may “cheapen” narratives by the blinkering perpetuation of ideological presumption. As we will argue in this article, however, vocal performances also contain the opposite potentials, namely those of being fictionally constructive and critically thought-provoking. We therefore present what is, to our knowledge, the first attempt to study how vocal performances can positively contribute to sociocritical storytelling in video games.
We start by arguing that performed accents and other vocal qualities, by their associations with different peoples and places in the real world as well as characters and locations in other fictions, carry information that players use to build out and fill in the fictional story worlds of video games as experienced by them. This account informs our sociolinguistic analysis of Disco Elysium (ZA/UM, 2021), a popular and ambitious role-playing game whose vocal performances come to support the player’s social and critical orientation in the game world in ways that go far beyond the designation of “good” and “bad” characters. The game achieves this result by its conscious vocal referencing of not just real peoples and places, but also, and at a higher level of analysis, of social issues and ideologies. We then discuss a surprising result of our analysis, which is that the frequent vocal stereotyping in Disco Elysium inspires rather than stifles the player’s critical apprehension of the game world. We explain this finding by arguing that the game’s stereotypes are overt, which is to say that they call attention to themselves as imagistic representations of ideas like privilege, class, capital, race, and nationhood. So evoked, these ideas and their associated ideologies can then be critically examined.
Of Voices and Game Worlds
Fictional story worlds may be conceived as “imaginable scenarios or sets of conceivable states of affairs constructed and expressed by means of artifacts (semiotic objects)” (Margolin, 2000, p. 355). The story world of a conventional video game is constructed by the game’s graphical and auditory elements insofar as these elements represent fictional information that players can integrate into an encompassing mental model of the game world: its layout, peoples, places, happenings, and how all of these things fit together (Thon, 2017).
Voices form part of the audiovisual, world-building semiosis of games like Grand Theft Auto: San Andreas (2004) or The Witcher 3: Wild Hunt (2015). Straightforwardly, vocal performances can deliver story information to the player by means of spoken words and sentences that state, suggest, or imply fictional facts. For example, a seemingly trustworthy character in The Witcher 3 claims that Novigrad is the largest city in the Northern Kingdoms. We believe her and update our mental model of the game world accordingly. This mode of story delivery can also and straightforwardly be carried by text, and it is not the focus of this article.
Less straightforwardly, story information can also be carried by paralinguistic features of speech, including accent, amplitude, timbre, intonation, and pitch contour, which can be directly presented by recorded speech but not by text. For example, the same character from The Witcher 3 may speak with a discernible accent that may generate a large number of fictional inferences. It may suggest that the character is native or foreign to a particular fictional region; that the character has various traits in common with other speakers of the same accent; or even that the character, in freely speaking with this accent, does not feel ashamed about the history that the accent represents. These additional inferences describe the fictional game world, but they derive from the player’s apprehension of the real world. It is only because the accents of the real world so often delineate sociohistorical fault lines that the player will form congruent beliefs about accents and their distributions in the fictional world. And as we will now go on to argue, the potential of accents so to fill in the fictional world by means of associative linkages to social reality is very great.
Audiences generally imagine fictional worlds on the assumption that these worlds will in basic respects mirror reality, and that meaningful deviations from reality—whether ontological, historical, psychological, or otherwise—will be authorially signaled or genre-conventionally presumed. 2 This is equally true of human-like non-playable characters (NPCs) in video games, which will be standardly interpreted as representations of people as experienced in the real world (evidence for this claim is summarized by Hartmann & Vorderer, 2010, pp. 95–96). The player’s assumptions about and associations regarding real people who speak in a certain way may therefore activate when a fictional character speaks in that way. 3 Indeed, it is not always clear that audiences have a choice in the matter, as voice-based associations and impressions, like other aspects of social perception and cognition, may be processed automatically and unconsciously (Frith & Frith, 2012; Greenwald & Banaji, 1995; see also Reeves & Nass, 2003). Accordingly, the upper-class English accent of a fictional character may signal that the character has a lifestyle and background that is characteristic of the English upper class, or a rough and hoarse voice might evoke a character’s troubled past through associations with smoking, disease, and general hardship. As an example of the complex and meaningful associations that can in this way be activated by vocal characteristics, consider Roger Clark’s portrayal of the protagonist of Red Dead Redemption 2 (2018), Arthur Morgan, whose “heaving Western drawl assuredly left any players feeling the weariness of a man at the end of his tether” (Bayne, 2020). The character’s manner of speaking informs players that Morgan, despite having been born in the northern United States, is a man of the American West, while also evoking his agonizing past and present inner conflicts.
All of these impressions, associations, and assumptions are defeasible. If, in a fantasy game, an elven race’s clipped English accent causes us to believe, by way of an evoked stereotype, that the elves lead an aristocratic lifestyle, the narrative can certainly cause us to believe otherwise by showing that the elves are in fact impoverished and oppressed. Also, since not all players share the same linguistic background and knowledge, individual players will differ in their capacity to distinguish different language varieties and their various associations. But the point remains that our interpretative engagement with fictional worlds is pervasively shaped by our preexisting experiences and prejudices. And there is much evidence that prior experiences and present prejudices cause hearers to infer speakers’ dispositions and backgrounds based on how they speak (as opposed to what they say). Notably, hearers readily infer such personal traits as being dominant, friendly, or outgoing from paralinguistic features of speech (McAleer et al., 2014), and accents in particular have been shown to index such “storied” aspects of an individual as their personal background (Weatherhead et al., 2016), ethnicity (Rakić et al., 2011), social status (Giles, 1970), and group affiliation (Gluszek & Dovidio, 2010). This expressive potential of the human voice is captured in the popular notion of a “Shibboleth,” that is, a manner of speaking that betrays key elements of the speaker’s personal and cultural history (McNamara, 2005). Voices, therefore, are expressive in more than a straightforwardly verbal sense. People take the voices of other people to express their personalities and backgrounds—the phenomenon known as linguistic profiling—and players take the voices of fictional characters to index corresponding fictional facts about them. 4
The storytelling potentials of socially indexical voice acting can be appositely conceptualized as “indexical,” in Clara Fernández-Vara’s (2011) use of that term. In this mode of storytelling, which develops the notion of environmental storytelling (Carson, 2000; Jenkins, 2004) for video games in particular, “the story is not ‘told’ in a traditional sense, but rather put together through different pieces … Thus, indexical storytelling is actually more story-building, both on the part of the designer and the player—the designer creates the elements of the story and integrates them in the world, the player has to interpret them and piece them together” (Fernández-Vara, 2011, p. 5). Fernández-Vara’s exposition of indexical storytelling continually emphasizes the indexation of prior “events” in the game world. For example, rubble and bullet holes at a specific site may suggest that a battle has taken place there. In what follows, we will assume that the concept of indexical storytelling can be fruitfully extended to include the paralinguistic indexation of class, gender, race, status, privilege, power, and other social identities and categories, which are undoubtedly part of video games’ story worlds, but which designate states of affairs and dynamic social forces rather than delimited “events.”
Based on this account of how vocal performances can contribute to fictional world building in video games, we will now analyze how vocal performances in the award-winning 2019 role-playing video game Disco Elysium function to orient the player to the game’s fictional peoples and places. The player of Disco Elysium takes on the role of an amnesiac detective tasked with solving a murder case that proves politically incendiary. The game makes for a pertinent case study because, as we will argue, it exemplifies the fictionally constructive and critically thought-provoking potentials of vocal performances by its diverse and fully voice-acted cast of characters. 5
Disco Elysium
It’s the full voice acting that really has a transformative impact. The cast works magic with ZAUM’s [the developer] text, breathing life into the characters, their attitudes, their accents, which in turn tell the story of Revachol, with its melting pot of nationalities and beliefs, with much greater clarity.
Described as a work of “fantastic realism” by lead designer and writer Robert Kurvitz (The Crate and Crowbar, 2018), Disco Elysium takes place in the war-ravaged metropolis of Revachol in the story world of Elysium. This alternate universe mirrors our own in many ways: It has flora, fauna, peoples, cities, and a violent history that shapes its human inhabitants’ views of the world and of each other. The 1970s-inspired setting is also recognizable to the player by the ideologies that divide its peoples and fuel its conflicts: moralism, a form of reactionary, globalist centrism; ultraliberalism, a form of laissez-faire capitalism; as well as communism and fascism. Creative influences on the game 6 include the True Detective (2014–2019) TV series for its plot and the Émile Zola novel Germinal (1885) and other 19th-century social realist fiction for its themes and tone. The brushy, impressionistic visuals draw inspiration from contemporary pictorial artists Alex Kanevsky and Jenny Saville. Gameplay is influenced by the tabletop role-playing game Dungeons and Dragons (1974–) in its emphasis on character building and “skill checks.” The player gradually shapes the player character by putting experience points into facultative skills like “Empathy,” which allows for a deeper understanding of other characters and their motives, and “Inland Empire,” which serves up gut feelings about the player’s current situation. Each point spent on a skill increases the player’s chance of passing various associated skill checks, such as conceiving a motive, smashing a door, or intimidating an adversary.
When first starting the game, the player wakes up in a trashed hostel room as haggard police detective Harrier “Harry” Du Bois, who does not remember who he is or why he is there. These facts are only discovered gradually, and mostly through dialoguing with the game’s voice-acted characters. It turns out that Harry was assigned to investigate the murder of a heavyset middle-aged man who has been hanged and is still hanging, now livid and stinking, from a tree in the hostel’s backyard. Instead, Harry drank and drugged himself to the point of passing out. Slowly coming to his senses, he teams up with Lieutenant Kim Kitsuragi and sets out to solve the case by whichever conventional or unconventional methods the player decides to adopt.
Harry’s amnesiac state and the game’s in medias res opening align the player’s and player character’s sorry epistemic states. This opening effectively communicates that the player is supposed to figure out the world and its inhabitants in the prodding and inferential style of a criminal investigator. Starting in the backyard on the northern side of the hostel, the player comes upon a white boy lobbing stones at the hanged murder victim. Bat-eared and freckled, this character, “Cuno,” presents as English lower class. He speaks in a low-status Scouse (Liverpudlian) accent, manifesting itself through sharp intonation rises (“Can’t TALK, pig. Shit coming up STRONG. Throwing ROCKS.”) and various articulatory features, such as the occasional production of a tapped /t/ in constructions like “about it.” Other vocalic and consonantal features represent a mixture of local and low-prestige British English accents. These include /h/-dropping (heard [ɛːd] rather than [hɛːd]), realization of -ing as -in’ (fucking [fʊkɪn] rather than [fʊkɪŋ]), the occasional absence of the so-called FOOT/STRUT split (couple pronounced as [kʊpɫ] rather than [kʌpɫ]), realization of some /t/ sounds as affricated or spirantized (e.g., pretty may sound like pre
What Cuno has to say at this point is unhelpful to the investigation. Even if he knows something, he has no interest in helping the “pigs” solve their case. However, how Cuno talks immediately suggests to the player that the world of Disco Elysium is like the real world in certain basic respects. First, the social world is stratified in such a way that lower-class people have stereotypically lower-class accents and dialects; and, more importantly, what it means to be someone like Cuno in this world can be understood with reference to what it means, stereotypically, to be a destitute and life-hardened child of the English lower class. The player will form matching ideas about Cuno’s background and lifestyle, and these will turn out to be mostly true: He is foul-mouthed and given to petty crime. He is the victim of frequent and vicious beatings by his drunken father, who lives in the run-down apartment complex that lies beyond the hostel backyard and who cares nothing for him. Cuno strikes the other characters in the game as ignorant, maladjusted, and futureless, wherefore they will have nothing to do with him, and wherefore he becomes ignorant, maladjusted, and futureless.
The apartment complex that houses Cuno’s father also houses other poor denizens of Revachol. This unfortunate class of characters, of whom a good number also live in a dilapidated fishing village that the player visits later in the game, speak primarily with non-normative and foreign accents, including Northern varieties of British English, Cockney, Slavic, Irish, Spanish, and various hybrids. By contrast, French-accented English is the closest one comes to a native mode for the denizens of Revachol. This is because Revachol was founded as a colony of Suresne (later Sur-la-Clef), a francocentrically fictionalized colonialist power. Officials, bureaucrats, and other relatively privileged people often speak with a French accent, as do social conservatives who long for the good old days. Most other accents are associated with poverty and struggle. This schismogenic use of accent is typical of video games and other fictional media, which, as noted in the introduction, tend to pair simple moral binaries with socio-phonetic opposites that the media user can then use to tell friend from foe. In Disco Elysium, however, the indexation of accents is not primarily moral. Rather, accents express the diverse sociocultural backgrounds of speakers, wherefore, as we shall keep seeing, accents become a guide to the player’s conscious and critically associative filling in of those backgrounds.
As the player ventures further north alongside the apartment buildings, a trailing puddle of red liquid leads them on to Cindy “the SKULL.” This teenaged, black-clad girl is pouring dyed heavy fuel oil onto a wall and into the street from a first-floor terrace. Her snide remarks to the player mix elements of standard British English with gaping-wide vowels and glottalized t’s (e.g., started is realized as [stɑːʔɪd] rather than [stɑːtʰɪd]), connoting something of her countercultural, punkish ethos. 7 Other examples of nonstandard features include Northern-like pronunciation of STRUT vowels (e.g., Skull [skəɫ] ∼ [skʊɫ]), occasional absence of the so-called TRAP/BATH split (last pronounced as [læhst] rather than [lɑːhst]), /t/ flapping in certain contexts (got a [gɒɾə]), and very dark or even vocalized l’s (wall [woːʊ] rather than [woːɫ]). Cindy briefly imitates a Queen’s English accent in mocking the player character’s “admiration” of her fuel-oil “mural.” Her ridicule shows sociolinguistic awareness, and that she consciously distances herself from that normativity which she takes the accent to express.
Cindy has a problem with authority and is not much help with the investigation. However, she reserves her strongest ire for Joyce Messier, a tall, lean, expensive-looking woman standing on her yacht at a pier straight across from Cindy’s perch. As a high-ranking representative of the Wild Pines Group, an international business conglomerate that employs some 72,000 workers, Messier has come to negotiate with the local dockworkers’ union to lift a strike that is costing the Wild Pines money—and she is now conducting her own investigation into the murder, which is starting to look like a drunken and resentful attack by the dockworkers against a Wild Pines security guard. The player camera tracks Cindy’s disdainful glances at this woman, leading the player on to meet her.
Messier’s lustrous vessel looks out of place in the run-down apartment complex, and so does she. Her fine, middle-aged features and set, backcombed hair combine in a Margaret Thatcher-like look that reflects her neoliberal socioeconomic convictions. She speaks in Received Pronunciation (RP), with a complete absence of nonstandard features. For instance, her English is non-rhotic (harbor [hɑːbə]), contains the TRAP-BATH split (
Messier is rich, and hers is the voice of capital. She represents Elysium’s ultraliberal elite, a social category that her accent helps associatively to fill in by its connotations to the tastes, politesse, and schooled edification of the English upper class (by her own admission, Joyce received a “preposterously expensive education”). The only other character who talks somewhat like her is the failing business owner Plaisance, a relatively minor character who seems to share Messier’s social outlook. The player comes to understand Messier by contrast to the other characters of Disco Elysium at least in part because she is one of only a few characters to speak with an accent that, as a pre-release developer’s note says about her character in general, connotes “very old money” and “finger sandwiches” (Moskvina, 2019). And it is not a coincidence that the player encounters Messier on the same geographical axis—that which extends along the run-down apartment complex—which locates Cuno and Cindy. Messier is fantastically privileged compared to these struggling characters, and her iconizing accent suggests to the player that this fictional privilege and its sources can be understood in nonfictional, anglocentric terms. The accent adds a classist and exclusionist undertone to the capitalist ideology that she represents and with which the player will have to contend throughout the game (Figures 1–3).

Figure 1. Having left Cuno at the crime scene (through the dark alleyway in the lower-right corner of the screen) and moving further north, the player will notice a trailing puddle of red fluid by the pier.

Figure 2. The puddle leads the player to Cindy (leaning over the railing on the first-floor terrace), whose disdainful glances lead the player on to Messier by the westward tracking of the player camera.

Figure 3. Messier’s clipped English accent will strike the player by its contrast to the lower-status English accents of Cuno and Cindy. The common national origin of these non-rhotic accents invites comparison between the characters.
Moving on from this first encounter with Messier, the player’s narrative and evidentiary search will take them east, toward the harbor, which has been closed down by the striking dockworkers. It is on this path that the player will encounter the first of the game’s four identifiably fascist characters in the form of “Racist lorry driver,” a stubby, big-nosed Caucasian male who dislikes anyone he perceives to be non-Revacholian (such as Harry’s sidekick, with whom he exchanges some unpleasant remarks). His voice is hoarse and low, with a distinctively French accent: rising intonation patterns with French stress rules (e.g., superior is pronounced as if it were the French supérieur [sypeʁˈjœʁ]), uvular fricative /r/s (realized as voiced uvular fricatives [ʁ] or uvular approximants [ʁ̞], e.g. true [tʁuː]), very back vowel in words such as do ([duː] rather than the more native-like [dʉː]), dental fricatives pronounced as alveolar ones (that [zɛt] rather than [ðɛt]), and other features.
A short walk south of the lorry driver stands René Arnoux, another fascist, who, decades after the fact, still wallows in the shameful defeat of the fascistic monarchical regime, or “Suzerainty,” of Revachol by the communists (who, in turn, were crushed by the moralist Coalition of Nations about 40 years prior to the events of Disco Elysium). The voice of this elderly man is hoarse and tuneless, yet as strong and severe as his convictions. He speaks with a French accent, although of a different variety from the lorry driver’s more mainstream (European) French. Phonetically, his /r/s are not only uvular but also trilled (returned pronounced as [ʀitɜːənd]). As in the case of the lorry driver, the becomes ze ([zi] rather than [ði]) or de, the vowels show a range of realizations typical of French-accented English, and the prosody shows French patterning as well. There is also an occasional absence of word-initial /h/ (happiest pronounced as [apiɛst]), and nothing becomes nossing (/θ/ is realized as [s]).
The ethno-nationalistic sympathies of these fascist characters are echoed by their linguistic conservatism: They speak in a French-sounding way and believe that that is the proper, Revacholian way to speak. In addition, the rough and growly quality of their voices distinguishes them from most other French-accented characters. It makes them less approachable and conveys a wounded pride in their otherwise self-assured pronouncements.
To be sure, many characters in Disco Elysium other than the fascists speak with a French accent, but it is significant that not a single communist, with their anti-nationalistic ideological commitments, does. The characters that can be readily identified as communists speak with many different accents, including German (the character known as “Echo Maker”), Spanish (“Call Me Mañana”), Irish (“Easy Leo”), an African variety (Elizabeth), as well as varieties of British English (e.g., Evrart Claire) and American English (e.g., Titus Hardie). This linguistic polycentrism represents the ideologies of fascism and communism as antipodal social orientations; if the player chooses to support one of them, then that means directly opposing the other.
North-east of René Arnoux and the lorry driver, blocking the entrance to the harbor, stands the African-looking male character Jean-Luc “Measurehead,” accurately described by Joyce Messier as a “2.20 m racist behemoth.” Measurehead is a Semenese supremacist (the Semenese Islands are a former Revacholian colony). His character satirizes the pro-Revacholian fascists who think themselves physically and mentally superior to him in virtue of their ancestry, which they manifestly are not. However, Measurehead is perhaps even more racist than they are; he is obsessed with racial pseudo-science, including phrenology, and completely unabashed about his convictions. Measurehead’s accent has French features, but they are inconsistently realized. He sometimes pronounces and sometimes omits /r/s following vowels (as in deeper, vulgar, etc.). He also stands out from the lorry driver and Arnoux by his pronunciation of dental fricatives, which he realizes either as the standard [θ] and [ð] or as [t] and [d] (and not as [s] and [z]). His intonation and rhythm are reminiscent of varieties of French spoken in African countries, such as Cameroon and Ivory Coast. His voice quality is low-pitched, breathy, and resonant.
Measurehead sounds unlike any other character in Disco Elysium. The player will naturally ascribe this fact to his cultural background, but that turns out to be a dubious interpretation. The player’s persistent questioning of Measurehead can eventually cause him to admit that he is not Semenese, although his ancestry may well be, and that he heard about the Islands “on the radio.” He is actually Revacholian, wherefore one would expect him to have a local accent. The fact that he does not becomes part of what reveals him as a fake and as just another racist blow-hard. There is even some concrete evidence of feigning in that his accent is highly inconsistent. Notably, /r/s are produced, unsystematically, in at least five variants: [ɹ], [ɾ], [ɻ], [ʀ], [ʁ]. In Measurehead as in the pro-Revacholian fascists, accent is emblematic of nationhood and ethnic ancestry, and therefore a meaningful part of these characters’ self-presentation. 8 It links them to specific regions of the fictional world and suggests, as becomes especially apparent in the satirizing figure of Measurehead, that their fascistic pride in these links is unreasoned and arbitrary. It is not a stretch to suppose that the game is making a similar point about the fascisms of the real world.
The player will eventually need to placate, circumnavigate, or knock out Measurehead in order to continue the investigation. The route leading from him and into the harbor takes the player to Evrart Claire, leader of the dockworkers’ union and professed communist. Introduced by the game’s narrator as a “walrus of a man,” Claire is enormously fat, and is for this reason physically confined to a freight container in which he has set up a mobile office that is moved by crane. This character is shot through with irony: Lounging in his massive recliner, he presents as a corporate fat cat. And in attempting to justify his bid for communal ownership of the harbor, he occasionally lapses into platitudinous commercialese: talk of being able to offer “competitive contracts” and “bold, exotic new revenue streams.” He speaks in a hypernasal RP accent that, together with his large vocabulary and British English turns of phrase, distinguishes him from his subordinates. For example, the diphthong [əʊ], as in glowing and no, is central, which is associated with upper-RP, as is his raised pronunciation of the monophthong in words such as yes, best, and success.
Of Claire’s subordinates, the group of men known as the Hardie Boys is the most significant. They sympathize with the strikers and despise Joyce Messier and the Wild Pines. In addition, various clues will connect them to the deceased, who turns out to be a mercenary, “Lely,” hired by the Wild Pines to break up the strike. The Hardie Boys also had a personal motive for killing this man, who had apparently raped a young woman, “Klaasje,” of whom they feel protective. It will later turn out that the Hardie Boys did not kill Lely (and neither did Lely rape Klaasje).
The Hardie Boys are Claire’s muscle, responsible for enforcing his policies, intimidating his adversaries, and combatting organized crime in and around the harbor. These men wear worker’s clothes and drink beer. They go by names and nicknames like Glen, Eugene, “Shanky,” and “Fat Angus.” Of the seven Hardie Boys, six have working-class American dialects. 9 For example, the adjectival phrase really nice becomes real nice, and are not, as in “they are not from here,” is contracted to ain’t. Man is used as an intensifier, and -ing at the end of words is realized as -in’, as in the following provocation by Eugene: “Yeah, man, weren’t you listeni[n]?”
The player needs to get the Hardie Boys to talk as it quickly becomes clear that they know more than they are letting on. However, they are distrusting of the player, whom they see as an agent of a government (the internationalist moralist Coalition) that cares nothing for them. Their accents help unify them and express their spirited localism: These characters do not care much about abstract political ideals, but they do care about each other and about their community. They represent the feet-on-the-ground working class of Disco Elysium. To earn their trust, the player will need to adopt or feign their social outlook.
The accentual contrast between the Hardie Boys and their boss, Evrart Claire, serves again to express the social stratification of Disco Elysium. Specifically, the American accents of the Hardie Boys connote low social status, gruffness, solidarity, and perhaps a doltish stubbornness. Claire’s RP accent connotes high social status, material means, cleverness, and arrogance, traits of which at least the first three also describe Joyce Messier. This contrast helps both to effect an implicitly class-based hierarchy between Claire and his workers and to communicate to the player that Claire’s communism falls well short of its own egalitarian and ultimately humanitarian ideals (as has been generally true of the real-world communisms of the 20th century, a comparison that is surely intended).
Should the player manage to soften up the Hardie Boys, they will eventually admit that a friend of the group, a woman named Ruby, could be responsible for the murder. After having established Ruby’s innocence at a remote location, the player returns to the hostel. Outside, three heavily armed mercenaries—Lely’s associates—have confronted the Hardie Boys. Drunk and enraged, the mercenaries have concluded that the Hardie Boys murdered Lely. Their leader is the muscle-bound, deep-voiced Raul Kortenaer. He does most of the talking, and he does so in a recognizably General American (GA) accent (e.g., chance is realized, with a front vowel, as [tʃæns]). His voice is loud, deep, rough, and tense. It alerts the player to the character’s hateful instability, which threatens to erupt into frenzied violence (Figure 4).

Figure 4. Kortenaer (bottom-left) and his two associates confront the Hardie Boys. Kortenaer’s General American accent contrasts with the local varieties of the Hardie Boys.
Kortenaer hails from Oranje (or “Oranjenrijk”), a Netherlands-inspired colonialist nation known for its state-sponsored overseas commercial enterprises and private military contractors. These contractors, including Kortenaer’s employer, Krenel, serve exploitative corporate interests in developing nations with disastrously violent results. Kortenaer’s voice shapes our perception of these people’s militaristic adventurism as distinctly and satirically American. 10 That impression would have been much weaker if it were not for the voice acting, but his accent, coupled with his apocalyptic, Hollywoodesque outbursts, such as “Welcome to the fucking reckoning” and “Now it’s fucking time for some justice,” leaves little doubt. 11 His presentation further serves to connect the destructive ventures of the Oranjese to contemporary, real-world American military contractors like the now-defunct Blackwater, and to suggest that he and his kind are ultimately agents of international capital. (This interpretation is supported by Joyce Messier’s admission that Krenel has been renamed multiple times to bury past atrocities. Blackwater has a similar history.)
The confrontation with the Oranjese mercenaries inevitably results in violence and the deaths of multiple characters. This shootout represents the climax of the game and leads on to the plot’s resolution. Lely’s killer is eventually revealed to be the aging revolutionary Iosef Lilianovich Dros, whose motives were both political and personal. Narratively, the murder’s real significance is to focus the conflicting ideological interests of characters like Joyce Messier, Evrart Claire, and Raul Kortenaer. These characters’ vocal performances help the player navigate the fictional world and interpret its sociocritical messaging—which, as the present analysis suggests, seems to be amplified rather than muted by the game’s liberal use of vocal stereotyping. In the final section of this article, we will attempt to make sense of this critical use of vocal stereotyping.
Critical Stereotypes
Disco Elysium builds out its story world by means of paralinguistic features of speech, notably accents. These accents do not effect a story-flattening “Manichaean binary” between good and bad characters, as previous research has found to be the case in other popular games. Rather, the accents signal the socially and historically diverse and interrelated backgrounds of the game’s characters, thereby establishing meaningful connections between the characters as well as between the characters and the game’s historical backdrop. In turn, these vocal characterizations and connections convey, by means of consciously structured parallels to real-world issues and conflictual social formations, much of the game’s sociocritical messaging.
Our analysis of Disco Elysium also shows that the game builds its story world in part through the use of social stereotypes evoked via the vocal performances of the main characters. Social stereotypes are “beliefs about the characteristics, attributes, and behaviors of members of certain groups” (Hilton & von Hippel, 1996, p. 240). Such generalizations are not necessarily false (Judd & Park, 1993). For example, the stereotype that men are more violent than women is statistically true. However, stereotypes tend to overgeneralize their ascriptions, and they can also identify groups of people with characteristics that do not describe them at all. This is especially true when the stereotyping concerns groups of people perceived as enemies or outsiders, who may be stereotyped as essentially immoral and uncultured in order that their perspective on the world can be safely ignored (e.g., Hagendoorn, 1993). Stereotypes can therefore be “powerful tools of ideology” (Thompson, 2020, p. 46) that shut down critical thinking and moral scrupulousness.
This ideologically blinkering aspect of social stereotyping represents a main scholarly criticism of vocal stereotyping in popular video games, as outlined in the introduction of this article. As our analysis shows, however, Disco Elysium’s vocal stereotyping may serve to enlarge the player’s critical awareness of the game world and its ideological import. The socio-phonetic presentation of characters like Cuno and Raul Kortenaer represents a way for the game to evoke and examine power, capital, class, and other real-life social issues. But what is it about the stereotyped representations of Disco Elysium that makes them critically thought-provoking as opposed to stifling?
We believe that the most important factor is overtness. Stereotype overtness, as we will use the term, is the degree to which stereotyped characters are used to call attention to themselves as stereotypes, that is, as recognizable images of privilege, power, nationalism, poverty, etc. Because social stereotypes define perspectives on the real world, their importation into Disco Elysium allows for the critical indexation and treatment of real-world issues, which can then be dramatically evoked and enacted by the stereotyped characters in the manner of a Menippean satire. 12 And because the stereotyped characterizations call attention to themselves as stereotypes—as representations not of people but of ideas about people—their inclusion does not imply a commitment to their essentiality, naturalness, necessity, rightness, or even to their truthfulness as descriptions of real people in the real world. This matters because a basic moral problem with stereotypes, and with vocal stereotypes in particular, is exactly that they tend to attribute “necessity to a connection (between linguistic features and social groups) that may be only historical, contingent, or conventional” (Irvine & Gal, 2000, p. 37). This is emphatically not the case in Disco Elysium, wherefore the game’s embedded stereotypes appear significantly less exceptionable. 13
For example, the stereotyped portrayal of Cuno, including his accent, is transparently used to index poverty and delinquency in such a way that the game can treat these very social issues. At the same time, there is no suggestion that Cuno is himself to blame for his poverty and delinquency, or that these characteristics are in some way naturally linked to his accent. Things could have been very different for Cuno if people and society had treated him better, as the game makes clear by revealing the character’s many misfortunes. Likewise, the stereotyped portrayal of Joyce Messier is transparently used to index power and privilege in such a way that these realities and their preconditions can be critically examined by players.
By contrast to the overt stereotyping of Disco Elysium, the stereotypes that have been problematized by previous research have been—or have at least been assumed to be—covert. Covert stereotyping functions to characterize, but there is no attempt to spotlight and critically examine the ideas that enter into the stereotyped characterization. For example, action and adventure games of the 1980s made liberal use of the “damsel in distress” trope in tasking players with rescuing a young female character taken hostage by the game’s villain (Summers & Miller, 2014). These games stereotyped attractive young women as helpless and dependent victims, but they did not address themselves to the notion of feminine helplessness by questioning, ironizing, or otherwise examining its purport. Of course, reflective players may still be able to engage with such representations in questioning and critical ways, but the point is that overt stereotyping encourages critical scrutiny while covert stereotyping does not.
In attempting to state what makes the stereotyped representations of Disco Elysium more overt as opposed to covert, there is no getting around that magical word, “context.” The player interprets each stereotyped character against a background of fictional facts and social themes that bring the stereotype’s ideational import into critical focus. Our analysis of the game discusses many examples, such as the class-contrastive portrayal of Evrart Claire and the Hardie Boys or the critically Americanistic history of the organization that employs Raul Kortenaer. And then there is simply the fact that the game clearly aims to deal with such social issues as the stereotypes represent, which is communicated to the player by the socially realistic depiction of Revachol together with the centrality to the plot of ideologically conditioned conflicts of interest. Indeed, as Robert Kurvitz has explained in an interview, the fictional world of Disco Elysium was made to be “a ghost of our world” (The Crate and Crowbar, 2018); it is precisely an abstraction of the kinds of social issues that define modernity. These issues are abstracted from the real world and imaginatively evoked—and echoed, if we are right, by overt vocal stereotyping—in the game world. And as Kurvitz goes on to suggest, this imaginative evocation is what allows the game to examine real-world social issues with “a bit of psychological distance.” 14
Disco Elysium, then, represents a significant counterpoint to recurring scholarly criticisms of vocality in video games. Its vocal performances are fictionally constructive rather than flattening, and they do not, even by their stereotyping, ease the player into language-ideological conformity. We hope to have gone some way toward explaining how the game achieves these things and how other games could do the same.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Carlsberg Foundation (Young Researcher Fellowship, grant number 35891).
