Abstract
This article uses Zbikowski’s theory of ‘musical grammar’ to analyse Radiohead’s song ‘Paranoid Android’ from their 1997 album OK Computer. Invoking the close structural and compositional parallels between language and music, Zbikowski’s approach appropriates some of the core elements of cognitive linguistics to provide a means of ‘translating’ music into meaning-bearing conceptual structures via the construction of ‘sonic analogues’, which are a type of conceptual construct formed when incoming perceptual information is compared to existing cognitive knowledge stored as image schemas. The result is an analysis of the interactions between the linguistic and aural constructions of a multimodal text that not only sheds new light on the text’s meaning-making devices, but also endeavours to unlock the strategies through which such distinctive semiotic modes act and interact within texts to create meaning potential.
1. Introduction
The genre of popular music has long suffered a dearth of scholarly attention. As Morini notes (2013: 284), a very pragmatic reason for this is that researchers often lack the skills needed to analyse these two very different semiotic modes simultaneously, resulting in studies that tend to focus on either the linguistic or the musical, but rarely both (see Morini (2013) for details of these studies). However, multimodal scholars continue to develop rigorous and successful means of analysing other types of multimodal texts which, like music, prioritise the non-linguistic in their process of meaning-making; this shows that the relative paucity of musical text analysis is not simply because of the genre’s multimodal content, but rather the nature of that content. In particular, while language has referential meaning, music does not. As Michael Thaut states in his work on the neuroscience of music, ‘[m]usical sounds and sound patterns communicate themselves in abstract fashion. They do not intrinsically denote or refer to extramusical events, objects, concepts, or cognitions’ (2005: 172). In short, music cannot easily be expressed in words, a characteristic that renders it difficult to analyse in accordance with the rigorous, retrievable and replicable ideals of stylistic enquiry.
As a multimodal study that analyses the musical alongside the linguistic, Morini’s (2013) article goes some way towards redressing this scholarly oversight by convincingly identifying lyrical and musical movement in Kate Bush’s 1985 hit ‘Running Up That Hill’. Morini notes that the imprecision typical of musical meaning results in an ‘inevitable’ degree of impressionism that may be overcome by ‘find[ing] ways of looking at music which are, if not objective, at least generic in terms of cognition and perception’ (2013: 290). The theory of musical grammar proposed by Zbikowski (2017) can at least partially make claims to enhanced objectivity, for three reasons. First, as Thaut notes, ‘[m]usical meaning is embodied and its nondiscursive symbols cannot be translated directly into referential denotations’ (2005: 172; emphasis added). The embodied nature of musical meaning therefore makes music ideal for analysis using an approach grounded in cognitive science, in which enquiry into the nature of embodied meaning has considerable precedent. The process of meaning-making in any semiotic mode involves identifying correspondences – or analogies – between form and meaning. Therefore, Zbikowski bases his theory of musical grammar on this fundamental cognitive capacity: the ability to think analogically. Second, while Zbikowski has largely restricted application of his developing framework to the rhythmic and harmonic elements of musical texts, its origins in cognitive linguistics make it ideal for parallel application to the text’s lyrics. Finally, the same cognitive processes underpin the production and reception of both language and music. As such, the framework facilitates analysis of both semiotic modes through recourse to central theories within cognitive linguistics, all of which inform the song’s ‘musical grammar’.
In the next section of this article, I set out the foundations of Zbikowski’s (2017) theory of musical grammar, and the stages involved in its application. In section 3, I then use Zbikowski’s theory to analyse the musical grammar of Radiohead’s ‘Paranoid Android’.
2. Methodology
2.1 Theoretical foundations
Zbikowski’s theory of musical grammar ‘takes as its basic model the grammars developed by cognitive linguists over the past thirty years’ (Zbikowski, 2017: 16), in particular those of Langacker (1987, 1991) and Croft (2001). As such, it recognises all aspects of form, regardless of size or type, as form–function pairings, that is, form, whether linguistic or non-linguistic, has the potential to create meaning. Zbikowski applies this theory to texts in which music interacts with a number of other semiotic modes, including dance and gesture (2017). When considering how music interacts with language, however, Zbikowski does not analyse language from a cognitive perspective; as such, his approach is informed by the cognitive grammars of Langacker and Croft, rather than constituting an application of them.
Zbikowski’s earliest applications of cognitive science to musical interpretation (2002, 2008) use Lakoff and Johnson’s (1980) conceptual metaphor theory (CMT) – first, to address our frequent recourse to metaphor as a means of articulating our experience of music and, second, to establish connections between the conceptual domains of language and music. This led Zbikowski to the conclusion that music communicates by encouraging the listener to create analogous relationships between musical form and meaning, which he terms ‘sonic analogs’. Brower’s (2000) cognitive theory of musical meaning offers some insight into how sonic analogues are constructed. Drawing on Margolis’ (1987) concept of ‘pattern-matching’ and Johnson’s (1987) theory of embodied meaning, Brower contends that we compare incoming musical information to three types of stored schematic knowledge. First, we compare it to our knowledge of the musical work itself, its intra-opus patterns, which allows us to identify a repeated musical phrase or refrain, for example. Second, we compare the incoming musical stimulus to our stored knowledge of musical convention; all music listeners have competence in this area, although trained musicians and music theorists may constitute a form of ‘ideal listener’. These two sets of schematic knowledge give rise to ‘intra-domain’ mappings (2000: 324; italics in original), whereby music is compared to music. These sets of knowledge can overlap, enabling the listener to detect patterns within a piece of music and how they correspond to musical conventions simultaneously, effectively identifying internal and external musical foregrounding through processes of parallelism and deviation. The third type of knowledge, that of image schemas, involves ‘metaphorical, or cross-domain mapping’ (2000: 324; italics in original), as musical information is compared to non-musical, typically embodied, patterns of experience. 
The main image schemas used in music comprehension are, unsurprisingly, those most frequently used in everyday life, namely container, cycle, verticality, balance, centre-periphery and source-path-goal schemas (Brower, 2000: 326). These common image schemas reflect our embodied experience of space, time, force and movement: for example, they reflect our experience of space as comprised of discrete and bounded areas; of time as divided into cycles; of the body as being centred, balanced and positioned vertically on stable ground; and of motion as a path towards a goal. As these image schemas commonly overlap – for example, the body is typically perceived as both centre and container – we often combine them in the process of interpretation. Brower’s theory is founded on the premise that, while we draw on image schemas to make sense of musical form, musical form is itself largely a metaphorical reflection of these image schemas, so that ‘[u]nderstanding how tonal conventions reflect bodily experience can give us an insight into the novel metaphorical meanings of a musical work’ (Brower, 2000: 325).
Zbikowski notes that the image schemas identified by Brower as most pervasive in musical interpretation are united by a common ‘appeal to causality’ (2017: 92); that is, they are dynamic processes that are goal-directed in nature. For example, a state of instability typically activates a search for the means to regain stability, while a departure typically prompts the goal of return, or of reaching a fixed destination. Our tendency to conceptualise music as a goal-directed dynamic process may appear rather simplistic but, as Zbikowski notes, there is evidence to suggest that the ability to identify the causal elements of processes is integral to an infant’s competence in constructing conceptual domains (see discussion of the work of Mandler in Zbikowski, 2017: 52–53). As such, the meaning-making strategies employed by music could be seen as ‘a deeply ingrained aspect of the knowledge through which we guide our understanding of the world’ (Zbikowski, 2017: 92).
2.2 Applying Zbikowski’s theory of musical grammar
Stage 1: Identification of rhythmic and harmonic processes. The first stage in applying Zbikowski’s (2017) theory of musical grammar involves identifying the musical features that contribute to meaning (termed syntactic processes). Musical form communicates meaning via its two key resources of rhythm and pitch, or what Zbikowski refers to as music’s temporal (rhythm) and tonal (pitch) frameworks. Repetition constitutes a strong example of a syntactic process in musical communication, as repetition of a rhythmic structure (such as a drum beat) or of a particular pitch acts much like linguistic repetition in drawing attention and thereby constructing meaning. Other examples of syntactic processes include sequences and grouping; for example, a sequence of ascending or descending pitch can communicate meaning, as can a certain grouping of rhythmic patterns. All such patterns are identified at this initial stage.
Stage 2: Identification of sonic analogues. Musical communication takes place when patterns of rhythm and pitch each create a sonic analogue; as noted above, the sonic analogues created by music tend to be goal-directed, although their specific nature will depend on variations in the music, and how the music interacts with any other semiotic resources present in the text, such as gesture or words. As such, the second stage of analysis involves identifying the sonic analogues created by each of the temporal and tonal syntactic processes – that is, what sonic analogue is created by the rhythm of each section and what sonic analogue is created by the melody of each section – by relating them to the corresponding image schema(s) identified by Brower (see above). In particular, fixed pitch is a meaningful resource exploited during musical communication: it refers to how a sequence of pitches (i.e. notes) can be construed as ‘anchored’ by one specific pitch, often the tonic or dominant note of the key being played (see glossary of musical terms below). Fixed pitch therefore draws on Brower’s (2000: 326) cycle and source-path-goal schemas to create a sonic analogue of departure and return (Zbikowski, 2017: 126). Given that a ‘real’, as opposed to an ‘ideal’, listener is typically unaware of the key in which music is being played, and hence its tonic and dominant notes, I propose that real listeners instead perceive whether or not a musical phrase returns to its original note, and, if so, this is what they conceptualise as a completed cycle, or process of departure and return, depending on the nature of that musical sequence.
Stage 3: Coordination of syntactic layers. When there are multiple individual rhythmic patterns in a piece of music, they interact to create a syntactic layer known as the rhythmic layer; when multiple patterns of notes come together, they form the harmonic layer. The third stage focuses on identifying the sonic analogues created when the rhythmic and harmonic syntactic layers interact or coordinate.
Stage 4: Music and words. The final stage considers how music interacts with language to create further sonic analogues.
3. Analysis
While the many and varied musical strands in ‘Paranoid Android’ are integral to the song’s effect, the musical focus here is the song’s melody. Melody, commonly equated with tune, is typically identified as the most memorable or foregrounded sequence of notes; in the case of a song, the melody usually equates to the sequence of pitches produced by a singer (see Selfridge-Field, 1998). While this accords well with my aim of analysing both how music communicates meaning and how it does so in conjunction with language, it means that the additional tonal layers (defined in music terminology as the ‘harmony’) as well as the wordless sections of the text are overlooked, despite potentially bearing a significant semantic load.
The song is divided into four musically and lyrically distinct sections; as the final section is a wordless musical reprise of the second, it will not be analysed. For ease of reference, the divisions used by Griffiths (2004: 35–37) in his deconstruction of the song’s pitch and tempo are preserved here. It should be noted that the lyrics printed in the album sleeve differ in a number of ways from those heard in the song; they also contain some graphological and lexical deviation not communicated aurally. For these reasons, the lyrics analysed here are those heard in the song and found in the official sheet music. In order to focus, firstly, on how music alone communicates meaning and, secondly, on how music interacts with language to create meaning, I will not analyse the lyrics separately, but rather in terms of how they interact with the melody, which reflects how they are experienced by the listener whilst listening to the song.
3.1 Section A (start to 1.57 minutes)
3.1.1 Rhythmic processes
The opening section of ‘Paranoid Android’ is in 4/4 time, meaning that there are four crotchets (quarter notes) per bar; 4/4 time is also known as ‘common time’ due to its predominance in western music genres, particularly rock and pop. The tempo is 84 BPM – that is, there are 84 crotchet beats per minute – and the section contains a total of 160 beats. It is 117 seconds long and comprises an introduction (24 beats) and what Griffiths (2004: 37) identifies as two separate verses (each verse corresponds to two lines or a single linguistic utterance) containing 40 beats each, with a 28-beat refrain between the verses and again at the end of the section.
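The beat counts reported above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python rather than part of the analytical framework; the dictionary and function names are my own, and the duration it yields assumes a perfectly constant tempo, so it can only approximate the length of the recorded section.

```python
# Beat counts for Section A as reported in the text (crotchet beats).
SECTION_A = {
    "introduction": 24,
    "verse 1": 40,
    "refrain 1": 28,
    "verse 2": 40,
    "refrain 2": 28,
}
TEMPO_BPM = 84  # crotchet beats per minute, as stated in the text


def total_beats(parts):
    """Sum the beat counts of all parts of a section."""
    return sum(parts.values())


def nominal_seconds(beats, bpm):
    """Duration in seconds if the tempo were held perfectly constant."""
    return beats * 60 / bpm


beats = total_beats(SECTION_A)
print(beats)                                         # 160
print(round(nominal_seconds(beats, TEMPO_BPM), 1))   # 114.3
```

The parts sum to the stated 160 beats, and a strictly metronomic 84 BPM would give a duration of roughly 114 seconds, close to the 117 seconds reported for the section.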
In terms of the sonic analogue constructed, the rhythmic regularity of Section A conveys (in Zbikowski’s terms) a sense of steady progression that typically invokes a source-path-goal schema; as such, the sonic analogue constructed is that of steadfast progression towards fulfilment of a goal-directed dynamic process.
3.1.2 Harmonic processes
Opening in the key of G Minor (natural), the words of Section A’s first line commence on a repeated C4 (G Minor’s subdominant), which rises in a steady note-by-note ascension of the scale (C4-C4-D4-E4♭) (see Figure 1). Then there is a ‘jump’ over F4 to G4, followed by another step-by-step climb to the highest note in the phrase, B4♭, which is repeated. The notes then descend steadily to rest on E♮ (rather than the originating note C4); E♮ is the only note thus far that is not part of the section’s G Minor scale (in which E is always flat). The sonic analogue constructed by the music would be that of attempted progress towards a goal: this progress commences strongly, as suggested by repetition of the opening note; proceeds steadily, as indicated by the steady pitch ascent; then reaches a satisfactory mid-point, as illustrated by repetition of the phrase’s climactic B4♭. The reinforcement of this mid-point, coupled with the steady note-by-note descent, suggests attempted return to the point of original departure; however, this attempted return appears thwarted as the final note stops short of its originating pitch.

Figure 1. Section A, verse 1: 12-note harmonic structure of line 1, repeated in line 2.
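The claim that E♮ is the only pitch in this phrase falling outside G natural minor can be illustrated computationally. The sketch below is illustrative only: it uses just the pitches explicitly named in the discussion above (not the full 12-note line), writes flats as ‘b’ and sharps as ‘#’ for convenience, and all function names are hypothetical.

```python
# Semitone offsets of the natural letter names above C.
LETTER_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
# Semitone pattern of a natural minor scale, counted from its tonic.
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]


def pitch_class(note):
    """Pitch class (0-11) of a note such as 'Eb4' or 'G#3'; the octave is ignored."""
    pc = LETTER_TO_SEMITONE[note[0]]
    for ch in note[1:]:
        if ch == "b":      # flat lowers the pitch by a semitone
            pc -= 1
        elif ch == "#":    # sharp raises the pitch by a semitone
            pc += 1
    return pc % 12


def natural_minor(tonic):
    """Set of pitch classes in the natural minor scale built on `tonic`."""
    root = pitch_class(tonic)
    return {(root + step) % 12 for step in NATURAL_MINOR_STEPS}


def outside_scale(notes, tonic):
    """Notes whose pitch class does not belong to the natural minor scale on `tonic`."""
    scale = natural_minor(tonic)
    return [n for n in notes if pitch_class(n) not in scale]


# Pitches named in the text for line 1: the ascent, the jump, the peak, the final note.
named = ["C4", "C4", "D4", "Eb4", "G4", "Bb4", "Bb4", "E4"]
print(outside_scale(named, "G"))  # ['E4']
```

On these pitches, only the final E♮ is flagged, which is consistent with its description as a deviation from the established scale.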
The first verse is followed by a word-free bar; on the final beat of the second bar begins the twice-repeated refrain ‘What’s that?’ (see Figure 2). While the first refrain is sung over a narrow pitch range (only three notes) (D5-E5♮-F5-E5♮-D5), its short but steady ascent from, and descent back to, its originating pitch suggests completion of a goal-directed process, such as return from a journey. The second phrase is narrower still (D5-E5♮), and failure to reach the high note of F5 or return to the originating note D5 constructs a sonic analogue of another unfulfilled goal-directed process, a failure doubly reinforced by the extended duration of the ‘deviant’ endnote, E5♮ (held for eight beats).

Figure 2. Section A: refrain.
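The contrast between the two renderings of the refrain can be expressed as a simple check of whether a phrase closes on its originating pitch, which is the perceptual cue proposed in section 2.2 for ‘real’ listeners. This is a minimal sketch under that assumption; the function names are my own, and natural signs are omitted from the note names for brevity.

```python
def returns_to_origin(phrase):
    """True if a phrase of two or more notes ends on the pitch it started from."""
    return len(phrase) > 1 and phrase[0] == phrase[-1]


def distinct_pitches(phrase):
    """Number of different pitches used in the phrase."""
    return len(set(phrase))


# Pitch sequences of the two refrain phrases as given in the text.
first_refrain = ["D5", "E5", "F5", "E5", "D5"]
second_refrain = ["D5", "E5"]

print(returns_to_origin(first_refrain), distinct_pitches(first_refrain))    # True 3
print(returns_to_origin(second_refrain), distinct_pitches(second_refrain))  # False 2
```

The first phrase completes its cycle over three pitches, whereas the second uses only two and stops short of its origin.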
Following a one-bar rest, the first line of the second verse – ‘When I am King you will be first against the wall’ – can be broken down into three phases. The first commences with a return to the opening note of the first and second lines of verse 1 (C4) (see Figure 3). However, instead of ascending from C4 as it does in verse 1, the pitch instead descends, first to B3♭, then to a twice-repeated A3, before returning to C4. The second phase commences with a climb to a twice-repeated D4 before a retreat back to C4. This retreat continues in the third phase with a descent to B3♭ and A3, followed by a single rise to B3♭, and a final two-note drop to G3. Conceptually, stability is conveyed across phases one and two by repetition of the origin, coupled with initial descent and climax pitches, and reinforced by two completed cycles marked by returns to the origin. However, this stability is undermined in phase three by an absence of repetition, coupled with a pitch sequence that descends sharply to three diatonic pitches below its origin (to G3); in fact, across the whole line, the pitch only rises once and to only one note above its origin.

Figure 3. Section A: verse 2, line 1.
The second line of the second verse – ‘With your opinion which is of no consequence at all’ – deviates slightly from the musical pattern established by the first line. In its first phase, rather than a descent, there is a climb akin to that in verse 1, albeit much shorter, with the highest note reached being E4♭, before a retreat back to its origin C4 (C4-C4-D4-E4♭-D4-C4) (see Figure 4). In the second phase, the high E4♭ is gained again and retreated from again (E4♭-D4-C4).

Figure 4. Section A: verse 2, line 2.
The third phase comprises a descent below the originating point of C4 to the depths of G3 in a somewhat halting sequence (B3♭-A3-B3♭-A3-G3). Once again, stability is conveyed in phases one and two by the initial repetition of the origin note, the steadfast note-by-note pitch ascent and descent, and two returns to the origin. However, this is considerably undermined by the musical ‘events’ in phase three, characterised as they are by halting descent, failure to return to the origin, and an endnote three pitches below the origin. Once again, the harmonic patterns construct the sonic analogue of an incomplete action.
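The three phases of verse 2, line 2 can be compared by measuring each note’s distance from the origin (C4) in diatonic steps. The sketch below is one possible way of doing so, not a component of Zbikowski’s framework: it counts letter-name steps and ignores accidentals, which is sufficient here because every pitch involved belongs to the G natural minor scale. Flats are written as ‘b’ and the function names are hypothetical.

```python
LETTERS = "CDEFGAB"  # octave numbers change at C in scientific pitch notation


def staff_position(note):
    """Diatonic position of a note such as 'Bb3': letter steps above C0, accidentals ignored."""
    letter, octave = note[0], int(note[-1])
    return octave * 7 + LETTERS.index(letter)


def describe(phase, origin):
    """Summarise a phase by its offsets (in diatonic steps) from the origin pitch."""
    offsets = [staff_position(n) - staff_position(origin) for n in phase]
    return {
        "returns": offsets[-1] == 0,   # does the phase end on the origin?
        "end_offset": offsets[-1],     # final note relative to the origin
        "highest": max(offsets),
        "lowest": min(offsets),
    }


# Pitch sequences of the three phases as given in the text.
phases = [
    ["C4", "C4", "D4", "Eb4", "D4", "C4"],
    ["Eb4", "D4", "C4"],
    ["Bb3", "A3", "Bb3", "A3", "G3"],
]
for phase in phases:
    print(describe(phase, "C4"))
```

Phases one and two each end with an offset of 0 (a completed return), while phase three never rises above one step below the origin and ends three diatonic steps beneath it.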
3.1.3 Coordination of syntactic layers
The regularity of the section’s rhythmic processes constructs a sonic analogue of steadfast progression towards fulfilment of a goal-directed process, or completion of the source-path-goal image schema. However, this is juxtaposed with the sonic analogue of an incomplete goal-directed process repeatedly constructed by the section’s harmonic processes; the overall sonic analogue constructed by coordination of Section A’s tonal and temporal frameworks is therefore of steadfast attempts to achieve a goal that consistently end in failure. This is only compounded by final repetition of the refrain ‘What’s that?’ and its rhythmic sustaining of the final sung syllable for two full bars. The question ‘What’s that?’ is rhythmically and harmonically unanswered at the close of Section A.
3.1.4 Music and words
How the lyrics are sung is of significance in a number of ways in this opening section. Of the 160 beats in this section, 136 contain words. Following a four-bar introduction, the lyrics of the two-line first verse commence. The first two words of each line – ‘please could’ and ‘from all’ – are each sung on a quaver (half a beat), with the following words all sung on a crotchet (a full beat) each. This rhythmic foregrounding of the first two words of each line of each verse draws attention to both the politeness marker ‘please’ and the deontic modal ‘could’, emphasising the imploring tone of the opening interrogative.
With one exception, each of the words in the first line is monosyllabic and each is sung as a single note on a single beat, with the disyllabic word ‘trying’ reduced to a monosyllable to fit this pattern. The second line of this first verse – ‘From all these unborn chicken voices in my head?’ – duplicates this rhythmic pattern and the three disyllabic words it contains are each broken into single syllables so that the same rhythmic correspondence between syllable and beat is preserved. Noteworthy is the treatment of the trisyllabic words in the second line of verse 2 – ‘opinion’ and ‘consequence’ – which stand out against the background of their largely monosyllabic counterparts; both are broken down into their constituent syllables and rendered on a rise-and-fall pattern (‘opinion’ = D4-E4♭-D4; ‘consequence’ = B3♭-A3-B3♭) to preserve the syllable–beat correspondence already patterned throughout.
During the twice-repeated refrain ‘What’s that?’, the previously established pattern (whereby each syllable is accorded a single beat) is deviated from: the contraction ‘what’s’ is treated as monosyllabic and sung on a single D5 crotchet, while the monosyllabic ‘that’ is sustained over four notes – the majority on the deviant E5♮ (seven beats), F5, E5♮ and D5 – and almost three bars, with the phrase’s repetition beginning, following a brief one-beat rest, on the final crotchet beat of the third bar. The phrase recommences on D5, but this time the word ‘that’ is sung solely in the deviant E5♮, for the whole eight beats.
In the second verse, which follows the refrain, the only disyllabic word – ‘against’ – is broken into two syllables and sung over two different notes while preserving the same syllable–beat correspondence evident in verse 1; however, the monosyllabic final word ‘wall’ is treated as disyllabic and sung over the final B3♭ and held for 2.5 beats over the final G3. The word ‘wall’ is therefore foregrounded; its deviant rhythmic rendering coupled with the pitch drop conveys disgust for the ‘you’ to be dispatched.
In terms of pitch, the opening step-by-step ascension of the G Minor scale is deviated from only once – on the word ‘the’ – to cause a ‘jump’ over F4 to G4, perhaps betraying the questioner’s panic in what is otherwise a calm request marked by positive politeness (e.g. use of ‘please’; providing reasons for the request). This is followed by another step-by-step climb to the highest note in this opening musical utterance – B4♭ – on which the phrase ‘I’m tryin’’ is sung, again suggesting anxiety in this voice seeking repose from (interior) noise. In the second line, it is the second syllable of the word ‘chi-cken’ and the first syllable of ‘voi-ces’ which are sung on the high B♭; the effect is that both words, arguably the most important in this line, are foregrounded.
Verse 1’s final word ‘rest’ comes to ‘rest’ harmonically on E4♮, the only note thus far that is not part of the natural G Minor scale (which has two flats, B and E); this pitch then constitutes a deviation from the established musical norm and acts to foreground the word ‘rest’ as both a psychological and a musical respite from the ‘noise’ all around. Rhythmically, the word ‘rest’ is sustained for 2.5 beats and followed by a wordless 10 beats, a musical representation of the ‘rest’ from words and other noise that the speaker craves.
Finally, the steadily ascending harmonic sequence found in each musical utterance mimics the steadily rising vocal pitch of spoken interrogatives. Technically, verse 1 constitutes one single interrogative, while each rendering of the refrain repeats a question, ‘what’s that?’, twice. This opening section, then, comprises five questions, all of which are lyrically, rhythmically and harmonically unanswered.
While not part of Yorke’s vocal line, absent from the official lyrics and missing from many of those unofficially reproduced, the almost inaudible and indecipherable lines ‘I may be paranoid, but not an android’, which are spoken by a humanoid voice following Yorke’s plaintive ‘What’s that?’, should be mentioned. This, the listener might conclude, is the voice of the eponymous ‘paranoid android’. Although almost completely backgrounded, these lines are an interesting representation of the dilemma at the heart of the song: the voice’s admission that it suffers from an exclusively human psychological condition – paranoia – problematises its status as an android. The fact that this repeated line both follows and precedes the question ‘What’s that?’ suggests that the entity’s ontological status remains a mystery.
3.2 Section B (1.57–3.33)
3.2.1 Rhythmic processes
Section B maintains the tempo of the preceding section – 84 BPM – but its rhythm alters to 7/8 time (seven eighth notes or quavers per bar). This is known as an irregular or ‘odd-time’ signature and is unusual in contemporary rock or pop music, which tends to be in simple 3/4 or 4/4 time. This second section contains 130 beats, 75% of which (98 beats) are wordless. Following a lyric-free guitar break of 30.5 beats, verse 3 commences. While rhythmically the time signature has altered to an irregular 7/8, the general pattern of playing each syllable on a single beat is preserved. However, the first syllable of the trisyllabic opening word of verse 3 – ‘Ambition’ – is represented in the sheet music by a ghost note (which appears in brackets); a ghost note has a rhythmic value but little or no pitch, so it sounds spoken rather than sung. The second line of verse 3 largely follows the same rhythmic arrangement as the first line, with two exceptions. Firstly, there is no initial ghost note; secondly, there is an extra (10th) syllable in the line ‘kick-ing scream-ing Gu-cci lit-tle pig-gy’, which requires an extra beat and hence four rhythmic repetitions. While verse 4 follows the same temporal pattern, its harmonic and vocal renderings distinguish it from verse 3, as will be discussed below.
Verse 4 is followed by a guitar break of, again, 30.5 beats and a final pause that lasts for eight beats, which acts as a segue into Section C. Section B, then, is marked by a change in rhythm and a change to a highly unusual rhythm, given this musical genre. So, although the musical tempo of the section remains the same as Section A (84 BPM), a substantial increase in the number of beats per bar constructs the sonic analogue of a fast-paced – comparatively frenetic – dynamic process lacking balance or stability.
3.2.2 Harmonic processes
A key change to A Minor (natural) moves the tonic from the G of the preceding section to A; it also moves the dominant from D to E. As such, there is a one-pitch increase, a change that, along with the alteration of rhythmic structure discussed above, potentially heightens the turmoil and chaos conveyed in this section. In contrast to the key of Section A, which contained two flats, the natural key of A Minor does not contain any flats or sharps.
As mentioned above, the third verse opens on a ghost note, with the last two syllables of the first word ‘Am-bit-ion’ sung on a repeated A3, which then jumps up to C4 (on ‘makes’) before returning to A3 (on ‘you’), dropping slightly to a thrice-repeated G3♯ – on ‘look pret-ty’ – before rising to C4 (‘ug-’) and finally D4 (‘-ly’) (see Figure 5). A wordless seven-beat bar in which the guitar repeats the same notes sung by the voice follows both verse 3 and verse 4. The same harmonic pattern occurs in the second line of verse 3 – ‘kick-ing scream-ing Gu-cci lit-tle pig-gy’ – although the addition of a 10th syllable to this line requires an extra G3♯ (see Figure 6).

Figure 5. Section B: verse 3, line 1.

Figure 6. Section B: verse 3, line 2.
Hence, repetition is again something of a theme in verse 3: in each line, there is repetition of the origin pitch; multiple repetitions of the lowest pitch; and both lines repeat the same melody. However, the stability conveyed by repetition is discordant in nature. There are no sharps in the natural key of A Minor, and yet the G is played as sharp, rather than natural. This raising of the natural A Minor key to its harmonic variant creates a musical dissonance, an effect considerably strengthened by the note’s seven repetitions.
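The relationship between the natural and harmonic variants of A Minor can be made explicit by constructing both scales from their interval patterns. The following is a minimal sketch; the names are my own, and for simplicity it spells every altered note as a sharp, which suits A Minor but would not give conventional spellings in flat keys.

```python
# The twelve pitch classes, spelled with sharps only (a simplifying assumption).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Semitone pattern of a natural minor scale, counted from its tonic.
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]


def natural_minor(tonic):
    """Note names of the natural minor scale built on `tonic`."""
    root = NOTE_NAMES.index(tonic)
    return [NOTE_NAMES[(root + step) % 12] for step in NATURAL_MINOR_STEPS]


def harmonic_minor(tonic):
    """Natural minor with its seventh degree raised by one semitone."""
    root = NOTE_NAMES.index(tonic)
    steps = NATURAL_MINOR_STEPS[:-1] + [NATURAL_MINOR_STEPS[-1] + 1]
    return [NOTE_NAMES[(root + step) % 12] for step in steps]


print(natural_minor("A"))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(harmonic_minor("A"))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G#']
```

The two scales differ only in their seventh degree, so the repeated G♯ is the single pitch that departs from the natural key of the section.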
The harmonic pattern of verse 4 – discussed further in the section on ‘music and words’ below – is marked by repetition of slowly descending individual notes (‘You don’t remember’; see Figures 7 and 8). Commencing on G5, the harmonic pattern of all seven lines in verse 4 is similar, with only occasional differences such as the second note being D5 rather than E5, and increased frequency of pitch repetitions. For example, across lines 4–7 (Figure 8), there are 10 repetitions of D5, and a repetition of E5 that moves from E♮ to E♭ and back again, creating further musical dissonance given the lack of flats in the natural A Minor scale. As shown in Figures 7 and 8, the pattern of the harmonic processes in verse 4 is striking: the pitch range is very narrow, spanning only two or three notes, and largely favours alternation between just two notes; the trajectory from the origin pitch is always downward and is reinforced through consistent pitch repetition; there is never any return to the origin note, with the end note always being at least one (often more) pitches lower.

Figure 7. Section B: verse 4, lines 1–3.

Figure 8. Section B: verse 4, lines 4–7.
Harmonically, Section B is sonically analogous to an ongoing process marked by disruption, as signalled by the dissonant use of sharp and flat pitch variation and strong patterns of repetition of individual pitches. The pitch’s overall failure to close on its originating note of A3 – despite an initial teasing return – reinforces this sense of disruption. The fact that the pitch range across Section B is much narrower than in Section A also adds a sense of restriction to the resultant sonic analogue.
3.2.3 Coordination of syntactic layers
The sonic analogue constructed by the section’s temporal framework corresponds well with that constructed by its tonal framework: rhythmic irregularity couples with harmonic dissonance to convey an unsettling, somewhat chaotic dynamic process that fails to reach a satisfactory end point or destination.
3.2.4 Music and words
In this section, there is a considerable increase in the number of sung notes per bar. Given the continued coordination of music with lyrics (again, every syllable is represented by a single sung note), this increase results in a staccato effect that conveys panic and discord. In the first line of verse 3, the rhythmic stresses on every second syllable emphasise the middle syllable of ‘am-
This effect is further strengthened by the dissonance created through repeated use of G3♯ instead of its natural variant, as mentioned above. In the first line of verse 3, the words ‘look pret-ty’ are sung on three successive repetitions of G3♯, while the word ‘ugly’ is sung over two notes on a rising pitch ending on D4, which, as the highest note in the verse, has the effect of hanging uncertainly. In the second line, the words ‘Gucci little’ are sung on the dissonant G♯, this time with an increased repetition of four notes. The cumulative effect is of discordance, which accentuates the lyrics very effectively: the ‘ugliness’ of ‘ambition’ is mirrored by the cacophony of musical dissonance.
In Section B, the quality of the voice is particularly meaningful. The stabbing staccato effect mentioned above is chiefly created by rhythm and musical dissonance but is further compounded by the discernible disdain in Yorke’s voice. As the section continues into verse 4, Yorke’s voice quality increasingly conveys meaning, not least because the musical pitch at which the lyrics are sung is almost indiscernible to the ear. According to the sheet music, the words ‘You don’t remember, you don’t remember, why don’t you remember my name?’ open on a high G5 as the word ‘You’ is spat out, followed by ‘don’t’ together with the first two syllables of ‘rem-em-ber’ sung one step down on F5 before the final syllable – following a sustained double-beat on ‘em’ – is sung on D5. ‘Why don’t you remember my name?’ starts with the first two words on C5; then the four syllables in ‘you remember’ are sung on D5; then ‘my’ again on C5 and finally ‘name’ is given emphasis by a two-note jump to E5 along with a doubled sustained beat of two sixteenths (semi-quavers) and then a final, unaccompanied C5, extended to a duration of three-quarters of a beat. However, the pitch and register of the notes are rendered largely incomprehensible to the listening ear because the lyrics are shouted in what appears more like a single note punctuated by rhythmic stabs. The same occurs in the lines that follow, which are now screamed by Yorke: ‘Off with his head, man. Off with his head. Why don’t you remember my name? I guess he does’. The musical notation is now rendered virtually useless to a listener; here, voice quality is a major semiotic resource, competing with rhythm and pitch and almost usurping language.
3.3 Section C (3.33–5.36)
3.3.1 Rhythmic processes
This third section is distinguished by its marked decrease in tempo from 84 BPM to 63 BPM and its return to regular 4/4 time. There are 128 beats in this section, the first 32 of which are wordless but accompanied by a vocal choral arrangement. After eight wordless bars, the voice enters on the first beat of the ninth bar with the two monosyllabic words ‘Rain down’, each of which is sustained for a full two beats; this marks the first and most noticeable departure from the one-syllable-one-beat pattern largely adhered to in the previous sections. This irregularity becomes a feature of Section C, although from bar 10 onwards syllables are held for either a single or a half-beat only.
The slower tempo and increased regularity here combine to create a rhythmic effect of controlled, if plodding, regularity. The sonic analogue constructed suggests a long, slow approach to the end point of a process.
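The beat counts given above can be cross-checked with some simple arithmetic. The following is a minimal sketch rather than part of the original analysis: it assumes a constant tempo of 63 BPM throughout the section, and it assumes that the figures in the section heading (3.33–5.36) are minutes-and-seconds timestamps. The function names are hypothetical.

```python
def section_stats(total_beats, beats_per_bar, bpm):
    """Return (bars, duration in seconds) for a passage at a constant tempo."""
    bars = total_beats / beats_per_bar
    seconds = total_beats * 60 / bpm
    return bars, seconds


def timestamp_to_seconds(stamp):
    """Convert a 'minutes.seconds' stamp such as '3.33' into seconds.

    Assumption: the section heading uses minutes.seconds, not decimal minutes.
    """
    minutes, seconds = stamp.split(".")
    return int(minutes) * 60 + int(seconds)


# Figures taken from the text: 128 beats in 4/4 time at 63 BPM,
# of which the first 32 beats are wordless.
bars, seconds = section_stats(128, 4, 63)
wordless_bars, _ = section_stats(32, 4, 63)

# Length of Section C according to the timestamps in its heading.
track_span = timestamp_to_seconds("5.36") - timestamp_to_seconds("3.33")

print(f"bars: {bars:.0f}, wordless bars: {wordless_bars:.0f}")
print(f"duration at 63 BPM: {seconds:.1f} s, timestamp span: {track_span} s")
```

Under these assumptions, the 128 beats amount to 32 bars, the 32 wordless beats to the eight bars that precede the vocal entry, and the calculated duration (roughly 122 seconds) lies within about a second of the 123 seconds implied by the section's timestamps.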
3.3.2 Harmonic processes
Harmonically, there is further key modulation in this section, which alternates between C Minor and D Minor (Griffiths, 2004: 37). Against the background of a harmonised choral arrangement, the vocals open on E5♭ with ‘Rain down’ sung on a trio of chromatically descending pitches (to C5♯). As the phrase is repeated, it ascends again above its starting-point to F5 before descending steadily again to D5, achromatically this time (see Figure 9). It then rises quickly to a ‘great height’ of F5 before dropping dramatically to C5, a pattern repeated with the second utterance of the phrase ‘from a great height’, although this time the word ‘height’ is vocally sustained over two full bars as it leaps from B4 to E5 to F5, then drops to A4, before rising again to the same height of E5-F5-E5. As seen in Figure 10, the harmonic patterns of this section are striking in their consistency: in each case a pattern of ascension to the same climactic note is followed by one of descent to a point a full pitch lower than its origin, so that the overall rise-and-fall pattern is nevertheless increasingly declining as the low pitches become successively lower. The musical trajectory is downward, despite repeated attempts at ascension.

Section C: lines 1–2.

Section C: lines 3–4.
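The descending contours described above can be made explicit by converting note names into numbers and measuring the distance between successive notes. This is an illustrative sketch only: it adopts the common MIDI convention in which middle C (C4) is numbered 60, which is an assumption not made in the analysis itself, and the helper names are hypothetical. The ‘Rain down’ sequence is taken as E5♭-D5-C5♯, following the core pattern cited in this section.

```python
# Semitone offset of each natural note above C within one octave.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"": 0, "♮": 0, "♯": 1, "♭": -1}


def to_midi(note):
    """Convert a note written as letter + octave + optional accidental,
    e.g. 'E5♭', into a MIDI-style number (assumed convention: C4 = 60)."""
    letter, octave, accidental = note[0], int(note[1]), note[2:]
    return 12 * (octave + 1) + SEMITONES[letter] + ACCIDENTALS[accidental]


def steps(notes):
    """Signed distance in semitones between each pair of successive notes."""
    numbers = [to_midi(n) for n in notes]
    return [b - a for a, b in zip(numbers, numbers[1:])]


rain_down = ["E5♭", "D5", "C5♯"]   # opening 'Rain down' pitches
great_height_drop = ["F5", "C5"]   # peak on 'great', fall on 'height'

print(steps(rain_down))            # [-1, -1]: two successive semitone descents
print(steps(great_height_drop))    # [-5]: a fall of five semitones
```

On this reckoning, ‘Rain down’ descends by one semitone at each step, which is what makes the descent chromatic, whereas the fall from the ‘great height’ of F5 to C5 spans five semitones in a single move.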
When lines 1–4 are repeated for the third time, they become background for a new vocal line beginning ‘That’s it, sir’ on the second beat of the sustained word ‘rain’. This second vocal line is based on the core harmonic pattern of the first (E5♭-D5-D5-C5♯, etc.) but uses the second beat of each note to insert an additional note, typically two pitches higher (e.g. E5♭-

Section C: ‘That’s it sir, you’re leaving / The crackle of pigskin / The dust and the screaming / The yuppies networking and’.
This makes for another striking harmonic pattern, as seen in Figure 12: the pitch sequences are marked by musical leaps; these leaps become progressively shorter, decreasing from 2.5 pitches to only half a pitch, and are always followed by a return to their point of origin; repetition of the high notes is outnumbered by repetition of the low notes; and, finally, after being repeated once (on the third note), the origin pitch is never again returned to in the whole 46-note sequence that brings us to the final word ‘yeah’. The completed cycles within this rise-and-fall pattern represent repeated attempts to get ‘off the [harmonic] ground’; as the low pitches become gradually but consistently lower, the harmonic ‘leaps’ that follow increasingly appear doomed to failure. The sonic analogue constructed is that of a dynamic process fraught with difficulty, which once again fails to reach completion.

Section C: ‘The panic, the vomit, the panic, the vomit / God loves his children, God loves his children, yeah’.
3.3.3 Coordination of syntactic layers
The section’s decreased yet regular tempo, coupled with its jerky and ultimately descending harmonic patterns, constructs the sonic analogue of a dynamic process that is unlikely to end in success, as represented by a source-path-goal image schema.
3.3.4 Music and words
One of the most striking correspondences of music and words in the song occurs in this section, with the phrase ‘rain down’ repeatedly sung on a sequence of notes predominantly marked by descending pitch. The repeated phrase ‘from a great height’ is, however, cleverly sung on an ascending pitch which peaks on ‘great’ and drops below its starting-point on the word ‘height’, harmonically depicting reaching a height and then falling from it. As mentioned above, repetition of the word ‘height’ is vocally sustained both temporally and tonally, ensuring that it is foregrounded for the listener.
The ‘rain down’ section is also marked by the number of rests it contains; with the exception of the first and last bars of this repeated section, every bar is marked by either a half (minim), quarter (crotchet) or eighth (quaver) rest. Quaver rests are used to great effect in the second vocal line of Section C (commencing ‘That’s it, sir’): with only one exception, the pitches are consistently arranged into sequences of three notes divided by a quaver rest. These rests have a jerky, disjointing effect on the vocals, emphasising the distinct trisyllabic lyrical groupings that they create, e.g. ‘the pan-ic, the vom-it’. The exception mentioned above occurs on ‘the yuppies networking, and’. This phrase, integral to both the meaning of this section and the song overall, is thus highlighted: the ‘yuppies’ perpetuate the ‘panic’ and chaos, and are the source of that ‘noise’ from which repose is sought at the opening of the track.
The ‘panic’ and ‘vomit’ are also musically underpinned by the pattern of continually ascending and descending pitch, which rises from but always descends back to the originating note. This pattern is repeated until it sinks to B4♭ on the words ‘the panic’, and sinks further to A4 on ‘the vomit’, before descending to its lowest, G4♯, on the first word of the final phrase, ‘God loves his children’. This final phrase is played on a twice-repeated pattern of G4♯-B4-G4♯-A4-B4♯ before closing – on the song’s final word, ‘yeah’ – on a fall back to A4. The word seems sarcastic in the context of the pitch drop and its effect is to undermine the phrase: God, it is being suggested, does not love his children at all. The layering of these two vocal lines over a choral arrangement forces the listener to construct further analogous structures. The negativity conveyed in the plaintively rendered ‘rain down’, sung on descending pitches over a listing of the ills of contemporary society, works to reinforce the sonic analogues constructed by each of the syntactic processes in isolation as well as that constructed by the coordination of syntactic layers. Cumulatively, the effect is of chaos and futility.
As a result of the musical and lyrical disparities evident across all three sections of this song, it may be assumed that tying the semiotic threads together is no mean feat. Although beyond the parameters of this article, one means of doing so, following Zbikowski, is by constructing a conceptual integration network (see Fauconnier and Turner, 1998, 2002). Whilst not essential to a musical grammar analysis (as the construction of a sonic or any other type of analogue does not automatically result in a conceptual blend), a conceptual blending analysis does offer a useful means of identifying how the meanings distilled from the text’s music interact firstly with one another, and subsequently with its language, to create the song’s overall meaning. Representing this visually via a conceptual integration network also offers a means of presenting the often-unwieldy amount of information generated by a multimodal analysis in a more streamlined and comprehensible manner. However, distillation of each semiotic mode into its respective sonic analogue also offers us a convenient means of drawing the song’s layers together. In ‘Paranoid Android’, Sections A, B and C all stimulate the construction of sonic analogues of unfulfilled dynamic processes. Common to both the musically activated sonic analogues and the key concepts of the lyrics is the perception that goal-directed actions are, in each case, doomed to failure, regardless of how the process is undertaken. Lyrically, this is evidenced in numerous ways: in the opening interrogative which goes unanswered; the request for silence that will not be granted; the seeking of repose that will never be found; ‘what’s that?’, another question that will not be answered; the seeking of an authority that will never be conferred; the quest for a world in which the opinions of the powerless may be of consequence, which will never be realised. 
This is reinforced in Sections B and C, in which the processes referenced are either inherently redundant (e.g. ‘kicking’, ‘screaming’); construed as worthless (e.g. the yuppies ‘networking’); or are, again, unfulfilled processes (e.g. the failure to remember). Identification of these commonalities enables the construction of new or emergent meaning; in this case, the realisation that all attempts to achieve harmonious communion with society – to fight the alienation and isolation endemic to modern living – are useless, doomed to be thwarted by inherent societal deficiencies. The listener is being shown the absurdity of modern living, and is led to question the wisdom of trying to effect change in a world where the best-laid plans are subject to the arbitrariness of fate and/or the whims of authority figures (‘Off with his head!’). God does not love his children. Modern existence, we are being told, is futile.
4. Conclusion
Zbikowski’s theory of musical grammar offers a means of converting musical utterances into conceptual constructs, which can then be analysed in a manner that facilitates cross-modal investigation. This analysis has shown not only how musical utterances communicate meaning, but also the myriad ways in which musical meaning can be heightened through interaction with other semiotic resources.
The purpose of this analysis was, primarily, to explore the extent to which Zbikowski’s theory of musical grammar provides a way of systematically converting musical utterances into tangible cognitive constructs ripe for further analysis. By his own admission, Zbikowski’s framework is ‘preliminary’ and aims to stimulate ‘further thought about the nature and structure of musical organization’ (2017: 25). He tests his theory on the genre of German Lieder, 19th-century compositions arising from the creation of a musical setting for an existing, usually poetic, text. The interaction between music and language in such a genre is, then, marked by a purposeful fitting of one semiotic mode to the other, a trait not necessarily found in other musical texts. The lyrics of ‘Paranoid Android’, for example, were written months after its musical composition; the musical composition was, in turn, a haphazard splicing together of three musical arrangements by three different band members (as evidenced in its eclectic structure), which makes this song a rather different (and difficult) subject for analysis. Also, faithful application of Zbikowski’s theory at times demanded an expertise in musical theory and composition beyond the capabilities of this author. By the same token, Zbikowski, as a music theorist, largely leaves the text’s linguistic structure untouched in his own applications of musical grammar. Brower’s (2000) work on the image schemas underpinning musical interpretation is certainly worth further investigation, as it offers glimpses of a level of empirical verification often missing from Zbikowski’s theory; however, the density of its musical terminology leads to an increasing impenetrability that makes a convincing case for cross-disciplinary collaboration. The ideal, then, is cross-pollination not only of ideas, but also of expertise between music theorists and linguists.
Acknowledgements
The author would like to thank the peer reviewers for their extremely helpful feedback.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Glossary
♮ = natural version of note
♯ = sharp version of note
♭ = flat version of note
numbers in subscript denote the octave, e.g. ‘middle C’ is C4
crotchet = a quarter note
quaver = an eighth note
semi-quaver = a sixteenth note
tonic = main note in any key, e.g. tonic of C major is C
dominant = fifth note in any key, e.g. dominant of C major is G
subdominant = fourth note in any key, e.g. subdominant of C major is F
