Abstract
Atonal compositions based on the 12-tone method devised by Arnold Schoenberg remain, in some cases a century after they were written, largely unpopular with music audiences. Research on the science of music cognition may now offer some clues to why this is. Schoenberg's method of atonal composition actively undermines some of the basic cognitive principles that allow our brains to turn notes into music. Unless 12-tone music is granted other aids to cognition, it may thus fail to create a cognitively coherent auditory experience and become instead a mere collection of sounds.
Introduction
In 1958, the American composer Milton Babbitt wrote an article in the music magazine High Fidelity that still provokes fury today (Babbitt 1958). His title said it all: ‘Who Cares If You Listen?’ Responding to widespread accusations that modern classical music was incomprehensible to, and disliked by, most of the music-loving public, Babbitt argued that not only should modern composers be unconcerned at this animosity but they should welcome it. If the public had no interest in this new music, he said, the composer should stop worrying, making compromises and indulging in exhibitionism to attract an audience, and simply get on with ‘his’ craft — or perhaps, he might have been tempted to say, his science.
In the 1950s, classical composition was dominated by the form of atonalism developed at the start of the century by the Austrian composer Arnold Schoenberg and his followers in the so-called Second Viennese School, most notably Anton Webern and Alban Berg. Schoenberg devised a prescriptive system, described below, for composing music that lacked a tonic centre around which the melodies and harmonies were rooted: it gave equal status to all the notes of the chromatic scale. Within the academic spheres of music composition, Schoenberg's ‘serialist’ or ‘12-tone’ technique came to be seen as the only respectable way to write music, to the extent that any attempts to compose within the old tonal tradition were regarded as recidivist, decadent and vulgar. Babbitt was one of a group of composers who extended Schoenberg's serialist constraints on the way pitch was organized to embrace other musical parameters such as rhythm and dynamics, leading to a mode of composition called total serialism in which tightly prescribed rules dominated the composer's practice.
The result was a kind of music that, to many listeners, sounded fragmented, bleak and inaccessible. In contrast to the experiments in chromaticism, dissonance and rhythmic irregularities practiced by composers such as Wagner, Prokofiev, Stravinsky and Bartók, which were at first greeted with bafflement and even outrage by the musical public but have now contributed much-loved pieces to the standard repertoire, the atonalism of Schoenberg and his followers continues to be deemed ‘difficult’ by many concert-goers. Some of these works, such as Berg's opera Lulu, are considered by many critics to be masterpieces of modernism. But many are rarely performed, and are still regarded as commercially risky by concert programmers unless leavened with more popular pieces from the older tonal repertoire.
Whence this resistance? Babbitt put it down to conservatism and ignorance. In his High Fidelity article, he acknowledged that the ways in which pitch and other musical parameters were used in extreme atonalism ‘makes ever heavier demands upon the training of the listener's perceptual capacities’. This, however, was granted not in recognition of the difficulties with which a listener was confronted, but as the prelude to an impatient criticism of the poverty of intellectual resources that audiences brought to this complex new music. Because they are so bad at remembering precise values of pitch, register, dynamics, duration and timbre, said Babbitt, most listeners ended up ‘falsifying’ the intentions of the composer. Why on earth should the public expect to understand this advanced art form, any more than they would understand advanced mathematics?
Many people were outraged by Babbitt's remarks because they considered them to be a dereliction of the musician's duty to communicate. Yet it is not wholly unfair to impute a degree of conservatism to a large proportion of the classical audience: while some have eclectic and experimental tastes, many others still consider Bartók forbidding and post-war atonalism unlistenable. The schedules of any major concert venue, or the sales statistics of recordings of classical music, confirm the suspicion that the public taste remains firmly rooted in tonality: for example, none of the 50 top-selling composers through the UK music retailer HMV in 2003–2008 composed in an atonal style. At London's Royal Festival Hall, ‘contemporary classical music’ — including just about anything atonal — is split off from the main programmes of the resident London Philharmonic and the Philharmonia Orchestras and assigned to the London Sinfonietta, as if to imply that it forms virtually a separate genre. Even during the 1950s, at the height of serialism's popularity among composers, just ten tonal composers (Bach, Handel, Haydn, Mozart, Beethoven, Schubert, Chopin, Wagner, Brahms and Debussy) were responsible for 39 per cent of all music performed in the then-current repertoire (Moles 1966).
This is not simply an aversion to novelty. The return to tonality evident in the music of Steve Reich, John Adams and Philip Glass has been greeted with an enthusiasm that Pierre Boulez (one of the most prominent total serialists) has never been afforded. Dissonance abounds in the works of young contemporary composers such as Thomas Adès and James MacMillan, yet audiences will give them more of a chance than arch-serialists such as Babbitt or Luciano Berio.
The half-century of scientific research on music cognition since Babbitt published his article now provides material for a more informed and fruitful debate about the position he espoused. It enables us to enquire whether there are more deep-rooted, cognitive reasons for the cool reception this ‘modern’ music has received. As we come to understand more about what it is that enables the human mind to ‘think musically’, we might ask whether a distinction between ‘sound’ and ‘music’ can reasonably be made on the grounds of cognitive transparency. Atonal music has tried systematically to eliminate at least some of the characteristics that traditionally render the music of most cultures comprehensible. My contention is that Schoenberg's atonalism need in itself be by no means an inaccessible system for composing music, but that unless composers working in that tradition recognize the fundamental cognitive needs and limitations of the listener, they risk not so much making music as arranging notes. The two are not the same thing.
The perceptual origins of tonality
Western tonal music — which includes just about all Western music from the early Renaissance to the late nineteenth century — has a key, and therefore an associated scale and tonic. In effect, this means that, out of the 12 notes in the chromatic scale,1 some are privileged over others. Since the early sixteenth century, this privileging has generally been governed by the use of diatonic scales — the major and minor (Figure 1). These scales select seven from the full gamut of 12 chromatic notes, and the composition is structured around this subset. To put it (too) crudely, the scale tells you which notes in a particular key ‘sound right’: in the key of C major, say, they are all the white notes of the piano.

Figure 1. The major and minor diatonic scales.
This certainly does not mean that the diatonic notes are the only ones that may be used in a tonal composition, however. What is it, then, that really distinguishes tonality from atonality? The perceptual rules we use for establishing a sense of tonality when we listen to music do not derive from any musical theory; they are purely statistical. That is to say, we learn by acculturation which notes to ‘expect’. We begin this learning at birth, or possibly before (Hepper 1991), and have mostly mastered it at an early age — 5-year-old children can generally identify the difference between ‘in key’ and ‘out of key’ (diatonic/non-diatonic) notes in simple tonal melodies (Krumhansl and Keil 1982; Trainor and Trehub 1994; Trehub 2003).
What the key of a piece of tonal music determines is not ‘which notes may be used’, but the probabilities of the various notes it contains: the chance that any note in the music, picked at random, will belong to a specific pitch class. A composition in the key of C major is more likely to contain a G, say, than an F# or a C#. The probability distributions of notes encode how many times each note occurs in a piece, or equivalently, the relative probability that a note chosen at random will belong to a particular pitch class. These distributions are easy to deduce simply by counting up notes. For Western classical music, they turn out to be remarkably stable across many periods and styles (Huron 2001) (Figure 2).

Figure 2. The frequency of occurrence of pitch classes for major-key Western tonal music of the eighteenth to early twentieth centuries. The sample (all transposed to C major) consists of: songs by Schubert and Schumann, arias by Mozart and Mendelssohn, lieder by Richard Strauss and cantatas by Johann Adolf Hasse. The width of the band spans the range of values for each of these groups.
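The counting procedure behind such a distribution can be sketched in a few lines. This is only an illustration, not the method used to produce Figure 2: it assumes a melody encoded as MIDI note numbers (middle C = 60), and the example tune and the function name `pitch_class_distribution` are hypothetical choices made here.

```python
from collections import Counter

PITCH_CLASS_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_distribution(midi_notes):
    """Return the relative frequency of each of the 12 pitch classes.

    Octave is discarded by reducing each MIDI note number modulo 12,
    so every C (in any register) counts towards pitch class 0.
    """
    if not midi_notes:
        raise ValueError("need at least one note")
    counts = Counter(note % 12 for note in midi_notes)
    total = len(midi_notes)
    return [counts.get(pc, 0) / total for pc in range(12)]


# Illustrative input (an assumption, not data from the article):
# the opening of 'Twinkle, Twinkle, Little Star' in C major.
melody = [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60]
distribution = pitch_class_distribution(melody)

for name, probability in zip(PITCH_CLASS_NAMES, distribution):
    print(f"{name:2s} {probability:.2f}")
```

Even in so short a tune the chromatic pitch classes receive zero weight while the tonic and fifth come out on top, which is the kind of unevenness the figure shows on a much larger sample.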
Krumhansl and her co-workers have shown that this salience of notes in musical practice closely conforms to that in our subjective impression (Krumhansl 1990). In a typical study, they establish a tonal context (a sense of key) by playing a chord, a scale, or a short sequence of chords of the sort that might end a musical phrase or melody (called a cadence). They then immediately play their subjects a note in the chromatic scale and ask them how well it seems to ‘fit’ the context. The results are remarkably consistent, regardless of the extent of the listeners’ musical training (Figure 3). Krumhansl calls this subjective evaluation of the ‘rightness’ of notes the tonal hierarchy. The tonal hierarchy tells us what we would intuitively expect. The common notes, the peaks in the distribution, are all in the diatonic (here the major) scale, and the troughs are all chromatic notes outside the scale. The latter are all used more or less equally rarely. The notes of the major triad chord (the tonic, third and fifth of the scale, here C-E-G) are the most frequently used.

Figure 3. The ‘tonal hierarchy’: how people rate the ‘fittingness’ of notes within the context of C major. Inset: Comparison with the actual note frequency distribution for Western tonal music.
The tonic note is the most salient of all: it is the note that centres the melody. Although it is normally applied only to Western music, the word ‘tonal’ is appropriate for any music that recognizes a hierarchy in which note use is favoured to different degrees. That is true of the music of most cultures.
This differentiation of notes is a cognitive crutch: it helps us interpret and remember a tune. The notes higher in a hierarchy offer landmarks that anchor the melody, so that we don't just hear it as a string of so many equivalent notes. Music theorists say that notes higher in this hierarchy are more stable, by which they mean that they seem less likely to move off somewhere else. The tonic, being the most stable of all, is where most popular melodies come to rest.
The notion that some notes are more stable than others can be turned on its head to say that some are more active, tending to ‘push’ the melody off elsewhere. We can think of the pitch space as a topographic landscape in which more stable notes correspond to the valleys (Figure 4). A melody is then like a stream of water that seeks the low ground. From any point, it will tend to run towards the nearest depression: the nearest note of greater stability. More stable notes exert a pull on nearby less stable ones. In C, an F is pulled down towards E, but also up towards G. An A gravitates down towards G, but a B tends to rise up to C. Chromatic (‘out-of-scale’) notes are particularly unstable and are likely to move quickly onto more stable ones: an F# to a G, an Eb to a D or an E. Such notes are generally just ‘passing tones’, of brief duration.

Figure 4. Note stabilities, and the ‘tendencies’ of melodic movement. Solid lines show stronger tendencies, dotted lines show weaker ones.
Although Krumhansl's tonal hierarchy is very similar to the actual distribution of notes (see inset, Figure 3), it is not immediately obvious what is cause and what is effect. What, ultimately, makes us decide that a G is a better fit in the key of C than is an F# — is this learnt or innate? A correlation analysis of the tonal hierarchy against measures of consonance and actual tonal distributions in Western classical music of the eighteenth and nineteenth centuries suggests that learning of statistical probabilities is far more important than intrinsic consonance (Krumhansl and Kessler 1982).
Krumhansl's methods have been criticized (Aarden 2003; Huron 2006), but her point that statistical learning guides or even governs our perception and expectation of the notes that make up a melody is widely accepted. The implication is that we form a mental image of the tonal hierarchy, and refer to it constantly to develop anticipations and judgements about a tune, whether we are listening to a nursery rhyme or a Bach cantata, in order to locate the key and tonic. Even people lacking any musical training will typically deduce a simple tune's tonal centre within a few seconds (Cohen 1977). A general sense of tonality, which entails an ability to sing a song without constantly wandering out of key, develops in most people by the age of five or six without any formal training (Trainor and Trehub 1994; Gardner 1981; Gardner et al. 1981; Sloboda 1985). By seven, many children can detect a key change — a switch to a different tonal hierarchy — in the middle of a familiar song (Sloboda 1985).
If music were simply a matter of following gravity-like attractions from note to note, there would be nothing for the composer to do: a melody would be as inevitable as the path of water rushing down a mountainside. The key to music is that these pulls can be resisted. It is the job of the musician to know when and how to do so. If there were no underlying tendencies, no implications within each note of which one will follow, we would be indifferent to the outcome, and all melodies would sound like the same random meandering. The effect of a tune is determined by whether it follows the attractions or resists them. This is one of the fundamental principles of how music exerts its emotional power: it is a question of whether or not the music meets our expectations of what it will do next (Meyer 1956; Huron 2006). The tonal hierarchy and the different stabilities of musical notes create a context of expectation and anticipation, which the composer or performer manipulates to make music come alive and convey something akin to meaning. If the melody moves from a note of lesser to greater stability, we sense a reduction of tension, as though some constraint has been relaxed. In short, the tonal hierarchy provides the tonal composer with a framework for creating a sense of purpose, meaning and intentionality in music.
Enter serialism
In the finale of his Second String Quartet, written in 1907, Arnold Schoenberg omitted a key signature: an admission that assigning a key to this music was meaningless because it said nothing about the distribution of pitches. There was no longer a tonic note; the music was atonal.
We can now see more clearly what this notion implies. It is not the abandonment of a key signature per se that matters. Others, such as Erik Satie, had previously omitted an initial indication of key, finding it more convenient simply to annotate the various accidentals (sharps and flats) as they occurred. And conversely, Schoenberg could have written his finale with a key signature but merely applied the necessary accidentals where needed. The reason this music can be considered atonal is not because it formally dispenses with a tonic, but because it does so in perceptual terms: if we listen with an ear attuned to tonal music, we can't make out where the piece is rooted. To put it another way, the tonal hierarchy of Figure 3 no longer applies: it is not a good guide to what notes we should expect to hear. Yet just about everyone in Western culture, both in the early twentieth century and today, grows up hearing and learning this tonal hierarchy, and so will instinctively try to apply it to atonal music. This is why many find such music baffling: they have no conceptual tool for navigating it.
Cross-cultural studies have shown that new tonal hierarchies can be learnt rather quickly, and can then help us make sense of music from unfamiliar cultures and styles (Castellano et al. 1984; Kessler et al. 1984; Krumhansl et al. 2000; Eerola 2004). But the point about Schoenberg's atonality is not that it has a different tonal hierarchy; it is that this music has none.
That is quite deliberate. Schoenberg designed his atonal music explicitly to make it so. He recognized how strong our urge is to identify a tonic, and assumed (rather too simplistically) that we generally do this on the basis of note statistics: we assign the most common note as the tonic (Schoenberg 1975, 246). In order to remove all traces of tonality, he concluded that it is not enough simply to use plenty of chromatic notes outside the diatonic scale; we have to ensure that no note is played more often than any other.
This is the objective of Schoenberg's serial or 12-tone scheme. It requires the composer to begin by creating a tone row in which every note (one should really say pitch class) in the chromatic scale is arranged in a particular sequence (Figure 5). This is the raw material of the composition: the sequence specifies the order in which notes may be used. The 12-tone row must be sounded in its entirety before it may repeat. In this way, no tone can acquire any more significance than any other, and so there is no possibility of a sense of a tonic note emerging even by chance. The hierarchy is flattened by fiat.

Figure 5. A 12-tone row and its permitted permutations.
Schoenberg permitted various permutations of the tone row, related to the original one by symmetry operations: the order of notes may be reversed (a retrograde row), or it may be inverted, as if in a mirror plane through the first note. Or reversals and inversions may be combined (Figure 5). Moreover, each note can be sounded in any octave, and individual notes can be repeated before the next one is sounded. In Schoenberg's original scheme, the composer was also free to choose rhythm, dynamics and so on, although later total serialists precluded these options.
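The row forms just described are simple operations on a list of pitch classes, and a short sketch may make them concrete. It assumes pitch classes numbered 0 to 11 (C = 0); the particular row and the function names are illustrative choices made here, not taken from Figure 5 or from any Schoenberg score. Inversion is implemented as the text describes it, as a mirror through the first note.

```python
def is_tone_row(row):
    """A valid row uses each of the 12 pitch classes exactly once."""
    return sorted(row) == list(range(12))


def retrograde(row):
    """Reverse the order of the notes."""
    return row[::-1]


def inversion(row):
    """Mirror every pitch class about the first note of the row.

    An interval of +n semitones from the first note becomes -n,
    so the first note itself is unchanged.
    """
    first = row[0]
    return [(2 * first - pc) % 12 for pc in row]


def retrograde_inversion(row):
    """Combine the two operations: invert, then reverse."""
    return retrograde(inversion(row))


# An arbitrary illustrative row (an assumption, not a row from the article).
row = [0, 11, 7, 8, 3, 1, 2, 10, 6, 5, 4, 9]

forms = {
    "prime": row,
    "retrograde": retrograde(row),
    "inversion": inversion(row),
    "retrograde inversion": retrograde_inversion(row),
}
for name, form in forms.items():
    print(f"{name:22s}{form}  valid row: {is_tone_row(form)}")
```

Each transformed form is itself a valid row, so the flat distribution of pitch classes is preserved whichever form the composer chooses.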
The cognition of serialism
Serial atonalism purposely sets us adrift from any locus to which expectations of ‘the next note’ can be pinned. To many listeners, this simply provokes irritation, exasperation or boredom — the music seems incomprehensible. To others, the effect is pleasantly arousing: like any other confusion of our expectations, it stimulates careful listening and invokes a sense of tension. Such tension can never be resolved in the ways that it is for tonal music — there is ‘no way home’ to a stable tonic centre — but atonalism can perform a delicious juggling act, offering little hints of structure and logic that keep our attention alive.
Krumhansl et al. have tested listeners’ responses to serial compositions using the same methods that were used to establish the tonal hierarchy (Krumhansl et al. 1987). They played the subjects tone rows or excerpts from serial compositions to establish a context, and then asked them to assess the fittingness of each note in turn from the chromatic scale. The responses varied hugely, and it is hard to discern any general rules in how the listeners try to organize what they hear. In general, however, they showed signs of searching (fruitlessly) for hints of structure based on their expectations from tonal music. This seems to lend some support to the assertion of composer Paul Hindemith that trying to avoid tonality is, for most listeners, ‘as promising as attempts at avoiding the effects of gravitation’ (Hindemith 1961, 64–65).
Yet Schoenberg's method is not by itself guaranteed to eliminate all trace of tonality. Consider the tone row in Figure 6, which obeys Schoenberg's rules. It starts with the ascending major scale of C, and ends with a descending pentatonic (five-tone) scale on F#. So this tone row will create two local sensations of tonality, in C and F#. Here we would be finding a sense of key not on the basis of overall note statistics (which remain ‘flat’: there is still no tonal hierarchy), but because of our learnt associations of groups of notes with diatonic scales and with certain pitch steps. In other words, we are persuaded for a brief moment to ‘hear with tonal ears’: to imagine a tonal hierarchy where there is in fact none.

Figure 6. A tone row that creates two local sensations of tonality.
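A row of this shape can be checked mechanically. The sketch below rests on two assumptions flagged in the comments: the exact descending order of the five pentatonic notes is a guess (the figure itself is not reproduced here), and ‘fits entirely within a major scale’ is used as a deliberately crude stand-in for a local sensation of tonality. The helper name `major_keys_containing` is hypothetical.

```python
from collections import Counter

MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitones above the tonic

# Ascending C major scale: C D E F G A B.
c_major_ascending = [0, 2, 4, 5, 7, 9, 11]
# ASSUMPTION: the five remaining pitch classes, descending as D# C# A# G# F#.
f_sharp_pentatonic_descending = [3, 1, 10, 8, 6]

row = c_major_ascending + f_sharp_pentatonic_descending


def major_keys_containing(segment):
    """Return the tonics (0-11) whose major scale contains every note of the segment."""
    keys = []
    for tonic in range(12):
        scale = {(tonic + step) % 12 for step in MAJOR_SCALE_STEPS}
        if set(segment) <= scale:
            keys.append(tonic)
    return keys


counts = Counter(row)
statistics_are_flat = all(counts[pc] == 1 for pc in range(12))

print("each pitch class used exactly once:", statistics_are_flat)
print("major keys fitting notes 1-7: ", major_keys_containing(row[:7]))
print("major keys fitting notes 8-12:", major_keys_containing(row[7:]))
print("major keys fitting whole row: ", major_keys_containing(row))
```

The overall note statistics are flat and no major key accommodates the whole row, yet the first seven notes fit C major alone and the last five fit only the keys of C#, F# and B. The local cue lies in the grouping, not in the counts.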
If one were to choose tone rows at random, one would quite often find little groupings like this (albeit generally less extreme) that give a momentary sense of tonality. Some 12-tone composers, including Stravinsky in his later works and indeed even Schoenberg himself, used rows that create, apparently by design, momentary tonal effects in this way. But Huron and von Hippel find that, on average, Schoenberg's tone rows have fewer local groups of notes that give a hint of tonality than a random selection would provide (Huron and von Hippel 2000). In other words, it seems that Schoenberg preferentially selected those rows that banish tonality most effectively. For this reason, Huron argues that serial composition should not be regarded as ‘atonal’ at all. Rather, the system is deliberately contratonal: not casually ignoring tonality, but taking great pains to eliminate all trace of it. It seems that Schoenberg did this unconsciously — there is no sign he was aware that the 12-tone method needed something more to achieve his contratonal objective.
Gluing a tune together
Traditional tonal music does not by any means attain its comprehensibility solely from the tonal hierarchy. There are many other cognitive mechanisms at play that weave sequences of notes into coherent streams (while keeping these streams distinct from one another), and which create a sense of trajectory and intention in music (Deutsch 1982a; Bregman 1990; Huron 2001, 2006). These mechanisms stem from the tendency of the human mind to seek meaningful associations between groups of stimuli: does this belong with that? The default position is that it probably does, unless there's good reason to think otherwise: we are pattern-seekers.
The key mechanisms by which such associations are identified were proposed by the German-based Gestalt psychologists in the late nineteenth and early twentieth centuries, who argued that the mind possesses holistic organizing tendencies that make perceived experience more than the sum of its parts. By grouping and separating sensory stimuli into discrete objects, we make the world intelligible. We form continuous objects from fragments, for example, if more distant objects are broken up by intervening ones. And we learn to assume continuity: to expect, say, that when an aeroplane passes behind a cloud, it will appear on the other side.
The gestalt principles are most easily understood in visual terms. We might group objects by similarity, or proximity, or by the presumption of smooth contour (‘good continuation’) (Figure 7). Our comprehension of these visual associations requires no conscious effort: in cognitive terms, we can parse the stimuli very readily. In other words, the gestalt principles operate ‘out of sight’. All of these visual principles have sonic analogues, and this means that music is only indirectly related to the acoustic signals generated by the performers. What we ‘hear’ is an interpretation, a best guess in which our brains seek to simplify the complicated soundscape by unconsciously applying the gestalt principles, which have been found from experience to do a fairly reliable job of turning the sound into a hypothesis about the processes that created it.

Figure 7. The gestalt principles of similarity (a), proximity (b) and good continuation (c).
The principle of good continuation, for instance, is one of the governing factors in the cognition of melody. It is widely recognized that the contour of a melodic theme determines how easy it is to process. Melodies that advance in small pitch steps tend to be perceived as continuous, while large pitch jumps threaten to break this continuity. This is reflected in the fact that the probability of pitch steps in the music of most cultures has a universal tendency to decrease as the steps get larger (Huron 2001). In C major, there is a stronger probability that, say, a C will be followed by the D above it than by an F. (This explains why the peak for D in Figure 2 is higher than that in Figure 3 — Krumhansl's method doesn't really take into account expectations of melodic trajectory.) This preference for smooth melodic lines is an intrinsic organizing characteristic of auditory cognition.
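The preference for small steps is easy to measure in any given tune by tallying the sizes of successive pitch intervals. The following is a minimal sketch under stated assumptions: notes are MIDI numbers, the example melody (the opening of the ‘Ode to Joy’ theme, transposed to C) is an illustrative choice made here, and a single tune proves nothing about the cross-cultural tendency reported by Huron.

```python
from collections import Counter


def interval_size_counts(midi_notes):
    """Count the absolute size, in semitones, of each step between successive notes."""
    return Counter(abs(later - earlier)
                   for earlier, later in zip(midi_notes, midi_notes[1:]))


# Illustrative melody (an assumption): E E F G G F E D C C D E E D D
melody = [64, 64, 65, 67, 67, 65, 64, 62, 60, 60, 62, 64, 64, 62, 62]
counts = interval_size_counts(melody)

for size in sorted(counts):
    print(f"{size:2d} semitones: {counts[size]} times")
```

Here every step spans two semitones or fewer. Run over a large corpus, the same tally would give the step-size distribution to which the text refers.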
Smoothness of melodic contour can be maintained irrespective of systems of tonality, as indeed the tone row in Figure 6 implies. Berg's serialist Lyric Suite (1925–6), for example, makes use of predominantly small pitch steps in its melodic lines. But because Schoenberg insisted on the principle of octave equivalence (a pitch class in the tone row can be sounded in any octave), his rules promoted the notion that smooth contour need not matter. In other words, it was the absolute pitch class of each note that was important, not its distance in pitch from the preceding note. As a result, many atonal compositions have a jagged, disjointed pitch profile. In the total serialist approach, which prescribes the choice of register (or, in some cases, assigns this at random) as well as pitch class, smooth melodic contours may indeed be more or less banished (Figure 8). It is for this reason, and not because of the high degree of chromaticism or musical dissonance, that detractors of Schoenberg's music are right in a sense to say that it typically has no ‘tune’.

Figure 8. (a) Extract from Pierre Boulez's Structures Ia (1952). (b) Extract from Boulez's Piano Sonata No. 1 (1951), second movement.
The extract in Figure 8a also makes clear what happened to rhythm in total serialist compositions: namely, that it was fractured to the point of invisibility. Rhythmic regularity is another of the cognitive aids on which tonal music has traditionally relied: gestalt-based processing looks for some kind of rhythmic continuity, such as what we might recognize in musical terms as a sense of metre. Even the striking irregular rhythms in Stravinsky's Rite of Spring jolt our expectations because they are created by the uneven emphasis of notes or chords spaced evenly in time; that is to say, they occur within a regular metrical pulse. The effect is disconcerting and, for many listeners, enlivening, precisely because of this interplay between the regular and the irregular: we are given the material to develop expectations, even if these are then manipulated and violated. As a result, the pattern is disjointed but nonetheless comprehensible. In contrast, extreme serialism of the sort practised in Boulez's Structures Ia takes pains to eliminate any sense of a metrical grid. The events seem to be independent of one another, and we have no basis for formulating any expectations at all about when they will occur. As a result, the music can seem rhythmically formless: just a collection of isolated events.
As an organizing principle in serialism, surely the most obvious candidate is the tone row itself. Won't we hear it as a kind of tune simply because we hear it again and again? Apparently not, because we don't really hear the tone row again and again. For one thing, in the absence of guidance from a tonal hierarchy or melodic contour, it becomes just a series of 12 notes, which is rather too many for the human brain to recall easily in sequence (Miller 1956): just try memorizing a random series of a dozen numbers. Furthermore, the brain simply does not encode sequences of notes in permutational form, but rather uses hierarchical structures as aides-mémoire. For example, Deutsch has found that the phrase in Figure 9a is more easily remembered if structured as in Figure 9b than in Figure 9c, even though they contain exactly the same sequences of notes, because the pauses in the first example divide the sequence into groups of notes with identical contours (sometimes called parallelism, because the successive melodic contours stay parallel to one another) (Deutsch 1980; 1981). In other words, there is a transparent hierarchy in the pitch grouping patterns in the former case. In a very real sense, there is less information to recall in that instance, because the repeating pattern makes it possible to condense the information of the whole sequence into a more concise form. Serial composition does not lend itself to this kind of hierarchical structuring: the tone rows do not, for a start, ‘sit’ on a tonal hierarchy, and the serial idiom does not tend to arrange rows into fragments that share similar contours.

Figure 9. People can recall sequences of tones more accurately if they are grouped in ways that impose easily heard regularities, for example, repetition of a pitch contour. The sequence in a is recalled more accurately if pauses are inserted between groups of three notes (b), emphasizing their identical contour. But if the pauses disrupt this repetitive structure, as in c, recall is considerably worse: the sequence ‘makes less sense’.
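The effect of grouping on the amount of information to be recalled can be illustrated with a toy sequence. This sketch does not use Deutsch's actual stimuli: the 12-note sequence is a hypothetical one built from a rising three-note figure, contour is reduced to the up/down/same direction of each step, and the group sizes of three and four stand in for the pause placements of panels b and c.

```python
def contour(notes):
    """Direction of each step: +1 for up, -1 for down, 0 for a repeated note."""
    return tuple((later > earlier) - (later < earlier)
                 for earlier, later in zip(notes, notes[1:]))


def group_contours(sequence, group_size):
    """Split the sequence into consecutive groups and return each group's contour."""
    groups = [sequence[i:i + group_size]
              for i in range(0, len(sequence), group_size)]
    return [contour(group) for group in groups]


# Hypothetical sequence: a rising three-note figure restated one scale step higher each time.
sequence = [60, 62, 64, 62, 64, 65, 64, 65, 67, 65, 67, 69]

by_threes = group_contours(sequence, 3)  # pauses that respect the repeating figure
by_fours = group_contours(sequence, 4)   # pauses that cut across it

print("groups of three:", by_threes, "distinct contours:", len(set(by_threes)))
print("groups of four: ", by_fours, "distinct contours:", len(set(by_fours)))
```

Grouped in threes, the whole sequence reduces to one contour plus a list of starting notes; grouped in fours, every group has a different shape and nothing can be condensed.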
The situation is made worse by Schoenberg's assumption of octave equivalence. We do not remember melodies according to the pitch class of the component notes, but according to the contour: the series of pitch intervals. Musically untrained adults asked to sing back an unfamiliar melody might not get a single note right, yet will capture the basic contour. And familiar tunes remain recognizable when the melodic contour is ‘compressed’, as if reducing the vertical scale on a mountain range (Deutsch 1982b; Dowling 1978). Conversely, if the notes of a well-known melody are played in randomly selected octaves, so that the pitch classes are retained but the melodic contour is completely altered, the tune generally becomes very hard to identify (Figure 10) (Deutsch 1972; Dowling and Hollombe 1977).

Figure 10. When familiar tunes are played with the notes assigned to random octaves, as with ‘Mary Had a Little Lamb’ here, they become impossible for most people to identify. This is because the melodic contour is severely disrupted by this octave-shifting.
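The octave-scrambling manipulation is straightforward to reproduce. The sketch below is an illustration under stated assumptions rather than the procedure of the cited experiments: notes are MIDI numbers (C4 = 60), the range of octaves and the fixed random seed are arbitrary choices made here, and the function name `scramble_octaves` is hypothetical.

```python
import random


def contour(notes):
    """Direction of each step: +1 for up, -1 for down, 0 for a repeated note."""
    return [(later > earlier) - (later < earlier)
            for earlier, later in zip(notes, notes[1:])]


def scramble_octaves(midi_notes, octaves=(3, 4, 5, 6), seed=0):
    """Keep each note's pitch class but move it to a randomly chosen octave.

    In MIDI numbering, the C of octave n is 12 * (n + 1), so C4 is 60.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [12 * (rng.choice(octaves) + 1) + note % 12 for note in midi_notes]


# Opening of 'Mary Had a Little Lamb': E D C D E E E
melody = [64, 62, 60, 62, 64, 64, 64]
scrambled = scramble_octaves(melody)

print("original pitch classes: ", [n % 12 for n in melody])
print("scrambled pitch classes:", [n % 12 for n in scrambled])
print("original contour: ", contour(melody))
print("scrambled contour:", contour(scrambled))
```

The pitch classes are identical by construction, whereas the contour of the scrambled version depends on the random octave choices and will generally bear little relation to the original.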
And once the tone row is rearranged by the manipulations permitted by Schoenberg's scheme, it becomes extremely hard to recognize as bearing any relation to the parent sequence. Cognitive tests show that sequences of notes altered in this way are rarely recognized as equivalent (Dowling 1972; Deutsch 1982b; Dowling and Harwood 1986; Francès 1988), presumably because we don't encode melodies in ways that facilitate it. For one thing, the various transformations are apt to alter the melodic contour, which, as indicated above, supplies our initial, crude mnemonic device. More generally, the fact that we do not encode melodies as mere sequences of pitch classes, which makes the tone row hard to perceive even in its unaltered form, creates even greater cognitive barriers under permutation. It's one thing to know that one sequence of notes is an inversion of another, but quite another to hear it.
Lerdahl lists several reasons, in addition to these, why tone rows are ‘cognitively opaque’ (Lerdahl 2001), including the unsystematic way they incorporate consonance and dissonance, and the lack of shared ‘pitch alphabets’ (Deutsch 1982b) between different serial works (in comparison to the common use of, for example, arpeggios and ascending or descending diatonic scales in tonal music).
Forte has claimed that serial music is organized according to so-called pitch-class sets, which are small groups of notes (more properly, of pitch classes, taking no account of octave) that recur in clusters either simultaneously (in chords) or sequentially (in melody) (Forte 1973). Rather like the tone row itself, these sets are transformed in the composition according to various symmetry operations, such as inversions or cyclic permutations. The problem with this rather mathematical analysis is that it focuses only on the musical score and again takes no account of whether the sets are actually perceived. There is no indication that they are, and good reason to suppose, on the above considerations, that they are not (Deutsch 1982b). Indeed, the typical ‘deep embedding’ of pitch-class sets in the musical structure makes it highly unlikely that even experts can hear them, at least without prior study. So whether or not pitch-class set theory elucidates any formal structure in atonal music, in all probability it says nothing about how that music is heard: nothing, indeed, about it as music.
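The sense in which two groups of notes count as ‘the same’ pitch-class set can be sketched as follows. This is a simplified illustration, not Forte's full procedure: it merely checks whether one set can be mapped onto another by transposition or inversion modulo 12, and the function names and example notes (given as MIDI note numbers) are assumptions made for the purpose.

```python
def pc_set(notes):
    """Collapse notes to an unordered set of pitch classes, ignoring octave."""
    return frozenset(n % 12 for n in notes)


def equivalent(a, b):
    """True if the pitch-class sets of a and b are related by transposition or inversion."""
    a, b = pc_set(a), pc_set(b)
    for t in range(12):
        transposed = frozenset((p + t) % 12 for p in a)
        inverted = frozenset((t - p) % 12 for p in a)
        if b in (transposed, inverted):
            return True
    return False


motif = [60, 61, 64]        # C, C sharp, E as a close cluster
spread_out = [79, 56, 71]   # same set transposed, scattered over three octaves
mirrored = [60, 47, 68]     # an inverted form, also scattered
whole_tone = [60, 62, 64]   # a different set

print(equivalent(motif, spread_out))  # True
print(equivalent(motif, mirrored))    # True
print(equivalent(motif, whole_tone))  # False
```

The check treats the close cluster and its widely scattered or mirrored counterparts as identical, which illustrates how an equivalence that is trivial to compute from the score need not correspond to anything a listener would hear as related.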
Lerdahl's accusation of ‘cognitive opacity’ in atonal music extends beyond the character of the tone row to issues of larger-scale structure. One of the central tenets of the ‘generative theory of tonal music’ developed by Lerdahl and Jackendoff is that music has a hierarchical structure analogous (but not identical) to the embedded form of linguistic constructions, with an associated syntax that governs the arrangement of component parts (Lerdahl and Jackendoff 1996; Lerdahl 2001; Jackendoff and Lerdahl 2006). This structure, according to Jackendoff and Lerdahl, can be revealed through a gradual simplification and stripping down of the ‘musical surface’ called pitch reduction, removing less important notes and retaining just the skeleton, a process inspired by Schenker's method of musicological analysis introduced in the first half of the twentieth century (Schenker 1979). But Jackendoff and Lerdahl suggest that atonal music may contain little in the way of grammatical and syntactic structure. There is no atonal equivalent of a grammatical form such as a cadence, for example, and indeed no reason why any particular chord should follow or precede another, beyond (in serial composition) the constraints of the tone row. When the generative theory of Jackendoff and Lerdahl is applied to atonal music, it elicits structures that are ‘perceptually fragile’ (that is, it is hard to identify the start and end of phrases) and ‘of limited hierarchical depth’: they are all surface, with little recursive branching (Lerdahl 2001). Tests seem to bear this out (Dibben 1994): musically trained subjects were unable to identify accurately which of two pitch reductions of pieces of atonal piano music best matched the original pieces, indicating that any hierarchical structure was not clearly perceived. This implies that, although serialism certainly has governing rules (and typically applies them inflexibly), they are not of the kind that permit a well-formed musical grammar. Syntactically speaking, this music is shallow.
That may be significant for the way we respond to it. One of the reasons why music commands our attention may be that our mechanisms for linguistic processing are marshalled to give a perceptible logic to complex music (Patel 2008). Our ability to develop musical grammar means that we are not doomed to remain at the level of nursery tunes (which already have a simple syntax). But music without a clear grammatical framework may struggle to amount to more than a linear series of notes and motifs, lacking in depth (Lerdahl 2001). And again, when the ‘rules’ of the music are not cognitively transparent, whether at the level of pitch relations or of larger-scale hierarchical and syntactic structure, it becomes harder to develop meaningful expectations, and so a major channel of emotional expression is denied (Meyer 1956; Huron 2006). One can hardly blame audiences for sometimes suspecting that what is left is musically rather sparse.
Philosophy, politics and music
So why did Schoenberg invent his 12-tone method? Much has been made of his insistence on the ‘liberation of the dissonance’, a demand that we cease to consider some combinations of notes as cacophonous and forbidden. To this extent, Schoenberg's method seems only to be the logical endpoint of the experimentation with chromaticism and unconventional harmony that had been conducted for almost a century. By the start of the twentieth century, composers could use just about any amount of dissonance they wanted. They wouldn't always be thanked for it: audiences rioted at the premiere of Schoenberg's Second String Quartet in Vienna in 1908, and indeed even his 1899 tonal piece Verklärte Nacht was controversial when it premiered in 1902. But there was far more receptivity to new sounds in the early twentieth century than is implied by the endless (and often misleading) recounting of the riot at the premiere of the Rite of Spring. In Verklärte Nacht, one can hear the sound of conventional Western tonality in its anguished death-throes: music on the brink of falling apart. So what led Schoenberg to reject tonality so forcefully, not just to avoid it but systematically to expunge it?
The fact is that Schoenberg created serialism not so much in order to do something new as to avoid doing something old. For all the talk of liberation, it was in fact a system designed to exclude: specifically, to exclude tonality. To Schoenberg, tonality needed to be banished because it had become a tired, sentimental, clichéd reflex. Eduard Hanslick remarked in 1891 that innovation can become mannerism, leading to a high turnover of forms such as certain cadences and harmonic progressions (Hanslick 1891, 81). This was precisely Schoenberg's complaint. He felt that his serial technique offered a systematic alternative to diatonic tradition, as opposed to ad hoc chromaticism.
Quite aside from the alleged degeneracy of the musical language of tonality, some of Schoenberg's supporters, notably Theodor Adorno, emphasized the political connotations of serialism (Adorno 2003). Adorno's Marxist critique of tonality argued that it had become the instrument of a complacent and selfish bourgeois capitalism. He felt that capitalist ‘mass culture’ had essentially ‘confiscated art’, and that reclaiming it demanded a complete rejection of the old tonal language of music.
Yet Schoenberg and Adorno never really explained where the supposed banality of tonality lay. For example, the serialists held in particular contempt the diminished seventh chord, whose ‘shabbiness and exhaustion’, said Adorno, is evident ‘even to the most insensitive ear’ (Adorno 2003, 34). But Scruton argues that it is absurd to call a particular chord banal, rather than the way it is used. ‘What would remain of the art of painting’, he asks, ‘if individual shades could simply be deleted from the painter's palette by those who use them tastelessly?’ (Scruton 1997, 291).
Babbitt reconsidered
Some people clearly find satisfaction in the ascetic extremes of total serialism. Apparently they have found ways of listening that make sense to them — perhaps, for example, by focusing on the individual sonic events rather than searching for relationships between them. It is hardly meaningful to suggest that they are wrong or mistaken in their tastes. Even the more extreme serial experiments are not wholly barren. They may weave strange, disembodied effects. One can enjoy their colourful interplay of timbre, one can sense skittish little ideas in their dense flurries of notes, or enjoy the weightlessness of their open spaces. They challenge us to find new ways to listen.
Besides, one can hardly blame Schoenberg, or indeed Babbitt, for ignoring principles of music cognition that had yet to be discovered. But experimentation in music must be regarded as precisely that: an experiment, which by definition may or may not work. Schoenberg's experiment did work to the extent that it led to new sonorities, new possibilities for finding form in music. In the hands of artists such as Berg, Stravinsky, Messiaen, Penderecki, and Schoenberg himself, atonalism could become a vibrant force. But a priori innovations driven primarily by philosophical, theoretical or ideological motivations lack a tradition from which to draw. Great art succeeds not because some theory says it should but because it is embedded in a web of reference, allusion and convention — it takes what we know and changes it, sometimes radically. Artistic traditions generally have the shapes they do for reasons that are to do as much with an empirical appreciation of cognitive needs as with the exigencies of culture and history. They can't simply be invented from first principles.
How, then, might one respond to Babbitt, in the light of our current understanding of musical cognition? We might want to acknowledge that he was right to be impatient at the notion that music should be undemanding. He was right that the marketplace is not the best arbiter of what is valuable in music. And he now looks prescient in his suggestion that electronic media could free musicians and composers from the normative and homogenizing demands of commercial success. Furthermore, the arguments adduced here do not imply that there is anything ‘wrong’ with atonal music, serial or otherwise.
But if the cognitive crutch of tonality is removed (and if the aim is to write music that can be coherently processed without a tremendous amount of theoretical preparation and repeated hearing), it seems important to recognize that this has been done, and that other cognitive mechanisms, such as the use of gestalt binding principles in melody and rhythm, are brought to bear. Babbitt's aloof dismissal fails to acknowledge the issue of cognition at all. It is not, for example, that listeners are inept at remembering pitch and duration values, but that the structure of the music actively thwarts this — sometimes for experienced listeners as much as for naïve ones. It cannot be a good thing for musicians to be erecting those cognitive barriers without any awareness that they are doing so.
Music can be a great many things. But there does not seem much point in allowing it to be anything. Our minds use certain cognitive tools to organize sound into music. With practice, we can change the way we listen. But if we frustrate our auditory cognitive mechanisms too far, all we are left with is sound. It seems possible that some forms of total serialism become tolerable to listeners only when they become inured, and not because there is really anything to hear — nothing except notes and silence, a meandering uniformity with no means of creating tension and release, no ways to begin or to end.
As Scruton puts it, ‘When the music goes everywhere, it also goes nowhere’ (Scruton 1997, 303).
Note
1 I should more properly talk here of pitch classes rather than notes, to acknowledge the equivalence of notes in different octaves. There are 12 pitch classes in Western (equal-tempered) scales, which repeat in successive octaves. The arguments developed here are not, however, specific to the conventional use of equal-tempered tuning systems.
Notes on contributor
Philip Ball is a science writer based in London, UK. His recent books include Universe of Stone (2008), The Music Instinct (2010) and Unnatural (2011). He contributes regularly to a variety of publications, including Nature, Prospect and Chemistry World, and frequently writes on interactions between the sciences and the humanities.
