Abstract
Psychological research into musical behavior has mostly focused on Western music, explored with experiments utilizing Western participants. This ethnocentric bias limits the generalizability of many claims in the field. We argue that our current understanding of the cognition of pitch organization might be helpfully informed by data gathered in non-Western contexts. In particular, musical traditions featuring equal-spaced scales (where all scale-step interval sizes are equal) are suggested to pose a challenge to popular models of pitch organization, in which unequally spaced scales are suggested to provide cognitive anchor points for on-the-fly pitch orientation. This article presents a summary and theoretical consideration of all available evidence on equal-spaced scales, the vast majority of which appear in east Africa. It is noted that despite equal spacing, there is evidence to suggest that tonal centers are still perceived by idiomatic listeners. We then proceed to propose how such tonal center perception is possible within equal-spaced tonal environments. In short, the existence of equal-spaced scale systems shifts the focus of research from interval uniqueness to alternative explanations for the perception of tonal centers, such as implicit statistical tracking, secondary parameters, recognition of learnt patterns as tonal cues, and so on. Throughout, we note that interdisciplinary work involving ethnomusicologists and psychologists would be beneficial in answering questions about music cognition, and by extension, human cognition in general.
Introduction
One might assume a closer relationship between ethnomusicology and music psychology than exists today, particularly in light of concerns about the universality of musicality in both disciplines (Blacking, 1973; Harwood, 1976; Nettl, 2005). However, psychology has preoccupied itself mainly with the music of the West, and the quantity of interdisciplinary work is small—some high-profile cross-cultural work notwithstanding (Castellano, Bharucha, & Krumhansl, 1984; Eerola, Louhivuori, & Lebaka, 2009; Krumhansl et al., 2000). Our knowledge of music cognition is therefore still confounded by the fact that the subject pool has been largely restricted to Westerners and Western materials (Henrich, Heine, & Norenzayan, 2010).
The purpose of the present paper is to highlight the potential impact of ethnomusicological data on psychological models of tonal cognition. The following section will introduce some basic cognitive models of pitch organization, and argue that such models depend on the structural feature of unequal scale-step intervals. We will then review literature suggesting that there may exist several African ‘equitonic’ musical traditions which, despite featuring equal-spaced scale steps, require pitch organizational cognition from listeners. This brings into question claims that unequal spacing of scales is a musical universal. In closing, we consider the implications of such ethnomusicological evidence for music cognition, and suggest alternative mechanisms by which listeners might be able to abstract a sense of pitch relationships.
Two fundamental models of tonal cognition
Much of the world’s music utilizes discrete pitch categories, which subdivide the octave into a number of pitch classes (Harwood, 1976; Patel, 2008). These pitch categories are usually transposable up or down the octave, maintaining their pitch class designation functionally if not nominally. Maintenance of pitch class identity despite octave transposition is a widespread trait of the world’s musical cultures, and is considered by some to be a potential universal of human musicality (Dowling, 1989; Harwood, 1976; Trainor, 2008; for examples in African music, see Kubik, 1985; for an Australian counter-example, see Will, 1997). In the ensuing discussion, the terms ‘tonality,’ ‘sense of tonality’ and ‘tonal knowledge’ are used to mean an understanding (tacit or otherwise) of which pitches are members of a set with a given reference pitch, used as a means of orienting the listener with respect to pitch organization. In the music psychological literature, the paradigmatic example of this is Western diatonicism, where tonal functionality is expressed in relation to a set of pitches with a tonal center (the ‘tonic’). Put another way, tonal knowledge refers to what it is that the listener knows when we say that s/he is acquainted with pitch relationships in tonal music, and how that knowledge is applied when a listener tries to make sense of a musical passage.
To illustrate, imagine the first few moments of listening to a novel piece written in a Western tonal style. Within a few notes, even theoretically naïve listeners are able to identify the tonal center of the piece, and the relative stability of the various notes that they have thus far encountered. The listener is also able to form expectations about what is to come, based on having been exposed to multiple instances of tonal music in the past. Similarly, a composer or performer is able to capitalize on this knowledge of tonality to create novel pieces, which can fulfill or thwart the musical expectations of enculturated listeners. Arom (1991a, pp. 137–157) points out a similar implicit knowledge of underlying principles in performers and listeners of ‘folk’ idioms, stating that the
existence of theory in civilisations with an orally transmitted tradition, though entirely implicit, can nevertheless be demonstrated from the fact that the users never make a mistake. They will immediately remark on and correct even the most minor error. If there can be a mistake, it must be with reference to some sort of theoretical framework (p. 140; Arom’s emphasis).
Psychologists have modeled the acquisition and application of tonal knowledge in Western musical settings in various ways, but two approaches have historically been prominent. The first is the statistical-distributional or cognitive-structural approach, following the work of Krumhansl and colleagues (Krumhansl & Shepard, 1979; Krumhansl & Kessler, 1982). This approach rests on the fact that human beings are efficient statistical learners. When listening to a piece of music, the listener will keep track of the frequency distribution of pitch classes. If the frequency distribution matches learnt key profiles—that is, note 1 occurs x times, and note 2 y times, and these match the learnt distributional hierarchy—the listener will abstract a sense of key based on that learnt key profile. A Western listener who has internalized this profile will assign to the most frequent note the functionality of the tonic, the second most frequent the dominant, and so on.
Such an approach is intuitively appealing. Knowledge of the overall distribution of pitches, formed by a lifetime of exposure to idiomatic music, is directly translated into a tonal profile; and because the distribution of pitches in a given Western tonal piece is likely to map closely onto the tonal profile (Krumhansl, 1990), the listener is able to apply this knowledge by keeping track of the relative occurrence frequency of pitch classes in any piece that is encountered. It has been shown that this statistical approach is employed by listeners in non-Western musical traditions (Kessler, Hansen, & Shepard, 1984) and by listeners encountering music that doesn’t conform to a standard distributional norm: Oram & Cuddy (1995) demonstrated that when subjects were presented with novel distributions, they proceeded to rate test tones as if a new idiom had been constituted by the test materials—that is, they tended to rate notes which occurred often as being more stable than those that didn’t. Thus this approach is not necessarily restricted to the perception of Western tonal music, but could account for the perception of pitch relationships in other idioms.
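The profile-matching idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the stability profile is invented for the example (it is not the published Krumhansl and Kessler ratings), notes are coded as scale degrees 0 to 6, and Pearson correlation is used as the matching criterion. The function names are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical stability profile for a 7-note scale (index 0 = tonal center).
# Illustrative numbers only; not the published Krumhansl-Kessler ratings.
PROFILE = [6.0, 3.0, 4.0, 3.5, 5.0, 3.0, 2.5]

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def estimate_center(melody, n_degrees=7):
    """Return the degree whose rotated profile best matches the note counts."""
    counts = Counter(melody)
    hist = [counts.get(d, 0) for d in range(n_degrees)]
    scores = []
    for center in range(n_degrees):
        # Profile value of degree d when `center` is taken as the reference pitch.
        rotated = [PROFILE[(d - center) % n_degrees] for d in range(n_degrees)]
        scores.append(pearson(hist, rotated))
    best = max(range(n_degrees), key=lambda c: scores[c])
    return best, scores

# Invented melody in which degree 2 is sounded most often.
melody = [2, 4, 2, 6, 2, 3, 4, 2, 5, 6, 2, 4, 0, 1, 2]
center, scores = estimate_center(melody)
print(center)
```

Because the sketch uses only occurrence counts and never the sizes of the intervals between degrees, the same procedure could in principle be applied to a scale whose steps are all equal.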
The most obvious problem with the distributional account is its retrospective nature: in Western tonal music, listeners rarely have to listen to very many notes before they are able to identify the key of a piece, and understand the tonal implications of various pitch events. In this sense, the statistical-distributional approach does not provide a particularly good model of real-time orientation within a tonal pitch space. A second theory of tonal knowledge, the intervallic rivalry theory (Brown, Butler, & Jones, 1994; Butler, 1989; Butler & Brown, 1984), is better able to account for the dynamic nature of tonal orientation. Intervallic rivalry theory suggests that all the listener needs is a number of rare intervals to determine key, since, given the structural properties of the scale, there are only a limited number of possible interval types within a given key (Balzano, 1980, 1982). For example, the minor 2nd (together with its inversion, the major 7th) occurs only twice in the Western diatonic scale set, and the tritone (augmented 4th or diminished 5th) only once. Thus the sequential presence of a selection of these rare intervals can indicate key to a listener well-versed in diatonic music.
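The rarity of particular intervals in the diatonic set can be checked by simple enumeration. The sketch below makes two simplifying assumptions that are not part of the theory as stated: pitch classes are coded as semitone numbers (C = 0), and an interval is counted together with its inversion as a single 'interval class' (so the minor 2nd and major 7th share class 1). The function name is hypothetical.

```python
from collections import Counter
from itertools import combinations

def interval_class_counts(pitch_classes, modulus=12):
    """Count how often each interval class (1..modulus//2) occurs in a set."""
    counts = Counter()
    for a, b in combinations(sorted(pitch_classes), 2):
        d = (b - a) % modulus
        counts[min(d, modulus - d)] += 1  # fold an interval onto its inversion
    return dict(sorted(counts.items()))

DIATONIC = [0, 2, 4, 5, 7, 9, 11]  # C major as semitone numbers

diatonic_counts = interval_class_counts(DIATONIC)
print(diatonic_counts)
# {1: 2, 2: 5, 3: 4, 4: 3, 5: 6, 6: 1}
```

Class 6 (the tritone) occurs once and class 1 (minor 2nd or major 7th) twice, whereas class 5 (perfect 4th or 5th) occurs six times; on the rare-interval account it is the scarce classes that carry the most information about key.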
This approach is attractive for a number of reasons. As mentioned before, we regularly abstract key despite a paucity of statistical evidence. A second reason is that, like the statistical-distributional model, this approach is not wedded to Western tonal music and the Western dodecaphonic pitch class environment. Rare intervals are present in virtue of the structure of the set of notes used in the tonal system, and structurally rare intervals are not unique to Western tonal music. Thus, it is conceivable that listeners of idiomatic non-Western pitch-based music may use similar mechanisms.
In fact, experimental data seem to indicate that both types of information—statistical distribution and interval information—are used by listeners to figure out the key of a novel piece of Western tonal music (Brown et al., 1994). But are all musical styles featuring tonal centers similar to the Western variety in displaying both a distribution of tones favoring these tonal centers and also rare interval information? If not, then this could call into question the generalizability of these models of tonal cognition. Take, for example, a type of music that used only whole-tone scales. In order to understand the implications this might have, consider the characteristics of the Western diatonic scale when conceived of as a mathematical set. Diatonic pitch sets have three features that are thought to be psychologically important in key finding (Balzano, 1982; Cross, 1997):
Uniqueness. In a given diatonic set, a pitch class can be differentiated from, or defined in terms of, the set of intervals it forms with every other pitch class in the set.
Simplicity. Each diatonic set becomes a different diatonic set of the same structure by changing one single, specific pitch class.
Coherence. The sum of any two intervals in the diatonic set is larger than any single interval.
Let us consider these characteristics in reverse order. For equal-spaced scales like whole-tone scales, the property of coherence is trivial, as the size of stepwise intervals in the scale remains the same throughout. All possible intervals will merely be whole-number multiples of the smallest interval. The property of simplicity is also trivial, as changing a single pitch will result in a scale that is no longer an equitone scale. However, uniqueness is a property that can only be attributed to scales that are composed of more than one step size. Consider the diatonic set, C major. We can define the pitch class C as having the following set of interval relationships to each successive pitch class in the scale: [M2, M3, P4, P5, M6, M7]. This set of intervals uniquely defines the tonic note. Every other pitch class has a different set of relationships to the notes above it. For example, D is defined [M2, m3, P4, P5, M6, m7]; E is defined [m2, m3, P4, P5, m6, m7]; and so on. In equitonic scales, each pitch class can be defined in terms of the same set of interval relationships. The whole-tone scale beginning on C has the note C defined as [M2, M3, dim5, m6, m7], as does the note D and every other note in the scale. The consequence is that all interval combinations are equally common, because all can be realized from each pitch class; hence, the property of uniqueness does not apply to equitone scales.
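The uniqueness argument can be verified mechanically. The following sketch assumes pitch classes coded as semitone numbers (C = 0) for the diatonic and whole-tone cases, and as bare step indices for an equal-spaced heptatonic scale; the function names are hypothetical and the code is illustrative rather than a model of perception.

```python
def interval_profile(scale, start, modulus=12):
    """Distances from `start` up to every other member of `scale`, ascending."""
    return tuple(sorted((pc - start) % modulus for pc in scale if pc != start))

def has_uniqueness(scale, modulus=12):
    """True if every pitch class has a different interval profile."""
    profiles = [interval_profile(scale, pc, modulus) for pc in scale]
    return len(set(profiles)) == len(scale)

DIATONIC = [0, 2, 4, 5, 7, 9, 11]   # C major
WHOLE_TONE = [0, 2, 4, 6, 8, 10]
EQUI_HEPTATONIC = list(range(7))    # seven equal steps, counted in steps

print(interval_profile(DIATONIC, 0))    # (2, 4, 5, 7, 9, 11): M2 M3 P4 P5 M6 M7
print(interval_profile(DIATONIC, 2))    # (2, 3, 5, 7, 9, 10): M2 m3 P4 P5 M6 m7
print(interval_profile(WHOLE_TONE, 0))  # (2, 4, 6, 8, 10)
print(has_uniqueness(DIATONIC))                     # True
print(has_uniqueness(WHOLE_TONE))                   # False
print(has_uniqueness(EQUI_HEPTATONIC, modulus=7))   # False
```

In both equal-spaced cases every pitch class yields the same profile, so no pitch can be singled out by its interval relationships alone.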
This is important, because the rare interval hypothesis is premised on the fact that scales feature unequal steps: according to intervallic rivalry theory, the raison d’être behind asymmetrical scales is that they help the listener to orientate herself within a pitch space or key, and the idea that unequal-spaced scale structure conveys a psychological advantage has been put forward in several studies (Shepard, 1982; Sloboda, 1985; Trehub, 2000).
Broadly speaking, intervallic rivalry theory has three fundamental claims, all of which are rendered impotent by the lack of uniqueness in equitonic scale systems:
Listeners assume that the first pitch class they hear is the tonal center, until a better candidate is presented (in terms of intervallic information).
In establishing key, listeners rely more on rare interval combinations than common interval combinations.
Listeners’ judgments of key are more accurate when rare interval combinations appear in a temporal order that implies or reinforces harmonic movement.
Without rare interval combinations, it becomes difficult to appeal to structural features of the tonal system to explain how listeners orientate themselves within a key. This would not be so problematic if such a musical system were merely a thought experiment. Indeed, it has been proposed that unequal spacing in scales is a musical universal (Brown & Jordania, 2013; Sloboda, 1985; Trainor, 2008). However, as we discuss in the next section, there is reason to suspect that this claim of universality is misguided when the ethnomusicological literature is consulted. If we can demonstrate, with reference to the ethnomusicological literature, that there are musical styles with tonal organization but equally spaced scales, we would have strong evidence against the universal validity of the rare interval hypothesis. The psychological puzzle would then lie in accounting for pitch organization in musical cultures where equal-spaced tuning is the norm.
Evidence from the ethnomusicological literature
Cross-cultural data on pitch-interval categorization, just noticeable difference, and tolerance of mistuning
Before reviewing the evidence for the existence of equitone scale systems, a word must be said about the perceptual nature of pitch intervals. The intervallic relationships between pitch classes are learnt categorically (Patel, 2008, pp. 22–28). It is possible to perceive pitches falling outside the established pitch classes of a given culture, but these are heard as deviations, and not as distinct and novel categories with their own structural potential. Hence, in the West, when hearing an interval 15 cents flatter than a minor third, we are likely to recognize it as a mistuned minor third rather than a new interval category. Different cultures have different interval categories within the octave, and establishing these interval categories is what has been done in the majority of ethnomusicological considerations of tuning and scale identification. Thus there are two immediate questions about interval categorization which bear on the case of equitonic tonal cognition, neither of which has been systematically explored cross-culturally. The first is the question of where pitch class boundaries lie, alluded to by Cooke (1992) in his discussion of tolerated deviations from ideal tuning standards among the Baganda and Basoga in Uganda. The second is whether just noticeable difference (JND)—the minimum variance in frequency where a difference in pitch is perceived—is comparable between African and Western listeners. Disentangling JND and pitch categorization from tolerance of tuning deviation is likely to prove a challenge to a cross-cultural music psychology: trained musicians in the West don’t necessarily give accurate verbal reports about direction of mistuning (Siegel & Siegel, 1977).
A prime example of information readily available to ethnomusicologists that may be of considerable aid to music psychologists is the naming of pitch classes in local vernacular. Language use in many cases reflects perception. For example, examination of color labels in different languages has given rise to a lively and highly fruitful debate about visual categorization across cultures (e.g., Heider & Oliver, 1972; Saunders & van Brakel, 1997 and replies therein). In the case of pitch class sets, it may seem that musical instruments and performances ‘speak for themselves’ in this regard, as the pitch class categories of a culture are likely to show up in the way instruments are built and tuned, and what frequencies are employed in music-making. Nevertheless, it is interesting to note that the mapping between language and instrumental features is not perfect. For instance, van Zanten (1980) reported that the Asena do not have a name for every tone on their instruments. Instead, they have names for groups of tones, and these names vary from instrument to instrument. Without this knowledge, one might assume that the Asena name each note, and try to understand the syntactical construction of the music in terms of individual pitch classes, when in fact an approach governed by relationships between different note groups may be more appropriate. Psychologists invariably wish to generalize across cultures, but it becomes unfeasible to acquaint oneself with a wide range of local languages. Furthermore, any psychologist wishing to investigate issues such as categorization of pitch would need to be cognizant of the notion of ‘emic’ and ‘etic’ approaches in anthropology (a distinction after Kubik, 2010; Pike, 1954).
In short, one must bear in mind that there will not always be a correspondence between measured and perceived tones and intervals; furthermore, the nature of these perceptions is understudied, and may not be comparable with those found for Western listeners.
Review of the ethnomusicological literature
The vast majority of reports of equal-spaced scales come from the African ethnomusicological literature (although see Ambrazevičius, 2009, on Lithuanian folk music, for an exception). A starting point for reports of equidistant scale intervals is the entry on the music of Mozambique in the online version of the Grove Dictionary of Music and Musicians, which alludes to a tendency toward equitonicism in lamellophones in the Zambezi basin (Tracey, 2016). The Chopi of Mozambique, who traditionally inhabit the south of the country and have a celebrated tradition of xylophone ensemble music, were described as having an equitone tuning system by Hugh Tracey (1948). Other African instances of equitonic scale systems are cited by Jones (1964), Kubik (1982), Kyagambiddwa (1955), Rouget (1969), A. Tracey (1970, in north-eastern Zimbabwe), Wachsmann (1950, 1957, 1967), van Zanten (1980), Arom (1991a), and Cooke (1992).
One of the most systematic and thorough accounts of equitonic scale systems is that of Hugh Tracey (1948), whose Chopi Musicians is the standard ethnomusicological reference for Chopi xylophone music. Tracey provides frequency measurements for several instruments in the Zavala district. In spite of urbanization and globalization, practitioners of Chopi xylophone music are reported to still tune their instruments equitonally today (A. Tracey, personal communication). Furthermore, musicians actively using both Western and traditional instruments reportedly make use of both Western and Chopi tuning systems, depending on musical context (Matchume Zango, personal communication). The music of Chopi xylophone ensembles is heptatonic. Splitting an octave into seven categories of equal spacing would require intervals of approximately 171 cents. Chopi musicians indicated to H. Tracey that this was their explicit aim so far as tuning was concerned, although in reality deviations from a pure 171 cent interval appear to be common. It is currently unknown whether these deviations indicate that equitonic tuning is in practice never achieved. It could be that while equal spacing is an aim, it is not realized; this in turn could provide an important cue to tonal center, provided deviations are consistent or predictable throughout the culture. However, it could also be that the boundaries of pitch categorization among the Chopi are very wide, tuning is highly variable, and mistuned notes are tolerated. In such a case, it may be that some form of categorical perception with regard to interval function proves to be important (Burns, 1999; see ‘Cross-cultural data’ section above). That is, it may also be that the intervals are perceived as equitonic, even if they are not equidistant in an external, physical sense.
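The 171-cent figure follows directly from dividing a 1200-cent octave into seven equal parts. A minimal sketch of the arithmetic is given below, using the standard definition of the cent (1200 times the base-2 logarithm of a frequency ratio). The 250 Hz starting frequency is an arbitrary assumption chosen for illustration, not a measured value, and the function names are hypothetical.

```python
from math import log2

def cents(f_low, f_high):
    """Size of the interval between two frequencies, in cents."""
    return 1200 * log2(f_high / f_low)

def equal_step(n_steps):
    """Step size in cents when a 1200-cent octave is split into n equal steps."""
    return 1200 / n_steps

def equitonic_frequencies(f0, n_steps):
    """Frequencies of an equal-spaced scale from f0 up to its octave."""
    return [f0 * 2 ** (k / n_steps) for k in range(n_steps + 1)]

print(round(equal_step(7), 2))   # 171.43 (equal heptatonic)
print(equal_step(5))             # 240.0  (equal pentatonic)

# Arbitrary 250 Hz reference, for illustration only.
freqs = equitonic_frequencies(250.0, 7)
print([round(cents(a, b), 2) for a, b in zip(freqs, freqs[1:])])
```

Comparing measured slat or string frequencies against such a reference grid is one simple way to express reported deviations in cents.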
Wachsmann (1950) reported the case of an 8-stringed pentatonic harp tradition in Uganda, where musicians also clearly indicated that it was their intent to tune their instruments with equal steps between strings. Wachsmann reports that the same attitude is also adopted in xylophone tuning (1950, 1957). With respect to harp tuning, he notes that the musicians showed ‘no response to the suggestion that their tuning might distinguish between intervals of different size’, and that in fact ‘they clearly stated that they wished all steps to be the same’ (1950, p. 41). Wachsmann proceeded to make measurements of the frequencies of each string for 10 instances of supposed equitonic tuning. There was a certain degree of deviation from the equitonic ideal of 240 cents (for equi-pentatonicism), but it is interesting to note that the largest average deviation for any string from perfect equitonicism was 17 cents. The average deviation from the equitone ideal overall was 12.43 cents. Wachsmann also reported the method of tuning. The musician starts with the highest note on the instrument, which usually hovers around G4 (392 Hz). The next note is presumably estimated (tuned by ear), such that it is close to 240 cents from the first note. The remainder of the strings are then matched to the interval between the first and second highest strings. Once all strings have been adjusted, notes 1, 2 and 3 are played in quick succession (where 1 is the lowest note on the instrument), followed by notes 2, 3 and 4, and so on. The tuner aims to have all the intervals equal, and proceeds in this manner until all strings have been adjusted more finely. Finally, notes 1 and 6, 2 and 7, and 3 and 8 are separated by an octave, and the tuner meticulously checks that these octaves are in tune. With respect to xylophone tuning, Wachsmann (1957) describes the tuning of an amadinda xylophone, which is also close to an equi-pentatonic norm of 240 cents. 
This latter example is notable, as the tuner and instrument maker actively tuned the instrument away from a tuning present immediately after manufacture, which involved intervals close to major seconds and minor thirds. The tuning arrangement settled on was close to equi-pentatonic.
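A deviation measure of the kind reported for the harp tunings can be sketched as follows. Several assumptions are made here: the string frequencies are invented (only the approximate 392 Hz top note comes from the account above), deviation is taken as the absolute difference between each adjacent step and the 240-cent ideal, and the summary statistics are a plain mean and maximum. The original measurements may have been averaged differently.

```python
from math import log2

IDEAL_STEP = 240.0  # cents; equal pentatonic ideal

def step_sizes(freqs_hz):
    """Adjacent interval sizes in cents, from lowest to highest string."""
    ordered = sorted(freqs_hz)
    return [1200 * log2(hi / lo) for lo, hi in zip(ordered, ordered[1:])]

def deviation_summary(freqs_hz, ideal=IDEAL_STEP):
    """Mean and largest absolute deviation of the steps from the ideal."""
    devs = [abs(s - ideal) for s in step_sizes(freqs_hz)]
    return sum(devs) / len(devs), max(devs)

# Invented frequencies for five adjacent strings, highest near G4 (392 Hz).
strings_hz = [392.0, 343.0, 297.0, 259.0, 224.0]
mean_dev, max_dev = deviation_summary(strings_hz)
print(round(mean_dev, 1), round(max_dev, 1))
```

The same summary could be computed per string across several tunings, which appears closer to how the per-string averages above were reported.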
Wachsmann returned to the question of equidistant tuning by studying an ensemble of three pentatonic sansas and a rattle, originating from the Busoga region of Uganda, as well as another sansa tuned by a musician locally regarded as an expert (1967; a sansa is a lamellophone akin to a kalimba). Wachsmann used a chromatic stroboscope to measure the deviation of various notes with reference to an equal-tempered Western chromatic scale, and concluded that the tuning tends toward equitonicism. Among the three sansas in the ensemble, there is a range in tuning for seconds of up to 80 cents on one instrument, and around 60 on the other two. However, the majority of intervals are larger than a Western major second. Details regarding the tuning of a trough zither are also provided, and Wachsmann notes that an interval larger than a major second, but smaller than a minor third, is the aim. He is hesitant to call these tunings equidistant, however, due to large variation in the sansa tuning, and the fact that trough zithers are not expected to remain in tune for very long. What this account emphasizes is the fact that future investigations into African tuning systems will have to carefully distinguish between intended tuning, physically realized tuning, and perceived interval relationships.
Remaining in Uganda, Kubik’s later report of the Baganda tone system (1969), as evidenced on amadinda and akadinda xylophones, is tantalizingly suggestive of a pentatonic system with equally spaced intervals. Recognizing that measurements of frequency on akadinda slats are highly variable (although not so much so that they would fall into the Western pitch categories), Kubik says that it remains to be proven whether the tuning is indeed based on equal spacing. If so, the 240 cent ideal would be the theoretical aim, even if in practice the musicians deviate from this. Kubik also presents us with some further points of interest. He reports that the musicians are careful about precisely what note they begin a tune on. That is, they do not tolerate transposition gladly. One would imagine that with equitonic instruments, choice of starting note would not be a matter of great concern. However, Kubik notes that the preference for starting known tunes on certain notes most probably has to do with playing position. This makes intuitive sense, as there is more than one musician playing an amadinda xylophone at a time. (It should also be mentioned that transposition occurs due to singers’ tessitura—Peter Cooke, personal correspondence.) Kubik also notes that in this type of music, intervals are often conceptualized with relation to instrumental technique and in particular the spatial arrangement of xylophone slats. Instrumentalists may shift their playing position, thus performing the same movements at a different part of the instrument, and yet maintain that the transposed intervals have the same identity as the original intervals (Kubik, 1969, p. 27).
Van Zanten (1980) investigated the music of the Asena in southern Malawi, in the region of the Lower Shire River Valley, and asked whether heptatonic music realized on the bangwe (a type of zither) and the multiplayer ulimba xylophone (also referred to as ‘malimba’ or ‘valimba’ xylophone) was truly equitonic. Equitone scale systems in this area were originally reported by Kubik (1968), who noted a standard interval of 171 cents. In determining the frequency of notes on an instrument, van Zanten (like many other ethnomusicologists of the time) made use of a set of 54 tuning forks, ascending by 4 Hz from a starting frequency of 212 Hz. Comparison with the pitch of the note is then assessed by ear, and corroborated with the judgments of local informants. The Asena are more tolerant of transposition than the Baganda, although they tend to start pieces on the same note each time. Strumpf (1999, p. 112), however, reports that the Asena are more likely to start a piece on a high note when the players have plenty of energy to expend, as the accompanying vocal line is more strenuous at a higher pitch. The fact that the Asena recognize transpositions of heptatonic music on instruments with seven pitch classes as instances of the same piece is taken by van Zanten as evidence that they conceive of their intervals as equidistant. Furthermore, there is no evidence of any system or tonal treatment approximating to modal shifts in the music of the Asena (modal treatment on instruments with heptatonic tones using non-equal spacing would imply that intervals would change from mode to mode). Interestingly, van Zanten observes that the octaves on a group of bangwes he studied were tuned to a mean of 1222 cents, which is consistent with Western octave stretching reported by Dowling (1973) (see also Cooke, 1992), and that the majority of octave intervals were tuned above 1200 cents (60% of instruments sampled, with 23% falling below 1200 cents). 
Van Zanten also observed a mean of 1213 cents for octaves on a group of 17 valimbas. As with the Ugandan tuning report by Wachsmann (1950), van Zanten notes that the method of tuning involves the checking of tuning by means of octave comparison.
Van Zanten calculated mean interval sizes and standard deviations from these means (see Tables 1 and 2). His reasoning appears to be that, should the standard deviation prove to be relatively small, and the mean close to an equitone ideal, the scale system is equitonic. However, the means and standard deviations for various intervals on both bangwe zithers and valimba xylophones suggest that there is a large degree of variability in tuning. Van Zanten also considered the case of malimba lamellophones. These instruments are of considerable interest because, unlike the bangwe and the valimba, Asena lamellophones have replications—that is, there are tongues on the instrument that are meant to be tuned to the same note. Van Zanten suggested that a general picture of tolerated tuning deviance could be established by determining the degree to which these supposedly identical ‘primes’ differed. The mean difference between primes was +20 cents, with a standard deviation of 23 cents (a total of 40 primes from 4 instruments). Van Zanten is cautious in drawing too firm a conclusion from these figures, given the acknowledgment of a possible error in measurement of up to 9 cents due to the tuning fork method. However, this general technique of assessing tuning deviance within a musical culture is likely to be a profitable approach where applicable.
Table 1. Mean interval sizes for 89 bangwes (van Zanten, 1980).
Table 2. Mean interval sizes for 17 valimbas (van Zanten, 1980).
An important difference between the Asena tuning system and another potential candidate for equitonicism, the pentatonic slendro tuning of gamelan ensembles, lies in the verbal accounts of tuning practices given by local musicians. Van Zanten produced a table of mean tunings for 28 gambang xylophones, tuned to the slendro scale, and observed a maximum standard deviation of only 12 cents. However, van Zanten sides with Wasisto Surjodiningrat and colleagues (1972) in concluding that this is not a case of equitonic tuning, since the musicians state that the interval between the first and second steps of the scale is meant to be smaller than the interval between the fifth and octave-transposition of the first tone (reported means of 233 cents vs. 252 cents, both with a standard deviation of 9 cents). Van Zanten’s ultimate conclusion regarding the Asena system was therefore that, unlike the gamelan system, he was unable to show that it wasn’t a case of equitonicism, since the Asena musicians did not report any intended difference between interval sizes. The conclusion that slendro is not equitonic is shared by other researchers: although McPhee (1949) suggested that the scale featured ‘intervals tending toward equidistance’ (p. 257), Schneider (2001) notes that in practice, slendro tuning is not equitone, and in fact each octave features one interval well below 240 cents, and one interval well above. By considering the verbal reports of gamelan musicians, van Zanten makes an even stronger case for this being a non-equitonic system.
Van Zanten also provided detailed data regarding the tuning of individual instruments. It is important to note that computing a mean for each interval across several instruments poses problems for the assessment of the presence of equal-spaced tuning. It could be the case that no instruments are precisely equitonic, but that frequency data are best modeled by means which approximate equitone intervals. Standard deviation values do not clarify the situation either, as only about 68% of data points would be expected to fall within one standard deviation of the mean, assuming a normal distribution.
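The averaging problem can be made concrete with a toy example. The two 'instruments' below are invented: each deviates from the equal heptatonic step by up to 20 cents, but in opposite directions, so the per-step means are exactly equitonic although neither instrument is. The final lines compute the share of a normal distribution lying within one standard deviation of the mean, using Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist, mean, stdev

IDEAL = 1200 / 7  # equal heptatonic step, about 171.43 cents

# Invented, mirror-image deviations (cents); each instrument still spans 1200 cents.
offsets = [20, -20, 20, -20, 20, -20, 0]
instrument_a = [IDEAL + d for d in offsets]
instrument_b = [IDEAL - d for d in offsets]

per_step_mean = [mean(pair) for pair in zip(instrument_a, instrument_b)]
per_step_sd = [stdev(pair) for pair in zip(instrument_a, instrument_b)]
max_within_a = max(abs(s - IDEAL) for s in instrument_a)

print([round(m, 2) for m in per_step_mean])  # every mean equals the ideal step
print([round(s, 2) for s in per_step_sd])    # yet the spread across instruments is large
print(round(max_within_a, 2))                # largest deviation on instrument A

# Share of a normal distribution within one standard deviation of the mean.
within_one_sd = NormalDist().cdf(1) - NormalDist().cdf(-1)
print(round(within_one_sd, 4))
```

A per-instrument measure, such as the largest deviation of any step from the ideal, avoids this masking effect and may be a useful complement to means computed across instruments.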
The discussion above shows that there is indeed some strong evidence for the existence of equitonic tuning systems across a range of African musical traditions. The following section addresses how we might begin to understand the phenomenon of pitch organization and tonal center establishment in these scenarios.
How is pitch organized in equitone music? Potential solutions
Before continuing, care should be taken not to automatically assume that tonal perception is qualitatively similar in the African musical traditions discussed here and in Western diatonicism. It may be the case that pitch in these African musics is organized in a logical and perceptible fashion, but that this organization does not conform to the concepts of ‘tonality’ or ‘tonal centers’ as outlined in the music psychological literature; alternatively, it may be that tonality, although present, does not dominate the listening experience to the same degree as Western diatonicism. It is also important to acknowledge that, in some cases, musical features may serve to create tonal ambiguity rather than to define a clear tonal center. However, given the tendency in the world’s musics toward pitch organization reflecting tonal centers, it does not seem unreasonable to consider ways in which equitonic music, despite its lack of interval information, might function in a similar manner. That being said, one should remain aware of alternate conceptions of tonal centers, such as the double tonal anchors found in some eastern European traditional music (Ambrazevičius & Wiśniewska, 2008). Such phenomena suggest that not all cases of ‘tonal’ music necessarily adhere to fundamental assumptions about tonality held by Western music psychologists (such as a restriction to a single tonal center).
Basic questions about equitonic scale systems
The present discussion of equitonicism is based on what are, at times, rather fragmentary accounts of the phenomenon. While valuable information can be gleaned from existing ethnomusicological reports, many important questions remain unanswered. For example, while Tracey (1948), van Zanten (1980), and Wachsmann (1967) all provide highly detailed tables of tuning for their respective cultures of interest, a far larger corpus of instruments per region would be required before any reliable statistical inferences can be drawn. Similarly, while an attempt to draw statistical inferences about tuning was made by van Zanten (1980), more instruments and cultures need to be examined, with the application of appropriate analytical techniques. Of course, there is the question of how much we can learn from statistical assessment. The presence of a high degree of variability would suggest that, informants’ reports notwithstanding, equitonicism is not being achieved so far as instrument-making is concerned. This would demand a move away from statistical assessment of instruments, in favor of a closer investigation into the nature of pitch perception and categorization. Such an investigation may explain any discrepancy between introspective reports and the actual physical characteristics of musical instruments. The reports of substantial tuning deviations described above (Cooke, 1992; Wachsmann, 1967) underline the need to carefully consider the relationships between tolerance of mistuning, categorization, and just-noticeable differences in frequency.
While we are confident that measurements made with sets of tuning forks contain only a marginal error, it would nevertheless be wise to corroborate these data by taking measurements with more modern and sophisticated equipment, as well as exploiting the avenues available to researchers when provided with digitally recorded samples of the music (Schneider, 2001). The pioneering work of Arom and his colleagues in using sound synthesis as a means to interrogate the implicit, non-verbal and idealized tuning schema of idiomatic listeners also points towards a sophisticated field procedure (Arom, 1991a, 1991b; Arom & Voisin, 1998). However, assessing tuning by computing means after collecting data from many instruments might still disguise the fact that no instruments are equitonic, but instead deviate from the equitonic ideal by more than a just-noticeable difference. Such deviations may play a crucial role in orienting the listener with respect to the tonal center. It is also important to note that the general empirical approach just discussed assumes an equivalence between measured frequency and perceived pitch. Such an assumption is problematic, and particularly so given that many instruments used in relevant African musical traditions (such as xylophones and lamellophones) do not produce notes with harmonically related overtone structures. Due to pitch perception processes too complex to review here (see Terhardt, 1974; Terhardt, Stoll, & Seewann, 1982a, 1982b), this could lead to a discrepancy between perceived pitch and measured frequency components, in addition to the issues of categorical perception and just-noticeable differences mentioned above.
Frequency distribution
The work of Oram and Cuddy (1995) showed that listeners make use of implicit statistical learning when prior knowledge about a tonal system is not available. They presented listeners with musical excerpts in which the statistical distribution of pitch classes differed from regular Western tonal music. They found that listeners rated probe tones as having better ‘goodness of fit’ if the same tone had occurred frequently in the preceding novel musical context. This has been taken as an indication that, when unfamiliar with the rules of a musical idiom, listeners tend to assign tonal significance to pitch classes that occur often.
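The idea of assigning tonal significance by frequency of occurrence can be sketched as a simple count. This is an illustrative simplification rather than Oram and Cuddy's experimental procedure: the melody, the function `pitch_class_profile`, and the use of the single most frequent pitch class as a candidate tonal center are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical melody written as scale-degree indices (0-4) of an
# equi-pentatonic scale; the sequence is invented for illustration.
melody = [0, 2, 0, 3, 0, 1, 0, 4, 2, 0, 3, 0]


def pitch_class_profile(notes):
    """Relative frequency of occurrence of each pitch class in `notes`."""
    counts = Counter(notes)
    total = sum(counts.values())
    return {pc: n / total for pc, n in counts.items()}


profile = pitch_class_profile(melody)

# Treat the most frequently occurring pitch class as the candidate tonal center.
candidate_center = max(profile, key=profile.get)

print(profile)
print(candidate_center)
```

Because the count uses no interval information at all, it behaves identically whether the underlying scale is equal-spaced or not.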
Could it be the case that, in music based on equitonic scale systems, listeners familiar with the idiom depend more heavily on statistical distribution of pitch class to infer pitch class relations than listeners in non-equitonic musical traditions? This explanation becomes attractive only if the listener is able to assess many pitch class relations in a relatively short period of time. The rare interval hypothesis is attractive precisely because it is able to explain how a listener abstracts key with minimal statistical information (in fact, in many cases, without a single note having been repeated). However, Western music tends to be homophonic in texture, with a large emphasis being placed on melody plus harmonic ground. By contrast, many African musics differ substantially in texture, and boast a remarkable number of events within a short time period. Chopi xylophone music is typically densely polyphonic, and much equi-pentatonic harp music from Uganda is also reported to involve a fast stream of pitches (Wachsmann, 1950). A similar case holds for Ugandan xylophone, flute, and lyre music (Cooke, personal communication). Where dense polyphonic textures exist, the listener would be provided with a wealth of information on the statistical distribution of pitches in a very short period of time. In these situations, experienced listeners might therefore have little need to resort to the alternative strategy of relying on rare interval information.
Patterns and other musical features as learnt cues
It is possible that cues to tonal functionality could include particular melodic formulae, which are often repeated within the musical culture. When a listener hears such a formulaic structure, there is no need to take heed of the statistical distribution of pitches, or even rare interval information, because the formula will only be interpreted within certain frameworks. Learnt patterns with particular intervals (a combination of seconds and thirds, for instance) could thus give learnt cues to tonality (for example, that the lowest note is always the tonal center). Such a strategy may well work hand-in-hand with an assessment of distribution by means of weighted frequency of occurrence. Such patterns could be realized on a local (motivic) or global (e.g., harmonic) level.
Drone-like treatment of certain notes may also serve to disambiguate the tonal center in the absence of uniquely identifying intervals. It may be the case that drones are treated as specially weighted pitch classes, such that they outweigh the contribution of other tones that are present in the music.
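One way to picture a specially weighted pitch class is to weight each event by its duration rather than counting onsets. The sketch below is only an assumption about how such weighting might be modeled; the event list, the function `weighted_profile`, and the choice of duration as the weight are hypothetical.

```python
from collections import defaultdict

# Hypothetical events as (scale degree, duration in beats). Degree 3 is a
# single long, drone-like note; degree 0 recurs often but only briefly.
events = [
    (0, 0.5), (2, 0.5), (0, 0.5), (3, 8.0),
    (1, 0.5), (0, 0.5), (4, 0.5), (0, 0.5),
]


def weighted_profile(events, weight=lambda duration: duration):
    """Sum a weight for every event, grouped by scale degree."""
    totals = defaultdict(float)
    for degree, duration in events:
        totals[degree] += weight(duration)
    return dict(totals)


# Counting onsets: every event contributes equally.
by_count = weighted_profile(events, weight=lambda duration: 1.0)
# Weighting by duration: sustained notes contribute more.
by_duration = weighted_profile(events)

center_by_count = max(by_count, key=by_count.get)
center_by_duration = max(by_duration, key=by_duration.get)

print(center_by_count, center_by_duration)
```

In this constructed example the onset count favors the frequently repeated degree, while the duration weighting favors the drone, showing how the two cues could point to different candidate centers.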
Secondary musical parameters assume a more important role
Although made within the context of style contrasts in Western art music, Meyer’s distinction between primary and secondary musical parameters (1989, pp. 14–16) may point toward another explanation for pitch organization in equitonic music. When a musical event, or the relationship between successive musical events, is governed by rules, the event or relationship is a primary parameter (Meyer uses the term ‘syntactical’). Hence, in tonal music, intervals could be considered to be primary parameters. Changes that can be made without affecting rule-governed (syntactical) parameters—such as loudness—are called secondary parameters.
The regulated presence of secondary musical parameters (accents and dynamic shading, phrasing, metricality, and so on) may serve as a disambiguating cue in equitonic music. Even if the tonal center is not uniquely identified by statistical distribution or learnt cues, it could be that the intended tonal center is granted extra salience due to secondary parameters. Whether secondary parameters can indeed be disambiguating with regard to tonal center has not yet (to our knowledge) been established for Western tonal music.
Physical constraints external to the pitch system
It is possible that non-music-structural factors constrain and curtail the way that a musical system is organized. For example, it has been observed that maintenance of a hand position conducive to chord playing underlies many melodic fills in early blues (Nelson, 2002). In these cases, physical constraints beyond the idealized musical system have an impact on its fundamental nature. Similarly, an instrument might, of practical necessity, only produce certain pitches in certain relations, such as the two low fundamental notes often found in bow playing. Thus there may be something constraining the manner in which equitonic music is played that enables listeners to orient themselves within the tonal pitch space.
The Asena valimba xylophone appears to be equitonic for notes above 130 Hz (van Zanten, 1980, p. 113 and Appendix 2; see also Strumpf, 1999). Most instruments also have slats producing frequencies below 130 Hz. The range below 130 Hz features intervals bearing no obvious relationship to a 171 cent ideal, with some as small as 71 cents (e.g., valimba no. 16 in van Zanten’s second appendix), and even an abandonment of low-to-high note ordering in some cases. It is therefore conceivable that listeners could use knowledge of the deviant tuning in the lower register to orientate themselves with respect to the tonal center. Note, however, that the Asena valimba is an example in principle only. There appears to be no systematic scheme governing the size of intervals below 130 Hz, nor the number of keys or their ordering in this range, making it hard for listeners to use this information in a consistent and reliable manner.
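For readers less familiar with the cent scale, the following sketch shows how an interval in cents is obtained from two frequencies, and that a 171 cent step is what results from dividing the 1200 cent octave into seven equal parts. The slat frequencies are generated from the formula, starting at 130 Hz purely for illustration; they are not measurements of any valimba, and the function names are hypothetical.

```python
import math


def cents(f_low, f_high):
    """Interval between two frequencies in cents (1200 cents per octave)."""
    return 1200 * math.log2(f_high / f_low)


# Dividing the octave into seven equal steps gives roughly 171.4 cents each.
EQUAL_HEPTATONIC_STEP = 1200 / 7


def equal_step_frequencies(f_start, n_steps, step_cents=EQUAL_HEPTATONIC_STEP):
    """Frequencies of an idealized equal-spaced scale rising from f_start."""
    return [f_start * 2 ** (k * step_cents / 1200) for k in range(n_steps + 1)]


# Idealized (not measured) slat frequencies for one octave above 130 Hz.
slats = equal_step_frequencies(130.0, 7)

# Recover the size of every adjacent step from the frequencies alone.
steps = [cents(low, high) for low, high in zip(slats, slats[1:])]

print([round(f, 1) for f in slats])
print([round(s, 1) for s in steps])
```

Applying `cents` to adjacent measured frequencies and comparing the results with `EQUAL_HEPTATONIC_STEP` is one straightforward way to quantify how far a given register departs from the equal-spaced ideal.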
Rare intervals are statistical, not structural
So far, a ‘rare interval’ has been understood as being rare in terms of its occurrence within a tonal set. However, it may be the case that a particular interval is rare within the culture for non-structural reasons. For instance, it may be that the interval of a fourth only occurs in certain places within the octave because of instrument construction (e.g., with missing keys forcing particular melodic patterns), or simply because of a stylistic or esthetic principle that is dominant within the culture. Such intervals can function like learnt patterns, giving the listener a culturally determined cue to the tonal center. A Western example is the tritone. The tritone is a structurally rare interval, but it is made rarer still by the fact that it is difficult to produce vocally. Furthermore, the tritone has in the past had cultural connotations as ‘the Devil’s interval’. Similarly, in vocal music, large leaps are statistically rarer than stepwise motion.
Conclusion
In this paper, we have looked to the ethnomusicological literature for evidence that unequal-spaced scales may not be as universal as previously claimed. We suggest that the equitonal scale systems described by some ethnomusicologists may pose a major challenge to one of the fundamental assumptions underlying models of tonal cognition, namely that scales should be composed of at least two differing interval sizes between successive pitches. In particular, these accounts challenge the intervallic rivalry model as an exhaustive account of tonal cognition. Furthermore, although our focus here has been on African musical traditions, it is worth noting that equitonal systems and potentially related phenomena have also been reported in some European musics (Ambrazevičius, 2009). However, it must also be acknowledged that, the varied reports of equitonal scale systems notwithstanding, the existence of these scale systems is yet to be definitively established, and further research into more general aspects of cross-cultural pitch perception is required. One should be cautious not to force an idealized theoretical notion of equitonic tuning onto the data currently available, or onto data gathered in the future; to do so would be to fall into a similar trap to that which ensnared Alexander Ellis and his idealized notion of ‘tempered form in quartertones’ (see Ellingson, 1992, pp. 120–121; Nettl, 2005). Nevertheless, the present survey suggests that there is, at the very least, good reason to hypothesize the existence of equitonic tuning systems. We further hope to have shown that the ethnomusicological literature provides a major source of data for music psychologists, especially given the current lack of cross-cultural data in the music psychological literature.
Acknowledgements
The authors would like to thank Prof. Peter Cooke and Prof. Ian Cross for helpful critique and commentary on draft versions of this paper. We would also like to thank Prof. Andrew Tracey, Prof. Kofi Agawu, and the members of the Centre for Music and Science at the University of Cambridge for extensive discussions regarding this work, as well as the Centre for Mind in Society at Queen Mary University of London.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
