Research Report
Effects of attention on the neural processing of harmonic syntax in Western music
Introduction
Music we encounter every day is composed according to rules of tonality and harmony. Following repeated cultural exposure, the human brain is thought to implicitly form expectations in accordance with traditional musical styles [13], [14], [26]. In recent years cognitive neuroscience researchers have explored the neural underpinnings of musical expectation, especially through the use of event-related potentials (ERPs) [1], [7], [8], [9], [10], [11], [12].
The theoretical musical context of these studies relates to a rich set of rules and principles of harmonic music theory. Traditional Western music is generally composed of sequentially and simultaneously occurring pitches organized around a major or minor key, where the key typically serves as the reference and endpoint of a musical piece. Sequentially presented pitches form the melody in music, whereas simultaneously presented pitches give rise to harmony. A chord is comprised of three or more simultaneously occurring pitches, and the root of a chord is the reference pitch within the chord. Chord progressions, or series of chords, illustrate the basic grammatical structures upon which musical harmony is built. A prototypical chord progression begins and ends with a tonic chord, which has a root that is the same pitch as the name of the key—for instance, for a piece in the key of C major, the tonic chord has a root of C and is comprised of the pitches C, E, and G. The tonic chord of a progression is notated with the Roman numeral “I”. The penultimate chord of the prototypical progression is the dominant chord, which has a root that is five increments away from tonic and is thus notated “V”. Immediately preceding the dominant is usually the predominant, which can be one of several chords, typically a chord with its root four increments away from the root of the tonic (notated “IV”). Between the first tonic and the predominant, composers usually insert other chords to prolong the sense of the initial tonic. One chord frequently used here is a special case of the tonic chord—the tonic in first inversion (“I6”), which is a tonic chord with a different configuration of pitches from the typical tonic chord. The five-chord progression of tonic–tonic in first inversion–predominant–dominant–tonic (i.e., I–I6–IV–V–I) provides a phrase structure that tends to be a prototypical basis of harmonic syntax in Western music (e.g., [24]).
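As an illustration only, the prototypical progression described above can be spelled out programmatically. The following minimal sketch reads the "increments" in the text as scale degrees of the major scale and builds each chord by stacking alternate scale degrees. The function names, the pitch-class numbering, and the simplified flat-only note spelling are assumptions made for this example and are not part of the original study.

```python
# Pitch classes: 0 = C, 1 = Db, ..., 11 = B (flat-only spelling, a simplification)
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Semitone offsets of the seven major-scale degrees above the tonic
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]


def major_scale(tonic):
    """Return the pitch classes of the major scale built on `tonic`."""
    return [(tonic + step) % 12 for step in MAJOR_SCALE_STEPS]


def triad(tonic, degree, inversion=0):
    """Return note names of the diatonic triad on a 1-based scale degree.

    The triad stacks every other scale degree (root, third, fifth).
    `inversion=1` rotates the root to the top, giving first inversion.
    Notes are listed from lowest to highest.
    """
    scale = major_scale(tonic)
    index = degree - 1
    pitch_classes = [scale[(index + k) % 7] for k in (0, 2, 4)]
    pitch_classes = pitch_classes[inversion:] + pitch_classes[:inversion]
    return [NOTE_NAMES[pc] for pc in pitch_classes]


def prototypical_progression(tonic=0):
    """Return the I-I6-IV-V-I progression as (label, notes) pairs."""
    return [
        ("I", triad(tonic, 1)),
        ("I6", triad(tonic, 1, inversion=1)),
        ("IV", triad(tonic, 4)),
        ("V", triad(tonic, 5)),
        ("I", triad(tonic, 1)),
    ]


if __name__ == "__main__":
    for label, notes in prototypical_progression(0):  # key of C major
        print(label, notes)
```

In the key of C major this yields the tonic as C, E, G, matching the example in the text, and the first-inversion tonic as the same three pitches with E as the lowest note.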
As the above phrase structure has become an accepted norm in music theory over the centuries, vast numbers of musical phrases have been composed according to this underlying structural rule. The chord progression I–I6–IV–V–I thus reflects a structural rule which can be viewed as an instance of syntax in musical grammar [1]. Following repeated exposure to traditional Western music, most individuals have come to expect the prototypical musical (or harmonic) syntax, and listening to a musical phrase that deviates from this syntactical rule leads to a violation of expectations. These expectations and their violations have been studied from theoretical and empirical perspectives [2], [7], [17] and have been implicated as the source of emotion and meaning in music [17].
In addition to the usual chords, composers occasionally substitute other chords, such as the Neapolitan sixth, in the predominant position. The Neapolitan sixth (N6) chord is a major chord where the root pitch is one half of an increment above the tonic pitch [24]; it is derived from the minor scale and is typically used in the first inversion in place of predominant chords, especially in minor keys [24]. Thus, in the key of C, it would be a combination of the pitches F, A-flat, and D-flat (half increment from C). The N6 chord is conventionally placed in the predominant position; when it is placed in a different position, such as in place of the final tonic, the typical or expected musical syntax is violated. The N6 chord is not dissonant by itself, but when the context of a particular key has been established, the N6 chord sounds out of place at the end of the sequence and thus violates harmonic expectation [7], [9], [11]. Such a violation of harmonic expectation has been examined in a few recent ERP studies.
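The construction of the Neapolitan sixth can be made concrete with a short sketch. It assumes that "one half of an increment above the tonic" means one semitone above the tonic pitch, builds a major triad on that root, and places it in first inversion. It also lists which chord tones fall outside the major scale of the key, one simple way to see why the chord can sound out of place once a key is established. All names in the code are hypothetical and chosen for this illustration.

```python
# Pitch classes: 0 = C, 1 = Db, ..., 11 = B (flat-only spelling, a simplification)
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Semitone offsets of the seven major-scale degrees above the tonic
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]


def neapolitan_sixth_pitch_classes(tonic):
    """Return pitch classes of the N6 chord, lowest note first.

    Assumption: the root lies one semitone above the tonic, the chord is
    a major triad (root, +4, +7 semitones), and first inversion puts the
    third of the chord in the bass.
    """
    root = (tonic + 1) % 12
    major_triad = [root, (root + 4) % 12, (root + 7) % 12]
    return major_triad[1:] + major_triad[:1]


def neapolitan_sixth(tonic):
    """Return the note names of the N6 chord in the given key."""
    return [NOTE_NAMES[pc] for pc in neapolitan_sixth_pitch_classes(tonic)]


def out_of_key_notes(tonic):
    """Return the N6 chord tones that are not in the major scale of the key."""
    scale = {(tonic + step) % 12 for step in MAJOR_SCALE_STEPS}
    return [
        NOTE_NAMES[pc]
        for pc in neapolitan_sixth_pitch_classes(tonic)
        if pc not in scale
    ]


if __name__ == "__main__":
    print(neapolitan_sixth(0))   # key of C
    print(out_of_key_notes(0))   # chord tones foreign to C major
```

For the key of C the sketch reproduces the pitches given in the text (F, A-flat, and D-flat), of which A-flat and D-flat do not belong to the C major scale.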
An early study involving ERPs and musical expectancy was conducted in order to compare musically-evoked neural activity in musicians and nonmusicians [2]. In an attempt to find an electrophysiological marker of musical expectancy, melodies with congruous (expected) terminal notes were contrasted against melodies with incongruous endings that either obeyed or violated the theoretical norms of melodic expectation. A late positive component (LPC) peaking at around 600 ms over parietal sites was evoked by incongruous endings, especially those that violated harmonic rules. This effect was significant in both musicians and nonmusicians, but was more pronounced in musicians. Based on these results, the LPC was thought to reflect a decision process based on implicit musical knowledge [2].
The above study, while providing the first ERP data to address musical expectation, investigated the neural response to deviants that violated both melodic and harmonic expectation. From a somewhat different standpoint, ERPs were used in the study of musical expectancy with respect to the violation of harmonic contexts [7]. Participants in this study listened to chord progressions and decided whether they resolved correctly. Results from this study provided evidence for an earlier, centrally distributed P3a component (peaking at approximately 350 ms) that was separate from the later, parietal, and much larger P3b (450 ms) component within the P300 complex. It was proposed that the P3a component was elicited in response to unexpected musical-syntactic events, whereas the P3b component was produced by a shift in voluntary attention in response to a target. In a study testing the language-like specificity of the syntax-related P600 component, a positive component peaking at 600 ms was elicited in response to an out-of-key chord within a harmonic phrase; in addition, a positive component peaking at approximately 300 ms was again elicited, the amplitude of which correlated with the degree of harmonic violation in musical melodies [23].
This positive component at 300 ms was also reported by Koelsch et al., who presented subjects with sequences of various chord progressions [9]. While most of the progressions followed the (I–I6–IV–V–I) syntax, some of the progressions contained the Neapolitan sixth (N6) chord: 25% of the trials included the N6 in the syntactically proper third position (replacing the IV from the prototypical progression), whereas another 25% of the trials included the N6 in a harmonically improper and unexpected fifth position, replacing the final tonic chord. Relative to the IV and I chords, the Neapolitans in both third and fifth positions elicited a response termed the Early Right Anterior Negativity (ERAN) at around 150–250 ms, the aforementioned P3a component peaking at 350 ms, and a late frontal negative wave (N5) component with an onset at around 380 ms and a peak at 550 ms. The ERAN was thought to reflect neural activity in response to the initial detection of the violation of harmonic expectancy, and was unaffected by task relevance of the unexpected chords. In contrast, the subsequent N5, occurring later at a cognitive rather than sensory level, was proposed to reflect processes of harmonic integration [9], [11]. The ERAN, P3a, and N5 were all elicited by N6 chords, but all these waveforms were smaller in amplitude when the N6 chord was in the third rather than the fifth position, a result suggesting that the N6 chord was less unexpected when in third position, in accordance with its predominant harmonic function.
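Components such as the ERAN and N5 are commonly quantified as the mean amplitude of a deviant-minus-standard difference wave within a latency window. The sketch below shows that general calculation on made-up numbers. It is not the analysis pipeline of the studies cited here: the window bounds are taken loosely from the latencies quoted above (the 600 ms upper bound for the N5 is an assumption), and the function names and data are hypothetical.

```python
# Latency windows in ms. ERAN bounds follow the text above; the N5 upper
# bound of 600 ms is an assumption made for this illustration.
WINDOWS_MS = {"ERAN": (150, 250), "N5": (380, 600)}


def difference_wave(deviant, standard):
    """Subtract the standard-chord ERP from the deviant-chord ERP, sample by sample."""
    if len(deviant) != len(standard):
        raise ValueError("ERPs must have the same number of samples")
    return [d - s for d, s in zip(deviant, standard)]


def mean_amplitude(wave, times_ms, window):
    """Average the wave over all samples whose latency lies inside `window` (inclusive)."""
    start, end = window
    samples = [v for v, t in zip(wave, times_ms) if start <= t <= end]
    if not samples:
        raise ValueError("no samples fall inside the requested window")
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Synthetic single-channel ERPs in microvolts, sampled every 100 ms
    times_ms = [0, 100, 200, 300, 400, 500]
    standard = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    deviant = [0.0, 0.0, -2.0, 0.0, -1.0, -1.0]

    diff = difference_wave(deviant, standard)
    for name, window in WINDOWS_MS.items():
        print(name, mean_amplitude(diff, times_ms, window))
```

A more negative mean amplitude in a given window would correspond to a larger negativity for the deviant chord, which is the sense in which the text describes these components as smaller or larger across chord positions.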
An important aspect of processing activations such as the ERAN and N5 is their degree of automaticity. To investigate whether the ERAN and the N5 could be elicited preattentively, another study manipulated the allocation of attentional resources during harmonic syntax processing [11]. The experiment involved attended and unattended conditions. In the unattended block, subjects read a self-selected book under the instruction to ignore all acoustic stimuli. In the attended block, subjects were asked to detect and respond with a button press to N6 chords in the third and fifth positions. An early right anterior negativity (ERAN) during 150–210 ms and an N5 (380–600 ms) were found in response to N6 chords in both the third and fifth positions. In comparing attended and unattended blocks, the authors report that “The amplitude of the ERAN does not significantly differ between both blocks, suggesting that the processes underlying the generation of the ERAN are fairly independent of attention” ([11], p. 45). The N5 in the attended condition was not observed, but it appears that it may have been masked by the large target-related P3b wave in response to the target N6 chord [11].
Previously, neural processes underlying other sensory and cognitive features of auditory stimuli were shown to be modulated by attention. One waveform that seems to be particularly comparable to the ERAN is the Mismatch Negativity (MMN), a negative component elicited by a deviant auditory stimulus (deviant in a basic auditory sensory feature such as pitch or intensity) in an ongoing stimulus train [9], [20], [25]. Prior studies have shown that the MMN, like the ERAN, is automatically elicited [10], but is relatively larger in amplitude in attended channels [28], [30], [32]. In addition, the unattended MMN was also shown to be larger in musicians than in nonmusicians [10]. As the MMN has an onset time similar to that of the ERAN, and is sensitive to attentional modulation, it would seem likely that the ERAN could also be modulated by attentional factors.
Other more cognitive waveform components that are comparable to the N5 are also modulated by attention. One such component is the N400, which is typically elicited by semantic incongruity for words [15]. The N400 was also found to be strongly modulated in amplitude when in an attended vs. an unattended channel [16]. As the N400 and the N5 are evoked by semantic violations in language and music respectively [9], [15], and are elicited at similar latencies, these two waveforms could reflect comparable mental processes, and thus might also be similarly modulated by selective attention.
In the attention paradigm described above [11], the unattended condition involved subjects reading self-selected material. Unlike most ERP studies, in which trials with eye movements are rejected as containing artifacts (e.g., [1], [5], [20], [29]), only EEG trials that actually contained horizontal eye movements were included in the data analysis, as an indicator that the subjects were actually reading during EEG recording. To account for the increased variance caused by eye movements, more trials were included in the unattended than in the attended condition. While reading had been previously reported as an ignore-auditory task (e.g., [20], [25]), there had been few reports of specifically including only trials containing eye movements in EEG data analysis. As eye movements were the only indicator of reading in the original study, it may not be the case that subjects were truly devoting maximal attentional resources to the reading task; subjects could have been scanning the material without truly paying attention. Moreover, although horizontal eye movements during reading are unavoidable, they add non-neural physiological noise to the EEG and ERPs. In order to ensure attentiveness to reading during the unattended condition, the present study employed controlled reading passages and post-run tests of comprehension. Additionally, in order to minimize ocular artifacts from the horizontal eye movements during reading, the eye movements were kept to a minimum by setting the reading material at a substantial distance from the eyes.
During the attended block, subjects in the Koelsch et al. study were instructed to listen for and respond to the N6 chords; this implies that instead of comparing an attended condition to an unattended one, the study was in fact comparing an attended and task-relevant-target condition to an unattended condition. To test for the effects of attention on the processing of harmonic deviants, separately from the effects of those deviants being targets or not, the present study required subjects to listen carefully to the music, but to detect and respond to a feature unrelated to harmony. From the data presented by the previous study it was evident that the task relevance of the N6 chord led to a very large target-related P3b wave, which seems likely to have masked or otherwise distorted the N5 in the attended (and task-relevant) condition, and possibly also affected the ERAN. By devising another task that ensures continuous attention to the auditory stimuli, but avoids any task relevance of the N6 chord, the masking effect of the P3b could be avoided.
The purpose of the present study was to investigate the effects of attention on brain activity associated with violation of harmonic expectancy, using a manipulation of attention in which sequences with the deviant N6 chords were either attended or ignored, but their deviancy was neither target-defining nor task-relevant. Thus, in the attended condition here, subjects were instructed to detect an occasional decrease in sound intensity of any of the sounds. This use of an orthogonal task was similar to an approach used in some prior studies of music processing (e.g., [9]). In the unattended condition, subjects studied reading comprehension passages in preparation for answering questions concerning these passages. Moreover, their answers to these questions were statistically analyzed, thereby providing behavioral data confirming that the subjects were attending away from the chord progressions during this condition.
Subjects
Eighteen nonmusicians (10 females, 8 males, mean age 23.3 years, age range 18–49, standard deviation 6.9) participated in this study. No participant had received more than 2 years of education in any instrument or voice outside of normal school education. All subjects were right handed and reported having normal hearing, normal or corrected-to-normal vision, and no history of neurological or psychiatric disorder. All subjects were volunteers recruited by email from the Duke University
Behavioral data
For the unattended condition, the reading comprehension questions were multiple-choice with five choices; therefore, chance level was 20%. All but one of the subjects performed above 50% correct in answering the reading comprehension questions. The high level of performance among these subjects was especially significant as the national average of performance on such reading comprehension passages is below 50% [3]. To ensure that subjects were attending closely to the reading (and ignoring the
Discussion
From previous research, the prominent waveforms identified as reflecting violation of musical expectancy were the Early Right Anterior Negativity (ERAN), Late Positive Complex (LPC), and N5 [2], [7], [9], [11], [21], [23]. In the present study, we observed an Early Anterior Negativity (EAN), similar to the ERAN, and a Late Negativity, similar to the N5.
The EAN was evoked in response to the unexpected Neapolitan chord in the latency window between 150 and 300 ms (peaking at around 180 ms) over
Acknowledgments
This work was supported by NIMH RO1-MH60415 to M.G.W. We thank A. Landau, A. Finn, and C. Lucas for suggestions on earlier versions of the manuscript, along with S. Koelsch and an anonymous referee for their helpful review comments.
References (33)
- et al., Modulation of semantic processing by spatial selective attention, Electroencephalogr. Clin. Neurophysiol. (1993)
- Implications of ERP data for psychological theories of attention, Biol. Psychol. (1988)
- et al., Stimulus deviance and evoked potentials, Biol. Psychol. (1982)
- et al., Event-related potentials elicited by wrong terminal notes: effects of temporal disruption, Biol. Psychol. (2000)
- et al., Auditory frequency discrimination and event-related potentials, Electroencephalogr. Clin. Neurophysiol., Evoked Potentials (1985)
- et al., Modulation of early auditory processing during selective listening to rapidly presented tones, Electroencephalogr. Clin. Neurophysiol. (1991)
- et al., The temporal dynamics of the effects in occipital cortex of visual–spatial selective attention, Cogn. Brain Res. (2002)
- The Unanswered Question: Six Talks at Harvard (Charles Eliot Norton Lectures) (1973)
- et al., An event-related potential (ERP) study of musical expectancy: comparison of musicians with nonmusicians, J. Exp. Psychol. Hum. Percept. Perform. (1995)
- Educational Testing Systems, The Graduate Record Exam (2003)
- An evaluation of the automaticity of sensory processing using event-related potentials and brain–stem reflexes, Psychophysiology
- Processing a second language: late learners' comprehension mechanisms as revealed by event-related brain potentials, Bilingualism: Lang. Cogn.
- Electrophysiology of cognition
- ERP measures assay the degree of expectancy violation of harmonic contexts in music, J. Cogn. Neurosci.
- Superior pre-attentive auditory processing in musicians, NeuroReport
- Brain indices of music processing: ‘Non-musicians’ are musical, J. Cogn. Neurosci.