Abstract
Functional magnetic resonance imaging (fMRI) in humans and macaques allows a test of the hypothesis that there is a specialized neural ensemble for pitch within auditory cortex: a pitch center. fMRI measures the blood oxygenation level-dependent (BOLD) response related to regional synaptic activity (Logothetis et al., 2001). The distinction between synaptic activity and spike firing, together with species differences, encourages caution when comparing BOLD activity in humans and macaques with the single-neuron recordings from ferret and marmoset described in the previous mini-review. The BOLD data provide support for the pitch-center concept, with ongoing debate about its location.
Introduction
The auditory cortex of macaque and human is located in the superior temporal plane (Fig. 1). In the macaque, three core areas, including primary cortex (A1), run posterior to anterior. Using functional magnetic resonance imaging (fMRI), each can be seen to demonstrate distinct tonotopic (frequency) mapping (Petkov et al., 2006). Adjacent belt areas also show some degree of tonotopicity. In humans, organization of the auditory areas based on tonotopic mapping is currently debated (for recent human fMRI reports see Formisano et al., 2003; Talavage et al., 2004; Woods et al., 2009; Humphries et al., 2010; Da Costa et al., 2011; Langers and van Dijk, 2011). There is consensus that primary cortex is located in the medial part of the first transverse temporal gyrus (Heschl's gyrus, HG) in the superior temporal plane, while the status of lateral HG as a core or belt homolog is debated. Studies of human pitch representation have sought an area that is specialized for pitch coding within HG and adjacent auditory areas including the planum temporale (PT), posterior to HG.
fMRI approaches to pitch
fMRI pitch studies have used subtraction methodology, in which responses to a stimulus associated with pitch are compared with responses to a control stimulus with no pitch. These experiments have sought to control for differences in stimulus structure between the pitch and control sounds to which fMRI might be sensitive, including the frequency composition (spectrum) of the sound (Oxenham, 2012). The first fMRI studies of cortical pitch responses used regular-interval noise (RIN), a type of noise on which temporal regularity, and with it a percept of pitch, is imposed by a delay-and-add algorithm (Patterson et al., 2002; Hall et al., 2006). The use of low pitch values and a high-pass filter minimizes the spectral ripple in the stimulus, so that it can be compared with a control noise with the same passband to demonstrate responses that cannot be explained by changes in the time-averaged spectrum. The studies demonstrated maximal responses to RIN in lateral HG (although more medial responses in primary cortex are also observed; see Griffiths et al., 2010), which have been interpreted as pitch mappings. Similar activation in lateral HG has been demonstrated in an experiment that compared resolved harmonics (with high pitch salience) and unresolved harmonics (low pitch salience) in the same passband (Penagos et al., 2004). Other studies have used forms of binaural pitch (Hall and Plack, 2007, 2009; Puschmann et al., 2010), in which the imposition of a phase shift between the ears in a particular passband can be associated with a pitch. The experiments with binaural pitch have demonstrated responses in lateral HG and the adjacent part of PT.
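The delay-and-add construction of RIN mentioned above can be sketched as follows. This is a minimal illustration and not the stimulus code of the cited studies: the function names, the sampling rate, the delay, and the number of iterations are assumptions chosen for the example, and the high-pass filtering step is omitted.

```python
import random


def regular_interval_noise(n_samples, delay_samples, n_iterations, seed=0):
    """Apply a delay-and-add step n_iterations times to Gaussian noise.

    Hypothetical helper for illustration. Samples before the start of the
    signal are treated as silence (zero padding).
    """
    if not 0 < delay_samples < n_samples:
        raise ValueError("delay_samples must be between 1 and n_samples - 1")
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(n_iterations):
        # Delay the current signal, then add it back to itself.
        delayed = [0.0] * delay_samples + signal[:n_samples - delay_samples]
        signal = [s + d for s, d in zip(signal, delayed)]
    return signal


def normalized_autocorrelation(signal, lag):
    """Autocorrelation at a positive lag, divided by the signal variance term."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    numerator = sum(
        a * b for a, b in zip(centered[:len(centered) - lag], centered[lag:])
    )
    denominator = sum(c * c for c in centered)
    return numerator / denominator


# Assumed example settings: 16 kHz sampling rate and an 8 ms delay.
sample_rate_hz = 16000
delay = int(round(sample_rate_hz * 0.008))  # 128 samples

rin = regular_interval_noise(20000, delay, n_iterations=4, seed=1)
noise = regular_interval_noise(20000, delay, n_iterations=0, seed=1)

print("RIN autocorrelation at the delay lag:", normalized_autocorrelation(rin, delay))
print("Noise autocorrelation at the same lag:", normalized_autocorrelation(noise, delay))
```

With these assumed settings the repetition rate is 1/(8 ms) = 125 Hz. The temporal regularity shows up as a clearly positive autocorrelation at the delay lag for the RIN, whereas the unprocessed noise should give a value near zero.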
These human studies suggest a regional specialization for pitch in the lateral superior temporal plane. The precise interpretation of the studies depends critically on differences between pitch and control stimuli other than the presence of pitch. For example, the modeled representation of RIN stimuli in the auditory pathway (Hall and Plack, 2009) demonstrates slow fluctuations in the spectrum over time and, perceptually, the creation of RIN from noise produces a timbral change as well as a pitch change. Experiments using a more refined type of control noise containing similar fluctuations did not show significant RIN-related differences in activation in lateral HG (Barker et al., 2012). Experiments using resolved and unresolved harmonics produce differences in the auditory spectrum as well as differences in the pitch percept. The experimental manipulations that produce binaural pitch also cause a difference in the perceived spatial location of a sound, although it is possible to control for this to some extent (Puschmann et al., 2010).
The studies above were all based on categorical comparison of pitch and control stimuli. Parametric designs, in which pitch strength is manipulated by continuous variation of the stimulus, provide a powerful way of seeking responses from putative pitch areas, which would be predicted to increase as a function of any accompanying change in pitch salience. A recent detailed study (Barker et al., 2011), however, in which stimulus regularity and the associated pitch strength were varied continuously, did not show such a relationship in any area. Parametric stimulus manipulations also allow a critical test for a pitch area: it should respond only when the stimulus parameters are in the pitch range. Figure 1 shows illustrative macaque and human data. In both species, activation is shown in the lateral superior temporal plane for the contrast between RIN with repetition rates above and below the lower limit of pitch in humans (∼30 Hz; Pressnitzer et al., 2001). The histograms show blood oxygenation level-dependent (BOLD) data as a function of repetition rate, with little change in the inferior colliculus, but with responses in the lateral cortical area in both species that increase above the human lower limit of pitch.
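The logic of the pitch-range contrast described above can be sketched as a difference of condition means. This is only an illustration: the function name is hypothetical, the 30 Hz default is taken from the approximate lower limit cited above, and the response values are made-up numbers in arbitrary units, not data from Figure 1.

```python
def pitch_range_contrast(repetition_rates_hz, responses, lower_limit_hz=30.0):
    """Mean response above the lower limit of pitch minus mean response at or below it."""
    above = [
        r for rate, r in zip(repetition_rates_hz, responses) if rate > lower_limit_hz
    ]
    below = [
        r for rate, r in zip(repetition_rates_hz, responses) if rate <= lower_limit_hz
    ]
    if not above or not below:
        raise ValueError("need at least one condition on each side of the limit")
    return sum(above) / len(above) - sum(below) / len(below)


# Hypothetical repetition rates (Hz) and invented response amplitudes.
rates = [8, 16, 32, 64, 128, 256]
cortical_like = [0.10, 0.12, 0.30, 0.42, 0.50, 0.55]     # rises above the limit
subcortical_like = [0.20, 0.21, 0.20, 0.22, 0.21, 0.20]  # roughly flat

print("cortical-like contrast:", pitch_range_contrast(rates, cortical_like))
print("subcortical-like contrast:", pitch_range_contrast(rates, subcortical_like))
```

A region behaving like a pitch area would be expected to give a clearly positive contrast, while a region whose response does not depend on repetition rate would give a contrast near zero. A real analysis would estimate these effects within a statistical model of the BOLD time series rather than from condition means alone.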
Another suggested property of a pitch mechanism is that it should show pitch constancy: the same response to a given pitch value and strength regardless of the particular associated stimulus. This was first addressed (Hall and Plack, 2009) in a study in which seven stimuli with different spectral and temporal characteristics were each presented to 16 subjects: a pure tone, resolved and unresolved harmonic-complex tones, a wideband harmonic-complex tone, a binaural pitch stimulus (Huggins pitch), and two types of RIN. RIN produced responses in lateral HG, in good agreement with previous studies. However, a different pattern of activation was reported for the other five pitch-evoking stimuli, with PT most consistently activated across the group. Moreover, the data showed similar distributions of neural activation for the pure tone, the resolved and unresolved harmonic-complex tones, the wideband harmonic-complex tone, and the binaural pitch stimulus, despite their acoustic differences.
The categorical experiments using single stimuli and the experiments with multiple pitch stimuli assume pure insertion (Friston et al., 1996): a substrate for pitch representation that is independent of the perceptual and cognitive context. Experimental manipulation of pitch context has demonstrated that activity in primary cortex during pitch analysis is sensitive to context, while PT activation is independent of it (Garcia et al., 2010), consistent with the establishment of an invariant representation for pitch beyond primary cortex. Both forms of pitch analysis can be construed as levels within a hierarchical or heterarchical pitch system. Such organization can be examined in modeling studies using techniques similar to those used to model electrical data by Kumar and Schonwiesner (2012).
Many human studies employ group-level statistical inference and passive listening. These studies have the potential to overlook pitch-associated mechanisms that occur in functional areas that are anatomically variable between subjects, or that are related to cognitive strategies that might vary between subjects. Figure 2 shows group data from Hall and Plack (2009) in addition to individual maps of pitch constancy. Ten subjects responded to at least four of the pitch contrasts within the same brain region, a condition with a probability of <5 × 10⁻⁸, supporting a role for this region in pitch coding. However, the location of this site varied a great deal from listener to listener: nine sites in the PT, three sites in the planum polare, two in the superior temporal sulcus, and, in one subject, the inferior frontal gyrus. The functional role of these different regions could be examined using studies that systematically manipulate the pitch listening task in a within-subjects design.
The experiments above all use external stimuli. A pitch mechanism should be active during the perception of pitch regardless of whether any stimulus is present. Illusory pitch as an aftereffect has been examined using magnetoencephalography (Hoke et al., 1996), but not fMRI. Illusions and contextual effects that change pitch value or salience have not been exploited in fMRI studies to date, but they will provide a further test of the concept of a perceptual pitch center. A number of studies of musical imagery have addressed musical experience without external stimuli at the level of pitch sequences (Zatorre and Halpern, 2005).
Pitch sequences and timbre
Most pitch studies have assessed sequences in which the pitch was fixed, but natural stimuli including vocalizations and music usually contain pitch variation across sequences or in the form of glides. In passive listening experiments, varying the pitch within sequences engages a more distributed representation in the superior temporal lobe than a constant pitch does. Studies show bilateral activity in the anterior temporal lobes and the posterior part of the superior temporal gyrus (Patterson et al., 2002; Warren and Griffiths, 2003). Pitch-sequence studies in which there is an active listening task also engage the inferior lateral frontal lobe (Overath et al., 2007). Such frontal mechanisms in the right hemisphere have previously been emphasized as a substrate for pitch working memory (Zatorre et al., 1994).
Pitch stimuli are always associated with timbre: a distinct perceptual quality. A detailed discussion of timbre is beyond the scope of this review (see Griffiths et al., 2009), although behavioral evidence suggests an interdependence of pitch and timbre perception (Moore and Glasberg, 1990; Krumhansl and Iverson, 1992; see also Oxenham, 2012). A number of fMRI experiments in which timbral dimensions are modified independently of pitch point to overlapping early substrates in auditory cortex in HG and PT (Warren et al., 2005; Overath et al., 2008, 2010), in addition to the engagement of remote areas such as the superior temporal sulcus. The early overlapping substrates for pitch and timbre in HG and PT might in future be disambiguated using pattern analysis as below.
Beyond conventional pitch mapping
Multivariate pattern analysis allows the discrimination of responses to stimuli with different characteristics that cannot be discriminated by conventional fMRI analyses based on mass-univariate statistics (Haynes and Rees, 2006). A recent study (Staeren et al., 2009) examined the effect of pitch variation in natural sounds. Distinct patterns of response to different pitch values were demonstrated in the region spanning lateral HG and anterolateral PT. The result is consistent with the earlier studies using conventional fMRI analysis. Another way of examining responses within areas that are not shown by conventional analysis uses the phenomenon of repetition suppression: a decrease in BOLD activity as a function of stimulus repetition. The technique allows a search for sensory coding mechanisms in neuronal populations independent of context. Models for the phenomenon (Grill-Spector et al., 2006) include adaptation within a defined population of neurons, a decrease in the pool of neurons from which a response occurs, and alteration in the response time course. The technique allows the search for suppression of responses to repeated pitch in a subpopulation of neurons regardless of the stimulus with which the pitch is associated, another predicted behavior of a pitch mechanism (Baumann et al., 2011).
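The principle behind the multivariate pattern analysis introduced at the start of this section can be illustrated with a toy correlation-based classifier evaluated by leave-one-run-out cross-validation. This is a sketch under stated assumptions and not the method of Staeren et al. (2009): the function names, the condition labels, the six-voxel patterns, and the noise level are all invented for the example.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mean_pattern(patterns):
    """Voxel-wise mean across a list of patterns."""
    return [sum(values) / len(values) for values in zip(*patterns)]


def regional_mean(pattern):
    """Average over voxels: the quantity a region-level univariate test would see."""
    return sum(pattern) / len(pattern)


def leave_one_run_out_accuracy(runs):
    """runs: list of dicts mapping a condition label to a voxel pattern."""
    correct, total = 0, 0
    for i, test_run in enumerate(runs):
        training_runs = runs[:i] + runs[i + 1:]
        templates = {
            label: mean_pattern([run[label] for run in training_runs])
            for label in test_run
        }
        for label, pattern in test_run.items():
            # Assign the test pattern to the best-correlated training template.
            predicted = max(templates, key=lambda c: pearson(pattern, templates[c]))
            correct += predicted == label
            total += 1
    return correct / total


def simulate_runs(n_runs=6, noise_sd=0.1, seed=0):
    """Synthetic data: same regional mean per condition, different voxel pattern."""
    rng = random.Random(seed)
    base = {
        "low_pitch": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
        "high_pitch": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    }
    return [
        {
            label: [v + rng.gauss(0.0, noise_sd) for v in pattern]
            for label, pattern in base.items()
        }
        for _ in range(n_runs)
    ]


runs = simulate_runs()
print("classification accuracy:", leave_one_run_out_accuracy(runs))
print("regional mean, low pitch:", regional_mean(runs[0]["low_pitch"]))
print("regional mean, high pitch:", regional_mean(runs[0]["high_pitch"]))
```

In this synthetic case the two conditions have nearly the same response averaged over voxels, so a region-level comparison would find little difference, yet the voxel patterns are distinct and the classifier separates them. Real analyses involve many more voxels, noisier data, and statistical testing of the classification accuracy against chance.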
Future directions
The human experiments above implicate regions lateral and posterior to primary auditory cortex in pitch representation, with ongoing debate about precise position. The data suggest a role for this area in pitch analysis based on experiments with different pitch-associated stimuli that might undergo initial sensory analysis in primary cortex. Preliminary data suggest that a similar pitch region lateral to primary cortex exists in the macaque, so that we are now in a position to establish a primate model for pitch perception using paradigms that are comparable to those used in humans. In addition to establishing regional organization, the macaque work will allow identification of key areas within the network for neurophysiology to further establish neuronal mechanisms for the abstraction and use of pitch.
Human fMRI is particularly suitable for examining the distributed processing circuit supporting pitch cognition, which has not been possible in previous studies based on passive listening. The roles of lateral and posterior portions of the superior temporal plane and their remote connections in active pitch listening, including selective attention, working memory, and object categorization, require further definition.
Footnotes
T.D.G. is a Wellcome Trust Senior Clinical Fellow. All fMRI studies conducted by D.A.H. were supported by the Medical Research Council (UK). D.A.H. is currently supported by the National Institute for Health Research.
Correspondence should be addressed to either of the following: Timothy D Griffiths, Institute of Neuroscience, Newcastle University Medical School, Newcastle upon Tyne, NE2 4HH, UK, t.d.griffiths{at}ncl.ac.uk, or Deborah A Hall, National Institute for Health Research Nottingham Hearing Biomedical Research Unit, Ropewalk House, 113 The Ropewalk, NG1 5DU, UK, Deb.Hall{at}nottingham.ac.uk