Behavioral and ERP measures of holistic face processing in a composite task
Introduction
Faces, unlike many other types of objects, are processed holistically, which means that they are encoded as one inseparable unit, rather than as a group of individual features or parts (Tanaka and Farah, 1991, Tanaka and Farah, 1993). Evidence of holistic processing can be observed in part-whole paradigms, in which subjects are better able to match or recognize individual facial features that are presented within the context of a whole face than features that are presented alone (Tanaka & Farah, 1993). Evidence of part versus whole effects and holistic processing is observed in adults and in typically developing children as young as four years of age (de Heering et al., 2007, Pellicano and Rhodes, 2003).
Holistic face processing effects can also be observed in tasks that require the restriction of attention to only one half of the face at a time. In some tasks, composite faces are created by combining the top half of one famous or familiar face with the bottom half of another face, and subjects are then asked to identify only the person depicted in the top or the bottom half of the face (Young, Hellawell, & Hay, 1987). When the two face halves are presented in alignment with one another, they join to form a new face configuration and the stimulus is encoded holistically. In this case, recognition of the individual parts of the face is difficult because the new configuration interferes with the recognition of the individual features within each half. By contrast, a new overall facial configuration does not result when the two halves of the composite face are spatially misaligned with one another. In this case, there is no interference due to holistic encoding of the stimulus and subjects recognize the source images more easily (Young et al., 1987).
Discrimination of unfamiliar faces is similarly disrupted by holistic processing in composite tasks (Hole, 1994, Hole et al., 1999, Le Grand et al., 2004). In these tasks, subjects are presented with two composite faces and are asked to make same or different judgments based on only one half of the faces. The unattended halves of the two faces differ on every trial. When the two halves of the composite face are aligned, the face is processed holistically. The new configuration of features that results from pairing identical attended halves with differing unattended halves reduces the probability that subjects will recognize that the attended halves are the same within the trial. However, when the two halves of the composite face are spatially misaligned, holistic processing is disrupted, allowing better selective attention to the attended half of the face, and consequently, better performance in the misaligned relative to the intact condition. Thus, the critical question is the degree to which spatial alignment affects the ability to detect when the attended halves are the same. Recognizing that attended halves are different is relatively easy in this task because in this instance both the attended and unattended halves of the face stimuli differ.
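The logic of this same/different composite measure reduces to simple arithmetic: the index of holistic processing is the accuracy advantage for misaligned over aligned faces, which should be large on "same" trials and near zero on "different" trials. A minimal sketch of that computation follows; the accuracy values are purely hypothetical illustrations, not data from this or any cited study.

```python
# Hypothetical per-condition accuracies for one subject in a
# same/different composite task (illustrative values only).
accuracy = {
    ("aligned", "same"): 0.70,       # holistic interference makes "same" hard to see
    ("misaligned", "same"): 0.90,    # misalignment releases the interference
    ("aligned", "different"): 0.92,  # both halves differ, so "different" is easy
    ("misaligned", "different"): 0.94,
}

def alignment_effect(acc, trial_type):
    """Misaligned-minus-aligned accuracy for one trial type."""
    return acc[("misaligned", trial_type)] - acc[("aligned", trial_type)]

# Holistic processing predicts a larger alignment effect on "same"
# trials (0.90 - 0.70) than on "different" trials (0.94 - 0.92).
same_effect = alignment_effect(accuracy, "same")
diff_effect = alignment_effect(accuracy, "different")
assert same_effect > diff_effect
```

In group analyses this same contrast appears as an alignment-by-trial-type interaction: a reliable interaction, driven by the "same" trials, is the behavioral signature of holistic processing in this paradigm.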
In sum, holistic processing is characterized by the encoding of the face as a single stimulus, with difficulty attending selectively to individual features or parts of the face. Selective attention to features can be improved by misaligning sections of the face because the misalignment disrupts holistic processing.
Event-related potential (ERP) studies of face processing have focused on several components that appear to be face-sensitive, or show different activity for faces compared to other objects. These peaks are also sensitive to information within the face, such as spacing and orientation of features.
The earliest peak is the P1, a positive-going peak around 100 ms post-stimulus. The P1 is observed during many visual tasks, and some studies have reported that this component responds differently to faces than other objects (e.g., Herrmann, Ehlis, Ellgring, & Fallgatter, 2004). The P1, observed in posterior, lateral electrodes, is sensitive to face information, including orientation (Henderson et al., 2003, Itier and Taylor, 2002, Itier and Taylor, 2004a, Itier and Taylor, 2004b, Itier and Taylor, 2004c, Linkenkaer-Hansen et al., 1998, Taylor et al., 2004), thatcherization, a technique which inverts the orientation of the eyes and mouth within the face (Milivojevic, Clapp, Johnson, & Corballis, 2003), and changes in the spacing of features and typicality or attractiveness of the face (Halit, de Haan, & Johnson, 2000). However, some studies have failed to find thatcherization effects on the P1 (Boutsen, Humphreys, Praamstra, & Warbrick, 2006), and therefore these effects may be dependent on stimulus or task characteristics.
The N170, a negative-going peak around 170 ms post-stimulus recorded in posterior lateral sites, clearly differentiates faces as a class of stimuli from other visual objects and is therefore considered to index the structural encoding of the face (Bentin et al., 1996, Bentin and Deouell, 2000, Itier and Taylor, 2004a, Itier and Taylor, 2004b, Itier and Taylor, 2004c, Rossion et al., 1999, Rossion et al., 2003). This component is highly sensitive to face inversion (Eimer, 2000, Rossion et al., 1999), contrast reversal (Itier & Taylor, 2002), and thatcherization (Carbon et al., 2005, Milivojevic et al., 2003). Furthermore, in studies using Mooney faces, which are devoid of individual features but retain the overall configuration of a face, the N170 is larger for stimuli correctly identified as faces than for those not perceived as faces, and an inversion effect is present only for stimuli perceived as faces (George, Jemel, Fiori, Chaby, & Renault, 2005). The N170 tends to be faster and larger in the right hemisphere (RH) than the left hemisphere (LH) (Bentin et al., 1996), although this can vary with task and stimulus characteristics. Source-localization studies have localized the generator(s) of the N170 to the fusiform gyrus, the superior temporal sulcus, or both (Caldara et al., 2003, Itier and Taylor, 2002, Itier and Taylor, 2004a, Itier and Taylor, 2004b, Itier and Taylor, 2004c, Schweinberger et al., 2002a, Schweinberger et al., 2002b).
The VPP is a positive-going peak observed over frontal electrodes at the same latency as the N170. Previous research has shown that the VPP, like the N170, is sensitive to face inversion (Eimer, 2000, Jemel et al., 2003, Rossion et al., 1999). Some source-localization studies conclude that the VPP and N170 components reflect activity from the same neural dipole, probably located in or near the fusiform gyrus (Itier and Taylor, 2002, Joyce and Rossion, 2005, Rossion et al., 1999), while others document differences suggestive of two independent neural generators (Bentin et al., 1996, Botzel et al., 1995, Eimer, 2000, George et al., 2005).
Later components in the ERP are also sensitive to face information within the context of a task. The P2, a positive-going peak recorded around 200 ms in posterior temporal sites, is sensitive to facial configuration and is larger over the right hemisphere (RH) in response to faces (Halit et al., 2000). It is also affected by thatcherization (Boutsen et al., 2006, Carbon et al., 2005, Milivojevic et al., 2003), familiarity of faces (Caharel et al., 2002), and emotional facial expressions (Stekelenburg & de Gelder, 2004). Many studies suggest that the P2 is involved with processing configural relations between features (Itier and Taylor, 2002, Linkenkaer-Hansen et al., 1998, Milivojevic et al., 2003) or processing of stored representations of familiar faces (Caharel et al., 2002).
The N250, a negative-going peak around 225–250 ms, has been associated with repetition and familiarity in many studies of face perception. This peak shows a RH asymmetry, and because it tends to be larger for familiar than for unfamiliar faces and for repetition of unfamiliar faces, some have posited that it indexes face familiarity (Begleiter et al., 1995, Schweinberger, 1995, Schweinberger et al., 2002a, Schweinberger et al., 2002b). The repetition effect for this peak is also delayed by inversion and contrast reversal (Itier and Taylor, 2004a, Itier and Taylor, 2004b, Itier and Taylor, 2004c), suggesting a sensitivity to the configural relations of features.
In the context of the current study, the effect of part-based versus whole-face processing on these face-sensitive peaks is most relevant. The ERP components described above often vary in their latency, amplitude, and hemispheric distribution depending upon whether they are elicited by whole faces or parts of faces, and they also vary depending on which face parts are presented. Firstly, the typical RH asymmetry for face processing can be affected by holistic as compared to part-based or featural processing. “Global” or holistic processing in general is associated with a RH asymmetry, while “analytical” or featural processing is associated with a LH asymmetry (Hillger and Koenig, 1991, Rhodes, 1985, Rhodes et al., 1993, Van Kleeck, 1989). Functional imaging studies further support this difference in hemispheric asymmetry. The right fusiform gyrus, a highly face-sensitive brain region (Kanwisher et al., 1997, Puce et al., 1995), is most active when subjects match whole faces, while the left fusiform gyrus is most active when subjects match individual facial features (Rossion et al., 2000). Thus, holistic processing maintains the general RH asymmetry observed across a variety of tasks, while featural or parts-based processing elicits the opposite asymmetry.
Secondly, the latency and amplitude of face-sensitive ERP components are affected differently by holistic as compared to parts-based processing. The N170 peaks later in response to face parts than to whole faces, and this delay may reflect increased difficulty in determining whether or not the stimulus represents a face (Bentin et al., 1996). Parts-based processing varies across facial features, with multiple studies suggesting that the eyes appear to have a privileged status. For example, the N170 is largest in response to whole faces and isolated eyes, and smaller in response to other facial features (Bentin et al., 1996, Itier et al., 2006, Taylor et al., 2001). Latencies of the N170 are also faster when eyes are present within a face than when they are absent (Eimer, 1998), and the effects of inversion and contrast reversal may be driven by the changes in the location of the eyes and their local contrast (Itier et al., 2006). Indeed, the effect of inversion on N170 amplitude disappears if the face stimuli do not contain eyes (Itier, Alain, Sedore, & McIntosh, 2007). Thus, the eyes are particularly salient facial features that strongly affect the latency, amplitude, and distribution of neural responses to faces.
One goal of the current study is to characterize holistic face processing and its associated ERP components during a composite face task. If behavioral performance is affected by spatial misalignment, then face-sensitive ERPs should also be affected. Furthermore, if alignment affects detection of “sameness” in composite faces more than the detection of difference, we expect that this effect will also be reflected in latency or amplitude differences in ERPs elicited during the composite task.
Another goal of this study is to determine whether holistic face processing varies with attention to the top versus the bottom of the face. Eye tracking data show that observers fixate longer and more frequently on the eye region of the face than other regions (Rizzo et al., 1987, Walker-Smith et al., 1977, Yarbus, 1967), and behavioral studies demonstrate increased reliance on the eyes when recognizing familiar faces (O’Donnell & Bruce, 2001) and poorer recognition performance when the eyes of a face are masked, relative to when the mouth is masked (McKelvie, 1976).
Studies of emotion recognition also report that the top half of the face provides different information than the bottom half and that this difference varies in importance depending upon the task (Calder et al., 2000, Prodan et al., 2001). On average across expressions, adults tend to rely on the bottom of the face to identify emotions, even if they are explicitly instructed otherwise (Prodan et al., 2001). However, while the mouth may be distinctive for some emotional expressions, such as happiness, the eye region of the face may be more important for recognition of other facial expressions, such as anger (Calder et al., 2000). Thus, while studies involving recognition of neutral faces note the salience of the top half of the face, this salience may be altered by the characteristics of the stimulus. Furthermore, the effects, if any, of directing attention to the bottom of the face as opposed to the top on fundamental mechanisms of face perception, such as holistic face perception, remain unknown. Therefore, the current study asks whether the apparent primacy of the eye region in the processing of neutral faces will result in different patterns of performance when the top versus the bottom half of the face is attended.
In light of these goals, we propose the following hypotheses.
First, the disruption of holistic processing caused by spatial misalignment should result in better performance (higher accuracy, faster reaction times) in a composite face task for misaligned faces than intact faces, and this alignment effect should be larger for “same” trials than “different” trials, as seen in previous studies (Le Grand et al., 2004, Young et al., 1987). Furthermore, the disruption of the configuration or “faced-ness” of the stimuli caused by misalignment should result in increased ERP amplitudes and slower ERP latencies, as compared to those elicited by intact faces. This should mimic the effects of face inversion (Eimer, 2000) and thatcherization (Milivojevic et al., 2003) observed in other studies. Finally, if early ERP components are sensitive to holistic processing, then neural evidence of holistic face processing should also be observed as greater effects of alignment during “same” trials. Such an effect would indicate that encoding of the face is sensitive not only to stimulus alignment, but also to immediate repetition of the attended half.
Second, people tend to look at and attend more to the top half of the face than to the bottom. This apparent primacy of the top half of the face should result in more efficient processing when attention is explicitly directed to the top of the face because in this instance the information to be suppressed is from the typically less-attended face region. By contrast, when attention is directed to the bottom, information from the more-attended face region must be suppressed. In other words, if attention is naturally drawn to the eyes of the face, then alignment effects in the composite task should be greatest when subjects attend to the bottom half because they will have difficulty ignoring the top half. Finally, because the N170 is sensitive to eyes, we may observe a reduction in its amplitude when attention is directed away from the eyes and toward the bottom of the face.
Finally, ERP components should show the typical RH asymmetry commonly observed in face processing tasks (i.e., higher amplitudes and faster latencies in the RH than the LH). Because aligned composites should elicit greater holistic processing than misaligned faces, and holistic processing is associated with a RH asymmetry, aligned faces should maintain the RH asymmetry. Because misaligned composites are designed to elicit more parts-based processing, and parts-based processing is associated with a LH asymmetry, these stimuli may elicit a reduced RH asymmetry.
Participants
Twenty-four adults (12 female) between the ages of 19 and 34 participated in the study. All subjects had normal or corrected-to-normal vision and reported no neurological or psychological diagnoses. Subjects provided informed consent under a protocol approved by the University of Massachusetts Medical School and were paid for their participation.
Stimuli
Stimuli were developed in the laboratory of Daphne Maurer at McMaster University (see Le Grand et al., 2004). Fifty-two face composites were
Prediction 1
The first hypothesis was that holistic processing would be disrupted in misaligned faces, and this would be most evident in “same” trials. As predicted, accuracy was higher (F(1, 23) = 26.028, p < 0.001) and RTs were faster (F(1, 23) = 26.239, p < 0.001) in response to misaligned than aligned faces. Both of these effects were larger for “same” trials than “different” trials. For accuracy, the alignment by trial type interaction (F(1, 23) = 5.493, p = 0.028), shown in the left panel of Fig. 4, indicated a
Behavioral effects of holistic processing
Composite face tasks tap holistic processing by examining the effects of spatial misalignment on the ability to attend selectively to specific portions of faces. In these tasks, holistic processing negatively impacts the ability to attend selectively to parts of composite faces when they are intact but not when they are misaligned, because in the former instance the features are bound into a gestalt. As predicted, participants in this study were faster and more accurate while making
References (69)
- et al. (2003). Early processing of the six basic facial emotional expressions. Cognitive Brain Research.
- et al. (1995). Event-related brain potentials differentiate priming and recognition to familiar and unfamiliar faces. Electroencephalography and Clinical Neurophysiology.
- et al. (2006). Comparing neural correlates of configural processing in faces and objects: an ERP study of the Thatcher illusion. Neuroimage.
- et al. (2003). Face versus non-face object perception and the 'other-race' effect: a spatio-temporal event-related potential study. Clinical Neurophysiology.
- et al. (2005). The Thatcher illusion seen by the brain: an event-related brain potentials study. Brain Research Cognitive Brain Research.
- et al. (2007). Holistic face processing is mature at 4 years of age: Evidence from the composite face effect. Journal of Experimental Child Psychology.
- (2000). Effects of face inversion on the structural encoding and recognition of faces: Evidence from event-related brain potentials. Cognitive Brain Research.
- et al. (2005). Electrophysiological correlates of facial decision: insights from upright and upside-down Mooney-face perception. Brain Research Cognitive Brain Research.
- et al. (2003). Event-related potentials (ERPs) to schematic faces in adults and children. International Journal of Psychophysiology.
- et al. (2006). Face, eye and object early processing: what is the face specificity? Neuroimage.