NeuroImage

Volume 51, Issue 1, 15 May 2010, Pages 373-390
Extracting the internal representation of faces from human brain activity: An analogue to reverse correlation

https://doi.org/10.1016/j.neuroimage.2010.02.021

Abstract

Much of the debate surrounding the precise functional role of brain mechanisms implicated in the processing of human faces can be explained when considering that studies into early-stage neural representations of the spatial arrangement of facial features are potentially contaminated by “higher-level” cognitive attributes associated with human faces. One way to bypass such attributes would be to employ ambiguous stimuli that are not biased toward any particular object class and analyze neural activity in response to those stimuli in a manner similar to traditional reverse correlation for mapping visual receptive fields. Accordingly, we sought to derive whole face representations directly from neural activity in the human brain using electroencephalography (EEG). We presented ambiguous fractal noise stimuli to human participants and asked them to rate each stimulus along a “face not present” to “face present” continuum while simultaneously recording EEGs. All EEGs were subjected to a time–frequency analysis near 170 ms (negative amplitudes near 170 ms post-stimulus onset have been linked to early face processing) for five different frequency bands (delta, theta, alpha, beta, and gamma) on a trial-by-trial basis, independent of the behavioral responses. Images containing apparent face-like structure were obtained for theta through gamma frequency bands for strong negative amplitudes near 170 ms post-stimulus onset. The presence of the face-like structure in the spatial images derived from brain signals was objectively verified using both Fourier methods and trained neural networks. The results support the use of a modified reverse correlation technique with EEG as a non-biased assessment of brain processes involved in the complex integration of spatial information into objects such as human faces.

Introduction

The human perceptual system possesses an incredible ability to rapidly sample and integrate basic features from the natural environment into composite internal representations. From these representations, we are able to discern the subtlest of details amongst similar visual forms such as texture patterns or complex object forms such as human faces. It has long been speculated that our ability to make fine-tuned perceptual judgments regarding facial identity and expression involves mechanisms in the brain that are specifically devoted to processing the subtle idiosyncrasies of human faces. The evidence in favor of specialized brain signals for human face processing and representation has grown exponentially over the last two decades. Investigative techniques have ranged from single unit recordings from cells in the primate inferior temporal cortex (IT) (Perrett et al., 1992, Desimone et al., 1984, Baylis et al., 1987, Yamane et al., 1988, Gross, 1992, Heywood and Cowey, 1992, Young and Yamane, 1992) to electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) in humans (Jeffreys, 1989, Jeffreys et al., 1992, Allison et al., 1994, Bentin et al., 1996, Eimer, 2000, Kanwisher et al., 1997, McCarthy et al., 1997, Kanwisher et al., 1999). And while the evidence from single unit recordings has not fully supported the notion of a high degree of face selectivity in temporal cortex (i.e., many cells were found to respond to a wide variety of object classes, with others identified as being specifically selective for faces), EEG and fMRI studies have been argued to strongly support such a view as their signals arise from populations of neurons (either directly via EEG or indirectly via the blood-oxygen level dependent signal of fMRI).

Regarding population-based signals, electroencephalography studies have repeatedly shown two brain signals (event related potentials or ERPs) that are linked to the processing of human faces as opposed to other object classes. Such components include 1) the vertex positive potential, or VPP (i.e., a positive deflection from the ERP baseline peaking between 150 and 200 ms post-stimulus onset) observed along the central portion of the scalp (Bötzel and Grusser, 1989, Jeffreys, 1989, Jeffreys, 1996), and 2) a negative deflection from ERP baseline occurring at about the same latency and observed bilaterally over the occipito-temporal junction of the scalp, termed the N170 component (Bötzel et al., 1995, Bentin et al., 1996, George et al., 1996; Eimer, 2000). It is worth noting that the VPP and N170 have recently been argued to reflect the same brain processes, with the relative differences in ERP magnitude between different studies argued to be dependent on differences in reference electrode methodology (Joyce and Rossion, 2005). Studies that have employed fMRI to study the mechanisms behind face processing have also supported the face selectivity notion by reporting face sensitive activity in the bilateral fusiform gyrus (also known as the fusiform face area, or FFA) located along the ventral temporal cortices (for other face selective areas measured with fMRI see Grill-Spector et al., 2004, Ishai et al., 2005). As with the single unit recording evidence for face selectivity in the temporal cortex, it is also worth noting that while the N170 and FFA brain signals yield preferential responses to faces, they are known to be modulated by other object classes (e.g., Eimer, 2000, Joseph and Gathers, 2002, Grill-Spector et al., 2004, Guillaume et al., 2009).

However, despite the volumes of evidence in favor of “selective” signals in the brain for face representational processing, there is substantial debate regarding the validity of such signals. For example, within both the EEG and fMRI literature, compelling evidence has been reported that questions the role of the N170 as a brain signal, or the FFA as a brain area, specifically devoted to face processing, arguing instead that these signals reflect the processing of any complex stimulus associated with holistic processes acquired through expertise or involving subordinate-level processing (Gauthier et al., 1997, Gauthier et al., 1999, Gauthier et al., 2000, Rossion et al., 2002, Rossion et al., 2007, Xu, 2005), with subsequent studies disputing such claims (Grill-Spector et al., 2004, Grill-Spector et al., 2007). In addition to the debate surrounding the “selectivity” of face-related brain signals, there exist several disputes regarding the time-course of face processing. That is, it has been argued that “face selectivity” signals (assessed via ERPs) can be observed as early as 100 ms (e.g., the P1, a positive potential occurring at 100 ms over occipital electrodes) (e.g., Eimer, 1998, Eimer, 2000, Itier and Taylor, 2004). However, other studies have not shown the P1 to be modulated by the strength of the face percept when the luminance contrast of face stimuli is varied against a white noise background (Jemel et al., 2003, Tanskanen et al., 2005). It has also been argued that the P1 may be signaling differences related to the low-level features of faces and objects rather than face representations per se (Rossion and Jacques, 2008). Yet another issue concerns the depth of the face representation as assessed through the N170. One view is that the representation consists of an initial global (i.e., holistic) representation (e.g., Sergent, 1984, Young et al., 1987, Goffaux and Rossion, 2006, Jacques and Rossion, 2009), with an opposing view that the initial representation is built up through an iterative part-based integration (e.g., Harris and Aguirre, 2008, Smith et al., 2007).

Regarding face selectivity, one primary cause for the disputes described above is that the classification of the selectivity of brain signals has always been carried out with visual stimuli (i.e., faces or other object categories) that possess many higher-level cognitive attributes to which the brain is sensitive instead of, or in addition to, the spatial arrangement of features contained therein. Thus, in order to address whether a particular brain signal is actually sensitive to the spatial properties of complex stimuli such as faces, it would be beneficial to employ ambiguous stimuli that are not biased toward any particular object class and examine the spatial properties of the ambiguous stimuli that elicit significant brain signals. One promising technique that would allow for such an approach is a behavioral analogue to reverse correlation, referred to as classification image analysis (Ahumada, 1996, Beard and Ahumada, 1998, Ahumada, 2002). In that vein, several neurophysiological studies have begun to utilize reverse correlation techniques using actual faces (or parts of faces) in hopes of better understanding the nature of different face selective brain signals (Smith et al., 2004, Smith et al., 2006, Smith et al., 2007). However, such studies cannot bypass the high-level cognitive attributes which accompany actual faces (or parts of faces) as stimuli. With respect to employing ambiguous stimuli in a reverse correlation paradigm behaviorally, Gosselin and Schyns (2003) have shown that reverse correlation based on behavioral responses (i.e., classification image analysis) to white noise stimuli can render in external space the internal representation of faces held by human observers. More importantly, such external renditions of the internal face representation can be obtained in tasks where no face signal is ever presented (Gosselin and Schyns, 2003). The paradigm itself involved presenting human observers with different white noise patterns (20,000 trials in total). Observers were instructed to discriminate between a neutral and a smiling face in the noise. No faces were ever presented in the noise, but observers were told that 50% of the trials would contain faces. After summing all noise images to which observers responded “yes” and subtracting the sum of all noise images to which observers responded “no”, an apparent image of a smiling face was obtained when filtered at the fundamental spatial frequency (referred to as a classification image, or CI). Such an approach (based on behavior) has high potential for explicitly rendering the internal representation of face-like structure in an external image in a completely non-biased way. That is, images containing face-like structure can be obtained from ambiguous stimuli that could be interpreted as containing any number of objects from different categories or object classes. It is worth noting that while this procedure is similar to classification image analysis as defined by Ahumada (2002), it deviates from that definition in that no signal (i.e., a face stimulus) is ever presented. From this point forward, “classification image analysis” refers to the procedure of Gosselin and Schyns (2003).
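The summation procedure just described is straightforward to express in code. The following sketch is illustrative only (the function name, the array layout, and the display normalization are our own conventions, not taken from the paper):

```python
import numpy as np

def classification_image(noise_images, responses):
    """Behavioral classification image: the sum of all noise patterns that
    elicited a "yes" (face present) response minus the sum of all patterns
    that elicited a "no" response.

    noise_images: array of shape (n_trials, height, width)
    responses:    boolean array of shape (n_trials,), True = "face present"
    """
    imgs = np.asarray(noise_images, dtype=float)
    resp = np.asarray(responses, dtype=bool)
    ci = imgs[resp].sum(axis=0) - imgs[~resp].sum(axis=0)
    # Rescale to [0, 1] for display as an image
    span = ci.max() - ci.min()
    if span == 0:
        return np.zeros_like(ci)
    return (ci - ci.min()) / span
```

In practice the raw difference image would additionally be filtered at the fundamental spatial frequency of the face before the face-like structure becomes apparent, as in Gosselin and Schyns (2003).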

Motivated by previous studies employing fMRI (Zhang et al., 2008) and ERPs (Wild and Busey, 2004) which showed modulation of brain signals associated with face processing when participants reported the percept of faces in white noise (when no faces were actually present), we sought to bypass the aforementioned behavioral measures and derive whole face representations directly from neural activity in the human brain using EEG. That is, instead of conducting a classification image analysis using behavioral responses, we sought to derive CIs based on ambiguous noise (i.e., fractal noise) images that “triggered” strong negative EEG amplitudes on a trial-by-trial basis within the 150 to 200 ms range, corresponding to the range in which the N170 is typically observed in ERPs. This approach therefore constitutes a novel type of reverse correlation between fractal noise and the targeted EEG amplitudes, similar to preliminary reports of such an approach using white noise (Goffaux et al., 2003, Smith et al., 2009). However, the number of trials originally reported (Gosselin and Schyns, 2003) to render the internal representation of human faces externally was substantial (i.e., 20,000 trials), which does not lend itself to an EEG paradigm. We thus first sought to reduce the number of trials (Experiment 1) by presenting participants with fractal noise images (with an amplitude spectrum similar to that of human faces, but with no face signal embedded) along with an anchored behavioral response continuum, and recorded reaction times (RTs). Experiment 2 was identical to Experiment 1, except that we simultaneously recorded EEGs during the experiment. All EEGs were subjected to a time–frequency analysis near 170 ms for five different frequency bands (delta, theta, alpha, beta, and gamma) on a trial-by-trial basis, independent of the behavioral responses. Images containing apparent face-like structure were obtained for theta through gamma frequency bands when strong negative amplitudes near 170 ms post-stimulus onset were present.
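The EEG analogue replaces the behavioral "yes"/"no" response with a single-trial amplitude measure. A minimal sketch, assuming single-trial amplitudes near 170 ms have already been extracted for one frequency band; the percentile-based selection rule and all names here are hypothetical illustrations, not the paper's exact procedure:

```python
import numpy as np

def eeg_classification_image(noise_images, amplitudes, percentile=25):
    """EEG-based reverse correlation (illustrative sketch).

    Selects the trials whose single-trial amplitude near 170 ms
    post-stimulus onset is most strongly negative, then averages the
    corresponding noise stimuli.

    noise_images: (n_trials, height, width) stimulus array
    amplitudes:   (n_trials,) single-trial amplitudes near 170 ms
    percentile:   keep trials at or below this percentile (most negative)
    """
    amps = np.asarray(amplitudes, dtype=float)
    cutoff = np.percentile(amps, percentile)
    strong = amps <= cutoff  # trials with strong negative amplitudes
    return np.asarray(noise_images, dtype=float)[strong].mean(axis=0)
```

Because the minimum amplitude always falls at or below the percentile cutoff, at least one trial is always selected.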

Behavioral classification images are inherently noisy, and we anticipated that any EEG-derived CIs would likely contain an even higher level of noise due to the nature of the EEG signal. We therefore employed a rigorous procedure to objectively verify the presence of face-like structure in the CIs for both experiments. This procedure addressed four primary concerns regarding the strength and presence of systematic face-like structure in the CIs. First, we conducted a significance test (Chauvin et al., 2005) which verified that the structure in the CIs was significantly different from noise. Second, using Fourier methods (as in the study of Gosselin and Schyns, 2003), we showed that the structure contained within all CIs was systematic and very different from any chance structure present within randomly generated CIs (i.e., we demonstrated that the structure in the CIs was not random structure). Third, using a standard supervised neural network, we showed that the structure in the CIs was highly similar to actual faces as opposed to random structure. Finally, using a second supervised neural network, we showed that the structure in the CIs was more similar to human faces than to common non-face-like objects.
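As a simplified illustration of the second concern (distinguishing systematic from random structure), one can compare a CI's low-spatial-frequency Fourier energy against a null distribution of CIs built from shuffled responses. This permutation sketch is our own simplification, not the exact procedure of Chauvin et al. (2005) or Gosselin and Schyns (2003):

```python
import numpy as np

def low_sf_energy(img, radius=8):
    """Summed Fourier amplitude within `radius` cycles/image of DC,
    i.e., the energy carried by low spatial frequencies."""
    f = np.fft.fftshift(np.fft.fft2(img - img.mean()))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    return np.abs(f)[dist <= radius].sum()

def permutation_test(noise_images, responses, n_perm=200, seed=0):
    """Compare the observed CI's low-SF energy against CIs computed from
    randomly shuffled responses; returns an approximate p-value
    (small values suggest non-random structure)."""
    rng = np.random.default_rng(seed)
    imgs = np.asarray(noise_images, dtype=float)
    resp = np.asarray(responses, dtype=bool)

    def ci(r):
        return imgs[r].sum(axis=0) - imgs[~r].sum(axis=0)

    observed = low_sf_energy(ci(resp))
    null = [low_sf_energy(ci(rng.permutation(resp))) for _ in range(n_perm)]
    return (1 + sum(n >= observed for n in null)) / (n_perm + 1)
```

Shuffling preserves the number of "yes" and "no" trials, so the null CIs differ from the observed one only in which stimuli are assigned to each response.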

Our objective verification procedure identified many of the classification images as containing face-like structure, thereby validating that the images derived from brain signals very much resembled human faces. The results of the current study show that reverse correlation can be carried out with EEG as a way of reverse-engineering the neural mechanisms involved in face perception in order to extract their internal spatial representation and render it externally, and therefore holds the potential to successfully address many of the debates surrounding human face perception.

Section snippets

Experiment 1

In Experiment 1, we sought to reduce the number of trials required to yield convincing classification images of faces with white noise patterns (i.e., Gosselin and Schyns, 2003). To this end, we chose to use stimuli constructed from fractal noise as this would likely yield a higher probability of the participants perceiving a face in the noise (refer to Fig. 1). It is important to note that a deliberate face signal was never present in the noise (i.e., we did not embed any face stimuli in any
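Fractal noise of the kind described, with an amplitude spectrum that falls off with spatial frequency roughly as 1/f (as natural images and faces do), can be generated by imposing a 1/f amplitude envelope on random phases. A minimal sketch; the image size and spectral exponent here are illustrative defaults, not the parameters used in the experiment:

```python
import numpy as np

def fractal_noise(size=128, exponent=1.0, seed=None):
    """Generate a fractal (1/f^exponent) noise image by combining a
    1/f amplitude envelope with uniformly random phases, then inverse
    Fourier transforming. Returns an image rescaled to [0, 1]."""
    rng = np.random.default_rng(seed)
    phases = np.exp(2j * np.pi * rng.random((size, size)))
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    freq = np.hypot(fy, fx)
    freq[0, 0] = 1.0  # avoid division by zero at the DC component
    amplitude = 1.0 / freq ** exponent
    img = np.real(np.fft.ifft2(amplitude * phases))
    return (img - img.min()) / (img.max() - img.min())
```

Lower exponents yield noisier, whiter images; higher exponents concentrate energy at low spatial frequencies, producing the cloud-like structure in which observers readily perceive faces.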

Experiment 2

The primary interest of the current study was centered on a non-biased approach to investigating the true internal representation carried by a given neural signal associated with the processing of human faces. Specifically, if a given neural signal is explicitly associated with the integration of external spatial features into a coherent representation of a face (and not some higher cognitive attribute of faces), then one should be able to reverse engineer the mechanism via ambiguous stimuli in

General discussion

The primary aim of the current study was to derive CIs based on ambiguous noise images that “triggered” strong negative EEG amplitudes on a trial-by-trial basis within the 150 to 200 ms range corresponding to the range within which the N170 is typically observed in ERPs. Before we could achieve this goal, we first had to verify and validate a modified version of the classification image analysis paradigm since the original approach did not lend itself to an EEG paradigm due to the large number

Acknowledgments

This work was supported by NSERC and CFI grants to D.E. and CIHR grants MT108-18 and MOP 53346 to R.F.H. We are grateful to Philippe G. Schyns and the other two anonymous reviewers for their suggestions concerning this study. We are also grateful to Landon D. Reid for his assistance in gathering the behavioral data in Experiment 1 and Richard F. Braaten for suggesting the use of reaction time in Experiment 1.

References (81)

  • Ringach, D., et al. Reverse correlation in neurophysiology. Cogn. Sci. (2004)
  • Rossion, B., et al. Does physical interstimulus variance account for early electrophysiological face sensitive responses in the human brain? Ten lessons on the N170. NeuroImage (2008)
  • Rousselet, G.A., et al. Single-trial EEG dynamics of object and face visual processing. NeuroImage (2007)
  • Smith, M.L., et al. From a face to its category via a few information processing states in the brain. NeuroImage (2007)
  • Troje, N., et al. Face recognition under varying poses: the role of texture and shape. Vision Res. (1996)
  • Tucker, D.M. Spatial sampling of head electrical fields: the Geodesic Sensor Net. Electroencephalogr. Clin. Neurophysiol. (1993)
  • Ward, L.M. Synchronous neural oscillations and cognitive processes. Trends Cogn. Sci. (2003)
  • Ahumada, A.J. Perceptual classification images from vernier acuity masked by noise [Abstract]. Perception (1996)
  • Ahumada, A.J. Classification image weights and internal noise level estimation. J. Vis. (2002)
  • Allison, T., et al. Face recognition in human extrastriate cortex. J. Neurophysiol. (1994)
  • Battiti, R. First- and second-order methods for learning: between steepest descent and Newton's method. Neural Comput. (1992)
  • Baylis, G.C., et al. Functional subdivisions of the temporal lobe neocortex. J. Neurosci. (1987)
  • Beard, B.L., et al. A technique to extract relevant image features for visual tasks.
  • Bentin, S., et al. Electrophysiological studies of face perception in humans. J. Cogn. Neurosci. (1996)
  • Bötzel, K., et al. Electric brain potentials evoked by pictures of faces and non-faces: a search for ‘face-specific’ EEG-potentials. Exp. Brain Res. (1989)
  • Bötzel, K., et al. Scalp topography and analysis of intracranial sources of face-evoked potentials. Exp. Brain Res. (1995)
  • Chauvin, A., et al. Accurate statistical tests for smooth classification images. J. Vis. (2005)
  • Desimone, R., et al. Stimulus-selective properties of inferior temporal neurons in the macaque. J. Neurosci. (1984)
  • Eimer, M. Does the face-specific N170 component reflect the activity of a specialized eye processor? Neuroreport (1998)
  • Eimer, M. The face-specific N170 component reflects late stages in the structural encoding of faces. Neuroreport (2000)
  • Gauthier, I., et al. Activation of the middle fusiform ‘face area’ increases with expertise in recognizing novel objects. Nat. Neurosci. (1999)
  • Gauthier, I., et al. Expertise for cars and birds recruits brain areas involved in face recognition. Nat. Neurosci. (2000)
  • Goffaux, V., et al. Faces are “spatial”: holistic face perception is supported by low spatial frequencies. J. Exp. Psychol.: Hum. Percept. Perform. (2006)
  • Goffaux, V., et al. Superstitious perceptions of a face revealed by non phase-locked gamma oscillations in the human brain [Abstract]. J. Vis. (2003)
  • Gosselin, F., et al. Superstitious perceptions reveal properties of internal representations. Psychol. Sci. (2003)
  • Grill-Spector, K., et al. The fusiform face area subserves face perception, not generic within-category identification. Nat. Neurosci. (2004)
  • Grill-Spector, K., et al. Corrigendum: high-resolution imaging reveals highly selective nonface clusters in the fusiform face area. Nat. Neurosci. (2007)
  • Gross, C.G. Representation of visual stimuli in inferior temporal cortex. Philos. Trans. R. Soc. Lond. B Biol. Sci. (1992)
  • Haig, N.D. How faces differ: a new comparative technique. Perception (1985)
  • Haig, N.D. Investigating face recognition with an image processing computer.
