Neural portraits of perception: Reconstructing face images from evoked brain activity
Introduction
Neuroimaging methods such as fMRI have provided tremendous insight into how distinct brain regions contribute to processing different kinds of visual information (e.g., colors, orientations, shapes, or higher-level visual categories such as faces or scenes). These studies have supported inferences about the neural mechanisms or computations that underlie visual perception by documenting how various types of stimuli influence brain activity. However, knowledge about the relationship between visual input and corresponding neural activity can also be used for reverse inference: to predict or literally reconstruct a visual stimulus based on observed patterns of neural activity. That is, by understanding how an individual's brain represents visual information, it is possible to ‘see’ what someone else sees. While there are a relatively limited number of studies reporting neural reconstructions to date, the feats of reconstruction that have been achieved thus far are impressive. In addition to reconstruction of lower-order information such as binary contrast patterns (Miyawaki et al., 2008, Thirion et al., 2006) and colors (Brouwer and Heeger, 2009), there are also examples of successful reconstruction of handwritten characters (Schoenmakers et al., 2013), natural images (Naselaris et al., 2009), and even complex movie clips (Nishimoto et al., 2011).
However, even reconstructions of complex visual information have relied almost exclusively on exploiting information represented in early visual cortical regions (typically V1 and V2). Exceptions to this include evidence from Brouwer and Heeger (2009) that color can be reconstructed from responses in intermediate visual areas such as V4, and evidence from Naselaris et al. (2009) showing that reconstruction of natural images benefits from inclusion of higher-level visual areas (anterior occipital cortex) that are thought to represent semantic information about images. But reconstructions of visual stimuli based on patterns of activity outside occipital cortex have not, to our knowledge, been reported. The potential for reconstructions from higher-level regions (e.g., ventral temporal cortex or even fronto-parietal cortex) is enticing because reconstructions from these regions may be more closely related to perceptual experience as opposed to visual analysis (Smith et al., 2012).
Here, we attempted to reconstruct images of faces—a stimulus class that has not previously been reconstructed from neural activity. While face images—like other visual images—could, in theory, be reconstructed from patterns of activity in early visual cortex (i.e., via representations of contrast, orientation, etc.), we were also interested in the potential to reconstruct faces based on patterns of activity in higher-level regions. A number of face-selective (or face-preferring) regions have been identified outside of early visual cortex—for example, the occipital face area (Gauthier et al., 2000), fusiform face area (Kanwisher et al., 1997), and superior temporal sulcus (Puce et al., 1998) are all thought to contribute to aspects of face perception. Furthermore, other non-occipital regions have been implicated in the processing of relatively subjective face properties such as race (Hart et al., 2000) and emotional expression (Whalen et al., 1998). Thus, faces represent a class of visual stimuli that may be particularly suitable for ‘higher-level’ neural reconstructions. Moreover, a major computational advantage of using face stimuli is that there are previously established methods, based on principal component analysis (PCA), to dramatically reduce the dimensionality of face images such that an individual face can be accurately represented by a relatively small number of components. The representation of faces via a limited set of PCA components (or eigenfaces) has proved useful in domains such as face recognition (Turk and Pentland, 1991), but the application to neural reconstructions is novel.
In short, our approach to reconstructing face images from brain activity involved four basic steps (Fig. 1). First, PCA was applied to a large set of training faces to identify a set of components (eigenfaces) that efficiently represented the face images in a relatively low dimensional space (note: this step was based on the face images themselves and was entirely unrelated to neural activity). Second, a machine-learning algorithm (partial least squares regression, or PLSR) was used to map patterns of fMRI activity (recorded as participants viewed faces) to individual eigenfaces (i.e., the PCA components representing the face images). Third, based on patterns of neural activity elicited by a distinct set of faces (test faces), the PLSR algorithm predicted the associated eigenface component scores. Fourth, an inverse transformation was applied to the component scores that were predicted for each test face to generate a reconstruction of that face. To empirically validate the success of neural reconstructions, and to compare reconstructions across distinct brain regions, we assessed whether reconstructed faces could be identified as corresponding to the original (target) image. Identification accuracy was assessed via objective, computer-based measures of image similarity and via subjective, human-based reports of similarity.
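To make the four steps concrete, the sketch below outlines the pipeline in Python with scikit-learn. It is a minimal illustration rather than the authors' implementation: the array names, the PLSR component count, and the correlation-based identification criterion are all assumptions introduced here for clarity.

```python
# Minimal sketch of the PCA + PLSR reconstruction pipeline described above.
# All variable names and hyperparameters are illustrative assumptions, not the
# authors' code.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression

# train_faces: (n_train, 50820) flattened RGB face images used for training
# train_bold:  (n_train, n_voxels) fMRI patterns evoked by the training faces
# test_bold:   (n_test, n_voxels) fMRI patterns evoked by held-out test faces
# test_faces:  (n_test, 50820) original test images (used only for identification)

def reconstruct_faces(train_faces, train_bold, test_bold, n_eigenfaces=299):
    # Step 1: derive the eigenface space from the training images alone
    # (this step involves no neural data).
    pca = PCA(n_components=n_eigenfaces)
    train_scores = pca.fit_transform(train_faces)

    # Step 2: learn a mapping from voxel patterns to eigenface component scores.
    pls = PLSRegression(n_components=20)   # assumed hyperparameter
    pls.fit(train_bold, train_scores)

    # Step 3: predict component scores from activity evoked by the test faces.
    predicted_scores = pls.predict(test_bold)

    # Step 4: invert the PCA transform to obtain pixel-space reconstructions.
    return pca.inverse_transform(predicted_scores)

def identification_accuracy(reconstructions, test_faces):
    # One simple objective criterion: a reconstruction counts as identified if
    # it correlates more highly with its own target image than with any other
    # test image.
    hits = 0
    for i, recon in enumerate(reconstructions):
        r = [np.corrcoef(recon, face)[0, 1] for face in test_faces]
        hits += int(np.argmax(r) == i)
    return hits / len(reconstructions)
```

In this sketch the neural data enter only at the PLSR stage, mirroring the separation emphasized above between the image-based eigenface decomposition and the brain-based prediction of component scores.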
Section snippets
Participants
Six participants (2 females) between the ages of 18 and 35 (mean age = 21.7) were recruited from the Yale University community. Informed consent was obtained in accordance with the Yale University Institutional Review Board. Participants received payment in exchange for their participation.
Materials
A total of 330 face images were used in the study. Face images were obtained from a variety of online sources [e.g., www.google.com/images, vis-www.cs.umass.edu/lfw (Huang et al., 2007)] and were selected such that faces were generally forward facing with eyes and mouth visible in each image. The faces varied in terms of race, gender, expression, hair, etc. For all images, the locations of the left eye, right eye, and mouth were first manually labeled (in x/y coordinates). Each image was then cropped
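The snippet above is cut off before the cropping procedure is fully described. As a hedged illustration of one way the three labeled landmarks could be used to register every face to a common 110 × 154 pixel template (the image size reported in the Eigenfaces section), consider the following sketch; the canonical landmark positions are assumptions, not values from the paper.

```python
# Hypothetical landmark-based alignment; the canonical (x, y) targets below are
# illustrative assumptions, chosen only so eyes and mouth land in consistent
# positions within a 110 x 154 crop.
import cv2
import numpy as np

CANONICAL = np.float32([[35, 55], [75, 55], [55, 115]])  # left eye, right eye, mouth
OUT_SIZE = (110, 154)                                     # (width, height) of the crop

def align_face(image, left_eye, right_eye, mouth):
    """Warp an RGB image so its labeled landmarks map onto the canonical positions."""
    src = np.float32([left_eye, right_eye, mouth])
    M = cv2.getAffineTransform(src, CANONICAL)
    return cv2.warpAffine(image, M, OUT_SIZE)
```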
Eigenfaces
Each face image was represented by a single vector of 50,820 values (110 pixels in the x direction × 154 pixels in the y direction × 3 color channels). Principal component analysis (PCA) was performed on the set of 300 training faces (i.e., excluding the test faces), resulting in 299 component “eigenfaces” (Turk and Pentland, 1991). When rank ordered according to explained variance, the first 10 eigenfaces captured 71.6% of the variance in pixel information across the training face images.
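The explained-variance figure quoted above can be checked directly from the PCA fit, as in the brief sketch below; the function name and the `train_faces` matrix (300 × 50,820 flattened training images) are illustrative assumptions consistent with the description above.

```python
# Cumulative variance explained by the first k eigenfaces (illustrative sketch).
import numpy as np
from sklearn.decomposition import PCA

def explained_by_first_k(train_faces, k=10, n_components=299):
    """Fraction of pixel variance captured by the first k eigenfaces."""
    pca = PCA(n_components=n_components).fit(train_faces)
    # scikit-learn returns components already rank ordered by explained variance.
    return np.cumsum(pca.explained_variance_ratio_)[k - 1]

# explained_by_first_k(train_faces) should be close to the 71.6% reported above.
```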
To validate the
Discussion
Here, we used a machine learning algorithm to map distributed patterns of neural activity to higher-order statistical patterns contained within face images. We then used these mappings to reconstruct, from evoked patterns of neural activity, face images viewed by human participants. Our results provide a striking confirmation that face images can be reconstructed from brain activity both within and outside of occipital cortex. The fidelity of the reconstructions was validated both by an
Acknowledgments
This work was supported by National Institutes of Health grants to M.M.C. (R01 EY014193) and B.A.K. (EY019624-02), by the Yale FAS MRI Program funded by the Office of the Provost and the Department of Psychology, and by a Psi Chi Summer Research Grant to A.S.C. We thank Avi Chanales for assistance in preparing the manuscript.
References (40)
- et al. (2009). The correlates of subjective perception of identity and expression in the face network: an fMRI adaptation study. NeuroImage.
- et al. (2013). The neural representation of face space dimensions. Neuropsychologia.
- et al. (2007). Neural systems for recognition of familiar faces. Neuropsychologia.
- et al. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage.
- et al. (2012). A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron.
- et al. (2008). Position-specific and position-invariant face aftereffects reflect the adaptation of different cortical areas. NeuroImage.
- et al. (2011). Partial least squares (PLS) methods for neuroimaging: a tutorial and review. NeuroImage.
- et al. (1996). Spatial pattern analysis of functional brain images using partial least squares. NeuroImage.
- Miyawaki et al. (2008). Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron.
- Naselaris et al. (2009). Bayesian reconstruction of natural images from human brain activity. Neuron.
- Nishimoto et al. (2011). Reconstructing visual experiences from brain activity evoked by natural movies. Curr. Biol.
- Schoenmakers et al. (2013). Linear reconstruction of perceived images from human brain activity. NeuroImage.
- Smith et al. (2012). Measuring internal representations from behavioral and brain data. Curr. Biol.
- Natural scene statistics account for the representation of scene categories in human visual cortex. Neuron.
- Imagery for shapes activates position-invariant representations in human visual cortex. NeuroImage.
- Thirion et al. (2006). Inverse retinotopy: inferring the visual content of images from brain activation patterns. NeuroImage.
- Representations of individuals in ventral temporal cortex defined by faces and biographies. Neuropsychologia.
- Decoding representations of face identity that are tolerant to rotation. Cereb. Cortex.
- Hierarchical processing of face viewpoint in human visual cortex. J. Neurosci.
- Implicit race bias decreases the similarity of neural representations of black and white faces. Psychol. Sci.