Neurons in the temporal lobe of both monkeys and humans show selective responses to classes of visual stimuli and even to specific individuals. In this study, we investigate the latency and selectivity of visually responsive neurons recorded from microelectrodes in the parahippocampal cortex, entorhinal cortex, hippocampus, and amygdala of human subjects during a visual object presentation task. During 96 experimental sessions in 35 subjects, we recorded from a total of 3278 neurons. Of these units, 398 responded selectively to one or more of the presented stimuli. Mean response latencies were substantially larger than those reported in monkeys. We observed a highly significant correlation between the latency and the selectivity of these neurons: the longer the latency the greater the selectivity. Particularly, parahippocampal neurons were found to respond significantly earlier and less selectively than those in the other three regions. Regional analysis showed significant correlations between latency and selectivity within the parahippocampal cortex, entorhinal cortex, and hippocampus, but not within the amygdala. The later and more selective responses tended to be generated by cells with sparse baseline firing rates and vice versa. Our results provide direct evidence for hierarchical processing of sensory information at the interface between the visual pathway and the limbic system, by which increasingly refined and specific representations of stimulus identity are generated over time along the anatomic pathways of the medial temporal lobe.
The inferior and medial regions of the temporal lobe in human and nonhuman primates comprise the distal stages of the ventral visual pathway and parts of the limbic system, responsible for encoding and retrieval of mnemonic information. The differential contribution of these regions to the processing and elaboration of information at the interface between perception and memory remains an open question (Squire et al., 2004). Recordings of single-neuron activity in monkey visual temporal cortex led to the discovery of neurons that respond selectively to certain categories of stimuli such as faces or objects (cf. Logothetis and Sheinberg, 1996; Tanaka, 1996; Freedman and Miller, 2008). Recordings of single-cell activity in the human medial temporal lobe (MTL) have revealed similar category neurons (Fried et al., 1997; Kreiman et al., 2000) and even neurons that show selective and invariant responses to different pictures of an individual, including their written name (Quiroga et al., 2005). Neuroanatomical studies in monkeys have identified direct connections between different regions of the inferior and medial temporal lobe (Suzuki, 1996). Whereas visual response latencies to different types of stimuli have been reported for different temporal areas in monkeys (Table 1), few studies provide a direct regional comparison (Leonard et al., 1985; Liu and Richmond, 2000; Naya et al., 2001, 2003), and latency data from humans have not been reported to date. Here, we systematically investigate the latency and selectivity of visually responsive neurons and report evidence for hierarchical processing of visual stimuli in the human MTL.
Materials and Methods
All studies conformed to the guidelines of the Medical Institutional Review Board at the University of California, Los Angeles. Electrode locations were based exclusively on clinical criteria and were verified by magnetic resonance imaging (MRI) or by computer tomography coregistered to preoperative MRI. Each electrode probe had nine microwires protruding from its tip, eight high-impedance recording channels (typically 200–400 kΩ), and one low-impedance reference with stripped insulation. The differential signal from the microwires was amplified using a 64-channel Neuralynx system, filtered between 1 and 9000 Hz, and sampled at 28 kHz. Spike detection and sorting was performed after bandpass filtering the signals between 300 and 3000 Hz (Quiroga et al., 2004).
Each recording session lasted ∼30 min. Subjects were sitting in bed, facing a laptop computer on which pictures of famous individuals, landmarks, animals, or objects were shown. A median number of 97 (range, 60–202) different images were shown per session, centered on a laptop screen and covering ∼1.5°, and displayed six times each for 1 s in pseudorandom order (Quiroga et al., 2005). After image offset, subjects had to indicate whether the picture contained a human face or something else by pressing the “Y” and “N” keys, respectively. This simple task, on which performance was virtually flawless, required them to attend to the pictures. Every stimulus presentation was preceded by a fixation cross for 500 ms to assess baseline firing activity. In a slightly different variant of the paradigm (23 of 96 sessions), images were presented for 500 ms (20 sessions) or 750 ms (3 sessions), and the attention task was omitted. Absence of a significant influence of the presentation time on the observed response latencies was confirmed post hoc by nonparametric one-way ANOVA (Kruskal–Wallis; p = 0.18).
To determine whether a unit responded selectively to one or more of the stimuli presented, we divided the 1000 ms after stimulus onset into 19 overlapping 100 ms bins. For each bin, we compared the spike rates for the six presentations of each stimulus to the baseline intervals of 500 ms before all of the stimulus onsets in a session (∼100 × 6) by means of a two-tailed Mann–Whitney U test. We used the Simes procedure (Rodland, 2006) to correct for multiple comparisons and applied a conservative significance threshold of p = 0.001 to reduce false-positive detections. Only responsive units were included in the subsequent latency and selectivity analyses.
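The binning and multiple-comparison step described above can be illustrated with a minimal Python sketch. It is not the code used in the study and rests on several assumptions: a 50 ms step between bins (which yields the 19 bins mentioned), application of the Simes procedure to the per-bin p-values of a single stimulus, and p-values that are already available from the Mann–Whitney U tests. All function names are hypothetical.

```python
def overlapping_bins(window_ms=1000, width_ms=100, step_ms=50):
    """Return (start, end) pairs of overlapping bins covering the window.

    The 50 ms step is an assumption; it reproduces the 19 bins in the text.
    """
    last_start = window_ms - width_ms
    return [(s, s + width_ms) for s in range(0, last_start + 1, step_ms)]


def simes_p(p_values):
    """Simes combined p-value: min over i of n * p_(i) / i (sorted p-values)."""
    n = len(p_values)
    ranked = enumerate(sorted(p_values), start=1)
    return min(n * p / rank for rank, p in ranked)


def is_responsive(bin_p_values, alpha=0.001):
    """Assumed criterion: the Simes-combined p-value across bins is <= alpha."""
    return simes_p(bin_p_values) <= alpha


# Illustrative p-values only (not data from the study).
bins = overlapping_bins()
weak = [0.5] * len(bins)
strong = [0.00001] + [0.5] * (len(bins) - 1)
print(len(bins), is_responsive(weak), is_responsive(strong))
```

Under these assumptions, a unit would count as responsive if `is_responsive` returns true for at least one stimulus.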
Onset latencies for responsive units were determined by Poisson spike train analysis (Hanes et al., 1995). For this procedure, the interspike intervals (ISIs) of a given unit are processed continuously over the entire recording session, and the onset of a spike train is detected based on its deviation from a baseline Poisson, i.e., exponential, distribution of ISIs (regardless of the experimental paradigm). For each response-eliciting stimulus, we determined the time between stimulus onset and the onset of the first spike train in all six presentations. Only spike train onsets within the first 1000 ms after stimulus onset were considered. The median of these six time intervals was taken as the response latency. For sparsely firing units with mean baseline firing activity of <2 Hz, Poisson spike train analysis generally failed to pick up any onset spike, so we used the median latency of the first spike during stimulus presentation instead. To minimize spurious latency values, we excluded responses for which the onsets of the three trials closest to the calculated response latency were >200 ms apart. For a neuron responding to more than one stimulus, the median of the different stimulus latencies was taken.
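The median-based latency and the consistency criterion can be sketched as follows. This is an illustrative reading rather than the original implementation. It covers only the first-spike variant used for sparsely firing units (the Poisson spike train analysis itself is not reproduced). It assumes that "the three trials closest" refers to the three onsets nearest the median, that ">200 ms apart" refers to their total spread, and that a response with fewer than three onsets is discarded. Function names are hypothetical.

```python
from statistics import median


def first_spike_latency(spike_times_ms, window_ms=1000):
    """Time of the first spike within the window after stimulus onset (t = 0).

    Returns None if the trial contains no spike in the window.
    """
    in_window = [t for t in spike_times_ms if 0 <= t < window_ms]
    return min(in_window) if in_window else None


def response_latency(trial_onsets_ms, max_spread_ms=200):
    """Median onset across trials, or None if the response is inconsistent.

    Assumption: the response is rejected when the three onsets closest to
    the median span more than max_spread_ms.
    """
    onsets = [t for t in trial_onsets_ms if t is not None]
    if len(onsets) < 3:  # assumed minimum; not stated in the text
        return None
    latency = median(onsets)
    closest = sorted(onsets, key=lambda t: abs(t - latency))[:3]
    if max(closest) - min(closest) > max_spread_ms:
        return None
    return latency


# Illustrative spike times in ms relative to stimulus onset (not study data).
trials = [[-120, 310, 400], [295], [330, 900], [1500], [280, 285], [620]]
onsets = [first_spike_latency(t) for t in trials]
print(onsets, response_latency(onsets))
```

For a unit that responds to several stimuli, the per-stimulus values returned by `response_latency` would then be combined by taking their median, as described in the text.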
For the nonparametric correlation analysis, selectivity of each unit was operationally defined as the reciprocal value of the relative number of response-eliciting stimuli.
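As a brief illustration of this definition, the following Python sketch computes selectivity as the reciprocal of the fraction of stimuli that elicited a response. The function name and the example numbers are hypothetical and are not taken from the recorded data.

```python
def selectivity(n_response_eliciting, n_stimuli_total):
    """Reciprocal of the relative number of response-eliciting stimuli.

    Equivalent to n_stimuli_total / n_response_eliciting; only defined for
    responsive units (at least one response-eliciting stimulus).
    """
    if n_response_eliciting < 1 or n_response_eliciting > n_stimuli_total:
        raise ValueError("need 1 <= n_response_eliciting <= n_stimuli_total")
    return n_stimuli_total / n_response_eliciting


# Hypothetical session with 100 stimuli: fewer responses, higher selectivity.
print(selectivity(2, 100), selectivity(5, 100))
```

Because the subsequent correlation analysis is rank based, only the ordering of these values across units matters, not their absolute scale.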
Baseline firing rates of the responsive cells were calculated from the 500 ms before stimulus onset and quantified as the median across six presentations. For a neuron responding to more than one stimulus, the median of the baseline rates for different stimuli was taken.
Results
During 96 sessions, we recorded from 3278 neurons (1356 multi units, 1922 single units) in 35 subjects with pharmacologically intractable epilepsy (29 right handed, 20 male, 17–54 years old), implanted with chronic electrodes to localize the seizure focus for possible surgical resection (Fried et al., 1997). We report data from microelectrode recordings in the hippocampus, amygdala, entorhinal cortex, and parahippocampal cortex [in the part of the parahippocampal gyrus that is posterior to the entorhinal and perirhinal cortex (cf. Insausti et al., 1998)]. Each recording session lasted ∼30 min. Subjects were sitting in bed, facing a laptop computer on which ∼100 pictures per session of different famous individuals, landmarks, animals, or objects were displayed for 1 s each, with six repetitions in pseudorandom order. Onset latencies for responsive units (i.e., units showing a significant increase in firing rate relative to baseline) were determined by Poisson spike train analysis (Hanes et al., 1995). Examples of responses from five different neurons in each MTL region are displayed in Figure 1.
A total of 398 units [47 of 293 (16%) in the parahippocampal cortex; 79 of 844 (9%) in the entorhinal cortex; 171 of 1194 (14%) in the hippocampus; 101 of 947 (11%) in the amygdala] responded significantly to one or more of the presented stimuli (cf. Waydo et al., 2006). Response latencies of these neurons yielded unimodal, localized distributions in all four regions (Fig. 2, top, middle). Average response latencies in the parahippocampal cortex (271 ms) were significantly earlier than those in the entorhinal cortex (392 ms), hippocampus (394 ms), and amygdala (397 ms), preceding these regions typically by >100 ms [Figs. 2 (bottom), 4A].
Because we used an automated, objective criterion to select responsive neurons and determine their response latencies, we cannot expect the specificity of our approach to be perfect, and the distributions in Figure 2 may thus be contaminated by a small percentage of spurious latencies, affecting especially the tails of the distributions. We identified the earliest reliable response latencies by visual inspection and found them to be 101 ms for the parahippocampal cortex, 206 ms for the entorhinal cortex, 204 ms for the hippocampus, and 220 ms for the amygdala.
Given the all-or-none character of the firing response (Quiroga et al., 2007), we evaluated response selectivity by the number of stimuli to which a neuron responded. Like the latency, selectivity varied across regions (Figs. 3, 4B). Whereas parahippocampal neurons responded on average to approximately five stimuli, neurons in the other three regions showed a significantly higher selectivity with an average of approximately two response-eliciting stimuli (Fig. 3, bottom). To rule out an influence of the total number of stimuli per session, the analysis was repeated after normalizing the number of response-eliciting stimuli by the total stimulus number, yielding analogous results (supplemental Fig. S1, available at www.jneurosci.org as supplemental material).
Analysis of baseline firing rates for the different MTL regions showed higher baseline activity for responsive neurons in the parahippocampal and entorhinal cortex than for hippocampal and amygdala neurons, but no prominent difference of the parahippocampal cortex from the other three regions as observed for latency and selectivity (Fig. 4C; supplemental Fig. S2, available at www.jneurosci.org as supplemental material).
Nonparametric correlation analysis across all 398 responsive neurons confirmed a highly significant direct relationship between latency and selectivity (Spearman's ρ = 0.24; p = 9.5 × 10−7). Separate regional analysis (Fig. 5) confirmed a statistically significant correlation between latency and selectivity in the parahippocampal cortex (p = 0.00009), entorhinal cortex (p = 0.008), and hippocampus (p = 0.038), but not in the amygdala (p = 0.495).
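For readers unfamiliar with the statistic, Spearman's ρ is the Pearson correlation computed on ranks, with tied values receiving their average rank. The sketch below is a minimal pure-Python version for illustration only. It does not compute p-values, it is not the analysis code used here, and the example values are invented rather than drawn from the recordings.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented latencies (ms) and selectivity values for five hypothetical units.
latency = [250, 300, 380, 410, 450]
select = [20.0, 25.0, 50.0, 50.0, 100.0]
print(spearman_rho(latency, select))
```

A positive ρ, as reported above, means that units with longer latencies tend to rank higher in selectivity.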
Both latency and selectivity, furthermore, showed a significant inverse correlation with the baseline firing rates across all 398 MTL neurons (Spearman's ρ = −0.24, p = 8.1 × 10−7; Spearman's ρ = −0.12, p = 0.02, respectively).
Finally, to rule out an influence of the underlying pathology of an epileptic brain, we repeated the entire analysis after excluding 65 cells (amounting to 16%) that were located in the same brain hemisphere as the epileptic focus. All findings demonstrated in this study remained valid and significant, and mean response latencies changed by <10 ms on average.
Discussion
The latencies found in the parahippocampal cortex, entorhinal cortex, and hippocampus reflect the well-established neuroanatomical connections of these structures, with the entorhinal cortex providing the predominant input to the hippocampus and receiving major connections from the parahippocampal region (Suzuki and Amaral, 1994). The finding that latencies of responsive amygdala neurons do not significantly differ from entorhinal and hippocampal latencies is consistent with neuroanatomical evidence that the amygdala has monosynaptic connections to the entorhinal cortex and hippocampus (Suzuki, 1996; Pitkänen et al., 2002). Inputs to the amygdala originate from various sensory areas and other subcortical and cortical regions, among them the perirhinal cortex, which in turn receives parahippocampal inputs. Whereas some imaging studies have inferred a fast, subcortical sensory pathway to the amygdala (cf. Ohman et al., 2007), our results indicate that at least for the explicit, selective neural representations observed here, the processing time as reflected by the firing latency in the amygdala is comparable with entorhinal and hippocampal responses that are presumably generated along the ventral visual pathway.
The latencies we observed are substantially larger than visual MTL latencies reported in monkeys. As can be seen from Table 1, mean latencies in the macaque entorhinal cortex, hippocampus, and amygdala range from ∼150 to 200 ms, with latencies in the perirhinal cortex being somewhat earlier but still significantly later than those in inferotemporal cortex (IT). To the best of our knowledge, no visual response latencies from neurons in the monkey parahippocampal cortex (area TH/TF) have been reported to date. Likewise, neuronal response latencies have not yet been reported for the ventral visual pathway in humans. Considering the data available, we can estimate the visual response latencies in the human MTL to be approximately twice as long as those observed in macaque monkeys. Based on this ratio, one could extrapolate the latencies in the human homolog of monkey IT [presumably the lateral occipital complex (cf. Grill-Spector and Malach, 2004)] to lie in the vicinity of 200 ms.
However, given the remarkable speed at which humans can discriminate stimulus categories (120 ms) (Kirchner and Thorpe, 2006), it is also conceivable that the latencies in monkey IT and its human homolog do not differ greatly. In this case, the major difference would be the substantially longer delay between object recognition and the MTL latencies we found. This in turn would suggest that IT responses in humans may undergo extensive further processing, possibly involving other regions, before reaching MTL and eliciting the highly selective responses observed here.
The visual tasks used in monkey MTL studies typically involve discrimination of novel versus familiar stimuli or association of different stimulus features. Recent electrophysiological studies in human subjects performing a learning task with initially unfamiliar stimuli likewise reported evidence for hippocampal and amygdala neurons that act as novelty or familiarity detectors without being stimulus specific (Rutishauser et al., 2006, 2008). It should be noted, however, that the responses described there are conceptually different from ours in that our stimulus material consists of images of single objects or persons that are already familiar to the subject and that no memory or association task is involved. Rather, these cells have been shown to encode with a high degree of invariance the category or identity of a presented object or person (Quiroga et al., 2005). A possible functional role of these neurons is to provide the link between perception and memory storage (Quiroga et al., 2008).
Remarkably, we find a prominent leap both in latency and selectivity between the parahippocampal cortex and its major projection area, the entorhinal cortex. Our data cannot unravel the detailed mechanisms of this hierarchical processing, but the involvement of various processes is conceivable. In the olfactory system of the locust, for instance, sparsening of representations is achieved by periodic feedforward inhibition (Perez-Orive et al., 2002). A similar mechanism could possibly be mediated by interneurons phase-locked to mediotemporal oscillations (Somogyi and Klausberger, 2005). Modulating influences could further arise through feedback loops between the parahippocampal cortex and inferotemporal and/or cingulate cortex, respectively (Suzuki and Amaral, 1994).
The direct relationship between latency and selectiveness of the visual responses observed here indicates a general mechanism of hierarchical processing (Grill-Spector and Malach, 2004; Freedman and Miller, 2008) by which increasingly refined and specific representations of stimulus identity are achieved over time along the anatomic pathways of the MTL (cf. Squire et al., 2004). A remarkable finding from our study is that this type of hierarchical processing occurs not only across different MTL regions, but also within regions such as the parahippocampal and entorhinal cortex and the hippocampus.
Interestingly, early and less-selective responses tended to be generated by cells with high baseline firing rates, whereas cells that responded later and more selectively tended to exhibit rather sparse baseline activity. Future technological advances may allow simultaneous recording of larger cell populations and thus provide an opportunity to directly monitor the detailed mechanisms by which these cells implement the hierarchical processing described in this study.
This work was supported by the European Commission (Marie Curie Outgoing International Fellowship 040445), National Institute of Neurological Disorders and Stroke, Defense Advanced Research Projects Agency, the Engineering and Physical Science Research Council, and the Mathers Foundation. We thank all patients for their participation, and Eric Behnke, Tony Fields, Emily Ho, Eve Isham, Kelsey Laird, Neel Parikshak, and Anna Postolova for technical assistance.
Correspondence should be addressed to Dr. Florian Mormann, California Institute of Technology, Division of Biology, MS 216-76, 1200 East California Boulevard, Pasadena, CA 91125.