Abstract
When the same visual stimulus is repeatedly presented with a brief interval, the brain responses to that stimulus are attenuated relative to those at first presentation [neural adaptation (NA)]. Although this effect has been widely observed in various regions of the human brain, its temporal dynamics at the level of the neuronal population have remained largely unclear. In the present study, we used magnetoencephalography (MEG) and conducted a macrolevel investigation of the temporal profiles of the NA occurring in the human visual ventral stream. The combination of MEG with our previous random dot blinking method isolated the neural responses in the higher visual cortex relating to shape perception. We dissociated three dimensions of the NA: activation strength, peak latency, and temporal duration of neural response. The results revealed that visual responses to repeated compared with novel stimuli showed a significant reduction in both activation strength and peak latency but not in the duration of neural processing. Furthermore, this acceleration of peak latency showed a significant correlation with reaction time of the subjects, whereas no correlation was found between the reaction time and the temporal duration of neural responses. These results indicate that (1) the NA involves brain response changes in the temporal domain as well as the response attenuation reported previously, and (2) this temporal change is primarily observed as a rapid rising of “what” responses, rather than a temporal shortening of neural response curves within the visual ventral stream as considered previously.
Introduction
One common finding in neurophysiological and neuroimaging studies is a reduced neural response to repeated compared with unrepeated stimuli (Schacter and Buckner, 1998; Wiggs and Martin, 1998; Henson and Rugg, 2003). Although this attenuation was first reported in inferior temporal (IT) neurons of monkeys (Baylis and Rolls, 1987; Brown et al., 1987; Miller et al., 1991; Desimone, 1996; Ringo, 1996), recent studies using positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) showed that the effect also occurs in various regions of the human brain, including the occipital (Grill-Spector et al., 1999; Kourtzi and Kanwisher, 2000), parietal (Naccache and Dehaene, 2001), and inferior frontal cortices (Raichle et al., 1994; Thompson-Schill et al., 1999; van Turennout et al., 2000; Wagner et al., 2000).
Despite mounting evidence of the repetition attenuation, the temporal profiles of this effect as a neural network have been poorly understood, probably because of the limited temporal resolution of hemodynamic imaging methods such as PET and fMRI. Although a decrease in the blood oxygen level-dependent (BOLD) signal has been regarded as a sign of this effect in fMRI studies, a previous study indicated that this can arise from at least two types of neural activity (Henson and Rugg, 2003) (Fig. 1). One explanation is a reduced firing rate as a whole neural population in each cortical area (Fig. 1a). This view is strongly supported by many studies of unit cell recordings and the “sharpening” theory propounded by Wiggs and Martin (1998). According to this theory, neurons that are not critical for recognizing an object decrease their responses as the object is repeatedly presented, whereas those carrying essential information continue to give a robust response. As a result, the mean firing rate as a whole is attenuated by stimulus repetition. In contrast, a reduction in BOLD responses can also be explained by the response change in the temporal domain: a shortened duration of neural activity (Henson and Rugg, 2003) (Fig. 1b). This account is based on the fact that the hemodynamic response represents the integration of several seconds of neural-synaptic activity, and it has been shown to be feasible by a previous neural computation theory (Becker et al., 1997). In this theory, the neural processing network settles to a stable response more quickly in response to a repeated than to a novel stimulus, because the network connections involved in producing the response have been reinforced by a previous presentation of the same stimulus. Indeed, a recent fMRI study reported results supporting this view (Henson et al., 2002a), although their analysis requires the assumption of a precise linearity of hemodynamics.
Several hypotheses accounting for a BOLD signal decrease in NA effect. The neural activity (square function) and predicted BOLD signal (round function; in arbitrary units) in response to a novel (broken line) and repeated (solid line) stimulus are shown in each panel. a, Activation reduction hypothesis. The decrease in mean firing rate in the repeated condition results in a smaller BOLD amplitude. b, Shortened duration hypothesis. The stimulus repetition elicits a temporal acceleration of neural processing, which leads to a reduction in the BOLD response. These two models were originally reviewed by Henson and Rugg (2003) and can be distinguished by a difference in BOLD peak latency on the assumption that any nonlinear term in the neural-hemodynamic relationship can be excluded [Henson and Rugg (2003), reprinted with permission in Henson et al. (2002a)]. c, Integrative model of the response attenuation and temporal acceleration proposed in the present study. The repetition of the same stimulus produces a more rapid and smaller neural response in this model. However, the temporal duration is not affected by stimulus repetition. The reference function of hemodynamics was adopted from Friston et al. (1998).
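The distinction between hypotheses a and b can be illustrated with a minimal numerical sketch, assuming a strictly linear neural-hemodynamic relationship. The gamma-shaped response function, its parameters, and the amplitudes and durations below are illustrative assumptions, not the reference function of Friston et al. (1998) or values taken from the data.

```python
import math

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Gamma-shaped hemodynamic response (illustrative parameters only)."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (scale ** shape * math.gamma(shape))

def predict_bold(amplitude, duration, dt=0.05, total=20.0):
    """Convolve a boxcar of neural activity (onset at 0 sec) with the response function."""
    n = int(round(total / dt))
    n_on = int(round(duration / dt))          # samples during which neurons fire
    hrf = [gamma_hrf(i * dt) for i in range(n)]
    bold = []
    for i in range(n):
        acc = sum(amplitude * hrf[i - k] for k in range(min(i, n_on - 1) + 1))
        bold.append(acc * dt)
    return bold

def bold_peak(bold, dt=0.05):
    """Return (peak value, peak latency in sec) of a predicted BOLD curve."""
    i = max(range(len(bold)), key=bold.__getitem__)
    return bold[i], i * dt

novel = bold_peak(predict_bold(amplitude=1.0, duration=1.0))
reduced = bold_peak(predict_bold(amplitude=0.5, duration=1.0))    # hypothesis a
shortened = bold_peak(predict_bold(amplitude=1.0, duration=0.5))  # hypothesis b
print(novel, reduced, shortened)
```

Under these assumptions both manipulations lower the predicted BOLD peak, but only the shortened duration moves the peak earlier, which is the latency difference the caption refers to.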
In the present study, we used magnetoencephalography (MEG) with a high temporal resolution and measured directly the neural responses underlying the neural adaptation (NA) effect in the human shape perception area (Grill-Spector et al., 1999; Kourtzi and Kanwisher, 2001). As a result, our data supported the integrative model of the response reduction and temporal acceleration, as shown in the third hypothesis depicted in Figure 1c.
Materials and Methods
Subjects. Ten healthy volunteers participated in the present study (seven males and three females). All subjects had normal or corrected-to-normal visual acuity. Informed consent was received from each participant after the nature of the study had been explained. Approval for these experiments was obtained from the ethics committee of the National Institute for Physiological Sciences (Okazaki, Japan).
Stimuli and task. One problem with MEG experiments on visual function is that the neuronal signals from the higher visual areas are difficult to distinguish from those in early visual cortex (such as V1) in most cases because of the insufficient modeling quality of V1 activities. Because neural responses in the early visual areas are relatively insensitive to the repetition effect compared with those in the later visual areas (Buckner et al., 1998; Schacter and Buckner, 1998), the confounding of early visual signals into MEG data in the present study would obscure the NA effect occurring in higher visual areas. Given recent studies reporting that V1 area receives a delayed feedback signal from the higher visual cortex at a latency of 190-230 msec (Noesselt et al., 2002; Halgren et al., 2003) in addition to the primary visual input from the thalamus, it would be difficult to exclude the early visual signals on the basis of signal latency. We therefore presented visual stimuli based on our random dot blinking (RDB) technique developed previously (Okusa et al., 1998). With this method, characters are presented in the center of a black and white random dot field. Although all dots in the field flicker continuously in the resting state, a subset of dots becomes static during the character presentation period, whereas the other dots remain dynamic (Fig. 2a). This static-dynamic contrast enables observers to perceive the shape of a letter. Because the ratio of white and black pixels is fixed throughout both periods, the mean luminance of the field is always the same. Our previous study has shown that this stimulation paradigm effectively inhibits the neural responses from the V1 area and elicits one simple component of magnetic response at a peak latency of ∼300 msec, the signal source of which is estimated to lie in the occipito-temporal area around the fusiform gyrus.
Schematic illustration of the stimulus presentation paradigm used in the present study. a, RDB method for the presentation of letter stimuli. Uppercase letters were presented through a 60 × 60 random dot field in the central visual field, although the number of dots is reduced in this figure. Although each dot in the field is vibrating at 60 Hz in the resting period (left panel), the dynamic-static segregation of dots elicits a perception of visual letters in the character presentation period (middle panel). The contour of A depicted by the thick line is shown only for this illustration. The actual density of the dot is shown in the right panel. In this figure, stimulus images of four consecutive frames during the presentation are overlaid to illustrate the dynamic and static texture. The dots in the static texture stay at the same place, thus their overlaid image appears solid. In contrast, the overlaid dynamic dots show various appearances resulting from the dot motion. b, Event charts in the SINGLE and three ISI conditions. Although only S1 (300 msec) was presented in the SINGLE condition, S1 (300 msec) and S2 (500 msec) were sequentially presented in the ISI 150, 250, and 350 conditions. In all conditions, the averaging epoch of MEG signal was from -100 to 1200 msec relative to the S1 onset.
We used the RDB method for sequential presentation of two visual stimuli (letters) in the central visual field of the subjects. All stimuli were presented in a random dot field subtending a visual angle of 6 × 6° with a 60 × 60 dots array on the projector screen at a viewing distance of 250 cm. For the dynamic texture, each dot (2 × 2 pixel) changed its position within a 3 × 3 pixel area every 16 msec in a pseudorandom manner to produce vibrating motion. For the static texture, the dots remained stationary. The ratio of white to black pixels was fixed at 1:3 throughout the entire scanning period.
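The static-dynamic segregation can be sketched as follows. This is a simplified illustration, assuming each 2 × 2 pixel dot owns its own 3 × 3 pixel cell and may repeat a position from one frame to the next; the function names, the tiny grid, and the bar-shaped mask are hypothetical, and the sketch does not attempt to reproduce the 1:3 white-to-black ratio of the actual display.

```python
import random

# Four positions a 2x2-pixel dot can occupy inside its 3x3-pixel cell.
OFFSETS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def make_rdb_frames(static_mask, n_frames, seed=0):
    """Return per-frame dot offsets; static_mask[r][c] is True for frozen dots."""
    rng = random.Random(seed)
    rows, cols = len(static_mask), len(static_mask[0])
    current = [[rng.choice(OFFSETS) for _ in range(cols)] for _ in range(rows)]
    frames = [current]
    for _ in range(n_frames - 1):
        # Static dots keep their offset; dynamic dots draw a new one each frame.
        current = [
            [current[r][c] if static_mask[r][c] else rng.choice(OFFSETS)
             for c in range(cols)]
            for r in range(rows)
        ]
        frames.append(current)
    return frames

def render(frame):
    """Rasterize one frame into a binary pixel image (1 = white)."""
    rows, cols = len(frame), len(frame[0])
    image = [[0] * (3 * cols) for _ in range(3 * rows)]
    for r in range(rows):
        for c in range(cols):
            dy, dx = frame[r][c]
            for y in range(2):
                for x in range(2):
                    image[3 * r + dy + y][3 * c + dx + x] = 1
    return image

# Hypothetical 5 x 5 field in which a vertical bar (column 2) is static.
bar_mask = [[c == 2 for c in range(5)] for _ in range(5)]
frames = make_rdb_frames(bar_mask, n_frames=8, seed=1)
white_counts = [sum(map(sum, render(f))) for f in frames]
print(white_counts)
```

An all-False mask corresponds to the resting state, in which every dot flickers. Because each dot always covers the same number of pixels, the white-pixel count is identical in every frame, which mirrors the constant mean luminance of the method.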
We used six uppercase letters (A, O, E, B, K, P) as letter stimuli. Each letter was used as both S1 (first stimulus) and S2 (second stimulus). The display duration of S1 and S2 was 300 and 500 msec, respectively. The time interval between S1 and S2 [interstimulus interval (ISI)] was either 150, 250, or 350 msec (Fig. 2b), and there were two kinds of trials for each ISI. In SAME trials, the same letter was repeatedly presented as S1 and S2. In DIFF trials, the S1 and S2 letters differed. Because each letter was presented as S1 or S2 in both SAME and DIFF trials at equal times, the difference in brain responses between these two types of trials cannot be attributed to the difference in the visual features of the stimuli presented. Apart from these six conditions (three ISIs for SAME or DIFF), we introduced a control condition in which only S1 was presented for 300 msec (SINGLE condition).
A single scanning session of MEG recordings started with six trials of the SINGLE condition during which subjects were instructed to look passively at the letter presented (no-task period). This period was followed by 72 trials with paired letter stimuli (task period). In this period, stimulus pairs in the six conditions (12 trials for each) were randomly intermixed, and subjects were asked to perform a vowel-consonant judgment task with S2, not S1, characters. They were instructed to press one button as quickly as possible when the S2 letter was a vowel (A, O, E) and another button when it was a consonant (B, K, P). All responses were made by the right hand of the subjects. The session ended with another no-task period composed of six trials of the SINGLE condition. To prevent the task and no-task periods from being confused, cue stimuli showing the switch between the two periods were presented. The numerals 2 and 1 were presented at the beginning and end of the task period, respectively. A scanning session containing a total of 84 trials (∼5 min) was repeated six times in one experiment. Every three trials, a brief interval (2 or 5 sec) was interposed in which subjects were allowed to blink their eyes. Considering the previous results on the repetition attenuation effect (Henson et al., 2000, 2002b), the use of familiar stimuli (letters) and an indirect task (a task that does not require an explicit recollection of previous events) in the present study would highlight the NA in the higher visual regions.
MEG recordings. The visual-evoked fields (VEFs) were recorded with a helmet-shaped 306-channel detector array (Vectorview; ELEKTA Neuromag, Helsinki, Finland), which comprised 102 identical triple sensor elements. Each sensor element consisted of two orthogonal planar gradiometers and one magnetometer coupled to a multi-SQUID (superconducting quantum interference device) and thus provided three independent measurements of the magnetic fields. In the present study, we analyzed MEG signals recorded from 204-channel planar-type gradiometers. The signals from these sensors are strongest when the sensors are located just above local cerebral sources (Nishitani and Hari, 2002). The MEG signals were recorded with 0.1-200 Hz bandpass filters and digitized at 900 Hz.
Before MEG recordings, four head position indicator (HPI) coils were placed at specific sites on the scalp. To determine the exact head location with respect to the MEG sensors, electric current was fed to the HPI coils, and the resulting magnetic fields were measured with the magnetometer. These procedures allowed for alignment of the individual head coordinate system with the magnetometer coordinate system. The locations of HPI coils with respect to the three anatomical landmarks (nasion and bilateral preauricular points) were also measured using a three-dimensional digitizer to align the coordinate systems of MEG with magnetic resonance (MR) images obtained with a 3 tesla MRI system (Allegra; Siemens, Erlangen, Germany). We adopted the head-based coordinate system used in our previous study (Wasaka et al., 2003). The x-axis was fixed with the preauricular points, with the positive direction to the right. The positive y-axis passed through the nasion, and the z-axis thus pointed upward.
Data analyses. The signals in the seven conditions were averaged separately, time-locked to the onset of S1 stimuli. The averaging epoch ranged from 100 msec before to 1200 msec after the S1 onset, and the prestimulus period (initial 100 msec) was used as the baseline. Epochs in which signal variation was larger than 3000 fT were excluded from the averaging. For detecting the occipital signals related to letter recognition, we took the sensor of interest (SOI) approach described in a previous MEG study (Liu et al., 2002). Initially, we selected the SOIs in the present study from 204 planar channels according to the following criteria: the peak deflection in the SINGLE condition lies 200-400 msec after the S1 onset, and a significant deflection (>2 SD of the fluctuation level in baseline period of each channel) continues for at least 60 msec centering on the peak latency. These criteria are based on our previous results reporting the fusiform activation at a latency of ∼300 msec (Okusa et al., 1998). An average of 22.2 SOIs was selected for each subject. We then calculated the mean waveform of all SOIs in each subject, after the data on SOIs showing a negative deflection were flipped to match the polarities (Liu et al., 2002). Using the across-SOI time series of each subject, we measured three independent parameters to test the repetition effect between SAME and DIFF: peak amplitude, peak latency, and full width at half-maximum (FWHM). The FWHM is the temporal interval where the signal amplitude is >50% of the peak. Because it is theoretically independent of variation in the vertical amplitude of the MEG waveform, the FWHM corresponds to an index of the temporal duration of neural activity induced in the ventral visual cortex. Regarding the responses to S1, the peak amplitude and latency were defined as the maximum signal strength (and its latency) in the across-SOI waveforms within the time window of -100 to ∼500 msec relative to the S1 onset. The time window was set at 500 to ∼1200 msec for S2 responses. Once the peak amplitude was determined, we calculated 50% of the peak for each response waveform (half maximum). The time interval between the first and last time points in which signal strength is above the half maximum was defined as the FWHM of this waveform.
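The three parameters can be sketched in a few lines. This is an illustrative reading of the definitions above, not the original analysis code: the function names, the 1 msec sampling grid, and the Gaussian test waveform are assumptions, and the search for half-maximum crossings is restricted to the same time window as the peak search.

```python
import math

def across_soi_mean(waves):
    """Average SOI waveforms after flipping channels whose largest deflection is negative."""
    flipped = []
    for wave in waves:
        sign = -1.0 if max(wave, key=abs) < 0 else 1.0
        flipped.append([sign * v for v in wave])
    return [sum(samples) / len(flipped) for samples in zip(*flipped)]

def measure_response(times_ms, wave, t_min, t_max):
    """Return (peak amplitude, peak latency, FWHM) inside [t_min, t_max] msec."""
    window = [i for i, t in enumerate(times_ms) if t_min <= t <= t_max]
    peak_i = max(window, key=lambda i: wave[i])
    half_max = 0.5 * wave[peak_i]
    # First and last samples above the half maximum define the FWHM.
    above = [i for i in window if wave[i] > half_max]
    fwhm = times_ms[above[-1]] - times_ms[above[0]]
    return wave[peak_i], times_ms[peak_i], fwhm

# Hypothetical data: 1 msec sampling, Gaussian response peaking at 300 msec,
# recorded with opposite polarity on two sensors.
times = list(range(-100, 1201))
channel = [50.0 * math.exp(-0.5 * ((t - 300) / 40.0) ** 2) for t in times]
soi_mean = across_soi_mean([channel, [-v for v in channel]])
amp, latency, fwhm = measure_response(times, soi_mean, -100, 500)
print(amp, latency, fwhm)
```

For S2 responses the same function would be called with the 500 to 1200 msec window.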
Apart from the SOI analyses, we estimated the single equivalent current dipole (ECD) on response waveforms to confirm the anatomical location of VEF sources. We adopted a spherical head model based on individual MR images (Hamalainen et al., 1993). The ECDs best explaining the distribution of the magnetic field over at least 20 channels around the signal maxima were searched by the least square method (Wasaka et al., 2003). We accepted only ECDs that accounted for at least 80% of the field variance at the peak.
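The acceptance criterion can be illustrated with one common definition of the explained field variance, g = 1 - Σ(b - b̂)² / Σ b², where b is the measured and b̂ the dipole-predicted field across the selected channels. The formula is an assumption here (the text does not state it), and the field values below are hypothetical.

```python
def goodness_of_fit(measured, predicted):
    """Fraction of measured field variance explained by the dipole model."""
    residual = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    total = sum(m ** 2 for m in measured)
    return 1.0 - residual / total

def accept_dipole(measured, predicted, threshold=0.80):
    """Keep an ECD only if it explains at least `threshold` of the variance."""
    return goodness_of_fit(measured, predicted) >= threshold

# Hypothetical field values (arbitrary units) on four channels.
measured_field = [10.0, -5.0, 8.0, -12.0]
predicted_field = [9.0, -4.0, 9.0, -10.0]
print(goodness_of_fit(measured_field, predicted_field),
      accept_dipole(measured_field, predicted_field))
```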
Results
Figure 3a shows the VEF in the SINGLE condition for one subject. Clear deflections were observed primarily in MEG channels on the lateral sides of both hemispheres. Magnetic responses around the occipital pole were relatively small, indicating that neural responses in the early visual areas were successfully inhibited by the RDB stimulus. Figure 3b shows the superimposed waveform of all SOIs in the same subject. Consistent with our previous study (Okusa et al., 1998), a large component was observed at a latency of ∼300 msec. We plotted in Figure 3c the waveform of all conditions in two representative SOIs marked in Figure 3a. Magnetic responses to S2 that have the same polarity as those to S1 were clearly observed in the six conditions with the paired letter presentation, but they were absent in the SINGLE condition. The S2 response latencies reflected the difference of S2 onset in the three ISI conditions. Within each ISI, the S2 response in the SAME condition appeared to show earlier peak latency (latency at the peak) and smaller deflection than that in the DIFF condition.
MEG signals of a representative subject. a, VEF waveforms over 204 planar coils in the SINGLE condition. b, Superimposed waveform of 33 SOIs in the same subject. c, MEG waveforms in two SOIs encompassed in a. In addition to the MEG signals in the SINGLE condition (black line), waveforms in the other six conditions are also shown in this panel. The blue, green, and red lines correspond to VEFs in the ISI 150, 250, and 350 conditions, respectively. For each ISI condition, the data in the SAME (solid line) and DIFF (broken line) trials were plotted. All data in a--c were digitally filtered (0.1-30 Hz bandpass) for display purposes.
The results of dipole analyses indicated that all ECDs calculated on MEG signals in the SINGLE condition were estimated in the vicinity of the occipito-temporal cortex around the fusiform gyrus, which also confirmed our previous results (Okusa et al., 1998). In Figure 4a, the mean dipole location of each hemisphere across subjects is shown on the MR image of a representative subject. According to the head-based coordinate system used, the mean (±SE) coordinates were (-31 ± 9.6, -26 ± 3.5, 44 ± 3.0) for the left and (35 ± 3.2, -20 ± 4.2, 48 ± 3.1) for the right hemisphere. There was no significant difference in ECD locations between the two hemispheres (p > 0.1, for all axes). These ECD results are reinforced by two topography maps depicted over the field of 102 sensor positions in MEG. Figure 4b shows the distribution of 222 SOIs selected from the data of 10 subjects and thus represents how many times each sensor was selected as an SOI. We also made another contour map (Fig. 4c) in which the mean deflection of MEG waveforms at 300 msec of the SINGLE condition is plotted for each sensor position. The results of both maps indicate that the signal sources of the 300 msec component are located in the bilateral occipito-temporal regions.
Anatomical location of the signal source in the SINGLE condition. a, ECD location estimated at the peak of the large component shown in Figure 3b. The mean dipole coordinates (5 and 7 dipoles in the left and right hemispheres, respectively) across subjects were plotted on the MR image of a representative subject. b, The distribution of SOIs across 10 subjects. The number of SOIs in each sensor position was summed across subjects and color-coded on a contour map depicted over the topographical layout of 102 sensor positions. c, MEG signal strength at 300 msec of the SINGLE waveform; same as b, but the mean deflection across 10 subjects was represented in this map. The data in two planar sensors (latitudinal and longitudinal) were averaged for each position. Note that both topographical maps (b, c) show the distinct neural responses in bilateral occipito-temporal regions.
We showed in Figure 5 the across-SOI waveforms of one subject (Fig. 5a) and grand mean of 10 subjects (Fig. 5b). In S2 responses, both peak amplitude and peak latency were clearly different between the SAME and DIFF trials in all three ISI conditions. In contrast, S1 responses in the six conditions with a paired stimulus were almost identical, although, in the grand mean waveform, peak latency was significantly delayed in the SINGLE condition compared with the other six conditions (p < 0.05; t test), probably because of the lack of a task requirement in the SINGLE condition. To closely investigate the temporal profiles of the response waveforms regardless of their amplitude differences, we also presented the grand mean time courses normalized to the S2 peak amplitude of each paired-stimulus condition (Fig. 5c). In all three ISI conditions, the S2 responses in the SAME trials reach their peak more rapidly than those in the DIFF trials. The SAME responses also precede the DIFF in their signal decreases.
Across-SOI waveforms of seven conditions. a-c, The data for one representative subject (a), grand mean waveforms of 10 subjects (b), and normalized grand mean time courses (c) are shown. In c, the peak amplitude of S2 response in each paired-stimulus condition was represented as 1. As in Figure 3c, the blue, green, and red lines represent ISI 150, 250, and 350 conditions (solid, SAME; broken, DIFF). The event charts for the three ISI conditions are also displayed below each time series.
We then examined the NA effect statistically by calculating the three independent parameters on the across-SOI time series of each subject: peak amplitude, peak latency, and FWHM of response waveforms (Fig. 6). For each panel of Figure 6, we used a repeated-measures ANOVA of ISI × repetition (SAME vs DIFF), with repetition as a within-subject factor. Mauchly's tests indicated that the sphericity assumption was not rejected in any comparison. In the S1 response, there was no significant main effect or interaction in any parameter (p > 0.05 for all) (Fig. 6a,c,e). In contrast, the peak amplitude and latency of the S2 response showed a significant main effect of repetition (p < 0.0001 for both; the other effects were not significant, p > 0.05) (Fig. 6b,d). However, there was no significant effect in the FWHM of the S2 signal (p > 0.05) (Fig. 6f), although the DIFF trials tended to have a larger FWHM value than the SAME trials (main effect of repetition, F = 4.037; p > 0.05). In Figure 7, we plotted the repetition effect (DIFF-SAME) of the three indices. One-group t tests applied to each bar demonstrated that a significant effect of repetition (SAME < DIFF) in S2 peak amplitude and latency could be observed in all three ISIs (Fig. 7a,b) (p < 0.05 for all), whereas no repetition effect of FWHM was observed in any ISI condition (Fig. 7c) (p > 0.05). The means and SEs of these results are summarized in Table 1.
Statistical comparisons of brain responses between the SAME and DIFF trials. a-f, For both S1 (a, c, e) and S2 (b, d, f) responses, three parameters were calculated from the across-SOI waveform of each subject: peak amplitude (a, b), peak latency (c, d), and FWHM (e, f). The blank and filled bars indicate the data in the SAME and DIFF trials, respectively. Error bars denote the SE across 10 subjects. Note that peak latency to S2 (d) was measured from the S2 onset in each ISI condition.
The repetition effects measured by three indices. a-c, The differences between DIFF and SAME were shown for each parameter: peak amplitude (a), peak latency (b), and FWHM (c). Error bars denote the SE across 10 subjects. *p < 0.05; **p < 0.01; one group t test.
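The one-group t test applied to each bar of Figure 7 can be sketched as below. The per-subject DIFF-SAME differences are hypothetical numbers chosen for illustration, not measured values, and the critical value 2.262 is the standard two-tailed 5% threshold for 9 degrees of freedom (10 subjects).

```python
import math

T_CRIT_DF9 = 2.262  # two-tailed critical t for alpha = 0.05, df = 9

def one_sample_t(values):
    """Return (t statistic, degrees of freedom) for a test of mean == 0."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # unbiased estimate
    return mean / math.sqrt(variance / n), n - 1

# Hypothetical DIFF - SAME peak latency differences (msec), one per subject.
latency_diff = [12.0, 8.0, 15.0, 5.0, 10.0, 9.0, 14.0, 7.0, 11.0, 9.0]
t_value, df = one_sample_t(latency_diff)
print(t_value, df, abs(t_value) > T_CRIT_DF9)
```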
Neural responses to S2 and behavioral RT data
Although significant repetition effects were observed in S2 peak amplitude and latency, the signal source of these effects remains unclear, because the source location of the repetition effect (DIFF-SAME) was not directly investigated. We therefore conducted the ECD estimation again, using the DIFF-SAME waveform of each ISI condition. The across-subject mean and SE of these ECD locations were: ISI 150 (41 ± 4.5, -24 ± 4.2, 52 ± 4.0), ISI 250 (37 ± 4.8, -15 ± 5.4, 41 ± 3.8), ISI 350 (37 ± 3.8, -21 ± 5.0, 50 ± 4.0) for the right hemisphere and ISI 150 (-43 ± 9.7, -20 ± 5.9, 48 ± 5.4), ISI 250 (-27 ± 7.1, -43 ± 7.2, 50 ± 3.8), ISI 350 (-33 ± 8.2, -29 ± 10.3, 49 ± 4.3) for the left hemisphere. To examine whether these ECD locations were significantly different from those for the S1 peak, we performed a one-way ANOVA with four levels (S1 and three ISI conditions) for each axis and each hemisphere. No significant main effect was observed in any comparison (p > 0.05 for all). These results suggest a common neural generator of the repetition effect and the S1 SINGLE component.
Finally, we investigated the relationship between behavioral performance of the vowel-consonant judgment task and MEG waveforms. Subjects performed the task very well, and accuracy in the six conditions ranged from 97.5 to 98.3%. In contrast, a significant response time (RT) reduction in the SAME compared with DIFF condition was observed in all three ISIs (Fig. 8a) (main effect of repetition, p < 0.001; repeated-measures ANOVA), as seen in many behavioral results of previous studies (Raichle et al., 1994; Buckner et al., 1998; Thompson-Schill et al., 1999; van Turennout et al., 2000; Wagner et al., 2000; Naccache and Dehaene, 2001). In Figure 8b, results of correlation analyses between the task RT and the S2 peak latency (Fig. 6d) or S2 FWHM (Fig. 6f) are shown. Whereas peak latency exhibited a high correlation with RT data (r = 0.48; p < 0.0001), no significant correlation was found between FWHM and RT (r = 0.09; p > 0.05), indicating that behavioral responses are highly predictable based on the latency, but not temporal duration, of the occipito-temporal activation.
The relationship between behavioral data and MEG waveforms. a, RT data in the vowel-consonant judgment task measured from the S2 onset. As in Figure 6, the blank and filled bars indicate the data in the SAME and DIFF trials, and error bars denote the SE across 10 subjects. ***p < 0.0001; paired t test. b, Correlation analyses of the RT data with two temporal parameters of MEG response: S2 peak latency (Fig. 6d) and FWHM (Fig. 6f). The data for six conditions in 10 subjects (60 points) were plotted for each parameter. A significantly high correlation was observed between RT and peak latency (represented as •; r = 0.48; p < 0.0001; significance test for correlation coefficients) but not between RT and FWHM of the S2 response (○; r = 0.09; p > 0.05).
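The significance test for a correlation coefficient can be sketched with the usual conversion t = r√((n - 2)/(1 - r²)) with n - 2 degrees of freedom. The sketch below applies it to the reported coefficients with the 60 plotted points; it assumes this is the conversion used and treats the 60 points as independent observations.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """Convert a correlation coefficient into a t statistic with n - 2 df."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Reported coefficients, 60 points each (six conditions x 10 subjects).
t_latency = t_from_r(0.48, 60)   # RT vs S2 peak latency
t_fwhm = t_from_r(0.09, 60)      # RT vs S2 FWHM
print(t_latency, t_fwhm)
```

With 58 degrees of freedom the two-tailed 5% threshold is close to 2.0, so the latency coefficient lies well above it and the FWHM coefficient well below it.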
Discussion
In the present study, we conducted a macrolevel analysis of the temporal profiles of the NA effect in the shape perception area of the human brain. Although brain responses to two rapidly presented visual stimuli were previously indistinguishable from each other because of the limited temporal resolution of other macrolevel imaging techniques such as PET and fMRI, we examined the S1 and S2 responses in the occipito-temporal areas separately by combining MEG recording with the RDB method. In contrast to the S1 response in which no difference was observed between the SAME and DIFF trials, the S2 response showed a clear NA effect both in peak amplitude and peak latency. In contrast, the temporal width (FWHM) of the S2 response was not significantly affected by the stimulus repetition.
One previous study using intracranial event-related potential (ERP) reported that the first distinct response in visual ventral areas appears ∼200 msec after the stimulus onset (Puce et al., 1999). In contrast, the occipito-temporal responses in the present study were observed at a latency of 300 msec, 100 msec later than that in the intracranial ERP. We suppose that this delay in response latency was caused by the special characteristics of our RDB stimuli. Although letters in the ERP study were defined by the luminance difference from the background, our stimuli were presented through a static-dynamic contrast between letters and background. A recent MEG study (Schoenfeld et al., 2003) showed that the response latency in shape processing areas varies depending on how the visual stimuli are defined. According to that study, visual shape stimuli (squares and rectangles) defined by luminance cues activated a serial processing stream from V1 to lateral occipital (LO) and IT regions, whereas shapes defined by motion coherence of random dots (similar but not identical to ours) elicited activation in MT/V5 before evoking LO and IT responses. As a result, the response latency in LO and IT was delayed by 50-60 msec for motion-defined relative to luminance-defined shapes, despite the same response latency in early visual areas. Although it is unclear where in the brain our RDB stimuli are processed, it is possible that the RDB letters are processed in several areas (in the dorsal stream) before reaching occipito-temporal regions, resulting in some delay of activity in the visual ventral pathway.
In Figures 6 and 7, we showed that both peak amplitude and latency in the S2 response were significantly modulated by the stimulus repetition. However, these results may be explained merely by the amplitude reduction, because the decrease in peak amplitude inevitably involves the shortening of peak latency when SAME and DIFF show neural response curves with similar shapes. To examine this possibility, we calculated the correlation coefficient between the peak amplitude and peak latency. If the decrease in peak latency is a “by-product” of the amplitude attenuation, these two values would show a high positive correlation (the smaller the amplitude is, the shorter the latency becomes). The result revealed that only a weak correlation was observed between peak amplitude and latency (r = 0.17; p > 0.05), suggesting that the decreases in amplitude and latency occurred independently, although these changes appear to occur simultaneously in the grand-average time course (Fig. 5b). We also investigated the relationship between DIFF-SAME of peak amplitude and latency, only to find that their correlation was again not significant (r = 0.29; p > 0.05). In addition, the peak amplitude was also poorly correlated with the RT data (r = -0.14; p > 0.05), whereas a high correlation was observed between the peak latency and RT, as shown in Figure 8b. Our conclusion, therefore, is that the reductions in peak amplitude and peak latency were independent phenomena produced by the stimulus repetition, and the RT decreases observed behaviorally were primarily induced by the decrease in peak latency, not peak amplitude.
One characteristic of the present study is that the temporal durations of neural responses were measured by FWHM. It may seem unnatural that no significant difference between SAME and DIFF was observed in FWHM of S2 responses (Figs. 6, 7), because one would expect a longer processing of DIFF than SAME if the duration of activation was defined by the time interval, during which the amplitudes are above a certain criterion (Fig. 5b). This apparent inconsistency is caused by the fact that the FWHM is determined based on the size of the peak amplitude for each response curve and, thus, is free from the amplitude effects, at least theoretically. Indeed, we initially calculated the temporal duration of waveforms by applying the constant criterion (50% of S1 peak amplitude in the SINGLE condition) to the six conditions with paired stimuli. The result revealed that these measures were highly correlated with the peak amplitudes (r = 0.62; p < 0.0001), indicating that temporal and amplitude effects are mixed in this index. In contrast, the correlation between peak amplitude and FWHM is not significant (r = 0.24; p > 0.05), as predicted by the theoretical consideration above. Because our primary purpose is to investigate the temporal dynamics of the neural adaptation effect other than response attenuation, we adopted the FWHM as a better index to measure the temporal duration of waveforms regardless of their amplitude difference. However, it may be difficult to argue that the FWHM is completely independent from amplitude effects, because the correlation between them was not zero (0.24).
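The difference between the two duration indices can be seen in a minimal sketch, assuming two responses with an identical Gaussian shape that differ only in amplitude. The waveform, its width, and the amplitudes 1.0 and 0.6 are hypothetical; the fixed criterion of 0.5 stands in for 50% of a reference peak.

```python
import math

def duration_above(times_ms, wave, threshold):
    """Interval between the first and last samples exceeding the threshold."""
    above = [t for t, v in zip(times_ms, wave) if v > threshold]
    return above[-1] - above[0] if above else 0

def gaussian_response(times_ms, amplitude, center=300.0, sigma=40.0):
    """Hypothetical response curve with a fixed shape and variable amplitude."""
    return [amplitude * math.exp(-0.5 * ((t - center) / sigma) ** 2) for t in times_ms]

times = list(range(0, 601))
novel = gaussian_response(times, amplitude=1.0)
repeated = gaussian_response(times, amplitude=0.6)  # same shape, attenuated

# FWHM: threshold scales with each curve's own peak.
fwhm_novel = duration_above(times, novel, 0.5 * max(novel))
fwhm_repeated = duration_above(times, repeated, 0.5 * max(repeated))

# Constant criterion: the same absolute threshold for both curves.
fixed_novel = duration_above(times, novel, 0.5)
fixed_repeated = duration_above(times, repeated, 0.5)
print(fwhm_novel, fwhm_repeated, fixed_novel, fixed_repeated)
```

In this idealized case the FWHM is identical for the two curves, whereas the constant-criterion duration shrinks with amplitude alone, which is the mixing of temporal and amplitude effects described above.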
It should be noted that the repetition interval in the present study is relatively short (150-350 msec) compared with those in most previous studies and, therefore, is closely related to the “neural adaptation” technique used in several recent studies (Grill-Spector et al., 1999; Kourtzi and Kanwisher, 2000, 2001). This short ISI also distinguishes our study from previous anatomically constrained MEG studies (Dhond et al., 2001; Marinkovic et al., 2003), which used a longer repetition interval consisting of several intervening items. Indeed, it has been suggested that the neural response attenuation induced by immediate stimulus repetition (<1 sec) is qualitatively different from that involved in long-lag repetition of more than several seconds. For example, neural adaptation with a short ISI may be induced by response attenuation within the same group of neurons, rather than by a decrease in the neural population encoding the repeated stimuli, as assumed in the sharpening theory (Henson, 2003). Although our study revealed the temporal profiles of the NA effect in visual ventral regions, the scale of the repetition lag should be taken into account when interpreting the present results.
There are two neural models that can account for the BOLD signal decrease in response to repeated compared with unrepeated stimuli: a reduction in mean firing rate (Wiggs and Martin, 1998) and a shortened duration of neural activity (Becker et al., 1997; Henson et al., 2002a). The present results strongly favor the integration of these two models. In addition to a decrease in neural activation for the repeated relative to the unrepeated stimulus, we found that NA also induced a temporal shift of peak latency (Fig. 6d). Furthermore, this peak latency showed a significant correlation with the RT data (Fig. 8b). These results support the shortened duration model, which attributes NA to a reduced settling time of the neural processing network (Becker et al., 1997). However, our results also showed that this shortening of settling time does not lead to a significant reduction in the temporal duration of the response curve itself, because the FWHM, an index of the temporal width of neural processing, was not significantly modulated by stimulus repetition. These results indicate that, although NA clearly involves a change in the neural response in the temporal domain, this change is primarily observed as a speeded response of shape perception areas, rather than as a temporal shortening of the neural response curves within these areas.
In conclusion, the present study demonstrated that the NA effect in the human visual ventral stream is characterized by both activation reduction and temporal acceleration, and that the decreased rise time in the ventral visual area is a main factor in the observed reduction in RT. Although there are some limitations on the interpretation of our study (e.g., frequency changes such as those in the gamma band, which may relate to the hemodynamic changes involved in stimulus repetition, were omitted from our trial-averaged analysis), the present results should have an impact on future neural computational theories and provide additional insight into the visual processing mechanism in the human brain.
Footnotes
This work was supported by grants from the Japan Society for the Promotion of Science for Young Scientists to Y.N. We thank Dr. T. Okusa for development of the computer software to present visual stimuli and his helpful comments on this manuscript and O. Nagata and Y. Takeshima for their technical support.
Correspondence should be addressed to Yasuki Noguchi, Department of Integrative Physiology, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki, Aichi 444-8585, Japan. E-mail: noguchi{at}nips.ac.jp.
Copyright © 2004 Society for Neuroscience 0270-6474/04/246283-08$15.00/0