When does the brain know that a decision is difficult to make? How does decision difficulty affect the allocation of neural resources and timing of constituent cortical processing? Here, we use single-trial analysis of electroencephalography (EEG) to identify neural correlates of decision difficulty and relate these to neural correlates of decision accuracy. Using a cued paradigm, we show that we can identify a component in the EEG that reflects the inherent task difficulty and not simply a correlation with the stimulus. We find that this decision difficulty component arises ≈220 ms after stimulus presentation, between two EEG components that are predictive of decision accuracy [an “early” (170 ms) and a “late” (≈300 ms) component]. We use these results to develop a timing diagram for perceptual decision making and relate the component activities to parameters of a diffusion model for decision making.
- task difficulty
- decision making
- electroencephalography (EEG)
- single-trial analysis
- diffusion model
- timing diagram
Perceptual decision making is a complex process engaging sensory processing, attention, evidence accumulation, and motor-response networks (Schall and Hanes, 1993; Kim and Shadlen, 1999; Shadlen and Newsome, 2001; Egner and Hirsch, 2005; Huk and Shadlen, 2005). Experiments on both primates and humans have focused on identifying the neural correlates of decision making and its underlying component processes. For instance, neural correlates of the time-dependent accumulation of stimulus evidence have been localized to the lateral intraparietal area with additional decision-making processing also identified in the frontal eye fields and the dorsal lateral prefrontal cortex. Frontoparietal networks have also been implicated, particularly with respect to attention and allocation of relevant resources (Posner and Peterson, 1990; Bisley and Goldberg, 2003; Romo et al., 2003; Egner and Hirsch, 2005). Although much of this work has been done in primates using single and multiunit recordings, recent work in humans using functional magnetic resonance imaging (fMRI) (Heekeren et al., 2004) has attempted to address some of the same questions. However, because of the time resolution of fMRI, little can be said about the relative timing of the cortical processes underlying decision making in humans.
Our previous work using single-trial analysis of the electroencephalography (EEG) focused on the issue of the timing of neural components predictive of psychophysical performance during decision making in humans. Specifically, we have shown that for a face-versus-car perceptual categorization task (see Fig. 1A), two stimulus-locked EEG components resulted in neurometric functions indistinguishable from their corresponding psychometric curves (Philiastides and Sajda, 2006), with the timing of these components being 170 ms (“early”) and ≥300 ms (“late”) relative to stimulus onset. Noteworthy was that although the early component showed little variability in its timing as a function of stimulus evidence (i.e., phase coherence of the image), the late component systematically shifted forward in time as the phase coherence was reduced and the task became more difficult. In addition, for most subjects, the late component was a better match to the corresponding psychometric curve and yielded larger (and more significant) choice probabilities, relative to the early component, suggesting this component was more directly linked to the decision made by the subject.
One interpretation of the shift in time of the late component is that during perceptual decision making, the brain must engage additional resources (e.g., attention networks) and/or prolong processing when a decision is hard (relative to when it is easy) and that this results in the delay of the timing of the late component. Here, we investigate the cortical activity related to decision difficulty and relate it to the two components predictive of psychophysical performance. Once again, we use EEG to uncover components covarying with decision difficulty, analyzing their timing and strength relative to the task. We use these results to construct a timing diagram for perceptual decision making and show that the late component can be interpreted as representing the mean drift rate in a diffusion-based decision model.
Materials and Methods
Thirteen subjects (six females; age range, 20–37 years) participated in the study. All had normal or corrected to normal vision and reported no history of neurological problems. Informed consent was obtained from all participants in accordance with the guidelines and approval of the Columbia University Institutional Review Board.
We used a set of 12 face (face database; Max Planck Institute for Biological Cybernetics, Tuebingen, Germany) and 12 car grayscale images (image size, 512 × 512 pixels; 8 bits/pixel). All images were equated for spatial frequency, luminance, and contrast. They all had identical magnitude spectra (average magnitude spectrum of all images in the database), and their corresponding phase spectra were manipulated using the weighted mean phase (Dakin, 2002) technique to generate a set of images characterized by their percentage of phase coherence. For the first experiment, we processed each image to have six different phase coherence values (20, 25, 30, 35, 40, and 45%) (Fig. 1B). In addition, for the second experiment, we colorized our images with subtle red and green tones. We performed this adjustment by manipulating the original images in the hue (H), saturation (S), and value (V) color space (red: H = 0.04, S = 0.17, V unchanged; green: H = 0.34, S = 0.23, V unchanged). A Dell (Round Rock, TX) Precision 530 Workstation with an nVidia (Santa Clara, CA) Quadro4 900XGL graphics card and E-Prime software (Psychological Software Tools, Pittsburgh, PA) controlled the stimulus display. A liquid crystal display projector (LP130; InFocus, Wilsonville, OR) was used to project the images through a radio frequency-shielded window onto a front projection screen. Each image subtended 33 × 22° of visual angle.
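The colorization step can be illustrated with a minimal sketch. It assumes grayscale intensities scaled to [0, 1] and uses the hue and saturation values quoted above, with the value channel taken from the grayscale pixel. The function names and the toy image are hypothetical, and this is not the stimulus-generation code used in the study.

```python
import colorsys

# Hue/saturation pairs quoted in the text; value (V) comes from the grayscale pixel.
TINTS = {"red": (0.04, 0.17), "green": (0.34, 0.23)}

def tint_pixel(gray, color):
    """Map a grayscale intensity in [0, 1] to an (r, g, b) triple with fixed H and S."""
    hue, sat = TINTS[color]
    return colorsys.hsv_to_rgb(hue, sat, gray)

def tint_image(gray_rows, color):
    """Apply tint_pixel to every pixel of an image stored as a list of rows."""
    return [[tint_pixel(g, color) for g in row] for row in gray_rows]

image = [[0.2, 0.5], [0.8, 1.0]]  # toy 2 x 2 grayscale image, not real stimulus data
red = tint_image(image, "red")
green = tint_image(image, "green")
```

Because V in HSV equals the largest of the three RGB channels, the brightest channel of every tinted pixel keeps the original grayscale intensity, which is one way to read "V unchanged".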
Subjects performed two versions of a simple categorization task. In the first experiment, they had to discriminate between grayscale images of faces and cars. Within a block of trials, face and car images over a range of phase coherences were presented in random order. The range of phase coherence levels was chosen to span psychophysical thresholds. All subjects performed nearly perfectly at the highest phase coherence but performed near chance for the lowest one. In the second experiment, colorized face and car trials of 30 and 45% phase coherence were presented in random order. In this version of the experiment, subjects were presented with a visual cue for 400 ms that was followed by a 600 ms delay before the next image presentation. Based on the cue, subjects had to either discriminate face versus car or the color of the image (i.e., red vs green). Subjects reported their decisions by pressing one of two mouse buttons, left for faces (and red) and right for cars (and green), using their right index and middle fingers, respectively. All images were presented for 30 ms, followed by an interstimulus interval (ISI) that was randomized in the range of 1500–2000 ms. Subjects were instructed to respond as soon as they formed a decision and before the next image was presented. In both experiments, a total of 50 trials per behavioral condition were presented (i.e., an overall of 600 trials for each experiment). Schematic representations of the two behavioral paradigms are given in Figure 1, A and C, respectively. Trials in which subjects failed to respond within the ISI were marked as no-choice trials and were discarded from additional analysis.
EEG data were acquired in an electrostatically shielded room (ETS-Lindgren, Glendale Heights, IL) using a Sensorium (Charlotte, VT) EPA-6 Electrophysiological Amplifier from 60 Ag/AgCl scalp electrodes and from three periocular electrodes placed below the left eye and at the left and right outer canthi. All channels were referenced to the left mastoid with input impedance of <15 kΩ, and the chin electrode was used as ground. Data were sampled at 1000 Hz with an analog pass band of 0.01–300 Hz using 12 dB/octave high-pass and eighth-order elliptic low-pass filters. Subsequently, a software-based 0.5 Hz high-pass filter was used to remove DC drifts, and 60 and 120 Hz (harmonic) notch filters were applied to minimize line-noise artifacts. These filters were designed to be linear phase to minimize delay distortions. Motor response and stimulus events recorded on separate channels were delayed to match latencies introduced by digitally filtering the EEG.
Movement artifact removal.
Before the main experiment, subjects completed an eye movement calibration experiment during which they were instructed to blink repeatedly on the appearance of a white-on-black fixation cross and then to make several horizontal and vertical saccades according to the position of the fixation cross subtending 1 × 1° of the visual field. Horizontal saccades subtended 33°, and vertical saccades subtended 22°. The timing of these visual cues was recorded with EEG. This enabled us to determine linear components associated with eye blinks and saccades (using principal component analysis) that were subsequently projected out of the EEG recorded during the main experiment (Parra et al., 2003). Trials with strong eye movements or other movement artifacts were manually removed by inspection.
We used a single-trial analysis of the EEG to discriminate between any two given experimental conditions (i.e., face vs car or easy vs hard). Logistic regression was used to find an optimal basis for discriminating between the two conditions over a specific temporal window (Parra et al., 2002, 2005). Specifically, we defined a training window starting at a poststimulus onset time τ, with a duration of δ, and used logistic regression to estimate a spatial weighting vector wτ,δ, which maximally discriminates between sensor array signals X for the two conditions as follows:
$$\mathbf{y} = \mathbf{w}_{\tau,\delta}^{\top} X \qquad (1)$$
in which X is an N × T matrix (N sensors and T time samples). The result is a “discriminating component” y, which is specific to activity correlated with one condition while minimizing activity correlated with both task conditions such as early visual processing. We use the term “component” instead of “source” to make it clear that this is a projection of all the activity correlated with the underlying source. For our experiments, the duration of the training window (δ) was 60 ms and the window onset time (τ) was varied across time. We used the reweighted least-squares algorithm to learn the optimal discriminating spatial weighting vector wτ,δ (Jordan and Jacobs, 1994).
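A minimal sketch of the windowed discriminator described above is given here on synthetic data. It makes several assumptions: plain gradient ascent on the logistic log-likelihood stands in for the reweighted least-squares solver, time is indexed in samples rather than milliseconds, and the three-sensor toy data and all function names are hypothetical.

```python
import math
import random

def window_samples(trials, labels, start, width):
    """Collect sensor vectors inside [start, start + width) from every trial,
    treating each time sample as an independent training example."""
    xs, ys = [], []
    for X, c in zip(trials, labels):
        for t in range(start, start + width):
            xs.append([row[t] for row in X])
            ys.append(c)
    return xs, ys

def fit_logistic(xs, ys, steps=200, lr=0.5):
    """Gradient ascent on the logistic log-likelihood (assumption: a simple
    stand-in for the reweighted least-squares algorithm named in the text)."""
    n = len(xs[0])
    w, b = [0.0] * n, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * n, 0.0
        for x, c in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(n):
                gw[i] += (c - p) * x[i]
            gb += c - p
        w = [wi + lr * g / len(xs) for wi, g in zip(w, gw)]
        b += lr * gb / len(xs)
    return w, b

def project(w, X):
    """Discriminating component y(t) = w^T X for one trial (X is sensors x time)."""
    n_t = len(X[0])
    return [sum(wi * row[t] for wi, row in zip(w, X)) for t in range(n_t)]

# Synthetic data: condition 1 adds a fixed scalp pattern during samples 8..13.
rng = random.Random(0)
pattern = [1.0, -0.5, 0.0]
trials, labels = [], []
for k in range(40):
    c = k % 2
    X = [[rng.gauss(0.0, 0.5) + (pattern[i] if c == 1 and 8 <= t < 14 else 0.0)
          for t in range(20)] for i in range(3)]
    trials.append(X)
    labels.append(c)

xs, ys = window_samples(trials, labels, start=8, width=6)
w, b = fit_logistic(xs, ys)
components = [project(w, X) for X in trials]  # weights applied across all time
```

Sweeping `start` across the epoch and refitting at each position would give one weight vector per window onset, in the spirit of varying τ with a fixed δ.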
The discrimination vector wτ,δ can be seen as the orientation (or direction) in the space of the EEG sensors that maximally discriminates between the two experimental conditions. Thus, the time dimension defines the time of a window (relative to either the stimulus or the response) used to compute this discrimination vector. Given a fixed window width (60 ms in this case), sweeping the training window from the onset of the visual stimulation to the earliest response time represents the evolution of the discrimination vector across time. Within a window, at a fixed time, all samples are treated as independent and identically distributed to train the discriminator. Once the discriminator is trained, it is applied across all time so as to visualize the projection of the trials onto that specific orientation in EEG sensor space. A discriminating component is defined as one such discrimination vector, with its activity visualized by projecting the data across all time onto that orientation. We call this visualization a discriminant component map. For instance, for recurring components, one would expect activity trained during one window time to also be present at another time.
To visualize the profile of these components (stimulus or response locked) across all trials, we constructed discriminant component maps. We aligned all trials of an experimental condition of interest to the onset of visual stimulation and sorted them by their corresponding reaction times (RTs). Therefore, each row of one such discriminant component map represents a single trial across time [i.e., yi(t)]. The discriminant component maps used in this study (see Fig. 7) represent face trials with the mean of the car trials subtracted (i.e., yfaces − ȳcar).
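The construction of a discriminant component map can be sketched as follows, assuming each trial's component y(t) is already available as a list of samples. The function name and the toy numbers are hypothetical.

```python
def component_map(y_faces, rts_faces, y_cars):
    """Rows are face trials sorted by reaction time, with the mean car
    component subtracted at every time sample."""
    n_t = len(y_cars[0])
    mean_car = [sum(y[t] for y in y_cars) / len(y_cars) for t in range(n_t)]
    order = sorted(range(len(y_faces)), key=lambda i: rts_faces[i])
    return [[y_faces[i][t] - mean_car[t] for t in range(n_t)] for i in order]

# Toy example: two face trials, two car trials, three time samples each.
y_faces = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
rts_faces = [700, 500]  # reaction times in ms
y_cars = [[0.0, 1.0, 0.0], [1.0, 1.0, 2.0]]
cmap = component_map(y_faces, rts_faces, y_cars)
```

The faster trial (500 ms) ends up in the first row, so a response-related component would appear as a band that follows the reaction-time ordering down the map.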
To provide a functional neuroanatomical interpretation of the resultant discriminating activity, and given the linearity of our model, we computed the electrical coupling coefficients for the linear model as follows:
$$\mathbf{a} = \frac{X\mathbf{y}^{\top}}{\mathbf{y}\mathbf{y}^{\top}} \qquad (2)$$
with y treated as a 1 × T row vector. Equation 2 describes the electrical coupling a of the discriminating component y that explains most of the activity X. To compute these coefficients, y is computed for only times during the specific window used to calculate the weights for that component. Strong coupling indicates low attenuation of the component and can be visualized as the intensity of the “sensor projections” a. The vector a can also be seen as a forward model of the discriminating component activity (Parra et al., 2002, 2005).
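As a small worked example of the coupling coefficients, the sketch below assumes that a is the least-squares projection of the sensor data X onto the component y, in the manner of the forward models of Parra et al. The function name and the toy values are hypothetical.

```python
def forward_model(X, y):
    """a = X y^T / (y y^T): least-squares coupling of the component y
    (length T) to each sensor row of X (N x T)."""
    yy = sum(v * v for v in y)
    return [sum(xv * yv for xv, yv in zip(row, y)) / yy for row in X]

# If every sensor is a scaled copy of y, the scales are recovered exactly.
y = [0.0, 1.0, 2.0, 1.0, 0.0]
a_true = [2.0, -1.0, 0.5]
X = [[a * v for v in y] for a in a_true]
a_hat = forward_model(X, y)
```

In this noise-free case `a_hat` matches `a_true`. With real data each entry is the regression slope of one sensor on the component, which is what gets plotted as a scalp map.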
We quantified the performance of the linear discriminator by the area under the receiver operating characteristic (ROC) curve, referred to as Az, with a leave-one-out approach (Duda et al., 2001). We used the ROC Az metric to characterize the discrimination performance while sliding our training window from stimulus onset to response time (varying τ). Finally, to assess the significance of the resultant discriminating component, we used a bootstrapping technique to compute an Az value leading to a significance level of p = 0.01. Specifically, we computed a significance level for Az by performing the leave-one-out test after randomizing the truth labels of our face and car trials. We repeated this randomization process 100 times to produce an Az randomization distribution and computed the Az leading to a significance level of p = 0.01.
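The Az metric and its randomization threshold can be sketched as below. The sketch assumes one discriminator output per trial and omits the leave-one-out retraining for brevity. Az is computed with the rank (Mann-Whitney) form of the ROC area, and the function names and toy scores are hypothetical.

```python
import random

def az(scores_pos, scores_neg):
    """Area under the ROC curve via the rank statistic; ties count one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(scores_pos) * len(scores_neg))

def az_threshold(scores, labels, n_perm=100, alpha=0.01, seed=0):
    """Az value reached by only about a fraction alpha of label-shuffled datasets."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        pos = [s for s, c in zip(scores, shuffled) if c == 1]
        neg = [s for s, c in zip(scores, shuffled) if c == 0]
        null.append(az(pos, neg))
    null.sort()
    k = max(1, round(alpha * n_perm))
    return null[n_perm - k]

scores = list(range(40))       # toy discriminator outputs, perfectly ordered
labels = [0] * 20 + [1] * 20   # 0 = car, 1 = face
observed = az(scores[20:], scores[:20])
threshold = az_threshold(scores, labels)
```

A component would be called significant at a given window when its observed Az exceeds `threshold`. With only 100 shuffles, the p = 0.01 cutoff rests on the most extreme shuffled value, so it is a coarse estimate.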
Traditional event-related potential (ERP) analysis was also performed by aligning the data to paradigm-specific events and averaging across trials as well as across subjects where appropriate. When ERP activity was used for additional analysis (e.g., ERP amplitude correlation with other experimental parameters), we averaged activity across short-length temporal windows (typically 40 ms in width) to make our estimates more robust. To visualize the spatial extent of the ERP activity across time, we computed average ERP scalp maps by interpolating the ERP activity across all electrode locations. We used a biharmonic spline interpolation (Sandwell, 1987) that is designed for irregularly spaced data points. All scalp maps were plotted using EEGLAB (Delorme and Makeig, 2004).
Discriminant component peak detection.
To quantify the spread/duration of a discriminant component, we performed single-trial peak detection by fitting a parametric function to the spatially integrated discriminating component y(t). For simplicity, we used a Gaussian profile [as in the study by Gerson et al. (2005)] that is parameterized by its height β, width σ, delay μ, and baseline offset α as follows:
$$\hat{y}(t) = \alpha + \beta \exp\!\left(-\frac{(t-\mu)^{2}}{2\sigma^{2}}\right) \qquad (3)$$
We computed the optimal parameters for each trial by using a nonlinear least-squares Gauss–Newton optimization (Gill et al., 1981). The center and width of the discriminating training window were used to initialize the optimization.
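The single-trial Gaussian fit can be illustrated with a simplified sketch. It assumes a profile of the form α + β exp(−(t − μ)²/(2σ²)) and, instead of Gauss–Newton, uses a grid search over μ and σ with a closed-form least-squares solution for α and β at each grid point. That substitution, the grid ranges, and the function names are assumptions made for illustration.

```python
import math

def gaussian(t, alpha, beta, mu, sigma):
    """Baseline alpha plus a bump of height beta, centre mu, and width sigma."""
    return alpha + beta * math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))

def fit_gaussian(times, y, mus, sigmas):
    """Grid search over (mu, sigma); alpha and beta follow from ordinary
    least squares because the model is linear in those two parameters."""
    n = len(times)
    y_mean = sum(y) / n
    sst = sum((yi - y_mean) ** 2 for yi in y)
    best = None
    for mu in mus:
        for sigma in sigmas:
            g = [math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2)) for t in times]
            g_mean = sum(g) / n
            sgg = sum((gi - g_mean) ** 2 for gi in g)
            if sgg == 0.0:
                continue  # flat basis function, slope undefined
            beta = sum((gi - g_mean) * (yi - y_mean) for gi, yi in zip(g, y)) / sgg
            alpha = y_mean - beta * g_mean
            sse = sum((yi - alpha - beta * gi) ** 2 for gi, yi in zip(g, y))
            if best is None or sse < best[0]:
                best = (sse, alpha, beta, mu, sigma)
    sse, alpha, beta, mu, sigma = best
    return {"alpha": alpha, "beta": beta, "mu": mu, "sigma": sigma,
            "r2": 1.0 - sse / sst}

# Noise-free toy trial: peak at 320 ms with a 40 ms width.
times = list(range(0, 600, 10))
y = [gaussian(t, 0.1, 2.0, 320, 40) for t in times]
fit = fit_gaussian(times, y, mus=range(250, 400, 10), sigmas=range(20, 80, 10))
```

The returned `r2` can serve as a goodness-of-fit value for screening out poorly fitted trials, and `sigma` is the quantity that measures the temporal spread of the component.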
Diffusion model simulations.
The diffusion model assumes that fast two-choice decisions are made by a noisy process that accumulates information over time from a starting point toward one of two response criteria or boundaries, as in Figure 2B. The starting point is labeled z, and the boundaries are labeled “a” and “0.” When one of the boundaries is reached, a response is initiated. The rate of accumulation of information is called drift rate v, and it is determined by the quality of the information available from the stimulus. The better the information quality, the larger the drift rate toward the appropriate decision boundary and the faster and more accurate the response. Within-trial variability in the accumulation of information results in processes with the same mean drift rate terminating at different times (producing RT distributions) and sometimes at different boundaries (producing errors). Speed–accuracy tradeoffs are modulated by the positions of the boundaries as follows: moving boundaries closer to the starting point speeds responses and decreases accuracy. Response time distributions in two-choice tasks are positively skewed, which occurs naturally in the model by simple geometry: the increase in RT is larger if a lower value of drift rate is decreased by some amount than if a larger value of drift rate is decreased by the same amount. Besides the decision process, there are nondecision components of processing such as encoding and response execution (Fig. 2A). These processes are combined in the model, and their contribution to RT has mean Ter (Ratcliff and Tuerlinckx, 2002).
In the diffusion model, components of processing are assumed to vary from trial to trial. Variability in drift rate across trials (normally distributed with SD η) gives rise to error responses that are relatively slow compared with correct responses, and variability in starting point across trials (uniformly distributed with range sz) gives rise to relatively fast errors. Whether errors are faster or slower than correct responses for an experimental condition depends on the relative amounts of drift rate and starting point variability, drift rate values, and boundary positions (Ratcliff and Rouder, 1998; Ratcliff et al., 1999). Across-trial variability in Ter is uniformly distributed with range st (Ratcliff and Tuerlinckx, 2002).
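The behavior of the model described above can be explored with a short simulation. The sketch assumes an Euler discretization with a 1 ms step and a within-trial noise scale of s = 0.1. All parameter values are arbitrary illustrations rather than fitted estimates, and the function names are hypothetical.

```python
import random

def simulate_trial(rng, v, a, z, ter, eta=0.0, sz=0.0, st=0.0, s=0.1, dt=0.001):
    """One diffusion trial; returns (hit_upper_boundary, reaction_time_seconds)."""
    drift = rng.gauss(v, eta)                      # across-trial drift variability (SD eta)
    x = z + sz * (rng.random() - 0.5)              # uniform starting point, range sz
    t = 0.0
    step_sd = s * dt ** 0.5
    while 0.0 < x < a:
        x += drift * dt + rng.gauss(0.0, step_sd)  # within-trial noise
        t += dt
    nondecision = ter + st * (rng.random() - 0.5)  # uniform Ter, range st
    return x >= a, t + nondecision

def simulate(v, n=400, seed=1, **kw):
    """Accuracy (upper boundary taken as correct) and mean RT over n trials."""
    rng = random.Random(seed)
    trials = [simulate_trial(rng, v, **kw) for _ in range(n)]
    accuracy = sum(hit for hit, _ in trials) / n
    mean_rt = sum(rt for _, rt in trials) / n
    return accuracy, mean_rt

params = dict(a=0.1, z=0.05, ter=0.3, eta=0.05, sz=0.02, st=0.1)
easy = simulate(0.3, **params)  # high drift rate, e.g. high phase coherence
hard = simulate(0.1, **params)  # low drift rate, e.g. low phase coherence
```

Lowering the drift rate while leaving every other parameter fixed reduces accuracy and lengthens mean RT, which is the qualitative pattern the model uses to link stimulus quality to behavior.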
The diffusion model serves to map performance characteristics onto underlying processes. From the probability of a correct response and the RT distributions for correct and error responses for each of the experimental conditions, the model extracts estimates of the quality of the stimulus information that enters the decision process for each condition (drift rate), the amount of information that must be accumulated before a decision can be made (boundary positions), the time taken by nondecision components of RT (Ter), and the amount of variability across trials in each of the processing components.
Here we fit the model to the data from the six subjects in the first experiment in which a range of experimental conditions (i.e., different phase coherence levels) were available. We used a χ² method (Ratcliff and Tuerlinckx, 2002) to perform the fits. We used the simulation results to relate different parameters of the diffusion model to our experimental observations.
To identify the components related to task difficulty and perceptual decision making, we measured the psychophysical performance of 13 subjects on two separate versions of a simple categorization task (Fig. 1) while simultaneously recording neuronal activity using a high-density EEG electrode array. We changed the stimulus evidence by manipulating the phase coherence (Dakin, 2002) of our images (Fig. 1B). During the first experiment (six subjects), face and car images were presented in random order over a range of phase coherences, and subjects were asked to report their decision regarding the type of image by pressing a button (Fig. 1A). All subjects performed nearly perfectly at the highest phase coherence but performed near chance at the lowest coherence.
Early and late face-selective components
We compared the EEG activity obtained for the two types of images at each phase coherence level on a single-trial basis, using a linear discriminator that integrates EEG over space rather than across trials (see Materials and Methods) and searched for components that maximally discriminated between the two experimental conditions. We used ROC analysis to quantify the discriminator’s performance. Furthermore, taking advantage of the linearity of our model, we computed sensor projections of the discriminating components activity as a means for interpreting the neuroanatomical significance of the resultant discriminating components.
In the interval between the onset of the visual stimulation and the earliest reaction time, we identified two maximally discriminating face-selective components. The early component was consistent with the well known N170 (Jeffreys, 1996; Halgren et al., 2000; Liu et al., 2000, 2002; VanRullen and Thorpe, 2001; Rossion et al., 2003), and its temporal onset appeared to be consistent across all subjects. The late component appeared at least 130 ms after the first, and its temporal onset varied across subjects and across coherences in the range between 300 and 450 ms from the stimulus onset (Philiastides and Sajda, 2006). To characterize the neuronal performance at these two times, in a manner compatible to the description of the psychophysical sensitivity as captured by the psychometric functions (Green and Swets, 1966), we constructed neurometric functions by plotting the area under the ROC curves (Az values) against the corresponding phase coherence levels (Britten et al., 1992). We showed that for all subjects in our dataset, these neurometric functions were statistically indistinguishable from their corresponding psychometric functions. In addition, the late face-selective component resulted in a better match to the psychophysical data than the early one (Philiastides and Sajda, 2006).
We also investigated the relationship between the temporal onset of the early and the late face-selective components and task difficulty. We found that for the early component (N170), there was no significant shift in time, whereas the optimal onset of the late component had a systematic forward shift in time as the task became progressively more difficult (Philiastides and Sajda, 2006). This observation raised the interesting possibility that there is yet another component that is more closely associated with task difficulty.
We computed average ERPs at each of six different phase coherence levels. Figure 3, A and B, illustrates the results for two sensors of interest (FCz and PO8, respectively). To construct these ERPs, we averaged across all subjects and across both face and car trials and found a component ≈220 ms after stimulus, the amplitude of which is inversely proportional to the strength of the stimulus as follows: the lower the coherence (harder task), the greater the amplitude of the D220 component (“D” for “difficulty”). In addition, we constructed average ERP scalp maps to visualize the spatial distribution of this component. To compute these maps, we considered ERP activity centered at the peak of the D220 component. Figure 3C demonstrates these results. We found that this component is distributed over a range of electrode locations as indicated by the increased activations at several centrofrontal (high negative activations) and occipitoparietal (high positive activations) sites in the lower coherences. It is also clear from Figure 3C that the magnitude of the effect deteriorates as the percentage of phase coherence increases (i.e., as the task becomes easier).
To determine whether this neural signature at 220 ms after stimulus is indeed associated with task difficulty, as captured by our subjects’ behavioral performance, we defined a metric we termed “difficulty index” (DI) as follows:
$$\mathrm{DI}_j = \frac{P_{\max} - P_j}{P_{\max} - P_{\min}} \qquad (4)$$
where Pmin and Pmax are the subject’s proportion correct at the lowest and highest phase coherence levels, respectively, and Pj is the subject’s behavioral performance at a given phase coherence level. Note that our subjects performed nearly perfectly at the highest coherence and near chance at the lowest one. We subsequently used our single-trial analysis techniques (Parra et al., 2002, 2005) to discriminate between the different difficulty levels. Unlike our previous use of the single-trial classifier (i.e., to discriminate between face and car trials), we now pool data across both face and car trials and discriminate between the highest and each progressively lower coherence set of trials.
During each comparison, we applied our classifier at several time intervals and each time computed an Az value as a metric for the classifier’s performance. We finally correlated these Az values with the corresponding DIs. We found a significant correlation (p < 0.01) at ≈220 ms after the onset of the stimulus, as can be seen in Figure 4A. The correlation between the Az and DIs that yielded the highest correlation coefficient is shown in Figure 4B. Note that although there was also a pronounced separation in the average ERPs at an earlier time representing the N170 component (Fig. 3A,B), the correlation analysis did not yield a significant observation at that time interval (Fig. 4B, inset).
Difficulty component predicts onset of late component
As mentioned previously, the temporal onset of our late face-selective component systematically shifts forward in time as the task becomes more difficult. We wanted to investigate the potential relationship of this temporal shift and the strength of the D220 component. Therefore, we performed an ERP amplitude correlation of this component with the onset of the late one. To emphasize the magnitude of the effect, we used data from several occipitoparietal and centrofrontal electrode sites for this analysis. We identified these sites based on the ERP activation maps in Figure 3C. Specifically, we averaged activity across these sensors around a 40 ms window centered at the peak of the D220 component on a subject-by-subject basis and for each different phase coherence level. We subsequently correlated this activity with the onset of each subject’s late component at these different coherences. Figure 5 summarizes the results of this analysis and demonstrates that the D220 component can, in fact, predict the temporal onset of the late face-selective component.
Top-down influence on difficulty component
To investigate the possibility that the D220 component represents a true top-down influence on decision making rather than a mere bottom-up processing of the stimulus, we designed a modified version of our original face-versus-car categorization task. We used the same set of images as in the original paradigm, with the only exception that we now colorized the images with subtle red and green tones. At any given trial, subjects were initially presented with one of two possible visual cues. The cues indicated whether subjects would perform the original face-versus-car categorization task or simply discriminate the color of the image (i.e., red vs green) that was presented shortly thereafter (Fig. 1C). There were seven new participants for this version of the experiment. The ultimate goal of this new design was to manipulate the task/decision difficulty while keeping the stimulus evidence unchanged (for instance, for the same low coherence stimulus, we could either make the task “hard” by cueing the subjects to perform the face-vs-car categorization task or “easy” by cueing them to perform the color discrimination task).
We used only two phase coherence levels for this study; a high one (45%), for which subjects performed nearly perfectly at the face-versus-car discrimination, and a low one (30%) that caused a significant reduction in behavioral performance. For each cued condition, we sorted the trials based on the strength of the stimulus evidence (i.e., low-vs-high-coherence trials) and constructed average ERPs (Fig. 6). In the face-versus-car cueing condition, we found, just like in the original experiment, a significant separation ≈220 ms after the onset of visual stimulation between hard (low-coherence) and easy (high-coherence) decisions (mean performances of 65 and 95% correct, respectively). These results can be seen in Figure 6A. In the red-versus-green cueing condition, for which subjects performed nearly perfectly for both the low- and high-coherence trials (mean performances of 96 and 96% correct, respectively), the effect of this component was eliminated (Fig. 6B).
To visualize the spatial extent of this effect, we devised a permutation test that allowed us to assess the significance of the difference between low- and high-coherence trials for the two tasks across all electrode locations. For each task, we permuted the trial labels so that the original relationship between low- and high-coherence trials was abolished. We then computed the difference in ERP amplitudes among the newly labeled trials around the D220 component. We repeated this test 5000 times to construct a difference ERP distribution. We finally checked whether the true ERP amplitude difference was outside of the 99% confidence interval of this distribution, in which case we concluded that the difference was statistically significant. For the face-versus-car discrimination task, significant effects can be seen in virtually all electrode locations (Fig. 6C). Interestingly, the effect was not nearly as pronounced when subjects were cued to discriminate color instead (Fig. 6D). In addition, we used this permutation test to check which electrode locations demonstrated a significant difference between the difference ERP amplitudes across the two tasks. Not surprisingly, a distributed network that includes several centrofrontal and occipitoparietal regions was revealed (Fig. 6E).
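The label-shuffling test described above can be sketched for a single electrode. The sketch assumes that each trial contributes one amplitude value, for example the mean voltage in a window around the D220 peak. The synthetic amplitudes and the function names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_interval(low, high, n_perm=5000, level=0.99, seed=0):
    """Shuffle trial labels n_perm times and return the central `level`
    interval of the resulting mean-amplitude differences (low - high)."""
    rng = random.Random(seed)
    pooled = list(low) + list(high)
    diffs = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diffs.append(mean(pooled[:len(low)]) - mean(pooled[len(low):]))
    diffs.sort()
    k = int(round((1.0 - level) / 2.0 * n_perm))
    return diffs[k], diffs[n_perm - 1 - k]

def is_significant(low, high, **kw):
    """True when the observed difference falls outside the shuffled interval."""
    lower, upper = permutation_interval(low, high, **kw)
    observed = mean(low) - mean(high)
    return observed < lower or observed > upper

# Synthetic amplitudes: low-coherence trials are shifted by one unit.
rng = random.Random(42)
low_coh = [1.0 + rng.gauss(0.0, 0.5) for _ in range(30)]
high_coh = [rng.gauss(0.0, 0.5) for _ in range(30)]
interval = permutation_interval(low_coh, high_coh)
significant = is_significant(low_coh, high_coh)
```

Running the same test at every electrode and marking the significant ones is one way to arrive at a map of the kind described for Figure 6, C and D.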
Decision-related late component
Using the color version of our behavioral paradigm, we came across another interesting finding that provides more evidence that our late component, which discriminates highly between face and car trials, is related to a postsensory/decision event rather than an early visual perception/detection event. When we used our single-trial discrimination analysis to classify between face and car images for trials in which subjects were cued to merely discriminate the color of an image, we found that the early face-selective component remained present, whereas the late component was significantly reduced. Figure 7 shows single-trial data for one subject at the 45% phase coherence level. The top row shows that the early and the late components are both present when the subject performed the face-versus-car discrimination task, in line with our original findings. The bottom row demonstrates the extinction of the late component when the discriminator is classifying between face and car trials while the subject is categorizing images based on their color.
We performed a group analysis to quantify the significance in the reduction of the late component. Our subjects’ performance was comparable at both the high coherence (45% phase coherence) face-versus-car and color discrimination tasks with 95 and 96% correct, respectively (two-tailed paired t test, p > 0.25). Their mean RTs on the two tasks were 667 and 669 ms, respectively, for which they were statistically indistinguishable (two-tailed, paired t test, p > 0.89). Our group analysis revealed that when we discriminated between face and car trials, the early component (N170) was present in both tasks. The mean Az values in each case were 0.771 and 0.773, and their corresponding distributions did not differ significantly from one another (two-tailed, paired t test, p > 0.94). For the late component, however, there was a significant reduction in the classifier’s performance during the color discrimination task (mean Az, 0.675) compared with the face-versus-car discrimination task (mean Az, 0.905) (left-tail paired t test, p < 0.0006). These findings are summarized in Figure 8.
Evidence changes the temporal spread of late component
To quantify the spread/duration of the late component, we performed single-trial peak detection by fitting a parametric function to the spatially integrated discriminating component ylate(t) (see Materials and Methods for details).
Figure 9A illustrates a single-trial fit (black) of the late discriminating component (gray). We applied this procedure to all trials across the three highest phase coherence levels (i.e., 35, 40, and 45%) to investigate the extent to which the duration of the late component varies with task difficulty. For every single trial, we used the SD of the Gaussian fit as a metric to quantify the spread of the component. We then computed the mean SD across all trials and all subjects for each of the three phase coherence conditions. For every fit, we computed an r² value to characterize the goodness-of-fit for ŷlate(t). We considered parameters only from fits with r² > 0.5. As can be seen in Figure 9B, there is a systematic increase in the duration of the late component as the percentage of phase coherence is reduced. We observed a significant increase in the mean SD between the 45 and 40% phase coherence conditions (σ45%, 40 ms; σ40%, 53 ms; paired t test, p < 10⁻¹²) as well as a significant increase between the 40 and 35% conditions (σ40%, 53 ms; σ35%, 62 ms; paired t test, p < 4 × 10⁻⁶).
Finally, we point out that there is also a corresponding increase in mean RT variance across all subjects as the percentage of phase coherence is reduced (i.e., as the task becomes more difficult). Figure 9B also illustrates this point.
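The single-trial fitting procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the fitting routine used in the study: it fits a Gaussian by a coarse grid search over the mean and SD, solves for the amplitude in closed form, and then applies the r² > 0.5 criterion before averaging the SD. All function names, the grid ranges, and the synthetic trial are hypothetical.

```python
import math

def gaussian(t, amp, mu, sigma):
    """Gaussian profile evaluated at time t (ms)."""
    return amp * math.exp(-0.5 * ((t - mu) / sigma) ** 2)

def fit_gaussian(times, y, mus, sigmas):
    """Least-squares Gaussian fit by grid search over (mu, sigma).

    For each candidate (mu, sigma), the best amplitude has the closed form
    sum(y * g) / sum(g * g), where g is the unit-amplitude Gaussian.
    """
    best = None
    for mu in mus:
        for sigma in sigmas:
            g = [gaussian(t, 1.0, mu, sigma) for t in times]
            gg = sum(v * v for v in g)
            if gg == 0.0:
                continue
            amp = sum(a * b for a, b in zip(y, g)) / gg
            sse = sum((a - amp * b) ** 2 for a, b in zip(y, g))
            if best is None or sse < best[0]:
                best = (sse, amp, mu, sigma)
    if best is None:
        raise ValueError("no valid candidate on the grid")
    sse, amp, mu, sigma = best
    mean_y = sum(y) / len(y)
    sst = sum((a - mean_y) ** 2 for a in y)
    r2 = 1.0 - sse / sst if sst > 0 else 0.0
    return {"amp": amp, "mu": mu, "sigma": sigma, "r2": r2}

def mean_spread(fits, r2_min=0.5):
    """Mean SD over fits whose r2 exceeds r2_min, or None if none qualify."""
    kept = [f["sigma"] for f in fits if f["r2"] > r2_min]
    return sum(kept) / len(kept) if kept else None

# Synthetic, noise-free "trial" (hypothetical): peak at 350 ms, SD of 40 ms.
times = list(range(0, 800, 10))
trial = [gaussian(t, 2.0, 350, 40) for t in times]
fit = fit_gaussian(times, trial, mus=range(200, 501, 10), sigmas=range(20, 101, 5))
```

A grid search keeps the sketch dependency-free; with real, noisy single-trial data a continuous nonlinear least-squares optimizer would normally replace it.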
Association between the late component and the diffusion model
Using only behavioral data (i.e., behavioral performance and RT distributions) from all subjects of our original face-versus-car categorization task, we estimated parameters for the diffusion model of decision making (Ratcliff and Rouder, 1998, 2000; Ratcliff and Tuerlinckx, 2002). Specifically, we estimated diffusion rates for five different phase coherence levels (25–45% in 5% increments). We also computed a χ² goodness-of-fit value (mean χ² = 40.9, with df = 45) which showed that the fits of the model did not significantly deviate from the data. The fits are shown in quantile probability plots in Figure 10A. In this plot, the quantiles of the RT distributions are plotted on the y-axis and the response proportion is plotted on the x-axis, with correct responses to the right and the corresponding error responses to the left.
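To make the role of the drift rate concrete, the following is a minimal simulation sketch of a two-boundary diffusion process with an unbiased starting point. It is not the fitting procedure used here, and it omits the across-trial variability parameters of the full model. The boundary separation, noise scale, drift values, and the nondecision time of 0.46 s are hypothetical illustration values, as are all function names.

```python
import math
import random

def p_upper(drift, boundary, noise=0.1):
    """Analytic probability of reaching the upper (correct) boundary first,
    for a process that starts midway between the two boundaries."""
    return 1.0 / (1.0 + math.exp(-drift * boundary / noise ** 2))

def simulate_trial(drift, boundary, ter, rng, noise=0.1, dt=0.001):
    """Euler simulation of one trial; returns (correct, rt_in_seconds)."""
    x = boundary / 2.0          # unbiased starting point
    t = 0.0
    step_sd = noise * math.sqrt(dt)
    while 0.0 < x < boundary:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return x >= boundary, ter + t

def simulate(drift, n_trials, boundary=0.12, ter=0.46, seed=0):
    """Return (accuracy, mean RT in seconds) over n_trials simulated trials."""
    rng = random.Random(seed)
    trials = [simulate_trial(drift, boundary, ter, rng) for _ in range(n_trials)]
    accuracy = sum(1 for correct, _ in trials if correct) / n_trials
    mean_rt = sum(rt for _, rt in trials) / n_trials
    return accuracy, mean_rt

# Hypothetical drift rates standing in for an easy and a hard condition.
easy = simulate(drift=0.30, n_trials=300)
hard = simulate(drift=0.05, n_trials=300)
```

In this sketch a larger drift rate yields higher accuracy, which is the qualitative relationship that lets drift rates estimated from behavior be compared across coherence levels.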
To provide evidence that our late component could, in fact, represent the mean drift rate in a diffusion model, we correlated the strength of this component (i.e., Az value), for each subject and at each coherence level, with the estimated accumulation rates from the diffusion model simulations. Figure 10B shows that there is a significant correlation (r = 0.8575; p < 1.0 × 10⁻⁸) between these two measures. This finding is especially interesting considering that the first measure (Az) is derived using only neural responses and the second one (mean drift rate) using only behavioral responses.
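The correlation between the two measures can be computed as in the following sketch, which implements the Pearson correlation coefficient directly. The paired values are hypothetical placeholders for per-subject, per-coherence drift rates and Az values; they are not the data shown in Figure 10B.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (drift rate, Az) pairs, one per subject and coherence level.
drift_rates = [0.05, 0.10, 0.18, 0.25, 0.33]
az_values = [0.58, 0.66, 0.75, 0.84, 0.90]
r = pearson_r(drift_rates, az_values)
```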
Finally, we looked at the total nondecision time (Ter) to estimate the relative position of the late component with respect to the actual decision-making process. Note that the times for processes such as stimulus encoding, response output, memory access, and so on are all combined in Ter. For our subjects, the mean Ter was estimated to be ≈460 ms. Allowing 100 ms for the motor response output, which follows the decision-making process in our paradigm, leaves a total processing time of ≈360 ms for any early stimulus encoding/sensory event. Note that the peak of the activity of our late component occurs, on average, ≈350 ms after the onset of the stimulus at the highest phase coherence (i.e., easiest condition), and its latency increases with task difficulty [Philiastides and Sajda (2006), their Fig. 5A]. This finding places essentially all of our detected late components directly after early visual perception and at the onset of the actual decision-making process. We interpret this result as additional evidence that our late component is unlikely to represent purely early perceptual/sensory events.
Figure 11 summarizes our findings in the form of timing diagrams. Discriminating between face and car trials, we initially identified an early face-selective component (N170), the strength of which is proportional to the strength (percentage of coherence) of the stimulus. This component is present, in equal strength, for both a face-versus-car categorization task and a color discrimination task (Fig. 11A,C). Its presence in the color discrimination task, although our subjects were not explicitly discriminating the type of image, seems to indicate that this component represents an early perceptual event and is not directly linked to the actual decision. The late component, on the other hand, is clearly more closely linked to the decision-making process, as evidenced by it being a better predictor of behavioral accuracy (at any given coherence level, its strength, Az value, is higher than that of the early one) (Fig. 11A,C), the delay in its onset time when a decision is difficult (Fig. 11B), and its suppression when the decision is unrelated to a face-versus-car discrimination (Fig. 11C,D).
In addition, we found a difficulty component that is situated between the early and the late components. Specifically, we found a component ≈220 ms after the onset of visual stimulation, the strength of which is inversely proportional to the stimulus evidence (Fig. 11A,C). Moreover, the amplitude of this component is highly correlated with the difficulty of the task, as captured by the behavioral performance of our subjects, and it also predicts the onset of the late face-selective component. As such, this component appears more closely linked to the late rather than the early component. We also showed that this difficulty component (D220) is likely to represent a true top-down influence on decision making, rather than a mere bottom-up processing of the stimulus, by virtue of the fact that it disappears during an easy color (red-vs-green) discrimination (Fig. 11C,D). We speculate that the difficulty component could be implicated in the recruitment of the relevant attentional and other neuronal resources required to make a difficult decision. For instance, an EEG component with similar latency (N2pc) was shown to capture the moment-by-moment rapid shift of attention in a demanding visual search task (Woodman and Luck, 1999). The N2pc component is also implicated in covert orientation of visual attention during object recognition (Luck and Hillyard, 1994) and was shown to resemble the attention-related modulations of activity seen in monkeys (Luck et al., 1997).
All three components are stimulus locked. Thus, within the context of a neural integrator (Shadlen and Newsome, 2001; Mazurek et al., 2003; Huk and Shadlen, 2005) or diffusion model (Ratcliff, 1978; Luce, 1986; Smith and Ratcliff, 2004; Palmer et al., 2005), none of the components, including the late one, is likely to represent the process of evidence accumulation to threshold, which can account for both accuracy and RT variability. Because the late component is stimulus locked and does not persist until the response, it does not predict the trial-by-trial RT distribution within a coherence level.
Additional analysis, however, allows us to relate the late component to other parameters of an integrator or diffusion model. For example, the late component is highly predictive of behavioral accuracy (i.e., a good match between neurometric and psychometric functions) and shifts in time as a function of difficulty. This time shift correlates with the shift of the mean RT at a given coherence level. In addition, the duration of the late component gradually increases as coherence is reduced (Fig. 11B) and is strongly correlated with RT variance. Finally, there is a significant correlation between the drift rates computed from the diffusion model simulations and the strength (Az values) of the late component. These findings are all consistent with the late component being analogous to the mean drift rate in a diffusion model (Ratcliff, 1978; Palmer et al., 2005), in much the same way that noisy, but sustained, activity in the middle temporal area would drive the integration process in neural integration models (Mazurek et al., 2003) (although, in that case, the mean drift rate would typically be thought of as a signed quantity). In other words, the late component represents the postsensory evidence that is fed into the diffusion process that ultimately determines the decision. Last, the fact that we find no evidence of a component that discriminates a red-versus-green decision suggests that the late component activity is not indicative of a general purpose decision-making process (Heekeren et al., 2004) and is more likely part of a face-selective network that drives such a process.
Our choice of stimuli, in particular faces, was made because faces are known to activate strong neural generators measurable via EEG/magnetoencephalography (Botzel et al., 1995; Bentin et al., 1996; Jeffreys, 1996; Liu et al., 2000, 2002; Rossion et al., 2003). One question is whether our timing diagram generalizes for other stimulus classes. Unpublished work by our group investigates this question using motion coherence stimuli to find neural correlates of decision making in EEG. However, we found that the neural generators are not strong enough to enable construction of neurometric functions on more than one or two relative motion coherences that are well above perceptual threshold.
The issue of localization is one that we have not addressed. A number of single-unit and fMRI studies have attempted to localize neural activity related to perceptual decision making (Kim and Shadlen, 1999; Platt and Peterson, 1999; Shadlen and Newsome, 2001; Heekeren et al., 2004; Ridderinkhof et al., 2004), with several studies specifically addressing the issue of decision difficulty and uncertainty (Critchley et al., 2001; Paulus et al., 2002; Volz et al., 2003, 2004; Huettel et al., 2005; Grinband et al., 2006). These studies have found a variety of areas modulated by decision difficulty, including areas in the frontal and parietal cortices, the thalamus, as well as the striatum. Our work complements these previous findings by focusing on the timing of the underlying neural processes. For example, our results show a change in the timing of a late component that is predictive of behavioral accuracy, suggesting that for difficult decisions, not only are different brain areas engaged or modulated, but also additional latency is introduced. This latency may represent the need for, or allocation of, additional processing time required to evaluate the evidence. Consistent with this last hypothesis is the presence of a difficulty component in our data that appears after the initial evaluation of the evidence (early component) and before the late face-selective component that is mostly linked to a postsensory/decision event.
Intriguing in our results is that the scalp plots for the early and late components are similar in topology, except for sign (Philiastides and Sajda, 2006). One might consider using inverse methods, such as dipole fitting, to localize the component activities. However, given the ill-posed nature of the inverse problem in EEG, we are instead focusing on simultaneous EEG and fMRI to localize the component activities in our timing diagram.
In summary, we show that with single-trial analysis of EEG, we can construct a timing diagram that begins to separate perceptual processing from decision-making processing in human subjects. Our results suggest that cortical networks selective to task difficulty could play an integral role in perceptual decision making by dynamically allocating resources, such as additional processing time, for difficult decisions.
This work was supported by grants from the Office of Naval Research (N00014-01-1-0625), National Geospatial-Intelligence Agency (HM1582-05-C-0008), and the National Institutes of Health (EB004730). We thank Robin Goldman and Lucas Parra for useful comments on a previous version of this manuscript. We declare that we have no conflicts of interest, financial or otherwise.
- Correspondence should be addressed to Dr. Paul Sajda at the above address.