Abstract
The primary visual cortex (V1) changes its computation according to the perceptual task being performed. We propose that this cognitive modulation results from gating of V1 intrinsic connections. To test this idea, using behavioral paradigms that engage top-down modulation of V1 contextual interactions, we recorded from chronically implanted electrode arrays in macaques. We observed task-dependent changes in interactions between V1 sites measured both by correlation between spike trains and by coherence between local field potentials (LFP-LFP coherence). The direction of the changes in aggregate activity, as measured by LFPs, depended on perceptual strategy: perceptual grouping increased LFP coherence between sites crucial for the task, whereas perceptual segregation lowered the LFP coherence. Using spiking activity as a measure, we found that the behaviorally driven changes in correlation structure between neurons dramatically increased the stimulus-related information that they convey; this additional increase in encoded information at the level of neuronal ensembles equals that obtained from task-driven reconfigurations of neural tuning curves. The improvements in information encoding were strongest for stimuli with greatest discrimination difficulty.
Introduction
The classical bottom-up model of vision portrays primary visual cortex (V1) as a set of filters that extracts low-level visual features. More recent findings expand this traditional view: V1 neurons integrate information over a large visual area, and their responses to local features depend on contextual influences among scene components, including contours and surfaces (Blakemore and Tobin, 1972; Allman et al., 1985; Nelson and Frost, 1985; Gilbert and Wiesel, 1990; Knierim and VanEssen, 1992; DeAngelis et al., 1995; Kapadia et al., 1995; Zipser et al., 1996; Roelfsema et al., 1998; Kapadia et al., 1999; Li et al., 2000; Angelucci et al., 2002; Hegdé and Felleman, 2003; Li et al., 2004; Li et al., 2006; Zhang and von der Heydt, 2010). For example, if an oriented line within a V1 neuron's receptive field (RF) is flanked by a collinear line, the cell's response is facilitated, while a parallel flank inhibits the cell's response (Kapadia et al., 1995). Such contextual influences have been implicated in perceptual processes, such as contour integration and surface segmentation (Lamme, 1995; Gilbert et al., 1996; Zipser et al., 1996; Roelfsema et al., 1998; Ito and Gilbert, 1999; Gilbert et al., 2000; Paradiso, 2002; Li et al., 2004; Roelfsema et al., 2004; Li et al., 2006). In addition to sensory context, V1 neurons are also influenced by different forms of behavioral context, such as spatial, object-based, and feature-based attention, as well as task-dependent and anticipatory effects (Motter, 1993; Ito and Gilbert, 1999; Crist et al., 2001; Li et al., 2004, 2006; Roelfsema et al., 2004; McManus et al., 2011). Importantly, in V1, the strongest top-down influences are not seen on neuronal responses to simple stimuli, such as a single oriented line segment, but on responses to more realistic, complex stimuli, whereby neural activity and visual perception are shaped by contextual interactions (Gilbert et al., 2000).
As a consequence, V1 can be thought of as an adaptive processor, influenced by both sensory and behavioral context. At any given instant, the confluence of bottom-up inputs of sensory features and top-down influences of behavioral states defines its function (for review, see Gilbert and Sigman, 2007).
Although it is becoming increasingly evident that V1 is subject to task-specific modulation, it remains unclear how the top-down signal mediates the interactions between sensory and behavioral context. One possible mechanism could involve changes in functional connectivity between V1 neurons, caused by top-down influences. That is, depending on the current task, the effective connectivity within the cortical network may be dynamically reset by behavioral context, thereby enriching the task-relevant information carried by neural responses. Alternatively, top-down control may operate by modulating the responses of individual neurons that encode the scene components. To investigate these possible top-down mechanisms in V1, we chronically implanted multielectrode arrays in behaving monkeys and engaged animals in behavioral paradigms that were known to invoke task-specific control of contextual influences in V1 neurons (Li et al., 2004, 2006).
Materials and Methods
Animal preparation and electrophysiology
Data were obtained from two adult male rhesus monkeys (Macaca mulatta; Monkey a and Monkey b). The animals were implanted with head posts and trained in several tasks for 3–4 months (see Stimuli and task design). After training, two 6 × 8 multielectrode arrays (Blackrock Microsystems) were implanted in the V1 opercular surface. The electrodes were 500- to 600-μm-long with 400 μm interelectrode spacing, and the two arrays were connected to a percutaneous connector that allowed electrophysiological recordings. The two implanted subarrays were adjacent to each other, and based on the receptive field map, the closest electrodes between them were ∼0.2° apart in Monkey a and 0.35° apart in Monkey b. Spike and local field potential (LFP) signals from orientation-selective cells in the V1 superficial layers were collected using a real-time multielectrode data acquisition system (MAP system, Plexon). All procedures were conducted in compliance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and under approval of the Institutional Animal Care and Use Committee at Rockefeller University.
Stimuli and task design
Stimuli were generated by a visual stimulus generator (VSG2/5, Cambridge Research Systems) on a CRT monitor (NANAO FlexScan F2–21) at a resolution of 1024 by 769 pixels and a refresh rate of 105 Hz. The viewing distance was 78 cm.
Five bar discrimination tasks.
One of the two behavioral paradigms used in this study was a dual discrimination task on a 5 bar stimulus; the stimulus and behavioral protocol were as described by Li et al. (2004). The animals performed two discrimination tasks, bisection and vernier, on the same 5 bar stimulus: one fixed central bar, flanked by two parallel and two collinear bars (Fig. 1A). The five oriented bars (0.4° × 0.08°) were displayed on a gray background (6.25 cd/m²), with Michelson contrast ranging from 15% to 60%. For a given recording session, the central stimulus bar was fixed at the RF center of one chosen neuron, and all the bars in the stimulus array were oriented at the preferred orientation of this cell. An example arrangement of stimulus components in relation to the RF centers of neurons recorded from one of our arrays is shown in Figure 1B. In the bisection task, the animals discriminated the relative distance between the parallel bars (Fig. 1A). The parallel bars were positioned at 0.2–0.25° from the central bar, and in different trials either of the two parallel flanks was randomly displaced in varying steps of 0.1–0.13°; the animals reported which flank was nearer to the fixed central bar. The vernier task involved discriminating the offset of collinear bars (Fig. 1A). The two collinear flanks were placed on either side of the central bar at an end-to-end distance (between the central bar and the collinear flanks) of 0.3°. In different trials, both collinear flanks were randomly shifted in a set of 0.1–0.13° lateral steps. The animals determined to which side of the central bar the flanks were offset. Each task (bisection or vernier) was performed in a continuous block of randomized trials. Within a single experiment, we repeatedly switched the monkey's perceptual task by interleaving a block of trials on one task (e.g., the bisection task) with a block of trials on the other task (e.g., the vernier task).
Each stimulus configuration, for a given task, was repeated 20–25 times. Monkeys initiated a trial by pulling on a lever and fixating on a ∼0.1° fixation point (FP) displayed at the monitor center. We used an infrared eye tracking system to ensure that monkeys maintained their fixation within 0.5° of the FP. Our eye tracking system was sensitive enough to detect saccades as small as 0.05° in either the horizontal or the vertical direction (Li et al., 2006). At 196 ms after the fixation onset, the stimulus was presented for 496 ms, followed by the presentation of two 0.15° saccade targets. The animals reported their choice by making a saccade to one of the two targets. The psychometric curves allowed us to determine that the animals actually performed the cued task and were not influenced by the task-irrelevant flank stimuli (Fig. 1C). The animals were cued to the task to be performed by color. To ensure that the observed task-dependent effects were not the result of the changes in stimulus color, we also collected data during control experiments designed to remove the influence of perceptual task on the recorded neural responses. In these experiments, we displayed the same 5 bar stimuli over the recorded RF locations but had the monkeys perform a different task on a separate stimulus in the opposite hemifield (Fig. 1D, control task). For one of the two monkeys, this consisted of a 3 line discrimination (bisection/vernier) task, and in the second monkey this involved a brightness discrimination task. Eye-tracking analysis showed that, for both monkeys, there were no significant differences in eye position under the different task conditions (p > 0.05 for bisection task vs vernier task and discrimination task vs control task; Wilcoxon signed-rank test, Matlab, The Mathworks).
Contour detection task.
The second behavioral paradigm was a contour detection task, where the animals were trained to detect a contour, consisting of 1–9 collinear lines, embedded in one of two complex backgrounds (i.e., stimulus patches) of randomly oriented lines (see Fig. 7A). The stimulus parameters and experimental design have been described previously (Li et al., 2006). The stimulus patches consisted of 0.2° × 0.05° bars displayed on a gray background. Different stimulus conditions (1, 3, 5, 7, or 9 bar contours) were randomized and repeated 30–40 times in a recording session. Each trial began when the monkeys pulled a lever, followed by the display of a ∼0.1° FP at the screen center. At 333 ms after fixation, two stimulus patches were displayed in opposite hemifields for 596 ms, followed by two corresponding saccade targets. The animal indicated which patch contained a contour by making a saccade to one of the targets. To study contour-related effects in the absence of attention, we also collected data when the monkeys performed a visual task (3 line bisection or a brightness discrimination task) in the hemifield opposite to recorded neuronal RFs (Fig. 7B, “attend-away” task). Again, we analyzed the animals' eye positions under the detection and control task to ensure that there were no differences in eye positions for the two tasks (p > 0.05 for detection task vs attend-away task; Wilcoxon signed-rank test, Matlab, The Mathworks).
Data analysis
Five bar contextual tuning curves.
V1 neuronal responses to various positions of parallel or collinear flanks (stimulus context) were compared under two task conditions: tasks that were relevant and irrelevant to the contextual stimulus (for the bisection task the relevant stimuli were the parallel flanks and the irrelevant stimuli were the collinear flanks; the relationship was reversed for the vernier task). To generate a parallel flank position tuning curve, at each position of the parallel flanks we pooled and averaged the five stimulus conditions with the same parallel flank position but different collinear flank positions (e.g., Fig. 2A). Similarly, responses from stimulus conditions with same collinear flank positions but different parallel flank positions were averaged to generate a collinear flank position tuning curve (e.g., Fig. 2D). For a given cell, these tuning curves were calculated for the two tasks, thus allowing us to study how a single neuron encodes the same stimulus information under different task conditions.
Mutual information.
For the 5 bar experiments, we used mutual information to quantify the amount of information neural responses conveyed about a stimulus attribute. This measure indicates to what extent an ideal observer could categorize stimulus information given the spike count of a cell during one trial. Given the probability of presenting a stimulus (p(sj)), the probability of observing a spike count (p(ri)) and the conditional probability of observing a spike count for a specific stimulus (p(ri | sj)), mutual information was calculated as follows:

$$ I(R;S) = \sum_{j} \sum_{i} p(s_j)\, p(r_i \mid s_j)\, \log_2 \frac{p(r_i \mid s_j)}{p(r_i)} $$

To calculate the probabilities p(ri) and p(ri | sj) that involved spike counts, we binned the spike counts at one SD of all the spike counts for a given task and cell, rounded to the nearest integer. The mutual information present in the LFP responses was calculated similarly: instead of spike count, LFP power in frequencies 10–120 Hz from 100–500 ms after stimulus presentation was used. LFP power at a given frequency was estimated using the fast Fourier transform (Matlab, The Mathworks).
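The mutual information estimate described above can be sketched in Python as follows. This is a minimal plug-in estimator and not the authors' Matlab code: the function name `mutual_information` is hypothetical, the base-2 logarithm (bits) is an assumption, and the "one SD" binning rule is interpreted here as a bin width equal to the SD of all spike counts rounded to the nearest integer.

```python
import numpy as np

def mutual_information(counts, stimuli):
    """Plug-in estimate of I(R;S), in bits, from per-trial spike counts
    and the stimulus label of each trial."""
    counts = np.asarray(counts, dtype=float)
    stimuli = np.asarray(stimuli)
    # Assumed binning rule: bin width = SD of all counts, rounded (at least 1).
    width = max(int(round(float(counts.std()))), 1)
    bins = np.floor(counts / width).astype(int)
    # Map response bins and stimuli to consecutive indices.
    r_vals, r_idx = np.unique(bins, return_inverse=True)
    s_vals, s_idx = np.unique(stimuli, return_inverse=True)
    # Joint probability table p(r_i, s_j) = p(s_j) p(r_i | s_j).
    joint = np.zeros((r_vals.size, s_vals.size))
    np.add.at(joint, (r_idx, s_idx), 1.0)
    joint /= joint.sum()
    p_r = joint.sum(axis=1, keepdims=True)   # p(r_i)
    p_s = joint.sum(axis=0, keepdims=True)   # p(s_j)
    nz = joint > 0                           # skip empty cells (0 log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (p_r @ p_s)[nz])))
```

With two equiprobable stimuli that evoke non-overlapping spike counts, the estimate is 1 bit; when counts are independent of the stimulus, it is 0. Plug-in estimates of this kind are biased upward for small trial numbers, which is one reason to compare them against shuffled data.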
Contour tuning curves.
The mean responses of cells with RFs lying along the contour stimuli were used to calculate contour-dependent responses in V1. Spike counts within 100–600 ms after stimulus presentation were used to calculate average firing rates because the initial neuronal responses do not contain information about the embedded contour (Li et al., 2006). The mean response of a cell to varying contour lengths was normalized to the cell's average response to the background pattern (i.e., a 1 bar “contour”), and then averaged over all the recorded cells to get the population responses. Similarly, LFP power within the 10–120 Hz frequency band (we observed contour-related power in this entire frequency range), within 150–600 ms after stimulus presentation, was used to obtain contour-related tuning curves.
Spiking cross-correlations.
We estimated the effective connectivity between spiking neurons using cross-correlation analysis, which provides a measure of synchronous activity between neurons. Raw cross-correlograms were obtained from the joint peristimulus histogram, with 5 ms resolution, of the spike trains of a cell pair (Aertsen et al., 1989). We corrected for the stimulus-induced synchronous activity by estimating a modified shift-predictor as follows:
1. For each neuron of a cell pair, for each trial, we simulated a spike train from an inhomogeneous Poisson process (i.e., a Poisson process whose mean rate varies as a function of time, to match the peristimulus histogram of each neuron). The simulated spikes exactly matched both the observed spike count at each trial and the shape of the mean peristimulus histogram for each neuron. Only the timing of individual spikes in individual trials differed between the observed and simulated spike trains. Because the Poisson process used to simulate the spikes for one neuron was independent of the Poisson process used to simulate the spikes for the second neuron of the pair, these simulations yield the number of coincident spikes expected under the null hypothesis of no neuronal temporal correlation.
2. We then calculated cross-correlograms from the simulated spike trains. Even if the precise spike timing of two neurons is independent, the two cells will still exhibit a basal level of correlation in the correlogram, caused by the similarity of the neurons' peristimulus histograms and any covariation in their firing rates. The cross-correlograms computed from the simulated spike trains reflect exactly this basal component of the correlogram, expected from independent neurons whose individual firing statistics match those of the real neurons recorded.
3. We repeated steps 1 and 2 1000 times and averaged the resultant 1000 correlograms to obtain the shift-predictor.
After subtracting a shift-predictor from the raw cross-correlogram, we normalized the correlogram by the geometric mean of the auto-correlograms of the cells under study. All the correlograms presented in the paper are such normalized cross-correlograms (NCCGs). Because of the application of our shift predictor, the correlograms reflect only very precise spike timing correlations; they ignore spike timing coincidences that occur at large time lags and that constitute a component of neuronal noise correlations.
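The modified shift-predictor procedure can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: spike trains are assumed to be already binned (trials × time bins, e.g., 5 ms bins), every neuron is assumed to fire at least once, the simulated trains match the peristimulus histogram in expectation rather than exactly, and the normalization uses the geometric mean of the zero-lag auto-correlogram values, which is only one possible reading of the normalization described above. All function names are hypothetical.

```python
import numpy as np

def correlogram(a, b, max_lag):
    """Trial-summed coincidence counts of binned trains a and b
    (trials x bins) for lags -max_lag..+max_lag (in bins)."""
    n_bins = a.shape[1]
    out = np.empty(2 * max_lag + 1)
    for k, lag in enumerate(range(-max_lag, max_lag + 1)):
        if lag >= 0:
            out[k] = np.sum(a[:, :n_bins - lag] * b[:, lag:])
        else:
            out[k] = np.sum(a[:, -lag:] * b[:, :n_bins + lag])
    return out

def simulate(trains, rng):
    """Redistribute each trial's spikes according to the PSTH shape.
    Conditioning an inhomogeneous Poisson process on the trial's spike
    count is equivalent to a multinomial draw over time bins."""
    psth = trains.sum(axis=0).astype(float)
    p = psth / psth.sum()
    sim = np.zeros_like(trains)
    for t, n in enumerate(trains.sum(axis=1)):
        sim[t] = rng.multinomial(int(n), p)
    return sim

def shift_corrected_ccg(a, b, max_lag=3, n_sim=1000, seed=0):
    """Raw correlogram minus the simulated predictor, normalized
    (assumed form) by the geometric mean of zero-lag autocorrelations."""
    rng = np.random.default_rng(seed)
    raw = correlogram(a, b, max_lag)
    # Independent simulations for the two neurons give the null correlogram.
    sims = [correlogram(simulate(a, rng), simulate(b, rng), max_lag)
            for _ in range(n_sim)]
    predictor = np.mean(sims, axis=0)
    norm = np.sqrt(correlogram(a, a, 0)[0] * correlogram(b, b, 0)[0])
    return (raw - predictor) / norm
```

With 5 ms bins, `max_lag=3` covers ±15 ms, so summing the returned values would give a quantity analogous to the area under the correlogram peak.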
The effective connectivity between a neuron pair was measured by estimating the area under the normalized cross-correlogram peaks (±15 ms for all experiments and task conditions). To test for significance of an observed correlation, we used the 1000 correlograms obtained from the simulations mentioned above: the p value was calculated as the proportion of simulated correlograms with correlation magnitude greater than or equal to the observed correlation (one-sided test). We used a permutation test to determine whether observed correlations between a cell pair were significantly different under different task conditions. The permutation test was performed as follows: for the cell pair under consideration, the trials from the two tasks were pooled into one set and then were randomly reassigned into two subsets; the two randomized subsets matched the original datasets in their trial count. NCCGs were then computed from these two subsets, and the differences in their correlation magnitudes were calculated. The random permutation and estimation of correlation magnitude difference were done 1000 times, and the p value was reported as the probability that the difference in the correlation magnitudes from the permuted dataset was at least as large as the one observed from the original dataset.
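The permutation test on task labels can be illustrated with the following sketch. It is written around a generic statistic so that any correlation-magnitude function could be plugged in; `zero_lag_coincidence` is a hypothetical, simplified stand-in for the NCCG peak area, the trial layout (trials × 2 neurons × time bins) is an assumption, and the comparison is made two-sided here, which the text does not specify.

```python
import numpy as np

def zero_lag_coincidence(trials):
    """Hypothetical stand-in for correlation magnitude: mean zero-lag
    coincidence of the two neurons. trials: (n_trials, 2, n_bins)."""
    return float(np.mean(trials[:, 0] * trials[:, 1]))

def permutation_p_value(stat, trials_1, trials_2, n_perm=1000, seed=0):
    """Pool the trials of two tasks, reassign them at random to two
    subsets of the original sizes, and return the fraction of
    permutations whose absolute difference in `stat` is at least as
    large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = abs(stat(trials_1) - stat(trials_2))
    pooled = np.concatenate([trials_1, trials_2])
    n1 = len(trials_1)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        diff = abs(stat(pooled[idx[:n1]]) - stat(pooled[idx[n1:]]))
        hits += int(diff >= observed)
    return hits / n_perm
```

Because whole trials are shuffled, the pairing between the two neurons within a trial is preserved; only the task label attached to each trial is randomized.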
For contour detection experiments, we compared the spiking correlations at the population level. For each cell pair, we estimated NCCGs (as mentioned above) under different stimulus (1, 3, 5, 7, and 9 bar) and task (detection and attend-away) conditions. We averaged the NCCGs of all the cell pairs for the 1 bar condition during the detection task to obtain the “no contour” correlogram. The NCCGs for 3, 5, 7, and 9 bar conditions were averaged to get the “contour” correlograms for the detection and attend-away conditions separately. To test whether the observed correlations at the population level differed significantly between two stimulus/task conditions, we used the paired Wilcoxon signed-rank test (α = 0.05) on the correlation magnitudes (sum of coincident spikes in ±15 ms) of the individual NCCGs (Matlab, The Mathworks).
LFP coherence.
We determined LFP interactions between recording sites by measuring the coherence of their LFP signals. Cross-spectra and auto-spectra of LFP signals for a pair of sites were calculated by the Fourier transform. Coherence was then calculated as follows:

$$ C_{xy}(f) = \frac{\left| S_{xy}(f) \right|}{\sqrt{S_{xx}(f)\, S_{yy}(f)}} $$

where Sxy is the cross-spectrum, and Sxx and Syy are the auto-spectra, of the LFP signals. LFP coherence, which varies between 0 and 1, measures the linear correlation between two signals as a function of frequency. For a given frequency, coherence between two LFP signals will be 1 if their amplitudes covary and if they maintain a constant phase relationship. If the two signals are independent or if there is no constant phase relationship between the signals, coherence will be equal to 0. The cross-spectra and auto-spectra were averaged over trials for a task condition before calculating the coherence. The Fourier analysis was done in 120 ms sliding windows with 1 ms shifts, resulting in a coherogram giving the time–frequency relationship of the coherence. The LFP signals were corrected for common reference to ensure that the observed task-dependent differences in coherence were not the result of the changes in the common ground potential. LFPs from each electrode were rereferenced to the average potential across the area (the mean potential of the signals across the entire array, excluding the electrode under consideration, was subtracted from the LFP signal at that electrode) (Buschman et al., 2007). We also corrected for stimulus-induced coherence changes by computing the coherence shift-predictor (i.e., the mean coherence computed from all possible permutations of trials) and subtracting it from the coherence to estimate corrected coherence. All the coherence results presented in this paper are such corrected coherence. To obtain the coherence as a function of frequency only, we averaged coherence over the entire trial period after the initial burst at stimulus onset.
The time course of coherence dynamics was obtained by averaging the coherence in the frequencies 10–120 Hz. We used the Wilcoxon signed-rank test (α = 0.05; Matlab, The Mathworks) to determine whether the observed task-dependent changes in LFP coherence magnitude were significantly different.
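As a rough illustration of the coherence pipeline, the sketch below computes trial-averaged coherence for a single analysis window, an average-reference correction, and a shift-predictor correction. Several choices are assumptions rather than details taken from the study: coherence is taken in its magnitude form, |Sxy| / sqrt(Sxx·Syy); a Hann taper is applied; and the shift predictor is approximated with a few circular shifts of trial order rather than all trial permutations. Function names are hypothetical.

```python
import numpy as np

def rereference(lfp):
    """Subtract from each channel the mean of all other channels.
    lfp: (n_channels, ...) array."""
    n = lfp.shape[0]
    others_mean = (lfp.sum(axis=0, keepdims=True) - lfp) / (n - 1)
    return lfp - others_mean

def coherence(x, y, fs):
    """Coherence between two sites within one window.
    x, y: (n_trials, n_samples). Spectra are averaged over trials
    before the ratio is taken."""
    win = np.hanning(x.shape[1])            # assumed taper
    X = np.fft.rfft(x * win, axis=1)
    Y = np.fft.rfft(y * win, axis=1)
    sxy = np.mean(X * np.conj(Y), axis=0)   # trial-averaged cross-spectrum
    sxx = np.mean(np.abs(X) ** 2, axis=0)   # trial-averaged auto-spectra
    syy = np.mean(np.abs(Y) ** 2, axis=0)
    freqs = np.fft.rfftfreq(x.shape[1], d=1.0 / fs)
    return freqs, np.abs(sxy) / np.sqrt(sxx * syy)

def corrected_coherence(x, y, fs, n_shifts=5):
    """Raw coherence minus a shift predictor built by pairing each
    trial of x with a different trial of y."""
    _, raw = coherence(x, y, fs)
    shifted = [coherence(x, np.roll(y, k, axis=0), fs)[1]
               for k in range(1, n_shifts + 1)]
    return raw - np.mean(shifted, axis=0)
```

Applying `coherence` to successive 120 ms windows stepped by 1 ms would yield a coherogram; a 120 ms window gives a frequency resolution of roughly 8 Hz.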
Noise correlations.
We studied two measures of correlation between the responses of a cell pair: signal correlation and noise correlation (Gawne and Richmond, 1993). Signal correlation, rsig, estimates the similarity in tuning to a stimulus set between a pair of neurons. In our case, it was simply the Pearson correlation coefficient of a cell pair's tuning curves for parallel bar/collinear bar positions. Noise correlation, rnoise, estimates correlated trial-by-trial variability for a pair of cells. For each cell pair, we calculated rnoise for the different tasks separately. We first calculated the trial-by-trial spike count for each neuron during the entire stimulus period. Because each task in our experiments had multiple stimulus conditions, we first z-scored the spike counts (Bair et al., 2001), which normalized the spike counts for the changes in spike rates due to stimulus conditions. We then calculated the Pearson correlation coefficient of the normalized spike counts to determine the rnoise. Differences in noise correlations between task conditions were tested by the paired Wilcoxon signed-rank test (α = 0.05; Matlab, The Mathworks).
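The two correlation measures can be sketched as follows. This is a minimal example, assuming spike counts are z-scored separately within each stimulus condition and that every condition has non-zero count variance; the function names are hypothetical.

```python
import numpy as np

def signal_correlation(tuning_a, tuning_b):
    """Pearson correlation between two cells' tuning curves."""
    return float(np.corrcoef(tuning_a, tuning_b)[0, 1])

def noise_correlation(counts_a, counts_b, stimuli):
    """Pearson correlation of spike counts after z-scoring each cell's
    counts within every stimulus condition, which removes
    stimulus-driven changes in mean rate and variance."""
    a = np.asarray(counts_a, dtype=float)
    b = np.asarray(counts_b, dtype=float)
    stimuli = np.asarray(stimuli)
    za = np.empty_like(a)
    zb = np.empty_like(b)
    for s in np.unique(stimuli):
        m = stimuli == s
        za[m] = (a[m] - a[m].mean()) / a[m].std()
        zb[m] = (b[m] - b[m].mean()) / b[m].std()
    return float(np.corrcoef(za, zb)[0, 1])
```

Two cells with opposite tuning but identical trial-by-trial fluctuations would therefore have a negative signal correlation and a noise correlation of 1.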
Fisher information.
The Fisher information (IF) provides a limit on the accuracy with which an unbiased decoder can read out a population code. We estimated the information present in a neuronal ensemble as (Abbott and Dayan, 1999)

$$ I_F(x) = f'(x)^{T}\, Q^{-1}(x)\, f'(x) + \frac{1}{2}\, \mathrm{Tr}\!\left[ Q'(x)\, Q^{-1}(x)\, Q'(x)\, Q^{-1}(x) \right] $$

Here, f(x) is the vector of responses of the neurons in the population for the stimulus x; Q denotes the covariance matrix; superscript T denotes the matrix transpose, superscript −1 the matrix inverse and Tr represents the trace operation. The spike counts were corrected for nonstationarity. Because the stimulus condition xi (i = 1, 2, 3, 4, 5) was discrete in our experiments, we estimated f′(xi) (and Q′(x)) as f(xi+1) − f(xi) (and Q(xi+1) − Q(xi)); thus, Fisher information curves were calculated for the first four stimulus conditions (see Fig. 10D). To estimate the contribution of tuning curve changes to change in information content when the animal performed the perceptual task at the RFs, we recomputed Fisher information as before but using the tuning curves (f(x)) during the perceptual task at the RFs and covariance matrices (Q(x)) from the task condition when the animal attended to the RFs, but did not perform a task at the RFs. We did a Box-Cox transform of the spiking rates before the calculation, to ensure that the neuronal responses for a given stimulus follow a normal distribution. Because the above transformation can result in non-0 spike responses, we adjusted the transformed data such that the spike response distribution is shifted away from 0. Qualitatively, the results were similar for the original, untransformed dataset.
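A finite-difference version of this calculation might look like the sketch below. It assumes the Gaussian form of Fisher information with stimulus-dependent covariance (Abbott and Dayan, 1999), I_F = f′ᵀ Q⁻¹ f′ + ½ Tr[Q′ Q⁻¹ Q′ Q⁻¹], at least two neurons, and enough trials for each covariance matrix to be invertible. The nonstationarity correction and Box-Cox transform are omitted, and the function name is hypothetical.

```python
import numpy as np

def fisher_information(responses):
    """responses: list over ordered stimulus conditions; each entry is
    a (n_trials, n_neurons) array of responses. Returns one Fisher
    information value for each condition except the last."""
    f = [r.mean(axis=0) for r in responses]             # tuning vectors f(x_i)
    Q = [np.cov(r, rowvar=False) for r in responses]    # covariances Q(x_i)
    info = []
    for i in range(len(responses) - 1):
        df = f[i + 1] - f[i]          # forward difference for f'(x_i)
        dQ = Q[i + 1] - Q[i]          # forward difference for Q'(x_i)
        Q_inv = np.linalg.inv(Q[i])
        mean_term = df @ Q_inv @ df
        cov_term = 0.5 * np.trace(dQ @ Q_inv @ dQ @ Q_inv)
        info.append(mean_term + cov_term)
    return np.array(info)
```

To isolate the contribution of tuning curve changes, the same function could be called with the means taken from one task condition and the covariances from another, in the spirit of the comparison described above.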
Results
Task-dependent modulation of V1 contextual interactions
We trained animals to perform two discrimination tasks (bisection and vernier) on a 5 bar stimulus (Fig. 1A; see Stimuli and task design). In the bisection task, the animals discriminated the relative distance between the parallel bars; and in the vernier task, the lateral offset of the collinear bars. For a given 5 bar stimulus, the two tasks engaged different stimulus components: the bisection task involved the relative position of the parallel bars; the vernier task relied on the spatial offset of the collinear bars. We recorded from V1 cells whose RFs were positioned over the various parts of the 5 bar stimulus (Fig. 1B). During a recording session, the central bar was fixed in the RF center of an arbitrarily selected V1 neuron. All the bars were oriented to match the preferred orientation of this neuron. We then studied the effect of top-down signals on individual neuronal responses and network interactions in V1, using both neuronal spiking activity and LFP as measures of cortical activity.
Spiking activity
V1 neurons were differentially modulated by positional offset of either the parallel or collinear flanks (Fig. 2) when the animals performed different discrimination tasks. For example, the tuning curves for various parallel-bar positions (Fig. 2A) differed in a task-dependent fashion, whereby cells showed more modulation when the animals were performing the bisection task, in which parallel-flank position was the task relevant attribute, compared with when they did the vernier task where the same parallel flanks were task irrelevant. That is, the cell's tuning for the parallel-bar positions was more informative during the bisection task, when the animal had to use this information (see the mutual information analysis, next paragraph). On the other hand, the cell's tuning was less modulated and hence less informative of the parallel-bar positions during the vernier task, when this information would be of no use for the animal. Similarly, the degree of modulation in tuning for collinear-flank position depended on its relevance to the task of vernier discrimination (Fig. 2B).
We used “mutual information” between the spiking response and the stimulus to quantify the task-dependent modulations in the tuning curves of V1 cells. Mutual information provides us with a measure of how reliably an ideal observer could categorize a stimulus presented in a single trial based on the spike count of a cell during the trial. Over the population (N = 57 and N = 30 for Monkey a and Monkey b, respectively; Fig. 2C), the average mutual information for both the parallel and collinear-flank position tuning was significantly higher in the relevant task. Moreover, it was clearly higher than that calculated by Monte Carlo simulations of the data (the red and green clouds), where the data were randomly assigned to the two task conditions. Therefore, V1 responses carried significantly more information about a stimulus context when the context was task-relevant.
The animals were cued to the task by color: green was used for relevant bars and white for irrelevant ones. To exclude the possibility that the changes in mutual information could arise purely from this manipulation, we studied V1 responses during a control task in which the animal performed a visual task in the hemifield opposite to that of the 5 bar stimulus (Fig. 2D, E). During this control task, V1 sites showed no significant changes in their tuning when the flank stimulus switched color. For example, the same V1 neuron that showed task-driven modulations for parallel flanks (Fig. 2A) did not significantly differ in its response when the parallel flanks changed their color from green to white when the animal was engaged in the control task (Fig. 2D). This demonstrates that the task-driven changes observed in Figure 2A were the result not of a change in stimulus color but of cognitive influences. Similarly, change in color of collinear flanks did not produce differential modulations in V1 responses (Fig. 2E). We found no significant differences in the population mean mutual information during the control task: the values for both the parallel-flank and collinear-flank position tuning were close to the diagonal and within the Monte Carlo simulations (Fig. 2F). These results suggest that the observed task-dependent changes in V1 neuronal responses were the result of the change in the behavioral relevance of the stimulus rather than its color. The aforementioned observations are consistent with our previous study showing that V1 neurons carry significantly more information about a stimulus when it is task-relevant. Because in our current study we were primarily interested in mechanisms of top-down control in V1, these confirmatory results ensured that our subsequent experiments and analyses were based on task-dependent modulatory effects in V1.
LFP response
We performed a similar analysis based on LFPs, which reflect aggregate activity over a large population of neurons. Considerable task-dependent modulation of contextual effects was seen in the power present in LFP frequencies (Fig. 3). LFP power tuning (in 10–120 Hz) for both the parallel and the collinear-flank positions was more modulated in the relevant task than in the irrelevant task (Fig. 3A, B). Similar to spiking activity, the mean mutual information was significantly higher during the task where the flanks were task relevant and clearly separated from Monte Carlo simulations (Fig. 3C, N = 50 and N = 30 in Monkey a and Monkey b, respectively). Moreover, LFP power at V1 sites was not differentially modulated by stimulus context during the control task (Fig. 3D, E), and there was no significant difference in the population-averaged mutual information during the control task (Fig. 3F). These results together indicate that V1 LFPs represented both stimulus context (parallel or collinear flank positions) and behavioral context (bisection or vernier task).
To explore frequency dependence of the task-specific modulations, we analyzed the LFP power in the 10–30 Hz band separately from the power in 31–120 Hz ranges. The LFP power in both the frequency bands showed similar task-dependent effects of mutual information, suggesting a broadband representation of behavioral context in V1.
Given that V1 activity reflected the task-dependent contextual interactions, the top-down signals carrying task information must induce the network to process behaviorally relevant sensory information. This could be achieved either by suppressing the activity of neurons encoding task-irrelevant information or by altering the effective connectivity between cortical sites representing stimulus context that is either relevant or irrelevant to the task.
Neuronal representation of the flanks
First, we tested the possibility that the top-down control of contextual modulation operates by suppressing or enhancing the V1 neurons that directly encode the various stimulus contexts. We compared the response properties of cells whose RFs were over the parallel or collinear flanks, when the animal performed the bisection or vernier discrimination task. The flanking sites did not show significant task-dependent changes in their responses for various positions of the flank stimuli in their receptive fields. For example, in V1 the responses of neurons (Fig. 4A) with receptive fields overlapping the parallel flanks were independent of the task relevance of the parallel flanks. Similarly, the sites that represented the collinear flanks (Fig. 4B) did not show task-dependent changes in their responses.
The same trend was seen over the population that represented the flankers: the mean firing rate of V1 cells showed no significant task-dependent changes (Fig. 4C; p = 0.6448 and p = 0.3238 for the difference in the firing rates for Monkey a and Monkey b, respectively, Wilcoxon signed-rank test). Although the responses of these sites encoded the various flank positions, there were no task-dependent changes in the encoded mutual information (Fig. 4D; the mean mutual information for both parallel and collinear flank sites was away from the Monte Carlo clouds, but the mean values lie on the diagonal). This suggests that top-down influences did not operate by suppressing or facilitating V1 neurons that encode the stimulus context.
Top-down modulation of spiking cross-correlations
To test the alternate possibility that the task-dependent contextual modulations are caused by top-down driven changes in functional connectivity, we studied the spiking interactions (calculated by cross-correlations) between V1 sites that encoded different stimulus contexts. We classified cell pairs as “parallel” or “collinear” depending on their relative RF positions (see Fig. 5 for the configuration of recording sites). For each parallel cell pair, one neuron's RF was located over the center bar and the other neuron's RF overlapped with one of the parallel flanking bars. Similarly, one neuron of each collinear cell pair had its RF positioned over the center bar and the other had its RF overlapping one of the collinear flanks. We did observe task-dependent changes in spiking cross-correlations. Figure 5A shows the NCCGs observed in two pairs of neurons, with parallel RFs, under the two tasks. Both cell pairs showed significant task-driven differences in their correlations, with the direction of changes varying between cell pairs: one pair showed higher correlations during the relevant, bisection task (Fig. 5A, left), whereas the other pair showed higher correlations in the irrelevant, vernier task (Fig. 5A, right). Similar results were observed for collinear V1 sites (Fig. 5B).
Over the population of recorded V1 cell pairs in the two animals (N = 395 parallel pairs; 362 collinear pairs), 40% of the cell pairs showed significant correlations in at least one task condition. Approximately 50% of the cell pairs with significant correlations showed task dependency of correlation strength (Fig. 5C). Of these, stronger correlations could be observed either under the task relevant (49%) or task irrelevant (51%) conditions, reflecting a significant task dependence of effective connectivity.
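As an illustration of how a shuffle-corrected, normalized cross-correlogram can be computed from binned spike trains, a minimal sketch follows. The normalization by the geometric mean of the spike counts, the shift predictor built from adjacent trials, and all names and simulated data are assumptions made for this example; the exact NCCG definition is the one given in Materials and Methods.

```python
import numpy as np

def ccg(a, b, max_lag):
    """Raw cross-correlogram of binned spike trains (trials x bins), summed over trials."""
    n_bins = a.shape[1]
    lags = np.arange(-max_lag, max_lag + 1)
    out = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            # pair a(t) with b(t + lag)
            out[i] = np.sum(a[:, :n_bins - lag] * b[:, lag:])
        else:
            out[i] = np.sum(a[:, -lag:] * b[:, :n_bins + lag])
    return lags, out

def normalized_ccg(a, b, max_lag=50):
    """Shift-predictor-corrected correlogram, normalized by the geometric mean spike count."""
    lags, raw = ccg(a, b, max_lag)
    # Shift predictor: pair each trial of a with a different trial of b,
    # which removes covariation that is locked to the stimulus.
    _, shift = ccg(a, np.roll(b, 1, axis=0), max_lag)
    return lags, (raw - shift) / np.sqrt(a.sum() * b.sum())

# Hypothetical pair of neurons that share a common input (1 ms bins).
rng = np.random.default_rng(0)
n_trials, n_bins = 100, 500
shared = rng.random((n_trials, n_bins)) < 0.01
a = ((rng.random((n_trials, n_bins)) < 0.02) | shared).astype(int)
b = ((rng.random((n_trials, n_bins)) < 0.02) | shared).astype(int)

lags, nccg = normalized_ccg(a, b)
peak = nccg[lags == 0][0]
print(f"correlation magnitude at zero lag: {peak:.3f}")
```

Task dependence for a given pair would then be assessed by computing the correlogram separately for trials from each task and comparing the peak magnitudes.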
Top-down modulation of LFP interactions
The finding of task-dependent changes in spike correlations motivated us to obtain a measure of cortical interactions reflecting the integrated connectivity at the population level. We calculated LFP-LFP coherence between V1 sites representing different stimulus components under the two tasks. We found significant task-driven changes in LFP coherence for both parallel (Fig. 6A1–A4, N = 382) and collinear (Fig. 6B1–B4, N = 296) sites. We computed the coherence between parallel sites during a bisection task (task relevant for the sites) involving parallel bars (Fig. 6A1, dark red curve) and the coherence between the same sites during a vernier task (task irrelevant for the sites) involving collinear bars (Fig. 6A2, darker gray curve, which is almost superimposed on the lighter gray one). Because the animals were cued to the task by the flanks' color, we determined the contribution of color to LFP coherence: we measured coherence during a control task performed in the hemifield opposite to the recorded RFs, where the stimuli were identical to those used for the experimental task (Fig. 6A1, A2, lighter red and gray curves). Subtracting the coherence under the control condition from that under the task condition provided an accurate estimate of coherence changes due solely to changing task (Fig. 6A3). The coherence between parallel sites was higher in the bisection task, when the animals used the parallel bars encoded by these sites, than in the vernier task, where these sites were irrelevant to the task (Fig. 6A3, C, red points; p < 10⁻²⁹ for Monkey a and p < 10⁻²⁵ for Monkey b). These task-driven changes were present at both lower and higher frequencies, ranging from 10 to 120 Hz, and throughout the entire trial period (Fig. 6A4). Interestingly, these differences emerged even before the stimulus onset, suggesting that task expectancy can preset the computational state of visual cortex (see Discussion).
The collinear sites also displayed task-dependent changes in LFP coherence (Fig. 6B1–B3). For these sites, the coherence was lower in the relevant vernier task compared with the irrelevant bisection task (Fig. 6B3, C, green points; p < 10⁻¹⁷ for Monkey a and p < 10⁻¹⁸ for Monkey b). Similar to parallel sites, the difference in coherence between the collinear sites for the two tasks was observed in both lower and higher frequencies and emerged during the prestimulus period and persisted for the entire trial period (Fig. 6B4). Because in our experiments the two task conditions were interleaved in blocks, the animal was primed to do the task before stimulus onset. Hence, the observed task-driven differences before the trial onset could be suggestive of task expectation presetting the state of the cortical network and thereby enabling the cortical network to process the incoming stimulus from its onset. This idea is supported by previous attentional studies in visual cortex showing modulation of prestimulus cortical responses by behavioral cues (Kastner et al., 1999; Fries et al., 2001; Thut et al., 2006).
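The control-subtracted coherence measure can be sketched as follows. This is an illustration only: it assumes a magnitude-squared coherence estimated from non-overlapping Hann-windowed segments and averaged over 10–120 Hz, and it uses simulated signals in which a hypothetical `coupling` parameter sets the strength of the input shared by two sites. None of the names or parameter values come from the recordings.

```python
import numpy as np

def coherence(x, y, fs, nperseg=256):
    """Magnitude-squared coherence from non-overlapping Hann-windowed segments."""
    nseg = len(x) // nperseg
    win = np.hanning(nperseg)
    X = np.fft.rfft(x[:nseg * nperseg].reshape(nseg, nperseg) * win, axis=1)
    Y = np.fft.rfft(y[:nseg * nperseg].reshape(nseg, nperseg) * win, axis=1)
    sxy = np.mean(X * np.conj(Y), axis=0)   # cross-spectrum
    sxx = np.mean(np.abs(X) ** 2, axis=0)   # auto-spectra
    syy = np.mean(np.abs(Y) ** 2, axis=0)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy)

def band_coherence(pair, fs, lo=10.0, hi=120.0):
    """Mean coherence between two signals in the lo-hi Hz band."""
    freqs, coh = coherence(pair[0], pair[1], fs)
    band = (freqs >= lo) & (freqs <= hi)
    return float(coh[band].mean())

rng = np.random.default_rng(0)
fs, n = 1000, 20000  # hypothetical sampling rate (Hz) and number of samples

def simulate_pair(coupling):
    """Two simulated LFPs driven by a common source of adjustable strength."""
    shared = rng.standard_normal(n)
    return (coupling * shared + rng.standard_normal(n),
            coupling * shared + rng.standard_normal(n))

# Task minus control, for a task that uses the sites and one that does not.
effect_relevant = band_coherence(simulate_pair(1.0), fs) - band_coherence(simulate_pair(0.5), fs)
effect_irrelevant = band_coherence(simulate_pair(0.5), fs) - band_coherence(simulate_pair(0.5), fs)
print(f"relevant: {effect_relevant:.3f}, irrelevant: {effect_irrelevant:.3f}")
```

Subtracting the control condition removes any coherence contributed by the cue itself, so the remaining difference between the two tasks can be attributed to the task.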
The aforementioned results suggest that top-down control in V1 operates by modifying the connectivity among the sites. However, the direction of changes differed between the tasks: increased connectivity under the relevant (bisection) task for parallel sites but decreased connectivity under the relevant (vernier) task for collinear sites. This difference in the direction of the task-dependent changes between the parallel and collinear sites was consistent between the two animals (Fig. 6C). The difference may have resulted from the perceptual strategies used for the two tasks. The bisection task requires perceptually grouping the center bar with its nearest parallel flanking bar to judge whether it is closer to the upper or lower bar. Conversely, the strategy required in the vernier task to judge the relative position of three collinear bars is to break their perceptual continuity and to segregate the collinear flanks from the center bar. Thus, bisection involves grouping of the parallel bars using the Gestalt law of proximity, whereas vernier discrimination involves segregation of the collinear bars. As a consequence, LFP interactions were enhanced between parallel sites in the bisection task and reduced between collinear sites in the vernier task. If our hypothesis is correct, then a task involving grouping of collinear lines should increase LFP interactions between the collinear sites. To test this idea, we had the animal perform a task that required grouping, rather than segregation, of collinear lines: a contour detection task.
Contour detection task
In this task, a contour composed of collinear bars was embedded in a complex background, and the contour saliency depended on the number of collinear elements. The animals were trained to detect the contour embedded in one of the two stimulus patches (Fig. 7A; see Materials and Methods). Previous work in V1 has shown that more salient contours activate stronger neuronal responses and that the degree of collinear facilitation is subject to top-down influences. Facilitation is strongest when animals perform tasks involving contours (Li et al., 2006). To understand whether V1 network properties could account for this task-dependent facilitation, we compared network interactions during the contour detection task and a control (attend-away) task unrelated to the contour stimulus (Fig. 7B; see Materials and Methods). Spike and LFP data were collected from V1 neurons that lay along the contour (Fig. 7A, B, red squares represent the RFs of the recorded cells).
V1 contour integrative properties
As in our previous study (Li et al., 2006), we found that V1 neurons encoded contour saliency: contours of longer lengths resulted in stronger firing (Fig. 7C). The contour-related facilitation in spiking activity emerged ∼100 ms after stimulus onset. Over the population, neuronal responses to the longest (most salient) embedded contour were more than double the response to a single bar in the RF, surrounded by the complex background (Fig. 7D, solid curves). A degree of collinear facilitation in V1 activity was still present during the attend-away task, but the amount of facilitation was significantly weaker (Fig. 7D, broken curves).
We also observed contour-dependent facilitation in the frequency domain of LFPs. Figure 7E shows the population-averaged time course of LFP power in 10–120 Hz for contours of varying lengths: longer contours result in higher LFP power. The contour-related facilitation emerged ∼150 ms after stimulus onset. The mean facilitation of LFP power (from 150 to 600 ms after stimulus onset) was ∼30% for the longest contour (Fig. 7F, solid curves). Similar to spiking activity, contour saliency related facilitation of LFP power was higher when the animal performed the detection task than when it attended away (Fig. 7F, broken curves).
V1 interactions and top-down influences during contour integration
To understand how the V1 network was involved in perceptual integration of collinear lines into a contour, we analyzed both spiking and LFP interactions between V1 neurons that lay along the contour embedded within the complex background.
Spiking correlations
V1 neurons whose RFs were located over the collinear contour showed significant spiking correlations, but they showed negligible correlated responses in the absence of the contour. Figure 8A shows NCCGs for the population of recorded V1 cells with RFs of similar orientation preference located along the contour when the animal performed the contour detection task (<10° difference; N = 354 pairs; distance between cell pairs: 0.3°–1.8°, mean 0.7°). When there was a contour passing through the cells' RFs, the cells were correlated significantly (correlation magnitude: 0.058; red curve, Fig. 8A). However, when there was no contour present, there was little or no correlation between the V1 sites (Fig. 8A, black curve, correlation magnitude: 0.0019; p < 0.001 for the difference between contour and no contour conditions). The correlations between these V1 sites also captured the contour saliency information, with longer contours producing stronger correlations (Fig. 8B, red curves).
We also observed task-related effects of contour facilitation in the spiking interactions between V1 contour sites. Although V1 neurons showed significant correlations when the contour elements were unrelated to the animal's behavior (i.e., the attend-away task), the observed correlation (Fig. 8A, green curve, correlation magnitude, 0.0384) was significantly less than that observed during the contour detection task (Fig. 8A, B, green curve, attend-away task vs red curve, contour detection task; p < 0.001). Thus, the top-down influences boosted the spiking interactions between V1 sites that encoded the contour when its saliency was behaviorally relevant.
LFP interactions
Similar to spiking correlations, LFP coherence between V1 sites captured contour-related information. During the contour detection task, collinearly arranged sites increased their coherence when there was a contour present in the noise background compared with when there was no contour (Fig. 9A, red and black curves, respectively; p < 10⁻¹⁵ for Monkey a and p < 10⁻⁹ for Monkey b). This contour-dependent increase in coherence was observed in both low and high frequencies, from 10 to 120 Hz, and emerged at 150 ms after stimulus onset and lasted the entire stimulus period (Fig. 9B, compare red and black curves).
Collinear V1 sites along a contour also showed task-related effects in their LFP coherence. Similar to spiking correlations, the LFP coherence between the contour-encoding sites was higher when the animal was actively looking for a contour compared with when the animal was doing an unrelated task (Fig. 9A, green curves; p < 10⁻¹⁵ for Monkey a and p < 10⁻⁸ for Monkey b). This difference in coherence between the two task conditions emerged after 150 ms of stimulus presentation and was present in frequencies from 10 to 120 Hz (Fig. 9B, compare red and green curves).
The finding that the contour detection task, which requires grouping of collinear lines, increased LFP interactions supports the idea that perceptual strategy determines the direction of task-dependent modulation of LFP coherence. Both tasks involving perceptual grouping, 3-line bisection and contour detection, increased V1 interactions.
Top-down influences on V1 noise correlations
The information carried by a neuronal ensemble depends on noise correlations, that is, on whether neurons exhibit similar trial-to-trial fluctuations in their responses (Shadlen et al., 1996; Lee et al., 1998; Abbott and Dayan, 1999; Panzeri et al., 1999; Bair et al., 2001; Averbeck et al., 2006). Because noise correlations can affect the encoding accuracy of a cortical network, we investigated how behavioral context affected V1 noise correlations.
Our 5 bar discrimination experiments allowed us to study V1 noise correlation dynamics in three different conditions (Fig. 10A): When the animal (1) performed a task involving the stimuli within the RFs of the cell pair under consideration (Fig. 10A, right panel); (2) attended to the same location, but performed a task that did not involve the flanking neuron's RF (Fig. 10A, middle panel); and (3) attended away from the location of the RFs of the recorded cell pair (Fig. 10A, left panel). For example, for a pair of parallel V1 sites, these 3 different cases would be: (1) bisection task involving parallel bars, (2) vernier task involving collinear bars, and (3) attend-away task involving the stimulus in the opposite hemifield. These different task conditions could then be used to dissociate noise correlation changes resulting from spatial attention and the perceptual task. Because both parallel and collinear sites showed similar trends in the attention and task effect of noise correlations, we combined the data from both classes in our analysis.
We observed that V1 neurons decreased their noise correlations by ∼60% when the animals shifted their attention from the opposite hemifield to the visual field position of the recorded neurons (Fig. 10B). The mean noise correlation was 0.0381 for the attend-away task and 0.0141 when the animals attended to the location of the receptive fields of the recorded neurons (p < 10⁻⁶ for difference). We saw a more substantial reduction in noise correlations when the animal performed a perceptual task at the receptive field locations of the recorded neurons (mean 0.0041; p < 10⁻⁶ for difference between the attend-away task and the discrimination task at the RFs).
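For concreteness, the sketch below shows one common way to estimate a noise correlation from trial-by-trial spike counts, followed by the relative reductions implied by the mean values reported above (about 63% for the shift of attention and about 89% for the task at the RFs, both relative to the attend-away condition). The z-scoring within each stimulus, the function names and the simulated gain fluctuations are assumptions of this example rather than a description of the analysis in Materials and Methods.

```python
import numpy as np

def noise_correlation(counts_a, counts_b):
    """Correlation of trial-to-trial fluctuations in spike count.

    counts_a, counts_b: arrays of shape (n_stimuli, n_trials). Counts are
    z-scored within each stimulus, so differences in tuning are removed.
    """
    def zscore(c):
        return (c - c.mean(axis=1, keepdims=True)) / c.std(axis=1, keepdims=True)
    return float(np.mean(zscore(counts_a) * zscore(counts_b)))

rng = np.random.default_rng(0)
tuning = np.array([5.0, 10.0, 20.0, 10.0, 5.0])[:, None]  # hypothetical mean counts
n_trials = 400

def simulate_pair(sigma):
    """Two Poisson neurons sharing a multiplicative gain with log-SD sigma."""
    gain = np.exp(sigma * rng.standard_normal((len(tuning), n_trials)))
    return rng.poisson(tuning * gain), rng.poisson(tuning * gain)

r_shared = noise_correlation(*simulate_pair(0.2))  # shared fluctuations present
r_indep = noise_correlation(*simulate_pair(0.0))   # independent neurons

# Relative reductions implied by the reported mean noise correlations.
attend_away, attend_rf, task_rf = 0.0381, 0.0141, 0.0041
reduction_attention = 1 - attend_rf / attend_away
reduction_task = 1 - task_rf / attend_away
print(f"{reduction_attention:.0%} (attention), {reduction_task:.0%} (task at RFs)")
```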
We further examined whether the top-down modulation of noise correlations depended on the tuning similarity between neurons. We studied the changes in noise correlations between V1 neurons as a function of their “signal correlations” (i.e., the correlation between their tuning curves), for the 3 different task conditions (Fig. 10C). Across task conditions, similarly tuned cells (positive signal correlation) showed higher noise correlations than cells with different tuning (negative signal correlation, all the curves in Fig. 10C). This result agrees with previous studies of noise correlations in various cortical areas and supports the idea that cells with similar tuning are subject to shared noise sources through their common inputs. Furthermore, a task-driven reduction in noise correlations was observed for all cell pairs, independent of the tuning similarity (signal correlation) of the cells. Both similarly and dissimilarly tuned cells reduced their correlations when the monkey shifted attention and performed a perceptual task using the stimulus encoded by the neurons. The biggest reduction in noise correlations, however, was seen for similarly tuned cells (Fig. 10C, compare the curves for positive signal correlations). Because positive noise correlations between similarly tuned cells limit information capacity more than positive correlations between neurons with dissimilar tuning, this is precisely the result we expect to maximize information capacity in V1 (Panzeri et al., 1999; Averbeck et al., 2006).
We consolidated these information theoretic observations into a quantitative measurement of the accuracy of the V1 population code. To do this, we calculated the Fisher information (I_F) present in our recorded neuronal ensembles, both when the stimulus was behaviorally relevant and when it was not. The inverse of the Fisher information is the minimum averaged squared error for an unbiased estimator of an encoded stimulus attribute and thus sets a limit on the population code accuracy (Abbott and Dayan, 1999). With attention directed toward the RFs of the recorded ensemble, but not specifically toward the encoded stimulus attribute (e.g., when the animal performed the bisection task but the encoded stimulus attribute was collinear offset), we observed a moderate increase in the Fisher information (Fig. 10D, green curves). The Fisher information increased considerably more when the animal was engaged in a perceptual task involving the encoded stimulus attribute (Fig. 10D, red curves), and this increase was highest for the stimuli with the smallest lateral or collinear displacements and thus the highest discrimination difficulty (Fig. 10D, second and third points on the red curves). The results from our previous studies (Li et al., 2004, 2006; McManus et al., 2011) and our current work show that changes in the tuning curves of individual neurons, as well as changes in the structure of noise correlations in the network, can both improve the population code for a stimulus attribute. We therefore investigated how these two components of the behaviorally driven change in network activity separately affected the population code. We found that the changes in the shape of the tuning curves of the neurons in the ensemble contributed 50–60% of the observed information enhancement (Fig. 10D, dashed red curves) and that the remaining 40–50% of the information increase derived from changes in correlational structure.
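How tuning changes and correlation changes can each raise the Fisher information is illustrated by the following sketch. It assumes the linear form I_F = f′(s)ᵀ Q⁻¹ f′(s), which holds for Gaussian noise with a stimulus-independent covariance Q (Abbott and Dayan, 1999), and it uses a hypothetical population of 50 identically tuned neurons with uniform pairwise correlation. The parameter values and the sequential decomposition are illustrative assumptions and do not reproduce the procedure or the numbers behind Figure 10D.

```python
import numpy as np

def linear_fisher_information(f_prime, cov):
    """I_F = f'(s)^T Q^{-1} f'(s) for Gaussian noise with fixed covariance Q."""
    f_prime = np.asarray(f_prime, dtype=float)
    return float(f_prime @ np.linalg.solve(cov, f_prime))

def uniform_covariance(variances, rho):
    """Covariance matrix with the same correlation rho between every pair."""
    sd = np.sqrt(np.asarray(variances, dtype=float))
    corr = np.full((len(sd), len(sd)), float(rho))
    np.fill_diagonal(corr, 1.0)
    return corr * np.outer(sd, sd)

# Hypothetical ensemble: 50 similarly tuned neurons with unit variance.
n = 50
variances = np.ones(n)
slope_away, slope_task = np.ones(n), 1.2 * np.ones(n)  # tuning-curve slopes f'(s)
rho_away, rho_task = 0.04, 0.005                       # pairwise noise correlations

i_base = linear_fisher_information(slope_away, uniform_covariance(variances, rho_away))
i_tuning = linear_fisher_information(slope_task, uniform_covariance(variances, rho_away))
i_full = linear_fisher_information(slope_task, uniform_covariance(variances, rho_task))

# One possible (order-dependent) split of the total gain into a part due to
# tuning changes and a part due to changes in correlation structure.
frac_tuning = (i_tuning - i_base) / (i_full - i_base)
frac_corr = (i_full - i_tuning) / (i_full - i_base)

# Cramer-Rao bound: smallest mean squared error of an unbiased estimator.
min_error_task = 1.0 / i_full
print(f"I_F: {i_base:.1f} -> {i_full:.1f}; tuning share {frac_tuning:.0%}")
```

With similarly tuned neurons, even small positive noise correlations cap the information that can be gained by pooling, which is why lowering them has a large effect in this example.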
Thus, various forms of top-down control resulted in different degrees of modulation of noise correlations. Although changes in the locus of attention decreased correlations, performing a perceptual task involving the stimulus encoded by the neurons further reduced the noise correlations and substantially increased the information content of the V1 network.
Discussion
Contextual influences in V1 change to carry more information about behaviorally relevant stimulus features (Li et al., 2004). Here we investigated the mechanisms of task-dependent modulation in V1, testing the idea that it involves differential gating of sensory inputs from stimulus components according to their task relevance. This gating requires an interaction between intrinsic connections conveying stimulus context and recurrent inputs providing behavioral context. To measure changes in effective connectivity, we studied the task effects on measures of lateral interactions: spiking correlations and LFP-LFP coherence. Furthermore, we calculated task-dependent changes in noise correlations and Fisher information as measures of the stimulus information present in the neuronal ensemble.
Contextual influences strongly modulate V1 spiking responses, and these influences are under top-down control. Similarly, LFPs in the 10–120 Hz range encoded stimulus context, and this contextual tuning was modulated in a task-dependent fashion to extract behaviorally relevant stimulus information. Our results agree with previous work showing that LFPs reflect the neuronal basis of feature selectivity, perception, and attention (Kreiter and Singer, 1996; Gail et al., 2000; Fries et al., 2001; Siegel and Konig, 2003; Henrie and Shapley, 2005; Womelsdorf et al., 2006; Siegel et al., 2007; Berens et al., 2008).
If top-down control involves an interaction between sensory and behavioral context, the question arises as to the underlying circuitry. The lack of task-dependent suppression of signals encoding irrelevant stimulus components suggests instead a model involving cortical interactions. The observed task-dependent changes in V1 spike correlations and LFP coherence support the idea that dynamic changes in functional connectivity underlie top-down control in V1. We show that, under identical stimulus conditions but differing tasks, there can be large changes in spiking correlations. The observation that correlations increased between some cell pairs and decreased between others under a given task is perhaps not surprising because the changes in effective connectivity required for the task-dependent changes in neuronal tuning may require strengthening of interactions between some sites and weakening between others. The task-driven alteration in LFP coherence, though, showed more consistent changes. This may be the result in part of the fact that LFPs derive from large neuronal populations within the cortex (Kruse and Eckhorn, 1996; Kreiman et al., 2006; Liu and Newsome, 2006; Katzner et al., 2009) and that they are likely to originate from currents generated by both subthreshold inputs and spiking outputs. Another important difference between the LFP and spiking signals was that there were substantial task-dependent LFP differences that began before stimulus onset, whereas superficial neurons show negligible spiking without a stimulus. The LFPs may therefore reflect subthreshold recurrent cortical inputs that carry task information, setting the cortical state and, based on LFP-LFP coherence, network effective connectivity for performing the task-relevant calculation.
Attention has been shown to increase cortical interactions (Fries et al., 2001; Bichot et al., 2005; Fries et al., 2008; Gregoriou et al., 2009; Bosman et al., 2012), although there are contrary reports (Chalk et al., 2010). In our experiments, LFP coherence changes depended not only on attention but also on the task performed at the attended location. Interestingly, the perceptual strategies used during the tasks dictated the direction of these task-dependent changes. Perceptual grouping increased LFP coherence, whereas perceptual segregation decreased LFP coherence. Our results bear on the ongoing debate about the neural correlates of perceptual grouping and scene segmentation within visual cortex. One proposed theory suggests that neurons encoding components of the same object couple their activities to form synchronized assemblies (Gray et al., 1989; Engel et al., 1991; Castelo-Branco et al., 2000; Gail et al., 2000; Fries et al., 2001; for review, see Gray, 1999), although some studies challenge this idea (Lamme and Spekreijse, 1998; Thiele and Stoner, 2003; Palanca and DeAngelis, 2005). Our observation that perceptual grouping can increase V1 interactions whereas perceptual segregation can reduce them suggests that both coupled and decoupled activity in neuronal ensembles are important for perception. The effect of such dynamic ensemble interactions is to alter response rates along with effective connectivity, producing tuning characteristics that enable neurons to encode task-relevant information.
We observed top-down-modulated LFP changes over a broad range of frequencies. This is somewhat at variance with studies showing specific frequency effects, such as the implication of the γ band in feature integration (Gray et al., 1989; Engel et al., 1991; Singer and Gray, 1995; Kreiter and Singer, 1996; Gail et al., 2000; Henrie and Shapley, 2005), attention (Fries et al., 2001; Bichot et al., 2005; Taylor et al., 2005; Liu and Newsome, 2006; Womelsdorf et al., 2006; Gregoriou et al., 2009; Chalk et al., 2010), or memory (Tallon-Baudry et al., 1998; Pesaran et al., 2002). Lower frequencies (<30 Hz, α and β bands) have been linked to motor control and mental imagery (Palva et al., 2005; Cooper et al., 2006), short-term memory (Jensen et al., 2002; Busch and Herrmann, 2003; Sauseng et al., 2005; Siegel et al., 2009), and hypothesis testing and setting of cognitive state (von Stein et al., 2000; Gail et al., 2004; for reviews, see Palva and Palva, 2007; Engel and Fries, 2010). Studies on binocular rivalry, bistable percepts, and visual search have associated these low frequencies with cognitive state (Gail et al., 2004; Thut et al., 2006; Okazaki et al., 2008; Pesaran et al., 2008; Iversen et al., 2009). Because our studies involve interactions between cognitive influences and feature integration, it is perhaps not surprising that we see effects over a broad range of frequencies, and the character of stimulus and task are likely to influence the affected frequencies.
Uncorrelated noise in neuronal responses provides optimal coding of information by neuronal ensembles (Shadlen et al., 1996; Abbott and Dayan, 1999; Panzeri et al., 1999; Averbeck et al., 2006). But noise in the brain is correlated (Gawne and Richmond, 1993; Zohary et al., 1994; Gawne et al., 1996; Lee et al., 1998; Bair et al., 2001) and changes in noise correlations can influence the encoded information (Oram et al., 1998; Abbott and Dayan, 1999; Panzeri et al., 1999; Averbeck et al., 2006; Smith and Kohn, 2008). The mean noise correlations in our study were lower than those reported in some studies in V1 and other areas but similar to those seen in other reports (Zohary et al., 1994; Gawne et al., 1996; Lee et al., 1998; Kohn and Smith, 2005; Poort and Roelfsema, 2009; Ecker et al., 2010; Womelsdorf et al., 2012). Differences in experimental conditions (stimulus parameters, arousal state) and analysis parameters (spike-sorting conventions, time windows used for spike counting) could account for this discrepancy (Cohen and Kohn, 2011). Similar to other studies (Zohary et al., 1994; Bair et al., 2001; Kohn and Smith, 2005; Cohen and Maunsell, 2009; Gu et al., 2011), we found that similarly tuned cells had higher noise correlations, consistent with the idea that similarly tuned cells share common inputs and hence are subject to common noise sources. Top-down influences of attention and perceptual learning have been shown to reduce noise correlations (Cohen and Maunsell, 2009; Mitchell et al., 2009; Gu et al., 2011), but others report mixed effects of noise correlations on information (Romo et al., 2003; Poort and Roelfsema, 2009). In our experiments, not only spatial attention but also perceptual task improved information in V1 both by changes in neuronal tuning and by reduction in noise correlations, and the neuronal ensemble response was most informative for stimuli with the highest discrimination difficulty.
Theoretical studies suggest that the impact of noise correlations on the population code depends on the tuning properties of the cells pooled. Noise correlations between similarly tuned cells can be detrimental to the population code (Shadlen and Newsome, 1998; Abbott and Dayan, 1999), whereas they can be beneficial if neurons are tuned to different features (Oram et al., 1998; Panzeri et al., 1999; Averbeck et al., 2006). Here, top-down influences in V1 improved the coding of task-relevant stimuli by reducing noise correlations between similarly tuned cells. This relationship, although not seen previously (Cohen and Maunsell, 2009), may depend on the task demands and the stimulus configuration. For example, decoding where a stimulus change occurred might be less sensitive to the neurons' tuning similarity, whereas discriminating the spatial configuration of a complex stimulus might be more dependent on the neurons' tuning similarity.
We have suggested that, in V1, top-down control involves interactions between feedback carrying behavioral context and horizontal connections conveying stimulus context (Gilbert and Sigman, 2007). In our experiments, cognitive influences on V1 contextual interactions produced robust changes in functional connectivity. This suggests that, even though the anatomical connectivity of the horizontal connections is stable over the short-term, the functional efficacy of these connections can be controlled by task-driven influences, provided, for example, by recurrent projections to V1. Thus, V1 can be viewed as an “adaptive processor” that runs different computational programs as dictated by feedback from higher-order areas. The knowledge and the “switchboard” circuitry that is required to associate various behavioral needs with different brain states may be acquired through learning.
Footnotes
This work was supported by National Institutes of Health Grant EY007968. We thank Valentin Piëch for helpful suggestions.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Charles D. Gilbert, Rockefeller University, 1230 York Avenue, New York, NY 10065. gilbert{at}rockefeller.edu