Abstract
An important question in neuroscience is how the activity from spatially distributed cortical representations is integrated and processed together. In this study, we used a new approach to investigate the integration of distributed cortical activity. We used microstimulation to directly activate pairs of sites in primary visual cortex of rhesus monkeys. The sites were activated either singly or jointly, and the monkeys were trained to behaviorally report detection of the activation of either cortical site. We compared the detection performance with predictions from two different mathematical models of signal combination. Our data show that, at cortical separations <1 mm, signal integration is well described as a linear combination (d′ summation) of individual site activity. At larger separations, signal integration is better described as a maximum operation on the site signals. We compare our neurophysiological findings to existing psychophysical data and suggest the intriguing possibility that cortical activity originating at spatial separations greater than ∼1 mm is processed as if by parallel, independent circuits whose signals can be compared against each other but not summed. This in turn implies that there is a strong constraint to the kinds of computations the brain can perform with spatially distributed cortical activity.
Introduction
One of the basic findings of cortical neurophysiology has been the spatial localization of function. This segregation of function across cortical space has been best described in the sensory and motor cortices. This analytical view of cortical function has been accompanied by the complementary question of how such distributed representations may be synthesized into percepts that guide behavior. Although we have made great advances in understanding cortical representations, the nature of the processes integrating spatially distributed neural representations still poses a great mystery.
Current approaches to the question of integration of cortical representations are either anatomical or correlational. Anatomical studies attempt to build a neural wiring diagram at different spatial scales, which is then used to draw inferences about the flow of cortical signals. Correlational studies typically measure activity in different populations of neurons and use statistical models to infer connectivity and signal flow between cortical locations (for recent reviews, see Zamora-López et al., 2011; Behrens and Sporns, 2012).
In this study, we present a novel experimental approach to investigate the question of cortical integration. We implanted microelectrode arrays in the primary visual cortex (V1) of rhesus monkeys. We applied electrical microstimulation through selected pairs of array electrodes to insert activity directly into cortical sites. Through operant conditioning, we trained the monkeys to make controlled behavioral reports to indicate the presence of induced activity in either or both of the pair of cortical sites. In this manner, we were able to examine the integration of cortical signals originating at pairs of sites at well-defined separations.
We tested the mode of integration of the induced cortical signals against two different models of signal processing. In the first model, we assumed that activity from pairs of cortical sites is linearly summed. In the second model, we assumed that activity from the two sites does not interact and instead cortical activity is processed as if from parallel, independent circuits that can be modeled as a maximum (max) operation.
We found that, at close cortical separations (within ∼1 mm), the observed integration was consistent with a summation mechanism. At larger separations, however, the computation was well described by independent processing of each site, suggesting a strong constraint to the ability of the brain to assemble combinations from neural activity of groups of spatially separated cortical neurons.
Materials and Methods
Signal processing framework.
We consider the simplest case of cortical integration: the joint processing of neural activity originating from just two cortical sites. The computations performed on the signals from these two sites are likely to be task dependent. For example, a task that requires a subject to identify which of two patches of light is brighter probably requires a subtractive or other “comparative” operation. Conversely, a task that requires the detection of orientation change in either of two oriented gratings probably requires a summative or other “cooperative” operation. To study cortical integration from these two sites, we need to constrain the brain to a well-defined operation on the signals.
Consider a task that requires a subject to process neural activity originating at two cortical sites and execute a behavioral response if either cortical site is active. There is always baseline activity at the two sites, so the task the brain has to perform is to process activity from both sites and determine whether, in at least one site, the activity exceeded baseline. We can model the activity at the two sites as being either a “signal” contaminated with “noise” (s0, s1) or just “noise” alone (n0, n1) as shown in Figure 1, a and b. In this hypothetical task, an integrative process with access to both sets of cortical signals should perform a linear summation of the activity at the two sites before generating a decision (Snippe and Koenderink, 1992; Chen et al., 2006). In a prevalent framework of how cortical activity guides behavior (“bounded accumulation of evidence”; Gold and Shadlen, 2007), the summed activity from both cortical sites would accumulate until it crossed an internal criterion before releasing an appropriate behavioral response (Fig. 1c).
Figure 1. Two different models for signal integration. a, b, Theoretical distributions of noise (n0, n1) and signal plus noise (s0, s1) at two spatially separated cortical representations. The more separated the noise and signal plus noise distributions, the more easily the signal will be correctly detected. c, In the signal summation model, the activity at both sites is linearly summed, before being passed to a criterion mechanism. d, In the independent processing model, the activities from each site do not interact with each other, forming two components of a race model, signaling detection whenever either reaches criterion. e, The summation model simply sums the activity at the two sites, producing distributions of noise and signal plus noise that are more separated than those at either individual site. f, The independent processing model can be mathematically expressed as a max operation, taking the maximum of the two samples from the two sites. This results in distributions that are better separated than those at either individual site but not as well separated as those for the summation model.
As an alternative, we consider an integrative mechanism that processes each cortical signal separately (Fig. 1d). Under this model, each cortical site engages independent circuits all the way up to the criterion comparison (in the bounded accumulation of evidence framework, this implies that evidence from each cortical site is accumulated separately). Under such a constraint, the decision mechanism should release the behavioral response whenever either of the two circuits attains criterion (an OR-like operation).
Mathematically, the summation mechanism can be modeled as a simple sum of the measured samples (Fig. 1e), whereas the independent processing (OR-like) mechanism can be modeled as a max operation (Fig. 1f).
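The difference between the two mechanisms can be illustrated with a small Monte Carlo sketch. It assumes unit-variance Gaussian noise at each site and an illustrative signal mean of 1.0 at both sites; the function names and the choice of separation measure (difference of means divided by the root-mean-square SD) are assumptions made for this example, not the analysis code used in the study.

```python
import random


def mean_var(samples):
    """Return the mean and population variance of a list of numbers."""
    m = sum(samples) / len(samples)
    v = sum((x - m) ** 2 for x in samples) / len(samples)
    return m, v


def separation(signal, noise):
    """Difference of means divided by the root-mean-square SD."""
    mean_s, var_s = mean_var(signal)
    mean_n, var_n = mean_var(noise)
    return (mean_s - mean_n) / ((var_s + var_n) / 2) ** 0.5


def simulate(mu0, mu1, n_trials=50_000, seed=0):
    """Compare one site alone, the sum of both sites, and their max."""
    rng = random.Random(seed)
    noise = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_trials)]
    signal = [(rng.gauss(mu0, 1), rng.gauss(mu1, 1)) for _ in range(n_trials)]
    return {
        "site0": separation([s[0] for s in signal], [n[0] for n in noise]),
        "sum": separation([sum(s) for s in signal], [sum(n) for n in noise]),
        "max": separation([max(s) for s in signal], [max(n) for n in noise]),
    }


result = simulate(mu0=1.0, mu1=1.0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With equal signal strength at both sites, the max operation should separate signal from noise better than a single site does, but less well than the sum, which is the ordering described for Figure 1, e and f.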
A clean test of the question set out here requires conditions in which a subject reads out cortical signals from precisely defined cortical sites while performing a task effectively requiring the brain to “detect” the presence of activity at either site. We know that sensory stimulation (e.g., by presenting visual stimuli) activates a broad and distributed network of cortical areas (Williams et al., 2003; de Lafuente and Romo, 2006), and it is difficult to infer which cortical neurons are being processed during a particular behavioral task and where they are located.
To properly constrain the location (and thereby the separation) of cortical sites carrying the signals to be tested for integration, we used intracortical microstimulation through implanted microelectrode arrays. Intracortical microstimulation is an electrophysiological technique in which pulses of electric current are used to directly activate small groups of neurons at particular cortical sites. Microstimulation has proven to be a valuable technique to test hypotheses related to the processing of cortical activity that are difficult to approach via traditional psychophysical and neural recording techniques (Salzman et al., 1990; Nichols and Newsome, 2002; Deangelis and Newsome, 2004; Afraz et al., 2006).
Subjects.
Two adult male rhesus monkeys (Macaca mulatta, 10 and 15 kg) were used in the experiment. Before training, each monkey was implanted with a head post and a scleral search coil for monitoring eye movements. After the monkey learned the behavioral task, we implanted two 6 × 8 Utah arrays (platinum–iridium-coated, manufactured by Blackrock Microsystems, based on a design by Maynard et al., 1997), one in V1 of each cerebral hemisphere (Fig. 2a). The array electrodes were 1 mm long and arranged on a 400 μm grid. The two arrays were connected to a percutaneous connector that allowed intracortical microstimulation and neural recording. All animal procedures were in accordance with the Institutional Animal Care and Use Committee of Harvard Medical School.
Figure 2. a, Intracortical microstimulation through implanted electrode arrays was used to induce activity at well-defined cortical sites. Different electrode pairs were selected to vary the separation between the activated sites. b, Subjects performed a 2IFC two-site microstimulation detection task. c, For each trial, the microstimulation currents at the two sites (the stimulus) were chosen randomly and independently from a set of values, including zero. Over a session of trials every combination of currents was presented multiple times, filling out a grid of combinations. A subject's performance in an example session is shown. d, The detection performance for every stimulus combination was converted into a d′ value. The single-site d′ values (e.g., d′0 = 2.59 and d′1 = 1.00, indicated by a black border) were used to generate predictions for the two models (in this case, d′sum = 3.59 and d′max = 2.78) that were then compared against the measured d′ (in this case, d′meas = 2.23, indicated by heavy gray shading) when both sites were stimulated with the corresponding current combinations. e, Plot of d′sum (x-axis) against d′meas (y-axis). f, Plot of d′max against d′meas. The black point indicates the example cell shown in d. The example error bars mark 95% confidence intervals obtained by bootstrapping d′meas and d′sum or d′max.
Behavioral task.
The subjects were trained in a two-interval forced-choice (2IFC) detection task. The stimulus was presented randomly in either the first or second interval (but never both), and the subjects were trained to indicate, with an eye movement, in which interval they detected the stimulus to be. The stimulus itself consisted of a pair of components. For the version of the task using visual stimuli, the components were a pair of bright Gaussian blobs. For the version of the task using electrical microstimulation, the components were electrical currents delivered at a pair of cortical sites.
The sequence of events during a trial is illustrated in Figure 2b. Subjects were presented with a blank, uniform gray screen. At the start of a trial, a warning tone was sounded and a fixation point was presented on screen. When the subject's eye position, which was monitored using a scleral search coil, entered a virtual 1° square box centered on the fixation point, another warning tone was played. After the subject had held eye position in the box for 150 ms, the two sampling intervals were presented, each indicated by a warning tone. Each sampling interval lasted 150 ms, and they were separated by 500 ms. On each trial, the compound stimulus was presented in one of the two sampling intervals with a 50% probability for each. The second sampling interval was followed by a 50 ms delay, after which the fixation point was removed and two choice targets were presented. The choice targets were located 5° above and below the original fixation point. The subject was trained to make a saccade to one of the targets to indicate its choice of the stimulus interval. A correct choice was rewarded with a drop of juice, whereas incorrect choices went unrewarded. If the subject did not make a choice within 1 s, the trial was not included in the analysis. At any point during the trial, if the subject broke fixation before the choice targets appeared, the trial was aborted and not included in the analysis.
The strength of each component of the stimulus was chosen randomly and independently from a fixed set of values. The values included zero and were adjusted to span a large range from below threshold to just above. The different combinations of the stimulus components can be visualized as filling out a two-dimensional grid, indexed by the strength of each individual component (for an example of a 6 × 6 grid, see Fig. 2c). Therefore, on any given trial, the subject could not predict whether the stimulus would consist of both components or just one (the other component would be zero) and what the relative strengths of the components would be. Trials with both components set to zero were randomly rewarded. This experimental design was intended to encourage the subject into a consistent behavioral state in which the subject attempts to simultaneously process both components of the stimuli. The short duration of the stimulus interval (150 ms) was designed to minimize the possibility of attentional shifts and eye movements during stimulus presentation (Kröse and Julesz, 1989).
Stimulus and reward delivery, behavioral monitoring, and data acquisition were all under computer control driven by custom software.
Training.
Both subjects used in this experiment had been used previously in a reaction-time change detection task and knew how to hold fixation. Neither had experienced cortical microstimulation before. During initial training, the subjects were presented with the visual version of the task, with each component in a different hemifield. Subjects were started with a 2 × 2 stimulus component grid. After subjects had mastered the concept of the 2IFC task, they were trained on successively larger grids until they would perform the task with 6 × 6 grids that enabled sampling of stimulus values spanning a large subthreshold region. Training continued until detection thresholds approached asymptote.
The subjects were then continued on the visual task for 1 month before being implanted with the Utah arrays and then switched over to the microstimulation experiment. No visual testing was done while the microstimulation experiment was being conducted. When the animals were transitioned to the microstimulation stimuli, their initial performance dropped to chance and then improved quickly over the course of 1 week, consistent with previous reports of microstimulation detection (Ni and Maunsell, 2010). Training using electrical microstimulation was continued until detection thresholds stabilized (1–2 weeks), after which the data reported here were collected.
Intracortical microstimulation.
We used intracortical microstimulation through the implanted microelectrode arrays to directly activate populations of neuronal elements at pairs of well-defined cortical sites. In each array, electrodes were 1 mm long, spaced 400 μm apart and located on a 6 × 8 grid covering a patch of cortex 2 × 3 mm. Initial electrode impedances ranged from 1 to 1.5 MΩ. After the first session of microstimulation, the impedances of the stimulated electrodes dropped to ∼200 kΩ and remained stable at that value for the duration of the whole experiment (several months). By microstimulating through different pairs of electrodes in the implanted arrays, we could test how the integration of cortical signals varies at different separations (Fig. 2a) and at different pairs of locations throughout this patch of cortex. We implanted an array in V1 in each hemisphere, so we could also test signal integration between the two cerebral hemispheres. From the experience of previous researchers using the same array design and implantation technique, we expect the electrode tips were mostly in layers 2–3 (Kelly et al., 2007).
Intracortical microstimulation was delivered as trains of biphasic, positive-polarity-first current pulses. Each pulse phase was 200 μs long, and the train was delivered at 200 Hz. This stimulation protocol has been used previously for intracortical microstimulation in visual cortex (Ni and Maunsell, 2010). On the first few trials of the first day of the switch to the microstimulation version of the task, subjects would make eye movements out of the fixation box during the application of the stimulus, and these saccades had a direction and amplitude consistent with the receptive field locations at the stimulating microelectrodes. On subsequent trials, the animals never made saccades in response to the stimulation because, if the animal broke fixation for any reason, the trial was aborted.
Pulse waveforms were generated by experiment software and converted into current pulses using two stimulators, one for each site (BSI-1; Bak Electronics), each with its own isolated power supply (A14783-115; Acopian Technical Company). The delivered current waveforms were monitored using two custom-built optically isolated current sensors. Pairs of electrodes at the required cortical separation were connected to the stimulators through a custom-built adapter that connected to the array percutaneous plug. Electrode pairs were rotated, and the same electrode was reused only after a gap of at least 7 d.
Data were collected in sessions, with each session consisting of a 5 × 5 or 6 × 6 grid of current combinations for the chosen pair of sites. A complete session consisted of 50 repetitions of each current combination, filling out a grid as shown in Figure 2c. The number in each cell of the grid shows the proportion correct attained by the subject for the 50 repetitions of the indicated combination of stimulation currents. The maximum current used was 25 μA (typically 20 μA; see Fig. 8c).
A single pair of electrodes was tested in a given session. Subjects typically ran two to three sessions in one day, generally with the same separation. The first separation condition tested in subject 1 was interhemispheric. This condition followed the initial training and testing with cross-hemifield visual stimuli pairs and was run for 1.5 months. The interhemispheric condition was not run in subject 2 because of technical problems with one of the implanted arrays. The cross-hemisphere microstimulation was followed by a brief period of visual retraining with one visual stimulus to cue the subject that stimuli would now be presented in one hemisphere only. For the single-hemisphere stimulation, different site separations were tested in different orders.
Data analyses.
We used the discriminability index, d′, as a measure of the internal cortical signals (s0, s1) induced by microstimulation. d′ is defined as the ratio of the separation between signal and noise to the spread of the noise (difference of means divided by SD; Green and Swets, 1966). We used the subjects' detection performance for each combination of component strengths (Fig. 2c) to compute the corresponding d′ values (Fig. 2d). The d′ values from the bottom row (d′0) and leftmost column (d′1) of the grid shown in Figure 2d were obtained during single-site stimulation, in which the stimulation current at one of the sites was set to 0 μA and are referred to as single-site d′ values.
The single-site d′ values (d′0 and d′1) were used to generate predictions for the joint d′model for the two models we consider (d′sum and d′max). The predicted d′model values were plotted against the d′meas values actually obtained from the experiment as shown in Figure 2, e and f. The pairs of measured and predicted d′ values were collected across sessions and then fit with a straight line passing through the origin to obtain a slope m (Fig. 2e,f). The value of this slope indicates how well the data fit the respective model and in what direction the data deviate from the model. A value of m = 1 indicates that the data are a good fit to the model, whereas m > 1 indicates that the data exceed the model prediction and m < 1 indicates that the data fall short of it. Confidence intervals on m were obtained by running 2000 bootstraps on the (d′model, d′meas) point collection. Here we have shown m computed from data from a single session for illustration. For our analysis, we pooled the (d′model, d′meas) point collection from all the sessions conducted with the same electrode separation.
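The slope fit and its bootstrap confidence interval can be sketched as follows. The text does not state the fitting criterion, so this example assumes an ordinary least-squares fit constrained through the origin and a percentile bootstrap; the function names and the data points are hypothetical and serve only as illustration.

```python
import random


def slope_through_origin(points):
    """Least-squares slope m of the line y = m * x for (x, y) pairs."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / sxx


def bootstrap_slope_ci(points, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval on the slope."""
    rng = random.Random(seed)
    slopes = sorted(
        # Resample the point collection with replacement and refit.
        slope_through_origin(rng.choices(points, k=len(points)))
        for _ in range(n_boot)
    )
    lower = slopes[int(n_boot * alpha / 2)]
    upper = slopes[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical (d'_model, d'_meas) pairs, for illustration only.
points = [(0.5, 0.6), (1.0, 0.9), (1.5, 1.6), (2.0, 1.9), (2.5, 2.6), (3.0, 2.9)]
m = slope_through_origin(points)
ci = bootstrap_slope_ci(points)
print(f"m = {m:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

A confidence interval that includes 1 would indicate data consistent with the model being tested, whereas an interval lying entirely above or below 1 would indicate a systematic deviation from it.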
Although both subjects were trained in a 2IFC task, post hoc analysis of the interval biases led us to infer that subject 2 interpreted the task as a single-interval yes/no task (Fig. 3). Subject 2 mostly ignored the stimulus presented in interval A as demonstrated by the almost flat “psychometric curve” for that interval. Instead, subject 2 used the stimulus state in interval B to form a judgment (stimulus present/absent) and then responded to interval B if it judged the stimulus to be present or interval A if it judged the stimulus to be absent in interval B. Therefore, the percentage correct in interval B corresponds to the “hit” rate in a yes/no detection task, whereas the percentage correct in interval A is the correct rejection rate. For this reason, we analyzed data from subject 2 as a yes/no task. Subject 1 had no significant interval bias. Analysis formulae for both 2IFC and yes/no tasks are derived in the following section and applied in the analysis.
Figure 3. Subject 2 demonstrated a strong response bias in the task. The x-axis shows stimulus bin (arranged in order of increasing stimulus strength), and the y-axis shows the proportion correct in the relevant interval. For subject 2, interval A responses remained essentially constant across different stimulus levels, whereas its interval B responses varied systematically with stimulus level. The response pattern of subject 2 indicates that it interpreted the 2IFC task as a single-interval yes/no task, essentially ignoring information from interval A. The data are collapsed across all sessions and across all conditions tested. Each point displayed in each bin contains ∼5000 trials on average.
Data were analyzed using scripts and modules written in the Python programming language (van Rossum and Drake, 2006) using the numpy (Oliphant, 2006), scipy (http://www.scipy.org), and matplotlib (Hunter, 2007) modules and the IPython shell (Pérez and Granger, 2007).
Summation model.
In the summation model, the signals at the two cortical sites are linearly summated to create a compound signal, which is then processed further.
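Let the noise samples n0, n1 at the two sites be Gaussian distributed and the signals s0, s1 be additive over the noise such that
$$n_0, n_1 \sim \varphi(0, \sigma), \qquad s_0 \sim \varphi(\mu_0, \sigma), \qquad s_1 \sim \varphi(\mu_1, \sigma),$$
where φ(μ, σ) is the normal distribution with mean μ and SD σ.
The distributions of the samples of the combined noise and signal are given by n01 and s01 and can be described as follows:
$$n_{01} = n_0 + n_1 \sim \varphi\left(0, \sqrt{2}\,\sigma\right), \qquad s_{01} = s_0 + s_1 \sim \varphi\left(\mu_0 + \mu_1, \sqrt{2}\,\sigma\right).$$
Therefore, the sensitivity measure d′ for the single-site (μ1 = 0 or μ0 = 0) and joint-site (μ0, μ1) stimulation conditions are given by the following:
$$d'_0 = \frac{\mu_0}{\sqrt{2}\,\sigma}, \qquad d'_1 = \frac{\mu_1}{\sqrt{2}\,\sigma}, \qquad d'_{01} = \frac{\mu_0 + \mu_1}{\sqrt{2}\,\sigma},$$
from which we can see that the predicted d′ for the signal summation model is given by the following:
$$d'_{\mathrm{sum}} = d'_0 + d'_1.$$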
Given that all the distributions involved here are Gaussian and have the same variance, we can obtain d′ from subject responses as follows:
where H (“hit”) is the probability correct in the second interval, FA (“false alarm”) is the probability of the subject indicating the second interval when the stimulus was actually presented in the first interval, and z is the z-transform of the probability. The formula is symmetric, and one can define H as the probability correct in the first interval and FA as the probability of making a first interval response when the stimulus was presented in the second interval (Green and Swets, 1966).
Independent detectors (max) model.
In this model, the signals at the two cortical sites are compared against each other, and only the maximum of the two is used for further processing. We cast the max model in a form compatible with signal detection theory, such that we may obtain formulae that express d′ in terms of detection performance for both the yes/no and 2IFC tasks for this model. We first derive an analytic expression for the distribution, ψ(μ0, μ1), of the maximum of two uncorrelated Gaussian distributed variables, x0 and x1, with means μ0 and μ1, each having unit SD. We then show how the expression for ψ may be used to derive the required formulae, given that ψ is fairly close to Gaussian and making allowances for the fact that its SD varies depending on μ0 and μ1.
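By definition, the value of ψ at some point x is given by the following limit:
$$\psi(x) = \lim_{\delta \to 0} \frac{P(x \le x_m < x + \delta)}{\delta},$$
where xm = max(x0, x1), and P (condition) is the probability of “condition” happening.
We can expand the numerator as follows:
$$P(x \le x_m < x + \delta) = P(x \le x_0 < x + \delta)\,P(x_1 < x) + P(x \le x_1 < x + \delta)\,P(x_0 < x) + P(x \le x_0 < x + \delta)\,P(x \le x_1 < x + \delta).$$
The last term is of order δ² and vanishes in the limit.
Now, by definition,
$$\lim_{\delta \to 0} \frac{P(x \le x_0 < x + \delta)}{\delta} = \varphi(x, \mu_0)$$
and
$$P(x_1 < x) = \Phi(x, \mu_1),$$
where φ(x, μ0) is the probability density for a Gaussian random variable with unit SD and mean μ0, and Φ(x, μ1) is the cumulative distribution for a Gaussian random variable with unit SD and mean μ1. Therefore,
$$\psi(x, \mu_0, \mu_1) = \varphi(x, \mu_0)\,\Phi(x, \mu_1) + \varphi(x, \mu_1)\,\Phi(x, \mu_0). \tag{4}$$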
An example of the form of Equation 4 (compared with the distribution of xmax obtained from Monte Carlo simulations) is shown in Figure 4a.
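A comparison of this kind can be sketched in a few lines. The density of the maximum of two independent unit-SD Gaussian variables is φ(x, μ0)Φ(x, μ1) + φ(x, μ1)Φ(x, μ0), where φ and Φ are the Gaussian density and cumulative distribution; the sketch below evaluates that expression and checks it against simulated samples. The function names, bin width, and sample count are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit-SD Gaussian centered on zero


def psi(x, mu0, mu1):
    """Density of max(x0, x1) for independent unit-SD Gaussians."""
    pdf, cdf = STD_NORMAL.pdf, STD_NORMAL.cdf
    return pdf(x - mu0) * cdf(x - mu1) + pdf(x - mu1) * cdf(x - mu0)


def simulated_density(lo, hi, mu0, mu1, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the average density of the max on [lo, hi)."""
    rng = random.Random(seed)
    hits = sum(
        lo <= max(rng.gauss(mu0, 1), rng.gauss(mu1, 1)) < hi
        for _ in range(n_samples)
    )
    return hits / (n_samples * (hi - lo))


mu0 = mu1 = 0.5
for centre in (0.0, 1.0, 2.0):
    analytic = psi(centre, mu0, mu1)
    simulated = simulated_density(centre - 0.05, centre + 0.05, mu0, mu1)
    print(f"x = {centre:.1f}: analytic {analytic:.3f}, simulated {simulated:.3f}")
```

The two columns should agree to within sampling error, which shrinks as the number of simulated samples grows.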
Figure 4. The max model can be expressed in a form compatible with analysis by signal detection theory. a, The distribution of the maximum of two uncorrelated Gaussian random variables obtained from Monte Carlo simulations (shaded area) compared with the analytically derived distribution ψ (black solid line) for μ0 = μ1 = 0.5. b, ψ is itself very close to Gaussian but with an SD that varies depending on μ0 and μ1. c, An analytical expression for the max model d′01 can be obtained using standard signal detection theory given ψ. d′01 takes the form of a nonlinear function of d′0 and d′1. d, The distribution of x2afc obtained from Monte Carlo simulations (shaded area) compared with the analytically derived distribution ξ (black solid line) for μ0 = μ1 = 0.5. e, Because ξ is very close to Gaussian, formulae for d′01 for the 2IFC task can also be directly derived from standard signal detection theory.
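The central moments of ψ(x, μ0, μ1) are given by the following:
$$m_n = \int_{-\infty}^{\infty} (x - \mu)^n\, \psi(x, \mu_0, \mu_1)\, dx, \tag{5}$$
where μ is the mean:
$$\mu = \int_{-\infty}^{\infty} x\, \psi(x, \mu_0, \mu_1)\, dx. \tag{6}$$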
The mean of ψ varies with μ0, μ1, and its SD varies with |μ0 − μ1|. As |μ0 − μ1| gets larger, μ → max(μ0, μ1) and σ → 1. Computations of the skew (the third central moment) and the kurtosis (the fourth central moment) inform us that the deviation of ψ from Gaussian is very slight and vanishes as |μ0 − μ1| increases (Fig. 4b).
Taking advantage of the observation that ψ(x, μ0, μ1) deviates very little from a Gaussian and noting that the SD of signal and noise will not be equal, we can write the following (Simpson and Fitter, 1973):
$$d' = \frac{\mu_s - \mu_n}{\sqrt{\left(\sigma_s^2 + \sigma_n^2\right)/2}}, \tag{7}$$
where μs and σs are the mean and SD of the signal sample, and μn and σn are the mean and SD of the noise samples. These are obtainable from Equations 5 and 6.
We generate a table of d′ by creating a grid of μ0 and μ1 values and generating the corresponding d′ values from Equation 7. The required μs and σs values are computed from Equations 5 and 6 (μn and σn are fixed values, obtained by setting μ0 = μ1 = 0). This gives us a lookup table that relates single-site d′ (d′0 and d′1) to joint d′01 for the max model (Fig. 4c).
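The mapping from single-site d′ values to the joint max-model prediction can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the sensitivity index is taken to be the difference of means divided by the root-mean-square of the signal and noise SDs, the moments of the max distribution are obtained by numerical integration on a fixed grid, and a bisection search stands in for the lookup table. All function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
DX = 0.02
GRID = [-10 + DX * i for i in range(1501)]  # x from -10 to 20


def psi(x, mu0, mu1):
    """Density of the max of two independent unit-SD Gaussians."""
    pdf, cdf = STD_NORMAL.pdf, STD_NORMAL.cdf
    return pdf(x - mu0) * cdf(x - mu1) + pdf(x - mu1) * cdf(x - mu0)


def psi_mean_var(mu0, mu1):
    """Mean and variance of psi by numerical integration on GRID."""
    weights = [psi(x, mu0, mu1) * DX for x in GRID]
    mean = sum(w * x for w, x in zip(weights, GRID))
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, GRID))
    return mean, var


MU_N, VAR_N = psi_mean_var(0.0, 0.0)  # noise-only condition


def dprime_max(mu0, mu1):
    """Assumed unequal-variance sensitivity index for the max model."""
    mu_s, var_s = psi_mean_var(mu0, mu1)
    return (mu_s - MU_N) / ((var_s + VAR_N) / 2) ** 0.5


def mu_from_single_site_dprime(target, lo=0.0, hi=12.0, iters=40):
    """Bisection for the site mean giving the target single-site d'."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if dprime_max(mid, 0.0) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def predict_dprime_max(d0, d1):
    """Joint max-model d' predicted from two single-site d' values."""
    return dprime_max(mu_from_single_site_dprime(d0),
                      mu_from_single_site_dprime(d1))


prediction = predict_dprime_max(2.59, 1.00)
print(f"predicted joint d' = {prediction:.2f}")
```

The inputs in the usage line are the single-site values from the example in Figure 2d, for which the caption reports d′max = 2.78; that value offers a point of comparison for the assumptions made here.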
Yes/no task data and the max model.
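If the subject has an internal criterion c, then the hit rate H and false alarm rate FA are given by
$$z(H) = \frac{\mu_s - c}{\sigma_s}, \tag{8}$$
$$z(\mathrm{FA}) = \frac{\mu_n - c}{\sigma_n}, \tag{9}$$
where z is the z-transform of the probability values.
Now, if we let
$$k = \frac{\sigma_s}{\sigma_n}, \tag{10}$$
by substituting Equations 8 and 9 in Equation 7, we obtain
$$d' = \frac{\sqrt{2}\,\left[k\, z(H) - z(\mathrm{FA})\right]}{\sqrt{1 + k^2}}. \tag{11}$$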
To obtain the single-site d′ from subject response data obtained from the yes/no task, we generate a table that relates H and FA to d′ for the single-site data. Once again, we use Equations 5 and 6 to compute values for μs and σs by varying μ0 and setting μ1 = 0. Simultaneously, we vary c and use Equations 8 and 9 to generate a two-dimensional table in which experimentally measured values of H and FA can be looked up to obtain first μ0 and then μ1, considering the response data for each single site separately.
We then use the obtained (μ0, μ1) pair to compute the value of k as if the data came from the max model and then substitute the experimentally obtained joint H and FA to compute the measured d′01 from Equation 11. This measured d′ is then compared against the predicted d′ as usual.
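The final step, computing a measured d′ from H, FA, and k, can be sketched as follows. The sketch assumes that k is the ratio of the signal SD to the noise SD and that the sensitivity index normalizes the difference of means by the root-mean-square of the two SDs; under those assumptions the expression reduces to z(H) − z(FA) when k = 1. The function name and the example rates are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
z = STD_NORMAL.inv_cdf  # z-transform of a probability


def dprime_unequal_variance(hit_rate, fa_rate, k):
    """Sensitivity from H and FA; k is assumed to be SD(signal) / SD(noise)."""
    return 2 ** 0.5 * (k * z(hit_rate) - z(fa_rate)) / (1 + k * k) ** 0.5


# Equal variances (k = 1) recover the familiar z(H) - z(FA).
print(f"{dprime_unequal_variance(0.80, 0.20, k=1.0):.3f}")

# Unequal variances change the weight given to the hit rate.
print(f"{dprime_unequal_variance(0.80, 0.20, k=1.2):.3f}")
```

Because k depends on the (μ0, μ1) pair through the SD of the max distribution, it has to be computed from the fitted single-site means before the joint H and FA are substituted.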
2IFC task and the max model.
For the 2IFC task, we assume the subject adopts the strategy of indicating the interval that yielded the larger sample (Green and Swets, 1966). For this strategy, the decision variable can be modeled as x2afc, which is the difference of the samples obtained from the two intervals (xmaxA, xmaxB), which themselves are the max of the two individual sites (as discussed above). So we can write, for the distribution ξ of the decision variable x2afc, for the case that the signal is in interval A (such that xmaxA → ψ(x, μ0, μ1) and xmaxB → ψ(x, 0, 0)):
$$\xi(x, \mu_0, \mu_1) = \int_{-\infty}^{\infty} \psi(t, \mu_0, \mu_1)\, \psi(t - x, 0, 0)\, dt. \tag{12}$$
An example of the form of Equation 12 (compared with the distribution of x2afc obtained from Monte Carlo simulations) is shown in Figure 4d.
For the 2IFC task, the subject ends up discriminating two categories: (1) signal in interval A; and (2) signal in interval B. The distribution for signal in interval A is the ξ(x, μ0, μ1) just derived. The distribution for signal in interval B is simply the mirror image of ξ, i.e., ξ(−x, μ0, μ1). Thus, for a given signal strength [given by the pair (μ0, μ1)], the 2IFC task presents itself as the discrimination of two very Gaussian-like distributions (Fig. 4e) with the same SD. We can therefore simply use the regular formula
to compute d′ from the task performance data and then compare it against predictions from the d′ table developed previously. Here, hits are defined as correct responses to stimuli in interval B, whereas false alarms are responses to interval B when the stimulus was presented in interval A, although the definitions can be swapped.
Results
Integration of cortical signals transitions between two modes over a very small distance
Figure 5, a and b, shows the collected data from all sessions at 400 μm electrode separation (the smallest allowed by the implanted arrays). The d′ measured from the subject's behavioral performance (d′meas) closely matched the prediction of the summation model. The best-fit line through the data comparing d′sum and d′meas had a slope indistinguishable from 1 (Fig. 5a). Correspondingly, when compared with the d′max predictions for the max model, the measured d′ values were systematically larger (m > 1; Fig. 5b) and indicated a poor fit to this model. Thus, the integration process appears to perform a linear summation of signals that arise from nearby sites in cortex.
Figure 5. Comparison of model predictions (x-axis) against measured data (y-axis) for two different cortical site separations. The slope m of the best-fit line passing through the origin is shown with 95% confidence intervals obtained by bootstrapping. The black line is the best-fit line, and the gray band shows its 95% confidence intervals. The dotted line has unity slope, indicating where the best-fit line should lie if the data match the model compared. a, b, Data from all sessions and both subjects in which electrode separation was 400 μm compared against the d′ summation and max model predictions. The data are well fit by a d′ summation mechanism. c, d, Data for cortical sites located in different hemispheres (for s1 only). The data are well fit by a mechanism following a max model.
When the two active cortical sites were located in different hemispheres, however, signal integration was poorly described by d′ summation (Fig. 5c). The d′ measured from the subjects' performance in the integration task fell below that predicted by the summation model. Rather, the integration of signals arising in different hemispheres is well described by the max model (Fig. 5d).
Many studies have suggested that information processing between the two hemispheres might be uncharacteristic of processing within a single hemisphere (for review, see Lassonde and Ouimet, 2010). Therefore, we also explored how integrative processes vary with cortical separation within the same hemisphere. In addition to 400 μm, we also selected pairs of cortical sites at separations of 800, 1600, and 2400 μm and tested what mechanism of signal integration best fits the subjects' performance (Fig. 6).
Integration of cortical signals transitions between two modes over a very small distance. The graphs show m for the d′ summation (a) and max (b) models plotted against the cortical site separations tested. The gray points show data from the two subjects separately (s1, light gray; s2, dark gray), and the black points show the combined data. At the smallest separation (400 μm), signal integration is well described as d′ summation. However, by 2400 μm, signal integration is better described as a max operation, and this mode is the same as that for integration across hemispheres.
We found that the integration mode deviated from signal (d′) summation at separations >400 μm. By 2400 μm, the integration mode was indistinguishable from the max model, the same as that for interhemispheric separations.
The data presented in Figure 6 were obtained by pooling data across all sites having the same separation. We did not collect enough data from each pair of electrodes to get small enough confidence intervals to test intersite variability. This leaves open the possibility that, in the transition zone (from 800 to 1600 μm), there is a mixture of site pairs that show summation and max processing. Sites outside the transition zone (400 and ≥2400 μm) match up well to either the summation or the max models. Although we cannot rule out the possibility that, at the closer separations, sites are presenting as a mixture of super-summation and less than summation, we think it unlikely that we have sampled sites at just the right mixture to fit the pattern expected from a summation model. Similarly, at further separations, we think it unlikely that the collected data from the sites fit the pattern expected from a max model but with individual sites being a mixture of, say, summation and something worse than max.
Signals from separated pairs of small visual stimuli are processed independently
Previous work on humans has shown that spatially separated simple visual stimuli (small dots) are integrated through a process of probability summation, whereas such stimuli, when presented dichoptically at corresponding retinal locations in the two eyes, show a d′ summation (Kristofferson and Dember, 1958; Green and Swets, 1966). Probability summation and the max model can be shown to be formally identical under simple assumptions (Pelli, 1985).
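The equivalence between probability summation and the max model can be illustrated numerically. The sketch below uses one simple set of assumptions chosen for illustration (independent, unit-variance Gaussian signals at the two sites and a single fixed criterion); it is not the derivation given by Pelli (1985), and all names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def p_detect(d, criterion):
    """P(signal > criterion) for a unit-variance Gaussian signal with mean d."""
    return 1.0 - NormalDist(mu=d, sigma=1.0).cdf(criterion)


def probability_summation(p0, p1):
    """Detection probability when either of two independent detectors fires."""
    return 1.0 - (1.0 - p0) * (1.0 - p1)


def max_rule_monte_carlo(d0, d1, criterion, n_trials=200_000, seed=0):
    """Fraction of simulated trials on which max(x0, x1) exceeds the criterion."""
    rng = random.Random(seed)
    n_yes = sum(
        max(rng.gauss(d0, 1.0), rng.gauss(d1, 1.0)) > criterion
        for _ in range(n_trials)
    )
    return n_yes / n_trials


# Illustrative signal strengths and criterion (not values from the study).
analytic = probability_summation(p_detect(1.0, 1.5), p_detect(0.5, 1.5))
simulated = max_rule_monte_carlo(1.0, 0.5, 1.5)
```

Under these assumptions the two quantities agree up to sampling error, because the maximum of the two signals exceeds the criterion exactly when at least one of them does.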
We ran the two subjects on the visual version of the experiment and analyzed the data in the same manner as described for the microstimulation version of the experiment (Fig. 7). The two visual components (small bright Gaussian blobs on a gray background) were placed in opposite hemifields at 5° separation to obtain a cross-hemifield condition. They were placed in the same hemifield at 1° separation, symmetrically above and below the horizontal meridian at 5° eccentricity to obtain the “far” condition and at 10° eccentricity to obtain the “near” condition. The labels far and near refer to the relative (inferred) separation of the stimulus representations in V1 (1.4 and 0.63 mm, respectively) based on estimates of average cortical magnification for this species. The magnification factor can vary by a factor of 3 across individuals, so these estimates should be treated very cautiously (Van Essen et al., 1984).
Consistent with previous studies, signals from spatially separated visual stimuli are processed independently. The graphs show m for the d′ summation (a) and max (b) models plotted against the visual site separations tested. The far visual stimuli were placed 1° apart at 5° eccentricity, symmetrically above and below the horizontal meridian. The near visual stimuli were placed 1° apart at 10° eccentricity, symmetrically above and below the horizontal meridian. The cross-hemifield stimuli were placed in different hemifields at 5° eccentricity. The near and far conditions were tested with one subject only.
Consistent with the previous findings in humans, we found that integration of signals from the pairs of simple visual components fit well with the max model and not with the d′ summation model. We did not attempt to move the components closer because such experiments would be especially difficult to interpret as a result of the combined effects of small eye movements and peripheral processing. However, the results of Kristofferson and Dember (1958), who used dichoptic stimulation (data reanalyzed by Green and Swets, 1966), suggest that stimuli that are inferred to generate overlapping representations, at least in V1, are processed in a linear d′ summation manner in humans, which is consistent with the results obtained from the microstimulation experiment with electrodes spaced at 400 and 800 μm.
d′ summation is not attributable to current (stimulus) summation
Past experimental and theoretical work has shown that the direct electrical effect of microstimulation attributable to current spread is extremely localized around the electrode tip. Current summation should affect measurements like ours only when the separation between the stimulated sites is of the order of 10–50 μm, which is well below the closest separation we could test (Stoney et al., 1968; Tehovnik et al., 2006; Histed et al., 2009).
We verified this by examining whether the d′ summation obtained at the closest separation (400 μm) could be explained by the summation of currents injected from the two electrodes. The single-site stimulation data, which relate the current at a single stimulation site (I0 or I1) to its detectability (d′0, d′1), were used to generate predictions for a current summation model. If we let d′ = ϴ(I) be the empirically obtained relationship between stimulation current and measured d′, then the joint d′ prediction for the current summation model is obtained from Equation 14:
d′currentsum = ϴ(I0 + I1)    (14)
We see from Figure 8a that current summation is a poor description of the measured data at 400 μm. Behavioral performance was far below the level expected if there had been current summation. Notably, the plot of d′currentsum against d′meas presents a large offset from the origin that is quite different from the behavior of the other two models considered. The reason for this offset can be understood by considering the nature of ϴ, which is plotted in Figure 8b. This relationship is nonlinear and well fit by a piecewise linear model (black line). d′ increases linearly with current but only after an offset. This initial nonlinearity (which can be interpreted as a threshold) causes the shifted shape of the scatter plot shown in Figure 8a, which is quite different from the observed data (Fig. 5).
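The origin of the offset can be sketched with a toy version of ϴ. The snippet assumes that ϴ is zero below a threshold current and rises linearly above it, and that the current summation prediction is ϴ applied to the summed current; the threshold and slope values are hypothetical and are not the fitted values from Figure 8b.

```python
def theta(current_ua, threshold_ua=8.0, slope_per_ua=0.15):
    """Toy piecewise linear current-to-d' mapping (hypothetical parameters)."""
    return max(0.0, slope_per_ua * (current_ua - threshold_ua))


def predict_current_sum(i0_ua, i1_ua):
    """Current summation: the two currents add before the nonlinearity."""
    return theta(i0_ua + i1_ua)


def predict_dprime_sum(i0_ua, i1_ua):
    """d' summation: each site passes through the nonlinearity, then d' adds."""
    return theta(i0_ua) + theta(i1_ua)


# Two subthreshold currents: each site alone is undetectable, yet current
# summation predicts a clearly detectable joint stimulus.
subthreshold_dprime_sum = predict_dprime_sum(6.0, 6.0)    # 0.0
subthreshold_current_sum = predict_current_sum(6.0, 6.0)  # greater than 0
```

With a thresholded ϴ, current summation predicts a positive joint d′ even when the d′ summation prediction is zero, which is the kind of shift away from the origin described for Figure 8a.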
d′ summation is not current (stimulus) summation. a, Comparison of measured joint d′ with d′ predicted from current summation. b, Plot of d′ as a function of microstimulation current. The raw data are shown as translucent gray disks. The large gray disks with error bars show the median and interquartile range of d′ binned against the stimulation current. The black line shows a piecewise linear fit made to the raw data. c, Frequency plot of currents used in the experiment. Although some data were obtained at 30 μA, the bulk of the sessions were restricted to 10–20 μA at which the behavioral dynamic range was located.
The max model cannot be explained by attention switching or bias
In the task presented to the subject, the stimulus level at either site was randomly chosen from a set of stimulus strengths that included zero and was adjusted to span a large range from below threshold to just above. On any given trial, the subject could not predict whether it would receive stimuli at both sites or just one of the sites and what the relative strengths of the stimuli would be. This experimental design was intended to force a consistent behavioral state in which the subject always attempts to simultaneously access information from two cortical sites.
The subject is free, however, to adopt various strategies, in which it favors one cortical site over the other (by unevenly allocating attention or an analogous resource that affects processing). These strategies fall into two classes: (1) consistent favoring, in which the favored site is constant over a session; and (2) random switching, in which the favored site switches randomly from trial to trial. The following mathematical analysis indicates that such strategies would not produce the pattern of results we have obtained. Most importantly, the interpretation of processing by independent detectors is not an artifact of an attention switching strategy in which the subject could engage.
Let us define the following terms:
So, over a session, the d′ we actually measure are weighted averages of the two states (favor 0 and favor 1) and can be expressed as follows:
No effect of favoring on d′ summation model
For d′ summation, we can write
Substituting this in Equation 17, we obtain
which means that an underlying d′ summation mechanism will still look like d′ summation in the measured data even if the subject plays games by favoring one side or the other, either consistently, or through random switching.
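The algebra behind this conclusion can be sketched as follows. The notation is an assumption introduced for illustration and may differ from the original equations: γ is the fraction of trials on which site 0 is favored, and the subscripts f and n mark the d′ of a site on trials when it is favored or not favored.

```latex
% Assumed notation (illustrative; may differ from the original equations):
%   \gamma             fraction of trials on which site 0 is favored
%   d'_{0f},\ d'_{0n}  d' of site 0 when favored / not favored
%   d'_{1f},\ d'_{1n}  d' of site 1 when favored / not favored
\begin{align*}
d'_{0}  &= \gamma\, d'_{0f} + (1-\gamma)\, d'_{0n}, \\
d'_{1}  &= \gamma\, d'_{1n} + (1-\gamma)\, d'_{1f}, \\
d'_{01} &= \gamma\,\bigl(d'_{0f} + d'_{1n}\bigr)
         + (1-\gamma)\,\bigl(d'_{0n} + d'_{1f}\bigr) \\
        &= \bigl[\gamma\, d'_{0f} + (1-\gamma)\, d'_{0n}\bigr]
         + \bigl[\gamma\, d'_{1n} + (1-\gamma)\, d'_{1f}\bigr] \\
        &= d'_{0} + d'_{1}.
\end{align*}
```

Because every step is linear in the session-averaged quantities, the identity holds for any value of γ, which covers both consistent favoring and random switching.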
No effect of consistent favoring on the max mode of integration
For the max model, the joint d′01 is related to the single-side d′0, d′1 through a nonlinear function, which we shall call Ω, such that:
For consistent favoring, because there is no switching of the favored side over a session, we have γ = 1, and we can write Equation 17 as follows:
This means that an underlying max mechanism will still look like a max mechanism in the measured data even if the subject consistently favors one side over the other during a session.
Random switching can make an underlying max mechanism look like a d′ summation mechanism
From Equation 17, we can see for the max model that we get
Now, we can infer that the effect of randomly switching the favored side will increase with the difference between the favored and nonfavored d′. As an extreme case, we can consider the case in which
i.e., the nonfavored site does not get processed at all.
We then have
which implies that the extreme effect of randomly switching the favored side is to make an underlying max mechanism look like a d′ summation mechanism. In general, a switching strategy will make any nonlinear mechanism appear closer to the linear d′ summation mechanism than it really is. (This can be seen by noting that Equation 19 does not depend on the actual form of Ω.)
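A small numerical check can make the favoring argument concrete. The sketch assumes that the measured single-site and joint d′ values are averages over the two favoring states, weighted by the fraction γ of trials on which site 0 is favored. It uses the plain maximum of the two d′ values as a stand-in for Ω. The actual Ω of the max model is a different nonlinear function; the stand-in is used only because it satisfies Ω(x, 0) = x, the property the extreme case relies on. All names and values are hypothetical.

```python
def measured_dprimes(gamma, d0_fav, d0_non, d1_fav, d1_non, combine):
    """Session-level d' values as weighted averages over favoring states.

    gamma:   fraction of trials on which site 0 is favored (assumed notation)
    combine: function giving the joint d' from the two single-site d' values
    """
    d0 = gamma * d0_fav + (1 - gamma) * d0_non
    d1 = gamma * d1_non + (1 - gamma) * d1_fav
    d01 = (gamma * combine(d0_fav, d1_non)
           + (1 - gamma) * combine(d0_non, d1_fav))
    return d0, d1, d01


def omega_stand_in(a, b):
    """Hypothetical stand-in for the max-model function; not the real one."""
    return max(a, b)


# Consistent favoring (gamma = 1): the measured joint d' still equals
# the stand-in applied to the measured single-site values.
c0, c1, c01 = measured_dprimes(1.0, 2.0, 1.0, 2.0, 1.0, omega_stand_in)

# Extreme random switching: the nonfavored site is not processed at all,
# and the measured joint d' equals the sum of the measured single-site d'.
s0, s1, s01 = measured_dprimes(0.5, 2.0, 0.0, 2.0, 0.0, omega_stand_in)

# Partial switching: the measured joint d' lies between the max-like
# and the summation predictions.
p0, p1, p01 = measured_dprimes(0.5, 2.0, 1.0, 2.0, 1.0, omega_stand_in)
```

In the partial case the measured joint d′ is 2.0, whereas the stand-in applied to the measured single-site values gives 1.5 and summation gives 3.0, so the switching strategy shifts the apparent mechanism toward summation.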
The obtained results cannot be an artifact of favoring signals from one site over another
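In the results that we have obtained (Fig. 6) when the cortical sites are located farther apart, our experimental observations indicate a max mechanism, whereas for closer cortical sites (within ∼1 mm), our observations indicate a d′ summation mechanism. Site favoring is likely to be greater at larger site separations than at smaller ones. Because random switching makes a max mechanism look more like d′ summation, a switching strategy would push the data at larger separations toward d′ summation, which is the opposite of the pattern we observed.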
We therefore conclude that the most parsimonious explanation of our data is that our subjects did not engage, to any appreciable extent, in any games involving switching the favored site from trial to trial.
Discussion
The mechanisms and principles by which signals from multiple brain regions interact are a key question for neuroscience. In this study, we investigated some of the principles that govern the integration of signals from pairs of cortical sites. We used intracortical microstimulation to insert signals directly into pairs of sites at known separations in cortex and trained subjects to report activation of either cortical site. We tested how closely the integrative processes in the brain approach linear summation—the ideal observer solution for a task like this that involves pooling signals from multiple sources in the presence of uncorrelated or correlated noise (Duda et al., 2001; Chen et al., 2006; Pelli et al., 2006). We found that, although the integration of cortical activity from sites within 1 mm is consistent with a linear summation operation, integration of activity from pairs of sites at greater distances is instead consistent with a max operation performed on the individual single-site signals.
The summation and max operations are optimal under different classes of constraints. Although linear summation is optimal for detection in noise, the max operation has been proposed as a biologically plausible mechanism for stabilizing neuronal responses (creating invariant neuronal responses) when a target stimulus is distorted or mixed with distractor stimuli, a condition in which linear weighted summation is thought to perform poorly (Riesenhuber and Poggio, 1999; Maass, 2000). Studies of the responses of single cortical neurons to multiple visual stimuli placed in their receptive field reveal a wide variety of computations. Some experiments report primarily max-like responses (often described as winner-take-all) in both striate and extrastriate visual cortex, as well as posterior parietal cortex (Gawne and Martin, 2002; Rolls et al., 2003; Lampl et al., 2004; Oleksiak et al., 2011). Other studies report responses consistent with weighted linear summation (Zoccolan et al., 2005; Ghose and Maunsell, 2008). A common hypothesis for the generation of all such single-neuron responses is local cortical interactions involving gain normalization mechanisms.
However, the mechanisms that generate new response properties as signals pass to successive areas in visual cortex (receptive field elaboration) might be distinct from the mechanisms involved in the flexible, task-dependent integration of signals from multiple distant cortical sites (Zeki and Bartels, 1999). We expect that, for the conditions for which we observe a max operation, given the low currents used, the arbitrariness of the site pairs selected, and the relatively large separation between sites, it is extremely unlikely that we are driving a third cortical site based on direct corticocortical connections from the two stimulated sites. Instead, we propose that our experiment, using pairs of sites that differ from session to session, tests the mechanisms in the brain that flexibly link multiple cortical sites on the fly (Singer and Gray, 1995; Roskies, 1999). In our view, the finding of a max operation in this experiment suggests that the individual signals from separated cortical sites are effectively being combined after, rather than before, passing through a decision criterion, referred to in the literature as probability summation (Pirenne, 1943; Pelli, 1985; Treisman, 1998).
We propose that sensory cortex, at least, is functionally composed of modules, ∼1 mm or less in size. The activity in each of these cortical modules is accessed independently and in parallel by higher brain functions, such as associative learning and memory. In the joint-detection task presented here, because the signals from each cortical site individually drive cognitive processes, they are combined equivalently to an OR gate, with the behavioral decision being determined when either site reaches criterion (Fig. 1). This is the optimum operation a mechanism based on independent detectors can perform in a joint-detection task. The proposal for parallel but independent streams of processing from cortex is consistent with past findings about the localization of cortical function and studies of cortical lesions, although we are proposing a subdivision at much finer scales than is commonly hypothesized (Goldman-Rakic, 1988).
We propose that the summation we observe at close separations (<1 mm) reflects local processing within a cortical module. The size of our hypothetical module corresponds well with the range of dense local axonal projections from V1 neurons (Stettler et al., 2002) and the proposed size of a cortical hypercolumn (Hubel and Wiesel, 1974; Ts'o et al., 2009). The dense local interconnectivity among neurons with diverse neural responses within this zone is thought to enable a rich set of cortical computations.
Our experiment did not test what the qualia of the stimulation were to the subjects. Previous work in V1 of human subjects suggests that, when microstimulation was delivered at sites farther apart than 1 mm, subjects perceived two distinct phosphenes, whereas stimulation at sites closer than 0.5 mm was perceived as a single, brighter phosphene (Bak et al., 1990; Schmidt et al., 1996). The transition zone between summation and independent processing in our data approximately corresponds with the transition zone between fused and discrete percepts in the human data, raising the hypothesis that the two modes of signal integration might manifest as such perceptual differences. In addition, our experiment does not test whether signals from within a module can be processed separately or whether the local cortical interconnections group the signals together such that they are always summated and inseparable, leading to a certain level of graininess of readout from cortex. Some psychophysical results related to visual crowding and vernier acuity seem to indicate such graininess (Parkes et al., 2001; Duncan and Boynton, 2003).
The current literature on microstimulation indicates that current summation is unlikely at the closest separations we tested (Stoney et al., 1968; Tehovnik et al., 2006; Histed et al., 2009). However, we note that, if current summation did play an unexpected and unlikely role at the closer separations, coincidentally “topping up” the max interaction in the right amount to masquerade as linear summation, it would only affect our conclusions quantitatively, requiring us to further reduce the size of the modules over which linear summation occurs.
The difference between the max and linear summation models in terms of behavioral performance is small, amounting to 4% in the middle of the dynamic range and less elsewhere. This means that there is little incentive for a subject (in terms of greater rewards) to use linear summation rather than independent processing, although the former is better. Therefore, it is possible that the subjects in the experiment chose to use a max strategy when actually, if sufficiently pushed, they could have used linear summation to process widely separated cortical sites. This still points to a cost of some sort to linear summation when the sites are farther apart but not when they are close together. It is a cost to which both humans processing visual stimuli (Kristofferson and Dember, 1958; Green and Swets, 1966) and monkeys processing direct cortical activation (our current work) seem to respond in an extreme manner: by dropping an optimal linear combination strategy for a very different nonlinear strategy based on independent detectors. This is especially remarkable given that individual neurons in V4 and inferior temporal cortex, which presumably integrate signals from neurons located far apart in early visual cortex, have been shown to perform linear summation on such signals (Zoccolan et al., 2005; Ghose and Maunsell, 2008).
Our finding raises the possibility of interpreting a whole class of psychophysical phenomena as the result of a constraint in signal integration from spatially separated loci on cortex. When presented with a task involving joint audiovisual stimulation, humans behave as if they process the visual and auditory components of the stimulus independently even when the two signals come from the same physical spatial location (Wuerger et al., 2003). We interpret these results as reflecting constraints on how the brain processes signals that are initially represented far apart in cortex, in different primary sensory cortices. When presented with the task of reading letters and words degraded by visual noise, humans behave as if they process such symbols as independent combinations of elementary features, although they are highly practiced in reading (Pelli et al., 2003). Under our framework, we interpret such results as reflecting the joint processing of spatially separated cortical representations of the elementary features (lines and curves) making up the symbols.
Interestingly, although previous experiments show that pairs of visual stimuli presented dichoptically at the same corresponding retinal locations result in d′ summation (Kristofferson and Dember, 1958), studies with compound gratings suggest that even completely spatially overlapping stimuli can result in independent processing if they are separated in feature space (Sachs et al., 1971). In our own visual version of the experiment in which we placed pairs of simple visual stimuli on a video screen, we found independent processing of dot stimuli even at the closest separation, which we estimated to be 0.63 mm on V1. Taking all these results together, we propose that, although cortical separation is a sufficient condition for independent processing of neural signals, it is not a necessary one. Although small, spatially restricted stimuli generate very broad pools of activation in V1 (Grinvald et al., 1994), if they are part of functionally separated circuits, such as neurons representing different spatial frequencies, or originate at distinct spatial locations, the neurons can form precise mappings that preserve the independence of their original separate signals.
Our electrical stimulation experiment suggests that 0.63 mm is part of the transition zone from independent processing to d′ summation (Fig. 6), whereas our visual experiment suggests that the transition zone is even tighter (Fig. 7). One possible explanation for this difference rests on the observation that intracortical stimulation activates an indiscriminate group of cortical neurons around the electrode tip (Tehovnik et al., 2006; Histed et al., 2009). We propose that this arbitrary activation of two groups of overlapping cortical neurons that potentially form fragments of multiple local cortical microcircuits results in an average signal that manifests as d′ summation in the behavioral report in our experiment. In contrast, the precise cortical activation resulting from pairs of dots presented visually functionally preserves the original spatial separation on the retina, although the neurons representing each stimulus component in cortex are physically intercalated. This results in a behavioral report consistent with processing by parallel, independent circuits.
Our study was restricted to signals inserted into V1. It is possible that processing mechanisms operate differently in different cortical areas, especially given that the external connectivity of such areas differs greatly. Conversely, the similar architecture that defines cerebral cortex suggests that its computations may face similar constraints throughout. Additional experiments are needed to elucidate whether pairs of cortical signals from other, functionally and anatomically different, cortical areas are also processed as parallel but independent streams until they activate decision-making structures.
Footnotes
This work was supported by the Howard Hughes Medical Institute and National Institutes of Health Grant R01EY005911. We thank David Averbukh for constructing the optically isolated current monitors.
The authors declare no competing financial interests.
Correspondence should be addressed to Kaushik Ghose at his present address: Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114. kaushik.ghose{at}gmail.com