Abstract
Choosing an action in response to visual cues relies on cognitive processes, such as perception, evaluation, and prediction, which can modulate visual representations even at early processing stages. In the mouse, it is challenging to isolate cognitive modulations of sensory signals because concurrent overt behavior patterns, such as locomotion, can also have brainwide influences. To address this challenge, we designed a task in which head-fixed mice had to evaluate one of two visual cues. While their global shape signaled the opportunity to earn reward, the cues provided equivalent local stimulation to receptive fields of neurons in primary visual (V1) and anterior cingulate cortex (ACC). We found that mice evaluated these cues within a few hundred milliseconds. During this period, ∼30% of V1 neurons became cue-selective, with preferences for either cue being balanced across the recorded population. This selectivity emerged in response to the behavioral demands because the same neurons could not discriminate the cues in sensory control measurements. In ACC, cue evaluation affected a similar fraction of neurons; emerging selectivity, however, was stronger than in V1, and preferences in the recorded population were biased toward the cue promising reward. Such a biased selectivity regime might allow the mouse to infer the promise of reward simply from the overall level of activity. Together, these experiments isolate the impact of task demands on neural responses in mouse cerebral cortex, and document distinct neural signatures of cue evaluation in V1 and ACC.
SIGNIFICANCE STATEMENT Performing a cognitive task, such as evaluating visual cues, not only recruits frontal and parietal brain regions, but also modulates sensory processing stages. We trained mice to evaluate two visual cues, and show that, during this task, ∼30% of neurons recorded in V1 became selective for either cue, although the cues provided equivalent visual stimulation. We also show that, during cue evaluation, mice frequently move their eyes, even under head fixation, and that ignoring systematic differences in eye position can substantially obscure the modulations seen in V1 neurons. Finally, we document that modulations are stronger in ACC, and biased toward the reward-predicting cue, suggesting a transition in the neural representation of task-relevant information across processing stages in mouse cerebral cortex.
Introduction
Goal-directed behavior, such as standing in line for a restaurant table, relies on cognitive processes allowing us to recognize the current situation, evaluate the costs and benefits of potential actions, and eventually decide on a specific course of action, here, keep standing or move on. The neural signals reflecting such cognitive processes are often studied in parietal (Shadlen and Kiani, 2013; Hanks et al., 2015; Goard et al., 2016; Licata et al., 2017; Krumin et al., 2018; Pho et al., 2018) and frontal areas of cerebral cortex (Duan et al., 2015; Hanks et al., 2015; Goard et al., 2016; Kim et al., 2016; Murray and Rudebeck, 2018; Pho et al., 2018), yet they can have widespread impact, reaching down to the earliest stages of cortical sensory processing (Chen et al., 2008; Briggs et al., 2013; Saleem et al., 2018). Measuring how cognition affects sensory responses is essential to understand how the same physical stimulus can give rise to different percepts.
Cognition has long been known to shape responses of visual neurons; its impact has most elegantly been demonstrated in nonhuman primates, where the level of behavioral control remains unmatched. Excellent examples are studies on covert attention (for review, see Maunsell, 2015; Moore and Zirnsak, 2017), where monkeys are trained to direct attention to a visual stimulus without moving their eyes. Attentional effects are then isolated by comparing conditions with identical sensory stimulation, levels of engagement, task difficulty, and behavioral responses. How cognition influences early vision is increasingly studied in the mouse (Gavornik and Bear, 2014; Zhang et al., 2014; Poort et al., 2015; Wimmer et al., 2015; Fiser et al., 2016; Goard et al., 2016; Henschke et al., 2020; Speed et al., 2020); turning to the mouse brings powerful genetic tools, but reaching a level of behavioral control strong enough to isolate cognition is a challenge. A standard paradigm for mice is the Go/No-go task with head fixation, where the mouse is required to respond to a specific stimulus, and withhold the response to other stimuli (e.g., Andermann et al., 2010; Histed et al., 2012; Lee et al., 2012; Glickfeld et al., 2013; Goard et al., 2016; Montijn et al., 2016; Ramesh et al., 2018; Neske et al., 2019). These tasks have the advantage that mice can learn them in a reasonable time. However, if the animal's behavioral report, or the sensory drive provided by each stimulus, grossly differs between Go and No-go conditions, it can be difficult to isolate neural signatures of cognition. Any such efforts are further complicated by the fact that rodents perform eye movements when viewing visual stimuli (Sakatani and Isa, 2007; Wallace et al., 2013; Samonds et al., 2018).
Here, we focus on one cognitive process, evaluating visual cues, and study how this process affects stimulus representations in V1 and ACC of the mouse. We trained mice to engage in goal-directed behavior, where the commitment to a potentially rewarding action had to rely on visual cues. These cues differed in terms of global shape but provided equivalent stimulation to locally confined receptive fields (RFs) in cortex. Under equivalent visual stimulation, with controlled locomotion behavior and matched eye positions, we found that, during cue evaluation, about one-third of V1 neurons responded more strongly to one or the other of the two locally identical visual cues, and their preferences were evenly split. In ACC, cue evaluation affected activity in a similar fraction of neurons; here, however, the effect was substantially stronger and preferences in the recorded population were biased in favor of the cue promising reward. Together, these results reveal distinct signatures of cue evaluation in mouse visual and cingulate cortex.
Materials and Methods
We used 19 mice (3-4 months old, 11 males and 8 females): 9 of the C57BL/6J WT strain and 10 of the PV-Cre strain B6;129P2-Pvalbtm1(cre)Arbr/J (JAX stock #008069). All procedures were conducted in compliance with the European Communities Council Directive 2010/63/EU and the German Law for Protection of Animals; they were approved by the local authorities following appropriate ethics review.
Surgical protocol
Anesthesia was induced with isoflurane (3%) and maintained throughout the surgery (1.5%). A small S-shaped aluminum headpost was attached to the anterior part of the skull (OptiBond FL primer and adhesive, Kerr dental; Tetric EvoFlow dental cement, Ivoclar vivadent); two miniature screws (00-96 × 1/16 stainless-steel screws, Bilaney), serving as reference and ground for electrophysiological recordings, were implanted over the cerebellum. Before surgery, an analgesic (buprenorphine, 0.1 mg/kg s.c.) was administered, and the eyes were protected with ointment (Bepanthen). The animal's temperature was kept at 37°C via a feedback-controlled heating pad (WPI). Antibiotics (Baytril, 5 mg/kg s.c.) and a longer-lasting analgesic (Carprofen, 5 mg/kg s.c.) were administered for 3 d after surgery. Mice were given 7 d to recover before they were habituated to the experimental setup. Before electrophysiological recordings, a craniotomy (1.5 mm²) was performed over V1 (3 mm lateral to the midline, 1.1 mm anterior to the transverse sinus) (Wang et al., 2011) or ACC (0.3 mm lateral to the midline, 0.2 mm anterior to bregma) (Zhang et al., 2014). The craniotomy was sealed with Kwik-Cast (WPI), which was removed and reapplied before and after each recording session.
In PV-Cre mice, we expressed channelrhodopsin (ChR2) by injecting the adeno-associated viral vector rAAV5.EF1a.DIO.hChR2(H134R)-EYFP.WPRE.hGH (Penn Vector Core, University of Pennsylvania). The vector was injected through a small craniotomy into V1 of anesthetized mice. We used a Picospritzer III (Parker) to inject the virus at multiple depths while gradually retracting the pipette. We expressed ChR2 to identify, in our recorded population, inhibitory interneurons (Kvitsiani et al., 2013). The number of identified interneurons was too small, however, to provide a substantial contribution to this report.
Histology
Histologic reconstructions were used to verify recording sites from ACC. Before recording from ACC, electrodes were coated with a yellow-shifted fluorescent lipophilic tracer (DiI; DiIC18(3), Invitrogen). After recordings were terminated, mice were perfused transcardially and the brain was fixed in a 4% PFA/PBS solution for 24 h, before being stored in PBS. Brains were sliced at 50 µm using a vibratome (Microm HM 650 V, Thermo Scientific), mounted on glass slides with Vectashield DAPI (Vector Laboratories), and coverslipped. Slides were inspected for blue DAPI and yellow DiI using a fluorescent microscope (Zeiss Imager.Z1m).
Experimental setup and visual stimulus
Mice were put on an air-suspended Styrofoam ball (n = 11) or a mounted plastic disk (n = 8) and head-fixed by clamping their headpost to a rod. Movements of the ball were recorded at 90 Hz by two optical mice connected to a microcontroller (Arduino Duemilanove); disk rotation was measured with a rotary encoder sampling at 100 Hz (MA3-A10-125-N Magnetic Encoder, Pewatron). A computer-controlled syringe pump (Aladdin AL-1000, WPI) delivered precise amounts of water through a drinking spout, which was positioned in front of the animal's snout. The drinking spout was present only during the foraging task experiments and was removed during measurements in sensory control conditions. Eye movements were monitored under infrared illumination using a zoom lens (Navitar Zoom 6000) coupled to a camera (Guppy AVT, frame rate 50 Hz). The setup was enclosed with a black fabric curtain. Visual stimuli were generated with custom-written software (https://sites.google.com/a/nyu.edu/expo/home) and presented on an LCD monitor (Samsung 2233RZ, display size 47 × 30 cm, refresh rate 120 Hz, mean luminance 50 cd/m²). The monitor was positioned 20-25 cm from the animal's eyes at an angle of 15-40 degrees relative to the mouse's AP axis. Luminance nonlinearities of the display were corrected with an inverse γ lookup table, which was regularly obtained by calibration with a photometer. Stimuli were downward-drifting sinusoidal gratings at 50% contrast. Temporal frequency was 1.5 Hz, spatial frequency 0.02-0.05 cycles/degree. Gratings were 40-55 degrees in diameter and framed by either a black square or diamond.
Behavioral training
After recovery from the surgery, mice were placed on a water restriction schedule until their weight dropped to ∼85% of their ad libitum body weight. During this time, mice were habituated to head fixation on the ball or disk and to delivery of water through the spout. The animals' weight and fluid consumption were monitored and recorded each day, and the mice were checked for potential signs of dehydration. After the weight had stabilized, the mice were trained in daily sessions on the visual task. Training sessions were typically performed 5 d a week. On days without training, mice received a water supplement of 25 ml/kg body weight.
Electrophysiological recordings
After mice had learned the task, extracellular recordings were performed with 32-channel linear silicon probes (Neuronexus, A1x32-5 mm-25-177-A32). Electrodes were inserted perpendicular to the brain surface and lowered to ∼800 µm (V1) or 1200 µm (ACC) below the surface. Wideband extracellular signals were digitized at 30 kHz (Blackrock Microsystems) and analyzed using the NDManager software suite. To isolate single neurons from linear arrays, we grouped adjacent channels into 5 equally sized “virtual octrodes” (8 channels per group with 2 channels overlap). Using an automatic spike detection threshold (Quiroga et al., 2004), spikes were extracted from the high-pass filtered continuous signal for each group separately. The first three principal components of each channel were used for automatic clustering with KlustaKwik (K. D. Harris, http://klusta-team.github.io/klustakwik), which was followed by manual refinement of clusters (Hazan et al., 2006). For the analyses of neural data, we only considered high-quality single-unit activity, judged by rate stability, distinctiveness of spike wave shape, and cleanness of the refractory period in the autocorrelogram. In our final set of neurons, overall firing rates were distributed according to a log-normal distribution (Buzsaki and Mizuseki, 2014). Only 0.14% (median) of single-unit waveforms violated a refractory period of 2 ms. The median rate of interspike interval violations was 0.15, computed for an assumed refractory period of 2 ms and reflecting the relative firing rate of hypothetical neurons generating these violations (Hill et al., 2011). Overall cluster quality was assessed by fitting separate Gaussian mixture models in principal component analysis space between all pairs of units (Hill et al., 2011) on our “virtual octrodes,” which revealed a median summed false positive rate of 0.14 (probability that a waveform assigned to cluster 1 was generated by cluster 2), and a median summed false negative rate of 0.08 (probability that a waveform assigned to cluster 2 was generated by cluster 1).
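For illustration, the overlapping channel grouping can be sketched as follows (Python; channel ordering by depth and zero-based indexing are our assumptions, not taken from the original pipeline):

```python
# Sketch of the "virtual octrode" grouping: 32 channels ordered by depth,
# 8 channels per group, 2 channels shared between neighboring groups.
# The overlap of 2 gives a step of 6 and exactly 5 groups.
N_CHANNELS, GROUP_SIZE, OVERLAP = 32, 8, 2
step = GROUP_SIZE - OVERLAP

octrodes = [list(range(start, start + GROUP_SIZE))
            for start in range(0, N_CHANNELS - GROUP_SIZE + 1, step)]

assert len(octrodes) == 5  # [0..7], [6..13], [12..19], [18..25], [24..31]
```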
RF mappings and orientation tuning
We mapped RFs of V1 neurons with a sparse noise stimulus, consisting of 5 degree, full-contrast black and white squares, which were flashed on a gray background for 150 ms at a random location on a virtual 12 × 12 grid. Neural responses were fitted with 2D Gaussians to determine RF center, separately for ON and OFF subfields (Liu et al., 2010). To guide the placement of the stimuli, we estimated RF parameters online, relying on threshold crossings of spiking activity at each recording channel. Stimuli were then positioned to cover as many RFs as possible. For the analyses shown in Figure 4A, we considered RFs as well defined if the 2D Gaussian explained at least 30% of the variance in the neural response.
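The following sketch illustrates the RF-fitting step (Python; the function names, the axis-aligned Gaussian parameterization, and the use of scipy's curve_fit are illustrative assumptions rather than the original implementation):

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d(xy, amp, x0, y0, sx, sy, base):
    # Axis-aligned 2D Gaussian over the stimulus grid.
    x, y = xy
    return base + amp * np.exp(-((x - x0) ** 2 / (2 * sx ** 2)
                                 + (y - y0) ** 2 / (2 * sy ** 2)))

def fit_rf(resp_map):
    # resp_map: 12 x 12 array of mean responses for one subfield
    # (ON or OFF); returns fit parameters and explained variance.
    gy, gx = np.mgrid[0:12, 0:12]
    xy = np.vstack([gx.ravel(), gy.ravel()])
    z = resp_map.ravel()
    p0 = [z.max() - z.min(), 6.0, 6.0, 2.0, 2.0, z.min()]
    popt, _ = curve_fit(gauss2d, xy, z, p0=p0, maxfev=5000)
    residuals = z - gauss2d(xy, *popt)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((z - z.mean()) ** 2)
    return popt, r2  # RF counted as well defined if r2 >= 0.30
```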
We computed orientation tuning curves by fitting a sum of two Gaussians of the same width with peaks 180 degrees apart (Jurjut et al., 2017). For the analysis shown in Figure 4B, we considered neurons well tuned if the sum of Gaussians explained at least 50% of response variance.
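An illustrative version of this tuning function (Python; details of the parameterization, such as independent peak amplitudes, are assumptions on our part):

```python
import numpy as np

def wrapped_diff(a, b):
    # Smallest signed difference between two angles, in degrees.
    return (a - b + 180.0) % 360.0 - 180.0

def direction_tuning(theta, base, a1, a2, pref, sigma):
    # Sum of two Gaussians of shared width sigma with peaks 180 degrees
    # apart; a1 and a2 scale the preferred and opposite directions.
    g1 = a1 * np.exp(-wrapped_diff(theta, pref) ** 2 / (2.0 * sigma ** 2))
    g2 = a2 * np.exp(-wrapped_diff(theta, pref + 180.0) ** 2
                     / (2.0 * sigma ** 2))
    return base + g1 + g2

# Fit with scipy.optimize.curve_fit; a neuron counts as "well tuned" if
# the fit explains >= 50% of response variance (criterion from the text).
```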
Current source density (CSD) analysis
As described by Jurjut et al. (2017), we computed the CSD from the second spatial derivative of the local field potential (Mitzdorf, 1985) in response to periodic visual stimulation. We smoothed the CSD in space using a triangular kernel (Nicholson and Freeman, 1975) and used a value of 0.4 S/m as measure of cortical conductivity (Logothetis et al., 2007) to approximate the CSD in units of nanoamperes per cubic millimeter. We assigned the contact closest to the earliest polarity inversion to the base of layer 4 (Schroeder et al., 1998). The remaining contacts were assigned to putative supragranular, granular, and infragranular layers based on a cortical thickness of 1 mm and anatomic measurements of the relative thickness of individual layers in mouse V1 (Heumann et al., 1977).
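A minimal sketch of the CSD computation (Python; the finite-difference form, the 25 µm contact spacing taken from the probe specification above, and the omission of the final unit scaling are our assumptions):

```python
import numpy as np

def csd_from_lfp(lfp, spacing_mm=0.025, sigma_s_per_m=0.4):
    # lfp: (channels x time) array, channels ordered by depth; 25 um
    # contact spacing follows the probe used above. The CSD is the
    # negative second spatial derivative of the LFP, scaled by cortical
    # conductivity (0.4 S/m; Logothetis et al., 2007). Conversion to
    # nA/mm^3 and spatial smoothing with a triangular kernel
    # (Nicholson and Freeman, 1975) are omitted here.
    d2_phi = lfp[:-2, :] - 2.0 * lfp[1:-1, :] + lfp[2:, :]
    return -sigma_s_per_m * d2_phi / spacing_mm ** 2  # (channels-2) x time
```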
Behavioral task
We successfully trained n = 19 mice on the cue evaluation task shown in Figure 1A. For n = 12 mice, the diamond was the Go cue and the square the Stop cue; for the remaining mice (n = 7), this assignment was reversed. The cues were presented in a randomized sequence. After keeping still for at least 80 ms, the mice could trigger cue onset by running for a duration of 500 ms above a speed threshold of 5 cm/s, during which a mean-luminance gray screen was shown. The Go cue signaled a delayed fluid reward (5 µl), which the mouse could earn by continuing to run above threshold for an additional 3.5 s. Running in response to the Stop cue was not rewarded; therefore, the most economical action for the mouse was to terminate such trials by stopping, and immediately initiate a new trial. Cues were present until a trial was correctly or incorrectly terminated. To measure V1 activity in the absence of visual stimulation, running triggered, on a fraction of trials, a mean-luminance gray screen without any cue. A single session consisted of 300-600 trials divided into blocks of 100 trials; diamond, square, and blank screen appeared on 43%, 43%, and 14%, respectively, of all trials.
Because trials were actively initiated by the mouse, the interstimulus intervals were variable. After a hit, mice took some time to consume the reward before they initiated a new trial (median interstimulus interval: 8.3 s, interquartile range: 3.6 s, minimum: 4.2 s, n = 2173 trials from 63 experiments). After correct aborts, in contrast, mice initiated trials more quickly, but the interstimulus interval never fell below 1 s (median: 2.3 s, interquartile range: 1.5 s, minimum: 1.1 s, n = 1445 trials).
Sensory control measurements
After the mice had completed all task blocks, we ran a sensory control condition, in which a periodic sequence of the stimuli was shown, independent of the animals' behavior. In this sensory control condition, unlike in the task, the mice could not initiate a trial by running. Instead, the stimuli were simply flashed in a randomized, periodic sequence. Each stimulus was shown for 2 s, followed by a 1 s presentation of a mean-luminance gray screen. No reward was given during these controls. To further emphasize the difference from the task condition, the lick spout was removed. Analogous to the task condition, we showed, on a fraction of trials, a mean-luminance gray screen without any stimulus. Sensory control experiments consisted of 100-500 trials; square, diamond, and blank screen appeared on 40%, 40%, and 20%, respectively, of all trials.
Measurements of eye position
We detected the pupil with a custom-written program developed with the Bonsai framework (Lopes et al., 2015). Briefly, we applied a threshold to turn each camera frame into a binary image, performed a morphologic opening operation, identified the most circle-like object as the pupil, and fitted a circle to determine the position of its center. We computed relative pupil displacements by subtracting, for each frame, the pupil position from a default position, defined as the grand average eye position across all stimuli and task conditions. To convert pupil displacements to angular displacements, we assumed that the center of eye rotation was 1.041 mm behind the pupil (Stahl et al., 2000). We defined saccades as changes in eye position ≥ 2 degrees. Considering that the average mouse saccade lasts 50 ms (Sakatani and Isa, 2007), we detected saccades by taking the difference of mean eye position 60 ms before and after each time point.
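The saccade detection rule can be sketched as follows (Python; the 50 Hz camera frame rate from above and the single-axis input are simplifications):

```python
import numpy as np

def detect_saccades(eye_pos_deg, fs=50.0, win_s=0.06, thresh_deg=2.0):
    # eye_pos_deg: eye position trace in degrees, sampled at fs.
    # A saccade is flagged where the mean position over the 60 ms after
    # a time point differs from the 60 ms before it by >= 2 degrees
    # (values from the text).
    w = max(1, int(round(win_s * fs)))  # 3 samples at 50 Hz
    is_saccade = np.zeros(len(eye_pos_deg), dtype=bool)
    for t in range(w, len(eye_pos_deg) - w):
        jump = (np.mean(eye_pos_deg[t:t + w])
                - np.mean(eye_pos_deg[t - w:t]))
        is_saccade[t] = abs(jump) >= thresh_deg
    return is_saccade
```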
Measurements of locomotion
For the air-suspended ball, running speed was computed as the Euclidean norm of three perpendicular components (roll, pitch, and yaw) of ball velocity (Dombeck et al., 2007). For the running disk, we converted deg/s to cm/s by considering the radius from the center of the mouse to the center of the disk, which typically was between 5 and 6 cm.
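Both speed computations can be sketched compactly (Python; the 5.5 cm radius is one value from the 5-6 cm range given above):

```python
import numpy as np

def ball_speed(roll, pitch, yaw):
    # Running speed as the Euclidean norm of the three components of
    # ball velocity (Dombeck et al., 2007); inputs already in cm/s.
    return np.sqrt(roll ** 2 + pitch ** 2 + yaw ** 2)

def disk_speed_cm_per_s(omega_deg_per_s, radius_cm=5.5):
    # Linear speed at the mouse's position on the disk: convert deg/s
    # to rad/s and multiply by the mouse-to-disk-center radius
    # (5-6 cm in the text; 5.5 cm assumed here).
    return np.deg2rad(omega_deg_per_s) * radius_cm
```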
Experimental design and statistical analysis
We relied on the open-source framework DataJoint for creating data analysis pipelines (Yatsenko et al., 2018) and the R project for statistical analysis (R Core Team, 2017). We used a within-subject (i.e., repeated-measures) design, such that every mouse participated in each condition (visual task, sensory control measurements). Neural data were also collected within subjects; that is, responses of each neuron were measured under all behavioral conditions. Where appropriate, we therefore performed within-subject ANOVAs, or tests relying on dependent samples. Details of the statistical procedures and sample sizes are described in the following subsections.
Running behavior
For each session of each mouse, we extracted run-speed profiles for individual trials and aligned them to stimulus onset to identify and exclude invalid trials, in which task engagement might have been suboptimal. For aborted trials, we took as termination time the point in time, relative to stimulus onset, at which running speed dropped below the threshold of 5 cm/s, and considered trials invalid if termination occurred earlier than 500 ms (13.6% of 5253 trials) or later than 2 s (9.0%) after stimulus onset. We also considered a trial invalid if its running speed profile differed markedly from the average profile across trials of the same type (i.e., across all hits, or all correct aborts): a profile "differed markedly" if the maximum running speed within 500 ms after stimulus onset was lower than the across-trial mean minus x × SEM. The factor x varied across mice and sessions and ranged from 2 to 5, excluding 1.8% of the trials. After removing invalid trials, we then determined, for hits and correct aborts, the period of stimulus presentation during which running speed was indistinguishable. We compared distributions of running speed at every point in time, and took as "point of speed divergence" the first of three consecutively significant time points (Kolmogorov–Smirnov test, p < 0.01).
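The divergence-point rule can be sketched as follows (Python, using scipy's two-sample Kolmogorov–Smirnov test; the array layout is an assumption):

```python
import numpy as np
from scipy.stats import ks_2samp

def speed_divergence_time(hits, aborts, times, alpha=0.01, n_consec=3):
    # hits, aborts: (trials x time) running-speed arrays aligned to
    # stimulus onset; times: time stamp of each column. Returns the
    # first of three consecutive time points at which the two speed
    # distributions differ (two-sample Kolmogorov-Smirnov test).
    sig = np.array([ks_2samp(hits[:, i], aborts[:, i]).pvalue < alpha
                    for i in range(len(times))])
    for i in range(len(sig) - n_consec + 1):
        if sig[i:i + n_consec].all():
            return times[i]
    return None  # speeds never diverged within the analyzed window
```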
Behavioral discrimination performance
To quantify and track behavioral performance, we computed, from hit and false alarm rates, a discriminability index d' = ZN − ZSN, where ZN = Φ⁻¹(1 − p(false alarm)) and ZSN = Φ⁻¹(1 − p(hit)), with Φ⁻¹ denoting the inverse of the cumulative normal distribution (Gescheider, 1997); this is equivalent to the standard z(hit) − z(false alarm). In case of extreme performance levels (hit or false alarm rates of 0 or 1), we computed d' using the log-linear approach described by Stanislaw and Todorov (1999). We considered mice as trained if d' across task blocks was ≥ 1.5 for at least 2 consecutive days. From trained mice, we only considered neural responses, eye positions, or speed profiles that were measured in task blocks where d' was ≥ 1.5.
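An illustrative implementation of this d' computation (Python; the count-based log-linear correction, adding 0.5 to each count and 1 to each trial total, follows Stanislaw and Todorov, 1999):

```python
from scipy.stats import norm

def dprime(n_hits, n_go, n_false_alarms, n_stop):
    # d' from hit and false-alarm rates; the log-linear correction
    # keeps both rates away from the extremes 0 and 1.
    p_hit = (n_hits + 0.5) / (n_go + 1.0)
    p_fa = (n_false_alarms + 0.5) / (n_stop + 1.0)
    # Equivalent to ZN - ZSN as defined above.
    return norm.ppf(p_hit) - norm.ppf(p_fa)

# Example: 80 hits in 100 Go trials and 10 false alarms in 100 Stop
# trials give d' of about 2.1, above the training criterion of 1.5.
```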
Eye position
For each session and each mouse, we extracted eye positions for individual trials and aligned them to stimulus onset. Within each session's time window of constant running speed (based on the speed divergence times shown in Fig. 1F), we removed trials containing saccades and then determined, for each trial, the average eye position across time. We compiled 2D (horizontal, vertical) distributions of eye positions, separately for each stimulus and task condition. We matched the number of trials across conditions in each eye position bin by finding the smallest number of trials across conditions and deleting, where necessary, excess trials from the corresponding bin in the other three conditions. By matching the number of trials, we made sure that we removed any potential bias in terms of eye position across all task and sensory control conditions. We varied bin width across mice and sessions from 1.4 to 6.4 degrees, with the exception of a single session in 1 mouse, where bin width was 19.2 degrees. We chose bin widths to maximize the number of surviving trials under the constraint that the resulting distributions of eye positions were statistically indistinguishable across stimulus and task conditions (p ≥ 0.1). Statistical testing, however, was then performed on raw eye positions (i.e., without any binning). To assess statistical significance, we compared all four distributions of eye positions using the multisample variant of the nonparametric Anderson–Darling test (Scholz and Stephens, 1987). This is an omnibus test (i.e., it provides a single test statistic to assess whether multiple distributions differ from each other). Sessions for which p < 0.1, or the number of surviving trials < 10, were considered “unmatchable” and excluded from the analyses of V1 responses (13 of 25 sessions, see Fig. 2G). After matching eye positions, the mean number of trials per condition was 28.75 ± 3.8, averaged across sessions. To assess statistical significance of mean saccade rates (see Fig. 2C), we performed a repeated-measures ANOVA, including the within-subject factors task condition (task vs sensory control) and stimulus type (Go vs Stop).
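A simplified sketch of the matching step (Python; it covers the per-bin trial matching but omits the saccade removal and the final Anderson–Darling check described above):

```python
import numpy as np

def match_eye_positions(pos_deg, condition, bin_deg, seed=0):
    # pos_deg: (trials x 2) mean horizontal/vertical eye position per
    # trial; condition: label per trial (2 stimuli x task/control = 4).
    # Within every 2D eye-position bin, keep the smallest trial count
    # found across conditions and randomly drop excess trials.
    rng = np.random.default_rng(seed)
    bins = np.floor(pos_deg / bin_deg).astype(int)
    keep = []
    for b in np.unique(bins, axis=0):
        in_bin = np.where((bins == b).all(axis=1))[0]
        per_cond = [in_bin[condition[in_bin] == c]
                    for c in np.unique(condition)]
        n_min = min(len(idx) for idx in per_cond)
        for idx in per_cond:
            keep.extend(rng.choice(idx, size=n_min, replace=False))
    return np.sort(np.asarray(keep, dtype=int))  # surviving trial indices
```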
Neural data
We obtained neural data during task performance from n = 13 mice in 42 sessions. In 9 mice (25 sessions), we recorded spiking activity in area V1. Sessions for which we were unable to match eye positions were excluded from further analyses, resulting in a final V1 dataset of 5 mice (12 sessions). In these sessions, we recorded 411 single neurons. We computed single-trial firing rates by convolving spike trains with a Gaussian kernel (resolution 1 ms, width 70 ms). We first identified neurons that were visually responsive: we compared, across trials, firing rates averaged within 5 ms bins between stimulus and blank screen conditions, and considered a neuron visually responsive if its response to a stimulus, during task and sensory control, was significantly larger than its response to the blank screen in at least 10 consecutive bins within a window of 1 s (Wilcoxon rank sum test, p < 0.05). Of n = 264 visually driven neurons, we only included those for which we had at least 10 trials per stimulus and task condition after matching eye positions (n = 247). To quantify how well each neuron could discriminate the stimuli, we performed ideal observer analyses (Macmillan and Creelman, 2005), separately for task and sensory control conditions. We split single-trial firing rates based on stimulus type, focused on the time window where running speed was indistinguishable during task performance, averaged across time, and determined the area under the receiver operating characteristic (area under the curve [AUC]). To assess statistical significance, we repeated the random trial selection of the eye-position matching procedure 1000 times and created, for each neuron, a distribution of AUC values. We took the mean AUC across repeats as our measure of neural discriminability and considered it significant if chance performance (0.5) fell outside the CI. In the sensory control condition, our criterion was lax: We used a 90% CI to catch neurons that showed any trend toward differential responses. Using a lax criterion was important because we wanted to make sure that the neurons we examined responded in the same way to stimulation of their classical and extraclassical RFs. In the task condition, our criterion was conservative: We used a 99% CI to identify those neurons that could reliably discriminate the stimuli. Based on these statistics, we excluded neurons that could discriminate the stimuli in the sensory control condition, and asked how many of the remaining, nonselective, neurons (n = 115) could discriminate the stimuli within the context of the task.
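The core of the ideal observer analysis can be sketched as follows (Python; roc_auc_score from scikit-learn stands in for whatever ROC routine was actually used, and the significance rule operates on the distribution of AUCs across matching repeats):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def neural_auc(rates_go, rates_stop):
    # Ideal-observer discriminability from time-averaged single-trial
    # firing rates: area under the ROC curve (0.5 = chance; values
    # above 0.5 indicate stronger responses to the Go cue).
    labels = np.r_[np.ones(len(rates_go)), np.zeros(len(rates_stop))]
    return roc_auc_score(labels, np.r_[rates_go, rates_stop])

def auc_is_significant(auc_samples, ci=0.99):
    # auc_samples: AUC values from the 1000 repeats of eye-position
    # matching; selectivity counts as significant when chance level
    # (0.5) falls outside the central `ci` region (90% for the sensory
    # control condition, 99% for the task condition).
    lo, hi = np.percentile(auc_samples, [50 * (1 - ci), 50 * (1 + ci)])
    return not (lo <= 0.5 <= hi)
```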
Although we selected neurons based on their AUC value in the sensory control condition, and then reassessed AUCs during task performance, our observed effects cannot be reduced to the statistical phenomenon known as "regression to the mean" (e.g., Barnett et al., 2005). By requiring that neurons could not discriminate the stimuli during the sensory control, we selected those neurons whose AUC values were very close to the overall population mean, which was at 0.51. If they then showed any significance during the task, they actually "regressed away" from the population mean (see, e.g., Fig. 3G,H). We also tested for regression-to-the-mean effects at the level of individual neurons. We took the V1 neurons that could not discriminate the stimuli during the control condition (n = 115; Fig. 3H), split the dataset into odd and even trials, and computed AUC values for each subset of trials. Reassuringly, AUC values that were nonsignificant for the subset of even trials remained so for the subset of odd trials, and vice versa. In stark contrast, comparing the subsets of even trials from sensory control and task condition still revealed significant task-related modulation in 25 of 115 neurons. A similar fraction (21 of 115) showed task-dependent modulation when we compared the odd trials from the sensory control and task conditions. A somewhat smaller fraction is to be expected in these control analyses, given the reduction in statistical power that comes with using only half of the trials. Together, these control analyses show that the task-dependent modulations we observed are not simply a statistical artifact introduced by taking two successive measurements from the same neuron.
To assess how strongly eye position matching contributed to our result, we reran our analyses of discriminability by randomly selecting subsets of trials, but independently of eye position. Everything else was kept identical, including the trial numbers and the computation of AUC values. Using identical trial numbers was important because it rules out that any changes in variability might be related to an unequal number of trials. To compare counts of significant neurons between matched and nonmatched datasets, we used a χ² test. To assess the variability of single-neuron AUC distributions, we determined their SDs and compared distributions of SDs between matched and nonmatched datasets using a two-sample Kolmogorov–Smirnov test.
Because of a bug in the stimulus presentation program, the phase of the gratings during sensory control measurements was shifted by 90 degrees, relative to the phase of the stimuli during the cue evaluation task. We therefore repeated all analyses by shifting the time window in the sensory control condition, such that the stimulus phases were identical across conditions. The results were essentially the same and the conclusions unaffected: With matched eye positions, the number of cells that could discriminate during the task was 30.07% (16.99% responded more strongly to the Go stimulus, 13.07% more strongly to the Stop stimulus). Without matching, this percentage dropped to 9.8% (5.88% responded more strongly to the Go stimulus, 3.92% more strongly to the Stop stimulus). The drop in percentages was highly significant (p < 0.00001, χ² test), and the SDs of single-neuron AUC distributions were smaller under matched than unmatched eye positions (p < 0.00001, two-sample Kolmogorov–Smirnov test).
To test whether the results observed in V1 might reflect running-induced changes in spatial integration, we focused on the sensory control condition, in which mice showed variability in running behavior. For each trial, we computed the mean speed in a time window of 0.5 s after stimulus onset, and determined whether the mouse was running (speed > 1 cm/s) or stationary (speed ≤ 1 cm/s). Where possible, we split trials based on running behavior, and drew 1000 times, without replacement, random subsets of trials. For each neuron, we compiled distributions of AUC values, separately for running versus sitting. We considered a neuron's discriminability to be affected by running if the mean AUC value for running was outside the central 95% of the AUC distribution for sitting.
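A simplified sketch of this comparison (Python; it reuses neural_auc from the sketch above and ignores edge cases such as subsets containing only one stimulus type):

```python
import numpy as np

def running_affects_auc(rates, is_go, speed_cm_s, n_rep=1000, seed=0):
    # Split sensory-control trials by locomotion (running: > 1 cm/s),
    # draw equal-sized subsets without replacement, and test whether
    # the mean AUC while running falls outside the central 95% of the
    # sitting AUC distribution.
    rng = np.random.default_rng(seed)
    run = np.where(speed_cm_s > 1.0)[0]
    sit = np.where(speed_cm_s <= 1.0)[0]
    n = min(len(run), len(sit))

    def resampled_aucs(pool):
        aucs = []
        for _ in range(n_rep):
            idx = rng.choice(pool, size=n, replace=False)
            aucs.append(neural_auc(rates[idx][is_go[idx]],
                                   rates[idx][~is_go[idx]]))
        return np.asarray(aucs)

    auc_run, auc_sit = resampled_aucs(run), resampled_aucs(sit)
    lo, hi = np.percentile(auc_sit, [2.5, 97.5])
    return not (lo <= auc_run.mean() <= hi)
```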
In 3 mice (14 sessions), we recorded spiking activity from ACC. During ACC recordings, we did not track eye positions; all other aspects of task structure and analyses were kept identical to the V1 recordings described above. We excluded neurons with < 1 spike/s and bootstrapped CIs for AUC values (n = 1000 replications). If many ACC neurons showed visual responses that were sensitive to differences in eye positions, we might have underestimated the number of neurons that can discriminate the stimuli during the task, just as in V1 with unmatched eye positions. We also tested whether stratifying V1 data by matching eye positions introduced a bias in terms of a specific brain state or arousal level. We examined two prominent proxies for brain state: pupil size and spectral content of the local field potential. We found that both were largely similar between trials with matched and unmatched eye positions. These analyses confirmed that matching for eye positions in V1 recordings did not select for a specific brain state, and justify the comparison of V1 to ACC recordings, where responses were not stratified by eye position.
Results
To isolate how visual processing is modulated by behavioral demands, we designed a task in which mice had to evaluate visual cues to decide on an appropriate action (Fig. 1). We placed head-fixed mice on a treadmill in front of a monitor, where they could harvest a reward by running in response to one, but not the other, of two visual cues (Fig. 1A). The cues had different global shapes (diamond vs square), such that they could be discriminated by the mice, but they were composed of the same, downward-drifting grating to provide identical visual stimulation to neurons targeted for recordings in visual cortex. The mice initiated each trial by moving forward on the treadmill, which triggered the presentation of a randomly selected cue. One cue (Go) promised a fluid reward, which the mouse could earn by continuing to run for an additional 3.5 s. A 3.5 s run in response to the Go cue was considered a hit; a 3.5 s run in response to the other cue (Stop) was considered a false alarm. At any point in time, the mouse could terminate the current trial by slowing down, and immediately start over; these terminated trials were considered correct or incorrect aborts (Fig. 1A). With this task and stimulus design, we sought to isolate how visual processing is affected by the process of evaluating cues because (1) across cue conditions, the animal's running behavior was identical around the time of cue onset and (2) those neurons whose classical RFs (Fig. 1A, dotted circles) were contained within the stimulus aperture should receive equivalent visual stimulation. Which of those neurons indeed received equivalent visual signals was determined in a sensory control condition, in which we flashed the same stimuli in a randomized, periodic sequence, unrelated to the animals' behavior. In this condition, the mice were not engaged: they could not control the onset of a cue, nor could they earn rewards by running.
After a few training sessions, mice had learned to discriminate the cues and earn rewards (Fig. 1B–D). Naive mice performed at chance level (Fig. 1B, left). Trained mice, in contrast, achieved high hit rates and low false alarm rates (right). From hit and false alarm rates, we computed d' (Gescheider, 1997) to quantify and track behavioral performance (Fig. 1C,D). Some mice reached a criterion level of 1.5 within a few days (Fig. 1C, dark traces); other mice needed multiple weeks (bright traces). On average, mice reached criterion levels of performance after 18.9 ± 3.3 training days (mean ± SEM, n = 19). Once the mice had learned the task, they showed stereotypical running behavior, which reliably reflected the reward assignments (Fig. 1E,F). Consider, for instance, the 2 example mice shown in Figure 1E. Running above threshold for 0.5 s triggered the onset of the cue (time 0). After presentation of the Go cue, speed remained high or even increased as the mice went for the reward (hit trials, blue traces); after presentation of the Stop cue, in contrast, running speed quickly dropped (correct aborts, red traces).
How long does it take a mouse to evaluate a visual cue and decide on a specific course of action? We took the point in time at which the speed profiles diverged as an estimate of the duration of this process. We determined, for each individual session, the time window during which running speed was indistinguishable between cue conditions (Fig. 1E,F). We performed these analyses on a subset of 13 mice, from which we recorded neural responses during task performance. Across these recording sessions, behavioral performance remained well above criterion level (mean d' = 2.82 ± 0.08, n = 42 sessions). We compiled distributions of running speeds across trials, separately for hits and correct aborts, and compared these distributions at every point in time. We took as point of speed divergence the first of three consecutive time points at which the distributions of running speed differed significantly (Kolmogorov–Smirnov test, p < 0.01). For the two sessions shown in Figure 1E, running speeds were indistinguishable during the first 365 (left) or 414 ms (right) of stimulus presentation. The time points of speed divergence varied across sessions with an average of 347 ± 16 ms (Fig. 1F; n = 42 sessions from 13 mice). We took these estimates as a behavioral marker for the average duration of the cue evaluation process, and used them to define the time window of interest for the analyses of cortical responses. Restricting the analyses to this time window also ensured that the two behavioral conditions were comparable in terms of locomotion behavior. Controlling locomotion is important because it can affect sensory responses in the mouse visual system (Niell and Stryker, 2010; Bennett et al., 2013; Saleem et al., 2013; Erisken et al., 2014; Pakan et al., 2016; Dadarlat and Stryker, 2017; Aydın et al., 2018; Clancy et al., 2019; Musall et al., 2019).
We next examined eye movements during cue evaluation and found systematic differences in saccades and eye positions between the cue conditions (Fig. 2). We recorded videos of the eye to track pupil position (Fig. 2A) and observed frequent eye movements (Fig. 2B). They mostly occurred along the horizontal direction and often seemed related to the trial structure (gray vertical bars mark cue presentations). We expected eye movements in our task because mice initiated trials by running (Fig. 2B, bottom), which increases the frequency of eye movements in head-fixed mice (Niell and Stryker, 2010; Keller et al., 2012; Ayaz et al., 2013; Bennett et al., 2013). Because running behavior was comparable during evaluation of Go and Stop cues, we expected eye movements of similar frequencies. However, when we identified saccades and aligned them to cue onset, we found clear differences in saccade frequency between cue conditions: more saccades occurred during the presentation of the Go cue than during the Stop cue, but only during the task (Fig. 2C; n = 42 sessions from 13 mice, interaction between cue and task: F(1,41) = 8.27, p = 0.0064, repeated-measures ANOVA). Follow-up analyses confirmed that, during the task, saccades occurred more often for the Go cue than for the Stop cue (7.1 vs 3.8%, F(1,41) = 15.48, p = 0.00032). In the control condition, however, the percentage of saccades was indistinguishable (2.8 vs 2.3%, F(1,41) = 2.71, p = 0.11). In both task and control conditions, saccades were linked to transitions from slowing down to speeding up (data not shown), potentially reflecting intended head movements (Meyer et al., 2020).
To remove any differences in eye position between cue conditions for subsequent analyses of V1 responses, we next identified subsets of trials where eye position was equivalent between all task and sensory control conditions (Fig. 2D–G). For each recording session, we focused on the cue evaluation period where running speed was comparable between the cue conditions (Fig. 1E,F), and matched distributions of eye positions (Roelfsema et al., 1998). We illustrate this procedure for one example session (Fig. 2D–F). First, we removed all trials containing one or more saccades during the cue evaluation window (Fig. 2D, black trace). From the remaining trials (gray), we constructed histograms of time-averaged eye positions, separately for each stimulus in the task and sensory control condition (Fig. 2E, left). We then matched the number of trials in each eye-position bin by finding the minimum number across all four conditions and removing, where necessary, a random selection of excess trials from the other conditions (Fig. 2E, right). To confirm that this matching procedure removed any differences in eye positions, we compared, without binning, their cumulative distributions across all four conditions and found that they were indistinguishable (Fig. 2F, horizontal position: p = 0.77; vertical position: p = 0.36; Anderson–Darling test). Applying this procedure to our entire dataset, we found that, before matching, the distributions of eye position differed systematically in every single session (Fig. 2G, left, n = 25 sessions from 9 mice). In 12 sessions obtained from 5 mice, however, we could match eye positions without falling below a minimum number of 10 trials for each combination of stimulus and task condition (Fig. 2G, right, all p > 0.1). These are the sessions with equivalent visual input to the mouse's eye during task performance and in the sensory control condition; we focused on those for the analyses of V1 responses.
Having identified time windows of equal running speed and subsets of trials with equalized visual input reaching the eye, we analyzed how V1 neurons were modulated during cue evaluation (Fig. 3). We first identified those neurons that could not discriminate the cues in the sensory control condition. To increase our chances that the two cues provided the same visual stimulation, we had mapped, at the beginning of each recording session, the RFs of the recorded V1 neurons (n = 9 mice, Fig. 3A) and had positioned the two cues during the task such that they fully covered multiple RFs (Fig. 3B). Post hoc, we tested which neurons indeed received comparable stimulation by comparing responses to the cues in the sensory control condition (Fig. 3C–F, left). To quantify whether individual neurons could discriminate between the two cues, we determined how well an ideal observer could decode stimulus identity. We separated single-trial firing rates evoked by Go versus Stop cues, determined mean firing rates during the window of constant running speed (black horizontal bars in Fig. 3C,E), and computed the area under the receiver operating characteristic (AUC) (Macmillan and Creelman, 2005) (Fig. 3D,F, left). To make sure that the magnitude of the AUC value did not depend on the specific subset of trials that survived our eye-position stratification procedure, we repeated random stratification and AUC computation 1000 times, creating, for each neuron, a distribution of AUC values. We took the mean of this distribution as our measure of discriminability. To select for further analyses only those neurons that could not discriminate between the cues, we applied a conservative criterion and excluded all neurons that showed any trend toward differential responses to the cues in the sensory control condition (chance performance = 0.5 was outside the central 90% of the AUC distribution, i.e., p < 0.1). This left us with 115 neurons, for which, similar to both example neurons (Fig. 3C–F, left; AUC = 0.53 and 0.52, both p > 0.1), the AUC was not significantly different from chance performance in the sensory control condition (Fig. 3G, gray data points).
Focusing on those neurons whose RFs received comparable sensory drive from the locally identical cues (Fig. 3G, gray; 115 of 247 neurons), we then examined how they represented the two cues during task performance. Consider, for instance, the example neurons shown in Figure 3C–F. Despite similar responses in the sensory control condition (left panels), both neurons could reliably discriminate these locally identical visual signals when the mouse had to extract the meaning of the cues. Neuron 1 responded more strongly to the Go than to the Stop cue (Fig. 3C,D, right panels, AUC = 0.68, p < 0.01). Neuron 2 showed the opposite effect (Fig. 3E,F, right panels, AUC = 0.32, p < 0.01). Choosing again a conservative criterion, we considered discriminability during the task as significant if chance performance was outside the central 99% of the AUC distribution (p < 0.01), and found that discriminability emerged, during cue evaluation, in a substantial fraction of V1 neurons. Specifically, among the nonselective neurons, which failed to discriminate between Go and Stop cue in the sensory control condition, 30.4% reliably signaled the identity of the stimulus during task performance (Fig. 3H, green, n = 35). About half of these neurons responded more strongly to the Go stimulus (AUC > 0.5, n = 16), and the remaining neurons responded more strongly to the Stop stimulus (AUC < 0.5, n = 19). These data show that responses in mouse V1 to locally identical stimuli can indeed reflect the process of evaluating visual cues.
To gain insight into potential differences between the neurons modulated by cue evaluation and those unaffected, we examined basic physiological parameters and compared them between these populations (Fig. 4). The majority (85%) of neurons that were modulated during cue evaluation had RF areas that were smaller than the mean of the entire population (189.2 ± 17.1 deg², n = 81 well-defined RFs), indicating that modulation by cue evaluation was not restricted to neurons with relatively large RFs (Fig. 4A). Indeed, neurons that were modulated during cue evaluation (Fig. 4A, green data points) had significantly smaller RF areas than neurons that were shape-selective (Fig. 4A, black data points, p = 0.046, two-sample Kolmogorov–Smirnov test). We also asked whether the strength of modulation would depend on the neurons' orientation preference but found no evidence for any such relation: orientation preferences of neurons modulated during cue evaluation (green, n = 23 well-tuned neurons) were distributed around the circle, with no indication of a particular peak (Fig. 4B, Rayleigh test of uniformity, p = 0.69). Finally, we examined laminar positions and found that modulated neurons were present across the depth of cortex, with no apparent clustering (Fig. 4C, n = 115).
Would we have observed comparable effects had we not controlled for eye position? To address this question, we recomputed AUC distributions by randomly selecting the same number of trials, but regardless of the eye positions. We found the number of neurons modulated by task context to be substantially reduced (Fig. 5). In the sensory control condition, a larger fraction of neurons failed to discriminate between Go and Stop stimulus, which effectively increased the pool of nonselective neurons (n = 178 of 247, Fig. 5A, gray data points). From this pool of nonselective neurons, however, only 11.24% (20 of 178, Fig. 5B, green data points) signaled the identity of the stimulus during the task, a fraction that was significantly smaller compared with the fraction obtained after matching eye positions (30.4%, p < 0.00004, χ² test). Matching eye positions made a difference because it reduced trial-by-trial variability in firing rates. As a consequence, individual neurons were more likely to be found selective in the sensory control condition, as illustrated for one example neuron in Figure 5C. Without matching eye positions, trial-by-trial variability in firing rates was relatively high, and the distribution of AUC values obtained by randomly resampling trials was relatively broad (orange histogram, mean = 0.45, SD = 0.04). Its confidence region (orange horizontal line at top) included chance performance (black vertical line), and the neuron would therefore be considered nonselective. With matched eye positions, however, trial-by-trial firing rates became less variable, the distribution of AUC values was narrower (black histogram, mean = 0.44, SD = 0.02), and chance performance was clearly outside the confidence limits (black horizontal line), such that the neuron would now be excluded from the nonselective pool. Across the population, therefore, matching eye position reduced the size of the nonselective pool. In the same way, matching eye positions increased the number of neurons that could discriminate the stimuli during the task, as illustrated with a second neuron in Figure 5D. Without matching eye positions, the AUC distribution was relatively broad (orange histogram, mean = 0.58, SD = 0.06); matching eye positions made this distribution tighter (black histogram, mean = 0.60, SD = 0.03), and performance for decoding stimulus identity significantly different from chance level (confidence limits at top). The observation that matching eye positions sharpened the distributions of AUC values was true across the population of recorded neurons (Fig. 5E,F). We determined, for each neuron, the SD of the AUC distribution and compared the distributions of SDs with and without matching eye positions. Both during sensory control (Fig. 5E) and during task performance (Fig. 5F), the SDs of the AUC distributions were smaller under matched (black traces) than unmatched eye positions (orange traces, p < 0.00001 in both cases, two-sample Kolmogorov–Smirnov test). We conclude that, had we not matched eye position, we would have substantially underestimated the number of V1 neurons that were modulated by the process of evaluating cues.
The improvements in discriminability seen during the task could not be explained by any differences in running between sensory control and task conditions. While running behavior was under tight control during task performance, it was unconstrained in the sensory control condition, where mice could not earn a reward. It is therefore conceivable that the changes in V1 discriminability between task and sensory control condition do not reflect the context of the visual task, but rather differences in running (Niell and Stryker, 2010; Ayaz et al., 2013; Erisken et al., 2014). To address this question, we focused on the sensory control condition and asked whether running per se, outside the context of the task, would affect discriminability of single neurons. We found that, with the exception of 1 mouse, the animals were also running, in the sensory control condition, on a substantial percentage of trials (35.6 ± 1.9%, on average). Where possible, we exploited this behavioral variability and separated single-trial firing rates based on whether the mouse was running or sitting (n = 71 of the 115 neurons). Consistent with previous reports (Niell and Stryker, 2010; Saleem et al., 2013; Erisken et al., 2014; Dadarlat and Stryker, 2017), we found a fraction of V1 neurons whose firing rates were modulated by running (15 of 71 neurons, 21.13%, p < 0.05 for each neuron, unpaired t test). Although running had a general effect on average firing rates, it did not affect how well V1 neurons could discriminate the cues. We computed AUC values and asked whether V1 discriminability, measured during sitting, would be any different if the mouse was running. On average, mean AUC values were very similar (sitting: 0.52, running: 0.51, p = 0.28, paired t test), and only 4 of 71 neurons (5.6%) showed a significant difference in AUC values (permutation test, p < 0.05). This observation is consistent with imaging data, where the relative selectivity for two grating stimuli was also not affected by running behavior (Poort et al., 2015). We therefore conclude that running per se cannot account for the improved discriminability of V1 neurons during task performance.
To gauge the task-dependent modulations in V1, we compared them against potential cue evaluation signatures in prefrontal cortex, a key area for many aspects of behavioral control. We focused on ACC for two reasons. First, ACC plays a prominent role in foraging-like behaviors, where the benefits of certain actions have to be evaluated relative to the required efforts (Walton et al., 2002; Rudebeck et al., 2006; Kennerley et al., 2009; Hillman and Bilkey, 2010). Second, in mouse cortex, ACC sends direct projections to V1, providing a network for top-down modulation of visual processing (Zhang et al., 2014, 2016; Fiser et al., 2016; Leinweber et al., 2017).
We recorded, using the same paradigm, from ACC and found a similar fraction of neurons that was modulated during cue evaluation; discriminative power, however, was much stronger than in area V1, and population preferences were biased toward the cue promising reward (Fig. 6). We recorded from n = 3 mice (12 sessions) targeting the ACC just anterior of bregma (Fig. 6A); this part of ACC sends topographically organized projections to V1 (Zhang et al., 2016; Leinweber et al., 2017). In many ACC neurons, selectivity emerged for Go versus Stop cues during the task. The example neuron shown in Figure 6B, for instance, gave almost no response to either cue in the sensory control condition (left, AUC = 0.54, p > 0.1), but strongly signaled the presence of the Go cue during the task (AUC = 0.72, p < 0.01). We also observed, although less frequently, neurons with the opposite pattern of responses, such as the example shown in Figure 6C (left: AUC = 0.44, p > 0.1; right: AUC = 0.22, p < 0.01). Across the population, 82.5% (94 of 114) of ACC neurons could not discriminate the cues in the sensory control condition (Fig. 6D, gray data points, p > 0.1), a fraction of nonselective neurons that was significantly larger than observed in V1, even without matching eye positions (i.e., Fig. 5A, p = 0.033, χ² test). Of these nonselective ACC neurons, 34.04% (32 of 94) signaled the identity of the cue during the task (Fig. 6E, green data points, p < 0.01). While the proportion of neurons affected by evaluating cues seemed comparable to area V1, two differences stood out: (1) discriminative power, assessed by the magnitude of AUC values, was consistently stronger in ACC than in V1 (Fig. 6F, p = 0.001, two-sample Kolmogorov–Smirnov test); and (2) modulations in ACC were not as balanced as in V1: among the modulated ACC neurons, the majority (75%) responded more strongly to the cue predicting a reward, a percentage significantly larger than in V1 (p = 0.015, χ² test). Analogous to V1, we also examined whether the changes in discriminability had any simple relation to the act of running. We focused on the sensory control condition, in which mice were running around cue onset on almost half of the trials (44.55 ± 5.04%, n = 12 sessions), and compared AUC values computed separately for running versus sitting trials. Across the population of discriminating ACC neurons (n = 31, Fig. 6G, open circles), only 2 showed a statistically significant difference between sitting versus running AUCs (closed circles, permutation test, p < 0.05). These analyses show that running per se does not influence how well ACC neurons discriminate visual cues.
Discussion
Here, we have assessed how the behavioral demands of evaluating visual cues affected their neural representations in mouse V1 and ACC. In our task, the animal's commitment to a potentially rewarding action had to rely on visual cues, which differed in terms of global shape, but provided equivalent visual stimulation to RFs in cortex. Tightly controlling locomotion and eye positions, we found differential responses to locally identical visual cues in ∼30% of V1 neurons; this proportion was substantially smaller when we did not account for eye positions. In ACC, much stronger selectivity emerged during cue evaluation, consistent with its prominent role in behavioral control. While preferences for either cue were balanced in the V1 population, preferences in ACC were biased toward the cue promising reward. Together, these experiments demonstrate distinct signatures of cue evaluation in primary visual and cingulate cortex of the mouse.
Isolating cue evaluation signatures in mouse V1 is challenging because its neural activity can reflect multiple behavioral influences; our task was designed to reduce the impact of such influences, but it also has its shortcomings. Key features of our task, such as self-initiation of trials ensuring constant task engagement, constant running speed, and a delayed reward, allowed us to compare responses to the cues during the task, while controlling for the impact of arousal (Vinck et al., 2015; Reimer et al., 2016), locomotion (Niell and Stryker, 2010; Erisken et al., 2014; Dadarlat and Stryker, 2017), and task engagement (Jurjut et al., 2017; Pho et al., 2018; Jacobs et al., 2020). Yet, to assess baseline neural discriminability, we relied on the sensory control condition, during which the temporal structure of stimulus presentation and the overall arousal level of the mice were different. Although neural discriminability did not depend on locomotion per se, we cannot rule out that stimulus timing or arousal contributed to the difference in discriminability between sensory control and task condition. We also did not monitor orofacial features, such as movements of the face, the nose, or whiskers (Musall et al., 2019; Stringer et al., 2019; Salkoff et al., 2020), whose impact on V1 activity can be substantial during unconstrained behavior (Stringer et al., 2019) but might be less pervasive when mice are engaged in a visual task (Musall et al., 2019).
One of our findings is that substantial modulations of V1 responses were revealed only under tight control of eye positions. Even under head fixation, mice make saccadic eye movements (Sakatani and Isa, 2007; Payne and Raymond, 2017; Meyer et al., 2018, 2020; Samonds et al., 2018), whose size can depend on stimulus properties (Samonds et al., 2018) and whose frequency increases with running (Niell and Stryker, 2010; Keller et al., 2012; Ayaz et al., 2013; Bennett et al., 2013). Broadly consistent with the notion that saccades in head-fixed mice are related to attempted head movements (Meyer et al., 2020), we found that saccades were reliably preceded by acceleration. Systematic differences in eye position across behavioral conditions (Keller et al., 2012; Liang et al., 2020) required a stringent eye position matching procedure (Roelfsema et al., 1998), because simply collecting more trials would not alleviate such systematic differences. As we show, stratifying mouse V1 responses by eye position removes variability that might otherwise mask or reduce effects of interest. Indeed, had we not considered eye position, we would have grossly underestimated the percentage of modulated V1 neurons.
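To make the matching procedure concrete, the sketch below shows one possible implementation of such a stratification: trials from the two conditions are binned by eye position, and surplus trials within each bin are discarded at random until the two eye-position distributions are identical. The bin width and the subsampling scheme are illustrative choices made here, not a prescription from our Methods.

```python
# Sketch of eye-position stratification across two behavioral conditions.
# `eye_a` and `eye_b` are per-trial mean eye positions (deg); the bin width
# is an arbitrary choice for illustration.
import numpy as np

rng = np.random.default_rng(1)

def stratify_by_eye_position(eye_a, eye_b, bin_width=1.0):
    """Return trial indices for each condition such that the two subsets
    have identical eye-position histograms at `bin_width` resolution."""
    edges = np.arange(min(eye_a.min(), eye_b.min()),
                      max(eye_a.max(), eye_b.max()) + bin_width, bin_width)
    bins_a, bins_b = np.digitize(eye_a, edges), np.digitize(eye_b, edges)
    keep_a, keep_b = [], []
    for b in np.unique(np.concatenate([bins_a, bins_b])):
        idx_a, idx_b = np.flatnonzero(bins_a == b), np.flatnonzero(bins_b == b)
        n = min(len(idx_a), len(idx_b))  # discard the surplus in either set
        keep_a.extend(rng.choice(idx_a, n, replace=False))
        keep_b.extend(rng.choice(idx_b, n, replace=False))
    return np.array(keep_a, dtype=int), np.array(keep_b, dtype=int)
```

Note that this procedure trades trials for comparability: bins occupied in only one condition contribute nothing, which is why simply collecting more trials cannot substitute for matching.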
A key feature of our behavioral task is that mice self-initiate each trial by running (see also Marques et al., 2018), a requirement we chose in order to maintain a constant level of task engagement. One might suspect that this requirement for running negatively affects performance, given previous reports of poorer visual performance during hyperaroused states, including running (McGinley et al., 2015; McBride et al., 2019; Neske et al., 2019; Salkoff et al., 2020). We found, however, that running was not detrimental to visual performance in our task, with a post-learning average d' of almost 3. Similarly, in earlier work, running did not compromise motion coherence thresholds during direction discrimination in random dot patterns (Marques et al., 2018). One potential explanation for this apparent contradiction might be that, in Go/No-go tasks without self-initiation of trials, periods of running or hyperarousal might coincide with lower task engagement, which might underlie the poorer visual performance. A similar argument has recently been proposed for the low-arousal case, where trials indexed by high amplitudes of low-frequency oscillations were linked more to reduced task engagement than to impaired performance (Jacobs et al., 2020).
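For reference, d' in a Go/No-go design is the difference between the z-transformed hit and false-alarm rates; the sketch below uses one common correction for rates of exactly 0 or 1, a convention chosen here for illustration rather than taken from our Methods.

```python
# d' = z(hit rate) - z(false-alarm rate); illustrative implementation.
from scipy.stats import norm

def dprime(hits, misses, false_alarms, correct_rejections):
    # add half a count to each cell so rates of 0 or 1 stay finite
    # after the inverse-normal transform (a log-linear-style correction)
    hr = (hits + 0.5) / (hits + misses + 1.0)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(hr) - norm.ppf(far)

# e.g., 95 hits / 5 misses and 5 false alarms / 95 correct rejections
# yield a d' of about 3.2
print(dprime(95, 5, 5, 95))
```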
Differential neural responses in V1 to identical visual stimulation, under comparable running behavior and eye positions, could reflect several task-related factors, such as reward expectancy or learned categorical representations. Reward expectancy can shape activity in V1 of rodents (Shuler and Bear, 2006; Chubykin et al., 2013; Poort et al., 2015) and macaques (Stănişor et al., 2013). In macaque V1, if the amount of a fluid reward is manipulated and independently assigned to either of two stimuli, multiunit responses to the same stimulus can become stronger with increasing values of relative reward (Stănişor et al., 2013). Such a straightforward relation, however, did not exist in our single-neuron data: only half the population responded more strongly to the reward-predicting cue; the other half responded more strongly to the cue predicting no reward. We would argue, however, that there is no a priori reason that the cue predicting a reward is the only one that is behaviorally relevant. In our paradigm, the cue signaling the absence of reward might be just as relevant if the goal is to save time and energy. Alternatively, the differential processing of the two cues might not reflect the value of reward, but rather a categorical representation emerging in the context of the task. Training macaques in categorical discrimination of similar shapes, for instance, can increase selectivity of neurons in ventral stream areas (Logothetis et al., 1995; Baker et al., 2002). Remarkably, in inferior temporal cortex, selectivity for stimulus categories can be stronger in a categorization task than during passive viewing (McKee et al., 2014). The differential responses we observed in mouse V1 might similarly reflect the fact that the mouse has learned to categorize the two stimuli.
In contrast to the balanced selectivity we observed in area V1, preferences in ACC were biased toward the stimulus that promised a reward; this biased selectivity in ACC was task-specific because it only emerged in response to the behavioral demands. Neural populations at higher processing stages often exhibit pronounced biases for specific stimuli. For instance, in macaques trained to discriminate visual stimuli, the majority of neurons recorded in the lateral intraparietal area responded most strongly to one particular stimulus (Fitzgerald et al., 2013). Such biased selectivity might be beneficial in perceptual tasks with binary or discrete outcomes because the overall level of activity could then be used by downstream areas to infer stimulus identity and commit to an action. Biased selectivity has also been observed in mice trained to discriminate two orientations: 88% of neurons imaged in PPC responded more strongly to a rewarded than to an unrewarded orientation (Pho et al., 2018). These findings, together with ours, suggest that, in the mouse, stimulus representations transition along the cortical hierarchy from a balanced to a biased scheme. Future work will have to clarify under which behavioral conditions biased selectivity emerges, and where in cortex the transition from balanced to biased distributions of preferences occurs.
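The toy simulation below, using entirely made-up firing rates, illustrates this point: with balanced preferences the summed population rate is the same for both cues, whereas a biased population separates the cues by total activity alone, so a downstream reader needs nothing more than the overall rate.

```python
# Toy illustration of balanced vs biased selectivity regimes; all rates
# are invented for this sketch and carry no quantitative meaning.
import numpy as np

rng = np.random.default_rng(2)
n_neurons, n_trials = 100, 200

def mean_population_sums(frac_prefer_go):
    n_go_pref = int(frac_prefer_go * n_neurons)
    # preferred cue drives 10 Hz, non-preferred 5 Hz (arbitrary values)
    rates_go = np.r_[np.full(n_go_pref, 10.0), np.full(n_neurons - n_go_pref, 5.0)]
    rates_stop = np.r_[np.full(n_go_pref, 5.0), np.full(n_neurons - n_go_pref, 10.0)]
    go_sum = rng.poisson(rates_go, (n_trials, n_neurons)).sum(axis=1)
    stop_sum = rng.poisson(rates_stop, (n_trials, n_neurons)).sum(axis=1)
    return go_sum.mean(), stop_sum.mean()

print(mean_population_sums(0.50))  # balanced (V1-like): sums ~equal
print(mean_population_sums(0.75))  # biased (ACC-like): Go sum clearly larger
```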
What might be the neural circuits underlying the task-dependent modulation we observed in area V1? Although we have no direct evidence, several lines of research point to ACC as a prime candidate. ACC is important for reward-guided action selection (Doya, 2008; Kolling et al., 2016; Shenhav et al., 2016). In foraging-like behaviors, responses of ACC neurons represent the benefits of certain actions relative to the required effort (Walton et al., 2002; Rudebeck et al., 2006; Kennerley et al., 2009; Hillman and Bilkey, 2010; Kolling et al., 2012). Effort-based decision-making might be one of the processes relevant in our task because mice have to commit to running bouts to harvest a reward. Furthermore, ACC sends direct projections to mouse V1, which can carry top-down signals shaping the activity of V1 neurons (Zhang et al., 2014, 2016; Fiser et al., 2016; Leinweber et al., 2017; Huda et al., 2020). Consistent with the idea that ACC contributes to the selectivity emerging in V1, we found that neurons in ACC became strikingly selective during cue evaluation. Conclusively demonstrating, however, that ACC drives the V1 modulations in our task will require simultaneous recordings from both areas (e.g., Steinmetz et al., 2019) or optogenetic suppression of ACC terminals over V1 (e.g., Zhang et al., 2014).
Footnotes
This work was supported by a Starting Independent Researcher grant from the European Research Council awarded to S.K. (281885 PERCEPT) and by funds awarded to the Centre for Integrative Neuroscience within the framework of the German Excellence Initiative (DFG EXC 307). G.B. was supported by Deutsche Forschungsgemeinschaft grant DFG BU 1808/5-1. We thank Agne Klein (née Vaiceliunaite) and Petya Georgieva for help with the experiments.
The authors declare no competing financial interests.
Correspondence should be addressed to Steffen Katzner at steffen.katzner@lmu.de