Abstract
Delayed-response sensory discrimination is believed to require primary sensory thalamus and cortex for early stimulus identification and higher-order forebrain regions for the late association of stimuli with rewarded motor responses. Here we investigate neuronal responses in the rat primary somatosensory cortex (S1) and ventral posterior medial nucleus of the thalamus (VPM) during a tactile discrimination task that requires animals to associate two different tactile stimuli with two corresponding choices of spatial trajectory to be rewarded. To manipulate reward expectation, neuronal activity observed under regular reward contingency (CR) was compared with neuronal activity recorded during freely rewarded (FR) trials, in which animals obtained reward regardless of their choice of spatial trajectory. Across-trial firing rates of S1 and VPM neurons varied according to the reward contingency of the task. Analysis of neuronal ensemble activity by an artificial neural network showed that stimulus-related information in S1 and VPM increased from stimulus sampling to reward delivery in CR trials but decreased to chance levels when animals performed FR trials, when stimulus discrimination was irrelevant for task execution. Neuronal ensemble activity in VPM was only correlated with task performance during stimulus presentation. In contrast, S1 neuronal activity was highly correlated with task performance long after stimulus removal, a relationship that peaked during the 300 ms that preceded reward delivery. Together, our results indicate that neuronal activity in the primary somatosensory thalamocortical loop is strongly modulated by reward contingency.
Introduction
It is traditionally believed that the execution of delayed-response sensory discrimination tasks depends on related but separate sets of neural structures (Baddeley, 1995). According to this view, stimulus-related information is first acquired and processed by lower-order forebrain areas, such as the primary sensory thalamus and cortex, and then sent to higher-order neural substrates, such as the hippocampus (Olton et al., 1980; Wirth et al., 2003) and nonprimary cortical areas (Fuster, 1989a; Salzman and Newsome, 1994; Leon and Shadlen, 1999; Platt and Glimcher, 1999; Miller, 2000; Shadlen and Newsome, 2001; Romo et al., 2002), where stimulus–response relationships are stored by way of learned reward contingencies so as to be useful for decision making (Schultz, 2002). Higher-order areas are also thought to hold on to task-relevant information until enough time has elapsed for the execution of delayed motor responses (Dehaene and Changeux, 2000).
However, mounting evidence suggests that primary sensory cortices are also involved in more complex processes such as reward timing (Shuler and Bear, 2006), efferent copy processing (Wikgren et al., 2003), and working memory (Zhou and Fuster, 1996; Supèr et al., 2001). The primary visual cortex (V1), for instance, has been implicated in complex processing beyond the analysis of the basic features of the visual scene. Attention effects have been demonstrated to influence activity in V1 (Crist et al., 2001; Li et al., 2004). Previously, our laboratory has shown that tactile responses in the primary somatosensory “barrel field” cortex (S1) differ significantly between active discrimination and passive discrimination, indicating that top-down influences contribute to S1 neuronal firing patterns (Krupa et al., 2004). Furthermore, transcranial magnetic stimulation of S1 long after stimulus removal interferes with a delayed-response task (Harris et al., 2002).
Here, we investigated the delayed tactile discrimination task used by Krupa et al. (2004) under different reward contingencies to further test the hypothesis that neuronal activity in the rat primary somatosensory thalamocortical loop performs more than simple ascending sensory processing. More than 27% of the single units recorded showed significant firing rate modulation between stimulus disengagement and reward. At the neuronal ensemble level, we found that reward contingency strongly modulates processing in the poststimulus phase, when animals execute a decision toward reward. Stimulus-related information at the ensemble level increased when the contingency between stimulus and response was crucial for reward and decreased when the stimulus was irrelevant. In addition, stimulus-related information in the cortex (but not thalamus) was directly related to behavioral performance in the task. This poststimulus neuronal processing in the primary somatosensory thalamocortical loop may be necessary for accurate stimulus discrimination immediately before reward delivery and/or the top-down reinforcement of rewarded behavioral choices.
Materials and Methods
Discrimination task.
Six adult Long–Evans rats (250–300 g) were trained to perform a tactile discrimination task for water reward (Krupa et al., 2001). The behavioral apparatus consisted of two chambers, a discrimination chamber and a waiting/reward chamber, connected by a central door (CD). The discrimination chamber contains two aligned bars that form apertures of variable width and constitute the source of tactile stimulation in these experiments. The position of these bars relative to the whiskers of the animal while poking the center nose poke (NP) determined two possible stimulus configurations, one narrow (52 mm) and another wide (85 mm). In both stimulus configurations, the whiskers touched the bars, which were positioned equidistantly with respect to NP. Initially rats were placed in the waiting/reward chamber, and at the beginning of each trial, the central door separating both chambers opened, allowing animals to explore the discrimination chamber. A successful nose poke detected by an infrared sensor inside the NP hole triggered the opening of the reward windows in the waiting/reward chamber.
Animals took up to 20 d to reach criterion performance (>80% correct trials). Well-trained animals learned to move forward as soon as the central door opened, poking the NP directly and quickly returning to the waiting/reward chamber to choose one of two reward windows (left or right). Once infrared beams inside the reward windows were broken, 50 μl of water was delivered as reward. At the same time, the central door closed, marking the end of the trial. During the 15 s intertrial interval, the CD was closed to prevent access to the discrimination chamber; during this period, the stimulus aperture was randomly reset to either the wide or the narrow position. Sessions typically lasted 1 h each, during which an average of 100 trials were collected. This allowed animals to sample the stimuli (narrow or wide bar aperture) in a fast and repetitive manner (Fig. 1A). All six rats were recorded in the contingent reward (CR) condition, in which the reward was delivered only after a correct discrimination (i.e., right reward-poke for wide stimulus and left reward-poke for narrow stimulus). Five rats were also recorded in the free reward (FR) condition, in which the reward was delivered regardless of the behavioral choice. Neurons from S1 and ventral posterior medial nucleus of the thalamus (VPM) were simultaneously recorded under both CR and FR conditions in four rats. The behavioral apparatus was designed to produce very stereotypical behavior throughout the trial and to prevent the use of sensory cues other than tactile to solve the task (Krupa et al., 2001). The two chambers were located inside a sound-attenuated, light-proof isolation box, and all the events of the behavioral task were fully controlled by a computer program (Med Associates, St. Albans, VT). Behavior was also recorded on videotape with infrared-sensitive CCD cameras placed inside the waiting/reward and discrimination chambers.
Contingent and free reward in a tactile discrimination task. A, Schematic diagram of the behavioral apparatus used in the tactile discrimination task. As a trial begins, the rat enters the stimulus chamber and uses its whiskers to touch two bars placed around the NP, which determine a narrow or wide aperture. The animal then returns to the reward chamber and chooses the left or right window to receive a water droplet as reward. Reward contingency was manipulated to vary stimulus relevance. In the CR condition, reward was only delivered when animals poked the correct reward window given the stimulus presented (narrow aperture for left reward window and wide aperture for right reward window). In CR sessions, correct stimulus discrimination was highly relevant for obtaining reward. In the FR condition, reward was delivered regardless of the stimulus presented, and therefore the stimuli were irrelevant for the execution of the task. B, The time course of a single trial is shown, with the approximate discrimination period indicated by gray shading. RTI is the time interval between the opening of the CD, at the beginning of the trial, and the nose poke in the stimulus chamber; RTII is the time difference between NP and reward delivery (RW); RTIII is the time difference between the discrimination beam (photobeam between the bars) and the nose poke. C, Performance in CR and FR sessions is shown for each animal recorded in both conditions. D, A behavioral bias in the choice of reward window was present in FR sessions but not in CR sessions and was used as an independent criterion for the inclusion of FR trials in the dataset analyzed. Behavioral bias was calculated as the absolute value of (X − Y)/(X + Y), where X is the number of left choices, and Y is the number of right choices. After switching from the CR to the FR condition, animals usually took <10 trials to display the bias. Unbiased trial blocks in FR sessions were discarded from the experiment. The behavioral bias in FR sessions was reminiscent of the biases detected during the early phases of task acquisition (data not shown). E, Reaction times within the trials (RTI, RTII, and RTIII) did not differ significantly between CR and FR sessions (ANOVA, RTI, F(1,7) = 1.822, p = 0.22; RTII, F(1,7) = 0.24, p = 0.63; RTIII, F(1,7) = 3.26 × 10⁻⁴, p = 0.98). Error bars represent SEM.
Daily recordings comprising both CR and FR sessions were performed in tandem, with a 10 min rest interval between them. The temporal order of the two conditions was alternated to control for satiety and other possible order-related effects. CR and FR sessions involved an identical random schedule for stimulus presentation. To control for the possibility of animals performing the FR task as if it were a CR task (i.e., obeying the reward contingency even in its absence), we took advantage of the observation that during FR sessions animals tended to express a characteristic left or right behavioral bias in their choice of side for reward delivery (Fig. 1D). The behavioral bias was calculated in blocks of 10 trials as the absolute value of (X − Y)/(X + Y), where X is the number of left choices and Y is the number of right choices executed by the animal. This bias was absent in CR sessions and was therefore used as an independent criterion to distinguish the two conditions. Task performance (see Figs. 1C, 12) was calculated as the percentage of correct trials over the total number of trials. For each trial, we calculated the time interval between the opening of the central door and the nose poke in the stimulus chamber [reaction time I (RTI)]; the time interval between nose poke and reward delivery [reaction time II (RTII)]; and the time interval between the moment when the animal first touched the bars and the nose poke [reaction time III (RTIII)]. The moment of bar touch was defined by the breaking of an infrared photobeam between the bars, previously shown to coincide with the beginning of whisker deflection (Krupa et al., 2001). In CR sessions, RTI, RTII, and RTIII lasted 1.73 ± 0.07, 1.83 ± 0.13, and 0.30 ± 0.0034 s, respectively (mean ± SEM). These values did not change significantly between CR and FR sessions (Fig. 1E). Therefore, the discrimination chamber was available to the animals for 3.6 s on average.
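The bias index and the performance measure described above can be illustrated with a minimal Python sketch. The choice encoding ('L'/'R'), the function names, and the decision to drop an incomplete trailing block are assumptions made for illustration; the original analysis code is not reproduced here.

```python
def behavioral_bias(choices):
    """Return |X - Y| / (X + Y), where X = left choices and Y = right choices."""
    left = sum(1 for c in choices if c == "L")
    right = sum(1 for c in choices if c == "R")
    total = left + right
    if total == 0:
        raise ValueError("block contains no left/right choices")
    return abs(left - right) / total


def bias_per_block(choices, block_size=10):
    """Bias index in consecutive, non-overlapping blocks of trials.

    Assumption: an incomplete block at the end of the session is ignored.
    """
    return [
        behavioral_bias(choices[i:i + block_size])
        for i in range(0, len(choices) - block_size + 1, block_size)
    ]


def percent_correct(outcomes):
    """Task performance: percentage of correct trials over all trials."""
    return 100.0 * sum(1 for o in outcomes if o) / len(outcomes)
```

For example, a block of eight left and two right choices gives a bias of 0.6, whereas a strictly alternating block gives 0.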
Behavioral quantification.
Video tracking software (Anymaze; Stoelting, Kiel, WI) was used to overlay a grid on the video images recorded from all the sessions. Trajectories were quantified by careful visual inspection: a reference point on the animal's body (a fixed point between the ears) was chosen, and its position on the grid was tracked across consecutive video frames (see Fig. 9A). This procedure was performed blindly (i.e., the experimenter did not have knowledge of the CR/FR identity of the sessions). Left and right motor trajectories were classified according to the specific sequence of grid positions (i.e., the variance of motion across the spatial grid). The Kolmogorov–Smirnov test was used to compare the CR versus FR distributions of trajectory types (see Fig. 9B,C). Trajectory analysis was run for all animals except one, whose video recording was of too poor quality to allow precise quantification of the behavior.
Surgical implants.
After reaching task criterion, animals were surgically implanted with multielectrode arrays as described previously (Kralik et al., 2001). Stereotaxic coordinates in millimeters relative to bregma (Paxinos and Watson, 1997) were used to center the arrays in the S1 [anteroposterior (AP), +3.0; mediolateral (ML), +5.5; dorsoventral (DV), −1.5] and VPM (AP, +3.0; ML, +3.0; DV, −5.0). Preoperative and postoperative animal care followed National Institutes of Health guidelines.
Chronic neuronal recordings.
A multineuron acquisition processor (32 channels; Plexon, Dallas, TX) was used to record neuronal spikes and local field potentials, as described previously (Nicolelis et al., 1999). Briefly, differentiated neural signal was preamplified (2000–32,000×) and digitized at 40 kHz. Up to four single neurons per recording channel were sorted on-line (SortClient 2002; Plexon) and validated by off-line analysis (Offline Sorter 2.3; Plexon) according to the following cumulative criteria: (1) voltage thresholds >2 SDs of amplitude distributions; (2) signal-to-noise ratio >2.5 (as verified on the oscilloscope screen); (3) <1% of interspike intervals (ISIs) smaller than 1.2 ms; (4) stereotypy of waveform shapes, as determined by a waveform template algorithm and principal component analysis. Up to 64 channels were simultaneously recorded from a single brain area. Recording sites were histologically verified by comparing cresyl-stained 30 μm frontal brain sections with reference anatomical planes (Paxinos and Watson, 1997).
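Criterion (3) above can be checked with a short Python sketch. It assumes spike times are given in seconds and uses hypothetical function names; it illustrates the interspike-interval test only and does not reproduce the Plexon sorting software.

```python
def isi_violation_fraction(spike_times_s, refractory_s=0.0012):
    """Fraction of interspike intervals shorter than refractory_s (1.2 ms)."""
    times = sorted(spike_times_s)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0
    violations = sum(1 for isi in isis if isi < refractory_s)
    return violations / len(isis)


def passes_isi_criterion(spike_times_s, max_fraction=0.01, refractory_s=0.0012):
    """True if fewer than 1% of ISIs fall below the 1.2 ms limit."""
    return isi_violation_fraction(spike_times_s, refractory_s) < max_fraction
```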
Neuronal data analysis.
Data were processed and analyzed by custom-made Matlab (MathWorks, Natick, MA) code running on a computer cluster comprising 32 central processing units (Evolocity; LNXI, Sandy, UT). For all analyses, trials in which reaction time II lasted longer than 4 s were discarded. Categories of tactile responses were determined by significant changes in single-unit activity during 2 s around the contact with the stimuli when compared with the baseline activity (see Figs. 3, 5, 10), which was measured in the time window from −3 to −2 s before the nose-poke photobeam. Magnitudes of the responses during the presentation of the stimulus were defined as the increase (excited cells) or decrease (inhibited cells) in firing rates (in hertz) relative to the baseline firing levels using a program developed by Mike Wiest and Ranier Gutierrez (Wiest et al., 2005; Gutierrez et al., 2006). The duration, also calculated for inhibited and excited cells, was defined as the time during which significant firing changes occurred (p < 0.01). The terms “excited” and “inhibited”, therefore, describe cells with incremental and decremental changes in firing rate, respectively. To assess whether the diversity of response patterns observed could be automatically separated into different functional classes, we calculated the Pearson's correlation coefficients between the peristimulus time histograms of every pair of recorded units. The response histograms included data from 2 s around NP, summed in 50 ms bins. The correlation matrix (see Fig. 4C) represents the similarity of each unit's response to every other unit's response. The correlation matrix was then reordered by an algorithm that clusters similarly responding neurons together, resulting in square domains of high correlation that correspond to different functional classes of response (Lin et al., 2006) (supplemental material, available at www.jneurosci.org).
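The pairwise correlation step can be sketched in Python as follows. The sketch assumes that "2 s around NP" means −1 to +1 s relative to the nose poke and that spike times are pooled across trials; both are assumptions, as are the function names. The cluster-reordering step (Lin et al., 2006) is not implemented here.

```python
import math


def bin_counts(spike_times, t_start=-1.0, t_stop=1.0, bin_s=0.05):
    """Sum NP-aligned spike times (pooled over trials) into 50 ms bins."""
    n_bins = int(round((t_stop - t_start) / bin_s))
    counts = [0] * n_bins
    for t in spike_times:
        idx = math.floor((t - t_start) / bin_s)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length histograms."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a flat histogram
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(histograms):
    """Similarity of each unit's response to every other unit's response."""
    return [[pearson(a, b) for b in histograms] for a in histograms]
```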
Mean firing rate and mean ISI single-unit analysis (see Table 2) were performed with the analysis package Nex (Plexon). Nonparametric statistics (Mann–Whitney U test) were used to evaluate the significance of the data sampled from individual neurons and task conditions.
For the thalamocortical ensemble analysis (see Figs. 7, 10, 11), an artificial neural network (ANN) for pattern classification (Bishop, 1995; Nicolelis et al., 1999) was coded in Matlab. The ANN implemented an optimized learning vector quantization algorithm to quantify the ability of the neuronal ensembles recorded to discriminate between narrow and wide stimuli. The ANN attempts to predict the stimulus presented on each trial from the binned spiking activity recorded during that trial. The inputs to the ANN were single-trial perievent histograms of neuronal spiking in eight 50 ms bins (400 ms windows) time locked to the moment the rat's nose broke the NP beam. To produce a continuous quantitative readout of stimulus-related information (i.e., “wide” or “narrow” stimulus) present in the neuronal ensemble activity throughout the entire trial, we advanced the 400 ms ANN analysis window in 50 ms steps and recalculated the ANN's classification performance at each step (sliding window) (see Fig. 7B). Note that the ANN was always trained with neuronal ensemble activity around the NP but was tested at different moments in time. Therefore, above-chance ANN performance represents similarity with neuronal ensemble activity patterns that occurred during stimulation (for more details, see supplemental material, available at www.jneurosci.org).
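The train-at-NP, test-with-a-sliding-window logic can be sketched in Python under stated assumptions. The classifier below is a plain LVQ1 with one prototype per class, not the optimized learning vector quantization variant used in the study; the window placement relative to NP, the learning rate, and all function names are hypothetical. For brevity the sketch also omits the separation of training and test trials that a real analysis would need.

```python
import math


def window_features(trial_spikes, window_start, n_bins=8, bin_s=0.05):
    """Per-neuron spike counts in eight 50 ms bins, concatenated.

    trial_spikes: one list of NP-aligned spike times (s) per neuron.
    """
    features = []
    for spikes in trial_spikes:
        counts = [0] * n_bins
        for t in spikes:
            idx = math.floor((t - window_start) / bin_s)
            if 0 <= idx < n_bins:
                counts[idx] += 1
        features.extend(counts)
    return features


def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((p - v) ** 2 for p, v in zip(prototypes[i][0], x)))


def train_lvq1(samples, labels, epochs=20, lr=0.1):
    """Plain LVQ1: prototypes start at class means and are pulled toward
    correctly matched samples and pushed away from mismatched ones."""
    prototypes = []
    for label in sorted(set(labels)):
        members = [s for s, l in zip(samples, labels) if l == label]
        prototypes.append(([sum(col) / len(members) for col in zip(*members)], label))
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            i = nearest(prototypes, x)
            vec, lab = prototypes[i]
            sign = 1.0 if lab == y else -1.0
            prototypes[i] = ([p + sign * lr * (v - p) for p, v in zip(vec, x)], lab)
    return prototypes


def sliding_performance(prototypes, trials, labels, window_starts):
    """Fraction of trials whose stimulus is predicted correctly per window."""
    performance = []
    for start in window_starts:
        hits = sum(
            prototypes[nearest(prototypes, window_features(trial, start))][1] == label
            for trial, label in zip(trials, labels)
        )
        performance.append(hits / len(trials))
    return performance
```

In use, the prototypes would be trained once on windows placed around NP, and `sliding_performance` would then be called with window starts spaced 0.05 s apart to trace classification performance across the trial.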
For the comparison of stimulus-related neuronal ensemble information in the CR and FR conditions (see Fig. 12A), “wrong” trials in all sessions were discarded to control for different motor responses (left/right) to a given stimulus type (narrow/wide). Therefore, the associations between stimulus type and motor response were identical in the conditions CR and FR and could not account for their differences. We also balanced the number of trial types (narrow/left and wide/right) between CR and FR, so as to compare the same number of rewarded trials on each side. The comparison of correct-only trials between CR and FR sessions was only possible because animals in the FR condition, despite the characteristic behavioral bias to one side, still performed a substantial number of trials following the correct association between stimulus type and reward delivery windows. Additional trials were randomly discarded to equalize the number of CR and FR trials for each CR/FR comparison.
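One possible reading of this trial-selection procedure is sketched below in Python. The trial representation (dictionaries with 'stimulus' and 'choice' keys), the function name, and the fixed random seed are assumptions for illustration only.

```python
import random

# Stimulus/choice pairs that count as correct under the CR contingency.
CORRECT_PAIRS = (("narrow", "left"), ("wide", "right"))


def balance_correct_trials(cr_trials, fr_trials, seed=0):
    """Keep correct trials only and equalize counts per trial type.

    For each trial type (narrow/left, wide/right), surplus trials are
    randomly discarded so that CR and FR contribute the same number.
    """
    rng = random.Random(seed)
    balanced = {"CR": [], "FR": []}
    for pair in CORRECT_PAIRS:
        cr = [t for t in cr_trials if (t["stimulus"], t["choice"]) == pair]
        fr = [t for t in fr_trials if (t["stimulus"], t["choice"]) == pair]
        n = min(len(cr), len(fr))
        balanced["CR"].extend(rng.sample(cr, n))
        balanced["FR"].extend(rng.sample(fr, n))
    return balanced
```

Matching counts within each trial type, rather than only in total, keeps the number of rewarded trials on each side equal across the two conditions.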
For the comparison of single-neuron firing rates in the CR and FR conditions (see Fig. 10A,B), perievent histograms of spike counts were calculated with 100 ms bins for a 1 s interval around the nose poke in the stimulus chamber and a 1 s interval before reward delivery in the reward chamber. For ANN analyses, three within-trial periods lasting 0.88 s each were selected with respect to stimulus delivery. The distinct time intervals were Pre (−2 to −1.12 s before NP; thus before contact with stimulus), Stim (−0.4 to +0.48 s around NP; during contact with stimulus), and Post (0.64–1.52 s after NP; after stimulus offset, but before reward) (Fig. 1).
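The three within-trial periods can be written down explicitly, as in the following Python sketch. The interval boundaries come from the text; the half-open interval convention and the function name are assumptions.

```python
# Within-trial periods in seconds relative to the nose poke (NP).
EPOCHS = {
    "Pre": (-2.0, -1.12),   # before contact with the stimulus
    "Stim": (-0.4, 0.48),   # during contact with the stimulus
    "Post": (0.64, 1.52),   # after stimulus offset, before reward
}


def epoch_rates(spike_times, epochs=EPOCHS):
    """Mean firing rate (Hz) of one NP-aligned spike train in each period."""
    rates = {}
    for name, (start, stop) in epochs.items():
        n_spikes = sum(1 for t in spike_times if start <= t < stop)
        rates[name] = n_spikes / (stop - start)
    return rates
```

Each period spans 0.88 s, so spike counts from the three periods are directly comparable.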
Results
First we compared thalamic and cortical sensory responses during active tactile discrimination while recording extracellular single units from six rats. A total of 473 S1 and 242 VPM neurons (79 ± 16.6 and 48 ± 7 units per animal; mean ± SEM) were recorded in this study. We found that the proportions of cells modulated by the stimuli used were not significantly different (S1, 40%; VPM, 45%). Figure 2 shows that most S1 neurons had a mean firing rate of <10 Hz, whereas VPM neurons showed a broader frequency distribution. These data suggest that we recorded mostly excitatory pyramidal cells in S1 (Swadlow, 2003). This is in agreement with the fact that the multielectrode implants were aimed at cortical layer V, 1.5 mm below the pial surface. Because rat VPM contains no inhibitory neurons (Barbaresi et al., 1986; Williams and Faull, 1987), all the VPM neurons we recorded were probably projecting cells.
Mean frequency distribution of the thalamocortical population. For each cell, we calculated firing rate for all trials during a 4 s interval around NP. The figure shows the mean firing rate distributions for all neurons in S1 (n = 473) and VPM (n = 242) that showed some sort of modulatory response to the apertures (narrow or wide). S1 neurons were mostly characterized by low frequencies (<10 Hz), whereas a substantial proportion of VPM neurons showed higher firing rates (>10 Hz). Error bars represent SEM.
A variety of tactile responses was found in both S1 and VPM (Fig. 3A). For instance, 25% of the S1 responses and 36% of the VPM responses were purely inhibited, whereas 39 and 40% were strictly excited, respectively. A third category of neurons showed multiphasic modulations (36% of S1- and 24% of VPM-responsive cells). The duration and magnitude of the inhibited and excited responses did not differ significantly either within (Fig. 3B) (magnitude, ANOVA, S1, F(1,9) = 0.22, p = 0.6; VPM, F(1,9) = 0.47, p = 0.5; duration, ANOVA, S1, F(1,9) = 0.007, p = 0.9; VPM, F(1,9) = 1.6, p = 0.2) or across (magnitude, ANOVA, excited cells, F(1,9) = 1.32, p = 0.3; inhibited cells, F(1,9) = 0.004, p = 0.95; duration, ANOVA, excited cells, F(1,9) = 0.52, p = 0.5; inhibited cells, F(1,9) = 0.41, p = 0.5) areas.
Tactile responses in the thalamocortical loop. A, Neuronal responses around stimulus sampling were classified as excited, inhibited, or multiphasic, according to changes in firing rate around the NP. Each panel shows a raster plot (top) and a perievent time histogram (bottom) averaged across 81 ± 9 trials for S1 and 84 ± 12 trials for VPM (mean ± SEM). B, Average magnitude and duration (mean ± SEM) of modulated responses (excited or inhibited) in S1 and VPM neurons show no significant differences within the thalamocortical loop. The magnitudes of the responses during stimulus presentation were defined as the increase or decrease in firing rates relative to the baseline activity for each cell individually.
Second, we tested whether neurons whose activity was modulated by the stimulus also showed changes in activation throughout the poststimulus period, as seen in previous studies of brain areas such as the prefrontal cortex (PFC) (Romo et al., 1999). We analyzed the firing rates of neurons classified as excited, inhibited, and multiphasic during the decision-making period [∼1.5 s before reward delivery (RW)] compared with the spontaneous rate (see Materials and Methods) and then classified the cells as having shown “no response,” excitation, or inhibition. Combining the responses during the tactile discrimination period with those during the poststimulus period resulted in six subcategories of neuronal activation for S1 and seven subcategories for VPM (labeled a to g; Fig. 4A,B). However, the correlation-based cluster analysis showed that only two different functional neuronal categories (one for inhibited and one for excited cells) could be automatically separated for S1 and three (one for inhibited and two for excited cells) for VPM (see Materials and Methods and supplemental material, available at www.jneurosci.org), based on the number of clusters with high-similarity responses (Fig. 4C).
Categories of firing rate responses. A, Classification of cortical and thalamic responses based on firing rate changes around NP and before RW. Responses were classified as excited or inhibited according to a statistically significant increase (red) or decrease (blue) in firing rate in a 600 ms window around NP or during the entire delay period until RW. We identified two subcategories of cells that showed excitation during stimuli presentation (a and b), three subcategories of cells showing inhibition (c–e), and two subcategories of cells that showed multiphasic responses around the NP (f and g). B, Percentage of cells in each category. Among excited cells, 73% in S1 and 67% in VPM showed excitation during stimulus sampling and no response before reward (subcategory a), and the rest (S1, 27%; VPM, 33%) showed excitation during stimulus sampling followed by inhibition before reward. Also, most inhibited cells (S1, 74%; VPM, 62%) showed a sustained inhibition from discrimination of stimulus to RW. Note that only a small percentage, 26% in S1 and 13% in VPM, did not show any modulation around reward when inhibited activity was seen around NP (subcategory d). C, Automatic classification of neuronal responses. The correlation matrix (left) represents the degree of similarity between pairs of neuronal response, when plotted against each other. The two boxes in the S1 matrix and three boxes in the VPM matrix represent the clusters of cells automatically classified as distinct categories of response.
Figure 5 illustrates changes in neuronal responsivity throughout the entire poststimulus period for different cell subcategories. This analysis provided a classification of neuronal types according to the poststimulus activity. It also allowed us to observe the complex dynamics of neuronal responses in S1 and VPM. A monotonic pattern of modulation was observed only in subcategories a and c, in which continuous excitation or inhibition could be observed, respectively. Figure 6 shows examples of cells in which activity modulation was sustained during the whole interval between stimulus and reward, returning to baseline firing immediately after RW. This classical response type, typical of “memory” cells in the PFC (Romo et al., 1999), does not represent the majority of the cells recorded from S1 and VPM. Among all cells that showed some modulation of neuronal activity during Stim, only 9 and 7% showed sustained persistent modulation in S1 and VPM, respectively.
Tactile responses in individual categories. Neuronal responses represented in raster plots, average frequency histograms, and cumulative sum plot throughout the trials, from stimulus sample to reward delivery. All plots are centered on the NP. Neuronal activity is higher or lower than the baseline during different moments of the trials. Different neuronal response categories are represented. A, Responses of S1 neurons to the narrow aperture. B, Responses of VPM neurons to the narrow aperture. Blue points indicate the beginning of the trials, and red points indicate reward delivery. +, Increased firing rate; −, decreased firing rate; 0, no response.
Sustained neuronal activation in S1 and VPM. A, Perievent raster plot and average across-trial frequency of two S1 neurons during the tactile discrimination task. Green dots represent the NP event, and red dots represent the RW event. B, Perievent raster plot and average across-trial frequency of two VPM neurons during the tactile discrimination task. Neurons on the right show inhibition from NP to RW during the whole poststimulus period, whereas cells on the left show excitation. Blue dots indicate the beginning of the trials.
To gain additional insight into the effect of active tactile discrimination on thalamocortical ensemble activity, we used an ANN to classify single trials and determine how much stimulus-related information could be detected in the recorded ensembles. As expected, the ANN performance was near chance (no stimulus information) before contact with the stimulus, in both VPM and S1 ensembles. This result is an important validation of our analysis method (Fig. 7A,B). The ANN analysis across different animals showed that both S1 and VPM responses carried a significant amount of information about the stimuli around the NP. Furthermore, this information often increased between the moment when the animal contacted the stimulus and delivery of reward. Data for individual animals are shown in Figure 7A. The average percentage of correctly classified trials for each selected period, Pre (before contact with stimulus), Stim (during contact with stimulus), and Post (after stimulus offset, but before reward), were respectively 47.3, 62.2, and 72.4% for cortical ensembles and 51.7, 63.6, and 63.8% for VPM ensembles. The time course of stimulus-related information in the thalamocortical loop in a representative animal is shown in Figure 7B.
Stimulus-related information in S1 and VPM neuronal ensembles. A, In individual animals, stimulus-related information (mean percentage of correctly predicted trials ± SEM) increased from the prestimulus period (Pre) to the intervals of stimulus sampling (Stim) and poststimulus (pre-reward delivery) period (Post). A significant increase in the number of correct predictions occurred when the animals were contacting the stimulus as well as after stimulus offset. *p < 0.05, **p < 0.01, significant differences from Pre. B, The temporal evolution of stimulus-related information in S1 and VPM neuronal ensembles shows that stimulus-related information in CR sessions slowly increased before NP and persisted significantly above chance levels (50%) even after stimulus removal. The right arrow indicates that reward delivery occurs after the Post period, as indicated by reaction time II measures (see Fig. 1E). Time labels mark the first 100 ms bin in each 400 ms moving window.
To determine whether the time course of stimulus-related neuronal information in primary sensory areas depends on stimulus relevance, we manipulated the reward contingency of the task (i.e., the probability of reward delivery given a certain stimulus–response pair). Two different task conditions were compared for five animals (Fig. 1A): CR and FR. CR consisted of the same task animals had already learned; i.e., reward administration was contingent on animals choosing the correct water-delivery window (left or right) depending on which stimulus was present immediately before (narrow or wide). In contrast, FR consisted of free reward delivery at both windows, regardless of the stimuli presented at each trial. Therefore, stimulus relevance was high during CR sessions and null during FR sessions.
Video coding of the motor behavior within trials showed no significant differences in the spatial pathways taken to perform the CR and FR tasks. Figure 8 shows the frame-by-frame comparison of four trials in which the animal had made either a left or right reward choice. Figure 9A shows the spatial grid used to quantify the motor trajectories in CR and FR sessions. To characterize the path repertoire used by rats in the CR and FR conditions, frame-by-frame visual inspection of the motor pathways was carried out by an experimenter unaware of the CR/FR identity of the sessions (see Materials and Methods). Figure 9B shows that the most frequent trajectory determined by the motion of the animal's head is the shortest one from the NP to RW. This trajectory is strongly predominant in both behavioral conditions for all animals analyzed (Fig. 9C). The distance between the CR and FR distributions (for a given rat and left/right choice) was not statistically different from zero, indicating that the motor behavior across CR and FR conditions is consistent and stereotyped. Therefore, differences in neural activity between the CR and FR tasks are not likely caused by gross differences in the motor paths chosen by the rats in the two conditions (Kolmogorov–Smirnov test used to compare trajectory type distributions, 0.0312 < D < 0.2; 0.905 < p < 1.0).
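The D values reported above are two-sample Kolmogorov–Smirnov statistics, i.e., the largest absolute difference between two empirical cumulative distributions. A minimal Python sketch follows; coding trajectory types as integers is an assumption for illustration (the statistic depends on the ordering chosen for the types), and the p value calculation is not included.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: max |ECDF_a(v) - ECDF_b(v)| over all v."""
    values = sorted(set(sample_a) | set(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    d = 0.0
    for v in values:
        ecdf_a = sum(1 for x in sample_a if x <= v) / n_a
        ecdf_b = sum(1 for x in sample_b if x <= v) / n_b
        d = max(d, abs(ecdf_a - ecdf_b))
    return d
```

For instance, if trajectory type 1 accounts for 9 of 10 trials in one condition and 8 of 10 in the other, with the remainder of type 2, D is 0.1.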
Comparison of motor trajectory between CR and FR. Left, Comparison of left choice trials between CR and FR sessions, when the animal is sampling a narrow aperture in the tactile discrimination task. Right, Comparison of right choice trials between CR and FR sessions when the animal is sampling a wide aperture in a tactile discrimination task. Note that the rats present stereotypical motor trajectories regardless of CR or FR conditions, as verified frame by frame. The precise moments when the photobeams at the NP and RW are broken are shown in the figure.
Quantification of motor trajectory comparison between CR and FR. A, Grid generated by the video tracking software (Anymaze; Stoelting) used for the frame-by-frame quantification of head position during the motor trajectory from stimulus sampling to reward delivery. The red dot between the ears represents the reference point used for the behavioral quantification of the animal body motion. B, Spatial grid scheme with columns (1–9) and rows (a–c) used to track the white dot reference for the animal's movement from the NP to RW. The thicker line represents the predominant trajectory (red dot moving from a4 to b5 to b6 to b7 to b8) observed in correct wide trials (>90% of all trajectories), and the thinner lines represent other alternative trajectories found (<4% each). X indicates the center of each grid square. C, Comparison of the most frequent motor trajectories during CR and FR sessions. All the animals analyzed showed the same predominant trajectory in both tasks.
To contrast the firing modulation of single S1/VPM neurons during CR and FR sessions around epochs of stimulus and reward delivery, a bin-by-bin (100 ms) comparison was used to identify specific neurons, among those showing some kind of modulation, whose firing modulations were distinct during CR and FR trials (p < 0.05, t test corrected for multiple comparisons) in S1 and VPM (Fig. 10A).
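A bin-by-bin comparison of this kind could be sketched as follows. The sketch makes several assumptions that are not stated in this section: the correction for multiple comparisons is taken to be Bonferroni, the test statistic is a Welch t statistic, and the p value uses a normal approximation to the t distribution (adequate only for large trial counts). The function name and the spike counts are hypothetical.

```python
from statistics import NormalDist, mean, variance


def significant_bins(cr_counts, fr_counts, alpha=0.05):
    """cr_counts / fr_counts: one list of per-trial spike counts per 100 ms bin.
    Returns the indices of bins whose CR/FR difference survives correction."""
    n_bins = len(cr_counts)
    threshold = alpha / n_bins  # Bonferroni correction (assumed, see lead-in)
    flagged = []
    for i, (cr, fr) in enumerate(zip(cr_counts, fr_counts)):
        se = (variance(cr) / len(cr) + variance(fr) / len(fr)) ** 0.5
        if se == 0:
            continue  # no variability in either condition: statistic undefined
        t = (mean(cr) - mean(fr)) / se  # Welch t statistic
        # Two-sided p value, normal approximation to the t distribution
        p = 2 * (1 - NormalDist().cdf(abs(t)))
        if p < threshold:
            flagged.append(i)
    return flagged


# Hypothetical spike counts for one neuron: 2 bins x 8 trials per condition.
cr = [[5, 6, 5, 6, 5, 6, 5, 6], [9, 10, 9, 10, 9, 10, 9, 10]]
fr = [[5, 6, 6, 5, 5, 6, 6, 5], [2, 3, 2, 3, 2, 3, 2, 3]]
flagged = significant_bins(cr, fr)
```

In this toy example only the second bin is flagged, which corresponds to one gray vertical band in a perievent histogram.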
Single-neuron activity in S1 and VPM is modulated by stimulus relevance. A, Perievent histograms depict representative neuronal activity in the S1 and VPM around stimulus sample (NP, left) and before RW (right). Statistically significant neuronal activity differences between CR and FR sessions (marked by gray vertical bands; α = 0.05; t tests corrected for the number of comparisons) indicate neurons whose firing was modulated by stimulus relevance. B, Significant CR/FR differences in firing rates show a temporal gradation across neurons simultaneously recorded in S1, spanning the entire 1 s interval before reward. Similar results were observed in VPM (data not shown).
There were at least as many S1/VPM cells whose firing rate was differentially modulated around reward (S1, 26.2 ± 8.9; VPM, 19.3 ± 12.0; mean percentage ± SEM of the total number of cells per area) as there were around the stimulus (S1, 24.8 ± 8.1; VPM, 13.0 ± 7.5; ANOVA, S1, F(1,9) = 0.015, p = 0.9; VPM, F(1,7) = 0.19, p = 0.7). In both areas, large proportions of neurons were found to be modulated around stimulus delivery but not before reward (46 and 25% of modulated neurons in S1 and VPM, respectively). Interestingly, a considerable percentage of neurons displayed the opposite behavior; i.e., they were modulated before reward but not around stimulus delivery (39 and 50% of modulated neurons in S1 and VPM, respectively). Table 1 shows a summary of these results. Analysis of all the neurons recorded in each animal showed that significant CR/FR differences in the firing modulation of individual neurons occurred at different moments in time across ensembles in both areas, so that differential firing modulations spanned the entire 1 s window around stimulus delivery (data not shown) and before reward delivery (Fig. 10B).
Cells showing significant differences in firing rate between CR/FR
To explore how differences in firing rates and temporal patterns are related to task performance, we quantified the firing rates and ISIs of neurons showing modulation during the Stim period. Table 2 shows that cells had similar mean firing rates and ISIs regardless of the task condition. No differences were detected between the two conditions (CR and FR) for any of the four cell categories considered (Table 2; excited S1, inhibited S1, excited VPM, and inhibited VPM units). In contrast, S1 and VPM cells differed significantly from each other in the CR condition, both in firing rate (Mann–Whitney U test; for excited, p = 0.005; for inhibited, p = 0.037) and in ISI (for excited, p = 0.009; for inhibited, p = 0.039). The same held in the FR condition for firing rate (Mann–Whitney U test; for excited, p = 0.017; for inhibited, p = 0.010) and for the ISI of excited cells (p = 0.015), but not for the ISI of inhibited cells (p = 0.054).
The firing rates and ISI of neurons from the S1 and VPM showed no significant differences between the CR and FR conditions (Mann–Whitney U test)
To gain additional insight into the effect of stimulus relevance on stimulus-related neuronal persistence, we compared how much stimulus-related information could be detected from identical neural ensembles recorded during CR and FR sessions. Classification of the stimuli as wide or narrow was performed by an ANN based on the ensemble activity of S1 and VPM neurons (see Materials and Methods). To control for the possible effects of different motor behaviors associated with a given stimulus, only “correct” trials were used as inputs to the ANN (in the case of FR sessions, correct associations between narrow/wide and left/right). This ensured that the spatial trajectories of rats toward reward delivery windows (left/right) were fixed relative to stimulus type (narrow/wide), ruling out this motor confound. The percentage of correct classifications was then used as an estimate of stimulus-related information.
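The use of classification accuracy as an estimate of stimulus-related information can be illustrated with a minimal sketch. A nearest-centroid decoder is used here purely as a simple stand-in for the ANN, whose architecture is described in Materials and Methods and is not reproduced here. All names and firing-rate vectors are hypothetical.

```python
def centroid(vectors):
    """Mean ensemble firing-rate vector across trials."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def percent_correct(train, test):
    """train / test: lists of (ensemble_rate_vector, label) pairs, with label
    'narrow' or 'wide'. Returns the percentage of test trials whose stimulus
    is classified correctly (chance level is 50% for two balanced classes)."""
    labels = {label for _, label in train}
    centroids = {
        label: centroid([v for v, l in train if l == label]) for label in labels
    }
    hits = 0
    for vector, label in test:
        guess = min(centroids, key=lambda c: squared_distance(vector, centroids[c]))
        hits += guess == label
    return 100.0 * hits / len(test)


# Hypothetical two-neuron ensemble rates (Hz) on correct trials only.
train_trials = [([10, 2], "narrow"), ([9, 3], "narrow"),
                ([2, 10], "wide"), ([3, 9], "wide")]
test_trials = [([8, 4], "narrow"), ([4, 8], "wide"),
               ([5, 6], "narrow"), ([1, 9], "wide")]
accuracy = percent_correct(train_trials, test_trials)
```

Here three of the four test trials are assigned to the correct stimulus, so the estimate is 75%, above the 50% chance level.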
We found that during CR sessions the stimulus-related information in both S1 and VPM, as measured by the ANN performance, was significantly different between the Pre and Post periods (t test; S1, p = 0.028; VPM, p = 0.020). During CR sessions, ANN performance tended to increase monotonically from the beginning of the trial (Pre period) to reward delivery, which occurs after the Post period (Fig. 11A). In FR sessions, however, stimulus-related information at the ensemble level did not vary significantly between Pre and Post periods (Fig. 11A). Cortical neuronal ensembles showed a statistically significant increase in stimulus-related information from FR to CR sessions during stimulus presentation (Stim; Mann–Whitney; p = 0.0465). Despite a similar trend, thalamic VPM neuronal ensembles did not show significant differences in ANN performance between the CR and the FR conditions.
Reward contingency effect in the thalamocortical ensemble. A, ANN performance for neuronal ensembles recorded in S1 shows that stimulus-related information around stimulus sampling is significantly different between CR and FR sessions. The filled asterisk indicates significant differences between CR and FR during the Stim period, and open asterisks denote a significant difference between Pre and Post for CR sessions (p < 0.05). Note the increased number of correct predictions by the ANN from the beginning to the end of the trial in both S1 and VPM. B, ANN discrimination of stimulus type when animals have high and low task performance in CR sessions. Cortical and thalamic neuronal ensembles recorded from two animals were analyzed for two sessions (1 low and 1 high performance). Stimulus-related information provided by the ANN is compared for both sessions in three periods: Pre, Stim, and Post. Note the lower ANN performance when the animals are still not sure about the correct association between stimulus type and reward. **p < 0.01. Error bars represent SEM.
The CR > FR difference in cortical stimulus-related information after stimulus offset cannot be interpreted as a motor artifact, because the trials analyzed in the CR and FR conditions had the same two associations between stimulus type, motor pathway, and choice of reward side. Importantly, the time elapsed between nose poking at the NP window and reward delivery was not significantly different between CR and FR sessions (t test; p = 0.5462). This indicates that CR/FR differences were not caused by changes in reaction time, adding support to the similarity of overt behavior in the two conditions. Note that although some cells increased their firing rates during FR sessions (Fig. 10A), the performance of the ANN was consistently lower than in CR sessions (Fig. 11A).
Overall, these results indicate that the differences between S1 and VPM are a matter of degree, because both brain areas show an increase in stimulus-related information from stimulus to reward delivery in the CR condition but not in the FR condition. In addition, stimulus-related information in S1 depended on the reward contingencies associated with correct stimulus discrimination; i.e., it depended on stimulus relevance. The significant difference in cortical ANN performance between CR and FR around stimulus delivery (Stim period) supports a role for S1 neuronal ensembles in the early neural processing of information directly relevant for solving the delayed tactile discrimination task. In contrast, the modest ANN performance difference observed in S1 before reward (Post period) between CR and FR sessions (p = 0.2506) suggests that stimulus relevance marginally affected the late neural processing of stimulus-related information, when animals were choosing which side to go to obtain reward.
Next, we used the ANN to compare stimulus-related information in sessions with high and low behavioral performance (>80% and <55%). We analyzed data from two animals performing the CR task, in which we were able to record the entire learning process from low to high performance (Fig. 11B). In the sessions with low task performance, animals were still learning the correct association between stimulus and response (i.e., the task contingency). For both animals, the ANN showed a significantly higher prediction of stimulus type when poststimulus neuronal data from high-performance sessions were used (rat 1, Mann–Whitney, S1, p = 0.002; VPM, p = 0.0001; rat 2, Mann–Whitney, S1, p = 0.0001; VPM, p = 0.0047). This indicates that rats with many wrong trials during a CR session (low performance) also show lower levels of stimulus-related information in S1 and VPM, even when performing correct trials.
Finally, we assessed the correlation between the stimulus-related neuronal information measured by the ANN and correct task performance by the animals [i.e., the successful behavioral association between stimuli (narrow/wide) and the animal's behavioral response (left/right reward)]. Significant correlations around stimulus delivery were observed in both areas (α = 0.05), appearing first in VPM (∼80 ms before NP) and later in S1 (∼240 ms after NP) (Fig. 12A,B, left). In Figure 12B, the red points represent FR sessions, and the blue points represent CR sessions for all animals recorded. In the S1 (but not in VPM), significant correlation of task performance with stimulus-related information as measured by ANN was even more robust before reward (Fig. 12A), reaching sustained significance for the entire 480 ms before reward (Fig. 12B, right). Interestingly, correlations in S1 remained significant for 80 ms after reward (Fig. 12A), whereas correlations in VPM continued to be low.
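The window-by-window relationship between ANN performance and behavioral performance can be sketched as a correlation computed across blocks of trials. The type of correlation coefficient is not specified in this section, so Pearson's r is assumed here. The window names and all values are hypothetical and serve only to show the computation.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


# Hypothetical data: one value per block of trials (CR or FR).
behavior = [50, 60, 70, 80, 90]  # % correct behavioral choices per block
ann_by_window = {                # % correct ANN classifications per block
    "around_stimulus": [52, 55, 54, 58, 57],
    "before_reward": [50, 55, 60, 65, 70],
}

# One correlation value per time window, as in a time-resolved analysis.
r_by_window = {
    window: pearson_r(ann, behavior) for window, ann in ann_by_window.items()
}
```

Repeating this computation over consecutive short windows gives a time course of the correlation, with each block of trials contributing one point per window.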
Stimulus-related information before reward correlates with task performance. A, Data from all recorded neurons in all animals in both blocks of trials. Temporal evolution of the correlation between task performance (in CR and FR tasks) and stimulus-related information around the stimulus sampling and before reward delivery. Significant sustained correlations occurred only in S1 for the entire 480 ms before reward. *p < 0.05; **p < 0.01. B, Data from all recorded neurons in all animals (S1, n = 5; VPM, n = 4) show a relationship between stimulus-related information during the late poststimulus period (measured by the ANN) and the animals' behavioral performance in the tactile discrimination task. Each point reflects a single block of trials under the FR (red points) or CR (blue points) condition. In VPM, the correlation between stimulus-related information and task performance peaks during stimulus sampling between 0.16 and 0.08 s before nose poke (left); in S1 the correlation peaks after stimulus removal, between 0.24 and 0.16 s before reward (right).
Discussion
In the present study, we investigated the effect of reward contingency on extracellular recordings of neuronal activity from the S1 and its main thalamic afferent, the VPM thalamic nucleus, in adult rats performing a whisker-based tactile discrimination task that depends on S1 activity (Krupa et al., 2001, 2004). In this task, the whiskers work as a fine-grained distance detector, and accurate discrimination seems to depend on the integration by the trigeminal system of inputs from several neighboring vibrissae (Krupa et al., 2001).
Stimulus relevance modulates neuronal activity in the primary somatosensory thalamocortical loop
We first characterized the responses of thalamic and cortical neurons during the tactile discrimination task. The cortical and thalamic ensembles showed different activity profiles, with a clear predominance of firing rates of <10 Hz in the S1, in contrast with a substantial proportion of thalamic neurons with firing rates of >20 Hz. The low firing rates of the S1 cells suggest that they were mostly excitatory principal neurons (Swadlow, 2003). Despite the broader range of firing rates in the VPM cells, they also were likely to have been excitatory principal neurons, because no interneurons or GABAergic cells are present in rat VPM (Barbaresi et al., 1986; Williams and Faull, 1987). A broad range of tactile responses was evoked from VPM and S1 neurons, including excited, inhibited, and multiphasic responses in both brain regions, consistent with the results shown by Krupa et al. (2004). We also observed that a subset of cells showed firing rate modulation long after stimulus offset.
To gain insight into the meaning of such modulation, we used an ANN to measure stimulus-related information across trials, including the period between stimulus disengagement and reward. This was implemented by training the ANN with neuronal ensemble activity around the NP and testing it with data recorded from different periods during the trials. To investigate the effects of stimulus relevance, we compared the “active-mode” results (CR) with data obtained in a passive version of the tactile discrimination task (FR), in which successful tactile discrimination was not required for reward. In FR sessions, information about stimulus identity was low during the entire trial. Stimulus-related information was significantly higher in CR sessions, especially in the S1.
Next, we compared stimulus-related information in correct trials obtained from sessions with high or low task performance (>80% and <55% of correct discrimination, respectively). ANN predictions were poor when data from low performance sessions were analyzed and high when high-performance sessions were assessed. This indicates that little stimulus-related information is present in S1 and VPM neuronal ensembles when the task is not yet well learned (i.e., when the animal's movements to left or right are not strongly associated with the stimuli presented). Finally, we assessed the relationship between task performance and stimulus-related information in VPM and S1 neuronal ensemble activity. Whereas thalamic activity was only correlated with task performance during stimulus presentation, poststimulus activity in the S1 was highly correlated with task performance after stimulus offset, peaking during the 300 ms that preceded reward delivery. Therefore, when the association between stimulus and reward is well learned, S1 cortical neuronal ensembles carry high stimulus-related information up to the delivery of reward.
Sensorimotor effects, short-term memory, or reward expectation?
In principle, the decrease of stimulus-related information in S1 and VPM neuronal ensemble activity when the task is freely rewarded admits different alternative interpretations. First, we must distinguish three potential sources of stimulus-related information: (1) purely sensory (PS) responses driven by whisker stimulation by the narrow and wide aperture stimuli, (2) sensory-motor feedback (SMF) driven by grossly different body movements toward the left or right reward ports, and (3) motor-planning activity representing an intention to move to the left or the right reward port.
The CR/FR tasks were designed so as to obtain statistically identical movements up to the moment when animals trigger the NP; subsequently, they make a left or a right response. Detailed analysis of video recordings supports this assumption. Therefore, stimulus-related information during stimulation (around the NP) can be interpreted as purely sensory, with no component attributable to sensory-motor feedback or motor planning. In the interval between stimulus delivery (NP) and reward delivery, however, the results from the CR condition alone cannot disambiguate the contribution of PS and SMF to stimulus-related information. In principle, the better-than-chance ANN performance before reward delivery could represent maintenance of a purely sensory short-term memory. Alternatively, the ANN results could reflect differences in motor planning or sensory feedback before or during movements to the left or the right. Because the animals are trained to ∼80% correct behavioral performance, their movements to the left or right are strongly correlated with the stimulus presented on each trial; thus, information about their behavioral response necessarily provides some information about the stimulus. A priori, we cannot distinguish these possibilities, and it is conceivable that poststimulus information about the stimulus is dominated by SMF, with no contribution from short-term memory.
A comparison of the results in S1 and VPM, however, suggests that SMF cannot account for all the stimulus-related information in S1 ensemble activity. In particular, information about the stimulus increases during poststimulus time in S1, whereas it remains approximately constant in VPM. Because SMF reaches S1 primarily via VPM, these results suggest that S1 refines either a sensory representation of the stimulus or a motor response plan based on the learned sensory-motor mapping, rather than receiving additional SMF information from VPM. Both of these remaining possibilities are significant, because neither short-term memory nor motor planning is traditionally associated with primary sensory cortices.
To unravel this issue, we manipulated stimulus salience by changing the reward contingency under otherwise identical sensory-motor conditions (CR/FR conditions). To isolate the effect of changing the reward context on somatosensory responses, we took care to compare equal numbers of correct trials only, so that the stimulus–response mapping did not vary between CR and FR sessions. In other words, we compared sets of trials in which each particular stimulus (wide or narrow aperture) was always followed by the same behavioral responses, so that only the difference in reward expectation could account for the differences between the CR and FR conditions. Because only “correct” trials were used to compute this result, and because the motor paths taken in CR and FR sessions were statistically indistinguishable (0.905 < p < 1.0), the CR > FR relationship regarding stimulus-related information could not be attributed to potential motor feedback on S1/VPM neurons during the poststimulus period. In future studies, more precise methods for the tracking of vibrissa micromotions (Knutsen et al., 2005) should allow for a final conclusion regarding the effect of variability in whisker position and signal transduction on the electrophysiological differences between CR and FR behavioral tasks.
Although the increase of stimulus-related information after stimulus offset in the CR condition could be caused by an enhanced short-term memory of the stimulus, similar to working memory in the PFC, this is an unlikely possibility. First, the interval between nose poke and reward delivery is not a passive interval, but rather a period of movement. Unlike primate studies, in which animals need to hold on to relevant information during a delay period to execute a decision at a later time, there is no necessary delay in our experimental design. Animals begin to move toward the reward immediately after NP, and therefore a choice of side is most likely made before the CD on the way out of the stimulus chamber. Second, we found a very low abundance of cells with sustained activation from the NP to RW (S1, 9% of all modulated cells; VPM, 7%). Because the single-cell data do not offer strong support to the “short-term memory” hypothesis, the most parsimonious interpretation for our results is an effect of reward expectation on primary somatosensory thalamocortical neuronal processing.
Top-down modulation of neuronal activity in the primary somatosensory thalamocortical loop
Altogether, our results show that reward expectation strongly modulates stimulus processing in S1 and VPM during sampling of the tactile stimulus. This supports the existence of strong top-down effects related to attention in the primary somatosensory cortex, as proposed recently (Krupa et al., 2004). We found that cortical and thalamic cells have similar mean firing rate and mean ISI regardless of the task condition (CR and FR). However, the increase in reward expectation during FR sessions produced much stronger effects in the S1 than in the VPM. These results indicate that the S1 is much more prone to top-down modulation than the VPM, in the tactile discrimination task studied.
The activation of neurons from primary sensory areas is classically believed to be time locked to stimulus presentation and to be restricted to the initial processing of the basic features of the stimulus. However, different lines of evidence challenge this notion, suggesting that primary sensory areas are not functionally wired in a rigid feedforward order. Instead, they show reciprocal connections with many areas, including prefrontal regions and other association cortices, supporting the existence not only of serial but also parallel processing of sensory information (Jones and Powell, 1970; Pandya and Yeterian, 1985; Goldman-Rakic, 1988; Fuster, 1989b). Working memory, for example, has been thought to depend on a feedforward circuit spanning primary sensory areas for the acquisition of stimulus-related information and higher-order areas for the processing and temporary storage of this information. Mnemonic activity was originally found in the prefrontal cortex (Fuster and Alexander, 1971; Fuster, 1973), but more recently neuronal responses related to working memory have also been detected in primary sensory areas (Fuster, 1990; Le Bihan et al., 1993). Although our results cannot be easily interpreted as reflecting mnemonic processing, they clearly point to a poststimulus increase of stimulus-related information only when the correct pairing between stimulus and response is crucial to obtain reward. This suggests that neuronal ensemble activity in the primary sensory thalamocortical loop may be important to complete accurate behavioral responses immediately before reward delivery, perhaps reflecting the top-down arrival in primary areas of efferent copies of the stimulus representation (Wikgren et al., 2003). 
Indeed, the exquisite resolution of stimulus discrimination in the primary sensory thalamocortical loop may be crucial for the progressive refinement of stimulus-related information long after stimulus removal, as suggested previously (Marr and Vaina, 1982; Kosslyn et al., 2001).
Our findings support a growing number of studies implicating primary sensory areas in cognitive functions related to the perception of visual, auditory, and somatosensory information (Le Bihan et al., 1993; Kosslyn et al., 2001). Evidence that primary cortical neurons engaged in early visual processing are also modulated by reward expectation (Shuler and Bear, 2006) reinforces the idea that sensory discrimination for reward is processed over vast cortical networks (Pasternak and Greenlee, 2005). Our results support the notion that sensory information during the poststimulus period of the task is not segregated into high-order cognitive centers, but is instead distributed over the same cortical regions responsible for the initial representation of the stimuli.
Footnotes
- This work was supported by National Institutes of Health grants (M.A.L.N.) and by fellowships from Conselho Nacional de Desenvolvimento Cientifico e Tecnológico (J.P.), the Pew Latin-American Program for Biomedical Sciences (S.R.), Fundação do Ministério de Ciência e Tecnologia de Portugal (E.S.), and Inserm (D.G.). We thank Michael Platt, Mário Fiorani, João Franca, Eliane Volchan, and Leticia Oliveira for valuable experimental suggestions; Dana Cohen and Ranier Gutierrez for help with data analysis; Gary Lehew and James Meloy for multielectrode manufacturing; and Susan Halkiotis, Gayle Wood, and Terry Jones for miscellaneous help.
- Correspondence should be addressed to Miguel A. L. Nicolelis, Department of Neurobiology, Duke University Medical Center, Box 3209, Durham, NC 27710. nicoleli{at}neuro.duke.edu