Abstract
Acquiring the significance of events based on reward-related information is critical for animals to survive and to conduct social activities. The importance of the perirhinal cortex for reward-related information processing has been suggested. To examine whether or not neurons in this cortex represent reward information flexibly when a visual stimulus indicates either a rewarded or unrewarded outcome, neuronal activity in the macaque perirhinal cortex was examined using a conditional-association cued-reward task. The task design allowed us to study how the neuronal responses depended on the animal's prediction of whether it would or would not be rewarded. Two visual stimuli, a color stimulus as Cue1 followed by a pattern stimulus as Cue2, were sequentially presented. Each pattern stimulus was conditionally associated with both rewarded and unrewarded outcomes depending on the preceding color stimulus. We found an activity depending upon the two reward conditions during Cue2, i.e., pattern stimulus presentation. The response appeared after the response dependent upon the image identity of Cue2. The response delineating a specific cue sequence also appeared between the responses dependent upon the identity of Cue2 and reward conditions. Thus, when Cue1 sets the context for whether or not Cue2 indicates a reward, this region represents the meaning of Cue2, i.e., the reward conditions, independent of the identity of Cue2. These results suggest that neurons in the perirhinal cortex do more than associate a single stimulus with a reward to achieve flexible representations of reward information.
Introduction
Animals, including humans, learn to associate external events and combinations of events with the outcomes they predict. Previous studies have suggested that the perirhinal cortex, a part of the medial temporal lobe, plays an important role in reward-related information processing. After ablation of perirhinal and entorhinal cortices, monkeys are impaired in learning stimulus and reward-schedule/reward-size relationships (Liu et al., 2000; 2004; Clark et al., 2012), and in reversal learning of object discriminations (Murray et al., 1998; Hampton and Murray, 2002). These impairments probably arise from the interruption of information flow between these regions and the reward-related areas such as the amygdala and the orbitofrontal cortex (OFC) with which they are strongly connected (Stefanacci et al., 1996; Lavenex et al., 2002; Kondo et al., 2005; Saleem et al., 2008).
Electrophysiological studies have shown that the responses of perirhinal cortex neurons encode signals about specific reward schedule states regardless of the physical property of visual cues (Liu and Richmond, 2000), or represent reward conditions when visual stimuli are associated with either a rewarded or unrewarded outcome (Mogami and Tanaka, 2006). Perirhinal cortex neurons carry robust signals about visual stimulus characteristics (Liu and Richmond, 2000; Mogami and Tanaka, 2006) that are probably derived from the visual area TE (Suzuki and Amaral, 1994; Saleem and Tanaka, 1996). This tissue is known to be important for long-term memory, such as visual stimulus-stimulus association memory (Murray et al., 1993; Higuchi and Miyashita, 1996; Buckley and Gaffan, 1998). We ask whether perirhinal cortex neurons carry flexible representations of reward-condition signals like those seen in areas having anatomical connections with this cortex, for example, the amygdala (Paton et al., 2006; Belova et al., 2007) and the OFC (Morrison and Salzman, 2009). We hypothesized that perirhinal cortex neurons represent reward conditions flexibly so that when a stimulus indicates either rewarded or unrewarded outcome depending on context, the responses to the stimuli will represent the rewarded and unrewarded outcomes given the context.
In the present study, neuronal activity in the macaque perirhinal cortex was examined using a conditional-association cued-reward (CACR) task. In the task, two visual stimuli, a color stimulus as Cue1 followed by a pattern stimulus as Cue2, were sequentially presented. The meaning of the second cue was contingent on the color of the first cue. Our results show that the monkeys learned these conditional associations and that neurons in the perirhinal cortex flexibly encoded the associated reward conditions during the Cue2 presentation period based on the context that was set by Cue1.
Materials and Methods
Subjects
Subjects were two male rhesus monkeys (Macaca mulatta) weighing 8 and 10 kg (monkeys S and T, respectively). All experiments were approved by the Animal Care and Use Committee of the National Institute of Advanced Industrial Science and Technology (AIST) and were performed in accordance with the Guidelines for the Care and Use of Animals of AIST.
Apparatus
The monkeys were seated in a primate chair positioned in front of a monitor (16 × 13 inch, GDM-F520; SONY) on which visual stimuli were displayed. The center of the monitor screen was located at eye level, 61 cm in front of monkey T and 62 cm in front of monkey S. A touch-sensitive bar was mounted on the chair at the level of the hands of the monkey. Liquid rewards were delivered from a drinking spout positioned in front of the mouth of the monkey. Behavioral control and data acquisition were performed using the REX real-time data-acquisition program adapted to the QNX operating system (Hays et al., 1982). Spike times and task events were recorded with a 1 ms time resolution.
Behavioral task
Initial training.
The monkeys were initially trained to detect when a red visual target changed to green. Each trial began when the monkey touched the bar. A white square (visual angle, 0.7 × 0.7°; brightness, 13.30 candela/m2) was displayed at the center of the screen (Fix On). Next, a red target (Wait signal, 0.7 × 0.7°, 2.59 candela/m2) was presented. After a random interval of 300–900 ms, the target color turned green (Go signal, 9.38 candela/m2). If the monkey released the touch-sensitive bar within 150–1000 ms after the Go signal onset, the target turned blue (Correct signal, 1.49 candela/m2). The Correct signal disappeared after 200 ms, and a drop of juice was delivered as a reward. A black-and-white random-dot background covered the whole screen. The temporal uncertainty of the Wait signal duration (300–900 ms) encouraged the monkeys to attend to the red-to-green target transition to obtain a reward. If the monkey released the bar before the Go signal (or within 150 ms after the Go signal presentation, to account for anticipatory guessing) or after the Go signal disappeared, an error was registered, and the monkey had to repeat the trial from the beginning. After the monkeys learned to perform the red-to-green color-discrimination task to a criterion of 80% correct for 5 consecutive days, the CACR task was introduced.
CACR task.
In the CACR task, the monkeys did not make an explicit choice. The behavioral requirement was to perform the red-to-green color discrimination as in the initial training and to fixate at the center of the screen. The prediction of the animal as to whether it would or would not be rewarded was determined by error rates and licking behavior.
In each trial, two additional visual cues consisting of Cue1 and Cue2 were sequentially presented (see Fig. 1A). Cue1, i.e., a color stimulus, was a magenta square (Mgnta, 5.6 × 5.6°, 3.97 candela/m2) or a cyan square (Cyn, 5.6 × 5.6°, 10.80 candela/m2). Cue2, i.e., a pattern stimulus, was one of two black-and-white Walsh patterns (Ptrn1 or Ptrn2; black, 50%, white, 50%; Richmond et al., 1987). Each pattern stimulus was conditionally associated with both rewarded (R) and unrewarded (NR) outcomes depending on the preceding color stimulus (Fig. 1B). Juice as a reward was provided after a correct response on trials with Mgnta-Ptrn1 (Mgnta–Ptrn1–R trial type) and Cyn-Ptrn2 (Cyn–Ptrn2–R trial type) presentation, but not on trials with Cyn-Ptrn1 (Cyn–Ptrn1–NR trial type) and Mgnta-Ptrn2 (Mgnta–Ptrn2–NR trial type) presentation.
A trial began when the monkey touched the bar and the white target appeared. After the monkey fixated on the white target, Cue1 and Cue2 were presented sequentially, separated by a delay (450–550 ms). After Cue2 disappeared, the monkey was required to perform red-to-green color discrimination. In unrewarded trials, a disconnected solenoid was activated to produce a click sound after the disappearance of the Correct signal. This sound was similar to the sound produced by juice delivery in rewarded trials. The intertrial interval (ITI) was 1500–2200 ms following unrewarded trials. In rewarded trials, the ITI was 2600–3500 ms to wait for the jaw movements associated with taking the reward to abate before the beginning of the next trial. Each of the four trial types appeared two times in every eight trials in a pseudo-random order.
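The cue-outcome contingency and the trial scheduling described above can be sketched as follows. This is a minimal illustration, not the original REX task code: the function names, the seed, and the use of a plain shuffle are assumptions, since the text does not state which constraints defined the pseudo-random order.

```python
import random

# Reward contingency from the task description: the outcome depends on the
# Cue1-Cue2 combination, not on either cue alone.
REWARDED = {
    ("Mgnta", "Ptrn1"): True,
    ("Cyn", "Ptrn2"): True,
    ("Cyn", "Ptrn1"): False,
    ("Mgnta", "Ptrn2"): False,
}

def make_block(rng):
    """Return one block of eight trials with each trial type appearing twice."""
    block = [pair for pair in REWARDED for _ in range(2)]
    # Assumption: a plain shuffle stands in for the unspecified pseudo-random order.
    rng.shuffle(block)
    return block

def iti_range_ms(cue1, cue2):
    """Intertrial interval range (ms) stated in the text for this trial type."""
    return (2600, 3500) if REWARDED[(cue1, cue2)] else (1500, 2200)

rng = random.Random(0)  # hypothetical seed, used only for reproducibility
block = make_block(rng)
for cue1, cue2 in block:
    outcome = "R" if REWARDED[(cue1, cue2)] else "NR"
    print(cue1, cue2, outcome, iti_range_ms(cue1, cue2))
```

Because each color and each pattern is paired with a reward on exactly half of the trials, neither cue alone predicts the outcome; only the combination does.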
Eye positions were measured using a magnetic search coil technique (Robinson, 1963; Judge et al., 1980) for both monkeys, and also using an infrared pupil-position monitoring system (i_rec; http://staff.aist.go.jp/k.matsuda/eye/) for monkey S. The window size for eye fixation was 5.6 × 5.6° at the center of the screen for monkey T and 14 × 14° for monkey S.
Licking behavior was monitored using a touch sensor attached to a drinking spout. The tip of the spout was placed 7 mm (monkey T) and 11 mm (monkey S) from the upper front teeth of the monkey.
Error trials included bar-release errors and fixation-break errors. Bar-release errors referred to any bar release occurring outside of the 150–1000 ms period after the Go signal onset. A fixation break occurring between the Fix On and Wait signal presentation was counted as an error. If the monkey made an error, the trial was immediately aborted and the monkey had to repeat the identical trial type (a correction trial).
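The bar-release criterion can be expressed as a small classifier. This is a sketch under stated assumptions: the function name and labels are hypothetical, the 150 and 1000 ms boundaries are treated as inclusive, and a trial with no release is labeled late; the text does not specify these details.

```python
def classify_bar_release(release_ms_after_go):
    """Classify a bar release time given in ms relative to Go signal onset.

    Negative values mean the bar was released before the Go signal.
    None means the bar was never released (assumed here to count as late).
    """
    if release_ms_after_go is None:
        return "late"
    if release_ms_after_go < 150:
        return "early"    # before Go, or anticipatory (< 150 ms after Go)
    if release_ms_after_go > 1000:
        return "late"     # after the response window closed
    return "correct"      # within the 150-1000 ms window

for t in (-50, 120, 400, 1200):
    print(t, classify_bar_release(t))
```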
Fixation task.
To examine whether neuronal responses to the cue changed outside of the task context, a fixation task was used. In the fixation task, a trial began when the monkey touched the bar, after which the white target appeared in the center of the screen. After the monkey fixated on the white target for 200–300 ms, one of the color or pattern stimuli that were used as Cue1 or Cue2 was presented for 600–700 ms, after which the white target reappeared for 400–1300 ms. A drop of juice was delivered as a reward in every trial. If the monkey released the bar or broke eye fixation during the stimulus presentation, an error was registered and the monkey had to repeat the identical trial type from the beginning. The window size for eye fixation was the same as that used during the CACR task.
Free-reward task.
A free-reward task was used to examine whether or not neurons were responsive to the delivery of the juice reward outside of the task context. During the free-reward task, the black-and-white random-dot background covered the whole screen, and neither visual cues nor targets were presented. A juice reward or the sound of the disconnected solenoid (sham) was alternately delivered with an interval ranging from 5000 to 9000 ms.
The CACR task, the fixation task, and the free-reward task were run in blocks, usually in that order.
Surgery
After the monkeys learned the red-to-green color-discrimination task, a titanium head-fixation post was affixed to the skull and a scleral magnetic search coil was implanted using aseptic surgical procedures (Robinson, 1963; Judge et al., 1980) under pentobarbiturate anesthesia (25 mg/kg) in a fully equipped and staffed surgical suite. Body temperature, heart rate, blood pressure, and the percentage of blood oxygen saturation were monitored throughout all surgical procedures. Monkeys were allowed to recover from surgery for 1 month. After the monkeys learned the CACR task, a recording chamber (Crist Instrument) was implanted above the dorsal surface of the left hemisphere in each monkey. Chamber locations were determined using stereotaxic coordinates from magnetic resonance (MR) images of each animal's brain (Saunders et al., 1990). The monkeys were given a 2 week postoperative recovery period, after which they were retrained in the CACR task. The monkeys received antibiotics for 1 week after each surgery to reduce the risk of postoperative infections and received an analgesic during and after surgery.
Unit recording
Recording sessions began after the monkeys had been retrained on the CACR task. A single-unit recording was performed extracellularly with a tungsten electrode (Micro Probe and Frederic Haer) that was inserted vertically through a guide cannula (Crist Instrument) at the beginning of each recording session. The electrode was advanced toward the perirhinal cortex using a hydraulic microdrive (MO-97A-S; Narishige). Single units were isolated on-line using a threshold and dual time-amplitude windows (DDIS-1; Bak Electronics). Unit activity was converted to pulses and was recorded at a 1 ms time resolution with REX. The recording site location was determined using MR images. 3D brain images were reconstructed using software (Brain Explorer (c); http://riodb.ibase.aist.go.jp/brain/index.php?LANG=ENG). The distance (millimeters) to the interaural line, the distance to the midline, and the distance from the bottom end of the grid in the recording chamber to white matter dorsal to the perirhinal cortex were measured using MR images taken with an electrode placed at least once for each guide tube position. To examine the distribution of single-unit locations, the distance from a grid in the recording chamber to the recording locations was measured after each day's experiment and plotted on MR images taken with an electrode placed at each recording track. All recording sites were located in the stereotaxic plane 18–26 mm anterior to the interaural line.
Data analysis
All data analyses were performed using conventional statistical procedures with the R statistical computing environment (R Development Core Team, 2004). Data were analyzed only from trials in which the monkey was successful on the first try. Correction trials were excluded from the analysis. Each recording session consisted of data from at least five trials for each of the four trial types, namely, Mgnta–Ptrn1–R, Cyn–Ptrn2–R, Cyn–Ptrn1–NR, and Mgnta–Ptrn2–NR trial types. For behavioral analysis, errors that occurred before the Cue2 presentation on the first attempt were excluded. Normalized licking duration was calculated by dividing the duration of touch to the drinking spout by the duration from the onset of the target or the cue to the disappearance of the target or the cue. For a delay period, the duration was measured from the Cue1 disappearance to the Cue2 onset.
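The normalized licking duration defined above can be illustrated with a short sketch. It assumes that spout contacts are available as non-overlapping (start, end) intervals in milliseconds; that representation and the function name are assumptions, not details given in the text.

```python
def normalized_licking(contacts, period_start, period_end):
    """Fraction of a task period during which the spout sensor was touched.

    contacts: list of (start_ms, end_ms) touch intervals, assumed non-overlapping.
    period_start, period_end: onset and offset (ms) of the target or cue period.
    """
    touched = 0.0
    for start, end in contacts:
        # Keep only the part of the contact that falls inside the period.
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            touched += overlap
    return touched / (period_end - period_start)

# Hypothetical contacts and an 800 ms period from 200 to 1000 ms.
contacts = [(0, 100), (250, 400), (900, 1200)]
print(normalized_licking(contacts, 200, 1000))
```

In this hypothetical example, 150 ms of the second contact and 100 ms of the third fall inside the period, giving 250/800 of the period spent licking.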
A sliding window analysis of neuronal activity.
A sliding window analysis was performed to examine the time course of changes in neuronal activity. For every neuron, a 200 ms time window was slid across the trial in 10 ms steps, and the spike count in each window was obtained. Then, for the data from each window, statistical analysis (ANOVA or Tukey's HSD test, as described below) was performed.
Neuronal activity was quantified around the following six trial events in the CACR task. The first number and second number in the brackets in the following indicate the beginning and the end time of each 200 ms time window, respectively: (1) the Cue1 period ([−100, 100] ms to [400, 600] ms after the Cue1 onset); (2) the delay period ([0, 200] ms to [250, 450] ms after the Cue1 disappearance or [−450, −250] ms to [−200, 0] ms from the Cue2 onset); (3) the Cue2 period ([−100, 100] ms to [400, 600] ms after the Cue2 onset); (4) the Wait signal period ([0, 200] ms to [100, 300] ms after the Wait signal onset); (5) the Go signal period ([−100, 100] ms to [50, 250] ms after the Go signal onset); and (6) the Correct signal period ([−100, 100] ms to [150, 350] ms after the Correct signal onset), i.e., a period before the reward delivery or the sham click solenoid sound. Neuronal activity was quantified after the pattern stimulus onset ([−100, 100] ms to [400, 600] ms from the pattern stimulus onset) in the fixation task.
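As an illustration of the windowing scheme, the sketch below counts spikes in 200 ms windows stepped by 10 ms for the Cue2 period. The function name and the half-open window convention [start, start + 200) are assumptions; the spike times are hypothetical.

```python
from bisect import bisect_left

def sliding_counts(spike_times, first_start, last_start, width=200, step=10):
    """Spike counts in windows [start, start + width), in ms relative to an event.

    Returns a list of (window_start, window_end, count) tuples.
    """
    spikes = sorted(spike_times)
    counts = []
    for start in range(first_start, last_start + 1, step):
        lo = bisect_left(spikes, start)          # first spike at or after start
        hi = bisect_left(spikes, start + width)  # first spike at or after the end
        counts.append((start, start + width, hi - lo))
    return counts

# Cue2 period: windows from [-100, 100] ms to [400, 600] ms around Cue2 onset.
spikes = [-50, 0, 99, 100, 350, 599, 600]  # hypothetical spike times (ms)
windows = sliding_counts(spikes, first_start=-100, last_start=400)
print(len(windows), windows[0], windows[-1])
```

With these parameters the Cue2 period contains 51 windows, each of which yields one p value in the subsequent tests.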
Detection of task-related activity.
For all recorded neurons, a sliding ANOVA was applied to the spike counts in each of the 200 ms time windows at the six trial events to examine whether or not the spike counts of the neuron depended upon the image identity of Cue1, the image identity of Cue2, or the interaction between Cue1 and Cue2, i.e., the reward conditions (delivery or omission of a juice reward). Activity dependent on the identity of Cue1 was examined during the Cue1 period and the delay period (one-way ANOVA, Cue1 identity with two levels, Mgnta or Cyn), and activity dependent on the identity of Cue1, the identity of Cue2, or the reward conditions was examined during the remaining four events (two-way ANOVA, factors = Cue1 identity and Cue2 identity, each in two levels, Mgnta or Cyn, Ptrn1 or Ptrn2). The set of p values that was obtained from each of the six trial events was adjusted for multiple comparisons using the Benjamini and Hochberg false discovery rate (FDR) procedure (“p.adjust” function in R) (Bouret et al., 2012). If activity in at least one time window was revealed to have a significant effect of the identity of Cue1, the identity of Cue2, or the reward conditions (p < 0.05) for a given neuron, the neuron was classified as showing a Cue1-identity dependent activity, a Cue2-identity dependent activity, or a reward-condition dependent activity, respectively.
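The adjustment and classification steps can be sketched in a few lines. The analysis itself used the p.adjust function in R; the Python function below is an illustrative reimplementation of the Benjamini and Hochberg adjustment, and the p values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    # Visit p values from largest to smallest so that a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p values for one factor across four windows of one trial event.
window_p = [0.01, 0.04, 0.03, 0.20]
adjusted = bh_adjust(window_p)
# A neuron is classified as dependent on the factor if any window is significant.
is_dependent = any(p < 0.05 for p in adjusted)
print(adjusted, is_dependent)
```

Applying the adjustment within each trial event, rather than across all events at once, follows the description above that one set of p values was obtained per event.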
To quantify the degree to which neuronal activity depended upon the Cue1 identity, Cue2 identity, or reward conditions, the variance in the neuronal activity explained by each factor was analyzed. In each of the 200 ms time windows, this variance was calculated directly from the ANOVA results by dividing the sum of squares for each factor by the total sum of squares (SSfactor/[SSfactor + SSresiduals] × 100) (Simmons and Richmond, 2008).
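The following sketch computes the explained variance for one time window, following the parenthetical formula SSfactor/[SSfactor + SSresiduals] × 100. It assumes a balanced 2 × 2 design (equal trial numbers in every cell), which need not hold for recorded sessions; the function name and the spike counts are hypothetical.

```python
from statistics import mean

def explained_variance(cells):
    """Percentage of variance explained by each factor in a balanced 2 x 2 design.

    cells maps (cue1, cue2) -> list of spike counts for one time window.
    The Cue1 x Cue2 interaction corresponds to the reward conditions.
    """
    cue1_levels = sorted({k[0] for k in cells})
    cue2_levels = sorted({k[1] for k in cells})
    n = len(next(iter(cells.values())))
    if any(len(v) != n for v in cells.values()):
        raise ValueError("this sketch handles balanced designs only")
    cell_mean = {k: mean(v) for k, v in cells.items()}
    grand = mean(cell_mean.values())
    m1 = {a: mean(cell_mean[(a, b)] for b in cue2_levels) for a in cue1_levels}
    m2 = {b: mean(cell_mean[(a, b)] for a in cue1_levels) for b in cue2_levels}
    ss = {
        "cue1": n * len(cue2_levels) * sum((m1[a] - grand) ** 2 for a in cue1_levels),
        "cue2": n * len(cue1_levels) * sum((m2[b] - grand) ** 2 for b in cue2_levels),
        "reward": n * sum((cell_mean[(a, b)] - m1[a] - m2[b] + grand) ** 2
                          for a in cue1_levels for b in cue2_levels),
    }
    ss_res = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    # Guard against a zero denominator when a factor and the residuals are both zero.
    return {f: (100 * s / (s + ss_res) if s + ss_res > 0 else 0.0)
            for f, s in ss.items()}

# Hypothetical neuron that fires more in unrewarded trials, regardless of cue identity.
cells = {
    ("Mgnta", "Ptrn1"): [2, 4],   # rewarded
    ("Cyn", "Ptrn2"): [2, 4],     # rewarded
    ("Cyn", "Ptrn1"): [8, 10],    # unrewarded
    ("Mgnta", "Ptrn2"): [8, 10],  # unrewarded
}
result = explained_variance(cells)
print(result)
```

In this example the main effects of Cue1 and Cue2 explain none of the variance, whereas the interaction explains 90%, which is the signature of a reward-condition dependent response.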
Detection of trial-type-specific activity.
For detection of a trial-type-specific activity during the Cue2 and the Correct signal periods, spike counts between pairs of Cue1–Cue2 sequences were compared using Tukey's HSD test (p < 0.05). The test was performed in each 200 ms time window that was slid across the trial in 10 ms steps. For each Cue2 period and Correct signal period, the set of p values was adjusted for multiple comparisons using the FDR procedure (p.adjust function in R). If spike counts in one trial type were significantly different from spike counts in every other trial type for at least one time window, and if the spike counts in the time window were revealed to have a significant effect on factors/interaction using the sliding ANOVA, the neuron was classified as having a trial-type-specific activity.
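The classification rule for a single time window can be sketched as follows. The pairwise p values are taken as given (for example, FDR-adjusted Tukey's HSD results); the function name, the data structure, and the example values are assumptions made for illustration.

```python
from itertools import combinations

TRIAL_TYPES = ("Mgnta-Ptrn1-R", "Cyn-Ptrn2-R", "Cyn-Ptrn1-NR", "Mgnta-Ptrn2-NR")

def specific_trial_types(pairwise_p, anova_significant, alpha=0.05):
    """Trial types that differ significantly from every other type in one window.

    pairwise_p maps frozenset({type_a, type_b}) -> adjusted p value.
    anova_significant: whether the sliding ANOVA was significant in this window.
    """
    if not anova_significant:
        return []
    return [t for t in TRIAL_TYPES
            if all(pairwise_p[frozenset((t, other))] < alpha
                   for other in TRIAL_TYPES if other != t)]

# Hypothetical window: only Mgnta-Ptrn2-NR differs from the three other types.
pairwise_p = {}
for a, b in combinations(TRIAL_TYPES, 2):
    differs = "Mgnta-Ptrn2-NR" in (a, b)
    pairwise_p[frozenset((a, b))] = 0.01 if differs else 0.50

print(specific_trial_types(pairwise_p, anova_significant=True))
```

A neuron would then be classified as trial-type specific if this function returns a non-empty list for at least one window in the period.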
Results
Behavioral results
The behavior (bar-release errors and licking behavior) of the monkeys during conditional associations, i.e., the CACR task (Fig. 1A,B) became stable within 2 weeks of introduction, showing that they had learned the task quickly. As seen in Figure 2, the error rate was lower in rewarded trials than in unrewarded trials for both monkeys, with a significant interaction between the identity of Cue1 (Cue1 identity) and the identity of Cue2 (Cue2 identity), i.e., the reward conditions (juice reward vs no juice reward; Fig. 1C), and there was no significant main effect of the Cue1 identity or the Cue2 identity (factor = Cue1 identity, F(1,216) = 2.76, p = 0.10, factor = Cue2 identity, F(1,216) = 0.67, p = 0.41, interaction, F(1,216) = 115.3, p < 0.05 for monkey T; factor = Cue1 identity, F(1,224) = 3.65, p = 0.06, factor = Cue2 identity, F(1,224) = 2.09, p = 0.15, interaction, F(1,224) = 40.1, p < 0.05 for monkey S). These results show that, though the monkeys were free to ignore the cues, they distinguished rewarded from unrewarded trials, depending on the cue sequences. Licking behavior was also different between rewarded and unrewarded trials (Fig. 3A, top and bottom). The monkeys licked until the appearance of Cue2 for both rewarded and unrewarded trials. They continued licking through Cue2 (for monkey S), Wait, Go, and Correct periods (for both monkeys S and T) in the rewarded trials, but stopped licking after the appearance of Cue2 in the unrewarded trials (Fig. 3B; monkey T, two-way ANOVA, interaction, p < 0.05, F(1,216) = 74.4, 50.4 and 487.9, for Wait, Go, and Correct periods, respectively; monkey S, two-way ANOVA, interaction, p < 0.05, F(1,224) = 6.11, 80.4, 296.3, and 555.6, for Cue2, Wait, Go, and Correct periods, respectively). Thus, both the error rate and the licking behavior show that the monkeys recognized the reward conditions, i.e., whether a current trial was a rewarded or unrewarded trial, after the appearance of the Cue2.
Electrophysiological results
The activity of 218 single neurons (106 from monkey T and 112 from monkey S) in the perirhinal cortex was recorded during the CACR task. Figure 4A shows recording sites that were reconstructed from MR images (see Materials and Methods). The number of trials in a session was 37–365, with a mean of 116. Recording sites and types of neuronal activity were similar in both monkeys, so data from the two monkeys were treated as a single population. Figure 5 shows the distribution of the averaged firing rate during a background period (a 400 ms period before the Cue1 onset) for the 218 recorded neurons (mean, 13.7 spikes/s; range, 0–27.4; lower quartile, 5.1; median, 13.7; upper quartile, 22.3). Activity dependent upon the Cue1 identity, Cue2 identity, and/or reward conditions was observed for 67% (146/218) of the recorded neurons, and these neurons were regarded as having task-related activity (sliding two-way/one-way ANOVA; see Materials and Methods).
The task-related neurons showed activity modulations depending on the reward conditions starting after the Cue2 presentation. To provide an example of how such activity plays out in the task, Figure 6A shows the activity of a neuron whose modulations depended on both the reward conditions and the image identity of the visual stimuli. This neuron slightly increased its activity after presentation of a magenta color cue (Mgnta), and the activity increased again after the presentation of Cue2, with the activity being stronger in unrewarded trials (the second and third rows) than in rewarded trials (the first and fourth rows). The activity also increased again starting from the Correct signal onset in rewarded trials. Thus, this neuron showed Cue1-identity dependent activity during the Cue1 period, and reward-condition dependent activity during the Cue2 period and during the Correct signal period.
Neuronal activity in the Cue2 period
Our main interest was to investigate how the conditional association starting with the Cue2 period was represented by the neuronal firing. During the Cue2 period, 87 of the recorded neurons (87/218, 40%) showed task-related activity (sliding two-way ANOVA; see Materials and Methods). During the period, three kinds of response modulations were identified, i.e., a Cue1-identity dependent activity that is a response related to the memory of the first cue for 30 neurons (30/218, 14%), a current Cue2-identity dependent activity for 61 neurons (61/218, 28%), and a reward-condition dependent activity for 39 neurons (39/218, 18%).
To examine the representations of these three kinds of activity over time, we examined the percentage of response variance explained by the Cue1 identity, Cue2 identity, and reward conditions around the Cue2 onset for the 87 task-related neurons (sliding two-way ANOVA; see Materials and Methods). The averaged values across the neuronal population are shown in Figure 7. The Cue1-identity signal was maintained through the delay period and decreased after Cue2 onset. The explanatory power of the Cue2 identity and the reward conditions increased following the Cue2 onset. The peak of the Cue2-identity signal was larger than the peak of the reward-condition signal (means of 29 and 15%, t test, p < 0.05), and the peak of the reward-condition signal was observed later than the peak of the Cue2-identity signal (medians of 410 and 260 ms, Kolmogorov–Smirnov test, p < 0.05).
Among the task-related neurons during the Cue2 period, 31 (36%, 31/87) showed a Cue2-identity dependent activity alone. To examine whether the activity of these neurons was related to physical characteristics of the stimuli or whether the activity might be dependent on the conjunction of the task and the stimulus, neuronal activity during the CACR task was compared with neuronal activity during the fixation task. The example neuron shown in Figure 8 exhibited larger responses to Ptrn1 than to Ptrn2 during the CACR task, but the responses to Ptrn1 and Ptrn2 were indistinguishable during the fixation task. Of the 31 neurons, the activity of 13 neurons (13/31) for which the level of background activity did not change across the two tasks was examined to decrease the possibility that the unit was lost during the task change (background activity, activity during a fixed 400 ms window before the Cue1 onset in the CACR task or before the stimulus onset in the fixation task, Student's t test, p > 0.05). For 12 of the 13 neurons, the stimulus-related activity changed, with eight showing greater activity during the CACR task than during the fixation task (Student's t test between spike counts during the CACR task and spike counts during the fixation task in a 200 ms time window that was slid from [−100, 100] ms to [400, 600] ms after the stimulus onset, p < 0.05, p values adjusted for multiple comparisons using the FDR procedure). Responses that were dependent on the identity of the pattern stimuli during the CACR task disappeared during the fixation task for 6 of the 12 neurons (a sliding one-way ANOVA, factor = pattern stimuli, p < 0.05, p values adjusted for multiple comparisons using the FDR procedure). This result resembles the stronger stimulus dependency during a cognitive task than during a fixation task reported by Liu and Jagadeesh (2008).
Thus, even when it might appear that there is a stimulus identity-dependent activity, the activity often appears to be contingent on the conjunction of stimulus and task.
Many neurons show activity related to reward conditions (Fig. 9A, where the activity is stronger in unrewarded trials than in rewarded trials during the Cue2 period). Moreover, this activity is not simply dependent on the reward delivery during a free-reward task (Fig. 9B). Of the 39 neurons with reward-condition dependent activity that was revealed by the sliding two-way ANOVA, activity was stronger in the rewarded trials than in the unrewarded trials for approximately half (20/39, 51%), a proportion similar to that reported by Mogami and Tanaka (2006). Among these 39 neurons, 56% (22/39) did not have Cue1-identity dependent activity during the Cue1 and/or the delay periods.
To examine whether or not the representation of the reward-condition dependent activity during the Cue2 period was unique to this specific combination of Cue1–Cue2 sequences, an alternative set of Walsh patterns (Ptrn1′ and Ptrn2′) was used as Cue2. For 12 neurons, both the original and the alternative cue sets were examined, and background activity (a 400 ms period before the Cue1 onset) was not significantly different between the blocks using the two sets (Student's t test, p > 0.05). Of the 12 neurons, 6 showed a significant effect of the reward conditions during the Cue2 period using either the original or the alternative set, with 3 showing a significant effect of reward conditions only with the original cue set, 2 only with the alternative set, and the remaining one with both sets (sliding two-way ANOVA).
Because the reward conditions were determined for the Cue1–Cue2 combination, some neurons may represent signals delineating a specific Cue1–Cue2 sequence before representing the reward-condition signal. Signals related to a specific Cue1–Cue2 sequence were examined using a Tukey's HSD test (p < 0.05, in a 200 ms time window that was slid by 10 ms steps from [−100, 100] ms to [400, 600] ms from the Cue2 onset), by comparing neuronal activity between pairs of Cue1–Cue2 sequences (for example, comparisons between activity during the Cue2 period of Mgnta–Ptrn2–NR trials and that of Mgnta–Ptrn1–R, Cyn–Ptrn1–NR, or Cyn–Ptrn2–R trials). If spike counts in one trial type were significantly different from spike counts in every other trial type (see Materials and Methods for details), the neuron was classified as representing a specific Cue1–Cue2 sequence, i.e., a trial-type-specific activity. Figure 10A shows responses of a neuron with such a trial-type-specific activity. This neuron responds to Ptrn2 presentation, but the response strength differs between the Mgnta–Ptrn2–NR and Cyn–Ptrn2–R trial types, thereby delineating these two cue sequences. The activity of the neuron is defined as trial-type specific during the period indicated by red bars on the abscissa.
Figure 11, A–C, illustrates the time course of response variance explained by the Cue1 identity, Cue2 identity, and reward conditions for each task-related neuron. Figure 11D illustrates the time course of the trial-type-specific activity. A trial-type-specific activity was observed for a total of 18 neurons consisting of 11 (11/39, 28%) neurons with the reward-condition dependent activity and 7 neurons with the Cue2-identity dependent activity and/or the Cue1-identity dependent activity. The percentage of neurons with the reward-condition dependent activity (45%, 39/87; Fig. 11C) was larger than that with the trial-type-specific activity (21%, 18/87; Fig. 11D) (χ2 test, p < 0.05).
The latency of the trial-type-specific activity (Fig. 12B; N = 18, first windows of blue bars in Fig. 11D) was compared with the latency of the Cue2-identity dependent activity (Fig. 12A; N = 61, first windows of colored bars in Fig. 11B) and with the latency of the reward-condition dependent activity (Fig. 12C; N = 39, first windows of colored bars in Fig. 11C). The latency distribution of the reward-condition dependence was significantly longer than the latency distribution of the Cue2-identity dependence (Fig. 12D, alternate long and short dash vs solid curves; Kolmogorov–Smirnov test, p < 0.05). The latency distribution of the trial-type-specific activity was not significantly different from the latency distribution of the Cue2-identity dependence (broken vs solid curves, Kolmogorov–Smirnov test, p = 0.35) nor from the latency distribution of the reward-condition dependence (broken vs alternate long and short dash curves, Kolmogorov–Smirnov test, p = 0.27).
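For readers unfamiliar with the comparison of latency distributions, the two-sample Kolmogorov–Smirnov statistic is the largest vertical distance between two empirical cumulative distributions. The sketch below computes only that statistic, not the p value; the latency values are hypothetical and are not data from this study.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (maximum ECDF difference)."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The maximum difference can only occur at an observed value.
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # fraction of a at or below x
        fb = bisect_right(b, x) / len(b)  # fraction of b at or below x
        d = max(d, abs(fa - fb))
    return d

# Hypothetical latencies (ms): start of the first significant window per neuron.
cue2_identity_latency = [100, 150, 200, 250]
reward_condition_latency = [200, 300, 350, 400]
d_value = ks_statistic(cue2_identity_latency, reward_condition_latency)
print(d_value)
```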
To examine whether or not the neurons representing the reward conditions show responses to actual reward delivery outside of the task context, 18 of the 39 reward-condition dependent neurons were examined using a free-reward task (Figs. 6B, 9B, 10B). Of the 18 neurons, 6 (6/18 neurons, 33% including 1 with a trial-type-specific activity) showed a significant response after the reward delivery during the free-reward task (Student's t test between spike counts during a 400 ms period before the reward apparatus activation and spike counts during a 400 ms period starting from 150 ms after the reward apparatus activation, p < 0.05). Among these six neurons, three showed a stronger response to Cue2 in the rewarded trials, and the remaining three showed a stronger response to Cue2 in the unrewarded trials (Fig. 6B) during the CACR task.
We examined whether or not the reward-condition dependent activity was related to the licking behavior of the monkeys for 26 of 28 neurons with the reward-condition dependent activity but without the trial-type-specific activity. Trial-by-trial analysis of the correlation between the spike counts (during a 600 ms period from the Cue2 onset) and such analysis of the proportion of licking duration during the Wait period (Fig. 3B) were performed independently in rewarded or unrewarded trials, and revealed a weak correlation between spike counts and the licking duration (p < 0.05) in 10/26 (38%) neurons.
Task-related activity across the events
Figure 13A shows the percentage and the number of neurons showing task-related activity across trial events. Cue1-identity dependent activity was observed during the Cue1 period (23%, 51/218), during the delay period (17%, 36/218), and during the Cue2 period (14%, 30/218). The number of neurons with task-related activity was the greatest during the Cue2 period (40%, 87/218, χ2 test, p < 0.05). The percentage of task-related neurons decreased during the Wait and Go signal periods (χ2 test, p < 0.05). The percentage of neurons with reward-condition dependent activity increased during the Correct signal period (13%, 29/218; the activity of an example neuron is shown in Fig. 6A). During this period, no trial-type-specific activity was found. The peak values of the response variance explained by the reward conditions during the Cue2 period and during the Correct signal period were not significantly different (medians of 22% and 29% for the Cue2 and the Correct signal periods, respectively; Kolmogorov–Smirnov test, p = 0.07). Of the 29 neurons with reward-condition dependent activity during the Correct signal period, 17 were examined using the free-reward task. Six neurons (6/17, 35%) showed a response to free reward (Fig. 6B), including three neurons with stronger responses in the rewarded trials and three neurons with stronger responses in the unrewarded trials during the CACR task.
Figure 13B shows the significance profiles of each neuron across the trial events. These reveal that many neurons showed task-related activity during more than one epoch and that the character of the response might change from one epoch to the next.
Distribution of neurons
The location of recorded neurons was examined using MR images (see Materials and Methods). Figure 4B shows that task-related neurons were distributed widely across tracks and along dorsoventral planes. There was no consistent trend between the two monkeys regarding the distribution of neurons according to the response types.
Discussion
We investigated how neurons in the perirhinal cortex represent reward conditions upon presentation of two visual stimuli in sequence. For each trial, one of two color stimuli first appeared as Cue1 and was followed, after a short delay, by one of two visual patterns as Cue2. Each pattern stimulus was conditionally associated with rewarded and unrewarded outcomes depending on the preceding color stimulus. This design allowed us to study how the responses to the pattern stimulus depended on the prediction of the animal as to whether it would or would not be rewarded. Reward-condition dependent activity, that is, activity related to the reward conditions regardless of the pattern, was observed during the Cue2 period in approximately half of the task-related neurons. The number of neurons appearing to show the pattern-dependent activity increased in the CACR task over that seen when the pattern was presented in a simple fixation task, indicating that most of the task-related neurons do not simply reflect physical properties of visual stimuli.
Reward-condition dependent activity upon presentation of physically identical visual stimuli
Previous electrophysiological studies of the perirhinal cortex have shown that neurons in this region represent information regarding associative relationships between two visual stimuli (Naya et al., 1996, 2003; Messinger et al., 2001; Fujimichi et al., 2010; Takeuchi et al., 2011), associations between visual stimuli and reward schedule states (Liu and Richmond, 2000), or associations between visual stimuli and the delivery/omission of reward (Mogami and Tanaka, 2006). The results of the present study extend previous findings by showing that the activity of perirhinal neurons signals the meaning of a stimulus, namely, the reward conditions, independent of the identity of the stimulus. Because associative relationships between the stimulus and reward outcomes were set trial by trial by another stimulus (an occasion setter, Holland, 1986), these findings further indicate that the ability of this cortex is not limited simply to association of a single stimulus with a reward.
The perirhinal cortex receives projections from the visual area TE (Suzuki and Amaral, 1994; Saleem and Tanaka, 1996), but area TE neurons change neither response strength nor response selectivity when relationships of visual stimuli and associated reward conditions (a reward or an aversive taste) are reversed (Rolls et al., 1977). If one considers the feedforward aspect of neuronal processing, the flexible representation of the reward conditions might first arise in the perirhinal cortex. This is compatible with findings of impaired performance of monkeys with perirhinal cortex ablation during reversal learning of stimulus–reward association (Murray et al., 1998; Hampton and Murray, 2002), and during learning or remembering relative reward values (Clark et al., 2012). Because the neurons were equally divided as to whether responses were stronger in rewarded or unrewarded trials, the responses cannot be categorically attributed to response enhancement related to directed attention (Chelazzi et al., 1993, 1998), nor can they be attributed to response decrement following recently and/or repeatedly presented visual stimuli (Fahy et al., 1993; Xiang and Brown, 1998; Liu et al., 2009).
The CACR task was similar to the task used in Watanabe (1990) to examine neurons in the dorsolateral prefrontal cortex (DLPFC). Many DLPFC neurons represented reward conditions (67% of cue related neurons) rather than the physical property of cues (7%) or specific cue sequences (26%) (Watanabe, 1990). Because the perirhinal cortex has only a minor projection to the DLPFC (Petrides and Pandya, 2002; Muñoz and Insausti, 2005; Saleem et al., 2008), it is not clear if the perirhinal cortex lies at an earlier processing stage than the DLPFC for representing the associated reward conditions, or whether the reward-condition dependent activity arises independently in these two areas.
Neuronal mechanisms encoding reward conditions in the perirhinal cortex
The latency distributions of the Cue2-identity dependent activity, the trial-type-specific activity, and the reward-condition dependent activity showed that the reward-condition signal appeared later than the Cue2-identity signal, and that the cue sequences, i.e., the trial-type-specific activity, were represented between these signals. Based on these results, we can imagine that hierarchical relationships might exist to give rise to these three signals, in that neurons might show the trial-type-specific activity by associating Cue1 and Cue2, and that neurons might encode the reward conditions by associating the trial-type-specific activity and the rewarded/unrewarded outcomes. It is not clear in this study whether the trial-type-specific signal represents associative signals about Cue1–Cue2, Cue2–reward conditions, or Cue1–Cue2–reward conditions.
We speculate that at least two processing stages are necessary to encode the reward conditions. If the association between Cue1 and Cue2 and the association with reward occurred in a single stage, there should be a reward signal in either the Mgnta–Ptrn1–R trial or the Cyn–Ptrn2–R trial, but not in both. In the first stage, the trial-type-specific signal might be encoded by associating Cue1 and Cue2, resulting in four types of neurons representing each cue sequence, i.e., Mgnta–Ptrn1, Mgnta–Ptrn2, Cyn–Ptrn1, and Cyn–Ptrn2. The four types of trial-type-specific neurons in the first stage project to a single second stage neuron. The second stage neuron also receives reward signals from other areas such as the amygdala or the OFC, where neurons respond to appetitive or aversive stimuli (Belova et al., 2007; Morrison and Salzman, 2009). Probably, neither the trial-type-specific signal nor the reward signals alone are sufficient to drive the second stage neuron, and both kinds of signals are required. Thus, the second stage neuron becomes responsive to Cue2 in the rewarded or unrewarded trials, leading to responses related primarily to reward conditions. Alternatively, the second stage might be performed by the amygdala or the OFC, and the perirhinal cortex might represent a reward-condition signal sent from the amygdala/OFC. However, this is less likely because the reward-condition signal in the OFC has already become cue set-independent (Tremblay and Schultz, 1999). Thus, we speculate that a hierarchical integration of signals occurs in the perirhinal cortex. The importance of this cortex in signal integration has been suggested previously (Liu and Richmond, 2000; Yoshida et al., 2003).
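The two-stage scheme proposed above can be illustrated with a toy sketch. It is a deliberately simplified, hypothetical model rather than a fitted or tested one: binary units stand in for neurons, and fixed 0/1 weights stand in for the pairing of each cue sequence with reward signals that the text attributes to inputs from the amygdala or the OFC. The rewarded sequences (Mgnta–Ptrn1 and Cyn–Ptrn2) follow the text; all function and variable names are assumptions.

```python
from itertools import product

CUE1 = ("Mgnta", "Cyn")
CUE2 = ("Ptrn1", "Ptrn2")
SEQUENCES = list(product(CUE1, CUE2))

# Rewarded cue sequences, as named in the text.
REWARDED = {("Mgnta", "Ptrn1"), ("Cyn", "Ptrn2")}


def stage1(cue1, cue2):
    """First stage: one conjunction unit per cue sequence.
    Exactly one unit is active (1) on any given trial."""
    return {seq: 1 if seq == (cue1, cue2) else 0 for seq in SEQUENCES}


def stage2(stage1_activity, prefers_reward=True):
    """Second stage: a unit pooling all four first-stage units.
    ASSUMPTION: the 0/1 weights stand in for a learned pairing of each
    cue sequence with reward-related input; no learning is modeled."""
    weights = {
        seq: 1 if (seq in REWARDED) == prefers_reward else 0
        for seq in SEQUENCES
    }
    drive = sum(weights[seq] * act for seq, act in stage1_activity.items())
    return 1 if drive > 0 else 0


for cue1, cue2 in SEQUENCES:
    units = stage1(cue1, cue2)
    print(
        cue1, cue2,
        "single stage-1 unit:", units[("Mgnta", "Ptrn1")],
        "reward-preferring stage-2 unit:", stage2(units),
        "unrewarded-preferring stage-2 unit:", stage2(units, prefers_reward=False),
    )
```

In this sketch a single first-stage unit is active in only one of the two rewarded trial types, whereas a second-stage unit is active in both trial types that share a reward condition, whichever pattern serves as Cue2.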
Because signals related to reinforcement or punishment are widely represented throughout the brain (see Vickery et al., 2011), it is possible that the reward-condition signal in the perirhinal cortex is related to signals in areas other than the amygdala and the OFC.
The trial-type-specific signal could arise as a neuronal signal related to the conjunctive representation that has been proposed by Bussey et al. (2002) based on impaired performance of monkeys with perirhinal cortex ablation upon learning of biconditional discriminations. In our monkeys that had been well trained in the CACR task, the perirhinal cortex neurons encoded the reward conditions in addition to the conjunctive representation of the visual stimuli. The perirhinal cortex may have a greater effect on the reward-condition encoding than on representation of the specific cue sequence during the CACR task, because the percentage of neurons encoding the reward conditions (45%) was larger than the percentage of neurons showing the trial-type-specific activity (21%) and because the percentage of the reward-condition dependent neurons increased again during the Correct signal period.
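The two percentages compared above appear to follow from counts reported in the Results, assuming that both are expressed relative to the 87 neurons with task-related activity during the Cue2 period (39 reward-condition dependent neurons and 18 neurons with trial-type-specific activity). That choice of denominator is an inference from the reported numbers, not something stated explicitly in this section. A short check:

```python
def percent(count, total):
    """Percentage rounded to the nearest integer."""
    return round(100 * count / total)


# ASSUMPTION: 87 Cue2-period task-related neurons is the shared denominator.
TASK_RELATED_CUE2 = 87
reward_condition_pct = percent(39, TASK_RELATED_CUE2)
trial_type_pct = percent(18, TASK_RELATED_CUE2)
print(reward_condition_pct, trial_type_pct)
```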
Role of the reward-condition signal in the medial temporal lobe network
As described above, encoding of the reward-condition signal in the perirhinal cortex may be closely related to processing in the amygdala/OFC network. Characteristics of the reward-condition dependent activity seem to be similar across the three areas. The proportion of the reward-condition dependent neurons that preferred either the rewarded or unrewarded trials was similar to the proportion of neurons that preferred either positive or negative outcomes in the amygdala and in the OFC (Paton et al., 2006; Morrison and Salzman, 2009). Some neurons with cue-related responses showed responses to free reward like those observed in the amygdala and the OFC (Tremblay and Schultz, 2000; Sugase-Miyamoto and Richmond, 2005). Studies of rodents and cats have shown that interactions between the perirhinal/entorhinal cortices and the amygdala increase impulse traffic to the entorhinal–hippocampal network (Kajiwara et al., 2003; Pelletier et al., 2005; Paz et al., 2006). The reward-condition signal may contribute to such increased impulse traffic. We speculate that the reward-condition signal plays a role in memory of the cue sequences. The pathway from the perirhinal cortex to the entorhinal cortex is regarded as one of the main paths into the entorhinal–hippocampal network, which is crucially involved in memory formation in the hippocampal memory system (Squire, 1992). A recent ablation study has indicated a role of this cortex in contextual learning about reward size (Clark et al., 2012).
The activity of neurons in the perirhinal cortex is not a pure reflection of the sensory signal, but represents the conjunction of the task and the stimulus. By combining information regarding specific cue sequences and reward-related signals, perirhinal cortex neurons carry flexible representations of reward-condition signals like those seen in the amygdala and the OFC. This representation, that is, whether the current cue indicates a rewarded or unrewarded outcome depending on the context, would provide information regarding the situation with which animals are confronted and could be used to predict upcoming rewarded/unrewarded outcomes.
Footnotes
This work was supported by Grant-In-Aid for JSPS Fellows 10J00502 (K.O.); Grant-in-Aid for Scientific Research on Innovative Areas, “Face perception and recognition” from MEXT KAKENHI and KAKENHI 20700356 and 23530971 (Y.S.-M.) and 20700219 and 22700161 (N.M.); and Grant-in-Aid for Scientific Research on Innovative Areas, “Structural Cell Biology” from MEXT KAKENHI and CREST (C.S.). We thank Dr. Barry J. Richmond for helpful comments on this manuscript, and Dr. Ichiro Takashima, Dr. Aya Takemura, Dr. Shigeru Yamane, Mizuho Yamane, Ai Muramatsu, and Toshiharu Takasu for their assistance.
The authors declare no competing financial interests.
- Correspondence should be addressed to Dr. Yasuko Sugase-Miyamoto, System Neuroscience Group, Human Technology Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba Central 2, 1-1-1 Umezono, Tsukuba, Ibaraki 305-8568, Japan. y-sugase{at}aist.go.jp