Abstract
Learned behavioral responses to sounds depend largely on the expected outcomes associated with each potential choice. Where and how the nervous system integrates expectations about reward with auditory sensory information to drive appropriate decisions is not fully understood. Using a two-alternative choice task in which the expected reward associated with each sound varied over time, we investigated potential sites along the corticostriatal pathway for the integration of sound signals, behavioral choice, and reward information in male mice. We found that auditory cortical neurons encode not only sound identity, but also the animal's choice and the expected size of reward. This influence of reward expectation on sound- and choice-related activity was further enhanced in the major striatal target of the auditory cortex: the posterior tail of the dorsal striatum. These results indicate that choice-specific information is integrated with reward signals throughout the corticostriatal pathway, potentially contributing to adaptation in sound-driven behavior.
SIGNIFICANCE STATEMENT Learning and maintenance of sensory-motor associations require that neural circuits keep track of sensory stimuli, choices, and outcomes. It is not clear at what stages along the auditory sensorimotor pathway these signals are integrated to influence future behavior in response to sounds. Our results show that the activity of auditory cortical neurons and of their striatal targets encodes the animals' choices and expectation of reward, in addition to stimulus identity. These results challenge previous views of the influence of motor signals on auditory circuits and identify potential loci for integration of task-related information necessary for updating auditory decisions in changing environments.
Introduction
Behavioral choices in response to sensory stimuli depend not only on stimulus identity but also on the expected outcomes (rewards or punishments) associated with a given choice. Successful learning of sensory-motor associations requires that neural circuits integrate information about stimuli, choices, and outcomes to update the flow of behavior-driving signals for future events. How this integration process occurs and where along the auditory sensorimotor pathway it takes place are not fully understood. A first step toward addressing these gaps in knowledge is to identify at what stages in the pathway from sound to action the activity of neurons encodes information about sensory stimuli, choices, and reward. Here, we quantify the extent to which these task-relevant signals are present in the auditory cortex (AC) and its major striatal target, the posterior tail of the dorsal striatum (pStr).
Previous evidence suggests that motor signals as well as expectations about sensory stimuli and rewards can influence early stages of cortical sensory processing (Shuler and Bear, 2006; Weinberger, 2007; Jaramillo and Zador, 2011; Jaramillo et al., 2014; Guo et al., 2017). In the AC of primates, reward expectation signals have been observed when the amount of reward is manipulated on a trial by trial basis (Brosch et al., 2011). Moreover, studies in ferrets showed that the valence (reward vs punishment) of the behavior associated with an auditory stimulus can influence auditory cortical responses (David et al., 2012). In addition to effects of reward expectation, motor-related signals can also modulate neuronal activity in the AC (Nelson et al., 2013). Nevertheless, whether a combination of each behavioral choice and its associated reward is encoded in the activity of auditory cortical neurons or only appears in areas downstream is not fully understood.
The posterior tail of the dorsal striatum is a major subcortical output of the AC in mammals (Hunnicutt et al., 2016), with the potential to modulate motor responses via its outputs to other basal ganglia structures. In support of this view, the auditory corticostriatal pathway has been shown to play an important role in reward-driven auditory decisions (Znamenskiy and Zador, 2013; Guo et al., 2018). In addition to dense neuronal projections from the AC, the posterior striatum receives substantial dopaminergic innervation (Menegas et al., 2015), making it a potential site of integration of motor and reward signals, as neurons in this region display activity that reflects the choice associated with a sound stimulus (Guo et al., 2018). However, to what extent the representation of these task-related variables arises in the striatum or is inherited from cortex remains unknown.
We report that neural activity in both AC and posterior striatum is selective not only to the identity of the sound, but also to the behavioral choice. Moreover, while reward expectation modulates sound-evoked activity in a small fraction of neurons in the auditory corticostriatal pathway, a larger proportion of cells in both brain areas display choice-selective activity influenced by reward expectation. Although all these effects are seen in both brain regions, they are more prevalent in neurons of the posterior striatum compared with the AC.
Materials and Methods
Animals.
A total of 11 adult male wild-type mice (C57BL/6J) were used in this study: 6 for electrophysiological recordings from the AC, and 5 for striatal recordings. Mice had ad libitum access to food, but water was restricted. Free water was provided on days with no experimental sessions. All procedures were performed in accordance with National Institutes of Health standards and were approved by the University of Oregon Institutional Animal Care and Use Committee.
Behavioral task.
The two-alternative choice sound discrimination task was performed inside single-walled sound isolation boxes (IAC-Acoustics). Animals performed the task in the dark. Behavioral data were collected using the taskontrol platform (https://taskontrol.readthedocs.io) written in the Python programming language (http://www.python.org). Mice initiated each trial by poking their noses into the center port of a three-port behavior chamber. After a silent delay of random duration (150–250 ms, uniformly distributed), a narrow-band sound (chord) was presented for 100 ms. Animals were required to stay in the center port until the end of the sound, and then choose one of the two side ports for water reward according to the frequency of the sound (low-frequency, left port; high-frequency, right port). If animals withdrew before the end of the stimulus, the trial was aborted and ignored in the analysis. Each behavioral session lasted 60–90 min.
Stimuli consisted of chords composed of 12 simultaneous pure tones logarithmically spaced in the range f/1.2 to 1.2f for a given center frequency f. The intensity of all components of the chord was set to the same value between 30 and 50 dB-SPL, changing from one trial to the next during the initial training. This intensity was fixed to 50 dB-SPL during behavioral assessment and electrophysiological recordings. To evaluate categorization performance across the frequency spectrum, we used eight distinct center frequencies ranging from 6.2 to 19.2 kHz (logarithmically spaced) for nine of the mice, and ranging from 6.2 to 17 kHz for the other two mice. The lowest four values were considered low frequencies (requiring a left choice for reward) and the highest four values were considered high frequencies (requiring rightward choices). All center frequencies were equally likely to occur throughout the session.
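The stimulus construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names are hypothetical, and it assumes that both endpoints of each logarithmic range are included.

```python
import math

def log_spaced(low, high, n):
    """Return n values logarithmically spaced from low to high (endpoints included)."""
    if n == 1:
        return [low]
    step = (math.log(high) - math.log(low)) / (n - 1)
    return [math.exp(math.log(low) + i * step) for i in range(n)]

def chord_components(center_freq, n_tones=12, factor=1.2):
    """Frequencies (Hz) of the pure tones in one chord: f/1.2 to 1.2f."""
    return log_spaced(center_freq / factor, center_freq * factor, n_tones)

# Eight center frequencies from 6.2 to 19.2 kHz (the range used for nine mice).
center_freqs = log_spaced(6200.0, 19200.0, 8)
low_category = center_freqs[:4]    # rewarded at the left port
high_category = center_freqs[4:]   # rewarded at the right port
components = chord_components(center_freqs[0])
```

With logarithmic spacing, the ratio between neighboring components is constant within a chord, so every chord spans the same fraction of an octave around its center frequency.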
After animals achieved stable high accuracy in their performance (>80% correct for the easiest stimuli), we modified this task to evaluate the effect of changing reward expectations on behavior.
The amount of water delivered on each port was varied from one block of trials to the next. A single session consisted of several blocks of 150–200 trials. In a “left more reward block”, the left reward port dispensed 6 μl of water on correct trials while the right port dispensed 1.3 μl of water. In a “right more reward block” these amounts were reversed: 1.3 μl left, 6 μl right (Fig. 1A). Other than the reward amount, no cues were given to indicate a transition from one block to the next. The initial block type in a session was randomized from one day to the next. During electrophysiological recordings, only one frequency from each category was used, chosen based on the preferred frequency of a recording site. The frequencies used in these sessions were far from the categorization boundary to obtain a large number of correct trials for each condition.
Surgical implant of tetrode arrays.
Animals were anesthetized with isoflurane through a nose cone on the stereotaxic apparatus. Mice were surgically implanted with a custom-made microdrive containing eight tetrodes targeting the right posterior striatum or the right primary AC. Each tetrode was composed of four tungsten wires (CFW0011845, California Fine Wire) twisted together. Tetrodes varied in length, with a 500 μm difference between the longest and the shortest tetrodes. To enable visualization of the electrode tracks postmortem, electrodes were coated with DiI before implantation. For the posterior striatum, tetrodes were positioned at 1.7 mm posterior to bregma, 3.5–3.55 mm from midline, and 2 mm from the brain surface at the time of implantation. For AC, tetrodes were implanted at 2.8 mm posterior to bregma, 4.5 mm from midline, and 0.5 mm from the brain surface. Cortical recordings were targeted to the primary AC, but neurons from all cortical fields were included in the analysis. All animals were monitored and recovered fully before behavioral and electrophysiological experiments.
Neural recordings.
Electrical signals were collected using an RHD2000 acquisition system (Intan Technologies) and OpenEphys software (http://www.open-ephys.org). Evoked responses to sound were monitored daily and tetrodes were advanced by 80 μm after each recording session. At the first depth where sound-evoked responses were observed, we started collecting electrophysiological data during the sound discrimination task. Recordings for each animal stopped when no more sound responses were observed. Tetrode locations were confirmed histologically based on electrolytic lesions and fluorescent markers.
Experimental design and statistical analysis.
Psychometric curve fitting (Fig. 1B) was performed via constrained maximum likelihood to estimate the parameters of a logistic sigmoid function (http://psignifit.sourceforge.net). We calculated reaction times and movement times for each mouse separately for trials with a left choice and a right choice, as some mice developed faster responses for one side than the other. Average reaction times and movement times across mice between the two reward contingencies were compared using the Wilcoxon signed-rank test.
Neuronal data were analyzed using in-house software developed in Python (http://www.python.org). Spiking activity of single units was isolated using an automated expectation maximization algorithm (KlustaKwik; Kadir et al., 2014). Isolated units were only included in the analysis if <2% of interspike intervals were <2 ms. Spike shapes of resulting units were visually inspected to exclude those dominated by noise.
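The interspike-interval inclusion criterion can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical and spike times are taken to be in seconds.

```python
def isi_violation_fraction(spike_times, threshold=0.002):
    """Fraction of interspike intervals shorter than `threshold` (seconds)."""
    times = sorted(spike_times)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0  # assumption: a unit with fewer than two spikes has no violations
    return sum(isi < threshold for isi in isis) / len(isis)

def passes_isi_criterion(spike_times, max_fraction=0.02):
    """True if fewer than 2% of interspike intervals are shorter than 2 ms."""
    return isi_violation_fraction(spike_times) < max_fraction

# Example: one 1 ms interval among four intervals fails the criterion.
example_spikes = [0.0, 0.010, 0.011, 0.030, 0.050]
example_fraction = isi_violation_fraction(example_spikes)
```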
Sound-evoked responses were quantified as the firing rate over the stimulus period (0–100 ms after sound onset) compared with the baseline firing rate (100 ms window before sound onset), using the Wilcoxon rank sum test with a significance level of 0.025 (after Bonferroni correction for multiple comparisons). Movement-related neural activity was measured by averaging firing rates across trials in a 0–300 ms window after the animal exited the center port. We excluded any trials in which the animal had reached a reward port before 300 ms. The baseline period for comparison with movement period was 0–100 ms before sound onset, chosen to ensure that the animal was stationary without any sound stimuli. The choice selectivity index was calculated as (C − I)/(C + I), where C and I were average firing rates in trials with contralateral and ipsilateral movement, respectively. Average firing rates for trials with contralateral versus ipsilateral movement were compared using the Wilcoxon rank sum test; neurons with a resulting p value of <0.05 were considered selective for movement direction. For each movement-selective cell, we computed the difference in firing rate while holding either the sound stimulus or the movement direction constant. Results from the two sound stimuli and the two movement directions were averaged to obtain the mean difference.
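As a worked illustration of the choice selectivity index (C − I)/(C + I), the following sketch computes it from per-trial firing rates. The function name is hypothetical, and returning NaN when both mean rates are zero is an assumption; the text does not say how such cells were handled.

```python
import math
from statistics import mean

def choice_selectivity_index(contra_rates, ipsi_rates):
    """(C - I) / (C + I) from per-trial firing rates (spikes/s).

    C and I are the mean rates for contralateral and ipsilateral choices.
    Returns NaN if both means are zero (assumed convention).
    """
    c = mean(contra_rates)
    i = mean(ipsi_rates)
    if c + i == 0:
        return math.nan
    return (c - i) / (c + i)

# Mean rates of 10 and 5 spikes/s give an index of 5/15, about 0.33.
example_index = choice_selectivity_index([12.0, 8.0], [5.0, 5.0])
```

The index ranges from −1 (firing only for ipsilateral choices) to +1 (firing only for contralateral choices), with 0 indicating no preference.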
We evaluated whether neuronal responses encode reward expectation in two sets of neurons: those that showed statistically significant sound-evoked responses (see Fig. 4), and those that were selective to movement direction (see Fig. 7). To quantify the impact of reward expectation on neuronal activity, we calculated a reward modulation index using the equation: (M − L)/(M + L), where M is the average firing rate from trials with more reward on the chosen port, and L is the average firing rate for trials with less reward on the chosen port. We did this separately for the sound (0–100 ms after sound onset) and movement (0–300 ms after center port exit) periods. Only sessions with at least 70% correct trials and at least two reward contingency switches were included in the analysis. We tested statistical significance of the modulation for each cell using the Wilcoxon rank sum test between the evoked firing of each reward contingency. To exclude effects of non-stationarity in overall neuronal firing rate over time, cells were counted as significantly modulated only if the modulation effect was observed in at least two different switches of reward contingency blocks.
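The reward modulation index and the session inclusion criteria can be sketched as below. The function names and the per-trial data layout (a list of booleans for correct trials and a list of block labels) are assumptions made for illustration.

```python
import math
from statistics import mean

def reward_modulation_index(more_rates, less_rates):
    """(M - L) / (M + L) from per-trial firing rates in each reward condition."""
    m = mean(more_rates)
    l = mean(less_rates)
    if m + l == 0:
        return math.nan  # assumed convention for silent cells
    return (m - l) / (m + l)

def count_switches(block_labels):
    """Number of reward contingency switches in a per-trial list of block labels."""
    return sum(a != b for a, b in zip(block_labels, block_labels[1:]))

def session_included(correct, block_labels, min_correct=0.70, min_switches=2):
    """At least 70% correct trials and at least two contingency switches."""
    fraction_correct = sum(correct) / len(correct)
    return fraction_correct >= min_correct and count_switches(block_labels) >= min_switches

# Toy session: three blocks (two switches), eight of nine trials correct.
labels = ["left_more"] * 3 + ["right_more"] * 3 + ["left_more"] * 3
correct = [True] * 8 + [False]
included = session_included(correct, labels)
```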
To evaluate the fraction of modulated cells expected by chance, we calculated the modulation index for datasets in which the trials for each recorded cell were randomly shuffled. We performed 100 such simulated experiments and quantified for each one the fraction of cells that showed a statistically significant modulation (p < 0.05, as in the original analysis). We report the mean fraction of modulated cells across these simulated experiments and the corresponding SD.
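The shuffling step of this control can be sketched for a single cell as follows. The sketch only builds the null distribution of the modulation index by permuting trial labels; the full analysis would additionally apply the significance test to each shuffled dataset and count the fraction of cells passing it. Function names and the fixed random seed are illustrative assumptions.

```python
import random
from statistics import mean

def modulation_index(more_rates, less_rates):
    """(M - L) / (M + L); assumes the two mean rates are not both zero."""
    m, l = mean(more_rates), mean(less_rates)
    return (m - l) / (m + l)

def shuffled_indices(more_rates, less_rates, n_shuffles=100, seed=0):
    """Modulation indices after randomly reassigning trials to the two conditions."""
    rng = random.Random(seed)
    pooled = list(more_rates) + list(less_rates)
    n_more = len(more_rates)
    indices = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # permute trials, keeping the group sizes fixed
        indices.append(modulation_index(pooled[:n_more], pooled[n_more:]))
    return indices

# Toy cell: 10 spikes/s with more reward, 2 spikes/s with less reward.
more = [10.0] * 50
less = [2.0] * 50
true_index = modulation_index(more, less)
null_indices = shuffled_indices(more, less)
```

For this toy cell, no shuffled index can exceed the true index in magnitude, because the original labeling already separates the two firing rates perfectly.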
To estimate the dynamics of performance around a block transition (see Fig. 9A,C), the fraction of rightward choices was estimated for 60 trials before and 60 trials after the transition. This value was normalized to the average of rightward choices for a given trial type (high-frequency or low-frequency) in the corresponding session. The data corresponding to transitions from “left more reward” to “right more reward” were flipped around the average so all transitions could be pooled together. The gray lines in the figure show the average across all transitions for all animals. To estimate the dynamics of the physiological data from modulated neurons (see Fig. 9B,D), we evaluated the firing rate during movement for the preferred choice of each cell on each trial around each block transition. Firing rates were normalized by subtracting the mean across trials and dividing by the SD for the trials before the transition. The data for cells with negative modulation indices (and for transitions from less to more firing) were flipped around the mean so all neurons (and transitions) could be pooled together. The normalized firing rates for all cells for each trial around the transition were then averaged. This analysis has the caveat that pooling cells with positive and negative modulation indices may result in a spurious average magnitude of the modulation. Therefore, this analysis is informative about the dynamics but not the magnitude of modulation.
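One reading of the firing-rate normalization is sketched below. It assumes the mean is taken across all trials in the window around the transition and the SD (sample SD here) across the pre-transition trials only; the text leaves both details open, and the function name is hypothetical.

```python
from statistics import mean, stdev

def normalize_around_transition(rates, n_before, flip=False):
    """Normalize per-trial firing rates around a block transition.

    rates    : firing rates for consecutive trials; the first n_before
               trials precede the transition.
    n_before : number of pre-transition trials.
    flip     : if True, mirror the trace around the mean (zero after
               normalization), as for cells with negative modulation indices.
    """
    overall_mean = mean(rates)
    pre_sd = stdev(rates[:n_before])  # sample SD of pre-transition trials (assumed)
    sign = -1.0 if flip else 1.0
    return [sign * (r - overall_mean) / pre_sd for r in rates]

# Toy example: firing rate steps up after the transition.
rates = [2.0, 4.0, 2.0, 4.0, 6.0, 8.0, 6.0, 8.0]
normalized = normalize_around_transition(rates, n_before=4)
flipped = normalize_around_transition(rates, n_before=4, flip=True)
```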
Results
Reward expectation influences auditory choices
To evaluate the effects of reward expectation on sound-driven choices, we trained mice to perform a two-alternative choice sound discrimination task. Mice initiated a trial by poking their nose in the center port of a three-port chamber. After the presentation of a 100 ms sound, mice were required to go to either the left or the right reward port based on the frequency of the stimulus to collect a water reward (Fig. 1A). Expectation of reward was manipulated by changing the amount of reward delivered for correct choices on each side port from one block of 150–200 trials to the next. In one block type, the left reward port (associated with low-frequency sounds) delivered 6 μl of water, whereas the right port (associated with high-frequency sounds) delivered only 1.3 μl. In the other block type, the reward amounts were swapped. Animals showed a consistent bias toward the port where a larger reward was expected in a given block of trials (Fig. 1B). These effects of reward expectation occurred not only for frequencies near the categorization boundary, but also for the extreme frequencies (Fig. 1C). Notably, in behavioral sessions that had equal amounts of reward on both sides, mice achieved near-perfect discrimination performance for these extreme frequencies (93.4 ± 5.6% across mice), as can be seen for the example mouse in Figure 1B. Even at these high-accuracy levels, the effect of reward expectation was apparent.
We next asked whether reward expectation affected the speed of behavioral choices. We quantified the reaction time in each trial as the duration between the end of the sound stimulus and the exit from the center port (Fig. 2A). Reward expectation did not change the average reaction time across animals for either movement direction (Fig. 2B,C; Leftward trials: p = 0.82, Rightward trials: p = 0.18, Wilcoxon signed-rank test). Similarly, reward expectation did not change the movement time, defined as the duration of travel from the center port to the side port (Fig. 2D–F; Leftward trials: p = 0.87, Rightward trials: p = 0.92, Wilcoxon signed-rank test).
The observed biases in behavior according to reward contingency indicate that mice integrate information about expected reward size into their sound-driven decisions. Notably, these biases occurred without major changes in the timing of behavioral responses. We next tested to what extent expectation of reward influences neural activity along the auditory corticostriatal pathway.
Reward expectation modulates sound-evoked responses in the auditory corticostriatal pathway
We recorded the activity of single neurons from the AC and its main striatal target, the pStr, via chronically implanted movable tetrode arrays (Fig. 3A,B). To obtain reliable estimates of firing rate under each experimental condition (i.e., from a large number of correct trials for a given stimulus), we collected electrophysiological data during sessions that included only two possible sounds, one from each frequency category (high or low). The specific sound frequencies were chosen to be near the preferred frequencies of recorded cells. A total of 404 well isolated cells from AC and 312 from pStr were recorded while animals performed the sound discrimination task with changing rewards.
Among the recorded cells, we found 159 cells (39.4%) in AC and 138 cells (44.2%) in pStr that responded to at least one of the sounds presented in the task. The average latency of sound-evoked responses was similar for both areas: 22.3 ms for AC and 21.1 ms for pStr (p = 0.15, Wilcoxon rank sum test), suggesting that in addition to cortical inputs, the posterior striatum receives sensory signals from other areas such as the auditory thalamus, as has been previously reported (Ponvert and Jaramillo, 2019). Among the sound-responsive cells, 71.7% in AC and 50.0% in pStr showed activity that distinguished between the two sound stimuli presented (p ≤ 0.05, Wilcoxon rank sum test; Fig. 3C,D). For the population of sound-responsive neurons (including nonselective cells), we tested whether sound-evoked activity was modulated by the expected reward associated with each sound. Specifically, we compared responses evoked by the preferred stimulus of each cell between trials in which the animal expected to get a large water reward (6 μl) versus trials with the same stimulus when a small water reward (1.3 μl) was expected. We found that 6.3% (10/159) of sound-responsive cells in AC displayed evoked responses that depended on the expected reward size (Fig. 4A; p ≤ 0.05, Wilcoxon rank sum test). Sound-evoked responses in the large majority of AC recorded neurons were not affected by changing the amount of reward associated with the sound (Fig. 4C). In contrast, we found that the responses of 13.8% (19/138) of neurons in pStr were modulated by reward expectation (Fig. 4B). This modulation by larger reward expectation was expressed as an increase in activity in some neurons and a decrease in activity in other neurons. We did not find a consistent sign of modulation (e.g., always higher firing for more expected reward) in either AC (p = 0.21, Wilcoxon signed rank test) or pStr (p = 0.18, Wilcoxon signed rank test).
Although the magnitude of the modulation across the population of cells did not differ between the two brain regions (Fig. 4E,F; p = 0.26, Wilcoxon rank sum test), the proportion of modulated cells in pStr was significantly higher than that in AC (p = 0.03, Fisher's exact test).
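The comparison of proportions between areas can be reproduced in outline with a two-sided Fisher's exact test on the reported counts (10 of 159 cells in AC, 19 of 138 in pStr). The implementation below is a self-contained sketch using only the standard library; it sums the probabilities of all tables at least as unlikely as the observed one, which is one common definition of the two-sided p value and is assumed here rather than taken from the text.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

# Modulated vs unmodulated sound-responsive cells: AC 10/159, pStr 19/138.
p_areas = fisher_exact_two_sided(10, 159 - 10, 19, 138 - 19)
```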
We next tested whether spontaneous activity in the pre-sound period conveyed information about reward expectation in a given block of trials. We examined the activity of sound-responsive neurons while the animals waited for sound presentation in the center port, and found that only 5.7% (9/159) of cells in the AC and 8.7% (12/138) of cells in pStr changed their spontaneous firing according to the reward contingency. The proportions of modulated cells were not different between the two areas (p = 0.37, Fisher's exact test).
Because the effects of reward expectation may not be limited to sound-responsive neurons, we also evaluated the modulation of activity during sound presentation for neurons that did not change firing in the presence of sounds. In this case, we found that only 2.4 and 2.9% of cells in AC were modulated by expectation during low-frequency and high-frequency trials, respectively. Similarly, for pStr only 1.7% of cells during high-frequency trials and 5.7% during low-frequency trials were modulated by expectation of reward. These results indicate that modulation of activity by reward expectation during sound presentation is more common in sound-responsive neurons.
Neuronal activity in the auditory corticostriatal pathway reflects choice direction
We observed that neural activity during the movement period after sound presentation (when mice traveled from the center port to the reward port) was different from baseline activity before sound presentation. These effects occurred not only for neurons in the posterior striatum, as previously reported (Guo et al., 2018), but also for auditory cortical neurons. By comparing neural activity during movement toward the reward ports (0–300 ms after center port exit) to baseline activity (0–100 ms before sound onset), we found that 73.0% (295/404) of cells in the AC and 76.6% (239/312) of cells in pStr showed changes in activity during this period (p ≤ 0.025, Wilcoxon rank sum test with Bonferroni correction for multiple comparisons to two possible movement directions). Among these cells, 79.3% in AC and 71.5% in pStr showed an increase in firing rate during movement relative to baseline.
In addition to changes from baseline activity, the activity of cortical and striatal neurons was dependent on the animals' choice (Fig. 5A,B). We quantified the difference in firing rate during movement toward the port contralateral versus ipsilateral to the recording site in a 300 ms window, excluding trials in which animals reached the reward port in <300 ms (6.8 ± 5.0% of trials excluded across all mice). We found that 41.8% (169/404) of neurons in AC and 45.8% (143/312) of neurons in pStr were selective to choice direction (Fig. 5C,D). In both regions, neural activity during movement contralateral to the recording site was larger than activity during ipsilateral movement (p = 0.007 for AC, p ≤ 0.001 for pStr; Wilcoxon signed rank test).
The population of cells that displayed choice selectivity largely overlapped with those that responded to acoustic stimuli in the task. For choice-selective cells in AC, 58.6% (99/169) were also responsive to sounds in the task. This was the case for 54.6% (78/143) of choice-selective cells in pStr. For sound-responsive cells in the AC and pStr, 62.3% and 56.5% were also selective for choice direction, respectively. We then tested whether the observed differences in neural activity between leftward and rightward trials could be explained by residual responses to different sounds as the animal exited the center port. For each choice-selective cell, we computed the difference in firing rate between trials with the same sound but different choices (correct vs incorrect trials for a given sound; Fig. 5A,B, bottom), and compared it to the difference in firing rate for trials with different sounds but the same choice (correct trials for one sound vs incorrect trials for the other sound). Excluding cells that showed a larger difference in firing rate when the sound differed yielded 141 (83% of 169) choice-selective cells in the AC and 110 (77% of 143) in the pStr, suggesting that choice selectivity during movement in the majority of cells cannot be explained by sound features alone. Subsequent analysis focused on this population of choice-selective cells whose activity encoded choice direction independent of sound.
To test whether selectivity to choice direction occurred at different time points for cortical versus striatal neurons, we sorted choice-selective cells based on the 10 ms time window in which they displayed their maximal choice selectivity (Fig. 6). The time of maximal selectivity to choice direction spanned the whole movement period in both brain regions and was not different between the two regions (p = 0.12, Wilcoxon rank sum test). Consistent with the results in Figure 5, C and D, this analysis showed that a larger proportion of choice-selective cells in pStr (70%, 77/110) compared with AC (49%, 69/141) had higher firing rates when animals were moving toward the direction contralateral to the recording site (p = 0.0008, Fisher's exact test).
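Finding the window of maximal selectivity can be sketched as follows, given mean firing rates per 10 ms bin for each choice direction. The sketch assumes that "maximal selectivity" means the largest absolute value of the (C − I)/(C + I) index; this definition and the function name are illustrative assumptions.

```python
def time_of_max_selectivity(contra_by_bin, ipsi_by_bin, bin_ms=10):
    """Start time (ms from movement onset) of the bin with the largest |index|.

    contra_by_bin, ipsi_by_bin : mean firing rate in each consecutive time bin
    for contralateral and ipsilateral choices. Bins where both rates are zero
    are skipped. Returns None if no bin has any spikes.
    """
    best_bin, best_value = None, -1.0
    for k, (c, i) in enumerate(zip(contra_by_bin, ipsi_by_bin)):
        if c + i == 0:
            continue
        value = abs(c - i) / (c + i)
        if value > best_value:
            best_bin, best_value = k, value
    return None if best_bin is None else best_bin * bin_ms

# Toy cell whose selectivity peaks in the third bin (20-30 ms).
peak_time = time_of_max_selectivity([5.0, 5.0, 9.0, 5.0], [5.0, 5.0, 1.0, 5.0])
```

Cells can then be sorted by this peak time to produce a display like the one described for Figure 6.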
Together, these results indicate that a large fraction of neurons along the auditory corticostriatal pathway are selective not only to sound frequency, but also to choice direction. We then asked whether choice-selective activity in these neurons reflected reward expectation.
Reward expectation modulates choice-selective neural activity in the auditory corticostriatal pathway
To evaluate whether information about reward expectation was integrated into choice-selective activity, we compared the firing rate of recorded neurons during the movement period (0–300 ms after center port exit) when the animal expected to get a large reward versus a small reward on a given side port. For this analysis we used the choice direction that elicited a stronger response in each cell, and excluded trials with movement times <300 ms.
We found that 17% (24/141) of choice-selective cells in AC and 21% (23/110) of neurons in pStr showed different levels of activity depending on the size of expected reward, when the mouse was moving toward a given port (Fig. 7A,B). The magnitude of reward expectation effects on firing rates during movement (Fig. 7E,F) was not different between neurons from AC and those from pStr (p = 0.40, Wilcoxon rank sum test). Moreover, modulation of activity appeared as either an increase or a decrease in firing for larger expected rewards without a consistent trend (p = 0.85 for AC, p = 0.81 for pStr; Wilcoxon signed rank test). For comparison, applying the same analysis to data in which the trials have been randomly shuffled yielded an average of only 3.4 ± 1.6% (mean ± SD) modulated cells in AC and 3.9 ± 1.9% modulated cells in pStr (using 100 random shuffles for each cell).
Because the effects of reward expectation may not be limited to choice-selective neurons, we also evaluated the modulation of activity during movement for neurons that did not show differences in activity between choices. In this case, we found that only 6.4 and 5.1% of cells in AC were modulated by expectation during trials with leftward and rightward choices respectively. Similarly, for pStr only 8.9% of cells during leftward trials and 7.1% during rightward trials were modulated by expectation of reward. These results indicate that modulation of activity by reward expectation during movement is more common in choice-selective neurons.
We then evaluated whether modulation of activity by reward expectation occurred at different time points within a trial for cortical versus striatal neurons. Similar to the analysis for choice selectivity, we evaluated modulation by reward in 10 ms time windows, and sorted choice-selective cells based on the time window in which they displayed their maximal modulation (Fig. 8). Although the time of maximal reward modulation spanned the whole movement period in both brain regions, cells in pStr showed maximal modulation at later times than AC cells (AC median time: 70 ms from movement onset, pStr median time: 115 ms; p = 0.001, Wilcoxon rank sum test). Together, these results indicate that reward expectation modulates the activity of choice-selective cells in both the AC and the posterior striatum, but may do so at different time periods during the task for each brain area.
Next, we estimated how long it took for neurons in each area to change their activity after a change in reward contingency and compared this to the corresponding changes in behavior. We found that, on average, animals changed their behavior in <10 trials after a block switch (Fig. 9A,C). Consistent with these changes in behavior, choice-selective neurons that were modulated by reward expectation also changed their firing rapidly after a block switch, reaching their asymptotic level of firing in 10 trials or fewer (Fig. 9B,D). Because of the limited number of trials in these transitions and the variability in firing, the current data are not sufficient to reliably estimate small differences in dynamics across brain areas. However, our data clearly indicate that neurons in both AC and posterior striatum can rapidly change their firing after animals have been exposed to a new reward contingency.
Last, we tested whether the magnitude of activity modulation by reward at one time period during the trial was correlated to the modulation at another period, or correlated with the magnitude of choice selectivity for each cell. This analysis included cells that did not change firing after the presentation of sounds and those that were not selective to choice. A quantification of the Spearman correlation across all estimated modulation indices (Fig. 10) indicated that the magnitude of choice selectivity was not correlated with the magnitude of modulation by reward during any period within a trial. For neurons in both areas, we found a small correlation between the modulation observed during the stimulus period and the modulation before stimulus presentation (Spearman r = 0.24, p = 0.006 for AC and r = 0.18, p = 0.016 for pStr), suggesting that changes in spontaneous activity were reflected in changes during evoked activity. Moreover, there were two cases in which correlations were present for the posterior striatum but not the AC. In pStr, the modulation by reward during movement was related to the modulation during sound presentation (Spearman r = 0.29, p < 0.001) and to the modulation of spontaneous activity (Spearman r = 0.2, p = 0.007). These correlations were much lower for AC neurons (r = 0.01 and r = 0.06, respectively). For comparison, correlations estimated from shuffled data (10 repetitions) yielded average values <0.02.
Discussion
Neurons in the auditory corticostriatal pathway play an important role in the acquisition of sound-action associations and the execution of sound-driven decisions (Znamenskiy and Zador, 2013; Xiong et al., 2015; Guo et al., 2018). A key step toward uncovering the mechanisms implemented by these neuronal circuits is to identify which task-related components are integrated at each stage of the pathway to drive appropriate behavior. Our data indicate that auditory cortical neurons display many of the physiological responses observed in their striatal targets during a two-alternative sound-discrimination task, with neurons in both regions being influenced by stimulus identity, behavioral choice, and expected reward. Specifically, >40% of auditory cortical cells were selective to the animal's choice during the task, and approximately one-fifth of these neurons were affected by the amount of expected reward. These results depict both AC and posterior striatum as potential loci of integration of sound, choice, and reward information, a key mechanism for adaptive auditory behaviors.
Movement-related activity in the auditory corticostriatal pathway
Changes in the activity of auditory cortical neurons driven by motor-related signals have been previously reported in several species (Schneider and Mooney, 2015) including rodents (Nelson et al., 2013; Rummell et al., 2016), nonhuman primates (Müller-Preuss and Ploog, 1981; Eliades and Wang, 2008; Yin et al., 2008; Niwa et al., 2012), and humans (Curio et al., 1998; Flinker et al., 2010). Studies in mice showed that spontaneous and tone-evoked activity of auditory cortical excitatory neurons was suppressed before and during locomotion (Schneider et al., 2014), and that this suppression was apparent in cortical layer 2/3 but not in layer 4 of primary AC (Zhou et al., 2014). These effects potentially serve as a mechanism for suppressing self-generated sounds, and have been accounted for by a corollary discharge from motor cortex, which conveys copies of motor command signals to the AC (Nelson et al., 2013; Zhou et al., 2014). In contrast to this suppression in activity, our study shows that, during a freely-moving two-alternative choice task, the activity of a large fraction of auditory cells increases during movement compared with a quiet waiting period, an effect that occurred in both AC and its striatal target. These changes in neural activity with movement may therefore reflect decision-related signals that are potentially useful for adapting behavior, rather than simply suppressing self-generated sounds.
Beyond the observed changes in neural activity during movement, we found that movement-related activity was selective to choice direction in >40% of auditory cortical cells during the two-alternative choice task. This fraction was comparable to that found in the posterior striatum in this study and in previous reports (Guo et al., 2018). Data from the current study do not distinguish between activity related to a specific movement (independent of the task) and activity that depends on a perceptual choice (which could still be present even if downstream circuits produce a different action). As such, the data do not rule out the possibility that the selectivity observed results from a corollary discharge from motor regions (Nelson et al., 2013). Importantly, our analysis comparing trials with different sounds but the same action showed that the neurons' selectivity to choice direction cannot be accounted for by differences in the stimulus identity. Further evidence for the motor nature of these effects comes from previous studies in which the action associated with a sound changed from one block of trials to the next. In these reports, choice-selective activity was observed in both rat AC (Jaramillo et al., 2014, their Fig. 6C) and mouse posterior striatum (Guo et al., 2018, their Fig. 7d). It remains unknown to what extent these effects emerge in the cortex, or are inherited from earlier stages of the auditory system.
The population of recorded neurons in both AC and pStr displayed diverse patterns in the dynamics of choice-selective activity observed during movement toward the reward port. For some cells, activity was sustained throughout the movement period (Fig. 5B), whereas for other cells there were clear transient periods of stronger activity (Fig. 7B). This observation is consistent with the sequential activity observed in other regions of the dorsal striatum during the delay period of memory tasks (Akhlaghpour et al., 2016) or during interval timing tasks (Mello et al., 2015), which suggests a role for this activity in time keeping. However, the observation that the activity of the neurons we recorded depends on the animals' choice suggests that these neurons represent variables beyond a global timing signal.
Although the exact source of choice-selective activity in the auditory system has not been determined, a possibility beyond a motor corollary discharge is that signals from proprioception or somatosensation (which may differ according to the animals' choice) reach the auditory system. Because fully removing such stimuli during a behavioral task would be extremely challenging, further studies manipulating neural pathways between these sensory systems may be required to address this possibility. Nevertheless, the fact that reward expectation can modulate choice-selective activity in the auditory system suggests that, independent of their source, these signals may play a role in adaptive behavior.
Representation of expected reward in the auditory corticostriatal pathway
Pairing acoustic stimuli with positive or negative reinforcers often results in long-lasting changes in cortical responses to sounds (Weinberger, 2007). Moreover, associating sounds with either rewards or punishments results in a modulation of evoked responses by task engagement in opposite directions depending on the type of expected outcome (David et al., 2012). In primates, when the amount of reward is varied according to the outcome of previous trials, the activity of auditory cortical neurons reflects the expected reward size on subsequent trials (Brosch et al., 2011). This evidence from animal studies, together with observations in humans (Puschmann et al., 2013; Weis et al., 2013), supports the idea that reward expectation influences the activity of auditory cortical neurons.
In the present study, the use of a two-alternative choice task with varying amounts of reward enabled a comparison of neural responses to the same sound-action association for different amounts of expected reward. Our results go beyond previous observations by showing that reward expectation can modulate not only sound-evoked responses of cortical neurons, but also choice-selective neural activity while animals move toward a reward port. As demonstrated by analysis of reaction times and movement times during the task (Fig. 2), these physiological effects of reward expectation cannot be explained simply by changes in the speed of movement of the animals. Moreover, we found that reward expectation modulates the sound-evoked responses of a larger fraction of striatal neurons than cortical neurons. This phenomenon could be the result of specific connectivity patterns between cortical and striatal neurons, or the existence of multiple loci of integration of reward information along the corticostriatal pathway.
A large body of literature has demonstrated that regions of the ventral and dorsomedial striatum encode the value of actions and sensory stimuli (Lauwereyns et al., 2002; Cromwell and Schultz, 2003; Samejima et al., 2005; Lau and Glimcher, 2008; Tai et al., 2012; Wang et al., 2013). Much less is known about the influence of reward information in the posterior tail of the striatum, the main striatal target of the AC. Similar to anterior regions of the dorsal striatum, the posterior tail receives extensive dopamine innervation that could provide additional outcome-related signals to auditory responsive neurons in this area. A recent study, however, found that activation of dopamine axons in this region of the striatum elicits avoidance behavior, inconsistent with the idea that these signals provide information directly related to reward (Menegas et al., 2018). The effects of dopamine inputs on auditory neurons in the posterior striatum remain an open question.
Integration of task-related features in the auditory corticostriatal pathway
Successful establishment and maintenance of sound-driven behavioral responses require that neural circuits integrate information about stimulus features, actions, and outcomes. This combination of signals allows a circuit to associate appropriate actions with each sound. Our results indicate that a representation of these variables is present as early as sensory cortex, and raise the possibility that some of this information in cortex is inherited from structures earlier in the ascending sensory pathway such as the thalamus.
As a link between the AC and the basal ganglia, the posterior tail of the striatum is ideally located to play a key role in sound-driven decisions (Guo et al., 2018). Understanding whether the representations of choice and expected reward differ between the AC and the posterior striatum helps delineate where these variables are integrated along the auditory sensorimotor pathway. Our results challenge a view in which the AC provides only sensory information to the striatum, and suggest that many of the response properties of posterior striatal cells can be explained by signals inherited from cortex, especially in light of the fact that there is no direct feedback from the posterior striatum to the AC. Our study provides the basis for future investigations of properties that uniquely emerge in the posterior striatum to help an organism make adaptive auditory decisions in a changing world.
Footnotes
This work was supported by the National Institute on Deafness and Other Communication Disorders (R01DC015531), and the Office of the Vice President for Research and Innovation at the University of Oregon. We thank members of the Jaramillo laboratory for discussion and comments on the paper.
The authors declare no competing financial interests.
Correspondence should be addressed to Santiago Jaramillo at sjara{at}uoregon.edu