Abstract
Little is known about whether information transfer at primary sensory thalamic nuclei is modified by behavioral context. Here we studied the influence of previous decisions/rewards on current choices and preceding spike responses of ventroposterior medial thalamus (VPm; the primary sensory thalamus in the rat whisker-related tactile system). We trained head-fixed rats to detect a ramp-like deflection of one whisker interspersed within ongoing white noise stimulation. Using generative modeling of behavior, we identify two task-related variables that are predictive of actual decisions. The first reflects task engagement on a local scale (“trial history”: defined as the decisions and outcomes of a small number of past trials), whereas the other captures behavioral dynamics on a global scale (“satiation”: slow dynamics of the response pattern along an entire session). Although satiation brought about a slow drift from Go to NoGo decisions during the session, trial history was related to local (trial-by-trial) patterning of Go and NoGo decisions. A second model that related the same predictors first to VPm spike responses, and from there to decisions, indicated that spiking, in contrast to behavior, is sensitive to trial history but relatively insensitive to satiation. Trial history influences VPm spike rates and regularity such that a history of Go decisions would predict fewer noise-driven spikes (but more regular ones), and more ramp-driven spikes. Neuronal activity in VPm, thus, is sensitive to local behavioral history, and may play an important role in higher-order cognitive signaling.
SIGNIFICANCE STATEMENT It is an important question for perceptual and brain functions to find out whether cognitive signals modulate the sensory signal stream and if so, where in the brain this happens. This study provides evidence that decision and reward history can already be reflected in the ascending sensory pathway, on the level of first-order sensory thalamus. Cognitive signals are relayed very selectively such that only local trial history (spanning a few trials) but not global history (spanning an entire session) are reflected.
Introduction
Primary sensory or “first order” thalamus is thought to be driven mainly by sensory input arriving from an ascending sensory pathway and terminating in the primary sensory cortex (Sherman and Guillery, 1998). However, human pathology (Van der Werf et al., 2000; Schmahmann, 2003) and experimental data (Steriade and Llinás, 1988; Saalmann and Kastner, 2011) have provided accumulating evidence that the thalamus, including first-order nuclei, assumes a role in processing behavioral states like sleep and wakefulness as well as higher cognitive functions. Among extra-sensory signals, movement-related information (Lee and Malpeli, 1998; Reppas et al., 2002; Lee et al., 2008) as well as attentional processes (O'Connor et al., 2002; McAlonan et al., 2008; Halassa et al., 2014) have been shown to be reflected in thalamic spiking. Choice-related activity, another cognitive signal, has long been assumed to be largely confined to neocortical circuits (Nienborg and Cumming, 2006; Gold and Shadlen, 2007). However, massive recurrent connectivity of cortical areas, not only to thalamus but basically to all stations on ascending sensory pathways (Smith et al., 2015), keeps open the possibility that cognitive influence of sensory signal streams can be revealed using either more elaborate behavioral assessment and/or better tools to study neuronal coding. This notion has been strengthened by a recent report demonstrating choice-related modification of stimulus-evoked response in the medial ventroposterior (VPm) thalamus, the tactile first-order stimulus of the whisker-related tactile system in rodents (Yang et al., 2016).
Previous studies of choice-related inputs have focused on the question of whether choice-related signals in sensory processing streams are a product of noise fluctuations large enough to be signaled in a bottom-up fashion and gain perceptual significance (Britten et al., 1996), or whether there exist cognitive signals that stream backward, from cognitive centers like the prefrontal and parietal association cortices to the more specialized sensory processing structures (Nienborg and Cumming, 2009). To date, consideration of choice related influences has been limited to a single behavioral trial. However, we suggest that if top-down cognitive signaling exists, it must also reflect the longer time scales that are generally encompassed by cognitive signals. This idea is further supported by the fact that choice is strongly associated with value and reward, variables that are profoundly involved in learning and memory as well as in moment-to-moment decision making, and certainly recruit memories about behavioral events distributed in extended periods of past time (Herrnstein, 1961; Schultz, 2006). In fact, on the behavioral level there is good evidence that past choices can affect current ones (Corrado et al., 2005; Busse et al., 2011; Stüttgen et al., 2013; Fründ et al., 2014). Such signals have been shown to modify sensorimotor signals in parietal cortex (Sugrue et al., 2005). However, how these cognitive signals interact with subcortical structures remains unclear.
Our aim for the present study was therefore to investigate whether modulatory signals related to past choices and rewards affect encoding subcortically in a first-order thalamic nucleus. Specifically, we examined the impact of prior choices and outcomes on spiking in the rat VPm nucleus, which is the first-order thalamic nucleus in the tactile system that selectively encodes kinematic features of whisker vibrations (Petersen et al., 2008) and projects to primary somatosensory cortex. We used a simple tactile detection task in head-fixed rats that combines punctate whisker-ramp stimuli with ongoing white noise whisker stimulation (Waiblinger et al., 2015b) to enable investigation of neuronal coding properties (i.e., reverse correlation) throughout the full task epoch. We analyzed thalamic neural activity with respect to simple task-related variables on two time scales: the short time scale of choices/rewards from one trial to another (local scale) and the longer periods of task engagement (global scale). Behavioral modeling revealed correlations on the animal's choices and rewards on a local (a few trials) as well as global scale (a session), while thalamic spiking was significantly related only to local trial history.
Materials and Methods
Animals, surgery, and general procedures for behavioral testing.
All experimental and surgical procedures were performed in accordance with standards of the Society of Neuroscience and the German Law for the Protection of Animals. Subjects were three female Sprague-Dawley rats (Charles River Laboratories), aged 12–16 weeks at the time of implantation. The basic procedures of headcap surgery, habituation for head-fixation, and behavioral training exactly followed the ones published in a technical review (Schwarz et al., 2010) and a more recent psychophysical study (Waiblinger et al., 2015b). In the following text, only procedures pertaining to the specific paradigm established here are described in detail.
Oral antibiotics (Baytril, Bayer HealthCare; 2.5% in 100 ml drinking water) were provided for 3 d before surgery and 1 week postoperatively. The animals were anesthetized using ketamine and xylazine (100 and 15 mg/kg body weight, respectively) and chronic electrode arrays (Haiss et al., 2010) were implanted. A craniotomy was performed (2–4 mm caudal to bregma, 1.5–3.5 mm lateral to the midline) to permit access to the right VPm nucleus of thalamus. Barreloids were identified through functional mapping of the VPm using a single extracellular microelectrode. Unit and field potential responses to a brief manual whisker flick were monitored until a site maximally responsive to deflections of a single whisker with lower activation of adjacent whiskers was found. Across the three animals, units in α, A1, A2, C1, C2, D1, and D2 barreloids were recorded. Movable multielectrode arrays (2 × 2 configuration; electrode distance, 250–375 μm) were centered over the mapped location and slowly inserted through the dura at a speed of 1.25 μm/s−1 until all electrodes had reached the VPm (usually at a depth of ∼4000–5000 μm). The electrodes were then slowly retracted to a depth of ∼3500–4000 μm relative to the cortical surface and fixed to the skullcap with dental cement so that the mobility of the array was still guaranteed. The wound was treated with antibiotic ointment and sutured. Analgesia and warmth were provided after surgery. Rats were allowed to recover for at least 10 d before habituation training. Subjects were housed together with a maximum number of four in one group cage and kept under a 12 h inverted light/dark cycle. During behavioral testing, water intake was restricted to the apparatus where animals were given the opportunity to earn water to satiety. Testing was paused and water was available ad libitum during 2 d per week. Bodyweight was monitored daily, and was typically observed to increase during training. No animal in this study needed supplementary water delivery outside training sessions to keep its weight. The first step of behavioral training was systematic habituation to head-fixation lasting for ∼2 weeks. Once rats were trained on the behavioral task, 1–2 experiments were usually conducted per day comprising 100, 150, or 200 trials. During behavioral testing a constant auditory white background noise (70 dB) was produced by an arbitrary waveform generator (W&R Systems) to mask any sound emission of the galvo-motor-based whisker actuator.
Whisker stimulation.
For whisker stimulation a galvo-motor (galvanometer optical scanner model 6210H, Cambridge Technology) as described by Chagas et al. (2013) was used. The stimulator contacted a single whisker on the left side of the rats face at 5 mm (±1 mm tolerance) distance from the skin, and thus, directly engaged the proximal whisker shaft, largely overriding bioelastic whisker properties. The rotating arm of the galvo-motor was arranged such that the mean whisker position during noise stimulation was its resting point. During the main experiment that included the behavioral task, stimulation was always delivered along the rostrocaudal axis.
Voltage commands for the actuator were programmed in MATLAB and Simulink (v2015b, MathWorks). The whisker was deflected by unfrozen Gaussian white noise (sampling rate 20 kHz) that was low-pass filtered using a Butterworth filter (fifth order) with a cutoff frequency of 100 Hz (Chagas et al., 2013). As ascertained in a separate study (Waiblinger et al., 2015b), the amplitude of the noise stimulus (An = 1°, Vn = 350°/s; as determined by An = 2 × SD of the distribution) applied here, is in the perceptible range of the animals. The noise stimulus was presented continuously throughout the session. A stimulus consisted in a single event, a sinusoidal ramp embedded into the ongoing noise (the ramp shape was half period of a 100 Hz sine wave, starting at 1 minimum and ending at the next maximum). The ramp amplitudes used [A = (0, 3, 6, 9, 12)°, or maximal velocities, respectively: V = (0, 942, 1884, 2827, 3769)°/s] were well within the range reported for frictional slips observed in natural whisker movement (Ritt et al., 2008; Wolfe et al., 2008). The embedded ramp was followed by a slow decay back to the resting position during the following 1 s epoch. This slow movement is known to be imperceptible (Stüttgen et al., 2006). To assure a smooth embedding of the ramps, the noise was silenced (multiplied) with an inverted Gaussian (SD = 10 ms; minimum at the peak is 0, approaching 1 at ± infinity) centered at the time of the ramp's maximum velocity. As a result, the fast transition was smooth and largely noise free.
Experimental paradigm.
Before data collection began, all three rats were trained on a Go-NoGo “detection of change” psychophysical task using the same protocol as described before (Waiblinger et al., 2015a,b; Fig. 1A). In this task, the whisker is continuously vibrated (e.g., with background noise, S−). At intervals of 4–10 s (flat probability distribution) the detection target (S+) was presented, which in the present case was a single ramp. A trial was categorized as a “Go” response or a “hit” if the animal generated the indicator response, a lick at a water spout within 1000 ms of the onset of the S+ (ramp stimulus). If no lick was emitted (“NoGo” response) the trial counted as a “miss”. Throughout the report and the figures, we use the symbols h and m as short for hit and miss trials, respectively. Premature licking in a 2 s period before the stimulus was mildly punished by resetting time and starting a new intertrial interval of a 4–10 s duration drawn at random from a flat probability distribution. This measure effectively suppressed licking and associated whisker movements.
During initial training of the task, all rats learned to detect deflections in caudal direction. In the first sessions after concluding the training, the animal's psychometric curve (Fig. 1C) using the method of constant stimuli was measured, which implied the presentation of all stimuli including catch trials. The sequence of stimuli was pseudorandom, i.e., consisted in repeated blocks of stimuli, each containing all stimuli once in shuffled order. Responding to non-tactile cues was checked by sessions with actuators in place and moving, but detached and out of reach of the whiskers. All rats failed to detect the stimuli in these sessions.
For the main behavioral and electrophysiological experiment, a single ramp stimulus was chosen from the suprathreshold sloped portion of the psychometric curve to assure that it is perceptible but engages the rats' tactile system without driving it into saturation (Af = 6° or 9°, Vf = 1884 or 2827°/s). As described in the section whisker stimulation, this feature was embedded into the noise either in caudal or rostral direction.
Generalized linear models.
To model future behavioral and spike responses based on past behavioral responses, we used generalized linear models (GLMs), a class of models that is fit to map several independent variables to one predicted variable using a sequence of a linear regression and a monotonic (nonlinear) response function (McCullagh and Nelder, 1989). After fitting to the data, the models were used in two principal ways, as “predictive” and “generative” models: the first calculates a prediction based on the animals behavioral performance (Busse et al., 2011). The second validates the model by generating de novo responses based on the model's own trial history (Corrado et al., 2005; Stüttgen et al., 2013).
The linear regression incorporated three independent variables describing the predisposition of the animal (and its neurons) to respond in the next trial. The first is a constant term (weight b0) describing general biases including the tendency of the animal to respond to the constant stimulus (ignoring the switch in direction at the middle of a session, see data analysis). The second term (weight bS) describes a global trend of slow reduction of response probability during the run time of a session. This tendency is driven mainly by the increasing satiation of the animal and finally leads to task withdrawal when full satiation is reached. It is modeled by the sum of past rewards gained (in that session) normalized to the total number of session trials. The third term represents local trial history, encompassing only the decisions (and rewards) of the last three trials (weight bH). The triplet of history trials is interpreted as a binary number (mmm = 0, hmm = 1, …, hhh = 7) giving the most remote trial the weight of 1, the middle trial a weight of 2, and the current trial a weight of 4. The full regression equation, thus, had the following form: with [t3, t4, … tn−1, tn], indexing the series of behavioral outcomes (x = 0 if m, and x = 1 if h) in the current (tn) and past trials (tn−v with ν∈[1, 2]). The prediction z(tn) for the current trial is converted into the output of the GLM (μ) by a monotonic response function m of the form μ=m(z). For behavioral decisions, we assumed a Bernoulli distribution of binary Go/NoGo responses and used the logistic function as the response function (i.e., the GLM became a logistic regression): the output of which can be interpreted as the probability of a Go response depending on the decision quantity z, favoring Go decisions when positive and NoGo decisions when negative. For spike counts we assumed the Poisson distribution and used the following response function: The spike count prediction μSPK was then the input for a second GLM using the regression equation with bias term b0,S The prediction z2(tn) in turn was plucked into the logistic response function (Eq. 2), which estimated the probability of the binary decisions based on the predicted spikes. A graphical representation of the models is given in Figures 2B and 6C.
The GLMs were either fitted using all three linear terms (“full” model) or with a subset of them (“reduced” or “nested” models), using glmfit.m (MATLAB v2015b, MathWorks). This function also returns the model's deviance from a saturated model (the one with maximal number of parameters). Nested models were compared using the deviance test, which estimates the significance of model improvement as follows: where p is the significance value, chi2cdf is the cumulative χ2 distribution. D2 is the deviance of the extended model and D1 the deviance of the reduced model, V is the degrees of freedom calculated as number of terms set to 0 in the reduced model.
The model's prediction in one session was determined by sorting the trials and respective probabilities p (calculated using Eqs. 1 and 2) according to Go and NoGo responses. Accuracy a of the model to match the responses of the animal was then obtained as follows: where i and j are numbers of Go and NoGo trials, respectively. To validate the fitted models, generative runs were performed. To do that the first three trials were set identical to the ones obtained from the rats. All remaining trials were then filled by the model generating its own satiation curve and trial history. This was done 1000 times for each session. From the generated decision series we extracted the number of hits as well as the number of changeovers generated during the session and compared it with the performance of the rat in the session that was used to fit the model.
Electrophysiology.
The movable multielectrode arrays used here are described by Haiss et al. (2010). Extracellular voltage traces recorded by the electrodes were bandpass filtered (200–5000 Hz) and recorded at a sampling frequency of 20 kHz using a multichannel extracellular amplifier (Multi Channel Systems). Spike sorting was achieved by a custom written software package in the MATLAB environment. Briefly, a simple amplitude threshold was applied to the raw voltage trace. Depending on whether the threshold was positive or negative, local maxima (peaks) or minima (troughs) that exceeded this threshold were identified. A 1.1 ms window was then applied around each of these peaks or troughs (300 μs before and 800 μs after) to extract putative neural event waveforms. All such neural event waveforms were subjected to a principal component analysis and the first three principal components were used to generate scatter plots. Single units (SU) were then manually defined by selecting clusters of neural events that were clearly separated from the noise cluster in PC space. This was done in a time resolved fashion across the entire recording. Multiunit (MU) activity was conservatively defined by waveforms that exceeded the amplitude threshold but could not be separated from the noise cluster in PC space. On one electrode typically one, maximally two units were recorded. Timestamps and waveforms of units were saved and used for later analysis (see example recording in Fig. 1D).
Data analysis.
Psychophysical data were assessed as response-probabilities from single sessions for each animal (100–200 trials). The psychometric curves in this study are logistic fits estimated from a maximum likelihood estimator (Wichmann and Hill, 2001a,b). The same toolbox was used to evaluate the goodness of fit and obtaining confidence intervals for perceptual thresholds and slopes.
It is important to note that all spiking recorded in this study was evoked, either by white noise or ramp-like whisker deflection. We therefore refer to “noise-driven” versus “ramp-driven” spiking throughout the text. For large parts of the study, we focused on noise-driven spiking in an interval of 2 s before ramp onset; data shown in Figures 1D, 3A, 4, and 5C,D, contain ramp-evoked spiking. The 2 s prestimulus period was, by design, free of licks (as licks occurring 2 s before scheduled ramps led to punitive time outs; see experimental paradigm). To identify and remove spike trains eventually contaminated by whisking expressed as ∼10 Hz rhythmic spiking activity (Moore et al., 2015; Urbain et al., 2015), we calculated the autocorrelation function on the multiunit spike time-stamps in a 1 s window before feature presentation on a trial by trial basis. The autocorrelograms were scaled as correlation coefficients, i.e., by definition the zero-lag bin held a value of 1. Suspected whisking trials were classified using thresholds on metrics of autocorrelogram periodicity, the amplitude modulation of the periodicity (the mean peak-to-peak amplitude of the oscillations), and the correlation to a 10 Hz sine wave to mimic a relevant whisking frequency. The classification algorithm was validated across simultaneously recorded multiunit recordings to minimize mismatches in the classification of suspected whisking activity on any given simultaneously recorded trial (96.6% performance). Spikes and stimuli contained in suspected whisking trials were not used for analysis of neuronal coding.
To investigate neuronal activity as a function of task performance, we separated trials according to the animals' behavioral choice (single trial: h and m trials) and choice history [doublet-trial sequences: monitoring decisions and outcomes in one past trial (hh, mh, hm, mm) or triplet-trial sequences: monitoring decisions in the two most recent trials (hhh, mhh, hmh, mmh, hhm, mhm, hmm, mmm)]. We ruled out a systematic difference of noise stimuli between the trial classes by fitting a Gaussian to the amplitude distribution. The goodness of fits was r2 > 0.99 for all cases.
We used receiver operating characteristic analysis to calculate the area under the curve (AUC), a nonparametric effect size to quantify the difference of neuronal activity under different experimental conditions (Britten et al., 1992). AUC is the probability of correct classification of a binary classifier (using varying thresholds to strip off the observer bias) confronted with the Gaussian noise stimulus in two experimental conditions (Green and Swets, 1966). For this purpose a window in which spikes were counted was moved in 1 ms steps across spike trains obtained in different contexts (e.g., before h vs m trials). The window was 50 ms long for prestimulus spiking (see Fig. 4A) and 5 ms for stimulus-evoked spiking (see Fig. 4B). A prediction interval of the population response was assessed by a resampling technique. Random picks from observed spike counts (across trials) were performed for each window and the AUC values across time were calculated 1000 times. The 2.5 and 97.5 percentile of this boot-strapped ensemble of AUC values across trial time served as prediction limits (see Fig. 4B).
To test whether spike patterning changes with different contexts we calculated several measures from the spike trains observed in the interval 2 s before S+ onset (see Fig. 7). First, the Fano factor of spike counts: was assessed, followed by coefficient of variation of inter spike intervals: where μc, μi, σc, σi are mean and SD of spike counts and interspike intervals, respectively. Normalized (scaled in correlation coefficients) and spike rate-corrected autocorrelograms averaged across units for each class of trials were computed from binary vectors (time bin 1 ms) that held zeros for no spike and ones for spikes. The effect of spike rate on correlation was corrected for by subtracting the mean of 1000 resampled autocorrelograms based on permutated input vectors. The observed and resampled autocorrelograms were all corrected for border artifacts using a triangular function as done by Kohn and Smith (2005, their Eq. 5).
Stimulus-dependent response plasticity was studied by dividing the recording session into two equal parts where the first half typically contained between 50 and 100 consecutive trials with target features (ramps) of one direction (e.g., caudal) and exactly the same amount of trials in the second half with embedded features of the opposite direction (e.g., rostral). The order of directions was reversed after each recording session (i.e., if caudal was first in Session 1, rostral was first in Session 2, etc.). Subjects were rewarded for detecting every deflection that exceeded the noise independent of its direction.
Coding properties of VPm neurons were extracted from spike-triggered stimulus ensembles as done before for primary afferents (Chagas et al., 2013). The Gaussian stimulus was differentiated to yield Gaussian distributions of position (pos), velocity (vel), and acceleration (acc). For each recorded VPm unit, three-dimensional spike-triggered distributions (pos,vel,acc) at a single fixed delay were constructed. An example of such a spike-triggered ensemble is shown in Figure 3B. The time between spike and sampled kinematic triple was varied between 0 and 20 ms (in steps of 1 ms) and the optimal delay was determined by maximizing the information that can be gained about the stimulus by observing or not observing a spike. This information is given by the Kullback–Leibler divergence D (in bits) calculated as follows: where x is the instantaneous stimulus ensemble at one delay, s the binary value signifying the presence/absence of a spike, and p a probability distribution (Chagas et al., 2013).
The mean preferred stimulus was calculated as the mean of the vectors pointing to the kinematic triple (pos, vel, acc) obtained with each spike using the optimal latency. The length of the mean vector gave an estimate on how much the preferred stimulus differed from the stimulus mean [which by design was (0, 0, 0); Fig. 3D]. Changes in preferred stimulus were tracked by calculating the difference vectors for preferred stimuli obtained in the trials with caudal versus rostral ramps (Fig. 3C).
Overview of experiment designs and statistics used in the study.
The fit of behavioral models was assessed using the deviance measure. Statistical significance was estimated using the deviance test as described under generalized linear models (Fig. 2C). The sign rank test was used to test differences found with mean vectors and information rates in spike-triggered stimulus ensembles (paired samples: n = 50 spike trains, α error level 0.01; Fig. 3D,E). Differences in spike rate depending on trial history was expressed as AUC (see data analysis). The significance was estimated by comparing the 95% prediction interval assessed AUCs obtained from n = 50 spike trains against random performance (AUC = 0.5; Fig. 4).
Results
We trained three head-fixed rats using a tactile detection-of-change paradigm (Waiblinger et al., 2015a; Fig. 1A). The task was to detect ramp-like deflections of a single whisker embedded in low-pass filtered Gaussian white noise (edge frequency 100 Hz; 2 × SD = 1°; Fig. 1B). In this simple behavioral paradigm, the animals' decisions and outcomes are conflated, as a Go decision would always result in obtaining a reward (h) and a NoGo decision would result in not obtaining a reward (m). We therefore interchangeably use Go/NoGo designations for the animals' decisions and outcome designations h versus m to label trials in the remainder of the text. By the same token, decision and reward history cannot be disentangled; to acknowledge this we will use the neutral term “trial history”.
The amplitude used for the background noise stimulus is known to readily engage the tactile system and to be detectable by rats (Waiblinger et al., 2015b). Thus, the background noise (and the neuronal responses it evoked) could be used to probe the neuronal coding of VPm neurons at any point during the behavioral session. Introducing changes to the stimuli, we were able to determine whether changes in coding properties were of sensory or nonsensory nature. The characteristics of the ramp-like deflection were determined in a preliminary experiment, conducted with every rat, in which psychometric detection curves were measured presenting one catch and four different stimulus amplitudes [A = (0, 3, 6, 9, 12)°, V = (0, 942, 1884, 2827, 3769)°/s] in pseudorandom fashion. The resulting psychometric curve for each individual rat (Fig. 1C) was used to select a near threshold stimulus, which was going to be presented in the subsequent main experiments. This procedure ascertained first, that the stimuli would be just perceptible, thus calling for the animal's attention to perform well on the task. Second, saturation of ramp-evoked neuronal responses was avoided, keeping them in a physiological working range. The threshold position and velocity amplitudes obtained for the three rats were A = (5.8 6.0 6.2)°, and V = (1824, 1884, 1947)°/s. The stimulus parameters selected for the main experiment were A = 6° (V = 1884 °/s) for the first two rats and A = 9° (V = 2827 °/s) for the third (accounting for this rats' overall decreased performance). It is noteworthy that these threshold amplitudes, measured with stimuli embedded in noise, would be considered suprathreshold in a noise-free environment (Stüttgen et al., 2006). These increased thresholds confirm the notion that detection performance is diminished by preadaptation (Ollerenshaw et al., 2014). The main body of data presented here was obtained in 32 behavioral sessions across three rats (rat1/2/3: 22/3/7 sessions), in which we presented the near-threshold ramps and simultaneously recorded from VPm barreloid single and multiunits (rat 1/2/3, 37/4/9 units; Fig. 1D).
Determinants of decision series within one session
Behavioral sessions were of three predetermined lengths, either 100, 150, or 200 trials long. Figure 2A shows the raw behavioral data obtained from three rats. Trials in which the animal successfully detected the stimulus (hits) are marked in black and those in which no indicator lick occurred after stimulus presentation (misses) are marked in white. As stimulus amplitude was chosen close to perceptual threshold, the task was nontrivial for the rat, reflected by the low response rate (hit trials divided by total number of trials); on average 0.53 (SD 0.14). Interestingly, although generating a near-threshold average response probability, this particular arrangement prompted considerable variation in overall response patterns from all three animals, which can be expressed as the “changeover rate”, i.e., the number of trial-to-trial changes (either from hit to miss or reverse) divided by the total number of trials. Across all sessions the rats showed a mean changeover rate of 0.27 (SD 0.10). Figure 2A sorts all sessions in the sample from lowest changeover rate of 6% to the highest of 50% (these extreme rates translate to one observed change within as many as 17 trials, down to every second trial on average). Some sessions basically consisted in of what could be called an “engaged” and a “lazy” period (Fig. 2A, Session 1), whereas others were characterized by more constant engagement with hit and miss trials thoroughly interspersed (Fig. 2A, Session 32).
Before trying to elaborate whether primary sensory thalamus codes for these nonsensory behavioral aspects, we wanted to quantitatively and qualitatively capture the response pattern of the animals by systematically comparing nested generative behavioral models on the basis of logistic regressions (Corrado et al., 2005; Busse et al., 2011). As detailed in the methods section, the regression equation of the full model included three terms (Eq. 1; Fig. 2B): the first term (b0) was a “bias” term that represented the presence of the stimulus and other nonvarying biases the animal may have had. “Satiation”, the second term (weight bS), was calculated as the cumulative sum of the reward series. Satiation may actually represent several correlated variables (e.g., satiation, motivation, general attention, vigilance, etc.) that we simply subsume in this report under the label of satiation for ease of communicating. The main point in this study was not to differentiate between these possible variables, but to define a nonsensory variable that varies “globally”, across the time scale of a session. In this sense, the group designation of satiation seems reasonable as it is the kind of process that is related closest to the sequence of decisions because every hit trial, with another drop of water consumed, will directly increase its value. The third regressor, called trial history (weight bH) is extracted as well from the sequence of decisions, but is very different from satiation with respect to its reference point and dynamic time scale: differing from satiation, trial history is very local, in our current definition consisting exclusively in triplet of trial outcomes representing the current and two preceding trials (t, t−1, t−2, with t being the current trial). As regressors bS and bH were both extracted from the decision series, they showed an overall positive correlation (correlation coefficient r = 0.4), but, nevertheless, varied greatly from session to session (−0.41 < r < 0.75). The distribution of the correlation of the three history terms with satiation was almost identical, indicating that the correlation (if it existed) was due to longer strips of nonchanging behavior, presumably at the beginning (predominance of hits) and at the end of the session (predominance of misses). As will be seen despite the correlation in some sessions, inclusion of the local term bH improved the performance of the behavioral model quantitatively and qualitatively over those only containing the global variable.
After fitting the models, we used them in two different modes to generate model decisions (Corrado et al., 2005). The first mode is called predictive. Predictive models use regressors that are calculated from the sequence of previous decisions observed in the rat experiment (Fig. 2C). The second mode is called generative. Generative models take the rat data out of the loop; except from the very first three trials to initialize the process. From there on the model generates its output based solely on the model-generated previous sequence of decisions (Fig. 2D). Importantly, the generative modeling was aimed at explaining the observed variety of changeover rates, a property not contained explicitly among the regressors of the model.
Figure 2C displays the predictive performance of the full and two reduced models: the first reduced model was devoid of contributions from recent decisions (bias and satiation), i.e., it lacked the bH term), whereas the second reduced model additionally eliminated the satiation term bS from the equation (bias). Comparing each model pairwise with the next extended one, we found that each extension of the model significantly improved the fit [deviances: Dbias = 196; Dbias & satiation = 158, Dfull = 136; deviance tests (Eq. 5): Dfull vs Dbias/satiation and Dbias & satiation vs Dbias, both p ≪ 0.01; Fig. 2C). As expected, all models predicted hit rates very well; the overall hit rate was in all cases included in the 95% prediction intervals estimated from 1000 model runs. It is clear however, that the response patterns were not recovered by the reduced models. The “bias model” generated fast changing response trains, which throughout all 32 sessions generated large numbers of changeovers (this result on first glance might feel counter-intuitive as the predictions shown in C look rather constant within individual sessions; however, drawing from binomial distributions with probabilities close to 0.5 will generate quite some random fluctuations in the binary outcomes). Also the typical decay of responding seen in the end of the sessions due to satiation was not recovered. Adding satiation to the list of regressors the “bias and satiation model” reflected the response decays but failed to represent the exact temporal position of finely resolved response patterns. Only the full model, including the trial history terms, was able to fully recover these fine grained response patterns. This gradient in the quality of model performances is captured by the median accuracy (Eq. 6) of prediction obtained with each model [bias: median accuracy = 0.52 (the 95% prediction interval straddled random accuracy of 0.5 in 21 of 32 sessions): bias and saturation: median accuracy = 0.64 (10 of 32 prediction intervals straddled random performance); full: median accuracy = 0.69 (3 of 32 prediction intervals straddled random performance)]. In summary, these results support the idea that response patterns on the scale of the session are predicted, as expected, by satiation of the animal, although more local patterning on the level of a few trials is predicted by trial history. To arrive at an estimate of the impact of past trials on the actual decision, we additionally formulated a model with separate terms for each trial in the triplet of recent trials. Therefore this model had five terms, in addition to the bias and satiation weights b0, bS, we fitted the weights b1, b2, b3 for each recent trial in the triplet (deviance = 126; data not shown). The distribution of coefficients fitted for the three history terms across sessions revealed that the most proximate one (b1) robustly attained positive coefficients [median/interquartile range (IQR): 0.91/1.00], the second (b2) was already considerably less effective (0.30/0.67), and the effect of the most distant one b3 was small. In fact, the distribution of b3 was wide, consistently encompassing zero (no effect; 0.17/0.95). We, therefore, conclude that trial history affecting behavior is rather short lived and does not extend beyond 2–3 trials.
Finally, we wished to validate the modeling predictions by having the model simulate the animals' performance using its own model-generated decision series (Corrado et al., 2005). We asked whether an aspect of the decision series, the number of changeovers, that was not explicitly used to fit the models could be recreated as well. As expected already from the prediction results, the bias model was not able to correctly recreate the changeover rate produced by the rats (Fig. 2D, right; observed changeover rates fell outside the 95% prediction interval generated by 1000 model runs in 25 of 32 sessions). The extended model bias and satiation did substantially better (Fig. 2D, middle; failing in 10 of 32 sessions). By far the best performance, however, was achieved by the full model incorporating also the trial history; it did not fail in any of the 32 sessions to match the performance of the rats (Fig. 2D, left).
In summary, our behavioral modeling showed that generalized logistic models can capture a significant portion of the rat behavioral variability based on the nonsensory variables satiation and trial history. Of particular interest for the remainder of this study is the fact that adding the local trial history term to the equation, significantly improved the model fit, the model's prediction, and the generative recreation of the number of changeovers, a variable not explicitly used to fit the models. We will show in the following sections that only one of the two nonsensory variables, trial history, is represented as well by VPm spike rate and patterns.
Determinants of VPm spiking
We recorded a total of 42 multiunits and 8 single units within the α, A1, A2, C1, C2, D1, and D2 barreloids associated with the stimulated whisker during detection behavior. By gradually moving the electrodes, units were found between 4000 and 5000 μm in depth, consistent with anatomical records of VPm location (Paxinos and Watson, 2007), and confirmed later by histological examination of electrode tracks and electrical lesions. VPm unit responses to ramp deflections of a single whisker approximately matched the ones reported by previous studies investigating anesthetized rats (Petersen et al., 2008). The neural response consisted of a short elevation of firing rates at a latency of 6.1 ± 1.2 ms (mean ± SD) with a duration of ∼20 ms (Fig. 3A). In a subset of units, we tested the responses to the ramp stimulus without embedding it in white noise and found that background whisker noise stimulus reduced both noise-driven and ramp-driven responses considerably (spontaneous background vs noise-driven: AUC 0.41; ramp evoked with noise stimulation vs ramp evoked without noise stimulation: AUC = 0.38; 7 SU and 21 MU) as expected from preadaptation of VPm units (Whitmire et al., 2016).
In addition to traditional sensory stimulus representation, here we aim to demonstrate nonsensory input representation in the VPm. Therefore, we first determined whether VPm responses (and thus their receptive field properties) adapt to sensory properties of the ramp. As VPm units show strong directional selectivity (Fig. 3A; Petersen et al., 2008), VPm unit receptive fields should be sensitive to the direction of the ramp stimulus. We therefore introduced a reversal of ramp direction in the middle of the session (after 50, 75, or 100 trials depending on the total number of session trials). In approximately half of the sessions we switched from caudal to rostral stimulus direction, in the remaining ones the sequence was reverse. As expected from their directional preference, units often showed differences in responding to different ramp deflection directions (Fig. 3A exemplifies this using an SU), but the rat's detection performance was not affected at all by this manipulation (compare Fig. 2A). To determine whether the change in ramp direction impacted the coding properties across units, the temporal response properties were quantified from responses to the white noise stimulus within a 2 s period preceding the ramp presentation. In a first step, spike-triggered stimulus ensembles were constructed (as done before with primary afferent spike trains; Chagas et al., 2013). Exemplary data obtained from a single-unit spike train are shown in Figure 3B. The stimulus ensembles were instantaneous (i.e., measured at one single point in time), and were obtained at the time preceding the spike that maximized stimulus information contained in the spike train (Eq. 9). In the 2D projections of the 3D distribution of stimulus properties [position, velocity and acceleration = (pos, vel, acc)], the VPm single units showed single preferred stimuli and conveyed up to 3.7 bits/s (note that the total stimulus information conveyed by a VPm neuron is likely much higher if information within an interval is taken into account; cf. Chagas et al., 2013). We constructed the centroid of the preferred instantaneous stimulus separately in the first and second halves of the session. From the vectors pointing in three dimensions to the centroids of the two halves, we calculated the difference vector (plotted in Fig. 3C). We found that preferred stimuli did change systematically in the two halves of the session, but not with respect to the ramp direction. Contrary to the expectation of opposite changes (due to opposite ramp directions), we found that in all sessions the receptive fields changed on average in almost identical ways; toward slower stimulus velocities and caudal positions (Fig. 3C). This result held regardless of whether rostral or caudal ramps were presented in the first half, suggesting that it is not a stimulus driven change. All of the observed differences were smaller than 5% of 2 SD of the kinematic range covered by the noise stimulus. However, changes were highly significant, as reflected by a small reduction of the mean vectors (connecting the centroids to the origin) and a slight but robust decrement of stimulus information transmitted by the spike train (Fig. 3D; length of sum vector in arbitrary units: n = 50; first half: median: 1.8, IQR: 3.6; second half: median 0.9, IQR 2.9; sign rank test, p = 1e−05; Fig. 3E; information rate in bits/s: n = 50; first half: mean: 2.29, SD: 0.43; second half: mean 2.24, SD: 0.47; sign rank test, p = 0.006). In summary, we found changes in coding across sessions that were not systematically related to the sensory context (i.e., ramp direction).
In view of the small effect size and sensory context independent nature of the changes in neuronal coding, we were concerned that the effect could be due to run-down of the neuronal responses (e.g., systematic deterioration of spike quality during the recording). To address this possibility, we first tested stability of spike waveforms and spike counts recorded in the two halves. Spike waveforms were sampled as 23 voltage readings. Interpreting these as dots in a 23-dimensional space we calculated the Mahalanobis distance between the clouds made up by spikes in the first versus second half of the session. This distance is related to the effect size d′ but takes the correlations of the sample into account, i.e., it uses ellipsoids instead of spheroids to scale the distance of a point to the cloud's center of mass. The Mahalanobis distance was close to zero in the total sample of 50 units (n = 42 MU, 8 SU; median 0.08, IQR 0.14). We therefore conclude that spike waveforms were stable across the two halves of the behavioral sessions. Also, spike counts were in the same range in the two halves (50 trials each) giving no reason to suspect recording instability. However, counts were slightly but consistently higher in the second half (number of spikes first half: median: 5181, IQR: 3340; second half: median 5517, IQR 3658; sign rank test, p = 4e−04).
In summary, we found in this section that neuronal coding changed to a small degree and the spike rate evoked by the background white noise stimulus increased during the behavioral session. We excluded the possibility that these systematic changes were due to sensory context, as a reversal of the ramp stimulus, the perceptual target for the rats, had no effect. Further, our analyses render it unlikely that recording instability can be made responsible for the systematic changes found. In the next sections, we provide evidence that nonsensory variables are predictive for these changes in VPm firing.
VPm neurons reflect state-dependency
To explore whether firing rates differ with respect to the behavioral choice of the animal, we computed the population spike rate of all hit trials versus all miss trials observed in all units (n = 50; Fig. 4A). Parsing the trial classes revealed clearly distinguishable spike rates, even before the ramp onset. The noise-driven response in the 2 s pre-ramp interval was higher for upcoming m trials compared with upcoming h trials. Interestingly, the ramp-evoked responses showed the opposite, and were higher for trials that resulted in hits compared with misses (inset). These results were stable across session time, despite increasing satiation and differing ratio of hit versus miss trials (first half: 764 m, 2034 h; second half: 1636 m, 923 h; n = 50 units; Fig. 4A). The effect was robust even during phases of impulsive nonrewarded licking between trials in early epochs of the session versus reduced licking in later phases of the session. Figure 4B shows the effect size (AUC) of difference of firing rate giving rise to h or m outcome during the ramp-evoked peak of excitation. The effect across all units was small but highly significant. The median of bootstrapped AUC was >0.5 (random performance) before, and <0.5 in an interval 5–10 ms after ramp onset with the 95% prediction interval (calculated in a 5 ms running window) excluding 0.5 in both cases (Fig. 4B, right). The population of eight single units in the sample showed the same tendency albeit the prediction interval of noise-driven firing rates did not exclude 0.5, presumably due to the low number of SUs generating very low firing rates (Fig. 4B, left). To investigate the effect of trial history, we selected those trial doublets in which an h trial was either followed by another h trial or an m trial (hh vs mm; Fig. 4C). The preceding hit trials by definition were followed by licks, giving rise to extra spikes due to face movement, and therefore, precluded analysis of the resulting spike rate. However, after the licks subsided, the population firing rate quickly separated according to whether a hit or miss trial was going to ensue. Shortly before the next stimulus presentation, the average spike rates were well separated with higher spike rate before miss trials compared with hit trials. Again, the spike response to the ramp stimulus was reversed; higher for h trials than for m trials. We then went further to inspect differences in spike rate in prestimulus firing rates of SUs and MUs using spike rates found in the last of a triplet sequence of h trails (hhh) or m trials (mmm; data not shown). Mean spike numbers found in the second before the last stimulus in hhh triplets versus mmm triplets were 17.2 versus 19.2 spikes in SUs (n = 8) and 44.5 versus 57.8 spikes in MUs (n = 42). Considering the total sample of units (n = 50), the AUC effect size of this difference was consistently above random performance in the pre-ramp interval and below random performance after ramp onset (as indicated by the exclusion of 0.5 from the 95% prediction interval). These initial insights suggested that the slight differences in spike rate of the first versus second halves of the session reported in the last section (Fig. 3) were likely due to a specific enhancement of spikes before miss trials, which were more abundant in the second half of the sessions.
To describe the relationship of spiking on trial history and satiation in a more systematic way, we selected trials according to trial history obtained from one (single), two (doublet), or three recent trials (triplets). Further we arbitrarily classified satiation into bins of 0.1 (the range of satiation is 0 to 1). We show population noise-driven firing rates (obtained in a 2 s interval before ramp onset; Fig. 5A,B) as well as ramp-driven rates (obtained in a window 5–6 ms after ramp onset, Fig. 5C,D). The pairwise AUC effect sizes (comparing firing after m/h, mm/hh, and mmm/hhh trial sequences; Fig. 5A,C), and all satiation levels compared with lowest satiation at the beginning of the session (<0.1; Fig. 5B,D) were calculated as well. It can be appreciated that firing rate before and after the ramp is related to trial history. Further, the impact of trial history seems to reach back into the past further than the most recent trial: the AUC effect size of m/h, mm/hh, and mmm/hhh comparisons is increasing for noise-driven activity (0.59, 0.61, 0.63) and decreasing for ramp-evoked activity (0.45, 0.43, 0.42). The results so far are consistent with the view that recent task engagement is associated with an increase in the ratio of ramp-evoked spiking to noise-evoked spiking, possibly improving the rats' stimulus detection, while disengagement leads to the opposite. These variations are local and involve about three trials, with the most recent trial imparting the strongest effect and the most remote trial imparting the weakest effect. In contrast, satiation seems to correlate with noise, as well as ramp-driven activity, if at all, only in a weak and inconsistent fashion (Fig. 5B,D).
To substantiate and further quantify these results, we next tested whether a series of two coupled GLMs was able to predict the animal's decisions via the VPm spike responses. Figure 6A plots the behavioral performance registered with each of the 50 spike trains, and the respective noise-driven spike counts (in a 2 s interval before the respective trial; there are duplicates of behavioral performance among the 50 sessions as sometimes more than one unit was recorded within the same session). Figure 6B depicts the observed spike counts. To be able to study the effect of the same independent variables as done for the direct behavioral modeling, the first GLM used the identical linear regression model (Eq. 1) to map the three variables bias, satiation and trial history onto spike counts. To match the assumed Poisson distribution of spike counts, an exponential was used as the nonlinear response function (Eq. 3). A second GLM then used the spike count prediction of the first as input (weight bSPK, together with a bias term b0,SPK) to map them onto the rats' decisions (Fig. 6C). Fitting the full model and its nested versions, we found that the full model best explained pre-ramp spiking (deviances: DSPK = 1165; DDec = 144) closely followed by the reduced model lacking satiation (with the first GLM attaining slightly higher and the second attaining lower deviance: DSPK,full = 1192; DSPK,red1 = 1165; DDec,full = 144; DDec,red1 = 165), indicating that the satiation term did not consistently contribute to the models' prediction performance. Finally, abolishing the trial history term reduced the models' performance consistently and significantly (DSPK,red2 = 1319; DDec,red2 = 196; both significant different form the full and the first reduced model, p < 0.01, deviance test). The results of the model predictions for each trial in the sample are shown in Figure 6D and should be compared with the animals' actual performance in Figure 6A. Qualitatively, the full model shows only a modest improvement over the bias and trial history model, suggesting that the fine grained changes in responding are conveyed largely by the inclusion of the trial history term in the regression model rather than the satiation term. Finally, we predicted the changeover rate based on the rat behavioral data (Fig. 6E). Although changeover prediction grossly failed with the bias-only model (data not shown), adding the trial history term approximates the animals' changeover rate (38 of 50 prediction interval estimated from 1000 model runs included the animals' changeover rate), although there was a tendency of overestimation in sessions with low changeover rates. The prediction of changeover rate did not improve by adding satiation to the regression equation. If at all, the addition of this term deteriorated the model performance somewhat (29 of 50 prediction intervals included the experimental value). As before for the direct behavioral model the gradient of performances of the indirect model is captured by the median accuracy (Eq. 6) of prediction [bias: median accuracy = 0.52 (the 95% prediction interval straddled random accuracy of 0.5 in 40 of 50 sessions): bias and trial history: median accuracy = 0.66 (11 of 50 prediction intervals straddled random performance); full: median accuracy = 0.65 (13 of 50 prediction intervals straddled random performance)].
The similarity of model performance to predict the choices of the rats based directly on the behavioral data (Fig. 2) and indirectly via VPm spike counts (Fig. 6) indicated that spike counts carry some information about trial history as well as the upcoming decision. To show this more directly, we correlated pre-ramp noise-driven spike counts first with current trial history, second with upcoming decisions and thirdly with the predictor for decisions from the behavioral model. The resulting correlations were scattered widely but assumed predominantly the negative range of correlation coefficients (i.e., low firing predicting high probability of a Go decision; Fig. 6F). The best units reached coefficients lower than r = −0.5. To form an intuition what these differences mean in terms of number spikes, we plot these correlation coefficients against the difference of spike counts they would predict for the extremes of trial history (mmm vs hhh).
In a final analysis, we sought to determine whether information about trial history is contained exclusively in spike counts or whether it could also be transferred by spike patterns. This is an important question as thalamus regular versus burst spike patterns are known to have a unique basis in special membrane properties and have long been considered as decisive variables of thalamic functionality (Jahnsen and Llinás, 1984). We first calculated the Fano factor, which is a measure of variability of spike count (Eq. 7). This measure clearly contained information about upcoming miss and hit trials and orders the binary value of doublet and triplet trial history sequences fairly well (Fig. 7A). The ordering is such that miss trials or sequences containing more (or more recent) miss trials tend to assume a higher Fano factor, i.e., more irregular spike counts. Another measure, better focused on spike timing, is the coefficient of variation (CV) of spike intervals (Eq. 8). Like the Fano factor, the CV of spike intervals indicated more irregular firing with sequences containing more (or more recent) miss trials (Fig. 7B). Finally, the time scale of spike patterning was captured by autocorrelograms (ACs) of MU spike trains obtained from the different trial classes. The mean ACs showed a peak of correlation at small time intervals <5 ms, a possible sign of bursting within the population of neurons. Importantly, the amplitude of the peak systematically varied with trial history in line with Fano factor and CV: the most patterning on this precise time scale was seen with trials containing most (or most recent) miss trials (Fig. 7C). In summary, information about trial history is contained in VPm neuron prestimulus spiking as well as spiking irregularity.
Discussion
In this study we have revealed two task-dependent, nonsensory variables, which partly predict the choice behavior of rats trained on a tactile detection task. We provide evidence that one of them, a local variable that reports the last few choices and rewards, is also represented in the spike rate and patterns of rat whisker thalamus (VPm). Specifically, our study has the following novel aspects. First, we show that VPm neurons, an early tactile neuronal structure of the thalamus on the ascending tactile pathway, can reflect cognitive signals. Second, we show that this effect is related to behavioral outcomes of previous trials. Third, the reflection of local trial history in VPm spiking is largely limited to two to three trials in the past.
Functional aspects of the found modulation
Our focus on task-dependent variables ignores all task independent variables needed to explain the total variance of choice behavior and VPm spiking. This reduction is shared with many studies that have studied effects on either behavioral (Corrado et al., 2005; Busse et al., 2011; Stüttgen et al., 2013) or neurophysiological variables (Britten et al., 1996; Sugrue et al., 2004; Nienborg and Cumming, 2009; Yang et al., 2016). In the visual and tactile system of monkeys, it is generally assumed that choice probability is high in higher cortical areas and declines in early sensory ones (Britten et al., 1996; de Lafuente and Romo, 2005; Nienborg and Cumming, 2006). Nevertheless, some choice related activity has been reported as early as the first-order thalamus (Jiang et al., 2015; Yang et al., 2016). The mentioned studies generally argued that choice-related spiking can be readily explained by bottom-up conveyance of random activity fluctuations in the sensory pathway, which then impact perceptual circuits; a purely “sensory” interpretation of the observed relation (Britten et al., 1996). Nienborg and Cumming (2009) have exploited white noise analysis to reveal that choice-signaling in monkey V2 area builds up over hundreds of milliseconds after onset of stimulus presentation, a finding incompatible with bottom-up signaling. We extend their finding by demonstrating, first, that the found influences are related to past decision/outcomes, and second, are reflected by VPm noise-driven spiking in the interstimulus interval, long before the upcoming stimulus is presented and a choice is executed. Therefore, the found effects are related mainly to behavioral context and corresponding brain states, and therefore must have been caused by central signals. Bottom-up fluctuations are not excluded by our results, but their effect naturally must be limited to the poststimulus period.
The mentioned reports, studying choice-related signals based on the choice in just a single trial, all found very small but significant choice probabilities. Interestingly, we found that effect sizes for pre stimulus spiking in VPm were seen to increase from just >0.5 to close to 0.7, when considering past choices and outcomes (Fig. 5, compare A, C). We conclude that even very small effect sizes, obtained with classical single trial approaches, can gain considerable strength if appropriate behavioral context is taken into account (in our case local trial history, the preceding 3 trials). One study in the mouse tactile system, directly related to the present work, investigated choice related signals at the level of VPm in the context of a detection task (Yang et al., 2016). This study found a nonrandom choice probability in activity evoked by the target stimulus, which we confirm here. An important difference was, however, that the previous study did not find any influence of choice on the “spontaneous” (nonstimulated) firing between stimuli. To explain this apparent difference, we hold that consideration of trial history and intertrial white noise stimulation were the two decisive experimental factors that enabled us to extend the previous results by exposing modulation of spike rates and patterns that anticipated upcoming choices. It is noteworthy that modulation of spike rate in the prestimulus period relating to upcoming hit trials was opposite to that after stimulus onset (the first negative, the second positive). It is thus a suggestive proposition that increment in spike rate contrast between background and target stimulus in VPm units are at the basis of improved detection.
The nature of modulatory effects
We first delineate the found effects from motor-related signals on the ascending tactile pathway. We and others have shown repeatedly that a passive psychophysical task in well habituated and overtrained head-fixed rats largely abolishes whisking. To have head-fixed animals generate whisker movements they need to be explicitly rewarded for movements (Bermejo et al., 1996, 1998; Stüttgen et al., 2006; Schwarz et al., 2010; Ollerenshaw et al., 2012). Moreover, our experimental condition, with the whisker inserted into a narrow tube, would translate whisking to forces of the inserted whisker against the tube, and thus, to elevated spiking in concert with the whisking rhythm (∼10 Hz; Urbain et al., 2015; adding to internally generated rhythmic whisking-related signals, Moore et al., 2015). Sensory gating, the well described suppressive effect of movement on ascending sensory signals (Chapman, 1994; Hentschke et al., 2006) would surely be overwritten by these forces acting directly on the mechanically clamped whisker. We therefore, calculated ACs of spike trains and scrutinized them for signs of the ∼10 Hz whisker rhythm. Applying rigorous criteria, we removed all trials from the data, in which rhythmic influence of spiking by whisking was suspected. Further, modulation possibly linked to licking movements were excluded by conditioning stimulus presentation to the absence of licking 2 s before stimulus presentations.
Our study is a first step to characterize the effect of behavioral disposition on upcoming choices and spiking in the ascending tactile pathway. The most direct evidence obtained in our study is in favor of memory content about past trial decisions and outcomes. We found that triplets of trials gain access to future choice behavior and VPm spiking in a highly structured way, such that the most proximate trial attains the largest weight on upcoming decisions and spiking, with the second last one exerting a lower effect, and the third last one being close to ineffective. Intertrial dependencies are regularly observed in behavioral paradigms, in which agents are required to keep past experiences in memory and adaptively extract behavioral strategies from them (Corrado et al., 2005; Lau and Glimcher, 2005; Stüttgen et al., 2013). However, it is a remarkable and well corroborated fact in animals and humans that intertrial dependencies are regularly observed also with fixed stimulus–reward properties/contingencies (Nienborg and Cumming, 2009; Busse et al., 2011; Fründ et al., 2014). It appears therefore, that intertrial correlations may be generated internally in a quasi-automated fashion, even in cases where the trial history is behaviorally irrelevant. Our behavioral and physiological data clearly confirm this notion. Such intertrial dependencies may be related to attentional processes, for which facilitated thalamic spike responses (McAlonan et al., 2008) or BOLD responses (O'Connor et al., 2002; Saalmann and Kastner, 2011; Ling et al., 2015) were reported.
On the behavioral level, our data also point to a role for satiation, a more global influence on choice behavior compared with trial history. In contrast, satiation, when combined with bias and trial history as a regressor in the indirect model predicting behavior from spikes, did neither consistently improve the model fit nor did it predict changeover rates better. From the relative lack of influence of the global variable on thalamic processing, we suggest that signals mediating satiation and similar variables access executive functions downstream from VPm. This conclusion is in agreement with the idea that ingestion is controlled by a neuronal network that involves hypothalamus, basal ganglia and prefrontal cortex, and receives tactile inputs from the brainstem/spinal cord level (Risold et al., 1997).
Possible anatomical bases
Working memory, holding the short term decision memory as found here, likely originates in association areas in prefrontal or parietal cortex (de Lafuente and Romo, 2005; Gold and Shadlen, 2007; Fuster, 2009). The primary sensory thalamus may be accessed for this kind of signal by top-down chains of corticocortical and corticothalamic projections (Crandall and Keller, 1985; Ghazanfar et al., 2001; Temereanca and Simons, 2004; Casagrande et al., 2005; Gilbert and Sigman, 2007; Briggs and Usrey, 2008; Buffalo et al., 2010; Mease et al., 2014). Although well described projections of prefrontal areas to neuromodulatory centers in the brainstem, which in turn terminate on neurons of ascending sensory pathways, may play a role as well (Moruzzi and Magoun, 1949; Steriade and McCarley, 1990; McCormick and Bal, 1997). Crick (1984) proposed that the corticothalamic feedback system may help to functionally couple neurons within and across ascending projection systems, and thus act as the “searchlight of attention”. For this to happen, the integrative power of the corticothalamic system should clearly surpass the borders of unisensory integration. Recent evidence of nonsensory and multisensory signaling of primary sensory cortex is supporting such a role (Lau and Glimcher, 2005; Busse et al., 2011; Halassa et al., 2014), and the nonsensory thalamic signals revealed here would be a required element in this scheme.
Footnotes
This work was supported by Grants from the Deutsche Forschungsgemeinschaft (DFG SCHW577/ 12-2 and 14-1) and BMBF CRCNS (01GQ1113) to C.S., and NSF CRCNS (IOS-1131948) to G.B.S. C.J.W. was supported by GT/Emory NIH Computational Neuroscience Training Grant (T90DA032466) and the NIH NRSA Pre–doctoral Fellowship (F31NS089412). We thank Maik Stüttgen and Laura Busse for advice about behavioral modeling, and Ursula Pascht for fabricating the electrode arrays.
The authors declare no competing financial interests.
- Correspondence should be addressed to Cornelius Schwarz, Systems Neurophysiology, Werner Reichardt Center for Integrative Neuroscience, University Tübingen, Otfried Müller Str. 25, 72076 Tübingen, Germany. cornelius.schwarz{at}uni-tuebingen.de