Abstract
Growing evidence supports a critical role for the dorsal striatum in cognitive as well as motor control. Both lesions and in vivo recordings demonstrate a transition in the engaged dorsal striatal subregion, from dorsomedial to dorsolateral, as skill performance shifts from an attentive phase to a more automatic or habitual phase. What are the neural mechanisms supporting the cognitive and behavioral transitions in skill learning? To pursue this question, we used T-maze training during which rats transition from early, attentive (dorsomedial) to late habitual (dorsolateral) performance. Following early or late training, we performed the first direct comparison of bidirectional synaptic plasticity in striatal brain slices, and the first evaluation of striatal synaptic plasticity by hemisphere relative to a learned turn. Consequently, we find that long-term potentiation and long-term depression are independently modulated with learning rather than reciprocally linked as previously suggested. Our results establish that modulation of evoked synaptic plasticity with learning depends on striatal subregion, training stage, and hemisphere relative to the learned turn direction. Exclusive to the contralateral hemisphere, intrinsic excitability is enhanced in dorsomedial relative to dorsolateral medium spiny neurons early in training and population responses are dampened late in training. Neuronal reconstructions indicate dendritic remodeling after training, which may represent a novel form of pruning. In conclusion, we describe region- and hemisphere-specific changes in striatal synaptic, intrinsic, and morphological plasticity which correspond to T-maze learning stages, and which may play a role in the cognitive transition between attentive and habitual strategies.
SIGNIFICANCE STATEMENT We investigated neural plasticity in dorsal striatum from rats that were briefly or extensively trained on a directional T-maze task. Our results demonstrate that both the extent of training and the direction a rat learns to turn control the location and type of change in synaptic plasticity. In addition, brief training produces changes in neuron excitability only within one striatal subregion, whereas all training produces widespread changes in dendritic morphology. Our results suggest that activity in dorsomedial striatum strengthens the rewarded turn after brief training, whereas activity in dorsolateral striatum suppresses unrewarded turns after extensive training. This study illuminates how plasticity mediates learning using a task recognized for transitioning subjects from attentive to automatic performance.
Introduction
Performance of a newly learned task requires more attention, relies on separate cognitive processes, and engages different brain regions than skillful performance of the same task after extensive training. Within the basal ganglia, striatal subregions preferentially serve these distinct learning stages (Ragozzino, 2003; Murray et al., 2012). The dorsomedial region is engaged in periods of attentive decision-making and serves early learning, while the dorsolateral striatum automates responses and practiced skills later in learning. This shift in striatal engagement is indicated by changes in behavior following subregional lesions (Whishaw et al., 1987; Yin et al., 2004; Lee et al., 2014), in vivo neural activity (Yin et al., 2009; Thorn and Graybiel, 2014), and changes in glutamate receptors suggestive of synaptic plasticity (Yin et al., 2009; Kent et al., 2013; Shan et al., 2014).
Synaptic plasticity is the activity-dependent adjustment in connections between neurons; within striatum, this enables experience to selectively enhance critical action–outcome associations. The only study to date demonstrating learning-related change in evoked synaptic plasticity across striatal subregions reports enhanced long-term depression (LTD) and altered AMPA:NMDA ratios in the dorsolateral striatum of extensively trained animals (Yin et al., 2009). One interpretation is that recent long-term potentiation (LTP) elevates synaptic weight, thereby enhancing room for synaptic weight change in the opposing direction, i.e., LTD (Lin, 2010; Cooper and Bear, 2012). Alternatively, learning may modulate LTD and LTP independently rather than reciprocally. Distinguishing these possibilities requires direct comparison of bidirectional plasticity, which we achieve using a novel theta-burst LTP protocol (Hawes et al., 2013).
T-maze training transitions rats from attentive, action–outcome performance to automatic, stimulus–response performance of a rewarded turn (Packard, 1999; Yin and Knowlton, 2004). Hemispheric lesions and in vivo recordings demonstrate that turning behavior corresponds to increased striatal activity in the contralateral hemisphere (Ungerstedt et al., 1969; Cui et al., 2013). However, the development of plasticity sculpting a learned turn is uncharacterized, making a lateralized task useful. The present study tracks plasticity by hemisphere during T-maze learning to identify the hemispheric distribution of plasticity sculpting a turn.
Intrinsic excitability and morphology may interact with synaptic plasticity to serve learning. In dorsal striatum, potassium channel regulation accompanies spatial learning (Truchet et al., 2012) and modifies synaptic plasticity (Nazzaro et al., 2012). Dendritic spine growth is an indication of LTP (Kasai et al., 2010), whereas recent work shows that memory and LTP are supported by spine loss in behaviorally engaged circuits, suggesting signal-to-noise enhancement through synaptic pruning (Sanders et al., 2012). To directly evaluate whether these different forms of plasticity interact to produce learning behavior, we measure intrinsic excitability and morphology of striatal medium spiny neurons in parallel with synaptic plasticity measures.
This is the first study to investigate anatomical distribution of evoked bidirectional striatal plasticity as animals transition from early, attentive place to late, automatic response strategies with T-maze learning (Packard, 1999). We find learning independently modulates striatal LTP and LTD. Plasticity, intrinsic excitability, and morphology each reflect maze training, and we demonstrate that neural learning signatures have a biased hemispheric distribution reflecting the direction an animal learns to turn.
Materials and Methods
Animals and habituation.
All animal handling and procedures were in accordance with the National Institutes of Health animal welfare guidelines and were approved by the George Mason University IACUC. Adult, male Long–Evans rats (2- to 3-months-old, Charles River Laboratories) were acclimated to the animal facility, undisturbed, for a minimum of 1 week. After acclimatization, rats were habituated to human handling by passive holding for 5 min a day for 7 d, during which time they began food restriction (Fig. 1A). To motivate food seeking, rats were maintained between 85% of their initial free-feeding weight and 85% of the average weight for their age in free-feeding male Long–Evans rats (providing for weight gain with age in late-trained animals). On their seventh day of holding, rats were given three Kellogg Froot-Loop halves in their home cage to begin habituation to this food reward, the same reward used in maze training and probe runs. The next day, rats began food cup habituation, in which they explored a rectangular table until eating from a food cup at one end of the table containing one Froot-Loop half. Food cup habituation continued until rats ate from the cup in <3 min on 2 consecutive days (typically taking 3 d). Holding and food cup habituation occurred in the same room, distinct from housing and maze rooms. Rats were first exposed to the maze room during a single day of maze habituation in which rats were released onto the maze from the opaque south arm start box as would occur during training, and were given 5 min to explore the maze without reward. Including holding, food cup habituation, and maze habituation, all rats experienced 11 ± 1.1 d of habituation (Fig. 1A). To avoid reinforcing intrinsic bias in turning behavior, experimenters noted the order of arm entry during maze habituation, and rewarded the second-choice arm during maze training. The rewarded turn direction was consistent for each rat, but varied between subjects.
Assignment to habituated, early-trained and late-trained groups was pseudorandom and preceded the start of behavior for each rat. Behavior start dates were staggered such that, on any given day, animals from each condition were in training but final probes would not overlap.
T-maze training.
The maze room was dimly lit to minimize animal anxiety, with bold visual cues distinguishing all quadrants of the room. Maze habituation, training, and probe trials were video recorded by a ceiling-mounted camera centered over the maze. Identical food cups were secured at ends of east and west maze arms, and identical, opaque start boxes were fastened to the ends of south and north arms. A mobile, clear plastic barricade blocked entry into the arm opposite the animal's start position, which was the south arm during training and north arm on probes.
Each training day consisted of four maze runs, and rats were trained every day in which a probe was not given. Rats entered the maze room in an opaque transfer cage and were given ∼30 s in the transfer cage, followed by ∼10 s in the south arm start box before each run. Either the east or the west arm was baited (Fig. 1B, left). After each run, rats were removed to the transfer cage after either eating the reward or committing a wrong turn. The maze was wiped down between runs to obscure olfactory cues, and on a pseudorandom schedule, the maze top was rotated 180° to prevent reliance on intrinsic cues. Criteria to end early training were correct execution of all four runs within a training day, after a minimum of 4 d training. On meeting these criteria, a strategy probe was administered the next day (Fig. 1A, P1). Late-trained rats were given 2 additional weeks of training with six training days per week, and a single strategy probe every seventh day (Fig. 1A, P2 and P3). Early-trained rats trained 5.9 ± 0.4 d (23.6 ± 1.5 runs), whereas late-trained rats trained 18.6 ± 0.6 d (74.3 ± 2.2 runs) beyond habituation.
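The criterion for ending early training can be expressed as a short check. The following is a minimal sketch only: the function name, the boolean encoding of runs, and the assumption that the criterion may be met on the fourth training day itself are illustrative and are not specified in the protocol above.

```python
def criterion_day(daily_runs, min_days=4, runs_per_day=4):
    """Return the 1-based training day on which the criterion is met, or None.

    daily_runs: list of training days, each a list of booleans
                (True = correct turn) in run order.
    Assumption: the criterion can be met on day `min_days` itself.
    """
    for day, runs in enumerate(daily_runs, start=1):
        all_correct = len(runs) == runs_per_day and all(runs)
        if day >= min_days and all_correct:
            return day
    return None


# Hypothetical record: a perfect day 3 does not count (fewer than 4 days of training).
example = [
    [True, False, True, True],
    [True, True, False, True],
    [True, True, True, True],
    [False, True, True, True],
    [True, True, True, True],
]
print(criterion_day(example))
```

In this hypothetical record the rat would receive its first strategy probe on the day after day 5.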
On probe days, rats were started in the north arm, both food cups were baited, and rats were given a single run (Fig. 1B, right). On a probe run, an animal rewarded throughout training for turns toward the east arm was scored as demonstrating a place strategy if it made a turn toward the east arm, thereby choosing the spatial location rewarded in training. In contrast, the same animal was scored as using a response strategy if it made a turn toward the west arm, thereby executing the turn direction rewarded in training. On both probe days and training days, a turn was determined by the entire body and base of the tail crossing into an arm. Vicarious trial and error (VTE) was defined by a nose-cross into an arm followed by nose-cross out of the arm rather than committing to a turn, assessed from the aerial video view.
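The probe scoring rule can be summarized in a few lines. This is a sketch with illustrative names (score_probe, rewarded_arm, probe_arm); it encodes only the rule stated above, namely that entering the arm rewarded in training is scored as place and entering the opposite arm is scored as response.

```python
def score_probe(rewarded_arm, probe_arm):
    """Classify a single probe run started from the north arm.

    rewarded_arm: arm baited throughout training ("east" or "west").
    probe_arm:    arm entered on the probe run ("east" or "west").
    """
    arms = {"east", "west"}
    if rewarded_arm not in arms or probe_arm not in arms:
        raise ValueError("arms must be 'east' or 'west'")
    # Same spatial location as in training -> place strategy.
    # Opposite arm -> same body turn as in training -> response strategy.
    return "place" if probe_arm == rewarded_arm else "response"


print(score_probe("east", "east"))  # place
print(score_probe("east", "west"))  # response
```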
Slice preparation.
Habituated control rats were killed 24 h after maze habituation. Trained rats were killed 24 h following the first probe (early-trained group) or third probe (late-trained group). Brain slices were prepared as described by Hawes et al. (2013). Briefly, animals were anesthetized with isoflurane, and brains were extracted quickly and placed in oxygenated ice-cold sucrose slicing solution (in mM: 2.8 KCl, 10 dextrose, 26.2 NaHCO3, 1.25 NaH2PO4, 0.5 CaCl2, 7 MgSO4, 210 sucrose). Coronal slices were cut 350 μm thick on a Leica vibratome (VT1000S), and the animal's right and left hemispheres were carefully tracked and moved to separate, labeled incubation chambers containing aCSF (in mM: 126 NaCl, 1.25 NaH2PO4, 2.8 KCl, 2 CaCl2, 1 MgSO4, 26.2 NaHCO3, 11 dextrose) heated to 33°C for 30 min and then removed to room temperature (21−23°C) until recording.
Field recordings.
During field recordings, a pair of hemi-slices was transferred to a submersion recording chamber (Warner Instruments) perfused with oxygenated aCSF at 2.5–3 ml/min and 30−32°C containing 50 μM picrotoxin (Tocris Bioscience). Pipettes (resistance 3–6 MΩ) were pulled from borosilicate glass on a P-2000 puller (Sutter Instruments) and filled with the same aCSF bathing the tissue. Raw data were recorded using an intracellular electrometer (IE-251A, Warner Instruments) and 4-pole Bessel filter (Warner Instruments), sampled at 20 kHz and processed using a PCI-6251 and LabView (National Instruments). Population spikes were evoked by stimulating white matter overlying either dorsomedial or dorsolateral striatum with a tungsten bipolar electrode (diameter 0.005 inch bare, 0.007 inch Teflon-coated, A-M Systems) at an intensity producing 40–60% of the peak signal amplitude on an input–output (IO) curve collected at 0.066 Hz. In most recordings, the synaptically evoked striatal population spike (N2) was preceded by a downward voltage deflection (N1) indicating afferent depolarization by applied current (Takagi and Yamamoto, 1978; Lovinger et al., 1993). Experiments in which N1 varied by >20% from baseline at any point in an experiment were excluded. Population spikes were sampled at 0.033 Hz preinduction and postinduction. Plasticity induction was accomplished as described by Hawes et al. (2013). Briefly, LTP was induced by theta-burst stimulation (TBS) consisting of 10 trains, each train consisting of 10 bursts at 10.5 Hz (theta), and each burst consisting of four stimuli at 50 Hz, with trains spaced 15 s apart. Using this protocol, LTP was only reliably induced in the dorsomedial striatum of control animals; hence, it was not studied dorsolaterally. In both dorsomedial and dorsolateral regions, LTD was induced by moderate frequency stimulation consisting of four trains of 100 stimuli delivered at 20 Hz, with trains spaced 10 s apart.
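The timing of the two induction protocols can be made concrete by generating the pulse onset times. The sketch below uses the counts and rates given above; the function names are illustrative, and it assumes that the train spacing (15 s or 10 s) is measured from train onset to train onset, which the text does not state explicitly.

```python
def theta_burst_times(n_trains=10, bursts_per_train=10, burst_rate_hz=10.5,
                      pulses_per_burst=4, pulse_rate_hz=50.0,
                      train_interval_s=15.0):
    """Pulse onset times (s) for the TBS protocol used to induce LTP.

    Assumption: train_interval_s is the onset-to-onset spacing of trains.
    """
    times = []
    for train in range(n_trains):
        train_start = train * train_interval_s
        for burst in range(bursts_per_train):
            burst_start = train_start + burst / burst_rate_hz
            for pulse in range(pulses_per_burst):
                times.append(burst_start + pulse / pulse_rate_hz)
    return times


def moderate_frequency_times(n_trains=4, pulses_per_train=100,
                             pulse_rate_hz=20.0, train_interval_s=10.0):
    """Pulse onset times (s) for the 20 Hz protocol used to induce LTD.

    Assumption: train_interval_s is the onset-to-onset spacing of trains.
    """
    times = []
    for train in range(n_trains):
        train_start = train * train_interval_s
        for pulse in range(pulses_per_train):
            times.append(train_start + pulse / pulse_rate_hz)
    return times


tbs = theta_burst_times()
mfs = moderate_frequency_times()
print(len(tbs), len(mfs))  # total pulses delivered by each protocol
```

With these parameters both protocols deliver the same total number of pulses (10 × 10 × 4 and 4 × 100), so they differ in temporal pattern rather than in pulse count.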
The experimenter was blind to behavioral data during electrophysiology recording and data extraction. Population spike amplitude was extracted automatically from the 40 ms of raw data surrounding each test pulse using the software IGOR (Wavemetrics). The most negative voltage (N2) following the stimulation artifact was subtracted from the more positive of the following two features to determine population spike amplitude: either (1) mean voltage averaged over 1 ms immediately preceding the stimulation artifact, or (2) the upward going peak dividing N1 (fiber volley) and N2, as previously described (Lovinger et al., 1993; Hawes et al., 2013). During automated amplitude extraction, traces from each experiment were graphically displayed for review by eye, guarding against errors in data extraction. Statistical analysis was performed on the population spike amplitude normalized to the preinduction baseline. Significant increase or decrease in population spike amplitude relative to average baseline amplitude indicates LTP or LTD, respectively.
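The amplitude rule described above can be sketched as follows. This is not the IGOR routine used in the study: the function names, the use of explicit sample indices for the artifact and for the N1/N2 dividing peak, and the example trace are all hypothetical and serve only to illustrate the arithmetic.

```python
def population_spike_amplitude(trace_mv, artifact_start, artifact_end,
                               peak_idx=None, sample_rate_hz=20000):
    """Population spike amplitude from one evoked trace.

    trace_mv:       list of voltages (mV) around a test pulse.
    artifact_start: index of the first artifact sample (hypothetical input).
    artifact_end:   index of the first sample after the artifact.
    peak_idx:       index of the upward peak dividing N1 and N2, if present.
    """
    n_base = int(round(sample_rate_hz * 1e-3))  # samples in 1 ms
    if artifact_start < n_base:
        raise ValueError("need at least 1 ms of data before the artifact")
    pre = trace_mv[artifact_start - n_base:artifact_start]
    baseline = sum(pre) / len(pre)
    # N2 is taken as the most negative voltage following the artifact.
    n2 = min(trace_mv[artifact_end:])
    # Reference is the more positive of the baseline and the N1/N2 peak.
    reference = baseline
    if peak_idx is not None:
        reference = max(baseline, trace_mv[peak_idx])
    return reference - n2


def normalize_to_baseline(amplitudes, n_baseline):
    """Express amplitudes as a percentage of the mean preinduction amplitude."""
    base = sum(amplitudes[:n_baseline]) / n_baseline
    return [100.0 * a / base for a in amplitudes]


# Hypothetical trace: 1 ms flat baseline, 2-sample artifact, N1, peak, N2.
trace = [0.0] * 20 + [5.0, 5.0] + [-0.2, 0.1, -1.0, 0.0]
print(population_spike_amplitude(trace, 20, 22))
print(population_spike_amplitude(trace, 20, 22, peak_idx=23))
```

After normalization, values above 100% of baseline correspond to potentiation and values below 100% to depression.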
Whole-cell recordings.
Single hemi-slices from the same subjects used in plasticity experiments were transferred to a submersion recording chamber (ALA Science) gravity-perfused with oxygenated aCSF at room temperature. As with plasticity experiments, the experimenter remained blind to subject strategy and turn direction. In each hemi-slice, up to two medium spiny neurons (MSNs) were patched: one dorsomedial and one dorsolateral. No more than two cells were obtained from the same animal in a given region. Cells were patched under visual guidance using IR-DIC imaging (Zeiss Axioskop2 FS plus). Pipettes were fire-polished (Narishige MF-830) to a resistance of 4–7 MΩ, and filled with a potassium-based internal solution (in mM: 132 K-gluconate, 10 KCl, 8 NaCl, 10 HEPES, 3.56 Mg-ATP, 0.38 Na-GTP, 0.1 EGTA, 0.77 biocytin), pH 7.3. Intracellular signals were collected in current-clamp and filtered at 3 kHz using an EPC 10 amplifier and Patchmaster software (HEKA Elektronik). Series resistance (6–15 MΩ) was compensated 80%, and capacitance was not compensated. Cells were determined to be MSNs by their low resting membrane potential (near −80 mV), rounded AHPs, and long latency to first action potential. Current–voltage (IV) and current–frequency (IF) curves were recorded from each cell using 400 ms current injections. Because MSNs display strong inward rectification, their IV curves display distinct linear components at potentials negative and positive to rest. Therefore, we analyzed two input resistance (IR) values for each cell by fitting a line to the IV curve at current injections of −100 to −500 pA (IRneg) and at 0 to +100 pA (IRpos). More positive current injections were excluded from input resistance analysis to avoid contamination from action potential firing. Rheobase was the lowest current injection value eliciting an action potential, and latency was the time between onset of current injection and action potential peak at rheobase.
Note that each hemi-slice was used for either field recording or whole-cell recording but not both.
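The two input resistance values and the rheobase defined above can be illustrated with a least-squares line fit. This is a minimal sketch: the function names and all numerical values are hypothetical, and the fit is an ordinary least-squares slope, which is one reasonable reading of "fitting a line to the IV curve".

```python
def fit_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def input_resistance_mohm(currents_pa, voltages_mv, lo_pa, hi_pa):
    """Input resistance (MOhm) from the IV slope within [lo_pa, hi_pa]."""
    pts = [(i, v) for i, v in zip(currents_pa, voltages_mv) if lo_pa <= i <= hi_pa]
    if len(pts) < 2:
        raise ValueError("need at least two points to fit a line")
    i_vals, v_vals = zip(*pts)
    # mV / pA is GOhm, so multiply by 1000 to report MOhm.
    return 1000.0 * fit_slope(i_vals, v_vals)


def rheobase_pa(currents_pa, spike_counts):
    """Lowest injected current that elicited at least one action potential."""
    firing = [i for i, n in zip(currents_pa, spike_counts) if n > 0]
    return min(firing) if firing else None


# Hypothetical IV data showing inward rectification (shallower slope negative to rest).
currents = [-500, -400, -300, -200, -100, 0, 50, 100]
voltages = [-105.0, -100.0, -95.0, -90.0, -85.0, -80.0, -75.0, -70.0]
ir_neg = input_resistance_mohm(currents, voltages, -500, -100)
ir_pos = input_resistance_mohm(currents, voltages, 0, 100)
print(ir_neg, ir_pos)
print(rheobase_pa([0, 50, 100, 150, 200], [0, 0, 0, 1, 3]))
```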
Morphology.
MSNs were filled with biocytin through the patch pipette for 20–30 min during intrinsic excitability measurements. Hemi-slices were then fixed in 4% paraformaldehyde overnight before removal to PBS. Hemi-slices (350 μm thick) were stained using the biocytin staining protocol for thick slices (Marx et al., 2012). Briefly, after fixation and rinsing in PBS, slices were incubated in the Vectastain ABC kit (Vector Laboratories) overnight at 4°C. After further rinsing in PBS, slices were stained using the DAB kit (Vector Laboratories) with the nickel addition. Slices were then rinsed in PBS and dried overnight in a humid chamber on gelatin-coated slides. Finally, slices were slowly dehydrated in an ethanol series (25, 35, 45, 55, 65, 75, 85, 95, and 100%) and cleared in xylene. Eukitt mounting medium (Vector Laboratories) was used for coverslipping.
Successfully stained neurons were reconstructed directly from the tissue. Neurons were fully reconstructed at 40× magnification without spines, and partially reconstructed (one branch) at 100× magnification to count spines. The branch selected for high-magnification reconstruction was the primary dendritic branch with the most clearly identified spines. Reconstructions were done manually, i.e., a human reconstructor used a cursor to trace and mark visible structures on the monitor using the software Neurolucida (v7), while adjusting focus to move through the tissue in 3D. Reconstructors were trained identically, and were blind to subjects' experimental condition.
Dendritic length, number of branch points, and spine density were each analyzed by path distance from the soma, as opposed to the more traditional Sholl (i.e., Euclidean) distance. Path distance measures distance from the soma when traveling along the dendrites. Within a bin of set path distance, the amount of dendritic length depends on the number of contributing dendrites, and thus depends on the number of branches and the length of branches present. Note that, unlike with Sholl distance, the amount of dendritic length within a set path distance from the soma is unchanged by tortuosity.
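The distinction between path distance and Euclidean distance can be shown with a single traced dendrite. The sketch below is illustrative only: the function name and coordinates are hypothetical, and it is not the NeuroExplorer analysis used in the study.

```python
import math


def path_and_euclidean_distance(points):
    """Distances (same units as the input) from the first to the last point.

    points: ordered (x, y, z) coordinates traced from the soma along one dendrite.
    Returns (path distance along the dendrite, straight-line distance).
    """
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    euclid = math.dist(points[0], points[-1])
    return path, euclid


# Hypothetical tortuous dendrite: two 5 um segments that double back toward the soma axis.
tortuous = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
# Hypothetical straight dendrite with the same total length.
straight = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(path_and_euclidean_distance(tortuous))
print(path_and_euclidean_distance(straight))
```

Both hypothetical dendrites have the same path length, so they contribute identically to a path-distance bin, whereas their endpoints lie at different Euclidean distances from the soma.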
Structure and spine density analysis were conducted in NeuroExplorer, and values were transferred to SAS for statistical analysis. Because of variability between reconstructors, randomly selected cells were reconstructed multiple times by different reconstructors; such repetition was distributed among experimental conditions, and we included reconstructor as an independent factor in all analyses. In addition, care was taken so that all potential subgroups (such as hemisphere relative to the learned turn) were represented within each training condition.
The untrained control group in the morphology section included fully naive rats which were never food restricted or regularly handled. As reported in results, naive neuronal morphology measures were statistically indistinguishable from those of our habituated controls. Only habituated controls were used for all other sections of the study.
Analysis.
Figures were made using IGOR (v6.1.2.1). Statistical analysis was performed in SAS (v9.3, SAS Institute). The procedure GLM (general linear model) was used to carry out ANOVA and repeated-measures ANOVA, and GLM contrast was used for post hoc comparisons. The procedure FREQ was used to carry out χ² analyses. The procedure TTEST was used to assess plasticity in habituated controls, and to compare plasticity across hemispheres. Error bars in all graphs show SEM.
Additional statistical analyses were conducted within rat, e.g., to compare plasticity between hemispheres or between regions. For these analyses, we calculated the difference between evoked plasticity in each trained rat and mean evoked plasticity in habituated controls (plasticity change). Then, we calculated the difference between the ipsilateral and contralateral plasticity change (averaged over the final 15 min after induction) both after the first and third induction period, for each region. In addition, we calculated the difference between dorsomedial and dorsolateral plasticity change for each hemisphere. Note that the sample size is much smaller for these within-rat comparisons because they were applied only to rats for which both measurements were collected using the same induction protocol. To assess independence of LTP and LTD we used the procedure CORR applied to the plasticity change in rats for which both LTP and LTD were collected in the same hemisphere of dorsomedial striatum. This analysis was applied only to the ipsilateral hemisphere, because we had insufficient samples (n = 3) in the contralateral hemisphere.
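The within-rat quantities described above reduce to simple differences. The sketch below uses illustrative function names and hypothetical values (percentage of baseline amplitude, averaged over the final 15 min after induction); it is not the SAS code used for the analysis.

```python
from statistics import mean


def plasticity_change(trained_pct, control_pcts):
    """Evoked plasticity in one trained rat minus the mean of habituated controls."""
    return trained_pct - mean(control_pcts)


def within_rat_difference(change_a, change_b):
    """Difference between two plasticity changes measured in the same rat,
    e.g., ipsilateral minus contralateral, or dorsomedial minus dorsolateral."""
    return change_a - change_b


# Hypothetical values, all in percentage of preinduction baseline.
dm_controls = [130.0, 150.0]   # dorsomedial, habituated controls
dl_controls = [60.0, 70.0]     # dorsolateral, habituated controls

ipsi_change = plasticity_change(135.0, dm_controls)
contra_change = plasticity_change(100.0, dm_controls)
print(within_rat_difference(ipsi_change, contra_change))   # hemispheric difference

dm_change = plasticity_change(80.0, dm_controls)
dl_change = plasticity_change(90.0, dl_controls)
print(within_rat_difference(dm_change, dl_change))          # inter-regional difference
```

For the hemispheric comparison within one region, the control mean is common to both terms and cancels; for the inter-regional comparison, each region is referenced to its own control mean, so the subtraction of controls affects the result.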
For plasticity graphs and statistical analysis, n is number of experiments, with not more than one experiment per slice, and not more than two identical treatments collected from the same animal. For within-rat paired comparisons, n is the number of rats, and when applicable, the plasticity change was averaged over two identical treatments of a single rat. For intrinsic excitability and morphology graphs and statistical analysis, n is number of cells, with not more than two cells (one medial and one lateral) per slice, and not more than two cells from the same region collected from the same animal. Means are reported ± SEM, and in all graphs error bars illustrate SEM.
Results
T-maze strategy transition distinguishes early- and late-trained groups
To investigate the involvement of distinct striatal regions as learning progresses, we trained rats in T-maze navigation (Fig. 1A,B). Maze training transitions subjects through recognizable performance stages; in particular, subjects demonstrate a place strategy during maze acquisition and a response strategy once maze navigation is an acquired skill (Tolman et al., 1946; Dunnett and Iversen, 1981).
To confirm the place-to-response transition, we examined strategy use during the final probe at both trained stages. There was a significant relationship between strategy at final probe and training stage (χ²(1, N = 52) = 4.74, p = 0.0295), such that early-trained rats made greater use of a place strategy, whereas late-trained rats predominantly demonstrated a response strategy (Fig. 1B,C). Time to reach reward decreased markedly before the first probe (Fig. 1D), suggesting rapid acquisition of the reward location. Frequency of rats' visual inspection of alternative choice arms before action selection, termed VTE, was analyzed as this represents a behavioral correlate of place strategy use and attentive decision-making associated with dorsomedial striatal engagement (Schmidt et al., 2013). VTE was most frequent at the first probe, and declined across training and probes (Fig. 1E). Both strategy during probe trial and frequency of VTE demonstrated strategy transition throughout the course of training.
We assessed several alternative factors that could have influenced T-maze performance. We verified that experimental groups were not different preceding training, or at the time of first probe. There was a positive correlation between weight at time of final probe and both time to reach performance criteria and time to reach the reward on first and final probes (GLM: F(1,51) = 5.09 p = 0.0284, days to criteria; F(1,51) = 4.88 p = 0.0317, Probe 1 time to reward; F(1,51) = 4.46 p = 0.0398, final Probe time to reward). This suggested that heavier rats were less food-motivated. However, weight did not influence final probe strategy (GLM: F(1,51) = 2.55 p = 0.1164). Habituated, early- and late-trained groups did not differ by weight at time of final probe (GLM: F(2,64) = 0.56 p = 0.5731); early- and late-trained groups did not differ in days required to meet performance criteria (GLM: F(1,51) = 1.1 p = 0.2997), or in strategy use at first probe (χ²(1, N = 52) = 0, p = 1). Thus, the only factor that predicted strategy at the time of final probe was training stage.
In summary, our early- and late-trained groups differed significantly in navigation strategy. Early-trained rats used a strategy associated with attentive performance and dorsomedial striatal engagement more frequently than late-trained animals, which more often used a strategy linked to skilled performance and dorsolateral striatal engagement. We proceeded to examine diverse neuronal measurements across subregions and training stages to test for physiological differences corresponding to these behaviors.
Striatal changes with learning
A recent study showed that motor skill learning alters evoked striatal plasticity (Yin et al., 2009), and here we build on this work by evaluating subregion-specific changes in synaptic plasticity relative to the learned turn, by examining LTP alongside LTD, and by examining morphology as well as intrinsic excitability in neurons after training. Because the T-maze training is a lateralized task (each rat learns to seek food on only one side of the maze), we assessed changes in striatal synaptic plasticity not only in dorsomedial and dorsolateral subregions during early versus late stages of learning, but also in hemispheres both ipsilateral and contralateral to the rewarded turn (Fig. 2A). We assessed striatal synaptic plasticity and excitability through extracellular recordings, and examined morphology from reconstructions of MSNs patched for intrinsic excitability measurement.
Striatal synaptic plasticity
We measured change in population spike amplitude to assess corticostriatal synaptic plasticity in ex vivo brain slice, similar to others (Akopian et al., 2000; Yin et al., 2009; Adermark et al., 2011). Synaptic plasticity was measured in response to a series of inductions repeated at 30 min intervals. To identify plasticity modulation with maze learning, we compared evoked plasticity (both LTP and LTD) among habituated, early-trained, and late-trained groups. Below, we present dorsomedial change in LTP first, dorsomedial change in LTD second, and dorsolateral change in LTD third. Inter-regional comparison, as well as analysis by strategy and turn direction, is detailed at the end.
Dorsomedial LTP magnitude was significantly reduced in early-trained rats in the contralateral, but not the ipsilateral hemisphere. Habituated controls exhibited robust dorsomedial LTP in response to TBS (138 ± 8% 85–90 min postinduction; t(9) = 4.4 p = 0.0012). Statistical analysis demonstrated that training stage modified LTP exclusively within the hemisphere contralateral to the learned turn (GLM repeated: F(2,31) = 4.55 p = 0.0185, contralateral; F(2,33) = 1.04 p = 0.3617, ipsilateral). Post hoc comparison to habituated controls showed contralateral LTP magnitude was significantly reduced only for early-trained (p = 0.0055) and not late-trained (p = 0.2215) groups. As illustrated in Figure 2B, the same TBS which produced pronounced LTP in habituated controls instead evoked transient depression in the contralateral hemisphere of early-trained rats. Note that in the ipsilateral hemisphere the early-trained (but not late-trained) group exhibited a transient depression in population spike amplitude immediately following induction, indicated by a within-subjects time × stage interaction (GLM repeated: F(34,561) = 1.56 p = 0.0246, stage × time; post hoc vs habituated: p = 0.0373 early, p = 0.9748 late). However, in the ipsilateral hemisphere this transient depression was only evident after the second and third inductions, and the final magnitude of LTP was not significantly altered (Fig 2B). Comparing synaptic plasticity across hemispheres within each rat permits each animal to serve as its own control, though we were not able to collect contralateral and ipsilateral recordings for each subject. Nonetheless, comparison across hemispheres generally agrees with the original results (using all subjects), as demonstrated in Table 1. The hemispheric difference in dorsomedial LTP for early-trained subjects is consistent for both induction periods, though not reaching significance due to reduced sample size. 
For late-trained subjects, there were too few samples to draw a meaningful conclusion about the hemispheric difference in dorsomedial LTP.
Dorsomedial LTD magnitude was reduced in late-trained rats in both hemispheres. Habituated controls exhibited robust dorsomedial LTD in response to 20 Hz stimulation (50 ± 5% 85–90 min postinduction; t(7)= 9.3 p < 0.0001). Statistical analysis confirmed a significant main effect of training stage within the hemisphere contralateral to the learned turn (Fig. 2C; GLM repeated: F(2,31) = 3.78 p = 0.0339, contralateral; F(2,32) = 0.65 p = 0.5268, ipsilateral). Post hoc comparison to habituated controls showed that contralateral LTD was unchanged for early-trained rats (p = 0.5079), and was reduced for late-trained rats (p = 0.0173). In addition, within the ipsilateral hemisphere, we found a significant within-subjects time × stage interaction indicating reduced LTD compared with habituated controls for both training stages (GLM repeated: F(34,527) = 1.51 p = 0.0338, stage × time; post hoc vs habituated: p = 0.0034 early, p = 0.0031 late; Fig. 2C). Despite the reduced LTD ipsilaterally at some time points for early-trained animals, the within-rat comparison across hemispheres confirms no hemispheric difference in dorsomedial LTD for late-trained animals, and does not support lateralization in dorsomedial LTD for early-trained animals (Table 1).
Dorsolateral LTD magnitude was reduced in late-trained rats exclusively in the hemisphere ipsilateral to the learned turn. Habituated controls exhibited robust dorsolateral LTD in response to 20 Hz stimulation (62.2 ± 7% 85–90 min postinduction; t(9) = 5.3 p = 0.0005). Training stage did not produce altered LTD immediately following induction; instead, a significant time × stage effect exclusively within the ipsilateral hemisphere (Fig. 2D; GLM repeated: F(34,595) = 3.0 p < 0.0001, time × stage) demonstrated a marked reduction in the persistence of LTD. Post hoc analysis indicated difference from controls is restricted to the late-trained group (F(17,595) = 0.42 p = 0.9810, early; F(17,595) = 3.84 p < 0.0001, late). Significant interhemispheric difference in late-trained dorsolateral LTD is supported by the within-rat comparison of ipsilateral and contralateral LTD (Table 1).
To investigate inter-regional plasticity, we used a within-rat analysis to compare LTD in dorsomedial versus dorsolateral striatum. Despite a reduced sample size, this analysis shows inter-regional difference in LTD after the second induction (p = 0.012) for late-trained rats, but only in the contralateral hemisphere. This supports subregional independence of plasticity at the late-trained stage, and strengthens the finding of lateralization of dorsolateral late-trained LTD.
Because the early-trained rats were evenly split between place and response strategy, we further analyzed whether strategy was a better predictor of synaptic plasticity than training stage. First, we performed the repeated measures analysis using strategy instead of stage (with strategy = NA for habituated controls). Neither strategy nor the strategy by time interaction term was significant for any striatal region or induction protocol. Then, we compared evoked synaptic plasticity (change from baseline averaged over the final 15 min) among three groups: early-trained rats using place strategy, early-trained rats using response strategy, and late-trained rats using response strategy (Fig. 3). Late-trained rats using a place strategy were excluded because of insufficient numbers. Again, strategy was not a significant predictor of plasticity. Note that the rat's strategy must be determined using a single probe trial; thus it may not accurately identify a rat as attentive versus automatic. Figure 3A seems to suggest that LTP is reduced bilaterally early in learning in rats using a place strategy; however, we do not have sufficient samples to test this in the present study. In summary, this analysis suggests that training stage, in which early-trained rats are defined by a behavioral performance criterion, is a better predictor of the change in evoked synaptic plasticity than is rats' performance on standard T-maze strategy probes.
To verify that the changes in synaptic plasticity with training appeared for both turn directions, we performed the same GLM repeated measures analysis by training stage separately for rats rewarded during training for turning east (right-turning) and for rats rewarded during training for turning west (left-turning). Table 2 shows that, for the most part, the effect of training stage was observed for both turn directions. For LTP, training stage significantly influences synaptic plasticity contralaterally but not ipsilaterally, both for rats rewarded for turning east and (at trend level) for those rewarded for turning west. For dorsomedial LTD, training stage significantly influences synaptic plasticity contralaterally both for rats rewarded for turning east (at trend level) and for those rewarded for turning west. Interestingly, ipsilateral LTD in both dorsomedial and dorsolateral striatum shows a significant training stage effect exclusively within rats trained to turn west. Should this finding be replicated, it would suggest some degree of direction-specificity of the task within the striatum. In summary, the majority of plasticity findings derived independently within east- or west-rewarded groups show good correspondence to results derived from all subjects. This confirms that synaptic plasticity is modulated with respect to a learned turn in either direction.
Together our findings reveal novel patterns coupling learning stages with altered synaptic plasticity relative to the learned turn. Hemisphere-specific changes in dorsomedial synaptic plasticity align with early training, at which point reduced (eliminated) LTP is observed contralaterally, without a change in LTD, suggesting that LTP and LTD are modified independently. This independence is confirmed by the lack of correlation between LTP and LTD in the ipsilateral hemisphere of dorsomedial striatum (within-rat comparison, n = 5, p > 0.7; low sample size prevents correlation analysis of contralateral hemisphere). Hemisphere-specific change in dorsolateral synaptic plasticity aligns with late training, at which point reduced LTD is observed in the ipsilateral striatum. Thus hemisphere-specific (i.e., turn-relative) synaptic plasticity differences are present dorsomedially early in training and dorsolaterally late in training.
Excitability
In addition to synaptic plasticity changes, plasticity in excitability may be integral to learning. Altered excitability can directly facilitate transmission of signals in support of learned behavior, and may provide a metaplastic backdrop modulating synaptic plasticity's direction or impact (Abraham, 2008; Rogerson et al., 2014). Recognizing that excitability changes in striatal MSNs during T-maze training could be important for learning, we assessed population excitability through extracellular IO curves and several MSN intrinsic excitability measures.
Extracellular IO curves related strength of afferent depolarization to striatal population spike amplitude and were collected preceding induction for synaptic plasticity experiments. Statistical analysis of extracellular IO curves within habituated control rats revealed no difference between striatal regions (GLM: F(1,50) = 0.24, p = 0.62). Comparing training stages, we found a significant training effect in the IO curve shape (F(2,271) = 5.67, p = 0.0039). Specifically, peak output was smaller in dorsomedial striatum in late-trained rats, in the hemisphere contralateral to the learned turn (Fig. 4A; post hoc comparison to habituated control, p = 0.0009). No difference from controls was detected in early-trained rats, in late-trained rats ipsilateral to the learned turn, nor in any dorsolateral group. Importantly, synaptic plasticity results were not due to differences in extracellular responsiveness as half-maximal current from IO curves was used for all evoked synaptic plasticity experiments, and did not differ among groups.
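One way to obtain a half-maximal stimulation current from an IO curve is linear interpolation between the two measured points that bracket half of the peak response. The sketch below assumes this approach; the function name and the interpolation choice are illustrative assumptions, since the exact procedure is not described in this section.

```python
def half_max_current(currents, responses):
    """Interpolate the stimulation current that evokes half of the peak response.

    Assumptions (hypothetical, for illustration only):
    - `currents` is sorted in ascending order;
    - `responses` holds the matching population spike amplitudes;
    - the first upward crossing of the half-maximal level is the one of interest.
    """
    if len(currents) != len(responses) or len(currents) < 2:
        raise ValueError("need at least two matched (current, response) points")
    target = max(responses) / 2.0
    points = list(zip(currents, responses))
    for (i0, r0), (i1, r1) in zip(points, points[1:]):
        # Find the first pair of points that brackets the target on a rising limb.
        if r0 <= target <= r1 and r1 != r0:
            return i0 + (target - r0) * (i1 - i0) / (r1 - r0)
    raise ValueError("IO curve never crosses its half-maximal level")
```

Using each slice's own half-maximal current keeps the induction stimulus at a comparable point on the IO curve across groups, which is the rationale given above for ruling out responsiveness as a confound.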
We examined intrinsic excitability of single MSNs across training groups and striatal regions. Specifically, we measured resting membrane potential (RMP), rheobase, both IRpos and IRneg to RMP, evoked spiking, and spike latency during somatic current injection. We examined habituated controls for inter-regional differences before learning the T-maze, and found a small but significant difference in RMP (Fig. 4B; GLM: F(1,22) = 5.04, p = 0.0356; dorsomedial: −81.18 ± 0.6 mV, dorsolateral: −79.89 ± 0.4 mV) which disappeared with training. No other whole-cell measure differed between regions for habituated controls. Analysis by region and across training stages showed significant changes in RMP with training for dorsomedial striatum, such that MSNs from early-trained animals are more depolarized at rest, and return to control-matched RMP by late training (Fig. 4B; GLM: F(2,53) = 4.09, p = 0.0226; RMP(mV): −81.18 ± 0.6 habituated, −78.99 ± 0.6 early, −81.67 ± 1.1 late). RMP did not change for dorsolateral cells.
Several complementary inter-regional differences in intrinsic excitability measures indicate dorsomedial intrinsic excitability was increased relative to dorsolateral in early-trained animals; each of these differences was restricted to the contralateral hemisphere. When contralateral and ipsilateral hemispheres were considered together, we found that the differences between dorsomedial and dorsolateral in rheobase (Fig. 4C) and IRpos (Fig. 4D) were significantly modulated with training stage (F(2,40) = 3.37, p = 0.0448, rheobase; F(2,40) = 3.28, p = 0.0487, IRpos). Reduced rheobase and increased input resistance dorsomedially contributed to a trending left-shift in the IF curve for the dorsomedial relative to dorsolateral striatum in early training (Fig. 4E; GLM: F(2,40) = 3.23, p = 0.0506, IF half-max). Analysis by hemisphere relative to the learned turn revealed that each of these inter-regional differences was highly significant for the contralateral hemisphere (Fig. 4C–E; GLM: F(1,21) = 8.5, p = 0.0086, rheobase; F(1,21) = 8.36, p = 0.009, IRpos; F(1,21) = 9.95, p = 0.005, IF half-max) but not for the ipsilateral hemisphere (GLM: F(1,22) = 3.06, p = 0.0949, rheobase; F(1,22) = 3.58, p = 0.0725, IRpos; F(1,22) = 3.88, p = 0.0621, IF half-max). By late training, inter-regional differences were absent within and across hemispheres. No regional or training-related change was detected for spike latency. Intrinsic excitability measures are summarized in Table 3.
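For readers less familiar with these measures, the sketch below shows how rheobase and input resistance are commonly derived from somatic current steps. It assumes rheobase is the smallest injected current that evokes at least one spike and that input resistance follows Ohm's law from a steady-state voltage deflection; the function names and data layout are hypothetical and do not reproduce the protocol used here.

```python
def rheobase_pA(steps):
    """Smallest injected current (pA) that evoked at least one spike.

    `steps` is a list of (current_pA, spike_count) pairs (hypothetical layout).
    """
    spiking = [current for current, n_spikes in steps if n_spikes > 0]
    if not spiking:
        raise ValueError("no current step evoked a spike")
    return min(spiking)

def input_resistance_megaohm(delta_v_mV, delta_i_pA):
    """Input resistance from Ohm's law, R = dV / dI.

    mV / pA gives gigaohms, so the ratio is multiplied by 1000 to report megaohms.
    A positive current step would give an IRpos-like value and a negative step
    an IRneg-like value (both deflections measured relative to rest).
    """
    if delta_i_pA == 0:
        raise ValueError("current step must be nonzero")
    return 1000.0 * delta_v_mV / delta_i_pA
```

A lower rheobase together with a higher input resistance means that less current is needed to reach threshold, which is why the two changes jointly shift the IF curve to the left.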
In summary, changes in intrinsic excitability measures combine to show transient enhancement in excitability for dorsomedial relative to dorsolateral striatum. This enhancement emerges during early learning and dissipates with late training. Importantly, inter-regional excitability differences emerge in a turn-relative pattern (exclusive to the hemisphere contralateral to the learned turn), connecting intrinsic excitability modulation to behavioral modification with early training.
Morphology
Morphological changes, such as new spine growth, are reported with learning (Knott and Holtmaat, 2008). We therefore reconstructed the same MSNs from which whole-cell intrinsic excitability measures were collected to investigate whether morphology covaried with learning. For each reconstructed neuron, morphological measurements included spine density, number of primary dendrites, total dendritic length, and both dendritic length and number of dendritic branch points as a function of path distance (as opposed to Sholl distance) from the soma. Spine density (counted from images at 100× magnification; Fig. 5B), number of branch points, and dendritic length were analyzed in 20 μm bins out to 120 μm from the soma; beyond this distance the number of usable samples decreases.
Data on spine density do not reveal an effect of training, but do show that, for all training stages, spine density is low near the soma and rises to a peak at ∼60 μm, as has been reported for MSNs (Berlanga et al., 2011). The dependence of spine density on distance from the soma is statistically significant (GLM repeated: F(5,255) = 97.07, p < 0.0001, distance), but spine density does not differ by training stage (F(10,225) = 0.46, p = 0.8466, stage × distance). Note that spine density also varies by reconstructor (GLM repeated: F(10,225) = 7.63, p < 0.0001, reconstructor × distance) but the interaction term reconstructor by stage is not significant, suggesting that difference in reconstructor style does not obscure a difference due to training. Spine density also does not differ by hemisphere (GLM repeated: F(10,255) = 0.9727, p = 0.9266, hemisphere × distance) or by striatal region (GLM repeated: F(5,260) = 0.98, p = 0.4072, region × distance). Figure 5 shows spine density by training stage, distance, and either hemisphere (Fig. 5D,E; collapsed across region) or region (Fig. 5F,G; collapsed across hemisphere). Our results suggest T-maze learning occurs without persistent alteration in MSN spine density.
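The distinction between path distance and radial (Sholl-type) distance, and the 20 μm binning, can be made concrete with a minimal sketch. It assumes a reconstruction stored as a tree of nodes, each with a parent and 3D coordinates in μm, listed so that every parent precedes its children. The function name, the data layout, and the choice to assign each segment to the bin containing its midpoint are illustrative assumptions, not the reconstruction software's method.

```python
import math

def binned_dendritic_length(nodes, bin_um=20.0, max_um=120.0):
    """Sum dendritic segment length in bins of path distance from the soma.

    `nodes` maps node_id -> (parent_id or None, x, y, z), with the soma as the
    only node whose parent is None and parents listed before their children
    (hypothetical layout). Each segment is assigned to the bin containing its
    midpoint, a simplification that is reasonable when segments are short
    relative to the bin width.
    """
    n_bins = int(max_um // bin_um)
    totals = [0.0] * n_bins
    path = {}  # path distance (um) from the soma, measured along the dendrite
    for node_id, (parent, x, y, z) in nodes.items():
        if parent is None:
            path[node_id] = 0.0
            continue
        _, px, py, pz = nodes[parent]
        segment = math.dist((x, y, z), (px, py, pz))
        path[node_id] = path[parent] + segment
        midpoint = path[parent] + segment / 2.0
        index = int(midpoint // bin_um)
        if index < n_bins:
            totals[index] += segment
    return totals
```

Because path distance accumulates along the dendrite, a tortuous branch falls into a more distal bin than its straight-line distance from the soma would place it, which is the practical difference from a radial analysis.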
A remarkable change in dendritic arbor complexity with training is evident through analysis of 40× reconstructions (Fig. 5C). Changes with training are illustrated in Figure 5A by representative dendrograms and reconstructions from an untrained and from a trained animal (example cells are habituated and late-trained, respectively), which show a reduced number of dendrites for the trained animal. Cumulative dendritic length varies by training stage (GLM: F(2,54) = 14.21, p < 0.0001), but neither reconstructor nor the interaction term training stage × reconstructor is significant (type III SS: F(4,54) = 1.85, p = 0.1378, reconstructor; F(8,54) = 1.11, p = 0.379, reconstructor × stage), indicating that reconstructor difference does not produce the training stage effect. Relative to controls, cumulative dendritic length is reduced in trained animals, but shows no difference between early- and late-trained groups (GLM contrast: p = 0.0002, early vs untrained; p = 0.0047, late vs untrained; p = 0.329, early- vs late-trained). Table 4 summarizes cumulative dendritic length (which encompasses the influence of reduced branches) by region, stage, and hemisphere. For each of the four groups defined by region and hemisphere, a separate GLM of cumulative dendritic length by training stage was performed. All but the dorsolateral contralateral region showed a significant training effect (p < 0.0403), and post hoc contrast indicates difference from untrained, but not between early- and late-trained except in the dorsomedial ipsilateral region (Table 4).
More detailed analysis of dendritic arbors by 20 μm distance bins from the soma confirms that training stage influences number of branch points (F(2,52) = 11.14, p < 0.0001) and also dendritic length (GLM repeated: F(2,52) = 14.21, p < 0.0001). Post hoc analysis at various path distances shows fewer branch points in trained animals between 21 and 100 μm from the soma (p < 0.0494 for all bins in this range), with no difference between early- and late-trained animals. Similarly, dendritic length is reduced in trained animals between 21 and 120 μm from the soma (p < 0.0083 for all bins in this range) with no difference between early- and late-trained animals. The number of primary dendrites is unchanged across groups, reflected by no difference in either branch point number or dendritic length 0–20 μm from the soma (F(2,54) = 0.76, p = 0.4729; F(2,54) = 1.83, p = 0.1706, respectively).
Within trained animals, we tested the effects of region and hemisphere separately. Collapsing across hemisphere, there is no difference between dorsomedial and dorsolateral regions in the number of branch points (GLM repeated: F(1,33) = 0.11, p = 0.7469) or in dendritic length (GLM repeated: F(1,33) = 0.06, p = 0.8066). Collapsing across region, there is no difference between hemispheres in the number of branch points (GLM repeated: F(1,33) = 3.84, p = 0.0584), but there is a difference in dendritic length between hemispheres (GLM repeated: F(1,33) = 4.92, p = 0.0336): the reduction in dendritic length with training appears greatest ipsilaterally. Further analysis of dendritic length by region shows this hemispheric difference is specific to dorsolateral MSNs (Fig. 6H; GLM repeated: F(1,15) = 5.61, p = 0.0318). We found no other regional or hemispheric difference in morphology distinguishing MSNs from trained animals (Figs. 5D–G, 6A–G).
The lack of difference between early- and late-trained rats is in marked contrast with the electrophysiology data. To ensure that the training effect observed in morphology was not due to a difference in handling between trained and untrained animals, the morphological analysis presented above includes an additional control group: neurons from naive rats that were neither food-restricted nor regularly handled. Analysis shows that morphology is statistically indistinguishable when comparing habituated controls and naive rats (GLM repeated: spine density, F(5,100) = 0.25, p = 0.9385, stage × distance; branch points, F(5,90) = 1.28, p = 0.28, stage × distance; dendritic length, F(5,90) = 1.32, p = 0.26, stage × distance). In terms of handling and length of time in food restriction, the habituated and naive controls are quite different (0 d for naive controls vs 11 ± 1.1 d for habituated controls), whereas habituated controls and the early-trained animals are quite similar (differing by ∼6 d). Early- and late-trained rats differ considerably more in time spent experiencing handling and food restriction (∼13 d). This strongly suggests that difference in experience outside of maze training cannot explain the morphological changes we find in trained rats. In summary, the morphology results reveal a change in the dendritic arbors of adult MSNs, which is specific to training, but which does not distinguish our early- and late-trained groups.
Discussion
We analyzed bidirectional synaptic plasticity, population and single-cell excitability, and morphology from MSNs to investigate the contributions of anatomical and task-defined dorsal striatal regions to maze learning. Our data reveal independently altered LTP and LTD, as well as changes in intrinsic excitability and dendritic remodeling not previously reported with learning. Importantly, this is the first study describing lateralization in evoked striatal plasticity relative to the direction an animal is trained to turn.
Consistent with previous reports, our early-trained group shows variability in T-maze strategy, whereas a response strategy predominates in late-trained rats (Packard, 1999; Yin and Knowlton, 2004; Lex et al., 2011). Fittingly, VTEs suggesting heightened spatial awareness and deliberative decision-making (Papale et al., 2012; Schmidt et al., 2013) are most frequent early in training. Early strategy variability may arise from exploratory behavior in spatially attentive rats started in a novel maze arm; this is consistent with the elevated VTE on the first probe and could explain greater correlation of physiology to training stage than to strategy. Because pausing on the maze without committing a nose poke into a maze arm could represent exploratory behavior without being scored as a VTE, VTE count may not be ideal for correlating with physiology either. Nonetheless, rats' reduced VTE, swift maze completion, and a predominant response strategy indicate progress toward habitual performance with late training. Recordings in vivo show that modulation in MSN activity that corresponds to learning success emerges dorsomedially first, and develops dorsolaterally later in training (Yin et al., 2009; Thorn et al., 2010). Furthermore, lesion studies reveal that dorsomedial striatum (working with the hippocampus) is required for goal directed behavior and spatially attentive learning (Moussa et al., 2011; Lee et al., 2014), whereas dorsolateral striatum is required for automatic responses to stimuli and habit development with overtraining (Yin and Knowlton, 2004). Therefore, a shift from spatially attentive toward automatic or habitual performance suggests that a shift from dorsomedial to dorsolateral engagement distinguishes our early- and late-trained rats.
Hemisphere-specific findings are consistent with prior studies demonstrating striatal engagement and plasticity during lateralized behavior. Unilateral striatal lesions promote turning toward the lesioned hemisphere (Ungerstedt et al., 1969), and MSN firing is negatively correlated with ipsilateral (Bryden et al., 2012) and positively correlated with contralateral turning behavior (Cui et al., 2013). NMDA subunit composition is modified in opposite directions within striatal hemispheres relative to the reaching limb (Kent et al., 2013). A recent publication demonstrated in vivo potentiation of corticostriatal local field potentials contralateral to the direction a rat is trained to nose poke (Xiong et al., 2015). These studies confirm the importance of lateralization in our results.
Our findings show agreement with Yin et al. (2009), the only other study to illustrate change in evoked dorsal striatal plasticity with skill learning. Their study found a depressed AMPA:NMDA ratio dorsolaterally with rotarod overtraining, which complements our finding of reduced LTD (ipsilateral) in this subregion with late T-maze training. Contralaterally, we see the same late-trained flattening of dorsomedial IO curves and the same dorsomedial reduction in evoked LTD without dorsolateral reduction. Ipsilaterally we find the same dorsomedial LTD reduction but without the flattened IO curves. In contrast to Yin et al. (2009), we find dorsolateral LTD reduced rather than enhanced relative to controls. This small divergence in results may be attributed to our use of habituated controls, whereas those by Yin et al. (2009) were naive, representing subtly different time points on the spectrum from task-naive to overtrained. In addition, locomotor and cognitive demands distinguish rotarod and T-maze learning.
Our results do not distinguish between direct pathway and indirect pathway MSNs, although intracellular cascades and neuromodulation critical to bidirectional plasticity differ between pathways (Shen et al., 2008; Ding et al., 2010; Lerner and Kreitzer, 2012). Despite pathway differences, postsynaptic calcium elevation is critical to bidirectional plasticity in both (Wang et al., 2006; Pawlak and Kerr, 2008; Shen et al., 2008). Putatively motor-enhancing direct and putatively motor-suppressing indirect pathway MSNs (Kravitz et al., 2010) show the same pattern of activity dependent calcium elevation with turning behavior (Cui et al., 2013). Therefore, it is likely that both pathways contribute to the learning-associated changes in plasticity measured in the present study, though the direction of plasticity may differ between pathways. With goal-directed learning (Shan et al., 2014), the AMPA:NMDA current ratio is potentiated in direct and depressed in indirect pathway MSNs dorsomedially. Those findings complement the dorsomedial reductions we find in LTP (contralateral) and LTD (ipsilateral) after early training, and suggest our results may reflect change in direct and indirect pathways, respectively.
Reduction in evoked synaptic plasticity after learning may indicate occlusion (Whitlock et al., 2006; Yin et al., 2009; Padmashri et al., 2013) or other metaplastic processes, i.e., processes influencing the extent to which plasticity can occur (Abraham, 2008). For instance, metaplasticity may constrain capacity for off-task potentiation at dorsomedial synapses that were not recently potentiated; such off-task damping could permit a fine pattern of task-relevant LTP to be comparatively enhanced. On the other hand, if striatal LTP and LTD are reciprocally regulated, as explained by the Bienenstock–Cooper–Munro plasticity theory (Cooper and Bear, 2012), then reduction in evoked plasticity is likely to be occlusion, i.e., the saturation of plasticity in either direction. This was the framework for inferring that enhanced striatal LTD ex vivo indicates a history of LTP during learning (Lin, 2010). Alternatively, learning may regulate bidirectional plasticity forms independently rather than reciprocally within the striatum. We were able to distinguish these possibilities using 20 Hz to induce LTD together with TBS to induce LTP (Hawes et al., 2013). In dorsomedial striatum, we find reduced LTP without concomitant increase in LTD early in training, as well as reduced LTD without concomitant increase in LTP late in training. This refutes a reciprocal relationship between LTP and LTD. Whether reduced plasticity indicates recent occlusion or other metaplastic processes remains unclear, but our results establish that learning modulates both striatal LTP and LTD, and that these are modulated independently.
Learning and plasticity can be supported not only by synaptic change, but also by modified intrinsic excitability (Frick and Johnston, 2005; Sehgal et al., 2013; Rogerson et al., 2014). Our whole-cell measures collectively show dorsomedial MSNs to be more excitable than dorsolateral MSNs early in training, specifically within the contralateral hemisphere. Altered intrinsic excitability has been causally linked to synaptic plasticity in vivo (Epsztein et al., 2011; Lee et al., 2012). Thus, in the contralateral hemisphere, where a neural pattern driving the rewarded turn is expected to emerge, greater excitability may reflect greater capacity for information encoding dorsomedially with early training.
Excitability adjustments linked to learning are often accomplished through potassium channel regulation (Alkon, 1979; Disterhoft and Oh, 2006). The reduced LTP we report in early-trained rats may be caused by a period of elevated fast KA-type potassium channel activity, given that blocking KA currents enhances hippocampal LTP, and that these channels are transiently upregulated in the striatum with learning (Truchet et al., 2012). The slow KA current does not appear to contribute to striatal learning, as latency to the first action potential was unchanged. The curtailed LTD persistence we see with late training may be linked to enhanced SK-type potassium channel activity, as blocking these channels converts transient depression to LTD (Hopf et al., 2010). In rigidly habitual animals, blocking SK restores both goal-oriented behavior and LTD (Nazzaro et al., 2012). Aligning identified currents modulating MSN excitability with their influence on learning behavior is an important next step.
Neuronal morphology can influence excitability and plasticity by altering electrotonus and synapse distribution. In contrast to regional MSN hypertrophy which follows chronic stress (Dias-Ferreira et al., 2009), we observe reduced dendritic complexity after training in both regions. Similar dendritic reduction is reversibly induced in adult rodents by manipulating dopamine receptors or inwardly rectifying potassium channel activity (Cazorla et al., 2012). Thus, ion channel modifications with learning potentially unite our electrophysiological and morphological findings. Both increased spine density and spine loss have been observed with learning and LTP in the neocortex and hippocampus (Knott and Holtmaat, 2008). New spine growth suggests new information pathways (Kuhlman et al., 2014; Yang et al., 2014), whereas spine loss potentially enhances signal-to-noise ratio (Lai et al., 2012; Sanders et al., 2012). Whereas MSN spine density is unchanged, loss of dendritic length after training likely accompanies a reduction in synapses. Thus, dendritic pruning may enhance signal-to-noise ratios in the striatum.
This study gives novel insight into dorsal striatal changes enabling T-maze learning in the context of the classic concept of early dorsomedial and late dorsolateral engagement. Our data suggest that early, task-oriented dorsomedial activity supports a rewarded turn (engaging contralateral LTP). With late training, task-oriented plasticity appears dorsolaterally (engaging ipsilateral LTD), and may function to suppress unrewarded turns. Also with late training, engaging LTD bilaterally in the dorsomedial striatum may suppress recently relevant, as well as distracting, new information soon after a novel task has been learned (Ragozzino, 2003, 2007). Patterns of task-oriented plasticity will be useful in future studies intent on dissecting striatal adaptations responsible for cognitive and locomotor aspects of skill learning.
Footnotes
This work was supported by ONR Grant MURI N00014-10-1-0198.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Kim T. Blackwell, George Mason University, Krasnow Institute, MS 2A1, Fairfax, VA 22030-4444. kblackw1@gmu.edu