Abstract
Motor skills learned through practice are consolidated at a later time, which can include nighttime, but the time course of motor memory consolidation and its underlying mechanisms remain poorly understood. We investigated neural substrates underlying motor memory consolidation of learned changes in birdsong, a tractable model system for studying the neural basis of motor skill learning. Previous studies in male zebra finches and Bengalese finches have demonstrated that adaptive changes in adult song structure learned through a reinforcement paradigm are initially driven by a cortical-basal ganglia circuit, and subsequently consolidated into downstream cortical motor circuitry. However, the time course of the consolidation process, including whether it occurs offline during nighttime or online during daytime, remains unclear and even controversial. Here, we provide experimental evidence in both species of virtually no consolidation of learned vocal changes during nighttime. We demonstrate instead that the consolidation occurs during daytime and the amount of consolidation is strongly correlated with the amount of learning, suggesting online, performance-dependent mechanisms of consolidation of learned vocal changes. Moreover, by using computer simulations based on our experimental results, we demonstrate that such online, performance-dependent consolidation can account for the contradicting conclusions concerning the time course of the consolidation process reached by previous studies. These results thus reconcile a controversy in the study of vocal motor consolidation in songbirds, and illustrate the neural substrates through which newly learned motor skills initially implemented by cortical-basal ganglia circuits become encoded in the cortical motor circuitry.
SIGNIFICANCE STATEMENT Motor skills learned through repetitive practice become stable and are consolidated into cortical motor circuits. We investigate neural substrates of this “motor memory consolidation” in adult songbirds, which produce songs that are complex motor skills learned and maintained through repetitive vocal practice. We demonstrate that learned changes in song acoustic structure are consolidated into the cortical motor circuits predominantly during daytime, but not during nighttime, depending on ongoing song performance. These consolidation mechanisms reconcile seemingly contradicting results of previous studies regarding the time course of vocal learning consolidation, and provide fundamental insights into the process through which learned performance of complex motor skills is consolidated and encoded in motor circuits.
Introduction
As with bike riding and speech production, many complex motor skills learned through repetitive practice become stable and automatic, and are maintained for extended periods of time. This process is thought to involve the transformation of newly acquired, relatively labile motor memories into more robust and enduring states, a process termed “motor memory consolidation” (Doyon et al., 2009; Dudai et al., 2015). Neural substrates of motor memory consolidation have been extensively studied in humans using noninvasive techniques, such as brain imaging, revealing significant changes in brain activity during a post-learning period (Dudai et al., 2015; King et al., 2017). However, very few studies have been conducted in animal models at the neural circuit and synaptic levels (Yang et al., 2014; Nagai et al., 2017); thus, detailed neural mechanisms of motor memory consolidation remain poorly understood.
In the present study, we investigate the neural circuit mechanisms of motor memory consolidation with a special focus on its time course using two species of songbirds: the zebra finch (ZF) and the Bengalese finch (BF). Songbirds learn to produce and maintain song with highly complex but quantifiable structure by using discrete and specialized neural circuits (Mooney, 2009), providing a tractable model system for studying neural substrates of motor skill learning, including motor consolidation. In the adult male ZF and BF, the neural mechanisms by which birds regulate and optimize song structure have been intensively studied using a reinforcement learning paradigm that induces adaptive changes in the fundamental frequency (FF) of a harmonic song element (“syllable”) using white noise (WN) playback as a negative reinforcement (Andalman and Fee, 2009; Warren et al., 2011). This externally reinforced learning for a selected syllable of adult song is phenomenologically different in many aspects from developmental song learning, by which juvenile birds spontaneously develop, from immature vocalizations, a complex song that resembles their tutors' song. Nevertheless, the highly controllable and easily quantifiable nature of the reinforcement-driven adaptive learning of the syllable FF allows us to examine a direct link between vocal changes and neural activity. Using this reinforcement learning paradigm, both Andalman and Fee (2009) and Warren et al. (2011) have demonstrated that learned changes in the FF of a target syllable are initially driven by a specialized cortical-basal ganglia circuit, the anterior forebrain pathway (AFP; see Fig. 1C), and subsequently consolidated into the song motor pathway (SMP) to become independent of the AFP.
This so-called “systems consolidation” (Dudai et al., 2015) appears to progress over time, but the extent of the consolidation that occurs during nighttime and/or during daytime remains unclear.
Moreover, the above two studies have reported seemingly contradicting conclusions regarding the time course of the consolidation process, despite the similar experimental paradigms they used to induce FF changes and to examine consolidation. Andalman and Fee (2009) have suggested that consolidation is completed within 1 d after FF changes, whereas Warren et al. (2011) have demonstrated that consolidation is not fully complete even after many days. These contradictory results may be attributable to the difference in bird species used and/or to a subtle difference in their learning paradigms (Fee and Goldberg, 2011): Andalman and Fee (2009) used ZF and drove large amounts of FF changes over many consecutive days by daily adjustment of the WN-feedback threshold, whereas Warren et al. (2011) used BF and maintained FF at a stable value away from the baseline over many days by maintaining the WN-feedback threshold throughout that period. To date, however, no studies have systematically examined the reason why those two studies have drawn seemingly contradicting conclusions, leaving the time courses and the mechanisms of FF-change consolidation elusive.
In the present study, we examine the time course of FF-change consolidation by focusing on whether it occurs during nighttime or daytime and address the discrepancy between the above two studies, by systematically measuring FF-change consolidations at specific times of day on successive days in both BF and ZF. In both species, we found that almost no consolidation occurs overnight; instead, it occurs predominantly during the daytime, depending on the amount of learning rather than the simple passage of time. Moreover, using computer simulations based on our experimental results, we demonstrated that such online, performance-dependent mechanisms of learned vocal consolidation can account for the discrepancies seen between the above two previous studies in BF and ZF (Andalman and Fee, 2009; Warren et al., 2011). These findings shed light on the consolidation mechanisms of learned vocal changes shared across different songbird species, and provide a novel insight into the mechanisms of basal ganglia-dependent motor skill learning.
Materials and Methods
Subjects
Subjects were adult male BFs (Lonchura striata domestica, >120 d old) and adult male ZFs (Taeniopygia guttata, >120 d old), both of which were either bred in our colony or purchased from a local supplier. Birds were isolated and housed individually in sound-attenuating chambers (MC-050, Muromachi Kikai) on a 14:10 h light:dark cycle. Their care and treatment were reviewed and approved by the Animal Care and Use Committee of the Korea Brain Research Institute.
Negative reinforcement-driven learning of syllable FF using conditioned auditory feedback
Songs were recorded using a microphone (PRO35, Audio-Technica) positioned above the cage. All song recordings were of undirected song (i.e., no female was present). Birds with songs containing a harmonic stack and with sufficient singing rates (>200 song bouts per day) were used in our experiments. To drive adaptive changes in FF of a target syllable in song, we used a conditioned auditory feedback technique, which includes computerized delivery of aversive auditory feedback (a brief WN burst, 75-80 dB at the bird's ear) contingent on the FF of a harmonic syllable in song (Tumer and Brainard, 2007; Andalman and Fee, 2009; Ali et al., 2013; Tachibana et al., 2017). Song recording and WN feedback were controlled by a modified version of a previously described system (Tachibana et al., 2017). We chose a target syllable that contains clear harmonic structure and follows the preceding syllable with a stereotyped order, and the WN-feedback system detected a combination of those syllables by comparing the spectral structure of ongoing song with a set of spectral templates that were constructed with exemplars before the experiment. FF was measured in a 4 ms segment of the targeted syllable at a fixed time from the syllable onset, where harmonic structure was well defined. In order to drive the FF of the targeted syllable higher, a 40 ms WN stimulus was delivered immediately after the FF detection segment if the syllable FF was below a previously set threshold for WN playback; to drive a syllable FF lower, a WN was delivered if the syllable FF was above the WN threshold. All target-syllable renditions detected that had FF below or above the WN threshold were followed by WN feedback. When driving upward or downward shifts of the syllable FF over consecutive days, we set the WN threshold each morning roughly to the average of the syllable FF at the end of the previous day so that ∼50% of syllable renditions would trigger WN feedback. 
During each day, we either set the WN threshold manually at the beginning of the day and maintained it throughout the day (Andalman and Fee, 2009) or updated it automatically every 50 target-syllable detections to the average of the past 200 detected FF values (Tachibana et al., 2017). One of those two modes for updating the WN threshold was chosen each day depending on the purpose of individual experiments and on the individual bird's behavior. For the experiment aiming to examine the nighttime consolidation of the AFP bias (see Fig. 2), we maintained the threshold throughout a day when the bird showed a relatively large shift (∼>1 SD of cross-rendition variability in target syllable FF) in a preliminary experiment or on the previous day of training; however, when the bird showed a smaller shift, we updated the FF threshold automatically to drive a greater shift. We also used the above two update modes for the experiment aiming to compare the ΔConsol/day and the ΔLearning/day (see Figs. 4, 5), in which we intentionally changed the update mode across days in a relatively random fashion so as to induce FF changes of variable magnitudes.
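The feedback contingency and the automatic threshold update described above can be illustrated with a minimal Python sketch. It is not the authors' recording software: the function and class names are hypothetical, and the only values taken from the text are the update interval (50 detections) and the averaging window (200 detected FF values).

```python
from collections import deque
from statistics import mean


def wn_triggered(ff_hz, threshold_hz, direction):
    """Return True if a rendition should be followed by a WN burst.

    direction="up"   -> WN when FF is below the threshold (drives FF higher).
    direction="down" -> WN when FF is above the threshold (drives FF lower).
    """
    if direction == "up":
        return ff_hz < threshold_hz
    if direction == "down":
        return ff_hz > threshold_hz
    raise ValueError("direction must be 'up' or 'down'")


class AdaptiveThreshold:
    """Hypothetical helper for the automatic update mode: every
    `update_every` detections, the threshold is reset to the mean of the
    last `window` detected FF values."""

    def __init__(self, initial_hz, update_every=50, window=200):
        self.threshold_hz = initial_hz
        self.update_every = update_every
        self.recent = deque(maxlen=window)  # keeps only the last `window` values
        self.count = 0

    def observe(self, ff_hz):
        """Record one detected FF value and return the current threshold."""
        self.recent.append(ff_hz)
        self.count += 1
        if self.count % self.update_every == 0:
            self.threshold_hz = mean(self.recent)
        return self.threshold_hz
```

In the manual mode, the threshold would simply be held constant for the whole day, so only `wn_triggered` would be needed.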
Pharmacological blockade of AFP output via a reverse microdialysis technique
In birds that exhibited a clear FF shift during the conditioned auditory feedback training (∼>1 SD of cross-rendition variability in the target syllable FF), we transiently infused agonists and antagonists of neurotransmitters (see below) to the lateral magnocellular nucleus of the anterior nidopallium (LMAN) or the robust nucleus of the arcopallium (RA) using a reverse microdialysis technique that was previously described in detail (Stepanek and Doupe, 2010; Warren et al., 2011). Briefly, birds were anesthetized with pentobarbital injection (∼50 mg/kg), and guide cannulas (CMA Microdialysis) were bilaterally implanted above LMAN or RA so as to direct probe tips toward the upper edge of each nucleus (see Fig. 2A, bottom). LMAN was localized stereotaxically relative to the bifurcation of the sagittal sinus (5.1 mm anterior, 1.75 mm lateral, beak angle 50 degrees from vertical). RA was mapped electrophysiologically by monitoring spontaneous firing during cannula implantation. RA implants were angled in a posterior direction by 30 degrees from vertical to avoid the axonal projections from the nucleus HVC to RA. After birds recovered from surgery, microdialysis probes (CMA 7; CMA Microdialysis) were inserted into the guide cannulas, and perfused continuously with PBS at a flow rate of 1-2 µl/min via flexible tubing connected to an infusion pump (CMA Microdialysis). In order to inactivate LMAN or RA, we infused the GABAA agonist muscimol (0.2-0.5 mM, Sigma-Aldrich) into the target area by remotely switching the dialysis solution; to block the LMAN transmission to RA, we infused the NMDAR antagonist DL-AP5 (2-5 mM; Abcam) into RA. 
Because a previous study has shown that muscimol infusions into LMAN and AP5 infusions into RA induce FF reversions in a qualitatively and quantitatively similar manner (Warren et al., 2011), we combined the data using those two methods to examine the effect of blocking the AFP output on the FF (for each bird, the same method was used throughout the experiments). Probe positioning and the path of drug diffusion were assessed postmortem by histologic staining of the sectioned tissue as previously described (Warren et al., 2011). Tissue damage caused by cannulae enabled confirmation that probes were accurately targeted to LMAN or RA.
Assessing effects of drug infusion on FF of a target syllable
We examined the effects of drug infusion on the FF of the targeted syllable as follows. In the experiments designed to examine AFP bias, we compared the FF of target-syllable renditions between the time period of drug infusions (muscimol infusion into LMAN or AP5 infusion into RA) and the 1 h period immediately preceding the infusion. Because the FF gradually changes following the onset of drug infusions, and reaches a plateau (see Fig. 1E) with different latencies across birds (1-2 h), we set a transition period starting at the onset of the drug infusion and lasting 1.0, 1.5, or 2.0 h for each bird, and excluded songs in the transition period from the analysis; the same transition period was used for all drug infusions in individual birds. In typical experiments, the drug was infused in the evening (starting at ∼5 h before lights-out) and/or in the morning (starting at ∼1 h after the first morning song), and each infusion lasted for 2-4 h. In all experiments aiming to compare the magnitudes of the FF reversions between two consecutive drug infusions (e.g., evening infusion vs morning infusion), we only used datasets in which at least one of the two infusions induced a significant FF reversion (p < 0.05, Mann–Whitney U test).
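As an illustration of the comparison described above, the following Python sketch computes the FF change between the 1 h pre-infusion period and the drug period after excluding the transition period. The function name, argument names, and the `(time, FF)` data layout are assumptions made for this example; the Mann–Whitney U test used to assess significance is not included.

```python
from statistics import mean


def ff_reversion(renditions, infusion_onset_h, infusion_end_h,
                 transition_h=1.5, pre_window_h=1.0):
    """Mean FF during drug infusion minus mean FF in the pre-infusion window.

    renditions: list of (time_h, ff_hz) tuples for target-syllable renditions.
    transition_h: period after infusion onset that is excluded
                  (1.0, 1.5, or 2.0 h in the text, fixed per bird).
    pre_window_h: length of the pre-infusion comparison window (1 h in the text).
    """
    pre = [ff for t, ff in renditions
           if infusion_onset_h - pre_window_h <= t < infusion_onset_h]
    drug = [ff for t, ff in renditions
            if infusion_onset_h + transition_h <= t <= infusion_end_h]
    if not pre or not drug:
        raise ValueError("no renditions in the pre-infusion or drug window")
    return mean(drug) - mean(pre)
```

With this sign convention, a negative value after upward learning (or a positive value after downward learning) corresponds to a reversion toward baseline.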
In the experiments designed to suppress nighttime RA activity (see Fig. 3), we infused muscimol into RA throughout the night. The infusion was initiated ∼2 h before lights-out in the evening so that inactivation of RA neurons could be confirmed by the abrupt cessation of singing; muscimol was washed out by switching the dialysis solution to PBS ∼2 h before the lights were turned on the next morning so that daytime singing was not affected by the muscimol infusion. Overnight changes in the syllable FF were measured as the difference in mean FF of target-syllable renditions between a 2 h period immediately before the onset of muscimol infusion in the evening and a 2 h period immediately after the lights were turned on the next morning.
Statistical analysis
To examine consolidation of AFP bias overnight, we compared the ΔConsol/night with zero using a Wilcoxon signed rank test. We also compared the AFP bias magnitude between evening singing and morning singing using a Wilcoxon signed rank test. To examine the effect of nighttime RA inactivation on the overnight maintenance of learned FF, we compared overnight changes in the syllable FF between an RA-inactivated night and the preceding night using a Wilcoxon signed rank test. We also examined the relationship between the ΔConsol/day and the ΔLearning/day using Pearson's correlation coefficients.
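For readers who want to see the correlation measure spelled out, here is a minimal pure-Python sketch of Pearson's correlation coefficient, as would be applied to paired ΔLearning/day and ΔConsol/day values. The function name is hypothetical, and the sketch does not compute a p value.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-deviation sum
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```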
In the experiments examining AFP bias consolidation into the SMP (see Figs. 2, 4), we assessed the contribution of all possible experimental factors to the consolidation amount using linear mixed-effect (LME) model analysis. The response variable (y) of the model was the amount of consolidation. We included four explanatory variables (“nightday,” “learn,” “manip,” and “spec”) and three interaction terms among them as fixed effects in the model. The explanatory variable nightday was a categorical variable indicating whether the consolidation was measured overnight (see Fig. 2) or over a 1 d period (see Fig. 4). The variable learn was either the amount of learning over 1 d (ΔLearning/day) in Figure 4 or zero for Figure 2. The variable manip indicated whether the pharmacological manipulation used to block AFP output was an AP5 infusion into RA or a muscimol infusion into LMAN. The variable spec indicated the species: BF or ZF. Three interactions were set for detecting species differences in the nightday, learn, and manip factors (nightday×spec, learn×spec, and manip×spec). We also added bird ID as a random effect. The significance level (α) was set at 0.05. We used the “fitlme” function in the MATLAB software (The MathWorks) with the following expression: consol ∼ nightday*spec + learn*spec + manip*spec + (1|ID).
Methods of computational simulation
Simulation model
The computational simulation was performed with a model that consists of several calculations that mimic the properties of reinforcement-driven FF learning suggested by the present and previous experimental findings (see Fig. 7A). The FF of a virtual syllable, expressed as a deviation (% change) from the baseline FF, was calculated as a summation of outputs of the SMP and the AFP as follows:
FF = SMPoutput + AFPoutput
The SMPoutput consists of the motor command that the SMP generates (SMPcommand) and an internal motor noise (SMPnoise), while the AFP output consists of AFP bias (AFPbias) and cross-rendition variability (AFPvariab) as follows:
SMPoutput = SMPcommand + SMPnoise
AFPoutput = AFPbias + AFPvariab
As in our reinforcement learning experiment in live birds, the virtual song system receives WN feedback contingent on FF output for each rendition. For example, during the periods of upward FF learning, the AFP receives a WN-driven error signal when the output FF is below a threshold (WN hit), thus causing an increase in AFP bias to change the FF in the next rendition; on the other hand, when the output FF is above the same threshold (WN escape), the AFP receives no error signal. In both cases, the AFP also simultaneously receives a separate error signal that changes the FF back toward the baseline depending on the current FF deviation from the baseline. Thus, we defined the total amount of error as follows:
One should note that this updating process theoretically follows the “variability copy model” that has been hypothesized in a previous study (Fee and Goldberg, 2011), although we did not discriminate whether the variability (AFPvariab) originates in LMAN (and internally fed back to Area X) or in Area X.
As an essential point of our model, consolidation (a change in SMPcommand) is promoted depending on the intensity of the AFPoutput in every rendition when the intensity exceeds the threshold value. This relationship was represented as the consolidation function (CF) that links the AFP bias to the SMP consolidation (see Fig. 7B) as follows:
Simulation experiments
We simulated two types of experiments by mimicking the two different experiments conducted in previous studies in live birds (Andalman and Fee, 2009; Warren et al., 2011). Andalman and Fee (2009) drove large amounts of FF changes over many consecutive days (≥5 d) by resetting the WN-feedback threshold every night and by reversing the learning directions every ≥5 d. We mimicked this learning paradigm in our simulation, which we referred to as the “continuous shift” experiment, by having 6 d of upward shift learning followed by 6 d of downward shift learning, repeated 2 times (see Fig. 6B); the threshold for WN feedback was updated every morning to be the average of FF outputs on the previous day. In contrast, Warren et al. (2011) initially drove FF shifts through daily adjustments of the WN threshold for a few days, and then fixed the threshold to maintain FF at a constant offset from the original baseline value; after the period of maintained shift, reinforcement with WN feedback was terminated while the syllable FF was continuously monitored. We mimicked this learning paradigm in our simulation, which we referred to as the “maintained shift” experiment, by updating the WN threshold only in the first two nights (three upward days), and then by maintaining it at the same value for 6 d.
Model parameters
The AFPvariab (σ) and the SMPnoise (φ) were both set to 0.0141 (1.41% in CV) so as to achieve a summed variation of 2.0% in the FF output, which is the value normally observed in previous studies (Kao and Brainard, 2006; Hampton et al., 2009; Stepanek and Doupe, 2010; Warren et al., 2011) as well as in our current study. The threshold parameter (δ) was fixed as 2 σ, which means that 4.6% of the distribution edge of the AFPoutput is reflected on SMPcommand during the baseline state. This linkage between δ and σ was assumed from a previous study that has shown an association between variability and learnability (Garst-Orozco et al., 2014). Moreover, we assumed that a bird produces 2000 renditions of the target syllable per day based on our song recording data in live birds. The other three parameters (α, β, and γ) were heuristically determined to satisfy the following criteria, which were obtained from data in the present study as well as from previous studies (Andalman and Fee, 2009; Hampton et al., 2009; Warren et al., 2011): (C1) maximum FF shift is close to 8% away from the baseline in the maintained shift experiment; (C2) maximum FF shift is close to 12% away from the baseline in the continuous shift experiment; and (C3) FF goes back to the baseline within 7 d after WN off in the maintained shift experiment. As a result, we used the value 0.003, 0.05, and 0.0014 for α, β, and γ, respectively (see Table 2).
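The two numerical statements about σ, φ, and δ can be checked with a short Python sketch. It assumes, as is not stated explicitly in the text, that AFPvariab and SMPnoise are independent zero-mean Gaussian sources, and that the 4.6% figure counts both tails of the distribution; the variable names are illustrative only.

```python
import math

sigma = 0.0141  # SD of AFPvariab (1.41% CV), from the text
phi = 0.0141    # SD of SMPnoise (1.41% CV), from the text

# Assumption: independent sources, so variances add.
summed_sd = math.sqrt(sigma ** 2 + phi ** 2)  # ~0.0199, i.e. about 2.0%

# Threshold parameter delta = 2 * sigma, from the text.
delta = 2 * sigma

# Assumption: Gaussian AFPoutput with SD sigma at baseline (AFPbias = 0).
# Two-sided fraction of renditions with |AFPoutput| > delta:
z = delta / sigma
tail_fraction = 1.0 - math.erf(z / math.sqrt(2))  # ~0.0455, i.e. about 4.6%

print(f"summed SD = {summed_sd:.4f}, fraction beyond delta = {tail_fraction:.4f}")
```

Under these assumptions, both values agree with the figures quoted above after rounding.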
Analyzing simulation results
Our simulation results with the “continuous shift” experiment were analyzed in the same way as the experimental results reported by Andalman and Fee (2009). The authors measured the amount of consolidation over a 2 d period (Andalman and Fee, 2009, their Fig. 5A, Δm) as the difference in evening FF(AFP−) between a day n and a day n-2. Using our simulation results, we similarly measured the Δm as the difference in mean FF(AFP−) calculated over 200 consecutive renditions at the end of a day between a day n and a day n-2 (Δm in Fig. 6D of the present study). Andalman and Fee (2009) also estimated the sum of evening AFP bias on a day n-2 (β) and that on a day n-1 (β*); the latter was not actually measured but was estimated as the amount of learning (FF changes) that occurred during that day. We calculated β + β* using our simulation results in a similar way (see Fig. 6D): β was calculated as the difference between mean FF(AFP+) and mean FF(AFP−), both being calculated over 200 consecutive renditions at the end of a day n-2; β* was calculated as the difference in mean FF(AFP+) between the beginning and the end of a day n-1. We then plotted the time series of Δm and β + β* over 40 consecutive days (see Fig. 6E), and examined the relationships between the time courses of the two measures by calculating the correlation coefficient (Lag = −1 d in Fig. 6F). We also calculated the correlation coefficients of those measures at different time lags between them just as Andalman and Fee (2009) did (their Figs. 5D, E and S7): at time lags (Lag) ranging from −4 d to 2 d (with a 1 d increment), β and β* were calculated for a day n+Lag-1 and a day n+Lag, respectively; the obtained β + β* were compared with Δm to calculate correlations (see Fig. 6F,G).
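The definitions of Δm, β, and β* given above can be written out as a minimal Python sketch. The data layout (one list of per-rendition FF values per simulated day) and all function names are assumptions for illustration; in particular, the text does not state the window used for the "beginning" of a day, so the same 200-rendition window is assumed here.

```python
from statistics import mean

WINDOW = 200  # renditions averaged at the end of a day (from the text)


def end_mean(values, n=WINDOW):
    """Mean over the last n renditions of a day."""
    return mean(values[-n:])


def start_mean(values, n=WINDOW):
    """Mean over the first n renditions of a day (window size is an assumption)."""
    return mean(values[:n])


def delta_m(ff_minus_by_day, day):
    """Change in end-of-day FF(AFP-) between day n and day n-2."""
    if day < 2:
        raise ValueError("day must be >= 2")
    return end_mean(ff_minus_by_day[day]) - end_mean(ff_minus_by_day[day - 2])


def beta_plus_beta_star(ff_plus_by_day, ff_minus_by_day, day):
    """beta (end-of-day AFP bias on day n-2) plus beta* (learning during day n-1)."""
    if day < 2:
        raise ValueError("day must be >= 2")
    beta = end_mean(ff_plus_by_day[day - 2]) - end_mean(ff_minus_by_day[day - 2])
    beta_star = end_mean(ff_plus_by_day[day - 1]) - start_mean(ff_plus_by_day[day - 1])
    return beta + beta_star
```

Lagged comparisons would reuse the same two functions, evaluating `beta_plus_beta_star` at a shifted day index before correlating it with `delta_m`.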
Our simulation results with the “maintained shift” experiment were analyzed in the same way as the experimental results reported by Warren et al. (2011). In that paper, the authors measured FF(AFP+) and FF(AFP−) at multiple times during a ≥5 d period in which the threshold for the WN feedback was maintained away from the FF baseline, and then combined the data for 2 consecutive days (i.e., days 1-2, 3-4, and 5-6) (Warren et al., 2011, their Fig. 3E). Similarly to this analysis, we calculated the mean FF(AFP+) and FF(AFP−) of our simulation results over 2 consecutive days during a 6 d period with the maintained threshold for WN playback (days 7-8, days 9-10, and days 11-12) (Fig. 6I).
Results
Adult BFs and ZFs produce complex song consisting of a sequence of song elements (“syllables”), each with a highly stereotyped acoustic structure across renditions. We induced adaptive changes in the FF of a target syllable that contains clear harmonic structure using a previously established reinforcement learning paradigm (Tumer and Brainard, 2007; Andalman and Fee, 2009; Charlesworth et al., 2011; Warren et al., 2011; Ali et al., 2013). In this paradigm, loud bursts of WN were played to a bird during singing as negative reinforcement, and were made contingent on FF at a precise time point within a targeted syllable (Fig. 1A; for detailed procedures including the choice of target syllables, see Materials and Methods). As in previous studies, we induced either upward or downward shifts in the FF of the targeted syllable depending on whether the WN was applied to renditions of the targeted syllable with the FF below or above the experimentally imposed threshold (Fig. 1B; for more detail, see Materials and Methods). It has been shown that this adult vocal learning is initially driven by outputs of a specialized cortical-basal ganglia circuit, the AFP, to the motor cortex-analog in the SMP, the RA (Fig. 1C) (Andalman and Fee, 2009; Warren et al., 2011): Bilateral blockade of the AFP output to RA, either by inactivating the AFP output nucleus LMAN or by blocking the synaptic transmission from LMAN to RA, has been shown to cause a significant reversion of the learned change in FF (Andalman and Fee, 2009; Warren et al., 2011) (as exemplified by our data shown in Fig. 1D). This suggests that the AFP contributes to this reinforcement-driven vocal learning by biasing the motor activity in RA to modify the acoustic structure of the target syllable; such contribution of the AFP is referred to as “AFP bias” (Fig. 1D) (Andalman and Fee, 2009). 
We refer to the mean FF of the targeted syllables produced with AFP output blocked as FF(AFP−), and to the mean FF without AFP output blocked as FF(AFP+) (Fig. 1D; for the procedure to measure mean FF, see Materials and Methods). Because FF(AFP−) reflects the motor activity that is encoded in the SMP without the influence of the AFP, changes in FF(AFP−) in the direction of learning indicate consolidation of the AFP bias-driven FF changes into the SMP. For example, when a bird learns to increase the FF of a target syllable over consecutive days (Fig. 1E), an increase in FF(AFP−) from the evening of a given day to the evening of the next day indicates the occurrence of plastic changes in the motor circuit to consolidate learned FF initially driven by AFP bias (ΔConsol/day).
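The quantities defined above reduce to simple differences, sketched here in Python. The function names are hypothetical, and the sign convention for downward learning (flipping the sign so that consolidation in the direction of learning is positive) is an assumption made for this illustration.

```python
def afp_bias(ff_afp_plus, ff_afp_minus):
    """AFP bias: mean FF with AFP output intact minus mean FF with it blocked."""
    return ff_afp_plus - ff_afp_minus


def delta_consol_per_day(ff_minus_evening, ff_minus_next_evening, direction):
    """Change in FF(AFP-) from one evening to the next, signed so that a
    change in the direction of learning is positive (assumed convention)."""
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    sign = 1.0 if direction == "up" else -1.0
    return sign * (ff_minus_next_evening - ff_minus_evening)
```

For example, with hypothetical values FF(AFP+) = 620 Hz and FF(AFP−) = 605 Hz, the AFP bias is 15 Hz; if FF(AFP−) rises to 612 Hz by the next evening during upward learning, ΔConsol/day is 7 Hz.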
AFP bias is not consolidated into SMP overnight
We first examined what fraction of the AFP bias is consolidated into the SMP over a single night by measuring the ΔConsol overnight (ΔConsol/night) in adult BFs. In order to obtain the ΔConsol/night, we measured the FF(AFP−) in the evening of a given day and in the morning of the next day (Fig. 2A, left) by inducing a transient blockade of AFP output to RA using a microdialysis technique as in a previous study (Warren et al., 2011). The AFP output to RA was blocked either by infusing the GABAA agonist muscimol into LMAN to inactivate LMAN neurons or by infusing the NMDAR antagonist DL-AP5 into RA to block the synaptic transmission from LMAN to RA (for the detailed procedure, see Materials and Methods) (Fig. 2A, right); the data with those two manipulations were combined because a previous study has shown that those same manipulations exert qualitatively and quantitatively similar effects on learned FF (Warren et al., 2011). If the FF(AFP−) significantly changes from the evening to the next morning in the direction of learning (i.e., if ΔConsol/night is greater than zero) as schematized in Figure 2A (left), it would indicate that the learned FF that is driven by AFP bias is consolidated into the SMP overnight. We found, however, that this was not the case: the ΔConsol/night was not significantly different from zero (Fig. 2B,C; n = 15 evening-morning comparisons in 8 BFs, p = 0.978, Wilcoxon signed rank test, signed rank = 61). Moreover, not surprisingly, the AFP bias magnitude did not significantly change from evening to the next morning (Fig. 2D; p = 0.934, signed rank = 62). These results provide direct evidence that the AFP bias is not substantially consolidated into the SMP overnight.
Our findings of no apparent overnight consolidation of the AFP bias in BFs are consistent with those of Warren et al. (2011) showing that consolidation of the AFP bias occurs slowly over many days with only small progress over a single day. In contrast, Andalman and Fee (2009) have suggested that consolidation is completed within 1 d after the FF changes take place, providing a conclusion that seemingly is in contradiction with the results of Warren et al. (2011), despite their similar experimental paradigms used to induce FF changes and to examine AFP bias consolidations. A possible explanation for this discrepancy could be that different songbird species were used across these studies as discussed previously (Fee and Goldberg, 2011): the data in Warren et al. (2011) and our results so far were obtained from BFs, whereas the data in Andalman and Fee (2009) were obtained from ZFs, and these two species may possess distinct mechanisms regarding the time course of AFP bias consolidation. However, we found almost no overnight consolidation in ZFs just as in BFs: the ΔConsol/night was not significantly different from zero (Fig. 2E,F; n = 14 evening-morning comparisons in 7 ZFs, p = 0.339, Wilcoxon signed rank test, signed rank = 52); the AFP bias magnitude did not significantly change from the evening to the next morning (Fig. 2G; p = 0.301, signed rank = 53). Thus, in both BFs and ZFs, AFP bias was not substantially consolidated into the SMP overnight in reinforcement-driven adult vocal learning.
Previous studies suggest that syllable FF is critically regulated by RA projection neurons (Vu et al., 1994; Sober et al., 2008; Miller et al., 2017) and that the consolidation of AFP-driven FF changes is attributable to plastic changes in RA circuitry (Doya and Sejnowski, 1995; Fiete et al., 2007; Fee and Goldberg, 2011). Because long-term plasticity can be induced in RA circuitry in an activity-dependent manner (Sizemore and Perkel, 2011; Mehaffey and Doupe, 2015), inactivation of RA neurons would suppress such activity-dependent plasticity and subsequent consolidation of AFP bias. Given our results of no substantial consolidation of the AFP bias overnight, we expected that nighttime inactivation of RA would have no significant effect on the overnight maintenance of FF changes learned through daytime singing, providing additional support for the conclusion of no overnight consolidation of the AFP bias. Consistent with this prediction, we found that nighttime inactivation of RA by infusing the GABAA agonist muscimol (Fig. 3A) did not have a significant effect on the overnight maintenance of the FF changes induced by learning on the preceding day in either BFs or ZFs: overnight changes in FF were not significantly different between the RA-inactivation nights and the preceding nights (Fig. 3B,C; for BFs, n = 15 comparisons in 5 birds, p = 0.577, Wilcoxon signed rank test, signed rank = 26; for ZFs, n = 9 comparisons in 3 birds, p = 0.945, signed rank = 17). These results further support the conclusion that no substantial consolidation of AFP bias occurs overnight.
Consolidation of AFP bias depends on the amount of daytime learning
As previous studies have demonstrated that substantial consolidation of the AFP bias occurs over a single day (Andalman and Fee, 2009) or longer periods (Warren et al., 2011), our results showing almost no consolidation overnight raise the possibility that the consolidation process occurs predominantly during daytime. In agreement with this idea, we found that consolidation of the AFP bias strongly depends on the amount of learning achieved through daytime singing. We induced variable amounts of change in syllable FF per day by adjusting the threshold for WN-feedback to varying degrees in the morning or by automatically updating the threshold every 50 target-syllable renditions throughout a day (for more details, see Materials and Methods) (Tachibana et al., 2017). We then compared the amount of learning over 1 d (ΔLearning/day) with the amount of consolidation over a similar period of time (ΔConsol/day) (Fig. 4A). The ΔConsol/day was calculated as the change in FF(AFP−) from the evening of a given day to the evening of the next day, and each FF(AFP−) was measured by blocking the AFP output as shown in Figure 1D. The ΔLearning/day was quantified as the difference in FF(AFP+) between the evening of a given day and the evening of the next day, each of which was measured immediately before blocking the AFP output (Fig. 4A). We found a strong correlation between the ΔConsol/day and the ΔLearning/day in both BFs and ZFs when data from all birds of the same species were combined (Fig. 4B,C; n = 42 comparisons in 5 BFs, r = 0.748, p < 0.001; n = 8 comparisons in 2 ZFs, r = 0.749, p = 0.003). Even in individual birds, relatively high correlations were observed (Fig. 4D,E). This correlative relationship was not observed when the ΔConsol/day was compared with the ΔLearning that occurred 1 d earlier (ΔLearning/day-1; Fig. 5A–D). Together with the results showing almost no consolidation of AFP bias overnight (Fig. 2), these findings suggest that consolidation occurs predominantly during the daytime and depends on ongoing song performance (FF changes) in both BFs and ZFs, thus revealing that there are common consolidation mechanisms across those species.
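The two daily measures defined above can be illustrated with a minimal sketch. The function names and the evening FF values are hypothetical and are not data from this study, and Pearson's coefficient is assumed here because the text reports r without naming the coefficient in this section.

```python
from math import sqrt


def daily_deltas(ff_afp_plus, ff_afp_minus):
    """Evening-to-evening changes in FF (Hz) on consecutive days.

    ff_afp_plus:  evening FF with the AFP intact, one value per day
    ff_afp_minus: evening FF with the AFP output blocked, one value per day
    Returns (delta_learning_per_day, delta_consol_per_day).
    """
    d_learning = [b - a for a, b in zip(ff_afp_plus, ff_afp_plus[1:])]
    d_consol = [b - a for a, b in zip(ff_afp_minus, ff_afp_minus[1:])]
    return d_learning, d_consol


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical evening measurements (Hz) over five consecutive days.
ff_plus = [3000.0, 3040.0, 3050.0, 3110.0, 3120.0]
ff_minus = [3000.0, 3020.0, 3025.0, 3060.0, 3068.0]

d_learning, d_consol = daily_deltas(ff_plus, ff_minus)
r = pearson_r(d_learning, d_consol)
print(d_learning, d_consol, round(r, 3))
```

In this made-up example, days with larger ΔLearning/day also show larger ΔConsol/day, which is the pattern the correlation analysis tests for.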
To further validate the conclusions drawn so far regarding AFP-bias consolidation, we assessed how all possible experimental factors, such as the species difference, could contribute to the consolidation amount using a statistical test based on the LME model (for details, see Materials and Methods). The LME analysis showed that only the ΔLearning/day was significantly correlated with the ΔConsol/day (t = 2.85, p = 0.006), while other factors, such as species difference and inactivation methods, were not (Table 1). The model fit was moderate (R2 = 0.39; adjusted R2 = 0.34). This analysis further supports our interpretation that the amount of daytime learning can explain the amount of consolidation regardless of the species.
A computational model of AFP bias consolidation can explain the contradicting results seen in previous studies
Our evidence of no substantial consolidation of the AFP bias into the SMP over the nighttime periods (10 h in duration) indicates that the consolidation process does not necessarily progress with the passage of time. Moreover, the strong dependence of the AFP bias consolidation on the amount of learning suggests that the consolidation process is rather critically regulated by a mechanism that depends on ongoing song performance (i.e., the FF of a target syllable). This “time-independent” and “performance-dependent” mechanism of AFP bias consolidation is likely to explain why the two previous studies with similar experiments have reported contradicting results regarding the time course of the AFP bias consolidation: Andalman and Fee (2009) have suggested that consolidation of AFP bias is completed within 1 d, whereas Warren et al. (2011) have demonstrated that consolidation is not fully completed even after many days. Given our findings, this difference in the time course of the consolidation process is likely to be attributable to differences in the daily amount of learning induced in those studies: Andalman and Fee (2009) drove large amounts of FF changes over many consecutive days by daily adjustment of the WN-feedback threshold, whereas Warren et al. (2011) maintained the FF at a stable value away from the baseline over many days by holding the WN-feedback threshold fixed throughout that period.
To further test this idea and shed more light on the mechanisms of this AFP bias consolidation, we constructed a mechanistic model of a virtual song circuit in which the AFP bias is consolidated into the SMP during daytime (but not nighttime) in a performance-dependent manner (Figs. 6A, 7; see Materials and Methods for more detail). We then examined whether the model could replicate the results of the aforementioned two previous studies when we change only the daily amount of learning by changing learning paradigms. The central features of our model and its rationale are as follows. Given the fact that the synaptic plasticity that likely underlies the consolidation of the AFP bias can be induced in the SMP depending on neural inputs from the AFP (Mehaffey and Doupe, 2015), our model hypothesizes that the AFP bias consolidation critically depends on the magnitude of the neural signal that is sent from the AFP to the SMP to modulate the FF of a virtual syllable (“AFP output” in Figs. 6A, left, 7A). The AFP output is the sum of an AFP-bias signal that biases the FF of a virtual syllable and a variability signal that generates rendition-by-rendition variability in FF, based on previous experimental studies (Kao et al., 2005; Ölveczky et al., 2005; Kao and Brainard, 2006; Andalman and Fee, 2009; Warren et al., 2011; Kojima et al., 2013, 2018). Moreover, the amount of consolidation is nonlinearly dependent on an AFP output magnitude with a “threshold” (Figs. 6A, right, 7B): if AFP output is smaller than the threshold in a given syllable rendition, no consolidation occurs in the next rendition, whereas if an AFP output is greater than the threshold, consolidation immediately occurs in the next rendition and the consolidation amount is linearly dependent on the suprathreshold AFP output magnitude. 
This nonlinear consolidation function with an AFP-output threshold is based on the experimental results in reinforcement-driven FF learning that FF changes are initially driven by AFP output without obvious plasticity (consolidation) in the SMP (Andalman and Fee, 2009). This consolidation function is also matched with the properties of synaptic plasticity in RA that induction of plasticity requires substantial strengths of LMAN inputs to RA as well as a specific range of time lags between LMAN inputs and inputs from the upstream nucleus HVC (used as a proper name) (Mehaffey and Doupe, 2015). The final FF output in our model was defined as the summation of two components: the AFP output and the consolidated change of the SMP activity. By virtue of the rectified-linear consolidation behavior, our model shows a correlative relationship between consolidation and learning (changes in FF output) as seen in our experimental results (Fig. 4).
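The consolidation function described above can be written compactly. The symbols below are introduced here for illustration only and are not the authors' notation, and the sign-symmetric form (consolidation follows the direction of the AFP output) is an assumption made so that one expression covers both upward and downward FF shifts.

```latex
\begin{align}
a_t &= b_t + \xi_t
  && \text{(AFP output = AFP-bias signal + variability signal)} \\
\Delta s_{t+1} &= \operatorname{sgn}(a_t)\, k \,\max\bigl(0,\ |a_t| - \theta\bigr)
  && \text{(rectified-linear consolidation with threshold } \theta\text{)} \\
\mathrm{FF}_t &= a_t + s_t
  && \text{(FF output = AFP output + consolidated SMP change)}
\end{align}
```

Here t indexes syllable renditions, s_t is the consolidated change in the SMP, k is the slope of the suprathreshold part of the function, and FF_t is expressed relative to baseline. When |a_t| stays below θ, s_t does not change, so FF changes are carried entirely by the AFP output.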
Using this “consolidation threshold model,” we have attempted to replicate the results of both Andalman and Fee (2009) and Warren et al. (2011) by only changing learning paradigms to induce different amounts of learning per day. We first induced large and continuous changes in syllable FF in a manner similar to the learning paradigm in Andalman and Fee (2009), by updating the WN-feedback threshold every night and reversing the learning direction every 6 d (Fig. 6B). In such a “continuous shift” experiment, we found that the trajectory of the FF(AFP−), which represents the plasticity of the SMP to consolidate the AFP bias, appears to follow the trajectory of the FF(AFP+), which represents FF learning (Fig. 6B, red and blue lines, respectively). The observed time lag between FF(AFP+) and FF(AFP−) is caused by the consolidation threshold in our model, which prevents consolidation of the AFP bias (i.e., no changes in FF(AFP−)) when the magnitude of the AFP output (sum of AFP bias and variability) is relatively small, that is, during the initial phase of learning (days 0-2 in Fig. 6B) and in the transition phases when learning directions were reversed (days 6-8, 12-14, and 18-20). Such a time lag between the FF changes and consolidation is largely maintained even during continuous changes in syllable FF (days 3-5, 9-11, 15-17, and 21-24). Although the time lag varies depending on the parameters of our model, including the consolidation threshold, we found that the time lag can be ∼1 d given a certain parameter set (Fig. 6B,C; for more detail, see Materials and Methods and Table 2). This simulation result is consistent with the conclusion reached by Andalman and Fee (2009) that learned FF changes driven by an AFP bias appear to be mostly consolidated into the SMP within 1 d. 
Moreover, we analyzed our simulation results in the same way as Andalman and Fee (2009) did, and observed relationships that were qualitatively similar to those seen in their experimental results (for more details, see Materials and Methods): similar to Andalman and Fee (2009, their Fig. 5C–E), we observed a strong correlation between the measures used in that study (Δm [the amount of consolidation over a 2 d period] vs β + β* [the AFP bias on the day n-2 plus the amount of learning on the day n-1] shown in Fig. 6D of the present study) only at the given time lag of 1 d (Fig. 6E–G).
Using the same model with the same parameter set, we mimicked the learning paradigm of Warren et al. (2011). In this instance, we only changed the learning paradigm so as to maintain the WN-feedback threshold at a fixed value away from the baseline after daily updates over 2 consecutive days (“maintained shift” experiment; Fig. 6H) (for more details, see Materials and Methods). Our simulation using this learning paradigm qualitatively replicated the results of Warren et al. (2011). Just as in Warren et al. (2011, their Fig. 3C,E), the magnitude of the AFP bias, represented as a difference between FF(AFP+) and FF(AFP−), decreased only gradually over many days (Fig. 6H,I), indicating that consolidation of the AFP bias is not completed within a single day. Together, these two simulation results demonstrate that the discrepancy between the two previous studies regarding the time course of the AFP bias consolidation can be explained simply by differences in the learning paradigms. Moreover, given the nature of our model with the rectified-linear function of AFP bias consolidation (Figs. 6A, right, 7B), our results suggest that the time course of the consolidation process critically depends on the learning speed but not on the simple passage of time: the consolidation process does not begin until the AFP output reaches a threshold and the amount of consolidation depends on the suprathreshold magnitude of AFP output, resulting in a faster consolidation in the continuous shift experiment (as in Andalman and Fee, 2009) and a slower consolidation in the maintained shift experiment (as in Warren et al., 2011). These results thus reconcile the controversy regarding the time course of vocal learning consolidation, and provide important computational insights into the circuit mechanisms underlying the performance-dependent consolidation of the basal ganglia-dependent motor skill learning.
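The contrast between the two paradigms can be sketched with a deliberately simplified, deterministic toy version of a thresholded consolidation rule. This is not the authors' implementation: the learning rule, all parameter values, and the function names are assumptions chosen for illustration, and the variability signal of the full model is omitted.

```python
def consolidation_step(afp_output, threshold, gain):
    """Rectified-linear rule: no change at or below threshold, linear above it."""
    excess = abs(afp_output) - threshold
    if excess <= 0.0:
        return 0.0
    return gain * excess if afp_output > 0 else -gain * excess


def simulate(daily_targets, renditions_per_day=500, learn_rate=0.01,
             threshold=10.0, gain=0.002):
    """Return one (FF with AFP intact, FF with AFP blocked) pair per evening.

    Values are FF shifts from baseline in Hz. Nights are skipped entirely,
    which encodes the assumption of no overnight consolidation.
    """
    afp_bias = 0.0  # AFP-driven part of the FF shift (no variability term)
    smp = 0.0       # consolidated part of the FF shift
    evenings = []
    for target in daily_targets:
        for _ in range(renditions_per_day):
            # Consolidation applied next depends on the current AFP output.
            delta_smp = consolidation_step(afp_bias, threshold, gain)
            # Toy learning rule (assumption): the AFP bias closes a fixed
            # fraction of the gap between the FF output and the target.
            afp_bias += learn_rate * (target - (afp_bias + smp))
            smp += delta_smp
        evenings.append((afp_bias + smp, smp))
    return evenings


# "Continuous shift": the target moves 30 Hz further every day for 6 days.
continuous = simulate([30.0 * day for day in range(1, 7)])
# "Maintained shift": two daily updates, then the target is held for 10 days.
maintained = simulate([30.0, 60.0] + [60.0] * 10)

for label, run in (("continuous", continuous), ("maintained", maintained)):
    print(label)
    for day, (ff_plus, ff_minus) in enumerate(run, start=1):
        print(f"  day {day:2d}: AFP bias = {ff_plus - ff_minus:6.2f} Hz")
```

With these toy parameters, the consolidated component in the continuous run trails the FF output by roughly one day's worth of shift, whereas in the maintained run the AFP bias shrinks toward the threshold and then persists instead of vanishing. In this deterministic simplification the residual bias stalls at the threshold; a variability term added to the AFP output would presumably push it above threshold on some renditions and let consolidation continue slowly.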
Discussion
Our experiments using reinforcement-driven adaptive learning of song acoustic structure in adult songbirds address the important questions of when and how learned changes in motor skills, initially driven by a cortical-basal ganglia circuit (AFP), are consolidated into the cortical motor circuitry (SMP). Although previous studies reported seemingly contradicting conclusions across BF and ZF regarding the time course of the consolidation of the AFP bias into the SMP, we found in both species that the AFP bias is largely maintained overnight with no apparent consolidation into the SMP. Consistent with this, nighttime blockade of activity-dependent synaptic plasticity in RA that could underlie AFP bias consolidation had no significant effect on the overnight maintenance of learned FF. Moreover, we found evidence of the contribution of daytime singing to AFP bias consolidation in both species: the amount of consolidation over a single day was strongly correlated with learned FF changes over a similar period. These results strongly suggest that consolidation of the AFP bias is dependent on ongoing song performance rather than on the simple passage of time. Finally, our computational model of AFP bias consolidation with a nonlinear consolidation function has qualitatively replicated the seemingly contradicting results of the two previous studies from BF and ZF (Andalman and Fee, 2009; Warren et al., 2011), further providing evidence on the performance-dependent and time-independent mechanisms underlying learned vocal consolidation being shared across those two species. Together, our findings illustrate the neural substrates through which newly learned motor skills initially implemented in cortical-basal ganglia circuits become encoded in the cortical motor circuitry and are expressed independently of the cortical-basal ganglia circuits.
Given the advantage that the songbird AFP and SMP are discrete neural circuits specialized for song learning and maintenance (Doupe et al., 2005; Mooney, 2009), the synaptic mechanisms that could underlie the properties of the AFP bias consolidation that we found have been well studied. RA neurons projecting to brainstem motor neurons receive direct excitatory inputs from the cortical premotor nucleus HVC as well as from LMAN (Nottebohm et al., 1976; Bottjer et al., 1989). Several lines of evidence suggest that the HVC inputs drive stereotyped premotor activity in RA neurons, which generates individual syllable structure, including FF, whereas the LMAN inputs are thought to modulate the HVC-driven RA activity to generate exploratory song variability that is biased to reduce vocal error (i.e., AFP bias) (for review, see Woolley and Kao, 2015). It has also been suggested that adaptive changes in syllable structure, including the consolidation of AFP-driven FF changes, are attributable to plastic changes at the HVC-RA synapses (Doya and Sejnowski, 1995; Fiete et al., 2007; Fee and Goldberg, 2011). In support of this view, bidirectional Hebbian plasticity can be induced at the HVC-RA synapses as well as at the LMAN-RA synapses with the critical contribution of metabotropic glutamate receptors (Mehaffey and Doupe, 2015). Importantly, this plasticity involves opposing changes in the synaptic strengths of the two inputs to RA: when the HVC-RA synapses are potentiated, the LMAN-RA synapses are depressed, and vice versa, depending on the relative timing between the two inputs to RA. This indicates that the relative influence of the two inputs to RA can dramatically shift from LMAN-dominant to HVC-dominant, providing a possible mechanism by which FF changes initially driven by AFP bias (via LMAN input) are consolidated into the SMP (HVC to RA pathway) and maintained even in the absence of AFP bias. 
Moreover, because LMAN input at different time lags relative to HVC input would result in different degrees of real-time modulation of the HVC-driven RA activity and, subsequently, of the corresponding syllable's structure (Kao et al., 2005; Kojima et al., 2018), it is possible that the timing of the LMAN input relative to the HVC input determines the magnitude of AFP bias. These possible mechanisms for generating AFP bias and inducing its consolidation could be the biological basis of our consolidation threshold model: WN-feedback training may gradually change the timing of the LMAN input relative to the HVC input to develop AFP bias, and LMAN inputs that generate suprathreshold AFP bias may also induce synaptic plasticity responsible for consolidation of the AFP bias. Our model can also be explained by changes in the strength of the LMAN input to RA rather than changes in the timing of LMAN inputs. The induction of synaptic plasticity in RA neurons requires a high-frequency burst stimulation of the LMAN axons (Mehaffey and Doupe, 2015), and many LMAN neurons exhibit characteristic burst firing with variable number of spikes during singing (Kao et al., 2008; Kojima et al., 2013). Thus, it is also possible that the WN-feedback training gradually increases the spike frequency in individual bursts in LMAN neurons to develop AFP bias and subsequently induces RA plasticity when spike frequency in LMAN bursts exceeds a certain threshold. In theory, we cannot rule out the possibility that the synaptic plasticity in the SMP is induced by a mechanism independent of the AFP bias, such as neuromodulatory inputs to RA that convey reinforcement signals (Fiete et al., 2007). However, such a mechanism does not easily explain our results of significant correlations between learning and consolidation amounts or the seemingly contradicting results of the two previous studies (Andalman and Fee, 2009; Warren et al., 2011).
Our results of no obvious overnight consolidation of the learned FF in adult birds are consistent with the idea that spontaneous nocturnal activity observed in the song motor system has a functional role not in memory consolidation but in the maintenance of the stereotyped motor program in adult birds (Young et al., 2017; Bush et al., 2018). In contrast, several lines of evidence suggest significant contributions of sleep to developmental song learning that naturally occurs in juvenile birds without receiving external reinforcement. Juvenile birds spontaneously develop a highly structured song from immature vocalizations by imitating their adult tutor, and such developmental song learning has been shown to be associated with sleep (Derégnaucourt et al., 2005; Margoliash and Schmidt, 2009; Shank and Margoliash, 2009; Rauske et al., 2010; Yanagihara and Hessler, 2011; Brawn and Margoliash, 2014; Giret et al., 2017). In particular, RA neurons dramatically increase high-frequency spiking activity in the night just before the first daytime improvement of their song structure following tutor song exposure (Shank and Margoliash, 2009). Also, syllable structure in the middle of song development dramatically deteriorates overnight, to a degree that is positively correlated with the final quality of the learned song (Derégnaucourt et al., 2005). Even in adult birds producing stable songs, RA neurons exhibit a spontaneous replay of song premotor patterns during sleep (Dave and Margoliash, 2000), and premotor patterns of daytime singing slightly but reliably change across sleep periods (Rauske et al., 2010). 
Given these findings, it has been hypothesized that sleep-related activity in the song system serves as the substrate for an “offline” processing of song motor networks required for the development and maintenance of song structure (Dave and Margoliash, 2000; Rauske et al., 2010), although this idea is not supported by recent findings (Young et al., 2017; Bush et al., 2018).
The discrepancy regarding the contribution of sleep between the reinforcement-driven adaptive learning of syllable FF and the developmental song learning is likely to reflect the methodological and behavioral differences between the two forms of learning. In reinforcement-driven FF learning, birds change their syllable structure to avoid extrinsic negative reinforcement (WN feedback) that causes a vocal error and/or aversive auditory input. In developmental song learning, in contrast, birds spontaneously improve their song structure by comparing the auditory feedback to a previously memorized model of the tutor song, without receiving any extrinsic reinforcement. Moreover, FF learning is a simple, unidimensional process in which birds adaptively change only the FF of a specific portion of a single syllable. Developmental song learning is, in contrast, a complex, high-dimensional process in which birds build up a sequence of complex syllables by changing multiple acoustic and temporal features simultaneously. Given these apparent differences between the two forms of learning, it is reasonable to assume that their underlying mechanisms are considerably different and that FF learning does not involve sleep-related processes, such as the offline processing of song motor networks hypothesized for developmental song learning and maintenance. This idea is consistent with studies in mammals showing that the contribution of sleep to the consolidation of motor skills depends on the nature of the task used in practice (Dudai et al., 2015). It is also possible that sleep is actually involved in FF learning as in developmental song learning but to a much smaller extent because of the simplicity of FF learning. 
In light of the largely shared neural circuits responsible for FF learning and developmental song learning (Mooney, 2009; Fee and Goldberg, 2011; Woolley and Kao, 2015), fundamental neural mechanisms to modify and optimize the syllable acoustic structure are likely to be shared, at least in part, between the two forms of learning (Hisey et al., 2018). The differential contributions of sleep to the two forms of learning could be attributable to the differential complexity of the learning, as already reported for motor skill learning in humans (Kuriyama et al., 2004), resulting in no detectable overnight consolidation in FF learning by our experimental approaches.
In conclusion, our findings in songbirds provide a glimpse of the neural mechanisms through which learned performance of complex motor skills is consolidated and encoded in motor circuits. Moreover, given that learning of song acoustic structure provides one of the simplest examples for linking neural activity in cortical-basal ganglia circuits to volitionally produced skilled motor behavior, our findings have wider implications for our understanding of how rapid reinforcement-driven plasticity in basal ganglia-related circuits “trains” slower learning mechanisms and long-term plasticity in the cortical motor circuitry (Pasupathy and Miller, 2005; Turner and Desmurget, 2010).
Footnotes
This work was supported by Korea Brain Research Institute research program funded by Ministry of Science and Information and Communication Technology (21-BR-01-03) to S.K. We thank R. Hahnloser (ETH Zurich), K. Wada (Hokkaido University), R. Rajan (Indian Institute of Science Education and Research Pune), and the S.K. laboratory members for discussion and comments on this manuscript as well as technical assistance.
The authors declare no competing financial interests.
- Correspondence should be addressed to Satoshi Kojima at satoshikojima.sk{at}gmail.com