## Abstract

Variation in sequencing of actions occurs in many natural behaviors, yet how such variation is maintained is poorly understood. We investigated maintenance of sequence variation in adult Bengalese finch song, a learned skill with rendition-to-rendition variation in the sequencing of discrete syllables (i.e., syllable “b” might transition to “c” with 70% probability and to “d” with 30% probability). We found that probabilities of transitions ordinarily remain stable but could be modified by delivering aversive noise bursts following one transition (e.g., “b→c”) but not the alternative (e.g., “b→d”). Such differential reinforcement induced gradual, adaptive decreases in probabilities of targeted transitions and compensatory increases in alternative transitions. Thus, the normal stability of transition probabilities does not reflect hardwired premotor circuitry. While all variable transitions could be modified by differential reinforcement, some were less readily modified than others; these were cases that exhibited more alternation between possible transitions than predicted by chance (i.e., “b→d” would tend to follow “b→c” and vice versa). These history-dependent transitions were less modifiable than more stochastic transitions. Similarly, highly stereotyped transitions (which are completely predictable) were not modifiable. This suggests that stochastically generated variability is crucial for sequence modification. Finally, we found that, when reinforcement ceased, birds gradually restored transition probabilities to their baseline values. Hence, the nervous system retains a representation of baseline probabilities and has the impetus to restore them. Together, our results indicate that variable sequencing in a motor skill can reflect an end point of learning that is stably maintained via continual self-monitoring.

## Introduction

Many complex behaviors are composed of distinct elements performed in a sequence. When such behaviors are practiced extensively, the sequencing of elements can become highly stereotyped (Immelmann, 1969; Schwartz, 1980; Cohen et al., 1990; Grafton et al., 2002). However, rendition-to-rendition variation in sequencing can persist even in well practiced behaviors in which individual elements have become highly stereotyped. For example, persistent variation in sequencing occurs in trained operant behaviors, in rodent and insect grooming, and in birdsong (Schwartz, 1980; Lefebvre, 1981; Berridge, 1990; Okanoya, 2004). Sequence variation has been hypothesized to aid in attracting mates, in evading predators, and in optimal foraging (Humphries and Driver, 1967; Real and Caraco, 1986; Searcy and Andersson, 1986). Thus, rather than reflecting a limitation in motor control, sequence variation may be a feature of learned behavior that is actively maintained.

In this study, we test how sequence variation is maintained in adult birdsong, a learned behavior in which the statistics of sequencing (e.g., the probabilities of transitions between distinct elements) can be monitored precisely. Birdsong, like speech, gradually progresses from variable “babbling” to a highly stereotyped, “crystallized” adult form (Marler and Tamura, 1964; Mooney, 2009). Production of the crystallized adult song involves two hierarchically distinct levels of motor control—over individual vocal elements, termed “syllables,” which have high acoustic stereotypy, and over the sequencing of those syllables (Vu et al., 1994; Yu and Margoliash, 1996; Tchernichovski et al., 2001; Fee et al., 2004; Ashmore et al., 2005; Horita et al., 2008; Fujimoto et al., 2011). In some species, such as the zebra finch, a single sequence of syllables (“abcd”) is performed with little variation (Zann and Bamford, 1996). However, in other species, including the Bengalese finch (BF), syllable sequencing can remain highly variable even in adult song (Okanoya, 2004; Sakata and Brainard, 2006; Jin, 2009). For example, at a “branch point” in Bengalese finch song, syllable “b” might transition to “c” 70% of the time and to “d” 30% of the time.

Here, we first demonstrate that the probabilities of transitions between syllables ordinarily remain stable over weeks. Such stability could reflect stochastic output of a hardwired pattern generator, consistent with a view of adult song as a crystallized behavior that is not subject to adaptive modification (Immelmann, 1969; Marler, 1970). However, recent experiments indicate that adult birds can modify the acoustic structure of individual syllables to reduce aversive feedback or to correct perceived errors in production. Moreover, birds restore the original structure of syllables in the absence of continued instruction (Tumer and Brainard, 2007; Sober and Brainard, 2009). Thus, for individual syllables, the normal stability of acoustic structure derives from an active process of maintenance, rather than a loss of plasticity.

Here, we investigate whether adult birds exert analogous adaptive control over syllable sequencing. First, we test whether birds can adjust probabilities of transitions between syllables in a directed manner in response to differential reinforcement. Then, we test whether birds restore these probabilities to their baseline values following termination of reinforcement. Our results indicate that the stability of transition probabilities does not simply reflect hardwiring of neural circuitry but instead depends on a capacity for adaptive adjustment coupled with an impetus to actively maintain probabilities at specific values.

## Materials and Methods

##### Subjects.

Seventeen adult (range, 115–933 d) Bengalese finches (*Lonchura striata domestica*) were used in this study. Nine birds were used to study the stability of transition probabilities over extended periods at baseline. Eight of these birds plus 5 additional birds (13 total) were used to study the effects of differential reinforcement via white noise playback following variable syllable transitions. Three additional birds were used to study the effects of white noise playback following stereotyped transitions. All birds were bred in our colony and housed with their parents until at least 60 d of age. During experiments, birds were isolated and housed individually in sound-attenuating chambers on a 14 h on/10 h off light cycle. All song recordings were of undirected song (i.e., no female was present). All procedures were performed in accordance with protocols approved by the University of California, San Francisco, Institutional Animal Care and Use Committee.

##### Computerized control of sound recording and of delivery of auditory feedback.

Song recording and delivery of auditory feedback were controlled using a modified version of EvTAF (Tumer and Brainard, 2007). Song was recorded continuously and the spectral structure of ongoing song was analyzed in 8 ms segments. We detected specific syllable transitions by comparing the spectral structure of recorded sound segments to multiple, successive spectral templates (Tumer and Brainard, 2007; Warren et al., 2011). For instance, we detected the transition “ab” (as opposed to “ac” or “xb,” where “x” refers to any syllable other than “a”) through successive spectral detection of “a” and “b” within a time interval that ensured that they were performed consecutively.

We differentially reinforced transitions at branch points by playing a loud white noise burst over the latter syllable in one transition (targeted transition) and not over alternative transitions (nontargeted transitions). The duration of white noise (WN) was 50–60 ms. For instance, for a branch point in which syllable “a” could be followed by either syllable “b” or “c,” to target the transition to “b,” we played WN over any instance of syllable “b” in which it was preceded by syllable “a.” WN playback was not contingent on variation in syllable acoustic structure (i.e., all acoustic variants of the targeted syllable at the branch point elicited WN playback). The mean latency from syllable onset to onset of WN delivery, across all experiments, was 25.3 ± 1.54 ms, and the mean syllable duration was 71.7 ± 6.7 ms. The mean fraction of the targeted syllables that overlapped with WN playback was 0.62 ± 0.02. Targeted syllables included both “harmonic stacks,” in which acoustic power is distributed across several discrete, stable frequencies, and broadband syllables, in which acoustic power is distributed more broadly. Qualitative inspection suggested no tendency for learning to vary with the type of syllable targeted.

We differentially reinforced transitions in 18 experiments at 13 branch points (from 13 individuals; Table 1). The most probable transition at the branch point was targeted with WN playback in 16 of 18 experiments; the second most probable transition was targeted with WN playback in 2 of 18 experiments (Table 1). There were two transitions from the branch point at baseline for 11 of 13 branch points and more than two possible transitions for 2 of 13 branch points.

Before initiating WN playback, song was recorded during a baseline period (2–4 d). Playback of WN was maintained for 4–12 d. Following termination of WN playback, song was monitored for a recovery period of at least 3 d. For all experiments, WN was omitted in a random subset (5–10%) of “catch” songs. This subset was used to measure transition probabilities independently of any acute effects of disrupting auditory feedback [as described by Sakata and Brainard (2006)].

##### Analysis.

All analysis was performed using custom MATLAB software.

##### Statistics.

To determine whether the means of two groups were significantly different from one another, we performed *t* tests. To determine whether a test statistic other than the mean (e.g., the probability of a transition at a branch point) was significantly different in the two groups, we performed permutation tests (Efron and Tibshirani, 1994). For permutation tests, data values were randomly permuted across groups 10,000 times. Permutations shuffled the identity of values across data groups but did not alter the size of each data group. By determining the frequency at which the differences in a test statistic across these resampled groups were as large as the originally observed difference in the test statistic, we generated a *P* value for rejection of the null hypothesis that the two data groups were drawn from the same distribution. Confidence intervals for estimates of the mean of a distribution were obtained via bootstrap resampling (Efron and Tibshirani, 1994).
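The permutation procedure described above can be sketched in a few lines. This is a minimal illustration in Python, not the custom MATLAB software used for the analysis; the function names (`permutation_test`, `fraction_true`), the two-sided comparison via absolute differences, and the fixed random seed are assumptions made for the example.

```python
import random


def fraction_true(values):
    """Example test statistic: fraction of 1s (e.g., a transition probability)."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, statistic, n_permutations=10_000, seed=0):
    """Return a P value for the null hypothesis that both groups share one distribution.

    Values are shuffled across groups while each group keeps its original size.
    The P value is the fraction of shuffles whose absolute difference in the
    statistic is at least as large as the observed absolute difference
    (two-sided comparison; an assumption of this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(statistic(group_a) - statistic(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistic(pooled[:n_a]) - statistic(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_permutations
```

In this sketch each group would be a list of 0/1 outcomes (1 if the transition of interest occurred at a given instance of the branch point), so that `fraction_true` returns the transition probability for that group.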

##### Effects of white noise playback on syllable sequencing.

The probability of a specific syllable transition (“transition probability”) at a branch point was defined as the fractional occurrence of that transition, relative to all possible transitions at that branch point (i.e., for a given branch point, all distinct transition probabilities summed to 1).
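As a hedged illustration of this definition, the sketch below counts transitions from a branch-point syllable and normalizes the counts so that they sum to 1. The representation of each song as a string of single-character syllable labels and the function name `transition_probabilities` are assumptions of the example, not details of the original analysis code.

```python
from collections import Counter


def transition_probabilities(songs, branch_syllable):
    """Fractional occurrence of each transition from `branch_syllable`.

    `songs` is an iterable of label strings such as "abcabd", where each
    character is one syllable (a hypothetical encoding for this sketch).
    Returns a dict such as {"bc": 0.75, "bd": 0.25}; the values sum to 1.
    """
    counts = Counter()
    for song in songs:
        # Walk over consecutive syllable pairs within a single song.
        for current, following in zip(song, song[1:]):
            if current == branch_syllable:
                counts[following] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {branch_syllable + nxt: n / total for nxt, n in counts.items()}
```

For example, `transition_probabilities(["abcabd", "abcabc"], "b")` counts three “bc” transitions and one “bd” transition.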

The effects of differential reinforcement of transitions at branch points were assessed by comparing transition probabilities across three time periods: (1) “baseline,” a 2–4 d time period before WN onset, (2) “white noise,” the final day of WN delivery, and (3) “post white noise,” the third day following the termination of WN. The percentage change in transition probability induced by differential reinforcement was defined as 100 × (*p*_{bas} − *p*_{WN})/*p*_{bas}, where *p*_{bas} refers to the transition probability during the baseline period and *p*_{WN} refers to the transition probability on the final day of WN delivery. The percentage recovery of transition probabilities following the termination of WN was defined as 100 × (*p*_{post} − *p*_{WN})/(*p*_{bas} − *p*_{WN}), where *p*_{post} refers to the transition probability on the third day following the termination of WN.
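The two definitions above translate directly into code. The following is a minimal sketch; the function and argument names (`percent_change`, `percent_recovery`, `p_bas`, `p_wn`, `p_post`) are chosen for the example, and the error raised when no change occurred is an added safeguard rather than part of the stated definitions.

```python
def percent_change(p_bas, p_wn):
    """Percentage change in transition probability induced by reinforcement.

    p_bas: transition probability during the baseline period.
    p_wn:  transition probability on the final day of WN delivery.
    """
    return 100 * (p_bas - p_wn) / p_bas


def percent_recovery(p_bas, p_wn, p_post):
    """Percentage recovery toward baseline after termination of WN.

    p_post: transition probability on the third day after WN ended.
    100 means a full return to baseline; values above 100 mean overshoot.
    """
    if p_bas == p_wn:
        # Safeguard added for this sketch: recovery is undefined without a change.
        raise ValueError("recovery is undefined when WN induced no change")
    return 100 * (p_post - p_wn) / (p_bas - p_wn)
```

With a baseline probability of 0.67, a final-day WN probability of 0.06, and a post-WN probability of 0.64, these functions give a change of roughly 91% and a recovery of roughly 95%.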

To analyze the time course of changes to transition probabilities, we compared the cumulative magnitude of learning that had been reached by a particular day of reinforcement to the cumulative amount of learning reached by the final day of reinforcement. Significant differences from the final level of learning (defined as 100% learning) were calculated via a *t* test. Here, no correction was made for multiple comparisons to maintain a conservative standard for what constituted a nonsignificant difference from the final level of learning (all differences that were reported as nonsignificant also would have been nonsignificant, and by a wider margin, following a correction for multiple comparisons).

For each branch point, we assessed the extent to which there were changes in sequence stereotypy associated with the induced changes in transition probability. Sequence stereotypy at a branch point refers to how often the most probable transition (or “dominant transition”) occurs relative to the total number of transitions at the branch point. This is similar to the measure of sequence consistency used by Scharff and Nottebohm (1991). Sequence stereotypy is greatest when there is only one possible transition and least when all possible transitions have equal probability. Branch points with greater stereotypy have lower branch point entropy (Sakata and Brainard, 2006), approaching 0 bits of entropy as the probability of the most dominant transition approaches 1, and branch points with lower stereotypy have higher branch point entropy, approaching 1 bit of entropy as the probability of two alternate transitions approaches 0.5:0.5. We used permutation tests as described above to determine whether the stereotypy of individual branch points was significantly altered as transition probabilities were modified.
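The relationship between stereotypy and entropy described above can be made concrete with a short sketch. It assumes that branch point entropy is the Shannon entropy (in bits) of the transition probabilities, which is consistent with the limits stated in the text (0 bits as the dominant probability approaches 1, and 1 bit at 0.5:0.5); the function names are hypothetical.

```python
import math


def sequence_stereotypy(probabilities):
    """Probability of the dominant (most probable) transition at a branch point."""
    return max(probabilities)


def branch_point_entropy(probabilities):
    """Shannon entropy, in bits, of the transition probabilities (assumed definition)."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("transition probabilities must sum to 1")
    # Terms with p == 0 contribute nothing, so they are skipped to avoid log2(0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)
```

Under this assumption, a branch point with probabilities 0.68 and 0.32 has a stereotypy of 0.68 and an entropy somewhat below 1 bit, whereas a completely stereotyped transition has an entropy of 0 bits.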

The effects of WN playback on completely stereotyped transitions (where there was only one possible transition) were assessed both by monitoring transition probabilities and by determining the relative frequency of occurrence of the transition, defined as the number of occurrences of the targeted transition divided by the total number of transitions in song, regardless of syllable identity.
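The relative frequency of occurrence defined above differs from a transition probability in its denominator: it is normalized by all transitions in song, not only those from one syllable. A minimal sketch follows, again assuming that songs are encoded as strings of single-character syllable labels; the function name is hypothetical.

```python
def relative_frequency(songs, transition):
    """Occurrences of `transition` divided by all syllable transitions in song.

    `transition` is a two-character string such as "cd". The denominator
    counts every consecutive syllable pair, regardless of syllable identity.
    """
    first, second = transition
    targeted = 0
    total = 0
    for song in songs:
        for current, following in zip(song, song[1:]):
            total += 1
            if current == first and following == second:
                targeted += 1
    if total == 0:
        raise ValueError("no transitions found in the supplied songs")
    return targeted / total
```

For instance, in the two toy songs `"abcd"` and `"acd"` there are five transitions in total, two of which are “cd.”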

##### Syllable acoustic structure.

We measured the extent to which acoustic structure of syllables changed in response to reinforcement by measuring changes to the fundamental frequency (FF) of the syllables that elicited WN playback. For 14 of 18 experiments, the baseline acoustic structure of the syllable targeted with WN permitted reliable measurement of FF, which was calculated as described previously (Tumer and Brainard, 2007). We compared the magnitude of difference in mean syllable FF across two equivalent time intervals; one interval was within the baseline period (first and last day) in which no WN was delivered, and the second interval was between the last day of the baseline period and day 4 of WN delivery.

##### History dependence of syllable transitions.

For all 13 branch points that were differentially reinforced via WN playback, we evaluated the history dependence of syllable transitions from the branch point by determining whether transition probabilities of the most probable transition were significantly influenced by which transition had occurred at the previous instance of the branch point. For example, for a branch point in which syllable “a” could be followed by either “b” or “c,” and for which transitions to “b” were more probable, we calculated both the conditional probability of a transition to “b,” given that a transition had occurred to “b” at the prior branch point, *p*(ab_{n}|ab_{n−1}), and also the conditional probability of a transition to “b,” given that a transition had occurred to “c” at the prior branch point, *p*(ab_{n}|ac_{n−1}). As a measure of history dependence, we computed the absolute value of the difference between these conditional transition probabilities, |*p*(ab_{n}|ac_{n−1}) − *p*(ab_{n}|ab_{n−1})|. These conditional transition probabilities were computed from baseline songs, before the onset of differential reinforcement. For calculation of conditional transition probabilities, we required that the prior syllable transition (e.g., ab_{n−1}) be in the same song. For the 2 of 13 branch points in which there were more than two possible transitions, all transitions other than the most probable transitions were combined into a single transition type, allowing for the same calculation of history dependence, |*p*(ab_{n}|ac_{n−1}) − *p*(ab_{n}|ab_{n−1})|, where “ab” refers to the most probable transition and “ac” refers to any of the less probable transitions.
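The history-dependence measure can be sketched as follows. The code assumes songs encoded as strings of single-character syllable labels; the function name and the error raised when a conditional probability cannot be estimated are choices made for this example. All transitions other than the dominant one are pooled into a single type, and the record of the prior transition is reset at the start of each song, as described above.

```python
def history_dependence(songs, branch, dominant):
    """|p(dominant_n | other_{n-1}) - p(dominant_n | dominant_{n-1})| at a branch point.

    branch:   the branch-point syllable (e.g., "a").
    dominant: the syllable that most often follows it (e.g., "b").
    """
    # counts[prev_was_dominant] = [times dominant followed, total instances]
    counts = {True: [0, 0], False: [0, 0]}
    for song in songs:
        previous = None  # the prior transition must come from the same song
        for current, following in zip(song, song[1:]):
            if current != branch:
                continue
            is_dominant = following == dominant
            if previous is not None:
                counts[previous][0] += int(is_dominant)
                counts[previous][1] += 1
            previous = is_dominant
    if counts[True][1] == 0 or counts[False][1] == 0:
        raise ValueError("not enough consecutive branch-point instances")
    p_after_dominant = counts[True][0] / counts[True][1]
    p_after_other = counts[False][0] / counts[False][1]
    return abs(p_after_other - p_after_dominant)
```

A value near 0 indicates that the transition chosen at one instance of the branch point carries little information about the next, whereas a value near 1 indicates strong dependence, such as strict alternation between transitions.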

We performed linear regressions with percentage modification of transition probability as a dependent variable and three distinct putative explanatory variables: age of the individual bird at the onset of WN, initial transition probability of the dominant transition, and history dependence of the branch point. The regressions were always performed for the experiment targeting the dominant transition with WN. For three branch points, differential reinforcement was repeated multiple times in independent experiments. In these cases, to avoid pseudoreplication, the first experiment was used for calculation of both history dependence and the percentage modification of transition probability.

## Results

### Transition probabilities at branch points remain stable over weeks

The song of an individual BF is composed of a discrete set of ∼5–10 acoustically distinct vocal elements, termed “syllables,” which are produced in sequences characteristic of the individual. We refer to each distinct syllable in an individual bird's repertoire with a unique label (e.g., the syllables for the song illustrated in Fig. 1*A–C* are labeled “a,” “b,” “c,” and so on) and focus in this study on the statistics that characterize the transitions between these categorically defined syllables. For BF song, some transitions between syllables are stereotyped. For example, syllable “a” for the song depicted in Figure 1*A–C* was always followed by syllable “b” (*p*_{ab} = 1.00; 1194 of 1194 cases over 4 d). In other cases, transitions between syllables are variable. For example, syllable “b” in this bird's song could be followed by either syllable “c” (*p*_{bc} = 0.68; 808 of 1194 cases), or syllable “d” (*p*_{bd} = 0.32; 386 of 1194 cases). We refer to syllables that can be followed by alternative transitions as branch points and characterize the sequencing of syllables at branch points by quantifying the transition probabilities for each of the alternate possible transitions (which sum to 1.0).

We found that transition probabilities in adult BF song remain highly stable over time. An example is shown in Figure 1*A–C* for a branch point at which syllable “b” could be followed by either syllable “c” or “d.” Transition probabilities were calculated for a random sample of songs on 4 consecutive days at the beginning of a 2 month period and 4 consecutive days at the end of the period (Fig. 1*C*). None of the transition probabilities on these 8 d was significantly different from the others (*p*_{bc} for each of the 8 d = 0.68, 0.66, 0.70, 0.72, 0.71, 0.67, 0.70, 0.72; n.s. by permutation test across all pairs, Bonferroni corrected). For nine branch points (from nine birds), we similarly compared transition probabilities over two time periods separated by 30–60 d (Fig. 1*D*). Measured transition probabilities spanned a broad range, from 0.05 to 0.74. The transition probabilities measured at the beginning and end of this period were highly correlated (*r* = 0.80), and for 17 of 20 transitions there was no significant change in transition probability (*p* > 0.05, permutation test across pairs, Bonferroni corrected). These data demonstrate that transition probabilities can span a broad range of values and remain stable at those specific values over periods of months, consistent with prior studies, which have suggested stability of transition probabilities (Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997; Yamada and Okanoya, 2003; Hampton et al., 2009). Hence, syllable sequencing in Bengalese finch song is a variable behavior in which the probabilities of transitions ordinarily remain stable, at values that are unique to individual transitions.

### Differential reinforcement of transitions at branch points induces rapid, adaptive modification of transition probabilities

The stability over time of transition probabilities could indicate that adult birds have lost the capacity to change sequencing adaptively. If this were the case, the specific values of transition probabilities at branch points might be relatively immutable, reflecting “hardwiring” of the neural circuitry that controls sequencing. We tested whether syllable sequencing in adult song could be modified adaptively by using an automated system to differentially reinforce birds for producing some syllable sequences over others. Previous experiments have shown that playback of loud WN that is contingent on the FF of a syllable induces birds to modify syllable FF to “escape” WN, indicating that WN is effective for aversive reinforcement (Tumer and Brainard, 2007; Charlesworth et al., 2011). Here, we used WN to deliver differential reinforcement that was contingent on the syllable transition that had just occurred in song, and not on the acoustic structure of individual syllables (see Materials and Methods). Hence, while previous experiments showed that differential reinforcement could shape the acoustic structure of an individual syllable, here we tested whether differential reinforcement could modify the normally stable probabilities of transitions between categorically distinct syllables.

Although transition probabilities normally remain stable, birds rapidly modified transition probabilities in response to differential reinforcement. Figure 2*A* illustrates an experiment in which WN was played back over one of two alternative transitions at a branch point (the same branch point shown in Fig. 1*A–C*). In this case, the transition from syllable “b” to “c” was targeted with WN, while the alternate transition, from “b” to “d,” was not. This reinforcement via contingent playback of WN over the “bc” transition induced a significant reduction in the probability of that transition within the first day of WN (baseline *p*_{bc} = 0.72; day 1 *p*_{bc} = 0.55; *p* < 0.05, permutation test). The probability of the “bc” transition decreased further over the subsequent 3 d, and on day 4 of reinforcement, *p*_{bc} was 0.39. This decrease in the probability of “bc” transitions resulted in a complementary increase in the probability of transitions to the alternative syllable “d” (baseline *p*_{bd} = 0.28; day 4 *p*_{bd} = 0.61). Hence, in this case, the bird adaptively reduced exposure to WN by decreasing the probability of the targeted transition and increasing the probability of the alternate, nontargeted transition.

The probabilities of transitions at a branch point could be increased or decreased depending on which transition was targeted. For example, for the experiment described above, the probability of the “bc” transition was systematically decreased over time when this transition was targeted with WN. However, for a later experiment from this same bird, the probability of the “bc” transition was systematically increased when the alternate transition was targeted with WN. This is illustrated in Figure 2*B*. Here, when the transition from syllable “b” to “d” was targeted with WN, the probability of the “bd” transition decreased (from a baseline value of *p*_{bd} = 0.31 to a value on day 4 of *p*_{bd} = 0.16), and the probability of the nontargeted “bc” transition increased.

As illustrated in these examples, transition probabilities at branch points always responded to differential reinforcement by shifting in an adaptive direction (reducing exposure to WN). Figure 2*C* and Table 1 summarize the changes to transition probabilities at branch points for 18 experiments (in 13 individuals) in which one transition was targeted with WN. Over the period of reinforcement (4–12 d), the probability of targeted transitions (green circles, WN) always decreased, and the magnitude of this decrease was significant in 15 of 18 cases (we later consider factors that might account for heterogeneity in the amount by which transition probabilities changed across different experiments). For each experiment, there was a compensatory increase in the probability of the nontargeted transitions (black circles, “no WN”). For experiments in which there were more than two possible transitions at a branch point (three experiments for a branch point with three possible transitions and one experiment for a branch point with four possible transitions), the probability of each of the nontargeted transitions increased (detailed in Table 1), suggesting that compensation for a reduction in the probability of the targeted transition was distributed across all possible alternatives. Compensatory increases in the probability of nontargeted transitions were restricted to transitions that were already present at a branch point; we never observed the introduction of novel transitions between syllables that were not present at baseline. The mean reduction in the probability of targeted transitions, measured on the last day of reinforcement, was 46.8 ± 6.9%, corresponding to a ∼47% reduction in the amount of WN experienced by the birds. Hence, differential reinforcement with WN consistently elicited learning in the form of directed, adaptive changes to sequencing at branch points.

Learned changes to transition probabilities occurred rapidly and reached a stable level after 4–5 d. This is suggested by the examples in Figure 2, *A* and *B*, in which the majority of changes to transition probabilities occurred during the first 2 d of reinforcement with little apparent change thereafter. To define the time course of learning systematically, we examined the progression of changes to transition probabilities in nine experiments in which reinforcement was maintained for at least 7 d and in which there was a significant decrease in transition probabilities (Fig. 2*D*). Changes to transition probabilities were normalized to the probabilities on the last day of reinforcement to examine the percentage of the final learning that had been completed by each day of reinforcement (see Materials and Methods). By day 4, the amount of learning that had been completed was 92.9 ± 5.9%, which was slightly below but not significantly different from the amount of learning assessed on the last day of reinforcement (Fig. 2*D*, *p* = 0.12, one-tailed *t* test comparing day 4 values with final values). By day 5, the amount of learning that had been completed was 101.4 ± 7.5%, indistinguishable from its final value (*p* = 0.57, one-tailed *t* test). Hence, the probability of targeted transitions was reduced to a new value by the end of day 4, with little change thereafter.

While differential reinforcement reduced the probability of targeted transitions, it never caused the total elimination of targeted transitions. This can be seen in Figure 2*C*, which plots the final transition probabilities after 4–12 d of reinforcement for all 18 experiments. In none of these cases did the final transition probability drop to zero. Moreover, in the subset of experiments in which we maintained reinforcement for at least 9 d, transition probabilities achieved stable, nonzero values by day 4 that were not further reduced by the final day (*p* = 0.24, one-tailed paired *t* test for change between day 4 and final day; *n* = 5). These data indicate that, while the probability of targeted transitions could be rapidly decreased, these transitions could not be readily eliminated.

Changes to sequencing of syllables occurred independently of changes to the acoustic structure of syllables. In these experiments, the playback of WN was contingent on which transition the bird produced at a branch point (i.e., “ab” vs “ac”), and not on the acoustic structure of any of the syllables surrounding the branch point. Under these conditions, in which playback of WN is not contingent on variation in the acoustic structure of targeted syllables, previous experiments indicate that acoustic structure will not change (Charlesworth et al., 2011). Consistent with this, we observed no change in syllable structure in experiments in which we drove changes to transition probabilities; for example, the magnitude of change in mean fundamental frequency (of the syllables that received WN playback) over the period of reinforcement was not significantly different from the magnitude of change in fundamental frequency that occurred over a comparable baseline period (mean change during baseline, 35.9 ± 10.0 Hz; mean change during reinforcement, 35.6 ± 7.8 Hz; paired *t* test, *p* = 0.38; *n* = 14).

These data demonstrate a previously unrecognized capacity of adult birds to make adaptive changes to the ordinarily stable sequencing of the syllables in their songs. As a corollary, they indicate that this stable sequencing does not simply reflect a loss of plasticity in the song premotor circuitry responsible for the control of syllable sequencing.

### Transition probabilities recover to a baseline set point following termination of reinforcement

To determine whether the stable transition probabilities that are normally present in adult song are actively maintained, we assessed whether transition probabilities were restored to their baseline values following termination of reinforcement. We found that transition probabilities did reliably recover their baseline values. An example of this recovery is shown in Figure 3*A*. Here, differential reinforcement over 6 d induced a large shift in transition probabilities (the probability of the targeted “ab” transition was reduced from 0.67 at baseline to a stable value of 0.06 ± 0.02 from days 4 to 6). After 6 d, the delivery of WN was terminated. Transition probabilities then gradually returned to their original baseline values, and by the third day of recovery the transition probabilities were not significantly different from their baseline values (day 3 recovery, *p*_{ab} = 0.64; baseline, *p*_{ab} = 0.67). A similar recovery was observed in all experiments in which transition probabilities were monitored after termination of reinforcement (*n* = 14). On average, by the third day after termination of WN, transition probabilities recovered 106.0 ± 10.4% of the way back toward baseline (Fig. 3*B*, *n* = 14). This recovery, in the absence of any instruction, indicates that a representation of baseline transition probabilities is retained by the nervous system even while overt transition probabilities are maintained at distinct values during the period of reinforcement. Moreover, this recovery indicates that the baseline transition probabilities represent stable set points that birds have both the capacity and impetus to restore.

A corollary of the finding that transition probabilities at branch points are restored to their baseline values is that the degree of sequence stereotypy at branch points is also restored to its baseline value. Sequence stereotypy at a branch point is highest when the probability of one transition approaches 1.0 and lowest when all transitions have equal probabilities (0.5:0.5 for cases with two transitions). Differential reinforcement could therefore drive either increases or decreases in sequence stereotypy depending on how transition probabilities were altered. For example, for the experiment shown in Figure 2*A*, differential reinforcement drove a decrease in sequence stereotypy (transitions were propelled closer to 0.5:0.5), while for the experiment in Figure 2*B*, differential reinforcement drove an increase in sequence stereotypy at the same branch point (transitions were propelled further from 0.5:0.5). Across experiments, differential reinforcement resulted in a significant decrease in sequence stereotypy in three cases (Fig. 3*B*, orange lines) and a significant increase in sequence stereotypy in seven cases (Fig. 3*B*, cyan lines). In every case, these changes to sequence stereotypy recovered following termination of reinforcement. While this recovery follows directly from the recovery of transition probabilities, it emphasizes a distinct point; these data show that the stably maintained transition probabilities at branch points correspond to stably maintained levels of sequence stereotypy. Of particular note in this respect are the cases in which differential reinforcement drove a significant increase in stereotypy and spontaneous recovery therefore reflected a decrease in stereotypy (Fig. 3*B*, cyan lines, *n* = 7). These cases illustrate that the stably maintained end point of learning for these branch points is a variable state that is not the most stereotyped that is possible.

### Stereotyped sequences are not modified in response to aversive reinforcement

In contrast to the reliable reduction in transition probabilities that was induced when variable transitions were targeted with WN, there was no alteration of syllable sequencing induced when stereotyped transitions were targeted with WN. An example experiment testing the modifiability of stereotyped transitions is shown in Figure 4, *A* and *B*. In this example, five transitions were stereotyped (“ia,” “cd,” “de,” “ef,” and “fg”) and two transitions were variable (syllable “a” could be followed by “b” or “c,” and syllable “b” could be followed by “i” or “c”). Here, we tested whether any adaptive modifications to sequencing occurred when WN was played over syllable “d.” The bird could have reduced the playback of WN by inserting a novel transition following syllable “c” to replace the transition to “d,” or by modifying other transition probabilities in song so as to reduce the relative frequency of occurrence of the “cd” transition (see Materials and Methods). For example, decreasing the probability of the “bc” transition would have reduced the relative frequency of the “cd” transition, and thus reduced the amount of WN the bird experienced per song.

We observed neither type of adaptive sequence modification for the example experiment depicted in Figure 4, *A* and *B*, or for two other experiments in which stereotyped transitions were targeted with WN. In all three experiments, we exhaustively examined the targeted, stereotyped transitions to see whether novel transitions were introduced in their place. In no case, however, did we observe the insertion of novel transitions to replace the targeted, stereotyped transitions. We also did not observe any changes in the relative frequency of the targeted transitions (Fig. 4*C*,*D*; 4.6 ± 5.8% reduction; paired *t* test = 0.56; *n* = 3). In contrast, for branch points, the probabilities of targeted transitions were significantly reduced in response to WN playback (Fig. 4*D*; 46.8 ± 6.9%). Thus, stereotyped transitions were not adaptively modified when targeted with WN, indicating that some degree of baseline sequence variation in the targeted transition is required for adaptive modification via aversive reinforcement.

### Branch points at which transitions exhibit greater history dependence are less modifiable

While stereotyped transitions were qualitatively less modifiable than variable transitions, there was also heterogeneity in how modifiable variable transitions were. This heterogeneity is illustrated in Figure 5*A*, which quantifies the percentage learning (percentage change in transition probability) induced for the most probable transition at each branch point (*n* = 13). Percentage learning ranged broadly across experiments, from 6.2% (almost no change in the targeted transition) to 90.2% (almost complete elimination of the targeted transition). Because learning in these cases had reached a stable asymptote (Fig. 2*C*,*D*), this heterogeneity in learning could not be attributed to different durations of exposure to reinforcement. We therefore examined several other variables that differed across experiments to see whether they could account for this heterogeneity.
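For concreteness, percentage learning can be sketched as the reduction in the targeted transition probability, expressed relative to its baseline value. The helper name `percent_learning` is hypothetical and the formula is our reading of "percentage change in transition probability". The example uses the baseline and asymptotic probabilities reported for the experiment in Figure 3*A*.

```python
def percent_learning(p_baseline, p_final):
    """Percentage reduction of the targeted transition probability.

    Hypothetical helper: 0% means no change and 100% means the targeted
    transition was eliminated entirely.
    """
    if not 0.0 < p_baseline <= 1.0:
        raise ValueError("baseline probability must be in (0, 1]")
    return 100.0 * (p_baseline - p_final) / p_baseline


# Baseline probability 0.67, reduced to 0.06 by differential reinforcement.
print(f"{percent_learning(0.67, 0.06):.1f}%")  # prints 91.0%
```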

We first tested whether the age of birds at the time of experiments could account for differences in learning. For many animals, including songbirds, the capacity for learning diminishes with age, even in adulthood (Lombardino and Nottebohm, 2000; Brainard and Doupe, 2001). Our birds were all nominally adults, but ranged from 115 to 933 d of age, consistent with the possibility that differences in age might explain some of the observed differences in learning. However, there was no correlation between the age of birds at the time of experiments and the amount of learning that occurred (Fig. 5*B*; *r* = 0.22, *p* = 0.46). Indeed, the single experiment with the greatest amount of learning was for a bird that was nearly 2 years of age. This suggests that for reinforcement-driven sequence learning, the capacity for learning does not diminish with age during adulthood.

We next examined whether differences in the baseline transition probabilities of branch points could explain differences in learning. Since stereotyped transitions (with a probability of 1.0) were not modifiable, we supposed that, for branch points, transitions with higher probabilities (closer to 1.0) might be less modifiable than transitions with lower probabilities. However, there was no correlation between baseline transition probability and learning (Fig. 5*C*; *r* = 0.11, *p* = 0.71). Moreover, within a narrow range of baseline transition probabilities (from 0.5 to 0.75), there were cases with both very large and very small amounts of learning (Fig. 5*C*). Hence, differences in baseline transition probabilities could not explain the observed heterogeneity in learning.

In contrast to baseline transition probability, we found that another feature that differs across branch points, which we term the “history dependence” of transitions, could account for a significant amount of the variation in learning across experiments. History dependence refers to the degree to which the transition that occurs at a specific instance of a branch point (such as an “ab” vs an “ac” transition) can be predicted by the history of recent transitions (such as whether an “ab” or “ac” transition occurred at the previous instance of the branch point). Such history dependence has been noted in prior studies of sequencing at branch points in Bengalese finch song (Fujimoto et al., 2011; Katahira et al., 2011). One particularly clear form of history dependence, which we observed in our data set, is alternation in the transitions that occur across successive instances of a branch point. Complete alternation would mean that an “ab” transition at one branch point would always be followed by an “ac” transition at the next instance of the branch point, and vice versa. In this case, although the overall probability of the “ab” transition is 0.5, whether this “ab” transition occurs at any branch point is completely predictable. This predictability makes branch points with a high degree of history dependence similar to stereotyped sequences, which are completely predictable and unmodifiable (Fig. 4*C*,*D*). We therefore supposed that branch points with high amounts of history dependence might be less modifiable than branch points with little or no history dependence.

Consistent with the possibility that differences in history dependence could account for heterogeneity in learning, we found that the degree of history dependence varied widely across the branch points in our data set. Figure 5*D–F* illustrates three branch points that had similar overall transition probabilities but different amounts of history dependence. The first branch point (Fig. 5*D*, “branch point 1”) exhibited strong history dependence in the form of almost perfect alternation. For this branch point, the overall probability of an “ab” transition was 0.52. However, the probability of an “ab” transition at a branch point, given that an “ac” transition occurred at the previous instance of the branch point, was almost 1.0 [*P*(ab_{n}|ac_{n−1}) = 0.96], while the probability of an “ab” transition given that an “ab” transition occurred at the previous instance of the branch point was close to zero [*P*(ab_{n}|ab_{n−1}) = 0.04]. In this case, the history dependence of the branch point (defined as the absolute value of the difference between these two conditional probabilities; see Materials and Methods) was 0.92 [history dependence = |*P*(ab_{n}|ac_{n−1}) − *P*(ab_{n}|ab_{n−1})| = 0.96 − 0.04 = 0.92; note that in this example the sum of the two conditional probabilities, 0.96 and 0.04, is 1.0 by coincidence and these two values need not sum to 1.0]. The second branch point (Fig. 5*E*, “branch point 2”) also exhibited significant alternation, but the degree of alternation was less extreme. In this case, the history dependence was 0.4 [|*P*(de_{n}|df_{n−1}) − *P*(de_{n}|de_{n−1})| = 0.86 − 0.46 = 0.4]. The third branch point (Fig. 5*F*, “branch point 3”) had no tendency toward alternation and was not history-dependent, as there was no significant difference between conditional transition probabilities. In this case, the history dependence was 0.04 [|*P*(gh_{n}|gi_{n−1}) − *P*(gh_{n}|gh_{n−1})| = |0.59 − 0.63| = 0.04]. These three examples spanned the range of history dependence that we observed across our data set. Figure 5*G* shows the amount of history dependence for each of the 13 branch points that we studied and indicates whether the amount of history dependence exceeded the 95% confidence interval for the amount expected by chance (gray bars). For 6 of 13 branch points, there was significant history dependence (red points), and for 5 of these cases, history dependence reflected an excess tendency for alternation in the transitions that occurred across successive branch points (as in the example branch points 1 and 2). These data illustrate that, independently of baseline transition probabilities, the degree to which transitions are “stochastic” versus “predetermined” can vary widely across branch points and raise the question of whether such differences in the history dependence of branch points can account for observed differences in learning.
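The history-dependence measure described above can be sketched in a few lines. The code below takes the ordered list of transitions observed at successive instances of one branch point and returns the absolute difference between the two conditional probabilities. The function names, the treatment of every non-target transition as "the alternative", and the shuffle-based estimate of the level expected by chance are all assumptions made for illustration; the procedure actually used is described in Materials and Methods. The example sequences are synthetic, not recorded data.

```python
import random


def history_dependence(transitions, target):
    """|P(target_n | other_{n-1}) - P(target_n | target_{n-1})| for one branch point."""
    pairs = list(zip(transitions, transitions[1:]))
    after_target = [cur == target for prev, cur in pairs if prev == target]
    after_other = [cur == target for prev, cur in pairs if prev != target]
    if not after_target or not after_other:
        raise ValueError("need instances following both kinds of transition")
    p_given_target = sum(after_target) / len(after_target)
    p_given_other = sum(after_other) / len(after_other)
    return abs(p_given_other - p_given_target)


def chance_upper_bound(transitions, target, n_shuffles=1000, seed=0):
    """95th percentile of history dependence after shuffling transition order.

    Assumed null model: shuffling preserves overall transition
    probabilities but destroys any dependence on the previous transition.
    """
    rng = random.Random(seed)
    shuffled = list(transitions)
    values = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        try:
            values.append(history_dependence(shuffled, target))
        except ValueError:
            continue  # skip shuffles where one conditional is undefined
    if not values:
        raise ValueError("no valid shuffles")
    values.sort()
    return values[int(0.95 * (len(values) - 1))]


# Synthetic example: perfect alternation between "ab" and "ac".
alternating = ["ab", "ac"] * 20
print(history_dependence(alternating, "ab"))  # 1.0
print(chance_upper_bound(alternating, "ab"))  # level expected by chance
```

In this synthetic case the overall probability of “ab” is 0.5, yet each transition is fully determined by the preceding one, so the measure takes its maximum value of 1.0.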

We indeed found that branch points that had greater amounts of history dependence exhibited less learning. Figure 5*H* illustrates that, across all branch points, there was a strong negative correlation between history dependence and percentage learning (*r* = −0.71; *p* = 0.007). Correspondingly, the average percentage learning for the branch points that exhibited significant history dependence (26.2 ± 8.3%; red triangle; *n* = 6) was significantly less than the learning for branch points that exhibited no history dependence (63.2 ± 10.6%; black triangle; *n* = 7; *p* = 0.02 for no difference between groups, *t* test). This difference in learning between history-dependent and history-independent branch points suggests that the capacity for adaptive sequence modification is greatest when variation is generated stochastically, proximal to the branch point, rather than when the syllable transition is determined earlier in the song.

## Discussion

### Adaptive modification of syllable sequencing

Our results demonstrate that adult songbirds have a previously unrecognized capacity to adaptively modify syllable sequencing. These modifications were adaptive, because they reduced the amount of aversive WN delivered. Moreover, they were local and specific; only targeted transitions were modified and syllable acoustic structure did not change. These results contrast with prior studies, in which disruptive manipulations caused nonadaptive, song-wide deterioration of sequencing and syllable structure (Woolley and Rubel, 1997; Leonardo and Konishi, 1999; Yamada and Okanoya, 2003; Thompson and Johnson, 2007). The changes to sequencing, which persisted over days, also differ from “on-line” changes caused by feedback perturbation or presentation of a female (Sossinka and Bohner, 1980; Sakata and Brainard, 2006; Sakata et al., 2008).

Our results reveal that similar “learning rules” govern changes to hierarchically distinct aspects of song. Previous studies showed that syllable structure can be modified by differential reinforcement of rendition-to-rendition variation in fundamental frequency (Tumer and Brainard, 2007; Andalman and Fee, 2009; Charlesworth et al., 2011; Warren et al., 2011). Here, we demonstrate that similar principles govern changes to transition probabilities. Both forms of learning proceed over hours and are rapidly reversed following termination of reinforcement. However, despite these similarities, the neural substrates controlling syllable sequencing and syllable acoustic structure are different. Distinct syllables are produced by different sequences of activation of neural ensembles in the cortical analogs HVC and RA (Hahnloser et al., 2002; Fee et al., 2004; Leonardo and Fee, 2005; Sober et al., 2008; Wohlgemuth et al., 2010; Fujimoto et al., 2011). In contrast, rendition-to-rendition variation in fundamental frequency is associated with subtle variation in firing rate of neurons within an RA ensemble (Sober et al., 2008). Hence, while modification of fundamental frequency could reflect modulation in firing rates of RA neurons, modification of syllable sequencing likely reflects gross changes in the sequence of activation of neurons within the entire recurrent song motor circuit (Hahnloser et al., 2002; Fee et al., 2004; Ashmore et al., 2005; Jin, 2009; Fujimoto et al., 2011).

### Variation is a stable end point of learning

While our results demonstrate that statistics of syllable sequencing can be modified, they also reveal a drive to maintain those statistics at a stable set point. Following termination of reinforcement, transition probabilities returned to their baseline values. Hence, the nervous system maintains a stable representation of baseline sequencing statistics and restores those statistics without externally imposed instruction. The finding that targeted transitions were never eliminated also suggests an impetus to restore baseline values. This lack of elimination might reflect an equilibrium between opposing drives to restore baseline values and to avoid WN (Criscimagna-Hemminger and Shadmehr, 2008; Warren et al., 2011).

More generally, our results demonstrate that persistent variation in sequencing can reflect a stably maintained end point of learning, rather than a limitation in motor control. This is especially clear for experiments in which reinforcement-driven increases in sequence stereotypy were followed by self-driven decreases in stereotypy (Fig. 3*B*). This contrasts with prior results, in which recovery from disruptions of sequencing always entailed increases in stereotypy (Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997; Leonardo and Konishi, 1999; Yamada and Okanoya, 2003). Hence, while variation in motor skills may be minimized via extensive practice (Schwartz, 1980; Cohen et al., 1990; Grafton et al., 2002), for birdsong there is a drive to maintain sequence variation above the minimum possible value.

Interestingly, the maintained variation in sequencing at branch points was not always stochastic. Rather, the specific transition that occurred (i.e., “ab” vs “ac”) could in some cases be partially predicted based on preceding sequences. Such history dependence is consistent with analyses indicating that transitions in Bengalese finch song do not always follow a first-order Markov process (Fujimoto et al., 2011; Jin and Kozhevnikov, 2011; Katahira et al., 2011). Here, we demonstrate that history-dependent sequencing often manifests as a tendency toward alternation between possible transitions at successive branch points. While the reasons for this remain to be determined, it is noteworthy that, in other systems, including studies of human behavior, deviations from randomness also frequently manifest as excessive alternation (Wagenaar, 1972; Rapoport and Budescu, 1997).

The normal stability of sequencing may depend on error correction in which deviations from baseline statistics are detected and corrected. For fundamental frequency of individual syllables, a perceptual target is learned during development, and error correction operates in adulthood to reduce perceived differences from this target (Sober and Brainard, 2009). Similarly, Bengalese finches might actively correct differences between produced song and a learned target that encodes transition probabilities. Without error correction, prior work suggests that transition probabilities might drift over time, because these probabilities are highly sensitive to synaptic “weights” between neural assemblies in premotor areas (Yamashita et al., 2008; Jin, 2009). In this context, the plasticity that we have demonstrated might function normally to maintain stable sequencing despite the instability of synapses (Holtmaat et al., 2005; Elhilali et al., 2007; Kasai et al., 2010), rather than to modify sequencing.

### Potential benefits of sequence variation

Variation in sequencing may be adaptive in diverse behaviors, including courtship, predator avoidance, and foraging (Humphries and Driver, 1967; Real and Caraco, 1986; Searcy and Andersson, 1986). Maintained sequence variation in birdsong could be adaptive, as it enables a complex and varying courtship repertoire to be constructed from a few distinct elements. Consistent with this possibility, females of some species prefer songs with greater sequence complexity (Searcy and Andersson, 1986; Searcy and Nowicki, 1998). However, studies in Bengalese finches provide limited support for this possibility (Morisaka et al., 2008; Kato et al., 2010).

### Influence of history dependence on sequence modification

Stereotyped sequences were not modified by WN playback (Fig. 4*C*,*D*), indicating that baseline variation in sequencing is required for adaptive modification of transition probabilities. However, variation alone was insufficient to enable robust sequence modification. Instead, there was an inverse relationship between the degree to which branch points exhibited history dependence and the degree to which those branch points could be modified (Fig. 5*H*). Our findings therefore show that history dependence has a functional consequence of limiting the capacity for adaptive modification. This effect of history dependence is surprising because, in our paradigm, the same amount of WN is delivered, with the same contingency, whether a transition is performed 50% of the time in an alternating, history-dependent fashion or 50% of the time in a stochastic, history-independent fashion.

The differences in modifiability of history-dependent and history-independent transitions might arise because of differences in the stability of underlying trajectories of neural activity. One model for the generation of history-independent transitions at branch points posits that trajectories of neural activity controlling syllable production have unstable bifurcation points where propagation can continue along alternate trajectories (Yamashita et al., 2008; Jin, 2009; Hanuschkin et al., 2011). In such models, synaptic strengths and noise at these bifurcation points determine the relative probabilities of alternate trajectories (Yamashita et al., 2008; Jin, 2009; Hanuschkin et al., 2011). Consequently, for history-independent transitions, probabilities of alternate transitions could be modified by adjusting synaptic weights at unstable bifurcations immediately preceding the branch point. In contrast, for history-dependent transitions, there may be little instability in underlying neural trajectories at branch points. For example, for alternation, an “ab” transition at one branch point guarantees an “ac” transition at the next. In this case, there is no unstable bifurcation in neural activity at the branch point. Rather, activity may be as entrenched as it is for completely stereotyped sequences. Thus, history-dependent transitions may be less modifiable because they are produced with less instability in underlying neural trajectories. More broadly, our results indicate that the degree to which behavioral transitions are produced by stochastic versus deterministic processes may constrain how readily those transitions are modifiable.

### Potential neural mechanisms

An avian cortical-basal ganglia circuit, the anterior forebrain pathway (AFP), plays an essential role in adaptive modification of individual syllables in adult song (Andalman and Fee, 2009; Warren et al., 2011; Charlesworth et al., 2012). Hence, the AFP might play a similar role in modification of transition probabilities. Consistent with this possibility, lesions of LMAN (an output of the AFP) prevent or reverse some sequencing changes that occur following disruptive manipulations in adults (Williams and Mehta, 1999; Brainard and Doupe, 2000; Thompson and Johnson, 2007; Thompson et al., 2007; Nordeen and Nordeen, 2010). Moreover, lesions and inactivations of LMAN increase sequence stereotypy in juveniles (Bottjer et al., 1984; Scharff and Nottebohm, 1991; Olveczky et al., 2005; Stepanek and Doupe, 2010). However, lesions of LMAN in adult Bengalese finches do not grossly alter sequencing or its modulation by social context (Hampton et al., 2009). Nor do lesions prevent some changes to sequencing following deafening (Horita et al., 2008). Hence, it remains to be determined whether basal ganglia circuitry contributes to adaptive modification of syllable sequencing in adult song as it does to modification of syllable acoustic structure.

## Footnotes

This work was supported by a National Institute on Deafness and Other Communication Disorders R01 Award and a National Science Foundation (NSF) Award (M.S.B.), the Sloan-Swartz Foundation (E.C.T.), and the NSF Graduate Research Fellowship (T.L.W., J.D.C.). We thank Kris Bouchard, Allison Doupe, and David Mets for comments on this manuscript, and Jon Sakata for advice as well as for assistance in data collection.

- Correspondence should be addressed to Timothy L. Warren, Center for Integrative Neuroscience, 675 Nelson Rising Lane, Department of Physiology, University of California, San Francisco, San Francisco, CA 94158-0444. twarren@phy.ucsf.edu