## Abstract

Variable motor sequences of animals are often structured and can be described by probabilistic transition rules between action elements. Examples include the songs of many songbird species such as the Bengalese finch, which consist of stereotypical syllables sequenced according to probabilistic rules (song syntax). The neural mechanisms behind such rules are poorly understood. Here, we investigate where the song syntax is encoded in the brain of the Bengalese finch by rapidly and reversibly manipulating the temperature in the song production pathway. Cooling the premotor nucleus HVC (proper name) slows down the song tempo, consistent with the idea that HVC controls moment-to-moment timings of acoustic features in the syllables. More importantly, cooling HVC alters the transition probabilities between syllables. Cooling HVC reduces the number of repetitions of long-repeated syllables and increases the randomness of syllable sequences. In contrast, cooling the downstream motor area RA (robust nucleus of the arcopallium), which is critical for singing, does not affect the song syntax. Unilateral cooling of HVC shows that control of syllables is mostly lateralized to the left HVC, whereas transition probabilities between the syllables can be affected by cooling HVC in either hemisphere to varying degrees. These results show that HVC is a key site for encoding song syntax in the Bengalese finch. HVC is thus involved both in encoding timings within syllables and in sequencing probabilistic transitions between syllables. Our findings suggest that probabilistic selections and fine-grained timings of action elements can be integrated within the same neural circuits.

**SIGNIFICANCE STATEMENT** Many animal behaviors such as birdsong consist of variable sequences of discrete actions. Where and how the probabilistic rules of such sequences are encoded in the brain is poorly understood. We locally and reversibly cooled brain areas in songbirds during singing. Mild cooling of area HVC in the Bengalese finch brain—a premotor area homologous to the mammalian premotor cortex—alters the statistics of the syllable sequences, suggesting that HVC is critical for birdsong sequences. HVC is also known for controlling moment-to-moment timings within syllables. Our results show that timing and probabilistic sequencing of actions can share the same neural circuits in local brain areas.

## Introduction

Variable sequences of discrete actions are prevalent in animal and human behaviors. Examples include human speech, birdsong (Doupe and Kuhl, 1999), whale song (Payne and McVay, 1971), and grooming in rodents (Cromwell and Berridge, 1996). These behavioral sequences display regularities and structures that are often referred to as “action syntax” (Lashley, 1951), in analogy to syntax in language (Chomsky, 1965). Where and how such syntactic rules for actions are encoded in the brain remains an unsolved problem.

The songbird is a model system for studying the neural mechanisms of vocal sequences (Doupe and Kuhl, 1999). Many songbird species sing songs with variable sequences (Okanoya, 2004). One often-studied species is the Bengalese finch, the song of which consists of variable sequences of stereotyped vocal elements called syllables (Fig. 1*A*,*B*; Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997; Jin and Kozhevnikov, 2011). Certain statistical properties of the song, such as repeat distributions (the probability of observing a syllable a certain number of times in a row) and pairwise transition probabilities (the probability of observing one syllable after another; Fig. 1*C*), are stable over long periods of time (Warren et al., 2012). Statistical rules describing patterns such as these comprise the song syntax (Okanoya, 2004).

It has been established that the premotor nucleus HVC (proper name) plays a role in controlling the tempo of singing in songbirds (Hahnloser et al., 2002; Long and Fee, 2008; Long et al., 2010). However, the role of HVC in song syntax remains unclear. One idea is that the probabilistic syllable selection is determined within HVC (Jin, 2009; Hanuschkin et al., 2011; Wittenbach et al., 2015). This mechanism is built on the notion that syllables are encoded with chain networks of HVC neurons projecting to the premotor nucleus RA (robust nucleus of the arcopallium; Fee et al., 2004; Long et al., 2010). As the activity propagates along such a syllable chain, HVC_{RA} neurons burst once at precise times and drive RA neurons to generate moment-to-moment acoustic features (Fee et al., 2004). When the activity reaches the end of the syllable chain, it briefly activates a number of syllable chains that are connected. Among them, the next syllable chain to propagate spikes is selected stochastically via a winner-take-all mechanism mediated through HVC interneurons, completing the probabilistic syllable transition (Chang and Jin, 2009; Jin, 2009). An alternative idea is that sequences are determined, not in HVC, but in areas upstream of HVC; lesion studies suggested the thalamic nucleus Uva (nucleus uvaeformis; Williams and Vicario, 1993) or NIf (nucleus interface of the nidopallium; Hosino and Okanoya, 2000). This proposal fits with the general scheme of hierarchical organization of motor sequence control (Rosenbaum et al., 1983; Yu and Margoliash, 1996). A third idea is that song is produced via activity propagating around a feedback loop consisting of HVC, RA, the brainstem, and Uva (Wild, 1997; Schmidt, 2003; Ashmore et al., 2005), with this loop generating HVC_{RA} bursts (Gibb et al., 2009; Hamaguchi et al., 2016). In this model, any of the nuclei in the loop could determine syllable sequences.

To evaluate the role of HVC in song syntax, we reversibly cooled HVC as a means of perturbing its dynamics in singing Bengalese finches. For comparison, we also cooled RA, the immediate target of HVC in the song pathway. Previously, cooling has been used in the zebra finch (Long and Fee, 2008; Aronov and Fee, 2011; Hamaguchi et al., 2016) and the canary (Goldin et al., 2013) to investigate how different brain regions contribute to the control of song timing. Here, we applied the same technique to understand how song syntax is encoded. We found that cooling HVC systematically affects the transition probabilities between the syllables. In contrast, cooling RA produced minimal effects. Our observations show that HVC is a key area for shaping the song syntax of the Bengalese finch. HVC is thus critical for both moment-to-moment timings within syllables and probabilistic sequencing of syllables.

## Materials and Methods

#### Subjects

Eight male Bengalese finches (*Lonchura striata domestica*, >120 d after hatching) were used in the experiments. During the experiments, birds were housed individually in sound-attenuating chambers on a 14 h/10 h light/dark photoperiod. Food and water were provided *ad libitum* except for the night before surgery. All procedures were performed in accordance with the protocol approved by the local institutional animal care and use committee.

#### Reversible cooling

We constructed a Peltier device for reversible cooling or heating of the songbird brain. The device used a convection air-cooled heat sink and was similar to one described previously (Aronov and Fee, 2011). Two Peltier modules (12-0.45-1.3; TE Technology) sharing the same heat sink were connected in series and each had a gold-plated silver contact pad attached to the cold side. Jumper connectors (A9577-001; Omnetics) were used to switch between bilateral and unilateral modes of operation (Fig. 2*A*). The cooling pads were placed bilaterally on the brain surface on top of HVC (Fig. 2*A*). The device was attached to the skull using dental acrylic. The temperature of the two cooling pads was changed bilaterally or unilaterally. A calibration of HVC temperature as a function of the current applied to the Peltier module of the device was performed. The device was capable of altering HVC temperature by ∼5°C in either direction (Fig. 2*B*). Manipulating the temperature of one hemisphere did not affect the other hemisphere (Fig. 2*B*). The device was connected to a rotary commutator with a flexible cable, which allowed the bird to move freely. Temperature in HVC was monitored with a miniature thermocouple (5SRTC-TT-K-40–36; Omega) inserted in the vicinity of HVC at the depth of 0.5 mm. We also made a probe attached to the device for cooling RA, as was done in previous studies for cooling deeper brain areas (Long and Fee, 2008). The cooling probes were implanted bilaterally into RA. To measure RA temperature, a thermocouple was inserted anterior to the probe to reach RA from a different angle. Another thermocouple was inserted in the ipsilateral HVC at a depth of 0.5 mm beneath the brain surface. Both RA and HVC temperatures were recorded simultaneously while applying currents to the device.

#### Data collection

Undirected songs were recorded with an omnidirectional microphone (Cardioid Pro 44; Audio-Technica) during temperature manipulations. The audio signal was amplified (DMP3; M-Audio), band-pass filtered (0.3–10 kHz), digitized with a 16-bit A/D card (PCI-6251; National Instruments), and recorded using customized song-triggered software written in MATLAB (The MathWorks) at a sampling rate of 40 kHz. Song recording was stopped when the postsong silence exceeded 2 s. Five to seven different current settings were used in each bird, ranging from −75 mA to 200 mA, corresponding to temperature changes from ∼2.0°C to ∼−4.5°C relative to the brain temperature measured in the normal condition (40.3 ± 1.7°C during the waking hours). Whenever current was adjusted, we allowed 10 min for the temperature to stabilize. A set of experimental conditions consisted of two to three temperature conditions interleaved with normal conditions. This set was repeated every 2–3 d.

Segmentation of the song into syllables and intersyllable gaps was done offline based on the envelope of the sound amplitude crossing a threshold. Syllable identification was first performed automatically using a custom sorting algorithm (Jin and Kozhevnikov, 2011). The results of automatic sorting were verified manually by visual inspection of the syllable spectrograms. The end of a song in a continuous recording was identified by the appearance of a silent period longer than 2 s or an introductory note. Introductory notes were excluded from analysis.
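The threshold-crossing segmentation described above can be illustrated with a minimal Python sketch. It is not the software used in this study (which was written in MATLAB); the function name `segment_envelope`, the toy envelope, the threshold, and the sampling rate are all hypothetical choices made for illustration.

```python
def segment_envelope(envelope, threshold, fs):
    """Return (onset, offset) times in seconds where the envelope exceeds threshold.

    envelope  : list of amplitude-envelope samples (hypothetical input)
    threshold : amplitude threshold (hypothetical value)
    fs        : sampling rate of the envelope in Hz
    """
    segments = []
    start = None
    for i, value in enumerate(envelope):
        above = value > threshold
        if above and start is None:
            start = i                      # upward crossing: syllable onset
        elif not above and start is not None:
            segments.append((start / fs, i / fs))  # downward crossing: offset
            start = None
    if start is not None:                  # recording ended during a syllable
        segments.append((start / fs, len(envelope) / fs))
    return segments


def gaps_between(segments):
    """Intersyllable gaps are the silent intervals between consecutive syllables."""
    return [(a_off, b_on) for (_, a_off), (b_on, _) in zip(segments, segments[1:])]


# Toy envelope sampled at 10 Hz: two syllables separated by one gap.
toy_envelope = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
syllables = segment_envelope(toy_envelope, threshold=0.5, fs=10)
gaps = gaps_between(syllables)
```

In this sketch a syllable is any contiguous run of samples above threshold; smoothing of the envelope and minimum-duration criteria, which a practical pipeline would likely need, are omitted.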

#### Data analysis

##### Syllable and gap durations.

We compared the durations of the song syllables and the intersyllable gaps under the normal condition and when HVC or RA temperature was manipulated. To eliminate possible variation of the song over days, the comparison was done between data collected on the same day. At each temperature, the mean duration of each syllable or gap was computed. To compute the change of the mean durations with temperature, we fit a weighted linear regression model of the mean duration as a function of change in temperature. The weight at each temperature was the inverse square of the SD of the durations measured at that temperature. The fractional stretch of the duration with cooling, denoted as *dD*_{T}, was obtained by dividing the slope of the line by the mean duration in the normal condition. The fractional stretch is expressed in percent per degree Celsius.
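As an illustration of this fit, the following Python sketch computes a weighted least-squares line with weights equal to the inverse square of the SD and converts the slope into a fractional stretch. It is a sketch under stated assumptions, not the analysis code of this study: the function names and the data values are hypothetical, and the sign convention (a positive value means longer durations with cooling, with Δ*T* negative for cooling) is assumed so that it agrees with *D* = *D*_{0}(1 − *dD*_{T} · Δ*T*) used in the null model.

```python
def weighted_linear_fit(x, y, sd):
    """Weighted least-squares line y = intercept + slope * x, weights 1 / sd**2."""
    w = [1.0 / s ** 2 for s in sd]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


def fractional_stretch(delta_t, mean_dur, sd_dur):
    """Percent stretch per degree Celsius of cooling (assumed sign convention)."""
    _, slope = weighted_linear_fit(delta_t, mean_dur, sd_dur)
    d0 = mean_dur[delta_t.index(0.0)]   # mean duration in the normal condition
    return -slope / d0 * 100.0


# Hypothetical data: a 100 ms syllable that lengthens by 3 ms per degree of cooling.
delta_t = [-4.0, -2.0, 0.0, 2.0]        # temperature change (deg C)
mean_dur = [112.0, 106.0, 100.0, 94.0]  # mean duration at each temperature (ms)
sd_dur = [4.0, 3.0, 3.0, 3.0]           # SD of durations at each temperature (ms)
stretch = fractional_stretch(delta_t, mean_dur, sd_dur)
```

Because the toy data lie exactly on a line, the weights do not change the slope here; with real data, conditions with more variable durations contribute less to the fit.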

##### Syllable repetitions.

For each repeated syllable, the effect of temperature on the length of repetitions was quantified with *dR*_{T}, which is the slope of a weighted linear regression model of the mean repetition length as a function of the temperature. Based on the cooling effects, we defined two classes of repeated syllables, denoted type I (with positive *dR*_{T}) and type II (with zero *dR*_{T}). To define these two classes rigorously, we fit the distribution of *dR*_{T} values across repeated syllables with a mixture of two Gaussians using the function *mle* in MATLAB for maximum likelihood estimation. To determine whether the two types could be distinguished by the variability of the repetition lengths, we trained a linear classifier using linear discriminant analysis (McLachlan, 2004) to predict type I versus type II based on the SD of the repetition length distribution in the normal condition. We quantified the goodness of fit by the misclassification rate via leave-one-out cross-validation. One syllable was near the border and was classified as type I based on temperature sensitivity but as type II based on SD of repetition distribution in the normal condition. To keep the criterion consistent across analyses, we classified it as type II for subsequent analyses. Some long-repeating syllables had a bimodal distribution of repetition number, with a sharp peak at one to two repetitions and a broad peak with a long tail (Wittenbach et al., 2015). This has been suggested to be a “many-to-one mapping” from multiple neural representations to syllables sharing similar acoustic features (Jin and Kozhevnikov, 2011). For these syllables, we discarded the repetitions with lengths that were ≤2 and used only the broad part of the distribution.
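The repetition lengths that enter these analyses can be obtained by run-length counting. The Python sketch below is a hypothetical illustration (the function names and the toy song are not from this study); the `min_len` argument mimics the step of discarding repetitions of length ≤2 for syllables with a bimodal repeat distribution.

```python
from itertools import groupby


def repetition_lengths(song, syllable, min_len=1):
    """Lengths of consecutive runs of `syllable` in a labeled song.

    song     : sequence of syllable labels (a string works, one label per character)
    min_len  : runs shorter than this are discarded (use 3 to drop lengths <= 2)
    """
    runs = [len(list(group)) for label, group in groupby(song) if label == syllable]
    return [n for n in runs if n >= min_len]


def mean(values):
    return sum(values) / len(values)


# Toy song: syllable "A" occurs in runs of 3, 2, and 5.
toy_song = "AAABAACAAAAA"
all_runs = repetition_lengths(toy_song, "A")
broad_part = repetition_lengths(toy_song, "A", min_len=3)
mean_repeat = mean(broad_part)
```

The mean repetition length computed this way at each temperature is the quantity whose regression slope against temperature defines *dR*_{T}.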

##### Branch points.

A branch point in the song syntax is where a syllable transitions probabilistically to two or more other syllables. To compute the transition probability *p*_{ij} from syllable *i* to syllable *j* (*i* ≠ *j*), we counted the number of transitions from syllable *i* to *j* and divided it by the total number of transitions from syllable *i*. This creates a matrix of transition probabilities, the rows of which sum to 1. In analyzing the branching points, segments of repeating syllables are treated as a single “repeated syllable.” Therefore, the diagonal elements of the transition matrix were set to zero. Branch points were identified under the normal condition. To account for the possibility of misidentification of syllables due to ambiguity, we excluded transitions with probabilities below a noise threshold (see below). In the cooling/heating conditions, we kept the transitions with probabilities below the noise level if the transitions existed in the normal condition. This enabled us to monitor the changes of the transition probabilities across all conditions.
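A minimal Python sketch of this counting procedure follows. It is illustrative only: the function names and toy songs are hypothetical, and the single scalar `threshold` is a simplification of the transition-specific noise threshold described in the next paragraphs.

```python
from collections import Counter, defaultdict


def collapse_repeats(song):
    """Treat each run of a repeating syllable as a single 'repeated syllable'."""
    collapsed = []
    for label in song:
        if not collapsed or collapsed[-1] != label:
            collapsed.append(label)
    return collapsed


def transition_probabilities(songs):
    """Row-normalized transition probabilities p_ij (i != j) pooled over songs."""
    counts = defaultdict(Counter)
    for song in songs:
        seq = collapse_repeats(song)
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, targets in counts.items():
        total = sum(targets.values())
        probs[src] = {dst: n / total for dst, n in targets.items()}
    return probs


def branch_points(probs, threshold=0.0):
    """Syllables with two or more outgoing transitions above a (hypothetical) cutoff."""
    result = {}
    for src, targets in probs.items():
        kept = {dst: p for dst, p in targets.items() if p > threshold}
        if len(kept) >= 2:
            result[src] = kept
    return result


# Toy songs, one character per syllable.
toy_songs = ["KKBD", "KDB", "KBBK"]
p = transition_probabilities(toy_songs)
bp = branch_points(p)
```

Collapsing repeats before counting is what makes the diagonal of the transition matrix zero, so each row sums to 1 over transitions to other syllables only.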

We used a fairly conservative procedure for estimating the noise level of the transition probabilities. We estimated the rate of ambiguous syllables as follows. We randomly chose 10 songs under the most cooled condition for each bird and labeled the syllables that did not clearly belong to any type. We reasoned that more ambiguity arises due to more deformations of acoustic structures in these conditions and we should obtain an upper bound for the error rates. The error rate of syllable identification for each bird is *p*_{e} = *N*_{e}/*N*, where *N*_{e} and *N* are the number of ambiguous syllables and the total number of syllables, respectively. *p*_{e} can fluctuate if a different set of *N* syllables were observed. Because such sampling leads to a binomial distribution, we estimated the upper 95% confidence level for *p*_{e} using the Wilson score interval with continuity correction (Newcombe, 1998), given by the following:
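
$$
w^{+} = \frac{2Np_{e} + z^{2} + 1 + z\sqrt{z^{2} + 2 - \frac{1}{N} + 4p_{e}\left[N(1 - p_{e}) - 1\right]}}{2(N + z^{2})},
$$

where *z* = 1.96. If *w*^{+} > 1, it was set to *w*^{+} = 1.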
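The upper limit of the Wilson score interval with continuity correction can be computed as in the following Python sketch, which uses the standard form of the upper bound given by Newcombe (1998). The function name and the example counts are hypothetical and are not data from this study.

```python
import math


def wilson_upper_cc(n_err, n_total, z=1.96):
    """Upper Wilson score limit with continuity correction, capped at 1.

    n_err   : number of ambiguous syllables (N_e)
    n_total : total number of syllables inspected (N)
    """
    n = n_total
    p = n_err / n_total
    radicand = z ** 2 + 2.0 - 1.0 / n + 4.0 * p * (n * (1.0 - p) - 1.0)
    upper = (2.0 * n * p + z ** 2 + 1.0 + z * math.sqrt(radicand)) / (2.0 * (n + z ** 2))
    return min(1.0, upper)


# Hypothetical example: 5 ambiguous syllables out of 500 inspected.
p_e = 5 / 500
w_plus = wilson_upper_cc(5, 500)
```

The bound is always at least as large as the observed rate, and it tightens toward the observed rate as the number of inspected syllables grows.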

For a syllable, *i*, the probability of misidentification is *P*(error|*i*) and the total count of incorrectly identified syllables is *e*_{i} = *P*(error|*i*)*N*_{i}, where *N*_{i} is total number of counts of syllable *i*. We assumed that, in the transition *i* → *j*, the case of both syllables being misidentified is negligible and a false detection of this transition is due to either a misidentified *i* or *j*, but not both. We further assumed that occurrence and misidentification are independent; therefore, the number of incorrect *i* → *j* transitions is given by the following:

$$
e_{ij} = P(\mathrm{error} \mid i)\,N_{i}P_{j} + P(\mathrm{error} \mid j)\,N_{j}P_{i},
$$

where *P*_{i} = *N*_{i}/*N* is the probability of syllable *i* in the song. Bayes' theorem leads to *P*(error | *i*) = *P*(*i* | error)*p*_{e}/*P*_{i} and *P*(error | *j*) = *P*(*j* | error)*p*_{e}/*P*_{j} and the following:

$$
e_{ij} = p_{e}\left[P(i \mid \mathrm{error})\,N_{j} + P(j \mid \mathrm{error})\,N_{i}\right] \le p_{e}\,(N_{i} + N_{j}).
$$

Therefore, a conservative estimate of the number of incorrect *i* → *j* transitions due to ambiguity is the error rate multiplied by the total counts of syllables *i* and *j*. With a confidence level of 95%, we have *p*_{e} ≤ *w*^{+}, so we set the noise threshold for the *i* → *j* transition to the following:

$$
w^{+}\,(N_{i} + N_{j}).
$$

For five birds, the error rate was 1.73 ± 0.64% when cooled by 4°C (the most cooled condition) and 0.52 ± 0.18% in the normal condition. We used the first value to calculate the noise threshold in the cooling conditions and the second to calculate the noise threshold in the normal and the warming conditions.

We used transition entropy, defined as *H*_{i} = − Σ_{j} *p*_{ij} log_{2} *p*_{ij}, to quantify the variability of the transitions at a branch point of syllable *i* (Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997; Sakata and Brainard, 2009). The transition entropy was normalized by the maximum possible value of entropy for a branch point with *n*_{i} possible transitions, which is log_{2} *n*_{i}.
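The transition entropy and its normalization can be sketched in a few lines of Python. The function names are hypothetical; the example probabilities are those reported in the Results for the branch point at syllable "K" in the normal condition (0.68, 0.32) and with HVC cooled by 4°C (0.45, 0.55).

```python
import math


def transition_entropy(probs):
    """Entropy in bits of the outgoing transition probabilities of one branch point."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def normalized_entropy(probs):
    """Entropy divided by its maximum, log2(n), for n possible transitions."""
    n = len(probs)
    if n < 2:
        return 0.0  # a single deterministic transition has zero entropy
    return transition_entropy(probs) / math.log2(n)


h_normal = normalized_entropy([0.68, 0.32])
h_cooled = normalized_entropy([0.45, 0.55])
h_uniform = normalized_entropy([0.5, 0.5])
```

With two targets the normalization factor is log_{2} 2 = 1, so the normalized and raw entropies coincide; the normalization matters when branch points with different numbers of targets are compared.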

##### Song bouts.

The end of a song bout was identified as a syllable followed by either a silent period >2 s or introductory notes. The significance of the change in the number of syllables in a bout with the temperature was assessed with the linear mixed model (LMM; see below), treating the subjects as a random factor. The entropy of the probability distribution of the ending syllables was calculated as *H*_{e} = − Σ_{i} *p*_{ie} log_{2} *p*_{ie}, in which *p*_{ie} is the probability for a song bout to end after syllable *i*. The rate of song production was quantified by the time elapsed between two consecutive song bouts.

##### Lateralization.

The contributions of left and right HVCs to syllable repetition were assessed by comparing the effect of temperature on repetition length, *dR*_{T}, between unilateral and bilateral cooling. To evaluate the lateralization of a syllable repetition, we first chose the side with the larger *dR*_{T} in unilateral cooling and then compared it with *dR*_{T} in the bilateral cooling. Hemispheric contributions to transition probabilities at branch points were assessed by comparing changes of the transition entropy between unilateral and bilateral cooling.

#### Tests for statistical significance

The criterion for significance in all tests was set at α = 0.05.

##### Syllable repetitions.

We used an LMM (Gelman and Hill, 2006) to analyze the relationship between the mean repetition length, *R*, and the change of temperature of either HVC or RA from the normal condition, Δ*T*. We allowed for the possibility that the slope and the intercept of the relationship can vary from syllable to syllable. The equation for the LMM is as follows:

$$
R_{ij} = (\beta_{0} + s_{j0}) + (\beta_{1} + s_{j1})\,\Delta T_{i} + \epsilon_{ij},
$$

where the first index denotes the temperature condition and the second denotes the syllable identity. *R*_{ij} is the mean repetition length for the *i*-th condition and the *j*-th syllable; Δ*T*_{i} is the temperature change for the *i*-th condition; β_{0} (fixed effect) is the mean intercept across syllables; *s*_{j0} (random effect) is the shift in the intercept for the *j*-th syllable; β_{1} (fixed effect) is the average slope across syllables; *s*_{j1} (random effect) is the shift in the slope for the *j*-th syllable; ϵ is the random error. The LMM analysis was performed using the *fitlme* function in MATLAB with the formula *R* ∼ 1 + Δ*T* + (1 | syllable label) + (Δ*T* | syllable label). We tested whether the coefficient β_{1} was significantly different from zero. β_{1} was estimated by the MATLAB implementation of maximum likelihood estimation and the *p*-value was given in the MATLAB output.

To determine whether HVC cooling affected the type I and type II repeated syllables differently, we introduced a dummy variable, *C*, for type I/II syllables and tested for an interaction between C and Δ*T*. In addition, we estimated the random effect of the syllable labels. The model was as follows:

$$
R_{ij} = \beta_{0} + \beta_{1}\,\Delta T_{i} + \beta_{2}\,C_{j} + \beta_{3}\,C_{j}\,\Delta T_{i} + s_{j0} + \epsilon_{ij}.
$$

We used the MATLAB function *fitlme* with the formula *R* ∼ 1 + Δ*T* + C + Δ*T*: C + (1 | syllable label) to test whether the interaction coefficient β_{3} was different from zero. The *p*-value was reported in the MATLAB statistical outputs. The comparison of HVC and RA cooling on syllable repetitions was performed using the same model by substituting type I/II with HVC/RA cooling when defining the dummy variable, *C*.

##### Null model for temperature effects.

A simple model attributes the cooling effects on repetition entirely to the stretches of the syllables and the gaps. There could be a fixed duration of time during which repetitions occur that is not affected by temperature changes. Longer syllables at cooler temperatures would then lead to fewer repetitions. Let *L* be the duration of a repeat bout. Because the syllables and gaps stretch linearly with temperature change, Δ*T*, the duration of one syllable plus gap can be expressed as *D* = *D*_{0}(1 − *dD*_{T} · Δ*T*), where *D*_{0} is the duration in the normal condition and *dD*_{T} is the fractional change in duration per unit change in temperature. The number of repetitions is then given by *N*(Δ*T*) = *L*/*D* = *N*(0)/(1 − *dD*_{T} · Δ*T*). Here, *N*(0) = *L*/*D*_{0} is the number of repetitions in the normal condition. Therefore, the model predicts that the repeat distribution at Δ*T* can be generated directly from that in the normal condition through rescaling by a factor of 1/(1 − *dD*_{T} · Δ*T*). The same scaling will then hold for the means of the distributions. To generate model predictions for how the mean number of repetitions will change as a function of Δ*T*, we set *dD*_{T} to the average between the experimentally observed values of this quantity for syllables and gaps. To assess the difference between the null model predictions and our actual observations, we compared *dR*_{T} from the model and the data. We centered the data by subtracting the mean repetition length at Δ*T* = 0 for each syllable. To test whether the *dR*_{T} was the same for model and data, we applied an analysis of covariance (ANCOVA) using the MATLAB implementation *aoctool* to fit a separate line for each group. The significance was given based on an *F* test between deviations of *dR*_{T} from the mean for the model and the data.

##### Branch points.

The significance of changes in transition probabilities for each branch point was assessed with a χ^{2} test for independence or Fisher's exact test if some counts of transitions were smaller than 10 (Fisher, 1922). For the transitions from each syllable, a contingency table was constructed by counting the number of transitions to all target syllables. Each row of the table represents one target syllable and each column represents a different condition. For testing the effect of temperature, each column contained the counts to the target syllables at a different temperature. For testing day-to-day fluctuations, each column contained the counts observed from a different day. Such tables were used to test the null hypothesis that the transition probabilities remain the same across different conditions.
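The contingency-table construction and the χ^{2} statistic can be sketched as follows in Python. This is an illustration with hypothetical function names and made-up counts; it computes only the test statistic and degrees of freedom, and the *p*-value (or Fisher's exact test for small counts) would come from a statistics library.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    table[i][j] = number of transitions to target syllable i in condition j.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


def needs_exact_test(table, min_count=10):
    """True if any cell count is below min_count, the cue to use Fisher's exact test."""
    return any(count < min_count for row in table for count in row)


# Made-up counts: rows are two target syllables, columns are normal and cooled.
toy_table = [[68, 45],
             [32, 55]]
chi2, dof = chi_square_statistic(toy_table)
use_exact = needs_exact_test(toy_table)
```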

Whether the transition entropy changed significantly with temperature was assessed using a two-tailed *t* test of the null hypothesis that a linear regression of entropy as a function of temperature change has a slope of zero. To assess the dependence of this slope, β, on the transition entropy at Δ*T* = 0, *H*_{0}, we performed a two-tailed *t* test on whether the slope of β versus *H*_{0} was different from 0.

##### RA cooling and the syllable and gap durations.

While cooling RA, HVC temperature changed by ∼32% of the temperature change in RA. To determine whether changes in syllable (or gap) duration observed during RA temperature manipulation could be explained by the collateral cooling in HVC, we tested the null hypothesis that the fractional stretch of the syllables (or gaps) per unit temperature change in RA is equal to 32% of that observed during HVC cooling. We compared the fractional change of syllable (or gap) durations during RA cooling with those produced by HVC cooling multiplied by a factor of 0.32. The significance of the difference in means between the two groups was evaluated with a two-tailed *t* test.

##### Hemispheric asymmetry.

To assess the significance of the correlation between the gap stretches under left versus right HVC cooling, we performed a shuffle test in which we randomized the assignment of gap pairs to the two conditions. The randomly paired gaps were surrogates for the null hypothesis of uncorrelated effects on the same gaps from cooling the left versus the right HVC. The *p*-value for the correlation coefficient was calculated from the percentile of the observed correlation coefficient in the shuffle distribution. To compare left, right, and bilateral HVC cooling effects on repetition length and transition entropy, we performed an ANCOVA on the slopes of these values with respect to temperature change using the MATLAB routine *aoctool*. If the between-group difference was significant, we performed a Tukey–Kramer *post hoc* test using MATLAB *multcompare* to find out which groups contributed to the difference.
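The shuffle test for the left versus right correlation can be sketched in Python as below. This is a minimal illustration rather than the code used in this study: the function names, the gap-stretch values, the number of shuffles, the one-sided comparison, and the add-one correction in the *p*-value are all assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shuffle_test(left, right, n_shuffles=2000, seed=0):
    """Observed correlation and one-sided p-value from random re-pairing of gaps."""
    rng = random.Random(seed)
    observed = pearson_r(left, right)
    shuffled = list(right)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)              # break the pairing between hemispheres
        if pearson_r(left, shuffled) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_shuffles + 1)  # assumed add-one correction
    return observed, p_value


# Hypothetical gap stretches (%/deg C) for the same gaps under left and right cooling.
left_stretch = [3.1, 4.0, 2.2, 5.3, 6.1, 1.5, 4.6, 3.8]
right_stretch = [2.9, 4.3, 2.0, 5.0, 6.4, 1.9, 4.2, 4.1]
r_obs, p_shuffle = shuffle_test(left_stretch, right_stretch)
```

Shuffling only one of the two lists is sufficient, because the null hypothesis concerns the pairing of gaps and not the marginal distribution of stretches in either hemisphere.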

## Results

We manipulated the temperature of HVC in adult Bengalese finches. Relative to the normal condition, temperature changes in HVC ranged from ∼−4.5°C (coolest) to ∼2.0°C (warmest). To simplify the presentation, we will mainly present the cooling effects and leave the heating effects to the figures. Unless specified, manipulations are bilateral and the temperatures in the two hemispheres are kept the same.

### Syllable and gap durations

Cooling stretched the songs of the Bengalese finch (Fig. 2*C*), similar to previous studies that have cooled HVC in the zebra finch (Long and Fee, 2008). The durations of song syllables were stretched by 2.8 ± 0.9%/°C (mean ± SD, percentage per degree Celsius, *n* = 34 syllables in 5 birds; Fig. 2*D*). The durations of intersyllable gaps were stretched by 4.2 ± 2.5%/°C (*n* = 50 intersyllable gaps in 5 birds; Fig. 2*E*). The stretch in gap durations was greater than that in syllable durations (*p* = 7.3 × 10^{−4}, two-tailed *t* test). Overall, the amount of stretch in both the syllable and gap durations was comparable to that seen in cooling HVC in the zebra finch (Long and Fee, 2008). Heating had the opposite effects (Fig. 2*C*).

### Cooling HVC changes the song syntax

#### Syllable repetitions

Syllable repetitions are a common feature of Bengalese finch song (Jin and Kozhevnikov, 2011; Wittenbach et al., 2015). Cooling HVC consistently reduced the number of repetitions within a repeat bout while heating increased it, especially for those syllables with long and variable repetition lengths. Figure 3, *A* and *B*, shows two examples. For the long-repeated syllable “A” of Bird 1, the distribution of repetition lengths shifted toward fewer repetitions as HVC was cooled (Fig. 3*A*). In contrast, the short-repeating syllable “C” of Bird 2 was not affected by changes in temperature (Fig. 3*B*). We quantified this effect with the slope of the mean repetition length with respect to changes of HVC temperature, *dR*_{T} (Fig. 3*C*). Across all repeated syllables, *dR*_{T} had a bimodal distribution, with one subset (type I) strongly affected by HVC temperature, whereas the other subset (type II) remained unaffected (Fig. 3*D*).

We next investigated whether these two subsets had distinctive repetition length distributions in the normal condition. We observed that variability of repetition length, defined as the SD of repetition length in the normal condition, was positively correlated with the temperature effect, *dR*_{T} (*r* = 0.61, *p* = 0.026, two-tailed *t* test; Fig. 3*D*). *dR*_{T} was also positively correlated with the mean repetition length measured in the normal condition (*r* = 0.77, *p* = 0.001). In other words, syllables with long and variable repetition bouts tended to be affected by HVC temperature, whereas those with short bouts of more fixed lengths tended to be unaffected.

The two types could be predicted from the variability alone (linear classifier, error rate 0.077). We therefore used a threshold on the variability to classify repeating syllables into these two types in subsequent analyses (see Materials and Methods). The cooling effect on type I syllables was significant on aggregation (*n* = 8 syllables from 5 birds, slope = 0.4 ± 0.1, *p* = 1.3 × 10^{−4}, LMM; Fig. 3*C*), whereas the effect on type II syllables was roughly 20-fold smaller, although it still fell below the α = 0.05 criterion (*n* = 5 from 3 birds, slope = 0.019 ± 0.008, *p* = 0.026, LMM; Fig. 3*C*). The difference in effects was significant between the two types (*p* = 2.7 × 10^{−7}, LMM).

A simple explanation for the shortening of syllable repetitions during cooling is that the song tempo is slowed while the durations of repetition bouts, which could be encoded in a different brain area, are unaffected by the cooling. In this scenario, fewer syllables can “fit” in a repeat bout because the syllables and gaps are longer. We compared this theoretical prediction to the actual data. One example is shown in Figure 3*E* (syllable “A” from Bird 1 at Δ*T* = −2°C). The predicted reduction in mean repetition length was significantly smaller than that observed (*p* = 2.6 × 10^{−11}, two-tailed *t* test). This was true across the population of type I syllables (Fig. 3*F*). Compared with the predicted values, the observed data showed significantly larger shifts in mean repetition lengths as HVC temperature was changed (*p* = 8 × 10^{−4}, ANCOVA). Therefore, the cooling-induced reduction in syllable repetition is not simply a byproduct of slowed song tempo.

#### Branch points

A branch point in the song syntax consists of probabilistic transitions from one syllable to multiple syllables. Two examples are shown in Figure 4*A*. In the first example, syllable “K” can transition to syllable “B” with a probability of 0.68 or to syllable “D” with a probability of 0.32 (Bird 3). In the second example, syllable “F” can be followed by syllable “B” with a probability of 0.60, syllable “G” with a probability of 0.36, or syllable “J” with a small probability of 0.04 (Bird 7). We identified a total of 12 branch points in the songs of five birds. The branch points were defined after controlling for syllable misclassification. In the normal condition, all transition probabilities were stable across days (*p* > 0.05, Fisher's exact test).

Cooling HVC affected transition probabilities at branch points. In the first example shown in Figure 4*A*, when HVC was cooled by 4°C, the probability for the transition “KB” was reduced to 0.45, whereas the probability for “KD” was enhanced to 0.55. In the second example, the transition probability of “FJ,” which was very small in the normal condition, was enhanced to 0.24. These changes were highly significant (*p* = 1.7 × 10^{−5} and *p* = 1.1 × 10^{−6}, respectively, Fisher's exact test). Transition probabilities for 9 of 12 branch points were significantly dependent on temperature (*p* < 0.05, χ^{2} test of independence or Fisher's exact test for counts <10).

Cooling HVC also affected the variability of syllable sequences. The randomness of syllable transitions at a branch point can be quantified by transition entropy (Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997; Sakata and Brainard, 2009). A deterministic transition has the lowest entropy and the value is 0. The most random transition, in which the transitions to all *N* targets are equally probable, has the maximum entropy, with a value of log_{2}*N*. We calculated transition entropy for the 12 branch points. Cooling HVC significantly increased the transition entropy of four branch points (*p* = 0.010, 0.017, 0.014, 0.023, two-tailed *t* test on the slope of transition entropy vs temperature), slightly reduced the transition entropy of one (*p* = 0.025, two-tailed *t* test on slope), and had no significant effect on the rest (Fig. 4*B*). The extent of the cooling effect on sequence variability depended on the transition entropy of the branch points under normal conditions. The branch points with lower transition entropy at normal temperature tended to increase their transition entropy more when HVC was cooled (*r* = 0.92, *p* = 1.2 × 10^{−5}, *t* test; Fig. 4*C*). At the same time, all of the branch points that showed either no change or a slight decrease in entropy had baseline values of entropy that were close to the maximum possible value to begin with. That is, cooling HVC increased the randomness of the syllable transitions that were closer to being stereotypical under normal conditions, but did not strongly affect the transitions that were already highly random.

A more drastic change in syllable sequence would be the appearance of novel transitions. However, across the total of 286 possible transition types inspected (*n* = 5 birds), we observed only two instances of novel transitions in one bird (Bird 7) at the lowest temperature conditions. Therefore, cooling primarily affected the probabilities of the existing transitions instead of creating new transitions.

#### Song bouts

HVC temperature also affected the lengths of song bouts. We defined the end of a song bout as a syllable followed by either a silent period of >2 s or introductory notes. Excluding introductory notes, a song bout consisted of ∼12–21 syllables and lasted for ∼1.4–2.4 s (25th–75th percentile, *n* = 5 birds). The number of syllables within a bout was affected by temperature changes in HVC (*p* = 4.0 × 10^{−46}, LMM), decreasing ∼0.9 syllables per degree Celsius decrease in temperature (Fig. 5*A*, black curve). This effect could not be fully explained by variations in syllable repetition length. When counting repeated syllables only once, the dependence of song bout length on HVC temperature still existed (*p* = 4.6 × 10^{−25}, LMM), with a shortening of ∼0.3 syllables per bout for every degree Celsius temperature drop in HVC (Fig. 5*A*, red curve). A possible explanation is that cooling HVC might increase the probability for a song to stop at syllables after which the song rarely ends in the normal condition. If so, then the entropy of the probability distribution of the ending syllables should increase with cooling HVC; this was indeed the case (*p* = 6.6 × 10^{−6}, LMM; Fig. 5*B*). Warming HVC had the opposite effect (Fig. 5*B*). In the normal condition, birds spontaneously produced a song approximately every 20–102 s (25th–75th percentile). Temperature manipulation in HVC slightly changed the frequency of song production, with the mean interval between songs increasing by 9 s for every degree Celsius decrease of temperature (*p* = 0.04, LMM). Overall, the number of vocal outputs within a given time period is positively correlated with HVC temperature.

### Cooling RA has minimal effects on the song syntax

To test whether the effects on song syntax were due to modification of the neural activity in downstream areas, we manipulated the temperature of RA. RA is a major motor area for birdsong production and is directly innervated by HVC (Nottebohm et al., 1976). Because RA is located deep (2 mm) below the brain surface, we used thermally conductive probes to manipulate its temperature (Long and Fee, 2008).

Cooling RA produced only mild effects on the durations of the syllables and gaps. Cooling RA stretched the syllables by 0.9 ± 0.1%/°C (*n* = 22 in 2 birds; Fig. 6*A*) and the gaps by 0.5 ± 0.3%/°C (*n* = 34 in 2 birds; Fig. 6*B*). According to a prior study, RA cooling leads to cooling in HVC with the change of temperature ∼30% of that in RA (Long and Fee, 2008). The effect of RA cooling on syllable durations could be explained by the collateral temperature change in HVC (*p* = 0.94, *t* test). However, the effect of cooling RA on gap durations was slightly less than can be accounted for by the collateral temperature changes in HVC (*p* = 0.01, *t* test). In the songs of the two birds in which we manipulated RA temperature, there were four type I repeated syllables (Fig. 6*E*) and seven branch points. Figure 6, *C* and *D*, shows an example of a type I syllable for which the distribution of repetition lengths was not shifted by changes in RA temperature. For all four type I repeated syllables, changes in mean repetition length were not significantly dependent on RA temperature (*p* = 0.27, LMM) and were significantly smaller than those observed in HVC cooling (*p* = 8.2 × 10^{−7}, LMM; Fig. 6*F*). At branch points, no significant trends in the changes of transition entropy were observed as RA was cooled (*p* > 0.05, *t* test; Fig. 6*G*) regardless of the transition entropies in the normal condition (Fig. 6*H*). Finally, changes in RA temperature did not affect the length of song bouts (*p* = 0.60, LMM) or the rate of song production as measured by intersong intervals (*p* = 0.39, LMM). In conclusion, RA temperature manipulation showed minimal impact on the song syntax.
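
The collateral-cooling argument above can be written as a simple proportionality. This is a sketch under the assumption that stretch scales linearly with the local HVC temperature change; the symbols $s_{\mathrm{HVC}}$ (fractional stretch per degree of direct HVC cooling) and $s_{\mathrm{RA}}^{\mathrm{pred}}$ (stretch per degree of RA cooling predicted from collateral HVC cooling alone) are introduced here for illustration only.

$$
\Delta T_{\mathrm{HVC}} \approx 0.3\,\Delta T_{\mathrm{RA}}
\quad\Longrightarrow\quad
s_{\mathrm{RA}}^{\mathrm{pred}} \approx 0.3\, s_{\mathrm{HVC}}
$$

Under this assumption, the observed syllable stretch of 0.9%/°C during RA cooling would match the prediction exactly if direct HVC cooling stretched syllables by about 0.9/0.3 = 3%/°C, whereas the smaller gap stretch of 0.5%/°C falls below the corresponding prediction, in line with the two *t* tests reported above.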

### Hemispherical asymmetry

To evaluate the relative contributions of the left and right HVCs to cooling effects on Bengalese finch song, we performed unilateral cooling. We found that the stretching of syllable durations was mainly due to left HVC cooling (37 syllables in 5 birds, *p* = 1.2 × 10^{−8}, paired *t* test; Fig. 7*A*). The mean stretch from cooling the left HVC was 2.0 ± 0.9%/°C (*p* = 4.7 × 10^{−15}, *t* test for difference from 0); the mean stretch from cooling the right HVC was 0.3 ± 1.0%/°C (*p* = 0.09, *t* test for difference from 0). There was no correlation between the changes in syllable durations observed from cooling the left HVC and from cooling the right HVC (*r* = 0.008, *p* > 0.05, *t* test; Fig. 7*B*).

Cooling either hemisphere stretched the intersyllable gaps by approximately the same amount (left: 2.1 ± 2.6%/°C; right: 2.6 ± 3.6%/°C, *p* = 0.35, paired *t* test; Fig. 7*C*). There was an anticorrelation between the stretches of gaps from left and right HVC cooling (*r* = −0.31, *p* = 8 × 10^{−3}, shuffle test; Fig. 7*D*).

We found that some of the type I syllable repetitions were primarily affected by cooling only one of the hemispheres, whereas others were affected by cooling either hemisphere. To assess hemispheric lateralization, we compared the effects of temperature change on syllable repetition under unilateral cooling and bilateral cooling in four birds. Figure 8, *A–C*, shows three examples of the different ways in which repetition length was affected by unilateral/bilateral cooling. Figure 8*A* shows a case in which cooling the right HVC produced similar effects on syllable repetitions as bilateral cooling, whereas the effect from cooling the left HVC was much smaller than the other two conditions (*F* = 21.1, *p* = 4.1 × 10^{−4}, ANCOVA). In contrast, Figure 8*B* shows an example in which the effect of cooling the left but not the right HVC was close to bilateral cooling (*F* = 10.5, *p* = 4.4 × 10^{−3}, ANCOVA). For some cases, cooling either HVC produced similar effects as bilateral cooling (*F* = 1.4, *p* = 0.29), as shown in Figure 8*C*. Among the 10 repeated syllables, three were significantly affected by cooling the left HVC but not the right, three were affected by the right but not the left, three were affected by both hemispheres, and one was not affected by either hemisphere (two-tailed *t* test on slopes of repetition length vs temperature; Fig. 8*D*). To evaluate the strength of lateralization, for each syllable, we selected the hemisphere where the cooling effect was stronger (the side with greater slope) and compared that effect with the bilateral cooling effect. We found that dominant side cooling accounts for ∼70% of the effect on syllable repetition (slope from dominant side cooling divided by slope from bilateral cooling = 0.67; Fig. 8*E*).
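
The dominance ratio described above (slope from dominant-side cooling divided by slope from bilateral cooling) can be sketched as follows. This is a minimal illustration, assuming mean repetition length is regressed on temperature change by ordinary least squares; the function names and all numbers are hypothetical and are not the measurements from this study.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def dominance_ratio(temps, reps_left, reps_right, reps_bilateral):
    """Slope for the more strongly affected hemisphere over the bilateral slope.

    `temps` holds temperature changes in degrees Celsius (negative = cooling);
    the `reps_*` lists hold mean repetition lengths at those temperatures
    under left, right, and bilateral cooling.
    """
    slope_left = ols_slope(temps, reps_left)
    slope_right = ols_slope(temps, reps_right)
    slope_bilateral = ols_slope(temps, reps_bilateral)
    # The dominant side is the one with the larger slope magnitude.
    dominant = max(slope_left, slope_right, key=abs)
    return dominant / slope_bilateral


# Hypothetical example: left cooling has the stronger effect.
temps = [0.0, -2.0, -4.0]
ratio = dominance_ratio(
    temps,
    reps_left=[10.0, 9.0, 8.0],
    reps_right=[10.0, 9.8, 9.6],
    reps_bilateral=[10.0, 8.5, 7.0],
)
print(f"dominant/bilateral slope ratio = {ratio:.2f}")
```

A ratio near 1 would indicate that cooling one hemisphere reproduces the full bilateral effect, and a ratio near 0.5 would indicate roughly equal contributions from the two sides.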

Unilateral cooling also affected the transition probabilities at the branch points. For the four branch points that showed a significant increase in entropy under bilateral cooling (Fig. 4), cooling of at least one hemisphere significantly increased the transition entropy by itself (Fig. 8*F–I*). The branch points that were not significantly affected by bilateral cooling remained unaffected. For three of the affected branch points, there were no significant differences among left, right, or bilateral HVC cooling (Fig. 8*F,G*,*I*; ANCOVA, *p* > 0.05). For the remaining branch point, right cooling had a significantly weaker effect than bilateral cooling (*F* = 5.87, *p* = 0.0184, ANCOVA); the effect of left cooling was statistically indistinguishable from the effect from bilateral cooling, although the average was lower (Fig. 8*H*).

Cooling HVC in either hemisphere reduced song bout length in 4 of 5 birds (*p* < 0.05, LMM); for the remaining bird, only left HVC cooling produced an effect. This effect persisted after syllable repetitions were taken out. Within the same subjects, the hemispheric dominance in the control of song bout length was consistent with that in the syllable repetitions and in the branch points. No obvious change was observed in the rate of song production with unilateral cooling except in one bird.

## Discussion

Variability in syllable sequences is a prominent feature of Bengalese finch song. From rendition to rendition, syllables can repeat a variable number of times and different syllables can be chosen at branch points. Syllable sequencing thus requires making choices among multiple possibilities. Using a cooling technique, we have shown that HVC plays a key role in making such choices. Cooling HVC reduced repetition lengths of long and variably repeated syllables and increased the randomness of syllable sequences. These results suggest that HVC is a prime candidate for encoding song syntax. Previous experiments in the zebra finch established HVC's role in controlling the timing of features at the syllable level (Hahnloser et al., 2002; Fee et al., 2004; Long and Fee, 2008; Long et al., 2010; Lynch et al., 2016; Picardo et al., 2016). Our experiments suggest that HVC plays an additional role of determining probabilistic transitions between syllables. This is inconsistent with the idea of strict hierarchical control of birdsong, in which song syntax is at a level higher than that of syllables and is determined in an area upstream of HVC, perhaps NIf (Hosino and Okanoya, 2000) or Uva (Williams and Vicario, 1993). Instead, HVC is critical for both controlling syllables and encoding song syntax.

Although the cooling technique has been used before on the zebra finch (Long and Fee, 2008; Aronov and Fee, 2011; Hamaguchi et al., 2016) and the canary (Goldin et al., 2013), our experiments provide novel insights into HVC's role in encoding probabilistic syllable transitions, a crucial aspect of song syntax in many species. Zebra finch songs typically have rigid sequences (Sossinka and Böhner, 1980), making this species unsuitable for studying variability in syllable sequences. Songs of the canary have elaborate syntax with syllable repetitions and probabilistic transitions (Gardner et al., 2005; Markowitz et al., 2013). However, the study that cooled HVC in the canary was focused on the interaction between HVC inputs and the nonlinear dynamics of respiratory control (Goldin et al., 2013) and did not investigate the song syntax. It would be interesting to repeat the cooling experiment on the canary and determine whether our findings on the song syntax extend to other species with variable syllable sequences.

HVC projects directly to RA and Area X, the input station of the anterior forebrain pathway (AFP; Nottebohm et al., 1976). We showed that cooling RA, which affects RA's intrinsic dynamics as well as the inputs to RA from other areas, has a minimal effect on the song syntax, ruling out a role for RA in determining the song syntax. Song syntax is not encoded in AFP either, because lesioning the lateral magnocellular nucleus of the anterior nidopallium, the main output station of AFP, does not affect song syntax in the Bengalese finch (Hampton et al., 2009). Therefore, the effects of HVC cooling on song syntax are not due to altered inputs to RA or Area X.

Singing in songbirds is controlled bilaterally. The HVCs in the two hemispheres control sound production in ipsilateral halves of the syrinx (Goller and Suthers, 1996). Our unilateral cooling results showed that syllable durations were mostly affected by cooling left HVC, suggesting that left HVC controls feature timings in syllables. In contrast, both sides affected gap durations, with the amount of stretching anticorrelated for left and right cooling. These results suggest that syllables are mainly encoded in left HVC, whereas gaps are influenced by both HVCs. The left dominance of syllable control in the Bengalese finch is consistent with a previous experiment showing that devocalization of the left but not the right syrinx destroyed the majority of Bengalese finch song syllables (Secora et al., 2012).

In the zebra finch, the left and right HVCs alternate in controlling the song tempo (Long and Fee, 2008; Wang et al., 2008). This difference in hemispherical dominance between the zebra finch and the Bengalese finch may be due to the difference in the song variability.

Similarly to the gaps, syllable transition probabilities were affected by cooling both HVCs. For long and variable repetitions, some were mainly affected by only left or right HVC cooling, some by both. Randomization of sequences at branch points was seen in cooling both sides, although in some cases one side dominated. Given that syllables are encoded in left HVC, it is easy to understand how left HVC cooling could affect transition probabilities by affecting the synapses and neurons encoding the syllables. It is harder to explain how right HVC cooling can contribute to, and in some cases even dominate, changes in transition probabilities. One possibility is that the right HVC contributes to the initiation and termination dynamics of inspiration in the brainstem during gaps (Andalman et al., 2011) and influences left HVC activity through the brainstem signals triggered by inspiration and delivered bilaterally to HVC via Uva (Schmidt, 2003; Andalman et al., 2011). This is consistent with our observation that right HVC can control gap durations. In this view, syllable transitions involve the feedback loop connecting the left and right HVCs through the brainstem and Uva (Andalman et al., 2011).

Our results are consistent with the model that probabilistic syllable selections are determined within HVC (Chang and Jin, 2009; Jin, 2009; Hanuschkin et al., 2011; Wittenbach et al., 2015). A crucial aspect of the model is that a winner-take-all competition between briefly coactive syllable chains takes place at the transition points (Chang and Jin, 2009; Jin, 2009). The competition is mediated through the inhibitory interneurons in HVC (Mooney and Prather, 2005; Kosche et al., 2015), which necessarily requires that the syllable chains reside in the same hemisphere (Jin, 2009). This agrees with our observation that syllables are encoded in left HVC. The model allows external inputs from Uva or NIf to influence transition probabilities by biasing the excitabilities of syllable chains. In particular, the model distinguishes type I and type II syllable repetitions (Wittenbach et al., 2015). Analysis of long and variable repeated syllables (type I) showed that these syllables should be initially sustained by strong auditory feedback from NIf to the repeating chain such that initial repeat probability is close to 1. As the syllable repeats, the auditory feedback weakens due to synaptic adaptation, leading to termination of the repetition. For short repeats with minimal variability (type II), such strong auditory feedback is unnecessary. Indeed, deafening Bengalese finches reduced repetition lengths of type I repeated syllables while leaving type II repeated syllables intact (Wittenbach et al., 2015). It is possible that cooling HVC weakens the NIf synapses in HVC, reducing the repetition lengths of type I but not type II syllables. Similar reasoning can be applied to the branch points. We observed that cooling HVC tends to equalize the transition probabilities at branch points and thus increases the randomness of the syllable sequences. It is possible that auditory feedback enhances transition probability to a particular syllable by specifically targeting the neurons that encode it (Sakata and Brainard, 2006; Hanuschkin et al., 2011; Wittenbach et al., 2015). Cooling the NIf synapses in HVC reduces such bias, revealing more random transitions within HVC's intrinsic network promoted by the Uva inputs at each syllable transition. More detailed analysis of the model's responses to cooling should reveal modifications needed for the model to fully explain our cooling results.

Our experiments do not rule out the possibility that song syntax is generated within the feedback loop involving HVC, RA, the brainstem, and Uva (Schmidt, 2003; Ashmore et al., 2005). Perhaps other areas in the loop in addition to HVC also play a role in determining syllable sequences. RA can be excluded because cooling RA has a minimal effect on the syntax. A recent experiment cooled Uva in the zebra finch and showed that the song tempo slowed (Hamaguchi et al., 2016). Cooling Uva in the Bengalese finch could be useful for addressing its role in syntax. However, such experiments must be analyzed with caution because Uva could affect the song indirectly by modulating HVC dynamics.

One caveat is that surrounding areas are also inevitably affected when cooling HVC. Based on the relationship between distance and temperature drop (Aronov and Fee, 2011), we estimate that cooling HVC or RA should produce similar temperature changes in NIf. Because cooling RA had minimal effects on syntax, we can rule out the possibility that the effects on song syntax are due to the collateral temperature change in NIf. Uva is relatively distant from HVC, so temperature change in Uva from manipulating temperature in HVC is negligible (Hamaguchi et al., 2016). Other areas, such as the auditory area field L and the avalanche nucleus, an area mutually connecting to NIf and HVC (Akutagawa and Konishi, 2010), could be cooled at most by ∼2°C. Further investigation is needed to determine whether this is enough to change the song syntax.

In summary, we find that HVC is a key site for encoding song syntax in the Bengalese finch. Syllable durations are mostly controlled by left HVC, whereas the gap durations and the song syntax can be influenced bilaterally.

## Footnotes

This work was supported by the National Science Foundation (Grant 0827731), the Pennsylvania State University Department of Physics, and the Huck Institute for Life Sciences. We thank Guerau Cabrera and Bruce Langford for assistance at the initial stage of the project and Dmitry Aronov for technical advice.

The authors declare no competing financial interests.

- Correspondence should be addressed to Dezhe Z. Jin, Department of Physics, Pennsylvania State University, 104 Davey Laboratory, PMB 206, University Park, PA 16802. dzj2@psu.edu