Listening to music provides a significant source of human emotional experience. One way to better understand why mere perception of a piece of music can lead to emotional activation is to study musical preference and its neural underpinnings. A longstanding hypothesis within neuro-esthetics research is that the relationship between the predictability and complexity of a stimulus (e.g., a melody) and the preference for it follows an inverted U-shape. Going back to Wilhelm Wundt, who first described a balanced mixture of predictable and dynamic elements as a necessary condition for esthetic appreciation, this is called the Wundt curve, or Wundt effect (Wundt, 1911). In current predictive processing accounts of human brain functioning, this relationship is conceived as a result of perceptual learning processes (i.e., model updating), where intermediate complexity allows for optimal reduction of prediction errors and is therefore perceived as rewarding (Koelsch et al., 2019).
Evidence for such Wundt effects in music has been mixed so far (Chmiel and Schubert, 2017). This may stem mainly from disagreement between studies about which stimulus feature should be measured to predict its potential to elicit positive emotions. In his psychobiological theory of esthetics, Berlyne (1960) proposed that complexity, novelty/familiarity, change, conflict, surprisingness, uncertainty, and ambiguity determine the likability of an esthetic stimulus. Many of these have been tested experimentally for their potential to evoke emotional responses, but the results have been inconsistent, probably because of differences in construct operationalization and methodology (Chmiel and Schubert, 2017). This leads to a second critical issue in studying possible Wundt effects: developing an objective and reliable measure that takes all meaningful variables contributing to musical complexity into account.
To reliably measure musical complexity, a recent article published in The Journal of Neuroscience used a computational model of auditory processing that acquires the syntactic structure (i.e., melodics) of a given musical style through variable-order Markov modeling (Gold et al., 2019a). The model predicts the pitch and onset time of a subsequent note based on long- and short-term musical contexts, weighted by their relative certainty. The authors tested seven different model configurations that based predictions on different musical attributes. The model that best predicted subjective measures of unexpectedness was based on information about the timing of notes as well as their pitch. This model provided two different, yet mathematically related, information-theoretic measures of complexity per musical stimulus: information content and entropy. Information content is described as the unpredictability or surprise of a note. Thus, a note with an overall low probability will be high in information content, and a note with a high probability of occurring will be low in information content. Entropy, on the other hand, is characterized as the inherent uncertainty or instability of a stimulus. Correspondingly, notes with either very high or very low probability will result in a piece of music with low entropy. Atonal music will therefore most likely be categorized as high in information content as well as entropy.
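The two measures described above can be illustrated with a minimal Python sketch. It uses the standard information-theoretic definitions (information content as the negative log probability of an event, entropy as the expected information content over all possible continuations) and is not the model used by Gold et al. (2019a); the function names and the two next-note distributions are hypothetical and chosen only for illustration.

```python
import math


def information_content(p):
    """Surprisal (in bits) of a note that was predicted with probability p."""
    return -math.log2(p)


def entropy(dist):
    """Shannon entropy (in bits) of a predictive distribution over next notes.

    This equals the expected information content of the upcoming note.
    """
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical predictive distributions over the next note (illustrative only).
peaked = {"C": 0.90, "D": 0.05, "E": 0.05}  # one continuation is very likely
flat = {"C": 1 / 3, "D": 1 / 3, "E": 1 / 3}  # all continuations equally likely

# An improbable note carries more information content than a probable one.
ic_expected_note = information_content(peaked["C"])
ic_surprising_note = information_content(peaked["D"])

# A peaked distribution is less uncertain (lower entropy) than a flat one.
h_peaked = entropy(peaked)
h_flat = entropy(flat)
```

In this sketch, information content is a property of the note that actually occurs, whereas entropy is a property of the prediction made before the note is heard, which is why the two measures are related but can dissociate.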
To experimentally test the relationship between subjective likability and these two measures, the authors edited 55 monophonic (i.e., containing only one tone at a time) musical stimuli to approximately the same length and comparable tempos to exclude as many potential musical confounds as possible. Participants' musical sophistication, as well as their tendency to perceive music as rewarding, were measured via questionnaires. Participants rated likability and familiarity with all musical stimuli, and pieces of music that were already known to participants were excluded from the analysis. Linear mixed-effect models with linear as well as quadratic terms were applied to the data to analyze the relationship between the two model-derived measures of complexity and the subjective likability ratings. A significant linear term would imply a straightforward linear relationship of complexity measures and stimulus likability. A significant quadratic term would imply that intermediate levels of stimulus complexity correspond to highest likability ratings (i.e., a Wundt effect). Results showed significant Wundt effects for both measures of complexity, with linear and quadratic terms combined fitting the data significantly better than a linear or a quadratic term alone. Whether the linear component of the effect is due to an actual preference shift toward lower- over higher-complexity stimuli, or whether the effect is due to a stimulus selection bias where stimuli with very low complexity were simply not tested, remains an open question at this point.
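The logic of testing for an inverted U-shape with a linear and a quadratic term can be sketched as follows. This is a simplified illustration under stated assumptions: it fits an ordinary least-squares polynomial to synthetic data, whereas the authors used linear mixed-effect models that are not reproduced here. The function name `fit_quadratic` and all data values are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]  # design matrix rows [1, x, x^2]
    n = 3
    # Normal equations: (X^T X) b = X^T y
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for c in range(col, n):
                a[row][c] -= factor * a[col][c]
            b[row] -= factor * b[col]
    # Back substitution
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / a[i][i]
    return coef


# Synthetic, noise-free "liking" ratings that peak at intermediate complexity.
complexity = [i / 10 for i in range(11)]
liking = [1 - 4 * (x - 0.5) ** 2 for x in complexity]

b0, b1, b2 = fit_quadratic(complexity, liking)

# A negative quadratic coefficient indicates an inverted U-shape;
# the vertex of the parabola gives the most-liked complexity level.
is_inverted_u = b2 < 0
preferred_complexity = -b1 / (2 * b2)
```

In this parameterization, a negative quadratic coefficient corresponds to a Wundt effect, and a peak located away from the midpoint of the tested range would show up as an additional linear component, which is the ambiguity raised at the end of the preceding paragraph.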
In addition to demonstrating Wundt effects in the analyzed pieces of music, the authors looked at the differential effects of information content (i.e., unpredictability) and entropy (i.e., uncertainty). Information content best predicted subjective measures of uncertainty, and also explained more variance in individual likability ratings than entropy. It therefore seems to represent human perception of musical complexity better than entropy and should thus be considered as the primary focus in further research.
Next, the authors looked at possible interactions between entropy and information content by clustering stimuli into pieces with relatively high or low entropy as well as low, medium, or high levels of information content. They again found an inverted U-shape relationship between likability and information content, but only in stimuli with low entropy. In high-entropy stimuli, the relationship appears to be linear, with higher information content associated with lower likability. This pattern suggests an elevated preference for predictable musical features when the overall syntactical structure is perceived as rather unstable. This can be recognized in the common alternation between musical elements of high and low predictability, such as choruses and verses in pop music. At the same time, this finding may explain much of the prior research in which musical preference linearly increased with stimulus familiarity and predictability, and in which no Wundt effects were found (Chmiel and Schubert, 2017).
In a second experiment, the authors replicated their results with real-time likability ratings in a sample of musical experts and further examined the influence of expertise as well as the effects of repeated listening to the stimuli on liking. This allowed them to independently compute the effects of structural (a measure of complexity) and veridical stimulus predictability (indicating familiarity). Statistical analysis revealed a significant negative effect of stimulus repetition on likability and no interaction effect of repetition with information content. This result documents the independence of structural and veridical predictability, and it contrasts with earlier findings in which stimulus repetition was positively correlated with likability (Johnston, 2016). Data from both experiments by Gold et al. (2019a) also showed stronger Wundt effects (i.e., higher skewness and kurtosis values) in participants with higher musical perceptual abilities. Behaviorally, this equates to sharper musical preferences, but interestingly not to a general preference shift toward more complex music in musical experts.
How do these findings broaden our understanding of the ontological nature of esthetic appreciation and its neural basis? Gold et al. (2019a) emphasize the inherent value that people ascribe to the reduction of uncertainty. This has been proposed as an overarching principle of neurocognitive functioning: biological agents are constantly striving to minimize the long-term average of perceived surprise (Friston, 2010). Note that in this framework positive affect is conceived of as a marker for model confidence (Hesp et al., 2019). Therefore, in the domain of perception, positive affect would arise from either sensory evidence that confirms preexisting models or from information that helps to update these models to minimize surprise in the long-term (Van de Cruys, 2017). Accordingly, stimuli that allow for model confirmation through predictability, as well as for model updating through surprise should be perceived as more rewarding.
Following that line of reasoning, the neuronal bases for music perception and for probabilistic learning should widely overlap, which is indeed the case. In probabilistic learning tasks, the salience of a stimulus (i.e., its importance for learning) is primarily correlated with activity in the orbitofrontal cortex (OFC), whereas expected and perceived reward is associated with activity in limbic structures such as the nucleus accumbens (NAc) and the amygdala (Schoenbaum et al., 1998; Werlen et al., 2019). A similar pattern can be found in music perception, where perceived tension during listening was found to correlate with activity in the OFC and limbic structures (Koelsch, 2014; Lehne et al., 2014). Music likability ratings are especially consistent in showing correlations with activity in the NAc (Salimpoor et al., 2013; Shany et al., 2019). These regions seem to be involved in assigning predictive value to stimuli, matching predictions with perceived outcomes, and deriving motivational pleasure from this match, processes pivotal to associative learning, as well as esthetic perception and appreciation (Koelsch et al., 2019). Additional differentiation of the neuronal basis for the coding of information content and perceived pleasure from esthetic stimuli could prove a promising approach for further research.
The results from Gold et al. (2019a), combined with reported neuronal patterns of OFC and NAc activity, support current predictive processing accounts of emotional experience: embedded in a constant integration of prior beliefs with sensory data, emotion is hypothesized to arise from the degree of accordance of those informational domains (Van de Cruys, 2017). It therefore acts as a marker of value of the current predictive model and possibly facilitates learning processes (Gold et al., 2019b; Hesp et al., 2019). Following that conceptualization, listening to music is conceived of as a form of statistical inference learning, and expert models should therefore comprise higher predictive power. This in turn should lead to a preference shift in musical experts to stimuli high in information content, which was not observed by Gold et al. (2019a). This finding adds to an overall inconsistent pattern of results on this issue. Although neuronal and behavioral markers for expectancy violations in music are evidently elevated in experts, the question of whether this leads to a general preference shift to high-complexity stimuli is yet to be elucidated (for review, see Pearce, 2014). Together, the findings of Gold et al. (2019a) support an information-theoretic approach to esthetic perception and encourage further studies into its neural basis.
Footnotes
Editor's Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see https://www.jneurosci.org/content/jneurosci-journal-club.
I thank Dr. Guido Hesselmann for helpful comments on the paper, and Scott Alexander for inspiration regarding the title.
The author declares no competing financial interests.
Correspondence should be addressed to Nils Kraus at n.kraus{at}phb.de