Abstract
Evolutionary hypotheses regarding the origins of communication signals generally suggest, particularly for the case of primate orofacial signals, that they derive, through ritualization, from noncommunicative behaviors, notably including ingestive behaviors such as chewing and nursing. These theories are appealing in part because of the prominent periodicities in both types of behavior. Despite their intuitive appeal, however, there are few or no data with which to evaluate these theories, because the coordination of muscles innervated by the facial nucleus has not been carefully compared between communicative and ingestive movements. Such data are especially crucial for reconciling neurophysiological assumptions regarding facial motor control in communication and ingestion. We here address this gap by contrasting the coordination of facial muscles during different types of rhythmic orofacial behavior in macaque monkeys, finding that the perioral muscles innervated by the facial nucleus are rhythmically coordinated during lipsmacks and that this coordination appears distinct from that observed during ingestion.
Introduction
Humans and other primates rely on facial expressions to mediate close-range interaction (Andrew, 1962; Hinde and Rowell, 1962; Van Hooff, 1962; Redican, 1975). Two crucial issues have been debated regarding the evolution of orofacial signaling: first, whether facial signals are produced through common mechanisms across primates; and second, whether these signals arose de novo or by ritualization of noncommunicative types of movement.
Variability in the production of facial expressions was long presumed to increase along a scala naturae toward humans, with an increasing number of muscles dedicated to facial expressions in primates with more recently evolved features (Huber, 1930a,b). However, while it is true that facial mobility is expanded in humans compared with other primates, interspecific variation in facial mobility can be explained partly by scaling with body size (Dobson, 2009a) and social group size (Dobson, 2009b). Moreover, modern studies of comparative anatomy reveal that the number and configuration of facial muscles do not strongly differ with position in the primate phylogeny, and that, in fact, the facial muscles of chimpanzees (Burrows et al., 2006) and macaques are quite similar to those of humans (Burrows et al., 2009).
Despite this general conservation of musculature, there is some evidence that both the production and perception of facial expressions have expanded neural substrates in anthropoid species living in larger social groups (Dobson and Sherwood, 2011a,b; but cf. Sherwood, 2005; Sherwood et al., 2005). This scaling with social group size was selective to the facial nucleus, the final common output pathway for the muscles of facial expression, and not to the otherwise-similar motor trigeminal and hypoglossal nuclei (Dobson and Sherwood, 2011b). Additionally, while descending influences from motor cortex to the facial nucleus may differ between less-expressive and more-expressive primates, these pathways are understood only in outline (Jürgens, 2009). Primate facial movements thus appear to rest upon a shared neural foundation, but their evolution remains poorly understood. In particular, the coordination of muscles innervated by the facial nucleus—hereafter “mimetic” muscles—has received significantly less systematic study than that of the masticatory muscles innervated by the trigeminal nucleus.
Such data are necessary, in particular, to address a longstanding question as to the origin of primate facial displays. Mammals use facial movements not only to communicate, but also to ingest food and to control the sensitivity of sense organs. Researchers have theorized that communication signals evolve through ritualization of preexisting behaviors, and specifically that rhythmic communication signals (such as speech and lipsmacking) may be ritualized from rhythmic chewing or nursing movements (Huber, 1930a,b; Andrew, 1962; MacNeilage, 1998). The hypothesis that lipsmacks evolved from nit-picking or nursing suggests that the mimetic muscles should be coordinated in a similar way during both lipsmacks and ingestion, but this has never been tested.
We measured the dynamic coordination of monkeys' mimetic musculature using electromyography during both facial displays and ingestive movement. We targeted five mimetic muscles—three in the lower face (zygomaticus, orbicularis oris, and mentalis) and two in the upper face (frontalis and auricularis). Several outcomes were possible: the facial nucleus could (1) fail to produce any of the rhythmic coordination typical of the trigeminal nucleus, (2) produce similar coordination of mimetic muscles in all rhythmic behaviors, or (3) produce different types of rhythmic coordination in signaling and ingestion. Our data best fit the third scenario: muscle coordination was evident in both communicative and ingestive behavior, but coordination of the perioral mimetic muscles was stronger and more stereotyped during lipsmacks than during either chewing or sucking. Our data suggest that for the ritualization hypothesis to be viable, it must allow for significant reorganization in the motor program: specifically, for a migration of pattern coordination from the trigeminal nucleus (masticatory muscles) to the facial nucleus (mimetic muscles).
Materials and Methods
Subjects and surgery.
Two adult male long-tailed macaques (Macaca fascicularis) were fitted with a head-holding prosthesis using standard sterile surgical techniques. All procedures were designed and performed in accord with the NIH Guide for the Care and Use of Laboratory Animals and approved by the Princeton Institutional Animal Care and Use Committee.
Paradigm and data collection.
We used fine-wire electrodes to record facial motor activity in three to five facial muscle groups. Specifically, we targeted three lower facial muscles: the zygomaticus, which retracts the mouth corner toward the cheekbone [Macaque Facial Action Coding System (MaqFACS) AU12]; the dorsal orbicularis oris, which purses the upper lip (AU18); and the mentalis, which puckers the chin by pulling it upward toward the canines (AU17). In addition, we targeted two upper facial muscles: the frontalis, which raises the brow (AU1 + 2), and the auricularis, which raises and flattens the ear (EAU2/EAU3). All of these muscles are believed to be involved in some aspects of macaque facial signaling, but the upper facial muscles were hypothesized to be much less involved in the rhythmic mouth movements of interest while being similarly vulnerable to electrical artifact. MaqFACS action units are specified per Parr et al. (2010).
In each session, monkeys were anesthetized with a reversible anesthetic (0.27 mg/kg midazolam with 0.022 mg/kg Dexdomitor); recording sites were shorn, cleaned, and sterilized with alcohol; and the monkey was held in place in a restraint chair via the head-holding prosthesis. The monkey's depth of anesthesia and ease of breathing were monitored by a pulse oximeter. Paired fine-wire electrodes had previously been readied: each wire (0.1 mm diameter) was stripped and then threaded through a 23-gauge needle so that 2–5 mm protruded from the tip; this length was folded over the needle tip, and needle and wire pairs were packaged and autoclaved for later use. During the experiment, each needle was inserted through the skin into the underlying muscle belly and gently withdrawn, leaving the bare wire anchored in the muscle. In addition to each paired insertion, an unpaired ground wire was inserted at the back of the head behind the head-holding prosthesis.
Each electrode pair was positioned and confirmed based on the recent work of Waller and colleagues (2008). Specifically, muscle targeting was tested using unipolar stimulation with 1–3 mA current in ½ ms pulses, delivered 10–30 times per second. We used a Grass S88 stimulator providing constant current through a PSIU6 optical isolation unit. Muscle responses were visually assessed and recorded to digital video by handheld camera. If stimulation was unsuccessful or recruited nontargeted muscles, electrodes were repositioned. Once all electrode sites were verified, channel recordings were visually inspected for noise. Figure 1A shows the stimulation-induced movements associated with each muscle.
Recording sites confirmed by stimulation. A, Acute indwelling electrodes were inserted into the auricularis (cyan), frontalis (blue), zygomaticus (green), orbicularis oris (red), and mentalis (magenta) facial muscles. These mimetic muscles contribute to facial expression, and stimulation was confirmed to selectively raise and flatten the ear at the auricularis site, to lift the brow at the frontalis site, to retract the mouth corners into a grin at the zygomaticus site, to purse the lips at the orbicularis oris site, and to lift and protrude the lips at the mentalis site. B, Recordings showed that, while muscles were often coactivated, each recording site tapped independent electrical activities that corresponded with video-monitored muscle tension.
Following verification of recording sites, subjects were awakened with a reversal agent (0.22 mg/kg Antisedan) while an automated juice delivery system was positioned at the mouth. After ∼15 min of recovery, the subject participated in a calibration procedure to track eye movements (ASL EyeTrack 6000); success provided evidence that the monkey was alert and engaged. Once this was successfully completed, the monkey viewed video sequences (including frontally directed monkey facial expressions recorded in the lab, naturalistic interactions between rhesus monkeys recorded in the wild, and excerpts of Hollywood films) and was periodically engaged by the experimenters, who simulated monkey facial displays or held a mirror toward the subject (both eliciting lipsmacks) or provided small nutritive rewards such as raisins (eliciting chewing responses) or juice from a needleless syringe (eliciting sucking).
We recorded electromyographic (EMG) signal as bipolar voltage differences (1000× gain, to ±5 mV maximum) over the frequency range 0.7–300 Hz with 1000 Hz digital sampling rate, while simultaneously recording subjects' facial movements at 30 frames per second via infrared camera (Plexon with Cineplex). Video stimuli and random juice delivery were controlled through the Presentation software package (Neurobehavioral Systems); however, behaviors analyzed here were primarily gathered during experimenter interaction.
Data scoring.
EMG data were analyzed in 64-bit Matlab 7.9.0 on a Windows 7 PC. Each daily session included several recording segments, within which we identified clips that included various sorts of facial movement. To analyze the data, we selected time clips with unambiguous lipsmacking, chewing, and sucking behavior based on experimental context and video recordings, using standard ethological criteria (van Hooff, 1967). For example, lipsmacks are listed in the MaqFACS as action descriptor AD 181: a “tightening of the lips together followed by a rapid opening and parting motion” associated with action unit AU18i (Parr et al., 2010). For bouts of chewing, the first and final seconds of each bout (food positioning and swallowing) were excluded to emphasize stereotyped rhythmic chewing movements. We separately inspected our recorded voltage traces and power spectrograms to exclude clips in which signal quality was compromised by electrical noise: such noise was harmonic, had a characteristic ramping onset, and was sustained for many seconds at a time, and was thus distinct from normal movement-associated activity. We were able to assemble a library of movement exemplars from both monkeys in all three categories, along with session-, segment-, duration-, and monkey-matched baseline distributions.
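To illustrate the bout-trimming step, the following MATLAB sketch crops one second from each end of the scored chewing bouts; the variable names (boutOn, boutOff) are hypothetical stand-ins for the video-scored bout boundaries and this is not the original scoring code.

```matlab
% Sketch: drop the first and last second of each scored chewing bout.
% boutOn/boutOff: hypothetical vectors of bout onset/offset times, in seconds.
trimmed = [boutOn(:) + 1, boutOff(:) - 1];             % exclude positioning/swallow
trimmed = trimmed(trimmed(:, 2) > trimmed(:, 1), :);   % keep usable bouts only
```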
Qualitative evaluation of EMG signals.
We first inspected EMG traces recorded for each type of behavior by examining their log-frequency spectrograms. To do this, we used in-house Matlab code that first filtered the raw trace into one-eighth-octave bands using Makeig and Delorme's function, eegfilt.m, and then squared and smoothed the data with a frequency-scaled window to convert from voltage to power/octave. Examples of each behavioral category are displayed in Figures 3–5.
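A spectrogram of this kind can be assembled with a simple filter-bank loop. The sketch below uses EEGLAB's eegfilt.m, as in the text, but the analysis range, band edges, and smoothing-window rule are our assumptions rather than the exact in-house parameters.

```matlab
% Sketch: eighth-octave filter-bank spectrogram of a raw EMG trace
% (raw: 1 x N voltage vector; fs = 1000 Hz; parameters are assumptions).
fs = 1000;
fLow = 1; fHigh = 300;                            % assumed analysis range (Hz)
nBands = floor(8 * log2(fHigh / fLow));           % eighth-octave spacing
edges = fLow * 2 .^ ((0:nBands) / 8);             % band edges in Hz

spec = zeros(nBands, numel(raw));
for b = 1:nBands
    band = eegfilt(raw, fs, edges(b), edges(b + 1));    % EEGLAB FIR band-pass
    p = band .^ 2;                                      % voltage -> power
    win = max(3, round(fs / edges(b)));                 % frequency-scaled window
    spec(b, :) = conv(p, ones(1, win) / win, 'same');   % smoothed power per band
end
imagesc(log10(spec + eps)); axis xy;                    % display on a log scale
```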
Data processing.
We were primarily interested in rhythmic changes in muscle tone, which correlate with amplitude modulations of recorded EMG power, particularly in the 45–256 Hz frequency bands. We therefore filtered the raw EMG traces with a passband ranging from 45 to 256 Hz using eegfilt. To convert from voltage to power, we squared the signal and smoothed the data. Smoothing was performed either using a running average with a one-thirtieth of a second sliding boxcar window or by low-pass filtering at 42 Hz; choice of smoothing method had no significant impact on the data presented. After initial data processing, we visually identified noise-contaminated samples and marked them for exclusion by designating the data values at those time points “NaN”. These manipulations produced a cleaned data pool reflecting the instantaneous power of 45–256 Hz EMG activity at each site across each recording segment, as illustrated in Figure 2.
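The following MATLAB sketch illustrates this processing chain for one recording site; variable names are ours, EEGLAB's eegfilt.m is assumed to be on the path, and noise epochs (noiseIdx) are assumed to have already been marked by eye.

```matlab
% Sketch: instantaneous 45-256 Hz EMG power for one recording site
% (raw: 1 x N voltage vector; fs = 1000 Hz; noiseIdx: hand-marked noise samples).
fs = 1000;

bp  = eegfilt(raw, fs, 45, 256);                 % band-pass 45-256 Hz (EEGLAB FIR)
pwr = bp .^ 2;                                   % voltage -> power

boxcar    = ones(1, round(fs / 30)) / round(fs / 30);   % ~1/30 s running average
pwrSmooth = conv(pwr, boxcar, 'same');
% Alternative smoothing (equivalent per the text): low-pass the squared signal
% at 42 Hz, e.g., pwrSmooth = eegfilt(pwr, fs, 0, 42);

pwrSmooth(noiseIdx) = NaN;                       % blank noise-contaminated samples
```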
Amplitude modulation power spectra.
Instantaneous power records were next analyzed to extract the frequency content. We first normalized each segment by z-scoring, then extracted and concatenated clips with bouts of each behavioral type. To establish a permutation baseline, we extracted and concatenated randomly selected session-, segment-, and duration-matched clips (including clips overlapping with our behaviors of interest, but excluding those contaminated by noise). Spectral analysis was conducted using the Chronux toolbox for Matlab (August 2008 release). To use the maximal number of tapers possible while permitting frequency discrimination with ½ Hz accuracy, we set the tapers parameter [W, T, p] to [½, (duration of the concatenated signal), 1]. We evaluated behaviors with significantly increased or decreased 1–15 Hz modulation of the instantaneous power relative to baseline, and describe our findings below.
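A simplified sketch of the multitaper step using Chronux's mtspectrumc follows; clip selection and the permutation loop are only indicated in comments, and the variable names are assumptions.

```matlab
% Sketch: 1-15 Hz amplitude-modulation spectrum of concatenated behavioral clips.
% pwrSmooth: instantaneous power for one segment; behaviorIdx: samples within bouts.
fs  = 1000;
z   = (pwrSmooth - nanmean(pwrSmooth)) / nanstd(pwrSmooth);   % z-score by segment
sig = z(behaviorIdx);                                         % concatenate bout samples
sig = sig(~isnan(sig));                                       % drop blanked samples

params.Fs     = fs;
params.fpass  = [1 15];                       % modulation frequencies of interest
T             = numel(sig) / fs;              % duration of concatenated signal (s)
params.tapers = [0.5, T, 1];                  % [W T p] form: 1/2 Hz bandwidth

[S, f] = mtspectrumc(sig(:), params);         % Chronux multitaper spectrum
% The same computation on session-, segment-, and duration-matched random clips
% yields the permutation baseline against which S is compared.
```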
Amplitude modulation spectra: frequency distribution and coherency.
We next examined the distribution of amplitude modulation frequencies and their coherency between recordings. These measures differ from the above analysis only in that they were z-scored by clip, rather than by segment, to weight each bout of behavior equally. These measures thus indicate frequencies with consistently elevated power relative to other frequencies in the same bout and examine the extent to which frequency modulations co-occur across sites with a characteristic phase delay. Resultant amplitude modulation spectra were smoothed over 1 Hz for plotting and averaged into ½ Hz bins for tabular and in-text summary. Significance was assessed (at α = 0.05) by comparing the resultant spectra within each behavioral category to its matched permutation baseline (two-tailed for modulation frequency distribution, one-tailed for modulation coherency). The frequency distribution of power modulations is plotted in Figure 6, while coherency in frequency modulation between sites is reported, for each behavioral type, in Figures 7–9. To summarize these frequency-domain measurements of intermuscular coordination, we calculated the extent to which 1–15 Hz frequency modulations and 1–15 Hz coherency during behavioral bouts exceeded the significance threshold given by the permutation baseline: these data were used to scale nodes and links, respectively, in Figure 10A.
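Coherency between a pair of sites can be estimated with the companion Chronux function coherencyc; a minimal sketch under the same assumptions is shown below, with clip-wise z-scoring made explicit and the permutation comparison omitted.

```matlab
% Sketch: 1-15 Hz modulation coherency between two muscle sites for one clip.
% p1, p2: instantaneous-power traces for the clip (same length); fs = 1000 Hz.
good = ~isnan(p1) & ~isnan(p2);
z1 = zscore(p1(good));                        % z-score by clip, not by segment
z2 = zscore(p2(good));

params.Fs     = 1000;
params.fpass  = [1 15];
T             = numel(z1) / params.Fs;
params.tapers = [0.5, T, 1];                  % [W T p]: 1/2 Hz bandwidth

[C, phi, ~, ~, ~, f] = coherencyc(z1(:), z2(:), params);
% C(f) ranges from 0 (no consistent relationship) to 1 (perfectly correlated
% amplitude with a fixed phase delay phi(f)); significance at alpha = 0.05 is
% assessed against the matched permutation baseline (one-tailed for coherency).
```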
Muscular coordination in the time domain.
To illustrate the time-domain relationships between muscle activations during each behavioral type, we identified the muscle in each behavioral category for which 1–15 Hz modulations were most coherent with those of the other muscles. We concatenated instantaneous power recordings according to behavioral type, as described above, and performed a cross-correlation normalized so that each signal's autocorrelation was 1 at zero lag (i.e., coeff normalization of xcorr.m). The time axes of the resultant autocorrelations and cross-correlations were reversed for purposes of illustration, showing when the muscles of interest had increased expected muscle tone relative to the peak contraction of this reference muscle in each behavioral type. These data are shown in Figure 10B.
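In the time domain, this step reduces to normalized auto- and cross-correlations against the reference muscle. The sketch below uses MATLAB's xcorr with coeff normalization; the maximum lag and variable names are illustrative assumptions.

```matlab
% Sketch: cross-correlation of muscle power against the reference muscle.
% ref: concatenated power of the reference muscle (orbicularis oris for
% lipsmacks, zygomaticus for ingestion); other: a second muscle, same length.
fs     = 1000;
maxLag = fs;                                      % examine +/- 1 s of lag (assumed)

[cAuto, lags] = xcorr(ref - mean(ref), maxLag, 'coeff');               % autocorrelation
cCross        = xcorr(other - mean(other), ref - mean(ref), maxLag, 'coeff');

t = -lags / fs;                                   % reverse the time axis for plotting
plot(t, cAuto, 'k', t, cCross, 'r');
xlabel('Time relative to peak contraction of the reference muscle (s)');
```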
Results
Our goal was to contrast the coordination of the mimetic muscles, governed by the facial nucleus, during communicative and ingestive behaviors. Recording sessions required reversible anesthesia and the acute implantation of seven to 11 independent fine-wire electrodes in a confined space in such a way as to minimize physical and electrical artifact; these constraints made our recording conditions highly atypical of spontaneous macaque social interactions. Given the challenges of acute, multisite facial EMG recordings, lipsmacks could not be reliably elicited in every recording session. In total, we recorded eight bouts of lipsmacks in data spanning four recording sessions with two monkey subjects; from the same sessions, we drew seven bouts of chewing and five bouts of drinking/sucking (for details, see Table 1). Figure 2 shows voltage traces and power recorded at each muscle during each of the three behavioral categories. We then examined qualitative similarities and differences in spectra recorded from bouts of lipsmacks, chewing, and sucking. Examples of these recordings can be seen in Figures 3–5. In all cases, high-frequency signal (i.e., >45 Hz) appeared distinct from lower-frequency signals. Lower-frequency signals are more vulnerable to contamination from movement artifact (Fridlund and Cacioppo, 1986); our analysis of amplitude modulations therefore concentrated on the power of signals band-passed from 45 to 256 Hz.
Description of data sources
Raw traces of voltage and power reflect muscle action. Raw voltage (gray) was filtered, squared, and smoothed to measure power (black), providing a quantitative measure of muscle activity. A, auricularis; F, frontalis; Z, zygomaticus; O, orbicularis (Orb.) oris; M, mentalis.
Example lipsmack bout: log-frequency spectrogram. Lipsmacks were characterized by rhythmic activity in the orofacial muscles. Though upper facial muscles controlling the ear and brow have been implicated in natural observations, they appeared to play an intermittent and nonrhythmic role, suggesting independent control, perhaps as emphatic signals. Because higher-frequency electrical signals were more resistant to mechanical artifact, subsequent analyses quantified the rhythmic modulation (1–15 Hz) of power in this higher frequency (45–256 Hz) band.
Example chewing bout: log-frequency spectrogram. Chewing bouts were triphasic, consistent with prior literature, reflecting an initial preparatory phase followed by rhythmic chewing and concluding with a swallow. Further analysis truncated chewing bouts to include only the rhythmic-chewing phase, evident in the central portion of the example shown here.
Example sucking bout: log-frequency spectrogram. In contrast to other observed behaviors, orofacial activity during sucking (drinking from a syringe) was characterized by tonic activation of the mentalis and orbicularis oris, sealing the lips, with rhythmic activity in the zygomaticus, presumably associated with periodic negative pressure facilitating fluid consumption.
Muscle activity observed during facial movements
Power spectra showed clear qualitative differences between movement categories. Lipsmack bouts showed strong rhythmic modulations in the lower facial muscles, especially the orbicularis oris and mentalis, with zygomaticus sometimes evident but offset in phase (Fig. 3). Chewing behavior likewise involved rhythmic movement of the perioral musculature, but these movements appeared less stereotyped, presumably because their goal was the manipulation of food across various stages of intraoral positioning and mastication (Fig. 4). Finally, drinking/sucking behavior involved tonic activity in the mentalis and orbicularis oris except during initial and final manipulation of the juice syringe, while zygomaticus appeared to have rhythmic modulation, perhaps to pull the syringe closer or create negative pressure (Fig. 5). Overall, lipsmacks tended to involve repetition (at a variable rate) of highly conserved and coordinated perioral movement, while ingestive movements varied across time. These data suggest that any effects of somatosensory feedback are relatively time-invariant for communicative displays but are time-varying for ingestive behaviors in which food is positioned, processed, and swallowed.
We also recorded two sites in the upper face. The lipsmack expression is often accompanied by raised eyebrows (van Hooff, 1967); however, we found that brow flashes were not obligate, did not have a strong rhythmic component, and did not synchronize with lower facial movements. Likewise, ear movements have been implicated in macaque social interactions (van Hooff, 1967), but we found little evidence for coordination between the upper and lower face during lipsmacks.
Rhythmic modulation of muscle activity in facial movement
To quantitatively contrast the rhythmic muscle activity across behaviors, we analyzed the amplitude modulation of EMG power. The analyses of chewing movements were restricted to the middle period, during which chewing movements were most stereotyped, to better compare rhythmic coordination of muscle activity; this is common in electromyographic work on ingestion (Iriki et al., 1988; Yao et al., 2002; Ootaki et al., 2004). We first compared the absolute amount of frequency modulation observed during each behavioral type, relative to that observed elsewhere during each recording segment. Lipsmacks had significantly more 1–15 Hz amplitude modulation of perioral muscle tension in the zygomaticus and especially the orbicularis oris and mentalis, chewing involved no significantly increased modulation of mimetic muscle activity at any site, and sucking involved significant decreases in auricularis modulation along with significant increases in orbicularis oris and mentalis modulation.
To compare the relative frequency distribution of power modulations observed during each type of behavior, we z-scored activity recorded at each site within each bout. These power modulations thus reflect the relative periodicity of changes in facial muscle tension and their stereotypy between instances of each behavior. Periodic muscle contractions at specific frequencies were observed in several orofacial muscles and frequency bands (Fig. 6, Table 2). For lipsmacks, stereotyped rhythmic activity was prominent in all three recorded perioral mimetic muscles. For chewing and sucking, the zygomaticus site exhibited the most stereotyped rhythmic modulation, with strikingly similar activity recorded at the auricularis site during rhythmic chewing only. Rhythmic activity was most stereotyped in the orbicularis oris during lipsmacks, rather than at the zygomaticus site as seen during ingestion. The co-occurrence of identical rhythmic activity at the zygomaticus and auricularis sites during chewing, combined with the lack of accompanying power increases, suggests that these signals may have arisen via volume conduction from the underlying mandibular muscles (specifically, the masseter and temporalis, innervated by the trigeminal nucleus), which are known to be active during ingestion.
Frequencies of power modulations observed during orofacial movement types. Low-frequency modulations were relatively increased during orofacial movements compared with their baseline distribution. In lipsmacks, this was most evident in the orofacial muscles (zygomaticus, orbicularis oris, and mentalis) and appeared to include a lower frequency component at 2–5 Hz, reflecting the rate of smacks, and a higher frequency component at 5–9 Hz, reflecting smack structure. For chewing, power elevations were most prominent in the auricularis and zygomaticus at harmonics of ∼2 Hz, presumably reflecting strong activations in the underlying masticatory musculature innervated by the trigeminal nucleus. For sucking, only the zygomaticus site showed strong increases in relative power, with peaks near 4, 8–10, and 13 Hz.
Significantly elevated power modulation, by muscle and behavior type
Coordination of muscle activity during rhythmic facial movement
We measured the coordination of activity by analyzing the coherency with which rhythmic changes of activity occurred between facial muscles (Figs. 7–9). Coherency measures how consistently a frequency co-occurs in paired recordings, with 0 indicating no relationship and 1 indicating a rhythm with perfectly correlated amplitude and consistent phase delay. In other words, significant coherency between muscle pairs indicates that they contract in consistent synchrony, sequence, or opposition to one another, and are thus likely to be under the control of the same central pattern generator. We measured strong and significant coherency between perioral muscles during lipsmacks, but not during ingestive movements, suggesting that the coordination of perioral muscles innervated by the facial nucleus is much more stereotyped in communicative than in ingestive behaviors. Instead, coherency observed during rhythmic chewing primarily reflected synchronous activity across many frequencies between the zygomaticus and auricularis sites, reinforcing the idea that activity recorded during chewing may in fact arise in the nearby masseter and temporalis. In general, the most prominent coherencies involved the orbicularis oris (with the zygomaticus and mentalis) during lipsmacks and the zygomaticus (with the auricularis in chewing and the orbicularis oris in sucking) during ingestive movements (Table 3, Figs. 7–9, and summarized in Fig. 10 alongside relevant cross-correlations). It appears that perioral muscles innervated by the facial nucleus have strongly stereotyped interactions in communication, focused on the orbicularis oris, while rhythmic interactions between these muscles are weaker in ingestive movements and focused on either the zygomaticus or the underlying mandibular muscles.
Coherent modulations during lipsmacks. Coherent power modulations in the lower facial muscles (i.e., orbicularis oris ↔ zygomaticus and mentalis ↔ orbicularis oris) indicated significantly coordinated muscle contractions at low frequencies (reflecting smack frequency) and higher frequencies (reflecting smack structure) in the perioral region. Upper facial muscles, reported in the literature to play a role in communication, were not coherently activated; this suggests independent neural and behavioral control, perhaps as emphatic signals.
Coherent modulations during chewing. Coherent power modulations between the ear and cheek sites (zygomaticus ↔ auricularis) occurred over a range of frequencies (2–10 Hz). Because these signals were in phase (Table 3), this suggests a common origin, perhaps in the underlying masticatory apparatus. These contractions were weakly associated with orofacial movements, particularly at low frequencies characteristic of mandibular movement.
Coherent modulations during sucking. In general, there was little coherent modulation of mimetic muscles during sucking behavior, but significant peaks at ∼4 and ∼9 Hz coordinated zygomaticus (or possibly mandibular) activity with orbicularis oris contractions, perhaps representing motor coordination that created negative pressure to facilitate fluid ingestion.
Significant coherencies, by behavioral type and muscle pair
Muscle rhythm coordination—coherencies and cross-correlations. Left, Significant power modulations and modulation coherencies are depicted for each of the muscle groups (cyan, auricularis; blue, frontalis; green, zygomaticus; red, orbicularis oris; magenta, mentalis). Node weight corresponds to the total amount by which measured power modulations exceeded the permutation baseline; line weight corresponds to the total amount by which measured coherency exceeded the permutation baseline. While lipsmacks were characterized by coherent movements of the perioral mimetic muscles, chewing exhibited inconsistent perioral coordination in these muscles despite strong coordination of signal at the auricular and zygomatic sites. Right, Cross-correlations are shown synchronized to the orbicularis oris activity for lipsmacks and to the zygomaticus for ingestive movements. Only coherent muscle activities are depicted. Lipsmacks are characterized by fast interaction between the orbicularis oris, mentalis, and zygomaticus (corresponding to smack structure), while the more variable intersmack interval results in a broad flanking peak in orbicularis oris/mentalis activity at 200–500 ms; zygomaticus activity is generally antiphase to orbicularis oris and mentalis. In rhythmic chewing, chew rate creates a sharp flanking peak at approximately one-third of a second, with muscle activity sweeping in sequence from zygomaticus/auricularis/jaw closure to mentalis and then to orbicularis oris contraction. In sucking, regular sharp peaks reflect fast cycling of the zygomaticus (or, possibly, jaw closure) and occur in sequence with subtle modulations of tonic orbicularis oris and mentalis activity.
Discussion
We simultaneously recorded activity in multiple facial muscles while monkeys engaged in communicative and ingestive behaviors with rhythmic features (lipsmacking, chewing, and sucking/drinking). We find that muscle coordination was significantly different during these behaviors: in lipsmacks, we observed strong and consistent coordination between perioral mimetic muscles, in particular with the orbicularis oris, while for ingestive behaviors coordination was relatively weak or involved broadly identical activity at zygomaticus and auricularis sites, suggesting volume conduction from underlying masticatory muscles. These data suggest that communication signals evoke coordinated activity among muscles innervated by the facial nucleus that is distinct from, and more robust than, that seen in ingestion. These data stand in contrast to past research, which focused on the strong rhythms present during ingestion in trigeminally innervated masticatory muscles. Rhythmic features of communication and ingestion thus appear to involve distinct central pattern generators or, alternatively, shared central pattern generators operating in novel modes under the influence of descending modulation.
Neural control of facial movements
Though facial displays have received relatively little scrutiny from neuroscientists, it is possible to sketch their likely neural control systems based on lesion studies and research on vocalization and ingestion. It is believed that the fine-scale structure of vocalization and the central pattern generation for ingestive rhythms are provided to the cranial nerve nuclei by the reticular system of the medulla (Jürgens, 2002; Lund and Kolta, 2006). Two streams of descending input initiate these patterns: first, the facial nuclei receive projections from the periaqueductal gray, which is thought to initiate affective signaling and integrates inputs from the amygdalae, hypothalami, insula, and medial frontal lobes; second, the facial nuclei of primates also receive direct and reticular-formation-mediated indirect projections from motor cortex, particularly the lateral frontal lobes, which are thought to initiate and refine voluntary movements, including feeding and trained (but not spontaneous) vocalizations (Hopf et al., 1992; Jürgens, 2009; Caruana et al., 2011; Coudé et al., 2011). Lesions in humans differentially disrupt affective and volitional facial movements, suggesting that this dichotomy generalizes to facial displays. Past researchers have sometimes treated “emotional,” “volitional,” and “rhythmic” facial movements as distinct entities. While these gross categorizations may indeed map to medial frontal/lateral frontal/brainstem control systems, they are hardly exclusive. Lipsmacks, for example, are profoundly rhythmic, showing that human speech is not the only case in which finely structured mimetic coordination is essential to communicative signaling.
A second reported distinction involves cortical control of the upper and lower face: direct projections to the facial nucleus from the rostral cingulate and supplementary motor face patches bilaterally innervate subnuclei governing the upper face, while the caudal cingulate and lateral frontal face patches contralaterally innervate subnuclei governing the lower face (Morecraft et al., 2004). This finding suggests that mimetic muscles of the upper and lower face may play different roles in behavior. Consistent with these reports, and although upper facial movements have been strongly associated with lipsmacks in both laboratory (Mosher et al., 2011) and field (van Hooff, 1967) studies, we here found no fine-scale coordination between upper and lower facial movements during either communicative or ingestive displays.
The origins of primate facial displays
Interspecies variation in the production of facial expressions was long associated with an evolutionary ascent, from “primitive” primates to humans, along a scala naturae of increasing complexity (Huber, 1930a,b): this hypothesis is no longer tenable with respect to peripheral facial anatomy (Burrows and Smith, 2003; Burrows et al., 2006, 2008, 2009; Diogo, 2009; Diogo et al., 2009), and behavioral variation is better accounted for by social group size and body size than by phylogeny (Dobson, 2009a,b). Interestingly, the size of the facial nucleus—the final common output for mimetic muscle control—may increase in response to selection pressures associated with social living (Dobson and Sherwood, 2011a,b; but cf. Sherwood, 2005; Sherwood et al., 2005). Similarly, descending influences from motor cortex may also play a role in the evolution of primate facial signals: motor cortices in apes have significantly more interconnections (Sherwood et al., 2004a) and distinct neurochemistry (Sherwood et al., 2004b) relative to those of other primates.
Modern neuroscience has a compelling challenge in decoding the neural mechanisms that evolved to facilitate expanded communication repertoires among primates. While several facial expressions are thought to arise from defensive or protective responses (Andrew, 1962), affiliative expressions such as the lipsmack have generally been suggested to arise via ritualization of nit-picking or suckling behavior (Andrew, 1962; Van Hooff, 1962; Redican, 1975). Ingestion is known to involve coordination of mandible, tongue, and throat muscles governed by the trigeminal, hypoglossal, and glossopharyngeal nuclei, respectively. However, we find that lipsmacks involve significant rhythmic coordination of mimetic muscles that is not strongly evident during ingestive behavior. It is notable, in this light, that Dobson and Sherwood (2011b) found that the social group size of Old World monkeys predicted the size of their facial but not their trigeminal or hypoglossal nuclei. While our data neither confirm nor disconfirm ingestive origins for communicative pattern generation, they do indicate that these types of movement are coordinated by different nuclei and, at least in the facial nucleus, in different ways. The rhythmic coordination of perioral mimetic muscles during lipsmacks (and not ingestion) shows that the facial nucleus plays an unprecedented role in coordinating rhythmically complex facial signals.
Our findings show that primates exhibit distinct modes of rhythmic orofacial control during ingestive and communicative behavior. Interestingly, because many movements involved in ingestion are not readily visible, Andrew (1963) hypothesized that “an important early function of vocalization in the prelanguage stage of human evolution was to carry information about invisible positions of the tongue in facial displays.” These arguments echo a debate regarding the origins of human language: are there central pattern generators for human communication signals (notably speech) that arise from central pattern generators for ingestive behaviors (Moore and Ruark, 1996; MacNeilage, 1998; Lund and Kolta, 2006)? We know that when humans speak, we do so by coordinating, at a fine scale, the diaphragm and intercostal muscles that govern breathing; the laryngeal muscles, which tense our vocal folds; the muscles of mastication, which open and close our jaw; and the facial, tongue, and pharyngeal muscles, whose movements shape vowels and define consonants (Smith, 1992; Jürgens, 2002). However, these muscles predate both our speech and our species, and it is interesting to consider how preexisting behaviors might have been co-opted for use in human speech. Despite decades of attention to speech, the coordination of mimetic muscles has rarely been studied in humans or in our nonhuman relatives. While human muscles of mastication and respiration have received some attention (Smith and Denny, 1990; Smith, 1992; Goffman and Smith, 1994; Wohlert and Goffman, 1994; Moore and Ruark, 1996; Steeve et al., 2008; Steeve and Price, 2010), there has been fairly little investigation of muscle coordination via the facial nucleus during expressions (but see Root and Stephens, 2003). We are aware of no other report investigating the coordination of mimetic muscles during animal communication (but cf. Yao et al., 2002; Ferrari et al., 2003; Ootaki et al., 2004; Mosher et al., 2011). Our novel finding that communication signals involve rhythmic coordination of muscles innervated by the facial nucleus—and that this coordination is in fact more robust than that seen in mimetic muscles during ingestion—suggests a pathway for facial movement optimized to accord with social demands rather than with sensed changes in food position and consistency. This opens a new approach to investigating the evolution of primate communication and its neural control.
Footnotes
This work was supported by NIH (NINDS) R01NS054898. We thank Michael Graziano for advice on the use of fine-wire electrodes and Daniel Takahashi for suggestions regarding both our analysis and this manuscript.
Correspondence should be addressed to Stephen V. Shepherd at the above address. stephen.v.shepherd@gmail.com