Abstract
Rodent jaws evolved structurally to support dual functionality, for either biting or chewing food. Rodent hands also function dually during food handling, for actively manipulating or statically holding food. How are these oral and manual functions coordinated? We combined electrophysiological recording of muscle activity and kilohertz kinematic tracking to analyze masseter and hand actions as mice of both sexes handled food. Masseter activity was organized into two modes synchronized to hand movement modes. In holding/chewing mode, mastication occurred as rhythmic (∼5 Hz) masseter activity while the hands held food below the mouth. In oromanual/ingestion mode, bites occurred as lower-amplitude aperiodic masseter events that were precisely timed to follow regrips (by ∼200 ms). Thus, jaw and hand movements are flexibly coordinated during food handling: uncoupled in holding/chewing mode and tightly coordinated in oromanual/ingestion mode as regrip–bite sequences. Key features of this coordination were captured in a simple model of hierarchically orchestrated mode-switching and intramode action sequencing. We serendipitously detected an additional masseter-related action, tooth sharpening, identified as bouts of higher-frequency (∼13 Hz) rhythmic masseter activity, which was accompanied by eye displacement, including rhythmic proptosis, attributable to masseter contractions. Collectively, the findings demonstrate how a natural, complex, and goal-oriented activity is organized as an assemblage of distinct modes and complex actions, adapted for the divisions of function arising from anatomical structure. These results reveal intricate, high-speed coordination of disparate effectors and show how natural forms of dexterity can serve as a model for understanding the behavioral neurobiology of multi-body-part coordination.
Significance Statement
Survival hinges on efficiently handling and ingesting food. During food handling, mice switch between statically holding and actively manipulating food with the hands and also between biting with the incisors or chewing with the molars. Using masseter electromyography and kilohertz frame rate video, we show that the two modes of hand movement and jaw movement are tightly linked, with holding co-occurring with chewing and handling with biting. On faster timescales, biting is precisely timed to occur after rapid sequences of manipulative hand movements. Our findings reveal fast, intricate, hierarchical coordination of hand and jaw movements, show how morphology influences the organization of complex behavior, and establish food handling as a model for neurobiological studies of multi-body-part coordination.
Introduction
Survival depends on nutrition intake by efficient feeding, for which mammals have evolved diverse strategies (Williams, 2019; Laird et al., 2023). For rats and mice, these include using the hands to bring food to the mouth to manipulate and ingest it (Whishaw, 1996; Whishaw and Coles, 1996; Guo et al., 2015). Food handling by rodents is therefore an attractive model for investigating the detailed kinematics of manual dexterity (Whishaw et al., 2017; Barrett et al., 2020; An et al., 2022; Falardeau et al., 2023). In particular, recent studies of forelimb and hand movements during feeding, using close-up high-speed video recordings, show that food handling involves cycling between two postural modes of the hands (Fig. 1A; Barrett et al., 2020, 2022). During oromanual mode, food is held up to the mouth and actively manipulated. “Regrips” occur frequently, in which either or both hands make an extremely fast (<20 ms) lateral excursion, briefly releasing and readjusting their grip on the food while gripping it with the mouth. Cortical activity rises with the transition to oromanual mode. During holding mode, food is held in a mostly static posture below the mouth while the animal chews. Cortical activity drops during this mode.
Conceptual framework, technical approach, and example recording. A, Left, Forelimb movements during food handling are organized around cycling between an active oromanual mode and a static holding mode (top); regrips are a hallmark of the oromanual mode (bottom). Right, Rodent jaws are structured for dual functionality: biting with the incisors (top) or chewing with the molars (bottom; images from digimorph.org). B, Hypothetical possibilities for how the jaws and hands might interact during food handling. At one extreme, functions of the jaw (biting, chewing) might be synchronized with those of the hands (active manipulation, static holding), creating holding/chewing and oromanual/ingestion modes; at the other extreme, they might be largely independent and asynchronous. C, Depiction of the technical approach, based on dual-view kilohertz-rate videography for 3D kinematic analysis of the hand–hand, hand–nose, and jaw–nose distances, with EMG recordings from the masseter and forelimb muscles. D, Schematics illustrating the multiple data streams and signals to be collected and expectations and unknowns for how they may relate to hand and jaw actions. E, Example of simultaneously recorded kinematic and EMG traces from one food handling session. One oromanual epoch is indicated (yellow boxed region), during which regrips and bites occurred, and chewing during the subsequent holding epoch is indicated (clear boxed region).
The oral components of feeding have been extensively studied (Williams, 2019) and involve three phases, ingestion, mastication, and deglutition (Hiiemae and Ardran, 1968), of which the jaw is primarily involved in the former two. In rodents, the mandible can be positioned relative to the maxilla either to bring the incisors together for biting during ingestion or to bring the molars together for mastication, i.e., chewing (Cope, 1888; Vinogradov, 1926; Druzinsky, 2015; Fig. 1A). This can be observed electromyographically by the occurrence of two modes of jaw muscle activity with distinct frequencies and amplitudes, corresponding to periods of chewing and biting (Kobayashi et al., 2002).
Because the oral and the manual components of the rodent feeding apparatus have largely been studied in isolation, it is unclear how movements across these body parts are coordinated. For example, at one extreme, biting and chewing might be precisely synchronized with the oromanual and holding modes, respectively, of the hand movements, hence creating coordinated “oromanual/ingestion” and “holding/chewing” modes. At the other extreme, biting and chewing might occur throughout food handling, either continuously or intermittently, independent of manipulation-related movements (Fig. 1B).
To address this question, we sought to simultaneously assay jaw and hand function during food handling. Video tracking of jaw kinematics during rodent food handling is challenging due to occlusion by the hands. We therefore combined electromyography (EMG) of the masseter muscle with kilohertz videography of mouse food handling. The findings reveal how masseter activity and hand movements are coordinated as mice handle food.
Materials and Methods
Subjects
This study used experimentally naive mice of both sexes on a C57BL/6 background (stock no. 000664, The Jackson Laboratory; Table 1) aged 92–230 d postnatal and weighing 20.8–25.3 g at the time of recording (23.4–28.7 g before food restriction, minimum 56 d postnatal and 22.5 g at the time of EMG implantation). Mice were selected for weight as larger mice were found to have better surgical outcomes; hence, the dataset skews male. The small sample size for female mice precluded analysis of sex differences, but sex differences in forelimb kinematics during food handling have not previously been noted (Barrett et al., 2020, 2022). Mice were bred in-house, housed in groups with a 12 h reverse light/dark cycle, and had ad libitum access to food and water prior to food restriction (see below). This study comprised five experimental groups (main analyses, tooth sharpening analysis, hardness analysis, acoustic recordings, bilateral masseter recordings) that were not conducted concurrently (with the exception that one mouse used for hardness analysis was also used for acoustic recordings); hence, no randomization to groups was required and mice were used as they became available. Experiments were conducted during the dark phase of the mice's light cycle. Studies were approved by the Northwestern University IACUC and complied with the animal welfare guidelines of the National Institutes of Health and Society for Neuroscience.
Mice used in this study
EMG electrode fabrication
Electrodes for EMG recording were made using established methods (Pearson et al., 2005) adapted for mouse forelimb muscles (Miri et al., 2017; Kristl et al., 2024). Briefly, electrodes were made from knotted pairs of Teflon-coated stainless steel wire (A-M systems, catalog #793200). On one side of the knot, ∼0.5 mm of insulation was removed from each wire (starting at 1 and 2.5 mm from the knot, hence separated by 1 mm), the two wires were braided together, and the wires were crimped into a 27-gauge needle (Exelint International). On the other side, the unbraided wire pair was soldered to a miniature connector (CLP-112-02-F-D-A, Samtec). The length of wire leading to the connector was adapted to the muscle to be implanted (3.5 cm for masseter and biceps brachii, 4.5 cm for brachioradialis).
Surgical procedures
Under deep isoflurane anesthesia (3% induction, 0.8–1.5% maintenance), the skin over the cranium and muscles to be implanted was shaved and cleaned, mice were placed in a stereotaxic frame (Model 900, David Kopf Instruments), and a midline incision was made to expose the cranium and upper back. The periosteum was removed, the skull was gently scraped to improve adhesion, and a titanium head-fixation bar (0.875 × 0.187 inches, cut by water-jet from 0.08 inch Ti-6Al-4V sheet, Big Blue Saw) was placed on top of lambda, perpendicular to the sagittal suture, and affixed using dental cement (C&B Metabond, Parkell).
Following head-bar mounting, incisions were made over one cheek to expose the masseter and either the upper forelimb to expose biceps brachii or the lower forelimb to expose the brachioradialis. The twisted end of each wire pair was led under the skin to the implantation site and inserted into the corresponding muscle using the needle. The distal end of the wire was knotted, the needle and excess wire cut away, and the incision(s) sutured. The connector was affixed to the head-bar with dental cement.
In two mice, instead of twisted pair electrodes, we implanted 32-channel Myomatrix flexible microelectrode arrays (Chung et al., 2023), configured in four shanks of eight electrodes. The surgery for these was similar to that for twisted pairs, except to implant the electrodes, an 8-0 suture (N-2547; Covidien) was tied to a hole in the distal end of each shank of the electrode array, the suture needle was passed through the muscle, and the array drawn through the muscle using the suture. The array was then secured in place by sutures distal and proximal to the implantation sites.
The cranial incision was then sutured to close the wound margins and cover any exposed cranium or muscle not covered by dental cement and/or the head bar. Mice were given 1 mg/kg buprenorphine intraperitoneally and a small amount of bupivacaine (<2 mg/kg) under each incision site preoperatively and 20 mg/kg meloxicam subcutaneously postoperatively as analgesia along with 10 mg/kg enrofloxacin subcutaneously, followed by a second dose of meloxicam and enrofloxacin 24 h later. Mice were single-housed following head-bar mounting and EMG electrode implantation.
EMG recordings and analysis
To record EMG signals, the loose wires of an Intan 36-pin wire adapter were soldered to FTSH-112-03-F-DV-ND pin connectors that mate with the socket connectors of the EMG implants. This adapter allowed the EMG implants to be connected to and amplified by an RHD2216 bipolar headstage (Intan), from which data were acquired at 30 kHz using an RHD2000 USB evaluation board. A subset of early recordings were made using a monopolar RHD2132 headstage, with the signals from each contact subtracted offline. Myomatrix signals were recorded by plugging an RHD2132 headstage directly into the Omnetics connector of the Myomatrix array.
EMG signals were preprocessed using pyemgpipeline (Wu et al., 2022). The DC offsets were removed by subtracting the mean of the signal, and then signals were bandpass filtered between 10 and 450 Hz with a fourth-order Butterworth filter, full-wave rectified, and down-sampled to 1,000 Hz.
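The preprocessing chain described above can be sketched with SciPy as follows. This is a minimal illustration and not the pyemgpipeline implementation: the function name `preprocess_emg`, the use of zero-phase filtering, and down-sampling by block averaging are assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def preprocess_emg(raw, fs=30_000, band=(10.0, 450.0), order=4, fs_out=1_000):
    """DC removal -> band-pass -> full-wave rectification -> down-sampling."""
    x = np.asarray(raw, dtype=float)
    x = x - x.mean()                        # remove the DC offset
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    x = sosfiltfilt(sos, x)                 # zero-phase filtering (assumption)
    x = np.abs(x)                           # full-wave rectification
    step = fs // fs_out                     # 30 input samples per output sample
    n = (x.size // step) * step
    # Block averaging is one simple way to down-sample (assumption).
    return x[:n].reshape(-1, step).mean(axis=1)
```

Applied to a 30 kHz recording, this returns a non-negative trace sampled at 1 kHz, the form assumed by the event detection steps described later.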
Behavioral training and videography
Mice were acclimatized to head fixation as previously described (Barrett et al., 2022). Briefly, following at least 1 week of recovery from head-bar and EMG implantation, mice were placed on food restriction, receiving a limited amount of standard rodent diet each day to maintain their bodyweights at 85–90% of prerestriction bodyweights. Mice were monitored throughout the food restriction period for signs of ill health (Guo et al., 2014), and body condition scores (Ullman-Cullere and Foltz, 1999) were taken each day. No signs of ill health were observed, and no mouse fell below a body condition score of 3 throughout the study. Once on food restriction, mice were familiarized with the apparatus and handling by the experimenter and then exposed to progressively longer durations of head fixation until they became comfortable handling and consuming food items presented to them while head fixed. For most recordings, mice were fed shelled sunflower seed kernels (S5137-1, Bio-Serv) or 45 mg grain-based dustless precision pellets (F0165, Bio-Serv). In a subset of recordings, to investigate how food hardness affects the structure of food handling behavior, mice were also fed farro grains, walnut fragments, and Fruity Gems (F5136-1, Bio-Serv).
Videos were obtained with a high-speed CMOS-based monochrome video camera (Phantom VEO 710L, Vision Research). Videos were acquired at 1,000 frames per second (fps), 999.6 µs exposure time, and 1,024 × 512 pixel field of view. Two oblique views of the mouse were obtained by mounting two 50 × 50 mm flat enhanced aluminum surface mirrors (#43-876, Edmund Optics) and a 50 mm antireflection coated equilateral prism (#49-435, Edmund Optics) in the camera optical path. A prime lens (Nikon AF Micro-NIKKOR 60 mm f/2.8D, Nikon) was mounted on the camera body. The mouse was illuminated from both sides and slightly below using two red LEDs (M660L1 and MLEDC25, Thorlabs). Camera and video recording settings were controlled with Phantom Camera Control Application v3.5 (PCC, Vision Research). Video was recorded to the camera memory and then saved to disk as uncompressed Phantom Cine files, later converted to H.264-encoded MP4 files. Video recording was manually triggered by a TTL pulse delivered from an NI USB-6229 data acquisition board (National Instruments) once the mouse had successfully retrieved a morsel in both hands and the mouth and begun eating.
Videos were cropped to isolate each view using ffmpeg (ffmpeg.org) and then markerless tracking of the nose, the tip of the lower jaw, and the second through fourth digits (D2–4) on each hand was performed using DeepLabCut (Mathis et al., 2018). From these two sets of 2D trajectories for each body part, 3D trajectories were reconstructed using Anipose (Karashchuk et al., 2021). Anipose's camera model was calibrated for each experimental session using videos of a ChAruCo board at various angles captured at the end of the session without adjusting the lens settings. As previously (Barrett et al., 2020, 2022), we focused our analysis on Dhand-hand, the 3D Euclidean distance between the third digits (D3) of each hand, and Dhand-nose, the distance between the mid-point of the two D3s and the nose. Additionally, we calculated Djaw-nose as the 3D Euclidean distance between the nose and the tip of the lower jaw.
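The three distance measures follow directly from the reconstructed 3D trajectories. A minimal sketch, assuming each body part is available as an (n_frames, 3) array in millimeters (the function and argument names are illustrative, not taken from the analysis code):

```python
import numpy as np

def handling_distances(d3_left, d3_right, nose, jaw):
    """Per-frame distances from (n_frames, 3) arrays of 3D positions."""
    d_hand_hand = np.linalg.norm(d3_left - d3_right, axis=1)
    hand_mid = 0.5 * (d3_left + d3_right)   # mid-point of the two D3 digits
    d_hand_nose = np.linalg.norm(hand_mid - nose, axis=1)
    d_jaw_nose = np.linalg.norm(jaw - nose, axis=1)
    return d_hand_hand, d_hand_nose, d_jaw_nose
```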
To track the eye, a separate DeepLabCut model was trained to track the dorsal, rostral, ventral, and caudal-most points of the eye from the left side camera views. 3D reconstruction was not used in this case as each eye was usually visible in only one camera view.
Acoustic recordings
In a subset of experiments, a probe tube microphone (FG-23329-P07, Knowles) was placed near the mouse's head during food handling recording sessions. The signal was amplified with a custom-built amplifier, digitized at 50 kHz by an NI USB-6229 data acquisition board, and recorded in WaveSurfer (wavesurfer.janelia.org). Acoustic recordings were triggered by the same TTL pulse used to trigger video recording. Acoustic recordings were bandpass filtered offline between 2,000 and 3,500 Hz, and audio peaks with a height of 6 standard deviations above the mean and a separation of at least 10 ms were extracted. The amplitude of the peaks was taken as the absolute value of the Hilbert transform of the filtered signal at the time of the peak. Audio tracks in Extended Data Video 3-1 were denoised using the noise reduction feature in Adobe Audition (Adobe) for presentation purposes only.
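The acoustic peak extraction can be sketched as below. The filter order and the use of zero-phase filtering are not stated above and are assumptions, as is the function name; the amplitude is taken as the magnitude of the analytic signal returned by `scipy.signal.hilbert`.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks, hilbert

def detect_audio_peaks(audio, fs=50_000, band=(2_000.0, 3_500.0),
                       n_sd=6.0, min_sep_s=0.010, order=4):
    """Return peak sample indices and their envelope amplitudes."""
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    filt = sosfiltfilt(sos, np.asarray(audio, dtype=float))
    threshold = filt.mean() + n_sd * filt.std()   # 6 SD above the mean
    peaks, _ = find_peaks(filt, height=threshold,
                          distance=int(min_sep_s * fs))
    envelope = np.abs(hilbert(filt))              # magnitude of analytic signal
    return peaks, envelope[peaks]
```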
Ethogramming
Semiautomated ethogramming of hand kinematics into holding and oromanual modes was performed as previously (Barrett et al., 2022). Briefly, videos were segmented into holding, oromanual, or other based on manual thresholding of Dhand-nose, and then characteristic model functions (sigmoid for transport-to-mouth, exponential for lowering-from-mouth) were fit to each transition to more precisely determine their start and end.
Regrips were detected in the Dhand-hand trace by first high-pass filtering the trace at 5 Hz with a sixth-order Butterworth filter to remove slow changes in the baseline hand–hand distance (due to, e.g., changing size or orientation of the food item), then detecting peaks with a minimum height of 2.5 standard deviations, temporal separation of 20 ms, and prominence of 0.2 standard deviations using the find_peaks function in Scipy's signal processing module.
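A minimal sketch of this regrip detection step is given below. It assumes a 1 kHz hand–hand distance trace, zero-phase filtering, and that the standard deviation is measured on the filtered trace; none of these details is specified above, and the function name is illustrative.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks

def detect_regrips(d_hand_hand, fs=1_000):
    """Return sample indices of candidate regrips in a hand-hand distance trace."""
    sos = butter(6, 5.0, btype="highpass", fs=fs, output="sos")
    hp = sosfiltfilt(sos, np.asarray(d_hand_hand, dtype=float))
    sd = hp.std()                      # SD of the filtered trace (assumption)
    peaks, _ = find_peaks(hp,
                          height=2.5 * sd,
                          distance=int(0.020 * fs),   # 20 ms separation
                          prominence=0.2 * sd)
    return peaks
```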
Masseter events (chewing and biting) were detected by low-pass filtering the preprocessed masseter EMG signal at 50 Hz with a sixth-order Butterworth filter and then detecting peaks with a minimum prominence of six times the median absolute deviation of the masseter EMG signal and a minimum temporal separation of 100 ms using find_peaks. A median absolute deviation threshold was preferred over a standard deviation threshold due to the highly skewed, asymmetric distribution of EMG amplitude values resulting from the full-wave rectification. All plots show the low-pass filtered EMG signal. Infrequently, chewing cycles that were visible in the jaw tracking lacked a corresponding EMG peak (Fig. 4A). To assess whether these were due to lateralized masseter activity or an overly conservative detection threshold, in two mice we recorded bilateral masseter EMG activity (Table 1). Most chews were accompanied by bilateral EMG peaks, but on rare occasions the EMG activity passed the detection threshold only on one side. On average across both mice, 93% of jaw troughs with a prominence of at least 1 mm were accompanied by a detectable peak in the masseter EMG on at least one side.
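The masseter event detection can be sketched in the same way. Here the median absolute deviation (MAD) is computed on the preprocessed, rectified signal before low-pass filtering, which is one reading of the description above and should be treated as an assumption, as should the zero-phase filtering and the function name.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks

def detect_masseter_events(emg_rectified, fs=1_000):
    """Return event sample indices and the low-pass filtered EMG trace."""
    emg = np.asarray(emg_rectified, dtype=float)
    sos = butter(6, 50.0, btype="lowpass", fs=fs, output="sos")
    smooth = sosfiltfilt(sos, emg)
    # MAD is robust to the skew introduced by full-wave rectification.
    mad = np.median(np.abs(emg - np.median(emg)))
    peaks, _ = find_peaks(smooth,
                          prominence=6.0 * mad,
                          distance=int(0.100 * fs))   # 100 ms separation
    return peaks, smooth
```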
To classify masseter events as chewing or biting without reference to hand position, we classified triplets of events as rhythmic if their mean interevent interval was between 154 and 286 ms and the ratio of the larger to the smaller interval was <2, and aperiodic otherwise. This range of interevent intervals corresponds to a frequency range of 3.5–6.5 Hz and was chosen to cover the known 4–6 Hz frequency of rodent mastication; however, the results did not depend strongly on the particular choice of parameters. Each masseter event belongs to three triplets and was classified as a chew if it belonged to at least two rhythmic triplets and a bite otherwise. To minimize the effect of missed masseter events (see above) on this classification, where the jaw tracking was available, we classified jaw peaks (minimum prominence, 0.5 mm; minimum separation, 100 ms) as rhythmic or aperiodic in the same manner as the EMG peaks, and any masseter peak occurring during a bout of rhythmic jaw movements was classified as a chew.
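The triplet rule can be made concrete with a short sketch. The handling of the first and last events of a recording, which belong to fewer than three triplets, is not described above; in this example they are classified as chews if every triplet they belong to (up to two) is rhythmic, which is an assumption. The function name is illustrative.

```python
import numpy as np

def classify_events(event_times_ms, lo=154.0, hi=286.0, max_ratio=2.0):
    """Label each event 'chew' or 'bite' from inter-event intervals alone."""
    t = np.asarray(event_times_ms, dtype=float)
    n = t.size
    iei = np.diff(t)
    a, b = iei[:-1], iei[1:]          # the two intervals of triplet k
    mean_iei = 0.5 * (a + b)
    ratio = np.maximum(a, b) / np.minimum(a, b)
    rhythmic = (mean_iei >= lo) & (mean_iei <= hi) & (ratio < max_ratio)
    votes = np.zeros(n, dtype=int)        # rhythmic triplets per event
    membership = np.zeros(n, dtype=int)   # all triplets per event
    for k in range(max(n - 2, 0)):
        membership[k:k + 3] += 1
        if rhythmic[k]:
            votes[k:k + 3] += 1
    needed = np.minimum(2, membership)    # edge events: assumption, see text
    is_chew = (membership > 0) & (votes >= needed)
    return np.where(is_chew, "chew", "bite")
```

For example, events at 0, 200, 400, 600, and 800 ms followed by events at 1,500, 1,600, and 2,600 ms yield four chews and then four bites: the event at 800 ms belongs to only one rhythmic triplet and is therefore labeled a bite.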
To assess how well this classification agreed with that based on the Dhand-nose ethogram, we constructed a confusion matrix between the two binary classifications for each mouse and from this calculated the balanced accuracy, i.e., the mean over classes of the ratio of each diagonal entry of the confusion matrix to its row sum, which corresponds to the mean over classes of the rates of correct classification for each class. Given the hypothesis that hand and jaw movements are synchronized, we took the classification based on the Dhand-nose ethogram as the “true” labels, but similar results were obtained when taking the transpose of the confusion matrices. To construct an ethogram based on the masseter EMG alone, transitions from chewing to biting were placed at the mid-point between each bite that follows a chew and vice versa for transitions from biting to chewing. For each transition in the Dhand-nose ethogram, we calculated the latency to its nearest same-direction (i.e., holding to oromanual with chewing to biting and oromanual to holding with biting to chewing) transition in the masseter EMG ethogram. To assess how likely the observed balanced accuracies and transition latencies were to occur due to chance, we performed a bootstrap analysis in which masseter EMG peaks were randomly assigned to biting or chewing, with the number of chews and bites equal to the number of masseter events during holding and oromanual, respectively, and repeated this for 10,000 iterations to estimate null distributions of balanced accuracy and transition latencies. From these distributions, a right-tailed p value was calculated for the balanced accuracy and a left-tailed p value for the transition latencies.
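The balanced accuracy and its null distribution can be sketched as follows. Random assignment with fixed numbers of chews and bites is implemented here as a permutation of the ethogram-based labels, and the right-tailed p value as the fraction of null values at or above the observed one; both are assumptions about details not given above, and the function names are illustrative.

```python
import numpy as np

def balanced_accuracy(true_is_chew, pred_is_chew):
    """Mean of the per-class rates of correct classification."""
    true = np.asarray(true_is_chew, dtype=bool)
    pred = np.asarray(pred_is_chew, dtype=bool)
    rate_chew = np.mean(pred[true])      # correct chews / all true chews
    rate_bite = np.mean(~pred[~true])    # correct bites / all true bites
    return 0.5 * (rate_chew + rate_bite)

def null_balanced_accuracy(true_is_chew, n_iter=10_000, seed=0):
    """Null distribution under random assignment with fixed class counts."""
    rng = np.random.default_rng(seed)
    true = np.asarray(true_is_chew, dtype=bool)
    null = np.empty(n_iter)
    for i in range(n_iter):
        null[i] = balanced_accuracy(true, rng.permutation(true))
    return null

def right_tailed_p(observed, null):
    return float(np.mean(null >= observed))
```

Because each class is weighted equally, the expected balanced accuracy under random assignment is 0.5 regardless of the relative numbers of chews and bites.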
Power spectra of holding/chewing and oromanual/ingestion epochs were estimated using the multitaper method with adaptive weighting (Thomson, 2007) as implemented in the Nitime python package (Rokem et al., 2009). To detect tooth-sharpening events, the spectrograms of the masseter EMG and Djaw-nose traces were calculated by taking the real part of the Fourier transform of a sliding 2 s window, and the ratio of the power of these two spectrograms in the 4–15 Hz band (to capture both chewing and high-frequency activity) was calculated. Potential tooth-sharpening bouts were identified as peaks in the spectrogram ratio with a minimum height of 2.5 × 10⁵ and minimum prominence of 2.0 × 10⁵. To determine the temporal extent of each bout, sharpening-related peaks were detected in the masseter EMG signal with height and prominence thresholds tuned manually per recording, and all consecutive EMG peaks in the vicinity of the spectrogram ratio peak with interpeak intervals <200 ms were included in the bout. The length of chewing bouts in the same recordings was determined similarly, except that the interpeak interval threshold for defining the starts and ends of bouts was 333 ms.
Regarding terminology, the various components of the behavior were described as follows: movements, any motion of one or more body parts; kinematics, measurements of movements; actions, short, discrete movements of identifiable type (e.g., bites, regrips, chews, transports to mouth, lowerings from mouth); behaviors, concatenation of multiple actions with distinct temporal structure and syntax (e.g., food handling); modes, subdivisions of a behavior identifiable by different occurrence and temporal sequencing of actions (e.g., holding/chewing vs oromanual/ingestion); epoch, one occurrence of a mode; event, one occurrence of an action; bout, an unbroken sequence of events of a single rhythmic action type (i.e., chewing, tooth sharpening); and burst, a short sequence of events of aperiodic actions (e.g., bites, regrips) closely spaced in time.
Analysis of action timing and sequencing
For group-level comparisons, empirical cumulative distribution functions (ECDFs) were estimated by calculating the ECDF for each mouse individually and then taking the mean ECDF across mice; simulations were run separately using each mouse's own ECDFs, as described below.
The difference in distributions of regrip→bite and bite→regrip intervals in the real data (Fig. 7A) implies but does not guarantee that they are generated by interdependent processes. Hence, to more directly assess the independence of bites and regrips, we simulated regrip and bite times as follows. We sampled from the per-mouse ECDFs of regrip–regrip and bite–bite intervals, ignoring interleaved actions of the other type, and then calculated the resulting regrip→bite and bite→regrip intervals. This was done in 1,000 iterations of 100 samples each of bites and regrips, separately using each mouse's ECDFs, and the mean simulated ECDFs of regrip→bite and bite→regrip intervals were then compared with the real mean ECDFs using the Kolmogorov–Smirnov test.
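The independence simulation can be sketched as two unrelated renewal processes. Sampling from an ECDF is implemented here by resampling the observed intervals with replacement, and the cross-action interval is taken as the latency from each event to the next event of the other type; the latter definition, the function names, and any interval values passed in are assumptions for illustration.

```python
import numpy as np

def simulate_independent(regrip_iei, bite_iei, n_events=100, rng=None):
    """Generate regrip and bite times from independent interval draws."""
    rng = np.random.default_rng() if rng is None else rng
    regrips = np.cumsum(rng.choice(regrip_iei, size=n_events))
    bites = np.cumsum(rng.choice(bite_iei, size=n_events))
    return regrips, bites

def latency_to_next(src, dst):
    """For each src event, time until the first dst event that follows it."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    idx = np.searchsorted(dst, src, side="right")
    ok = idx < dst.size                 # drop src events with no later dst event
    return dst[idx[ok]] - src[ok]
```

Calling `latency_to_next(regrips, bites)` and `latency_to_next(bites, regrips)` on each simulated pair gives the regrip→bite and bite→regrip intervals expected under independence, which can then be compared with the observed intervals (e.g., with `scipy.stats.ks_2samp`).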
We developed a simple two-level hierarchical continuous-time discrete-state semi-Markov model capturing key elements of the structure of food handling sequences as follows. At the top level of the hierarchy, a process representing switching between the holding/chewing and oromanual/ingestion modes alternated between two states with probability 1. Within each state, separate processes generated the sequence of the corresponding actions (chews for holding/chewing, bites and regrips for oromanual/ingestion). During holding/chewing, the lower-level process started with lowering-from-mouth, and then any and all subsequent events were chews. During oromanual/ingestion, the lower-level process started in transport-to-mouth and then transitioned to either bite or regrip, switching between those two according to a transition matrix, estimated for each mouse as n_{i,j}/n_i, where n_{i,j} is the number of times action i was followed by action j and n_i the total number of occurrences of action i. This model also incorporated timing information, with epoch durations for the holding/chewing–oromanual/ingestion process and inter-event intervals for the bite/regrip and chew processes sampled from the corresponding observed ECDFs from the real data. State transitions were drawn from the bite/regrip and chew processes until the sum of inter-event intervals exceeded the current epoch duration. This model was semi-Markovian in that the type and timing of the next action depends only on the current action but in continuous time is not memoryless as neither the epoch durations nor the inter-event intervals are exponentially distributed. In this model, the expected distribution of sequence lengths (numbers of chews or bites/regrips in each holding/chewing or oromanual/ingestion epoch) is nontrivial and hence was estimated by simulation.
A sequence of 100,000 actions was simulated for each mouse using that mouse's epoch duration and inter-event interval ECDFs, from which the distributions of sequence lengths for each mode were calculated. Then, we calculated the mean simulated distribution over mice and compared it with the real data using the two-sample Kolmogorov–Smirnov test. To minimize the risk of overfitting, the hierarchical model used transition probabilities and epoch duration/inter-event interval ECDFs measured from half the data and the resulting expected distributions were compared with the mean ECDFs of sequence lengths measured from the remaining half of the data. Two mice with fewer than 10 holding/chewing or oromanual/ingestion epochs were excluded from this analysis.
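A minimal sketch of the hierarchical simulation is given below. It assumes one pooled inter-event interval distribution per mode, samples from ECDFs by resampling observed values, and runs for a fixed number of epochs rather than a fixed number of actions; the function names, dictionary layout, and state labels are illustrative only.

```python
import numpy as np

def simulate_epoch(mode, duration, intervals, transitions, rng):
    """Return the actions generated within one epoch of the given mode."""
    state = "lowering" if mode == "holding" else "transport"
    actions, elapsed = [], 0.0
    while True:
        elapsed += rng.choice(intervals[mode])
        if elapsed > duration:          # epoch over: stop drawing actions
            return actions
        options, probs = transitions[state]
        state = str(rng.choice(options, p=probs))
        actions.append(state)

def simulate_sequence_lengths(epoch_durations, intervals, transitions,
                              n_epochs, seed=0):
    """Number of actions per epoch, with modes alternating with probability 1."""
    rng = np.random.default_rng(seed)
    lengths = {"holding": [], "oromanual": []}
    for k in range(n_epochs):
        mode = "holding" if k % 2 == 0 else "oromanual"
        duration = rng.choice(epoch_durations[mode])
        actions = simulate_epoch(mode, duration, intervals, transitions, rng)
        lengths[mode].append(len(actions))
    return lengths
```

Here `transitions` maps each state to a pair of possible next actions and their probabilities, for example `{"lowering": (["chew"], [1.0]), "chew": (["chew"], [1.0]), ...}`, with the bite and regrip rows filled from the estimated transition matrix.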
For comparison, we also considered an even simpler flat model. This model comprised a single continuous-time discrete-state semi-Markov process governing the sequencing of all actions (transport-to-mouth, regrip, bite, lowering-from-mouth, chew) according to a transition matrix estimated for each mouse in the same way as for the hierarchical model. Under this model, the sequence lengths are expected to be geometrically distributed. This can be seen by decomposing the model into two alternating absorbing Markov chains, one for each mode, and taking transport-to-mouth and lowering-from-mouth as the absorbing states of the holding/chewing and oromanual/ingestion chains, respectively. We calculated these theoretical distributions for each mouse separately and then took the mean distribution across mice and compared it with the mean ECDF of sequence lengths from the real data, cross-validating in the same way as for the hierarchical model.
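The sequence-length distribution of an absorbing chain can be computed directly from the transitions among its transient actions. The sketch below is a generic calculation and not the analysis code: `Q` holds the transition probabilities among the transient actions of one mode, `alpha` the probabilities of entering each of them, and both names are assumptions. With a single transient action with self-transition probability p, it reduces to the geometric form P(N = k) = p^(k-1)(1 - p).

```python
import numpy as np

def sequence_length_pmf(alpha, Q, k_max):
    """P(N = k) for k = 1..k_max in an absorbing Markov chain.

    alpha: entry probabilities over transient actions.
    Q:     transition probabilities among transient actions
           (each row sums to less than 1; the remainder is the exit probability).
    """
    v = np.asarray(alpha, dtype=float).copy()
    Q = np.asarray(Q, dtype=float)
    exit_prob = 1.0 - Q.sum(axis=1)
    pmf = np.empty(k_max)
    for k in range(k_max):
        pmf[k] = v @ exit_prob          # absorbed after exactly k + 1 actions
        v = v @ Q                       # otherwise continue within the mode
    return pmf
```

With two transient actions (bites and regrips), the result is exactly geometric when both actions have the same exit probability and has a geometric tail otherwise.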
Experimental design and statistical analysis
All statistical analysis was performed in Python 3.9 using the numpy (version 1.26.2), scipy (version 1.10.1), and statsmodels (version 0.13.5) packages. Details of statistical tests including tests used, test statistics, and sample sizes can be found in the figure legends and Results text. For all tests, n represents the number of mice included in the sample except where noted. Central tendency and dispersion were reported as mean ± SD except where noted. Wilcoxon's signed-rank test was used for all paired statistical tests where sample size was at least 6. For smaller samples, paired t tests were used and normality was assessed using the Shapiro–Wilk test. Pairs of samples from unknown distributions were compared using the two-sample Kolmogorov–Smirnov test and empirical distributions were compared with theoretical distributions using the one-sample Kolmogorov–Smirnov test. Significance was defined as p < 0.05.
Results
Two modes of masseter activity are aligned with two modes of forelimb movements
To investigate oromanual coordination, we recorded high-speed, close-up video of head-fixed mice handling sunflower seed kernels and grain pellets, while simultaneously recording EMG signals from one masseter, plus the ipsilateral forelimb (biceps brachii or brachioradialis) in most cases (Fig. 1C). Doing so allowed us to simultaneously capture food handling-related forelimb movements and major jaw events such as biting and chewing (Fig. 1D). As shown in the example recordings (Fig. 1E, Extended Data Video 1-1), masseter EMG recordings comprised alternating periods of large-amplitude rhythmic activity and low-amplitude irregular activity that appeared to align with the holding and oromanual modes of the hands defined based on hand kinematics (Materials and Methods).
Extended Data Video 1-1
Basic example of multi-modal recording, with kinematics and EMG Associated with Fig. 1. Top: an example video recording of a mouse eating a grain pellet, shown in real time, with ethogram modes and different actions annotated. Middle: hand and jaw kinematic traces aligned to the video. Bottom: EMG recording traces aligned to the video. Download Extended Data Video 1-1, MP4 file.
To study this relationship in more detail, we recorded video and masseter activity from a cohort of mice (n = 9 mice; 7 with forelimb EMG; Table 1) and aligned the recordings to different food handling actions. First, we focused on transitions from holding epochs into oromanual epochs (transport-to-mouth) and from oromanual epochs into holding (lowering-from-mouth). In example recordings (Fig. 2A, Extended Data Video 2-1), rhythmic activity was evident in the masseter EMG traces that abruptly ceased around the transport-to-mouth and resumed around the following lowering-from-mouth movement. Concomitantly, there was an increase in forelimb EMG amplitude during oromanual epochs, between the transport-to-mouth and lowering-from-mouth actions. This pattern held across mice (Fig. 2B,C), with significantly higher mean masseter EMG amplitude during holding epochs than oromanual epochs (holding: 86 ± 49 µV, mean ± SD, n = 9 mice; oromanual: 45 ± 22 µV; Wilcoxon signed-rank: W = 0, p = 0.004) and vice versa for forelimb EMG amplitude (holding: 8.2 ± 4.8 µV, n = 7 mice with forelimb EMG recordings; oromanual: 20 ± 7 µV; W = 0, p = 0.02).
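For reference, W = 0 means that every mouse showed a difference in the same direction, and the reported p values are the smallest attainable for these sample sizes. Assuming an exact two-sided test with no ties or zero differences, only 2 of the 2^n equally likely sign patterns give W = 0, as this short calculation shows (the function name is illustrative):

```python
from fractions import Fraction

def min_two_sided_p(n):
    """Exact two-sided p value of a signed-rank test with W = 0 and n pairs."""
    return 2 * Fraction(1, 2 ** n)

print(float(min_two_sided_p(9)))   # 0.00390625, reported as 0.004
print(float(min_two_sided_p(7)))   # 0.015625, reported as 0.02
```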
Two modes of masseter activity are aligned with two modes of forelimb movements. A, Example of event alignment of the data traces, aligned to the onset of the transport-to-mouth action marking the start of oromanual epochs. Traces are sorted by oromanual epoch durations, giving a roughly right-triangle appearance. Black traces represent missing data due to events close to the start/end of recordings or, in the case of Djaw-nose, occlusion of the jaws by the hands. B, Average Dhand-nose and EMG traces across animals (n = 7 mice with forelimb EMG), aligned to the onset of the transport-to-mouth action (top) or to the last chew before an oromanual epoch (bottom). EMG traces are Z-scored to account for inter-animal variability in electrode impedances. C, Same but aligned to the onset of the lowering-from-mouth action (top) or to the first chew after an oromanual epoch (bottom). D, Plots, from left to right, of the mean amplitude, inter-event intervals, quartile coefficient of dispersion of inter-event intervals, and acoustic amplitude of the masseter “chew” and “bite” events, for each mouse (n = 9, gray lines, first three subpanels) or event (gray circles, last subpanel) and mean ± SD across mice/events (black lines or circles).
Extended Data Video 2-1
Event-aligned snippets. Associated with Fig. 2. Examples of the main hand actions involved in food handling (transport-to-mouth, lowering-from-mouth, and regrips) shown at one-tenth (one-fiftieth in the case of regrips) speed, formatted as in Extended Data Video 1-1.
From the masseter EMG, we identified large-amplitude events using peak detection. To assess how these related to ongoing hand movements, we compared masseter events occurring during holding versus oromanual epochs of the ethogram defined based on hand–nose distance (Materials and Methods). Masseter events during holding were larger in amplitude (holding: 457 ± 291 μV; oromanual: 253 ± 147 μV; W = 0, p = 0.004) and shorter in inter-event interval (holding: 243 ± 31 ms; oromanual: 449 ± 94 ms; W = 0, p = 0.004) and were also more regularly timed (based on quartile coefficients of dispersion of inter-event timings; holding: 0.14 ± 0.07; oromanual: 0.46 ± 0.08; W = 0, p = 0.004; Fig. 2D). The side peaks in the average trace on aligning to first or last masseter events during holding epochs emphasized the rhythmic nature of the masseter EMG (Fig. 2B,C). Based on previous reports of regular bouts of masticatory activity interspersed with irregular biting activity in masseter EMG during food consumption (Kobayashi et al., 2002), these results suggest that the regularly timed masseter events during holding and the irregularly timed events during oromanual epochs correspond to chews and bites, respectively.
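The regularity measure used above, the quartile coefficient of dispersion, is (Q3 − Q1)/(Q3 + Q1) of the inter-event intervals. The sketch below shows one way to compute it; the event times are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption, since the text does not specify one.

```python
import statistics

def inter_event_intervals(event_times):
    """Differences between consecutive event times (same units as input)."""
    return [b - a for a, b in zip(event_times, event_times[1:])]

def quartile_coefficient_of_dispersion(intervals):
    """(Q3 - Q1) / (Q3 + Q1): a scale-free, outlier-robust measure of spread."""
    q1, _, q3 = statistics.quantiles(intervals, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)

# Hypothetical event times in seconds (not data from the study).
regular_events = [0.00, 0.24, 0.49, 0.73, 0.98, 1.22, 1.47]
irregular_events = [0.00, 0.21, 0.95, 1.30, 2.10, 2.35, 3.40]

qcd_regular = quartile_coefficient_of_dispersion(
    inter_event_intervals(regular_events))
qcd_irregular = quartile_coefficient_of_dispersion(
    inter_event_intervals(irregular_events))
```

Because the measure is a ratio of quartiles, it is unaffected by the overall event rate, so the timing regularity of fast chews and slow bites can be compared directly.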
To confirm this, we classified masseter events as rhythmic chews or aperiodic bites based on their interevent intervals, without reference to hand position (Materials and Methods). We then compared this classification of masseter events to one that classifies masseter events during holding as chews and during oromanual as bites. There was a good correspondence between these classifications, even when accounting for the relatively higher proportion of chews than bites in the dataset (balanced accuracy: 88 ± 4%, median ± m.a.d., n = 9 mice; bootstrap test vs random assignment: p < 0.0001). For each holding to oromanual or oromanual to holding transition in the Dhand-nose ethogram, the average absolute latency to the nearest corresponding transition from chewing to biting or biting to chewing in the masseter ethogram was <300 ms (holding/chewing to oromanual/biting: 284 ± 56 ms, p < 0.0001; oromanual/biting to holding/chewing: 293 ± 78 ms, p < 0.0001). Given the high concordance between the two methods of classifying masseter events as bites or chews, we henceforth refer to masseter events during holding as “chews” and during oromanual as “bites.”
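Balanced accuracy is the mean of the per-class recalls, which is why it is robust to chews outnumbering bites. A minimal sketch follows, assuming that one classification is treated as the reference; the labels are hypothetical, and the choice of reference is an assumption made only for illustration.

```python
def balanced_accuracy(reference, predicted, classes=("chew", "bite")):
    """Mean of per-class recall, so the majority class cannot dominate."""
    recalls = []
    for cls in classes:
        idx = [i for i, label in enumerate(reference) if label == cls]
        hits = sum(1 for i in idx if predicted[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical labels for 10 masseter events: 8 chews and 2 bites.
by_hand_position = ["chew"] * 8 + ["bite"] * 2        # treated as reference
by_interval = ["chew"] * 8 + ["chew", "bite"]         # one bite mislabeled

plain_accuracy = sum(
    a == b for a, b in zip(by_hand_position, by_interval)) / len(by_interval)
score = balanced_accuracy(by_hand_position, by_interval)
```

In this toy case plain accuracy is 0.9 even though half of the bites are misclassified, whereas balanced accuracy is 0.75, which better reflects the error on the rarer class.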
Supporting this inference, chews and bites were also audibly distinct. For two additional mice feeding on farro grains, which are harder than sunflower seeds but similar in shape, we recorded eating noises (Table 1, Fig. 3A, Extended Data Video 3-1, Materials and Methods). Bites were louder than chews (Mouse #14—bites: 33 ± 24 µV, chews: 14 ± 17 µV, Mann–Whitney U test: n = 106 bites and 18 chews with audio peaks, U = 7176, p = 0.0001; Mouse #15—bites: 11.3 ± 8.9, chews: 6.3 ± 3.7 µV, n = 57 bites and 51 chews, U = 3637, p = 0.001; Fig. 2D) and were also more often audible, i.e., associated with peaks in the audio trace within 50 ms following the masseter EMG peak (39 or 43% of bites, 8 or 17% of chews, n = 2 mice). Audible and inaudible bites and chews were electromyographically similar (Fig. 3B,C). For audible bites, alignment of the acoustic, EMG, and kinematic data to the audio peaks revealed a sharp drop in masseter EMG amplitude precisely aligned to the sound, along with a slight increase in Dhand-nose (i.e., a downward movement of the hands and food item) shortly thereafter (Fig. 3D–F). Audible chews were less precisely aligned to the audio peak and did not show an obvious postpeak suppression (Fig. 3G–I). In summary, many bites are loud, more so than chews, and audible bites have a unique milliseconds-scale kinematic-electromyographic profile, corresponding to the successful incision and breaking-off of a food fragment.
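The audibility criterion, an audio peak within 50 ms following the masseter EMG peak, can be sketched as a search over sorted peak times. The peak times below are hypothetical, and treating the window as closed at both ends is an assumption.

```python
from bisect import bisect_left

def has_audio_peak(emg_peak_time, audio_peak_times, window=0.050):
    """True if an audio peak falls in [emg_peak_time, emg_peak_time + window].

    audio_peak_times must be sorted in ascending order; times are in seconds.
    """
    i = bisect_left(audio_peak_times, emg_peak_time)
    return (i < len(audio_peak_times)
            and audio_peak_times[i] - emg_peak_time <= window)

# Hypothetical peak times in seconds (not data from the study).
audio_peaks = [0.512, 1.940, 3.105]
bite_peaks = [0.500, 1.200, 1.930, 3.000]

audible = [has_audio_peak(t, audio_peaks) for t in bite_peaks]
fraction_audible = sum(audible) / len(audible)
```

Only the first audio peak at or after each EMG peak needs to be checked, because any later audio peak would be even further away.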
Bites are associated with audio peaks. A, Top, Dhand-nose (blue), Dhand-hand (red), Djaw-nose (brown), regrips (black circles), masseter EMG (teal), bites (black crosses), chews (black squares), and ethogram (gray) for an example farro handling recording. Bottom, Simultaneous acoustic recording with the behavior above. Audio peaks are highlighted with blue circles. B, Masseter EMG waveforms for bites with (right) and without (left) an associated audio peak within 50 ms following the masseter EMG peak. Thick line and error bands are mean ± SD over bites. Similar results were found for a second mouse. C, Same as B, but for chews. D, Left, Heatmap showing audio traces aligned to audio peaks associated with bites. Right, The same data, plotted as mean ± SD over events. E, Same as D, but for masseter EMG aligned to audio peaks. Audible bites showed postpeak suppression and rebound, as denoted by the arrow. F, Same as D, but for change in Dhand-nose aligned to audio peaks. Similar results as shown in E–G were found for a second mouse. G–I, Same as D–F, but for chews. Audible chews did not show postpeak suppression, as denoted by the arrow in H. Similar results were found for a second mouse.
Extended Data Video 3-1
Bites are audible. Associated with Fig. 3. Top: an example video recording of a mouse eating a dry farro grain, shown first in real time and again at one-tenth speed, accompanied by audio recorded from the microphone adjacent to the mouse’s head. Middle: acoustic traces aligned to the video. Bottom: EMG recording traces aligned to the video.
Having noted distinct patterns of masseter activity between holding and oromanual modes, we proceeded to examine masseter EMG activity in each mode separately.
Chewing occurs during holding epochs, with characteristic rhythmic properties
First, we focused on the holding mode (Fig. 4). As shown in the example trace (Fig. 4A), rhythmic masseter EMG activity consistent with chewing (mastication) appeared during holding epochs and was largely absent during oromanual epochs. This was also evident in the jaw–nose distance (Djaw-nose) during holding, when the jaw was not occluded by the hands. Thus, holding mode can be considered holding/chewing mode, with hands used to hold the food without manipulation, while the jaw is used actively for chewing.
Chewing occurs during holding epochs, with characteristic rhythmic properties. A, Example traces of Dhand-hand, Dhand-nose, Djaw-nose, masseter EMG, and holding/chewing versus oromanual ethogram for one complete holding/chewing–oromanual cycle, showing rhythmic activity during holding/chewing. B, Left, Peak-normalized power spectra for masseter EMG averaged over all holding/chewing epochs (blue) and all oromanual epochs (red), as well as peak-normalized power spectrum for Djaw-nose averaged over all holding/chewing epochs (yellow), for an example recording (top) and mean ± SD over mice (bottom, n = 9 mice). Right, Same, for the autocorrelograms of the same data. C, Correlation between electromyographically and kinematically measured peak frequencies (left) and autocorrelation lags (middle) and between frequencies determined from EMG by spectral versus autocorrelation analysis (right). Circles are individual mice, error bars mean ± SD over mice. D, Alignment of Djaw-nose (brown) and masseter EMG (teal) traces to chews (masseter EMG peaks during holding/chewing), showing phase offset of jaw kinematics. Top, Mean ± SD over events for the example recording in A and B; bottom, mean ± SD over mice. Maximal EMG activity occurs as the apparent jaw gape aperture approaches a minimum. E, Left, Mean lag between the Djaw-nose and masseter EMG peaks during holding/chewing. Circles are individual mice, error bars mean ± SD over mice. Right, Circular histogram of the phase offset between Djaw-nose and masseter EMG peaks during holding/chewing. Polar axis represents the number of mice in each bin of the histogram.
This was confirmed by spectral analysis, with the power spectrum of both masseter EMG and jaw tracking showing a consistent peak during holding/chewing epochs but not oromanual epochs, at 4–6 Hz (EMG: 4.6 ± 0.5 Hz, median ± m.a.d., n = 9 mice; jaw tracking: 5.0 ± 0.6 Hz; Fig. 4B, left). The rhythmicity of masseter activity specifically during holding/chewing epochs was also evident in the autocorrelation (Fig. 4B, right), and the frequency of chewing was highly consistent regardless of whether it was measured from jaw tracking or masseter EMG, or from the power spectrum or autocorrelation (Fig. 4C). The measured 4–6 Hz chewing frequency is also consistent with prior reports (Kobayashi et al., 2002).
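As an illustration of how a peak frequency can be read from a power spectrum, the sketch below applies a plain periodogram to a synthetic 5 Hz signal. The sampling rate, signal, and search band are assumptions chosen for the example, and the study's actual spectral estimator (Materials and Methods) may differ.

```python
import numpy as np

def peak_frequency(signal, fs, f_min=1.0, f_max=20.0):
    """Frequency (Hz) of the largest periodogram peak within [f_min, f_max]."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()              # remove the DC offset
    power = np.abs(np.fft.rfft(signal)) ** 2     # unnormalized periodogram
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    band = (freqs >= f_min) & (freqs <= f_max)
    return freqs[band][np.argmax(power[band])]

# Synthetic stand-in for a rectified EMG envelope: 5 Hz rhythm plus noise.
fs = 1000.0                                      # assumed sampling rate (Hz)
t = np.arange(4000) / fs                         # 4 s of samples
rng = np.random.default_rng(0)
envelope = 1.0 + np.sin(2 * np.pi * 5.0 * t) + 0.3 * rng.standard_normal(t.size)

f_peak = peak_frequency(envelope, fs)
```

With 4 s of data the frequency resolution is 0.25 Hz, so distinguishing rhythms within the 4–6 Hz band in this way requires epochs of at least a few seconds or averaging across epochs.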
Aligning to masseter EMG peaks during holding/chewing revealed a highly consistent and precise temporal relationship between peak jaw aperture (maximum Djaw-nose) and maximum masseter EMG amplitude (Fig. 4D), with the former occurring 73 ± 10 ms before the latter (Fig. 4E, left). This corresponds to a phase relationship of −123 ± 21° (Fig. 4E, right), indicating that maximum masseter contraction occurs just before minimum jaw aperture as food is crushed between molars.
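The conversion between the time lag and the phase offset can be checked with a one-line calculation. The sketch below uses the group-level values quoted in the text and assumes a sign convention in which a negative phase means that the jaw-aperture peak leads the EMG peak.

```python
def lag_to_phase_deg(lead_s, frequency_hz):
    """Phase offset in degrees for a lead time within one cycle of a rhythm.

    The result is negative when the first signal leads the second
    (sign convention assumed for this example).
    """
    return -360.0 * lead_s * frequency_hz

# Group-level values from the text: 73 ms lead, ~4.6 Hz chewing rhythm.
phase = lag_to_phase_deg(0.073, 4.6)
```

This gives approximately −121°, close to the reported −123 ± 21°; a small discrepancy is expected if the reported phase was computed per mouse from individual chewing frequencies, which is an assumption here.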
In summary, masseter EMG activity during holding/chewing epochs is dominated by rhythmic contractions generated during chewing.
Rapid regrip–bite sequences are a hallmark of oromanual/ingestion epochs
Next, we studied masseter activity during oromanual epochs. As shown in the example (Fig. 5A), there was no rhythmic chewing. Instead, we observed irregularly sized and timed masseter events that mostly occurred in isolation, corresponding to biting events (Kobayashi et al., 2002), hence oromanual mode can be considered oromanual/ingestion mode. Note that “ingestion” as used here refers to “transferring food from the environment to the oral cavity,” preceding and distinct from mastication and deglutition (Hiiemae and Ardran, 1968). The lack of clear side peaks in the bite-aligned masseter EMG (Fig. 5B,C), and the lack of peaks in the power spectrum of masseter EMG during oromanual epochs (Fig. 4B) demonstrated the relative lack of rhythmicity of biting compared with chewing. Consistent with the acoustic event alignment analysis above, following bites, the Dhand-nose tended to increase and forelimb EMG amplitude decrease. Inspection of individual bite-aligned traces suggested that regrips, identified as peaks in Dhand-hand, occur shortly before, but not after, bites. Accordingly, we performed the converse analysis of aligning masseter EMG traces to regrips (Fig. 5D,E). As previously noted (Barrett et al., 2020), regrips often occur in short, rhythmic bursts. Therefore, to minimize contamination from closely spaced events in the aligned traces, we separately aligned the data to isolated (“solo”) regrips, first regrips of a burst, and last regrips of a burst. Alignment to solo and last regrips showed a sharp, short-latency increase in masseter EMG amplitude, suggesting bites often occur immediately following a burst of regrips. Consistent with this, the average interval between consecutive regrips and bites was much shorter than that between consecutive bites and regrips (regrip→bite: 132 ± 25 ms; bite→regrip: 354 ± 104 ms; Wilcoxon signed-rank: n = 9, W = 0, p = 0.004).
Additionally, forelimb EMG traces aligned to regrips showed a triphasic pattern wherein regrip-aligned forelimb EMG tended to peak shortly before a regrip and then decrease again, followed by a second peak aligned with the Dhand-hand peak.
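The directional interval comparison can be sketched as follows: given a time-sorted list of labeled events, collect the gaps between consecutive events separately for each ordering. The event list is hypothetical, and restricting the measure to immediately consecutive events is an assumption about how the intervals were defined.

```python
def directed_intervals(events, first, second):
    """Intervals between consecutive events where `first` precedes `second`.

    `events` is a time-sorted list of (time_s, label) pairs.
    """
    return [
        t1 - t0
        for (t0, a), (t1, b) in zip(events, events[1:])
        if a == first and b == second
    ]

# Hypothetical oromanual epoch (times in seconds, not data from the study).
events = [
    (0.00, "regrip"), (0.10, "regrip"), (0.22, "bite"),
    (0.60, "regrip"), (0.73, "bite"),
    (1.05, "regrip"), (1.15, "regrip"), (1.29, "bite"),
]

regrip_to_bite = directed_intervals(events, "regrip", "bite")
bite_to_regrip = directed_intervals(events, "bite", "regrip")
```

In this toy epoch every regrip→bite gap is shorter than every bite→regrip gap, which is the asymmetry that identifies regrip–bite sequences rather than bite–regrip sequences.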
Rapid regrip–bite sequences are a hallmark of oromanual/ingestion epochs. A, Example traces of Dhand-hand, Dhand-nose, Djaw-nose, masseter EMG, forelimb EMG, and holding/chewing versus oromanual/ingestion ethogram for two example oromanual/ingestion epochs. Bites and chews are indicated, along with regrips occurring as solo, doublet, or triplet events. B, Example heatmaps showing event-alignment of traces to bites. C, Bite-aligned traces, mean ± SD across mice (n = 7 mice with forelimb EMG). D, Example traces and heatmaps showing event alignment to solo regrips (left) and, in the case of “bursts” of multiple regrips, to the first (middle) or last (right) regrips of a burst. E, Average Dhand-hand, masseter EMG, and forelimb EMG traces aligned to solo, first, and last regrips as in D. Data are mean ± SD across mice.
In summary, mice literally “grab a bite” when they handle food, in the sense that they first adjust their grip on the morsel (by performing one or more regrips) and then bite it immediately thereafter. These rapid regrip–bite sequences are a prominent and characteristic feature of the oromanual/ingestion mode of food handling. Additionally, the overall pattern of food handling actions was preserved across a wide range of food hardnesses, with most parameters being similar for all tested food types (Fig. 6).
Food handling parameters are not affected by foods differing in hardness. A, Left, Example kinematic traces (blue, Dhand-nose; red, Dhand-hand; brown, Djaw-nose; gray, ethogram), EMG traces (teal), and identified events (crosses, bites; squares, chews; circles, regrips) from a recording of a mouse eating a Fruity Gem, walnut fragment, sunflower seed kernel, or farro grain (ordered by increasing food hardness). Right, matrices showing probabilities of one food handling action following another. T, transport-to-mouth; R, regrip; B, bite; L, lowering-from-mouth; C, chew. Transition matrices were calculated from all videos for each food type for each mouse, then averaged over n = 2 mice. B, Quantification of various food handling parameters for each food type, including median chew amplitude, bite amplitude, interchew interval, inter-bite interval, regrip→bite interval, and bite→regrip interval. Light-colored circles are medians for individual holding/chewing or oromanual/ingestion epochs, error bars are median ± m.a.d. over epochs. Data from n = 2 mice with recordings of all four food hardnesses are shown in different colors.
Hierarchical coordination of hands and jaw during food handling
Having characterized the timing of bites, chews, and regrips within holding/chewing and oromanual/ingestion epochs, we used simulations to explore their temporal structure in greater detail. The large difference in length of regrip→bite and bite→regrip intervals noted above suggests, but does not guarantee, that they are interdependent, i.e., not generated by independent processes. To test this possibility, we simulated bite and regrip timings using the observed distributions of inter-event intervals under the assumption that they were generated by independent stochastic processes (Fig. 7A, Materials and Methods). The distribution of simulated regrip→bite intervals was skewed much longer than the real distribution (two-sample Kolmogorov–Smirnov test: D = 0.21, p = 0.006) and similarly the simulated bite→regrip intervals were much shorter (D = 0.40, p = 3.6 × 10⁻⁹; Fig. 7A). This suggests that bites and regrips are not controlled independently, instead supporting the idea of close coordination between the hands and jaw.
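The logic of the independence test can be sketched by generating regrip and bite trains from two unrelated renewal processes and comparing the resulting directional intervals with a two-sample Kolmogorov–Smirnov statistic. The interval pools below are synthetic draws from gamma distributions, not data from the study, and the resampling scheme is an assumed simplification of the procedure in Materials and Methods.

```python
import random
from bisect import bisect_right

def renewal_train(intervals, duration, rng):
    """Event times built by resampling inter-event intervals with replacement."""
    times, t = [], rng.choice(intervals) * rng.random()  # random start phase
    while t < duration:
        times.append(t)
        t += rng.choice(intervals)
    return times

def directed_intervals(events, first, second):
    """Gaps between consecutive events where `first` precedes `second`."""
    return [t1 - t0 for (t0, a), (t1, b) in zip(events, events[1:])
            if a == first and b == second]

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two ECDFs."""
    xs, ys = sorted(x), sorted(y)
    return max(abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
               for v in xs + ys)

rng = random.Random(0)
# Synthetic inter-event interval pools in seconds (not data from the study).
regrip_iei = [rng.gammavariate(2.0, 0.25) for _ in range(200)]
bite_iei = [rng.gammavariate(2.0, 0.30) for _ in range(200)]

regrips = [(t, "regrip") for t in renewal_train(regrip_iei, 2000.0, rng)]
bites = [(t, "bite") for t in renewal_train(bite_iei, 2000.0, rng)]
merged = sorted(regrips + bites)

sim_regrip_to_bite = directed_intervals(merged, "regrip", "bite")
sim_bite_to_regrip = directed_intervals(merged, "bite", "regrip")
d_sim = ks_statistic(sim_regrip_to_bite, sim_bite_to_regrip)
```

When the two trains are independent, the simulated regrip→bite and bite→regrip distributions nearly coincide, so `d_sim` is small; the pronounced asymmetry in the real data is therefore what argues against independent control.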
Hierarchical coordination of hands and jaw during food handling. A, Empirical cumulative distribution functions (ECDFs) for regrip→bite intervals (magenta) and bite→regrip intervals (cyan). Solid lines, mean ECDF over mice (n = 9 mice). Dashed lines, mean simulated ECDFs assuming independence for regrip→bite and bite→regrip intervals (because the two lines were virtually identical only one is shown). B, Left, A flat model in which all food handling actions are sequenced by a single semi-Markov process. Right, ECDFs of holding/chewing (blue) and oromanual/ingestion (red) action sequence lengths. Thick lines, mean over mice; thin lines, theoretical ECDFs under the flat model. C, Left, A hierarchical model in which an overarching two-state process governs transitions between holding/chewing and oromanual/ingestion, and separate subprocesses for each state govern chews and bites/regrips, respectively. Right, ECDFs of holding/chewing (blue) and oromanual/ingestion (red) action sequence lengths. Thick lines, mean over mice; thin lines, mean simulated ECDFs.
Given the interdependence of bites and regrips, we sought to examine whether a single model could capture the particular sequencing of all food-handling actions (transports-to-mouth, regrips, bites, lowerings-from-mouth, chews). Due to the inherently nested form of the behavior (i.e., bites and regrips within one mode, chews within another), we compared a flat semi-Markov model in which all actions are governed by a single transition matrix, to a hierarchical semi-Markov model. In this latter model, transitions between the two modes (holding/chewing vs oromanual/ingestion) are sequenced separately from the actions within them (chews vs bites/regrips, respectively; Fig. 7B,C; Materials and Methods). We evaluated both models by testing how well they reproduced the experimentally observed distribution of sequence lengths, i.e., numbers of chews in holding/chewing epochs, and of bites and regrips in oromanual/ingestion epochs. For the flat model, the expected distribution of sequence lengths can be determined analytically (Materials and Methods). For the hierarchical model, we simulated sequences of food handling actions by drawing randomly from the experimentally observed epoch durations and inter-event intervals to estimate the expected distribution of sequence lengths (Materials and Methods). Both models were trained on one half of the data and tested on the other. Two mice were excluded due to insufficient data after splitting. The flat model did not reproduce the experimentally observed distributions of sequence lengths (one-sample Kolmogorov–Smirnov test—holding/chewing: D = 0.18, p = 0.003, n = 7 mice; oromanual/ingestion: D = 0.18, p = 0.003; Fig. 7B). Comparison of the real and simulated distributions of sequence lengths under the hierarchical model showed no significant differences (two-sample Kolmogorov–Smirnov test—holding/chewing: D = 0.07, p = 0.95; oromanual/ingestion: D = 0.13, p = 0.15; Fig. 7C). 
The hierarchical model was also a better fit to the data than the flat model (sum of squared differences between cumulative distribution functions—holding/chewing: 0.41 flat vs 0.17 hierarchical; oromanual/ingestion: 0.21 flat vs 0.09 hierarchical). In summary, these analyses show tight coordination between the hands and jaws, rather than independent operation of each, and show that the orchestration of food handling can be captured in a simple hierarchical model based on the experimentally characterized modes.
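The two-level structure of the hierarchical model can be sketched as a simulation in which an upper level draws an epoch duration and a lower level fills that epoch with actions by drawing inter-event intervals. This is a deliberately simplified illustration: all numerical values are hypothetical, bites and regrips are lumped into a single action stream, and the study's actual model (Materials and Methods) may differ in these and other details.

```python
import random

def simulate_epoch_counts(epoch_durations, intervals, n_epochs, rng):
    """Number of actions per simulated epoch for one mode.

    Upper level: draw an epoch duration. Lower level: keep drawing
    inter-event intervals for that mode until the next one no longer fits.
    """
    counts = []
    for _ in range(n_epochs):
        remaining = rng.choice(epoch_durations)
        n_actions = 0
        while True:
            step = rng.choice(intervals)
            if step > remaining:
                break
            remaining -= step
            n_actions += 1
        counts.append(n_actions)
    return counts

rng = random.Random(1)
# Hypothetical values in seconds (not data from the study).
holding_durations = [1.0, 1.5, 2.0, 3.0, 4.5]
chew_intervals = [0.20, 0.22, 0.24, 0.26, 0.30]
oromanual_durations = [0.8, 1.2, 2.0, 2.5]
bite_or_regrip_intervals = [0.12, 0.15, 0.30, 0.35, 0.50]

chews_per_epoch = simulate_epoch_counts(
    holding_durations, chew_intervals, 1000, rng)
actions_per_epoch = simulate_epoch_counts(
    oromanual_durations, bite_or_regrip_intervals, 1000, rng)
```

The simulated counts play the role of the simulated sequence-length distributions: their ECDFs would be compared with the observed numbers of chews, or of bites and regrips, per epoch.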
Incisal sharpening and masseter-related physiological proptosis
We also observed another type of rhythmic masseter EMG event that occurred intermittently during holding/chewing or between food items. Examples of such events are shown in Figure 8A, flanked by two chewing bouts, and Extended Data Video 8-1. Spectral analysis showed most of the power was concentrated in the 10–15 Hz frequency band, compared with the 4–6 Hz band for chewing. Unlike chewing, this high-frequency activity was accompanied by minimal jaw movement, and there was limited increase in power in either band in the jaw tracking spectrogram. We used this spectral signature to search for additional bouts of high-frequency events, identified as periods in which the ratio of masseter EMG and jaw tracking spectrogram power in the 4–15 Hz band was large (Materials and Methods). In total, we identified eight such bouts from n = 3 mice and compared these with chewing bouts (identified by masseter EMG peak detection during holding/chewing as previously). There was no significant difference in average bout duration between high-frequency events and chewing (high-frequency events: 3.3 ± 0.8 s; chewing: 1.7 ± 0.8 s; paired t test: n = 3, t2 = 3.2, p = 0.09), nor in peak amplitude (high-frequency events: 175 ± 87 μV; chews: 233 ± 176 μV; paired t test: n = 3, t2 = −0.8, p = 0.49) but the former occurred at significantly higher frequency (high-frequency events: 13.1 ± 0.5 Hz; chews: 5.2 ± 0.4 Hz; t2 = 18.3, p = 0.003; Fig. 8B). The frequency of these events is consistent with that previously reported for incisal sharpening in rats and sciurid rodents, in which the hard enamel of the labial surface of one tooth is rubbed against the softer dentin of the lingual surface in order to sharpen the teeth (Druzinsky, 1995; Byrd, 1997; Taylor et al., 2017). Consistent with this, we also observed small, rhythmic oscillation of the jaw position (Fig. 8C).
Based on their constellation of characteristic features, we infer that these bouts of high-frequency events represent incisal sharpening.
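With n = 3 mice, a paired t test has n − 1 = 2 degrees of freedom, for which the two-sided p value has a simple closed form. The sketch below reproduces the p values for the duration and frequency comparisons from the reported t statistics; the function name is illustrative.

```python
import math

def two_sided_p_df2(t):
    """Two-sided p value for Student's t with 2 degrees of freedom.

    For df = 2 the CDF is F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)),
    which gives p = 1 - |t| / sqrt(t^2 + 2).
    """
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)

p_duration = two_sided_p_df2(3.2)     # bout duration comparison
p_frequency = two_sided_p_df2(18.3)   # event frequency comparison
```

These evaluate to approximately 0.085 and 0.003, in agreement with the reported values and illustrating how large a t statistic must be to reach significance with only three animals.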
Incisal sharpening and masseter-related physiological proptosis. A, Masseter EMG and Djaw-nose traces from an example tooth-sharpening bout, flanked by two chewing bouts. Note the low-frequency oscillation in both EMG and kinematics during chewing compared with the high-frequency EMG events with minimal change in jaw position during sharpening. Part of the Djaw-nose trace after the sharpening bout was blanked (dashed line) due to occlusion by the hands. B, Mean frequency of masseter EMG peaks during sharpening and chewing bouts in n = 3 mice. C, Peak aligned traces for sharpening (left), chewing (middle), and biting (right). Thin lines are means over events for individual mice and normalized to the range [0,1]; thick lines are mean over mice. Teal, masseter EMG; brown, Djaw-nose; red, vertical position of eye centroid; purple, area of eye. D, Example images showing sharpening-associated movements of the eyes and lower jaw. Images are single frames taken from the time of greatest proptosis during a bout of sharpening (S, top right) and immediately prior to the same sharpening bout (B, top left). These were subtracted to generate a difference image (bottom), showing the sharpening-related bulging of the right (yellow arrow) and left (cyan arrow) eyes and the prognathic anterior movement of the lower jaw (magenta arrow).
Extended Data Video 8-1
Incisal sharpening and eye bulging. Associated with Fig. 8. Top: two examples of mice engaged in tooth sharpening, each shown first in real time and then again at one-tenth speed. The first example involves exaggerated proptosis, whereas the second involves subtler but still noticeable proptosis. Middle: eye and jaw kinematic traces aligned to the video. Bottom: EMG recording traces aligned to the video.
We also noticed that incisal sharpening was sometimes accompanied by overt rhythmic proptosis (“popping” or bulging of the eyes), which could be of strikingly large amplitude (Extended Data Video 8-1, first example; Fig. 8D). To quantify this, we tracked the position of the left eye in the left camera view using DeepLabCut and measured the change in vertical position and total area of the eye. We then aligned the eye tracking data to the masseter EMG. This showed that proptosis occurred in a subset of sharpening bouts (three bouts from two mice); the remainder (five bouts from three mice) were accompanied by more subtle changes in eye position and area (Extended Data Video 8-1, second example). Given the possibility that a physiological form of proptosis might accompany masseter contractions in general, we extended the analysis to include biting and chewing actions as well as incisal sharpening. Indeed, across all three types of masseter event, eye movements generally occurred in sync with contractions of the masseter and accompanying jaw movements (Fig. 8C). The phase relationship was similar across all three actions, with eye position and area in antiphase with Djaw-nose, indicating that the eyes are pushed forward as the masseter contracts and the jaw reaches maximum elevation.
Discussion
“Handling” of food by tetrapodal vertebrates, in the broadest sense of preprocessing and preparing food items for ingestion, is an integral yet understudied component of foraging behavior, proposed to have fundamentally shaped mammalian evolution (Cisek, 2022). Our study, focusing on hand–jaw dexterity of mice as they manipulate and consume food, provides a detailed, simultaneous, multimodal characterization, with high spatial and temporal resolution, of both the oral and manual aspects of the process of food handling in this species, demonstrating intricate coordination (Fig. 9).
Schematic summary. Top, Food handling is depicted within the broader context of foraging, which involves searching for, handling, and internalizing food items. Cycles of manipulation and mastication (chewing), characterized in this study, constitute the last stages of the handling phase. Middle, These cycles are shown in greater detail in an ethogram-style depiction of alternation between oromanual/ingestion and holding/chewing modes. Bottom, Illustrations of the associated postures and actions of the hands and jaw for the two modes.
Whereas the two modes of hand (Barrett et al., 2020, 2022) and jaw (Kobayashi et al., 2002) movements during food consumption have been noted separately, here we demonstrate that these two modes are in sync for both hands and jaws. Just as the hands cycle between the static holding and active oromanual modes, masseter activity switches between rhythmic chewing and irregular biting modes, with the former ceasing abruptly at the transition to oromanual and resuming with the return to holding, thus creating coordinated holding/chewing and oromanual/ingestion modes. Within the oromanual/ingestion mode, there is a tight coordination between regrips and bites, with the latter often occurring very shortly after bursts of the former, as confirmed by regrip→bite intervals being shorter on average than bite→regrip intervals.
Occlusion of the jaw by the hands and food during the oromanual/ingestion mode of food handling interferes with the tracking of jaw movements. Here, we overcame this by implanting EMG electrodes in the masseter muscle. This allowed us to measure masseter contractions with high temporal resolution but still precluded measurement of jaw kinematics during oromanual/ingestion epochs. Nevertheless, comparison of jaw tracking and masseter EMG signals during holding/chewing epochs, when the jaw is visible, shows a tight correlation between the two. In mammals the mandible is controlled by multiple muscles including the masseter, digastric, temporalis, and pterygoids, with distinct activities in relation to the chewing cycle (Hiiemae, 1971; Weijs and Dantuma, 1975; Druzinsky, 1995; Kobayashi et al., 2002; Yoshimi et al., 2017; Williams, 2019). Since the goal of this study was to relate hand movements to jaw activity (rather than, e.g., detailed characterization of jaw muscle activity during consumption), we focused on just one jaw muscle, the masseter, due to its functional importance, large size, and ease of implantation. Alternative methods for tracking jaw actions, such as using Hall effect sensors (Koga et al., 2001) and XROMM (Brainerd et al., 2010), are of interest but may be limited for resolving fast sequences of small food handling movements. Previous studies have used acoustic recordings to identify bites (Allred et al., 2008; Whishaw et al., 2017; An et al., 2022), which however requires very hard foods such as dry farro or pasta. We found that only a subset of masseter EMG events were associated with detectable sounds, consistent with the possibility that inaudible and audible bites reflect gnaws and ingestive bites, respectively (Hiiemae and Ardran, 1968). 
Postpeak suppression of masseter EMG activity following audible bites but not chews is consistent with experiments involving tooth tapping of human molars and incisors (van der Glas et al., 1985; Brinkworth et al., 2004). Another consideration is that aspects of the food item (e.g., current size, position, orientation) as well as of the mouse's internal state (e.g., satiety) likely influence the sequencing of food handling actions; as these variables were not tracked here, this remains an avenue for further study.
Many natural and learned behaviors exhibit hierarchical organization (Fentress and Stilwell, 1973; Glaze and Troyer, 2006; Botvinick, 2008; Moore et al., 2013; Geddes et al., 2018; Kaplan et al., 2020; Mazzucato, 2022). Organization of mouse behavior into identifiable syllables (Wiltschko et al., 2015) implies at least two hierarchical levels, one controlling action timing and kinematics over short timescales and the other governing the transitions between behavioral modes over longer timescales. Consistent with this, a hierarchical semi-Markov model was able to accurately reproduce the distribution of numbers of actions performed during holding/chewing and oromanual/ingestion epochs of food handling. Similar models have been successfully applied to model Drosophila open-field behavior (Berman et al., 2016; Tao et al., 2019). This organization forms the lower levels of a larger hierarchy, with food handling itself nested within the foraging cycle of searching, handling, and internalizing (Fig. 9).
Our analyses also raise the question of the neural substrates of these distinct processes. Multiple cortical areas are associated with hand and orofacial movements, and their activity increases around oromanual/ingestion epochs in particular (Hira et al., 2015; Mayrhofer et al., 2019; Mercer Lindsay et al., 2019; An et al., 2022; Barrett et al., 2022; Yang et al., 2023). Activity in forelimb primary motor cortex around transitions from holding/chewing to oromanual/ingestion may relate to the holding/chewing–oromanual/ingestion process in the hierarchical model, while the sustained oromanual/ingestion-associated activity in more anterior and lateral areas may relate to the bite–regrip process within the oromanual/ingestion state (Barrett et al., 2022). Both these cortical areas project to brainstem nuclei implicated in skilled forelimb movements, including those related to food handling (Mercer Lindsay et al., 2019; Ruder et al., 2021; Sainsbury and Mathis, 2023; Yang et al., 2023; Vargas et al., 2024). The holding/chewing component of the model likely involves the chewing central pattern generator (CPG) in the brainstem (Morquette et al., 2012; Falardeau et al., 2023; Kleinfeld et al., 2023). Jaw movements evoked by cortical stimulation are usually rhythmic (Morquette et al., 2012; Falardeau et al., 2023), which raises the question of whether and how descending cortical activity can evoke aperiodic jaw movements such as biting. One possibility is that cortical areas involved in biting inhibit the chewing CPG. An alternative is that sensory feedback from the incisors arising during the oromanual/ingestion mode inhibits the chewing CPG, allowing isolated bites without rhythmic activity. 
The rhythmicity of chewing was implicit in the model by the use of the empirical distribution of inter-chew intervals, but this formulation could be made more explicit by the inclusion of oscillatory components in the model, as has been successful for other rhythmic orofacial actions (Liao and Kleinfeld, 2023).
During the course of our experiments, we also serendipitously identified several high-frequency tooth-sharpening bouts. Tooth sharpening is an important behavior for maintaining the health and sharpness of the continuously erupting incisors in rodents (Krumbach, 1904; Bennett, 1990; Druzinsky, 1995; Neveu and Gasc, 1999; Druzinsky, 2015). The large field of view, high resolution, and high frame rate of our videos allowed us to detect and quantify this behavior in mice and also changes in eye size and position during chewing, biting, and sharpening events, showing how the eye bulges out—in some cases dramatically so—as the jaw reaches maximum elevation. Rhythmic eye bulging simultaneous with jaw movements has been noted in the fancy rat community, where it is referred to as “boggling” (Neville et al., 2022). The most pronounced proptosis likely occurs during sharpening of the lower incisors, when the lower incisors are protracted anterior to the upper incisors so that the lingual surface of the lower incisor can be ground against the tip of the upper incisor. This represents the anatomical limit of anterior excursion of the lower jaw, during which the temporalis muscle, attached to the coronoid process of the mandible, pushes up against the globe of the eye as the mandible moves forward. However, our data reveal that detectable proptosis associated with jaw movements occurs even during less extreme movements, such as chewing.
In summary, we have combined high-speed, high-resolution tracking of hand movements with a quantitative description of jaw activity during food handling to reveal fast, intricate coordination between hand and jaw movements during this behavior and to identify additional masseter-related actions. These results exemplify how morphological features influence the organization of complex motor behavior and provide a basis for future studies of oromanual coordination and its neural and neuromuscular mechanisms.
Footnotes
We thank Nina Kraus and Jonathan H Siegel for advice on acoustic recording hardware and David Kleinfeld, Arlette Kolta, and Daniela Piña Novo for comments and suggestions. Myomatrix arrays were provided as part of the EMORY-SKAN Remote Workshop for Advanced EMG Methods, funded by the Simons-Emory International Consortium on Motor Control, and for which we thank Samuel Sober, Bryce Chung, and Amanda Jacob. Funding support included grants from the National Institutes of Health/National Institute of Neurological Disorders and Stroke (1R21NS135642, 5R01NS061963, 1R37NS061963).
The authors declare no competing financial interests.
Correspondence should be addressed to John M. Barrett at john.barrett{at}cantab.net.















