Research Articles, Behavioral/Cognitive

The Representation of Observed Actions at the Subordinate, Basic, and Superordinate Level

Tonghe Zhuang, Zuzanna Kabulska and Angelika Lingnau
Journal of Neuroscience 29 November 2023, 43 (48) 8219-8230; https://doi.org/10.1523/JNEUROSCI.0700-22.2023
Author affiliations: Tonghe Zhuang, Zuzanna Kabulska, and Angelika Lingnau, Faculty of Human Sciences, Institute of Psychology, Chair of Cognitive Neuroscience, University of Regensburg, 93053 Regensburg, Germany

Abstract

Actions can be planned and recognized at different hierarchical levels, ranging from very specific (e.g., to swim backstroke) to very broad (e.g., locomotion). Understanding the corresponding neural representation is an important prerequisite to reveal how our brain flexibly assigns meaning to the world around us. To address this question, we conducted an event-related fMRI study in male and female human participants in which we examined distinct representations of observed actions at the subordinate, basic and superordinate level. Using multiple regression representational similarity analysis (RSA) in predefined regions of interest, we found that the three different taxonomic levels were best captured by patterns of activations in bilateral lateral occipitotemporal cortex (LOTC), showing the highest similarity with the basic level model. A whole-brain multiple regression RSA revealed that information unique to the basic level was captured by patterns of activation in dorsal and ventral portions of the LOTC and in parietal regions. By contrast, the unique information for the subordinate level was limited to bilateral occipitotemporal cortex, while no single cluster was obtained that captured unique information for the superordinate level. The behaviorally established action space was best captured by patterns of activation in the LOTC and superior parietal cortex, and the corresponding neural patterns of activation showed the highest similarity with patterns of activation corresponding to the basic level model. Together, our results suggest that occipitotemporal cortex shows a preference for the basic level model, with flexible access across the subordinate and the basic level.

SIGNIFICANCE STATEMENT The human brain captures information at varying levels of abstraction. It is debated which brain regions host representations across different hierarchical levels, with some studies emphasizing parietal and premotor regions, while other studies highlight the role of the lateral occipitotemporal cortex (LOTC). To shed light on this debate, here we examined the representation of observed actions at the three taxonomic levels suggested by Rosch et al. (1976). Our results highlight the role of the LOTC, which hosts a shared representation across the subordinate and the basic level, with the highest similarity with the basic level model. These results shed new light on the hierarchical organization of observed actions and provide insights into the neural basis underlying the basic level advantage.

  • action categorization
  • action observation
  • action recognition

Introduction

Depending on the circumstances, different aspects of an action become relevant. As an example, we might be interested in the type of punch when watching a boxing match, while we might be more concerned with the broader distinction between attacking and greeting when approaching a stranger at night. How the brain adapts its representational states to achieve this flexibility is a key question in Cognitive Neuroscience.

The hierarchical organization of objects has been studied for decades (Gauthier et al., 1997; Mack et al., 2008; Carlson et al., 2013; Iordan et al., 2015). Rosch et al. (1976) argued that objects can be organized into the superordinate (e.g., furniture), basic (e.g., chair), and subordinate level (e.g., kitchen chair), depending on the degree of abstraction, and that the basic level plays a central role in categorization, e.g., in terms of the number and types of features used to describe an object, and in terms of the speed of processing (see also Mack et al., 2008; Macé et al., 2009). Moreover, different taxonomic levels of objects have been shown to be dissociated at the neural level (Kriegeskorte et al., 2008; Iordan et al., 2015; Dehaqani et al., 2016), and it has been proposed that the ventral temporal cortex (VTC) has flexible access to these different levels (Grill-Spector and Weiner, 2014).

Likewise, the planning and control of actions is assumed to be organized hierarchically (Gallivan et al., 2013; Kadmon Harpaz et al., 2014; Krasovsky et al., 2014; Ariani et al., 2015; Gallivan and Culham, 2015; Turella et al., 2020). Similar hierarchies have been proposed to underlie the organization of observed actions. Several authors distinguished between the How, What and Why level (Vallacher and Wegner, 1985; Wegner and Vallacher, 1986; Spunt et al., 2016). Hamilton and Grafton (2006, 2008) distinguished between the goal level (corresponding to the purpose/outcome of an action), the muscle level and the kinematic level, while Wurm and Lingnau (2015) distinguished between different levels of abstraction (e.g., opening vs closing a bottle).

It is assumed that areas involved in action recognition should show invariance to the way the actions are performed (Hamilton and Grafton, 2006, 2008; Oosterhof et al., 2010, 2012; Wurm and Lingnau, 2015). Several studies have highlighted the role of parietal and premotor regions for action representations at the goal level that generalize across the muscle or kinematic level (Hamilton and Grafton, 2006, 2008; Majdandzic et al., 2009; see also Aflalo et al., 2020; Lanzilotto et al., 2020). Wurm and Lingnau (2015) revealed representations of observed actions at a concrete level (specific for the object and kinematics) in the LOTC, inferior parietal lobule (IPL), and ventral premotor cortex (PMv), whereas representations at an abstract level (generalizing across object and kinematics) were restricted to the IPL and LOTC (see also Wurm et al., 2016). In sum, previous studies successfully distinguished between observed actions at varying hierarchical levels, with some studies highlighting the role of parietal and premotor regions and others emphasizing the role of the LOTC. However, to the best of our knowledge, no previous neuroimaging study directly compared the three taxonomic levels proposed by Rosch et al. (1976). The current study aims to fill this gap.

Zhuang and Lingnau (2022) examined the characteristics of observed actions at the three taxonomic levels. Actions at the three levels differed with respect to the number and type of features participants used to describe them, and in their ratings of abstraction. Moreover, participants verified the action category faster at the basic and subordinate level in comparison to the superordinate level. Together, these results suggest that the basic level holds the maximized information, consistent with the basic level advantage reported for objects (Rosch et al., 1976). Given these behavioral results, here we aimed to determine (1) which brain regions represent observed actions at the three taxonomic levels and (2) which brain regions host a joint representation across these levels.

Materials and Methods

Overall rationale and hypotheses

To reveal which brain areas represent actions at the three taxonomic levels, we separated 12 daily actions into three action categories at the superordinate level (see also Zhuang and Lingnau, 2022). Each superordinate action category consisted of two types of actions at the basic level, and each basic level action encompassed two actions at the subordinate level (Fig. 1). To verify this hierarchy, we used a multiarrangement experiment (Kriegeskorte and Mur, 2012) combined with inverse multidimensional scaling (MDS) and hierarchical cluster analysis. Next, to determine which brain areas represent observed actions at the three different hierarchical levels, we conducted an fMRI experiment and performed region of interest (ROI)-based and whole-brain searchlight-based representational similarity analyses (RSA; Kriegeskorte et al., 2008). Specifically, we examined the representation of observed actions at the subordinate, basic and superordinate level, and the representation of the behavioral similarity structure resulting from the multiarrangement experiment.

Figure 1.

A, Stimulus set and corresponding hierarchical structure (based on Zhuang and Lingnau, 2022). Left column, Example stimuli (one out of six exemplars per subordinate action). Second to fourth columns, English labels of actions at the subordinate, basic, and superordinate levels. B, Model RDMs for observed actions at the subordinate (left), basic (middle), and superordinate (right) levels. Each model RDM consists of a 72 × 72 matrix (12 actions, with 6 exemplars per action), where each cell in the matrix corresponds to the dissimilarity between a pair of actions. Yellow: high dissimilarity; blue: low dissimilarity.

We expected the subordinate level model to be represented by patterns of activation in early visual areas, the LOTC, and possibly the IPL and the PMv (see also Wurm and Lingnau, 2015). The basic level model was expected to be represented in the LOTC and the IPL, but not in the PMv, whereas the superordinate level model was expected to be represented in anterior portions of the LOTC (Wurm and Lingnau, 2015). The behavioral model was expected to be captured by neural patterns of activation in the LOTC and possibly the IPL (Tucciarelli et al., 2019; Tarhan et al., 2021).

Stimulus selection and validation

Stimuli consisted of static images of 12 different actions (600 × 480 pixels, 14.36 × 11.07° of visual angle; six exemplars each; see Fig. 1 for an overview of stimulus exemplars and corresponding action words). The 12 actions were chosen on the basis of a series of rating and behavioral studies (Zhuang and Lingnau, 2022) that we briefly summarize here. First, we selected action verbs corresponding to the basic level from Levin (1993). Using these action verbs, we conducted a semantic similarity rating, followed by hierarchical cluster analysis. Based on the resulting clusters, we selected a subset of basic level actions, excluding actions that might be hard to portray as a picture (e.g., to learn, to memorize). To select labels for the superordinate level, a new set of participants was provided with the basic level labels of actions belonging to a given cluster revealed by the hierarchical cluster analysis. To select actions belonging to the subordinate level, participants were provided with different action verbs corresponding to the basic level and were asked to generate action verbs corresponding to the subordinate level. Next, another group of participants was asked to rate (1) the relationship between actions at the subordinate and the superordinate level (e.g., between “swim front crawl” and “locomotion,” or between “swim front crawl” and “ingestion”) and (2) the degree of abstraction and complexity of each action at the subordinate level. Actions were only included in the final set if they were consistently rated to belong to a given superordinate category, and not to other superordinate categories.

We selected the six different exemplars for each of the 12 actions based on the following criteria: young adult agents of both genders, with an equal representation of three males and three females per action. In addition, we selected three distinct orientations for each agent, including two profile views (facing left and right, respectively) and one frontal view. Note that for the action "doing the dishes," we replaced frontal views with additional profile view exemplars because of the lack of suitable images depicting this action in a frontal view.

Since the rating studies were based on written words, we first wanted to verify how human participants categorize these actions when presented as static images. To this aim, we conducted a multiarrangement experiment (Kriegeskorte et al., 2008) as implemented in the online platform MEADOWS (https://meadows-research.com) in a group of N = 18 participants (12 female; mean age: 27 years; range: 22–31 years) who did not take part in the fMRI experiment. Participants were instructed to judge the degree of similarity between the 12 actions depicted in these static images, to arrange them accordingly (i.e., the more similar in meaning, the closer they should be positioned on the screen), and to press a button when they were satisfied with the arrangement of the stimuli. In the first trial, all stimuli appeared on the screen (stimulus size: 48 × 38 pixel). In all subsequent trials, an adaptive algorithm chose a subset of all stimuli to provide the optimal evidence for pairwise dissimilarity estimates (for details, see Kriegeskorte and Mur, 2012). The experiment continued until the adaptive algorithm reached the required evidence level for pairwise dissimilarities. The full stimulus set contained 72 static images (12 actions × 6 exemplars). Each participant was presented with the 12 different actions; action exemplars were counterbalanced across participants.

Results of the multiarrangement experiment were collapsed across participants and averaged across image exemplars. We visualized the results by creating a 12 × 12 representational dissimilarity matrix (RDM) where each cell contains a value corresponding to the Euclidean distance between two actions (Fig. 2A). Figure 2B shows a two-dimensional (2D) arrangement derived from multidimensional scaling (metric stress), averaged across participants. Inverse MDS revealed three larger clusters corresponding to the three superordinate action categories. Subsequently, to reveal the corresponding hierarchical structure, we conducted average-linkage hierarchical cluster analysis (using the MATLAB function linkage) on the results obtained from the multiarrangement experiment. The results are shown in Figure 2C. This analysis confirmed the hierarchical structure of the selected actions, with three action categories corresponding to the superordinate level (locomotion, ingestion and cleaning), six action categories at the basic level (to swim, to ride, to eat, to drink, to clean the body and to do housework), and 12 actions at the subordinate level (to ride a motorbike, to ride a bike, to swim front crawl, to swim backstroke, to drink water, to drink beer, to eat cake, to eat an apple, to clean windows, to do the dishes, to brush teeth, and to clean the face).
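
For illustration, a minimal MATLAB sketch of this analysis step is given below. It is not the authors' code (available at https://osf.io/b6ea4/): the variable behav_rdm is a hypothetical name for the group-averaged 12 × 12 dissimilarity matrix, and the MDS criterion and plotting details are assumptions, whereas the use of average-linkage clustering via the MATLAB function linkage follows the text.

```matlab
% Sketch: 2D embedding and average-linkage clustering of the behavioral RDM.
% behav_rdm (hypothetical name) is assumed to be a symmetric 12 x 12
% dissimilarity matrix, averaged across participants and image exemplars.

action_labels = {'ride motorbike','ride bike','swim front crawl','swim backstroke', ...
                 'drink water','drink beer','eat cake','eat apple', ...
                 'clean windows','do the dishes','brush teeth','clean face'};

d = squareform(behav_rdm, 'tovector');      % condensed pairwise distances

% two-dimensional arrangement via metric multidimensional scaling
coords2d = mdscale(behav_rdm, 2, 'Criterion', 'metricstress');

% average-linkage hierarchical clustering (MATLAB function linkage, as in the text)
Z = linkage(d, 'average');

figure;
subplot(1,2,1);
scatter(coords2d(:,1), coords2d(:,2), 40, 'filled');
text(coords2d(:,1), coords2d(:,2), action_labels);
title('2D MDS of the behavioral action space');

subplot(1,2,2);
dendrogram(Z, 'Labels', action_labels, 'Orientation', 'left');
title('Average-linkage dendrogram');
```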

Figure 2.

Stimulus selection and validation. A, Behavioral action space model, averaged across N = 18 participants and across six exemplars per action. B, Two-dimensional representation of the results shown in panel A, resulting from multidimensional scaling analysis (Borg and Groenen, 2005). For color code see legend in panel C. The numbers refer to the 12 actions at the subordinate level (1: "to ride a motorbike," 2: "to ride a bike," 3: "to swim front crawl," 4: "to swim backstroke," 5: "to drink beer," 6: "to drink water," 7: "to eat an apple," 8: "to eat cake," 9: "to clean windows," 10: "to brush teeth," 11: "to do the dishes," and 12: "to clean the face"). C, Dendrogram resulting from hierarchical cluster analysis, confirming the clusters corresponding to the three superordinate level categories (locomotion, ingestion, cleaning) and the six basic level categories (see also Fig. 1).

Participants

A group of N = 23 participants (17 female; mean age: 26 years; range: 21–39 years) took part in the fMRI experiment. All participants except two authors of the paper (T.Z. and Z.K.) were naive to the purpose of the study. All participants gave written informed consent before joining the experiment and received either monetary compensation or course credits at the University of Regensburg. All participants had normal or corrected-to-normal vision and reported to have no psychiatric or neurologic disorders.

fMRI experimental design and task

We used a rapid event-related fMRI design (Fig. 3), programmed in ASF (Schwarzbach, 2011), adopting the design used by Tucciarelli et al. (2019). During each trial, participants were provided with a static image of an action and a central fixation cross superimposed on the image (1 s), followed by a blank screen and a central fixation cross (3 s). Participants were instructed to observe the action while keeping their eyes at fixation. During occasional catch trials (11% of all trials), participants were presented with a phrase depicting an action (e.g., "to swim?," 1 s) followed by 3-s fixation. During catch trials, participants had to perform a category verification task. Specifically, they were instructed to indicate by button press with the index or middle finger of the right hand whether or not the action shown in the previous trial corresponded to the phrase shown during the catch trial. To make sure that participants were not biased toward answering questions at one of the three different levels, phrases presented during catch trials were equally likely to address the superordinate (e.g., "locomotion?"), basic (e.g., "to swim?") or subordinate (e.g., "to swim backstroke?") level. That is, there was the same number of catch trials for each of the three taxonomic levels (four catch trials per taxonomic level in each run). Additionally, to improve design efficiency, we included null events (22.2% of all trials) that consisted of 4-s fixation and were presented pseudo-randomly (no consecutive null events or catch trials).

Figure 3.

Experimental procedure used in the fMRI experiment. We used a rapid event-related design, where each trial consisted of a static image (1 s) depicting one of the 12 actions (see Table 1, second column), followed by a fixation cross (3 s). During occasional catch trials (11% of all trials), participants had to perform a category verification task, targeting the action presented in the previous trial. Questions to be answered during catch trials targeted the superordinate (e.g. “locomotion?”), basic (e.g. “to swim?”), or subordinate (e.g. “to swim backstroke?”) level with an equal probability. Moreover, null events (22.2% of all trials) were included to enhance design efficiency (see text for details).

Table 1.

Overview of regions of interest

Participants performed six runs, each consisting of 72 experimental trials (66.7%), 12 catch trials (11.1%), and 24 null events (22.2%). Additionally, each run included a 10-s fixation period at the beginning and the end. Each run lasted 7.5 min. Halfway through the experiment, an anatomic scan was performed with a duration of ∼5 min. The whole experiment lasted ∼50 min. To ensure that participants fully understood and followed the instructions, participants performed a short practice run (consisting of 12 trials) before entering the scanner.

Data acquisition

The experiment was conducted in the MRI laboratory at the University of Regensburg. Data were collected using a 3T full-body Siemens-Prisma scanner with a 64-channel head coil. A T2*-weighted gradient multiband (MB) echo-planar imaging (EPI) sequence was used for acquiring functional images with 64 slices per volume, using the following parameters: repetition time (TR): 2 s, echo time (TE): 30 ms, flip angle: 75°, excitation pulse duration = 9 ms; echo spacing = 0.58 ms; bandwidth = 2368 Hz/pixel; field of view (FoV): 192 × 192 mm2, partial Fourier = 6/8; voxel resolution: 2.5 mm3, MB-acceleration: 4. Each functional run consisted of 226 volumes and lasted 7 min and 32 s. Between the third and fourth EPI sequence, we acquired a 5 min T1-weighted Magnetization Prepared Rapid Gradient Echo (MPRAGE) structural sequence (TR: 1910 ms, TE: 3.67 ms, FOV: 256 × 256 mm, voxel size: 1 mm3, flip angle: 9°).

fMRI data preprocessing

We used the FMRIB Software Library (FSL 6.0, https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/) to preprocess the data. The first four volumes of each functional run were discarded to ensure that steady-state magnetization had been reached. Functional images were slice-time corrected, high-pass filtered (with a cutoff of 100 s), corrected for head motion with 7 degrees of freedom (DOF) and the middle volume as reference, and then co-registered to the individual T1 anatomic image. For univariate analyses, functional data were smoothed with a 5-mm full width at half maximum (FWHM) kernel. For multivariate analyses, we used unsmoothed data. Data were aligned to Montreal Neurologic Institute (MNI) space.

Region of interest (ROI) definition

ROI definition was conducted using a combination of functional data and anatomic masks obtained from the Harvard–Oxford cortical structural atlas and the Jülich histological atlas (see Table 1). We focused on key areas of the action observation network described in previous studies (Hamilton and Grafton, 2008; Kilner, 2011; Wurm et al., 2017b). Specifically, we selected the bilateral LOTC, IPL, superior parietal lobule (SPL), dorsal premotor cortex (dPM), and inferior frontal gyrus (IFG). In addition, to be able to compare results with an area concerned with low-level visual analysis, we identified bilateral V1.

ROI definition consisted of several steps. First, we computed the random-effects (RFX) general linear model (GLM) contrast "all actions versus baseline" with spatially smoothed data (5-mm FWHM). The baseline consisted of all events not explicitly modeled in the GLM. The RFX GLM included 72 regressors (12 actions × 6 exemplars), one regressor for the catch trials, and six regressors for head motion. The statistical map resulting from the RFX GLM contrast was corrected for multiple comparisons using threshold-free cluster enhancement (TFCE; Smith and Nichols, 2009; p < 0.05, |z| > 1.96, two-tailed, 5000 permutations) as implemented in CoSMoMVPA (Oosterhof et al., 2016; http://cosmomvpa.org/index.html). Second, we defined anatomic masks from the Harvard–Oxford cortical structural atlas and the Jülich histological atlas (threshold: 20%; see Table 1 for details). The mask for the LOTC was created by merging the anatomic masks for the inferior lateral occipital cortex (LOC) and the occipitotemporal cortex. Third, within each resulting mask, we extracted the peak coordinate resulting from the RFX GLM contrast "all actions versus baseline." ROIs were defined as spheres (radius: 10 mm) centered around these peaks.
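
As an illustration of the final step (a 10 mm sphere centered on the within-mask peak), the following MATLAB sketch operates on hypothetical variables zmap (group statistical map of the contrast "all actions versus baseline") and anat_mask (binary anatomical mask), both assumed to be 3D volumes on the same 2.5 mm grid; this is a sketch under these assumptions rather than the authors' implementation.

```matlab
% Sketch: spherical ROI (radius 10 mm) around the peak of the group contrast
% within an anatomical mask. zmap and anat_mask are hypothetical 3D volumes on
% the same grid; voxel_size_mm reflects the 2.5 mm functional resolution.

voxel_size_mm = 2.5;
radius_mm     = 10;

masked             = zmap;
masked(~anat_mask) = -Inf;                 % restrict the peak search to the mask

[~, peak_idx] = max(masked(:));            % peak of "all actions versus baseline"
[px, py, pz]  = ind2sub(size(zmap), peak_idx);

[x, y, z] = ndgrid(1:size(zmap,1), 1:size(zmap,2), 1:size(zmap,3));
dist_mm   = sqrt((x - px).^2 + (y - py).^2 + (z - pz).^2) * voxel_size_mm;

roi_mask = dist_mm <= radius_mm;           % logical sphere centered on the peak
```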

Representational similarity analysis

We conducted RSA using the CoSMoMVPA Toolbox (Oosterhof et al., 2016) and custom written MATLAB functions (available at https://osf.io/b6ea4/). We used the following procedure both for the ROI-based and the whole-brain searchlight approach unless otherwise noted.

ROI-based RSA (partial correlations)

To examine the representation of observed actions in predefined ROIs, we used RSA with partial correlations (Kriegeskorte et al., 2008). First, we created a model RDM for each of the three taxonomic levels (Fig. 1, bottom panel), corresponding to the hierarchical structure shown in Figure 1, top panel (see also Zhuang and Lingnau, 2022). Since we used 6 exemplars for each of the 12 actions, each model RDM consisted of a 72 × 72 matrix, where each cell in the matrix corresponds to the dissimilarity between a pair of actions. The subordinate level model consists of 12 clusters along the diagonal, with each cluster comprising six exemplars of the same type of action. The basic level model consists of six clusters along the diagonal, while the superordinate model consists of three clusters along the diagonal.
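
As an illustration of the block structure just described, the following MATLAB sketch builds three binary model RDMs; the stimulus ordering (12 actions × 6 exemplars, grouped by basic and superordinate category as in Fig. 1) and the 0/1 coding of dissimilarity are assumptions.

```matlab
% Sketch: binary 72 x 72 model RDMs for the three taxonomic levels, assuming the
% 72 stimuli are ordered as 12 actions x 6 exemplars, grouped by basic and
% superordinate category as in Figure 1. 0 = same cluster, 1 = different cluster.

n_exemplars = 6;

block_model = @(n_clusters, cluster_size) 1 - kron(eye(n_clusters), ones(cluster_size));

rdm_subordinate   = block_model(12, 1 * n_exemplars);   % 12 clusters of  6 images
rdm_basic         = block_model( 6, 2 * n_exemplars);   %  6 clusters of 12 images
rdm_superordinate = block_model( 3, 4 * n_exemplars);   %  3 clusters of 24 images
```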

Second, to account for differences between the different action categories in terms of low-level visual properties, we constructed a control model using the first layer of a Deep Neural Network (DNN) that has been trained to classify image classes in the ImageNet dataset (ResNet50; He et al., 2016). We chose this layer because the first layer of a deep neural network is generally assumed to learn to detect edges, colors, texture orientations and other simple shapes in input images (Zeiler and Fergus, 2014; Kriegeskorte, 2015; Mahendran and Vedaldi, 2015). For each of the 72 images used in the current experiment, we determined the activations of each unit in the first convolutional layer of the ResNet50 and converted these values into activation vectors, separately for each image. Next, we constructed the 72 × 72 low-level visual control model by computing the dissimilarity (1-correlation) between these activation vectors for each pair of action images (for similar approaches, see Kriegeskorte et al., 2008). Additionally, to account for features related to the scene or background in which the action took place, we created a 72 × 72 scene control model. We used the same procedure as the one described for the construction of the low-level visual control model. The only difference was that we obtained activations in response to each image from the second-to-last layer of the ResNet50 trained on a large image dataset to distinguish between scene categories (Zhou et al., 2017, 2018).
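
The following MATLAB sketch illustrates how such a control RDM can be derived, assuming the Deep Learning Toolbox with the pretrained ResNet-50 model; image_files is a hypothetical cell array holding the 72 stimulus file names, and the layer name 'conv1' is our assumption for the first convolutional layer in MATLAB's ResNet-50. The same logic, applied to the second-to-last layer of a scene-trained ResNet50, would yield the scene control model.

```matlab
% Sketch: low-level visual control RDM from the first convolutional layer of an
% ImageNet-pretrained ResNet-50. image_files is a hypothetical cell array of the
% 72 stimulus file names; the layer name 'conv1' is an assumption.

net      = resnet50;                           % requires the ResNet-50 support package
in_size  = net.Layers(1).InputSize;            % e.g., [224 224 3]
n_images = numel(image_files);
acts     = [];

for i = 1:n_images
    img        = imresize(imread(image_files{i}), in_size(1:2));
    a          = activations(net, img, 'conv1');   % unit activations for this image
    acts(i, :) = a(:)';                            % one activation vector per image
end

% dissimilarity = 1 - Pearson correlation between activation vectors
control_rdm_lowlevel = 1 - corr(acts');            % 72 x 72
```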

Third, we constructed neural RDMs within each ROI following previous studies (Bonner and Epstein, 2018; Tucciarelli et al., 2019). To do so, separately for each run and each of the 72 action images (12 actions × 6 exemplars), we extracted the β estimates for each voxel in a given ROI and converted these β estimates into t values, resulting in a vector of t values for each action image, with the length corresponding to the number of voxels. Next, we averaged the t values across runs and normalized the t values by subtracting the mean t values of each voxel from the t values of each action image (Diedrichsen and Kriegeskorte, 2017), separately for each participant. Finally, for each of the 72 × 72 pairwise comparisons of action images, we computed the squared Euclidean distance between the corresponding vectors of t values, resulting in a 72 × 72 neural RDM.
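
A compact MATLAB sketch of this construction, assuming the run-wise t values for the voxels of a given ROI are available in a hypothetical array t_vals of size [72 conditions × n voxels × n runs]:

```matlab
% Sketch: neural RDM for one ROI. t_vals is a hypothetical array of size
% [72 conditions x n_voxels x n_runs] holding the t values derived from the
% run-wise beta estimates of the voxels in this ROI.

t_mean = mean(t_vals, 3);                  % average across runs
t_norm = t_mean - mean(t_mean, 1);         % subtract each voxel's mean across conditions

% squared Euclidean distance between all pairs of condition patterns
neural_rdm = squareform(pdist(t_norm, 'squaredeuclidean'));   % 72 x 72
```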

Finally, to determine which of the ROIs captured the similarity between actions at the three taxonomic levels, we computed partial correlations between neural RDMs and each of the three model RDMs, regressing out the low-level visual control model and the scene control model.
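
In MATLAB, this step can be sketched as follows for one ROI and one model RDM; the vectorization helper and the correlation type (the partialcorr default) are assumptions, and the authors' exact settings are in their code at https://osf.io/b6ea4/.

```matlab
% Sketch: partial correlation between the neural RDM and one taxonomic model,
% controlling for the low-level visual and scene control models. All RDMs are
% 72 x 72; only the lower triangle (excluding the diagonal) enters the analysis.

lt = @(M) M(tril(true(size(M)), -1));       % vectorize the lower triangle

r_partial = partialcorr(lt(neural_rdm), lt(rdm_basic), ...
                        [lt(control_rdm_lowlevel), lt(control_rdm_scene)]);

z_partial = atanh(r_partial);               % Fisher transformation for group statistics
```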

Searchlight-based RSA (multiple regression)

To determine whether additional areas not captured by the ROI analysis contained information regarding observed actions at the three taxonomic levels, we conducted two different multiple-regression representational similarity analyses (RSAs). In both types of searchlight-based RSAs, we used a spherical neighborhood of the 100 voxels nearest to each center voxel. As in the ROI analysis, t values were averaged across runs and normalized by subtracting the mean t value of each voxel across conditions.
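
With CoSMoMVPA, the neighborhood definition can be sketched as follows, assuming ds is a hypothetical CoSMoMVPA fMRI dataset struct holding one mean-centered t pattern per condition; the per-sphere regression itself can then proceed as in the multiple-regression sketch further below.

```matlab
% Sketch: searchlight neighborhood of the 100 voxels nearest to each center voxel
nbrhood = cosmo_spherical_neighborhood(ds, 'count', 100);
```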

To examine risks of multicollinearity, we determined the variance inflation factor (VIF) for the three taxonomic models depicted in Figure 1, bottom panel, the low-level visual control model and the scene control model. We obtained small to moderate VIFs (subordinate model: 1.90; basic model: 2.55; superordinate model: 1.78; low-level visual control model: 1.31, scene control model: 1.77), suggesting a low risk of multicollinearity (Mason et al., 2003).
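
VIFs of this kind can be computed directly from the vectorized model RDMs, as in the following sketch (the vectorization helper and variable names are assumptions):

```matlab
% Sketch: variance inflation factors for the vectorized model RDMs. The VIF of
% predictor j equals the j-th diagonal element of the inverse correlation matrix.

lt = @(M) M(tril(true(size(M)), -1));

X = [lt(rdm_subordinate), lt(rdm_basic), lt(rdm_superordinate), ...
     lt(control_rdm_lowlevel), lt(control_rdm_scene)];

vif = diag(inv(corrcoef(X)));    % values near 1 indicate little collinearity
```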

In the first multiple regression RSA, computed separately for each taxonomic level, we included the model for a single taxonomic level (e.g., the subordinate level), the low-level visual control model and the scene control model. This allowed us to obtain β weights for each taxonomic level while regressing out the low-level visual control model and the scene control model.

In the second multiple regression RSA, we aimed to determine the unique contribution of each taxonomic level. To this aim, we set up a multiple regression that included all three taxonomic level models, the low-level visual control model and the scene control model. This way, we were able to obtain the β weights for each of the three taxonomic levels while regressing out the contribution of the other two taxonomic levels, the low-level visual control model and the scene control model.
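
For a single sphere (or ROI), this regression can be sketched as follows; z-scoring of the vectorized RDMs and the explicit intercept are assumptions, not necessarily the authors' exact choices.

```matlab
% Sketch: multiple-regression RSA for one searchlight sphere or ROI. The
% vectorized neural RDM is regressed on the three taxonomic models plus the two
% control models; the betas of interest are those of the taxonomic models.

lt = @(M) M(tril(true(size(M)), -1));

y = zscore(lt(neural_rdm));
X = zscore([lt(rdm_subordinate), lt(rdm_basic), lt(rdm_superordinate), ...
            lt(control_rdm_lowlevel), lt(control_rdm_scene)]);

b = regress(y, [ones(size(y)) X]);   % b(2:4): unique contributions of the
                                     % subordinate, basic, and superordinate models
```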

Finally, to examine the relationship between neural RDMs and the behavioral dissimilarity matrix obtained from the multiarrangement task (Fig. 2A), we conducted another searchlight-based multiple regression RSA in which we included (1) the behavioral model, the low-level visual control model and the scene control model, and (2) the behavioral model, each of the taxonomic models separately, the low-level visual control model and the scene control model. VIFs were again small to moderate, justifying the use of multiple regression RSA. In particular, the VIFs for the low-level visual control model, the scene control model and the behavioral model were small (low-level visual control model, scene control model and behavioral action space model: VIF = 1.31, 1.85, 1.53, respectively). Additionally, the VIFs for regression models including the behavioral action space model, each of the taxonomic level models, the low-level visual control model and the scene control model were small to moderate (behavioral action space model, subordinate level model, low-level visual control model and scene control model: VIF = 3.84, 3.18, 1.33, 1.88, respectively; behavioral action space model, basic level model, low-level visual control model and scene control model: VIF = 6.10, 5.34, 1.32, 1.88, respectively; behavioral action space model, superordinate level model, low-level visual control model and scene control model: VIF = 2.63, 2.24, 1.31, 1.87, respectively).

Conjunction analysis

To reveal which brain regions host converging representations across the different taxonomic levels, we conducted a conjunction analysis (Nichols et al., 2005). To this aim, we determined the minimum t value for each voxel for the whole-brain searchlight maps corresponding to the unique representation of the different taxonomic levels.
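
A sketch of this minimum-statistic conjunction, assuming tmap_sub and tmap_basic are hypothetical 3D t maps (already TFCE-corrected) on the same grid; the threshold value shown is an assumption.

```matlab
% Sketch: conjunction via the voxel-wise minimum statistic (Nichols et al., 2005)
conj_map  = min(tmap_sub, tmap_basic);   % minimum t value across the two maps
conj_mask = conj_map > 1.65;             % voxels surviving in both maps (threshold assumed)
```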

Statistics

For the ROI-based partial correlation analysis, we first used one-sample t tests to compare the Fisher-transformed partial correlation values against zero. Second, to determine whether the similarity between neural and model RDMs depicted in Figure 4 was modulated by the taxonomic level and ROI, we conducted a two-way repeated measures ANOVA with the factors taxonomic level (superordinate, basic, subordinate) and ROI (see Table 1), followed by pairwise comparisons. Correction for multiple comparisons for all factorial combinations (taxonomic level × ROI) was conducted using the false discovery rate (FDR; Benjamini and Hochberg, 1995). Since the assumption of sphericity was violated for the two-way ANOVA, we report Greenhouse–Geisser-corrected p values (indicated as pGG).
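
The group-level logic for the ROI analysis can be sketched as follows; r_vals is a hypothetical [participants × (taxonomic level × ROI combinations)] matrix of partial correlations, and the Benjamini-Hochberg adjustment is written out explicitly for transparency.

```matlab
% Sketch: group statistics for the ROI-based partial correlations. r_vals is a
% hypothetical [n_participants x n_tests] matrix (one column per taxonomic
% level x ROI combination).

z_vals = atanh(r_vals);                          % Fisher transformation

[~, p] = ttest(z_vals);                          % one-sample t tests against zero

% Benjamini-Hochberg FDR across all taxonomic level x ROI combinations
m                 = numel(p);
[p_sorted, order] = sort(p);
q_sorted          = p_sorted .* m ./ (1:m);
q_sorted          = fliplr(cummin(fliplr(q_sorted)));   % enforce monotonicity
q                 = zeros(size(p));
q(order)          = min(q_sorted, 1);                   % FDR-adjusted q values
```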

For the searchlight-based multiple regression RSA, the β maps determined separately for each participant were entered into a group statistic by means of a one-sample t test (one-tailed) against zero. The resulting t maps were corrected for multiple comparisons using TFCE as implemented in CoSMoMVPA (Oosterhof et al., 2016; number of permutations = 5000, corrected p < 0.05, z > 1.65, one tailed). Thresholded statistical maps were visualized using BrainNet (https://www.nitrc.org/projects/bnv/).

Results

Behavioral results

Participants reached a high accuracy (91.1%, SD: 6%) in the category verification task, indicating that they paid attention to the actions depicted in the images.

ROI-based results

Figure 4 shows the results from ROI-based RSA for the three taxonomic levels, regressing out low-level visual properties by means of a control model obtained from the first convolutional layer of a Deep Neural Network [ResNet50_conv1; for details, see Materials and Methods, ROI-based RSA (partial correlations)] and the scene control model [based on the second-to-last layer of ResNet50 trained to distinguish between scene categories; for details, see Materials and Methods, ROI-based RSA (partial correlations)]. We obtained significant partial correlations between neural RDMs and all three taxonomic level models in most of the examined ROIs, with the exception of the right IFG, where no significant partial correlation was observed between neural RDMs and the subordinate level model. Similarities (partial correlation values) between neural RDMs and model RDMs differed between the three taxonomic levels in bilateral LOTC and SPL. In bilateral LOTC, we obtained the highest partial correlations for actions at the basic level in comparison to the other two taxonomic levels. In bilateral SPL, we observed higher partial correlations between the neural RDM and the basic level model in comparison to the superordinate level model.

Figure 4.

Partial correlations between neural RDMs and the models corresponding to the three taxonomic levels (see also Fig. 1), regressing out a low-level visual control model [based on the first convolutional layer of a deep neural network (ResNet50; He et al., 2016), trained on a large-scale image dataset (ImageNet) to distinguish image classes] and a scene control model [based on the second-to-last layer of ResNet50, trained to distinguish between different scene categories; Zhou et al., 2017, 2018; for details, see Materials and Methods, ROI-based RSA (partial correlations)]. Error bars show the standard error of the mean. Asterisks illustrate statistical significance (*qFDR < 0.05, **qFDR < 0.01, ***qFDR < 0.005) of one-sample t tests against zero with FDR correction for all possible combinations of taxonomic level and ROI. Since a two-factorial (level × ROI) repeated measures ANOVA revealed a significant interaction between taxonomic level and ROI, partial correlation values between the three taxonomic levels within each ROI were compared using pairwise comparisons (corrected for multiple comparisons using FDR).

These observations are supported by the corresponding statistics. In particular, a two-way (taxonomic level × ROI) repeated measures ANOVA revealed a significant main effect of taxonomic level (F(2,44) = 8.88, pGG = 0.002, partial η2 = 0.29) and ROI (F(11,242) = 32.93, pGG < 0.001, partial η2 = 0.60), and a significant interaction between taxonomic level and ROI (F(22,484) = 3.46, pGG < 0.001, partial η2 = 0.14). Pairwise comparisons within each ROI (using FDR to correct for multiple comparisons) revealed that neural RDMs in the bilateral LOTC showed the highest similarity with the basic level model in comparison to the other two models (left: basic vs subordinate level, t(1,22) = 6.11, qFDR < 0.001; basic vs superordinate level, t(1,22) = 3.57, qFDR = 0.003; right: basic vs subordinate level, t(1,22) = 5.04, qFDR < 0.001; basic vs superordinate level, t(1,22) = 4.04, qFDR < 0.001). Additionally, neural RDMs in bilateral SPL showed a higher partial correlation with the basic level model in comparison to the superordinate level (left: basic vs superordinate level, t(1,22) = 3.53, qFDR = 0.006; right: basic vs superordinate level, t(1,22) = 3.86, qFDR = 0.003).

To illustrate the results obtained in bilateral LOTC in comparison to bilateral early visual cortex (V1), Figure 5 visualizes neural RDMs (left panel) and the corresponding 2D arrangements obtained from multidimensional scaling analysis (right panel) in bilateral LOTC (top row) and bilateral V1 (bottom row). Using the same approach described for the searchlight-based RSA, we created a 72 × 72 neural RDM in LOTC and V1, and then collapsed the neural RDM across hemispheres and participants. Next, to account for low-level visual features and scene-related properties, we extracted values from the lower triangular part of the neural RDM and regressed out the corresponding lower triangular part of the low-level visual control model and the scene control model. Finally, the resulting residuals were rescaled to values from 0 to 100 (Nili et al., 2014). In the 2D visualization of the MDS (Fig. 5B,D), the three superordinate action categories are highlighted in different colors (blue, red, and green), the basic action categories within a superordinate category are shown by open and filled symbols, and different symbols (circles, squares) indicate the different actions at the subordinate level within each basic level category. As can be seen from Figure 5, the neural RDM in bilateral LOTC (Fig. 5A) and the corresponding 2D arrangement resulting from multidimensional scaling (Fig. 5B) reveal a broad distinction into actions at the basic level, with open symbols mostly on the right side and filled symbols mostly on the left side, and an additional broad clustering into the three different superordinate categories highlighted in red, green, and blue (Fig. 5B). By contrast, this pattern is less obvious in early visual cortex (Fig. 5C,D), in line with the results of the ROI-based analysis shown in Figure 4.

Figure 5.

Neural RDM (left panel) and the corresponding 2D visualization of the MDS results (right panel) obtained in bilateral LOTC (A and B) and bilateral V1 (C and D) after regressing out the low-level visual control model (based on the first convolutional layer of ResNet50, trained to distinguish image classes) and the scene control model [based on the second-to-last layer of ResNet50, trained to distinguish between scene categories; Zhou et al. (2017, 2018); for details, see Materials and Methods, ROI-based RSA (partial correlations)], using the same ROIs as those shown in Figure 4. Colors indicating actions belonging to the three superordinate level categories (red-locomotion, blue-ingestion, and green-cleaning) are the same as in Figure 2. Open versus filled symbols indicate different actions at the basic level, whereas circles versus squares are used to distinguish actions at the subordinate level.

Figure 6.

Results of the searchlight-based multiple regression RSA for the subordinate (left panel), basic (middle panel), and superordinate (right panel) level model. A, Multiple regression RSA, regressing out the low-level visual control model and the scene control model (see section Searchlight-based RSA for details). B, Multiple regression RSA, regressing out the low-level visual control model, the scene control model and the remaining two taxonomic level models. For group statistics (N = 23), β weights resulting from the multiple regression RSA were entered into one-sample t tests against zero. The statistical t value maps were corrected for multiple comparisons using TFCE (z > 1.65, TFCE-corrected p < 0.05, one-tailed) as implemented in CoSMoMVPA (Oosterhof et al., 2016).

Whole-brain searchlight results

Figure 6A reveals areas that capture the similarity between actions at the three different taxonomic levels when low-level visual features and scene-related properties are regressed out via a low-level visual control model and a scene control model [for details, see Materials and Methods, ROI-based RSA (partial correlations)]. This analysis reveals a wide set of regions for the subordinate and basic level, with peaks in the right fusiform cortex (Table 2). By contrast, the representation of observed actions at the superordinate level is restricted to a more circumscribed region in occipitotemporal cortex extending into the anterior IPL, with a peak in the right occipital pole (Table 2).

Table 2.

Peak locations for the three taxonomic levels in the whole-brain searchlight

Figure 6B shows which areas capture the similarity structure of observed actions that is unique to each of the taxonomic levels (accounting for the other two taxonomic models as well as for the low-level visual features and the scene control model). This analysis reveals that the dissimilarity structure for actions that is unique to the subordinate level is restricted to bilateral occipitotemporal cortex (OTC), with a peak in right fusiform cortex, while information that is unique to the basic level is associated with bilateral occipitotemporal cortex, IPL, and SPL, with a peak in the right occipital fusiform cortex (see Table 2). By contrast, we obtained no region that represented information that is unique to the superordinate level. We will return to this observation in the discussion.

To identify regions in which actions are jointly represented at the subordinate and basic level, we computed a conjunction analysis on the basis of the statistical maps corresponding to the unique representation of the subordinate and basic level shown in Figure 6B (for details, see Materials and Methods). The results are shown in Figure 7. The convergence of observed actions at the subordinate and basic level was located in bilateral OTC.

Figure 7.

Conjunction map for the unique representation of observed actions at the subordinate and basic level (Fig. 6B). See text for details.

Relationship between neural RDMs and behavioral dissimilarity structure

Whereas the previous analyses focused on the representation of the three taxonomic levels, we next aimed to determine which brain areas capture the dissimilarity structure between observed actions resulting from the multiarrangement task, and to which degree these representations are accounted for by the model RDMs corresponding to the three taxonomic levels. To address these questions, we conducted two additional multiple regression searchlight analyses.

First, we ran a multiple-regression RSA between neural RDMs and the behavioral dissimilarity structure shown in Figure 2A, regressing out the low-level visual control model and the scene control model. As shown in Figure 8A, this analysis revealed large clusters in bilateral occipitotemporal cortex, IPL, and SPL that were associated with the behavioral action space model.

Figure 8.

Brain areas capturing the representational space of actions obtained from behavioral ratings (Fig. 2A). A, Brain areas capturing the behavioral dissimilarity structure after regressing out the low-level visual control model and the scene control model. B–D, Brain areas capturing the behavioral dissimilarity structure after regressing out the low-level visual control model, the scene control model and models for the (B) subordinate, (C) basic, and (D) superordinate levels. For the results of a control analysis examining the areas revealed by a whole-brain searchlight RSA for the low-level visual control model, see Extended Data Figure 8-1.

Extended Data Figure 8-1

Results of the whole-brain searchlight RSA for the low-level visual control model. See Materials and Methods for details.

Second, separately for each of the taxonomic levels, we conducted another multiple-regression RSA in which we included the behavioral action space model, the low-level visual control model, the scene control model and the respective taxonomic level model (e.g., the subordinate level). Maps of the β weights corresponding to the behavioral action space model, after regressing out the low-level visual control model, the scene control model and each of the taxonomic level models, are shown in Figure 8B–D. As can be seen, after regressing out the subordinate level model, the low-level visual control model and the scene control model, the behavioral action space model is associated with neural patterns in bilateral occipitotemporal cortex (Fig. 8B). After regressing out the superordinate model (Fig. 8D), the low-level visual control model and the scene control model, large regions of bilateral occipitotemporal cortex, IPL, and SPL remained significantly associated with the behavioral action space model. By contrast, after regressing out the basic level model, the low-level visual control model and the scene control model, only a small cluster in the temporal occipital fusiform cortex (bilaterally) and left superior LOC remained that captured the behavioral action space model (Fig. 8C). Whereas these latter results need to be interpreted with caution, they are in line with the view that the cortical representation of the behavioral action space model best reflects information at the basic level (see also Figs. 4 and 5). We will return to this observation in the discussion.

Control analysis: whole-brain searchlight analysis for the low-level visual control model

To examine which brain regions show a significant correlation between neural activation patterns and the control model capturing low-level visual features, we conducted an additional whole-brain searchlight RSA with the low-level visual control model (ResNet50_conv1). The results are shown in Extended Data Figure 8-1. As expected, this analysis revealed clusters in early visual cortex (V1 and V2) bilaterally and additionally in a small portion of the superior LOC, temporal occipital fusiform cortex, and inferior LOC (right hemisphere). In other words, this analysis suggests that whereas some of the regions capturing low-level visual features overlap with the regions capturing the three taxonomic levels and the behavioral action space model, the majority of voxels revealed by the taxonomic level models and the behavioral action space model do not overlap with the regions capturing low-level visual features.

Discussion

In the current study we aimed to examine the neural representation of observed actions at different taxonomic levels. Using a hierarchical stimulus set and representational similarity analysis, we observed the highest similarity between neural patterns and the basic level model in bilateral LOTC. A searchlight RSA revealed that the similarity between observed actions at the subordinate and the basic level was captured in a widespread set of occipitotemporal, parietal and frontal areas, whereas neural patterns corresponding to the superordinate level model were obtained in a more circumscribed region in bilateral occipitotemporal cortex. Unique information corresponding to the basic level was captured by patterns of activation in lateral and ventral occipitotemporal cortex and bilateral SPL, while unique information corresponding to the subordinate level was restricted to bilateral occipitotemporal cortex. For the superordinate model, we did not obtain any cluster that captured unique information. Additionally, we found that bilateral occipitotemporal cortex jointly hosted representations that are unique to the subordinate and the basic level. Finally, the behavioral action space model was captured by patterns of activation in occipitotemporal cortex and SPL, and these neural patterns showed high similarity with the basic level model. Together, our results are in line with the view that lateral and ventral portions of the occipitotemporal cortex have flexible access to the representational space of observed actions at the subordinate and basic level, with a special role for the basic level (see also Rosch et al., 1976; Zhuang and Lingnau, 2022). In the following, we will discuss these points as well as limitations and directions for future studies in more detail.

The representational space of observed actions in the LOTC

The LOTC has been shown to distinguish between different observed actions at the basic level on the basis of patterns of activation, with a generalization across the subordinate level (Wurm and Lingnau, 2015; Wurm et al., 2016; Hafri et al., 2017). We previously reasoned that if the LOTC is involved in processing observed actions at a conceptual-semantic level, it should not only distinguish between two actions A and B, but also capture the similarity structure of a wider range of actions. Several previous studies have demonstrated that this is indeed the case (Tucciarelli et al., 2019; Tarhan et al., 2021), and Wurm and Caramazza (2019) furthermore showed that representations in the LOTC generalize across visual stimuli and verbal descriptions of the same actions. In line with this view, several authors proposed a posterior-to-anterior concrete-to-abstract gradient in the LOTC (Lingnau and Downing, 2015; Wurm et al., 2017b; Papeo et al., 2019; Wurm and Caramazza, 2022).

The results of the current study extend these findings, demonstrating that the LOTC captures (1) the behavioral similarity structure of a different set of actions and (2) the unique representation at the subordinate and basic level, with a preference for the basic level (see also Iordan et al., 2015). Our results demonstrate that the LOTC plays a crucial role in representing actions at multiple levels in a situation in which participants are not biased toward processing one of these levels. An important next step for future studies will be to examine the impact of the observer's goal, emphasizing one level over another. For instance, our goals (such as acquiring a new skill vs predicting the intention of another agent) may determine our focus on either concrete representations at the subordinate level or more abstract representations at the basic or superordinate level. Moreover, these results are consistent with the view that the lateral and ventral portion of the OTC hosts and integrates different action components (such as visual motion, body parts, the manipulation of tools; Lingnau and Downing, 2015; Tucciarelli et al., 2019) at varying levels of abstraction (see also Wurm and Lingnau, 2015; Wurm and Caramazza, 2022).

The importance of the basic level for the representation of observed actions

Previous behavioral studies suggested that the basic level holds the maximized information for objects (Rosch et al., 1976; Murphy, 2004) and actions (Zhuang and Lingnau, 2022). The results of the current study provide insights into the neural basis underlying these behavioral findings. First, we found that patterns of activation in bilateral LOTC and SPL showed a higher similarity with the basic level model in comparison to the subordinate and the superordinate level (Fig. 4). Second, information that is unique to the basic level was captured in more widespread regions, including OTC, IPL, and SPL (Fig. 6B), in comparison to information that is unique to the subordinate level (Fig. 6A), while we did not obtain any area that captured information that is unique to the superordinate level. Third, the behavioral action space model showed the highest similarity with neural patterns capturing the basic level model (Fig. 8C). Together, the results of the current study on observed actions are consistent with results by Iordan et al. (2015) demonstrating that objects at the basic level show the strongest similarity with patterns of activation in the LOC.

The representation of actions at the superordinate level

In contrast to the subordinate and basic level, we obtained no region that represented information that is unique to the superordinate level. Given that this result is based on the absence of evidence, this finding needs to be interpreted with care. There are several reasons that might account for this observation: (1) a lack of statistical power; (2) there is no region that hosts the unique information at the superordinate level; (3) information at the superordinate level is based on the combination of information at the subordinate and basic level; (4) the representation of actions at the superordinate level is more distributed than the representation of actions at the subordinate and basic level, making it unlikely to be revealed by methods that rely on local patterns of activation (searchlight, ROI analysis). In line with the latter interpretation, Abdollahi et al. (2013) found that different classes of observed actions (e.g., manipulation, locomotion and climbing, corresponding to the superordinate level) recruit different parts of parietal cortex. This observation might explain why we failed to obtain evidence for a unique, local representation capturing the similarity between the different actions at the superordinate level. Further studies will be required to distinguish between these alternatives more systematically.

The contribution of high-level visual features and semantic knowledge to the categorization of actions

Since we did not compare the processing of visually presented actions with the processing of the corresponding action verbs or phrases, we cannot clearly separate the contributions of higher-level visual features (or perceptual action precursors; see Wurm and Caramazza, 2022) and semantics. As pointed out above, a study that focused on this direct comparison revealed shared representations between patterns of activation for different actions depicted as videos and as written sentences exclusively in the lateral posterior temporal cortex (Wurm and Caramazza, 2019). That said, to be able to tolerate varying degrees of variability, we assume that the categorization of visually presented actions requires access to semantic knowledge for all three taxonomic levels, while the amount of variability is likely to differ between the levels. Specifically, members of the same subordinate level are assumed to have a higher number of shared features that can be exploited to categorize actions at the subordinate level (e.g., the body posture to distinguish between front crawl and backstroke; see also Zhuang and Lingnau, 2022). At the basic level, which is considered to be the most informative level (see Rosch et al., 1976), categorization has been proposed to rely on more common features that are shared across members of a category (e.g., applying some liquid to the body with hands or a tool to categorize an image to belong to the basic level category "cleaning the body"). Finally, to distinguish actions at the superordinate level, an even higher degree of generalization across high-level visual features is required (e.g., bringing an object to the mouth for the category ingestion).

Comparison of object and action representations

There exists a long tradition both in human (see Bach et al., 2014; Wurm et al., 2017b; Livi et al., 2019; Wurm and Caramazza, 2022) and monkey studies (Bonini et al., 2014) demonstrating that objects provide strong cues regarding the actions they afford, and that objects play an important role during action observation. Consequently, the representations of objects and actions have been proposed to follow similar, though not identical, principles of organization (see Pillon and d'Honincthun, 2011; Wurm et al., 2017b). As an example, Wurm et al. (2017b) demonstrated that features corresponding to actions directed toward persons and objects are preferentially represented in dorsal and ventral portions of the LOTC, parallel to the organization of information related to persons versus inanimate objects, respectively. Given these assumed similar principles of organization, it is notoriously difficult to dissociate the representation of actions and objects. That said, Tucciarelli et al. (2019) showed that patterns of activation in the LOTC capture the behavioral similarity structure of actions, over and above variability because of action components such as objects involved in the action. Likewise, several previous studies demonstrated representations of observed actions in the LOTC that generalized across the object (Wurm and Lingnau, 2015; Wurm et al., 2016). More specifically, Wurm et al. (2016) demonstrated that it is possible to distinguish between observed opening and closing actions, regardless of the object, in the LOTC, IPL, and PMv, whereas decoding between objects across actions was restricted to clusters in the ventral stream (Wurm et al., 2016; see their Figs. 3, 4 and Supplementary Fig. 5). While we cannot clearly dissociate between the representation of action functions and object states/object affordances in the current study, these previous studies indicate that action representations in the LOTC can be dissociated from representations of objects that are being manipulated.

The impact of scene-related information

Like objects, scenes can, because of frequent co-occurrences, provide important cues regarding the kinds of actions that are likely to be encountered (e.g., food-related actions in a kitchen and sport-related actions in a gym; see also Wurm and Schubotz, 2012, 2017; Wurm et al., 2017a). To account for the potential contribution of scene-related information in the current study, we used the activations of the second-to-last layer of a convolutional neural network (ResNet50) trained to distinguish between scene categories. A promising future step will be to systematically investigate how the relationship between scene and action information shapes the representational space of actions.
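As a rough illustration of this control analysis, the sketch below extracts penultimate-layer (pooled, pre-classifier) activations from a torchvision ResNet50 and turns them into a scene-feature RDM. The torchvision weights shown here are ImageNet-trained placeholders; the study used a network trained on scene categories (cf. Zhou et al., 2018), and the image paths and preprocessing are likewise assumptions.

```python
# Sketch: penultimate-layer ResNet50 activations as a scene-feature control RDM.
# Weights and image paths are placeholders; the study used a scene-trained network.
import numpy as np
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

net = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()
# Drop the final classification layer to obtain the pooled 2048-d feature vector
backbone = torch.nn.Sequential(*list(net.children())[:-1])

image_paths = ["stimulus_01.jpg", "stimulus_02.jpg"]  # hypothetical stimulus files

features = []
with torch.no_grad():
    for path in image_paths:
        img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        features.append(backbone(img).flatten().numpy())
features = np.stack(features)

# Scene-feature RDM: 1 - Pearson correlation between feature vectors,
# which can then enter the multiple regression RSA as a nuisance predictor.
scene_rdm = 1 - np.corrcoef(features)
print(scene_rdm.shape)
```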

Limitations and future directions

Previous studies revealed that posterior portions of the LOTC are recruited both by static images (Hafri et al., 2017; Tucciarelli et al., 2019) and by dynamic videos depicting actions (Wurm and Lingnau, 2015; Hafri et al., 2017; Wurm et al., 2017b). In addition, Hafri et al. (2017) successfully decoded observed actions across static and dynamic input in a number of regions, including the LOTC. That said, we cannot rule out that additional aspects relevant to the processing of the depicted actions, in particular the corresponding kinematics, are not well captured by the static images used in the current study.

Moreover, while we accounted for the impact of low-level visual and scene-related features, an important limitation of the current study is that, with the current stimulus material, we cannot discount the impact of variance shared with features of the objects involved in the actions. In addition, it will be important to demonstrate to which degree the behavioral action space examined in the current study generalizes to a wider range of actions. Further studies are required to address these points.

In conclusion, our results provide a neural perspective on categorical distinctions of observed actions across taxonomic levels, yielding insights into the mechanisms that allow observers to flexibly categorize actions in accordance with their goals. Lateral and ventral portions of the LOTC appear to capture the unique similarity structure of observed actions at the subordinate and basic level, with a preference for the basic level. Together, these findings shed new light on the hierarchical organization of observed actions and the neural basis of the basic-level advantage.

Footnotes

  • This research was supported by a grant from the German Research Foundation (DFG, LI 2840/4-1). A.L. was supported by a DFG Heisenberg-Professorship (LI 2840/2-1). T.Z. was funded by a PhD stipend from the Chinese Scholarship Council.

  • The authors declare no competing financial interests.

  • Correspondence should be addressed to Angelika Lingnau at angelika.lingnau@ur.de

SfN exclusive license.

References

  1. Abdollahi RO, Jastorff J, Orban GA (2013) Common and segregated processing of observed actions in human SPL. Cereb Cortex 23:2734–2753. https://doi.org/10.1093/cercor/bhs264
  2. Aflalo T, Zhang CY, Rosario ER, Pouratian N, Orban GA, Andersen RA (2020) A shared neural substrate for action verbs and observed actions in human posterior parietal cortex. Sci Adv 6:eabb3984. https://doi.org/10.1126/sciadv.abb3984
  3. Ariani G, Wurm MF, Lingnau A (2015) Decoding internally and externally driven movement plans. J Neurosci 35:14160–14171. https://doi.org/10.1523/JNEUROSCI.0596-15.2015
  4. Bach P, Nicholson T, Hudson M (2014) The affordance-matching hypothesis: how objects guide action understanding and prediction. Front Hum Neurosci 8:254. https://doi.org/10.3389/fnhum.2014.00254
  5. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300.
  6. Bonini L, Maranesi M, Livi A, Fogassi L, Rizzolatti G (2014) Space-dependent representation of objects and other's action in monkey ventral premotor grasping neurons. J Neurosci 34:4108–4119. https://doi.org/10.1523/JNEUROSCI.4187-13.2014
  7. Bonner MF, Epstein RA (2018) Computational mechanisms underlying cortical responses to the affordance properties of visual scenes. PLoS Comput Biol 14:e1006111. https://doi.org/10.1371/journal.pcbi.1006111
  8. Borg I, Groenen PJF (2005) Modern multidimensional scaling: theory and applications. New York: Springer.
  9. Carlson T, Tovar DA, Alink A, Kriegeskorte N (2013) Representational dynamics of object vision: the first 1000 ms. J Vis 13:1. https://doi.org/10.1167/13.10.1
  10. Dehaqani MRA, Vahabie AH, Kiani R, Ahmadabadi MN, Araabi BN, Esteky H (2016) Temporal dynamics of visual category representation in the macaque inferior temporal cortex. J Neurophysiol 116:587–601. https://doi.org/10.1152/jn.00018.2016
  11. Diedrichsen J, Kriegeskorte N (2017) Representational models: a common framework for understanding encoding. PLoS Comput Biol 13:e1005508.
  12. Gallivan JP, Culham JC (2015) Neural coding within human brain areas involved in actions. Curr Opin Neurobiol 33:141–149. https://doi.org/10.1016/j.conb.2015.03.012
  13. Gallivan JP, McLean DA, Flanagan JR, Culham JC (2013) Where one hand meets the other: limb-specific and action-dependent movement plans decoded from preparatory signals in single human frontoparietal brain areas. J Neurosci 33:1991–2008. https://doi.org/10.1523/JNEUROSCI.0541-12.2013
  14. Gauthier I, Anderson AW, Tarr MJ, Skudlarski P, Gore JC (1997) Levels of categorization in visual recognition studied using functional magnetic resonance imaging. Curr Biol 7:645–651. https://doi.org/10.1016/s0960-9822(06)00291-0
  15. Grill-Spector K, Weiner KS (2014) The functional architecture of the ventral temporal cortex and its role in categorization. Nat Rev Neurosci 15:536–548. https://doi.org/10.1038/nrn3747
  16. Hafri A, Trueswell JC, Epstein RA (2017) Neural representations of observed actions generalize across static and dynamic visual input. J Neurosci 37:3056–3071. https://doi.org/10.1523/JNEUROSCI.2496-16.2017
  17. Hamilton AF, Grafton ST (2006) Goal representation in human anterior intraparietal sulcus. J Neurosci 26:1133–1137. https://doi.org/10.1523/JNEUROSCI.4551-05.2006
  18. Hamilton AF, Grafton ST (2008) Action outcomes are represented in human inferior frontoparietal cortex. Cereb Cortex 18:1160–1168. https://doi.org/10.1093/cercor/bhm150
  19. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proc IEEE Conf Comput Vis Pattern Recognit 770–778.
  20. Iordan MC, Greene MR, Beck DM, Li FF (2015) Basic level category structure emerges gradually across human ventral visual cortex. J Cogn Neurosci 27:1426–1446.
  21. Kadmon Harpaz N, Flash T, Dinstein I (2014) Scale-invariant movement encoding in the human motor system. Neuron 81:452–462. https://doi.org/10.1016/j.neuron.2013.10.058
  22. Kilner JM (2011) More than one pathway to action understanding. Trends Cogn Sci 15:352–357. https://doi.org/10.1016/j.tics.2011.06.005
  23. Krasovsky A, Gilron R, Yeshurun Y, Mukamel R (2014) Differentiating intended sensory outcome from underlying motor actions in the human brain. J Neurosci 34:15446–15454. https://doi.org/10.1523/JNEUROSCI.5435-13.2014
  24. Kriegeskorte N (2015) Deep neural networks: a new framework for modeling biological vision and brain information processing. Annu Rev Vis Sci 1:417–446. https://doi.org/10.1146/annurev-vision-082114-035447
  25. Kriegeskorte N, Mur M (2012) Inverse MDS: inferring dissimilarity structure from multiple item arrangements. Front Psychol 3:1–13.
  26. Kriegeskorte N, Mur M, Ruff DA, Kiani R, Bodurka J, Esteky H, Tanaka K, Bandettini PA (2008) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60:1126–1141. https://doi.org/10.1016/j.neuron.2008.10.043
  27. Lanzilotto M, Maranesi M, Livi A, Giulia Ferroni C, Orban GA, Bonini L (2020) Stable readout of observed actions from format-dependent activity of monkey's anterior intraparietal neurons. Proc Natl Acad Sci U S A 117:16596–16605. https://doi.org/10.1073/pnas.2007018117
  28. Levin B (1993) English verb classes and alternations: a preliminary investigation. Chicago: University of Chicago Press.
  29. Lingnau A, Downing PE (2015) The lateral occipitotemporal cortex in action. Trends Cogn Sci 19:268–277. https://doi.org/10.1016/j.tics.2015.03.006
  30. Livi A, Lanzilotto M, Maranesi M, Fogassi L, Rizzolatti G, Bonini L (2019) Agent-based representations of objects and actions in the monkey pre-supplementary motor area. Proc Natl Acad Sci U S A 116:2691–2700. https://doi.org/10.1073/pnas.1810890116
  31. Macé MJM, Joubert OR, Nespoulous JL, Fabre-Thorpe M (2009) The time-course of visual categorizations: you spot the animal faster than the bird. PLoS One 4:e5927. https://doi.org/10.1371/journal.pone.0005927
  32. Mack ML, Gauthier I, Sadr J, Palmeri TJ (2008) Object detection and basic-level categorization: sometimes you know it is there before you know what it is. Psychon Bull Rev 15:28–35. https://doi.org/10.3758/pbr.15.1.28
  33. Mahendran A, Vedaldi A (2015) Understanding deep image representations by inverting them. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2015:5188–5196.
  34. Majdandzic J, Bekkering H, van Schie HT, Toni I (2009) Movement-specific repetition suppression in ventral and dorsal premotor cortex during action observation. Cereb Cortex 19:2736–2745. https://doi.org/10.1093/cercor/bhp049
  35. Mason RL, Gunst RF, Hess JL (2003) Statistical design and analysis of experiments: with applications to engineering and science. New York: Wiley.
  36. Murphy G (2004) The big book of concepts. Cambridge: MIT Press.
  37. Nichols T, Brett M, Andersson J, Wager T, Poline JB (2005) Valid conjunction inference with the minimum statistic. Neuroimage 25:653–660. https://doi.org/10.1016/j.neuroimage.2004.12.005
  38. Nili H, Wingfield C, Walther A, Su L, Marslen-Wilson W, Kriegeskorte N (2014) A toolbox for representational similarity analysis. PLoS Comput Biol 10:e1003553. https://doi.org/10.1371/journal.pcbi.1003553
  39. Oosterhof NN, Wiggett AJ, Diedrichsen J, Tipper SP, Downing PE (2010) Surface-based information mapping reveals crossmodal vision–action representations in human parietal and occipitotemporal cortex. J Neurophysiol 104:1077–1089. https://doi.org/10.1152/jn.00326.2010
  40. Oosterhof NN, Tipper SP, Downing PE (2012) Viewpoint (in)dependence of action representations: an MVPA study. J Cogn Neurosci 24:975–989. https://doi.org/10.1162/jocn_a_00195
  41. Oosterhof NN, Connolly AC, Haxby JV (2016) CoSMoMVPA: multi-modal multivariate pattern analysis of neuroimaging data in Matlab/GNU Octave. Front Neuroinform 10:27. https://doi.org/10.3389/fninf.2016.00027
  42. Papeo L, Agostini B, Lingnau A (2019) The large-scale organization of gestures and words in the middle temporal gyrus. J Neurosci 39:5966–5974.
  43. Pillon A, d'Honincthun P (2011) A common processing system for the concepts of artifacts and actions? Evidence from a case of a disproportionate conceptual impairment for living things. Cogn Neuropsychol 28:1–43. https://doi.org/10.1080/02643294.2011.615828
  44. Rosch E, Mervis CB, Gray WD, Johnson DM, Boyes-Braem P (1976) Basic objects in natural categories. Cogn Psychol 8:382–439. https://doi.org/10.1016/0010-0285(76)90013-X
  45. Schwarzbach J (2011) A simple framework (ASF) for behavioral and neuroimaging experiments based on the Psychophysics Toolbox for MATLAB. Behav Res Methods 43:1194–1201. https://doi.org/10.3758/s13428-011-0106-8
  46. Smith SM, Nichols TE (2009) Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. Neuroimage 44:83–98. https://doi.org/10.1016/j.neuroimage.2008.03.061
  47. Spunt RP, Kemmerer D, Adolphs R (2016) The neural basis of conceptualizing the same action at different levels of abstraction. Soc Cogn Affect Neurosci 11:1141–1151. https://doi.org/10.1093/scan/nsv084
  48. Tarhan L, De Freitas J, Konkle T (2021) Behavioral and neural representations en route to intuitive action understanding. Neuropsychologia 163:108048. https://doi.org/10.1016/j.neuropsychologia.2021.108048
  49. Tucciarelli R, Wurm MF, Baccolo E, Lingnau A (2019) The representational space of observed actions. Elife 8:e47686. https://doi.org/10.7554/eLife.47686
  50. Turella L, Rumiati R, Lingnau A (2020) Hierarchical action encoding within the human brain. Cereb Cortex 30:2924–2938. https://doi.org/10.1093/cercor/bhz284
  51. Vallacher RR, Wegner DM (1985) A theory of action identification. Hillsdale: Lawrence Erlbaum Associates.
  52. Wegner DM, Vallacher RR (1986) Action identification. In: Handbook of motivation and cognition: foundations of social behavior (Sorrentino RM, Higgins ET, eds), pp 550–582. New York: Guilford.
  53. Wurm MF, Schubotz RI (2012) Squeezing lemons in the bathroom: contextual information modulates action recognition. Neuroimage 59:1551–1559. https://doi.org/10.1016/j.neuroimage.2011.08.038
  54. Wurm MF, Lingnau A (2015) Decoding actions at different levels of abstraction. J Neurosci 35:7727–7735. https://doi.org/10.1523/JNEUROSCI.0188-15.2015
  55. Wurm MF, Schubotz RI (2017) What's she doing in the kitchen? Context helps when actions are hard to recognize. Psychon Bull Rev 24:503–509. https://doi.org/10.3758/s13423-016-1108-4
  56. Wurm MF, Caramazza A (2019) Distinct roles of temporal and frontoparietal cortex in representing actions across vision and language. Nat Commun 10:289. https://doi.org/10.1038/s41467-018-08084-y
  57. Wurm MF, Caramazza A (2022) Two 'what' pathways for action and object recognition. Trends Cogn Sci 26:103–116.
  58. Wurm MF, Ariani G, Greenlee MW, Lingnau A (2016) Decoding concrete and abstract action representations during explicit and implicit conceptual processing. Cereb Cortex 26:3390–3401. https://doi.org/10.1093/cercor/bhv169
  59. Wurm MF, Artemenko C, Giuliani D, Schubotz RI (2017a) Action at its place: contextual settings enhance action recognition in 4- to 8-year-old children. Dev Psychol 53:662–670. https://doi.org/10.1037/dev0000273
  60. Wurm MF, Caramazza A, Lingnau A (2017b) Action categories in lateral occipitotemporal cortex are organized along sociality and transitivity. J Neurosci 37:562–575. https://doi.org/10.1523/JNEUROSCI.1717-16.2016
  61. Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: Computer Vision – ECCV: 13th European Conference, Zurich, Switzerland, September 6–12, pp 818–833.
  62. Zhou B, Lapedriza A, Torralba A, Oliva A (2017) Places: an image database for deep scene understanding. J Vis 17:296. https://doi.org/10.1167/17.10.296
  63. Zhou B, Lapedriza A, Khosla A, Oliva A, Torralba A (2018) Places: a 10 million image database for scene recognition. IEEE Trans Pattern Anal Mach Intell 40:1452–1464. https://doi.org/10.1109/TPAMI.2017.2723009
  64. Zhuang T, Lingnau A (2022) The characterization of actions at the superordinate, basic and subordinate level. Psychol Res 86:1871–1891. https://doi.org/10.1007/s00426-021-01624-0
Keywords

  • action categorization
  • action observation
  • action recognition
