A Comparison of Primate Prefrontal and Inferior Temporal Cortices during Visual Categorization

David J. Freedman; Maximilian Riesenhuber; Tomaso Poggio; Earl K. Miller

doi:10.1523/JNEUROSCI.23-12-05235.2003

Abstract

Previous studies have suggested that both the prefrontal cortex (PFC) and inferior temporal cortex (ITC) are involved in high-level visual processing and categorization, but their respective roles are not known. To address this, we trained monkeys to categorize a continuous set of visual stimuli into two categories, “cats” and “dogs.” The stimuli were parametrically generated using a computer graphics morphing system (Shelton, 2000) that allowed precise control over stimulus shape. After training, we recorded neural activity from the PFC and the ITC of monkeys while they performed a category-matching task. We found that the PFC and the ITC play distinct roles in category-based behaviors: the ITC seems more involved in the analysis of currently viewed shapes, whereas the PFC showed stronger category signals, memory effects, and a greater tendency to encode information in terms of its behavioral meaning.

Figure 7.

This figure shows a model of the recognition architecture in cortex (Riesenhuber and Poggio, 1999) that combines and extends several recent models (Fukushima, 1980; Perrett and Oram, 1993; Wallis and Rolls, 1997) and effectively summarizes many experimental findings (for review, see Riesenhuber and Poggio, 2002). It is a hierarchical extension of the classical paradigm (Hubel and Wiesel, 1962) of building complex cells from simple cells. The specific circuitry that we proposed consists of a hierarchy of layers with linear (“S” units, performing template matching, gray lines) and nonlinear units (“C” pooling units, performing a “MAX” operation, black lines, in which the response of the pooling neuron is determined by its maximally activated afferent). These two types of operations provide, respectively, pattern specificity, by combining simple features to build more complex ones, and invariance: to translation, by pooling over afferents tuned to the same feature at different positions, and to scale (data not shown), by pooling over afferents tuned to the same feature at different scales. Shape-tuned model units (STU) exhibit tuning to complex shapes but are tolerant to scaling and translation of their preferred shape, like view-tuned neurons found in ITC (cf. Logothetis et al., 1995). STUs can then serve as input to task modules located farther downstream, e.g., in PFC, that perform different visual tasks such as object categorization. These modules consist of the same generic learning process but are trained to perform different tasks. For more information, see http://www.ai.mit.edu/projects/cbcl/hmax.
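
The interplay of the two operations described above can be illustrated with a minimal one-dimensional sketch. This is a toy illustration under stated assumptions, not the model's actual implementation: a plain dot product stands in for the template matching of an “S” unit, a single “C” unit pools over all positions, and the names s_layer, c_layer, template, signal, and shifted are hypothetical.

```python
def s_layer(signal, template):
    """'S' stage (assumed form): linear template matching.

    Returns the dot product of the template with every window of the
    signal, i.e. one afferent response per position.
    """
    k = len(template)
    return [
        sum(t * x for t, x in zip(template, signal[i:i + k]))
        for i in range(len(signal) - k + 1)
    ]


def c_layer(s_responses):
    """'C' stage: MAX pooling over afferents tuned to the same feature.

    The pooled response equals that of the maximally activated afferent.
    """
    return max(s_responses)


# Hypothetical toy stimulus: the preferred feature at two different positions.
template = [1, 2, 1]
signal = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0]  # same feature, moved two positions right

s_original = s_layer(signal, template)
s_shifted = s_layer(shifted, template)

# The S responses depend on where the feature is located ...
print(s_original)  # [1, 4, 6, 4, 1, 0]
print(s_shifted)   # [0, 0, 1, 4, 6, 4]

# ... whereas the pooled C response is identical for both positions.
print(c_layer(s_original), c_layer(s_shifted))  # 6 6
```

In this sketch the S responses change with stimulus position while the C response does not, which is the sense in which MAX pooling over position-shifted afferents yields tolerance to translation. Pooling over afferents at different scales would follow the same pattern.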
