How does the saccadic movement system select a target when visual, auditory, and planned movement commands differ? How do retinal, head-centered, and motor error coordinates interact during the selection process? Recent data on superior colliculus (SC) reveal a spreading wave of activation across buildup cells the peak activity of which covaries with the current gaze error. In contrast, the locus of peak activity remains constant at burst cells, whereas their activity level decays with residual gaze error. A neural model answers these questions and simulates burst and buildup responses in visual, overlap, memory, and gap tasks. The model also simulates data on multimodal enhancement and suppression of activity in the deeper SC layers and suggests a functional role for NMDA receptors in this region. In particular, the model suggests how auditory and planned saccadic target positions become aligned and compete with visually reactive target positions to select a movement command. For this to occur, a transformation between auditory and planned head-centered representations and a retinotopic target representation is learned. Burst cells in the model generate teaching signals to the spreading wave layer. Spreading waves are produced by corollary discharges that render planned and visually reactive targets dimensionally consistent and enable them to compete for attention to generate a movement command in motor error coordinates. The attentional selection process also helps to stabilize the map-learning process. The model functionally interprets cells in the superior colliculus, frontal eye field, parietal cortex, mesencephalic reticular formation, paramedian pontine reticular formation, and substantia nigra pars reticulata.
- eye movements
- superior colliculus
- burst neurons
- buildup neurons
- parietal cortex
- frontal eye fields
- reticular formation
- substantia nigra
Saccades are ballistic eye movements that facilitate the survival of an animal in a rapidly changing environment. Saccadic eye movements include at least three types: visually reactive, multimodal (e.g., auditorily cued), and planned. Visually reactive saccades are reflexive movements generated by areas of rapid visual change. Auditory saccades direct the eyes toward acoustic stimuli. Planned saccades move the eye to intended targets; they “direct the eye at objects selected beforehand from the visual environment” (Becker, 1989). Such eye movements can be made to targets that may or may not be visible when an eye movement begins.
The saccadic system can execute a movement to a planned target without being distracted by irrelevant reactive targets. However, when intense visual or auditory stimuli appear, they can take precedence over a planned movement. How and where is the shared control between visually reactive, auditory, and planned saccades achieved? In particular, visual cues are registered in retinotopic coordinates, whereas auditory cues are registered in head-centered coordinates. How are these distinct coordinate systems merged so that a particular location can be selected as a saccadic target? This transformation must be learned to align the corresponding visual and auditory representations.
The present article develops a model of how these adaptive coordinate changes take place in the deeper layers of the superior colliculus (SC) and enable a saccadic movement target to be chosen there. The neural circuitry that supports this learning process simulates neurophysiological data about the burst neurons (Waitzman et al., 1991; Munoz and Wurtz, 1995b), also called T cells (Moschovakis et al., 1988a), and the buildup or tectoreticulospinal neurons (Munoz et al., 1991; Munoz and Wurtz, 1995b), also called X cells (Moschovakis and Karabelas, 1985), that exist in the deeper SC layers. These simulations include the responses of both types of cells in visual, overlap, memory, and gap behavioral tasks (Munoz and Wurtz, 1995a). The model also provides a functional rationale for how multimodal cells in these SC layers process inputs from converging unimodal pathways and how these converging multimodal inputs yield enhancing or suppressive effects, depending on the relative locations of the inputs (Stein and Meredith, 1993). The model predicts that burst neuron outputs act as teaching signals to buildup neurons and that these latter cells are the postsynaptic sites of associative learning along pathways from cortical auditory centers (Stein and Meredith, 1993) and the frontal eye fields (Schlag-Rey et al., 1992). This hypothesis is consistent with recent evidence showing the importance of NMDA receptors for multimodal integration in the deep layers of the cat superior colliculus (Binns and Salt, 1996). The model also suggests why, although the frontal eye fields excite SC cells that code a similar movement and inhibit SC cells that code a different movement (Schlag-Rey et al., 1992), there are adaptive mechanisms for the control of planned eye movements beyond those that engage the SC (Schiller and Sandell, 1983; Segraves, 1992; Deubel, 1995). These mechanisms are modeled in Gancarz and Grossberg (1997).
MATERIALS AND METHODS
To set the stage for these analyses, we summarize in this section brain regions that converge onto the SC, which in turn projects to gaze control centers. Key SC neuron types are then surveyed that influence saccadic control, and the model is introduced. In the Results, we show how the model can control multimodal, planned, and visually reactive saccades via a process of SC map learning. The model functionally interprets known anatomical connections between cells in the superior colliculus, frontal eye field, parietal cortex, mesencephalic reticular formation, paramedian pontine reticular formation, and substantia nigra pars reticulata. The model is then used to simulate physiological data of SC burst and buildup neurons during a variety of behavioral tasks and to shed light on the following types of data.
The role of the superior colliculus in the saccadic eye movement system. Saccades are mediated by pontine and mesencephalic burst circuits that are usually controlled by the superior colliculus (Goldberg et al., 1991). The superior colliculus generates a high-frequency burst of activity preceding saccades. The discharge of neurons is related to changes in the eye position regardless of the initial position of the eye in the orbit (Sparks and Mays, 1990). The superior colliculus contains topographic maps, and the location of neurons in these maps codes motor error (Sparks and Mays, 1980; Sparks and Nelson, 1987).
The deeper layers of the superior colliculus contain heterogeneous cell sizes and an intermingling of cells and axons with a large degree of overlap of dendritic fields (Grantyn, 1988). These cells are direction- and amplitude-specific cells that contribute to populations that are broadly tuned. The sensory properties of these cells include multimodal responses and large receptive fields (Stein and Meredith, 1993). They also exhibit rapidly habituating responses to repetitive stimuli and sensitivity to dynamic stimuli. Because the deeper layers are those that are involved in eye movement control, the remainder of this discussion will concentrate only on these layers of the superior colliculus.
Many studies include data suggesting that the deeper layers of superior colliculus are involved in the control of visually reactive, auditory, and planned saccades (Powell and Hatton, 1969; Sparks and Mays, 1981; Jay and Sparks, 1984, 1987a,b, 1990; Davson, 1990; Zambarbieri et al., 1995). The deeper layers receive afferent signals from both descending and ascending sources. The descending sources originate as ipsilateral projections from cortical visual, auditory, and somatosensory areas (McIlwain, 1977; Wurtz and Albano, 1980; Schlag-Rey et al., 1992; Stein and Meredith, 1993) and signal both sensory and motor information.
Sensory afferents. By contrast with ascending visual input to superficial SC layers, ascending projections to deeper SC layers provide a limited distribution of contralateral visual inputs from retina, the lateral geniculate nucleus, pretectum, and superficial superior colliculus. Much heavier ascending inputs originate from contralateral auditory sources. These sources include projections from the periolivary regions of the superior olive, the nuclei of the trapezoidal body, the ventral nucleus of the lateral lemniscus, and, to a lesser extent, the external nucleus of the inferior colliculus (Edwards et al., 1979). The external nucleus of the inferior colliculus is implicated in attention or arousal responses to auditory stimuli. All of these regions, whether cortical or subcortical, provide unimodal projections to the superior colliculus, and no examples of multisensory projections have been reported.
Motor afferents. Motor afferents arise from numerous, primarily ipsilateral, sources. The posterior parietal cortex projection to the superior colliculus is primarily to the deep intermediate layers. Cells from the lateral intraparietal area of posterior parietal cortex fire before saccades and indicate the intended amplitude and direction of eye movement in motor coordinates (Gnadt and Andersen, 1988). Posterior parietal cortical projections are distributed in a relatively homogeneous manner, with the exception of projections from area 7a, which possibly alternate with projections from the frontal eye fields (Huerta and Harting, 1986).
The cortical inputs to the superior colliculus also include heavy descending projections from the frontal eye field. Anterograde tracing from the frontal eye field demonstrates a predominant projection to intermittent patches in the deep regions of the intermediate gray layer. Physiological responses of neurons in the frontal eye field generally exhibit pre- or postsaccadic activity (Goldberg and Bruce, 1990). Presaccadic activity is generated in response to visually guided or purposive saccades, and 54% of frontal eye field neurons are presaccadic. It is surmised (Goldberg and Bruce, 1990) that the frontal eye field sends both a motor signal to the superior colliculus that specifies the coordinates of a saccade and a fixation signal that is involved in the maintenance and release of fixation. Via its projections to the caudate, the frontal eye field can also influence saccadic eye movements via the substantia nigra.
Another important structure involved in the oculomotor system is therefore the basal ganglia, which include the caudate and substantia nigra. Most substantia nigra cells need the presence of a visual target to pause before eye movements (Hikosaka and Wurtz, 1983b). Some of the cells depend on the state of fixation or presence of a memory trace to affect their activity (Hikosaka and Wurtz, 1983b). Heavy inputs from substantia nigra pars reticulata directly contact the majority of tectoreticulospinal neurons in discrete patches in the intermediate layers of the superior colliculus (Graybiel, 1978).
There is no proprioceptive feedback from extraocular muscle receptors to signal eye position directly to deeper layer neurons (Stein and Meredith, 1993). It is thought that such a signal is provided by corollary discharge from neurons extrinsic to the colliculus (Stein and Meredith, 1993).
Efferents. Four efferent pathways project from the superior colliculus. One pathway is ascending and reaches the thalamus, and one projects to the opposite superior colliculus. The descending efferents are involved in repositioning the eyes, head, limbs, and other peripheral sensory organs via a contralateral and an ipsilateral pathway. Although a majority of neurons without efferent projections are unimodal (Meredith and Stein, 1986), including projections from auditory centers (Jay and Sparks, 1984, 1987b, 1990; Zambarbieri et al., 1995), nearly 75% of neurons with descending efferent projections are multisensory.
Efferent neurons of the superior colliculus convey motor commands using widespread connections with neurons in other motor areas of the brainstem and spinal cord (Moschovakis et al., 1988a). Two important classes of output cells are tectoreticular and tectoreticulospinal neurons. Tectoreticular neurons project to the predorsal bundle (PDB) and the abducens region and have medium-sized somata that occupy the intermediate gray layer, including the uppermost levels. Tectoreticulospinal neurons project to the abducens region and the spinal cord and have large somata that reside only in the deeper levels of the intermediate gray layer and below (Guitton, 1991).
The intermediate and deep layers of the superior colliculus project to the brainstem, providing motor commands to the region that controls eye movements (Sparks and Hartwich-Young, 1989). These brainstem regions, which include the paramedian pontine and mesencephalic reticular formation, in turn contain neurons that produce important components of a saccade. Burst cells produce the pulse component of saccades, and the prepositus nucleus and the vestibular nuclear complex are part of the neural integrator that provides the step component. The nucleus prepositus hypoglossi projects back to deep collicular neurons (Stechison et al., 1985). The burst cells provide direct input to the motor neurons that move the extraocular muscles (Hepp et al., 1989). The gain of the saccadic system is also influenced by the cerebellum (Goldberg et al., 1991).
Intrinsic neurons in the deeper layers of the superior colliculus: bursts and spreading waves. Among the efferent neurons of the superior colliculus that convey motor commands to the brainstem and spinal cord, two distinct neural activity patterns can be found: a burst and a buildup of activity. During saccades, the peak neural activity in a population of saccade-related burst neurons in the superior colliculus remains in a fixed position, and the activity level rapidly increases immediately before eye movement. Then the activity peak decays as a function of the remaining gaze error (Waitzman et al., 1991). In a buildup or tectoreticulospinal cell population, there is a slow buildup of activity well in advance of the saccade. During the saccade, buildup cells exhibit a spread of activation that moves across sites that code the current gaze error (Munoz et al., 1991; Guitton, 1992).
Intracellular staining studies of alert monkeys have been used to identify collicular neurons with presaccadic activity during spontaneous eye movements with a fixed head (Moschovakis et al., 1988b). Saccade-related neurons were identified as cells showing little spontaneous activity but an intense burst of activity before spontaneous saccades (Fig. 1, left). Morphologically, these burst neurons are T cells defined by Moschovakis et al. (1988a).
Neurons in the colliculus that displayed activity that ceased at or near the end of a saccade were studied by Waitzman et al. (1991). These cells were found to produce presaccadic discharges related to eye movement and in some cases to the presence of a visual target. The neuronal discharge of these cells was investigated in relation to changes in saccade amplitude and radial velocity. The location of a cell in the colliculus was correlated to the amplitude of the eye movement. The level of activity of the cell encoded the difference between the desired and current eye displacement throughout the saccade. This difference is also known as the gaze motor error. These burst neurons can be identified with T neurons defined by Moschovakis et al. (1988a).
Munoz et al. (1991) antidromically identified and studied neurons located in the intermediate and deep layers of the superior colliculus in alert cats. These neurons were identified as tectoreticular and tectoreticulospinal neurons. Because of the large amplitude of the extracellularly recorded spike and the short antidromic latencies (implying large diameter axons), these neurons were presumed to represent X cells described by Moschovakis and Karabelas (1985).
Munoz et al. (1991) described the movement-related discharges of two classes of these tectoreticulospinal neurons in the intermediate and deep layers of the superior colliculus. Tectoreticulospinal neurons are organized in a retinotopically coded motor map. Fixation tectoreticulospinal neurons are located within the foveal representation of the motor map. They reduced their rate of discharge during orienting gaze shifts and resumed their sustained discharge when the target was fixated. Orientation tectoreticulospinal neurons are located outside of the region in the superior colliculus representing the fovea. They exhibited prolonged buildups followed by phasic motor bursts immediately before the onset of gaze shifts in head-fixed and head-free cats (Fig. 1, right). Their discharge rate exhibited an increase before gaze shifts corresponding to the amplitude and direction of the preferred movement of the cell. The timing of the burst relative to the onset of the gaze shift was shown to vary depending on the gaze shift amplitude. Each tectoreticulospinal neuron reached its peak discharge rate when the instantaneous gaze motor error matched the optimal vector of the cell.
These observations suggest that the activity of tectoreticulospinal neurons reflects the change in gaze motor error (Guitton and Munoz, 1985). At the start of a gaze shift, a zone of activity was established at the collicular locus encoding the desired gaze displacement. As the gaze shift proceeded, this zone of activation moved continuously across the superior colliculus motor map to form a spreading wave of activation that moves toward the rostral pole in such a way that the location of its forward edge covaries with the remaining gaze motor error. As the gaze shift terminated, the fixation tectoreticulospinal neurons at the rostral pole became active.
Munoz and Wurtz (1993) characterized the discharge pattern of fixation cells. During saccades, the tonically active fixation cells showed a pause in their rate of firing. This pause always began before the onset of a saccade, and the cell resumed firing before the end of contraversive saccades.
A model of multimodal adaptive saccadic control. The neural model presented in this section, which was first reported in Roberts et al. (1994), simulates one of the core processes that is proposed to control how visually reactive, auditory, and planned saccades are calibrated and coordinated. In so doing, the model gives a functional explanation for both the peak decay and wave-like activity patterns exhibited by burst and buildup cells, respectively. It also explains why buildup, but not burst, cells show activation well in advance of planned saccades.
The model proposes the following neural mechanisms. Early in development, visual cues trigger saccades via a visually reactive saccadic system. These reactive movements are not necessarily accurate at first. The model proposes that visual error signals are caused by inaccurate foveations and trigger a learning process through which movement gains change adaptively until accurate foveations of visually reactive movements are achieved (Grossberg and Kuperstein, 1986, Chapter 3). These visual error signals are registered in retinotopic coordinates and are converted into motor error signals before this learning process occurs in the cerebellum (see below). The accuracy of auditory and planned saccades is assumed to build on the gains learned by the visually reactive system. For this to occur, a transformation between a head-centered and a motor error target representation needs to be learned. Recent data (Gilmore and Johnson, 1997) suggest that this process is complete by 6 months in human infants. Targets in retinotopic and head-centered coordinates are in this manner rendered dimensionally consistent so that they can compete for attention to generate a movement command in motor error coordinates. As shown below, both stationary, decaying (burst neurons) and spreading-wave (buildup neurons) activity profiles are produced in a natural way as emergent properties of the circuits that enable this transformation.
In particular, when auditory or planned movement vectors represent the same position as a visual target, then the former vectors learn how to map onto the SC motor error locations that represent visually reactive movement vectors. These various movement vectors can then compete when not in agreement to select winning cells the activity of which generates a focus of attention. Competition also helps to stabilize the map-learning process by suppressing all but the winning vectors, so that the learned map is not eroded by the massive flux of multiple possible target locations and the corresponding teaching signals.
Retinotopic visually reactive saccade system. Initially, saccades are executed reactively to targets defined by changing visual signals that are registered on the retina. These retinotopic signals map topographically in a natural way into a motor error map (Grossberg and Kuperstein, 1986, Chapter 3). These motor error signals activate map locations in the peak decay (PD) layer (Fig. 2 a) of burst cells that, in turn, topographically excite the spreading wave (SW) layer of buildup cells. The term “spreading wave” will be used below as a mnemonic to designate the spreading activity that occurs at buildup cells. The reactive target coordinates at PD and SW cells are thus consistent with the motor error coordinates that are coded in collicular maps (Davson, 1990).
The locus of activation in such a motor error map codes the direction and amplitude (or length) of a saccadic movement. Such an encoding is not the same thing, however, as generating an accurately calibrated saccade (e.g., see Stanford and Sparks, 1994; White et al., 1994; Stanford et al., 1996). For this to happen, several other processes need to occur. For one, the motor error signal is converted from the spatial coordinate system of the collicular map to a temporal code that specifies the firing rate of cells in the saccade generator (Robinson, 1973; Grossberg and Kuperstein, 1986, Chapter 7). Although both PD and SW cells play a role in generating saccadic outputs (see Table 1), only the SW output will be explicitly modeled here, for simplicity.
This spatial to temporal conversion is calibrated via a side path containing an adaptive gain stage. Early in development, if retinal error exists after a visually reactive saccade, the error is used to modify an adaptively weighted connection to the adaptive gain stage, the anatomy and neurophysiology of which model aspects of cerebellar learning (Ito, 1984; Grossberg and Kuperstein, 1986, Chapter 3; Fiala et al., 1996). The adaptively gain-controlled reactive pathway then produces accurate saccades. In some models (e.g., Lefèvre and Galiana, 1992), accuracy requires calibrated negative feedback to the superior colliculus. In others (Grossberg and Kuperstein, 1986; Dominey and Arbib, 1992), accuracy is explained without negative feedback to the superior colliculus during reactive saccades.
A further refinement requires an analysis of how different combinations of auditory, visually attentive, and planned eye movements, which are controlled by the parietal and prefrontal cortices, get calibrated by mechanisms downstream and parallel to the SC. Gancarz and Grossberg (1997) build on the present model to show how spatially distributed outputs from both PD and SW cells to the saccade generator cause movements with experimentally observed properties in response to these different types of movement commands.
Deciding among visual, auditory, and planned saccades. In addition to retinotopic coordinates, visual targets can also be stored in head-centered coordinates. For example, when a visual saccadic target must be stored in memory, intervening eye movements could render the stored target inaccurate if the location remained in retinotopic coordinates before eye movement.
Accurate visually reactive movements create a stable dynamical substrate on which a head-centered spatial map of invariant target location can form. A visual target can be converted from a retinotopic to a head-centered signal by adding the current eye position (Robinson, 1973, 1975; Andersen et al., 1985; Grossberg and Kuperstein, 1986; Andersen and Zipser, 1988). This is true because a corollary discharge from, for example, tonic cells of the saccade generator can provide an accurate measure of current eye position once reactive movements have been calibrated by the cerebellar adaptive gain stage (Fig. 2 b). Such a head-centered map can be used as a source of intentional and memory-based movement commands and is identified with the proposals that similar map properties exist in the parietal and prefrontal cortices (Andersen et al., 1985; Schlag and Schlag-Rey, 1987; Mann et al., 1988).
These planned eye movement targets share the efferent neurons of the colliculus with reactive saccade targets. However, visually reactive cells encode gaze error in a retinotopically activated motor map. Auditory and planned targets are coded in head-centered coordinates. Targets in head-centered coordinates must be adaptively mapped to a gaze motor error in retinotopic coordinates to access the correct efferent zones in the SC.
The transformation takes place in the model in three steps. First, the transformation between a head-centered target position and a motor error vector (viz., the direction and amplitude of the desired eye movement) is learned. This transformation is learned by computing the difference between the head-centered target position and the final eye position after a reactive movement terminates (see Fig. 2 c). This computed difference is a motor error vector. Because reactive movements are rendered accurate by cerebellar learning, the final eye position is the target position after such a movement. In other words, the motor error vector between the stored head-centered target position and the final eye position should equal zero. Learning of the transformation is thus accomplished by a process that reduces the error vector to zero (Grossberg and Kuperstein, 1986, Chapter 4). This is accomplished by using the error vector as a teaching signal that alters the adaptive weights in the pathway from the cells that compute the head-centered spatial map to those that compute the motor error (Fig. 2 c). Weight learning continues until the error equals zero, at which time the signals from the head-centered cells read out the target position in motor coordinates at the motor error vector cells.
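The coordinate change by addition of eye position, and its inverse by subtraction, can be sketched with plain vector arithmetic. The function names and the two-dimensional tuples in degrees are assumptions made for this illustration; the sketch also assumes a perfectly calibrated eye position signal.

```python
def to_head_centered(retinal, eye):
    """Retinotopic target plus current eye position gives a head-centered target."""
    return (retinal[0] + eye[0], retinal[1] + eye[1])


def to_motor_error(head_target, eye):
    """Head-centered target minus current eye position gives the movement vector."""
    return (head_target[0] - eye[0], head_target[1] - eye[1])


# A target is seen 10 deg to the right while the eye looks straight ahead.
eye_at_storage = (0.0, 0.0)
retinal_target = (10.0, 0.0)
stored = to_head_centered(retinal_target, eye_at_storage)

# An intervening eye movement shifts gaze before the saccade to the stored target.
eye_now = (5.0, 5.0)
correct_vector = to_motor_error(stored, eye_now)  # (5.0, -5.0)
stale_vector = retinal_target                     # (10.0, 0.0), no longer valid
```

The example shows why a remembered target is better held in head-centered form: the stored position stays valid across the intervening movement, whereas the original retinotopic vector does not.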
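A minimal sketch of this first step follows, assuming a one-dimensional head-centered map with one adaptive weight per map cell. The names `learn_head_to_motor` and `movement_vector`, the learning `rate`, the trial schedule, and the sign convention of the update are illustrative assumptions, not the equations of the model.

```python
def learn_head_to_motor(positions, rate=0.3, trials=500):
    """Learn weights that read out head-centered targets in motor coordinates.

    positions[i] is the spatial position (deg) coded by head-centered cell i.
    """
    weights = [0.0] * len(positions)
    for trial in range(trials):
        i = trial % len(positions)   # head-centered cell active on this trial
        # After an accurate reactive saccade the eye rests on the target.
        eye_position = positions[i]
        # Difference between the learned readout and the eye position signal.
        error_vector = weights[i] - eye_position
        # The error vector acts as a teaching signal that drives itself to zero.
        weights[i] -= rate * error_vector
    return weights


def movement_vector(weights, target_index, eye_position):
    """After learning, the same cells compute the vector to a new target."""
    return weights[target_index] - eye_position


positions = [-20.0, -10.0, 0.0, 10.0, 20.0]
weights = learn_head_to_motor(positions)
vector = movement_vector(weights, target_index=4, eye_position=-10.0)  # close to 30.0
```

Once the weights have converged, the difference between the readout and the current eye position is no longer an error to be removed but the desired movement to a newly instated target.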
Such a transformation can be learned in response to any number of head-centered maps, including auditory, visually attentive, and planned movement maps. In each case, when a new target is instated and read out at the motor error vector cells of its map, the present eye position is subtracted from it. This difference codes the desired movement to the new target. Thus the motor error vector cells not only control a learned coordinate change but also compute movement vectors.
Groh and Sparks (1992) have also used motor error vectors to model saccadic movements. They noted that auditory signals enter the brain in head coordinates. To convert them to motor error coordinates, they subtracted present eye position from the head coordinate representation. These authors do not, however, consider how this transformation is adaptively calibrated. Instead, their model assumes that perfect calibration is available. We show how these motor error vectors may be calibrated via learning. Within the parietal cortex, these motor vectors represent potential or intended movements toward attended visual or auditory targets. It is assumed that parietal cortex can store at most one target at a time. In contrast, frontal cortex is capable of working memory, whereby it can store a sequence of object or spatial commands (Perecman, 1987; Thierry et al., 1994; Rao et al., 1997). Grossberg (1978a,b) and Grossberg and Kuperstein (1986, Chapter 9) have modeled how such a sequence of commands can be stored in working memory and performed one at a time. It is assumed that these head-centered commands are converted into motor error vectors before being read out from prefrontal cortex. Data from parietal cortex (Barash et al., 1991a,b; Colby et al., 1992) and frontal eye fields (Bruce and Goldberg, 1985; Goldberg and Bruce, 1990) support the hypothesis that outputs from these areas code the direction and amplitude of saccadic movements.
Further experiments are needed to determine whether these representations take the form of motor vectors and/or the motor error maps that the model also invokes; namely, the second step of the model converts these motor vectors into locations on a topographic map, which is called the motor error map (Fig. 2 c). This step transforms large activity levels in the motor vector code to caudal positions in the topographic map and small activity levels to rostral positions (Grossberg and Kuperstein, 1986, Section 6.3). Here the terms “caudal” and “rostral” refer primarily to opposite ends of the map. However, they also anticipate that learning will create a correlation between the position code of this map and that of the colliculus.
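The vector-to-map conversion of the second step can be sketched as a monotonic position code. The linear spacing, the number of map sites, the `max_amplitude` bound, and the function name `vector_to_map` are assumptions for illustration only; the text specifies just that larger activity levels map to more caudal sites.

```python
def vector_to_map(amplitude, n_sites=20, max_amplitude=40.0):
    """Convert a motor vector amplitude into a single active map site.

    Site 0 stands for the rostral end of the map and site n_sites - 1 for the
    caudal end. The linear spacing is an assumption of this sketch.
    """
    clipped = min(max(amplitude, 0.0), max_amplitude)
    site = round(clipped / max_amplitude * (n_sites - 1))
    activity = [0.0] * n_sites
    activity[site] = 1.0
    return activity


small = vector_to_map(2.0).index(1.0)    # near the rostral end
large = vector_to_map(35.0).index(1.0)   # near the caudal end
```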
The third step is a learned transformation from the maps of the auditory, visually attentive, and planned motor errors to the map of visually reactive motor errors at the buildup cell or SW layer (Fig.2 d). This transformation renders the initially visually reactive map also sensitive to multimodal and planned targets. For example, it is proposed to be the means whereby frontal eye field (Schlag-Rey et al., 1992) and auditory (Jay and Sparks, 1984, 1987b,1990) signals get accurately mapped onto the SC movement map. This hypothesis is consistent with evidence showing that the latency of auditory saccades depends on retinotopic motor error, as does latency to a visual target presentation (Zambarbieri et al., 1995). By transforming the planned, auditory, and head-centered visual targets into gaze motor error coordinates at the SW layer, all of these input sources can compete to select a winning target location. In addition, all of these various commands can use the cerebellar, or adaptive gain, side path that the visually reactive map controls (Fig. 2 a). Multimodal fusion onto initially visually reactive pathways hereby enables planned frontal commands and parietal auditory commands to both compete with and exploit the accuracy of the visually reactive saccade system.
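Once all inputs share motor error coordinates, selection among them can be sketched as a competition over one common map. The sketch below uses simple summation followed by a maximum as a stand-in for competitive dynamics; that simplification, the map size, the input strengths, and the helper names `bump` and `select_target` are all assumptions, not the model's circuit.

```python
def bump(site, strength, n_sites=12):
    """Input map with a single active site (hypothetical helper)."""
    activity = [0.0] * n_sites
    activity[site] = strength
    return activity


def select_target(visual, auditory, planned):
    """Sum aligned inputs at each map site and return the winning site."""
    total = [v + a + p for v, a, p in zip(visual, auditory, planned)]
    winner = max(range(len(total)), key=total.__getitem__)
    return winner, total


# A weak visual distractor at site 3 loses to a planned target at site 8.
planned_wins = select_target(bump(3, 0.4), bump(3, 0.0), bump(8, 0.6))[0]
# An intense visual stimulus at site 3 takes precedence over the plan.
intense_visual = select_target(bump(3, 0.9), bump(3, 0.0), bump(8, 0.6))[0]
# Aligned visual and auditory inputs at site 3 reinforce each other.
aligned = select_target(bump(3, 0.4), bump(3, 0.4), bump(8, 0.6))[0]
```

The third case illustrates why alignment matters: inputs that code the same location can only reinforce one another if learning has brought them onto the same map site.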
Retinotopic and head-centered coordinate system alignment.The various maps from head-centered to motor error coordinates are learned by associating two representations of the same target position in space. For example, after the visually activated head-centered parietal map forms, a visual target can activate both the peak decay layer in the visually reactive pathway and the head-centered maps in the parietal and prefrontal cortices (Fig. 2 d). These pathways are associated by transforming the targets in head-centered coordinates into a gaze motor error map the output signals of which converge on the spreading wave layer, in which they use associative learning to become adaptively aligned with the visually reactive map. From this perspective, visually driven output signals from the peak decay, or burst cell, layer define teaching signals to the spreading wave, or buildup cell, layer at which unimodal inputs converge from multiple cortical loci (Fig. 2 d). This is the central reason in the model why both cell types exist. Learning enables the motor error vectors from these unimodal inputs to map onto the correct locations within the spreading wave layer using the teaching signals from the burst cell layer as aligning cues.
In response to each active burst cell location, a Gaussian teaching function across position is sent to the buildup cell layer (Fig.2 d). This teaching signal enables maximal learning to occur at the peak of the Gaussian, whereas less learning occurs farther away. Each error vector is hereby associated with a population of SC cells, with the most active cell situated at a map location that best codes the correct saccadic direction and amplitude. Such distributed population learning has several functional roles. First and foremost, it enables new target locations that have not been practiced during development to generate accurate saccades by using the Gaussian to interpolate locations that have been practiced. Secondary consequences are that an SC population code determines saccadic movements (Sparks and Nelson, 1987; Sparks and Mays, 1990), saccadic averaging can occur (Schiller and Sandell, 1983), and the buildup activity profile across the SC is very broad (see below).
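The Gaussian teaching signal and the resulting distributed learning can be sketched as follows. The learning rule (weights track the teaching signal whenever their cortical source is active), the width `sigma`, the learning `rate`, the map sizes, and the function names are assumptions for illustration rather than the model's equations.

```python
import math


def gaussian_teaching_signal(peak, n_cells, sigma=2.0):
    """Teaching signal sent from one active burst cell site to the buildup layer."""
    return [math.exp(-((j - peak) ** 2) / (2.0 * sigma ** 2)) for j in range(n_cells)]


def learn_alignment(pairs, n_cortical, n_buildup, rate=0.1, epochs=100, sigma=2.0):
    """Associate cortical motor error sites with buildup layer sites.

    pairs lists (cortical_site, burst_site) activated together by one visual cue.
    """
    weights = [[0.0] * n_buildup for _ in range(n_cortical)]
    for _ in range(epochs):
        for cortical_site, burst_site in pairs:
            teach = gaussian_teaching_signal(burst_site, n_buildup, sigma)
            row = weights[cortical_site]
            for j in range(n_buildup):
                # Weights from the active cortical site move toward the teaching signal.
                row[j] += rate * (teach[j] - row[j])
    return weights


pairs = [(0, 3), (1, 10), (2, 17)]
weights = learn_alignment(pairs, n_cortical=3, n_buildup=21)
# Buildup site with the largest learned weight from each cortical site.
peaks = [row.index(max(row)) for row in weights]
```

Each cortical site ends up projecting to a graded population of buildup cells centered on the paired burst cell location, which is the distributed code that the text credits with interpolation and broad buildup activity.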
Map learning takes place when a visual cue onset is coded by both the head-centered and visually reactive pathways. Consistent simultaneous activity in both pathways allows the location in the cortical error map that is activated by the head-centered representation to sample the position that correctly codes the desired eye movement in the reactive pathway on a number of learning trials. Activations of mismatched locations by discordant cues are statistically uncorrelated and get washed out by competitive interactions across the layers. In this way, the planned target is adaptively transformed from a head-centered representation to a gaze motor error vector and then to a gaze motor error map that is adaptively aligned with the map of the gaze motor error in the visually reactive pathway. The error map in this layer resembles the directional maps described for deep layers of the superior colliculus (Sparks and Mays, 1980).
Spreading waves, peak decay, and map learning. Why does map learning produce a system characterized by a spread of activity across the buildup cell layer during saccades? When a gaze motor error signal is sent to the saccade generator (Fig. 3), an eye movement begins. As the movement progresses, the motor error vector decreases because of the negative feedback from the eye position corollary discharge. The declining error vector excites a series of loci on the cortical error map. As a result, the commanded location at the buildup cell layer shifts across the map. The buildup cell layer thus exhibits a spreading wave as an emergent property of the circuit that makes auditory, planned, and visually reactive commands dimensionally consistent so that they can be adaptively mapped into one another. The spreading wave results from continuous updating of the adaptive motor error map as the movement progresses.
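The shift of the commanded location can be sketched in a few lines. This is only an illustration under stated assumptions: a linear map with 0.02 radians per cell (inferred from the pairing of cell 18 with 0.36 radians in the simulations), a hypothetical proportional feedback gain, and a Gaussian input profile two cells wide. None of these are the model's actual equations.

```python
import numpy as np

def peak_locations(target=0.36, rad_per_cell=0.02, n_cells=25, n_steps=50, gain=0.1):
    """Track the most active map cell while feedback reduces the motor error."""
    cells = np.arange(n_cells)
    eye = 0.0
    peaks = []
    for _ in range(n_steps):
        error = target - eye           # motor error after subtracting the corollary discharge
        center = error / rad_per_cell  # map position that codes the current error
        activity = np.exp(-0.5 * ((cells - center) / 2.0) ** 2)
        peaks.append(int(np.argmax(activity)))
        eye += gain * error            # assumed plant: eye moves a fraction of the error
    return peaks

peaks = peak_locations()
print(peaks[0], peaks[-1])
```

In this sketch the peak starts at cell 18 and moves monotonically toward cell 0, which plays the role of the rostral end of the map.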
This hypothesis helps to explain why the buildup of activity across the SC is so broad. Each new error vector maps into a new Gaussianly distributed location at the SW layer, and the cells of this layer take a while for their activity to decay. In addition, signals from the decaying activity of the burst cells cause a residual secondary peak of buildup cell activity to gradually decay. These properties are simulated below to fit SC data.
Our present interpretation of the process by which error vectors are updated as the saccade unfolds uses feedback loops that include the parietal and frontal cortices. There is, however, no logical requirement that would make it impossible for such a feedback loop to also be closed using noncortical sites. If the cortical loops are the only ones that exist, then lesions of all the appropriate parietal and prefrontal representations should eliminate the spreading wave but not the ability to generate saccades directly via burst and buildup activities that would not spread toward the fixation cells during a saccade. Moreover, note that the cortical input to the colliculus is excitatory, not inhibitory. Thus cutting this input would not have the same effect as cutting inhibitory feedback in a classical negative feedback loop.
Topographically organized excitatory feedback from the spreading wave to the peak decay layer allows a resonant activation to occur between corresponding locations in the two layers. Because resonant activation drives the map-learning process, it is critical to restrict its spatial locus (Fig. 4 a). A nonspecific inhibitory signal from the spreading wave layer, proposed to be mediated by the mesencephalic reticular formation (Edwards and de Olmos, 1976; Edwards et al., 1979; Sparks and Hartwich-Young, 1989), reaches all target locations at the fixed peak layer. The resonantly supported locations can survive the inhibition. A target is thus chosen by a competition in which irrelevant targets are inhibited. This circuit hereby helps to select an attended target. It also stabilizes the learning process by preventing irrelevant targets from being associated with each other. Models of this type are called adaptive resonance theory, or ART, models. The present model is thus called the SACCART model. ART models suggest that resonant states help to focus and stabilize learning in many brain systems other than the SC, including multiple levels of visual and auditory processing (Carpenter and Grossberg, 1993; Gove et al., 1995; Grossberg, 1995, 1997; Grossberg et al., 1997a,b).
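The selection principle can be caricatured as follows. This is a deliberately simplified sketch, not the model's membrane equations: it assumes additive inputs, a single uniform inhibition value, and rectification at zero, and all numbers are hypothetical.

```python
import numpy as np

def select_target(bottom_up, feedback, inhibition):
    """Rectified net input: excitation plus topographic feedback minus uniform inhibition."""
    net = bottom_up + feedback - inhibition
    return np.maximum(net, 0.0)

bottom_up = np.array([0.0, 0.6, 0.0, 0.6, 0.0])  # two equally strong candidate targets
feedback = np.array([0.0, 0.0, 0.0, 0.5, 0.0])   # resonant support only at index 3
surviving = select_target(bottom_up, feedback, inhibition=0.8)
print(np.flatnonzero(surviving))
```

Only the location that receives matching excitatory feedback stays above zero; the equally strong but unsupported candidate is suppressed by the nonspecific inhibition.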
Rostral migration of activity in the spreading wave layer from its original location erodes feedback excitation to the burst cell map at which visually reactive targets are stored. The eroding excitatory input thus leads to decay of the fixed peak of activity because the error in foveating the target decreases. That is why this spatial map is referred to as the PD layer.
Reconciling auditory, planned, and reactive saccade targets. Auditory, planned, and visual targets compete for attention (Kowler et al., 1995; Deubel and Schneider, 1996). Auditory or planned targets must be able to be chosen over a visual one under some conditions. In the model, the chosen eye movement locks out interruptions from other targets during its execution. In all cases, when the auditory or planned and the visually reactive targets agree, learning is reinforced. When the auditory or planned and the visually reactive targets disagree, learning between these different representations is suppressed, and the distracting target is prevented from interrupting the saccade. It should be realized, however, that a head-centered visual representation of a target that agrees with its visually reactive representation can still support learning at the corresponding location on the motor error map. When such a visual target location is attentionally selected, irrelevant visually activated target locations are suppressed.
Two sources of inhibition suppress irrelevant visually activated target locations (Fig. 4 b). One source is the nonspecific inhibition discussed in the previous section. The second inhibitory source is interrupted during an eye movement at both the spreading wave and peak decay layers. At the spreading wave layer, this latter source of inhibition is eliminated when the target is presented to allow activity to build at this layer. At the peak decay layer, it is released at the location of the chosen target gaze error. Inhibition remains to other cells in the peak decay layer, therefore preventing their activation when the spreading wave activity shifts across the map.
Anatomical and neurophysiological SC correlates and sites of attentional target selection. The connections of cells in the model, summarized in Figure 5 a, closely correlate with known anatomy and neurophysiology of the superior colliculus. Figure 5 b depicts this correspondence to anatomical data in the superior colliculus. These connections are also summarized in Table 1, which relates anatomical evidence to different cell types in the model (Hikosaka and Wurtz, 1983a,b; Cohen and Büttner-Ennever, 1984; Moschovakis et al., 1988a,b; Sparks and Hartwich-Young, 1989). Neurophysiological correlates are summarized along with the simulations reported below.
Attentional selection of a saccadic target may be progressively elaborated in several brain regions. For example, it is known that movement commands in the parietal cortex are attentively modulated (Mountcastle et al., 1981; Wurtz et al., 1982; Maylor and Hockey, 1985; Fischer, 1986; Fischer and Breitmeyer, 1987; Rizzolatti et al., 1987). On the other hand, visual, auditory, and planned movement commands converge in the SC, where a key stage in the selection of a saccadic target occurs.
It is also known, however, that planned saccades can be made when the SC is lesioned (Schiller and Sandell, 1983) and that the frontal eye fields can activate the saccade generator without activating the SC (Schlag-Rey et al., 1992; Segraves, 1992). As a result, volitional saccades can use adaptive stages in addition to the ones used for calibrating reactive saccades (Deubel, 1995). The model rationalizes these latter results by noting that accurate visually reactive saccades can be made before the head-centered maps develop and gain access to the visually reactive map via associative learning. The model hereby suggests that visually reactive saccades may be possible in sufficiently young infants without generating a spreading wave. When the spreading wave does develop, it alters the total SC output command, which becomes distributed across a larger expanse of SC cells. Likewise, even in adults, visually reactive and planned saccades can generate different activation patterns across the burst and buildup cell populations (see Figs. 8, 10). The model suggests that error vector and map inputs to the SC from the parietal and frontal cortices help to select the final saccadic target but that additional adaptive pathways help to ensure that the gains of volitional movements control accurate movements even though they generate different activation patterns than do the visually reactive movements. Gancarz and Grossberg (1997) extend the present model to simulate data concerning how these several adaptive pathways work together to ensure that visually reactive, visually attentive, auditory, and planned movements are all calibrated correctly.
Realistic simulations of the physiological response properties of both burst and buildup cells in five different saccade paradigms are presented in the Results. The mathematical equations and parameters of the model are summarized in the Appendix.
All simulations used a fourth-order Runge–Kutta algorithm with a fixed step size of 0.0025. The first simulations demonstrate how the multimodal map is learned. The adaptive weight $z_{ij}$ from the $i$th cell in the spatial error map to the $j$th cell in the PD layer grew if their activities $X_i$ and $P_j$ were simultaneously large, where $j$ is the cell corresponding to the initial gaze motor error. For a saccade encoded by an initial gaze motor error at cell 12, the weights $z_{12j}$ from cell 12 in the planned motor error map to cells near a $j$ of 12 in the SW layer also grow because of the Gaussian spread of the PD activity that is input to the SW layer. This spread was initially very broad and covered over half of the SW layer (Fig. 6 a). $P_k G_{k-j}$ in Figure 6 a shows the spatial width of the teaching signal from the $k$th PD cell to the $j$th SW cell, where $P_k$ is the activity of the $k$th PD cell and $G_{k-j}$ is the strength of the Gaussian filter connection to the $j$th SW cell.
Learning was performed during 1000 randomly generated saccades. All of the adaptive weights $z_{ij}$ were initialized to 1.0, and the reactive input $R_k$ to the PD layer was set equal to 200. All of the weights $z_{ij}$ that resulted from training during these 1000 saccades are shown in Figure 6 b. These weights were used to generate activity at the SW layer during saccades, with each saccade made to a target at a different gaze motor error. The activity profile at the SW layer for each different saccade is shown in Figure 7. The SW activity is shown at a specific time after each target presentation but before eye movement has started. The vertical lines correspond to the location in which the peak decay activity was found. In this figure, the maximally active cell at the SW layer corresponds to the location of the peak decay activity. This correspondence indicates that the learned weights provide an accurate mapping from the spatial error map to the SW layer.
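The training procedure can be mimicked with a toy version. The sketch below keeps two details from the text (1000 random saccades, weights initialized to 1.0) and assumes everything else: a 20-cell map, a simple Hebbian increment in place of the model's learning equation, a one-cell error map input, a unit-amplitude teaching Gaussian three cells wide, and a fixed random seed.

```python
import numpy as np

def train_map(n_cells=20, n_saccades=1000, rate=0.01, sigma=3.0, seed=0):
    """Toy Hebbian training of error-map-to-SW weights over random saccades."""
    rng = np.random.default_rng(seed)
    cells = np.arange(n_cells)
    z = np.ones((n_cells, n_cells))  # all weights start at 1.0, as in the text
    for _ in range(n_saccades):
        k = rng.integers(n_cells)    # cell coding the initial gaze motor error
        x = np.zeros(n_cells)
        x[k] = 1.0                   # error map activity X_i (assumed one active cell)
        teaching = np.exp(-0.5 * ((cells - k) / sigma) ** 2)  # P_k * G_{k-j} with P_k = 1
        z += rate * np.outer(x, teaching)  # weights grow only where both are active
    return z

weights = train_map()
aligned = np.argmax(weights, axis=1) == np.arange(weights.shape[0])
print(aligned.all())
```

After training, the largest weight from each error map cell points to the SW cell at the matching location, which is the toy analog of the alignment between SW peaks and peak decay locations described above.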
Burst and buildup cell simulations
The next simulations are of the time course of activation of burst and buildup cells during a saccade, using the adaptive weights learned above (Munoz and Wurtz, 1995b). These simulations were run using the learned map values $z_{ij}$ summarized in Figures 6 and 7. When a target is presented at a gaze motor error coded by cell 18 (0.36 radians), a desired eye position is input to the planned pathway. The corollary discharge coding current eye position is subtracted from the desired eye position signal, and a motor error vector results (Fig. 3). The new motor error is converted to a map representation, which produces a distributed region of input to the error map, the maximum of which occurs at the location that codes the current motor error. This distributed error map input produces a buildup of activity at the SW layer before the saccade begins (Fig. 8). The location of peak activity in the SW layer covaries with the gaze motor error. As the eye movement progresses, the corollary discharge coding current eye position is subtracted from the desired eye position signal. The dynamic motor error coded by the motor difference vector decreases as the eye approaches the target location. As this new motor error is converted to a map representation, the locations of the most active sites in the error map shift, and the location of the maximal peak in the layer migrates toward the rostral edge of the map. The migration of the peak in the error map causes a similar shift of activity in the SW layer (Fig. 8).
Several factors complicate the distribution of activity at the SW layer. One factor is that the input to the SW layer is spatially distributed even in response to a fixed motor error (Figs. 2 d, 3). A second factor is that activity builds up and decays in response to the input to SW cells at a finite rate, even as the dynamic motor error that causes it is changing. Finally, as the activity moves across the SW layer, excitatory input to the active location at the PD layer erodes. This decrease in excitatory input causes the activity at the PD layer to decrease (Fig. 9). A secondary peak of activity in the SW layer near the location of the initial motor error is visible, even as the saccade ends, because of residual input from the PD layer (Fig. 8).
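The finite rate of buildup and decay can be illustrated with a generic leaky integrator, integrated with a fourth-order Runge–Kutta step at the fixed step size of 0.0025 stated for the simulations. The equation dx/dt = -decay * x + drive and its parameter values are assumptions made for this sketch; they stand in for, and are much simpler than, the model's cell equations.

```python
import math

def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h * k1 / 2)
    k3 = f(t + h / 2, x + h * k2 / 2)
    k4 = f(t + h, x + h * k3)
    return x + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, x0, t_end, h=0.0025):
    """Integrate from t = 0 to t_end with a fixed step size."""
    n_steps = round(t_end / h)
    t, x = 0.0, x0
    for _ in range(n_steps):
        x = rk4_step(f, t, x, h)
        t += h
    return x

def leaky(t, x, decay=1.0, drive=1.0):
    """Assumed leaky integrator: activity relaxes toward drive / decay."""
    return -decay * x + drive

final = integrate(leaky, x0=0.0, t_end=1.0)
print(final, 1.0 - math.exp(-1.0))
```

With a constant drive the activity has reached only about 63% of its asymptote after one time constant, so a moving input leaves activity lagging behind it, which is one reason the buildup profile is broad.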
The release from inhibition by the substantia nigra at the PD layer, together with the increasing SW activity input, causes a rapid increase in activity at the PD layer before eye movement. This is followed by activity at the same location whose declining amplitude covaries with residual motor error (Fig. 9).
Next we simulate the dynamics of burst and buildup cell activities during visual, overlap, memory, and gap tasks (Munoz and Wurtz, 1995a). During each of the four saccade paradigms that were simulated, it was assumed that both planned and reactive inputs were provided to the SW and PD layers, respectively. While the target light was on, both the reactive and planned inputs were presented to the PD and SW layers. The reactive input shut off at target offset or the start of eye movement; the planned input remained on throughout the eye movement. Eye movement was initiated when the activity at the peak SW cell was greater than a threshold value. We assumed the simplest output law by letting the muscle plant integrate the output signal from the maximally activated SW cell (see the Appendix for details) in the motor error map, which shifts its location and activity as the saccade progresses. Gancarz and Grossberg (1997) build on the present model to show that it works when spatially distributed burst and buildup cell activity is input to a model of the reticular saccade generator, which in turn activates the muscle plant. The simulation results derived from the present model are compared with physiological data in the following sections.
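The initiation rule stated above, that movement begins when the peak SW activity exceeds a threshold, can be written as a small helper. The function name, the sampled trace, and the threshold value are hypothetical and serve only to make the rule concrete.

```python
def saccade_onset(peak_activity, threshold):
    """Return the index of the first sample whose activity exceeds the threshold, or None."""
    for step, value in enumerate(peak_activity):
        if value > threshold:
            return step
    return None

# Hypothetical trace of activity at the peak SW cell, sampled at successive time steps.
trace = [0.0, 0.1, 0.3, 0.55, 0.8, 0.6]
onset = saccade_onset(trace, threshold=0.5)
print(onset)
```

A trace that never crosses the threshold returns `None`, which corresponds to a trial on which no saccade is triggered.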
The simulation results for each of the four saccade paradigms are displayed below (see Figs. 10, 11, 12, 13). In each of the figures, there are data comparing the physiological responses of a burst cell (top left) and a buildup cell (bottom left) with the simulated responses of a PD cell (top right) and a SW cell (bottom right). Above both the biological and simulated cell responses are two time lines that indicate the status (on or off) of the fixation point and the external target stimulus. Below each graphed cell response is a line indicating the current eye position throughout the simulation.
Visually guided paradigm simulation
In the visually guided saccade simulation, shown in Figure 10, fixation point offset coincides with target presentation, at which point eye movement begins. At the SW layer, only rostral pole fixation cells (not plotted) are active while the fixation point remains on. At fixation point offset and target onset, the fixation cell activity decays, whereas a hill pattern builds up across the orientation buildup cells. The peak of the hill corresponds to the initial gaze motor error. When the fixation cell activity ceases, the hill travels from the caudal to the rostral end of the map. The activity eventually reaches the fixation zone and stops moving. The SW cell activity rises and falls gradually as does buildup cell activity. The fall of the activity at the locus of the initial peak of the SW layer extends beyond the end of the eye movement.
When the target is presented, activity at the PD layer is produced at cell 20 corresponding to the initial gaze motor error. The peak decay cell activity begins later than does SW cell activity and coincides closely with the beginning of eye movement. This relationship is similar to that of burst cell activity in comparison with buildup cell activity. PD cell activity rises and falls abruptly as does burst cell activity, and the fall of the PD cell activity coincides with the end of the eye movement.
Overlap paradigm simulation
In the overlap saccade, shown in Figure 11, target presentation precedes fixation point offset. Activity at the PD layer grows and generates a burst after the fixation point is turned off. An eye movement then begins. Initially at the SW layer, while only the fixation point is on, maximum activity is produced only at the fixation cells. When both the fixation point and the target are on, the fixation cell becomes active, and a hill of activity builds up across the buildup cells. The SW cell response subsequently decays because of habituation (see $Z_{ij}$ in Equation 9 of the Appendix) until the fixation light is turned off. When the fixation point is turned off and only the target remains on, the hill of activity builds up to a higher level and travels across the map as a spread of buildup cell activity.
Memory-guided paradigm simulation
In the memory-guided saccade, shown in Figure 12, both target onset and offset precede fixation point offset. Activity at the PD layer grows into a burst after the fixation light is turned off. When only the fixation point is on, only the SW fixation cells are active at the SW layer. When the target is flashed, the orientation buildup cells exhibit activity along with the fixation cells. The level of orientation cell activity remains constant while the fixation point remains on but increases and then migrates once the fixation point is turned off.
Gap paradigm simulation
In the gap saccade, shown in Figure 13, fixation point offset precedes target onset. Activity at the PD layer coincides with target onset and then decays. When only the fixation point is on, the fixation cell at the SW layer is the only SW cell active. When neither the fixation point nor the target is on, the fixation cell activity quickly decays, and there is no activity at the SW layer. At target onset, an activity hill builds up across orientation buildup cells. Note that this buildup occurs at a quicker rate than in the visually guided saccade simulation of Figure 10. This shorter latency can be compared with the production of an express saccade. Express saccades are often elicited during the gap saccade task (Fischer and Weber, 1993). Observations by Dr. Doug Munoz (personal communication) indicate that there is a spreading wave during both regular and express saccades.
These simulations demonstrate how the PD cells in the model can be identified with burst cells and how the SW cells in the model can be identified with buildup cells. Table 2 summarizes these comparisons.
Multimodal enhancement and depression simulations
Auditory inputs to the model SC can be transformed from a head-centered into a motor error map by using the same type of circuit that planned and attended visual targets use (see Fig. 5 a). The same model mechanisms transform all head-centered signals into a motor error map. This coordinate transformation is thus a general engine for linking head-centered to motor error commands. Because each corticocollicular pathway is unimodal, each such pathway needs to compute a motor error vector (Fig. 5 a), but all of these error vectors can then use the same PD layer teaching signals and SW layer cells to determine the winning-target locations.
We now show that the SW membrane equations that combine visual and auditory input produce multiplicative response enhancement in cells at the spreading wave layer for coincidentally located visual and auditory targets and response depression in cells when the targets are not at the same location, as also occurs in vivo (Stein and Meredith, 1993). The activity at the SW layer was compared during three different target presentations. A unimodal target consisting of a visual target only was presented at a gaze motor error coded by cell 10. Two different multimodal targets were presented. The first was a multimodal target consisting of a visual and an auditory target both at a gaze motor error coded by cell 10. The second was a visual and an auditory target at different locations, with the visual target presented at a gaze motor error coded by cell 10 and the auditory target presented at a gaze motor error coded by cell 5.
The activity across the SW layer (Fig. 14) at the end of this sensory response period is shown for all three target presentations. The SW layer activity is shown during presentation of the single multimodal target (a), the unimodal target (b), and the separate auditory and visual targets (c). In each graph, the activities of all buildup cells at the SW layer are shown, and the diameters of the circles are directly proportional to the activity of the SW cell at each corresponding location. The value of the slider bars at the top of each graph reflects the gaze motor error of the visual or auditory target when the target is presented. The location of the filled circles also indicates this gaze motor error.
The average sensory response of a SW cell was used to determine the effects of multimodal response enhancement or depression compared with unimodal response. This measure was considered analogous to the mean number of impulses produced by a neuron during presentation of a sensory cue that was used by Stein and Meredith (1993). The average sensory response is defined as the average activity produced by a cell from the time a target is presented until the time that the premotor response of the PD layer begins. The cell response that was used was always from a cell at the same location regardless of the target presentation. In the simulations described below, this is always cell 10.
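The averaging window defined above can be sketched as follows. The function name, the integer time base, and the sample activity values are hypothetical; the only element taken from the text is the window itself, which runs from target presentation until the premotor response of the PD layer begins.

```python
import numpy as np

def average_sensory_response(activity, times, target_onset, premotor_onset):
    """Mean activity from target onset (inclusive) to premotor response onset (exclusive)."""
    activity = np.asarray(activity, dtype=float)
    times = np.asarray(times)
    window = (times >= target_onset) & (times < premotor_onset)
    if not window.any():
        raise ValueError("no samples between target onset and premotor onset")
    return float(activity[window].mean())

# Hypothetical activity of one SW cell sampled at integer time steps 0..9.
times = np.arange(10)
activity = [0, 0, 1, 2, 3, 9, 9, 0, 0, 0]
mean_response = average_sensory_response(activity, times, target_onset=2, premotor_onset=5)
print(mean_response)
```

Excluding samples at and after the premotor onset keeps the burst-related activity out of the sensory measure.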
According to the formula used by Stein and Meredith (1993) for comparing the activity of a cell during single and multimodal target presentations, the percentage response enhancement or depression of the cell can be computed as:

$$\frac{CM - SM_{\max}}{SM_{\max}} \times 100,$$
where CM is the average number of impulses evoked by the combined-modality stimulus and SMmax is the average number of impulses evoked by the most effective single-modality stimulus. When the average activities at cell 10 in the unimodal target simulation and the single multimodal target are compared, the response enhancement is:
The increase in SW cell activity in the multimodal case occurs because there is excitatory input not only from the visual spatial error map but also from the auditory spatial error map. This increase in excitatory input pushes the SW cell activity into the linear region of the signal function $c(t)$ in Equation 9 of the Appendix, thereby producing additional excitatory feedback to this cell and enhancing its activity. In the case in which a unimodal target is presented, the excitatory input from the visual spatial error map to the SW cell is insufficient to produce SW cell activity in the linear region of the signal function; thus the SW cell activity remains in its slower-than-linear region, and the excitatory feedback to this cell is negligible.
Increasing the spatial separation of a visual and an auditory stimulus produces an activity pattern across the SW layer that reflects a gaze change biased toward the closer target. If both targets are spatially coincident, the combination of the two excitatory bell-shaped inputs corresponding to the visual and auditory targets in the planned pathway produces a bell-shaped activity profile in the SW layer the peak of which corresponds to the motor error of the target location. If the disparity between the two targets is increased, the combination of the two activity profiles produces a bell-shaped activity pattern with an initial peak location representing the gaze motor error between the two targets. If the spatial separation is increased further, the activity pattern that is produced has a peak closer to the gaze error representing the medial target.
This bias in the peak location is produced by the combination of the two activity profiles from the auditory and visual spatial error maps. Because of the distribution of weights from the error vector map to the spatial error map, the bell-shaped activity profile corresponding to a target with a small gaze error is narrower than is the activity profile corresponding to a target with a larger gaze error. As a result, the map locations where the two activity profiles overlap are closer to the target with the smaller gaze error. When the two activity profiles are combined, this results in a profile with a peak closer to the location coding the smaller gaze error.
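The bias toward the narrower profile can be checked numerically. The sketch below assumes two equal-amplitude Gaussian inputs on a continuous axis, with hypothetical centers and widths; it does not use the model's weight distribution.

```python
import numpy as np

def combined_peak(center_a, sigma_a, center_b, sigma_b, n_points=2001, span=20.0):
    """Location of the maximum of the sum of two equal-amplitude Gaussian profiles."""
    x = np.linspace(0.0, span, n_points)
    profile = (np.exp(-0.5 * ((x - center_a) / sigma_a) ** 2)
               + np.exp(-0.5 * ((x - center_b) / sigma_b) ** 2))
    return float(x[np.argmax(profile)])

# Narrow profile for the smaller gaze error (center 5), broad profile for the larger (center 8).
peak = combined_peak(center_a=5.0, sigma_a=1.0, center_b=8.0, sigma_b=2.5)
print(peak)
```

Under these assumptions the combined peak lies between the two centers but well short of their midpoint, on the side of the narrower profile that codes the smaller gaze error.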
When the average activities at cell 10 in the unimodal target simulation and the simulation of multimodal targets at separate locations are compared, the response depression is:
The depression in SW activity at cell 10 in the multimodal case can be compared with the SW activity in the unimodal case. This depression results because the excitatory input from the visual spatial and the auditory spatial error maps do not significantly overlap. Therefore, in both the unimodal and multimodal cases, the excitatory input to the SW cell is virtually the same. This amount of excitatory input allows the SW cell activity to remain in the slower-than-linear region of the feedback function, and the excitatory feedback to this cell is negligible. The response depression increases in the multimodal case because the growth of activity at a gaze error location between the visual and auditory targets pushes the surround activity at cell 10 into the linear region of the signal function c, producing additional inhibitory feedback to this cell.
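The Stein and Meredith (1993) comparison used in this section is the percentage change of the combined-modality response CM relative to the best single-modality response SMmax, that is, 100 × (CM − SMmax) / SMmax. A minimal helper is sketched below; the function name and the example numbers are hypothetical and are not values from the simulations.

```python
def multisensory_index(cm, sm_max):
    """Percentage change of the combined-modality response relative to the best unimodal one."""
    if sm_max <= 0:
        raise ValueError("sm_max must be positive")
    return 100.0 * (cm - sm_max) / sm_max

# Hypothetical average responses, chosen only to show the sign convention.
enhancement = multisensory_index(cm=3.0, sm_max=2.0)  # positive: response enhancement
depression = multisensory_index(cm=1.0, sm_max=2.0)   # negative: response depression
print(enhancement, depression)
```

Positive values indicate enhancement and negative values indicate depression, so the same function covers both the spatially coincident and the spatially separated cases.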
Recent data on the superior colliculus reveal a spreading wave of activation the peak of which codes the current gaze error (Munoz et al., 1991). In contrast, Waitzman et al. (1991) found that the locus of peak activity in the superior colliculus remains constant, whereas the activity level at this locus decays as a function of residual gaze error. The two main modeling approaches that have been used previously to understand how the superior colliculus controls saccadic eye movements have attempted to understand one or both of these data sets. In the first approach, the location of activity on the caudal region of the superior colliculus codes the initial size of the gaze shift, and its amplitude decreases as the remaining motor error decreases (Tweed and Vilis, 1990; Waitzman et al., 1991). When the target is fixated, neural activity on the caudal superior colliculus map disappears, and activity in the rostral zone appears. In the second approach, not only does the amplitude of the initial activity decay with decreasing motor error, but the location of maximal activity on the map travels from the initial caudal location until it reaches the rostral zone after fixation (Droulez and Berthoz, 1991; Munoz et al., 1991; Dominey and Arbib, 1992; Lefèvre and Galiana, 1992).
The more recent model of Optican (1995) (Wurtz and Optican, 1994) does suggest clear-cut functional roles for both burst and buildup cells. The model burst and buildup activity patterns play roles in producing two output signals from the SC. Burst cells produce output specifying the desired initial gaze displacement. Buildup cells integrate velocity command feedback to form a representation of gaze displacement. The spread of activity across the buildup cells reflects the motion of the eye by incorporating the influence of the velocity feedback. The resulting current and desired gaze displacement signals, the difference of which specifies the dynamic motor error, are output from the SC to the brainstem burst generator. This model assumes that the dynamic motor error is computed from these two signals in the brainstem, not in the SC. Thus, Optican proposes that burst and buildup cell types are needed to produce the two kinds of signals assumed in a model of the type originally proposed by Jürgens et al. (1981).
The Optican (1995) model is not supported by the data in two areas in which the current model matches the data. If buildup cells really were integrating feedback of the eye velocity command, then they would not show the observed presaccadic buildup that starts at a time associated with target onset and not just before saccade onset, as required by the Optican (1995) model. Moreover, if burst cells really reflected the desired initial gaze displacement, with the residual or dynamic motor error computed later, in the brainstem, then burst cells would not show the correlation with dynamic motor error reported by Waitzman et al. (1991).
The present model is consistent with the main data about burst and buildup cells and with many aspects of previous models. However, this model also analyzes how the coordinate systems that the superior colliculus uses to control movement arise and how multiple types of movement commands are rendered dimensionally consistent via learning and compete to select and attend to a final movement command. Whereas Optican’s model suggests that both burst and buildup signals are needed to produce output signals, it does not analyze the role that we propose they play in adaptively aligning and coordinating multiple types of movement signal. The present model suggests that these issues are central ones for understanding the functional organization of the saccadic system in general and the superior colliculus in particular.
For example, reflexive and volitional commands both share control of saccadic eye movements. Visually reactive cells encode gaze error in a retinotopically activated motor map, whereas auditory targets are obviously registered in head-centered coordinates, and it has been proposed that visually attended target representations in parietal cortex are also computed in head-centered coordinates (Andersen et al., 1985, 1990). To combine these saccade commands in the superior colliculus, the commands in head-centered coordinates are mapped via learning to a gaze motor error in motor error coordinates.
This transformation is functionally rationalized by the fact that calibration between visual inputs and eye movement commands is learned early in development within a visually reactive saccade system (Grossberg and Kuperstein, 1986). The accuracy of multimodal and planned saccades in the model can build on the gains learned by the visually reactive system. Auditory and planned movements need, however, to take precedence over visually reactive movements when circumstances require. For these properties to be achieved, a transformation between head-centered and retinotopic target representations needs to be learned. Targets in retinotopic and head-centered coordinates are rendered dimensionally consistent when both are transformed into motor error coordinates and they compete for attention to select a movement command.
The SACCART neural network model developed here shows that the map-learning process that combines visually attentive, auditory, and planned saccade commands provides a functional rationale for both burst and buildup cell activity patterns and provides fits to data from these cells in visual, overlap, memory, and gap behavioral paradigms (Munoz and Wurtz, 1995a,b), as well as an explanation of enhancement and depression effects caused by multimodal convergence and divergence (Stein and Meredith, 1993). The model hereby provides a conceptual bridge by which to link behavioral to biochemical manipulations of the NMDA receptors that exist at multimodal SC cells (Binns and Salt, 1996) to test predicted effects on SC burst and buildup cells. In particular, the model predicts how these NMDA receptors may be involved in learning the map that links auditory, visually attentive, and planned saccade commands from the parietal and frontal cortices to the SC. This predicted linkage between behavior, neurophysiology, and biochemistry may help to establish the deeper SC layers as an important new paradigm for studying the neural substrates of associative learning.
Rucci et al. (1997) have modeled how auditory and visual maps may get aligned in the optic tectum of the barn owl, which is homologous to the superior colliculus. Their model assumes that the eyes are fixed in a movable head, and thus all computations are done in head-centered coordinates. Adaptive transformations to the motor error coordinates that are known to occur in superior colliculus are not considered. The model also does not analyze how multiple visual, auditory, and planned cues can simultaneously compete for attentional resources before the winning location triggers learned changes in path weights. Instead, it assumes that visual foveation of a target activates a nonspecific now-print learning signal (Miller, 1963; Livingston, 1967; Estes, 1969; Grossberg, 1974, 1982) that is used to influence learning outside of the optic tectum. As a result, the model does not focus on the internal dynamics of optic tectum, including possible analogs of burst and buildup cells. It remains to be seen how a nonspecific now-print learning signal generalizes to the case in which the eye is free to move within the head in the presence of multiple, simultaneously active visual, auditory, and planned cues.
Although the SACCART model presented here assumes that the planned input is stored in head-centered coordinates before being transformed into motor error coordinates, data on parietal cortex support at least two different viewpoints. The data of Andersen et al. (1990) demonstrated how motor information in the parietal cortex is modulated by a planar gain field that is influenced by eye position and suggested how this information reflects head-centered coordinates. In contrast, Colby et al. (1992) suggested that motor information is stored in an oculocentric representation that is updated with each saccade. It is important to point out that the collicular core of the model can accept the existence of planned target representations in either representation, because the adaptive process between the planned target spatial map and the spreading wave layer aligns both maps in motor error coordinates, which are dimensionally consistent with retinotopic coordinates. In fact, there is reason to question whether these views are truly in competition; the parietal cortex may compute retinotopically consistent vector representations as well as head-centered target representations. Such dual coding may correspond to the dual coding in frontal oculomotor areas. Schlag and Schlag-Rey (1987) and Mann et al. (1988) have discovered mediofrontal areas that code eye position targets in a head-based reference system, quite unlike the saccadic vector representation formed in the frontal eye fields.
Finally, it is important to note the effect predicted by the current model in the event of a parietal lesion that interrupts the transcortical feedback to the colliculus. The primary role of the cortical projection to the SC is to allow attentive cortical selection of the target, not to provide an input corresponding to residual motor error. In fact, one effect of the model cortical input that creates the spreading wave is to cooperate with the signals that could control the saccade in the absence of cortical input. Thus according to the model, if cortical input is interrupted, visually guided saccades can still occur, but they would be expected to be hypometric. This outcome has been reported for patients with posterior parietal cortical lesions (Duhamel et al., 1992). This prediction contrasts sharply with any model in which parietal cortex mediates negative feedback to the SC, which would predict hypermetria after parietal lesions.
This section lists the mathematical equations and parameters of the model.
The visually reactive pathway in the model learns to accurately foveate visual targets. These visual targets are registered in a retinotopic map, the peak decay (PD) layer. Reactive signals that activate a region within this map topographically excite the corresponding region in the spreading wave (SW) layer. The locus of activation at the PD layer defines the direction and amplitude of a saccadic movement.
Two sources of inhibition suppress irrelevant visually activated target locations in this map. A nonspecific signal from the SW layer provides a constant level of inhibition to the peak decay layer. In addition, the model substantia nigra inhibits this layer but is released at the topographic location K encoding the saccadic eye movement. Excitatory feedback from the spreading wave to the PD layer, along with the release of substantia nigra inhibition, overcomes the nonspecific inhibition at the location encoding the target. Because all other locations are inhibited, the selected location can remain active throughout the saccade without interference from competing reactive targets. This resonant activation between layers allows a target choice to be made at the PD layer while interfering targets are suppressed. By gaining access to the SW layer, planned saccade targets can suppress saccades to reactive targets and instantiate the desired command.
This resonant circuitry at the peak decay layer is implemented by defining the change in the activity level at each cell. The peak decay cells ($P_k$) combine excitatory input ($R_k$) reflecting the reactive target input and excitatory input ($S_k$) from the spreading wave layer. In addition, the peak decay cells are inhibited by model mesencephalic reticular formation (MRF) input [$m(\sum S_j)$], by fixation cell input from the rostral pole of the SW layer ($W_1 F_k$), and by model substantia nigra input ($N_{Pk}$). The following membrane, or shunting, equation defines the change in activity at the peak decay layer.
where the following functions obtain:
where K identifies the cell location encoding initial gaze error.
SW signal function
Spread of fixation cell inhibition
When the target is turned on, the reactive input $R_k$ excites the location at the peak decay layer corresponding to the initial gaze motor error with a maximum reactive input value. After the target is turned off or the eye begins to move, the reactive input is shut off. Activity of each spreading wave cell excites the corresponding peak decay cell via a sigmoid function $f(S_k)$. All of the input to this function is in the slower-than-linear region of the sigmoid. Migration of the spreading wave away from the initial location diminishes the strength of the mutual excitation established with the PD layer, causing the activity peak in this layer to decay.
Inhibitory input to this layer is provided from the MRF, the fixation cell of the spreading wave layer, and the substantia nigra. The MRF inhibits the entire peak decay layer if there is any activity at the buildup cells of the spreading wave layer. If there is activity, the MRF input to the peak decay layer is established at a constant level of inhibition.
The fixation cell of the spreading wave layer inhibits cells at the peak decay layer based on the distance between the fixation cell and the peak decay cell. The fixation cell inhibits peak decay cells that encode smaller initial gaze motor errors more strongly than peak decay cells that encode larger errors. This stronger inhibition helps to produce quicker decay in cells that encode smaller initial gaze motor errors than in those that encode larger errors, so that the complete decay of activity coincides with eye fixation in both cases. When the initial gaze motor error is small, eye fixation occurs quickly, and the peak decay activity correspondingly returns to zero at this time. Thus several factors work together to shut off the peak decay cells.
Inhibition from the substantia nigra pars reticulata (SNr) to the peak decay cell encoding the initial gaze motor error decreases once the fixation point is turned off. After the first 150 time steps following fixation point removal, the SNr inhibition begins to decrease gradually. After 375 time steps have passed, it decreases more rapidly. This SNr activity pattern produces an initial slow rise of peak decay activity followed by a strong burst.
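The two-stage release of SNr inhibition described above can be illustrated with a piecewise-linear schedule. This is only a sketch: the text supplies the breakpoints (150 and 375 time steps after fixation point removal), whereas the initial level `n0`, the two slopes, and the assumption that inhibition stays constant before step 150 are hypothetical choices made for illustration, not the model's parameters.

```python
def snr_inhibition(t, n0=1.0, slow_slope=0.001, fast_slope=0.004):
    """Hypothetical SNr inhibition at the selected PD cell.

    t is the number of time steps since fixation point removal.
    Only the breakpoints (150, 375) come from the text; n0 and both
    slopes are assumed values.
    """
    if t <= 150:
        level = n0                                    # assumed constant before release
    elif t <= 375:
        level = n0 - slow_slope * (t - 150)           # gradual decrease
    else:
        level = (n0 - slow_slope * (375 - 150)        # level reached at step 375
                 - fast_slope * (t - 375))            # more rapid decrease
    return max(level, 0.0)                            # inhibition is never negative
```

With a schedule of this shape, a cell that is disinhibited slowly and then quickly would be expected to show a slow rise followed by a burst, which is the qualitative pattern the text describes.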
Head-centered pathway: auditory, visually attentive, or planned saccades
The second pathway in the model provides the ability to perform saccades to targets coded in head-centered coordinates. The case of planned saccades will be discussed for definiteness. First, a planned target signal is combined with the current eye position signal to produce the gaze motor error vector. Next, the current gaze motor error ($V_h$) is converted to a two-dimensional representation [$X_{(r,\theta)}$]. Finally, this planned target input is combined with the reactive target input at the spreading wave layer.
From head-centered targets to motor difference vectors
At this stage, the combination of head-centered target coordinates and eye position coordinates consists of a set of paired cell populations coding the error between the two in vector coordinates. Each cell population is defined as part of an agonist–antagonist pair coding a given degree of freedom of the oculomotor plant. For example, a pair may code the horizontal degree of freedom; one member of the pair codes activity of the muscle that pulls the eye to the right, and the other codes activity of the muscle that pulls the eye to the left. For simplicity, the oculomotor plant can be approximated by two degrees of freedom, horizontal and vertical, so there are four cell populations that define the lengths of four muscles, each of which controls up, down, left, or right movements in a two-dimensional plane in head-centered coordinates. The current gaze motor error $V_h$, where $h$ = 0, 1, 2, or 3, corresponds to the vector difference between the desired muscle length ($x_h^*$) and the current muscle length ($x_h$) (Grossberg and Kuperstein, 1986): Equation 2
Parameter $A$ defines the amplitude of the desired eye movement, and $D$ is the direction of the movement in radians. The average direction ($D_h$) coded by each eye position vector element is: Equation 3
The current muscle length is updated when the eye moves by invoking the simplest possible movement rule, because events downstream from the SC are not the focus of the present work. A wide variety of similar rules generate analogous results. It is assumed that the current muscle length $x_h$ integrates activity from the spreading wave layer: Equation 4
if $S_J$ exceeds the threshold activation of 0.01. The value of $J$ reflects the retinotopic position of $S_J$, the maximum activity in the spreading wave layer.
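The thresholded integration described above can be sketched for a single degree of freedom. The exact form of Equation 4 is not reproduced here, so the update rule below is an assumption: the drive is taken to scale with both the peak position $J$ and its activity $S_J$, and `gain` and `dt` are hypothetical parameters. Only the threshold of 0.01 and the choice of $J$ as the location of maximal spreading wave activity come from the text.

```python
import numpy as np

THRESHOLD = 0.01  # activation threshold stated in the text


def update_muscle_length(x, S, gain=0.001, dt=1.0):
    """One Euler step of an assumed integration rule (not the paper's Equation 4).

    x    : current muscle length for one degree of freedom (scalar)
    S    : spreading wave activities (1-D array; index = map position)
    gain : hypothetical scaling from map position to movement speed
    """
    J = int(np.argmax(S))             # retinotopic position of the peak activity
    S_J = S[J]
    if S_J <= THRESHOLD:              # subthreshold activity produces no movement
        return x
    return x + dt * gain * J * S_J    # assumed drive: peak position times activity
```

Under this assumed rule the eye stops moving as soon as the peak activity falls below threshold, and more caudal peaks (larger $J$) drive faster changes in muscle length.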
Next, the vector difference ($V_h$) is converted from a one-dimensional to a two-dimensional map representation [$X_{(r,\theta)}$], which can subsequently be input to the map in the spreading wave layer. To transform a spatial vector into the corresponding location in a spatial map, both weight and threshold gradients are used to map incoming signals (Grossberg and Kuperstein, 1986). The gradients work together to produce a single peak of activity at a location that corresponds to the amplitude and direction coded by the input vector. This connectivity maps higher activity levels in the vector stage to locations on the caudal edge of the map, whereas lower activity levels are mapped to locations closer to the rostral edge. The equations below accomplish this transformation: Equation 5
where the two-dimensional representation is defined in coordinates of the amplitude $r$ and direction $\theta$ of eye movement. The function $[x]^+$ indicates that the value of $M_{(r,\theta)}$ is set to 0 if it is negative.
Functions $W_{h(r,\theta)}$ and $\Gamma_{(r,\theta)}$ represent the gradient of weights and thresholds, respectively. The weights reflect the path strength between each of the elements ($h$) in the difference vector to each position $(r,\theta)$ in the two-dimensional representation: Equation 6
The thresholds determine the level above which a signal is permitted to generate activity and are a function of the amplitude (r) of the eye movement: Equation 7
The two-dimensional activity is normalized to produce a single peak of activity: Equation 8
Equation 8 approximates the action of a recurrent on-center, off-surround network (Grossberg, 1973, 1982). Because the following simulations consider only horizontal eye movements, the value of θ was set to 1. Therefore, the subscript (r,θ) is replaced by the subscript i in the following equations.
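For the one-dimensional (horizontal) case, the vector-to-map transformation can be sketched as a rectified, thresholded weighting followed by normalization. The specific gradients below are assumptions chosen so that the construction is easy to verify, not the paper's Equations 6 and 7: weights grow linearly with map position $r$, thresholds grow as $r^2/2$, so the net input $rV - r^2/2$ peaks at $r = V$. The power-law sharpening step (`sharpen`) is likewise a hypothetical stand-in for the contrast enhancement of a recurrent on-center, off-surround network.

```python
import numpy as np


def vector_to_map(V, n_cells=20, sharpen=8):
    """Map a scalar motor error V (in units of map position) onto a 1-D map.

    Assumed gradients (illustrative only):
      weights    W_r     = r         (grow toward the caudal edge)
      thresholds Gamma_r = r**2 / 2  (grow faster than the weights)
    so that W_r * V - Gamma_r is largest at r = V.
    """
    r = np.arange(1, n_cells + 1, dtype=float)
    M = np.maximum(r * V - r**2 / 2.0, 0.0)   # [x]^+ rectification
    if M.sum() == 0.0:
        return M                              # no suprathreshold input
    X = M**sharpen                            # assumed contrast enhancement
    return X / X.sum()                        # total activity normalized to 1
```

With these assumed gradients, larger vector activity yields a peak nearer the caudal (high-index) edge and smaller activity a peak nearer the rostral edge, matching the qualitative mapping described in the text.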
Spreading wave layer
Because the spatial representation of the planned target is aligned with that of the reactive target, the two signals can be combined at the $j$th cell in the spreading wave layer to produce a change in the activity of a cell via a membrane, or shunting, equation (Grossberg, 1973). The spreading wave cell activities ($S_j$) can be classified into two different types, fixation cells and buildup cells. Because these cells have different inputs, they will be treated separately.
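As a generic illustration of the membrane, or shunting, form, the sketch below integrates $dx/dt = -Ax + (B - x)E - (x + C)I$ for a single cell with total excitatory input $E$ and total inhibitory input $I$. The parameter values and the Euler step size are hypothetical and are not those of the model's layers; the point is only that shunting terms keep activity between $-C$ and $B$ however large the inputs become.

```python
def shunting_step(x, excite, inhibit, A=1.0, B=1.0, C=0.0, dt=0.01):
    """One Euler step of dx/dt = -A*x + (B - x)*excite - (x + C)*inhibit."""
    dxdt = -A * x + (B - x) * excite - (x + C) * inhibit
    return x + dt * dxdt


def settle(excite, inhibit, steps=5000, **params):
    """Integrate from rest with constant inputs and return the final activity."""
    x = 0.0
    for _ in range(steps):
        x = shunting_step(x, excite, inhibit, **params)
    return x
```

For constant inputs the activity approaches $(BE - CI)/(A + E + I)$, so `settle(4.0, 1.0)` approaches 4/6 with the default parameters.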
The buildup cells ($S_j$, where $1 < j < N$) combine excitatory input ($X_i$) reflecting the planned target input and excitatory input $P_k$ from the peak decay cell layer. The inputs ($X_i$) are multiplied, or gated, by transmitter weights ($Z_{ij}$) that habituate in an activity-dependent manner. In addition, the buildup cells are inhibited by MRF input [$m(\sum S_j)$], fixation cell input ($S_1$), and substantia nigra input ($N_S$) (Cohen and Büttner-Ennever, 1984; Sparks and Hartwich-Young, 1989). The following equation defines the change in the activity of cells at the buildup layer.
Spreading wave (buildup cells)
Habituative intermodal path strength
PD signal function
Spread of input from PD to SW
Feedback signal function
In greater detail, Equation 9 says that excitatory input to this layer is provided from the planned target spatial map $2\sum_{i=1}^{N} Z_{ij} X_i$, the peak decay layer $4\sum_k g(P_k G_{k-j})$, and recurrent feedback $40c(S_j)$. When the target is turned on, the planned target input excites a region at the spreading wave layer with maximum activity corresponding to the initial gaze motor error. The planned input to this layer remains even after the target is turned off or the eye begins to move. The peak decay layer is designed so that only one cell in the sum $4\sum_k g(P_k G_{k-j})$ is positive at any time. Activity of each peak decay cell excites the corresponding spreading wave cell via a sigmoid and a Gaussian function. The peak decay input reaches several neighboring cells at the spreading wave layer with a Gaussian spread of activity. All of this peak decay input excites the spreading wave cells via a sigmoid function. An excitatory term $c(S_j)$ incorporates on-center feedback using a nonlinear signal function.
Inhibitory input $40m(\sum_{j>1}^{N} S_j)$ to this layer is also provided from the MRF, the fixation cell $0.8S_1$ of the spreading wave layer, and the substantia nigra $50N_S$ as in the peak decay layer. The MRF inhibits the entire buildup wave layer at a constant level if there is any activity at the buildup cells. The SNr inhibition is applied equally to all of the spreading wave cells in the path of the wave. In addition, $\sum c(S_k) H_{k-j}$ reflects that the off-surround inhibition is distance-dependent and is weighted by a Gaussian function $H_{k-j}$.
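The peak decay contribution to the buildup cells, a sigmoid applied to Gaussian-weighted PD activity and summed over PD cells, can be sketched as follows. The functional form of the sigmoid, its half-saturation constant `half`, and the kernel width `sigma` are all assumptions for illustration; the text specifies only that the spread is Gaussian and that the signal function is sigmoidal.

```python
import numpy as np


def gaussian_kernel(offsets, sigma=1.5):
    """Gaussian fall-off G_{k-j} as a function of the offset k - j (sigma assumed)."""
    return np.exp(-offsets**2 / (2.0 * sigma**2))


def sigmoid(w, half=0.3, power=2):
    """Assumed sigmoid signal function g(w) = w^n / (half^n + w^n), for w >= 0."""
    return w**power / (half**power + w**power)


def pd_to_sw_input(P, sigma=1.5):
    """Sum over PD cells k of g(P_k * G_{k-j}) for every SW cell j."""
    n = len(P)
    k = np.arange(n)[:, None]          # PD cell index (rows)
    j = np.arange(n)[None, :]          # SW cell index (columns)
    G = gaussian_kernel(k - j, sigma)  # kernel value for every (k, j) pair
    return sigmoid(P[:, None] * G).sum(axis=0)
```

With a single active PD cell, the resulting input is a symmetric bump centered on the corresponding SW cell, so neighboring SW cells also receive excitation.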
The fixation cells ($S_j$, where $j = 1$) combine excitatory input ($F$) reflecting fixation input and weighted excitatory input ($2\sum_{i=1}^{N} Z_{ij} X_i$) from the spatial error map. In addition, the fixation cells are inhibited by buildup cell input $10\sum_{j>1}^{N} S_j I_j$ and peak decay cell input $10\sum_{k>1}^{N} P_k$. The buildup cell inhibition is distance-dependent. The following equation defines the change in activity of fixation cells at the buildup layer.
Spreading wave (fixation cells)
SW buildup input kernel
Intermodal map learning
For the multimodal and planned pathways to gain access to the visually reactive pathway, the different coordinate systems of these pathways are adaptively aligned via associative learning. This learning process needs to be stable through time, despite the possible interference from multiple sensory cues. In addition, the eye movement calibration learned by the visually reactive pathway at the cerebellar adaptive gain stage needs to be maintained after multimodal learning.
Multimodal and planned signals are relayed to the SW layer where they compete for attention with visually reactive saccades via the recurrent on-center off-surround interactions in Equation 9. Learning occurs between the motor error map of each multimodal and planned input source and the SW layer. After the learning process begins, it is stabilized by the congruence of saccade commands from multimodal and visually reactive pathways. If multimodal and visually reactive positions agree, then learning reinforces the current map alignment. If the commands disagree, then both learning and the distracting saccade target are suppressed by their mismatch.
The equations for one such learned map are shown below. Each map is determined by the same mechanisms but can use its own motor difference vector to map into the SW layer. Learning proceeds according to the following associative learning rule, which is often called the outstar learning rule (Grossberg, 1968, 1969): Equation 11
Variable $z_{ij}$ in Equation 11 defines the strength of the learned connection between the $i$th cell in the multimodal map, the activity of which is $X_i$, and the $j$th cell of the SW layer. The $k$th cell in the PD layer ($P_k$) acts as a teaching signal and is input to the SW layer using a Gaussian spread ($G_{k-j}$). The adaptive weights $z_{ij}$ all start out with the same value. When $X_i$ is sufficiently active, $n(X_i) > 0$ in Equation 11, so learning commences. The term $n(X_i)$ hereby gates learning on and off. When $n(X_i) > 0$, the adaptive weight tracks the teaching signal $p(P_k G_{k-j})$. Such tracking behavior is called steepest descent. Thus the outstar learning rule is a gated steepest descent law. In particular, whenever the intermodal and visually reactive targets agree, both $n(X_i)$ and $p(P_k G_{k-j})$ are positive, and the weight $z_{ij}$ converges toward the PD activity $P_k$ that is projected to site $j$ in the SW layer. The weights $z_{ij}$ are bounded because the PD layer input ($P_k$) is bounded by 1.2, the upper bound of the excitatory term in Equation 1. Learning continues until the intermodal target input becomes aligned with the visually reactive PD that projects to the SW. The learning rate is α = 0.5 in Equation 11.
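The gated steepest descent behavior described above can be sketched numerically. The rule integrated here, $dz_{ij}/dt = \alpha\, n(X_i)\,(T_j - z_{ij})$, follows the verbal description, with $T_j$ standing for the teaching signal that reaches SW cell $j$. The threshold-linear form of the gate $n$, the gate threshold, and the Euler step size are assumptions; only α = 0.5 is taken from the text.

```python
import numpy as np


def outstar_step(z, X, teach, alpha=0.5, gate_threshold=0.1, dt=0.01):
    """One Euler step of dz_ij/dt = alpha * n(X_i) * (teach_j - z_ij).

    z     : adaptive weights, shape (n_source, n_target)
    X     : source map activities X_i, shape (n_source,)
    teach : teaching signal reaching each SW cell j, shape (n_target,)
    The gate n(X_i) is assumed to be threshold-linear.
    """
    gate = np.maximum(X - gate_threshold, 0.0)   # n(X_i): zero for weak X_i
    return z + dt * alpha * gate[:, None] * (teach[None, :] - z)


# Usage: only the row whose source cell is active learns the teaching pattern.
z = np.full((3, 4), 0.2)                 # all weights start at the same value
X = np.array([1.0, 0.0, 0.05])           # only source cell 0 exceeds the gate
teach = np.array([0.0, 0.3, 1.0, 0.3])   # Gaussian-like teaching profile
for _ in range(5000):
    z = outstar_step(z, X, teach)
```

After the loop, the weights from the active source cell have converged to the teaching profile, whereas weights from the inactive cells remain at their initial value, which is the sense in which the gate switches learning on and off.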
Learning is sped up and rendered more robust when the intermodal spatial map can sample neighboring locations in the SW layer. As described above, the excitatory input from the PD layer to the SW layer shows a Gaussian fall-off. In addition, the gating input $n(X_i)$ is active over a broad region. The spatial distribution of the predictive and PD inputs enables learning to interpolate across several map locations. As a result, distributed learning of the entire map can occur using only a small set of sampled positions.
Three conditions cooperate to achieve stable learning. First, only one visually reactive target is active at the PD layer during learning. Second, the gaze motor error vector remains fixed during the beginning of the target presentation. This period is when most of the learning occurs. Third, the learning rate parameter α is small so that spurious learning is compensated by statistically reliable correlations.
This work was supported in part by Air Force Office of Scientific Research Grant F49620-92-J-0499 (S.G. and K.R.), Defense Advanced Research Projects Agency Grant N00014-92-J-4015 (S.G. and K.R.), and Office of Naval Research Grants N00014-92-J-1309 (S.G., K.R., M.A., and D.B.) and N00014-95-1-0409 (S.G., K.R., and D.B.). We thank Diana Meyers for her valuable assistance in the preparation of this manuscript.
Correspondence should be addressed to Dr. Stephen Grossberg, Department of Cognitive and Neural Systems and Center for Adaptive Systems, Boston University, 677 Beacon Street, Boston, MA 02215.
Dr. Roberts’s present address: Cognex, One Vision Drive, Natick, MA 01760.
Dr. Aguilar’s present address: Machine Intelligence Group, Massachusetts Institute of Technology Lincoln Labs, 244 Wood Street, Lexington, MA 02173.