Cortical connections and early visual function: intra- and inter-columnar processing☆
Introduction
Orientation provides the basis for organizing much of visual cortex and provides the foundation for visual information processing. An examination of receptive field structure shows how the neurons in V1 form a position + orientation map of the visual field [19]. Roughly speaking, recordings along a short tangential (interblob) penetration reveal a collection of cells with about the same receptive field centers but with shifts in orientation preference, while normal penetrations reveal cells with similar orientation and position preferences but different receptive field sizes (or spatial frequency selectivity). Taken together these observations define an array of orientation columns which, combined with eye of origin, provide a basic representation for visual information processing; see Fig. 1.
The resulting hypercolumns are not independent. There exist long-range horizontal interactions between them, which greatly enlarge the domain of possibilities for information processing. The basic question is: which early visual information processing tasks take place within columns, and which between?
One such task is edge detection, and it is widely believed that the long-range horizontal connections support the sensory integration necessary for this. The nature of the evidence supporting this belief is reviewed in the next section. However, less analysis has been applied to the question of determining which tasks comprise sensory integration, and it is on this latter question that we concentrate in this paper. Assembling estimates of local orientation (within columns) into long curves (between columns) is clearly one such task, but this is not unique; there are several other tasks that also arise naturally in early vision, all of which, we shall show, are consistent with the available data on long-range horizontal integration.
This paper is primarily computational. We first develop a computational theory of curve detection to illustrate that there is more to this task than is normally presupposed. We also use this opportunity to introduce the conceptual thread that runs through the paper: the use of differential-geometric ideas to articulate theories of sensory integration. We then sketch how these ideas can be extended for the analysis of texture, shading, and binocular information. Particular types of sensory integration arise for each of these tasks, although the general form remains invariant.
The computational ideas have a natural expression in physiological terms: we will use intra-columnar processing to develop local representations and inter-columnar processing to integrate them into coherent wholes. Some of the functional roles in which the long-range horizontal connections could be involved are quite unexpected, in that they do not appear related to curve integration without the analysis provided.
More detailed issues arise throughout the discussion, such as the role of certain non-linearities in information processing. We also relate these to biology, at both a biophysical level (implementing the non-linearities with shunting inhibition) and at a detailed anatomical level (elaborating intra-columnar processing across layers).
While evidence is provided to demonstrate that the above claims are well-founded computationally, in our current state of understanding any attempt to reduce such abstract models to physiological circuits must necessarily involve a degree of speculation. This said, we feel the time is right to start contributing such ideas to the neurophysiological community. Our hope is that they will broaden both the discussion around, and the experimental perspective on, sensory integration.
Finding the boundaries of objects is one of the central problems of early visual information processing (Fig. 2), and cells with oriented receptive fields are often interpreted as local edge detectors, or at least as components of a system for local edge detection. That a system is required follows from the fact that many of the local responses are noisy or ambiguous: they may arise from an accidental alignment of viewing geometry and lighting, from a specularity, from noise in the sensory process, or from a myriad of other causes. The resolution of these noisy, ambiguous responses is sought from context, with those cells responding consistently along an edge facilitating one another to enhance the correct responses while eliminating the noisy, random ones.
This is perhaps the most basic type of contour integration, and it is classical. The Gestalt psychologists argued for a form of orientation good continuation nearly 75 years ago [46], and it was suggested nearly 20 years ago that the anatomical substrate for such facilitation could be the long-range horizontal connections [33].
The long-range horizontal axons effectively connect cells in the superficial layers of nearby orientation columns, so to test the above hypothesis the orientation preferences of the cells involved must be known. Two experimental paradigms have been developed. First, to test for contextual effects, one can isolate a target unit with a particular orientation preference, and then plot how its activity is modulated by stimuli in the surround. This paradigm was first used to demonstrate influences from beyond the classical receptive field [1], [34], and is still being applied in technically advanced ways [26], [28]. Many such studies show that the target cell’s firing is facilitated by surround stimuli that have about the same orientation as the one preferred by the target cell. While this technique can provide detailed characterization of the influences on an individual cell, it is not suitable for gathering population distributions, it is stimulus dependent, and the system-wide functional links it reveals are more difficult to interpret in terms of the physical connections between neurons.
Alternatively, studies that are more suitable for population statistics have used optical imaging and anatomical tracing to reveal the entire connectivity structure of cells (Fig. 3). Here optical imaging techniques are used to “colour” (an image of) the cortex with the approximate orientation preference of the underlying cells. Tracers are then injected into a cell and mark its terminals. The distribution of these terminals can then be plotted against the “colour” (or rough orientation preference) of the domain in which they terminate to yield a complete characterization of a cell’s connectivity structure in the orientation domain [7], [31].
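The bookkeeping behind such a plot can be sketched as follows. This is a minimal illustration, not the analysis used in the cited studies: the function names, the 15° bin width, and the sample values are all assumptions made for the example. The one point it makes explicit is that orientation is periodic with period 180°, so the difference between two preferences never exceeds 90°.

```python
def orientation_difference(theta_a, theta_b):
    """Smallest angle (degrees) between two orientations; orientation has period 180."""
    d = abs(theta_a - theta_b) % 180.0
    return min(d, 180.0 - d)


def connection_histogram(source_pref, terminal_prefs, bin_width=15.0):
    """Count terminals per orientation-difference bin over [0, 90] degrees.

    source_pref    -- orientation preference (degrees) at the injection site
    terminal_prefs -- orientation preferences (degrees) of the domains in which
                      the labelled terminals were found (hypothetical input)
    """
    n_bins = int(90.0 / bin_width)
    counts = [0] * n_bins
    for pref in terminal_prefs:
        d = orientation_difference(source_pref, pref)
        # A difference of exactly 90 degrees goes into the last bin.
        idx = min(int(d // bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


# Illustrative values only: three near-iso-oriented terminals, one oblique, one cross-oriented.
print(connection_histogram(0.0, [0.0, 5.0, 175.0, 45.0, 90.0]))
```

A distribution dominated by the first bin corresponds to iso-orientation connectivity; counts in the later bins are the connections to domains with dissimilar orientation preference.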
Physiological evidence of both types is accumulating, and a summary of the evidence is that the majority of such facilitory interactions are iso-orientational; i.e., between cells with similar orientation preferences [26], [31], [43]. Taken back to the edge detection problem, many researchers observe that this is essentially what the orientation good continuation hypothesis would predict, and it suggests that a form of co-linear facilitation underlies edge detection.
This basic model for edge detection––filtering by operations analogous to linear, oriented receptive fields followed by co-aligned facilitation––can be implemented and tested on natural images. Thus far this is only an outline of an approach, however, because variations in filters, their interactions, and the detection process remain unspecified. Researchers in computer vision have considered these issues, and one of the most widely used edge detectors is that of Canny [8]. This system effectively implements the above outline, and consists of an initial filtering stage, with filters very similar in form to simple-cell receptive fields, followed by a facilitory (hysteresis) stage that implements a type of co-aligned facilitation. Allowing an analogy to driving a car along the edge, the output of the detector is essentially the strongest set of initial edge detector responses that continue the edge in the same orientation in which it was (recently) going. Of course, the continuation can be adjusted by several parameters, intuitively varying the “inertia” with which an edge continues. Both the “driving history” and the strength of responses affect the final result.
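The facilitory stage can be illustrated with a short sketch of hysteresis thresholding. This is a simplified version under stated assumptions, not Canny's full detector: it assumes the initial filtering (and any thinning of the responses) has already produced a grid of response strengths, and the names `response`, `low`, and `high` are chosen here for illustration. Responses above `high` seed an edge; weaker responses above `low` are kept only if they are connected, through neighbouring kept responses, to a seed.

```python
from collections import deque


def hysteresis(response, low, high):
    """Keep strong responses, plus weak ones connected to them (8-neighbourhood).

    response  -- 2-D list of non-negative edge-response strengths
    low, high -- weak and strong thresholds, with low <= high
    Returns a 2-D list of booleans marking the retained responses.
    """
    rows, cols = len(response), len(response[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Strong responses are accepted unconditionally and act as seeds.
    for r in range(rows):
        for c in range(cols):
            if response[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))

    # Grow each edge outward through neighbouring responses that exceed `low`.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and not keep[rr][cc] and response[rr][cc] >= low):
                    keep[rr][cc] = True
                    queue.append((rr, cc))
    return keep


row = [[0.9, 0.4, 0.1, 0.4]]
print(hysteresis(row, low=0.3, high=0.8))   # the weak gap at 0.1 stops the edge
print(hysteresis(row, low=0.05, high=0.8))  # a lower threshold carries it across
```

The two calls differ only in `low`, yet the second continues the edge through a very weak response and onto whatever lies beyond it. The choice of thresholds therefore determines whether an edge stops at a gap or is carried across it.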
The Canny detector can be evaluated on an image to assess performance, and researchers unfamiliar with computer vision are often surprised at the results. Edge detection is not as easy a problem as it may initially appear. Different values of the parameters illustrate the variations that typically occur, and it is instructive to examine them (Fig. 4). Among the problems that emerge: boundaries can be broken apart or, what is perhaps worse, proper but physically disconnected boundary segments can be inappropriately connected. Note in particular how, for some values of the inertia parameter, the detector jumps across gaps and bridges nearby (but totally different) parts of edges. Unfortunately, there is no agreement on how to select the parameters so that these problems do not arise.
The experience with edge detection in computer vision reveals some of the types of questions that can arise from an information processing perspective. First, for the initial operators, there is the question of how to obtain local estimates of orientation that provide a consistent bridge between the image and the scene. This must not induce incorrect linkages between disparate curves, even when they are close in the image, because this implies incorrect physical structure for objects in the world. Second, there is the question of orientation good continuation: What does this mean in terms of physical object properties and how should nearby oriented responses facilitate one another so that they induce precisely the right amount of facilitation, co-linear when appropriate and curved when not?
Thirdly, different types of questions arise from physiology. For example, if the majority (as opposed to the entirety) of connections are between (approximately) iso-oriented cells, are the “outliers” to this simply noise? Is the correct abstraction for the majority co-linear facilitation, or is something else going on? And, how should the majority be defined; in particular, what is the proper spread in the distribution of connections in the orientation domain?
Finally, there are questions that arise from the interface between physiology and computer vision: are horizontal facilitations only participating in edge detection, or might they also be implementing other functional roles? If so, what might these be and are they consistent with the given data about connections?
We shall consider each of these groups of questions in this paper. To set the stage, we note that a closer examination of the physiological data suggests the story is more complex than co-aligned contour facilitation (Fig. 5). For example, while Kapadia et al. [26] stress iso-orientation facilitation, they also provide examples of facilitation between cells with up to 50° orientation difference (their Fig. 10). The distributions provided by Bosking et al. [7] clearly show a non-negligible portion of anatomical connections even between cross orientations (their Fig. 6). And from the findings of Ts’o et al. [43] it is evident that there are facilitory functional interactions between iso-oriented cells whose receptive fields are parallel to one another, rather than co-aligned (see top pair in their Fig. 5). Interpreting such pairs as participating in collinear contour integration is awkward. Instead, we shall argue that such exceptions are naturally explained from a series of computational tasks, including curved contour integration, texture, shading, and stereo processing.
To develop this argument, we shall have to consider how early visual information processing can be structured on orientation hypercolumns. We do this in two stages. First, we briefly review a model that captures enough of the structure of the columnar architecture that it can be related to neurophysiology and neuroanatomy, but is sufficiently abstract that it can be analyzed mathematically and computationally. We then proceed to the analysis of curves, textures, shading, and stereo in it, and how these relate back to (certain aspects of) scene structure. As we show, facilitory interactions can be involved in all of them, but co-aligned facilitation by itself is insufficient. Differential geometry, and curvature in particular, is necessary to understand why co-aligned facilitation is dominant but not unique; there are an important (and predictable) number of non-co-aligned facilitory influences that play key processing roles.
Section snippets
The columnar machine
We begin by re-drawing the ice-cube model to illustrate the possibility of geometric information processing. We focus on columns with the same monocular specificity and drop deep layers. We depict the orientation hypercolumns as vertical fibres distributed over a retinotopic array (the tilted plane) and we display the orientation preference of cells within each hypercolumn as oriented segments (Fig. 6). When organized in this fashion, a geometric view of processing emerges, in which the fibre
The curve inference problem
Assume that orientation selectivity defines a substrate for representing those tangents that approximate the curves that bound objects in the scene and that define highlights and other surface markings. We now analyze how long-range horizontal interactions can reduce the errors inherent in locally estimating tangent orientation. Orientation change can be used to localize corners and junctions (as may occur at the point where one object occludes another in depth), and for the perceptual
Intra-columnar processes
The above example shows that boundaries need not be (in fact, are rarely) straight. Viewed locally, then, an approximation to a boundary over a very short distance is the tangent to the (boundary) curve. Viewed over a slightly larger neighbourhood, curvature begins to matter. Thus there are two problems that need to be addressed: (i) estimating tangents and (ii) estimating curvature.
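The two quantities can be made concrete with a purely geometric sketch. It is not the operator-based estimation developed in this section: it assumes that three nearby sample points on the curve are already available, and the function names are hypothetical. The tangent orientation at the middle point is approximated by the chord through its two neighbours, and the curvature by that of the circle passing through all three points.

```python
import math


def tangent_orientation(p_prev, p_next):
    """Orientation in [0, pi) of the chord joining the two neighbouring points."""
    angle = math.atan2(p_next[1] - p_prev[1], p_next[0] - p_prev[0])
    return angle % math.pi  # orientation, not direction: theta and theta + pi coincide


def three_point_curvature(a, b, c):
    """Signed curvature of the circle through a, b, c (positive if counter-clockwise)."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    ab = math.hypot(b[0] - a[0], b[1] - a[1])
    bc = math.hypot(c[0] - b[0], c[1] - b[1])
    ca = math.hypot(a[0] - c[0], a[1] - c[1])
    denom = ab * bc * ca
    if denom == 0.0:
        return 0.0  # coincident points: curvature is undefined, report 0
    return 2.0 * cross / denom  # equals 4 * (triangle area) / (product of side lengths)


# Three samples on a circle of radius 2: the curvature should be close to 1/2.
R, dt = 2.0, 0.1
a = (R * math.cos(-dt), R * math.sin(-dt))
b = (R, 0.0)
c = (R * math.cos(dt), R * math.sin(dt))
print(tangent_orientation(a, c), three_point_curvature(a, b, c))
```

For collinear points the curvature is zero and only the tangent matters, which is the straight-line case; as the samples bend away from the chord, the curvature term becomes the quantity that distinguishes one continuation from another.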
Determining tangent directions with linear operators can be difficult, because different types of structure may
The geometry of texture flows
With this understanding of the inference of tangent maps for individual curves, we move to patterns of multiple curves. Examples include pinstriped material and artist’s etchings, animal coats and zebra’s stripes. For such patterns orientation is distributed over two-dimensional regions, and as the requirement for perfect continuations is relaxed we obtain texture flows.
The importance of locally (almost) parallel structure has been observed psychologically [14], [24], [44]. In particular, the
Shading flows and fold-type edges
While texture flows are a generalization of perceptual organization beyond curves, they are still a somewhat special class of patterns. It is therefore difficult to understand why the cortex might have evolved specialized circuitry for them. We believe an important part of the answer to this question is that there are other, more universal perceptual features that share the basic structural properties of texture flows. Prominent among these features is the shading flow field––the vector field
The geometry of stereo correspondence
As our final example of interactions between orientations we turn to stereo. Thus far we have been considering interactions between orientations within cells driven by one eye; we now consider both of the ocular dominance bands. Abstractly this implies an important construction for the columnar machine: the “product” of two machines, one for the left eye and the other for the right eye. Mathematically this suggests working in the product space (R²×S¹)×(R²×S¹) and designing compatibility
Conclusions
We have argued for a differential geometric approach to vision, and have shown how this can be supported by the columnar architecture of visual cortex. Our computational model comprises non-linear local orientation measurements and the refinement of these measurements using context, provided by curvature, through a cooperative network. We have demonstrated how the local orientation measurements could be computed by intra-columnar neural circuitry, via shunting inhibition. The observed
Acknowledgements
We thank S. Alibhai for the stereo computations.
References (54)
- et al., Endstopping and curvature, Vision Res. (1989)
- et al., Contour integration by the human visual system: evidence for a local association field, Vision Res. (1993)
- et al., Improvement in visual sensitivity by changes in local context: parallel studies in human observers and in V1 of alert monkeys, Neuron (1995)
- et al., Cliques, computations, and computational tractability, Pattern Recog. (2000)
- et al., Stimulus specific responses from beyond the classical receptive field: Neurophysiological mechanisms for local-global comparisons in visual neurons, Ann. Rev. Neurosci. (1985)
- S. Alibhai, S.W. Zucker, Contour-based correspondence for stereo, in: Computer Vision – ECCV 2000, Lecture Notes in...
- et al., Generation of end-inhibition in the visual cortex via interlaminar connections, Nature (1986)
- O. Ben-Shahar, S.W. Zucker, Flowing towards coherence: on the geometry of texture and shading flow, in: IEEE Computer...
- O. Ben-Shahar, S.W. Zucker, On the perceptual organization of texture and shading flows: from a geometrical model to...
- P. Breton, S.W. Zucker, Shadows and shading flow fields, in: Proc. Computer Vision and Pattern Recognition,...
- Orientation selectivity and the arrangement of horizontal connections in the tree shrew striate cortex, J. Neurosci.
- A computational approach to edge detection, IEEE Trans. Pattern Anal. Machine Intell.
- Differential Geometry of Curves and Surfaces
- Endstopped neurons in the visual cortex as a substrate for calculating curvature, Nature
- Stable Mappings and their Singularities
- Moiré effect from random dots, Nature
- Untersuchungen über die wahrnehmung ebener geometrischen figuren die ganz oder teilweise von anderen geometrischen figuren verdecket sind, Zeitschrift für Psychologie
- Texture fields and texture flows: sensitivity to differences, Spat. Vision
- On the foundations of relaxation labeling processes, IEEE Trans. Pattern Anal. Mach. Intell.
- Logical/linear operators for image curves, IEEE Trans. Pattern Anal. Mach. Intell.
- Organization in Vision: Essays on Gestalt Perception
- ☆ Research supported by AFOSR and DARPA.
- 1 Current address: Artificial Intelligence Laboratory, MIT, Cambridge MA.