Abstract
Visual crowding refers to the phenomenon in which a target object that is easily identifiable in isolation becomes difficult to recognize when surrounded by other stimuli (distractors). Many psychophysical studies have investigated this phenomenon and proposed alternative models for the underlying mechanisms. One prominent hypothesis, albeit with mixed psychophysical support, posits that crowding arises from the loss of information due to pooled encoding of features from target and distractor stimuli in the early stages of cortical visual processing. However, neurophysiological studies have not rigorously tested this hypothesis. We studied the responses of single neurons in macaque (one male, one female) area V4, an intermediate stage of the ventral, object-processing pathway, to parametrically designed crowded displays and their texture-statistics-matched metameric counterparts. Our investigations reveal striking parallels between how crowding parameters, e.g., number, distance, and position of distractors, influence human psychophysical performance and V4 shape selectivity. Importantly, we also found that enhancing the salience of a target stimulus could alleviate crowding effects in highly cluttered scenes, and that this alleviation could be temporally protracted, reflecting a dynamical process. Thus, a pooled encoding of nearby stimuli cannot explain the observed responses, and we propose an alternative model in which V4 neurons preferentially encode salient stimuli in crowded displays. Overall, we conclude that the magnitude of crowding effects is determined not just by the number of distractors and target–distractor separation but also by the relative salience of targets versus distractors based on their feature attributes: the similarity among distractors and the contrast between target and distractor stimuli.
Significance Statement
Psychophysicists have long studied the phenomenon of visual crowding, but the underlying neural mechanisms are unknown. Our results reveal striking correlations between the responses of neurons in mid-level visual cortical area V4 and psychophysical demonstrations, showing that crowding is influenced not only by the number and spatial arrangement of distractors but also by the similarity of features between target and distractors, as well as among the distractors themselves. Overall, our studies provide strong evidence that the visual system uses strategies to preferentially encode salient features in a visual scene, presumably to process visual information efficiently. When multiple nearby stimuli are equally salient, the phenomenon of crowding ensues.
Footnotes
The authors declare no competing financial interests.
The authors are grateful to Rohit Kamath, Dr. Anjani Chakrala, and Dr. Dina Popovkina for helpful discussions and comments on the manuscript, and to Amber Fyall for assistance with animal training. This work was supported by NEI Grant R01 EY018839 to A.P.; NEI Center Core Grant for Vision Research P30 EY01730 to the UW; and NIH/ORIP Grant P51 OD010425 to the WaNPRC.