Abstract
Here, we studied neural correlates of orientation-contrast-based saliency in the optic tectum (OT) of barn owls. Neural responses in the intermediate/deep layers of the OT were recorded from lightly anesthetized owls confronted with arrays of bars in which one bar (the target) was orthogonal to the remaining bars (the distractors). Responses to target bars were compared with responses to distractor bars in the receptive field (RF). Initially, no orientation-contrast sensitivity was observed. However, if the position of the target bar in the array was randomly shuffled across trials so that it occasionally appeared in the RF, then such sensitivity emerged. The effect started to become significant after three or four positional changes of the target bar and strengthened with additional trials. Our data further suggest that this effect arises due to specific adaptation to the stimulus in the RF combined with suppression from the surround. By jittering the position of the bar inside the RF across trials, we demonstrate that the adaptation has two components, one position specific and one orientation specific. The findings give rise to the hypothesis that barn owls, by active scanning of the scene, can induce adaptation of the tectal circuitry to the common orientation and thus achieve a “pop-out” of rare orientations. Such a model is consistent with several behavioral observations in owls and may be relevant to other visual features and species.
SIGNIFICANCE STATEMENT Natural scenes are often characterized by a dominant orientation, such as the scenery of a pine forest or the sand dunes in a windy desert. Therefore, an orientation that contrasts with the regularity of the scene is perceived as salient by many animals, which helps them to break camouflage. By actively moving the scene between each trial, we show here that neurons in the retinotopic map of the barn owl's optic tectum specifically adapt to the common orientation, giving rise to preferential representation of odd orientations. Based on this, we suggest a new mechanism for orientation-based camouflage breaking that links active scanning of scenes with neural adaptation. This mechanism may be relevant to pop-out in other species and visual features.
Introduction
For the rapid and efficient detection of food items in naturally cluttered environments, animals have evolved a variety of visual mechanisms for camouflage breaking. One such mechanism is the increased saliency of feature contrasts (Treisman, 1982; Wolfe, 1994; Mokeichev et al., 2010; Eckstein, 2011). Saliency based on orientation contrast is commonly tested by displaying visual scenes comprising separated bars oriented identically (the distractors) except for one bar that is oriented differently (the target). Several visual species from fish to birds and mammals have been shown to perceive the target as more salient compared with distractors (Nothdurft, 2002; Mokeichev et al., 2010; Harmening et al., 2011).
In humans, if the target differs from its distractors by a single feature, the time for target detection is largely independent of the number of distractors (Hochstein and Ahissar, 2002). In such cases, the target tends to “pop-out,” which is indicative of a rapid, parallel, pre-attentive process (Treisman and Gelade, 1980). The pop-out effect is abolished when the target differs from the distractors by a combination of more than one feature (conjunction search; Nakayama and Silverman, 1986). In such cases, the time to target increases with the number of distractors and the search is considered serial.
Barn owls are specialized for hunting small prey in dimly lit conditions and thus possess robust and efficient visual search mechanisms (Ohayon et al., 2006; Ohayon et al., 2008; Hazan et al., 2015). Recently, it was demonstrated that barn owls look at oddly oriented targets earlier, for longer periods, and more often compared with distractors (Harmening et al., 2011). Therefore, in barn owls, orientation contrasts are effective in enhancing perceived saliency. In a follow-up study (Orlowski et al., 2015), it was further demonstrated that the number of fixation changes to reach the odd target was not increased with the number of distractors. Therefore, it was suggested that oddly oriented targets pop-out for barn owls as for primates.
Here, we searched for neural correlates of orientation-contrast-based saliency in tectal neurons of the barn owl. The motivation to search for such selectivity in the optic tectum (OT) lies in the emerging hypothesis that the OT, like its mammalian homolog, the superior colliculus, is directly involved in the selection of the most salient target for the next focus of attention (Boehnke and Munoz, 2008; Mysore and Knudsen, 2011b; Shen et al., 2011; Dutta and Gutfreund, 2014). We presented orientation pop-out displays and compared the responses of neurons when the target was inside of their receptive field (RF) with the responses when a distractor was inside. Initially, tectal neurons did not show sensitivity to orientation contrasts. However, such sensitivity emerged if the neurons were first adapted by repeated presentations of a uniformly oriented scene. We further showed that this effect arises due to stimulus-specific adaptation (SSA) of tectal neurons to the position and orientation of the bar inside of their RF. We thus propose a new mechanism for computing orientation-contrast-based saliency that involves active scanning of the scene by the animal.
Materials and Methods
Animals.
Five adult barn owls (Tyto alba) of both sexes were used in this study. The owls were hatched and raised in captivity and housed in aviaries equipped with perching spots and brooding boxes. All procedures were in accordance with the guidelines of and approved by the Technion Institutional Animal Care and Use Committee. All surgical procedures were performed under isoflurane anesthesia and, in all recording sessions, the animals were sedated with a mixture of oxygen and nitrous oxide. During recording sessions, no painful procedures were performed.
Electrophysiological procedures.
The owls were prepared for repeated electrophysiological recording sessions in a single surgical procedure. First, they were anesthetized with 2% isoflurane in a 4:5 mixture of nitrous oxide and oxygen. Lidocaine (lidocaine HCl 2% and epinephrine) was injected locally at the incision site. A craniotomy of 1 cm diameter was performed 0.6 cm lateral to the midline and 1.7 cm anterior from the anterior tip of attachment of the dorsal neck muscles to the skull, and then a recording chamber was cemented to the skull (Unifast dental cement mixed with VetBond tissue adhesive) over the craniotomy. The chamber was filled with chloramphenicol ointment (5%) and sealed with a cap. The bird was allowed to recover overnight and returned to its aviary.
At the beginning of each electrophysiological session, the owl was anesthetized briefly with isoflurane (2%) and nitrous oxide in oxygen (4:5). Once anesthetized, the owl was wrapped in a holding jacket and positioned in a stereotaxic apparatus inside a double-walled, sound-attenuating booth (internal size 2.05 × 1.7 × 1.95 m). The head was bolted to the apparatus after alignment of the visual axis using retinal landmarks (as described in Reches and Gutfreund, 2008). After the bird was fixed, isoflurane was removed and the bird was maintained on a steady mixture of nitrous oxide and oxygen (4:5). The cover of the recording chamber was removed and a tungsten, epoxy-coated electrode (0.5–1.5 MΩ; Alpha-Omega) was driven using a motorized manipulator. Because eye movements in barn owls are limited to a range that is smaller than ±3° (du Lac and Knudsen, 1990), we did not immobilize or control for eye movements. The recorded electrical signal was amplified, digitized (16 bit ADC, 44,642 Hz sampling rate), filtered (313 Hz–5000 Hz), and stored using the AlphaLab SnR system (Alpha Omega). In each experiment, a threshold was set online to select the larger units in the recording sites, isolating action potentials from a small cluster of neurons (multiunit recording). At the end of each recording session, the recording chamber was treated with chloramphenicol ointment (5%) and closed. The owl was then returned to its home flying cage.
Identification of the location of the recording site was based on stereotaxic coordinates and on the expected physiological properties: the OT was recognized by characteristic bursting activity and spatially restricted visual and auditory RFs. Position within the OT was determined based on the location of the visual RF. The intermediate layers of the OT were located beneath the bursty layers and identified based on a transition from bursty activity to regular firing (Knudsen, 1982; Netser et al., 2010). Once it reached the intermediate layers, the electrode was advanced in small steps to search for sites with clear units and visual responses. Several recording sites (10–20) were collected in each experimental day along multiple electrode penetrations. Recording sites were separated by at least 300 μm. All recording sites were from the anterior part of the OT having visual RFs between left and right 20° and up and down 20° relative to the center of the visual field.
Visual stimulation.
Computer-generated visual displays were projected (XD400U projector; Mitsubishi; 72 Hz refresh rate) on a large, calibrated white screen positioned 1.5 m in front of the owl. The size of the projected area was 150 × 115 cm, corresponding to 53° × 42°. Stimuli consisted either of bars (1° width and 4° length) of 0°, 90°, 45°, or −45° orientations relative to the horizontal plane, or of dots (1.5° radius). Stimuli were generated and presented by custom-written code with Psychtoolbox running in MATLAB (Brainard, 1997). All stimuli were black against a gray background (luminance of background 17 cd/m²). First, the location of the RF was identified by manually moving the dot on the screen and then a detailed map was constructed around the localized area with the dot presented for 200 ms at locations 4° apart. To test orientation-contrast responses, bars were presented inside the RF together with an array of bars outside of the RF. Bars in the array were spaced by 10° along the azimuth and 9° in elevation. However, to break the regularity of the spatial pattern, a small variability of up to 0.5° was introduced in the positioning of bars (see insets in Fig. 1). In separate experiments, orientation-contrast response was tested while changing positions of the bars in every trial (jitter experiments). The bars were presented inside a 4° × 4° bounding box. The bounding area was divided into four nonoverlapping sections (4° × 1° each) along its length and height. In each trial, the bar was randomly positioned in one of the four sections (illustrated in Figs. 6A, 7A). This protocol was designed to ensure that each pixel inside the box had the same probability to be activated regardless of bar orientation. All bars in the array were jittered congruently to maintain the same distance between adjacent bars.
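The equal-probability property of the jitter protocol can be checked with a short calculation. The following is only a minimal sketch, not the stimulus code used in the study: it treats the 4° × 4° box as a grid of 1° × 1° cells, covers only the 0° and 90° bars, and assumes that the section is drawn uniformly on each trial. The names `coverage`, `BOX_DEG`, and `N_SECTIONS` are hypothetical.

```python
from fractions import Fraction

BOX_DEG = 4       # side of the bounding box in degrees (from the text)
N_SECTIONS = 4    # four nonoverlapping 4 x 1 degree sections (from the text)

def coverage(orientation_deg):
    """Probability that each 1 x 1 degree cell of the box is covered by the bar.

    Assumption: a 0-degree (horizontal) bar fills one of the four rows and a
    90-degree (vertical) bar fills one of the four columns, each section
    being chosen with probability 1/4.
    """
    p_section = Fraction(1, N_SECTIONS)
    grid = [[Fraction(0)] * BOX_DEG for _ in range(BOX_DEG)]
    for section in range(N_SECTIONS):
        for row in range(BOX_DEG):
            for col in range(BOX_DEG):
                if orientation_deg == 0:
                    covered = (row == section)
                else:
                    covered = (col == section)
                if covered:
                    grid[row][col] += p_section
    return grid

horizontal = coverage(0)
vertical = coverage(90)
# Every cell is covered with probability 1/4 for both orientations.
print(horizontal == vertical, horizontal[0][0])
```

Under these assumptions the two coverage maps are identical, which is the sense in which the standard and deviant bars differ only in orientation and not in the retinal area they stimulate on average.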
To explore adaptation effects, an oddball paradigm was used with long sequences of 110 trials involving two mutually orthogonal oriented bars, one presented in 90% of trials (standard stimulus) and the other in 10% (deviant stimulus) in the RF. Each bar was presented for 400 ms with 1 s interstimulus interval. In the next set of trials, the roles of the bars were switched: the deviant stimulus was made standard and the standard stimulus was made deviant for an additional 110 trials (illustrated in Fig. 5A).
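As an illustration of the oddball paradigm, the sketch below builds two 110-trial blocks with the standard and deviant roles swapped. The text gives only the proportions (90% and 10%); the placement of deviants by a seeded random shuffle, and the function name `oddball_sequence`, are assumptions made for this example.

```python
import random

def oddball_sequence(standard, deviant, n_trials=110, p_deviant=0.10, seed=0):
    """Return a list of orientations with the requested share of deviants.

    Assumption: deviants are placed by a uniform random shuffle. The actual
    placement rule used in the experiments is not specified in the text.
    """
    n_deviant = round(n_trials * p_deviant)
    sequence = [deviant] * n_deviant + [standard] * (n_trials - n_deviant)
    random.Random(seed).shuffle(sequence)
    return sequence

# First block: 0 degrees is the standard and 90 degrees the deviant.
block_1 = oddball_sequence(standard=0, deviant=90, seed=0)
# Second block: the roles are switched.
block_2 = oddball_sequence(standard=90, deviant=0, seed=1)
print(block_1.count(90), block_2.count(0))
```

With 110 trials and a 10% deviant rate, each block contains 11 deviants and 99 standards.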
Data analysis.
Neural responses to a visual stimulus were quantified as the number of spikes in a given time window after stimulus onset minus the number of spikes during the same amount of time immediately before stimulus onset. The time window for spike count was equal to the stimulus duration plus 200 ms starting from the onset of stimulation. To observe the time course of the response, we generated peristimulus time histograms (PSTHs) with 50 ms bins. PSTHs were normalized to the maximum response in each experiment and averaged across the population of recorded sites. SEMs are depicted as the width of the PSTH curves. To further quantify the sensitivity to orientation contrasts, we calculated the stimulus modulation index (MI) (Lee et al., 2002) as follows: MI = (Rcontrast − Runiform)/(Rcontrast + Runiform), where Rcontrast is the average response to the bar in the RF when it is different from the surrounding elements and Runiform is the average response to the same bar in the RF when it is identical to the surrounding elements. Likewise, SSA was quantified in the oddball paradigm by calculating the stimulus index (SI) (Ulanovsky et al., 2003; Reches and Gutfreund, 2008), defined as follows: SI = (Rdev − Rstandard)/(Rdev + Rstandard), where Rdev is the response to an oriented bar in the RF when it is deviant and Rstandard is the response to the same oriented bar when it is standard. Positive values of these indices indicate a stronger response to a stimulus that contrasts with its background. For each site, two MIs or two SIs were calculated corresponding to two mutually orthogonal orientations (0° vs 90° and 45° vs −45°). The indices of one orientation were plotted against the indices of its orthogonal orientation. A sign test was used to address the population distribution of the indices. A population of points that was distributed significantly above the center diagonal was considered to exhibit significant orientation contrast or SSA.
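The response measure and the two indices can be written compactly. The sketch below is illustrative only: the function names are hypothetical, spike times are assumed to be in milliseconds relative to stimulus onset at t = 0, and the handling of a zero denominator is an assumption that the text does not address.

```python
def response(spike_times_ms, stim_duration_ms, extra_ms=200.0):
    """Baseline-subtracted spike count.

    The counting window is the stimulus duration plus 200 ms from onset; the
    baseline is the same amount of time immediately before onset.
    """
    window = stim_duration_ms + extra_ms
    evoked = sum(1 for t in spike_times_ms if 0.0 <= t < window)
    baseline = sum(1 for t in spike_times_ms if -window <= t < 0.0)
    return evoked - baseline

def contrast_index(r_odd, r_common):
    """(R_odd - R_common) / (R_odd + R_common).

    Used as MI with (R_contrast, R_uniform) and as SI with (R_dev, R_standard).
    Returning NaN for a zero denominator is an assumption of this sketch.
    """
    denominator = r_odd + r_common
    if denominator == 0:
        return float("nan")
    return (r_odd - r_common) / denominator

# Made-up spike times for one 400 ms trial (window = 600 ms).
spikes = [-350.0, -20.0, 15.0, 40.0, 120.0, 300.0, 580.0, 700.0]
trial_response = response(spikes, stim_duration_ms=400.0)  # 5 evoked - 2 baseline
mi = contrast_index(30.0, 10.0)
print(trial_response, mi)  # 3 0.5
```

Because the same expression serves for both MI and SI, a single helper is used here; only the pair of responses passed to it differs.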
To quantify the tendency of a site to respond to a contrasting or rare stimulus independently of its orientation, we calculated the neural index (NI), defined as follows: NI = [(Rcontrast′ − Runiform′) + (Rcontrast′′ − Runiform′′)]/(Rcontrast′ + Runiform′ + Rcontrast′′ + Runiform′′), where R′ and R′′ are the average responses to one orientation and to its orthogonal, respectively. To avoid first stimulus effects, the first 10 stimuli in each block were omitted from the analysis.
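A worked example of the NI may help. The sketch below assumes that the numerator is the sum of the two response differences (one per orientation) and the denominator is the sum of all four responses; the function name and the numbers are hypothetical.

```python
def neural_index(r_contrast_a, r_uniform_a, r_contrast_b, r_uniform_b):
    """Orientation-independent index pooled over two orthogonal orientations.

    Assumption: both differences are summed before dividing by the total,
    so that NI equals 0 when contrasting and uniform responses are equal.
    """
    numerator = (r_contrast_a - r_uniform_a) + (r_contrast_b - r_uniform_b)
    denominator = r_contrast_a + r_uniform_a + r_contrast_b + r_uniform_b
    return numerator / denominator

# Made-up responses (spikes/s) for one orientation (a) and its orthogonal (b).
ni = neural_index(20.0, 10.0, 30.0, 20.0)
print(ni)  # (10 + 10) / 80 = 0.25
```

For nonnegative responses this form is bounded between −1 and 1, like the MI and SI.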
Results
In this study, we analyzed responses from 322 recording sites in the OT. All recording sites were in intermediate/deep layers of the OT (Knudsen, 1982; Netser et al., 2010).
Responses to orientation-contrasting stimuli
Initially, we recorded responses of tectal neurons to horizontal and vertical bars in the RF under three conditions: displayed alone on the screen (singleton), embedded in an array of bars similar to the bar in the RF (uniform), and embedded in an array of bars oriented perpendicular to the bar in the RF (contrasting). Typically, the neurons responded vigorously in the singleton conditions and much less so in the uniform and contrasting conditions (Fig. 1A–F). Importantly, on average, no significant difference was observed between the responses to the uniform and contrasting conditions (Fig. 1G,H; pairwise t test, n = 21; for 90° p = 0.9; for 0° p = 0.59). The mean spike counts per second (±SEM) in the singleton, contrasting, and uniform conditions for the horizontal bar were 82.6 ± 11.0, 15.1 ± 3.56, and 14.7 ± 3.85, respectively, and for the vertical bar were 78.5 ± 10.065, 16.8 ± 4.23, and 17.9 ± 4.05, respectively. Therefore, tectal neurons demonstrated strong suppression of the response from the surrounding stimuli but no sensitivity to orientation contrast between the RF and the surround. This finding agrees with our previous study, which failed to identify orientation-contrast sensitivity in tectal neurons (Zahar et al., 2012).
The situation tested so far corresponds to passive viewing of a static target. However, in freely viewing conditions, owls actively scan their surroundings, moving their point of gaze from one feature to another (Harmening et al., 2011). We therefore tested tectal neurons in a paradigm that emulated active scanning of the orientation array by randomly changing the target position in the array in each trial, making sure that it appeared inside the RF only once in each block of 10 trials, in trial 8, 9, or 10 (see Fig. 2A for an illustration of the paradigm). This paradigm, which contained 110 trials, was repeated four times; in each, a different orientation of the oddly oriented bar (0°, 90°, 45°, or −45°) served as the target. An example from a single recording site is shown in Figure 2B. The raster plots and corresponding PSTHs in the top row show the responses to trials in which the target bar was inside the RF, compared with the responses to trials in which a distractor was in the RF (bottom row). In these conditions, the site clearly responded stronger when the targets were inside the RF compared with when the distractors of the same orientation were in the RF. Therefore, by causing the target to appear less frequently in the RF compared with distractors, the site was made sensitive to contrasts between the stimulus and the surround.
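The trial schedule of this paradigm can be sketched as follows. This is a hedged illustration rather than the original stimulus code: the function name is hypothetical, the choice among trials 8, 9, and 10 is assumed to be uniform, and only the question of whether the target falls inside the RF is modeled, not its position elsewhere in the array.

```python
import random

def target_in_rf_schedule(n_trials=110, block_len=10, late_slots=(8, 9, 10), seed=0):
    """Return one boolean per trial: True when the target is inside the RF.

    The target enters the RF exactly once per block, in one of the late
    slots (1-based trial numbers within the block).
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_trials // block_len):
        hit = rng.choice(late_slots)
        schedule.extend(trial == hit for trial in range(1, block_len + 1))
    return schedule

schedule = target_in_rf_schedule()
print(len(schedule), sum(schedule))  # 110 trials, 11 with the target in the RF
```

Restricting the target to the last trials of each block guarantees that it is always preceded by at least seven consecutive trials with a distractor in the RF.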
Figure 3, A–C, shows a summary of the results from three experimental conditions. In one, the target was made to appear in the RF every second trial; that is, at an equal rate for a target or one of its orthogonal distractors in the RF. In the second, the target was made to appear inside the RF once in every four trials, randomly fluctuating between the third and fourth trial. In the third, similar to the paradigm illustrated in Figure 2A, the target was in the RF once in every 10 trials in trial 8, 9, or 10. In the first condition, as expected, the responses to targets versus distractors were not significantly different (Fig. 3A; pairwise t test; n = 41; for 0°, p = 0.13; for 90°, p = 0.86; for 45°, p = 0.6; for −45°, p = 0.23). However, when the rate of appearance of the target in the RF was decreased to one in every four trials (Fig. 3B), significant orientation-contrast sensitivity was noted (pairwise t test for 0° and 90°; n = 59; p = 0.27 × 10⁻²; p = 4.6 × 10⁻⁶, respectively; pairwise t test for 45° and −45°; n = 37; p = 4.83 × 10⁻⁴, p = 1.23 × 10⁻⁵, respectively). This effect was further increased when the rate of appearance of target in RF was reduced to 1 in every 10 trials (Fig. 3C; pairwise t test for 0° and 90°; n = 50; p = 3.56 × 10⁻¹⁰, p = 3.76 × 10⁻¹⁰, respectively; pairwise t test for 45° and −45°; n = 23; p = 8.31 × 10⁻⁵, p = 3.91 × 10⁻⁵, respectively). The right column of Figure 3 shows the corresponding scatterplots of the modulation indices (see Materials and Methods). The distributions of the dots in Figure 3, B and C, were significantly biased above the diagonal (sign test, p = 1.86 × 10⁻⁹ and p = 1.51 × 10⁻¹⁹, respectively), but not in Figure 3A (p = 0.22).
The mean distance of the points from the diagonal was larger in 1:10 presentations compared with 1:4 presentations (0.54 and 0.25, respectively), indicating that the tendency to respond more strongly to contrasting targets is intensified with lesser frequency of appearance of the target in the RF.
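The two population summaries used here, the sign test and the mean distance from the diagonal, can be sketched as below. Two assumptions are made that the text does not state explicitly: the "center diagonal" is taken to be the line y = −x, so that a point lies above it when the two indices sum to a positive value, and the sign test is taken to be a two-sided exact binomial test with ties excluded. The function names and the index pairs are hypothetical.

```python
import math

def sign_test_p(n_positive, n_negative):
    """Two-sided exact sign test (binomial with p = 0.5), ties already removed."""
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

def diagonal_stats(points):
    """Mean signed distance from the line y = -x and the sign-test p value."""
    distances = [(x + y) / math.sqrt(2.0) for x, y in points]
    n_positive = sum(1 for d in distances if d > 0)
    n_negative = sum(1 for d in distances if d < 0)
    mean_distance = sum(distances) / len(distances)
    return mean_distance, sign_test_p(n_positive, n_negative)

# Made-up pairs (index for one orientation, index for its orthogonal).
pairs = [(0.3, 0.2), (0.1, 0.4), (0.5, 0.1), (0.2, 0.2),
         (-0.1, 0.3), (0.4, 0.0), (0.2, 0.1), (0.3, 0.3)]
mean_distance, p_value = diagonal_stats(pairs)
print(mean_distance > 0, p_value)  # all 8 points above: p = 2 / 2**8
```

Using the signed distance rather than its absolute value means that points below the diagonal pull the mean toward zero, so the mean reflects the direction of the bias as well as its size.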
Freely behaving owls engaged in visual search behaviors tend to maintain fixation for ∼2 s before rapidly shifting gaze to a different fixation point (Harmening et al., 2011; Hazan et al., 2015). Therefore, to encompass such long fixations in our experimental paradigm, in several experiments, we increased the duration of the stimulus on the screen to 1.9 s, which was followed by a short gap of 100 ms before the next stimulus in the sequence appeared (Fig. 3D). Apart from the timing differences, the stimulus sequence was identical to the sequence used in the experiments shown in Figure 3B; that is, the target appeared in the RF once in every four trials in trial 3 or trial 4. This paradigm, which takes into account the longer fixations in natural scanning behaviors, showed a similar result; that is, the average population responses to all four targets were significantly larger compared with the corresponding distractors (pairwise t test, n = 21; for 0°, p = 1.53 × 10⁻⁴; for 90°, p = 5.65 × 10⁻⁴; for 45°, p = 6.5 × 10⁻⁴; for −45°, p = 4.0 × 10⁻²). The sign test of the distribution of MIs showed a significant bias above the diagonal line (p = 5.65 × 10⁻⁸; n = 42).
The above findings show that orientation-contrast sensitivity can emerge as a history-dependent phenomenon. Therefore, it should be possible to manipulate the neural responses to oddly oriented targets by controlling the history of stimulation. To explore this, we presented a uniform array of bars for 110 trials. In every 10th trial, the orientation of the distractors was changed by 90° while the orientation of the bar in the RF remained the same. Therefore, once every 10 trials, a “pop-out” display was created. For comparison, we performed another test in the same site in which the orientation of the distractors was maintained fixed throughout the sequence while every 10 trials the orientation of the target in the RF was changed by 90° to create the same “pop-out” display, but following a different history of presentation (see Fig. 4A,B for an illustration of these paradigms). The results from the single site example show a dramatic difference between the responses to the target in the two different contexts (cf. Fig. 4C,D with E,F). The difference can also be seen in the population average PSTHs. When the orientation of the distractors was different relative to their history (Fig. 4G), the response to the target embedded in orthogonal distractors was, on average, below the response to the same target embedded in parallel distractors (pairwise t test, n = 17; p = 0.0015), exposing suppression that began ∼150 ms after the onset of stimulation. Conversely, when it was the orientation of the target in the RF that was different from its past orientation (Fig. 4H), the same neurons responded to the same “pop-out” displays with an average increase in firing rate (pairwise t test, n = 27; p = 2.91 × 10⁻⁶). Therefore, the response to orientation contrasts in tectal neurons is highly history dependent. The neurons can respond either by increased firing rate or by suppression depending on whether it is the background that is different from the past or the target.
SSA
The responses to the target bars in the 1:10 condition (Fig. 3C) were generally larger than the responses to the targets in the 1:4 condition (Fig. 3B), which were generally larger than the responses in the 1:2 condition (Fig. 3A). This ordering suggests an adaptation phenomenon; that is, the more often the neurons are exposed to the target, the lower the response becomes. To explore adaptation, we resorted to singleton stimuli in our next experiment. We used an oddball paradigm to study SSA (see Materials and Methods and Fig. 5A). In the example shown in Figure 5B, all four orientations, when deviant, exhibited clear increases in firing rates. However, when used as standards, they did not elicit noticeable responses, indicating a strong adaptation that is specific to the standard stimulus. The same tendency for SSA was reflected in the population responses (Fig. 5C); for all four orientations used, the population PSTH to deviants was noticeably above the PSTH to the standards. Therefore, most neurons in the intermediate/deep layers of the OT adapt specifically to the standard stimulus, thus leading to stronger responses to deviants. The SSA was rapid and already fully developed at the first deviant stimulus in the sequence, following an initial adaptation of nine trials (Fig. 5D). The average SI (±SEM) for the first and last appearances of the deviant was 0.32 ± 0.0398 and 0.42 ± 0.0395, respectively. The SIs for the first deviant were not significantly different from the SIs for the last deviant (pairwise t test; n = 146; p = 0.06).
In the previous experiments, in each trial, the bar was presented at the same position on the screen and was thus projected onto the same retinal position. Therefore, a rarely oriented bar is expected to activate areas on the retina that have been less activated in the past and the observed SSA effect may be an adaptation to the specific position on the retina, the specific orientation of the bar, or both. To explore this, we repeated the oddball experiments, this time with a positional jitter of the bar inside a 4° × 4° bounding box covering the center of the RF (illustrated in Fig. 6A; see Materials and Methods). This setting made sure that the only statistical difference between the standard and deviant stimulus was the orientation of the bar. Randomly varying the position of the bars inside the 4° × 4° bounding box did not eliminate the principal finding. For all orientations, the population PSTH to the odd target was above the response to the uniform target (Fig. 6B). Figure 6C shows a summary of the results from the two experiments, one without jitter (upper row) and a second with the jitter (lower row). Both experiments gave rise to qualitatively similar results: significantly stronger responses to deviant targets (pairwise t test with jitter, n = 37, for 0°, p = 8.94 × 10⁻⁶; for 90°, p = 9.37 × 10⁻⁹; for 45°, p = 9.88 × 10⁻⁴; for −45°, p = 8.62 × 10⁻⁵; pairwise t test without jitter, n = 24; for 0°, p = 2.15 × 10⁻⁵; for 90°, p = 1.26 × 10⁻⁵; for 45°, p = 3.06 × 10⁻⁴; for −45°, p = 4.0 × 10⁻⁶). Without jitter, the SSA effect was stronger: the mean distance of the points in the SIs scatterplot from the diagonal line was 0.53 compared with 0.23 with jitter (Fig. 6C). Jittering the stimulus position across trials also gave an overall stronger average response (60 spikes/s in jitter conditions compared with 28 spikes/s in no-jitter conditions), indicating that adaptation in general tends to be less pronounced in jitter conditions.
The distribution of SIs above the diagonal line was significant for both the no-jitter and jitter conditions (sign test: p = 8.36 × 10⁻¹², n = 48; p = 2.63 × 10⁻¹³, n = 74). Therefore, we conclude that visual SSA in tectal neurons has a positional element as well as an orientation element. The SSA in the jitter conditions, as in the no-jitter conditions, was fully developed during the presentation of the first odd stimulus (Fig. 6D). The average SI (±SEM) for the first and last appearances of the deviant was 0.14 ± 0.034 and 0.2 ± 0.037, respectively. The SIs for the first deviant were not significantly different from the SIs for the last deviant (pairwise t test; n = 162; p = 0.2).
Effects of the surround
The SSA effect in the RF can account for the history-dependent sensitivity to the contrasting orientations described above. By moving the display from one point to another in the scene, the neurons become adapted to the common orientation. Because the adaptation is stimulus specific, this leaves the neurons free to respond more strongly to the contrasting orientation. The results in Figure 4 suggest that inhibition from the surround is also vulnerable to adaptation. Therefore, we next studied the effects of the surrounding elements by recording alternating blocks of singleton (no surrounding elements) and full array (with surrounding elements) conditions. Results were compared between singleton and full array for both jitter (Fig. 7B) and non-jitter (Fig. 7C) protocols. In both protocols, the odd target consistently induced stronger responses (pairwise t test; p < 0.01). However, in the full array, the average responses were reduced ∼3-fold compared with the singleton condition (Fig. 7B,C). Here again, the jitter reduced but did not eliminate the SSA effect for the full array and for the singleton condition (the average distance of the dots from the diagonal line was 0.76 ± 0.04 in the no-jitter condition and 0.21 ± 0.04 in the jitter condition).
To address how the elements in the surround affect the SSA of the stimulus in the RF, we calculated the neural indices with and without surrounding elements and compared their distributions. The neural indices of all four conditions were significantly shifted above zero (Fig. 7E; sign test; singleton no-jitter: n = 27, p = 1.49 × 10⁻⁸; full array no-jitter: n = 27, p = 1.49 × 10⁻⁸; singleton with jitter: n = 78, p = 1.85 × 10⁻¹⁵; full array with jitter: n = 78, p = 1.0 × 10⁻⁹). Interestingly, in the no-jitter conditions, the distribution of the full array NIs was significantly shifted toward positive values compared with the singleton conditions (Fig. 7E1; mean ± SD = 0.44 ± 0.14 in the singleton condition compared with 0.7249 ± 0.338 in the array condition, pairwise t test, n = 27, p = 1.35 × 10⁻⁸). When jittering the display, the distribution of the NIs for the full array condition was slightly shifted to positive values compared with the singleton condition, but not significantly (Fig. 7E2; mean ± SD = 0.166 ± 0.10 in the singleton condition compared with 0.2 ± 0.24 in the array condition, pairwise t test, n = 78, p = 0.32). Therefore, the surrounding elements produced larger SSA in the RF; however, this effect was reduced with jittering.
An additional difference between the singleton and the full array conditions is exposed by comparing the responses to the first stimulus in each block with the responses to the odd target. In this analysis, we collapsed together all of the trials performed for each condition independently of the orientation of the bar. For each experiment, we extracted the average responses to the first stimulus in the block, the stimulus preceding an odd target in the RF, the odd target in the RF, the average response to the first stimulus after the target, and the average response to the second stimulus after the target. In the singleton no-jitter condition (Fig. 8A), the average response to the target was significantly reduced compared with the first stimulus, but significantly larger compared with the rest of the stimuli (pairwise t test, p < 0.05). The pattern revealed in Figure 8A is a typical SSA pattern observed previously for auditory stimuli in the OT (Reches and Gutfreund, 2008). However, in the full array conditions, the response to the odd target is significantly larger compared with the first stimulus in the block (Fig. 8B; pairwise t test, n = 54; p = 4.5 × 10⁻¹³). Therefore, with surrounding elements, the adaptation has a facilitatory effect; that is, the response to the target is enhanced compared with the nonadapted state. Likewise, in the jitter condition, when singleton (Fig. 8C), the response to the target was significantly reduced compared with the nonadapted response (pairwise t test, n = 162; p = 9.6 × 10⁻¹¹). However, when a full array was displayed (Fig. 8D), the response to the target was not significantly different compared with the nonadapted response (p = 0.95). Therefore, the same trend is observed with jittering, but the facilitatory effect of the surround is reduced.
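The extraction of responses at fixed positions relative to the odd target can be sketched as follows. The function name, the input format (one boolean and one response value per trial), and the rule of skipping neighbors that are themselves odd targets are assumptions of this example, and the numbers are made up.

```python
def responses_around_target(is_target, responses):
    """Mean response at fixed positions relative to each odd-target trial.

    Returns the response to the first stimulus in the block together with
    the means for the trial preceding the target, the target itself, and
    the first and second trials after it.
    """
    offsets = {"preceding": -1, "target": 0, "after_1": 1, "after_2": 2}
    buckets = {name: [] for name in offsets}
    for i, flag in enumerate(is_target):
        if not flag:
            continue
        for name, offset in offsets.items():
            j = i + offset
            # Skip neighbors outside the block or that are odd targets themselves.
            if 0 <= j < len(responses) and (offset == 0 or not is_target[j]):
                buckets[name].append(responses[j])
    means = {name: sum(v) / len(v) for name, v in buckets.items() if v}
    means["first"] = responses[0]
    return means

# Made-up block of 10 trials with odd targets at trials 4 and 8 (1-based).
flags = [False, False, False, True, False, False, False, True, False, False]
resp = [10.0, 4.0, 3.0, 8.0, 3.0, 2.0, 2.0, 7.0, 3.0, 2.0]
summary = responses_around_target(flags, resp)
print(summary)
```

In this made-up block the target response (7.5) lies below the first-stimulus response (10.0) but above the responses to the neighboring trials, which corresponds to the singleton pattern described for Figure 8A.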
Discussion
Orientation-contrast-based saliency is common in visual animals (Nothdurft, 2000; Mokeichev et al., 2010; Harmening et al., 2011), indicating its importance for ecological vision in cluttered environments. However, the neural mechanisms for detecting odd targets among distractors remain mostly obscure. The recent findings that barn owls perceive oddly oriented targets as more salient (Harmening et al., 2011; Orlowski et al., 2015) raise the question of which neural mechanisms underlie such perception in barn owls. Neurons in the intermediate/deep layers of the barn owl's OT respond to sensory stimuli in a highly context-dependent manner, responding more strongly to stimuli that are different from their past (Reches and Gutfreund, 2008; Netser et al., 2011), and are suppressed by competing stimuli outside of the RF (Mysore and Knudsen, 2011a; Mysore et al., 2011; Zahar et al., 2012). These findings support the hypothesis that the role of the OT is to detect and guide responses to salient targets (Knudsen, 2011; Dutta and Gutfreund, 2014) and, therefore, neural correlates of feature-contrast-based saliency are expected in the OT. Tectal neurons show strong sensitivity to motion contrasts (Frost and Nakayama, 1983; Zahar et al., 2012). However, in Zahar et al. (2012) as well as initially in this work, we did not find sensitivity to orientation contrasts in the OT. This may indicate that the computation and control of orientation-based saliency does not occur in the OT. However, the question then arises of why other types of exogenous saliency are so robustly represented in the OT whereas orientation-contrast-based saliency is not.
Here, we show that neural correlates of orientation-contrast-based saliency can emerge in tectal neurons of the barn owl as a history-dependent effect. Based on this, we suggest a new mechanism for computing orientation-contrast-based saliency that takes into account gaze shifts. We assume that, initially, all bars are equally salient and the choice of the first bar for a gaze shift is random. Because the orientation of the distractors is much more common, after the first saccade, most cells in the retinotopic map will experience a sequence of two similar orientations in their RFs, whereas only a few cells, those that are pointing to the odd targets, will experience a sequence of two differently oriented bars. Because of SSA, those latter cells will respond slightly more strongly and thus create a bias toward the representation of the oddly oriented targets. This bias is expected to increase with additional saccades. Therefore, via active vision, a computation in time (SSA) can be transformed into a computation over space, allowing the detection of oddly oriented targets among distractors. One attractive aspect of this mechanism is that, as in the pop-out effect, the search time or number of saccades to target is not expected to increase much as a function of the number of distractors.
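The proposed transformation of a computation in time into a computation over space can be illustrated with a toy simulation. The sketch below is not the analysis used in this study; it is a minimal illustration under assumptions of our own choosing: a wrap-around 5 × 5 array with one odd bar, one model cell per retinotopic position, an orientation-specific adaptation state per cell, and hypothetical parameter values (`adapt_rate`, `recovery`). The function name `simulate_scan` and all variable names are likewise hypothetical. Gaze positions are drawn without repetition so that no cell views the odd bar twice.

```python
import random

def simulate_scan(n_rows=5, n_cols=5, target=(2, 3), n_fixations=6,
                  adapt_rate=0.3, recovery=0.1, seed=0):
    """Toy model: orientation-specific adaptation across gaze shifts.

    Returns one (target_response, mean_distractor_response) pair per fixation.
    All parameter values are illustrative assumptions, not fitted to data.
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    # adaptation[cell][orientation]: 0.0 = unadapted, 1.0 = fully adapted.
    # Orientation 0 is the common (distractor) one, 1 is the odd (target) one.
    adaptation = {cell: [0.0, 0.0] for cell in cells}
    # Gaze positions are sampled without repetition (simplifying assumption).
    fixations = rng.sample(cells, n_fixations)
    history = []
    for gaze in fixations:
        target_resp = None
        distractor_resps = []
        for cell in cells:
            # Array position currently inside this cell's RF (array wraps around).
            world = ((gaze[0] + cell[0]) % n_rows, (gaze[1] + cell[1]) % n_cols)
            seen = 1 if world == target else 0
            state = adaptation[cell]
            response = 1.0 - state[seen]  # adapted orientations drive the cell less
            if seen == 1:
                target_resp = response
            else:
                distractor_resps.append(response)
            # Adapt to the orientation just seen; partially recover for the other.
            state[seen] += adapt_rate * (1.0 - state[seen])
            state[1 - seen] *= 1.0 - recovery
        history.append((target_resp, sum(distractor_resps) / len(distractor_resps)))
    return history

for i, (t, d) in enumerate(simulate_scan()):
    print(f"fixation {i}: target={t:.3f} mean distractor={d:.3f}")
```

In this sketch, the first fixation yields identical responses to the odd and common bars, and the difference between them grows with each additional gaze shift because only the responses to the common orientation adapt. Lateral inhibition from the surround is deliberately left out.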
In principle, for the above mechanism to work, SSA to the orientation of the bar in the RF is sufficient. This raises the question of to what extent the SSA recorded here is truly an orientation-specific adaptation. To address this, we applied a jitter protocol to eliminate the effects of location-specific adaptation. The significant reduction of the SSA effect when the display was jittered, together with the fact that a significant effect remained, suggests that two types of adaptation are involved, one location specific and one orientation specific. SSA to the location of a visual stimulus was previously shown in the OT of barn owls (Reches and Gutfreund, 2008); however, SSA to orientation was not. Interestingly, tectal neurons are mostly untuned to stimulus orientation (Knudsen, 1982; Zahar et al., 2012). SSA to orientation, therefore, may arise from pathway-specific adaptation in top-down connections from the visual Wulst, a forebrain visual area where orientation-selective units are abundant (Pettigrew and Konishi, 1976; Baron et al., 2007).
Given that SSA alone can explain enhanced responses to orientation contrasts, it is interesting to consider what the surrounding elements contribute (if anything) to shaping the responses to the stimulus in the RF. Some observations, made when the full array of elements was displayed, were not predicted by the responses to the singleton stimuli: the overall reduction in the responses (Fig. 7B,C), the larger neural indices (Fig. 7E), and the facilitation of the response compared with the unadapted state (Fig. 8). These effects may be caused by long-range lateral inhibition. Adaptation to the surrounding elements is expected to reduce the strength of lateral inhibition, changing the balance in favor of excitation and thus giving rise to facilitated responses in the RF (Solomon and Kohn, 2014). The lateral inhibition induced by the surrounding elements is expected to reduce responses to both common and odd stimuli proportionally, bringing both closer to the firing threshold. The contrast between the two may then be magnified through an “iceberg effect” of thresholding the responses (Nelken, 2014). This possibility can explain the larger neural contrasts between the odd and the common stimuli (NIs) observed in the presence of the surrounding elements and may also explain why, when jittering the display, the NIs were not enhanced as much. In jitter conditions, the difference between the odd and common responses became smaller and the responses were less adapted, so the iceberg effect was not as pronounced. A network of lateral inhibition that can account for the above observations has been described in detail in the OT. Discharges of neurons in the intermediate layers of the OT are fed to the tectal-isthmi loop, which generates long-range divisive inhibition over the whole ipsilateral tectum, sparing only the activating tectal site (Mysore et al., 2010; Lai et al., 2011; Mysore and Knudsen, 2013).
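The arithmetic behind the iceberg effect can be made concrete with a small worked example. The sketch below is illustrative only: the input drives, the threshold, and the divisive gain are hypothetical numbers, and the contrast index (odd − common)/(odd + common) is an assumed form that may differ from the NI computed in this study. The function names are ours.

```python
def thresholded_rate(drive, gain, threshold):
    """Output rate after divisive scaling (gain <= 1) and a firing threshold."""
    return max(0.0, gain * drive - threshold)

def contrast_index(odd, common):
    """Normalized contrast between odd and common responses (assumed form)."""
    total = odd + common
    return (odd - common) / total if total > 0 else 0.0

# Hypothetical input drives to the odd and the common stimulus, and threshold.
ODD_DRIVE, COMMON_DRIVE, THRESHOLD = 10.0, 8.0, 3.0

def index_for_gain(gain):
    """Contrast index when both drives are scaled by the same divisive gain."""
    odd = thresholded_rate(ODD_DRIVE, gain, THRESHOLD)
    common = thresholded_rate(COMMON_DRIVE, gain, THRESHOLD)
    return contrast_index(odd, common)

no_surround = index_for_gain(1.0)    # rates 7 and 5 -> index 2/12
with_surround = index_for_gain(0.5)  # rates 2 and 1 -> index 1/3

print(round(no_surround, 3), round(with_surround, 3))  # 0.167 0.333
```

Halving both drives leaves their ratio unchanged before thresholding, yet the index after thresholding doubles, because the same threshold removes a larger share of the smaller response.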
Relevant to this, jittering both the singleton and array displays gave rise to an overall increase in responses (Figs. 6C, 7B,C). It can be argued, then, that a similar iceberg effect, rather than an additional SSA to position as we interpreted, accounts for the larger SSA observed in the no-jitter compared with the jitter conditions. However, when the whole array was jittered across trials, overall responses were relatively small due to the effects from the surrounding elements (Fig. 7B). If the strength of SSA were a simple reflection of overall spike rates, then we would expect a strong SSA in this condition, but the SSA was relatively small (Fig. 7E1). Therefore, it seems that the effects of jittering the display on adaptation go beyond mere spike reduction.
In humans, target detection can be very rapid and independent of the number of distractors (Treisman and Gormican, 1988). To account for this, most models for feature-contrast-based saliency assume parallel processing of static images (Stemmler et al., 1995; Itti and Koch, 2000; de Brecht and Saiki, 2006; Zehetleitner et al., 2008; Chikkerur et al., 2010). However, in natural visual search tasks, animals tend to move their gaze considerably before detecting targets of interest (Hayhoe and Ballard, 2005; Ohayon et al., 2008). Barn owls naturally scan the scene by abrupt head saccades (Ohayon et al., 2008; Zahar et al., 2009; Harmening et al., 2011). When focusing, barn owls tend to perform complex side-to-side head motions (Ohayon et al., 2006). Such motions have been suggested to play a role in depth perception (van der Willigen et al., 2002), but may also enhance adaptation of tectal neurons to common features in the scene. Recent studies examined the visual search of barn owls freely observing displays of oriented bars (Harmening et al., 2011). In many cases, the owls tended to switch gaze among distractors multiple times before hitting the target. The average time to reach the target was 7–11 s and the average number of saccades to target was 3–4 (Harmening et al., 2011). Note that, in these studies, the time and number of saccades to reach target were measured from the first fixation on the display array (and not from the stimulus onset) and therefore should be considered a lower estimate. These behavioral observations do not rule out rapid parallel processing for orientation contrasts in owls, but are consistent with the possibility, suggested here, that active vision can improve saliency mapping.
The mechanism suggested above, in principle, can be generalized to visual searches in other animals as well as other visual features. Fixation shifts are a central part of natural visual search (Yarbus, 1967; Tatler et al., 2010). When allowed to move the eyes during laboratory single-feature visual search, subjects sometimes shift fixations between distractors before landing on target (Findlay, 1997; Zelinsky and Sheinberg, 1997). For example, Becker and Ansorge (2013) reported that, in pop-out search conditions of 12 items, the percentage of trials in which the first fixation landed on target was ∼50% in color pop-out and ∼30% in size pop-out. This ratio decreased if the target was made more similar to distractors and vice versa. Therefore, gaze shifts may contribute to enhancing the detection of rare visual features, particularly in difficult conditions. The proposed mechanism, however, can only work if the relevant visual features undergo SSA. Indeed, SSA, defined as a difference between the response to a stimulus when it is rarely presented compared with when it is commonly presented (Ulanovsky et al., 2003), is common in the visual system (Kohn, 2007) and has been reported at the single-neuron level, as well as at the level of global signals (i.e., EEG and fMRI). It ranges from basic features such as location and color (Woods and Frost, 1977; Marlin et al., 1991; Miller et al., 1991; Czigler et al., 2002; Reches et al., 2010) to higher-level features from orientations to faces (Miller and Desimone, 1994; Carandini et al., 1998; Boynton and Finney, 2003; Amano et al., 2005; Kimura et al., 2011). Moreover, it was shown in human psychophysical experiments that adaptation to common features such as color and orientation affects the perceived saliency and improves performance in visual search tasks (McDermott et al., 2010; Wissig et al., 2013). 
Therefore, it is possible that similar mechanisms of adaptation contribute to spatial saliency mapping over a wide range of species.
Footnotes
This work was supported by the German–Israel Foundation (Y.G. and H.W.) and the Israel Science Foundation (Y.G.).
The authors declare no competing financial interests.
Correspondence should be addressed to Prof. Yoram Gutfreund, The Rappaport Faculty of Medicine, Technion, Bat-Galim, Haifa 31096, Israel. yoramg{at}tx.technion.ac.il