Electrophysiological correlates of common-onset visual masking

https://doi.org/10.1016/j.neuropsychologia.2007.02.023

Abstract

In common-onset visual masking (COVM) the target and the mask come into view simultaneously. Masking occurs when the mask remains on the screen after the target has been removed. Enns and Di Lollo [Enns, J. T., & Di Lollo, V. (2000). What's new in visual masking? Trends in Cognitive Sciences, 4(9), 345–352] have argued that this type of masking can be explained by re-entrant visual processing. In the present studies we used high-density event-related brain potentials (HD-ERP) to obtain neural evidence for re-entrant processing in COVM. In two experiments the participants’ task was to indicate the presence or absence of a vertical bar situated at the lower part of a ring highlighted by the mask. The only difference between the experiments was the duration of the target: 13 ms in the first experiment and 40 ms in the second. Behavioral results were consistent between experiments: COVM was stronger as a joint function of a large set size and longer trailing mask duration. Electrophysiological data from both studies revealed a masking-related modulation of a posterior P2 component around 220 ms after stimulus onset. Further, in the critical experimental condition we found a significant relation between the amplitude of the P2 and behavioral response accuracy. We hypothesize that this re-activation of early visual areas reflects re-entrant feedback from higher to lower visual areas, providing converging evidence for re-entrance as an explanation for COVM.

Introduction

Several studies have shown that backward projections directly and continuously affect visual processing. For example, Hupe et al. (1998) studied backward connections from higher to lower visual areas of the macaque monkey, and reported that feedback projections served to amplify and focus the activity of units in the lower areas. Similarly, Lee, Mumford, Romero, and Lamme (1998) proposed that V1 is engaged in many levels of visual analysis through intra-cortical and feedback connections. Lamme and Roelfsema (2000) also concluded that feed-forward connections relay information from lower to higher visual cortical areas, but there are also horizontal, within-areas and, more importantly, feedback connections. In a recent report, Pascual-Leone and Walsh (2001) used transcranial magnetic stimulation (TMS) to demonstrate that the feedback projection from secondary visual areas to V1 is necessary for conscious visual perception. Collectively these findings provide evidence for re-entrant, feedback connections, and interactions between lower and higher cortical visual areas. More importantly, they suggest that top-down connections have an impact on bottom-up processes in perception, attention, and recognition (Spratling & Johnson, 2004).

Some electrophysiological findings of dynamic shifts of voltage change at the scalp surface are also consistent with re-entrant migration of information between higher and lower levels of information processing. Curran, Tucker, Kutas, and Posner (1992) examined visual event-related brain potentials (ERPs) during a word reading task. They reported that following N1 (the first negative deflection following stimulus onset) a separate posterior positive pattern emerges (termed the ‘P1-reprise’) that seemed to repeat the topography of P1. According to the authors, the scalp distribution of this effect was similar to the P1 and it seemed unlikely that it reflected stimulus offset. Another example of back-projection during a recognition task is described in De Haan, Pascalis, and Johnson (2002). These authors also used ERPs to compare the spatial and temporal characteristics of electro-cortical activation during the early stages of face processing in adults and 6-month-old infants. They reported that in both cases there is apparent dynamic movement of voltage change consistent with migration of information along the ventral visual pathway. This is followed by a re-activation of overlapping visual areas. Finally, Martínez et al. (1999, 2001) used ERPs combined with functional magnetic resonance imaging (fMRI) in a series of studies to investigate the cortical mechanisms of visual-spatial attention. They reported that attentional modulation of activity in V1 was substantial as measured by fMRI but non-existent as measured by the early ERP components originating from V1. However, they found that a later ERP component, with a latency of 160–260 ms, was modulated by attention and was localized to V1. They concluded that V1 activity was affected by a delayed, re-entrant feedback from higher visual areas. Similar results have also been reported by others (e.g., Noesselt et al., 2002).

In the present study we use common-onset visual masking (COVM; sometimes also referred to as “object substitution masking”) as a tool for exploring re-entrant visual processing (Di Lollo et al., 2000, 2002). At a general level, visual masking refers to the reduction of visibility of one object (the target) by another object (the mask) that appears nearby in space or time. For instance, a highly visible object briefly presented in isolation can be made almost invisible when it is followed by another object occupying the same spatial location or even an adjacent, but not overlapping, location. This kind of masking is also referred to as backward masking since the detection of an object is impaired by events that occur subsequently (Breitmeyer, 1984). Visual backward masking is an empirically and theoretically rich phenomenon that can be a powerful methodological tool to study visual information processing and function (Breitmeyer & Ogmen, 2000).

COVM is a form of backward masking that occurs when a brief display of the target plus a mask, consisting of only four black dots that surround but do not touch the target, is followed by the mask alone (Di Lollo et al., 2000, 2002; Enns & Di Lollo, 1997). According to Di Lollo et al., the first wave of feed-forward or bottom-up processing of the visual input is not sufficient for target identification. As a result, identification is aided by feedback or top-down neural projections, during which the circuit actively searches for a match between a descending code, representing a perceptual hypothesis, and an ongoing pattern of low-level activity. The comparison between information at higher and lower areas allows a percept to be achieved, ensuring that information is consistent between both levels. However, if the target item is deleted and only the four-dot mask is left at the target location, the ongoing activity at the lower level would then consist of an image of the mask alone and a decaying image of the target. This creates a mismatch between the ongoing (bottom-up) pattern of low-level activity and the re-entrant (top-down) perceptual hypothesis that includes both the target and the mask, leading to confusion and disruption of the target's identification. What is consciously perceived would then depend on the number of iterations required to identify the target.

COVM is sensitive to the attentional demands of the task (Di Lollo et al., 2000, 2002). Indeed, COVM is greater when the target is surrounded by similar distracter items, which corresponds to conditions that increase attentional demands in visual search tasks even when no mask is used (Treisman & Souther, 1985). By contrast, COVM is significantly weaker when the participant's attention is directed to the target by a spatial precue (Di Lollo et al., 2000, 2002). Di Lollo et al. (2000, 2002) report that two conditions must hold for COVM to occur: attention must be distributed among many potential targets, and the four-dot mask must remain on the screen after deletion of the target so that the mask-alone representation can replace the target-plus-mask representation. If this object substitution hypothesis is correct, then COVM is a tool for exploring the temporal dynamics of visual perception, and more specifically, the iterative processing that occurs when an initial visual representation is discarded if it is incompatible with subsequent attention-based analysis of the visual scene (Lamme & Roelfsema, 2000).

The present studies aimed to bring together the two sets of literature discussed above by investigating the electrophysiological pattern of activity associated with re-entrant processing during common-onset visual masking in adults. Note that there is currently very little evidence for neurophysiological correlates of backward masking, possibly because authors have looked for evidence of inhibition rather than re-entrant processing (see Enns & Di Lollo, 2000, for further discussion). Both behavioral and electrophysiological data were collected and analyzed in the present studies. From a behavioral point of view, common-onset visual masking by four dots should become stronger as a joint function of set size and trailing mask duration (Di Lollo et al., 2000, 2002). From an electrophysiological point of view, visual information initially activates the early extrastriate visual cortical areas situated at the posterior part of the cortex (P1). Information is then projected to more anterior parts of the cortex, creating an occipito-temporal negative deflection (N1). These early ERP components usually occur within 200 ms following the presentation of the stimulus (Curran et al., 1992; Fabiani, Gratton, & Coles, 2000; Lamme & Roelfsema, 2000). However, according to the model of Di Lollo et al. (2000, 2002), competing information or representations feed back from higher to lower visual areas for confirmation. When masking is strong, this renewed activation of early visual areas corresponds to a hypothetically stronger mismatch between the re-entrant visual representation and the ongoing lower-level activity produced by current sensory input. In so far as the magnitude of a component reflects the size of the population of neurons generating the signal, we therefore expect to find a larger re-entrant positivity, immediately following the N1, for conditions where masking is stronger (we will refer to this component as a posterior P2). A further prediction is that there should be a relationship between the magnitude of the re-entrant positivity and behavioral response accuracy.
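
The two predictions above can be made concrete with a small sketch. It is illustrative only and is not the analysis pipeline used in these studies: the function names, the 200–240 ms window (chosen here simply to bracket the roughly 220 ms latency mentioned in the abstract), and all numbers in the example are assumptions introduced for illustration.

```python
from math import sqrt


def mean_amplitude(waveform_uv, times_ms, start_ms, end_ms):
    """Mean voltage (microvolts) over samples with latency in [start_ms, end_ms]."""
    window = [v for v, t in zip(waveform_uv, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one averaged posterior waveform per participant,
# sampled every 20 ms, plus that participant's proportion of correct responses.
times_ms = [160, 180, 200, 220, 240, 260]
waveforms_uv = [
    [0.5, 1.0, 2.0, 3.0, 2.5, 1.0],
    [0.2, 0.8, 1.5, 2.0, 1.8, 0.9],
    [0.1, 0.5, 1.0, 1.2, 1.1, 0.6],
]
accuracy = [0.62, 0.71, 0.80]

# Assumed P2 window of 200-240 ms (not taken from the source).
p2_uv = [mean_amplitude(w, times_ms, 200, 240) for w in waveforms_uv]
print("P2 mean amplitudes (uV):", p2_uv)
print("r(P2 amplitude, accuracy):", pearson_r(p2_uv, accuracy))
```

Under the first prediction, the mean amplitude in the window should be larger in conditions where masking is stronger; under the second, amplitude and accuracy should covary across participants or trials. The direction and size of any such relation is an empirical matter that this sketch does not presuppose.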

Section snippets

Participants

Participants were 12 neurologically normal paid volunteers (6 males) with a mean age of 27.4 years (SD = 4.6 years). All participants had normal or corrected-to-normal vision, and four of them were left-handed.

Stimuli

All stimuli were monochrome images displayed on a 21 in. computer monitor with 75 Hz refresh rate using in-house presentation software designed to interface with the ERP equipment. On any given trial either 1 or 9 complete rings were displayed in the cells of a notional 3.5° × 3.5°

Experiment 2

In Experiment 2, we increase the target exposure time to examine its effect on both behavior and electrophysiological response. If the posterior P2-effect found in Experiment 1 was due to the later extinction of the mask in the delayed offset condition, we would expect the latency of this effect to move with the change in stimulus offset.
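
The timing logic can be checked with a little arithmetic. The sketch below assumes that the target durations given in the abstract (13 and 40 ms) were realized as whole refresh frames on the 75 Hz monitor described under Stimuli; that mapping is an inference from the stated numbers, not something the text states, and the function names are hypothetical.

```python
REFRESH_HZ = 75  # monitor refresh rate given in the Stimuli section


def frame_ms(refresh_hz):
    """Duration of a single refresh frame in milliseconds."""
    return 1000.0 / refresh_hz


def frames_for(duration_ms, refresh_hz):
    """Nearest whole number of frames for a nominal duration (assumed mapping)."""
    return round(duration_ms / frame_ms(refresh_hz))


def actual_duration_ms(n_frames, refresh_hz):
    """Physical on-screen duration of n_frames refresh frames."""
    return n_frames * frame_ms(refresh_hz)


for nominal_ms in (13, 40):  # target durations in Experiments 1 and 2
    n = frames_for(nominal_ms, REFRESH_HZ)
    print(nominal_ms, "ms nominal ->", n, "frame(s),",
          round(actual_duration_ms(n, REFRESH_HZ), 2), "ms on screen")

# Shift in target offset between the two experiments under this assumption.
shift_ms = actual_duration_ms(3, REFRESH_HZ) - actual_duration_ms(1, REFRESH_HZ)
print("offset shift between experiments:", round(shift_ms, 2), "ms")
```

On this reading, the target offset moves by two frames between experiments, so an offset-locked effect would be expected to shift in latency by about the same amount, whereas an onset-locked effect would not.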

Comparing response performance across the two experiments

In the two experiments we have found evidence of increased P2 amplitude as a function of masking conditions. In this section we assess directly whether there is a relationship between P2 amplitude and response accuracy, and whether this relation persists across the different target durations of Experiments 1 and 2.

To this end, we considered the responses and ERP amplitudes in the only condition that provided consistent evidence of masking (bar present, mask present, nine elements). An aggregate

General discussion

Overall, the behavioral data from both studies were consistent with the findings reported by Di Lollo et al. (2000, 2002): common-onset visual masking became stronger as a joint function of a large set size and long trailing mask duration. Indeed, an increase in set size combined with an increase in the duration of the trailing mask had an adverse effect on the sensitivity to targets. Clearly, the two factors cannot be considered in isolation: it is their interaction that

Acknowledgements

We would like to thank all the participants who took part in this study. This work was supported by the European Commission grants HPRN-CT-1999-00065 and 516542-NEST, and MRC Program Grant G9715587.

References (31)

  • B.G. Breitmeyer et al. Recent models and findings in visual backward masking: A comparison, review, and update. Perception & Psychophysics (2000)
  • T. Curran et al. Topography of the N400: Brain electrical activity reflecting semantic expectancy. Electroencephalography and Clinical Neurophysiology (1992)
  • M. De Haan et al. Specialization of neural mechanisms underlying face recognition in human infants. Journal of Cognitive Neuroscience (2002)
  • R. Desimone et al. Annual Review of Neuroscience (1995)
  • V. Di Lollo et al. Competition for consciousness among visual events: The psychophysics of re-entrant visual pathways. Journal of Experimental Psychology: General (2000)