Abstract
It is not yet known whether attention and consciousness operate through similar or largely different mechanisms. Visual processing mechanisms are routinely characterized by measuring contrast response functions (CRFs). In this report, behavioral CRFs were obtained in humans (both males and females) by measuring afterimage durations over the entire range of inducer stimulus contrasts to reveal visual mechanisms behind attention and consciousness. Deviations from the standard CRF, i.e., gain functions, describe the strength of signal enhancement and were assessed for changes due to both the attentional task and conscious perception. It was found that attention displayed a response-gain function, whereas consciousness displayed a contrast-gain function. Model comparisons showed that, with contrast-gain modulations only, both contrast-gain and response-gain effects can be explained with a two-level normalization model in which consciousness affects only the first level and attention affects only the second level. These results demonstrate that attention and consciousness can effectively show different gain functions because they operate through different signal enhancement mechanisms.
SIGNIFICANCE STATEMENT The relationship between attention and consciousness is still debated. Mapping contrast response functions (CRFs) has allowed (neuro)scientists to gain important insights into the mechanistic underpinnings of visual processing. Here, the influence of both attention and consciousness on these functions was measured, and the two displayed a strong dissociation. First, attention lowered CRFs, whereas consciousness raised them. Second, attention manifests itself as a response-gain function, whereas consciousness manifests itself as a contrast-gain function. Extensive model comparisons show that these results are best explained in a two-level normalization model in which consciousness affects only the first level, whereas attention affects only the second level. These findings show dissociations between both the computational mechanisms behind attention and consciousness and the perceptual consequences that they induce.
Introduction
Attention and consciousness are both major cognitive functions that determine visual processing. The relationship between attention and consciousness and their effects on our visual perception have been strongly debated (Wundt, 1874; Iwasaki, 1993; Posner, 1994; Chun and Wolfe, 2000; Dehaene et al., 2006; Koch and Tsuchiya, 2007; Mole, 2008; De Brigard and Prinz, 2010; van Boxtel et al., 2010a; Cohen et al., 2012; Koch and Tsuchiya, 2012; Prinz, 2012). A commonly held belief is that a tight link exists between attention and consciousness (Dehaene et al., 2006; De Brigard and Prinz, 2010; Cohen et al., 2012; Prinz, 2012). However, other researchers believe that attention and consciousness can be separated experimentally (Kentridge et al., 2004; Wyart and Tallon-Baudry, 2008; Brascamp et al., 2010; van Boxtel et al., 2010b; Watanabe et al., 2011). It should be noted that “attention” here refers to endogenous directed focal attention, not, for example, exogenous attention, and “consciousness” refers to perception as gauged by reported visibility and manipulated through masking.
The strongest evidence that attention and consciousness are separable would come from a double dissociation between these two functions (Koch and Tsuchiya, 2007; van Boxtel et al., 2010a; Cohen et al., 2012); that is, a case in which the effects of attention and consciousness go in opposite directions. A rare behavioral example of such a double dissociation comes from a psychophysical paradigm that investigated how afterimage durations are affected by attention and consciousness (van Boxtel et al., 2010b). It was shown that consciousness increased afterimage durations, whereas attention decreased afterimage durations.
Although this previous research shows strong support for a dissociation between attention and consciousness, it does not provide insight into how this dissociation comes about. To gain such mechanistic understanding, researchers often map out contrast response functions (CRFs) (Albrecht and Hamilton, 1982; Reynolds and Heeger, 2009; Carrasco, 2011). Therefore, to determine whether the effects of attention and consciousness are subserved by similar or different mechanisms, afterimage durations were measured over a range of contrasts to determine behavioral CRFs and investigate the modulations of this curve caused by attention and consciousness.
CRFs of neuronal responses (Albrecht and Hamilton, 1982), as well as behavioral performance (Carrasco, 2011), generally increase in a sigmoidal fashion with increasing stimulus contrast (see Fig. 1A). Attention generally enhances contrast responses, but it can do so in different ways (Reynolds and Heeger, 2009; Carrasco, 2011). For example, attention can shift the contrast response curve to the left, thereby effectively boosting stimulus contrast (Reynolds and Heeger, 2009; Carrasco, 2011). This response modulation is called a contrast gain function and shows the largest effects at intermediate contrasts (Fig. 1A,B). Attention has also been reported to modulate responses multiplicatively. This type of function is called a response gain function, which shows the largest effects at high contrasts (Fig. 1A,B). The influence of consciousness on CRFs is not yet known, although one of the techniques to modulate conscious perception, interocular suppression, often, but not always, produces contrast gain functions (Sengpiel et al., 1998; Watanabe et al., 2004; Li et al., 2005; Bahrami et al., 2008; Yuval-Greenberg and Heeger, 2013).
Importantly, no study has measured and controlled for the effects of both attention and consciousness on CRFs. It is therefore possible that the previously reported effects in CRFs are a mixture of both attention and consciousness. Here, a technique was used that strictly controlled both attention and consciousness modulations, allowing the study of their separate effects on CRFs. Gain functions for both attention and consciousness were thus independently determined to study whether they operate through the same or distinct signal enhancement mechanisms.
Materials and Methods
Experimental design and statistical analysis
The same stimuli and paradigm were used as described previously (van Boxtel et al., 2010b).
Participants
Low spatial frequency (SF).
Data were obtained from 17 individuals (nine male/eight female college students). Based on trial inclusion criteria (see below), two participants were excluded because they had missing data in more than two parameter combinations. Therefore, the data from 15 individuals (eight males) were analyzed further.
High SF.
Nineteen individuals participated. Three participants were excluded because of missing data. Therefore, data from 16 individuals (6 male, 10 female, mean age 20 years/1 month, SD = 1 year/4 months) were analyzed further.
Stimuli
The afterimage inducer was a Gabor patch with a contrast that was drawn from 0.03, 0.06, 0.125, 0.25, 0.50, and 1. The patch was Gaussian windowed (σ = 1.43°) and had an SF of 0.23 cycles/° or 3 cycles/°, a random orientation, and was presented at 4.9° eccentricity. The mask, which was shown on half of the trials, had 100% contrast, rotated at 120°/s, and consisted of a Gaussian-windowed (σ = 1.43°) checkerboard (0.78 cycles/°). It reversed contrast every 67 ms. Presentation location of the inducer and the mask was shifted by 45° counterclockwise between trials. Background luminance was 49 cd/m². All experiments were performed on a gamma-corrected monitor.
In the low-attention conditions, the attentional task was a rapid serial visual presentation (RSVP) of red letters (font Helvetica, 12 point). These letters were shown for 133 ms, after which they were immediately replaced by the next letter.
Procedure
Each trial had three phases. The first phase was the adaptation phase during which the afterimage inducer was shown in one eye (left and right eye presentation was counterbalanced over trials). To the other eye, the mask was shown in invisible trials, whereas no mask was shown in visible trials. The adaptation phase lasted 4 s. To create conditions in which a low amount of attention was paid to the inducer, subjects were distracted by an RSVP task in which they counted the number of X's (n = 2–5; participants were told there could be 1–5 in any trial) that appeared in an RSVP stream of nontarget letters (randomly chosen from: M, S, T, A, B, C, D, O, K, P, Y). Subjects did not report visibility in the low-attention trials, avoiding the need to deploy attention to the inducer. In the high-attention trials, the RSVP task was not performed but the letters were shown. Instead, subjects tracked the subjective visibility of the inducer by pressing and releasing a keyboard button.
The second phase was the afterimage phase. Participants pressed a button as soon as they perceived an afterimage and released the button when the afterimage disappeared. Because afterimages in this experiment were perceived instantaneously, afterimage duration was recorded from the start of the afterimage phase until the button was released. Participants pressed the space bar if no afterimage was perceived (this was recorded as a 0 s afterimage duration).
In the third phase, the observer was asked to indicate the number of X's that were counted. The number was indicated with the keypad. In the high-attention condition, this question was skipped by pressing the space bar. All trials were presented in a pseudorandom order and divided over eight blocks separated by brief rest periods.
Data analysis
For the data analysis, trials were excluded when the reported number of X's was off by >1 (except where noted differently). For the trials with the mask, which should lead to invisibility of the inducer stimulus, trials that were nonetheless reported as visible for any length of time (i.e., they broke through suppression) were excluded. These selection criteria led to the inclusion of the following number of trials (mean ± SEM over participants): in the low SF condition: 47.53 ± 0.47 (Visible–High Attention), 35.53 ± 1.40 (Visible–Low Attention), 41.93 ± 2.63 (Invisible–High Attention), 36.87 ± 0.97 (Invisible–Low Attention); and in the high SF condition: 47.58 ± 0.34 (Visible–High Attention), 32.74 ± 1.76 (Visible–Low Attention), 28.16 ± 3.65 (Invisible–High Attention), 32.79 ± 2.16 (Invisible–Low Attention). The lowest mean number of trials of any of the contrast conditions was as follows: 5.67 ± 0.33 (Visible–Low Attention, contrast = 0.06) for the low SF condition and 4.0 ± 0.68 (Invisible–Low Attention, contrast = 0.5) for the high SF condition.
Calculating psychophysical gain functions
The afterimage durations in the condition with invisible trials that were not attended (Att−/Vis−) were taken as a baseline measure, as this condition has a maximally reduced influence of attention and conscious visibility in our design. The attention-induced gain function was then calculated by subtracting, per participant, this baseline from the attended, but invisible conditions (Att+/Vis−): ΔAI = Att+/Vis− − Att−/Vis−. The visibility-induced gain function was similarly calculated by subtracting Att−/Vis− from the visible but less-attended condition: ΔAI = Att−/Vis+ − Att−/Vis−. ΔAI-measures were averaged over participants, and calculated per contrast value.
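The subtraction described above can be illustrated with a minimal Python sketch. The participant labels and durations are invented for the example and are not data from this study; the function name is likewise hypothetical.

```python
from statistics import mean


def gain_function(condition, baseline):
    """Per-participant difference (condition - baseline), averaged per contrast.

    Both arguments map a participant label to a list of afterimage
    durations (in seconds), one entry per contrast, in the same order.
    """
    participants = sorted(condition)
    n_contrasts = len(baseline[participants[0]])
    # Subtract the baseline within each participant first ...
    deltas = {
        p: [condition[p][i] - baseline[p][i] for i in range(n_contrasts)]
        for p in participants
    }
    # ... then average the differences over participants, per contrast.
    return [mean(deltas[p][i] for p in participants) for i in range(n_contrasts)]


# Invented durations for two hypothetical participants at three contrasts.
att_minus_vis_minus = {"p1": [1.0, 2.0, 3.0], "p2": [1.5, 2.5, 3.5]}
att_plus_vis_minus = {"p1": [1.0, 1.5, 2.0], "p2": [1.5, 2.5, 2.5]}

attention_gain = gain_function(att_plus_vis_minus, att_minus_vis_minus)
print(attention_gain)
```

The visibility-induced gain function would be obtained in the same way by passing the Att−/Vis+ durations as the first argument.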
Contrast-gain index (CGI) and response gain index (RGI)
CGI and RGI were calculated per individual. The CGI was calculated as the average response at the middle two contrasts minus the average response at the lowest and highest contrasts, as follows: 0.5 × ((ΔAI_{0.13} + ΔAI_{0.25}) − (ΔAI_{0.03} + ΔAI_{1})), where ΔAI_c refers to ΔAI at contrast c (0.13 denotes the 0.125 contrast, rounded). The RGI was calculated as the average response at the highest two contrasts minus the average response at the lowest two contrasts, multiplied by −1 to yield a positive index, as follows: −0.5 × ((ΔAI_{0.50} + ΔAI_{1}) − (ΔAI_{0.03} + ΔAI_{0.06})). Two participants had missing data for some of the ΔAI_c values and were excluded from this analysis. CGI and RGI values were analyzed with one-sample t tests versus 0. Comparisons between experiments for both CGI and RGI were made using two-tailed paired t tests.
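As a worked illustration of the two indices, the following Python sketch applies the formulas above to two invented ΔAI profiles, one shaped like a response-gain effect and one like a contrast-gain effect. The values are hypothetical and serve only to show how the indices behave.

```python
# Delta-AI values are ordered by contrast: 0.03, 0.06, 0.125, 0.25, 0.50, 1.


def contrast_gain_index(delta_ai):
    """Mean of the middle two contrasts minus mean of the lowest and highest."""
    return 0.5 * ((delta_ai[2] + delta_ai[3]) - (delta_ai[0] + delta_ai[5]))


def response_gain_index(delta_ai):
    """Mean of the highest two contrasts minus mean of the lowest two, sign-flipped."""
    return -0.5 * ((delta_ai[4] + delta_ai[5]) - (delta_ai[0] + delta_ai[1]))


# Hypothetical profile that grows more negative with contrast (response-gain-like).
attention_like = [0.0, 0.0, -0.5, -1.0, -1.5, -2.0]
# Hypothetical profile that peaks at intermediate contrasts (contrast-gain-like).
visibility_like = [0.0, 0.5, 1.5, 1.5, 0.5, 0.0]

print(contrast_gain_index(attention_like), response_gain_index(attention_like))
print(contrast_gain_index(visibility_like), response_gain_index(visibility_like))
```

In this toy example, the response-gain-like profile yields a large RGI and a small CGI, whereas the profile that peaks at intermediate contrasts yields a large CGI and an RGI of zero.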
Z-transformed partial correlations and descriptive model fitting
Both the attention and visibility ΔAI data were fit with predictions from a contrast gain and a response gain model. For the contrast gain model, the contrast-response curve change is characterized by a shift to the left or right. For the response gain model, the change is characterized by a multiplicative increase in response. Therefore, using the NonLinearModel class in MATLAB (2014b, 2015b) with default parameters, the average ΔAI data were fitted with a contrast-gain function and a response-gain function. In these functions, Rmax is the maximum obtainable response, C is the contrast of the inducer, n is an exponent that determines the steepness of the curve, and C50 is the contrast at which the half-maximum response is reached. The parameter γ is the gain of the process under scrutiny; that is, attention or visibility. For both high and low SF experiments, we fixed all parameters except γ. Rmax was set to 150% of the maximum AI duration at 100% contrast, C50 to 0.25, and the exponent n to 2 (Herrmann et al., 2010).
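The qualitative difference between the two descriptive models can be sketched as follows. This is only an illustration under stated assumptions: it assumes a standard Naka-Rushton CRF, R(C) = Rmax · C^n / (C^n + C50^n), and models the gain functions as R(γC) − R(C) for contrast gain and γR(C) − R(C) for response gain. These forms are assumptions rather than the functions fitted in this study, and Rmax = 1 and γ = 2 are arbitrary illustrative values; C50 = 0.25 and n = 2 follow the text.

```python
def naka_rushton(c, r_max=1.0, c50=0.25, n=2.0):
    """Assumed baseline contrast response function."""
    return r_max * c**n / (c**n + c50**n)


def contrast_gain_delta(c, gamma):
    """Assumed contrast-gain effect: the curve shifts along the contrast axis."""
    return naka_rushton(gamma * c) - naka_rushton(c)


def response_gain_delta(c, gamma):
    """Assumed response-gain effect: the curve is scaled multiplicatively."""
    return (gamma - 1.0) * naka_rushton(c)


contrasts = [0.03, 0.06, 0.125, 0.25, 0.50, 1.0]
cg = [contrast_gain_delta(c, gamma=2.0) for c in contrasts]
rg = [response_gain_delta(c, gamma=2.0) for c in contrasts]

for c, d_cg, d_rg in zip(contrasts, cg, rg):
    print(f"C = {c:5.3f}  contrast gain: {d_cg:.3f}  response gain: {d_rg:.3f}")
```

Under these assumptions, the contrast-gain difference is largest at intermediate contrasts, whereas the response-gain difference grows monotonically and is largest at the highest contrast.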
To remove correlations between the contrast gain and response gain models themselves, partial correlations were computed (Movshon et al., 1985; Smith et al., 2005). The partial correlations for contrast-gain (Rc) and response-gain (Rr) models are as follows:

$$R_c = \frac{r_c - r_r \, r_{cr}}{\sqrt{(1 - r_r^2)(1 - r_{cr}^2)}}$$

and

$$R_r = \frac{r_r - r_c \, r_{cr}}{\sqrt{(1 - r_c^2)(1 - r_{cr}^2)}}$$

where rc and rr are the correlations between the ΔAI data and the contrast-gain and response-gain models, respectively, and rcr is the correlation between the two fitted gain models.
Pearson's r correlations are not normally distributed, so the Fisher r-to-Z transformation (shown just for Rc) was used as follows:

$$Z_c = \frac{0.5 \, \ln\left(\frac{1 + R_c}{1 - R_c}\right)}{\sqrt{1/df}}$$

where df is the degrees of freedom, which is equal to the number of contrast values measured minus 3. A Z larger than 1.65 (equal to p = 0.05, one-tailed) was taken as the threshold for significance.
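For concreteness, the two steps can be sketched in Python using the standard first-order partial correlation and the Fisher r-to-Z transformation scaled by √(1/df), as in Smith et al. (2005). The input correlations below are invented for illustration and are not values from this study.

```python
import math


def partial_correlation(r_target, r_other, r_models):
    """Correlation of the data with one model, partialling out the other model."""
    return (r_target - r_other * r_models) / math.sqrt(
        (1.0 - r_other**2) * (1.0 - r_models**2)
    )


def fisher_z(r_partial, n_contrasts):
    """Fisher r-to-Z transform, with df = number of contrasts minus 3."""
    df = n_contrasts - 3
    return 0.5 * math.log((1.0 + r_partial) / (1.0 - r_partial)) / math.sqrt(1.0 / df)


# Hypothetical raw correlations: data vs. contrast-gain model (r_c),
# data vs. response-gain model (r_r), and model vs. model (r_cr).
r_c, r_r, r_cr = 0.2, 0.9, 0.3

z_c = fisher_z(partial_correlation(r_c, r_r, r_cr), n_contrasts=6)
z_r = fisher_z(partial_correlation(r_r, r_c, r_cr), n_contrasts=6)
print(f"Zc = {z_c:.2f}, Zr = {z_r:.2f}, threshold = 1.65")
```

With these invented inputs, only the response-gain model exceeds the one-tailed threshold of 1.65.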
Model fitting
Several models were considered as candidates to explain our data. All had a very similar two-level architecture, with level 1 (L1) activity feeding into level 2 (L2). Each level represents neural activity, with L1 being contrast-polarity sensitive and L2 contrast-polarity insensitive. Both levels include adaptation. In the layout shared by all models, RespL1 is the response at L1; RespL2 is the response at L2 and takes the output of L1 as input; FL1 and FL2 describe the contrast-response functions of L1 and L2, which are explained below for each of the models separately; and Sp describes the sensitivity postadaptation at L2, which is dependent on the preadaptation activity of this level. The parameter m determines the strength of adaptation at L2. Note that a stronger adaptation at the second level lowers the sensitivity Sp, which is analogous to increasing the contrast detection threshold (cf. Brascamp et al., 2010). The final response of the model is RespAdapt, which is scaled (with s) and offset (with O) to best fit the data. It was assumed that the afterimage duration is directly related to this final response (i.e., a larger response means a longer time to decay).
Note that the activity in L1 is also dependent on adaptation, but this is only implicitly present in the model. Specifically, it was assumed that there exist two populations of neurons at L1, a population L1+ that is responding to the presented contrast polarity of the inducer and a population L1− that is not. Without adaptation, the responses of both populations in L1 would lead to an unbiased response to a gray screen: L1− − L1+ = 0. However, adaptation at L1 leads to a decreased sensitivity for cells in L1+, but not in cells that are nonresponsive to the stimulus (L1−). Therefore, after adaptation, there is now a biased response (L1− − L1+ = RespL1 > 0), causing the afterimage.
Models were fitted using the fmincon function in MATLAB. The models were simultaneously fit to the four contrast-response functions, as well as to the two ΔAI curves (the consciousness and attention gain functions). Both upper and lower bounds were provided on all parameters. Lower and upper bounds for the various parameters were as follows: n [lower = 0, upper = 3], C50 [0, 1], m [0, 1], s [0, 100], O [0, 2]. The various multiplicative attention terms (A) were bounded to [1, 100], with 1 being no attentional effect. When the parameter Ccfs was free to vary, it was limited to [0, 1].
In all models except the full model, Ccfs was set to 0 when no CFS stimulus was presented and 1 when a CFS stimulus was presented. In the full model, Ccfs was a free parameter when the CFS stimulus was presented (independently at L1 and at L2). In all models, multiplicative attention (A) parameters were set to 1 in low-attention conditions and were free to vary in the high-attention condition.
Contrast-response functions for the different models
Monocular normalization only model.
The monocular normalization-only model was the base model against which other models were compared. This model consists of two stages, each with a normalization operation but without attention or consciousness manipulations. Figure 4B shows the model layout. The CRFs for L1 and L2 both include the normalization step. This model has five free parameters: n, C50, m, s, and O. Parameters m, s, and O are as explained above. The parameter n controls the steepness of the tuning curves and is the same for L1 and L2. C50 determines the point where the tuning curve reaches half the maximum height; it is also the same for L1 and L2. The parameters n, C50, s, m, and O are free parameters in all models.
Attention-first model.
The attention-first (and consciousness-second) model includes the influence of attention at L1 and the influence of the CFS stimulus at L2 (see model architecture in Fig. 4B). In the CRFs for L1 and L2, A1 is the multiplicative influence of attention at L1 and CCFS is the contrast of the CFS stimulus (set to 0 when the CFS stimulus was absent and to 1 when the CFS stimulus was present).
Consciousness-first model.
The consciousness-first (and attention-second) model includes the influence of CFS at L1 and the influence of attention at L2 (see model architecture in Fig. 4B). In the CRFs for L1 and L2, A2 is the multiplicative influence of attention at L2.
Attention-first model with response gain for attention.
In this model, the CRFs for L1 and L2 are identical to those of the attention-first model, except that the attentional effects are excluded from the denominator (thus resulting in an attention-induced response gain). Parameter definitions are as in the attention-first model.
Consciousness-first model with response gain for attention.
In this model, the CRFs for L1 and L2 are identical to those of the consciousness-first model, except that the attentional effects are excluded from the denominator (thus resulting in an attention-induced response gain). Parameter definitions are as in the consciousness-first model.
Full model.
The full model includes the influence of CFS and attention at L1 and at L2 (see model architecture in Fig. 4B). Attention is implemented as spatial attention, and therefore boosts both the activity to the afterimage inducer (C), and the CFS stimulus (CFS). The CRFs for L1 and L2 are as follows:
Model comparison
The fits were compared using the Bayesian information criterion (BIC), in which a low BIC is better (a perfect fit leads to a BIC of −∞). The BIC depends on the likelihood L, which was calculated from the following quantities: N is the number of data points, dfe is the degrees of freedom of the error (= N minus the number of parameters that are estimated), and MSE equals the sum of squared errors of all the fit residuals divided by dfe.
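To illustrate how the BIC trades goodness of fit against model complexity, the following Python sketch uses the textbook Gaussian maximum-likelihood expression, ln L = −(N/2)(ln(2π · SSE/N) + 1), together with BIC = k ln N − 2 ln L, where SSE = MSE × dfe is the sum of squared residuals and k is the number of estimated parameters. This is an assumed standard form and not necessarily the exact expression used for the fits reported here; the residuals are invented.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood; the sum of squares must be > 0."""
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return -0.5 * n * (math.log(2.0 * math.pi * sse / n) + 1.0)


def bic(residuals, n_params):
    """Bayesian information criterion: lower values indicate a better model."""
    n = len(residuals)
    return n_params * math.log(n) - 2.0 * gaussian_log_likelihood(residuals)


# Invented residuals for a close fit and a poor fit to six data points.
res_good = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
res_bad = [1.0, -1.0, 0.5, -0.5, 1.0, -1.0]

print(bic(res_good, n_params=5), bic(res_bad, n_params=5))
print(bic(res_good, n_params=6) - bic(res_good, n_params=5))
```

For a fixed set of residuals, each additional parameter raises the BIC by ln N, so an extra parameter is only favored if it reduces the residuals enough to offset this penalty.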
As a baseline, we took the monocular normalization-only model (N) that has only monocular normalization, but no influence of attention and interocular interactions. Models were compared by calculating the difference in the BIC score (ΔBIC) relative to the monocular normalization-only model (N). According to Kass and Raftery (1995), ΔBIC values not worth more than a bare mention are 0 < abs(ΔBIC) < 2; those that show positive evidence are 2 < abs(ΔBIC) < 6; those that show strong evidence are 6 < abs(ΔBIC) < 10, and those that show very strong evidence are abs(ΔBIC) > 10. Note that the ΔBIC thresholds of 2, 6, and 10 convert to the Bayes factors (BF10) of 2.7, 20, and 148, respectively. Apart from the models discussed here, we also considered various other models, which are not discussed below, but our conclusions remain the same.
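The correspondence between ΔBIC and Bayes factors quoted above follows from the approximation BF ≈ exp(|ΔBIC|/2). A minimal Python sketch of this conversion and of the evidence categories listed in the text is given below; the function names are hypothetical.

```python
import math


def bayes_factor(delta_bic):
    """Approximate Bayes factor implied by a BIC difference."""
    return math.exp(abs(delta_bic) / 2.0)


def evidence_category(delta_bic):
    """Evidence labels for |delta BIC|, following the thresholds in the text."""
    d = abs(delta_bic)
    if d < 2:
        return "not worth more than a bare mention"
    if d < 6:
        return "positive"
    if d < 10:
        return "strong"
    return "very strong"


for threshold in (2, 6, 10):
    print(threshold, round(bayes_factor(threshold), 1), evidence_category(threshold))
```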
Results
Psychophysical data
We measured the complete CRFs under various levels of focused attention and consciousness (operationalized as visibility) while using a full-factorial design. This approach allowed us to investigate the independent influences of attention and consciousness on visual processing (Fig. 1C). Previous research has shown that, when the influences of attention and consciousness are not strictly controlled, it may be difficult to attribute experimental effects to either of these processes (Koch and Tsuchiya, 2007; van Boxtel et al., 2010a). Masking is a powerful technique to modulate conscious perception (Kim and Blake, 2005) and it allows one to separate the influences of attention from those of consciousness (Dehaene et al., 2006; Kanai et al., 2010; van Boxtel et al., 2010b). We therefore used the most versatile method to render stimuli invisible through the process of masking, a method called continuous flash suppression (CFS) (Tsuchiya and Koch, 2005). We paired the CFS technique with a stringent control of the participant's attention allocation. We minimized attention to the inducer stimulus in half of the trials by means of a distracting RSVP task (Rees et al., 1999). On other trials, participants reported the visibility of the inducer stimulus, which resulted in high levels of attention to the inducer stimulus. By using this design, attention and consciousness were independently modulated, resulting in a 2 × 2 matrix that consisted of four conditions: low-attention/invisible (i.e., Att−/Vis−), low-attention/visible (i.e., Att−/Vis+), high-attention/invisible (i.e., Att+/Vis−), and high-attention/visible (i.e., Att+/Vis+). This design allowed us to derive the individual contributions of attention and consciousness.
We determined psychometric CRFs, as measured by afterimage durations, for all four conditions (Fig. 2A,C). We looked at the influence of attention and consciousness separately relative to a baseline condition that lacked both of these influences. Therefore, to calculate the gain functions for attention and consciousness (Fig. 2B,D), we took the low-attention/invisible condition (i.e., black dashed lines, Att−/Vis−, Fig. 2A,C) as a baseline because this condition lacked influences of both attention and visibility. The gain function for attention was then calculated as the difference in AI duration (ΔAI duration) between the high-attention/invisible condition and this baseline (i.e., ΔAI = Att+/Vis− − Att−/Vis−; Fig. 2B,D, red curve). Similarly, the gain function for visibility was then calculated as the ΔAI duration between this baseline and the low-attention/visible condition (i.e., ΔAI = Att−/Vis+ − Att−/Vis−; Fig. 2B,D, blue curve).
These results show strikingly different gain functions for attention and consciousness. Attention decreased afterimage durations (Fig. 2B,D, red curve), whereas visibility increased afterimage durations (Fig. 2B,D, blue curve) (Suzuki and Grabowecky, 2003; Tsuchiya and Koch, 2005; Brascamp et al., 2010; van Boxtel et al., 2010b). Furthermore, attention follows a response-gain function with small effects at low inducer contrasts and large significant effects at high inducer contrasts (Fig. 2B,D, red stars), whereas visibility follows a contrast-gain function, with significant effects only at intermediate inducer contrasts (Fig. 2B,D, blue stars). We quantified these effects further by calculating for each subject a CGI and an RGI (see Materials and Methods; Fig. 2E). These data revealed that, over all subjects (and both experiments), attention had a significant RGI (t(29) = 2.92, p = 0.007, Cohen's d = 0.53), but no significant CGI (t(29) = −1.30, p = 0.20, Cohen's d = −0.24). Conversely, visibility showed a significant CGI (t(29) = 2.91, p = 0.007, Cohen's d = 0.53), but no significant RGI (t(29) = −1.36, p = 0.18, Cohen's d = −0.25). Similar findings are obtained for the low and high SF experiments separately in that there are positive RGIs for attention (low SF: t(14) = 1.7, p = 0.10, Cohen's d = 0.45, ns; high SF: t(13) = 5.83, p < 0.0001, Cohen's d = 1.56), and a positive CGI for visibility (low SF: t(14) = 2.75, p = 0.016, Cohen's d = 0.71; high SF: t(13) = 1.0235, p = 0.32, Cohen's d = 0.27, ns), although significance was not always reached. There was no significant difference between the RGI for attention for the low and high SF data (t(27) = 0.69, p = 0.50, Cohen's d = 0.26). There was a significant difference between the CGI for attention for the low and high SF data (t(27) = 2.20, p = 0.036, Cohen's d = 0.83). Nevertheless, the overall data suggest that attention and consciousness operate through different gain functions.
Interestingly, we find that attention and consciousness also have opposite effects at 3 cycles/°, which is inconsistent with a previous report (Brascamp et al., 2010). That report showed that attention decreased afterimage durations at low and high SFs, but that the effect of visibility depended on SF: visibility increased afterimage duration at low SFs (consistent with our findings), whereas it decreased the afterimage durations at higher (3 cycles/°) SFs (inconsistent with our findings). This discrepancy could potentially be explained by the use of different paradigms (i.e., afterimage nulling versus duration paradigms). However, an alternative explanation is related to the fact that the previous study did not use a 2 × 2 design. For example, the attention effect was only measured in the visible conditions and the visibility effect was measured without a demanding secondary task. Therefore, attention was not strictly controlled for in the measurements of the visibility effect. Without this task, participants potentially paid more attention to the stimulus when it was visible than when it was invisible.
Based on our data, we can estimate the results of this methodological difference. To approximate the effects of the previous study, we looked at the condition that combined the effects of attention and consciousness. Therefore, we computed the gain function based on the difference between our standard baseline condition (i.e., the invisible/low-attention condition, Att−/Vis−) and the condition with both high visibility and high attention (i.e., the visible/high-attention condition): Att+/Vis+ − Att−/Vis−. These data are plotted with open circles in Figure 3. We compared this with the gain function without attention confound (open circles in Fig. 3, replotted from Fig. 2).
We found that, at low SFs, the lack of a demanding secondary task (and the consequent potential for a confounding influence of attention) has little or no influence on the gain function for visibility. However, at higher frequencies, an increase of attention (due to poor attentional control) shifts the visibility curve downward, resulting in apparent negative effects of visibility at many contrasts (being significantly negative at the highest contrast; t(15) = −2.19, p = 0.045, Cohen's d = −0.55). Therefore, this analysis suggests that the negative effects of visibility reported previously (Brascamp et al., 2010), measured at a contrast of 0.62, are potentially due to the lack of a stringent control of attention.
Descriptive model fitting
To investigate whether our measured gain functions can be described by pure contrast-gain and response-gain functions, we determined whether single parameter modulations from a baseline CRF could fit our data (Ling and Carrasco, 2006). We fitted both response gain and contrast gain models to the visibility and attention ΔAI data with one free parameter (see Materials and Methods). To control for correlations between response-gain and contrast-gain models, we computed Z-transformed partial correlations (Movshon et al., 1985; Smith et al., 2005) (see Materials and Methods for details). The resulting test statistics, Zc and Zr, refer to the Z-transformed partial correlations for the contrast-gain and response-gain fit, respectively. A Z of zero means that there is no correlation between the data and the model. We found that the attention data were best fit with a response-gain model (for the low SF data: Zr = 2.26, p = 0.012, one-tailed, whereas Zc = −0.162, p = 0.56; Fig. 2F). Conversely, visibility was better fit with a contrast gain model (Zr = −1.09, p = 0.86, Zc = 1.67, p = 0.047; Fig. 2F). These results are further supported when the Z-values are converted back into r-values (data not shown). Finally, using more lenient trial selection criteria on RSVP performance (e.g., correct number of items ±2 or ±3) did not change these results (Fig. 2F, different symbol sizes), showing that our results are not due to the specific trial selection criterion that we used. The high SF data are similar to the low SF data (Fig. 2F) and lead to the same conclusions. These data show that gain functions of attention and consciousness can be modeled by changes of single parameters of the baseline response function and can be approximated by pure response and contrast gain functions, respectively.
Model comparisons with the BIC
The previous analysis showed that descriptive models of contrast gain and response gain can describe our data, but to provide a better mechanistic account, we constructed several computational models based on a normalization framework to investigate which model architecture could best explain our data.
In a normalization framework, the response to a target stimulus in one eye is divided by the activity of a normalization pool (Heeger, 1992; Reynolds and Heeger, 2009; Carandini and Heeger, 2011). This normalization pool combines the neural activity in response to both the target stimulus and the competing stimulus in the other eye (Baker and Meese, 2007; Ling and Blake, 2012). In this framework, an interocular mask will result in a contrast-gain effect.
Using a normalization framework, the modeled influence of attention on perception could result in both response gain and contrast gain effects (Reynolds and Heeger, 2009; Ling and Blake, 2012). However, an increase in attention cannot produce a negative effect on afterimage duration in this model. To explain a negative effect of attention, other researchers have proposed a two-level process (Suzuki and Grabowecky, 2003; Wede and Francis, 2007; Brascamp et al., 2010; van Boxtel et al., 2010b). The first level (L1) is sensitive to the particular configuration of light and dark patches; that is, it is contrast-polarity sensitive. After a prolonged adaptation period, the polarity sensitivity of this level causes it to produce a negative afterimage when a blank screen is shown. This process is generally assumed to underlie afterimages. On top of this first level, the two-stage model assumes that the output of L1 is the input of a second level. This second level is polarity insensitive, meaning that it is sensitive to the contrast of the stimulus independently of the light/dark arrangement of luminance. A change in activity at this level will change the perceived contrast of the stimulus, but will not generate an afterimage when no input is received. Here, we implement this conceptual two-level model into a simple computational model, incorporating the idea of response normalization (Heeger, 1992; Reynolds and Heeger, 2009; Carandini and Heeger, 2011) at both stages.
We modeled each of the two levels, L1 and L2, by a normalization process (see Materials and Methods; Fig. 4A), with the output response of each level described by the same normalization equation. At L1, C is the contrast of the inducer and, at L2, C is the output of L1. The parameter a is the influence of attention (where a = 1 means unattended and a > 1 means attended). Here, we let attention influence the processing of both the afterimage inducer and the CFS stimulus because our design mainly manipulated spatial attention (in fact, feature-based attention would be difficult to deploy in our stimuli because the orientation and contrast were randomized over trials). C_CFS is the contrast of the mask and C50 is the contrast at which the half-maximum response is reached. The parameter n controls the steepness of the curves. Note that most of the models that we considered incorporate only contrast-gain changes because our stimuli were relatively small compared with the presumed size of the attentional window (Reynolds and Heeger, 2009). See Materials and Methods for full model descriptions.
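The two-level structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the fitted model: the functional form is a generic normalization equation, the function names and default parameter values are hypothetical, and adaptation is omitted. The exact equations are those in Materials and Methods.

```python
def normalization_response(c, a=1.0, c_cfs=0.0, c50=0.2, n=2.0):
    """Generic normalization: attended drive divided by drive plus pool.

    Assumed form (illustrative only):
        R = (a*C)^n / ((a*C)^n + (a*C_CFS)^n + C50^n)
    Attention (a) scales both the inducer and the mask, as in the text.
    """
    drive = (a * c) ** n
    mask_drive = (a * c_cfs) ** n
    return drive / (drive + mask_drive + c50 ** n)


def two_level_response(c, a1=1.0, a2=1.0, c_cfs=0.0,
                       c50_l1=0.2, c50_l2=0.5, n=2.0):
    """Feed the L1 output into L2; the mask enters at L1 only."""
    r1 = normalization_response(c, a=a1, c_cfs=c_cfs, c50=c50_l1, n=n)
    r2 = normalization_response(r1, a=a2, c50=c50_l2, n=n)
    return r1, r2


if __name__ == "__main__":
    # "Consciousness-first" style arrangement: mask at L1, attention at L2.
    for contrast in (0.05, 0.2, 0.8):
        print(contrast, two_level_response(contrast, a2=2.0, c_cfs=0.3))
```

In this sketch, setting a > 1 without a mask is equivalent to multiplying the input contrast by a, which is why attention acts as a contrast-gain change at the level where it is applied.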
To incorporate the differences between L1 and L2 in terms of their sensitivity to contrast polarity, the effects of adaptation were modeled differently at the two levels. In L1, adaptation is polarity sensitive, so, after adaptation, when presented with a blank screen, unadapted neurons will be more active than adapted neurons, resulting in a negative afterimage. Therefore, we modeled the activity of L1 after adaptation as equal to the activity of the neurons sensitive to the opposite polarity before adaptation (as explained in the Materials and Methods). Adaptation at L2 was incorporated by assuming that L2 was less sensitive to the input from L1 after adaptation (i.e., similar to threshold elevation) by an amount proportional to its activity before adaptation (see Eqs. 8 and 9 in the Materials and Methods).
Different model architectures were fitted to the data, differing in the level at which attention and consciousness took effect (i.e., at L1 and/or L2; Fig. 4B). As a baseline, we constructed a model that included monocular normalization only, with no influences of attention or of consciousness through interocular suppression (the “monocular normalization-only” (N) model). Other models were an “attention-first” model in which attention modulated L1 whereas consciousness modulated L2; a “consciousness-first” model in which consciousness modulated L1 and attention modulated L2; and a “full model” in which both attention and consciousness could influence L1 and L2. We also included two models in which attention was explicitly modeled as a response gain. The models were compared using the BIC (Schwarz, 1978), which weighs the maximum value of the likelihood function against the number of free parameters in the model. A low BIC is preferred. All models were compared with the N model because this model did not incorporate the effects of our attention and visibility manipulations and therefore serves as a good baseline. Models were fit concurrently on both the afterimage durations (Fig. 2A,C) and attention and consciousness effects (Fig. 2B,D) with equal weights. Fitted model parameters for the low SF data are given in Table 1.
The BIC analysis revealed that the consciousness-first model best fits both the low SF (Fig. 4C) and the high SF (Fig. 4G) data. The attention-first model did not fit the data very well because, in this model, attention and consciousness always modulated afterimage durations in the same direction. The full model fitted the data very well (Fig. 4D,H), but did not obtain a very low BIC score because of the added complexity of the model. Interestingly, the optimal fit parameters (Table 1) of the full model showed that the CFS parameter was 0.61 at L1 and 0.05 at L2, suggesting that CFS should only influence L1. In addition, the attention parameter was 2.2 at L1 (where 1 means no attention), whereas it was 100 at L2, showing that attention mostly influenced L2. Both of these findings conform to the simpler model architecture of the consciousness-first model, suggesting that, indeed, the consciousness-first model is an optimal model. The models with attention modeled as a response gain, labeled Cr and Ar in Fig. 4C,D,G,H, produced worse fits than the respective models with attention modeled as a contrast gain (i.e., models C and A). Of these two, the Cr model was significantly better, but it was unable to show strong increases of attentional effects at medium contrasts and it did not show a plateau effect at high contrasts, hence its underperformance relative to the consciousness-first (C) model (ΔBIC = 3.84 between the two models). Posterior model comparisons (Wasserman, 2000) showed that the consciousness-first (C) model has a posterior probability of 0.83 of being the correct model, whereas the second-best model was the Cr model with a posterior probability of 0.12.
The same analyses were also performed for the high SF data, resulting in the same conclusions (Fig. 4G–J), with the consciousness-first model fitting the data best (ΔBIC = 5.45 between the C and Cr models). Posterior model comparisons showed that the C model has a posterior probability of 0.93 of being the correct model, whereas the second-best model was the Cr model with a posterior probability of 0.06.
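The link between BIC differences and posterior model probabilities can be sketched as follows. This assumes the common BIC approximation with equal prior probabilities across models; whether this is exactly the computation used here is an assumption. The absolute BIC values in the example are hypothetical, and only the difference of 3.84 is taken from the text.

```python
import math


def bic_posteriors(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal priors and the approximation
        P(M_i | data) ~ exp(-BIC_i / 2) / sum_j exp(-BIC_j / 2).
    The smallest BIC is subtracted first for numerical stability.
    """
    best = min(bics.values())
    weights = {m: math.exp(-0.5 * (b - best)) for m, b in bics.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical absolute BICs; only the 3.84 difference comes from the text.
    post = bic_posteriors({"C": 100.0, "Cr": 103.84})
    print(post, post["Cr"] / post["C"])
```

Under these assumptions, a ΔBIC of 3.84 implies a posterior ratio of exp(-1.92), or roughly 0.147, between the Cr and C models. This is close to the reported ratio of 0.12 to 0.83. The reported probabilities are lower than a two-model comparison would give because they are normalized over all candidate models.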
This model comparison revealed that our data are best explained by a two-level normalization model in which consciousness operates at the first level, whereas attention operates at the second level.
Influence of the mask
One potential explanation of our visibility findings is that they are due to the presence/absence of the mask, not the visibility/invisibility of the inducer. Past research has shown that the presence/absence of the mask does not fully determine the strength of the aftereffects (Tsuchiya and Koch, 2005; Blake et al., 2006; van Boxtel et al., 2010b), but the mask may have an influence, through contrast adaptation, at low inducer SFs (Brascamp et al., 2010). To test for the influence of the mask, we compared linear mixed models through likelihood ratio tests (Bates et al., 2014). Afterimage duration was taken as the dependent variable and fixed effects were the inducer contrast, the CFS presence/absence, and the visibility (as indicated by the participants' button presses). Random intercepts per participant were included. The unrestricted model included all fixed effects and their interactions, which we compared with restricted models that excluded either the factor CFS (and its interaction terms) or visibility (and its interaction terms). For the low SF condition, visibility had a significant effect (χ²(4) = 33.30, p < 0.0001), and CFS had a significant effect (χ²(4) = 9.94, p = 0.041). For the high SF condition, visibility had a significant effect (χ²(4) = 353.87, p < 0.0001), but CFS did not (χ²(4) = 2.57, p = 0.63). The model with the visibility influence (but not CFS) had a lower BIC than the model with the CFS influence (but not visibility): ΔBIC = 23.36 for low SF and ΔBIC = 351.31 for high SF. Therefore, the main influence appears to be visibility, but CFS does influence afterimage durations for low SF inducers.
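The p values of these likelihood ratio tests can be checked from the reported test statistics alone. The sketch below uses the closed-form chi-square survival function, which is valid only for even degrees of freedom; the function names are hypothetical. The four degrees of freedom are consistent with dropping one main effect and its three interaction terms when contrast is a single continuous predictor, although that reading is an inference rather than something stated in the text.

```python
import math


def lr_statistic(loglik_full, loglik_restricted):
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability for even df (closed form).

    P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    for stat in (33.30, 9.94, 353.87, 2.57):
        print(stat, chi2_sf_even_df(stat, 4))
```

With four degrees of freedom, this gives approximately 0.041 for a statistic of 9.94 and approximately 0.63 for a statistic of 2.57, in line with the values reported for the CFS factor.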
Discussion
Modulations of CRFs are one of the most valuable descriptors of visual function (Reynolds and Heeger, 2009; Carrasco, 2011), allowing one to investigate the computational mechanisms underlying visual processing (Reynolds and Heeger, 2009). We measured the modulations of afterimage duration CRFs, or gain functions, induced by attention and consciousness. We show that attention and consciousness can have opposite effects on afterimage durations and, more importantly, that attention displays a response gain function, whereas consciousness displays a contrast gain function (which is possibly stronger for low SFs). These results show that attention and consciousness, arguably the two most important cognitive functions, operate through different underlying mechanisms.
These results are consistent with previous demonstrations of a dissociation between attention and consciousness (Kentridge et al., 2004; Wyart and Tallon-Baudry, 2008; Brascamp et al., 2010; van Boxtel et al., 2010b; Watanabe et al., 2011), but additionally provide an explanation of the processes that underlie this dissociation. Showing such clear dissociations between attention and consciousness is not always possible. For example, using a different experimental setup and parameters, attention produces contrast-gain functions at a behavioral level (Reynolds and Heeger, 2009; Herrmann et al., 2010), just like consciousness often does. Attention and consciousness would then appear to work synergistically. However, even though attention and consciousness show synergistic effects in those experiments, our findings suggest their computational underpinnings can be dissociated using the current paradigm.
Through extensive model comparisons, we show that the dissociation is best explained by a hierarchical two-level normalization model, in which attention acts on a late contrast-polarity-insensitive level, whereas consciousness manipulations, by means of interocular suppression, act on an early contrast-polarity-sensitive level. Because contrast-polarity sensitivity is predominantly found subcortically in LGN and in layer 4 in V1 (Hubel and Wiesel, 1968; Levitt et al., 2001), our findings are consistent with the idea that interocular suppression operates at early visual stages (Blake, 1989; Tong et al., 2006), potentially as early as LGN or V1 (Sengpiel et al., 1998; Polonsky et al., 2000; Meese and Hess, 2004; Watanabe et al., 2004; Haynes et al., 2005; Li et al., 2005; Bahrami et al., 2008; Yuval-Greenberg and Heeger, 2013). However, other research suggests that interocular masking at LGN or V1 is weak or nonexistent (Lehky and Maunsell, 1996; Leopold and Logothetis, 1996; Macknik and Martinez-Conde, 2004; Keliris et al., 2010; Watanabe et al., 2011). Our results cannot resolve this issue because, although we show sensitivity to contrast polarity, cells with such sensitivity (Zhou et al., 2000), as well as (rare) ocularly biased cells (Zeki, 1978; Hubel and Livingstone, 1987), exist in both striate and extrastriate visual areas. However, most results (including ours) are consistent with neurophysiological (Leopold and Logothetis, 1996) and computational (Freeman, 2005) findings that suggest that interocular inhibition at early stages is small and noisy and complete perceptual suppression builds up over consecutive stages in the visual hierarchy.
Our data further show that attention mainly operates on a higher contrast-polarity-insensitive level. This is largely consistent with the idea that attention allows for vision with scrutiny through top-down modulation of neural activity (Hochstein and Ahissar, 2002; Buffalo et al., 2010), acting more strongly higher up in the visual hierarchy (Rainer et al., 1998).
Unlike our two-stage model, two recent models proposed that attention operates at the same level as interocular suppression (Ling and Blake, 2012; Li et al., 2015). Although seemingly at odds, this approach is not inconsistent with ours. For example, in Li et al. (2015), attention is binocular, consistent with the higher-level (second) stage in our model. Attention then influences (presumably through feedback) interocular competition. Importantly, the attentional influences on interocular suppression were strong only when competing stimuli were large, but not when they were small (Ling and Blake, 2012; Li et al., 2015). Because our stimuli were small, this may have obviated the need to model the first-level attentional influences for our experiments.
Other data suggest that attention is not merely a modulator of, but actually a prerequisite for, binocular rivalry (Zhang et al., 2011; Brascamp and Blake, 2012). Although attention generally influences binocular rivalry minimally (Meng and Tong, 2004; van Ee et al., 2005), the finding that binocular rivalry may require attention could suggest that there exists an interaction between our attention manipulation and our consciousness manipulation (see also Ling and Blake, 2012; and our Fig. 3), thus potentially making them not completely independent in our experimental design. Although this situation is not ideal, the existence of such an interaction means that, if we were able to better separate attention and consciousness, then the results would probably be even clearer.
Interestingly, whereas attention may be required for binocular rivalry, conscious awareness of the visual conflict is not (Brascamp et al., 2015; Zou et al., 2016). This finding supports the proposed dissociation of attention and consciousness. It should be noted, however, that recent neurophysiological measurements suggest that binocular rivalry may occur in the absence of both attention and consciousness (Xu et al., 2016), in which case the above concerns about our paradigm would be moot.
From a computational perspective, it is interesting that the best-performing consciousness-first model only included contrast-gain computations, even outperforming models that explicitly included response-gains for attention. It nevertheless produced a response gain for attention because the modeled contrast-response function at the second level was not yet at saturation at high contrasts, so a contrast-gain change induced by attention translated into increased activity, even at high contrasts. Arguably, this unsaturated response is not consistent with neurophysiological data. However, this “true” contrast response behavior of a neuron cannot be measured in the brain. One can only measure the contrast response curve at a certain level determined by the contrast-dependent input from preceding neuronal levels. In other words, the output of level X is dependent on the contrast-dependent input it receives from level X − 1. This “observed” contrast-response curve is the one that is measured in neurophysiological experiments. Consistent with these observations, our model also shows response saturation of the contrast-response function of L2 given the input of L1 (Fig. 4E). In addition to this effect, the response gain is then further strengthened by adaptation (which is effectively a response gain change) that is dependent on the attention-modulated activity.
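The argument above can be illustrated with a toy two-level cascade. It is a sketch under strong assumptions: a generic normalization form, hypothetical parameter values, attention applied at L2 only, and no adaptation stage. It compares the attentional effect at full stimulus contrast when L2 is far from saturation with the effect when L2 saturates early.

```python
def norm(x, c50, n=2.0, a=1.0):
    """Generic normalization with attention as an input (contrast) gain."""
    drive = (a * x) ** n
    return drive / (drive + c50 ** n)


def attention_effect(contrast, c50_l2, a=2.0, c50_l1=0.2):
    """Attended minus unattended L2 output for a given stimulus contrast.

    Attention scales only the input to L2; the L1 stage is unmodulated.
    """
    r1 = norm(contrast, c50_l1)
    return norm(r1, c50_l2, a=a) - norm(r1, c50_l2)


if __name__ == "__main__":
    for c in (0.05, 0.2, 1.0):
        # Unsaturated L2 (large C50) versus early-saturating L2 (small C50).
        print(c, attention_effect(c, c50_l2=1.0), attention_effect(c, c50_l2=0.1))
```

With a large L2 semisaturation constant, the attentional effect in this toy cascade is largest at full contrast, which resembles a response gain even though attention is implemented purely as a contrast gain. With a small constant, L2 saturates and the effect almost vanishes at full contrast, which is the usual contrast-gain signature.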
Our modeling suggests that both the effects of attention and consciousness on afterimage duration are the result of a signal enhancement at the neuronal level. It has been suggested that these effects could therefore operate through identical or similar mechanisms (Brascamp et al., 2010; Blake et al., 2014), thus implying a weak separation between attention and consciousness. This interpretation, however, was based on measurements at just a single inducer contrast. Here, we have measured the entire contrast response curve and analyzed the results with different two-level models. Our analysis reveals that, even though attention and consciousness both cause increases in activity, they achieve that through processes that give rise to different (gain) functions and by influencing different levels within the visual processing hierarchy. Our overall set of findings is not consistent with alternative interpretations based on the assumption that the effects of suppression are purely due to decreasing attention to a stimulus that is rendered invisible (Blake et al., 2014; Li et al., 2015).
A separate concern is that the effects that we attribute to consciousness may instead be due to a difference in contrast adaptation when the interocular mask is present versus when it is absent (i.e., it is stimulus dependent) and not due to changes in conscious perception per se (Cohen et al., 2012). Consistent with past findings (Brascamp et al., 2010), we find that, for low SF stimuli, there is an influence of the mask, but not for high SF stimuli. It is thus worthwhile to investigate and control for this potential influence (Brascamp et al., 2010; Watanabe et al., 2011; Yuval-Greenberg and Heeger, 2013). However, conscious visibility had a significantly stronger effect in our data, consistent with previous indications that aftereffect strength is largely determined by the conscious percept and that the presence/absence of the mask is not (fully) determining the strength of the aftereffects (Tsuchiya and Koch, 2005; Blake et al., 2006; van Boxtel et al., 2010b).
It is currently not possible to determine how general our findings are. However, apart from afterimages, attention can decrease perception or performance in motion-induced blindness (Geng et al., 2007; Schölvinck and Rees, 2009), the motion aftereffect (Murd and Bachmann, 2011), Troxler/peripheral fading (Lou, 1999; De Weerd et al., 2006), visual memory (Voss and Paller, 2009), SF (Yeshurun and Carrasco, 1998), the attentional blink (Olivers and Nieuwenhuis, 2005), and visual search (Smilek et al., 2006). These findings suggest that our conclusions may be valid beyond the context of afterimages.
In conclusion, our results indicate that signal enhancement operates differently for attention and for consciousness. Because it is easy to conflate attention and consciousness effects, future research would gain from carefully controlling parameters that influence attention and consciousness.
Footnotes
This work was supported in part by the Netherlands Organisation of Scientific Research (Rubicon Grant). I thank Naotsugu Tsuchiya, Jakob Hohwy, and April Kartikasari for feedback on the manuscript and Drisika Acharya for help with the data acquisition.
The author declares no competing financial interests.
Correspondence should be addressed to Jeroen J.A. van Boxtel, Monash Biomedical Imaging (bld 220), Monash University, 770 Blackburn Rd, Clayton, VIC 3800, Australia. j.j.a.vanboxtel{at}gmail.com