Abstract
Cathode ray tubes (CRTs) display images refreshed at high frequency, and the temporal waveform of each pixel is a luminance impulse only a few milliseconds long. Although humans are perceptually oblivious to this flicker, we show in V1 in macaque monkeys and in humans that extracellularly recorded action potentials (spikes) and visual-evoked potentials (VEPs) align with the video impulses, particularly when high-contrast stimuli are viewed. Of 91 single units analyzed in macaque with a 60 Hz video refresh, 29 cells (32%) significantly locked their firing to a uniform luminance display, but their number increased to 75 (82%) when high-contrast stimuli were shown. Of 92 cells exposed to a 100 Hz refresh, 21 (23%) significantly phase locked to high-contrast stimuli. Phase locking occurred in both input and output layers of V1 for simple and complex cells, regardless of preferred temporal frequency. VEPs recorded in humans showed significant phase locking to the video refresh in all seven observers. Like the monkey neurons, human VEPs more typically phase locked to stimuli containing spatial contrast than to spatially uniform stimuli. Phase locking decreased when the refresh rate was increased. Thus in humans and macaques phase locking to the high strobe frequency of a CRT is enhanced by a salient spatial pattern, although the perceptual impact is uncertain. We note that a billion people worldwide manage to watch TV without obvious distortion of their visual perception despite extraordinary phase locking of their V1s to a 50 or 60 Hz signal.
Introduction
Even when a cathode ray tube (CRT) is displaying a sunny beach scene, its pixels are dark most of the time; as the electron beam of a CRT scans across the screen, each pixel emits an impulse of light that decays within milliseconds. Refreshing each pixel once every 17 msec is enough to produce the illusion of steady images. Nevertheless, human visual-evoked potentials (VEPs) can entrain to white light flickering at 60 Hz (Regan and Spekreijse, 1986). We wanted to know how pervasively neurons in the primary visual cortex modulate their spiking with the video refresh of a CRT and whether the response varies with the stimulus being displayed.
The temporal frequency responses of retinal ganglion cells (Frishman et al., 1987; Lee et al., 1990; Purpura et al., 1990) and visual neurons in the lateral geniculate nucleus (LGN) (Derrington and Fuchs, 1979; Derrington and Lennie, 1984) extend to high enough frequencies to suggest that such subcortical visual neurons would have spike trains modulated in time by the video refresh impulses. In contrast, neurons in macaque V1 usually do not follow 60 Hz sinusoidal temporal modulation (Hawken et al., 1996), and cat V1 cortical neurons have been reported to have temporal frequency cutoffs well below 60 Hz (Tolhurst and Movshon, 1975; DeAngelis et al., 1993). However, a previous study reported that many visual neurons in the cat LGN and primary visual cortex become phase locked to 60 Hz video rates on CRTs (Wollman and Palmer, 1995). As discussed below, it is difficult to tell how many of the cat cortical neurons reported in that paper truly were phase locked to the CRT refresh, and it is important to study this question in primates to understand how much human visual perception is affected by CRT refresh rates. Building on our earlier work (Mechler et al., 1996), the present study quantifies how much entrainment to the CRT video refresh is observable in primate visual cortex.
Because V1 is the primary source of VEPs recorded over occipital cortex (Maier et al., 1987), single unit recordings in macaque V1 can be compared to human VEPs. Previous studies have tested whether human occipital VEPs could be phase locked to a video display (Lyskov et al., 1998; Krolak-Salmon et al., 2003). Spatially uniform displays produced phase locking of VEPs to 60 and 72 Hz refresh rates in some subjects (Lyskov et al., 1998). Using patterned stimuli, Krolak-Salmon et al. (2003) found fairly strong entrainment to 70 Hz refresh rates in two epileptic human subjects.
We found entrainment to video displays both in macaque V1 neurons and in occipital VEPs in normal human observers, and phase locking increased with increasing contrast. Therefore, under ordinary viewing conditions when there is typically a high-contrast dynamical pattern on a CRT refreshed at 60 Hz, it is likely that most neurons in V1 cortex of monkeys and humans are firing nerve impulses that are phase locked to the video refresh. Up to at least 100 Hz, some cells continue to phase lock in the presence of a high-contrast stimulus.
Materials and Methods
Human VEP
Participants. Seven observers participated, two male and five female, ranging in age from 21 to 31. Each was informed of the purpose of the experiment and signed an informed consent approved by the Hunter College Institutional Review Board. All had normal or corrected-to-normal vision.
Apparatus. Stimuli were generated with a VENUS system (Neuroscientific, Farmingdale, NY) and displayed on a 17 inch iiyama Vision Master Pro 413 CRT (Nagano, Japan). The video frame buffer was swapped synchronously with the vertical refresh of the display, which was set to one of three rates: ∼57.47 Hz (57 Hz), ∼71.84 Hz (72 Hz), or ∼89.80 Hz (90 Hz). These frame rates were chosen so that they would be well separated from the 60 Hz power line frequency in the data spectra (see Fig. 8). White posterboard masked all but a central, 20.8 cm wide by 7.3 cm high, region of the CRT. This was done to minimize the timing difference between the presentation of the first and last pixels in a stimulus and in this way to minimize the temporal dispersion in response. We calculate that each stimulus frame lasted approximately one-third of one refresh period.
In accord with the international 10-20 system, the EEG was recorded from the midline of the head with electrodes placed at an active site Oz over primary visual cortex, referenced to a site on the vertex of the head, Cz, with a floating ground electrode placed midway between these sites at Pz. The EEG was bandpass-filtered (from 0.1 Hz to 1 kHz) and digitized synchronously with the video frame refresh: eight samples per frame at 57 Hz, six samples per frame at 72 Hz, and four samples per frame at 90 Hz. All data analysis was performed on the raw time series with MatLab (The MathWorks, Natick, MA) software.
Stimuli. Six stimuli were presented to each observer and included (1) a uniform luminance display held at the mean luminance, “Luminance-stationary”; (2) a uniform luminance display swapping between low and high luminance at a frequency of ∼1 Hz, “Luminance-modulated”; (3) a checkerboard pattern consisting of squares 0.3° to a side set to either low or high luminance, “Checkerboard-stationary”; (4) a contrast-reversing checkerboard with individual squares swapping between low and high luminance at ∼1 Hz, “Checkerboard-modulated”; (5) a sinusoidal grating pattern with spatial frequency 0.5 cycle/degree drifting upward with a temporal frequency of ∼2.5 cycles/sec, “Drifting sinusoidal grating”; and (6) a control, which consisted of white posterboard covering the CRT while the Luminance-stationary stimulus was displayed. The control stimulus was designed to reveal any electrical interference at the video refresh frequency that might be produced by the CRT or video card and picked up by the recording electrodes. Mean luminance was 50 cd/m2, high luminance 100 cd/m2, and low luminance nominally 0 cd/m2. In the control condition the posterboard was illuminated so that its luminance was similar to the mean of the screen.
Procedure. Observers binocularly viewed the CRT from a distance of 115 cm, making the visible portion of the monitor 10° wide by 3.6° high. They were instructed to maintain fixation on a small black circle at the center of the display rectangle throughout the presentation of each stimulus. At each refresh rate the observers were shown a block of 24 stimuli consisting of four repeats of all six stimuli. Three blocks (one at each frame rate) were presented to each observer in random order, and the order of stimuli was randomized within each block. Stimuli lasted ∼20 sec each and were separated by pauses of 5-10 sec. Observers rested for 1-2 min between blocks.
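The stated angular size of the visible region can be checked from the mask dimensions (20.8 cm by 7.3 cm) and the 115 cm viewing distance. The following is a minimal sketch that assumes a flat region centered on the line of sight; the function name `visual_angle_deg` is illustrative and not part of the original analysis.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Full visual angle (degrees) of a flat object centered on the line of sight."""
    return 2.0 * math.degrees(math.atan((size_cm / 2.0) / distance_cm))

# Mask dimensions and viewing distance as given in the text.
width_deg = visual_angle_deg(20.8, 115.0)
height_deg = visual_angle_deg(7.3, 115.0)
print(f"{width_deg:.1f} x {height_deg:.1f} deg")  # prints "10.3 x 3.6 deg"
```

The result agrees with the reported 10° by 3.6° to within rounding.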
Analysis. To assess whether the VEP was locked to the video refresh, we tested whether the mean spectral amplitude at the frame rate was significantly greater than that expected by chance. We made two reasonable assumptions: (1) that the spectrum is flat in the immediate vicinity of the refresh rate and (2) that the real and imaginary Fourier components of any noise (non-phase-locked) process at the frame rate consist of independent Gaussian-distributed random variables with zero mean. Under these conditions an F test can be constructed to test for the presence of a line in a spectrum (Thomson, 1982; Victor and Mast, 1991). We calculated whether a statistically significant signal was present by using the following statistic (Victor and Mast, 1991):
$$T^2_{\mathrm{circ}} = \frac{(N-1)\,|\bar{z}|^2}{\sum_{j=1}^{N} |z_j - \bar{z}|^2}$$
where the z_j are N independent estimates of the complex Fourier components at the frame rate and z̄ is their empirical mean. For noise alone, N·T²circ is distributed as the F[2,2N-2] statistic, so we determined that a statistically significant signal was present when N·T²circ was in the upper 1% of this F distribution.
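The line test can be sketched in a few lines. This is a hedged illustration, not the original MatLab analysis: it assumes the T²circ form of Victor and Mast (1991), T²circ = (N-1)|z̄|² / Σ|z_j - z̄|², with N·T²circ compared against F[2,2N-2]. It also uses the closed-form upper tail of an F distribution with two numerator degrees of freedom, P(F > x) = (1 + 2x/d)^(-d/2), so no statistics library is needed. All function names are illustrative, and the data are synthetic.

```python
import random

def t2circ(z):
    """T2circ for a list of complex Fourier estimates (assumed form, see lead-in)."""
    n = len(z)
    mean = sum(z) / n
    scatter = sum(abs(zj - mean) ** 2 for zj in z)
    return (n - 1) * abs(mean) ** 2 / scatter

def f2_upper_tail(x, d2):
    """P(F > x) for an F distribution with (2, d2) degrees of freedom."""
    return (1.0 + 2.0 * x / d2) ** (-d2 / 2.0)

def frame_rate_line_test(z, alpha=0.01):
    """Return (F value, p value, significant?) for estimates z at the frame rate."""
    n = len(z)
    f_value = n * t2circ(z)
    p_value = f2_upper_tail(f_value, 2 * n - 2)
    return f_value, p_value, p_value < alpha

# Synthetic example: a constant phase-locked component plus Gaussian noise.
rng = random.Random(0)
locked = [complex(5.0 + rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(64)]
f_value, p_value, significant = frame_rate_line_test(locked)
print(significant)
```

The closed-form tail applies only because the numerator has two degrees of freedom (the real and imaginary parts of the mean); a general F test would need a numerical routine.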
We derived our estimates of the Fourier components by breaking each 20 sec repeat into shorter segments (16 segments for 57 and 72 Hz; 12 segments for 90 Hz) and pooling these together across the four repeats. Estimates of the Fourier components at the refresh frequency were calculated by applying the discrete Fourier transform, with a rectangular window and no tapering, to each segment.
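Because the EEG was digitized synchronously with the frame refresh, the frame rate corresponds to exactly one cycle per `samples_per_frame` samples. A minimal sketch of the segment-wise estimate follows. It assumes that each segment is trimmed to a whole number of frames (so that the frame rate falls exactly on a DFT bin) and that the transform is normalized by segment length; neither detail is stated in the text, and the function name is illustrative.

```python
import cmath
import math

def frame_rate_components(samples, samples_per_frame, n_segments):
    """Complex Fourier component at the frame rate for each segment.

    Rectangular window, no tapering. Segments start on frame boundaries so
    that phases are comparable across segments.
    """
    seg_len = len(samples) // n_segments
    seg_len -= seg_len % samples_per_frame  # assumption: whole frames only
    estimates = []
    for s in range(n_segments):
        segment = samples[s * seg_len:(s + 1) * seg_len]
        z = sum(x * cmath.exp(-2j * math.pi * n / samples_per_frame)
                for n, x in enumerate(segment)) / seg_len
        estimates.append(z)
    return estimates

# Synthetic check: a unit-amplitude cosine at the frame rate with phase 0.7 rad.
phi = 0.7
samples = [math.cos(2 * math.pi * n / 8 + phi) for n in range(8 * 10 * 16)]
components = frame_rate_components(samples, samples_per_frame=8, n_segments=16)
```

With this normalization a cosine of amplitude A and phase φ yields A/2 · exp(iφ) in every segment. Pooling 16 segments across four repeats would give N = 64 estimates (48 for 12 segments).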
Single unit recording in macaque V1 cortex
Animal preparation. Electrophysiological experiments were performed on anesthetized and paralyzed macaque monkeys (Macaca fascicularis), as described in detail previously (Ringach et al., 2002), in compliance with regulations of the National Institutes of Health and the New York University Animal Use Committee. Briefly, anesthesia was induced with ketamine (30 mg/kg, i.m.) and maintained with intravenous infusion of sufentanil citrate (6-18 μg · kg⁻¹ · hr⁻¹). Muscle paralysis was induced and maintained with intravenous infusion of pancuronium bromide (0.1 mg · kg⁻¹ · hr⁻¹). The monkey was put on a respirator. Expired CO2, blood pressure, and EKG were measured with a Hewlett-Packard model 78345A patient monitor (Palo Alto, CA). EEG and core body temperature also were monitored continuously. Anesthesia was adjusted to keep these parameters in a physiological range. Anesthesia was maintained at a steady level based on the following criteria: the EEG contained both low and high frequencies and occasionally exhibited spindling, the heart rate was slow and steady, and expired CO2 was between 4.5 and 5%. Body temperature was kept at 37°C. Initially, ophthalmic atropine sulfate (1%) was administered to the eyes to dilate the pupils. Then a topical antibiotic solution (gentamicin sulfate, 3%) was applied to the eyes. For the duration of the experiment the eyes were protected by clear, gas-permeable contact lenses. The foveae were mapped onto a tangent screen by using a reversing ophthalmoscope. The visual receptive fields of isolated single neurons then were mapped onto the tangent screen with reference to the foveae. Cells typically were recorded at 2-8° eccentricity, comparable to the bias of the occipital VEP signal toward the central visual field (Maier et al., 1987).
A craniotomy over occipital cortex enabled us to advance microelectrodes into V1 cortex. Extracellular action potentials from single neurons were recorded with glass-coated tungsten microelectrodes (Merrill and Ainsworth, 1972). Spike waveforms were discriminated with a BAK Instruments discriminator (Germantown, MD). The occurrences of action potentials were time stamped with 1 msec precision by a CED 1401+ data acquisition system (Cambridge Electronic Design, Cambridge, UK). The spike times were stored on a computer disk for off-line processing. Assignment of the recorded neurons to cortical layers was accomplished via histological reconstruction of the electrode penetrations that were marked by small electrolytic lesions (cf. Hawken et al., 1988). The data reported in this study are derived from cells recorded in 15 animals.
Stimuli. For stimulus presentations to the monkey we used two different setups. For the first group of cells a Silicon Graphics Elan R4000 computer (Mountain View, CA) generated visual displays on a Barco color monitor (Duluth, GA) at a refresh rate of 60 Hz. For the purposes of Figure 1 B only, some of these cells also were shown stimuli generated by a special purpose instrument, which was designed and built at Rockefeller University (Milkman et al., 1980). This instrument was controlled by a PDP-11/83 computer. It produced a display on a Tektronix 690 color monitor (Beaverton, OR), with a vertical frame refresh rate of 135 Hz. With both display systems the mean luminance on the CRT display was 60-70 cd/m2, and the circular stimulus patch was 2-4° in diameter. On the Barco display there was an additional large gray-white background at the same mean luminance as the stimulus patch.
For the second group of cells either a Silicon Graphics O2 computer or a Dell PC generated stimuli on a Sony Trinitron color monitor GDMF520 (Tokyo, Japan) with a refresh rate of 100 Hz and mean luminance of 90-100 cd/m2. The stimuli on the Sony also were surrounded by a uniform luminance background set to the mean stimulus luminance.
Each cell was stimulated monocularly through the dominant eye and characterized by measuring its steady-state response to conventional drifting gratings (the nondominant eye was occluded). The receptive field of the neuron was mapped on a plotting table, and the middle of the receptive field was directed to the middle of the CRT display with a mirror. The stimulus patch was centered on the visual receptive field of the neuron and fully covered it. Orientation, spatial frequency, and temporal frequency were optimized for the neuron, and contrast was varied to obtain contrast-response functions. Blank trials were interleaved with stimulus trials; the duration of each trial was 2-4 sec. On “blank” trials the screen was of uniform luminance at the same mean luminance as that of the stimulus on stimulus trials. The temporal rate of drift of the drifting grating stimuli was synchronized with the screen refresh. This meant that there was always an integral number of frames in one temporal period or cycle of the drift of the grating. Thus the temporal rates of drift for the 60 Hz display, for example, were limited to values such as 4, 5, 6, 6.667, 7.5, and 10 Hz, the temporal periods of which were integral multiples of the refresh time period.
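The constraint that one drift cycle spans a whole number of frames means that every available drift rate is the refresh rate divided by an integer. A short sketch (the function name and the frequency limits are illustrative) enumerates the candidates for a 60 Hz display:

```python
def allowed_drift_rates(refresh_hz, min_hz, max_hz):
    """Drift rates whose period is an integer number of refresh frames."""
    rates = []
    frames_per_cycle = 1
    while refresh_hz / frames_per_cycle >= min_hz:
        rate = refresh_hz / frames_per_cycle
        if rate <= max_hz:
            rates.append(rate)
        frames_per_cycle += 1
    return sorted(rates)

rates_60 = allowed_drift_rates(60.0, min_hz=4.0, max_hz=10.0)
print([round(r, 3) for r in rates_60])
```

The examples quoted in the text (4, 5, 6, 6.667, 7.5, and 10 Hz) correspond to 15, 12, 10, 9, 8, and 6 frames per cycle; other integer frame counts in the same range give further permissible rates.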
Analysis. A cell was used in this study only if it could be driven to fire by an optimal stimulus to at least 5 spikes/sec above its spontaneous rate.
We categorized cells into simple and complex classes on the basis of the following modulation ratio:
$$\text{modulation ratio} = \frac{R(F_1)}{R(F_0)}$$
where R(F1) is the fundamental response amplitude and R(F0) is the DC response to drifting sine gratings. If the modulation ratio is >1, we call the cell a simple cell; a complex cell has a modulation ratio <1 (but see Mechler and Ringach, 2002).
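As an illustration of the classification, the sketch below computes R(F1)/R(F0) from a cycle-averaged firing-rate histogram. It assumes that R(F1) is the peak amplitude of the fundamental and R(F0) is the mean rate over the cycle, with no correction for spontaneous activity; those conventions, and the function names, are assumptions rather than details given in the text.

```python
import cmath
import math

def modulation_ratio(cycle_histogram):
    """R(F1)/R(F0) for firing rates in equal bins spanning one stimulus cycle."""
    n = len(cycle_histogram)
    f0 = sum(cycle_histogram) / n
    f1 = 2.0 * abs(sum(r * cmath.exp(-2j * math.pi * k / n)
                       for k, r in enumerate(cycle_histogram))) / n
    return f1 / f0

def classify(ratio):
    """Simple if ratio > 1, otherwise complex (a ratio of exactly 1 is not addressed in the text)."""
    return "simple" if ratio > 1 else "complex"

n_bins = 64
# Synthetic complex-like response: elevated mean rate with weak modulation.
weak = [10 + 2 * math.cos(2 * math.pi * k / n_bins) for k in range(n_bins)]
# Synthetic simple-like response: half-wave rectified sinusoid.
rectified = [max(0.0, 20 * math.cos(2 * math.pi * k / n_bins)) for k in range(n_bins)]
print(classify(modulation_ratio(weak)), classify(modulation_ratio(rectified)))
```

A half-wave rectified sinusoid gives a ratio close to π/2, which is why strongly modulated cells fall above the criterion of 1.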
To determine whether the spikes of a cell were phase locked to the video refresh, we calculated the power ratio as follows:
$$\text{power ratio} = \frac{\frac{1}{N}\sum_{i=1}^{N} |z_i|^2}{\frac{1}{M}\sum_{j=1}^{M} |y_j|^2}$$
where z_i are N independent estimates of the Fourier components at the frame rate (60 or 100 Hz) and y_j are M independent estimates of a baseline response, consisting of the Fourier components in the 20 Hz surrounding the frame rate (50-70 or 90-110 Hz). M counts the total number of estimates pooled across trials and across the baseline range.
We derived our estimates of the Fourier components for the stimulus-driven case from multiple presentations of a drifting sinusoidal grating optimized for orientation, spatial frequency, and temporal frequency. For nearly all cells these data came from repeats of the three or four contrasts higher than 33% in a contrast-response experiment. Each 2-4 sec trial was broken into 1 sec segments with 50% overlap, yielding 1 Hz frequency resolution and a typical total of N = 42 estimates of the power at the frame rate. As with the VEPs, we calculated estimates of the Fourier components with a rectangular window and no tapering. When calculating the mean power in the baseline range, we excluded the power at the frame rate and at the frame rate plus or minus multiples of the stimulus temporal frequency because lines could appear at these interaction frequencies. This was done for both simple and complex cells. Because our spectral resolution was 1 Hz, this procedure would eliminate the majority of estimates for cells with optimal driving frequencies slower than 2 Hz. We therefore excluded these cells (12 total) from this study. We typically had M = 588 total estimates from the baseline range.
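The baseline selection and the ratio itself can be sketched as follows. This is a hedged illustration: it takes the power ratio to be the mean power at the frame rate divided by the mean power over the retained baseline bins, and it treats a bin as an interaction frequency when its offset from the frame rate is an integer multiple of the stimulus temporal frequency. Function names and the tolerance are illustrative.

```python
def baseline_bins(frame_hz, stim_hz, half_width_hz=10, resolution_hz=1.0, tol=1e-6):
    """Baseline frequencies around the frame rate, excluding the frame rate
    itself and frame rate +/- integer multiples of the stimulus frequency."""
    n_side = int(round(half_width_hz / resolution_hz))
    bins = []
    for k in range(-n_side, n_side + 1):
        if k == 0:
            continue  # the frame rate itself
        offset = k * resolution_hz
        multiple = offset / stim_hz
        if abs(multiple - round(multiple)) < tol:
            continue  # interaction frequency
        bins.append(frame_hz + offset)
    return bins

def power_ratio(z, y):
    """Mean power of frame-rate estimates z over mean power of baseline estimates y."""
    numerator = sum(abs(v) ** 2 for v in z) / len(z)
    denominator = sum(abs(v) ** 2 for v in y) / len(y)
    return numerator / denominator

for stim_hz in (1.0, 2.0, 5.0):
    print(stim_hz, len(baseline_bins(60.0, stim_hz)))
```

With 1 Hz resolution, a 5 Hz stimulus leaves 16 of the 20 candidate bins, a 2 Hz stimulus leaves only 10, and a 1 Hz stimulus leaves none, which illustrates why cells with very slow optimal drift rates were excluded.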
Because the frame rate response of a simple cell is modulated across the stimulus cycle, unlike the uniform response of a complex cell, some of its power at the frame rate is lost to interaction frequencies. To mitigate this loss and to make the power ratio more directly comparable between simple and complex cells, for simple cells we took the power that exceeded the baseline level at the interaction frequencies on either side of the frame rate frequency and added the mean of these two terms to the numerator of the power ratio. No cells switched their classification as “significantly” or “not significantly” phase locked as a result of this procedure.
For a cell to be considered significantly phase locked, we required the power ratio to reach the upper 1% of a bootstrapped distribution. The distribution was created by calculating the power ratio exactly as described above, except with all estimates being drawn randomly with replacement from the baseline values (again excluding the frame rate and all interaction frequencies). The number of values used for the numerator and denominator of this synthetic power ratio exactly matched the number used previously to calculate the actual power ratio for a given cell. After 10,000 repeats of this procedure the upper 1% level was chosen as the critical level of the power ratio needed for that cell to be considered significantly entrained. For comparison, when no signal is present, the power ratio is approximated reasonably by the F[2N-2,2M-2] statistic for large N and M if we assume that any noise produces Gaussian-distributed Fourier components of zero mean. The critical level determined by bootstrapping was for all cells slightly more conservative than the upper 1% level of this F distribution. In practice, throughout the 60 Hz population the cells were entrained significantly if their power ratio exceeded ∼1.7; significant entrainment occurred for the 100 Hz population at power ratios above 1.44.
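The bootstrap for the critical level can be sketched as below. The sketch assumes that the caller supplies baseline power values from which the frame rate and interaction frequencies have already been removed, and that the synthetic ratio is a ratio of mean powers; the function name, seed handling, and synthetic data are illustrative.

```python
import random

def bootstrap_critical_level(baseline_power, n_num, n_den,
                             n_boot=10000, quantile=0.99, seed=0):
    """Upper-tail critical power ratio when numerator and denominator are both
    resampled, with replacement, from baseline power values."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        numerator = sum(rng.choices(baseline_power, k=n_num)) / n_num
        denominator = sum(rng.choices(baseline_power, k=n_den)) / n_den
        ratios.append(numerator / denominator)
    ratios.sort()
    return ratios[min(int(quantile * n_boot), n_boot - 1)]

# Synthetic baseline: exponentially distributed power, as expected for
# Gaussian Fourier components. N = 42 and M = 588 follow the typical counts
# quoted in the text; fewer resamples are used here to keep the demo fast.
rng = random.Random(1)
synthetic_baseline = [rng.expovariate(1.0) for _ in range(588)]
critical = bootstrap_critical_level(synthetic_baseline, n_num=42, n_den=588, n_boot=2000)
print(critical > 1.0)
```

Matching `n_num` and `n_den` to the counts used for a given cell is what makes the critical level cell-specific, so cells with more estimates reach significance at lower power ratios.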
Estimates for the spontaneous case used stimulus blank periods from all experiments on a given cell. Analysis proceeded as for the driven case, except that there was no need to remove interaction frequencies. The power ratio and the value required to reach significance (the upper 1% of the bootstrapped distribution) were calculated for each cell. Because many blank periods were recorded for each cell, the power ratio level needed to achieve significance was lower than for the driven experiments, and in practice the bootstrap-determined critical level was equal to ∼1.4 for the entire 60 Hz population.
Results
Our data demonstrate the presence of substantial entrainment to CRT frame rates near 60 Hz in both the single unit responses from macaque V1 and the human VEP, with less entrainment at higher frequencies. First we will present the single unit results and then review the human VEP responses.
Single unit responses at 60 Hz recorded in macaque V1
Entrainment to a blank screen
At a 60 Hz video rate many cells in V1 cortex align their spikes to the video refresh even in the absence of any spatial contrast, as depicted in Figure 1A for a representative V1 neuron of this type. The receptive field of the V1 neuron was positioned in the middle of the uniform luminance (“blank”) CRT display screen. The spikes in the raster plot (Fig. 1A, second row) and peaks in the spike histograms (Fig. 1A, third row) occur in clusters that are separated by 16.7 msec, the time between the impulses generated on each screen refresh (Fig. 1A, top row). The plot of the power spectral density accordingly shows a prominent peak at 60 Hz as well as peaks at higher harmonics because of the tight clustering of spikes around the screen refresh (Fig. 1A, bottom row).
Note that this neuron fires at a low mean rate of 2.1 impulses/sec but that the timing of those spikes is determined by the video refresh rate. When driven with a 135 Hz refresh rate (Fig. 1B, top row), the same cell continues to fire at approximately the same low mean rate (2.6 impulses/sec) but now at random times throughout the cycle (Fig. 1B, second and third rows). At the higher refresh rate no power is present in the power spectral density at 60 Hz (Fig. 1B, bottom row). This indicates that the phase-locked response observed when the 60 Hz refresh rate was used derives from the entrainment of the neuron to the refresh rate of the visual stimulus, not from an electrical power line artifact and not from intrinsic 60 Hz oscillations. The absence of power at 135 Hz also indicates that the recording is not susceptible to electrical interference at the frame rate.
Increased single cell entrainment with visual patterned stimuli
Compared with the blank condition, a much larger fraction of cells was entrained to the 60 Hz video refresh rate under conditions in which the stimulus was a high-contrast drifting grating of spatiotemporal parameters optimized for each recorded neuron. Figure 2 shows an example of a V1 complex cell from layer 6, the spike rate of which became entrained to the video refresh rate when stimulated by high-contrast visual stimuli. This complex cell showed no entrainment to a uniformly lit screen even though it had a nonzero spontaneous firing rate. When driven strongly with an optimal drifting grating (at temporal frequency 5 Hz), however, the 60 Hz video refresh rate greatly influenced the timing of its spikes. Figure 2A shows the raster plots of the spike times, indicating robust clustering of spikes when the stimulus contrast exceeded 10%. Even the spike rasters for 17% contrast are statistically significantly entrained. Figure 2B presents the same data as in Figure 2A in the form of spike density histograms. Figure 2C shows for this same complex cell that the mean amplitude of the 60 Hz component in the spike rate is a monotonic function of stimulus contrast. In fact, the 60 Hz amplitude grows with increasing contrast almost as fast as the stimulus-driven mean spike rate. When the visual stimulus on the CRT screen has a large enough contrast to excite this neuron (and presumably many other neurons in the visual cortex) effectively, the spike train contains a large-amplitude 60 Hz component (Fig. 2D).
Another example is given in Figure 3. These are data from a simple cell in layer 6 in the same format as for the complex cell in Figure 2. The time axis in Figure 3, A and B, comprises one temporal cycle of the drifting grating stimulus that was drifting at a temporal rate of 7.5 Hz. Spikes are clustered in time at intervals of 16.7 msec, entrained to the video refresh rate. Phase locking sharpens, and the entire group of clusters moves to the left earlier in the temporal cycle as contrast increases. This means that the spikes modulated by the 7.5 Hz drifting pattern are phase advanced with increasing contrast, as has been observed before (Reid et al., 1992; Carandini and Ferster, 1997). When stimulated with a sinusoidal grating of increasing contrast, this cell increased its firing rate (Fig. 3C, open symbols). Furthermore, its firing became increasingly phase locked to the video refresh. Entrainment is already visible and is statistically significant at 17% contrast, and it becomes more prominent at higher contrast (Fig. 3C, filled symbols). This simple cell, unlike the neuron in Figure 2, was one of the V1 neurons that was entrained significantly to a blank screen (0% contrast). Finally, this simple cell provides an example of the oft-observed peaks at nonlinear interaction frequencies between the grating drift rate and the video rate (Fig. 3D). This last observation implies that the 60 Hz modulation was present in the membrane potential of the neuron and could interact with the lower stimulus temporal frequency to produce the observed intermodulation frequencies.
V1 population analyses
To decide whether the spikes of a cell were entrained significantly to the video refresh rate, we calculated the ratio of the power spectral density of the spike train of the neuron at the video refresh rate (60 or 100 Hz), divided by the mean power in the surrounding band (50-70 or 90-110 Hz, excluding the frequencies of the video refresh as well as the video-stimulus intermodulation frequencies). We classified the spiking of a cell as “entrained” at the 1% significance level if this power ratio exceeded a criterion level, ∼1.4 for spontaneous firing and 1.7 for grating responses at the 60 Hz frame rate. The particular value of the criterion level is chosen as the 99th percentile value of a bootstrapped distribution with degrees of freedom equal to the data (see Materials and Methods). V1 cells with entrained spontaneous activity often were observed in our experiments. Of 91 cells studied with a 60 Hz video refresh rate, 32% significantly entrained their responses to the video refresh of a “blank” white screen. The extent of entrainment of the spontaneous spiking (as measured by the power ratio) weakly depended on the average spontaneous rate (Fig. 4A). The neurons that had the highest average firing rate when the stimulus was a blank screen tended to be the neurons for which spike firing was entrained most robustly to the video refresh rate. In particular, spontaneous spiking in ∼67% (20 of 30) of the neurons with spontaneous rates >3 spikes/sec was entrained significantly, but this was true only for ∼15% (9 of 61) of those with spontaneous rates <3 spikes/sec. The loose correlation indicates that average firing rate alone does not predict entrainment but that some neural mechanisms influence both average rate and entrainment. Furthermore, all simple cells with mean rates above 2 spikes/sec to the blank screen showed significant entrainment (Fig. 4A, filled circles), whereas not all complex cells with high mean rates showed entrainment even though the complex cells have, on average, higher rates to the blank screen (Fig. 4A, open symbols).
Figure 4B depicts the relation between the extent of entrainment and mean response magnitude under conditions in which cells were driven by spatially optimal, high-contrast drifting grating patterns. The spike responses in 82% (75 of 91) of cells significantly phase locked to the video refresh for high-contrast grating patterns. In 27 cells the power ratios even exceeded 10. Adding the high-contrast pattern substantially increased the number of cells that were entrained significantly and also increased the degree of entrainment in those cells.
For both the driven and the spontaneous conditions the complex cells in this study showed higher average firing rates than did the simple cells, but simple and complex cells had similar patterns of entrainment, and in both stimulus conditions the power ratio was distributed similarly in the two cell populations.
Phase locking occurred without regard to the temporal frequency tuning peaks of the cells. Significant entrainment occurred in cells with optimal temporal frequencies across the entire range included in this sample (2-15 Hz). The 75 significantly phase-locked neurons had a median optimal temporal frequency of 5 Hz; the remaining 16 cells had a median of 4 Hz.
Laminar distribution
Because not all of the cells demonstrate responses entrained by the 60 Hz video refresh even when driven by their optimal gratings, we wished to determine whether the entrained neurons might be distributed differentially across the cortical laminas. Previously it had been found that the temporal frequency responses of cells in different cortical laminas were significantly different from each other (Hawken et al., 1996). As the laminar distribution of the power ratio (Fig. 5A) indicates, cells entrained to the refresh of the blank screen were found in all layers of V1. However, the fraction of blank-entrained neurons in layer 4Cα was more than twice as large as that in the rest of V1. For the high-contrast drifting grating condition, the laminar distribution changes dramatically (Fig. 5B). Then a much higher fraction of cells in all layers is entrained to the video refresh rate. The highest fractions of entrained cells are found throughout the input layers 4Cα and 4Cβ and also in layer 4B, but a majority of cells in all layers are entrained significantly, with the highest values of power ratio found in layers 4B, 4Cα, 5, and 6.
Single unit responses at 100 Hz
Although not well documented, many physiologists probably have individually encountered phase locking to a 60 Hz video refresh and so routinely use visual displays with higher refresh frequencies. However, we found a good deal of phase locking even with a CRT display that is refreshed at a rate of 100 Hz. V1 cells in the 100 Hz sample were analyzed for significant phase locking as previously described, and cells in this population (for which more repeats were generally available than in the 60 Hz population) were entrained significantly when their power ratios exceeded 1.44. Of the 92 cells in this group, 21 (23%) significantly entrained their spikes to the 100 Hz refresh when driven by an optimal drifting grating stimulus. Only one cell was entrained to a blank screen at a 100 Hz refresh rate.
Figure 6 illustrates an example of a cell that is well entrained to the 100 Hz video rate when driven with a high-contrast stimulus. The spike rasters and histogram show obvious grouping into clusters that are separated by 10 msec. This layer 6 cell had a power ratio of 13.4, which was near the peak of the 100 Hz population but close to the average power ratio value for the entrained 60 Hz population. The median power ratio was 2.4 in the 100 Hz population.
Figure 7 shows the laminar distribution of power ratio for the entire population of 92 cells run at 100 Hz for the condition of high-contrast visual stimulation. The entrained population contains 11 simple cells and 10 complex cells. Layer 4B clearly is the layer with the highest percentage (53%) of phase-locked cells. As was the case at 60 Hz, the highest values of the power ratio are found in layers 4B, 4Cα, 5, and 6. However, unlike in the 60 Hz group, the significantly entrained neurons within the 100 Hz sample are sensitive to substantially higher temporal frequencies (median temporal frequency tuning peak of 8 Hz; n = 21) than their unentrained counterparts (median peak of 4 Hz; n = 71).
LGN
Previous research had suggested that the spike rates of macaque LGN cells, the main thalamic input to the V1 cells we studied, are entrained strongly to video temporal modulation rates, especially in the magnocellular layers of the LGN (von Blanckensee, 1981). We replicated this finding in one experiment in which we recorded from the LGN with a low-impedance electrode while the monkey viewed a uniformly lit CRT screen with a refresh rate either at 60 or at 120 Hz. There was a large local field potential (LFP) in the macaque LGN that was phase locked to the 60 Hz video refresh rate, and the LFP amplitude was maximal in the magnocellular layers of the LGN (data not shown). An LFP entrained to the CRT refresh rate was not evident when the refresh rate was increased to 120 Hz.
Human VEP responses at 57 Hz
Like the macaque single units, human VEPs more typically phase locked to stimuli containing spatial contrast than to spatially uniform stimuli. Figure 8 illustrates the response of subject KW031 to a contrast-reversing checkerboard stimulus and to a control stimulus, which was white posterboard covering the running CRT and lit to the same mean luminance as the checkerboard (see Materials and Methods). As expected, the checkerboard reversals stimulated robust responses (Fig. 8A), seen as prominent peaks in the spectrum of the EEG at both the 2 Hz reversal frequency and at the 4 Hz second harmonic. The control stimulus yielded no evoked response at these frequencies. There was also a high-amplitude peak in the EEG spectrum at the video refresh rate of 57.47 Hz (Fig. 8B, black trace) when the checkerboard stimulus was presented. Note that the region of the spectrum surrounding the 57.47 Hz peak was flat by comparison. There was no peak at the refresh rate when the control stimulus was used, indicating that the averaged VEP spectrum was free of frame rate artifacts caused by the CRT or video card. Figure 8C portrays the same data in the time domain after averaging over one video refresh cycle. It is just as clear in the time domain that a VEP that is phase locked to the refresh rate emerges when the CRT is displaying the checkerboard, but not when the CRT is covered.
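The cycle averaging behind a time-domain view such as Figure 8C can be sketched in a few lines of Python. This is an illustration only, not the analysis code used in the study; the function name, the choice of 16 phase bins, and the sampling rate used in any call are assumptions made for the example. Each sample is assigned to a phase bin of the refresh cycle, so activity locked to the refresh survives the averaging while activity at unrelated frequencies (such as the 2 Hz reversal response) tends to cancel.

```python
import math


def cycle_average(samples, sample_rate_hz, refresh_hz, n_bins=16):
    """Average a sampled signal over one video refresh cycle.

    Hypothetical helper: bins each sample by its phase within the refresh
    cycle and returns the mean of the samples in each phase bin.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for k, value in enumerate(samples):
        # Fraction of the refresh cycle elapsed at sample k, in [0, 1).
        phase = (k * refresh_hz / sample_rate_hz) % 1.0
        b = min(int(phase * n_bins), n_bins - 1)
        sums[b] += value
        counts[b] += 1
    # Empty bins (possible only for very short records) are reported as NaN.
    return [s / c if c else math.nan for s, c in zip(sums, counts)]
```

Binning by phase, rather than cutting the record into fixed-length segments, avoids the problem that a refresh period of 1/57.47 sec is not a whole number of samples at typical EEG sampling rates.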
To characterize quantitatively the VEP responses to all six stimuli used (luminance-stationary, luminance-modulated, drifting sinusoidal grating, checkerboard-stationary, checkerboard-modulated, and control), we calculated estimates of the Fourier components at the frame refresh frequency for each experimental trial and then took their vector averages, as described in Materials and Methods. The polar plots in Figure 9 depict the individual trial estimates (gray arrows) together with their vector averages (black arrows) for subject KW031. The VEP phases fell into a small range in each of the experimental conditions except the control, indicating that there was a coherent response to the video refresh under these conditions. Experimental conditions also produced relatively large VEP amplitudes. The checkerboard stimuli in particular evoked large responses at the refresh rate (Fig. 9, bottom row). As was the case for most subjects, for subject KW031 the VEP component at the refresh rate was smaller for the drifting sinusoidal grating and luminance (spatially uniform) stimuli than for the checkerboards. For KW031, as for all seven subjects, the phase of the VEP response to the video refresh impulses is determined by the spatial structure of the stimulus (one phase for the checkerboards, one for the uniform screen, and a third for the sinusoid) with little variation between stationary and contrast-reversing conditions. Whereas VEP phases typically advance with increasing stimulus contrast (Zemon et al., 1988) and lag with increasing spatial frequency (Zemon et al., 1997), our stimuli do not vary smoothly along these two dimensions, and thus it is difficult to compare the response phases. In all of our subjects, as exemplified by KW031 in Figure 9, the control condition was associated with small-amplitude components at the video refresh rate that had no preferred phase from trial to trial, resulting in a vector average near zero.
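The per-trial estimate and its vector average can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure given in Materials and Methods: the function names are hypothetical, the estimate is a plain single-frequency discrete Fourier sum, and the scaling (a factor of 2/N) is chosen so that a sinusoid of amplitude A spanning a whole number of cycles returns a magnitude of A.

```python
import cmath
import math


def fourier_component(samples, sample_rate_hz, freq_hz):
    """Complex Fourier component of one trial at a single frequency.

    The magnitude is the response amplitude and the argument is the response
    phase (for a cosine-phase reference), as plotted by one arrow in a polar
    plot.
    """
    n = len(samples)
    acc = 0j
    for k, x in enumerate(samples):
        acc += x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
    return 2.0 * acc / n


def vector_average(components):
    """Mean of the per-trial complex components.

    Trials with a consistent phase add coherently; trials with random phase
    cancel, which drives the average toward zero.
    """
    return sum(components) / len(components)
```

Because the average is taken over complex values rather than over amplitudes, a condition such as the control, in which individual trials have nonzero amplitude but no preferred phase, yields a vector average near zero.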
With the Tcirc2 statistic (Victor and Mast, 1991), we evaluated whether the mean vector amplitude was significantly different from zero. Figure 10 shows the responses of all seven subjects to all of the stimuli at the 57 Hz frame rate. All subjects had statistically significant responses at a 1% level to the video rate with the checkerboard stimuli, and none produced a significant response to the control. Responses to the remaining stimuli varied: three subjects (KK021, LS031, PW021) (Fig. 10, bottom row) had no other significant responses, whereas the four remaining subjects had significant responses to all of the other stimuli (Fig. 10, top row). For two of these subjects (AL021, KW031) the checkerboards drew the highest response, followed by the sinusoid and then the uniform field stimuli, whereas the other two subjects (DX, KD021) produced the largest amplitude responses to the stationary uniform field and the sinusoid, followed by the checkerboards and last by the flickering uniform field. For all seven subjects the sinusoid and at least one of the two types of uniform field stimuli drew statistically indistinguishable responses. This last finding appears inconsistent with the neurophysiological finding that high-contrast optimal gratings more than doubled the fraction of video-entrained neurons relative to the fraction entrained by a uniform luminance blank screen. The apparent contradiction could be explained if the drifting grating used in the VEP study was a less effective stimulus for most V1 neurons than each neuron's own optimal grating. In short, stimuli containing spatial contrast were more likely to generate entrained responses, and those responses were commonly stronger than the responses produced by the uniform luminance stimuli.
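A test of this kind can be sketched as follows. The sketch assumes the commonly cited form of the statistic from Victor and Mast (1991), in which Tcirc2 = (M - 1)|mean|^2 / sum|z_j - mean|^2 for M trial estimates z_j, and M times Tcirc2 follows an F distribution with 2 and 2M - 2 degrees of freedom when the true mean is zero; readers should check that form against the original publication. The function name is hypothetical, and the code is an illustration rather than the analysis code used here.

```python
def tcirc2(components):
    """Tcirc2 test of whether the mean of complex trial estimates is zero.

    Returns (t2, p). Assumes equal, independent variance in the real and
    imaginary parts, which is the premise of the statistic.
    """
    m = len(components)
    if m < 2:
        raise ValueError("need at least two trial estimates")
    mean = sum(components) / m
    scatter = sum(abs(z - mean) ** 2 for z in components)
    if scatter == 0.0:
        raise ValueError("trial estimates are identical; variance is zero")
    t2 = (m - 1) * abs(mean) ** 2 / scatter
    f_stat = m * t2
    # For an F distribution with 2 and d2 degrees of freedom, the upper tail
    # is (1 + 2 * F / d2) ** (-d2 / 2); here d2 = 2 * m - 2.
    p = (1.0 + f_stat / (m - 1)) ** (-(m - 1))
    return t2, p
```

A condition would be called significant at the 1% level when the returned p value is below 0.01. The closed-form tail probability is used only because the numerator has two degrees of freedom; it keeps the sketch free of external libraries.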
VEP responses at 72 and 90 Hz
At frame rates higher than 57 Hz, entrainment tapered off, although the VEP was again more likely to entrain to the stimuli that contained spatial contrast. We measured significantly entrained responses at 72 Hz in some, but not all, observers. Figure 11A shows the response of subject KW031 to the control and experimental stimuli run at a 72 Hz video rate. As before, the checkerboards produced clustered responses at the video rate, although the response amplitudes were reduced compared with those observed at the 57 Hz refresh rate. The sinusoid also stimulated a smaller but well entrained response; the uniform luminance stimuli, however, evoked no coherent responses, similar to the control. At 90 Hz (Fig. 11B) we failed to measure entrainment in any of the experimental conditions for this subject.
The overall pattern of responses for all observers to the 72 and 90 Hz refresh rates matched this example (Table 1). At 72 Hz, significant entrainment occurred most often in response to the checkerboard and sinusoidal stimuli, whereas the uniform luminance stimuli generated significant responses in only one observer. At 90 Hz we did not see significant entrainment to the refresh rate with any of the stimuli in any of the observers.
Distribution of 60 Hz response phases
For the substantial VEP phase locking that we observed to arise, a bias toward a particular phase must be present across the neuronal population. Accordingly, we investigated whether any such bias was present in our population of macaque single units. Because the receptive fields of the neurons were always directed by a mirror to the center of the CRT screen, the video impulses that excited the receptive fields of the cells always occurred at approximately the same temporal phase of the refresh cycle. This enabled us to test whether the responses of the cells were phase locked with each other. Figure 12 illustrates the distribution of response phase at 60 Hz across the V1 population studied with optimal high-contrast stimulation. It is evident that, although there is variation across the population in the phase of response to the video refresh, there is a peak at π/2 in the phase distribution. Therefore, across V1 the neuronal spikes did tend to cluster in phase with the refresh rate of the CRT display.
Discussion
Primate V1 responds robustly to the 60 Hz video refresh rate of typical TV-type displays on CRT screens. This is found in both macaque monkeys and humans. This phase locking of neuronal spikes to the frame refresh is much stronger and more significant when there are high-contrast patterns on the video screen than when the screen is illuminated uniformly as a blank screen. From the neurophysiological experiments on monkeys we conclude that neurons in all layers of V1 can be entrained to a 60 Hz video frame refresh when high-contrast patterns are on the screen, and a significant but smaller fraction can be entrained even at 100 Hz. This means that neurons that project to other cortical areas from the output layers 2/3, 4B, and 6 are likely to be sending refresh-locked spikes to their targets in extrastriate visual cortex.
Previously, Wollman and Palmer (1995) reported that cat V1 cortical neurons and cat LGN cells also fired spikes that were phase locked to the video refresh rate of CRT monitors. In these experiments the V1 and LGN neurons were stimulated by an effective visual pattern such as a drifting bar or grating. The authors tried to estimate the maximal number of cells that might have fired spikes entrained to the video refresh rate so that they could avoid stimulus-related artifacts contaminating results in cross-correlation studies. They quantified the degree of entrainment of the spikes of each neuron with the 60 Hz refresh rate by calculating a power index as the ratio of the average power in the power spectral density between 55 and 65 Hz, divided by the average power in the 10-200 Hz band. However, no analysis was given to indicate what level of this power index might be expected by chance, and therefore one cannot evaluate the statistical significance of these results. Although Wollman and Palmer (1995, their Fig. 3) show the spike trains from one cortical cell whose spikes were clearly entrained to the 60 Hz refresh rate, that cell also had an unusually large power index of 1.6. More than 50 of the 75 V1 cells they studied had values of this power index <1.4, so it is difficult to know how many of the other cat V1 neurons they studied were entrained significantly. Our results show convincingly that there is significant entrainment to 60 Hz video displays in most neurons in macaque V1 cortex when there is a high-contrast pattern on the screen.
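A band-power index of the kind just described can be sketched as follows. This is a hedged illustration, not the computation of Wollman and Palmer (1995) and not the power ratio used in the present study: the function name, the 1 msec bin width, and the use of an unsmoothed discrete Fourier sum as the spectral estimate are all assumptions made for the example. Only the two frequency bands are taken from the description above.

```python
import cmath
import math


def band_power_index(spike_times_s, duration_s, bin_s=0.001,
                     band=(55.0, 65.0), reference=(10.0, 200.0)):
    """Mean spectral power in `band` divided by mean power in `reference`.

    Hypothetical helper: spikes are binned, and power is evaluated at the
    frequencies k / duration_s that fall inside each band.
    """
    n_bins = int(round(duration_s / bin_s))
    counts = [0] * n_bins
    for t in spike_times_s:
        # Small epsilon guards against floating-point error at bin edges.
        idx = int(math.floor(t / bin_s + 1e-9))
        if 0 <= idx < n_bins:
            counts[idx] += 1

    def power_at(freq_hz):
        # The mean rate is not subtracted: it contributes nothing at
        # nonzero multiples of 1 / duration_s.
        acc = 0j
        for k, c in enumerate(counts):
            if c:
                acc += c * cmath.exp(-2j * math.pi * freq_hz * k * bin_s)
        return abs(acc) ** 2

    def mean_power(lo_hz, hi_hz):
        first = math.ceil(lo_hz * duration_s - 1e-9)
        last = math.floor(hi_hz * duration_s + 1e-9)
        powers = [power_at(m / duration_s) for m in range(first, last + 1)]
        return sum(powers) / len(powers)

    ref = mean_power(*reference)
    if ref == 0.0:
        raise ValueError("no power in the reference band")
    return mean_power(*band) / ref
```

The sketch also makes the interpretive difficulty concrete: the index is a single number per cell, and without a null distribution (for example, from shuffled or rate-matched surrogate spike trains) there is no principled threshold separating entrained from unentrained cells.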
VEPs provide a means of comparing human responses with those recorded in animals. Because V1 is the strongest source of potentials recorded over human occipital cortex (Maier et al., 1987), we can judge whether the pervasive phase locking to patterned stimuli in macaque V1 is also present in human V1. Although V2 also contributes to the VEP signal for some visual stimuli (Maier et al., 1987), any refresh-locked signal clearly must be carried by V1 if it is also present in V2 or other visual cortical areas, so the phase locking that we found here indicates that a 57 Hz signal is carried by V1 regardless of the source of our VEP measurements.
Phase locking of the human VEP to a 60 Hz video refresh rate also has been found before. In a study of 13 normal human subjects, Lyskov et al. (1998) measured VEPs while subjects looked at a blank CRT screen refreshed at rates of 60 and 72 Hz. Lyskov and colleagues found that 10 of 13 subjects had responses to the blank screen that were significant when compared with responses to an unmodulated control stimulus. This is a higher incidence of 60 Hz phase locking to the blank screen than we found; our results were four of seven subjects responding significantly to the blank screen refreshed at 60 Hz. Although there is some quantitative disagreement about the prevalence of 60 Hz entrainment to the blank screen, there is good agreement between our results and those of Lyskov et al. (1998) on the marked reduction in entrainment to the 72 Hz frame rate.
What is striking about our results in human subjects is the great enhancement of 60 Hz entrainment caused by the presentation of high-contrast spatial patterns on the CRT display. This experiment was motivated by our previous findings of the influence of stimulus contrast in the macaque cortex experiments. A previous study on two epileptic human patients by Krolak-Salmon et al. (2003) is consistent with our results on normal human subjects. Krolak-Salmon et al. (2003) observed local field potential activity in calcarine cortex (measured with penetrating electrodes) that was phase locked to the 70 Hz video refresh rate of their CRT display. They were using high-contrast spatial patterns such as faces or checkerboards as visual stimuli. Their two patients studied in this way both showed VEP activation at the video refresh frequency.
The increase in entrained response of V1 neurons to the 60 Hz refresh rate when high-contrast spatial patterns are presented on the CRT display, as compared with the amount of entrainment when the CRT screen is blank, is a clear example of failure of superposition. Although the amplitude of the 60 or 100 Hz frame rate component did not change with stimulus contrast, the power ratio increased substantially with stimulus contrast in most cells. In other words, this phenomenon is evidence for nonlinearity in V1 cortex. We believe that understanding the (nonlinear) neural mechanisms that cause the phase locking to the raster refresh rate may be important for a comprehensive theory of V1. Cortical temporal nonlinearities that have been revealed in other experiments (Dean et al., 1982; Reid et al., 1992) may play a role in enabling V1 neurons to phase lock to the frame refresh when spatial patterns vigorously activate the visual cortex, as we found in our experiments. A temporal frequency of 60 Hz is higher than the frequencies to which most cells will respond when driven by sinusoidal drifting or contrast-reversing grating patterns (Hawken et al., 1996), and VEP responses to 60 Hz flickering unpatterned stimuli are small compared with responses at lower frequencies (Regan and Spekreijse, 1986). Why, then, are responses so pervasively phase locked to 60 Hz and some even to 100 Hz in the presence of spatial contrast? It could be that the spatial stimulus lifts the membrane potential close enough to threshold that the weak driving force of the video impulse then can produce spikes. An additional explanation is that spatial stimuli may drive the local network into a high conductance state, which would result in spiking that could track fast synaptic changes induced by the CRT impulses (Shelley et al., 2002). That the spike response to a high-frequency stimulus can be enhanced by an effective slower carrier also has been documented in auditory cortex (Elhilali et al., 2004).
Given that the response of a V1 neuron reflects the influence of many subcortical cells, it is remarkable that the spikes in an entrained cell are not smeared more broadly in time. This high precision argues both for a narrow dispersion in time of subcortical responses and also for short membrane time scales in V1 neurons, on the order of milliseconds.
From a theoretical point of view, the finding of responses in macaque and human cortex entrained to the refresh rate of the video display may be important with respect to the ongoing consideration of the importance of temporal coding in vision. The frame refresh rate is unrelated to any visual information that is present in the spatiotemporal pattern on the screen. Nevertheless, a majority of V1 neurons will phase lock to the video rate when a monkey or a human being looks at a video screen with pictures displayed on it, and, further, many of these responses are synchronized to the same phase (Fig. 12). Therefore, one would expect that if synchrony occurring in the 50-100 Hz range (gamma band) necessarily reflects perceptual binding or scene segmentation (Singer and Gray, 1995), perception should be compromised noticeably when looking at television or computer displays that have 60 Hz (or 50 Hz) frame refresh rates. However, every day more than a billion people watch TV with no obvious distortion in their perception of the color, form, and motion of pictures on the video screen. Thus phase locking in the middle of the gamma band that is irrelevant to the visual scene either does not disrupt scene organization or corrupts it so subtly that people do not notice. Furthermore, precise experiments concerned with reading on video displays (Kennedy and Murray, 1991; Kennedy et al., 1998) indicate that, apart from some possible adjustment in saccade patterns, this complex visuo-motor task is affected very little if at all by using a video display rather than continuous illumination. It seems unlikely that a temporal code that labels a surface or object with an oscillation in the gamma band could succeed if V1 can be entrained so thoroughly to external stimuli oscillating in that range.
Other kinds of temporal coding such as coincidence detection or the encoding of information with aperiodic temporal patterns like bursts are not ruled out by these results, in the 50-100 Hz range or at slower time scales (Victor and Purpura, 1996; Mechler et al., 1998).
One practical inference from our results is that, whenever a person is watching television on a CRT monitor, his/her V1 cortex is being buzzed at the video refresh rate. No one knows the long-term consequences of this modulation of the visual cortex by visually meaningless stimulation. One might speculate that the synchronization of brain activity to the video refresh rate might be one of the factors that make television viewing addictive (Kubey and Csikszentmihalyi, 2002).
Footnotes
This work was supported by National Eye Institute Grants R01 EY01472, EY08300, EY9314, P30-EY13079, and T32-7136; National Institute of Mental Health Grant MH-16705; and Professional Staff Congress City University of New York 63281-00-32. We thank C. Hau and K. Kotenko for their assistance in collecting the VEP data and D. Ringach for his help in collecting the physiological data.
Correspondence should be addressed to Patrick E. Williams, New York University Center for Neural Science, 4 Washington Place, Room 809, New York, NY 10003. E-mail: patrickw{at}cns.nyu.edu.
Copyright © 2004 Society for Neuroscience 0270-6474/04/248278-11$15.00/0