Abstract
It has recently been shown that objects with long-term reward associations can be efficiently located during visual search. The neural mechanism underlying this valuable-object pop-out is unknown. In this work, we recorded neuronal responses in the ventrolateral prefrontal cortex (vlPFC), an area with known roles in visual search and reward processing, while macaque monkeys engaged in efficient versus inefficient visual search for high-value fractal objects (targets). Behavioral results and modeling with a multialternative attention-modulated drift–diffusion model indicated that efficient search was accompanied by enhanced processing of peripheral objects. Notably, neural results showed response amplification and receptive field widening for peripherally presented targets in vlPFC during visual search. Both neural effects predict higher target detection and were correlated with it. Our results suggest that value-driven efficient search, independent of low-level visual features, arises from reward-induced enhancement of the spatial processing of peripheral valuable objects.
Significance Statement
Rapid detection of rewarding objects can be essential for survival and reproduction in real life. However, finding valuable objects, among many others, can be time-consuming and slow. In this work, we reveal reward-related changes in the receptive fields of neurons within the prefrontal cortex of macaque monkeys that help them find valuable objects more efficiently. Such reward-related plasticity is shown to develop slowly for objects that are consistently associated with reward and challenges current theories of efficient search based on low-level visual features alone.
Introduction
Low-level guiding features such as color, size, and orientation facilitate quick detection of targets in visual search through a phenomenon known as visual pop-out (Wolfe, 1994; Wolfe and Horowitz, 2004). Such visual pop-out is thought to arise from the brain's ability to simultaneously process certain low-level features in parallel across visual space in early visual areas (Kobatake and Tanaka, 1994; Gur et al., 2005; Maunsell and Treue, 2006). However, recent studies reveal that high-level features such as value can capture attention toward peripherally presented objects (Hickey et al., 2010; Anderson and Halpern, 2017; Ghazizadeh and Hikosaka, 2021). We refer to this phenomenon as value pop-out: high-value objects automatically attract attention among many distractors. Indeed, we have previously shown that this attention-capturing effect allows targets with long-term reward associations to be identified efficiently in visual search (Ghazizadeh et al., 2016). Importantly, such value-driven efficient search occurs for complex fractal patterns and has the capacity to support pop-out for many objects. However, unlike visual pop-out, pop-out of valuable arbitrary shapes cannot be easily explained by extant frameworks such as feature integration theory (Treisman and Gelade, 1980) or guided search (Wolfe, 1994) and remains to be addressed.
Recent studies have shown that cortical regions, particularly the ventrolateral prefrontal cortex (vlPFC; Abbaszadeh et al., 2023), and subcortical areas, such as the substantia nigra reticulata (SNr; Narmashir et al., 2024), show enhanced differentiation of target-present (TP) and target-absent (TA) conditions in efficient visual search for valuable targets. Among these regions, the vlPFC is known to have a more localized receptive field (RF) that can signal both the presence of the target and its location (Funahashi et al., 1990; Ghazizadeh and Hikosaka, 2021). Thus, we hypothesized that the value pop-out could arise from reward-driven enhancement of the visual processing of valuable objects in vlPFC. Such enhancement would be most beneficial for objects not yet foveated, as it can aid in guiding future saccades to acquire search targets.
To address this issue, we analyzed data from an experiment in which macaque monkeys performed both efficient and inefficient value-driven visual searches while recording vlPFC neural responses (Abbaszadeh et al., 2023). Analysis of visual search behavior using a drift–diffusion model and assessing the location dependence of neural responses in vlPFC revealed that consistent with our hypothesis, long-term reward learning induced enhanced spatial processing of peripheral objects in ways that were predicted to allow for faster detection of a valuable object in peripheral vision.
Materials and Methods
Subjects and surgery
Two male rhesus macaques, Monkey S (12 years, 10 kg) and Monkey H (10 years, 12 kg), were used in this study. All methods and animal care were approved by the Ethical Committee of the Institute for Research in Fundamental Sciences (IPM) and adhered to the standards established by the National Institutes of Health for the Ethical Treatment and Use of Laboratory Animals (IPM, Protocol Number 99/60/1/172). Initially, a titanium head holder and a recording chamber were surgically implanted in each animal under general anesthesia. The head holder was positioned along the midline over the parietal lobe in both monkeys. The recording chamber was placed over the right prefrontal cortex (PFC) for Monkey S and the left PFC for Monkey H, with a lateral tilt. To verify accurate placement of the recording chamber postsurgery, MR images of the monkeys' heads were acquired. After the monkeys had become proficient in the experimental tasks, a second surgery involving a craniotomy over the vlPFC was conducted. Neuronal data were recorded through a grid (1 mm hole diameter) that fit inside the chamber. Parts of the data in this work were previously reported by Abbaszadeh et al. (2023), which also provides further details on recording locations, stimuli, and tasks.
Recording localization
T1- and T2-weighted MR images (3T, Prisma Siemens) were used to map precise recording locations within the vlPFC. To further validate the positional accuracy of each animal's vlPFC, the National Institute of Mental Health Macaque Template toolbox was used to morph the standard monkey brain atlas with the native space of the monkeys (Seidlitz et al., 2018; Nadian et al., 2023).
Stimuli
Fractal-shaped objects (Miyashita et al., 1991), as shown in Figure 1A, were employed as visual cues. Each fractal featured a shared core encircled by four point-symmetrical polygons, superimposed with smaller ones at the forefront. The attributes of each polygon (such as size, edges, and color) were randomly selected. The average diameter of the fractals was 4°. Monkeys were exposed to numerous fractals (over 700) across various tasks; for the search task, Monkeys S and H saw 736 and 824 fractals, respectively. The monkeys viewed these fractals either during a single training session or across five or more reward training sessions, the latter yielding a collection of “overtrained” fractals.
Task paradigms and behavioral data. A, Example fractal stimuli used in value training and search tasks. Fractals were trained in sets of eight: four associated with a large (good) and four with a small juice reward (bad). Each search session used three sets (24 fractals, 12 good). B, Two different groups of sets were trained for 1 and 5+ d in the object-value training task. The different training durations created a difference in the efficiency of the subsequent search. C, Object-value training tasks consisted of force (80%) and choice (20%) trials. After central fixation, an object appeared in the periphery (9.5°). After the monkey made a saccade to the object and held gaze on it, the corresponding large or small reward was delivered (biased reward training). D, Value-driven search tasks consisted of TP trials with one good object among bad objects or TA trials with all bad objects. The number of objects shown was variable (set sizes 3, 5, 7, 9). Monkeys had 3 s for each search trial and were free to make as many saccades as needed. Holding gaze on an object constituted choosing it and was followed by the corresponding reward. Staying at the center or returning to it constituted rejecting the trial (see Materials and Methods). Example trials and eye traces for inefficient and efficient searches are color-coded by time (from orange to red). Tick marks at the bottom of trials show the timings of saccades (orange) and reward (black) relative to the set onset (purple), indicating an improvement in target selection in the TP trial and trial rejection in the TA trial in efficient search.
Task control and neural recording
A custom software program developed in the C language was employed to manage both behavioral control and recordings. A Cerebus Blackrock Microsystem was used to collect neural data (www.blackrockneurotech.com). Eye tracking was conducted using an EyeLink 1000 Plus, operating at a sample rate of 1,000 Hz. During each trial, the reward consisted of apple juice diluted with water (50 and 60% dilution for Monkeys S and H, respectively). A total of 526 well-isolated, visually sensitive neurons were included in this study (230 from Monkey S and 296 from Monkey H, recorded over 369 sessions, some yielding more than one neuron). These neurons were primarily concentrated in Area 46v, ventral to the principal sulcus. Search blocks were classified as efficient or inefficient based on the search slope criteria described below (see Search-type categorization). Neurons were recorded during efficient, inefficient, or both types of searches, with 118 and 177 neurons in efficient search and 122 and 178 in inefficient search for Monkeys S and H, respectively. In total, 138 neurons were recorded in both efficient and inefficient searches.
Object-value training task
A biased value saccade task was employed to train the object values in monkeys (Fig. 1C). During each task trial, the monkey was directed to fixate on a white dot appearing at the center of the screen. Upon maintaining fixation for 200 ms, a high-value (good) or low-value (bad) fractal was displayed at one of the eight peripheral locations (eccentricity of 9.5°). Following a 400 ms overlap period, the central fixation point disappeared, prompting the animal to execute a saccade toward the fractal and sustain gaze fixation for an additional 500 ± 100 ms to obtain either a large (for good fractal) or a small (for bad fractal) reward. After reward delivery, a variable intertrial interval (ITI) ranging from 1 to 1.5 s ensued, during which a blank screen was presented.
Each training session encompassed 80 trials, comprising 64 force trials wherein each object (four good and four bad fractals) was presented eight times in a pseudorandom order and 16 choice trials wherein one good and one bad fractal from the set of eight were simultaneously displayed in a diametrical arrangement on the screen. The timing structure for the choice trials was the same as the force trials, with the distinction that the monkey was required to select one of the fractals with a saccade. Successful execution of each trial triggered a correct tone. In contrast, an error tone was triggered in cases of early saccade to a fractal or a disruption of fixation. The outcomes observed during choice trials were used to quantify object-value learning in each monkey.
Value-driven search task
In this task, monkeys were required to identify a good object (target) within a variable number of bad objects during target-present (TP) trials or to reject the trial by returning to the center during target-absent (TA) trials, where all presented objects were bad (Ghazizadeh et al., 2016). TP and TA trials were intermixed with equal likelihood. Prior to the search task, monkeys underwent object-value training to learn the reward values of various objects (see above, Object-value training task). Each monkey learned the values of over 700 fractals (Monkeys S and H, 736 and 824, respectively), with half associated with high reward and the others associated with low reward.
Monkeys performed the search task in blocks of 240 trials, consisting of 120 TP trials and 120 TA trials. For each block, a randomly selected set of 24 fractals (12 high-value and 12 low-value) was drawn from the pool of all trained fractals. During the visual search task, any of the 12 high-value stimuli could serve as the target, while the low-value stimuli acted as distractors. Notably, the target was not explicitly cued for each trial; the monkeys identified it based on their previously learned reward associations.
The start of each trial was signaled by a pink fixation dot displayed on the screen (Fig. 1D). After 400 ms of central fixation, sets of 3, 5, 7, or 9 fractals were presented in an imaginary circle, equally spaced, with a 9.5° eccentricity (set onset). The angular location of the target was taken from a uniform distribution around the circle. For each trial, objects were pseudorandomly selected from a set of 24 fractals (12 good and 12 bad), which were drawn from three prior training sessions. The target could be any 1 of the 12 good objects (during TP trials), while the remaining objects (distractors) were selected from the bad objects. The target was not revealed to the monkeys before each trial. Following the set onset, monkeys had 3 s to either choose an object by fixating it for a minimum of 600 ms (committing time), followed by an additional 100 ms for reward receipt, or to reject the trial by remaining at the center (for 900 ms) or by returning to central fixation and staying for 300 ms after making saccades. Monkeys were permitted to make as many saccades as they wished during the 3 s and to shift their gaze away from the object before the committing time elapsed (gaze breaks during the subsequent 100 ms would result in an incorrect tone). Monkeys would receive a high reward for choosing good and a low reward for choosing bad objects. Rejecting a trial led to quick progression to the subsequent trial (after 400 ms), with a 50% chance of encountering a good object in a TP trial. For nonrejected trials, a randomized ITI of 1–1.5 s was implemented after reward delivery. Monkeys would also receive an error tone if no object was chosen after 3 s and the trial was not rejected either (rare, <0.1% of trials).
Search-type categorization
Each search session typically consisted of one block using an overtrained fractal set and another block using a set trained for a single session. Search blocks were classified as efficient or inefficient based on the search slope, calculated by linear regression (scikit-learn v1.3.1) of TP trial search times against set size. Efficiency was determined over search block trials with a specific set of 24 stimuli (12 good and 12 bad). Blocks with a search slope below the 33rd percentile of the search slope distribution were designated efficient, and blocks with a search slope above the 66th percentile were categorized as inefficient.
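The slope computation and percentile-based labeling described above can be sketched in Python. This is a minimal illustration using NumPy's `polyfit` rather than the scikit-learn regression used in the study; the function names are ours:

```python
import numpy as np

def search_slope(set_sizes, search_times):
    """Search slope (ms per item): linear-regression slope of TP search time
    against set size."""
    return float(np.polyfit(np.asarray(set_sizes, float),
                            np.asarray(search_times, float), 1)[0])

def classify_blocks(slopes):
    """Efficient below the 33rd percentile of block slopes, inefficient above
    the 66th, intermediate otherwise."""
    lo, hi = np.percentile(slopes, [33, 66])
    return ["efficient" if s < lo else "inefficient" if s > hi else "intermediate"
            for s in slopes]
```

For example, search times of 500, 540, 580, and 620 ms at set sizes 3, 5, 7, and 9 give a slope of 20 ms/item, which would fall in the inefficient range by the study's thresholds.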
Comparison of visual properties in fractals
Several metrics for each fractal were calculated using MATLAB and the SHINE toolbox, including average hue (scaled from 0 to 1), average saturation (ranging from 0, with no color, to 1, representing full saturation), and average luminance (ranging from 0 to 1, where 0 represents black with no light and 1 represents full brightness), in order to describe the color properties of the images. Root mean square contrast was calculated as the standard deviation of grayscale values across pixels, providing a measure of overall brightness contrast. Texture contrast, derived from the gray-level co-occurrence matrix, was calculated to capture local variations in grayscale levels, emphasizing intensity differences between neighboring pixels and highlighting areas of high texture or sharp edges. Since this contrast measure is based on the relative intensity differences between pixel pairs, it ranges from 0 to 1. The area of each fractal was determined by segmenting the grayscale image into a binary format and extracting the region properties of connected components while bounding box dimensions were calculated as the mean width and height of the bounding box. Energy across spatial frequencies was computed by performing a 2D Fourier transform on the grayscale image to obtain its power spectrum, followed by calculating the average log-transformed energy at each radial spatial frequency over all orientations. These metrics were then compared between high- and low-value fractals irrespective of the search type, as well as between the two search types irrespective of whether the fractals were high-value (targets) or low-value (distractors).
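A few of these metrics can be computed directly from a grayscale image scaled to [0, 1]. The sketch below is our own minimal implementation (not the SHINE toolbox code) of mean luminance, RMS contrast, and the radially averaged log power spectrum:

```python
import numpy as np

def image_metrics(gray):
    """Mean luminance, RMS contrast, and radially averaged log power spectrum
    for a grayscale image with values scaled to [0, 1]."""
    g = np.asarray(gray, float)
    mean_lum = g.mean()                     # average luminance
    rms = g.std()                           # RMS contrast (std of pixel values)
    # energy across spatial frequencies: log power averaged over orientations
    power = np.abs(np.fft.fftshift(np.fft.fft2(g))) ** 2
    cy, cx = np.array(power.shape) // 2
    yy, xx = np.indices(power.shape)
    radius = np.hypot(yy - cy, xx - cx).astype(int)
    counts = np.bincount(radius.ravel())
    radial = np.bincount(radius.ravel(),
                         weights=np.log(power.ravel() + 1e-12)) / counts
    return mean_lum, rms, radial
```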
A total of 95 sets of fractals (760 fractals) that the monkeys had learned during the experiment were used to compare the visual properties of high- and low-value fractals. For the comparison of fractal properties between the two search types, we included only 25 fractal sets (200 fractals) that the monkeys used exclusively in one search type. This restriction was necessary because most fractal sets were used in both search types. Specifically, fractal sets were initially learned and used in inefficient search blocks. After additional training sessions, these same fractal sets were later used in efficient search blocks.
For the correlation analysis, we first calculated each visual metric separately for the good and bad fractals within each fractal set of each search block (comprising 24 fractals in total). Then, we computed the average of each metric across the 12 good fractals and 12 bad fractals. Finally, we calculated the difference in the averages between the good and bad fractals. As a result, we obtained a single value for each metric for each given search fractal set. We then computed the Pearson's correlation between this value and the behavioral search slope for the corresponding search fractal set.
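The per-set difference and its correlation with the behavioral search slope can be sketched as follows (hypothetical helper names; SciPy's `pearsonr` supplies both the correlation and its p value):

```python
import numpy as np
from scipy.stats import pearsonr

def metric_value_difference(metric_good, metric_bad):
    """Mean metric over the good fractals minus mean over the bad fractals
    within one search set."""
    return float(np.mean(metric_good) - np.mean(metric_bad))

def metric_slope_correlation(per_set_diffs, search_slopes):
    """Pearson correlation between per-set metric differences and the
    behavioral search slopes of the corresponding blocks."""
    r, p = pearsonr(per_set_diffs, search_slopes)
    return float(r), float(p)
```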
Comparison of visual properties of fractals. A, Each subplot compares a specific visual property across high-value fractals (red), low-value fractals (blue), efficient searches (green), and inefficient searches (orange). Spatial energy overlaps for all groups but is vertically offset for visualization. The p value indicates the significance of each pairwise comparison. B, Correlation between behavioral search slope in each block and the difference of each visual property between high-value (targets) and low-value fractals used in that block. The Pearson's correlation and p values are noted, and the gray regression is shown.
Behavioral data analysis: multialternative attention-modulated drift–diffusion (MADD) model
The MADD model simulates decision-making during visual search by assigning a diffuser to each object in the display. All diffusers start accumulating evidence simultaneously from the onset of the search display. The diffusers update evidence for or against each object being the target, with updates directed toward absorbing boundaries that indicate selection or rejection. The model incorporates an attention modulation parameter (θ), which attenuates evidence accumulation for nonfixated objects.
We used a drift–diffusion framework to model the decision-making process during the search task. In this model, each object has its own decision variable (DV) or diffuser; thus, a given trial had 3 to 9 diffusers, depending on the set size. Each diffuser accumulated evidence from display onset for or against its corresponding fractal being the search target. A decision involved either selecting a fractal as the target or rejecting the entire trial as TA, with absorbing decision boundaries. Hitting the upper boundary signified acceptance of the corresponding fractal as the target; conversely, a diffuser reaching the lower boundary marked its fractal as a bad object. In this scheme, trial rejection happened if and when all diffusers hit the lower boundary. Model parameters included the drift rate (d), integration noise standard deviation (σ), and the attention modulation ratio for nonfixated over fixated objects (θ). These parameters were fit separately for each trial type (TP/TA), set size, and search type (efficient/inefficient).
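A single trial of this scheme can be sketched as follows. This is a simplified illustration rather than the fitted model: fixations are drawn at random each time step, the target diffuser drifts toward the upper boundary, distractor diffusers drift toward the lower boundary, and drift for nonfixated objects is scaled by θ:

```python
import numpy as np

def simulate_madd_trial(n_objects, target_idx, d=0.005, sigma=0.05, theta=0.5,
                        bound=1.0, max_steps=3000, rng=None):
    """Simulate one TP trial of the MADD scheme (simplified sketch).

    Each object has a diffuser; nonfixated objects accumulate at a rate
    attenuated by theta. Fixations are drawn at random each 1 ms step, a
    simplification of real gaze behavior.
    """
    rng = np.random.default_rng(rng)
    idx = np.arange(n_objects)
    dv = np.zeros(n_objects)
    rejected = np.zeros(n_objects, dtype=bool)
    for t in range(max_steps):
        fixated = rng.integers(n_objects)
        gain = np.where(idx == fixated, 1.0, theta)   # attention modulation
        drift = np.where(idx == target_idx, d, -d)    # evidence for/against target
        dv += gain * drift + rng.normal(0.0, sigma, n_objects)
        rejected |= dv <= -bound                      # absorbing lower boundary
        dv[rejected] = -bound
        winners = np.where((dv >= bound) & ~rejected)[0]
        if winners.size:
            return "select", int(winners[0]), t       # a diffuser hit the upper bound
        if rejected.all():
            return "reject", None, t                  # all diffusers hit the lower bound
    return "timeout", None, max_steps
```

With θ = 1 and no noise, the target diffuser is selected regardless of fixation, illustrating the parallel (most efficient) regime discussed in the Results.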
MADD model. A, Schematic showing how enlargement of a neuron's spatial processing allows it to see the target within its RF. An example TP trial shows set size 5 with a target at the top of the screen (left column). The neuron's RF is modeled as a bivariate Gaussian centered at a peripheral location (middle column, top row). The RF width is enlarged by object reward training (middle column, bottom row). The RF is multiplied by the scene to show the objects affecting the neuron's firing (right column). The target becomes visible to the neuron in the enlarged RF condition. B, Schematic of the MADD model; example simulated trials for inefficient (top) and efficient (bottom) searches with set size 5 are shown. Dashed red and blue lines show decision boundaries for choosing and rejecting an object. The red and blue trajectories represent the DVs for good and bad objects. Red and blue patches indicate times subjects viewed a good or a bad fractal, respectively (different shades of blue for different bad objects). In the inefficient search (θ = 0, top panel), the DV for an object was only updated when fixating that object, resulting in serial examination of multiple objects and slow search. In contrast, in the efficient search (θ = 1, bottom panel), DVs were updated even for peripheral objects, resulting in rapid decision boundary crossing and rapid search. The drift rate (d) of nonfixated objects was modulated by θ (drift × θ), and integration noise (σ) was added at every time point. C, Cumulative distribution functions (CDFs) for drift rate, sigma, and theta in inefficient (orange) and efficient (green) searches. D, Average search times (solid lines) and search times predicted by the MADD model (dashed lines) for inefficient (orange) and efficient (green) searches across set sizes in TP trials (left) and TA trials (right).
Symbols “=,” “/,” and “X” indicate significant effects of the main factors: search efficiency, set size, and the interaction between search efficiency and set size, respectively. In the legend, “n” refers to the number of efficient and inefficient searches. E, The scatter plots show the correlation between the average θ of a session and the first-saccade target detection rate (left) and the search slope (right) across TP trials of that session. Each point is a session, with orange and green denoting our binary categorization of inefficient and efficient searches in this plot and similar ones hereafter. The black line is the linear fit in this plot and similar plots hereafter (Deming regression). In the legend, “n” refers to the number of two-search neurons.
The diffuser's value at each time step is calculated using the following:
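The equation itself is not reproduced in this excerpt. Based on the parameter definitions above (drift rate d, noise σ, attention modulation θ), a plausible reconstruction of the update rule for diffuser i is:

```latex
\mathrm{DV}_i(t+\Delta t) = \mathrm{DV}_i(t) + m_i(t)\, d_i + \varepsilon_t,
\qquad
m_i(t) =
\begin{cases}
1 & \text{if object } i \text{ is fixated at } t\\[2pt]
\theta & \text{otherwise,}
\end{cases}
\qquad
\varepsilon_t \sim \mathcal{N}\!\left(0, \sigma^2\right),
```

where d_i is positive if object i is the target and negative otherwise, consistent with the diffusers accumulating evidence for or against each object being the target.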
To estimate model parameters for different search task conditions within each search run, we utilized the grid search method to identify optimal values. The mean squared error served as the cost function to determine optimal values for each condition. The model's prediction was the search time (i.e., time to locate the target in TP or reject trial in TA trials). For a given trial, we calculated the prediction for search time by averaging the values obtained from 100 iterations of the drift–diffusion process. Considering the range of search time variations observed in monkey behavior and model predictions, we set valid parameter ranges to be [0, 1] for θ, [0, 0.1] for σ, and [0, 0.01] for d.
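The grid search described above amounts to an exhaustive sweep over the (θ, σ, d) ranges, scoring each triplet by mean squared error against the observed search times. A minimal sketch, in which the `simulate_fn` argument stands in for the 100-iteration drift–diffusion average:

```python
import numpy as np
from itertools import product

def fit_madd_grid(observed_times, simulate_fn, theta_grid, sigma_grid, d_grid):
    """Exhaustive grid search over (theta, sigma, d), scoring each triplet by
    the mean squared error between observed and model-predicted search times.
    simulate_fn(theta, sigma, d) should return predicted mean search times."""
    best, best_mse = None, np.inf
    for theta, sigma, d in product(theta_grid, sigma_grid, d_grid):
        pred = np.asarray(simulate_fn(theta, sigma, d), float)
        mse = float(np.mean((np.asarray(observed_times, float) - pred) ** 2))
        if mse < best_mse:
            best, best_mse = (theta, sigma, d), mse
    return best, best_mse
```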
To evaluate the contribution of θ in the MADD model, model fitting was conducted using all TP trials under three scenarios: the full model, in which θ was fitted freely; Scenario 1 (θ = 0), in which nonfixated objects were not updated, effectively clamping θ to zero; and Scenario 2 (θ = 1), in which nonfixated objects were updated equivalently to fixated objects, effectively clamping θ to one. The models were compared using R2, Akaike information criterion (AIC), and Bayesian information criterion (BIC) to assess fit and parameter sensitivity (Table 1). Parameter ranges of drift rate (d) and noise (σ) were examined to understand their dependence on θ.
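For reference, AIC and BIC can be obtained from a fit's mean squared error under a Gaussian error model; this uses the standard identity for the maximized Gaussian log-likelihood and is our illustration, as the study's exact likelihood formulation is not given in this excerpt:

```python
import numpy as np

def aic_bic_from_mse(mse, n, k):
    """AIC and BIC under a Gaussian error model, where the maximized
    log-likelihood is -n/2 * (log(2*pi*mse) + 1) for n data points and
    k free parameters."""
    ll = -0.5 * n * (np.log(2 * np.pi * mse) + 1)
    return 2 * k - 2 * ll, k * np.log(n) - 2 * ll
```

Because BIC penalizes parameters by log(n) rather than 2, it is the stricter criterion whenever n > e².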
Parameter ranges and model performance across MADD variants with and without θ
Population vector sum model
We simulated the effect of the target signal tuning curve size on target detection by the first saccade by tiling the visual scene with a uniform 10 × 10 grid of neurons. Each neuron's spatial RF was modeled as a bivariate Gaussian distribution whose value at a given location signified the neuron's firing to an object appearing at that location. The response range for each neuron was chosen to match the values observed in our vlPFC population of neurons, with a minimum (frmin = 16 Hz) and maximum (frmax = 41 Hz) during the 100–400 ms after display onset. A TP trial was simulated by placing a variable number of fractals (3, 5, 7, or 9) on a circle around the center of the screen. The fractal at 0° was the good object (target).
We first calculated the activity of each neuron depending on the location of objects and the set size using the following equation:
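The equation is not reproduced in this excerpt. One plausible form consistent with the description (a Gaussian RF G centered at μ_j, objects at locations x_i, and an amplification A > 1 for the target, with responses mapped into the [frmin, frmax] range) is:

```latex
r_j \;\propto\; \mathrm{fr}_{\min} + \left(\mathrm{fr}_{\max} - \mathrm{fr}_{\min}\right)
\sum_{i=1}^{N} A_i\, \mathcal{G}\!\left(\mathbf{x}_i;\ \boldsymbol{\mu}_j,\ \Sigma\right),
\qquad
A_i =
\begin{cases}
A > 1 & \text{if object } i \text{ is the target}\\[2pt]
1 & \text{otherwise.}
\end{cases}
```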
We then multiplied this activity by a vector that started from the center of the screen and pointed to the center of the neuron's RF (i.e., M). We then calculated the vector sum across the 10 × 10 grid of neurons. The direction of this vector pointed in the direction of the saccade (always toward a good object, since its response is amplified by A > 1), and its magnitude was assumed to be monotonically related to the probability of a saccade in that direction (an increase in vector magnitude means higher saccade probability).
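The vector-sum readout can be sketched as follows. This is our own minimal implementation with assumed values for the amplification A and the grid extent; the frmin/frmax values match those reported above:

```python
import numpy as np

def gaussian_rf(points, center, sigma_deg):
    """Value of a symmetric bivariate Gaussian RF at the given 2D points."""
    d2 = np.sum((np.asarray(points) - np.asarray(center)) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma_deg ** 2))

def population_vector(set_size, ecc=9.5, sigma_deg=6.0, A=1.5,
                      fr_min=16.0, fr_max=41.0, grid_extent=20.0, n=10):
    """Vector-sum readout over an n x n grid of Gaussian RFs.

    Object 0 (the target, placed at angle 0) drives neurons with amplification
    A > 1, so the summed vector points toward the target. A, sigma_deg, and
    grid_extent are assumed values for illustration.
    """
    angles = 2 * np.pi * np.arange(set_size) / set_size
    objs = ecc * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    amp = np.where(np.arange(set_size) == 0, A, 1.0)
    gx = np.linspace(-grid_extent, grid_extent, n)
    centers = np.stack(np.meshgrid(gx, gx), axis=-1).reshape(-1, 2)
    # each neuron's drive: amplified, RF-weighted input from all displayed objects
    drive = np.array([np.sum(amp * gaussian_rf(objs, c, sigma_deg)) for c in centers])
    rates = fr_min + (fr_max - fr_min) * drive / drive.max()
    vec = (rates[:, None] * centers).sum(axis=0)      # sum of rate-weighted RF vectors
    return np.arctan2(vec[1], vec[0]), float(np.linalg.norm(vec))
```

With the target at 0°, the summed vector points toward 0° because the distractor contributions are symmetric about the horizontal axis while the target's amplified drive is not.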
We considered three scenarios for modulations of neurons' RFs: additive (α), multiplicative (β), and expansive (γ) mechanisms. The parameter α accounts for baseline shifts, β for gain, and γ for changes in the width of the bivariate Gaussian. The 2D target signal is defined as the difference in firing between TP and TA trials with matched set size and object locations. Note that the subtraction of these two quantities yields the same bivariate Gaussian distribution as the target signal for that neuron, and thus we considered the same modulations on the real neurons' target signals in the following section.
Tuning curve parameter recovery
The RF is modeled as a 2D symmetric Gaussian with covariance matrix Σ = σ2I, where σ = 6° and I is the identity matrix (frmin = 0 Hz; frmax = 10 Hz). To test parameter recovery, we modulated the RF by modifying bias, gain, and width. For bias, a uniform random value of 1 ± 0.2 Hz was added to the firing rate. For gain, the firing rate was multiplied by a factor of 2 ± 0.2. For width, the standard deviation (σ) of the Gaussian was multiplied by a factor of 2 ± 0.2. Simulated neural responses were sampled along a search circle with a radius of 12.5°, and the RF center was shifted along the x-axis from 0 to 30°. Using Equation 4, the parameters α, β, and γ were estimated for the modulated RF and compared with the initial RF. Parameter estimates were obtained using the Nelder–Mead algorithm from Python's SciPy package. The bias and standard deviation of the differences between the estimates and ground truth values were calculated. Results were normalized to ground truth and then averaged across RF center locations (0–30°). The bias and standard deviation of the estimates are also plotted as functions of RF center location.
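The recovery procedure can be sketched with SciPy's Nelder–Mead optimizer. The modulation form below (bias α, gain β, and a width change γ implemented by resampling the curve at compressed angles) is a hypothetical stand-in for Equation 4, which is not reproduced in this excerpt:

```python
import numpy as np
from scipy.optimize import minimize

def modulated_rf(params, phi, base_curve):
    """Hypothetical modulation: alpha adds a baseline, beta scales the gain, and
    gamma widens the curve by resampling it at angles compressed by 1/gamma."""
    alpha, beta, gamma = params
    gamma = max(abs(gamma), 1e-6)                       # guard against zero width
    phi_scaled = ((phi / gamma + np.pi) % (2 * np.pi)) - np.pi
    return alpha + beta * np.interp(phi_scaled, phi, base_curve, period=2 * np.pi)

def recover_params(phi, base_curve, observed, x0=(0.0, 1.0, 1.0)):
    """Least-squares recovery of (alpha, beta, gamma) via Nelder-Mead."""
    cost = lambda p: float(np.mean((modulated_rf(p, phi, base_curve) - observed) ** 2))
    return minimize(cost, x0, method="Nelder-Mead").x
```

Generating a modulated curve with known (α, β, γ) and fitting it back tests whether the optimizer recovers the ground truth, mirroring the bias/standard-deviation analysis described above.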
vlPFC target signal tuning curve
To quantify the dependence of the target signal on the angular location of the target, we first calculated the neuron's response (100–400 ms epoch after display onset) as a function of the target's angle in TP trials. Next, we applied a circular moving average (window length, 30°) to each neuron's angular response and performed linear interpolation at 1° resolution. This yielded the TP tuning curve for each neuron with 1° resolution (360 data points). To calculate the target signal, we subtracted the average response in TA trials from this TP tuning curve for TP-preferred neurons.
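The smoothing step can be sketched as a wrap-around moving average. This minimal version interpolates to 1° first and then smooths, whereas the study smooths and then interpolates; the difference is immaterial for illustration:

```python
import numpy as np

def circular_tuning_curve(target_angles_deg, rates, window_deg=30):
    """Interpolate responses to a 1-degree grid on the circle, then apply a
    wrap-around (circular) moving average of the given window length."""
    order = np.argsort(target_angles_deg)
    ang = np.asarray(target_angles_deg, float)[order]
    r = np.asarray(rates, float)[order]
    grid = np.arange(360)
    curve = np.interp(grid, ang, r, period=360)       # circular linear interpolation
    kernel = np.ones(window_deg) / window_deg
    # pad circularly so the average wraps around the 0/360-degree boundary
    padded = np.concatenate([curve[-window_deg:], curve, curve[:window_deg]])
    smooth = np.convolve(padded, kernel, mode="same")[window_deg:window_deg + 360]
    return grid, smooth
```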
Since averaging the target signal tuning curve across neurons aligned to their peaks can create a false average peak even without any spatial tuning, we applied the same technique to the TA trials as a control. In TA trials, we treated the first fractal as a pseudotarget to calculate the TA tuning curve at 1° resolution using the same method as above. We then created a “fake” target signal by subtracting the average TA response from this pseudotarget tuning curve, analogous to the real target signal.
To calculate the modulation of target signal between efficient and inefficient search, for each neuron, we fitted α, β, and γ ratios by using the Nelder–Mead algorithm within the Python SciPy package using Equation 4 as follows:
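Equation 4 is not reproduced in this excerpt; a plausible form consistent with the α (baseline), β (gain), and γ (width) definitions above, applied to a circular tuning curve TS(φ) with peak at φ_peak, is:

```latex
TS_{\mathrm{eff}}(\varphi) \;\approx\; \alpha \;+\; \beta\,
TS_{\mathrm{ineff}}\!\left(\varphi_{\mathrm{peak}} + \frac{\varphi - \varphi_{\mathrm{peak}}}{\gamma}\right),
```

so that γ > 1 widens the tuning curve around its peak.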
Note that while α and β in the bivariate Gaussian simulation will be equal to those estimated in Equation 4, the γ fitted in Equation 4 will not be the same as the one used for simulating the effect of bivariate Gaussian since we are measuring responses on a circle and not on the whole 2D grid. Nevertheless, the fitted γ will be monotonically related to the 2D tuning curve γ (i.e., an increase in the width of bivariate Gaussian will not decrease the width of the circular tuning curve).
Statistical tests and regressions
A two-way ANOVA was employed to explore the combined influence of search efficiency and set size on these same variables. For linear fits, we used Type 2 (Deming) regression. The Spearman correlation coefficient (r) was used to measure correlations in all figures unless otherwise noted (Pearson's correlation was used for the visual-property analyses). Error bars or shaded regions in all plots show the standard error of the mean. A paired t test was used for the statistical comparison of the visual properties of fractals. The significance threshold for all tests in this study was p < 0.05: ns, not significant; *p < 0.05; **p < 0.01; ***p < 0.001 (two-sided).
Results
Acute neural recordings from vlPFC (Areas 8Av, 46v, and 45a) were conducted in two monkeys (Monkeys S and H, with 240 and 356 neurons in the right and left hemispheres, respectively). Monkeys learned to associate abstract fractal objects with either large or small juice rewards in a biased value training task (Fig. 1A,B). The large number of fractals used and their arbitrary assignment to high- or low-value groups ensured that the subsequent search task could not be solved by low-level guiding singletons such as unique colors or shapes. Each value training session involved a set of eight fractals, half associated with small rewards and half with large rewards (bad and good objects, respectively). These object sets underwent biased value training for either 1 or 5+ d before being used in the visual search task (Fig. 1C).
As reported previously (Ghazizadeh et al., 2016; Abbaszadeh et al., 2023), performance during interspersed choice trials in value training was high and well above chance for both 1 d and 5+ d trained fractals (1 d, ∼95%; 5+ d, 98% accuracy; ts(59) > 48.6; ps < 2.4 × 10−49 vs 50% chance). Nevertheless, search efficiency differed between 1 d and 5+ d fractals. During the search task, monkeys were tasked with locating good objects among bad ones in TP trials or rejecting trials by fixating on the fixation point in TA trials, in which all objects were bad (Fig. 1D). Trials could have 3, 5, 7, or 9 objects (set size) displayed on an imaginary circle around the center. The search time slope for TP trials (referred to as the “search slope”) was used to define efficient and inefficient search (see Materials and Methods). Efficiency was defined by search slopes below the 33rd percentile (<14 ms/item), while inefficiency was indicated by slopes above the 66th percentile (>27 ms/item). While the search slope was used to define search efficiency, search time was also significantly shorter in efficient versus inefficient sessions in both TP and TA trials (TP, display size, F(3,1016) = 217.77; p < 1 × 10−108; efficiency, F(1,1016) = 1,008.26; p < 1 × 10−153; interaction, F(3,1016) = 98.06; p < 1 × 10−55; TA, display size, F(3,944) = 50.43; p < 1 × 10−29; efficiency, F(1,944) = 161.70; p < 1 × 10−33; interaction, F(3,944) = 4.97; p < 1 × 10−2; Fig. 3D).
Furthermore, we examined the visual properties of both high-value and low-value fractals used in the two types of searches to ensure there was no systematic bias in the visual appearance of the objects. A comparison of the visual properties between high- and low-value fractals (over 700 samples) revealed no significant differences (Fig. 2; ts < 0.91; ps > 0.16). Similarly, no significant differences were observed in the visual properties of the objects used in efficient versus inefficient searches (Fig. 2A; ts < 0.71; ps > 0.06). In addition, there was no correlation between search efficiency as measured by search slopes in a block and the difference between visual properties of high- and low-value fractals used in that block (Fig. 2B; p > 0.2).
Enhanced peripheral object processing: behavioral evidence
Search for a target among multiple distractors can be conceptualized by a MADD formalism (also referred to as aDDM; Krajbich et al., 2012; Tavares et al., 2017). In MADD, the accumulation of evidence is strongest for fixated objects and is attenuated by a parameter θ < 1 for nonfixated ones, which can be taken to model the extent of peripheral visual processing of objects (Fig. 3A). The evidence for each object is accumulated by a diffuser until it reaches the selection or rejection bound. The search trial concludes when one diffuser hits the selection boundary, indicating target detection, or when all diffusers reach the rejection boundary, indicating trial rejection. In addition to θ, our model included the drift rate (d) and the diffusion noise (σ; see Materials and Methods). Results showed that MADD was able to fit search times across set sizes and for different search efficiencies (Fig. 3B). Examining the parameter fits revealed that the difference between efficient and inefficient search was primarily due to a significant enhancement of θ in efficient search (t(137) = 11.88; p < 1 × 10−25), with minimal effects on the diffusion noise (σ; t(137) = 1.3; p = 0.19) or the drift rate (d; t(137) = −1.18; p = 0.23; Fig. 3C). Indeed, one can verify that θ values at the extremes of 0 and 1 result in serial (most inefficient) and parallel (most efficient) searches, respectively. When θ = 1, all diffusers update their evidence at the full rate, enabling set size-independent search. In contrast, when θ = 0, finding the target requires multiple saccades, since no evidence accumulation can happen for nonfixated objects. The full model demonstrated a superior fit (R2 = 0.994) compared with a scenario in which nonfixated objects are not updated (equivalent to clamping θ to zero; Scenario 1, R2 = 0.485) and a scenario in which updating does not depend on whether an object is fixated (equivalent to clamping θ to one; Scenario 2, R2 = 0.894), as summarized in Table 1.
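The role of θ can be illustrated with a minimal simulation. The sketch below is a deliberately simplified, hypothetical MADD-style race, not the fitted model from Materials and Methods: gaze cycles over the objects on a fixed schedule, only the target diffuser carries positive drift, and the trial ends when that diffuser reaches the selection bound. Drift is attenuated by θ whenever the target is not fixated, so a low θ produces a steep set size dependence (serial-like search), while θ near 1 flattens it (parallel-like search). All parameter values are illustrative.

```python
import numpy as np

def madd_trial(set_size, theta, d=0.02, sigma=0.05, bound=1.0,
               dwell=30, max_steps=20000, rng=None):
    """One simplified target-present MADD trial (hypothetical sketch).

    Gaze steps through the objects every `dwell` time steps; the target
    diffuser accumulates with drift d when fixated and theta * d when
    not. Returns the number of steps until the selection bound is hit.
    """
    rng = np.random.default_rng() if rng is None else rng
    target = 0                      # index of the single good object
    x = 0.0                         # accumulated evidence for the target
    for t in range(max_steps):
        fixated = (t // dwell) % set_size   # sequential gaze schedule
        gain = 1.0 if fixated == target else theta
        x += gain * d + sigma * rng.standard_normal()
        if x >= bound:
            return t + 1
    return max_steps

def mean_search_time(set_size, theta, n_trials=300, seed=0):
    rng = np.random.default_rng(seed)
    return np.mean([madd_trial(set_size, theta, rng=rng)
                    for _ in range(n_trials)])

# search slope (steps per added item) under low vs high theta
slope_low  = (mean_search_time(9, 0.1) - mean_search_time(3, 0.1)) / 6
slope_high = (mean_search_time(9, 0.9) - mean_search_time(3, 0.9)) / 6
```

In this toy version, the low-θ slope is roughly an order of magnitude steeper than the high-θ slope, mirroring the serial-versus-parallel intuition in the text.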
Model selection criteria (AIC/BIC) consistently favored the full model.
Notably, the fitted values of θ showed a significant negative correlation with search slopes across all sessions (Fig. 3D; all sessions, r = −0.73; p < 1 × 10−43; within efficient, r = −0.53; p < 1 × 10−9; within inefficient, r = −0.53; p < 1 × 10−9) and a significant positive correlation with the rate of detecting the good target on the first saccade after display onset across sessions (all sessions, r = 0.74; p < 1 × 10−44; within efficient, r = 0.60; p < 1 × 10−13; within inefficient, r = 0.46; p < 1 × 10−7; Fig. 3D). In summary, behavioral modeling with MADD revealed an increase in θ with search efficiency, suggesting an expansion in the visual processing of peripheral objects in efficient compared with inefficient search. Next, we explored the neural signature of this expansion in the vlPFC.
Enhanced peripheral object processing: neural evidence
We have previously shown that the vlPFC neural response differs in TP versus TA trials within 150 ms after display onset, which can presumably be used to signal the presence of a high-value target in the display (target signal; Abbaszadeh et al., 2023). Importantly, the target signal was found to be stronger in efficient versus inefficient searches in vlPFC neurons (Abbaszadeh et al., 2023). Specifically, Figure 4 illustrates that the target signal in efficient searches is higher than in inefficient searches, as demonstrated by the example neuron response (Fig. 4A) and the population response (Fig. 4B) of vlPFC neurons. Given that vlPFC neurons have relatively localized RFs (Funahashi et al., 1990; Ringach et al., 2002; Ghazizadeh et al., 2016, 2018), one expects the target signal to be a function of the target location (target signal tuning curve). Indeed, such spatial dependence of the target signal is observed in individual neurons (Figs. 4C, 5, which show the target signal for two example neurons, one from each monkey) and in the population average (Fig. 4D). Given such a target signal tuning curve, one can think of at least three scenarios for an overall enhancement of the target signal between inefficient and efficient searches. As shown in Figure 6A, the target signal enhancement can be additive, multiplicative, or involve a spatial widening across angular locations. Any of these tuning curve enhancements can intuitively make the neuron more responsive to the presence of a peripheral target across angular locations.
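The three scenarios can be made concrete with a toy tuning curve. Below, a generic Gaussian-shaped curve (a hypothetical stand-in for the measured target signal tuning curve; all values illustrative) is modulated by an additive offset (α), a gain factor (β), or a width factor (γ): the additive offset shifts the whole curve up uniformly, the gain factor scales it without changing its width, and the width factor widens it without changing its peak.

```python
import numpy as np

def tuning(angles_deg, width_deg=40.0, center=0.0):
    # generic Gaussian-shaped tuning curve (illustrative stand-in for
    # the measured target signal tuning curve)
    return np.exp(-0.5 * ((angles_deg - center) / width_deg) ** 2)

angles = np.linspace(-180, 180, 361)
base = tuning(angles)

alpha, beta, gamma = 0.3, 1.5, 1.5            # hypothetical factors
additive       = base + alpha                 # uniform upward shift
multiplicative = beta * base                  # gain change, same width
expansive      = tuning(angles, 40.0 * gamma) # wider, same peak
```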
The firing rate and target signal in inefficient and efficient searches. A, PSTH of an example neuron during efficient (right) and inefficient (left) searches, illustrating TP trials (red), TA trials (blue), and the TP − TA difference representing the target signal (green for efficient and orange for inefficient). B, Same as A but averaged over neurons recorded in both search types (two-search neurons; n = 138). C, Polar plot of the target signal of the example neuron (binned at eight radial directions) in both inefficient (orange) and efficient (green) searches. D, Same as C but for the average target signal of two-search neurons. The peaks of these target signals were rotated to zero to align them before averaging over the population (uncorrected target signal).
Example neurons and their target signal tuning curves. The polar plots show the target signal of an example neuron from each monkey across various target locations (binned at eight radial directions) in both inefficient (orange) and efficient (green) searches. Each of the eight sections in the plot displays the peristimulus time histogram (PSTH) of the neuron for inefficient search (inner plot) and efficient search (outer plot), separately for TP trials (red), TA trials (blue), and the TP − TA difference representing the target signal (green for efficient and orange for inefficient).
Enhanced spatial processing in vlPFC and its role in target detection. A, Schematic showing three scenarios for modulation of neural spatial-dependent responses to objects (RF tuning curve), assuming object angular location along an eccentric circle around fixation (0° being the center of RF). Modulations of the tuning curve by additive (α, first row), multiplicative (β, second row), or expansive (γ, third row) factors are shown. B, The response of different neurons (10 × 10 grid) along their preferred direction to an example display (set size, 5) with a target at 0° angle for two different values of each factor. The gray arrows show the population averages binned into eight radial directions. The yellow and green arrows show the population vector sum across all directions. Note the larger magnitude for the multiplicative and expansive factors but not the additive factor. C, The effect of each factor on the magnitude of the population vector sum. D, Population average of the target signal tuning curve during inefficient (orange) and efficient (green) searches for neurons with both search types (n = 138). The gray trace shows the inefficient search tuning curve scaled by a multiplicative factor to match the efficient search tuning curve; note that it still shows a narrower width. E, Pairwise scatterplots of additive (α), multiplicative (β), and expansion (γ) factors with marginal distributions. The percentage of neurons with multiplication > 1 and expansion > 1 is noted. F, The scatterplot and correlation between each factor and the increase in first saccade target detection in efficient visual search. Note the lack of correlation for the additive factor, as predicted in C.
Given that different vlPFC neurons exhibit preferences for various angular locations, a straightforward method to determine the target location would involve calculating the population vector sum of neural activities (Fig. 6B; Swindale, 1998; Ringach et al., 2002; Nauhaus et al., 2008). In this model, the probability of a saccade toward a particular direction is assumed to increase as the size of the population vector sum grows larger. Notably, the population vector sum predicts that response multiplication or widening, but not an additive offset, results in a larger resultant vector and, consequently, a higher chance of a saccade toward the target (Fig. 6C; see Materials and Methods). Furthermore, while tuning curve multiplication leads to a monotonic increase in the population vector, tuning curve widening initially increases its size, followed by a decrease due to the loss of spatial resolution (Fig. 6C).
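This readout can be sketched with a toy population of neurons whose preferred directions uniformly tile the circle (a hypothetical stand-in for the recorded population; all parameter values are illustrative). Each neuron's response weights a unit vector along its preferred direction, and the vectors are summed. Because the preferred directions are uniformly distributed, a uniform additive offset cancels out and leaves the resultant magnitude unchanged, whereas gain and (moderate) widening increase it.

```python
import numpy as np

prefs = np.deg2rad(np.arange(0.0, 360.0, 5.0))  # uniformly spaced preferred directions

def responses(width_deg=40.0, gain=1.0, offset=0.0, target_deg=0.0):
    # each neuron responds according to the circular distance between
    # its preferred direction and the target direction (toy model)
    d = np.angle(np.exp(1j * (prefs - np.deg2rad(target_deg))))
    return gain * np.exp(-0.5 * (d / np.deg2rad(width_deg)) ** 2) + offset

def popvector(r):
    # population vector sum: responses weight unit vectors along prefs
    v = np.sum(r * np.exp(1j * prefs))
    return np.abs(v), np.rad2deg(np.angle(v))

base_mag, base_dir = popvector(responses())
gain_mag, _ = popvector(responses(gain=1.5))      # multiplicative
wide_mag, _ = popvector(responses(width_deg=60))  # expansive (moderate)
off_mag, _  = popvector(responses(offset=0.5))    # additive
```

The resultant direction also points at the target (0° here), consistent with using the vector sum as a saccade-direction readout.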
To ensure that the changes in the tuning curve parameters (α, β, or γ) can be estimated accurately, given that we only sample responses along a 12.5° circle regardless of the RF center for a given neuron, we performed a simulation in which only the bias, the gain, or the width of the RF was changed, and our estimate of the change in bias (α), gain (β), or width (γ) was compared with the known ground truth (Fig. 7A,B). Results show that changes in bias and gain can be accurately determined in each scenario across a wide range of RF centers (0–30°; Fig. 7C). The estimates had minimal bias and low variance unless the RF center was very close to fixation (<5°) or very eccentric (>20° eccentricity; Fig. 7D,E, left two columns). Changes in width were also recovered to a good degree but were partially assigned to gain, especially when the RF center was farther from the search circle (Fig. 7C–E, right column). These findings suggest that although some of the observed changes in RF gain in our data may in fact reflect RF width enhancement, the width enhancement we detected can itself be trusted with high confidence.
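The logic of this recovery analysis can be sketched as follows. Assuming (hypothetically) an isotropic 2D Gaussian RF centered on the horizontal meridian and the 12.5° search circle described above, the RF is read out along the circle, and a ground-truth width change is then recovered by a simple grid search over gain (β) and width (γ) factors. This is a simplified stand-in for the actual fitting procedure in Materials and Methods; the RF center, width, and grids are illustrative.

```python
import numpy as np

R = 12.5                      # search circle radius (deg)

def readout(rf_center_ecc, rf_width, angles_rad):
    # squared distance from the RF center (on the horizontal meridian)
    # to points on the search circle, passed through a Gaussian RF
    d2 = R**2 + rf_center_ecc**2 - 2 * R * rf_center_ecc * np.cos(angles_rad)
    return np.exp(-0.5 * d2 / rf_width**2)

angles = np.deg2rad(np.linspace(-180, 180, 73))
rc, w = 10.0, 6.0                         # hypothetical RF center and width
observed = readout(rc, 1.3 * w, angles)   # ground truth: width x 1.3

# grid search over gain (beta) and width (gamma) factors
betas  = np.linspace(0.5, 2.0, 61)
gammas = np.linspace(0.5, 2.0, 61)
best = min(((np.sum((b * readout(rc, g * w, angles) - observed) ** 2), b, g)
            for b in betas for g in gammas))
sse, beta_hat, gamma_hat = best
```

Because the noiseless data here match the model family exactly, the grid search recovers β ≈ 1 and γ ≈ 1.3; the gain–width trade-off discussed in the text appears once noise is added or the RF center moves far from the search circle.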
Tuning curve parameter recovery. A, The hypothetical RF of a neuron is shown with its center outside the search circle. The solid black circle represents the search circle, while the dashed circle is the projected readout of the tuning curve on this circle. B, The neural activity along the line connecting fixation (0°) and the center of the RF. The intersection with the search circle is shown at ±12.5°. Three separate scenarios of the tuning curve are illustrated with additive (α), multiplicative (β), and expansion (γ) changes. C, Estimated changes in α, β, and γ normalized to the ground truth (1 being accurate estimation) across RF center locations (0–30°). D, The absolute bias of parameter estimates as a function of RF center (r). The dashed line indicates the search circle radius (12.5°). E, Similar to D but showing the standard deviation of parameter estimates.
Figure 6D shows the vlPFC population target signal tuning curves for both efficient and inefficient search conditions. These curves are averaged after alignment to the angular location of the maximal response (0° corresponds to the maximal point for each neuron; n = 138 neurons recorded in both search types; see Materials and Methods). Interestingly, efficient search was concurrent with an enhanced tuning curve in vlPFC (for individual neuron examples, see Fig. 5). Examining individual neurons revealed that a significant percentage of them showed either amplification (β) or widening (γ) or both (Fig. 6E; one-sample t test; expected null value, 1; t(137) = 4.55; p < 1 × 10−4; t(137) = 5.92; p < 1 × 10−7). Notably, the additive (α) effect was near zero for many neurons (Fig. 6E), and the overall effect in the population was not different from zero (t(137) = 1.88; p = 0.06). Importantly, and consistent with our population vector model, the results showed a significant positive correlation between the increase in first saccade target detection in efficient versus inefficient searches and both the amplification (β) and widening (γ) of the target signal. However, no such correlation was found with the additive shift (α) in two-search neurons, consistent with the prediction of the population vector sum model (Fig. 6F; β, r = 0.37; p < 1 × 10−4; γ, r = 0.18; p = 0.02; α, r = 0; p = 0.93).
Relationship between target signal tuning curve, θ, and search efficiency
To obtain a single metric that combines target signal enhancement across angular locations, we used the area under the target signal tuning curve (AUT; Fig. 6D). This confirmed a significant rightward shift in the cumulative density function of the AUT for efficient search across the vlPFC population (Fig. 8A; t(137) = 9.22; p < 1 × 10−15). The enlargement of the RF size was also confirmed for neurons with both search types (Fig. 8B).
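As a sketch of why a single area metric captures both effects, the AUT below is computed for a hypothetical Gaussian-shaped target signal tuning curve sampled at eight radial directions (as in Fig. 6D); both amplification and widening increase it, while the curve shape itself and all parameter values here are purely illustrative.

```python
import numpy as np

angles = np.deg2rad(np.arange(0, 360, 45))   # eight radial directions

def tuning(width_deg=40.0, gain=1.0):
    # toy target signal tuning curve sampled at the eight directions
    d = np.angle(np.exp(1j * angles))        # wrap to (-pi, pi]
    return gain * np.exp(-0.5 * (d / np.deg2rad(width_deg)) ** 2)

def aut(signal):
    # area under the tuning curve: mean response times the full circle
    return signal.mean() * 2 * np.pi

base_aut = aut(tuning())
amp_aut  = aut(tuning(gain=1.5))     # amplification increases AUT
wide_aut = aut(tuning(width_deg=60)) # widening increases AUT
```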
Relationship between spatial processing in vlPFC, MADD θ, and search efficiency. A, CDF of the AUT during efficient (green) and inefficient (orange) searches in neurons with both search types (n = 138). B, The scatterplot of AUT in efficient versus inefficient searches in neurons with both search types (n = 138). C, The scatterplot and correlation between θ and AUT across all sessions. D, The scatterplot and correlation between change in θ and change in AUT from inefficient to efficient search for neurons with both search types. E, The scatterplot and correlation between first saccade detection of target and AUT. F, Same as E but for TP search slope and AUT.
Behavioral fits using MADD suggested the increase in θ to be the underlying mechanism for search efficiency (Fig. 3C). Neural analysis revealed that the enhancement of AUT was concurrent with efficient search (Fig. 8A,B). This naturally raises the question of the relationship between θ and the AUT. Results showed a significant positive correlation between the AUT of the neuron recorded in a session and the θ of that session (Fig. 8C; r = 0.32; p < 1 × 10−6). Once again, in neurons that were recorded in both efficient and inefficient search blocks, the increase in AUT was also correlated with the increase in θ when going from inefficient to efficient search (Fig. 8D; r = 0.19; p = 0.03). The increase in AUT was also correlated with measures of search efficiency such as higher first saccade detection (Fig. 8E; r = 0.40; p < 1 × 10−10) and lower search slopes (Fig. 8F; r = −0.42; p < 1 × 10−12). It is important to note that the response of a single neuron is expected to be an inherently noisy correlate of search behavior compared with θ, which models behavior orchestrated by a large number of neurons. Thus, observing these significant correlations between single neuron measures and behavior is notable.
Discussion
Visual search for targets not distinguished from their surroundings by low-level guiding features is thought to be set size-dependent and time-consuming (Wolfe, 1994; Wolfe and Horowitz, 2004). Search becomes especially unforgiving when the number of targets to look for increases, possibly due to limited working memory capacity (Soto et al., 2006; Logie, 2011; Oberauer et al., 2016). Indeed, evidence shows that search time increases logarithmically with the number of objects held in memory (Wolfe, 2012). Nevertheless, hybrid search tasks, which combine memory and visual search components, have been shown to support efficient search even with large target sets (Wolfe, 2012). In these tasks, humans can efficiently search for dozens of targets simultaneously, particularly when the targets are familiar or have distinct low-level features.
Nevertheless, we have recently shown that valuable objects can be found efficiently in visual search despite their large number and lack of low-level guiding features (Ghazizadeh et al., 2016; Abbaszadeh et al., 2023; Narmashiri et al., 2024). Our task differs from traditional hybrid searches, as it involves abstract fractal targets with arbitrary shapes, making it more demanding than tasks with familiar, simpler objects; the greater cognitive demands of such abstract, learned targets likely explain the increased difficulty as the number of targets rose. While low-level pop-out is believed to arise from parallel processing of color, orientation, and size across the visual field in early sensory areas (Treisman, 1986; Bichot et al., 2005; Wolfe, 2020), a similar mechanism for valuable objects with arbitrary shapes is unknown. Here, we aimed to provide a neural mechanism for valuable object pop-out by analyzing the activity of vlPFC neurons, known to be involved in visual search (Bichot et al., 2015, 2019) and value memory (Ghazizadeh et al., 2018), along with behavioral modeling using MADD. Our behavioral analysis revealed that efficient search arises from enhanced spatial processing of valuable objects in the periphery (Fig. 3). The neural data showed amplification and spatial widening of the vlPFC target signal tuning curve (Fig. 6). Amplification and widening of the target signal tuning curve were predicted to increase first saccade target detection by a population vector sum model and were subsequently confirmed to be correlated with behavioral measures of search efficiency (Fig. 8). Together, these data suggest that the enhancement of the target signal tuning curve in vlPFC underlies value-driven search efficiency.
Visual search for a target among other objects can be framed as a decision-making task in which the subject has to evaluate multiple objects, decide which one is the target, and reject others. We formulated this decision-making problem using a MADD model. Variants of this model have been previously used to fit search times in visual search (Krajbich et al., 2012; Tavares et al., 2017). Indeed, our model assumes that evidence accumulation is fastest for foveated objects but attenuated by a θ parameter for peripheral objects (0 < θ < 1, hence the attentional modulation). Behavioral fits revealed that efficient search was concurrent with smaller attenuation (higher θ) for evidence accumulation of peripheral objects but with minimal changes in drift rate (d) or integration noise (σ; Fig. 3B). The increase in θ indicates a move toward simultaneous evaluation of objects regardless of locus of gaze and suggests such simultaneity as the underlying cause for value-driven efficient search. Indeed, the nonzero values of θ suggest that covert attention to peripheral objects contributes to the search process and that search efficiency arises from enhanced covert attention to nonfixated objects.
In a comprehensive model comparison study for multiple alternative decisions, the authors found that the best models did not need decay or inhibition between diffusers to explain accuracy and reaction times (Leite and Ratcliff, 2010). Consistently, in our implementation of MADD, we did not use decay or lateral inhibition between the diffusers. Nevertheless, the possibility of interaction between diffusers (Leite and Ratcliff, 2010) or their noise terms (Daneshi et al., 2019) remains to be systematically investigated. Furthermore, our implementation of the MADD model bears a resemblance to the asynchronous diffusion model (Wolfe, 2003, 2021), with a key distinction: rather than initiating at staggered time points, all diffusers commence the accumulation process simultaneously from the onset of the search. The drift rates (d) in our model are modulated by spatial proximity to the fixation point.
vlPFC has previously been implicated in spatial and feature-based visual search (Buschman and Miller, 2007; Katsuki and Constantinidis, 2012; Bichot et al., 2019). We have recently shown that vlPFC also encodes efficiency in value-driven visual search independent of low-level guiding features (Abbaszadeh et al., 2023). Here, we have expanded our previous findings by showing that the spatial tuning of the target signal in vlPFC becomes magnified and wider across angular locations during efficient search. We showed that both of these effects, but not a simple additive shift in the target signal tuning curve, predict an increase in first saccade target detection by a simple population vector sum method (Fig. 8E). Furthermore, the AUT was significantly correlated with the θ parameter in MADD (Fig. 8C). These findings suggest that the target signal in vlPFC could be the neural correlate of θ in MADD. However, we did not observe a neural correlate of evidence accumulation or drift rate (d) in vlPFC. One possibility is that evidence accumulation is carried out by areas receiving vlPFC inputs, such as the SC (Jun et al., 2021) or LIP (Shadlen and Newsome, 2001; de Lafuente et al., 2015). We have recently shown neural correlates of evidence accumulation in value-driven food choices in the human insular cortex using simultaneous fMRI-EEG (Ataei et al., 2022). Similar whole-brain explorations in macaques would be needed to find neural correlates for MADD components and confirm its validity as a suitable model of visual search.
While the mechanism for the enhanced spatial tuning in our task is unknown, it may involve reward-related plasticity within the corticobasal ganglia loop in which vlPFC participates (Griggs et al., 2017; Ghazizadeh and Hikosaka, 2021). Indeed, our recent work has shown an enhancement of the target signal in SNr (Narmashiri et al., 2024), which receives indirect input from vlPFC and also projects back to it via the thalamus (Yasuda and Hikosaka, 2019). It would be interesting to investigate whether reward-dependent synaptic plasticity, enabling vlPFC neurons to integrate inputs across a larger pool, could result in a wider RF for valuable objects. Considering the role of dopamine in value learning and memory (Kim et al., 2015), as well as in synaptic plasticity in the cortex and striatum (Calabresi et al., 2007), it remains to be determined whether this response enhancement involves dopamine-dependent synaptic plasticity.
In our task, the number of potential targets in a given session was large (>12), and targets were not cued but instead were singled out by past reward history. This type of visual search differs from well-studied searches in which a cued target is seen immediately before display onset, which are known to rely on working memory (Soto et al., 2006) and top–down priority maps (Bisley, 2011). Our value-driven search is also not consistent with bottom–up salience maps (Borji et al., 2019), given the lack of guiding features and the random object–reward associations. Instead, our results provide evidence for a third category of salience, distinct from top–down and bottom–up controls, as alluded to previously (Awh et al., 2012). We refer to this as memory-based salience. Our recent work has implicated vlPFC in other aspects of memory-based salience involving aversive associations and perceptual novelty/familiarity (Ghazizadeh and Hikosaka, 2022). Evidence shows that visual attention enhances the gain of the spatial tuning curve of neurons in visual processing areas such as the middle temporal area and V4 (Womelsdorf et al., 2006; Anton-Erxleben et al., 2009; Lee and Maunsell, 2010). This prompts the possibility that a similar memory-based attentional mechanism may underlie the tuning curve enhancement of the target signal observed in vlPFC neurons. While we have shown decreased visual set size dependence for well-trained valuable objects, the effect of value on memory set size remains to be explored in the future.
In summary, our findings put forth a conceptual and neural framework for understanding valuable object pop-out during visual search. Our results demonstrate that long-term value association results in amplification and spatial widening of the target signal tuning curve in vlPFC neurons, thereby enhancing the spatial processing of peripherally presented valuable objects, consistent with behavioral modeling using MADD. More broadly, our findings are a striking example of how value learning can change the neural tuning of PFC neurons for fast detection of valuable objects. These results point to value-driven plasticity within vlPFC or its input areas to aid in the detection of valuable objects. The specifics of synaptic and network mechanisms underlying such reward-related enhancements in neural responses remain to be investigated.
Footnotes
*K.S. and M.A. contributed equally to this work.
The authors declare no competing financial interests.
Correspondence should be addressed to Ali Ghazizadeh at alieghazizadeh{at}gmail.com.