Abstract
Everyday decision-making commonly involves assigning values to complex objects with multiple value-relevant attributes. Drawing on object recognition theories, we hypothesized two routes to multiattribute evaluation: assessing the value of the whole object based on holistic attribute configuration or summing individual attribute values. In two samples of healthy human male and female participants undergoing eye tracking and functional magnetic resonance imaging (fMRI) while evaluating novel pseudo objects, we found evidence for both forms of evaluation. Fixations to and transitions between attributes differed systematically when the value of pseudo objects was associated with individual attributes or attribute configurations. Ventromedial prefrontal cortex (vmPFC) and perirhinal cortex were engaged when configural processing was required. These results converge with our recent findings that individuals with vmPFC lesions were impaired in decisions requiring configural evaluation but not when evaluating the sum of the parts. This suggests that multiattribute decision-making engages distinct evaluation mechanisms relying on partially dissociable neural substrates, depending on the relationship between attributes and value.
SIGNIFICANCE STATEMENT Decision neuroscience has only recently begun to address how multiple choice-relevant attributes are brought together during evaluation and choice among complex options. Object recognition research makes a crucial distinction between individual attribute and holistic/configural object processing, but how the brain evaluates attributes and whole objects remains unclear. Using fMRI and eye tracking, we found that the ventromedial prefrontal cortex (vmPFC) and the perirhinal cortex contribute to value estimation specifically when value was related to whole objects, that is, predicted by the unique configuration of attributes, and not when value was predicted by the sum of individual attribute values. This perspective on the interactions between subjective value and object processing mechanisms provides a novel bridge between the study of object recognition and reward-guided decision-making.
Introduction
Choosing which snack to buy requires assessing the value of options based on multiple attributes (e.g., color, taste, healthiness). Value can be related to individual attributes; for example, if someone loves chocolate, all snacks containing this ingredient will be valued above those that do not. Value can also emerge from the combination of individual attributes, such as for chocolate-peanut snacks, where the combination of sweet and salty ingredients within the same snack might yield a value greater than the sum of the individual attributes.
The object processing literature has shown that there are distinct neural substrates hierarchically organized along the ventral visual stream (VVS) that represent the individual elements that make up complex objects and the holistic, configural combinations of those elements (Riesenhuber and Poggio, 1999; Bussey and Saksida, 2002). Lesions to the perirhinal cortex (PRC), a medial temporal lobe structure situated at the anterior end of the VVS, impair object discrimination based on attribute configuration but spare discrimination based on individual attributes (Bussey et al., 2005; Bartko et al., 2007; Murray et al., 2007). Neuroimaging studies have shown that blood oxygenation level dependent (BOLD) functional magnetic resonance imaging (fMRI) and regional cerebral blood flow positron emission tomography signals in the human PRC are more sensitive to multiattribute configuration than to the component attributes of objects, whereas the lateral occipital cortex (LOC) demonstrates higher sensitivity to single attributes compared with anterior regions of the VVS (Devlin and Price, 2007; Erez et al., 2016). This suggests that configural object recognition is supported by the PRC and that individual attribute representations at earlier stages of object processing are sufficient for object recognition or discrimination under certain conditions.
Leading neuroeconomic models propose that the ventromedial prefrontal cortex (vmPFC) encodes subjective value across stimuli as a common currency to support flexible decision-making (Chib et al., 2009; Levy and Glimcher, 2012; Delgado et al., 2016). Although many of these studies presented multiattribute objects (e.g., foods, trinkets), they have only rarely considered how the values of multiple attributes are combined. A handful of fMRI studies examined the neural correlates of options explicitly composed of multiple attributes. These have found that signal within the vmPFC reflects the integrated value of the component attributes when each independently contributes to value, that is, when value is associated with individual elements of the option (Basten et al., 2010; Philiastides et al., 2010; Kahnt et al., 2011; Park et al., 2011; Lim et al., 2013; Hunt et al., 2014; Suzuki et al., 2017; Kurtz-David et al., 2019). However, these studies did not address whether there are distinctions in the neural processes underlying value construction based on summing attributes versus value emerging from the holistic configuration of attributes.
Recent evidence argues that the distinction between configural and elemental processing is important in valuation, just as it is known to be important in complex object recognition. We recently found that lesions to the vmPFC in humans impair decisions between objects when value is associated with the configural arrangement of attributes but spare decisions when value is associated with individual attributes (Pelletier and Fellows, 2019). Here, we employ a triangulation approach (Munafò and Smith, 2018) to further test this hypothesis using fMRI and eye tracking to examine the neural and behavioral correlates of multiattribute valuation in healthy women and men.
We hypothesized that estimating the values of multiattribute visual objects in a condition where value is predicted by attribute configuration would engage the vmPFC as well as regions involved in complex object recognition (i.e., PRC) to a greater extent than an elemental condition where individual attributes contribute independently to overall object value. We further hypothesized that fixations to and fixation transitions between value-predictive attributes would differ between configural and elemental value conditions. We report data from two independent samples of healthy participants, one a behavioral and eye-tracking study and another that also included fMRI. An additional pilot study was conducted to determine the fMRI study sample size. All hypotheses and analysis steps were preregistered (https://osf.io/4d2yr).
Materials and Methods
Data were collected from three independent samples using the same experimental paradigm. This paradigm involved first learning and then reporting the monetary value of novel, multiattribute pseudo objects under elemental or configural conditions. We collected an initial behavioral sample to characterize learning, decision-making, and eye-gaze patterns. We then undertook a pilot fMRI study to estimate the sample size needed to detect effects of interest. Informed by this pilot study, a third sample underwent fMRI and eye tracking. Data from the behavioral sample informed the preregistration of eye-tracking hypotheses to be replicated in the fMRI sample.
Participants
Participants were recruited from the Tel Aviv University community via online advertising and through the Strauss Imaging Center's participant database. Participants were healthy volunteers, with normal or corrected-to-normal vision, without any history of psychiatric, neurologic, or metabolic diagnoses, and not currently taking psychoactive medication. The study was approved by the Ethics Committee at Tel Aviv University and the Institutional Review Board of the Sheba Medical Center in Tel-Hashomer, Israel.
Behavioral study
Forty-two participants were recruited to take part in the behavioral experiment. Nine participants were excluded because of poor task performance according to the exclusion criteria detailed below. The final behavioral sample included 33 participants (15 females and 18 males, mean age 22 years, range 18–32). Eye-tracking data were not available for three participants because of poor calibration of the eye tracker.
fMRI pilot study
Imaging data were collected in a pilot sample of eight participants (four females and four males, mean age 25 years, range 21–31) to calculate the sample size needed to detect a significantly stronger modulation of value in the configural compared with the elemental trials in the vmPFC at an α level of 0.05 with 95% power. Power calculations were conducted with the fMRIPower software (http://fmripower.org/; Mumford and Nichols, 2008), averaging β weights for the contrast of interest across all voxels of a predefined brain region. Based on these calculations, we preregistered 42 participants. This sample size was also sufficient to detect a significant effect for the parametric modulation of value in the configural condition alone in the vmPFC (38 participants needed for 95% power). The vmPFC region of interest (ROI) and the model used to analyze the pilot data are described below. Imaging data used for power and sample-size calculations are available on OpenNeuro (https://openneuro.org/datasets/ds002079/versions/1.0.1), and the code used to create the power curves and the vmPFC ROI mask are available with the preregistration document (https://osf.io/4d2yr). Pilot participants were not included in the final sample.
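For illustration only, the sketch below shows the underlying sample-size logic in Python, assuming one averaged β weight per pilot participant for the contrast of interest; the published power curves were generated with fMRIPower, and the values here are hypothetical placeholders.

```python
# Sketch of the sample-size logic, assuming one averaged beta weight per pilot
# participant for the configural > elemental value-modulation contrast in the
# vmPFC ROI; the published calculation used fMRIPower, and these values are
# hypothetical placeholders.
import numpy as np
from statsmodels.stats.power import TTestPower

pilot_betas = np.array([0.21, 0.35, 0.10, 0.42, 0.18, 0.27, 0.05, 0.31])  # n = 8 pilot participants
effect_size = pilot_betas.mean() / pilot_betas.std(ddof=1)  # Cohen's d for a one-sample test

# Solve for the number of participants needed for 95% power at alpha = 0.05.
n_required = TTestPower().solve_power(effect_size=effect_size, alpha=0.05, power=0.95)
print(f"Estimated sample size: {int(np.ceil(n_required))}")
```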
fMRI study
Fifty-five participants were recruited to take part in the full fMRI experiment. Nine participants were excluded because of poor task performance in the scanner, according to the preregistered exclusion criteria. Three participants were excluded because of magnetic resonance (MR) artefacts, and one participant was excluded because of excessive motion inside the scanner based on fMRIPrep outputs (Esteban et al., 2019). The final fMRI sample thus included 42 participants (21 females and 21 males, mean age 27 years, range 18–39). Eye-tracking data could not be collected in nine participants because of reflections caused by MR-compatible vision-correction glasses.
Experimental paradigm
The experimental paradigm was adapted from Pelletier and Fellows (2019). Participants learned the monetary values of novel multiattribute pseudo objects (fribbles) in two conditions (configural and elemental), after which they were scanned while bidding monetary amounts for the objects. Fribbles were developed to study object recognition and are designed to mimic real-world objects (Barry et al., 2014). They are composed of a main body and four appendages, which we refer to as attributes, each available in three variations. Two fribble sets were used, one for each condition (randomly assigned for each participant); each set had the same body but different appendages.
In the configural condition, value was associated with the unique configuration (conjunction) of two attributes. In the elemental condition, value was associated with each of two individual attributes, which then could be combined to obtain the value of the whole object. Four different object sets were used across participants, and the assignment of object sets to conditions was counterbalanced. Learning order was counterbalanced across participants (configural followed by elemental or vice versa), and the order of object presentation was randomized in all experiment phases. Example stimuli and their value associations are shown in Figure 1.
Learning phase
Participants were instructed before the experiment that they were acting as business owners, buying and selling novel objects. Before acquiring objects in their own inventory, they began by observing objects being sold at auction to learn their market price.
The learning phase included five learning blocks and one learning probe per condition. A block began with a study slide displaying all six objects to be learned in that condition, along with the average value of each object, giving the participant the opportunity to study the set for 60 s before the learning trials (Fig. 2A). The learning trials began with the presentation of an object in the center of the screen above a rating scale, asking "How much is this item worth?" Participants had 5 s to provide a value estimate for the object, using the left and right arrow keys to move a continuous slider and the down arrow key to confirm their response. Feedback was then provided indicating the actual selling price of the object, with a bright yellow bar and the corresponding numerical value overlaid on the same rating scale. The object, rating slider, and feedback were displayed for 2 s, followed by a 2 s fixation cross. Each learning block presented each of the six objects six times in random order, for a total of 36 trials. After five learning blocks, learning was assessed with a probe consisting of 24 trials in which the six learned objects were presented four times each, in random order. The structure of probe trials was identical to that of the learning trials, but no feedback was given after the value rating.
In the elemental condition, values were associated with individual attributes. During the learning blocks, the object's body and irrelevant attributes were occluded with a 50% transparent white mask, making the specific value-predictive attribute more salient (Fig. 1). Participants were told that value was associated only with the unmasked attribute. During the learning probe, objects were presented without masks, so all attributes were equally salient, and participants were instructed to sum the values of the two attributes they had learned.
In the configural condition, objects were displayed without masks during the entire learning phase, and the value of the object was associated with the unique configuration of two attributes. In this condition, participants could not learn object values by associating value with any single attribute because each attribute was included in both a relatively high-value and a relatively low-value object, as depicted in the object-value table (Fig. 1).
After learning, each of the six objects of the elemental condition had the same overall value (sum of the two attribute values) as one of the six configural objects. The object set in each condition contained six value-relevant attributes, each of which was part of two different objects in each set.
Bidding task
After learning, participants placed monetary bids on the learned objects to acquire them for their inventory while eye movements were tracked and, in the fMRI studies, fMRI data were acquired. The task comprised four runs (scans), each containing the 12 objects (6 per condition) repeated twice in random order, for a total of 24 trials per run. The structure of a bidding trial is depicted in Figure 2B. Before the bidding task, participants performed one practice run to familiarize themselves with task timings.
To make the task incentive compatible, participants were instructed beforehand that all auctions would be resolved at the end of the session. If they bid sufficiently close to [within 5 Israeli shekels (ILS) of] or higher than the true (instructed) value of an object, that object would be acquired and placed in their inventory. After the task, we would buy all the items in their inventory plus a profit margin (similar to the situation where stores sell their products for a higher price than they paid the manufacturer). The profit margin was 25%, although the exact margin was unknown to participants. The bonus compensation was calculated by summing the total amount paid by the experimenter to buy the participant's inventory, minus the total of the bids placed by the participant to acquire these items. This total profit was then converted to a scale with a minimum of 0 ILS (i.e., participants could not lose money) and a maximum of 10 ILS (equivalent to ∼$3 US). In other words, if participants generally bid substantially higher or lower than the instructed value, the bonus compensation tended toward zero. If they generally bid very close to the instructed value, the bonus compensation was close to 10 ILS.
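To make the payoff rule concrete, the following sketch implements the bonus computation described above; the acquisition criterion and 25% margin follow the text, whereas the clipping used to map profit onto the 0–10 ILS range and the example bids are assumptions for illustration.

```python
# Illustrative sketch of the bonus rule described above. The acquisition
# criterion, 25% profit margin, and 0-10 ILS bounds follow the text; the
# clipping used to map profit onto that range is an assumption, and the
# example bids and values are hypothetical.
def bonus_compensation(bids, true_values, margin=0.25, tolerance=5, max_bonus=10):
    # An object is acquired if the bid is within 5 ILS of, or higher than, its true value.
    acquired = [(bid, value) for bid, value in zip(bids, true_values) if bid >= value - tolerance]
    paid_by_experimenter = sum(value * (1 + margin) for _, value in acquired)  # buy-back price
    paid_by_participant = sum(bid for bid, _ in acquired)                      # sum of winning bids
    profit = paid_by_experimenter - paid_by_participant
    return min(max(profit, 0), max_bonus)  # bounded between 0 and 10 ILS

print(bonus_compensation(bids=[48, 75, 30], true_values=[50, 52, 40]))  # 4.5 ILS with these numbers
```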
Anatomical scans and functional localizer task
After the bidding task, fluid-attenuated inversion recovery and T1 anatomic scans and B0 field maps were acquired for the fMRI samples, with the parameters detailed below. Following structural scan acquisition, participants performed a functional localizer task adapted from Watson et al. (2012) to define participant-specific visual regions of interest for analysis of the bidding task. Images from four categories (faces, scenes, objects, and scrambled objects) were presented in blocks of 15 s, each containing 20 images displayed for 300 ms with a 450 ms interstimulus interval. Participants were instructed to press a button using the index finger of the right hand when an image was repeated twice in a row (1-back task). The task comprised four runs of 12 blocks each; each run contained three blocks of each image category in a counterbalanced order, and a 15 s fixation block ended each run.
Data acquisition
Behavioral data
All phases of the experiment were programmed in MATLAB (catalog #R2017b, MathWorks), using the Psychtoolbox extension (PTB-3; Brainard, 1997). During the learning phase, and during the bidding task for the behavioral sample, stimuli were displayed on a 21.5 inch monitor, and responses were made using a standard keyboard. We recorded value rating and reaction time for each learning trial. During the bidding task in the fMRI study, stimuli were presented on a NordicNeuroLab 32 inch LCD display (1920 × 1080 pixels resolution, 120 Hz image refresh rate) that participants viewed through a mirror placed on the head coil. Participants responded using an MR-compatible response box. Value rating, reaction time, and the entire path of the rating slider were recorded for each trial.
Eye-tracking data
We recorded eye-gaze data during the bidding task using the Eyelink 1000 Plus (SR Research), sampled at 500 Hz. Nine-point calibration and validation were conducted before each run of the task.
fMRI data
Imaging data were acquired using a 3T Siemens MAGNETOM Prisma MRI scanner and a 64-channel head coil. High-resolution T1-weighted structural images were acquired for anatomic localization using a magnetization-prepared rapid gradient echo pulse sequence [repetition time (TR) = 2.53 s, echo time (TE) = 2.99 ms, flip angle (FA) = 7°, field of view (FOV) = 224 × 224 × 176 mm, resolution = 1 × 1 × 1 mm].
Functional imaging data were acquired with a T2* weighted multiband echo planar imaging protocol (TR = 1200 ms, TE = 30 ms, FA = 70°, multiband acceleration factor of four and parallel imaging factor iPAT of two, scanned in an interleaved fashion). Image resolution was 2 × 2 × 2 mm voxels (no gap between axial slices), FOV = 97 × 115 × 78 mm (112 × 112 × 76 acquisition matrix). All images were acquired at a 30° angle off the anterior-posterior commissures line to reduce signal dropout in the ventral frontal cortex (Deichmann et al., 2003). We also calculated field maps (b0) using the phase encoding polarity (PEPOLAR) technique, acquiring three images in two opposite phase encoding directions (anterior–posterior and posterior–anterior), to correct for susceptibility-induced distortions.
Data exclusion
Eye-tracking data were discarded for a trial if <70% of samples could be labeled as fixations. Participants who performed poorly in the bidding fMRI task were excluded from analysis based on preregistered exclusion criteria. Specifically, participants with an average rating error ≥15 ILS in at least one condition, or an average rating error ≥5 ILS for any single object, were excluded. These criteria ensured that no participant using heuristics to estimate value (i.e., rough guessing based on a reduced number of attributes) was included in the final sample. In the behavioral study, three participants were excluded because of large average rating error in the elemental condition (these participants seemingly failed to sum the two attributes and instead rated based on only one), three because of large average rating error in the configural condition, and three because of large average rating error in both conditions. In the fMRI study, three participants were excluded because of large average rating error in the elemental condition (one gave seemingly random ratings and two failed to sum the attributes), two because of large average rating error in the configural condition, one because of large error on two specific objects in the configural condition, and three because of large average rating error in both conditions.
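The cutoffs can be illustrated with the following sketch, assuming a long-format table of trial-wise rating errors with hypothetical column names.

```python
# Sketch of the preregistered exclusion rule, assuming a long-format pandas
# DataFrame with hypothetical columns 'participant', 'condition', 'object',
# and 'error' (absolute rating error in ILS).
import pandas as pd

def flag_exclusions(df, condition_cutoff=15, object_cutoff=5):
    by_condition = df.groupby(['participant', 'condition'])['error'].mean()
    by_object = df.groupby(['participant', 'object'])['error'].mean()
    flagged = set(by_condition[by_condition >= condition_cutoff]
                  .index.get_level_values('participant'))
    flagged |= set(by_object[by_object >= object_cutoff]
                   .index.get_level_values('participant'))
    return flagged  # participants meeting either exclusion criterion
```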
Statistical analysis
Behavioral data analysis
Learning outside the scanner was assessed by the change in average value rating error across learning blocks. Error was defined as the absolute difference between the rating provided by the subject and the true value of the object or attribute. A repeated measures ANOVA with learning block (five levels) and condition (two levels) as within-subject factors was used to analyze error across learning trials. Group-level value rating error in the learning probes was compared between conditions using a paired-sample t test.
Performance in the bidding task inside the scanner was analyzed by calculating the average error (absolute difference between bid value and instructed value) across the eight repetitions of each of the 12 objects, as well as the average error by condition. Group-level bidding error was compared between conditions using a paired-sample t test. Rating reaction times were similarly compared between conditions.
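For illustration, this group-level comparison reduces to a paired-sample t test on per-participant mean errors, as in the sketch below with hypothetical data.

```python
# Sketch of the group-level bidding-error comparison, assuming one mean
# absolute error per participant and condition; the numbers are hypothetical.
import numpy as np
from scipy import stats

error_configural = np.array([2.1, 3.0, 1.8, 2.6, 2.9])  # ILS
error_elemental  = np.array([1.9, 2.7, 1.5, 2.2, 2.4])

t_stat, p_value = stats.ttest_rel(error_configural, error_elemental)
diff = error_configural - error_elemental
cohens_d = diff.mean() / diff.std(ddof=1)  # paired-samples effect size
print(t_stat, p_value, cohens_d)
```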
Eye-tracking data analysis
Eye-tracking data files in EyeLink (EDF) format were converted using the Edf2Mat MATLAB Toolbox. Periods of eye blinks were removed from the data, after which the x and y coordinates and the duration of each fixation during the 3 s of object presentation were extracted. We labeled each fixation according to whether it fell on one of the two learned attributes or on neither. The attribute areas of interest (AOIs) were defined by drawing the two largest equal-sized rectangles centered on the attributes of interest that did not overlap with each other. The same two AOIs were used for the six objects within each set. All AOIs covered an equal area of the visual field, although the positions varied between object sets. For an example of the preregistered AOIs, see Figure 6B. AOIs for all object sets along with their exact coordinates in screen pixels are reported in the preregistration document (https://osf.io/4d2yr).
For each subject and each condition, we calculated the average number of fixations per trial and the number of fixations in each of the AOIs. We also calculated the average duration of individual fixations within each AOI and the total time spent fixating on each AOI. Finally, we calculated the average number of transitions from one attribute AOI to the other. We counted as a transition every instance of a fixation falling on an AOI immediately preceded by a fixation falling on the other AOI. These variables were compared between conditions at the group-level using paired sample t tests.
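These trial-level measures, including the transition count defined above, can be summarized as in the following sketch, assuming each fixation has already been assigned to one of the attribute AOIs ('A', 'B') or to neither (None).

```python
# Sketch of the trial-level eye-tracking measures. Each fixation is assumed to
# be an (aoi, duration_ms) tuple in temporal order, with aoi in {'A', 'B', None}.
def trial_gaze_metrics(fixations):
    labels = [aoi for aoi, _ in fixations]
    n_fixations = {aoi: labels.count(aoi) for aoi in ('A', 'B')}
    dwell_time = {aoi: sum(dur for a, dur in fixations if a == aoi) for aoi in ('A', 'B')}
    # A transition is a fixation on one attribute AOI immediately preceded by a
    # fixation on the other attribute AOI.
    n_transitions = sum(1 for prev, cur in zip(labels, labels[1:])
                        if {prev, cur} == {'A', 'B'})
    return n_fixations, dwell_time, n_transitions

# Example: A -> B -> outside -> B -> A contains two transitions (A->B and B->A).
print(trial_gaze_metrics([('A', 220), ('B', 180), (None, 150), ('B', 200), ('A', 240)]))
```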
fMRI data preprocessing
Raw imaging data in DICOM format were converted to NIfTI format and organized to fit the Brain Imaging Data Structure (BIDS; Gorgolewski et al., 2016). Facial features were removed from the anatomic T1-weighted (T1w) images using PyDeface (https://github.com/poldracklab/pydeface). Preprocessing was performed using fMRIPrep version 1.3.0.post2 (RRID:SCR_016216; Esteban et al., 2019), based on Nipype version 1.1.8 (RRID:SCR_002502; Gorgolewski et al., 2011).
For anatomical data preprocessing, the T1w image was corrected for intensity nonuniformity with N4BiasFieldCorrection (Tustison et al., 2010), distributed with Advanced Normalization Tools (ANTs) version 2.2.0 (RRID:SCR_004757; Avants et al., 2008) and used as T1w-reference throughout the workflow. The T1w-reference was then skull stripped using antsBrainExtraction.sh (ANTs 2.2.0), using OASIS30ANTs as target template. Brain surfaces were reconstructed using recon-all (FreeSurfer 6.0.1; RRID:SCR_001847; Dale et al., 1999), and the brain mask estimated previously was refined with a custom variation of the method to reconcile ANTs-derived and FreeSurfer-derived segmentations of the cortical gray matter of Mindboggle (RRID:SCR_002438; Klein et al., 2017). Spatial normalization to the ICBM 152 Nonlinear Asymmetrical template version 2009c (RRID:SCR_008796; Fonov et al., 2009) was performed through nonlinear registration with antsRegistration (ANTs 2.2.0), using brain-extracted versions of both T1w volume and template. Brain tissue segmentation of cerebrospinal fluid (CSF), white matter (WM) and gray matter (GM) was performed on the brain-extracted T1w using FAST version 5.0.9 (FSL; RRID:SCR_002823; Zhang et al., 2001).
For functional data preprocessing, the following steps were performed for each of the eight BOLD runs per subject (across all tasks and sessions). First, a reference volume and its skull-stripped version were generated using a custom methodology of fMRIPrep. A deformation field to correct for susceptibility distortions was estimated based on two echo-planar imaging references with opposing phase-encoding directions, using 3dQwarp (Cox and Hyde, 1997; Analysis of Functional NeuroImages, 20160207). Based on the estimated susceptibility distortion, an unwarped BOLD reference was calculated for a more accurate coregistration with the anatomic reference. The BOLD reference was then coregistered to the T1w reference using bbregister (FreeSurfer), which implements boundary-based registration (Greve and Fischl, 2009). Coregistration was configured with 9 df to account for distortions remaining in the BOLD reference. Head-motion parameters with respect to the BOLD reference (transformation matrices, and six corresponding rotation and translation parameters) were estimated before any spatiotemporal filtering using MCFLIRT version 5.0.9 (FSL; Jenkinson et al., 2002). The BOLD time series (including slice-timing correction when applied) were resampled onto their original, native space by applying a single composite transform to correct for head motion and susceptibility distortions. These resampled BOLD time series are referred to as preprocessed BOLD in original space, or just preprocessed BOLD. The BOLD time series were also resampled to Montreal Neurological Institute (MNI)152NLin2009cAsym standard space, generating a preprocessed BOLD run in MNI152NLin2009cAsym space.

Several confounding time series were calculated based on the preprocessed BOLD: frame-wise displacement (FD), DVARS, and three region-wise global signals. FD and DVARS were calculated for each functional run, both using their implementations in Nipype (following the definitions by Power et al., 2014). The three global signals were extracted within the CSF, the WM, and the whole-brain masks. Additionally, a set of physiological regressors were extracted to allow for component-based noise correction (CompCor; Behzadi et al., 2007). Principal components were estimated after high-pass filtering the preprocessed BOLD time series (using a discrete cosine filter with 128 s cutoff) for the two CompCor variants: temporal (tCompCor) and anatomic (aCompCor). Six tCompCor components were calculated from the top 5% variable voxels within a mask covering the subcortical regions. This subcortical mask was obtained by heavily eroding the brain mask, ensuring it did not include cortical GM regions. For aCompCor, six components were calculated within the intersection of the aforementioned mask and the union of CSF and WM masks calculated in T1w space, after their projection to the native space of each functional run (using the inverse BOLD-to-T1w transformation). The head-motion estimates calculated in the correction step were also placed within the corresponding confounds file. All resamplings were performed with a single interpolation step by composing all the pertinent transformations (i.e., head-motion transform matrices, susceptibility distortion correction, and coregistrations to anatomic and template spaces).
Gridded (volumetric) resamplings were performed using antsApplyTransforms (ANTs), configured with Lanczos interpolation to minimize the smoothing effects of other kernels (Lanczos, 1964). Nongridded (surface) resamplings were performed using mri_vol2surf (FreeSurfer).
Confound files were created for each scan (each run of each task of each participant, in TSV format), with the following columns: SD of the root mean squared intensity difference from one volume to the next (DVARS), six anatomic component-based noise correction (aCompCor) components, frame-wise displacement, and six motion parameters (translation and rotation each in three directions) as well as their temporal derivatives, quadratic terms, and squared derivatives (Friston 24-parameter model; Friston et al., 1996). A single time point regressor (a single additional column) was added for each volume with an FD value larger than 0.9 to model out volumes with excessive motion. Scans with >15% scrubbed volumes were excluded from analysis.
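A sketch of assembling these confound columns from an fMRIPrep output is shown below; the column names follow recent fMRIPrep releases and may differ across versions, and the file name is hypothetical.

```python
# Sketch of confound-file assembly from an fMRIPrep confounds table. Column
# names (trans_x, rot_x, framewise_displacement, a_comp_cor_*) are assumptions
# based on recent fMRIPrep releases; the file name is hypothetical.
import pandas as pd

conf = pd.read_csv('sub-01_task-bid_run-1_desc-confounds_regressors.tsv', sep='\t')

motion = conf[['trans_x', 'trans_y', 'trans_z', 'rot_x', 'rot_y', 'rot_z']]
derivatives = motion.diff().fillna(0)
friston24 = pd.concat([motion,
                       derivatives.add_suffix('_derivative'),
                       (motion ** 2).add_suffix('_power2'),
                       (derivatives ** 2).add_suffix('_derivative_power2')], axis=1)

confound_regressors = pd.concat(
    [friston24,
     conf[['dvars', 'framewise_displacement']].fillna(0),
     conf.filter(like='a_comp_cor_').iloc[:, :6]],   # first six aCompCor components
    axis=1)

# One additional single-time-point regressor per volume with FD > 0.9.
fd = conf['framewise_displacement'].fillna(0)
for vol in fd.index[fd > 0.9]:
    confound_regressors[f'scrub_{vol:03d}'] = (conf.index == vol).astype(int)

run_excluded = (fd > 0.9).mean() > 0.15  # runs with >15% scrubbed volumes were excluded
```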
fMRI data analysis
The fMRI data were analyzed using FSL FEAT (fMRI Expert Analysis Tool; Smith et al., 2004). A general linear model (GLM) was estimated to extract contrasts of parameter estimates at each voxel for each subject for each of the four fMRI runs (first-level analysis). Contrasts of parameter estimates from the four runs were then averaged within participants using a fixed effects model (second-level analysis). Group-level effects were estimated using a mixed effects model (FSL's FLAME-1).
The GLM included one regressor modeling the 3 s object presentation time for configural trials and one regressor modeling object presentation for elemental trials. The model also included one regressor modeling object presentation for configural trials modulated by the value rating of the object provided on each trial (mean centered) and the equivalent regressor for elemental trials. We included four regressors modeling the rating epoch of the trial: two unmodulated regressors modeling the rating scale for configural trials and elemental trials separately, and two regressors modeling the rating scale epoch modulated by value ratings (mean centered) for configural trials and elemental trials separately. The duration of the rating event in these four regressors was set to the average rating reaction time across all participants and runs. Rating reaction times were accounted for in the model using a separate regressor modeling the rating epoch for all trials, modulated by the trial-wise reaction time (mean centered); the duration was set to the maximum response time of 3 s in cases where the time limit was reached. All regressors included in this GLM were convolved with a canonical double-γ hemodynamic response function. Their temporal derivatives were also included in the model, along with the motion and physiological confound regressors estimated by fMRIPrep as described above.
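To illustrate how a value-modulated regressor of this kind is built, the simplified sketch below constructs one such regressor; the actual design matrices were specified in FSL FEAT, the HRF shown is a standard double-gamma approximation, and the onsets, durations, and ratings are hypothetical.

```python
# Simplified sketch of a parametrically modulated regressor (e.g., configural
# object presentation modulated by mean-centered value rating). The actual
# design matrices were specified in FSL FEAT; all inputs here are hypothetical.
import numpy as np
from scipy.stats import gamma

TR, n_vols, dt = 1.2, 400, 0.1
t_hr = np.arange(0, n_vols * TR, dt)             # high-resolution time grid (s)

def double_gamma_hrf(t):
    # Standard double-gamma approximation of the canonical HRF.
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

def modulated_regressor(onsets, durations, modulators):
    stick = np.zeros_like(t_hr)
    for onset, dur, mod in zip(onsets, durations, modulators):
        stick[(t_hr >= onset) & (t_hr < onset + dur)] = mod   # mean-centered modulator
    hrf = double_gamma_hrf(np.arange(0, 32, dt))
    convolved = np.convolve(stick, hrf)[:len(t_hr)]
    return convolved[::int(round(TR / dt))]                   # one sample per TR

regressor = modulated_regressor(onsets=[10.0, 40.0], durations=[3.0, 3.0],
                                modulators=[1.5, -0.8])       # e.g., two configural trials
```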
Regions of interest
A vmPFC ROI was defined using the combination of the Harvard-Oxford regions frontal pole, frontal medial cortex, paracingulate gyrus and subcallosal cortex, falling between MNI x = −14 and 14 and z < 0, as in Schonberg et al. (2014). This ROI was used for small volume correction where specified.
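As an approximation of this mask, the sketch below reconstructs a comparable ROI from the Harvard-Oxford atlas distributed with nilearn; the label names follow that atlas, and the result is not identical to the mask used in the study.

```python
# Sketch of an approximate vmPFC ROI, using the Harvard-Oxford cortical
# maximum-probability atlas distributed with nilearn.
import numpy as np
import nibabel as nib
from nilearn import datasets, image

atlas = datasets.fetch_atlas_harvard_oxford('cort-maxprob-thr25-2mm')
atlas_img = image.load_img(atlas.maps)
labels_data = atlas_img.get_fdata()

regions = ['Frontal Pole', 'Frontal Medial Cortex', 'Paracingulate Gyrus', 'Subcallosal Cortex']
region_values = [atlas.labels.index(name) for name in regions]
mask = np.isin(labels_data, region_values)

# Restrict to MNI -14 <= x <= 14 and z < 0 using the voxel-to-world affine.
i, j, k = np.indices(labels_data.shape)
xyz = nib.affines.apply_affine(atlas_img.affine, np.stack([i, j, k], axis=-1))
mask &= (np.abs(xyz[..., 0]) <= 14) & (xyz[..., 2] < 0)

nib.save(nib.Nifti1Image(mask.astype(np.uint8), atlas_img.affine), 'vmpfc_roi.nii.gz')
```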
In addition, we defined four ROIs along the ventral visual stream: the perirhinal cortex (PRC), the parahippocampal place area (PPA), the fusiform face area (FFA), and the lateral occipital complex (LOC), using functional localizer data as in Erez et al. (2016). The PRC was defined based on a probabilistic map (Devlin and Price, 2007) created by superimposing the PRC masks of 12 subjects, segmented based on anatomic guidelines in MNI-152 standard space. We thresholded the probabilistic map to keep voxels having >30% chance of belonging to the PRC, as in previous work (Erez et al., 2016). The LOC was defined as the region located along the lateral extent of the occipital pole that responded more strongly to objects than scrambled objects (p < 0.001, uncorrected). The FFA was defined as the region that responded more strongly to faces than objects. The PPA was defined as the region that responded more strongly to scenes than to objects. For each of these contrasts, a 10 mm radius sphere was drawn around the peak voxel in each hemisphere using FSL (fslmaths). To analyze brain activity in these regions during the bidding task, cope images from the second-level analysis (average of the four runs for each participant) were converted to percent signal change before averaging across all voxels within each ventral visual stream ROI. Group-level activations were compared against 0 using one-sample t tests.
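The sphere-ROI and percent signal change logic can be illustrated as follows; the peak coordinate, file names, and the simplified percent-signal-change conversion are assumptions, as the published analysis was carried out with FSL tools.

```python
# Sketch of a 10 mm sphere ROI around a localizer peak and a group-level test
# of percent signal change (PSC). The peak coordinate, file names, and the
# simplified PSC conversion are hypothetical; the published analysis used FSL.
import numpy as np
import nibabel as nib
from scipy import stats

def sphere_mean(img_path, center_mni, radius=10):
    img = nib.load(img_path)
    data = img.get_fdata()
    i, j, k = np.indices(data.shape)
    xyz = nib.affines.apply_affine(img.affine, np.stack([i, j, k], axis=-1))
    sphere = np.linalg.norm(xyz - np.array(center_mni), axis=-1) <= radius
    return data[sphere].mean()

peak_mni = (44, -78, -8)   # example peak for the objects > scrambled-objects contrast
psc = []
for sub in range(1, 43):
    beta = sphere_mean(f'sub-{sub:02d}_cope_configural_value.nii.gz', peak_mni)
    baseline = sphere_mean(f'sub-{sub:02d}_mean_func.nii.gz', peak_mni)
    psc.append(100 * beta / baseline)   # simplified conversion to percent signal change

t_stat, p_value = stats.ttest_1samp(psc, 0)    # group-level activation vs. zero
print(t_stat, p_value, p_value < 0.05 / 4)     # Bonferroni across the four VVS ROIs
```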
Functional connectivity analysis
Functional connectivity was assessed using generalized psychophysiological interactions (gPPI) analysis to reveal brain regions where BOLD time series correlate significantly with the time series of a target seed region in one condition more than another (McLaren et al., 2012). The seed region was defined based on the significant activation cluster found in the group-level analysis for the configural trials value-modulation contrast, small volume corrected for the vmPFC ROI (see Fig. 4A). The seed region's neural responses to configural and elemental trials were estimated by deconvolving the mean BOLD signal of all voxels inside the seed region (Gitelman et al., 2003).
The gPPI-GLM included the same regressors as the main GLM described above, plus two PPI regressors of interest, one regressor modeling the seed region's response to configural trials and one regressor modeling the seed region's response to elemental trials. These regressors were obtained by multiplying the seed region time series with an indicator function for object presentation of the corresponding condition, and then reconvolving the result with the double-γ hemodynamic function. The model additionally included one regressor modeling the BOLD time-series of the seed region.
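A simplified sketch of how one such interaction regressor is formed is given below; the deconvolution step applied in the actual analysis (Gitelman et al., 2003) is omitted here, and the inputs are hypothetical.

```python
# Simplified sketch of one gPPI interaction regressor. The published analysis
# deconvolved the seed BOLD signal before forming the interaction; that step is
# omitted here, and all inputs are hypothetical.
import numpy as np

def ppi_regressor(seed_neural, onsets, durations, t_hr, hrf):
    """Multiply the seed's (deconvolved) neural time series by a condition
    indicator for object presentation, then reconvolve with the HRF."""
    indicator = np.zeros_like(t_hr)
    for onset, dur in zip(onsets, durations):
        indicator[(t_hr >= onset) & (t_hr < onset + dur)] = 1.0
    interaction = seed_neural * indicator
    return np.convolve(interaction, hrf)[:len(t_hr)]

# One such regressor is built for configural trials and one for elemental
# trials (different onset lists); both enter the GLM together with a regressor
# carrying the seed region's own time series.
```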
Inference criteria
For behavioral and eye-tracking analysis, we used the standard threshold of p < 0.05 for statistical significance, and we report exact p values and effect sizes for all analyses. Neuroimaging data are reported at the group level with statistical maps thresholded at Z > 3.1 and cluster-based Gaussian Random Field corrected for multiple comparisons with a (whole-brain corrected) cluster significance threshold of p < 0.05. We report analyses restricted to the vmPFC ROI using the same inference criteria, with increased sensitivity to detect effects in this region defined a priori because of fewer comparisons (small volume correction). Ventral visual stream ROI results are reported using the statistical threshold of p < 0.05, Bonferroni corrected for four comparisons (the number of ROIs; p < 0.0125).
Deviations from preregistration
The most substantial deviation from the preregistered analysis concerns the main GLM defined for fMRI analysis. We controlled for reaction times differently from what was stated in the preregistration; this was done because of a mistake in the preregistered analysis plan that proposed an approach different from the usual process of accounting for reaction time (Schonberg et al., 2014; Botvinik-Nezer et al., 2020; Salomon et al., 2020). We also conducted supplementary fMRI analyses including accuracy confound regressors in the GLM after behavioral analysis revealed a trend difference in accuracy between conditions. This analysis did not yield substantially different results, and we thus report results from the model without accuracy regressors, as preregistered.
Data and code accessibility
Unthresholded whole-brain statistical maps are available on NeuroVault.org at https://neurovault.org/collections/9558/. Neuroimaging data necessary to recreate all analyses are available in BIDS format on OpenNeuro at https://openneuro.org/datasets/ds002994/versions/1.0.1. Behavioral and eye-tracking data, codes for behavior, eye-tracking and fMRI analysis, and all experiment codes are available on GitHub at https://github.com/GabrielPelletier/fribblesFMRI_object-value-construction.
Results
Behavior
We first present the behavioral results from the behavioral and fMRI studies to establish the replicability of the behavioral effects.
Learning phase
Participants learned the value of novel multiattribute objects under two conditions, elemental and configural. Learning behavior differed between conditions in both the behavioral and the fMRI sample (this phase of the task was performed outside the scanner in both studies), with configural associations being generally harder to learn than elemental ones, as detailed below.
Value rating errors decreased across learning blocks and were overall higher in the configural condition (Fig. 3A). A repeated measures ANOVA with block and condition as within-subject factors revealed a main effect of block (behavioral sample, F(4,128) = 58.21, p < 0.001, η2p = 0.45; fMRI sample, F(4,164) = 60.73, p < 0.001, η2p = 0.40) and a main effect of condition (behavioral sample, F(1,32) = 372.14, p < 0.001, η2p = 0.56; fMRI sample, F(1,41) = 470.84, p < 0.001, η2p = 0.56) on value rating error. We also found a significant block by condition interaction (behavioral sample, F(4,128) = 37.98, p < 0.001, η2p = 0.35; fMRI sample, F(4,164) = 30.20, p < 0.001, η2p = 0.25). This interaction reflects that rating errors in the two conditions became more similar as learning progressed, although the rating error remained significantly greater in the configural compared with the elemental condition on the last (fifth) learning block (paired-sample t test, behavioral sample, t(32) = 4.69, p < 0.001, Cohen's d = 0.817; fMRI sample, t(41) = 6.46, p < 0.001, Cohen's d = 0.90).
Reaction times also decreased across learning blocks (main effect of block, behavioral sample, F(4,128) = 7.17, p < 0.001, η2p = 0.09; fMRI sample, F(4,164) = 26.38, p < 0.001, η2p = 0.22). Reaction times were significantly faster in the elemental compared with the configural condition (main effect of condition, behavioral sample, F(1,32) = 467.58, p < 0.001, η2p = 0.62; fMRI sample, F(1,41) = 391.35, p < 0.001, η2p = 0.51). There was no significant block by condition interaction in the behavioral sample (F(4,128) = 0.387, p = 0.818, η2p = 0.005), but there was a significant interaction in the fMRI sample (F(4,164) = 4.35, p = 0.002, η2p = 0.05).
After five learning blocks, participants completed a learning probe without feedback outside the scanner. The learning probe was designed to assess the ability to assign value to the objects during extinction. It was also important to assess the ability to sum two attribute values in the elemental condition, which only included single-attribute value associations in the learning blocks. In the learning probe, accuracy was lower in the elemental condition compared with the configural condition in the behavioral sample (paired-sample t test, t(32) = 2.13, p = 0.041, Cohen's d = 0.372) but was not significantly different between conditions in the fMRI sample (t(41) = 1.30, p = 0.201, Cohen's d = 0.201). Participants were slower in the elemental compared with the configural condition in both samples (behavioral sample, t(32) = 5.47, p < 0.001, Cohen's d = 0.953; fMRI sample, t(41) = 9.56, p < 0.001, Cohen's d = 1.48).
Bidding task
After learning, participants were shown objects from the configural and elemental sets and were asked to bid. Participants in the fMRI study performed the learning phase outside the scanner and then performed the bidding task while being scanned with fMRI. Bidding accuracy was high and not significantly different between the configural (mean rating error = 2.26 ILS, SD = 1.66) and elemental (mean = 2.04 ILS, SD = 1.66) conditions in the behavioral sample (t(32) = 1.08, p = 0.289, Cohen's d = 0.188; Fig. 3B). In the fMRI sample, bids tended to be closer to the instructed value (smaller error) in the elemental (mean = 2.18 ILS, SD = 1.02) than in the configural condition (mean = 2.55 ILS, SD = 1.63), although this difference was marginal and did not reach significance (t(41) = 1.90, p = 0.065, Cohen's d = 0.293). Value rating reaction times were not significantly different between conditions (behavioral sample, t(32) = 1.80, p = 0.081, Cohen's d = 0.314; fMRI sample, t(41) = 0.251, p = 0.803, Cohen's d = 0.038). Thus, despite some behavioral differences between conditions in the learning phase, accuracy and reaction times were similar across conditions in the bidding phase, which was the focus of subsequent analyses.
fMRI signal in the vmPFC selectively tracks configural object value
We hypothesized that the fMRI signal in vmPFC would correlate with configural object value, and that the correlation of vmPFC signal and value would be stronger for configural compared with elemental trials. To test this hypothesis, we preregistered analysis of value modulation effects at the time of object presentation in the a priori defined vmPFC region of interest using small-volume correction. The hypothesized value signal in the vmPFC was not detected during the object presentation epoch but was instead evident at the time of value rating. Two clusters in the vmPFC were significantly correlated with value for configural trials in the rating phase (Fig. 4A). In contrast, no activation clusters were found to correlate with value in the elemental trials, and the direct condition contrast revealed a significant condition by value interaction in the vmPFC, in which signal was correlated more strongly with value in configural compared with elemental trials (Fig. 4B). We decomposed this interaction by calculating the percent signal change by unit of value for each condition separately, within the significant activation cluster for the condition by value interaction (Fig. 4B, right). This analysis revealed that the condition by value interaction in this cluster was driven by a positive effect of value in the configural condition (t(41) = 3.144, p = 0.0031) and a negative effect in the elemental condition (t(41) = 3.531, p = 0.001). This analysis should be interpreted with caution as it only examines significant voxels (i.e., circular analysis) and is only intended as additional information describing the interaction.
Condition by value interaction in the ventral visual stream
We next tested whether the ventral visual stream ROIs were sensitive to the valuation condition. Our preregistered hypothesis was that at the time of object presentation, fMRI signals in the PRC, and not in posterior VVS regions, would be greater in response to objects learned in the configural condition. We found no significant main effect of condition on BOLD in the PRC (p = 0.460) or any other VVS region (LOC, p = 0.286; FFA, p = 0.731; PPA, p = 0.136; Fig. 5B) at the time of object presentation, indicating that during this time, VVS ROIs were similarly activated in response to objects learned in the configural and elemental conditions.
We next examined whether VVS regions were sensitive to value. We found a significant condition by value interaction in the PRC: in this region, the BOLD signal associated with value was stronger for configural compared with elemental trials (p = 0.016, Bonferroni corrected for four ROIs; Fig. 5B). This effect was specific to the PRC and was not found in more posterior regions of the VVS (LOC, FFA, and PPA, uncorrected ps > 0.727). We decomposed the interaction by examining value modulation in configural and elemental trials separately. In the PRC, there was a nonsignificant trend in BOLD signal to be positively correlated with value in configural trials (uncorrected p = 0.127), and negatively correlated with value in elemental trials (uncorrected p = 0.172; Fig. 5B). There was no significant effect of condition (uncorrected ps > 0.216) and no condition by value interaction (uncorrected ps > 0.394) in any VVS regions during the value rating epoch.
Whole-brain examination of configural and elemental evaluation
ROI analyses revealed value modulation effects in the vmPFC and a condition by value interaction in the PRC, but no main effect of condition. We next conducted whole-brain analyses to ask which brain regions, if any, were on average more active in configural compared with elemental trials at the time of object presentation, regardless of value (Table 1). Table 2 shows the clusters for the opposite contrast.
Eye movements distinguish between configural and elemental evaluation
In the previous section, we reported different brain regions recruited in configural and elemental object evaluation. We next investigated whether eye movements during the same 3 s object presentation epoch of the bidding task trials were different between conditions (Fig. 6A,B). The average number of fixations made on the whole object was similar across conditions (behavioral sample, t(32) = 1.741, p = 0.091, Cohen's d = 0.303; fMRI sample, t(31) = 0.479, p = 0.635, Cohen's d = 0.083). However, we found consistent condition differences across the two samples in eye movements with respect to fixations on the value-predictive attributes. Participants made significantly more transitions between these attributes in the configural compared with the elemental condition (behavioral sample, t(32) = 3.364, p = 0.002, Cohen's d = 0.586; fMRI sample, t(31) = 2.659, p = 0.012, Cohen's d = 0.463), and the average duration of individual fixations was longer in the elemental condition (behavioral sample, t(32) = 3.611, p = 0.001, Cohen's d = 0.559; fMRI sample, t(31) = 2.211, p = 0.034, Cohen's d = 0.385).
Control analyses did not reveal a significant difference between conditions in the amount of time spent fixating areas of the objects not included in the attribute AOIs (behavioral sample, t(32) = 0.211, p = 0.827, Cohen's d = 0.039; fMRI sample, t(31) = 1.662, p = 0.114, Cohen's d = 0.204). Moreover, fixation duration was not biased toward one attribute AOI over the other in one condition compared with the other (behavioral sample, t(32) = 0.623, p = 0.536, Cohen's d = 0.108; fMRI sample, paired sample t test, t(31) = 1.564, p = 0.128, Cohen's d = 0.273).
Given these observations, we conducted exploratory (not preregistered) analyses to investigate whether gaze differences between conditions were related to differences in the brain. For each participant, we calculated the percent signal change for the configural minus elemental contrast, averaged across voxels of all significant clusters more active in configural compared with elemental trials from the group-level whole-brain analysis (Table 1). We correlated this brain activation variable with the difference in the average number of gaze transitions made during the fixed 3 s object presentation epoch between configural and elemental trials; because this epoch had a fixed duration, the gaze measure did not depend on reaction time. This revealed a significant positive correlation between the two variables: the greater the difference in brain activations, the greater the difference in eye movements (Pearson's r = 0.425, p = 0.015; Fig. 6C). There was no significant correlation between brain activation and the difference in average durations of fixations (Pearson's r = −0.075, p = 0.682) or the difference between the number of fixations (Pearson's r = 0.110, p = 0.550).
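This exploratory analysis reduces to a Pearson correlation between two per-participant difference scores, as in the following sketch with hypothetical values.

```python
# Sketch of the exploratory brain-gaze correlation, assuming per-participant
# configural-minus-elemental difference scores; the values are hypothetical.
import numpy as np
from scipy import stats

delta_bold = np.array([0.12, 0.30, -0.05, 0.22, 0.08])      # % signal change difference
delta_transitions = np.array([1.5, 2.8, 0.2, 1.9, 0.9])     # gaze-transition difference

r, p = stats.pearsonr(delta_bold, delta_transitions)
print(r, p)
```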
Functional connectivity analysis
We conducted a preregistered functional connectivity analysis using gPPI, defining the seed as the significant vmPFC clusters found for configural trials value modulation (Fig. 4A). The gPPI analysis did not reveal any cluster across the whole brain, nor any VVS region, showing evidence of greater functional connectivity with the vmPFC seed in configural compared with elemental trials, or vice versa.
Discussion
Here, using both eye-tracking and fMRI, we show behavioral and neural evidence for two distinct mechanisms of assessing the value of multiattribute objects. We found that evaluation of complex objects relied on different patterns of information acquisition, indexed by eye movements, and engaged different brain regions when value was predicted by configural relationships among attributes compared with when value could be summed from the independent values of individual attributes. Activity in the perirhinal cortex was correlated with value in configural more than elemental trials during object presentation, whereas at the time of value rating, vmPFC showed value-modulated signal for configural trials only. Participants made longer fixations on individual attributes in the elemental condition and made more gaze transitions from one attribute to another when viewing objects in the configural condition. Moreover, at the participant level, the between-condition difference in the number of gaze transitions was correlated with the difference in brain activation.
These experiments in three different samples provide evidence converging with the findings from a recent study in patients with vmPFC damage using the same type of stimuli (Pelletier and Fellows, 2019). That lesion study found that vmPFC damage impaired binary decisions between fribbles in the configural condition but not in the elemental condition. The current work provides additional support for the hypothesis that vmPFC has a unique role in inferring the value of objects based on configural information: BOLD signal in that region was only detectably modulated by object value in the configural and not the elemental condition. The present study further argues that evaluation in the configural condition engages the PRC, a region known to be critical for multiattribute object recognition, but here for the first time also implicated in the evaluation of such objects.
We did not find that the total value of an object obtained by combining two separately learned attribute values was reflected in the vmPFC fMRI signal. This null result alone cannot rule out that vmPFC is involved in value integration from multiple elements. However, together with the finding that damage to vmPFC did not substantially impair the ability to make choices based on such values, it suggests the existence of alternate mechanisms for value construction under such conditions not requiring vmPFC.
Across published fMRI work, vmPFC is reliably associated with subjective value (Rushworth and Behrens, 2008; Bartra et al., 2013). Activity in the vmPFC has also been shown to reflect the values of items composed of multiple attributes, each modeled as independently predictive of value (Basten et al., 2010; Lim et al., 2013; Suzuki et al., 2017). Although the assumption of elemental value integration made in these studies was consistent with their data, it is possible that whole option values were nonetheless estimated configurally. No previous work has contrasted these distinct types of multiattribute valuation, leaving unclear whether vmPFC value signals reflect elemental or configural assessment. The current findings add to the view that the vmPFC is not critical for value integration in general but rather becomes necessary under a narrower set of conditions (Vaidya and Fellows, 2020).
We propose a more specific account whereby the vmPFC is required for inferring value from the configural relationships among lower level attributes. This proposal is not incompatible with the common finding that vmPFC tracks value. This experiment and the preceding lesion study were designed to distinguish two modes of valuation, with objects being evaluated in a purely configural or elemental mode. However, decisions between familiar everyday objects of the type commonly used in this field likely involve a mixture of configural and elemental valuation processes. The frequent finding of value-related signal in vmPFC across studies may reflect that most decisions involve some degree of configural evaluation, rather than arguing for a general role for this region in value assessment under all conditions. This view might also explain prior observations that patients with vmPFC damage are able to evaluate complex social or aesthetic stimuli, but seem to draw on different, potentially more elemental, information to assess the value of such stimuli, compared with healthy participants (Xia et al., 2015; Vaidya et al., 2018).
In this experiment, elemental decisions may be conceived of as an arithmetic problem, whereas configural decisions might rely more on associative memory. Indeed, a reverse inference analysis using Neurosynth (Yarkoni et al., 2011) revealed that the brain activation map for the configural minus elemental contrast was most similar to studies reporting the terms “Retrieval,” “Episodic” and “Recognition memory.” Although recent theories argue that value-based decisions are inextricably linked to associative memory processes (Weber and Johnson, 2006; Shadlen and Shohamy, 2016), the normative theories of choice behavior that have largely guided neuroimaging research on multiattribute decision-making to date often conceive of value as arising from an arithmetic linear integration process (Basten et al., 2010; Lim et al., 2013). Rather than being mutually exclusive, these two approaches are consistent with the coexistence of configural and elemental modes of evaluation.
Previous fMRI studies using multivariate approaches found value signals for elemental objects in vmPFC activity patterns but not in univariate analyses as used here (Kahnt et al., 2011). Another study found evidence for value signals in activity patterns in vmPFC in a configural condition (Kahnt et al., 2010), suggesting that the vmPFC is involved in both modes of evaluation. The current data are not ideally suited for a multivariate classifier approach because of the dependence between the sensory properties of the objects and their value (Kahnt, 2018). We cannot rule out that elemental value would be detectable in the vmPFC by analyzing activation patterns here, although our prior finding of intact elemental valuation following vmPFC damage argues that even if such signal is present in vmPFC, it is not critical for behavior.
The current work also addressed whether regions known to be involved in complex object recognition are likewise involved in assessing the values of such options. We found that fMRI VVS signals were differently sensitive to value across conditions. Specifically, activity in the PRC was modulated by value more for configural compared with elemental trials. There are previous reports of value-correlated signal across the VVS, including in the primary visual cortex (Serences, 2008; Nelissen et al., 2012); lateral occipital complex (Persichetti et al., 2015); the PRC (Mogami and Tanaka, 2006), and several of these regions combined (Arsenault et al., 2013; Kaskan et al., 2017). Across studies, reward has been paired with stimuli ranging in complexity from simple colored gratings to complex objects, but no work previously contrasted conditions in which evaluation relied on characteristics represented at different stages of the VVS hierarchy. Our findings suggest a selective involvement of the PRC in encoding value when it is associated with the high-level (i.e., configural) object representations that this region supports. This is compatible with previous findings that the change in value of faces and objects is associated with changes in face and object processing regions, respectively (Botvinik-Nezer et al., 2020; Salomon et al., 2020), arguing that value learning and storage occurs partly through experience-dependent plasticity in sensory cortex (Schonberg and Katz, 2020).
The condition by value interaction for PRC activity was observed only during object presentation, on average 6 s before the value rating epoch in which value-related signals were detected in the vmPFC, arguing against the possibility that value-related PRC activation was driven by the vmPFC. This finding rather suggests that the VVS, in addition to being involved in value learning, is also involved in developing value representations during object recognition, with vmPFC activation following later in the decision process. This is consistent with electrophysiological recordings in macaques, which reported value sensitivity in PRC neurons at ∼200 ms after stimulus onset (Mogami and Tanaka, 2006), whereas other work detected value-selective signals only after 400–500 ms in the orbitofrontal cortex (Wallis and Miller, 2003; Kennerley et al., 2009). Electroencephalography in humans likewise revealed value-correlated signals in response to reward-paired objects emerging earlier in the occipital cortex than in the prefrontal cortex (Larsen and O'Doherty, 2014). Together with these data, the current work is compatible with the idea that value representations emerge gradually from the onset of sensory processing and are refined incrementally until action selection (Yoo and Hayden, 2018). This framework further helps in interpreting the significant condition by value interaction in PRC signals, despite nonsignificant (but trending) effects of value: at this lower level stage of the putative decision-making hierarchy, value-related signals would be expected to interact with the object recognition processes that this region supports and to be less closely tied to the behavioral output (e.g., value ratings) than signals at later stages such as the vmPFC.
We also found systematic differences in eye gaze patterns between conditions, replicated in two samples. Moreover, we found that the greater the difference in gaze transitions in the configural compared with the elemental trials, the greater the difference in brain activation between conditions. The brain regions included in the correlation analysis (Table 1) overlap with regions previously associated with the sensorimotor aspect of gaze behavior (e.g., frontal eye fields, posterior parietal cortex, cerebellum) and with those associated with the cognitive control of gaze, involving memory and planning (e.g., lateral PFC, caudate, thalamus; Sweeney et al., 2007). This brain-gaze correlation might reflect both the cognitive processes underlying multiattribute evaluation in different conditions and visuomotor confounds associated with the generation of saccades. Further studies will be needed to replicate and extend this exploratory finding.
Sequential sampling models have shown that value and gaze interact in driving the decision process (Shimojo et al., 2003; Krajbich et al., 2010). However, little is known about fixation patterns within multiattribute objects during choice (Krajbich, 2019) and how they relate to the value construction process. Consumer research has extensively studied decision strategies using process-tracing measures including eye tracking (Russo and Dosher, 1983; Bettman et al., 1998). However, these studies decomposed options by laying out attributes as text and numbers in a table format, which might not engage the mechanisms underlying everyday choices between complex objects that are likely to be more readily represented in the VVS. Here, we provide evidence that when evaluating complex objects having well-controlled visual properties, equal numbers of value-informative attributes, and the same overall value, value construction per se is reflected in eye movements and brain activations. These distinct forms of multiattribute evaluation may inform further work to fully understand the interplay between gaze patterns and value construction during complex decision-making (Busemeyer et al., 2019).
We did not find evidence for increased functional connectivity between the vmPFC and PRC during configural object valuation. This null result must be interpreted with caution, as the study was not powered to find such an effect. There are anatomic connections (Heide et al., 2013), and there is evidence of functional connectivity (Andrews-Hanna et al., 2014) between the vmPFC and the medial temporal lobe in humans. The PRC and medial OFC are reciprocally connected in macaques (Kondo et al., 2005), and disconnecting these regions disrupts value estimation of complex visual stimuli (Clark et al., 2013; Eldridge et al., 2016). The current findings of value-related activations at different stages of the trial in the PRC and the vmPFC suggest that interactions between these two regions might be important for value estimation in configural conditions.
Although we attempted to match the two conditions for difficulty, and further addressed this potential confound by controlling for trial-by-trial rating reaction times and accuracy in fMRI analyses, one limitation of this study is that we could not account for potential condition differences in speed of evaluation during the fixed object presentation time and the subsequent interstimulus interval. The slider response requirement also meant that motor responses were confounded with rated values, potentially explaining why motor regions showed value-correlated activation at the time of object presentation.
In conclusion, this neuroimaging study, directly linked to our recent work in lesion patients, provides evidence for two ways of building the value of complex objects, supported by at least partly distinct neural mechanisms. Leveraging object-recognition research to inform studies of multiattribute value-based decisions, this work suggests that the relationship between attributes and value might influence how an object is processed through the VVS. Research at the interface of these two fields of research may bring novel perspectives on the neural substrates of both perception and motivated behavior.
Footnotes
This work was supported by the Israel Science Foundation (Grants 1798/15 and 2004/15) to T.S. and by the Canadian Institutes of Health Research (Grant MOP-11920) and the Natural Sciences and Engineering Research Council of Canada (Grant RGPIN-2016-06066) to L.K.F. G.P. was supported by the Zavalkoff Family Foundation as part of the Brain@McGill and Tel Aviv University collaboration and by a Sandwich Scholarship from the Council for Higher Education in Israel. We thank Tom Salomon and Shiran Oren for discussions on study design and data analysis, Anastasia Saliy Grigoryan for help with participant recruitment, and the staff of the Alfredo Federico Strauss Center for Computational Neuroimaging of Tel Aviv University for help with data collection.
The authors declare no competing financial interests.
Correspondence should be addressed to Gabriel Pelletier at gabriel.pelletier@mail.mcgill.ca