Research Articles, Behavioral/Cognitive

Complexity Matters: Normalization to Prototypical Viewpoint Induces Memory Distortion along the Vertical Axis of Scenes

Yichen Wu(吴奕忱) and Sheng Li(李晟)
Journal of Neuroscience 3 July 2024, 44 (27) e1175232024; https://doi.org/10.1523/JNEUROSCI.1175-23.2024
Author affiliations: 1School of Psychological and Cognitive Sciences, Peking University, Beijing 100871, China; 2Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China; 3PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China; 4National Key Laboratory of General Artificial Intelligence, Peking University, Beijing 100871, China

Abstract

Scene memory is prone to systematic distortions potentially arising from experience with the external world. Boundary transformation, a well-known memory distortion effect along the near-far axis of three-dimensional space, reflects the observer's erroneous recall of a scene's viewing distance. Researchers have argued that normalization to a prototypical viewpoint with a high-probability viewing distance underlies this phenomenon. Herein, we hypothesized that the prototypical viewpoint also exists in the vertical angle of view (AOV) dimension and could cause memory distortion along scenes' vertical axis. Human subjects of both sexes were recruited to test this hypothesis in two behavioral experiments, which revealed a systematic memory distortion in the vertical AOV in both the forced choice (n = 79) and free adjustment (n = 30) tasks. Regression analysis implied that the complexity information asymmetry along scenes' vertical axis and the independent subjective AOV ratings from a large set of online participants (n = 1,208) could jointly predict the AOV biases. Furthermore, in a functional magnetic resonance imaging experiment (n = 24), we demonstrated the involvement of areas in the ventral visual pathway (V3/V4, PPA, and OPA) in AOV bias judgment. In a magnetoencephalography experiment (n = 20), we could significantly decode the subjects' AOV bias judgments ∼140 ms after scene onset and the low-level visual complexity information around a similar temporal interval. These findings suggest that AOV bias is driven by the normalization process and associated with neural activities in the early stage of scene processing.

  • angle of view
  • boundary transformation
  • fMRI
  • MEG
  • PPA
  • scene perception

Significance Statement

Perceiving a scene with high precision is critical for our navigation and interaction with the surrounding environment. However, systematic memory distortion is quite common. Herein, we discovered that scene memory can be distorted toward the upper or lower visual field. According to the behavioral results, multiple measures of scenes' complexity information were critically involved in the formation of this memory distortion. Furthermore, the results support the normalization theory, which posits a high-probability prototypical viewpoint in scene processing. Our findings also suggest that the scene memory distortion induced by the normalization process could benefit the observer's future action selection. The identified complexity measures could be used in designing artificial intelligence (AI) systems with navigational functions.

Introduction

Our observation of the surrounding environment is often constrained by its size and how we interact with it. For example, we rarely view kitchens from a far distance as they are usually too small, and we need to be close to the stove among the other resources in the small space when cooking or during routine visits to the kitchen. Such ecological constraints may limit probable viewpoints of scenes to a narrow range. In this regard, the literature has long speculated about the existence of a high-probability prototypical viewpoint in the perception and memory of scenes (Intraub et al., 1992; Konkle and Oliva, 2007; Bainbridge and Baker, 2020). Consistent with this hypothesis, studies have associated scene memory with systematic distortions toward a high-probability viewpoint of the environment (J. Park et al., 2021; Lin et al., 2022). Moreover, incorporating prior information into scene representation is believed to be an adaptive mechanism in noisy environments (de Lange et al., 2018; Press et al., 2020).

One remarkable aspect of systematic distortions in scene representation is boundary transformation, a phenomenon in which the memorized boundaries of natural scene images consistently zoom out (boundary extension) or in (boundary contraction) across subjects and paradigms (Intraub and Richardson, 1989; Bainbridge and Baker, 2020). This phenomenon could be attributed to the memory normalization process (Bartlett, 1932), in which the stored representation is normalized toward the scene's high-probability viewing distance (Bainbridge and Baker, 2020; Lin et al., 2022; Gandolfo et al., 2023). Moreover, multiple studies on boundary transformation have depicted the viewing distance as a crucial component of the prototypical viewpoint (J. Park et al., 2021; Hafri et al., 2022; Lin et al., 2022).

Herein, we hypothesized that scenes’ vertical angle of view (AOV) is another critical dimension of the prototypical viewpoint that influences scene processing. Two lines of evidence support this hypothesis. First, gravity causes differential compositions in the top and bottom halves of scenes. In an indoor scene, the less informative ceiling would be found in the top half, while the floor and other informative objects would be in the bottom half. For navigational purposes, immovable landmarks, such as buildings, are more likely to be in the top half of an outdoor scene, while the paths that we walk on are more likely to be in the bottom half (Greene, 2013). This asymmetrical layout could affect the AOV adopted to observe and interact with the scenes. Second, the vertical layout of scenes influences scene processing. For example, vertically inverted scene images increase the difficulty of identification and change detection tasks (Rock, 1974; Shore and Klein, 2000; R. A. Epstein et al., 2006). In line with these behavioral effects, the typical vertical location of a scene fragment in the environment can be used to predict the cortical response to that fragment (Kaiser et al., 2020). These findings support the existence of a high-probability prototypical viewpoint along the vertical axis of natural scenes.

Herein, we adopted a paradigm that induced vertical distortion in scene memory to explore the prototypical viewpoint along the vertical axis of scenes. Two behavioral experiments were performed, in which subjects were instructed to recognize the vertical AOV of scene images in forced choice and free adjustment tasks. The results revealed consistent memory distortion along the vertical axis of scenes. Further analysis demonstrated that the complexity information asymmetry between the top and bottom halves of the scenes, as well as the subjective AOV ratings from an independent group of subjects, could predict the direction and magnitude of the memory distortion. Two brain imaging experiments [functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG)] were also performed, providing evidence that behavioral AOV biases are associated with early cortical activities that could reflect the feedforward sweep of scene processing.

Materials and Methods

Experiment 1: AOV judgment

Subjects

In this experiment, 97 subjects were involved: 49 healthy adults (mean age, 21.8 years; SD = 2.4; females, 32) participated in the 2AFC task, and 48 (mean age, 22.7 years; SD = 3.2; females, 33) participated in the 3AFC task. Nine subjects from each of the two task groups were excluded due to low judgment accuracy in the probe image-set test (see details below). The subjects provided oral informed consent prior to their participation and had normal or corrected-to-normal vision. The sample size was comparable with those in previous in-lab boundary transformation studies (Intraub and Richardson, 1989; Intraub and Dickinson, 2008). An additional 1,208 subjects were recruited for the AOV rating task via the online survey platform Wenjuanxing (Changsha Ranxing Information Technology). All subjects were compensated for their time. The Committee for Protecting Human and Animal Subjects at the School of Psychological and Cognitive Sciences at Peking University approved the study protocol for all four experiments (Institutional Review Board Protocol No: 2021-06-08).

Stimuli

The stimuli were displayed using a cathode ray tube (CRT) monitor (refresh rate, 120 Hz) at a visual angle of 11.4°. The subjects’ heads were stabilized at a head-up AOV (i.e., they neither bowed nor raised their heads) using a chin rest.

SUN image-set

A set of 500 naturalistic images was gathered from the Scene Understanding (SUN) Database, which contains 131,000 images of 908 different scene categories (Xiao et al., 2010). We used the same SUN image-set as in Bainbridge and Baker (2020). The images were 350 × 350 pixels in size and were sampled from 492 different categories.

Probe image-set

Probe images, a set of 60 panorama images gathered from the Internet, were included in the task to monitor the subject's attention. For each panorama image, three probe images were created by screenshotting it with a step of 10° AOV difference in Unity software (version 2021.1.3). The images were then rescaled to 350 × 350 pixels.

Experimental design

AOV judgment tasks

Subjects were sequentially presented with two similar scenes with a slightly different AOV, and they needed to judge the direction of the AOV difference. Subjects performed a training session before the actual experiment with 10 scene images from the probe image-set. These 10 images would not appear in the subsequent task. The first and second scenes in a training trial had a 10° AOV difference. Thus, subjects perceived an actual AOV difference in the training session. Subjects passed the training session if their performance reached 90% correct.

The procedures of the trials in the 2AFC and 3AFC tasks are shown in Figure 1A. Each trial started with a 1,000 ms fixation interval, followed by the presentation of the first scene image (11.4° × 11.4°) for 250 ms. A 250 ms dynamic mask followed the offset of the first image and was, in turn, followed by a 1,000 ms presentation of the second scene image. The dynamic mask consisted of five mosaic-scrambled images, each presented for 50 ms. Each mask image was composed of 7 × 7 patches of unrelated images. The masks and the second image were the same size as the first image. Subjects had 3,000 ms to respond to the AOV change. For scenes from the SUN image-set, the first and second scene images were identical. For scenes from the probe image-set, the second scene had a 10° upward or downward AOV change relative to the first scene. In the 2AFC task, subjects indicated whether the second scene had an upper AOV (up choice, indicating low bias, scored as −1) or a lower AOV (down choice, indicating up bias, scored as +1) relative to the first scene. In the 3AFC task, subjects responded with one of three options: up, the-same (no bias, scored as 0), or down. Every subject completed all 500 scenes in the SUN image-set and 50 scenes in the probe image-set. Trials were divided into 10 blocks of 55 trials. Each block contained 50 scenes from the SUN image-set and five scenes from the probe image-set. Subjects with below 80% accuracy for the probe images were excluded from further analysis. For the remaining subjects, only trials with scenes from the SUN image-set were included in the analysis.
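The scoring scheme maps each response to −1, 0, or +1 and averages across subjects to obtain a per-scene AOV bias score. The following minimal Python sketch illustrates this computation on a hypothetical response matrix; it is not the authors' analysis code, and the data are randomly generated for illustration.

```python
import numpy as np

# Hypothetical response matrix: responses[s, i] is subject s's choice for scene i,
# coded as -1 (up choice -> low bias), +1 (down choice -> up bias), 0 (the-same, 3AFC only).
rng = np.random.default_rng(0)
responses = rng.choice([-1, 0, 1], size=(39, 500))   # e.g., 39 subjects x 500 SUN scenes

# The per-scene AOV bias score is the choice value averaged across subjects:
# positive -> up bias, negative -> low bias, near zero -> no consistent bias.
aov_bias_scores = responses.mean(axis=0)

print(aov_bias_scores[:10])
```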

AOV rating task

Subjective AOV rating was conducted online. The subjects were asked to judge the direction of the photographer's line of sight relative to the horizontal plane on a 5-point scale: (1) lower your head; (2) slightly lower your head; (3) head up; (4) slightly raise your head; and (5) raise your head. A high subjective AOV rating indicates that the scene has an upward AOV. Rating scores were collected for 300 scene images in the SUN dataset. On average, each scene image received 121 ratings (SD = 10). The subjective AOV rating for each scene image was computed by averaging the rating scores across subjects and was normalized for further regression analysis.

Data analysis

The group-averaged behavioral AOV bias score for each scene was calculated. To validate that the behavioral AOV bias scores were not the subjects’ random choices, two analyses were performed based on the bootstrap procedure. First, the distribution of AOV bias scores was compared against the null distribution. Second, the split-half analysis was conducted to examine the rating consistency across subjects. To investigate the potential relationship between the behavioral AOV bias and the information embedded in the scene images, the complexity information of the visual and object features in the scenes was quantitatively estimated. The estimated complexity information, together with the separately collected subjective AOV rating, was submitted to multiple linear regression models for further evaluation. All analyses in Experiments 1 and 2 were conducted using MATLAB R2021a (RRID: SCR_001622).

Distribution of AOV bias scores

We tested whether the distribution of AOV bias scores was different from that of the null hypothesis. The order of the scene images relative to the behavioral choices for each subject was permuted, and the AOV bias score of each scene across the subjects was recomputed. The permuted computations of AOV bias scores were bootstrapped for 1,000 iterations. The probability density function of the permuted AOV bias scores was treated as the distribution of the null hypothesis. The null distribution kept the proportions of each subject's choices but shuffled the choices for the scenes. Thus, the null distribution represented the distribution of AOV bias scores when the same group of subjects made random choices. A chi-square test was then used to examine the difference between the experimental and permuted distributions of AOV bias scores.
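A minimal sketch of this permutation procedure is given below, assuming the hypothetical response matrix from the previous sketch; the binning of scores and the use of scipy.stats.chisquare are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(0)
# responses[s, i]: subject s's coded choice (-1/0/+1) for scene i (hypothetical data).
responses = rng.choice([-1, 0, 1], size=(39, 500))
observed_scores = responses.mean(axis=0)

# Null distribution: shuffle each subject's choices across scenes (keeping that
# subject's choice proportions), then recompute the per-scene bias scores.
n_iter = 1000
permuted_scores = []
for _ in range(n_iter):
    shuffled = np.apply_along_axis(rng.permutation, 1, responses)
    permuted_scores.append(shuffled.mean(axis=0))
permuted_scores = np.concatenate(permuted_scores)

# Compare the observed and permuted score distributions with a chi-square test,
# using a common set of histogram bins over the possible score range.
bins = np.linspace(-1, 1, 21)
obs_counts, _ = np.histogram(observed_scores, bins=bins)
null_counts, _ = np.histogram(permuted_scores, bins=bins)

keep = null_counts > 0                         # chi-square needs nonzero expected counts
obs_k = obs_counts[keep]
exp_k = null_counts[keep] / null_counts[keep].sum() * obs_k.sum()
chi2, p = chisquare(obs_k, f_exp=exp_k)
print(chi2, p)
```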

Split-half consistency

The split-half analysis was applied to test the rating consistency across subjects. Subjects were randomly assigned into two halves in each iteration of the bootstrap procedure. AOV bias scores were calculated separately for each half of the subjects. The two sets of AOV bias scores were then Spearman correlated to measure the rating consistency between the two halves of subjects. The probability level of the rating consistency was derived by permuting the image order for each subject before the calculation of the group-averaged AOV bias scores. Both the split-half correlation and its probability level estimation were bootstrapped for 1,000 iterations. Split-half reliability ρ* was calculated using the Spearman–Brown prediction formula with the average Spearman correlation r. The proportion of iterations in which the permuted correlation was higher than the split-half correlation was treated as a p value.
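The split-half consistency test could be sketched as follows; the response matrix is again hypothetical, and the Spearman-Brown correction is applied to the mean split-half correlation as described in the text.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
# responses[s, i]: subject s's coded choice (-1/0/+1) for scene i (hypothetical data).
responses = rng.choice([-1, 0, 1], size=(39, 500))

n_iter = 1000
split_r, perm_r = [], []
for _ in range(n_iter):
    order = rng.permutation(responses.shape[0])
    half1, half2 = order[:len(order) // 2], order[len(order) // 2:]

    # Split-half correlation of per-scene bias scores between the two halves.
    r, _ = spearmanr(responses[half1].mean(axis=0), responses[half2].mean(axis=0))
    split_r.append(r)

    # Probability level: shuffle each subject's choices across scenes before splitting.
    shuffled = np.apply_along_axis(rng.permutation, 1, responses)
    r0, _ = spearmanr(shuffled[half1].mean(axis=0), shuffled[half2].mean(axis=0))
    perm_r.append(r0)

r_mean = np.mean(split_r)
rho_star = 2 * r_mean / (1 + r_mean)          # Spearman-Brown prediction formula
p_value = np.mean(np.array(perm_r) >= np.array(split_r))
print(rho_star, p_value)
```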

Estimation of scene complexity information

Scene complexity can be perceived effortlessly. Complexity measures were considered an influential factor for both object recognition and image memorability (Groen et al., 2018; Saraee et al., 2020). We selected two sets of three measures to evaluate the information about visual and object complexities that are particularly associated with the complexity of the scenes (Oliva et al., 2004; Kyle-Davidson et al., 2023). The visual complexity measures depict the degree of variation in low-level information such as spatial frequency and orientation. The object complexity measures describe the contents of the high-level object-related information in the scenes. Since AOV bias is bidirectional (i.e., up and down), we conjectured that the complexity information in the top or bottom half of the image may specifically contribute to the subjects’ behavioral bias. We divided each scene into the top half and bottom half and estimated the measures of complexity information separately for the two halves. Meanwhile, dimensionality reduction (principal component analysis, PCA) was employed to transform the three measures of complexity into a single score for each half of a scene image. The score was the first component of the PCA. Scores of visual and object complexities were then z-scored. Finally, the asymmetry of visual and object complexities in the vertical axis was computed by subtracting the complexity score of the bottom half from the score of the top half. The visualizations for these measures are depicted in Figure 3A.
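A compact sketch of the complexity-score construction is shown below, assuming the three measures have already been computed for each half of every scene; whether the PCA is fit separately per half (as done here) or jointly over both halves is an assumption of this sketch.

```python
import numpy as np
from scipy.stats import zscore
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# Hypothetical inputs: three complexity measures (e.g., gist entropy, subband
# entropy, edge density) for the top and bottom half of each of 500 scenes.
top_measures = rng.normal(size=(500, 3))
bottom_measures = rng.normal(size=(500, 3))

def complexity_score(measures):
    """Collapse the three complexity measures into one score (first PC), z-scored."""
    pc1 = PCA(n_components=1).fit_transform(zscore(measures, axis=0))
    return zscore(pc1[:, 0])

# Asymmetry along the vertical axis: top-half score minus bottom-half score.
visual_asymmetry = complexity_score(top_measures) - complexity_score(bottom_measures)
print(visual_asymmetry[:5])
```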

Visual complexity

The three measures of visual complexity are gist entropy, subband entropy, and edge density. These measures were calculated for each half of the scene images.

Gist entropy was estimated from the gist descriptor that quantifies spatial frequencies and orientations at different scene locations (Oliva and Torralba, 2001). The descriptor is well known for its high accuracy in scene classification tasks (Oliva and Torralba, 2001). It has also been shown that gist features correlate with neural responses in scene-selective brain areas (Henriksson et al., 2019; Lescroart and Gallant, 2019). Scene images were gray scaled and fed into 256 filters (or channels) consisting of four spatial frequencies and eight orientations at each of the eight (2 rows × 4 columns) spatial locations in each half of the scene images. Gist entropy was calculated as the Shannon entropy over all channel values.
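The sketch below computes a simplified gist-like entropy with a Gabor filter bank from scikit-image; the specific spatial frequencies and the normalization of channel values into a probability vector are assumptions made for illustration, not the original gist descriptor implementation of Oliva and Torralba (2001).

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.filters import gabor
from skimage.transform import resize

def gist_entropy(image_half, n_rows=2, n_cols=4,
                 frequencies=(0.05, 0.1, 0.2, 0.4), n_orient=8):
    """Shannon entropy over a simplified gist-like channel vector.

    Gabor energy is averaged within an n_rows x n_cols grid for each
    frequency/orientation pair, giving 4 x 8 x 8 = 256 channel values.
    """
    gray = rgb2gray(image_half)
    channels = []
    for f in frequencies:
        for k in range(n_orient):
            real, imag = gabor(gray, frequency=f, theta=k * np.pi / n_orient)
            energy = np.sqrt(real ** 2 + imag ** 2)
            # Downsampling to the grid approximates averaging within each block.
            blocks = resize(energy, (n_rows, n_cols), anti_aliasing=True)
            channels.append(blocks.ravel())
    channels = np.concatenate(channels)
    p = channels / channels.sum()              # normalize to a probability vector
    return -np.sum(p * np.log2(p + 1e-12))     # Shannon entropy (bits)

# Example: top half of a random RGB "scene" image (350 x 350 pixels).
rng = np.random.default_rng(3)
image = rng.random((350, 350, 3))
print(gist_entropy(image[:175]))
```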

Subband entropy is a measure of visual clutter and image complexity in colored scenes (Rosenholtz et al., 2007). It depicts the redundancy of scenes and is related to the number of bits required for image compression. This measure correlates well with behavioral performance in visual search tasks on complex images (Rosenholtz et al., 2007). Scene images from the RGB space were first transformed into the CIELAB space. The luminance (L) and chrominance (a, b) channels of the transformed images were then decomposed into subbands with three spatial scales and four orientations by a wavelet coder with a steerable pyramid. The Shannon entropy within each subband was summed, and the subband entropy was computed by a weighting of 0.08 for each chrominance channel entropy and 0.84 for the luminance channel entropy.
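A hedged sketch of the subband-entropy computation follows, substituting a wavelet decomposition (PyWavelets) for the steerable pyramid of Rosenholtz et al. (2007); the wavelet choice and decomposition depth are assumptions, whereas the channel weights follow the values given in the text.

```python
import numpy as np
import pywt
from skimage.color import rgb2lab

def band_entropy(coeffs, n_bins=256):
    """Shannon entropy of the histogram of one subband's coefficients."""
    hist, _ = np.histogram(coeffs.ravel(), bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def subband_entropy(image_half, levels=3, w_lum=0.84, w_chrom=0.08):
    """Weighted sum of subband entropies over the L, a, b channels.

    A wavelet decomposition stands in for the steerable pyramid used in the
    original measure; entropies are summed over subbands per channel.
    """
    lab = rgb2lab(image_half)
    total = 0.0
    for ch, weight in zip(range(3), (w_lum, w_chrom, w_chrom)):
        coeffs = pywt.wavedec2(lab[..., ch], wavelet='db4', level=levels)
        # coeffs[0] is the approximation; each further entry holds three
        # orientation subbands (horizontal, vertical, diagonal details).
        entropy = sum(band_entropy(band) for detail in coeffs[1:] for band in detail)
        total += weight * entropy
    return total

rng = np.random.default_rng(4)
image = rng.random((350, 350, 3))
print(subband_entropy(image[:175]))
```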

Edge density was estimated using a Canny edge detector. Edge density describes the percentage of pixels that are part of edges and has been suggested to correlate with subjective measures of scene complexity (Ciocca et al., 2015; Corchs et al., 2016; Nagle and Lavie, 2020). This measure is as good a predictor of visual search performance in complex scenes as subband entropy (Rosenholtz et al., 2007). Scene images were gray scaled, and a Canny edge detector was applied to the grayscale images with upper and lower thresholds of 0.125 and 0.05, respectively.
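A minimal sketch using scikit-image's Canny detector is given below; note that its threshold convention differs from MATLAB's edge function, so the 0.05/0.125 values are carried over only for illustration.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import canny

def edge_density(image_half, low=0.05, high=0.125):
    """Fraction of pixels marked as edges by a Canny detector."""
    gray = rgb2gray(image_half)
    edges = canny(gray, low_threshold=low, high_threshold=high, use_quantiles=False)
    return edges.mean()

rng = np.random.default_rng(5)
image = rng.random((350, 350, 3))
print(edge_density(image[:175]), edge_density(image[175:]))
```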

Object complexity

This was explored using human-labeled annotations of the objects from the SUN database. Each annotation contains a polygon representing the object's contour and a label representing the object's category. Objects were assigned to the top or bottom half of a scene according to the area of the object in each half. If an object occupied an area of >100 pixels in one half, it was assigned to that half of the scene. An object could belong to both halves of a scene.

The first and second measures of object complexity are the number of objects and the number of categories. The third measure is the object area, which is the sum of the areas of polygons in each half of the scenes. Objects with larger sizes are more salient and may provide more information in the AOV judgment task.
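The object-complexity measures could be computed from the annotations as sketched below, assuming the annotated polygons have already been rasterized into per-object binary masks; the toy masks and labels are hypothetical.

```python
import numpy as np

def object_complexity(object_masks, labels, height=350, min_area=100):
    """Number of objects, number of categories, and object area per image half.

    object_masks: list of boolean arrays (height x width), one per annotated object.
    labels: list of category labels, parallel to object_masks.
    """
    half = height // 2
    stats = {'top': {'n_objects': 0, 'categories': set(), 'area': 0},
             'bottom': {'n_objects': 0, 'categories': set(), 'area': 0}}
    for mask, label in zip(object_masks, labels):
        areas = {'top': int(mask[:half].sum()), 'bottom': int(mask[half:].sum())}
        for part, area in areas.items():
            # An object is assigned to a half if it covers >100 pixels there;
            # it can therefore belong to both halves.
            if area > min_area:
                stats[part]['n_objects'] += 1
                stats[part]['categories'].add(label)
                stats[part]['area'] += area
    return {part: (s['n_objects'], len(s['categories']), s['area'])
            for part, s in stats.items()}

# Toy example with two rectangular "objects".
m1 = np.zeros((350, 350), bool); m1[100:160, 50:120] = True    # top half only
m2 = np.zeros((350, 350), bool); m2[150:300, 200:300] = True   # spans both halves
print(object_complexity([m1, m2], ['chair', 'table']))
```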

Experiment 2: free adjustment

Subjects

In this experiment, 30 newly recruited healthy adults (mean age, 22.0 years; SD = 2.4; females, 14) with normal or corrected-to-normal vision were included. The subjects provided oral informed consent prior to their participation and were compensated for their time.

Stimuli

We gathered a set of 180 naturalistic indoor images in six categories (30 images per category) from the Matterport3D dataset, a large-scale RGB-D dataset containing 10,800 panoramic views obtained from 194,400 RGB-D images of 90 building-scale scenes. Categories, depth maps, camera poses, and surface normal maps of the images were also obtained from the Matterport3D dataset. The layout information was labeled in the Matterport3D-layout dataset, a layout estimation dataset for Matterport3D. The six categories selected for this study were bathroom, bedroom, kitchen, living room, hallway, and garage. All images had a head-up vertical AOV, were generated from the panoramas with a field of view (FOV) of 90°, and were resized to 350 × 350 pixels.

The stimuli were displayed using a CRT monitor (refresh rate, 120 Hz) at a visual angle of 11.4°. The subjects’ heads were stabilized at a head-up AOV using a chin rest.

Experimental design

Free adjustment task

Before the formal experiment, subjects were given six practice trials with images that were not used in the subsequent experiments. Subjects were instructed to reproduce a scene precisely by adjusting the AOV and viewing distance using the keyboard (Fig. 1B). Each trial began with a 1,000 ms fixation interval, followed by presentation of the scene image (11.4° × 11.4°) for 250 ms. A dynamic mask was displayed for 250 ms after image offset. The dynamic mask comprised five mosaic-scrambled images, each presented for 50 ms. Each mask image had 7 × 7 patches of unrelated images. There was a 750 ms fixation interval between the dynamic mask and the subsequent adjustment stage, in which the subjects were allowed three degrees of freedom to adjust the viewpoint of the scene image. The subjects were required to adjust the scene image using arrow keys to match the viewpoint of the first scene. Specifically, they were instructed to first adjust the vertical and horizontal AOVs before working on the viewing distance. Movie 1 illustrates the entire experimental procedure. Continuous AOV adjustment was achieved by generating images from the panorama according to the AOV changes, whereas viewing distance adjustment was achieved by cropping the image boundaries. The image at the start of the adjustment stage was generated with a random AOV deviation in the 16–24° range and a random viewing distance deviation in the range of −7.5 to 7.5% of the image boundary relative to the first scene image. Subjects had 18 s to respond in the adjustment stage.

Data analysis

We calculated the group-averaged AOV bias and boundary transformation scores for each scene. The AOV bias score included horizontal and vertical AOV deviations, whereas the boundary transformation score showed viewing distance deviations. Both the horizontal and vertical AOV bias scores and the boundary transformation score were then subjected to split-half consistency tests.

Estimation of scene information

Using the rich annotations in the Matterport3D dataset, we estimated the following scene information, which could be used to predict behavioral deviation.

Depth

We calculated the depth of each scene image by averaging depth values across all pixels in its depth map. For indoor scenes, the depth measure could effectively predict the boundary transformation bias (Lin et al., 2022).

Height

The height of the image viewing point was determined using the camera specifications provided in the Matterport3D dataset.

Visual complexity

Visual complexity measures were estimated as outlined in Experiment 1. Since both the vertical and horizontal AOV bias scores were gathered in Experiment 2, we computed the asymmetry in both axes using visual complexity measures.

Object complexity

Object complexity measures for both the vertical and horizontal axes were estimated as outlined in Experiment 1.

Layout orientation

The layout orientation was estimated using the surface normal vectors of the layout information provided in the Matterport3D-layout dataset. First, scene images were segmented into different regions, including the ceiling, floor, and right/middle/left walls. Normal vectors for each wall's surface were then extracted from the surface normal map. The vertical layout orientation asymmetry was determined using the mean value of the vertical components of the surface normal vectors of the pixels of the ceiling and floor. On the other hand, the horizontal layout orientation asymmetry was determined by computing the mean value of the horizontal components of the surface normal vectors of pixels of the right/middle/left walls (Fig. 5A). A positive value of the vertical layout orientation asymmetry indicated that the scene image covered a larger portion of the ceiling. On the other hand, a positive value of the horizontal asymmetry suggested that the scene image captured a larger portion of walls facing right.
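A sketch of the layout-orientation asymmetry computation is given below; the axis convention (second component vertical, first horizontal), the sign conventions, and the region label names are assumptions made for illustration and may differ from the Matterport3D-layout conventions.

```python
import numpy as np

def layout_orientation_asymmetry(normal_map, segmentation):
    """Vertical and horizontal layout-orientation asymmetries for one scene.

    normal_map: (H, W, 3) array of per-pixel surface normal vectors (x, y, z),
        with y taken as the vertical and x as the horizontal component.
    segmentation: (H, W) array of region labels, e.g. 'ceiling', 'floor',
        'left_wall', 'middle_wall', 'right_wall'.
    """
    ceiling_floor = np.isin(segmentation, ['ceiling', 'floor'])
    walls = np.isin(segmentation, ['left_wall', 'middle_wall', 'right_wall'])

    # Mean vertical component over ceiling + floor pixels: positive values
    # indicate that the image covers a larger portion of the ceiling.
    vertical_asym = normal_map[..., 1][ceiling_floor].mean() if ceiling_floor.any() else 0.0
    # Mean horizontal component over wall pixels: positive values indicate
    # a larger portion of walls facing right.
    horizontal_asym = normal_map[..., 0][walls].mean() if walls.any() else 0.0
    return vertical_asym, horizontal_asym

# Toy example: flat ceiling (normals pointing down) above a flat floor (pointing up).
seg = np.full((350, 350), 'floor', dtype=object); seg[:175] = 'ceiling'
normals = np.zeros((350, 350, 3)); normals[:175, :, 1] = -1.0; normals[175:, :, 1] = 1.0
print(layout_orientation_asymmetry(normals, seg))
```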

Experiment 3: fMRI

Subjects

In this experiment, 24 newly recruited healthy adults (mean age, 21.8 years; SD = 2.5; females, 13) with normal or corrected-to-normal vision were included. The subjects provided written informed consent prior to their participation and were compensated for their time.

Stimuli

The stimuli for this experiment were 90 scenes with the lowest AOV bias scores and 90 scenes with the highest AOV bias scores selected from the SUN image-set in Experiment 1. This selection criterion could maximize the proportion of biased choices in the fMRI task in favor of the contrasts of interest in our analysis. The 90 scenes (89 categories) with the highest AOV bias scores had an average AOV bias score of 0.2683 (SD = 0.1211), whereas the 90 scenes (88 categories) with the lowest AOV bias scores had an average AOV bias score of −0.2326 (SD = 0.1160).

The stimuli were back projected onto a translucent screen placed inside the scanner bore using a video projector (refresh rate, 60 Hz; spatial resolution, 1,024 × 768). The subjects viewed the stimuli through a mirror mounted on the head coil at a visual angle of 7.5°.

Experimental design

The procedure for the behavioral AOV judgment task was adapted to fit the fMRI requirements (Fig. 1C). Herein, an event-related fMRI design was employed. Each trial began with a 1,750 ms fixation, followed by presentation of the first scene image for 250 ms. A dynamic mask comprising four mosaic-scrambled images was then presented for 200 ms (50 ms for each image). Following that, a static mosaic mask was presented for 1,800 or 3,800 ms. After another 1,000 ms fixation, the second scene image was presented for 1,000 ms. After the offset of the second scene image, subjects were instructed to respond with their right hand to the AOV change with three options (up, the same, and down) using the response box. The participants were allowed 2,000 ms to respond. The intertrial interval was set at 2,000 or 4,000 ms. All jitters were pseudorandomized between trials. Each run had 30 trials, with two 16 s intervals at the beginning and end of the run, respectively. The length of each run was 392 s. Each run included 15 scene images with the lowest scores and 15 scene images with the highest scores from Experiment 1. Each subject performed six runs of the AOV judgment task, and each scene image appeared only once.

Figure 1. Task procedures. A, The 2AFC and 3AFC AOV judgment tasks in Experiment 1. B, The free adjustment task in Experiment 2. C, The 3AFC AOV judgment task in Experiment 3. D, The 3AFC AOV judgment task in Experiment 4.

Scanning parameters

We obtained BOLD MRI images using a 3 T Siemens Prisma scanner equipped with a 64-channel receiver head coil. Functional data were acquired using a multiband echo planar imaging sequence (multiband factor, 2; TR, 2 s; TE, 30 ms; matrix size, 112 × 112 × 62; flip angle, 90°; resolution, 2 × 2 × 2.3 mm3; number of slices, 62). Before functional scans, a high-resolution T1-weighted 3D anatomical dataset was obtained for each subject to facilitate registration (MPRAGE: TR, 2,530 ms; TE, 2.98 ms; matrix size, 448 × 512 × 192; flip angle, 7°; resolution, 0.5 × 0.5 × 1 mm3; number of slices, 192; slice thickness, 1 mm).

Functional localizers

Each subject was assigned an independent functional localizer to identify functional regions of interest (fROIs). The stimuli of the localizer were grayscale images from four categories (scenes, faces, objects, and scrambled objects). The images from the same category were presented centrally in 16 s blocks (20 images, 300 ms onset, and 500 ms blank interval) with 8 s interblock intervals. The subjects performed a one-back task, in which they responded with a button press to successive identical images. The localizer was run for 408 s. There were four onsets for each category block in each run, with the order of categories counterbalanced across subjects and runs. Each subject completed two localizer runs.

fMRI preprocessing

The anatomical and functional data were preprocessed and examined using AFNI (Cox, 1996). Functional images were slice-time corrected (using the AFNI function 3dTshift) and motion-corrected to the image that was considered a minimum outlier (3dVolreg). For whole-brain univariate analysis, each subject's structural image was aligned to the minimum outlier image (align_epi_anat.py) and then standardized to the MNI space through diffeomorphic transformation (3dQWarp). Furthermore, functional and anatomical images were normalized to the MNI space using the output of the warping process (3dNwarpApply). On the other hand, the functional images were smoothed using a Gaussian kernel with a full-width at half-maximum (FWHM) of 4 mm (3dmerge), and signal amplitudes were scaled to a standardized range of 0–100 (3dTstat). For fROI definition and multivoxel pattern analysis, the signal amplitudes of the functional images were rescaled to a 0–100 range after motion correction. All decoding analyses were performed in the original subject space.

Data analysis

Whole-brain univariate analysis

We used a general linear model (GLM) to analyze the preprocessed fMRI data collected during the AOV judgment task. We focused primarily on the regions that elicited differential responses when the AOV bias occurred. This was achieved by assigning the trials to different conditions based on the subject's responses. The low-bias condition included trials in which subjects thought the second scene had an upper AOV compared with the first scene. On the other hand, the up-bias condition included trials with a lower AOV judgment. The control condition comprised trials with the same AOV choice. We assumed that BOLD response changes to the first scene were due to the AOV bias. Thus, we developed three regressors of interest for each of the three conditions to model neural activation when the first image was displayed. The BOLD signals of other task periods were considered additional regressors, including the dynamic mask onset, three response types, six motion parameters, and three polynomials accounting for slow drifts. A canonical hemodynamic response function (HRF) was used in the GLM analysis of the data. In second-level analysis, we first employed a repeated-measures ANOVA (3dANOVA2) with predefined contrasts of low-bias versus up-bias and all bias (low-bias + up-bias) versus control. We used a statistical threshold of p < 0.001 (uncorrected) and a cluster size of 20 to identify voxels that exhibited a significantly different activation in the F test and t test for the predefined contrasts. No clusters survived this statistical thresholding criterion in the low-bias versus up-bias contrast. Furthermore, there was no significant difference in the BOLD signal between the two conditions in which AOV bias occurred. Consequently, we combined the low-bias and up-bias conditions into one bias condition and performed a t test to compare the combined bias against control conditions. The probability of false-positive clusters for each subject was estimated using AFNI 3dClustSim. In this α probability simulation, a threshold of p < 0.001 (one-tailed) was employed before simulation, and a corrected α = 0.05 was used to determine the minimum cluster size postsimulation. The average of the minimum cluster size (i.e., 31) across all subjects was computed and used as the cluster threshold for the t test.

Definition of fROIs

Based on the functional localizer runs, we defined three scene-selective areas in each hemisphere: the parahippocampal place area (PPA), the occipital place area (OPA), and the retrosplenial cortex (RSC) (R. Epstein and Kanwisher, 1998; O’Craven and Kanwisher, 2000; Dilks et al., 2013). We fitted the response model using the GLM in AFNI (3dDeconvolve, 3dREMLfit). First, BOLD responses were modeled by convolving a standard HRF with a 16 s square wave for each block category. Estimated motion parameters, subject responses, and three polynomials accounting for slow drifts were incorporated as regressors of no interest. PPA and OPA were defined as contiguous clusters of voxels with a threshold of p < 1 × 10−6 (uncorrected) under the scene > face contrast. More specifically, OPA was defined by locating the cluster near the transverse occipital sulcus, and PPA was defined by locating the cluster between the posterior parahippocampal gyrus and the lingual gyrus. All subjects had PPA and OPA in at least one hemisphere. RSC, which was not strongly activated by the scene > face contrast in the functional localizer runs, was defined as contiguous clusters of voxels under the same contrast at a more lenient threshold of p < 0.001 (uncorrected). RSC was located near the posterior cingulate cortex and was found in only 14 of the 24 subjects.

Multivoxel pattern analysis

All decoding analyses were performed using the CoSMoMVPA toolkit (RRID:SCR_014519).

Estimation of fMRI responses

Each trial's voxel-wise activations were estimated using the GLM model. Each trial was modeled with a canonical HRF from the onset of the first image over a 0.25 s duration. The additional regressors comprised the dynamic mask onset, three response types, six motion parameters, and three polynomials accounting for slow drifts. The GLM model yielded 180 regressors (β), each representing a voxel-wise activation of one trial. The obtained β images were converted into a t statistic map for further decoding analysis.

fROI analysis

For each subject, we trained a three-class linear support vector machine (SVM) to discriminate the activations evoked by the first scene image per the subject's three choices (up-bias vs low-bias vs control). We could not predict the subject's choice before the experiment, resulting in an imbalanced number of trials for each class. To mitigate the effects of the imbalanced dataset, we modified the leave-one-run-out (LORO) N-fold cross-validation procedure. In each partition, a single fMRI run was used to test the SVM classifier, while the other runs were used for training. Random trials were removed from both the training and testing sets to ensure class balance. The balanced LORO procedure was repeated until each trial appeared in the training set at least three times. On average, 104 trials (SD = 24) were included in the training set of each partition and 46 partitions (SD = 16) were performed to get the decoding accuracy for each subject.
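The balanced LORO procedure could be sketched as follows with scikit-learn; the random data, the SVM settings beyond the linear kernel, and the stopping-rule details are assumptions rather than the authors' CoSMoMVPA pipeline.

```python
import numpy as np
from sklearn.svm import SVC

def balanced_loro_decoding(X, y, runs, rng, min_repeats=3):
    """Leave-one-run-out decoding with random downsampling to balance classes.

    X: (n_trials, n_voxels) activation patterns; y: class labels (0, 1, 2);
    runs: run index of each trial. Partitions are repeated until every trial
    has entered a training set at least min_repeats times.
    """
    classes = np.unique(y)
    train_counts = np.zeros(len(y), int)
    accuracies = []
    while train_counts.min() < min_repeats:
        for test_run in np.unique(runs):
            train_idx, test_idx = [], []
            for subset_idx, mask in ((train_idx, runs != test_run),
                                     (test_idx, runs == test_run)):
                # Randomly downsample each class to the size of the smallest one.
                n_keep = min((mask & (y == c)).sum() for c in classes)
                for c in classes:
                    candidates = np.flatnonzero(mask & (y == c))
                    subset_idx.extend(rng.choice(candidates, n_keep, replace=False))
            train_counts[train_idx] += 1
            clf = SVC(kernel='linear').fit(X[train_idx], y[train_idx])
            pred = clf.predict(X[test_idx])
            # Average the per-class accuracies so each class contributes equally.
            accuracies.append(np.mean([np.mean(pred[y[test_idx] == c] == c)
                                       for c in classes]))
    return float(np.mean(accuracies))

rng = np.random.default_rng(6)
X = rng.normal(size=(180, 60))                       # 180 trials x 60 voxels
y = rng.choice([0, 1, 2], size=180)                  # up-bias / low-bias / control
runs = np.repeat(np.arange(6), 30)                   # 6 runs x 30 trials
print(balanced_loro_decoding(X, y, runs, rng))
```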

The dimensionality of the training data was reduced by selecting a subset of voxels in each fROI to mitigate the risk of overfitting when training the classifier. Based on the outcome of t tests with the scene > face contrast in the functional localizer runs, we selected 60 voxels with the strongest activation in the bilateral fROI.

Decoding accuracy was assessed by averaging the classification accuracy of the three classes in each partition. Group-level statistical analysis across subjects was performed with a one-tailed t test against the chance level (1/3). We applied a false discovery rate (FDR) multiple comparison adjustment for these statistical tests.
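A minimal sketch of the group-level test is shown below, assuming a subjects x fROIs matrix of decoding accuracies (randomly generated here); SciPy and statsmodels stand in for the original MATLAB-based statistics.

```python
import numpy as np
from scipy.stats import ttest_1samp
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(7)
# Hypothetical decoding accuracies (24 subjects x 3 fROIs: PPA, OPA, RSC).
accuracies = rng.normal(loc=[0.354, 0.356, 0.339], scale=0.03, size=(24, 3))

# One-tailed t test against the three-class chance level (1/3) for each fROI.
t_vals, p_vals = ttest_1samp(accuracies, popmean=1 / 3, alternative='greater')

# FDR (Benjamini-Hochberg) correction across the three fROIs.
reject, q_vals, _, _ = multipletests(p_vals, alpha=0.05, method='fdr_bh')
print(t_vals, q_vals, reject)
```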

Experiment 4: MEG

Subjects

Twenty new healthy adults (mean age, 21.8; SD = 2.5; 13 females) with normal or corrected-to-normal vision participated in the MEG experiment. The subjects provided written informed consent prior to their participation and were compensated for their time.

Stimuli

A total of 300 scenes (297 categories) were randomly selected from the SUN image-set in Experiment 1. The stimuli were displayed on a rear-projection screen (refresh rate, 60 Hz) with a visual angle of 5.3°.

Experimental design

Each trial began with a fixation interval of 1 or 2 s, randomized across trials. The first scene image was then presented for 250 ms, followed by another fixation interval of 1 or 2 s. The dynamic masks used in the AOV judgment task were removed to increase the signal-to-noise ratio. The second scene image was presented for 1 s, and subjects were instructed to respond after its offset to the AOV change with three options (up, the same, and down) using the response box with their right hands. The experiment consisted of six runs of 50 images. Scene images were randomly assigned across runs.

Data acquisition and preprocessing

Electromagnetic brain activity was recorded with an Elekta Neuromag 306-channel MEG system composed of 204 planar gradiometers and 102 magnetometers. The system comprises 102 triple-sensor elements, each with one magnetometer and two orthogonal planar gradiometers. Signals were sampled continuously at 1,000 Hz and bandpass filtered online between 0.1 and 330 Hz. Offline preprocessing was conducted using the MNE-Python package. Signal space separation (SSS) was performed to reduce artifacts caused by the environment and head motion. Independent component analysis (ICA) was applied to the data using the fastICA algorithm implemented in MNE-Python. Components clearly containing eyeblinks and saccades were removed from the raw unfiltered data. The data were then demeaned, detrended, and downsampled to 100 Hz. A time window of 800 ms, ranging from 200 ms before the onset of the first image to 600 ms after the onset, was applied to segment the data. Trials were excluded if any gradiometer value exceeded 5,000 fT/cm or any magnetometer value exceeded 5,000 fT.
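A hedged sketch of this preprocessing pipeline in MNE-Python follows; the file name, ICA component indices, number of components, and event codes are placeholders, the rejection thresholds are converted to MNE's SI units (T/m and T), and downsampling is applied to the epochs here rather than the continuous data.

```python
import mne

# Hypothetical file name; the recording is assumed to be an Elekta Neuromag
# .fif file with first-image onsets coded on a standard trigger channel.
raw = mne.io.read_raw_fif('subject01_raw.fif', preload=True)

# Signal space separation to reduce environmental and head-motion artifacts.
raw_sss = mne.preprocessing.maxwell_filter(raw)

# ICA (fastICA) to remove eyeblink and saccade components; the excluded
# component indices below are placeholders for visually identified components.
ica = mne.preprocessing.ICA(n_components=40, method='fastica', random_state=0)
ica.fit(raw_sss)
ica.exclude = [0, 1]          # hypothetical eyeblink/saccade components
ica.apply(raw_sss)

# Epoch from -200 to 600 ms around the first-image onset, with baseline
# correction, linear detrending, amplitude-based rejection, and resampling.
events = mne.find_events(raw_sss)
epochs = mne.Epochs(raw_sss, events, event_id={'first_image': 1},
                    tmin=-0.2, tmax=0.6, baseline=(None, 0), detrend=1,
                    reject=dict(grad=5e-10, mag=5e-12),  # 5,000 fT/cm and 5,000 fT
                    preload=True)
epochs.resample(100)
print(epochs)
```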

Multivariate analysis

For each subject, a two-class SVM classifier was trained to discriminate trials between bias and the-same judgment. Decoding was performed across 24 occipital magnetometers and one of their combined gradiometers. The selection of channels was based on the involvement of occipital areas in the AOV bias effect from the fMRI results. Temporal smoothing was applied by averaging across neighboring time points (in the range of 20 ms before and after). The balanced LORO cross-validation procedure was also used in MEG analysis, resulting in the same number of bias and control trials in the training and test sets. Each trial was included in the training set at least three times. On average, 162 trials (SD = 54) were included in the training set of each partition, and 54 partitions (SD = 16) were performed. Decoding performance was assessed by averaging the classification accuracy of the two classes of each partition.
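A simplified time-resolved decoding sketch with scikit-learn is given below; it omits the class-balancing repetition described above (see the fMRI sketch) and uses randomly generated data, so the sensor counts and labels are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC

def timeresolved_decoding(X, y, runs, smooth=2):
    """Time-resolved two-class decoding with leave-one-run-out cross-validation.

    X: (n_trials, n_channels, n_times) epoched MEG data from occipital sensors;
    y: 0 (the-same) or 1 (bias) judgments; runs: run index per trial.
    smooth: neighboring samples averaged on each side (2 samples at 100 Hz
    correspond to 20 ms before and after each time point).
    """
    n_times = X.shape[2]
    accuracy = np.zeros(n_times)
    for t in range(n_times):
        sl = slice(max(t - smooth, 0), min(t + smooth + 1, n_times))
        Xt = X[:, :, sl].mean(axis=2)            # temporal smoothing
        fold_acc = []
        for test_run in np.unique(runs):
            train, test = runs != test_run, runs == test_run
            clf = SVC(kernel='linear').fit(Xt[train], y[train])
            pred = clf.predict(Xt[test])
            # Average the two per-class accuracies (balanced accuracy).
            fold_acc.append(np.mean([np.mean(pred[y[test] == c] == c)
                                     for c in (0, 1)]))
        accuracy[t] = np.mean(fold_acc)
    return accuracy

rng = np.random.default_rng(9)
X = rng.normal(size=(300, 48, 80))               # 300 trials x 48 sensors x 80 samples
y = rng.integers(0, 2, size=300)
runs = np.repeat(np.arange(6), 50)
print(timeresolved_decoding(X, y, runs)[:5])
```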

For the two important scene features, the asymmetry of visual and object complexities in the vertical axis, a support vector regression (SVR) model was trained to detect the associated signals. SVR is a machine learning algorithm for regression analysis. It is widely used in decoding analysis of continuous variables (Schubert et al., 2021). The epsilon SVR was conducted using LibSVM with a linear kernel. A sixfold LORO cross-validation procedure was used. Decoding performance was estimated by Pearson’s correlation between the predicted and the actual feature values.

Significance against chance level (0.5 for response decoding and 0 for scene features) was tested at each time point by computing random-effects temporal-cluster statistics corrected for multiple comparisons. The null distribution was obtained from 10,000 iterations of the t test in which the sign of the above-chance decoding performance was randomly flipped. Threshold-free cluster enhancement (TFCE) was used as the cluster statistic (Smith and Nichols, 2009), with a threshold step of 0.1. Significant temporal clusters were determined by a cluster-forming threshold of p < 0.05.
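MNE-Python's cluster permutation test supports TFCE directly, so the temporal-cluster statistics could be sketched as follows; the decoding time courses are randomly generated, and only the TFCE parameters (start 0, step 0.1), the one-tailed test, and the 10,000 sign-flip permutations follow the text.

```python
import numpy as np
from mne.stats import permutation_cluster_1samp_test

rng = np.random.default_rng(8)
# Hypothetical decoding time courses (20 subjects x 80 time points at 100 Hz),
# expressed as performance minus chance (0.5 for response decoding).
scores_minus_chance = rng.normal(loc=0.01, scale=0.05, size=(20, 80))

# One-sample cluster permutation test across time with TFCE, one-tailed,
# using 10,000 sign-flip permutations.
t_obs, clusters, cluster_pv, H0 = permutation_cluster_1samp_test(
    scores_minus_chance,
    threshold=dict(start=0, step=0.1),   # threshold-free cluster enhancement
    tail=1,
    n_permutations=10000,
    seed=0,
)
print(np.where(cluster_pv < 0.05))
```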

Searchlight analysis

The same decoding method was applied in a searchlight analysis at the sensor level. All magnetometers and one of the combined gradiometers were involved in the searchlight analysis, resulting in 204 channels. For each channel location and time point, the searchlight analysis was performed with a channel neighborhood determined by Delaunay triangulation and a time neighborhood of 20 ms before and after the time point. On average each location had a neighborhood of 16 channels. The same statistic test and TFCE procedures as described above were applied to the decoding results across the scalp.

Results

Experiment 1

Biased judgment of vertical AOV

In the AOV judgment tasks, the two scene images were identical in the nonprobe trials. When asked to judge the AOV difference between the two identical scenes, subjects did not show a preference for the up or down AOV choice. In the 2AFC task, where subjects were forced to make biased choices, they made a down choice for 50.84% (SD = 6.44%) and an up choice for 49.16% (SD = 6.44%) of the scenes. A Wilcoxon signed-rank test showed no significant difference between the proportions of the two choice types (z = 0.474; p = 0.635). Although the the-same AOV option was available to the subjects in the 3AFC task, they made a down choice in 19.59% (SD = 11.55%) and an up choice in 17.01% (SD = 10.10%) of the trials [the remaining trials were judged as the-same AOV: 63.40% (SD = 19.27%)]. No significant difference between the up and down choices was found in a paired t test (t(38) = 1.619; p = 0.114).

Three statistical analyses were performed to ensure that the behavioral AOV bias scores were the results of subjects’ genuine distortion in scene memory. The first analysis showed that subjects did not respond randomly to the task (Fig. 2C). The distribution of the subjects’ AOV bias scores was compared with the null distribution generated by randomly permuting the subjects’ choices while preserving the proportions of their choices. The chi-square test was applied to examine the difference between the two distributions. In the 2AFC task, the distribution of AOV bias scores was significantly different from the permuted distribution (χ2 = 715.51; p < 0.0001). Similarly, the two distributions were significantly different in the 3AFC task (χ2 = 1,434.78; p < 0.0001). The second analysis demonstrated that subjects tended to make the same response for the same scene image. Split-half consistency analyses were conducted based on the bootstrap procedure, and the reliability ρ* was calculated using the Spearman–Brown prediction formula. The results revealed significant choice consistency across subjects in both 2AFC (Spearman–Brown reliability: ρ* = 0.6824; p < 0.001) and 3AFC (ρ* = 0.6119; p < 0.001) tasks. Third, the consistency of AOV judgment between 2AFC and 3AFC tasks was examined (Fig. 2D). The between-task consistency test showed a high Spearman correlation between the two tasks regarding AOV bias scores across scene images (Spearman's ρ = 0.6039; p < 0.0001). These converging results suggest that the observed AOV biases were not random responses and reflected the influence of the scenes’ intrinsic features on the subjects’ judgment. Representative scene images with up- and low-bias are shown in Figure 2B.

Figure 2. Behavioral results of Experiment 1. A, Schematic illustration of the emergence of the AOV bias. When a subject saw a scene image, its internal representation was distorted either upward or downward. When the second, identical image was presented, it was compared with the distorted memory of the first image, leading to a biased judgment. B, The top two ranked scene images with up and low bias in Experiment 1. The value below each image is its AOV bias score. C, Distribution of AOV bias scores. The red histogram indicates the distribution of averaged AOV bias scores, pooled across the 2AFC and 3AFC tasks. The gray histogram denotes the permuted distribution, as if the subjects had made random choices. D, Between-task consistency of the AOV bias scores.

Scene features predict AOV bias

We hypothesized that distinctive information contained in the top and bottom halves of the scenes can lead to distortion in scene memory, which, in turn, causes the subsequent bias in AOV judgment. Thus, measures related to scene complexity were utilized to assess the information that may contribute to the behavioral AOV biases. The information about low-level visual features and high-level object features was estimated for the top and bottom halves of the scenes. By applying PCA to the estimated complexity information, two scores (visual complexity and object complexity) were obtained for each half of a scene image. The first components of the PCA explained 67.81% and 65.03% of the variability across scenes for visual and object features, respectively.

Multiple linear regression analysis was utilized to test the influence of intrinsic scene features on AOV biases. AOV bias scores were treated as response variables, and individual variance was not considered in the regression model. Visual and object complexity asymmetry scores were treated as two predictors. AOV bias may be a by-product of the boundary transformation effect; for example, we may walk forward and bow our heads to watch objects or paths. To address the possibility that AOV bias and boundary transformation are two aspects of the same mental process, we added boundary transformation scores from Bainbridge and Baker (2020) as a control predictor in the model. Finally, three subjective measures (i.e., openness, AOV rating, and viewing distance rating) were also included in the model to control for the influence of the global configuration of the scene. The viewing distance rating is a key predictor of boundary transformation. Openness may also influence the judgment of vertical AOV, as the clear horizontal lines in outdoor scenes may simplify the judgment. The AOV of the scene image itself may distort the scene's representation through normalization processes such as boundary transformation.

The regression model was significant (F = 15.9; p < 0.0001) with an adjusted R2 of 0.231. The asymmetries of object and visual complexities and the AOV rating were significant predictors of AOV bias (Fig. 3B and Table 1). Positive regression coefficients for the asymmetries of object and visual complexities suggested that the internal scene representation was distorted toward the regions of higher complexity. The negative regression coefficient for the subjective AOV rating indicated that scenes with an upward AOV were more likely to show a low AOV bias. Openness, scene depth, and boundary transformation score were not significant in the model. No supporting evidence was found for the hypothesis that AOV bias is attached to boundary transformation or scene openness.

Figure 3. Regression model of Experiment 1. A, Visualizations of the measures. All visual and object features were estimated for the top and bottom halves of the scene image. B, Regression model of the AOV bias score. The bar plot illustrates the t value of each regressor in descending order. Regressors marked in red are significant predictors of the model (q < 0.05; FDR corrected).

Table 1. Regression model of AOV bias in Experiment 1

However, we did not examine the distance bias in the 2AFC and 3AFC tasks and could not directly compare the effects between boundary transformation and AOV bias. To further address this issue, we conducted Experiment 2 with a free adjustment task, in which we could obtain scores of AOV bias and boundary transformation from the same group of subjects with high precision.

Experiment 2

Biased adjustment of viewpoint position

In the free adjustment task, after viewing an indoor scene image and a short delay, subjects adjusted the pitch, yaw, and viewing distance to reproduce the scene image from memory. The free adjustment procedure allowed us to evaluate the degree of memory distortion in all dimensions of the scene space. Similar to the results of Experiment 1, subjects did not show a preference for specific directions on the vertical (t(29) = 1.074; p = 0.2916) or horizontal (t(29) = 1.685; p = 0.1026) axis (Fig. 4B). However, a significant preference was found in distance deviation (i.e., boundary transformation; t(29) = 3.340; p = 0.0023), with more scenes showing boundary contraction (Fig. 4C). This preference may reflect an image bias in the Matterport3D dataset: because large spaces are more convenient for capturing the panoramic images required for the dataset, most of the scene images were taken in relatively large rooms.

Figure 4. Behavioral results of Experiment 2. A, Representative images of Experiment 2. The scenes’ inner representations were reconstructed based on the average deviation across subjects. B, Distributions of the horizontal and vertical deviations. The red histogram indicates the distribution of averaged deviations. The gray histogram represents the permuted distribution. The circles in the scatterplot represent the 95th percentile of scores. C, Distribution of the distance deviation (i.e., boundary transformation).

Our data showed that the distributions of responses were significantly different from the permuted distributions in all three dimensions (horizontal deviation: χ2 = 199.31, p < 0.0001; vertical deviation: χ2 = 272.12, p < 0.0001; distance deviation: χ2 = 153.67, p < 0.0001). The split-half consistency analysis also revealed consistent responses across subjects (horizontal deviation: ρ* = 0.6732, p < 0.001; vertical deviation: ρ* = 0.7998, p < 0.001; distance deviation: ρ* = 0.6415, p < 0.001). These results confirmed the AOV bias observed in Experiment 1 and the classical boundary transformation effect with the free adjustment procedure. Representative scene images are shown in Figure 4A.

Scene features predict deviations

Multiple linear regression analyses were performed to test the influence of scene features on the behavioral deviations in the three dimensions. Based on the regression results of Experiment 1, the horizontal and vertical asymmetries of visual and object complexities were computed. These measures served as model predictors together with the depth and height information. Specifically, we suggest that the prototypical viewpoint of indoor scenes consists of a standard layout, such as a corner or a back wall at the center of view. Thus, we also included the asymmetry of layout orientation as a predictor in the model. In sum, the regression model of a specific behavioral deviation (horizontal, vertical, or distance) included eight variables representing intrinsic features of the scenes and two variables related to the other dimensions of deviation. The full regressor lists are shown in Figure 5B–D and Table 2.

Figure 5. Regression models of Experiment 2. A, Representative measures of layout orientation. Images in the left column are those shown in the task. Images in the right column denote the horizontal component of the normal vector of each wall in each image. B–D, Regression models of the horizontal, vertical, and distance deviations. The bar plots indicate the t value of each regressor in descending order. Regressors marked in red are significant predictors of the model (q < 0.05; FDR corrected).

Table 2. Regression models in Experiment 2

The regression models for all dimensions were significant (horizontal deviation: adjusted R2 = 0.308, F = 8.97, p < 0.0001; vertical deviation: adjusted R2 = 0.363, F = 11.2, p < 0.0001; distance deviation: adjusted R2 = 0.182, F = 4.99, p < 0.0001; Fig. 5B–D and Table 2). For the horizontal deviation, the horizontal asymmetries of layout orientation, object complexity, and visual complexity were significant predictors. For the vertical deviation, the vertical asymmetries of layout orientation and object complexity, height, and the horizontal asymmetry of layout orientation were significant predictors. The regression model of the vertical deviation agreed with the results of Experiment 1, indicating that scene memory tends to balance the complexity information between the two halves of the scenes. For the distance deviation, only depth was a significant predictor, echoing the finding that viewing distance is a key predictor of boundary transformation. Distance deviation was not a significant predictor of either vertical or horizontal deviation, further confirming that AOV bias is not a by-product of boundary transformation.

Together, the two behavioral experiments established AOV bias as a new phenomenon of scene memory distortion. The observed AOV bias mirrored the phenomenon of boundary transformation and extended the theory of viewpoint normalization in scene processing to a new dimension. Previous investigations with neuroimaging techniques have provided some insights into the neural mechanism of boundary transformation (S. Park et al., 2007; Chadwick et al., 2013). To supplement our behavioral findings with neural evidence, we conducted two exploratory imaging experiments to locate the cortical regions (fMRI) and temporal intervals (MEG) at which AOV bias occurs.

Experiment 3

Behavioral AOV bias

On average, subjects made up choices in 34.18% (SD = 7.20%), down choices in 32.65% (SD = 8.72%), and the-same choices in 33.17% (SD = 11.87%) of the trials. No significant difference among the three choices was found in Kendall's W test (W = 0.033; p = 0.453). Meanwhile, a high Spearman correlation was found between the AOV bias scores of the fMRI task and those of the behavioral task in Experiment 1 (Spearman's ρ = 0.6031; p < 0.0001), indicating that the scenes selected for the fMRI task replicated the AOV bias effect observed in the behavioral task.

Whole-brain univariate analysis

To investigate the source of the representational distortion of the first scene image in each trial, we performed a whole-brain GLM analysis in MNI space to model the activation evoked by the first scenes. ANOVAs across the three bias conditions (i.e., up, down, and the-same) revealed significant effects in several areas of the temporal, parietal, and frontal cortices. Further tests revealed no significant difference in activity between the up-bias and low-bias conditions. Thus, we combined the up-bias and low-bias conditions into one regressor (i.e., biased trials) for subsequent fMRI analyses. Trials with the-same choice were considered control trials. Responses to the first scene images were significantly higher in the biased trials than in the control trials in the bilateral lingual gyrus (Fig. 6A; see Table 3 for the peak coordinates in MNI space and the statistical information). According to the Glasser HCP 2016 surface-based parcellation in MNI space (Glasser et al., 2016), the clusters in the bilateral lingual gyrus both overlap with V3 and V4 in the Glasser atlas. We refer to this region as V3/V4.

Figure 6.

fMRI results of Experiment 3. A, Results of the univariate analysis in MNI space (biased vs control). Maps are shown at a threshold of p < 0.001 (t > 3.768) with cluster-wise correction for multiple comparisons (minimal cluster size, 31). B, Example fROIs from one subject, defined by the localizer in native space. The relative locations of PPA (purple), OPA (orange), and RSC (cyan) are presented in two slices. C, MVPA decoding results. Error bars indicate the SEM across subjects. Asterisks denote significant results in one-tailed t tests of biased versus control conditions with FDR correction. *q < 0.05; **q < 0.01.

Table 3.

Brain regions exhibiting positive activation when AOV bias occurred (biased vs control; p < 0.001; minimal cluster size of 31 for multiple-comparison correction) in the group-level univariate analysis of Experiment 3

Multivariate pattern analysis

The locations of the predefined fROIs varied across subjects, which could reduce the statistical power of the whole-brain analysis. Therefore, we conducted a decoding analysis in native space to examine fMRI signals associated with the direction of memory distortion in the fROIs. Scene-selective areas (PPA, OPA, and RSC) were defined with functional localizers in native space (Fig. 6B). Signals evoked by the first scene image were extracted from each fROI; this extraction ensured that perception of the second image and the decision process could not affect the decoding results. A three-class linear SVM with a balanced leave-one-run-out (LORO) cross-validation procedure was applied to the three fROIs to decode the state of memory distortion (Fig. 6C). Classification accuracy was significantly above chance level in PPA (classification accuracy, 35.38%; t(23) = 3.532; q = 0.0027; one-tailed; FDR corrected) and OPA (classification accuracy, 35.60%; t(23) = 2.589; q = 0.0123; one-tailed; FDR corrected) but not in RSC (classification accuracy, 33.95%; t(14) = 0.576; q = 0.2870; one-tailed; FDR corrected).
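
The decoding scheme can be sketched as below. The arrays, file names, and the omission of explicit trial balancing are simplifications for illustration rather than the exact procedure used in the experiment.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Hypothetical inputs: voxel patterns evoked by the first scene in one fROI,
# three-class labels (0 = up, 1 = down, 2 = the-same), and run indices.
X = np.load("ppa_patterns.npy")       # shape (n_trials, n_voxels)
y = np.load("bias_labels.npy")        # shape (n_trials,)
runs = np.load("run_labels.npy")      # run index per trial, used for LORO splits

# Linear SVM (one-vs-rest for three classes) with leave-one-run-out CV.
# Trial balancing across classes is omitted here for brevity.
clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10000))
scores = cross_val_score(clf, X, y, groups=runs, cv=LeaveOneGroupOut())
print(f"mean classification accuracy: {scores.mean():.3f} (chance = 1/3)")
```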

Taken together, fMRI analyses revealed the involvement of V3/V4, PPA, and OPA in scene memory distortion. These results implied that the initial feedforward processing of the scenes might explain the observed behavioral AOV biases. However, we could not draw a firm conclusion due to the limited temporal resolution of the fMRI signal. Therefore, we conducted an MEG experiment with high temporal resolution to address this issue.

Experiment 4

Behavioral AOV bias

On average, subjects made 20.95% (SD = 10.01%) up choices, 25.08% (SD = 14.20%) down choices, and 53.80% (SD = 20.48%) the-same choices. A paired t test showed no significant difference between the up and down choices (t = 1.3674; p = 0.187). Meanwhile, a high Spearman correlation was found between the AOV bias scores of the MEG task and the behavioral task (Spearman's ρ = 0.5122; p < 0.0001), indicating that removing the dynamic mask in the MEG task did not affect the AOV bias effect.

Multivariate pattern analysis

The MEG experiment explored the time course of scene memory distortion and the representation of scene features. Specifically, we decoded the AOV judgments and the asymmetry of visual and object complexity. Based on the results of Experiment 3, we focused on the time window around the onset of the first scene image and used only occipital sensors for the decoding analysis.

For scene memory distortion, MVPA classification performance in the occipital sensors was significantly above chance level (TFCE one-tailed p < 0.05). This effect corresponded to a cluster between 140 and 180 ms (peak, 170 ms; TFCE one-tailed p = 0.0149) after the onset of the first scene image. For the asymmetry of both visual and object complexity, the SVR predictions of the asymmetry values were also significant (TFCE one-tailed p < 0.05). The significant decoding performance was driven by a cluster between 110 and 470 ms (peak, 180 ms; TFCE one-tailed p = 0.0004) for the asymmetry of visual complexity and clusters between 200 and 440 ms (peak, 380 ms; TFCE one-tailed p = 0.0012) for the asymmetry of object complexity (Fig. 7).
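
A minimal sketch of such time-resolved decoding, assuming an MNE-Python workflow, is given below. The epoch file, label arrays, occipital-sensor pattern, and the r2 scoring of the SVR are illustrative assumptions (the reported SVR performance with a chance level of 0 may instead reflect a prediction-target correlation metric).

```python
import numpy as np
import mne
from mne.decoding import SlidingEstimator, cross_val_multiscore
from sklearn.svm import SVC, SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical epochs locked to the first scene image; occipital sensor
# selection via a placeholder channel-name pattern.
epochs = mne.read_epochs("sub-01_first-scene-epo.fif").pick("mag")
occ = mne.pick_channels_regexp(epochs.ch_names, "MEG2[0-9]{3}")
X = epochs.get_data()[:, occ, :]                 # (n_trials, n_sensors, n_times)
y_choice = np.load("choice_labels.npy")          # up vs down judgments
y_asym = np.load("visual_complexity_asym.npy")   # continuous asymmetry values

# Classification of the AOV judgment at every time point.
clf = SlidingEstimator(make_pipeline(StandardScaler(), SVC(kernel="linear")),
                       scoring="accuracy")
acc = cross_val_multiscore(clf, X, y_choice, cv=5).mean(axis=0)

# SVR prediction of visual complexity asymmetry at every time point.
reg = SlidingEstimator(make_pipeline(StandardScaler(), SVR(kernel="linear")),
                       scoring="r2")
r2 = cross_val_multiscore(reg, X, y_asym, cv=5).mean(axis=0)
```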

Figure 7.

Decoding results of Experiment 4. Thick colored lines indicate the average decoding performance across subjects. The colored shaded areas denote the SEM of decoding performance. The red line represents the decoding performance of behavioral choices from the two-class SVM classifier and corresponds to the left y-axis with the chance level of 0.5. The green and blue lines represent the decoding performance of asymmetry of visual and object complexity from the SVR models and correspond to the right y-axis with the chance level of 0. Decoding performance above chance was calculated using one-sample t tests, controlling for multiple comparisons with TFCE one-tailed p < 0.05. The dots indicate significant clusters with corresponding colors.

Searchlight analysis

The same decoding procedures were applied across the scalp in a searchlight analysis. Searchlight results corresponding to the time points of peak MVPA accuracies or SVR correlations are shown in Figure 8. Significant decoding performance was most pronounced in clusters between 140 and 190 ms in the left occipital sensors for scene memory distortion, in clusters between 100 and 600 ms in widespread posterior sensors for the asymmetry of visual complexity, and in clusters between 300 and 450 ms in occipital sensors for the asymmetry of object complexity (Fig. 8).
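
One way to implement a sensor-space searchlight of this kind is sketched below, using adjacency-defined sensor neighbourhoods. The file names, labels, time window, and neighbourhood definition are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np
import mne
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Hypothetical epochs and labels; the window matches the MVPA peak above.
epochs = mne.read_epochs("sub-01_first-scene-epo.fif").pick("mag")
adjacency, ch_names = mne.channels.find_ch_adjacency(epochs.info, ch_type="mag")
adjacency = adjacency.toarray().astype(bool)

X = epochs.copy().crop(tmin=0.14, tmax=0.20).get_data()  # (n_trials, n_sensors, n_times)
y = np.load("choice_labels.npy")

# For each sensor, decode from that sensor plus its immediate neighbours.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = np.zeros(len(ch_names))
for s in range(len(ch_names)):
    neigh = np.unique(np.r_[s, np.where(adjacency[s])[0]])
    feats = X[:, neigh, :].reshape(len(X), -1)
    scores[s] = cross_val_score(clf, feats, y, cv=5).mean()
```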

Figure 8.

Searchlight results of Experiment 4. A, MVPA performance on classifying behavioral choices in the time window of 140–200 ms after the first scene onset. B, SVR performance on the asymmetry of visual complexity in the time window of 140–200 ms after the first scene onset. C, SVR performance on the asymmetry of object complexity in the time window of 360–420 ms after the first scene onset. The pentagram markers indicate significant clusters (TFCE one-tailed p < 0.05).

Movie 1.

Video of an example trial to illustrate the experimental procedure of Experiment 2.

Collectively, the MVPA and searchlight analyses consistently demonstrated early neural correlates of scene memory distortion and visual complexity in the time window of ∼100–200 ms and a later representation of object complexity from ∼300 ms after the onset of the first scene image.

Discussion

Our findings revealed that scene memory was distorted upward or downward, causing AOV biases in subsequent judgments. We therefore deduced that a normalization process distorted scene memory at the encoding stage, resulting in the subsequent judgment bias. Two lines of evidence supported this conclusion. First, the asymmetry of scene complexity and the subjective AOV ratings predicted the behavioral AOV bias scores. Second, the fMRI and MEG results revealed that the AOV biases were associated with cortical processing during the early encoding stage of scenes and that V3/V4, PPA, and OPA were involved.

Mechanisms of AOV bias

The normalization theory posits that scene representations should be normalized toward a prior distribution to minimize errors associated with noisy perceptual inputs (Bartlett, 1932; Hemmer and Steyvers, 2009). In previous research, subjects' fixations were centered latitudinally on the horizon when they freely explored a virtual reality environment, demonstrating the typical phenomenon of equator bias (Rai et al., 2017; Sitzmann et al., 2018). On the other hand, the subjective AOV ratings were independently obtained from another large group of subjects and were normally distributed around the head-up AOV. Thus, the head-up AOV could serve as the prototypical viewpoint along the vertical axis. Notably, scenes with a subjective AOV rating lower than the head-up AOV were more likely associated with an up AOV choice, and vice versa. Therefore, we deduced that the biased choices could be a direct consequence of the normalization process toward the prototypical head-up AOV.

Furthermore, the positive coefficients of the complexity asymmetry indicated that scenes' intrinsic complexity information was involved in AOV bias. Recent research established that boundary transformation can be influenced by the number of objects and the amount of semantic information (J. Park et al., 2021; Greene and Trivedi, 2023). These findings suggested that the goal of boundary transformation is to normalize the amount of semantic information held in memory. Akin to this mechanism of boundary transformation, the relationship between complexity and AOV bias may also reflect an underlying cognitive process that balances the amount of visual and object complexity information in the top and bottom halves of each scene.

Notably, AOV bias concerns memory distortion for scene images and may not reflect memory patterns during naturalistic visual exploration. Here, scene images were briefly presented within the foveal and parafoveal visual fields. Furthermore, the pitch of the subjects' head position was well controlled in all experiments. These design choices limited potential confounds from the subjects' eye movements and head position, further ensuring that the AOV bias resulted from generic scene-processing activities in the brain. However, future investigations with greater ecological validity are required to examine these effects under natural visual exploration.

We speculated that AOV bias and boundary transformation are different aspects of the same memory distortion process, namely normalization toward the prototypical viewpoint of a scene. According to the current study and previous investigations, this normalization process is reflected in two aspects: shifting the remembered viewpoint toward a high-probability viewpoint and balancing the amount of information in the scene. Moreover, the regression analysis of the free adjustment task showed that boundary transformation was not a significant predictor of AOV bias, indicating that AOV bias cannot be attributed to boundary transformation.

AOV bias in the feedforward sweep of scene processing

The decoding performance of AOV bias and visual complexity asymmetry emerged within the same time interval (peaking ∼170 ms after the onset of the first image) in the occipital sensors of the MEG experiment. Furthermore, previous electroencephalography research revealed that the influence of low-level visual features on scene-related signals begins early (100–200 ms) at occipital electrodes (Groen et al., 2013, 2016). The overlapping decoding of AOV bias and visual complexity, together with the associated early time window, indicates that memory distortion first occurred during the feedforward sweep of the scene's representation.

Consistent with the MEG results, the fMRI experiment showed that regions in the ventral visual pathway, including V3/V4, PPA, and OPA, were associated with bias formation. Numerous electrophysiological studies in macaques and imaging investigations in humans have shown that V4 is critically involved in the intermediate-level computations that link simple edges to complex objects in the ventral visual pathway (Kourtzi and Connor, 2011; Roe et al., 2012; Pasupathy et al., 2020). Another relevant study demonstrated that V4 and adjacent regions exhibited more robust activation in response to scene images with higher visual complexity (Groen et al., 2018). Among the scene-selective areas, we found high decoding performance in PPA and OPA, but not in RSC. Previous research has reported that PPA and OPA are better at representing and navigating local scenes, whereas RSC integrates local scenes into a wider space (Julian et al., 2018; Peer and Epstein, 2021). Given the distinct functions of these areas, PPA and OPA are more likely to encode the viewpoint during scene perception. Previous studies reported that the early visual cortex, PPA, and OPA are strongly linked to scene perception (Baldassano et al., 2016), implying that the intermediate-level area (V4) may compute complexity-related information and, through functional connections with PPA and OPA, contribute to the distorted representation of scenes.

In the present study, we did not find definitive evidence for the involvement of feedback processing in the formation of AOV biases. Previous research indicates that feedback processing to the occipital cortex (∼200–300 ms) is involved in object recognition in natural scenes (Groen et al., 2018; Wischnewski and Peelen, 2021). In the MEG experiment, the decoding performance of object complexity was significantly above chance between 200 and 440 ms. Thus, we assumed that the feedback stage of scene processing in our experiment occurred from 200 ms after the first scene onset. However, the decoding performance of AOV bias was not significant after 200 ms and did not overlap with that of object complexity. Furthermore, we found no significant effects in the frontal cortex and hippocampus, which are thought to be involved in normalizing perceptual representations with prior knowledge (Bar, 2004; Bar et al., 2006; van Kesteren et al., 2012). The low signal-to-noise ratio in these areas could explain the negative results. Additionally, the prototypical representation required by the normalization process may exist in a more abstract form, further reducing the likelihood of its detection. Therefore, we deduced that there may be feedback modulation, comparable to the effect of schema instantiation (Gilboa and Marlatte, 2017), that was not detected here. Additional research is required to delve deeper into this issue.

Normalization for the 3D layout of scenes

In Experiment 2, we found that the asymmetry of layout orientation along both the horizontal and vertical axes was the strongest predictor of memory distortion along the same axis. That is, indoor scene memory was distorted toward a viewpoint with a balanced layout orientation. For example, when a person faces the center of the back wall, the ceiling, floor, left wall, and right wall occupy roughly equal areas of the view. The benefits of a balanced layout include improved access to boundary information for navigation and easier perception of objects at the center of the room. Recent research has demonstrated the representation of the 3D layout of scenes in scene-selective areas (Lescroart and Gallant, 2019) and the rapid computation of layout information within ∼100 ms after scene onset (Henriksson et al., 2019). In this regard, the representation of the 3D layout in the feedforward sweep could also influence the normalization process, a possibility that requires further investigation.

Footnotes

  • This study was supported by grants from STI2030-Major Projects (Grant Number: 2021ZD0200204) and the National Natural Science Foundation of China (Grant Number: 32271104).

  • The authors declare no competing financial interests.

  • Correspondence should be addressed to Sheng Li at sli{at}pku.edu.cn.

SfN exclusive license.

References

  1. Bainbridge WA, Baker CI (2020) Boundaries extend and contract in scene memory depending on image properties. Curr Biol 30:537–543.e3. https://doi.org/10.1016/j.cub.2019.12.004
  2. Baldassano C, Esteva A, Fei-Fei L, Beck DM (2016) Two distinct scene-processing networks connecting vision and memory. eNeuro 3:ENEURO.0178-16.2016. https://doi.org/10.1523/ENEURO.0178-16.2016
  3. Bar M (2004) Visual objects in context. Nat Rev Neurosci 5:617–629. https://doi.org/10.1038/nrn1476
  4. Bar M, et al. (2006) Top-down facilitation of visual recognition. Proc Natl Acad Sci U S A 103:449–454. https://doi.org/10.1073/pnas.0507062103
  5. Bartlett F (1932) Remembering: a study in experimental and social psychology. Cambridge and Melbourne: Cambridge University Press.
  6. Chadwick MJ, Mullally SL, Maguire EA (2013) The hippocampus extrapolates beyond the view in scenes: an fMRI study of boundary extension. Cortex 49:2067–2079. https://doi.org/10.1016/j.cortex.2012.11.010
  7. Ciocca G, Corchs S, Gasparini F (2015) Complexity perception of texture images. In: New trends in image analysis and processing: ICIAP 2015 International Workshops, Genoa, Italy, September 7–8, 2015, Proceedings, pp 119–126. Cham: Springer International Publishing.
  8. Corchs SE, Ciocca G, Bricolo E, Gasparini F (2016) Predicting complexity perception of real world images. PLoS One 11:e0157986. https://doi.org/10.1371/journal.pone.0157986
  9. Cox RW (1996) AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput Biomed Res 29:162–173. https://doi.org/10.1006/cbmr.1996.0014
  10. de Lange FP, Heilbron M, Kok P (2018) How do expectations shape perception? Trends Cogn Sci 22:764–779. https://doi.org/10.1016/j.tics.2018.06.002
  11. Dilks DD, Julian JB, Paunov AM, Kanwisher N (2013) The occipital place area is causally and selectively involved in scene perception. J Neurosci 33:1331–1336. https://doi.org/10.1523/JNEUROSCI.4081-12.2013
  12. Epstein RA, Higgins JS, Parker W, Aguirre GK, Cooperman S (2006) Cortical correlates of face and scene inversion: a comparison. Neuropsychologia 44:1145–1158. https://doi.org/10.1016/j.neuropsychologia.2005.10.009
  13. Epstein R, Kanwisher N (1998) A cortical representation of the local visual environment. Nature 392:598–601. https://doi.org/10.1038/33402
  14. Gandolfo M, Nägele H, Peelen MV (2023) Predictive processing of scene layout depends on naturalistic depth of field. Psychol Sci 34:394–405. https://doi.org/10.1177/09567976221140341
  15. Gilboa A, Marlatte H (2017) Neurobiology of schemas and schema-mediated memory. Trends Cogn Sci 21:618–631. https://doi.org/10.1016/j.tics.2017.04.013
  16. Glasser MF, et al. (2016) A multi-modal parcellation of human cerebral cortex. Nature 536:171–178. https://doi.org/10.1038/nature18933
  17. Greene MR (2013) Statistics of high-level scene context. Front Psychol 4:777. https://doi.org/10.3389/fpsyg.2013.00777
  18. Greene MR, Trivedi D (2023) Spatial scene memories are biased towards a fixed amount of semantic information. Open Mind 7:445–459. https://doi.org/10.1162/opmi_a_00088
  19. Groen IIA, Ghebreab S, Lamme VAF, Scholte HS (2016) The time course of natural scene perception with reduced attention. J Neurophysiol 115:931–946. https://doi.org/10.1152/jn.00896.2015
  20. Groen IIA, Ghebreab S, Prins H, Lamme VAF, Scholte HS (2013) From image statistics to scene gist: evoked neural activity reveals transition from low-level natural image structure to scene category. J Neurosci 33:18814–18824. https://doi.org/10.1523/JNEUROSCI.3128-13.2013
  21. Groen IIA, Jahfari S, Seijdel N, Ghebreab S, Lamme VAF, Scholte HS (2018) Scene complexity modulates degree of feedback activity during object detection in natural scenes. PLoS Comput Biol 14:e1006690. https://doi.org/10.1371/journal.pcbi.1006690
  22. Hafri A, Wadhwa S, Bonner MF (2022) Perceived distance alters memory for scene boundaries. Psychol Sci 33:2040–2058. https://doi.org/10.1177/09567976221093575
  23. Hemmer P, Steyvers M (2009) A Bayesian account of reconstructive memory. Top Cogn Sci 1:189–202. https://doi.org/10.1111/j.1756-8765.2008.01010.x
  24. Henriksson L, Mur M, Kriegeskorte N (2019) Rapid invariant encoding of scene layout in human OPA. Neuron 103:161–171.e3. https://doi.org/10.1016/j.neuron.2019.04.014
  25. Intraub H, Bender RS, Mangels JA (1992) Looking at pictures but remembering scenes. J Exp Psychol Learn Mem Cogn 18:180–191. https://doi.org/10.1037/0278-7393.18.1.180
  26. Intraub H, Dickinson CA (2008) False memory 1/20th of a second later: what the early onset of boundary extension reveals about perception. Psychol Sci 19:1007–1014. https://doi.org/10.1111/j.1467-9280.2008.02192.x
  27. Intraub H, Richardson M (1989) Wide-angle memories of close-up scenes. J Exp Psychol Learn Mem Cogn 15:179. https://doi.org/10.1037/0278-7393.15.2.179
  28. Julian JB, Keinath AT, Marchette SA, Epstein RA (2018) The neurocognitive basis of spatial reorientation. Curr Biol 28:R1059–R1073. https://doi.org/10.1016/j.cub.2018.04.057
  29. Kaiser D, Inciuraite G, Cichy RM (2020) Rapid contextualization of fragmented scene information in the human visual system. NeuroImage 219:117045. https://doi.org/10.1016/j.neuroimage.2020.117045
  30. Konkle T, Oliva A (2007) Normative representation of objects: evidence for an ecological bias in object perception and memory. In: Proceedings of the Annual Meeting of the Cognitive Science Society, 29.
  31. Kourtzi Z, Connor CE (2011) Neural representations for object perception: structure, category, and adaptive coding. Annu Rev Neurosci 34:45–67. https://doi.org/10.1146/annurev-neuro-060909-153218
  32. Kyle-Davidson C, Zhou EY, Walther DB, Bors AG, Evans KK (2023) Characterising and dissecting human perception of scene complexity. Cognition 231:105319. https://doi.org/10.1016/j.cognition.2022.105319
  33. Lescroart MD, Gallant JL (2019) Human scene-selective areas represent 3D configurations of surfaces. Neuron 101:178–192.e7. https://doi.org/10.1016/j.neuron.2018.11.004
  34. Lin F, Hafri A, Bonner MF (2022) Scene memories are biased toward high-probability views. J Exp Psychol Hum Percept Perform 48:1116–1129. https://doi.org/10.1037/xhp0001045
  35. Nagle F, Lavie N (2020) Predicting human complexity perception of real-world scenes. R Soc Open Sci 7:191487. https://doi.org/10.1098/rsos.191487
  36. O'Craven KM, Kanwisher N (2000) Mental imagery of faces and places activates corresponding stimulus-specific brain regions. J Cogn Neurosci 12:1013–1023. https://doi.org/10.1162/08989290051137549
  37. Oliva A, Mack ML, Shrestha M, Peeper A (2004) Identifying the perceptual dimensions of visual complexity of scenes. In: Proceedings of the Annual Meeting of the Cognitive Science Society, 26.
  38. Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vis 42:145–175.
  39. Park S, Intraub H, Yi D-J, Widders D, Chun MM (2007) Beyond the edges of a view: boundary extension in human scene-selective visual cortex. Neuron 54:335–342. https://doi.org/10.1016/j.neuron.2007.04.006
  40. Park J, Josephs EL, Konkle T (2021) Systematic transition from boundary extension to contraction along an object-to-scene continuum. PsyArXiv.
  41. Pasupathy A, Popovkina DV, Kim T (2020) Visual functions of primate area V4. Annu Rev Vis Sci 6:363–385. https://doi.org/10.1146/annurev-vision-030320-041306
  42. Peer M, Epstein RA (2021) The human brain uses spatial schemas to represent segmented environments. Curr Biol 31:4677–4688.e8. https://doi.org/10.1016/j.cub.2021.08.012
  43. Press C, Kok P, Yon D (2020) The perceptual prediction paradox. Trends Cogn Sci 24:13–24. https://doi.org/10.1016/j.tics.2019.11.003
  44. Rai Y, Gutiérrez J, Le Callet P (2017) A dataset of head and eye movements for 360 degree images. In: Proceedings of the 8th ACM on Multimedia Systems Conference, pp 205–210.
  45. Rock I (1974) The perception of disoriented figures. Sci Am 230:78–86. https://doi.org/10.1038/scientificamerican0174-78
  46. Roe AW, Chelazzi L, Connor CE, Conway BR, Fujita I, Gallant JL, Lu H, Vanduffel W (2012) Toward a unified theory of visual area V4. Neuron 74:12–29. https://doi.org/10.1016/j.neuron.2012.03.011
  47. Rosenholtz R, Li Y, Nakano L (2007) Measuring visual clutter. J Vis 7:17. https://doi.org/10.1167/7.2.17
  48. Saraee E, Jalal M, Betke M (2020) Visual complexity analysis using deep intermediate-layer features. Comput Vis Image Underst 195:102949. https://doi.org/10.1016/j.cviu.2020.102949
  49. Schubert E, Rosenblatt D, Eliby D, Kashima Y, Hogendoorn H, Bode S (2021) Decoding explicit and implicit representations of health and taste attributes of foods in the human brain. Neuropsychologia 162:108045. https://doi.org/10.1016/j.neuropsychologia.2021.108045
  50. Shore DI, Klein RM (2000) The effects of scene inversion on change blindness. J Gen Psychol 127:27–43. https://doi.org/10.1080/00221300009598569
  51. Sitzmann V, Serrano A, Pavel A, Agrawala M, Gutierrez D, Masia B, Wetzstein G (2018) Saliency in VR: how do people explore virtual environments? IEEE Trans Vis Comput Graph 24:1633–1642. https://doi.org/10.1109/TVCG.2018.2793599
  52. Smith SM, Nichols TE (2009) Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage 44:83–98. https://doi.org/10.1016/j.neuroimage.2008.03.061
  53. van Kesteren MTR, Ruiter DJ, Fernández G, Henson RN (2012) How schema and novelty augment memory formation. Trends Neurosci 35:211–219. https://doi.org/10.1016/j.tins.2012.02.001
  54. Wischnewski M, Peelen MV (2021) Causal neural mechanisms of context-based object recognition. Elife 10:e69736. https://doi.org/10.7554/eLife.69736
  55. Xiao J, Hays J, Ehinger KA, Oliva A, Torralba A (2010) SUN database: large-scale scene recognition from abbey to zoo. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp 3485–3492.
Keywords

  • angle of view
  • boundary transformation
  • fMRI
  • MEG
  • PPA
  • scene perception
