## Abstract

Systematic errors in human path integration were previously associated with processing deficits in the integration of space and time. In the present work, we hypothesized that these errors are de facto the result of a system that aims to optimize its performance by incorporating knowledge about prior experience into the current estimate of displacement. We tested human linear and angular displacement estimation behavior in a production–reproduction task under three different prior experience conditions where samples were drawn from different overlapping sample distributions. We found that (1) behavior was biased toward the center of the underlying sample distribution, (2) the amount of bias increased with increasing sample range, and (3) the standard deviation for all conditions was linearly dependent on the mean reproduced displacements. We propose a model of Bayesian estimation on logarithmic scales that explains the observed behavior by optimal fusion of an experience-dependent prior expectation with the current noisy displacement measurement. The iterative update of prior experience is modeled by the formulation of a discrete Kalman filter. The model provides a direct link between Weber–Fechner and Stevens' power law, providing a mechanistic explanation for universal psychophysical effects in human magnitude estimation such as the regression to the mean and the range effect.

## Introduction

Path integration, that is, the ability to keep track of changes in orientation and position using self-motion cues, constitutes an essential component of spatial navigation (Mittelstaedt and Mittelstaedt, 1980; Etienne and Jeffery, 2004). Yet human path integration performance exhibits systematic errors. Characteristic overestimation and underestimation of traveled distances and turning angles and thus a tendency to bias toward certain displacements have been reported for path integration tasks in real and virtual environments (Loomis et al., 1993; Jürgens et al., 1999; Riecke et al., 2002; Seemungal et al., 2007; Glasauer et al., 2009b). Furthermore, systematic errors differ between studies: while participants correctly reproduced a 10 m distance in one study (Klatzky et al., 1990), they underestimated the same distance by 2 m in another one (Schwartz, 1999). The main difference between the two studies was the range of distances tested. In the context of magnitude estimation, these systematic errors can be interpreted as regression and range effects (Stevens and Greenbaum, 1966; Teghtsoonian and Teghtsoonian, 1978).

One account of a bias in path integration posits processing deficits that accumulate during integration over space or time (Mittelstaedt and Glasauer, 1991; Fujita et al., 1993; Glasauer et al., 2007; Lappe et al., 2007; Mossio et al., 2008; Bergmann et al., 2011). However, research in related domains has shown that a bias is not necessarily a result of deficient processing, but can also represent the optimal solution of a system that incorporates prior knowledge about the world to maximize its use of information provided by sensory cues (Knill and Pouget, 2004; Burge et al., 2008; Fetsch et al., 2009). A probabilistic interpretation of this statement is the model of an optimal Bayesian estimator that combines a current noisy measurement with an a priori estimate that depends on previous experience (Körding et al., 2004; Miyazaki et al., 2005; Jazayeri and Shadlen, 2010) or reflects a general intrinsic tendency (Jürgens and Becker, 2006; Stocker and Simoncelli, 2006).

We hypothesize that such an estimation process could provide a potential explanation for systematic biases in human path integration. In particular, we speculate that an experience-dependent prior could cause the posterior estimate to adapt to the range of stimuli presented and show a regression toward the expectancy value of the underlying distribution. Thus, we tested human linear and angular displacement estimation separately in three different prior-experience conditions. In a virtual environment, participants were asked to produce and subsequently reproduce distances and turning angles that were drawn from three partially overlapping sample ranges. If participants incorporated knowledge about prior experience into their current estimate of displacement, their behavior should depend significantly on the underlying sample distribution. In a second step, we developed and tested two variants of a Bayesian estimator model where the reproduced displacement was determined by fusion of the current measurement and an a priori expectation. The prior was either modeled as a fixed value that approximated the statistical properties of the underlying sample distribution or as an iterative estimate updated in each trial, which represented the immediate prior experience.

## Materials and Methods

#### Participants

Fourteen volunteers (seven female), aged 22–34 years, were monetarily compensated for their participation in the study. All had normal or corrected-to-normal vision and were naive to the purpose of the experiments. The experiments were approved by the local ethics committee in accordance with the Declaration of Helsinki.

#### Experimental setup

Stimuli were presented binocularly on a computer monitor (resolution, 1280 × 800; frame rate, 59 Hz) driven by an ATI Mobility Radeon HD 3400 graphics card. Experiments were conducted in complete darkness except for the illumination by the monitor. The real-time virtual reality (VR) was created using Vizard 3.0 (Worldviz) and depicted an artificial stone desert consisting of a textured ground plane, 200 scattered stones, and a textured sky (Fig. 1*a*). The orientation of the ground plane texture, the position of the stones, and the starting position of the participant within the VR were randomized in each trial to prevent participants from using any of these as potential cues. The sky was simulated as a 3D dome centered on the participant's current position so that the distance to the horizon was kept constant. The eye height in the VR was adjusted individually to the true eye height of each participant (Daum and Hecht, 2009). Participants used a multidirectional movable joystick (SPEEDLINK) to navigate.

#### Experimental procedure

The estimation of traveled distances and the estimation of turning angles were tested separately under three different conditions in a production–reproduction task (Fig. 1*a*).

##### Distance estimation experiment.

Each trial started with an instruction for participants to move forward along a linear path while keeping track of their self-displacement. Direction of movement during production was indicated by a visual cue at the horizon. When participants reached the sample distance *d*_{p}, movement was automatically stopped and disabled for a few seconds. Subsequently, participants were instructed to reproduce the perceived distance and indicate their final position via button press. In all trials, velocity was kept constant during movement, but changed randomly up to ±60% (scaling factor drawn from a normal distribution) between production and reproduction phases to exclude time estimation strategies to solve the task. To test the effect of prior experience only, the settings for the three conditions were the same except that the sample distances and respective turning angles were drawn from three different underlying uniform sample distributions, specified as small displacements (*d*_{p} = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] m), intermediate displacements (*d*_{p} = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14] m), and large displacements (*d*_{p} = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19] m). The sample distributions of the three conditions were chosen to be partially overlapping to test whether displacement estimation behavior differed significantly for the same sample stimulus depending on the previously experienced displacements (Fig. 1*b*). Participants had no knowledge about the amount of displacement they had to reach during the production phase and were naive to the condition in which they were tested.

##### Turning angle estimation experiment.

The stimulus and settings in the angle estimation (AE) experiment were identical to the distance estimation (DE) experiment, with the following exception: participants turned on the spot to a previously indicated direction. Turning direction was kept constant between production and reproduction to preclude the use of external cues to solve the task. The sample turning angles, α_{p}, for the three prior experience conditions were in analogy drawn from three different sample distributions specified as small displacements (α_{p} = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]°), intermediate displacements (α_{p} = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140]°), and large displacements (α_{p} = [90, 100, 110, 120, 130, 140, 150, 160, 170, 180]°).

All participants performed both types of experiment. The three conditions for DE and AE were tested in separate sessions, resulting in six test sessions per participant. Each session lasted between 45 and 60 min and was composed of 200 trials. The first 20 training trials per experimental condition served to familiarize participants with the VR. Feedback on the performance was given after the reproduction by displaying an object in the VR at the correct distance or turning angle and asking subjects to navigate toward this location. In the following 180 test trials, no feedback was given. Only test trials were used for data analysis. Two sessions of the same experiment type, AE or DE, were separated by at least 1 h and up to a few days. Within sessions, participants had a short break of 100 s after 100 and 150 trials. Each sample displacement was repeated 20 times per condition in randomized order. The same trial order within one condition was maintained for all participants. The order in which the three conditions for DE and AE were tested was randomized for each participant.

#### Analysis of behavioral data

Position and orientation of participants within the VR were sampled at 20 Hz. The displacement between the end of the production phase and the time of the button press was calculated as the reproduced displacement, *d*_{r} or α_{r}. The estimation error was defined as the difference between the reproduced and produced displacement. Data analysis was conducted in MATLAB R2010b (MathWorks). Statistical differences were assessed using repeated-measures ANOVA (rm-ANOVA). One rm-ANOVA with main factors for the overall condition (small displacements, intermediate displacements, large displacements) and displacement ([1:10], [5:14], [10:19] m for DE sessions and [10:100], [50:140], [90:180]° for AE sessions) was performed on the signed estimation error to reveal differences in the error magnitude and shape of the curve.

To assess range effects, we also looked for differences in reproduced displacements of samples that were presented in more than one condition (i.e., overlapping samples) using a second rm-ANOVA, referred to as the duplicated samples comparison, with main factor condition (Higher Range vs Lower Range). All overlapping samples were thereby compared in a single rm-ANOVA. For the factor Higher Range, we used [5:9] m from the intermediate displacements condition and [10:14] m from the large displacements condition, compared with the factor Lower Range including [5:9] m from the small displacements condition and [10:14] m from the intermediate displacements condition. Note that each measurement was only used once, either for the factor Higher Range or Lower Range. For angular displacements, the factor Higher Range included [50:90]° from the intermediate displacements condition and [100:140]° from the large displacements condition; the Lower Range factor included [50:90]° from the small displacements condition and [100:140]° from the intermediate displacements condition. Linear regression analyses were performed to quantify the relationship between mean and standard deviation. A probability level of *p* < 0.05 was considered significant for all statistical analyses.
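The assignment of overlapping distance samples to the two factor levels can be written down explicitly. The following Python sketch is only an illustration of that bookkeeping for the DE sessions; the names `CONDITIONS`, `HIGHER_RANGE`, `LOWER_RANGE`, and `split_duplicated_samples`, as well as the example trials, are hypothetical and not taken from the study's MATLAB analysis.

```python
# Conditions and their sample distances in metres, as listed in the text.
CONDITIONS = {
    "small": list(range(1, 11)),
    "intermediate": list(range(5, 15)),
    "large": list(range(10, 20)),
}

# (condition, sample) pairs assigned to each factor level, following the text.
HIGHER_RANGE = ({("intermediate", d) for d in range(5, 10)}
                | {("large", d) for d in range(10, 15)})
LOWER_RANGE = ({("small", d) for d in range(5, 10)}
               | {("intermediate", d) for d in range(10, 15)})


def split_duplicated_samples(trials):
    """Split (condition, sample, reproduced) tuples into the two factor levels.

    Trials whose (condition, sample) pair belongs to neither level are dropped.
    """
    higher = [t for t in trials if (t[0], t[1]) in HIGHER_RANGE]
    lower = [t for t in trials if (t[0], t[1]) in LOWER_RANGE]
    return higher, lower


# Hypothetical trials: (condition, sample distance, reproduced distance).
example = [("small", 7, 6.4), ("intermediate", 7, 7.9), ("large", 19, 16.0)]
higher, lower = split_duplicated_samples(example)
```

Because the two sets are disjoint, each measurement enters the comparison at most once, which mirrors the constraint stated above.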

#### Bayesian estimator model

The stimulus displacements for the production phase of the three conditions were entered into a Bayesian estimator model in the same order as in the experiment (Fig. 2). The model assumes Bayesian fusion of measurement and prior experience on logarithmic scales to achieve a final displacement estimate. The single computational steps are as follows.

##### Logarithmic internal representation of displacement.

Weber–Fechner's law proposes the representation of a stimulus size on a logarithmic scale (Fechner, 1860). Several recent psychophysical studies support the notion that human behavior approximately follows this law for numerical quantities (Dehaene et al., 2008), visual motion perception (Zanker, 1995; Jürgens and Becker, 2006; Stocker and Simoncelli, 2006), and locomotor path integration (Durgin et al., 2009). Accordingly, we introduced a modified logarithmic representation of perceived linear or angular displacements, similar to the one previously proposed for motion perception (Stocker and Simoncelli, 2006):

$$x_m = \ln\left(1 + \frac{d_m}{d_0}\right) + n_m \tag{1}$$

where *d*_{m} is the measured displacement on linear scales and *x*_{m} is the internal noisy logarithmic representation of the measured displacement. The random variable *n*_{m} represents the normally distributed measurement noise *p*(*n*_{m}) ≈ *N*(0, σ_{m}^{2}). The input stimuli are expressed in virtual meters or degrees. *d*_{0} ≪ 1 is a small normalization constant, which leads to a unitless internal representation of displacement. For the simulations, we chose an arbitrary fixed value of *d*_{0} = 0.01 m for distance estimation and *d*_{0} = 0.01° for angle estimation. The addition of 1 allows for representing a null displacement and may account for the deviation from the Weber–Fechner law at small magnitudes.

Since all represented displacements (*d*_{m}/*d*_{0}) in our experiment are large compared with 1, we reduced the general description of the transformation in Equation 1 to the simpler form:

$$x_m = \ln\left(\frac{d_m}{d_0}\right) + n_m \tag{2}$$

Note that *d* always indicates displacements on linear scale, whereas *x* refers to the mean of the internal distributions (Fig. 2*a*). Since the distribution of the measurement noise is known, the measured displacement can internally be represented by the likelihood distribution, a Gaussian distribution with *p*(*x*_{m}) ≈ *N*(*x*_{m}, σ_{m}^{2}).
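As a rough illustration of the two transformations, the following Python sketch assumes that the general form is ln(1 + *d*/*d*_{0}) plus Gaussian noise and that the simplified form drops the 1, which is how the definitions above read. The function names and the optional `rng` argument are illustrative choices and not part of the study's analysis code.

```python
import math
import random

D0 = 0.01  # normalization constant d_0 from the text (m or degrees)


def log_representation(d_m, sigma_m=0.0, rng=None):
    """General form (assumed): x_m = ln(1 + d_m/d_0) + n_m."""
    noise = rng.gauss(0.0, sigma_m) if rng is not None else 0.0
    return math.log1p(d_m / D0) + noise


def log_representation_simple(d_m, sigma_m=0.0, rng=None):
    """Simplified form (assumed): x_m = ln(d_m/d_0) + n_m, for d_m/d_0 >> 1."""
    noise = rng.gauss(0.0, sigma_m) if rng is not None else 0.0
    return math.log(d_m / D0) + noise


# Largest noise-free difference between both forms over the tested 1-19 m range.
max_gap = max(
    abs(log_representation(d) - log_representation_simple(d))
    for d in range(1, 20)
)

# A single noisy measurement of 5 m; seed and sigma_m are arbitrary choices.
noisy_x = log_representation_simple(5.0, sigma_m=0.1, rng=random.Random(1))
```

Over the tested distances the two forms differ by less than 0.01 on the logarithmic scale, which is the sense in which the simplification is harmless here, while only the general form maps a null displacement to zero.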

##### Bayesian fusion of measurement and prior.

The probability of having experienced a certain displacement is given by the posterior probability distribution, which depends on the likelihood of measurement or evidence and the prior probability. Assuming that the likelihood functions of prior and measurement are approximately Gaussian, the mean of the posterior distribution on logarithmic scales *x̂*_{r} is, according to Bayes' rule, given by a weighted sum of the mean of the prior distribution *x*_{prior} and the mean of the measurement likelihood *x*_{m}:

$$\hat{x}_r = w_{\mathrm{prior}}\,x_{\mathrm{prior}} + w_m\,x_m \tag{3}$$

with variance

$$\sigma_r^2 = \frac{\sigma_{\mathrm{prior}}^2\,\sigma_m^2}{\sigma_{\mathrm{prior}}^2 + \sigma_m^2} \tag{4}$$

The weights *w*_{prior} and *w*_{m} add up to unity and depend on the uncertainty of the measurement and prior, measured by the inverse variance of prior and measurement distributions:

$$w_{\mathrm{prior}} = \frac{1/\sigma_{\mathrm{prior}}^2}{1/\sigma_{\mathrm{prior}}^2 + 1/\sigma_m^2}, \qquad w_m = \frac{1/\sigma_m^2}{1/\sigma_{\mathrm{prior}}^2 + 1/\sigma_m^2} \tag{5}$$

In the proposed model, Bayesian fusion takes place on logarithmic scales (Eqs. 1, 2); thus, the reproduced distance on linear scales is determined from the back-transformation of the Gaussian distribution *p*(*x*_{r}) = *N*(*x̂*_{r}, σ_{r}^{2}), resulting in a lognormal distribution on linear scales.
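The fusion step is the standard precision-weighted average of two Gaussians. A minimal Python sketch, under the assumption that prior and measurement are independent Gaussians on the logarithmic scale, is given below; the function name `fuse` and the numerical values are hypothetical.

```python
import math


def fuse(x_prior, var_prior, x_m, var_m):
    """Fuse a Gaussian prior with a Gaussian measurement on the log scale.

    Returns the posterior mean, the posterior variance, and the prior weight.
    """
    precision_prior = 1.0 / var_prior
    precision_m = 1.0 / var_m
    w_prior = precision_prior / (precision_prior + precision_m)
    w_m = 1.0 - w_prior  # the weights add up to unity
    x_r = w_prior * x_prior + w_m * x_m
    var_r = 1.0 / (precision_prior + precision_m)
    return x_r, var_r, w_prior


# Hypothetical values: prior centered on 5.5 m, measurement of 9 m, d_0 = 0.01 m.
x_r, var_r, w_prior = fuse(math.log(5.5 / 0.01), 0.09, math.log(9.0 / 0.01), 0.06)
```

With these illustrative variances the prior receives a weight of 0.4, and the posterior variance (0.036) is smaller than either input variance, as expected for an optimal combination.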

Up to this point, the model specifies a posterior probability distribution of distances rather than the particular distance that should be reproduced. To determine the distance to be reproduced, and thus to execute a specific action, the peak, the mean, or any other specific value of the posterior distribution could be selected, depending on the cost associated with making different types of errors (Doya et al., 2007). Commonly proposed symmetric cost functions (Körding and Wolpert, 2004b) lead to reproduction of one of the location parameters (mean, mode, or median) of the estimated posterior distribution.

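However, in contrast to a normal distribution, these location parameters are no longer equal for the resulting lognormal distribution. The median *d̃*_{r}, mean *d̄*_{r}, and mode *d*_{r max} of the distribution are given by

$$\tilde{d}_r = d_0\,e^{\hat{x}_r}, \qquad \bar{d}_r = d_0\,e^{\hat{x}_r + \sigma_r^2/2}, \qquad d_{r\,\max} = d_0\,e^{\hat{x}_r - \sigma_r^2} \tag{6}$$

and thus differ by a shift, which depends on the stimulus distance on linear scales. The variance σ_{dr}^{2} is given by

$$\sigma_{dr}^2 = \left(e^{\sigma_r^2} - 1\right) d_0^2\,e^{2\hat{x}_r + \sigma_r^2} = \left(e^{\sigma_r^2} - 1\right)\bar{d}_r^{\,2} \tag{7}$$

To account for these differences in the reproduction estimate depending on the cost function, we introduced a shift term, Δ*x*, as an additional parameter in the model (see Model fit), so that the reproduced displacement is given as

$$d_r = d_0\,e^{\hat{x}_r + \Delta x} \tag{8}$$

Note that with Equations 3 and 8, and the noise-free form of Equation 2, the reproduced displacement becomes

$$d_r = d_m^{\,w_m}\left(d_0\,e^{x_{\mathrm{prior}}}\right)^{w_{\mathrm{prior}}} e^{\Delta x} \tag{9}$$

and thus follows Stevens' power law (Stevens, 1961).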
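The relations among the lognormal location parameters and the resulting power law can be checked numerically. The Python sketch below uses the standard lognormal formulas and assumes a noise-free measurement; `d_prior` (the prior mean back-transformed to linear scale) and all function names are introduced here for illustration only.

```python
import math

D0 = 0.01  # normalization constant d_0 from the text


def lognormal_locations(x_r, var_r, d0=D0):
    """Median, mean, and mode on linear scale for a posterior N(x_r, var_r)."""
    median = d0 * math.exp(x_r)
    mean = d0 * math.exp(x_r + var_r / 2.0)
    mode = d0 * math.exp(x_r - var_r)
    return median, mean, mode


def lognormal_sd(x_r, var_r, d0=D0):
    """Standard deviation on linear scale; proportional to the linear mean."""
    mean = d0 * math.exp(x_r + var_r / 2.0)
    return mean * math.sqrt(math.exp(var_r) - 1.0)


def reproduce(d_m, d_prior, w_prior, delta_x=0.0, d0=D0):
    """Noise-free reproduction: fuse on log scale, shift, back-transform."""
    x_m = math.log(d_m / d0)
    x_prior = math.log(d_prior / d0)
    x_r = w_prior * x_prior + (1.0 - w_prior) * x_m
    return d0 * math.exp(x_r + delta_x)


def reproduce_power_law(d_m, d_prior, w_prior, delta_x=0.0):
    """Equivalent closed form: a power function of the stimulus d_m."""
    return math.exp(delta_x) * d_prior ** w_prior * d_m ** (1.0 - w_prior)


median, mean, mode = lognormal_locations(math.log(7.0 / D0), 0.04)
```

Both reproduction functions return the same value for any input, and a stimulus below `d_prior` is overestimated while one above it is underestimated, which is the regression effect described in the text.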

Finally, to account for signal-independent variability of the reproduced displacement caused, for example, by reaction times in handling the response device, the random variable *n*_{c} representing normally distributed constant noise *p*(*n*_{c}) ≈ *N*(0, σ_{c}^{2}) was added to the reproduced displacement on linear scales *d*_{r}.

##### Prior update.

In the current study, the Bayesian estimator was tested with two methods to implement an experience-dependent distance prior, referred to as one-stage and two-stage model. In the one-stage model, the prior was implemented as a distribution with a fixed mean centered at the mean of each underlying sample distribution and thereby represented the global statistics of the input stimuli. In the two-stage model, the displacement prior was updated iteratively in an additional computation step dependent on the posterior estimate of the prior in the previous trial and the current measurement of displacement (Fig. 2*a*). The update in each measurement step is modeled by the discrete formulation of the Kalman filter for a 1D first-order system. The state to be estimated and the current measurement at update step *i*, corresponding to trial *i*, are modeled by

$$x_{\mathrm{prior},i} = x_{\mathrm{prior},i-1} + n_q, \qquad x_{m,i} = x_{\mathrm{prior},i} + n_r \tag{10}$$

The random variables *n*_{q} and *n*_{r} represent the process and measurement noise, respectively. They are assumed to be independent with approximately normal probability distributions *p*(*n*_{q}) ∝ *N*(0, *q*) and *p*(*n*_{r}) ∝ *N*(0, *r*). The system defined by Equation 10 thus states that (1) the prior has no intrinsic dynamics and is varying only due to random changes modeled by *n*_{q}, and (2) the current measurement is an instantiation of the current prior perturbed by the measurement noise *n*_{r}.

For this simple system, the difference equation system of the Kalman filter reduces to

$$k_i = \frac{p_{i-1} + q}{p_{i-1} + q + r}, \qquad \hat{x}_{\mathrm{prior},i} = (1 - k_i)\,\hat{x}_{\mathrm{prior},i-1} + k_i\,x_{m,i}, \qquad p_i = k_i\,r \tag{11}$$

with *k*_{i} being the Kalman gain, *x̂*_{prior,i−1} and *x̂*_{prior,i} being the a priori and a posteriori estimate of the distance prior at update step *i*, and *p*_{i} and *p*_{i−1} the corresponding variances. Note that it is evident from this equation that the Kalman gain *k*_{i} can be interpreted as the weight of the measurement, depending on measurement noise and the assumed random change of the distance prior. The new estimate of the distance prior is thus a weighted sum of the previous estimate and the current measurement.

In the context of the Bayesian estimator model, we refer to *p*_{i} as the estimated variance of the distance prior σ_{prior}^{2} and *r* as the measurement variance σ_{m}^{2}. The prior for the two-stage model was initialized by the first measurement and reset at the beginning of each new session to account for the lack of prior knowledge of the underlying distribution except for the training trials. Note that after a measurement has been taken, the entire model is deterministic and does not involve any random elements to determine the distance to be reproduced. A preliminary version of the model has been published in abstract form (Glasauer et al., 2009a).
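The iterative prior update can be sketched as a scalar Kalman filter for a random-walk state observed directly with noise. The Python example below follows the textbook form of that filter; the initial variance (set equal to *r* because the prior is initialized by a single measurement), the values of *q* and *r*, and the function names are assumptions made for illustration.

```python
import math


def kalman_prior_update(x_prior_prev, p_prev, x_m, q, r):
    """One update step of a scalar Kalman filter for a random-walk state."""
    k = (p_prev + q) / (p_prev + q + r)            # Kalman gain
    x_prior = (1.0 - k) * x_prior_prev + k * x_m   # weighted sum of old prior and measurement
    p = k * r                                      # variance of the updated estimate
    return x_prior, p, k


def run_prior(measurements, q, r):
    """Track the prior over one session, initialized by the first measurement."""
    x_prior = measurements[0]
    p = r  # assumption: the initial variance equals the measurement variance
    history = []
    for x_m in measurements[1:]:
        x_prior, p, k = kalman_prior_update(x_prior, p, x_m, q, r)
        history.append((x_prior, p, k))
    return history


# Hypothetical session: log-transformed distances (d_0 = 0.01 m), with q = r = 1.
xs = [math.log(d / 0.01) for d in [3, 8, 5, 10, 1, 6, 2, 9, 4, 7] * 2]
history = run_prior(xs, q=1.0, r=1.0)
final_gain = history[-1][2]
```

In this sketch the gain depends only on *q*, *r*, and the trial index, not on the measured values, and it settles within a few trials to a steady state set by the ratio *r*/*q*.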

##### Model fit.

The displacements used in the experiment were used in the same order as input *d*_{m} for both the one-stage and the two-stage models. The shift term Δ*x* was implemented in both models using Equation 8. In the one-stage model, the single estimate of the prior *x*_{prior} was modeled as the log-transformed mean value of each underlying sample distribution on linear scales. The weighting of prior *w*_{prior} and Δ*x* were determined by minimizing the sum of the squares of the residuals of the one-stage estimator model and the individual participants' mean responses for all three conditions simultaneously using the Matlab procedure *lsqnonlin*.
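To make the fitting step concrete, the following Python sketch fits the two parameters of the one-stage model to noise-free synthetic responses. It replaces *lsqnonlin* with a plain grid search over the same sum-of-squares objective, so it illustrates the objective rather than the optimizer used in the study; all function names, grid settings, and the synthetic data are hypothetical.

```python
import math

D0 = 0.01  # normalization constant d_0 from the text


def one_stage_prediction(d_m, d_mean, w_prior, delta_x, d0=D0):
    """Predicted reproduction for stimulus d_m with the prior fixed at the condition mean."""
    x_m = math.log(d_m / d0)
    x_prior = math.log(d_mean / d0)  # log-transformed mean of the sample distribution
    return d0 * math.exp(w_prior * x_prior + (1.0 - w_prior) * x_m + delta_x)


def sum_squared_residuals(w_prior, delta_x, data):
    """Objective on linear scale over all (stimulus, condition mean, response) triples."""
    return sum(
        (resp - one_stage_prediction(d, m, w_prior, delta_x)) ** 2
        for d, m, resp in data
    )


def grid_fit(data, w_steps=101, dx_steps=81, dx_min=-0.2, dx_max=0.2):
    """Exhaustive search over w_prior in [0, 1] and delta_x in [dx_min, dx_max]."""
    best = None
    step = (dx_max - dx_min) / (dx_steps - 1)
    for i in range(w_steps):
        w = i / (w_steps - 1)
        for j in range(dx_steps):
            dx = dx_min + j * step
            sse = sum_squared_residuals(w, dx, data)
            if best is None or sse < best[0]:
                best = (sse, w, dx)
    return best


# Synthetic mean responses for the three distance conditions, fitted simultaneously.
ranges = [range(1, 11), range(5, 15), range(10, 20)]
data = []
for rng in ranges:
    d_mean = sum(rng) / len(rng)
    for d in rng:
        data.append((d, d_mean, one_stage_prediction(d, d_mean, 0.4, 0.03)))

sse, w_fit, dx_fit = grid_fit(data)
```

Because the synthetic responses were generated with a prior weight of 0.4 and a shift of 0.03, the search recovers these values; with real data a gradient-based least-squares routine would be the more practical choice.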

In the two-stage model, the prior *x*_{prior} was modeled as a continuously varying value determined by the Kalman filter (Eq. 11), which was reset at the beginning of each condition. To quantify the time course of the Kalman gain *k*_{i}, which approaches a steady-state value, its time constant τ expressed in trials was determined by fitting an exponential function to *k*_{i}. The ratio between measurement and process noise in the Kalman filter *r*/*q* and the shift term Δ*x* were determined by minimizing the sum of the squares of the residuals of the two-stage estimator model and the individual participants' mean responses for all three conditions simultaneously using the Matlab procedure *lsqnonlin*. To be comparable to the one-stage model, the steady-state weighting of the prior *w*_{prior}, which is proportional to the ratio *r*/*q*, was determined from the result of the fitting procedure and is reported in the Results, below.

Thus, both the one-stage and the two-stage models are each fully determined by two free parameters. However, the one-stage model requires additional input about the prior in each condition, whereas the two-stage model does not. To assess the precision of the fitted parameters, we estimated 95% confidence intervals (CI_{95%}) of all parameters, which were determined from the Jacobian of the parameter surface at the minimum using the Matlab procedure *nlparci*. The coefficient of determination *R*^{2} was estimated to assess the proportion of variability in the mean data that is accounted for by the respective model. To test for a significant difference in the *R*^{2} for individual participants between the two models, we used the Matlab procedure *signtest*.

For both models, the variance of the reproduced displacement σ̂_{r}^{2} was determined separately from the slope of the linear regression between standard deviation and mean of the reproduced displacements, using Equation 7. The *y*-intercept of the regression was interpreted as being due to constant noise *p*(*n*_{c}) ≈ *N*(0, σ_{c}^{2}) (see Bayesian fusion of measurement and prior, above).

##### Predictions for Bayesian estimation.

The proposed Bayesian framework makes specific predictions on the behavior of an optimal estimator that can be tested experimentally. First, assuming independent noise sources for prior experience and measurement, the estimate on logarithmic scales is determined by the weighted average of prior and measurement dependent on the respective reliability. This leads to a power law dependence between input stimulus and reproduced displacement (Eq. 9) in linear space, as proposed by Stevens (1957), where the power function is determined by the individual weighting of the subjects. Second, according to this relationship, the difference between prior and measurement and therefore the effect of the Bayesian fusion becomes more pronounced for larger displacements, meaning that the overestimation and underestimation increase for increasing displacements. This results in a behavior known in the psychophysics literature as range effect (Teghtsoonian and Teghtsoonian, 1978). Third, assuming constant Gaussian noise on logarithmic scales leads to a linear dependence of the mean reproduced displacement and its corresponding standard deviation on linear scales (Eq. 7).
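The third prediction rests on a property of the lognormal distribution: its standard deviation equals its mean multiplied by sqrt(exp(σ_{r}^{2}) − 1), so the slope of the regression of standard deviation on mean fixes the variance on the logarithmic scale. The Python sketch below illustrates this inversion on synthetic values; the function names and numbers are hypothetical, and reading the intercept as constant noise simply follows the interpretation given in the Model fit section.

```python
import math


def linear_regression(xs, ys):
    """Ordinary least-squares slope and intercept of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def log_variance_from_slope(slope):
    """Invert slope = sqrt(exp(var) - 1) to obtain the log-scale variance."""
    return math.log(1.0 + slope ** 2)


# Synthetic mean reproduced distances (m) and standard deviations (m).
means = [2.0, 4.0, 6.0, 8.0, 10.0]
sds = [0.2 * m + 0.3 for m in means]

slope, intercept = linear_regression(means, sds)
var_r = log_variance_from_slope(slope)  # variance on the logarithmic scale
sigma_c = intercept                     # read as constant noise, following the text
```

For small slopes the log-scale variance is close to the squared slope, so the slope itself is approximately the standard deviation on the logarithmic scale.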

## Results

### Experience-dependent behavior

Participants' responses show three major characteristics that can be attributed to an estimation process that incorporates knowledge about the underlying sample distribution. These characteristics were tested at the group and single-subject levels.

First, reproduced distances and turning angles exhibited a clear tendency toward the mean of the underlying sample distribution for each of the three sample distributions tested. In each condition, small distances and angles were overestimated and large distances and angles were underestimated (Fig. 3). This can be seen in the overlapping distances ([5:14] m) and angles ([50:140]°) that were tested in more than one condition for which the duplicated samples comparison reveals a significant difference between conditions (main effect: Higher Range, Lower Range; DE: *F*_{(1,13)} = 20.4, *p* < 0.001; AE: *F*_{(1,13)} = 66.6, *p* < 0.001). Furthermore, the duplicated samples comparison on the single-subject level reveals that 11 of 14 participants in DE sessions and all participants in AE sessions showed a significant dependence of estimation magnitude on the underlying sample distribution (main effect: Higher Range, Lower Range; DE: *F*_{(1,13)} = 9.8–202.0, *p* < 0.01; AE: *F*_{(1,13)} = 12.4–217.0, *p* < 0.01).

Second, the bias toward the center for each sample distribution increased with increasing displacement range. The overestimation and underestimation errors were more pronounced for the conditions with larger displacements (interaction: condition × displacement; DE: *F*_{(18,234)} = 4.4, *p* < 0.001; AE: *F*_{(18,234)} = 3.2, *p* < 0.001). This causes a decrease in the slope between produced and reproduced displacement for increasing sample range. The significant change in the bias, measured by the change in estimation error over the conditions, was found for 13 of 14 participants in DE and AE sessions (interaction: condition × displacement; DE: *F*_{(18,234)} = 4.7–38.3, *p* < 0.05; AE: *F*_{(18,234)} = 10.2–120.1, *p* < 0.001).

Third, the standard deviation of the reproduced displacements was dependent on the sample distribution. The duplicated samples comparison revealed that standard deviations of overlapping samples differed significantly depending on the underlying sample distribution (main effect: Higher Range, Lower Range; DE: *F*_{(1,13)} = 11.1, *p* = 0.005; AE: *F*_{(1,13)} = 7.8, *p* = 0.01). Additionally, we observed a strong correlation between the mean reproduced displacement and the corresponding mean standard deviation for both DE and AE sessions (linear regression; DE: *r* = 0.95, *p* < 0.001; AE: *r* = 0.97, *p* < 0.001; Fig. 4*a*,*c*). On the single-subject level, the linear regression between standard deviation and mean of reproduced displacements yielded a highly significant correlation coefficient *r* for all participants (DE: *p* < 0.001 for 11 of 14 participants, *p* < 0.01 for the remaining three participants; AE: *p* < 0.001 for 11 of 14 participants, *p* < 0.01 for the remaining three participants).

### Test of the Bayesian estimator model

The experimental findings support the notion that humans incorporate knowledge about the stimulus properties applied in the current condition into their measurement of displacement and that this behavior is qualitatively in agreement with a Bayesian estimation process.

To evaluate this finding in a quantitative manner, we fit two variants of the Bayesian estimation model to the mean response over all participants and to the individual participants' mean responses using a least-squares fitting method (Fig. 2*a*). The first variant, referred to as one-stage model, tests a fixed prior (for a similar study, see Jazayeri and Shadlen, 2010), that is determined by the mean of each sample distribution and therefore represents prior knowledge that captures the overall statistics of the experiment (model fit group: *w*_{prior,DE} = 0.40, CI_{95%} = [0.33, 0.48]; Δ*x*_{DE} = 0.03, CI_{95%} = [0.01, 0.04]; *w*_{prior,AE} = 0.40, CI_{95%} = [0.36, 0.43]; Δ*x*_{AE} = 0.01, CI_{95%} = [0, 0.01]; individual participants: *w̄*_{prior,DE} = 0.41 ± 0.14, range = [0.20 − 0.61]; Δ*x̄*_{DE} = 0.02 ± 0.13; *w̄*_{prior,AE} = 0.39 ± 0.12, range = [0.14 − 0.62]; Δ*x̄*_{AE} = 0.01 ± 0.01). The shift parameter Δ*x* was not significantly different from zero for four of 14 participants in DE and three of 14 participants in AE sessions (remaining participants: DE: five participants, <0; five participants, >0; AE: five participants, <0; six participants, >0).

The second variant, or two-stage model, tests an iteratively updated version of the prior that additionally accounts for variations during the time course of the experiment. This model has, like the one-stage model, two free parameters (model fit group: *w*_{prior,DE} = 0.34, CI_{95%} = [0.30, 0.58]; Δ*x*_{DE} = 0.03, CI_{95%} = [0.01, 0.05]; *w*_{prior,AE} = 0.33, CI_{95%} = [0.28, 0.40]; Δ*x*_{AE} = 0.02, CI_{95%} = [0, 0.04]; individual participants: *w̄*_{prior,DE} = 0.36 ± 0.15, range = [0.20 − 0.61]; Δ*x̄*_{DE} = 0.01 ± 0.11; *w̄*_{prior,AE} = 0.32 ± 0.09, range = [0.14 − 0.62]; Δ*x̄*_{AE} = 0.02 ± 0.07). The shift parameter Δ*x* was not significantly different from zero for six of 14 participants in DE and five of 14 participants in AE sessions (remaining participants: DE: four participants, <0; four participants, >0; AE: three participants, <0; six participants, >0).

The linear relationship between standard deviation and mean of the experimental data was deployed to derive an estimate of the noise sources to simulate the predicted mean reproduction noise of the model. The results for the two-stage model compared with the behavioral data are depicted in Figure 4, *b* and *d*, for the mean of all participants.

Figure 5 compares the experimental data to the mean displacement estimate by the two variants of the fitted Bayesian estimator model. Both variants agree well with the experimental data (coefficient-of-determination, one-stage model, model fit group: *R*_{AE}^{2} = 0.98, *R*_{DE}^{2} = 0.97; individual participants: *R*_{AE}^{2} = 0.83 − 0.99, *R*_{DE}^{2} = 0.84 − 0.98; two-stage-model, model fit group: *R*_{AE}^{2} = 0.98, *R*_{DE}^{2} = 0.97; individual participants: *R*_{AE}^{2} = 0.80 − 0.99, *R*_{DE}^{2} = 0.88 − 0.98). A nonparametric comparison indicated no significant difference between the *R*^{2} values of individual participants for the one- and two-stage models (*p* > 0.1). However, the prior in the two-stage model arises due to the online estimation of the Kalman filter without any knowledge of the underlying sample distribution, whereas the current estimate of the prior in the one-stage model was set to be the mean of the respective underlying sample distribution. Therefore, the one-stage model requires the incorporation of additional knowledge compared with the two-stage case. Furthermore, the two-stage model with iterative update of the prior accounted for small variations in the data that were captured by the variations of the prior (Fig. 5, insets). Consequently, we considered the two-stage model to be superior to the one-stage model and used it for further analysis.

Figure 6*c* shows an example for a typical time course of the variable prior and measurement in one session. The range of displacements predicted by the prior estimates is smaller than that of the sample stimulus. This leads to a predicted measurement that covers a smaller range of displacements than the input stimuli. The time constant of the evolution of Kalman gain varied between subjects (DE: τ_{DE} = [0.2 − 1.6] trials, AE: τ_{AE} = [0.3 − 2.1] trials). These values are similar to the time constants reported for the learning of pointing movements (van Beers, 2009) and shown for learning the mean of a prior distribution in a virtual coin-catching task (Berniker et al., 2010, their Fig. 6).

### Predictions on single-subject behavior

Figure 6 compares the model predictions of mean and standard deviation to the individual participant's responses. The model captures individual differences between participants mainly by variation in the weighting of prior and measurement that in turn determine the slope of the predicted response curve. A strong weighting of the prior results in a more pronounced overestimation and underestimation, while a strong weighting of the measurement results in a predicted response that is very similar to the input stimuli (Fig. 6*a*,*b*, line of equality). Thus, the weighting reflects a scale invariant measure of the overall behavioral tendency of subjects in one experiment.

Within participants, behavior was compared using the weight of the prior, *w*_{prior}, in the DE and AE experiments. We found a significant correlation between the weights in DE and AE sessions (linear regression: *r* = 0.76, *p* = 0.001). Figure 7 shows that the weighting of the prior was more similar within participants across the DE and AE conditions than between participants. In particular, the mean ratio of AE to DE weights was close to one (*w*_{prior,AE}/*w*_{prior,DE} = 0.90).

## Discussion

Human linear and angular displacement estimation is influenced by prior experience. We found that (1) reproduced distances and turning angles were biased toward the center of the underlying sample distribution, (2) the amount of bias increased with increasing sample range, and (3) the standard deviation for all conditions was linearly dependent on the mean reproduced displacement. These three characteristics are well captured by a model of an iterative Bayesian estimator that combines an experience-dependent a priori expectation with the actual noisy measurement to achieve an optimal estimate of displacement. We propose that our results are not limited to displacement estimation, but potentially hold for magnitude reproduction in general.

### Behavioral findings in the context of the literature

Previous work on human linear and angular displacement perception found similar results to ours with a tendency to overshoot and undershoot certain displacements (Loomis et al., 1993; Ivanenko et al., 1997; Seemungal et al., 2007; Bergmann et al., 2011). An indication for the influence of prior experience can be found in work that shows that distance estimation and error magnitude vary considerably as a function of changes in the environmental experience (Ziemer et al., 2009) or stimulus range (Teghtsoonian and Teghtsoonian, 1978; Klatzky et al., 1990; Schwartz, 1999). Yet the estimation of distances and turning angles was mostly tested in different studies, because a direct comparison of the two different measures for single participants is difficult. In the present work, however, the model provides a chance to compare the two magnitudes in terms of individual weighting of prior and measurement, which is invariant to the measure of the magnitude. We found that, overall, individual participants seem to weight the prior for distances and turning angles similarly, whereas the differences between participants' weighting were higher. One possible reason for this is that there is a common processing mechanism for magnitudes in general, including the estimation of turning angles and distances as proposed by Walsh (2003). Another possible explanation is that the reliability of the input was very similar because both measures were based on optic flow in the same virtual environment (Frenz and Lappe, 2005; Mossio et al., 2008). However, the degree of reliance on prior information across tasks may also be a general trait that varies among individual subjects.

### Experience-dependent Bayesian inference leads to a regression toward the mean

The regression effect, first referred to as the central tendency of judgment (Hollingworth, 1910), in psychophysical magnitude estimation is the tendency to correctly estimate magnitudes close to the center of the stimulus range and misestimate marginal ones: values presented at the lower end of the range are overestimated while those at the upper end are underestimated (Stevens and Greenbaum, 1966; Teghtsoonian and Teghtsoonian, 1978). Stevens (1971) attempted to explain this behavior as the tendency of the observer “to shorten the range of whichever variable he controls.” A potential explanation for this tendency is Bayesian fusion of measurement and a priori expectation (Laming, 1999), as shown for displacement estimation in the present work. Multiplying the prior and the likelihood distributions, which in the Gaussian case corresponds to a weighted average of prior and measurement, shifts the estimate from the measurement toward the a priori expectation. As shown in Figure 6*c*, an experience-dependent posterior estimate of randomly presented stimuli covers a smaller range, with displacements close to the center being more likely to occur; this consequently results in a regression toward the center of the sample range.
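The weighted average mentioned above can be written out for the Gaussian case. The symbols below are illustrative and not necessarily those used in the model equations: a prior with mean $\mu_{p}$ and variance $\sigma_{p}^{2}$, and a measurement $m$ with variance $\sigma_{m}^{2}$. Multiplying the two Gaussians and normalizing gives a Gaussian posterior with

$$
\mu_{post} = w_{prior}\,\mu_{p} + (1 - w_{prior})\,m,
\qquad
w_{prior} = \frac{\sigma_{m}^{2}}{\sigma_{p}^{2} + \sigma_{m}^{2}},
\qquad
\sigma_{post}^{2} = \frac{\sigma_{p}^{2}\,\sigma_{m}^{2}}{\sigma_{p}^{2} + \sigma_{m}^{2}} .
$$

Because $0 < w_{prior} < 1$, the posterior mean always lies between the measurement and the prior mean, so stimuli below the prior mean are on average overestimated and stimuli above it underestimated. The less reliable the measurement (larger $\sigma_{m}^{2}$), the larger $w_{prior}$ and the stronger the pull toward the prior.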

### Dynamic prior knowledge adapts to the range of stimuli presented

Several studies have convincingly demonstrated that humans can use near-optimal strategies to combine stimulus uncertainty and prior information (Mamassian and Landy, 1998; Körding and Wolpert, 2004a; Tassinari et al., 2006). The a priori expectation in Bayesian models is often viewed as a fixed internal tendency that is due to general features in the world, e.g., that slow velocities are more likely to occur than fast ones (Weiss et al., 2002; Stocker and Simoncelli, 2006). Several studies, however, have shown that the a priori estimate can be modulated by short-term experience (Adams et al., 2004; Körding et al., 2004; Miyazaki et al., 2005) and that its mean and variance can be learned during the experiment (Guo et al., 2004; Körding and Wolpert, 2004a; Berniker et al., 2010). Jazayeri and Shadlen (2010), for instance, assumed an experience-dependent prior expectation that was modeled as a continuous and fixed distribution, centered around the mean of the sample distribution. It is more plausible, however, that such a representation arises over time. In the present work, we tested the fixed prior against a variable version that continuously updates its expectation with the previous measurement. The assumption behind the proposed updating procedure is that the mean of the stimulus distribution changes slowly over time, but in a way unknown to the system. We show that such an iteratively updated prior, modeled by a Kalman filter, accounts for small variations in the data that are most likely due to the order of stimuli presented and cannot be explained by a version with a fixed prior. Furthermore, this model provides an explanation for the origin and development of such a prior over time. In particular, the adaptation to the underlying sample range for randomly presented stimuli (Teghtsoonian and Teghtsoonian, 1978; Kowal, 1993; Cheng et al., 2010) and also, potentially, the hysteresis effect, which refers to a dependence of the behavior on the order of stimuli, for experiments with nonrandom order (Eisler and Ottander, 1963; Hock et al., 2005) can result from the continuous update of a prior according to the experienced displacements.
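The idea of an iteratively updated prior can be illustrated with a minimal simulation. This is a simplified sketch and not the fitted two-stage model: it assumes a scalar Kalman filter with a random-walk prior on the logarithmic scale, omits any shift term, and uses hypothetical names and values (`iterative_log_estimator`, `sigma_m`, `q`, a stimulus range of 5 to 25 units). In this simplification the fused estimate coincides with the Kalman posterior.

```python
import numpy as np

def iterative_log_estimator(stimuli, sigma_m=0.15, q=0.02, seed=0):
    """Simplified iterative Bayesian estimator on a log scale (illustrative only).

    sigma_m: assumed SD of the measurement noise in log units.
    q:       assumed random-walk variance of the prior mean per trial.
    Returns the reproduced magnitudes and the weight of the prior on each trial.
    """
    rng = np.random.default_rng(seed)
    log_s = np.log(np.asarray(stimuli, dtype=float))
    y = log_s + rng.normal(0.0, sigma_m, size=log_s.shape)  # noisy log measurement
    r = sigma_m ** 2
    prior_mean, prior_var = y[0], r  # assumption: start from the first measurement
    est = np.empty_like(y)
    w_prior = np.empty_like(y)
    for i in range(y.size):
        pred_var = prior_var + q          # prior becomes less certain between trials
        k = pred_var / (pred_var + r)     # Kalman gain = weight of the measurement
        est[i] = (1.0 - k) * prior_mean + k * y[i]  # fuse prior and measurement
        w_prior[i] = 1.0 - k
        prior_mean, prior_var = est[i], (1.0 - k) * pred_var  # posterior -> next prior
    return np.exp(est), w_prior

# Randomly ordered stimuli from a hypothetical uniform sample distribution
stimuli = np.random.default_rng(1).uniform(5.0, 25.0, size=200)
responses, w_prior = iterative_log_estimator(stimuli)

# Slope of log response against log stimulus; values below 1 indicate regression
slope = np.polyfit(np.log(stimuli), np.log(responses), 1)[0]
print(f"log-log slope: {slope:.2f}, final prior weight: {w_prior[-1]:.2f}")
```

Under these assumptions the responses cover a narrower range than the stimuli, and each estimate depends on the recent stimulus history, which is the ingredient needed for order-dependent effects.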

### Logarithmic Bayesian fusion leads to a direct link between Weber–Fechner and Stevens' power law

In the present work, we suggest that the most parsimonious explanation for the behavior is that displacement is coded internally on a logarithmic scale, as first proposed by Fechner (1860) based on Weber's law, which has been shown to hold for human locomotor distance reproduction (Durgin et al., 2009). Similar results could, in principle, also be achieved on linear scales, assuming scalar variability, that is, a rise in the standard deviation with increasing mean (Rakitin et al., 1998; Cantlon et al., 2009). Recent work, however, supports the idea that numerical quantities (Dehaene, 2003; Nieder and Merten, 2007) and visual motion perception (Jürgens and Becker, 2006; Stocker and Simoncelli, 2006) are coded logarithmically in the brain. Note that the latter authors assumed that Bayesian integration still takes place in linear space. However, as we have shown here, the power law, or Stevens' law, is a direct consequence of Bayesian integration on a logarithmic scale (Eqs. 2, 3, 9). Thus, as MacKay (1963) has shown before, the Weber–Fechner law and Stevens' law are indeed compatible. The proposed Bayesian fusion also assumes that the variance of a measured magnitude is independent of the magnitude on the logarithmic scale. On linear scales, this leads to a standard deviation that increases in proportion to the mean, as observed in the experimental data and corresponding to the scale invariance found in both Weber–Fechner's and Stevens' laws (Chater and Brown, 1999).
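The link between the two laws can be sketched in a few lines. The notation here is illustrative, it ignores noise and any shift term, and it is not meant to reproduce Eqs. 2, 3, and 9 exactly. Let $x_{m} = \ln(d_{m}/d_{0})$ be the logarithmic (Weber–Fechner) representation of a measured displacement $d_{m}$ relative to a reference $d_{0}$, and let the fused estimate on the logarithmic scale be the weighted average of prior mean and measurement:

$$
x_{est} = w_{prior}\,\mu_{prior} + (1 - w_{prior})\,x_{m} .
$$

Transforming back to the linear scale gives

$$
d_{est} = d_{0}\,e^{x_{est}} = d_{0}\,e^{\,w_{prior}\,\mu_{prior}} \left(\frac{d_{m}}{d_{0}}\right)^{1 - w_{prior}},
$$

which is a power function of the stimulus with exponent $1 - w_{prior}$, that is, the form of Stevens' law. The scalar variability follows in the same way: if $x$ is Gaussian with mean $\mu$ and magnitude-independent variance $\sigma^{2}$, then $d = d_{0}\,e^{x}$ is log-normally distributed with

$$
\mathrm{E}[d] = d_{0}\,e^{\mu + \sigma^{2}/2},
\qquad
\mathrm{SD}[d] = \mathrm{E}[d]\,\sqrt{e^{\sigma^{2}} - 1},
$$

so the standard deviation is proportional to the mean.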

### Bayesian estimation of displacement, velocity, and time

Bayesian models succeeded in describing a variety of psychophysical data in related domains. The experimental design in the present study was very similar to recent work on interval timing (Jazayeri and Shadlen, 2010), allowing for a direct comparison of the behavioral findings. In particular, the same characteristic features, such as a tendency to the mean of the sample interval and an increase in bias with increasing sample range for estimation of traveled distances and turning angles, were previously reported for interval timing. Jazayeri and Shadlen (2010) tested different probabilistic approaches to combine the two sources of information given by sensory input and prior experience and concluded that a Bayesian observer model is statistically superior to maximum likelihood estimation (Ernst and Banks, 2002) or maximum a posteriori estimation in describing the main features of the behavioral data.

A large range of phenomena in motion perception, such as misestimation of speed and direction, was also successfully described by a Bayesian estimation process based on a prior that favors low speeds (Weiss et al., 2002; Stocker and Simoncelli, 2006). In line with this work, the variability in angular displacement perception has been proposed to be a result of Bayesian fusion of sensory inputs (Butler et al., 2010) and cognitive movement velocity (Jürgens and Becker, 2006). In the present work, however, movement velocity was varied randomly and independently of the experience condition. Thus, we show that the observed effects between conditions de facto depend on the experienced distances or turning angles. Yet the measurement of these displacements in the virtual world is still determined by an integration of optic flow, raising the question of whether the Bayesian estimation process is based on an estimate of displacement or takes place for time and velocity separately and is fused on a higher cognitive level to represent an estimate of displacement. The computational cost of the latter case would be higher, because more than one magnitude would have to be updated, whereas updating displacement alone, as in the present work, provides a parsimonious explanation for a large number of findings.

### Conclusion

From the realization of the iterative Bayesian estimator model, we infer that the systematic errors seen in human path integration behavior are the result of a performance-optimizing estimation process that exploits knowledge about previous behavior and the uncertainty of measurements. The model provides a direct link between Weber–Fechner and Stevens' power law. Consequently, we propose that our results are not limited to displacement estimation, but can potentially provide a unified explanation for commonly seen effects in psychophysical magnitude estimation studies, such as the range, regression, and hysteresis effects.

## Footnotes

This work was supported by the Federal Ministry of Education and Research Grant BCCN 01GQ0440. We thank Virginia Flanagin and Paul MacNeilage for helpful comments and copy-editing.

^{a} If the displacement to be reproduced was estimated already on logarithmic scales, then the mode, mean, and median of the posterior distribution would be equal and, for commonly used symmetric cost functions, the statistically optimal estimate would be the median of the log-normal distribution. We tested for this possibility and found that the model accounts well for the behavior of subjects when the shift parameter was not significantly different from zero (see Results), but generated worse fits for the remaining participants.

Correspondence should be addressed to Frederike H. Petzschner, Institute for Clinical Neurosciences, Ludwig-Maximilians-Universität, Marchioninistrasse 23, 81377 München, Germany. fpetzschner{at}lrz.uni-muenchen.de