## Abstract

The Pearson correlation coefficient squared, *r*^{2}, is an important tool used in the analysis of neural data to quantify the similarity between neural tuning curves. Yet this metric is biased by trial-to-trial variability: as trial-to-trial variability increases, measured correlation decreases. Major lines of research are confounded by this bias, including the study of invariance of neural tuning across conditions and the analysis of the similarity of tuning across neurons. To address this, we extend an estimator, *r̂*^{2}_{ER}, that separately corrects the numerator and denominator of *r*^{2} for the bias introduced by trial-to-trial variability.

**SIGNIFICANCE STATEMENT** Quantifying the similarity between two sets of averaged neural responses is fundamental to the analysis of neural data. A ubiquitous metric of similarity, the correlation coefficient, is attenuated by trial-to-trial variability that arises from many irrelevant factors. Spearman recognized this problem and proposed a correction that has been refined over the past century. We show this classic method has large asymptotic biases that can be overcome using a novel estimator. Despite the frequent use of the correlation coefficient in neuroscience, consensus on how to address this fundamental statistical issue has not been reached. We provide an accurate estimator of the correlation coefficient and apply it to gain insight into visual invariance.

## Introduction

The measurement of correlation between noisy datasets is ubiquitous and plays a critical role in sensory neuroscience. The *r*^{2} between two sets of mean neural responses is fundamentally relevant to lines of research that compare tuning curves across neurons to understand functional organization and population encoding (Gawne and Richmond, 1993; Gawne et al., 1996; Cohen and Kohn, 2011; Power et al., 2011; Kiani et al., 2015) and to studies that compare tuning curves within the same neuron across stimulus transformations to understand invariance in cortical representations (Nandy et al., 2013; El-Shamayleh and Pasupathy, 2016; Kell and McDermott, 2019; Popovkina et al., 2019). Despite the importance of estimating the correlation across tuning curves, the typical estimator, the sample *r*^{2} computed between trial-averaged responses, is attenuated by trial-to-trial variability.

A more principled approach has been to account for trial-to-trial variability in the estimation of correlation. Correction for the attenuation of the correlation coefficient by measurement error has received considerable attention from fields outside neuroscience (Thouless, 1939; Beaton et al., 1979; Rosner and Willett, 1988; Adolph and Hardin, 2007; Saccenti et al., 2020) and has recently been adopted to estimate invariance of neural activity (Kell and McDermott, 2019). These studies followed a line of work begun by Spearman (1904), who provided a correction in the form of a multiplicative scaling of the Pearson correlation coefficient by its estimated attenuation. Here, we follow a different approach developed in the neuroscience literature (Sahani and Linden, 2003; Haefner and Cumming, 2008; Pospisil and Bair, 2021b) that sought to account for trial-to-trial variability in the estimation of the fit of a model to noisy estimated tuning curves. This line of work developed separate unbiased estimates of the numerator and denominator of the Pearson correlation coefficient. Here, we show how to extend that work to the case of measuring the correlation between two noisy, estimated tuning curves, and we call the resulting quantity the *r*^{2} between the expected responses (ERs) of two neurons, *r*^{2}_{ER}. We show in simulation that our estimator, *r̂*^{2}_{ER}, is unbiased across a wide range of conditions in which the naive estimator is strongly attenuated.

To demonstrate the usefulness of our novel estimator, we apply it to the problem of invariant cortical representation, where single-neuron selectivity is considered to be invariant if it is maintained across a transformation of the sensory input. Transformations of visual stimuli can be geometric, such as translation (Nandy et al., 2013) or scaling (El-Shamayleh and Pasupathy, 2016), or they can modify appearance by changing surface properties, for example, color or luminance (Bushnell et al., 2011; Popovkina et al., 2019). We show in neural data how *r̂*^{2}_{ER} gives a more accurate picture of invariance in visual area V4 than the naive estimator.

## Materials and Methods

##### Simulation procedure

To demonstrate the bias that noise imparts on neuron-to-neuron correlation and the ability of our methods to remove this bias, we simulate the responses of neurons to *m* stimuli across *n* repeats. We generate values, *r*_{i,j}, that represent the square root (for reasons described below) of single-neuron responses to the *j*th repeat of the *i*th stimulus, drawn from a normal distribution whose mean, μ_{i}, varies across stimuli while the variance, σ^{2}, is constant. The degree of variation across the μ_{i} reflects how well the stimuli modulate the neural response; we call this the dynamic range, *d*^{2}, and quantify it as follows:

*d*^{2} = (1/*m*) Σ_{i} (μ_{i} − μ̄)^{2},

where μ̄ is the mean of the μ_{i} across stimuli. Trial-to-trial variability can be different across cells; fixing *n* and *m*, the signal-to-noise ratio (SNR) determines the reliability with which correlation can be estimated across neuron pairs as follows:

SNR = *d*^{2}/σ^{2}.

This will be a critical parameter in both our simulations and our analysis of neural recordings.

For our simulations, the tuning curves for a pair of neurons, *X* and *Y*, in response to a set of *m* stimuli are defined to be the two sets of mean responses, μ_{i} and ν_{i}, respectively (*i* = 1 … *m*), and these are modeled as sinusoids (Fig. 1*A*, lines; Eq. 3). The tuning curve for neuron *Y* is simply phase shifted relative to that of neuron *X* to achieve the desired value of the true tuning correlation.

To validate potential estimators, we simulated *n* repeats for each of the *m* stimuli. Averaging these observations across repeats produces observed tuning curves for the two simulated neurons (Fig. 1*A*, open circles).

##### Assumptions for the derivation of unbiased estimators

To simplify the derivation of unbiased estimators for *r*^{2}_{ER}, we assume that responses are independent across stimuli and repeats and that a variance stabilizing transformation (e.g., the square root for spike counts) has been applied so that trial-to-trial variance is equal across stimuli and neurons.

##### Unbiasing r^{2}

In the case where both *X* and *Y* are equal-variance stochastic responses, the naive estimator is the squared Pearson correlation between the trial-averaged responses, *X̄*_{i} and *Ȳ*_{i} (Eq. 6):

*r̂*^{2} = [Σ_{i} (*X̄*_{i} − *X̄*)(*Ȳ*_{i} − *Ȳ*)]^{2} / [Σ_{i} (*X̄*_{i} − *X̄*)^{2} Σ_{i} (*Ȳ*_{i} − *Ȳ*)^{2}].

In our approach, we will unbias the numerator and denominator of Equation 6 separately. The expected value of the numerator is the squared covariance of the true tuning curves plus a bias term involving σ^{2}/*n*, the trial-to-trial variability remaining after averaging over repeats (Eq. 7).

The denominator of Equation 6 consists of two factors that are each scaled noncentral chi-squared distributions: Σ_{i} (*X̄*_{i} − *X̄*)^{2} is distributed as (σ^{2}/*n*)χ^{2}_{m−1}(λ_{X}), with noncentrality parameter λ_{X} determined by the dynamic range of neuron *X*, and likewise for *Y*.

Thus, the expected value of each factor is the sum of squared deviations of the corresponding true tuning curve plus (*m* − 1)σ^{2}/*n*.

Because these two random variables are independent, the expected value of the denominator, which is their product, is the product of these expected values as follows:

E[Σ_{i} (*X̄*_{i} − *X̄*)^{2} Σ_{i} (*Ȳ*_{i} − *Ȳ*)^{2}]
= Σ_{i} (μ_{i} − μ̄)^{2} Σ_{i} (ν_{i} − ν̄)^{2}
+ (*m* − 1)(σ^{2}/*n*)[Σ_{i} (μ_{i} − μ̄)^{2} + Σ_{i} (ν_{i} − ν̄)^{2}] + (*m* − 1)^{2}σ^{4}/*n*^{2}.

The first term (second line of this equation) is the desired denominator (that of Eq. 7), whereas the remaining terms (third line) are the bias. To remove this bias, an unbiased estimate of it can be subtracted from the denominator of Equation 6.
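This expectation can be checked numerically. The sketch below assumes the model stated above (fixed tuning curves, independent Gaussian noise of variance σ^{2} per trial); the closed-form expectation used for comparison, with (*m* − 1)σ^{2}/*n* added to each sum of squares, is our reconstruction of the result described in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, sigma, reps = 50, 4, 1.0, 20000
mu = rng.normal(0.0, 2.0, size=m)  # fixed tuning curve, neuron X
nu = rng.normal(0.0, 2.0, size=m)  # fixed tuning curve, neuron Y

def ss(a, axis=-1):
    """Sum of squared deviations from the mean."""
    return ((a - a.mean(axis=axis, keepdims=True)) ** 2).sum(axis=axis)

tau = sigma / np.sqrt(n)  # SD of an n-trial average
xb = mu + rng.normal(0.0, tau, size=(reps, m))  # trial-averaged responses
yb = nu + rng.normal(0.0, tau, size=(reps, m))
den = ss(xb) * ss(yb)  # denominator of the naive r^2, per simulated experiment

# Product of per-factor expectations: (S + (m-1)*sigma^2/n) for each neuron.
expected = (ss(mu) + (m - 1) * tau**2) * (ss(nu) + (m - 1) * tau**2)
ratio = den.mean() / expected  # close to 1 if the expectation is correct
```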

##### Estimators of bias terms

To compute the bias terms to be subtracted from the numerator and denominator, three unknown quantities need to be estimated: the two dynamic range values, *d*^{2}_{X} and *d*^{2}_{Y}, and the trial-to-trial variance, σ^{2}. Below we provide unbiased estimators for these quantities.

In the case of the dynamic range values, the naive sample estimator, the variance of the trial-averaged responses across stimuli, is biased upward: trial-to-trial variability inflates the spread of the sample means by σ^{2}/*n*. Thus, an unbiased estimator is obtained by subtracting an estimate of this inflation, σ̂^{2}/*n*, from the naive estimate (Eq. 17).

For the case of the sample variance, if we have *n* repeated trials, we can calculate the sample variance over those trials for each stimulus; then, because the variance is the same across stimuli and neurons (assuming a variance stabilizing transformation), we can average these values across stimuli and both neurons for a global estimate:

σ̂^{2} = (1/*m*) Σ_{i} *s*^{2}_{i},

where *s*^{2}_{i} is the sample variance across the *n* repeats for the *i*th stimulus. Using these unbiased quantities to form the terms to be subtracted from the numerator and denominator of the naive estimator yields our estimator, *r̂*^{2}_{ER}.
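A minimal sketch of this global variance estimate, assuming responses have already been variance stabilized so that σ^{2} is shared across stimuli and neurons:

```python
import numpy as np

def pooled_noise_variance(x, y):
    """Global estimate of trial-to-trial variance, sigma^2.

    x, y : arrays of shape (m stimuli, n repeats), assumed already passed
    through a variance-stabilizing transformation (e.g., sqrt for spike
    counts) so that variance is constant across stimuli and neurons.
    Averages the unbiased per-stimulus sample variances across stimuli
    and across both neurons.
    """
    vx = x.var(axis=1, ddof=1)  # per-stimulus sample variance, neuron X
    vy = y.var(axis=1, ddof=1)  # per-stimulus sample variance, neuron Y
    return 0.5 * (vx.mean() + vy.mean())
```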

##### Confidence intervals for *r̂*^{2}_{ER}

To quantify uncertainty about the estimate *r̂*^{2}_{ER}, we developed a method for computing confidence intervals (CIs), which we refer to as ECCI.

ECCI finds confidence intervals centered around the estimate of *r*^{2}_{ER}.

##### Spearman's correction for attenuation

Spearman (1904) noted that in the case of measurements of two quantities with some underlying true correlation but with additive independent measurement error, the measured correlation would tend to be less than the true correlation. He provided the analytic expression for the attenuation, *A* (following the treatment in Saccenti et al., 2020), under a bivariate normal model; in our notation,

*A* = 1/√[(1 + σ^{2}_{X}/(*n* *d*^{2}_{X}))(1 + σ^{2}_{Y}/(*n* *d*^{2}_{Y}))],

which relates the correlation between the noisy averaged measurements, ρ, to the true correlation, ρ_{0}, as follows:

ρ = *A*ρ_{0}.

Here, *d*^{2}_{X} and *d*^{2}_{Y} are the across-stimulus variances of the true tuning curves, and σ^{2}_{X}/*n* and σ^{2}_{Y}/*n* are the variances of the trial-averaged noise.

The method to correct the attenuation is straightforward: find an estimate of *A* and multiply the measured correlation by its inverse to estimate ρ_{0}. Spearman does not specify estimators for the unknowns in Equation 21; however, Adolph and Hardin (2007) use sample variances to estimate both the within-condition (trial-to-trial) variance and the across-condition variance, yielding an estimate of *A*.

We will square this estimator to compare it to ours (see below, Results). Because the sample estimate of the across-condition variance is itself inflated by trial-to-trial variability, we also consider a version of Spearman's correction that instead uses our unbiased estimate of the dynamic range.
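Spearman's correction, in the variant that uses an unbiased estimate of the dynamic range, can be sketched as follows. The estimator details are our reading of the description above (sample variances for noise, across-stimulus variance of the trial averages for total variance) and should be treated as illustrative, not as the published implementation.

```python
import numpy as np

def spearman_corrected_r2(x, y):
    """Disattenuated squared correlation between trial-averaged tuning
    curves (sketch of Spearman's correction; estimator details assumed).

    The measured r^2 of the trial averages is divided by an estimate of
    A^2, where A^2 = (d2x / (d2x + s2x/n)) * (d2y / (d2y + s2y/n)).
    The dynamic range d2 is estimated without bias as the across-stimulus
    variance of the averages minus the variance of the averaged noise.
    Can be unstable at low SNR (estimated d2 near or below zero).
    """
    m, n = x.shape
    xb, yb = x.mean(axis=1), y.mean(axis=1)
    r = np.corrcoef(xb, yb)[0, 1]
    noise_x = x.var(axis=1, ddof=1).mean() / n  # variance of averaged noise
    noise_y = y.var(axis=1, ddof=1).mean() / n
    tot_x = xb.var(ddof=1)  # signal variance + averaged-noise variance
    tot_y = yb.var(ddof=1)
    a2 = ((tot_x - noise_x) / tot_x) * ((tot_y - noise_y) / tot_y)
    return r**2 / a2
```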

##### Electrophysiological data

To demonstrate the use of our estimator, we reanalyzed data from three previous single-unit, extracellular studies of parafoveal V4 neurons in awake, fixating rhesus monkeys (*Macaca mulatta*) of both sexes. Data from the first study (Pasupathy and Connor, 2001) consist of the responses of 109 V4 neurons to a set of 362 shapes. There were typically three to five repeats of each stimulus, and we examined only the 96 cells that had at least four repeats for all stimuli. We computed the spike count for each trial during the 500 ms stimulus presentation. From a second study (El-Shamayleh and Pasupathy, 2016), we reanalyzed responses of 80 neurons tested for translation invariance using shapes like those of the first study but where the position of the stimuli within the receptive field (RF) was also varied. Each neuron was tested with up to 56 shapes (some of which are rotations of others) presented at three to five positions within the RF. Each unique combination of stimulus and RF position was presented for 5–16 repeats, and spike counts were computed over the 300 ms stimulus presentation. Data from the third study (Popovkina et al., 2019) consist of the responses of 42 V4 neurons to shapes like those of the first study except in two conditions: fill and outline. In the fill condition, the interior of the shape was the same color as its outline. In the outline condition, the interior of the shape was the same color as the background. Stimuli were presented for 300 ms with a 200 ms blank interval preceding each. For each neuron, stimuli were presented for at least three repeats. Experimental protocols for all studies are described in detail in the original publications.

##### Experimental design and statistical analysis

The experimental design is described above in Electrophysiological data, and a full description is provided in the original studies (Pasupathy and Connor, 2001; El-Shamayleh and Pasupathy, 2016; Popovkina et al., 2019). The statistical analysis of the data is described in detail above in the following sections: Unbiasing *r*^{2}, Estimators of bias terms, and Confidence intervals for *r̂*^{2}_{ER}.

##### Data availability

Code for calculating estimates of *r̂*^{2}_{ER} and its confidence intervals is publicly available.

## Results

Our results are organized as follows. We first describe the source of the bias in the naive estimator, then validate our corrected estimator, *r̂*^{2}_{ER}, in simulation, compare it with estimators based on Spearman's correction for attenuation, and finally apply it to V4 data to quantify translation and fill-outline invariance.

Consider a typical scenario in sensory neuroscience where the responses of two neurons to *n* repeated presentations of *m* stimuli have been collected, and the averages of these responses are compared (Fig. 1*A*). Even if the underlying neuronal tuning curves (Fig. 1*A*, blue and red lines) were perfectly correlated, the *m* sample averages, *X̄*_{i} and *Ȳ*_{i} (blue and red open circles), will deviate from the expected values because of trial-to-trial variability, σ^{2}, scaled by 1/*n*. Thus, the observed *r*^{2} between the sample averages is less than 1, even though the *r*^{2} between the expected values of the neuronal responses is 1 (Fig. 1*A*, red and blue lines are identical up to a shift and scaling).

The quantity we attempt to estimate here is the correlation coefficient *r*^{2} between the expected values, μ_{i} and ν_{i}, of the responses of the two neurons, that is, between the tuning curves in the absence of noise. We will call this quantity *r*^{2}_{ER}, the *r*^{2} between the expected responses.

It is tempting to estimate this quantity using the naive sample estimator (Eq. 6), but its numerator and denominator are both inflated by terms involving σ^{2}/*n*, the amount of variability remaining after averaging over stimulus repeats. The bias in the denominator, which results in the attenuation of the naive estimate, grows with (*m* – 1) (Eq. 13, third line). The bias does go to zero as *n* becomes large, but in typical experiments the number of repeats is small.

To solve this problem, we take the straightforward strategy of finding unbiased estimators of these noise terms and subtracting them from the numerator and denominator of the naive estimator, yielding *r̂*^{2}_{ER}.
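To make the strategy concrete, here is a sketch of a bias-corrected estimator under the equal-variance assumptions above. The specific bias expressions are our reconstruction from the expectations described in Materials and Methods (the σ^{2}/*n* terms and the (*m* − 1) factor), not a transcription of the authors' published code.

```python
import numpy as np

def r2_er_hat(x, y):
    """Sketch of an r^2_ER-style estimator: subtract estimates of the
    noise-induced bias from the numerator and denominator of the naive
    r^2 between trial-averaged tuning curves.

    x, y : (m stimuli, n repeats) arrays, assumed variance stabilized so
    sigma^2 is shared across stimuli and neurons. Bias terms below are
    our reconstruction under that model, for illustration only.
    """
    m, n = x.shape
    xb, yb = x.mean(axis=1), y.mean(axis=1)
    dx, dy = xb - xb.mean(), yb - yb.mean()
    ssx, ssy = np.sum(dx**2), np.sum(dy**2)
    cov = np.sum(dx * dy)
    # Pooled trial-to-trial variance, then variance of an n-trial average.
    s2 = 0.5 * (x.var(axis=1, ddof=1).mean() + y.var(axis=1, ddof=1).mean())
    t2 = s2 / n
    # Unbiased sums of squares of the true tuning curves:
    # E[ssx] = S_mu + (m - 1) * t2, so subtract the inflation.
    ssx_u = ssx - (m - 1) * t2
    ssy_u = ssy - (m - 1) * t2
    # E[cov^2] = C^2 + t2 * (S_mu + S_nu) + (m - 1) * t2^2 (reconstruction),
    # so subtract plug-in estimates of the two bias terms.
    num = cov**2 - t2 * (ssx_u + ssy_u) - (m - 1) * t2**2
    den = ssx_u * ssy_u
    return num / den
```

In simulation this sketch recovers the true tuning correlation on average where the naive estimator is strongly attenuated, mirroring the behavior described for *r̂*^{2}_{ER} below.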

### Validation of estimator by simulation

To test the ability of *r̂*^{2}_{ER} to recover the true correlation, we simulated pairs of neurons with the same trial-to-trial variability (σ^{2}), the same number of trials (*n*), but potentially different SNR (Eq. 2; because noise is fixed, SNR increases with tuning curve amplitude). Both simulated neurons had sinusoidal tuning curves (as in Fig. 1*A*), and their relative phase offset was varied to set the true tuning correlation (see above, Materials and Methods, Eq. 3). We first consider the case where neuron *X* has a high SNR (SNR_{X} = 1.0), whereas neuron *Y* has a low SNR (SNR_{Y} = 0.5). In our simulation we set *m* = 371 and *n* = 4 to represent a typical V4 experiment with many shape stimuli and few repeats of each. This parameter regime, where *m* ≫ *n*, is typical in neuroscience because often many experimental conditions (e.g., stimuli) are needed to characterize tuning (see below, Nonvision example of correcting correlation between tuning curves). In this case, the naive estimate of correlation (Fig. 1*B*, orange) is on average only one-quarter of the value of the true correlation between the simulated tuning curves. Our estimator, in contrast, is on average equal to the true value (Fig. 1*B*, blue line lies on diagonal). However, although *r̂*^{2}_{ER} is correct on average, its distribution across simulated experiments is broad (Fig. 1*C*), and the difference between the average of the estimates and the true value is small only in expectation; individual estimates can exceed *r*^{2} = 1 or go below *r*^{2} = 0. This is necessary to obtain an unbiased estimator (e.g., if all probability mass is below 1, the expected value cannot be 1). To account for these impossible values of *r*^{2}, when analyzing results for single neurons, we recommend truncating the estimator to lie in [0, 1]; this will minimally add to the bias and aid interpretation.

### Comparison to Spearman's correction

To demonstrate how our estimator, *r̂*^{2}_{ER}, compares with approaches based on Spearman's correction for attenuation, we evaluated four estimators: the naive estimator, Spearman's correction as implemented by Adolph and Hardin (2007), that correction computed with an unbiased estimate of the dynamic range, and *r̂*^{2}_{ER}.

We compare the bias of these four estimators using simulated data in which we vary the number of stimuli *m* and the SNR for both neurons while holding the number of repeats constant at *n* = 4 and the true correlation fixed at *r*^{2} = 1. For a small number of stimuli (*m* = 20; Fig. 2*A*), the naive estimator is strongly attenuated, and this attenuation persists even as *m* is increased to 50 (Fig. 2*B*, orange) and 1000 (Fig. 2*C*, orange). The estimator proposed by Adolph and Hardin overestimates *r*^{2} and does not improve with *m*. The same estimator computed using an unbiased estimate of the dynamic range converges to the true value (*r*^{2} = 1) as *m* increases. Finally, our *r̂*^{2}_{ER} is unbiased across *m*, although in the case of the lowest SNR values, it becomes highly variable (Fig. 2*A*, large blue SD at left). For higher values of *m* (Fig. 2*B*,*C*), this is not an issue. For very high *m*, our correction to Spearman's method converges to the same value as *r̂*^{2}_{ER} (Fig. 2*C*, blue and purple lines overlap).

Although we have found our corrected estimators can effectively reduce bias, they also typically have higher variability than the prior estimators, which can lead to overall higher mean squared error (MSE, a common measure of overall estimator accuracy). Here, we explore the trade-off between bias and variance across different numbers of repeats (*n*) and stimuli (*m*; Fig. 2, bottom row). When we set the number of repeats to *n* = 4 (Fig. 2*D–F*), we found that for very low *m*, the MSE was highest for our corrected estimators and the naive estimator. As *m* increases above 10, the bias of the corrected estimators drops below that of the prior estimators, but their MSE remains higher than that of Spearman's correction until *m* = 40. As *m* increases beyond 40, the MSE of the corrected estimators drops below those of the prior estimators (blue, purple below red, orange solid traces). Thus, at this point our corrected estimators are more accurate than the uncorrected estimators despite their variance remaining higher. Although the variance of all estimators decreases with *m* (dotted lines decreasing), the bias of the naive estimator does not, whereas the estimates of the bias terms improve with *m*, thus improving the accuracy of the correction. Interestingly, the MSEs of the corrected estimators are nearly identical, although their biases differ slightly. For an intermediate number of repeats (*n* = 10; Fig. 2*E*), the performance of all estimators improves, but similarly to Figure 2*D*, the MSE of Spearman's correction falls more steeply as *m* increases (solid red line is steeper in Fig. 2*E* than in Fig. 2*D*); this is because the overestimation of dynamic range caused by trial-to-trial variability (Eq. 17) is reduced by averaging across more repeats, and as *m* increases, the true dynamic range dominates. Finally, for a very high number of repeats (*n* = 50; Fig. 2*F*), the bias of all estimators (dashed lines) is low because the effect of trial-to-trial variability has largely been averaged out.
Overall, with a high number of repeats, the difference between the estimators becomes small, but when the number of repeats is low, the corrected estimators are substantially more accurate given sufficiently large *m*.
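The bias–variance comparison above can be reproduced for any estimator with a small Monte Carlo harness; this sketch (names and structure are ours, not from the study) computes the decomposition MSE = bias^{2} + variance:

```python
import numpy as np

def evaluate_estimator(estimator, simulate, true_value, reps=1000):
    """Monte Carlo bias, variance, and MSE of a scalar estimator.

    `simulate()` returns one dataset as a tuple of arrays;
    `estimator(*data)` returns a scalar estimate. Returns
    (bias, variance, mse) with mse = bias**2 + variance.
    """
    est = np.array([estimator(*simulate()) for _ in range(reps)])
    bias = est.mean() - true_value
    var = est.var(ddof=1)
    return bias, var, bias**2 + var
```

As a sanity check, applying this harness to the sample mean of 100 standard normal draws should report bias near 0 and MSE near 1/100.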

### Split trial validation for V4 data

We have shown that our estimator, *r̂*^{2}_{ER}, is unbiased in simulation; we now validate it on neural data using responses of V4 neurons to *n* = 4 repeats of *m* = 371 stimuli (see above, Materials and Methods, Electrophysiological data). We use a split-trial approach in which the true correlation is known to be 1, because both halves of the trials reflect the same underlying tuning curve: we evaluate our estimators on the first and third trial correlated to the second and fourth trial.
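The interleaved split used above (trials 1 and 3 versus trials 2 and 4) can be sketched as follows; the function names are ours:

```python
import numpy as np

def split_half_curves(responses):
    """Split an (m stimuli, n trials) response array into tuning curves
    averaged over interleaved halves (trials 1, 3, ... vs 2, 4, ...)."""
    odd = responses[:, 0::2].mean(axis=1)   # trials 1, 3, ...
    even = responses[:, 1::2].mean(axis=1)  # trials 2, 4, ...
    return odd, even

def split_half_r2(responses):
    """Naive r^2 between the two half tuning curves; the true correlation
    between their expected values is 1 by construction."""
    odd, even = split_half_curves(responses)
    return np.corrcoef(odd, even)[0, 1] ** 2
```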

We first computed the naive estimator for each neuron (Fig. 3*A*, orange points) and plotted these values as a function of the estimated SNR. Because of the low *n*, the naive estimates fall well below the true value of 1. The corrected estimates, *r̂*^{2}_{ER}, are far closer to 1 (Fig. 3*A*, blue points), with an increasing upward bias for low SNR. The highest SNR units (rightmost blue points) have estimates that are close to one. For neurons with lower SNR, some of the estimates are >1 (Fig. 3*A*, small filled blue points, top left) or <0 (small filled blue points, bottom left), consistent with our simulations (Fig. 2*A*, shaded blue SD on left). The population average of the corrected estimates is close to 1.

For two example neurons, plots of the raw tuning curve values on even versus odd trials (Fig. 3) provide deeper intuition into the metric. Each point in the scatter plots represents the mean response to a particular shape on the even trials versus that on the odd trials. In both plots, by the split-half construction, all the residual variance is attributable to trial-to-trial variability. The estimator *r̂*^{2}_{ER} discounts this residual variance when estimating the correlation between the underlying tuning curves.

Overall, our validation on neural data demonstrates that our corrected estimator, *r̂*^{2}_{ER}, recovers values near the true correlation even with low *n* (only two repeats per half) and thus substantial noise.

We found that low SNR neurons tended to have unstable estimates. This is because for low SNR, the distribution of denominator estimates shifts toward zero. When estimating population-level correlation, this can be ameliorated by calculating the median (see below, Measuring population translation invariance). When an analysis focuses on correlation at the single-neuron level, truncation of the estimate of *r*^{2}_{ER} to the interval [0, 1] aids interpretation.

### Measuring population translation invariance

Translation invariance of neuronal selectivity is the degree to which a neuron maintains the same pattern of responses to a stimulus set regardless of where the stimulus set is presented in its RF. This can be quantified by measuring the correlation between the responses at one position to those at another. The naive estimator, however, will underestimate this correlation, and more severely so at RF positions where responses, and thus SNR, are weaker.

We reanalyzed data from El-Shamayleh and Pasupathy (2016), which consisted of recordings of 80 V4 neurons responding to a set of simple shapes presented at up to four positions within the RF (see above, Materials and Methods, Electrophysiological data). One potential concern is that the RFs of the neurons are diverse (Fig. 4*A*, gray traces); for some neurons the average response (across shapes) falls off quickly over space, whereas for others the mean remains high. Half of the neurons are at less than ∼80% of mean response for the farthest stimuli (Fig. 4*A*, red trace). Given the influence of SNR on the naive estimator (Fig. 2*A–C*, top row, orange traces), it is quite possible that the naive estimate of translation invariance is attenuated at peripheral RF positions. Indeed, the median naive correlation drops steeply away from the RF center (Fig. 4*B*), falling from a value of 1 (by definition) at the center to a median of 0.52 at the position immediately to the left of center and to 0.51 at the point immediately to the right of center. The median *r̂*^{2}_{ER}, in contrast, remains substantially higher at these positions (Fig. 4*C*). This reveals that for the smallest shifts away from the center, tuning remains quite similar, and much of the apparent falloff in the naive measure (Fig. 4*B*) is attributable to trial-to-trial variability. On the other hand, a substantial dropoff in tuning correlation remains for the largest offsets tested.

By focusing on neurons with a high SNR, we can find units that are truly invariant, and we can also identify units whose tuning changes reliably with position in the RF. For example, consider a high-SNR neuron (neuron 68, which had high *d*^{2}; Eqs. 1, 2; thus was strongly tuned for shapes) that had high translation invariance (Fig. 4*D*, solid blue trace is relatively flat) versus another (neuron 25) with tuning that is more sensitive to position (solid cyan trace falls off sharply). For the same fractional estimated RF shift (−0.12), one cell has near-perfect invariance, despite its RF sensitivity dropping off by ∼11% (blue dotted trace), whereas the second example cell shares only half the variance with the tuning at the center of the RF (cyan trace; RF sensitivity drops ∼20%). The important contribution of the corrected estimator here is to distinguish such genuine changes in tuning from the effects of noise. Plotting the mean responses at the shifted position against those at the RF center (Fig. 4*E*, each point corresponds to one shape), it can be seen that there is a strong linear relationship for neuron 68 (blue). Intuitively, any deviation from a line is attributable to trial-to-trial variability because a line drawn through these points could intersect the majority of SE bars. For neuron 25 (cyan), however, there is a substantial change in tuning; many shapes with points previously in the cluster near the origin have more than doubled their responses at the shifted position, whereas for points to the right of 20 spikes/s, the average response to these shapes has been roughly halved. SE bars on these points indicate that it is unlikely their deviation from a line of best fit is solely because of trial-to-trial variability.

In summary, *r̂*^{2}_{ER} shows that much of the apparent loss of translation invariance in naive measurements is attributable to trial-to-trial variability, while still identifying neurons whose tuning genuinely changes across the RF.

### Measuring invariance of shape tuning to a change in a surface property

We now demonstrate how our metric (along with its associated CIs) can be used to accurately and systematically quantify another important visual invariance, one that relates to the representation of boundary versus surface features. The neuronal data comes from a study (Popovkina et al., 2019) in which V4 neurons were tested with a set of shapes presented as either outlines (shape interior matched the background) or fills (interior was painted the same as the boundary) with the goal of determining whether responses were dictated by boundary shape alone or were dependent on interior fill. It was hypothesized that neuronal responses would have a high correlation across the transformation from fill to outline—high fill-outline invariance—if neuronal shape tuning was largely based on boundary information, but the authors found that the *r* values tended to be low, suggesting that most neurons did not show strong fill-outline (FO) invariance. This is an intriguing observation because many models differ strongly from V4 in this respect; for example, an H-Max model (Cadieu et al., 2007) fit to V4 shape tuning was found to have very high FO invariance (Popovkina et al., 2019). But to be able to accurately compare noisy neuronal data with noise-free models, we need a way to correct for noise.

Our estimator, *r̂*^{2}_{ER}, together with its CIs, provides this correction and identifies the neurons for which invariance cannot be reliably estimated.

We computed fill-outline *r̂*^{2}_{ER}, along with its CI, for all neurons; for a substantial subset of neurons, the CI was so wide as to be uninformative (Fig. 5*A*, rightmost bar). These neurons correspond to the observation in the original study that many neurons failed to show tuning for one or the other of the fill or the outline stimulus sets, and this lack of tuning means that the SNR is too low to allow for accurate estimation of the correlation value.

The cumulative distribution of the naive estimator (Fig. 5*B*, orange trace) climbs quickly along the left side of the plot, suggesting that high fill-outline invariance is rare. The distribution of *r̂*^{2}_{ER} for units with narrow CIs (Fig. 5*B*, blue line, CI < 0.5) is shifted substantially rightward. A third curve (green in Fig. 5*B*) shows the distribution of the naive estimator for just those units with narrow CIs. From this comparison, we conclude that units with measurable tuning for both fill and outline stimuli tended to sit on the higher end of the original naive distribution (green curve lies to the right of orange curve), and that our corrected values are substantially larger than the naive values (blue lies to the right of green).

Example neurons can provide some intuition into the corrected values. Figure 5*C* plots outline versus fill responses for a cell with a middling estimate of fill-outline invariance. Cells with intermediate corrected values (Fig. 5*D*,*E*) are those where some of the variance, but not all, is attributable to noise. Finally, a cell with low SNR (Fig. 5*F*) had a confidence interval covering [0, 1], consistent with the observation that it responded significantly only to filled shapes and not to outlines.

In summary, by focusing on neurons whose fill-outline *r̂*^{2}_{ER} could be estimated with narrow CIs, we find that fill-outline invariance in V4 is substantially higher than the naive estimates suggest, although tuning is still far from fully invariant.

### Nonvision example of correcting correlation between tuning curves

Our correction is important not only for studies of invariance in sensory areas of vision, audition, somatosensation, and olfaction (for references, see below, Discussion) but is highly relevant in any area of neuroscience where the similarity of noisy neural responses across experimental conditions or across neurons is examined. To provide a clear, concrete example from another field, we explain below how the study of hippocampal place field remapping is particularly susceptible to the confounds of noise.

Place fields are characterized by calculating firing rate tuning curves for neurons in the hippocampus with respect to the position of an animal in an enclosure. Much interest has focused on how these place fields change depending on the spatial and sensory context of the animals; the change has been termed remapping. To quantify the degree of remapping, the naive Pearson correlation coefficient has been used for more than three decades (Muller et al., 1987; Muller and Kubie, 1987; Bostock et al., 1991; Quirk et al., 1992; Skaggs and McNaughton, 1998; Anderson and Jeffery, 2003; Lee et al., 2004; Leutgeb et al., 2005, 2007; Fenton et al., 2008; Alme et al., 2014; Rueckemann et al., 2016; Fetterhoff et al., 2021), despite the fact that these experiments are especially susceptible to the confounds of trial-to-trial variability, which we have discussed above. The experimental conditions in this case are the discretized regions of the space that the animal explores, and their number can easily be quite high (e.g., *m* ≈ 450 for an enclosure with a diameter of 76 cm; Muller et al., 1987). At the same time, the number of repeats (*n*) is determined by the number of times an animal crosses through the same region of space, thus putting a strong limit on how much averaging can be performed. Furthermore, place fields are sparse: only a few regions in space will evoke responses; thus, SNR is suppressed by the large regions over which the neurons are silent. Commensurate with these structural difficulties, studies find that correlation of rate maps in the same experimental context (same enclosure in the same location) can be very low.
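As an illustration of the quantities involved, a rate map and its naive correlation might be computed as below. This is a generic sketch (bin counts, occupancy normalization, and smoothing conventions vary across studies and are assumptions here, not the cited authors' methods):

```python
import numpy as np

def rate_map(spike_pos, occ_pos, bins, extent):
    """Firing-rate map: spikes per tracking sample in each spatial bin.

    spike_pos, occ_pos : (k, 2) arrays of (x, y) positions at spike times
    and at all tracking samples, respectively. Bins never visited are NaN.
    (Occupancy normalization convention is an assumption of this sketch.)
    """
    spk, _, _ = np.histogram2d(spike_pos[:, 0], spike_pos[:, 1],
                               bins=bins, range=extent)
    occ, _, _ = np.histogram2d(occ_pos[:, 0], occ_pos[:, 1],
                               bins=bins, range=extent)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(occ > 0, spk / occ, np.nan)

def map_correlation(map1, map2):
    """Naive Pearson r between two rate maps over mutually visited bins;
    this is the quantity attenuated by trial-to-trial variability."""
    ok = np.isfinite(map1) & np.isfinite(map2)
    return np.corrcoef(map1[ok], map2[ok])[0, 1]
```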

Fenton et al. (2008) performed a seminal study of remapping by estimating place field maps in a large square-shaped enclosure and smaller cylinder within the square enclosure. The authors found that the correlation of firing rate maps was lower between the cylinder and square than it was between rate maps recorded in the cylinder at different times. Yet they found that this latter quantity, the correlation of a place field map to itself at a later time, was on average only *r* ≈ 0.34 (*r*^{2} ≈ 0.11; Fenton et al. 2008, their Fig. 2*B*). Assuming that place fields remain the same over short time periods (60 min in this case), a common assumption (Muller et al., 1987; Latuske et al., 2017), it follows that the vast majority of variance in these tuning curves is noise, and thus the SNR is very low. Furthermore, the distribution of these rate map self-correlation values across neurons is broad, being uniformly distributed between 0 and 0.8 (Fenton et al. 2008, their Fig. 2*B*, inset). At the same time, the distribution of map correlations between the cylinder and square are also broad with a substantial fraction exceeding *r* = 0.34. The diversity in these correlation distributions could be explained solely by differences in SNR, or there may be subsets of neurons whose rate maps change rapidly over time and others that remain the same between enclosures. Reanalysis of this dataset using our metric could answer this intriguing question.

In general, we find that most studies, including place-field remapping studies, account for the confound of noise by either minimizing the effect of noise (e.g., using longer recording times, fewer stimuli, and collecting many repeats) or by restricting their conclusion to relative statements (e.g., tuning curves or maps are more similar in this condition than that condition). In the former case, an inordinate number of repeats is collected when those trials could have been used to increase the number of experimental conditions by applying our estimator. In the latter case, results often remain ambiguous because of the possibility of changes in SNR across conditions, comparisons across studies are difficult, and there is no absolute measure of similarity between tuning curves, restricting the granularity of results.

## Discussion

We have developed a new estimator, *r̂*^{2}_{ER}, of the *r*^{2} between the tuning curves of two neurons in the absence of trial-to-trial variability, along with confidence intervals to quantify its uncertainty, and we have demonstrated its utility on simulated and neural data.

### Interpretation of estimator

We have developed an estimator of correlation between the true tuning curves of two neurons. Conceptually, this is the correlation between two estimated tuning curves if the experimentalist were able to collect an unlimited number of repeats, average them, and there were no changes in experimental conditions as the experiment stretched on into infinity. In practice, however, experimental stability is difficult to maintain for long periods (e.g., electrode position may drift). Thus a key benefit of our estimator is that fewer repeats can be collected, shortening the experiment and allowing correlation to be estimated under tighter experimental control.

### Extensions of the estimator

The correlation coefficient is the basis of several multivariate data analysis methods, including principal component analysis (when the variance is factored out), canonical correlation analysis (for review see Zhuang et al., 2020), and representational similarity analysis (for review see Kriegeskorte et al., 2008). Future work should examine the effect that the downward bias of noise on correlation has on the estimation of these quantities and whether applying a corrected estimator like the one developed here can improve inference.

Here, our derivations have assumed that all responses are independent (e.g., no correlation between responses to different stimuli). In some cases this assumption is incorrect. For example, studies have used correlation to assess the similarity of spike trains (de la Rocha et al., 2007), sometimes for invariance (Metzen et al., 2016). In this case, the neural response is not a set of values over stimuli, but a set of values over time. Neighboring responses in time are most likely correlated because of temporal correlation in the response of the system. Our method can be adapted to handle this type of comparison, and in a forthcoming publication we will extend it to arbitrary trial-to-trial correlations.

### Translation invariance

We quantified translation invariance for a V4 population as the correlation between shape tuning at the RF center versus that at offset positions. Despite translation invariance being a key component of ventral stream models (Fukushima, 1980; Olshausen et al., 1993; Salinas and Abbott, 1997; Riesenhuber and Poggio, 1999; Ullman and Soloviev, 1999), including models specific to V4 (Pasupathy and Connor, 2001; David et al., 2006; Cadieu et al., 2007; Sharpee et al., 2013), only two studies have systematically mapped V4 RFs and presented large stimulus sets at multiple positions (Nandy et al., 2013; El-Shamayleh and Pasupathy, 2016). The number of stimulus conditions required in such experiments calls for a limited number of repeats in awake preparations, causing trial-to-trial variability to dramatically obscure tuning invariance. Correction for noise is thus crucial in studies requiring a large number of unique stimuli. Using our estimator, future studies can more accurately quantify translation invariance across the ventral stream by more densely sampling the RF with more diverse stimuli at the expense of fewer repeats.

### Fill-outline invariance

We also applied our estimator to more accurately assess the invariance of V4 shape selectivity across a change in a surface property. We quantified fill-outline invariance as the correlation in the tuning curves of a neuron to a set of reference shapes with filled interiors versus the same set of shapes drawn only with outlines. Similarly to translation invariance, and consistent with the conclusions of the original study, fill-outline invariance varied substantially across neurons in V4. We found that an analysis based on the naive estimator understates this invariance for neurons with measurable tuning in both conditions.

More generally, understanding how invariance in any cortical representation differs from that in models is an important step in validating models, and our metric is critical to factor out low correlation values caused by noise. It allows one to distinguish neurons that have high invariance from those that have large changes in tuning, even in the face of noise.

### Other methods of quantifying neural invariance

Prior studies have used metrics of invariance in addition to *r*^{2}, such as the separability index.

The separability index and related metrics, however, are subject to the same attenuation by trial-to-trial variability as the naive *r*^{2}.

## Footnotes

This work was supported by National Science Foundation Graduate Research Fellowship DGE-1256082 (D.A.P.) and National Institutes of Health–National Eye Institute T32 Vision Training Grants EY07031 (D.A.P.), R01-EY029997 (W.B.), and R01-EY027023 (W.B.). We thank Dina Popovkina, Yasmine El-Shamayleh, and Anitha Pasupathy for sharing data and Anitha Pasupathy, Greg Horwitz, and Yen-Chi Chen for providing comments.

The authors declare no competing financial interests.

- Correspondence should be addressed to Dean A. Pospisil at dp4846{at}princeton.edu