Abstract
The Pearson correlation coefficient squared, r2, is an important tool in the analysis of neural data, used to quantify the similarity between neural tuning curves. Yet this metric is biased by trial-to-trial variability: as trial-to-trial variability increases, measured correlation decreases. Major lines of research are confounded by this bias, including those involving the study of invariance of neural tuning across conditions and the analysis of the similarity of tuning across neurons. To address this, we extend an estimator, $\hat{r}^2_{\mathrm{ER}}$, that removes the downward bias that trial-to-trial variability imparts on the measured correlation.
SIGNIFICANCE STATEMENT Quantifying the similarity between two sets of averaged neural responses is fundamental to the analysis of neural data. A ubiquitous metric of similarity, the correlation coefficient, is attenuated by trial-to-trial variability that arises from many irrelevant factors. Spearman recognized this problem and proposed a correction that has been extended over the past century. We show this corrected method has large asymptotic biases that can be overcome using a novel estimator. Despite the frequent use of the correlation coefficient in neuroscience, consensus on how to address this fundamental statistical issue has not been reached. We provide an accurate estimator of the correlation coefficient and apply it to gain insight into visual invariance.
Introduction
The measurement of correlation between noisy datasets is ubiquitous and plays a critical role in sensory neuroscience. The r2 between two sets of mean neural responses is fundamentally relevant to lines of research that compare tuning curves across neurons to understand functional organization and population encoding (Gawne and Richmond, 1993; Gawne et al., 1996; Cohen and Kohn, 2011; Power et al., 2011; Kiani et al., 2015) and to studies that compare tuning curves within the same neuron across stimulus transformations to understand invariance in cortical representations (Nandy et al., 2013; El-Shamayleh and Pasupathy, 2016; Kell and McDermott, 2019; Popovkina et al., 2019). Despite the importance of estimating the correlation across tuning curves, the typical estimator, the Pearson correlation coefficient computed on trial-averaged responses, is biased downward by trial-to-trial variability.
A more principled approach has been to account for trial-to-trial variability in the estimation of correlation. Correction for the attenuation of the correlation coefficient by measurement error has received considerable attention from fields outside neuroscience (Thouless, 1939; Beaton et al., 1979; Rosner and Willett, 1988; Adolph and Hardin, 2007; Saccenti et al., 2020) and has recently been adopted to estimate invariance of neural activity (Kell and McDermott, 2019). These studies followed a line of work begun by Spearman (1904), who provided a correction in the form of a multiplicative scaling of the Pearson correlation coefficient by its estimated attenuation. Here, we follow a different approach developed in the neuroscience literature (Sahani and Linden, 2003; Haefner and Cumming, 2008; Pospisil and Bair, 2021b) that sought to account for trial-to-trial variability in the estimation of the fit of a model to noisy estimated tuning curves. This line of work developed separate unbiased estimates of the numerator and denominator of the Pearson correlation coefficient. Here, we show how to extend that work to the case of measuring the correlation between two noisy, estimated tuning curves, and we call the estimator $\hat{r}^2_{\mathrm{ER}}$.
To demonstrate the usefulness of our novel estimator, we apply it to the problem of invariant cortical representation, where single-neuron selectivity is considered to be invariant if it is maintained across a transformation of the sensory input. Transformations of visual stimuli can be geometric, such as translation (Nandy et al., 2013) or scaling (El-Shamayleh and Pasupathy, 2016), or they can modify appearance by changing surface properties, for example, color or luminance (Bushnell et al., 2011; Popovkina et al., 2019). We show in neural data how $\hat{r}^2_{\mathrm{ER}}$ provides a more accurate measure of invariance than the naive correlation coefficient.
Materials and Methods
Simulation procedure
To demonstrate the bias that noise imparts on neuron-to-neuron correlation and the ability of our methods to remove this bias, we simulate the responses of neurons to m stimuli across n repeats. We generate values, $r_{i,j}$, that represent the square root (for reasons described below) of single-neuron responses to the jth repeat of the ith stimulus, drawn from a normal distribution whose variance, σ², captures trial-to-trial variability. This variance will be a critical parameter in both our simulations and our analysis of neural recordings.
For our simulations, the tuning curves for a pair of neurons, X and Y, in response to a set of m stimuli are defined to be the two sets of mean responses, μi and νi, respectively (i = 1 … m), and these are modeled as sinusoids (Fig. 1A, lines).
The tuning curve for neuron Y is simply phase shifted relative to that of neuron X to achieve the desired true value of $r^2_{\mathrm{ER}}$.
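As a concrete illustration, the following sketch simulates this procedure and shows how trial-to-trial noise attenuates the naive r² even when the two tuning curves are identical. All parameter values (m = 50 stimuli, n = 4 repeats, unit noise SD, sinusoid amplitude and offset) are hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation parameters (not from the paper).
m, n = 50, 4                     # stimuli, repeats
sigma = 1.0                      # trial-to-trial standard deviation
theta = np.linspace(0, 2 * np.pi, m, endpoint=False)
mu = 4.0 + 2.0 * np.sin(theta)   # tuning curve of neuron X
nu = mu.copy()                   # zero phase shift: true r^2 = 1

def naive_r2(x, y):
    """Squared Pearson correlation between two mean tuning curves."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) ** 2 / ((xc @ xc) * (yc @ yc))

# Repeatedly draw n noisy repeats per stimulus, average them, and
# correlate the resulting sample tuning curves.
r2_vals = [
    naive_r2(rng.normal(mu, sigma, (n, m)).mean(axis=0),
             rng.normal(nu, sigma, (n, m)).mean(axis=0))
    for _ in range(1000)
]

# Although the true tuning curves are identical (true r^2 = 1), the
# measured r^2 falls well below 1 because of trial-to-trial variability.
mean_r2 = float(np.mean(r2_vals))
```

Increasing n (more averaging) or the sinusoid amplitude (more signal) moves the measured value back toward 1, which is exactly the SNR dependence the text describes.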
To validate potential estimators for $r^2_{\mathrm{ER}}$, we compare their estimates, computed from the simulated noisy responses, against this known true value.
Assumptions for the derivation of unbiased estimators
To simplify the derivation of unbiased estimators for $r^2_{\mathrm{ER}}$, we assume that responses are statistically independent across stimuli, repeats, and neurons, and that a variance-stabilizing transformation (the square root, above) has equalized the trial-to-trial variance, σ², across stimuli and neurons.
Unbiasing $r^2$
In the case where both X and Y are equal-variance stochastic responses, the naive squared correlation between their trial-averaged tuning curves is the following:
$$\hat{r}^2 = \frac{\left[\sum_{i=1}^{m}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)\right]^2}{\sum_{i=1}^{m}\left(X_i-\bar{X}\right)^2\,\sum_{i=1}^{m}\left(Y_i-\bar{Y}\right)^2} \quad (6)$$
This should be compared with the target quantity:
$$r^2_{\mathrm{ER}} = \frac{\left[\sum_{i=1}^{m}\left(\mu_i-\bar{\mu}\right)\left(\nu_i-\bar{\nu}\right)\right]^2}{\sum_{i=1}^{m}\left(\mu_i-\bar{\mu}\right)^2\,\sum_{i=1}^{m}\left(\nu_i-\bar{\nu}\right)^2} \quad (7)$$
In our approach, we will unbias the numerator and denominator of Equation 6 separately. The expected value of the numerator is the following:
$$E\!\left[\left(\sum_i\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)\right)^{\!2}\right] = \left[\sum_i\left(\mu_i-\bar{\mu}\right)\left(\nu_i-\bar{\nu}\right)\right]^2 + \frac{\sigma^2}{n}\left[\sum_i\left(\mu_i-\bar{\mu}\right)^2 + \sum_i\left(\nu_i-\bar{\nu}\right)^2\right] + (m-1)\frac{\sigma^4}{n^2}$$
The denominator of Equation 6 consists of two factors that are each scaled noncentral chi-squared distributions as follows:
$$\sum_i\left(X_i-\bar{X}\right)^2 \sim \frac{\sigma^2}{n}\,\chi^2_{m-1}\!\left(\lambda_X\right), \qquad \sum_i\left(Y_i-\bar{Y}\right)^2 \sim \frac{\sigma^2}{n}\,\chi^2_{m-1}\!\left(\lambda_Y\right),$$
with noncentrality parameters $\lambda_X = \frac{n}{\sigma^2}\sum_i\left(\mu_i-\bar{\mu}\right)^2$ and $\lambda_Y = \frac{n}{\sigma^2}\sum_i\left(\nu_i-\bar{\nu}\right)^2$.
Thus, the expected values of these factors are as follows:
$$E\!\left[\sum_i\left(X_i-\bar{X}\right)^2\right] = \sum_i\left(\mu_i-\bar{\mu}\right)^2 + (m-1)\frac{\sigma^2}{n}, \qquad E\!\left[\sum_i\left(Y_i-\bar{Y}\right)^2\right] = \sum_i\left(\nu_i-\bar{\nu}\right)^2 + (m-1)\frac{\sigma^2}{n}.$$
Because these two random variables are independent, the expected value of the denominator, which is their product, is the product of these expected values as follows:
$$\begin{aligned}
E\!\left[\sum_i\left(X_i-\bar{X}\right)^2\sum_i\left(Y_i-\bar{Y}\right)^2\right] &= \left[\sum_i\left(\mu_i-\bar{\mu}\right)^2 + (m-1)\frac{\sigma^2}{n}\right]\left[\sum_i\left(\nu_i-\bar{\nu}\right)^2 + (m-1)\frac{\sigma^2}{n}\right] \\
&= \sum_i\left(\mu_i-\bar{\mu}\right)^2\sum_i\left(\nu_i-\bar{\nu}\right)^2 \\
&\quad + (m-1)\frac{\sigma^2}{n}\left[\sum_i\left(\mu_i-\bar{\mu}\right)^2 + \sum_i\left(\nu_i-\bar{\nu}\right)^2\right] + (m-1)^2\frac{\sigma^4}{n^2}.
\end{aligned}$$
The first term (second line of this equation) is the desired denominator (that of Eq. 7), whereas the second term (third line) is the bias. To remove this bias, an unbiased estimate of it can be subtracted from the denominator of Equation 6.
Estimators of bias terms
To compute the bias terms to be subtracted from the numerator and denominator, three unknown quantities need to be estimated: the two dynamic range values, $d^2_X = \sum_i\left(\mu_i-\bar{\mu}\right)^2$ and $d^2_Y = \sum_i\left(\nu_i-\bar{\nu}\right)^2$, and the trial-to-trial variance, σ².
In the case of the dynamic range values, the naive sample estimator, $\sum_i\left(X_i-\bar{X}\right)^2$, is biased upward: trial-to-trial noise inflates its expectation by $(m-1)\sigma^2/n$.
Thus, an unbiased estimator is the following:
$$\hat{d}^2_X = \sum_i\left(X_i-\bar{X}\right)^2 - (m-1)\frac{\hat{\sigma}^2}{n},$$
and similarly for $\hat{d}^2_Y$.
For the case of the sample variance, if we have n repeated trials, we can calculate the sample variance over those trials; then, because the variance is the same across stimuli and neurons (assuming a variance-stabilizing transformation), we can average these across all stimuli and both neurons for a global estimate as follows:
$$\hat{\sigma}^2 = \frac{1}{2m}\sum_{Z\in\{X,Y\}}\sum_{i=1}^{m}\frac{1}{n-1}\sum_{j=1}^{n}\left(Z_{i,j}-\bar{Z}_i\right)^2.$$
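A minimal sketch of this pooled variance estimate on simulated data (all parameter values here are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, sigma = 100, 4, 1.5       # hypothetical: stimuli, repeats, noise SD

# Trial matrices for two neurons: rows = stimuli, columns = repeats,
# with equal trial-to-trial variance, as the transformed model assumes.
X = rng.normal(rng.uniform(0, 5, m)[:, None], sigma, (m, n))
Y = rng.normal(rng.uniform(0, 5, m)[:, None], sigma, (m, n))

# Per-stimulus sample variances across repeats (ddof=1 for unbiasedness),
# pooled over all stimuli and both neurons into one global estimate.
sigma2_hat = float(np.concatenate([X.var(axis=1, ddof=1),
                                   Y.var(axis=1, ddof=1)]).mean())
```

Pooling 2m sample variances makes the global estimate far more stable than any single per-stimulus variance computed from only n trials.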
Confidence intervals for $\hat{r}^2_{\mathrm{ER}}$
To quantify uncertainty about the estimate $\hat{r}^2_{\mathrm{ER}}$, we developed a method for computing confidence intervals that we refer to as ECCI.
ECCI finds confidence intervals centered around the estimate of $r^2_{\mathrm{ER}}$.
Spearman's correction for attenuation
Spearman (1904) noted that in the case of measurements of two quantities with some underlying true correlation but with additive independent measurement error, the measured correlation would tend to be less than the true correlation. He provided the analytic expression for the attenuation, A (using the notation of Saccenti et al., 2020), under a bivariate normal model as follows:
$$A = \sqrt{\frac{\sigma^2_X}{\sigma^2_X + \sigma^2_{\epsilon_X}}\cdot\frac{\sigma^2_Y}{\sigma^2_Y + \sigma^2_{\epsilon_Y}}} \quad (21)$$
Here, $\sigma^2_X$ and $\sigma^2_Y$ are the variances of the true underlying quantities, and $\sigma^2_{\epsilon_X}$ and $\sigma^2_{\epsilon_Y}$ are the variances of the additive measurement errors; the expected measured correlation is the true correlation multiplied by A.
The method to correct the attenuation is straightforward: find an estimate for A and multiply the estimated correlation, ρ0, by the inverse of this estimate. Spearman does not specify estimators for the unknowns in Equation 21; however, Adolph and Hardin (2007) use sample variance to estimate both the within-condition variance (the measurement error term) and the total variance, yielding a plug-in estimate of A.
We will square this estimator to compare it to ours (see below, Results). Because the correction is a multiplicative rescaling of the naive correlation, any bias in the estimate of A propagates directly into the corrected value.
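For concreteness, here is a sketch of this style of correction, assuming the classical attenuation form A = sqrt(R_X · R_Y) with reliabilities estimated from sample variances. The function name and implementation details are our paraphrase for illustration, not the published code of Spearman (1904) or Adolph and Hardin (2007).

```python
import numpy as np

def spearman_corrected_r(Xt, Yt):
    """Spearman-style attenuation correction (illustrative sketch).

    Xt, Yt: (m, n) arrays of per-trial responses (m stimuli, n repeats).
    Estimates the attenuation A from sample variances and rescales the
    naive correlation of the trial-averaged tuning curves by 1/A.
    """
    m, n = Xt.shape
    x, y = Xt.mean(axis=1), Yt.mean(axis=1)
    r_naive = np.corrcoef(x, y)[0, 1]
    # Error variance of each trial-averaged mean: within-condition
    # variance divided by the number of repeats.
    ex = Xt.var(axis=1, ddof=1).mean() / n
    ey = Yt.var(axis=1, ddof=1).mean() / n
    # Signal variance of each tuning curve: total variance of the means
    # minus the estimated noise contribution.
    sx = x.var(ddof=1) - ex
    sy = y.var(ddof=1) - ey
    # Attenuation factor A < 1 whenever noise is present.
    A = np.sqrt((sx / (sx + ex)) * (sy / (sy + ey)))
    return r_naive / A
```

Because A is estimated with noise and the correction is applied as a single multiplicative rescaling, this estimator inherits the limitations discussed in Results; it is included here only as the baseline for comparison.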
Electrophysiological data
To demonstrate the use of our estimator, we reanalyzed data from three previous single-unit, extracellular studies of parafoveal V4 neurons in the awake, fixating rhesus monkey (Macaca mulatta) of both sexes. Data from the first study (Pasupathy and Connor, 2001) consist of the responses of 109 V4 neurons to a set of 362 shapes. There were typically three to five repeats of each stimulus, and we examined only the 96 cells that had at least four repeats for all stimuli. We computed the spike count for each trial during the 500 ms stimulus presentation. From a second study (El-Shamayleh and Pasupathy, 2016), we reanalyzed responses of 80 neurons tested for translation invariance using shapes like those of the first study but where the position of the stimuli within the receptive field (RF) was also varied. Each neuron was tested with up to 56 shapes (some of which are rotations of others) presented at three to five positions within the RF. Each unique combination of stimulus and RF position was presented for 5–16 repeats, and spike counts were averaged over the 300 ms stimulus presentation. Data from the third study (Popovkina et al., 2019) consist of the responses of 42 V4 neurons using shapes like those of the first study except in two conditions, fill and outline. In the fill condition, the interior of the shape was the same color as its outline. In the outline condition, the interior of the shape was the same color as the background. Stimuli were presented for 300 ms with a 200 ms blank interval preceding each. For each neuron, stimuli were presented for at least three repeats. Experimental protocols for all studies are described in detail in the original publications.
Experimental design and statistical analysis
The experimental design is described above in Electrophysiological data, and a full description is provided in the original studies (Pasupathy and Connor, 2001; El-Shamayleh and Pasupathy, 2016; Popovkina et al., 2019). The statistical analysis of the data is described in detail above in the following sections: Unbiasing $r^2$, Estimators of bias terms, and Confidence intervals for $\hat{r}^2_{\mathrm{ER}}$.
Data availability
Code for calculating estimates of $\hat{r}^2_{\mathrm{ER}}$ and its confidence intervals is publicly available.
Results
Our results are organized as follows. We first describe the source of the bias in the naive estimator, $\hat{r}^2$, and our strategy for removing it; we then validate the corrected estimator on simulated and neural data and apply it to measure invariance in V4.
Consider a typical scenario in sensory neuroscience where the responses of two neurons to n repeated presentations of m stimuli have been collected, and the averages of these responses are compared (Fig. 1A). Even if the underlying neuronal tuning curves (Fig. 1A, blue and red lines) were perfectly correlated, the m sample averages, Yi and Xi (blue and red open circles), will deviate from their expected values because of trial-to-trial variability, σ², scaled by 1/n. Thus, the observed correlation between the sample tuning curves will fall below the true correlation between the underlying tuning curves.
The quantity we attempt to estimate here is the correlation coefficient r2 between the expected values, μi and νi, of the responses of the two neurons, that is, between the tuning curves in the absence of noise. We will call this quantity $r^2_{\mathrm{ER}}$.
It is tempting to estimate this quantity by applying the naive sample estimator to the trial-averaged responses as follows:
$$\hat{r}^2 = \frac{\left[\sum_{i=1}^{m}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)\right]^2}{\sum_{i=1}^{m}\left(X_i-\bar{X}\right)^2\,\sum_{i=1}^{m}\left(Y_i-\bar{Y}\right)^2},$$
but trial-to-trial variability introduces noise terms into both its numerator and its denominator.
To solve this problem, we take the straightforward strategy of finding unbiased estimators of these noise terms and subtracting them from the numerator and denominator of $\hat{r}^2$.
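This strategy can be sketched in code as follows, under the equal-variance Gaussian assumptions of Materials and Methods. This is an illustrative reimplementation of the subtract-the-bias logic, not the authors' released code, and the function name `r2_er` is our own.

```python
import numpy as np

def r2_er(Xt, Yt):
    """Sketch of a noise-corrected r^2 between two tuning curves.

    Xt, Yt: (m, n) trial matrices (m stimuli, n repeats). Estimates the
    noise bias in the numerator and denominator of the naive r^2 and
    subtracts it. Assumes equal trial-to-trial variance across stimuli
    and neurons (e.g., after a square-root transform).
    """
    m, n = Xt.shape
    x, y = Xt.mean(axis=1), Yt.mean(axis=1)
    xc, yc = x - x.mean(), y - y.mean()
    # Global noise variance, pooled over stimuli and both neurons.
    s2 = np.concatenate([Xt.var(axis=1, ddof=1),
                         Yt.var(axis=1, ddof=1)]).mean()
    # Unbiased dynamic-range estimates: subtract the noise contribution.
    d2x = xc @ xc - (m - 1) * s2 / n
    d2y = yc @ yc - (m - 1) * s2 / n
    # Numerator: squared covariance term minus its noise-induced bias.
    num = (xc @ yc) ** 2 - (s2 / n) * (d2x + d2y) - (m - 1) * (s2 / n) ** 2
    # Denominator: product of debiased dynamic ranges.
    return num / (d2x * d2y)
```

On simulated data with perfectly correlated tuning curves, this corrected ratio averages near 1, whereas the naive estimator plateaus well below 1 at low SNR.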
Validation of estimator by simulation
To test the ability of $\hat{r}^2_{\mathrm{ER}}$ to recover the true correlation, we applied it to simulated responses across a range of SNRs, stimulus counts (m), and repeat counts (n; see above, Materials and Methods, Simulation procedure).
Comparison to Spearman's correction
To demonstrate how our estimator, $\hat{r}^2_{\mathrm{ER}}$, improves on existing approaches, we compare it with the naive estimator and with estimators based on Spearman's correction for attenuation (see above, Materials and Methods, Spearman's correction for attenuation).
We compare the bias of these four estimators using simulated data in which we vary the number of stimuli m and the SNR for both neurons while holding the number of repeats constant at n = 4 and the true correlation, $r^2_{\mathrm{ER}}$, fixed.
Although we have found our corrected estimators can effectively reduce bias, they also typically have higher variability than the prior estimators, which can lead to overall higher mean squared error (MSE, a common measure of overall estimator accuracy). Here, we explore the trade-off between bias and variance across different numbers of repeats (n) and stimuli (m; Fig. 2, bottom row). When we set the number of repeats to be n = 4 (Fig. 2D–F), we found that for very low m, the MSE was highest for our corrected estimators and the naive estimator.
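The trade-off above can be made explicit with the standard decomposition of MSE into squared bias plus variance:

```latex
\mathrm{MSE}(\hat{\theta})
  \;=\; \mathbb{E}\!\left[\left(\hat{\theta}-\theta\right)^{2}\right]
  \;=\; \underbrace{\left(\mathbb{E}[\hat{\theta}]-\theta\right)^{2}}_{\text{squared bias}}
  \;+\; \underbrace{\mathrm{Var}\!\left(\hat{\theta}\right)}_{\text{variance}}
```

An estimator that removes bias can therefore still have higher MSE if the correction inflates variance, which is what occurs for very small m.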
Split trial validation for V4 data
We have shown that our estimator, $\hat{r}^2_{\mathrm{ER}}$, removes the bias of the naive estimator in simulation; we now validate it on neural data with a split-trial analysis of V4 responses.
We first computed the naive $\hat{r}^2$ between tuning curves estimated from the even trials and the odd trials of the same neuron, a case in which the true underlying correlation is 1.
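A sketch of the split-half computation (the function name and simulation parameters are hypothetical):

```python
import numpy as np

def split_half_r2(trials):
    """Naive r^2 between tuning curves from even vs. odd trials.

    trials: (m, n) response matrix (m stimuli, n repeats). Because both
    halves come from the same neuron, the true underlying correlation
    is 1, and any shortfall from 1 is attributable purely to
    trial-to-trial variability.
    """
    even = trials[:, 0::2].mean(axis=1)
    odd = trials[:, 1::2].mean(axis=1)
    return np.corrcoef(even, odd)[0, 1] ** 2
```

Applied to simulated responses with a known tuning curve, the split-half naive r² falls reliably below 1, reproducing the attenuation the split-trial analysis is designed to expose.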
For two example neurons, plots of the raw tuning curve values on even versus odd trials (Fig. 3) provide deeper intuition into the metric. Each point in the scatter plots represents the mean response to a particular shape on the even trials versus that on the odd trials. In both plots, by the split-half construction, all the residual variance is attributable to trial-to-trial variability. The estimator $\hat{r}^2_{\mathrm{ER}}$ accounts for this residual variance and thus returns values near 1, as expected when a tuning curve is compared with itself.
Overall, our validation on simulated data demonstrates that our corrected estimator removes the bias that trial-to-trial variability imparts on the naive estimator, and our split-trial analysis shows that this correction behaves as expected on real V4 responses.
We found that low-SNR neurons tended to have unstable estimates. This is because for low SNR, the distribution of denominator estimates shifts toward zero. When estimating population-level correlation, this can be ameliorated by calculating the median (see below, Measuring population translation invariance). When an analysis focuses on correlation at the single-neuron level, truncation of the estimate of $r^2_{\mathrm{ER}}$ to its valid range of [0, 1] can be applied.
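In code, these two remedies amount to a median for population summaries and clipping for single-neuron values (the estimates below are hypothetical, chosen to include the kind of out-of-range values unstable denominators produce):

```python
import numpy as np

# Hypothetical per-neuron corrected estimates; low-SNR neurons can yield
# values outside [0, 1] when the denominator estimate is near zero.
est = np.array([0.82, 0.95, 1.7, -0.3, 0.6, 0.74, 3.2])

# Population summary: the median is robust to a few unstable estimates.
population_level = float(np.median(est))

# Single-neuron analysis: truncate to the valid range of r^2.
per_neuron = np.clip(est, 0.0, 1.0)
```

The mean of the raw estimates would be pulled far upward by the unstable values, whereas the median stays with the bulk of the population.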
Measuring population translation invariance
Translation invariance of neuronal selectivity is the degree to which a neuron maintains the same pattern of responses to a stimulus set regardless of where the stimulus set is presented in its RF. This can be quantified by measuring the correlation between the responses at one position to those at another. The naive $\hat{r}^2$ will systematically underestimate this correlation, and thus translation invariance, particularly for neurons with low SNR.
We reanalyzed data from El-Shamayleh and Pasupathy (2016), which consisted of recordings of 80 V4 neurons responding to a set of simple shapes presented at up to four positions within the RF (see above, Materials and Methods, Electrophysiological data). One potential concern is that the RFs of the neurons are diverse (Fig. 4A, gray traces); for some neurons the average response (across shapes) falls off quickly over space, whereas for others the mean remains high. Half of the neurons are at less than ∼80% of mean response for the farthest stimuli (Fig. 4A, red trace). Given the influence of SNR on $\hat{r}^2$, this diversity in RF profiles could masquerade as diversity in translation invariance under the naive analysis.
By focusing on neurons with a high SNR, we can find units that are truly invariant, and we can also identify units that have tuning that changes reliably with position in the RF. For example, consider a high-SNR neuron (neuron 68, which had a high dynamic range, d²; Eqs. 1, 2; thus was strongly tuned for shapes) that had high translation invariance (Fig. 4D, solid blue trace is relatively flat) versus another (neuron 25) with tuning that is more sensitive to position (solid cyan trace falls off sharply). For the same estimated fractional RF shift (−0.12), one cell has near-perfect invariance despite its RF sensitivity dropping off by ∼11% (blue dotted trace), whereas the second example cell shares only half the variance with the tuning at the center of the RF (cyan trace, RF sensitivity drops ∼20%). The important contribution of the $\hat{r}^2_{\mathrm{ER}}$ metric here is that it separates genuine changes in tuning from apparent changes that merely reflect a drop in SNR.
In summary, $\hat{r}^2_{\mathrm{ER}}$ allows translation invariance to be measured on an absolute scale, uncontaminated by the diversity of SNR across the population.
Measuring invariance of shape tuning to a change in a surface property
We now demonstrate how our metric (along with its associated CIs) can be used to accurately and systematically quantify another important visual invariance, one that relates to the representation of boundary versus surface features. The neuronal data come from a study (Popovkina et al., 2019) in which V4 neurons were tested with a set of shapes presented as either outlines (shape interior matched the background) or fills (interior was painted the same as the boundary) with the goal of determining whether responses were dictated by boundary shape alone or were dependent on interior fill. It was hypothesized that neuronal responses would have a high correlation across the transformation from fill to outline, that is, high fill-outline invariance, if neuronal shape tuning was largely based on boundary information, but the authors found that the r values tended to be low, suggesting that most neurons did not show strong fill-outline (FO) invariance. This is an intriguing observation because many models differ strongly from V4 in this respect; for example, an H-Max model (Cadieu et al., 2007) fit to V4 shape tuning was found to have very high FO invariance (Popovkina et al., 2019). But to be able to accurately compare noisy neuronal data with noise-free models, we need a way to correct for noise.
Our $\hat{r}^2_{\mathrm{ER}}$ estimator provides exactly this correction.
We computed the fill-outline $\hat{r}^2_{\mathrm{ER}}$, along with its naive counterpart, for each of the 42 neurons in the dataset (see above, Materials and Methods, Electrophysiological data).
The cumulative distribution of the naive $\hat{r}^2$ values sits well below that of the corrected $\hat{r}^2_{\mathrm{ER}}$ values, showing that the naive estimator understates fill-outline invariance across the population.
Example neurons can provide some intuition into the corrected values. Figure 5C plots outline versus fill responses for a cell with a middling $\hat{r}^2_{\mathrm{ER}}$.
In summary, by focusing on $\hat{r}^2_{\mathrm{ER}}$ rather than the naive estimator, we can conclude that the low fill-outline invariance observed in V4 reflects genuine differences in tuning rather than attenuation by noise.
Nonvision example of correcting correlation between tuning curves
Our correction is important not only for studies of invariance in sensory areas of vision, audition, somatosensation, and olfaction (for references, see below, Discussion) but is highly relevant in any area of neuroscience where the similarity of noisy neural responses across experimental conditions or across neurons is examined. To provide a clear, concrete example from another field, we explain below how the study of hippocampal place field remapping is particularly susceptible to the confounds of noise.
Place fields are characterized by calculating firing rate tuning curves for neurons in the hippocampus with respect to the position of an animal in an enclosure. Much interest has focused on how these place fields change depending on the spatial and sensory context of the animals; the change has been termed remapping. To quantify the degree of remapping, the naive Pearson correlation coefficient has been used for more than three decades (Muller et al., 1987; Muller and Kubie, 1987; Bostock et al., 1991; Quirk et al., 1992; Skaggs and McNaughton, 1998; Anderson and Jeffery, 2003; Lee et al., 2004; Leutgeb et al., 2005, 2007; Fenton et al., 2008; Alme et al., 2014; Rueckemann et al., 2016; Fetterhoff et al., 2021), despite the fact that these experiments are especially susceptible to the confounds of trial-to-trial variability, which we have discussed above. The experimental conditions in this case are the discretized regions of the space that the animal explores, and their number can easily be quite large (e.g., m ≈ 450 for an enclosure with a diameter of 76 cm; Muller et al., 1987). At the same time, the number of repeats (n) is determined by the number of times an animal crosses through the same region of space, thus putting a strong limit on how much averaging can be performed. Furthermore, place fields are sparse: only a few regions in space evoke responses; thus, SNR is suppressed by the large regions over which the neurons are silent. Commensurate with these structural difficulties, studies find that correlation of rate maps in the same experimental context (same enclosure in the same location) can be very low.
Fenton et al. (2008) performed a seminal study of remapping by estimating place field maps in a large square-shaped enclosure and smaller cylinder within the square enclosure. The authors found that the correlation of firing rate maps was lower between the cylinder and square than it was between rate maps recorded in the cylinder at different times. Yet they found that this latter quantity, the correlation of a place field map to itself at a later time, was on average only r ≈ 0.34 (r2 ≈ 0.11; Fenton et al. 2008, their Fig. 2B). Assuming that place fields remain the same over short time periods (60 min in this case), a common assumption (Muller et al., 1987; Latuske et al., 2017), it follows that the vast majority of variance in these tuning curves is noise, and thus the SNR is very low. Furthermore, the distribution of these rate map self-correlation values across neurons is broad, being uniformly distributed between 0 and 0.8 (Fenton et al. 2008, their Fig. 2B, inset). At the same time, the distribution of map correlations between the cylinder and square are also broad with a substantial fraction exceeding r = 0.34. The diversity in these correlation distributions could be explained solely by differences in SNR, or there may be subsets of neurons whose rate maps change rapidly over time and others that remain the same between enclosures. Reanalysis of this dataset using our metric could answer this intriguing question.
In general, we find that most studies, including place-field remapping studies, account for the confound of noise by either minimizing the effect of noise (e.g., using longer recording times, fewer stimuli, and collecting many repeats) or by restricting their conclusions to relative statements (e.g., tuning curves or maps are more similar in this condition than that condition). In the former case, an inordinate number of repeats is collected when those trials could have been used to increase the number of experimental conditions by applying our estimator. In the latter case, results often remain ambiguous because of the possibility of changes in SNR across conditions, comparisons across studies are difficult, and there is no absolute measure of similarity between tuning curves, restricting the granularity of results.
Discussion
We have developed a new estimator, $\hat{r}^2_{\mathrm{ER}}$, of the correlation between the tuning curves of two neurons that corrects for the attenuation caused by trial-to-trial variability.
Interpretation of estimator
We have developed an estimator of correlation between the true tuning curves of two neurons. Conceptually, this is the correlation between two estimated tuning curves if the experimentalist were able to collect an unlimited number of repeats, average them, and there were no changes in experimental conditions as the experiment stretched on into infinity. In practice, however, experimental stability is difficult to maintain for long periods (e.g., electrode position may drift). Thus a key benefit of our estimator is that fewer repeats can be collected, shortening the experiment and allowing correlation to be estimated under tighter experimental control.
Extensions of the estimator
The correlation coefficient is the basis of several multivariate data analysis methods, including principal component analysis (when the variance is factored out), canonical correlation analysis (for review see Zhuang et al., 2020), and representational similarity analysis (for review see Kriegeskorte et al., 2008). Future work should examine the effect that the downward bias of noise on correlation has on the estimation of these quantities and whether applying a corrected estimator like the one developed here can improve inference.
Here, our derivations have assumed that all responses are independent (e.g., no correlation between responses to different stimuli). In some cases this assumption is incorrect. For example, studies have used correlation to assess the similarity of spike trains (de la Rocha et al., 2007), sometimes for invariance (Metzen et al., 2016). In this case, the neural response is not a set of values over stimuli, but a set of values over time. Neighboring responses in time are most likely correlated because of temporal correlation in the response of the system. Our method can be adapted to handle this type of comparison, and in a forthcoming publication we will extend it to arbitrary trial-to-trial correlations.
Translation invariance
We quantified translation invariance for a V4 population as the correlation between shape tuning at the RF center versus that at offset positions. Despite translation invariance being a key component of ventral stream models (Fukushima, 1980; Olshausen et al., 1993; Salinas and Abbott, 1997; Riesenhuber and Poggio, 1999; Ullman and Soloviev, 1999), including models specific to V4 (Pasupathy and Connor, 2001; David et al., 2006; Cadieu et al., 2007; Sharpee et al., 2013), only two studies have systematically mapped V4 RFs and presented large stimulus sets at multiple positions (Nandy et al., 2013; El-Shamayleh and Pasupathy, 2016). The number of stimulus conditions required in such experiments calls for a limited number of repeats in awake preparations, causing trial-to-trial variability to dramatically obscure tuning invariance. Correction for noise is thus crucial in studies requiring a large number of unique stimuli. Using our estimator, future studies can more accurately quantify translation invariance across the ventral stream by more densely sampling the RF with more diverse stimuli at the expense of fewer repeats.
Fill-outline invariance
We also applied our estimator to more accurately assess the invariance of V4 shape selectivity across a change in a surface property. We quantified fill-outline invariance as the correlation in the tuning curves of a neuron to a set of reference shapes with filled interiors versus the same set of shapes drawn only with outlines. Similarly to translation invariance, and consistent with the conclusions of the original study, fill-outline invariance varied substantially across neurons in V4. We found that an analysis based on the naive
More generally, understanding how invariance in any cortical representation differs from that in models is an important step in validating models, and our metric is critical to factor out low correlation values caused by noise. It allows one to distinguish neurons that have high invariance from those that have large changes in tuning, even in the face of noise.
Other methods of quantifying neural invariance
Prior studies have used metrics of invariance in addition to the correlation coefficient.
The separability index and related metrics are likewise susceptible to bias from trial-to-trial variability.
Footnotes
This work was supported by National Science Foundation Graduate Research Fellowship DGE-1256082 (D.A.P.) and National Institutes of Health–National Eye Institute T32 Vision Training Grants EY07031 (D.A.P.), R01-EY029997 (W.B.), and R01-EY027023 (W.B.). We thank Dina Popovkina, Yasmine El-Shamayleh, and Anitha Pasupathy for sharing data and Anitha Pasupathy, Greg Horwitz, and Yen-Chi Chen for providing comments.
The authors declare no competing financial interests.
- Correspondence should be addressed to Dean A. Pospisil at dp4846{at}princeton.edu