The similarity structure of distributed neural responses reveals the multiple representations of letters
Introduction
Reading and spelling, as relatively recent additions to the human skill repertoire, presumably make use of evolutionarily older neural circuitry in order to represent the letter shapes, letter names and motor plans that are necessary for reading and writing. Consistent with this, extensive functional neuroimaging research has identified large-scale networks within visual, language and motor areas of the brain that are reliably recruited for reading and spelling. However, despite considerable consistency in the findings at a general level, there is far less consensus regarding how the component processes and representations are specifically instantiated within these networks. Various approaches have been taken for investigating these more detailed questions, each with specific strengths and weaknesses, as we discuss below. In the research we report on here, we directly address the question of the representational format of the neural codes used in reading by applying an MVPA–RSA searchlight analysis to fMRI data collected from subjects viewing single letters. We use this approach to identify the neuro-topographic distribution of the multiple codes of letters: abstract letter identities, visual letter shapes, letter names and motor programs for writing letter shapes. The findings of this research allow us to address long-standing cognitive science questions regarding the types of representations used in reading as well as neuroscience questions regarding the neural instantiation of these representations.
Multivariate Pattern Analysis (MVPA) (Kriegeskorte et al., 2008) is based on the premise that the informational content of neural representations is distributed across a population of neuronal units and, therefore, that stimuli that are representationally similar will generate similar response patterns across the neuronal units within a relevant brain region. The most common applications of MVPA involve the use of classification algorithms (Kriegeskorte, 2011) to determine if the patterns of responses within some brain region contain sufficient information to distinguish between two (or more) classes of stimuli (e.g., words vs. false fonts). However, instead of MVPA classification, in this work we use an MVPA Representational Similarity Analysis (MVPA–RSA) approach to model testing and comparison. This approach allows a comparison, within neural regions, of the observed similarity/dissimilarity structure of the voxel response patterns (e.g., to letter stimuli) with different quantitative models of the patterns that would be predicted if a region were sensitive to a specific type of representation (e.g., for letters: abstract, visual, phonological, or motoric). Furthermore, we specifically use a searchlight approach rather than the region of interest (ROI) approach that previously has been commonly used, even with RSA. We do so because the searchlight approach allows for model testing in a topographically neutral manner as a searchlight volume systematically examines large swathes of the brain.
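The model-comparison logic described above can be illustrated with a minimal sketch. This is not the analysis code used in the study: the function names, the use of Pearson correlation for both the observed similarity matrix and the model fit, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def observed_rsm(patterns):
    """Pearson correlation between every pair of stimulus response patterns.

    patterns: array of shape (n_stimuli, n_voxels).
    """
    return np.corrcoef(patterns)


def off_diagonal(rsm):
    """Unique off-diagonal entries (upper triangle) of a square similarity matrix."""
    rows, cols = np.triu_indices(rsm.shape[0], k=1)
    return rsm[rows, cols]


def model_fit(orsm, prsm):
    """Correlation between the observed and the predicted similarity structure."""
    return float(np.corrcoef(off_diagonal(orsm), off_diagonal(prsm))[0, 1])


# Toy voxel patterns (hypothetical): stimuli 0 and 1 share one underlying
# pattern, stimuli 2 and 3 share another.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
toy_patterns = np.array([
    np.sin(t) + 0.1 * np.sin(3 * t),
    np.sin(t) + 0.1 * np.sin(5 * t),
    np.cos(t) + 0.1 * np.cos(3 * t),
    np.cos(t) + 0.1 * np.cos(5 * t),
])

# Two competing predicted similarity matrices (hypothetical models):
# model A groups stimuli {0, 1} and {2, 3}; model B groups {0, 2} and {1, 3}.
prsm_a = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)
prsm_b = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]], dtype=float)

toy_orsm = observed_rsm(toy_patterns)
fit_a = model_fit(toy_orsm, prsm_a)
fit_b = model_fit(toy_orsm, prsm_b)
print(f"model A fit: {fit_a:.2f}, model B fit: {fit_b:.2f}")
```

In this toy case the model that matches how the patterns were constructed fits better than the alternative, which is the kind of comparison the RSA approach relies on.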
Most functional neuroimaging research on orthographic processing has used words and other types of letter and letter-like strings as stimuli and has been directed at questions regarding the orthographic specificity of left ventral temporal-occipital cortex (e.g., Baker et al., 2007, Dehaene and Cohen, 2011), the unit-size of orthographic representations in this region (e.g., words or sub-lexical units) (Glezer et al., 2009, Nestor et al., 2013, Vinckier et al., 2007) or at questions concerning the format of orthographic representations within this area (Dehaene et al., 2001, Dehaene et al., 2004, Polk and Farah, 2002). In our work we do not address the “unit size” question, focusing instead on the question of representational format and content, without limiting ourselves to the ventral temporal-occipital region.
As we discuss below, the use of word and word-like stimuli raises specific interpretational challenges and so, to circumvent some of these, we use single letters as stimuli. A key advantage to single letter stimuli is that the multiple representations of letters are well-defined, dissociable and provide clear predictions for an RSA approach. Specifically, letters have characteristic visual shapes, spoken names and motor plans and these feature dimensions are dissociable in the sense that letters with similar visual shapes (e.g., A/R) may have different sounding names and motor plans, etc. In addition, many theories of reading and spelling posit abstract letter representations (ALIs) (Brunsdon et al., 2006, Jackson and Coltheart, 2001) that serve to unify and mediate between the different cases, fonts and modality-specific formats, such that E, e, and /i/ all correspond to precisely the same abstract representation (Fig. 1). For these reasons, an MVPA–RSA investigation of neural responses to single letters is well-suited to addressing questions of representational format. The prediction is that if there are regions specifically tuned to abstract letter identities, letter shapes, names or motor plans they should produce similar neural responses for letters that are similar along one of these dimensions but not for letters that are similar along others. For example, in an area that specifically encodes visual letter shapes, the pattern of responses across voxels should be correlated when participants view letters with similar shapes (A/R) but not when they view visually dissimilar letters (A/S), nor when they view letters with only phonologically similar names (B/P) or similar motor plans (T/L). The same logic extends to the other modality-specific representational types.
With respect to abstract letter identities (ALIs), the prediction is that neural substrates encoding these representations should respond similarly to letters that have the same identity despite differing in case and visual appearance (A/a). The further key prediction is that substrates that selectively encode ALIs should be insensitive to similarities between letters in terms of their visual–spatial, letter-name or motoric features.
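The ALI prediction can be written down as a predicted similarity matrix. The sketch below is an illustration only: it assumes a binary coding (1 for letter pairs with the same identity regardless of case, 0 otherwise), and the function name and letter set are hypothetical choices, not the model specification used in the study.

```python
import numpy as np


def ali_prsm(letters):
    """Binary predicted similarity matrix for abstract letter identity.

    Entry (i, j) is 1.0 when letters[i] and letters[j] are the same letter
    irrespective of case (e.g., 'A' and 'a'), and 0.0 otherwise.
    The binary coding is an assumption made for illustration.
    """
    identities = [letter.lower() for letter in letters]
    n = len(identities)
    prsm = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if identities[i] == identities[j]:
                prsm[i, j] = 1.0
    return prsm


# Illustrative stimulus set of upper and lower case letters.
example_letters = ["A", "a", "B", "b", "D", "d"]
example_prsm = ali_prsm(example_letters)
print(example_prsm)
```

Under this coding, A/a is predicted to be similar and A/B is not. Predicted matrices for visual, letter-name or motoric similarity would instead need graded pairwise similarity values from an external source, which this sketch does not attempt to supply.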
Theories of reading (e.g., Grainger et al., 2008) often distinguish between low-level representations of visual features of letters, high-level representations of letter shapes and abstract letter identities (ALIs) (Fig. 1). Low-level visual-feature representations correspond to those involved in visual processing more generally and would include representations computed in primary and early visual areas. At this level, the same letter in a different font (e.g., an A rendered in two different typefaces) would be represented differently. In contrast, at the level of visual-shape representations, the underlying shape/geometry of a letter is represented, in a manner comparable to what is sometimes referred to in the visual object processing literature as a “structural description” (Miozzo and Caramazza, 1998). At this level, letters in different fonts (the same A in two typefaces) would share a representation but different allographs of a letter (A/a) would not. A further distinction is made by reading theories that assume that letter-shape representations are recoded into abstract letter identities (Jackson and Coltheart, 2001) that are used to search memory for stored orthographic representations of familiar word forms. Because ALIs are abstract (font and case invariant), A and a would be represented in the same way. ALIs allow readers to easily recognize words in unfamiliar fonts or case (eAc). Of these three letter representation types, ALIs are the most controversial, both in terms of their existence and also with regard to their neural instantiation. Alternatives to ALI-mediated views posit that reading is based on either visual representations of letters alone or on visual exemplars of previously experienced words and letters (Tenpenny, 1995; for example, see Plaut and Behrmann (2011) for a model of letter representation that does not include ALIs). In fact, the contrast between the ALI-mediated and visually mediated views of reading is a clear example of the larger, long-standing and contentious debate between the abstractionist and grounded cognition (or semantic and episodic) views of human knowledge representation (Barsalou, 2008, Tulving, 1983).
Considerable behavioral (Besner et al., 1984, Kinoshita and Kaplan, 2008), neuropsychological (Coltheart, 1981) and neuroimaging (Dehaene et al., 2004, Polk and Farah, 2002) evidence has been put forward in support of ALIs. With regard to neural substrates, neuroimaging research has generally localized ALIs to the posterior, inferior temporal lobe (Dehaene et al., 2004). This is consistent with the mid-fusiform localization of the Visual Word Form Area (VWFA) assumed by many to be an orthographic processing area critical for word reading (Cohen et al., 2000, Tsapkini and Rapp, 2010). However, the attribution of ALIs to this brain area remains highly debated (Barton et al., 2010a, Barton et al., 2010b, Burgund and Edwards, 2008, Price and Devlin, 2003, Wong et al., 2009). Critically, as we review next, previous studies arguing for ALIs have not controlled for the possibility that effects attributed to ALIs might instead originate from modality-specific (visual, phonological, motor) or semantic representations of the word or letter stimuli used in these studies.
Some studies examining the nature of orthographic representations have reported similar neural responses or priming effects for orthographic stimuli presented in different fonts (Gauthier et al., 2000, Nestor et al., 2013, Qiao et al., 2010). However, while similar responses to different-font stimuli indicate that the recruited neural representations are indeed more abstract than low-level visual feature representations, the similar responses may have originated from visual letter-shape representations rather than from ALI representations. Thus, the finding of similar cross-font responses does not necessarily implicate ALIs. Other studies have specifically manipulated letter case in order to examine issues of representational format. For example, some studies have shown that activity in the left mid-fusiform was comparable for uniform-case and mixed-case words and pseudowords (APPLE/aPpLe) (Polk and Farah, 2002; but see Kronbichler et al., 2009), or have shown priming effects for words presented in different cases (RADIO/radio) (Dehaene et al., 2001). While these findings can certainly be explained as arising at the level of ALIs, such conclusions would be premature without ruling out alternative accounts. For example, most of these studies did not control for the visual similarity between cross-case letter pairs (P/p; O/o, etc.) leaving open the possibility that the reported cross-case effects originated at some level of visual representation. While this specific possibility was addressed by Dehaene et al. (2004) who reported cross-case word priming even for words with dissimilar cross-case letters (RAGE/rage), neither this study, nor any others, have controlled for the possibility that priming effects could have originated in the phonological letter-name representations shared by cross-case pairs. 
Furthermore, given that most of these studies have used word stimuli, another possible source of response similarity/priming effects could be the semantic representations of the word stimuli (i.e., RAGE/rage share a common semantic representation). A recent study specifically considered the possibility of a semantic source of priming effects and found priming for orthographically similar words in the VWFA region, but did not find priming for semantically similar words (Devlin et al., 2004). However, although a semantic source of the effects was ruled out in this particular study, the phonological similarity of the orthographically similar word (and pseudoword) pairs was not considered, nor is it clear that visual similarity between the orthographically similar word pairs was either. Other studies, by using single letter stimuli, largely eliminated a semantic locus of cross-case priming effects (Kinoshita and Kaplan, 2008). However, none of the single-letter studies controlled for the letter-name or motor similarity of letter primes and targets. Although rarely discussed, the potential relevance of motor similarity between letters needs to be considered given that a number of studies have found that viewing letters may result in activation of information related to producing letter shapes (James and Atwood, 2009, James and Gauthier, 2006, Longcamp et al., 2003).
In sum, while a number of previous findings are consistent with the notion that abstract letter identities are represented in the left mid-fusiform/inferior temporal region, no study has considered all key alternative accounts, which weakens the conclusions that can be drawn regarding the representation and localization of ALIs. The goal of this research is to infer the representational content of brain regions using the event-related fMRI BOLD response recorded from participants viewing single upper or lower case letters (A, a, B, b, D, d, etc.) in the context of performing a go/no-go symbol (~, ?, %, &) detection task. For data analysis, we deployed a searchlight variant of MVPA–RSA (based on Kriegeskorte et al., 2008) that allowed us to evaluate the extent to which the representational similarity structure of the BOLD response for voxels within each searchlight volume corresponded to the similarity structure predicted for a volume that is sensitive to ALI, visual–spatial, letter-name or motoric similarity (Fig. 2). Our choices of stimuli (single letters), task (symbol detection) and data analysis approach (MVPA–RSA model comparison) largely eliminate the interpretative ambiguities associated with previous research on this topic.
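A searchlight version of this comparison can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used in the study: the spherical neighborhood measured in voxel units, the default radius, the Pearson-based fit and the function names `sphere_offsets` and `searchlight_rsa` are all hypothetical choices.

```python
import numpy as np


def sphere_offsets(radius):
    """Integer voxel offsets lying within `radius` (in voxel units) of the origin."""
    r = int(np.floor(radius))
    grid = np.arange(-r, r + 1)
    dx, dy, dz = np.meshgrid(grid, grid, grid, indexing="ij")
    offsets = np.stack([dx.ravel(), dy.ravel(), dz.ravel()], axis=1)
    return offsets[(offsets ** 2).sum(axis=1) <= radius ** 2]


def searchlight_rsa(data, mask, prsm, radius=2.0):
    """Map of model fit for one predicted similarity matrix.

    data: (n_stimuli, X, Y, Z) response estimates, one volume per stimulus.
    mask: boolean (X, Y, Z) search space.
    prsm: (n_stimuli, n_stimuli) predicted similarity matrix.
    Returns an (X, Y, Z) array; voxels outside the mask are NaN.
    """
    upper = np.triu_indices(data.shape[0], k=1)
    model = prsm[upper]
    offsets = sphere_offsets(radius)
    shape = np.array(mask.shape)
    fit_map = np.full(mask.shape, np.nan)

    for center in np.argwhere(mask):
        coords = center + offsets
        # Keep neighbors that fall inside the volume and inside the search space.
        in_bounds = np.all((coords >= 0) & (coords < shape), axis=1)
        coords = coords[in_bounds]
        coords = coords[mask[coords[:, 0], coords[:, 1], coords[:, 2]]]
        if len(coords) < 2:
            continue  # too few voxels to define a response pattern
        patterns = data[:, coords[:, 0], coords[:, 1], coords[:, 2]]
        orsm = np.corrcoef(patterns)  # observed similarity for this volume
        fit_map[tuple(center)] = np.corrcoef(orsm[upper], model)[0, 1]
    return fit_map
```

Running the function once per predicted matrix (ALI, visual–spatial, letter-name, motoric) would yield one fit map per model, which could then be compared voxel by voxel. Statistical thresholding and group-level inference are outside the scope of this sketch.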
Participants
The participants were nine right-handed individuals (5 women), ages 18–26, who were students at Johns Hopkins University and had no history of reading/spelling disabilities. They received payment for participation and gave written informed consent as required by the Johns Hopkins University Institutional Review Board.
Experimental procedures
Two experimental tasks were administered within one session. A Symbol Detection Task was presented for the first six runs, followed by an Orthographic Localizer Task for two runs (Rapp and Lipka, 2011). E-Prime
Search space and observed and predicted RSMs (oRSMs and pRSMs)
The search space was generated from the functionally localized Orthographic Processing Network, consisting of voxels for which the group activity for words + consonant strings > baseline. For each participant, this group network was transformed into native space, dilated and reflected across the mid-sagittal plane to create a bilaterally symmetric search space with an average, across participants, of 16,539 functional voxels (range: 14,267–18,636). The resulting region covered: the occipital lobes,
Discussion
The goal of this investigation was to evaluate the similarity structure of neural responses to viewed letters by comparing it to models of the similarity structures predicted for regions encoding abstract letter identities (ALIs) or modality-specific visual–spatial, letter-name or motoric representations of letters. This direct approach to these questions of neuronal representation identified specific neural substrates selectively tuned to different representational types.
Conclusions
This research provides novel neural evidence for abstract and modality-specific representations of letters and identifies the specific neural substrates in which these different representational types are instantiated. Many questions remain, including whether or not abstract representations are present in other domains. It is clear, however, that the MVPA–RSA Searchlight approach provides a powerful means for investigating fundamental issues of human knowledge representation.
Acknowledgments
This research was supported by an IGERT Research and Training fellowship to the first author and NIH grant DC006740 to the second author. We are grateful to Manny Vindiola, Bonnie Breining and Michael Wolmetz for valuable advice on data analysis and preliminary drafts, as well as to James Sabra for his help with data analysis.
References (69)
- et al. Reading words, seeing style: the neuropsychology of word, font and handwriting perception. Neuropsychologia (2010)
- et al. Tuning of the human left fusiform gyrus to sublexical orthographic structure. Neuroimage (2006)
- et al. The unique role of the visual word form area in reading. Trends Cogn. Sci. (2011)
- et al. Developmental dyslexia. Lancet (2004)
- et al. Direct intracranial, fMRI, and lesion evidence for the causal role of left inferotemporal cortex in reading. Neuron (2006)
- et al. Evidence for highly selective neuronal tuning to whole words in the “visual word form area”. Neuron (2009)
- et al. Letter perception: from pixels to pandemonium. Trends Cogn. Sci. (2008)
- et al. fMR-adaptation: a tool for studying the functional properties of human cortical neurons. Acta Psychol. (Amst) (2001)
- et al. Letter processing automatically recruits a sensory-motor brain network. Neuropsychologia (2006)
- et al. Evaluation of the dual route theory of reading: a metanalysis of 35 neuroimaging studies. Neuroimage (2003)
- Pattern-information analysis: from stimulus decoding to computational-model testing. Neuroimage
- Visual presentation of single letters activates a premotor area involved in writing. Neuroimage
- A critical look at the embodied cognition hypothesis and a new proposal for grounding conceptual content. J. Physiol. Paris
- The myth of the visual word form area. Neuroimage
- Neurobiological studies of reading and reading disability. J. Commun. Disord.
- A combined fMRI study of typed spelling and reading. Neuroimage
- Unconsciously deciphering handwriting: subliminal invariance for handwritten words in the visual word form area. Neuroimage
- Phonological dyslexia and dysgraphia: cognitive mechanisms and neural substrates. Cortex
- Disruption of posterior brain systems for reading in children with developmental dyslexia. Biol. Psychiatry
- Second-order isomorphism of internal representations: shapes of states. Cogn. Psychol.
- The orthography-specific functions of the left fusiform gyrus: evidence of modality and category specificity. Cortex
- Patterns of brain reorganization subsequent to left fusiform damage: fMRI evidence from visual processing of words and pseudowords, faces and objects. Neuroimage
- Hierarchical coding of letter strings in the ventral stream: dissecting the inner organization of the visual word-form system. Neuron
- Visual word processing and experiential origins of functional selectivity in human extrastriate cortex. Proc. Natl. Acad. Sci. U. S. A.
- Grounded cognition. Annu. Rev. Psychol.
- Encoding in the visual word form area: an fMRI adaptation study of words versus handwriting. J. Cogn. Neurosci.
- Concepts are more than percepts: the case of action verbs. J. Neurosci.
- Basic processes in reading: computation of abstract letter identities. Can. J. Psychol.
- An upper- and lowercase alphabetic similarity matrix, with derived generation similarity values. Behav. Res. Methods Instrum. Comput.
- Severe developmental letter-processing impairment: a treatment case study. Cogn. Neuropsychol.
- Identity versus similarity priming for letters in left mid-fusiform cortex. Neuroreport
- Visual word recognition in the left and right hemispheres: anatomical and functional correlates of peripheral alexias. Cereb. Cortex
- The visual word form area: spatial and temporal characterization of an initial stage of reading in normal subjects and posterior split-brain patients. Brain
- Disorders of reading and their implications for models of normal reading. Vis. Lang.