Figure 8. Schematic model of processing stages. Acoustic input in the form of spectrotemporal information is fed to primary auditory cortex (i). Here, we hypothesize that subphonetic acoustic information of the input is compared with an internal representation of the perceptual boundary between phonetic features. The absolute distance from the boundary is computed, which corresponds to phoneme ambiguity as tested in this study. The signed distance (i.e., closer to one category or another) corresponds to phoneme acoustics. This processing stage is therefore the locus of the ambiguity effect, although we do not claim that ambiguity is neurally represented per se. Next, the signal travels to STG (ii), where the phonetic features of a sound (e.g., VOT, PoA) are processed. Note that it is likely that other features of the sound, such as manner, are also generated at this stage, as indicated by the ellipsis. The outputs of these two stages are fed to a neural population that tries to derive a discrete phonological representation based on the features of the input (iii). This stage represents the “phoneme commitment” process, which converges over time by accumulating evidence through its own recurrent connection, as well as through feedforward input from the previous stages and feedback from the subsequent stages. The output of the processes performed at each phoneme position then feeds to a node that tries to predict the phonological sequence of the word (iv) in order to activate potential lexical items based on partial matches with the input (v). Note that both /p/- and /b/-onset words are activated in the example because both cohorts are partially consistent with the input. Below, we show the anatomical location associated with each processing stage. Stage i (processing subphonetic acoustic detail) is located in HG bilaterally (in green). Stages ii–iii (processing phonetic features) are located in STG bilaterally (in blue). Stage v (activating lexical candidates) is located in left middle temporal gyrus (in purple).
Note the similarities with the functional organization of the dual-stream model proposed by Hickok and Poeppel (2007).
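To make the distinction drawn at stage i concrete, the following minimal sketch computes both quantities for a single voice onset time (VOT) value. It is an illustration only, not the analysis used in this study: the function name `boundary_distances` and the 25 ms boundary are hypothetical placeholders, and in practice the perceptual boundary would have to be estimated from behavioral categorization data. The signed distance indicates which category the token falls closer to (phoneme acoustics), whereas the absolute distance indicates how far the token lies from the boundary regardless of direction, with smaller values corresponding to greater ambiguity.

```python
from typing import Tuple


def boundary_distances(vot_ms: float, boundary_ms: float = 25.0) -> Tuple[float, float]:
    """Return (signed, absolute) distance of a VOT value from a category boundary.

    boundary_ms is an assumed, illustrative boundary location; it is not a
    value reported in the source.
    """
    # Signed distance: negative means the token lies on the short-VOT side
    # of the boundary, positive means it lies on the long-VOT side.
    signed = vot_ms - boundary_ms
    # Absolute distance: discards direction, so tokens equally far from the
    # boundary on either side are treated as equally (un)ambiguous.
    absolute = abs(signed)
    return signed, absolute


if __name__ == "__main__":
    # Hypothetical VOT values in milliseconds along a voicing continuum.
    for vot in (5.0, 20.0, 30.0, 60.0):
        signed, absolute = boundary_distances(vot)
        print(f"VOT={vot:5.1f} ms  signed={signed:+6.1f}  absolute={absolute:5.1f}")
```

In this sketch, the tokens at 20 ms and 30 ms lie on opposite sides of the assumed boundary (different signed distances) but are equally close to it (identical absolute distances), which is the dissociation between acoustics and ambiguity described in the caption.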