Abstract
Neuroscience research has elucidated broad relationships between socioeconomic status (SES) and young children's brain structure, but there is little mechanistic knowledge about specific environmental factors that are associated with specific variation in brain structure. One environmental factor, early language exposure, predicts children's linguistic and cognitive skills and later academic achievement, but how language exposure relates to neuroanatomy is unknown. By measuring the real-world language exposure of young children (ages 4–6 years, 27 male/13 female), we confirmed the preregistered hypothesis that greater adult-child conversational experience, independent of SES and the sheer amount of adult speech, is related to stronger, more coherent white matter connectivity in the left arcuate and superior longitudinal fasciculi on average, and specifically near their anterior termination at Broca's area in left inferior frontal cortex. Fractional anisotropy of significant tract subregions mediated the relationship between conversational turns and children's language skills and indicated a neuroanatomical mechanism underlying the SES “language gap.” Post hoc whole-brain analyses revealed that language exposure was not related to any other white matter tracts, indicating the specificity of this relationship. Results suggest that the development of dorsal language tracts is environmentally influenced, specifically by early, dialogic interaction. Furthermore, these findings raise the possibility that early intervention programs aiming to ameliorate disadvantages in development due to family SES may focus on increasing children's conversational exposure to capitalize on the early neural plasticity underlying cognitive development.
SIGNIFICANCE STATEMENT Over the last decade, cognitive neuroscience has highlighted the detrimental impact of disadvantaged backgrounds on young children's brain structure. However, to intervene effectively, we must know which proximal aspects of the environment are most strongly related to neural development. The present study finds that young children's real-world language exposure, and specifically the amount of adult-child conversation, correlates with the strength of connectivity in the left hemisphere white matter pathway connecting two canonical language regions, independent of socioeconomic status and the sheer volume of adult speech. These findings suggest that early intervention programs aiming to close the achievement gap may focus on increasing children's conversational exposure to capitalize on the early neural plasticity underlying cognitive development.
Introduction
Socioeconomic status (SES) is a multifaceted index of one's financial resources, educational capital, and relative social status. Neuroimaging studies have found relatively consistent evidence that variation in SES is associated with variation in brain development, including gray matter volume (Raizada et al., 2008; Jednoróg et al., 2012; Noble et al., 2012; Hanson et al., 2013; Luby et al., 2013), thickness (Lawson et al., 2013; Mackey et al., 2015; Romeo et al., 2017), and surface area (Noble et al., 2015), in addition to white matter macrostructure (Raizada et al., 2008; Luby et al., 2013) and microstructure (Gianaros et al., 2013; Ursache and Noble, 2016). Presumably, these neural disparities arise because of systematic differences in certain immediate environmental factors during early childhood. There is, however, a paucity of evidence as to which specific aspects of children's experiences are associated with individual variation in specific neuroanatomical developments.
Behaviorally, it is well known that the quantity and quality of the language young children are exposed to early in life predict their later linguistic and cognitive skills (Huttenlocher et al., 1991; Rodriguez and Tamis-LeMonda, 2011; Rowe, 2012; Weisleder and Fernald, 2013; Hirsh-Pasek et al., 2015). Furthermore, children from lower SES backgrounds are exposed to, on average, fewer utterances of lower complexity than their higher-SES peers (Hoff et al., 2002; Rowe et al., 2005; Huttenlocher et al., 2007). A seminal study estimated that, by the time children reach school age, children growing up in higher-SES families were, on average, exposed to 30 million more words than children growing up in lower-SES families (Hart and Risley, 1995).
Subsequent research has found that more important than the simple quantity of words heard is the quality of language exposure, including linguistic features, such as vocabulary diversity and sophistication, grammatical complexity, and narrative use (Rowe, 2012), as well as interactional features, such as contiguous (time-locked), contingent (topically similar), back-and-forth conversation (Hirsh-Pasek et al., 2015). Conversational turn-taking involves a rich experience of high-quality linguistic, attentional, and social features. There is now some evidence that certain aspects of children's language environments relate to functional brain responses in prefrontal cortical regions (Sheridan et al., 2012; Garcia-Sierra et al., 2016; Romeo et al., 2018). However, there is no evidence as yet relating children's language exposure to their brain structure, including the white matter tracts that connect brain regions into networks.
The white matter tract most associated with language is the left arcuate fasciculus, a component of the superior longitudinal fasciculus (SLF) that connects two cortical regions critical for language: the left inferior frontal gyrus (Broca's area) and the left posterior superior temporal gyrus (Wernicke's area). Microstructure of this tract has been associated with scores on language and literacy measures in children (Yeatman et al., 2011; Saygin et al., 2013; Skeide et al., 2016), and is often altered in both children and adults with disorders of speech, language, and/or literacy (Catani and Mesulam, 2008; Vandermosten et al., 2012a). Given the importance of this tract for language development, we tested the preregistered hypothesis that early language experience, independent of SES, might be related to the microstructure of the left arcuate/SLF; if true, this would suggest that these dorsal language tracts may be a neuroanatomical mechanism by which children's language environments affect their linguistic and cognitive skills.
Materials and Methods
Experimental design.
A priori hypotheses and exploratory analyses were preregistered at https://osf.io/fes4j/register/564d31db8c5e4a7c9694b2be. Specifically, the present study was designed to confirm or refute the hypothesis that young children's language exposure, and particularly the number of conversational turns with adults, would be positively correlated with the fractional anisotropy (FA) of the left arcuate/SLF (and/or a portion thereof), independent of SES and the sheer quantity of adult and child speech alone. As such, this experiment aimed to recruit a socioeconomically diverse sample of young children and their parents to complete diffusion magnetic resonance imaging, standardized cognitive assessments, and 2 full days of real-world auditory language recordings. All analyses were within-group correlations with specific covariates (nuisance and interest) as described below.
Participants.
Forty children (27 male; age range, 4 years, 2 months to 6 years, 10 months; mean ± SD, 5.78 ± 0.72 years) and their parents completed this study. Children were in either prekindergarten or kindergarten grades and were required to be native English speakers with no history of premature birth (<37 weeks), neurological disorders, developmental delay, speech/language therapy, or grade repetition. Nineteen additional children were initially assessed and excluded for not meeting these inclusion criteria.
Twenty-three other children participated but did not have complete datasets because they did not complete the home recordings (n = 6), did not participate in the diffusion tensor imaging (DTI) scan (n = 7), or exhibited excessive movement during the DTI scan (n = 10, details below). Excluded participants did not differ from the included sample on age, SES, behavioral scores, or language exposure measures. However, the groups did differ on child gender; unintentionally, all home-recording noncompletions occurred with female participants, so that girls were more likely to be excluded. Thus, all analyses control for gender. Further, half of the final sample additionally participated in a larger randomized controlled intervention study on parenting practices; only their baseline data (before learning of group assignment) were used here. Furthermore, task-based fMRI results were previously reported for a partially overlapping subset of this sample (Romeo et al., 2018). Forty-four participants had useable fMRI data, useable DTI data, or both; of these, 32 had both useable fMRI and DTI data, 4 had useable fMRI data only (for a final fMRI sample of 36), and 8 had useable DTI data only (for a final DTI sample of 40 for all analyses reported here). All procedures were approved by the Institutional Review Board at the Massachusetts Institute of Technology, and written informed consent was obtained from parents.
Socioeconomic measures.
Participants were from a wide SES range. Parent(s) filled out a short questionnaire about total gross annual household income and the highest level of education obtained by each parent and/or primary caregiver (0 = less than high school; 1 = high school; 2 = some college/associate's degree; 3 = bachelor's degree; 4 = advanced degree). When two parents were present in the home, maternal and paternal education levels were averaged to create a parental education metric. For the final sample, parental education ranged from 0.5 to 4 (mean = 2.81, median = 3.50, SD = 1.17; Fig. 1), and gross household income ranged from $6000 to $250,000 (mean = $108,728, median = $93,000, SD = $69,064; Fig. 1), which is equivalent to the median family income of the Metro region from which participants were sampled (U.S. Census Bureau, 2016). For mediation analyses, education and income metrics were z-scored and averaged.
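As an illustration of the composite SES metric used in the mediation analyses, the following minimal sketch z-scores two measures and averages them per family. The function names and the example values are hypothetical (they are not study data), and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not state which form was used.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize values to mean 0 and sample SD 1 (n - 1 denominator, an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def ses_composite(education, income):
    """Average the z-scored education and z-scored income for each family."""
    z_edu, z_inc = zscores(education), zscores(income)
    return [(e + i) / 2 for e, i in zip(z_edu, z_inc)]

# Hypothetical values, not study data: education on the 0-4 scale, income in dollars.
education = [0.5, 2.0, 3.0, 3.5, 4.0]
income = [6000, 45000, 93000, 150000, 250000]
composite = ses_composite(education, income)
```

Because both inputs are standardized first, income and education contribute equally to the composite regardless of their original units.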
Standardized behavioral assessments.
Children completed standardized behavioral assessments to characterize verbal and nonverbal cognitive skills. A nonverbal composite score comprised the average of the age-normed standard scores from the Matrix Reasoning, Picture Memory, and Bug Search subtests of the Wechsler Preschool and Primary Scale of Intelligence, fourth edition (Wechsler, 2012). A verbal composite score comprised the average age-normed standard scores of the Peabody Picture Vocabulary Test (Dunn and Dunn, 2007) and the Core Language Score of the Clinical Evaluation of Language Fundamentals, fifth edition (Wiig et al., 2013). To be included in the final sample, participants were required to score no lower than 1 SD below the mean (16th percentile) on both composite scores.
Neuroimaging data acquisition.
Neuroimaging sessions occurred at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research, at the Massachusetts Institute of Technology. Children were acclimated to the MRI environment and practiced lying still in a mock MRI scanner before data acquisition on a 3 tesla MAGNETOM Trio Tim scanner equipped for EPI (Siemens) with a 32-channel phased array head coil. First, an automated scout image was acquired, and shimming procedures were performed to optimize field homogeneity. Then a whole-head, high-resolution T1-weighted multiecho MPRAGE structural image was acquired using a protocol optimized for movement-prone pediatric populations (TR = 2530 ms, TE = 1.64 ms/3.5 ms/5.36 ms/7.22 ms, TI = 1400 ms, flip angle = 7°, resolution = 1 mm isotropic). Whole-brain diffusion-weighted images were acquired in 74 axial interleaved slices of thickness 2 mm and axial in-plane isotropic resolution 2 mm (128 × 128 × 74 image matrix, TR = 9.3 s, TE = 84 ms, and GRAPPA acceleration factor 2). The series included 10 non–diffusion-weighted reference volumes (b = 0) and 30 diffusion-weighted volumes (b = 700 s/mm²). Resting-state and one task-based functional scans were also collected in the same session, but are not reported here.
Code accessibility.
All code necessary to replicate results, along with links to necessary software packages, is freely available at https://github.com/rromeo2/openmindMIT.
Neuroimaging processing and analysis.
First, all diffusion data underwent quality control via visual inspection of all volumes followed by the fully automated DTIPrep pipeline (Oguz et al., 2014), which corrects artifacts caused by eddy currents, head motion, bed vibration/pulsation, and slicewise, interlacewise, and gradientwise intensity inconsistencies. Participants with >5 unusable volumes (12.5%) were excluded (n = 10), leaving the final sample of 40 participants.
All preprocessing was implemented via a custom script in Nipype version 0.13.0 (Gorgolewski et al., 2011). All images in the diffusion series were aligned to the first non–diffusion-weighted image using affine registration, and corresponding diffusion-weighting gradient vectors were reoriented accordingly, to reduce misalignment. A per-subject total head motion index was computed from volume-by-volume translation and rotation, percentage of slices with signal dropout, and signal dropout severity (Yendiki et al., 2014). All analyses statistically control for the total head motion index.
Eighteen major white matter fascicles were automatically reconstructed using TRACULA implemented in FreeSurfer version 6.0 (Yendiki et al., 2011), which uses global probabilistic tractography and the ball-and-stick model of diffusion to estimate the posterior probability distribution of each pathway. This distribution includes the prior probabilities of the pathway given the cortical parcellation and subcortical segmentation of the anatomical image, which had been processed and manually edited as necessary in FreeSurfer (version 5.3.0) (Fischl, 2012) to ensure correct gray and white matter boundaries. Each pathway distribution was thresholded at 20% of the maximum value, and the values at each voxel in the pathway were weighted by the pathway probability at that voxel to obtain whole-tract average measures of microstructure.
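The whole-tract averaging step described above (threshold at 20% of the maximum pathway probability, then weight each voxel by its probability) can be sketched as follows. This is only an illustration of the arithmetic, not TRACULA's implementation; the function name, the inclusive `>=` comparison at the cutoff, and the voxel values are assumptions.

```python
def weighted_tract_mean(fa, prob, threshold_fraction=0.20):
    """Probability-weighted mean FA over voxels at or above a fraction of the peak probability."""
    cutoff = threshold_fraction * max(prob)
    # Keep only voxels whose pathway probability survives the threshold.
    kept = [(f, p) for f, p in zip(fa, prob) if p >= cutoff]
    total_weight = sum(p for _, p in kept)
    return sum(f * p for f, p in kept) / total_weight

# Hypothetical voxel values, not study data.
fa = [0.50, 0.40, 0.30, 0.20]
prob = [1.00, 0.50, 0.25, 0.10]
tract_fa = weighted_tract_mean(fa, prob)
```

In this toy case the last voxel falls below the cutoff and is dropped, and the remaining voxels contribute in proportion to their pathway probability, so voxels near the core of the pathway dominate the average.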
Of interest were three measures of water diffusion within tracts: axial diffusivity, which measures the rate of diffusion parallel to the tract; radial diffusivity (RD), which measures the rate of diffusion perpendicular to the tract; and FA, a summary measure of microstructural organization that indexes the overall strength and directionality of diffusion (Lebel et al., 2017). These measures were analyzed within two a priori components of the left SLF: the arcuate fasciculus, which runs between inferior frontal and superior posterior temporal regions (roughly corresponding to SLF II); and SLF III, which runs between inferior frontal and inferior parietal regions (henceforth referred to as SLF).
TRACULA was also used to calculate FA at successive cross-sections as a function of position along the trajectory of both tracts in an anterior-to-posterior direction. Correspondence of nodes across subjects was based on the Euclidean distance in MNI space. Because tracts were reconstructed in each subject's native space and not in a template space, individual participants' tracts were of varying length. For participants with shorter tracts, tail FA values were extrapolated by calculating moving averages of the previous 3 points to ensure uniform length (35 points along the SLF and 48 points along the arcuate fasciculus). The presented results do not change if no extrapolations are made.
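The tail extrapolation can be illustrated with a short sketch that pads a shorter along-tract profile to a fixed number of nodes, where each new node is the mean of the previous three. The function name and values are hypothetical, and the sketch assumes that already-extrapolated points feed into later averages, which the text does not specify.

```python
def pad_tail(profile, target_length, window=3):
    """Extend a non-empty profile to target_length; each new point averages the last `window` points."""
    padded = list(profile)
    while len(padded) < target_length:
        recent = padded[-window:]  # may be shorter than `window` for very short profiles
        padded.append(sum(recent) / len(recent))
    return padded

# Hypothetical FA values for a profile that is two nodes short.
short_profile = [0.30, 0.36, 0.42]
full_profile = pad_tail(short_profile, target_length=5)
```

Profiles that already have the target length are returned unchanged, so the same call can be applied to every participant.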
Finally, whole-brain voxelwise statistical analysis was conducted with Tract-Based Spatial Statistics (TBSS) (Smith et al., 2006), as implemented in FSL version 5.0.9 (Jenkinson et al., 2012). Diffusion space FA images were aligned to each participant's anatomical image using boundary-based registration (Greve and Fischl, 2009), which was then affine aligned to MNI space. Each subject's MNI-space image was eroded to remove the highly variable lateral regions of the FA map. The images were averaged to generate an intersubject FA skeleton, and each voxel from participants' FA volumes was projected onto the FA skeleton. Voxelwise regression analyses were conducted with FSL's randomise tool with 5000 permutations, and threshold-free cluster enhancement was used to correct for multiple comparisons with p < 0.05 (Smith and Nichols, 2009). Significant voxels were then back-projected from skeleton positions to the position at the center of the nearest tract in the subject's FA image in standard space. These points were then inversely warped to each subject's native diffusion space for localization within the probabilistic tractography.
Home audio recordings.
Specific details of the home audio recordings have been previously reported (Romeo et al., 2018). Briefly, parents recorded two consecutive weekend days of audio from the child's perspective via the Language ENvironment Analysis (LENA) Pro system (Gilkerson et al., 2017). LENA software automatically processes the recordings and estimates the number of words spoken by an adult in the child's vicinity (“adult words”), the number of utterances the key child made (“child utterances”), and the number of dyadic conversational turns, defined as a discrete pair of consecutive adult and child utterances in any order, with no more than 5 s of separation (“conversational turns”). As such, conversational turns measure the contiguous, linguistic interaction between children and adults. Running totals for each speech category were calculated for each consecutive 60 min across the 2 d in 5 min increments (e.g., 7:00 A.M. to 8:00 A.M., 7:05 A.M. to 8:05 A.M., etc.), and the per-participant highest hourly total of adult words, child utterances, and conversational turns were separately extracted for statistical analysis. This metric helped minimize differences in language measures due solely to different recording lengths and/or loud activities that may have masked speech and misrepresented language input.
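The peak-hour metric can be sketched as a sliding 60 min window that advances in 5 min steps. The sketch below assumes the counts have already been exported as one value per 5 min bin for a single continuous recording; that input format, the function name, and the example counts are assumptions, and this is not LENA's own software.

```python
def peak_hourly_total(counts_per_5min, bins_per_hour=12):
    """Return the largest sum over any 12 consecutive 5-minute bins (one 60-minute window)."""
    if len(counts_per_5min) < bins_per_hour:
        raise ValueError("need at least one full hour of bins")
    window = sum(counts_per_5min[:bins_per_hour])
    best = window
    # Slide forward one 5-minute bin at a time: add the new bin, drop the oldest.
    for i in range(bins_per_hour, len(counts_per_5min)):
        window += counts_per_5min[i] - counts_per_5min[i - bins_per_hour]
        best = max(best, window)
    return best

# Hypothetical counts: 2 hours of mostly quiet bins with a 30-minute burst of turns.
turns = [1] * 10 + [6] * 6 + [1] * 8
peak = peak_hourly_total(turns)
```

Taking the maximum over overlapping windows, rather than a total or a mean, is what makes the measure insensitive to how long each family recorded.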
Statistical analysis.
Statistical analysis of behavioral and summary diffusion measures was executed in SPSS Statistics version 24 (IBM). Given that all participants constituted a single group and all independent and dependent variables were continuous, all relational analyses were two-tailed regressions, reporting Pearson's r (if no covariates) or partial r (with covariates listed in Results). For the node analysis within tracts, independent regressions with listed covariates were conducted with FA at each node as the dependent variable, and p values were FDR corrected for the total number of nodes in both tracts (n = 83).
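For readers less familiar with partial correlations, the following sketch computes a partial r by recursively removing one covariate at a time, along with the associated degrees of freedom. The analyses here were run in SPSS; this standard-library version, its function names, and the toy data are illustrative assumptions only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(x, y, covariates):
    """Correlation of x and y after removing the linear effect of each covariate."""
    if not covariates:
        return pearson_r(x, y)
    z, rest = covariates[0], covariates[1:]
    # Standard recursion: partial out z from correlations already adjusted for `rest`.
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def partial_df(n, n_covariates):
    """Degrees of freedom for a partial correlation: n - 2 - number of covariates."""
    return n - 2 - n_covariates

# Hypothetical toy data in which x and y are related only through z.
z = [-1, -1, 1, 1, -1, -1, 1, 1]
e1 = [-1, 1, -1, 1, -1, 1, -1, 1]
e2 = [-1, -1, -1, -1, 1, 1, 1, 1]
x = [a + b for a, b in zip(z, e1)]
y = [a + b for a, b in zip(z, e2)]
```

In this toy example the zero-order correlation between `x` and `y` is 0.5, but it vanishes once `z` is partialled out. With 40 participants and three covariates (age, gender, and head motion), `partial_df(40, 3)` gives 35, which matches the r(35) notation used in Results.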
Mean FA was extracted from the significant TBSS cluster and entered into two bootstrapped mediation analyses (controlling for age, gender, and motion) with 5000 repetitions, as executed in the PROCESS macro (Preacher and Hayes, 2004; Hayes, 2018). In the first model, the number of conversational turns was the independent variable, composite verbal score was the dependent variable, and cluster FA was the mediator. In the second model, composite SES was the independent variable, composite verbal score was the dependent variable, and both conversational turns and cluster FA were entered as mediators. The bootstrapped 95% CIs for the direct (c′) and indirect (ab) effects are reported; the mediation was considered “significant” if the 95% CI for the indirect effect did not contain 0. Effect sizes were determined by the mediation ratio, which is the ratio of the indirect effect coefficient to the total effect coefficient; this measure indicates the proportion of the total effect that is mediated.
Results
Replicating prior studies, higher SES was strongly correlated with higher composite verbal scores (education: r(38) = 0.65, p = 5 × 10⁻⁶; income: r(38) = 0.46, p = 0.003) and to a lesser extent, with higher composite nonverbal scores (education: r(38) = 0.35, p = 0.03; income: r(38) = 0.16, not significant). SES was also positively correlated with measures of language exposure, including adult words (education: r(38) = 0.41, p = 0.008; income: r(38) = 0.28, p = 0.08) and conversational turns (education: r(38) = 0.38, p = 0.02; income: r(38) = 0.40, p = 0.01), but not child utterances alone (both r(38) < 0.27, both p > 0.10). After controlling for SES (parental education and income), the number of conversational turns was the only exposure measure that correlated with children's composite verbal scores (partial r(36) = 0.51, p = 0.001; adult words: partial r(36) = 0.08, p = 0.65; child utterances: partial r(36) = 0.10, p = 0.57), indicating that differences in conversational exposure relate to variance in children's language skills over and above socioeconomic disparities. Nonverbal scores were not related to any of the language exposure measures (all |r(38)| < 0.18, all p > 0.2).
Controlling for age, gender, and head motion, neither the number of adult words nor the number of child utterances was correlated with any diffusion measure in either the arcuate or SLF (all partial r(35) < 0.17, all p > 0.32). However, the number of conversational turns correlated positively with FA (arcuate: partial r(35) = 0.46, p = 0.004; SLF: partial r(35) = 0.45, p = 0.005; Figure 2) and negatively with RD (arcuate: partial r(35) = −0.34, p = 0.04; SLF: partial r(35) = −0.37, p = 0.02), but did not correlate with axial diffusivity (both |partial r(35)| < 0.07, both p > 0.70). Combined, these measures indicate that greater conversational turns correspond with greater coherence of diffusion parallel to the tract, which may be a marker of greater axonal myelination (Lebel et al., 2017). Importantly, the relationships between conversational turns and FA/RD remained significant when controlling for potential confounding variables of SES, the two other LENA measures, or composite language scores, with the exception of SLF RD when controlling for composite language scores. Specifically: controlling for SES (arcuate FA partial r(33) = 0.48, p = 0.003; SLF FA partial r(33) = 0.45, p = 0.007; arcuate RD partial r(33) = −0.37, p = 0.03; SLF RD partial r(33) = −0.36, p = 0.04); controlling for the two other LENA measures (arcuate FA partial r(33) = 0.46, p = 0.005; SLF FA partial r(33) = 0.42, p = 0.01; arcuate RD partial r(33) = −0.35, p = 0.04; SLF RD partial r(33) = −0.37, p = 0.03); and controlling for children's composite language scores (arcuate FA partial r(34) = 0.46, p = 0.005; SLF FA partial r(34) = 0.35, p = 0.038; arcuate RD partial r(34) = −0.349, p = 0.037; SLF RD partial r(34) = −0.268, p = 0.114). These findings indicate that the relations between conversational turns and SLF microstructure cannot be explained by these other child-level or environmental variables.
A node analysis was conducted to explore whether a specific sublocation within these tracts was driving observed relationships. Controlling for age, gender, motion, and SES, 25 (of 83) nodes exhibited significant correlations (FDR-corrected p < 0.05) between conversational turns and local FA; these nodes occurred in four clusters located toward both the anterior and posterior ends of the left arcuate and SLF (Fig. 3), suggesting that the strong correlations in these regions drive the relation between conversational turns and whole tract averages.
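The FDR correction applied across the 83 nodes can be illustrated with the Benjamini-Hochberg step-up procedure. The text does not name the specific FDR procedure used, so the choice of Benjamini-Hochberg here is an assumption, as are the function name and the example p values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical uncorrected p values for four nodes, not study data.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

In the node analysis the same function would be applied once to the full list of 83 p values (35 SLF nodes plus 48 arcuate nodes), so that the correction accounts for both tracts together.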
Finally, post hoc analyses aimed to ascertain the anatomical specificity of these correlations across all white matter tracts. Correlations between conversational turns and all 18 TRACULA-defined tracts revealed no significant correlations with any tracts other than left arcuate and left SLF (all FDR-corrected p > 0.2). Additionally, a whole-brain, voxelwise analysis with TBSS (controlling for age, gender, and motion) revealed that, convergent with the node analysis, the number of conversational turns was positively correlated (threshold-free cluster enhancement corrected p < 0.05) with FA in a cluster of 513 voxels at the anterior end of the left arcuate/SLF where these tracts terminate at Broca's area in the left inferior frontal gyrus (Fig. 4). Back-projection to each participant's native space confirmed that the maximally significant voxel of this cluster occurred within the TRACULA-defined bounds of the intertwining arcuate/SLF near the anterior termination.
The average FA from this cluster was extracted for mediation analyses so as to better characterize the relationship between early language experience, white matter microstructure, and language skill. Controlling for age, gender, and motion, FA in the left anterior arcuate/SLF significantly mediated the relationship between conversational turns and the composite language score (direct effect = 0.095 [95% CI = 0.022–0.169], indirect effect = 0.043 [95% CI = 0.002–0.100], indirect/total effect = 0.311), such that variation in regional FA accounted for 31% of the total relationship between language experience and language skill. Furthermore, conversational turns and FA jointly mediated the relationship between SES and language scores (direct effect = 7.134 [95% CI = 3.057–11.210], indirect effect = 3.007 [95% CI = 0.680–5.829], indirect/total effect = 0.30), indicating that combined behavioral and neural mechanisms explained nearly one-third (30%) of the socioeconomic “language gap.” FA in this region was not significantly related to nonverbal scores (r(35) = 0.117, p = 0.49).
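The mediation ratios reported above can be checked from the rounded coefficients, taking the total effect as the sum of the direct and indirect effects (which holds for linear models of this kind). The function name is hypothetical; the inputs are the rounded values from the text.

```python
def mediation_ratio(direct, indirect):
    """Proportion of the total effect (direct + indirect) carried by the indirect path."""
    return indirect / (direct + indirect)

# Rounded coefficients as reported in the text.
turns_model = mediation_ratio(direct=0.095, indirect=0.043)
ses_model = mediation_ratio(direct=7.134, indirect=3.007)
```

From these rounded inputs the ratios come to roughly 0.31 and 0.30; the small difference from the reported 0.311 in the first model is presumably due to rounding of the published coefficients.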
Discussion
These results provide the first evidence of direct association between a specific aspect of children's language experience, namely, adult-child conversational turns, and particular neuroanatomical structural properties, specifically the connectivity of the left arcuate and the left SLF. The number of adult-child conversational turns young children experienced, independent of SES, was positively correlated with the strength of coherence of two dorsal white matter tracts: the left arcuate fasciculus and the left SLF. This relationship appeared to be driven by anisotropy in a subregion near where these tracts terminate in the left inferior frontal lobe at a known hub for expressive and receptive language processing (Friederici, 2012). Mediation models revealed that microstructural properties in this region provide a neural mechanism underlying the relationship between children's conversational exposure and their language skills.
This localization is consistent with functional findings that children's language exposure is related to activation specifically in left prefrontal cortical regions (Sheridan et al., 2012; Garcia-Sierra et al., 2016; Romeo et al., 2018). Together, this suggests that “Broca's area” and adjacent pathways may be components of the perisylvian language network that are particularly sensitive to early linguistic input, especially dialogic conversation. Because the arcuate fasciculus bidirectionally connects Broca's area to primary receptive language regions in superior posterior temporal cortex, this uniquely human tract may be evolutionarily specialized for language (Rilling et al., 2008), as evidenced by correlations between language skill and structural properties of the left arcuate. Classically, damage to the arcuate fasciculus is associated with conduction aphasia (Catani and Mesulam, 2008). Further, individual microstructural variation in the absence of overt damage is related to a number of linguistic skills in childhood, including phonological knowledge and literacy skills (Yeatman et al., 2011; Saygin et al., 2013), presence or risk for developmental dyslexia (Vandermosten et al., 2012a; Langer et al., 2017; Wang et al., 2017), rate of vocabulary growth (Su et al., 2018), as well as word learning (López-Barroso et al., 2013), verbal memory (Catani et al., 2007), and speech perception (Vandermosten et al., 2012b) in adulthood. In all cases, greater coherence in the left arcuate fasciculus reflected better linguistic skills, suggesting that fast, efficient connectivity between frontal and temporal areas facilitates verbal skills throughout the lifespan. The present results further suggest that variation in early childhood language experience may underlie individual differences in neuroanatomy and behavior.
The apparent environmental influence of conversational turn-taking on left arcuate and superior longitudinal microstructure is congruent with findings that dorsal language tracts (superior longitudinal and arcuate fasciculi) develop more slowly than their ventral counterparts (inferior longitudinal, inferior-frontal-occipital, and uncinate fasciculi) (Perani et al., 2011; Brauer et al., 2013). Specifically, the terminal projection of the arcuate fasciculus at the furthest anterior point near Broca's area is the latest developing component of the dorsal pathway, which is still not fully mature at age 7 years (Brauer et al., 2013). As such, this period of protracted development in early and middle childhood may correspond to a sensitive period of neurodevelopment in which children's anterior dorsal language circuitry is highly susceptible to their environments.
The present finding that conversational exposure correlated positively with FA and negatively with RD in the left arcuate and SLF indicates greater coherence of diffusion parallel to the tracts, which is often considered a marker of greater axonal myelination (Lebel et al., 2017). Considering that myelination increases throughout childhood and early adulthood (Miller et al., 2012), these findings suggest that increased conversational exposure in early childhood might advance maturation of the anterior terminations of the dorsal language pathways important for language processing. However, longitudinal studies of children are necessary to determine precise developmental trajectories in relation to language exposure.
Localization of white-matter microstructural associations with conversational turns was specific to white matter near Broca's area, but such localization is related partially to methods and limitations of neuroimaging. No other tract or region was significantly related to conversational turns in either the TRACULA or TBSS analyses. Weaker associations would not be detected if they were below the statistical thresholds used in the present study. As in any thresholded neuroimaging study, the conclusion that white-matter microstructure near Broca's area is associated with language experience is more certain than the conclusion that no other white-matter area is more weakly associated with such exposure.
With regard to language exposure, dorsal pathway microstructure was related only to the quantity of dialogic adult-child conversational turns, and not to the sheer volume of speech spoken in the child's presence. Conversational turns incorporate social interactional features, such as contiguity (temporal connectedness), contingency (contextual relevancy), and joint attention, beyond simple linguistic features of the spoken content. The specificity of the relation between conversational turns and white-matter microstructure further supports the idea that qualitative aspects of children's early language experience, as opposed to sheer quantitative aspects, may be the largest influence on children's language development (Zimmerman et al., 2009; Rowe, 2012; Roseberry et al., 2014; Hirsh-Pasek et al., 2015). The present findings suggest that neuroanatomical maturation and concomitant language development may critically rely on social exchanges of linguistic information rather than purely passive speech exposure or child speech production in isolation. Developmental models have argued that social interaction is a necessary precursor to language acquisition, perhaps because language may rely on evolutionarily older social neurocircuitry (e.g., Kuhl, 2007; Golinkoff et al., 2015), and the present findings contribute neuroanatomical evidence in favor of such models.
A limitation of this study is the correlational nature of the analyses, which applies to nearly all studies of SES differences as well as most neuroimaging studies comparing groups of people. There is, however, behavioral evidence that experimental manipulation of children's language environment contributes to changes in their language development (Windsor et al., 2011, 2013; Suskind et al., 2016; McGillion et al., 2017; Leech et al., 2018). Further, in the absence of an intervention, the relative quantity and quality of parents' speech to children are remarkably consistent throughout early childhood (Huttenlocher et al., 2007). Thus, although we measured only a limited sample of home language, it is likely that the neural and language variation across children reflected years of differential home language experience.
Several models have addressed how early cognitive stimulation, such as language exposure, may contribute to cognitive development. Whereas some suggest that linguistic experience may uniquely contribute to language domains (e.g., Johnson et al., 2016), others argue that early language interaction may contribute to other aspects of cognition more broadly (e.g., McLaughlin et al., 2017). Although the present study did not find relationships between nonverbal cognition (operationalized as fluid reasoning, working memory, and processing speed) and either language experience or left dorsal language tracts, this does not necessarily mean that language experience solely affects verbal domains. It is possible that language exposure directly relates to other nonverbal domains, such as executive functioning or spatial reasoning. It is also possible that language exposure indirectly influences nonverbal cognition at older ages via language skills at younger ages (Noble et al., 2005, 2007). More comprehensive, longitudinal studies are necessary to tease out the direct and indirect influences of early language experience on multiple domains of cognition throughout childhood and adolescence.
The present findings highlight the specific role that conversational turns may play in a particular aspect of brain development above and beyond SES. There are multiple studies reporting correlations between SES and brain structure and function (for review, see Farah, 2017). Crucially, in the present study, the relation between conversational turns and white matter remained significant after SES was statistically controlled for. This implies that the critical environmental correlate was not SES per se, but rather conversational turns at any level of SES. Although higher SES was in general associated with more conversational turns, the apparent influence of conversational turns on white-matter microstructure occurred independent of SES.
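The logic of "statistically controlling for" SES can be illustrated with a partial correlation: both variables of interest are first residualized on SES, and the residuals are then correlated. The following is a minimal sketch under stated assumptions, not the analysis pipeline used in this study. The function names (`residualize`, `partial_corr`), the variable names, and all data are hypothetical; the values are simulated only to show the computation and do not reproduce any reported result.

```python
import numpy as np

def residualize(y, covariate):
    """Return the part of y not linearly explained by the covariate (with intercept)."""
    y = np.asarray(y, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    X = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares fit
    return y - X @ beta

def partial_corr(x, y, covariate):
    """Pearson correlation between x and y after removing the covariate from both."""
    rx = residualize(x, covariate)
    ry = residualize(y, covariate)
    return float(np.corrcoef(rx, ry)[0, 1])

# Simulated, hypothetical data (NOT the study's data): 40 children.
rng = np.random.default_rng(0)
n = 40
ses = rng.normal(size=n)                          # standardized SES composite (assumed)
turns = 0.5 * ses + rng.normal(size=n)            # conversational turns, partly SES-related
fa = 0.6 * turns + rng.normal(scale=0.5, size=n)  # tract fractional anisotropy (assumed)

r_raw = float(np.corrcoef(turns, fa)[0, 1])
r_partial = partial_corr(turns, fa, ses)
print(f"raw r = {r_raw:.2f}, partial r (SES removed) = {r_partial:.2f}")
```

If the partial correlation remains reliably different from zero, the association between turns and microstructure cannot be attributed to their shared linear relationship with SES, which is the sense in which the relation above is described as independent of SES.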
The present results may also have practical implications. Community-based intervention programs designed to close the SES “word gap” have often focused on increasing the quantity of speech that low-SES parents direct toward children (Cartmill, 2016). However, previous behavioral findings indicate that the quality of language, specifically conversational interaction, is more strongly linked to children's behavioral outcomes, and the present results build on those findings by revealing that this same quality is associated with white-matter development in children's language brain circuitry. This suggests that early intervention programs should encourage parents not only to talk to their children, but also to talk with their children to promote optimal brain development. Further research is needed to determine whether enrichment of the language environment in at-risk children could reduce the measurable socioeconomic disparities in academic achievement and brain development (Mackey et al., 2015; Noble et al., 2015; Johnson et al., 2016). More generally, the finding that more conversational turns are associated with more coherent white-matter connectivity independent of SES indicates that promoting such conversational turns may enhance structural brain development and the language abilities supported by that brain development in children from all backgrounds.
Footnotes
This work was supported by the Walton Family Foundation to M.R.W., National Institute of Child Health and Human Development F31HD086957 to R.R.R., Harvard Mind Brain Behavior Grant to R.R.R., and a gift from David Pun Chan to J.D.E.G. We thank the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research (Massachusetts Institute of Technology); Atsushi Takahashi, Steve Shannon, and Sheeba Arnold for data collection support; Kelly Halverson, Emilia Motroni, Lauren Pesta, Veronica Wheaton, and Christina Yu for assistance in administering behavioral assessments; Megumi Takada for help with data collection/organization; Hannah Grotzinger for MRI quality assurance; Matthias Goncalves for data processing assistance; and Transforming Education, John Connolly, and Glennys Sanchez from 1647 Families plus Ethan Scherer from the Boston Charter Research Collaborative for extensive recruitment support.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Rachel R. Romeo, Massachusetts Institute of Technology, Office 46-4037, 43 Vassar Street, Cambridge, MA 02139. rromeo@mit.edu