Abstract
Genetic information is involved in the gradual emergence of cortical areas since the neural tube begins to form, shaping the heterogeneous functions of neural circuits in the human brain. Informed by invasive tract-tracing measurements, the cortex exhibits marked interareal variation in connectivity profiles, revealing the heterogeneity across cortical areas. However, it remains unclear about the organizing principles possibly shared by genetics and cortical wiring to manifest the spatial heterogeneity across the cortex. Instead of considering a complex one-to-one mapping between genetic coding and interareal connectivity, we hypothesized the existence of a more efficient way that the organizing principles are embedded in genetic profiles to underpin the cortical wiring space. Leveraging vertex-wise tractography in diffusion-weighted MRI, we derived the global connectopies (GCs) in both female and male to reliably index the organizing principles of interareal connectivity variation in a low-dimensional space, which captured three dominant topographic patterns along the dorsoventral, rostrocaudal, and mediolateral axes of the cortex. More importantly, we demonstrated that the GCs converge with the gradients of a vertex-by-vertex genetic correlation matrix on the phenotype of cortical morphology and the cortex-wide spatiomolecular gradients. By diving into the genetic profiles, we found that the critical role of genes scaffolding the GCs was related to brain morphogenesis and enriched in radial glial cells before birth and excitatory neurons after birth. Taken together, our findings demonstrated the existence of a genetically determined space that encodes the interareal connectivity variation, which may give new insights into the links between cortical connections and arealization.
Significance Statement
Genetic factors have involved the gradual emergence of cortical areas since the neural tube begins to form, shaping the specialization of neural circuitry in the human brain. However, the mechanisms through which genetics encode the complex interareal connectivity remain a pivotal and unanswered question in the field of neuroscience. Here, we hypothesized that a genetically determined space encoding the interareal connectivity variation exists, which may give new insights into the links between cortical connections and arealization. We combined diffusion tractography with a dimension reduction framework to unravel the underlying global topographic principle revealed by the anatomical connections.
Introduction
Mounting evidence has suggested that the genetic effects on the anatomical phenotypes of the human brain are spatially heterogeneous (Chen et al., 2011, 2012, 2013; Huang et al., 2023), demonstrating the vital roles of genes in establishing the configuration of brain space, such as the anatomical hierarchy (Burt et al., 2018; Wang et al., 2024) and the wiring diagram (Srinivasan et al., 2012; Greig et al., 2013; Molnar et al., 2019). The profile of cortical wiring that can be characterized using neuroimaging-based tractography is especially reliable for indicating the heterogeneity across cortical areas (Passingham et al., 2002; Fan et al., 2016; Cheng et al., 2021), reflecting differentiation in local microstructures and connectivity patterns. The interareal connectivity variation has further revealed the spatially heterogeneous patterns of cortical organization, such as regional controllability related to cognitive dynamics (Gu et al., 2015; Cui et al., 2020). In contrast, the deficit in interareal connectivity has demonstrated the specificity of the behavioral manifestations of brain lesions (Thiebaut de Schotten et al., 2020; Talozzi et al., 2023). Although spatial heterogeneity has been separately found to be embedded in genetic architecture and interareal connectivity variation, it is still largely unknown if a unified organizing principle exists underlying their indication of spatial heterogeneity.
The spatiotemporal distribution of genetic factors along the developing brain has been found to set the primary cues that guide the process of cortical arealization (O'Leary and Sahara, 2008; Cadwell et al., 2019). It has been pointed out that gene-coded signaling molecules and transcriptional profiles may contribute to this guidance process, and these mechanisms may even be conserved into maturity (Hawrylycz et al., 2015; Vogel et al., 2024). Specifically, under the effect of morphogens secreted from the patterning centers, transcription factors are expressed in a graded manner and determine the areal fate and the expression of cell-surface molecules, thus determining the topographic organization of synaptic inputs and outputs (Rubenstein and Rakic, 1999; Chen et al., 2013) and directing axonal outgrowth (Chilton, 2006). Nevertheless, considering the complexity and flexibility of neural circuits, it is difficult to conceive of the existence of a one-to-one mapping between the genetic codes and the wiring patterns (Hassan and Hiesinger, 2015). Alternatively, we hypothesize that the genetic profiles distributed across the cortex may contain the organizing principles of cortical wiring (Hassan and Hiesinger, 2015). In this way, the genetic processes and the cortical wiring patterns can be systematically unified to provide a more complete understanding of the heterogeneity across the cortex. Recent advances in acquiring high-throughput transcriptomic data of the human brain (Shen et al., 2012) and large-scale twin samples (Chen et al., 2011, 2012) with neuroimaging data provide essential resources to test our hypothesis.
Furthermore, even if we can identify the genetic topography, it remains unknown whether we can leverage it to explain the interareal connectivity variation captured by high-resolution connectomics (Taylor et al., 2017; Mansour et al., 2021). Recent studies have revealed that cortical areas are hierarchically organized along the cortex from a large-scale gradient perspective of connectivity (Margulies et al., 2016; Huntenburg et al., 2018), indicating that the cortex is organized topologically within interareal connectivity variation, supporting diverse dynamics and functions (Atasoy et al., 2016; Robinson et al., 2016; Preti and Van De Ville, 2019; Chu et al., 2024). However, traditional neural circuit tracing techniques, including classical tracers or virus tracing, are precluded in humans, thereby creating the need for noninvasive measures of the topography embedded in interareal connectivity variation, which could contribute to testing the existence of unified rules underlying the genetic topography and interareal connectivity variation.
To address these open questions, we first quantified the interareal connectivity variation by establishing the similarity matrix of the structural connectivity profiles. Leveraging a manifold learning approach, we identified three components [hereafter also referred to as global connectopy (GC)] running separately along the dorsoventral, rostrocaudal, and mediolateral axes of the brain anatomy, which efficiently delineated the interareal connectivity variation. Then, we characterized the genetic topography by respectively utilizing the gradients from the twin-based genetic correlation matrix of cortical morphology and the cortex-wide gene expression matrix. Based on the identified GC and the genetic topography, we provided evidence supporting our hypothesis by demonstrating the significant consistency between them. Furthermore, we identified specific genes associated with the GCs that were involved in brain morphogenesis and enriched in radial glial cells before birth and excitatory neurons related with cortical projection circuit formation after birth. Our analytic logic is detailed in Figure 1 to provide an overview of the current study.
Overview of the analysis pipeline. We investigated the hypothesis that the existence of a common space is shared by brain connectivity and genetic profiles. a, The construction of GCs. White matter tractograms were calculated to generate a similarity matrix, i.e., a TC matrix. Diffusion map embedding was implemented on the TC matrix, resulting in low-dimensional gradients. b, Biological interpretation of the GCs. White matter bundles contributing to the GCs were analyzed (top panel). GCs were demonstrated to provide a large-scale descriptor of cortical cartography, which may give insight into cortical parcellation (Fan et al., 2016). c, Topographic axes of genetic profiles. We demonstrated a correspondence between the genetic influence on cortical morphology and GCs. We established that the three GCs are consistent with morphogen gradients in the developing brain and spatiomolecular gradients in adulthood and provided evidence that specific genes drive the formation of the cortical organization. d, Annotation of GC-associated genes. Genes associated with the GCs were identified and submitted to enrichment and development analyses.
Materials and Methods
Data and preprocessing
Data collection
We used a publicly available dataset containing 100 unrelated subjects (HCP-U100; 46 males; mean age, 29.11 ± 3.67 years; age range, 22–36 years) provided by the Human Connectome Project (HCP) database (Van Essen et al., 2013; http://www.humanconnectome.org/). All the scans and data from the individuals included in the study had passed the HCP quality control and assurance standards.
The scanning procedures and acquisition parameters were detailed in previous publications (Glasser et al., 2013). In brief, T1-weighted (T1w) images were acquired with a 3D MPRAGE sequence on a Siemens 3 T Skyra scanner equipped with a 32-channel head coil with the following parameters: TR, 2,400 ms; TE, 2.14 ms; flip angle, 8°; FOV, 224 × 320 mm2; and voxel size, 0.7 mm isotropic. Diffusion data were acquired using single-shot 2D spin-echo multiband echo planar imaging on a Siemens 3 T Skyra system (TR, 5,520 ms; TE, 89.5 ms; flip angle, 78°; FOV, 210 × 180 mm). These consisted of three shells (b = 1,000, 2,000, and 3,000 s/mm2), with 90 diffusion directions isotropically distributed among each shell and six b = 0 acquisitions within each shell, with a spatial resolution of 1.25 mm isotropic voxels.
Image preprocessing
The human T1w structural data had been preprocessed following the HCP's minimal preprocessing pipeline (Glasser et al., 2013). In brief, the processing pipeline included imaging alignment to standard volume space using FSL, automatic anatomical surface reconstruction using FreeSurfer, and registration to a group average surface template space using the multimodal surface matching algorithm (Robinson et al., 2014). Human volume data were registered to the Montreal Neurological Institute (MNI) standard space, and surface data were transformed into surface template space (fs_LR).
The diffusion images were processed using FDT (FMRIB's Diffusion Toolbox) of FSL. The main steps included normalization of the b0 image intensity across runs and correction for echoplanar imaging susceptibility, eddy current-induced distortions, gradient nonlinearities, and subject motion. DTIFIT was then used to fit a diffusion tensor model. The probability distributions of the fiber orientation distribution were estimated using Bedpostx.
Next, skull-stripped T1w images for each subject were coregistered to the subject's b0 images using FSL's FLIRT algorithm. Then, nonlinear transformations between the T1 image and the MNI structural template were obtained using FSL's FNIRT. By concatenating these, we derived bidirectional transformations between the diffusion and MNI spaces.
Connectopy mapping
Tractography and connectivity blueprints
To map the whole-brain connectivity pattern, we performed probabilistic tractography using FSL's probtrackx2 accelerated by using GPUs (Behrens et al., 2007; Hernandez-Fernandez et al., 2019). Specifically, the white surface was set as a seed region tracking to the rest of the brain with the ventricles removed and downsampled to 3 mm resolution. The pial surface was used as a stop mask to prevent streamlines from crossing sulci. Each vertex was sampled 5,000 times (5,000 trackings) based on the orientation probability model for each voxel, with a curvature threshold of 0.2, a step length of 0.5 mm, and a number of steps of 3,200. This resulted in a (whole-surface vertices) × (whole-brain voxels) matrix for further analysis. We checked the preprocessing images and tractography results of each subject to ensure the accuracy of our analysis. Specifically, the reconstructed surfaces were visually inspected, and transforms between different spaces were checked carefully. Tractography results were also visually checked to ensure that no streamlines crossed the sulci. We also calculated whole-brain connections at the ROI-wise. We performed tractography from regions to the whole brain using Schaefer-400 parcellation (Schaefer400; Schaefer et al., 2018) to get a (whole-brain ROIs) × (whole-brain voxels) matrix for further analysis.
We additionally created connectivity blueprints following previous work (Mars et al., 2018), which had been used to characterize connectivity patterns of brain regions or cortical vertices. Specifically, we first reconstructed 72 fiber bundles using a pretrained deep-learning model, TractSeg (Wasserthal et al., 2018), and downsampled the resulting tract masks to 3 mm resolution, yielding a (tracts) × (whole-brain voxels) matrix. The connectivity blueprints were then generated by the product of this tract matrix and the whole-brain connectivity matrix, which was obtained and normalized as described in the Tractography section. The columns of the resulting (tracts) × (whole-surface vertices) matrix showed the connectivity distribution pattern of each cortical vertex, while the rows revealed the cortical termination patterns of the tracts.
Tractogram covariance analysis
We calculated the tractogram covariance (TC) matrix to characterize the structural connectivity similarity profile. The vertex profiles underwent pairwise Pearson’s correlations, controlling for the average whole-cortex profile. For a given pair of vertices, i and j, TC was calculated as follows:
Connectopy decomposition
The TC matrix was transformed into a non-negative square symmetric affinity matrix using a cosine affinity kernel. Then, diffusion map embedding was implemented to identify the principal gradient components (Coifman and Lafon, 2006). Diffusion map embedding is a nonlinear manifold learning technique that maps cortical gradients (Margulies et al., 2016). Along these gradients, cortical vertices with similar connectivity profiles are embedded along the axis. Diffusion map embedding is relatively robust to noise and less computationally expensive than other nonlinear manifold learning techniques.
GCs were computed separately for the left and right hemispheres with two key parameters: α controls the influence of the density of sampling points on the manifold, and t controls the scale of the eigenvalues. Here, we set α = 0.5 and t = 0 as recommended (Margulies et al., 2016). The amount of explained variance was assessed, and the first three GCs were chosen to map onto the cortical surface for further analysis. The ROI-wise GCs were also computed in the same model.
Individual GCs were calculated for each subject and aligned to the group GC with Procrustes rotation, which matches the order and direction of the GCs without scaling. The application of Procrustes rotation allows for comparison between individuals.
Projection of the GCs onto the white matter voxels
We used general linear models (GLMs) to project the GCs onto the white matter (Tarun et al., 2020), exploring the contribution of each white matter voxel to the GCs. Specifically, we used the GCs as the dependent variable and the structural connection matrix as the independent variable. The resultant weights indicate the contribution of each white matter voxel. Note that the structural connection of each subject was obtained from probabilistic tractography in individual diffusion space. Each subject's data were registered to MNI standard space and averaged, resulting in a group-level structural connection matrix.
Projection of the connectopies onto the white matter tracts
To investigate how the GCs are related to the underlying white matter tracts, we created a tract projection map using the tract mask generated by TractSeg. Seventy-two tracts identified by TractSeg were grouped into five types for analysis: association tracts, commissural tracts, projection tracts, thalamo tracts, and striato tracts (Extended Data Table 2-1). Using the connectivity blueprints we built earlier, we extracted the projection on the surface of each tract and calculated the mean value and variance using the connectopy values of tract projection on the surface, thus revealing the relationship between the tracts and GCs. Due to the absence of projection on the surface, we excluded the fornix, superior cerebellar peduncle, middle cerebellar peduncle, and inferior cerebellar peduncle from the study. We additionally removed the Corpus Callosum—all because several of its segments were also reconstructed by TractSeg, i.e., CC 1 to CC 7.
Visualization of connectopies
To intuitively inspect the GCs together, we plotted every pair of GCs in a 2D space and assigned colors based on their positions. Then, we generated RGB values for each vertex by normalizing the three GC values and mapped them on the surface, showing the dominant GC at the vertex. Specifically, for the first gradient, the values of all vertices were normalized from zero to one, resulting in the value of the red channel. The other two were conducted the same.
Hierarchical organization analysis
We implemented hierarchical clustering using GCs to test if the GCs could provide a descriptor of arealization. At the first level, the vertices were sorted in a descending order based on the first GC. Brain regions were partitioned into two subregions according to the sign of the connectopy value. Furthermore, based on the positivity and negativity of the second and third GCs, the whole cortex was partitioned into eight modules.
Gene analysis
Genetic correlation analysis of thickness and the surface area
We selected 194 twins from the HCP database [102 monozygotic (MZ) and 92 dizygotic (DZ) pairs; mean age, 29.96 ± 2.96 years; age range, 22–36 years]. The cortical surface was reconstructed to measure cortical thickness and surface areas at each surface location using FreeSurfer in the fsaverage3 template to reduce computation time.
In a classical twin study of sets of MZ and DZ twins, four latent factors could account for the variance of any phenotype: additive genetic effects (A); nonadditive genetic effects, including dominance (D); common or shared environmental effects (C); and nonshared or individual-specific environmental effects (E). Since MZ twins are assumed to be genetically identical, their genetic correlation is perfect (r = 1.0) for both additive and nonadditive genetic effects. In contrast, DZ twins share, on average, 50% of their genes, resulting in correlations of 0.50 for additive genetic effects and 0.25 for nonadditive genetic effects. The C term refers to environmental factors that contribute to similarities between twins; thus, shared environmental factors have a correlation of 1.0 across twin pairs, regardless of zygosity. The E term represents environmental factors that create differences between twins. Since these are individual-specific factors, they are considered uncorrelated across twins.
Given the minimal difference between the A and D effects, we fitted bivariate ACE models to calculate the genetic correlations of cortical thickness and surface area between two cortical vertices. This approach allows us to explore both genetic and environmental contributions to variance and covariance. A phenotypic correlation reflects the shared variance between two traits, combining both genetic and environmental components. Specifically, it is defined as the total covariance (genetic plus environmental) between two variables, divided by the square root of the product of their respective total variances. After decomposing variance sources in the bivariate model, we computed genetic correlations, which represent the genetic covariance divided by the square root of the product of the genetic variances of the two variables. The analyses were performed using OpenMx (Neale et al., 2016). The terms of the output genetic correlation matrix represented the genetic correlation decomposed from the total phenotypic correlation between two cortical vertices.
We then implemented diffusion map embedding on the genetic correlation matrix and compared it with the GCs. Genetic patterning results were obtained from previous studies (Chen et al., 2012, 2013), which parcellated the cortex into 4 and 12 clusters. The parcellation results were compared with the modules derived from hierarchical clustering using the GCs.
AHBA data and preprocessing
Regional microarray gene expression data were obtained from six postmortem brains (one female, ages 24–57 years) provided by the Allen Human Brain Atlas (AHBA; https://human.brain-map.org/). The data were processed using the abagen toolbox (version 0.1.1; https://github.com/rmarkello/abagen; Markello et al., 2021). First, the microarray probes were reannotated (Arnatkeviciute et al., 2019), and probes that did not match a valid Entrez ID were excluded. Then, the probes were filtered based on their expression intensity relative to background noise, and the probes with the maximum summed adjacency for representing the corresponding gene expression were kept, yielding 15,633 genes corresponding to more than one probe. Finally, we resampled the output gene expression map in fs5 space to fsaverage_LR32k space for subsequent study.
Analysis of morphogen gradient-related genes
We further selected morphogen genes in AHBA data (Rakic et al., 2009) and examined their expression difference along the GCs. Each GC was subdivided equally into 10 parts, and two-sample t tests were used to assess the difference in gene expression of selected genes between the first and the last bins (Fjell et al., 2019). We also examined the differential stability (DS) of morphogen genes, which is defined as the tendency for a gene to exhibit reproducible differential expression relationships across brain structures, specifically the average Pearson’s correlation over 15 pairs of six brains in the AHBA dataset (Hawrylycz et al., 2015).
Correspondence to spatiomolecular gradients
Researchers recently reported three spatiomolecular gradients that retained the same pattern as morphogenetic gradients during development, which varied along three spatially embedded axes (Vogel et al., 2024). We only used data from the left hemisphere to reproduce these molecular gradients following the previous publication, because few samples in the AHBA were obtained from the right hemisphere.
Gene enrichment analysis
We used a prediction framework, partial least squares regression (PLSR), to determine the covariance between gene expression and GCs. We then conducted 1,000 bootstrapping to estimate the error of the weight of each gene. The normalized weight of each gene was denoted as the weight divided by the estimated error (Morgan et al., 2019).
Genes were selected as the top and bottom 0.83% genes (n = 260) of each global connectopy's gene list, corrected for both tails and all three GCs for multiple comparisons. Each gene set, combining both tails, underwent cell-type enrichment analysis on the filtered genes using CellGO (http://www.cellgo.world; Li et al., 2023). Single-cell datasets collected from prenatal (Fan et al., 2020) and postnatal sample were used (Ma et al., 2022). Seven major classes of cells were provided in the prenatal sample, including radial glial cells (RG), intermediate progenitor cells (IPC), excitatory neurons (ExN), inhibitory neurons (InN), oligodendrocyte precursor cells/oligodendrocytes (OPC/Oligo), astrocytes (Astro), and microglia cells (Microglia). Six major classes of cells were provided in the postnatal sample, including ExN, InN, Oligo, OPC, Astro, and Microglia. The enrichment p values of the cell types resulting from the submitted genes were based on the Kolmogorov–Smirnov (K–S) test in CellGO.
We collected gene expression data for all time points in the BrainSpan dataset to determine the developmental pattern of gene expression (Miller et al., 2014). Only genes associated with all three GCs were selected. Non-negative matrix factorization was used to derive the gene expression components for each macrostructure in the brain.
The filtered genes extending across the three GCs were also submitted to gene ontology enrichment analysis. ToppGene (https://toppgene.cchmc.org/), which contains the complete list of AHBA genes as the background gene set, was used to conduct the analysis. The following term categories were assessed: GO, Molecular Function; GO, Biological Process; and GO, Cellular Component, Pathway, and Disease.
Replication and robustness analyses
Replication with CHCP dataset
We randomly selected 100 age-matched subjects (59 males; mean age, 24.31 ± 2.35 years; age range, 22–35 years) from the Chinese Human Connectome Project (CHCP) dataset (Ge et al., 2023). All subjects provided written informed consent, and the protocol was approved by the Institutional Review Board at Peking University.
The scanning procedures and acquisition parameters were detailed in previous publications (Ge et al., 2023). In brief, T1w images were acquired on a 3 T Siemens Prisma scanner with the following parameters: TR, 2,400 ms; TE, 2.22 ms; flip angle, 8°; FOV, 256 × 240 mm2; and voxel size, 0.8 mm isotropic. Diffusion data were acquired with the following parameters: TR, 3,500 ms; TE, 86 ms; flip angle, 90°; FOV, 210 × 210 mm; voxel size, 1.5 mm isotropic; 14 baseline images at b = 0; 93 diffusion-weighted images at b = 1,000 s/mm2; and 92 diffusion-weighted images at b = 2,000 s/mm2.
The structural and diffusion data from the CHCP dataset were preprocessed with HCP's minimal preprocessing pipeline (Glasser et al., 2013), which was consistent with the HCP dataset. We repeated the procedure of construction of GCs on the CHCP dataset and compared the similarity with GCs in the HCP dataset.
Examine the confound effect
Complementing our main analysis on calculating GCs, we examined the effect that may be caused by age, sex, and brain size. For age, we divided subjects into three groups (age 20–25 years, n = 17; age 25–30 years, n = 40; age 30–35 years, n = 43). For sex, we divided the subjects into males (n = 46) and females (n = 54). For the brain size, we extracted each HCP subject's total intracranial volume (TIV) from the FreeSurfer output. Moreover, we divided them into five groups [TIV 1.0–1.2, n = 3; TIV 1.2–1.4, n = 12; TIV 1.4–1.6, n = 38; TIV 1.6–1.8, n = 40; TIV 1.8–2.0, n = 7; (*106 mm3)]. We calculated the correlation of the TC matrix between groups.
Examine the distance and geometry effect
We derived the geodesic distance matrices on the mid-thickness surface of the human brain, calculated the gradients, and computed the correlation with GCs to test the effect of geodesic distance. We also removed short-range connections in the TC matrices. We removed the value if the geodesic distance of the two vertices was <10, 20, and 30 mm, respectively, and calculated the GCs again.
We further followed the procedure from a previous study to obtain the cortical geometric eigenmodes (Pang et al., 2023). In brief, because the brain structure can be approximated as being constant in time, the resulting spatial and temporal dynamics can be treated separately via eigenmode decomposition. In particular, the spatial aspect satisfies the Laplacian eigenvalue problem, which is also known as the Helmholtz equation, defined in the following equation:
We derived the geometric eigenmodes of the cortical surfaces by solving the eigen-decomposition problem
Results
Three GCs in the human brain
To explore how the spatial organization of the human brain is shaped by the underlying structural connections within the white matter, we characterized the GCs across the whole brain. We first built a vertex-wise similarity matrix of structural connectivity using 100 unrelated subjects from the HCP dataset (Van Essen et al., 2013). Specifically, structural connectivity profiles were obtained from probabilistic tractography along ∼30k vertices for each hemisphere and were correlated between each pair of vertices. TC matrices were thresholded at 0 and averaged across subjects. In short, the TC matrix captured structural connectivity similarity across the whole brain.
We implemented diffusion embedding on the TC matrix, a manifold learning method previously used to capture functional gradients (Coifman and Lafon, 2006). The resultant components revealed the position of vertices along the axes with the most dominant differences in the given structural connectivity profile. The first three GCs separately showed dorsoventral, rostrocaudal, and mediolateral patterns and together accounted for 33% of the total variance (Fig. 2a).
The three GCs and the white matter tracts contribute to the GCs. a, The first three GCs run dorsoventrally, rostrocaudally, and mediolaterally and are termed GC1-DV, GC2-RC, and GC3-ML, respectively. These patterns are stable along subjects (Extended Data Fig. 2-1) and are not affected by age, sex, or brain size (Extended Data Fig. 2-2). An independent experience was also applied on CHCP dataset (Extended Data Fig. 2-3). Moreover, these three patterns are highly reproducible between individuals (Extended Data Fig. 2-4) and can be reproduced in ROI-wise (Extended Data Fig. 2-5). The GCs are beyond geodesic distance and cortical geometry (Extended Data Figs. 2-6, 2-7) and demonstrated the role of long-range connections in the formation of GCs (Extended Data Fig. 2-8). b, The three GCs are projected onto the white matter (Extended Data Fig. 2-9) and are situated by distinct sets of white matter tracts. The white matter tracts were reconstructed by TractSeg (Extended Data Table 2-1). Mean values and variances of the contributions of five types of tracts to the GCs are shown. The numerical values are showed in Figure 2-10. Each dot represents one tract. Different colors represent different types of tracts, with deeper colors indicating higher contributions. The two right columns show two types of characteristic tracts contributing to the GCs, with one type of the tract situated at one extreme of the GC while the other stretching across the GC. The first column shows the cortical projection on the surface of the tract, and the second shows the corresponding fiber bundles reconstructed by TractSeg. See Extended Data Table 2-1 and Extended Data Figures 2-1–2-10 for more details.
Figure 2-1
Stability of the three global connectopies patterns across the subjects. The correlation coefficient was greater than 0.9 when the number of subjects reached ∼10. Download Figure 2-1, TIF file.
Figure 2-2
The relationship between tractogram covariance and age, sex, and brain size. (a) We divided subjects into three groups (age 20-25: n = 17; age 25-30: n = 40; age 30-35: n = 43) and found that the tractogram covariance matrix among the three groups showed very high similarity. (b) We extracted each HCP subject's total intracranial volume (TIV) from the Freesurfer output. Furthermore, we divided into five groups (TIV 1.0-1.2: n = 3; TIV 1.2-1.4: n = 12; TIV 1.4-1.6: n = 38; TIV 1.6-1.8: n = 40; TIV 1.8-2.0: n = 7; [*106 mm3]) and calculated the tractogram covariance matrix. We found that the three groups showed very high similarity. We also calculated the correlation between the male group (n = 46) and the female group (n = 54) and found the correlation very high (r = 0.99). Download Figure 2-2, TIF file.
Figure 2-3
Global connectopies were calculated using the CHCP dataset and their correlation with global connectopies. (a) We replicated our procedure of calculating global connectopies on 100 age- and sex-matched subjects from the CHCP dataset. The scree plot describing variance is shown on the right. (b) The global connectopies pattern were highly consistent between the two datasets (left: rG1-DV = 0.89, rG2-RC = 0.85, rG3-ML = 0.93; right: rG1-DV = 0.98, rG2-RC = 0.94, rG3-ML = 0.91). Download Figure 2-3, TIF file.
Figure 2-4
Individual global connectopies were aligned after Procrustes alignment. We applied Procrustes alignment on the individual's three global connectopies, i.e., (a) GC1-DV, (b) GC2-RC, and (c) GC3-ML. The similarity among individuals is very high (all r > 0.9). Download Figure 2-4, TIF file.
Figure 2-5
Global connectopies were calculated at the ROI-wise level. (a) We performed the global connectopy calculation at the ROI-wise on human brain atlases, i.e., Schaefer-400 parcellation (Schaefer400). The first three gradients correspond to our global connectopies, showing the dorsal-ventral, rostral-caudal, and medial-lateral patterns. (b) The three global connectopies. Download Figure 2-5, TIF file.
Figure 2-6
Gradients were calculated using the geodesic distance matrix and their correlation with global connectopies. (a) We derived the geodesic distance matrices on the mid-thickness surface of the human brain and calculated the gradients, which showed dissimilar patterns with our global connectopies. A scree plot describing variance is shown on the right. (b) Correlation between geodesic distance gradient and global connectopies. Download Figure 2-6, TIF file.
Figure 2-7
Geometric eigenmodes and their correlation with global connectopies. (a) We calculated the geometric eigenmodes following Pang et al. 2023 Nature. (b) Correlation between geometric eigenmodes and global connectopies. Download Figure 2-7, TIF file.
Figure 2-8
Global connectopies are beyond geodesic distance and short-range connections. (a) Connection distance distribution of the vertex after the thresholding step. Many long-distance connections remained after thresholding. (b) Global connectopies were calculated by removing the short-range connection and its correlation with global connectopies. We removed the value if the geodesic distance of the two vertices was less than 10 mm, 20 mm, and 30 mm, respectively, and calculated the global connectopies again. We found that they showed very similar results to our original main results. Download Figure 2-8, TIF file.
Figure 2-9
Voxel's contribution to the global connectopies was calculated using the GLM method. We used general linear models (GLM) to project the global connectopies onto the white matter, exploring the contribution of each white matter voxel to the global connectopies. Specifically, we used the global connectopies as the dependent variable and the structural connection matrix as the independent variable. The resultant weights indicated the contribution of each white matter voxel. Download Figure 2-9, TIF file.
Figure 2-10
Mean value and variance of fiber contributions to the global connectopies. (a) The mean value of fiber contributions to the global connectopies. (b) Variance in fiber contributions to the global connectopies. Download Figure 2-10, TIF file.
Table 2-1
Fiber bundles reconstructed by TractSeg. Download Table 2-1, XLSX file.
More specifically, the first GC (GC 1 dorsoventral, GC1-DV), following a dorsoventral pattern, was anchored at one end by the occipital, inferior temporal, and orbitofrontal cortex and at the other end by the sensorimotor cortex. The second GC (GC 2 rostrocaudal, GC2-RC) varied along a rostrocaudal axis, radiating from the occipitoparietal cortex and ending in the prefrontal cortex. The third GC (GC 3 mediolateral, GC3-ML) showed a mediolateral pattern, with the highest expression in the lateral temporal and prefrontal cortex and the lowest in the cingulate cortex (Fig. 2a).
These patterns were stable regardless of the number of subjects considered (Extended Data Fig. 2-1) and were not affected by age, sex, or brain size (Extended Data Fig. 2-2). The findings were consistent in a replication sample from the CHCP dataset (n = 100; Extended Data Fig. 2-3; Ge et al., 2023). Individual GCs were computed for each subject and aligned to the group GC with Procrustes rotation, allowing for individual comparison. The individual GCs were highly correlated with the group GC (all r > 0.9; Extended Data Fig. 2-4). To reduce the influence of noise, we used Schaefer-400 parcellation (Schaefer et al., 2018) to calculate ROI-wise GCs. The results showed that the first three gradients corresponded to the three vertex-wise GCs (Extended Data Fig. 2-5). We further showed that GCs are beyond geodesic distance and cortical geometry (Extended Data Figs. 2-6, 2-7) and demonstrated the role of long-range connections in the formation of GCs (Extended Data Fig. 2-8). These results showed that GCs were not simply the result of spatial autocorrelation.
The three connectopies are situated by distinct sets of white matter tracts
We further explored how the underlying white matter shaped these spatial organizations. We fitted GLMs to interpolate measures from the gray matter into the white matter (Tarun et al., 2020). Specifically, we used the GCs as the dependent variable and the structural connection matrix as the independent variable, thus projecting each GC onto the white matter (Extended Data Fig. 2-9). For GC1-DV, the white matter voxels were distributed along the dorsal–ventral axis beneath the surface. Specifically, voxels near the sensorimotor cortex and occipitotemporal cortex showed values that were markedly consistent with the locations of the extremes of the first GC. Voxels corresponding to GC2-RC and GC3-ML varied from posterior to anterior and medial to lateral.
We then focused on the specific tracts related to the emergence of the GCs. We reconstructed 72 fiber bundles following TractSeg (Wasserthal et al., 2018; Table 2-1) and built group-level connectivity blueprints (Mars et al., 2018), where the rows revealed the cortical termination patterns of the tracts. We first extracted the projection on the surface of each tract and calculated the mean value and variance of connectopy values of the tract projection on the surface (Fig. 2b). We identified the tracts that were either situated at one of the extremes of the GC or were spread out along the axis. The results for the bilateral tracts were averaged between the two hemispheres and are shown in the scatterplot (Fig. 2b; detailed values are shown in Extended Data Fig. 2-10).
Two types of characteristic tracts are shown in the right columns of Figure 2b, including the thalamoprecentral tract (T PREC), striato-occipital tract (ST OCC), and thalamoparietal tract (T PAR) for GC1-DV as well as the middle longitudinal fascicle (MLF), thalamoprefrontal tract (T PREF), and superior longitudinal fascicle II (SLF II) for GC2-RC, along with the arcuate fascicle, cingulum, and MLF for GC3-ML.
For GC1-DV, T PREC, which connects the thalamus with the precentral gyrus, is related to motor task performance (Strick, 1986). T_PREC has a distinctly vertical expanding shape and showed high mean values near the dorsal end of GC1-DV, expressing the most significant contribution to the GC. ST OCC connects the striatum with the occipital cortex, showing high mean values at the ventral end and having a “stretching” impact on the GC pattern. In contrast, T PAR has a high variance within GC1-DV and appears to constrain the GC formation.
For GC2-RC, the second branch of the superior longitudinal fascicle, which mostly runs from the posterior to the anterior, starting from the middle frontal gyrus and terminating in the angular gyrus, had the greatest mean value for the white matter tract. The SLF II, which covered most of the areas in the second GC, also showed high values, providing the prefrontal cortex with parietal cortex information about the perception of visual space and providing information about working memory in the prefrontal cortex to the parietal cortex to focus spatial attention and regulate the selection of spatial information (Janelle et al., 2022). Other “high-contribution” tracts, including the MLF and T PREF, are located either in the anterior or posterior regions.
The GC3-ML showed a similar pattern to the tracts that made high contributions. The corpus callosum, the giant white matter fiber bundle consisting of millions of axonal projections, spans from the medial brain through different parts of the cortex and connects all the brain lobes. Thalamo- and striatocortical projections also extend from the subcortical nuclei to reach different lateral cortices. We suggest that the former type of tract plays a role in stretching the brain along the axis, while the latter type has a constraining effect and that these work together to shape the organization of the brain.
GCs provide a large-scale descriptor of arealization
To inspect the GCs as a unit, we normalized three connectopy values at each vertex to generate RGB values and mapped them on the surface to make lobes or subregions visible distinctly (Fig. 3a). We termed this map the global connectopic space, which quantified the topography of the dominant GC. The cortex, shown in red, green, and blue, indicates the individual regions dominated by each GC. Regions dominated by a combination of more than one GC were represented in different colors, e.g., the “yellow” region was dominated by the first and second GCs together, as shown in the cube in Figure 3a.
GCs provide a large-scale descriptor of arealization. a, The three GCs were plotted together in what we termed the global connectopic space, with connectopy values assigned to the RGB values at each vertex. The global connectopic space integrates the information of the three GCs and quantifies the topography of the dominant GC. b, Left, Several lobes or subregions clearly appear when every two GCs are plotted together in 2D space, with the extremes of the GCs clearly and continuously shown. GC1-DV and GC2-RC split the cortex into the prefrontal cortex and limbic cortex, sensorimotor cortex, and occipitotemporal cortex. GC1-DV and GC3-ML split the cortex into the frontal cortex, limbic cortex, and occipitotemporal cortex. GC2-RC and GC3-ML split the cortex into the prefrontal cortex, limbic cortex, and other regions. Right, The three GCs were plotted together in 3D space. Vertices were assigned different colors according to the signs of the three axes. c, Modules identified by hierarchical clustering using the three GCs. At each level, the brain was partitioned into two modules according to the positive and negative signs of the GC. Eight modules, including lobes or functional networks, emerged clearly. vSomaMot, ventral somatomotor cortex; dSomaMot, dorsal somatomotor cortex; lPFC, lateral prefrontal cortex; CG, cingulate gyrus; lTC, lateral temporal cortex; AG, angular gyrus; mOC, medial occipital cortex; OPC, occipital polar cortex; OFC, orbitofrontal cortex; vmPFC, ventromedial prefrontal cortex; PCC, posterior cingulate cortex.
We then plotted every pair of GCs in a 2D space and assigned colors based on the global connectopic space in Figure 3a. The first and second GCs interacted to split the whole brain into the prefrontal cortex and limbic cortex, sensorimotor cortex, and occipitotemporal cortex (Fig. 3b). Similarly, the frontal cortex, limbic cortex, and occipitotemporal cortex were shown when the first and third GCs were considered together (Fig. 3b). The prefrontal cortex, limbic cortex, and others were shown when the second and third connectopies were combined (Fig. 3b). The three GCs were also plotted together in 3D space, with vertices assigned different colors according to the signs of the three axes (Fig. 3b, right).
In order to show the modules clearly in the global connectopic space, we partitioned the brain by hierarchical clustering using the three GCs (Fig. 3c). At the first level, the brain was partitioned into two modules based on the positive and negative signs of the first GC. These two modules coincided with the division between the dorsal and ventral brain regions. Each module was further partitioned into two modules at each level by the sign of the GCs. The rostrocaudal and mediolateral patterns emerged after considering the corresponding GC. In the end, eight modules were captured: the ventral and dorsal somatomotor cortex, lateral prefrontal cortex, cingulate gyrus, lateral temporal cortex + angular gyrus, medial occipital cortex + occipital polar cortex, orbitofrontal cortex, and ventromedial prefrontal cortex + posterior cingulate cortex (Fig. 3c). Note that the parcellation guided by the three GCs is very coarse compared with several fine-grained brain atlases that have been delineated using the gradient approach (Gordon et al., 2016; Schaefer et al., 2018). We did not utilize these more fine-grained parcels because we believe they are more likely to be shaped by local gradients compared with the three GCs, which characterize the global pattern across the brain.
Potential genetic basis underlying GCs
Having established the GCs in the human brain, we first explored the relationship between the genetic influence on cortical morphology and the identified GCs. We estimated the pairwise genetic correlation between vertices on the cortex by fitting a bivariate ACE model, which revealed shared genetic influences on cortical thickness and relative area expansion between cortical vertices. The resultant genetic correlation matrix represented the genetic correlation decomposed from the total phenotypic correlation between the two cortical vertices. The first three gradients decomposed from the genetic correlation matrix of cortical thickness showed a rostral–caudal axis, a medial–lateral axis, and a dorsal–ventral axis, respectively (left, rGC1-DV, GG3-Thickness = 0.72; pspin < 0.0028, FDR corrected; rGC2-RC, GG1-Thickness = 0.91; pspin < 0.0001, FDR corrected; rGC3-ML, GG2-Thickness = 0.72; pspin < 0.0041, FDR corrected; right, rGC1-DV, GG3-Thickness = 0.73; pspin < 0.0044, FDR corrected; rGC2-RC, GG1-Thickness = 0.88; pspin < 0.0098, FDR corrected; rGC3-ML, GG2-Thickness = 0.71; pspin < 0.1566, FDR corrected; Fig. 4a,b). The first three gradients decomposed from the genetic correlation matrix of the surface area correspond with the three GCs (left, rGC1-DV, GG1-Area = 0.53; pspin < 0.1630; rGC2-RC, GG2- Area = 0.49; pspin < 0.0006, FDR corrected; rGC3-ML, GG3- Area = 0.51; pspin < 0.1031; right, rGC1-DV, GG1-Area = 0.58; pspin < 0.2131; rGC2-RC, GG2- Area = 0.46; pspin < 0.0177; rGC3-ML, GG3- Area = 0.51; pspin < 0.0812; Extended Data Fig. 4-1). We replicated the GCs using twin data from the HCP and found they show high similarity with the results derived from unrelated subjects (left, rG1-DV = 0.99; rG2-RC = 0.99, rG3-ML = 0.99; right, rG1-DV = 0.99; rG2-RC = 0.98; rG3-ML = 0.98; Extended Data Fig. 4-2). Each parcel derived from the fuzzy clustering of the genetic correlation matrix of cortical thickness (Chen et al., 2012) also showed significant overlap with one of the four parcels derived from the hierarchical clustering identified using the GCs (Fig. 4c; Extended Data Figs. 4-3a,b, 4-4a,b). The overlap between parcels derived from the hierarchical clustering and parcels from the genetic correlation of surface area is additionally shown in Extended Data Figures 4-3, c and d, and 4-4, c and d. These significant correspondence levels indicated a close genetic relatedness of the GCs.
Figure 4-4
Overlap between eight modules derived from hierarchical clustering and genetic patterning. Eight modules derived from hierarchical clustering in Figure 3c in the main text showed significant overlap with one or more parcels in genetic patterning obtained from the genetic correlation of cortical thickness and surface area. *** p < .001, ** p < .01, * p < .05. Download Figure 4-4, TIF file.
GCs correspond to genetic topography. a, The first three gradients of genetic similarity of cortical thickness. The genetic similarity matrix was calculated by fitting bivariate ACE models to compute the genetic correlations of cortical thickness between two vertices in a twin dataset (Extended Data Fig. 4-1). b, The three gradients of genetic similarity of cortical thickness show a high correlation with GCs (rGC1-DV, GG3-Thickness = 0.72; pspin < 0.0028, FDR corrected; rGC2-RC, GG1-Thickness = 0.91; pspin < 0.0001, FDR corrected; rGC3-ML, GG2-Thickness = 0.72; pspin < 0.0041, FDR corrected). The GCs can be replicated by twin data (Extended Data Fig. 4-2). c, Overlap between four modules derived from hierarchical clustering and genetic patterning of cortical thickness. Four-cluster genetic patterning was obtained by performing fuzzy clustering on the genetic similarity matrix, which corresponds to well-known brain regions (Extended Data Figs. 4-3, 4-4). ***p < 0.001; **p < 0.01; *p < 0.05. See Extended Data Figures 4-1–4-4 for more details.
Figure 4-1
Global connectopies were calculated using HCP twins and their correlation with global connectopies. (a) We replicated our procedure of calculating global connectopies on 194 HCP twins. The scree plot describing variance is shown on the right. (b) The correlation between results on twins was very high with our original main results (left: rG1-DV = 0.99, rG2-RC = 0.99, rG3-ML = 0.99; right: rG1-DV = 0.99, rG2-RC = 0.98, rG3-ML = 0.98). Download Figure 4-1, TIF file.
Figure 4-2
Gradients of genetic similarity in cortical thickness and surface area of the cerebral cortex. (a) Scree plot describing variance when calculating the gradients. (b) The first three gradients of genetic similarity of cortical thickness of the cerebral cortex. (c) The first three gradients of genetic similarity of the surface area of the cerebral cortex. Download Figure 4-2, TIF file.
Figure 4-3
Overlap between four modules derived from hierarchical clustering and genetic patterning. Four modules derived from hierarchical clustering in Figure 3c in the main text showed significant overlap with one or more parcels in genetic patterning obtained from the genetic correlation of cortical thickness and surface area. *** p < .001, ** p < .01, * p < .05. Download Figure 4-3, TIF file.
Since the morphogen gradients have been established and their pattern was conserved during the development (Vogel et al., 2024), we explored whether these genetic patterns correspond with those of the structural connectivity. We first identified the morphogen genes in AHBA data according to a previous publication (Rakic et al., 2009). These genes exhibited high DS across individuals, with 80% of morphogen genes located in the top half of DS genes (ΔBR > 0.5284; Extended Data Table 5-1), suggesting their involvement in common functionality and dysfunctionality in human brains (Hawrylycz et al., 2015). We subdivided the connectopies equally into 10 parts. Two-sample t tests were used to assess the difference in gene expression of these genes between the first and the last bins. We found that the expression of FGF17 was significantly higher in the dorsal cluster than in the ventral cluster (t = 27; p < 0.001), while the expression of FOXG1 was significantly higher in the ventral cluster (t = −25; p < 0.001). The same opposite patterns were observed between FGF8 (rostral < caudal; t = −33; p < 0.001) and PAX6 (rostral > caudal; t = 29; p < 0.001) and between SFRP1 (medial < lateral, t = −41, p < 0.001) and WNT3 (medial > lateral, t = 99, p < 0.001; Fig. 5a). The number of subdivisions does not affect the final results (Extended Data Fig. 5-1).
Consistency between GCs and morphogen gradients and spatiomolecular gradients. a, The expression of morphogen genes which mostly exhibited high DS across individuals (Extended Data Table 5-1) and followed the same pattern as any of the three GCs was significantly different between the two ends of the GCs (two-sample t test, all p < 0.001). The number of subdivisions does not affect the final results (Extended Data Fig. 5-1). b, The GCs significantly correlated with the spatiomolecular gradients, which maintained the same pattern as morphogenetic gradients during development (rGC1-DV, LV1 = 0.74; pspin < 0.0060, FDR corrected; rGC2-RC, LV2 = 0.75; pspin < 0.0012, FDR corrected; rGC3-ML, LV3 = 0.5; pspin < 0.0233, FDR corrected). ***p < 0.001. See Extended Data Table 5-1 and Extended Data Figure 5-1 for more details.
Table 5-1
Differential stability of the morphogen genes of interest. Download Table 5-1, XLSX file.
Figure 5-1
Relationship between expression of example genes with the global connectopies. We divided the connectopies and expressions of genes into bins (5, 10, and 20 bins) and calculated the correlation coefficient between these two. For GC1-DV, FOXG1 and FGF17 were selected. For GC2-RC, FGF8 and PAX6 were selected. For GC3-ML, SFRP1 and WNT3 were selected. We found a high correlation between connectopies and gene expression patterns. Download Figure 5-1, TIF file.
The above analyses showed a genetic association with morphogen axes in the developing brain. Next, we investigated whether the directional gradients of gene expression in the adult human brain, i.e., spatiomolecular gradients (Vogel et al., 2024), correlate with GCs. The spatiomolecular gradients delineated the topographic variation of gene expression spanning the adult human brain and were reported to capture previously identified rostral–caudal, dorsal–ventral, and medial–lateral axes of early developmental patterning (Vogel et al., 2024). We showed that the three GCs had significant correlations with the spatiomolecular gradients (rGC1-DV, LV1 = 0.74; pspin < 0.0060; FDR corrected; rGC2-RC, LV2 = 0.75; pspin < 0.0012; FDR corrected; rGC3-ML, LV3 = 0.5; pspin < 0.0233; FDR corrected; Fig. 5b). Note that the gradient values for the subcortex, cerebellum, and brainstem were excluded, and only data for the cortex were considered. Since the LV3 gradient showed the most variation and varied along a mediolateral and dorsoventral direction, as mentioned previously (Vogel et al., 2024), the r value was lower compared with the former two but still significant, and the correlation between the first GC was also moderate (rGC1-DV, LV3 = −0.36; pspin < 0.05). These findings again suggested a potential genetic basis underlying the GCs.
Finally, we explored the roles of genes that contributed to the GCs. We used PLSR on the AHBA data and filtered the genes with the top 5% weight for each GC after 1,000 times bootstrapping (corrected for multiple comparisons). We found that a selection of genes contributed to more than one GC, with 22 genes overlapping with all three (Fig. 6a, right). All significantly associated genes overlapped with known morphogenetic genes identified previously (Rakic et al., 2009; hypergeometric test; p < 0.001; Fig. 6b, right). The top genes in each GC, such as FARSA, MPND, SYCP2, CUX1, ARHGDIG, and NUDT14, are known to be associated with the regulation of transcription factor activity, morphogenesis, cell proliferation, and metabolic processes (Fig. 6b; Adra et al., 1998; Yang et al., 2006; Zhu et al., 2007; Sansregret and Nepveu, 2008; Heyen et al., 2009; Arikkath, 2012; Krenke et al., 2019), while genes related to all three GCs, including ADAMTSL1 and TIAM1, showed an obvious connection with various diseases (Fig. 6b; Hendee et al., 2017; Lu et al., 2022; Ru et al., 2022). This finding is consistent with the observation that disruptions in forming early developmental gradients through mutations to gradient-associated genes can cause severe developmental disorders (Flores-Sarnat and Sarnat, 2008; Vogel et al., 2024). We found these genes to be significantly enriched in radial glial cells in prenatal samples (Fig. 6c, left) and enriched in excitatory neurons after birth (Fig. 6c, right). This was verified by the developmental pattern of these genes provided by all time points in the BrainSpan dataset (Miller et al., 2014), which showed that they were highly expressed in the prenatal period in all lobes and that their expression decreased after birth (Fig. 6d), suggesting that the patterns present in adulthood were nearly established in the early stages of development. Digging deeper into specific biological processes, the gene ontology enrichment analysis showed that these genes were related to the regulation of transcription, metabolic process, morphogenesis, cellular development, and neuron projection (Fig. 6e).
Gene enrichment analysis of connectopy-associated genes. a, Distributions of connectopic weights across genes. Genes in the top 0.83% of either tail of the distribution (n = 130) were selected for further enrichment analysis. The Venn diagram showing the overlap of genes is shown on the right. b, Top genes for unique connectopy and overlap of connectopies. c, Cell-type enrichment analysis was conducted on genes belonging to each GC using prenatal and postnatal datasets. These genes are significantly enriched in radial glial cells in prenatal samples and enriched in excitatory neurons after birth. d, The development spectrum of gene expression for each brain macrostructure. Genes associated with GCs were highly expressed in the prenatal period and decreased after birth. e, Gene set enrichment across all GC-related genes indicated terms related to the regulation of transcription, metabolic process, morphogenesis, cellular development, and neuron projection.
Discussion
In the current study, we demonstrated that a low-dimensional, topological representation of brain connectivity exists and may share common space with gene expression, despite the significant disparity in the numbers of genes and connections. We systematically analyzed the spatial topography of cerebral connectivity, i.e., the white matter tractogram, and identified three orthogonal GCs, the dorsoventral, rostrocaudal, and mediolateral connectopies across the cerebral cortex, which are represented in the spatial arrangement of long-range white matter tracts. Moreover, the GCs were observed to align with the gradients of morphogens identified during embryonic brain development and with genetic topography and spatiomolecular gradients. Our findings demonstrate the crucial role that comprehending the connectivity topographies and their genetic constraints plays in understanding the underlying principles that shape human brain organization.
The three orthogonal GCs derived from the white matter closely resemble the principal axes of brain development (Takahashi et al., 2012; Vogel et al., 2024) and the chemotactic gradients of early embryogenesis (Van Haastert and Devreotes, 2004; Wu, 2005). From a developmental perspective, both the encoding of chemical gradients by morphogens and the spatial expression patterns of all genes during the formation of the mature brain, along with recent research on phenotypic mapping facilitated by twin studies (Chen et al., 2011, 2012, 2013), indicate that gradient-based organizational principles may serve as one of the fundamental rules in the precise formation of gene-encoded connections and the establishment of functional cortical regions. Indeed, genes provide the initial blueprint that guides differentiation and the directional migration of neurons through the subsequent formation of different types of gradients of morphogens, and, in parallel, the precise trajectory and projection of axons (Kast and Levitt, 2019). Given the significant disparity in the numbers of genes and connections, genes might not drive the precise details of complex circuit diagrams but have the potential to influence their organizational rules by enacting pattern formation rules to increase the probability that neurons make the correct connections (Hassan and Hiesinger, 2015; Langen et al., 2015). This simple genetic rule may underpin the global connectome organization, consistent with the argument that anatomical connections could characterize the arealization and show consistency with cortical patterns driven by genetic profiles (Cui et al., 2016; Fan et al., 2016; Fan, 2021).
Interestingly, the high correspondence between modules identified using GCs and those obtained from genetic topographies (Chen et al., 2013) may also provide evidence for the role of connections in the initial scaffolding for cortical organization. Previous research found that cortical patterning creates segregated areas with different functions via cell differentiation and migration, specifically occurring along molecular gradients representing the patterned expression of morphogens and transcription factors (Sansom and Livesey, 2009; Cadwell et al., 2019), which were found to radiate along the rostrocaudal, dorsoventral, and mediolateral axes of the neural compartment (Hoch et al., 2009). These patterns are conserved to maturity (Vogel et al., 2024) and are consistent with our GCs, suggesting a relatively conserved role of genetic profiles in shaping brain organization. Meanwhile, we are proposing an immediate goal to explore the structural connectional basis underlying shifts in macroscale functional organization during development (Dong et al., 2021, 2024).
Another verification warranted in the future work is the conservation or changes of GCs across species, as the phylogenetically conserved spatiomolecular gradients resemble the similar pattern of genetically encoded signaling pathways and macrostructural organization in various species (Valk et al., 2020; Vogel et al., 2024). A consistent phenomenon of structural connectivity organization across species would provide supporting evidence for overall neuroanatomical similarity, suggesting a basic organizing principle within genetic and connectivity profiles throughout evolution. This is of great significance to the deep exploration of neurodevelopment in humans and nonhuman animals, which could be integrated into the characterization of genetic and connectional topological patterns at the macro level.
Under the effect of morphogens secreted in the patterning centers with absolute positional information, numerous transcription factors are expressed in a graded manner. For example, the anterior expression of FGF8 suppresses the posteriorly expressed transcription factors, resulting in their lower anterior but higher posterior expression (Storm et al., 2006). Similarly, manipulation of EMX2 by genetic knock-out resulted in higher expression posteromedially and lower expression anterolaterally, which caused an expansion of the frontal and lateral regions at the expense of the visual cortex in mice (Hamasaki et al., 2004). These transcription factors are believed to determine the areal fate and expression of cell-surface molecules, thus determining the topographic organization of synaptic inputs and outputs (Rubenstein and Rakic, 1999; Chen et al., 2013) and directing axonal outgrowth (Chilton, 2006). They significantly overlapped with the genes related to the identified GCs, which exhibited distinct developmental patterns and especially enriched in radial glial cells in the prenatal period. According to the radial unit hypothesis of cortical development, radial glial cells in the embryonic brain facilitate the generation, placement, and allocation of neurons in the cortex and regulate how they wire up, by acting as a “highway” for axons to facilitate growth and ensuring proper trajectories (Nowakowski et al., 2016; Casingal et al., 2022), during which complex molecular interactions between axons and recipient cortical areas may be involved (Kast and Levitt, 2019).
Exploration of the white matter associated with the GCs is marked by the important influences exerted by older brain structures in cortical development and arealization. The thalamo- and striatocortical projections contributed to the formation of structural organization patterns, especially the first principal component, which varied along the same axis as the functionally distinct dorsal–ventral systems that evolved from two primordial brain structures, following the dual origin theory (Giaccio, 2006; Pandya et al., 2015). As an important mediator of sensory experience, the thalamus accelerates initial synaptic plasticity and forms direct topographic projections to cortical areas during development, thus influencing regional differentiation of the neocortex (Molnar and Blakemore, 1995; Park et al., 2024). Moreover, another early brain structure, the brainstem, sends ascending neural projections to the developing cortex and also routes sensory input to corresponding brain areas, helping shape anatomical development (Fritzsch et al., 2022). The brainstem houses key neuromodulatory systems, including the noradrenergic, serotonergic, and dopaminergic systems. These systems play a crucial role in influencing cortical plasticity and connectivity. Ultimately, they contribute to the functional differentiation of cortical areas (Taylor et al., 2022; Shine, 2023).
Several technical and methodological limitations must be acknowledged in the current work. The first and most direct concern is the accuracy of mapping structural connections using diffusion tensor imaging. However, diffusion tractography has been used as an irreplaceable tool to identify white matter across the brain in vivo and noninvasively, with certain limitations imposed by the tractography which can cause false-positive results (Maier-Hein et al., 2017). In the future, joint MRI and microscopy data analysis may help address these limitations (Huang et al., 2021; Axer and Amunts, 2022; Howard et al., 2023). Distance and geometry effects also need to be considered when mapping the structural connectivity pattern. Still, the consistency between genetic topography and GCs suggests that our findings are not simply due to spatial autocorrelation. Although the association between cortical geometry and brain function should be recognized (Pang et al., 2023), the interaction of structural connectivity and cortical geometry deserves further attention to reveal the function realization of the developing human brain, especially the neonate brain (Dubois et al., 2014). The underlying genetic drivers should also be considered to explore the causal relationship between them by animal studies, which will reduce the impact of mismatch of ages and other factors resulting from relating HCP young adult diffusion MRI connectopies to AHBA-based gene expression information. Although the AHBA is the densest sample transcriptomic dataset of the human brain to date, it comes with the limitations mentioned above, as well as the gap between postmortem changes in the gene expression level and in vivo features, making our results a matter of caution. With the development of RNA sequencing technology, which enables rapid profiling and deep investigation of the transcriptome, more genes could be detected and provide valuable information (Wang et al., 2009). In addition, the observed organizing principles in the current work may also be used to guide the engineering of three-dimensional cortical spheroids and to comprehend the human-specific aspects of the neural circuit assembly (Sloan et al., 2018; Miura et al., 2022).
To conclude, we explored how genes link the accurate formation of connections despite significant differences in their numbers. Our findings revealed the existence of a low-dimensional, topological representation of brain connectivity that may share a common space with gene expression, providing a consistent characterization of cortical arealization. We hope our discoveries will offer valuable insights into studying brain structure and function.
Data Availability
Data from the HCP can be downloaded at https://db.humanconnectome.org/. Data from the CHCP can be downloaded at https://www.Chinese-HCP.cn. The human gene expression data are available in the Allen Brain Atlas (https://human.brain-map.org/static/download). Data from BrainSpan can be downloaded at https://www.brainspan.org/. The HCP pipeline can be found at https://github.com/Washington-University/HCPpipelines. The neuroimaging preprocessing software used for the other datasets is freely available (FreeSurfer v6.0, http://surfer.nmr.mgh.harvard.edu/, and FSL v6.0.5, https://fsl.fmrib.ox.ac.uk/fsl/fslwiki). The brain parcellation is referenced from (http://github.com/trichtu/BAI-Net-Brain-Atlas-Individualization-Network/tree/master). The gene processing pipeline is available (abagen, https://github.com/rmarkello/abagen), and gene enrichment analysis is conducted at https://toppgene.cchmc.org/. The cell-type enrichment analysis process was conducted at http://www.cellgo.world. The brain maps were presented using BrainSpace (https://brainspace.readthedocs.io/) and Connectome Workbench v1.5.0 (https://www.humanconnectome.org/software/connectome-workbench). The tracts were visualized using BrainNet Viewer v1.7 (https://www.nitrc.org/projects/bnv/). The Python code to reproduce the analyses and figures is available at https://github.com/FANLabCASIA/GC.git (will make this public upon acceptance).
Footnotes
This work was partially supported by STI2030-Major Projects (Grant Number 2021ZD0200200), the Natural Science Foundation of China (Grant Numbers 82472061, 82072099, 82202253, 62250058), and Guangxi National Science Foundation (Grant No. 2024GXNSFBA010212). Data were provided in part by the Human Connectome Project, WU-Minn Consortium (Principal Investigators, David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 National Institutes of Health (NIH) Institutes and Centers that support the NIH Blueprint for Neuroscience Research and by the McDonnell Center for Systems Neuroscience at Washington University. We appreciate the English language and editing assistance of Rhoda E. and Edmund F. Perozzi, PhDs.
The authors declare no competing financial interests.
- Correspondence should be addressed to Lingzhong Fan at lingzhong.fan{at}ia.ac.cn or Congying Chu at congying.chu{at}ia.ac.cn.