Abstract
Local translation in neurons is partly mediated by the reactivation of stalled polysomes. Stalled polysomes may be enriched within the granule fraction, defined as the pellet of sucrose gradients used to separate polysomes from monosomes. The mechanism of how elongating ribosomes are reversibly stalled and unstalled on mRNAs is still unclear. In the present study, we characterize the ribosomes in the granule fraction using immunoblotting, cryogenic electron microscopy (cryo-EM), and ribosome profiling. We find that this fraction, isolated from 5-d-old rat brains of both sexes, is enriched in proteins implicated in stalled polysome function, such as the fragile X mental retardation protein (FMRP) and Up-frameshift mutation 1 homologue. Cryo-EM analysis of ribosomes in this fraction indicates they are stalled, mainly in the hybrid state. Ribosome profiling of this fraction reveals (1) an enrichment for footprint reads of mRNAs that interact with FMRP and are associated with stalled polysomes, (2) an abundance of footprint reads derived from mRNAs of cytoskeletal proteins implicated in neuronal development, and (3) increased ribosome occupancy on mRNAs encoding RNA binding proteins. Compared with those usually found in ribosome profiling studies, the footprint reads were longer and were mapped to reproducible peaks in the mRNAs. These peaks were enriched in motifs previously associated with mRNAs cross-linked to FMRP in vivo, independently linking the ribosomes in the granule fraction to the ribosomes associated with FMRP in the cell. The data support a model in which specific sequences in mRNAs act to stall ribosomes during translation elongation in neurons.
SIGNIFICANCE STATEMENT Neurons send mRNAs to synapses in RNA granules, where they are not translated until an appropriate stimulus is given. Here, we characterize a granule fraction obtained from sucrose gradients and show that polysomes in this fraction are stalled on consensus sequences in a specific state of translational arrest with extended ribosome-protected fragments. This finding greatly increases our understanding of how neurons use specialized mechanisms to regulate translation and suggests that many studies on neuronal translation may need to be re-evaluated to include the large fraction of neuronal polysomes found in the pellet of sucrose gradients used to isolate polysomes.
Introduction
In neurons, the local translation of mRNAs at distal synaptic sites is essential for neuronal development (Cioni et al., 2018), maintaining the local proteome (Glock et al., 2017), homeostasis of excitability (Mori et al., 2019), and synaptic plasticity (Sossin and Costa-Mattioli, 2019). Local translation requires the transport of mRNAs from the soma to distal sites in a translationally repressed state, followed by their reactivation, either when the mRNA reaches its correct location or after an appropriate stimulus (Anadolu and Sossin, 2021). Two major forms of mRNA transport have been defined in neurons, the transport of mRNAs that are repressed at translation initiation and the transport of mRNAs for which elongation has begun but then stalled (Sossin and DesGroseillers, 2006). The mRNAs repressed before elongation lack large ribosomal subunits and are often transported in dedicated mRNA transport particles. In contrast, the mRNAs repressed at elongation are transported as stalled polysomes in RNA granules (Krichevsky and Kosik, 2001; Sossin and DesGroseillers, 2006). The local reactivation of translation from mRNAs transported as stalled polysomes can be distinguished from mRNAs blocked at initiation using drugs such as homoharringtonine that specifically block the first step of translation elongation, thus blocking translation from mRNA transport particles but not RNA granules (Anadolu and Sossin, 2021). Using this tool, several physiological processes have been shown to be supported by initiation-inhibitor resistant protein synthesis, such as the local production of microtubule-associated protein 1B (Map1B) and the induction of a form of long-term depression (LTD) stimulated by the activation of metabotropic glutamate receptors (mGluRs; mGluR-LTD) in vertebrates, as well as the induction of a type of intermediate term synaptic plasticity in the invertebrate model system Aplysia californica (Graber et al., 2013; McCamphill et al., 2015; Graber et al., 2017).
Stalled polysomes are transported in neuronal RNA granules, which are large liquid–liquid phase separated structures (Anadolu and Sossin, 2021). These were first described in oligodendrocytes, where they transported myelin basic protein mRNA to myelin synthesis sites (Barbarese et al., 1995). The term neuronal RNA granule was first used to describe a sedimented fraction containing ribosomes and repressed mRNAs lacking initiation factors (Krichevsky and Kosik, 2001). These large collections of ribosomes can be separated from normal polysomes based on their sedimentation in sucrose gradients (Krichevsky and Kosik, 2001; Kanai et al., 2004; Aschrafi et al., 2005; Elvira et al., 2006; El Fatimy et al., 2016). The proteomic characterization of these structures is consistent with the possibility that they are stalled polysomes (Kanai et al., 2004; Elvira et al., 2006; El Fatimy et al., 2016). Indeed, they are enriched in mRNAs such as Map1B that undergo initiation-inhibitor resistant protein synthesis (El Fatimy et al., 2016).
Stalled polysomes may also be necessary for neuronal development and the regulation of developmentally expressed mRNAs through association with the fragile X mental retardation protein (FMRP), a protein that when lost results in the neurodevelopmental disorder fragile X syndrome (Garber et al., 2008). UV cross-linking of FMRP to mRNA in neurons showed that FMRP was mainly associated with the coding region of mRNAs (Darnell et al., 2011), consistent with the possibility of FMRP associating with ribosomes. Indeed, FMRP has been shown to be associated with stalled ribosomes in several studies (Ceman et al., 2003; Aschrafi et al., 2005; Darnell et al., 2011; El Fatimy et al., 2016; Shah et al., 2020). Several mRNA sequences enriched in the regions of mRNAs cross-linked with FMRP have been identified (Ascano et al., 2012; Anderson et al., 2016) suggesting that specific mRNA sequences may be necessary for determining which mRNAs are recruited to FMRP-containing stalled polysomes. FMRP was also shown to be enriched in RNA granules (El Fatimy et al., 2016), consistent with the idea that the RNA granule fraction contains stalled polysomes.
The mechanism for stalling polysomes during elongation in neurons is unknown. The nonsense-mediated decay factor Up-frameshift mutation 1 (UPF1) homologue was implicated in this process. Decreasing levels of UPF1 reduced initiation-inhibitor resistant protein synthesis, the local production of Map1B, and disrupted the induction of mGluR-LTD (Graber et al., 2017). As UPF1 is known to be attracted to ribosomes when they reach the stop codon through association with eukaryotic release factors (Kim and Maquat, 2019), this suggests that stalled mRNAs within RNA granules may be blocked at the release step of translation termination. To test this, we used ribosome profiling, also known as ribosome footprinting, to elucidate the position of ribosomes on mRNAs through sequencing the ribosome-protected fragments (footprint reads) after nuclease treatment (Ingolia, 2014). We took advantage of the enrichment of RNA granules in the pellets of sucrose gradients (El Fatimy et al., 2016) to create an enriched granule fraction and identified the sites occupied by ribosomes in this fraction. Cryogenic electron microscopy (cryo-EM) analysis of the ribosomes in the pellet revealed that most of the ribosomes were stalled in the hybrid position and were loaded with two tRNA molecules, one in the A/P configuration and the second in a P/E configuration. The footprint reads from these ribosomes were larger than expected and produced reproducible peaks, highly enriched in motifs previously associated with FMRP target mRNAs (Ascano et al., 2012; Anderson et al., 2016), and consensus sites for m6A modifications in the brain (Zhang et al., 2018). Contrary to our prediction, the footprint reads were not enriched at the stop codon but were slightly biased toward the first half of the open reading frame of transcripts. 
We propose a stochastic model in which stalling of ribosomes at specific motifs in mRNAs in neurons attracts FMRP, starting a process of RNA granule assembly that allows stalled ribosomes to be transported to distal sites in neurons, where they will later be reactivated for local translation.
Materials and Methods
Reagents used
Antibodies used
The following antibodies were used: rabbit anti-S6 (catalog #2217, Cell Signaling Technology), rabbit anti-FMRP (catalog #4317, Cell Signaling Technology), rabbit anti-eEF2 (catalog #2332S, Cell Signaling Technology), rabbit anti-Upf1 (catalog #ab133564, Abcam), mouse anti-Stau2 (catalog #MM0037-P, MediMabs), rabbit anti-PurA (catalog #ab79936, Abcam), mouse anti-hnRNPA2B1 (catalog #NB120-6102, Novus Bio), mouse anti-SMN (catalog #NB100-1936, Novus Bio), rabbit anti-IGF2BP1 (ZBP1; catalog #NBP2-38956, Novus Bio), rabbit anti-TIA1 (catalog #12133-2-AP, Proteintech), rabbit anti-eIF4E (catalog #9742, Cell Signaling Technology), mouse anti-G3BP (catalog #H00010146-M01, Abnova), anti-PNRC2 (catalog #NBP-1-74252, Novus Bio), anti-YT521-B homology domain containing protein (YTHDF1; catalog #17479-1-AP, Proteintech), anti-phosphoS6 (catalog #4858T, Cell Signaling Technology), and HRP-conjugated secondary antibodies (catalog #31430 and #31460, Thermo Fisher Scientific).
Enzymes used
The enzymes used were RNase I (catalog #AM2294, Ambion), SuperaseIN (20 U/µl; catalog #AM2696, Invitrogen) and T4 polynucleotide kinase (catalog #M0201, New England Biolabs).
Kits used
The following kits were used: enhanced chemiluminescence kit (catalog #NEL105001EA, PerkinElmer), Western Blot Stripping Buffer (catalog #S208070, ZmTech Scientifique), Ribo-Zero Gold (Human/Mouse/Rat) Kit (Illumina), NEBNext rRNA Depletion Kit (Human/Mouse/Rat; catalog #E6350, New England Biolabs), NEXTflex Small RNA Sequencing Kit version 3 (catalog #NOVA-5132-06, PerkinElmer), and RiboCop rRNA Depletion Kit HMR V2 (catalog #144, Lexogen).
Specialized equipment included the following: Bio-Rad ChemiDoc digital imager, Biocomp Gradient Master, Agilent Small RNA chip (Agilent Technologies), NovaSeq S1/2 flow cells, Tecnai F20 electron microscope, Gatan Ultrascan 4000 4 k × 4 k CCD Camera System, Density Gradient Fractionation System (BR-188, Brandel) with Syringe Pump (Syr-101, Brandel), and Fraction Collector (Foxy R1, Teledyne ISCO) with UA-6 Absorbance Detector (Brandel).
Biological and computational resources
Sprague Dawley rats were purchased from Charles River Laboratories. For computational resources, see below, Ribosome sequencing and RNA sequencing data analysis.
Statistical analysis
To determine significant enrichment and abundances of particular sets of mRNAs in the stalled polysomes compared with the overall mRNAs, t tests with Bonferroni correction were used. The significance of motifs and Gene Ontology (GO) terms was determined by the programs used for these tasks (HOMER, gProfileR) and by edgeR (Robinson et al., 2010) (see below, Ribosome sequencing and RNA sequencing data analysis).
Purification of RNA granules
RNA granules were purified from whole-brain homogenate harvested from postnatal day (P)5 Sprague Dawley rats of both sexes, using a protocol adapted from a previous study (El Fatimy et al., 2016). Five P5 rat brains were homogenized in RNA Granule Buffer [20 mM Tris-HCl, pH 7.4 (catalog #BP152-1, Thermo Fisher Scientific), 150 mM NaCl (catalog #BP358-212, Thermo Fisher Scientific), and 2.5 mM MgCl2 (catalog #M33-500, Thermo Fisher Scientific)] supplemented with 1 mM DTT (catalog #D9163, Sigma-Aldrich), 1 mM EGTA (catalog #E8145, Sigma-Aldrich), and EDTA-free protease inhibitor (catalog #04693132001, Roche). Note that cycloheximide (catalog #ab120093, Abcam) was not added to the homogenization buffer unless explicitly stated. Homogenate was centrifuged for 15 min in a Thermo Fisher Scientific T865 fixed-angle rotor at 6117 × g at 4°C to spin down cellular debris. The supernatant was treated with 1% IGEPAL CA-630 (catalog #I8896, Sigma-Aldrich) for 5 min at 4°C on a rocker. The sample was then loaded onto a 2 ml 60% sucrose (catalog #8550, Calbiochem) cushion (dissolved in supplemented RNA Granule Buffer) in a Sorvall 36 ml tube (Kendro, catalog #3141, Thermo Fisher Scientific), filled to the top with additional RNA Granule Buffer, and centrifuged for 2 h in a Thermo Fisher Scientific AH-629 swing-bucket rotor at 56,660 × g at 4°C to obtain the polysome pellet. The pellet was resuspended in RNA Granule Buffer, gently dounced, and loaded over a 15–60% linear sucrose gradient (made with RNA Granule Buffer) that was prepared in advance using a gradient maker (Biocomp Gradient Master) and centrifuged for 45 min at 56,660 × g at 4°C in a Thermo Fisher Scientific AH-629 swing-bucket rotor. Fractions of 3.5 ml were then collected from the top, and the remaining pellet was rinsed once and then resuspended using RNA Granule Buffer.
For experiments measuring UV absorbance, sucrose gradients were centrifuged using an SW40 rotor (Beckman Coulter) and fractionated using an ISCO density gradient fractionation system; optical density was continuously recorded at 254 nm, and fractions were collected with a Foxy Jr Fractionator (Teledyne ISCO). For some experiments, such as for electron microscopy, the resuspended pellet was used directly or treated with salt and nuclease to break up the ribosome clusters into monosomes (see below). For some experiments, such as ribosome footprinting (see below, Nuclease and salt treatments), the fractions were precipitated overnight at −20°C by adding two volumes of chilled 100% ethanol. The precipitated samples were then centrifuged for 45 min at 2177 × g at 4°C in an Eppendorf 5810/5810 swing-bucket rotor before collection using RNA Granule Buffer.
Nuclease and salt treatments
We treated the pellet fraction with salt and nuclease to break up the polysomes in the RNA granule into monosomes for ribosome footprint analysis. The pellet was incubated with RNA Granule Buffer containing 400 mM NaCl for 10 min at 4°C on a rocker (El Fatimy et al., 2016). Before nuclease treatment, the NaCl concentration was reduced to 150 mM by diluting the sample with a NaCl-free RNA Granule Buffer. The sample was then treated with 100 U of Ambion RNase I (100 U/µl; catalog #AM2294, Thermo Fisher Scientific) for 30 min at 4°C on a rocker. The nuclease was quenched with 100 U of Invitrogen SuperaseIN (20 U/µl; catalog #AM2696, Thermo Fisher Scientific), and the samples were rerun on a fresh 15–60% sucrose gradient to separate monosomes. Fraction 2 was precipitated overnight at −20°C by adding 7 ml of chilled 100% ethanol. The precipitated samples were then centrifuged for 45 min at 2177 × g at 4°C in an Eppendorf 5810/5810 swing bucket rotor, and pellets were resuspended in RNA Granule Buffer. For ribosome profiling (see below, Footprint sequencing library construction and sequencing), the Fraction 2 pellets were stored in isopropanol before analysis.
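The dilution step from the high-salt incubation back to the nuclease-compatible salt concentration follows from conservation of solute (C1 × V1 = C2 × V2). The sketch below is only illustrative: the sample volume is a hypothetical value that is not given in the protocol, and it assumes that volumes are additive and that the diluent contains no NaCl.

```python
def diluent_volume(sample_volume_ul, start_conc, target_conc):
    """Volume of salt-free buffer to add so that start_conc drops to target_conc.

    Concentrations only need to share the same unit (here mM NaCl).
    """
    if not 0 < target_conc <= start_conc:
        raise ValueError("target must be positive and not exceed the start concentration")
    # C1 * V1 = C2 * (V1 + V_add)  =>  V_add = V1 * (C1 / C2 - 1)
    return sample_volume_ul * (start_conc / target_conc - 1)


# Hypothetical example: 300 µl of resuspended pellet in 400 mM NaCl buffer
sample_ul = 300
added_ul = diluent_volume(sample_ul, start_conc=400, target_conc=150)
final_conc = 400 * sample_ul / (sample_ul + added_ul)
print(f"add {added_ul:.0f} µl NaCl-free buffer; final NaCl = {final_conc:.0f} mM")
```

For any starting volume, the required volume of NaCl-free buffer is 400/150 − 1 ≈ 1.67 times the sample volume.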
Transmission and cryo-electron microscopy
In experiments where samples were imaged by negative staining, the untreated pellet fraction and polysome fraction, the pellet fractions after treatment with either high-salt or nuclease or both, or the monosome fraction after sucrose gradients were deposited on the EM grids. The ribosome concentration of each sample was adjusted to ∼80 ng/µl (∼25 nM) using RNA Granule Buffer before applying them to the grids. In the case of the sample treated with both nuclease and high salt, the concentration of the sample applied to the grid was 9.2 ng/µl (2.9 nM). For these experiments, we used 400 mesh copper grids freshly coated with a continuous layer of thin carbon. Grids were glow discharged at 15 mA for 15 s and then floated on a 5 µl drop of the diluted sample for 2 min. Excess sample was blotted away with filter paper (Whatman #1), and the grids were then stained by floating them on a 5 µl drop of a 1% uranyl acetate solution for 1 min. Excess stain was blotted away, and the grids were dried in air and stored in regular grid boxes. The EM images were acquired on a Tecnai F20 electron microscope operated at 200 kV using a room temperature side entry holder. Images were collected on a Gatan Ultrascan 4000 4 k × 4 k CCD Camera System Model 895 at a nominal magnification of 60,000×. Images produced by this camera had a calibrated pixel size of 1.8 Å/pixel. The total electron dose per image was ∼50 e⁻/Å². Images were collected using a defocus of approximately −2.7 μm. Images were prepared for figures using the Adobe Photoshop program.
For samples imaged by cryo-EM, the pellet fraction was treated with nuclease before being deposited on the cryo-EM grids. The ribosome concentration in the sample applied to the grid was 160 nM. Cryo-EM grids (c-flat CF-2/2-2C-T) used for these samples were washed in chloroform for 2 h and glow discharged in air at 15 mA for 20 s. A volume of 3.6 µl was applied to the grid before vitrification in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific). The Vitrobot parameters used for vitrification were a blotting time of 3 s and a blot force of +1. The Vitrobot chamber was set to 25°C and 100% relative humidity.
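The paired mass and molar concentrations quoted above for the negative-stain samples can be cross-checked with a short unit conversion. The sketch below only uses the two stated pairs; the molar mass it prints is the value implied by those numbers, not a figure reported in the text, and the function names are illustrative.

```python
def implied_molar_mass(ng_per_ul, nanomolar):
    """Molar mass (g/mol) implied by a mass concentration and a molar concentration."""
    grams_per_litre = ng_per_ul * 1e-3   # 1 ng/µl equals 1e-3 g/L
    mol_per_litre = nanomolar * 1e-9     # 1 nM equals 1e-9 mol/L
    return grams_per_litre / mol_per_litre


def to_nanomolar(ng_per_ul, molar_mass):
    """Convert a mass concentration (ng/µl) to nM for a given molar mass (g/mol)."""
    return ng_per_ul * 1e-3 / molar_mass * 1e9


mass = implied_molar_mass(80, 25)   # from the ∼80 ng/µl, ∼25 nanomolar pair
print(f"implied molar mass: {mass / 1e6:.1f} MDa")
print(f"9.2 ng/µl corresponds to {to_nanomolar(9.2, mass):.1f} nM")
```

Under this implied molar mass, the 9.2 ng/µl sample works out to about 2.9 nanomolar, in agreement with the second stated pair.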
Cryo-EM datasets were collected at FEMR-McGill using a Titan Krios microscope at 300 kV equipped with a Gatan BioQuantum K3 direct electron detector. The software used for data collection was SerialEM (Schorb et al., 2019). Images were collected in counting mode according to the parameters described in Table 1.
Data acquisition parameters for cryo-EM dataset
To calculate the cryo-EM structures, cryo-EM movies obtained in the Titan Krios were corrected for beam-induced motion using the RELION implementation of the MotionCor2 algorithm (Zheng et al., 2017). Contrast Transfer Function (CTF) parameter estimation was done using the CTFFIND-4.1 program (Rohou and Grigorieff, 2015). The remaining processing steps were done using RELION-3 (Zivanov et al., 2018). Individual particles in the images were identified using autopicking. These particles were extracted and subjected to two cycles of reference-free 2D classification to remove false-positives from the autopicking process. The cleaned dataset was subjected to three layers of 3D classification. In each layer, each class was split into three new classes. The initial 3D reference used for the 3D classification was a 60 Å low-pass-filtered map of the 80S human ribosome created from Electron Microscopy Data Bank (EMD)-2938 (Khatter et al., 2015). The 3D classifications did not use masks. To speed up the computer calculations, the 2D and 3D classifications were performed using particle images binned by four. However, we used full-size images (1.09 Å/pixel) in the refinement steps. Particles assigned to maps representing the same conformation were pooled together and used for 3D autorefine. The numbers of particles assigned to each class and included in each refinement are described below. Refinement was performed in four stages: In the first stage, the 3D autorefine was performed without a mask. The resulting map was used to create a mask and was also used as the initial reference for a second stage refinement that included one additional 3D autorefine cycle. Masks were created with the relion_mask_create command extending the binary mask by four pixels and creating a soft edge with a width of four pixels. The initial threshold for binarization of the mask varied depending on the structure.
In the third step, we used the output of the last 3D autorefine job as the input for CTF refinement. In this process, we selected Estimate (anisotropic) magnification as yes in the first cycle. In the second cycle of CTF refinement, we selected Perform CTF parameter fitting as Yes and selected Fit defocus as Per-particle and Fit astigmatism as Per-micrograph. We selected No for Fit B-factor, Fit phase-shift, Estimate trefoil, and Estimate fourth order aberrations. However, we selected Yes for Estimate beam tilt, as our data were collected using this approach. We also selected Yes for Estimate trefoil and Estimate fourth order aberrations. In the fourth step, we used the output from the CTF refinement job to perform Bayesian polishing to correct for per-particle beam-induced motion before subjecting these particles to the last cycle of 3D autorefine. Bayesian polishing was performed using sigma values of 0.2 Å/dose, 5000 Å and 2 Å/dose for velocity, divergence and acceleration, respectively. The map from the last cycle of the 3D autorefine (80S consensus map) was used to perform multibody refinement by dividing the 80S structure into two major bodies (body1, 60S; body2, 40S). Sharpening of the final cryo-EM maps was done with RELION (Zivanov et al., 2018). The average resolution for the 80S consensus maps and for those obtained for each one of the subunits through the multibody refinement was estimated by gold-standard Fourier shell correlation (FSC). Resolution estimation is reported using an FSC threshold value of 0.143. Local resolution analysis in the 80S consensus and composite maps was done with RELION (Zivanov et al., 2018). In each class, the composite map of the 40S and 60S subunits was obtained from the refinement of the 40S and 60S subunits through multibody refinement and subsequently merging the two subunits into a single map using the vop add command in Chimera. 
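As an illustration of how a resolution value is read from a gold-standard FSC curve at the 0.143 threshold, the sketch below interpolates the spatial frequency at which the curve first drops below the threshold and reports its reciprocal. It is not the RELION implementation; the function name and the toy curve are hypothetical.

```python
def resolution_at_fsc(freqs, fsc, threshold=0.143):
    """Resolution (Å) where the FSC curve first falls below the threshold.

    freqs: spatial frequencies in 1/Å, in ascending order.
    fsc:   FSC values at those frequencies.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells that bracket the threshold
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None


# Toy FSC curve (hypothetical values, for illustration only)
toy_freqs = [0.05, 0.10, 0.20, 0.30, 0.40]
toy_fsc = [1.00, 0.98, 0.80, 0.243, 0.043]
print(f"estimated resolution: {resolution_at_fsc(toy_freqs, toy_fsc):.2f} Å")
```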
Cryo-EM map visualization was done with UCSF Chimera (Pettersen et al., 2004) and ChimeraX (Goddard et al., 2018; Pettersen et al., 2021).
Immunoblotting and quantification of enrichment
For immunoblotting, SDS sample buffer was added to each sample before loading onto a 10% or 12% acrylamide gel. The resolved proteins were either stained with Coomassie Brilliant Blue (catalog #821616, Thermo Fisher Scientific) or transferred onto a 0.45 μm nitrocellulose membrane (catalog #1620115, Bio-Rad Laboratories) for immunoblotting. The membranes were blocked with 5% BSA (catalog #9647, Sigma-Aldrich) in Tris-buffered saline with Tween (TBS-T; catalog #BP152-1, Thermo Fisher Scientific; NaCl, catalog #BP358-212, Thermo Fisher Scientific; Tween, catalog #BP337, Thermo Fisher Scientific) before incubation with primary antibodies (see above, Reagents), all at 1:1000 dilution. Membranes were washed with TBS-T after incubation. Detection was done using HRP-conjugated secondary antibodies (catalog #31430 and #31460, Thermo Fisher Scientific) followed by ECL (catalog #NEL105001EA, PerkinElmer) reaction and imaging using the Bio-Rad ChemiDoc digital imager. For quantification of RNA binding protein (RBP) enrichment, membranes were stripped with a Western Blot Stripping Buffer (catalog #S208070, ZmTech Scientifique) and reprobed with rabbit anti-S6 (catalog #2217, Cell Signaling Technology) antibody, followed by detection with HRP-conjugated secondary antibodies (catalog #31430 and #31460, Thermo Fisher Scientific).
Quantification of signal intensity was done using ImageJ software. We selected full-lane ROIs and quantified single bands corresponding to the observed kilodalton size of each protein. We then used the corresponding anti-S6 signal intensity to quantify the amount of examined protein per S6 ribosomal protein for each fraction. For each experiment, the protein/S6 value was normalized to the protein/S6 value from the starting material. Following this normalization, biological replicates were averaged. For salt and nuclease experiments, we calculated the proportion of Granule Fraction Polysomes (Pellet) digested into monosomes (Fraction 2) by doing the following calculation: Fraction 2/(Fraction 2 + Pellet).
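The normalization described above can be written out as a short calculation. The sketch below is a minimal illustration with hypothetical band intensities; the function names are not from the original analysis, which was done from ImageJ measurements.

```python
def fold_enrichment(protein, s6, protein_start, s6_start):
    """Protein/S6 ratio in a fraction, normalized to the same ratio in the starting material."""
    return (protein / s6) / (protein_start / s6_start)


def fraction_digested(fraction2, pellet):
    """Proportion of granule-fraction polysomes converted to monosomes: F2 / (F2 + Pellet)."""
    return fraction2 / (fraction2 + pellet)


def mean(values):
    return sum(values) / len(values)


# Hypothetical intensities per biological replicate:
# (protein in GF, S6 in GF, protein in starting material, S6 in starting material)
replicates = [
    (1200, 400, 500, 500),
    (900, 450, 600, 600),
]
per_replicate = [fold_enrichment(*r) for r in replicates]
print("fold enrichment per replicate:", per_replicate)
print("mean fold enrichment:", mean(per_replicate))
print("proportion digested:", fraction_digested(fraction2=300, pellet=100))
```

Normalizing within each experiment before averaging keeps blot-to-blot differences in exposure from dominating the mean.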
Footprint sequencing library construction and sequencing
Fraction 2 from the monosome purification was centrifuged at 4444 × g at 4°C for 60 min, the isopropanol was removed, and the pellet was air-dried and resuspended in 10 mM Tris-HCl, pH 7. We then proceeded with a standard hot phenol-acid extraction (Acid Phenol/Chloroform mix 125:24:1; catalog #AM9722, Invitrogen). The samples were resuspended in nuclease-free water and quantified on a spectrophotometer. All samples were depleted of ribosomal RNA using the Ribo-Zero Gold (Human/Mouse/Rat) Kit [Illumina; granule fraction (GF) replicates 1–3 and GF cycloheximide treated 1–3], the NEBNext rRNA Depletion Kit (Human/Mouse/Rat; catalog #E6350, New England Biolabs; GF replicates 4–5, total polysomes 1–4, and no salt treatment 1–2), or the RiboCop rRNA Depletion Kit HMR V2 [GF replicates 6–8 and polysome fraction (PF) 1–3]. The last three GF replicates (6–8) were used only for the size distribution analysis and for the comparison of GF to PF, as they came from the same purifications as the PF samples; they were not included in the other analyses, which used GF replicates 1–5. The footprint samples were size selected using the 17 and 34 nt markers as a guide on a 15% TBE-Urea polyacrylamide gel (Thermo Fisher Scientific), whereas total RNA samples were randomly heat fragmented by incubating at 95°C in an alkaline fragmentation solution (50 mM NaCO3, pH 9.2, 2 mM EDTA) for 40 min, yielding fragments comparable in size to footprints. All samples were dephosphorylated using PNK (T4, New England Biolabs). The quality and concentration of samples were assessed by running an Agilent Small RNA chip, and sequencing libraries were generated using the NEXTflex Small RNA Sequencing Kit v3 (catalog #NOVA-5132-06, PerkinElmer), according to instructions from the manufacturer. Samples were balanced and pooled for sequencing with Edinburgh Genomics or the McGill University Genome Center on NovaSeq S1/2 flow cells yielding 50 (Edinburgh) or 100 (McGill) bp paired-end reads.
Ribosome sequencing and RNA sequencing data analysis
Adaptor sequences and low-quality score containing bases (Phred score < 30) were trimmed from reads using Cutadapt version 2.8 (-j 8 -u 4 -u -4 -Z -o; Martin, 2011). Noncoding RNAs were removed by custom scripts following mapping these contaminant reads using Bowtie2 version 2.3.5 (--phred33 --very-sensitive; Langmead and Salzberg, 2012). The unmapped reads were aligned to reference rat genome (Rnor_6.0.94) using STAR version 2.7.3a (Dobin et al., 2013) with options previously described (Biever et al., 2020; --twopassMode Basic --twopass1readsN -1 --seedSearchStartLmax 15 --outSJfilterOverhangMin 15 8 8 8 --outFilterMismatchNoverReadLmax 0.1). QuantMode with STAR was used to obtain genomic and transcript coordinates (Dobin et al., 2013). Assigning ribosome protected reads to genomic features [coding sequence (CDS), UTRs] was based on genome annotation (Rnor_6.0.94). Only a single transcript isoform, with the highest APPRIS score (Rodriguez et al., 2013), was considered per gene. All sequence data analyzed during the study are publicly accessible through the National Center for Biotechnology Information (BioProject ID PRJNA931294).
Raw counts obtained using featureCounts version 2.2.0 (Liao et al., 2014) were analyzed using the R package edgeR (Robinson et al., 2010). From this package, transcript abundance was obtained for ribosome sequencing (riboseq) and RNA sequencing (RNAseq) data in reads per kilobase of transcript per million mapped reads (RPKM). Ribosomal occupancy was determined for each transcript by dividing riboseq RPKM by RNAseq RPKM. False discovery rates (FDRs) and p values were determined for ribosomal occupancy in the package, and nominal p values were corrected for multiple testing using the Benjamini–Hochberg method. For enrichment analysis, Trimmed Mean of M-values (TMM) normalization as implemented in edgeR was used; adjusted p values and FDR were calculated as described in the package (Robinson et al., 2010).
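The quantities named in this paragraph can be illustrated with a simplified calculation. The sketch below is not the edgeR implementation (which also applies library-size normalization such as TMM); it shows a plain RPKM, the occupancy ratio, and a textbook Benjamini–Hochberg adjustment, using hypothetical counts and p values.

```python
def rpkm(count, length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (length_nt * total_mapped_reads)


def ribosome_occupancy(ribo_rpkm, rna_rpkm):
    """Riboseq RPKM divided by RNAseq RPKM for one transcript."""
    return ribo_rpkm / rna_rpkm


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical transcript: 2000 nt, 500 footprint reads and 250 RNAseq reads,
# each library with 10 million mapped reads
ribo = rpkm(500, 2000, 10_000_000)
rna = rpkm(250, 2000, 10_000_000)
print("occupancy:", ribosome_occupancy(ribo, rna))
print("BH-adjusted:", benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```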
GO enrichment analysis was performed with gProfileR (Reimand et al., 2016). Enrichment p values were based on a hypergeometric test using the set of known rat genes as background. Sample correlation based on normalized read count was obtained using the R package edgeR (Robinson et al., 2010). We used riboWaltz (Lauria et al., 2018) to assign length-dependent P site correction and periodicity assignment to each read. Codon usage statistics were calculated using the coRdon Bioconductor package (https://github.com/BioinfoHR/coRdon). RNA secondary structure prediction was done by calculating the minimum free energy using MFOLD (Zuker, 2003).
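The hypergeometric enrichment test underlying the GO analysis can be sketched as an upper-tail probability. This is a generic illustration rather than the gProfileR code, and the gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, selected, overlap):
    """P(X >= overlap) when drawing `selected` genes from `background` genes,
    of which `annotated` carry the GO term."""
    total = comb(background, selected)
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background genes, 4 annotated with a term,
# 3 genes in the query list, 2 of which carry the annotation
p = hypergeom_enrichment_p(background=10, annotated=4, selected=3, overlap=2)
print(f"enrichment p value: {p:.4f}")
```

The choice of background matters: restricting it to genes detectable in the experiment would generally give more conservative p values than using all known genes.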
Identification of consensus peaks
Sites enriched with ribosomal footprints were identified from normalized riboseq profiles using the peak identification function within IDPmisc, an R package (https://CRAN.R-project.org/package=IDPmisc). A minimum peak width of 18 nt and a peak height above the mean peak height within the transcript were used as the criteria to define a region with significant enrichment of ribosome-protected reads. Only peak regions that overlapped (at least 90% nt overlap) in at least three of five samples, as determined using BEDtools version 2.29.2 (Quinlan and Hall, 2010), were considered.
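The reproducibility criterion can be sketched as an interval-overlap count across samples. The code below is one possible reading of the rule, not the BEDtools workflow actually used: it assumes that the 90% overlap is measured relative to the candidate peak, and the final merging of overlapping consensus peaks is an added assumption. Peak coordinates are hypothetical, half-open (start, end) intervals on a single transcript.

```python
def overlap_fraction(a, b):
    """Fraction of interval a covered by interval b (half-open start, end)."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    return max(shared, 0) / (a[1] - a[0])


def sample_support(candidate, samples, min_fraction=0.9):
    """Number of samples with at least one peak covering >= min_fraction of the candidate."""
    return sum(
        any(overlap_fraction(candidate, peak) >= min_fraction for peak in peaks)
        for peaks in samples
    )


def consensus_peaks(samples, min_width=18, min_fraction=0.9, min_samples=3):
    """Peaks of sufficient width supported by enough samples, merged where they overlap."""
    candidates = sorted({peak for peaks in samples for peak in peaks})
    kept = [
        c for c in candidates
        if c[1] - c[0] >= min_width
        and sample_support(c, samples, min_fraction) >= min_samples
    ]
    merged = []
    for start, end in kept:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Five hypothetical samples; only the peak near position 100 recurs in three of them
samples = [
    [(100, 130), (300, 320)],
    [(101, 131)],
    [(100, 130)],
    [(500, 520)],
    [],
]
print(consensus_peaks(samples))
```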
Motif analysis
The peak regions were scanned for known human RNA-binding protein motifs using the FIMO program (Grant et al., 2011), which is part of the MEME Suite (Bailey et al., 2009). Only search results with a p value below the threshold of 0.05 were considered. De novo motif finding was performed on the peak regions from ribosome-protected reads using the HOMER tool (Heinz et al., 2010; findMotifs.pl -rna). Background sequences were randomly selected from transcripts with no peaks.
Data availability
All sequence data analyzed during the study are publicly accessible through the National Center for Biotechnology Information (BioProject ID PRJNA931294). All Microsoft Excel files with complete numbers are attached as supplemental data. The cryo-EM maps obtained in this study have been deposited in the Electron Microscopy Data Bank; the accession codes are provided in Table 1.
Results
Isolation of the granule fraction
RNA granules are found in the pellet of sucrose gradients used to purify polysomes from monosomes (Krichevsky and Kosik, 2001; Kanai et al., 2004; Aschrafi et al., 2005; Elvira et al., 2006; El Fatimy et al., 2016). Our protocol to enrich for RNA granules is based on a previous study (El Fatimy et al., 2016) and involves a high-velocity spin over a 60% sucrose pad to enrich for all polysomes (Total polysomes), followed by a short spin on a 15–60% sucrose gradient to separate polysomes (PF) from the GF (Fig. 1A). The short sedimentation time allows for clear separation of two populations of ribosomal proteins, as can be seen from Coomassie staining and immunoblotting for the ribosomal protein S6 (Fig. 1B). The ribosomal proteins peak in fractions 5–6 (PF), and very few ribosomal proteins are found in fractions 8–9 (Fig. 1B), in contrast with the large number of ribosomal proteins in the pellet (GF; Fig. 1B). A UV absorption plot also confirms that very few A254-absorbing structures are present in the last fractions (fractions 9–10) before the GF (which was not resuspended in this case; Fig. 1C). EM of the GF and PF shows ribosomal clusters in transmission EM pictures (Fig. 1D). Counting of ribosomes in the clusters from the GF and the PF using transmission EM micrographs (Fig. 1D) revealed no difference in the number of ribosomes per cluster between the two fractions, suggesting that the granule fraction is not simply made up of larger polysomes (10 ± 4.2 ribosomes in the GF, N = 219 clusters, 14 micrographs, 2 preparations; 10 ± 5.2 ribosomes in the PF, N = 240 clusters, 14 micrographs, 2 preparations).
Characterization of the granule fraction. A, Summary of protocol for isolating the GF from P5 rat whole-brain homogenate using sucrose gradient fractionation. B, Top, SDS-PAGE stained with Coomassie Brilliant Blue showing enrichment of the characteristic distribution of ribosomal proteins in Fractions 5–6 (PF) and Pellet (GF). Bottom, Immunoblot analysis (from a separate purification) of S6 ribosomal protein showing peaks in Fractions 5–6 (PF) and Pellet (GF). Top, Lanes are described (M, Molecular weight marker). C, Top, SDS-PAGE stained with Coomassie Brilliant Blue showing the distribution of proteins from Fraction 1 to Fraction 10, excluding the pellet. Bottom, UV absorption plot (A254) of the same experiment collected fractions 1–10 showing enrichment of polysomes in the PF. D, Negative stained electron micrograph of the GF and PF shows clusters of ribosomes of approximately the same size. E, Immunoblot analysis of starting material, PF (Fraction 5/6), and GF (Pellet) stained for RBPs implicated in RNA granules and stalled polysomes and other factors, FMRP, PURA, UPF1, eEF2, PNRC2, STAU2, ZBP1, G3BP, HNRNPA2B1, eIF4E, SMN, TIA1, p-S6, YTHDF1. One representative blot for each RBP is shown, each blot is normalized to the S6 staining from that blot, and the S6 blot is shown below each separate blot. F–S, Quantification of Western blots. The fold enrichment compared with starting material normalized to levels of S6 (see above, Materials and Methods) is shown for each RBP, FMRP (N = 3), PURA (N = 3), UPF1 (N = 6), eEF2 (N = 3), PNRC2 (N = 4), STAU2 (N = 4), ZBP1 (N = 4), G3BP (N = 3), hnRNPA2B1 (N = 3), eIF4E (N = 4), SMN (N = 3), TIA1 (N = 4), p-S6 (N = 4), and YTHDF1 (N = 4). Error bars indicate SEM.
FMRP is known to be associated with stalled polysomes (Ceman et al., 2003; Darnell et al., 2011; Graber et al., 2013; Chen et al., 2014; El Fatimy et al., 2016). If stalled polysomes are enriched in the GF, FMRP should be enriched in the GF compared with the PF. To examine which markers are enriched in the GF, we performed Western blot analysis on the starting material, PF and GF. For quantification, we standardized the levels of the proteins to the levels of the ribosomal protein S6 to normalize for the number of ribosomes in each fraction. As expected, FMRP is enriched in the GF (Fig. 1E,F). Similarly, Pur-alpha (PURA), a protein that marks RNA granules in neurons (Kanai et al., 2004; El Fatimy et al., 2016) and whose loss leads to neurodevelopmental disorders (Reijnders et al., 2018), was also enriched in the GF compared with the PF (Fig. 1E,G). UPF1 is required for the efficient formation of RNA granules (Graber et al., 2017) and is enriched in the GF (Fig. 1E,H). Interestingly, we could not detect the eukaryotic elongation factor 2 (eEF2) in the polysome or the granule fraction (Fig. 1E,I). This argues against a previous model where phosphorylation of eEF2 trapped in stalled polysomes was proposed to be a key step in the reactivation of stalled polysomes (McCamphill et al., 2015; Sossin and Costa-Mattioli, 2019). Although UPF1 is a major component of the nonsense-mediated decay (NMD) pathway, PNRC2, another major marker of the NMD pathway, was not enriched in the GF (Fig. 1E,J), consistent with UPF1 playing a role in RNA granules independent of its role in NMD (Graber et al., 2017). UPF1 interacts with Staufen 2 (STAU2), and this interaction is important for the formation of stalled polysomes (Graber et al., 2017). However, although STAU2 was present in the GF, it was not enriched compared with the PF (Fig. 1E,K).
All the other RBPs examined (ZBP1, G3BP1, hnRNPA2B1), which have also been associated with RNA granules in previous studies (Elvira et al., 2006; Fallini et al., 2011), showed similar results to STAU2 (Fig. 1E,L–N), suggesting that many RBPs are equally distributed between the two fractions and emphasizing the specific enrichment of FMRP, PURA, and UPF1 in the GF. Neither the polysome nor the pellet fraction was enriched for eIF4E, consistent with the lack of ribosomes in the process of initiation (Fig. 1E,O). Likewise, SMN and TIA1, RBPs particularly implicated in stress granules (Waris et al., 2014) but not present in RNA granule proteomics (Kanai et al., 2004; Elvira et al., 2006; El Fatimy et al., 2016), were not enriched in either of the two fractions (Fig. 1E,P,Q), indicating that stress granules are not an abundant constituent in these preparations. The GF was also not enriched for S6 phosphorylation (Fig. 1E,R), nor for YTHDF1 (Fig. 1E,S), an m6A reader, despite the abundance of m6A sites under ribosomes in the GF (see below).
Overall, the protein composition of the GF demonstrates distinctions in ribosome associated proteins compared with the PF (fractions 5–6), suggesting a distinct subset of ribosomes are enriched in the GF.
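The S6 normalization underlying these comparisons can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name, argument names, and band intensities are hypothetical, and the sketch only assumes that each protein band is divided by the S6 band from the same blot before the fraction is compared with the starting material.

```python
def fold_enrichment(protein_fraction, s6_fraction, protein_start, s6_start):
    """Fold enrichment of a protein in a fraction relative to starting material.

    Each band intensity is first divided by the S6 intensity from the same
    blot, so the result reflects protein per ribosome rather than protein
    per lane. All names and inputs are illustrative assumptions.
    """
    if min(s6_fraction, s6_start, protein_start) <= 0:
        raise ValueError("S6 and starting-material intensities must be positive")
    normalized_fraction = protein_fraction / s6_fraction
    normalized_start = protein_start / s6_start
    return normalized_fraction / normalized_start


# Hypothetical intensities: equal S6 signal, threefold more protein in the pellet.
print(fold_enrichment(300.0, 100.0, 100.0, 100.0))
```

A value above 1 indicates more of the protein per unit of S6 in the fraction than in the starting material; a value near 1 corresponds to the "present but not enriched" pattern described for STAU2.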
Cleavage of pelleted ribosomes into monosomes
Ribosome profiling is based on nuclease digestion of mRNA that is not protected by ribosomes, isolation of monosomes containing the protected mRNA, followed by library construction and sequencing. Stalled ribosomes have been reported to be resistant to cleavage by nucleases (Darnell et al., 2011). Based on the previous observation that high-salt conditions cause the unpacking of the ribosomes in the pellet (El Fatimy et al., 2016), we treated the granule fraction with 400 mM sodium chloride for 10 min before dilution to reduce the salt concentration back to physiological levels (150 mM) followed by incubation with RNase I (Fig. 2A). We found that this treatment could cleave the ribosome clusters to monosomes (Fig. 2B–F). The effect of the high-salt and nuclease treatments was visualized using negative staining EM (Fig. 2B). Samples either untreated or treated with the high-salt buffer, nuclease (low concentration; 0.5 µl, 50 U), or both were deposited on the EM grids and negatively stained. When applied separately, the high-salt and nuclease treatments induced only partial unpacking of the ribosome clusters. Only when both treatments were applied to the sample did the EM images show that the ribosome clusters had dissociated into monosomes (Fig. 2B).
Cleavage of compacted stalled polysomes into monosomes. A, Schematic of nuclease digestion of polysomes from GF into monosomes. B, Electron micrographs of negatively stained GF following treatment with and without RNase I, and with and without pretreatment with Salt (–Nuclease –Salt, top left; –Nuclease +Salt, top right; +Nuclease –Salt, bottom left; +Nuclease +Salt, bottom right). Scale bar, 100 nm. Note that EM images represent the GF treated with nuclease and salt before the second sucrose gradient and not the purified monosomes in Fraction 2 obtained from the second sucrose gradient. C–E, Western blot analysis for S6 ribosomal protein in sucrose gradient fractions. The GF was resuspended, treated with 0 μl (C), 0.5 μl (D) and 1 μl (E) RNase I, with or without pretreatment of 400 mM NaCl (–Salt, top; +Salt, bottom), followed by a 15–60% sucrose gradient run for 45 min, after which fractions were collected and run on Western blot (see above, Materials and Methods). F, Quantification of digestion measured as the ratio of mean S6 intensity of digested monosomes compared with pellet [Fraction 2/(Fraction 2 + Pellet)] represented as percentage of digestion to monosomes; N = 4 biological replicates. Error bars indicate SEM. Images where nuclease caused complete cleavage in the absence of salt are shown in Extended Data Figure 2-1.
Figure 2-1
Complete digestion of GF. Western blot analysis for S6 ribosomal protein of purified fractions collected after first treating the GF from the first sucrose gradient with 0 μl, 0.5 μl, and 1 μl RNase I, with or without pretreatment of 400 mM NaCl (−Salt, A; +Salt, B), followed by a second sucrose gradient (see above, Materials and Methods). Only fraction 2 was loaded in this experiment. In this example, almost complete digestion was observed even with lower nuclease concentrations and no salt. Increased nuclease digestion appeared to be related to different batches of nuclease and different amounts of time nuclease was stored, but this was not systematically analyzed.
We also examined cleavage by resedimentation after cleavage. After RNase digestion, we resedimented the fractions on a sucrose gradient and measured the movement of S6 ribosomal protein immunoreactivity from the GF to fraction 2 of the sucrose gradient (Fig. 2C–F). We observed that the high-salt treatment improved the digestion by nuclease at low nuclease concentrations. However, some preparations of nuclease led to complete cleavage even without the need for salt (Extended Data Fig. 2-1). At the higher concentration of nuclease (1 µl:100 U), there was no difference in digestion in the presence or absence of salt (Fig. 2C; Extended Data Fig. 2-1), consistent with previous results suggesting that high concentrations of nuclease can cleave stalled polysomes (Darnell et al., 2011).
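The digestion measure used for this quantification, Fraction 2/(Fraction 2 + Pellet) expressed as a percentage, can be written out as a short sketch. The function name and the intensity values are hypothetical and serve only to illustrate the ratio; they are not measurements from this study.

```python
def percent_digested(fraction2_s6, pellet_s6):
    """Percentage of S6 signal that moved from the pellet to fraction 2.

    Implements Fraction 2 / (Fraction 2 + Pellet) * 100 for a single
    gradient. Inputs are S6 band intensities in arbitrary units.
    """
    total = fraction2_s6 + pellet_s6
    if total <= 0:
        raise ValueError("total S6 intensity must be positive")
    return 100.0 * fraction2_s6 / total


# Hypothetical example: three quarters of the S6 signal is found in fraction 2.
print(percent_digested(30.0, 10.0))
```

Because the measure is a ratio within one gradient, it does not depend on how much material was loaded, only on how the S6 signal is split between the monosome fraction and the pellet.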
Cryo-EM analysis of ribosomes reveals ribosomes are stalled in the hybrid position
To examine the state of the ribosomes in the pelleted fraction, the RNase I–treated RNA granules were deposited on cryo-EM grids and vitrified for analysis using single-particle approaches. We used the nuclease-treated fraction as single-particle analysis is simplified by the separated monosomes. The 400 mM sodium chloride treatment of the RNA granules was eliminated for these samples to prevent the dissociation of stalling factors potentially bound to these ribosomes. Individual ribosomes in the cryo-EM micrographs were selected and subjected to 2D and 3D classification. Images were collected in counting mode according to the parameters described in Table 1, and the image processing workflow is shown in Extended Data Figure 3-1.
We found that two classes of the 80S ribosomes coexisted in the RNA granules (Fig. 3). Class 1 was more populated and contained 85% of the particles in the dataset. A consensus cryo-EM map for class 1 was obtained and refined to 2.4 Å resolution and then refined further using a multibody refinement approach. In this approach each subunit defined one body, and the 40S and 60S subunits refined to 2.5 Å and 2.3 Å resolution, respectively (Extended Data Fig. 3-2). In class 1, the 80S ribosomes exhibited tRNA molecules in hybrid A/P and P/E states (Fig. 3A). Particles in class 2 represented the remaining 15% of the population and contained a tRNA in the P site (Fig. 3B) but an empty A site. We also obtained a consensus map for this class that refined to 2.6 Å resolution, and the multibody refinement approach of its 40S and 60S subunits produced maps that refined to 3 Å and 2.6 Å resolution, respectively (Extended Data Fig. 3-2).
Cryo-EM analysis of the digested ribosomes in the GF. A, B, Side view of the cryo-EM maps of class 1 (A) and class 2 (B) 80S ribosomes (top) found in the GF after nuclease digestion. The workflow used in defining the classes is shown in Extended Data Figure 3-1. The tRNA molecules observed in each of the maps are indicated. Bottom, A top view of the same cryo-EM maps. The resolution of these maps is shown in Extended Data Figure 3-2. The 40S and 60S subunits are shown as transparent densities for easier viewing of the position of the tRNA molecules in each class. The atomic model of the rRNA and r-protein components (uL10 and uL11) of the P stalk are indicated to show the lack of density existing for this structural motif. The atomic model of the P stalk components was obtained from Zhou et al. (2020; Protein Data Bank (PDB) ID 6XIQ) and shown in the same position and orientation that the P stalk would have adopted should this motif show density in these maps.
Figure 3-1
Image processing workflow of the cryo-EM dataset collected from the monosomes digested from the GF. The diagram displays the main image processing steps undertaken with this dataset and the two main ribosome populations that were found. The number of particle images maintained at each step is indicated. The resolution of both large and small subunits for the populations identified is also indicated.
Figure 3-2
Resolution analysis of the two major classes of monosomes from the granule fraction. The cryo-EM maps for class 1 (A) and class 2 (B) of monosomes from the digested GF were refined by multibody refinement by dividing the 80S ribosome into two major bodies, the 40S and the 60S particles. Because each subunit was refined independently, we show the Fourier shell correlation (FSC) graphs (left and middle) for each class. Graphs are labeled to indicate whether they correspond to the 40S or the 60S subunit part of the cryo-EM maps. We used an FSC threshold of 0.143 to report the resolution. Right, The local resolution analysis of the cryo-EM maps obtained for both classes. These maps were obtained by merging the 60S and 40S cryo-EM maps obtained by multibody refinement using the command vop add in Chimera. Maps are colored according to their local resolution using the color coding indicated in the scale bars. Main structural landmarks are indicated.
The hybrid structures are consistent with ribosomes stalled in the elongation process, such as the following ribosomes in collided ribosome structures (Juszkiewicz et al., 2018). Although we do not believe that these structures emanate from collided ribosomes (see below), the structures are consistent with the presence of stalled ribosomes in the pellet fraction. Despite the absence of salt and the high resolution of these structures, no additional large densities were observed in these structures.
A feature observed in the two classes of 80S ribosomes found in the GF was the absence of cryo-EM density in the 60S P stalk (Fig. 3, bottom). A closer analysis revealed a complete lack of density for r-proteins uL10 (Rplp0) and uL11 (Rpl12) and the rRNA loops connecting both proteins. uL10 is an essential protein for the stability of the P stalk and, together with uL11, forms the base of the P stalk. Therefore, these structures suggest that the P stalk is highly flexible in the two classes of ribosomes. uL10 is also one of the binding sites of the elongation factor eEF2, and it participates in the formation of the GTPase-associated center (Naganuma et al., 2010). Consequently, the absence of P stalk density in these structures suggests that halting translation in these ribosomes involves the prevention of binding of elongation factors such as eEF2.
Characterization of the ribosome-protected fragments of mRNAs in the granule fraction
To ensure that the ribosome profiling was done on monosomes, the cleaved ribosomes from the GF, PF, or the total ribosomes (Fig. 1A), obtained by treatment with high concentrations of RNase I (1 U), were loaded onto a second 15–60% sucrose gradient and centrifuged to separate the monosomes before RNA extraction, library preparation, and sequencing of the footprint reads (Figs. 4A–C, 5A). UV absorbance (A254) shows a peak in fraction 2 (Fig. 4B), and EM of fraction 2 (which was the fraction used for ribosome profiling) confirms that it contains mainly 80S monosomes (Fig. 4C). As we are interested in stalled polysomes, no attempt was made to prevent ribosome run-off, and cycloheximide was not present during the tissue homogenization. We tested whether the presence or absence of cycloheximide has an impact on ribosome footprints from the GF (Extended Data Fig. 5-1). There was little effect of the omission of cycloheximide on the mRNAs detected by footprint reads, as seen by a high Pearson's correlation and clustering by principal components analysis (PCA) between relative numbers of footprint reads in the presence or absence of cycloheximide (Extended Data Fig. 5-1). Similarly, nuclease digestion was performed after a brief treatment with high salt; although this treatment slightly improved the digestion (Fig. 2), the mRNAs detected by footprint reads from the salt-treated GF were very similar to those from the salt-untreated sample, again determined by a high Pearson's correlation and clustering by PCA (Extended Data Fig. 5-1).
Purification of monosomes for ribosome footprinting. A, Schematic of the procedure, where the GF is treated with nucleases, and then sucrose gradient fractionation is used to isolate 80S monosomes. B, UV A254 absorbance of the sedimentation shows major peak in fraction 2. This is also the major fraction where S6 is found (Fig. 2E). C, Negative stained electron micrograph of Fraction 2 shows the presence of 80S monosomes in this fraction.
Ribosome footprinting of the GF. A, Diagram summarizing the footprinting procedure. B, Size distribution of footprint reads from GF (blue circles; N = 8 biological replicates) compared with polysome fraction (PF; orange squares; N = 3 biological replicates). Error bars indicate SEM. C, Read coverage of different size footprints from GF (small, <25 nt; medium, 25–32 nt; long, >32 nt) to 3′UTR, 5′UTR, and CDS. D, The number of read extremities (shading) for each read length (y-axis) based on distance from start (left, 0 on x-axis is A in ATG) and stop (right, 0 on x-axis is last nucleotide of stop codon) with the beginning of the read (5′) on top and the end of the read (3′) on bottom for GF. Data are shown for one biological replicate, but results are similar for all replicates. Similarity between replicates based on heat maps and principal component analysis is shown in Extended Data Figure 5-1. Replicates of the read extremities are shown in Extended Data Figure 5-2. E, Periodicity statistics for GF indicate that long reads (33–40) in frame 0 have significantly more periodicity than long reads in frame 1 and frame 2, or than short (21–24) and medium (25–32) reads in any frame (ANOVA, F(261,8) = 13.9, p < 0.001; post hoc Tukey's HSD test, long reads in frame 0; *p < 0.001 against long reads in frame 1 and frame 2 and other reads in frame 0); n = 39 long, 40 medium, 11 short (N is based on each read length in each biological replicate; not all read lengths are present in all biological replicates). Error bars indicate SD. F, Distribution of large reads for GF with the CDS of all transcripts normalized to the same length shows that reads are biased to the first half of the transcripts. The x-axis is the relative position in the transcript, and the y-axis is the average number of reads for that relative position. Data are shown for one biological replicate, but results are similar for all replicates.
Figure 5-1
No effect of Salt or Chx treatment on RNA reads. (A-B) Heat map for the comparison of biological replicates accomplished in the presence or absence of cycloheximide (Chx) (A) or for the comparison of biological replicates accomplished in the presence or absence of salt (B). (C) Heat map of the comparison of biological replicates for Total RNA. Warmer colors indicate a higher correlation between groups. Differences between biological samples were equal to or higher than the differences seen with treatment. (D-E) Principal component analysis for the comparison of biological replicates accomplished in the presence or absence of cycloheximide (Chx) including RNA-SEQ of starting material (D) or for the comparison of biological replicates accomplished in the presence or absence of salt including RNA-SEQ of starting material (E). Differences between biological samples were equal to or higher than the differences seen with treatment. The footprint reads were clustered separately from the RNA-SEQ of total mRNA. All calculations were done with R package edgeR (Robinson et al., 2010).
Figure 5-2
Biological replicates of read end maps. A–E, The number of read extremities (shading) for each read length (y-axis) based on distance from start (left, 0 on x-axis is A in ATG) and stop (right, 0 on x-axis is last nucleotide of stop codon), with the beginning of the read (5′) on top and the end of the read (3′) on bottom. Each of A–E is an individual biological replicate.
Ribosome profiling reads are usually generated from canonical fragment sizes between 28 and 34 nt (Ingolia, 2014). Surprisingly, the peak read size (34–37 nt) from the ribosome-protected reads in the GF was longer than this canonical ribosome-protected fragment and longer than the distribution of reads in the PF ribosome population, although these were also larger than expected (33–35 nt; Fig. 5A,B). It has been previously reported that classical ribosomes produce medium size 27–29 nt footprints and small 20–22 nt footprints should the ribosome have an open A site (Wu et al., 2019). The longer reads we observed may be because of (1) an altered state of the stalled polysome, (2) increased protection because of associated RBPs, or (3) incomplete digestion. The longer reads from the GF map better to the CDS than shorter reads (Fig. 5C). The presence of reads in the 3′UTR probably represents contamination from RBP complexes on the 3′UTR that comigrate on the sucrose gradient with monosomes. Alignment at the start and stop codons showed that the 5′ end of the reads has an offset of ∼14–15 bp, consistent with the ribosomes sitting on the A site as expected for hybrid-state stalled ribosomes, as opposed to 12 bp offset for ribosomes resting on the P site (Ingolia et al., 2009; Martens et al., 2015). Most of the excess length of the longer reads is because of extension at the 3′ end of the footprint reads (Fig. 5D; Extended Data Fig. 5-2). The extension at the 3′ end increased with the read length giving a diagonal line on a plot of read length versus distance from start or stop codon (Fig. 5D), and this variable extension at the 3′ end of reads is also seen in other studies (Martens et al., 2015). Ribosome footprint reads should show periodicity because of the three-nucleotide code in the mRNA and reads over 32 nt showed higher periodicity than the shorter reads (Fig. 5E). 
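The relationship between the 5′ offset and reading-frame periodicity can be illustrated with a minimal sketch. It assumes that read 5′ ends are given relative to the A of the start codon (position 0) and that a single fixed offset is applied to all reads; the function name and the example positions are hypothetical, and this is not the software used for the analysis reported here.

```python
from collections import Counter


def frame_fractions(five_prime_positions, offset):
    """Fraction of reads assigned to reading frames 0, 1, and 2.

    five_prime_positions: 5' end coordinates of reads relative to the A of
    the start codon (0), so positions upstream of the ATG are negative.
    offset: distance in nucleotides from the read 5' end to the first
    nucleotide of the codon of interest (illustrative value only).
    """
    counts = Counter((pos + offset) % 3 for pos in five_prime_positions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return [counts[frame] / total for frame in range(3)]


# Hypothetical 5' ends: four reads fall in frame 0 with a 15 nt offset, one does not.
print(frame_fractions([-15, -12, 0, 3, 4], offset=15))
```

Strong periodicity corresponds to most of the signal falling in a single frame, whereas reads that do not derive from translating ribosomes would be expected to distribute roughly evenly across the three frames.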
We had predicted that footprint reads from stalled polysomes would be most enriched at the stop codon and the 3′ end of the message because of the requirement of UPF1 for the formation of stalled polysomes and the recruitment of UPF1 at the stop codon. However, we did not observe a bias for footprint reads near the stop codon. Instead, there was some bias in the large footprint reads for the first half of the message (Fig. 5F). Together, ribosome profiling of dissociated monosomes derived from the granule fraction from P5 rat brain reveals enrichment in large ribosomal footprints (>32 nt), displays a preference for the first part of the mRNA CDS, and does not display a bias for the stop codon.
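The positional bias described above depends on rescaling every CDS to a common length before pooling reads. A minimal sketch of that normalization follows; it pools raw counts per bin rather than the per-position averages plotted in the figure, and the function name, bin count, and example reads are assumptions made for illustration.

```python
def relative_position_histogram(reads, n_bins=10):
    """Count reads per bin after scaling each CDS to the same length.

    reads: iterable of (position_in_cds, cds_length) pairs, where
    position_in_cds is 0-based from the first nucleotide of the start codon.
    Reads that fall outside the CDS are ignored.
    """
    histogram = [0] * n_bins
    for position, cds_length in reads:
        if not 0 <= position < cds_length:
            continue
        # Integer arithmetic keeps the bin index within 0..n_bins-1.
        histogram[position * n_bins // cds_length] += 1
    return histogram


# Hypothetical reads on CDSs of different lengths, split into two halves.
example = [(0, 100), (49, 100), (50, 100), (99, 100), (150, 300), (100, 100)]
print(relative_position_histogram(example, n_bins=2))
```

With two bins, a bias toward the first half of the message would appear as a larger count in the first bin than in the second.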
Analysis of mRNAs that are abundant and enriched in footprint reads from the granule fraction
We were interested in identifying which mRNAs represent the most abundant constituent of the GF, which mRNAs have increased ribosomal occupancy in the GF, and which mRNAs have more protected reads in the GF compared with total polysomes or the PF. We calculated the Abundance (RPKM) of footprint reads to determine the most abundant constituents of the granule fraction (Extended Data Table 6-1). To determine which mRNAs have more polysomes/mRNA, we calculated what is characteristically called the translation efficiency (abundance of ribosome footprints/abundance of total mRNA as determined by conventional RNA seq of the starting fraction) for each mRNA (Extended Data Table 6-1). However, in the context of presumed stalled ribosomes, translation efficiency may be a misleading term, so we use a more conservative term, ribosomal occupancy. We also determined the footprint reads of the GF that were enriched relative to the footprint reads from the total polysomes or the PF (see above, Materials and Methods; Extended Data Table 6-1).
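The two quantities defined in this paragraph can be written out explicitly. The sketch below uses the standard definition of RPKM (reads per kilobase of transcript per million mapped reads) and takes ribosome occupancy as the plain ratio of footprint RPKM to RNA-seq RPKM; the reported fold changes were calculated with edgeR, which this simplified ratio does not reproduce. Function names and numbers are hypothetical.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (length_nt * total_mapped_reads)


def ribosome_occupancy(footprint_rpkm, rna_rpkm):
    """Footprint abundance relative to mRNA abundance for one transcript."""
    if rna_rpkm <= 0:
        raise ValueError("RNA-seq RPKM must be positive")
    return footprint_rpkm / rna_rpkm


# Hypothetical transcript: 200 footprint reads on a 2,000 nt CDS in a library
# of one million mapped reads, with an RNA-seq abundance of 25 RPKM.
footprints = rpkm(200, 2000, 1_000_000)
print(footprints, ribosome_occupancy(footprints, 25.0))
```

Under this definition, an abundant mRNA can have a high footprint RPKM without high occupancy, which is why abundance and occupancy are reported as separate rankings.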
GO analysis of the 100 mRNAs with the largest abundance showed a significant over-representation of mRNAs encoding cytoskeletal proteins (Table 2) that are expressed developmentally in neuronal projection and synaptic compartments (Fig. 6A), including Map1b (Table 2), an mRNA we had previously shown to be translated in dendrites through reactivation of stalled polysomes (Graber et al., 2013, 2017). Although cytoskeletal mRNAs dominate the most abundant GO category and also show increased ribosome occupancy (Table 2), they are relatively abundant in the total mRNA population and are also present in the footprint reads of the total polysomes and PF fractions. The GO analysis for mRNAs with increased ribosome occupancy showed a significant over-representation of mRNAs encoding RNA binding proteins or proteins involved in RNA metabolism (Fig. 6B), including the gene mutated in amyotrophic lateral sclerosis, FUS (fused in sarcoma; Table 3). Finally, the GF footprint reads that were enriched compared with the footprint reads from the total polysomes and PF represented mRNAs involved in neuronal development and synapse formation, particularly microtubule-associated proteins such as motor proteins (Tables 4, 5; Extended Data Table 6-1). The GF footprint reads were also enriched in mRNAs from the endomembrane system and endoplasmic reticulum compared with footprint reads from total polysomes, suggesting that the GF is also enriched for secretory mRNAs compared with total polysomes (Tables 4, 5; Extended Data Table 6-1). These results were similar regardless of whether the 50, 200, or 500 most abundant or enriched mRNAs were selected for the analysis (Extended Data Table 6-2).
Top 20 most abundant mRNAs
Top 20 most ribosomally occupied mRNAs
Top 20 enriched mRNAs (GF vs total polysomes)
Top 20 enriched mRNAs (GF vs PF)
GO Analysis of footprint mRNAs. A–D, GO terms of selected comparisons for most abundant (A), most ribosomally occupied (B), most enriched in GF compared with Total Polysomes (C), and most enriched in GF compared with the PF (D). Terms highlighted in red represent terms involved in cytoskeleton (A), RNA binding (B), and terms also found in A and B (C and D). The abundance, level of ribosome occupancy, and enrichment to PF or total polysomes for all mRNAs are shown in Extended Data Table 6-1. GO analysis using different numbers of the top-ranked mRNAs is shown in Extended Data Table 6-2.
Table 6-1
Abundance, ribosome occupancy, and enrichment (GF vs total; GF vs PF) of footprint reads for each mRNA. Columns are the gene name (column A); the log (base 10) fold change [log(FC)] of the ribosome occupancy (RPKM footprint reads/RPKM RNA seq reads) calculated by edgeR for GF replicates 1–5 (column B); the p value for the fold change calculated by edgeR (column C); the false discovery rate of the log(FC) change calculated by edgeR for GF replicates 1–5 (column D); the average RPKM from the RNA-SEQ of the total RNA of GF replicates 1–5 (column E); and the average RPKM of the footprint reads from GF replicates 1–5 (column F). The average RPKM was also independently calculated after first separating reads into long (>33, column G), medium (25–33, column H), and short (<25, column I) reads. The remaining columns are the log (base 10) fold change [log(FC)] of the footprint reads between GF (replicates 1–5) and total polysomes (replicates 1–4) calculated by edgeR (column J); the p value for the fold change in column J calculated by edgeR (column K); the log (base 10) fold change [log(FC)] of the footprint reads between GF (replicates 6–8) and PF (replicates 1–3) calculated by edgeR (column L); the p value for the fold change in column L calculated by edgeR (column M); and identifiers of the mRNAs used for Figure 7 and Extended Data Figures 7-1 and 7-2 (columns N–AD). The data associated with them are shown in separate sheets identified by the group used, and the groups are described in the text. The master sheet has all mRNAs, but the separate sheets are restricted to mRNAs with footprint RPKMs of >5. The analysis of the abundance, ribosome occupancy, and enrichment of mRNAs identified as being regulated by nonsense-mediated decay, as well as figures examining this, are shown in the sheet NMD analysis but are not identified separately in the master sheet.
Table 6-2
Full GO analysis based on top abundance (RPKM), ribosome occupancy (fold change of RPKM footprint abundance/RPKM total RNA abundance), enrichment of GF relative to total polysomes (lowest adjusted p values from differential expressed gene analysis), enrichment of GF relative to PF (lowest adjusted p values from differential expressed gene analysis), and transcripts with peaks. The different sheets represent analysis based on the top 50, 100, 200, or 500 transcripts from the analyses described above or from the total number of transcripts with peaks (526). Analysis is from the Web site gProfileR (Reimand et al., 2016) using all rat genes as a comparison group. MF, Molecular function; BP, biological process; CC, cellular component. The output columns are directly from the gProfileR site and are defined on the site. The interactions (column J) represent the members of the enrichment list included in the GO term (column B).
We next determined whether mRNAs previously reported to be regulated by elongation or initiation in the nervous system were over-represented in our samples (Fig. 7; Extended Data Fig. 7-1). Strikingly, mRNAs whose translation is regulated by elongation through eEF2 phosphorylation (Kenney et al., 2016) were significantly abundant, had larger ribosomal occupancy, and were enriched in the GF compared with total polysomes (Fig. 7A–C; Extended Data Fig. 7-1). In contrast, neuronal mRNAs regulated by signaling pathways that mainly affect initiation, either through eIF4E phosphorylation (Amorim et al., 2018), TOR (target of rapamycin) activation (including terminal oligopyrimidine tract (TOP) mRNAs; Thoreen et al., 2012), or eIF2 alpha phosphorylation (Di Prisco et al., 2014), did not have larger ribosomal occupancy and were not enriched in the preparation (Fig. 7A–C; Extended Data Fig. 7-1). Although TOP mRNAs, mainly encoding ribosomal subunits, are relatively abundant and known to be transported in neurons, we observed a significant de-enrichment of ribosome occupancy from TOP mRNAs in footprint libraries in the GF, and they were also enriched in the PF compared with the GF (Fig. 7A; Extended Data Fig. 7-1), consistent with the notion that transported mRNAs blocked at initiation are not enriched in this preparation.
Correlation analysis of most abundant and ribosomally occupied mRNAs in ribosome footprints. A, B, Comparison of footprint reads of most abundant (A) and most ribosomally occupied (B) mRNAs (for enrichment of the GF relative to total polysomes and the PF, see Extended Data Fig. 7-1) to mRNAs regulated by translation elongation (58), upregulated by mGluR with upstream open reading frames (61), eIF4E phosphorylation (59), mTOR (60), and TOP mRNAs (60). C, D, Comparison of most abundant (C) and most ribosomally occupied (D) mRNAs to run-off-resistant mRNAs (19) and mRNAs that are CLIPped by FMRP (17, 62). E, F, Comparison of most abundant (E) and most ribosomally occupied (F) mRNAs to mRNAs translated preferentially by monosomal and polysomal ribosomes in the neuropil (36) and secretory mRNAs (secretory proteins with reviewed annotation from UNIPROT), compared with all mRNAs. G, H, Comparison of most abundant (G) and most ribosomally occupied (H) mRNAs to autism-related mRNAs from the SFARI database (syndromic and levels 1–3). The total SFARI group was also divided into ones that are also in the FMRP CLIP group (17, 62) and ones that are not. For all groups there was a cutoff of 5 RPKM to avoid mRNAs not expressed in the nervous system; p values from comparison to all mRNAs (Student's t test with Bonferroni correction for multiple tests; n = 14 for all comparisons in figure). Only significant p values (p < 0.01 after correction) are shown; log(FC), Log (base 10) fold change. The dotted line shows y = 0 value. Bottom, N for each comparison group is shown. Comparison of abundance and ribosome occupancy between mRNAs in neuronal dendrites and other mRNAs is shown in Extended Data Figure 7-2. Similar correlation analysis seen in this figure using only mRNAs found in neuronal dendrites is shown in Extended Data Figure 7-3. Correlation of footprint read abundance with length of mRNAs is shown in Extended Data Figure 7-4.
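The multiple-testing step named in this legend can be sketched as follows. The example applies a Bonferroni correction for 14 comparisons and the 0.01 threshold stated above to a set of hypothetical raw p values; the function name and the inputs are assumptions, and the t tests themselves are not reproduced.

```python
def bonferroni(p_values, n_comparisons):
    """Bonferroni-adjusted p values, capped at 1.0."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical raw p values from three of the 14 group comparisons.
raw = [0.0005, 0.01, 0.2]
adjusted = bonferroni(raw, n_comparisons=14)
significant = [p < 0.01 for p in adjusted]
print(adjusted, significant)
```

In this illustration only the first comparison would be reported, because a raw p value must fall below 0.01/14 to remain below 0.01 after correction.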
Figure 7-1
Correlation analysis of most enriched mRNAs in ribosome footprints from Granule Fraction (GF). A, B, Level of enrichment of GF vs Total (A; log fold change [log(FC)] of GF/Total) or GF vs PF (B) in mRNAs regulated by translation elongation (58), upregulated by mGluR with upstream open reading frames (61), eIF4E phosphorylation (59), mTOR (60), and TOP mRNAs (60). C, D, Comparison of enrichment for GF vs Total (C) or GF vs PF (D) in run-off-resistant mRNAs (19) and mRNAs that are CLIPped by FMRP (17, 62) to all mRNAs. E, F, Comparison of most enriched mRNAs for GF vs Total (E) or GF vs PF (F) for mRNAs translated preferentially by monosomal and polysomal ribosomes in the neuropil (36) and secretory mRNAs (secretory proteins with reviewed annotation from UNIPROT) to all mRNAs. G, H, Comparison of most enriched mRNAs for GF vs Total (G) or GF vs PF (H) for autism-related mRNAs from the SFARI database (syndromic and levels 1–3) to all mRNAs. The total SFARI group was also divided into ones that are also in the FMRP CLIP group (17, 62) and ones that are not. For all groups there was a cutoff of 5 RPKM to avoid mRNAs not expressed in the nervous system; p values from comparison to all mRNAs (Student's t test with Bonferroni correction for multiple tests; n = 14 for all comparisons in figure). Only significant p values (p < 0.05 after correction) are shown. log(FC), Log (base 10) fold change. The dotted line shows the y = 0 value. The N for each comparison group is shown under the group.
Figure 7-2
Abundance and ribosomal occupancy compared to transported mRNAs. A, B, Abundance (A) and ribosomal occupancy (B) comparison of footprint reads to mRNAs classified as transported to distal sites, as determined in (35). For all groups, there was a cutoff of 5 RPKM to avoid mRNAs not expressed in the nervous system; p values from comparison to all mRNAs (Student's t test with Bonferroni correction for multiple tests; n = 2 for all comparisons in figure). Only significant p values (p < 0.01 after correction) are shown. log(FCl), Log (base 10) fold change. The dotted line shows the y = 0 value. Bottom, N for each comparison group is shown. Download Figure 7-2, TIF file.
Figure 7-3
Comparisons only using neuronal transported mRNAs. A, B, Comparison of footprint reads of most abundant (A) and most ribosomally occupied (B) mRNAs to mRNAs regulated by translation elongation (58), upregulated by mGluR with upstream open reading frames (61), eIF4E phosphorylation (59), mTOR (60), and TOP mRNAs (60). C, D, Comparison of most abundant (C) and most ribosomally occupied (D) mRNAs to run-off-resistant mRNAs (19) and mRNAs that are CLIPped by FMRP (17, 62). E, F, Comparison of most abundant (E) and most ribosomally occupied (F) mRNAs to mRNAs translated preferentially by monosomal and polysomal ribosomes in the neuropil (36) and secretory mRNAs (secretory proteins with reviewed annotation from UNIPROT), compared to all mRNAs. G, H, Comparison of most abundant (G) and most ribosomally occupied (H) mRNAs to autism-related mRNAs from the SFARI database (syndromic and levels 1–3). The total SFARI group was also divided into ones that are also in the FMRP CLIP group (17, 62) and ones that are not. For all groups there was a cutoff of 5 RPKM to avoid mRNAs not expressed in the nervous system; p values from comparison to all mRNAs (Student's t test with Bonferroni correction for multiple tests; n = 14 for all comparisons in figure). Only significant p values (p < 0.01 after correction) are shown. log(FCl), Log (base 10) fold change. The dotted line shows the y = 0 value. Bottom, N for each comparison group is shown. Download Figure 7-3, TIF file.
Figure 7-4
Lack of correlation between transcript length and transcript enrichment and abundance. A, B, The correlation of the length of the mRNAs, either the entire transcript length or the CDS length, with the amount of ribosomal occupancy and abundance of the mRNA was calculated for either the total set of mRNAs (A) or just the mRNAs with peaks (B); r values are shown, and the squares are colored based on the heat map of the r scores (inset). No correlation was significant (p < 0.05). Download Figure 7-4, TIF file.
We next determined whether mRNAs previously reported to be enriched in stalled polysomes are particularly enriched/abundant in the footprint reads of the GF. A publication identified mRNAs protected by ribosomes resistant to ribosomal run-off in neuronal slices (Shah et al., 2020). Although this study focused on the proportion of each mRNA for which ribosomes ran off, the data also identified the mRNAs with the most protected fragments remaining after a long period of run-off (60 min). The 200 run-off-resistant mRNAs with the most reads remaining at 60 min were significantly abundant, had increased ribosome occupancy, and were enriched in the GF compared with total polysomes and the PF (Fig. 7D–F; Extended Data Fig. 7-1). FMRP, an RBP highly enriched in the sedimented pellet, has been associated with stalled polysomes (Ceman et al., 2003; Darnell et al., 2011). The mRNAs associated with FMRP in neurons, identified through cross-linking immunoprecipitation (CLIP), are also more resistant to ribosome run-off (Darnell et al., 2011). We examined the abundance and enrichment of two separate FMRP CLIP studies from brain tissue (Darnell et al., 2011; Maurin et al., 2018), and both were significantly abundant, had increased ribosome occupancy, and were enriched in the GF, particularly compared with the polysome fraction (Fig. 7D–F; Extended Data Fig. 7-1).
We also examined several other datasets to evaluate the mRNAs with footprint reads in the GF. Secreted mRNAs are stalled by their signal peptide and then cotranslationally inserted into the endoplasmic reticulum (ER). The transport of secreted mRNAs stalled at elongation would also involve the transport of ER in ribosome-associated vesicles (Carter et al., 2020). Secreted mRNAs are less abundant and have decreased ribosome occupancy compared with other mRNAs in the GF (Fig. 7G,H). However, they are also enriched in the GF compared with the total polysomes and the PF (Extended Data Fig. 7-1). Recently, mRNAs that are preferentially translated from monosomes in neuronal processes were identified (Biever et al., 2020). Although we predicted that these mRNAs would also be depleted from our preparation, they had higher ribosome occupancy and were significantly enriched in the GF, particularly compared with the polysome fraction (Fig. 7G–I; Extended Data Fig. 7-1), whereas mRNAs preferentially transported in polysomes were not enriched in the GF; indeed they were enriched in the PF compared with the GF (Extended Data Fig. 7-1), although both types of mRNA were abundant in our preparation (Fig. 7H). This is despite the finding that total mRNA levels for the preferentially polysomal transported mRNAs were significantly higher than the total mRNA levels for the preferentially monosomal translated mRNAs at this developmental time point (158 ± 17 polysome RPKM, n = 327 vs 72 ± 4 monosome RPKM, n = 458; SEM, p < 0.001, Student's t test).
Because translation from stalled polysomes is implicated in neurodevelopmental disorders, we examined whether protected reads from autism-related genes from the Simons Foundation Autism Research Initiative (SFARI) database were enriched and abundant in the protected reads. Compared with all mRNAs, these mRNAs had higher ribosome occupancy and were significantly enriched in the GF from our data (Fig. 7J–L; Extended Data Fig. 7-1). There is a significant overlap with FMRP CLIPped mRNAs in this dataset, but both FMRP CLIPped SFARI mRNAs and non-CLIPped SFARI mRNAs were enriched in the protected reads (Fig. 7G,H; Extended Data Fig. 7-1).
Finally, some of the increases in abundance and ribosomal occupancy we observed may be because of our preparation being specifically from the nervous system. To account for this bias, we repeated this analysis using only mRNAs known to be transported in neuronal processes (Biever et al., 2020). Although the mRNAs known to be transported in neuronal processes were significantly enriched and abundant in our preparation (Extended Data Fig. 7-2), when we restricted the total set of mRNAs to only include this set, all the results for abundance and ribosome occupancy above were replicated (Extended Data Fig. 7-3).
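The figure legends for these comparisons state that p values were Bonferroni corrected across the comparisons in each figure (n = 14) and that only corrected values below a threshold (p < 0.01 in Fig. 7) are displayed. As a minimal sketch of that correction step only, the following assumes the raw p values have already been obtained from the t tests; the function names and example p values are hypothetical and are not taken from the study.

```python
def bonferroni(p_values, n_tests=None):
    """Multiply each raw p value by the number of tests, capping at 1.0.

    n_tests defaults to the number of p values supplied; pass it explicitly
    when only a subset of the tests in a figure is being corrected.
    """
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


def significant(p_values, n_tests, alpha=0.01):
    """Flag comparisons whose corrected p value falls below alpha."""
    return [p < alpha for p in bonferroni(p_values, n_tests)]


if __name__ == "__main__":
    raw = [0.0005, 0.001, 0.04]  # hypothetical raw p values
    print(bonferroni(raw, n_tests=14))
    print(significant(raw, n_tests=14))
```

With 14 comparisons, a raw p value must be below roughly 0.0007 to remain under the 0.01 threshold after correction.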
The ribosome-protected reads are enriched in sequences matching FMRP CLIPs
If stalled polysomes are indeed enriched in the ribosome-containing pellet, the footprint reads should help identify where on the mRNA ribosomes are stalled. Examination of the distribution of the large reads (>32 nt) on individual messages revealed highly nonuniform distribution of reads on mRNAs (Fig. 8A). As peaks in ribosome-protected fragments are often nonreproducible (Liu et al., 2019), we used stringent criteria to identify peaks. First, peaks were defined for each mRNA in each library based on a maximum value higher than the average RPKM for the mRNA and a minimum width of 18 nt at the half-maximum height. Second, the peak had to be present in at least three of the 5 biological replicates (Fig. 6B; Extended Data Fig. 8-1). Using these stringent criteria, we identified 766 peaks in 524 mRNAs (Extended Data Table 8-1). Although 90% of the mRNAs had only one or two peaks, the mRNA most associated with stalled polysomes, Map1b, had the largest number of identified peaks (10; Fig. 8A). Only 15% of the peaks in the GF were also peaks in ribosome-protected reads from the PF (Extended Data Table 8-1).
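The two peak criteria described above (a maximum above the mRNA average with a width of at least 18 nt at half-maximum height, and presence in at least three of five replicates) can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: it assumes per-nucleotide read coverage as the input, uses the mean coverage of the mRNA as a stand-in for its average RPKM, and defines reproducibility as positions covered by a peak in at least three libraries. The function names `call_peaks` and `consensus_peaks` are hypothetical.

```python
def call_peaks(coverage, min_width=18):
    """Return half-open (start, end) peak intervals for one library.

    A peak is kept when its maximum exceeds the mean coverage of the mRNA
    and its width at half-maximum height is at least min_width nucleotides.
    """
    n = len(coverage)
    if n == 0:
        return []
    mean_cov = sum(coverage) / n
    peaks = []
    i = 0      # current scan position
    bound = 0  # leftmost position a new peak may occupy
    while i < n:
        if coverage[i] <= mean_cov:
            i += 1
            continue
        # Contiguous run of positions above the mean.
        j = i
        while j < n and coverage[j] > mean_cov:
            j += 1
        apex = max(range(i, j), key=lambda k: coverage[k])
        half = coverage[apex] / 2
        # Extend outward from the apex while coverage stays at or above half-maximum.
        left, right = apex, apex
        while left > bound and coverage[left - 1] >= half:
            left -= 1
        while right < n - 1 and coverage[right + 1] >= half:
            right += 1
        if right - left + 1 >= min_width:
            peaks.append((left, right + 1))
        i = bound = max(j, right + 1)
    return peaks


def consensus_peaks(replicate_peaks, length, min_replicates=3):
    """Return intervals covered by a peak in at least min_replicates libraries."""
    support = [0] * length
    for peaks in replicate_peaks:
        for start, end in peaks:
            for pos in range(start, end):
                support[pos] += 1
    consensus, start = [], None
    for pos, count in enumerate(support + [0]):  # sentinel closes a trailing interval
        if count >= min_replicates and start is None:
            start = pos
        elif count < min_replicates and start is not None:
            consensus.append((start, pos))
            start = None
    return consensus
```

Running `call_peaks` on each of the five libraries and passing the resulting lists to `consensus_peaks` yields the reproducible intervals under these assumptions.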
Sequences underlying ribosome-protected fragments are enriched in sequences matching FMRP CLIPs. A, mRNA profiles of Map1b, β-actin, and Tubulin 2b showing reproducible consensus peaks in the CDS; circles represent consensus peaks of footprint reads mapping to the same sequence, blue lines represent reproducible consensus peaks across biological replicates, red shading represents CDS. All replicates are shown in Extended Data Figure 8-1. The list of all peaks and their positions in the mRNAs can be found in Extended Data Table 8-1. Analysis of the codon frequency of the codons in the peaks is shown in Extended Data Figure 8-2. Codon analysis of the peaks can be found in Extended Data Table 8-2. B, Diagram summarizing how motif analysis is done. C, Results from the HOMER program show the only three consensus sequences above the cutoff provided by HOMER. D, HOMER-identified motifs overlapped with matching interaction motifs for RBPs listed in brackets. Residues that do not match are given in smaller font. Right, Code key for residue annotation. E, Table of top 10 RBPs with RNA interaction motifs present in consensus peaks. The number of peaks with multiple hits is also shown as Frequency of Motifs per Consensus Peak; n = x represents the number of motifs per consensus peak. All RBP motifs examined are shown in Extended Data Table 8-3. F, Top-ranked consensus sequence from HOMER showing overlapping sites for interaction motifs WGGA (*) and RGACH (**), and their corresponding residues on a single peak from Map1b, β-actin, and Tubulin 2b. Both sequences given for each protein are identical, but they have been annotated to show clusters of motifs that map to the same consensus sequence. Right, Numbers indicate the location of the sequence in each mRNA that corresponds to a consensus peak (blue) on the mRNA profiles shown in A.
Figure 8-1
List of all peaks from the large (>32 nt) footprint reads of GF (present in 3/5 independent replicates). The transcript ensemble number is given in column A. The starting and ending nucleotides of the peak are given in columns B and C, respectively. Length of the peak is in column D. The start and end of the corresponding region (5'UTR, CDS, or 3'UTR) are given in columns E and F, respectively. The gene name is given in column G. The region of the peak, including peaks at the start and the stop, is given in column H. Column I gives the relative placement in the CDS (where 0 is the start and 1 is the stop codon) or in the 5'UTR or 3'UTR. Column J is the number of total WGGA and RGACH reads in each peak that contains both motifs. The number of sequences matching GVAGAW and GACAAG (based on FIMO) are given in columns K and L, respectively. For sites with no shared WGGA and RGACH site, the number of WGGA sites and RGACH sites are given in columns M and N, respectively. Column O is the sum total of the sites measured in columns I–M for each peak. Column P is a 1 if the mRNA containing the peak is a secretory mRNA. Column Q is a 1 if this peak overlaps (20%) with a peak in the large reads from the PF. Column R reports on the state of this mRNA in the Zhang et al. (2020) study examining mRNAs with m6A sites at this stage of neuronal development (Ref #81) [No, mRNA not found in study; 3'UTR, m6A sites in 3'UTR but not the CDS; 5'UTR, m6A sites in the 5'UTR but not the CDS; Yes, m6A sites in the CDS]. Download Figure 8-1, TIF file.
Figure 8-2
Rare codon usage in peaks. The codon usage was calculated for the peaks and for the background sequences generated from similar size fragments of mRNAs that did not have peaks. The ratio (peak usage/background usage) was calculated for each codon (x-axis) and plotted against rarity of the codon found according to Athey et al. (2017). The coefficient of determination is plotted (R2 = 0.4046) and p < 0.0001. Download Figure 8-2, TIF file.
Table 8-1
List of all peaks from the large (>32 nt) footprint reads of GF (present in 3/5 independent replicates). The transcript ensemble number is given in column A. The starting and ending nucleotides of the peak are given in columns B and C, respectively. Length of the peak is in column D. The start and end of the corresponding region (5′UTR, CDS, or 3′UTR) are given in columns E and F, respectively. The gene name is given in column G. The region of the peak, including peaks at the start and the stop, is given in column H. Column I gives the relative placement in the CDS (where 0 is the start and 1 is the stop codon) or in the 5′UTR or 3′UTR. Column J is the number of total WGGA and RGACH reads in each peak that contains both motifs. The number of sequences matching GVAGAW and GACAAG (based on FIMO) are given in columns K and L, respectively. For sites with no shared WGGA and RGACH site, the number of WGGA sites and RGACH sites are given in columns M and N, respectively. Column O is the sum total of the sites measured in columns I–M for each peak. Column P is a 1 if the mRNA containing the peak is a secretory mRNA. Column Q is a 1 if this peak overlaps (20%) with a peak in the large reads from the PF. Column R reports on the state of this mRNA in the Zhang et al. (2020) study examining mRNAs with m6A sites at this stage of neuronal development (No, mRNA not found in study; 3′UTR, m6A sites in 3′UTR but not the CDS; 5′UTR, m6A sites in the 5′UTR but not the CDS; Yes, m6A sites in the CDS). Download Table 8-1, XLS file.
Table 8-2
Analysis of rare codon usage in peaks. Column A gives the amino acid and column B the codon for that amino acid. Column C is the count of that codon in the peaks, whereas column D is the percentage of codons encoding this amino acid that use this codon in the peaks. Columns F and G are the equivalent calculations for the background sequences. Column I is the ratio between peaks and background. Column J is the rarity of the codon based on Athey et al. (2017). Amino acids with single codons (Met, Trp) and stop codons were not used. Sheet 2 examines the same data but sorted on amino acid usage, not codon usage. Download Table 8-2, XLSX file.
Table 8-3
FIMO screening for RBP consensus sites. The results of FIMO screening for RBP consensus sites (Van Nostrand et al, 2020) with additional FMRP sites from Anderson et al (2016) and Ascano et al (2012). The RBP is given in column A, the motif in column B, and the number of peaks (from the large reads >32 nt) with a match from the FIMO search (p < 0.05) in column C. The total number of occurrences of the motif (there may be multiple matches in each peak) is given in D, and the distribution of the number of matches in each peak is given in E. Download Table 8-3, XLSX file.
Contrary to our initial hypothesis that peaks representing stalled footprint reads would be clustered around the stop codon, only 6 of the 766 total peaks were at the stop codon, and, similar to the overall coverage of footprint reads, the peaks were biased to the first half of the message with an average position of 0.37 ± 0.26 (SD), where 0 is the start codon and 1 is the stop codon (Extended Data Table 8-1). The average length of the sequence within a peak was 36 ± 6 nt, similar to the most common footprint read size (Fig. 5B). We also identified peaks in the smaller reads, but only 65% of these peaks were in the CDS, and there was little overlap with the peaks in the long reads. In contrast, 94% of the peaks from large reads were in the CDS. Although secretory proteins are enriched in the GF (Extended Data Fig. 7-1), only 5% of the peaks were from secreted mRNAs (Extended Data Table 8-1), suggesting that the presence of these mRNAs in the pellet may be due more to association with membranes than to enrichment in stalled polysomes. GO analysis of the mRNAs with peaks matches many of the GO terms identified earlier for abundant and ribosome-occupied mRNAs in the GF, including enrichment for the molecular function mRNA binding (1e-15), the cellular component of neuron projection (2e-19), and the cytoskeleton (1e-12; Extended Data Table 6-2).
We next examined whether consensus sequences were in these peaks of footprint reads from the large fragments. We used an unbiased sequence motif search approach with the HOMER program (Heinz et al., 2010). HOMER uses relative enrichment and requires a background sequence. To remove usage bias for mRNA sequences, we used similarly sized fragments from mRNAs with no peaks as our background selection. The HOMER program identified three highly significant consensus sequences in the peaks above the false discovery rate determined by the program (Fig. 8C). Notably, the most significant consensus sequence (p = 1e-67) included the consensus sequence (WGGA) previously derived from analyzing FMRP CLIP sequences (Ascano et al., 2012; Anderson et al., 2016), which also overlapped with the consensus sites for m6A methylation (RGACH or RRACT) in the nervous system (Zhang et al., 2018; Fig. 8D–F). The motif was not biased to the start or end of the protected reads (average position of the motif in reads was 19 ± 9 bases) and thus did not represent sequences selected because they were difficult for nucleases to digest as sequences selected because of resistance to digestion would be at the end or beginning of the read. Analysis of 36 nt in front of the peaks or 36 nt behind the peaks did not result in a HOMER consensus sequence above the false discovery rate determined by the program. Neither did a HOMER search using just the 3′ extension of the large reads. Similarly, peaks identified using small- or medium-size reads also did not result in a HOMER consensus sequence above this cutoff.
Stalling can occur on rare codons, and thus we examined whether peaks were enriched in rare codons (Athey et al., 2017). In fact, the opposite was true, and peaks had significantly fewer rare codons than the background mRNAs, presumably as they represent abundantly translated mRNAs (Extended Data Fig. 8-2; Extended Data Table 8-2). We also examined amino acid usage in the peaks compared with background sequences and note that the two most enriched amino acids are the two acidic amino acids, aspartic acid (1.8×; GAC and GAT) and glutamic acid (1.6×; GAA and GAG; Extended Data Table 8-2). Whether this is because of the enrichment of GA sequence in the HOMER motif, or whether the HOMER motif is found because of the enrichment of these two amino acids encoded by the peak sequence, is not clear.
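The codon comparison summarized above and in Extended Data Table 8-2 (the fraction of each amino acid's codons that use a given codon, in peaks versus background, expressed as a ratio) can be sketched as below. This is an illustration under stated assumptions: the input sequences are taken to be in frame, the standard genetic code is used, and Met, Trp, and stop codons are excluded as in the table legend. The function names are hypothetical, and how the authors handled reading frame is not specified here.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_fractions(sequences):
    """Fraction of each amino acid's codons contributed by each codon.

    Sequences are assumed to start in frame. Single-codon amino acids
    (Met, Trp) and stop codons are skipped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("U", "T")
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon)
            if aa is None or aa in "*MW":
                continue
            counts[codon] += 1
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TABLE[codon]] += n
    return {c: n / aa_totals[CODON_TABLE[c]] for c, n in counts.items()}


def usage_ratio(peak_seqs, background_seqs):
    """Peak usage divided by background usage for codons seen in both sets."""
    peak = codon_fractions(peak_seqs)
    background = codon_fractions(background_seqs)
    return {c: peak[c] / background[c] for c in peak if c in background}
```

A ratio above 1 indicates that a codon is used more often for its amino acid in the peaks than in the background; plotting these ratios against an external codon-rarity measure corresponds to the comparison described for Extended Data Figure 8-2.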
We also performed a directed search of the peaks for sequences matching consensus binding sites of RBPs (Van Nostrand et al., 2020) using the FIMO program (Grant et al., 2011). Consistent with the nonbiased search, FMRP CLIP consensus sites had the most matches (Fig. 8D,E; Extended Data Table 8-3). Strikingly, these motifs mainly recognized purine-rich sequences (Fig. 8E; 76% of nucleotides in the top 10 RBP motifs are purines). However, the overall peak nucleotides are not enriched in purines compared with the background sequences (both at 57% purines). The abundance of purines in the consensus peaks suggests a possible role for the RBP PURA, a protein enriched in the granule fraction but without a known consensus binding site. Of the 766 peaks, there were 415 peaks with a WGGA site and many peaks with multiple WGGA sites in the peak, making a total of 615 WGGA sites in consensus peaks (Fig. 8E; Extended Data Table 8-1). The proportion of multiple matches (39% of peaks with a WGGA have multiple WGGA sites) was higher than that found in the corresponding background sequences (11 ± 1%, from 10 separate selections of 750 background sequences with a WGGA peak). Of the 615 WGGA sites found in the peaks, 182 WGGA sites overlapped with an m6A site (Fig. 8F, examples). This percentage of WGGA sites overlapping with an RGACH site (30%) was significantly higher than the percentage of overlaps seen in WGGA sites in our background samples (19 ± 1.3%; 10 separate selections of 1300 WGGA sites from background sequences). Moreover, there were also many examples where m6A sites and WGGA sites were in the same peak but did not overlap. Of the 415 peaks with a WGGA site, 315 (76%) had an m6A consensus site in the same peak, significantly more than were found in the background sample (39 ± 4%, 10 separate selections of 750 background sequences with a WGGA peak).
There were also many matches to other FMRP CLIP consensus sites (Fig. 8D). After including GVAGAW and GACAAG, more than 80% of the 766 consensus peaks contained an FMRP CLIP sequence or m6A consensus sites, suggesting a strong sequence bias for these sites in the regions of the mRNA enriched in footprint reads (Extended Data Table 8-1).
Discussion
The granule fraction is an enriched preparation for stalled ribosomes
Neuronal RNA granules containing ribosomes are found in the pellet of sucrose gradients (Krichevsky and Kosik, 2001; Aschrafi et al., 2005; Elvira et al., 2006; El Fatimy et al., 2016). However, previous studies did not show that ribosomes in these fractions are stalled. Stalled polysomes have been associated with FMRP (Darnell et al., 2011); however, the RNA granules were not isolated by sedimentation in that study. Here, we show that the GF is not only enriched for FMRP (as had been previously shown; El Fatimy et al., 2016), but also for UPF1, an additional protein functionally implicated in stalled polysomes (Graber et al., 2017). Moreover, we find that mRNAs that were previously shown to resist ribosomal run-off by initiation inhibitors (Shah et al., 2020), or to be regulated at the elongation stage of translation (Kenney et al., 2016), were over-represented in the footprint reads generated from this fraction. Finally, cryo-EM of the monosomes derived from the ribosome clusters in the GF demonstrates that they are mainly stalled in the hybrid position. These findings are consistent with the GF enriching for ribosomes that were stalled in neurons.
The distinction between the GF and the PF (i.e., the reason that the clusters of ribosomes found in the GF pellet under these conditions) remains unclear. This distinction is unlikely to be that the polysomes in the GF are larger than the ones in the PF. First, there are very few ribosomes in the fractions preceding the pellet, inconsistent with a gradual running off of large polysomes. Second, there is no correlation between the abundance or enrichment of mRNAs in the GF and their length (total transcript or ORF; Extended Data Fig. 7-4), as would be expected if these were simply larger polysomes. Third, counting ribosomes in the cluster did not reveal a larger number of ribosomes/cluster compared with the polysome fraction. The GF may represent aggregates of ribosomes/polysomes. However, the question would remain why these would show specific enrichment of both specific types of mRNA (particularly those known to be stalled in neurons) and proteins (FMRP, UPF1). It is possible that RNA granules containing stalled polysomes may be more likely to aggregate than normal polysomes. Given the role of RNA granules in mRNA transport to distal sites in the neuron, the aggregation of stalled polysomes may aid in the protection of the stalled mRNAs and nascent peptides during transport.
The GF was also enriched in reads from secretory proteins, and major GO terms for enrichment were endomembrane, endoplasmic reticulum, and vesicles (Fig. 6). The 20 most enriched mRNAs for GF compared with Total polysomes are all mRNAs encoding secreted proteins (Table 4). However, compared with total mRNA, there was decreased abundance and decreased ribosomal occupancy of secreted mRNAs (Fig. 7) despite their enrichment in the GF (Extended Data Fig. 7-1). It is possible that most ribosomes translating secretory proteins are lost because of their association with ER during the sedimentations to enrich for polysomes (Fig. 1A) explaining their decreased abundance. However, the ER-associated ribosomes that are found in the polysome pellet appear to preferentially sediment in the second sucrose gradient. As these mRNAs are neither abundant nor have high ribosomal occupancy, only 5% of the peaks emanate from secreted mRNAs (Extended Data Table 8-1). We conclude that although ribosomes in the process of translating secretory mRNAs are enriched in the pellet, they are not a major constituent of the RNA granules containing stalled polysomes.
Although UPF1 is enriched in the GF, there is no evidence that the stalling involves a process related to NMD. Other NMD proteins, such as PRNC2 are not enriched in the GF (Fig. 1). Also, neuronal NMD-regulated mRNAs (Kurosaki et al., 2021) show decreased abundance, no difference in ribosome occupancy, and no enrichment in the protected reads (Extended Data Table 6-1).
It should be noted that this preparation represents a snapshot of P5 brains. This sedimentation protocol is complicated by the presence of myelin at later stages of development (El Fatimy et al., 2016). Still, if feasible, the mRNAs found in the granule fraction would likely be different at the adult stage as many of the mRNAs with the most abundant footprint reads that we saw in this fraction are developmentally implicated in neuronal outgrowth. A related protocol developed for embryonic day 18 rat brains showed fewer ribosomes when performed in adults (Elvira et al., 2006). However, stalled polysomes and some of the mRNAs isolated here (notably Map1b) are implicated in mGluR-LTD, a plasticity present mainly in mature brains (Nosyreva and Huber, 2005). Thus, it is plausible that the role of stalled polysomes, or the mRNAs that they regulate, may change as the brain develops.
Cryo-EM analysis reveals two populations of stalled ribosomes
The ribosomes in the GF are mainly stalled in the hybrid state, and this result has also been seen in a recent cryo-EM analysis of polysomes found in RNA granules isolated using a distinct protocol (Kipper et al., 2022). Collided ribosomes appear when a trailing ribosome encounters a slower or paused leading ribosome in a polysome. The paused leading ribosome contains a peptidyl-tRNA in the P-site, and the trailing collided ribosome adopts the rotated state with A/P and P/E hybrid tRNAs (Juszkiewicz et al., 2018). Thus, at first glance, our structural studies resemble those of collided ribosomes. However, we did not observe other signs of collided ribosomes, such as periodic peaks or the size of peaks corresponding to two or more ribosomes. Indeed, ribosomes that are stalled because of collision with a downstream ribosome recruit specific factors to resolve the stall and rescue the ribosomes (Buskirk and Green, 2017). These factors are not present in the proteomics of RNA granules (Kanai et al., 2004; Elvira et al., 2006; El Fatimy et al., 2016), including proteomics of RNA granules containing ribosomes in the hybrid position (Kipper et al., 2022). Indeed, if these stalled ribosomes are meant to be transported and reactivated, they need to be protected from the surveillance mechanisms that normally target prematurely stopped or collided ribosomes, which results in the unstalling and/or degradation of the proteins (Buskirk and Green, 2017). Finally, collided ribosomes protect a 58 nt fragment, even after extensive RNase I treatment (Zhao et al., 2021), which is not the size of the fragments we observed in this study. Thus, although collided ribosomes are also found stalled in the hybrid state, the ribosomes studied here are unlikely to be collided ribosomes.
In addition, it is unlikely that the longer protected fragments arise from the 15% of the structures with a tRNA in the P site and an empty A site, as the majority of protected reads are large. The consensus sites underneath ribosome clusters were found only using sequences from the large fragments, not the small fragments. Thus, the large reads and the peaks of protected fragments are presumably coming from the ribosomes blocked in the hybrid position before eEF2-mediated translocation. Again, this differs from what would be expected from collided ribosomes, where the paused ribosome has an empty A site and the following ribosomes are stalled in the hybrid position.
We found no additional densities that could be assigned to putative stalling factors at the tRNA interface region of the ribosome or other regions of the ribosome. This is despite the enrichment of FMRP in this fraction and its proposed role as a ribosome binding stalling factor. Thus, these structures are more consistent with stalling being encoded by the consensus mRNA sequences instead of a specific protein factor. However, mobile binding proteins are difficult to identify with cryo-EM, and a more focused examination will be required to identify small molecules, post-translational modifications, or other mechanisms that may determine how stalling occurs. We observed a mobile or absent P stalk (Fig. 3). A similar finding was made for polysomes stalled by K63 ubiquitination after oxidative damage in yeast (Zhou et al., 2020). Because the P stalk is important for eEF2 binding (Naganuma et al., 2010), altering the P stalk could help explain the absence of eEF2 and the stalling in the hybrid state. Moreover, the P stalk proteins were identified in a recent screen for ribosomal proteins that can exchange with newly synthesized proteins locally in neuronal dendrites (Fusco et al., 2021) consistent with the possibility that the P stalk proteins dissociate from ribosomes in neurons. However, what causes the loss or increased mobility of the P stalk and its relationship to the consensus sequences detected under the ribosome is not clear. It is also possible that nuclease treatment removed stalling factors located on the mRNA outside the protected fragments, although we did not find consensus sequences immediately before or after the peaks of protected fragments or in the 3′ extended sequences. Further studies will be required to provide more insight into the stalling mechanism.
Cytoskeletal and RNA binding proteins represent abundant mRNAs and mRNAs with high ribosome occupancy, respectively
Axon and dendrite outgrowth, coupled with synapse formation, occur at high rates in P5 rat brains (Semple et al., 2013). Indeed, the most abundant footprint reads are on cytoskeletal mRNAs and mRNAs encoding proteins that are highly enriched in growth cones (such as 14-3-3 proteins; Kent et al., 2010). The most abundant number of footprint reads is found on β-actin, whose local translation in both axons and dendrites is important for neuronal outgrowth (Eom et al., 2003; Leung et al., 2006). Cytoskeletal encoding mRNAs are also enriched in the mRNAs with peaks (Extended Data Table 6-2). However, they are not particularly enriched in the GF compared with the PF or total polysomes and cytoskeletal mRNAs are abundant in all these fractions. More surprising was the high ribosome occupancy for RNA binding proteins in the footprint reads (Fig. 6; Table 3), and mRNA binding proteins are also enriched in the mRNAs with peaks (Extended Data Table 6-2). This suggests an important homeostatic aspect for translation from stalled polysomes, in which the increased translation of RBPs will have critical effects on the translation of other messages not necessarily present in stalled polysomes.
Identification of conserved motifs enriched in footprint read peaks from pelleted ribosomes
The finding that the footprint reads derived from the GF are distributed mainly in large peaks is consistent with the enrichment of stalled ribosomes in this fraction. These peaks are strikingly enhanced in sequences previously defined as enriched in FMRP CLIPs (Fig. 7). This is consistent with the strong enrichment of FMRP in the granule fraction (Fig. 1) and the abundance and enrichment of mRNAs previously identified as associated with FMRP using CLIP experiments (Fig. 6). Although most FMRP CLIP consensus sequences have not been directly shown to bind to FMRP, the WGGA sequence can be directly bound by FMRP (Ascano et al., 2012). However, the major FMRP binding sites (G quadruplex and Kissing sequence) are not enriched in FMRP CLIP consensus sites (Anderson et al., 2016). Moreover, if ribosomes protect this sequence, it is unclear how FMRP would gain access to these sequences. However, it is possible that FMRP initially binds this sequence, and this is followed by ribosome occupation and stalling. It is also possible that these sequences are enriched in FMRP CLIPs because FMRP is specifically associated with stalled ribosomes, and these sequences specify where ribosomes would be stalled independently of FMRP. In this scenario, the sequences would not be directly bound by FMRP. Instead, FMRP would be cross-linked to sequences near to but not protected by the ribosome. Because the CLIP sequences are ∼100 bp, this is entirely consistent with both the CLIP and ribosome footprint read data.
GGA was also identified as a consensus site in the 3′UTRs of mRNAs whose transport to distal sites in neurons was decreased in the absence of FMRP (Goering et al., 2020). However, in this case, the GGA sites were involved in forming G quadruplexes and binding to the RGG domain of FMRP, whereas the WGGA CLIP sites in the open reading frame are not associated with G quadruplexes (Anderson et al., 2016). Moreover, the lack of transport was largely rescued by the I304N mutant of FMRP that does not bind ribosomes (Goering et al., 2020). Although that study compared ribosome footprints from a cell line differentiated into neuron-like cells in the presence or absence of FMRP, there was no indication that footprints from RNA granules were measured, because the protected fragments examined were of the standard 28–32 nt length.
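The consequence of restricting an analysis to canonical footprint sizes can be illustrated with a minimal sketch. The 28–32 window is the standard range cited above; the function name and the read lengths are hypothetical, invented only to show how longer protected fragments drop out of a size-filtered dataset.

```python
def fraction_retained(read_lengths, min_len=28, max_len=32):
    """Fraction of reads whose length lies within [min_len, max_len]."""
    lengths = list(read_lengths)
    if not lengths:
        return 0.0
    kept = sum(1 for n in lengths if min_len <= n <= max_len)
    return kept / len(lengths)


# Toy read lengths invented for illustration only (not data from this study).
toy_lengths = [28, 30, 31, 32, 35, 36, 38, 40]

canonical_only = fraction_retained(toy_lengths)          # standard window
extended_window = fraction_retained(toy_lengths, 28, 40)  # widened window
```

In this toy case half of the reads fall outside the canonical window, so a size-filtered analysis would not see them.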
The consensus sites in the peaks are also enriched in a consensus motif for m6A modification. Interestingly, mRNAs with m6A sites are selectively transported in neurons (Merkurjev et al., 2018), and this methylation plays an important role in neurodevelopment (Widagdo and Anggono, 2018). Moreover, mRNAs that are associated with FMRP by CLIP experiments have previously been shown to be highly enriched for m6A modifications in neurons (Zhang et al., 2018), and this has been proposed to play a role in FMRP-mediated nuclear export (Hsu et al., 2019; Westmark et al., 2020). Nevertheless, the peak sequence motifs that we identify as matching the m6A consensus are not enriched in the m6A modification sites reported in another study (Zhang et al., 2020; Extended Data Table 8-1), and the m6A reader YTHDF1 is not enriched in the GF (Fig. 1). Whether FMRP directly binds to m6A, interacts with m6A readers, or is associated with m6A through some other indirect interaction is unclear. There has been some indication that m6A directly leads to the stalling of ribosomes (Choi et al., 2016), or it may be that some m6A reader in neurons is important for the stall. Again, it is unclear how the reader would access the mRNA sequences protected by the ribosome; but, as proposed for FMRP, the sequence may be recognized first and only later occupied by the ribosome. It will be interesting in the future to determine the specific relationship between m6A methylation and stalled polysomes and whether initial findings of specific roles for m6A methylation in the developing brain are linked to a possible role in stalling translation in RNA granules.
Are stalled monosomes components of RNA granules?
Our data are consistent with a model in which a controlled form of stalling attracts specific factors in neurons, such as FMRP, that likely play a role in the packaging of the stalled polysomes into a granule for transport before collisions occur. Indeed, this may even occur before multiple ribosomes are initiated on an mRNA because the mRNAs preferentially translated by monosomes in dendrites are highly enriched in the GF compared with the PF (Extended Data Fig. 7-1), suggesting that monosomes that have already started translation are also packaged into the granules. The packaging of many distinct mRNAs in the same mRNA granule is consistent with the finding that in situ RNA granules containing stalled polysomes probably contain many distinct mRNAs (Langille et al., 2019) and recent data on purified RNA granules suggesting multiple polysomes in an RNA granule (Kipper et al., 2022).
Conclusions
Although links have been assumed between the ribosomes that sediment in sucrose gradients, stalled polysomes identified by resistance to ribosomal run-off in neuronal dendrites, and the association of FMRP with stalled polysomes, there were previously no direct connections between these lines of research. Identifying an enrichment for FMRP CLIP consensus sequences in protected reads from ribosomes in the pellet of sucrose gradients establishes these links. Notably, most investigations of translation regulation in neuronal tissues do not consider the pellet fraction after separating polysomes using sucrose gradients. Thus, they do not account for this pool of translationally repressed mRNAs. For ribosome profiling from the initial polysome pellet, the use of translation efficiency needs to be re-evaluated because many of the ribosomes in this initial pellet are presumably stalled. Moreover, some studies include only footprints of the canonical size and may exclude the larger footprints that we observed in this preparation. Thus, these results have large implications for the interpretation of many studies on neuronal translation.
Our data strongly support a model in which mRNAs encoding cytoskeletal and RNA binding proteins important for neurodevelopment are regulated through ribosomes stalled in elongation. These stalled ribosomes are packaged into RNA granules and transported to distal sites, where, on stimulation, they would be reactivated to produce fast, local protein synthesis. RNA granule proteins such as FMRP, which interact with these mRNAs either by binding directly to stalling sequences on the mRNA or by binding to stalled ribosomes occupying these sequences, would thus act as master regulators of the fate of many mRNAs by regulating their reactivation from stalled polysomes.
Footnotes
This work was supported by Canadian Institutes of Health Research Grant 374967 to W.S.S., Azrieli Foundation Grant 501100005155 to W.S.S. and J.O., and Hellenic Foundation for Research and Innovation Grant 2556 to C.G.G. W.S.S. is a James McGill Professor. S.M.J. was supported by a Queen's University Belfast Patrick Johnston Research Fellowship. We thank staff at the Facility for Electron Microscopy Research (FEMR) at McGill University. FEMR is supported by the Canada Foundation for Innovation, the Quebec government, and McGill University. We thank Mehdi Amiri and the Sonenberg lab for help with the acquisition of the UV absorption data. Figures were created using Biorender.com.
The authors declare no competing financial interests.
Correspondence should be addressed to Wayne S. Sossin at wayne.sossin@mcgill.ca