Glioblastoma (GBM) is the most aggressive human brain tumor. Although several molecular subtypes of GBM are recognized, a robust molecular prognostic marker has yet to be identified. Here, we report that the stemness regulator Sox2 is a new, clinically important target of microRNA-21 (miR-21) in GBM, with implications for prognosis. Using the MiR-21–Sox2 regulatory axis, approximately half of all GBM tumors present in the Cancer Genome Atlas (TCGA) and in-house patient databases can be mathematically classified into high miR-21/low Sox2 (Class A) or low miR-21/high Sox2 (Class B) subtypes. This classification reflects phenotypically and molecularly distinct characteristics and is not captured by existing classifications. Supporting the distinct nature of the subtypes, gene set enrichment analysis of the TCGA dataset predicted that Class A and Class B tumors were significantly involved in immune/inflammatory response and in chromosome organization and nervous system development, respectively. Patients with Class B tumors had longer overall survival than those with Class A tumors. Analysis of both databases indicated that the Class A/Class B classification is a better predictor of patient survival than currently used parameters. Further, manipulation of MiR-21–Sox2 levels in orthotopic mouse models supported the longer survival of the Class B subtype. The MiR-21–Sox2 association was also found in mouse neural stem cells and in the mouse brain at different developmental stages, suggesting a role in normal development. Therefore, this mechanism-based classification suggests the presence of two distinct populations of GBM patients with distinguishable phenotypic characteristics and clinical outcomes.
SIGNIFICANCE STATEMENT Molecular profiling-based classification of glioblastoma (GBM) into four subtypes has substantially increased our understanding of the biology of the disease and has pointed to the heterogeneous nature of GBM. However, this classification is not mechanism based and its prognostic value is limited. Here, we identify a new mechanism in GBM (the miR-21–Sox2 axis) that can classify ∼50% of patients into two subtypes with distinct molecular, radiological, and pathological characteristics. Importantly, this classification can predict patient survival better than the currently used parameters. Further, analysis of the miR-21-Sox2 relationship in mouse neural stem cells and in the mouse brain at different developmental stages indicates that miR-21 and Sox2 are predominantly expressed in mutually exclusive patterns, suggesting a role in normal neural development.
Gliomas are the most common type of malignant primary brain tumor in adults, and glioblastoma (GBM) is the most common and aggressive form of glioma (Gladson et al., 2010; Dunn et al., 2012; Sanai and Berger, 2012; Huse et al., 2013; Louis et al., 2014; Sturm et al., 2014; Lathia et al., 2015; Agnihotri and Zadeh, 2015; Aldape et al., 2015; Furnari et al., 2015), with an average survival duration of 15 months. The standard therapies for GBM—surgery, radiation therapy, and chemotherapy—are not very effective and the tumors generally relapse. Molecular analyses, the generation of invaluable mouse models (Hambardzumyan et al., 2011; Chen et al., 2012), and information derived from various other sources have led to new therapeutic approaches for GBM (Ahluwalia et al., 2010; Weller et al., 2012; Weller et al., 2013; Cloughesy et al., 2014; Hamza et al., 2014; Wick et al., 2014; Nabors et al., 2015). The molecular classification of GBM (Phillips et al., 2006; Brennan et al., 2009; Noushmehr et al., 2010a; Verhaak et al., 2010; Bastien et al., 2015) raised the hope that patient survival could be easily and reliably predicted on the basis of the classification system. However, although MGMT promoter methylation (Hegi et al., 2005; Bady et al., 2012; Stupp and Hegi, 2013; Wick et al., 2014) and IDH1 somatic mutation/G-CIMP status (Noushmehr et al., 2010) can predict survival in GBM patients, a similar therapeutic regimen is still generally used for all newly diagnosed GBM patients regardless of the patient's GBM subtype. Moreover, the main clinically accepted prognostic factors are age, Karnofsky performance status, and degree of resection (Sanai and Berger, 2012; Hatiboglu et al., 2013) and there is still no commonly implemented molecular marker.
MicroRNAs (miRNAs) are known to posttranscriptionally regulate many genes by either destabilizing mRNA or blocking translation and therefore play a critical role in normal development, cancer, stress response, and other physiological states (Singh et al., 2009; Gurtan and Sharp, 2013; Sheedy, 2015). Dysregulation of miRNA expression appears to be a common feature in many cancers. MiRNA-21 (miR-21) is upregulated in many cancers, including GBM (Møller et al., 2013; Sheedy, 2015). However, expression of miR-21 in neural stem cells in transgenic mice does not produce brain tumors (Huse et al., 2009; Medina et al., 2010), suggesting a gap in our knowledge of the role of miR-21 in brain tumors. Here, we find that miR-21 destabilizes Sox2 expression.
Studies have indicated that GBM tumors consist of GBM stem-like cells (GSCs) and have highlighted the roles of GSCs in modulating the tumor microenvironment, neovascularization, and tumor resistance to chemotherapy and radiation therapy (Emmenegger and Wechsler-Reya, 2008; Dirks, 2010; Charles et al., 2012; Sampetrean and Saya, 2013; Binda et al., 2014, Lathia et al., 2015). However, it is unclear whether there are other cell types in GBM tumors that might affect tumor biology and patient outcomes. Here, we identify two phenotypically and molecularly distinct cell types in GBM tumors that indicate different patient prognoses.
Materials and Methods
MiR-21–Sox2 whole-genome screen.
From the Cancer Genome Atlas (TCGA) database, a log2- and median-centered mRNA and miRNA expression dataset consisting of 225 patients with primary untreated GBM and Karnofsky performance scores >70 was collated. The patients were ranked according to miR-21 expression levels. The 20 patients with the highest miR-21 expression levels were compared with the 20 with the lowest miR-21 expression levels. Four patient datasets were generated; each set had 10 patients with high miR-21 and 10 patients with low miR-21. A whole-genome analysis by Comparative Marker Selection (GenePattern suite, Broad Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts) revealed the top upregulated and downregulated genes in the high-miR-21 and low-miR-21 patients. Common and significantly downregulated genes from each of the four sets were compiled into a list of 69 genes. This list was then compared with the predicted gene targets of miR-21 using miRWalk (Dweep et al., 2011) and a final list of 29 genes inversely correlated with miR-21 and with at least one predicted binding site remained. The genes were then ranked according to both the predictive power of miR-21 binding and average p-value across the four discovery sets. For visualization, a gene expression heat map using the patient data analyzed was created using dChip software (Fig. 1A).
Patient-derived GSCs were isolated from surgical specimens of GBM patients treated at The University of Texas M.D. Anderson Cancer Center as described previously (Jiang et al., 2007; Kamal et al., 2012). Briefly, GSCs were grown in DMEM F-12 medium (Invitrogen) supplemented with 1:50 B27 (Invitrogen), 1:100 penicillin-streptomycin (Invitrogen), 1:100 l-glutamine (Invitrogen), 20 ng/ml epidermal growth factor (EGF) (Sigma-Aldrich), and 20 ng/ml basic fibroblast growth factor (b-FGF) (Sigma-Aldrich). Cells were passaged every 3 or 4 d using enzymatic dissociation.
Neural stem cells (NSCs) were isolated from the cortices of wild-type embryonic day 12 (E12) mouse embryos (Rietze and Reynolds, 2006). All of the mouse experiments were performed according to protocols approved by the Institutional Animal Care and Use Committee at M.D. Anderson Cancer Center. Care was taken to remove the meninges and other surrounding tissue. Briefly, the brains were dissected and mechanically dissociated and then the cells were seeded in complete neurobasal medium (Invitrogen) supplemented with 1:50 B27 (Invitrogen), 1:100 penicillin-streptomycin (Invitrogen), 1:100 l-glutamine (Invitrogen), 20 ng/ml EGF (Sigma-Aldrich), 20 ng/ml b-FGF (Sigma-Aldrich), and 20 μg/ml heparin (Sigma-Aldrich) in ultra-low-attachment plates (Corning). Neurospheres were enzymatically dissociated into single cells using Accutase (Sigma-Aldrich) once every 4–5 d. To induce differentiation of E12 NSCs, cells were seeded on poly-l-ornithine-coated plates in differentiating medium devoid of growth factors (EFG/FGF) and supplemented with 2% fetal bovine serum for 7 d.
In situ immunofluorescence.
GSCs grown in chamber slides were fixed with 4% paraformaldehyde for 15 min and washed 3 times with PBS. Cells were then incubated in a prehybridization (pre-hyb) mixture (50% formamide, 1 m NaCl, 1× Denhardt's solution, 0.5 mg/ml yeast tRNA, and 0.3% Triton X-100) for 2 h at 60°C. After pre-hyb, the cells were incubated with biotin-labeled locked nucleic acid probes (Exiqon) at 60°C for 14–16 h in hybridization mix (pre-hyb plus 25 nm probe). Next, the cells were washed with 5 × sodium chloride/sodium citrate buffer (SSC), pH 7, for 5 min, 50% formamide in 1 × SSC and 0.1% Tween 20 at hybridization temperature for 30 min, and 0.2 × SSC for 10 min, followed by 3 washes with PBS. Cells were incubated in fluorescein-isothiocyante-avidin (1:500; Invitrogen) for 30 min. For the combined immunofluorescence, the cells were blocked with a biotin-avidin kit (Vector Laboratories) according to the manufacturer's recommendations. Cells were blocked with 4% normal goat serum for 1 h and incubated with the following primary antibodies overnight at 4°C: rabbit anti-Sox-2 (1:50; Stemgent). The cells were then washed with PBS 3 times, incubated with the corresponding secondary Alexa Fluor dyes for 1 h, washed twice with PBS, mounted with fluorescence-mounting medium containing 4′,6-diamidino-2-phenylindole (Vector Laboratories), and photographed using a Nikon TE2000 controlled with MetaMorph software (Molecular Devices).
The formalin-fixed, paraffin-embedded sections were washed with xylene and rehydrated through decreasing ethanol concentrations. Antigen retrieval was done at 100°C for 30 min in 10 mm sodium citrate. Sections were allowed to cool for 30 min and then in situ immunofluorescence or immunofluorescence for Ki67 or TUNEL (Kamal et al., 2012) was performed as described above.
Stable overexpression of miR-21 was achieved using custom-made lentiviruses overexpressing pre-miR-21 or control vector (System Biosciences). In brief, 50,000 cells were plated on 24-well plates and allowed to recover for 3 h. Cells were incubated with the pre-miR-21 lentivirus particles (multiplicity of infection = 4) for 48 h. Cells were allowed to grow for 5 d and then sorted using green fluorescent protein. Overexpression was confirmed using real-time PCR analysis.
Stable GSC lines were generated using custom-made lentiviruses against miR-21 (miRzip-21) or control vector lentiviruses (control) from System Biosciences. MiRzip-21 or control cells were then transfected with a double-luciferase 3′-UTR reporter construct of human Sox2/Renilla control reporter (catalog #HmiT017628; GeneCopoeia) using the Amaxa electroporation system at a density of 1 μg of human Sox2 3′-UTR construct per 1 × 107 cells” Transfected cells were then incubated for 24 h and cell lysates were prepared. Firefly and Renilla luciferase activity was measured by using the Dual-Glo luciferase assay system according to the manufacturer's instructions (Promega).
Quantitative real-time reverse transcription PCR.
Total RNA was isolated using the Trizol (Invitrogen) method according to the manufacturer's instructions. MiRNA levels were estimated using the miRvana quantitative reverse transcription (qRT)-PCR kit (Ambion) according to the manufacturer's instructions. For mRNA, cDNA synthesis was done using a Verso cDNA kit (Thermo Scientific). Real-time PCR experiments were done using an Applied Biosystems 7900 system.
Western blot analysis.
Whole-cell lysates were prepared and ∼30 μg of protein was separated on an SDS-PAGE gel (Pierce). Proteins were electrophoretically transferred onto nitrocellulose membranes. The membranes were probed with the following specific primary antibodies: mouse anti-SOX2 (1:350; R&D Systems), rabbit anti-SOX2 (1:1000; Cell Signaling Technology), and mouse anti-actin (1:10,000; Sigma-Aldrich), followed by probing with IRDye-conjugated anti-mouse/anti-rabbit secondary antibodies (1:15,000; LI-COR) for 1 h. The blots were scanned using the Odyssey infrared imaging system and quantified using Odyssey version 3.0 software.
For GSCs, the neurosphere assay was performed as described previously (Kamal et al., 2012). In brief, cells dissociated into single-cell suspensions with 10 cells per well for self-renewal assay for multiple generations or 100 cells per well self-renewal for single generation were seeded in 96-well plates and the number of neurospheres >50 micron was counted after 15 d.
qRT-PCR for miR-21 and Sox2 in GBM patient samples.
Tumor samples of patients with histologically confirmed GBM who were treated at M.D. Anderson were obtained from the MDA Brain Tumor Center tissue bank. A retrospective analysis of the patient charts was performed to collect patient data such as age and overall survival. Samples were examined and dissected, if required, by a neuropathologist (K.A.) to ensure purity of the tumor tissue. Total RNA was isolated from the formalin-fixed paraffin-embedded tissues using the Epicenter DNA and RNA isolation kit (Epicenter Biotechnologies) after deparaffinization and proteinase K treatment of the paraffin-embedded samples. Expression levels of miR-21 and Sox2 were quantified using the relative threshold cycle method and normalized to housekeeping gene EEF1A.
MiR-21–Sox2-based GBM classification using TCGA.
TCGA GBM level 1 expression data were obtained for mRNA expression (http://cancergenome.nih.gov/). A robust multiarray analysis was performed in R (R Project, http://www.r-project.org/). Among the three probes available for Sox2, we chose the probe with the highest variation and the highest mean to be representative of Sox2 (213721_at). For miR-21, level 2 Agilent data were downloaded and the two highly correlated probes (Rsq = 0.9) for miR-21 were averaged. A total of 353 primary untreated patient samples with mRNA and miRNA annotation were present in the TCGA at the time of analysis. For each statistical analysis, the highest possible number of patients was included, limited by pertinent annotation availability and separation into high miR-21/low Sox2 and low miR-21/high Sox2 subtypes. The high miR-21/low Sox2 and low miR-21/high Sox2 groups were determined as specified below in the M.D. Anderson patient set: the miR-21 and Sox2 expression data were stratified into high miR-21/low Sox2 and low miR-21/high Sox2 using a median expression cutoff. In this context, half of the samples were classified as “high” and the other half were classified as “low” for both miR-21 and Sox2 independently. When classifying the patients on the basis of MiR-21–Sox2 status, 3 possible groups emerged: high miR-21/low Sox2 (Class A), low miR-21/high Sox2 (Class B), and high miR-21/high Sox2 or low miR-21/low Sox2 (Class C). Substratification was further performed for time to death, vital status, age, Karnofsky performance status, and molecular GBM subgroups. Kaplan–Meier survival and Cox proportional hazards ratio were calculated using Microsoft Excel 2010 and JMP version 9.01 statistical software (SAS Institute).
MiR-21–Sox2-based GBM classification using RT-PCR.
The expression levels of miR-21 and Sox2 were normalized with EEF1a in the GBM patient samples. The median values of miR-21 and Sox2 expression were obtained. Samples above the median for each regulator were categorized as “high” and samples below the median were categorized as “low.” Samples were then stratified as Class A, Class B, or Class C. Median survival and p-values were determined by Kaplan–Meier statistics using JMP 9.01 (SAS Institute).
TCGA phenotype analysis.
Magnetic resonance images were obtained from the Cancer Imaging Archive (http://cancerimagingarchive.net/) and analyzed quantitatively for volume contrast enhancement and necrosis as described previously (Zinn et al., 2011a). A total of 82 patients were included in the analysis. Histopathology annotation for percentage of necrosis was downloaded from the public TCGA data portal and matched to our TCGA dataset. A total of 228 patients were included in the analysis. Expression of TCGA stemness markers was calculated using normalized TCGA mRNA expression for CD133, Nanog, and Nestin in 173 Class A and B patients. Mean necrosis and contrast volumes and SEs for each Class (A–C) were calculated using JMP Pro version 9.01.
The methylation status of the MGMT promoter was predicted by MGMT-STP27 (Bady et al., 2012). The raw DNA methylation datasets (HM-450K) were obtained after Illumina preprocessing that included a background correction with the R package minfi and the preprocessed data were centered and normalized by probes, followed by unsupervised hierarchical classification of the 1000 most variable probes (autosomes only) using Ward's algorithm and Euclidean distance to establish CpG island methylator phenotype (CIMP) classification (green rectangle for non-CIMP and red for CIMP), as defined in Bady et al. (2012). The R statistical software package “survival” was used for the survival analysis.
Mouse orthotopic GBM models.
Class B GSCs were treated with either nontargeting shmiRNA (shNT) or shmiR-21. Clones were generated and characterized as described previously (Kamal et al., 2012). These cells were then used to assess the tumorigenic potential in vivo. Based on the power analysis for each cell type, 8 male nude mice 4–5 weeks old (obtained from the Department of Experimental Radiation Oncology at M.D. Anderson) were used. The cells transduced with vector alone were used as controls. Brain orthotopic tumor models were generated as described previously (Kamal et al., 2012). Briefly, 50,000 cells were implanted in a total volume of 5 μl into the right frontal lobes of the mice. Prism version 6.01 software (GraphPad) was used to generate survival curves.
Mouse orthotopic tumor characterization for cell proliferation using Ki67 assay.
Paraffin sections were deparffinized by washing in xylene 3 times for 5 min each. The sections were hydrated using gradient alcohols. Antigen retrieval was performed using citrate buffer and steaming for 20 min. The sections were allowed to cool down slowly for at least 90 min before blocking with 5% goat serum followed by incubation with Ki67 (1:500, catalog #ab136912; Abcam) antibody for 1 h at room temperature. After washing twice with PBST, the sections were incubated with secondary antibody followed by mounting with 4,6 diamino-2-phenylindole (DAPI) (Vector Laboratories). Immunofluorescence signal was measured under a microscope. Six independent areas were counted for quantification.
Mouse orthotopic tumor characterization for apoptosis by TUNEL assay.
TUNEL assay was performed according to manufacturer's instructions (catalog #11684817910; Roche). Briefly, the sections were deparffinized and hydrated as described for the proliferation assay. After antigen retrieval using citrate buffer, the sections were cooled rapidly and blocked in 20% goat serum, followed by the addition of TUNEL reaction mixture and incubation for 1 h at 37°C in a humidified atmosphere. Sections were washed with PBST followed by mounting medium with DAPI and positive cells were counted with a fluorescence microscope. Six independent areas were counted for quantification.
Gene set enrichment analysis.
For gene set enrichment analysis (GSEA), TCGA human genome U133 array expression data for the 173 patients with Class A (86 patients) or Class B (87 patients) GBM were downloaded through the TCGA data portal. The GSEA tool, developed by the Broad Institute (http://www.broadinstitute.org/gsea) (Subramanian et al., 2005), used a collection of 825 gene sets from the Molecular Signatures Database (MSigDB) that were annotated for gene ontology biological processes. The results were sorted according to the normalized enrichment score (NES) and the false discovery rate (FDR). The significant (FDR < 0.01) gene sets with the top 10 positive and negative NES scores were selected for unsupervised clustering analysis, which was performed with R program language. The unsupervised clustering analysis classified patients into groups based on the array expression datasets downloaded from the TCGA databases. The χ2 test level of significance of p < 0.01 was used to confirm the classification of Class A and Class B tumors.
MiR-21 targets the stemness regulator Sox2 in GBM
MiR-21 is believed to be expressed at high levels in GBM. To understand the role of miR-21 in GBM, we first determined the miR-21 levels in tumor samples from 279 patients using data from TCGA and from 69 patients treated at M.D. Anderson. In contrast to the common belief that all GBMs express relatively high levels of miR-21, miR-21 was expressed at varying levels in both experimental sets and miR-21 expression was associated with survival differences that were not significant. We reasoned that measurement of a parameter that reflects a biological mechanism connected to miR-21 function, such as its critical disease-specific gene target, would produce a biologically more informative parameter. To identify a target of miR-21 that is clinically relevant in GBM patients, we performed a whole-genome screen using the TCGA. A total of 69 concordantly downregulated genes in tumor samples with high miR-21 expression were then cross-matched with predicted targets of miR-21 using miRWalk (Suh et al., 2007). The resulting 29 genes were ranked according to the predictive power of miR-21 binding. Sox2, which is required to maintain stem cell characteristics, or “stemness,” in neural cells (Graham et al., 2003; Suh et al., 2007; Shimozaki et al., 2012), was identified as the top predicted target (5/8 algorithms) and one of the top inversely correlated genes (p = 0.01; Fig. 1A), suggesting that miR-21 suppresses Sox2 and that the two have an inverse relationship in GBM.
MiR-21–Sox2 status stratifies GBM tumors into two subtypes
To ascertain whether the MiR-21–Sox2 status can be used to classify GBM tumors, we analyzed the TCGA dataset. Using the median cutoff method, we identified two extreme subclasses that corresponded to an inverse MiR-21–Sox2 relationship (Fig. 1B; see Materials and Methods for details): progenitor-like, miR-21High/Sox2Low (Class A) and stem-like, miR-21Low/Sox2High (Class B). Class A and B tumors together comprised 50% of patients by default because of the method used. The remaining patients (Class C) had tumors with miR-21 and Sox2 expression that mathematically fell in between that of Class A and Class B and could not be stratified accurately (see Discussion). This class included either miR-21High/Sox2High (Class C1) or miR-21Low/Sox2Low (Class C2) tumors. Therefore, the method could identify the two extremes on the spectrum as Class A and Class B. To validate the TCGA results, we analyzed GBM tumor samples from M.D. Anderson patients (age 19–80 years). We determined the miR-21 and Sox2 levels in the 69 formalin-fixed, paraffin-embedded GBM samples using a simple qRT-PCR assay. Using the median cutoff, we found that 36 of the 69 patients could be categorized as either Class A or Class B, with the remaining patients falling into Class C (Fig. 1C). This ratio was similar to what was observed in the TCGA dataset.
We then analyzed patient GBM specimens using a complementary in situ hybridization (to detect miR-21)/immunofluorescence (to detect SOX2) assay. Examples of specimens representing Class A and Class B are shown (Fig. 2A). For the purposes of this work, we first concentrated on the miR-21-Sox2 mechanism-driven Class A and Class B tumors. Class C tumors are described later.
MiR-21 mechanistically suppresses Sox2 expression and MiR-21–Sox2 forms a regulatory axis
To obtain cells that could be experimentally manipulated, we generated GSCs. To determine whether miR-21 negatively regulates Sox2, we transduced a Class B-derived GSC line that expresses low miR-21 and high Sox2 levels with lentiviral vectors expressing the control vector or precursor miR-21 (pre-miR-21) and first validated that the exogenous pre-miR-21 caused an increase in the endogenous miR-21 levels (Fig. 2B, top). We then determined, using Western blotting assays, that the exogenous pre-miR-21 decreased the SOX2 protein level (Fig. 2B, bottom).
To determine whether miR-21 binds to the 3′-UTR of Sox2 mRNA, we constructed a Class A GSC stable cell line transduced with either shRNA lentiviruses against miR-21 (shmiR-21) or control vector (shNT) and a Class B GSC stable cell line transduced with either lentiviruses expressing exogenous pre-miR-21 or its corresponding control. We then transfected the stable lines with a dual reporter plasmid encoding a luciferase cDNA containing the Sox2-specific miR-21-binding site in its 3′-UTR region (Sox2 3′-UTR) and cDNA containing an internal control renilla luciferase and performed luciferase assays. Luciferase activities were normalized to the internal control renilla luciferase. We found higher luciferase activity under miR-21 knock-down conditions and lower luciferase activity under miR-21 overexpression conditions in both Class A and Class B GSCs (Fig. 2C, top: Class A, bottom: Class B).
To further confirm the specificity of the miR-21-binding site present in the Sox2 3′-UTR, we transfected a Class B GSC line stably transduced with lentivirus expressing miR-21 (Fig. 2B) or control with a dual reporter plasmid encoding a luciferase cDNA containing no Sox2 3′-UTR (empty vector), Sox2 3′-UTR, or Sox2 3′-UTR mutated at the miR-21-binding site and an internal control renilla luciferase and performed luciferase assays. Although the luciferase activity of the nontargeting control- and miR-21-transduced cells containing empty vector did not differ significantly, the Sox2 3′-UTR vector produced lower luciferase activity in the miR-21-transduced cells compared with the control (Fig. 2D). This lowered luciferase activity could be rescued by the mutant Sox2 3′-UTR construct (Fig. 2D). Together, these results indicated that miR-21 interacts with the single, specific binding site present on the Sox2 3′-UTR.
To determine whether the suppression of Sox2 by miR-21 reflects Sox2's biological function in regulating stemness, we took the Class B cells transduced with lentiviruses expressing either exogenous pre-miR-21 or its corresponding control, as described in Figure 2B, and performed self-renewal assays. As shown in Figure 2E, miR-21-mediated lowering of Sox2 expression caused a corresponding decrease in the self-renewal capacity of these cells.
Three GSC lines not used in the earlier experiments were then transduced with lentiviruses expressing exogenous pre-miR-21 or its corresponding nontargeting control and SOX2 protein levels were determined using Western blotting assays. All of the GSC lines showed decreased levels of SOX2 upon expression of exogenous miR-21, confirming that the MiR-21–Sox2 axis exists in additional GSC lines (Fig. 2F). Therefore, these results indicate that miR-21 functionally suppresses Sox2 in these cells.
Class A versus Class B classification delineates phenotypically distinct GBM tumors
To determine whether Class A and B GBM tumors represent distinct phenotypes, we performed quantitative MRI volumetric analysis as described previously by our group (Zinn et al., 2011b). The analysis demonstrated that the area of contrast enhancement (p = 0.03), amount of necrosis (p = 0.04), and total tumor volume (p = 0.02) were significantly greater in Class A GBMs than in Class B GBMs (Fig. 3A,B). When we analyzed the TCGA histopathological patient data, we again found that Class A tumors had significantly larger amounts of necrosis than Class B tumors (Fig. 3C).
Class A versus Class B classification represents molecularly distinct GBM tumors
We also analyzed the distribution patterns of the four known GBM molecular subgroups (Verhaak et al., 2010) in our classification system using the TCGA dataset. An integrated genomic view is shown in Figure 3D. The detailed expression patterns of the subtype-specific marker genes (Verhaak et al., 2010) in Class A and Class B tumors are shown in Figure 3E. We found that neither Class A nor Class B tumors aligned with the signature of any one subgroup; instead, both Class A and Class B tumors included signatures of all four subgroups, with a preponderance of the mesenchymal subgroup in Class A and a preponderance of the classical and proneural subgroups in Class B. Analysis of the mutation status of well known individual regulators of GBM in these tumors indicated that Class A, compared with Class B, tumors had significantly more NF1 mutation. The G-CIMP phenotype, which belongs to the proneural subgroup and correlates with IDH1 somatic mutation, is present in ∼10% of GBM patients and predicts longer survival (Noushmehr et al., 2010). Our analysis indicated that neither G-CIMP (Bady et al., 2012) nor IDH1 mutations were present in Class A tumors; these mutations were significantly more prevalent in Class B tumors, although the actual number of mutations was low, as expected. PDGFRA, EGFR, TP53, and PTEN mutations, as well as patient age and sex, did not differ significantly between the two classes of tumors.
To determine whether our classification system also reflected distinct gene expression patterns, we performed a global gene differential analysis using the TCGA database. We used log2-fold change of >1 as a cutoff and considered only those with adjusted p-values <0.05. Comparison of Class A and Class B resulted in 285 genes. As shown in the heat map (Fig. 4), such an analysis shows distinct gene expression clustering for Class A and Class B tumors.
We further interrogated the classification system by performing GSEA using TCGA expression data for Class A and Class B tumors using a collection of 825 gene sets from the Molecular Signatures Database that were annotated for gene ontology biological processes. The results were sorted according to the NES and the FDR. The significant (FDR < 0.01) gene sets with the top 10 positive and negative NES scores are shown in Figure 5A. Although Class B tumors were significantly associated with chromosome organization and nervous system development (Fig. 5B), Class A tumors were significantly associated with immune/inflammatory response (Fig. 5C). A total of 493 genes were further retrieved from the top 20 gene sets and were used in an unsupervised clustering analysis that classified patients into groups based on the array expression datasets downloaded from the TCGA database. The clustering results were visualized with a heat map and dendrograms (Fig. 5D). The χ2 test (p < 0.01) provided further confirmation for the distinct classification of Class A and Class B tumors.
We also performed ingenuity pathway analysis of the genes expressed in Class A and Class B tumors present in TCGA database. The results suggested cellular movement and cell-to-cell signaling and interaction as some of the most significant canonical pathways in Class A, but not Class B, tumors (Table 1). In contrast, gene expression and cell cycle were some of the most significant canonical pathways in Class B, but not Class A, tumors. Therefore, analyses of multiple molecular parameters as described above suggested that Class A and Class B tumors represent molecularly distinct tumors.
Class A versus Class B classification is a stronger predictor of patient survival than currently used parameters
To compare survival between Class A and B patients, we performed a Kaplan–Meier analysis of the TCGA patient dataset (N = 173). The Class A patients had shorter survival than Class B patients (median survival: 380 d for Class A vs 459 d for Class B; p = 0.0088; Fig. 6A, Table 2). The effect likelihood ratio showed that MiR-21–Sox2 status was the strongest and only independently significant prognostic factor (p = 0.0381) among those analyzed [Karnofsky performance status (p = 0.0890), age (p = 0.595), Table 2]. Importantly, although G-CIMP status predicts survival in ∼10% of GBM patients (Noushmehr et al., 2010), it had no significant survival prediction in our patient classification (p = 0.79; Table 2), suggesting that the survival difference seen between Class A and Class B patients is independent of G-CIMP status. Kaplan–Meier survival analysis showed that miR-21 by itself did not significantly affect survival, even before corrections (n = 279; p = 0.0761; Fig. 6B). Sox2 by itself was a significant prognostic factor (n = 279; p = 0.0307; Fig. 6C). However, after correcting for age and Karnofsky performance status, Sox2 lost significance (p = 0.0964) because of the stronger predictive power of age (p = 0.0003).
To validate the TCGA data analysis, we used M.D. Anderson patient data. Kaplan–Meier results indicated that the MiR-21–Sox2 combined signature was a significant prognostic indicator (n = 36; p = 0.02; Fig. 6D), as opposed to miR-21 alone (Fig. 6E) or Sox2 alone (Fig. 6F), among the M.D. Anderson patients. When age and Karnofsky performance status were added to the analysis, none of the parameters was significant, presumably because of the small number of patients.
We then analyzed other well characterized miRNA-mRNA pairs such as miR-19/PTEN (n = 156; p = 0.52; Fig. 6G) and miR-34/P53 (n = 130; p = 0.2; Fig. 6H) for survival advantage (Li et al., 2009; Ernst et al., 2010). These results indicated that neither of the pairs was statistically significant under similar settings, further underscoring the novelty of the MiR-21–Sox2 axis.
To determine whether the impact of MiR-21–Sox2 on GBM patient survival as determined in the TCGA and M.D. Anderson patient databases can be borne out by experimental manipulation, we compared Class B GSC cells containing low miR-21 and high Sox2 levels with their isogenic counterparts containing exogenous pre-miR-21. As characterized earlier (Fig. 2B), the transduced cells showed increase expression of miR-21 and decreased expression of SOX2. We then transplanted these cells into the brains of nude mice using a screw-guided system that we had used previously (Kamal et al., 2012) and performed Kaplan–Meier survival analyses. As shown in Figure 7, top, overexpression of miR-21 in Class B GSCs decreased the survival of tumor-bearing mice, supporting the bioinformatic results. Immunofluorescence analysis indicated that the addition of exogenous miR-21 to Class B GSCs increased cell proliferation (Fig. 7, middle) and decreased apoptosis (Fig. 7, bottom) in tumors and was associated with decreased survival of tumor-bearing mice.
Classes A and B versus Class C tumors
Because the Class A and B tumor classifications were based on a median cutoff value, we wanted to determine how the tumor characteristics would be affected when Class C tumors were included in the analyses. We first analyzed patient GBM specimens using in situ-immunofluorescence analysis, as described in Figure 2A. Examples of specimens representing Class C1 (miR-21High/Sox2High) and Class C2 (miR-21Low/Sox2Low) are shown (Fig. 8A). As shown in Figure 8B, the mutation analyses of various GBM regulators remained mostly similar to what was seen when only Class A and Class B were considered.
We then performed quantitative MRI volumetric analysis of Class A–C tumors. Because the classification was based on a median cutoff method and because Class C tumors fell in between Classes A and B, one would expect that the median radiological parameters of patients with Class C tumors would also fall between Classes A and B. The analysis supported this prediction and showed that the area of contrast enhancement, amount of necrosis, and total tumor volume for Class C tumors all fell between those of Class A and B tumors (Fig. 8C).
We also wanted to determine what would happen when we added Class C tumors to the patient survival prediction analysis. Again, one would expect that the median survival of patients with Class C tumors would fall between that of Classes A and B. Kaplan–Meier survival analysis, shown in Figure 8D, supported this expectation. In addition, it also indicated that whereas the survival difference between Class A and B patients was highly significant (p = 0.007), as expected, there was no significant difference between the survival of Class A and Class C patients (p = 0.203) or between Class B and Class C patients (p = 0.126). This observation further confirmed the prognostic value of the present classification schema.
MiR-21–Sox2 association is present in normal neural development
Because cancer and normal development are intimately connected, we wanted to determine whether the MiR-21–Sox2 axis is present during normal development. We first generated neurospheres from E12 mouse embryonic brains. To confirm that miR-21 mechanistically destabilizes Sox2 in NSCs, as in GSCs, we transfected pre-miR-21 and the NT control into NSCs, cultured the cells under neurosphere conditions, and analyzed the cells. Pre-miR-21 caused an increase in the intracellular miR-21 levels (Fig. 9A) compared with control, validating that the exogenous pre-miR-21 was processed properly and converted into miR-21 inside the cells. Western blotting analysis (Fig. 9B) indicated that the additional miR-21, compared with the NT control, caused a decrease in the Sox2 levels. Therefore, miR-21 regulates Sox2 in NSCs.
We then determined the expression patterns of miR-21 and Sox2 in these cells cultured under neurosphere and differentiation conditions. An in situ hybridization (to detect miR-21)/immunofluorescence assay (to detect Sox2) analysis (Fig. 9C) showed that NSCs under neurosphere conditions expressed SOX2, a marker of NSCs, as expected, but did not express miR-21. In contrast, miR-21 expression transiently appeared in cells cultured under differentiation conditions for 1 d, in which Sox2 expression had diminished. Both miR-21 and Sox2 expressions then drastically diminished in cells that were cultured for 7 d. To confirm the differentiation status of these cells, we performed in situ hybridization/immunofluorescence analyses of these cells to examine the expression patterns of miR-21 and the early neuronal differentiation marker neuronal b-tubulin. Neuronal b-tubulin was predominant only in cells cultured under differentiation conditions for 7 d (Fig. 9D).
Because both Class A and Class B GSCs could proliferate and give rise to neurospheres, we wanted to determine whether both high Sox2/low miR-21 NSCs (day 0) and high miR-21/low Sox2 differentiating cells (day 1) could also proliferate. We interrogated these cells with in situ hybridization (miR-21) and immunofluorescence (Ki67) analyses (Fig. 9E). As shown, both neurospheres (day 0, miR-21 negative) and early differentiating (day 1, miR-21 positive) cells, but not the differentiated cells (day 7), are capable of cell proliferation (Ki67 positive). Our results are supported by earlier studies of normal mouse brain development showing that both NSCs and intermediate progenitors can exhibit cell division (Ki67 positive). Therefore, these results reinforced the observation that both Class A and Class B GSCs are capable of generating neurospheres, as was observed experimentally.
To ascertain whether the inverse expression pattern of miR-21 and Sox2 exists in vivo, we examined mouse sections at E12, a stage of early neurogenesis, and at E17, a later stage of neurogenesis, using in situ hybridization/immunofluorescence analysis (Fig. 10A,B, respectively). As shown, E12 brain sections showed robust expression of Sox2 and very little expression of miR-21, whereas E17 brain sections mostly showed very strong expression of miR-21 and weak expression of Sox2. We then determined the miR21 and Sox2 expression patterns in adult mouse brain sections (Fig. 10C). As shown, miR-21 and Sox2 were expressed in distinct cells. These results indicated that miR-21 and Sox2 have predominantly mutually exclusive expression patterns during development. Therefore, together, the results described in this section support the notion generated in the previous sections that Class B GSCs are more stem like, whereas Class A GSCs are more progenitor like in nature.
On the basis of our identification of Sox2 as a target of miR-21 in GBM, we have developed a miR21–Sox2-based classification system that distinguishes miR-21Low/Sox2High (Class B) and miR-21High/Sox2Low (Class A) tumors. GSEA predicted one of the enriched gene sets of Class B tumor cells to be brain and CNS development. The analogous high Sox2/low miR-21 neural stem cells during neuronal differentiation support such prediction. In contrast, GSEA predicted one of the enriched gene sets of Class A tumor cells to be inflammatory and immune response. The pro-inflammation gene PTX3, which was associated with Class A tumors, is also a potential GBM marker (Locatelli et al., 2013). Another such gene, LY86, is also a potential inflammation response gene (Su et al., 2014). This is particularly significant because the inflammation/immune response contributes significantly to the aggressiveness of GBM tumors (Glass and Synowitz, 2014). Our findings that Class A tumors predict poorer survival that Class B tumors is in agreement with this observation. Interestingly, Class A tumor cells appear to be similar to neural progenitor cells during neuronal differentiation. Although a bidirectional influence between inflammation/immunity and neural progenitor cells has been established (Merino et al., 2015) and although miR-21 has been found to regulate inflammation in various cell types (Tili et al., 2013; Sheedy, 2015), how the high miR-21/low Sox2 axis might mechanistically control inflammation/immunity in Class A GBM tumors is not well known and the present study opens a window on this potential mechanism of GBM biology.
Compared with the known subtypes of GBM, Class A and Class B tumors were found to be unique: neither class corresponded completely with any known signatures of any specific GBM subtype; instead, both classes contained a mixture of molecular signatures. Class B tumors contained a preponderance of the proneural and classical signatures. The combined, context-dependent effects of high levels of Sox2, a stemness regulator (Shimozaki et al., 2012) and a marker of proneural GBM (Verhaak et al., 2010), and low levels of miR-21, a regulator of many oncogenic functions in several cancers (Sheedy, 2015) and a marker of mesenchymal tumors (Sintupisut et al., 2013), likely contributed to the unique properties of Class B tumors. The high Sox2 and low miR-21 levels also likely resulted in higher expression of EGFR, a marker of classical GBM tumors (Verhaak et al., 2010), in these tumors. Class A tumors, which had preponderance of the mesenchymal signature, likely resulted from the context-dependent contribution of high miR-21 and low Sox2 levels. This contribution also extended to high expression of NF1, a marker of mesenchymal GBM tumors. How the simple MiR-21–Sox2 axis might contribute mechanistically to the generation of a complex GBM tumor biology, presumably in a context-dependent manner, is yet to be determined.
Based on the brain transcriptome database (Cahoy et al., 2008), Verhaak et al. (2010) found that each tumor subtype was associated with a distinctive cellular lineage signature: proneural (oligodendrocytic signature), classical (astrocytic signature), neural (oligodendrocytic, astrocytic, and neural signatures), and mesenchymal (cultured astroglial signature). As shown in Figure 4, we determined that 285 genes were differentially expressed in Class A and Class B. A comparison of these genes with the lineage markers suggested that most differentially expressed genes were of astrocytic signature (77 genes); of these, 42 genes were overexpressed in Class B and 35 genes in Class A tumors. Overall analysis suggested that neither of the tumor types followed a strict lineage. For example, the significant genes expressed in Class A tumors included ARPC1B and NPC2 (oligodendrocytic signature), VAMP8 and CLIC1 (astrocytic signature), ITGA3 and FHL2 (neural signature), and S100A11 and TAGLN (cultured astroglial signature). Similarly, the significant genes expressed in Class B tumors included CHD7 and GNG4 (oligodendrocytic signature), BCAN and CSPG5 (astrocytic signature), MAPT and CXXC4 (neural signature), and STMN2 (cultured astroglial signature). Overall, Class A tumors showed expression of greater numbers of cultured astroglial signature genes than did Class B tumors. This is perhaps to be expected because Class A tumors had a greater preponderance of mesenchymal gene expression than did Class B tumors (Fig. 3D,E). This lack of a clear association between any of the four lineage markers and either Class A or B tumors suggests that these are highly oncogenically evolved tumors that have not retained the cellular lineage. Indeed, a lack of association would be anticipated given the number of mutations and alterations in these GBM tumors.
Our patient stratification based on the MiR-21–Sox2 axis also discovered a third group (Class C), as one might expect from the median cutoff method of analysis. Class C tumors exhibited high miR-21 and high Sox2 levels or low miR-21 and low Sox2 levels. The Class C tumors fell between Class A and Class B tumors, not only in terms of their MiR-21–Sox2 ratio, but also in their biological properties. Class C tumors were between Class A and Class B with respect to both their phenotypic properties such as area of contrast enhancement on MRI and amount of tumor necrosis and patient survival. The mechanism driving the Class C tumors and its significance to the GBM patient population remains unclear. Further studies in this direction would shed light on the development of, not only Class C, but also Class A and Class B tumors. Nonetheless, the presence of Class C tumors underscores the distinct molecular and biological properties exhibited by the Class A and Class B tumors.
The suppressive effect of miR-21 on Sox2 expression was also seen recently in human mesenchymal stem cells, hair-follicle-derived neural crest stem cells, and embryonic stem cells (Ni et al., 2014; Trohatou et al., 2014; Singh et al., 2015). Interestingly, an earlier study reported that miR-21 did not target Sox2 in either neural cells or in GBM (Põlajeva et al., 2012), possibly because those studies likely represented a specific subtype of cells that excluded Class A and Class B cells. Our work suggests that the MiR-21–Sox2 axis is likely to have significance for stem cells other than GSCs or NSCs. Because SOX2 is a positive regulator of self-renewal and pluripotency for embryonic stem cells (ESCs), one would expect that the presence of miR-21 would oppose self-renewal and cause differentiation of ESCs. MiR-21 was first identified through a comparison of differentiating (high miR-21) and undifferentiated (low miR-21) mouse ESCs (Houbaviy et al., 2003), which supports the notion that miR-21 is associated with suppression of stemness. We observed previously that miR-21 expression was indeed lower in undifferentiated E14Tg2a.4 mouse ESCs than in differentiating ESCs and found that the forced expression of miR-21 in ESCs caused lowered rates of self-renewal in these cells (Singh et al., 2008; Singh et al., 2012). Importantly, miR-21-mediated regulation of ESCs is context dependent (Buckley et al., 2009; Jørgensen et al., 2009a; Jørgensen et al., 2009b; Singh et al., 2012; Singh et al., 2015), suggesting a dynamic process in which various cell-intrinsic as well as cell-extrinsic factors form part of an interconnected network. Finally, because SOX2 is an induced pluripotent stem cell factor, our work supports the finding that inhibition of miR-21 enhanced induced pluripotent stem cell reprogramming efficiency (Yang et al., 2011). Therefore, together, our results suggest that Sox2 and miR-21 are key players in the stemness network of different cell types affecting normal development and disease and that manipulation of Sox2, miR-21, or both could be used to alter the outcome of the network. Various approaches to target miR-21 in GBM are already being pursued (Wang et al., 2015). Similarly, immunotherapeutic approaches involving vaccination with SOX2 peptides have also shown promise in a mouse model of high-grade glioma, suggesting the feasibility of targeting SOX2 (Favaro et al., 2014).
This work was supported by the National Institutes of Health (Grants CA97124 and NS81684). P.S. is the recipient of the Dodie P. Hawn Fellowship in Genetics. P.Z. was partially supported by the Oreffice Foundation. We thank Dawn Chalaire and Mary Maumder for editing the manuscript.
The authors declare no competing financial interests.
- Correspondence should be addressed to Sadhan Majumder, 1515 Holcombe Blvd, Houston, TX 77030.