Atlas Overview#
This section introduces the FGMB Atlas resource and summarizes the modeling strategy used to connect molecular QTL resources, prediction models, regulome-wide association study (RWAS) testing, and downstream causal fine-mapping.
In this documentation, a molecular trait refers to a source-specific molecular measurement panel used for predictor training. Each molecular trait is defined by both its biological context and molecular modality, such as a brain region, cell type, or cell subtype measured for gene expression, protein abundance, or splicing.
Core resource components include:
Molecular QTL-derived prediction models for genetically regulated molecular traits.
RWAS-ready summary resources for disease and aging-related brain traits.
Cross-context annotations that connect signals across modalities, cohorts, and biological settings.
Figure and workflow notebooks that document the manuscript analyses.
Molecular Modalities#
The current FGMB atlas includes the following molecular modalities:
Gene expression
Protein abundance
Splicing regulation
Within the atlas, each molecular trait is a unique context–modality combination from a given source dataset, and these data are used to train prediction models of genetically regulated molecular traits for downstream RWAS and causal TWAS analyses.
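The context and modality definition above can be illustrated with a small record type. This is only a sketch: the `MolecularTrait` class, its field names, and the `modalities_by_context` helper are hypothetical and are not part of any FGMB software. The sample sizes are those listed for the corresponding panels in this documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MolecularTrait:
    """One source-specific molecular measurement panel (hypothetical schema)."""
    source: str    # cohort or dataset, e.g. "ROSMAP"
    context: str   # tissue, cell type, or cell subtype
    modality: str  # "expression", "protein", or "splicing"
    n: int         # sample size of the reference panel


# A few panels described in this documentation.
traits = [
    MolecularTrait("ROSMAP", "DLPFC", "expression", 777),
    MolecularTrait("ROSMAP", "DLPFC", "protein", 416),
    MolecularTrait("ROSMAP", "DLPFC", "splicing", 806),
    MolecularTrait("Knight-ADRC", "PC", "expression", 354),
]


def modalities_by_context(items):
    """Group modalities under each (source, context) pair, keeping input order."""
    grouped = {}
    for t in items:
        grouped.setdefault((t.source, t.context), []).append(t.modality)
    return grouped


grouped = modalities_by_context(traits)
```

Because the dataclass is frozen, each record is hashable, so a set or dictionary keyed on these records can enforce that every source, context, and modality combination appears only once.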
ROSMAP Resources#
ROSMAP provides the largest and most diverse set of molecular reference panels in FGMB. These data include both bulk tissue and single-nucleus resources across multiple brain contexts and modalities.
Bulk tissue RNA-seq#
Bulk RNA-seq gene expression data were incorporated from three ROSMAP brain regions:
Dorsolateral prefrontal cortex (DLPFC), N = 777
Posterior cingulate cortex (PCC), N = 441
Anterior cingulate cortex (AC), N = 593
Monocyte gene expression#
FGMB also includes a peripheral blood-derived monocyte gene expression panel from ROSMAP:
Monocyte (Mono), N = 226
ROSMAP protein abundance#
Protein abundance data from ROSMAP were incorporated for:
DLPFC protein abundance (pQTL), N = 416
ROSMAP splicing regulation#
Splicing enrichment data derived from bulk RNA-seq were included for three ROSMAP brain regions:
DLPFC splicing regulation (sQTL), N = 806
PCC splicing regulation (sQTL), N = 449
AC splicing regulation (sQTL), N = 603
ROSMAP single-nucleus RNA-seq resources#
FGMB includes several ROSMAP-derived single-nucleus RNA-seq resources from dorsolateral prefrontal cortex.
1. CUIMC1 single-nucleus resource#
The CUIMC1 dataset includes six major pseudo-bulk cell types from the DLPFC:
Astrocytes (Ast), N = 419
Inhibitory neurons (Inh), N = 419
Excitatory neurons (Exc), N = 419
Oligodendrocytes (Oli), N = 419
Oligodendrocyte progenitor cells (OPC), N = 418
Microglia (Mic), N = 419
2. MIT single-nucleus resource#
The MIT dataset includes major cell types and selected subtypes, with sample sizes ranging from 80 to 387:
Astrocytes (Ast), N = 385
Inhibitory neurons (Inh), N = 379
Excitatory neurons (Exc), N = 386
Oligodendrocytes (Oli), N = 387
Oligodendrocyte progenitor cells (OPC), N = 383
Microglia (Mic), N = 377
Astrocyte subtype 10 (Ast_10), N = 113
Microglia subtype 12 (Mic_12), N = 106
Microglia subtype 13 (Mic_13), N = 80
3. Mega single-nucleus resource#
The mega-analysis resource integrates multiple single-nucleus datasets and includes six major cell types with sample sizes ranging from 733 to 737:
Astrocytes (Ast), N = 737
Inhibitory neurons (Inh), N = 736
Excitatory neurons (Exc), N = 737
Oligodendrocytes (Oli), N = 737
Oligodendrocyte progenitor cells (OPC), N = 735
Microglia (Mic), N = 733
Mount Sinai Brain Bank (MSBB)#
The Mount Sinai Brain Bank contributes bulk tissue molecular reference panels to FGMB. These data broaden regional representation beyond ROSMAP and add complementary bulk brain contexts for expression and protein prediction modeling.
MSBB gene expression panels
Bulk RNA-seq gene expression data were included from four cortical brain regions:
Frontal pole (FP, Brodmann area 10), N = 274
Superior temporal gyrus (STG, Brodmann area 22), N = 254
Parahippocampal gyrus (PHG, Brodmann area 36), N = 230
Inferior frontal gyrus (IFG, Brodmann area 44), N = 256
MSBB protein abundance panel
FGMB also includes protein abundance data from:
Parahippocampal gyrus (PHG pQTL), N = 184
Knight-ADRC#
The Knight Alzheimer’s Disease Research Center contributes parietal cortex molecular reference panels to FGMB. These data add an independent aging-brain cohort and broaden atlas coverage across both cohort and modality.
Knight-ADRC gene expression panel
Gene expression reference data were included for:
Parietal cortex (PC eQTL), N = 354
Knight-ADRC protein abundance panel
Protein abundance reference data were included for:
Parietal cortex (PC pQTL), N = 412
FGMB Molecular Trait Summary#
| Source Dataset | Molecular Dataset | Context | Molecular Modality | Sample Size | Genes Trained | Imputable Genes |
|---|---|---|---|---|---|---|
| ROSMAP (De Jager et al. 2018) | DLPFC eQTL | Dorsolateral Prefrontal Cortex | Gene Expression | 777 | 16,307 | 9,948 |
| ROSMAP (De Jager et al. 2018) | PCC eQTL | Posterior Cingulate Cortex | Gene Expression | 441 | 16,110 | 10,720 |
| ROSMAP (De Jager et al. 2018) | AC eQTL | Anterior Cingulate Cortex | Gene Expression | 593 | 16,104 | 10,918 |
| ROSMAP (De Jager et al. 2018) | Mono eQTL | Monocyte | Gene Expression | 226 | 12,801 | 5,397 |
| ROSMAP (Bennett et al. 2018) | DLPFC pQTL | Dorsolateral Prefrontal Cortex | Protein Expression | 416 | 7,396 | 3,925 |
| ROSMAP (Najar et al. 2025) | DLPFC sQTL | Dorsolateral Prefrontal Cortex | Splicing Enrichment | 806 | 12,474 | 7,261 |
| ROSMAP (Najar et al. 2025) | PCC sQTL | Posterior Cingulate Cortex | Splicing Enrichment | 449 | 12,663 | 9,832 |
| ROSMAP (Najar et al. 2025) | AC sQTL | Anterior Cingulate Cortex | Splicing Enrichment | 603 | 12,585 | 8,472 |
| ROSMAP (Fujita et al. 2024) | Ast eQTL CUIMC1 | Astrocyte | Gene Expression | 419 | 11,392 | 6,023 |
| ROSMAP (Fujita et al. 2024) | Inh eQTL CUIMC1 | Inhibitory Neuron | Gene Expression | 419 | 11,266 | 6,609 |
| ROSMAP (Fujita et al. 2024) | Exc eQTL CUIMC1 | Excitatory Neuron | Gene Expression | 419 | 11,111 | 8,085 |
| ROSMAP (Fujita et al. 2024) | Oli eQTL CUIMC1 | Oligodendrocyte | Gene Expression | 419 | 10,912 | 5,595 |
| ROSMAP (Fujita et al. 2024) | OPC eQTL CUIMC1 | Oligodendrocyte Progenitor | Gene Expression | 418 | 7,742 | 3,582 |
| ROSMAP (Fujita et al. 2024) | Mic eQTL CUIMC1 | Microglia | Gene Expression | 419 | 7,130 | 3,071 |
| ROSMAP (Comandante-Lou et al. 2025) | Ast eQTL MIT | Astrocyte | Gene Expression | 385 | 9,125 | 5,530 |
| ROSMAP (Comandante-Lou et al. 2025) | Ast.10 eQTL | Astrocyte Subtype 10 | Gene Expression | 113 | 1,927 | 1,927 |
| ROSMAP (Comandante-Lou et al. 2025) | Inh eQTL MIT | Inhibitory Neuron | Gene Expression | 379 | 10,760 | 7,141 |
| ROSMAP (Comandante-Lou et al. 2025) | Exc eQTL MIT | Excitatory Neuron | Gene Expression | 386 | 10,645 | 8,000 |
| ROSMAP (Comandante-Lou et al. 2025) | Oli eQTL MIT | Oligodendrocyte | Gene Expression | 387 | 10,021 | 6,748 |
| ROSMAP (Comandante-Lou et al. 2025) | OPC eQTL MIT | Oligodendrocyte Progenitor | Gene Expression | 383 | 8,707 | 5,163 |
| ROSMAP (Comandante-Lou et al. 2025) | Mic eQTL MIT | Microglia | Gene Expression | 377 | 5,404 | 3,306 |
| ROSMAP (Comandante-Lou et al. 2025) | Mic.12 eQTL MIT | Microglia Subtype 12 | Gene Expression | 106 | 702 | 701 |
| ROSMAP (Comandante-Lou et al. 2025) | Mic.13 eQTL MIT | Microglia Subtype 13 | Gene Expression | 80 | 692 | 692 |
| ROSMAP (Comandante-Lou et al. 2025) | Ast eQTL mega | Astrocyte | Gene Expression | 737 | 7,742 | 4,428 |
| ROSMAP (Comandante-Lou et al. 2025) | Inh eQTL mega | Inhibitory Neuron | Gene Expression | 736 | 9,577 | 5,970 |
| ROSMAP (Comandante-Lou et al. 2025) | Exc eQTL mega | Excitatory Neuron | Gene Expression | 737 | 10,138 | 7,856 |
| ROSMAP (Comandante-Lou et al. 2025) | Oli eQTL mega | Oligodendrocyte | Gene Expression | 737 | 8,196 | 4,826 |
| ROSMAP (Comandante-Lou et al. 2025) | OPC eQTL mega | Oligodendrocyte Progenitor | Gene Expression | 735 | 5,897 | 2,787 |
| ROSMAP (Comandante-Lou et al. 2025) | Mic eQTL mega | Microglia | Gene Expression | 733 | 3,514 | 1,562 |
| MSBB (Wang et al. 2018) | FP eQTL | Frontal Pole | Gene Expression | 274 | 9,275 | 8,088 |
| MSBB (Wang et al. 2018) | STG eQTL | Superior Temporal Gyrus | Gene Expression | 254 | 9,275 | 7,645 |
| MSBB (Wang et al. 2018) | PHG eQTL | Parahippocampal Gyrus | Gene Expression | 230 | 9,275 | 7,494 |
| MSBB (Wang et al. 2018) | IFG eQTL | Inferior Frontal Gyrus | Gene Expression | 256 | 9,275 | 7,997 |
| MSBB (Wang et al. 2018) | PHG pQTL | Parahippocampal Gyrus | Protein Expression | 184 | 11,224 | 8,325 |
| Knight-ADRC (Fernandez et al. 2024) | PC eQTL | Parietal Cortex | Gene Expression | 354 | 15,941 | 9,070 |
| Knight-ADRC (Fernandez et al. 2024) | PC pQTL | Parietal Cortex | Protein Expression | 412 | 1,018 | 233 |

Molecular Dataset refers to a source-specific molecular measurement panel used for predictor training, such as a brain region, cell type, or cell subtype measured for gene expression, protein abundance, or splicing. Context denotes the tissue, cell type, or cell subtype, whereas Molecular Modality denotes the type of molecular phenotype being modeled.
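One way to compare panels in the summary table is the share of trained genes that are also listed as imputable. The sketch below computes that ratio for three rows copied from the table. It assumes that Imputable Genes counts a subset of Genes Trained for the same panel, which this documentation does not state explicitly. The `imputable_fraction` helper is hypothetical and the ratio is a descriptive summary, not a statistic defined by FGMB.

```python
# (genes trained, imputable genes) for three rows of the summary table.
rows = {
    "DLPFC eQTL": (16307, 9948),
    "Mic.13 eQTL MIT": (692, 692),
    "PC pQTL": (1018, 233),
}


def imputable_fraction(trained: int, imputable: int) -> float:
    """Share of trained genes that are imputable (assumes imputable <= trained)."""
    if trained <= 0:
        raise ValueError("trained must be a positive count")
    return imputable / trained


fractions = {name: imputable_fraction(*counts) for name, counts in rows.items()}

# Print panels from the lowest to the highest imputable share.
for name, frac in sorted(fractions.items(), key=lambda item: item[1]):
    print(f"{name}: {frac:.1%}")
```

Under that assumption, the ratio varies widely across panels, so comparisons of RWAS coverage between contexts may need to account for how many genes each panel can actually impute.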