Multi-Group Causal TWAS (M-cTWAS) Fine-Mapping
This page illustrates command templates for multi-group causal TWAS fine-mapping with the ctwas workflow in twas_ctwas.ipynb. In the FGMB Atlas workflow, this step integrates RWAS association results, selected molecular prediction weights, GWAS summary statistics, and LD reference data to prioritize candidate causal genes, molecular contexts, and SNP-level signals.
The upstream ctwas workflow is run in three practical stages:
1. assemble cTWAS region data by chromosome;
2. estimate global group-prior parameters after all chromosome-level assembly jobs finish;
3. run causal fine-mapping for selected regions.
By default, these commands run multi-group cTWAS (M-cTWAS), where molecular contexts are analyzed jointly. To run individual-context (single-group) cTWAS instead, add --no-multi_group to the command at each of the three stages.
Before Running
These commands should be run in an active xqtl-protocol analysis environment. Make sure xqtl-protocol, pecotmr, and M-cTWAS are all updated to the versions used for the production analysis.
The --xqtl_meta_data file used for cTWAS can contain many genes across many regions. For exploratory runs or chromosome-specific jobs, subset the metadata to the relevant chromosome or region list.
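The subsetting described above can be done with a small filter that keeps the header row and only the rows for one chromosome. The sketch below is a minimal example, assuming the metadata is a tab-delimited file with a header row. The column name `chrom`, the value `chr1`, and the file names in the usage comment are hypothetical placeholders, so check them against the actual metadata header before use.

```shell
# Keep the header plus the rows whose named column equals a given value.
# Usage: subset_tsv_by_column <input.tsv> <column_name> <value>
subset_tsv_by_column() {
  awk -F'\t' -v col="$2" -v val="$3" '
    NR == 1 {
      # Locate the requested column by its header name.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      print
      next
    }
    # Force a string comparison so "1" and "01" are not treated as equal.
    ($idx "") == (val "")
  ' "$1"
}

# Example (hypothetical column name and file names):
# subset_tsv_by_column xqtl_meta_data.tsv chrom chr1 > xqtl_meta_data.chr1.tsv
```

The subset file can then be passed to --xqtl_meta_data in place of the full table.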
Two naming options can be confused:
- --name should match the prefix used by the RWAS result files because cTWAS uses this value to locate upstream RWAS outputs.
- --name_suffix is an arbitrary label that can be used to distinguish test runs, parameter settings, or repeated analyses.
Required Inputs
| Input | Purpose |
|---|---|
| `--gwas_meta_data` | GWAS metadata table pointing to summary-statistics files for the target GWAS study. |
| `--ld_meta_data` | LD reference metadata table used to load regional LD matrices and variant information. |
| `--regions` | LD block or analysis-region file used to organize regional assembly and fine-mapping. |
| `--xqtl_meta_data` | Metadata table pointing to xQTL prediction weights, RWAS results, context labels, and gene-region annotations. |
| `--gwas_study` | GWAS study label to analyze, matching a study in the GWAS metadata. |
| `--chrom` | Chromosome to process during region-data assembly. Run this separately for each chromosome. |
| `--region-name` | Specific region to fine-map, usually formatted like `chr15_63051119_66680537`. |
| `--prior_var_structure` | Prior variance structure used by cTWAS. The examples below use `shared_all`. |
| `--thin` | Thinning setting used by the cTWAS workflow. Keep this consistent across the three stages. |
| `--twas_weight_cutoff` | Variant-selection threshold for RWAS weights. The examples below use `0`. |
Step 1: Assemble cTWAS Region Data by Chromosome
Run the assembly step once per chromosome, changing only the value passed to --chrom. For a genome-wide analysis, submit one job for each chromosome.
sos run xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb ctwas \
--cwd ../output/ \
--thin 1 \
--prior_var_structure shared_all \
--name rosmap_eqtl_pqtl \
--name_suffix TEST \
--gwas_meta_data /mnt/vast/hpc/csg/cl4215/mrmash/workflow/GWAS/gwas_meta.tsv \
--ld_meta_data /mnt/vast/hpc/csg/data_public/20240409_ADSP_LD_matrix/ld_meta_file.tsv \
--regions /mnt/vast/hpc/csg/cl4215/mrmash/workflow/pipeline_data/EUR_LD_blocks.bed \
--xqtl_meta_data /home/cl4215/wd/pipeline_data/ctwas_rosmap_twas_wgw_xqtl_meta_data.tsv \
--twas_weight_cutoff 0 \
--gwas_study Bellenguez_2022 \
--chrom 1
For single-group cTWAS analyses, add --no-multi_group to the command.
After all chromosome jobs complete, this stage should produce one region-data file per chromosome.
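One way to prepare the per-chromosome jobs is to generate the Step 1 command for each chromosome and review the list before submitting anything. The sketch below only prints the commands and does not run them. The default range of 1 to 22 (autosomes only) and the output file name in the usage comment are assumptions, and how each line is submitted depends on the local scheduler, which this page does not specify.

```shell
# Notebook path and shared flags are copied from the Step 1 example above.
NOTEBOOK="xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb"
COMMON_ARGS="--cwd ../output/ --thin 1 --prior_var_structure shared_all \
--name rosmap_eqtl_pqtl --name_suffix TEST \
--gwas_meta_data /mnt/vast/hpc/csg/cl4215/mrmash/workflow/GWAS/gwas_meta.tsv \
--ld_meta_data /mnt/vast/hpc/csg/data_public/20240409_ADSP_LD_matrix/ld_meta_file.tsv \
--regions /mnt/vast/hpc/csg/cl4215/mrmash/workflow/pipeline_data/EUR_LD_blocks.bed \
--xqtl_meta_data /home/cl4215/wd/pipeline_data/ctwas_rosmap_twas_wgw_xqtl_meta_data.tsv \
--twas_weight_cutoff 0 --gwas_study Bellenguez_2022"

# Print one assembly command per chromosome in the range [first, last].
# Defaults to 1..22, which is an assumption (autosomes only).
print_assembly_commands() {
  chrom="${1:-1}"
  last="${2:-22}"
  while [ "$chrom" -le "$last" ]; do
    printf 'sos run %s ctwas %s --chrom %s\n' "$NOTEBOOK" "$COMMON_ARGS" "$chrom"
    chrom=$((chrom + 1))
  done
}

# Example: write the commands to a file for review (hypothetical file name).
# print_assembly_commands > step1_commands.txt
```

Printing rather than executing keeps the loop safe to test and makes it easy to confirm that every chromosome uses identical settings, which Step 2 relies on.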
Step 2: Estimate Global Parameters
Run this step only after all chromosome-level assembly jobs from Step 1 have finished. This stage reads the chromosome-level region-data files and estimates the global group-prior parameters used by fine-mapping. This step can require substantial memory; plan for at least 50 GB for large metadata files.
sos run xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb ctwas \
--cwd ../output/ \
--run_param_est \
--skip_assembly \
--thin 1 \
--prior_var_structure shared_all \
--name rosmap_eqtl_pqtl \
--name_suffix TEST \
--gwas_meta_data /mnt/vast/hpc/csg/cl4215/mrmash/workflow/GWAS/gwas_meta.tsv \
--ld_meta_data /mnt/vast/hpc/csg/data_public/20240409_ADSP_LD_matrix/ld_meta_file.tsv \
--regions /mnt/vast/hpc/csg/cl4215/mrmash/workflow/pipeline_data/EUR_LD_blocks.bed \
--xqtl_meta_data /home/cl4215/wd/pipeline_data/ctwas_rosmap_twas_wgw_xqtl_meta_data.tsv \
--twas_weight_cutoff 0 \
--gwas_study Bellenguez_2022
Step 3: Run Fine-Mapping by Region
Run fine-mapping once for each target region, passing its identifier with --region-name. This step can also require substantial memory, often 50 GB or more for large regions or many molecular contexts.
sos run xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb ctwas \
--run_finemapping \
--skip_assembly \
--prior_var_structure shared_all \
--cwd ../output/ \
--name rosmap_eqtl_pqtl \
--name_suffix TEST \
--gwas_meta_data /mnt/vast/hpc/csg/cl4215/mrmash/workflow/GWAS/gwas_meta.tsv \
--ld_meta_data /mnt/vast/hpc/csg/data_public/20240409_ADSP_LD_matrix/ld_meta_file.tsv \
--regions /mnt/vast/hpc/csg/cl4215/mrmash/workflow/pipeline_data/EUR_LD_blocks.bed \
--xqtl_meta_data /home/cl4215/wd/pipeline_data/ctwas_rosmap_twas_wgw_xqtl_meta_data.tsv \
--twas_weight_cutoff 0 \
--gwas_study Bellenguez_2022 \
--region-name chr15_63051119_66680537
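Before submitting many fine-mapping jobs, it can help to check that each region name is well formed. The sketch below is a minimal validator, assuming region names follow the `chr<chromosome>_<start>_<end>` pattern seen in the example above. That pattern is inferred from the single example value and is not confirmed as the workflow's formal naming rule.

```shell
# Split a region name such as chr15_63051119_66680537 into chrom, start, end.
# Prints the three fields tab-separated; returns 1 if the name looks malformed.
parse_region_name() {
  region="$1"
  case "$region" in
    chr*_*_*) ;;
    *) echo "unexpected region name: $region" >&2; return 1 ;;
  esac
  chrom="${region%%_*}"   # text before the first underscore
  rest="${region#*_}"     # text after the first underscore
  start="${rest%%_*}"
  end="${rest#*_}"
  # Both coordinates must be non-negative integers.
  for coord in "$start" "$end"; do
    case "$coord" in
      ''|*[!0-9]*) echo "non-numeric coordinates: $region" >&2; return 1 ;;
    esac
  done
  # The start coordinate must be strictly below the end coordinate.
  if [ "$start" -ge "$end" ]; then
    echo "start is not below end: $region" >&2
    return 1
  fi
  printf '%s\t%s\t%s\n' "$chrom" "$start" "$end"
}

# Example:
# parse_region_name chr15_63051119_66680537
```

A name that fails this check is worth inspecting by hand before it is passed to --region-name.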
Expected Outputs
The fine-mapping stage produces region-level cTWAS result tables and diagnostic objects. A typical result table contains gene-level and SNP-level variables with fields such as:
id, molecular_id, type, context, group, region_id, z, susie_pip, mu, cs
Key fields include:
| Column | Meaning |
|---|---|
| `id` | Unique cTWAS variable identifier, often combining molecular ID and context for gene-level variables. |
| `molecular_id` | Gene, molecular trait, or SNP identifier. |
| `type` | Variable type, such as eQTL, pQTL, sQTL, or SNP. |
| `context` | Molecular context or cell/tissue setting for gene-level variables. |
| `group` | Group label used by the cTWAS prior model. |
| `region_id` | Fine-mapped LD block or analysis region. |
| `z` | Association z-score used by the cTWAS model. |
| `susie_pip` | Posterior inclusion probability from fine-mapping. |
| `cs` | Credible-set label, when assigned. |
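A common first look at the results is to keep only variables with a high posterior inclusion probability. The sketch below is a minimal example, assuming the result table has been exported as a tab-delimited text file whose header includes the `susie_pip` column. The on-disk format of the workflow's output is not stated here, and both the file name and the 0.8 cutoff in the usage comment are illustrative choices, not workflow defaults.

```shell
# Print the header plus rows whose susie_pip is at or above a threshold.
# Usage: filter_by_pip <results.tsv> <min_pip>
filter_by_pip() {
  awk -F'\t' -v thr="$2" '
    NR == 1 {
      # Locate the susie_pip column by its header name.
      for (i = 1; i <= NF; i++) if ($i == "susie_pip") idx = i
      if (!idx) { print "susie_pip column not found" > "/dev/stderr"; exit 1 }
      print
      next
    }
    # Skip missing values (for example NA), then compare numerically.
    $idx ~ /^[0-9.eE+-]+$/ && ($idx + 0) >= (thr + 0)
  ' "$1"
}

# Example (hypothetical file name, illustrative cutoff):
# filter_by_pip ctwas_results.tsv 0.8
```

Because the column is located by name, the filter does not depend on column order, and rows of every variable type (gene-level or SNP) pass through the same threshold.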