Multi-Group Causal TWAS (M-cTWAS) Fine-Mapping#

This page illustrates command templates for multi-group causal TWAS fine-mapping with the ctwas workflow in twas_ctwas.ipynb. In the FGMB Atlas workflow, this step integrates RWAS association results, selected molecular prediction weights, GWAS summary statistics, and LD reference data to prioritize candidate causal genes, molecular contexts, and SNP-level signals.

The upstream ctwas workflow is run in three practical stages:

  1. assemble cTWAS region data by chromosome;

  2. estimate global group-prior parameters after all chromosome-level assembly jobs finish;

  3. run causal fine-mapping for selected regions.

By default, these commands run multi-group cTWAS (M-cTWAS), in which molecular contexts are analyzed jointly. To run individual-context (single-group) cTWAS instead, add --no-multi_group to the same commands in all three steps above.

Before Running#

Run these commands in an active xqtl-protocol analysis environment. Make sure xqtl-protocol, pecotmr, and M-cTWAS are all updated to the versions used for the production analysis.

The --xqtl_meta_data file used for cTWAS can contain many genes across many regions. For exploratory runs or chromosome-specific jobs, subset the metadata to the relevant chromosome or region list.
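
The subsetting described above can be sketched with awk. This is a minimal example under stated assumptions: the metadata is a tab-delimited table with a header row, and it has a chromosome column. The column name `#chr`, the `chr1`-style values, and the helper name `subset_meta_by_chrom` are all hypothetical, so check them against the actual metadata file before use.

```shell
# subset_meta_by_chrom FILE COLUMN CHROM
# Print the header plus every row whose COLUMN value equals CHROM.
# COLUMN is looked up by name in the header row (assumed layout, not confirmed).
subset_meta_by_chrom() {
    local file="$1" column="$2" chrom="$3"
    awk -F'\t' -v col="$column" -v chrom="$chrom" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            print; next
        }
        $idx == chrom
    ' "$file"
}

# Demo on a small made-up table (column names and values are illustrative only).
demo=$(mktemp)
printf '#chr\tstart\tend\tregion_id\n' > "$demo"
printf 'chr1\t100\t200\tgeneA\nchr2\t300\t400\tgeneB\nchr1\t500\t600\tgeneC\n' >> "$demo"
subset_meta_by_chrom "$demo" '#chr' chr1
```

Redirecting the output to a new file gives a smaller table that can be passed to --xqtl_meta_data for a chromosome-specific job.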

Two naming options are easy to confuse:

  • --name should match the prefix used by the RWAS result files because cTWAS uses this value to locate upstream RWAS outputs.

  • --name_suffix is an arbitrary label that can be used to distinguish test runs, parameter settings, or repeated analyses.

Required Inputs#

| Input | Purpose |
| --- | --- |
| --gwas_meta_data | GWAS metadata table pointing to summary-statistics files for the target GWAS study. |
| --ld_meta_data | LD reference metadata table used to load regional LD matrices and variant information. |
| --regions | LD block or analysis-region file used to organize regional assembly and fine-mapping. |
| --xqtl_meta_data | Metadata table pointing to xQTL prediction weights, RWAS results, context labels, and gene-region annotations. |
| --gwas_study | GWAS study label to analyze, matching a study in the GWAS metadata. |
| --chrom | Chromosome to process during region-data assembly. Run this separately for each chromosome. |
| --region-name | Specific region to fine-map, usually formatted like chr15_63051119_66680537. |
| --prior_var_structure | Prior variance structure used by cTWAS. The examples below use shared_all. |
| --thin | Thinning setting used by the cTWAS workflow. Keep this consistent across the three stages. |
| --twas_weight_cutoff | Variant-selection threshold for RWAS weights. The examples below use 0. |

Step 1: Assemble cTWAS Region Data by Chromosome#

Run the assembly step once per chromosome by replacing --chrom. For a genome-wide analysis, submit one job for each chromosome.

sos run xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb ctwas \
    --cwd ../output/ \
    --thin 1 \
    --prior_var_structure shared_all \
    --name rosmap_eqtl_pqtl \
    --name_suffix TEST \
    --gwas_meta_data /mnt/vast/hpc/csg/cl4215/mrmash/workflow/GWAS/gwas_meta.tsv \
    --ld_meta_data /mnt/vast/hpc/csg/data_public/20240409_ADSP_LD_matrix/ld_meta_file.tsv \
    --regions /mnt/vast/hpc/csg/cl4215/mrmash/workflow/pipeline_data/EUR_LD_blocks.bed \
    --xqtl_meta_data /home/cl4215/wd/pipeline_data/ctwas_rosmap_twas_wgw_xqtl_meta_data.tsv \
    --twas_weight_cutoff 0 \
    --gwas_study Bellenguez_2022 \
    --chrom 1

For single-group cTWAS analyses, add --no-multi_group to the command.

After all chromosome jobs complete, this stage should produce one region-data file per chromosome.
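
For a genome-wide run, the per-chromosome submission can be sketched as a loop. This is a minimal example, assuming autosomes 1 to 22 and using placeholder file names in place of the full paths shown above. The `RUN` variable and the `run_assembly` helper are illustrative names, not part of the workflow. By default the loop only prints each command; set `RUN=` (empty) to execute them, or wrap each command in your site's job-submission tool.

```shell
# Placeholder paths: point these at the same files used in the single-chromosome command.
GWAS_META="gwas_meta.tsv"
LD_META="ld_meta_file.tsv"
REGIONS="EUR_LD_blocks.bed"
XQTL_META="xqtl_meta_data.tsv"

# RUN=echo (the default here) prints each command instead of executing it.
RUN="${RUN-echo}"

# Run (or print) the Step 1 assembly command for one chromosome.
run_assembly() {
    local chrom="$1"
    $RUN sos run xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb ctwas \
        --cwd ../output/ \
        --thin 1 \
        --prior_var_structure shared_all \
        --name rosmap_eqtl_pqtl \
        --name_suffix TEST \
        --gwas_meta_data "$GWAS_META" \
        --ld_meta_data "$LD_META" \
        --regions "$REGIONS" \
        --xqtl_meta_data "$XQTL_META" \
        --twas_weight_cutoff 0 \
        --gwas_study Bellenguez_2022 \
        --chrom "$chrom"
}

# One assembly job per autosome (assumption: chromosomes are labeled 1..22).
for chrom in $(seq 1 22); do
    run_assembly "$chrom"
done
```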

Step 2: Estimate Global Parameters#

Run this step only after all chromosome-level assembly jobs from Step 1 have finished. This stage reads the chromosome-level region-data files and estimates the global group-prior parameters used by fine-mapping. This step can require substantial memory; plan for at least 50 GB for large metadata files.

sos run xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb ctwas \
    --cwd ../output/ \
    --run_param_est \
    --skip_assembly \
    --thin 1 \
    --prior_var_structure shared_all \
    --name rosmap_eqtl_pqtl \
    --name_suffix TEST \
    --gwas_meta_data /mnt/vast/hpc/csg/cl4215/mrmash/workflow/GWAS/gwas_meta.tsv \
    --ld_meta_data /mnt/vast/hpc/csg/data_public/20240409_ADSP_LD_matrix/ld_meta_file.tsv \
    --regions /mnt/vast/hpc/csg/cl4215/mrmash/workflow/pipeline_data/EUR_LD_blocks.bed \
    --xqtl_meta_data /home/cl4215/wd/pipeline_data/ctwas_rosmap_twas_wgw_xqtl_meta_data.tsv \
    --twas_weight_cutoff 0 \
    --gwas_study Bellenguez_2022
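
Before launching the command above, it can help to confirm that every chromosome-level job from Step 1 left an output file. The sketch below is hypothetical in its details: the workflow's actual output file names are not documented on this page, so the file-name template is supplied by the caller, and the helper name `missing_chrom_outputs` and the demo file names are made up for illustration.

```shell
# missing_chrom_outputs DIR TEMPLATE
# TEMPLATE is a printf-style file name containing one %s for the chromosome number.
# Prints each chromosome (1-22) that has no non-empty matching file in DIR;
# returns non-zero if any chromosome is missing.
missing_chrom_outputs() {
    local dir="$1" template="$2" missing=0 chrom file
    for chrom in $(seq 1 22); do
        file="$dir/$(printf "$template" "$chrom")"
        if [ ! -s "$file" ]; then
            echo "$chrom"
            missing=1
        fi
    done
    return "$missing"
}

# Demo with made-up file names: create outputs for chromosomes 1-21 only.
demo_dir=$(mktemp -d)
for chrom in $(seq 1 21); do
    echo "placeholder" > "$demo_dir/demo.chr${chrom}.region_data.rds"
done

if absent=$(missing_chrom_outputs "$demo_dir" 'demo.chr%s.region_data.rds'); then
    echo "all chromosome outputs present"
else
    echo "missing chromosomes: $absent"
fi
```

In a real run, replace the template with the naming pattern actually produced under the --cwd directory by Step 1.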

Step 3: Run Fine-Mapping by Region#

Run fine-mapping for each target --region-name. This step can also require substantial memory, often 50 GB or more for large regions or many molecular contexts.

sos run xqtl-protocol/code/pecotmr_integration/twas_ctwas.ipynb ctwas \
    --run_finemapping \
    --skip_assembly \
    --thin 1 \
    --prior_var_structure shared_all \
    --cwd ../output/ \
    --name rosmap_eqtl_pqtl \
    --name_suffix TEST \
    --gwas_meta_data /mnt/vast/hpc/csg/cl4215/mrmash/workflow/GWAS/gwas_meta.tsv \
    --ld_meta_data /mnt/vast/hpc/csg/data_public/20240409_ADSP_LD_matrix/ld_meta_file.tsv \
    --regions /mnt/vast/hpc/csg/cl4215/mrmash/workflow/pipeline_data/EUR_LD_blocks.bed \
    --xqtl_meta_data /home/cl4215/wd/pipeline_data/ctwas_rosmap_twas_wgw_xqtl_meta_data.tsv \
    --twas_weight_cutoff 0 \
    --gwas_study Bellenguez_2022 \
    --region-name chr15_63051119_66680537
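
When many regions need fine-mapping, the region names can be generated from the LD block file rather than typed by hand. The sketch below assumes that region names are built as chrom_start_end from the first three BED columns, which matches the example name above but is not confirmed for every region file. The helper names `bed_to_region_names` and `finemap_region` are illustrative, and `finemap_region` is only a stand-in that prints the region.

```shell
# bed_to_region_names BED [CHROM]
# Print one chrom_start_end name per BED row, optionally restricted to CHROM.
bed_to_region_names() {
    local bed="$1" chrom="${2:-}"
    awk -v chrom="$chrom" '
        $2 !~ /^[0-9]+$/ { next }    # skip header or comment lines
        chrom == "" || $1 == chrom { print $1 "_" $2 "_" $3 }
    ' "$bed"
}

# Stand-in for the Step 3 command: replace the echo with the full
# `sos run ... --region-name "$1"` call shown above.
finemap_region() {
    echo "would fine-map region: $1"
}

# Demo on a small made-up BED file with a header row.
bed=$(mktemp)
printf 'chr\tstart\tstop\n' > "$bed"
printf 'chr15\t63051119\t66680537\nchr15\t66680537\t69103743\nchr16\t100\t200\n' >> "$bed"

for region in $(bed_to_region_names "$bed" chr15); do
    finemap_region "$region"
done
```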

Expected Outputs#

The fine-mapping stage produces region-level cTWAS result tables and diagnostic objects. A typical result table contains gene-level and SNP-level variables with fields such as:

id, molecular_id, type, context, group, region_id, z, susie_pip, mu, cs

Key fields include:

| Column | Meaning |
| --- | --- |
| id | Unique cTWAS variable identifier, often combining molecular ID and context for gene-level variables. |
| molecular_id | Gene, molecular trait, or SNP identifier. |
| type | Variable type, such as eQTL, pQTL, sQTL, or SNP. |
| context | Molecular context or cell/tissue setting for gene-level variables. |
| group | Group label used by the cTWAS prior model. |
| region_id | Fine-mapped LD block or analysis region. |
| z | Association z-score used by the cTWAS model. |
| susie_pip | Posterior inclusion probability from fine-mapping. |
| cs | Credible-set label, when assigned. |
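
A quick way to review such a table is to keep only gene-level rows with a high posterior inclusion probability. The sketch below assumes the result table has been exported as tab-delimited text with a header row containing the type and susie_pip columns listed above; the actual output format may differ. The helper name `top_genes`, the 0.8 threshold, and the demo rows are illustrative values, not workflow defaults or real results.

```shell
# top_genes FILE [MIN_PIP]
# Print the header plus non-SNP rows whose susie_pip is at least MIN_PIP (default 0.8).
top_genes() {
    local file="$1" min_pip="${2:-0.8}"
    awk -F'\t' -v min_pip="$min_pip" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "type") t = i
                if ($i == "susie_pip") p = i
            }
            if (!t || !p) { print "expected columns type and susie_pip" > "/dev/stderr"; exit 1 }
            print; next
        }
        $t != "SNP" && ($p + 0) >= (min_pip + 0)
    ' "$file"
}

# Demo on a made-up table that uses the field names listed above.
results=$(mktemp)
{
    printf 'id\tmolecular_id\ttype\tcontext\tgroup\tregion_id\tz\tsusie_pip\tmu\tcs\n'
    printf 'geneA_ctx1\tgeneA\teQTL\tctx1\tctx1_eQTL\tchr15_63051119_66680537\t5.2\t0.93\t0.01\tL1\n'
    printf 'geneB_ctx1\tgeneB\tpQTL\tctx1\tctx1_pQTL\tchr15_63051119_66680537\t1.1\t0.04\t0.00\tNA\n'
    printf 'rs0001\trs0001\tSNP\tSNP\tSNP\tchr15_63051119_66680537\t4.8\t0.88\t0.01\tL2\n'
} > "$results"

top_genes "$results" 0.8
```

Because columns are located by header name, the filter keeps working if the column order in the table changes.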