Overview: SPACox, SPAmix, and SPAGRM are residual‑based methods for genome‑wide association studies that use residuals from fitted null models together with genotype data to test associations for a wide range of complex traits. They share a common framework: SPACox is the baseline method for homogeneous populations of unrelated individuals; SPAmix extends SPACox to model population structure (e.g., admixed or multi‑population cohorts); SPAGRM extends SPACox to account for sample relatedness.

Features of the Methods:

Method	Population Structure	Sample Relatedness	Modeling Approach
SPACox	Not modeled	Not modeled	Residuals random
SPAmix	Modeled	Not modeled	Genotypes random
SPAGRM	Not modeled	Modeled	Genotypes random

All three methods implement the saddlepoint approximation (SPA), making them robust and accurate for common, low‑frequency, and rare variants, including cases where phenotype or residual distributions are highly unbalanced. To apply these three methods, the residuals must satisfy the following conditions:

\[\sum_{i=1}^n X_{ij} R_i = 0 \quad \text{for each } j, \quad \text{and} \quad \sum_{i=1}^n R_i = 0\]

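where $R_i$ is the residual for subject $i$, $X_{ij}$ is the value of covariate $j$ for subject $i$, and $n$ is the number of subjects. In other words, the residuals must sum to zero and be orthogonal to every covariate.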
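The two conditions can be checked numerically before running any of the methods. The sketch below uses base R only; `check_residual_conditions` is a hypothetical helper written for this illustration (it is not part of GRAB), and the data are simulated.

```r
# Hypothetical helper (not a GRAB function): checks sum(R) = 0 and
# sum_i X_ij * R_i = 0 for every covariate j, up to a relative tolerance.
check_residual_conditions <- function(R, X, tol = 1e-6) {
  X <- as.matrix(X)
  stopifnot(length(R) == nrow(X))
  sum_R  <- sum(R)
  sum_XR <- drop(crossprod(X, R))          # one value per covariate
  ok <- abs(sum_R) <= tol * sum(abs(R)) &&
    all(abs(sum_XR) <= tol * colSums(abs(X * R)))
  list(sum_R = sum_R, sum_XR = sum_XR, ok = ok)
}

set.seed(1)
n <- 200
X <- cbind(AGE = rnorm(n, 50, 10), GENDER = rbinom(n, 1, 0.5))
y <- 0.02 * X[, "AGE"] + 0.5 * X[, "GENDER"] + rnorm(n)

# Least-squares residuals with an intercept satisfy both conditions
R <- residuals(lm(y ~ X))
check <- check_residual_conditions(R, X)
print(check$ok)

# Residuals that were only mean-centred are still correlated with X;
# regressing them on the covariates projects that part out
R_raw   <- y - mean(y)
R_fixed <- residuals(lm(R_raw ~ X))
print(check_residual_conditions(R_raw, X)$ok)
print(check_residual_conditions(R_fixed, X)$ok)
```

If the check fails for residuals obtained from an external model, regressing them on the covariates (with an intercept) and keeping the new residuals is one simple way to restore both conditions.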

SPACox

SPACox uses an empirical cumulant generating function (CGF) to perform SPA-based single-variant association tests, enabling analysis with residuals from any null model.
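To make the term concrete, the following base R sketch builds the empirical CGF of a residual vector, $K(t) = \log\left(\frac{1}{n}\sum_{i=1}^n e^{tR_i}\right)$, together with its first two derivatives, which are the quantities a saddlepoint approximation works with. It only illustrates the definition: `make_empirical_cgf` is a hypothetical name, the residuals are simulated, and this is not GRAB's internal implementation.

```r
# Empirical CGF of residuals R and its first two derivatives:
#   K(t)   = log(mean(exp(t * R)))
#   K'(t)  = mean(R * exp(t * R)) / mean(exp(t * R))
#   K''(t) = mean(R^2 * exp(t * R)) / mean(exp(t * R)) - K'(t)^2
make_empirical_cgf <- function(R) {
  function(t) {
    w  <- exp(t * R)
    m0 <- mean(w)
    m1 <- mean(R * w)
    m2 <- mean(R^2 * w)
    c(K = log(m0), K1 = m1 / m0, K2 = m2 / m0 - (m1 / m0)^2)
  }
}

set.seed(2)
# Toy skewed residuals, centred so that they sum to zero (illustration only)
R <- rexp(500) - 1
R <- R - mean(R)

cgf <- make_empirical_cgf(R)
print(cgf(0))    # K = 0, K1 = mean(R), K2 = variance of R (divisor n)
print(cgf(0.5))
```

At $t = 0$ the function returns zero, the mean, and the variance of the residuals; away from zero it captures the skewness and heavier tails that a normal approximation ignores.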

Citation:

Bi et al. (2020). Fast and accurate method for genome-wide time-to-event data analysis and its application to UK Biobank. American Journal of Human Genetics. doi:10.1016/j.ajhg.2020.06.003

Step 1: Model Fitting and Preprocessing

In GRAB.NullModel, specify traitType = "Residual" for residual-based methods. A quick example is provided below. Refer to ?GRAB.NullModel and ?GRAB.SPACox for detailed parameter instructions.

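# Step 1, Option 2, SPACox
library(survival)  # provides coxph() and Surv()
library(GRAB)

# PhenoData is assumed to be a data frame with columns
# IID, SurvTime, SurvEvent, AGE, and GENDER

# Fit null model and get residuals (coxph stores martingale residuals)
residuals = coxph(
  Surv(SurvTime, SurvEvent) ~ AGE + GENDER,
  data = PhenoData
)$residuals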

# Calculate parameters needed for step 2
obj.SPACox = GRAB.NullModel(
  residuals ~ AGE + GENDER, 
  data = PhenoData, 
  subjIDcol = "IID", 
  method = "SPACox", 
  traitType = "Residual"
)

Step 2: Association Testing

Refer to ?GRAB.Marker and ?GRAB.SPACox for detailed parameter instructions.

# Step 2, SPACox
GenoFile = system.file("extdata", "simuPLINK.bed", package = "GRAB")
OutputFile = file.path(tempdir(), "Results_SPACox.txt")

# Marker-level testing
GRAB.Marker(obj.SPACox, GenoFile = GenoFile, OutputFile = OutputFile)

# Read results
head(data.table::fread(OutputFile))

Output Columns:

Marker: Variant identifier
Info: CHR:POS:REF:ALT
AltFreq: Alternative allele frequency
AltCounts: Alternative allele count
MissingRate: Proportion missing
Pvalue: Association p-value
zScore: Test statistic
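Once the results are read into a data frame, the columns above can be used for basic post-processing. The sketch below is a minimal base R example on a toy table whose values are made up for illustration; the missing-rate cutoff and the significance threshold of 5e-8 are assumptions for this example, not GRAB defaults.

```r
# Toy results table with the columns listed above (all values are made up)
res <- data.frame(
  Marker      = c("rs1", "rs2", "rs3", "rs4"),
  Info        = c("1:100:A:G", "1:200:C:T", "2:300:G:A", "2:400:T:C"),
  AltFreq     = c(0.30, 0.002, 0.95, 0.12),
  AltCounts   = c(600, 4, 1900, 240),
  MissingRate = c(0, 0.01, 0.20, 0),
  Pvalue      = c(1e-9, 3e-10, 2e-8, 0.4),
  zScore      = c(6.1, -6.3, 5.6, 0.84),
  stringsAsFactors = FALSE
)

# Minor allele frequency derived from the alternative allele frequency
res$MAF <- pmin(res$AltFreq, 1 - res$AltFreq)

# Assumed thresholds for this example only
max_missing <- 0.05
p_threshold <- 5e-8

# Keep well-genotyped, significant markers and sort by p-value
hits <- res[res$MissingRate <= max_missing & res$Pvalue < p_threshold, ]
hits <- hits[order(hits$Pvalue), ]
print(hits[, c("Marker", "Info", "MAF", "Pvalue", "zScore")])
```

With real output, `res` would be the table returned by `data.table::fread(OutputFile)`.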

SPAmix

SPAmix performs retrospective single-variant association tests using genotypes and residuals from null models of any complex trait in large-scale biobanks. It extends SPACox to support complex population structures, such as admixed ancestry and multiple populations, but does not account for sample relatedness.

Citation:

Ma et al. (2025). Sparse estimation of high-dimensional genetic correlation and its application to global biobank meta-analysis. Genome Biology. doi:10.1186/s13059-025-03827-9

Step 1: Model Fitting and Preprocessing

A quick example is shown below. See ?GRAB.NullModel and ?GRAB.SPAmix for full parameter details. In GRAB.NullModel:

Set traitType = "Residual".
Provide control$PC_columns as a single comma-separated string of SNP-derived PC column names (e.g., "PC1,PC2"). This setting is required.
To analyze multiple residuals in one run, place them on the left side of the formula separated by + (e.g., res1 + res2 ~ covariates); each residual is tested independently, while common preprocessing steps are executed once to save time.

# Step 1, Option 2, SPAmix
# Fit one null model and get its residuals
res_cox <- coxph(
  Surv(SurvTime, SurvEvent) ~ AGE + GENDER + PC1 + PC2,
  data = PhenoData
)$residuals

# Fit another null model and get its residuals
res_lm <- lm(
  QuantPheno ~ AGE + GENDER + PC1 + PC2, 
  data = PhenoData
)$residuals

# Calculate parameters needed for step 2
obj.SPAmix <- GRAB.NullModel(
  formula = res_cox + res_lm ~ AGE + GENDER + PC1 + PC2,
  data = PhenoData,
  subjIDcol = "IID",
  method = "SPAmix",
  traitType = "Residual",
  control = list(PC_columns = "PC1,PC2")
)

Step 2: Association Testing

Refer to ?GRAB.Marker and ?GRAB.SPAmix for detailed parameter instructions.

# Step 2, SPAmix
GenoFile = system.file("extdata", "simuPLINK.bed", package = "GRAB")
OutputFile = file.path(tempdir(), "Results_SPAmix.txt")

# Marker-level testing
GRAB.Marker(obj.SPAmix, GenoFile = GenoFile, OutputFile = OutputFile)

# Read results
head(data.table::fread(OutputFile))

Output Columns:

Pheno: Phenotype identifier (pheno_1, pheno_2, …)
Marker: Variant identifier
Info: CHR:POS:REF:ALT
AltFreq: Alternative allele frequency
AltCounts: Alternative allele count
MissingRate: Proportion missing
Pvalue: Association p-value
zScore: Test statistic
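Because several residuals can be tested in one run, the output is in long format with one row per phenotype and marker. The sketch below shows one way to separate the phenotypes in base R, using a toy table with made-up values. That `pheno_1`, `pheno_2`, … follow the order of the residuals on the left side of the formula is an assumption here and should be confirmed in ?GRAB.SPAmix.

```r
# Toy long-format results (values are made up; only three columns shown)
res <- data.frame(
  Pheno  = rep(c("pheno_1", "pheno_2"), each = 3),
  Marker = rep(c("rs1", "rs2", "rs3"), times = 2),
  Pvalue = c(1e-9, 0.20, 0.03, 0.50, 4e-8, 0.70),
  stringsAsFactors = FALSE
)

# One data frame per phenotype
by_pheno <- split(res, res$Pheno)

# Most significant marker for each phenotype
top <- do.call(rbind, lapply(by_pheno, function(d) d[which.min(d$Pvalue), ]))
print(top)

# Wide layout: one row per marker, one p-value column per phenotype
wide <- reshape(res, idvar = "Marker", timevar = "Pheno", direction = "wide")
print(wide)
```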

SPAGRM

SPAGRM is a scalable and accurate framework for retrospective association tests. It treats genetic loci as random vectors and uses a precise approximation of their joint distribution. This enables SPAGRM to handle any type of complex trait, including longitudinal and unbalanced phenotypes. SPAGRM extends SPACox to support sample relatedness.

Note:

Detailed documentation for SPAGRM is available at the SPAGRM online tutorial.

Citation:

Xu et al. (2025). SPAGRM: effectively controlling for sample relatedness in large-scale genome-wide association studies of longitudinal traits. Nature Communications. doi:10.1038/s41467-025-56669-1

Step 1: Preprocessing

A quick example is provided below. Refer to ?SPAGRM.NullModel and ?GRAB.SPAGRM for detailed parameter instructions.

# Load data
ResidMatFile <- system.file("extdata", "ResidMat.txt", package = "GRAB")
SparseGRMFile <- system.file("extdata", "SparseGRM.txt", package = "GRAB")
PairwiseIBDFile <- system.file("extdata", "PairwiseIBD.txt", package = "GRAB")
GenoFile <- system.file("extdata", "simuPLINK.bed", package = "GRAB")
OutputFile <- file.path(tempdir(), "resultSPAGRM.txt")

# Pre-calculate genotype distributions
obj.SPAGRM <- SPAGRM.NullModel(
  ResidMatFile = ResidMatFile,
  SparseGRMFile = SparseGRMFile,
  PairwiseIBDFile = PairwiseIBDFile,
  control = list(ControlOutlier = FALSE)
)

`ResidMatFile` Format

Whitespace-delimited file with two columns:

SubjID  Resid
ID001  -0.234
ID002   0.512
ID003  -0.089
ID004   0.157

Format specifications:

A header row is required.
SubjID values must match the subject IDs used in the GRM and IBD files.
Resid values are computed from an external null model (e.g., lmer(), coxph(), glm()) and should have a mean of approximately 0.
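A file in this format can be written directly from a fitted model. The sketch below is a minimal base R example with simulated data; the column `IID`, the model `y ~ AGE`, and the output file name are placeholders chosen for illustration.

```r
set.seed(3)
n <- 6
# Simulated phenotype table (illustration only)
pheno <- data.frame(
  IID = sprintf("ID%03d", 1:n),
  AGE = rnorm(n, 50, 10),
  y   = rnorm(n),
  stringsAsFactors = FALSE
)

# Fit any null model and pair each residual with its subject ID.
# With missing data, residuals() drops incomplete rows, so IDs must be
# subset in the same way before pairing.
fit <- lm(y ~ AGE, data = pheno)
resid_mat <- data.frame(SubjID = pheno$IID, Resid = unname(residuals(fit)))

# The residuals should be centred at (approximately) zero
stopifnot(abs(mean(resid_mat$Resid)) < 1e-8)

# Write a whitespace-delimited file with a header row
resid_file <- file.path(tempdir(), "ResidMat_example.txt")
write.table(resid_mat, resid_file, sep = "\t", quote = FALSE, row.names = FALSE)
print(readLines(resid_file, n = 3))
```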

`PairwiseIBDFile` Format

A pairwise IBD (identity by descent) file must be whitespace-delimited, with five columns in the following order:

ID1   ID2   pa      pb      pc
f1_5  f1_1  0.0000  0.9296  0.07038
f1_5  f1_2  0.0755  0.8916  0.03285
f1_6  f1_1  0.0000  0.9466  0.05338

Format specifications:

ID1: subject 1 identifier
ID2: subject 2 identifier
pa: probability that the pair shares both alleles IBD (IBD = 2) at a locus
pb: probability that the pair shares one allele IBD (IBD = 1) at a locus
pc: probability that the pair shares no alleles IBD (IBD = 0) at a locus

See getPairwiseIBD for details on generating a pairwise IBD file.
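A quick sanity check of a pairwise IBD table is that the three probabilities lie in [0, 1] and sum to approximately 1 for every pair. The base R sketch below applies this to the three example rows shown above. The derived quantity `pa + pb / 2` is the standard expected proportion of the genome shared IBD (the coefficient of relatedness). Whether GRAB performs these checks internally is not stated here, so treat this as an independent check with an arbitrarily chosen tolerance.

```r
# The three example rows from the format description above
ibd <- read.table(
  text = c(
    "ID1 ID2 pa pb pc",
    "f1_5 f1_1 0.0000 0.9296 0.07038",
    "f1_5 f1_2 0.0755 0.8916 0.03285",
    "f1_6 f1_1 0.0000 0.9466 0.05338"
  ),
  header = TRUE, stringsAsFactors = FALSE
)

probs <- as.matrix(ibd[, c("pa", "pb", "pc")])

# Each probability must lie in [0, 1]
in_range <- all(probs >= 0 & probs <= 1)

# Each row should sum to about 1 (tolerance chosen for rounded values)
row_total <- rowSums(probs)
sums_ok <- all(abs(row_total - 1) < 1e-3)

# Expected proportion of the genome shared IBD for each pair
ibd$relatedness <- ibd$pa + ibd$pb / 2

print(in_range)
print(sums_ok)
print(ibd)
```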

Step 2: Association Testing

Refer to ?GRAB.Marker and ?GRAB.SPAGRM for detailed parameter instructions.

# Perform association tests
GRAB.Marker(obj.SPAGRM, GenoFile = GenoFile, OutputFile = OutputFile)

# Read results
head(data.table::fread(OutputFile))

Output Columns:

Marker: Variant identifier
Info: CHR:POS:REF:ALT
AltFreq: Alternative allele frequency
AltCounts: Alternative allele count
MissingRate: Proportion missing
zScore: Test statistic
Pvalue: Association p-value
hwepval: Hardy-Weinberg equilibrium p-value
