Fit the null model

SPAGRM is a scalable and accurate analysis framework that controls for sample relatedness across a variety of complex traits, including multiple patterns of longitudinal traits (longitudinal mean, within-subject variability, and trend).

In SPAGRM, incorporating a random effect to characterize sample relatedness is optional rather than required. SPAGRM is therefore particularly suitable for analyzing complex traits with complicated structure and for applying complex statistical models.

To conduct a GWAS that includes related subjects, users need to fit the null model only once, with or without random effects, and obtain model residuals defined as the gradient of the likelihood function. We will later show how to do this for longitudinal, quantitative, and binary trait analysis.
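As a brief illustration of what "residuals defined as the gradient of the likelihood function" means, consider two textbook cases. These are standard score-test results, shown here as a sketch and not as a description of SPAGRM's internal implementation; the symbols $y$, $X$, $\beta$, $\gamma$, $\sigma^2$, and $\hat{\mu}$ are notation introduced for this example. For a quantitative trait with model $y = X\beta + G\gamma + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$, the derivative of the log-likelihood $\ell$ with respect to the genotype effect $\gamma$, evaluated under the null $\gamma = 0$, is

\[
\left.\frac{\partial \ell}{\partial \gamma}\right|_{\gamma = 0}
  = \frac{1}{\hat{\sigma}^{2}}\, G^{T}\left(y - X\hat{\beta}\right),
\]

and for a binary trait under logistic regression it is

\[
\left.\frac{\partial \ell}{\partial \gamma}\right|_{\gamma = 0}
  = G^{T}\left(y - \hat{\mu}\right),
\qquad
\hat{\mu} = \frac{1}{1 + \exp(-X\hat{\beta})}.
\]

In both cases the gradient is the genotype vector multiplied by a residual-like vector that depends only on the null fit, which is why the null model needs to be fitted only once.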

General pipeline to fit the null model

  1. Prepare the data
    • Prepare the phenotype and covariates. A genotype file or GRM file is required only if you fit a mixed model containing random effects to account for sample relatedness.
    • It’s recommended to perform quality control (QC) for longitudinal traits.
    • Covariates can include age, gender, SNP-derived top principal components (PCs), Leave One Chromosome Out Polygenic Scores (LOCO-PGSs), and so on.
  2. Choose a suitable statistical model
    • For example, we can use linear mixed models or generalized estimating equations for longitudinal trait analysis.
    • We can use linear/logistic regression methods for quantitative/binary trait analysis.
    • Users can try other statistical models for complex traits with more complicated structure if interested.
  3. Fit the null model and obtain model residuals
    • The score statistics can be derived from the gradient of the likelihood function under the fitted null model.
    • In general, the score statistics take the consistent form $S = G^{T} R$, where $G$ is the genotype to be tested and $R$ is the vector of model residuals.
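
The three steps above can be sketched numerically for the simplest case, a quantitative trait fitted by ordinary linear regression without random effects. This is a minimal illustration in Python with NumPy, not the package's own interface: the function names `fit_null_linear` and `score_statistic` are hypothetical, the data are simulated, and any scaling of the residuals (for example by the residual variance) is omitted.

```python
import numpy as np

def fit_null_linear(y, X):
    """Fit the null model y ~ 1 + X by least squares; return residuals R."""
    X1 = np.column_stack([np.ones(len(y)), X])          # add an intercept column
    beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)   # null-model coefficients
    return y - X1 @ beta_hat                            # R = y - X1 @ beta_hat

def score_statistic(G, R):
    """Score statistic S = G^T R for one genotype vector G."""
    return float(G @ R)

# Simulated data (illustrative assumption, not real phenotypes or genotypes)
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))                             # two covariates, e.g. age and a PC
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(size=n)

# Step 3: fit the null model once and keep the residuals
R = fit_null_linear(y, X)

# The same residuals are reused for every variant that is tested
G = rng.binomial(2, 0.3, size=n).astype(float)          # genotype coded 0/1/2
S = score_statistic(G, R)
```

Because `R` depends only on the phenotype and covariates, it is computed once and then combined with each genotype vector in turn, which is the property the pipeline relies on.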

Table of contents