Quantitative traits
We can use a linear regression to fit the null model for quantitative traits. That is ignoring random effects in mixed models. Or we can use a linear mixed model to fit the null model and obtain residuals.
Linear regression
Please run the following code in R
Read in example data sets
library(GRAB)
PhenoFile = system.file("extdata", "simuPHENO.txt", package = "GRAB")
PhenoData = data.table::fread(PhenoFile)
print(PhenoData)
# IID AGE GENDER eta bVec BinaryPheno QuantPheno
# 1: Subj-1 59.61118 0 29.59961 -0.2059781 0 0.27866626
# 2: Subj-10 61.31636 0 29.32101 -1.3371678 0 -2.29732890
# 3: Subj-100 59.16625 0 29.90944 0.3263173 0 -2.48702515
# 4: Subj-101 60.18490 1 28.99876 -1.5936861 0 -3.40632756
# 5: Subj-102 58.76700 1 27.90020 -1.9832990 0 -3.44214439
# ---
# 996: f9_5 59.25336 0 29.33119 -0.2954873 0 -0.91237106
# 997: f9_6 57.90253 1 28.55622 -0.8950407 0 -2.50808150
# 998: f9_7 62.04752 1 30.98239 -0.5413704 0 -0.38776894
# 999: f9_8 58.79103 0 28.06333 -1.3321861 0 -0.02743997
# 1000: f9_9 61.18766 1 30.14505 -0.9487810 0 -0.02288709
# OrdinalPheno SurvTime SurvEvent
# 1: 0 0.03245097 0
# 2: 0 0.19733412 0
# 3: 1 0.09451178 0
# 4: 0 0.10946053 0
# 5: 0 0.04882298 0
# ---
# 996: 2 0.20041554 0
# 997: 0 0.01220480 0
# 998: 2 0.04456741 0
# 999: 0 0.07052257 0
# 1000: 0 0.08657261 0
Fit the null model and obtain model residuals
nullmodel = lm(QuantPheno ~ AGE + GENDER, data = PhenoData)
ResidMat = data.frame(SubjID = PhenoData$IID,
Resid = nullmodel$residuals)
print(ResidMat)
# SubjID Resid
# 1: Subj-1 0.7302493
# 2: Subj-10 -2.7149915
# 3: Subj-100 -1.8086329
# 4: Subj-101 -3.6774972
# 5: Subj-102 -2.9905191
# ---
# 996: f9_5 -0.2783859
# 997: f9_6 -1.6157763
# 998: f9_7 -1.6084382
# 999: f9_8 0.8422290
# 1000: f9_9 -0.8052310
ResidMatFile = system.file("extdata", "ResidMat.txt", package = "GRAB")
data.table::fwrite(ResidMat, file = ResidMatFile, row.names = FALSE, col.names = TRUE, quote = FALSE, sep = "\t")
Linear mixed model
In practice, we use SAIGE to fit the null linear mixed model. Here is an example code showing how to use docker to run SAIGE.
Fit the null mixed model
docker run -v /path/to/your/sparseGRM/:/sparseGRMDir \
-v /path/to/your/pheno/:/phenoDir \
-v /path/to/your/geno/:/genoDir \
-v /path/to/your/output/:/outputDir \
wzhou88/saige:1.1.9 step1_fitNULLGLMM.R \
--sparseGRMFile=/sparseGRMDir/SAIGE_SparseGRM.txt \
--sparseGRMSampleIDFile=/sparseGRMDir/SAIGE_SparseGRMSampleID.txt \
--useSparseGRMtoFitNULL=TRUE \
--outputPrefix=/outputDir/SAIGE_step1 \
--plinkFile=/genoDir/simuPLINK \
--phenoFile=/phenoDir/simuPHENO.txt \
--traitType=quantitative \
--invNormalize=TRUE \
--phenoCol=QuantPheno \
--covarColList=AGE,GENDER \
--sampleIDColinphenoFile=IID \
--isCateVarianceRatio=FALSE \
--IsOverwriteVarianceRatioFile=TRUE
See SAIGE documentation for more details about these commands.
Load step 1 product and obtain model residuals
Please run the following code in R
load("/path/to/your/output/SAIGE_step1.rda") # please verify your filepath.
ResidMat = data.frame(SubjID = modglmm$sampleID,
Resid = modglmm$residuals)
print(ResidMat)
# SubjID Resid
# 1: f1_1 -1.1239756
# 2: f1_2 0.7595350
# 3: f1_3 0.6276040
# 4: f1_4 1.2506376
# 5: f1_5 -0.3869877
# ---
# 996: Subj-496 -0.7629248
# 997: Subj-497 0.2142096
# 998: Subj-498 -0.1806843
# 999: Subj-499 0.2505687
# 1000: Subj-500 0.2809179
ResidMatFile = system.file("extdata", "ResidMat.txt", package = "GRAB")
data.table::fwrite(ResidMat, file = ResidMatFile, row.names = FALSE, col.names = TRUE, quote = FALSE, sep = "\t")
Note
ResidMatFile
has the same format regardless of testing $\beta$g or $\tau$g.- The column names of
ResidMatFile
must be exactlySubjID
in the first column andResid
in the second column.- Each subject should match its corresponding residual.