Tutorial: EMMAX GotCloud STOM: Lecture 2
From Genome Analysis Wiki
Jump to navigationJump to searchSTOM 2014 Workshop - Practical Sessions 2
Lecture 2
The slides describing the notes below are available here (PDF)
Basic Setup
- To see the files for the session, type
ls /data/stom2014/session2/
If you see any errors, please let me know now!
- For convenience, let’s set some variables
export S2=/data/stom2014/session2 mkdir ~/out
Naive Association Test
- Run naive association test using PLINK
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --linear --out ~/out/naive
- Check your output file and see what it looks like
less ~/out/naive.assoc.linear
- Check the p-value at the causal variant
grep -w ADD ~/out/naive.assoc.linear | grep 20:1459060
- Draw QQ plot using the following R commands
> source('/data/stom2014/session2/r/qqconf.r') > T <- read.table('~/out/naive.assoc.linear',header=TRUE) > pdf('~/out/naive.pdf') > qq.conf.beta(T$P) > dev.off()
Genomic Control
- Add --adjust option to enable genomic control
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --linear --adjust --out ~/out/naive
- Calculate inflation factor on your own
> T <- read.table('~/out/naive.assoc.linear',header=TRUE)
- First, find the median p-value
> median(T$P) > 0.4814
- Convert p-value into chi-square using R, and compute lambda
> qchisq(0.4814,1,lower.tail=FALSE) [1] 0,4956901 > 0.4958032/0.456 [1] 1.08704
- Check out the custom script to calculate inflation factor
less $S2/r/calc.GC.lambda.r
- Feed the p-values from association results
cut -c 96- ~/out/naive.assoc.linear | Rscript $S2/r/calc.GC.lambda.r
Principal Component Analysis
- Convert PLINK format file to EMMAX-compatible format to obtain PCs
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --recode12 --output-missing-genotype 0 --transpose --out ~/out/1000G.auto.omni.phased.EUR
- Create kinship matrix using EMMAX
$S2/bin/emmax-kin-intel64 -T 1 -M 0.2 -v -d 10 ~/out/1000G.auto.omni.phased.EUR less ~/out/1000G.auto.omni.phased.EUR.aBN.kinf
- Calculate principal component using custom script
Rscript $S2/r/calc.PC.from.kinf.r ~/out/1000G.auto.omni.phased.EUR.aBN.kinf ~/out/1000G.auto.omni.phased.EUR.tfam ~/out/1000G.auto.omni.phased.EUR.pc10
- Check out how the PCA outcome looks like
less ~/out/1000G.auto.omni.phased.EUR.pc10
- Visualize the population structure using PCs with custom script
Rscript $S2/r/plot_pc_pop.r ~/out/1000G.auto.omni.phased.EUR.pc10 $S2/data/1000G.auto.omni.phased.EUR.pop ~/out/1000G.auto.omni.phased.EUR.pc10.pdf
- Use PCs as covariates to adjust for PCs
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --covar ~/out/1000G.auto.omni.phased.EUR.pc10 --covar-number 1,2,3,4 --linear --adjust --out ~/out/pca
- Check out the p-value at the causal variant and inflation of statistics
grep -w ADD ~/out/pca.assoc.linear | grep 20:1459060 grep -w ADD ~/out/pca.assoc.linear | cut -c 96- | Rscript $S2/r/calc.GC.lambda.r
Mixed Model Association
- Run EMMAX association
$S2/bin/emmax-intel64 -t ~/out/1000G.auto.omni.phased.EUR -o ~/out/emmax -p $S2/data/1000G_EUR_20_1459060.phe -k ~/out/1000G.auto.omni.phased.EUR.aBN.kinf
- Check the inflation factor
cut -f 4 ~out/emmax.ps | Rscript $S2/r/calc.GC.lambda.r [1] 1.006079
- Draw and compare multiple QQ plots using the R function provided
> source('/data/stom2014/session2/r/qqconf.r') > T1 <- read.table('~/out/naive.assoc.linear',header=TRUE) > T2 <- read.table('~/out/pca.assoc.linear',header=TRUE) > T3 <- read.table('~/out/emmax.ps') > pdf('~/out/all.pdf') > qq.conf.beta(T1$P) > qq.conf.beta(T2$P,drawaxis=FALSE,ptcolor="blue") > qq.conf.beta(T3$V4,drawaxis=FALSE,ptcolor="red") > dev.off()