Line 14: |
Line 14: |
| | | |
| == Lecture 2 == | | == Lecture 2 == |
| + | |
| + | === Basic Setup === |
| | | |
| * To see the files for the session, type | | * To see the files for the session, type |
Line 22: |
Line 24: |
| export S2=/data/stom2014/session2 | | export S2=/data/stom2014/session2 |
| mkdir ~/out | | mkdir ~/out |
| + | |
| + | === Naive Association Test === |
| | | |
| * Run naive association test using PLINK | | * Run naive association test using PLINK |
Line 42: |
Line 46: |
| > qq.conf.beta(T$P) | | > qq.conf.beta(T$P) |
| > dev.off() | | > dev.off() |
| + | |
| + | === Genomic Control === |
| | | |
| * Add --adjust option to enable genomic control | | * Add --adjust option to enable genomic control |
Line 62: |
Line 68: |
| > 0.4958032/0.456 | | > 0.4958032/0.456 |
| [1] 1.08704 | | [1] 1.08704 |
| + | |
| + | * Check out the custom script to calculate inflation factor |
| + | |
| + | less $S2/r/calc.GC.lambda.r |
| + | |
| + | ** Feed the p-values from association results |
| + | cut -c 96- ~/out/naive.assoc.linear | Rscript $S2/r/calc.GC.lambda.r |
| + | |
| + | === Principal Component Analysis === |
| + | |
| + | * Convert PLINK format file to EMMAX-compatible format to obtain PCs |
| + | |
| + | $S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --recode12 --output-missing-genotype 0 --transpose --out ~/out/1000G.auto.omni.phased.EUR |
| + | |
| + | * Create kinship matrix using EMMAX |
| + | |
| + | $S2/bin/emmax-kin-intel64 -T 1 -M 0.2 -v -d 10 ~/out/1000G.auto.omni.phased.EUR |
| + | less ~/out/1000G.auto.omni.phased.EUR.aBN.kinf |
| + | |
| + | * Calculate principal component using custom script |
| + | |
| + | Rscript $S2/r/calc.PC.from.kinf.r ~/out/1000G.auto.omni.phased.EUR.aBN.kinf ~/out/1000G.auto.omni.phased.EUR.tfam ~/out/1000G.auto.omni.phased.EUR.pc10 |
| + | |
| + | * Check out how the PCA outcome looks like |
| + | |
| + | less ~/out/1000G.auto.omni.phased.EUR.pc10 |
| + | |
| + | * Visualize the population structure using PCs with custom script |
| + | |
| + | Rscript $S2/r/plot_pc_pop.r ~/out/1000G.auto.omni.phased.EUR.pc10 $S2/data/1000G.auto.omni.phased.EUR.pop ~/out/1000G.auto.omni.phased.EUR.pc10.pdf |
| + | |
| + | * Use PCs as covariates to adjust for PCs |
| + | |
| + | $S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --covar ~/out/1000G.auto.omni.phased.EUR.pc10 --covar-number 1,2,3,4 --linear --adjust --out ~/out/pca |
| + | |
| + | * Check out the p-value at the causal variant and inflation of statistics |
| + | |
| + | grep -w ADD ~/out/pca.assoc.linear | grep 20:1459060 |
| + | grep -w ADD ~/out/pca.assoc.linear | cut -c 96- | Rscript $S2/r/calc.GC.lambda.r |
| + | |
| + | === Mixed Model Association === |
| + | |
| + | * Run EMMAX association |
| + | |
| + | $S2/bin/emmax-intel64 -t ~/out/1000G.auto.omni.phased.EUR -o ~/out/emmax -p $S2/data/1000G_EUR_20_1459060.phe -k ~/out/1000G.auto.omni.phased.EUR.aBN.kinf |
| + | |
| + | * Check the inflation factor |
| + | |
| + | cut -f 4 ~out/emmax.ps | Rscript $S2/r/calc.GC.lambda.r |
| + | [1] 1.006079 |
| + | |
| + | * Draw and compare multiple QQ plots using the R function provided |
| + | |
| + | > source('/data/stom2014/session2/r/qqconf.r') |
| + | > T1 <- read.table('~/out/naive.assoc.linear',header=TRUE) |
| + | > T2 <- read.table('~/out/pca.assoc.linear',header=TRUE) |
| + | > T3 <- read.table('~/out/emmax.ps') |
| + | > pdf(‘~/out/na?) |
| + | > pdf('~/out/all.pdf') |
| + | > qq.conf.beta(T1$P) |
| + | > qq.conf.beta(T2$P,drawaxis=FALSE,ptcolor="blue") |
| + | > qq.conf.beta(T3$V4,drawaxis=FALSE,ptcolor="red") |
| + | > dev.off() |