Tutorial: EMMAX GotCloud STOM: Lecture 2

From Genome Analysis Wiki
Jump to navigationJump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

STOM 2014 Workshop - Practical Sessions 2

Lecture 2

The slides describing the notes below are available here (PDF)

Basic Setup

  • To see the files for the session, type
ls /data/stom2014/session2/

If you see any errors, please let me know now!

  • For convenience, let’s set some variables
export S2=/data/stom2014/session2
mkdir ~/out

Naive Association Test

  • Run naive association test using PLINK
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --linear --out ~/out/naive
  • Check your output file and see what it looks like
less ~/out/naive.assoc.linear
  • Check the p-value at the causal variant
grep -w ADD ~/out/naive.assoc.linear | grep 20:1459060
  • Draw QQ plot using the following R commands
> source('/data/stom2014/session2/r/qqconf.r')
> T <- read.table('~/out/naive.assoc.linear',header=TRUE)
> pdf('~/out/naive.pdf')
> qq.conf.beta(T$P)
> dev.off()

Genomic Control

  • Add --adjust option to enable genomic control
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --linear --adjust --out ~/out/naive
  • Calculate inflation factor on your own
> T <- read.table('~/out/naive.assoc.linear',header=TRUE)
    • First, find the median p-value
> median(T$P)
> 0.4814
    • Convert p-value into chi-square using R, and compute lambda
> qchisq(0.4814,1,lower.tail=FALSE)
[1] 0,4956901
> 0.4958032/0.456
[1] 1.08704
  • Check out the custom script to calculate inflation factor
less $S2/r/calc.GC.lambda.r
    • Feed the p-values from association results
cut -c 96- ~/out/naive.assoc.linear | Rscript $S2/r/calc.GC.lambda.r

Principal Component Analysis

  • Convert PLINK format file to EMMAX-compatible format to obtain PCs
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --recode12 --output-missing-genotype 0 --transpose --out ~/out/1000G.auto.omni.phased.EUR
  • Create kinship matrix using EMMAX

$S2/bin/emmax-kin-intel64 -T 1 -M 0.2 -v -d 10 ~/out/1000G.auto.omni.phased.EUR less ~/out/1000G.auto.omni.phased.EUR.aBN.kinf

  • Calculate principal component using custom script
Rscript $S2/r/calc.PC.from.kinf.r ~/out/1000G.auto.omni.phased.EUR.aBN.kinf ~/out/1000G.auto.omni.phased.EUR.tfam ~/out/1000G.auto.omni.phased.EUR.pc10
  • Check out how the PCA outcome looks like
less ~/out/1000G.auto.omni.phased.EUR.pc10
  • Visualize the population structure using PCs with custom script
Rscript $S2/r/plot_pc_pop.r ~/out/1000G.auto.omni.phased.EUR.pc10 $S2/data/1000G.auto.omni.phased.EUR.pop ~/out/1000G.auto.omni.phased.EUR.pc10.pdf
  • Use PCs as covariates to adjust for PCs
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --covar ~/out/1000G.auto.omni.phased.EUR.pc10 --covar-number 1,2,3,4 --linear --adjust --out ~/out/pca
  • Check out the p-value at the causal variant and inflation of statistics
grep -w ADD ~/out/pca.assoc.linear | grep 20:1459060 
grep -w ADD ~/out/pca.assoc.linear | cut -c 96- | Rscript $S2/r/calc.GC.lambda.r

Mixed Model Association

  • Run EMMAX association

$S2/bin/emmax-intel64 -t ~/out/1000G.auto.omni.phased.EUR -o ~/out/emmax -p $S2/data/1000G_EUR_20_1459060.phe -k ~/out/1000G.auto.omni.phased.EUR.aBN.kinf

  • Check the inflation factor
cut -f 4 ~out/emmax.ps | Rscript $S2/r/calc.GC.lambda.r
[1] 1.006079
  • Draw and compare multiple QQ plots using the R function provided
> source('/data/stom2014/session2/r/qqconf.r')
> T1 <- read.table('~/out/naive.assoc.linear',header=TRUE)
> T2 <- read.table('~/out/pca.assoc.linear',header=TRUE)
> T3 <- read.table('~/out/emmax.ps')
> pdf('~/out/all.pdf')
> qq.conf.beta(T1$P)
> qq.conf.beta(T2$P,drawaxis=FALSE,ptcolor="blue")
> qq.conf.beta(T3$V4,drawaxis=FALSE,ptcolor="red")
> dev.off()