Tutorial: EMMAX GotCloud STOM: Lecture 2

STOM 2014 Workshop - Practical Sessions 2

Lecture 2

The slides describing the notes below are available here (PDF)

Basic Setup

• To see the files for the session, type
```ls /data/stom2014/session2/
```

If you see any errors, please let me know now!

• For convenience, let’s set some variables
```export S2=/data/stom2014/session2
mkdir ~/out
```

Naive Association Test

• Run naive association test using PLINK
```\$S2/bin/plink --noweb --bfile \$S2/data/1000G.auto.omni.phased.EUR --pheno \$S2/data/1000G_EUR_20_1459060.phe --linear --out ~/out/naive
```
• Check your output file and see what it looks like
```less ~/out/naive.assoc.linear
```
• Check the p-value at the causal variant
```grep -w ADD ~/out/naive.assoc.linear | grep 20:1459060
```
• Draw QQ plot using the following R commands
```> source('/data/stom2014/session2/r/qqconf.r')
> pdf('~/out/naive.pdf')
> qq.conf.beta(T\$P)
> dev.off()
```

Genomic Control

```\$S2/bin/plink --noweb --bfile \$S2/data/1000G.auto.omni.phased.EUR --pheno \$S2/data/1000G_EUR_20_1459060.phe --linear --adjust --out ~/out/naive
```
• Calculate inflation factor on your own
```> T <- read.table('~/out/naive.assoc.linear',header=TRUE)
```
• First, find the median p-value
```> median(T\$P)
> 0.4814
```
• Convert p-value into chi-square using R, and compute lambda
```> qchisq(0.4814,1,lower.tail=FALSE)
[1] 0,4956901
> 0.4958032/0.456
[1] 1.08704
```
• Check out the custom script to calculate inflation factor
```less \$S2/r/calc.GC.lambda.r
```
• Feed the p-values from association results
```cut -c 96- ~/out/naive.assoc.linear | Rscript \$S2/r/calc.GC.lambda.r
```

Principal Component Analysis

• Convert PLINK format file to EMMAX-compatible format to obtain PCs
```\$S2/bin/plink --noweb --bfile \$S2/data/1000G.auto.omni.phased.EUR --recode12 --output-missing-genotype 0 --transpose --out ~/out/1000G.auto.omni.phased.EUR
```
• Create kinship matrix using EMMAX

\$S2/bin/emmax-kin-intel64 -T 1 -M 0.2 -v -d 10 ~/out/1000G.auto.omni.phased.EUR less ~/out/1000G.auto.omni.phased.EUR.aBN.kinf

• Calculate principal component using custom script
```Rscript \$S2/r/calc.PC.from.kinf.r ~/out/1000G.auto.omni.phased.EUR.aBN.kinf ~/out/1000G.auto.omni.phased.EUR.tfam ~/out/1000G.auto.omni.phased.EUR.pc10
```
• Check out how the PCA outcome looks like
```less ~/out/1000G.auto.omni.phased.EUR.pc10
```
• Visualize the population structure using PCs with custom script
```Rscript \$S2/r/plot_pc_pop.r ~/out/1000G.auto.omni.phased.EUR.pc10 \$S2/data/1000G.auto.omni.phased.EUR.pop ~/out/1000G.auto.omni.phased.EUR.pc10.pdf
```
• Use PCs as covariates to adjust for PCs
```\$S2/bin/plink --noweb --bfile \$S2/data/1000G.auto.omni.phased.EUR --pheno \$S2/data/1000G_EUR_20_1459060.phe --covar ~/out/1000G.auto.omni.phased.EUR.pc10 --covar-number 1,2,3,4 --linear --adjust --out ~/out/pca
```
• Check out the p-value at the causal variant and inflation of statistics
```grep -w ADD ~/out/pca.assoc.linear | grep 20:1459060
grep -w ADD ~/out/pca.assoc.linear | cut -c 96- | Rscript \$S2/r/calc.GC.lambda.r
```

Mixed Model Association

• Run EMMAX association

\$S2/bin/emmax-intel64 -t ~/out/1000G.auto.omni.phased.EUR -o ~/out/emmax -p \$S2/data/1000G_EUR_20_1459060.phe -k ~/out/1000G.auto.omni.phased.EUR.aBN.kinf

• Check the inflation factor
```cut -f 4 ~out/emmax.ps | Rscript \$S2/r/calc.GC.lambda.r
[1] 1.006079
```
• Draw and compare multiple QQ plots using the R function provided
```> source('/data/stom2014/session2/r/qqconf.r')