Difference between revisions of "Tutorial: EMMAX GotCloud STOM"

From Genome Analysis Wiki
Jump to navigationJump to search
 
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
= STOM 2014 Workshop - Practical Sessions =
 
= STOM 2014 Workshop - Practical Sessions =
  
Welcome to Hyun Min Kang's practical session guide page for STOM 2014 workshop. If you do not know what STOM 2014 workshop is please follow the [[ http://bibs.snu.ac.kr/board/index.php?catid=201&bcid=344 | link]]
+
Welcome to Hyun Min Kang's practical session guide page for STOM 2014 workshop. If you do not know what STOM 2014 workshop is please follow http://bibs.snu.ac.kr/board/index.php?catid=201&bcid=344
  
 
This page is intended to supplement the slides presented in the practical sessions of STOM 2014 workshop by facilitating easy copy-and paste of commands illustrated in the example, in the case the speed of the lecture is too fast for you.
 
This page is intended to supplement the slides presented in the practical sessions of STOM 2014 workshop by facilitating easy copy-and paste of commands illustrated in the example, in the case the speed of the lecture is too fast for you.
Line 13: Line 13:
 
Note that some commands can be very long and may go farther than the browser's width
 
Note that some commands can be very long and may go farther than the browser's width
  
== Lecture 2 ==
+
== Lecture 2 : Genomic Control, PCA, EMMAX ==
  
The slides describing the notes below is available [[Media:Stom practice 02.pdf | here (PDF)]]
+
Please see [[Tutorial:_EMMAX_GotCloud_STOM:_Lecture_2 | Page for Practice 2 ]]
  
=== Basic Setup ===
+
== Lecture 5 : GotCloud Alignment Pipeline  ==
  
* To see the files for the session, type
+
Please see [[Tutorial:_EMMAX_GotCloud_STOM:_Lecture_5 | Page for Practice 5 ]]
ls /data/stom2014/session2/
 
If you see any errors, please let me know now!
 
  
* For convenience, let’s set some variables
+
== Lecture 6 : GotCloud Variant Calling Pipeline + samtools ==
export S2=/data/stom2014/session2
 
mkdir ~/out
 
  
=== Naive Association Test ===
+
Please see [[Tutorial:_EMMAX_GotCloud_STOM:_Lecture_6 | Page for Practice 6 ]]
  
* Run naive association test using PLINK
+
== Lecture 8 : Sequence-based association using EPACTS ==
  
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --linear --out ~/out/naive
+
Please see [[Tutorial:_EMMAX_GotCloud_STOM:_Lecture_8 | Page for Practice 8 ]]
 
 
* Check your output file and see what it looks like
 
 
 
less ~/out/naive.assoc.linear
 
 
 
* Check the p-value at the causal variant
 
 
 
grep -w ADD ~/out/naive.assoc.linear | grep 20:1459060
 
 
 
* Draw QQ plot using the following R commands
 
 
 
> source('/data/stom2014/session2/r/qqconf.r')
 
> T <- read.table('~/out/naive.assoc.linear',header=TRUE)
 
> pdf('~/out/naive.pdf')
 
> qq.conf.beta(T$P)
 
> dev.off()
 
 
 
=== Genomic Control ===
 
 
 
* Add --adjust option to enable genomic control
 
 
 
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --linear --adjust --out ~/out/naive
 
 
 
* Calculate inflation factor on your own
 
 
 
> T <- read.table('~/out/naive.assoc.linear',header=TRUE)
 
 
 
** First, find the median p-value
 
 
 
> median(T$P)
 
> 0.4814
 
 
 
** Convert p-value into chi-square using R, and compute lambda
 
 
 
> qchisq(0.4814,1,lower.tail=FALSE)
 
[1] 0,4956901
 
> 0.4958032/0.456
 
[1] 1.08704
 
 
 
* Check out the custom script to calculate inflation factor
 
 
 
less $S2/r/calc.GC.lambda.r
 
 
 
** Feed the p-values from association results
 
cut -c 96- ~/out/naive.assoc.linear | Rscript $S2/r/calc.GC.lambda.r
 
 
 
=== Principal Component Analysis ===
 
 
 
* Convert PLINK format file to EMMAX-compatible format to obtain PCs
 
 
 
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --recode12 --output-missing-genotype 0 --transpose --out ~/out/1000G.auto.omni.phased.EUR
 
 
 
* Create kinship matrix using EMMAX
 
 
 
$S2/bin/emmax-kin-intel64 -T 1 -M 0.2 -v -d 10 ~/out/1000G.auto.omni.phased.EUR
 
less ~/out/1000G.auto.omni.phased.EUR.aBN.kinf
 
 
 
* Calculate principal component using custom script
 
 
 
Rscript $S2/r/calc.PC.from.kinf.r ~/out/1000G.auto.omni.phased.EUR.aBN.kinf ~/out/1000G.auto.omni.phased.EUR.tfam ~/out/1000G.auto.omni.phased.EUR.pc10
 
 
 
* Check out how the PCA outcome looks like
 
 
 
less ~/out/1000G.auto.omni.phased.EUR.pc10
 
 
 
* Visualize the population structure using PCs with custom script
 
 
 
Rscript $S2/r/plot_pc_pop.r ~/out/1000G.auto.omni.phased.EUR.pc10 $S2/data/1000G.auto.omni.phased.EUR.pop ~/out/1000G.auto.omni.phased.EUR.pc10.pdf
 
 
 
* Use PCs as covariates to adjust for PCs
 
 
 
$S2/bin/plink --noweb --bfile $S2/data/1000G.auto.omni.phased.EUR --pheno $S2/data/1000G_EUR_20_1459060.phe --covar ~/out/1000G.auto.omni.phased.EUR.pc10 --covar-number 1,2,3,4 --linear --adjust --out ~/out/pca
 
 
 
* Check out the p-value at the causal variant and inflation of statistics
 
 
 
grep -w ADD ~/out/pca.assoc.linear | grep 20:1459060
 
grep -w ADD ~/out/pca.assoc.linear | cut -c 96- | Rscript $S2/r/calc.GC.lambda.r
 
 
 
=== Mixed Model Association ===
 
 
 
* Run EMMAX association
 
 
 
$S2/bin/emmax-intel64 -t ~/out/1000G.auto.omni.phased.EUR -o ~/out/emmax -p $S2/data/1000G_EUR_20_1459060.phe -k ~/out/1000G.auto.omni.phased.EUR.aBN.kinf
 
 
 
* Check the inflation factor
 
 
 
cut -f 4 ~out/emmax.ps | Rscript $S2/r/calc.GC.lambda.r
 
[1] 1.006079
 
 
 
* Draw and compare multiple QQ plots using the R function provided
 
 
 
> source('/data/stom2014/session2/r/qqconf.r')
 
> T1 <- read.table('~/out/naive.assoc.linear',header=TRUE)
 
> T2 <- read.table('~/out/pca.assoc.linear',header=TRUE)
 
> T3 <- read.table('~/out/emmax.ps')
 
> pdf('~/out/all.pdf')
 
> qq.conf.beta(T1$P)
 
> qq.conf.beta(T2$P,drawaxis=FALSE,ptcolor="blue")
 
> qq.conf.beta(T3$V4,drawaxis=FALSE,ptcolor="red")
 
> dev.off()
 

Latest revision as of 10:56, 6 January 2014

STOM 2014 Workshop - Practical Sessions

Welcome to Hyun Min Kang's practical session guide page for STOM 2014 workshop. If you do not know what STOM 2014 workshop is please follow http://bibs.snu.ac.kr/board/index.php?catid=201&bcid=344

This page is intended to supplement the slides presented in the practical sessions of STOM 2014 workshop by facilitating easy copy-and paste of commands illustrated in the example, in the case the speed of the lecture is too fast for you.

This page only covers lecture 2, 5, 6, 8 that are taught by Hyun Min Kang.

Here I assume that

  • The audience has basic knowledge of Unix system, basic utilities, and pipes
  • The audience has the account in the cluster system and know how to access to the resources presented here

Note that some commands can be very long and may go farther than the browser's width

Lecture 2 : Genomic Control, PCA, EMMAX

Please see Page for Practice 2

Lecture 5 : GotCloud Alignment Pipeline

Please see Page for Practice 5

Lecture 6 : GotCloud Variant Calling Pipeline + samtools

Please see Page for Practice 6

Lecture 8 : Sequence-based association using EPACTS

Please see Page for Practice 8