Changes

From Genome Analysis Wiki
Jump to navigationJump to search
Line 1: Line 1: −
(''WARNING: Under Construction'')
+
'''Note:''' the latest version of this practical is available at: [[SeqShop: Genetic Association Analysis Practical]]
 +
* The ones here is the original one from the June workshop (updated to be run from elsewhere)
 +
 
 +
 
 +
== Introduction ==
 +
[[Media:Seqshop association 2014 06.pdf|View Lecture Slides]]
 +
 
 +
[[Media:Seqshop association practice 2014 06.pdf|View Introductory Slides for Practical Session]]
    
== Goals of This Session ==
 
== Goals of This Session ==
 +
 
* Understand how to annotate variants using EPACTS
 
* Understand how to annotate variants using EPACTS
 
* Understand how to run single variant association analysis using EPACTS
 
* Understand how to run single variant association analysis using EPACTS
 
* Understand how to run rare variant association test using EPACTS
 
* Understand how to run rare variant association test using EPACTS
 
* Understand how to visualize the association output from EPACTS
 
* Understand how to visualize the association output from EPACTS
 +
 +
== Setup in person at the SeqShop Workshop ==
 +
''This section is specifically for the SeqShop Workshop computers.''
 +
<div class="mw-collapsible mw-collapsed" style="width:600px">
 +
''If you are not running during the SeqShop Workshop, please skip this section.''
 +
<div class="mw-collapsible-content">
    
{{SeqShopLogin}}
 
{{SeqShopLogin}}
   −
== Setup your run environment==
+
=== Setup your run environment===
 
+
This is the same setup you did for the previous tutorial, but you need to redo it each time you log in.
This is setup similar to the Structural Variant session.
     −
* GotCloud program
+
This will setup some environment variables to point you to
 +
* [[GotCloud]] program
 
* Tutorial input files
 
* Tutorial input files
 
* Setup an output directory
 
* Setup an output directory
Line 20: Line 34:  
* You won't see any output after running <code>source</code>
 
* You won't see any output after running <code>source</code>
 
** It silently sets up your environment
 
** It silently sets up your environment
** If you want to view the detail of the set up, type
+
** If you want to view the detail of the setup, type
  less /home/hmkang/seqshop/setup.txt
+
  less /home/mktrost/seqshop/setup.txt
 
and press 'q' to finish.
 
and press 'q' to finish.
 +
 
<div class="mw-collapsible mw-collapsed" style="width:200px">
 
<div class="mw-collapsible mw-collapsed" style="width:200px">
 
View setup.txt
 
View setup.txt
Line 35: Line 50:  
  export OUT=~/out
 
  export OUT=~/out
 
  mkdir -p ${OUT}
 
  mkdir -p ${OUT}
 +
</div>
 +
</div>
 +
</div>
 +
</div>
 +
 +
 +
== Setup when running on your own outside of the SeqShop Workshop ==
 +
''This section is specifically for running on your own outside of the SeqShop Workshop.''
 +
<div class="mw-collapsible" style="width:600px">
 +
''If you are running during the SeqShop Workshop, please skip this section.''
 +
<div class="mw-collapsible-content">
 +
 +
This tutorial builds on the alignment & snpcall tutorials, if you have not already, please first run those tutorials: [[SeqShop:_Sequence_Mapping_and_Assembly_Practical, June 2014|Alignment Tutorial]] & [[SeqShop: Variant Calling and Filtering for SNPs Practical, June 2014|SNP Calling Tutorial]]
 +
 +
=== Download & Build EPACTS ===
 +
If you do not already have EPACTS:
 +
* cd to where you want EPACTS installed (you can change this to any directory you want)
 +
mkdir -p ~/seqshop
 +
cd ~/seqshop/
 +
* download, decompress, and build the version of epacts that was tested with this tutorial:
 +
wget http://csg.sph.umich.edu//kang/epacts/download/EPACTS-3.2.6.tar.gz
 +
tar xvf EPACTS-3.2.6.tar.gz
 +
cd EPACTS-3.2.6
 +
./configure --prefix=$HOME/seqshop/epacts
 +
make
 +
make install
 +
cd ../..
 +
 +
{{SeqShopRemoteEnv}}
 +
 +
<ul>
 +
<li> Additional variables for EPACTS:</li>
 +
<ul>
 +
<div class="mw-collapsible" style="width:500px">
 +
<li>Using bash (replace the paths below with the appropriate paths):</li>
 +
<div class="mw-collapsible-content">
 +
:<pre>export EPACTS=~/seqshop/epacts</pre>
 +
</div>
 +
</div>
 +
<div class="mw-collapsible mw-collapsed" style="width:500px">
 +
<li>Using tcsh (replace the paths below with the appropriate paths):</li>
 +
<div class="mw-collapsible-content">
 +
:<pre>setenv EPACTS ~/seqshop/epacts</pre>
 +
</div>
 +
</div>
 +
</ul>
 +
</ul>
 +
 
</div>
 
</div>
 
</div>
 
</div>
Line 45: Line 108:  
Check the contents of the VCF file using the following command.
 
Check the contents of the VCF file using the following command.
   −
  zless $OUT/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz
+
  zless ${OUT}/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz
    
=== Phenotype Information ===
 
=== Phenotype Information ===
Line 51: Line 114:  
Phenotype information is prepared in PED format commonly used in other GWAS software such as MERLIN or PLINK.
 
Phenotype information is prepared in PED format commonly used in other GWAS software such as MERLIN or PLINK.
   −
  less $IN/assoc/seqshop.ped
+
  less ${SS}/assoc/seqshop.ped
    
The first several line should look like below.
 
The first several line should look like below.
Line 58: Line 121:  
View Data
 
View Data
 
<div class="mw-collapsible-content" style="width:800px">
 
<div class="mw-collapsible-content" style="width:800px">
  #FAM_ID IND_ID DAD_ID MOM_ID SEX     GENO    PHENO
+
  #FAM_ID IND_ID DAD_ID MOM_ID SEX PHENO
  HG00551 HG00551 0       0       0       1      0
+
  HG00551 HG00551 0 0 0 0
  HG00553 HG00553 0       0       0       0      0
+
  HG00553 HG00553 0 0 0 0
  HG00554 HG00554 0       0       0       0      0
+
  HG00554 HG00554 0 0 0 0
  HG00637 HG00637 0       0      0       0       0
+
  HG00637 HG00637 0 0 0 0
  HG00638 HG00638 0       0      0       0       0
+
  HG00638 HG00638 0 0 0 0
  HG00640 HG00640 0       0       0       1      1
+
  HG00640 HG00640 0 0 0 1
  HG00641 HG00641 0       0       0       2      1
+
  HG00641 HG00641 0 0 0 1
  HG00734 HG00734 0       0       0       1      1
+
  HG00734 HG00734 0 0 0 1
  HG00736 HG00736 0       0      0      0      0
+
  HG00736 HG00736 0 0 0 0
HG00737 HG00737 0      0      0       0       0
   
  ...
 
  ...
 
</div>
 
</div>
Line 101: Line 163:  
   
 
   
 
  mkdir --p $OUT/assoc
 
  mkdir --p $OUT/assoc
  $EPACTS/bin/epacts-anno --in $OUT/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz --out $OUT/assoc/snps.anno.vcf.gz
+
  $EPACTS/bin/epacts-anno --in $OUT/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz --out $OUT/assoc/snps.anno.vcf.gz --ref $SS/ref22/human.g1k.v37.chr22.fa
    
Then you will see a series of messages before annotation finishes.
 
Then you will see a series of messages before annotation finishes.
Line 109: Line 171:  
<div class="mw-collapsible-content" style="width:800px">
 
<div class="mw-collapsible-content" style="width:800px">
 
  /home/hmkang/seqshop/epacts/bin/anno -i /net/seqshop-server/hmkang/out/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz -r \
 
  /home/hmkang/seqshop/epacts/bin/anno -i /net/seqshop-server/hmkang/out/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz -r \
     /home/hmkang/seqshop/epacts/share/EPACTS/human_g1k_v37.fasta -f refGene -g /home/hmkang/seqshop/epacts/share/EPACTS/hg19_gencodeV14.txt.gz \
+
     /home/hmkang/seqshop/ref22/human_g1k_v37.chr22.fa -f refGene -g /home/hmkang/seqshop/epacts/share/EPACTS/hg19_gencodeV14.txt.gz \
 
   -c /home/hmkang/seqshop/epacts/share/EPACTS/codon.txt -o  /net/seqshop-server/hmkang/out/assoc/snps.anno.vcf.gz --inputFormat vcf \
 
   -c /home/hmkang/seqshop/epacts/share/EPACTS/codon.txt -o  /net/seqshop-server/hmkang/out/assoc/snps.anno.vcf.gz --inputFormat vcf \
 
   -p /home/hmkang/seqshop/epacts/share/EPACTS/priority.txt
 
   -p /home/hmkang/seqshop/epacts/share/EPACTS/priority.txt
Line 120: Line 182:  
           Gene Annotation
 
           Gene Annotation
 
             Parameters : -g [/home/hmkang/seqshop/epacts/share/EPACTS/hg19_gencodeV14.txt.gz]
 
             Parameters : -g [/home/hmkang/seqshop/epacts/share/EPACTS/hg19_gencodeV14.txt.gz]
                           -r [/home/hmkang/seqshop/epacts/share/EPACTS/human_g1k_v37.fasta]
+
                           -r [/home/hmkang/seqshop/ref22/human_g1k_v37.chr22.fa]
 
                           --inputFormat [vcf], --checkReference, -f [refGene]
 
                           --inputFormat [vcf], --checkReference, -f [refGene]
 
                           -p [/home/hmkang/seqshop/epacts/share/EPACTS/priority.txt]
 
                           -p [/home/hmkang/seqshop/epacts/share/EPACTS/priority.txt]
Line 126: Line 188:  
                           -u [], -d [], --se [], --si [], --outputFormat []
 
                           -u [], -d [], --se [], --si [], --outputFormat []
 
   Other Annotation Tools : --genomeScore [], --bed [], --tabix []
 
   Other Annotation Tools : --genomeScore [], --bed [], --tabix []
  Load reference genome /home/hmkang/seqshop/epacts/share/EPACTS/human_g1k_v37.fasta...
+
  Load reference genome /home/hmkang/seqshop/ref22/human_g1k_v37.chr22.fa...
  DONE: 84 chromosomes and 3101804739 bases are loaded.
+
  DONE: 1 chromosomes and 51304566 bases are loaded.
 
  Load codon file /home/hmkang/seqshop/epacts/share/EPACTS/codon.txt...
 
  Load codon file /home/hmkang/seqshop/epacts/share/EPACTS/codon.txt...
 
  DONE: codon file loaded.
 
  DONE: codon file loaded.
Line 189: Line 251:  
Let's run a single-variant association analysis using a score test.  
 
Let's run a single-variant association analysis using a score test.  
   −
  $EPACTS/bin/epacts-single --ped $IN/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/single --region 22:36000000-37000000 --test b.score --pheno PHENO --run 2  
+
  $EPACTS/bin/epacts-single --ped $SS/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/single --region 22:36000000-37000000 --test b.score --pheno PHENO --run 2  
    
After running it, you will see EPACTS output files by looking at
 
After running it, you will see EPACTS output files by looking at
Line 220: Line 282:  
View QQ plots
 
View QQ plots
 
<div class="mw-collapsible-content" style="width:800px">
 
<div class="mw-collapsible-content" style="width:800px">
 +
:<pre>evince $OUT/assoc/single.epacts.qq.pdf&</pre>
 
[[File:Single.epacts.qq.png]]
 
[[File:Single.epacts.qq.png]]
 
</div>
 
</div>
Line 227: Line 290:  
View Manhattan plots
 
View Manhattan plots
 
<div class="mw-collapsible-content" style="width:800px">
 
<div class="mw-collapsible-content" style="width:800px">
[[File:Single.epacts.mh.png]]
+
:<pre>evince $OUT/assoc/single.epacts.mh.pdf&</pre>
 +
[[File:Single.epacts.mh.png|900px]]
 
</div>
 
</div>
 
</div>
 
</div>
Line 238: Line 302:  
View Zoom Plots
 
View Zoom Plots
 
<div class="mw-collapsible-content" style="width:800px">
 
<div class="mw-collapsible-content" style="width:800px">
 +
:<pre>evince $OUT/assoc/single.zoom.22.36995620.pdf&</pre>
 
[[File:Single.zoom.22.36995620.png]]
 
[[File:Single.zoom.22.36995620.png]]
 
</div>
 
</div>
Line 248: Line 313:  
And run EMMAX test specifying the kinship matrix
 
And run EMMAX test specifying the kinship matrix
   −
  $EPACTS/bin/epacts-single --ped $IN/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/emmax --region 22:36000000-37000000 --test q.emmax --pheno PHENO --run 2 --kinf $OUT/assoc/snps.anno.kinf
+
  $EPACTS/bin/epacts-single --ped $SS/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/emmax --region 22:36000000-37000000 --test q.emmax --pheno PHENO --run 2 --kinf $OUT/assoc/snps.anno.kinf
    
Then the results may look similar to previous ones.
 
Then the results may look similar to previous ones.
    
  cat $OUT/assoc/emmax.epacts.top5000
 
  cat $OUT/assoc/emmax.epacts.top5000
      
== Run Groupwise Test ==
 
== Run Groupwise Test ==
Line 279: Line 343:  
If you want to run a collapsing burden test (CMC), run the following command
 
If you want to run a collapsing burden test (CMC), run the following command
   −
  $EPACTS/bin/epacts group --ped $IN/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/group.collapse --test b.collapse --groupf $OUT/assoc/snps.anno.grp --pheno PHENO --run 2  
+
  $EPACTS/bin/epacts group --ped $SS/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/group.collapse --test b.collapse --groupf $OUT/assoc/snps.anno.grp --pheno PHENO --run 2  
    
You can view the results by examining the output file
 
You can view the results by examining the output file
Line 303: Line 367:  
You can run SKAT-O test in a similar way, but with a special tag
 
You can run SKAT-O test in a similar way, but with a special tag
   −
  $EPACTS/bin/epacts group --ped $IN/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/group.skato --test skat --skat-o --groupf $OUT/assoc/snps.anno.grp --pheno PHENO --run 2
+
  $EPACTS/bin/epacts group --ped $SS/assoc/seqshop.ped --vcf $OUT/assoc/snps.anno.vcf.gz --out $OUT/assoc/group.skato --test skat --skat-o --groupf $OUT/assoc/snps.anno.grp --pheno PHENO --run 2
    
And view output files
 
And view output files
96

edits

Navigation menu