Changes

From Genome Analysis Wiki
50 bytes removed ,  19:35, 1 December 2012
no edit summary
Line 40: Line 40:     
Once installed, test out the software by running a quick example using the test data provided in the "example" directory. The example VCF and PED files are:  
 
Once installed, test out the software by running a quick example using the test data provided in the "example" directory. The example VCF and PED files are:  
<pre>$ epacts.v2.2.0.20121026/example/1000G_exome_chr20_example_softFiltered.calls.vcf.gz
+
<pre>$ EPACTS-3.0.0/example/1000G_exome_chr20_example_softFiltered.calls.vcf.gz
   −
$ epacts.v2.2.0.20121026/example/1000G_dummy_pheno.ped
+
$ EPACTS-3.0.0/example/1000G_dummy_pheno.ped
 
</pre>  
 
</pre>  
 
<br> Run the single variant score test on the example data using this command:  
 
<br> Run the single variant score test on the example data using this command:  
Line 53: Line 53:  
This command will run the single variant test on the input VCF and PED files, with a minimum MAF threshold of 0.001. &nbsp;The phenotype is "DISEASE" and we are adjusting the analysis with covariates AGE and SEX. &nbsp;The output file directory prefix is {OUTPUT_DIR}/test. &nbsp;Finally, EPACTS will run the analysis in parallel on 2 CPUs.  
 
This command will run the single variant test on the input VCF and PED files, with a minimum MAF threshold of 0.001. &nbsp;The phenotype is "DISEASE" and we are adjusting the analysis with covariates AGE and SEX. &nbsp;The output file directory prefix is {OUTPUT_DIR}/test. &nbsp;Finally, EPACTS will run the analysis in parallel on 2 CPUs.  
   −
A more detailed description of the example can be found [http://genome.sph.umich.edu/wiki/Test_EPACTS_for_DIAGRAM here].
+
A more detailed description of the example can be found [http://genome.sph.umich.edu/wiki/Test_EPACTS_for_DIAGRAM here].  
    
== 2. &nbsp;Prepare VCF file with genotypes / dosages  ==
 
== 2. &nbsp;Prepare VCF file with genotypes / dosages  ==
Line 61: Line 61:  
=== A. &nbsp;Convert dosage file into VCF format  ===
 
=== A. &nbsp;Convert dosage file into VCF format  ===
   −
Use the wrapper program "dose2vcf" to convert your dosage output to pseudo VCF format. &nbsp;Download the tool from [http://www.sph.umich.edu/csg/cfuchsb/dose2vcf_v0.5.gz here]. If you used rs numbers during imputation, you can find mapping tables ready for dose2vcf [http://www.sph.umich.edu/csg/cfuchsb/mapping_rs_ALL.GIANT.phase1_release_v3.20101123.tgz here (214 Mb) ]  
+
Use the wrapper program "dose2vcf" to convert your dosage output to pseudo VCF format. &nbsp;Download the tool from [http://www.sph.umich.edu/csg/cfuchsb/dose2vcf_v0.5.gz here]. If you used rs numbers during imputation, you can find mapping tables ready for dose2vcf [http://www.sph.umich.edu/csg/cfuchsb/mapping_rs_ALL.GIANT.phase1_release_v3.20101123.tgz here (214 Mb) ]  
    
<br>  
 
<br>  
Line 74: Line 74:     
</pre>  
 
</pre>  
Note that for longer chromosomes, the program is quite memory intensive. &nbsp;In this case, please convert dosages in shorter sections of the chromosome. &nbsp;For example, if the imputation was performed by sections, then convert these sections to vcf first, and then merge the vcf files together using vcftools [http://vcftools.sourceforge.net/docs.html#concat vcf-concat]:
+
Note that for longer chromosomes, the program is quite memory intensive. &nbsp;In this case, please convert dosages in shorter sections of the chromosome. &nbsp;For example, if the imputation was performed by sections, then convert these sections to vcf first, and then merge the vcf files together using vcftools [http://vcftools.sourceforge.net/docs.html#concat vcf-concat]:  
    
=== B. &nbsp;bgzip and tabix VCF files  ===
 
=== B. &nbsp;bgzip and tabix VCF files  ===
Line 82: Line 82:  
tabix -pvcf -f input.vcf.gz ## this command will produce input.vcf.gz.tbi
 
tabix -pvcf -f input.vcf.gz ## this command will produce input.vcf.gz.tbi
 
</pre>  
 
</pre>  
If the VCF files are separated by chromosome, the name of the chromosome 1 file must contain the string "chr1", and the files for the other chromosomes must contain the corresponding chromosome name.<br>Sample IDs in the VCF file must be consistent with those in the PED file.
+
If the VCF files are separated by chromosome, the name of the chromosome 1 file must contain the string "chr1", and the files for the other chromosomes must contain the corresponding chromosome name.<br>Sample IDs in the VCF file must be consistent with those in the PED file.  
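
As a rough way to check this, the sketch below lists sample IDs that appear in the VCF header but not in the PED file. It assumes the individual ID is the second column of the PED file and that PED header lines start with "#". The function name <code>missing_in_ped</code> is hypothetical and is not part of EPACTS.

```shell
# Sketch: print sample IDs found in the VCF header but missing from the PED file.
# Assumptions: PED individual IDs are in column 2; PED header lines start with '#'.
missing_in_ped() {
    vcf_gz="$1"
    ped="$2"
    tmp_vcf=$(mktemp)
    tmp_ped=$(mktemp)

    # Sample IDs are columns 10 and onward of the #CHROM header line.
    gzip -dc "$vcf_gz" \
        | awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }' \
        | sort -u > "$tmp_vcf"

    # Individual IDs from the PED file, skipping header/comment lines.
    awk '!/^#/ && NF >= 2 { print $2 }' "$ped" | sort -u > "$tmp_ped"

    # Lines unique to the first file: IDs in the VCF with no PED entry.
    comm -23 "$tmp_vcf" "$tmp_ped"

    rm -f "$tmp_vcf" "$tmp_ped"
}
```

Running <code>missing_in_ped input.vcf.gz input.ped</code> should print nothing when every sample in the VCF has a matching PED entry.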
    
== 3. &nbsp;Prepare PED file for phenotypes and covariates  ==
 
== 3. &nbsp;Prepare PED file for phenotypes and covariates  ==
Line 131: Line 131:  
<pre>A DISEASE
 
<pre>A DISEASE
 
T QT
 
T QT
C AGE</pre>
+
C AGE</pre>  
Key: &nbsp;A =&nbsp;binary trait; T = quantitative trait; C = covariate<br>
+
Key: &nbsp;A =&nbsp;binary trait; T = quantitative trait; C = covariate<br>  
    
== 4. &nbsp;Run EPACTS association pipeline  ==
 
== 4. &nbsp;Run EPACTS association pipeline  ==
    
For detailed description of options, use:  
 
For detailed description of options, use:  
<pre>epacts.v2.2.0.20121026/epacts single -man
+
<pre>EPACTS-3.0.0/epacts single -man
 
</pre>  
 
</pre>  
 
<br>  
 
<br>  
   −
=== Primary analyses (without BMI) [please submit as soon as the analysis is complete] ===
+
=== Primary analyses (without BMI) [please submit as soon as the analysis is complete] ===
    
There are '''2''' separate association analyses to be completed '''without adjusting for BMI'''.  
 
There are '''2''' separate association analyses to be completed '''without adjusting for BMI'''.  
Line 222: Line 222:  
<br>  
 
<br>  
   −
<br>
+
<br>  
   −
=== Analysis for QC&nbsp;[please submit as soon as the analysis is complete] ===
+
=== Analysis for QC&nbsp;[please submit as soon as the analysis is complete] ===
    
For quality control, please run an additional analysis using EPACTS on all SNPs for chromosome 20 only using the '''SCORE''' test without BMI adjustment. &nbsp;These results will be used to compare with results from the primary analyses, to ensure the new EPACTS software has been run correctly.  
 
For quality control, please run an additional analysis using EPACTS on all SNPs for chromosome 20 only using the '''SCORE''' test without BMI adjustment. &nbsp;These results will be used to compare with results from the primary analyses, to ensure the new EPACTS software has been run correctly.  
Line 249: Line 249:  
|}
 
|}
   −
<br>
+
<br>  
   −
=== Secondary analyses (with BMI)&nbsp;[please submit as soon as the analysis is complete] ===
+
=== Secondary analyses (with BMI)&nbsp;[please submit as soon as the analysis is complete] ===
    
There are '''2'''&nbsp;secondary analyses'''&nbsp;adjusting for BMI'''.  
 
There are '''2'''&nbsp;secondary analyses'''&nbsp;adjusting for BMI'''.  
Line 371: Line 371:     
As preparation for the Firth test analysis, we encourage you to analyze the data using the Wald test first, since it is computationally much faster. &nbsp;This will be a good way to check if your VCF and PED files for every chromosome are correctly formatted for EPACTS and resolve any problems you may have with your imputation or input files.  
 
As preparation for the Firth test analysis, we encourage you to analyze the data using the Wald test first, since it is computationally much faster. &nbsp;This will be a good way to check if your VCF and PED files for every chromosome are correctly formatted for EPACTS and resolve any problems you may have with your imputation or input files.  
<pre>epacts.v2.2.0.20121026 /epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
+
<pre>EPACTS-3.0.0/epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
 
-test b.wald -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -field EC -run 10
 
-test b.wald -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -field EC -run 10
 
</pre>  
 
</pre>  
'''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. the --field option defaults to "GT", the genotype field)!
+
'''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. the --field option defaults to "GT", the genotype field)!  
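
Before launching a long run, it may help to confirm that the dosage field is actually present in the VCF. The sketch below inspects the FORMAT column of the first variant record only, on the assumption that all records share the same FORMAT. The function name <code>has_format_field</code> is hypothetical and is not part of EPACTS.

```shell
# Sketch: succeed (exit 0) if the first variant record's FORMAT column lists the field.
# Assumption: every record in the file uses the same FORMAT as the first one.
has_format_field() {
    vcf_gz="$1"
    field="$2"
    gzip -dc "$vcf_gz" | awk -F'\t' -v f="$field" '
        !/^#/ {
            # FORMAT is column 9; its keys are separated by ":".
            n = split($9, keys, ":")
            for (i = 1; i <= n; i++) if (keys[i] == f) found = 1
            exit            # stop after the first record; END still runs
        }
        END { exit(found ? 0 : 1) }'
}
```

For example, <code>has_format_field input.vcf.gz EC || echo "EC field not found"</code> would print a warning when the dosage field is absent.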
    
=== B. Analysis of low frequency variants using Firth bias-corrected logistic regression  ===
 
=== B. Analysis of low frequency variants using Firth bias-corrected logistic regression  ===
Line 381: Line 381:     
To run the Firth test using the EPACTS software:  
 
To run the Firth test using the EPACTS software:  
<pre>epacts.v2.2.0.20121026 /epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
+
<pre>EPACTS-3.0.0/epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
 
-test b.firth -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -max-mac 200  -field EC -run 10
 
-test b.firth -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -max-mac 200  -field EC -run 10
 
</pre>  
 
</pre>  
Line 391: Line 391:     
The EPACTS command for the score test analysis of chromosome 20 is:  
 
The EPACTS command for the score test analysis of chromosome 20 is:  
<pre>epacts.v2.2.0.20121026 /epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
+
<pre>EPACTS-3.0.0/epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
 
-test b.score -pheno DISEASE -cov AGE -chr 20 -anno -min-mac 1 -field EC -run 10
 
-test b.score -pheno DISEASE -cov AGE -chr 20 -anno -min-mac 1 -field EC -run 10
 
</pre>  
 
</pre>  
Line 398: Line 398:  
=== D. Typical DIAGRAM analysis using existing association pipeline (with BMI)<br>  ===
 
=== D. Typical DIAGRAM analysis using existing association pipeline (with BMI)<br>  ===
   −
This is the typical DIAGRAM analysis using your current association pipeline and software including BMI adjustment.
+
This is the typical DIAGRAM analysis using your current association pipeline and software including BMI adjustment.  
    
'''Alternative: &nbsp;Analyze VCF and PED files using the Wald test with the EPACTS software:'''  
 
'''Alternative: &nbsp;Analyze VCF and PED files using the Wald test with the EPACTS software:'''  
   −
<br>
+
<br>  
    
=== E. Analysis of low frequency variants using Firth bias-corrected logistic regression (with BMI)  ===
 
=== E. Analysis of low frequency variants using Firth bias-corrected logistic regression (with BMI)  ===
   −
Again, use the Firth test in EPACTS for your analysis with BMI adjustment.
+
Again, use the Firth test in EPACTS for your analysis with BMI adjustment.  
    
== 5. &nbsp;Report EPACTS results<br>  ==
 
== 5. &nbsp;Report EPACTS results<br>  ==