Changes

From Genome Analysis Wiki
1,856 bytes added ,  23:30, 26 October 2012
Line 141: Line 141:  
<br>  
 
<br>  
   −
=== Primary analyses  ===
+
=== Primary analyses (without BMI) ===
    
There are '''2''' separate association analyses to be completed '''without adjusting for BMI'''.  
 
There are '''2''' separate association analyses to be completed '''without adjusting for BMI'''.  
Line 222: Line 222:  
<br>  
 
<br>  
   −
<br>
+
<br>  
    
=== Analysis for QC  ===
 
=== Analysis for QC  ===
Line 248: Line 248:  
| A. DIAGRAMv4_iSNPs_XXX_1000G_KKK_SCR_YYY_ZZZ.epacts.gz
 
| A. DIAGRAMv4_iSNPs_XXX_1000G_KKK_SCR_YYY_ZZZ.epacts.gz
 
|}
 
|}
 +
 +
<br>
 +
 +
=== Secondary analyses (with BMI)  ===
 +
 +
There are '''2''' secondary analyses '''adjusting for BMI'''.
 +
 +
{| width="1650" border="1" align="left" cellpadding="1" cellspacing="1"
 +
|-
 +
! scope="col" |
 +
Association Analysis
 +
 +
! scope="col" |
 +
Statistical Test
 +
 +
! scope="col" |
 +
Subset of SNPs
 +
 +
! scope="col" |
 +
Output File Type
 +
 +
! scope="col" |
 +
Output Filename Format
 +
 +
|-
 +
|
 +
4. &nbsp;Typical DIAGRAM analysis using existing association pipeline (with BMI)
 +
 +
<br>
 +
 +
|
 +
Wald or likelihood ratio
 +
 +
|
 +
All SNPs with
 +
 +
MAC &gt;= 1
 +
 +
|
 +
Custom file
 +
 +
based on DIAGRAM format
 +
 +
|
 +
DIAGRAMv4_iSNPs_XXX_adjBMI_1000G_KKK_TTT_YYY_ZZZ.txt
 +
 +
<br>
 +
 +
|-
 +
|
 +
5. &nbsp;Analysis of low frequency variants using Firth bias-corrected logistic regression (with BMI)
 +
 +
<br>
 +
 +
|
 +
Firth bias-corrected
 +
 +
|
 +
SNPs with
 +
 +
200 &gt;= MAC &gt;= 1
 +
 +
|
 +
EPACTS output file
 +
 +
|
 +
DIAGRAMv4_iSNPs_XXX_adjBMI_1000G_KKK_FBC_YYY_ZZZ.epacts.gz
 +
 +
<br>
 +
 +
|}
 +
 +
<br>
 +
 +
<br>
 +
 +
<br>
 +
 +
<br>
 +
 +
<br>
 +
 +
<br>
 +
 +
<br>
 +
 +
'''Please send the 2 Primary analyses and the QC analysis when complete.'''
    
<br>  
 
<br>  
Line 277: Line 364:  
This is the typical DIAGRAM analysis using your current association pipeline and software. &nbsp; [[Image:1000Genomes march2012 imputation analysis plan 08312012.pdf]]  
 
This is the typical DIAGRAM analysis using your current association pipeline and software. &nbsp; [[Image:1000Genomes march2012 imputation analysis plan 08312012.pdf]]  
   −
==== Alternative: &nbsp;Analyze VCF and PED files using the Wald test with the EPACTS software: ====
+
==== Alternative: &nbsp;Analyze VCF and PED files using the Wald test with the EPACTS software: ====
   −
This will be a good way to check if your VCF and PED files are correctly formatted for EPACTS.
+
As preparation for the Firth test analysis, we encourage you to analyze the data using the Wald test first, since it is computationally faster. &nbsp;This is a good way to check whether your VCF and PED files are correctly formatted for EPACTS and to resolve any problems with your imputation or input files.  
 
<pre>epacts.v2.2.0.20121026 /epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
 
<pre>epacts.v2.2.0.20121026 /epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \
 
-test b.wald -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -field EC -run 10
 
-test b.wald -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -field EC -run 10
</pre>
+
</pre>  
'''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")!
+
'''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")!  
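Before launching a long run, it may help to confirm that the VCF actually carries the EC dosage field. The sketch below is not part of EPACTS: <code>vcf_has_format_field</code> is a hypothetical helper that inspects the FORMAT column (column 9) of the first data line, and <code>sample_vcf</code> prints a made-up two-field record purely for demonstration.

```shell
# Hypothetical helper (not part of EPACTS): reads a VCF on stdin and succeeds
# if the FORMAT column (column 9) of the first data line lists the given field.
vcf_has_format_field() {
  field="$1"
  awk -v f="$field" -F'\t' '
    /^#/ { next }                       # skip meta and header lines
    {
      n = split($9, keys, ":")          # FORMAT keys are colon-separated
      for (i = 1; i <= n; i++) if (keys[i] == f) found = 1
      exit                              # inspect only the first data line
    }
    END { exit(found ? 0 : 1) }
  '
}

# Made-up demonstration record; real input would come from your imputed VCF.
sample_vcf() {
  printf '##fileformat=VCFv4.1\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
  printf '20\t100\trs1\tA\tG\t.\tPASS\t.\tGT:EC\t0/1:0.93\n'
}

if sample_vcf | vcf_has_format_field EC; then
  echo "EC present"      # prints "EC present" for the sample record
else
  echo "EC missing"
fi
```

For a real file, something like <code>zcat [INPUT VCF FILENAME] | vcf_has_format_field EC</code> would apply the same check, assuming the file is gzip-compressed.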
    
=== 2. Analysis of low frequency variants using Firth bias-corrected logistic regression  ===
 
=== 2. Analysis of low frequency variants using Firth bias-corrected logistic regression  ===
Line 293: Line 380:  
-test b.firth -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -max-mac 200  -field EC -run 10
 
-test b.firth -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -max-mac 200  -field EC -run 10
 
</pre>  
 
</pre>  
<br>'''Important:''' &nbsp;To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. &nbsp;Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")!<br>
+
'''Important:''' &nbsp;To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. &nbsp;Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")!  
    
=== 3. Analysis of chromosome 20 using logistic regression score test  ===
 
=== 3. Analysis of chromosome 20 using logistic regression score test  ===
Line 303: Line 390:  
-test b.score -pheno DISEASE -cov AGE -chr 20 -anno -min-mac 1 -field EC -run 10
 
-test b.score -pheno DISEASE -cov AGE -chr 20 -anno -min-mac 1 -field EC -run 10
 
</pre>  
 
</pre>  
This command will run single variant analysis using the score test logistic regression on the DISEASE phenotype adjusting for AGE. Add the relevant additional covariates with additional "-cov" options. This assumes that the VCF files are separated by chromosomes (option -sepchr). All variants with at least one minor allele count will be analyzed (option -min-mac 1). It will annotate results by functional category (option -anno) and run the analysis on 10 parallel CPUs (option -run 10).
+
This command will run single variant analysis using the score test logistic regression on the DISEASE phenotype adjusting for AGE. Add the relevant additional covariates with additional "-cov" options. This assumes that the VCF files are separated by chromosomes (option -sepchr). All variants with at least one minor allele count will be analyzed (option -min-mac 1). It will annotate results by functional category (option -anno) and run the analysis on 10 parallel CPUs (option -run 10).  
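As a small illustration of adding covariates, the sketch below turns a list of PED column names into repeated "-cov" arguments. The helper name <code>cov_args</code> is hypothetical, and SEX and PC1 are example column names that are assumptions; only AGE appears in the command above. The flag is spelled as it appears in the command block on this page.

```shell
# Hypothetical helper: build one "-cov NAME" argument per covariate column.
cov_args() {
  out=""
  for name in "$@"; do
    # Append with a separating space, except before the first entry.
    out="${out:+$out }-cov $name"
  done
  printf '%s\n' "$out"
}

# AGE comes from the command above; SEX and PC1 are assumed example columns.
cov_args AGE SEX PC1     # prints: -cov AGE -cov SEX -cov PC1
```

The resulting string can be pasted into the command in place of the single "-cov AGE" option.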
 +
 
 +
=== 4. Typical DIAGRAM analysis using existing association pipeline (with BMI)<br>  ===
 +
 
 +
This is the typical DIAGRAM analysis using your current association pipeline and software, with BMI included as an adjustment. &nbsp; [[Image:1000Genomes march2012 imputation analysis plan 08312012.pdf]]
 +
 
 +
==== Alternative: &nbsp;Analyze VCF and PED files using the Wald test with the EPACTS software:  ====
 +
 
 +
<br>
 +
 
 +
=== 5. Analysis of low frequency variants using Firth bias-corrected logistic regression (with BMI) ===
 +
 
 +
Again, use the Firth test in EPACTS for the analysis adjusting for BMI.
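One way to picture the BMI-adjusted Firth run is the dry-run sketch below, which prints the command rather than executing it. It assumes the Firth command starts the same way as the Wald command shown on this page, that the PED file has a covariate column named BMI, and that <code>EPACTS</code>, <code>VCF</code>, <code>PED</code> and <code>OUT</code> are placeholder variables you would set yourself; none of these names are confirmed by the page.

```shell
# Placeholder values (assumptions): replace with your own paths and prefix.
EPACTS="${EPACTS:-epacts}"
VCF="${VCF:-input.vcf.gz}"
PED="${PED:-input.ped}"
OUT="${OUT:-out_prefix}"

# Print the section 2 Firth options with one extra "-cov BMI" argument.
# BMI is an assumed PED column name; flags are spelled as on this page.
firth_bmi_cmd() {
  printf '%s single -vcf %s -ped %s -out %s ' "$EPACTS" "$VCF" "$PED" "$OUT"
  printf '%s\n' "-test b.firth -pheno DISEASE -cov AGE -cov BMI -sepchr -anno -min-mac 1 -max-mac 200 -field EC -run 10"
}

firth_bmi_cmd   # review the printed command before running it for real
```

Printing first makes it easy to confirm that the only difference from the analysis without BMI is the added covariate.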
    
== 5. &nbsp;Report EPACTS results<br>  ==
 
== 5. &nbsp;Report EPACTS results<br>  ==