Line 141: |
Line 141: |
| <br> | | <br> |
| | | |
− | === Primary analyses === | + | === Primary analyses (without BMI) === |
| | | |
| There are '''2''' separate association analyses to be completed '''without adjusting for BMI'''. | | There are '''2''' separate association analyses to be completed '''without adjusting for BMI'''. |
Line 222: |
Line 222: |
| <br> | | <br> |
| | | |
− | <br> | + | <br> |
| | | |
| === Analysis for QC === | | === Analysis for QC === |
Line 248: |
Line 248: |
| | A. DIAGRAMv4_iSNPs_XXX_1000G_KKK_SCR_YYY_ZZZ.epacts.gz | | | A. DIAGRAMv4_iSNPs_XXX_1000G_KKK_SCR_YYY_ZZZ.epacts.gz |
| |} | | |} |
| + | |
| + | <br> |
| + | |
| + | === Secondary analyses (with BMI) === |
| + | |
| + | There are '''2''' secondary analyses '''adjusting for BMI'''. |
| + | |
| + | {| width="1650" border="1" align="left" cellpadding="1" cellspacing="1" |
| + | |- |
| + | ! scope="col" | |
| + | Association Analysis |
| + | |
| + | ! scope="col" | |
| + | Statistical Test |
| + | |
| + | ! scope="col" | |
| + | Subset of SNPs |
| + | |
| + | ! scope="col" | |
| + | Output File Type |
| + | |
| + | ! scope="col" | |
| + | Output Filename Format |
| + | |
| + | |- |
| + | | |
| + | 4. Typical DIAGRAM analysis using existing association pipeline (with BMI) |
| + | |
| + | <br> |
| + | |
| + | | |
| + | Wald or likelihood ratio |
| + | |
| + | | |
| + | All SNPs with |
| + | |
| + | MAC >= 1 |
| + | |
| + | | |
| + | Custom file |
| + | |
| + | based on DIAGRAM format |
| + | |
| + | | |
| + | DIAGRAMv4_iSNPs_XXX_adjBMI_1000G_KKK_TTT_YYY_ZZZ.txt |
| + | |
| + | <br> |
| + | |
| + | |- |
| + | | |
| + | 5. Analysis of low frequency variants using Firth bias-corrected logistic regression (with BMI) |
| + | |
| + | <br> |
| + | |
| + | | |
| + | Firth bias-corrected |
| + | |
| + | | |
| + | SNPs with |
| + | |
| + | 200 >= MAC >= 1 |
| + | |
| + | | |
| + | EPACTS output file |
| + | |
| + | | |
| + | DIAGRAMv4_iSNPs_XXX_adjBMI_1000G_KKK_FBC_YYY_ZZZ.epacts.gz |
| + | |
| + | <br> |
| + | |
| + | |} |
| + | |
| + | <br> |
| + | |
| + | <br> |
| + | |
| + | <br> |
| + | |
| + | <br> |
| + | |
| + | <br> |
| + | |
| + | <br> |
| + | |
| + | <br> |
| + | |
| + | '''Please send the 2 Primary analyses and the QC analysis when complete.''' |
| | | |
| <br> | | <br> |
Line 277: |
Line 364: |
| This is the typical DIAGRAM analysis using your current association pipeline and software. [[Image:1000Genomes march2012 imputation analysis plan 08312012.pdf]] | | This is the typical DIAGRAM analysis using your current association pipeline and software. [[Image:1000Genomes march2012 imputation analysis plan 08312012.pdf]] |
| | | |
− | ==== Alternative: Analyze VCF and PED files using the Wald test with the EPACTS software: ==== | + | ==== Alternative: Analyze VCF and PED files using the Wald test with the EPACTS software: ==== |
| | | |
− | This will be a good way to check if your VCF and PED files are correctly formatted for EPACTS. | + | As preparation for the Firth test analysis, we encourage you to analyze the data using the Wald test first, since it is computationally faster. This is a good way to check whether your VCF and PED files are correctly formatted for EPACTS and to resolve any problems with your imputation or input files. |
| <pre>epacts.v2.2.0.20121026 /epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \ | | <pre>epacts.v2.2.0.20121026 /epacts single -vcf [INPUT VCF FILENAME] -ped [INPUT PED FILENAME] -out [OUTPUT FILENAME PREFIX] \ |
| -test b.wald -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -field EC -run 10 | | -test b.wald -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -field EC -run 10 |
− | </pre> | + | </pre> |
− | '''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")! | + | '''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")! |
| | | |
| === 2. Analysis of low frequency variants using Firth bias-corrected logistic regression === | | === 2. Analysis of low frequency variants using Firth bias-corrected logistic regression === |
Line 293: |
Line 380: |
| -test b.firth -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -max-mac 200 -field EC -run 10 | | -test b.firth -pheno DISEASE -cov AGE -sepchr -anno -min-mac 1 -max-mac 200 -field EC -run 10 |
| </pre> | | </pre> |
− | <br>'''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")!<br>
| + | '''Important:''' To analyze dosages (not genotypes), you must specify the dosage field with the "--field EC" option. Without this option, you will be analyzing the hard genotypes (i.e. --field option defaults to "GT" or "genotypes")! |
| | | |
| === 3. Analysis of chromosome 20 using logistic regression score test === | | === 3. Analysis of chromosome 20 using logistic regression score test === |
Line 303: |
Line 390: |
| -test b.score -pheno DISEASE -cov AGE -chr 20 -anno -min-mac 1 -field EC -run 10 | | -test b.score -pheno DISEASE -cov AGE -chr 20 -anno -min-mac 1 -field EC -run 10 |
| </pre> | | </pre> |
− | This command will run single variant analysis using the score test logistic regression on the DISEASE phenotype adjusting for AGE. Add the relevant additional covariates with additional "-cov" options. This assumes that the VCF files are separated by chromosomes (option -sepchr). All variants with at least one minor allele count will be analyzed (option -min-mac 1). It will annotate results by functional category (option -anno) and run the analysis on 10 parallel CPUs (option -run 10). | + | This command will run single variant analysis using the score test logistic regression on the DISEASE phenotype adjusting for AGE. Add the relevant additional covariates with additional "-cov" options. This assumes that the VCF files are separated by chromosomes (option -sepchr). All variants with at least one minor allele count will be analyzed (option -min-mac 1). It will annotate results by functional category (option -anno) and run the analysis on 10 parallel CPUs (option -run 10). |
| + | |
| + | === 4. Typical DIAGRAM analysis using existing association pipeline (with BMI) === |
| + | |
| + | This is the typical DIAGRAM analysis using your current association pipeline and software, with an additional adjustment for BMI. [[Image:1000Genomes march2012 imputation analysis plan 08312012.pdf]] |
| + | |
| + | ==== Alternative: Analyze VCF and PED files using the Wald test with the EPACTS software: ==== |
| + | |
| + | <br> |
| + | |
| + | === 5. Analysis of low frequency variants using Firth bias-corrected logistic regression (with BMI) === |
| + | |
| + | Again, use the Firth test in EPACTS for this analysis, adding BMI as a covariate. |
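
As a minimal sketch of the BMI-adjusted Firth run, the command from analysis 2 can be reused with one additional "-cov" option. The snippet below only prints the command (a dry run) so that you can inspect it before launching EPACTS. The covariate column name BMI, the EPACTS variable, and the file names are assumptions made for illustration; replace them with the values for your study.

```shell
#!/bin/sh
# Dry-run sketch: prints the EPACTS command for the BMI-adjusted Firth analysis.
# Assumptions (not taken from the analysis plan): the PED file contains a
# covariate column named BMI, and the paths below are placeholders.
EPACTS="${EPACTS:-epacts}"            # path to the epacts executable (placeholder)
VCF="${VCF:-input.vcf.gz}"            # input VCF filename (placeholder)
PED="${PED:-input.ped}"               # input PED filename (placeholder)
OUT="${OUT:-output_prefix_adjBMI}"    # output filename prefix (placeholder)

build_firth_bmi_cmd() {
    # Same options as the Firth command in analysis 2, plus "-cov BMI".
    echo "$EPACTS single -vcf $VCF -ped $PED -out $OUT" \
         "-test b.firth -pheno DISEASE -cov AGE -cov BMI" \
         "-sepchr -anno -min-mac 1 -max-mac 200 -field EC -run 10"
}

build_firth_bmi_cmd
```

Once the placeholders are set, the printed line can be pasted into your shell. As in the other EPACTS analyses, keep "-field EC" so that dosages rather than hard genotypes are analyzed.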
| | | |
| == 5. Report EPACTS results == | | == 5. Report EPACTS results == |