Line 6: |
Line 6: |
| | | |
| == Latest ChangeLog == | | == Latest ChangeLog == |
− | * February 28th, 2013 : EPACTS v3.2.0 release | + | * Dec 15th, 2016 : EPACTS v3.3.0 release (github) |
− | ** R package installation bug (for some users) was fixed | + | ** Moved the repository into github |
− | ** A bug in the MAF error for high frequency variants (AF>0.25) was now fixed | + | ** Some major fixes in handling large sample size (>18,000) |
− | ** SKAT version is updated to 0.81 | + | ** Other minor bug fixes |
− | ** --bprange option is added to allow testing for small region size | + | * July 10th, 2014 : EPACTS v3.2.6 release |
− | ** Additional minor bug fixes | + | ** Minor bug fix in epacts-make-kin |
| + | * March 11th, 2014 : EPACTS v3.2.5 release |
| + | ** EMMAX-SKAT is implemented with major bug fix |
| + | * November 21st, 2013 : EPACTS v3.2.4 release |
| + | ** Fixed a number of minor bugs (more comprehensive fix is still pending) |
| + | * March 25th, 2013 : EPACTS v3.2.3 release |
| + | ** Relaxed the checking of low-rank matrix in SKAT tests (to avoid unnecessary skipping of genes) |
| + | * March 13th, 2013 : EPACTS v3.2.2 release |
| + | ** Fixed an error which occasionally reported mismatches in the number of samples |
| + | * March 9th, 2013 : EPACTS v3.2.1 release |
| + | ** Fixed errors in loading the dynamic library |
| + | ** Fixed errors in SKAT-O (thanks to Anubha Mahajan and Jason Flannick) |
| + | ** Fixed bugs in emmax-CMC |
| + | ** Added emmax-SKAT (contributed by Seunggeun Lee) |
| + | ** And additional minor bug fixes |
| See [[#Full ChangeLog]] for full details | | See [[#Full ChangeLog]] for full details |
| | | |
Line 38: |
Line 52: |
| == Obtaining EPACTS == | | == Obtaining EPACTS == |
| | | |
− | The official release of EPACTS software is available at http://www.sph.umich.edu/csg/kang/epacts/ . | + | * The official release of EPACTS software is available at https://github.com/statgen/EPACTS |
− | From the CSG cluster, it is available at /net/fantasia/home/bin/epacts/ | + | ** From the CSG cluster, it is available at /net/fantasia/home/bin/epacts/ |
| + | * Note that R (version 2.10 or higher) and gnuplot (version 4.2 or higher) must be installed in order to run EPACTS correctly. |
| | | |
| == Currently Supported Statistical Tests == | | == Currently Supported Statistical Tests == |
Line 57: |
Line 72: |
| | Implemented by | | | Implemented by |
| |- | | |- |
− | | b.glm | + | | b.wald |
| | Binary | | | Binary |
| | YES <br> (Joint) | | | YES <br> (Joint) |
Line 77: |
Line 92: |
| | Firth Bias-Corrected Logistic Likelihood Ratio Test | | | Firth Bias-Corrected Logistic Likelihood Ratio Test |
| | Clement Ma | | | Clement Ma |
| + | |- |
| + | | b.spa2 |
| + | | Binary |
| + | | YES <br> |
| + | | Moderate |
| + | | Saddlepoint Approximation Method |
| + | | Shawn Lee & Rounak Dey |
| |- | | |- |
| | b.lrt | | | b.lrt |
Line 98: |
Line 120: |
| | Linear Wald Test | | | Linear Wald Test |
| | Hyun Min Kang <br> (as implemented in lm in R) | | | Hyun Min Kang <br> (as implemented in lm in R) |
− | |-
| |
− | | q.score
| |
− | | Quantitative
| |
− | | YES <br> (Regressed Out)
| |
− | | Fast
| |
− | | Quantitative Score Test <br> (from Lin DY and Tang ZZ, AJHG 2011 89:354-67)
| |
− | | Clement Ma
| |
| |- | | |- |
| | q.linear | | | q.linear |
Line 184: |
Line 199: |
| |- | | |- |
| | skat | | | skat |
− | | Quantitative | + | | Binary/Quantitative |
| | YES <br> (Joint Estimation) | | | YES <br> (Joint Estimation) |
| | Slow | | | Slow |
Line 191: |
Line 206: |
| |- | | |- |
| | VT | | | VT |
− | | Variable Threshold Test <br> with adaptive permutation | + | | Binary/Quantitative |
| | YES <br> (Regressed out first) | | | YES <br> (Regressed out first) |
| | Slow | | | Slow |
− | | Price et al, AJHG (2010) 86:832-8 | + | | Variable Threshold Test <br> with adaptive permutation <br> Price et al, AJHG (2010) 86:832-8 |
| + | | Hyun Min Kang |
| + | |- |
| + | | emmaxCMC |
| + | | Binary/Quantitative |
| + | | YES <br> (Regressed Out First) |
| + | | Slow |
| + | | Collapsing burden test using EMMAX |
| | Hyun Min Kang | | | Hyun Min Kang |
| |- | | |- |
| | emmaxVT | | | emmaxVT |
− | | Quantitative | + | | Binary/Quantitative |
| | YES <br> (Regressed Out First) | | | YES <br> (Regressed Out First) |
| | Slow | | | Slow |
| | Variable-threshold burden test using EMMAX | | | Variable-threshold burden test using EMMAX |
| | Hyun Min Kang | | | Hyun Min Kang |
| + | |- |
| + | | mmskat |
| + | | Quantitative |
| + | | YES <br> (Regressed Out First) |
| + | | Slow |
| + | | SKAT test using EMMAX |
| + | | Seunggeun Lee & Hyun Min Kang |
| |} | | |} |
| | | |
Line 209: |
Line 238: |
| If you want to use EPACTS on an Ubuntu platform, follow the steps below | | If you want to use EPACTS on an Ubuntu platform, follow the steps below |
| | | |
− | *Download EPACTS source distribution at http://www.sph.umich.edu/csg/kang/epacts/download/EPACTS-3.2.0.tar.gz (100MB)
| + | $ git clone https://github.com/statgen/EPACTS.git |
− | *Uncompress EPACTS package, and install the package using the following set of commands
| + | $ cd EPACTS |
| + | $ ./configure --prefix [/path/to/install] |
| + | $ make |
| + | $ make install |
| | | |
− | tar xzvf EPACTS-3.2.0.tar.gz
| |
− | cd EPACTS-3.2.0
| |
− | ./configure --prefix=/path/to/install
| |
− | make
| |
− | make install
| |
| | | |
| (Important Note: '''make sure to specify --prefix=/path/to/install''' to avoid installing to the default path /usr/local/, where you may not have write permission. /home/your_userid/epacts is a reasonable choice if you are not sure where to install.) | | (Important Note: '''make sure to specify --prefix=/path/to/install''' to avoid installing to the default path /usr/local/, where you may not have write permission. /home/your_userid/epacts is a reasonable choice if you are not sure where to install.) |
Line 234: |
Line 261: |
| To use EPACTS on the CSG cluster, you do not need to install it. You can directly use, or make a copy of, the in-house release version at | | To use EPACTS on the CSG cluster, you do not need to install it. You can directly use, or make a copy of, the in-house release version at |
| | | |
− | /net/fantasia/home/hmkang/bin/epacts/ | + | /net/fantasia/home/hmkang/tools/epacts-3.3.0/bin/epacts/ |
| + | |
| + | * If you want to access previous versions, visit http://csg-old.sph.umich.edu/kang/epacts/download |
| | | |
| == Getting Started With Examples == | | == Getting Started With Examples == |
| If you are using EPACTS from the CSG cluster, please set the following environment variable | | If you are using EPACTS from the CSG cluster, please set the following environment variable |
− | EPACTS_DIR=/net/fantasia/home/hmkang/bin/epacts (in bash) | + | EPACTS_DIR=/net/fantasia/home/hmkang/tools/epacts-3.3.0/bin/epacts (in bash) |
− | setenv EPACTS_DIR /net/fantasia/home/hmkang/bin/epacts (in csh) | + | setenv EPACTS_DIR /net/fantasia/home/hmkang/tools/epacts-3.3.0/bin/epacts (in csh) |
| | | |
| If you downloaded the EPACTS binary, please set EPACTS_DIR to the full path of the downloaded and uncompressed directory. | | If you downloaded the EPACTS binary, please set EPACTS_DIR to the full path of the downloaded and uncompressed directory. |
Line 296: |
Line 325: |
| 20 1616892 1616892 20:1616892_A/G_Synonymous:SIRPG 266 144 1 0.27068 0.0051239 2.7991 145 121 0.63449 0.42975 | | 20 1616892 1616892 20:1616892_A/G_Synonymous:SIRPG 266 144 1 0.27068 0.0051239 2.7991 145 121 0.63449 0.42975 |
| 20 25038372 25038372 20:25038372_G/A_Intron:ACSS1 266 103.3 1 0.19418 0.005748 2.7618 145 121 0.47201 0.28813 | | 20 25038372 25038372 20:25038372_G/A_Intron:ACSS1 266 103.3 1 0.19418 0.005748 2.7618 145 121 0.47201 0.28813 |
| + | |
| + | The key columns represent: |
| + | * '''NS''' : Number of phenotyped samples with non-missing genotypes |
| + | * '''AC''' : Total Non-reference Allele Count |
| + | * '''CALLRATE''' : Fraction of non-missing genotypes. |
| + | * '''MAF''' : Minor allele frequencies |
| + | * '''PVALUE''' : P-value of single variant test |
| + | * '''AF.CASE''' : Non-reference allele frequencies for cases |
| + | * '''AF.CTRL''' : Non-reference allele frequencies for controls |
| | | |
| ==== Q-Q plot of test statistics (stratified by MAF) ==== | | ==== Q-Q plot of test statistics (stratified by MAF) ==== |
Line 330: |
Line 368: |
| Note that [MARKER_ID_K] must be sorted in increasing order of genomic coordinate | | Note that [MARKER_ID_K] must be sorted in increasing order of genomic coordinate |
| | | |
− | In oeder to create gene-level group file from typically formatted VCF file, one may use the following utility | + | In order to create gene-level group file from typically formatted VCF file, one may use the following utility |
| | | |
| ${EPACTS_DIR}/epacts make-group --vcf [input-vcf] --out [output-group-file] --format [epacts, annovar, chaos or gatk] --nonsyn | | ${EPACTS_DIR}/epacts make-group --vcf [input-vcf] --out [output-group-file] --format [epacts, annovar, chaos or gatk] --nonsyn |
Line 359: |
Line 397: |
| --groupf ${EPACTS_DIR}/data/1000G_exome_chr20_example_softFiltered.calls.anno.grp --out out/test.gene.skat \ | | --groupf ${EPACTS_DIR}/data/1000G_exome_chr20_example_softFiltered.calls.anno.grp --out out/test.gene.skat \ |
| --ped ${EPACTS_DIR}/data/1000G_dummy_pheno.ped --maxAF 0.05 \ | | --ped ${EPACTS_DIR}/data/1000G_dummy_pheno.ped --maxAF 0.05 \ |
− | --chr 20 --pheno QT --cov AGE --cov SEX --test skat --ska-o --run 2 | + | --chr 20 --pheno QT --cov AGE --cov SEX --test skat --skat-o --run 2 |
| | | |
| ==== Example Output ==== | | ==== Example Output ==== |
Line 439: |
Line 477: |
| bgzip input.vcf ## this command will produce input.vcf.gz | | bgzip input.vcf ## this command will produce input.vcf.gz |
| tabix -pvcf -f input.vcf.gz ## this command will produce input.vcf.gz.tbi | | tabix -pvcf -f input.vcf.gz ## this command will produce input.vcf.gz.tbi |
− | * If the VCF file is separated by chromosome, the VCF file must contain the string "chr1" in the chromosome 1 file, and corresponding chromosome name for other chromosomes. | + | * If the VCF file is separated by chromosome, the VCF file specified in the input argument must contain the string "chr1" in the chromosome 1 file, and the corresponding chromosome name for other chromosomes. Thus, the file names should look like <code>[prefix]chr1[suffix].vcf.gz</code>, <code>[prefix]chr2[suffix].vcf.gz</code>, ..., <code>[prefix]chr22[suffix].vcf.gz</code>, <code>[prefix]chrX[suffix].vcf.gz</code>. |
| * Sample IDs in the VCF file must be consistent with those in the PED file | | * Sample IDs in the VCF file must be consistent with those in the PED file |
| * Currently EPACTS only supports bi-allelic variants, but it handles SNPs, INDELs, and SVs. | | * Currently EPACTS only supports bi-allelic variants, but it handles SNPs, INDELs, and SVs. |
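The per-chromosome naming rule above can be checked before launching a run. The following is a minimal shell sketch and is not part of EPACTS: the function names <code>expected_vcf_names</code> and <code>check_vcf_set</code> are assumptions made for illustration. It prints the 23 file names (chr1 to chr22 and chrX) implied by the <code>[prefix]chrN[suffix].vcf.gz</code> pattern, and reports any file or tabix index that is missing.

```shell
#!/bin/sh
# Hypothetical helper (not an EPACTS command): print the per-chromosome
# VCF file names implied by the [prefix]chrN[suffix].vcf.gz pattern.
expected_vcf_names() {
    prefix=$1
    suffix=$2
    chr=1
    while [ "$chr" -le 22 ]; do
        printf '%schr%s%s.vcf.gz\n' "$prefix" "$chr" "$suffix"
        chr=$((chr + 1))
    done
    printf '%schrX%s.vcf.gz\n' "$prefix" "$suffix"
}

# Hypothetical helper: report every expected VCF or .tbi index that is missing.
# Returns 0 when all files and indices exist, 1 otherwise.
check_vcf_set() {
    expected_vcf_names "$1" "$2" | (
        status=0
        while IFS= read -r f; do
            [ -f "$f" ] || { echo "missing VCF: $f" >&2; status=1; }
            [ -f "$f.tbi" ] || { echo "missing index: $f.tbi" >&2; status=1; }
        done
        exit "$status"
    )
}
```

For example, <code>check_vcf_set /data/study. .filtered</code> (a made-up prefix and suffix) would look for <code>/data/study.chr1.filtered.vcf.gz</code> through <code>/data/study.chrX.filtered.vcf.gz</code> together with their <code>.tbi</code> indices.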
Line 527: |
Line 565: |
| # What is VCF? | | # What is VCF? |
| #* VCF refers to Variant Call Format | | #* VCF refers to Variant Call Format |
− | #* See [[http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41 | 1000 Genomes wiki page]] for the detailed description of VCF format | + | #* See [[http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41 1000 Genomes wiki page]] for the detailed description of VCF format |
| # Should the input VCF be compressed in a certain format? | | # Should the input VCF be compressed in a certain format? |
| #* Yes. EPACTS assumes that the VCF file is already bgzipped and tabixed. | | #* Yes. EPACTS assumes that the VCF file is already bgzipped and tabixed. |
Line 537: |
Line 575: |
| #* If non-GT field is used, the field is considered as dosage and should be a single numeric value. | | #* If non-GT field is used, the field is considered as dosage and should be a single numeric value. |
| # What is the acceptable input format for encoding phenotypes and covariates? | | # What is the acceptable input format for encoding phenotypes and covariates? |
− | ** See [[#PED file for Phenotypes and Covariates]] for the detailed information
| + | #* See [[#PED file for Phenotypes and Covariates]] for the detailed information |
| # How should I encode binary phenotypes? | | # How should I encode binary phenotypes? |
| #* If you encode your phenotypes as two distinct numeric values (e.g. 0/1 or 1/2), EPACTS will automatically recognize them as binary phenotypes and recode them as 1/2. The higher value is treated as the case in case-control association. | | #* If you encode your phenotypes as two distinct numeric values (e.g. 0/1 or 1/2), EPACTS will automatically recognize them as binary phenotypes and recode them as 1/2. The higher value is treated as the case in case-control association. |
Line 556: |
Line 594: |
| #* [[#Manhattan Plot of Test Statistics]] shows the genome-wide distribution of association signals | | #* [[#Manhattan Plot of Test Statistics]] shows the genome-wide distribution of association signals |
| #* [[#Output Text of All Test Statistics]] will contain the full information of test results across all units tested | | #* [[#Output Text of All Test Statistics]] will contain the full information of test results across all units tested |
| + | # The Q-Q and Manhattan plots cannot be found. Why? |
| + | #* This is most likely because gnuplot 4.2 or higher is not installed on your system, or it is installed but cannot be found in your ${PATH}. Please visit [[http://gnuplot.info/ GNUPLOT web page]] for installation. |
| + | # How can I read the EMMAX kinship file produced by EPACTS? |
| + | #* You can run the following command to dump your kinship matrix into a human-readable text format. |
| + | ${EPACTS_DIR}/bin/pEmmax kin-util --kinf [input.kinf] --outf [output.prefix] --dump |
| | | |
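Since missing plots usually trace back to gnuplot not being found in the <code>${PATH}</code>, it can help to check the external tools before running EPACTS. The sketch below is only an illustration: the function names <code>require_tool</code> and <code>check_epacts_tools</code> are assumptions, not EPACTS commands, and the check covers presence on the PATH only, not the version requirements (R 2.10 or higher, gnuplot 4.2 or higher).

```shell
#!/bin/sh
# Hypothetical helper (not part of EPACTS): succeed only if a tool is on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 ($(command -v "$1"))"
        return 0
    fi
    echo "not found in PATH: $1" >&2
    return 1
}

# Check the two external programs this page lists as required (R and gnuplot).
# Returns 0 when both are found, 1 if either is missing.
check_epacts_tools() {
    missing=0
    for tool in R gnuplot; do
        require_tool "$tool" || missing=1
    done
    return "$missing"
}
```

Calling <code>check_epacts_tools</code> in the same shell session that will run EPACTS shows which of the two programs, if any, still needs to be installed or added to the PATH.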
| === More questions === | | === More questions === |
− | # If you have more questions, please contact [[mailto:hmkang@umich.edu | Hyun Min Kang]]. | + | # If you have more questions, please contact [[mailto:hmkang@umich.edu Hyun Min Kang]]. |
| | | |
| == Detailed Options == | | == Detailed Options == |
Line 570: |
Line 613: |
| ${EPACTS_DIR}/bin/epacts zoom -man (for zoom plot) | | ${EPACTS_DIR}/bin/epacts zoom -man (for zoom plot) |
| ${EPACTS_DIR}/bin/epacts meta -man (for meta-analysis) | | ${EPACTS_DIR}/bin/epacts meta -man (for meta-analysis) |
− | ${EPACTS_DIR}/bin/epacts makegroup -man (for creating gene group) | + | ${EPACTS_DIR}/bin/epacts make-group -man (for creating gene group) |
| | | |
| == Implementing Additional Statistical Tests == | | == Implementing Additional Statistical Tests == |
Line 708: |
Line 751: |
| | | |
| == Full ChangeLog == | | == Full ChangeLog == |
| + | * July 10th, 2014 : EPACTS v3.2.6 release |
| + | ** Minor bug fix in epacts-make-kin |
| + | * March 11th, 2014 : EPACTS v3.2.5 release |
| + | ** EMMAX-SKAT is implemented with major bug fix |
| + | * November 21st, 2013 : EPACTS v3.2.4 release |
| + | ** Fixed a number of minor bugs |
| + | ** Some known bugs still exist |
| + | *** SKAT-O Lambda eigenvalue error. This happens in a particular context, but a way to prevent it has not been found yet. |
| + | *** EMMAX has case and control frequency flipped. |
| + | * EMMAX test has a silly known bug where case / ctrl frequency is flipped |
| + | * March 25th, 2013 : EPACTS v3.2.3 release |
| + | ** Relaxed the checking of low-rank matrix in SKAT tests (to avoid unnecessary skipping of genes) |
| + | * March 13th, 2013 : EPACTS v3.2.2 release |
| + | ** Fixed an error which occasionally reported mismatches in the number of samples |
| * March 9th, 2013 : EPACTS v3.2.1 release | | * March 9th, 2013 : EPACTS v3.2.1 release |
| ** Fixed errors in loading the dynamic library | | ** Fixed errors in loading the dynamic library |