Difference between revisions of "Arf"
From Genome Analysis Wiki
Revision as of 17:00, 17 January 2012
arf is a genetic analysis program for sequencing data.
Basic Usage Example
arf [options] <vcf-file>
Here is an example of how arf works:
#estimates allele and genotype frequencies from genotype likelihoods.
#AF - Allele frequency estimates of alternate alleles (EM)
#HWEAF - Allele frequency estimates of alternate alleles under the assumption of HWE (EM)
#GF - Genotype frequency estimates (EM)
arf -s freq 1000g.vcf

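The EM estimation named in the comments above can be illustrated with a minimal Python sketch for a biallelic site. This is not arf's implementation, which is not shown on this page; the function names, the starting values, and the convergence settings are assumptions made for illustration. The likelihoods are assumed to be on a linear scale, so log-scaled or Phred-scaled values from a VCF would need converting first.

```python
def em_hwe_allele_frequency(gls, tol=1e-8, max_iter=200):
    """EM estimate of the alternate allele frequency under HWE (biallelic site).

    gls: one (L_ref_ref, L_ref_alt, L_alt_alt) tuple per sample, on a linear
    scale, with at least one positive entry per sample (an assumption here).
    """
    p = 0.5  # starting value for the alternate allele frequency
    for _ in range(max_iter):
        expected_alt = 0.0
        for l_rr, l_ra, l_aa in gls:
            # E-step: weight each genotype by its HWE prior times its likelihood.
            w_rr = l_rr * (1 - p) ** 2
            w_ra = l_ra * 2 * p * (1 - p)
            w_aa = l_aa * p ** 2
            total = w_rr + w_ra + w_aa
            # Expected number of alternate alleles carried by this sample.
            expected_alt += (w_ra + 2 * w_aa) / total
        # M-step: expected alternate alleles over total alleles (2 per sample).
        new_p = expected_alt / (2 * len(gls))
        if abs(new_p - p) < tol:
            return new_p
        p = new_p
    return p


def em_genotype_frequencies(gls, tol=1e-8, max_iter=200):
    """EM estimate of the three genotype frequencies without assuming HWE."""
    freqs = [1 / 3, 1 / 3, 1 / 3]  # starting values
    n = len(gls)
    for _ in range(max_iter):
        counts = [0.0, 0.0, 0.0]
        for lik in gls:
            # E-step: posterior probability of each genotype for this sample.
            weights = [lik[k] * freqs[k] for k in range(3)]
            total = sum(weights)
            for k in range(3):
                counts[k] += weights[k] / total
        # M-step: average the posterior genotype probabilities over samples.
        new_freqs = [c / n for c in counts]
        if max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol:
            return new_freqs
        freqs = new_freqs
    return freqs
```

With fully informative likelihoods the estimates reduce to simple counting, which is a convenient sanity check for the sketch.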
#conducts HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic)
#adds the info tags
#HWP - HWE P-value
#HWCHISQ - HWE chi-square value
#HWDOF - Degrees of freedom for test
#will generate frequency tags.
arf -s hwe 1000g.vcf

#conducts HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic)
#adds the info tags
#HWP - HWE P-value
#HWCHISQ - HWE chi-square value
#HWDOF - Degrees of freedom for test
#will attempt to use existing allele frequency estimates in the info
#fields if they exist, otherwise it will estimate the frequencies from the data.
arf -s hwe 1000g.vcf -e

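To make the three HWE tags concrete, here is a minimal Python sketch of a likelihood ratio test for a biallelic site. It is an illustration under stated assumptions, not arf's code: arf handles multiallelic sites, whereas this sketch fixes the degrees of freedom at 1, and the function name and its inputs (an HWE allele frequency and unconstrained genotype frequencies, both estimated beforehand) are hypothetical.

```python
import math


def hwe_lrt(gls, alt_freq, genotype_freqs):
    """Likelihood ratio test for HWE at a biallelic site.

    gls: per-sample (L_ref_ref, L_ref_alt, L_alt_alt), linear scale.
    alt_freq: alternate allele frequency estimated under HWE.
    genotype_freqs: (f_ref_ref, f_ref_alt, f_alt_alt) estimated without HWE.
    Returns (p_value, chi_square, degrees_of_freedom).
    """
    p = alt_freq
    hwe_priors = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]

    def log_likelihood(priors):
        # Each sample contributes log(sum over genotypes of prior * likelihood).
        return sum(
            math.log(sum(l * q for l, q in zip(lik, priors))) for lik in gls
        )

    # LRT statistic: twice the log-likelihood gain of the unconstrained model.
    chi_square = 2 * (log_likelihood(genotype_freqs) - log_likelihood(hwe_priors))
    chi_square = max(chi_square, 0.0)  # guard against tiny negative rounding
    dof = 1  # biallelic case only; an assumption of this sketch
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return p_value, chi_square, dof
```

For example, four samples that are all confidently heterozygous give an HWE allele frequency of 0.5 and genotype frequencies of (0, 1, 0), which yields a large statistic and a small P-value because heterozygotes are in excess.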
#estimates Inbreeding Coefficient F from genotype likelihood
#adds the info tag
#F - Inbreeding Coefficient
arf -s f 1000g.vcf

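As a rough guide to what F measures, the textbook definition compares the observed heterozygote frequency with the one expected under HWE. The Python sketch below applies that definition to previously estimated frequencies. Whether arf uses this estimator or a different one (for example, a direct likelihood-based estimate) is not stated here, so treat the function and its name as illustrative assumptions.

```python
def inbreeding_coefficient(alt_freq, het_freq):
    """F = 1 - observed heterozygosity / expected heterozygosity under HWE.

    alt_freq: alternate allele frequency at a biallelic site.
    het_freq: estimated frequency of the heterozygous genotype.
    """
    expected_het = 2 * alt_freq * (1 - alt_freq)
    if expected_het == 0:
        # Monomorphic site: F is undefined because no heterozygotes are expected.
        raise ValueError("alt_freq must be strictly between 0 and 1")
    return 1 - het_freq / expected_het
```

A value of 0 means the heterozygote frequency matches HWE, positive values indicate a deficit of heterozygotes, and negative values indicate an excess.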
#you can also do both analyses at the same time
#performs both HWE test and estimates F
arf -s hwe,f 1000g.vcf

#annotates exonic regions
#adds the info tag
#EXON - flag
arf -a exon 1000g.vcf

#computes a complexity measure for flanking sequences around a variant
#adds the info tag
#C - complexity measure
arf -a c 1000g.vcf -g genome.fa
Command Line Options
vcf-file   VCF file (can be gzipped or bgzipped)
-g   genome-file (memory mapped sequence file); note that if genome.fa is specified, the actual file looked for is genome-bs.umfa
-s   statistical analysis
-a   annotation
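The naming rule for the genome file can be checked before running arf. The Python sketch below derives the memory mapped file name from the name given on the command line, following the genome.fa to genome-bs.umfa rule described above. The helper name is hypothetical, and how arf treats names that do not end in .fa is not documented here, so the sketch simply rejects them.

```python
from pathlib import Path


def memory_mapped_name(genome_path):
    """Return the -bs.umfa file name that corresponds to a .fa name."""
    path = Path(genome_path)
    if path.suffix != ".fa":
        # Behaviour for other extensions is unknown; refuse rather than guess.
        raise ValueError("expected a file name ending in .fa")
    # Drop the final .fa and append -bs.umfa, keeping any directory part.
    return str(path.with_name(path.stem + "-bs.umfa"))


if __name__ == "__main__":
    name = memory_mapped_name("genome.fa")
    print(name, "exists" if Path(name).exists() else "is missing")
```

Running the script next to the genome files reports whether the file that arf will actually look for is present.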
Output
user@host:~$ vmatch gatk.vcf samtools.vcf -w 10 -d
Description
Outputs 2 files:
match.txt : gives the matched pairs 1) id1 2) id2 3) match type 4) extended no. of bases 5) normalized
match.log : details of the extension and normalization process for all compared pairs
vmatch matches the variants in 2 VCF files by choosing the best match for every possible variant pair. The percentage of matches is given at 3 levels for each variant total of both VCF files.
Download
For arf 0.557215, we provide binaries for Linux machines.
You will also need a copy of the memory mapped file human.g1k.v37-bs.umfa. Please gunzip it before use. When running arf, refer to the file as human.g1k.v37.fa; arf automatically translates this name to human.g1k.v37-bs.umfa.
This page is maintained by Adrian.