Difference between revisions of "Arf"
From Genome Analysis Wiki
Revision as of 16:54, 17 January 2012
arf is a genetic analysis program for sequencing data.
Basic Usage Example
arf [options] <vcf-file>
Here is an example of how arf works:
#conducts HWE LRT from genotype likelihoods (multiallelic)
#adds the info tags
#HWP - HWE P-value
#HWCHISQ - HWE Chi-square value
#HWDOF - Degrees of Freedom for test
#AF - Allele frequency estimates of alternate alleles (EM)
#HWEAF - Allele frequency estimates of alternate alleles under the assumption of HWE (EM)
#GF - Genotype frequency estimates (EM)
arf -s hwe 1000g.vcf

#estimates Inbreeding Coefficient F from genotype likelihoods
#adds the info tag
#F - Inbreeding Coefficient
arf -s f 1000g.vcf

#performs both HWE test and estimates F
arf -s hwe,f 1000g.vcf

#annotates exonic regions
#adds the info tag
#EXON - flag
arf -a exon 1000g.vcf

#computes a complexity measure for flanking sequences around a variant
#adds the info tag
#C - complexity measure
arf -a c 1000g.vcf -g genome.fa
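The HWEAF tag is described above as an EM estimate of the alternate allele frequency under HWE, computed from genotype likelihoods. The following Python sketch illustrates that general idea for the simpler biallelic case. It is not arf's code (arf handles multiallelic variants, and its internals are not documented here); the function name, the starting value of 0.5, and the convergence settings are all assumptions made for illustration.

```python
def em_alt_allele_frequency(genotype_likelihoods, max_iter=100, tol=1e-8):
    """Estimate the alternate allele frequency p under HWE with EM.

    genotype_likelihoods: one (L0, L1, L2) tuple per sample, where Lg is the
    likelihood of the data given g copies of the alternate allele.
    Illustrative biallelic sketch only; not taken from arf.
    """
    n = len(genotype_likelihoods)
    if n == 0:
        raise ValueError("at least one sample is required")

    p = 0.5  # assumed starting value
    for _ in range(max_iter):
        # Keep p strictly inside (0, 1) so no genotype prior is exactly zero.
        q = min(max(p, 1e-12), 1.0 - 1e-12)
        # HWE genotype priors for 0, 1 and 2 alternate alleles.
        prior = ((1.0 - q) ** 2, 2.0 * q * (1.0 - q), q ** 2)

        # E-step: expected number of alternate alleles across all samples.
        expected_alt = 0.0
        for likelihoods in genotype_likelihoods:
            weights = [lk * pr for lk, pr in zip(likelihoods, prior)]
            total = sum(weights)
            if total == 0.0:
                raise ValueError("sample has zero likelihood for all genotypes")
            expected_alt += (weights[1] + 2.0 * weights[2]) / total

        # M-step: the new frequency is expected alt alleles / total alleles.
        new_p = expected_alt / (2.0 * n)
        if abs(new_p - p) < tol:
            return new_p
        p = new_p
    return p
```

With likelihoods that put all their weight on one genotype per sample, the estimate reduces to simple allele counting, which is a quick sanity check for the sketch.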
Command Line Options
vcf-file   VCF file (can be gzipped or bgzipped)
-g         genome-file (Memory Mapped Sequence file) (note that if genome.fa is specified, the actual file looked for is genome-bs.umfa)
-s         statistical analysis
-a         annotation
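The note on the genome file says that a name such as genome.fa is translated to genome-bs.umfa. A small Python sketch of that naming rule can help when checking that the memory mapped file is in place before running arf. The function name is hypothetical, and the rule for anything other than the documented genome.fa to genome-bs.umfa mapping (for example other extensions) is an assumption.

```python
import os


def memory_mapped_genome_path(genome_path):
    """Return the file name that would be looked for, given the -g argument.

    Assumption: the final extension (e.g. ".fa") is dropped and "-bs.umfa"
    is appended, matching the documented genome.fa -> genome-bs.umfa example.
    """
    root, _extension = os.path.splitext(genome_path)
    return root + "-bs.umfa"


def genome_is_available(genome_path):
    """Check whether the derived memory mapped file exists on disk."""
    return os.path.isfile(memory_mapped_genome_path(genome_path))
```

For example, memory_mapped_genome_path("human.g1k.v37.fa") gives "human.g1k.v37-bs.umfa", which agrees with the file name given in the Download section.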
Here is an example of how arf works:
#computes HWE and F statistics from genotype likelihoods
arf -s hwe,f 1kg.vcf
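Since arf reports its results as INFO tags (HWP, F, EXON and so on), a downstream script mainly needs to read the INFO column of each VCF record. The sketch below relies only on the standard VCF layout (tab-separated columns, INFO in the eighth column, semicolon-separated entries that are either KEY=VALUE or a bare flag). The function names and the sample record are made up for illustration, and how arf delivers its annotated output is not covered here.

```python
def parse_info(info_field):
    """Turn a VCF INFO string into a dict; flags map to True, values stay strings."""
    tags = {}
    if info_field in ("", "."):
        return tags
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, separator, value = entry.partition("=")
        tags[key] = value if separator else True
    return tags


def info_from_record(line):
    """Extract the INFO tags from one tab-separated VCF data line."""
    columns = line.rstrip("\n").split("\t")
    return parse_info(columns[7])  # INFO is the 8th VCF column


# Hypothetical record carrying the tags described above.
record = "20\t14370\trs1\tG\tA\t29\tPASS\tHWP=0.42;F=-0.01;EXON\n"
tags = info_from_record(record)
hwe_p_value = float(tags["HWP"])
inbreeding_f = float(tags["F"])
is_exonic = tags.get("EXON", False)
```

Values are kept as strings until converted because some tags (AF and GF for multiallelic variants, for instance) may hold several comma-separated numbers.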
Output
user@host:~$ vmatch gatk.vcf samtools.vcf -w 10 -d
Description
Outputs 2 files:
match.txt : gives the matched pairs
  1) id1
  2) id2
  3) match type
  4) extended no. of bases
  5) normalized
match.log : details of the extension and normalization process for all compared pairs

vmatch matches the variants in 2 VCF files by choosing the best match for every possible variant pair. The percentage of matches is given at 3 levels for each variant total of both VCF files.
Download
For arf 0.557215, we provide binaries for Linux machines.
You will also need a copy of the memory mapped file: human.g1k.v37-bs.umfa. Please gunzip it before use. When running arf, refer to the file as human.g1k.v37.fa; arf automatically translates this name to human.g1k.v37-bs.umfa.
This page is maintained by Adrian.