Difference between revisions of "Arf"

From Genome Analysis Wiki

Revision as of 17:00, 17 January 2012

arf is a genetic analysis program for sequencing data.

Basic Usage Example

 arf [options] <vcf-file>

Here are some examples of how arf is used:

  #estimates allele and genotype frequencies from genotype likelihoods
  #adds the info tags
  #AF - Allele frequency estimates of alternate alleles (EM)
  #HWEAF - Allele frequency estimates of alternate alleles under the assumption of HWE (EM)
  #GF - Genotype frequency estimates (EM)
  arf -s freq 1000g.vcf

  #conducts HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic)
  #adds the info tags
  #HWP - HWE P-value
  #HWCHISQ - HWE chi-square value
  #HWDOF - Degrees of freedom for test
  #will also generate the frequency tags
  arf -s hwe 1000g.vcf

  #conducts HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic)
  #adds the info tags
  #HWP - HWE P-value
  #HWCHISQ - HWE chi-square value
  #HWDOF - Degrees of freedom for test
  #will attempt to use existing allele frequency estimates in the info
  #fields if they exist, otherwise it will estimate the frequencies from the data
  arf -s hwe 1000g.vcf -e

  #estimates inbreeding coefficient F from genotype likelihoods
  #adds the info tag
  #F - Inbreeding coefficient
  arf -s f 1000g.vcf

  #you can also run both analyses at the same time
  #performs the HWE test and estimates F
  arf -s hwe,f 1000g.vcf

  #annotates exonic regions
  #adds the info tag
  #EXON - flag
  arf -a exon 1000g.vcf

  #computes a complexity measure for flanking sequences around a variant
  #adds the info tag
  #C - complexity measure
  arf -a c 1000g.vcf -g genome.fa
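
Once a VCF file carries the info tags listed above, their values can be pulled out with standard tools. The following is a minimal sketch and not part of arf: the function name info_tag is hypothetical, and it assumes the standard VCF layout in which INFO is the eighth tab-separated column and holds semicolon-separated KEY=VALUE entries.

```shell
# info_tag TAG: read VCF text on stdin and print CHROM, POS and the value of
# the INFO tag TAG for every record that carries it.
# Hypothetical helper for illustration; it is not an arf command.
info_tag() {
    awk -F'\t' -v tag="$1" '
        /^#/ { next }                      # skip meta and header lines
        {
            n = split($8, kv, ";")         # INFO is column 8
            for (i = 1; i <= n; i++) {
                eq = index(kv[i], "=")     # flags such as EXON have no "="
                if (eq > 0 && substr(kv[i], 1, eq - 1) == tag)
                    print $1 "\t" $2 "\t" substr(kv[i], eq + 1)
            }
        }'
}
```

For example, info_tag HWP < annotated.vcf would list the HWE P-value per site, where annotated.vcf is a placeholder for wherever the annotated output was saved. The key is compared exactly, so asking for F does not pick up HWEAF.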

Command Line Options

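   vcf-file     VCF file (can be gzipped or bgzipped)
   -g           genome file (memory-mapped sequence file)
                (note that if genome.fa is specified, the actual file looked for is genome-bs.umfa)
   -s           statistical analysis (freq, hwe, f; combine with commas, e.g. hwe,f)
   -a           annotation (exon, c)
   -e           with -s hwe, use existing allele frequency estimates in the info fields if they exist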

Output

   user@host:~$ vmatch gatk.vcf samtools.vcf -w 10 -d  
   

Description

   Outputs 2 files
     match.txt : gives the matched pairs
                 1)id1
                 2)id2
                 3)match type
                 4)extended no of bases
                 5)normalized
     match.log : Details of the extension and normalization process for all compared pairs
   vmatch matches the variants in two VCF files by choosing the best match for every
   possible variant pair. The percentage of matches is reported at 3 levels, relative
   to the total number of variants in each VCF file.


Download

For arf 0.557215, we provide binaries for Linux machines.

You will also need a copy of the memory-mapped file human.g1k.v37-bs.umfa. Please gunzip it before use. On the command line, refer to the file as human.g1k.v37.fa; arf automatically translates this name to human.g1k.v37-bs.umfa.
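
The naming rule can be sketched in shell to check that the memory-mapped file is in place before running an annotation that needs -g. This is an illustration only: the function names umfa_name and check_genome are hypothetical, and the rule (drop the .fa suffix, append -bs.umfa) is inferred from the two file names given on this page, not taken from arf itself.

```shell
# umfa_name FASTA_NAME: print the memory-mapped file name that corresponds to
# the name passed to -g, following the examples on this page.
# Hypothetical helper; arf performs this mapping internally.
umfa_name() {
    printf '%s-bs.umfa\n' "${1%.fa}"
}

# check_genome FASTA_NAME: succeed if the corresponding .umfa file exists,
# otherwise print a hint to stderr and fail.
check_genome() {
    if [ -f "$(umfa_name "$1")" ]; then
        return 0
    fi
    echo "missing $(umfa_name "$1"); gunzip it into place first" >&2
    return 1
}
```

A call such as check_genome human.g1k.v37.fa && arf -a c 1000g.vcf -g human.g1k.v37.fa then runs the annotation only when the memory-mapped file is present.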

This page is maintained by Adrian.