Arf

From Genome Analysis Wiki

Revision as of 11:40, 18 January 2012

arf is a genetic analysis program for sequencing data.

Basic Usage Example

 arf [options] <vcf-file>

Here is an example of how arf works:

  #estimates allele and genotype frequencies from genotype likelihoods.
  #AF - Allele frequency estimates of alternate alleles (EM)
  #HWEAF - Allele frequency estimates of alternate alleles under the assumption of Hardy-Weinberg equilibrium (EM)
  #GF - Genotype frequency estimates (EM)
  arf -s freq 1000g.vcf
  #conducts an HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic)
  #adds the info tags
  #HWP - HWE P-value
  #HWCHISQ - HWE chi-square value
  #HWDOF - Degrees of freedom for the test
  #also generates the frequency tags.
  arf -a hwe 1000g.vcf
  #conducts an HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic)
  #with -e, attempts to use existing allele frequency estimates in the INFO field
  arf -a hwe 1000g.vcf -e
  #estimates Inbreeding Coefficient F from genotype likelihoods
  #adds the info tag
  #F - Inbreeding Coefficient
  arf -a f 1000g.vcf
  #you can also run both analyses at the same time
  #performs both HWE test and estimates F
  arf -a hwe,f 1000g.vcf
   
  #annotates exonic regions
  #adds the info tag
  #EXON - flag
  arf -a exon 1000g.vcf
  #computes a complexity measure for flanking sequences around a variant
  #adds the info tag
  #C - complexity measure
  arf -a c 1000g.vcf -g genome.fa

Command Line Options

   vcf-file       VCF file (can be gzipped or bgzipped)
   -g             genome file (FASTA)
                  (note that if genome.fa is specified, the file arf actually
                   looks for is genome-bs.umfa; if this memory-mapped file is
                   not found, it is generated automatically from the FASTA file)
   -s             statistical analysis
   -a             annotation
   -e             attempt to use existing allele frequency estimates in the INFO field
   -o             output file name

Output

   By default, the output file is named arf.<date-time>.<vcf-file>.
   A different file name can be specified with the -o option.
   Log files are written to arf.<date-time>.<analyses>.log.
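
As a rough sketch of how the added INFO tags might be inspected once arf has written its output, the helper below pulls the values of one tag out of column 8 of an uncompressed VCF file using standard Unix tools. The function name info_tag and the file name 1000g.hwe.vcf are made up for illustration and are not part of arf. The sketch also assumes the output was written as plain text rather than gzipped.

```shell
# Print the value of one INFO tag for every record that carries it.
# Usage: info_tag <vcf-file> <tag>
# Hypothetical helper, not part of arf; expects an uncompressed VCF.
info_tag() {
    grep -v '^#' "$1" |   # skip meta-information and header lines
        cut -f8 |         # column 8 of a VCF record is INFO
        tr ';' '\n' |     # one key=value entry per line
        grep "^$2=" |     # keep only the requested tag
        cut -d= -f2-      # drop the "TAG=" prefix
}

# Example, assuming the output was produced with something like
#   arf -a hwe 1000g.vcf -o 1000g.hwe.vcf
# (the output file name is hypothetical):
#   info_tag 1000g.hwe.vcf HWP
```

Flag-style tags such as EXON carry no value, so this helper will not list them. Searching the INFO column for the bare tag name would be the simplest alternative for those.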

Description

   arf reads a VCF file and writes an output VCF file with additional info tags.
   It handles hard genotype calls as well as genotype likelihoods.

Download

For arf 0.557215, we provide binaries for Linux machines.

You will also need a copy of the memory-mapped file human.g1k.v37-bs.umfa. Please gunzip it before use. When running arf, refer to the file as human.g1k.v37.fa; arf automatically translates this name to human.g1k.v37-bs.umfa.
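
The naming convention can be sketched in shell as follows. This is only an illustration of the rule described on this page: you pass the .fa name and arf looks for the matching -bs.umfa file. The variable names are invented here, and the .gz file name is an assumption about how the download is packaged.

```shell
# Name passed to arf with -g (the .fa name, as this page instructs).
genome_fa="human.g1k.v37.fa"

# Name arf actually looks for: strip ".fa" and append "-bs.umfa".
genome_umfa="${genome_fa%.fa}-bs.umfa"

# Decompress the download if only a gzipped copy is present.
# The "<name>.gz" file name is an assumption, not stated on this page.
if [ ! -f "$genome_umfa" ] && [ -f "$genome_umfa.gz" ]; then
    gunzip "$genome_umfa.gz"
fi

# Run the complexity annotation only if the memory-mapped file and arf exist.
if [ -f "$genome_umfa" ] && command -v arf >/dev/null 2>&1; then
    arf -a c 1000g.vcf -g "$genome_fa"
else
    echo "skipping: $genome_umfa or the arf binary was not found" >&2
fi
```

Checking for the .umfa file first avoids starting the automatic regeneration from FASTA that arf performs when the memory-mapped file is missing.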

This page is maintained by Adrian.