Arf
From Genome Analysis Wiki
arf is a genetic analysis program for sequencing data.
Basic Usage Example
arf [options] <vcf-file>
Here are some examples of how arf works:
# Estimates allele and genotype frequencies from genotype likelihoods.
# AF - allele frequency estimates of alternate alleles (EM)
# HWEAF - allele frequency estimates of alternate alleles under the assumption of HWE (EM)
# GF - genotype frequency estimates (EM)
arf -a freq 1000g.vcf
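The EM estimate under HWE can be illustrated with a minimal sketch for a single biallelic site. This is not arf's implementation: the function name, the input layout (one tuple of linear-scale genotype likelihoods per sample, ordered ref/ref, ref/alt, alt/alt), the starting value, and the convergence settings are all assumptions made for illustration.

```python
def em_allele_frequency(genotype_likelihoods, tol=1e-8, max_iter=200):
    """Estimate the alternate allele frequency at one biallelic site by EM,
    assuming Hardy-Weinberg equilibrium (hypothetical helper, not arf code).

    genotype_likelihoods: list of (L_RR, L_RA, L_AA) tuples, one per sample,
    on a linear scale (not log or PHRED scaled).
    """
    q = 0.5  # assumed starting value for the alternate allele frequency
    for _ in range(max_iter):
        expected_alt = 0.0
        informative = 0
        for l_rr, l_ra, l_aa in genotype_likelihoods:
            # E-step: weight each genotype likelihood by its HWE prior.
            w_rr = l_rr * (1 - q) ** 2
            w_ra = l_ra * 2 * q * (1 - q)
            w_aa = l_aa * q ** 2
            total = w_rr + w_ra + w_aa
            if total == 0:
                continue  # sample carries no usable information
            # Expected number of alternate alleles carried by this sample.
            expected_alt += (w_ra + 2 * w_aa) / total
            informative += 1
        if informative == 0:
            return q
        # M-step: expected alternate allele count over total allele count.
        new_q = expected_alt / (2 * informative)
        if abs(new_q - q) < tol:
            return new_q
        q = new_q
    return q
```

With hard calls expressed as likelihoods of 0 and 1, the estimate reduces to simple allele counting, which is a quick way to sanity-check the sketch.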
# Conducts an HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic).
# Adds the info tags:
# HWP - HWE p-value
# HWCHISQ - HWE chi-square value
# HWDOF - degrees of freedom for the test
# Also generates the frequency tags.
arf -a hwe 1000g.vcf
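As a rough illustration of how a likelihood ratio test for HWE can be built from genotype likelihoods, the sketch below handles the biallelic case only, where the test has 1 degree of freedom. arf's test is described as multiallelic, so this is a simplified stand-in rather than its method. All function names, the fixed iteration count, and the input layout (linear-scale likelihoods ordered ref/ref, ref/alt, alt/alt) are assumptions.

```python
import math


def hwe_priors(q):
    """Genotype frequencies implied by HWE for alternate allele frequency q."""
    return [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]


def log_likelihood(gls, priors):
    """Log-likelihood of the data given genotype frequencies `priors`."""
    return sum(math.log(sum(l * p for l, p in zip(gl, priors))) for gl in gls)


def em_genotype_frequencies(gls, n_iter=200):
    """Unconstrained genotype frequency estimates (alternative hypothesis)."""
    freqs = [1 / 3, 1 / 3, 1 / 3]
    for _ in range(n_iter):
        counts = [0.0, 0.0, 0.0]
        for gl in gls:
            weights = [l * f for l, f in zip(gl, freqs)]
            total = sum(weights)
            for k in range(3):
                counts[k] += weights[k] / total
        freqs = [c / len(gls) for c in counts]
    return freqs


def em_hwe_allele_frequency(gls, n_iter=200):
    """Allele frequency estimate under HWE (null hypothesis)."""
    q = 0.5
    for _ in range(n_iter):
        alt = 0.0
        for gl in gls:
            weights = [l * p for l, p in zip(gl, hwe_priors(q))]
            alt += (weights[1] + 2 * weights[2]) / sum(weights)
        q = alt / (2 * len(gls))
    return q


def hwe_lrt(gls):
    """Return (chi-square statistic, p-value) for the biallelic HWE LRT."""
    gf = em_genotype_frequencies(gls)
    q = em_hwe_allele_frequency(gls)
    stat = 2 * (log_likelihood(gls, gf) - log_likelihood(gls, hwe_priors(q)))
    stat = max(stat, 0.0)  # guard against tiny negative rounding errors
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

A sample with genotype counts in exact HWE proportions gives a statistic of about 0 and a p-value of about 1, while a sample made up entirely of heterozygotes gives a very small p-value.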
# Conducts an HWE likelihood ratio test (LRT) from genotype likelihoods (multiallelic).
# Will attempt to use existing allele frequency estimates in the info field.
arf -a hwe 1000g.vcf -e
# Estimates the inbreeding coefficient F from genotype likelihoods.
# Adds the info tag:
# F - inbreeding coefficient
arf -a f 1000g.vcf
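For intuition about what F measures, the sketch below computes a simple moment estimate from genotype frequencies at a biallelic site: one minus the ratio of the observed heterozygote frequency to the frequency expected under HWE. This is an assumption made for illustration; the page does not say which estimator arf uses, and an estimator working directly on genotype likelihoods may give different values. The function name is hypothetical.

```python
def inbreeding_coefficient(gf_rr, gf_ra, gf_aa):
    """Moment estimate of F from genotype frequencies (hypothetical helper).

    gf_rr, gf_ra, gf_aa: frequencies of the ref/ref, ref/alt and alt/alt
    genotypes, expected to sum to 1.
    """
    # Alternate allele frequency implied by the genotype frequencies.
    q = gf_aa + 0.5 * gf_ra
    # Heterozygote frequency expected under HWE.
    expected_het = 2 * q * (1 - q)
    if expected_het == 0:
        return None  # monomorphic site: F is undefined
    return 1 - gf_ra / expected_het
```

Genotype frequencies in HWE proportions give F = 0, and a complete absence of heterozygotes at a polymorphic site gives F = 1.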
# You can also run both analyses at the same time.
# Performs the HWE test and estimates F.
arf -a hwe,f 1000g.vcf

# Annotates exonic regions.
# Adds the info tag:
# EXON - flag
arf -a exon 1000g.vcf
# Extracts the flanking sequence around a variant.
# Adds the info tag:
# FLANKS - 5' sequence, reference allele, 3' sequence; flank length is set by option -l, default is 25
arf -a flanks 1000g.vcf -g genome.fa -l 30
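The idea behind the flanks annotation can be sketched as plain string slicing around a 1-based VCF position. This is only an illustration: the function name and the in-memory chromosome string are assumptions, and the exact layout of the FLANKS value written by arf is not specified here, so the sketch simply returns the three pieces as a tuple.

```python
def extract_flanks(chrom_seq, pos, ref, length=25):
    """Return (5' flank, reference allele, 3' flank) around a variant.

    chrom_seq: chromosome sequence as a string (assumed to fit in memory)
    pos:       1-based position of the first reference base, as in VCF
    ref:       reference allele string from the VCF record
    length:    maximum flank length on each side
    """
    start = pos - 1          # convert to 0-based index
    end = start + len(ref)   # index just past the reference allele
    # Flanks are truncated at the ends of the chromosome.
    left = chrom_seq[max(0, start - length):start]
    right = chrom_seq[end:end + length]
    return left, chrom_seq[start:end], right
```

For example, `extract_flanks("ACGTACGTAC", 5, "A", length=3)` returns `("CGT", "A", "CGT")`.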
# Computes a complexity measure for the flanking sequences around a variant.
# Adds the info tag:
# CPXY - complexity measure for flanks of length l; l is set by option -l, default is 25
arf -a complexity 1000g.vcf -g genome.fa -l 30
Command Line Options
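vcf-file  VCF file (can be gzipped or bgzipped)
-g        genome file (FASTA file). Note that if genome.fa is specified, the file actually looked for is the memory mapped file genome-bs.umfa; if it is not found, it is generated automatically from the FASTA file.
-l        length of flanking sequence (default 25)
-a        analysis/annotation; the examples above use freq, hwe, f, exon, flanks and complexity, and several can be combined with commas
-e        attempt to use existing allele frequency estimates in the info field (used with -a hwe in the examples above)
-o        output file name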
Output
An output file named arf.vcf is generated by default. The file name can be changed with the -o option. Log messages are written to arf.log.
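To inspect the tags in the output file, the INFO column of a VCF can be parsed with a few lines of Python. The sketch below relies only on the general VCF layout (tab-separated columns, INFO in the eighth column, entries separated by semicolons). The function names are hypothetical and the tag values in the usage note are invented.

```python
import gzip


def parse_info(info_field):
    """Turn a VCF INFO string into a dict; flags map to True."""
    tags = {}
    if info_field == ".":
        return tags  # "." means no INFO entries
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        tags[key] = value if sep else True
    return tags


def read_info_tags(path):
    """Yield (chrom, pos, info dict) for each record of a VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            yield fields[0], int(fields[1]), parse_info(fields[7])
```

For instance, `parse_info("AF=0.12;HWP=0.5;EXON")` returns `{"AF": "0.12", "HWP": "0.5", "EXON": True}`, and `read_info_tags("arf.vcf")` iterates over the records of the default output file.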
Description
arf reads a VCF file and writes an output VCF file with additional info tags. It handles hard genotype calls as well as genotype likelihoods.
Download
For arf 0.557215, we provide binaries for Linux machines.
You will also need a copy of the human genome assembly FASTA file: human.g1k.v37.fa. Please gunzip it before use. arf will generate a memory mapped file from the FASTA file, named human.g1k.v37-bs.umfa.
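The naming pattern of the memory mapped file can be reproduced with a small helper, for example to check whether the file already exists before running arf. The rule used here (drop the final extension, append -bs.umfa) is inferred from the two file names given on this page and is an assumption for other inputs. The function name is hypothetical.

```python
import os


def expected_umfa_path(fasta_path):
    """Guess the memory mapped file name arf looks for (inferred pattern)."""
    # Drop only the last extension, e.g. ".fa", and append the suffix.
    root, _ext = os.path.splitext(fasta_path)
    return root + "-bs.umfa"
```

For example, `os.path.exists(expected_umfa_path("human.g1k.v37.fa"))` tells you whether human.g1k.v37-bs.umfa is already present in the current directory.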
This page is maintained by Adrian.