Changes

From Genome Analysis Wiki
3,075 bytes removed ,  01:07, 31 January 2015
Line 7: Line 7:  
This page points to downloads, documentation, and papers for software that is written here at the [http://genome.sph.umich.edu Center for Statistical Genetics]
 
   −
If you have any questions or comments, please email Mary Kate Trost (mktrost@umich.edu).
+
If you have any questions or comments, please email Mary Kate Wing (mktrost@umich.edu).
    
=StatGen C++ Software=
 
Line 31: Line 31:  
**Genotype concordance based detection
 
 
**Estimate based on population allele frequencies without genotype data
 
  −
   
*[[Pileup]] – Pileup every base or just bases in specified region and write VCF
 
  −
*[[SuperDeDuper]] - Determine duplicate alignments, either marking or removing the lower quality duplicates. In addition, it may modify paired-end reads where the ends overlap by soft clipping the end with the lower quality bases in the region of overlap.
      
==== BAM Util Tools ====
 
The following tools are part of the [[BamUtil|BamUtil program]].
+
{{BamUtilPrograms}}
 
  −
'''QC/Stats'''
  −
*[[BamUtil: validate|validate]] – Check file format & print statistics
  −
*[[BamUtil: diff|diff]] - Print the diffs between 2 bams
  −
*[[BamUtil: stats|stats]] - Generate some statistics for a SAM/BAM file
  −
'''Rewrite SAM/BAM file'''
  −
*[[BamUtil: convert|convert]] – Convert between SAM & BAM
  −
*[[BamUtil: splitBam|splitBam]] – Split into 1 file per Read Group
  −
*[[BamUtil: splitChromosome|splitChromosome]] – Split into 1 file per Chromosome
  −
*[[BamUtil: writeRegion|writeRegion]] – Write only reads that are in the specified region and/or have the specified read name
  −
*[[BamUtil: convert#BAM File Recovery | BAM Recovery]] - Recover corrupted BAM files
  −
*[[BamUtil: asp | asp]] - perform an asynchronous pileup producing an ASP file.  <span style="color:#D2691E">ASP is a new format that is currently in production, so this tool is not yet available for public release.</span>
  −
'''File Updates'''
  −
*[[BamUtil: filter|filter]] – Soft clip ends with too high mismatch % and mark unmapped if quality of mismatches is too high
  −
*[[BamUtil: revert|revert]] - Revert SAM/BAM, replacing the specified fields with their previous values (if known) and removing specified tags
  −
*[[BamUtil: squeeze|squeeze]] - Reduce file size by dropping OQ fields, duplicates, specified tags, using '=' when a base matches the reference, binning quality scores, and replacing readNames with unique integers
  −
*[[BamUtil: clipOverlap|clipOverlap]] - Clip overlapping read pairs so they do not overlap
  −
*[[BamUtil: trimBam| trimBam]] – Trim end of reads, changing read ends to ‘N’ & quality to ‘!’
  −
*[[BamUtil: polishBam|polishBam]] – Add/Update header lines & add RG tag to each record
  −
*[[BamUtil: rgMergeBam|rgMergeBam]] – Merge sorted BAM files adding Read Groups
  −
 
  −
'''Helper Tools to Print Readable Information'''
  −
*[[BamUtil: dumpHeader|dumpHeader]] - Print the File Header to the screen.
  −
*[[BamUtil: dumpRefInfo|dumpRefInfo]] - Print the reference information from the SAM/BAM header.
  −
*[[BamUtil: dumpIndex|dumpIndex]] - Print the BAM Index to the screen in a readable format
  −
*[[BamUtil: readReference|readReference]] - Print the reference string for the specified region to the screen.
  −
*[[BamUtil: dumpAsp|dumpAsp]] - perform an asynchronous pileup producing an ASP file.  <span style="color:#D2691E">ASP is a new format that is currently in production, so this tool is not yet available for public release.</span>
  −
 
      
=== FASTQ ===
 
Line 73: Line 41:  
**Reports Base Composition Statistics (%reads at each read index)
 
    +
 +
=== Meta Analysis ===
 +
* [[Rare-Metal-Worker|RAREMETALWORKER - generate summary level statistics for meta analysis using Rare-Metal]]
 +
* [[Rare-Metal|RAREMETAL - perform genome-wide meta analysis of rare variants]]
    
=== Other Tools ===
 
Line 83: Line 55:     
=== Requested Tools ===
 
[[BAM to FASTQ]]
      
=Other Tools=
 
   −
Since many of our tools still rely on GLF files and samtools stopped supporting GLF files, we created a version of samtools that still supports pileup to GLF files AND incorporates the updated BAQ logic.  This version is called samtools-hybrid. The code can be downloaded at: https://github.com/statgen/samtools-0.1.7a-hybrid
+
* [[samtools-hybrid]] - Since many of our tools still rely on GLF files and samtools stopped supporting GLF files, we created a version of samtools that still supports pileup to GLF files AND incorporates the updated BAQ logic.  This version is called samtools-hybrid. The code can be downloaded at: https://github.com/statgen/samtools-0.1.7a-hybrid
 
+
*[[baseQualityCheck]] - tool to calculate the observed base quality vs. empirical base quality (helps to evaluate mappers)
== [[Read Mapping]] ==
  −
*[[Karma|Karma]] - Our fast short read aligner, which generates [[Mapping Quality Scores]]
  −
*[[Karma-colorspace|Karma-ColorSpace]] - QUICKSTART on mapping color space reads
  −
*[[baseQualityCheck]] - a mature tool to calculate the observed base quality vs. empirical base quality (helps to evaluate mappers)
  −
 
  −
*[[Examples|Examples]] - Sample command lines with discussion
  −
 
  −
*[[MapabilityScores]] - Definitions of the various mappability scores used by the UCSC Genome Browser.
  −
 
  −
 
  −
==SAM/BAM==
  −
*Recalibrator – Resource-efficient tool, which recalibrates base qualities based on an adaptive logistic regression model - <span style="color:#D2691E">Available upon request</span>
  −
*Deduper – Mark or remove duplicates - <span style="color:#D2691E">Coming Soon</span>
      
== Variant Calling ==
 
 
* [[glfSingle]] - Variant calling for a single, deeply sequenced individual
 
 
* [[glfMultiples]] - Variant calling for multiple, unrelated individuals
 
* [[Polymutt:_a_tool_for_calling_polymorphism_and_de_novo_mutations|polymutt]] - Variant and ''de novo'' mutation detection in families (nuclear or extended pedigrees) from sequencing
+
* [[Polymutt|polymutt]] - Variant and ''de novo'' mutation detection in families (nuclear or extended pedigrees) from sequencing
    
== Variant Annotation ==
 
 
*[[vcfCodingSnps]] - Annotate coding variants in a VCF file.
 
   −
== File Conversion ==
+
== Genotype Imputation ==
*[[bam2FastQ]] - Convert BAM files into FastQ files
+
*[[Minimac3]] - Fast and Efficient Genotype Imputation.
    +
== Additional Pedigree & Sequence Analysis Tools ==
 +
Can be found at: http://sph.umich.edu/csg/abecasis/software.html
    
= Other Useful Links =
 