Changes

From Genome Analysis Wiki
94 bytes added, 18:30, 8 January 2013
Line 20: Line 20:  
The deduper assumes that duplicates in the input BAM file are not marked.  When the deduper detects a marked duplicate in the input BAM file, it will throw an error and stop.  To override this behavior, use the [[#Ignore Previous Duplicate Marking (--force)|<code>--force</code>]] option;  in this mode, alignments that are marked as duplicates in the input file are unmarked before the deduper begins its detection algorithm.  The result is that only duplicates detected by the deduper will be marked in or removed from the output file.
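
The pre-marked-duplicate behaviour described above can be illustrated with a minimal Python sketch. It is not the deduper's actual code: the function name <code>prepare_flags</code> and its <code>force</code> parameter are hypothetical, and the only outside fact it relies on is that the SAM duplicate flag is bit 0x400.

```python
DUP_FLAG = 0x400  # SAM flag bit for "PCR or optical duplicate"


def prepare_flags(flags, force=False):
    """Return SAM flags ready for duplicate detection (hypothetical helper).

    Without force, an already-marked duplicate is an error.
    With force, the duplicate bit is cleared so that only duplicates
    found by the later detection step end up marked.
    """
    cleaned = []
    for flag in flags:
        if flag & DUP_FLAG:
            if not force:
                raise ValueError(
                    "input contains a marked duplicate; rerun with --force"
                )
            flag &= ~DUP_FLAG  # unmark before detection begins
        cleaned.append(flag)
    return cleaned
```

Calling <code>prepare_flags([99, 1171])</code> raises, while <code>prepare_flags([99, 1171], force=True)</code> clears the duplicate bit on the second flag and leaves the first unchanged.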
   −
The handling of paired-end reads assumes that the mate information in the SAM/BAM records is accurate.  If a mate is not found at the expected position, an error message is printed (once per file) indicating this error.  Paired-end reads whose mate cannot be found are not marked duplicate and are not used for duplicate marking of other paired-end reads.  Single-end reads with the same key as paired-end reads whose mate cannot be found are still marked as duplicate.  If this error is encountered, you may want to fix the mate information and reprocess the file through the deduper.  Use the [[#Treat Reads with Mates On Different Chromosomes As Single-Ended (--oneChrom)|<code>--oneChrom</code>]] option to treat reads with a mate on a different chromosome as single-ended.  This option is useful if you are running the deduper on just a single chromosome.
+
The handling of paired-end reads assumes that the mate information in the SAM/BAM records is accurate.  If a mate is not found at the expected position, an error message is printed (once per file) indicating this error.  Paired-end reads whose mate cannot be found are not marked duplicate and are not used for duplicate marking of other paired-end reads.  Single-end reads with the same key as paired-end reads whose mate cannot be found are still marked as duplicate.  If this error is encountered, you may want to fix the mate information and reprocess the file through the deduper.   
 +
 
 +
Use the [[#Treat Reads with Mates On Different Chromosomes As Single-Ended (--oneChrom)|<code>--oneChrom</code>]] option to treat reads with a mate on a different chromosome as single-ended.  This option is useful if you are running the deduper on just a single chromosome.  When mates are found on different chromosomes, the deduper uses less memory with this option.
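
As a rough illustration of the <code>--oneChrom</code> rule, the following Python sketch decides whether a read is handled as paired-end or single-ended. The function <code>treat_as_paired</code> and its parameters are hypothetical names chosen for this example, and the deduper's real decision logic may take further conditions into account.

```python
PAIRED_FLAG = 0x1  # SAM flag bit for "read is paired in sequencing"


def treat_as_paired(flag, ref_id, mate_ref_id, one_chrom=False):
    """Decide whether a read is handled as paired-end (hypothetical helper).

    ref_id and mate_ref_id are the reference (chromosome) indices of the
    read and of its mate. With one_chrom set, a read whose mate lies on a
    different chromosome is treated as single-ended.
    """
    if not flag & PAIRED_FLAG:
        return False  # single-end read
    if one_chrom and ref_id != mate_ref_id:
        return False  # mate is on another chromosome
    return True
```

With <code>one_chrom=True</code>, a paired read on chromosome index 0 whose mate is on index 3 is treated as single-ended, so no mate lookup is attempted for it.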
      Line 41: Line 43:  
   
 
   
 
This code assumes that at most 1000 bases are clipped at the start of a read.
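
To make the clipping assumption concrete, here is a small Python sketch that counts the bases clipped at the start of a read from its CIGAR string and compares the count with the 1000-base limit. The helper names are hypothetical, and how the deduper itself applies the limit is not stated here, so the sketch only shows the quantity involved.

```python
import re

MAX_START_CLIP = 1000  # limit stated in the text above

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def leading_clip(cigar):
    """Count soft- or hard-clipped bases at the start of a CIGAR string."""
    clipped = 0
    for length, op in _CIGAR_OP.findall(cigar):
        if op not in "SH":
            break  # first non-clip operation ends the leading clip
        clipped += int(length)
    return clipped


def unclipped_start(pos, cigar):
    """Alignment start position extended back over the leading clip."""
    return pos - leading_clip(cigar)


def within_clip_limit(cigar):
    """True if the leading clip respects the assumed 1000-base limit."""
    return leading_clip(cigar) <= MAX_START_CLIP
```

For example, a read at position 100 with CIGAR <code>5S95M</code> has 5 clipped bases at its start and an unclipped start of 95.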
      
==Handling Recalibration==
