From Genome Analysis Wiki


1,001 bytes added, 10:52, 19 May 2011
<span style="color:#D2691E">'''***Coming Soon***'''</span>
The <code>diff</code> option on the bam executable prints the differences between two coordinate-sorted SAM/BAM files. It can be used to compare the outputs of running a SAM/BAM file through different tools or different versions of a tool.

<code>diff</code> compares records that have the same Read Name and Fragment (determined from the flag). If no record with a matching Read Name and Fragment is found, the record is considered to be different.

<code>diff</code> assumes that both files are coordinate sorted and relies on this to decide how long to store a record before concluding that the other file does not contain a matching Read Name/Fragment. If the files are not coordinate sorted, this logic does not work.

By default, only the chromosome/position and cigar are compared for each record. Options are available to:
* compare the sequence
* compare the base quality
* compare specified tags
* turn off position comparison
* turn off cigar comparison
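
The matching and default comparison described above can be sketched in Python. This is only an illustration, not the tool's implementation: the <code>Record</code> type and the functions <code>match_key</code> and <code>fields_differ</code> are hypothetical names, and deriving the fragment from the SAM flag bits 0x40 (first fragment) and 0x80 (last fragment) is an assumption about how the flag is interpreted.

```python
from typing import List, NamedTuple, Tuple

# SAM flag bits for the fragment's place in the template (from the SAM spec).
FIRST_FRAGMENT = 0x40
LAST_FRAGMENT = 0x80


class Record(NamedTuple):
    """Hypothetical minimal view of a SAM/BAM record."""
    name: str
    flag: int
    chrom: str
    pos: int
    cigar: str


def match_key(rec: Record) -> Tuple[str, int]:
    """Key used to pair records: read name plus fragment bits of the flag.

    Assumption: the fragment is taken from the 0x40/0x80 flag bits.
    """
    return (rec.name, rec.flag & (FIRST_FRAGMENT | LAST_FRAGMENT))


def fields_differ(a: Record, b: Record,
                  check_pos: bool = True,
                  check_cigar: bool = True) -> List[str]:
    """Return the names of the compared fields that differ.

    Mirrors the documented default: chromosome/position and cigar are
    compared unless the corresponding comparison is turned off.
    """
    diffs = []
    if check_pos and (a.chrom, a.pos) != (b.chrom, b.pos):
        diffs.append("pos")
    if check_cigar and a.cigar != b.cigar:
        diffs.append("cigar")
    return diffs
```

Two records are only compared field by field when <code>match_key</code> returns the same value for both; turning a comparison off simply skips that field.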
=== Parameters ===
Required Parameters:
--in1 : first coordinate sorted SAM/BAM file to be diffed
--in2 : second coordinate sorted SAM/BAM file to be diffed
Optional Parameters:
--seq : diff the sequence bases.
=== Output Format ===
There are 2 types of differences:
* The ReadName/Fragment combo is in one file, but not in the other file within the window set by recPoolSize & posDiff.
* The ReadName/Fragment combo is in both files, but at least one of the fields specified to diff is different.
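
The two difference types can be illustrated with a small Python sketch. It is a simplification under stated assumptions: both inputs are held entirely in memory as dictionaries, so the window set by recPoolSize & posDiff is not modelled, and the function name <code>classify_differences</code> and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]          # (read name, fragment)
Fields = Tuple[str, int, str]  # (chromosome, position, cigar)


def classify_differences(
    file1: Dict[Key, Fields], file2: Dict[Key, Fields]
) -> Tuple[List[Key], List[Key], List[Key]]:
    """Split the differences into the two documented types.

    Returns (only_in_1, only_in_2, changed):
      * only_in_1 / only_in_2: key present in one file but not the other
      * changed: key present in both, but the compared fields differ
    Simplification: no recPoolSize/posDiff window, everything is in memory.
    """
    only_in_1 = sorted(k for k in file1 if k not in file2)
    only_in_2 = sorted(k for k in file2 if k not in file1)
    changed = sorted(k for k in file1 if k in file2 and file1[k] != file2[k])
    return only_in_1, only_in_2, changed
```

Records that appear in both inputs with identical compared fields fall into none of the three lists, which corresponds to producing no diff output for them.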
== readReference ==
