Changes

From Genome Analysis Wiki
70 bytes added, 15:13, 8 April 2010
Line 123: Line 123:  
= Karma Performance Tuning =
 
= Karma Performance Tuning =
   −
There are four components to the Karma index and hash.  A pure index, based on an N-mer word index.  This is used as a pointer into a word positions table, which is an ordered list of genome positions in which that N-mer word appears.  There is a cap called the ''occurrence cutoff'', which once crossed, causes that index word to be marked as a high repeat pattern.  Once marked as high repeat, the N-mer word is now combined with both the N-mer word preceding it, as well as the N-mer word succeeding it to create a 2 * N-mer word hash key.  Two hash tables are populated, a left and a right hash.
+
The Karma index and hash have four components.  The first is a pure index array, based on an N-mer word index.  Each entry is used as a pointer into a word positions table, which is an ordered list of the genome positions at which that N-mer word appears.  There is a cap called the ''occurrence cutoff''; once the number of occurrences of an index word exceeds it, that word is marked as a high repeat pattern.  Once marked as high repeat, the N-mer word is instead combined with the N-mer word preceding it, as well as with the N-mer word succeeding it, to create a 2 * N-mer word hash key.  Two hash tables are populated, a left hash and a right hash.  These are then used when that pattern is found in a read.
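The layout described above can be sketched in a few lines of Python. This is a toy illustration only, not Karma's actual implementation: the function name `build_index`, the use of dictionaries in place of packed arrays, and the reading of "preceding" and "succeeding" word as the adjacent, non-overlapping N bases on either side are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(genome, n, occurrence_cutoff):
    """Toy sketch of an N-mer index with left/right hashes for high repeat words.

    Hypothetical helper for illustration; it is not part of Karma.
    """
    # Word positions table: each N-mer word maps to the ordered list of
    # genome positions at which it appears.
    positions = defaultdict(list)
    for i in range(len(genome) - n + 1):
        positions[genome[i:i + n]].append(i)

    # A word whose occurrence count exceeds the cutoff is marked high repeat.
    high_repeat = {w for w, p in positions.items() if len(p) > occurrence_cutoff}

    # For high repeat words, build 2 * N-mer keys from the neighbouring words.
    # Assumption: the neighbours are the adjacent, non-overlapping N bases.
    left_hash = defaultdict(list)
    right_hash = defaultdict(list)
    for word in high_repeat:
        for pos in positions[word]:
            if pos - n >= 0:
                # Preceding word + this word.
                left_hash[genome[pos - n:pos + n]].append(pos)
            if pos + 2 * n <= len(genome):
                # This word + succeeding word.
                right_hash[genome[pos:pos + 2 * n]].append(pos)

    # Words at or under the cutoff stay in the plain index.
    index = {w: p for w, p in positions.items() if w not in high_repeat}
    return index, high_repeat, dict(left_hash), dict(right_hash)
```

Calling `build_index("ACACACGT", 2, 2)` marks `AC` as high repeat, since it occurs three times against a cutoff of two, and moves it out of the plain index into the left and right hashes under 4-mer keys. The other words remain directly addressable through the index.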
    
== Index Word Size ==
 
== Index Word Size ==
