70 bytes added, 15:13, 8 April 2010
Line 123:
= Karma Performance Tuning =
− There are four components to the Karma index and hash. A pure index, based on an N-mer word index. This is used as a pointer into a word positions table, which is an ordered list of genome positions in which that N-mer word appears. There is a cap called the ''occurrence cutoff'', which once crossed, causes that index word to be marked as a high repeat pattern. Once marked as high repeat, the N-mer word is now combined with both the N-mer word preceding it, as well as the N-mer word succeeding it to create a 2 * N-mer word hash key. Two hash tables are populated, a left and a right hash.

+ There are four components to the Karma index and hash. The first is a pure index array, based on an N-mer word index. Each entry is used as a pointer into a word positions table, which is an ordered list of the genome positions at which that N-mer word appears. There is a cap called the ''occurrence cutoff'' which, once exceeded, causes that index word to be marked as a high repeat pattern. Once marked as high repeat, the N-mer word is instead combined with both the N-mer word preceding it and the N-mer word succeeding it to create a 2 * N-mer word hash key. Two hash tables are populated, a left and a right hash. These are then used when that pattern is found in a read.
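
The four components described above can be illustrated with a small Python sketch. This is not Karma's implementation: the function name <code>build_index</code>, the use of dictionaries in place of a packed index array, and the reading that the left hash is keyed on (preceding word + word) while the right hash is keyed on (word + succeeding word) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(genome, n, occurrence_cutoff):
    """Toy model of an N-mer index with left/right hashes for high-repeat words.

    Hypothetical sketch only: dictionaries stand in for the index array and
    the word positions table, and the layout of the two hashes is assumed.
    """
    # Word positions: every N-mer word mapped to its ordered genome positions.
    positions = defaultdict(list)
    for i in range(len(genome) - n + 1):
        positions[genome[i:i + n]].append(i)

    # A word is high repeat once its count exceeds the occurrence cutoff.
    high_repeat = {w for w, p in positions.items() if len(p) > occurrence_cutoff}

    # High-repeat words are keyed by 2 * N-mer words instead (assumed layout).
    left_hash = defaultdict(list)
    right_hash = defaultdict(list)
    for word in high_repeat:
        for i in positions[word]:
            if i - n >= 0:  # a full preceding word exists
                left_hash[genome[i - n:i + n]].append(i)
            if i + 2 * n <= len(genome):  # a full succeeding word exists
                right_hash[genome[i:i + 2 * n]].append(i)

    # Only words at or below the cutoff stay in the plain word positions table.
    word_positions = {w: p for w, p in positions.items() if w not in high_repeat}
    return word_positions, high_repeat, dict(left_hash), dict(right_hash)


if __name__ == "__main__":
    wp, hr, left, right = build_index("ACACACGT", n=2, occurrence_cutoff=2)
    print(wp, hr, left, right)
```

In this toy model a read lookup would first try the plain word positions table and fall back to the left or right hash only when the word is marked as high repeat, so the longer 2 * N-mer key narrows the candidate positions for repetitive words.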
== Index Word Size ==