Changes

Karma (view source)

Revision as of 14:04, 8 April 2010

2,684 bytes added, 14:04, 8 April 2010

lots of changes for karma 0.9

Line 1: Line 1:

[[Category:Software]]

−

~~Karma is top secret. Shh!~~

+

'''K-tuple Alignment with Rapid Matching Algorithm'''

−

~~= Download =~~

+

Karma aligns short reads, such as those generated by Illumina sequencers, against an existing reference.

−

To get a ~~bootleg~~ copy go to [http://www.sph.umich.edu/csg/pha/karma/download/ Karma Download]

+

The current version, 0.9.0, is optimized to rapidly map base space reads from Illumina sequencers. This version does not map color space reads, nor does it reliably map LS454 reads. Both of those features will return in Karma 0.9.1.

+

= Download Karma =

+

To get a copy go to [http://www.sph.umich.edu/csg/pha/karma/download/ Karma Download]

+

= Build Karma =

+

== Dependencies ==

+

== Building ==

+

== Testing the build ==

+

= Normal Workflow =

+

Karma works using a set of index and hash files created from an existing reference. Once created, this set of reference index and hash files must always be specified in the command line when aligning reads.

+

In concept, the simplest workflow is to first create a reference index using 'karma create', then align reads using 'karma map'. You only have to build the index and hash once.

+

Because the reference can be large, and because Karma shares the reference among all running instances of Karma, it is useful to put well-known references in a common location readily accessible to you and your collaborators.

= Build Reference =

+

Building a reference with Karma is straightforward, but because it is time-consuming for longer genomes, you typically save the reference index between runs.

+

The simplest example, which creates a binary reference and its index using a word size of 11, is:

+

karma create -i -w 11 phiX.fa

+

More generally, three primary parameters are necessary for building a Karma reference index:

+

# a boolean flag indicating base or color space

+

# the index table word occurrence cutoff value

+

# the word size

+

Although the input reference is always expected to be base space and in FASTA format, the binary version of the reference, and the corresponding index and hash files, can be in either color space (ABI SOLiD) or base space (Illumina or LS454). For a given reference FASTA file, you may have either a color or base space binary reference, as well as either color or base space index/hash files.
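To make the base space versus color space distinction concrete, here is a minimal sketch of the standard SOLiD two-base encoding, in which each color is derived from a pair of adjacent bases. It is an illustration only and is not taken from Karma's source; the names `BASE_CODE` and `to_color_space` are hypothetical.

```python
from typing import List

# Illustrative only: standard SOLiD two-base encoding, not Karma's own code.
# Each base gets a 2-bit code; a color is the XOR of two adjacent base codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_color_space(bases: str) -> List[int]:
    """Convert a base space sequence to its list of color space calls (0-3)."""
    codes = [BASE_CODE[b] for b in bases.upper()]
    # A sequence of n bases yields n - 1 colors, one per adjacent pair.
    return [left ^ right for left, right in zip(codes, codes[1:])]


if __name__ == "__main__":
    print(to_color_space("ATGC"))
```

Because every color depends on two bases, a color space sequence cannot be compared directly against a base space reference, which is why a separate color space binary reference and index are useful.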

+

Because the index and hash files are dependent on the occurrence cutoff parameter and the word size, the output files created by karma have those values in the file name. This allows you to create a variety of index/hash tables, depending on your expected use (ABI SOLiD, in particular, is sensitive to read length).
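The role of the two parameters can be sketched with a toy word index. This is a conceptual illustration, not Karma's actual data structure or file format: the function name `build_word_index` is hypothetical, and the handling of over-cutoff words (dropping their positions and recording them separately) is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def build_word_index(
    reference: str, word_size: int, occurrence_cutoff: int
) -> Tuple[Dict[str, List[int]], Set[str]]:
    """Map each word of length word_size to its start positions in reference.

    Assumption for this sketch: words that occur more often than
    occurrence_cutoff are left out of the positions table and returned
    separately, so highly repetitive words do not bloat the table.
    """
    positions: Dict[str, List[int]] = defaultdict(list)
    for start in range(len(reference) - word_size + 1):
        positions[reference[start:start + word_size]].append(start)

    kept = {w: p for w, p in positions.items() if len(p) <= occurrence_cutoff}
    over_cutoff = {w for w, p in positions.items() if len(p) > occurrence_cutoff}
    return kept, over_cutoff


if __name__ == "__main__":
    index, over_cutoff = build_word_index("ACGTACGTAC", word_size=4, occurrence_cutoff=1)
    print(index)
    print(sorted(over_cutoff))
```

Changing either the word size or the cutoff changes the contents of the table, which is consistent with Karma encoding both values in the output file names.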

+

== Options for building reference ==

+

-w ''word size'' Word size for index and hash (default 15, typically 10-16)

+

-O ''occurrence cutoff'' Upper limit on the number of word positions to store in the word positions table (default 5000)

+

-c Create a color space reference and index/hash

+

-i Create the index and hash as well as the binary reference
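For scripted use, the options above can be assembled into a command line. The following is a minimal sketch, assuming the flags may be combined in this order and that -O takes its value as the next argument in the same way -w does in the example above; the helper name `build_create_command` is hypothetical and not part of Karma.

```python
from typing import List, Optional


def build_create_command(
    fasta: str,
    word_size: int = 15,
    occurrence_cutoff: Optional[int] = None,
    color_space: bool = False,
    build_index: bool = True,
) -> List[str]:
    """Assemble an argument list for 'karma create' from the documented options.

    Assumption: flag order and combination are as shown; check against your
    installed Karma version before relying on it.
    """
    cmd = ["karma", "create"]
    if build_index:
        cmd.append("-i")  # also create the index and hash
    if color_space:
        cmd.append("-c")  # color space reference and index/hash
    cmd += ["-w", str(word_size)]
    if occurrence_cutoff is not None:
        cmd += ["-O", str(occurrence_cutoff)]
    cmd.append(fasta)
    return cmd


if __name__ == "__main__":
    # Reproduces the phiX example: karma create -i -w 11 phiX.fa
    print(" ".join(build_create_command("phiX.fa", word_size=11)))
```

The resulting list can be passed to `subprocess.run` on a machine where Karma is installed; leaving `occurrence_cutoff` unset omits -O so that Karma's own default applies.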

+

== Options ==

Pha

75

edits
