Changes

LibStatGen: FASTQ (view source)

Revision as of 17:44, 3 February 2010

1,227 bytes added , 17:44, 3 February 2010

no edit summary

Line 1: Line 1: +

== Validation Criteria ==

+

=== Sequence Identifier Line ===

+

*Every entry in the file should have a unique identifier.
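The uniqueness check can be sketched as follows. This is a minimal illustration, not the validator's own code: it assumes each record occupies exactly four lines and that the identifier is the text between the leading @ and the first space. The function names <code>read_identifiers</code> and <code>find_duplicate_identifiers</code> are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_identifiers(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, identifier) for each record.

    Assumption: every record is exactly four lines long.
    """
    for index, line in enumerate(lines):
        if index % 4 != 0:
            continue  # only the first line of each record holds the identifier
        header = line.rstrip("\r\n")
        if header.startswith("@"):
            header = header[1:]
        # Assumption: anything after the first space is a description.
        yield index + 1, header.split(" ", 1)[0]


def find_duplicate_identifiers(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (identifier, first_line, repeated_line) for every repeat."""
    first_seen: Dict[str, int] = {}
    duplicates: List[Tuple[str, int, int]] = []
    for line_number, identifier in read_identifiers(lines):
        if identifier in first_seen:
            duplicates.append((identifier, first_seen[identifier], line_number))
        else:
            first_seen[identifier] = line_number
    return duplicates
```

Because every identifier is kept in a dictionary, memory use grows with the number of records in the file.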

+

=== Raw Sequence Line ===

+

*A base sequence should have non-zero length.

+

*Base sequences should contain only the characters allowed by the configured sequence type:

+

** Base Only: A C T G N a c t g n

+

** Color Space Only: 0 1 2 3 . (period)

+

** Base or Color Space: A C T G N a c t g n 0 1 2 3 . (period)

+

*Reads should meet a minimum length; many mappers have trouble with very short reads.
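The raw sequence checks above can be sketched as follows. The character sets are taken from the list above; the mode names (<code>"base"</code>, <code>"color"</code>, <code>"either"</code>), the <code>min_length</code> parameter and the function name are hypothetical and do not correspond to actual fastQValidator options.

```python
from typing import List

BASE_CHARS = frozenset("ACTGNactgn")
COLOR_CHARS = frozenset("0123.")

# Hypothetical mode names mapped to the three character sets listed above.
ALLOWED = {
    "base": BASE_CHARS,
    "color": COLOR_CHARS,
    "either": BASE_CHARS | COLOR_CHARS,
}


def check_raw_sequence(sequence: str, mode: str = "either",
                       min_length: int = 1) -> List[str]:
    """Return a list of error messages; an empty list means the line passed."""
    errors: List[str] = []
    if len(sequence) == 0:
        errors.append("sequence has zero length")
    elif len(sequence) < min_length:
        errors.append(
            f"sequence length {len(sequence)} is below minimum {min_length}"
        )
    # Report each disallowed character once, in sorted order.
    invalid = sorted(set(sequence) - ALLOWED[mode])
    if invalid:
        errors.append("invalid characters: " + "".join(invalid))
    return errors
```

For example, <code>check_raw_sequence("ACGTN", "base")</code> returns an empty list, while a color space read checked in <code>"base"</code> mode reports its digits and period as invalid.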

+

=== Plus Line ===

+

=== Quality String Line ===

+

*A quality string should be present for every base sequence.

+

*Paired quality and base sequences should be of the same length.

+

*Every character in a quality string should have an ASCII code greater than 32.
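The three quality string criteria can be sketched in one function. This is an illustration only; the function name and the wording of the messages are hypothetical.

```python
from typing import List


def check_quality_string(sequence: str, quality: str) -> List[str]:
    """Return a list of error messages for one sequence/quality pair."""
    errors: List[str] = []
    if len(quality) == 0:
        errors.append("quality string is missing")
    elif len(quality) != len(sequence):
        errors.append(
            f"quality length {len(quality)} does not match "
            f"sequence length {len(sequence)}"
        )
    # ASCII 32 is the space character; it and everything below it are rejected.
    bad_positions = [i + 1 for i, ch in enumerate(quality) if ord(ch) <= 32]
    if bad_positions:
        errors.append(
            f"quality characters with ASCII code <= 32 at positions {bad_positions}"
        )
    return errors
```

Positions in the last message are 1-based, so a space as the third quality character is reported at position 3.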

+

== Additional Features ==

+

*Base composition is tracked and reported by position.

+

*Consumes gzipped and uncompressed text files transparently (see libcsg/InputFile.h).
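Both features can be illustrated with a short sketch. It does not use or reproduce libcsg/InputFile.h: <code>open_text</code> is a hypothetical helper that detects gzip input by its two-byte magic number, and <code>base_composition</code> is a hypothetical function that folds lower-case bases into upper case before counting, which is an assumption about how the report groups them.

```python
import gzip
from collections import Counter
from typing import Iterable, List, TextIO


def open_text(path: str) -> TextIO:
    """Open a plain or gzip-compressed text file for reading.

    Detection relies on the gzip magic bytes, not on the file extension.
    """
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def base_composition(sequences: Iterable[str]) -> List[Counter]:
    """Return one Counter per read position, counting the bases seen there."""
    counts: List[Counter] = []
    for sequence in sequences:
        # Grow the table when a read is longer than any seen so far.
        while len(counts) < len(sequence):
            counts.append(Counter())
        for position, base in enumerate(sequence):
            counts[position][base.upper()] += 1
    return counts
```

Entry <code>counts[0]</code> holds the counts for the first base of every read, <code>counts[1]</code> for the second, and so on; shorter reads simply do not contribute to later positions.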

+

== Additional Wishlist - Not Implemented ==

+

*To reduce memory usage, implement a two-pass algorithm that keeps only a key for each sequence name in memory rather than the complete name. Suggested options: -1 for one pass with high memory use (the default), -2 for two passes with lower memory use.
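One possible reading of this wishlist item is sketched below. Since the feature is not implemented, everything here is an assumption: the key is taken to be an 8-byte hash of the name, the first pass records which keys repeat, and the second pass stores full names only for those repeated keys so that hash collisions are not reported as duplicates. The names <code>name_key</code> and <code>find_duplicates_two_pass</code> are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Set


def name_key(name: str) -> int:
    """Reduce a sequence name to a 64-bit integer key (assumed key scheme)."""
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def find_duplicates_two_pass(read_names: Callable[[], Iterable[str]]) -> Set[str]:
    """Find duplicated names; read_names() must restart the input on each call."""
    # Pass 1: keep only keys, and remember which keys occur more than once.
    seen_keys: Set[int] = set()
    repeated_keys: Set[int] = set()
    for name in read_names():
        key = name_key(name)
        if key in seen_keys:
            repeated_keys.add(key)
        else:
            seen_keys.add(key)
    seen_keys.clear()  # no longer needed once the repeated keys are known

    # Pass 2: store full names only when their key repeated in pass 1.
    candidates: Set[str] = set()
    duplicates: Set[str] = set()
    for name in read_names():
        if name_key(name) in repeated_keys:
            if name in candidates:
                duplicates.add(name)
            else:
                candidates.add(name)
    return duplicates
```

The sketch shows the control flow rather than the memory saving itself; how much memory a key saves over a full name depends on the implementation language and on the length of the names.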

+

== Assumptions ==

+

== How to Use the fastQValidator Executable ==

'''Required Parameters:'''

Line 22: Line 52:

== FastQ Validator Output ==

−

~~The FastQ Validator~~

+

'''Coming Soon'''

Mktrost

