From Genome Analysis Wiki
207 bytes added, 12:02, 25 October 2010
Line 5: |
Line 5: |
| | | |
| This command line tool can be found at: http://www.sph.umich.edu/csg/mktrost/fastQFile/ | | This command line tool can be found at: http://www.sph.umich.edu/csg/mktrost/fastQFile/ |
| + | |
| + | Note: Since the FastQValidator checks for unique sequence names, it may use a large amount of memory. |
| | | |
| == Valid FastQ File Requirements == | | == Valid FastQ File Requirements == |
Line 104: |
Line 106: |
| There are a series of optional capabilities a FastQ Validator could implement. Among those: | | There are a series of optional capabilities a FastQ Validator could implement. Among those: |
| | | |
| + | *Add option to disable the unique sequence name validation so it does not store all the sequence names. |
| *To reduce memory usage, implement a two-pass algorithm that stores only a key for each sequence name (rather than complete sequence names) in memory (suggest a pair of options -1 -> one pass, high memory use, -2 -> two pass lower memory use, default is -1). | | *To reduce memory usage, implement a two-pass algorithm that stores only a key for each sequence name (rather than complete sequence names) in memory (suggest a pair of options -1 -> one pass, high memory use, -2 -> two pass lower memory use, default is -1). |
| *Report average read quality score. | | *Report average read quality score. |
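
The memory trade-off behind the two uniqueness-check proposals above can be sketched as follows. This is a minimal illustration, not the FastQValidator implementation. It assumes 4-line FastQ records and takes the sequence name to be the text after `@` up to the first space. All function names (`read_names`, `name_key`, `duplicates_one_pass`, `duplicates_two_pass`) are hypothetical, and the 64-bit BLAKE2 hash is just one possible choice of "key".

```python
import hashlib
from collections import Counter


def read_names(lines):
    """Yield sequence names, assuming 4-line records (an assumption of this sketch)."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            # Drop the leading '@' and anything after the first space.
            yield line.rstrip("\n")[1:].split(" ")[0]


def name_key(name):
    """Map a sequence name to a fixed-size 64-bit key (hypothetical key choice)."""
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def duplicates_one_pass(names):
    """One pass: keeps every complete name in memory (high memory use)."""
    seen, dups = set(), set()
    for name in names:
        if name in seen:
            dups.add(name)
        else:
            seen.add(name)
    return dups


def duplicates_two_pass(open_names):
    """Two passes: stores only keys, plus the full names whose key repeated.

    `open_names` is a callable returning a fresh iterator over the names,
    because the input has to be read twice.
    """
    # Pass 1: record keys only and note which keys occur more than once.
    seen_keys, repeated_keys = set(), set()
    for name in open_names():
        key = name_key(name)
        if key in seen_keys:
            repeated_keys.add(key)
        else:
            seen_keys.add(key)
    del seen_keys  # no longer needed once the repeated keys are known

    # Pass 2: keep full names only for repeated keys, so a hash collision
    # between two different names is not reported as a duplicate.
    counts = Counter()
    for name in open_names():
        if name_key(name) in repeated_keys:
            counts[name] += 1
    return {name for name, count in counts.items() if count > 1}
```

A disable option would simply skip both functions, so no names or keys are stored at all. How much memory the two-pass variant actually saves depends on the language and on how long the sequence names are; in this Python sketch the saving is modest, since each stored key is itself a Python object.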