Line 6: |
Line 6: |
| This command line tool can be downloaded as part of the library: http://genome.sph.umich.edu/wiki/Software#Download | | This command line tool can be downloaded as part of the library: http://genome.sph.umich.edu/wiki/Software#Download |
| | | |
− | Note: Since the FastQValidator checks for unique sequence names, it may use a large amount of memory. | + | Note: Since the FastQValidator checks for unique sequence names, it may use a large amount of memory. This check can be disabled by specifying the --disableSeqIDCheck option. |
| | | |
| == Valid FastQ File Requirements == | | == Valid FastQ File Requirements == |
Line 33: |
Line 33: |
| overwrites the printableErrors option. | | overwrites the printableErrors option. |
| --baseComposition : Print the Base Composition Statistics. | | --baseComposition : Print the Base Composition Statistics. |
| + | --disableSeqIDCheck : Disable the unique sequence identifier check. |
| + | Use this option to save memory since the sequence id |
| + | check uses a lot of memory. |
| + | Does not affect the printing of Base Composition Statistics. |
| --quiet : Suppresses the display of errors and summary statistics. | | --quiet : Suppresses the display of errors and summary statistics. |
| Does not affect the printing of Base Composition Statistics. | | Does not affect the printing of Base Composition Statistics. |
Line 42: |
Line 46: |
| | | |
| === Usage === | | === Usage === |
− | ./fastQValidator --file <fileName> [--minReadLen <minReadLen>] [--maxErrors <numErrors>] [--printableErrors <printableErrors>|--ignoreErrors] [--baseSpace|--colorSpace|--auto] [--baseComposition] [--quiet]
| + | ./fastQValidator --file <fileName> [--minReadLen <minReadLen>] [--maxErrors <numErrors>] [--printableErrors <printableErrors>|--ignoreErrors] [--baseComposition] [--disableSeqIDCheck] [--quiet] [--baseSpace|--colorSpace|--auto] [--params] |
| | | |
| === Examples === | | === Examples === |
Line 56: |
Line 60: |
| | | |
| == FastQ Validator Output == | | == FastQ Validator Output == |
− | When running the fastQValidator Executable, the output starts with a summary of the parameters: | + | When the fastQValidator executable is run with the --params option, the output starts with a summary of the parameters: |
| | | |
− | The following parameters are in effect: | + | The following parameters are available. Ones with "[]" are in effect: |
| | | |
| Input Parameters | | Input Parameters |
− | --file [testFile.txt], --baseComposition [ON], --quiet, --minReadLen [10],
| + | --file [../fastqValidator/test/testFile.txt], --baseComposition, |
| + | --disableSeqIDCheck, --quiet, --params [ON], --minReadLen [10], |
| --maxErrors [-1] | | --maxErrors [-1] |
− | Space Type : --baseSpace [ON], --colorSpace, --auto | + | Space Type : --baseSpace, --colorSpace, --auto [ON] |
− | Errors : --ignoreErrors, --printableErrors [100] | + | Errors : --ignoreErrors, --printableErrors [20] |
| | | |
| The Validator Executable outputs error messages for invalid sequences based on [[C++ Class: FastQFile#Validation Criteria Used For Reading a Sequence|Validation Criteria]]. For example: | | The Validator Executable outputs error messages for invalid sequences based on [[C++ Class: FastQFile#Validation Criteria Used For Reading a Sequence|Validation Criteria]]. For example: |
Line 105: |
Line 110: |
| There is a series of optional capabilities a FastQ Validator could implement. Among those: | | There is a series of optional capabilities a FastQ Validator could implement. Among those: |
| | | |
− | *Add option to disable the unique sequence name validation so it does not store all the sequence names.
| |
| *To reduce memory usage, implement a two-pass algorithm that stores only a key for each sequence name (rather than complete sequence names) in memory (suggest a pair of options -1 -> one pass, high memory use, -2 -> two pass lower memory use, default is -1). | | *To reduce memory usage, implement a two-pass algorithm that stores only a key for each sequence name (rather than complete sequence names) in memory (suggest a pair of options -1 -> one pass, high memory use, -2 -> two pass lower memory use, default is -1). |
| *Report average read quality score. | | *Report average read quality score. |
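The two-pass idea in the list above could look roughly like the following Python sketch. It is not part of fastQValidator and is only one possible reading of the suggestion. It assumes every record is exactly four lines, that the sequence identifier is the text after '@' up to the first space, and that an 8-byte BLAKE2b digest is an acceptable key. The names <code>sequence_names</code>, <code>find_duplicate_names</code> and <code>open_lines</code> are hypothetical.

```python
import hashlib
from collections import Counter


def _key(name):
    # Compact 8-byte stand-in for the full sequence name (assumed key choice).
    return hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()


def sequence_names(lines):
    """Yield the identifier from the first line of each record.

    Assumes four-line records and that the identifier ends at the first space.
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield line.rstrip("\r\n")[1:].split(" ", 1)[0]


def find_duplicate_names(open_lines):
    """Return the sorted duplicate names; open_lines() must give a fresh iterator."""
    # Pass 1: count keys only, so no full sequence name is kept in memory.
    counts = Counter(_key(name) for name in sequence_names(open_lines()))
    suspects = {key for key, count in counts.items() if count > 1}

    # Pass 2: keep full names only for records whose key was seen more than
    # once, which separates real duplicates from key collisions.
    seen, duplicates = set(), set()
    for name in sequence_names(open_lines()):
        if _key(name) in suspects:
            if name in seen:
                duplicates.add(name)
            seen.add(name)
    return sorted(duplicates)
```

The function takes a callable rather than an open file so that the input can be read twice, for example <code>find_duplicate_names(lambda: open(path))</code>. Pass 1 still keeps one key per distinct name, so the saving depends on how much longer the names are than the keys.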