Changes

From Genome Analysis Wiki
60 bytes removed, 14:16, 22 February 2010
no edit summary
Line 7: Line 7:     
A valid fastQ file meets the validation criteria specified in [[FastQFile#Validation Criteria Used For Reading a Sequence|FastQ File Validation]].
 
 +
 +
 +
== How to Use the fastQValidator Executable ==
 +
'''Required Parameters:'''
 +
        --file  :  FastQ filename with path to be processed.
 +
 +
'''Optional Parameters:'''
 +
        --minReadLen        : Minimum allowed read length (Defaults to 10).
 +
        --maxReportedErrors : Maximum number of errors to display before suppressing them (Defaults to 20).
 +
        --ignoreAllErrors  : Ignore all errors (same as --maxReportedErrors 0); overrides the --maxReportedErrors option.
 +
 +
'''Optional Space Options for Raw Sequence (Last one specified is used):'''
 +
        --autoDetect : Determine baseSpace/colorSpace from the Raw Sequence in the file (Default).
 +
        --baseSpace  : ACTGN only
 +
        --colorSpace : 0123. only
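
The character sets above can be illustrated with a minimal Python sketch. It is not the validator's own code: the function names are hypothetical, and deciding the space from the first raw sequence character is an assumption taken from the description of the older -b option, not a confirmed description of --autoDetect.

```python
# Character sets as listed for --baseSpace and --colorSpace.
BASE_CHARS = set("ACTGN")
COLOR_CHARS = set("0123.")


def detect_space(raw_sequence):
    """Guess 'base' or 'color' from the first character (assumed rule).

    Returns 'unknown' for an empty sequence or an unrecognized character.
    The comparison is case-sensitive, which is also an assumption.
    """
    if not raw_sequence:
        return "unknown"
    first = raw_sequence[0]
    if first in BASE_CHARS:
        return "base"
    if first in COLOR_CHARS:
        return "color"
    return "unknown"


def is_valid_for_space(raw_sequence, space):
    """Check that every character belongs to the chosen space."""
    allowed = BASE_CHARS if space == "base" else COLOR_CHARS
    return all(ch in allowed for ch in raw_sequence)
```

For example, detect_space("ACGT") yields "base", and is_valid_for_space("AC0T", "base") is False because "0" is not a base character.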
 +
 +
'''Usage:'''
 +
        ./fastQValidator --file <fileName> [--minReadLen <minReadLen>] [--maxReportedErrors <maxReportedErrors>|--ignoreAllErrors] [--baseSpace|--colorSpace|--autoDetect]
 +
 +
'''Examples:'''
 +
        ../fastQValidator --file testFile.txt
 +
        ../fastQValidator --file testFile.txt --minReadLen 10 --baseSpace --maxReportedErrors 100
 +
        ./fastQValidator --file test/testFile.txt --minReadLen 10 --colorSpace --ignoreAllErrors
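
When the validator is driven from a script, the argument list can be assembled programmatically. The following Python sketch only uses the options documented above; the helper name build_command and its keyword arguments are hypothetical, and the default executable path is an assumption.

```python
def build_command(file_name, min_read_len=None, max_reported_errors=None,
                  ignore_all_errors=False, space=None,
                  executable="./fastQValidator"):
    """Build an argument list for fastQValidator (hypothetical helper)."""
    cmd = [executable, "--file", file_name]
    if min_read_len is not None:
        cmd += ["--minReadLen", str(min_read_len)]
    # --ignoreAllErrors takes precedence over --maxReportedErrors,
    # so only one of the two is emitted.
    if ignore_all_errors:
        cmd.append("--ignoreAllErrors")
    elif max_reported_errors is not None:
        cmd += ["--maxReportedErrors", str(max_reported_errors)]
    if space is not None:
        if space not in ("baseSpace", "colorSpace", "autoDetect"):
            raise ValueError("unknown space option: " + space)
        cmd.append("--" + space)
    return cmd
```

The returned list can be passed to subprocess.run from the Python standard library to launch the executable.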
 +
 +
 +
== FastQ Validator Output ==
 +
When running the fastQValidator Executable, the output starts with a summary of the parameters:
 +
 +
The following parameters are in effect:
 +
 +
Input Parameters
 +
--file [testFile.txt], --minReadLen [10]
 +
  Space Type : --baseSpace [ON], --colorSpace, --autoDetect
 +
      Errors : --ignoreAllErrors, --maxReportedErrors [100]
 +
 +
The Validator Executable outputs error messages for invalid sequences based on [[FastQFile#Validation Criteria Used For Reading a Sequence|Validation Criteria]].
 +
For example:
 +
ERROR on Line 25: The sequence identifier line was too short.
 +
ERROR on Line 29: First line of a sequence does not begin wtih @
 +
ERROR on Line 33: No Sequence Identifier specified before the comment.
 +
 +
Base Composition Percentages by Index:
 +
 +
Base Composition Statistics:
 +
Read Index      %A      %C      %G      %T      %N  Total Reads At Index
 +
         0  100.00    0.00    0.00    0.00    0.00  20
 +
         1    5.00   95.00    0.00    0.00    0.00  20
 +
         2    5.00    0.00    5.00   90.00    0.00  20
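
As an illustration of how a table like this can be derived, here is a minimal Python sketch that computes per-index base percentages from a list of raw sequences. It is not the validator's implementation; the function name base_composition is hypothetical, and counting every read that reaches an index toward the total is an assumption.

```python
from collections import Counter


def base_composition(reads):
    """Return (index, {base: percent}, total_reads_at_index) per read index."""
    rows = []
    max_len = max((len(r) for r in reads), default=0)
    for i in range(max_len):
        # Only reads long enough to have a character at index i contribute.
        counts = Counter(r[i] for r in reads if len(r) > i)
        total = sum(counts.values())
        percents = {b: 100.0 * counts.get(b, 0) / total for b in "ACGTN"}
        rows.append((i, percents, total))
    return rows


if __name__ == "__main__":
    for index, percents, total in base_composition(["AC", "AT", "A"]):
        cells = " ".join("%7.2f" % percents[b] for b in "ACGTN")
        print("%10d %s  %d" % (index, cells, total))
```

Shorter reads simply stop contributing at later indexes, which is why the total is tracked per index.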
 +
 +
 +
Summary of the number of lines, sequences, and errors:
 +
Finished processing testFile.txt with 92 lines containing 20 sequences.
 +
There were a total of 17 errors.
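
Scripts that post-process the validator's output may want these counts as numbers. The Python sketch below parses the two summary lines, assuming the wording matches the sample above exactly; the function name parse_summary and the returned keys are hypothetical.

```python
import re

# Patterns mirror the sample summary lines; other wordings will not match.
SUMMARY_RE = re.compile(
    r"Finished processing (?P<file>.+?) with (?P<lines>\d+) lines "
    r"containing (?P<sequences>\d+) sequences\."
)
ERRORS_RE = re.compile(r"There were a total of (?P<errors>\d+) errors\.")


def parse_summary(output):
    """Extract file name, line, sequence, and error counts, or None."""
    summary = SUMMARY_RE.search(output)
    errors = ERRORS_RE.search(output)
    if summary is None or errors is None:
        return None
    return {
        "file": summary.group("file"),
        "lines": int(summary.group("lines")),
        "sequences": int(summary.group("sequences")),
        "errors": int(errors.group("errors")),
    }
```

Applied to the sample output above, this would give 92 lines, 20 sequences, and 17 errors for testFile.txt.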
 +
    
== Additional Features ==
 
Line 34: Line 88:     
* It may be useful to report 2 types of information to the user: ERROR (critical failure) and WARNING (tolerable errors).
 
  −
  −
  −
== How to Use the fastQValidator Executable ==
  −
'''Required Parameters:'''
  −
        -f  :  FastQ filename with path to be processed.
  −
  −
'''Optional Parameters:'''
  −
        -l  :  Minimum allowed read length (Defaults to 10).
  −
        -e  :  Maximum number of errors to display before suppressing them (Defaults to 20).
  −
        -b  :  Raw sequence type: "A"/"C"/"G"/"T"/"N"  - Bases only;
  −
                                  "0"/"1"/"2"/"3"/"."  - Color space only;
  −
                                  ""                  - Base Decision on the first Raw Sequence Character (Default)
  −
                                  All other characters - Bases & Color space
  −
  −
'''Testing only Parameters:'''
  −
        -t  :  If "ReadOnly" is specified, the fastq will be read but not processed.  This may be used for determining read time.
  −
'''Usage:'''
  −
        ./fastQValidator -f <fileName> -l <minReadLen> -e <maxReportedErrors> -b <rawSeqType>
  −
  −
'''Examples:'''
  −
        ../fastQValidator -f testFile.txt
  −
        ../fastQValidator -f testFile.txt -l 10 -b A -e 100
  −
        ./fastQValidator -f test/testFile.txt -l 10 -b Z -e 100
  −
        time ./fastQValidator -f test/testFile.txt -t ReadOnly
  −
  −
  −
== FastQ Validator Output ==
  −
When running the fastQValidator Executable, the output starts with a summary of the parameters:
  −
The following parameters are in effect:
  −
              FastQ File Name :    testFile.txt (-fname)
  −
              Min Read Length :              10 (-l9999)
  −
          Max Reported Errors :            100 (-e9999)
  −
                      BaseType :              A (-bname)
  −
                      TestMode :                (-tname)
  −
  −
Both the Executable and the Library output the following:
  −
*Error messages for the first configurable number of errors:
  −
ERROR on Line 25: The sequence identifier line was too short.
  −
ERROR on Line 29: First line of a sequence does not begin wtih @
  −
ERROR on Line 33: No Sequence Identifier specified before the comment.
  −
*Base Composition Percentages by Index:
  −
  −
Base Composition Statistics:
  −
Read Index      %A      %C      %G      %T      %N  Total Reads At Index
  −
         0  100.00    0.00    0.00    0.00    0.00  20
  −
         1    5.00   95.00    0.00    0.00    0.00  20
  −
         2    5.00    0.00    5.00   90.00    0.00  20
  −
*Summary of the number of lines, sequences, and errors:
  −
Finished processing testFile.txt with 92 lines containing 20 sequences.
  −
There were a total of 17 errors.
 
