Changes

1,331 bytes added, 10:49, 2 February 2017
Line 1: Line 1: −
== How to Use the fastQValidator Executable ==
+
[[Category:C++]]
'''Required Parameters:'''
+
[[Category:libStatGen]]
        -f  : FastQ filename with path to be processed.
+
[[Category:libStatGen FASTQ]]
   −
'''Optional Parameters:'''
+
== Where to find the fastqFile Library and the FastQValidator ==
        -l  :  Minimum allowed read length (Defaults to 10).
  −
        -e  :  Maximum number of errors to display before suppressing them (Defaults to 20).
  −
        -b  :  Raw sequence type:  B - ACTGN only (Default)
  −
                                  C - 0123. only
  −
                                  BC - ACTGN or 0123.
     −
'''Testing only Parameters:'''
+
The fastQ Library is now a part of [[C++ Library: libStatGen]].
        -t  :  If "ReadOnly" is specified, the fastq will be read but not processed.  This may be used for determining read time.
  −
'''Usage:'''
  −
        ./fastQValidator -f <fileName> -l <minReadLen> -e <maxReportedErrors> -b <rawSeqType>
     −
'''Examples:'''
+
The FastQValidator is documented at [[FastQValidator]].
        ../fastQValidator -f testFile.txt
+
 
        ../fastQValidator -f testFile.txt -l 10 -b BC -e 100
+
== FASTQ Library Component for Reading and Validating FastQFiles ==
        ./fastQValidator -f test/testFile.txt -l 10 -b BC -e 100
+
The software reads and validates fastq files in both compressed and uncompressed formats.
        time ./fastQValidator -f test/testFile.txt -t ReadOnly
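
The -l and -b options above amount to a length check and an alphabet check on each raw sequence. The Python sketch below illustrates that idea only; it is not the fastQValidator source. The function name, the upper-case-only alphabets, and the reading of BC as "each read is entirely base space or entirely color space" are assumptions made for this example.

```python
import re

# Allowed raw-sequence alphabets, keyed by the -b values listed above.
# Assumption: upper-case only, and "BC" means a read is wholly one alphabet or the other.
RAW_SEQ_PATTERNS = {
    "B": re.compile(r"[ACTGN]+"),            # base space
    "C": re.compile(r"[0123.]+"),            # color space
    "BC": re.compile(r"[ACTGN]+|[0123.]+"),  # either, but not mixed
}


def check_raw_sequence(seq, seq_type="B", min_read_len=10):
    """Return a list of problems found in one raw sequence string (hypothetical helper)."""
    problems = []
    if len(seq) < min_read_len:
        problems.append(
            f"read length {len(seq)} is below the minimum {min_read_len}"
        )
    if not RAW_SEQ_PATTERNS[seq_type].fullmatch(seq):
        problems.append(
            f"characters not allowed for raw sequence type {seq_type}"
        )
    return problems
```

For instance, <code>check_raw_sequence("0123.0123.01", "B")</code> reports an alphabet problem, while the same read passes with "C" or "BC".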
+
 
 +
The FASTQ component of the library is found in libStatGen/fastq/.
 +
 
 +
See https://github.com/statgen/libStatGen/commits/master/fastq for a list of the most recent updates to the development version of the FASTQ portion of the library.
 +
 
 +
For the old change log, see: [[C++ Library: FASTQ Change Log]]
 +
 
 +
=== Classes in the FASTQ Portion of Library ===
 +
{| style="margin: 1em 1em 1em 0; background-color: #f9f9f9; border: 1px #aaa solid; border-collapse: collapse;" border="1"
 +
|-style="background: #f2f2f2; text-align: center;"
 +
! Class Name !!  Description
 +
|-
 +
| <code>[[C++ Class: FastQFile|FastQFile]]</code>
 +
| Class used for reading/validating a fastq file.
 +
|-
 +
| <code>[http://csg.sph.umich.edu//mktrost/doxygen/current/classBaseCount.html BaseCount]</code>
 +
| Wrapper around an array that has one index per base and an extra index for a total count of all bases.  This class is used to keep a count of the number of times each index has occurred.  It can print a percentage of the occurrence of each base against the total number of bases.
 +
|-
 +
| <code>[http://csg.sph.umich.edu//mktrost/doxygen/current/classBaseComposition.html BaseComposition]</code>
 +
| Class that tracks base composition by read position.
 +
|-
 +
| <code>[http://csg.sph.umich.edu//mktrost/doxygen/current/classFastQStatus.html FastQStatus]</code>
 +
| Status for FastQ operations.
 +
|}
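
The BaseCount description above (one slot per base plus an extra slot for the total) can be pictured with a short Python sketch. This is an illustration of the data layout only, not the libStatGen class: the class name, method names, and the choice of A, C, G, T, N as the tracked bases are assumptions.

```python
# Assumption: the tracked bases are A, C, G, T and N.
BASES = "ACGTN"
BASE_INDEX = {base: i for i, base in enumerate(BASES)}


class BaseCountSketch:
    """One counter per base plus a final slot for the total (hypothetical names)."""

    def __init__(self):
        # The last element holds the running total across all bases.
        self.counts = [0] * (len(BASES) + 1)

    def add(self, base):
        """Count one base; return False if the character is not tracked."""
        idx = BASE_INDEX.get(base.upper())
        if idx is None:
            return False
        self.counts[idx] += 1
        self.counts[-1] += 1
        return True

    def percentages(self):
        """Return each base's share of the total, as a percentage."""
        total = self.counts[-1]
        if total == 0:
            return {base: 0.0 for base in BASES}
        return {
            base: 100.0 * self.counts[i] / total
            for i, base in enumerate(BASES)
        }
```

Keeping the total in the same array means a percentage can be computed without re-summing the per-base counts.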
 +
 
 +
== FASTQ Output ==
 +
When a sequence is read, an error message is output for each failed [[C++ Class: FastQFile#Validation Criteria Used For Reading a Sequence|Validation Criteria]] check, up to the first maxReportedErrors errors; later errors are suppressed.
 +
For example:
 +
ERROR on Line 25: The sequence identifier line was too short.
 +
ERROR on Line 29: First line of a sequence does not begin wtih @
 +
ERROR on Line 33: No Sequence Identifier specified before the comment.
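
The capped reporting described above can be sketched in Python as follows. This is not the FastQFile implementation: the function name, the particular checks, the message wording, and the assumption that every record occupies exactly four lines are all made up for illustration. What it shows is the pattern of counting every error while printing only the first max_reported_errors.

```python
def validate_fastq_lines(lines, max_reported_errors=20):
    """Check 4-line FASTQ records; return (total_errors, reported_messages).

    Hypothetical helper: assumes each record is exactly four lines.
    """
    total = 0
    reported = []

    def error(line_no, text):
        # Count every error, but keep messages only up to the cap.
        nonlocal total
        total += 1
        if total <= max_reported_errors:
            reported.append(f"ERROR on Line {line_no}: {text}")

    for start in range(0, len(lines), 4):
        record = lines[start:start + 4]
        if len(record) < 4:
            error(start + 1, "Incomplete record at end of input.")
            break
        ident, seq, plus, qual = (s.rstrip("\n") for s in record)
        if not ident.startswith("@"):
            error(start + 1, "Sequence identifier line does not begin with @.")
        elif len(ident) < 2:
            error(start + 1, "Sequence identifier line is too short.")
        if not plus.startswith("+"):
            error(start + 3, "Third line of the record does not begin with +.")
        if len(qual) != len(seq):
            error(start + 4, "Quality string and raw sequence differ in length.")
    return total, reported
```

A caller can compare the returned total with the number of reported messages to tell whether any errors were suppressed.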
 +
 
 +
== FastQValidator ==
 +
The [[FastQValidator]] was built using the FastQFile class.  More details on that program are at the supplied link.