Changes

From Genome Analysis Wiki
460 bytes added, 16:30, 19 June 2013
no edit summary
Line 15: Line 15:  
== Binary Download ==
   −
We have prepared a pre-compiled (under Ubuntu) qplot along with its source code. You can download it from: [http://www.sph.umich.edu/csg/zhanxw/software/qplot/qplot.20120602.tar.gz qplot.20120602.tar.gz (File Size: 1.7G)]
+
We have prepared a pre-compiled (under Ubuntu) qplot along with its source code. You can download it from: [http://www.sph.umich.edu/csg/zhanxw/software/qplot/qplot.20130619.tar.gz qplot.20130619.tar.gz (File Size: 1.7G)]
    
The executable file is under qplot/bin/qplot.  
Line 25: Line 25:  
== Source Code Distribution ==
   −
We provide a source-code-only download: [http://www.sph.umich.edu/csg/zhanxw/software/qplot/qplot-source.20120602.tar.gz qplot-source.20120602.tar.gz]. Optionally, you can also download the example files and/or data files:
+
We provide a source-code-only download: [http://www.sph.umich.edu/csg/zhanxw/software/qplot/qplot-source.20130619.tar.gz qplot-source.20130619.tar.gz]. Optionally, you can also download the example files and/or data files:
    
[http://www.sph.umich.edu/csg/zhanxw/software/qplot/qplot-example.tar.gz  example]: example input files, and the expected outputs if you follow the [[#Built-in example | directions]].
Line 34: Line 34:     
* 1. Unarchive downloaded file
   −
  tar zvxf qplot-source.20120602.tar.gz
+
  tar zvxf qplot-source.20130619.tar.gz
    
A new folder ''qplot'' will be created.
Line 40: Line 40:  
* 2. Build libStatGen
 
  cd qplot
   −
  make libStatGen
+
  (cd ../libStatGen; make cloneLib)
    
This step downloads the required software library [http://genome.sph.umich.edu/wiki/C%2B%2B_Library:_libStatGen libStatGen] and compiles its source code into a binary library.
    
* 3. Build qplot
   −
  make all
+
  make  
    
This step will then build qplot. Upon success, the executable qplot can be found under qplot/bin/.
Line 71: Line 71:     
   some_linux_host > qplot/bin/qplot
   −
             References : --reference [/net/fantasia/home/zhanxw/software/qplot/data/human.g1k.v37.fa],
                          --dbsnp [/net/fantasia/home/zhanxw/software/qplot/data/dbSNP130.UCSC.coordinates.tbl],
                          --gccontent [/net/fantasia/home/zhanxw/software/qplot/data/human.g1k.w100.gc]
  Create gcContent file : --create_gc [], --winsize [100]
            Region list : --regions [], --invertRegion
           Flag filters : --read1_skip, --read2_skip, --paired_skip,
                          --unpaired_skip
         Dup and QCFail : --dup_keep, --qcfail_keep
        Mapping filters : --minMapQuality [0.00]
     Records to process : --first_n_record [-1]
       Lanes to process : --lanes []
  Read group to process : --readGroup []
     Input file options : --noeof
           Output files : --plot [], --stats [], --Rcode [], --xml []
            Plot labels : --label [], --bamLabel []
+
  The following parameters are available. Ones with "[]" are in effect:
 
               References : --reference [/net/fantasia/home/zhanxw/software/qplot/data/human.g1k.v37.fa],
                            --dbsnp [/net/fantasia/home/zhanxw/software/qplot/data/dbSNP130.UCSC.coordinates.tbl]
  GC content file options : --winsize [100]
              Region list : --regions [], --invertRegion
             Flag filters : --read1_skip, --read2_skip, --paired_skip,
                            --unpaired_skip
           Dup and QCFail : --dup_keep, --qcfail_keep
          Mapping filters : --minMapQuality [0.00]
       Records to process : --first_n_record [-1]
         Lanes to process : --lanes []
    Read group to process : --readGroup []
       Input file options : --noeof
             Output files : --plot [], --stats [], --Rcode [], --xml []
              Plot labels : --label [], --bamLabel []
   Obsoleted (DO NOT USE) : --gccontent [], --create_gc
    
== Input files ==
Line 102: Line 105:  
This file has two columns. The first column is the chromosome name, which must be consistent with the reference created above. The second column is the 1-based SNP position. If you want to create your own dbSNP data from a downloaded UCSC dbSNP file, one way to do it is: <code>cat dbsnp_129_b36.rod|grep "single" | awk '$4-$3==1' |cut -f2,4 > dbSNP_129_b36.tbl</code>
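The dbSNP one-liner can also be written as a single awk filter. The following is only a sketch of the same filtering logic, assuming a tab-separated input with the chromosome in column 2, a 0-based start in column 3, and the end coordinate in column 4 (the layout the one-liner implies). The function name <code>make_dbsnp_table</code> and the toy rows are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: build the two-column dbSNP table (chromosome, 1-based position).
# Hypothetical helper; mirrors grep "single" | awk '$4-$3==1' | cut -f2,4.
make_dbsnp_table() {
    # Keep rows that mention "single" and span exactly one base,
    # then print column 2 (chromosome) and column 4 (1-based position).
    awk -F'\t' -v OFS='\t' '/single/ && $4 - $3 == 1 { print $2, $4 }' "$1"
}

# Toy input with made-up rows, for illustration only.
toy_rod=$(mktemp)
printf '585\t1\t100\t101\trsA\tsingle\n' >  "$toy_rod"
printf '585\t1\t200\t205\trsB\tin-del\n' >> "$toy_rod"
printf '585\t2\t300\t302\trsC\tsingle\n' >> "$toy_rod"

dbsnp_table=$(make_dbsnp_table "$toy_rod")
printf '%s\n' "$dbsnp_table"
rm -f "$toy_rod"
```

Only the first toy row passes both filters, so the table contains one line: chromosome <code>1</code>, position <code>101</code>.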
   −
* <code>--gccontent</code>
+
* <code> **OBSOLETED** --gccontent, --create_gc </code>
   −
Although GC content can be calculated on the fly each time, it is much more efficient to load a precomputed GC content from a file. To generate the file, use the following command:
+
Although GC content can be calculated on the fly each time, it is much more efficient to load a precomputed GC content from a file.  
   −
qplot --reference reference.fa --windowsize winsize --create_gc reference.gc
+
The GC content file name is determined automatically in this format: <reference_genome_base_file_name>.winsize<gc_content_window_size>.gc.
 +
For example, if your reference genome is human.g1k.v37.fa and the window size is 100, the GC content file name is human.g1k.v37.winsize100.gc.
 +
 
 +
As a result, there is no need to use --gccontent to specify the GC content file in each run.
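To predict which GC content file a run will look for, the naming rule can be sketched in shell. This assumes that the "base file name" is the reference path with its final extension removed, and that the file sits alongside the reference; both are assumptions, and <code>gc_file_name</code> is a hypothetical helper, not part of qplot.

```shell
#!/usr/bin/env bash
# Sketch: derive the GC content file name from the reference and window size.
# Assumption: "base file name" means the reference path minus its last extension.
gc_file_name() {
    local reference=$1 winsize=$2
    # ${reference%.*} strips the shortest trailing ".something" (e.g. ".fa").
    printf '%s.winsize%s.gc\n' "${reference%.*}" "$winsize"
}

gc_file_name human.g1k.v37.fa 100   # prints human.g1k.v37.winsize100.gc
```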
 +
 
 +
* <code> input files </code>
 +
 
 +
QPLOT takes SAM/BAM files.
    
''Note'': Before running qplot, it is critical to check how the chromosome names are coded. Some BAM/SAM files use just numbers, others use chr + numbers. '''You need to make sure that the chromosome names from the reference and dbSNP are consistent with the BAM/SAM files.'''
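One way to check chromosome naming before a run is to compare the sequence names in the reference FASTA with the <code>SN:</code> fields of the SAM header. The sketch below works on a plain-text SAM file; for BAM input, the header would first need to be dumped, for example with <code>samtools view -H</code> (assuming samtools is installed). All function names and the toy files are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list chromosome names used in a SAM header but absent from the reference.

# Print sequence names from FASTA header lines (">name optional description").
fasta_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u
}

# Print the SN: values from @SQ lines of a plain-text SAM file.
sam_names() {
    awk -F'\t' '/^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$1" | sort -u
}

# Names in the SAM header (second argument) missing from the FASTA (first argument).
missing_in_reference() {
    ref_list=$(mktemp); sam_list=$(mktemp)
    fasta_names "$1" > "$ref_list"
    sam_names "$2" > "$sam_list"
    comm -13 "$ref_list" "$sam_list"   # keep lines unique to the SAM list
    rm -f "$ref_list" "$sam_list"
}

# Toy files (made up): the reference uses "1", the SAM header uses "chr1".
toy_fa=$(mktemp); toy_sam=$(mktemp)
printf '>1 toy sequence\nACGT\n>2\nACGT\n' > "$toy_fa"
printf '@HD\tVN:1.0\n@SQ\tSN:chr1\tLN:4\n@SQ\tSN:2\tLN:4\n' > "$toy_sam"
mismatched=$(missing_in_reference "$toy_fa" "$toy_sam")
echo "SAM names missing from reference: $mismatched"
rm -f "$toy_fa" "$toy_sam"
```

An empty result means every chromosome name in the SAM header also appears in the reference. The same comparison can be repeated against the first column of the dbSNP table.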