Line 2: |
Line 2: |
| Main Workshop wiki page: [[SeqShop: December 2014]] | | Main Workshop wiki page: [[SeqShop: December 2014]] |
| | | |
− | See the [[Media:Seqshop cnv partb 2014 06.pdf|introductory slides]] for an intro to this tutorial. | + | See [[Media:Seqshop cnv partb 2014 06.pdf|lecture slides]] for the lecture slides associated with this tutorial. |
| | | |
| == Goals of This Session == | | == Goals of This Session == |
Line 24: |
Line 24: |
| == Setup in person at the SeqShop Workshop == | | == Setup in person at the SeqShop Workshop == |
| ''This section is specifically for the SeqShop Workshop computers.'' | | ''This section is specifically for the SeqShop Workshop computers.'' |
− | <div class="mw-collapsible" style="width:600px"> | + | <div class="mw-collapsible mw-collapsed" style="width:600px"> |
| ''If you are not running during the SeqShop Workshop, please skip this section.'' | | ''If you are not running during the SeqShop Workshop, please skip this section.'' |
| <div class="mw-collapsible-content"> | | <div class="mw-collapsible-content"> |
Line 38: |
Line 38: |
| * Set up an output directory | | * Set up an output directory |
| ** It will leave your output directory from the previous tutorial intact. | | ** It will leave your output directory from the previous tutorial intact. |
− | source /home/mktrost/seqshop/setup.txt | + | source /net/seqshop-server/home/mktrost/seqshop/setup.txt |
| * You won't see any output after running <code>source</code> | | * You won't see any output after running <code>source</code> |
| ** It silently sets up your environment | | ** It silently sets up your environment |
| ** If you want to view the detail of the setup, type | | ** If you want to view the detail of the setup, type |
− | less /home/mktrost/seqshop/setup.txt | + | less /net/seqshop-server/home/mktrost/seqshop/setup.txt |
| and press 'q' to finish. | | and press 'q' to finish. |
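Because <code>source</code> prints nothing, a quick check that the environment is in place can be useful. The sketch below is an illustration only: it assumes the setup file defines the variables <code>GC</code>, <code>SS</code> and <code>OUT</code> that the commands in this tutorial use, and the function name <code>check_env</code> is made up for this example.

```shell
#!/bin/sh
# Report whether each variable used by the tutorial commands is set.
# Assumption: setup.txt defines GC, SS and OUT (the names used in the
# commands on this page); check_env is a hypothetical helper name.
check_env() {
    missing=0
    for name in GC SS OUT; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        else
            echo "$name=$value"
        fi
    done
    return $missing
}

check_env || echo "Re-run the source command above before continuing."
```

If any variable is reported as not set, run the <code>source</code> command again in the same terminal session.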
| | | |
Line 53: |
Line 53: |
| </div> | | </div> |
| </div> | | </div> |
− |
| |
| | | |
| == Setup when running on your own outside of the SeqShop Workshop == | | == Setup when running on your own outside of the SeqShop Workshop == |
Line 229: |
Line 228: |
| In addition, a separate step is available for genotyping structural variants reported by another structural variant caller. | | In addition, a separate step is available for genotyping structural variants reported by another structural variant caller. |
| * Third-party Genotyping and Filtering step: Perform genotyping on the variant sites specified by an input VCF, and also perform variant filtering. | | * Third-party Genotyping and Filtering step: Perform genotyping on the variant sites specified by an input VCF, and also perform variant filtering. |
| + | |
| + | == Command Line Usage of GenomeSTRiP pipeline == |
| + | |
| + | To see how to use the GenomeSTRiP pipeline, type
| + | perl $GC/bin/genomestrip.pl |
| + | |
| + | <div class="mw-collapsible mw-collapsed"> |
| + | ''View Results'' |
| + | <div class="mw-collapsible-content"> |
| + | ERROR: One of command options among --run-metadata, --run-discovery, --run-genotype, --run-thirdparty must be specified |
| + | ERROR: Missing required option, outdir |
| + | Usage: |
| + | /net/seqshop-server/home/mktrost/seqshop/gotcloud/bin/genomestrip.pl |
| + | [options] |
| + | |
| + | Help Options: |
| + | -help Print out brief help message [OFF] |
| + | -man Print the full documentation in man page style [OFF] |
| + | |
| + | Command options: |
| + | -run-metadata Create metadata [OFF] |
| + | -run-discovery Run variant discovery and filtering. Can run with --run-metadata together [OFF] |
| + | -run-genotype Run genotyping - requires to finish run-metadata and run-discovery [OFF] |
| + | -run-thirdparty Run genotyping and filtering of third-party sites [OFF] |
| + | |
| + | Options for input/output data: |
| + | -gotcloudroot|gcroot STRGotCloud Root Directory [] |
| + | -conf STR GotCloud configuration files [] |
| + | -outdir STR Override's conf file's OUT_DIR. Used as the genomestrip output directory unless --out or GENOMESTRIP_OUT is set [] |
| + | -list STR BAM list file containing ID and BAM path [] |
| + | -out STR Output directory which stores subdirectories such as metadata/, discovery/, genotypes/, thirdparty/ unless overriden individually [] |
| + | -metadata STR Output directory to store --run-metadata results. Default is [OUT]/metadata/ [] |
| + | -discovery STR Output directory to store --run-discovery results. Default is [OUT]/discovery/ [] |
| + | -genotype STR Output directory to store --run-genotype results. Default is [OUT]/genotype/ [] |
| + | -thirdparty STR Output directory to store --run-thirdparty results. Default is [OUT]/thirdparty/ [] |
| + | |
| + | Advanced Options: |
| + | -tmp-dir STR temporary directory to store temporary files. Default is [OUT]/tmp [] |
| + | -gs-dir STR GenomeSTRiP svtoolkit directory [] |
| + | -param STR GenomeSTRIP parameter file [] |
| + | -ref STR Reference FASTA file [] |
| + | -mask STR Reference mask FASTA file [] |
| + | -ploidy-map STR Ploidy map file [] |
| + | -mosix-opt STR MOSIX options [] |
| + | -region STR Region to focus on the variants [] |
| + | -unit INT Number of variants to be genotyped per parallel run [100] |
| + | |
| + | Additional Inputs: |
| + | -in-vcf STR Input site VCF files used for --run-genotype or --run-thirdparty. For --run-thirdparty, this argument is required. For --run-genotype, default is [OUT]/discovery/discovery.vcf [] |
| + | -pass-only Genotype only PASS-filtered variants, default is OFF [OFF] |
| + | -skip-rc Skip precomputing read count [OFF] |
| + | -base-prefix STR Prefix of all files [] |
| + | -bam-prefix STR Prefix of BAM files [] |
| + | -ref-prefix STR Prefix of Reference FASTA files [] |
| + | -no-phonehome Skip phone home functionality [OFF] |
| + | -make-base-name STR Specifies the basename for the makefile [] |
| + | -verbose Specifies that additional details are to be printed out [OFF] |
| + | -dry-run Perform a dry-run that only produces Makefile but not run it [OFF] |
| + | -numjobs INT Number of jobs to concurrently run [1] |
| + | -autosomes Perform analysis only on autosomes [OFF] |
| + | </div></div> |
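The help text above lists a <code>-dry-run</code> option that only produces the Makefile. As a hedged sketch of how the options combine, the helper below prints (it does not execute) a discovery command with <code>--dry-run</code> appended. The function name <code>build_dry_run_cmd</code> is hypothetical, the option names are taken from the help text, and whether this exact combination behaves as expected has not been verified here.

```shell
#!/bin/sh
# Print a genomestrip.pl discovery command with --dry-run appended.
# Assumptions: GC, SS and OUT come from the workshop setup; the helper
# name build_dry_run_cmd is made up for illustration. Nothing is run.
build_dry_run_cmd() {
    region=$1     # e.g. 22:36000000-37000000
    numjobs=$2    # number of concurrent jobs
    # Fall back to the literal variable name when a variable is unset,
    # so the printed command is still readable.
    gc_root=${GC:-'$GC'}
    ss_root=${SS:-'$SS'}
    out_dir=${OUT:-'$OUT'}
    echo "perl ${gc_root}/bin/genomestrip.pl --run-discovery --metadata ${ss_root}/metadata --conf ${ss_root}/gotcloud.conf --numjobs ${numjobs} --region ${region} --base-prefix ${ss_root} --outdir ${out_dir} --dry-run"
}

build_dry_run_cmd 22:36000000-37000000 1
```

Printing the command first makes it easy to inspect the options before committing to a run.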
| | | |
| == Running GotCloud/GenomeSTRiP Metadata Pipeline == | | == Running GotCloud/GenomeSTRiP Metadata Pipeline == |
Line 273: |
Line 333: |
| | | |
| To discover large deletions from the 62 BAMs we are using for this workshop, you can run the following command | | To discover large deletions from the 62 BAMs we are using for this workshop, you can run the following command |
− | perl ${GC}/bin/genomestrip.pl --run-discovery --metadata ${SS}/metadata --conf ${SS}/gotcloud.conf --numjobs 2 --conf ${SS}/gotcloud.conf --numjobs 2 --region 22:36000000-37000000 --base-prefix ${SS} --outdir ${OUT} | + | perl ${GC}/bin/genomestrip.pl --run-discovery --metadata ${SS}/metadata --conf ${SS}/gotcloud.conf --numjobs 4 --region 22:36000000-37000000 --base-prefix ${SS} --outdir ${OUT} |
| * <code>${GC}/bin/genomestrip.pl -run-discovery</code> runs the GenomeSTRiP Discovery Pipeline | | * <code>${GC}/bin/genomestrip.pl -run-discovery</code> runs the GenomeSTRiP Discovery Pipeline |
| * <code>--metadata ${SS}/metadata</code> points to the pre-made metadata file as explained in the previous section, [[#Running GotCloud/GenomeSTRiP Metadata Pipeline|Running GotCloud/GenomeSTRiP Metadata Pipeline]]. | | * <code>--metadata ${SS}/metadata</code> points to the pre-made metadata file as explained in the previous section, [[#Running GotCloud/GenomeSTRiP Metadata Pipeline|Running GotCloud/GenomeSTRiP Metadata Pipeline]]. |
Line 313: |
Line 373: |
| <div class="mw-collapsible-content" style="width:800px"> | | <div class="mw-collapsible-content" style="width:800px"> |
| 7 COHERENCE;COVERAGE;DEPTH;DEPTHPVAL | | 7 COHERENCE;COVERAGE;DEPTH;DEPTHPVAL |
− | 17 COHERENCE;COVERAGE;DEPTH;DEPTHPVAL;PAIRSPERSAMPLE | + | 18 COHERENCE;COVERAGE;DEPTH;DEPTHPVAL;PAIRSPERSAMPLE |
| 3 COHERENCE;COVERAGE;DEPTH;PAIRSPERSAMPLE | | 3 COHERENCE;COVERAGE;DEPTH;PAIRSPERSAMPLE |
| 2 COHERENCE;COVERAGE;DEPTHPVAL;PAIRSPERSAMPLE | | 2 COHERENCE;COVERAGE;DEPTHPVAL;PAIRSPERSAMPLE |
Line 323: |
Line 383: |
| 2 COVERAGE;DEPTH;PAIRSPERSAMPLE | | 2 COVERAGE;DEPTH;PAIRSPERSAMPLE |
| 4 COVERAGE;DEPTHPVAL | | 4 COVERAGE;DEPTHPVAL |
− | 5 COVERAGE;DEPTHPVAL;PAIRSPERSAMPLE | + | 4 COVERAGE;DEPTHPVAL;PAIRSPERSAMPLE |
| 5 COVERAGE;PAIRSPERSAMPLE | | 5 COVERAGE;PAIRSPERSAMPLE |
| </div> | | </div> |
Line 342: |
Line 402: |
| The discovery pipeline only discovers and filters variant sites. You will need to iterate over the BAMs again to perform genotyping. | | The discovery pipeline only discovers and filters variant sites. You will need to iterate over the BAMs again to perform genotyping. |
| * If running on a small machine, you may want to reduce <code>--numjobs</code> from 4 to 1. | | * If running on a small machine, you may want to reduce <code>--numjobs</code> from 4 to 1. |
− | perl ${GC}/bin/genomestrip.pl --run-genotype --metadata ${SS}/metadata --conf ${SS}/gotcloud.conf --numjobs 2 --base-prefix ${SS} --outdir ${OUT} | + | perl ${GC}/bin/genomestrip.pl --run-genotype --metadata ${SS}/metadata --conf ${SS}/gotcloud.conf --numjobs 4 --base-prefix ${SS} --outdir ${OUT} |
| | | |
| This will take ~3 minutes to finish. | | This will take ~3 minutes to finish. |
Line 362: |
Line 422: |
| You can genotype sites from a third-party call set with GenomeSTRiP. Here we genotype the 1000 Genomes phase 1 sites. | | You can genotype sites from a third-party call set with GenomeSTRiP. Here we genotype the 1000 Genomes phase 1 sites. |
| * If running on a small machine, you may want to reduce <code>--numjobs</code> from 4 to 1. | | * If running on a small machine, you may want to reduce <code>--numjobs</code> from 4 to 1. |
− | time perl ${SS}/svtoolkit/bin/genomestrip.pl -run-thirdparty --in-vcf ${SS}/ext/1kg.phase1.chr22.36Mb.sites.vcf --metadata ${SS}/svtoolkit/metadata --conf ${SS}/gotcloud.conf --region 22:36000000-37000000 --base-prefix ${SS} --outdir ${OUT} --gcroot ${GC} --numjobs 4 | + | perl ${GC}/bin/genomestrip.pl --run-thirdparty --in-vcf ${SS}/ext/1kg.phase1.chr22.36Mb.sites.vcf --metadata ${SS}/metadata --conf ${SS}/gotcloud.conf --region 22:36000000-37000000 --base-prefix ${SS} --outdir ${OUT} --numjobs 2 |
| | | |
| This will take ~1 minute to finish. | | This will take ~1 minute to finish. |
Line 390: |
Line 450: |
| </div> | | </div> |
| | | |
− | == Starting SNP Call on your own Genome == | + | |
− | Go to [[SeqShop: Calling Your Own Genome, December 2014]] so we can run SNP calling overnight.
| + | == Return to Workshop Wiki Page == |
| + | Return to main workshop wiki page: [[SeqShop: December 2014]] |