Line 7: |
Line 7: |
| | | |
| Please refer to [[Media:Seqshop cnv partb 2014 06.pdf|Lecture slides]] for more general background. | | Please refer to [[Media:Seqshop cnv partb 2014 06.pdf|Lecture slides]] for more general background. |
| + | |
| + | |
| + | == Setup in person at the SeqShop Workshop == |
| + | ''This section is specifically for the SeqShop Workshop computers.'' |
| + | <div class="mw-collapsible mw-collapsed" style="width:600px"> |
| + | ''If you are not running during the SeqShop Workshop, please skip this section.'' |
| + | <div class="mw-collapsible-content"> |
| | | |
| {{SeqShopLogin}} | | {{SeqShopLogin}} |
| | | |
− | == Setup your run environment== | + | === Setup your run environment=== |
− | | + | This is the same setup you did for the previous tutorial, but you need to redo it each time you log in. |
− | This is a '''slightly different''' setup from what you did for the previous tutorial, but you need to redo it each time you log in. It will setup some environment variables to point you to: | |
| | | |
− | * GotCloud program | + | This will set up some environment variables to point you to: |
| + | * [[GotCloud]] program |
| * Tutorial input files | | * Tutorial input files |
| * Setup an output directory | | * Setup an output directory |
| ** It will leave your output directory from the previous tutorial intact. | | ** It will leave your output directory from the previous tutorial intact. |
− | source /home/hmkang/seqshop/setup.txt | + | source /home/mktrost/seqshop/setup.txt |
| * You won't see any output after running <code>source</code> | | * You won't see any output after running <code>source</code> |
| ** It silently sets up your environment | | ** It silently sets up your environment |
− | ** If you want to view the detail of the set up, type | + | ** If you want to view the detail of the setup, type |
− | less /home/hmkang/seqshop/setup.txt | + | less /home/mktrost/seqshop/setup.txt |
| and press 'q' to finish. | | and press 'q' to finish. |
| + | |
| <div class="mw-collapsible mw-collapsed" style="width:200px"> | | <div class="mw-collapsible mw-collapsed" style="width:200px"> |
| View setup.txt | | View setup.txt |
− | <div class="mw-collapsible-content" style="width:800px"> | + | <div class="mw-collapsible-content"> |
− | export GC=/home/hmkang/seqshop/gotcloud
| + | [[File:setup.png|500px]] |
− | export IN=/home/hmkang/seqshop/inputs
| + | </div> |
− | export REF=/home/hmkang/seqshop/reference/chr22
| + | </div> |
− | export VTREF=/home/hmkang/seqshop/reference/vtRef
| + | </div> |
− | export SV=/home/hmkang/seqshop/reference/svtoolkit
| + | </div> |
− | export EXT=/home/hmkang/seqshop/external
| + | |
− | export OUT=~/out
| + | |
− | mkdir -p ${OUT}
| + | == Setup when running on your own outside of the SeqShop Workshop == |
| + | ''This section is specifically for running on your own outside of the SeqShop Workshop.'' |
| + | <div class="mw-collapsible" style="width:600px"> |
| + | ''If you are running during the SeqShop Workshop, please skip this section.'' |
| + | <div class="mw-collapsible-content"> |
| + | === Download the example data === |
| + | |
| + | === Setup your run environment === |
| + | |
| + | Environment variables will be used throughout the tutorial. |
| + | |
| + | We recommend that you set up these variables so you won't have to modify every command in the tutorial. |
| + | |
| + | <div class="mw-collapsible mw-collapsed" style="width:500px"> |
| + | I'm using bash (replace the paths below with the appropriate paths): |
| + | <div class="mw-collapsible-content"> |
| + | * Point to where you installed GotCloud |
| + | *:<pre>export GC=/home/username/gotcloud</pre> |
| + | * Point to where you installed the seqshop files |
| + | *:<pre>export SS=/home/username/seqshop/</pre> |
| + | * Point to where you want the output to go |
| + | *:<pre>export OUT=/home/username/seqshop_output/</pre> |
| + | </div> |
| + | </div> |
| + | |
| + | <div class="mw-collapsible mw-collapsed" style="width:500px"> |
| + | I'm using tcsh (replace the paths below with the appropriate paths): |
| + | <div class="mw-collapsible-content"> |
| + | * Point to where you installed GotCloud |
| + | *:<pre>setenv GC /home/username/gotcloud</pre> |
| + | * Point to where you installed the seqshop files |
| + | *:<pre>setenv SS /home/username/seqshop/</pre> |
| + | * Point to where you want the output to go |
| + | *:<pre>setenv OUT /home/username/seqshop_output/</pre> |
| + | </div> |
| + | </div> |
| + | |
| </div> | | </div> |
| </div> | | </div> |
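Before moving on, it can help to confirm that the three variables from the bash instructions above (<code>GC</code>, <code>SS</code>, <code>OUT</code>) are actually set in the current shell. The following is only a minimal sketch for bash/sh users: <code>check_seqshop_env</code> is a hypothetical helper name, not part of GotCloud or the SeqShop files, and creating the output directory with <code>mkdir -p</code> is an assumption carried over from the earlier workshop setup script.

```shell
# Sketch only: check_seqshop_env is a hypothetical helper, not a GotCloud command.
# It reports whether GC, SS and OUT are set, and creates the output directory
# only when all three are present.
check_seqshop_env() {
    status=0
    for name in GC SS OUT; do
        # Look up the variable by name; an unset variable expands to an empty string.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        else
            echo "$name=$value"
        fi
    done
    if [ "$status" -eq 0 ]; then
        # Assumption: the output directory should exist before any tutorial command writes to it.
        mkdir -p "$OUT"
    fi
    return "$status"
}
```

Running <code>check_seqshop_env</code> after the <code>export</code> lines prints one line per variable and returns a non-zero status if any of them is missing. tcsh users would need an equivalent check written with tcsh syntax.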
| | | |
− | Do you notice what the differences were?
| |
| | | |
| == Examining GotCloud/GenomeSTRiP Input files == | | == Examining GotCloud/GenomeSTRiP Input files == |
Line 44: |
Line 87: |
| === Sequence Alignment Files: BAM Files and Index Files=== | | === Sequence Alignment Files: BAM Files and Index Files=== |
| | | |
− | The GotCloud Indel caller takes the same inputs as GotCloud snpcall. | + | The GotCloud GenomeSTRiP structural variant caller takes the same inputs as GotCloud snpcall. |
| * BAMs->SVs rather than BAMs->SNPs | | * BAMs->SVs rather than BAMs->SNPs |
| | | |
Line 51: |
Line 94: |
| If you want to check if you still have the bam index file, run | | If you want to check if you still have the bam index file, run |
| | | |
− | head $OUT/bam.index | + | head ${OUT}/bam.index |
| | | |
| <ul> | | <ul> |
Line 73: |
Line 116: |
| Also, make sure that you have only 62 samples (you did not append new files twice) | | Also, make sure that you have only 62 samples (you did not append new files twice) |
| | | |
− | wc -l $OUT/bam.index | + | wc -l ${OUT}/bam.index |
| | | |
| Your expected output is similar to this. | | Your expected output is similar to this. |
Line 91: |
Line 134: |
| # PloidyMap file indicating the regions of genomes with unusual ploidy (e.g. chrX, chrY) | | # PloidyMap file indicating the regions of genomes with unusual ploidy (e.g. chrX, chrY) |
| | | |
− | We looked at them yesterday, but you can take another look at the chromosome 22 reference files included for this tutorial: | + | We looked at them in previous tutorials, but you can take another look at the chromosome 22 reference files included for this tutorial: |
| | | |
− | ls $REF | + | ls ${SS}/ref22 |
| | | |
| <ul> | | <ul> |
Line 99: |
Line 142: |
| <li>View Results</li> | | <li>View Results</li> |
| <div class="mw-collapsible-content" style="width:800px"> | | <div class="mw-collapsible-content" style="width:800px"> |
− | 1000G_omni2.5.b37.sites.PASS.chr22.vcf.gz hapmap_3.3.b37.sites.chr22.vcf.gz.tbi human.g1k.v37.chr22.fa.bwt
| + | [[File:RefDir.png|500px]] |
− | 1000G_omni2.5.b37.sites.PASS.chr22.vcf.gz.tbi human.g1k.v37.chr22-bs.umfa human.g1k.v37.chr22.fa.fai
| |
− | 1kg.pilot_release.merged.indels.sites.hg19.chr22.vcf human.g1k.v37.chr22.dict human.g1k.v37.chr22.fa.pac
| |
− | dbsnp_135.b37.chr22.vcf.gz human.g1k.v37.chr22.fa human.g1k.v37.chr22.fa.sa
| |
− | dbsnp_135.b37.chr22.vcf.gz.tbi human.g1k.v37.chr22.fa.amb human.g1k.v37.chr22.winsize100.gc
| |
− | hapmap_3.3.b37.sites.chr22.vcf.gz human.g1k.v37.chr22.fa.ann
| |
| </div> | | </div> |
| </div> | | </div> |
| </ul> | | </ul> |
| | | |
− | | + | Reference files required just for Structural Variation: |
− | Additional reference and parameters
| |
− | | |
− | ls $SV/ref
| |
− | | |
− | <ul>
| |
− | <div class="mw-collapsible mw-collapsed" style="width:200px">
| |
− | <li>View Results</li>
| |
− | <div class="mw-collapsible-content" style="width:800px">
| |
| human_g1k_v37.chr22.mask.100.fasta human_g1k_v37.chr22.mask.100.fasta.dict human_g1k_v37.chr22.mask.100.fasta.fai | | human_g1k_v37.chr22.mask.100.fasta human_g1k_v37.chr22.mask.100.fasta.dict human_g1k_v37.chr22.mask.100.fasta.fai |
− | </div>
| |
− | </div>
| |
− | </ul>
| |
| | | |
| | | |
− | ls $SV/conf | + | Parameter files required just for Structural Variation: |
| + | ls ${SS}/svtoolkit |
| | | |
| <ul> | | <ul> |
Line 136: |
Line 164: |
| | | |
| === GotCloud Configuration File === | | === GotCloud Configuration File === |
− | We will use a slightly modified version of configuration file as we used yesterday in GotCloud Align. | + | We will use the same configuration file we used for the GotCloud Align tutorial. |
| | | |
| See [[SeqShop:_Sequence_Mapping_and_Assembly_Practical#GotCloud Configuration File|SeqShop: Alignment: GotCloud Configuration File]] for more details | | See [[SeqShop:_Sequence_Mapping_and_Assembly_Practical#GotCloud Configuration File|SeqShop: Alignment: GotCloud Configuration File]] for more details |
Line 142: |
Line 170: |
| | | |
| For more information on configuration, see: [[GotCloud:_Variant_Calling_Pipeline#Configuration_File|GotCloud snpcall: Configuration File]] | | For more information on configuration, see: [[GotCloud:_Variant_Calling_Pipeline#Configuration_File|GotCloud snpcall: Configuration File]] |
− | * Contains information on how to configure for exome/targeted sequencing
| |
− |
| |
− | Check out what was changed.
| |
| | | |
− | cat $IN/gotcloud.conf | + | Check out the GenomeSTRiP-specific settings at the end of the configuration file: |
| + | tail -n 7 ${SS}/gotcloud.conf |
| | | |
| <ul> | | <ul> |
Line 152: |
Line 178: |
| <li>View Results</li> | | <li>View Results</li> |
| <div class="mw-collapsible-content" style="width:800px"> | | <div class="mw-collapsible-content" style="width:800px"> |
− | IN_DIR = $(GOTCLOUD_ROOT)/../inputs
| |
− |
| |
− | INDEX_FILE = $(IN_DIR)/align.index
| |
− | FASTQ_PREFIX = $(IN_DIR)/fastq
| |
− | BAM_PREFIX = $(IN_DIR)/
| |
− |
| |
− | OUT_DIR = out
| |
− | BAM_INDEX = $(OUT_DIR)/bam.index
| |
− |
| |
− | ############
| |
− | # References
| |
− | REF_DIR = $(GOTCLOUD_ROOT)/../reference/chr22
| |
− | AS = NCBI37 # Genome assembly identifier
| |
− | REF = $(REF_DIR)/human.g1k.v37.chr22.fa
| |
− | DBSNP_VCF = $(REF_DIR)/dbsnp_135.b37.chr22.vcf.gz
| |
− | HM3_VCF = $(REF_DIR)/hapmap_3.3.b37.sites.chr22.vcf.gz
| |
− | INDEL_PREFIX = $(REF_DIR)/1kg.pilot_release.merged.indels.sites.hg19
| |
− | OMNI_VCF = $(REF_DIR)/1000G_omni2.5.b37.sites.PASS.chr22.vcf.gz
| |
− |
| |
− | MAP_TYPE = BWA_MEM
| |
− |
| |
− | ###############
| |
− | CHRS = 22
| |
− |
| |
− | ######### THUNDER ########
| |
− | # Update so it will run faster for the tutorial
| |
− | # * 10 rounds instead of 30 (-r 10)
| |
− | # * without --compact option
| |
− | # Runs faster, but uses more memory, but not a lot for the small example
| |
− | THUNDER = $(BIN_DIR)/thunderVCF -r 10 --phase --dosage --inputPhased $(THUNDER_STATES)
| |
− |
| |
| ############################## | | ############################## |
| ## GenomeSTRIP | | ## GenomeSTRIP |