Line 131: |
Line 131: |
| <div class="mw-collapsible" style="width:500px"> | | <div class="mw-collapsible" style="width:500px"> |
| | | |
− | === Start SNP Calling === | + | === Start INDEL Calling === |
| <div class="mw-collapsible-content"> | | <div class="mw-collapsible-content"> |
| ==== Resume screen ==== | | ==== Resume screen ==== |
Line 150: |
Line 150: |
| * This is a relative path, so it assumes you are running from your home directory (I prefer absolute paths, but for simplicity in this workshop we just use a relative path). | | * This is a relative path, so it assumes you are running from your home directory (I prefer absolute paths, but for simplicity in this workshop we just use a relative path). |
| | | |
− | ==== GotCloud SnpCall Configuration ==== | + | ==== GotCloud INDEL Configuration ==== |
| | | |
| cat ~/$SAMPLE/gotcloud.conf | | cat ~/$SAMPLE/gotcloud.conf |
| | | |
− | You will see something like this:
| + | It should look the same as it did yesterday; INDEL calling needs no special configuration settings. |
− | <pre>
| |
− | # Cluster Settings
| |
− | BATCH_TYPE =
| |
− | BATCH_OPTS =
| |
| | | |
− | OUT_DIR = Sample13/output
| + | ==== Running INDEL ==== |
− | | + | Run GotCloud indel with 8 jobs running in parallel |
− | # Align Settings
| + | * Why 8? |
− | MAP_TYPE = BWA_MEM
| |
− | BWA_THREADS = -t 24
| |
− | FASTQ_LIST = fastq.list
| |
− | | |
− | # SNP Call Settings
| |
− | UNIT_CHUNK = 20000000 # Chunk size of SNP calling : 20Mb
| |
− | VCF_EXTRACT = /net/seqshop-server/home/mktrost/seqshop/singleSample/snpOnly.vcf.gz
| |
− | MODEL_GLFSINGLE = TRUE
| |
− | MODEL_SKIP_DISCOVER = FALSE
| |
− | MODEL_AF_PRIOR = TRUE
| |
− | | |
− | EXT_DIR = /net/seqshop-server/home/mktrost/seqshop/singleSample/ext
| |
− | EXT = $(EXT_DIR)/ALL.chrCHR.phase3.combined.sites.unfiltered.vcf.gz $(EXT_DIR)/chrCHR.filtered.sites.vcf.gz
| |
− | </pre>
| |
− | | |
− | ===== Configuration Updates =====
| |
− | '''No configuration updates from the original settings are necessary.'''
| |
− | * Originally you were going to update the configuration to do exome calling only, but we have decided to do whole genome
| |
− | ** If it doesn't finish tonight, we will kill it at the tutorial in the morning, and take advantage of the GotCloud restart capability after the tutorial tomorrow.
| |
− | | |
− | If you started making updates to your configuration yesterday, you can try to make it match above (mktrost is actually the path you want in the paths above).
| |
− | * The only difference from above should be your OUT_DIR.
| |
− | Check it with (leave as ~mktrost - you are comparing to my NA12878 conf file):
| |
− | diff $SAMPLE/gotcloud.conf ~mktrost/NA12878/gotcloud.conf
| |
− | | |
− | '''If you see differences other than OUT_DIR, you can modify your file using nedit (or your favorite editor as you used yesterday).'''
| |
− | * '''Or you can copy my file and just change OUT_DIR:'''
| |
− | cp ~mktrost/NA12878/gotcloud.conf $SAMPLE/gotcloud.conf
| |
− | nedit $SAMPLE/gotcloud.conf&
| |
− | * Edit OUT_DIR
| |
− | OUT_DIR = Sample##/output
| |
− | * Save, and close
| |
− | | |
− | ==== Running SnpCall ==== | |
− | Run GotCloud snpcall with 6 jobs running in parallel | |
− | * Why 6? | |
| ** You want to run as many as you can. | | ** You want to run as many as you can. |
− | ** 5 of you on the machine - 5*6 = 30 jobs will be running in parallel on that machine | + | ** 3 of you on the machine - 3*8 = 24 jobs will be running in parallel on that machine |
− | ${GC}/gotcloud snpcall --conf $SAMPLE/gotcloud.conf --numjobs 6 | + | ${GC}/gotcloud indel --conf $SAMPLE/gotcloud.conf --numjobs 8 |
− | * Only need the configuration & number of threads, rest is specified within the configuration. | + | * Only the configuration and the number of jobs are needed; the rest, including the output directory (OUT_DIR), is specified within the configuration. |
| | | |
| This will run overnight. We will check if it completed at the practical in the morning. | | This will run overnight. We will check if it completed at the practical in the morning. |
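
The "Why 8?" reasoning above is just a division of the machine's job budget among the people sharing it. A minimal sketch of that arithmetic, assuming a budget of 24 parallel jobs per machine (taken from the 3*8 = 24 figure above) and a hypothetical helper named jobs_per_user that is not part of GotCloud:

```shell
# Hypothetical helper (not part of GotCloud): split a shared machine's
# job budget evenly among the users running on it.
jobs_per_user() {
    total_jobs=$1   # assumed total number of parallel jobs for the machine
    users=$2        # number of people sharing the machine
    echo $(( total_jobs / users ))   # integer division, rounds down
}

# 3 users sharing a budget of 24 jobs
NUMJOBS=$(jobs_per_user 24 3)
echo "--numjobs $NUMJOBS"   # prints "--numjobs 8"
```

The same helper reproduces the earlier snpcall setting: 5 users sharing a budget of 30 jobs gives 6 jobs each.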