Changes

Line 1: Line 1:  +
'''Note:''' the latest version of this practical is available at: [[SeqShop: Sequence Mapping and Assembly Practical]]
 +
* The version here is the original from the June workshop (updated so it can be run from elsewhere)
 +
 
== Introduction ==
 
== Introduction ==
 
See the [[Media:SeqShop - GotCloud Align.pdf|introductory slides]] for an intro to this tutorial.
 
See the [[Media:SeqShop - GotCloud Align.pdf|introductory slides]] for an intro to this tutorial.
Line 41: Line 44:  
''If you are running during the SeqShop Workshop, please skip this section.''
 
''If you are running during the SeqShop Workshop, please skip this section.''
 
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
=== Download the example data ===
+
=== Download & Build GotCloud ===
 +
If you do not already have GotCloud:
 +
* cd to where you want GotCloud installed (you can change this to any directory you want)
 +
mkdir -p ~/seqshop
 +
cd ~/seqshop/
 +
* download, decompress, and build the version of gotcloud that was tested with this tutorial:
 +
wget https://github.com/statgen/gotcloud/archive/gotcloud.workshop.tar.gz
 +
tar xvf gotcloud.workshop.tar.gz
 +
mv gotcloud-gotcloud.workshop gotcloud
 +
cd gotcloud/src
 +
make
 +
cd ../..
   −
=== Setup your run environment ===
+
Remember the path to gotcloud/; that is what you will need to set your GC variable to.
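If you ran the build steps in <code>~/seqshop</code>, the path to remember would be <code>~/seqshop/gotcloud</code>. A minimal bash sketch, assuming that location and assuming the <code>gotcloud</code> launcher sits at the top of that directory (as the <code>${GC}/gotcloud</code> command later in this tutorial suggests). The <code>check_gc</code> helper is hypothetical and only for illustration:

```shell
# Hypothetical helper: report whether a directory looks like a GotCloud install.
check_gc() {
    if [ -e "$1/gotcloud" ]; then
        echo "found: $1/gotcloud"
    else
        echo "missing: $1/gotcloud"
    fi
}

# Assumes the build steps were run in ~/seqshop; adjust if you installed elsewhere.
export GC="${HOME}/seqshop/gotcloud"
check_gc "${GC}"
```

If the check reports the launcher as missing, GC probably points at the wrong directory.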
   −
Environment variables will be used throughout the tutorial.
+
=== Download the example data ===
 
+
Download and untar the file containing the example data used in the practicals:
We recommend that you setup these variables so you won't have to modify every command in the tutorial.
+
wget http://csg.sph.umich.edu//mktrost/seqshopExample.tar.gz
 +
tar xvf seqshopExample.tar.gz
    +
You will see the names of all the files included in the example data scrolling on the screen as they are unpacked from the tar file.
   −
<div class="mw-collapsible mw-collapsed" style="width:500px">
+
{{SeqShopRemoteEnv}}
I'm using bash (replace the paths below with the appropriate paths):
  −
<div class="mw-collapsible-content">
  −
* Point to where you installed GotCloud
  −
*:<pre>export GC=/home/username/gotcloud</pre>
  −
* Point to where you installed the seqshop files
  −
*:<pre>export SS=/home/username/seqshop/</pre>
  −
* Point to where you want the output to go
  −
*:<pre>export OUT=/home/username/seqshop_output/</pre>
  −
</div>
  −
</div>
  −
 
  −
<div class="mw-collapsible mw-collapsed" style="width:500px">
  −
I'm using tcsh (replace the paths below with the appropriate paths):
  −
<div class="mw-collapsible-content">
  −
* Point to where you installed GotCloud
  −
*:<pre>setenv GC /home/username/gotcloud</pre>
  −
* Point to where you installed the seqshop files
  −
*:<pre>setenv SS /home/username/seqshop/</pre>
  −
* Point to where you want the output to go
  −
*:<pre>setenv OUT /home/username/seqshop_output/</pre>
  −
</div>
  −
</div>
  −
 
  −
</div>
  −
</div>
      
== Examining [[GotCloud]] Align Input Files ==
 
== Examining [[GotCloud]] Align Input Files ==
Line 163: Line 154:  
<li>View Screenshot</li>
 
<li>View Screenshot</li>
 
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
[[File:RefDir.png|500px]]
+
[[File:RefDir.png|700px]]
 
</div>
 
</div>
 
</div>
 
</div>
Line 220: Line 211:  
<li>Need a reminder of the format?</li>
 
<li>Need a reminder of the format?</li>
 
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
[[File:fqindex.png|650px]]
+
[[File:fqindex.png|750px]]
 
</div>
 
</div>
 
</div>
 
</div>
Line 244: Line 235:  
<li>HG00553 & HG00640</li>
 
<li>HG00553 & HG00640</li>
 
<li>They have multiple unique values in the RGID field</li>
 
<li>They have multiple unique values in the RGID field</li>
[[File:fqindexRG.png|650px]]
+
[[File:fqindexRG.png|800px]]
 
</div>
 
</div>
 
</div>
 
</div>
Line 271: Line 262:     
Let's look at the configuration file I created for this test:
 
Let's look at the configuration file I created for this test:
  more ${IN}/gotcloud.conf
+
  more ${SS}/gotcloud.conf
    
Use the <code>space bar</code> to advance if the whole file isn't displayed.
 
Use the <code>space bar</code> to advance if the whole file isn't displayed.
   −
; If your input and references are at different paths than specified, what would you change?
+
; If your references are in a different path than what is specified, what would you change?
 
<ul>
 
<ul>
<div class="mw-collapsible mw-collapsed" style="width:200px">
+
<div class="mw-collapsible mw-collapsed" style="width:300px">
 
<li>Answer:</li>
 
<li>Answer:</li>
 
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
 
<ul>
 
<ul>
<li>You would change <code>IN_DIR</code> & <code>REF_DIR</code> to the new paths</li>
+
<li>You would change <code>REF_DIR</code> to the new path</li>
[[File:gcConf.png|500px]]
+
[[File:gcConf.png|800px]]
 
</div>
 
</div>
 
</div>
 
</div>
Line 293: Line 284:     
Now that we have all of our input files, we need just a simple command to run them
 
Now that we have all of our input files, we need just a simple command to run them
  ${GC}/gotcloud align --conf ${IN}/gotcloud.conf --numcs 2
+
  ${GC}/gotcloud align --conf ${SS}/gotcloud.conf --numcs 2 --base_prefix ${SS} --outdir ${OUT}
 +
 
 +
* <code>${GC}/gotcloud</code> runs GotCloud
 +
* <code>align</code> tells GotCloud you want to run the alignment pipeline.
 +
* <code>--conf</code> tells GotCloud the name of the configuration file to use.
 +
** The configuration for this test was downloaded with the seqshop input files.
 
* <code>--numcs</code> means to run 2 samples at a time.
 
* <code>--numcs</code> means to run 2 samples at a time.
** How many you can run concurrently depends on your system
+
** How many you can run concurrently depends on your system.
 +
* <code>--base_prefix</code> tells GotCloud the prefix to prepend to relative paths.
 +
** The configuration file cannot read environment variables, so we need to tell GotCloud the path to the input files, <code>${SS}</code>.
 +
** Alternatively, gotcloud.conf could be updated to specify the full paths
 +
* <code>--outdir</code> tells GotCloud where to write the output.
 +
** This could be specified in gotcloud.conf, but specifying it on the command line lets you change the output location through <code>${OUT}</code>.
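The command relies on the <code>GC</code>, <code>SS</code>, and <code>OUT</code> variables all being set. A small bash sketch that refuses to build the command until they are. The <code>build_align_cmd</code> function name is hypothetical; the flags are copied from the command in this section, and the function only prints the command rather than running it:

```shell
# Hypothetical helper: print the align command from this tutorial,
# or fail if GC, SS, or OUT is unset or empty.
build_align_cmd() {
    for name in GC SS OUT; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            return 1
        fi
    done
    echo "${GC}/gotcloud align --conf ${SS}/gotcloud.conf --numcs 2 --base_prefix ${SS} --outdir ${OUT}"
}
```

Calling <code>build_align_cmd</code> prints the full command so you can inspect the expanded paths before running it yourself.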
 +
 
 
[[File:gcalignStart.png|850px]]
 
[[File:gcalignStart.png|850px]]
   Line 382: Line 384:  
[[File:Qplotpdf.png|400px]]
 
[[File:Qplotpdf.png|400px]]
 
<li> Look at the PDF I produced when I ran the whole genome:</li>  
 
<li> Look at the PDF I produced when I ran the whole genome:</li>  
  evince ${IN}/example/HG00551.wg.qplot.pdf&
+
  evince ${SS}/ext/HG00551.wg.qplot.pdf&
 
</ul>
 
</ul>
 
[[File:Qplotpdfwg.png|400px]]
 
[[File:Qplotpdfwg.png|400px]]
Line 431: Line 433:     
Let's visualize what reads in that area look like using samtools tview:
 
Let's visualize what reads in that area look like using samtools tview:
  ${GC}/bin/samtools tview ${OUT}/bams/HG00551.recal.bam ${REF}/human.g1k.v37.chr22.fa
+
  ${GC}/bin/samtools tview ${OUT}/bams/HG00551.recal.bam ${SS}/ref22/human.g1k.v37.chr22.fa
 
* Type ‘g’  
 
* Type ‘g’  
 
** Type 22:36907000  
 
** Type 22:36907000  
Line 475: Line 477:     
== Logging Off ==
 
== Logging Off ==
 +
 +
''This section is specifically for the SeqShop Workshop computers.''
 +
<div class="mw-collapsible mw-collapsed" style="width:600px">
 +
''If you are not running during the SeqShop Workshop, please skip this section.''
 +
<div class="mw-collapsible-content">
 
To log out of seqshop-server, type:
 
To log out of seqshop-server, type:
 
  exit
 
  exit
Line 480: Line 487:     
When done, log out of the Windows machine.
 
When done, log out of the Windows machine.
 
+
</div>
 
+
</div>
== Please provide feedback from Day 1 ==
  −
https://docs.google.com/a/umich.edu/forms/d/1afGq98QToC17CO8k6tXXyybIZFFfNsy8mOX5Tzc2QyM/viewform
 