Difference between revisions of "SeqShop: Calling Your Own Genome, June 2014"

From Genome Analysis Wiki

Revision as of 01:27, 20 June 2014

Log in to the seqshop-server Linux Machine

This section appears in every session. If you are already logged in or know how to log in to the server, please skip this section.

  1. Log in to the Windows machine
    • The username/password for the Windows machine should be written on the right-hand monitor
  2. Start Xming so you can open external windows on our Linux machine
    • Start->Enter "Xming" in the search and select "Xming" from the program list
    • No window will appear, but Xming is now running.
  3. Open PuTTY
    • Start->Enter "putty" in the search and select "PuTTY" from the program list
  4. Configure PuTTY in the PuTTY Configuration window
    • Host Name: seqshop-server.sph.umich.edu
    • Set up X11 forwarding so you can open external windows:
      • In the left panel: Connection->SSH->X11
        • Add a check mark in the box next to Enable X11 forwarding
    • Click Open
    • If it prompts about a key, click OK
  5. Enter the username & password you were provided


You should now be logged into a terminal on the seqshop-server and be able to access the test files.

  • If you need another terminal, repeat from step 3.

Log in to the seqshop Machine

So you can each run multiple jobs at once, we will have you run on 4 different machines within our seqshop setup.

  • You can only access these machines after logging onto seqshop-server

3 users log on to:

ssh -X seqshop1

3 users log on to:

ssh -X seqshop2

2 users log on to:

ssh -X seqshop3

2 users log on to:

ssh -X seqshop4

Setup

Set these values. If you used a different path for any of them, please update it here. Also, be sure to specify your own sample name instead of Sample_XXXXX.

source /home/mktrost/seqshop/setup.2x.txt
export SAMPLE=Sample_XXXXX
export ALIGN_OUT=~/personal/output
export CHR20_OUT=~/personal/output.20
mkdir -p $CHR20_OUT

ALIGN_OUT needs to point to where your alignment output went, so if your output is not in ~/personal/output, please set ALIGN_OUT appropriately.

Verify that this does not give an error:

ls $ALIGN_OUT/bams/${SAMPLE}.recal.bam
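
If you would rather get an explicit message than read the ls output, the same check can be sketched as a small shell function. The name check_recal_bam is hypothetical (it is not part of GotCloud or the seqshop setup), and it assumes the bams/<sample>.recal.bam layout used above.

```shell
# Hypothetical helper, not part of GotCloud: report whether the recalibrated
# BAM for a sample exists under the given alignment output directory.
check_recal_bam() {
  out_dir="$1"   # e.g. the value of $ALIGN_OUT
  sample="$2"    # e.g. the value of $SAMPLE
  bam="$out_dir/bams/$sample.recal.bam"
  if [ -f "$bam" ]; then
    echo "found: $bam"
  else
    echo "missing: $bam" >&2
    return 1
  fi
}
```

After defining it, call it as: check_recal_bam "$ALIGN_OUT" "$SAMPLE". If it reports the file as missing, re-check ALIGN_OUT and SAMPLE before continuing.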


Chromosome 20

We want to add the 100 1000G chr20 BAMs to your bam list. Let's copy the original one into a new one so we can run other tests later.

cp $ALIGN_OUT/bam.index $CHR20_OUT/bam.20.index

Now add the chr20 BAMs to your new bam list:

cat $IN/chr20/bam.20.index >> $CHR20_OUT/bam.20.index

Verify you have 101 lines in your list:

wc -l $CHR20_OUT/bam.20.index
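
As an optional convenience, the line count can be compared against the expected number automatically. This is only a sketch; check_index_lines is a hypothetical helper name, and 101 is the expected count because the list should hold your sample plus the 100 1000G samples.

```shell
# Hypothetical helper: compare the number of lines in a bam index file
# against the number of samples you expect it to list.
check_index_lines() {
  index_file="$1"
  expected="$2"
  if [ ! -f "$index_file" ]; then
    echo "missing: $index_file" >&2
    return 1
  fi
  # wc -l pads its output on some systems, so strip the spaces.
  actual=$(wc -l < "$index_file" | tr -d ' ')
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual lines"
  else
    echo "MISMATCH: found $actual lines, expected $expected" >&2
    return 1
  fi
}
```

After defining it, call it as: check_index_lines "$CHR20_OUT/bam.20.index" 101. A mismatch usually means the cat command above was run more than once or not at all.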

Update your gotcloud configuration file to indicate only chromosome 20 and point to the new list:

nedit ~/personal/gotcloud.2x.conf

Update OUT_DIR & BAM_INDEX to:

OUT_DIR = $(IN_DIR)/output.20
BAM_INDEX = $(OUT_DIR)/bam.20.index

Tell it you only want to process chromosome 20, by adding the following anywhere in the file:

CHRS = 20
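
Before running, it can help to confirm that the three settings were actually saved. The sketch below simply greps the conf file for uncommented OUT_DIR, BAM_INDEX and CHRS lines; show_chr20_settings is a hypothetical helper name, and the sketch assumes the "KEY = value" layout shown above.

```shell
# Hypothetical helper: print the uncommented OUT_DIR, BAM_INDEX and CHRS
# lines from a GotCloud conf file so you can confirm your edits.
show_chr20_settings() {
  conf_file="$1"
  # Lines starting with '#' do not match, so commented-out settings are skipped.
  grep -E '^[[:space:]]*(OUT_DIR|BAM_INDEX|CHRS)[[:space:]]*=' "$conf_file"
}
```

After defining it, call it as: show_chr20_settings ~/personal/gotcloud.2x.conf. You should see the three lines from above. If a key shows up more than once, remove the stale entry rather than guessing which one GotCloud will use.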

Since it would take a while to run chromosome 20 for 101 samples, I already ran the first step for the 100 1000G samples.

We will "trick" GotCloud into thinking you already ran them by copying them into your output directory.

cp -r $IN/chr20/glfs $CHR20_OUT/.

Now you are ready to run. Because you updated BAM_INDEX in your conf file to point to the chr20 bam list, you do not need to specify the list on the command line.

$GC/gotcloud snpcall --conf ~/personal/gotcloud.2x.conf --numjobs 1

GotCloud Configuration Updates

To speed things up, we will only run on certain regions.

  • We need to update the GotCloud Configuration to do this.

Locate your gotcloud.2x.conf (probably at: ~/personal/gotcloud.2x.conf) and open it in your favorite editor:

nedit  ~/personal/gotcloud.2x.conf

Add the following lines:

# Specify the path to the regions we want to call
UNIFORM_TARGET_BED = $(REF_DIR)/20130108.exome.targets.nochr.bed

# We do not want any off target bases
OFFSET_OFF_TARGET = 0

WRITE_TARGET_LOCI = TRUE
TARGET_DIR = target

Run

Set CHRS = 1 2

cp -r /home/mktrost/seqshop/inputs/glfs personal/output/.

cp personal/output/bam.index personal/output/bam.exome.index

cat /home/mktrost/seqshop/inputs/20130502.gotcloud.low_coverage.100.index >> personal/output/bam.exome.index

wc -l personal/output/bam.exome.index

cp /home/mktrost/seqshop/inputs/gotcloud.exome.conf personal/.

Update the conf ...your home.

/home/mktrost/seqshop/gotcloud/gotcloud snpcall --conf personal/gotcloud.exome.conf