From Genome Analysis Wiki
Jump to navigationJump to search
2,676 bytes added
, 14:31, 13 November 2014
{{SeqShopLogin}}
== Setup ==
Set these values. If you used a different path for any of these, please update here. Also, be sure to specify your sample name instead of Sample_XXXXX
source /home/mktrost/seqshop/setup.2x.txt
export SAMPLE=Sample_XXXXX
OUT needs to point to where your alignment output went, so if your output is not ~/personal/output, please set OUT appropriately:
export OUT=~/personal/YOUR_OUTPUT_DIR
After setting this, also do
mkdir -p $OUT/ancestry
Verify that this does not give an error:
ls $OUT/bams/${SAMPLE}.recal.bam
== Run ==
=== Step 1: bam --> pileup ===
$GC/bin/samtools mpileup -q 30 -Q 20 -f $REF/hs37d5.fa -l $HGDP/HGDP_938.bed $OUT/bams/${SAMPLE}.recal.bam > $OUT/ancestry/${SAMPLE}.recal.pileup
This step takes 5-6 minutes.
=== Step 2: pileup --> seq ===
python $LASER/pileup2seq/pileup2seq.py \
-m $HGDP/HGDP_938.site \
-o $OUT/ancestry/$SAMPLE.laser \
$OUT/ancestry/$SAMPLE.recal.pileup
This step takes just a few seconds.
=== Estimate ancestry ===
This step will take about 5-6 minutes.
$LASER/laser -g $HGDP/HGDP_938.geno -c $HGDP/HGDP_938.RefPC.coord -s $OUT/ancestry/$SAMPLE.laser.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.laser.2 &
View the results:
less -S $OUT/ancestry/${SAMPLE}.laser.2.SeqPC.coord
== Visualizing Ancestry ==
Copy the R code to plot your ancestry
cp -r $LASER/plot/ $OUT/ancestry/.
Change to that new directory:
cd $OUT/ancestry/plot
Generate the plot:
Rscript plotHGDP.r $HGDP/HGDP_938.RefPC.coord $OUT/ancestry/${SAMPLE}.laser.2.SeqPC.coord
Take a look:
evince Results_on_HGDP.pdf &
==Interested in looking just at European populations?==
=== Step 1: bam --> pileup ===
You can skip this, you already did it.
=== Step 2: pileup --> seq ===
python $LASER/pileup2seq/pileup2seq.py \
-m $HGDP/HGDP.633K.euro.site \
-o $OUT/ancestry/$SAMPLE.Euro.laser \
$OUT/ancestry/${SAMPLE}.recal.pileup
This step takes just a few seconds.
=== Estimate ancestry ===
This step will take a few seconds.
$LASER/laser -g $HGDP/HGDP.633K.euro.geno -c $HGDP/HGDP.633K.euro.RefPC.coord -s $OUT/ancestry/$SAMPLE.Euro.laser.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.Euro.laser.1 &
View the results:
less -S $OUT/ancestry/${SAMPLE}.Euro.laser.1.SeqPC.coord
== Visualizing Ancestry ==
Copy the R code to plot your ancestry
cp -r $LASER/plot/ $OUT/ancestry/.
Change to that new directory:
cd $OUT/ancestry/plot
Move your other plot so you don't over-write it
mv Results_on_HGDP.pdf Results_on_HGDP_All.pdf
Generate the plot:
Rscript plotHGDP.r $HGDP/HGDP.633K.euro.RefPC.coord $OUT/ancestry/${SAMPLE}.Euro.laser.1.SeqPC.coord
Take a look:
evince Results_on_HGDP.pdf &