Line 1: |
Line 1: |
| + | __TOC__ |
| + | |
| + | == Setup == |
| + | <div class="mw-collapsible" style="width:600px"> |
| + | ''If you are already logged in, you can skip this section.'' |
| + | <div class="mw-collapsible-content"> |
| {{SeqShopLogin}} | | {{SeqShopLogin}} |
| + | </div> |
| + | </div> |
| + | |
| + | == Setup your terminal == |
| + | |
| + | Resume screen: |
| + | screen -r |
| + | |
| | | |
− | == Setup == | + | <div class="mw-collapsible mw-collapsed" style="width:700px"> |
− | ''If you were sequenced, set these values:'' | + | '''Only if your snpcalling is still running, let Mary Kate know, expand, and follow these instructions''' |
− | export SAMPLE=Sample* | + | <div class="mw-collapsible-content"> |
| + | * Detach from screen |
| + | Ctrl-a d |
| + | * Start new screen |
| + | screen |
| + | * Reset environment variables (replace SampleXX with your sample name) |
| + | export SAMPLE=SampleXX |
| source /net/seqshop-server/home/mktrost/seqshop/setupSS.txt | | source /net/seqshop-server/home/mktrost/seqshop/setupSS.txt |
− | source /net/seqshop-server/home/chaolong/LASER-Tutorial/setup.txt
| + | * When you detach from screen and reattach, you will have multiple screen sessions |
| + | ** The ancestry one will have today's date, the snpcall one will have an earlier date. |
| + | *** Mary Kate can explain how to reattach to a specific session when we get there. |
| + | </div> |
| + | </div> |
| + | |
| + | |
| + | '''Instructions for everyone: Setup LASER''' |
| | | |
− | ''If you were not sequenced, set these values:''
| + | Source the LASER setup: |
− | export SAMPLE=NA12878
| |
− | source /net/seqshop-server/home/mktrost/seqshop/setupSS.txt
| |
| source /net/seqshop-server/home/chaolong/LASER-Tutorial/setup.txt | | source /net/seqshop-server/home/chaolong/LASER-Tutorial/setup.txt |
| + | (SAMPLE & OUT were previously set in your screen session.) |
| | | |
| After setting this, also do | | After setting this, also do |
Line 22: |
Line 48: |
| === Step 1: bam --> pileup === | | === Step 1: bam --> pileup === |
| | | |
− | $GC/bin/samtools mpileup -q 30 -Q 20 -f $REF/hs37d5.fa -l $HGDP/HGDP_938.bed $OUT/bams/${SAMPLE}.recal.bam > $OUT/ancestry/${SAMPLE}.recal.pileup | + | $GC/bin/samtools mpileup -q 30 -Q 20 -f $REF/human.g1k.v37.fa -l $HGDP/HGDP_938.bed $OUT/bams/${SAMPLE}.recal.bam > $OUT/ancestry/${SAMPLE}.HGDP.pileup |
| + | |
| + | This step takes ~1 hour for a genome sequenced at 17X. |
| + | |
| + | While this is running, we will go look at other results, so exit screen: |
| + | Ctrl-a d |
| + | |
| + | While that's running, let's go run the Association Analysis Practical: [[SeqShop: Genetic Association Analysis Practical, December 2014]] |
| + | |
| + | === Checking if Pileup finished === |
| + | screen -r |
| + | |
| + | '''Did everyone's finish?''' |
| + | |
| + | |
| + | If not, let's go to Indel: [[SeqShop:_Calling_Your_Own_Genome,_December_2014#Reviewing_Indel_Results]] |
| | | |
− | This step takes 5-6 minutes.
| |
| | | |
| + | If yes, let's continue below: |
| === Step 2: pileup --> seq === | | === Step 2: pileup --> seq === |
| | | |
| python $LASER/pileup2seq/pileup2seq.py \ | | python $LASER/pileup2seq/pileup2seq.py \ |
| -m $HGDP/HGDP_938.site \ | | -m $HGDP/HGDP_938.site \ |
− | -o $OUT/ancestry/$SAMPLE.laser \ | + | -o $OUT/ancestry/$SAMPLE.HGDP \ |
− | $OUT/ancestry/$SAMPLE.recal.pileup | + | $OUT/ancestry/$SAMPLE.HGDP.pileup |
| | | |
| This step takes just a few seconds. | | This step takes just a few seconds. |
Line 38: |
Line 79: |
| This step will take about 5-6 minutes. | | This step will take about 5-6 minutes. |
| | | |
− | $LASER/laser -g $HGDP/HGDP_938.geno -c $HGDP/HGDP_938.RefPC.coord -s $OUT/ancestry/$SAMPLE.laser.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.laser.2 & | + | $LASER/laser -g $HGDP/HGDP_938.geno -c $HGDP/HGDP_938.RefPC.coord -s $OUT/ancestry/$SAMPLE.HGDP.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.HGDP & |
| | | |
| View the results: | | View the results: |
− | less -S $OUT/ancestry/${SAMPLE}.laser.2.SeqPC.coord | + | less -S $OUT/ancestry/${SAMPLE}.HGDP.SeqPC.coord |
| | | |
| == Visualizing Ancestry == | | == Visualizing Ancestry == |
Line 51: |
Line 92: |
| | | |
| Generate the plot: | | Generate the plot: |
− | Rscript plotHGDP.r $HGDP/HGDP_938.RefPC.coord $OUT/ancestry/${SAMPLE}.laser.2.SeqPC.coord | + | Rscript plotHGDP.r $HGDP/HGDP_938.RefPC.coord $OUT/ancestry/${SAMPLE}.HGDP.SeqPC.coord |
| | | |
| Take a look: | | Take a look: |
| evince Results_on_HGDP.pdf & | | evince Results_on_HGDP.pdf & |
| | | |
− | ==Interested in looking just at European populations?== | + | ==Interested in looking just at European or East Asian or Central South Asian populations?== |
| + | === Population information === |
| + | 8 European populations in the HGDP dataset, including 156 individuals in total. |
| + | |
| + | 18 East Asian populations in the HGDP dataset, including 224 individuals after excluding 5 outliers (defined as more than 5 standard deviations from mean in any of top 10 PCs, 1 removal iteration). |
| + | |
| + | 8 Central/South Asian populations in the HGDP dataset, including 195 individuals after excluding 5 outliers (defined as more than 5 standard deviations from mean in any of top 10 PCs, 1 removal iteration). Note that 7 out of these 8 population were collected in Pakistan and 1 was collected in China (Uygur). |
| + | |
| + | [[File:HGDP Popualtions.png|thumb|center|alt=HGDP populations |300px|HGDP populations]] |
| + | |
| === Step 1: bam --> pileup === | | === Step 1: bam --> pileup === |
| You can skip this, you already did it. | | You can skip this, you already did it. |
Line 62: |
Line 112: |
| === Step 2: pileup --> seq === | | === Step 2: pileup --> seq === |
| | | |
− | python $LASER/pileup2seq/pileup2seq.py \
| + | You can skip this step too, because you already did it. |
− | -m $HGDP/HGDP.633K.euro.site \
| |
− | -o $OUT/ancestry/$SAMPLE.Euro.laser \
| |
− | $OUT/ancestry/${SAMPLE}.recal.pileup
| |
− | | |
− | This step takes just a few seconds.
| |
| | | |
| === Estimate ancestry === | | === Estimate ancestry === |
| This step will take a few seconds. | | This step will take a few seconds. |
| | | |
− | $LASER/laser -g $HGDP/HGDP.633K.euro.geno -c $HGDP/HGDP.633K.euro.RefPC.coord -s $OUT/ancestry/$SAMPLE.Euro.laser.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.Euro.laser.1 & | + | For European ancestry: |
| + | $LASER/laser -g $HGDP/HGDP.633K.euro.geno -c $HGDP/HGDP.633K.euro.RefPC.coord -s $OUT/ancestry/$SAMPLE.HGDP.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.HGDP-Euro & |
| + | |
| + | For East Asian ancestry: |
| + | $LASER/laser -g $HGDP/HGDP.633K.easia.2.geno -c $HGDP/HGDP.633K.easia.2.RefPC.coord -s $OUT/ancestry/$SAMPLE.HGDP.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.HGDP-EAsia & |
| + | |
| + | For Central/South Asian ancestry: |
| + | $LASER/laser -g $HGDP/HGDP.633K.csasia.2.geno -c $HGDP/HGDP.633K.csasia.2.RefPC.coord -s $OUT/ancestry/$SAMPLE.HGDP.seq -K 20 -k 4 -M 0.8 -o $OUT/ancestry/$SAMPLE.HGDP-CSAsia & |
| + | |
| | | |
− | View the results: | + | View the results for European ancestry: |
− | less -S $OUT/ancestry/${SAMPLE}.Euro.laser.1.SeqPC.coord | + | less -S $OUT/ancestry/${SAMPLE}.HGDP-Euro.SeqPC.coord |
| + | or for East Asian ancestry |
| + | less -S $OUT/ancestry/${SAMPLE}.HGDP-EAsia.SeqPC.coord |
| + | or for Central/South Asian ancestry |
| + | less -S $OUT/ancestry/${SAMPLE}.HGDP-CSAsia.SeqPC.coord |
| | | |
| == Visualizing Ancestry == | | == Visualizing Ancestry == |
− | Copy the R code to plot your ancestry
| |
− | cp -r $LASER/plot/ $OUT/ancestry/.
| |
| | | |
| Change to that new directory: | | Change to that new directory: |
Line 88: |
Line 143: |
| | | |
| | | |
− | Generate the plot: | + | Generate the plot for European ancestry: |
− | Rscript plotHGDP.r $HGDP/HGDP.633K.euro.RefPC.coord $OUT/ancestry/${SAMPLE}.Euro.laser.1.SeqPC.coord | + | Rscript plotHGDP.r $HGDP/HGDP.633K.euro.RefPC.coord $OUT/ancestry/${SAMPLE}.HGDP-Euro.SeqPC.coord |
| + | or for East Asian ancestry |
| + | Rscript plotHGDP.r $HGDP/HGDP.633K.easia.2.RefPC.coord $OUT/ancestry/${SAMPLE}.HGDP-EAsia.SeqPC.coord |
| + | or for Central/South Asian ancestry |
| + | Rscript plotHGDP.r $HGDP/HGDP.633K.csasia.2.RefPC.coord $OUT/ancestry/${SAMPLE}.HGDP-CSAsia.SeqPC.coord |
| | | |
| Take a look: | | Take a look: |
| evince Results_on_HGDP.pdf & | | evince Results_on_HGDP.pdf & |
| + | |
| + | == Return to Calling your own Genome == |
| + | Go to [[SeqShop:_Calling_Your_Own_Genome,_December_2014#Reviewing_Indel_Results]] and pick up where we left off. |