Changes

SeqShop: Calling Your Own Genome, December 2014 (view source)

Revision as of 14:54, 12 December 2014

3,684 bytes added , 14:54, 12 December 2014

→‎Friday

Line 400: Line 400:

ls ~/$SAMPLE/output/vcfs

+

=== More SNP Analysis ===

−

==== WORKSHOP FEEDBACK! ====

+

==== Environmental Variables ====

+

If you didn't set the environmental variable, you can set it again

+

source /net/seqshop-server/home/mktrost/seqshop/setup.txt

+

export SAMPLE=SampleXX (MAKE SURE TO CHANGE XX to your number or use NA12878 instead)

+

source /net/seqshop-server/home/mktrost/seqshop/setupSS.txt

+

In addition, set another environmental variable for locating the binaries for custom analysis

+

export HK=/net/seqshop-server/home/hmkang/seqshop/bin

+

==== Annotation / Lookup against dbSNP ====

+

If you want to add rsIDs to your variant files, you can do this by running the following command

+

$HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK../data/dbSNP.b138/dbsnp_138.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz

+

If you want to run this command across all chromosomes in parallel, you can use the special script run-command-wgs

+

$HK/run-command-wgs --cmd "$HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK/../data/dbSNP.b138/dbsnp_138.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz" --numjobs 6

+

Looking up SNPs by rsID is possible by (for example)

+

$HK/vcf-lookup-rsid --vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --sepchr --rs rs17766217

+

If you want to browse the rsIDs of known GWAS SNPs, you can do this by

+

cut -f 1,8,22 $HK/../data/gwascatalog/gwascatalog.txt | less

+

==== Annotating your genome ====

+

You can annotate your genome using EPACTS software packages. Individual chromosome can be annotated by running.

+

$EPACTS/bin/epacts anno --in $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz

+

Or you can run multiple chromosomes in parallel in one command

+

$HK/run-command-wgs --cmd "$EPACTS/bin/epacts anno --in $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz" --numjobs 6

+

==== Extracting only exonic SNPs ====

+

If you want to look at the exonic SNPs, you can extract using the following command

+

$HK/run-command-wgs --cmd "($HK/tabix -H $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz; zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz | grep Exon;)| $HK/bgzip -c > $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz" --numjobs 6

+

And they can be combined as follows

+

(zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz; zcat $OUT/vcfs/chr[2-9]/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chr??/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chrX/chrX.filtered.rsid.anno.exon.vcf.gz) | $HK/bgzip -c > $OUT/wgs.filtered.rsid.anno.exon.vcf.gz

+

==== Exonic Variants NOT found by 1000G ====

+

If you are interested in rare variants that are not identified by 1000G, you can extract them by running

+

zcat $OUT/wgs.filtered.rsid.anno.exon.vcf.gz | grep "EXTFILTER=NA,NA" | grep -v -w "0/0" | grep -v "less

+

For example,

+

+

will give you the counts of variants, separate by the filtering results

+

* Q1. How manny novel silent, missense, and nonsense SNPs are found? Is that too few, too small, or just about right?

+

* Q2. Looking at each functional category, which functional categories has largest fraction of SNPs failed filter? Why do you think it is?

+

* Q3. Can you exclude the sites that are also in dbSNP, and count how many nonsense variants are left?

+

If you want to know predicted functional significance of a particular variant, you can search by

+

$HK/tabix $HK/../data/CADD/whole_genome_SNVs.tsv.gz [chr]:[pos] | head -3

+

The phred score at the last column quantifies the degree of functional significance

+

===== WORKSHOP FEEDBACK! =====

Please provide feedback on the workshop in general:

Hmkang

Administrators

1,120

edits

Changes

SeqShop: Calling Your Own Genome, December 2014 (view source)

Revision as of 14:54, 12 December 2014

Navigation menu

Page actions

Page actions

Personal tools

quick links

teaching

Navigation

Search

Tools