Changes

LASER (view source)

Revision as of 14:46, 4 February 2013

394 bytes added , 14:46, 4 February 2013

→‎Process sequencing file (BAM)

Line 39: Line 39:

We illustrate how to obtain .seq file from BAM files in this section.

−

In this example, we use HGDP data set as reference, which contains 938 individuals and 632,958 markers.

+

In this example, we use HGDP data set as a reference, which contains 938 individuals and 632,958 markers.

1. Obtain pileup files from BAM files

−

We use ''samtools'' to extract the sequence bases overlapping ~~the~~ 632,958 reference markers:

+

The first step is to generate a BED file:

+

cat ../resource/HGDP/HGDP_938.site |awk '{if (NR > 1) {print $1, $2-1, $2;}}' > HGDP_938.bed

+

This BED file contains the positions of all the reference markers.

+

Then we use ''samtools'' to extract the sequence bases overlapping these 632,958 reference markers.

+

Assuming your BAM file name is ''NA12878.chrom22.recal.bam'' (our example BAM file), you can use this:

+

samtools mpileup -q 30 -Q 20 -f ../../LASER-resource/reference/hs37d5.fa -l HGDP_938.bed exampleBAM/NA12878.chrom22.recal.bam > NA12878.chrom22.pileup

+

to obtain a pileup file named ''NA12878.chrom22.pileup''. It is required to keep the ''.pileup' suffix.

2. Obtain a seq file from pileup files.

−

~~To convert~~ pileup files into a single seq file before running LASER~~, we first~~ generate a ~~site~~ file:

+

After obtaining pileup files from each BAM file, you can convert them into a single seq file before running LASER.

−

+

Use the same site file and all generated pileup files from step 1 to generate a seq file:

−

~~cat ../resource/HGDP/HGDP_938.site |awk '{if (NR > 1) {print $1, $2-1, $2;}}' > HGDP_938.bed~~

−

~~Then use this~~ site ~~file and all generated~~ pileup ~~files from step 1 to generate a seq file:~~

+

python pileup2seq.py -m ../resource/HGDP/HGDP_938.site -o test NA12878.chrom22.pileup

−

~~python pileup2seq.py -m ../resource/HGDP/HGDP_938.site -o~~ test ~~NA12878~~.~~chrom22~~.~~pileup~~

+

You should obtain test.seq file after this step.

== Estimate ancestries of sequence samples ==

Zhanxw

255

edits

Changes

LASER (view source)

Revision as of 14:46, 4 February 2013

Navigation menu

Page actions

Page actions

Personal tools

quick links

teaching

Navigation

Search

Tools