Changes

From Genome Analysis Wiki
Line 60: Line 60:  
* Overlapping pairs of read fragments are not specially handled by default. **It is recommended** to explicitly turn on the --clip-overlap option to clip either side of the overlapping read fragments, which improves the filtering performance.
 
* By default, it assumes that it runs on a single machine. If you are running on a MOSIX-enabled cluster, mosix-nodes [node1,node2,node3,..,noden] will spread the jobs across multiple nodes in parallel.
 +
 +
Internally, the current implementation runs samtools to collect this pileup. We have a separate software package that handles indels and SNPs together and will replace samtools soon.
    
Examples from the 1000 Genomes project are available at
Line 85: Line 87:  
  HG00107 /net/1000g/1000g/data/HG00107/alignment/HG00107.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam
   −
  $head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci
+
  $ head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci
 
  1 10002
  1 10004
Line 96: Line 98:  
  1 10473
  1 10478
 +
 +
The output file has a somewhat cryptic format, but it is readable by the downstream software. See the [[http://samtools.sourceforge.net/pileup.shtml | Samtools Pileup Format]] web page for the details of the output format.
 +
 +
$ zcat /net/1000g/hmkang/1KG/phase3/wg.consensus/lcmpus/phase3.low_coverage.wgs.HG00096.txt.gz | head
 +
1 10327 T 4 ,.,. E@,H >>>F 76,30,36,18
 +
1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12
 +
1 10470 G 4 .,,. 8?CH >>FF 79,67,39,13
 +
1 10471 C 4 .,,. D=:Q >>FF 80,68,40,14
 +
1 10472 G 4 .,,. <DLH >>FF 81,69,41,15
 +
1 10473 G 4 .,,. C=DR >>FF 82,70,42,16
 +
1 10478 C 5 .,,., DJQR> >>FF> 87,75,47,21,2
 +
1 10492 C 5 ,,T,, QKAA@ >FF>> 89,61,35,16,7
 +
1 10494 G 5 ,,.,, QQ<G> >FF>> 91,63,37,18,9
 +
1 10503 T 5 ,$,.,, /QDAG >FF>> 100,72,46,27,18
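
To make the pileup lines above easier to read, here is a minimal Python sketch that tallies the base calls at one position. It relies only on the standard samtools pileup conventions for the read-bases column: `.` and `,` are matches to the reference on the forward and reverse strand, a letter is a mismatch, `^` plus one character marks a read start, `$` marks a read end, `+N`/`-N` introduce an indel of length N, and `*` is a deleted base. The function names `count_bases` and `parse_pileup_line` are hypothetical and are not part of the pipeline. The sketch uses only the first five columns; the meaning of the last two columns (probably mapping qualities and positions within the read) is an assumption and is left unparsed.

```python
import re
from collections import Counter


def count_bases(ref, bases):
    """Tally base calls in a pileup read-bases string (hypothetical helper)."""
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read start: the next character encodes a mapping quality, skip both.
            i += 2
            continue
        if c == "$":
            # Read end marker carries no base call.
            i += 1
            continue
        if c in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                raise ValueError("indel marker without a length: " + bases)
            i += 1 + len(m.group()) + int(m.group())
            continue
        if c in ".,":
            # Match to the reference on the forward (.) or reverse (,) strand.
            counts[ref.upper()] += 1
        elif c != "*":
            # Any other letter is a mismatch; '*' (deleted base) is not counted.
            counts[c.upper()] += 1
        i += 1
    return counts


def parse_pileup_line(line):
    """Split one whitespace-separated pileup line and count its base calls."""
    chrom, pos, ref, depth, bases = line.split()[:5]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "depth": int(depth),
        "counts": count_bases(ref, bases),
    }


if __name__ == "__main__":
    # Second line of the example output shown above.
    rec = parse_pileup_line("1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12")
    print(rec["chrom"], rec["pos"], rec["depth"], dict(rec["counts"]))
```

For the position 1:10469 line, the tally is four reads supporting the reference base C and one read showing G, which adds up to the reported depth of 5.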
