Changes

From Genome Analysis Wiki
Line 24: Line 24:  
     capt pileup [options]
 
     capt pileup [options]
 
   
 
   
     Required Options (Run epacts single -man or see wiki for more info):
+
     Required Options (Run capt-pileup -man or see wiki for more info):
 
       -loci STR        Input genomic position to perform pileup
 
       -loci STR        Input genomic position to perform pileup
 
       -index STR        Index file containing sample IDs and BAM file path
 
       -index STR        Index file containing sample IDs and BAM file path
Line 60: Line 60:  
* Overlapping pairs of read fragments are not specially handled by default. **It is recommended** to explicitly turn on the --clip-overlap option, which clips either side of an overlapping read fragment, to improve filtering performance.
 
* Overlapping pairs of read fragments are not specially handled by default. **It is recommended** to explicitly turn on the --clip-overlap option, which clips either side of an overlapping read fragment, to improve filtering performance.
 
* By default, it assumes that it runs on a single machine. If you are running on a MOSIX-enabled cluster, mosix-nodes [node1,node2,node3,..,noden] allows the jobs to be spread across multiple nodes in parallel.
 
* By default, it assumes that it runs on a single machine. If you are running on a MOSIX-enabled cluster, mosix-nodes [node1,node2,node3,..,noden] allows the jobs to be spread across multiple nodes in parallel.
 +
 +
Internally, the current implementation runs samtools to collect this pileup. We have a separate software package that handles indels and SNPs together and will replace samtools soon.
    
Examples from the 1000 Genomes project are available at
 
Examples from the 1000 Genomes project are available at
Line 85: Line 87:  
  HG00107 /net/1000g/1000g/data/HG00107/alignment/HG00107.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam
 
  HG00107 /net/1000g/1000g/data/HG00107/alignment/HG00107.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam
   −
  $head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci
+
  $ head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci
 
  1 10002
 
  1 10002
 
  1 10004
 
  1 10004
Line 96: Line 98:  
  1 10473
 
  1 10473
 
  1 10478
 
  1 10478
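
The two input files shown above can be read with a few lines of code. The following is a minimal Python sketch, assuming the index file has exactly two whitespace-separated columns (sample ID, BAM path) and the loci file has two (chromosome, 1-based position), as the examples suggest. The function names are hypothetical and are not part of capt pileup.

```python
from typing import Tuple


def parse_index_line(line: str) -> Tuple[str, str]:
    """Split one index line into (sample_id, bam_path).

    Assumes two whitespace-separated columns, as in the example above.
    """
    fields = line.split()
    if len(fields) != 2:
        raise ValueError(f"expected 2 columns in index line, got {len(fields)}")
    return fields[0], fields[1]


def parse_locus_line(line: str) -> Tuple[str, int]:
    """Split one loci line into (chromosome, position).

    The chromosome is kept as a string because names such as 'X' are not numeric.
    """
    fields = line.split()
    if len(fields) != 2:
        raise ValueError(f"expected 2 columns in loci line, got {len(fields)}")
    return fields[0], int(fields[1])


if __name__ == "__main__":
    print(parse_locus_line("1 10002"))
```

Checking both files this way before launching a run is one inexpensive way to catch malformed lines early.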
 +
 +
The output format is somewhat cryptic, but downstream software can read it. See the [[http://samtools.sourceforge.net/pileup.shtml  Samtools Pileup Format]] web page for details of the output format.
 +
 +
$ zcat /net/1000g/hmkang/1KG/phase3/wg.consensus/lcmpus/phase3.low_coverage.wgs.HG00096.txt.gz | head
 +
1 10327 T 4 ,.,. E@,H >>>F 76,30,36,18
 +
1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12
 +
1 10470 G 4 .,,. 8?CH >>FF 79,67,39,13
 +
1 10471 C 4 .,,. D=:Q >>FF 80,68,40,14
 +
1 10472 G 4 .,,. <DLH >>FF 81,69,41,15
 +
1 10473 G 4 .,,. C=DR >>FF 82,70,42,16
 +
1 10478 C 5 .,,., DJQR> >>FF> 87,75,47,21,2
 +
1 10492 C 5 ,,T,, QKAA@ >FF>> 89,61,35,16,7
 +
1 10494 G 5 ,,.,, QQ<G> >FF>> 91,63,37,18,9
 +
1 10503 T 5 ,$,.,, /QDAG >FF>> 100,72,46,27,18
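
To make the listing above easier to read, here is a minimal Python sketch that parses one output line and tallies the observed alleles. It assumes eight whitespace-separated columns: chromosome, position, reference base, depth, read bases, base qualities, and then what appear to be mapping qualities and positions within each read. The meaning of the last two columns is inferred from the sample rather than confirmed, and the names `PileupRecord`, `parse_pileup_line`, and `count_alleles` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PileupRecord:
    chrom: str
    pos: int
    ref: str
    depth: int
    bases: str                 # samtools-style read-base string
    base_quals: str            # one ASCII-encoded quality per read
    map_quals: str             # assumed: one ASCII-encoded mapping quality per read
    read_positions: List[int]  # assumed: position of the base within each read


def parse_pileup_line(line: str) -> PileupRecord:
    """Parse one eight-column line of the pileup output."""
    f = line.split()
    if len(f) != 8:
        raise ValueError(f"expected 8 columns, got {len(f)}")
    return PileupRecord(f[0], int(f[1]), f[2], int(f[3]), f[4], f[5], f[6],
                        [int(x) for x in f[7].split(",")])


def count_alleles(rec: PileupRecord) -> Dict[str, int]:
    """Count alleles in the read-base string, ignoring strand."""
    counts: Counter = Counter()
    bases, i = rec.bases, 0
    while i < len(bases):
        c = bases[i]
        if c == "^":        # read start: skip the marker and its quality character
            i += 2
        elif c == "$":      # read end marker
            i += 1
        elif c in "+-":     # indel: skip the length digits and the inserted/deleted bases
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])
        elif c in ".,":     # match to the reference on forward/reverse strand
            counts[rec.ref.upper()] += 1
            i += 1
        else:               # mismatch (or '*' deletion placeholder)
            counts[c.upper()] += 1
            i += 1
    return dict(counts)


if __name__ == "__main__":
    rec = parse_pileup_line("1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12")
    print(rec.pos, count_alleles(rec))
```

For the second line of the listing, the read-base string `.g,$,.` contains four reference matches and one `g` mismatch, with `$` marking the end of a read.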
 +
 +
(TO BE CONTINUED)..
