Changes

From Genome Analysis Wiki
Line 24: Line 24:  
     capt pileup [options]
 
     capt pileup [options]
 
   
 
   
     Required Options (Run epacts single -man or see wiki for more info):
+
     Required Options (Run capt-pileup -man or see wiki for more info):
 
       -loci STR        Input genomic position to perform pileup
 
       -loci STR        Input genomic position to perform pileup
 
       -index STR        Index file containing sample IDs and BAM file path
 
       -index STR        Index file containing sample IDs and BAM file path
Line 60: Line 60:  
* Overlapping pairs of read fragments are not handled specially by default. **It is recommended** to explicitly turn on the --clip-overlap option, which clips either side of an overlapping read fragment pair, to improve the filtering performance.
 
* Overlapping pairs of read fragments are not handled specially by default. **It is recommended** to explicitly turn on the --clip-overlap option, which clips either side of an overlapping read fragment pair, to improve the filtering performance.
 
* By default, it assumes that it runs on a single machine. If you are running on a MOSIX-enabled cluster, --mosix-nodes [node1,node2,node3,..,noden] spreads the jobs across multiple nodes in parallel.
 
* By default, it assumes that it runs on a single machine. If you are running on a MOSIX-enabled cluster, --mosix-nodes [node1,node2,node3,..,noden] spreads the jobs across multiple nodes in parallel.
 +
 +
Internally, the current implementation runs samtools to collect this pileup. We have a separate software package that handles indels and SNPs together and will replace samtools soon.
    
Examples from the 1000 Genomes project are available at
 
Examples from the 1000 Genomes project are available at
Line 65: Line 67:     
For example, you can adapt the following command
 
For example, you can adapt the following command
  /net/fantasia/home/hmkang/bin/captTest/bin/capt-pileup --index /net/1000g/hmkang/1KG/phase3/index/20130502.gotcloud.low_coverage.2col.index --out /net/1000g/hmkang/1KG/phase3/wg.consensus/lcmpus/phase3.low_coverage.wgs --loci /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci --mosix-nodes 10,11,12,13 --ref /net/1000g/hmkang/1KG/phase3/gotcloud/gotcloud.ref/hs37d5.fa --clip-overlap
+
  /net/fantasia/home/hmkang/bin/captTest/bin/capt-pileup --index /net/1000g/hmkang/1KG/phase3/index/20130502.gotcloud.low_coverage.2col.index \
 +
    --out /net/1000g/hmkang/1KG/phase3/wg.consensus/lcmpus/phase3.low_coverage.wgs \
 +
    --loci /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci \
 +
    --mosix-nodes 10,11,12,13 \
 +
    --ref /net/1000g/hmkang/1KG/phase3/gotcloud/gotcloud.ref/hs37d5.fa \
 +
    --clip-overlap
    
Take a peek at each input file to better understand what you actually need to prepare
 
Take a peek at each input file to better understand what you actually need to prepare
Line 80: Line 87:  
  HG00107 /net/1000g/1000g/data/HG00107/alignment/HG00107.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam
 
  HG00107 /net/1000g/1000g/data/HG00107/alignment/HG00107.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam
   −
  $head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci
+
  $ head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci
 
  1 10002
 
  1 10002
 
  1 10004
 
  1 10004
Line 91: Line 98:  
  1 10473
 
  1 10473
 
  1 10478
 
  1 10478
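
Before launching a long run, it may help to sanity-check the loci file. The following is a minimal sketch, assuming the format is exactly what the sample above shows: two whitespace-separated columns, chromosome then position. The function names `parse_loci_line` and `check_loci` are hypothetical helpers and not part of capt-pileup, and the requirement that positions be positive integers is an assumption.

```python
def parse_loci_line(line):
    """Parse one loci line into (chromosome, position).

    Assumption: two whitespace-separated columns, as in the sample above.
    """
    fields = line.split()
    if len(fields) != 2:
        raise ValueError(f"expected 2 columns, got {len(fields)}: {line!r}")
    chrom, pos_text = fields
    pos = int(pos_text)  # raises ValueError if the position is not an integer
    if pos < 1:
        # Assumption: positions are positive genomic coordinates.
        raise ValueError(f"position must be positive: {line!r}")
    return chrom, pos


def check_loci(lines):
    """Parse every non-blank line and return a list of (chromosome, position)."""
    return [parse_loci_line(line) for line in lines if line.strip()]


if __name__ == "__main__":
    sample = ["1 10002", "1 10004", "1 10473", "1 10478"]
    for chrom, pos in check_loci(sample):
        print(chrom, pos)
```

To check a real file, pass an open file handle instead of the sample list, for example `check_loci(open(path))`.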
 +
 +
The output file has a somewhat cryptic format, but it is readable by the downstream software. See the [[http://samtools.sourceforge.net/pileup.shtml  Samtools Pileup Format]] web page for the details of the output format.
 +
 +
$ zcat /net/1000g/hmkang/1KG/phase3/wg.consensus/lcmpus/phase3.low_coverage.wgs.HG00096.txt.gz | head
 +
1 10327 T 4 ,.,. E@,H >>>F 76,30,36,18
 +
1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12
 +
1 10470 G 4 .,,. 8?CH >>FF 79,67,39,13
 +
1 10471 C 4 .,,. D=:Q >>FF 80,68,40,14
 +
1 10472 G 4 .,,. <DLH >>FF 81,69,41,15
 +
1 10473 G 4 .,,. C=DR >>FF 82,70,42,16
 +
1 10478 C 5 .,,., DJQR> >>FF> 87,75,47,21,2
 +
1 10492 C 5 ,,T,, QKAA@ >FF>> 89,61,35,16,7
 +
1 10494 G 5 ,,.,, QQ<G> >FF>> 91,63,37,18,9
 +
1 10503 T 5 ,$,.,, /QDAG >FF>> 100,72,46,27,18
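
To make the columns above a little less cryptic, here is a rough sketch of reading one output line. It assumes the first six whitespace-separated columns follow the samtools pileup layout (chromosome, position, reference base, depth, read bases, base qualities) and that base qualities are Phred+33 encoded. The two trailing columns look like per-read mapping qualities and positions within the read, but that is a guess, so the sketch keeps them as raw strings. The function names are hypothetical and not part of capt-pileup.

```python
def decode_phred33(text):
    """Convert a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in text]


def count_ref_matches(bases):
    """Count reads that match the reference ('.' or ',') in a read-bases string."""
    matches = 0
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read-start marker: the next character encodes a mapping quality.
            i += 2
            continue
        if ch in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])
            continue
        if ch in ".,":
            matches += 1
        i += 1  # also steps over '$' (read end) and mismatch letters
    return matches


def parse_pileup_line(line):
    """Split one pileup line into named fields; trailing columns stay raw."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns: {line!r}")
    return {
        "chrom": fields[0],
        "pos": int(fields[1]),
        "ref": fields[2],
        "depth": int(fields[3]),
        "bases": fields[4],
        "base_quals": decode_phred33(fields[5]),
        "extra": fields[6:],  # assumed mapping qualities and read positions
    }


if __name__ == "__main__":
    rec = parse_pileup_line("1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12")
    print(rec["pos"], rec["depth"], count_ref_matches(rec["bases"]))
```

For real output, the same function can be applied line by line to a file opened with Python's `gzip.open(path, "rt")`.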
 +
 +
(TO BE CONTINUED)..
