Line 60: |
Line 60: |
| * Overlapping pairs of read fragments are not specially handled by default. **It is recommended** to explicitly turn on the --clip-overlap option to clip either side of the overlapping read fragments to improve the filtering performance. | | * Overlapping pairs of read fragments are not specially handled by default. **It is recommended** to explicitly turn on the --clip-overlap option to clip either side of the overlapping read fragments to improve the filtering performance. |
| * By default, it assumes that it runs on a single machine. If you are running on a MOSIX-enabled cluster, mosix-nodes [node1,node2,node3,..,noden] allows the jobs to be spread across multiple nodes in parallel. | | * By default, it assumes that it runs on a single machine. If you are running on a MOSIX-enabled cluster, mosix-nodes [node1,node2,node3,..,noden] allows the jobs to be spread across multiple nodes in parallel. |
| + | |
| + | Internally, the current implementation runs samtools to collect this pileup. We have a separate software package that handles indels and SNPs together and will replace samtools soon. |
| | | |
| Examples from the 1000 Genomes project are available at | | Examples from the 1000 Genomes project are available at |
Line 85: |
Line 87: |
| HG00107 /net/1000g/1000g/data/HG00107/alignment/HG00107.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam | | HG00107 /net/1000g/1000g/data/HG00107/alignment/HG00107.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam |
| | | |
− | $head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci | + | $ head /net/1000g/hmkang/1KG/phase3/wg.consensus/union/union.snps.sites.loci |
| 1 10002 | | 1 10002 |
| 1 10004 | | 1 10004 |
Line 96: |
Line 98: |
| 1 10473 | | 1 10473 |
| 1 10478 | | 1 10478 |
| + | |
| + | The output file format is somewhat cryptic, but it is readable by the downstream software. See the [[http://samtools.sourceforge.net/pileup.shtml | Samtools Pileup Format]] web page for the details of the output format. |
| + | |
| + | $ zcat /net/1000g/hmkang/1KG/phase3/wg.consensus/lcmpus/phase3.low_coverage.wgs.HG00096.txt.gz | head |
| + | 1 10327 T 4 ,.,. E@,H >>>F 76,30,36,18 |
| + | 1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12 |
| + | 1 10470 G 4 .,,. 8?CH >>FF 79,67,39,13 |
| + | 1 10471 C 4 .,,. D=:Q >>FF 80,68,40,14 |
| + | 1 10472 G 4 .,,. <DLH >>FF 81,69,41,15 |
| + | 1 10473 G 4 .,,. C=DR >>FF 82,70,42,16 |
| + | 1 10478 C 5 .,,., DJQR> >>FF> 87,75,47,21,2 |
| + | 1 10492 C 5 ,,T,, QKAA@ >FF>> 89,61,35,16,7 |
| + | 1 10494 G 5 ,,.,, QQ<G> >FF>> 91,63,37,18,9 |
| + | 1 10503 T 5 ,$,.,, /QDAG >FF>> 100,72,46,27,18 |
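
For readers who want to inspect the pileup output by hand, the following is a minimal Python sketch of how one line might be decoded. It is not part of the pipeline. The column meanings (chromosome, position, reference base, depth, read bases, base qualities, mapping qualities, positions within reads) are an assumption inferred from the sample output above and the samtools pileup conventions, and the Phred+33 quality encoding is likewise assumed. The names `PileupRecord`, `split_bases`, and `parse_pileup_line` are hypothetical.

```python
import re
from typing import List, NamedTuple


class PileupRecord(NamedTuple):
    # Field names are illustrative; the column meanings are assumed, not confirmed.
    chrom: str
    pos: int
    ref: str
    depth: int
    bases: List[str]
    base_quals: List[int]
    map_quals: List[int]
    read_pos: List[int]


def split_bases(s: str) -> List[str]:
    """Return one symbol per read, dropping pileup annotations."""
    out = []
    i = 0
    while i < len(s):
        c = s[i]
        if c == "^":
            # '^' marks a read start and is followed by one mapping-quality character.
            i += 2
        elif c == "$":
            # '$' marks a read end.
            i += 1
        elif c in "+-":
            # Indel annotation: sign, length digits, then that many bases.
            m = re.match(r"\d+", s[i + 1:])
            n = int(m.group())
            i += 1 + len(m.group()) + n
        else:
            out.append(c)
            i += 1
    return out


def parse_pileup_line(line: str) -> PileupRecord:
    """Parse one whitespace-separated, 8-column pileup line."""
    chrom, pos, ref, depth, bases, bq, mq, rp = line.split()
    return PileupRecord(
        chrom=chrom,
        pos=int(pos),
        ref=ref,
        depth=int(depth),
        bases=split_bases(bases),
        base_quals=[ord(c) - 33 for c in bq],  # assumes Phred+33
        map_quals=[ord(c) - 33 for c in mq],   # assumes Phred+33
        read_pos=[int(x) for x in rp.split(",")],
    )


if __name__ == "__main__":
    rec = parse_pileup_line("1 10469 C 5 .g,$,. A?/;P >>>FF 78,66,100,38,12")
    print(rec.chrom, rec.pos, rec.ref, rec.depth, rec.bases)
```

In the read-bases column, `.` and `,` denote a match to the reference on the forward and reverse strand respectively, while a letter such as `g` denotes a mismatch. The sketch checks nothing beyond the column count, so a line with a different layout will raise an error rather than be silently misread.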