Changes

From Genome Analysis Wiki
Line 14: Line 14:  
== (2) Split GLF files by chromosome ==
 
== (2) Split GLF files by chromosome ==
   −
/home1/ylwtx/2009.08.GLF-split/  
+
/home1/ylwtx/2009.08.GLF-split/  
   −
update-glf.csh<br>splitGLF.csh  
+
update-glf.csh<br>splitGLF.csh  
    
Key command: glfSplit<br>Source code: ~goncalo/code/glfSplit/  
 
Key command: glfSplit<br>Source code: ~goncalo/code/glfSplit/  
Line 22: Line 22:  
Input GLF format: gz or bgzf<br>Output GLF format: gz  
 
Input GLF format: gz or bgzf<br>Output GLF format: gz  
   −
Tom suggested combining the first two steps using the following samtools command:<br>samtools view -u *.bam 22 | samtools pileup -g - &gt; *.glf  
+
Tom suggested combining the first two steps using the following samtools command:<br>samtools view -u *.bam 22 | samtools pileup -g - &gt; *.glf
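
The one-line pipeline above could be generated per sample and per chromosome with a small helper. This is only a sketch: the function name `glf_pipeline` and the file names `sample.bam` and `sample.chr22.glf` are hypothetical placeholders, not names used in the scripts on this page. Only the `samtools view -u` and `samtools pileup -g` calls come from the note itself. A region query such as `22` generally requires the BAM file to be indexed.

```python
import shlex


def glf_pipeline(bam_path, chrom, glf_path):
    """Build the shell pipeline that extracts one chromosome and writes a GLF file.

    All three arguments are illustrative; adapt them to the local file layout.
    """
    # Step 1: stream the requested chromosome as uncompressed BAM (-u).
    view = ["samtools", "view", "-u", bam_path, str(chrom)]
    # Step 2: read that stream from stdin ("-") and emit GLF (-g).
    pileup = ["samtools", "pileup", "-g", "-"]
    return "{} | {} > {}".format(
        " ".join(shlex.quote(arg) for arg in view),
        " ".join(shlex.quote(arg) for arg in pileup),
        shlex.quote(glf_path),
    )


if __name__ == "__main__":
    print(glf_pipeline("sample.bam", 22, "sample.chr22.glf"))
```

The helper only builds the command string; running it (for example through a job scheduler) is left to the caller.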
    
== (3) Build a list of individuals within each population ==
 
== (3) Build a list of individuals within each population ==
   −
/home/ylwtx/codes/cpp/mach-1.0.16/test_thunder/2009.11.all  
+
/home/ylwtx/codes/cpp/mach-1.0.16/test_thunder/2009.11.all  
    
STEP 0 in s1-5.csh  
 
STEP 0 in s1-5.csh  
   −
Note: Check to make sure that all the individuals with GLF are included in the list “NA.number.by.popn”  
+
Note: Check to make sure that all the individuals with GLF are included in the list “NA.number.by.popn”
    
== (4) Link files and tabulate # of files per population, per platform ==
 
== (4) Link files and tabulate # of files per population, per platform ==
Line 100: Line 100:  
Notes: <br> Sites with more than two alleles (not including REF_ALLELE) will be discarded  
 
Notes: <br> Sites with more than two alleles (not including REF_ALLELE) will be discarded  
   −
== (10) Run thunder (hidden Markov model) ==
+
To-Do: <br>  ** Include genotypes only for individuals that have sequence data
 +
 
 +
== (10) Run thunder (hidden Markov model) ==
    
/home/ylwtx/codes/cpp/mach-1.0.16/test_thunder/2009.09.all/  
 
/home/ylwtx/codes/cpp/mach-1.0.16/test_thunder/2009.09.all/  
Line 110: Line 112:  
Notes:<br>(1) Cleaned monomorphic sites before feeding to thunder (no need, b/c thunder 005 handles AL1/-)<br>(2) All sites are bi-allelic with one of the alleles being the reference allele (sites with more than 2 alleles including the reference allele are discarded at the beginning of thunder run: initially because of a prior dependent on the reference allele. In the current setting, where Freq1 is used for the prior, we can choose to ignore the reference allele information.) <br>a. Codes changed on 2009-11-02<br>(3) Split:  
 
Notes:<br>(1) Cleaned monomorphic sites before feeding to thunder (no need, b/c thunder 005 handles AL1/-)<br>(2) All sites are bi-allelic with one of the alleles being the reference allele (sites with more than 2 alleles including the reference allele are discarded at the beginning of thunder run: initially because of a prior dependent on the reference allele. In the current setting, where Freq1 is used for the prior, we can choose to ignore the reference allele information.) <br>a. Codes changed on 2009-11-02<br>(3) Split:  
   −
Total 150 jobs (50 jobs for each population)  
+
<br>
 +
 
 +
{| cellspacing="1" cellpadding="1" border="1" style="width: 634px; height: 251px;"
 +
|-
 +
| chromosome
 +
| #parts<br>
 +
| length per part in Mb (last segment)<br>
 +
| start<br>
 +
| end<br>
 +
|-
 +
| 1-2<br>
 +
| 4<br>
 +
| 70 (63-67)<br>
 +
| 0,60,120,180<br>
 +
| 70,130,190,243-247<br>
 +
|-
 +
| 3-4<br>
 +
| 4<br>
 +
| 60 (41-49)<br>
 +
| 0,50,100,150<br>
 +
| 60,110,160,191-200<br>
 +
|-
 +
| <br>
 +
| <br>
 +
| <br>
 +
| <br>
 +
| <br>
 +
|-
 +
| 5-6<br>
 +
| 3<br>
 +
| 70 (51-61)<br>
 +
| 0,60,120<br>
 +
| 70,130,171-181<br>
 +
|-
 +
| 7-8<br>
 +
| 3<br>
 +
| 60 (40-59)<br>
 +
| 0,50,100<br>
 +
| 60,110,140-159<br>
 +
|-
 +
| 9*<br>
 +
| 3<br>
 +
| 75, 45, 40<br>
 +
| 0,'''65''',100<br>
 +
| '''75''',110,140<br>
 +
|-
 +
| <br>
 +
| <br>
 +
| <br>
 +
| <br>
 +
| <br>
 +
|-
 +
| 10-12<br>
 +
| 2<br>
 +
| 70 (72-75)<br>
 +
| 0,60<br>
 +
| 70,132-135<br>
 +
|-
 +
| 13-15<br>
 +
| 2<br>
 +
| 60 (50-64)<br>
 +
| 0,50<br>
 +
| 60,100-114<br>
 +
|-
 +
| <br>
 +
| <br>
 +
| <br>
 +
| <br>
 +
| <br>
 +
|-
 +
| 16-22<br>
 +
| 1<br>
 +
| 47-89<br>
 +
| <br>
 +
| <br>
 +
|}
 +
 
 +
&nbsp;*Chromosome 9 has a gap between 47 and 65 Mb
 +
 
 +
 
 +
 
 +
Total 150 jobs (50 jobs for each population)
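
The split table and the job total can be checked with a short sketch. Two things here are assumptions read off the table rather than stated on this page: consecutive parts appear to overlap by 10 Mb (for example, starts 0, 60, 120 against ends 70, 130, 190), and the last part simply runs to the end of the chromosome. The function name `split_windows` and the dictionary `PARTS_BY_CHROM` are hypothetical. Chromosome 9 is not covered by the regular rule because its windows are placed by hand around the gap.

```python
# Number of parts per chromosome, transcribed from the split table.
PARTS_BY_CHROM = {}
for chroms, n_parts in [
    (range(1, 5), 4),    # chromosomes 1-4
    (range(5, 10), 3),   # chromosomes 5-9
    (range(10, 16), 2),  # chromosomes 10-15
    (range(16, 23), 1),  # chromosomes 16-22
]:
    for chrom in chroms:
        PARTS_BY_CHROM[chrom] = n_parts


def split_windows(chrom_len_mb, n_parts, part_len_mb, overlap_mb=10):
    """Return (start, end) windows in Mb for a regularly split chromosome.

    overlap_mb=10 is inferred from the table, not documented on this page.
    """
    step = part_len_mb - overlap_mb
    windows = []
    for i in range(n_parts):
        start = i * step
        # The last part absorbs whatever remains of the chromosome.
        end = chrom_len_mb if i == n_parts - 1 else start + part_len_mb
        windows.append((start, end))
    return windows


if __name__ == "__main__":
    # A 247 Mb chromosome split into 4 parts of 70 Mb (first table row).
    print(split_windows(247, 4, 70))
    jobs_per_population = sum(PARTS_BY_CHROM.values())
    print(jobs_per_population, 3 * jobs_per_population)
```

Summing the parts over chromosomes 1-22 gives 50, which matches the 50 jobs per population and, with three populations, the 150 jobs in total.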
    
== (11) Ligate thunder results for larger chromosomes ==
 
== (11) Ligate thunder results for larger chromosomes ==