Changes

344 bytes added, 17:38, 15 January 2012
Line 82: Line 82:     
*First, find the first and last SNP in the region you are interested in, say "rsFIRST" and "rsLAST", ordered by position.
*Then:
+
*Then, under csh:
 
   
   @ first = `grep -nw rsFIRST orig.snps | cut -f1 -d ':'`
 
 
  @ last = `grep -nw rsLAST orig.snps | cut -f1 -d ':'`
 
 +
under bash:
 +
  first=`grep -nw rsFIRST orig.snps | cut -f1 -d ':'`
 +
  last=`grep -nw rsLAST orig.snps | cut -f1 -d ':'`
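
As a rough illustration of what the two assignments above compute, the following bash sketch runs the same grep/cut pipeline on a toy SNP list. The rs IDs and the temporary file are made up for the example. It assumes orig.snps lists one SNP per line in position order, so the line number returned by grep -n is the SNP's index.

```shell
# Toy SNP list (hypothetical rs IDs), one per line, in position order.
tmpdir=$(mktemp -d)
printf 'rs100\nrs200\nrs300\nrs400\nrs500\n' > "$tmpdir/orig.snps"

# grep -n prefixes each matching line with "LINENO:";
# -w matches whole words only, so rs200 would not match a line such as rs2000.
# cut keeps the part before the first ':' (the line number).
first=$(grep -nw rs200 "$tmpdir/orig.snps" | cut -f1 -d ':')
last=$(grep -nw rs400 "$tmpdir/orig.snps" | cut -f1 -d ':')

echo "first=$first last=$last"
```

If either variable comes back empty, the SNP name was not found in the list and the later cut step would not select the intended region.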
    
*Then find out which field contains the actual haplotypes. Fields are separated by whitespace and the haplotypes are in the last one, so count the fields:
 
   head -1 orig.hap | wc -w
 
 +
Note: if the haplotypes are gz compressed, do:
 +
  zcat orig.hap.gz | head -1 | wc -w
    
* Finally, extract the region. Say you got 3 from the above wc -w command; if you got a different number, replace the 3 in bold below with the number you got:
    
   awk '{print $'''3'''}' orig.hap | cut -c${first}-${last} > region.hap
 
 +
 +
Note: if the haplotypes are gz compressed, do:
 +
  zcat orig.hap.gz | awk '{print $'''3'''}' | cut -c${first}-${last} > region.hap
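
The steps above can be put together roughly as follows. This is only a sketch under stated assumptions: the input files are toy data written to a temporary directory, the rs IDs and sample labels are invented, and each haplotype line is assumed to have three whitespace-separated fields with the alleles as one contiguous string in the last field. It uses awk's NF in place of wc -w to get the same field count, and adds a check that both SNPs were found.

```shell
# Toy inputs (hypothetical): 5 SNPs and 2 haplotypes of 5 alleles each.
tmpdir=$(mktemp -d)
snps="$tmpdir/orig.snps"
hap="$tmpdir/orig.hap"
out="$tmpdir/region.hap"
printf 'rs100\nrs200\nrs300\nrs400\nrs500\n' > "$snps"
printf 'FAM1->IND1 HAPLO1 ACGTA\nFAM1->IND1 HAPLO2 TGCAT\n' > "$hap"

# Line numbers of the first and last SNP of the region.
first=$(grep -nw rs200 "$snps" | cut -f1 -d ':')
last=$(grep -nw rs400 "$snps" | cut -f1 -d ':')

# Number of whitespace-separated fields on the first line
# (same count as "head -1 | wc -w"); the haplotype is the last field.
nf=$(awk '{print NF; exit}' "$hap")

# Only extract if both SNPs were found and are in the right order.
if [ -n "$first" ] && [ -n "$last" ] && [ "$first" -le "$last" ]; then
    # Print the haplotype field, then keep characters first..last of it.
    awk -v f="$nf" '{print $f}' "$hap" | cut -c"${first}-${last}" > "$out"
    cat "$out"
else
    echo "rsFIRST/rsLAST not found or in the wrong order" >&2
fi
```

Passing the field number with awk -v avoids editing the awk program by hand when the count is not 3.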
    
The created reference files are in MaCH format. You do NOT need to turn on the --hapmapFormat option.