From Genome Analysis Wiki
344 bytes added, 17:38, 15 January 2012
Line 82: |
Line 82: |
| | | |
| *First, find the first and last SNP in the region you are interested in. Say "rsFIRST" and "rsLAST", defined according to position. | | *First, find the first and last SNP in the region you are interested in. Say "rsFIRST" and "rsLAST", defined according to position. |
− | *Then: | + | *Then, under csh: |
− | | |
| @ first = `grep -nw rsFIRST orig.snps | cut -f1 -d ':'` | | @ first = `grep -nw rsFIRST orig.snps | cut -f1 -d ':'` |
| @ last = `grep -nw rsLAST orig.snps | cut -f1 -d ':'` | | @ last = `grep -nw rsLAST orig.snps | cut -f1 -d ':'` |
| + | under bash: |
| + | first=`grep -nw rsFIRST orig.snps | cut -f1 -d ':'` |
| + | last=`grep -nw rsLAST orig.snps | cut -f1 -d ':'` |
| | | |
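| *Then find out which field contains the actual haplotypes. Fields are separated by whitespace, so the word count of the first line gives the number of the last field, which holds the haplotype string: | | *Then find out which field contains the actual haplotypes. Fields are separated by whitespace, so the word count of the first line gives the number of the last field, which holds the haplotype string: |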
| head -1 orig.hap | wc -w | | head -1 orig.hap | wc -w |
| + | Note: if the haplotypes are gz compressed, do: |
| + | zcat orig.hap.gz | head -1 | wc -w |
| | | |
| * Finally, extract the region. The command below assumes that the wc -w command above returned 3; if you got a different number, replace the 3 in bold with the number you got: | | * Finally, extract the region. The command below assumes that the wc -w command above returned 3; if you got a different number, replace the 3 in bold with the number you got: |
| | | |
| awk '{print $'''3'''}' orig.hap | cut -c${first}-${last} > region.hap | | awk '{print $'''3'''}' orig.hap | cut -c${first}-${last} > region.hap |
| + | |
| + | Note: if the haplotypes are gz compressed, do: |
| + | zcat orig.hap.gz | awk '{print $'''3'''}' | cut -c${first}-${last} > region.hap |
| | | |
| The created reference files are in MaCH format. You do NOT need to turn on the --hapmapFormat option. | | The created reference files are in MaCH format. You do NOT need to turn on the --hapmapFormat option. |
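The bash steps above can be strung together into one script. The following is only a minimal sketch on toy data: the SNP names rs1 to rs5, the two haplotype lines, and the variable name hapfield are made up for illustration and are not part of the original instructions. It assumes one SNP name per line in orig.snps and one haplotype per line in orig.hap, with the allele string in the last whitespace-separated field.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy inputs (hypothetical data, for illustration only).
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' rs1 rs2 rs3 rs4 rs5 > orig.snps
printf '%s\n' 'ID1->ID1 HAPLO1 ACGTA' 'ID1->ID1 HAPLO2 CTGAT' > orig.hap

# Line numbers of the first and last SNP of the region in orig.snps.
# Here rs2 and rs4 stand in for rsFIRST and rsLAST.
first=$(grep -nw rs2 orig.snps | cut -f1 -d ':')
last=$(grep -nw rs4 orig.snps | cut -f1 -d ':')

# Number of whitespace-separated fields on the first line;
# the haplotype string is assumed to be the last of them.
hapfield=$(head -1 orig.hap | wc -w)

# Print only the haplotype field, then keep the characters
# (one per SNP) that fall inside the region.
awk -v f="$hapfield" '{print $f}' orig.hap | cut -c"${first}-${last}" > region.hap

cat region.hap
```

Passing the field number to awk with -v avoids editing the number inside the awk program by hand. For gz-compressed haplotypes, the same pipeline should work with zcat orig.hap.gz feeding head and awk, as in the notes above.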