Regions of high linkage disequilibrium (LD)
From Genome Analysis Wiki
Revision as of 23:14, 18 January 2012
The human genome contains regions of high linkage disequilibrium (LD). These regions should be excluded when performing certain analyses, such as principal component analysis, on genotype data.
Chr | Start | Stop | ID |
---|---|---|---|
1 | 48060567 | 52060567 | hild1 |
2 | 85941853 | 100407914 | hild2 |
2 | 134382738 | 137882738 | hild3 |
2 | 182882739 | 189882739 | hild4 |
3 | 47500000 | 50000000 | hild5 |
3 | 83500000 | 87000000 | hild6 |
3 | 89000000 | 97500000 | hild7 |
5 | 44500000 | 50500000 | hild8 |
5 | 98000000 | 100500000 | hild9 |
5 | 129000000 | 132000000 | hild10 |
5 | 135500000 | 138500000 | hild11 |
6 | 25500000 | 33500000 | hild12 |
6 | 57000000 | 64000000 | hild13 |
6 | 140000000 | 142500000 | hild14 |
7 | 55193285 | 66193285 | hild15 |
8 | 8000000 | 12000000 | hild16 |
8 | 43000000 | 50000000 | hild17 |
8 | 112000000 | 115000000 | hild18 |
10 | 37000000 | 43000000 | hild19 |
11 | 46000000 | 57000000 | hild20 |
11 | 87500000 | 90500000 | hild21 |
12 | 33000000 | 40000000 | hild22 |
12 | 109521663 | 112021663 | hild23 |
20 | 32000000 | 34500000 | hild24 |
23 | 14150264 | 16650264 | hild25 |
23 | 25650264 | 28650264 | hild26 |
23 | 33150264 | 35650264 | hild27 |
23 | 55133704 | 60500000 | hild28 |
23 | 65133704 | 67633704 | hild29 |
23 | 71633704 | 77580511 | hild30 |
23 | 80080511 | 86080511 | hild31 |
23 | 100580511 | 103080511 | hild32 |
23 | 125602146 | 128102146 | hild33 |
23 | 129102146 | 131602146 | hild34 |
Excluding Regions With PLINK
Assuming the table above is stored in a file named "high-ld.txt" (one region per line with the columns chromosome, start, stop and ID, and no header row), you can remove these regions from a PED file using the following PLINK commands:
plink --file mydata --make-set high-ld.txt --write-set --out hild
plink --file mydata --exclude hild.set --recode --out mydatatrimmed