Regions of high linkage disequilibrium (LD)
From Genome Analysis Wiki
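There are regions of high linkage disequilibrium (LD) in the human genome. These regions should be excluded when performing certain analyses, such as principal component analysis (PCA), on genotype data; otherwise the leading components can reflect local LD structure rather than genome-wide ancestry. In the table below, chromosome 23 denotes the X chromosome in PLINK's numeric coding.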
Chr | Start | Stop | ID |
---|---|---|---|
1 | 48060567 | 52060567 | hild1 |
2 | 85941853 | 100407914 | hild2 |
2 | 134382738 | 137882738 | hild3 |
2 | 182882739 | 189882739 | hild4 |
3 | 47500000 | 50000000 | hild5 |
3 | 83500000 | 87000000 | hild6 |
3 | 89000000 | 97500000 | hild7 |
5 | 44500000 | 50500000 | hild8 |
5 | 98000000 | 100500000 | hild9 |
5 | 129000000 | 132000000 | hild10 |
5 | 135500000 | 138500000 | hild11 |
6 | 25500000 | 33500000 | hild12 |
6 | 57000000 | 64000000 | hild13 |
6 | 140000000 | 142500000 | hild14 |
7 | 55193285 | 66193285 | hild15 |
8 | 8000000 | 12000000 | hild16 |
8 | 43000000 | 50000000 | hild17 |
8 | 112000000 | 115000000 | hild18 |
10 | 37000000 | 43000000 | hild19 |
11 | 46000000 | 57000000 | hild20 |
11 | 87500000 | 90500000 | hild21 |
12 | 33000000 | 40000000 | hild22 |
12 | 109521663 | 112021663 | hild23 |
20 | 32000000 | 34500000 | hild24 |
23 | 14150264 | 16650264 | hild25 |
23 | 25650264 | 28650264 | hild26 |
23 | 33150264 | 35650264 | hild27 |
23 | 55133704 | 60500000 | hild28 |
23 | 65133704 | 67633704 | hild29 |
23 | 71633704 | 77580511 | hild30 |
23 | 80080511 | 86080511 | hild31 |
23 | 100580511 | 103080511 | hild32 |
23 | 125602146 | 128102146 | hild33 |
23 | 129102146 | 131602146 | hild34 |
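
PLINK reads range files as whitespace-separated columns, so the pipe-delimited table above has to be converted before it can be used. The following is a minimal Python sketch of that conversion, assuming the table has been copied into a string; the function name `table_to_ranges` and the two-row sample are illustrative only and are not part of PLINK.

```python
def table_to_ranges(table_text):
    """Turn 'Chr | Start | Stop | ID |' rows into 'chr start stop id' lines."""
    lines = []
    for row in table_text.splitlines():
        # Split on pipes and drop the empty cell left by the trailing pipe.
        cells = [c.strip() for c in row.split("|") if c.strip()]
        # Skip the header row, the '---' separator row, and malformed rows.
        if len(cells) != 4 or not cells[1].isdigit() or not cells[2].isdigit():
            continue
        chrom, start, stop, name = cells
        lines.append(f"{chrom} {start} {stop} {name}")
    return lines


# Two rows copied from the table above, used here only as sample input.
sample = """Chr | Start | Stop | ID |
---|---|---|---|
1 | 48060567 | 52060567 | hild1 |
6 | 25500000 | 33500000 | hild12 |
"""

for line in table_to_ranges(sample):
    print(line)
```

Writing the returned lines to a file, one per line, gives a range file in the four-column layout (chromosome, start, stop, name) described on this page.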
Excluding Regions With PLINK
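You can remove these regions from a PED file using the following PLINK commands, assuming the table above is stored in a file named "high-ld.txt" with one region per line and whitespace-separated columns (chromosome, start, stop, ID). The first command writes the variants that fall inside the regions to hild.set; the second excludes those variants and writes the trimmed dataset.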
plink --file mydata --make-set high-ld.txt --write-set --out hild
plink --file mydata --exclude hild.set --recode --out mydatatrimmed
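
As a sanity check after trimming, one could confirm that no variant in the resulting MAP file still lies inside a listed region. The sketch below is one hedged way to do that in Python. It assumes the standard four-column MAP layout (chromosome, variant ID, genetic distance, base-pair position), treats region boundaries as inclusive, and requires chromosome codes in the MAP file to match the table literally (for example "23" rather than "X"). The function names and the variant IDs in the sample are hypothetical.

```python
def load_ranges(lines):
    """Parse 'chr start stop id' lines into {chr: [(start, stop), ...]}."""
    ranges = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        ranges.setdefault(fields[0], []).append((int(fields[1]), int(fields[2])))
    return ranges


def in_high_ld(chrom, pos, ranges):
    """True if pos lies in any region on chrom (boundaries assumed inclusive)."""
    return any(start <= pos <= stop for start, stop in ranges.get(chrom, []))


def variants_in_regions(map_lines, ranges):
    """Return IDs of MAP-file variants that fall inside a listed region."""
    hits = []
    for line in map_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # not a four-column MAP line
        chrom, variant_id, pos = fields[0], fields[1], int(fields[3])
        if in_high_ld(chrom, pos, ranges):
            hits.append(variant_id)
    return hits


# One region from the table and three made-up MAP lines for illustration.
ranges = load_ranges(["6 25500000 33500000 hild12"])
map_lines = [
    "6 variantA 0 30000000",
    "6 variantB 0 40000000",
    "1 variantC 0 30000000",
]
print(variants_in_regions(map_lines, ranges))
```

Run against the lines of mydatatrimmed.map and the full contents of high-ld.txt, an empty result would indicate that the exclusion removed every variant in the listed regions.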