RAREFY TUTORIAL

From Genome Analysis Wiki

EXAMPLE DATA SET

  • Please download the EXAMPLE file to your directory.
  • In this example, 20,000 individuals in 4,000 nuclear families were simulated with 40% trait heritability.
  • A trait-increasing variant with frequency 0.001 and an effect size of 2 SD was simulated in the sample.

COMMAND

  • Use the following command to run RAREFY on the sample for both trait-increasing and trait-decreasing variants. Although this sample contains only a trait-increasing variant, in real data we do not know in advance whether a sample contains trait-increasing variants, trait-decreasing variants, or both.
rarefy --ped sim1.ped --dat sim1.dat --inverseNormal --traitIncreasing --traitDecreasing --prefix sim1

OUTPUT

  • Trait-increasing and trait-decreasing RAREFY scores are saved in "sim1_simul_rarefy.trait.Increasing.result" and "sim1_simul_rarefy.trait.Decreasing.result", respectively.
  • Use the following command to sort the results by the individual RAREFY score (column 7) in descending order and select the most informative individuals:
sort -k7 -gr sim1_simul_rarefy.trait.Increasing.result | head -10
  • The top 10 selected individuals are:
famid  pid  fatid  motid  sex  residual  rarefy_idv_score  rarefy_fam_score
1994   5    1      2      1    4.05563   0.839037          2.73226
1994   2    0      0      2    3.66226   0.829384          2.73226
345    1    0      0      1    3.45514   0.785963          1.95526
345    4    1      2      2    3.20513   0.782924          1.95526
3503   4    1      2      2    3.33598   0.621358          1.85789
3503   1    0      0      1    2.15643   0.619741          1.85789
1994   3    1      2      1    2.25633   0.607197          2.73226
3503   5    1      2      1    2.51068   0.601241          1.85789
1994   4    1      2      2    1.85392   0.437127          2.73226
3860   2    0      0      2    2.45598   0.405228          1.5474
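
  • The ranking step can also be wrapped in a small shell function that keeps the column names at the top of the listing. This is only a sketch: the function name top_rarefy is hypothetical (it is not part of RAREFY), and it assumes the result file is whitespace-delimited and begins with exactly one header line, which has not been verified against the actual output format.

```shell
# Hypothetical helper, not part of RAREFY.
# Assumption: the file is whitespace-delimited and its first line is a header.
# Usage: top_rarefy FILE [COLUMN] [N]
top_rarefy() {
    file=$1
    col=${2:-7}    # column to rank by (default: 7)
    n=${3:-10}     # number of rows to keep (default: 10)

    # Print the header line unchanged.
    head -n 1 "$file"

    # Sort the remaining rows by the chosen column only
    # (general-numeric, descending) and keep the first n rows.
    tail -n +2 "$file" | sort -k"${col},${col}" -gr | head -n "$n"
}
```

    For example, "top_rarefy sim1_simul_rarefy.trait.Increasing.result 7 10" would rank individuals by rarefy_idv_score, and passing 8 as the column would rank by rarefy_fam_score instead. Restricting the sort key to a single column (-k7,7) avoids later columns influencing the comparison.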