Changes

From Genome Analysis Wiki
4,021 bytes added ,  09:52, 21 October 2010
haploxt is a C/C++ software developed by [https://www.sph.umich.edu/csg/yli/ Yun Li] and [https://www.sph.umich.edu/csg/abecasis/ Goncalo Abecasis]. It calculates LD (D' and r<sup>2</sup>) from phased haplotypes.

= Options =

== --impped --impdat ==

Specify the pedigree and data files of one input pedigree set (e.g., imp.ped and imp.dat).

== --trueped --truedat ==

Specify the pedigree and data files of the other input pedigree set (e.g., true.ped and true.dat).

== --match ==

Generates a matrix with values 0, 1, or 2 indicating the number of matched alleles. The dimension of the matrix is (number of overlapping individuals) x (number of overlapping markers) of the two input pedigree sets.
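
As an illustration of what a 0/1/2 entry means, here is a minimal Python sketch of counting matched alleles between two unordered genotypes. It is not CalcMatch's own code: the function name <code>matched_alleles</code>, the <code>A/G</code> string encoding, and the set of missing-allele codes are assumptions made for the example.

```python
from collections import Counter

# Assumed missing-allele codes (hypothetical; chosen to mirror 0/0, N/N, ./.)
MISSING_ALLELES = {"0", "N", "."}


def matched_alleles(true_gt, imp_gt):
    """Return 0, 1, or 2 matched alleles, or None if either genotype is missing.

    Genotypes are unordered, so A/G and G/A count as a full match.
    """
    true_alleles = true_gt.split("/")
    imp_alleles = imp_gt.split("/")
    if any(a in MISSING_ALLELES for a in true_alleles + imp_alleles):
        return None
    # Multiset intersection counts each shared allele at most once per copy.
    overlap = Counter(true_alleles) & Counter(imp_alleles)
    return sum(overlap.values())


def match_matrix(true_rows, imp_rows):
    """Build an individuals x markers matrix of matched-allele counts."""
    return [
        [matched_alleles(t, i) for t, i in zip(true_row, imp_row)]
        for true_row, imp_row in zip(true_rows, imp_rows)
    ]
```

For example, <code>matched_alleles("A/A", "A/G")</code> gives 1, because only one of the two alleles agrees.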

== --bySNP ==

Turned on by default to generate SNP-specific measures. The output file .bySNP contains the following 6 fields for each SNP:

(1) SNP: SNP name
(2) gErr: genotypic discordance rate
(3) aErr: allelic discordance rate
(4) matchedG: number of genotypes matched
(5) matchedA: number of alleles matched
(6) maskedG: total number of genotypes evaluated/masked (&lt;= n, the number of individuals compared). A clearer name for this field would be comparedG or evaluatedG.
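
The relationship between these fields can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes gErr = 1 - matchedG/maskedG and aErr = 1 - matchedA/(2 x maskedG), which is the natural reading of the field names but is not confirmed by this page. The function name <code>snp_summary</code> and the input format (one 0/1/2 matched-allele count per evaluated individual, <code>None</code> when not evaluated) are hypothetical.

```python
def snp_summary(match_counts):
    """Summarise one SNP from per-individual matched-allele counts (0, 1, 2).

    Entries that are None are treated as not evaluated and are skipped.
    Assumed definitions (not confirmed by the CalcMatch documentation):
      gErr = 1 - matchedG / maskedG
      aErr = 1 - matchedA / (2 * maskedG)
    """
    evaluated = [m for m in match_counts if m is not None]
    masked_g = len(evaluated)                          # genotypes evaluated
    matched_g = sum(1 for m in evaluated if m == 2)    # both alleles agree
    matched_a = sum(evaluated)                         # total alleles agreeing
    if masked_g == 0:
        g_err = a_err = None                           # nothing to compare
    else:
        g_err = 1 - matched_g / masked_g
        a_err = 1 - matched_a / (2 * masked_g)
    return {
        "gErr": g_err,
        "aErr": a_err,
        "matchedG": matched_g,
        "matchedA": matched_a,
        "maskedG": masked_g,
    }
```

With counts <code>[2, 1, 0, None, 2]</code>, four genotypes are evaluated, two match fully, and five of eight alleles agree.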

<br>

== --byGeno ==

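Can be added on top of --bySNP. It generates the following fields after the 6 fields above:

(7) hetAerr: allelic discordance rate among heterozygotes
(8) AL1: allele 1 (an arbitrary allele)
(9) AL2: allele 2
(10) freq1: frequency of AL1
(11) MAF: minor allele frequency
(12) #true 1/1: # of individuals with experimental genotype AL1/AL1
(13) mm1/2: # of true AL1/AL1 being imputed as AL1/AL2
(14) mm2/2: # of true AL1/AL1 being imputed as AL2/AL2
(15) #true 1/2: # of individuals with experimental genotype AL1/AL2
(16) mm1/1: # of true AL1/AL2 being imputed as AL1/AL1
(17) mm2/2: # of true AL1/AL2 being imputed as AL2/AL2
(18) #true 2/2: # of individuals with experimental genotype AL2/AL2
(19) mm1/1: # of true AL2/AL2 being imputed as AL1/AL1
(20) mm1/2: # of true AL2/AL2 being imputed as AL1/AL2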
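
Fields (12) to (20) amount to a 3 x 3 table of true versus imputed genotypes with the diagonal left out. A minimal Python sketch of that tabulation is shown here; the function name <code>by_geno_counts</code>, the <code>1/2</code> genotype encoding, and the output key names are hypothetical and chosen only for the example.

```python
from collections import Counter

GENOTYPES = ("1/1", "1/2", "2/2")


def normalise(genotype):
    """Sort alleles so that 2/1 and 1/2 are treated as the same genotype."""
    return "/".join(sorted(genotype.split("/")))


def by_geno_counts(pairs):
    """Tabulate (true, imputed) genotype pairs for one SNP.

    Returns the number of individuals per true genotype and, for each true
    genotype, the number imputed as each of the other two genotypes.
    """
    counts = Counter((normalise(t), normalise(i)) for t, i in pairs)
    result = {}
    for true_gt in GENOTYPES:
        result[f"#true {true_gt}"] = sum(
            n for (t, _), n in counts.items() if t == true_gt
        )
        for imp_gt in GENOTYPES:
            if imp_gt != true_gt:
                # Counter returns 0 for pairs that never occurred.
                result[f"true {true_gt} mm{imp_gt}"] = counts[(true_gt, imp_gt)]
    return result
```

The three <code>#true</code> counts plus the six mismatch counts correspond to the nine fields (12) to (20).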

<br>

<br>

== --accuracyByGeno ==

Similar to --byGeno, it is used on top of --bySNP. It may be used together with --byGeno. It generates the following fields, after fields (7-20) if --byGeno is turned on, or after the 6th field otherwise.

(A) almajor: major allele
(B) alminor: minor allele
(C) freq1: major allele frequency
(D) accuracy11: allelic concordance rate for major-allele homozygotes
(E) accuracy12: allelic concordance rate for heterozygotes
(F) accuracy22: allelic concordance rate for minor-allele homozygotes
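
A minimal Python sketch of a per-genotype-class allelic concordance rate follows. It assumes the rate is the number of matched alleles divided by twice the number of individuals in that true-genotype class; this definition, the function name <code>accuracy_by_geno</code>, and the class labels <code>"11"</code>, <code>"12"</code>, <code>"22"</code> are assumptions for illustration, not taken from CalcMatch itself.

```python
def accuracy_by_geno(pairs):
    """Allelic concordance rate per true-genotype class for one SNP.

    pairs: iterable of (true_class, matched) where true_class is "11"
    (major homozygote), "12" (heterozygote) or "22" (minor homozygote),
    and matched is the number of matched alleles (0, 1, or 2).
    Assumed definition: matched alleles / (2 * individuals in the class).
    """
    total_alleles = {"11": 0, "12": 0, "22": 0}
    matched_alleles = {"11": 0, "12": 0, "22": 0}
    for true_class, matched in pairs:
        total_alleles[true_class] += 2
        matched_alleles[true_class] += matched
    return {
        cls: (matched_alleles[cls] / total_alleles[cls]
              if total_alleles[cls] else None)   # None when class is empty
        for cls in total_alleles
    }
```

Splitting the rate by class is informative because rare classes (typically minor-allele homozygotes) can have poor concordance that is hidden in the overall aErr.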

<br>

== --byPerson ==

Generates a separate output file, .byPerson, which contains the following information for each person:

(1) famid
(2) subjID
(3) gErr
(4) aErr
(5) matchedG
(6) matchedA
(7) maskedG

The --byPerson option is useful when there may be sample swaps or inter-individual differences, e.g., in sequencing depth or in the number of markers genotyped.
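
One way to use the per-person output for spotting possible sample swaps is to flag individuals with an unusually high gErr. The Python sketch below assumes the .byPerson file is whitespace-delimited with the 7 fields in the order listed above, possibly preceded by a header line; the file layout, the function name <code>flag_persons</code>, and the 0.1 threshold are all assumptions, not documented behaviour.

```python
def flag_persons(lines, max_g_err=0.1):
    """Return (famid, subjID, gErr) for persons whose gErr exceeds max_g_err.

    lines: iterable of text lines, assumed whitespace-delimited with fields
    famid, subjID, gErr, aErr, matchedG, matchedA, maskedG (in that order).
    The default threshold is arbitrary and should be tuned per data set.
    """
    flagged = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue                      # skip blank or malformed lines
        try:
            g_err = float(fields[2])
        except ValueError:
            continue                      # skip a header line, if present
        if g_err > max_g_err:
            flagged.append((fields[0], fields[1], g_err))
    return flagged
```

A call such as <code>flag_persons(open("CalcMatch.Output.byPerson"))</code> would then list the individuals worth inspecting more closely.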

<br>

== --maskflag --maskped --maskdat ==

By default, CalcMatch compares all genotypes that overlap between the two input sets. When --maskflag is turned on and both --maskped and --maskdat are specified, it compares only the following subset of the overlapping genotypes: genotypes that are either not found in --maskped / --maskdat (i.e., the individual or marker is not included) or missing there (included but with value 0/0, N/N, ./. etc.). These options are useful when some individuals were masked for some SNPs while others were masked for a different set of SNPs.
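
The selection rule can be sketched in Python as follows. This is an illustration of the rule as described, not CalcMatch's implementation: the function name <code>is_compared</code>, the nested-dictionary representation of the mask set, and the exact set of missing codes (the text above ends its list with "etc.") are assumptions.

```python
# Assumed missing-genotype codes; the documented list is open-ended.
MISSING_GENOTYPES = {"0/0", "N/N", "./."}


def is_compared(person, marker, mask):
    """Decide whether a genotype is compared when masking is in effect.

    mask: dict mapping person ID -> dict mapping marker name -> genotype,
    a hypothetical in-memory form of the --maskped / --maskdat set.
    A genotype is compared if the person or marker is absent from the mask
    set, or present there with a missing value.
    """
    if person not in mask:
        return True                       # individual not included
    genotype = mask[person].get(marker)
    if genotype is None:
        return True                       # marker not included
    return genotype in MISSING_GENOTYPES  # included but missing
```

Because the decision is made per person and per marker, different individuals can have different sets of SNPs evaluated.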

= Example command lines =

CalcMatch --trueped true.ped --truedat true.dat --impped imp.ped --impdat imp.dat -o CalcMatch.Output --byPerson

Will generate CalcMatch.Output.bySNP (6 fields only) and CalcMatch.Output.byPerson.

CalcMatch --trueped true.ped --truedat true.dat --impped imp.ped --impdat imp.dat -o CalcMatch.Output --byGeno --byPerson

Will generate CalcMatch.Output.bySNP (6+14 = 20 fields) and CalcMatch.Output.byPerson.

CalcMatch --trueped true.ped --truedat true.dat --impped imp.ped --impdat imp.dat -o CalcMatch.Output --accuracyByGeno --byPerson

Will generate CalcMatch.Output.bySNP (6+6 fields only) and CalcMatch.Output.byPerson.

CalcMatch --trueped true.ped --truedat true.dat --impped imp.ped --impdat imp.dat -o CalcMatch.Output --accuracyByGeno --byGeno --byPerson

Will generate CalcMatch.Output.bySNP (6+14+6 = 26 fields) and CalcMatch.Output.byPerson.