Changes

BamGenotypeCheck

Revision as of 22:57, 23 November 2009

2 bytes added , 22:57, 23 November 2009

no edit summary

Line 8: Line 8:

lanecheck --referencegenome NCBI36.fa --dbSNPfile dbSNP.txt

−

--lanefile lane.lst --pedfile test.ped --datfile test.dat --mapfile test.map --prefix result

+

--lanefile lane.lst --pedfile test.ped --datfile test.dat --mapfile test.map --prefix result

== Command Line Options ==

Line 35: Line 35:

=== Other Options ===

−

--memorymap ''use memory map technique for efficient memory sharing of reference genome file

+

--memorymap ''use memory map technique for efficient memory sharing of reference genome file''

−

''

+

−

== Principle of operation: ==

+

== Principle of Operation: ==

The genotype identity checking program compares internal evidence from the sequence reads themselves with reference genotype information for a panel of candidate individuals. In the case of 1000 Genomes pilot data, these are HapMap genotypes from the same Coriell cell lines that are being sequenced. For each combination of [sequencing run x candidate individual], the program calculates the observed rate of mismatches at both "informative" and "background" locations and reports the result as the "excess mismatch rate".
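The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the page does not give the formula, so the sketch assumes the "excess mismatch rate" is the mismatch rate at informative locations minus the rate at background locations, and that the best-matching candidate is the one with the lowest excess. The names <code>MismatchCounts</code>, <code>excess_mismatch_rate</code> and <code>best_candidate</code> are hypothetical and are not part of lanecheck.

```python
from dataclasses import dataclass


@dataclass
class MismatchCounts:
    """Base counts for one [sequencing run x candidate individual] pair."""
    informative_mismatches: int
    informative_bases: int
    background_mismatches: int
    background_bases: int


def rate(mismatches: int, bases: int) -> float:
    """Observed mismatch rate; 0.0 when no bases were observed."""
    return mismatches / bases if bases else 0.0


def excess_mismatch_rate(c: MismatchCounts) -> float:
    # ASSUMPTION: "excess" means informative rate minus background rate.
    # The page names the quantity but does not define it.
    informative = rate(c.informative_mismatches, c.informative_bases)
    background = rate(c.background_mismatches, c.background_bases)
    return informative - background


def best_candidate(counts_by_individual: dict) -> str:
    # ASSUMPTION: the true source of a run shows the lowest excess rate.
    return min(
        counts_by_individual,
        key=lambda name: excess_mismatch_rate(counts_by_individual[name]),
    )
```

For one sequencing run, <code>best_candidate</code> would be called with a dictionary that maps each candidate individual to its counts; a run whose claimed individual is not the best candidate would be worth a closer look.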

Line 52: Line 52:

1. Separate the results by "Read group classifier".

−

The mapped .bam file may contain sequence data from different instrument runs. The read identifiers are often dot- or colon-separated strings of the form 'run_name<sep>read_number'. The 'run_name' may be either an SRR / ERR identifier or the sequencing center's own alphanumeric internal run identifier. Allow users to input an extended regular expression such as '\(^[^.:]+\)[.:].*', which matches just the part of each read identifier that is common to all reads from one instrument run and which differs between instrument runs.

+

The mapped .bam file may contain sequence data from different instrument runs. The read identifiers are often dot- or colon-separated strings of the form 'run_name<sep>read_number'. The 'run_name' may be either an SRR / ERR identifier or the sequencing center's own alphanumeric internal run identifier. Allow users to input an extended regular expression such as '\(^[^.:]+\)[.:].*', which matches just the part of each read identifier that is common to all reads from one instrument run and which differs between instrument runs.

−

+

<br>

2. Use a model-based approach to calculate the probability that a lane comes from the claimed individual in the index file, given a pool of individuals.
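The two steps above can be sketched as follows. This is a rough illustration, not the lanecheck implementation. The first function groups read identifiers by run name using the pattern from step 1, written in Python's <code>re</code> syntax, where capture groups use unescaped parentheses. The second function assumes the model yields a log-likelihood of the lane's reads for each candidate individual and assumes a uniform prior over the pool; the page does not specify the model, so both are assumptions. The function names are hypothetical.

```python
import math
import re
from collections import defaultdict

# Step 1 pattern in Python syntax: group 1 is the run name before '.' or ':'.
RUN_NAME = re.compile(r"(^[^.:]+)[.:].*")


def group_reads_by_run(read_ids, pattern=RUN_NAME):
    """Map each run name to the read identifiers that carry it."""
    groups = defaultdict(list)
    for read_id in read_ids:
        match = pattern.match(read_id)
        # Identifiers without a separator form their own group.
        key = match.group(1) if match else read_id
        groups[key].append(read_id)
    return dict(groups)


def posterior_over_individuals(log_likelihoods):
    """Turn per-individual log-likelihoods into posterior probabilities.

    ASSUMPTION: uniform prior over the pool of candidate individuals.
    """
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_likelihoods.values())
    weights = {k: math.exp(v - peak) for k, v in log_likelihoods.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}
```

With these helpers, the probability for the claimed individual is the entry of <code>posterior_over_individuals(...)</code> keyed by that individual's identifier from the index file.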

Weich

