From Genome Analysis Wiki
40 bytes added
, 09:46, 25 February 2013
Line 49: |
Line 49: |
| Our dataset consists of 60 individuals from GBR sequenced by the 1000 Genomes Project. These individuals have been sequenced to an average depth of about 4x. | | Our dataset consists of 60 individuals from GBR sequenced by the 1000 Genomes Project. These individuals have been sequenced to an average depth of about 4x. |
| | | |
− | To conserve time and disk space, our analysis will focus on a small region on chromosome 20, 42900000 - 43200000. We will first map the reads for a single individual (labeled TBD). We will then combine the results with mapped reads from the other 59 individuals to generate a list of polymorphic sites and estimate accurate genotypes at each of these sites. | + | To conserve time and disk space, our analysis will focus on a small region on chromosome 20, 42900000 - 43200000. We will first map the reads for two individuals (HG00096, HG00100). We will then combine the results with mapped reads from the other 58 individuals to generate a list of polymorphic sites and estimate accurate genotypes at each of these sites. |
| + | |
| + | The example dataset we'll be using is available at: ftp://share.sph.umich.edu/gotcloud/gotcloudExample.tar |
| | | |
− | The example dataset we'll be using is included in this tar-ball TBD.
| |
| # Create and change to the directory where you want to install the tutorial data | | # Create and change to the directory where you want to install the tutorial data |
| # Download the dataset tarball from the FTP site | | # Download the dataset tarball from the FTP site |