Changes

Line 68: Line 68:     
In the above command, -i specifies alternative IDs for the BAM files to be used in the .seq file (including popID and indivID). -b and -i are optional.
 +
 +
 +
== Estimating ancestry coordinates using LASER ==
 +
 +
Step 0: Generate the reference ancestry space (using the PCA mode of the LASER program)
 +
 +
# ./LASER-2.01/laser -g ./HGDP/HGDP_938.geno -pca 1 -k 30 -o HGDP_938
 +
 +
The above command takes about 15 minutes to finish, so we skip it in this tutorial. A set of reference ancestry coordinates has already been generated in the file $HGDP/HGDP_938.RefPC.coord.
 +
 +
Step 1: Place sequenced samples into the reference ancestry space:
 +
 +
./LASER-2.01/laser -g $HGDP/HGDP_938.geno -c $HGDP/HGDP_938.RefPC.coord -s hapmap_trios.seq -K 20 -k 4 -x 1 -y 3 -o hapmap_trios.1-3 &
 +
./LASER-2.01/laser -g $HGDP/HGDP_938.geno -c $HGDP/HGDP_938.RefPC.coord -s hapmap_trios.seq -K 20 -k 4 -x 4 -y 6 -o hapmap_trios.4-6 &
 +
 +
The first job processes samples 1 to 3 (-x 1 -y 3) and the second job processes samples 4 to 6 (-x 4 -y 6). Each sequenced sample is projected from a 20-dimensional PCA space (-K 20) onto a 4-dimensional reference ancestry space (-k 4). Each job takes about 7 minutes to process its 3 samples.
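
With more samples, the -x/-y ranges and output prefixes follow the same pattern and can be generated rather than typed by hand. The sketch below is one possible way to do that. The function name laser_batches is hypothetical and not part of LASER; it only prints the commands, using the same flags and paths as the two jobs above.

```shell
#!/bin/sh
# Hypothetical helper (not part of LASER): print one laser command per batch.
# Usage: laser_batches TOTAL_SAMPLES BATCH_SIZE
laser_batches() {
    total=$1
    size=$2
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + size - 1))
        # Clamp the last batch so it does not run past the final sample.
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        # \$HGDP is printed literally so it is expanded when the command is run.
        echo "./LASER-2.01/laser -g \$HGDP/HGDP_938.geno -c \$HGDP/HGDP_938.RefPC.coord -s hapmap_trios.seq -K 20 -k 4 -x $start -y $end -o hapmap_trios.$start-$end"
        start=$((end + 1))
    done
}

# Reproduces the two commands used in Step 1 (6 samples, 3 per job).
laser_batches 6 3
```

The printed lines can be reviewed and then run in the background as in Step 1, followed by the shell built-in wait to block until all jobs have finished.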