Changes

From Genome Analysis Wiki
2,873 bytes added, 11:51, 2 February 2017
Line 1: Line 1:  
= Introduction  =
 
= Introduction  =
   −
LASER, which stands for Locating Ancestry using SEquencing Reads, is a C++ software package that can estimate individual ancestry directly from genome-wide shotgun sequencing reads without calling genotypes. The method relies on the availability of a set of reference individuals whose genome-wide SNP genotypes and ancestral information are known. We first construct a reference coordinate system by applying principal components analysis (PCA) to the genotype data of the reference individuals. Then, for each sequencing sample, we use the genome-wide sequencing reads to place the sample into the reference PCA space. With an appropriate reference panel, the estimated coordinates of the sequencing samples identify their ancestral background and can be directly used to correct for population structure in association studies or to ensure adequate matching of cases and controls.
+
LASER, which stands for Locating Ancestry using SEquencing Reads, is a C++ software package that can estimate individual ancestry directly from genome-wide shotgun sequencing reads without calling genotypes. The method relies on the availability of a set of reference individuals whose genome-wide SNP genotypes and ancestral information are known. We first construct a reference coordinate system by applying principal components analysis (PCA) to the genotype data of the reference individuals. Then, for each sequencing sample, we use the genome-wide sequencing reads to place the sample into the reference PCA space. With an appropriate reference panel, the estimated coordinates of the sequencing samples identify their ancestral background and can be directly used to correct for population structure in association studies or to ensure adequate matching of cases and controls.
   −
The goal of this wiki page is to help you get started with LASER, and we encourage you to read the [http://www.sph.umich.edu/csg/chaolong/LASER/LASER_Manual.pdf manual] for more details.
+
 
 +
Note:
 +
The goal of this wiki page is to help you get started with LASER.
 +
This page was created for LASER 1.0. Some of the information might be outdated for LASER 2.0.
 +
A more up-to-date wiki page can be found at [http://genome.sph.umich.edu/wiki/SeqShop:_Estimates_of_Genetic_Ancestry_Practical 2014 UM Sequencing Workshop].
 +
We also encourage you to read the [http://csg.sph.umich.edu/chaolong/LASER/LASER_Manual.pdf manual] for more details about the software.
    
= Download =
 
= Download =
   −
To get a copy of the software and manual, go to the [http://www.sph.umich.edu/csg/chaolong/LASER/ LASER Download] page.
+
To get a copy of the software and manual, go to the [http://csg.sph.umich.edu//chaolong/LASER/ LASER Download] page.
    
= Workflow  =
 
= Workflow  =
Line 75: Line 80:     
<br>
 
<br>
 +
 +
 +
== Interpret LASER outputs ==
 +
 +
After you successfully launch the LASER command line as above, the output messages should look similar to the following:
 +
 +
    ===================================================================
 +
    ====      LASER: Locating Ancestry from SEquencing Reads      ====
 +
    ====            Version 1.0 | (c) Chaolong Wang 2013            ====
 +
    ====================================================================
 +
    Started at: Fri Nov 15 01:05:48 2013
 +
 +
    938 individuals are detected in the GENO_FILE.
 +
    632958 loci are detected in the GENO_FILE.
 +
    1 individuals are detected in the SEQ_FILE.
 +
    632958 loci are detected in the SEQ_FILE.
 +
    938 individuals are detected in the COORD_FILE.
 +
    100 PCs are detected in the COORD_FILE.
 +
 +
    Parameter values used in execution:
 +
    -------------------------------------------------
 +
    GENO_FILE (-g) resource/HGDP/HGDP_938.geno
 +
    SEQ_FILE (-s) pileup2seq/test.seq
 +
    COORD_FILE (-c) resource/HGDP/HGDP_938.RefPC.coord
 +
    OUT_PREFIX (-o) test
 +
    DIM (-k) 2
 +
    MIN_LOCI (-l) 100
 +
    SEQ_ERR (-e) 0.01
 +
    FIRST_IND (-x) 1
 +
    LAST_IND (-y) 1
 +
    REPS (-r) 1
 +
    OUTPUT_REPS (-R) 0
 +
    CHECK_FORMAT (-fmt) 10
 +
    CHECK_COVERAGE (-cov) 0
 +
    PCA_MODE (-pca) 0
 +
    -------------------------------------------------
 +
 +
    Fri Nov 15 01:05:50 2013
 +
    Checking data format ...
 +
    GENO_FILE: OK.
 +
    SEQ_FILE: OK.
 +
    COORD_FILE: OK.
 +
 +
    Fri Nov 15 01:06:01 2013
 +
    Reading reference genotypes ...
 +
 +
    Fri Nov 15 01:09:15 2013
 +
    Reading reference PCA coordinates ...
 +
 +
    Fri Nov 15 01:09:15 2013
 +
    Analyzing sequence samples ...
 +
    Results for the sequence samples are output to 'test.SeqPC.coord'.
 +
 +
    Finished at: Fri Nov 15 01:09:21 2013
 +
    ====================================================================
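
The timestamps in the log give a rough idea of where the run time goes. As a minimal sketch (not part of LASER itself), the following Python snippet computes the total run time from the "Started at" and "Finished at" lines shown above. The timestamp format string is inferred from this example log and assumes the default English (C) locale.

```python
from datetime import datetime

# Format inferred from the example log, e.g. "Fri Nov 15 01:05:48 2013".
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

def parse_log_time(text):
    """Convert a timestamp string from the log into a datetime object."""
    return datetime.strptime(text.strip(), LOG_TIME_FORMAT)

# Timestamps copied from the example output above.
started = parse_log_time("Fri Nov 15 01:05:48 2013")
finished = parse_log_time("Fri Nov 15 01:09:21 2013")

elapsed = finished - started
print(f"Total run time: {int(elapsed.total_seconds())} seconds")
```

In this example, most of the time is spent between "Reading reference genotypes" (01:06:01) and "Reading reference PCA coordinates" (01:09:15).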
 +
 +
The estimated ancestry coordinates of the input samples are stored in the file '''test.SeqPC.coord''', whose content is shown below:
 +
 +
    popID indivID L1 Ci t PC1 PC2
 +
    NA12878.chrom22 NA12878.chrom22 1601 0.00858193 0.977243 31.522 224.098
 +
 +
The ancestry coordinates for the NA12878 sample are given in the PC1 (31.522) and PC2 (224.098) columns.
 +
 +
We recommend visualizing these results together with the HGDP reference samples, whose coordinates are given in the file resource/HGDP/HGDP_938.RefPC.coord
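
Before plotting, the coordinate file has to be read into memory. The sketch below shows one way to do this in Python, using the example content of test.SeqPC.coord shown above. The helper name <code>read_coord</code> is hypothetical, and the code assumes a whitespace-delimited table with a single header row. It might also work for the reference coordinate file if that file has a similar layout, but that is an assumption you should check against the manual.

```python
import io

def read_coord(handle):
    """Parse a whitespace-delimited coordinate table with one header row.

    Returns a list of dicts keyed by column name; columns whose names
    start with "PC" are converted to float, the rest stay as strings.
    """
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        for name in header:
            if name.startswith("PC"):
                row[name] = float(row[name])
        rows.append(row)
    return rows

# Example content copied from test.SeqPC.coord above.
sample_text = (
    "popID indivID L1 Ci t PC1 PC2\n"
    "NA12878.chrom22 NA12878.chrom22 1601 0.00858193 0.977243 31.522 224.098\n"
)

samples = read_coord(io.StringIO(sample_text))
print(samples[0]["indivID"], samples[0]["PC1"], samples[0]["PC2"])
# prints: NA12878.chrom22 31.522 224.098
```

To read a real file, pass an open file handle instead, for example <code>read_coord(open("test.SeqPC.coord"))</code>. The PC1 and PC2 values can then be passed to any plotting tool.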
 +
 +
In our manuscript, an example figure is shown:
 +
 +
[[File:LASER paper Figure 2.png|thumb|center|alt=LASER example outputs as in Figure 2|400px|LASER Outputs]]
 +
 +
In this figure, 238 individuals were randomly selected from the total 938 HGDP samples as the testing set (colored symbols),
 +
and the remaining 700 HGDP individuals were used as the reference panel (gray symbols).
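
A random split like the one described for the figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the individual IDs are placeholders, and the function name <code>split_panel</code>, the seed, and the sampling procedure are hypothetical rather than what was used to produce the figure.

```python
import random

def split_panel(individual_ids, n_test, seed=0):
    """Randomly split individuals into a testing set and a reference panel."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    test_ids = set(rng.sample(individual_ids, n_test))
    reference_ids = [i for i in individual_ids if i not in test_ids]
    return sorted(test_ids), reference_ids

# Placeholder IDs standing in for the 938 HGDP individuals.
ids = [f"IND{i:04d}" for i in range(1, 939)]

test_ids, reference_ids = split_panel(ids, 238)
print(len(test_ids), len(reference_ids))
# prints: 238 700
```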
    
= File format  =
 
= File format  =
Line 168: Line 244:     
LASER has advanced options including (1) parallel computing; (2) increasing ancestry inference accuracy using repeated runs; (3) generating PCA coordinates from genotypes.
 
LASER has advanced options including (1) parallel computing; (2) increasing ancestry inference accuracy using repeated runs; (3) generating PCA coordinates from genotypes.
See [http://www.sph.umich.edu/csg/chaolong/LASER/LASER_Manual.pdf LASER Manual] for detailed information.
+
See [http://csg.sph.umich.edu//chaolong/LASER/LASER_Manual.pdf LASER Manual] for detailed information.
    
= Contact  =
 
= Contact  =