Changes

From Genome Analysis Wiki
1,347 bytes added, 03:07, 4 November 2013
Line 1: Line 1:  
== Overview ==
 
== Overview ==
   −
'''SEQMIX''' is a C++ program that takes advantage of off-targeted sequence reads from exome/targeted sequencing experiments for accurate local ancestry inference.
+
'''SEQMIX''' is a C++ program that takes advantage of off-targeted sequence reads from exome/targeted sequencing experiments for accurate local ancestry inference. The paper has been accepted by AJHG and will appear in the November issue (link coming soon).
    
== Method ==
 
== Method ==
   −
Before running SEQMIX, it is important to pre-process your data with a LD pruning step, which identify sites that are in high LD (r^2 > 0.1) and keep the sites with a higher sequence depth into the model. Since the sequence depth distribution is sample dependent, it is necessary to prune the sequence data for each individual.
+
Before running SEQMIX, it is important to LD-prune your data: pairs of sites in high LD (r^2 > 0.1) are identified, and only the site with the higher sequence depth in each pair is included in the model. Because the sequence depth distribution is sample dependent, the sequence data must be pruned separately for each individual.
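
The depth-aware pruning rule can be illustrated with a minimal Python sketch. This is not the SEQMIX implementation: the function name <code>ld_prune</code>, the input dictionaries, and the greedy "deepest site first" ordering are assumptions made for illustration. The only parts taken from the text are the r^2 > 0.1 threshold and the preference for the site with higher sequence depth.

```python
from typing import Dict, FrozenSet, List


def ld_prune(
    depth: Dict[str, int],
    r2: Dict[FrozenSet[str], float],
    threshold: float = 0.1,
) -> List[str]:
    """Greedy LD pruning for ONE individual (hypothetical helper).

    depth: site ID -> sequence depth observed in this individual.
    r2:    unordered pair of site IDs -> pairwise r^2; pairs that are
           missing from the dictionary are treated as unlinked (r^2 = 0).
    """
    kept: List[str] = []
    # Visit sites from highest to lowest depth (ties broken by site ID),
    # so that within any high-LD pair the deeper site is considered first.
    for site in sorted(depth, key=lambda s: (-depth[s], s)):
        # Keep the site only if it is not in high LD with a kept site.
        if all(r2.get(frozenset((site, k)), 0.0) <= threshold for k in kept):
            kept.append(site)
    return kept


# Toy data: rs1 and rs2 are in high LD; rs3 is nearly independent of both.
depth = {"rs1": 12, "rs2": 30, "rs3": 5}
r2 = {
    frozenset(("rs1", "rs2")): 0.8,
    frozenset(("rs2", "rs3")): 0.05,
    frozenset(("rs1", "rs3")): 0.02,
}
kept_sites = ld_prune(depth, r2)
print(kept_sites)
```

Because depth differs between samples, a sketch like this would be run once per individual with that individual's depths, which is why the retained site set can differ from sample to sample.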
    
== Download ==
 
== Download ==
 +
Here is a tarball [[Media:SEQMIX_0.1.tar]] for SEQMIX (version 0.1), which contains the following three folders:
   −
(Coming soon)
+
* libsrc: a folder containing source code written by Goncalo Abecasis
 +
* src: source code for SEQMIX
 +
* Release: example files and a command file, as well as the Readme.txt file that explains how to run SEQMIX
 +
 
 +
Before you run SEQMIX, or even while you are running SEQMIX examples (summarized by the '''example.sh''' file), please refer to the '''Readme.txt''' file for detailed explanations of the two steps for running SEQMIX.
 +
 
 +
Note that SEQMIX requires the following files:
 +
 
 +
* allele frequency for Africans
 +
* allele frequency for Europeans
 +
* genetic distance file
 +
* input vcf
 +
 
 +
The input VCF file is generated by the sequencing experiment and the downstream data processing steps. The allele frequency and genetic distance files must be prepared by the user, which can be tedious. The good news is that I already have these files at the whole-genome level. Please contact me (''youna@umich.edu'') if you would like to have them. I will point you to the path if you are an internal user, and will figure out a way to share them (these files are fairly big) if you are an external user.
 +
 
 +
== Maintainer ==
 +
 
 +
Please contact Youna Hu (''youna@umich.edu'') if you have any questions or suggestions for SEQMIX.
    
== Related Programs ==
 
== Related Programs ==
   −
Local ancestry inference with high density genotype array data can be done with existing software [http://www.stats.ox.ac.uk/~myers/software.html/ HAPMIX], [http://lamp.icsi.berkeley.edu/lamp/ LAMP], [http://genepath.med.harvard.edu/~reich/Software.htm/ ANCESTRYMAP] (Warning: These programs are very difficult to run.).
+
Local ancestry inference with high density genotype array data can be done with existing software [http://www.stats.ox.ac.uk/~myers/software.html/ HAPMIX], [http://lamp.icsi.berkeley.edu/lamp/ LAMP], [http://genepath.med.harvard.edu/~reich/Software.htm/ ANCESTRYMAP].
    
Whole-genome ancestry inference with ultra-low-coverage sequence data can be performed with [[LASER]].
 
Whole-genome ancestry inference with ultra-low-coverage sequence data can be performed with [[LASER]].