Difference between revisions of "LocusZoom Standalone"

From Genome Analysis Wiki
Line 3: Line 3:
 
This page contains information regarding a version of LocusZoom that may be downloaded for personal use. For more information on LocusZoom, see this [[LocusZoom|page]].  
 
This page contains information regarding a version of LocusZoom that may be downloaded for personal use. For more information on LocusZoom, see this [[LocusZoom|page]].  
  
== Requirements ==  
+
== Requirements ==
  
 
The following software is required:  
 
The following software is required:  
  
* [http://www.python.org/download/ Python 2.6] (do '''not''' download the 3.0 branch!)
+
*[http://www.python.org/download/ Python 2.6] (do '''not''' download the 3.0 branch!)  
* [http://www.r-project.org/ R 2.10+]
+
*[http://www.r-project.org/ R 2.10+]  
* [[New_Fugue|new_fugue]], a program for computing LD, written by Goncalo Abecasis.
+
*[[New Fugue|new_fugue]], a program for computing LD, written by Goncalo Abecasis.
  
 
Currently only '''Unix/Linux''' is supported, though Mac OS X should be supported in a future release.  
 
Currently only '''Unix/Linux''' is supported, though Mac OS X should be supported in a future release.  
  
== Synopsis ==  
+
== Synopsis ==
=== A quick example ===
+
 
 +
=== A quick example ===
  
 
First, change directory into examples/. Then, run the following command:  
 
First, change directory into examples/. Then, run the following command:  
 +
<pre>./run_example.py</pre>
 +
This script runs the following command for you:
 +
<pre>../bin/locuszoom --metal Kathiresan_2009_HDL.txt --refgene FADS1</pre>
 +
A PDF plot of the FADS1 locus will be created in the directory. It should look roughly like this:
  
<pre>./run_example.py</pre>
+
[[Image:FADS1 small.png]]
  
This script runs the following command for you:
+
Voila, your first region plot!
  
<pre>../bin/locuszoom --metal Kathiresan_2009_HDL.txt --refgene FADS1</pre>
+
== Installation  ==
  
A PDF plot of the FADS1 locus will be created in the directory. It should look roughly like this:  
+
=== Step 1: Install Python  ===
  
Voila, your first region plot!
+
You will need to install Python on your system if it is not already. Head over to [http://www.python.org www.python.org] to download it. Note that you will want to make sure to download the latest from the 2.x branch, and '''not''' the 3.0 one.
  
== Installation ==
+
=== Step 2: Install ===
=== Step 1: Install Python ===
 
You will need to install Python on your system if it is already. Head over to [http://www.python.org www.python.org] to download it. Note that you will want to make sure to download the latest from the 2.x branch, and '''not''' the 3.0 one.
 
  
=== Step 2: Install R ===
 
 
R is also required for generating the plots. You can download R at [http://www.r-project.org/ www.r-project.org]. Version 2.10 or greater is required.  
 
R is also required for generating the plots. You can download R at [http://www.r-project.org/ www.r-project.org]. Version 2.10 or greater is required.  
  
 +
=== Step 3: Install new_fugue  ===
 +
 +
New_fugue is a program that calculates linkage disequilibrium measures from genotype files. While installing new_fugue is optional, we highly recommend it as it makes the process of generating plots much easier. If you opt to skip installing new_fugue, you will need to provide your own computed LD files for each region that you want to plot.
 +
 +
New_fugue can be downloaded from [[New Fugue|here]].
 +
 +
Once downloaded, extract the tar file using:
 +
<pre> tar zxf /path/to/new_fugue.tar.gz</pre>
 +
Change into the generic-new_fugue directory that is created, and run:
 +
<pre> make install </pre>
 +
=== Step 4: Install LocusZoom  ===
 +
 +
LocusZoom is provided as a tar archive which contains the following:
 +
 +
*the LocusZoom python application
 +
*the R script used for generating plots
 +
*genotype files (used for computing LD) from hapmap and 1000G (build hg18 only)
 +
*a SQLite database file containing tables describing SNP positions, SNP annotations, gene and exon locations, and recombination rates (build hg18 only)
 +
 +
Simply unpack the tar to your directory of choice by doing the following:
 +
<pre>cd &lt;directory where you want to place locuszoom&gt;
 +
tar zxf /path/to/locuszoom.tgz
 +
</pre>
 +
The tar archive will extract into the following directory structure:
 +
 +
*locuszoom/
 +
**bin/
 +
***locuszoom (this is the locuszoom "executable")
 +
***locuszoom.R (the R script which is used by locuszoom for creating the plots)
 +
**conf/ (configuration file located here)
 +
**data/
 +
***database/ (SQLite file located here)
 +
***hapmap/ (hapmap genotype files)
 +
***1000G/ (1000G genotype files)
 +
**src/ (source code for locuszoom)
 +
 +
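It is important that this directory structure remain intact. To make launching locuszoom easier, you could create a link to it from /usr/local/bin. Use the absolute path to the executable as the link target, since a relative target would be resolved relative to /usr/local/bin. For example:
 +
<pre>ln -s /path/to/locuszoom/bin/locuszoom /usr/local/bin/locuszoom</pre>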
 +
== Configuration  ==
 +
 +
== Input  ==
 +
 +
=== Association results file ("metal" file)  ===
 +
 +
The main input to LocusZoom is a file containing results from an association scan or meta-analysis. The file must have 2 things: markers (SNPs), and p-values. The file should look something like this:
 +
 +
<br>
 +
{| width="25%" cellspacing="0" cellpadding="1" border="1"
 +
|-
 +
! scope="col" | MarkerName
 +
! scope="col" | P-value
 +
|-
 +
| align="center" | rs1
 +
| align="center" | 0.423
 +
|-
 +
| align="center" | rs2
 +
| align="center" | 1.23e-04
 +
|-
 +
| align="center" | rs3
 +
| align="center" | 9.4e-390
 +
|}
 +
<br>
 +
 +
The file should be tab-delimited, though this can be changed using the <code>--delim </code> option.
 +
 +
This file should be passed to locuszoom using the <code>--metal</code> option.
  
 +
=== Region ===
  
== Configuration ==
+
You can specify the region to plot in any one of the following ways:
  
== Input ==
+
* A reference SNP and flanking region
 +
<pre> --refsnp <your snp> --flank 500kb </pre>
 +
* A reference SNP and chromosome/start/stop specification
 +
<pre> --refsnp <your snp> --chr # --start <base position> --end <base position> </pre>
 +
* A gene and flanking region
 +
<pre> --refgene <your gene> --flank 250kb </pre>
 +
The flank is computed as +/- from the transcription start/end of the gene. From this region, LocusZoom will find the SNP with the most significant p-value, and use this as the reference SNP.
 +
* A gene and chromosome/start/stop specification
 +
<pre> --refgene <your gene> --chr # --start <base position> --end <base position> </pre>
 +
This method is similar to the above, except that an exact region is specified. The SNP with the most significant p-value in this region will be used.
 +
* A chromosome/start/stop specification
 +
<pre> --chr # --start <base position> --end <base position> </pre>
 +
Once again, the SNP with the most significant p-value will be used in this region.
  
== Options ==  
+
== Options ==
  
== Output ==  
+
== Output ==
  
== License ==
+
== License ==
  
 
[[Category:Software]]
 
[[Category:Software]]

Revision as of 16:48, 24 May 2010

LocusZoomSmall.png

This page contains information regarding a version of LocusZoom that may be downloaded for personal use. For more information on LocusZoom, see the LocusZoom page on this wiki.

Requirements

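The following software is required:

  • Python 2.6 (do not download the 3.0 branch!)
  • R 2.10+
  • new_fugue, a program for computing LD, written by Goncalo Abecasis.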

Currently only Unix/Linux is supported, though Mac OS X should be supported in a future release.

Synopsis

A quick example

First, change directory into examples/. Then, run the following command:

./run_example.py

This script runs the following command for you:

../bin/locuszoom --metal Kathiresan_2009_HDL.txt --refgene FADS1

A PDF plot of the FADS1 locus will be created in the directory. It should look roughly like this:

FADS1 small.png

Voila, your first region plot!

Installation

Step 1: Install Python

You will need to install Python on your system if it is not already installed. Head over to www.python.org to download it. Make sure to download the latest release from the 2.x branch, and not the 3.0 one.

Step 2: Install R

R is also required for generating the plots. You can download R at www.r-project.org. Version 2.10 or greater is required.
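The two version requirements above can be checked from the command line. The following is a minimal sketch, not part of LocusZoom: it assumes the interpreters are on your PATH under the names python and R, and that your sort supports version sorting (sort -V, as in GNU coreutils). The helper name version_at_least is made up for this example.

```shell
# version_at_least HAVE WANT: succeeds when version HAVE is >= WANT.
# Relies on "sort -V" (version sort), which GNU coreutils provides.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Python: a 2.x interpreter is needed, not the 3.0 branch.
# (Assumes the command is called "python"; older releases print the
# version on stderr, hence the 2>&1.)
if command -v python >/dev/null 2>&1; then
    py_version=$(python --version 2>&1 | awk '{print $2}')
    case "$py_version" in
        2.*) echo "Python $py_version: OK" ;;
        *)   echo "Python $py_version: a 2.x release is required" ;;
    esac
else
    echo "python not found on PATH"
fi

# R: version 2.10 or greater is required.
if command -v R >/dev/null 2>&1; then
    r_version=$(R --version | head -n 1 | awk '{print $3}')
    if version_at_least "$r_version" 2.10; then
        echo "R $r_version: OK"
    else
        echo "R $r_version: 2.10 or greater is required"
    fi
else
    echo "R not found on PATH"
fi
```

A plain string or numeric comparison would rank 2.9 above 2.10, which is why the sketch uses a version sort.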

Step 3: Install new_fugue

New_fugue is a program that calculates linkage disequilibrium measures from genotype files. While installing new_fugue is optional, we highly recommend it as it makes the process of generating plots much easier. If you opt to skip installing new_fugue, you will need to provide your own computed LD files for each region that you want to plot.

New_fugue can be downloaded from the New Fugue page on this wiki.

Once downloaded, extract the tar file using:

 tar zxf /path/to/new_fugue.tar.gz

Change into the generic-new_fugue directory that is created, and run:

 make install 

Step 4: Install LocusZoom

LocusZoom is provided as a tar archive which contains the following:

  • the LocusZoom Python application
  • the R script used for generating plots
  • genotype files (used for computing LD) from HapMap and 1000G (build hg18 only)
  • a SQLite database file containing tables describing SNP positions, SNP annotations, gene and exon locations, and recombination rates (build hg18 only)

Simply unpack the tar to your directory of choice by doing the following:

cd <directory where you want to place locuszoom> 
tar zxf /path/to/locuszoom.tgz 

The tar archive will extract into the following directory structure:

  • locuszoom/
    • bin/
      • locuszoom (this is the locuszoom "executable")
      • locuszoom.R (the R script which is used by locuszoom for creating the plots)
    • conf/ (configuration file located here)
    • data/
      • database/ (SQLite file located here)
      • hapmap/ (HapMap genotype files)
      • 1000G/ (1000G genotype files)
    • src/ (source code for locuszoom)

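It is important that this directory structure remain intact. To make launching locuszoom easier, you could create a link to it from /usr/local/bin. Use the absolute path to the executable as the link target, since a relative target would be resolved relative to /usr/local/bin. For example:

ln -s /path/to/locuszoom/bin/locuszoom /usr/local/bin/locuszoom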
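
Since the directory structure listed above must stay intact, it can be useful to verify it after unpacking or moving the archive. The sketch below is illustrative and not shipped with LocusZoom; the function name check_layout is hypothetical, and the example call assumes the archive was unpacked into the current directory.

```shell
# check_layout ROOT: report every expected path missing under ROOT.
# The path list mirrors the directory structure described above.
check_layout() {
    root=$1
    missing=0
    for path in bin/locuszoom bin/locuszoom.R conf \
                data/database data/hapmap data/1000G src; do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $root/$path"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}

# Example: check a copy unpacked into the current directory.
if check_layout ./locuszoom; then
    echo "layout looks intact"
else
    echo "layout incomplete"
fi
```

The check only tests that the paths exist; it does not inspect the contents of the database or genotype directories.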

Configuration

Input

Association results file ("metal" file)

The main input to LocusZoom is a file containing results from an association scan or meta-analysis. The file must have 2 things: markers (SNPs), and p-values. The file should look something like this:


MarkerName P-value
rs1 0.423
rs2 1.23e-04
rs3 9.4e-390


The file should be tab-delimited, though this can be changed using the --delim option.

This file should be passed to locuszoom using the --metal option.
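
To make the expected layout concrete, the sketch below writes the example table above to a tab-delimited file and counts any line that does not have exactly two fields. The file name example_metal.txt is made up for this example, and the check is only a sanity test of the delimiter, not something LocusZoom itself performs.

```shell
# Write the example table as a tab-delimited file (hypothetical name).
metal_file=example_metal.txt
printf 'MarkerName\tP-value\n' >  "$metal_file"
printf 'rs1\t0.423\n'          >> "$metal_file"
printf 'rs2\t1.23e-04\n'       >> "$metal_file"
printf 'rs3\t9.4e-390\n'       >> "$metal_file"

# Count lines that do not have exactly two tab-separated fields.
bad_lines=$(awk -F '\t' 'NF != 2 { n++ } END { print n + 0 }' "$metal_file")
echo "malformed lines: $bad_lines"
```

A file that is accidentally space-delimited would show up here as malformed, since each of its lines would be read as a single field.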

Region

You can specify the region to plot in any one of the following ways:

  • A reference SNP and flanking region
 --refsnp <your snp> --flank 500kb 
  • A reference SNP and chromosome/start/stop specification
 --refsnp <your snp> --chr # --start <base position> --end <base position> 
  • A gene and flanking region
 --refgene <your gene> --flank 250kb 

The flank is computed as +/- from the transcription start/end of the gene. From this region, LocusZoom will find the SNP with the most significant p-value, and use this as the reference SNP.

  • A gene and chromosome/start/stop specification
 --refgene <your gene> --chr # --start <base position> --end <base position> 

This method is similar to the above, except that an exact region is specified. The SNP with the most significant p-value in this region will be used.

  • A chromosome/start/stop specification
 --chr # --start <base position> --end <base position> 

Once again, the SNP with the most significant p-value will be used in this region.
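
The region options above are combined with --metal on a single command line. The following sketch prints two such command lines as a dry run rather than executing them. It reuses the input file and gene from the quick example; the chromosome number and base positions are illustrative placeholders only, not verified coordinates for any gene.

```shell
# Input file from the quick example.
metal=Kathiresan_2009_HDL.txt

# Two of the region styles described above.
region_by_gene="--refgene FADS1 --flank 250kb"
# Placeholder chromosome and positions; substitute your own region.
region_by_position="--chr 11 --start 61000000 --end 62000000"

# Dry run: print each full command line instead of running it.
for region in "$region_by_gene" "$region_by_position"; do
    echo "locuszoom --metal $metal $region"
done
```

To run a command for real, drop the echo and make sure locuszoom is on your PATH (or call it as ../bin/locuszoom from the examples/ directory, as in the quick example).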

Options

Output

License