Changes

2,814 bytes added , 17:39, 15 November 2016

→‎Download reference files

Line 8: Line 8:

=== Download from webpage ===

−

Through this link [http://~~gvt~~.sph.umich.edu/GREGOR/ GREGOR], you can download a copy of GREGOR.

+

Through this link [http://csg.sph.umich.edu/GREGOR/ GREGOR], you can download a copy of GREGOR.

== Build GREGOR ==

Line 14: Line 14:

To build GREGOR, copy the GREGOR package to the directory you want, and then run the following command:

−

tar xzvf GREGOR.tar.gz

+

tar xzvf GREGOR.v1.4.0.tar.gz

−

After you unzip, you can find 3 directories ~~in "GREGOR"~~ (./example ./lib ./script).

+

After you unzip, in the folder "GREGOR" you can find 4 directories (./Copyrights, ./example, ./lib ./script) and 2 files (README, release_version.txt).

== Download reference files ==

+

Download the reference files from this link [http://csg.sph.umich.edu/GREGOR/ GREGOR Download].

−

~~Download~~ the ~~reference files~~ from ~~this link [http~~:~~//www~~.~~sph.umich.edu/csg/jich/GREGOR/ GREGOR Download], then un-package the file~~

+

Reference files are created for the different population groups (AFR, AMR, ASN, EUR, SAN) from 1000G data (release date: May 21, 2011).

−

~~tar xzvf GREGOR~~.~~ref~~.~~tar~~.gz

+

If your LD r2 threshold is greater than or equal to 0.7, please download the reference files from the category: LD window size = 1MB; LD r2 ≥ 0.7.

+

If your LD r2 threshold is greater than or equal to 0.2, please download the reference files from the category: LD window size = 1MB; LD r2 ≥ 0.2.

−

After ~~unzip~~, you will get ~~47 reference files in~~ the ~~directory~~ "~~~/ref~~"

+

After downloading the reference files, merge the part files into a single gz file with a command like:

+

cat \

+

GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz.part.00 \

+

GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz.part.01 \

+

GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz.part.02 \

+

GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz.part.03 \

+

GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz.part.04 \

+

> GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz

+

Then extract this file:

+

tar zxvf GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz

+

You will get one directory named "AFR".
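The merge step can also be written with a glob instead of listing each part by hand. This is a minimal sketch, assuming the part files sit in the current directory and keep the zero-padded ".part.NN" suffix shown above; the function name merge_parts is hypothetical and not part of GREGOR.

```shell
#!/bin/sh
# merge_parts ARCHIVE: concatenate ARCHIVE.part.NN pieces into ARCHIVE.
# (hypothetical helper, not shipped with GREGOR)
merge_parts() {
    archive="$1"
    # The shell expands the glob in sorted order, so part.00 precedes part.01.
    set -- "$archive".part.*
    # If the glob matched nothing, "$1" is the literal pattern and does not exist.
    if [ ! -e "$1" ]; then
        echo "no part files found for $archive" >&2
        return 1
    fi
    cat "$@" > "$archive"
}
```

For example, merge_parts GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz followed by tar zxvf GREGOR.AFR.ref.r2.greater.than.0.2.tar.gz should produce the same "AFR" directory as the explicit cat command.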

== Basic Usage Example ==

Line 68: Line 83:

## KEY ELEMENTS TO CONFIGURE : NEED TO MODIFY

###############################################################################

−

INDEX_SNP_FILE = ~~/workingdirectory/example/example.index.snps.rsid.list.txt ## e.g.~~ /workingdirectory/example/example.index.snps.rsid.list.txt

+

INDEX_SNP_FILE = /workingdirectory/example/example.index.snps.rsid.list.txt

−

BED_FILE_INDEX = ~~/workingdirectory/example/example.bed.file.index ## e.g.~~ /workingdirectory/example/example.bed.file.index

+

BED_FILE_INDEX = /workingdirectory/example/example.bed.file.index

−

REF_DIR = /workingdirectory/ref/ ~~## reference directory~~

+

REF_DIR = /workingdirectory/ref/

R2THRESHOLD = 0.7

LDWINDOWSIZE = 1000000

−

OUT_DIR = ~~/workingdirectory/example/example.rsid.20130808/ ## e.g.~~ /workingdirectory/example/example.rsid.20130808/

+

OUT_DIR = /workingdirectory/example/example.rsid.20130808/

MIN_NEIGHBOR_NUM = 500

BEDFILE_IS_SORTED = True

−

~~MOSRUN~~ = ~~mosbatch~~ -E/tmp -i -m2000 -~~j20~~,43,~~122~~,~~135~~,~~137~~,~~138~~,~~149~~,~~151~~,~~153~~,~~154~~,~~155~~,~~156~~,~~162~~,~~163~~ sh -c

+

POPULATION = AFR ## define the population; you can specify AFR, AMR, ASN, EUR or SAN

+

TOPNBEDFILES = 2

+

JOBNUMBER = 10

+

###############################################################################

+

#BATCHTYPE = mosix ## submit jobs on MOSIX

+

#BATCHOPTS = -E/tmp -i -m2000 -j10,11,12,13,14,15,16,17,18,19,120,122,123,124,125 sh -c

+

###############################################################################

+

#BATCHTYPE = slurm ## submit jobs on SLURM

+

#BATCHOPTS = --partition=main --time=0:30:0

+

###############################################################################

+

BATCHTYPE = local ## run jobs on local machine

+

In the config file, there are several parameters to adjust:

Line 84: Line 110:

BED_FILE_INDEX: This file lists the datasets (e.g. BED files) to be used for enrichment analysis. Use complete paths to file locations and make sure positions are in hg19 format.

−

REF_DIR: Define reference file directory which you download at here.

+

REF_DIR: The directory that holds the reference files you downloaded. If your "AFR" folder is at "/home/myid/GREGOR/ref/AFR/", set this parameter to "/home/myid/GREGOR/ref/".

−

R2THRESHOLD and LDWINDOWSIZE: These two parameters define the index SNP (and control SNP) LD proxies by r2 threshold and LD window size.

+

R2THRESHOLD and LDWINDOWSIZE: These two parameters define the index SNP (and control SNP) LD proxies by r2 threshold and LD window size. If you downloaded the r2 ≥ 0.7 reference files, you can set R2THRESHOLD to any value between 0.7 and 1.

OUT_DIR: All result files are saved to this folder, where the script will create multiple sub-directories. Index SNPs are in the folder "index_SNP"; Random SNPs are in the folder "random_SNP".

Line 93: Line 119:

BEDFILE_IS_SORTED: True or false, depending on whether the BED files listed in the index file are sorted.

+

POPULATION: If you use the "AFR" reference files, set this to AFR. There are 5 options: AFR, AMR, ASN, EUR and SAN.

+

GREGOR can run on a local machine or on a cluster with MOSIX or SLURM.

+

BATCHTYPE: Specify "local" to run GREGOR on a local machine, "mosix" to run on a MOSIX system, or "slurm" to run on a SLURM system.

+

BATCHOPTS: This parameter is used together with BATCHTYPE when you specify "mosix" or "slurm". For example, with "mosix" this parameter can be "-E/tmp -i -m2000 -j10,11,12,13,14,15,16 sh -c"; with "slurm" it can be "--partition=1000g --time=0:30:0".
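Before launching a run, it can help to confirm that REF_DIR and POPULATION point at an existing reference folder. The sketch below assumes the config file uses the "KEY = value ## comment" layout shown in the example above; the helper names conf_get and check_ref_dir are hypothetical and not part of GREGOR.

```shell
#!/bin/sh
# conf_get KEY FILE: print the value of KEY from a "KEY = value ## comment" file.
# Lines that start with "#" (commented-out settings) are not matched.
conf_get() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" \
        | sed -e 's/[[:space:]]*##.*$//' -e 's/[[:space:]]*$//' \
        | head -n 1
}

# check_ref_dir FILE: verify that REF_DIR/POPULATION exists as a directory.
check_ref_dir() {
    conf="$1"
    ref_dir=$(conf_get REF_DIR "$conf")
    pop=$(conf_get POPULATION "$conf")
    # Strip a trailing slash from REF_DIR so the path is printed cleanly.
    if [ -d "${ref_dir%/}/$pop" ]; then
        echo "ok: ${ref_dir%/}/$pop"
    else
        echo "missing: ${ref_dir%/}/$pop" >&2
        return 1
    fi
}
```

With REF_DIR = /home/myid/GREGOR/ref/ and POPULATION = AFR, the check passes only if /home/myid/GREGOR/ref/AFR is a directory.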

+

== Reference Files ==

+

We provide two kinds of reference files. They differ in how LD buddies are defined.

+

*LD window size = 1MB; LD r2 ≥ 0.7:

+

**All LD buddies are within a 1MB window and have r2 greater than or equal to 0.7. If you want to calculate LD buddies within 1MB at an r2 threshold ≥ 0.7 (such as 0.9, 0.8 or 0.7), please use these reference data.

+

*LD window size = 1MB; LD r2 ≥ 0.2:

+

**All LD buddies are within a 1MB window and have r2 greater than or equal to 0.2. If you want to calculate LD buddies within 1MB at an r2 threshold ≥ 0.2 (such as 0.6, 0.5, 0.4, 0.3 or 0.2), please use these reference data.
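The choice between the two reference categories can be sketched as a small lookup on the R2THRESHOLD value. This is only an illustration of the rule described above; the function name ref_category is hypothetical, and treating thresholds below 0.2 or above 1 as unsupported is an assumption.

```shell
#!/bin/sh
# ref_category R2: print which reference category matches an r2 threshold.
# (hypothetical helper; thresholds outside [0.2, 1] are treated as unsupported)
ref_category() {
    awk -v r2="$1" 'BEGIN {
        if (r2 >= 0.7 && r2 <= 1)       print "LD r2 >= 0.7"
        else if (r2 >= 0.2 && r2 < 0.7) print "LD r2 >= 0.2"
        else exit 1
    }'
}
```

For example, ref_category 0.8 selects the r2 ≥ 0.7 files, while ref_category 0.5 selects the r2 ≥ 0.2 files.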

== Results Output ==

The file StatisticSummaryFile.txt in the output directory contains enrichment results with the following information:

+

[[File:GREGOR Summary 20160201.png]]

Bed_File: The individual datasets used in the enrichment analysis

Line 103: Line 145:

Pvalue: P-value calculated assuming a sum of binomial distributions to represent the number of index SNPs (or LD proxies) that overlap a dataset compared to the expectation observed in the matched control sets

−

*Note: SNPs that cannot be converted from rsID to chr:pos format are listed in the output file rsid.index.snp.txt. SNPs for which there are no LD proxies or no MAF data available are listed in the output file nonannoted.index.snp.txt.

+

*Note:

+

**SNPs that cannot be converted from rsID to chr:pos format are listed in the output file rsid.index.snp.txt. SNPs for which there are no LD proxies or no MAF data available are listed in the output file nonannoted.index.snp.txt.

+

** If an index SNP and its LD buddies do not fall in any BED region, the Pvalue may be reported as "NA".
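Rows with an "NA" Pvalue can be listed with a short awk filter. This is a sketch under two assumptions that the page does not confirm: that StatisticSummaryFile.txt is tab-separated, and that its first row is a header containing columns named Bed_File and Pvalue (matched case-insensitively). The function name na_pvalue_beds is hypothetical.

```shell
#!/bin/sh
# na_pvalue_beds FILE: print the Bed_File entry of every row whose Pvalue is "NA".
# Assumes a tab-separated file with a header row naming Bed_File and Pvalue.
na_pvalue_beds() {
    awk -F '\t' '
        NR == 1 {
            # Locate the two columns by header name instead of fixed position.
            for (i = 1; i <= NF; i++) {
                name = tolower($i)
                if (name == "bed_file") bed = i
                if (name == "pvalue")   pv  = i
            }
            if (!bed || !pv) exit 2   # header does not match the assumption
            next
        }
        $pv == "NA" { print $bed }
    ' "$1"
}
```

An exit status of 2 means the header did not contain the expected column names, so the assumptions above do not hold for that file.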

== Testing GREGOR ==

Jchen

66

edits


GREGOR (view source)

Revision as of 17:39, 15 November 2016
