Difference between revisions of "Biostatistics 830: Main Page"

From Genome Analysis Wiki
 
(30 intermediate revisions by the same user not shown)
Line 11: Line 11:
 
== Scheduling ==
 
== Scheduling ==
  
For Fall 2013, classes are scheduled for Wednesdays and Fridays, 3:00 - 4:30 pm room M4318 SPH II.  
+
For Fall 2014, classes are scheduled for Mondays and Tuesdays, 3:00 - 4:30 pm room M4318 SPH II.  
 +
Tentatively, we will aim for Wednesday 2:00 - 3:30 for office hours and any classes that must be rescheduled.
 +
 
 +
== Grading ==
  
 
The final grade will take into account your performance in problem sets and worksheets as well as your participation in class.
 
The final grade will take into account your performance in problem sets and worksheets as well as your participation in class.
  
=== Dates ===
+
=== Class Worksheets ===
  
The green dates below indicate when we will be meeting.
+
September 8 - [[Media:Question_Sheet_-_Li_et_al_(2010)_Gen_Epid.pdf|Li et al (2010)]]
  
{| border="1" cellpadding="5" cellspacing="0" style="text-align: center;"
+
September 22 - [[Media:Question_Sheet_-_Howie_et_al_(2012)_Nat_Genet.pdf|Howie et al (2012)]]
|-
 
! colspan="3" style="background: #ffdead;" | September
 
! rowspan="7" |    
 
! colspan="3" style="background: #ffdead;" | October
 
! rowspan="7" |    
 
! colspan="3" style="background: #ffdead;" | November
 
! rowspan="7" |    
 
! colspan="3" style="background: #ffdead;" | December
 
|-
 
|Mon
 
|Wed
 
|Fri
 
|Mon
 
|Wed
 
|Fri
 
|Mon
 
|Wed
 
|Fri
 
|Mon
 
|Wed
 
|Fri
 
|-
 
|2
 
! style="background: lightgreen;" |4
 
! style="background: lightgreen;" |6
 
|
 
! style="background: lightgreen;" |2
 
|4
 
|
 
|
 
! style="background: lightgreen;" |1
 
|2
 
! style="background: lightgreen;" |4
 
! style="background: lightgreen;" |6
 
|-
 
|9
 
|11
 
! style="background: lightgreen;" |13
 
|7
 
|9
 
|11
 
|4
 
! style="background: lightgreen;" |6
 
! style="background: lightgreen;" |8
 
! style="background: lightgreen;" |9
 
! style="background: lightgreen;" |11
 
|
 
|-
 
|16
 
|18
 
! style="background: lightgreen;" |20
 
! style="background: lightgreen;" |14
 
! style="background: lightgreen;" |16
 
! style="background: lightgreen;" |18
 
! style="background: lightgreen;" |11
 
! style="background: lightgreen;" |13
 
! style="background: lightgreen;" |15
 
|
 
|
 
|
 
|-
 
|23
 
! style="background: lightgreen;" |25
 
! style="background: lightgreen;" |27
 
|21
 
|23
 
|25
 
|18
 
! style="background: lightgreen;" |20
 
! style="background: lightgreen;" |22
 
|
 
|
 
|
 
|-
 
|30
 
|
 
|
 
! style="background: lightgreen;" |28
 
! style="background: lightgreen;" |30
 
|
 
! style="background: lightgreen;" |25
 
! style="background: lightgreen;" |27
 
|29
 
|
 
|
 
|
 
|}
 
  
=== Class Worksheets ===
+
September 29 - [[Media:Question_Sheet_-_Menelaou_et_al_(2013)_Bioinformatics.pdf|Menelaou et al (2013)]] 
  
September 6 - [[Media:Question_Sheet_-_Li_et_al_(2010)_Gen_Epid.pdf|Li et al (2010)]]
+
'''October 6 - School of Public Health Symposium'''
  
September 20 - [[Media:Question_Sheet_-_Howie_et_al_(2012)_Nat_Genet.pdf|Howie et al (2012)]]
+
October 7 - Delaneau et al (2013)
  
September 27 - [[Media:Question_Sheet_-_Browing_and_Browning_(2012)_Am_J_Hum_Genet.pdf|Browning and Browning (2007)]]
+
'''October 13/14 - University Fall Break'''
  
October 16 - [[Media:Question_Sheet_-_Li_et_al_(2008)_Genome_Research.pdf|Li et al (2008)]]
+
'''October 20/21 - American Society of Human Genetics Meeting'''
  
November 1 - [[Media:Question_Sheet_-_Li_and_Durbin_(2009)_Bioinformatics.pdf|Li and Durbin (2009)]]
+
October 27 - [[Media:Question_Sheet_-_Li_and_Durbin_(2009)_Bioinformatics.pdf|Li and Durbin (2009)]]
  
November 8 - [[Media:Question_Sheet_-_Zerbino_and_Birney_(2008)_Bioinformatics.pdf|Zerbino and Birney (2008)]]
+
November 3 - [[Media:Question_Sheet_-_Iqbal_et_al_(2012)_Nature_Genetics.pdf|Iqbal et al (2012)]]
  
November 13 - [[Media:Question_Sheet_-_Iqbal_et_al_(2012)_Nature_Genetics.pdf|Iqbal et al (2012)]]
+
November 17 - [[Media:Question_Sheet_-_Albers_et_al_(2010)_Genome_Research.pdf|Albers et al (2010)]]
  
November 22 - [[Media:Question_Sheet_-_Li_and_Durbin_(2011)_Nature.pdf|Li and Durbin(2011)]]
+
November 24 - [[Media:Question_Sheet_-_Kircher_et_al_(2014)_Nature_Genetics.pdf|Kircher et al (2014)]]
  
November 27 - [[Media:Question_Sheet_-_Liu_et_al_(2013)_Nature_Genetics.pdf|Liu et al (2013)]]
+
December 1 - [[Media:Question_Sheet_-_Liu_et_al_(2013)_Nature_Genetics.pdf|Liu et al (2014)]]
  
December 6 - Wu et al (2012)
+
December 8 - [[Media:Question_Sheet_-_Kang_et_al_(2010)_Nature_Genetics.pdf|Kang et al (2010)]]
 
 
December 9 - Jun et al (2013)
 
  
 
== Standards of Academic Conduct ==
 
== Standards of Academic Conduct ==
Line 138: Line 52:
 
''Student academic misconduct includes behavior involving plagiarism, cheating, fabrication, falsification of records or official documents, intentional misuse of equipment or materials, and aiding and abetting the perpetration of such acts. The preparation of reports, papers, and examinations, assigned on an individual basis, must represent each student’s own effort. Reference sources should be indicated clearly. The use of assistance from other students or aids of any kind during a written examination, except when the use of books or notes has been approved by an instructor, is a violation of the standard of academic conduct.''
 
''Student academic misconduct includes behavior involving plagiarism, cheating, fabrication, falsification of records or official documents, intentional misuse of equipment or materials, and aiding and abetting the perpetration of such acts. The preparation of reports, papers, and examinations, assigned on an individual basis, must represent each student’s own effort. Reference sources should be indicated clearly. The use of assistance from other students or aids of any kind during a written examination, except when the use of books or notes has been approved by an instructor, is a violation of the standard of academic conduct.''
  
In the context of this course, any work you hand-in should be your own and any material that is a transcript (or interpreted transcript) of work by others must be clearly labeled as such.
+
In the context of this course, any work you hand-in should be your own and any material that is a transcript (or interpreted transcript) of work by others must be clearly labeled as such. If you turn in work that is directly copied from another student or from a published or unpublished source without attribution, you risk failing the course.
  
 
== Required Reading ==
 
== Required Reading ==
  
 
* Albers CA, Lunter G, MacArthur DG, McVean G, Ouwehand WH, Durbin R (2010) Dindel: accurate indel calls from short-read data. Genome Res. 21:961-73
 
* Albers CA, Lunter G, MacArthur DG, McVean G, Ouwehand WH, Durbin R (2010) Dindel: accurate indel calls from short-read data. Genome Res. 21:961-73
 
* Browning SR, Browning BL (2007) Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. ''Am J Hum Genet.'' '''81''':1084-97. PMID: 17924348
 
  
 
* Coventry A, Bull-Otterson LM, Liu X, Clark AG, Maxwell TJ, Crosby J, Hixson JE, Rea TJ, Muzny DM, Lewis LR, Wheeler DA, Sabo A, Lusk C, Weiss KG, Akbar H, Cree A, Hawes AC, Newsham I, Varghese RT, Villasana D, Gross S, Joshi V, Santibanez J, Morgan M, Chang K, Iv WH, Templeton AR, Boerwinkle E, Gibbs R, Sing CF (2010) Deep resequencing reveals excess rare recent variants consistent with explosive population growth. ''Nat Commun.'' '''1''':131. PMID: 21119644
 
* Coventry A, Bull-Otterson LM, Liu X, Clark AG, Maxwell TJ, Crosby J, Hixson JE, Rea TJ, Muzny DM, Lewis LR, Wheeler DA, Sabo A, Lusk C, Weiss KG, Akbar H, Cree A, Hawes AC, Newsham I, Varghese RT, Villasana D, Gross S, Joshi V, Santibanez J, Morgan M, Chang K, Iv WH, Templeton AR, Boerwinkle E, Gibbs R, Sing CF (2010) Deep resequencing reveals excess rare recent variants consistent with explosive population growth. ''Nat Commun.'' '''1''':131. PMID: 21119644
Line 154: Line 66:
 
* Iqbal Z, Caccamo M, Turner I, Flicek P, McVean G (2012) De novo assembly and genotyping of variants using colored de Bruijn graphs. ''Nat Genet.'' '''44''':226-32. PMID: 22231483
 
* Iqbal Z, Caccamo M, Turner I, Flicek P, McVean G (2012) De novo assembly and genotyping of variants using colored de Bruijn graphs. ''Nat Genet.'' '''44''':226-32. PMID: 22231483
  
* Jun G, Flickinger M, Hetrick KN, Romm JM, Doheny KF, Abecasis GR, Boehnke M, Kang HM (2012) Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. ''Am J Hum Genet.'' '''91''':839-48. PMID: 23103226
+
* Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, Sabatti C, Eskin E (2010) Variance component model to account for sample structure in genome-wide association studies. ''Nat. Genet.'' '''42''':348-354
 
+
* Li H, Ruan J, Durbin R (2008) Mapping short DNA sequencing reads and calling variants using mapping quality scores. ''Genome Res.'' '''18''':1851-8. PMID: 18714091 [[Biostatistics 830 - Code Snippets|[Code Snippets]]]
+
* Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J (2014) A general framework for estimating the relative pathogenicity of human genetic variants. ''Nat. Genet.'' '''46''' 310–315
  
 
* Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. ''Bioinformatics.'' '''25''':1754-60. PMID: 19451168
 
* Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. ''Bioinformatics.'' '''25''':1754-60. PMID: 19451168
  
* Li H, Durbin R (2011) Inference of human population history from individual whole-genome sequences. ''Nature.'' '''475''':493-6. PMID: 21753753
+
* Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR (2010) MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. ''Genet Epidemiol.'' '''34''':816-34. PMID: 21058334 [[Biostatistics 830 - Code Snippets|[Code Snippets]]]
  
* Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR (2010) MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. ''Genet Epidemiol.'' '''34''':816-34. PMID: 21058334 [[Biostatistics 830 - Code Snippets|[Code Snippets]]]
+
* Menelaou A, Marchini J (2013) Genotype calling and phasing using next-generation sequencing reads and a haplotype scaffold. ''Bioinformatics.'' '''29''':84-91. PMID: 23093610
  
 
* Lin DY, Zeng D (2010) Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data. ''Genet Epidemiol.'' '''34''':60-6. PMID: 19847795
 
* Lin DY, Zeng D (2010) Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data. ''Genet Epidemiol.'' '''34''':60-6. PMID: 19847795
  
* Liu et al (2013) http://arxiv.org/abs/1305.1318
+
* Liu DJ, Peloso GM, Zhan X, Holmen OL, Zawistowski M, Feng S, Nikpay M, Auer PL, Goel A, Zhang H, Peters U, Farrall M, Orho-Melander M, Kooperberg C, McPherson R, Watkins H, Willer CJ, Hveem K, Melander O, Kathiresan S, Abecasis GR (2014) Meta-analysis of gene-level tests for rare variant association. ''Nat Genet.'' '''46''':200-4
  
* Wen X, Stephens M (2010) Using linear predictors to impute allele frequencies from summary or pooled genotype data. ''Ann Appl Stat.'' '''4''':1158-1182. PMID: 21479081
+
* Schiffels S, Durbin R (2014) Inferring human population size and separation history from multiple genome sequences. ''Nat Genet.'' '''46''':919-25
  
* Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X (2011) Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet. 89:82-93
+
* Yang J, Zaitlen NA, Goddard ME, Visscher PM, Price AL (2014) Advantages and pitfalls in the application of mixed-model association methods. ''Nat. Genet.'' '''46''':100-106
  
* Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. ''Genome Res.'' '''18''':821-9. PMID: 18349386
+
* Wang C, Zhan X, Bragg-Gresham J, Kang HM, Stambolian D, Chew EY, Branham KE, Heckenlively J; FUSION Study, Fulton R, Wilson RK, Mardis ER, Lin X, Swaroop A, Zöllner S, Abecasis GR (2014) Ancestry estimation and control of population stratification for sequence-based association studies. ''Nat Genet. 2014'' '''46''':409-15
 +
 
 +
* Wen X, Stephens M (2010) Using linear predictors to impute allele frequencies from summary or pooled genotype data. ''Ann Appl Stat.'' '''4''':1158-1182. PMID: 21479081
  
 
== Course History ==
 
== Course History ==
  
 
This course is an ad-hoc course, first taught by Goncalo Abecasis in the Fall of 2013.
 
This course is an ad-hoc course, first taught by Goncalo Abecasis in the Fall of 2013.
 +
 +
* [[Biostatistics 830: Fall 2013 Edition]]

Latest revision as of 03:43, 5 January 2017

Objective

Gene mapping studies examine the relationship between genetic variation and susceptibility to human disease. These studies are changing rapidly with the availability of techniques for very large scale genetic analysis, whether based on sequencing or on genotyping. Biostatistics 830 is a Ph.D. level course that dissects some recently developed methods and the principles behind their implementation. It is meant to provide students with a toolkit to facilitate development and implementation of new statistical methods.

For additional information, see also Core Competencies in Biostatistics Program covered by this course.

Target Audience

Students registering for Biostatistics 830 are strongly encouraged to have completed Biostatistics 666 and Biostatistics 615/815, which introduce methods for genetic analysis and programming principles, respectively.

Scheduling

For Fall 2014, classes are scheduled for Mondays and Tuesdays, 3:00 - 4:30 pm, in room M4318, SPH II. Tentatively, we will aim for Wednesday 2:00 - 3:30 for office hours and any classes that must be rescheduled.

Grading

The final grade will take into account your performance in problem sets and worksheets as well as your participation in class.

Class Worksheets

September 8 - Li et al (2010)

September 22 - Howie et al (2012)

September 29 - Menelaou et al (2013)

October 6 - School of Public Health Symposium

October 7 - Delaneau et al (2013)

October 13/14 - University Fall Break

October 20/21 - American Society of Human Genetics Meeting

October 27 - Li and Durbin (2009)

November 3 - Iqbal et al (2012)

November 17 - Albers et al (2010)

November 24 - Kircher et al (2014)

December 1 - Liu et al (2014)

December 8 - Kang et al (2010)

Standards of Academic Conduct

The following is an extract from the School of Public Health's Student Code of Conduct:

Student academic misconduct includes behavior involving plagiarism, cheating, fabrication, falsification of records or official documents, intentional misuse of equipment or materials, and aiding and abetting the perpetration of such acts. The preparation of reports, papers, and examinations, assigned on an individual basis, must represent each student’s own effort. Reference sources should be indicated clearly. The use of assistance from other students or aids of any kind during a written examination, except when the use of books or notes has been approved by an instructor, is a violation of the standard of academic conduct.

In the context of this course, any work you hand in should be your own, and any material that is a transcript (or interpreted transcript) of work by others must be clearly labeled as such. If you turn in work that is directly copied from another student or from a published or unpublished source without attribution, you risk failing the course.

Required Reading

  • Albers CA, Lunter G, MacArthur DG, McVean G, Ouwehand WH, Durbin R (2010) Dindel: accurate indel calls from short-read data. Genome Res. 21:961-73
  • Coventry A, Bull-Otterson LM, Liu X, Clark AG, Maxwell TJ, Crosby J, Hixson JE, Rea TJ, Muzny DM, Lewis LR, Wheeler DA, Sabo A, Lusk C, Weiss KG, Akbar H, Cree A, Hawes AC, Newsham I, Varghese RT, Villasana D, Gross S, Joshi V, Santibanez J, Morgan M, Chang K, Iv WH, Templeton AR, Boerwinkle E, Gibbs R, Sing CF (2010) Deep resequencing reveals excess rare recent variants consistent with explosive population growth. Nat Commun. 1:131. PMID: 21119644
  • Delaneau O, Zagury JF, Marchini J (2013) Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 10:5-6. PMID: 23269371
  • Howie B, Fuchsberger C, Stephens M, Marchini J, Abecasis GR (2012) Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat Genet. 44:955-9. PMID: 22820512 [Code Snippets]
  • Iqbal Z, Caccamo M, Turner I, Flicek P, McVean G (2012) De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet. 44:226-32. PMID: 22231483
  • Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, Sabatti C, Eskin E (2010) Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 42:348-354
  • Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 46:310-315
  • Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25:1754-60. PMID: 19451168
  • Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR (2010) MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol. 34:816-34. PMID: 21058334 [Code Snippets]
  • Menelaou A, Marchini J (2013) Genotype calling and phasing using next-generation sequencing reads and a haplotype scaffold. Bioinformatics. 29:84-91. PMID: 23093610
  • Lin DY, Zeng D (2010) Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data. Genet Epidemiol. 34:60-6. PMID: 19847795
  • Liu DJ, Peloso GM, Zhan X, Holmen OL, Zawistowski M, Feng S, Nikpay M, Auer PL, Goel A, Zhang H, Peters U, Farrall M, Orho-Melander M, Kooperberg C, McPherson R, Watkins H, Willer CJ, Hveem K, Melander O, Kathiresan S, Abecasis GR (2014) Meta-analysis of gene-level tests for rare variant association. Nat Genet. 46:200-4
  • Schiffels S, Durbin R (2014) Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46:919-25
  • Yang J, Zaitlen NA, Goddard ME, Visscher PM, Price AL (2014) Advantages and pitfalls in the application of mixed-model association methods. Nat Genet. 46:100-106
  • Wang C, Zhan X, Bragg-Gresham J, Kang HM, Stambolian D, Chew EY, Branham KE, Heckenlively J; FUSION Study, Fulton R, Wilson RK, Mardis ER, Lin X, Swaroop A, Zöllner S, Abecasis GR (2014) Ancestry estimation and control of population stratification for sequence-based association studies. Nat Genet. 46:409-15
  • Wen X, Stephens M (2010) Using linear predictors to impute allele frequencies from summary or pooled genotype data. Ann Appl Stat. 4:1158-1182. PMID: 21479081

Course History

This is an ad hoc course, first taught by Goncalo Abecasis in the Fall of 2013.