Difference between revisions of "TOPMed Site Visit 2018"

From Genome Analysis Wiki
(Created page with "'''One day: Thursday, September 13, 2018, 8:30 am - 3:00 pm ?''' Perhaps 4 1/3 total hours of presentations, if lunch is brought in. (If you think I'm a bit optimistic o...")
 
Line 1: Line 1:
'''One day:  Thursday, September 13, 2018,  8:30 am - 3:00 pm ?'''
+
'''One day:  Thursday, September 13, 2018,  8:30 am - 4:00 pm ?'''
  
Perhaps 4 1/3 total hours of presentations, if lunch is brought in.  
+
Perhaps 5 1/2 total hours of presentations.
 
 
(If you think I'm a bit optimistic on the timings, I'd have to agree with you.)
 
  
 
= Draft Agenda =
 
= Draft Agenda =
Line 59: Line 57:
 
== Break (30 minutes) ==
 
== Break (30 minutes) ==
  
== ENCORE analysis server (15 minutes) ==
+
== Summary of contract spending to date (20 minutes) ==
  
''Matthew Flickinger will present this section''
+
''Denise Bianchi and Goncalo Abecasis will present this section''
 +
 
 +
* Broad subdivisions:  personnel, cloud storage, cloud computing, hardware
 +
* Divided between Task 1 and Task 2
  
* Principle of operation
+
== Improved results from the latest TOPMed imputation panel (20 minutes) ==
* Releases only aggregate data summaries
 
* Visualizations help to assess results
 
* Data sharing and collaboration tools
 
* Capability to re-run previous jobs with new data
 
* SAIGE analysis gives accurate results in case-control studies
 
* Usage statistics comparing last 12-months to previous 12-months
 
* Some highlights from user survey results
 
  
== Improved results from the latest TOPMed imputation panel (15 minutes) ==
+
''Goncalo Abecasis will present this, unless Lukas Forer is available. Ketian Yu to prepare summaries of imputation quality''
  
''Goncalo Abecasis will present this, unless Lukas Forrer is available. Ketian to prepare summaries of imputation quality''
+
[[Media:nhlbi.4761.imputation.accuracy.2018aug31.pptx|'''(slides)''']]
  
 
* Principle of operation
 
* Principle of operation
Line 82: Line 76:
 
* Challenges and opportunities from collaboration and integration with NIH Commons / NHLBI Stage
 
* Challenges and opportunities from collaboration and integration with NIH Commons / NHLBI Stage
  
== Potential Population Genetics Update (15 minutes) ==
+
== Potential Population Genetics Update (20 minutes) ==
  
 
'' Check with Sebastian Zoellner''
 
'' Check with Sebastian Zoellner''
 +
 +
== Cloud access to TOPMed sequence data (20 minutes) ==
 +
 +
''Tom Blackwell to take the lead on this section''
 +
 +
[[Media:nhlbi.4768.fusera.slides.01.pdf|'''(slides)''']]
 +
 +
* NCBI's 'Fusera' controlled access mechanism
 +
* User perspective -- involves a Google or Amazon billing project
 +
* What is needed for users to have a great overall experience?
 +
* Education and training for users
 +
 +
== Lunch (70 minutes) ==
 +
 +
== ENCORE analysis server (20 minutes) ==
 +
 +
''Matthew Flickinger will present this section''
 +
 +
* Principle of operation
 +
* Releases only aggregate data summaries
 +
* Visualizations help to assess results
 +
* Data sharing and collaboration tools
 +
* Capability to re-run previous jobs with new data
 +
* SAIGE analysis gives accurate results in case-control studies
 +
* Usage statistics comparing last 12-months to previous 12-months
 +
* Some highlights from user survey results
  
 
== Manuscript support (30 minutes) ==
 
== Manuscript support (30 minutes) ==
Line 99: Line 119:
 
* Context specific mutation rates
 
* Context specific mutation rates
 
* Data sharing with Centers for Common Disease Genetics
 
* Data sharing with Centers for Common Disease Genetics
 
== Lunch (45 minutes) ==
 
  
 
== Interactions with outside groups (30 minutes) ==
 
== Interactions with outside groups (30 minutes) ==
Line 112: Line 130:
 
* NIMH Parkinsons Disease Consortium
 
* NIMH Parkinsons Disease Consortium
  
== Cloud access to TOPMed sequence data (15 minutes) ==
+
== Break (20 minutes) ==
  
''Tom Blackwell to take the lead on this section''
+
== Future plans for the next Task Order (30 minutes) ==
  
* NCBI's 'Fusera' controlled access mechanism
+
== Feedback from NHLBI (40 minutes) ==
* User perspective -- involves a Google or Amazon billing project
 
* What is needed for users to have a great overall experience?
 
* Education and training for users
 
  
== Feedback from NHLBI (60 minutes) ==
+
== Finish (3:50 pm) ==

Revision as of 11:43, 4 September 2018

One day: Thursday, September 13, 2018, 8:30 am - 4:00 pm ?

Perhaps 5 1/2 total hours of presentations.

Draft Agenda

Introduction and Overview of TOPMed data resources and services (30 minutes)

Goncalo Abecasis will present an overview of IRC activities

  • Introduce IRC personnel and their expertise
  • Sequence for 130,000+ participants
  • Variant calls and genotypes, phased and unphased
  • Structural variant calls in progress
  • BRAVO variant browser
  • ENCORE analysis server
  • TOPMed imputation reference panel
  • Main developments in the past year, including security improvements, Manual, etc.

Characteristics of variants in data freeze 6 (20 minutes)

Hyun Min Kang and Jonathan LeFaive will present this section

  • Overall numbers
  • Differences by ancestry, study and sequencing center
  • Allele frequencies of deleterious variants
  • Genotype accuracy
  • Compare harmonized versus sequencing center mappings
  • Process for variant calling, genotyping, filtering, phasing and distribution
  • Process for interim 'snapshot' genotypes

Calling structural variants (20 minutes)

Our colleagues at Baylor will present this section. William Salerno?

  • Procedures and plans for structural variant calling
  • Benefits of ensemble approach
  • Initial results
  • Data access mechanisms
  • Anticipated data size
  • Coordination with SNPs and indels

BRAVO variant browser helps to assess variant quality (20 minutes)

Daniel Taliun will present this section

  • Purpose
  • Which studies are included
  • Usage / main features
  • Both rare and common variants
  • Improvements in the last year
  • Access via an application programming interface (API)
  • View all information used in filtering
  • Coordination with gnomAD
  • Potential plans for PheWeb integration

Break (30 minutes)

Summary of contract spending to date (20 minutes)

Denise Bianchi and Goncalo Abecasis will present this section

  • Broad subdivisions: personnel, cloud storage, cloud computing, hardware
  • Divided between Task 1 and Task 2

Improved results from the latest TOPMed imputation panel (20 minutes)

Goncalo Abecasis will present this, unless Lukas Forer is available. Ketian Yu to prepare summaries of imputation quality

(slides)

  • Principle of operation
  • Measuring imputation quality in different populations
  • Pushing the low frequency boundary
  • Improved accuracy for African American and Latino samples
  • Challenges and opportunities from collaboration and integration with NIH Commons / NHLBI Stage

Potential Population Genetics Update (20 minutes)

Check with Sebastian Zoellner

Cloud access to TOPMed sequence data (20 minutes)

Tom Blackwell to take the lead on this section

(slides)

  • NCBI's 'Fusera' controlled access mechanism
  • User perspective -- involves a Google or Amazon billing project
  • What is needed for users to have a great overall experience?
  • Education and training for users

Lunch (70 minutes)

ENCORE analysis server (20 minutes)

Matthew Flickinger will present this section

  • Principle of operation
  • Releases only aggregate data summaries
  • Visualizations help to assess results
  • Data sharing and collaboration tools
  • Capability to re-run previous jobs with new data
  • SAIGE analysis gives accurate results in case-control studies
  • Usage statistics comparing last 12-months to previous 12-months
  • Some highlights from user survey results

Manuscript support (30 minutes)

Albert Vernon Smith will present an overview for this section

  • How is the IRC supporting TOPMed manuscripts and discoveries, and how else can it?
  • Overall TOPMed landmark paper
  • Analysis of telomere length
  • Mitochondrial DNA copy number
  • Lipids analysis using TOPMed imputed genotypes
  • UK Biobank with TOPMed imputation
  • Context specific mutation rates
  • Data sharing with Centers for Common Disease Genetics

Interactions with outside groups (30 minutes)

Albert Vernon Smith to take the lead on this section

  • NIH Data Commons
  • NHLBI Data STAGE
  • NHGRI Centers for Common Disease Genetics
  • Global Alliance for Genomics and Health (GA4GH)
  • NIMH Parkinson's Disease Consortium

Break (20 minutes)

Future plans for the next Task Order (30 minutes)

Feedback from NHLBI (40 minutes)

Finish (3:50 pm)
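
Timing check (a rough sketch, not part of the official agenda): the short script below adds up the durations listed above to see whether they are consistent with an 8:30 am start and a 3:50 pm finish. The segment titles are abbreviated, the function names are invented for illustration, and it assumes the segments run back to back with no gaps.

```python
from datetime import datetime, timedelta

# (segment, minutes, is_break) copied from the draft agenda above.
# Titles are shortened; the is_break flag is an assumption used only
# to separate presentation time from breaks and lunch.
AGENDA = [
    ("Introduction and Overview", 30, False),
    ("Characteristics of variants in data freeze 6", 20, False),
    ("Calling structural variants", 20, False),
    ("BRAVO variant browser", 20, False),
    ("Break", 30, True),
    ("Summary of contract spending to date", 20, False),
    ("Latest TOPMed imputation panel", 20, False),
    ("Potential Population Genetics Update", 20, False),
    ("Cloud access to TOPMed sequence data", 20, False),
    ("Lunch", 70, True),
    ("ENCORE analysis server", 20, False),
    ("Manuscript support", 30, False),
    ("Interactions with outside groups", 30, False),
    ("Break", 20, True),
    ("Future plans for the next Task Order", 30, False),
    ("Feedback from NHLBI", 40, False),
]


def total_minutes(agenda):
    """Minutes for the whole day, including breaks and lunch."""
    return sum(minutes for _, minutes, _ in agenda)


def session_minutes(agenda):
    """Minutes for every segment that is not a break or lunch."""
    return sum(minutes for _, minutes, is_break in agenda if not is_break)


def finish_time(agenda, start="08:30"):
    """Clock time (24-hour HH:MM) at which the last segment ends."""
    begin = datetime.strptime(start, "%H:%M")
    return (begin + timedelta(minutes=total_minutes(agenda))).strftime("%H:%M")


if __name__ == "__main__":
    print("Total minutes:", total_minutes(AGENDA))
    print("Session minutes:", session_minutes(AGENDA))
    print("Finish:", finish_time(AGENDA))
```

Under these assumptions the day runs 440 minutes and ends at 15:50, which matches the stated 3:50 pm finish. The non-break segments come to 320 minutes, a little under the "perhaps 5 1/2 hours" estimate.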