TOPMed Site Visit 2018

From Genome Analysis Wiki
Revision as of 11:43, 4 September 2018 by Tblackw (talk | contribs)

One day: Thursday, September 13, 2018, 8:30 am - 4:00 pm ?

Perhaps 5 1/2 total hours of presentations.

Draft Agenda

Introduction and Overview of TOPMed data resources and services (30 minutes)

Goncalo Abecasis will present an overview of IRC activities

  • Introduce IRC personnel and their expertise
  • Sequence for 130,000+ participants
  • Variant calls and genotypes, phased and unphased
  • Structural variant calls in progress
  • BRAVO variant browser
  • ENCORE analysis server
  • TOPMed imputation reference panel
  • Main developments in the past year, including security improvements, Manual, etc.

Characteristics of variants in data freeze 6 (20 minutes)

Hyun Min Kang and Jonathan LeFaive will present this section

  • Overall numbers
  • Differences by ancestry, study and sequencing center
  • Allele frequencies of deleterious variants
  • Genotype accuracy
  • Compare harmonized versus sequencing center mappings
  • Process for variant calling, genotyping, filtering, phasing and distribution
  • Process for interim 'snapshot' genotypes

Calling structural variants (20 minutes)

Our colleagues at Baylor will present this section. William Salerno?

  • Procedures and plans for structural variant calling
  • Benefits of ensemble approach
  • Initial results
  • Data access mechanisms
  • Anticipated data size
  • Coordination with SNPs and indels

BRAVO variant browser helps to assess variant quality (20 minutes)

Daniel Taliun will present this section

  • Purpose
  • Which studies are included
  • Usage / main features
  • Both rare and common variants
  • Improvements in the last year
  • Access via an application programming interface (API)
  • View all information used in filtering
  • Coordination with gnomAD
  • Potential plans for PheWeb integration

Break (30 minutes)

Summary of contract spending to date (20 minutes)

Denise Bianchi and Goncalo Abecasis will present this section

  • Broad subdivisions: personnel, cloud storage, cloud computing, hardware
  • Divided between Task 1 and Task 2

Improved results from the latest TOPMed imputation panel (20 minutes)

Goncalo Abecasis will present this, unless Lukas Forer is available. Ketian Yu to prepare summaries of imputation quality.

(slides)

  • Principle of operation
  • Measuring imputation quality in different populations
  • Pushing the low frequency boundary
  • Improved accuracy for African American and Latino samples
  • Challenges and opportunities from collaboration and integration with NIH Data Commons / NHLBI Data STAGE

Potential Population Genetics Update (20 minutes)

Check with Sebastian Zoellner

Cloud access to TOPMed sequence data (20 minutes)

Tom Blackwell to take the lead on this section

(slides)

  • NCBI's 'Fusera' controlled access mechanism
  • User perspective -- involves a Google or Amazon billing project
  • What is needed for users to have a great overall experience?
  • Education and training for users

Lunch (70 minutes)

ENCORE analysis server (20 minutes)

Matthew Flickinger will present this section

  • Principle of operation
  • Releases only aggregate data summaries
  • Visualizations help to assess results
  • Data sharing and collaboration tools
  • Capability to re-run previous jobs with new data
  • SAIGE analysis gives accurate results in case-control studies
  • Usage statistics comparing the last 12 months with the previous 12 months
  • Some highlights from user survey results

Manuscript support (30 minutes)

Albert Vernon Smith will present an overview of this section

  • How can the IRC support, and how is it already supporting, TOPMed manuscripts and discoveries?
  • Overall TOPMed landmark paper
  • Analysis of telomere length
  • Mitochondrial DNA copy number
  • Lipids analysis using TOPMed imputed genotypes
  • UK Biobank with TOPMed imputation
  • Context-specific mutation rates
  • Data sharing with Centers for Common Disease Genetics

Interactions with outside groups (30 minutes)

Albert Vernon Smith to take the lead on this section

  • NIH Data Commons
  • NHLBI Data STAGE
  • NHGRI Centers for Common Disease Genetics
  • Global Alliance for Genomics and Health (GA4GH)
  • NIMH Parkinson's Disease Consortium

Break (20 minutes)

Future plans for the next Task Order (30 minutes)

Feedback from NHLBI (40 minutes)

Finish (3:50 pm)
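
Timing check: the slot lengths above can be added up to confirm the finish time. The sketch below is only a convenience for revising the draft, not part of the agenda itself. The shortened section labels, the `AGENDA` list and the helper names are made up for illustration, and the 8:30 am start is taken from the tentative time range at the top of the page.

```python
from datetime import datetime, timedelta

# (label, minutes, is_presentation) for each slot, in agenda order.
# Labels are shortened; durations are copied from the draft agenda.
AGENDA = [
    ("Introduction and overview", 30, True),
    ("Variants in data freeze 6", 20, True),
    ("Calling structural variants", 20, True),
    ("BRAVO variant browser", 20, True),
    ("Break", 30, False),
    ("Contract spending", 20, True),
    ("TOPMed imputation panel", 20, True),
    ("Population genetics update", 20, True),
    ("Cloud access to sequence data", 20, True),
    ("Lunch", 70, False),
    ("ENCORE analysis server", 20, True),
    ("Manuscript support", 30, True),
    ("Interactions with outside groups", 30, True),
    ("Break", 20, False),
    ("Future plans for next Task Order", 30, True),
    ("Feedback from NHLBI", 40, True),
]


def total_minutes(agenda):
    """Sum of all slots, including breaks and lunch."""
    return sum(minutes for _, minutes, _ in agenda)


def presentation_minutes(agenda):
    """Sum of the slots that are not breaks or lunch."""
    return sum(minutes for _, minutes, is_talk in agenda if is_talk)


# Assumed start time: 8:30 am on Thursday, September 13, 2018.
start = datetime(2018, 9, 13, 8, 30)
finish = start + timedelta(minutes=total_minutes(AGENDA))

print("Finish:", finish.strftime("%H:%M"))
print("Presentation hours:", presentation_minutes(AGENDA) / 60)
```

If a slot is lengthened or a section is added, rerunning the check shows whether the day still ends by 4:00 pm.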