TOPMed Site Visit 2018

From Genome Analysis Wiki
Revision as of 14:18, 29 August 2018 by Tblackw

One day: Thursday, September 13, 2018, 8:30 am - 3:00 pm ?

Perhaps 4 1/3 total hours of presentations, if lunch is brought in.

(If you think I'm a bit optimistic on the timings, I'd have to agree with you.)

Draft Agenda

Introduction and Overview of TOPMed data resources and services (30 minutes)

Goncalo Abecasis will present an overview of IRC activities

  • Introduce IRC personnel and their expertise
  • Sequence for 130,000+ participants
  • Variant calls and genotypes, phased and unphased
  • Structural variant calls in progress
  • BRAVO variant browser
  • ENCORE analysis server
  • TOPMed imputation reference panel
  • Main developments in the past year, including security improvements, Manual, etc.

Characteristics of variants in data freeze 6 (20 minutes)

Hyun Min Kang and Jonathan LeFaive will present this section

  • Overall numbers
  • Differences by ancestry, study and sequencing center
  • Allele frequencies of deleterious variants
  • Genotype accuracy
  • Compare harmonized versus sequencing center mappings
  • Process for variant calling, genotyping, filtering, phasing and distribution
  • Process for interim 'snapshot' genotypes

Calling structural variants (20 minutes)

Our colleagues at Baylor will present this section. William Salerno?

  • Procedures and plans for structural variant calling
  • Benefits of ensemble approach
  • Initial results
  • Data access mechanisms
  • Anticipated data size
  • Coordination with SNPs and indels

BRAVO variant browser helps to assess variant quality (20 minutes)

Daniel Taliun will present this section

  • Purpose
  • Which studies are included
  • Usage / main features
  • Both rare and common variants
  • Improvements in the last year
  • Access via an application programming interface (API)
  • View all information used in filtering
  • Coordination with gnomAD
  • Potential plans for PheWeb integration

Break (30 minutes)

ENCORE analysis server (15 minutes)

Matthew Flickinger will present this section

  • Principle of operation
  • Releases only aggregate data summaries
  • Visualizations help to assess results
  • Data sharing and collaboration tools
  • Capability to re-run previous jobs with new data
  • SAIGE analysis gives accurate results in case-control studies
  • Usage statistics comparing the last 12 months with the previous 12 months
  • Some highlights from user survey results

Improved results from the latest TOPMed imputation panel (15 minutes)

Goncalo Abecasis will present this, unless Lukas Forer is available. Ketian to prepare summaries of imputation quality.

  • Principle of operation
  • Measuring imputation quality in different populations
  • Pushing the low frequency boundary
  • Improved accuracy for African American and Latino samples
  • Challenges and opportunities from collaboration and integration with NIH Data Commons / NHLBI Data STAGE

Potential Population Genetics Update (15 minutes)

Check with Sebastian Zoellner

Manuscript support (30 minutes)

Albert Vernon Smith will present an overview for this section

  • How is the IRC supporting TOPMed manuscripts and discoveries, and how else can it?
  • Overall TOPMed landmark paper
  • Analysis of telomere length
  • Mitochondrial DNA copy number
  • Lipids analysis using TOPMed imputed genotypes
  • UK Biobank with TOPMed imputation
  • Context-specific mutation rates
  • Data sharing with Centers for Common Disease Genetics

Lunch (45 minutes)

Interactions with outside groups (30 minutes)

Albert Vernon Smith to take the lead on this section

  • NIH Data Commons
  • NHLBI Data STAGE
  • NHGRI Centers for Common Disease Genetics
  • Global Alliance for Genomics and Health (GA4GH)
  • NIMH Parkinson's Disease Consortium

Cloud access to TOPMed sequence data (15 minutes)

Tom Blackwell to take the lead on this section

  • NCBI's 'Fusera' controlled access mechanism
  • User perspective -- involves a Google or Amazon billing project
  • What is needed for users to have a great overall experience?
  • Education and training for users

Feedback from NHLBI (60 minutes)