TOPMed Site Visit 2018
One day: Thursday, September 13, 2018, 8:30 am - 4:00 pm ?
Perhaps 5 1/2 total hours of presentations.
Draft Agenda
Introduction and Overview of TOPMed data resources and services (30 minutes)
Goncalo Abecasis will present an overview of IRC activities
- Introduce IRC personnel and their expertise
- Sequence data for 130,000+ participants
- Variant calls and genotypes, phased and unphased
- Structural variant calls in progress
- BRAVO variant browser
- ENCORE analysis server
- TOPMed imputation reference panel
- Main developments in the past year, including security improvements, Manual, etc.
Characteristics of variants in data freeze 6 (20 minutes)
Hyun Min Kang and Jonathan LeFaive will present this section
- Overall numbers
- Differences by ancestry, study and sequencing center
- Allele frequencies of deleterious variants
- Genotype accuracy
- Compare harmonized versus sequencing center mappings
- Process for variant calling, genotyping, filtering, phasing and distribution
- Process for interim 'snapshot' genotypes
Calling structural variants (20 minutes)
Our colleagues at Baylor will present this section. William Salerno?
- Procedures and plans for structural variant calling
- Benefits of ensemble approach
- Initial results
- Data access mechanisms
- Anticipated data size
- Coordination with SNPs and indels
BRAVO variant browser helps to assess variant quality (20 minutes)
Daniel Taliun will present this section
- Purpose
- Which studies are included
- Usage / main features
- Both rare and common variants
- Improvements in the last year
- Access via an application programming interface (API)
- View all information used in filtering
- Coordination with gnomAD
- Potential plans for PheWeb integration
Break (30 minutes)
Summary of contract spending to date (20 minutes)
Denise Bianchi and Goncalo Abecasis will present this section
- Broad subdivisions: personnel, cloud storage, cloud computing, hardware
- Divided between Task 1 and Task 2
Improved results from the latest TOPMed imputation panel (20 minutes)
Goncalo Abecasis will present this, unless Lukas Forer is available. Ketian Yu to prepare summaries of imputation quality
- Principle of operation
- Measuring imputation quality in different populations
- Pushing the low frequency boundary
- Improved accuracy for African American and Latino samples
- Challenges and opportunities from collaboration and integration with NIH Data Commons / NHLBI Data STAGE
Potential Population Genetics Update (20 minutes)
Check with Sebastian Zoellner
Cloud access to TOPMed sequence data (20 minutes)
Tom Blackwell to take the lead on this section
- NCBI's 'Fusera' controlled access mechanism
- User perspective -- involves a Google or Amazon billing project
- What is needed for users to have a great overall experience?
- Education and training for users
Lunch (70 minutes)
ENCORE analysis server (20 minutes)
Matthew Flickinger will present this section
- Principle of operation
- Releases only aggregate data summaries
- Visualizations help to assess results
- Data sharing and collaboration tools
- Capability to re-run previous jobs with new data
- SAIGE analysis gives accurate results in case-control studies
- Usage statistics comparing the last 12 months to the previous 12 months
- Some highlights from user survey results
Manuscript support (30 minutes)
Albert Vernon Smith will present an overview for this section
- How can the IRC support, and how is it already supporting, TOPMed manuscripts and discoveries?
- Overall TOPMed landmark paper
- Analysis of telomere length
- Mitochondrial DNA copy number
- Lipids analysis using TOPMed imputed genotypes
- UK Biobank with TOPMed imputation
- Context-specific mutation rates
- Data sharing with Centers for Common Disease Genetics
Interactions with outside groups (30 minutes)
Albert Vernon Smith to take the lead on this section
- NIH Data Commons
- NHLBI Data STAGE
- NHGRI Centers for Common Disease Genetics
- Global Alliance for Genomics and Health (GA4GH)
- NIMH Parkinson's Disease Consortium