Overview:
This three-day webinar on Docker is split into three sessions that look at using containers to reproducibly interrogate multiple biobanks and other data enclaves. The target audience is end users in biotech, covering pharma, diagnostics and data roles.
The first session on Monday 13 November 2023 will look at using Docker containers reproducibly in multiple computing environments. Aggregating and comparing results across biobanks reproducibly requires reproducible software environments such as Docker containers. In this webinar, we’ll review basic Docker concepts and show examples of using existing Docker containers in both high performance computing (HPC) and cloud computing environments. We’ll learn tricks for using Docker image files effectively and specifying them within existing workflows.
The second session on Thursday 16 November 2023 will look in more depth at building and modifying containers for reproducible research. Building your own Docker containers for reproducible research involves modifying and rebuilding existing containers. In this webinar, we will cover the process of building your own container images using Dockerfiles. Once your image is built, it can be shared with others through container image libraries such as Docker Hub.
The third session on Tuesday 21 November 2023 will be a roundtable discussion on phenotype mining and aggregation. Panellists in this session will discuss practical and technical aspects of summary data aggregation after analysis in distinct data enclaves. We will be joined by Hernando Sanches from The Hyve, Dr Tiffany J. Callahan from IBM Research, Deepak Unni from Swiss Institute of Bioinformatics (SIB) and DNAnexus’ Ben Busby, who will be acting as a moderator.
Discussion prompts will include:
- Why should people care about data harmonization when they are integrating phenotypic datasets?
- What are ontologies and why are they important?
- What ontologies do you use in your work?
- How do you use those ontologies?
- Why are large public datasets important to you?
  - UKB and similar governmentally supported biobanks
  - UKB, NCBI, EBI, etc.
  - Health insurance data
- What new datasets are you excited about?
- What emerging models are you excited about?
Book your free place via the links below: