
Topic: Reproducible FAIR+ workflows and the CCWL
 

Dr. Pjotr Prins
Title: Assistant Professor
Organization: University of Tennessee Health Science Center

Arun Isaac
Title: PhD Student
Organization: University of Tennessee Health Science Center

Abstract: The FAIR principles focus on data and do not account for reproducible, on-demand workflows. In this talk, we will explore FAIR+ (Findable, Accessible, Interoperable, Reusable, and Computable) in the context of GeneNetwork.org, one of the oldest web resources in bioinformatics. With GeneNetwork we are realizing reproducible software deployment, building on free and open-source software including GNU Guix and containers. We are also building scalable workflows that are triggered on demand to run in the cloud or on bare metal, and we have built our own HPC system to run GNU Guix-based pangenomics. We will present our infrastructure, including a prototype COVID-19 cloud setup, and give a hands-on introduction to GNU Guix and the Concise CWL (CCWL), a CWL generator whose workflows look like shell scripts but can be reasoned about and are far more portable.

The Common Workflow Language (CWL) is an open standard for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high-performance computing (HPC) environments.
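To make the idea of a tool description concrete, here is a minimal sketch of a CWL CommandLineTool that wraps the `echo` command, built as a Python dictionary and serialized to JSON (CWL documents are usually written in YAML, and JSON is also accepted). The function name `echo_tool`, the input name `message`, the output name `printed`, and the file name `output.txt` are illustrative assumptions and do not come from the talk.

```python
import json


def echo_tool() -> dict:
    """Return a minimal CWL v1.2 CommandLineTool description for `echo`.

    All identifiers (message, printed, output.txt) are hypothetical
    names chosen for this illustration.
    """
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        # The program to run.
        "baseCommand": "echo",
        # One string input, passed as the first positional argument.
        "inputs": {
            "message": {
                "type": "string",
                "inputBinding": {"position": 1},
            }
        },
        # Capture standard output into a file and expose it as an output.
        "stdout": "output.txt",
        "outputs": {
            "printed": {"type": "stdout"},
        },
    }


if __name__ == "__main__":
    # Print the description; the result can be saved to a file such as
    # echo.cwl and handed to a CWL runner.
    print(json.dumps(echo_tool(), indent=2))
```

Because the description only declares inputs, outputs, and the command to run, the same file can in principle be executed by any conforming CWL runner, which is where the portability described above comes from.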

Guix is an advanced distribution of the GNU operating system, developed by the GNU Project, that respects the freedom of computer users. Guix supports transactional upgrades and roll-backs, unprivileged package management, and more. When used as a standalone distribution, Guix supports declarative system configuration for transparent and reproducible operating systems.

The Concise Common Workflow Language (CCWL) is a concise syntax to express CWL workflows. It is implemented as an Embedded Domain Specific Language (EDSL) in the Scheme programming language, a minimalist dialect of the Lisp family of programming languages.

Resources:
https://genenetwork.org/
https://covid19.genenetwork.org/ FAIR+ workflows
https://hpc.guix.info/blog/2019/01/creating-a-reproducible-workflow-with-cwl/
https://guix.gnu.org/
https://git.systemreboot.net/ccwl/tree/README.org 

Slides

Created by Alan Zheng Last Modified Tue October 12, 2021 12:00 pm by Alan Zheng