The presentation and video recording are now available.
Overview:
Project organization is key to communicating data science projects and making them reproducible. Dr. Fear will offer guidelines from his personal experience, including 10 best practices, examples of do’s and don’ts, and useful tools of the trade to get you started!
Topics: 10 Best Practices for Organizing Data Science Projects
- Use the same structure and names across projects
- Separate original data, generated data, and scripts
- Use workflows to orchestrate
- Split out configuration for consistency
- Modularize reusable code
- Use a style guide and linters
- Use containers and environments
- Document as you go
- Document as you go
- Document as you go!
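
As a rough illustration of the first few practices above (a shared structure, separate original and generated data, and split-out configuration), here is a minimal Python sketch. The directory names, the `project.json` file, and the `create_project` and `load_config` helpers are all hypothetical choices made for this example; they are not taken from the presentation.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical layout: these directory names are illustrative assumptions,
# not a structure prescribed by the talk.
LAYOUT = ("data/raw", "data/processed", "scripts", "config", "docs")


def create_project(root, layout=LAYOUT):
    """Create the same directory skeleton under any project root."""
    root = Path(root)
    created = []
    for rel in layout:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)

    # Keep settings in one file so scripts do not hard-code paths.
    config_file = root / "config" / "project.json"
    if not config_file.exists():
        settings = {"raw_dir": "data/raw", "processed_dir": "data/processed"}
        config_file.write_text(json.dumps(settings, indent=2))

    # Start the documentation together with the project.
    readme = root / "docs" / "README.md"
    if not readme.exists():
        readme.write_text(f"# {root.name}\n")
    return created


def load_config(root):
    """Read the shared settings that every script in the project uses."""
    return json.loads((Path(root) / "config" / "project.json").read_text())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "example_project"
        create_project(project)
        for p in sorted(project.rglob("*")):
            print(p.relative_to(project).as_posix())
```

Running the same helper for every new project gives each one identical names, and keeping original data in its own directory makes it harder to overwrite by accident.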
Date: Thursday, December 12, 2019
Time: 9:00-10:00 a.m.
Location: NCI Shady Grove, Seminar Room 406
Instructor: Justin Fear, PhD, Postdoctoral Researcher at the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).
Questions? Contact the NCI Data Science Learning Exchange.