Learning Resources for Beginners
Resources for Getting Started Building Your Data Science Skillset
Here is a list of resources designed to help you learn data science fundamentals: programming languages, common tooling, and general concepts. You can find material to study from and groups to study with.
The list will be updated on an ongoing basis with suggestions from the community. Please help your peers learn by emailing your recommendations for this list!
Scripting and Programming Languages
- Software Carpentry lessons – tutorials on shell, Python, R, Git, etc.
Python
- Intro to Python
- Introduction to Programming (with Python) – a webinar from NIAID
R / RStudio
Bash / Shell Scripting / UNIX/Linux Command Line
Git/GitHub
Data Visualization
- Matplotlib in-depth user guides: beginner, intermediate, and advanced sections, plus sections covering specific topics
Study Groups and Special Interest Groups
- Cloud 4 Bio, led by DCEG’s Jonas de Almeida – weekly hackathon on Cloud Services and Web Applications for Cancer Research
- NIH Data Science Slack group
- Bioinformatics Training and Education Program message board
HPC and Batch Computing
- How to use NIH’s Biowulf Compute Cluster – a self-paced course
General Tutorials and Overviews
- Biostars “Bioinformatics Explained”
- CBIIT Cancer Data Science Seminar Series
- Dataquest.io – interactive tutorials
- Towards Data Science – tutorials and overviews
We now have licenses available for the Biostar Handbook and Dataquest.io!
NIH Listservs
Note: You must register for an account before subscribing to these listservs.