Bioinformatics Resources
Resources for Bioinformatics Software Development & Data Analysis
I found myself sending some of the same links over and over again to people who asked questions related to bioinformatics. So it was time to compile all the links in one convenient place!
All of the resources linked below are free unless otherwise noted. This isn’t intended to be an exhaustive list of all the resources available, just some of the ones I have come across and have found useful.
Last updated: 2019-05-22
Programming
- Software Carpentry: Intro lessons on the Unix shell, git, R, & Python.
- Langmead Lab teaching materials: cover classic bioinformatics algorithms.
- Advent of Code: small programming puzzles.
- Stepik Bioinformatics Contest.
Python
- Project Rosalind: learn Python & practice solving bioinformatics problems.
- GWC Code demos: introductory Python demos - Girls Who Code @ UM-DCMB
- GWC Challenge Questions: practice solving problems - Girls Who Code @ UM-DCMB
- Python For Everybody course on Coursera (free for UMich students) - Charles Severance
- Object-Oriented Programming (OOP) in Python tutorial - RealPython
- Books:
  - Automate the Boring Stuff with Python - Al Sweigart
  - Think Python: How to Think Like a Computer Scientist - Allen Downey
  - Dive Into Python 3 - Mark Pilgrim
  - Object-Oriented Programming in Python - University of Cape Town
R
- Riffomonas minimalR: Intro to R tutorial with applications in microbiology - Pat Schloss
- What they forgot to teach you about R - Jenny Bryan & Jim Hester
- Happy Git and GitHub for the useR - Jenny Bryan & Jim Hester
- Books:
  - R for Data Science - Hadley Wickham
  - Mastering Software Development in R - Roger Peng, Sean Kross, & Brooke Anderson
  - Advanced R - Hadley Wickham
  - R Packages - Hadley Wickham
Reproducibility
- Riffomonas reproducible research tutorial - Pat Schloss
- Snakemake: Python-based workflow management system.
- conda: package, dependency, & environment manager.
- git: version control system.
  - Also take a look at the Software Carpentry lesson on git.
Project organization
- Noble WS. A quick guide to organizing computational biology projects. PLoS Comput Biol. 2009 Jul;5(7):e1000424. doi: 10.1371/journal.pcbi.1000424.
- Scientific project template.
- cookiecutter project templating tool.
Literate programming
R Markdown
- How I use R Markdown to document my bioinformatics analyses - Rachael Lappan
- RMarkdown for writing reproducible scientific papers - Mike Frank & Chris Hartgerink
- R Markdown: The Definitive Guide - Yihui Xie, J. J. Allaire, Garrett Grolemund
Jupyter
- Jupyter Notebooks for Performing and Sharing Bioinformatics Analyses - Jonathan Dursi
- JupyterLab Documentation
Documentation
- Lee BD (2018) Ten simple rules for documenting scientific software. PLoS Comput Biol 14(12): e1006561. doi: 10.1371/journal.pcbi.1006561.
- Sphinx for creating documentation.
- Read the Docs for hosting documentation.
- Writing R package documentation.
- pkgdown: build a website for your R package.
Misc. Tools
For git
- Link your university email to GitHub to get pro/education features.
- All users (Pro or free) get free unlimited private repositories on GitHub.
- GitKraken has a nice GUI for interacting with git, GitHub, GitLab, etc. (Note that this is a referral link to be entered to win a Nintendo Switch.)
Editors
- Atom: text editor. Additional packages for Atom:
- PyCharm: IDE for Python.
  - The community edition is free, or link your university email to get the pro version for free.
  - Supports Snakemake syntax highlighting & Jupyter notebooks.
- RStudio: IDE for R.
- Kite: AI autocomplete for Python. Works in Atom, PyCharm, Vim, & more.
etc.
- docopt: easily create & parse command-line interfaces. Available for Python, R, C++, & more.
- csvkit: command-line tool for working with and converting to CSV format from Excel, JSON, etc.
- Hypothesis: property-based testing library for Python.
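
To give a flavor of the kind of testing Hypothesis is built for, here is a rough sketch of a property-based check on a small bioinformatics function. Note that it does not use Hypothesis itself: it swaps in Python's standard `random` module so it runs with no installs, and the names `reverse_complement`, `random_dna`, and `check_involution` are made up for illustration. The real library generates inputs far more cleverly and shrinks failing cases for you.

```python
import random

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def random_dna(rng: random.Random, max_len: int = 50) -> str:
    """Generate a random DNA string of length 0..max_len (hypothetical helper)."""
    length = rng.randint(0, max_len)
    return "".join(rng.choice("ACGT") for _ in range(length))


def check_involution(trials: int = 200, seed: int = 0) -> int:
    """Check the property revcomp(revcomp(s)) == s on many random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        seq = random_dna(rng)
        # On failure, the offending sequence is shown in the assertion message.
        assert reverse_complement(reverse_complement(seq)) == seq, seq
    return trials


if __name__ == "__main__":
    print(f"{check_involution()} random sequences passed")
```

Instead of hand-picking a few example sequences, you state a rule that should hold for every input and let the machine hunt for a counterexample.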