I’ve been using and teaching Python for over a decade, starting in 2007 as a teaching assistant at The Summer Science Program. While I’m most comfortable in the Pandas/SciPy/matplotlib stack for general analysis and in Pysam for close inspection of NGS reads, I have experience up and down the abstraction ladder. Lately I’ve been doing some work in Flask to help productionize exploratory tools that I’ve prototyped in Jupyter notebooks.
I honed my programming by teaching the QB3 Python Bootcamp each summer from 2010 to 2013. Aimed at molecular biology graduate students and postdocs (and in one case, a professor) with no prior programming experience, this two-week course taught the fundamentals of Python as well as the NumPy/SciPy stack. The course is still offered by the Berkeley Center for Computational Biology, in what appears to me to be a condensed but little-changed form.
Snakemake is one of the tools I’ve found most revolutionary for quickly building reliable, maintainable computational pipelines. By combining expressive rules, auditability, and a built-in connection to cloud and cluster computing resources, it has helped me design and refine analyses that run on hundreds of distinct samples.
Once the data has been processed, cleaning and visualization are crucial for communicating the key ideas to stakeholders at all levels. I am comfortable using visualization tools at every level of abstraction, from high-level Plotly charts for rapid exploratory analysis, to matplotlib graphs customized to fit a consistent graphical style, to custom SVG plotting tools that I build when existing tools just don’t have the features I need.
Feel free to check out my GitHub. Many of the packages there are thoroughly academic in quality: worked on mostly by myself, with a premium on getting the results I need. However, examples of better-engineered tools are Hornet (a reimplementation of the WASP allele-specific mapping pipeline optimized for memory usage and speed) and Phased Pileup (a tool written to visualize allele-specific mapping).
In addition, I have broad experience in other programming paradigms, including low-level string twiddling in C, statistical analysis in R, and instrument control in LabVIEW and MATLAB.