Jupyter Notebook Templates for Data Science: Plotting

The planet Jupiter as seen by the Juno spacecraft.

I recently released my Jupyter Notebook Template Library. Its goal is to accelerate your data science projects so that you don't have to spend hours poring over old notebooks to find handy code snippets. In this post I dive into the plotting notebook to show you what it can do.

The Plotting Notebook

Visualizing your data is a critical step in understanding it, and so it is appropriate that the first notebook in the library helps with making beautiful plots.

The notebook begins with boilerplate code that defines metadata for the resulting files and also changes some defaults, such as the figure size and resolution, font size, and legend frame. After that there are a few helpful functions which I will discuss below.
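To give a flavor of what changing those defaults can look like, here is a minimal sketch using matplotlib's rcParams. The specific values are assumptions chosen for illustration, not necessarily the ones the notebook uses.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit inside Jupyter
import matplotlib.pyplot as plt

# All values below are illustrative assumptions, not the notebook's actual settings.
plt.rcParams.update({
    "figure.figsize": (8, 5),  # width and height in inches
    "figure.dpi": 100,         # on-screen resolution
    "savefig.dpi": 300,        # resolution used when saving raster files
    "font.size": 12,           # base font size in points
    "legend.frameon": False,   # no box drawn around the legend
})

# Figures created after the update pick up the new defaults.
fig, ax = plt.subplots()
```

Setting these once at the top of a notebook keeps every later plot consistent without repeating the arguments in each call.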

Draw Bands

One of my favorite functions is draw_bands(). It draws a set of alternating colored bands on the background of the plot based on the axis tick locations.

When called with just the axis, like draw_bands(ax), it produces this:

A plot showing the default grey bands.

But you can also customize the color and opacity using draw_bands(ax, color="orange", alpha=0.05), which produces:

A plot showing the orange bands.
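If you are curious how a function like this might work, here is a minimal sketch. It is not the library's actual implementation: the name draw_bands_sketch, the default color and alpha, and the choice to shade every other interval between X-axis ticks are all assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for running outside Jupyter; omit in a notebook
import matplotlib.pyplot as plt


def draw_bands_sketch(ax, color="grey", alpha=0.1):
    """Shade every other interval between x-axis ticks (hypothetical stand-in)."""
    xlim = ax.get_xlim()     # remember the view so the shading cannot widen it
    ticks = ax.get_xticks()  # tick locations chosen by the axis locator
    bands = []
    # Pair ticks as (t0, t1), (t2, t3), ... so shaded and unshaded intervals alternate.
    for left, right in zip(ticks[::2], ticks[1::2]):
        bands.append(ax.axvspan(left, right, color=color, alpha=alpha, zorder=0))
    ax.set_xlim(xlim)        # restore the original view limits
    return bands


fig, ax = plt.subplots()
ax.plot(range(10), [v ** 2 for v in range(10)])
bands = draw_bands_sketch(ax, color="orange", alpha=0.05)
```

Note that restoring the limits with set_xlim also turns off autoscaling on the X-axis, so in this sketch the bands should be drawn after the data has been plotted.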

These bands are a subtle way of indicating where on the X-axis a point lies, which is especially useful when plotting a time series. I use them often. Here are some examples:

Draw Legends

I like minimal but informative legends. Color alone is often enough to differentiate lines or points, so I wrote a function called draw_colored_legend() that changes the color of the legend text to match the line. It produces a legend like the one on this plot:

A plot showing my colored legend.
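The same idea can be sketched in a few lines of matplotlib. This is a hypothetical stand-in, not the library's code: the name colored_legend_sketch and the way it reads each artist's color are assumptions, and it only handles lines and scatter plots with a single color.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for running outside Jupyter; omit in a notebook
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba


def colored_legend_sketch(ax):
    """Draw a frameless legend whose labels take the color of their artists."""
    handles, labels = ax.get_legend_handles_labels()
    legend = ax.legend(handles, labels, frameon=False)
    for handle, text in zip(handles, legend.get_texts()):
        if hasattr(handle, "get_color"):
            color = handle.get_color()          # lines report a single color
        else:
            color = handle.get_facecolor()[0]   # scatter: first face color (RGBA row)
        text.set_color(to_rgba(color))
    return legend


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], color="tab:blue", label="First dataset")
ax.scatter([0, 1, 2], [4, 1, 0], color="tab:orange", label="Second dataset")
legend = colored_legend_sketch(ax)
```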

This legend style can be seen in these posts:

Putting It Together

The plotting notebook enables you to make beautiful plots quickly and easily. For example, take this plot:

An example plot from the notebook library

It was produced by this short code snippet:

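import numpy as np

# setup_plot, draw_colored_legend, draw_bands, and save_plot are defined in the plotting notebook.
fig, ax = setup_plot(
    title="Title",
    xlabel="X-axis",
    ylabel="Y-axis",
)

# Two overlapping clouds of random points, offset along the X-axis.
ax.scatter(np.random.rand(500) - 0.65, np.random.rand(500), label="First dataset")
ax.scatter(np.random.rand(500) - 0.35, np.random.rand(500), label="Second dataset")

draw_colored_legend(ax)

draw_bands(ax)

save_plot(fig, "/tmp/output.svg")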
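The snippet relies on helpers that live in the notebook. If you want to experiment outside it, the setup and save steps could be approximated with stand-ins like the ones below. These are assumptions for illustration only: the names setup_plot_sketch and save_plot_sketch are hypothetical, and the real helpers may do more, such as attaching the metadata mentioned earlier.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend for running outside Jupyter; omit in a notebook
import matplotlib.pyplot as plt


def setup_plot_sketch(title="", xlabel="", ylabel=""):
    """Hypothetical stand-in: create a figure and label its single axes."""
    fig, ax = plt.subplots()
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    return fig, ax


def save_plot_sketch(fig, path):
    """Hypothetical stand-in: write the figure to disk; the format follows the extension."""
    fig.savefig(path, bbox_inches="tight")


fig, ax = setup_plot_sketch(title="Title", xlabel="X-axis", ylabel="Y-axis")
ax.plot([0, 1, 2], [0, 1, 4])

# Save into the system temp directory so the example works on any platform.
out_path = os.path.join(tempfile.gettempdir(), "sketch_output.svg")
save_plot_sketch(fig, out_path)
```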

If the notebook template library is useful to you, be sure to let me know on Twitter or GitHub. Your feedback helps make the project better for everyone!