If you’re active in any Data Science communities—whether on Twitter, Reddit,
LinkedIn, Discord, or wherever else—you’ve probably already encountered a
member of the pie chart hate-brigade. In fact, in reading this post, there is a
good chance that you’re *already* a member of this group, rallying to heap
further criticism on your least favourite graphic!

Read More

# Blog

- In Defence of the Humble Pie Chart
- Posterior predictives for AR(p) models in Stan
- Matplotlib plots for LaTeX with PGF
- Speed-up numpy with Intel's Math Kernel Library (MKL)
- Matplotlib graphics for the metropolis beamer theme
- Matplotlib boxplots with custom percentiles
- Speed up Python code with Numpy: an example case
- Plot circular data with matplotlib
- Plot publication-quality figures with matplotlib and LaTeX

Autoregressive (AR) models are a popular type of statistical model, used to
describe processes that evolve through time. Often, then, a statistician is
interested in fitting such a model to real data, with the intention of using the
fitted model to make predictions about the future.
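
To make the idea concrete, here is a minimal sketch of simulating an AR(p) process using only the Python standard library. The function name `simulate_ar`, its parameters, and the choice to start the process at zero are assumptions made for illustration; this is not the Stan model from the post itself.

```python
import random


def simulate_ar(phi, sigma, n_steps, seed=0):
    """Simulate y_t = sum_i phi[i] * y_{t-1-i} + eps_t, with eps_t ~ Normal(0, sigma)."""
    rng = random.Random(seed)
    p = len(phi)
    y = [0.0] * p  # illustrative assumption: the process starts at zero
    for _ in range(n_steps):
        # Most recent value first, so phi[0] multiplies y_{t-1}.
        lagged = y[-1:-p - 1:-1]
        mean = sum(coef * value for coef, value in zip(phi, lagged))
        y.append(mean + rng.gauss(0.0, sigma))
    return y[p:]  # drop the artificial starting values


# An AR(2) process with hypothetical coefficients.
series = simulate_ar(phi=[0.5, -0.2], sigma=1.0, n_steps=200)
```

Each new value depends on the previous `p` values plus fresh noise, which is what makes the fitted model usable for forecasting.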

Read More

Matplotlib’s pgf backend is pretty great, allowing plots to be exported directly
from Python as pgf drawing commands. These drawing commands can be inserted
directly into a LaTeX `.tex` document, so the generated plot is rendered at
compile time. Embedding plots in a LaTeX document this way is a quick and easy
way to ensure that the fonts in your document body and your plots match.

Read More

The NumPy package is at the core of scientific computing in Python. It is the
go-to tool for implementing any numerically intensive task. The popular pandas
package is also built on top of NumPy’s capabilities.

Read More

Beamer is a great tool to make presentations with, and is *indispensable* to those who need to typeset mathematics within their slides. Beamer is actually just a LaTeX document class, so its syntax and setup are familiar to those who have experience working with TeX and friends.

Read More

This post was inspired by a question I answered on Stack Overflow. In the question, a user asked whether it was possible to make a boxplot with box boundaries at arbitrary percentiles using matplotlib. Of course, with matplotlib anything is possible, and so I set to work…

Read More

In this post I shall introduce the definition of the effective sample size (ESS) as given by Gelman *et al.* in their book Bayesian Data Analysis (third edition). Afterwards I shall review PyMC’s computation of the ESS. PyMC’s implementation provides a perfect example case of how we can speed up code with NumPy. I show how we can do so and compute the ESS over *500x faster* than PyMC. I’ve posted the full example code and speed comparison used in this post here.

Read More

Circular data arises very naturally in many different situations. Meteorologists regularly encounter directional data when considering wind directions, ecologists may come across angular data when looking at the directions of motion of animals, and we all come into contact with at least one type of circular data every day: the time.
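
As a small illustration of why circular data needs special treatment, the sketch below averages two times of day on a 24-hour clock. The helper `circular_mean_hours` is a hypothetical name introduced here; it is not taken from the post or from any particular library.

```python
import math


def circular_mean_hours(hours):
    """Mean time of day on a 24-hour clock, computed on the unit circle."""
    angles = [2 * math.pi * h / 24 for h in hours]
    # Average the unit vectors rather than the raw numbers.
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    mean_angle = math.atan2(y, x) % (2 * math.pi)
    return 24 * mean_angle / (2 * math.pi)


times = [22.0, 3.0]                     # 22:00 and 03:00
naive = sum(times) / len(times)         # 12.5, i.e. early afternoon
circular = circular_mean_hours(times)   # approximately 0.5, i.e. 00:30
```

The arithmetic mean lands on the wrong side of the clock, whereas averaging on the circle gives a time between the two observations.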

Read More

Figures are an incredibly important aspect of effectively communicating research and
ideas. Bad figures are bad communicators: difficult to understand and interpret. They rear
their ugly heads only to nauseate the reader and detract from the accompanying text. Good
plots, however, are clear and concise. They seamlessly blend with their accompanying text
and complement its narrative. Well-executed figures should leave our readers
informed, soothed, and certainly *not* nauseated.

Read More