Text and Annotation
Creating a good visualization involves guiding the reader so that the figure tells a story. In some cases, this story can be told in an entirely visual manner, without the need for added text, but in others, small textual cues and labels are necessary. Perhaps the most basic types of annotations you will use are axis labels and titles, but the options go beyond this. Let's take a look at some data and how we might visualize and annotate it to help convey interesting information. We'll start by setting up the notebook for plotting and importing the functions we will use:
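The setup cell did not survive in this copy of the text, so the following is only a plausible reconstruction. The whitegrid style name is an assumption: its spelling differs between Matplotlib releases, so the sketch tries both spellings and otherwise keeps the default style.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

# In a notebook, run the IPython magic `%matplotlib inline` before plotting.

# Assumed style: the seaborn-like whitegrid look. Newer Matplotlib releases
# renamed it, so use whichever spelling this installation provides.
for style in ("seaborn-v0_8-whitegrid", "seaborn-whitegrid"):
    if style in plt.style.available:
        plt.style.use(style)
        break
```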
Example: Effect of Holidays on US Births
Let's return to some data we worked with earlier, in Example: Birthrate Data, where we generated a plot of average births over the course of the calendar year. We'll start with the same cleaning procedure we used there, and plot the results (see the following figure):
When we're visualizing data like this, it is often useful to annotate certain features of the plot to draw the reader's attention. This can be done manually with the plt.text/ax.text functions, which will place text at a particular x/y value (see the following figure):
The ax.text method takes an x position, a y position, a string, and then optional keywords specifying the color, size, style, alignment, and other properties of the text. Here we used ha='right' and ha='center', where ha is short for horizontal alignment. See the docstrings of plt.text and mpl.text.Text for more information on the available options.
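As a minimal sketch of the call pattern, the following places three labels on a synthetic curve. The data, the label strings, and the positions are all made up for illustration and are not the births data; only the ax.text usage itself is the point.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for a daily series with one sharp dip (NOT the births data)
days = np.arange(1, 366)
values = 5000 - 400 * np.exp(-0.5 * ((days - 185) / 2.0) ** 2)

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(days, values)

# Shared text properties, passed as keyword arguments to every label
style = dict(size=10, color="gray")

# ax.text(x, y, string, **keywords): x and y are in data coordinates here
dip = ax.text(185, 4550, "Dip (day 185)", ha="center", **style)
start = ax.text(20, 5050, "Start of year", ha="left", **style)
end = ax.text(360, 5050, "End of year", ha="right", **style)

ax.set(xlabel="day of year", ylabel="value", ylim=(4500, 5100))
```

With ha='right', the text ends at the given x position; with ha='center', it is centered on it.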
Transforms and Text Position
In the previous example, we anchored our text annotations to data locations. Sometimes it's preferable to anchor the text to a fixed position on the axes or figure, independent of the data. In Matplotlib, this is done by modifying the transform.
Matplotlib makes use of a few different coordinate systems: a data point at a given (x, y) corresponds to a certain location on the axes or figure, which in turn corresponds to a particular pixel on the screen. Mathematically, transforming between such coordinate systems is relatively straightforward, and Matplotlib has a well-developed set of tools that it uses internally to perform these transforms (these tools can be explored in the matplotlib.transforms submodule).
A typical user rarely needs to worry about the details of the transforms, but it is helpful knowledge to have when considering the placement of text on a figure. There are three predefined transforms that can be useful in this situation:
ax.transData: Transform associated with data coordinates
ax.transAxes: Transform associated with the axes (in units of axes dimensions)
fig.transFigure: Transform associated with the figure (in units of figure dimensions)
Let's look at an example of drawing text at various locations using these transforms (see the following figure):
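A minimal sketch of such an example follows. The axis limits, the coordinates, and the label strings are illustrative choices; the gray figure background is there only to make the figure and axes regions easy to tell apart.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots(facecolor="lightgray")
ax.axis([0, 10, 0, 10])  # xmin, xmax, ymin, ymax

# transform=ax.transData is the default, but it is spelled out for comparison
t_data = ax.text(1, 5, ". Data: (1, 5)", transform=ax.transData)
# Fractions of the axes width/height, measured from the axes' bottom-left corner
t_axes = ax.text(0.5, 0.1, ". Axes: (0.5, 0.1)", transform=ax.transAxes)
# Fractions of the figure width/height, measured from the figure's bottom-left corner
t_fig = ax.text(0.2, 0.2, ". Figure: (0.2, 0.2)", transform=fig.transFigure)
```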
Matplotlib's default text alignment is such that the "." at the beginning of each string will approximately mark the specified coordinate location.
The transData coordinates give the usual data coordinates associated with the x- and y-axis labels. The transAxes coordinates give the location from the bottom-left corner of the axes (here the white box), as a fraction of the total axes size. The transFigure coordinates are similar, but specify the position from the bottom-left corner of the figure (here the gray box) as a fraction of the total figure size.
Notice now that if we change the axes limits, it is only the transData coordinates that will be affected, while the others remain stationary (see the following figure):
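One way to check this without looking at a figure is to ask each transform where a point lands in display (pixel) coordinates before and after the limits change. This is a sketch with arbitrary limits and points, not necessarily the code behind the figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.axis([0, 10, 0, 10])

# Display (pixel) positions of one data point and one axes-fraction point
before_data = ax.transData.transform((1, 5))
before_axes = ax.transAxes.transform((0.5, 0.1))

# Change the axes limits
ax.set_xlim(0, 2)
ax.set_ylim(-6, 6)

after_data = ax.transData.transform((1, 5))
after_axes = ax.transAxes.transform((0.5, 0.1))

# The data point moves on screen; the axes-fraction point does not
data_moved = not np.allclose(before_data, after_data)
axes_moved = not np.allclose(before_axes, after_axes)
print("data point moved:", data_moved)
print("axes point moved:", axes_moved)
```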
This behavior can be seen more clearly by changing the axes limits interactively: if you are executing this code in a notebook, you can make that happen by changing %matplotlib inline to %matplotlib notebook and using each plot's menu to interact with the plot.
Arrows and Annotation
Along with tick marks and text, another useful annotation mark is the simple arrow.
While there is a plt.arrow function available, I wouldn't suggest using it: the arrows it creates are patches drawn in data coordinates, so their shape is subject to the varying aspect ratio of your plots, making it tricky to get them right. Instead, I'd suggest using the plt.annotate function, which creates some text and an arrow and allows the arrows to be very flexibly specified.
Here is a demonstration of annotate with several of its options (see the following figure):
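A minimal sketch of such a demonstration follows: a cosine curve with one default-style arrow and one arrow using an explicit arrowstyle and connectionstyle. The curve, the label text, and the coordinates are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

x = np.linspace(0, 20, 1000)
ax.plot(x, np.cos(x))
ax.axis("equal")  # same scale on both axes

# xy is the point being annotated; xytext is where the text is placed.
# Without an arrowstyle key, a simple filled arrow is drawn; shrink pulls
# both ends of the arrow back from the point and the text.
ann_max = ax.annotate("local maximum", xy=(6.28, 1), xytext=(10, 4),
                      arrowprops=dict(facecolor="black", shrink=0.05))

# A thin line arrow that bends on its way from the text to the point
ann_min = ax.annotate("local minimum", xy=(5 * np.pi, -1), xytext=(2, -6),
                      arrowprops=dict(arrowstyle="->",
                                      connectionstyle="angle3,angleA=0,angleB=-90"))
```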
The arrow style is controlled through the arrowprops dictionary, which has numerous options available. These options are well documented in Matplotlib's online documentation, so rather than repeating them here it is probably more useful to show some examples. Let's demonstrate several of the possible options using the birthrate plot from before (see the following figure):
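The birthrate data is not reproduced here, so the sketch below applies a few arrowprops variations to a synthetic sine curve instead. The curve, labels, and positions are made up; the arrowstyle, connectionstyle, and boxstyle strings are standard Matplotlib values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic curve standing in for the birthrate plot (NOT the real data)
x = np.linspace(0, 10, 500)
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(x, np.sin(x))
ax.set_ylim(-2, 2)

# Thin arrow on a curved path; the text offset is given in points
a1 = ax.annotate("curved arrow", xy=(np.pi / 2, 1), xycoords="data",
                 xytext=(50, 30), textcoords="offset points",
                 arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=-0.2"))

# Label drawn inside a rounded box
a2 = ax.annotate("boxed label", xy=(3 * np.pi / 2, -1), xycoords="data",
                 xytext=(10, -40), textcoords="offset points", ha="center",
                 bbox=dict(boxstyle="round", fc="none", ec="gray"),
                 arrowprops=dict(arrowstyle="->"))

# Bracket marking an interval: empty text, both ends in data coordinates
a3 = ax.annotate("", xy=(6, 1.5), xytext=(9, 1.5),
                 xycoords="data", textcoords="data",
                 arrowprops=dict(arrowstyle="|-|,widthA=0.2,widthB=0.2"))

# Filled "fancy" arrow that bends between the text and the point
a4 = ax.annotate("fancy arrow", xy=(5 * np.pi / 2, 1), xycoords="data",
                 xytext=(-80, -40), textcoords="offset points",
                 arrowprops=dict(arrowstyle="fancy", fc="0.6", ec="none",
                                 connectionstyle="angle3,angleA=0,angleB=-90"))
```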
The variety of options makes annotate powerful and flexible: you can create nearly any arrow style you wish. Unfortunately, it also means that these sorts of features often must be manually tweaked, a process that can be very time-consuming when producing publication-quality graphics! Finally, I'll note that the preceding mix of styles is by no means best practice for presenting data, but rather is included as a demonstration of some of the available options.
More discussion and examples of available arrow and annotation styles can be found in the Matplotlib Annotations tutorial.