Overview

This page provides an overview of Plotnine’s most important concepts and corresponding syntax. Throughout this overview, we’ll focus on reproducing the plot below, which looks at city (cty) versus highway mileage (hwy) across different years of cars:

Code
from plotnine import *
from plotnine.data import mpg

(
    ggplot(mpg, aes("cty", "hwy"))
    + geom_point(mapping=aes(colour="displ"))
    + geom_smooth(method="lm", color="blue")
    + scale_color_continuous(cmap_name="viridis")
    + facet_grid("year ~ drv")
    + coord_fixed()
    + theme_minimal()
    + theme(panel_grid_minor=element_blank())
)

Note

If you haven’t installed Plotnine, you can do so with:

pip install plotnine

Specifying data

Every plot in Plotnine starts with passing data to the ggplot() function. The data can be a Pandas or Polars DataFrame. Plotnine works best when the data is in a tidy format, which tends to be longer—with more rows and fewer columns.

from plotnine import *
from plotnine.data import mpg

ggplot(data=mpg)

aes() mapping

The aes() function maps columns of data onto graphical attributes–such as colors, shapes, or x and y coordinates. (The name aes() is short for aesthetic.)

We can map the cty and hwy columns to the x- and y- coordinates in the plot using the code below:

ggplot(data=mpg, mapping = aes(x = "cty", y = "hwy"))

geom_*() objects

Functions starting with geom_*() specify geometric objects (geoms) to add to the plot. Geoms determine how to turn mapped data into visual elements–such as points, lines, or even boxplots.

Here is how we can map the cty and hwy columns to points with a smooth trend line on top:

(
    ggplot(mpg, aes("cty", "hwy"))
    # to create a scatterplot
    + geom_point()
    # to fit and overlay a loess trendline
    + geom_smooth(method="lm", color="blue")
)

scale_*() translations

The scale_*() functions customize the styling of visual elements from a data mapping. This includes color palettes, axis spacing (e.g. log scale), axis limits, and more. Scales can set the names on guides like axes, legends, and colorbars.

The names of scales follow the pattern scale_<aesthetic>_<type>, where <aesthetic> is the name of an aesthetic attribute, like x, y, or color. If we want to use the plasma palette for color, we can use the code below:

(
    ggplot(mpg, aes("cty", "hwy", color="displ"))
    + geom_point()
    + scale_color_continuous(name="displ (VIRIDIS)", cmap_name="viridis")
)

position_*() adjustments

The position_*() functions adjust the position of elements in the plot. For example, by adding a small amount of random noise (jitter) to the x- and y- coordinates of points, so they don’t cover each other.

(
    ggplot(mpg, aes("cty", "hwy", color="displ"))
    # add a small amount of jitter
    + geom_point(position=position_jitter())
)

facet_*() subplots

Facets split a plot into multiple subplots. The two facet functions, facet_grid() and facet_wrap(), determine how to split the data. For example, we can create a grid of plots based on the year and drv columns:

(
    ggplot(mpg, aes("cty", "hwy"))
    + geom_point()
    # facet into subplots
    + facet_grid("year ~ drv")
)

coord_*() projections

The coord_*() functions specify the coordinate system of the plot. Currently, only a few coordinate systems are available. However, in the future systems like coord_polar() will allow for plotting using polar coordinates.

The code below uses coord_fixed() to ensure that the x- and y- axes have the same spacing.

(
    ggplot(mpg, aes("cty", "hwy"))
    + geom_point()
    + coord_fixed(xlim=[0, 40], ylim=[0, 40])
)

theme_*() styling

Themes control the style of all aspects of the plot that are not decided by the data. This includes the position of the legend, the color of the axes, the size of the text, and much more.

Use the theme_*() functions to start with a pre-made theme, or the general theme() function to create a new custom theme. Use the element_*() functions to configure new theme settings.

(
    ggplot(mpg, aes("cty", "hwy", colour="class"))
    + geom_point()
    + theme_minimal()
    + theme(
        legend_position="top",
        axis_line=element_line(linewidth=0.75),
        axis_line_x=element_line(colour="blue"),
    )
)

Putting it together

Finally, we put all the pieces together into the original plot:

(
    ggplot(mpg, aes("cty", "hwy"))
    + geom_point(mapping=aes(colour="displ"))
    + geom_smooth(method="lm", color="blue")
    + scale_color_continuous(cmap_name="viridis")
    + facet_grid("year ~ drv")
    + coord_fixed()
    + theme_minimal()
    + theme(panel_grid_minor=element_blank())
)

Note

This tutorial was adapted from the ggplot2 “Getting Started” tutorial