from plotnine import ggplot, aes, geom_point, labs
from plotnine.data import penguins

(
    ggplot(penguins, aes("bill_length_mm", "bill_depth_mm", color="species"))
    + geom_point()
)
Introduction
Plotnine is a Python package for data visualization, based on the grammar of graphics. It implements a wide range of plots, including bar charts, line graphs, scatterplots, maps, and much more.
This guide starts with a basic overview of Plotnine code, then explains each piece of its grammar in more detail. While getting started is quick, learning the full grammar takes time. But it’s worth it! The grammar of graphics shows how even plots that look very different share the same underlying structure.
The rest of this page provides brief instructions for installing and starting with Plotnine, followed by some example use cases.
Installing
# simple install
pip install plotnine
# with dependencies used in examples
pip install 'plotnine[extra]'
# simple install
uv add plotnine
# with dependencies used in examples
uv add 'plotnine[extra]'
# simple install
conda install -c conda-forge plotnine
# with dependencies used in examples
conda install -c conda-forge 'plotnine[extra]'
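After installing, it can be useful to confirm which version ended up in the active environment. The snippet below is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name chosen for this illustration, not part of Plotnine.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent.

    `installed_version` is a hypothetical helper for this sketch,
    not a Plotnine function.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the Plotnine version, or a short note when it is not installed.
print("plotnine:", installed_version("plotnine") or "not installed")
```

If this prints "not installed", the install command probably ran in a different environment than the one running Python.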
Quickstart
Basic plot
Plotnine comes with over a dozen example datasets, in order to quickly illustrate a wide range of plots. For example, the Palmer Penguins dataset (plotnine.data.penguins) contains data on three different penguin species.
The scatterplot below shows the relationship between bill length and bill depth for each penguin species.
DataFrame support
Plotnine supports both Pandas and Polars DataFrames. It also provides a simple >> operator to pipe data into a plot.
The example below shows a Polars DataFrame being filtered, then piped into a plot.
import polars as pl

pl_penguins = pl.from_pandas(penguins)

(
    # polars: subset rows ----
    pl_penguins.filter(pl.col("species") == "Adelie")
    #
    # pipe to plotnine ----
    >> ggplot(aes("bill_length_mm", "bill_depth_mm", fill="species"))
    + geom_point()
    + labs(title="Adelie penguins")
)
Notice that the code above keeps the Polars filter code and plotting code together (inside the parentheses). This makes it easy to quickly create plots, without needing a bunch of intermediate variables.
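To see why the piped expression groups the way it does, here is a toy sketch of `>>` piping in plain Python. It assumes the operator is implemented through Python's reflected right-shift hook, `__rrshift__`. `ToyPlot` and its string "layers" are hypothetical stand-ins made up for this illustration, not Plotnine's actual classes. Because `+` binds more tightly than `>>` in Python, the whole plot specification is assembled first and the data is attached last.

```python
class ToyPlot:
    """Hypothetical stand-in for a plot specification (not Plotnine's ggplot)."""

    def __init__(self, mapping, layers=(), data=None):
        self.mapping = mapping
        self.layers = tuple(layers)
        self.data = data

    def __add__(self, layer):
        # `plot + layer` returns a new specification with one more layer.
        return ToyPlot(self.mapping, self.layers + (layer,), self.data)

    def __rrshift__(self, data):
        # Python falls back to this for `data >> plot` when the left operand
        # does not handle `>>` itself (a plain list does not).
        return ToyPlot(self.mapping, self.layers, data)


# Made-up sample row standing in for a DataFrame.
rows = [{"species": "Adelie", "bill_length_mm": 39.1}]

# Parsed as: rows >> ((ToyPlot(...) + "points") + "title")
plot = rows >> ToyPlot({"x": "bill_length_mm"}) + "points" + "title"

print(plot.layers)
print(plot.data is rows)
```

The same precedence rule explains why no extra parentheses are needed around the plotting part of a piped expression.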
Use cases
See the Plotnine gallery for more examples.
Publication ready plots
Code
from plotnine import *
from plotnine.data import anscombe_quartet

(
    ggplot(anscombe_quartet, aes("x", "y"))
    + geom_point(color="sienna", fill="orange", size=3)
    + geom_smooth(method="lm", se=False, fullrange=True, color="steelblue", size=1)
    + facet_wrap("dataset")
    + labs(title="Anscombe’s Quartet")
    + scale_y_continuous(breaks=(4, 8, 12))
    + coord_fixed(xlim=(3, 22), ylim=(2, 14))
    + theme_tufte(base_family="Futura", base_size=16)
    + theme(
        axis_line=element_line(color="#4d4d4d"),
        axis_ticks_major=element_line(color="#00000000"),
        axis_title=element_blank(),
        panel_spacing=0.09,
    )
)
Annotated charts
The plot below makes heavy use of annotation, in order to illustrate coal production over the past century. The chart is largely Plotnine code, with matplotlib for some of the fancier text annotations. Learn more in this blog post by the author, Nicola Rennie.
Geospatial plots
Code
from plotnine import *
import geodatasets
import geopandas as gp

chicago = gp.read_file(geodatasets.get_path("geoda.chicago_commpop"))

(
    ggplot(chicago, aes(fill="POP2010"))
    + geom_map()
    + coord_fixed()
    + theme_minimal()
    + labs(title="Chicago Population in 2010")
)
Getting artsy
Code
import polars as pl
import numpy as np
from plotnine import *
from mizani.palettes import brewer_pal, gradient_n_pal

np.random.seed(345678)

# generate random areas for each group to fill per year ---------
# Note that in the data the x-axis is called Year, and the
# filled bands are called Group(s)

opts = [0] * 100 + list(range(1, 31))
values = []
for ii in range(30):
    values.extend(np.random.choice(opts, 30, replace=False))

# Put all the data together -------------------------------------
years = pl.DataFrame({"Year": list(range(30))})
groups = pl.DataFrame({"Group": [f"grp_{ii}" for ii in range(30)]})

df = (
    years.join(groups, how="cross")
    .with_columns(Values=pl.Series(values))
    .with_columns(prop=pl.col("Values") / pl.col("Values").sum().over("Year"))
)

df.write_csv("plot-data.csv")

# Generate color palette ----------------------------------------
# this uses 12 colors interpolated to all 30 Groups
pal = brewer_pal("qual", "Paired")

colors = pal(12)
np.random.shuffle(colors)

all_colors = gradient_n_pal(colors)(np.linspace(0, 1, 30))

# Plot ---------------------------------------------------------
(
    df
    >> ggplot(aes("Year", "prop", fill="Group"))
    + geom_area()
    + scale_fill_manual(values=all_colors)
    + theme(
        axis_text=element_blank(),
        line=element_blank(),
        title=element_blank(),
        legend_position="none",
        plot_margin=0,
        panel_border=element_blank(),
        panel_background=element_blank(),
    )
)
Next steps
Continue to the Overview for a worked example breaking down each piece of Plotnine’s grammar of graphics.