import pandas as pd
from plotnine import *
Aesthetic specifications
This document is a translation of the ggplot2 aesthetic specification.
Color and fill
Almost every geom has either color, fill, or both. Colors and fills can be specified in the following ways:
- A name, e.g., "red". These can be any color name or value supported by matplotlib. See the matplotlib css colors documentation and the plot below for a list of colors.
- An rgb specification, with a string of the form "#RRGGBB" where each of the pairs RR, GG, BB consists of two hexadecimal digits giving a value in the range 00 to FF. You can optionally make the color transparent by using the form "#RRGGBBAA".
- A missing value (e.g. None, np.nan, pd.NA), for a completely transparent colour.
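To make the two hexadecimal forms concrete, here is a minimal sketch that decodes them by hand. The helper name `parse_hex_color` is hypothetical and is not part of plotnine or matplotlib; it only illustrates how each pair of digits maps to a channel value.

```python
def parse_hex_color(spec):
    """Decode "#RRGGBB" or "#RRGGBBAA" into (r, g, b, a) floats in [0, 1].

    Hypothetical helper for illustration only.
    """
    if not spec.startswith("#") or len(spec) not in (7, 9):
        raise ValueError(f"expected '#RRGGBB' or '#RRGGBBAA', got {spec!r}")
    digits = spec[1:]
    # Each channel is two hexadecimal digits, i.e. an integer in 0..255.
    channels = [int(digits[i : i + 2], 16) / 255 for i in range(0, len(digits), 2)]
    if len(channels) == 3:
        channels.append(1.0)  # no AA pair given: fully opaque
    return tuple(channels)


opaque_red = parse_hex_color("#FF0000")
half_transparent_red = parse_hex_color("#FF000080")
```

The trailing `AA` pair is read the same way as the colour channels: `00` is fully transparent and `FF` is fully opaque.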
Here’s an example of listing CSS color names matplotlib supports:
from matplotlib import colors as mcolors

n_colors = len(mcolors.CSS4_COLORS)
# Lay the color names out in columns of 30 rows each
colors = pd.DataFrame(
    {
        "name": [name for name in mcolors.CSS4_COLORS.keys()],
        "x": [(x // 30) * 1.5 for x in range(n_colors)],
        "y": [-(x % 30) for x in range(n_colors)],
    }
)

(
    ggplot(colors, aes("x", "y"))
    + geom_point(aes(color="name"), size=5)
    + geom_text(aes(label="name"), nudge_x=0.14, size=7.5, ha="left")
    + scale_color_identity(guide=None)
    + expand_limits(x=7)
    + theme_void()
    + labs(title="CSS4 colors")
)
Lines
As well as colour, the appearance of a line is affected by linewidth, linetype, linejoin and lineend.
Line type
Line types can be specified with:
A name: solid, dashed, dotted, dashdot, as shown below:
lty = [
    "solid",
    "dashed",
    "dotted",
    "dashdot",
]
linetypes = pd.DataFrame({"y": list(range(len(lty))), "lty": lty})

(
    ggplot(linetypes, aes(0, "y"))
    + geom_segment(aes(xend=5, yend="y", linetype="lty"))
    + geom_text(aes(label="lty"), nudge_y=0.2, ha="left")
    + scale_x_continuous(name=None, breaks=None)
    + scale_y_reverse(name=None, breaks=None)
    + scale_linetype_identity(guide=None)
)
The lengths of on/off stretches of line. This is done with a tuple of the form (offset, (on, off, ...)).
lty = [
    (0, (1, 1)),
    (0, (1, 8)),
    (0, (1, 15)),
    (0, (8, 1)),
    (0, (8, 8)),
    (0, (8, 15)),
    (0, (15, 1)),
    (0, (15, 8)),
    (0, (15, 15)),
    (0, (2, 2, 6, 2)),
]
linetypes = pd.DataFrame({"y": list(range(len(lty))), "lty": lty})

(
    ggplot(linetypes, aes(0, "y"))
    + geom_segment(aes(xend=5, yend="y", linetype="lty"))
    + geom_text(aes(label="lty"), nudge_y=0.2, ha="left")
    + scale_x_continuous(name=None, breaks=None)
    + scale_y_reverse(name=None, breaks=None)
    + scale_linetype_identity(guide=None)
)
The three standard dash-dot line types described above correspond to:
- dashed: (0, (4, 4))
- dotted: (0, (1, 3))
- dashdot: (0, (1, 3, 4, 3))
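To see how an (offset, (on, off, ...)) tuple is read, the sketch below expands a pattern into the stretches that are actually drawn along a line of a given length. The function `dash_segments` is a hypothetical helper written for this illustration, not plotnine or matplotlib API. It assumes the pattern has an even number of entries, and it measures lengths in whatever unit the pattern itself uses.

```python
def dash_segments(spec, length):
    """Return the (start, end) stretches that are drawn ("on") along a line.

    spec   : (offset, (on, off, ...)) tuple as described in the text
    length : total length of the line, in the same units as the pattern
    Hypothetical helper for illustration only.
    """
    offset, pattern = spec
    period = sum(pattern)
    if period <= 0 or len(pattern) % 2 != 0:
        raise ValueError("pattern must have an even number of entries and a positive sum")

    segments = []
    # The offset says how far into the pattern the line starts,
    # so the first full cycle begins before position 0.
    pos = -(offset % period)
    while pos < length:
        for i, stretch in enumerate(pattern):
            start, end = pos, pos + stretch
            if i % 2 == 0:  # even entries are "on", odd entries are "off"
                lo, hi = max(start, 0), min(end, length)
                if lo < hi:
                    segments.append((lo, hi))
            pos = end
    return segments


dashed = dash_segments((0, (4, 4)), 16)
shifted = dash_segments((2, (4, 4)), 10)
```

With an offset of 2, the first dash is cut short because the line starts two units into the pattern.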
Linewidth
Due to a historical error, the unit of linewidth is roughly 0.75 mm. Making it exactly 1 mm would change a very large number of existing plots, so we’re stuck with this mistake.
Line end/join parameters
The appearance of the line end is controlled by the lineend parameter, and can be one of “round”, “butt” (the default), or “square”.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 1, 9]})
base = ggplot(df, aes("x", "y")) + xlim(0.5, 3.5) + ylim(0, 10)

# Default line end ("butt")
(
    base
    + geom_path(size=10)
    + geom_path(size=1, colour="red")
)

(
    base
    + geom_path(size=10, lineend="round")
    + geom_path(size=1, colour="red")
)

(
    base
    + geom_path(size=10, lineend="square")
    + geom_path(size=1, colour="red")
)
The appearance of line joins is controlled by linejoin and can be one of “round” (the default), “mitre”, or “bevel”.
df = pd.DataFrame({"x": [1, 2, 3], "y": [9, 1, 9]})
base = ggplot(df, aes("x", "y")) + ylim(0, 10)

# Default line join ("round")
(
    base
    + geom_path(size=10)
    + geom_path(size=1, colour="red")
)

(
    base
    + geom_path(size=10, linejoin="mitre")
    + geom_path(size=1, colour="red")
)

(
    base
    + geom_path(size=10, linejoin="bevel")
    + geom_path(size=1, colour="red")
)
Mitre joins are automatically converted to bevel joins whenever the angle is too small (which would create a very long mitre). This is controlled by the linemitre parameter, which specifies the maximum ratio between the line width and the length of the mitre.
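For intuition about why small angles are the problem, the sketch below uses the usual SVG/PostScript definition of a mitre limit, under which the mitre length divided by the line width is 1 / sin(θ / 2) for a join angle θ. That formula is an assumption here, since the text above does not state it, and both function names are hypothetical.

```python
import math


def mitre_ratio(angle_deg):
    """Mitre length divided by line width for a join of the given angle.

    Assumes the standard SVG/PostScript mitre geometry.
    """
    return 1 / math.sin(math.radians(angle_deg) / 2)


def falls_back_to_bevel(angle_deg, linemitre):
    """True when the mitre would exceed the limit, so a bevel is drawn instead."""
    return mitre_ratio(angle_deg) > linemitre


right_angle = mitre_ratio(90)   # a modest mitre, about 1.41 line widths
sharp_angle = mitre_ratio(5)    # a very long mitre, about 23 line widths
```

Under this definition the ratio grows without bound as the angle approaches zero, which is why a limit is needed at all.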
Polygons
The border of the polygon is controlled by the colour, linetype, and linewidth aesthetics as described above. The inside is controlled by fill.
Point
Shape
A character string or an integer for a point type, as specified in matplotlib.markers:
from plotnine.scales.scale_shape import shapes, unfilled_shapes

shape_points = [*shapes, *unfilled_shapes, "none"]
n_shapes = len(shape_points)
# Arrange the shapes in a grid with 5 columns
shapes = pd.DataFrame(
    {
        "shape": shape_points,
        "shape_text": [repr(x) if str(x).isdigit() else x for x in shape_points],
        "x": [x % 5 for x in range(n_shapes)],
        "y": [-(x // 5) for x in range(n_shapes)],
    }
)

(
    ggplot(shapes, aes("x", "y"))
    + geom_point(aes(shape="shape"), size=5, fill="red")
    + geom_text(aes(label="shape_text"), nudge_x=0.15)
    + scale_shape_identity(guide=None)
    + theme_void()
    + expand_limits(x=4.1)
)
Shape path and mathtext (specified at the bottom of the matplotlib.markers table):
# (num, type, angle)
pentagon = (5, 0, 60)  # type 0: polygon
five_point_star = (5, 1, 60)  # type 1: star
five_point_asterisk = (5, 2, 60)  # type 2: asterisk

shape = [pentagon, five_point_star, five_point_asterisk, "$\\alpha$", "$\\beta$"]
# Escape the dollar signs so the labels show the literal mathtext source
shape_text = [x.replace("$", "\\$") if isinstance(x, str) else x for x in shape]

shapes = pd.DataFrame(
    {
        "shape": shape,
        "shape_text": shape_text,
        "x": list(range(len(shape))),
        "y": [0] * len(shape),
    }
)

(
    ggplot(shapes, aes("x", "y"))
    + geom_point(aes(shape="shape"), size=10)
    + geom_text(aes(label="shape_text"), nudge_y=-0.1)
    + scale_shape_identity(guide=None)
    + theme_void()
    # + theme(figure_size=(6, 2))
    + expand_limits(y=[-0.2, 0.2], x=[-0.5, len(shape) - 0.5])
)
Custom matplotlib path objects as a literal mapping:
from plotnine.data import mtcars
from matplotlib.path import Path

star = Path.unit_regular_star(6)
circle = Path.unit_circle()
# Combine the circle with the reversed star outline into a single path
cut_star = Path(
    vertices=[*circle.vertices, *star.vertices[::-1]],
    codes=[*circle.codes, *star.codes],
)

(
    ggplot(mtcars, aes("wt", "mpg"))
    + geom_point(shape=cut_star, size=10)
)
Color and fill
While color applies to all shapes, fill only applies to the shapes drawn with a red fill in the plot of point shapes above. The size of the filled part is controlled by size, and the size of the stroke is controlled by stroke. Each is measured in mm, and the total size of the point is the sum of the two. Note that the size is constant along the diagonal in the following figure.
sizes = pd.DataFrame({"size": [0, 2, 4, 6]}).merge(
    pd.DataFrame({"stroke": [0, 2, 4, 6]}), how="cross"
)

(
    ggplot(sizes, aes("size", "stroke", size="size", stroke="stroke"))
    + geom_abline(slope=-1, intercept=6, colour="white", size=6)
    + geom_point(shape="o", fill="red")
    + scale_size_identity(guide=None)
)
Text
Font family
There are only three fonts that are guaranteed to work everywhere: “sans-serif” (the default), “serif”, or “monospace”:
df = pd.DataFrame(
    {"x": 1, "y": [3, 2, 1], "family": ["sans-serif", "serif", "monospace"]}
)

(ggplot(df, aes("x", "y")) + geom_text(aes(label="family", family="family"), size=20))
Font weight
There are two important considerations when using font weights:
- The font family must support your intended font weight(s).
- Fonts that bundle their multiple variants in one .ttc file may currently not be supported on your platform (see this issue).
df = pd.DataFrame(
    {"x": [1, 2, 3, 4], "fontweight": ["light", "normal", "bold", "heavy"]}
)

(
    ggplot(df, aes(1, "x"))
    + geom_text(
        aes(label="fontweight", fontweight="fontweight"),
        family="DejaVu Sans",
    )
)
Font style
Similar to font weight, fonts that bundle multiple variants in one .ttc file may not be supported (see this issue).
df = pd.DataFrame({"x": [1, 2, 3], "fontstyle": ["normal", "italic", "oblique"]})

(
    ggplot(df, aes(1, "x"))
    + geom_text(
        aes(label="fontstyle", fontstyle="fontstyle"),
        family="DejaVu Sans",
    )
)
Font size
The size of text is measured in mm by default. This is unusual, but makes the size of text consistent with the size of lines and points. Typically you specify font size using points (or pt for short), where 1 pt = 0.35 mm. In geom_text() and geom_label(), you can set size.unit = "pt" to use points instead of millimeters. In addition, ggplot2 provides a conversion factor as the variable .pt, so if you want to draw 12pt text, you can also set size = 12 / .pt.
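The arithmetic behind that conversion can be sketched in plain Python. This assumes the ggplot2 definition of the factor, .pt = 72.27 / 25.4 (points per millimetre); the names `PT_PER_MM`, `pt_to_mm`, and `mm_to_pt` are hypothetical and are not plotnine API.

```python
# Assumed ggplot2 conversion factor: 72.27 points per inch, 25.4 mm per inch.
PT_PER_MM = 72.27 / 25.4


def pt_to_mm(points):
    """Convert a font size in points to millimetres."""
    return points / PT_PER_MM


def mm_to_pt(millimetres):
    """Convert a size in millimetres to points."""
    return millimetres * PT_PER_MM


one_point_in_mm = pt_to_mm(1)      # about 0.35 mm, matching the text
twelve_point_in_mm = pt_to_mm(12)  # about 4.22 mm
```

Dividing by the factor goes from points to millimetres, which is why the text writes 12 / .pt for 12pt text.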
Justification
Horizontal and vertical justification have the same parameterisation, either a string (“top”, “middle”, “bottom”, “left”, “center”, “right”) or a number between 0 and 1:
- top = 1, middle = 0.5, bottom = 0
- left = 0, center = 0.5, right = 1
just = pd.DataFrame({"hjust": ["center", "right", "left"], "x": [0, 1, 2]}).merge(
    pd.DataFrame({"vjust": ["center", "top", "bottom"], "y": [0, 1, 2]}), how="cross"
)

just["label"] = just["hjust"].astype(str).str.cat(just["vjust"].astype(str), sep=", ")

(
    ggplot(just, aes("x", "y"))
    + geom_point(colour="grey", size=5)
    + geom_text(aes(label="label", hjust="hjust", vjust="vjust"))
    + expand_limits(x=[-0.5, 2.5], y=[-0.5, 2.5])
)
Note that you can use numbers outside the range (0, 1), but it’s not recommended.
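The string-to-number correspondence listed above can be written down as a small lookup table. This is only an illustrative sketch: `JUSTIFICATION` and `as_number` are hypothetical names, not part of plotnine.

```python
# Named justifications and their numeric equivalents, as listed in the text.
JUSTIFICATION = {
    "left": 0.0,
    "center": 0.5,
    "right": 1.0,
    "bottom": 0.0,
    "middle": 0.5,
    "top": 1.0,
}


def as_number(just):
    """Return the numeric justification for a name, or pass a number through."""
    if isinstance(just, str):
        return JUSTIFICATION[just]
    return float(just)


horizontal = [as_number(j) for j in ["left", "center", "right"]]
vertical = [as_number(j) for j in ["bottom", "middle", "top"]]
```

Both axes use the same scale, so "left" and "bottom" share the value 0 while "right" and "top" share the value 1.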