plotnine.stat_smooth

stat_smooth(
    mapping=None,
    data=None,
    *,
    geom="smooth",
    position="identity",
    na_rm=False,
    method="auto",
    se=True,
    n=80,
    formula=None,
    fullrange=False,
    level=0.95,
    span=0.75,
    method_args={},
    **kwargs
)

Calculate a smoothed conditional mean

Parameters

mapping : aes = None

Aesthetic mappings created with aes. If specified and inherit_aes=True, it is combined with the default mapping for the plot. You must supply mapping if there is no plot mapping.

Aesthetic Default value
x
y

Both the x and y aesthetics are required; neither has a default value.

Options for computed aesthetics

"se"    # Standard error of the fitted value
"ymin"  # Lower confidence limit
"ymax"  # Upper confidence limit

Calculated aesthetics are accessed using the after_stat function, e.g. after_stat('se').

data : DataFrame = None

The data to be displayed in this layer. If None, the data from the ggplot() call is used. If specified, it overrides the data from the ggplot() call.

geom : str | geom = "smooth"

The geometric object (geom) used to display the data computed by this stat. If it is a string, it must be registered and known to Plotnine.

position : str | position = "identity"

Position adjustment. If it is a string, it must be registered and known to Plotnine.

na_rm : bool = False

If False, missing values are removed with a warning. If True, missing values are removed silently.

method : str | callable = "auto"

The available methods are:

"auto"       # Use loess if (n<1000), glm otherwise
"lm", "ols"  # Linear Model
"wls"        # Weighted Linear Model
"rlm"        # Robust Linear Model
"glm"        # Generalized linear Model
"gls"        # Generalized Least Squares
"lowess"     # Locally Weighted Regression (simple)
"loess"      # Locally Weighted Regression
"mavg"       # Moving Average
"gpr"        # Gaussian Process Regressor

If a callable is passed, it must have the signature:

def my_smoother(data, xseq, **params):
    # * data - has the x and y values for the model
    # * xseq - x values to be predicted
    # * params - stat parameters
    #
    # It must return a new dataframe. Below is the
    # template used internally by Plotnine

    # Input data into the model
    x, y = data["x"], data["y"]

    # Create and fit a model
    model = Model(x, y)
    results = model.fit()

    # Create output data by getting predictions on
    # the xseq values
    data = pd.DataFrame({
        "x": xseq,
        "y": results.predict(xseq)})

    # Compute confidence intervals, this depends on
    # the model. However, given standard errors and the
    # degrees of freedom we can compute the confidence
    # intervals using the t-distribution.
    #
    # For an alternative, implement confidence intervals by
    # the bootstrap method
    if params["se"]:
        from plotnine.utils.smoothers import tdist_ci
        y = data["y"]            # The predicted value
        df = 123                 # Degrees of freedom
        stderr = results.stderr  # Standard error
        level = params["level"]  # The parameter value
        low, high = tdist_ci(y, df, stderr, level)
        data["se"] = stderr
        data["ymin"] = low
        data["ymax"] = high

    return data
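
As a concrete instance of the template above, here is a minimal sketch of a custom smoother that fits a polynomial with NumPy. The names make_poly_smoother and degree are hypothetical and not part of Plotnine. The confidence interval is computed directly from the t-distribution with scipy.stats.t instead of the tdist_ci helper, and the sketch assumes the callable is invoked with the signature described above.

```python
import numpy as np
import pandas as pd
from scipy import stats


def make_poly_smoother(degree=2):
    """Return a smoother callable that fits a polynomial of the given degree.

    Hypothetical helper for illustration; not part of Plotnine.
    """

    def poly_smoother(data, xseq, **params):
        x = np.asarray(data["x"], dtype=float)
        y = np.asarray(data["y"], dtype=float)
        xseq = np.asarray(xseq, dtype=float)

        # Fit the polynomial and get the coefficient covariance matrix
        coefs, cov = np.polyfit(x, y, deg=degree, cov=True)

        # Predictions on the requested x values
        out = pd.DataFrame({"x": xseq, "y": np.polyval(coefs, xseq)})

        if params.get("se", False):
            # Design matrix with the highest power first, as in np.polyfit
            X = np.vander(xseq, degree + 1)
            # Standard error of each fitted value: sqrt(diag(X cov X^T))
            stderr = np.sqrt(np.einsum("ij,jk,ik->i", X, cov, X))
            # Residual degrees of freedom
            df = len(x) - (degree + 1)
            # Two-sided t interval at the requested confidence level
            q = (1 + params.get("level", 0.95)) / 2
            delta = stats.t.ppf(q, df) * stderr
            out["se"] = stderr
            out["ymin"] = out["y"] - delta
            out["ymax"] = out["y"] + delta

        return out

    return poly_smoother
```

Under these assumptions, the returned callable could be passed as stat_smooth(method=make_poly_smoother(3)).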

For loess smoothing you must install the scikit-misc package. You can install it with pip install scikit-misc or pip install plotnine[all].

formula : formula_like = None

An object that can be used to construct a patsy design matrix. This is usually a string. You can only use a formula if method is one of lm, ols, wls, glm, rlm or gls, and in the formula you may refer to the x and y aesthetic variables.

se : bool = True

If True, draw a confidence interval around the smooth line.

n : int = 80

Number of points to evaluate the smoother at. Some smoothers like mavg do not support this.

fullrange : bool = False

If True, the fit will span the full range of the plot.

level : float = 0.95

Level of confidence to use if se=True.

span : float = 0.75

Controls the amount of smoothing for the loess smoother. Larger number means more smoothing. It should be in the (0, 1) range.

method_args : dict = {}

Additional arguments passed on to the modelling method.

**kwargs : Any = {}

Aesthetics or parameters used by the geom.

See Also

OLS
WLS
RLM
GLM
GLS
lowess
loess
rolling
GaussianProcessRegressor

Notes

geom_smooth and stat_smooth are effectively aliases; they both use the same arguments. Use geom_smooth unless you want to display the results with a non-standard geom.