Title: | Visualizations of Distributions and Uncertainty |
---|---|
Description: | Provides primitives for visualizing distributions using 'ggplot2' that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized. Visualization primitives include but are not limited to: points with multiple uncertainty intervals, eye plots (Spiegelhalter D., 1999) <https://ideas.repec.org/a/bla/jorssa/v162y1999i1p45-58.html>, density plots, gradient plots, dot plots (Wilkinson L., 1999) <doi:10.1080/00031305.1999.10474474>, quantile dot plots (Kay M., Kola T., Hullman J., Munson S., 2016) <doi:10.1145/2858036.2858558>, complementary cumulative distribution function barplots (Fernandes M., Walls L., Munson S., Hullman J., Kay M., 2018) <doi:10.1145/3173574.3173718>, and fit curves with multiple uncertainty ribbons. |
Authors: | Matthew Kay [aut, cre], Brenton M. Wiernik [ctb] |
Maintainer: | Matthew Kay <[email protected]> |
License: | GPL (>= 3) |
Version: | 3.3.2.9000 |
Built: | 2024-11-21 05:59:34 UTC |
Source: | https://github.com/mjskay/ggdist |
ggdist is an R package that aims to make it easy to integrate popular Bayesian modeling
methods into a tidy data + ggplot workflow.
ggdist is an R package that provides a flexible set of ggplot2 geoms and stats designed
especially for visualizing distributions and uncertainty. It is designed for both
frequentist and Bayesian uncertainty visualization, taking the view that uncertainty
visualization can be unified through the perspective of distribution visualization:
for frequentist models, one visualizes confidence distributions or bootstrap distributions
(see vignette("freq-uncertainty-vis")); for Bayesian models, one visualizes probability
distributions (see vignette("tidybayes", package = "tidybayes")).
The geom_slabinterval() / stat_slabinterval() family (see vignette("slabinterval")) makes it
easy to visualize point summaries and intervals, eye plots, half-eye plots, ridge plots,
CCDF bar plots, gradient plots, histograms, and more.
The geom_dotsinterval() / stat_dotsinterval() family (see vignette("dotsinterval")) makes
it easy to visualize dot+interval plots, Wilkinson dotplots, beeswarm plots, and quantile dotplots.
The geom_lineribbon() / stat_lineribbon() family (see vignette("lineribbon"))
makes it easy to visualize fit lines with an arbitrary number of uncertainty bands.
Maintainer: Matthew Kay [email protected]
Other contributors:
Brenton M. Wiernik [email protected] [contributor]
Useful links:
Report bugs at https://github.com/mjskay/ggdist/issues/new
Methods for aligning breaks (bins) in histograms, as used in the align
argument to density_histogram().
Supports automatic partial function application with waived arguments.
align_none(breaks)
align_boundary(breaks, at = 0)
align_center(breaks, at = 0)
breaks | <numeric> A sorted vector of breaks (bin edges). |
at | <scalar numeric> The alignment point. |
These functions take a sorted vector of equally-spaced breaks giving
bin edges and return a numeric offset which, if subtracted from breaks,
will align them as desired:
align_none() performs no alignment (it always returns 0).
align_boundary() ensures that a bin edge lines up with at.
align_center() ensures that a bin center lines up with at.
For align_boundary() (respectively align_center()), if no bin edge (or center) in the
range of breaks would line up with at, it ensures that at is an integer
multiple of the bin width away from a bin edge (or center).
A scalar numeric giving an offset to be subtracted from breaks.
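The offset logic described above can be sketched in base R. This is only an illustrative reading of the description, not ggdist's implementation: offset_boundary() and offset_center() are hypothetical helper names, and the breaks are assumed to be sorted and equally spaced.

```r
# Hypothetical helpers sketching the alignment offsets described above.
# Assumes `breaks` is sorted and equally spaced.
offset_boundary = function(breaks, at = 0) {
  width = breaks[2] - breaks[1]
  # how far the first bin edge sits above the nearest point at + k * width
  (breaks[1] - at) %% width
}

offset_center = function(breaks, at = 0) {
  width = breaks[2] - breaks[1]
  # same idea, measured from the center of the first bin
  (breaks[1] + width / 2 - at) %% width
}

breaks = seq(0.3, 5.3, by = 1)

# subtracting the offset moves a bin edge (or a bin center) onto `at`
aligned_edges = breaks - offset_boundary(breaks, at = 0)
aligned_centers = breaks - offset_center(breaks, at = 0)
```

Because the offset is taken modulo the bin width, the breaks never shift by more than one bin, which is consistent with the integer-multiple guarantee stated above.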
library(ggplot2)
library(ggdist)

set.seed(1234)
x = rnorm(200, 1, 2)

# If we manually specify a bin width using breaks_fixed(), the default
# alignment (align_none()) will not align bin edges to any "pretty" numbers.
# Here is a comparison of the three alignment methods on such a histogram:
ggplot(data.frame(x), aes(x)) +
  stat_slab(
    aes(y = "align_none()\nor 'none'"),
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    outline_bars = TRUE,
    # no need to specify align; align_none() is the default
    color = "black"
  ) +
  stat_slab(
    aes(y = "align_center(at = 0)\nor 'center'"),
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_center(at = 0),   # or align = "center"
    outline_bars = TRUE,
    color = "black"
  ) +
  stat_slab(
    aes(y = "align_boundary(at = 0)\nor 'boundary'"),
    density = "histogram",
    breaks = breaks_fixed(width = 1),
    align = align_boundary(at = 0), # or align = "boundary"
    outline_bars = TRUE,
    color = "black"
  ) +
  geom_point(aes(y = 0.7), alpha = 0.5) +
  labs(
    subtitle = "ggdist::stat_slab(density = 'histogram', ...)",
    y = "align =",
    x = NULL
  ) +
  geom_vline(xintercept = 0, linetype = "22", color = "red")
Several ggdist functions support automatic partial application: when called, if not all of their required arguments have been provided, the function returns a modified version of itself that uses the arguments passed to it so far as defaults. Technically speaking, these functions are essentially "Curried" with respect to their required arguments, but I think "automatic partial application" gets the idea across more clearly.
Functions supporting automatic partial application include:
The point_interval() family, such as median_qi(), mean_qi(), mode_hdi(), etc.
The smooth_ family, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), and smooth_bar().
The density_ family, such as density_bounded(), density_unbounded() and density_histogram().
The align family.
The breaks family.
The bandwidth family.
The blur family.
Partial application makes it easier to supply custom parameters to these
functions when using them inside other functions, such as geoms and stats.
For example, smoothers for geom_dots() can be supplied in one of three ways:
as a suffix: geom_dots(smooth = "bounded")
as a function: geom_dots(smooth = smooth_bounded)
as a partially-applied function with options: geom_dots(smooth = smooth_bounded(kernel = "cosine"))
Many other common arguments for ggdist functions work similarly; e.g.
density, align, breaks, bandwidth, and point_interval arguments.
These function families (except point_interval()) also support passing
waivers to their optional arguments: if waiver() is passed to any
of these arguments, their default value (or the most
recently-partially-applied non-waiver value) is used instead.
Use the auto_partial() function to create new functions that support
automatic partial application.
auto_partial(f, name = NULL, waivable = TRUE)
f | <function> Function to automatically partially-apply. |
name | <string> Name of the function, to be used when printing. |
waivable | <scalar logical> If |
A modified version of f that will automatically be partially
applied if not all of its required arguments are given.
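The idea of automatic partial application can be illustrated with a small base-R closure. This is a simplified sketch, not how auto_partial() is implemented: partial_sketch() is a hypothetical name, the required argument names must be listed by hand, only named arguments are accepted, and waivers are not handled.

```r
# Hypothetical, simplified stand-in for the idea behind auto_partial().
# `required` lists the argument names that must be present before `f` runs.
partial_sketch = function(f, required, saved = list()) {
  function(...) {
    new_args = list(...)
    if (length(new_args) > 0) {
      # this sketch only supports named arguments
      stopifnot(!is.null(names(new_args)), all(nzchar(names(new_args))))
      # later values overwrite earlier partially-applied ones
      saved[names(new_args)] = new_args
    }
    if (all(required %in% names(saved))) {
      # everything required is available: actually call the function
      do.call(f, saved)
    } else {
      # otherwise return a new function that remembers the arguments so far
      partial_sketch(f, required, saved)
    }
  }
}

f = partial_sketch(function(x, y, z = 3) (x + y) * z, required = c("x", "y"))
g = f(y = 2)(z = 4)   # still waiting for `x`
g(x = 1)              # (1 + 2) * 4
```

Each call either runs the wrapped function or returns a new closure carrying the accumulated arguments, so earlier partial applications such as f are never modified.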
library(ggdist)

set.seed(1234)
x = rnorm(100)

# the first required argument, `x`, of the density_ family is the vector
# to calculate a kernel density estimate from. If it is not provided, the
# function is partially applied and returned as-is
density_unbounded()

# we could create a new function that uses half the default bandwidth
density_half_bw = density_unbounded(adjust = 0.5)
density_half_bw

# we can overwrite partially-applied arguments
density_quarter_bw_trimmed = density_half_bw(adjust = 0.25, trim = TRUE)
density_quarter_bw_trimmed

# when we eventually call the function and provide the required argument
# `x`, it is applied using the arguments we have "saved up" so far
density_quarter_bw_trimmed(x)

# create a custom automatically partially applied function
f = auto_partial(function(x, y, z = 3) (x + y) * z)
f()
f(1)
g = f(y = 2)(z = 4)
g
g(1)

# pass waiver() to optional arguments to use existing values
f(z = waiver())(1, 2)         # uses default z = 3
f(z = 4)(z = waiver())(1, 2)  # uses z = 4
Bandwidth estimators for densities, used in the bandwidth argument
to density functions (e.g. density_bounded(), density_unbounded()).
Supports automatic partial function application with waived arguments.
bandwidth_nrd0(x, ...)
bandwidth_nrd(x, ...)
bandwidth_ucv(x, ...)
bandwidth_bcv(x, ...)
bandwidth_SJ(x, ...)
bandwidth_dpi(x, ...)
x | <numeric> Vector containing a sample. |
... | Arguments passed on to |
These are loose wrappers around the corresponding bw.-prefixed functions
in stats. See, for example, bw.SJ().
bandwidth_dpi(), which is the default bandwidth estimator in ggdist,
is the Sheather-Jones direct plug-in estimator, i.e. bw.SJ(..., method = "dpi").
With the exception of bandwidth_nrd0(), these estimators may fail in some
cases, often when a sample contains many duplicates. If they do they will
automatically fall back to bandwidth_nrd0() with a warning. However, these
failures are typically symptomatic of situations where you should not want to
use a kernel density estimator in the first place (e.g. data with duplicates
and/or discrete data). In these cases consider using a dotplot (geom_dots())
or histogram (density_histogram()) instead.
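The fallback behaviour described above can be approximated with the base stats estimators directly. This is a rough sketch of the pattern, not ggdist's own code: bandwidth_with_fallback() is a hypothetical name, and it assumes that a failure of bw.SJ() surfaces as an R error.

```r
# Hypothetical helper: try the Sheather-Jones direct plug-in estimator first,
# and fall back to the rule-of-thumb estimator (with a warning) if it errors.
bandwidth_with_fallback = function(x) {
  tryCatch(
    stats::bw.SJ(x, method = "dpi"),
    error = function(e) {
      warning("bw.SJ() failed (", conditionMessage(e), "); using bw.nrd0() instead")
      stats::bw.nrd0(x)
    }
  )
}

set.seed(1234)
x = rnorm(200)
bw = bandwidth_with_fallback(x)
```

On well-behaved continuous data such as this sample the first branch succeeds, so the result is simply the direct plug-in bandwidth.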
A single number giving the bandwidth.
density_bounded(), density_unbounded().
Bins the provided data values using one of several dotplot algorithms.
bin_dots(
  x,
  y,
  binwidth,
  heightratio = 1,
  stackratio = 1,
  layout = c("bin", "weave", "hex", "swarm", "bar"),
  side = c("topright", "top", "right", "bottomleft", "bottom", "left", "topleft",
    "bottomright", "both"),
  orientation = c("horizontal", "vertical", "y", "x"),
  overlaps = "nudge"
)
x |
<numeric> x values. |
y |
<numeric> y values (same length as |
binwidth |
<scalar numeric> Bin width. |
heightratio |
<scalar numeric> Ratio of bin width to dot height |
stackratio |
<scalar numeric> Ratio of dot height to vertical distance between dot centers |
layout |
<string> The layout method used for the dots. One of:
|
side |
Which side to place the slab on. |
orientation |
<string> Whether the dots are laid out horizontally
or vertically. Follows the naming scheme of
For compatibility with the base ggplot naming scheme for |
overlaps |
<string> How to handle overlapping dots or bins in the
|
A data.frame with three columns:
x: the x position of each dot
y: the y position of each dot
bin: a unique number associated with each bin (supplied but not used when layout = "swarm")
find_dotplot_binwidth() for an algorithm that finds good bin widths
to use with this function; geom_dotsinterval() for geometries that use
these algorithms to create dotplots.
library(dplyr)
library(ggplot2)
library(ggdist)

x = qnorm(ppoints(20))
bin_df = bin_dots(x = x, y = 0, binwidth = 0.5, heightratio = 1)
bin_df

# we can manually plot the binning above, though this is only recommended
# if you are using find_dotplot_binwidth() and bin_dots() to build your own
# grob. For practical use it is much easier to use geom_dots(), which will
# automatically select good bin widths for you (and which uses
# find_dotplot_binwidth() and bin_dots() internally)
bin_df %>%
  ggplot(aes(x = x, y = y)) +
  geom_point(size = 4) +
  coord_fixed()
Methods for constructing blurs, as used in the blur argument to
geom_blur_dots() or stat_mcse_dots().
Supports automatic partial function application with waived arguments.
blur_gaussian(x, r, sd)
blur_interval(x, r, sd, .width = 0.95)
x | <numeric> Vector of positive distances from the center of the dot (assumed to be 0) to evaluate blur function at. |
r | <scalar numeric> Radius of the dot that is being blurred. |
sd | <scalar numeric> Standard deviation of the dot that is being blurred. |
.width | <scalar numeric> For |
These functions are passed x, r, and sd when geom_blur_dots()
draws in order to create a radial gradient representing each dot in the
dotplot. They return values between 0 and 1 giving the opacity of the
dot at each value of x.
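blur_gaussian() creates a dot with radius r that has a Gaussian blur with
standard deviation sd applied to it. It does this by calculating
$\alpha(x; r, \sigma)$, the opacity at distance $x$ from the center
of a dot with radius $r$ that has had a Gaussian blur with standard
deviation $\sigma$ = sd applied to it:
$$\alpha(x; r, \sigma) = \Phi\left(\frac{x + r}{\sigma}\right) - \Phi\left(\frac{x - r}{\sigma}\right)$$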
blur_interval() creates an interval-type representation around the
dot at 50% opacity, where the interval is a Gaussian quantile interval with
mass equal to .width and standard deviation sd.
A vector with the same length as x giving the opacity of the radial
gradient representing the dot at each x value.
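A Gaussian-blurred dot profile of this kind can be sketched in base R. The helper below is a hypothetical re-derivation, not ggdist's code: it assumes the one-dimensional picture in which a dot that is fully opaque on [-r, r] is convolved with a Normal(0, sd) kernel, which gives a difference of two normal CDFs.

```r
# Hypothetical helper: opacity at distance x from the center of a dot of
# radius r after a Gaussian blur with standard deviation sd. This equals the
# probability that a Normal(x, sd) draw lands inside [-r, r].
gaussian_blur_opacity = function(x, r, sd) {
  pnorm(x + r, mean = 0, sd = sd) - pnorm(x - r, mean = 0, sd = sd)
}

x = seq(0, 3, length.out = 7)
opacity = gaussian_blur_opacity(x, r = 1, sd = 0.5)
```

The result has the same length as x, stays between 0 and 1, and fades as x moves away from the dot, matching the kind of vector a blur function is expected to return.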
geom_blur_dots() and stat_mcse_dots() for geometries making use of
blur functions.
# see examples in geom_blur_dots()
Estimate the bounds of the distribution a sample came from using the CDF of
the order statistics of the sample. Use with the bounder
argument to density_bounded().
Supports automatic partial function application with waived arguments.
bounder_cdf(x, p = 0.01)
x | <numeric> Sample to estimate the bounds of. |
p | <scalar numeric> in |
bounder_cdf() uses the distribution of the order statistics of $X$
to estimate where the first and last order statistics (i.e. the
min and max) of this distribution would be, assuming the sample x is the
distribution. Then, it adjusts the boundary outwards from min(x) (or max(x))
by the distance between min(x) (or max(x)) and the nearest estimated
order statistic.
Taking $X$ = x, the distributions of the first and last order statistics are:
$$F_{X_{(1)}}(x) = 1 - \left[1 - F_X(x)\right]^n$$
$$F_{X_{(n)}}(x) = F_X(x)^n$$
Re-arranging, we can get the inverse CDFs (quantile functions) of each
order statistic in terms of the quantile function of $X$ (which we
can estimate from the data):
$$F_{X_{(1)}}^{-1}(q) = F_X^{-1}\left[1 - (1 - q)^{1/n}\right]$$
$$F_{X_{(n)}}^{-1}(q) = F_X^{-1}\left[q^{1/n}\right]$$
Evaluating these at a percentile determined by $p$ gives us estimates
$\hat{x}_1$ and $\hat{x}_n$ for the minimum and maximum order statistic.
Then the estimated bounds are:
$$\left[2\min(x) - \hat{x}_1,\; 2\max(x) - \hat{x}_n\right]$$
These bounds depend on $p$, the percentile of the distribution of the order
statistic used to form the estimate. While $p = 0.5$ (the median) might be
a reasonable choice (and gives results similar to bounder_cooke()), this tends
to be a bit too aggressive in "detecting" bounded distributions, especially in
small sample sizes. Thus, we use a default of $p = 0.01$, which tends to
be very conservative in small samples (in that it usually gives results
roughly equivalent to an unbounded distribution), but which still performs
well on bounded distributions when sample sizes are larger (in the thousands).
A length-2 numeric vector giving an estimate of the minimum and maximum bounds
of the distribution that x came from.
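The order-statistic idea can be sketched in base R using the sample quantile function. This is an illustration under stated assumptions, not ggdist's implementation: cdf_bounds_sketch() is a hypothetical name, and the choice of which percentile of each order statistic to use (the p-th percentile of the maximum and, by symmetry, the (1 - p)-th percentile of the minimum) is an assumption made here so that smaller p pushes both bounds outwards.

```r
# Hypothetical sketch of bounds estimated from order-statistic quantiles.
# For an i.i.d. sample of size n:
#   quantile function of the max at q:  Q(q^(1/n))
#   quantile function of the min at q:  Q(1 - (1 - q)^(1/n))
# ASSUMPTION: the max is evaluated at q = p and the min at q = 1 - p.
cdf_bounds_sketch = function(x, p = 0.01) {
  n = length(x)
  x_1_hat = quantile(x, 1 - p^(1 / n), names = FALSE)  # estimated minimum
  x_n_hat = quantile(x, p^(1 / n), names = FALSE)      # estimated maximum
  # push each bound outwards by the gap between the extreme and its estimate
  c(2 * min(x) - x_1_hat, 2 * max(x) - x_n_hat)
}

set.seed(1234)
x = runif(1000)
bounds_p01 = cdf_bounds_sketch(x, p = 0.01)
bounds_p50 = cdf_bounds_sketch(x, p = 0.5)
```

Under this convention the p = 0.01 bounds are never tighter than the p = 0.5 bounds, which mirrors the trade-off between aggressive and conservative choices of p.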
The bounder argument to density_bounded().
Other bounds estimators: bounder_cooke(), bounder_range()
Estimate the bounds of the distribution a sample came from using Cooke's method.
Use with the bounder argument to density_bounded().
Supports automatic partial function application with waived arguments.
bounder_cooke(x)
x | <numeric> Sample to estimate the bounds of. |
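Estimate the bounds of a distribution using the method from Cooke (1979); i.e. method 2.3 from Loh (1984). These bounds are:
$$\left[\; 2X_{(1)} - \sum_{i=1}^{n} \left[\left(1 - \frac{i - 1}{n}\right)^n - \left(1 - \frac{i}{n}\right)^n\right] X_{(i)},\;\; 2X_{(n)} - \sum_{i=1}^{n} \left[\left(1 - \frac{n - i}{n}\right)^n - \left(1 - \frac{n + 1 - i}{n}\right)^n\right] X_{(i)} \;\right]$$
Where $X_{(i)}$ is the $i$th order statistic of x (i.e. its $i$th-smallest value).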
A length-2 numeric vector giving an estimate of the minimum and maximum bounds
of the distribution that x came from.
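Cooke's estimator is short enough to write out in base R. The sketch below uses the commonly cited form of the estimator (each bound is twice the sample extreme minus a weighted sum of order statistics, with weights concentrated near that extreme); cooke_bounds_sketch() is a hypothetical name and this is not ggdist's code.

```r
# Hypothetical sketch of Cooke-style endpoint estimates.
cooke_bounds_sketch = function(x) {
  x = sort(x)
  n = length(x)
  i = seq_len(n)
  # weights on the order statistics; each set telescopes to a total of 1
  w_lower = (1 - (i - 1) / n)^n - (1 - i / n)^n          # heaviest on x[1]
  w_upper = (1 - (n - i) / n)^n - (1 - (n + 1 - i) / n)^n  # heaviest on x[n]
  c(2 * x[1] - sum(w_lower * x), 2 * x[n] - sum(w_upper * x))
}

set.seed(1234)
x = runif(1000)
bounds = cooke_bounds_sketch(x)
tiny = cooke_bounds_sketch(c(0, 1))
```

Because the weights are non-negative and sum to 1, each weighted sum lies inside the sample range, so the estimated bounds always sit at or outside min(x) and max(x).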
Cooke, P. (1979). Statistical inference for bounds of random variables. Biometrika 66(2), 367–374. doi:10.1093/biomet/66.2.367.
Loh, W. Y. (1984). Estimating an endpoint of a distribution with resampling methods. The Annals of Statistics 12(4), 1543–1550. doi:10.1214/aos/1176346811
The bounder argument to density_bounded().
Other bounds estimators: bounder_cdf(), bounder_range()
Estimate the bounds of the distribution a sample came from using the range of the sample.
Use with the bounder argument to density_bounded().
Supports automatic partial function application with waived arguments.
bounder_range(x)
x | <numeric> Sample to estimate the bounds of. |
Estimate the bounds of a distribution using range(x).
A length-2 numeric vector giving an estimate of the minimum and maximum bounds
of the distribution that x came from.
The bounder argument to density_bounded().
Other bounds estimators: bounder_cdf(), bounder_cooke()
Methods for determining breaks (bins) in histograms, as used in the breaks
argument to density_histogram().
Supports automatic partial function application with waived arguments.
breaks_fixed(x, weights = NULL, width = 1)
breaks_Sturges(x, weights = NULL)
breaks_Scott(x, weights = NULL)
breaks_FD(x, weights = NULL, digits = 5)
breaks_quantiles(x, weights = NULL, max_n = "Scott", min_width = 0.5)
x | <numeric> Sample values. |
weights | <numeric | NULL> Optional weights to apply to |
width | <scalar numeric> For |
digits | <scalar numeric> For |
max_n | <scalar numeric | function | string> For |
min_width | <scalar numeric> For |
These functions take a sample and its weights and return a value suitable for
the breaks argument to density_histogram() that will determine the histogram
breaks.
breaks_fixed() allows you to manually specify a fixed bin width.
breaks_Sturges(), breaks_Scott(), and breaks_FD() implement weighted
versions of their corresponding base functions. They return a scalar
numeric giving the number of bins. See nclass.Sturges(), nclass.scott(),
and nclass.FD().
breaks_quantiles() constructs irregularly-sized bins using max_n + 1
(possibly weighted) quantiles of x. The final number of bins is
at most max_n, as small bins (ones whose bin width is less than half
the range of the data divided by max_n times min_width) will be merged
into adjacent bins.
Either a single number (giving the number of bins) or a vector giving the edges between bins.
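The two kinds of return value (a bin count or a vector of bin edges) can be illustrated with base R alone. This sketch uses the unweighted base functions that the text points to; quantile_edges() is a hypothetical helper that places edges at equally spaced quantiles and deliberately omits the merging of narrow bins described above.

```r
set.seed(1234)
x = rnorm(2000, 1, 2)

# unweighted bin counts from the corresponding base functions
n_bins = c(
  Sturges = grDevices::nclass.Sturges(x),
  Scott = grDevices::nclass.scott(x),
  FD = grDevices::nclass.FD(x)
)

# hypothetical helper: bin edges at max_n + 1 equally spaced quantiles of x
# (no merging of small bins, unlike the behaviour described for ggdist)
quantile_edges = function(x, max_n) {
  unname(quantile(x, probs = seq(0, 1, length.out = max_n + 1)))
}

edges = quantile_edges(x, max_n = 10)
```

The quantile-based edges give bins holding roughly equal numbers of observations, so bins are narrow where the data are dense and wide in the tails.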
library(ggplot2)
library(ggdist)

set.seed(1234)
x = rnorm(2000, 1, 2)

# Let's compare the different break-selection algorithms on this data:
ggplot(data.frame(x), aes(x)) +
  stat_slab(
    aes(y = "breaks_fixed(width = 0.5)"),
    density = "histogram",
    breaks = breaks_fixed(width = 0.5),
    outline_bars = TRUE,
    color = "black"
  ) +
  stat_slab(
    aes(y = "breaks_Sturges()\nor 'Sturges'"),
    density = "histogram",
    breaks = "Sturges",
    outline_bars = TRUE,
    color = "black"
  ) +
  stat_slab(
    aes(y = "breaks_Scott()\nor 'Scott'"),
    density = "histogram",
    breaks = "Scott",
    outline_bars = TRUE,
    color = "black"
  ) +
  stat_slab(
    aes(y = "breaks_FD()\nor 'FD'"),
    density = "histogram",
    breaks = "FD",
    outline_bars = TRUE,
    color = "black"
  ) +
  stat_slab(
    aes(y = "breaks_quantiles()\nor 'quantiles'"),
    density = "histogram",
    breaks = "quantiles",
    outline_bars = TRUE,
    color = "black"
  ) +
  geom_point(aes(y = 0.7), alpha = 0.5) +
  labs(
    subtitle = "ggdist::stat_slab(density = 'histogram', ...)",
    y = "breaks =",
    x = NULL
  )
Translates draws from distributions in a grouped data frame into a set of point and interval summaries using a curve boxplot-inspired approach.
curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd")
)

## S3 method for class 'matrix'
curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd")
)

## S3 method for class 'rvar'
curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd")
)

## S3 method for class 'data.frame'
curve_interval(
  .data,
  ...,
  .along = NULL,
  .width = 0.5,
  na.rm = FALSE,
  .interval = c("mhd", "mbd", "bd", "bd-mbd"),
  .simple_names = TRUE,
  .exclude = c(".chain", ".iteration", ".draw", ".row")
)
.data |
<data.frame | rvar | matrix> One of:
|
... |
<bare language> Bare column names or expressions that, when evaluated in the context of
|
.along |
<tidyselect> Which columns are the input values to the function
describing the curve (e.g., the "x" values). Intervals are calculated jointly with
respect to these variables, conditional on all other grouping variables in the data frame. The default
( |
.width |
<numeric> Vector of probabilities to use that determine the widths of the resulting
intervals. If multiple probabilities are provided, multiple rows per group are generated, each with
a different probability interval (and value of the corresponding |
na.rm |
<scalar logical> Should |
.interval |
<string> The method used to calculate the intervals. Currently, all
methods rank the curves using some measure of data depth, then create envelopes containing the
|
.simple_names |
<scalar logical> When |
.exclude |
<character> Vector of names of columns to be excluded from summarization if no column names are specified to be summarized. Default ignores several meta-data column names used in ggdist and tidybayes. |
Intervals are calculated by ranking the curves using some measure of data depth, then
using binary search to find a cutoff k such that an envelope containing the k% "deepest"
curves also contains .width% of the curves, for each value of .width (note that k
and .width are not necessarily the same). This is in contrast to most functional boxplot
or curve boxplot approaches, which tend to simply take the .width% deepest curves, and
are generally quite conservative (i.e. they may contain more than .width% of the curves).
See Mirzargar et al. (2014) or Juul et al. (2020) for an accessible introduction to data depth and curve boxplots / functional boxplots.
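The difference between "the k deepest curves" and "an envelope containing .width of the curves" can be sketched in base R. Everything here is a simplified illustration, not ggdist's algorithm: curve_envelope_sketch() is a hypothetical name, the depth measure is a simple pointwise rank depth averaged over x (chosen for brevity), and a linear scan over k stands in for the binary search.

```r
# Hypothetical sketch: grow an envelope from the deepest curves until it
# fully contains at least `width` of all curves.
# `curves` is a matrix with one row per curve and one column per x value.
curve_envelope_sketch = function(curves, width = 0.5) {
  n = nrow(curves)
  # pointwise depth: distance (in ranks) from the nearest extreme at each x
  ranks = apply(curves, 2, rank)
  pointwise_depth = pmin(ranks, n + 1 - ranks) / n
  depth = rowMeans(pointwise_depth)
  deepest_first = order(depth, decreasing = TRUE)

  for (k in seq_len(n)) {
    chosen = curves[deepest_first[seq_len(k)], , drop = FALSE]
    lower = apply(chosen, 2, min)
    upper = apply(chosen, 2, max)
    # a curve counts as inside only if it is inside at every x
    inside = apply(curves, 1, function(curve) all(curve >= lower & curve <= upper))
    if (mean(inside) >= width) break
  }
  list(k = k, lower = lower, upper = upper, coverage = mean(inside))
}

# eleven normal density curves with different means
x_grid = seq(-15, 15, length.out = 201)
means = seq(-5, 5, length.out = 11)
curves = t(sapply(means, function(m) dnorm(x_grid, m, 3)))

res = curve_envelope_sketch(curves, width = 0.5)
```

The envelope of the k deepest curves can contain additional curves beyond those k, which is why the cutoff k needed to reach a given coverage may be smaller than .width times the number of curves.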
A data frame containing point summaries and intervals, with at least one column corresponding
to the point summary, one to the lower end of the interval, one to the upper end of the interval, the
width of the interval (.width), the type of point summary (.point), and the type of interval (.interval).
Matthew Kay
Fraiman, Ricardo and Graciela Muniz. (2001). "Trimmed means for functional data". Test 10: 419–440. doi:10.1007/BF02595706.
Sun, Ying and Marc G. Genton. (2011). "Functional Boxplots". Journal of Computational and Graphical Statistics, 20(2): 316-334. doi:10.1198/jcgs.2011.09224
Mirzargar, Mahsa, Ross T Whitaker, and Robert M Kirby. (2014). "Curve Boxplot: Generalization of Boxplot for Ensembles of Curves". IEEE Transactions on Visualization and Computer Graphics. 20(12): 2654-2663. doi:10.1109/TVCG.2014.2346455
Juul Jonas, Kaare Græsbøll, Lasse Engbo Christiansen, and Sune Lehmann. (2020). "Fixed-time descriptive statistics underestimate extremes of epidemic curve ensembles". arXiv e-print. arXiv:2007.05035
point_interval() for pointwise intervals. See vignette("lineribbon") for more examples
and discussion of the differences between pointwise and curvewise intervals.
library(dplyr)
library(ggplot2)
library(ggdist)

# generate a set of curves
k = 11 # number of curves
n = 201
df = tibble(
  .draw = rep(1:k, n),
  mean = rep(seq(-5, 5, length.out = k), n),
  x = rep(seq(-15, 15, length.out = n), each = k),
  y = dnorm(x, mean, 3)
)

# see pointwise intervals...
df %>%
  group_by(x) %>%
  median_qi(y, .width = c(.5)) %>%
  ggplot(aes(x = x, y = y)) +
  geom_lineribbon(aes(ymin = .lower, ymax = .upper)) +
  geom_line(aes(group = .draw), alpha = 0.15, data = df) +
  scale_fill_brewer() +
  ggtitle("50% pointwise intervals with point_interval()") +
  theme_ggdist()

# ... compare them to curvewise intervals
df %>%
  group_by(x) %>%
  curve_interval(y, .width = c(.5)) %>%
  ggplot(aes(x = x, y = y)) +
  geom_lineribbon(aes(ymin = .lower, ymax = .upper)) +
  geom_line(aes(group = .draw), alpha = 0.15, data = df) +
  scale_fill_brewer() +
  ggtitle("50% curvewise intervals with curve_interval()") +
  theme_ggdist()
Given a vector of probabilities from a cumulative distribution function (CDF)
and a list of desired quantile intervals, return a vector categorizing each
element of the input vector according to which quantile interval it falls into.
NOTE: While this function can be used for (and was originally designed for)
drawing slabs with intervals overlaid on the density, this can now be
done more easily by mapping the .width or level computed variable to
slab fill or color. See Examples.
cut_cdf_qi(p, .width = c(0.66, 0.95, 1), labels = NULL)
p | <numeric> Vector of values from a cumulative distribution function, such as values returned by |
.width | <numeric> Vector of probabilities to use that determine the widths of the resulting intervals. |
labels | <character | function | NULL> One of: |
An ordered factor of the same length as p giving the quantile interval to
which each value of p belongs.
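The categorization can be sketched with base R's cut(). This is an illustrative reading of the description rather than the package's code: cut_cdf_sketch() is a hypothetical name, and it relies on the fact that a central quantile interval of mass w covers exactly those CDF values p with |1 - 2p| <= w.

```r
# Hypothetical sketch: assign each CDF value to the narrowest central
# quantile interval that contains it.
cut_cdf_sketch = function(p, widths = c(0.66, 0.95, 1)) {
  widths = sort(widths)
  cut(
    abs(1 - 2 * p),          # 0 at the median, 1 at the extremes
    breaks = c(0, widths),
    labels = widths,
    include.lowest = TRUE,
    ordered_result = TRUE
  )
}

# CDF values of a standard normal at the median, in the lower tail,
# and far in the upper tail
p = pnorm(c(0, -1.5, 2.5))
intervals = cut_cdf_sketch(p)
```

Values beyond the widest requested interval fall outside every break and come back as NA, which is one way NA levels can show up in a fill scale when the 100% interval is omitted.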
See stat_slabinterval() and its shortcut stats, which generate cdf
aesthetics that can be used with cut_cdf_qi() to draw slabs colored by their intervals.
library(ggplot2)
library(dplyr)
library(scales)
library(distributional)
library(ggdist)

theme_set(theme_ggdist())

# NOTE: cut_cdf_qi() used to be the recommended way to do intervals overlaid
# on densities, like this...
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_slab(
    aes(fill = after_stat(cut_cdf_qi(cdf)))
  ) +
  scale_fill_brewer(direction = -1)

# ... however this is now more easily and flexibly accomplished by directly
# mapping .width or level onto fill:
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_slab(
    aes(fill = after_stat(level)),
    .width = c(.66, .95, 1)
  ) +
  scale_fill_brewer()

# See vignette("slabinterval") for more examples. The remaining examples
# below using cut_cdf_qi() are kept for posterity.

# With a halfeye (or other geom with slab and interval), NA values will
# show up in the fill scale from the CDF function applied to the internal
# interval geometry data and can be ignored, hence na.translate = FALSE
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_halfeye(aes(
    fill = after_stat(cut_cdf_qi(cdf, .width = c(.5, .8, .95, 1)))
  )) +
  scale_fill_brewer(direction = -1, na.translate = FALSE)

# we could also use the labels parameter to apply nicer formatting
# and provide a better name for the legend, and omit the 100% interval
# if desired
tibble(x = dist_normal(0, 1)) %>%
  ggplot(aes(xdist = x)) +
  stat_halfeye(aes(
    fill = after_stat(cut_cdf_qi(
      cdf,
      .width = c(.5, .8, .95),
      labels = percent_format(accuracy = 1)
    ))
  )) +
  labs(fill = "Interval") +
  scale_fill_brewer(direction = -1, na.translate = FALSE)
Bounded density estimator using the reflection method.
Supports automatic partial function application with waived arguments.
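As a rough illustration of the reflection idea (not ggdist's actual implementation), the sketch below builds a Gaussian kernel density estimate on a bounded interval by adding mirrored copies of the sample about each bound. The helper name kde_reflect, the fixed known bounds, and the use of stats::bw.nrd0() for the bandwidth are assumptions made for this example only.

```r
# Hypothetical helper: Gaussian KDE on [lower, upper] using reflection.
# Assumes both bounds are known and finite.
kde_reflect <- function(x, lower, upper, n = 201, bw = stats::bw.nrd0(x)) {
  grid <- seq(lower, upper, length.out = n)
  # average Gaussian kernel height at each grid point for a set of centers
  kde_at <- function(centers) {
    vapply(
      grid,
      function(g) mean(stats::dnorm(g, mean = centers, sd = bw)),
      numeric(1)
    )
  }
  # original sample plus its mirror images about the lower and upper bounds
  y <- kde_at(x) + kde_at(2 * lower - x) + kde_at(2 * upper - x)
  list(x = grid, y = y, bw = bw)
}

set.seed(123)
x <- rbeta(2000, 1, 3)
d <- kde_reflect(x, lower = 0, upper = 1)

# trapezoidal area under the estimate; should be close to 1
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
area
```

Without the two mirrored terms, kernel mass near a bound leaks outside the support and the estimate is pulled down close to that bound.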
density_bounded(
  x,
  weights = NULL,
  n = 501,
  bandwidth = "dpi",
  adjust = 1,
  kernel = "gaussian",
  trim = TRUE,
  bounds = c(NA, NA),
  bounder = "cdf",
  adapt = 1,
  na.rm = FALSE,
  ...,
  range_only = FALSE
)
x |
<numeric> Sample to compute a density estimate for. |
weights |
|
n |
<scalar numeric> The number of grid points to evaluate the density estimator at. |
bandwidth |
<scalar numeric | function | string> Bandwidth of the density estimator. One of:
|
adjust |
<scalar numeric> Value to multiply the bandwidth of the density estimator by. Default |
kernel |
<string> The smoothing kernel to be used. This must partially
match one of |
trim |
<scalar logical> Should the density estimate be trimmed to the range of the data? Default |
bounds |
<length-2 numeric> Min and max bounds. If a bound is |
bounder |
<function | string> Method to use to find missing
(
|
adapt |
<positive integer> (very experimental) The name and interpretation of this argument
are subject to change without notice. If |
na.rm |
<scalar logical> Should missing ( |
... |
Additional arguments (ignored). |
range_only |
<scalar logical> If |
An object of class "density", mimicking the output format of stats::density(), with the following components:
x: The grid of points at which the density was estimated.
y: The estimated density values.
bw: The bandwidth.
n: The sample size of the x input argument.
call: The call used to produce the result, as a quoted expression.
data.name: The deparsed name of the x input argument.
has.na: Always FALSE (for compatibility).
cdf: Values of the (possibly weighted) empirical cumulative distribution function at x. See weighted_ecdf().
This allows existing methods for density objects, like print() and plot(), to work if desired.
This output format (and in particular, the x and y components) is also the format expected by the density argument of the stat_slabinterval() and the smooth_ family of functions.
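To make the output format concrete, the following sketch assembles an object with the same components by hand, using only base R. It assumes stats::density() supplies the estimate and stats::ecdf() the unweighted CDF values; it illustrates the structure described above and is not how the ggdist estimators are implemented.

```r
set.seed(123)
x <- rbeta(500, 1, 3)

# base R estimate used only as a source of grid points and density values
base_d <- stats::density(x, n = 101)

d <- structure(
  list(
    x = base_d$x,
    y = base_d$y,
    bw = base_d$bw,
    n = length(x),
    call = quote(density(x)),        # placeholder call for illustration
    data.name = "x",
    has.na = FALSE,
    cdf = stats::ecdf(x)(base_d$x)   # empirical CDF evaluated on the grid
  ),
  class = "density"
)

# the standard print method for density objects works on the result
print(d)
```

Because the class is "density", plot(d) dispatches to the usual base graphics method as well.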
Cooke, P. (1979). Statistical inference for bounds of random variables. Biometrika 66(2), 367–374. doi:10.1093/biomet/66.2.367.
Loh, W. Y. (1984). Estimating an endpoint of a distribution with resampling methods. The Annals of Statistics 12(4), 1543–1550. doi:10.1214/aos/1176346811
Other density estimators: density_histogram(), density_unbounded()
library(ggdist)
library(distributional)
library(dplyr)
library(ggplot2)

# For compatibility with existing code, the return type of density_bounded()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_bounded(x)
d

# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_bounded() can also be used with
# base::plot():
plot(d)

# here we'll use the same data as above, but pick either density_bounded()
# or density_unbounded() (which is equivalent to stats::density()). Notice
# how the unbounded density (green) is biased near the boundary of the
# support, while the bounded density (orange) is not.
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(aes(x), density = "bounded", fill = NA, color = "#d95f02", alpha = 0.5) +
  stat_slab(aes(x), density = "unbounded", fill = NA, color = "#1b9e77", alpha = 0.5) +
  scale_thickness_shared() +
  theme_ggdist()

# We can also supply arguments to the density estimators by using their
# full function names instead of the string suffix; e.g. we can supply
# the exact bounds of c(0,1) rather than using the bounds of the data.
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(
    aes(x), fill = NA, color = "#d95f02", alpha = 0.5,
    density = density_bounded(bounds = c(0,1))
  ) +
  scale_thickness_shared() +
  theme_ggdist()
Histogram density estimator.
Supports automatic partial function application with waived arguments.
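For intuition, a histogram density estimate can be sketched in base R with graphics::hist(), which also accepts breaks = "Scott". This is an analogy that assumes bar heights are normalized so the total area is 1; it does not claim to reproduce the exact breaks or alignment chosen by density_histogram().

```r
set.seed(123)
x <- rbeta(5000, 1, 3)

# compute the histogram without drawing it
h <- graphics::hist(x, breaks = "Scott", plot = FALSE)

# bar heights are densities: each bar's area is the proportion of x in its bin
bar_areas <- h$density * diff(h$breaks)
sum(bar_areas)
```

The piecewise-constant heights in h$density play the role of the y component of a density object, with the bin edges in h$breaks marking where the height changes.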
density_histogram(
  x,
  weights = NULL,
  breaks = "Scott",
  align = "none",
  outline_bars = FALSE,
  right_closed = TRUE,
  outermost_closed = TRUE,
  na.rm = FALSE,
  ...,
  range_only = FALSE
)
x |
<numeric> Sample to compute a density estimate for. |
weights |
|
breaks |
<numeric | function | string> Determines the breakpoints defining bins. Default
For example, |
align |
<scalar numeric | function | string> Determines how to align the breakpoints defining bins. Default
For example, |
outline_bars |
<scalar logical> Should outlines in between the bars (i.e. density values of 0) be included? |
right_closed |
<scalar logical> Should the right edge of each bin be closed? For
a bin with endpoints
Equivalent to the |
outermost_closed |
<scalar logical> Should values on the edges of the outermost (first
or last) bins always be included in those bins? If Equivalent to the |
na.rm |
<scalar logical> Should missing ( |
... |
Additional arguments (ignored). |
range_only |
<scalar logical> If |
An object of class "density", mimicking the output format of stats::density(), with the following components:
x: The grid of points at which the density was estimated.
y: The estimated density values.
bw: The bandwidth.
n: The sample size of the x input argument.
call: The call used to produce the result, as a quoted expression.
data.name: The deparsed name of the x input argument.
has.na: Always FALSE (for compatibility).
cdf: Values of the (possibly weighted) empirical cumulative distribution function at x. See weighted_ecdf().
This allows existing methods for density objects, like print() and plot(), to work if desired.
This output format (and in particular, the x and y components) is also the format expected by the density argument of the stat_slabinterval() and the smooth_ family of functions.
Other density estimators: density_bounded(), density_unbounded()
library(ggdist)
library(distributional)
library(dplyr)
library(ggplot2)

# For compatibility with existing code, the return type of density_histogram()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_histogram(x)
d

# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_histogram() can also be used with
# base::plot():
plot(d)

# here we'll use the same data as above with stat_slab():
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(aes(x), density = "histogram", fill = NA, color = "#d95f02", alpha = 0.5) +
  scale_thickness_shared() +
  theme_ggdist()
Unbounded density estimator using stats::density().
Supports automatic partial function application with waived arguments.
density_unbounded(
  x,
  weights = NULL,
  n = 501,
  bandwidth = "dpi",
  adjust = 1,
  kernel = "gaussian",
  trim = TRUE,
  adapt = 1,
  na.rm = FALSE,
  ...,
  range_only = FALSE
)
x |
<numeric> Sample to compute a density estimate for. |
weights |
|
n |
<scalar numeric> The number of grid points to evaluate the density estimator at. |
bandwidth |
<scalar numeric | function | string> Bandwidth of the density estimator. One of:
|
adjust |
<scalar numeric> Value to multiply the bandwidth of the density estimator by. Default |
kernel |
<string> The smoothing kernel to be used. This must partially
match one of |
trim |
<scalar logical> Should the density estimate be trimmed to the range of the data? Default |
adapt |
<positive integer> (very experimental) The name and interpretation of this argument
are subject to change without notice. If |
na.rm |
<scalar logical> Should missing ( |
... |
Additional arguments (ignored). |
range_only |
<scalar logical> If |
An object of class "density", mimicking the output format of stats::density(), with the following components:
x: The grid of points at which the density was estimated.
y: The estimated density values.
bw: The bandwidth.
n: The sample size of the x input argument.
call: The call used to produce the result, as a quoted expression.
data.name: The deparsed name of the x input argument.
has.na: Always FALSE (for compatibility).
cdf: Values of the (possibly weighted) empirical cumulative distribution function at x. See weighted_ecdf().
This allows existing methods for density objects, like print() and plot(), to work if desired.
This output format (and in particular, the x and y components) is also the format expected by the density argument of the stat_slabinterval() and the smooth_ family of functions.
Other density estimators: density_bounded(), density_histogram()
library(ggdist)
library(distributional)
library(dplyr)
library(ggplot2)

# For compatibility with existing code, the return type of density_unbounded()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_unbounded(x)
d

# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_unbounded() can also be used with
# base::plot():
plot(d)

# here we'll use the same data as above, but pick either density_bounded()
# or density_unbounded() (which is equivalent to stats::density()). Notice
# how the unbounded density (green) is biased near the boundary of the
# support, while the bounded density (orange) is not.
data.frame(x) %>%
  ggplot() +
  stat_slab(
    aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
    alpha = 0.25
  ) +
  stat_slab(aes(x), density = "bounded", fill = NA, color = "#d95f02", alpha = 0.5) +
  stat_slab(aes(x), density = "unbounded", fill = NA, color = "#1b9e77", alpha = 0.5) +
  scale_thickness_shared() +
  theme_ggdist()
Searches for a nice-looking bin width to use to draw a dotplot such that the height of the dotplot fits within a given space (maxheight).
find_dotplot_binwidth(
  x,
  maxheight,
  heightratio = 1,
  stackratio = 1,
  layout = c("bin", "weave", "hex", "swarm", "bar")
)
x |
<numeric> Data values. |
maxheight |
<scalar numeric> Maximum height of the dotplot. |
heightratio |
<scalar numeric> Ratio of bin width to dot height. |
stackratio |
<scalar numeric> Ratio of dot height to vertical distance between dot centers |
layout |
<string> The layout method used for the dots. One of:
|
This dynamic bin selection algorithm uses a binary search over the number of bins to find a bin width such that, if the input data (x) is binned using a Wilkinson-style dotplot algorithm, the height of the tallest bin will be less than maxheight.
This algorithm is used by geom_dotsinterval() (and its variants) to automatically select bin widths. Unless you are manually implementing your own dotplot grob or geom, you probably do not need to use this function directly.
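The search described above can be sketched in base R. This is a simplified illustration rather than the package's implementation: the helper names wilkinson_counts and sketch_binwidth are hypothetical, dots are assumed to be circles whose diameter equals the bin width (so the tallest stack has height count × binwidth), and heightratio, stackratio, and layout are ignored.

```r
# Hypothetical helper: Wilkinson-style binning of sorted data. Each bin starts
# at the first unbinned point and takes all points within `binwidth` of it.
wilkinson_counts <- function(x, binwidth) {
  x <- sort(x)
  n <- length(x)
  counts <- numeric(0)
  i <- 1
  while (i <= n) {
    j <- i
    while (j < n && x[j + 1] < x[i] + binwidth) j <- j + 1
    counts <- c(counts, j - i + 1)
    i <- j + 1
  }
  counts
}

# Hypothetical helper: binary search over the number of bins for a bin width
# whose tallest stack fits within `maxheight`. Assumes x is not constant and
# maxheight > 0.
sketch_binwidth <- function(x, maxheight) {
  rng <- diff(range(x))
  tallest <- function(k) {
    binwidth <- rng / k
    max(wilkinson_counts(x, binwidth)) * binwidth
  }
  lo <- 1
  hi <- length(x)
  while (tallest(hi) > maxheight) hi <- hi * 2  # ensure the upper end fits
  while (lo < hi) {
    mid <- (lo + hi) %/% 2
    if (tallest(mid) <= maxheight) hi <- mid else lo <- mid + 1
  }
  rng / lo
}

x <- qnorm(ppoints(20))
binwidth <- sketch_binwidth(x, maxheight = 4)
tallest_stack <- max(wilkinson_counts(x, binwidth)) * binwidth
tallest_stack
```

Searching over the number of bins rather than the width directly keeps the candidates discrete; fewer bins means larger dots, so the search moves toward the fewest bins that still fit.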
A suitable bin width such that a dotplot created with this bin width and heightratio should have its tallest bin be less than or equal to maxheight.
bin_dots() for an algorithm that can bin dots using bin widths selected by this function; geom_dotsinterval() for geometries that use these algorithms to create dotplots.
library(ggdist)
library(dplyr)
library(ggplot2)

x = qnorm(ppoints(20))
binwidth = find_dotplot_binwidth(x, maxheight = 4, heightratio = 1)
binwidth

bin_df = bin_dots(x = x, y = 0, binwidth = binwidth, heightratio = 1)
bin_df

# we can manually plot the binning above, though this is only recommended
# if you are using find_dotplot_binwidth() and bin_dots() to build your own
# grob. For practical use it is much easier to use geom_dots(), which will
# automatically select good bin widths for you (and which uses
# find_dotplot_binwidth() and bin_dots() internally)
bin_df %>%
  ggplot(aes(x = x, y = y)) +
  geom_point(size = 4) +
  coord_fixed()
Variant of geom_dots() for creating blurry dotplots. Accepts an sd aesthetic that gives the standard deviation of the blur applied to the dots. Requires a graphics engine supporting radial gradients. Unlike geom_dots(), this geom only supports circular and square shapes.
geom_blur_dots(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  blur = "gaussian",
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "warn",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
blur |
<function | string> Blur function to apply to dots. One of:
|
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
layout |
<string> The layout method used for the dots. One of:
|
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:
Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
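The frequency framing can be illustrated without any plotting: a quantile dotplot with 20 dots shows one dot per evenly spaced quantile, so a probability can be read off by counting dots. The sketch below uses base R only; the simulated sample and the threshold of 8 are made up for illustration.

```r
set.seed(1234)
# hypothetical predictive sample, e.g. minutes until a bus arrives
x <- rnorm(1000, mean = 10, sd = 2)

# 20 evenly spaced quantiles: each would be one dot in the dotplot
q <- quantile(x, ppoints(20), names = FALSE)

# number of dots below the threshold: "about k in 20" chance of x < 8
k <- sum(q < 8)
k
```

Passing the quantiles argument to stat_dots() or stat_dotsinterval() asks ggdist to plot quantiles of the distribution as dots in a similar spirit, so the counting can be done visually.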
A ggplot2::Geom representing a blurry dot geometry which can be added to a ggplot() object.
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
sd: The standard deviation (in data units) of the blur associated with each dot.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
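The default rule for justification can be written out as a small lookup. The function below is a hypothetical helper for illustration only, not part of ggdist; it treats "topright" and "bottomleft" as synonyms in line with the side aesthetic, and it raises an error for side values whose default is not stated in the rule.

```r
# Hypothetical helper mirroring the stated rule for the default justification.
default_justification <- function(side) {
  if (side %in% c("topright", "top", "right")) {
    0
  } else if (side %in% c("bottomleft", "bottom", "left")) {
    1
  } else if (side == "both") {
    0.5
  } else {
    # the rule gives no default for other values such as "topleft"
    stop("no default described for side = \"", side, "\"")
  }
}

defaults <- vapply(c("top", "left", "both"), default_justification, numeric(1))
defaults
```

In each case the interval ends up at the base of the slab: under a top-side slab, above a bottom-side slab, and centered within a mirrored one.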
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See geom_dotsinterval() for the geometry this shortcut is based on.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms: geom_dots(), geom_dotsinterval(), geom_swarm(), geom_weave()
library(ggdist)
library(dplyr)
library(ggplot2)

theme_set(theme_ggdist())

set.seed(1234)
x = rnorm(1000)

# manually calculate quantiles and their MCSE
# this could also be done more succinctly with stat_mcse_dots()
p = ppoints(100)
df = data.frame(
  q = quantile(x, p),
  se = posterior::mcse_quantile(x, p)
)

df %>%
  ggplot(aes(x = q, sd = se)) +
  geom_blur_dots()

df %>%
  ggplot(aes(x = q, sd = se)) +
  # or blur = blur_interval(.width = .95) to set the interval width
  geom_blur_dots(blur = "interval")
Shortcut version of geom_dotsinterval() for creating dot plots. Geoms based on geom_dotsinterval() create dotplots that automatically ensure the plot fits within the available space.
Roughly equivalent to:
geom_dotsinterval(
  show_point = FALSE,
  show_interval = FALSE
)
geom_dots(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "warn",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
layout |
<string> The layout method used for the dots. One of:
|
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms is similar to ggplot2::geom_dotplot() but with a number of differences:
Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
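As a rough illustration of the automatic bin width selection described above, the sketch below draws the same simulated data twice: once with the default layout and once with a fixed dot size. The data frame df and its columns are made up for this example; binwidth and overflow are the arguments listed in the usage above, and the value "compress" for overflow is taken from the package's argument documentation rather than from this page.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical example data: two groups of 150 draws each
df <- data.frame(
  g = rep(c("a", "b"), each = 150),
  value = c(rnorm(150, 0, 0.75), rnorm(150, 3, 1))
)

# default: the bin width is chosen automatically so the dots fit the space
p_auto <- ggplot(df, aes(x = value, y = g)) +
  geom_dots()

# fixed dot size: request a 1.5 mm bin width and compress the layout
# (allowing dots to overlap) instead of overflowing the geom's bounds
p_fixed <- ggplot(df, aes(x = value, y = g)) +
  geom_dots(binwidth = unit(1.5, "mm"), overflow = "compress")
```

Printing p_auto and p_fixed side by side should make the trade-off visible: smaller dots that never overlap versus constant-size dots that may.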
Stats and geoms in this family include:
geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
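The quantile dotplot idea can be sketched as follows. The draws here are simulated stand-ins for a predictive distribution (the variable name minutes and the lognormal parameters are invented for illustration), and the manual quantile calculation at the end only approximates what the plot shows; the exact quantile rule used internally by stat_dots() is not stated on this page.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical predictive distribution, e.g. minutes until a bus arrives
draws <- data.frame(minutes = rlnorm(5000, meanlog = log(10), sdlog = 0.3))

# 20 dots, each standing for a 1-in-20 chance
p <- ggplot(draws, aes(x = minutes)) +
  stat_dots(quantiles = 20)

# the frequency framing: count how many of 20 evenly spaced quantiles
# fall below a threshold (assumed approximation of the plotted dots)
q <- quantile(draws$minutes, ppoints(20, a = 1/2))
n_below_8 <- sum(q < 8)
```

A reader of the plot can then reason in counts ("about n_below_8 out of 20 dots are below 8 minutes") rather than in probabilities.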
A ggplot2::Geom representing a dot geometry which can be added to a ggplot() object.
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
family: The font family used to draw the dots.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
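The side and scale aesthetics described above can be set as constants in the geom call. The following is a minimal sketch on simulated data (the data frame df is hypothetical), not an excerpt from the package examples.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical example data: two groups of 100 draws each
df <- data.frame(
  g = rep(c("a", "b"), each = 100),
  value = c(rnorm(100, 0, 0.75), rnorm(100, 3, 1))
)

# side = "both" mirrors the dots around each group's position (violin-like);
# scale = 0.5 uses half of the region allocated to each group
p_both <- ggplot(df, aes(x = value, y = g)) +
  geom_dots(side = "both", scale = 0.5)

# side = "bottom" draws the dots below each group's position instead
p_bottom <- ggplot(df, aes(x = value, y = g)) +
  geom_dots(side = "bottom")
```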
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See stat_dots() for the stat version, intended for use on sample data or analytical distributions.
See geom_dotsinterval() for the geometry this shortcut is based on.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms: geom_blur_dots(), geom_dotsinterval(), geom_swarm(), geom_weave()
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete

df %>%
  ggplot(aes(x = value, y = g)) +
  geom_dots()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_dots()
This meta-geom supports drawing combinations of dotplots, points, and intervals. Geoms and stats based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They also ensure dots do not overlap, and allow the generation of quantile dotplots using the quantiles argument to stat_dotsinterval()/stat_dots(). Generally follows the naming scheme and arguments of the geom_slabinterval() and stat_slabinterval() family of geoms and stats.
geom_dotsinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "warn",
  verbose = FALSE,
  orientation = NA,
  interval_size_domain = c(1, 6),
  interval_size_range = c(0.6, 1.4),
  fatten_point = 1.8,
  arrow = NULL,
  show_slab = TRUE,
  show_point = TRUE,
  show_interval = TRUE,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
layout |
<string> The layout method used for the dots. One of:
|
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval
and point sizes, as they tend to be too thick when using the default settings of |
fatten_point |
<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the
size of the thickest interval line. If you wish to specify point sizes directly, you can also use
the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
show_slab |
<scalar logical> Should the slab portion of the geom be drawn? |
show_point |
<scalar logical> Should the point portion of the geom be drawn? |
show_interval |
<scalar logical> Should the interval portion of the geom be drawn? |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms is similar to ggplot2::geom_dotplot() but with a number of differences:
Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
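The two ways of specifying an analytical distribution can be sketched as follows. The data frame est and its columns term, mu, and sigma are hypothetical; the first plot uses the recommended xdist aesthetic with a distributional object, and the second uses the older character-name form with arg1 and arg2.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# hypothetical table of estimates: one Normal distribution per row
est <- data.frame(
  term = c("a", "b", "c"),
  mu = c(0, 1, 3),
  sigma = c(1, 0.5, 1.5)
)

# dist_normal() is vectorized, so columns can be passed to it inside aes()
p_obj <- ggplot(est, aes(y = term, xdist = dist_normal(mu, sigma))) +
  stat_dotsinterval(quantiles = 100)

# older form: "norm" names the family (pnorm/qnorm/dnorm), and
# arg1 and arg2 supply its first two arguments (mean and sd)
p_chr <- ggplot(est, aes(y = term, dist = "norm", arg1 = mu, arg2 = sigma)) +
  stat_dotsinterval(quantiles = 100)
```

Both plots should show the same three distributions; the xdist form is usually preferable because, as noted above, dist does not work as well with orientation detection.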
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
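A minimal sketch of this workflow, using a hand-written table of prior specifications in place of real brms output (the data frame priors and its columns are invented for illustration). It assumes that parse_dist() adds .dist, .args, and .dist_obj columns, which is the behavior of recent ggdist versions but is not stated on this page.

```r
library(ggplot2)
library(ggdist)

# hypothetical prior specifications written as human-readable strings
priors <- data.frame(
  param = c("slope", "sigma"),
  prior = c("normal(0, 1)", "lognormal(0, 1)"),
  stringsAsFactors = FALSE
)

# parse the specs into distribution names, arguments, and distribution objects
parsed <- parse_dist(priors, prior)

# the parsed distribution objects can be mapped directly to xdist
p <- ggplot(parsed, aes(y = param, xdist = .dist_obj)) +
  stat_dotsinterval(quantiles = 100)
```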
A ggplot2::Geom or ggplot2::Stat representing a dotplot or combined dotplot+interval geometry which can be added to a ggplot() object.
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
family: The font family used to draw the dots.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
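The order aesthetic can be sketched as below, using geom_dots() (the dots-only shortcut in this family) on simulated data. The data frame df and its type column are hypothetical. Setting group = NA, so that all dots are laid out as a single group rather than one group per fill value, follows a pattern from the package vignette as recalled and should be treated as an assumption.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical data: one numeric variable plus a discrete label per point
df <- data.frame(
  value = rnorm(200),
  type = sample(c("a", "b", "c"), 200, replace = TRUE)
)

# without order, dots within a bin are stacked by their data values
p_default <- ggplot(df, aes(x = value, fill = type, group = NA)) +
  geom_dots()

# with order = type, dots within each bin are stacked by type,
# giving the appearance of a stacked dotplot
p_ordered <- ggplot(df, aes(x = value, fill = type, order = type, group = NA)) +
  geom_dots()
```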
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
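To make the transformation of raw linewidth values more concrete, the sketch below assumes it is a simple linear rescaling from interval_size_domain to interval_size_range. That linearity is an assumption made here for illustration, and rescale_size() is a hypothetical helper, not a ggdist function; its defaults are the geom_dotsinterval() defaults shown in the usage above.

```r
# hypothetical helper (not part of ggdist): linear rescaling of raw
# linewidth values from the domain to the range
rescale_size <- function(x, domain = c(1, 6), range = c(0.6, 1.4)) {
  range[1] + (x - domain[1]) / (domain[2] - domain[1]) * (range[2] - range[1])
}

# raw linewidths at the bottom, middle, and top of the domain
rescale_size(c(1, 3.5, 6))
# 0.6 1.0 1.4
```

Under this assumption, a geom whose range equals its domain, such as geom_interval() with both set to c(1, 6), would leave raw linewidth values unchanged.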
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
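The override aesthetics above allow each sub-geometry to be styled independently. The following sketch sets three of them as constants; the data are simulated and the particular colors are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical example data: two groups of 200 draws each
df <- data.frame(
  g = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, 0, 0.75), rnorm(200, 3, 1))
)

p <- ggplot(df, aes(x = value, y = g)) +
  stat_dotsinterval(
    quantiles = 50,
    slab_fill = "gray70",           # applies to the dots only
    interval_colour = "steelblue",  # applies to the interval only
    point_colour = "black"          # applies to the point only
  )
```

Setting fill or colour instead would affect several sub-geometries at once, as described under the color aesthetics above.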
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
Matthew Kay
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See the stat_slabinterval() family for other stats built on top of geom_slabinterval().
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms: geom_blur_dots(), geom_dots(), geom_swarm(), geom_weave()
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete

df %>%
  ggplot(aes(x = value, y = g)) +
  geom_dotsinterval()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_dotsinterval()

# stat_dots can summarize quantiles, creating quantile dotplots

data(RankCorr_u_tau, package = "ggdist")

RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i))) +
  stat_dots(quantiles = 100)

# color and fill aesthetics can be mapped within the geom
# dotsinterval adds an interval

RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i), fill = after_stat(x > 6))) +
  stat_dotsinterval(quantiles = 100)
Shortcut version of geom_slabinterval() for creating multiple-interval plots.
Roughly equivalent to:
geom_slabinterval(
  aes(
    datatype = "interval",
    side = "both"
  ),
  interval_size_range = c(1, 6),
  show_slab = FALSE,
  show_point = FALSE
)
geom_interval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  interval_size_range = c(1, 6),
  interval_size_domain = c(1, 6),
  arrow = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval
and point sizes, as they tend to be too thick when using the default settings of |
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
This geom wraps geom_slabinterval() with defaults designed to produce multiple-interval plots. Default aesthetic mappings are applied if the .width column is present in the input data (e.g., as generated by the point_interval() family of functions), making this geom often more convenient than vanilla ggplot2 geometries when used with functions like median_qi(), mean_qi(), mode_hdi(), etc.
Specifically, if .width is present in the input, geom_interval() acts as if its default aesthetics are aes(colour = forcats::fct_rev(ordered(.width))).
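To see where the .width column comes from, the sketch below summarizes simulated draws with median_qi() and passes the result to geom_interval(). The data frame df and its columns are hypothetical; the .lower, .upper, and .width columns are those produced by the point_interval() family.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical draws for two groups
df <- data.frame(
  g = rep(c("a", "b"), each = 500),
  value = c(rnorm(500, 0, 1), rnorm(500, 3, 1))
)

# one row per group and interval width, with .lower, .upper, and .width columns
intervals <- df %>%
  group_by(g) %>%
  median_qi(value, .width = c(.5, .8, .95))

# no colour mapping is given: because .width is present, the intervals
# are colored by width through the default mapping described above
p <- ggplot(intervals, aes(y = g, x = value, xmin = .lower, xmax = .upper)) +
  geom_interval() +
  scale_color_brewer()
```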
A ggplot2::Geom representing a multiple-interval geometry which can be added to a ggplot() object.
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Deprecated aesthetics
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
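As a rough illustration of the interval override aesthetics listed above, the sketch below sets interval_alpha and interval_linetype as constants on geom_interval(). The object names (intervals, p) and the particular values chosen are illustrative assumptions, not recommendations from the package.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

data(RankCorr_u_tau, package = "ggdist")

# summarize draws into a median plus 50% and 95% intervals per group
intervals = RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.5, .95))

# interval_alpha and interval_linetype affect only the interval
# sub-geometry; the values here are arbitrary examples
p = intervals %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper)) +
  geom_interval(interval_alpha = 0.6, interval_linetype = "dashed")
```

Printing p draws the plot; the other override aesthetics can be set in the same way.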
See stat_interval()
for the stat version, intended for
use on sample data or analytical distributions.
See geom_slabinterval()
for the geometry this shortcut is based on.
Other slabinterval geoms:
geom_pointinterval()
,
geom_slab()
,
geom_spike()
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

data(RankCorr_u_tau, package = "ggdist")

# orientation is detected automatically based on
# use of xmin/xmax or ymin/ymax
RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.5, .8, .95, .99)) %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper)) +
  geom_interval() +
  scale_color_brewer()

RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.5, .8, .95, .99)) %>%
  ggplot(aes(x = i, y = u_tau, ymin = .lower, ymax = .upper)) +
  geom_interval() +
  scale_color_brewer()
A combination of geom_line()
and geom_ribbon()
with default aesthetics designed for use with output from point_interval()
.
geom_lineribbon(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  step = FALSE,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed to |
step |
<scalar logical | string> Should the line/ribbon be drawn as a step function? One of:
|
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
geom_lineribbon()
is a combination of a geom_line()
and
geom_ribbon()
designed for use with output from point_interval()
.
This geom sets some default aesthetics equal to the .width
column generated by the
point_interval()
family of functions, making it often more convenient than a vanilla
geom_ribbon()
+ geom_line()
.
Specifically, geom_lineribbon()
acts as if its default aesthetics are
aes(fill = forcats::fct_rev(ordered(.width)))
.
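To see what that default fill mapping produces, the following small sketch evaluates the same expression on a plain vector. The vector named .width and the object fill_levels are illustrative stand-ins for the column generated by point_interval().

```r
library(forcats)

# stand-in for the .width column of a point_interval() result
.width = c(0.5, 0.8, 0.95, 0.5, 0.8, 0.95)

# ordered() sorts the distinct widths ascending; fct_rev() reverses
# the level order so that the widest interval comes first
fill_levels = fct_rev(ordered(.width))

levels(fill_levels)
# "0.95" "0.8" "0.5"
```

Because the widest interval is the first level, a discrete fill scale such as scale_fill_brewer() assigns it the first color in the palette.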
A ggplot2::Geom representing a combined line + multiple-ribbon geometry which can
be added to a ggplot()
object.
The line+ribbon stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their two sub-geometries: the line and the ribbon.
Positional aesthetics
x
: x position of the geometry
y
: y position of the geometry
Ribbon-specific aesthetics
xmin
: Left edge of the ribbon sub-geometry (if orientation = "horizontal"
).
xmax
: Right edge of the ribbon sub-geometry (if orientation = "horizontal"
).
ymin
: Lower edge of the ribbon sub-geometry (if orientation = "vertical"
).
ymax
: Upper edge of the ribbon sub-geometry (if orientation = "vertical"
).
order
: The order in which ribbons are drawn. Ribbons with the smallest mean value of order
are drawn first (i.e., will be drawn below ribbons with larger mean values of order
). If
order
is not supplied to geom_lineribbon()
, -abs(xmax - xmin)
or -abs(ymax - ymin)
(depending on orientation
) is used, having the effect of drawing the widest (on average)
ribbons on the bottom. stat_lineribbon()
uses order = after_stat(level)
by default,
causing the ribbons generated from the largest .width
to be drawn on the bottom.
Color aesthetics
colour
: (or color
) The color of the line sub-geometry.
fill
: The fill color of the ribbon sub-geometry.
alpha
: The opacity of the line and ribbon sub-geometries.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of line. In ggplot2 < 3.4, was called size
.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc)
Other aesthetics (these work as in standard geom
s)
group
See examples of some of these aesthetics in action in vignette("lineribbon")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
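The default draw order described under the order aesthetic can be sketched with a few lines of dplyr. This is only an illustration of the rule as documented above (vertical orientation, so the ribbon spans ymin to ymax), not the package's internal code; the ribbons data frame and its values are made up.

```r
library(dplyr)

# two ribbons over the same x values: a narrow 50% band and a wide 95% band
ribbons = tibble(
  x = rep(1:3, times = 2),
  .width = rep(c(0.5, 0.95), each = 3),
  ymin = c(-0.7, 0.3, 1.3, -2, -1, 0),
  ymax = c(0.7, 1.7, 2.7, 2, 3, 4)
)

# default order: negative ribbon width, averaged within each ribbon;
# the smallest mean order is drawn first (i.e., on the bottom)
draw_order = ribbons %>%
  mutate(order = -abs(ymax - ymin)) %>%
  group_by(.width) %>%
  summarise(mean_order = mean(order)) %>%
  arrange(mean_order)

draw_order
```

The first row of draw_order is the 95% ribbon, so the widest ribbon is drawn underneath the narrower one.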
Matthew Kay
See stat_lineribbon()
for a version that does summarizing of samples into points and intervals
within ggplot. See geom_pointinterval()
for a similar geom intended
for point summaries and intervals. See geom_line()
and
geom_ribbon()
for the geoms this is based on.
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  group_by(x) %>%
  median_qi(.width = c(.5, .8, .95)) %>%
  ggplot(aes(x = x, y = y, ymin = .lower, ymax = .upper)) +
  # automatically uses aes(fill = forcats::fct_rev(ordered(.width)))
  geom_lineribbon() +
  scale_fill_brewer()
Shortcut version of geom_slabinterval()
for creating point + multiple-interval plots.
Roughly equivalent to:
geom_slabinterval( aes( datatype = "interval", side = "both" ), show_slab = FALSE, show.legend = c(size = FALSE) )
geom_pointinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  interval_size_domain = c(1, 6),
  interval_size_range = c(0.6, 1.4),
  fatten_point = 1.8,
  arrow = NULL,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval
and point sizes, as they tend to be too thick when using the default settings of |
fatten_point |
<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the
size of the thickest interval line. If you wish to specify point sizes directly, you can also use
the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
This geom wraps geom_slabinterval()
with defaults designed to produce
point + multiple-interval plots. Default aesthetic mappings are applied if the .width
column
is present in the input data (e.g., as generated by the point_interval()
family of functions),
making this geom often more convenient than vanilla ggplot2 geometries when used with
functions like median_qi()
, mean_qi()
, mode_hdi()
, etc.
Specifically, if .width
is present in the input, geom_pointinterval()
acts
as if its default aesthetics are aes(size = -.width).
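A minimal sketch of that default, written out explicitly: the two plots below should look the same if the default behaves as described. The simulated draws data frame and the object names are illustrative assumptions.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(1234)
draws = tibble(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(1:3, each = 500))
)

# one row per group and interval width, with .lower, .upper, and .width columns
summaries = draws %>%
  group_by(group) %>%
  median_qi(value, .width = c(.66, .95))

# relying on the default size mapping
p_default = summaries %>%
  ggplot(aes(y = group, x = value, xmin = .lower, xmax = .upper)) +
  geom_pointinterval()

# spelling the same mapping out: narrower intervals get larger size values
p_explicit = summaries %>%
  ggplot(aes(y = group, x = value, xmin = .lower, xmax = .upper, size = -.width)) +
  geom_pointinterval()
```

Mapping size to the negated width is what makes the 66% interval thicker than the 95% interval.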
A ggplot2::Geom representing a point + multiple-interval geometry which can
be added to a ggplot()
object.
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
Positional aesthetics
x
: x position of the geometry
y
: y position of the geometry
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
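The point-specific overrides above can be combined with the interval overrides to style the two sub-geometries independently. The following is a sketch only; the simulated data, the object names, and the specific colors and sizes are arbitrary choices for illustration.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(42)
summaries = tibble(
  group = rep(c("a", "b"), each = 400),
  value = rnorm(800, mean = rep(c(0, 1), each = 400))
) %>%
  group_by(group) %>%
  median_qi(value, .width = c(.66, .95))

# shape 21 is a filled circle, so point_fill and point_colour
# control its interior and outline separately from the interval
p = summaries %>%
  ggplot(aes(y = group, x = value, xmin = .lower, xmax = .upper)) +
  geom_pointinterval(
    shape = 21,
    point_fill = "white",
    point_colour = "black",
    point_size = 3,
    interval_colour = "gray40"
  )
```

Setting point_size directly bypasses the scaling applied through interval_size_domain, interval_size_range, and fatten_point, as described for the size aesthetic.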
See stat_pointinterval()
for the stat version, intended for
use on sample data or analytical distributions.
See geom_slabinterval()
for the geometry this shortcut is based on.
Other slabinterval geoms:
geom_interval()
,
geom_slab()
,
geom_spike()
library(dplyr)
library(ggplot2)
library(ggdist)

data(RankCorr_u_tau, package = "ggdist")

# orientation is detected automatically based on
# use of xmin/xmax or ymin/ymax
RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.8, .95)) %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper)) +
  geom_pointinterval()

RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.8, .95)) %>%
  ggplot(aes(x = i, y = u_tau, ymin = .lower, ymax = .upper)) +
  geom_pointinterval()
Shortcut version of geom_slabinterval()
for creating slab (ridge) plots.
Roughly equivalent to:
geom_slabinterval( show_point = FALSE, show_interval = FALSE )
geom_slab(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  subscale = "thickness",
  normalize = "all",
  fill_type = "segments",
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subscale |
<function | string> Sub-scale used to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
fill_type |
<string> What type of fill to use when the fill color or alpha varies within a slab. One of:
|
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
A ggplot2::Geom representing a slab (ridge) geometry which can
be added to a ggplot()
object.
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
Positional aesthetics
x
: x position of the geometry
y
: y position of the geometry
Slab-specific aesthetics
thickness
: The thickness of the slab at each x
value (if orientation = "horizontal"
) or
y
value (if orientation = "vertical"
) of the slab.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
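As a brief sketch of the side and scale aesthetics described above, the example below draws mirrored (violin-like) slabs that use half of the space allocated to each group. The data frame follows the same construction as the geom_slab() example; the object names and the value scale = 0.5 are illustrative choices.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

# normal densities evaluated on a grid, one per group
df = expand.grid(
    mean = 1:3,
    input = seq(-2, 6, length.out = 100)
  ) %>%
  mutate(
    group = letters[4 - mean],
    density = dnorm(input, mean, 1)
  )

# side = "both" mirrors each slab around its group position;
# scale = 0.5 leaves more room between adjacent slabs than the default 0.9
p = df %>%
  ggplot(aes(y = group, x = input, thickness = density)) +
  geom_slab(side = "both", scale = 0.5)
```

Other values of side, such as "bottom" or "topleft", can be substituted in the same call.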
See stat_slab()
for the stat version, intended for
use on sample data or analytical distributions.
See geom_slabinterval()
for the geometry this shortcut is based on.
Other slabinterval geoms:
geom_interval()
,
geom_pointinterval()
,
geom_spike()
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

# we will manually demonstrate plotting a density with geom_slab(),
# though generally speaking this is easier to do using stat_slab(), which
# will determine sensible limits automatically and correctly adjust
# densities when using scale transformations
df = expand.grid(
    mean = 1:3,
    input = seq(-2, 6, length.out = 100)
  ) %>%
  mutate(
    group = letters[4 - mean],
    density = dnorm(input, mean, 1)
  )

# orientation is detected automatically based on
# use of x or y
df %>%
  ggplot(aes(y = group, x = input, thickness = density)) +
  geom_slab()

df %>%
  ggplot(aes(x = group, y = input, thickness = density)) +
  geom_slab()

# RIDGE PLOTS
# "ridge" plots can be created by increasing the slab height and
# setting the slab color
df %>%
  ggplot(aes(y = group, x = input, thickness = density)) +
  geom_slab(height = 2, color = "black")
This meta-geom supports drawing combinations of functions (as slabs, aka ridge plots or joy plots), points, and
intervals. It acts as a meta-geom for many other ggdist geoms that are wrappers around this geom, including
eye plots, half-eye plots, CCDF barplots, and point+multiple interval plots, and supports both horizontal and
vertical orientations, dodging (via the position
argument), and relative justification of slabs with their
corresponding intervals.
geom_slabinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  subscale = "thickness",
  normalize = "all",
  fill_type = "segments",
  interval_size_domain = c(1, 6),
  interval_size_range = c(0.6, 1.4),
  fatten_point = 1.8,
  arrow = NULL,
  show_slab = TRUE,
  show_point = TRUE,
  show_interval = TRUE,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subscale |
<function | string> Sub-scale used to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
fill_type |
<string> What type of fill to use when the fill color or alpha varies within a slab. One of:
|
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when drawing interval
and point sizes, as they tend to be too thick when using the default settings of |
fatten_point |
<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the
size of the thickest interval line. If you wish to specify point sizes directly, you can also use
the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
show_slab |
<scalar logical> Should the slab portion of the geom be drawn? |
show_point |
<scalar logical> Should the point portion of the geom be drawn? |
show_interval |
<scalar logical> Should the interval portion of the geom be drawn? |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
geom_slabinterval()
is a flexible meta-geom that you can use directly or through a variety of "shortcut"
geoms that represent useful combinations of the various parameters of this geom. In many cases you will want to
use the shortcut geoms instead as they create more useful mnemonic primitives, such as eye plots,
half-eye plots, point+interval plots, or CCDF barplots.
The slab portion of the geom is much like a ridge or "joy" plot: it represents the value of a function
scaled to fit between values on the x
or y
axis (depending on the value of orientation
). Values of
the functions are specified using the thickness
aesthetic and are scaled to fit into scale
times the distance between points on the relevant axis. E.g., if orientation
is "horizontal"
,
scale
is 0.9
, and y
is a discrete variable, then the thickness
aesthetic specifies the
value of some function of x
that is drawn for every y
value and scaled to fit into 0.9
times
the distance between points on the y
axis.
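The arithmetic behind this scaling can be sketched in a few lines of base R. This is a simplified illustration that assumes thickness values are normalized by their maximum and that adjacent positions on the discrete axis are one unit apart; it is not the package's actual implementation, and the names spacing and drawn_height are hypothetical.

```r
# thickness values of some function of x, here a normal density on a coarse grid
thickness = dnorm(seq(-3, 3, length.out = 7))

scale = 0.9   # proportion of the allocated region used by the slab
spacing = 1   # assumed distance between adjacent positions on the y axis

# normalize to a maximum of 1, then shrink to `scale` times the spacing
drawn_height = scale * spacing * thickness / max(thickness)

max(drawn_height)
# 0.9
```

Under these assumptions the tallest point of the slab reaches 0.9 of the way to the next group, leaving a small gap between adjacent slabs.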
For the interval portion of the geom, x
and y
aesthetics specify the location of the
point, and ymin
/ymax
or xmin
/xmax
(depending on the value of orientation
)
specify the endpoints of the interval. A scaling factor for interval line width and point size is applied
through the interval_size_domain
, interval_size_range
, and fatten_point
parameters.
These scaling factors are designed to give multiple uncertainty intervals reasonable
scaling at the default settings for scale_size_continuous()
.
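One way to picture the effect of interval_size_domain and interval_size_range is as a rescaling of raw size values from the domain onto the range. The sketch below assumes a simple linear mapping; the helper rescale_interval_size() is hypothetical and is not a ggdist function, so treat it as an illustration of the idea rather than the package's exact computation.

```r
# hypothetical helper: linearly map raw sizes from `domain` onto `range`
rescale_interval_size = function(size, domain = c(1, 6), range = c(0.6, 1.4)) {
  range[1] + (size - domain[1]) / (domain[2] - domain[1]) * (range[2] - range[1])
}

# raw sizes at the bottom, middle, and top of the default domain
scaled = rescale_interval_size(c(1, 3.5, 6))
scaled
# 0.6 1.0 1.4
```

Under this assumption, widening interval_size_range increases the contrast in line width between narrow and wide intervals.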
As a combination geom, this geom expects a datatype
aesthetic specifying which part of the geom a given
row in the input data corresponds to: "slab"
or "interval"
. However, specifying this aesthetic
manually is typically only necessary if you use this geom directly; the numerous wrapper geoms will
usually set this aesthetic for you as needed, and their use is recommended unless you have a very custom
use case.
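For the rare case where geom_slabinterval() is used directly, the following sketch shows one way the datatype aesthetic might be supplied: slab rows and interval rows are built separately and stacked into one data frame. The data, column names, and object names are illustrative assumptions; in most situations a wrapper geom or a stat is simpler.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

# rows for the slab: a density evaluated on a grid for each group
slab_rows = expand.grid(
    group = c("a", "b"),
    x = seq(-4, 6, length.out = 101),
    stringsAsFactors = FALSE
  ) %>%
  mutate(
    mu = ifelse(group == "a", 0, 2),
    thickness = dnorm(x, mu, 1),
    datatype = "slab"
  )

# rows for the point and interval: one 95% interval per group
interval_rows = tibble(group = c("a", "b"), mu = c(0, 2)) %>%
  mutate(
    x = mu,
    xmin = qnorm(0.025, mu, 1),
    xmax = qnorm(0.975, mu, 1),
    datatype = "interval"
  )

# columns missing from one set of rows are filled with NA
combined = bind_rows(slab_rows, interval_rows)

p = combined %>%
  ggplot(aes(
    y = group, x = x, xmin = xmin, xmax = xmax,
    thickness = thickness, datatype = datatype
  )) +
  geom_slabinterval()
```

Each row is routed to the slab or the interval sub-geometry according to its datatype value.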
Wrapper geoms include: geom_pointinterval(), geom_interval(), and geom_slab().
In addition, the stat_slabinterval()
family of stats uses geoms from the
geom_slabinterval()
family, and is often easier to use than using these geoms
directly. Typically, the geom_*
versions are meant for use with already-summarized data (such as intervals) and the
stat_*
versions summarize the data themselves (usually draws from a distribution) to produce the geom.
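The difference between the two families can be sketched side by side. In the first plot the draws are summarized with median_qi() before plotting and passed to a geom; in the second the raw draws are passed to the stat, which computes the summary itself. The simulated data and object names are illustrative assumptions.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(2024)
draws = tibble(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(0, 1, 3), each = 500))
)

# geom_* version: summarize first, then plot the intervals
p_geom = draws %>%
  group_by(group) %>%
  median_qi(value, .width = c(.66, .95)) %>%
  ggplot(aes(y = group, x = value, xmin = .lower, xmax = .upper)) +
  geom_pointinterval()

# stat_* version: pass the draws and let the stat summarize them
p_stat = draws %>%
  ggplot(aes(y = group, x = value)) +
  stat_pointinterval(point_interval = median_qi, .width = c(.66, .95))
```

Both plots should show the same medians and intervals; the stat version simply moves the summarization step into the layer.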
A ggplot2::Geom representing a slab or combined slab+interval geometry which can
be added to a ggplot()
object.
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
Positional aesthetics
x
: x position of the geometry
y
: y position of the geometry
Slab-specific aesthetics
thickness
: The thickness of the slab at each x
value (if orientation = "horizontal"
) or
y
value (if orientation = "vertical"
) of the slab.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
datatype
: When using composite geoms directly without a stat
(e.g. geom_slabinterval()
), datatype
is used to
indicate which part of the geom a row in the data targets: rows with datatype = "slab"
target the
slab portion of the geometry and rows with datatype = "interval"
target the interval portion of
the geometry. This is set automatically when using ggdist stat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
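As a minimal sketch of how the sub-geometry override aesthetics described above can be set as constants on a layer (the data frame and colour choices here are invented for illustration and do not come from the package documentation):

```r
library(ggplot2)
library(ggdist)

# Hypothetical sample data: two groups of draws
set.seed(1234)
df <- data.frame(
  g = rep(c("a", "b"), each = 200),
  value = rnorm(400, mean = rep(c(0, 2), each = 200))
)

p <- ggplot(df, aes(x = value, y = g)) +
  stat_halfeye(
    # slab_* overrides affect only the density slab
    slab_fill = "skyblue",
    slab_alpha = 0.6,
    # interval_* overrides affect only the interval line
    interval_colour = "gray25",
    # point_* overrides affect only the point
    point_colour = "firebrick",
    point_size = 3
  )

p
```

Setting `fill`, `colour`, or `alpha` instead would change every sub-geometry that uses that aesthetic at once; the prefixed versions are useful when only one part should change.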
Matthew Kay
See geom_lineribbon()
for a combination geom designed for fit curves plus probability bands.
See geom_dotsinterval()
for a combination geom designed for plotting dotplots with intervals.
See stat_slabinterval()
for families of stats
built on top of this geom for common use cases (like stat_halfeye()
).
See vignette("slabinterval")
for a variety of examples of use.
# geom_slabinterval() is typically not that useful on its own.
# See vignette("slabinterval") for a variety of examples of the use of its
# shortcut geoms and stats, which are more useful than using
# geom_slabinterval() directly.
Geometry for drawing "spikes" (optionally with points on them) on top of
geom_slabinterval()
geometries: this geometry understands the scaling and
positioning of the thickness
aesthetic from geom_slabinterval()
, which
allows you to position spikes and points along a slab.
geom_spike(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  subguide = "spike",
  orientation = NA,
  subscale = "thickness",
  normalize = "all",
  arrow = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
subguide |
<function | string> Sub-guide used to annotate the
|
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subscale |
<function | string> Sub-scale used to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
arrow |
<arrow | NULL> Type of arrow heads to use on the spike, or |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
This geometry consists of a "spike" (vertical/horizontal line segment) and a
"point" (at the end of the line segment). It uses the thickness
aesthetic
to determine where the endpoint of the line is, which allows it to be used
with geom_slabinterval()
geometries for labeling specific values of the
thickness function.
A ggplot2::Geom representing a spike geometry which can
be added to a ggplot()
object.
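To illustrate how the thickness aesthetic sets the endpoint of each spike, here is a small sketch that computes thickness values by hand with base R's dnorm() rather than through a stat. The data frames, the choice of a standard normal density, and the spike positions are assumptions made for illustration only:

```r
library(ggplot2)
library(ggdist)

# Slab: density of a standard normal evaluated on a grid
slab_df <- data.frame(x = seq(-4, 4, length.out = 201), g = "a")
slab_df$thickness <- dnorm(slab_df$x)

# Spikes: the same density function evaluated at three chosen x values,
# so each spike should end on the outline of the slab
spike_df <- data.frame(x = c(-1.96, 0, 1.96), g = "a")
spike_df$thickness <- dnorm(spike_df$x)

p <- ggplot(mapping = aes(x = x, y = g, thickness = thickness)) +
  geom_slab(data = slab_df) +
  geom_spike(data = spike_df) +
  # share one thickness scale so the slab and the spikes line up
  scale_thickness_shared()

p
```

Without a shared thickness scale, each layer would scale its own thickness values independently, and the spikes would no longer be guaranteed to touch the slab outline.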
The spike geom
has a wide variety of aesthetics that control
the appearance of its two sub-geometries: the spike and the point.
Positional aesthetics
x
: x position of the geometry
y
: y position of the geometry
Spike-specific (aka Slab-specific) aesthetics
thickness
: The thickness of the slab at each x
value (if orientation = "horizontal"
) or
y
value (if orientation = "vertical"
) of the slab.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
Color aesthetics
colour
: (or color
) The color of the spike and point sub-geometries.
fill
: The fill color of the point sub-geometry.
alpha
: The opacity of the spike and point sub-geometries.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the spike sub-geometry.
size
: Size of the point sub-geometry.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the spike.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
See stat_spike()
for the stat version, intended for
use on sample data or analytical distributions.
Other slabinterval geoms:
geom_interval()
,
geom_pointinterval()
,
geom_slab()
library(ggplot2)
library(distributional)
library(dplyr)

# geom_spike is easiest to use with distributional or
# posterior::rvar objects
df = tibble(
  d = dist_normal(1:2, 1:2),
  g = c("a", "b")
)

# annotate the density at the mean of a distribution
df %>%
  mutate(
    mean = mean(d),
    density(d, list(density_at_mean = mean))
  ) %>%
  ggplot(aes(y = g)) +
  stat_slab(aes(xdist = d)) +
  geom_spike(aes(x = mean, thickness = density_at_mean)) +
  # need shared thickness scale so that stat_slab and geom_spike line up
  scale_thickness_shared()

# annotate the endpoints of intervals of a distribution
# here we'll use an arrow instead of a point by setting size = 0
arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt"))
df %>%
  mutate(
    median_qi(d, .width = 0.9),
    density(d, list(density_lower = .lower, density_upper = .upper))
  ) %>%
  ggplot(aes(y = g)) +
  stat_halfeye(aes(xdist = d), .width = 0.9, color = "gray35") +
  geom_spike(
    aes(x = .lower, thickness = density_lower),
    size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75
  ) +
  geom_spike(
    aes(x = .upper, thickness = density_upper),
    size = 0, arrow = arrow_spec, color = "red", linewidth = 0.75
  ) +
  scale_thickness_shared()
Shortcut version of geom_dotsinterval()
for creating beeswarm plots.
Geoms based on geom_dotsinterval()
create dotplots that automatically
ensure the plot fits within the available space.
Roughly equivalent to:
geom_dots(
  aes(side = "both"),
  overflow = "compress",
  binwidth = unit(1.5, "mm"),
  layout = "swarm"
)
geom_swarm(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  overflow = "compress",
  binwidth = unit(1.5, "mm"),
  layout = "swarm",
  dotsize = 1.07,
  stackratio = 1,
  overlaps = "nudge",
  smooth = "none",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
layout |
<string> The layout method used for the dots. One of:
|
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms are similar to ggplot2::geom_dotplot()
but with a number of differences:
Dots geoms act like slabs in geom_slabinterval()
and can be given x positions (or y positions when
in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape
aesthetic (when using the
dotsinterval
family) or the shape
or slab_shape
aesthetic (when using the dots
family)
Stats and geoms in this family include:
geom_dots()
: dotplots on raw data. Ensures the dotplot fits within available space by reducing the size
of the dots automatically (may result in very small dots).
geom_swarm()
and geom_weave()
: dotplots on raw data with defaults intended to create "beeswarm" plots.
Uses side = "both"
by default, and sets the default dot size to the same size as geom_point()
(binwidth = unit(1.5, "mm")
), allowing dots to overlap instead of getting very small.
stat_dots()
: dotplots on raw data, distributional objects, and posterior::rvar()
s
geom_dotsinterval()
: dotplot + interval plots on raw data with already-calculated
intervals (rarely useful directly).
stat_dotsinterval()
: dotplot + interval plots on raw data, distributional objects,
and posterior::rvar()
s (will calculate intervals for you).
geom_blur_dots()
: blurry dotplots that allow the standard deviation of a blur applied to
each dot to be specified using the sd
aesthetic.
stat_mcse_dots()
: blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots()
and stat_dotsinterval()
, when used with the quantiles
argument,
are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty
using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
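A quantile dotplot of the kind described above can be sketched as follows. The simulated draws (and the reading of them as bus arrival times) are hypothetical; only the quantiles argument of stat_dots() is taken from the text:

```r
library(ggplot2)
library(ggdist)

set.seed(42)
# Hypothetical predictive draws, e.g. minutes until a bus arrives
draws <- data.frame(minutes = rlnorm(5000, meanlog = log(10), sdlog = 0.3))

p <- ggplot(draws, aes(x = minutes)) +
  # summarize the 5000 draws as 50 quantiles, drawn as one dot each
  stat_dots(quantiles = 50)

p
```

With 50 quantiles, each dot stands for roughly a 1-in-50 (2%) chance, so a reader can estimate a probability by counting dots on one side of a threshold.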
A ggplot2::Geom representing a beeswarm geometry which can
be added to a ggplot()
object.
The dots+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the dots (aka the slab), the
point, and the interval.
Positional aesthetics
x
: x position of the geometry
y
: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
family
: The font family used to draw the dots.
order
: The order in which data points are stacked within bins. Can be used to create the effect of
"stacked" dots by ordering dots according to a discrete variable. If omitted (NULL
), the
values of the data points themselves are used to determine stacking order. Only applies when
layout
is "bin"
or "hex"
, as the other layout methods fully determine both x and y positions.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
datatype
: When using composite geoms directly without a stat
(e.g. geom_slabinterval()
), datatype
is used to
indicate which part of the geom a row in the data targets: rows with datatype = "slab"
target the
slab portion of the geometry and rows with datatype = "interval"
target the interval portion of
the geometry. This is set automatically when using ggdist stat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
slab_shape
: Override for shape
: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See geom_dotsinterval()
for the geometry this shortcut is based on.
See vignette("dotsinterval")
for a variety of examples of use.
Other dotsinterval geoms:
geom_blur_dots()
,
geom_dots()
,
geom_dotsinterval()
,
geom_weave()
library(dplyr)
library(ggplot2)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_swarm()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_swarm()
Shortcut version of geom_dotsinterval()
for creating dot-weave plots.
Geoms based on geom_dotsinterval()
create dotplots that automatically
ensure the plot fits within the available space.
Roughly equivalent to:
geom_dots(
  aes(side = "both"),
  layout = "weave",
  overflow = "compress",
  binwidth = unit(1.5, "mm")
)
geom_weave(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  layout = "weave",
  overflow = "compress",
  binwidth = unit(1.5, "mm"),
  dotsize = 1.07,
  stackratio = 1,
  overlaps = "nudge",
  smooth = "none",
  verbose = FALSE,
  orientation = NA,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
layout |
<string> The layout method used for the dots. One of:
|
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms are similar to ggplot2::geom_dotplot()
but with a number of differences:
Dots geoms act like slabs in geom_slabinterval()
and can be given x positions (or y positions when
in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape
aesthetic (when using the
dotsinterval
family) or the shape
or slab_shape
aesthetic (when using the dots
family)
Stats and geoms in this family include:
geom_dots()
: dotplots on raw data. Ensures the dotplot fits within available space by reducing the size
of the dots automatically (may result in very small dots).
geom_swarm()
and geom_weave()
: dotplots on raw data with defaults intended to create "beeswarm" plots.
Uses side = "both"
by default, and sets the default dot size to the same size as geom_point()
(binwidth = unit(1.5, "mm")
), allowing dots to overlap instead of getting very small.
stat_dots()
: dotplots on raw data, distributional objects, and posterior::rvar()
s
geom_dotsinterval()
: dotplot + interval plots on raw data with already-calculated
intervals (rarely useful directly).
stat_dotsinterval()
: dotplot + interval plots on raw data, distributional objects,
and posterior::rvar()
s (will calculate intervals for you).
geom_blur_dots()
: blurry dotplots that allow the standard deviation of a blur applied to
each dot to be specified using the sd
aesthetic.
stat_mcse_dots()
: blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots()
and stat_dotsinterval()
, when used with the quantiles
argument,
are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty
using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
A ggplot2::Geom representing a dot-weave geometry which can
be added to a ggplot()
object.
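A minimal usage sketch for geom_weave(), following the same pattern as other dots geoms in this family. The simulated data frame is an assumption made for illustration:

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical data: two groups with different means and spreads
df <- data.frame(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# horizontal orientation: continuous variable on x, discrete on y
p_horizontal <- ggplot(df, aes(x = value, y = g)) +
  geom_weave()

# vertical orientation: swap the axes; orientation is detected automatically
p_vertical <- ggplot(df, aes(x = g, y = value)) +
  geom_weave()

p_horizontal
p_vertical
```

Because the default binwidth is fixed at unit(1.5, "mm"), dots keep a constant size here; with many more observations per group the dots may overlap rather than shrink.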
The dots+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the dots (aka the slab), the
point, and the interval.
Positional aesthetics
x
: x position of the geometry
y
: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
family
: The font family used to draw the dots.
order
: The order in which data points are stacked within bins. Can be used to create the effect of
"stacked" dots by ordering dots according to a discrete variable. If omitted (NULL
), the
values of the data points themselves are used to determine stacking order. Only applies when
layout
is "bin"
or "hex"
, as the other layout methods fully determine both x and y positions.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
datatype
: When using composite geoms directly without a stat
(e.g. geom_slabinterval()
), datatype
is used to
indicate which part of the geom a row in the data targets: rows with datatype = "slab"
target the
slab portion of the geometry and rows with datatype = "interval"
target the interval portion of
the geometry. This is set automatically when using ggdist stat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
slab_shape
: Override for shape
: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
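As a rough illustration of how the positional parameters and override aesthetics listed above combine, the sketch below draws a mirrored dotplot whose slab, interval, and point are styled separately. It uses stat_dotsinterval() instead of geom_weave() only so that an interval and point are present to style; the data frame `df` and the colour choices are made up for the example.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# Hypothetical data: two groups of draws
df <- data.frame(
  g = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, 0, 0.75), rnorm(200, 3, 1))
)

p <- ggplot(df, aes(x = value, y = g)) +
  stat_dotsinterval(
    side = "both",                  # mirror the slab; justification then defaults to 0.5
    scale = 0.7,                    # use 70% of the region allocated to each group
    quantiles = 50,                 # summarize each group as 50 quantile dots
    slab_fill = "gray70",           # override fill for the slab only
    interval_colour = "steelblue4", # override colour for the interval only
    point_colour = "black"          # override colour for the point only
  )

p
```

Setting `fill` or `colour` directly would instead style several sub-geometries at once, which is why the `slab_`, `interval_`, and `point_` prefixed forms are used here.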
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See geom_dotsinterval()
for the geometry this shortcut is based on.
See vignette("dotsinterval")
for a variety of examples of use.
Other dotsinterval geoms:
geom_blur_dots()
,
geom_dots()
,
geom_dotsinterval()
,
geom_swarm()
    library(dplyr)
    library(ggplot2)

    theme_set(theme_ggdist())

    set.seed(12345)
    df = tibble(
      g = rep(c("a", "b"), 200),
      value = rnorm(400, c(0, 3), c(0.75, 1))
    )

    # orientation is detected automatically based on
    # which axis is discrete
    df %>%
      ggplot(aes(x = value, y = g)) +
      geom_weave()

    df %>%
      ggplot(aes(y = value, x = g)) +
      geom_weave()
Deprecated functions and arguments and their alternatives are listed below.
The stat_sample_...
and stat_dist_...
families of stats were merged in ggdist 3.1.
This means:
stat_dist_...
is deprecated. For any code using stat_dist_XXX()
, you should now
be able to use stat_XXX()
instead without additional modifications in almost all cases.
stat_sample_slabinterval()
is deprecated. You should be able to use
stat_slabinterval()
instead without additional modifications in almost all cases.
The old stat_dist_...
names are currently kept as aliases, but may be removed in the future.
Deprecated parameters for stat_slabinterval()
and family:
The .prob
argument, which is a long-deprecated alias for .width
, was
removed in ggdist 3.1.
The limits_function
argument: this was a parameter for determining the
function to compute limits of the slab in stat_slabinterval()
and its
derived stats. This function is really an internal function only needed by
subclasses of the base class, yet added a lot of noise to the documentation,
so it was replaced with AbstractStatSlabInterval$compute_limits()
.
The limits_args
argument: extra stat parameters are now passed through to
the ...
arguments to AbstractStatSlabInterval$compute_limits()
; use
these instead.
The slab_function
argument: this was a parameter for determining the
function to compute slabs in stat_slabinterval()
and its
derived stats. This function is really an internal function only needed by
subclasses of the base class, yet added a lot of noise to the documentation,
so it was replaced with AbstractStatSlabInterval$compute_slab()
.
The slab_args
argument: extra stat parameters are now passed through to
the ...
arguments to AbstractStatSlabInterval$compute_slab()
; use
these instead.
The slab_type
argument: instead of setting the slab type, either adjust
the density
argument (e.g. use density = "histogram"
to replace
slab_type = "histogram"
) or use the pdf
or cdf
computed variables
mapped onto an appropriate aesthetic (e.g. use aes(thickness = after_stat(cdf))
to create a CDF).
The interval_function
and fun.data
arguments: these were parameters for determining the
function to compute intervals in stat_slabinterval()
and its
derived stats. This function is really an internal function only needed by
subclasses of the base class, yet added a lot of noise to the documentation,
so it was replaced with AbstractStatSlabInterval$compute_interval()
.
The interval_args
and fun.args
arguments: to pass extra arguments to
a point_interval
replace the value of the point_interval
argument with
a simple wrapper; e.g. stat_halfeye(point_interval = \(...) point_interval(..., extra_arg = XXX))
Deprecated parameters for geom_slabinterval()
and family:
The size_domain
and size_range
arguments, which are long-deprecated aliases
for interval_size_domain
and interval_size_range
, were removed in ggdist 3.1.
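The replacements described above can be sketched as follows. This is a minimal illustration, assuming ggdist 3.1 or later with the distributional and tibble packages installed; the data frames `priors` and `draws` are hypothetical.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical data frame of analytical distributions
priors <- tibble::tibble(
  name = c("a", "b"),
  d = dist_normal(c(0, 1), c(1, 0.5))
)

base <- ggplot(priors, aes(y = name, xdist = d))

# Formerly stat_dist_halfeye(): the merged stat accepts distributions directly
p_halfeye <- base + stat_halfeye()

# Formerly slab_type = "cdf": map the computed cdf variable onto thickness
p_cdf <- base + stat_slab(aes(thickness = after_stat(cdf)))

# Formerly slab_type = "histogram": use the density argument on sample data
set.seed(1)
draws <- data.frame(x = rnorm(500))
p_hist <- ggplot(draws, aes(x = x)) + stat_slab(density = "histogram")
```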
Matthew Kay
A colour ramp bar guide that shows continuous colour ramp scales mapped onto
values as a smooth gradient. Designed for use with scale_fill_ramp_continuous()
and scale_colour_ramp_continuous()
. Based on guide_colourbar()
.
    guide_rampbar(
      ...,
      to = "gray65",
      available_aes = c("fill_ramp", "colour_ramp")
    )
... |
Arguments passed on to
|
to |
<string> The color to ramp to in the guide. Corresponds to |
available_aes |
<character> Vector listing the aesthetics for which a |
This guide creates smooth gradient color bars for use with scale_fill_ramp_continuous()
and scale_colour_ramp_continuous()
. The color to ramp from is determined by the from
argument of the scale_*
function, and the color to ramp to is determined by the to
argument
to guide_rampbar()
.
Guides can be specified in each scale_*
function or in guides()
.
guide = "rampbar"
in scale_*
is syntactic sugar for guide = guide_rampbar()
;
e.g. scale_colour_ramp_continuous(guide = "rampbar")
. For how to specify
the guide for each scale in more detail, see guides()
.
A guide object.
Matthew Kay
Other colour ramp functions:
partial_colour_ramp()
,
ramp_colours()
,
scale_colour_ramp
    library(dplyr)
    library(ggplot2)
    library(distributional)

    # The default guide for ramp scales is guide_legend(), which creates a
    # discrete style scale:
    tibble(d = dist_uniform(0, 1)) %>%
      ggplot(aes(y = 0, xdist = d)) +
      stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
      scale_fill_ramp_continuous(from = "red")

    # We can use guide_rampbar() to instead create a continuous guide, but
    # it does not know what color to ramp to (defaults to "gray65"):
    tibble(d = dist_uniform(0, 1)) %>%
      ggplot(aes(y = 0, xdist = d)) +
      stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
      scale_fill_ramp_continuous(from = "red", guide = guide_rampbar())

    # We can tell the guide what color to ramp to using the `to` argument:
    tibble(d = dist_uniform(0, 1)) %>%
      ggplot(aes(y = 0, xdist = d)) +
      stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
      scale_fill_ramp_continuous(from = "red", guide = guide_rampbar(to = "blue"))
Marginal distribution for the correlation in a single cell from a correlation matrix distributed according to an LKJ distribution.
    dlkjcorr_marginal(x, K, eta, log = FALSE)

    plkjcorr_marginal(q, K, eta, lower.tail = TRUE, log.p = FALSE)

    qlkjcorr_marginal(p, K, eta, lower.tail = TRUE, log.p = FALSE)

    rlkjcorr_marginal(n, K, eta)
x , q
|
vector of quantiles. |
K |
<numeric> Dimension of the correlation matrix. Must be greater than or equal to 2. |
eta |
<numeric> Parameter controlling the shape of the distribution |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities are
|
p |
vector of probabilities. |
n |
number of observations. If |
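The LKJ distribution is a distribution over correlation matrices with a single parameter, $\eta$.
For a given $\eta$ and a $K \times K$ correlation matrix $R$:

$$R \sim \textrm{LKJ}(\eta)$$

Each off-diagonal entry of $R$, $r_{ij}: i \ne j$, has the
following marginal distribution (Lewandowski, Kurowicka, and Joe 2009):

$$\frac{r_{ij} + 1}{2} \sim \textrm{Beta}\left(\eta - 1 + \frac{K}{2}, \eta - 1 + \frac{K}{2}\right)$$

In other words, $r_{ij}$ is marginally distributed according to the above Beta
distribution scaled into $(-1, 1)$.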
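The marginal can be sketched in base R as a rescaled Beta density: if (r + 1) / 2 follows a Beta distribution with both shapes equal to eta - 1 + K / 2, then the density of r on (-1, 1) is the Beta density evaluated at (r + 1) / 2, divided by 2. The helper name `lkj_marginal_density` is hypothetical and is not part of ggdist; it is only meant to make the relationship concrete.

```r
# Hypothetical helper (not part of ggdist): marginal density of a single
# off-diagonal correlation r under LKJ(eta) for a K x K correlation matrix.
lkj_marginal_density <- function(r, K, eta) {
  shape <- eta - 1 + K / 2
  # (r + 1) / 2 ~ Beta(shape, shape); dividing by 2 accounts for
  # stretching the unit interval onto (-1, 1)
  dbeta((r + 1) / 2, shape, shape) / 2
}

# With K = 2 and eta = 1 both shapes equal 1, so r is uniform on (-1, 1)
uniform_case <- lkj_marginal_density(c(-0.5, 0, 0.5), K = 2, eta = 1)
uniform_case

# Larger eta concentrates mass near zero correlation
at_zero <- lkj_marginal_density(0, K = 2, eta = c(1, 2, 4))
at_zero

# The density integrates to 1 over (-1, 1)
total <- integrate(lkj_marginal_density, -1, 1, K = 3, eta = 2)$value
total
```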
dlkjcorr_marginal
gives the density
plkjcorr_marginal
gives the cumulative distribution function (CDF)
qlkjcorr_marginal
gives the quantile function (inverse CDF)
rlkjcorr_marginal
generates random draws.
The length of the result is determined by n
for rlkjcorr_marginal
, and is the maximum of the lengths of
the numerical arguments for the other functions.
The numerical arguments other than n
are recycled to the length of the result. Only the first elements
of the logical arguments are used.
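A short usage sketch of the four functions, assuming ggdist is installed. The particular values of `K` and `eta` are arbitrary choices for illustration.

```r
library(ggdist)

# Random draws: the length of the result is determined by n
set.seed(42)
r <- rlkjcorr_marginal(1000, K = 3, eta = 2)
length(r)
range(r)  # draws are correlations, so they lie between -1 and 1

# The marginal is symmetric about 0, so the CDF at 0 is 0.5
p0 <- plkjcorr_marginal(0, K = 3, eta = 2)
p0

# Numeric arguments are recycled: one density value per value of eta
d <- dlkjcorr_marginal(0, K = 3, eta = 1:4)
d

# The quantile function inverts the CDF
q <- qlkjcorr_marginal(plkjcorr_marginal(0.3, K = 3, eta = 2), K = 3, eta = 2)
q
```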
Lewandowski, D., Kurowicka, D., & Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100(9), 1989–2001. doi:10.1016/j.jmva.2009.04.008.
parse_dist()
and marginalize_lkjcorr()
for parsing specs that use the
LKJ correlation distribution and the stat_slabinterval()
family of stats for visualizing them.
    library(dplyr)
    library(ggplot2)

    theme_set(theme_ggdist())

    expand.grid(
      eta = 1:6,
      K = 2:6
    ) %>%
      ggplot(aes(y = ordered(eta), dist = "lkjcorr_marginal", arg1 = K, arg2 = eta)) +
      stat_slab() +
      facet_grid(~ paste0(K, "x", K)) +
      scale_y_discrete(limits = rev) +
      labs(
        title = paste0(
          "Marginal correlation for LKJ(eta) prior on different matrix sizes:\n",
          "dlkjcorr_marginal(K, eta)"
        ),
        subtitle = "Correlation matrix size (KxK)",
        y = "eta",
        x = "Marginal correlation"
      ) +
      theme(axis.title = element_text(hjust = 0))
Turns specs for an LKJ correlation matrix distribution as returned by
parse_dist()
into specs for the marginal distribution of
a single cell in an LKJ-distributed correlation matrix (i.e., lkjcorr_marginal()
).
Useful for visualizing prior correlations from LKJ distributions.
    marginalize_lkjcorr(
      data,
      K,
      predicate = NULL,
      dist = ".dist",
      args = ".args",
      dist_obj = ".dist_obj"
    )
data |
<data.frame> A data frame containing a column with distribution names ( |
K |
<numeric> Dimension of the correlation matrix. Must be greater than or equal to 2. |
predicate |
<bare language | NULL> Expression for selecting the rows of If |
dist |
<string> The name of the column containing distribution names. See |
args |
<string> The name of the column containing distribution arguments. See |
dist_obj |
<string> The name of the output column to contain a distributional
object representing the distribution. See |
The LKJ(eta) prior on a correlation matrix induces a marginal prior on each correlation
in the matrix that depends on both the value of eta
and K
, the dimension
of the correlation matrix. Thus to visualize the marginal prior
on the correlations, it is necessary to specify the value of
K
, which depends
on what your model specification looks like.
Given a data frame representing parsed distribution specifications (such
as returned by parse_dist()
), this function updates any rows with .dist == "lkjcorr"
so that the first argument to the distribution (stored in .args
) is equal to the specified dimension
of the correlation matrix (K
), changes the distribution name in .dist
to "lkjcorr_marginal"
,
and assigns a distributional object representing this distribution to .dist_obj
.
This allows the distribution to be easily visualized using the stat_slabinterval()
family of ggplot2 stats.
A data frame of the same size and column names as the input, with the dist
, args
,
and dist_obj
columns modified on rows where dist == "lkjcorr"
such that they represent a
marginal LKJ correlation distribution with name lkjcorr_marginal
and args
having
K
equal to the input value of K
.
parse_dist()
, lkjcorr_marginal()
    library(dplyr)
    library(ggplot2)

    # Say we have an LKJ(3) prior on a 2x2 correlation matrix. We can visualize
    # its marginal distribution as follows...
    data.frame(prior = "lkjcorr(3)") %>%
      parse_dist(prior) %>%
      marginalize_lkjcorr(K = 2) %>%
      ggplot(aes(y = prior, xdist = .dist_obj)) +
      stat_halfeye() +
      xlim(-1, 1) +
      xlab("Marginal correlation for LKJ(3) prior on 2x2 correlation matrix")

    # Say our prior list has multiple LKJ priors on correlation matrices
    # of different sizes, we can supply a predicate expression to select
    # only those rows we want to modify
    data.frame(coef = c("a", "b"), prior = "lkjcorr(3)") %>%
      parse_dist(prior) %>%
      marginalize_lkjcorr(K = 2, coef == "a") %>%
      marginalize_lkjcorr(K = 4, coef == "b")
Parses simple string distribution specifications, like "normal(0, 1)"
, into two columns of
a data frame, suitable for use with the dist
and args
aesthetics of stat_slabinterval()
and its shortcut stats (like stat_halfeye()
). This format is output
by brms::get_prior
, making it particularly useful for visualizing priors from
brms models.
    parse_dist(
      object,
      ...,
      dist = ".dist",
      args = ".args",
      dist_obj = ".dist_obj",
      package = NULL,
      to_r_names = TRUE
    )

    ## Default S3 method:
    parse_dist(object, ...)

    ## S3 method for class 'data.frame'
    parse_dist(
      object,
      dist_col,
      ...,
      dist = ".dist",
      args = ".args",
      dist_obj = ".dist_obj",
      package = NULL,
      lb = "lb",
      ub = "ub",
      to_r_names = TRUE
    )

    ## S3 method for class 'character'
    parse_dist(
      object,
      ...,
      dist = ".dist",
      args = ".args",
      dist_obj = ".dist_obj",
      package = NULL,
      to_r_names = TRUE
    )

    ## S3 method for class 'factor'
    parse_dist(
      object,
      ...,
      dist = ".dist",
      args = ".args",
      dist_obj = ".dist_obj",
      package = NULL,
      to_r_names = TRUE
    )

    ## S3 method for class 'brmsprior'
    parse_dist(
      object,
      dist_col = prior,
      ...,
      dist = ".dist",
      args = ".args",
      dist_obj = ".dist_obj",
      package = NULL,
      to_r_names = TRUE
    )

    r_dist_name(dist_name)
object |
<character | data.frame> One of:
|
... |
Arguments passed to other implementations of |
dist |
<string> The name of the output column to contain the distribution name. |
args |
<string> The name of the output column to contain the arguments to the distribution. |
dist_obj |
<string> The name of the output column to contain a distributional object representing the distribution. |
package |
<string | environment | NULL> The package or environment to search for
distribution functions in. Passed to
|
to_r_names |
<scalar logical> If |
dist_col |
<bare language> Column or column expression of |
lb |
<string> The name of an input column (for |
ub |
<string> The name of an input column (for |
dist_name |
<character> For |
parse_dist()
can be applied to character vectors or to a data frame + bare column name of the
column to parse, and returns a data frame with ".dist"
and ".args"
columns added.
parse_dist()
uses r_dist_name()
to translate distribution names into names recognized
by R.
r_dist_name()
takes a character vector of names and translates common names into R
distribution names. Names are first made into valid R names using make.names()
,
then translated (ignoring character case, "."
, and "_"
). Thus, "lognormal"
,
"LogNormal"
, "log_normal"
, "log-Normal"
, and any number of other variants
all get translated into "lnorm"
.
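The name translation described above can be checked directly. This is a small sketch assuming ggdist is installed; the variable names `variants`, `r_names`, and `parsed` are only for illustration.

```r
library(ggdist)

# Spelling variants of the same distribution name
variants <- c("lognormal", "LogNormal", "log_normal", "log-Normal")
r_names <- r_dist_name(variants)
r_names

# parse_dist() applies the same translation to the distribution name
# while splitting off the arguments
parsed <- parse_dist("lognormal(0, 1)")
parsed$.dist
```

Passing `to_r_names = FALSE` to parse_dist() should leave the name as written, which can be useful when the names are only needed as labels.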
parse_dist
returns a data frame containing at least two columns named after the dist
and args
parameters. If the input is a data frame, the output is a data frame of the same length with those
two columns added. If the input is a character vector or factor, the output is a two-column data frame
with the same number of rows as the length of the input.
r_dist_name
returns a character vector the same length as the input containing translations of the
input names into distribution names R can recognize.
See stat_slabinterval()
and its shortcut stats, which can easily make use of
the output of this function using the dist
and args
aesthetics.
    library(dplyr)

    # parse dist can operate on strings directly...
    parse_dist(c("normal(0,1)", "student_t(3,0,1)"))

    # ... or on columns of a data frame, where it adds the
    # parsed specs back on as columns
    data.frame(prior = c("normal(0,1)", "student_t(3,0,1)")) %>%
      parse_dist(prior)

    # parse_dist is particularly useful with the output of brms::prior(),
    # which follows the same format as above
A representation of a partial ramp between two colours: the origin colour
(from
) and the distance from the origin colour to the target colour
(amount
, a value between 0
and 1
). The target colour of the ramp
can be filled in later using ramp_colours()
, producing a colour.
partial_colour_ramp(amount = double(), from = "white")
amount |
<numeric> Vector of values between |
from |
<character> Vector giving colours to ramp from. |
This datatype is used by scale_colour_ramp to create ramped colours in
ggdist geoms. It is a vctrs::rcrd datatype with two fields:
"amount"
, the amount to ramp, and "from"
, the colour to ramp from.
Colour ramps can be applied (i.e. translated into colours) using
ramp_colours()
, which can be used with partial_colour_ramp()
to implement geoms that make use of colour_ramp
or fill_ramp
scales.
A vctrs::rcrd of class "ggdist_partial_colour_ramp"
with fields
"amount"
and "from"
.
Matthew Kay
Other colour ramp functions:
guide_rampbar()
,
ramp_colours()
,
scale_colour_ramp
    pcr = partial_colour_ramp(c(0, 0.25, 0.75, 1), "red")
    pcr

    ramp_colours("blue", pcr)
Translates draws from distributions in a (possibly grouped) data frame into point and interval summaries (or set of point and interval summaries, if there are multiple groups in a grouped data frame).
Supports automatic partial function application.
    point_interval(
      .data,
      ...,
      .width = 0.95,
      .point = median,
      .interval = qi,
      .simple_names = TRUE,
      na.rm = FALSE,
      .exclude = c(".chain", ".iteration", ".draw", ".row"),
      .prob
    )

    ## Default S3 method:
    point_interval(
      .data,
      ...,
      .width = 0.95,
      .point = median,
      .interval = qi,
      .simple_names = TRUE,
      na.rm = FALSE,
      .exclude = c(".chain", ".iteration", ".draw", ".row"),
      .prob
    )

    ## S3 method for class 'tbl_df'
    point_interval(.data, ...)

    ## S3 method for class 'numeric'
    point_interval(
      .data,
      ...,
      .width = 0.95,
      .point = median,
      .interval = qi,
      .simple_names = FALSE,
      na.rm = FALSE,
      .exclude = c(".chain", ".iteration", ".draw", ".row"),
      .prob
    )

    ## S3 method for class 'rvar'
    point_interval(
      .data,
      ...,
      .width = 0.95,
      .point = median,
      .interval = qi,
      .simple_names = TRUE,
      na.rm = FALSE
    )

    ## S3 method for class 'distribution'
    point_interval(
      .data,
      ...,
      .width = 0.95,
      .point = median,
      .interval = qi,
      .simple_names = TRUE,
      na.rm = FALSE
    )

    qi(x, .width = 0.95, .prob, na.rm = FALSE)

    ll(x, .width = 0.95, na.rm = FALSE)

    ul(x, .width = 0.95, na.rm = FALSE)

    hdi(
      x,
      .width = 0.95,
      na.rm = FALSE,
      ...,
      density = density_bounded(trim = TRUE),
      n = 4096,
      .prob
    )

    Mode(x, na.rm = FALSE, ...)

    ## Default S3 method:
    Mode(
      x,
      na.rm = FALSE,
      ...,
      density = density_bounded(trim = TRUE),
      n = 2001,
      weights = NULL
    )

    ## S3 method for class 'rvar'
    Mode(x, na.rm = FALSE, ...)

    ## S3 method for class 'distribution'
    Mode(x, na.rm = FALSE, ...)

    hdci(x, .width = 0.95, na.rm = FALSE)

    mean_qi(.data, ..., .width = 0.95)

    median_qi(.data, ..., .width = 0.95)

    mode_qi(.data, ..., .width = 0.95)

    mean_ll(.data, ..., .width = 0.95)

    median_ll(.data, ..., .width = 0.95)

    mode_ll(.data, ..., .width = 0.95)

    mean_ul(.data, ..., .width = 0.95)

    median_ul(.data, ..., .width = 0.95)

    mode_ul(.data, ..., .width = 0.95)

    mean_hdi(.data, ..., .width = 0.95)

    median_hdi(.data, ..., .width = 0.95)

    mode_hdi(.data, ..., .width = 0.95)

    mean_hdci(.data, ..., .width = 0.95)

    median_hdci(.data, ..., .width = 0.95)

    mode_hdci(.data, ..., .width = 0.95)
.data |
<data.frame | grouped_df> Data frame (or grouped
data frame as returned by |
... |
<bare language> Column names or expressions that, when evaluated in the context of
|
.width |
<numeric> vector of probabilities to use that determine the widths of
the resulting intervals. If multiple probabilities are provided, multiple rows per
group are generated, each with a different probability interval (and value of the
corresponding |
.point |
<function> Point summary function, which takes a vector and returns a single
value, e.g. |
.interval |
<function> Interval function, which takes a vector and a probability
( |
.simple_names |
<scalar logical> When |
na.rm |
<scalar logical> Should |
.exclude |
<character> Vector of names of columns to be excluded from summarization
if no column names are specified to be summarized in |
.prob |
Deprecated. Use |
x |
<numeric> Vector to summarize (for interval functions: |
density |
<function | string> For |
n |
<scalar numeric> For |
weights |
<numeric | NULL> For |
If .data
is a data frame, then ...
is a list of bare names of
columns (or expressions derived from columns) of .data
, on which
the point and interval summaries are derived. Column expressions are processed
using the tidy evaluation framework (see rlang::eval_tidy()
).
For a column named x
, the resulting data frame will have a column
named x
containing its point summary. If there is a single
column to be summarized and .simple_names
is TRUE
, the output will
also contain columns .lower
(the lower end of the interval),
.upper
(the upper end of the interval).
Otherwise, for every summarized column x
, the output will contain
x.lower
(the lower end of the interval) and x.upper
(the upper
end of the interval). Finally, the output will have a .width
column
containing the probability for the interval on each output row.
If .data
includes groups (see e.g. dplyr::group_by()
),
the points and intervals are calculated within the groups.
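The column naming rules for data frame input can be seen in a small sketch. It assumes dplyr and ggdist are installed; the data frame `draws` is made up for the example.

```r
library(dplyr)
library(ggdist)

set.seed(123)
# Hypothetical draws for two variables in two groups
draws <- data.frame(
  group = rep(c("a", "b"), each = 500),
  x = rnorm(1000),
  y = rnorm(1000, mean = 2)
)

# One summarized column: interval columns get the simple names .lower / .upper
one <- median_qi(draws, x)
names(one)

# Two summarized columns: interval columns are prefixed with the column name
two <- median_qi(draws, x, y)
names(two)

# Grouped input with two widths: one row per group per value of .width
by_group <- draws %>%
  group_by(group) %>%
  median_qi(x, .width = c(0.5, 0.95))
by_group
```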
If .data
is a vector, ...
is ignored and the result is a
data frame with one row per value of .width
and three columns:
y
(the point summary), ymin
(the lower end of the interval),
ymax
(the upper end of the interval), and .width
, the probability
corresponding to the interval. This behavior allows point_interval
and its derived functions (like median_qi
, mean_qi
, mode_hdi
, etc)
to be easily used to plot intervals in ggplot stats using methods like
stat_eye()
, stat_halfeye()
, or stat_summary()
.
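A brief sketch of the vector behavior, assuming ggplot2 and ggdist are installed. The data frame `df` is hypothetical, and the pointrange geom is just one possible choice for displaying the summary.

```r
library(ggplot2)
library(ggdist)

set.seed(123)
x <- rnorm(1000)

# Vector input: one row per value of .width, with ggplot-style column names
vec_summary <- median_qi(x, .width = c(0.5, 0.95))
names(vec_summary)

# Because of those names, the function can be passed to stat_summary()
# as its fun.data argument
df <- data.frame(g = rep(c("a", "b"), each = 500), value = x)
p <- ggplot(df, aes(x = g, y = value)) +
  stat_summary(fun.data = median_qi, geom = "pointrange")
```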
median_qi
, mode_hdi
, etc are short forms for
point_interval(..., .point = median, .interval = qi)
, etc.
qi
yields the quantile interval (also known as the percentile interval or
equi-tailed interval) as a 1x2 matrix.
hdi
yields the highest-density interval(s) (also known as the highest posterior
density interval). Note: If the distribution is multimodal, hdi
may return multiple
intervals for each probability level (these will be spread over rows). You may wish to use
hdci
(below) instead if you want a single highest-density interval, with the caveat that when
the distribution is multimodal hdci
is not a highest-density interval.
hdci
yields the highest-density continuous interval, also known as the shortest
probability interval. Note: If the distribution is multimodal, this may not actually
be the highest-density interval (there may be a higher-density
discontinuous interval, which can be found using hdi
).
ll
and ul
yield lower limits and upper limits, respectively (where the opposite
limit is set to either Inf
or -Inf
).
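The interval functions can be compared side by side on a sample where the distinction matters. This sketch assumes ggdist is installed; the bimodal sample `x` is made up for the example.

```r
library(ggdist)

set.seed(1234)
# Hypothetical bimodal sample: two well-separated normal components
x <- c(rnorm(5000, 0, 1), rnorm(2500, 6, 1))

# Equi-tailed interval: a single 1x2 matrix
q <- qi(x, .width = 0.9)
q

# Highest-density interval(s): may span several rows, one per disjoint region
h <- hdi(x, .width = 0.9)
h

# Highest-density continuous interval: always a single interval
hc <- hdci(x, .width = 0.9)
hc

# One-sided limits: the opposite end is infinite
lower_limit <- ll(x, .width = 0.9)
upper_limit <- ul(x, .width = 0.9)
lower_limit
upper_limit
```

When hdi() returns more than one row, passing it to point_interval() spreads those intervals over several output rows, as noted above.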
A data frame containing point summaries and intervals, with at least one column corresponding
to the point summary, one to the lower end of the interval, one to the upper end of the interval, the
width of the interval (.width
), the type of point summary (.point
), and the type of interval (.interval
).
Matthew Kay
library(dplyr)
library(ggplot2)

set.seed(123)

rnorm(1000) %>%
  median_qi()

data.frame(x = rnorm(1000)) %>%
  median_qi(x, .width = c(.50, .80, .95))

data.frame(
    x = rnorm(1000),
    y = rnorm(1000, mean = 2, sd = 2)
  ) %>%
  median_qi(x, y)

data.frame(
    x = rnorm(1000),
    group = "a"
  ) %>%
  rbind(data.frame(
    x = rnorm(1000, mean = 2, sd = 2),
    group = "b")
  ) %>%
  group_by(group) %>%
  median_qi(.width = c(.50, .80, .95))

multimodal_draws = data.frame(
    x = c(rnorm(5000, 0, 1), rnorm(2500, 4, 1))
  )

multimodal_draws %>%
  mode_hdi(.width = c(.66, .95))

multimodal_draws %>%
  ggplot(aes(x = x, y = 0)) +
  stat_halfeye(point_interval = mode_hdi, .width = c(.66, .95))
A justification-preserving variant of ggplot2::position_dodge()
which preserves the
vertical position of a geom while adjusting the horizontal position (or vice
versa when in a horizontal orientation). Unlike ggplot2::position_dodge()
,
position_dodgejust()
attempts to preserve the "justification" of x
positions relative to the bounds containing them (xmin
/xmax
) (or y
positions relative to ymin
/ymax
when in a horizontal orientation). This
makes it useful for dodging annotations to geoms and stats from the
geom_slabinterval()
family, which also preserve the justification of their
intervals relative to their slabs when dodging.
position_dodgejust( width = NULL, preserve = c("total", "single"), justification = NULL )
width |
Dodging width, when different to the width of the individual elements. This is useful when you want to align narrow geoms with wider geoms. See the examples. |
preserve |
Should dodging preserve the |
justification |
<scalar numeric> Justification of the point position ( |
library(dplyr)
library(ggplot2)
library(distributional)

dist_df = tribble(
  ~group, ~subgroup, ~mean, ~sd,
  1, "h", 5, 1,
  2, "h", 7, 1.5,
  3, "h", 8, 1,
  3, "i", 9, 1,
  3, "j", 7, 1
)

# An example with normal "dodge" positioning
# Notice how dodge points are placed in the center of their bounding boxes,
# which can cause slabs to be positioned outside their bounds.
dist_df %>%
  ggplot(aes(
    x = factor(group), ydist = dist_normal(mean, sd),
    fill = subgroup
  )) +
  stat_halfeye(
    position = "dodge"
  ) +
  geom_rect(
    aes(xmin = group, xmax = group + 1, ymin = 2, ymax = 13, color = subgroup),
    position = "dodge",
    data = . %>% filter(group == 3),
    alpha = 0.1
  ) +
  geom_point(
    aes(x = group, y = 7.5, color = subgroup),
    position = position_dodge(width = 1),
    data = . %>% filter(group == 3),
    shape = 1,
    size = 4,
    stroke = 1.5
  ) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Dark2")

# This same example with "dodgejust" positioning. For the points we
# supply a justification parameter to position_dodgejust which mimics the
# justification parameter of stat_halfeye, ensuring that they are
# placed appropriately. On slabinterval family geoms, position_dodgejust()
# will automatically detect the appropriate justification.
dist_df %>%
  ggplot(aes(
    x = factor(group), ydist = dist_normal(mean, sd),
    fill = subgroup
  )) +
  stat_halfeye(
    position = "dodgejust"
  ) +
  geom_rect(
    aes(xmin = group, xmax = group + 1, ymin = 2, ymax = 13, color = subgroup),
    position = "dodgejust",
    data = . %>% filter(group == 3),
    alpha = 0.1
  ) +
  geom_point(
    aes(x = group, y = 7.5, color = subgroup),
    position = position_dodgejust(width = 1, justification = 0),
    data = . %>% filter(group == 3),
    shape = 1,
    size = 4,
    stroke = 1.5
  ) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Dark2")
Experimental probability-like expressions that can be used in place of
some after_stat()
expressions in aesthetic assignments in ggdist stats.
Pr_(x)

p_(x)
x |
<bare language> Expressions. See Probability expressions, below. |
Pr_()
and p_()
are an experimental mini-language for specifying aesthetic values
based on probabilities and probability densities derived from distributions
supplied to ggdist stats (e.g., in stat_slabinterval()
,
stat_dotsinterval()
, etc.). They generate expressions that use after_stat()
and the computed variables of the stat (such as cdf
and pdf
; see e.g.
the Computed Variables section of stat_slabinterval()
) to compute
the desired probabilities or densities.
For example, one way to map the density of a distribution onto the alpha
aesthetic of a slab is to use after_stat(pdf)
:
ggplot() + stat_slab(aes(xdist = distributional::dist_normal(), alpha = after_stat(pdf)))
ggdist probability expressions offer an alternative, equivalent syntax:
ggplot() + stat_slab(aes(xdist = distributional::dist_normal(), alpha = !!p_(x)))
Here, p_(x)
is the probability density function. The use of !!
is
necessary to splice the generated expression into the aes()
call; for
more information, see quasiquotation.
Probability expressions consist of a call to Pr_()
or p_()
containing
a small number of valid combinations of operators and variable names.
Valid variables in probability expressions include:
x
, y
, or value
: values along the x
or y
axis. value
is the
orientation-neutral form.
xdist
, ydist
, or dist
: distributions mapped along the x
or y
axis. dist
is the orientation-neutral form. X
and Y
can also be
used as synonyms for xdist
and ydist
.
interval
: the smallest interval containing the current x
/y
value.
Pr_()
generates expressions for probabilities, e.g. cumulative distribution
functions (CDFs). Valid operators inside Pr_()
are:
<
, <=
, >
, >=
: generates values of the cumulative distribution
function (CDF) or complementary CDF by comparing one of {x
, y
, value
}
to one of {xdist
, ydist
, dist
, X
, Y
}. For example, Pr_(xdist <= x)
gives the CDF and Pr_(xdist > x)
gives the CCDF.
%in%
: currently can only be used with interval
on the right-hand side:
gives the probability of {x
, y
, value
} (left-hand side) being in the
smallest interval the stat generated that contains the value; e.g.
Pr_(x %in% interval)
.
p_()
generates expressions for probability density functions or probability mass
functions (depending on whether the underlying distribution is continuous or
discrete). It currently does not allow any operators in the expression, and
must be passed one of x
, y
, or value
.
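To make the relationship between the two syntaxes concrete, the sketch below maps the CDF onto the thickness aesthetic twice: once with an explicit after_stat(cdf) and once with the probability expression Pr_(xdist <= x). It assumes ggplot2, ggdist, and distributional are installed; the object names `p_explicit` and `p_expr` are illustrative only.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Explicit form: map the stat's computed `cdf` variable onto thickness
p_explicit <- ggplot() +
  stat_slab(aes(xdist = dist_normal(0, 1), thickness = after_stat(cdf)))

# Probability-expression form: Pr_(xdist <= x) is described above as
# giving the CDF; the generated expression is spliced in with !!
p_expr <- ggplot() +
  stat_slab(aes(xdist = dist_normal(0, 1), thickness = !!Pr_(xdist <= x)))
```

Printing either object should draw a slab whose thickness follows the cumulative distribution function of the standard normal distribution.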
The Computed Variables section of stat_slabinterval()
(especially
cdf
and pdf
) and the after_stat()
function.
library(ggplot2)
library(distributional)

df = data.frame(
  d = c(dist_normal(2.7, 1), dist_lognormal(1, 1/3)),
  name = c("normal", "lognormal")
)

# map density onto alpha of the fill
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(alpha = !!p_(x)))

# map CCDF onto thickness (like stat_ccdfinterval())
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(thickness = !!Pr_(xdist > x)))

# map containing interval onto fill
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(fill = !!Pr_(x %in% interval)))

# the color scale in the previous example is not great, so turn the
# probability into an ordered factor and adjust the fill scale.
# Though, see also the `level` computed variable in `stat_slabinterval()`,
# which is probably easier to use to create this style of chart.
ggplot(df, aes(y = name, xdist = d)) +
  stat_slabinterval(aes(fill = ordered(!!Pr_(x %in% interval)))) +
  scale_fill_brewer(direction = -1)
Given vectors of colours and partial_colour_ramp
s, ramps the colours
according to the parameters of the partial colour ramps, returning
a vector of the same length as the inputs giving the transformed
(ramped) colours.
ramp_colours(colour, ramp)
colour |
<character> Vector of colours to ramp to. |
ramp |
<partial_colour_ramp> Vector of colour ramps (same length as
|
Takes vectors of colours and partial_colour_ramp
s and produces
colours by interpolating between each from
colour and the target colour
by the specified amount
(where amount
and from
are the corresponding
fields of the ramp
).
For example, to add support for the fill_ramp
aesthetic to a geometry,
this line could be used inside the draw_group()
or draw_panel()
method
of a geom:
data$fill = ramp_colours(data$fill, data$fill_ramp)
A character vector of colours.
Matthew Kay
Other colour ramp functions:
guide_rampbar()
,
partial_colour_ramp()
,
scale_colour_ramp
pcr = partial_colour_ramp(c(0, 0.25, 0.75, 1), "red")
pcr

ramp_colours("blue", pcr)
This scale creates a secondary scale that modifies the fill
or color
scale of
geoms that support it (geom_lineribbon()
and geom_slabinterval()
) to "ramp"
from a secondary color (by default white) to the primary fill color (determined
by the standard color
or fill
aesthetics). It uses the
partial_colour_ramp()
data type.
scale_colour_ramp_continuous(
  from = "white",
  ...,
  limits = function(l) c(min(0, l[[1]]), l[[2]]),
  range = c(0, 1),
  guide = "legend",
  aesthetics = "colour_ramp"
)

scale_color_ramp_continuous(
  from = "white",
  ...,
  limits = function(l) c(min(0, l[[1]]), l[[2]]),
  range = c(0, 1),
  guide = "legend",
  aesthetics = "colour_ramp"
)

scale_colour_ramp_discrete(
  from = "white",
  ...,
  range = c(0.2, 1),
  aesthetics = "colour_ramp"
)

scale_color_ramp_discrete(
  from = "white",
  ...,
  range = c(0.2, 1),
  aesthetics = "colour_ramp"
)

scale_fill_ramp_continuous(..., aesthetics = "fill_ramp")

scale_fill_ramp_discrete(..., aesthetics = "fill_ramp")
from |
<string> The color to ramp from. Corresponds to |
... |
Arguments passed to underlying scale or guide functions. E.g.
|
limits |
One of:
|
range |
<length-2 numeric> Minimum and maximum
values after the scale transformation. These values should be between |
guide |
<Guide | string> A function used to create a guide or its name. For
|
aesthetics |
<character> Names of aesthetics to set scales for. |
These scales transform data into partial_colour_ramp
s. Each partial_colour_ramp
is a pair of two values: a from
colour and a numeric amount
between 0
and 1
representing a distance between from
and the target color (where 0
indicates the from
color and 1
the target color).
The target color is determined by the corresponding aesthetic: for example,
the colour_ramp
aesthetic creates ramps between from
and whatever the
value of the colour
aesthetic is; the fill_ramp
aesthetic creates ramps
between from
and whatever the value of the fill
aesthetic is. When the
colour_ramp
aesthetic is set, ggdist geometries will modify their
colour
by applying the colour ramp between from
and colour
(and
similarly for fill_ramp
and fill
).
Colour ramps can be applied (i.e. translated into colours) using
ramp_colours()
, which can be used with partial_colour_ramp()
to implement geoms that make use of colour_ramp
or fill_ramp
scales.
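As a rough illustration of how a ramp's amount relates the from colour to the target colour, the sketch below builds three ramps by hand and applies them with ramp_colours(). It assumes ggdist is installed; the variable names `ramps` and `ramped` are illustrative, and the exact colour strings returned are not asserted here.

```r
library(ggdist)

# Three ramps starting at white, with amounts 0, 0.5, and 1
ramps <- partial_colour_ramp(c(0, 0.5, 1), from = "white")

# Apply the ramps toward the target colour "blue": amount 0 should stay at
# (or very near) white, and amount 1 should reach (or very nearly reach) blue
ramped <- ramp_colours("blue", ramps)
ramped
```

A geom that supports fill_ramp would perform the same translation internally, using the fill aesthetic as the target colour.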
A ggplot2::Scale representing a scale for the colour_ramp
and/or fill_ramp
aesthetics for ggdist
geoms. Can be added to a ggplot()
object.
Matthew Kay
Other ggdist scales:
scale_side_mirrored()
,
scale_thickness
,
sub-geometry-scales
Other colour ramp functions:
guide_rampbar()
,
partial_colour_ramp()
,
ramp_colours()
library(dplyr)
library(ggplot2)
library(distributional)

tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)))

tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red")

# you can invert the order of `range` to change the order of the blend
tibble(d = dist_normal(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(cut_cdf_qi(cdf))), fill = "blue") +
  scale_fill_ramp_discrete(from = "red", range = c(1, 0))
This scale creates mirrored slabs for the side
aesthetic of the geom_slabinterval()
and geom_dotsinterval()
family of geoms and stats. It works on discrete variables
of two or three levels.
scale_side_mirrored(start = "topright", ..., aesthetics = "side")
start |
<string> The side to start from. Can be any valid value of the |
... |
Arguments passed on to
|
aesthetics |
<character> Names of aesthetics to set scales for. |
A ggplot2::Scale representing a scale for the side
aesthetic for ggdist geoms. Can be added to a ggplot()
object.
Matthew Kay
Other ggdist scales:
scale_colour_ramp
,
scale_thickness
,
sub-geometry-scales
library(dplyr)
library(ggplot2)

set.seed(1234)
data.frame(
  x = rnorm(400, c(1,4)),
  g = c("a","b")
) %>%
  ggplot(aes(x, fill = g, side = g)) +
  geom_weave(linewidth = 0, scale = 0.5) +
  scale_side_mirrored()
This ggplot2 scale linearly scales all thickness
values of geoms
that support the thickness
aesthetic (such as geom_slabinterval()
). It
can be used to align the thickness
scales across multiple geoms (by default,
thickness
is normalized on a per-geom level instead of as a global scale).
For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
scale_thickness_shared(
  name = waiver(),
  breaks = waiver(),
  labels = waiver(),
  limits = function(l) c(min(0, l[[1]]), l[[2]]),
  renormalize = FALSE,
  oob = scales::oob_keep,
  guide = "none",
  expand = c(0, 0),
  ...
)

scale_thickness_identity(..., guide = "none")
name |
The name of the scale. Used as the axis or legend title. If
|
breaks |
One of:
|
labels |
One of:
|
limits |
One of:
|
renormalize |
<scalar logical> When mapping values to the |
oob |
One of:
|
guide |
A function used to create a guide or its name. See
|
expand |
<numeric> Vector of limit expansion constants of length
2 or 4, following the same format used by the |
... |
Arguments passed on to
|
By default, normalization/scaling of slab thicknesses is controlled by geometries,
not by a ggplot2 scale function. This allows various functionality not
otherwise possible, such as (1) allowing different geometries to have different
thickness scales and (2) allowing the user to control at what level of aggregation
(panels, groups, the entire plot, etc) thickness scaling is done via the normalize
parameter to geom_slabinterval()
.
However, this default approach has one drawback: two different geoms will always
have their own scaling of thickness
. scale_thickness_shared()
offers an
alternative approach: when added to a chart, all geoms will use the same
thickness
scale, and geom-level normalization (via their normalize
parameters)
is ignored. This is achieved by "marking" thickness values as already
normalized by wrapping them in the thickness()
data type (this can be
disabled by setting renormalize = TRUE
).
Note: while a slightly more typical name for scale_thickness_shared()
might
be scale_thickness_continuous()
, the latter name would cause this scale
to be applied to all thickness
aesthetics by default according to the rules
ggplot2 uses to find default scales. Thus, to retain the usual behavior
of stat_slabinterval()
(per-geom normalization of thickness
), this scale
is called scale_thickness_shared()
.
A ggplot2::Scale representing a scale for the thickness
aesthetic for ggdist
geoms. Can be added to a ggplot()
object.
Matthew Kay
The thickness datatype.
The thickness
aesthetic of geom_slabinterval()
.
subscale_thickness()
, for setting a thickness
sub-scale within
a single geom_slabinterval()
.
Other ggdist scales:
scale_colour_ramp
,
scale_side_mirrored()
,
sub-geometry-scales
library(distributional)
library(ggplot2)
library(dplyr)

prior_post = data.frame(
  prior = dist_normal(0, 1),
  posterior = dist_normal(0.1, 0.5)
)

# By default, separate geoms have their own thickness scales, which means
# distributions plotted using two separate geoms will not have their slab
# functions drawn on the same scale (thus here, the two distributions have
# different areas under their density curves):
prior_post %>%
  ggplot() +
  stat_halfeye(aes(xdist = posterior)) +
  stat_slab(aes(xdist = prior), fill = NA, color = "red")

# For this kind of prior/posterior chart, it makes more sense to have the
# densities on the same scale; thus, the areas under both would be the same.
# We can do that using scale_thickness_shared():
prior_post %>%
  ggplot() +
  stat_halfeye(aes(xdist = posterior)) +
  stat_slab(aes(xdist = prior), fill = NA, color = "#e41a1c") +
  scale_thickness_shared()
Smooths x
values using a density estimator, returning new x
of the same
length. Can be used with a dotplot (e.g. geom_dots
(smooth = ...)
) to create
"density dotplots".
Supports automatic partial function application with waived arguments.
smooth_bounded(
  x,
  density = "bounded",
  bounds = c(NA, NA),
  bounder = "cooke",
  trim = FALSE,
  ...
)

smooth_unbounded(x, density = "unbounded", trim = FALSE, ...)
x |
<numeric> Values to smooth. |
density |
<function | string> Density estimator to use for smoothing. One of:
|
bounds |
<length-2 numeric> Min and max bounds. If a bound is |
bounder |
<function | string> Method to use to find missing
(
|
trim |
<scalar logical> Passed to |
... |
Arguments passed to the density estimator specified by |
Applies a kernel density estimator (KDE) to x
, then uses weighted quantiles
of the KDE to generate a new set of x
values with smoothed values. Plotted
using a dotplot (e.g. geom_dots(smooth = "bounded")
or
geom_dots(smooth = smooth_bounded(...))
), these values create a variation on
a "density dotplot" (Zvinca 2018).
Such plots are recommended only for very large samples, where the precise positions of individual values are not particularly meaningful. For small samples, ordinary dotplots should generally be used.
Two variants are supplied by default:
smooth_bounded()
, which uses density_bounded()
.
Passes the bounds
arguments to the estimator.
smooth_unbounded()
, which uses density_unbounded()
.
It is generally recommended to pick the smooth based on the known bounds of
your data, e.g. by using smooth_bounded()
with the bounds
parameter if
there are finite bounds, or smooth_unbounded()
if both bounds are infinite.
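The smoothers can also be called directly on a numeric vector, which is a convenient way to inspect what they do before passing them to a dotplot. The following is a minimal sketch, assuming ggdist is installed; the simulated data and the names `x_unit` and `half_bandwidth` are illustrative only.

```r
library(ggdist)

set.seed(1234)
x_unit <- rbeta(200, 0.5, 0.5)  # illustrative data bounded to [0, 1]

# Bounded smoother with known finite bounds
smoothed <- smooth_bounded(x_unit, bounds = c(0, 1))

# Calling a smoother without `x` returns a partially applied function,
# which can be passed on (e.g. to geom_dots(smooth = ...)) or called later
half_bandwidth <- smooth_unbounded(adjust = 0.5)
smoothed_unbounded <- half_bandwidth(x = rnorm(200))
```

In both cases the result should be a numeric vector with one smoothed value per input value.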
A numeric vector of length(x)
, where each entry is a smoothed version of
the corresponding entry in x
.
If x
is missing, returns a partial application of itself. See automatic-partial-functions.
Zvinca, Daniel. "In the pursuit of diversity in data visualization. Jittering data to access details." https://www.linkedin.com/pulse/pursuit-diversity-data-visualization-jittering-access-daniel-zvinca/.
Other dotplot smooths:
smooth_discrete()
,
smooth_none()
library(ggplot2)

set.seed(1234)
x = rnorm(1000)

# basic dotplot is noisy
ggplot(data.frame(x), aes(x)) +
  geom_dots()

# density dotplot is smoother, but does move points (most noticeable
# in areas of low density)
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "unbounded")

# you can adjust the kernel and bandwidth...
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = smooth_unbounded(kernel = "triangular", adjust = 0.5))

# for bounded data, you should use the bounded smoother
x_beta = rbeta(1000, 0.5, 0.5)

ggplot(data.frame(x_beta), aes(x_beta)) +
  geom_dots(smooth = smooth_bounded(bounds = c(0, 1)))
Note: Better-looking bar dotplots are typically easier to achieve using
layout = "bar"
with the geom_dotsinterval()
family instead of
smooth = "bar"
or smooth = "discrete"
.
Smooths x
values where x
is presumed to be discrete, returning a new x
of the same length. Both smooth_discrete()
and smooth_bar()
use the
resolution()
of the data to apply smoothing around unique values in the
dataset; smooth_discrete()
uses a kernel density estimator and smooth_bar()
places values in an evenly-spaced grid. Can be used with a dotplot
(e.g. geom_dots
(smooth = ...)
) to create "bar dotplots".
Supports automatic partial function application with waived arguments.
smooth_discrete(
  x,
  kernel = c("rectangular", "gaussian", "epanechnikov", "triangular", "biweight",
    "cosine", "optcosine"),
  width = 0.7,
  ...
)

smooth_bar(x, width = 0.7, ...)
x |
<numeric> Values to smooth. |
kernel |
<string> The smoothing kernel to be used. This must partially
match one of |
width |
<scalar numeric> approximate width of the bars as a fraction of data |
... |
additional parameters; |
smooth_discrete()
applies a kernel density estimator (default: rectangular)
to x
. It automatically sets the bandwidth to be such that the kernel's
width (for each kernel type) is approximately width
times the resolution()
of the data. This means it essentially creates smoothed bins around each
unique value. It calls down to smooth_unbounded()
.
smooth_bar()
generates an evenly-spaced grid of values spanning +/- width/2
around each unique value in x
.
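The effect of the two discrete smoothers can be inspected by calling them directly on a small vector. This is a sketch under stated assumptions: ggdist is installed, the vector `counts` is made-up data whose resolution is 1, and the names `bar_x` and `disc_x` are illustrative.

```r
library(ggdist)

# Illustrative discrete data; adjacent unique values differ by 1
counts <- c(0, 0, 0, 1, 1, 2, 2, 2, 2)

# smooth_bar(): values are spread over an evenly spaced grid
# around each unique value
bar_x <- smooth_bar(counts, width = 0.7)

# smooth_discrete(): a kernel-smoothed alternative with the default
# rectangular kernel made explicit
disc_x <- smooth_discrete(counts, kernel = "rectangular", width = 0.7)

# Offsets from the original values; for smooth_bar() these should stay
# within +/- width/2 (here 0.35) because the resolution of the data is 1
round(bar_x - counts, 3)
```

Both functions return one value per input value, so the results can be compared with the original vector element by element.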
A numeric vector of length(x)
, where each entry is a smoothed version of
the corresponding entry in x
.
If x
is missing, returns a partial application of itself. See automatic-partial-functions.
Other dotplot smooths:
smooth_density
,
smooth_none()
library(ggplot2)

set.seed(1234)
x = rpois(1000, 2)

# automatic binwidth in basic dotplot on large counts in discrete
# distributions is very small
ggplot(data.frame(x), aes(x)) +
  geom_dots()

# NOTE: It is now recommended to use layout = "bar" instead of
# smooth = "discrete" or smooth = "bar"; the latter are retained because
# they can sometimes be useful in combination with other layouts for
# more specialized (but finicky) applications.
ggplot(data.frame(x), aes(x)) +
  geom_dots(layout = "bar")

# smooth_discrete() constructs wider bins of dots
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "discrete")

# smooth_bar() is an alternative approach to rectangular layouts
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "bar")

# adjust the shape by changing the kernel or the width. epanechnikov
# works well with side = "both"
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = smooth_discrete(kernel = "epanechnikov", width = 0.8), side = "both")
Default smooth for dotplots: no smooth. Simply returns the input values.
Supports automatic partial function application with waived arguments.
smooth_none(x, ...)
x |
<numeric> Values to smooth. |
... |
ignored |
This is the default value for the smooth
argument of geom_dotsinterval()
.
x
If x
is missing, returns a partial application of itself. See automatic-partial-functions.
Other dotplot smooths:
smooth_density
,
smooth_discrete()
Shortcut version of stat_slabinterval()
with geom_slabinterval()
for
creating CCDF bar plots.
Roughly equivalent to:
stat_slabinterval( aes( thickness = after_stat(thickness(1 - cdf, 0, 1)), justification = after_stat(0.5), side = after_stat("topleft") ), normalize = "none", expand = TRUE )
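As a minimal usage sketch of the shortcut itself, assuming ggplot2 and ggdist are installed: the data frame `df`, its column names, and the simulated values are illustrative only.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = c(rnorm(500, mean = 3), rnorm(500, mean = 4))
)

# CCDF bar plot of sample data, one bar per group;
# expand_limits() makes the bars start from zero
p <- ggplot(df, aes(x = group, y = value)) +
  stat_ccdfinterval() +
  expand_limits(y = 0)
```

Printing `p` should show, for each group, a bar whose thickness at each value of y is the proportion of the sample exceeding that value, overlaid with the default point and intervals.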
stat_ccdfinterval( mapping = NULL, data = NULL, geom = "slabinterval", position = "identity", ..., normalize = "none", expand = TRUE, p_limits = c(NA, NA), density = "bounded", adjust = waiver(), trim = waiver(), breaks = waiver(), align = waiver(), outline_bars = waiver(), point_interval = "median_qi", limits = NULL, n = waiver(), .width = c(0.66, 0.95), orientation = NA, na.rm = FALSE, show.legend = c(size = FALSE), inherit.aes = TRUE, check.aes = TRUE, check.param = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist
, ydist
, and dist
can be any distribution object from the distributional
package (dist_normal()
, dist_beta()
, etc)
or can be a posterior::rvar()
object. Since these functions are vectorized,
other columns can be passed directly to them in an aes()
specification; e.g.
aes(dist = dist_normal(mu, sigma))
will work if mu
and sigma
are columns in the
input data frame.
dist
can be a character vector giving the distribution name. Then the arg1
, ... arg9
aesthetics (or args
as a list column) specify distribution arguments. Distribution names
should correspond to R functions that have "p"
, "q"
, and "d"
functions; e.g. "norm"
is a valid distribution name because R defines the pnorm()
, qnorm()
, and dnorm()
functions for Normal distributions.
See the parse_dist()
function for a useful way to generate dist
and args
values from human-readable distribution specs (like "normal(0,1)"
). Such specs are also
produced by other packages (like the brms::get_prior
function in brms); thus,
parse_dist()
combined with the stats described here can help you visualize the output
of those functions.
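As a rough sketch of the `parse_dist()` workflow described above, the example below parses two human-readable specs and plots them as CCDF bars. The `priors` data frame and its `prior` column are hypothetical, and the sketch assumes the `.dist` and `.args` column names that `parse_dist()` produces by default.

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specifications written as human-readable strings
priors = data.frame(
  prior = c("normal(0,1)", "normal(1,2)"),
  stringsAsFactors = FALSE
)

# parse_dist() adds a distribution-name column (.dist) and
# a list column of arguments (.args)
parsed = parse_dist(priors, prior)

# Map the parsed name and arguments onto the dist and args aesthetics
p = ggplot(parsed, aes(y = prior, dist = .dist, args = .args)) +
  stat_ccdfinterval()
```

Printing `p` draws one CCDF bar per spec. The same `parsed` data frame should work with the other slabinterval stats.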
A ggplot2::Stat representing a CCDF bar geometry which can
be added to a ggplot()
object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
or y
: For slabs, the input values to the slab function.
For intervals, the point summary from the interval function. Whether it is x
or y
depends on orientation.
xmin
or ymin
: For intervals, the lower end of the interval from the interval function.
xmax
or ymax
: For intervals, the upper end of the interval from the interval function.
.width
: For intervals, the interval width as a numeric value in [0, 1]
.
For slabs, the width of the smallest interval containing that value of the slab.
level
: For intervals, the interval width as an ordered factor.
For slabs, the level of the smallest interval containing that value of the slab.
pdf
: For slabs, the probability density function (PDF).
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the PDF at the point summary; intervals also have pdf_min
and pdf_max
for the PDF at the lower and upper ends of the interval.
cdf
: For slabs, the cumulative distribution function.
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the CDF at the point summary; intervals also have cdf_min
and cdf_max
for the CDF at the lower and upper ends of the interval.
n
: For slabs, the number of data points summarized into that slab. If the slab was created from
an analytical distribution via the xdist
, ydist
, or dist
aesthetic, n
will be Inf
.
f
: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF),
determined by slab_type
. Instead of using slab_type
to change f
and then mapping f
onto an
aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf
, cdf
, or
1 - cdf
) directly onto the desired aesthetic.
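A minimal sketch of using a computed variable in an aesthetic specification: here the computed `x` values are compared against a threshold inside `after_stat()` to color part of each slab. The data frame `df` and the threshold of 7 are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# after_stat(x) refers to the positions computed by the stat;
# the threshold 7 is an arbitrary value chosen for this example
p = ggplot(df, aes(x = value, y = group)) +
  stat_ccdfinterval(aes(fill = after_stat(x > 7)))
```

Other computed variables listed above, such as `cdf` or `level`, can be used inside `after_stat()` in the same way.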
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"
); or sample data to be summarized
(when orientation = "horizontal"
with sample data).
y
: y position of the geometry (when orientation = "horizontal"
); or sample data to be summarized
(when orientation = "vertical"
with sample data).
weight
: When using samples (i.e. the x
and y
aesthetics, not xdist
or ydist
), optional
weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"
), a
distributional object (e.g. dist_normal()
), or a posterior::rvar()
object. See Details.
args
: Distribution arguments (args
or arg1
, ... arg9
). See Details.
In addition, in their default configuration (paired with geom_slabinterval()
)
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness
: The thickness of the slab at each x
value (if orientation = "horizontal"
) or
y
value (if orientation = "vertical"
) of the slab.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
datatype
: When using composite geoms directly without a stat
(e.g. geom_slabinterval()
), datatype
is used to
indicate which part of the geom a row in the data targets: rows with datatype = "slab"
target the
slab portion of the geometry and rows with datatype = "interval"
target the interval portion of
the geometry. This is set automatically when using ggdist stat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
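The override aesthetics listed above can be set as constants on the layer. The sketch below styles each sub-geometry separately; the colors and sizes are arbitrary, and `df` is a made-up sample data frame.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

p = ggplot(df, aes(x = value, y = group)) +
  stat_ccdfinterval(
    slab_fill = "gray85",          # fill of the slab only
    slab_colour = "gray40",        # outline color of the slab
    slab_linewidth = 0.5,          # outline width of the slab
    interval_colour = "steelblue", # color of the interval only
    point_colour = "black",        # outline color of the point
    point_size = 2                 # point size, bypassing the size transformation
  )
```

Setting `fill` or `colour` instead would affect several sub-geometries at once, as described in the color aesthetics above.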
See geom_slabinterval()
for the geom underlying this stat.
See stat_slabinterval()
for the stat this shortcut is based on.
Other slabinterval stats:
stat_cdfinterval()
,
stat_eye()
,
stat_gradientinterval()
,
stat_halfeye()
,
stat_histinterval()
,
stat_interval()
,
stat_pointinterval()
,
stat_slab()
,
stat_spike()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_ccdfinterval() +
  expand_limits(x = 0)

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_ccdfinterval() +
  expand_limits(x = 0)
Shortcut version of stat_slabinterval()
with geom_slabinterval()
for
creating CDF bar plots.
Roughly equivalent to:
stat_slabinterval( aes( thickness = after_stat(thickness(cdf, 0, 1)), justification = after_stat(0.5), side = after_stat("topleft") ), normalize = "none", expand = TRUE )
stat_cdfinterval( mapping = NULL, data = NULL, geom = "slabinterval", position = "identity", ..., normalize = "none", expand = TRUE, p_limits = c(NA, NA), density = "bounded", adjust = waiver(), trim = waiver(), breaks = waiver(), align = waiver(), outline_bars = waiver(), point_interval = "median_qi", limits = NULL, n = waiver(), .width = c(0.66, 0.95), orientation = NA, na.rm = FALSE, show.legend = c(size = FALSE), inherit.aes = TRUE, check.aes = TRUE, check.param = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist
, ydist
, and dist
can be any distribution object from the distributional
package (dist_normal()
, dist_beta()
, etc)
or can be a posterior::rvar()
object. Since these functions are vectorized,
other columns can be passed directly to them in an aes()
specification; e.g.
aes(dist = dist_normal(mu, sigma))
will work if mu
and sigma
are columns in the
input data frame.
dist
can be a character vector giving the distribution name. Then the arg1
, ... arg9
aesthetics (or args
as a list column) specify distribution arguments. Distribution names
should correspond to R functions that have "p"
, "q"
, and "d"
functions; e.g. "norm"
is a valid distribution name because R defines the pnorm()
, qnorm()
, and dnorm()
functions for Normal distributions.
See the parse_dist()
function for a useful way to generate dist
and args
values from human-readable distribution specs (like "normal(0,1)"
). Such specs are also
produced by other packages (like the brms::get_prior
function in brms); thus,
parse_dist()
combined with the stats described here can help you visualize the output
of those functions.
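As a sketch of the character-vector form of `dist` described above, the example below names the distribution `"norm"` and supplies its parameters through `arg1` and `arg2`. The `dist_df` data frame and its columns are made up for illustration.

```r
library(ggplot2)
library(ggdist)

# Hypothetical table of Normal distributions, one per group
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

# "norm" is valid because R defines pnorm(), qnorm(), and dnorm();
# arg1 and arg2 are passed as the first two distribution arguments (mean, sd)
p = ggplot(dist_df, aes(y = group, dist = "norm", arg1 = mean, arg2 = sd)) +
  stat_cdfinterval()
```

The `xdist` / `ydist` form with a distributional object is generally preferable, since, as noted above, `dist` does not work as well with orientation detection.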
A ggplot2::Stat representing a CDF bar geometry which can
be added to a ggplot()
object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
or y
: For slabs, the input values to the slab function.
For intervals, the point summary from the interval function. Whether it is x
or y
depends on orientation.
xmin
or ymin
: For intervals, the lower end of the interval from the interval function.
xmax
or ymax
: For intervals, the upper end of the interval from the interval function.
.width
: For intervals, the interval width as a numeric value in [0, 1]
.
For slabs, the width of the smallest interval containing that value of the slab.
level
: For intervals, the interval width as an ordered factor.
For slabs, the level of the smallest interval containing that value of the slab.
pdf
: For slabs, the probability density function (PDF).
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the PDF at the point summary; intervals also have pdf_min
and pdf_max
for the PDF at the lower and upper ends of the interval.
cdf
: For slabs, the cumulative distribution function.
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the CDF at the point summary; intervals also have cdf_min
and cdf_max
for the CDF at the lower and upper ends of the interval.
n
: For slabs, the number of data points summarized into that slab. If the slab was created from
an analytical distribution via the xdist
, ydist
, or dist
aesthetic, n
will be Inf
.
f
: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF),
determined by slab_type
. Instead of using slab_type
to change f
and then mapping f
onto an
aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf
, cdf
, or
1 - cdf
) directly onto the desired aesthetic.
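The recommendation above, mapping a computed variable directly instead of going through `f`, can be sketched as follows. Here `cdf` is mapped onto `thickness` in a plain `stat_slabinterval()` call; `df` is a made-up sample data frame.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Map the computed cdf directly onto thickness rather than mapping f
p = ggplot(df, aes(x = value, y = group)) +
  stat_slabinterval(
    aes(thickness = after_stat(cdf)),
    normalize = "none"
  )
```

Replacing `cdf` with `1 - cdf` or `pdf` in the mapping gives the CCDF and density variants, respectively.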
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"
); or sample data to be summarized
(when orientation = "horizontal"
with sample data).
y
: y position of the geometry (when orientation = "horizontal"
); or sample data to be summarized
(when orientation = "vertical"
with sample data).
weight
: When using samples (i.e. the x
and y
aesthetics, not xdist
or ydist
), optional
weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"
), a
distributional object (e.g. dist_normal()
), or a posterior::rvar()
object. See Details.
args
: Distribution arguments (args
or arg1
, ... arg9
). See Details.
In addition, in their default configuration (paired with geom_slabinterval()
)
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness
: The thickness of the slab at each x
value (if orientation = "horizontal"
) or
y
value (if orientation = "vertical"
) of the slab.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
datatype
: When using composite geoms directly without a stat
(e.g. geom_slabinterval()
), datatype
is used to
indicate which part of the geom a row in the data targets: rows with datatype = "slab"
target the
slab portion of the geometry and rows with datatype = "interval"
target the interval portion of
the geometry. This is set automatically when using ggdist stat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
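As a small sketch of how the `x` and `y` aesthetics interact with orientation, the example below puts the grouping variable on `x` and the samples on `y`, which should yield vertical CDF bars. The data frame `df` and the interval widths are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Groups on x, samples on y: the bars are drawn vertically.
# fill applies to the slab and point sub-geometries.
p = ggplot(df, aes(x = group, y = value, fill = group)) +
  stat_cdfinterval(.width = c(0.5, 0.9))
```

If automatic detection picks the wrong direction for a particular dataset, the `orientation` parameter can be set explicitly.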
See geom_slabinterval()
for the geom underlying this stat.
See stat_slabinterval()
for the stat this shortcut is based on.
Other slabinterval stats:
stat_ccdfinterval()
,
stat_eye()
,
stat_gradientinterval()
,
stat_halfeye()
,
stat_histinterval()
,
stat_interval()
,
stat_pointinterval()
,
stat_slab()
,
stat_spike()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_cdfinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_cdfinterval()
A combination of stat_slabinterval()
and geom_dotsinterval()
with sensible defaults
for making dot plots. While geom_dotsinterval()
is intended for use on data
frames that have already been summarized using a point_interval()
function,
stat_dots()
is intended for use directly on data frames of draws or of
analytical distributions, and will perform the summarization using a point_interval()
function. Geoms based on geom_dotsinterval()
create dotplots that automatically determine a bin width that
ensures the plot fits within the available space. They can also ensure dots do not overlap.
Roughly equivalent to:
stat_dotsinterval( aes(size = NULL), geom = "dots", show_point = FALSE, show_interval = FALSE, show.legend = NA )
stat_dots( mapping = NULL, data = NULL, geom = "dots", position = "identity", ..., quantiles = NA, orientation = NA, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, check.aes = TRUE, check.param = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
quantiles |
<scalar numeric> Number of quantiles to plot in the dotplot. Use |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms are similar to ggplot2::geom_dotplot()
but with a number of differences:
Dots geoms act like slabs in geom_slabinterval()
and can be given x positions (or y positions when
in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape
aesthetic (when using the
dotsinterval
family) or the shape
or slab_shape
aesthetic (when using the dots
family).
Stats and geoms in this family include:
geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. These use side = "both" by default and set the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
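As a rough illustration of a quantile dotplot, the sketch below draws 20 dots from a sample, so each dot can be read as roughly a 1-in-20 chance. The data frame `draws`, its `minutes` column, and the log-normal parameters are made up for this example and are not part of ggdist.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical predictive draws: minutes until a bus arrives
draws <- data.frame(minutes = rlnorm(5000, meanlog = log(10), sdlog = 0.3))

# quantiles = 20 summarizes the 5000 draws into 20 dots,
# each standing for about 5% of the probability mass
p_quantile <- ggplot(draws, aes(x = minutes)) +
  stat_dots(quantiles = 20) +
  labs(x = "Minutes until arrival", y = NULL)

p_quantile
```

Fewer quantiles give larger, more countable dots at the cost of resolution in the tails.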
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
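The two forms described above can be sketched side by side. The data frame `params` and its columns (`name`, `family`, `mu`, `sigma`) are hypothetical; the orientation is set explicitly in the second plot because, as noted, dist works less well with orientation detection.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical parameter table: one row per distribution
params <- data.frame(
  name   = c("a", "b", "c"),
  family = "norm",
  mu     = c(0, 1, 2),
  sigma  = c(1, 0.5, 2)
)

# Recommended form: a distributional object built from columns
p_xdist <- ggplot(params, aes(y = name, xdist = dist_normal(mu, sigma))) +
  stat_dots(quantiles = 50)

# Historical form: a distribution name plus arg1, arg2, ...
p_dist <- ggplot(params, aes(y = name, dist = family, arg1 = mu, arg2 = sigma)) +
  stat_dots(quantiles = 50, orientation = "horizontal")
```

Both plots should show the same three Normal distributions; the first form is the one the text recommends.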
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior() function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
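A minimal sketch of that workflow follows, assuming a hand-written table of prior specs rather than real brms output. The data frame `prior_specs` and its columns are invented for illustration; `.dist` and `.args` are the columns parse_dist() adds by default.

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specs, written the way a modeling package might print them
prior_specs <- data.frame(
  parameter = c("intercept", "slope"),
  prior     = c("normal(0, 1)", "student_t(3, 0, 2.5)")
)

# parse_dist() adds .dist (distribution name) and .args (list of arguments)
parsed <- parse_dist(prior_specs, prior)

p_priors <- ggplot(parsed, aes(y = parameter, dist = .dist, args = .args)) +
  stat_dots(quantiles = 50, orientation = "horizontal")
```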
A ggplot2::Stat representing a dot geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
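One way to use these computed variables is to color dots by a condition on the computed x. The sketch below is only an illustration: the data frame `samples`, the threshold of 0, and the legend label are assumptions made for the example.

```r
library(ggplot2)
library(ggdist)

set.seed(42)
# Hypothetical sample, e.g. draws of an effect size
samples <- data.frame(value = rnorm(500, mean = 0.5))

# Inside after_stat(), `x` is the variable computed by the stat
# (the dot positions), not the raw `value` column
p_threshold <- ggplot(samples, aes(x = value)) +
  stat_dots(aes(fill = after_stat(x > 0)), quantiles = 100) +
  labs(fill = "x > 0")

p_threshold
```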
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_dots()) the following aesthetics are supported by the underlying geom:
Dots-specific (aka Slab-specific) aesthetics
family: The font family used to draw the dots.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
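As a small illustration of the side and scale aesthetics described above, the following sketch mirrors each group's dots around its baseline. The data frame `groups`, its columns, and the chosen values (scale = 0.7, alpha = 0.8) are assumptions for this example only.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# Hypothetical grouped sample
groups <- data.frame(
  group = rep(c("a", "b", "c"), each = 200),
  value = rnorm(600, mean = rep(c(0, 1, 3), each = 200))
)

# side = "both" mirrors the dots around each group's baseline;
# scale = 0.7 leaves more space between adjacent groups than the default 0.9
p_both <- ggplot(groups, aes(x = value, y = group, fill = group)) +
  stat_dots(side = "both", scale = 0.7, alpha = 0.8)

p_both
```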
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See geom_dots() for the geom underlying this stat. See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval stats: stat_dotsinterval(), stat_mcse_dots()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_dots()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_dots(quantiles = 50)
A combination of stat_slabinterval() and geom_dotsinterval() with sensible defaults for making dots + point + interval plots. While geom_dotsinterval() is intended for use on data frames that have already been summarized using a point_interval() function, stat_dotsinterval() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function. Geoms based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They can also ensure dots do not overlap.
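To make the division of labor concrete, the sketch below lets stat_dotsinterval() summarize raw draws and then computes the same kind of summary by hand with mean_qi(), one of the point_interval() family of functions. The data frame `draws` and the interval widths are assumptions chosen for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(7)
# Hypothetical draws for two conditions
draws <- data.frame(
  condition = rep(c("a", "b"), each = 300),
  value     = rnorm(600, mean = rep(c(0, 2), each = 300))
)

# The stat summarizes the draws itself: mean point plus 50% and 90% quantile intervals
p_interval <- ggplot(draws, aes(x = value, y = condition)) +
  stat_dotsinterval(point_interval = "mean_qi", .width = c(0.5, 0.9), quantiles = 50)

# The same kind of summary computed directly, one row per interval width
summary_a <- mean_qi(draws$value[draws$condition == "a"], .width = c(0.5, 0.9))
summary_a
```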
stat_dotsinterval(
  mapping = NULL,
  data = NULL,
  geom = "dotsinterval",
  position = "identity",
  ...,
  quantiles = NA,
  point_interval = "median_qi",
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
quantiles |
<scalar numeric> Number of quantiles to plot in the dotplot. Use |
point_interval |
<function | string> A function from the |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms is similar to ggplot2::geom_dotplot() but with a number of differences:
Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. These use side = "both" by default and set the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior() function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
A ggplot2::Stat representing a dots + point + interval geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
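The level variable can, for instance, be mapped to fill so that dots are shaded by the smallest interval containing them. This is a sketch under assumptions: the data frame `draws` is invented, and scale_fill_brewer(na.translate = FALSE) is just one way to hide the NA level assigned to dots outside every interval.

```r
library(ggplot2)
library(ggdist)

set.seed(99)
# Hypothetical sample of draws
draws <- data.frame(value = rnorm(1000))

# Dots inside the 66% interval, inside the 95% interval, and outside both
# receive different values of the computed `level` variable
p_level <- ggplot(draws, aes(x = value)) +
  stat_dotsinterval(
    aes(fill = after_stat(level)),
    quantiles = 100,
    .width = c(0.66, 0.95)
  ) +
  scale_fill_brewer(na.translate = FALSE) +
  labs(fill = "Interval")

p_level
```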
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_dotsinterval()) the following aesthetics are supported by the underlying geom:
Dots-specific (aka Slab-specific) aesthetics
family: The font family used to draw the dots.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
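The override aesthetics listed above can also be set to constants to style each sub-geometry on its own. The sketch below is illustrative only; the data frame `draws` and the specific colors and sizes are arbitrary choices, not defaults of the package.

```r
library(ggplot2)
library(ggdist)

set.seed(2024)
# Hypothetical sample of draws
draws <- data.frame(value = rgamma(1000, shape = 3, rate = 1))

# Style the dots, the interval, and the point independently
p_styled <- ggplot(draws, aes(x = value)) +
  stat_dotsinterval(
    quantiles = 50,
    slab_fill = "skyblue",          # fill of the dots only
    slab_colour = "steelblue",      # outline of the dots only
    interval_colour = "gray30",     # interval lines only
    point_colour = "gray30",        # point outline only
    point_size = 3                  # point size, bypassing fatten_point
  )

p_styled
```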
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See geom_dotsinterval() for the geom underlying this stat. See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval stats: stat_dots(), stat_mcse_dots()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_dotsinterval()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_dotsinterval(quantiles = 50)
Shortcut version of stat_slabinterval() with geom_slabinterval() for creating eye (violin + interval) plots.
Roughly equivalent to:
stat_slabinterval( aes(side = after_stat("both")) )
stat_eye(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
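The naming convention for character dist values can be checked in base R. The sketch below is only an illustration: has_dist_functions() is a hypothetical helper (not part of ggdist), and the lookup shown is an assumption about how a name plus arguments could be resolved, not a description of ggdist's internals.

```r
# Hypothetical helper (not part of ggdist): TRUE if the "d", "p", and "q"
# functions for a distribution name can be found as functions.
has_dist_functions = function(name) {
  fns = paste0(c("d", "p", "q"), name)
  all(vapply(fns, exists, logical(1), mode = "function"))
}

has_dist_functions("norm")      # dnorm(), pnorm(), qnorm() exist in stats
has_dist_functions("beta")      # dbeta(), pbeta(), qbeta() exist in stats
has_dist_functions("notadist")  # no such functions

# Illustration only: resolve a name and its arguments to a quantile function.
# dist_args stands in for values that would come from arg1, arg2 (or args).
dist_name = "norm"
dist_args = list(0, 1)
quantile_fun = match.fun(paste0("q", dist_name))
median_value = do.call(quantile_fun, c(list(0.5), dist_args))
```

With a data frame that has mu and sigma columns, the corresponding mapping would be aes(dist = "norm", arg1 = mu, arg2 = sigma), although the xdist / ydist aesthetics remain the recommended route.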
A ggplot2::Stat representing an eye (violin + interval) geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
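As a rough sketch of how these computed variables can be used, the example below maps the level variable onto fill with after_stat(). The data frame mirrors the one in the Examples; the extra .width value of 1 and the use of scale_fill_brewer() are illustrative choices, not defaults of stat_eye().

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Color each part of the slab by the smallest interval that contains it.
# `level` is computed by the stat, so it must be wrapped in after_stat().
p = ggplot(df, aes(x = value, y = group)) +
  stat_eye(aes(fill = after_stat(level)), .width = c(0.66, 0.95, 1)) +
  scale_fill_brewer(na.translate = FALSE)

# print(p) draws the plot
```

The same pattern should apply to the other computed variables, for example aes(fill = after_stat(cdf > 0.5)).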
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right", justification is set to 0; when side is "bottom"/"left", justification is set to 1; and when side is "both", justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size also determines the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
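To make the override aesthetics more concrete, here is a small sketch that sets several of them to constant values on stat_eye(). The specific colors and sizes are arbitrary choices for illustration, and the data frame mirrors the one used in the Examples.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Style each sub-geometry separately: the slab gets its own fill and opacity,
# while the interval and point keep distinct colors and the point its own size.
p = ggplot(df, aes(x = value, y = group)) +
  stat_eye(
    slab_fill = "skyblue",
    slab_alpha = 0.6,
    interval_colour = "gray30",
    point_colour = "black",
    point_size = 3
  )

# print(p) draws the plot
```

Because these are set outside aes(), they apply uniformly to every group; mapping them inside aes() would let them vary with the data instead.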
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_eye()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_eye()
Shortcut version of stat_slabinterval() with geom_slabinterval() for creating gradient + interval plots.
Roughly equivalent to:
stat_slabinterval(
  aes(
    justification = after_stat(0.5),
    thickness = after_stat(thickness(1)),
    slab_alpha = after_stat(f)
  ),
  fill_type = "auto",
  show.legend = c(size = FALSE, slab_alpha = FALSE)
)
If your graphics device supports it, it is recommended to use this stat with fill_type = "gradient" (see the description of that parameter). On R >= 4.2, support for fill_type = "gradient" should be auto-detected based on the graphics device you are using.
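A minimal sketch of requesting gradient fills explicitly is shown below. It assumes the active graphics device supports gradients; the data frame mirrors the one in the Examples.

```r
library(ggplot2)
library(ggdist)
library(distributional)

dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

# Request true gradients rather than relying on auto-detection.
# Assumption: the device used for rendering supports gradient fills.
p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval(fill_type = "gradient")

# print(p) draws the plot
```

Leaving fill_type at its default of "auto" is the safer choice when the output device is not known in advance.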
stat_gradientinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  fill_type = "auto",
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE, slab_alpha = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
fill_type |
<string> What type of fill to use when the fill color or alpha varies within a slab. One of:
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
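The parse_dist() workflow mentioned above might look roughly like the following sketch. The prior strings are made-up examples, and the use of the .dist_obj column with xdist assumes a reasonably recent version of ggdist.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

# Made-up prior specifications in human-readable form
priors = data.frame(prior = c("normal(0,1)", "student_t(3,0,1)"))

# parse_dist() adds columns describing each distribution: .dist holds the
# R-style distribution name and .args holds the parsed arguments.
parsed = priors %>%
  parse_dist(prior)

# Assumption: this ggdist version also adds a .dist_obj column that can be
# mapped to xdist.
p = parsed %>%
  ggplot(aes(y = prior, xdist = .dist_obj)) +
  stat_gradientinterval()

# print(p) draws the plot
```

On versions without .dist_obj, mapping aes(dist = .dist, args = .args) is the older equivalent described in the paragraph on character dist values.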
A ggplot2::Stat representing a gradient + interval geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
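The recommendation to map a computed variable directly, rather than going through f, can be sketched as follows. This assumes that f defaults to the PDF, so mapping pdf onto slab_alpha should give a result similar to the default gradient; the data frame mirrors the one in the Examples.

```r
library(ggplot2)
library(ggdist)
library(distributional)

dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

# Map the computed `pdf` variable straight onto the slab opacity instead of
# relying on the deprecated `f` variable.
p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval(aes(slab_alpha = after_stat(pdf)))

# print(p) draws the plot
```

Swapping in after_stat(cdf) or after_stat(1 - cdf) would produce a gradient based on the CDF or CCDF instead.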
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right", justification is set to 0; when side is "bottom"/"left", justification is set to 1; and when side is "both", justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size also determines the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_gradientinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval()
Equivalent to stat_slabinterval(), whose default settings create half-eye (density + interval) plots.
stat_halfeye(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
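As a rough illustration of the parse_dist() workflow described above, the sketch below parses two made-up prior specifications and plots them. The data frame `priors` and its contents are hypothetical; the example assumes that parse_dist() adds `.dist`, `.args`, and `.dist_obj` columns, and it maps `.dist_obj` to `xdist` rather than using `dist`, in line with the recommendation above.

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specifications written as human-readable strings
priors = data.frame(
  prior = c("normal(0,1)", "normal(1,2)"),
  stringsAsFactors = FALSE
)

# parse_dist() is assumed to add .dist (distribution name), .args (list of
# arguments), and .dist_obj (a distributional object) columns
priors_parsed = parse_dist(priors, prior)

# Map the parsed distribution objects to xdist; y identifies each prior
p = ggplot(priors_parsed, aes(y = prior, xdist = .dist_obj)) +
  stat_halfeye()
```

Printing `p` draws one half-eye per prior. The same parsed columns could in principle be passed through the older `dist` and `args` aesthetics instead.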
A ggplot2::Stat representing a half-eye (density + interval) geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
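The computed variables listed above can be used with after_stat(), as in the minimal sketch below. The sample data and the choice of scale are illustrative assumptions, not part of the original documentation. The first plot colors the slab by `level`; the second maps `1 - cdf` to `thickness` instead of relying on the deprecated `f` variable.

```r
library(ggplot2)
library(ggdist)

# Illustrative sample data: three groups of normal draws
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Fill each part of the slab by the smallest interval that contains it;
# parts of the slab outside every interval have level NA, so NA is dropped
# from the legend
p_level = ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(aes(fill = after_stat(level))) +
  scale_fill_brewer(na.translate = FALSE)

# Map 1 - cdf directly onto thickness to draw a CCDF-shaped slab
p_ccdf = ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(aes(thickness = after_stat(1 - cdf)))
```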
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
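To make the slab, interval, and point aesthetics above more concrete, here is a small sketch that sets a few of them as constants. The sample data and the specific colors and sizes are arbitrary choices for illustration. With side = "both" the slab is mirrored, giving a violin-like shape.

```r
library(ggplot2)
library(ggdist)

# Illustrative sample data: three groups of normal draws
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

p_both = ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(
    side = "both",               # mirror the slab on both sides
    slab_fill = "skyblue",       # override fill for the slab only
    slab_alpha = 0.6,            # override opacity for the slab only
    interval_colour = "gray25",  # override colour for the interval only
    point_size = 3               # set point size directly
  )
```

Because the overrides target one sub-geometry each, the interval and point keep their default fill and opacity while only the slab is lightened.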
See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_halfeye()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_halfeye()
Shortcut version of stat_slabinterval() with geom_slabinterval() for creating histogram + interval plots.
Roughly equivalent to:
stat_slabinterval( density = "histogram" )
stat_histinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  density = "histogram",
  p_limits = c(NA, NA),
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
density |
<function | string> Density estimator for sample data. One of:
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
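The passage above mentions that posterior::rvar() objects can be mapped to xdist or ydist. The sketch below is one way this might look with stat_histinterval(). It assumes the posterior package is installed, uses posterior::rvar_rng() to simulate draws in place of a real model, and stores the rvar in a tibble column named `mu` (a hypothetical name).

```r
library(dplyr)
library(ggplot2)
library(ggdist)
library(posterior)

# Simulate draws for three random variables; in practice these would
# typically come from a fitted model rather than rvar_rng()
set.seed(1234)
mu = rvar_rng(rnorm, 3, mean = c(5, 7, 8), sd = c(1, 1.5, 1))

# One row per group, with the rvar stored as a column
draws_df = tibble(group = c("a", "b", "c"), mu = mu)

# The rvar column is mapped to xdist; each row becomes one histogram + interval
p_rvar = ggplot(draws_df, aes(y = group, xdist = mu)) +
  stat_histinterval()
```

Since an rvar is backed by samples, the histogram density estimator applies to it in the same way it would to samples supplied through the x aesthetic.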
A ggplot2::Stat representing a histogram + interval geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
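As a hedged example of using the computed `x` variable from the list above, the sketch below highlights the part of each histogram that lies above a threshold. The threshold of 7, the colors, and the sample data are arbitrary assumptions made for illustration.

```r
library(ggplot2)
library(ggdist)

# Illustrative sample data: three groups of normal draws
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# after_stat(x > 7) is evaluated on the slab's input values, so the fill
# changes along each histogram where the value crosses the threshold
p_thresh = ggplot(df, aes(x = value, y = group)) +
  stat_histinterval(aes(fill = after_stat(x > 7))) +
  scale_fill_manual(
    values = c("FALSE" = "gray80", "TRUE" = "skyblue"),
    name = "value > 7"
  )
```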
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
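For histograms in particular, the slab outline aesthetics above interact with the outline_bars argument. The following sketch, with arbitrary sample data and colors, gives the slab a visible outline and asks for outlines between individual bars as well.

```r
library(ggplot2)
library(ggdist)

# Illustrative sample data: three groups of normal draws
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# slab_colour makes the slab outline visible; outline_bars = TRUE requests
# outlines between the bars rather than only around the whole histogram
p_outline = ggplot(df, aes(x = value, y = group)) +
  stat_histinterval(
    slab_colour = "gray45",
    slab_linewidth = 0.5,
    outline_bars = TRUE
  )
```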
See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_histinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_histinterval()
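A further sketch, not taken from the original examples, shows how the breaks and align arguments might be used to control binning. It assumes that a scalar `breaks` is read as a number of bins and that the string "center" selects a center-alignment rule; both values are illustrative.

```r
library(ggplot2)
library(ggdist)

# Illustrative sample data: three groups of normal draws
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Request roughly 30 bins per histogram and center-align the bins
# (assumed interpretation of breaks = 30 and align = "center")
p_bins = ggplot(df, aes(x = value, y = group)) +
  stat_histinterval(breaks = 30, align = "center")
```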
Shortcut version of stat_slabinterval() with geom_interval() for creating multiple-interval plots.
Roughly equivalent to:
stat_slabinterval(
  aes(
    colour = after_stat(level),
    size = NULL
  ),
  geom = "interval",
  show_point = FALSE,
  .width = c(0.5, 0.8, 0.95),
  show_slab = FALSE,
  show.legend = NA
)
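In practice the shortcut itself is usually called directly. A minimal sketch on made-up sample data follows; the brewer color scale is an illustrative choice that makes the nested intervals, which are colored by `level` by default, easier to tell apart.

```r
library(ggplot2)
library(ggdist)

# Illustrative sample data: three groups of normal draws
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Draw 50%, 80%, and 95% intervals for each group, colored by level
p_interval = ggplot(df, aes(x = value, y = group)) +
  stat_interval(.width = c(0.5, 0.8, 0.95)) +
  scale_colour_brewer()
```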
stat_interval(
  mapping = NULL,
  data = NULL,
  geom = "interval",
  position = "identity",
  ...,
  .width = c(0.5, 0.8, 0.95),
  point_interval = "median_qi",
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
.width |
<numeric> The |
point_interval |
<function | string> A function from the |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
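The character-vector form of dist described above can be sketched as follows. The data frame `dist_df` and its column names are hypothetical; the example assumes that the name "norm" is resolved to the base R functions pnorm(), qnorm(), and dnorm(), with `arg1` and `arg2` passed as their mean and standard deviation.

```r
library(ggplot2)
library(ggdist)

# Hypothetical table of distribution names and parameters
dist_df = data.frame(
  group = c("a", "b", "c"),
  dist = "norm",
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1),
  stringsAsFactors = FALSE
)

# dist names the distribution; arg1 and arg2 supply its arguments in order
p_named = ggplot(dist_df, aes(y = group, dist = dist, arg1 = mean, arg2 = sd)) +
  stat_interval() +
  scale_colour_brewer()
```

The xdist form with dist_normal(mean, sd) expresses the same distributions and is the approach the text above recommends for new code.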
A ggplot2::Stat representing a multiple-interval geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
or y
: For slabs, the input values to the slab function.
For intervals, the point summary from the interval function. Whether it is x
or y
depends on orientation
xmin
or ymin
: For intervals, the lower end of the interval from the interval function.
xmax
or ymax
: For intervals, the upper end of the interval from the interval function.
.width
: For intervals, the interval width as a numeric value in [0, 1]
.
For slabs, the width of the smallest interval containing that value of the slab.
level
: For intervals, the interval width as an ordered factor.
For slabs, the level of the smallest interval containing that value of the slab.
pdf
: For slabs, the probability density function (PDF).
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the PDF at the point summary; intervals also have pdf_min
and pdf_max
for the PDF at the lower and upper ends of the interval.
cdf
: For slabs, the cumulative distribution function.
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the CDF at the point summary; intervals also have cdf_min
and cdf_max
for the CDF at the lower and upper ends of the interval.
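The computed variables listed above can be mapped onto aesthetics with `after_stat()`. The following sketch maps `level` onto the `color_ramp` aesthetic so that each group keeps its own hue while wider intervals are ramped toward a lighter color. The data frame `df` is invented for illustration, and the use of `layer_data()` to peek at the computed columns is just one way to inspect them.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws per group
df = data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p = ggplot(df, aes(x = value, y = group)) +
  stat_interval(
    # `level` is computed by the stat, so it must be wrapped in after_stat()
    aes(color = group, color_ramp = after_stat(level)),
    .width = c(0.5, 0.8, 0.95)
  )

# Inspect which of the documented computed variables are present
computed = layer_data(p)
intersect(c("x", "xmin", "xmax", ".width", "level"), names(computed))
```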
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"
); or sample data to be summarized
(when orientation = "horizontal"
with sample data).
y
: y position of the geometry (when orientation = "horizontal"
); or sample data to be summarized
(when orientation = "vertical"
with sample data).
weight
: When using samples (i.e. the x
and y
aesthetics, not xdist
or ydist
), optional
weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"
), a
distributional object (e.g. dist_normal()
), or a posterior::rvar()
object. See Details.
args
: Distribution arguments (args
or arg1
, ... arg9
). See Details.
In addition, in their default configuration (paired with geom_interval()
)
the following aesthetics are supported by the underlying geom:
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Deprecated aesthetics
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
See geom_interval()
for the geom underlying this stat.
See stat_slabinterval()
for the stat this shortcut is based on.
Other slabinterval stats:
stat_ccdfinterval()
,
stat_cdfinterval()
,
stat_eye()
,
stat_gradientinterval()
,
stat_halfeye()
,
stat_histinterval()
,
stat_pointinterval()
,
stat_slab()
,
stat_spike()
    library(dplyr)
    library(ggplot2)
    library(distributional)
    library(ggdist)  # provides stat_interval() and theme_ggdist()

    theme_set(theme_ggdist())

    # ON SAMPLE DATA
    set.seed(1234)
    df = data.frame(
      group = c("a", "b", "c"),
      value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
    )
    df %>%
      ggplot(aes(x = value, y = group)) +
      stat_interval() +
      scale_color_brewer()

    # ON ANALYTICAL DISTRIBUTIONS
    dist_df = data.frame(
      group = c("a", "b", "c"),
      mean =  c(  5,   7,   8),
      sd =    c(  1, 1.5,   1)
    )
    # Vectorized distribution types, like distributional::dist_normal()
    # and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
    dist_df %>%
      ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
      stat_interval() +
      scale_color_brewer()
A combination of stat_slabinterval()
and geom_lineribbon()
with sensible defaults
for making line + multiple-ribbon plots. While geom_lineribbon()
is intended for use on data
frames that have already been summarized using a point_interval()
function,
stat_lineribbon()
is intended for use directly on data frames of draws or of
analytical distributions, and will perform the summarization using a point_interval()
function.
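To make the distinction concrete, here is a sketch that draws the same kind of plot both ways: once by summarizing first and passing the result to `geom_lineribbon()`, and once by handing the raw draws to `stat_lineribbon()`. The data frame `draws` is simulated for illustration, and the explicit `fill = ordered(.width)` mapping in the first plot is a choice made for this sketch rather than a documented default.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical draws: 100 samples of y at each of 10 x values
draws = tibble(x = rep(1:10, 100), y = rnorm(1000, x))

# Route 1: summarize with a point_interval() function, then use the geom
summ = draws %>%
  group_by(x) %>%
  median_qi(y, .width = c(0.5, 0.8, 0.95))

p_geom = ggplot(summ, aes(x = x, y = y, ymin = .lower, ymax = .upper)) +
  geom_lineribbon(aes(fill = ordered(.width))) +
  scale_fill_brewer(direction = -1)

# Route 2: let the stat perform the same summarization on the raw draws
p_stat = ggplot(draws, aes(x = x, y = y)) +
  stat_lineribbon(.width = c(0.5, 0.8, 0.95), point_interval = "median_qi") +
  scale_fill_brewer()
```

The first route is handy when the summary table is needed elsewhere (for example, in a report); the second avoids the intermediate data frame.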
Roughly equivalent to:
stat_slabinterval( aes( group = after_stat(level), fill = after_stat(level), order = after_stat(level), size = NULL ), geom = "lineribbon", .width = c(0.5, 0.8, 0.95), show_slab = FALSE, show.legend = NA )
    stat_lineribbon(
      mapping = NULL,
      data = NULL,
      geom = "lineribbon",
      position = "identity",
      ...,
      .width = c(0.5, 0.8, 0.95),
      point_interval = "median_qi",
      orientation = NA,
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE,
      check.aes = TRUE,
      check.param = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
.width |
<numeric> The |
point_interval |
<function | string> A function from the |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist
, ydist
, and dist
can be any distribution object from the distributional
package (dist_normal()
, dist_beta()
, etc)
or can be a posterior::rvar()
object. Since these functions are vectorized,
other columns can be passed directly to them in an aes()
specification; e.g.
aes(dist = dist_normal(mu, sigma))
will work if mu
and sigma
are columns in the
input data frame.
dist
can be a character vector giving the distribution name. Then the arg1
, ... arg9
aesthetics (or args
as a list column) specify distribution arguments. Distribution names
should correspond to R functions that have "p"
, "q"
, and "d"
functions; e.g. "norm"
is a valid distribution name because R defines the pnorm()
, qnorm()
, and dnorm()
functions for Normal distributions.
See the parse_dist()
function for a useful way to generate dist
and args
values from human-readable distribution specs (like "normal(0,1)"
). Such specs are also
produced by other packages (like the brms::get_prior
function in brms); thus,
parse_dist()
combined with the stats described here can help you visualize the output
of those functions.
A ggplot2::Stat representing a line + multiple-ribbon geometry which can
be added to a ggplot()
object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
or y
: For slabs, the input values to the slab function.
For intervals, the point summary from the interval function. Whether it is x
or y
depends on orientation.
xmin
or ymin
: For intervals, the lower end of the interval from the interval function.
xmax
or ymax
: For intervals, the upper end of the interval from the interval function.
.width
: For intervals, the interval width as a numeric value in [0, 1]
.
For slabs, the width of the smallest interval containing that value of the slab.
level
: For intervals, the interval width as an ordered factor.
For slabs, the level of the smallest interval containing that value of the slab.
pdf
: For slabs, the probability density function (PDF).
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the PDF at the point summary; intervals also have pdf_min
and pdf_max
for the PDF at the lower and upper ends of the interval.
cdf
: For slabs, the cumulative distribution function.
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the CDF at the point summary; intervals also have cdf_min
and cdf_max
for the CDF at the lower and upper ends of the interval.
The line+ribbon stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their two sub-geometries: the line and the ribbon.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"
); or sample data to be summarized
(when orientation = "horizontal"
with sample data).
y
: y position of the geometry (when orientation = "horizontal"
); or sample data to be summarized
(when orientation = "vertical"
with sample data).
weight
: When using samples (i.e. the x
and y
aesthetics, not xdist
or ydist
), optional
weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"
), a
distributional object (e.g. dist_normal()
), or a posterior::rvar()
object. See Details.
args
: Distribution arguments (args
or arg1
, ... arg9
). See Details.
In addition, in their default configuration (paired with geom_lineribbon()
)
the following aesthetics are supported by the underlying geom:
Ribbon-specific aesthetics
xmin
: Left edge of the ribbon sub-geometry (if orientation = "horizontal"
).
xmax
: Right edge of the ribbon sub-geometry (if orientation = "horizontal"
).
ymin
: Lower edge of the ribbon sub-geometry (if orientation = "vertical"
).
ymax
: Upper edge of the ribbon sub-geometry (if orientation = "vertical"
).
order
: The order in which ribbons are drawn. Ribbons with the smallest mean value of order
are drawn first (i.e., will be drawn below ribbons with larger mean values of order
). If
order
is not supplied to geom_lineribbon()
, -abs(xmax - xmin)
or -abs(ymax - ymin)
(depending on orientation
) is used, having the effect of drawing the widest (on average)
ribbons on the bottom. stat_lineribbon()
uses order = after_stat(level)
by default,
causing the ribbons generated from the largest .width
to be drawn on the bottom.
Color aesthetics
colour
: (or color
) The color of the line sub-geometry.
fill
: The fill color of the ribbon sub-geometry.
alpha
: The opacity of the line and ribbon sub-geometries.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of line. In ggplot2 < 3.4, was called size
.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc)
Other aesthetics (these work as in standard geom
s)
group
See examples of some of these aesthetics in action in vignette("lineribbon")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
See geom_lineribbon()
for the geom underlying this stat.
Other lineribbon stats:
stat_ribbon()
    library(dplyr)
    library(ggplot2)
    library(distributional)
    library(ggdist)  # provides stat_lineribbon() and theme_ggdist()

    theme_set(theme_ggdist())

    # ON SAMPLE DATA
    set.seed(12345)
    tibble(
      x = rep(1:10, 100),
      y = rnorm(1000, x)
    ) %>%
      ggplot(aes(x = x, y = y)) +
      stat_lineribbon() +
      scale_fill_brewer()

    # ON ANALYTICAL DISTRIBUTIONS
    # Vectorized distribution types, like distributional::dist_normal()
    # and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
    tibble(
      x = 1:10,
      sd = seq(1, 3, length.out = 10)
    ) %>%
      ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
      stat_lineribbon() +
      scale_fill_brewer()
Variant of stat_dots()
for creating blurry dotplots of quantiles. Uses
posterior::mcse_quantile()
to calculate the Monte Carlo Standard Error
of each quantile computed for the dotplot, yielding an se
computed variable
that is by default mapped onto the sd
aesthetic of geom_blur_dots()
.
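To give a feel for the quantity being mapped onto `sd`, the sketch below calls `posterior::mcse_quantile()` directly on a simulated sample. The choice of 20 evenly spaced probabilities via `ppoints()` is an assumption made for illustration; the exact probabilities the stat uses are not stated here.

```r
library(posterior)

set.seed(1234)
x = rnorm(1000)  # hypothetical sample (e.g. posterior draws)

# 20 evenly spaced probabilities: 0.025, 0.075, ..., 0.975
probs = ppoints(20)

# One dot position per quantile, plus the Monte Carlo Standard Error of
# each of those quantile estimates
dots = data.frame(
  prob = probs,
  quantile = unname(quantile(x, probs)),
  se = unname(mcse_quantile(x, probs = probs))
)

head(dots)
```

In a blurry dotplot each row would correspond to one dot, with `se` controlling how much that dot is blurred.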
    stat_mcse_dots(
      mapping = NULL,
      data = NULL,
      geom = "blur_dots",
      position = "identity",
      ...,
      quantiles = NA,
      orientation = NA,
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE,
      check.aes = TRUE,
      check.param = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
quantiles |
<scalar numeric> Number of quantiles to plot in the dotplot. Use |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
The dots family of stats and geoms are similar to ggplot2::geom_dotplot()
but with a number of differences:
Dots geoms act like slabs in geom_slabinterval()
and can be given x positions (or y positions when
in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape
aesthetic (when using the
dotsinterval
family) or the shape
or slab_shape
aesthetic (when using the dots
family)
Stats and geoms in this family include:
geom_dots()
: dotplots on raw data. Ensures the dotplot fits within available space by reducing the size
of the dots automatically (may result in very small dots).
geom_swarm()
and geom_weave()
: dotplots on raw data with defaults intended to create "beeswarm" plots.
Uses side = "both"
by default, and sets the default dot size to the same size as geom_point()
(binwidth = unit(1.5, "mm")
), allowing dots to overlap instead of getting very small.
stat_dots()
: dotplots on raw data, distributional objects, and posterior::rvar()
s
geom_dotsinterval()
: dotplot + interval plots on raw data with already-calculated
intervals (rarely useful directly).
stat_dotsinterval()
: dotplot + interval plots on raw data, distributional objects,
and posterior::rvar()
s (will calculate intervals for you).
geom_blur_dots()
: blurry dotplots that allow the standard deviation of a blur applied to
each dot to be specified using the sd
aesthetic.
stat_mcse_dots()
: blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots()
and stat_dotsinterval()
, when used with the quantiles
argument,
are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty
using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
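A quantile dotplot of this kind might be built as follows. This is only a sketch: the data frame `arrivals` (two hypothetical bus routes with Normal predictive distributions for minutes until arrival) is invented for illustration, and 50 quantiles is an arbitrary choice.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical predictive distributions, one per route
arrivals = data.frame(
  route = c("A", "B"),
  mean = c(10, 14),
  sd = c(2, 4)
)

p = ggplot(arrivals, aes(y = route, xdist = dist_normal(mean, sd))) +
  # Each distribution is represented by 50 dots, so each dot stands for
  # a 1-in-50 chance
  stat_dotsinterval(quantiles = 50) +
  labs(x = "Minutes until arrival")
```

Counting dots to the left of a given time then gives an approximate probability of arrival by that time.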
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist
, ydist
, and dist
can be any distribution object from the distributional
package (dist_normal()
, dist_beta()
, etc)
or can be a posterior::rvar()
object. Since these functions are vectorized,
other columns can be passed directly to them in an aes()
specification; e.g.
aes(dist = dist_normal(mu, sigma))
will work if mu
and sigma
are columns in the
input data frame.
dist
can be a character vector giving the distribution name. Then the arg1
, ... arg9
aesthetics (or args
as a list column) specify distribution arguments. Distribution names
should correspond to R functions that have "p"
, "q"
, and "d"
functions; e.g. "norm"
is a valid distribution name because R defines the pnorm()
, qnorm()
, and dnorm()
functions for Normal distributions.
See the parse_dist()
function for a useful way to generate dist
and args
values from human-readable distribution specs (like "normal(0,1)"
). Such specs are also
produced by other packages (like the brms::get_prior
function in brms); thus,
parse_dist()
combined with the stats described here can help you visualize the output
of those functions.
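Sample-based distributions stored as `posterior::rvar()` objects can be mapped to `xdist` in the same way as distributional objects. The sketch below assumes that `rvar` columns can be stored in a `data.frame()`, and it uses `stat_dots()`, a sibling stat in the same family, rather than `stat_mcse_dots()` itself; the data frame `df` is simulated for illustration.

```r
library(ggplot2)
library(ggdist)
library(posterior)

set.seed(1234)
# Hypothetical data: one rvar per group, each holding draws from a Normal
df = data.frame(
  group = c("a", "b"),
  value = rvar_rng(rnorm, 2, mean = c(0, 2))
)

p = ggplot(df, aes(y = group, xdist = value)) +
  stat_dots(quantiles = 100)
```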
A ggplot2::Stat representing a blurry MCSE dot geometry which can
be added to a ggplot()
object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
or y
: For slabs, the input values to the slab function.
For intervals, the point summary from the interval function. Whether it is x
or y
depends on orientation.
xmin
or ymin
: For intervals, the lower end of the interval from the interval function.
xmax
or ymax
: For intervals, the upper end of the interval from the interval function.
.width
: For intervals, the interval width as a numeric value in [0, 1]
.
For slabs, the width of the smallest interval containing that value of the slab.
level
: For intervals, the interval width as an ordered factor.
For slabs, the level of the smallest interval containing that value of the slab.
pdf
: For slabs, the probability density function (PDF).
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the PDF at the point summary; intervals also have pdf_min
and pdf_max
for the PDF at the lower and upper ends of the interval.
cdf
: For slabs, the cumulative distribution function.
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the CDF at the point summary; intervals also have cdf_min
and cdf_max
for the CDF at the lower and upper ends of the interval.
n
: For slabs, the number of data points summarized into that slab. If the slab was created from
an analytical distribution via the xdist
, ydist
, or dist
aesthetic, n
will be Inf
.
f
: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF),
determined by slab_type
. Instead of using slab_type
to change f
and then mapping f
onto an
aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf
, cdf
, or
1 - cdf
) directly onto the desired aesthetic.
se
: For dots, the Monte Carlo Standard Error of the quantile corresponding to each dot.
The dots+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the dots (aka the slab), the
point, and the interval.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"
); or sample data to be summarized
(when orientation = "horizontal"
with sample data).
y
: y position of the geometry (when orientation = "horizontal"
); or sample data to be summarized
(when orientation = "vertical"
with sample data).
weight
: When using samples (i.e. the x
and y
aesthetics, not xdist
or ydist
), optional
weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"
), a
distributional object (e.g. dist_normal()
), or a posterior::rvar()
object. See Details.
args
: Distribution arguments (args
or arg1
, ... arg9
). See Details.
In addition, in their default configuration (paired with geom_blur_dots()
)
the following aesthetics are supported by the underlying geom:
Dots-specific (aka Slab-specific) aesthetics
sd
: The standard deviation (in data units) of the blur associated with each dot.
order
: The order in which data points are stacked within bins. Can be used to create the effect of
"stacked" dots by ordering dots according to a discrete variable. If omitted (NULL
), the
values of the data points themselves are used to determine stacking order. Only applies when
layout
is "bin"
or "hex"
, as the other layout methods fully determine both x and y positions.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
datatype
: When using composite geoms directly without a stat
(e.g. geom_slabinterval()
), datatype
is used to
indicate which part of the geom a row in the data targets: rows with datatype = "slab"
target the
slab portion of the geometry and rows with datatype = "interval"
target the interval portion of
the geometry. This is set automatically when using ggdist stat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For interval, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
slab_shape
: Override for shape
: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See geom_blur_dots()
for the geom underlying this stat.
See vignette("dotsinterval")
for a variety of examples of use.
Other dotsinterval stats:
stat_dots()
,
stat_dotsinterval()
    library(dplyr)
    library(ggplot2)
    library(ggdist)  # provides stat_mcse_dots() and theme_ggdist()

    theme_set(theme_ggdist())

    set.seed(1234)
    data.frame(x = rnorm(1000)) %>%
      ggplot(aes(x = x)) +
      stat_mcse_dots(quantiles = 100, layout = "weave")
Shortcut version of stat_slabinterval()
with geom_pointinterval()
for
creating point + multiple-interval plots.
Roughly equivalent to:
stat_slabinterval( geom = "pointinterval", show_slab = FALSE )
    stat_pointinterval(
      mapping = NULL,
      data = NULL,
      geom = "pointinterval",
      position = "identity",
      ...,
      point_interval = "median_qi",
      .width = c(0.66, 0.95),
      orientation = NA,
      na.rm = FALSE,
      show.legend = c(size = FALSE),
      inherit.aes = TRUE,
      check.aes = TRUE,
      check.param = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
point_interval |
<function | string> A function from the |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well with
orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional
package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object.
Since these functions are vectorized, other columns can be passed directly to
them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will
work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ...
arg9 aesthetics (or args as a list column) specify distribution arguments.
A distribution name should be the common suffix of R functions prefixed with
"p", "q", and "d"; e.g. "norm" is a valid distribution name because R defines
the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values
from human-readable distribution specs (like "normal(0,1)"). Such specs are also
produced by other packages (like the brms::get_prior function in brms); thus,
parse_dist() combined with the stats described here can help you visualize the
output of those functions.
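As a rough illustration of the parse_dist() workflow described above, the following sketch parses two prior specifications and plots them with stat_pointinterval(). The priors data frame and its prior column are made up for this example; .dist and .args are the default output column names of parse_dist().

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specifications, written as human-readable strings
priors <- data.frame(
  prior = c("normal(0,1)", "normal(0,5)"),
  stringsAsFactors = FALSE
)

# parse_dist() adds a .dist column (distribution name, e.g. "norm")
# and a .args list column (the distribution arguments)
parsed <- parse_dist(priors, prior)

# Map the parsed name and arguments onto the dist and args aesthetics
p <- ggplot(parsed, aes(y = prior, dist = .dist, args = .args)) +
  stat_pointinterval()
p
```

This uses the legacy dist aesthetic because that is what the dist and args columns feed into; the xdist and ydist aesthetics remain the recommended route when a distributional object is available.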
A ggplot2::Stat representing a point + multiple-interval geometry which can
be added to a ggplot() object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the
point summary from the interval function. Whether it is x or y depends on
orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For
slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the
level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If
options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals,
the PDF at the point summary; intervals also have pdf_min and pdf_max for the
PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If
options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals,
the CDF at the point summary; intervals also have cdf_min and cdf_max for the
CDF at the lower and upper ends of the interval.
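One way to see the computed variables listed above is to build the plot and look at the layer data. The sketch below does that with ggplot2::layer_data(); the data frame df and its columns are invented for the illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws for each of three groups
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p <- ggplot(df, aes(x = value, y = group)) +
  stat_pointinterval(.width = c(0.66, 0.95))

# layer_data() returns the data after the stat has run, so the computed
# variables (xmin, xmax, .width, level, ...) show up as columns
computed <- layer_data(p)
head(computed[, c("x", "xmin", "xmax", ".width", "level")])
```

Any of these columns can then be referenced inside aes() through after_stat(), for example after_stat(level).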
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized
(when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized
(when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional
weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a
distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_pointinterval())
the following aesthetics are supported by the underlying geom:
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries.
Use the slab_color, interval_color, or point_color aesthetics (below) to
set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha,
interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth to set the line width of the slab (see below). For the interval, raw
linewidth values are transformed according to the interval_size_domain and
interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size
also determines the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size and not linewidth). Raw
size values are transformed according to the interval_size_domain, interval_size_range,
and fatten_point parameters of the geom (see above). Use the point_size aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype or
interval_linetype aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
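To make the override aesthetics above more concrete, here is a small sketch that sets the point and interval sub-geometries separately as constants. The data frame df and the specific colors and sizes are arbitrary choices for the illustration, not package defaults.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p <- ggplot(df, aes(x = value, y = group)) +
  stat_pointinterval(
    point_colour = "red",        # overrides colour for the point only
    point_size = 3,              # sets point size directly, bypassing fatten_point
    interval_colour = "gray40"   # overrides colour for the interval only
  )
p
```

Because point_size is applied without the interval_size_domain, interval_size_range, and fatten_point transformations, it is the more predictable choice when an exact point size is wanted.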
See geom_pointinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(),
stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(),
stat_slab(), stat_spike()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_pointinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_pointinterval()
A combination of stat_slabinterval() and geom_lineribbon() with sensible defaults
for making multiple-ribbon plots. While geom_lineribbon() is intended for use on data
frames that have already been summarized using a point_interval() function,
stat_ribbon() is intended for use directly on data frames of draws or of
analytical distributions, and will perform the summarization using a point_interval()
function.
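The difference between the two workflows can be sketched as follows: the first plot summarizes the draws by hand with median_qi() and passes the result to geom_lineribbon(), while the second hands the raw draws to stat_ribbon(). The draws data frame is simulated for this illustration only.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical draws: 100 samples of y at each of 10 x positions
draws <- data.frame(x = rep(1:10, each = 100))
draws$y <- rnorm(nrow(draws), mean = draws$x)

# Workflow 1: summarize first, then plot the summary with geom_lineribbon()
summarized <- draws %>%
  group_by(x) %>%
  median_qi(y, .width = c(0.5, 0.8, 0.95))

p_geom <- ggplot(summarized, aes(x = x, y = y, ymin = .lower, ymax = .upper)) +
  geom_lineribbon() +
  scale_fill_brewer()

# Workflow 2: let stat_ribbon() summarize the raw draws itself
p_stat <- ggplot(draws, aes(x = x, y = y)) +
  stat_ribbon(.width = c(0.5, 0.8, 0.95)) +
  scale_fill_brewer()
```

Both plots show ribbons at the same three interval widths; the geom_lineribbon() version also draws the median line, which stat_ribbon() leaves out.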
Roughly equivalent to:
stat_lineribbon( show_point = FALSE )
stat_ribbon(
  mapping = NULL,
  data = NULL,
  geom = "lineribbon",
  position = "identity",
  ...,
  .width = c(0.5, 0.8, 0.95),
  point_interval = "median_qi",
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
.width |
<numeric> The |
point_interval |
<function | string> A function from the |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well with
orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional
package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object.
Since these functions are vectorized, other columns can be passed directly to
them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will
work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ...
arg9 aesthetics (or args as a list column) specify distribution arguments.
A distribution name should be the common suffix of R functions prefixed with
"p", "q", and "d"; e.g. "norm" is a valid distribution name because R defines
the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values
from human-readable distribution specs (like "normal(0,1)"). Such specs are also
produced by other packages (like the brms::get_prior function in brms); thus,
parse_dist() combined with the stats described here can help you visualize the
output of those functions.
A ggplot2::Stat representing a multiple-ribbon geometry which can
be added to a ggplot() object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the
point summary from the interval function. Whether it is x or y depends on
orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For
slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the
level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If
options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals,
the PDF at the point summary; intervals also have pdf_min and pdf_max for the
PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If
options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals,
the CDF at the point summary; intervals also have cdf_min and cdf_max for the
CDF at the lower and upper ends of the interval.
The line+ribbon stats and geoms have a wide variety of aesthetics that control
the appearance of their two sub-geometries: the line and the ribbon.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized
(when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized
(when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional
weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a
distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_lineribbon())
the following aesthetics are supported by the underlying geom:
Ribbon-specific aesthetics
xmin: Left edge of the ribbon sub-geometry (if orientation = "horizontal").
xmax: Right edge of the ribbon sub-geometry (if orientation = "horizontal").
ymin: Lower edge of the ribbon sub-geometry (if orientation = "vertical").
ymax: Upper edge of the ribbon sub-geometry (if orientation = "vertical").
order: The order in which ribbons are drawn. Ribbons with the smallest mean value of order
are drawn first (i.e., will be drawn below ribbons with larger mean values of order). If
order is not supplied to geom_lineribbon(), -abs(xmax - xmin) or -abs(ymax - ymin)
(depending on orientation) is used, having the effect of drawing the widest (on average)
ribbons on the bottom. stat_lineribbon() uses order = after_stat(level) by default,
causing the ribbons generated from the largest .width to be drawn on the bottom.
Color aesthetics
colour: (or color) The color of the line sub-geometry.
fill: The fill color of the ribbon sub-geometry.
alpha: The opacity of the line and ribbon sub-geometries.
fill_ramp: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp() for examples.
Other aesthetics (these work as in standard geoms)
group
See examples of some of these aesthetics in action in vignette("lineribbon").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
See geom_lineribbon() for the geom underlying this stat.
Other lineribbon stats: stat_lineribbon()
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_ribbon() +
  scale_fill_brewer()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_ribbon() +
  scale_fill_brewer()
Shortcut version of stat_slabinterval() with geom_slab() for
creating slab (ridge) plots.
Roughly equivalent to:
stat_slabinterval( aes(size = NULL), geom = "slab", show_point = FALSE, show_interval = FALSE, show.legend = NA )
stat_slab(
  mapping = NULL,
  data = NULL,
  geom = "slab",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  limits = NULL,
  n = waiver(),
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well with
orientation detection. These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional
package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object.
Since these functions are vectorized, other columns can be passed directly to
them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will
work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ...
arg9 aesthetics (or args as a list column) specify distribution arguments.
A distribution name should be the common suffix of R functions prefixed with
"p", "q", and "d"; e.g. "norm" is a valid distribution name because R defines
the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values
from human-readable distribution specs (like "normal(0,1)"). Such specs are also
produced by other packages (like the brms::get_prior function in brms); thus,
parse_dist() combined with the stats described here can help you visualize the
output of those functions.
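The two ways of specifying an analytical distribution described above can be put side by side. This is a sketch only: dist_df and its columns are illustrative, and the second plot uses the legacy dist plus arg1, arg2 form that the text advises against for new code.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical table of Normal distributions, one per group
dist_df <- data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

# Recommended: a vectorized distributional object on the xdist aesthetic
p_object <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab()

# Legacy: a distribution name on dist, with arguments on arg1, arg2, ...
# ("norm" resolves to dnorm(), pnorm(), and qnorm())
p_name <- ggplot(dist_df, aes(y = group, dist = "norm", arg1 = mean, arg2 = sd)) +
  stat_slab()
```

Both plots should show the same three densities; the xdist form has the advantage that the axis is stated explicitly instead of being inferred.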
A ggplot2::Stat representing a slab (ridge) geometry which can
be added to a ggplot() object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the
point summary from the interval function. Whether it is x or y depends on
orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For
slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the
level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If
options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals,
the PDF at the point summary; intervals also have pdf_min and pdf_max for the
PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If
options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals,
the CDF at the point summary; intervals also have cdf_min and cdf_max for the
CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from
an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF),
determined by slab_type. Instead of using slab_type to change f and then mapping f onto an
aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or
1 - cdf) directly onto the desired aesthetic.
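The recommendation above, mapping a computed variable directly instead of going through slab_type and f, might look like the following sketch. The data frame df is simulated for the illustration; thickness is the slab aesthetic documented for geom_slab().

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

# Slab thickness follows the CDF instead of the default density
p_cdf <- ggplot(df, aes(x = value, y = group)) +
  stat_slab(aes(thickness = after_stat(cdf)))

# Slab thickness follows the complementary CDF (1 - CDF)
p_ccdf <- ggplot(df, aes(x = value, y = group)) +
  stat_slab(aes(thickness = after_stat(1 - cdf)))
```

The same pattern works for other aesthetics, for example fill = after_stat(level) to shade a slab by interval membership.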
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized
(when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized
(when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional
weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a
distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slab())
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or
y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation is "horizontal" or
"vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both
sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1,
slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left
justification and 1 indicates top/right justification (depending on orientation). If justification
is NULL (the default), then it is set automatically based on the value of side: when side is
"top"/"right", justification is set to 0; when side is "bottom"/"left", justification
is set to 1; and when side is "both", justification is set to 0.5.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries.
Use the slab_color, interval_color, or point_color aesthetics (below) to
set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha,
interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth to set the line width of the slab (see below). For the interval, raw
linewidth values are transformed according to the interval_size_domain and
interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size
also determines the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size and not linewidth). Raw
size values are transformed according to the interval_size_domain, interval_size_range,
and fatten_point parameters of the geom (see above). Use the point_size aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype or
interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Deprecated aesthetics
slab_size: Use slab_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
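The side and scale aesthetics described above can be tried out with a short sketch like the one below. The data frame df and the chosen values are arbitrary; both aesthetics are set as constants here instead of being mapped to data.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

# Mirrored slabs (violin-like), using 60% of the space allotted to each group
p_violin <- ggplot(df, aes(x = value, y = group)) +
  stat_slab(side = "both", scale = 0.6)

# Slabs drawn below each group's baseline instead of above it
p_below <- ggplot(df, aes(x = value, y = group)) +
  stat_slab(side = "bottom")
```

With side = "both", the automatic justification of 0.5 keeps the slab centered on the group position, which is why no explicit justification is set here.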
See geom_slab() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(),
stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(),
stat_pointinterval(), stat_spike()
library(dplyr) library(ggplot2) library(distributional) theme_set(theme_ggdist()) # ON SAMPLE DATA set.seed(1234) df = data.frame( group = c("a", "b", "c"), value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1)) ) df %>% ggplot(aes(x = value, y = group)) + stat_slab() # ON ANALYTICAL DISTRIBUTIONS dist_df = data.frame( group = c("a", "b", "c"), mean = c( 5, 7, 8), sd = c( 1, 1.5, 1) ) # Vectorized distribution types, like distributional::dist_normal() # and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics dist_df %>% ggplot(aes(y = group, xdist = dist_normal(mean, sd))) + stat_slab() # RIDGE PLOTS # "ridge" plots can be created by expanding the slabs to the limits of the plot # (expand = TRUE), allowing the density estimator to be nonzero outside the # limits of the data (trim = FALSE), and increasing the height of the slabs. data.frame( group = letters[1:3], value = rnorm(3000, 3:1) ) %>% ggplot(aes(y = group, x = value)) + stat_slab(color = "black", expand = TRUE, trim = FALSE, height = 2)
"Meta" stat for computing distribution functions (densities or CDFs) + intervals for use with
geom_slabinterval()
. Useful for creating eye plots, half-eye plots, CCDF bar plots,
gradient plots, histograms, and more. Sample data can be supplied to the x
and y
aesthetics or analytical distributions (in a variety of formats) can be supplied to the
xdist
and ydist
aesthetics.
See Details.
stat_slabinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override the
default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
A highly configurable stat for generating a variety of plots that combine a "slab" that describes a distribution plus a point summary and any number of intervals. Several "shortcut" stats are provided which combine multiple options to create useful geoms, particularly eye plots (a violin plot of density plus interval), half-eye plots (a density plot plus interval), CCDF bar plots (a complementary CDF plus interval), and gradient plots (a density encoded in color alpha plus interval).
The shortcut stats include:
stat_eye()
: Eye plots (violin + interval)
stat_halfeye()
: Half-eye plots (density + interval)
stat_ccdfinterval()
: CCDF bar plots (CCDF + interval)
stat_cdfinterval()
: CDF bar plots (CDF + interval)
stat_gradientinterval()
: Density gradient + interval plots
stat_slab()
: Density plots
stat_histinterval()
: Histogram + interval plots
stat_pointinterval()
: Point + interval plots
stat_interval()
: Interval plots
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist
, ydist
, and dist
can be any distribution object from the distributional
package (dist_normal()
, dist_beta()
, etc)
or can be a posterior::rvar()
object. Since these functions are vectorized,
other columns can be passed directly to them in an aes()
specification; e.g.
aes(dist = dist_normal(mu, sigma))
will work if mu
and sigma
are columns in the
input data frame.
dist
can be a character vector giving the distribution name. Then the arg1
, ... arg9
aesthetics (or args
as a list column) specify distribution arguments. Distribution names
should correspond to R functions that have "p"
, "q"
, and "d"
functions; e.g. "norm"
is a valid distribution name because R defines the pnorm()
, qnorm()
, and dnorm()
functions for Normal distributions.
See the parse_dist()
function for a useful way to generate dist
and args
values from human-readable distribution specs (like "normal(0,1)"
). Such specs are also
produced by other packages (like the brms::get_prior
function in brms); thus,
parse_dist()
combined with the stats described here can help you visualize the output
of those functions.
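As a rough illustration of the parse_dist() workflow described above, the sketch below parses two human-readable prior specs and plots them using the dist and args aesthetics. The data frame, its prior column, and the example specs are assumptions made for illustration; the .dist and .args columns are the ones parse_dist() is expected to add.

```r
library(ggplot2)
library(ggdist)

# hypothetical prior specs, written in the style printed by brms::get_prior()
priors <- data.frame(prior = c("normal(0,1)", "student_t(3,0,2.5)"))

# parse_dist() adds .dist (the distribution name) and .args (a list column
# holding the distribution arguments) to the data frame
parsed <- parse_dist(priors, prior)

# character-vector approach: distribution name in `dist`, arguments in `args`
p <- ggplot(parsed, aes(y = prior, dist = .dist, args = .args)) +
  stat_halfeye()
```

Printing p should draw one half-eye per prior. Passing a distributional object to xdist instead remains the recommended route when orientation detection matters.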
A ggplot2::Stat representing a slab or combined slab+interval geometry which can
be added to a ggplot()
object.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
or y
: For slabs, the input values to the slab function.
For intervals, the point summary from the interval function. Whether it is x
or y
depends on orientation.
xmin
or ymin
: For intervals, the lower end of the interval from the interval function.
xmax
or ymax
: For intervals, the upper end of the interval from the interval function.
.width
: For intervals, the interval width as a numeric value in [0, 1]
.
For slabs, the width of the smallest interval containing that value of the slab.
level
: For intervals, the interval width as an ordered factor.
For slabs, the level of the smallest interval containing that value of the slab.
pdf
: For slabs, the probability density function (PDF).
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the PDF at the point summary; intervals also have pdf_min
and pdf_max
for the PDF at the lower and upper ends of the interval.
cdf
: For slabs, the cumulative distribution function.
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the CDF at the point summary; intervals also have cdf_min
and cdf_max
for the CDF at the lower and upper ends of the interval.
n
: For slabs, the number of data points summarized into that slab. If the slab was created from
an analytical distribution via the xdist
, ydist
, or dist
aesthetic, n
will be Inf
.
f
: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF),
determined by slab_type
. Instead of using slab_type
to change f
and then mapping f
onto an
aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf
, cdf
, or
1 - cdf
) directly onto the desired aesthetic.
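The recommendation to map computed variables directly, rather than going through f and slab_type, might look like the following. This is a minimal sketch: the data frame and its column names are assumptions chosen for illustration.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# hypothetical analytical distributions, one per group
dist_df <- data.frame(group = c("a", "b"), mean = c(5, 7), sd = c(1, 1.5))

# map the complementary CDF (1 - cdf) straight onto slab thickness
# using after_stat(), instead of changing slab_type and mapping f
p <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab(aes(thickness = after_stat(1 - cdf)))
```

Swapping 1 - cdf for pdf or cdf in the after_stat() call selects a different slab function without touching any other argument.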
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"
); or sample data to be summarized
(when orientation = "horizontal"
with sample data).
y
: y position of the geometry (when orientation = "horizontal"
); or sample data to be summarized
(when orientation = "vertical"
with sample data).
weight
: When using samples (i.e. the x
and y
aesthetics, not xdist
or ydist
), optional
weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"
), a
distributional object (e.g. dist_normal()
), or a posterior::rvar()
object. See Details.
args
: Distribution arguments (args
or arg1
, ... arg9
). See Details.
In addition, in their default configuration (paired with geom_slabinterval()
)
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness
: The thickness of the slab at each x
value (if orientation = "horizontal"
) or
y
value (if orientation = "vertical"
) of the slab.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
justification
: Justification of the interval relative to the slab, where 0
indicates bottom/left
justification and 1
indicates top/right justification (depending on orientation
). If justification
is NULL
(the default), then it is set automatically based on the value of side
: when side
is
"top"
/"right"
justification
is set to 0
, when side
is "bottom"
/"left"
justification
is set to 1
, and when side
is "both"
justification
is set to 0.5.
datatype
: When using composite geoms directly without a stat
(e.g. geom_slabinterval()
), datatype
is used to
indicate which part of the geom a row in the data targets: rows with datatype = "slab"
target the
slab portion of the geometry and rows with datatype = "interval"
target the interval portion of
the geometry. This is set automatically when using ggdist stat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal"
).
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal"
).
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical"
).
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color
) The color of the interval and point sub-geometries.
Use the slab_color
, interval_color
, or point_color
aesthetics (below) to
set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill
or point_fill
aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha
,
interval_alpha
, or point_alpha
aesthetics (below) to set sub-geometry opacity separately.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab()
: then
it is the width of the slab). With composite geometries including an interval and slab,
use slab_linewidth
to set the line width of the slab (see below). For intervals, raw
linewidth
values are transformed according to the interval_size_domain
and interval_size_range
parameters of the geom
(see above).
size
: Determines the size of the point. If linewidth
is not provided, size
will
also determine the width of the line used to draw the interval (this allows line width and
point size to be modified together by setting only size
and not linewidth
). Raw
size
values are transformed according to the interval_size_domain
, interval_size_range
,
and fatten_point
parameters of the geom
(see above). Use the point_size
aesthetic
(below) to set sub-geometry size directly without applying the effects of
interval_size_domain
, interval_size_range
, and fatten_point
.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the interval
and the outline of the slab (if it is visible). Use the slab_linetype
or
interval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill
: the fill color of the slab.
slab_colour
: (or slab_color
) Override for colour
/color
: the outline color of the slab.
slab_alpha
: Override for alpha
: the opacity of the slab.
slab_linewidth
: Override for linewidth
: the width of the outline of the slab.
slab_linetype
: Override for linetype
: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color
) Override for colour
/color
: the color of the interval.
interval_alpha
: Override for alpha
: the opacity of the interval.
interval_linetype
: Override for linetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill
: the fill color of the point.
point_colour
: (or point_color
) Override for colour
/color
: the outline color of the point.
point_alpha
: Override for alpha
: the opacity of the point.
point_size
: Override for size
: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth
.
interval_size
: Use interval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
See geom_slabinterval()
for more information on the geom these stats
use by default and some of the options it has.
See vignette("slabinterval")
for a variety of examples of use.
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# EXAMPLES ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c", "c", "c"),
  value = rnorm(2500, mean = c(5, 7, 9, 9, 9), sd = c(1, 1.5, 1, 1, 1))
)

# here are vertical eyes:
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye()

# note the sample size is not automatically incorporated into the
# area of the densities in case one wishes to plot densities against
# a reference (e.g. a prior distribution).
# But you may wish to account for sample size if using these geoms
# for something other than visualizing posteriors; in which case
# you can use after_stat(pdf*n):
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye(aes(thickness = after_stat(pdf*n)))

# EXAMPLES ON ANALYTICAL DISTRIBUTIONS
dist_df = tribble(
  ~group, ~subgroup, ~mean, ~sd,
  "a",    "h",       5,     1,
  "b",    "h",       7,     1.5,
  "c",    "h",       8,     1,
  "c",    "i",       9,     1,
  "c",    "j",       7,     1
)

# Using functions from the distributional package (like dist_normal()) with the
# xdist / ydist aesthetics can lead to more compact/expressive specifications
dist_df %>%
  ggplot(aes(x = group, ydist = dist_normal(mean, sd), fill = subgroup)) +
  stat_eye(position = "dodge")

# using the old character vector + args approach
dist_df %>%
  ggplot(aes(x = group, dist = "norm", arg1 = mean, arg2 = sd, fill = subgroup)) +
  stat_eye(position = "dodge")

# the stat_slabinterval family applies a Jacobian adjustment to densities
# when plotting on transformed scales in order to plot them correctly.
# It determines the Jacobian using symbolic differentiation if possible,
# using stats::D(). If symbolic differentiation fails, it falls back
# to numericDeriv(), which is less reliable; therefore, it is
# advisable to use scale transformation functions that are defined in
# terms of basic math functions so that their derivatives can be
# determined analytically (most of the transformation functions in the
# scales package currently have this property).
# For example, here is a log-Normal distribution plotted on the log
# scale, where it will appear Normal:
data.frame(dist = "lnorm", logmean = log(10), logsd = 2*log(10)) %>%
  ggplot(aes(y = 1, dist = dist, arg1 = logmean, arg2 = logsd)) +
  stat_halfeye() +
  scale_x_log10(breaks = 10^seq(-5,7, by = 2))

# see vignette("slabinterval") for many more examples.
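The Jacobian adjustment mentioned in the example comments can be illustrated in base R. The following is a minimal sketch of the general change-of-variables idea, not ggdist's internal implementation, and all variable names are chosen for illustration. It differentiates the inverse of a log10 transformation symbolically with stats::D() and compares the adjusted log-Normal density with a Normal density on the log10 scale.

```r
# log-Normal parameters from the example: meanlog = log(10), sdlog = 2*log(10)
meanlog <- log(10)
sdlog <- 2 * log(10)

# the scale shows y = log10(x); its inverse is x = 10^y
inverse <- quote(10^y)
# symbolic derivative of the inverse with respect to y
d_inverse <- D(inverse, "y")

# grid of positions on the transformed (log10) scale
y <- seq(-5, 7, length.out = 201)
x <- eval(inverse, list(y = y))
jacobian <- abs(eval(d_inverse, list(y = y)))

# change of variables: density on the y scale = density at x times |dx/dy|
density_on_log_scale <- dlnorm(x, meanlog, sdlog) * jacobian

# on the log10 scale this should coincide with Normal(mean = 1, sd = 2),
# since log10(X) = log(X) / log(10)
matches <- isTRUE(all.equal(density_on_log_scale, dnorm(y, mean = 1, sd = 2)))
```

Because 10^y is written in terms of basic math functions, D() can differentiate it analytically, which is the property the comments recommend looking for in scale transformations.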
Stat for drawing "spikes" (optionally with points on them) at specific points
on a distribution (numerical or determined as a function of the distribution),
intended for annotating stat_slabinterval()
geometries.
stat_spike(
  mapping = NULL,
  data = NULL,
  geom = "spike",
  position = "identity",
  ...,
  at = "median",
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  limits = NULL,
  n = waiver(),
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override the default
connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
at |
<numeric | function | character | list> The points at which to evaluate the PDF and CDF of the distribution. One of:
The values of |
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
limits |
<length-2 numeric> Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param
|
If |
This stat computes slab values (i.e. PDF and CDF values) at specified locations
on a distribution, as determined by the at
parameter.
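To make "slab values at specified locations" concrete, here is a small base R sketch that evaluates the PDF and CDF of a Normal(1, 1) distribution at its quartiles. It only illustrates the kind of values involved; the distribution, the chosen points, and the variable names are assumptions, and this is not how the stat is implemented.

```r
# locations at which to evaluate the distribution: the three quartiles
probs <- c(0.25, 0.5, 0.75)
at <- qnorm(probs, mean = 1, sd = 1)

# PDF and CDF values at those locations, one row per spike
spikes <- data.frame(
  at = at,
  pdf = dnorm(at, mean = 1, sd = 1),
  cdf = pnorm(at, mean = 1, sd = 1)
)
```

Each row corresponds to one spike: its position along the distribution's axis, and the height a spike would need to reach a slab drawn from the pdf (or cdf).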
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist
, ydist
, and dist
can be any distribution object from the distributional
package (dist_normal()
, dist_beta()
, etc)
or can be a posterior::rvar()
object. Since these functions are vectorized,
other columns can be passed directly to them in an aes()
specification; e.g.
aes(dist = dist_normal(mu, sigma))
will work if mu
and sigma
are columns in the
input data frame.
dist
can be a character vector giving the distribution name. Then the arg1
, ... arg9
aesthetics (or args
as a list column) specify distribution arguments. Distribution names
should correspond to R functions that have "p"
, "q"
, and "d"
functions; e.g. "norm"
is a valid distribution name because R defines the pnorm()
, qnorm()
, and dnorm()
functions for Normal distributions.
See the parse_dist()
function for a useful way to generate dist
and args
values from human-readable distribution specs (like "normal(0,1)"
). Such specs are also
produced by other packages (like the brms::get_prior
function in brms); thus,
parse_dist()
combined with the stats described here can help you visualize the output
of those functions.
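The naming convention behind character-vector dist values can be sketched in base R. The helper dist_call() below is hypothetical (it is not part of ggdist); it simply shows how a name like "norm" plus a list of arguments resolves to the "d", "p", and "q" functions.

```r
# hypothetical helper: call the d/p/q function of a named distribution,
# passing x first and then the distribution arguments
dist_call <- function(prefix, dist, x, args) {
  do.call(paste0(prefix, dist), c(list(x), args))
}

# plays the role of arg1 = 0, arg2 = 1 (or args = list(0, 1))
norm_args <- list(0, 1)

density_at_0 <- dist_call("d", "norm", 0, norm_args)    # same call as dnorm(0, 0, 1)
cdf_at_0     <- dist_call("p", "norm", 0, norm_args)    # same call as pnorm(0, 0, 1)
median_value <- dist_call("q", "norm", 0.5, norm_args)  # same call as qnorm(0.5, 0, 1)
```

Any distribution whose d, p, and q functions follow this pattern (for example "lnorm" or "gamma") could be named the same way.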
A ggplot2::Stat representing a spike geometry which can be added to a ggplot()
object.
The spike geom
has a wide variety of aesthetics that control
the appearance of its two sub-geometries: the spike and the point.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"
); or sample data to be summarized
(when orientation = "horizontal"
with sample data).
y
: y position of the geometry (when orientation = "horizontal"
); or sample data to be summarized
(when orientation = "vertical"
with sample data).
weight
: When using samples (i.e. the x
and y
aesthetics, not xdist
or ydist
), optional
weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional
object (e.g. dist_normal()
) or a posterior::rvar()
object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"
), a
distributional object (e.g. dist_normal()
), or a posterior::rvar()
object. See Details.
args
: Distribution arguments (args
or arg1
, ... arg9
). See Details.
In addition, in their default configuration (paired with geom_spike()
)
the following aesthetics are supported by the underlying geom:
Spike-specific (aka Slab-specific) aesthetics
thickness
: The thickness of the slab at each x
value (if orientation = "horizontal"
) or
y
value (if orientation = "vertical"
) of the slab.
side
: Which side to place the slab on. "topright"
, "top"
, and "right"
are synonyms
which cause the slab to be drawn on the top or the right depending on if orientation
is "horizontal"
or "vertical"
. "bottomleft"
, "bottom"
, and "left"
are synonyms which cause the slab
to be drawn on the bottom or the left depending on if orientation
is "horizontal"
or
"vertical"
. "topleft"
causes the slab to be drawn on the top or the left, and "bottomright"
causes the slab to be drawn on the bottom or the right. "both"
draws the slab mirrored on both
sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1
,
slabs that use the maximum range will just touch each other. Default is 0.9
to leave some space
between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
Color aesthetics
colour
: (or color
) The color of the spike and point sub-geometries.
fill
: The fill color of the point sub-geometry.
alpha
: The opacity of the spike and point sub-geometries.
colour_ramp
: (or color_ramp
) A secondary scale that modifies the color
scale to "ramp" to another color. See scale_colour_ramp()
for examples.
fill_ramp
: A secondary scale that modifies the fill
scale to "ramp" to another color. See scale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the spike sub-geometry.
size
: Size of the point sub-geometry.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid"
, "dashed"
, etc) used to draw the spike.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
or y
: For slabs, the input values to the slab function.
For intervals, the point summary from the interval function. Whether it is x
or y
depends on orientation.
xmin
or ymin
: For intervals, the lower end of the interval from the interval function.
xmax
or ymax
: For intervals, the upper end of the interval from the interval function.
.width
: For intervals, the interval width as a numeric value in [0, 1]
.
For slabs, the width of the smallest interval containing that value of the slab.
level
: For intervals, the interval width as an ordered factor.
For slabs, the level of the smallest interval containing that value of the slab.
pdf
: For slabs, the probability density function (PDF).
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the PDF at the point summary; intervals also have pdf_min
and pdf_max
for the PDF at the lower and upper ends of the interval.
cdf
: For slabs, the cumulative distribution function.
If options("ggdist.experimental.slab_data_in_intervals")
is TRUE
:
For intervals, the CDF at the point summary; intervals also have cdf_min
and cdf_max
for the CDF at the lower and upper ends of the interval.
n
: For slabs, the number of data points summarized into that slab. If the slab was created from
an analytical distribution via the xdist
, ydist
, or dist
aesthetic, n
will be Inf
.
f
: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF),
determined by slab_type
. Instead of using slab_type
to change f
and then mapping f
onto an
aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf
, cdf
, or
1 - cdf
) directly onto the desired aesthetic.
at
: For spikes, a character vector of names of the functions or expressions used to determine
the points at which the slab functions were evaluated to create spikes. Values of this computed
variable are determined by the at
parameter; see its description above.
See geom_spike()
for the geom underlying this stat.
See stat_slabinterval()
for the stat this shortcut is based on.
Other slabinterval stats:
stat_ccdfinterval()
,
stat_cdfinterval()
,
stat_eye()
,
stat_gradientinterval()
,
stat_halfeye()
,
stat_histinterval()
,
stat_interval()
,
stat_pointinterval()
,
stat_slab()
library(ggplot2)
library(ggdist)
library(distributional)
library(dplyr)

df = tibble(
  d = c(dist_normal(1), dist_gamma(2,2)),
  g = c("a", "b")
)

# annotate the density at the mode of a distribution
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_slab(aes(xdist = d)) +
  stat_spike(at = "Mode") +
  # need shared thickness scale so that stat_slab and geom_spike line up
  scale_thickness_shared()

# annotate the endpoints of intervals of a distribution
# here we'll use an arrow instead of a point by setting size = 0
arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt"))
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_halfeye(point_interval = mode_hdci) +
  stat_spike(
    at = function(x) hdci(x, .width = .66),
    size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75
  ) +
  scale_thickness_shared()

# annotate quantiles of a sample
set.seed(1234)
data.frame(x = rnorm(1000, 1:2), g = c("a","b")) %>%
  ggplot(aes(x, g)) +
  stat_slab() +
  stat_spike(at = function(x) quantile(x, ppoints(10))) +
  scale_thickness_shared()
library(ggplot2) library(distributional) library(dplyr) df = tibble( d = c(dist_normal(1), dist_gamma(2,2)), g = c("a", "b") ) # annotate the density at the mode of a distribution df %>% ggplot(aes(y = g, xdist = d)) + stat_slab(aes(xdist = d)) + stat_spike(at = "Mode") + # need shared thickness scale so that stat_slab and geom_spike line up scale_thickness_shared() # annotate the endpoints of intervals of a distribution # here we'll use an arrow instead of a point by setting size = 0 arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt")) df %>% ggplot(aes(y = g, xdist = d)) + stat_halfeye(point_interval = mode_hdci) + stat_spike( at = function(x) hdci(x, .width = .66), size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75 ) + scale_thickness_shared() # annotate quantiles of a sample set.seed(1234) data.frame(x = rnorm(1000, 1:2), g = c("a","b")) %>% ggplot(aes(x, g)) + stat_slab() + stat_spike(at = function(x) quantile(x, ppoints(10))) + scale_thickness_shared()
Density, distribution function, quantile function and random generation for the scaled and shifted Student's t distribution, parameterized by degrees of freedom (df), location (mu), and scale (sigma).
dstudent_t(x, df, mu = 0, sigma = 1, log = FALSE)
pstudent_t(q, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)
qstudent_t(p, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)
rstudent_t(n, df, mu = 0, sigma = 1)
x , q
|
vector of quantiles. |
df |
degrees of freedom ( |
mu |
<numeric> Location parameter (median). |
sigma |
<numeric> Scale parameter. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities are
|
p |
vector of probabilities. |
n |
number of observations. If |
dstudent_t gives the density
pstudent_t gives the cumulative distribution function (CDF)
qstudent_t gives the quantile function (inverse CDF)
rstudent_t generates random draws.
The length of the result is determined by n for rstudent_t, and is the maximum of the lengths of the numerical arguments for the other functions.
The numerical arguments other than n are recycled to the length of the result. Only the first elements of the logical arguments are used.
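As an illustration of what "scaled and shifted" means, the four functions can be sketched on top of base R's dt(), pt(), qt(), and rt() using the usual location-scale transformation. This is a minimal sketch under that assumption, not the package's source; the `_sketch` names are hypothetical and chosen to avoid masking the ggdist functions.

```r
# Hypothetical base-R sketches of a location-scale Student's t distribution.
# Assumption: X = mu + sigma * T, where T follows a standard t with `df` degrees of freedom.
dstudent_t_sketch = function(x, df, mu = 0, sigma = 1, log = FALSE) {
  z = (x - mu) / sigma
  # the density picks up a 1/sigma factor from the change of variables
  if (log) dt(z, df = df, log = TRUE) - log(sigma) else dt(z, df = df) / sigma
}
pstudent_t_sketch = function(q, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE) {
  pt((q - mu) / sigma, df = df, lower.tail = lower.tail, log.p = log.p)
}
qstudent_t_sketch = function(p, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE) {
  qt(p, df = df, lower.tail = lower.tail, log.p = log.p) * sigma + mu
}
rstudent_t_sketch = function(n, df, mu = 0, sigma = 1) {
  rt(n, df = df) * sigma + mu
}

# the median of the distribution is mu
qstudent_t_sketch(0.5, df = 5, mu = 2, sigma = 3)

# numerical arguments are recycled to the longest length (here 3)
dstudent_t_sketch(c(0, 1, 2), df = 5, mu = c(0, 1, 2), sigma = 2)
```

Because the standard t distribution is symmetric about zero, mu is both the location and the median, which matches the description of the mu argument above.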
parse_dist() for parsing distribution specs and the stat_slabinterval() family of stats for visualizing them.
library(dplyr)
library(ggplot2)

expand.grid(
  df = c(3,5,10,30),
  scale = c(1,1.5)
) %>%
  ggplot(aes(y = 0, dist = "student_t", arg1 = df, arg2 = 0, arg3 = scale, color = ordered(df))) +
  stat_slab(p_limits = c(.01, .99), fill = NA) +
  scale_y_continuous(breaks = NULL) +
  facet_grid( ~ scale) +
  labs(
    title = "dstudent_t(x, df, 0, sigma)",
    subtitle = "Scale (sigma)",
    y = NULL,
    x = NULL
  ) +
  theme_ggdist() +
  theme(axis.title = element_text(hjust = 0))
These scales allow more specific aesthetic mappings to be made when using geom_slabinterval() and stats/geoms based on it (like eye plots).
scale_point_colour_discrete(..., aesthetics = "point_colour") scale_point_color_discrete(..., aesthetics = "point_colour") scale_point_colour_continuous( ..., aesthetics = "point_colour", guide = guide_colourbar2() ) scale_point_color_continuous( ..., aesthetics = "point_colour", guide = guide_colourbar2() ) scale_point_fill_discrete(..., aesthetics = "point_fill") scale_point_fill_continuous( ..., aesthetics = "point_fill", guide = guide_colourbar2() ) scale_point_alpha_continuous(..., range = c(0.1, 1)) scale_point_alpha_discrete(..., range = c(0.1, 1)) scale_point_size_continuous(..., range = c(1, 6)) scale_point_size_discrete(..., range = c(1, 6), na.translate = FALSE) scale_interval_colour_discrete(..., aesthetics = "interval_colour") scale_interval_color_discrete(..., aesthetics = "interval_colour") scale_interval_colour_continuous( ..., aesthetics = "interval_colour", guide = guide_colourbar2() ) scale_interval_color_continuous( ..., aesthetics = "interval_colour", guide = guide_colourbar2() ) scale_interval_alpha_continuous(..., range = c(0.1, 1)) scale_interval_alpha_discrete(..., range = c(0.1, 1)) scale_interval_size_continuous(..., range = c(1, 6)) scale_interval_size_discrete(..., range = c(1, 6), na.translate = FALSE) scale_interval_linetype_discrete(..., na.value = "blank") scale_interval_linetype_continuous(...) 
scale_slab_colour_discrete(..., aesthetics = "slab_colour") scale_slab_color_discrete(..., aesthetics = "slab_colour") scale_slab_colour_continuous( ..., aesthetics = "slab_colour", guide = guide_colourbar2() ) scale_slab_color_continuous( ..., aesthetics = "slab_colour", guide = guide_colourbar2() ) scale_slab_fill_discrete(..., aesthetics = "slab_fill") scale_slab_fill_continuous( ..., aesthetics = "slab_fill", guide = guide_colourbar2() ) scale_slab_alpha_continuous( ..., limits = function(l) c(min(0, l[[1]]), l[[2]]), range = c(0, 1) ) scale_slab_alpha_discrete(..., range = c(0.1, 1)) scale_slab_size_continuous(..., range = c(1, 6)) scale_slab_size_discrete(..., range = c(1, 6), na.translate = FALSE) scale_slab_linewidth_continuous(..., range = c(1, 6)) scale_slab_linewidth_discrete(..., range = c(1, 6), na.translate = FALSE) scale_slab_linetype_discrete(..., na.value = "blank") scale_slab_linetype_continuous(...) scale_slab_shape_discrete(..., solid = TRUE) scale_slab_shape_continuous(...) guide_colourbar2(...) guide_colorbar2(...)
... |
Arguments passed to underlying scale or guide functions. E.g. |
aesthetics |
<character> Names of aesthetics to set scales for. |
guide |
|
range |
<length-2 numeric> The minimum and maximum size of the plotting symbol after transformation. |
na.translate |
<scalar logical> In discrete scales, should we show missing values? |
na.value |
<linetype> When |
limits |
One of:
|
solid |
Should the shapes be solid, |
The following additional scales / aesthetics are defined for use with geom_slabinterval() and related geoms:
scale_point_color_*: Point color
scale_point_fill_*: Point fill color
scale_point_alpha_*: Point alpha level / opacity
scale_point_size_*: Point size
scale_interval_color_*: Interval line color
scale_interval_alpha_*: Interval alpha level / opacity
scale_interval_linetype_*: Interval line type
scale_slab_color_*: Slab outline color
scale_slab_fill_*: Slab fill color
scale_slab_alpha_*: Slab alpha level / opacity. The default settings of scale_slab_alpha_continuous differ from scale_alpha_continuous() and are designed for gradient plots (e.g. stat_gradientinterval()) by ensuring that densities of 0 get mapped to 0 in the output.
scale_slab_linewidth_*: Slab outline line width
scale_slab_linetype_*: Slab outline line type
scale_slab_shape_*: Slab dot shape (for geom_dotsinterval())
See the corresponding scale documentation in ggplot for more information; e.g. scale_color_discrete(), scale_color_continuous(), etc.
Other scale functions can be used with the aesthetics/scales defined here by using the aesthetics argument to that scale function. For example, to use color brewer scales with the point_color aesthetic:
scale_color_brewer(..., aesthetics = "point_color")
With continuous color scales, you may also need to provide a guide as the default guide does not work properly; this is what guide_colorbar2 is for:
scale_color_distiller(..., guide = "colorbar2", aesthetics = "point_color")
These scales have been deprecated:
scale_interval_size_*: Use scale_linewidth_*
scale_slab_size_*: Use scale_slab_linewidth_*
A ggplot2::Scale representing one of the aesthetics used to target the appearance of specific parts of composite ggdist geoms. Can be added to a ggplot() object.
Matthew Kay
Other ggplot2 scales: scale_color_discrete(), scale_color_continuous(), etc.
Other ggdist scales: scale_colour_ramp, scale_side_mirrored(), scale_thickness
library(dplyr)
library(ggplot2)

# This plot shows how to set multiple specific aesthetics
# NB it is very ugly and is only for demo purposes.
data.frame(distribution = "Normal(1,2)") %>%
  parse_dist(distribution) %>%
  ggplot(aes(y = distribution, xdist = .dist, args = .args)) +
  stat_halfeye(
    shape = 21,  # this point shape has a fill and outline
    point_color = "red",
    point_fill = "black",
    point_alpha = .1,
    point_size = 6,
    stroke = 2,
    interval_color = "blue",
    # interval line widths are scaled from [1, 6] onto [0.6, 1.4] by default
    # see the interval_size_range parameter in help("geom_slabinterval")
    linewidth = 8,
    interval_linetype = "dashed",
    interval_alpha = .25,
    # fill sets the fill color of the slab (here the density)
    slab_color = "green",
    slab_fill = "purple",
    slab_linewidth = 3,
    slab_linetype = "dotted",
    slab_alpha = .5
  )
This is a sub-guide intended for annotating the thickness and dot-count subscales in ggdist. It can be used with the subguide parameter of geom_slabinterval() and geom_dotsinterval().
Supports automatic partial function application with waived arguments.
subguide_axis(
  values,
  title = NULL,
  breaks = waiver(),
  labels = waiver(),
  position = 0,
  just = 0,
  label_side = "topright",
  orientation = "horizontal",
  theme = theme_get()
)
subguide_inside(..., label_side = "inside")
subguide_outside(..., label_side = "outside", just = 1)
subguide_integer(..., breaks = scales::breaks_extended(Q = c(1, 5, 2, 4, 3)))
subguide_count(..., breaks = scales::breaks_width(1))
subguide_slab(values, ...)
subguide_dots(values, ...)
subguide_spike(values, ...)
values |
<numeric> Values used to construct the scale used for this guide.
Typically provided automatically by |
title |
<string> The title of the scale shown on the sub-guide's axis. |
breaks |
One of:
|
labels |
One of:
|
position |
<scalar numeric> Value between |
just |
<scalar numeric> Value between |
label_side |
<string> Which side of the axis to draw the ticks and labels on.
|
orientation |
<string> Orientation of the geometry this sub-guide is for. One
of |
theme |
<theme> Theme used to determine the style that the
sub-guide elements are drawn in. The title label is drawn using the
|
... |
Arguments passed to other functions, typically back to
|
subguide_inside() is a shortcut for drawing labels inside of the chart region.
subguide_outside() is a shortcut for drawing labels outside of the chart region.
subguide_integer() only draws breaks that are integer values, useful for labeling counts in geom_dots().
subguide_count() is a shortcut for drawing labels where every whole number is labeled, useful for labeling counts in geom_dots(). If your max count is large, subguide_integer() may be better.
subguide_slab(), subguide_dots(), and subguide_spike() are aliases for subguide_none() that allow you to change the default subguide used for the geom_slabinterval(), geom_dotsinterval(), and geom_spike() families. If you overwrite these in the global environment, you can set the corresponding default subguide. For example:
subguide_slab = ggdist::subguide_inside(position = "right")
This will cause geom_slabinterval()s to default to having a guide on the right side of the geom.
The thickness datatype.
The thickness aesthetic of geom_slabinterval().
scale_thickness_shared(), for setting a thickness scale across all geometries using the thickness aesthetic.
subscale_thickness(), for setting a thickness sub-scale within a single geom_slabinterval().
Other sub-guides: subguide_none()
library(ggplot2)
library(distributional)

df = data.frame(d = dist_normal(2:3, 2:3), g = c("a", "b"))

# subguides allow you to label thickness axes
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = "inside")

# they respect normalization and use of scale_thickness_shared()
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = "inside", normalize = "groups")

# they can also be positioned outside the plot area, though
# this typically requires manually adjusting plot margins
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = subguide_outside(title = "density", position = "right")) +
  theme(plot.margin = margin(5.5, 50, 5.5, 5.5))

# any of the subguide types will also work to indicate bin counts in
# geom_dots(); subguide_integer() and subguide_count() can be useful for
# dotplots as they only label integers / whole numbers:
df = data.frame(d = dist_gamma(2:3, 2:3), g = c("a", "b"))
ggplot(df, aes(xdist = d, y = g)) +
  stat_dots(subguide = subguide_count(label_side = "left", title = "count")) +
  scale_y_discrete(expand = expansion(add = 0.1)) +
  scale_x_continuous(expand = expansion(add = 0.5))
This is a blank sub-guide that omits annotations for the thickness and dot-count sub-scales in ggdist. It can be used with the subguide parameter of geom_slabinterval() and geom_dotsinterval().
Supports automatic partial function application with waived arguments.
subguide_none(values, ...)
values |
<numeric> Values used to construct the scale used for this guide.
Typically provided automatically by |
... |
ignored. |
Other sub-guides: subguide_axis()
This is an identity sub-scale for the thickness aesthetic in ggdist. It returns its input as a thickness vector without rescaling. It can be used with the subscale parameter of geom_slabinterval().
subscale_identity(x)
x |
<numeric> Vector to be rescaled.
Typically provided automatically by |
A thickness vector of the same length as x, with infinite values in x squished into the data range.
Other sub-scales: subscale_thickness()
This is a sub-scale intended for adjusting the scaling of the thickness aesthetic at a geometry (or sub-geometry) level in ggdist. It can be used with the subscale parameter of geom_slabinterval().
Supports automatic partial function application with waived arguments.
subscale_thickness( x, limits = function(l) c(min(0, l[1]), l[2]), expand = c(0, 0) )
x |
<numeric> Vector to be rescaled.
Typically provided automatically by |
limits |
<length-2 numeric | function | NULL> One of:
|
expand |
<numeric> Vector of limit expansion constants of length
2 or 4, following the same format used by the |
You can overwrite subscale_thickness in the global environment to set the default properties of the thickness subscale. For example:
subscale_thickness = ggdist::subscale_thickness(expand = expansion(c(0, 0.05)))
This will cause geom_slabinterval()s to default to a thickness subscale that expands by 5% at the top of the scale. Always prefix such a definition with ggdist:: to avoid infinite loops caused by recursion.
A thickness vector of the same length as x scaled to be between 0 and 1.
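To make the roles of limits and expand concrete, here is a rough base-R sketch of the kind of rescaling described above. It is an assumption-laden illustration rather than the package's implementation: `subscale_thickness_sketch` is a hypothetical name, it only handles a function-valued limits, it returns a plain numeric vector instead of a thickness vector, and it assumes expand follows the numeric layout produced by ggplot2::expansion() (lower mult, lower add, upper mult, upper add).

```r
# Hypothetical sketch: rescale raw slab values onto [0, 1] after
# applying a limits function and expansion constants.
subscale_thickness_sketch = function(
  x,
  limits = function(l) c(min(0, l[1]), l[2]),
  expand = c(0, 0)
) {
  lim = limits(range(x[is.finite(x)]))
  # a length-2 `expand` is treated as (mult, add) applied to both ends
  if (length(expand) == 2) expand = rep(expand, 2)
  width = lim[2] - lim[1]
  lower = lim[1] - expand[1] * width - expand[2]
  upper = lim[2] + expand[3] * width + expand[4]
  (x - lower) / (upper - lower)
}

d = dnorm(seq(-3, 3, length.out = 101))

# default: the lower limit is pulled down to 0 and the maximum maps to 1
range(subscale_thickness_sketch(d))

# expanding the upper limit by 5% leaves headroom above the maximum
range(subscale_thickness_sketch(d, expand = c(0, 0, 0.05, 0)))
```

With the 5% upper expansion the largest value maps to 1/1.05 rather than 1, which is what leaves room for a break just above the data maximum.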
The thickness datatype.
The thickness aesthetic of geom_slabinterval().
scale_thickness_shared(), for setting a thickness scale across all geometries using the thickness aesthetic.
Other sub-scales: subscale_identity()
library(ggplot2)
library(distributional)

df = data.frame(d = dist_normal(2:3, 1), g = c("a", "b"))

# breaks on thickness subguides are always limited to the bounds of the
# subscale, which may leave labels off near the edge of the subscale
# (e.g. here `0.4` is omitted because the max value is approx `0.39`)
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(
    subguide = "inside"
  )

# We can use the subscale to expand the upper limit of the thickness scale
# by 5% (similar to the default for positional scales), allowing bounds near
# (but just less than) the limit, like `0.4`, to be shown.
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(
    subguide = "inside",
    subscale = subscale_thickness(expand = expansion(c(0, 0.05)))
  )
A simple, relatively minimalist ggplot2 theme, and some helper functions to go with it.
theme_ggdist(
  base_size = 11,
  base_family = "",
  base_line_size = base_size/22,
  base_rect_size = base_size/22
)
theme_tidybayes(
  base_size = 11,
  base_family = "",
  base_line_size = base_size/22,
  base_rect_size = base_size/22
)
facet_title_horizontal()
axis_titles_bottom_left()
facet_title_left_horizontal()
facet_title_right_horizontal()
base_size |
base font size, given in pts. |
base_family |
base font family |
base_line_size |
base size for line elements |
base_rect_size |
base size for rect elements |
This is a relatively minimalist ggplot2 theme, intended to be used for making publication-ready plots. It is currently based on ggplot2::theme_light().
A word of warning: this theme may (and very likely will) change in the future as I tweak it to my taste.
theme_ggdist() and theme_tidybayes() are aliases.
A named list in the format of ggplot2::theme()
Matthew Kay
ggplot2::theme(), ggplot2::theme_set()
library(ggplot2)

theme_set(theme_ggdist())
A representation of the thickness of a slab: a scaled value (x) where 0 is the base of the slab and 1 is its maximum extent, and the lower (lower) and upper (upper) limits of the slab values in their original data units.
thickness(x = double(), lower = NA_real_, upper = NA_real_)
x |
<coercible-to-numeric> A numeric vector or an object
coercible to a numeric (via |
lower |
<numeric> The original lower bounds of thickness values before scaling.
May be |
upper |
<numeric> The original upper bounds of thickness values before scaling.
May be |
This datatype is used by scale_thickness_shared() and subscale_thickness() to represent numeric()-like objects marked as being in units of slab "thickness".
Unlike regular numeric()s, thickness() values mapped onto the thickness aesthetic are not rescaled by scale_thickness_shared() or geom_slabinterval(). In most cases thickness() is not useful directly, though it can be used to mark values that should not be rescaled; see the definitions of stat_ccdfinterval() and stat_gradientinterval() for some example usages.
thickness objects with unequal lower or upper limits may not be combined. However, thickness objects with NA limits may be combined with thickness objects with non-NA limits. This allows (e.g.) specifying locations on the thickness scale that are independent of data limits.
A vctrs::rcrd of class "ggdist_thickness" with fields "x", "lower", and "upper".
Matthew Kay
The thickness aesthetic of geom_slabinterval().
scale_thickness_shared(), for setting a thickness scale across all geometries using the thickness aesthetic.
subscale_thickness(), for setting a thickness sub-scale within a single geom_slabinterval().
thickness(0:1)
thickness(0:1, 0, 10)
These functions translate ggdist/tidybayes-style data frames to/from different data frame formats (each format using a different naming scheme for its columns).
to_broom_names(data)
from_broom_names(data)
to_ggmcmc_names(data)
from_ggmcmc_names(data)
data |
<data.frame> A data frame to translate. |
Functions prefixed with to_ translate from the ggdist/tidybayes format to another format; functions prefixed with from_ translate from that format back to the ggdist/tidybayes format. Formats include:
to_broom_names() / from_broom_names():
.variable <-> term
.value <-> estimate
.prediction <-> .fitted
.lower <-> conf.low
.upper <-> conf.high
to_ggmcmc_names() / from_ggmcmc_names():
.chain <-> Chain
.iteration <-> Iteration
.variable <-> Parameter
.value <-> value
A data frame with (possibly) new names in some columns, according to the translation scheme described in Details.
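The translation amounts to a column rename driven by a lookup table. The following base-R sketch illustrates the idea for the broom mapping listed in Details; `rename_columns_sketch` and `broom_map` are hypothetical names introduced here, and the package functions may handle cases this sketch does not.

```r
# Column-name mapping from the Details section (ggdist/tidybayes name -> broom name).
broom_map = c(
  .variable = "term", .value = "estimate", .prediction = ".fitted",
  .lower = "conf.low", .upper = "conf.high"
)

# Hypothetical helper: rename any columns of `data` whose names appear in `map`;
# columns not mentioned in `map` are left untouched.
rename_columns_sketch = function(data, map) {
  hit = names(data) %in% names(map)
  names(data)[hit] = unname(map[names(data)[hit]])
  data
}

df = data.frame(
  .variable = c("a", "b"), .value = c(1, 2),
  .lower = c(0, 1), .upper = c(2, 3), .width = 0.95
)

broom_df = rename_columns_sketch(df, broom_map)
names(broom_df)

# reversing the lookup table translates back to the original names
reverse_map = setNames(names(broom_map), broom_map)
names(rename_columns_sketch(broom_df, reverse_map))
```

Keeping the mapping as data (a named character vector) means the inverse translation is just the same helper applied with names and values swapped.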
Matthew Kay
library(dplyr)

data(RankCorr_u_tau, package = "ggdist")

df = RankCorr_u_tau %>%
  dplyr::rename(.variable = i, .value = u_tau) %>%
  group_by(.variable) %>%
  median_qi(.value)

df

df %>%
  to_broom_names()
A flag indicating that the default value of an argument should be used.
waiver()
A waiver() is a flag passed to a function argument that indicates the function should use the default value of that argument. It is used in two cases:
ggplot2 functions use it to distinguish between "nothing" (NULL) and a default value calculated elsewhere (waiver()).
ggdist turns ggplot2's convention into a standardized method of argument-passing: any named argument with a default value in an automatically partially-applied function can be passed waiver() when calling the function. This will cause the default value (or the most recently partially-applied value) of that argument to be used instead.
Note: due to historical limitations, waiver() cannot currently be used on arguments to the point_interval() family of functions.
auto_partial(), ggplot2::waiver()
f = auto_partial(function(x, y = "b") {
  c(x = x, y = y)
})

f("a")

# uses the default value of `y` ("b")
f("a", y = waiver())

# partially apply `f`
g = f(y = "c")
g

# uses the last partially-applied value of `y` ("c")
g("a", y = waiver())
A variation of ecdf() that can be applied to weighted samples.
weighted_ecdf(x, weights = NULL, na.rm = FALSE)
x |
<numeric> Sample values. |
weights |
<numeric | NULL> Weights for the sample. One of:
|
na.rm |
<scalar logical> If |
Generates a weighted empirical cumulative distribution function, $F(x)$. Given $x$, a sorted vector (derived from x), and $w_i$, the corresponding weight for $x_i$, $F(x)$ is a step function with steps at each $x_i$ with $F(x_i)$ equal to the sum of all weights up to and including $w_i$.
weighted_ecdf() returns a function of class "weighted_ecdf", which also inherits from the stepfun() class. Thus, it also has plot() and print() methods. Like ecdf(), weighted_ecdf() also provides a quantile() method, which dispatches to weighted_quantile().
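The step-function construction described above can be sketched with base R's stepfun(). This is a minimal illustration, not the package's code: `weighted_ecdf_sketch` is a hypothetical name, it assumes the weights are normalized to sum to 1 so that the function ends at 1, and it does not handle missing values.

```r
# Hypothetical sketch of a weighted ECDF built on stats::stepfun().
weighted_ecdf_sketch = function(x, weights = NULL) {
  if (is.null(weights)) weights = rep(1, length(x))
  # collapse tied sample values so each step location is unique
  ux = sort(unique(x))
  w = vapply(ux, function(v) sum(weights[x == v]), numeric(1))
  w = w / sum(w)  # assumption: normalize so the steps sum to 1
  # stepfun() takes one more height than knots: the function is 0 left of the first step
  stepfun(ux, c(0, cumsum(w)))
}

Fw = weighted_ecdf_sketch(1:3, weights = 1:3)

# steps of height 1/6, 2/6, and 3/6 at x = 1, 2, 3
Fw(c(0.5, 1, 2, 2.5, 3))
```

With equal weights the sketch reduces to the ordinary ECDF, since each step then has height 1/n.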
weighted_ecdf(1:3, weights = 1:3)
plot(weighted_ecdf(1:3, weights = 1:3))
quantile(weighted_ecdf(1:3, weights = 1:3), 0.4)
A variation of quantile() that can be applied to weighted samples.
weighted_quantile(
  x,
  probs = seq(0, 1, 0.25),
  weights = NULL,
  n = NULL,
  na.rm = FALSE,
  names = TRUE,
  type = 7,
  digits = 7
)
weighted_quantile_fun(x, weights = NULL, n = NULL, na.rm = FALSE, type = 7)
x |
<numeric> Sample values. |
probs |
<numeric> Vector of probabilities in |
weights |
<numeric | NULL> Weights for the sample. One of:
|
n |
<scalar numeric> Presumed effective sample size. If this is greater than 1 and
continuous quantiles (
|
na.rm |
<scalar logical> If |
names |
<scalar logical> If |
type |
<scalar integer> Value between 1 and 9: determines the type of quantile estimator to be used. Types 1 to 3 are for discontinuous quantiles, types 4 to 9 are for continuous quantiles. See Details. |
digits |
<scalar numeric> The number of digits to use to format percentages
when |
Calculates weighted quantiles using a variation of the quantile types based on a generalization of quantile().
Type 1–3 (discontinuous) quantiles are directly a function of the inverse CDF as a step function, and so can be directly translated to the weighted case using the natural definition of the weighted ECDF as the cumulative sum of the normalized weights.
Type 4–9 (continuous) quantiles require some translation from the definitions in quantile(). quantile() defines continuous estimators in terms of $x_k$, which is the $k$th order statistic, and $p_k$, which is a function of $k$ and $n$ (the sample size). In the weighted case, we instead take $x_k$ as the $k$th smallest value of $x$ in the weighted sample (not necessarily an order statistic, because of the weights). Then we can re-write the formulas for $p_k$ in terms of $F(x_k)$ (the empirical CDF at $x_k$, i.e. the cumulative sum of normalized weights) and $f(x_k)$ (the normalized weight at $x_k$), by using the fact that, in the unweighted case, $k = F(x_k) \cdot n$ and $1/n = f(x_k)$.
Then the quantile function (inverse CDF) is the piece-wise linear function defined by the points $(p_k, x_k)$.
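The rewriting can be worked through type by type. The left-hand forms below are the standard plotting positions for quantile() types 4 to 9 (as given in the quantile() documentation); the right-hand forms are what results from dividing numerator and denominator by $n$ and substituting $k/n = F(x_k)$ and $1/n = f(x_k)$. This is a derivation from those two identities, offered as a sketch rather than a quotation of the package's formulas.

```latex
\begin{aligned}
\text{Type 4:}\quad p_k &= \frac{k}{n} = F(x_k) \\
\text{Type 5:}\quad p_k &= \frac{k - 0.5}{n} = F(x_k) - \frac{f(x_k)}{2} \\
\text{Type 6:}\quad p_k &= \frac{k}{n + 1} = \frac{F(x_k)}{1 + f(x_k)} \\
\text{Type 7:}\quad p_k &= \frac{k - 1}{n - 1} = \frac{F(x_k) - f(x_k)}{1 - f(x_k)} \\
\text{Type 8:}\quad p_k &= \frac{k - 1/3}{n + 1/3} = \frac{F(x_k) - f(x_k)/3}{1 + f(x_k)/3} \\
\text{Type 9:}\quad p_k &= \frac{k - 3/8}{n + 1/4} = \frac{F(x_k) - 3 f(x_k)/8}{1 + f(x_k)/4}
\end{aligned}
```

In each case, setting all weights equal recovers the unweighted definition, because then $F(x_k) = k/n$ and $f(x_k) = 1/n$.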
weighted_quantile() returns a numeric vector of length(probs) with the estimate of the corresponding quantile from probs.
weighted_quantile_fun() returns a function that takes a single argument, a vector of probabilities, which itself returns the corresponding quantile estimates. It may be useful when weighted_quantile() needs to be called repeatedly for the same sample, re-using some pre-computation.
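As a concrete illustration of the continuous case, the sketch below implements only the Type 7 translation, $p_k = (F(x_k) - f(x_k)) / (1 - f(x_k))$, in base R and interpolates linearly through the points $(p_k, x_k)$. It is a simplified sketch: `weighted_quantile7_sketch` is a hypothetical name, it assumes at least two observations with positive weights, and it ignores the n (effective sample size), na.rm, and names arguments.

```r
# Hypothetical sketch of a weighted Type 7 quantile estimator.
weighted_quantile7_sketch = function(x, probs, weights = NULL) {
  if (is.null(weights)) weights = rep(1, length(x))
  ord = order(x)
  x_k = x[ord]
  f_k = weights[ord] / sum(weights)  # normalized weight at x_k
  F_k = cumsum(f_k)                  # weighted ECDF at x_k
  p_k = (F_k - f_k) / (1 - f_k)      # Type 7 positions rewritten via F and f
  # piece-wise linear inverse CDF through the points (p_k, x_k);
  # rule = 2 clamps probabilities outside [min(p_k), max(p_k)] to the end values
  approx(p_k, x_k, xout = probs, rule = 2, ties = "ordered")$y
}

x = c(3, 1, 4, 1.5, 9)
p = c(0, 0.25, 0.5, 0.9, 1)

# with equal weights this agrees with stats::quantile(type = 7)
weighted_quantile7_sketch(x, p)
unname(quantile(x, p, type = 7))

# with unequal weights, more mass on larger values pulls quantiles upward
weighted_quantile7_sketch(1:3, 0.4, weights = 1:3)
```

Precomputing `p_k` and `x_k` once and reusing them for many `probs` is the kind of saving the function-returning variant described above is meant to offer.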