Random distributions
For maximum flexibility when producing random values, we define the
Distribution
trait:
use rand::{Rng, distributions::DistIter};
// a producer of data of type T:
pub trait Distribution<T> {
// the key function:
fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T;
// a convenience function defined using sample:
fn sample_iter<R>(self, rng: R) -> DistIter<Self, R, T>
where
Self: Sized,
R: Rng,
{
// [has a default implementation]
todo!()
}
}
Implementations of Distribution
are probability distribution: mappings
from events to probabilities (e.g. for a die roll P(x = i) = ⅙
or for a Normal
distribution with mean μ=0
, P(x > 0) = ½
).
Note that although probability distributions all have properties such as a mean,
a Probability Density Function, and can be sampled by inverting the Cumulative
Density Function, here we only concern ourselves with sampling random values.
If you require use of such properties you may prefer to use the statrs
crate.
Rand provides implementations of many different distributions; we cover the most
common of these here, but for full details refer to the distributions
module
and the rand_distr
crate.
Uniform distributions
The most obvious type of distribution is the one we already discussed: one where each equally-sized sub-range has equal chance of containing the next sample. This is known as uniform.
Rand actually has several variants of this, representing different ranges:
Standard
requires no parameters and samples values uniformly according to the type.Rng::gen
provides a short-cut to this distribution.Uniform
is parametrised byUniform::new(low, high)
(includinglow
, excludinghigh
) orUniform::new_inclusive(low, high)
(including both), and samples values uniformly within this range.Rng::gen_range
is a convenience method defined overUniform::sample_single
, optimised for single-sample usage.Alphanumeric
is uniform over thechar
values0-9A-Za-z
.Open01
andOpenClosed01
are provide alternate sampling ranges for floating-point types (see below).
Uniform sampling by type
Lets go over the distributions by type:
-
For
bool
,Standard
samples each value with probability 50%. -
For
Option<T>
, theStandard
distribution samplesNone
with probability 50%, otherwiseSome(value)
is sampled, according to its type. -
For integers (
u8
through tou128
,usize
, andi*
variants),Standard
samples from all possible values whileUniform
samples from the parameterised range. -
For
NonZeroU8
and other "non-zero" types,Standard
samples uniformly from all non-zero values (rejection method). -
Wrapping<T>
integer types are sampled as for the corresponding integer type by theStandard
distribution. -
For floats (
f32
,f64
),Standard
samples from the half-open range[0, 1)
with 24 or 53 bits of precision (forf32
andf64
respectively)OpenClosed01
samples from the half-open range(0, 1]
with 24 or 53 bits of precisionOpen01
samples from the open range(0, 1)
with 23 or 52 bits of precisionUniform
samples from a given range with 23 or 52 bits of precision
-
For the
char
type, theStandard
distribution samples from all available Unicode code points, uniformly; many of these values may not be printable (depending on font support). TheAlphanumeric
samples from only a-z, A-Z and 0-9 uniformly. -
For tuples and arrays, each element is sampled as above, where supported. The
Standard
andUniform
distributions each support a selection of these types (up to 12-tuples and 32-element arrays). This includes the empty tuple()
and array. When usingrustc
≥ 1.51, enable themin_const_gen
feature to support arrays larger than 32 elements. -
For SIMD types, each element is sampled as above, for
Standard
andUniform
(for the latter,low
andhigh
parameters are also SIMD types, effectively sampling from multiple ranges simultaneously). SIMD support requires using thesimd_support
feature flag and nightlyrustc
. -
For enums, you have to implement uniform sampling yourself. For example, you could use the following approach:
#![allow(unused)] fn main() { use rand::{Rng, distributions::{Distribution, Standard}}; pub enum Food { Burger, Pizza, Kebab, } impl Distribution<Food> for Standard { fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Food { let index: u8 = rng.gen_range(0..3); match index { 0 => Food::Burger, 1 => Food::Pizza, 2 => Food::Kebab, _ => unreachable!(), } } } }
Non-uniform distributions
The rand
crate provides only two non-uniform distributions:
- The
Bernoulli
distribution simply generates a boolean where the probability of samplingtrue
is some constant (Bernoulli::new(0.5)
) or ratio (Bernoulli::from_ratio(1, 6)
). - The
WeightedIndex
distribution may be used to sample from a sequence of weighted values. See the Sequences section.
Many more non-uniform distributions are provided by the rand_distr
crate.
Integers
The Binomial
distribution is related to the Bernoulli
in that it
models running n
independent trials each with probability p
of success,
then counts the number of successes.
Note that for large n
the Binomial
distribution's implementation is
much faster than sampling n
trials individually.
The Poisson
distribution expresses the expected number of events
occurring within a fixed interval, given that events occur with fixed rate λ.
Poisson
distribution sampling generates Float
values because Float
s
are used in the sampling calculations, and we prefer to defer to the user on
integer types and the potentially lossy and panicking associated conversions.
For example, u64
values can be attained with rng.sample(Poisson) as u64
.
Note that out of range float to int conversions with as
result in undefined
behavior for Rust <1.45 and a saturating conversion for Rust >=1.45.
Continuous non-uniform distributions
Continuous distributions model samples drawn from the real number line ℝ, or in
some cases a point from a higher dimension (ℝ², ℝ³, etc.). We provide
implementations for f64
and for f32
output in most cases, although currently
the f32
implementations simply reduce the precision of an f64
sample.
The exponential distribution, Exp
, simulates time until decay, assuming a
fixed rate of decay (i.e. exponential decay).
The Normal
distribution (also known as Gaussian) simulates sampling from
the Normal distribution ("Bell curve") with the given mean and standard
deviation. The LogNormal
is related: for sample X
from the log-normal
distribution, log(X)
is normally distributed; this "skews" the normal
distribution to avoid negative values and to have a long positive tail.
The UnitCircle
and UnitSphere
distributions simulate uniform
sampling from the edge of a circle or surface of a sphere.
The Cauchy
distribution (also known as the Lorentz distribution) is the
distribution of the x-intercept of a ray from point (x0, γ)
with uniformly
distributed angle.
The Beta
distribution is a two-parameter probability distribution, whose
output values lie between 0 and 1. The Dirichlet
distribution is a
generalisation to any positive number of parameters.