Random distributions
For maximum flexibility when producing random values, we define the
Distribution trait:
use rand::Rng;

// a producer of data of type T:
pub trait Distribution<T> {
    // the key function:
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T;

    // a convenience function defined using sample:
    fn sample_iter<R>(self, rng: R) -> rand::distr::Iter<Self, R, T>
    where
        Self: Sized,
        R: Rng,
    {
        // [has a default implementation]
        todo!()
    }
}
Implementations of Distribution are probability distributions: mappings
from events to probabilities (e.g. for a die roll P(x = i) = ⅙ or for a Normal
distribution with mean μ=0, P(x > 0) = ½).
Note that although probability distributions have further properties, such as a
mean or a Probability Density Function, and can in principle be sampled by
inverting the Cumulative Distribution Function, here we only concern ourselves
with sampling random values. If you require such properties, you may prefer to
use the statrs crate.
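As an illustration of the inversion method mentioned above, here is a minimal sketch for the exponential distribution with rate λ, whose CDF F(x) = 1 − e^(−λx) inverts to x = −ln(1 − u)/λ. The function name exp_by_inversion is hypothetical, the uniform value u in [0, 1) is supplied by the caller (in practice it would come from an RNG), and this is not necessarily how Rand's own distributions are implemented.

```rust
/// Maps a uniform value `u` in [0, 1) to an exponentially distributed
/// sample with rate `lambda` by inverting the CDF F(x) = 1 - exp(-lambda * x).
/// (Hypothetical helper for illustration; not part of the rand API.)
pub fn exp_by_inversion(u: f64, lambda: f64) -> f64 {
    assert!((0.0..1.0).contains(&u), "u must lie in [0, 1)");
    assert!(lambda > 0.0, "lambda must be positive");
    // ln_1p(-u) computes ln(1 - u) accurately even for small u.
    -(-u).ln_1p() / lambda
}
```

Small values of u map to samples near zero, while values close to 1 map far into the tail.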
Rand provides implementations of many different distributions; we cover the most
common of these here, but for full details refer to the distr module
and the rand_distr crate.
Uniform distributions
The most obvious type of distribution is the one we already discussed: one where each equally-sized sub-range has equal chance of containing the next sample. This is known as uniform.
Rand actually has several variants of this, representing different ranges:
- StandardUniform requires no parameters and samples values uniformly according to the type. Rng::random provides a short-cut to this distribution.
- Uniform is parametrised by Uniform::new(low, high) (including low, excluding high) or Uniform::new_inclusive(low, high) (including both), and samples values uniformly within this range. Rng::random_range is a convenience method defined over Uniform::sample_single, optimised for single-sample usage.
- Alphanumeric is uniform over the ASCII characters 0-9, A-Z and a-z.
- Open01 and OpenClosed01 provide alternate sampling ranges for floating-point types (see below).
Uniform sampling by type
Let's go over the distributions by type:
- For bool, StandardUniform samples each value with probability 50%.
- For Option<T>, the StandardUniform distribution samples None with probability 50%, otherwise Some(value) is sampled, according to its type.
- For integers (u8 through to u128, usize, and i* variants), StandardUniform samples from all possible values while Uniform samples from the parameterised range.
- For NonZeroU8 and other "non-zero" types, StandardUniform samples uniformly from all non-zero values (rejection method).
- Wrapping<T> integer types are sampled as for the corresponding integer type by the StandardUniform distribution.
- For floats (f32, f64):
  - StandardUniform samples from the half-open range [0, 1) with 24 or 53 bits of precision (for f32 and f64 respectively)
  - OpenClosed01 samples from the half-open range (0, 1] with 24 or 53 bits of precision
  - Open01 samples from the open range (0, 1) with 23 or 52 bits of precision
  - Uniform samples from a given range with 23 or 52 bits of precision
- For the char type, the StandardUniform distribution samples from all available Unicode code points, uniformly; many of these values may not be printable (depending on font support). The Alphanumeric distribution samples from only a-z, A-Z and 0-9 uniformly.
- For tuples and arrays, each element is sampled as above, where supported. The StandardUniform and Uniform distributions each support a selection of these types (up to 12-tuples and 32-element arrays). This includes the empty tuple () and array. When using rustc ≥ 1.51, enable the min_const_gen feature to support arrays larger than 32 elements.
- For SIMD types, each element is sampled as above, for StandardUniform and Uniform (for the latter, low and high parameters are also SIMD types, effectively sampling from multiple ranges simultaneously). SIMD support requires using the simd_support feature flag and nightly rustc.
- For enums, you have to implement uniform sampling yourself. For example, you could use the following approach:
    use rand::{Rng, distr::{Distribution, StandardUniform}};

    pub enum Food {
        Burger,
        Pizza,
        Kebab,
    }

    impl Distribution<Food> for StandardUniform {
        fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Food {
            // Pick one of the three variants with equal probability.
            let index: u8 = rng.random_range(0..3);
            match index {
                0 => Food::Burger,
                1 => Food::Pizza,
                2 => Food::Kebab,
                _ => unreachable!(),
            }
        }
    }
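The float ranges and precisions listed above can be made concrete with a small sketch. It assumes 64 raw random bits are available as a u64 and shows one common multiply-based construction for f64; the function names are hypothetical and Rand's actual implementation may differ in detail.

```rust
/// 2^-53, the spacing between the 2^53 equally spaced outputs below.
const SCALE_53: f64 = 1.0 / (1u64 << 53) as f64;
/// 2^-52, the spacing used for the open range.
const SCALE_52: f64 = 1.0 / (1u64 << 52) as f64;

/// Half-open range [0, 1): keep the top 53 bits, then scale.
pub fn half_open(bits: u64) -> f64 {
    (bits >> 11) as f64 * SCALE_53
}

/// Half-open range (0, 1]: as above, shifted up by one step.
pub fn open_closed(bits: u64) -> f64 {
    ((bits >> 11) + 1) as f64 * SCALE_53
}

/// Open range (0, 1): keep the top 52 bits and offset by half a step.
pub fn open(bits: u64) -> f64 {
    ((bits >> 12) as f64 + 0.5) * SCALE_52
}
```

Excluding both end points costs one bit in this construction, which is consistent with the 52 bits of precision quoted for the open range.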
Non-uniform distributions
The rand crate provides only two non-uniform distributions:
- The Bernoulli distribution simply generates a boolean where the probability of sampling true is some constant (Bernoulli::new(0.5)) or ratio (Bernoulli::from_ratio(1, 6)).
- The WeightedIndex distribution may be used to sample from a sequence of weighted values. See the Sequences section.
Many more non-uniform distributions are provided by the rand_distr crate.
Integers
The Binomial distribution is related to the Bernoulli in that it
models running n independent trials each with probability p of success,
then counts the number of successes.
Note that for large n the Binomial distribution's implementation is
much faster than sampling n trials individually.
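To make the relationship with Bernoulli trials concrete, the following sketch spells out the naive trial-by-trial method. The helper count_successes is hypothetical and takes its uniform draws in [0, 1) from the caller rather than from an RNG; it is the slow baseline, not the algorithm used by Binomial.

```rust
/// Counts successes among Bernoulli trials with success probability `p`.
/// Each item of `uniform_draws` is assumed to be an independent uniform
/// value in [0, 1); a trial succeeds when its draw falls below `p`.
/// (Hypothetical helper for illustration; not part of the rand API.)
pub fn count_successes<I>(p: f64, uniform_draws: I) -> u64
where
    I: IntoIterator<Item = f64>,
{
    assert!((0.0..=1.0).contains(&p), "p must lie in [0, 1]");
    // One comparison per trial, so the cost grows linearly with n.
    uniform_draws.into_iter().filter(|&u| u < p).count() as u64
}
```

For example, with p = 0.5 the draws 0.1, 0.7, 0.4 and 0.5 give two successes, since only 0.1 and 0.4 fall below p.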
The Poisson distribution models the number of events occurring within a fixed
interval, given that events occur independently at a fixed average rate λ.
Poisson distribution sampling generates floating-point values because floats
are used in the sampling calculations, and we prefer to defer to the user on
integer types and the potentially lossy and panicking associated conversions.
For example, u64 values can be obtained with rng.sample(poisson) as u64, where
poisson is a Poisson distribution constructed with the desired λ.
Note that out-of-range float-to-int conversions with as result in undefined
behavior for Rust <1.45 and a saturating conversion for Rust >=1.45.
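The conversion behaviour described above can be checked directly. The sketch below assumes Rust 1.45 or later; both function names are hypothetical. It contrasts the saturating as cast with a checked helper that refuses values the cast would silently alter.

```rust
/// Converts a float sample to `u64`, returning `None` where an `as` cast
/// would silently change the value (NaN, negative, fractional, or too large).
/// (Hypothetical helper for illustration; not part of rand or rand_distr.)
pub fn to_count(x: f64) -> Option<u64> {
    // 2^64 as f64; every valid u64 value is strictly below this bound.
    const LIMIT: f64 = 18_446_744_073_709_551_616.0;
    if x >= 0.0 && x < LIMIT && x.fract() == 0.0 {
        Some(x as u64)
    } else {
        None
    }
}

/// Plain `as` cast: truncates towards zero, saturates at the bounds and
/// maps NaN to 0 (Rust >= 1.45).
pub fn to_count_saturating(x: f64) -> u64 {
    x as u64
}
```

Which variant is appropriate depends on whether a clamped value is acceptable in the calling code or should be treated as an error.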
Continuous non-uniform distributions
Continuous distributions model samples drawn from the real number line ℝ, or in
some cases a point from a higher dimension (ℝ², ℝ³, etc.). We provide
implementations for f64 and for f32 output in most cases, although currently
the f32 implementations simply reduce the precision of an f64 sample.
The exponential distribution, Exp, simulates time until decay, assuming a
fixed rate of decay (i.e. exponential decay).
The Normal distribution (also known as Gaussian, or the "bell curve") is
parameterised by its mean and standard deviation. The LogNormal is related:
for sample X from the log-normal distribution, log(X) is normally distributed;
this "skews" the normal distribution to avoid negative values and to have a
long positive tail.
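The relationship between the two distributions can be sketched as a one-line transformation. The helper below is hypothetical and takes an already sampled standard normal value z as input; here mu and sigma describe the underlying normal distribution of log(X), not the mean and deviation of X itself.

```rust
/// Turns a standard normal sample `z` (mean 0, standard deviation 1) into a
/// log-normal sample whose logarithm has mean `mu` and standard deviation `sigma`.
/// (Hypothetical helper for illustration; not part of rand_distr.)
pub fn log_normal_from_standard(z: f64, mu: f64, sigma: f64) -> f64 {
    // mu + sigma * z is normally distributed; exponentiating it yields a
    // strictly positive value with a long right tail.
    (mu + sigma * z).exp()
}
```

Taking the natural logarithm of the result recovers mu + sigma * z, which is the defining property stated above.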
The UnitCircle and UnitSphere distributions simulate uniform
sampling from the edge of a circle or surface of a sphere.
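One simple way to picture uniform sampling from a circle's edge is to pick a uniformly distributed angle. The sketch below does this for a caller-supplied uniform value u in [0, 1); the function name is hypothetical, and this is not necessarily the method UnitCircle itself uses.

```rust
/// Maps a uniform value `u` in [0, 1) to a point on the unit circle by
/// treating it as a fraction of a full turn.
/// (Hypothetical helper for illustration; not part of rand_distr.)
pub fn point_on_unit_circle(u: f64) -> [f64; 2] {
    let angle = std::f64::consts::TAU * u;
    [angle.cos(), angle.sin()]
}
```

The same idea does not carry over directly to a sphere, where choosing two angles uniformly would over-sample the poles.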
The Cauchy distribution (also known as the Lorentz distribution) is the
distribution of the x-intercept of a ray from point (x0, γ) with uniformly
distributed angle.
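The geometric description above translates directly into a formula: a ray leaving (x0, γ) at angle θ from the vertical meets the x-axis at x0 + γ·tan(θ). The sketch below assumes a caller-supplied uniform value u in (0, 1) and uses a hypothetical function name; it is not necessarily how Cauchy is implemented in rand_distr.

```rust
/// Maps a uniform value `u` in (0, 1) to a Cauchy sample with median `x0`
/// and scale `gamma`: a ray leaves the point (x0, gamma) at an angle drawn
/// uniformly from (-pi/2, pi/2) to the vertical and meets the x-axis at
/// the returned position.
/// (Hypothetical helper for illustration; not part of rand_distr.)
pub fn cauchy_from_uniform(u: f64, x0: f64, gamma: f64) -> f64 {
    let angle = std::f64::consts::PI * (u - 0.5);
    x0 + gamma * angle.tan()
}
```

Angles near ±π/2 send the intercept arbitrarily far from x0, which is where the distribution's heavy tails come from.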
The Beta distribution is a two-parameter probability distribution, whose
output values lie between 0 and 1. The Dirichlet distribution is a
generalisation to any positive number of parameters.