My favourite programming language Haskell has a nice library for doing calculations with probablility distributions, called probability.
The library was originally written by Martin Ewig and Steve Kollmansberger and then extended and packaged for convenient installation by Henning Thieleman. Its main idea for the implementation is to use the fact that “probability distributions form a monad” (from a paper by the authors of the library ). This post aims at translating this claim into standard probability notation and showing how it is used in the library.
Monad is a notion in category theory which is defined as a monoid in the category of endofunctors. That sounds so cool that I really wish I knew what it means. The only thing I know now is that the “monoid” here is not the standard monoid from mathematics founded on set theory, but rather its category theory analogue. So what does the claim “probability distributions form a monad” mean Haskell-wise?
Let’s load the library and play with it.
In the probability library distributions are (essentially) represented as lists of pairs (element, probability).
There are a couple of functions defined that provide means of generating some often used distributions. For example “choose p a b” will generate a distribution that assigns probability p to a value a and (1-p) to b.
Prelude Numeric.Probability.Distribution> choose 0.3 0 1 Loading package array-0.1.0.0 ... linking ... done. Loading package containers-0.1.0.1 ... linking ... done. Loading package mtl-184.108.40.206 ... linking ... done. Loading package old-locale-220.127.116.11 ... linking ... done. Loading package old-time-18.104.22.168 ... linking ... done. Loading package random-22.214.171.124 ... linking ... done. Loading package probability-0.2.1 ... linking ... done. fromFreqs [(1,0.7),(0,0.3)]
Another example is the function “uniform” that assigns the same probability to all elements of a list.
Prelude Numeric.Probability.Distribution> uniform [0,1,2,3] fromFreqs [(0,0.25),(1,0.25),(2,0.25),(3,0.25)]
There is also a function “certainly” that constructs a point mass
Prelude Numeric.Probability.Distribution> certainly 5 fromFreqs [(5,1)]
Now what about that monad thing? Let’s think about distributions on a given finite set as nonnegative functions that sum up to 1 and let be the set of such distributions. So, we have . Suppose now we have two finite sets . We can define a binary operation that takes a distribution on and a function that assigns a distribution on to every element of , and creates a distribution on in a (quite) natural way:
for any and .
By taking the sum over all it is easy to check that is indeed a distribution on . It can be interpreted as the distribution of the result of selecting first according to distribution and then selecting from the set according to the distribution given by .
We can use the “” (corresponding to Haskell’s “>>=”) operation to construct joint distributions. For example to generate o model of two coin tosses we can do
Prelude Numeric.Probability.Distribution> (choose 0.5 0 1) >>= (\x -> choose 0.5 (x,0) (x,1)) fromFreqs [((0,0),0.25),((0,1),0.25),((1,0),0.25),((1,1),0.25)]
Now suppose we have three finite sets and corresponding operations in
that we will both denote . Then for all , and we have the identity , where is defined as .
Note that if then is a distribution on and for .
Let’s expand both sides to see that the identity really holds. Fix . We want to see that the values of the distributions and are the same on .
The left hand side evaluates to
and on the right hand side we have
So they are the same.
The fact that we have such quasi-associativity (plus two other conditions that are trivial to check) allows to claim that the type of distributions forms a monad. In Haskell this opens up a whole world of possibilities. There are hundreds of functions related to monads in Haskell libraries. One can assume that each one does something useful when applied to probabilistic calculations and the only problem is to figure out what that useful thing is.
For example the function “replicateM” creates distributions of sequences of independent identically distributed random variables:
Prelude Numeric.Probability.Distribution> :m + Control.Monad Prelude Numeric.Probability.Distribution Control.Monad> replicateM 4 (choose 0.5 0 1) fromFreqs [([0,0,0,0],6.25e-2),([0,0,0,1],6.25e-2),([0,0,1,0],6.25e-2), ([0,0,1,1],6.25e-2),([0,1,0,0],6.25e-2),([0,1,0,1],6.25e-2),([0,1,1,0], 6.25e-2),([0,1,1,1],6.25e-2),([1,0,0,0],6.25e-2),([1,0,0,1],6.25e-2), ([1,0,1,0],6.25e-2),([1,0,1,1],6.25e-2),([1,1,0,0],6.25e-2), ([1,1,0,1],6.25e-2),([1,1,1,0],6.25e-2),([1,1,1,1],6.25e-2)]
One can define a function that calculates the sequence of expectations of maximum of i.i.d random variables with common distribution as follows:
expmaxseq n d = [D.expected $ D.map maximum $ replicateM k d | k <- [1..n]]
For throwing dice we get the following expectations of maximum for 1,2,…, 7 dice:
ProbLib Numeric.Probability.Distribution> expmaxseq 7 (uniform [1..6]) [3.5,4.472222222222223,4.958333333333335,5.2445987654321105, 5.4309413580246915,5.560292352536542,5.65411736969124]