monads of probability, measures, and valuations


Idea

In many categorical approaches to probability theory, one considers a category of spaces, such as measurable spaces or topological spaces, and equips this category with a monad whose functor part assigns to each space X a space P X of measures, probability measures, or valuations on X, or a variation thereof.

For probability theory, this can be interpreted as adding to the points of a space X new “random points”, modelled as probability measures or valuations. The old points, which we can think of as deterministic, are embedded in P X via the unit of the monad X \to P X. Likewise, the Kleisli morphisms of P can be seen as stochastic maps. (Monads can be seen as ways of extending our spaces and functions to account for new phenomena; see for example extension system and monad in computer science.) Note that these probability measures are technically different from random elements: they correspond rather to the laws of the random elements.

Algebras of probability and measure monads can be interpreted as generalized convex spaces or conical spaces of a certain kind. For probability theory, in particular, the algebras of a probability monad can be seen as spaces equipped with a notion of expectation value of a random variable.

The details vary depending on the monad and on the category under consideration.

Many choices of categories and of monads are possible, depending on which aspects of measure theory or probability one wants to study. See the table below for more details.

Note that “probability monad” and “measure monad” are not used here as technical terms with a definition; we just explain what most monads of this kind look like. (There is ongoing work which may give a general, precise definition of probability monad.) The term “probability monad” was coined by Giry (see here), to refer to what we today call the Giry monad.

Functor, unit and multiplication

The functor: random outcomes

The basic idea, which holds for all monads of this sort, is that of forming spaces of random elements, or rather of their laws.

Consider a category C, whose objects we can think of as “spaces of outcomes” with possibly extra structure, such as measurable spaces, or topological spaces, or even just sets. A probability monad assigns to each space X a space P X which we can think of as a space of “laws of random outcomes on X” or of other measure-like entities. For example, for the distribution monad on Set, P X is the set of finitely supported probability distributions on X, and for the Giry monad on Meas, P X is the space of probability measures on X.

(See also the table below.)

A possible interpretation of probability monads, which uses the interpretation of monads as encoding generalized theories, is that P X is a space of “formal generalized convex combinations” (in the sense of convex space) of points of X. For instance, for the distribution monad, given the set \{heads, tails\} of possible outcomes of a coin flip, the set P X contains for example the distribution that assigns probability 1/2 to both outcomes. This can be thought of as the formal convex combination

\frac{1}{2}\, heads + \frac{1}{2}\, tails.

On the morphisms, the functor gives for example the pushforward of measures.
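For instance, for the distribution monad the pushforward of a finitely supported distribution p along a function f : X \to Y is given by

(f_* p)(y) \;=\; \sum_{x \in f^{-1}(y)} p(x) ,

and for the Giry monad the pushforward measure is given on measurable sets B \subseteq Y by (f_* p)(B) = p(f^{-1}(B)).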

The unit: deterministic random variables

The unit of the monad gives, for each space of outcomes, a map X \to P X, which we can think of as picking out “those outcomes which are not really random”, or “Dirac deltas” (see Dirac measure and Dirac valuation).

These can be seen as the laws of a deterministic random variable.
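For the Giry monad, for example, the unit sends a point x \in X to the Dirac measure \delta_x, given on measurable sets A \subseteq X by

\delta_x(A) \;=\; \begin{cases} 1 & x \in A \\ 0 & x \notin A . \end{cases}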

The multiplication: mixing probability measures

The multiplication of the monad is a map P P X \to P X for all objects X. It can be thought of as “mixing” or “averaging” probability measures.

The following example is taken from Perrone ‘19, Example 5.1.2. Suppose that you have two coins in your pocket. Suppose that one coin is fair, with “heads” on one face and “tails” on the other face; suppose the second coin has “heads” on both sides. Suppose now that you draw a coin randomly, and flip it. We can sketch the probabilities in the following way. Let X be the set \{heads, tails\}. A coin gives a law according to which we will obtain “heads” or “tails”, so it determines an element of P X. Since the choice of coin is also random (we also have a law on the coins), the law on the coins determines an element of P P X. By averaging, the resulting overall probabilities are

\frac{1}{2}\Big(\frac{1}{2}\, heads + \frac{1}{2}\, tails\Big) + \frac{1}{2}\,\big(1\, heads\big) \;=\; \frac{3}{4}\, heads + \frac{1}{4}\, tails .

In other words, the “average” or “mixture” can be thought of as an assignment E : P P X \to P X, from laws of “random random variables” to laws of ordinary random variables.
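For the Giry monad, for instance, the multiplication sends \xi \in P P X to the probability measure E(\xi) \in P X given on measurable sets A \subseteq X by

E(\xi)(A) \;=\; \int_{P X} p(A) \, d\xi(p) ;

for the distribution monad the integral becomes a finite sum, E(\pi)(x) = \sum_{p} \pi(p)\, p(x).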

Algebras: expectation values

According to the interpretation of probability monads in terms of “formal generalized convex combinations”, the algebras of a probability monad are spaces equipped with a way of evaluating these expressions to a result, which we can see as a generalized “midpoint”, “average”, or “expectation value”. (Expectation values play a very important role in probability: probability monads can encode them via their algebras.)
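For example, a convex subset A of a real vector space carries an algebra structure for the distribution monad, given by the map e : P A \to A which takes a finitely supported distribution p to its expectation value,

e(p) \;=\; \sum_{x} p(x)\, x .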

In other words, algebras of probability monads can be thought of as generalizations of convex sets, depending on the actual category and monad in question. For example, the algebras of the Radon monad are the compact convex subsets of locally convex topological vector spaces, and the algebras of the Kantorovich monad are the closed convex subsets of Banach spaces.

(See also the table below.)

If the measures are not required to be normalized, instead of (generalized) convex combinations one should think of linear combinations with non-negative coefficients, and the algebras are then generalizations of conical spaces instead.
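In this case the formal expressions evaluated by an algebra are of the form

\lambda_1\, x_1 + \dots + \lambda_n\, x_n, \qquad \lambda_i \ge 0,

without the normalization constraint \sum_i \lambda_i = 1.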

Kleisli morphisms: random maps

The basic interpretation of a Kleisli morphism, i.e. a map X \to P Y, is that of a map with a random outcome. For example, the Kleisli morphisms of the Giry monad are exactly the Markov kernels (stochastic maps) between measurable spaces.

The Kleisli composition recovers exactly the Chapman-Kolmogorov formula for the composition of stochastic maps. In the language of conditional probability, this reads as

p(z|x) \;=\; \sum_y p(z|y) \, p(y|x)

for the discrete case, and as

p(A|x) \;=\; \int_Y p(A|y) \, d p(y|x)

for the continuous case.

(See Giry’s original article for the details, as well as the introductions given in Perrone ‘18, Section 1.1 and Perrone ‘19, Example 5.1.13).
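As a concrete illustration of the discrete formula, here is a minimal sketch of Kleisli composition for the distribution monad, with finitely supported distributions represented as dictionaries of probabilities (the function names are illustrative, not taken from any particular library):

```python
# A minimal sketch: finitely supported distributions as Python dicts
# mapping outcomes to probabilities.

def unit(x):
    """Dirac distribution at x (the unit of the monad)."""
    return {x: 1.0}

def kleisli_compose(f, g):
    """Compose Kleisli morphisms f : X -> P Y and g : Y -> P Z.

    The composite sends x to the distribution
    z |-> sum_y g(y)(z) * f(x)(y),
    i.e. the discrete Chapman-Kolmogorov formula above.
    """
    def composite(x):
        result = {}
        for y, p_y in f(x).items():
            for z, p_z in g(y).items():
                result[z] = result.get(z, 0.0) + p_y * p_z
        return result
    return composite

# Example: flip a fair coin, then keep "heads" but re-flip on "tails".
flip = lambda _: {"heads": 0.5, "tails": 0.5}
keep_or_reflip = lambda face: unit("heads") if face == "heads" else flip(face)

two_stage = kleisli_compose(flip, keep_or_reflip)
print(two_stage("start"))  # {'heads': 0.75, 'tails': 0.25}
```

The example composes a fair coin flip with a “keep heads, otherwise re-flip” step, reproducing the 3/4–1/4 distribution of the two-coin example above.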

Kleisli categories of probability monads are often instances of Markov categories.

Monoidal structure: products and marginals

Probability monads are usually defined on monoidal categories, in particular on cartesian monoidal categories. On product spaces X \times Y, the object P(X \times Y) has the interpretation of containing joint distributions. Given a joint distribution, one can form the marginals by pushing forward along the product projections X \times Y \to X and X \times Y \to Y. This is in general a destructive operation, since stochastic dependence may be discarded.
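For instance, for a joint probability measure p on X \times Y, the first marginal is the pushforward along the projection, given on measurable sets A \subseteq X by

p_X(A) \;=\; p(A \times Y) ,

and analogously for the second marginal.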

Given marginal distributions, one can form a canonical joint distribution by forming the product distribution. This is encoded by a monoidal structure on the probability monad (or equivalently, a commutative strength), a map which satisfies particular compatibility conditions. The Giry monad is monoidal on the category Meas with the cartesian product. Similar statements are true for most other probability and measure monads.
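Explicitly, for the Giry monad the monoidal structure map \nabla : P X \times P Y \to P(X \times Y) sends a pair of probability measures (p, q) to the product measure p \otimes q, which is determined by

(p \otimes q)(A \times B) \;=\; p(A)\, q(B)

for measurable sets A \subseteq X and B \subseteq Y.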

When the monad is affine (meaning that P 1 \cong 1), a product probability measure is necessarily the product of its marginals. This is the case for monads of probability measures, but not for monads of unnormalized measures.

More on this at joint and marginal probability.

Duality

(…work in progress…)

Detailed list

| Monad P | Category | Elements/points of P X | Extra structure of P X | P-Algebras | References |
|---|---|---|---|---|---|
| distribution monad (a.k.a. finitary Giry monad, convex combination monad) | Set | convex combinations or finitely supported probability measures | (just a set) | convex spaces | Fritz '09, Jacobs '18 |
| Giry monad | Meas | probability measures | initial σ-algebra of evaluation maps | Full characterization unknown. See also here. | Lawvere '62, Giry '80 |
| Giry monad | Pol | Borel probability measures | initial topology of integration maps | Full characterization unknown. See also here. | Giry '80 |
| probability monad on QBS | quasi-Borel spaces | equivalence classes of random variables | quasi-Borel structure inherited from the cartesian closed structure | Full characterization unknown. | HKSY '17 |
| Radon monad | Comp | Radon probability measures (or continuous valuations) | weak topology w.r.t. continuous functions | compact convex subsets of locally convex topological vector spaces | Swirszcz '74, Keimel '08 |
| ordered Radon monad | CompOrd | Radon probability measures (or continuous valuations) | weak topology w.r.t. continuous functions, stochastic order | compact convex subsets of ordered locally convex topological vector spaces | Swirszcz '74, Keimel '08 |
| measure monad on Top | Top | τ-additive measures | A-topology, stochastic order | Full characterization unknown. See also here. | F-P-R '19 |
| probabilistic powerdomain | dcpos, continuous domains | continuous valuations | stochastic order | abstract probabilistic domains (continuous case) | J-P '89 |
| extended probabilistic powerdomain | Top, stably compact spaces | continuous valuations | initial topology of evaluation maps, stochastic order | Full characterization unknown. Dedicated section here. | Heckmann '96, A-J-K '04, GL-J '19, F-P-R '19 |
| valuation monad on locales | Loc | continuous valuations | initial topology of evaluation maps | | Vickers '11 |
| Kantorovich monad | complete metric spaces | Radon probability measures of finite first moment | Kantorovich-Wasserstein metric | closed convex subsets of Banach spaces | van Breugel '05, F-P '19 |
| ordered Kantorovich monad | complete L-ordered metric spaces | Radon probability measures of finite first moment | Kantorovich-Wasserstein metric, stochastic order | closed convex subsets of ordered Banach spaces | F-P |

(…to be expanded…)

See also

References

An account in terms of codensity monads in:

  • Ruben Van Belle, Probability monads as codensity monads. Theory and Applications of Categories 38 (2022), 811–842, (tac)

An introduction to some of the concepts can also be found in:

  • Paolo Perrone, Notes on Category Theory with examples from basic mathematics, Chapter 5. (arXiv)

  • Paolo Perrone, Categorical probability and stochastic dominance in metric spaces, PhD thesis, Chapter 1. (pdf)

  • Tobias Fritz and Paolo Perrone, Monads, partial evaluations, and rewriting, MFPS 36, 2020 - Section 6. (arXiv)

and in the following video lectures:

category: probability
