nLab transport plan

Idea

One can view a probability measure $p$ on a space $(X,\mathcal{A})$ as a “pile of mass”, for example, of sand, on the space $X$. Using this picture, given two probability spaces $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$, there could be many ways of moving the mass from $X$ to $Y$ in such a way that the sand from the pile $p$ is arranged to form the pile $q$. (The mass from which point goes to which point, or points?) This “way of moving the mass” is called a transport plan, and it is usually encoded by a joint distribution or by a Markov kernel (see below).

It is useful to keep track of the way in which we rearrange the mass $p$ to form $q$. The different ways can be seen as different morphisms between the objects $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$ in a category of couplings.

Definition

Let $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$ be probability spaces. A coupling or transport plan between $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$ is a probability space $(X\times Y, \mathcal{A}\otimes\mathcal{B},r)$ where

  • $\mathcal{A}\otimes\mathcal{B}$ is the tensor product sigma-algebra on the product space $X\times Y$ (generated by the sets $A\times B$ with $A\in\mathcal{A}$ and $B\in\mathcal{B}$);

  • the measure $r$ has $p$ and $q$ as marginals, in the sense that for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$,

    $$ r(A\times Y) \,=\, p(A) \;\; \text{and} \;\; r(X\times B) \,=\, q(B) \,. $$
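In the finite case the definition can be checked by hand. The following is a minimal sketch, assuming finite sets with every subset measurable, so that a measure is determined by its point masses; the sets, the masses and the helper names `marginals` and `is_coupling` are illustrative choices, not taken from the sources.

```python
from fractions import Fraction as F

# Hypothetical finite example: X = {"a", "b"}, Y = {"u", "v", "w"}.
p = {"a": F(1, 2), "b": F(1, 2)}
q = {"u": F(1, 4), "v": F(1, 4), "w": F(1, 2)}

# A candidate joint distribution r on X x Y, stored as {(x, y): mass}.
r = {
    ("a", "u"): F(1, 4), ("a", "v"): F(1, 4), ("a", "w"): F(0),
    ("b", "u"): F(0),    ("b", "v"): F(0),    ("b", "w"): F(1, 2),
}

def marginals(r):
    """Return the pair (marginal on X, marginal on Y) of a finite joint distribution."""
    on_x, on_y = {}, {}
    for (x, y), mass in r.items():
        on_x[x] = on_x.get(x, 0) + mass
        on_y[y] = on_y.get(y, 0) + mass
    return on_x, on_y

def is_coupling(r, p, q):
    """Check r({x} x Y) = p({x}) and r(X x {y}) = q({y}) at every point.

    In the finite case, additivity extends this to all subsets A and B.
    """
    return marginals(r) == (p, q)

print(is_coupling(r, p, q))
```

Here the plan `r` splits the mass at `"a"` evenly between `"u"` and `"v"` and sends all the mass at `"b"` to `"w"`.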

Main constructions

Identity coupling

Given a probability space $(X,\mathcal{A},p)$, the identity coupling or diagonal coupling is given by the following measure on $\mathcal{A}\otimes\mathcal{A}$:

$$ \Delta_p (A\times A') = p(A\cap A') $$

for all $A,A'\in\mathcal{A}$.

Intuitively, this is a copy of $p$ on $X$ concentrated on the diagonal subset $\{(x,x):x\in X\}\subseteq X\times X$. (Whenever $(X,\mathcal{A})$ is standard Borel, the diagonal subset is measurable, and so this intuition can be made precise.)
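As a short check, sketched here and not taken from the sources, the defining formula is that of the pushforward of $p$ along the diagonal map $\delta\colon X\to X\times X$, $\delta(x)=(x,x)$ (the symbol $\delta$ is notation introduced only for this remark). Since $\delta^{-1}(A\times A') = A\cap A'$,

$$ (\delta_* p)(A\times A') \,=\, p\big(\delta^{-1}(A\times A')\big) \,=\, p(A\cap A') \,=\, \Delta_p(A\times A') \,. $$

Taking $A'=X$ or $A=X$ recovers the marginals:

$$ \Delta_p(A\times X) \,=\, p(A\cap X) \,=\, p(A) \;\; \text{and} \;\; \Delta_p(X\times A') \,=\, p(X\cap A') \,=\, p(A') \,. $$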

This coupling gives the identity in the category of couplings. In terms of transport plans, this corresponds to not moving any mass (almost surely).

Independent coupling

Given probability spaces $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$, the independent coupling or product coupling or constant coupling is given by the product measure $p\otimes q$, i.e.

$$ (p\otimes q)(A\times B) = p(A)\,q(B) $$

for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$.

In terms of transport plans, this spreads the mass from almost every point of $X$ according to a distribution proportional to $q$, (almost surely) independently of the point of origin.
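A finite illustration of this independence, as a sketch under the assumption that all point masses of $p$ are positive (the example distributions and the helper `conditional_given` are hypothetical): normalising the row of the product coupling at any point $x$ gives back $q$.

```python
from fractions import Fraction as F

p = {"a": F(1, 3), "b": F(2, 3)}
q = {"u": F(1, 4), "v": F(3, 4)}

def independent_coupling(p, q):
    """Product measure on point masses: (p (x) q)(x, y) = p(x) * q(y)."""
    return {(x, y): p[x] * q[y] for x in p for y in q}

def conditional_given(r, x):
    """Distribution on Y obtained by normalising the row of r at x (needs positive row mass)."""
    row = {y: mass for (x0, y), mass in r.items() if x0 == x}
    total = sum(row.values())
    return {y: mass / total for y, mass in row.items()}

r = independent_coupling(p, q)

# The mass leaving each point x is distributed as q, whatever x is.
print(all(conditional_given(r, x) == q for x in p))
```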

Composition of couplings

Let $(X,\mathcal{A},p)$, $(Y,\mathcal{B},q)$, $(Z,\mathcal{C},r)$ be standard Borel probability spaces, and consider transport plans $s$ from $p$ to $q$ and $t$ from $q$ to $r$. The composite transport plan $t\circ s$ from $p$ to $r$ is defined as follows:

$$ (t\circ s)(A\times C) = \int_Y s'(A|y)\,t'(C|y)\,q(dy) $$

for all $A\in\mathcal{A}$ and $C\in\mathcal{C}$, where $s'$ and $t'$ are the regular conditional distributions associated to $s$ and $t$ given $Y$. The interpretation is that the mass is moved according to the plan $s$ and then according to the plan $t$, and in case the transport is stochastic, the two transitions are taken independently.

This construction gives composition in the category of couplings. When the transport plans are induced by functions or kernels (see below), the composition of transport plans is given by the composition of functions or kernels.
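For finite spaces the integral becomes a sum, and the conditionals are obtained by dividing by $q$. The sketch below assumes finite sets and stores couplings as dictionaries of point masses; the function names and the example plans `s` and `t` are illustrative only. It also checks, on this one example, that the identity coupling acts as a unit on both sides.

```python
from fractions import Fraction as F

def compose(s, t, q):
    """Finite composite: (t o s)(x, z) = sum over y of s(x, y) * t(y, z) / q(y).

    Dividing by q(y) turns s and t into their conditionals given y;
    points with q(y) = 0 carry no mass and are skipped.
    """
    out = {}
    for (x, y), s_mass in s.items():
        for (y2, z), t_mass in t.items():
            if y == y2 and q[y] > 0:
                out[(x, z)] = out.get((x, z), 0) + s_mass * t_mass / q[y]
    return out

def identity_coupling(p):
    """Diagonal coupling: mass p(x) at (x, x), zero off the diagonal."""
    return {(x, x2): (p[x] if x == x2 else F(0)) for x in p for x2 in p}

# Hypothetical finite example with marginals p on X, q on Y, r on Z.
p = {"a": F(1, 2), "b": F(1, 2)}
q = {"u": F(1, 4), "v": F(3, 4)}
r = {"m": F(1, 2), "n": F(1, 2)}
s = {("a", "u"): F(1, 4), ("a", "v"): F(1, 4), ("b", "u"): F(0), ("b", "v"): F(1, 2)}
t = {("u", "m"): F(1, 4), ("u", "n"): F(0), ("v", "m"): F(1, 4), ("v", "n"): F(1, 2)}

ts = compose(s, t, q)
print(ts)

# Unit laws on this example: composing with an identity coupling changes nothing.
print(compose(identity_coupling(p), s, p) == s, compose(s, identity_coupling(q), q) == s)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equalities can be tested without rounding error.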

In Kozen-Silva-Voogd'23, this construction was extended beyond the standard Borel case. (See there for the details.)

Couplings induced by functions

Let $f:(X,\mathcal{A},p)\to(Y,\mathcal{B},q)$ be a measure-preserving function. One can define the “deterministic” transport plan $r_f$ as follows,

$$ r_f(A\times B) = p\big(A\cap f^{-1}(B)\big) $$

for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$. Intuitively, this maps all the mass at $x$ to the point $f(x)$, for every $x\in X$.

Note that in general there may exist no measure-preserving function between two probability spaces, for example, on the real line, if $p$ is a Dirac delta and $q$ is not. A construction that always exists is in terms of Markov kernels, see below.
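Both points can be seen in a finite sketch. The code below assumes finite sets with point masses; the helper names, the `parity` map and the coin example are hypothetical, and the Dirac example is a finite analogue of the one on the real line. It builds $r_f$ from a measure-preserving function and then confirms that no function pushes a Dirac mass onto a fair coin.

```python
from fractions import Fraction as F

def pushforward(p, f):
    """Image measure of the finite distribution p along the function f."""
    out = {}
    for x, mass in p.items():
        out[f(x)] = out.get(f(x), 0) + mass
    return out

def is_measure_preserving(p, f, q):
    """True when f lands in the points of q and the pushforward of p along f is q."""
    image = pushforward(p, f)
    return all(y in q for y in image) and all(image.get(y, 0) == q[y] for y in q)

def coupling_from_function(p, f, q):
    """Deterministic plan: r_f(x, y) = p(x) if y = f(x), and 0 otherwise."""
    if not is_measure_preserving(p, f, q):
        raise ValueError("f does not push p forward to q")
    return {(x, y): (p[x] if f(x) == y else F(0)) for x in p for y in q}

def parity(x):
    return "odd" if x % 2 else "even"

p = {1: F(1, 4), 2: F(1, 4), 3: F(1, 2)}
q = {"odd": F(3, 4), "even": F(1, 4)}
r_f = coupling_from_function(p, parity, q)
print(r_f[(3, "odd")], r_f[(3, "even")])  # all the mass at 3 goes to "odd"

# A Dirac mass on a one-point set admits only two functions into a two-point set,
# and neither of them produces a fair coin.
dirac = {0: F(1)}
coin = {"heads": F(1, 2), "tails": F(1, 2)}
candidates = [lambda x: "heads", lambda x: "tails"]
print(any(is_measure_preserving(dirac, f, coin) for f in candidates))
```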

Couplings induced by kernels

Let $k:(X,\mathcal{A},p)\to(Y,\mathcal{B},q)$ be a measure-preserving Markov kernel. One can define a transport plan $r_k$ as follows,

$$ r_k(A\times B) = \int_A k(B|x)\,p(dx) $$

for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$. Intuitively, this maps all the mass at $x$ to a measure on $Y$ proportional to the measure $B\mapsto k(B|x)$.

Note that in the formula above, the measure $B\mapsto k(B|x)$ is invoked only for almost all $x$, and so the formula is insensitive to changes in $k$ on a set of $p$-measure zero. In a certain sense, this is transporting the mass of $p$, rather than the single points $x$.

In many cases, such as if $(X,\mathcal{A})$ and $(Y,\mathcal{B})$ are standard Borel, every transport plan is of the form $r_k$ for some $k$. See also the discussion at "categories of couplings".
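In the finite case both remarks can be verified directly. The sketch below assumes finite sets and stores a kernel as a nested dictionary `{x: {y: k(y|x)}}`; the names `coupling_from_kernel` and `kernel_from_coupling` and the example data are illustrative assumptions. It shows that two kernels differing only at a point of $p$-measure zero induce the same plan, and that a kernel can be read back from the plan wherever $p$ is positive.

```python
from fractions import Fraction as F

def coupling_from_kernel(p, k):
    """r_k(x, y) = k(y | x) * p(x), with the kernel stored as {x: {y: k(y|x)}}."""
    return {(x, y): k[x][y] * p[x] for x in p for y in k[x]}

def kernel_from_coupling(r):
    """Disintegrate r along X: k(y | x) = r(x, y) / p(x), only where p(x) > 0."""
    p = {}
    for (x, _), mass in r.items():
        p[x] = p.get(x, 0) + mass
    k = {}
    for (x, y), mass in r.items():
        if p[x] > 0:
            k.setdefault(x, {})[y] = mass / p[x]
    return k

# Hypothetical example; the point "c" has p-measure zero.
p = {"a": F(1, 2), "b": F(1, 2), "c": F(0)}
k = {
    "a": {"u": F(1, 2), "v": F(1, 2)},
    "b": {"u": F(0), "v": F(1)},
    "c": {"u": F(1), "v": F(0)},
}
# Same kernel, changed only on the null point "c".
k_alt = dict(k, c={"u": F(0), "v": F(1)})

r_k = coupling_from_kernel(p, k)
print(r_k == coupling_from_kernel(p, k_alt))  # the plan ignores the change at "c"
print(kernel_from_coupling(r_k))              # recovers k at "a" and "b" only
```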

Bayesian inversion

Couplings are in some sense undirected, meaning that every transport plan from $X$ to $Y$ can also be seen as (and canonically induces) a transport plan from $Y$ to $X$.

This makes the category of couplings canonically a dagger category.

For transport plans specified by kernels, this symmetry corresponds exactly to Bayesian inversion of kernels.
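A finite sketch of this correspondence, assuming finite sets, point masses, and a kernel stored as `{x: {y: k(y|x)}}` with every row listing the same points of $Y$ (all names and numbers are hypothetical): swapping the two coordinates of $r_k$ gives the same plan as the one induced by the kernel computed with Bayes' rule.

```python
from fractions import Fraction as F

def coupling_from_kernel(p, k):
    """r_k(x, y) = k(y | x) * p(x)."""
    return {(x, y): k[x][y] * p[x] for x in p for y in k[x]}

def dagger(r):
    """Read a plan from X to Y as a plan from Y to X by swapping coordinates."""
    return {(y, x): mass for (x, y), mass in r.items()}

def bayesian_inverse(p, k):
    """Bayes' rule: k_inv(x | y) = k(y | x) * p(x) / q(y), where q(y) = sum_x k(y | x) * p(x).

    Returns the marginal q together with the inverse kernel, defined where q(y) > 0.
    """
    q = {}
    for x in p:
        for y, prob in k[x].items():
            q[y] = q.get(y, 0) + prob * p[x]
    k_inv = {y: {x: k[x][y] * p[x] / q[y] for x in p} for y in q if q[y] > 0}
    return q, k_inv

p = {"a": F(1, 2), "b": F(1, 2)}
k = {"a": {"u": F(1, 2), "v": F(1, 2)}, "b": {"u": F(0), "v": F(1)}}

q, k_inv = bayesian_inverse(p, k)
print(k_inv)

# The plan induced by the inverse kernel is the swapped plan.
print(coupling_from_kernel(q, k_inv) == dagger(coupling_from_kernel(p, k)))
```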

References

  • Cedric Villani, Optimal transport: old and new, Springer, 2008.

  • Fredrik Dahlqvist, Vincent Danos, Ilias Garnier, and Alexandra Silva, Borel kernels and their approximation, categorically, MFPS 2018. (arXiv)

  • Dexter Kozen, Alexandra Silva, Erik Voogd, Joint Distributions in Probabilistic Semantics, MFPS 2023. (arXiv)

  • Paolo Perrone, Lifting couplings in Wasserstein spaces, 2021. (arXiv:2110.06591)

category: probability

Last revised on February 7, 2024 at 17:02:42.