Workshop Categorical Probability and Statistics, 5–8 June 2020 (online) and June 2021 (in person)

Tutorial videos:

Schedule for the online live event:

Fri, 5 Jun Sat, 6 Jun Sun, 7 Jun Mon, 8 Jun

11:45 UTC
Opening Welcome address by the organizers
12h UTC
Staton Categorical models of probability with symmetries

I will present two categorical models for probability whose properties differ in some respects from the usual perspective. The first can be understood as representing real numbers as Pólya urns, as I will explain. The second includes infinite random graphs as measures. The models are inspired by the symmetries in the foundations of probabilistic programming (although I won’t assume familiarity with this). The common ground is that they both involve invariance via naturality.

This is (in part) joint work with Nate Ackerman, Cameron Freer, Dan Roy, Dario Stein and Hongseok Yang.
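
As a small illustration of the urn picture (a sketch only, not the categorical model of the talk), the following computes exact probabilities of draw sequences from a Pólya urn and checks that they depend only on how many balls of each colour were drawn, not on the order. The function name and the starting urn of one black and one white ball are assumptions made for the example.

```python
from fractions import Fraction
from itertools import permutations


def polya_sequence_probability(draws, black=1, white=1):
    """Exact probability of a draw sequence from a Pólya urn.

    After each draw the ball is returned together with one extra ball
    of the same colour. `draws` is a string over {"b", "w"}.
    """
    prob = Fraction(1)
    for colour in draws:
        total = black + white
        if colour == "b":
            prob *= Fraction(black, total)
            black += 1
        else:
            prob *= Fraction(white, total)
            white += 1
    return prob


# Every reordering of the same draws has the same probability (exchangeability).
probs = {polya_sequence_probability("".join(p)) for p in permutations("bbw")}
```

Here `probs` collapses to a single value, which is the symmetry (invariance under permutation of the draws) that the abstract alludes to.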

(slides, recording)
Spekkens Disentangling inference and influence in classical and quantum theories
(slides, recording)
McIver The Category of Correlations

Designing programs that do not leak confidential information continues to be a challenge. Part of the difficulty arises when partial information leaks are inevitable, implying that design interventions can only limit rather than eliminate their impact. We show, by example, how to gain a better understanding of the consequences of information leaks by modelling what adversaries might be able to do with any leaked information. The presentation is based on the theory of Quantitative Information Flow, and uses the well-known probability monad to provide an information-flow aware semantics for a small programming language. We will explore some properties of the language and demonstrate that "correlations" rather than the more familiar "prior/posterior" probabilities of Bayesian reasoning are fundamental to understanding how information leaks in programs.
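
A minimal sketch of the discrete probability monad mentioned in the abstract, assuming finite distributions stored as dictionaries. The 2-bit secret and the parity leak are hypothetical and are not taken from the talk; the point is only that running a program on a prior yields a joint distribution (a correlation) between the secret and the observation, from which posteriors can be derived.

```python
from fractions import Fraction


def unit(x):
    """Point distribution on x (the monad's unit)."""
    return {x: Fraction(1)}


def bind(dist, kernel):
    """Push a distribution through a probabilistic function (Kleisli extension)."""
    out = {}
    for x, px in dist.items():
        for y, py in kernel(x).items():
            out[y] = out.get(y, Fraction(0)) + px * py
    return out


# Hypothetical example: a uniformly distributed 2-bit secret, and a program
# that leaks the parity of the secret.
prior = {s: Fraction(1, 4) for s in range(4)}


def leak_parity(secret):
    return unit((secret, secret % 2))  # pair the secret with what is observed


joint = bind(prior, leak_parity)  # the secret/observation correlation


def posterior(joint, observation):
    """Condition the joint distribution on one observation."""
    mass = sum(p for (_, o), p in joint.items() if o == observation)
    return {s: p / mass for (s, o), p in joint.items() if o == observation}
```

For instance, `posterior(joint, 0)` is uniform on the even secrets 0 and 2: the adversary has learned one bit.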

Simpson Synthetic probability theory

I shall outline an alternative to the standard set-theoretic formulation of probability theory, based on a category-theory-influenced axiomatisation of the notion of "random variable" as a primitive concept. The axioms capture desirable properties of random variables, not all of which apply to random variables as usually defined. They are designed to support a streamlined development of probability theory, in which sample spaces play no role and one (almost) never needs to consider sigma-algebras and associated notions of measurability.

(slides, recording)
13h UTC
Probabilistic morphisms and Bayesian statistical models

I shall present a categorical approach to Bayesian nonparametrics, using the notion of a probabilistic morphism, that has been developed in my joint work with Juergen Jost, Duc Hoang Luu and Tat Dat Tran (arXiv:1905.11448). In particular, I shall revisit Dirichlet measures and posterior distribution using probabilistic morphisms, and give a new formula for posterior distribution. Finally I shall compare the notion of a Bayesian statistical model with the notion of a diffeological statistical model that has been introduced in my subsequent work (arXiv:1912.02090, Mathematics 2020, 8(2), 167).

(slides, recording)
Rischel Introduction to Markov categories

In this talk I give an introduction to the formalism of Markov categories, a synthetic approach to the foundations of statistics. A Markov category is a category with the structure necessary to make sense of basic notions of probability theory. The essential example is the category Stoch of measurable spaces and Markov kernels. I will introduce the basic notions of Markov categories and give examples of basic results from probability theory and statistics generalized to this setting.
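
For finite sets, a Markov kernel is just a table of transition probabilities, and the structure of a Markov category can be written down directly. The following is a minimal sketch under that finite assumption; the dictionary encoding and the names `compose`, `copy` and `delete` are choices made for this example, not notation from the talk.

```python
from fractions import Fraction

# A kernel X -> Y between finite sets is a dict: x -> {y: probability}.


def compose(f, g):
    """Sequential composition "g after f" (the Chapman-Kolmogorov sum)."""
    out = {}
    for x, fx in f.items():
        row = {}
        for y, p in fx.items():
            for z, q in g[y].items():
                row[z] = row.get(z, Fraction(0)) + p * q
        out[x] = row
    return out


def copy(xs):
    """Deterministic copy map X -> X x X."""
    return {x: {(x, x): Fraction(1)} for x in xs}


def delete(xs):
    """Delete map X -> 1, where 1 is the one-point set {()}."""
    return {x: {(): Fraction(1)} for x in xs}


# A noisy bit-flip channel on {0, 1}, applied twice.
noisy = {0: {0: Fraction(3, 4), 1: Fraction(1, 4)},
         1: {0: Fraction(1, 4), 1: Fraction(3, 4)}}
twice = compose(noisy, noisy)
```

Composing any kernel with `delete` gives `delete` again, which expresses that kernels preserve total probability.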

(slides, recording)
Spitters Synthetic topology in Homotopy Type Theory for probabilistic programming

Martin E. Bidlingmaier, Florian Faissole, Bas Spitters

The ALEA Coq library formalizes measure theory based on a variant of the Giry monad on the category of sets. This enables the interpretation of a probabilistic programming language with primitives for sampling from discrete distributions. However, continuous distributions have to be discretized because the corresponding measures cannot be defined on all subsets of their carriers. This paper proposes the use of synthetic topology to model continuous distributions for probabilistic computations in type theory. We study the initial σ-frame and the corresponding induced topology on arbitrary sets. Based on these intrinsic topologies we define valuations and lower integrals on sets, and prove versions of the Riesz and Fubini theorems. We then show how the Lebesgue valuation, and hence continuous distributions, can be constructed.

(slides, recording)
Jacobs De Finetti's construction as a categorical limit

This paper reformulates a classical result in probability theory from the 1930s in modern categorical terms: de Finetti's representation theorem is redescribed as a limit statement for a chain of finite spaces in the Kleisli category of the Giry monad. This new limit is used to identify among exchangeable coalgebras the final one.
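
For orientation, the classical theorem in its standard textbook form for binary sequences (not the categorical formulation of the paper) says that an infinite sequence $X_1, X_2, \dots$ of $\{0,1\}$-valued random variables is exchangeable if and only if there is a probability measure $\mu$ on $[0,1]$, necessarily unique, such that for all $n$ and all $x_1, \dots, x_n \in \{0,1\}$:

```latex
P(X_1 = x_1, \dots, X_n = x_n)
  = \int_0^1 \theta^{k} (1-\theta)^{n-k} \, \mu(d\theta),
\qquad k = \sum_{i=1}^{n} x_i .
```

In words, every exchangeable binary sequence is a mixture of i.i.d. coin flips with a random bias $\theta$.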

(slides, recording)

15h UTC
McCullagh Categorical notions in statistics

I will talk about three areas in which categorical thinking has helped to clarify statistical ideas.

The first is in the mundane area of sampling, and specifically the importance of inheritance in the definition of symmetric functions. Inheritance leads directly to k-statistics and the classical cumulants. The second is related to symmetric functions for spectral samples or eigenvalues. Inheritance leads to a family of polynomial symmetric functions, called spectral k-statistics, which are closely related in the limit to free cumulants. The third area is related to factorial models and is closely connected with representation theory for injective maps.

(slides, recording)
Perrone Probability monads and stochastic dominance

In this talk we give a category-theoretical account of notions of stochastic dominance, of first and second order, using the Kantorovich monad.

In many areas of applied mathematics, such as decision theory and mathematical finance, some states of a given system may be preferable to others, for example because they carry a price, or a utility, which one wants to optimize. Stochastic dominance relations are ways of comparing *random* states, according to what has to be optimized, often extending the preference order of the deterministic states, taking the randomness into account.

In particular, first-order stochastic dominance (also called stochastic order) is a way of comparing random variables on an ordered space, according to "how likely it is to have a better outcome". This extends the underlying (deterministic) order to the random case. Probability monads can be thought of as tools for extending deterministic concepts to the stochastic case in a consistent way. Indeed, operating this extension to the order of a space, one gets the stochastic order between the random variables on it.

Second-order stochastic dominance is a way of comparing random variables on a convex or vector space (such as the real line), according to how "random" or "risky" they are. Choosing a less-random distribution, sometimes at the expense of a lower expected value, can be useful for example to plan more strategically. We show that this order can be encoded in terms of a categorical structure known as the "bar construction" of a probability monad, and that it is linked to the notions of conditional expectation and martingales.
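
On the real line, first-order stochastic dominance between finite distributions can be checked by comparing cumulative distribution functions. The sketch below assumes distributions given as dictionaries from values to probabilities; the function names and the three sample distributions are invented for the example, and second-order dominance is not covered.

```python
from fractions import Fraction


def cdf(dist, t):
    """P(X <= t) for a finite distribution given as {value: probability}."""
    return sum(p for x, p in dist.items() if x <= t)


def first_order_below(p, q):
    """True if p is below q in the stochastic order on the real line.

    Criterion: the CDF of p lies on or above the CDF of q everywhere, so q
    is at least as likely as p to exceed any threshold. For finite
    distributions it suffices to check the support points.
    """
    points = sorted(set(p) | set(q))
    return all(cdf(p, t) >= cdf(q, t) for t in points)


low = {0: Fraction(1, 2), 1: Fraction(1, 2)}
high = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}
spread = {-1: Fraction(1, 2), 2: Fraction(1, 2)}
```

With these inputs `low` is below `high`, while `low` and `spread` are incomparable, so the stochastic order is only a partial order.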

Sources: arXiv:1808.09898,

(slides, recording)
Parzygnat Categorical probability in the quantum realm

One of the axioms of a Markov category requires the copy/duplicate map to be symmetric. The non-commutativity of observables in quantum mechanics rules this out, and also rules out the existence of a positive probability-preserving map that could serve as a copy map (the latter is sometimes called the no-cloning theorem). Nevertheless, the subcategory of quantum physical processes embeds into a larger category where a copy map exists and satisfies a property that is almost as good as symmetry. This allows one to use many of the string-diagrammatic techniques to formulate and prove interesting facts regarding Bayesian inversion in the quantum setting. Going full circle, these results may suggest new insight into categorical probability.

(slides, recording)
Panangaden Approximating probabilistic bisimulation via conditional expectation

In a recent talk at the Applied Category Seminar I described a functorial view of conditional expectation. In this talk I will describe how one can use this to approximate Markov processes. Instead of quotienting the state space we coarsen the sigma-algebra. The conditional expectation is used to describe how the transition probability kernels are approximated. It turns out that bisimulation is a very special kind of approximation in which no information is lost. The approximations that we construct are then seen as “approximate bisimulations”. I will quickly review the basic facts about conditional expectation so this talk should be reasonably self-contained. This is joint work with Philip Chaput, Vincent Danos and Gordon Plotkin.
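
On a finite state space, coarsening the sigma-algebra amounts to choosing a partition, and the conditional expectation replaces a function by its probability-weighted average on each block. The following is a sketch of that finite special case only; the function name, the four states and their values are assumptions made for the example.

```python
from fractions import Fraction


def conditional_expectation(f, prob, partition):
    """E[f | G], where G is the sigma-algebra generated by `partition`.

    `f` and `prob` are dicts over a finite state space; `partition` is a
    list of disjoint blocks covering it. The result is constant on each
    block, equal to the probability-weighted average of f over that block.
    """
    result = {}
    for block in partition:
        mass = sum(prob[s] for s in block)
        if mass == 0:
            continue  # the value on a null block is arbitrary; left undefined here
        average = sum(f[s] * prob[s] for s in block) / mass
        for s in block:
            result[s] = average
    return result


prob = {s: Fraction(1, 4) for s in "abcd"}
f = {"a": 0, "b": 2, "c": 4, "d": 8}
coarse = conditional_expectation(f, prob, [["a", "b"], ["c", "d"]])
```

The coarsened function can no longer tell `a` from `b` or `c` from `d`, but its overall expectation is the same as that of `f`.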

(slides, recording)
16h UTC
Gerhold Independence and Lévy processes in monoidal categories

Following an idea of Uwe Franz, we define independence in monoidal categories with inclusions. We show how the construction of a Lévy process from its marginal distributions as a projective limit can be generalized to this categorical setting. A rich source of examples is provided by independences in noncommutative probability.

Joint work with S. Lachs and M. Schürmann.

Patterson The algebra of statistical theories and models

Using tools from categorical logic, a precise analogy is drawn between models in logic and statistics. The notion of a statistical theory is introduced as a small Markov category with extra linear algebraic structure. Models of statistical theories in a category of Markov kernels are then seen to be statistical models, in the conventional sense. Morphisms between statistical theories, and their induced model migration functors, formalize relationships between different statistical models. These ideas are illustrated by a wide range of models from classical applied statistics, such as linear models, mixed models, and generalized linear models.

(slides, recording)
Spivak Internal probability valuations

There are many competing approaches to probability theory. A particularly nice one from a categorical viewpoint is that of (probability) valuations, which monotonically assign an element in [0,1] to each open set of a space, satisfying an inclusion-exclusion formula and a continuity requirement. Valuations are known to coincide with probability measures in many non-pathological cases, like on Polish spaces.
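
Spelled out in the usual classical form (the constructive formulation used in the talk may differ in detail), a probability valuation on a space $X$ with lattice of open sets $\mathcal{O}(X)$ is a map $\nu \colon \mathcal{O}(X) \to [0,1]$ such that:

```latex
\begin{align*}
  &\nu(\emptyset) = 0, \qquad \nu(X) = 1, \\
  &U \subseteq V \;\Longrightarrow\; \nu(U) \le \nu(V), \\
  &\nu(U) + \nu(V) = \nu(U \cup V) + \nu(U \cap V), \\
  &\nu\Bigl(\,\bigcup_{i \in I} U_i\Bigr) = \sup_{i \in I} \nu(U_i)
    \quad \text{for every directed family } (U_i)_{i \in I} \text{ of opens.}
\end{align*}
```

The third line is the inclusion-exclusion (modularity) condition and the fourth is the continuity requirement.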

The data and axioms of a topological space and a valuation on it can be stated constructively, and hence in the internal logic of any topos with a natural numbers object. In the topos of sets, for example, the result is just a valuation in the usual sense; however, understanding the semantics of valuations in more general toposes is an open problem.

In this talk, I'll discuss work-in-progress with Tobias Fritz, in which we investigate the semantics of valuations on spatial toposes. We believe that the result will in particular provide a new constructive and synthetic approach to stochastic processes.

(slides, recording)

Technical setup: