nLab biology

Idea

This page is used as a hub to gather references about category-theoretic work with explicit applications to biology.

Biology can be seen as an umbrella term for a spectrum of scientific domains that all study life, at various scales. A range of category-theoretic formalisms have now been proposed to model these scales, from DNA mechanisms through protein interactions to complex systems. This page gathers works that express their results in the language of biology by using concepts from category theory. Further works with potential applications to topics related to biological questions are listed at the end of the page.

Categories

This section presents category theoretic models using categories as ways to describe biological mechanisms and relations.

Ologs

Ontologies are used in genomics to classify categories of observable phenomena such as diseases or gene expressions. A category-theoretic model called “Ologs” was proposed to formalize such ontologies. Ologs are freely generated categories over graphs whose objects and arrows are defined by (English) sentences such that

  • each object is a nominal sentence about a given topic;

  • each arrow is encoded by a grammatical predicate starting with the verb is but deprived of its last grammatical object such that the arrow, its source and its target all together define a semantically correct (English) sentence.

The idea is that each arrow of an olog describes a fact about a given topic.
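As a rough illustration of the reading convention described above, the following sketch stores an olog as a labelled graph and reads arrows, and composable paths of arrows, as English sentences. The objects, arrows and the names `ARROWS`, `read_arrow` and `read_path` are hypothetical and chosen only for this example; they are not taken from the cited papers.

```python
# A toy olog: objects are noun phrases, arrows are predicates starting with "is".
# All labels below are hypothetical illustrations.
OBJECTS = {"a gene", "an mRNA", "a protein"}

# arrow name -> (source object, predicate, target object)
ARROWS = {
    "transcription": ("a gene", "is transcribed into", "an mRNA"),
    "translation": ("an mRNA", "is translated into", "a protein"),
}


def read_arrow(name):
    """Read a single arrow as the sentence formed by source, predicate and target."""
    source, predicate, target = ARROWS[name]
    return f"{source} {predicate} {target}"


def read_path(names):
    """Read a composable path of arrows (an arrow of the free category) as one sentence."""
    current = ARROWS[names[0]][0]
    sentence = current
    for k, name in enumerate(names):
        source, predicate, target = ARROWS[name]
        if source != current:
            raise ValueError(f"arrow {name!r} does not start at {current!r}")
        sentence += (" " if k == 0 else ", which ") + f"{predicate} {target}"
        current = target
    return sentence


print(read_path(["transcription", "translation"]))
```

Paths that are not composable are rejected, which mirrors the fact that only composable arrows define an arrow in the freely generated category.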

Ologs have been used to characterize hierarchies in biology.

Segments

Categories of segments aim to model the set of natural and experimental operations that can be done on DNA segments. The definition of their arrows is simple but flexible enough to express a wide range of biological mechanisms occurring in genetics and related fields.

For every non-negative integer $n$, we will denote the set $\{1,2,\dots,n\}$ as $[n]$. Note that $[0] = \emptyset$.

Definition

For every preorder $(\Omega,\preceq)$, we define a segment on $\Omega$ as a tuple $(n_0,n_1,t,c)$ where $n_0$ and $n_1$ are non-negative integers, $t:[n_1] \to [n_0]$ is an order-preserving surjection and $c:[n_0] \to \Omega$ is a function.

If we take the preorder $\mathbf{2} = \{0 \leq 1\}$, then the following diagram represents a segment $(5,14,t,c)$ on $\mathbf{2}$; the brackets represent the fibers of the order-preserving surjection $t$ while the corresponding colors represent the values of the function $c$, which lands in the set $\mathbf{2}$.
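To make the definition concrete, here is a minimal sketch of a segment as a small data structure, assuming that the functions $t$ and $c$ are stored as tuples of values. The class name `Segment`, its methods and the example values are hypothetical. The check covers only the conditions on $t$; membership of the values of $c$ in $\Omega$ is not verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A segment (n0, n1, t, c): t[j-1] stores t(j) and c[i-1] stores c(i)."""
    n0: int
    n1: int
    t: tuple
    c: tuple

    def is_valid(self):
        """Check that t is an order-preserving surjection [n1] -> [n0]."""
        if len(self.t) != self.n1 or len(self.c) != self.n0:
            return False
        order_preserving = all(a <= b for a, b in zip(self.t, self.t[1:]))
        surjective = set(self.t) == set(range(1, self.n0 + 1))
        return order_preserving and surjective

    def fibers(self):
        """Return the fibers of t, i.e. the 'brackets' of the segment."""
        return [[j for j in range(1, self.n1 + 1) if self.t[j - 1] == i]
                for i in range(1, self.n0 + 1)]


# A hypothetical segment on the preorder 2 = {0 <= 1} with three brackets.
example = Segment(n0=3, n1=5, t=(1, 1, 2, 3, 3), c=(1, 0, 1))
print(example.is_valid(), example.fibers())
```

Storing $t$ as a tuple makes the fibers easy to recover, at the cost of shifting indices by one relative to the sets $[n_1]$ and $[n_0]$.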

Definition

For every preorder $(\Omega,\preceq)$, we define a morphism of segments from a segment $(n_0,n_1,t,c)$ on $\Omega$ to a segment $(n_0',n_1',t',c')$ on $\Omega$ as a pair $(f_1,f_0)$ where $f_1:[n_1] \to [n_1']$ is an order-preserving injection and $f_0:[n_0] \to [n_0']$ is an order-preserving function such that the relation $c'(f_0(i)) \preceq c(i)$ holds in $(\Omega,\preceq)$ for every $i \in [n_0]$.

We define the category of segments over a preorder $(\Omega,\preceq)$ as the category $\mathbf{Seg}(\Omega)$ whose objects are segments over $\Omega$ and whose arrows are the morphisms of segments between them.
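The conditions defining a morphism of segments can be checked mechanically. The sketch below assumes that segments are given as tuples `(n0, n1, t, c)` with functions stored as tuples of values, and that the preorder is passed as a comparison function `leq`; the function names and the example data are hypothetical. Only the conditions stated in the definition above are tested.

```python
def is_order_preserving(f):
    """f is a tuple of values with f[k-1] = f(k)."""
    return all(a <= b for a, b in zip(f, f[1:]))


def is_segment_morphism(src, tgt, f1, f0, leq):
    """Check that (f1, f0) satisfies the stated conditions for a morphism src -> tgt."""
    n0, n1, _t, c = src
    m0, m1, _t2, c2 = tgt
    if len(f1) != n1 or len(f0) != n0:
        return False
    # f1 must land in [m1] and f0 must land in [m0]
    if any(not 1 <= v <= m1 for v in f1) or any(not 1 <= v <= m0 for v in f0):
        return False
    injective = len(set(f1)) == len(f1)
    # c'(f0(i)) <= c(i) in the preorder, for every i in [n0]
    labels_ok = all(leq(c2[f0[i] - 1], c[i]) for i in range(n0))
    return (is_order_preserving(f1) and injective
            and is_order_preserving(f0) and labels_ok)


# Hypothetical segments on the preorder 2 = {0 <= 1}.
source = (2, 3, (1, 1, 2), (1, 1))
target = (3, 5, (1, 1, 2, 3, 3), (1, 0, 1))
print(is_segment_morphism(source, target, (1, 2, 3), (1, 2), lambda a, b: a <= b))
```

Passing the preorder as a function keeps the check independent of the choice of $\Omega$.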

We can show that if the preorder $(\Omega,\preceq)$ defines a lattice, then the category $\mathbf{Seg}(\Omega)$ can be equipped with a site structure.

References

Ologs and biology:

  • DI Spivak, RE Kent, Ologs: A categorical framework for knowledge representation, pdf
  • JY Wong, J McDonald, M Taylor-Pinney, DI Spivak, DL Kaplan, MJ Buehler, Materials by Design: Merging Proteins and Music, Nano Today. 2012 Dec 1;7(6):488-495, link
  • T Giesa, DI Spivak, MJ Buehler, Reoccurring patterns in hierarchical protein materials and music: The power of analogies, pdf

Introductory material for categories of segments:

Theories and models

This section presents category theoretic models taking the form of diagrams. These models can be presented either as functors with properties or as commutative diagrams. Common examples are models for a limit sketch. The first attempt to formalize biological systems in terms of diagrams (with limits) was initiated by R. Rosen (see the references at the end of the page). However, Rosen’s work stays quite abstract and does not address any specific biological phenomenon.

Stock-flow diagrams and Petri nets

In the spirit of Spivak's approach to encoding databases as functors over small categories, functors have been used to organize (and hence model) biological data. One example is that of stock-flow diagrams, which are defined as follows.

Let us denote as $\mathsf{H}$ the free category generated over the following graph:

The previous diagram should be seen as a sketch specifying a structure in which there are links that go from a stock to a flow such that each flow goes from a stock to another stock.

Definition

We define a primitive stock-flow diagram as a functor $\mathsf{H} \to \mathbf{FinSet}$ where $\mathbf{FinSet}$ is the category of finite sets and functions.

Definition

A stock-flow diagram consists of a primitive stock-flow diagram $F:\mathsf{H} \to \mathbf{FinSet}$ and, for every element $x \in F(\mathrm{flow})$, a continuous function $\mathbb{R}^{U_x} \to \mathbb{R}$ where $U_x$ denotes the finite fiber $F(t)^{-1}(x)$.
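The following sketch illustrates these two definitions on an SIR-style toy model. It assumes that the graph generating $\mathsf{H}$ has objects stock, flow and link, an arrow $t$ from link to flow (as used in the fiber above), and three further arrows named here `s` (link to stock), `u` and `d` (flow to its upstream and downstream stock); these three names, the rate constants and the reading of a diagram as "net change equals inflow minus outflow" are assumptions made for illustration and are not fixed by the text above.

```python
# Finite sets assigned to the objects of H (hypothetical SIR-style example).
STOCKS = {"S", "I", "R"}
FLOWS = {"infection", "recovery"}
LINKS = {"l1", "l2", "l3"}

# Functions assigned to the arrows of H. Only the name t comes from the text;
# s, u and d are assumed names.
s = {"l1": "S", "l2": "I", "l3": "I"}                          # link -> stock
t = {"l1": "infection", "l2": "infection", "l3": "recovery"}   # link -> flow
u = {"infection": "S", "recovery": "I"}                        # flow -> upstream stock
d = {"infection": "I", "recovery": "R"}                        # flow -> downstream stock


def fiber(x):
    """U_x: the links sent to the flow x by the function assigned to t."""
    return sorted(link for link in LINKS if t[link] == x)


# One continuous function R^{U_x} -> R per flow x, taking values indexed by links.
FLOW_FUNCTIONS = {
    "infection": lambda v: 0.3 * v["l1"] * v["l2"],
    "recovery": lambda v: 0.1 * v["l3"],
}


def flow_rates(state):
    """Evaluate each flow function on the stock values seen through its links."""
    return {x: FLOW_FUNCTIONS[x]({link: state[s[link]] for link in fiber(x)})
            for x in FLOWS}


def net_change(state):
    """Assumed semantics: inflow minus outflow for every stock."""
    rates = flow_rates(state)
    return {k: sum(rates[x] for x in FLOWS if d[x] == k)
               - sum(rates[x] for x in FLOWS if u[x] == k)
            for k in STOCKS}


print(net_change({"S": 0.9, "I": 0.1, "R": 0.0}))
```

In this toy model the net changes sum to zero, since every flow leaves one stock and enters another.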

There is a notion of open stock-flow diagram that can be composed by using the composition of cospans. Stock-flow diagrams have been used to model epidemics and more specifically COVID-19.

Pedigrads

A pedigrad is a model for a limit sketch defined on a category of segments. The functors defining pedigrads can land in any type of category. These functors have been used to model genomic data and to design algorithms for studying them.

References

Stock-flow diagrams and Petri nets:

  • JC Baez, X Li, S Libkind, N Osgood and E Patterson, Compositional modeling with stock and flow diagrams, arXiv:2205.08373

  • A Baas, J Fairbanks, M Halter, S Libkind and E Patterson, An algebraic framework for structured epidemic modeling, arXiv:2203.16345

  • JC Baez, BS Pollard, A Compositional Framework for Reaction Networks, arXiv:1704.02051

Pedigrads:

  • R Tuyeras, Category theory for genetics I: mutations and sequence alignments, Theory and Applications of Categories, Vol. 33, 2018, No. 40, pp 1269-1317, link

  • R Tuyeras, Category theory for genetics II: genotype, phenotype and haplotype, arXiv:1805.07004

  • R Tuyeras, A category theoretical argument for causal inference, arXiv:2004.09999

Adjunctions

Adjunctions have been used to model pathogens and disease diagnoses together with their corresponding immune responses. For example, denote as $I$ the set of immune responses and as $P$ the set of pathogens and disease symptoms. In practice, we can map a subset of $P$ to a subset of $I$. This defines a binary relation as follows.

$$Q \subseteq \mathsf{Sub}(P) \times \mathsf{Sub}(I)$$

We can complete this binary relation into a functor of the following form.

$$Q: \mathsf{Sub}(P)^{\mathsf{op}} \times \mathsf{Sub}(I) \to \mathbf{2} = \{0 \leq 1\}.$$

We can then define a functor $F:\mathsf{Sub}(I) \to \mathsf{Sub}(P)$ with the following specification.

$$F(i) = \bigcup\{p \mid Q(p,i) = 1\}$$

If the binary relation $Q$ is defined such that $F$ preserves meets (i.e. intersections), then we can use the adjoint functor theorem to define a left adjoint $L:\mathsf{Sub}(P) \to \mathsf{Sub}(I)$ for the functor $F$.

$$L(p) = \bigcap\{i \mid p \subseteq F(i)\}$$

The previous adjunction gives us a context in which we can reason about immune responses and their triggering pathogens and diseases. Further, the adjunction formalism ensures a certain continuity in time and space regarding the linking of diseases/pathogens to their immune response (as suggested in the following correspondence).
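Since $\mathsf{Sub}(P)$ and $\mathsf{Sub}(I)$ are posets, the adjunction amounts to the correspondence $L(p) \subseteq i$ if and only if $p \subseteq F(i)$. The sketch below checks this on a small example by brute force. The data in `TRIGGERS` and the particular choice of $Q$ (namely $Q(p,i) = 1$ when every response triggered by an element of $p$ lies in $i$) are hypothetical; they are one way of obtaining a relation for which $F$ preserves intersections, not the construction of the cited paper.

```python
from itertools import chain, combinations

# Hypothetical toy data: the immune responses triggered by each pathogen.
TRIGGERS = {
    "virus": {"interferon", "antibody"},
    "bacterium": {"antibody"},
    "parasite": {"eosinophil"},
}
P = frozenset(TRIGGERS)
I = frozenset().union(*TRIGGERS.values())


def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(combo) for combo in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def Q(p, i):
    """Assumed relation: 1 when every response triggered by p lies in i."""
    return int(all(TRIGGERS[x] <= i for x in p))


def F(i):
    """F(i) is the union of all p with Q(p, i) = 1."""
    return frozenset().union(*[p for p in subsets(P) if Q(p, i) == 1])


def L(p):
    """L(p) is the intersection of all i with p contained in F(i)."""
    result = I
    for i in subsets(I):
        if p <= F(i):
            result = result & i
    return result


def is_adjunction():
    """Check L(p) <= i  iff  p <= F(i) for all subsets p of P and i of I."""
    return all((L(p) <= i) == (p <= F(i))
               for p in subsets(P) for i in subsets(I))


print(is_adjunction(), sorted(L(frozenset({"virus"}))))
```

In this example $L(p)$ is the set of all responses triggered by the pathogens in $p$, and $F(i)$ is the set of pathogens whose responses all lie in $i$.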

We can embed the previous type of models in time and space by filtering the sets $P$ and $I$ into a sequence (or a category) of subsets containing chronological and spatial occurrences of diseases and their immune responses, respectively. In this case, the corresponding binary relations $Q \subseteq \mathsf{Sub}(P) \times \mathsf{Sub}(I)$ need to be defined for each time and space parameter in a functorial way.

References

  • J-F Mascari, D Giacchero and N Sfakianakis, Symmetries and asymmetries of the immune system response: A categorification approach, 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2017, pp. 1451-1454, pdf

Operads and algebras

Trees

Operads have been used to model phylogenetic trees; see the references below.

Axioms for algebras and gradient descent

A little-disc-operad-inspired formalism was developed to model biological systems and cellular behaviors. This formalism considers gradient descent techniques to make certain subsets $A \subseteq \mathbb{R}^{\times n}$ converge towards algebra-like structures.

More specifically, these gradient descent techniques use the underlying equations of the axioms for algebras as objective functions. The sets $A$ obtained through these optimizations can be interpreted as models for specialization of biological functions and entropic mechanisms in living organisms.

References

Phylogenies:

  • JC Baez, N Otter, Operads and Phylogenetic Trees, Theory and Applications of Categories, Vol. 32 No. 40 (2017), 1397-1453, paper

Specialization and gradient descent:

  • R Tuyeras et al., Cellular intelligence: dynamic specialization through non-equilibrium multi-scale compartmentalization, bioRxiv

This section gathers works in domains that often use category theory as a language for logical discourse.

Algebraic topology

Algebraic topology techniques and intuitions developed along with persistent homology have been used to study neuronal morphologies.

Additionally, the clustering algorithm UMAP, commonly used in computational biology to classify sequencing data, relies on properties holding for fuzzy simplicial sets. For more detail, see the references below.

Algebraic geometry

The concept of singularities, in algebraic geometry, has been used to model anatomic morphologies and behaviors.

Additionally, various concepts of algebraic geometry such as manifolds and moduli spaces have been used by M Gromov to model molecular mechanisms in living cells.

References

Persistent homology:

  • L Kanari, P Dłotko, M Scolamiero, R Levi, J Shillcock, K Hess, H Markram (2016), Quantifying topological invariants of neuronal morphologies, arXiv:1603.08432

  • Y Lee, SD Barthel, P Dłotko, SM Moosavi, K Hess, B Smit (2017), Pore-geometry recognition: on the importance of quantifying similarity in nanoporous materials, arXiv:1701.06953

Fuzzy simplicial sets:

  • A Jackson, The mathematics of UMAP, pdf

  • DI Spivak, Metric Realization of Fuzzy Simplicial Sets, pdf

Algebraic geometry:

  • EC Zeeman, Catastrophe Theory, Scientific American, April 1976; pp. 65–70, 75–83, pdf

  • M Gromov, Mathematical slices of molecular biology, pdf

Higher structures

Hyperstructures

N Baas has proposed hyperstructures to describe hierarchical organizations in biology. While these structures were originally designed to organize extended cobordism structures, they are argued to also be appropriate for modeling multilevel systems in biology. Note that, in contrast to multilevel structures such as $n$-categories, hyperstructures offer more freedom in that biological processes are not necessarily oriented as globular arrows but, instead, appear to be organized as “aggregates” with bonds.

The idea behind applying hyperstructures to biology is that they allow us to consider some set $X_0$ of “agents” such that any subset $S \subseteq X_0$ can define an aggregate when it is “labeled” by an explanation (or description) $\omega$ for that aggregation.

The pair $(S,\omega)$ can then be represented by another label, say $\beta$, that can classify a collection of pairs sharing similarities.

The previous construction can then be repeated recursively on the set of labels $\beta$. This set, call it $X_1$, could potentially be the set of all the biological organs in an individual. Then, the next level $X_2$ (obtained from $X_1$ by following the previous procedure) can be the set of all organ systems, which can subsequently be organized as bodies on a fourth level $X_3$.

Note that hyperstructures also require compatibility properties between the labels. In particular, for each level $X_k$, the pairs $(S,\omega)$ should be organized into a Grothendieck construction $\int \Omega$ such that the mappings $(S,\omega) \mapsto \beta$ define a functor $\int \Omega \to \mathbf{Set}$.
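The recursive construction of levels can be sketched in a few lines, leaving aside the compatibility properties and the Grothendieck construction mentioned above. The function name `next_level`, the encoding of bonds as a dictionary from pairs $(S,\omega)$ to labels $\beta$, and all biological labels are hypothetical choices made for this illustration.

```python
def next_level(level, bonds):
    """Build X_{k+1} from X_k.

    `bonds` maps each pair (S, omega), with S a frozenset of elements of
    `level` and omega a description, to its label beta.
    """
    for (members, _omega) in bonds:
        if not members <= level:
            raise ValueError(f"{set(members)} is not a subset of the level")
    return set(bonds.values())


# Level X_0: agents (hypothetical cell types).
X0 = {"hepatocyte", "cardiomyocyte", "endothelial cell"}

# Aggregates of cells, each with a description, labelled by an organ.
X1 = next_level(X0, {
    (frozenset({"hepatocyte", "endothelial cell"}), "forms hepatic tissue"): "liver",
    (frozenset({"cardiomyocyte", "endothelial cell"}), "forms cardiac tissue"): "heart",
})

# Aggregates of organs, labelled by an organ system.
X2 = next_level(X1, {
    (frozenset({"heart"}), "pumps blood"): "circulatory system",
    (frozenset({"liver"}), "processes nutrients"): "digestive system",
})

print(sorted(X1), sorted(X2))
```

Several pairs may share the same label, which corresponds to a label classifying a collection of similar aggregates.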

Bigraphs, $\lambda$-calculus and $\pi$-calculus

Bigraphs are a type of hypergraph-based structure that can be linked to calculi such as the $\lambda$-calculus and to process algebras such as the $\pi$-calculus and its stochastic variant. Each of these calculi has been used to model biological systems.

Interestingly, it was shown that the $\pi$-calculus can be interpreted within a 2-category-theoretic setting.

A number of bigraph-based formalisms have been proposed to model complex systems, some with a category theoretic flavor. For example, stochastic bigraphs and their compositions have been used to model membrane budding in a biological system.

References

Hyperstructures:

Bigraphs:

  • R Milner, Bigraphs and Their Algebra, Electronic Notes in Theoretical Computer Science, Volume 209, 24 April 2008, Pages 5-19, link

  • J Krivine, R Milner, A Troina, Stochastic Bigraphs, Electronic Notes in Theoretical Computer Science Volume 218, 22 October 2008, Pages 73-9, link

Relations between $\lambda$-calculus, $\pi$-calculus and biology:

  • S Federhen, Replication is Recursion; or, Lambda: the Biological Imperative, bioRxiv

  • A Regev, W Silverman, E Shapiro, Representation and simulation of biochemical processes using the pi-calculus process algebra, Pacific Symposium on Biocomputing 6:459-470 (2001), pdf

  • See the blog post by John Baez: Biology and the Pi-Calculus

  • M Stay, LG Meredith, Higher category models of the pi-calculus, arXiv:1504.04311

Other references

A general discussion on using category theory for biology can be found on the $n$-category café:

Phylogenomics:

  • L Pachter, B Sturmfels, The mathematics of phylogenomics, math/0409132

Cell biology:

  • V Noel, D Grigoriev, S Vakulenko, O Radulescu, Tropical geometries and dynamics of biochemical networks. Application to hybrid cell cycle models, pdf

Systems biology:

  • R Rosen, The representation of biological systems from the standpoint of the theory of categories, Bulletin of Mathematical Biophysics, Vol. 20, 1958, pdf

  • IC Baianu, JF Glazebrook and R Brown, A Category Theory And Higher Dimensional Algebra Approach To Complex Systems Biology, Meta-systems And Ontological Theory Of Levels: Emergence Of Life, Society, Human Consciousness And Artificial Intelligence, text

Neuroscience:

  • According to Mikhail Gromov, the mathematical structures nearest to what is happening in the mind are n-categories. See his talk: Ergologic and Interfaces Between Languages

  • AC Ehresmann and PL Simeonov, WLIMES: Towards a theoretical framework for wandering logic intelligence memory evolutive systems. In PL Simeonov, LS Smith and AC Ehresmann, editors, Integral Biomathics: Tracing the Road to Reality. Springer-Verlag, 2012.

  • AC Ehresmann and J-P Vanbremeersch, The memory evolutive systems as a model of Rosen’s organisms. Axiomathes, 16:165–214, 2006.

  • AC Ehresmann and J-P Vanbremeersch. Memory Evolutive Systems: Hierarchy, Emergence, Cognition, volume 4 of Studies in Multidisciplinarity. Elsevier, 2007.

  • AC Ehresmann, N Baas, and J-P Vanbremeersch. Hyperstructures and memory evolutive systems. Intern. J. Gen. Sys., 33(5):553–568, 2004.

  • D Pastor, E Beurier, AC Ehresmann, R Waldeck, Interfacing biology, category theory and mathematical statistics, pdf

Last revised on March 24, 2023 at 12:24:32.