nLab
internal logic

Idea

One of the most important observations of category theory is that large parts of mathematics can be internalized in any category with sufficient structure. The most basic examples of this involve algebraic structures; for instance, a group can be defined in any category with finite products, and an internal category can be defined in any ambient category with pullbacks. For such algebraic (or even essentially algebraic) structures, which are defined by operations with equational axioms imposed, it suffices for the ambient category to have (usually finite) limits.

However, it turns out that if we assume some additional structure on the ambient category, then much more of mathematics can be internalized, potentially including fields, local rings, finite sets, topological spaces, even the field of real numbers. The idea is to exploit the fact that all mathematics can be written in the language of logic, and seek a way to internalize logic in a category with sufficient structure.

The basic idea of the internal logic induced by a given category C is this:

  • the objects A of C are regarded as collections of things of a given type A;

  • a subobject ϕ ↪ A is regarded as a proposition (predicate), by thinking of it as the sub-collection of all those things of type A for which the statement ϕ is true.

    • the maximal subobject is hence the proposition that is always true; this is the logical object of truth ⊤ ↪ A;

    • the minimal subobject is hence the proposition that is always false; this is the logical object of falsity ⊥ ↪ A;

    • one proposition implies another if as subobjects of A they are connected by a morphism in the poset of subobjects: ϕ ⇒ ψ means ϕ ⊆ ψ.

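  • Logical operations are implemented by universal constructions on subobjects: for example, conjunction is the intersection (meet) of subobjects and disjunction is their union (join), and so on.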

(There are other sorts of internalization that do not fit in this general framework. For example, a monoid internal to a monoidal category is not an example of the sort of internalization discussed here; the framework we consider here can only deal with monoids in cartesian monoidal categories, or at least cartesian multicategories. However, linear logic holds out the hope of some reconciliation between the two. See internalization for a discussion of the more general notion in the context of doctrines.)

Internal first-order logic

Kinds of internal logics in different categories

There is a hierarchy of types of logical theories, each of which corresponds to a type of category in which such theories can be internalized.

Theory                                          | Category
finite limit (aka “left exact” or “cartesian”)  | finitely complete category
regular                                         | regular category
coherent                                        | coherent category
disjunctive                                     | lextensive category (aka finitary disjunctive category)
geometric                                       | infinitary coherent category (aka geometric category)
first-order                                     | Heyting category
dependent types                                 | locally cartesian closed category
higher order                                    | elementary topos

Each type of logic up through “geometric” can also be described in terms of sketches. Not coincidentally, the corresponding types of category up through “geometric” fit into the framework of familial regularity and exactness. Sketches can also describe theories applicable to categories not even having all finite limits, such as finite product sketches or finite sum sketches, but the logical approach taken here seems to require at least finite limits.

Theories

For purposes of this page, what we mean by a theory is a type theory. This entails the following.

  • The signature of the theory consists of

    • Various types A,B,C. For example, the theory of a group has only one type (group elements), but the theory of a-ring-and-a-module has two types (ring elements and module elements). There are also generally type constructors that build new types from basic ones, such as product types A×B and the unit type 1.

    • The theory will also generally contain function symbols such as f:A→B, each with a source and target that are types. For example, the theory of a monoid has one type M, one function symbol m:M×M→M, and one function symbol e:1→M. Function symbols of source 1 are also called constants.

    • The theory may also contain relation symbols R:A, each equipped with a type. For example, the theory of a poset has one type P and one relation ≤:P×P. The most basic relation symbol, which most theories contain, is equality =_A : A×A on a type A.

  • Finally the theory may contain logical axioms of the form Γ | φ ⊢ ψ. Here φ and ψ are first-order formulas built up from terms and relation symbols using logical connectives and quantifiers such as ⊤, ∧, ∨, ⊥, ⇒, ¬, ∀, ∃, and Γ is a context which declares the type of every variable occurring in φ and ψ.

For example, the theory of a group has one type G, three function symbols m:G×G→G, i:G→G, and e:1→G, and axioms

$$
\begin{array}{l}
x:G,\, y:G,\, z:G \;|\; \top \vdash m(m(x,y),z) = m(x,m(y,z)) \\
x:G \;|\; \top \vdash m(x,i(x)) = e \;\wedge\; m(i(x),x) = e \\
x:G \;|\; \top \vdash m(x,e) = x \;\wedge\; m(e,x) = x.
\end{array}
$$

This is an equational theory, meaning that each axiom is just one or more equations between terms that must hold in a given context. For a different sort of example, the theory of a poset has one type P, one relation ≤:P×P, and axioms

$$
\begin{array}{l}
x:P \;|\; \top \vdash x\le x \\
x:P,\, y:P \;|\; x\le y \;\wedge\; y\le x \vdash x=y \\
x:P,\, y:P,\, z:P \;|\; x\le y \;\wedge\; y\le z \vdash x\le z.
\end{array}
$$

Categorical semantics

Now suppose that we have a category C with finite limits and we want to interpret such a theory internally in C. We identify the aspects of the theory with structures in the category by what is called categorical semantics:

First, for each type in the theory we choose an object of C. Then for each function symbol in the theory we choose a morphism in C. And finally, for each relation in the theory we choose a subobject in C. (We always interpret the relation of equality on a type A by the diagonal A↪A×A in C.) Thus, for example, to interpret the theory of a group in C we must choose an object G and morphisms m:G×G→G, i:G→G, and e:1→G, while to interpret the theory of a poset, we must choose an object P and a subobject [≤]↪P×P.

Of course, this is not enough; we need to say somehow that the axioms are satisfied. We first define, inductively, an interpretation of every term that can be constructed from the theory by a morphism in C. For example, given an object G and a morphism m:G×G→G, there are two evident morphisms G×G×G→G which are the interpretations of the two terms m(m(x,y),z) and m(x,m(y,z)).

We then define, inductively, an interpretation of every logical formula that can be constructed from the theory by a subobject in C. The idea is that if x:A is a variable of type A and φ(x) is a formula with x as its free variable, then the interpretation of φ(x) should be the “subset” {x∈A | φ(x)} of A. The base case of this induction is that if t is a term interpreted by a morphism A→B and R:B is a relation symbol, then R(t) is interpreted by the pullback of the chosen subobject R↪B representing R along the morphism t:A→B. The building blocks of logical formulas then correspond to operations on the posets Sub(A) of subobjects in C, as follows.

Logical operator               | Operation on Sub(A)
conjunction: ∧                 | intersection (pullback)
truth: ⊤                       | top element (A itself)
disjunction: ∨                 | union
falsity: ⊥                     | bottom element (strict initial object)
implication: ⇒                 | Heyting implication
existential quantification: ∃  | left adjoint to pullback
universal quantification: ∀    | right adjoint to pullback

The fact that existential and universal quantifiers can be interpreted as left and right adjoints to pullbacks was first realized by Bill Lawvere. One way to realize that it makes sense is to notice that in Set, the image of a subset R⊆A under a function f:A→B can be defined as

$$ \{b\in B \;|\; (\exists a\in A)(a\in R \;\wedge\; f(a)=b)\}, $$

while its “dual image” (the right adjoint to pullback) can be defined as

$$ \{b\in B \;|\; (\forall a \in A)(f(a)=b \Rightarrow a\in R)\}. $$
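
The two formulas above can be checked by brute force on small finite sets. The following is only an illustrative sketch in Python: the sets A and B, the function f, and the helper names `pullback`, `exists_f` and `forall_f` are made up for this example and do not come from any library. It verifies, for every pair of subsets, that the image is left adjoint and the dual image is right adjoint to preimage.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def pullback(f, A, S):
    """f^*(S): the preimage of S (a subset of B) under f, as a subset of A."""
    return frozenset(a for a in A if f[a] in S)

def exists_f(f, B, R):
    """Image: all b such that some a in R has f(a) = b."""
    return frozenset(b for b in B if any(f[a] == b for a in R))

def forall_f(f, A, B, R):
    """Dual image: all b such that every a with f(a) = b lies in R."""
    return frozenset(b for b in B if all(a in R for a in A if f[a] == b))

# Hypothetical example data: "r" has an empty fibre under f.
A = frozenset({0, 1, 2, 3})
B = frozenset({"p", "q", "r"})
f = {0: "p", 1: "p", 2: "q", 3: "q"}

# exists_f(R) <= S  iff  R <= f^*(S)      (left adjoint)
# f^*(S) <= R       iff  S <= forall_f(R) (right adjoint)
adjunctions_hold = all(
    (exists_f(f, B, R) <= S) == (R <= pullback(f, A, S))
    and (pullback(f, A, S) <= R) == (S <= forall_f(f, A, B, R))
    for R in powerset(A) for S in powerset(B)
)
print(adjunctions_hold)
```

Note that an element of B with empty fibre always belongs to the dual image, since the universal condition is then vacuous.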

Of course, not every finitely complete category C admits all of these operations on subobjects. Moreover, in order for the relationship with logic to be well-behaved, any of these operations we make use of must be stable under (preserved by) pullbacks. (Pullbacks of subobjects correspond to “innocuous” logical operations such as adding extra unused variables, duplicating variables, and so on, so they should definitely not affect the meaning of the logical connectives. However, in linear logic such operations become less innocuous.)

In any category with finite limits, the posets Sub(A) always have finite intersections (given by pullback), including a top element (given by A itself). Thus in any such category, we can interpret logical theories that use only the connectives ∧ and ⊤. This includes both the theories of groups and posets considered above.

In a regular category, the existence of pullback-stable images implies that the base change functor f*:Sub(B)→Sub(A) along any map f:A→B has a left adjoint, usually written ∃_f, and that these adjoints “commute with pullbacks” in an appropriate sense (given by the Beck-Chevalley condition). Thus, in a regular category we can interpret any theory in so-called regular logic, which uses only ∧, ⊤, and ∃.

Actually, some instances of ∃ can be interpreted in any category with finite limits: if f is itself a monomorphism, then f* always has a left adjoint simply given by composition with f. On the logical side, this means that we can interpret “provably unique existence” in any category with finite limits. Logic with ∧, ⊤, and “provably unique existence” is called cartesian logic or finite-limit logic.

A coherent category is basically defined to be a regular category in which the subobject posets additionally have pullback-stable finite unions. Thus, in a coherent category we can interpret so-called coherent logic, which adds ∨ and ⊥ to regular logic. Likewise, in an infinitary-coherent (or “geometric”) category we can interpret geometric logic, which adds infinitary disjunctions ⋁_i φ_i to coherent logic. Geometric logic is especially important because it is preserved by the inverse image parts of geometric morphisms, and because any geometric theory has a classifying topos.

On the other hand, in a lextensive category, we do not have images or all unions, but if we have two subobjects of A which are disjoint (their intersection is initial), then their coproduct is also their union in Sub(A). Therefore, in a lextensive category we can interpret disjunctive logic, which is cartesian logic plus ⊥ and “provably disjoint disjunction.” Likewise, in an infinitary-lextensive category we can interpret “infinitary-disjunctive logic.”

Finally, in a Heyting category the base change functors f*:Sub(B)→Sub(A) also have right adjoints, usually written ∀_f, and it is easy to see that this implies that each Sub(A) is also a Heyting algebra, hence has an “implication” ⇒ as well. (We define “negation” by ¬φ ≡ (φ⇒⊥).) Thus, in a Heyting category we can interpret all of (finitary, first-order) intuitionistic logic.

Now that we know how to interpret logic, we can say that a model of a given theory in C consists of a choice of objects, morphisms, and subobjects for the types, function symbols, and relation symbols as above, such that for each axiom Γ | φ ⊢ ψ, we have [φ] ≤ [ψ] in Sub([Γ]). Here, [Γ] is the product of the objects that correspond to the types of the variables in Γ, [φ] and [ψ] are the interpretations of the formulas φ and ψ as subobjects of [Γ], and ≤ is the relation of subobject inclusion.

It is easy to verify that a model of the theory of a group in C is precisely an internal group object in C, as usually defined. For instance, the validity of the axiom

$$ x:G,\, y:G,\, z:G \;|\; \top \vdash m(m(x,y),z) = m(x,m(y,z)) $$

means that the equalizer of the two morphisms G×G×G→G must be all of G×G×G, or equivalently that those two morphisms must be equal. The same happens in most other cases.

Soundness and internal reasoning

Internal logic is not just a way to concisely describe internal structures in a category, but also gives us a way to prove things about them by “internal reasoning.” We simply need to verify that the “usual” methods of logical reasoning (for example, from φ⊢ψ and ψ⊢χ deduce φ⊢χ) are internally valid, in the sense that if the premises are satisfied in some model C (in the example, if [φ]≤[ψ] and [ψ]≤[χ]) then so is the conclusion (in the example, [φ]≤[χ]). This is called the Soundness Theorem.

It then follows that if we start from the axioms of a theory and “reason normally” within type theory, which in practice amounts to pretending that the types are sets, the function symbols are functions, and the relation symbols are subsets, then anything we prove will still be true when the theory is interpreted in an arbitrary category, not just Set. For example, by easy equational reasoning from the theory of a group, we can prove that inverses are unique, which is expressed by the logical sequent

$$ x:G,\, y:G,\, z:G \;|\; m(x,y)=e \;\wedge\; m(x,z)=e \vdash y=z. $$

It follows that this is also true, suitably interpreted, as a statement about internal group objects in any category.
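
To make the validity condition [φ] ≤ [ψ] concrete, here is a small sketch in Python for the special case C = Set, where subobjects are ordinary subsets. The choice of the cyclic group Z/3 as the model, and all variable names, are assumptions made only for this illustration. The code interprets the context x:G, y:G, z:G as the product G×G×G, interprets the two sides of the uniqueness-of-inverses sequent as subsets of it, and checks the inclusion.

```python
from itertools import product

# Hypothetical model in Set: the cyclic group Z/3 under addition.
n = 3
G = range(n)
e = 0

def m(x, y):
    """Interpretation of the function symbol m: G x G -> G."""
    return (x + y) % n

# [Gamma] for the context x:G, y:G, z:G is the product G x G x G.
context = list(product(G, repeat=3))

# [phi]: the subset where m(x,y) = e and m(x,z) = e.
phi = {(x, y, z) for (x, y, z) in context if m(x, y) == e and m(x, z) == e}
# [psi]: the subset where y = z (pullback of the diagonal).
psi = {(x, y, z) for (x, y, z) in context if y == z}

# The sequent is valid in this model iff [phi] is contained in [psi].
sequent_valid = phi <= psi

# Associativity: [top] is all of [Gamma], so validity means the
# equalizer of the two composites is everything.
assoc = {(x, y, z) for (x, y, z) in context if m(m(x, y), z) == m(x, m(y, z))}
assoc_valid = set(context) <= assoc

print(sequent_valid, assoc_valid)
```

In a general category the same check is phrased with subobjects and pullbacks rather than with elements, but the shape of the condition is the same.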

There are (at least) three caveats. Firstly, we must take care to use only the rules appropriate to the fragment of logic that is valid in the particular categories we are interested in. For example, if we want our conclusions to be valid in any regular category, we must restrict ourselves to reasoning “within regular logic.” Most mathematicians are not familiar with making such distinctions in their reasoning, but in practice most things one would want to say about a regular theory turn out to be provable in regular logic. (We will not spell out the details of what this means.) And once we are in a Heyting category, and in particular in a topos, this problem goes away and we can use full first-order logic.

The second, more important, caveat is that the internal logic of all these categories is, in general, constructive. This means that, among other things, the interpretation of ¬¬φ is, in general, distinct from that of φ, and that φ∨¬φ is not always valid. So even if we believe that classical logic (including the principle of excluded middle and even the axiom of choice) is “true,” as many mathematicians do, there is still a reason to look for proofs that are constructively acceptable, since it is only these which are valid in the internal logic of most categories. If the category is Boolean and/or satisfies the internal axiom of choice, however, then this problem goes away, but these fail in many categories in which one wants to internalize (such as many Grothendieck toposes).

The third caveat is that one must take care to distinguish the internal logic of a category from what is externally true about it. In general, internal validity is “local” truth, meaning things which become true “after passing to a cover.” This is particularly important for formulas involving disjunction and existence. For example, an object X being projective in the category C is a different statement from its being internally projective, meaning that “X is projective” is true in the internal logic. Another good example can be found in the different notions of finite object in a topos. This problem goes away if the ambient category is well-pointed, but well-pointed categories are even rarer than Boolean ones satisfying choice; the only well-pointed Grothendieck topos is Set itself.

Completeness, syntactic categories, and Morita equivalence

The converse of the Soundness Theorem is called the Completeness Theorem, and states that if a sequent φ⊢ψ is valid in every model of a theory, then it is provable from that theory. This is noticeably less trivial. In classical first-order logic, where the only models considered are set-valued ones, the completeness theorem is usually proven using ultraproducts. However, in categorical logic there is a more elegant approach (which additionally no longer depends on any form of the axiom of choice). From any theory T we can construct a category, called its syntactic category C_T, containing a “generic” model of the theory, in which the valid sequents are precisely those provable from the theory. It turns out that C_T is precisely the category of contexts described on the page context. Therefore, if a sequent is valid in all models, it is also valid in the generic model in C_T, and hence provable from T.

The syntactic category C_T also has a universal property: any T-model in a category D with suitable structure (regular, coherent, etc.) is the image of the generic model in C_T under an essentially unique functor (of the appropriate type) C_T→D. In other words, the generic model is also the initial model. This can often be useful; for instance, sometimes one can prove something about the generic model and then carry it over to all models.

Furthermore, if T lives in a sub-fragment of geometric logic (such as regular, coherent, lextensive, or geometric logic), then the Grothendieck topos of sheaves on C_T for its appropriate (regular, coherent, extensive, or geometric) coverage contains a T-model which is generic for models in Grothendieck toposes: any T-model in a Grothendieck topos is its image under the inverse image of a unique geometric morphism. This topos is called the classifying topos of the theory.

The syntactic category of a theory can be considered as the “extensional essence” of that theory, since functors out of C_T completely determine the T-models in any category D with suitable structure. It therefore makes sense, in some contexts, to define a morphism of theories to be a functor between their syntactic categories, and an equivalence of theories (sometimes called a Morita equivalence) to be an equivalence between their syntactic categories.

A morphism T→T′ between theories, in this sense, induces a functor from T′-models in D to T-models in D, for any category D with suitable structure, in a way which is natural in D. In particular, theories which are “Morita equivalent” in this sense have naturally equivalent categories of models in all categories D with suitable structure; so they have the same “meaning” even though they may be presented quite differently. (Note that this is a much stronger sort of equivalence than merely having equivalent categories of models in some particular category, such as Set.) Moreover, the fact that the syntactic category is defined “syntactically” means that a morphism T→T′ actually induces a “translation” of the types, functions, and relations of T into those of T′.

By first applying various “completion” processes to syntactic categories before asking about equivalence, we obtain coarser notions of equivalence, which only induce equivalences of models in more restricted sorts of categories. For instance, if we compare the exact completions of syntactic categories of regular theories, we obtain a notion of equivalence that induces equivalences of categories of models in all exact categories (not necessarily all regular ones). Likewise for coherent theories and pretoposes, and for geometric theories and infinitary pretoposes. Note, though, that the infinitary-pretopos completion of a (small) geometric theory is in fact already a (Grothendieck) topos, and coincides with the classifying topos considered above. Thus, passage to classifying toposes is also an instance of this construction, and an equivalence of classifying toposes means that two theories have equivalent categories of models in all toposes. (This is still much stronger than just having equivalent categories of models in Set.)

Kripke–Joyal semantics

(to be written…)

Higher-order logic

(to be written…)

But see Mitchell–Bénabou language for the version in a topos.

Examples

Internal logic in Set

The topos Set in classical mathematics of course has as its internal logic the “ordinary” logic. This is reproduced by following the abstract nonsense as follows:

The terminal object of Set is the one-element set *, and the subobject classifier in Set is the two-element set Ω={true,false} equipped with the map

$$ T : {*} \to \Omega $$

that picks the element true in Ω. The Heyting algebra of subobjects of the terminal object is the poset

$$ L = \{ \emptyset \hookrightarrow {*} \} $$

consisting only of the two trivial subobjects of *, the empty set and the point itself, and the unique inclusion morphism between them. These are classified, respectively, by the truth values false: *→Ω and true: *→Ω, so that we can also write our poset of subobjects of the terminal object as

$$ L = \{ false \to true \} \,. $$

The logical operation ∧ = AND is the product in the poset L. Indeed we find pullback diagrams in L

$$
\begin{array}{ccc}
true \times true = true & \to & true \\
\downarrow & & \\
true & &
\end{array}
\qquad
\begin{array}{ccc}
true \times false = false & \to & false \\
\downarrow & & \\
true & &
\end{array}
\qquad
\begin{array}{ccc}
false \times false = false & \to & false \\
\downarrow & & \\
false & &
\end{array}
\,.
$$

The logical operation ∨ = OR is the coproduct in the poset L. Indeed we find pushout diagrams in L

$$
\begin{array}{ccc}
 & & true \\
 & & \downarrow \\
true & \to & true \coprod true = true
\end{array}
\qquad
\begin{array}{ccc}
 & & false \\
 & & \downarrow \\
true & \to & true \coprod false = true
\end{array}
\qquad
\begin{array}{ccc}
 & & false \\
 & & \downarrow \\
false & \to & false \coprod false = false
\end{array}
\,.
$$

The logical operation ¬ = NOT is given by the internal hom into the initial object in L:

$$ \neg = hom(-, false) : L^{op} \to L \,. $$

We find the value of the internal hom by its defining adjunction. For hom(true,false) we have

$$ Hom_L(true, hom(true,false)) \simeq Hom_L(true \times true, false) = Hom_L(true, false) = \emptyset $$

and

$$ Hom_L(false, hom(true,false)) \simeq Hom_L(false \times true, false) = Hom_L(false, false) = {*} $$

from which we deduce that

$$ hom(true,false) = false \,. $$

Similarly for hom(false,false) we have

$$ Hom_L(true, hom(false,false)) \simeq Hom_L(true \times false, false) = Hom_L(false, false) = {*} $$

and

$$ Hom_L(false, hom(false,false)) \simeq Hom_L(false \times false, false) = Hom_L(false, false) = {*} $$

from which we deduce that

$$ hom(false,false) = true \,. $$

This way all the familiar logical operations are recovered from the internal logic of the topos Set.
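
The computation above can be replayed mechanically. The following Python sketch is only an illustration, with function names chosen for this example: it encodes the two-element poset L, defines meet and join as the greatest lower bound and least upper bound, and defines the internal hom by its adjunction, as the greatest c with c ∧ a ≤ b. It relies on the fact that Python orders False below True, which happens to agree with the order of L.

```python
L = [False, True]  # the poset {false -> true}

def leq(a, b):
    """a <= b in L: there is a (unique) morphism a -> b."""
    return (not a) or b

def meet(a, b):
    """Product in L: the greatest c with c <= a and c <= b."""
    return max(c for c in L if leq(c, a) and leq(c, b))

def join(a, b):
    """Coproduct in L: the least c with a <= c and b <= c."""
    return min(c for c in L if leq(a, c) and leq(b, c))

def hom(a, b):
    """Internal hom via the adjunction: greatest c with meet(c, a) <= b."""
    return max(c for c in L if leq(meet(c, a), b))

def neg(a):
    """Negation as internal hom into the initial object false."""
    return hom(a, False)

for a in L:
    for b in L:
        print(a, b, meet(a, b), join(a, b), hom(a, b))
print(neg(True), neg(False))
```

The resulting tables coincide with Python's built-in `and`, `or` and `not` on booleans, which is the statement that the internal logic of Set is the ordinary two-valued one.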

Internal logic in a sheaf topos on open subsets

Let X be a topological space, Op(X) its category of open subsets, and Sh(X):=Sh(Op(X)) the Grothendieck topos of sheaves on X.

We discuss the internal logic of this sheaf topos (originally Tarski, 1938).

The terminal object is the sheaf represented by X: the one that is constant on the one-element set

$$ X : U \mapsto {*} \,. $$

The subobjects of this object are the representable presheaves

$$ hom(-,V) : U \mapsto \begin{cases} {*} & \text{if } U \subset V \\ \emptyset & \text{otherwise} \end{cases} $$

for V∈Op(X).

Remark

In the presheaf topos PSh(Op(X))=Func(Op(X)^op,Set), the subobjects of 1 are arbitrary sieves in Op(X), not just representables. For instance, for any two open sets U and V there is a sieve consisting of all open sets contained in either U or V, which doesn’t necessarily contain U∪V. It’s only in the sheaf topos Sh(X) that the representables are precisely the subobjects of 1.

The poset of subobjects formed by these is just the category of open subsets itself:

$$ L = Op(X) \,. $$

  • The logical operation AND is the product in Op(X): this is the intersection of open subsets.

  • The logical operation OR is the coproduct in Op(X): this is the union of open subsets.

  • The internal hom in Op(X) is given by

    $$ hom(U,V) = (U^c \cup V)^\circ $$

    (the interior of the union of the complement of U with V).

    So negation is given by sending an open subset to the interior of its complement:

    $$ \neg U = hom(U,\emptyset) = (U^c \cup \emptyset)^\circ = (U^c)^\circ \,. $$

In particular we find that in the internal logic of Sh(X) the law of the excluded middle fails in general, as in general we do not have that

$$ (\neg U) \vee U = true $$

because ¬U ∨ U = (U^c)° ∪ U = X \ ∂U is the total space X without the boundary (frontier) of U, and not true = X, all of the total space.

Thus, the internal logic of this sheaf topos is (in general) intuitionistic logic. As remarked above, this is the case in many toposes.
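
The failure of excluded middle can already be seen on a two-point space. The sketch below, in Python, uses the Sierpiński space as a hypothetical example (the source does not single out any particular space), and the helper names are chosen for this illustration only. It implements the internal hom of Op(X) as the interior of the union of the complement of U with V, checks the defining adjunction on all opens, and then evaluates ¬U ∪ U and ¬¬U for the open point.

```python
# Sierpinski space: X = {0, 1} with open sets {}, {1}, X (an assumed example).
X = frozenset({0, 1})
opens = [frozenset(), frozenset({1}), X]

def interior(S):
    """Largest open set contained in S: the union of all opens inside S."""
    return frozenset().union(*[U for U in opens if U <= S])

def implies(U, V):
    """Internal hom in Op(X): interior of (complement of U) union V."""
    return interior((X - U) | V)

def neg(U):
    """Negation: interior of the complement of U."""
    return implies(U, frozenset())

# Defining adjunction: W & U <= V  iff  W <= implies(U, V), for all opens.
heyting_adjunction = all(
    ((W & U) <= V) == (W <= implies(U, V))
    for U in opens for V in opens for W in opens
)

U = frozenset({1})
excluded_middle = neg(U) | U   # would equal X if excluded middle held
double_negation = neg(neg(U))  # would equal U if double negation held

print(heyting_adjunction, excluded_middle == X, double_negation == U)
```

Here the closure of U = {1} is all of X, so the boundary of U is the point 0, and ¬U ∪ U is X with that boundary point removed, in agreement with the formula above.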

References

  • Most books on topos theory develop some internal logic, at least in the context of a topos. For example:

  • Saunders Mac Lane, Ieke Moerdijk, Sheaves in Geometry and Logic

  • Goldblatt, “Topoi: the categorial analysis of logic”

  • Part D of Sketches of an Elephant is comprehensive.

  • “Categorical Logic and Type Theory” by Jacobs works in the even more general context of fibrations, allowing us to associate to each object A an arbitrary poset instead of Sub(A).

  • Paul Taylor's book Practical Foundations of Mathematics is arguably all about this subject (although you wouldn't know it until about Chapter VIII), but from a different perspective. In particular, Taylor allows us to replace having all pullbacks with pullbacks along a pullback-stable class of display morphisms.

A discussion of dependent type theory as the internal language of locally cartesian closed categories is in

  • R. A. G. Seely, Locally cartesian closed categories and type theory, Math. Proc. Camb. Phil. Soc. (1984) 95

The observation that the poset of open subsets of a topological space serves as a model for intuitionistic logic is apparently originally due to

  • Alfred Tarski, Der Aussagenkalkül und die Topologie, Fundamenta Mathematicae 31 (1938), pp. 103–134.