Contents

Linguistics

# Contents

## Overview

Pregroup grammar is a mathematical model of natural language grammar introduced by Lambek (1999); it is part of the categorial grammar tradition.

## Pregroups

### Definition

A preordered monoid is a tuple $(P, \leq, \cdot, 1)$ where $(P, \leq)$ is a preorder and $(P, \cdot, 1)$ is a monoid such that the multiplication is monotone, i.e.

• $a \leq c \wedge b \leq d \implies a \cdot b \leq c \cdot d \quad$ (monotony)

or equivalently:

• $b \leq d \implies a \cdot b \cdot c \leq a \cdot d \cdot c \quad$ (substitution)

for all $a, b, c, d \in P$. A pomonoid is a preordered monoid that is furthermore a poset, i.e.

• $a \leq b \leq a \implies a = b \quad$ (antisymmetry)

A pregroup is a pomonoid such that every object $t \in P$ has left and right adjoints $t^l, t^r \in P$. Explicitly we have the following four axioms:

• $t^l \cdot t \leq 1, \quad t \cdot t^r \leq 1 \quad$ (contraction)
• $1 \leq t \cdot t^l, \quad 1 \leq t^r \cdot t \quad$ (expansion)

In other words, a pregroup is a rigid monoidal category which is thin and skeletal. Note that a commutative pregroup is simply an ordered group. Here are a few variants of pregroups that appear in the literature, see Coecke 2013:

• If we drop the anti-symmetry axiom for posets, we get a quasi-pregroup. These are useful as they avoid the equality $a \leq a \cdot a^l \cdot a \leq a \implies a = a \cdot a^l \cdot a$ for all $a \in P$.
• If we drop the two expansion axioms, we get a protogroup.
• If we drop the axioms for the right adjoints, we get a left pregroup. Symmetrically, if we drop the left adjoints we get a right pregroup.

### Monotone unbounded functions

Take the natural numbers $\mathbb{N}$ and define the set $P$ of monotone functions $f : \mathbb{N} \to \mathbb{N}$ that are furthermore unbounded, i.e. for all $n \in \mathbb{N}$ there is an $m \in \mathbb{N}$ with $n \leq f(m)$. $P$ is a pomonoid with composition as product, identity as unit and $f \leq g$ iff $f(n) \leq g(n)$ for all $n \in \mathbb{N}$. To show that $P$ is a left pregroup, define:

$f^l(m) = \text{min} \{n \in \mathbb{N} \vert m \leq f(n)\}$

we get:

$(f^l \cdot f)(m) = f^l(f(m)) = \text{min}\{n \in \mathbb{N} \vert f(m) \leq f(n)\} \leq m = 1(m)$

and

$1(m) = m \leq f(\text{min} \{ n \in \mathbb{N} \vert m \leq f(n)\}) = f(f^l(m)) = (f \cdot f^l) (m)$

for all $m \in \mathbb{N}$, i.e. $f^l \cdot f \leq 1 \leq f \cdot f^l$.

If we replace the natural numbers $\mathbb{N}$ by the integers $\mathbb{Z}$ then the set of unbounded monotone functions $f : \mathbb{Z} \to \mathbb{Z}$ is both a left and a right pregroup where:

$f^r(m) = \text{max} \{n \in \mathbb{N} \vert f(n) \leq m\}$

For example, if $f(n) = 2 n$ then $f^l(n) = \lceil \frac{n + 1}{2} \rceil \neq \lceil \frac{n}{2} \rceil = f^r(n)$.

Proposition (Buszkowski 2001): Every pregroup is a pregroup of functions, i.e. a sub-pomonoid of $X \to X$ for some poset $X$.

### Free pregroups and the switching lemma

In linguistic applications, one starts with a poset $B = \{ s, n, \dots \}$ of basic types (e.g. sentence and noun) and then takes the free pregroup $P_B$ generated by $B$. For a basic type $b \in B$, iterated adjoints $\dots, b^{ll}, b^l, b, b^r, b^{rr}, \dots \in P_B$ are called simple types; an arbitrary type $t \in P_B$ may then be given as a sequence of simple types. The following lemma makes the parsing problem for free pregroups decidable — i.e. given a sentence type $s \in P_B$ and the types for a sequence of words $t_1, \dots, t_n \in P_B$, is $t_1 \dots t_n \leq s$?

Switching lemma (Lambek 1999): For any pair of types $t, t' \in P_B$, if $t \leq t'$ then there is a type $t'' \in P_B$ such that $t \leq t''$ without expansions and $t'' \leq t'$ without contractions, i.e. free protogroups recognise the same language as free pregroups.

### From pregroups to compact 2-categories

In Preller, Lambek 2007, Preller and Lambek generalise the free pregroup construction above to free compact 2-categories in order to capture proof relevance. This allows to distinguish between the distinct parsings of ambiguous phrases such as “(men and women) who like maths” vs. “men and (women who like maths)”.

Note that the compact 2-categories of Lambek and Preller differ from compact closed 2-category in that they are not required to be symmetric. A non-symmetric compact closed 2-category with one object is simply a rigid monoidal category.

Given a poset of basic types $B$, the objects of the free rigid monoidal category $C_B$ are the same as that of the free (quasi-)pregroup, the arrows may be described as planar string diagrams. Given two types $t, t' \in P_B$, we have that $t \leq t'$ if and only if there is an arrow $r : t \to t'$ in $C_B$.

## Pregroup grammars as free rigid monoidal categories

A pregroup grammar is a tuple $G = (B, \Sigma, \Delta, s)$ where $B$ and $\Sigma$ are finite sets called the basic types and the vocabulary respectively, $\Delta \subseteq \Sigma \times P_B$ is a relation called the dictionary and $s \in P_B$ is a designated sentence type. We require that $\Delta$ is finite, i.e. the set of possible types $\Delta(w) \subseteq P_B$ is finite for each word $w \in \Sigma$. The language of $G$ is then given by:

$L(G) = \big\{ w_1 \dots w_n \in \Sigma^n \vert n \in \mathbb{N}, \quad \exists t \in \prod_{i \leq n} \Delta(w_i) \quad t \leq s \big\}$

i.e. a sequence of words $w_1 \dots w_n \in \Sigma^n$ is said to be grammatical whenever there is a dictionary entry $(w_i, t_i) \in \Delta$ for each word $w_i$ such that $t_1 \dots t_n \leq s$. One may simplify this by redefining $L(G) = \big\{ w_1 \dots w_n \in \Sigma^n \vert C_G(w_1 \dots w_n, s) \neq \emptyset \big\}$ where $C_G$ is the free rigid monoidal category with:

• generating objects the disjoint union $B + \Sigma$,
• generating arrows the dictionary entries $(w, t) \in \Delta$ with $dom(w, t) = w$ and $cod(w, t) = t$.

That is, a sequence of words is grammatical whenever there exists a string diagram going from the sequence of words to the sentence type. Note that traditionally the identity wires for words $w \in \Sigma$ are omitted, hence dictionary entries are depicted as triangles with no input. Contractions are depicted as cups, e.g. from $(Alice, n), (loves, n^r s n^l), (Bob, n) \in \Delta$ and $n n^r s n^l n \leq s$ we get the following diagram:

## Pregroup semantics as strong monoidal functors

One may give a semantics to a pregroup grammar $G = (B, \Sigma, \Delta, s)$ by defining a strong monoidal functor $F : C_G \to S$, where $C_G$ is the free rigid monoidal category described in section 2. $S$ is a suitable rigid monoidal category, e.g. $\text{FdVect}$ or $\text{Rel}$, depending on the application. Note the similarity with a Lawvere theory as a monoidal functor from a syntactic category to $\text{Set}$.

We require the image for all words $w \in \Sigma$ to be the monoidal unit $F(w) = I$, hence the image for each dictionary entry $(w, t) \in \Delta$ is given by a state $F(w, t) : I \to F(t)$. The meaning $F(r) : I \to F(s)$ for a sentence $w_1 \dots w_n \in \Sigma^n$ with grammatical reduction $r : w_1 \dots w_n \to s$ may then be computed from the individual meanings $F(w_i, t_i) : I \to F(t_i)$ of the words, following Frege's principle of compositionality. This has been developed in Preller 2005 as well as in a series of papers by Bob Coecke and others, see categorical compositional distributional semantics.

## Expressive power and extensions

Theorem (Buszkowski, Moroz 2008): For each pregroup grammar $G$, there is a context-free grammar $G'$ such that $L(G) = L(G')$. Furthermore, $G'$ may be computed in time polynomial in the size of $G$.

The opposite direction also holds, hence pregroup grammar and context-free grammar are said to be weakly equivalent: the translation preserves only the generated languages, it does not preserve the structure of syntax trees.

Since it is generally accepted that natural languages go beyond context-free languages this result illustrates the lack of expressive power of pregroup grammars whence it becomes necessary to enhance the original grammar model:

One way (Kobele 2005) is to consider product pregroup grammars $G_1\times G_2$: these assign to a string $w$ a tuple $(t_1,t_2)$ of syntactic types with grammaticality of $(w,(t_1,t_2))$ checked by parallel computation of the grammaticality of $(w,t_1)$ in $G_1$ and of $(w,t_2)$ in $G_2$. $(w,(t_1,t_2))$ is then grammatical iff $(w,t_1)$ and $(w,t_2)$ are i.e. $L(G_1\times G_2)= L(G_1)\cap L(G_2)$ (cf. Kobele&Kracht (2005), prop. 4). This permits to describe e.g. the context-sensitive rules of Swiss German (Lambek (2008)). The checking of syntactic types is still given by computation in a pregroup, namely the product pregroup $P_1\times P_2$, but $P_1\times P_2$ is not free!

Kobele and Kracht (2005) use this together with the above result by Buszkowski and the classical result in formal language theory that any type 0 language $L$ is the homomorphic image $h(L_1\cap L_2)$ of the intersection of two (deterministic) context-free languages $L_1,L_2$ in order to show that pregroup grammars generate every type 0 language when type checking is done in arbitrary pregroups.

Since it is usually assumed that natural languages do not go beyond mildly context-sensitive languages one sees that (homomorphic images) of products of arbitrary pregroup grammars have excessive power and it becomes necessary to cut back this power. One way to do this is by restricting $G_1$ to be of a type simpler than context-free e.g. a Dyck language? thought of now as a control language supervising the “context-free” computations in $G_2$. This idea was explored in Kobele (2005).

Another approach to adapt the pregroup grammar formalism to mildly context-sensitive languages was proposed by Genkin et al. (2010) which augment pregroup grammars with buffers.

• Joachim Lambek, Type Grammar Revisited, Logical Aspects of Computational Linguistics (1999)

• Anne Preller?, Category Theoretical Semantics for Pregroup Grammars, Logical Aspects of Computational Linguistics, 5th International Conference, LACL 2005 (pdf)

• Anne Preller?, Joachim Lambek, Free Compact 2-Categories, Mathematical Structures in Computer Science, Cambridge University Press (CUP), 2007 (pdf)

• Joachim Lambek, From Word to Sentence: A Computational Algebraic Approach to Grammar, Polimetrica 2008 (pdf)

• Wojciech Buszkowski?, Katarzyna Moroz?, Pregroup Grammars and Context-free Grammars, Computational Algebraic Approaches to Natural Language, Polimetrica (2008) (pdf)

• Wojciech Buszkowski?. Lambek grammars based on pregroups. Logical Aspects of Computational Linguistics, LNAI 2099: 95–109, 2001.

• Bob Coecke, An alternative Gospel of structure: order, composition, processes, Introductory chapter to C. Heunen, M. Sadrzadeh, and E. Grefenstette. Quantum Physics and Linguistics: A Compositional, Diagrammatic Discourse. Oxford University Press, 2013 (arxiv:1307.4038)

• Gregory M. Kobele?, Pregroups, Products, and Generative Power , handout Chieti workshop on pregroups May 2005.

• Gregory M. Kobele?, Marcus Kracht, On Pregroups, Freedom, and (Virtual) Conceptual Necessity , U. Penn Working Papers in Linguistics 10 no.1 (2005).

• Daniel Genkin?, Nissim Francez?, and Michael Kaminski?. Mildly Context-Sensitive Languages via Buffer Augmented Pregroup Grammars, In Essays in Memory of Amir Pnueli, 2010.