nLab
An Exercise in Kantization

Motivation

Recently, Johan Alm wrote a very neat paper

The paper proves an aspect of

suggested by Urs Schreiber. A summary of Alm’s paper with many motivating references is provided here.

Urs: Johan Alm meanwhile has a refined version of these notes, but some aspects still need more work and thinking. When things have stabilized further I’ll be happy to provide more details here.

Eric: Are the refined notes available?

This page represents an experiment similar to the Journal Club's effort to understand the papers

  • IntTrans David Ben-Zvi, John Francis, David Nadler, Integral transforms and Drinfeld Centers in Derived Geometry (arXiv)

  • CharTheo David Ben-Zvi, David Nadler, The Character Theory of a Complex Group (arXiv)

The goal is to build up, step-by-step, the background knowledge required to understand Alm’s paper. The finite model considered is simple enough that we hope to advance our understanding of elementary concepts that are mostly taken for granted by the experts here at n-Headquarters.

When we are done, we hope to have a complete, self-contained presentation that should be accessible to mildly sophisticated undergraduate physics and mathematics students.

We also hope that the experts will keep an eye on us and when we get stuck, might gently nudge us in the right direction. As always, this page is open to all. Any questions or contributions are more than welcome.

Collection of Concepts

Maybe we can start by skimming the paper and collecting some unfamiliar keywords so that we can begin the process of tracing back definitions.

References

Discussion

(We’ll pull out any outstanding questions from the discussion below to here, to make life a little easier for would-be angels who might be willing and able to help, without forcing them to read the entire discussion.)

Eric: What is a “component of a cocone”?

Urs Schreiber: where did you see that term used? Maybe the question (or its answer) belongs at colimit. Do you have an idea what a cocone itself is? It consists of lots of morphisms from the objects of a diagram to the cocone tip. If we regard the cocone as a natural transformation to a constant functor, then the components of that natural transformation are these single morphisms from objects to the tip of the cocone. These I would call “components of the cocone”.

Eric: It is in “Lemma 1” of Alm’s paper. I didn’t quite get the bit about “constant functor”, but thanks to a great Catsters video (General Limits and Colimits), I have an idea about cocones (via colimits).

Toby: Answered at cocone.
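
For readers who, like Eric, stumble over the “constant functor” phrasing, here is the standard definition written out. The symbols $J$, $C$, $F$, $c$, $\Delta_c$ and $\lambda$ are notation chosen for this sketch and are not taken from Alm’s paper.

```latex
% A diagram is a functor F : J \to C. For an object c of C, the constant
% functor \Delta_c : J \to C sends every object of J to c and every
% morphism of J to the identity on c.
\begin{align*}
\text{cocone with tip } c \;&:\quad \lambda : F \Rightarrow \Delta_c, \\
\text{components} \;&:\quad \lambda_j : F(j) \to c \quad \text{for each object } j \text{ of } J, \\
\text{naturality} \;&:\quad \lambda_{j'} \circ F(f) = \lambda_j \quad \text{for each } f : j \to j' \text{ in } J.
\end{align*}
```

The naturality square collapses to a triangle because $\Delta_c(f)$ is the identity on $c$. The “components of the cocone” are the individual morphisms $\lambda_j$.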

(Topics are separated by horizontal lines, with topics presented in reverse chronological order, i.e. the first section is the most recent.)

Eric: I’ve been reading (not exactly studying) Goldblatt and watching some Catster videos. I still feel like I am pretty far away from understanding Kan extension. John Baez is helping Mike Stay understand them on free cocompletion, which might be helpful to anyone trying to understand Alm’s paper.


Eric: “Let $\tilde{X}$ be the groupoid which has as objects pairs $(x,s)$, where $x\in M$ and $s\in\mathbf{R}$, and as morphisms $(x,s)\to(x',s')$ paths $\gamma:x\to x'\in\mathcal{P}_1(M)$ of proper time $\tau(\gamma)=s'-s$.”

Why do we consider groupoids here? I wouldn’t expect paths to have “inverses”, I’d expect them to have “adjoints” or maybe “opposites”.

Shouldn’t there be a partial order somewhere related to causality?

Toby: Good point. Especially since we're looking only at timelike (or lightlike on the boundary) paths here (those with proper time rather than proper distance), it would be quite natural to fit causality in here. Even if you want to allow for timelike loops, the individual paths can be classified as forward or backward under very weak orientability assumptions (and assuming only one timelike dimension, of course).

Eric: I keep coming back to the idea of reformulating this on a double category and relating it to a Feynman checkerboard somehow. Maybe even getting back to relating it to Position, Velocity, and Acceleration.


Eric: As I read Alm’s notes, I keep getting distracted thinking things like, “Why do they do that? What if they did this instead?” For example, why not

$e^{iS}:\mathcal{P}_1(M)\to \mathbf{B}U(1)$

? Then with $\rho:\mathbf{B}U(1)\to Vect$, we’d have

$\rho\circ e^{iS}:\mathcal{P}_1(M)\to Vect.$

That might make things a little clearer.
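
A toy sketch of this composite, under loud assumptions: a “path” is modelled as a hypothetical list of real numbers (one action value per segment), composition of paths is list concatenation, and `rho` is taken to be the defining one-dimensional representation of $U(1)$. None of these names or modelling choices come from Alm’s notes. The sketch only illustrates that an additive action gives a multiplicative phase, so the assignment respects composition.

```python
import cmath

# Hypothetical toy model: a path is a list of segments, and each entry is a
# real number standing in for the action of that segment.
def action(path):
    """Total action S of a path; additive under concatenation."""
    return sum(path)

def exp_i_action(path):
    """The phase e^{iS}, an element of U(1), i.e. a morphism of BU(1)."""
    return cmath.exp(1j * action(path))

def rho(phase):
    """Assumed representation: a phase acts on C by multiplication."""
    return lambda v: phase * v

gamma1 = [0.3, 1.1]           # a path x -> x'
gamma2 = [0.7]                # a path x' -> x''
composite = gamma1 + gamma2   # the concatenated path x -> x''

# Functoriality of e^{iS}: composite path goes to the product of phases.
lhs = exp_i_action(composite)
rhs = exp_i_action(gamma2) * exp_i_action(gamma1)
assert abs(lhs - rhs) < 1e-12

# Functoriality of rho after e^{iS}: composite path goes to the composite map.
v = 2.0 + 1.0j
step_by_step = rho(exp_i_action(gamma2))(rho(exp_i_action(gamma1))(v))
assert abs(rho(lhs)(v) - step_by_step) < 1e-12
```

Because $U(1)$ is abelian, the order of the two phases in the product does not matter here; for a non-abelian group it would.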

Now, I’m wondering if we should introduce double categories as a model for the Feynman checkerboard.


If any experts happen to wander by, Daniel would appreciate any clarification on these points (which should be considered as questions more than statements):

  • Using left and right Kan extensions simultaneously seems to be required because quantum mechanics needs symmetric, everywhere-defined operators.

    Urs: coinciding left and right Kan extensions might be relevant, but I don’t see this very clearly yet

  • The “everywhere” follows from the arbitrary path taken to carry a state.

  • “Symmetric” follows because both left and right are similar definitions, but they differ in the direction in which the natural functor maps.

  • The operator is the so-called Z, which the author calls the quantum theory.

  • Further, such an operator is self-adjoint. Using both of them at the same time makes the functorial definition of adjoint functor work as a self-adjoint functor because, as stated above, given that the natural map works both ways, you can define projection operators in antagonistic but equivalent ways.

    Urs: I don’t really see the point about self-adjointness made here –

    Daniel: This is also confusing to me, but since I am trying to think about quantum mechanics, I want to understand how a definition of Lan and Ran should induce a Hermitian state representation matrix of the operator Z. To see this, take the projection operators k and j of sections 1.1 and 1.2 to another diagram in which their roles are interchanged. The two diagrams are the same, except that now the components of the natural transformations rho and lambda will be reversed. Equivalently, it is like inverting the arrows in the definitions of the natural transformations and, because of this, exchanging the definitions of left and right Kan extensions. Given that, both pictures must describe the evolution of Z, and k and j end up in their dual spaces, because of Theorem 1. I don’t really know how to finish this, but it seems that we will see that Lan and Ran are self-adjoint operators.

  • So, using both of them leads to the proper definition of a normed operator, because you have antagonistic projections.

  • The definition of Kan extension forces all natural functors to filter uniquely all paths, which means that the outcome of a measurement will be unique. Since all possible configurations will be taken into consideration and projected, it seems reasonable to me to think that the operator will be unitary.

    Urs: I am not sure that I understand what this means

    Daniel: Forget about that. The fact that Kan extension filters all natural transformations of a given kind means that a matrix is invertible. The unitarity is ad hoc, but it seems that the operations in a proof of unitarity involving Kan extension will involve speaking about the filtering of natural transformations.


Daniel: On page 3, he writes: “Usually, one has a classical ‘action’ of some kind defined for manifolds with some extra structure, e.g. a riemannian metric, a symplectic form, a principal bundle, or etc. Quantization is what happens when one tries to assign that same action to a manifold that does not have that structure!” We can define many structures over a manifold, like a topology, a piecewise linear structure, a metric, etc., just like the layers of an onion. So, is quantization just summing over all possible new layers of every new structure? It looks that way.

Toby: That seems like a wild idea to me, but it does seem to be what he's saying. And what's really wild is that he just might be right!

Daniel: I am trying to understand why Kan extensions were used in that way to define quantization, and it occurred to me that listing how concepts are, speculatively, involved in both QM and Kan extensions might be useful. (Urs copied the questions I made to “Outstanding Questions” above, so, for the sake of brevity, I deleted them here.)

Eric: It looks like you’re making good progress! I’m still trying to understand Kan extensions. I’m even still shaky on functors :) My goal is to start with the linear map examples on functor, i.e. one-object full subcategories of FinVect and see what Kan extension means there. I wouldn’t be broken hearted if someone spoiled the fun and explained it.

Daniel: Actually no! I am trying to understand Kan extensions going backwards from QM! I am just waiting for someone to tell me if the above makes sense. Otherwise, I will remain as completely confused as I am now… Worse, people might take seriously what I wrote!

Eric: “I am trying to understand Kan extensions going backwards from QM!” I am REALLY happy to see statements like that. That is really what I hope to get out of this exercise too. If we can get to the point where the mathematical concept of “Kan extension” can actually be understood in terms of QM, that would be pretty cool.

Daniel: Yes, sure, but for that, I would just like anyone to tell me if what I wrote above makes sense… Otherwise, I am stuck.


Eric: There is something about Kan extension that reminds me of some kind of linear least squares. I started trying to make it explicit by building an example with finite-dimensional vector spaces. I placed the example at functor. It seems like left and right Kan extensions are like left and right inverses (which in my universe means linear least squares).

Daniel: Eric, I don’t understand why you wrote on functor specifically about linear least squares; it doesn’t seem to have any special relation to what I found on Wikipedia, other than being just a general observation about linear spaces.

Eric: Hi Daniel, have a look at John's explanation of Kan extension. What he described made me think of linear least squares (which has nothing to do with any Wikipedia article). I was trying to work out an example where Kan extension really did turn out to be linear least squares, but expressed rather in terms of left and right inverses. I may be completely wrong about the relation, but before I can judge, I needed functors between vector spaces thought of as one-object categories, hence the examples on functor. My goal is to get to the point where I feel fairly comfortable with Kan extension and then try to understand it in the context of Alm’s paper.

Toby: Can you explain what linear least squares is, then, if it has nothing to do with what's on that Wikipedia article? (which is the only linear least squares that I'm familiar with myself).

Eric: Hi Toby and Daniel, linear least squares is what you think it is and is what is on Wikipedia. I didn’t say what I wanted to say very clearly. And now I see I didn’t read Daniel’s question very carefully either. It’s the hazards of not having more than 5 minute spurts to pay attention to this stuff :O What I meant to say is that my thoughts about the relationship between linear least squares and Kan extension (which are nothing more than a gut feeling) were related to what John said and not about anything I read on Wikipedia. I didn’t mean to suggest that what Wikipedia said was not related or relevant. Of course it is. It was just a comment about what made me think of the idea. Sorry about that.

In my work, we often have a bunch of time series representing prices of financial securities. We also have time series of financial/economic factors and we try to explain the prices in terms of factors. There are not enough factors to find a true “inverse”, but we do “the best we can”. That spirit of doing “the best we can” sounded very close to what John was describing. That’s all I was trying to say.

Eric: Here’s how I think of it (which may be completely misguided). Given $F:C\to D$ and $p:C\to C'$, and an expression (in the linear case)

$F = F'\circ p + \epsilon.$

We want to find $F':C'\to D$ that minimizes $\epsilon$.

Question: How could you express this in a more general category-theoretic way?

Toby: OK, maybe I have a better idea what you mean now. In particular, just as there are left and right Kan extensions, there are really also two kinds of linear least squares. Looking at this picture from Wikipedia for reference, instead of minimising the (squares of the) lengths of the vertical lines marked, we could just as easily minimise the (squares of the) lengths of corresponding horizontal lines. It is a (sometimes rather arbitrary) distinction between dependent and independent variable that tells us which to use.

Also, looking at your example at functor, it occurs to me that at some point you'll have to deal with the fact that linear least squares doesn't happen in Vect; it happens in Hilb (or at least in Ban). That is, to minimise $\epsilon$, you've got to be able to measure the size (norm) of $\epsilon$. And that's where all of those transpose matrices in formulas for linear least squares get in.
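
To make the role of the transposes concrete, here is the usual least-squares solution written in the notation of the discussion. This is a sketch under stated assumptions: $F$, $p$ and $F'$ are real matrices, $\|\cdot\|$ is the Frobenius norm, and $p\,p^{T}$ is invertible. It is ordinary linear algebra and makes no claim that the result is a Kan extension.

```latex
% Assumptions: F (d x c), p (c' x c) and F' (d x c') are real matrices,
% the norm is the Frobenius norm, and p p^T is invertible.
\begin{align*}
\epsilon &= F - F' \circ p, \\
0 &= \frac{\partial \|\epsilon\|^2}{\partial F'} = -2\,(F - F' p)\,p^{T}, \\
F' &= F\,p^{T}\left(p\,p^{T}\right)^{-1}, \\
\epsilon\,p^{T} &= 0.
\end{align*}
```

The last line says that the residual is orthogonal to everything that factors through $p$, which is exactly where the inner product (and hence Hilb rather than Vect) enters.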

Anyway, I wouldn't rule out a priori the idea that linear least squares might be given by a Kan extension, but I'm not seeing it yet myself. But at least I see what is the analogy that is exciting you.

Eric: It would be neat if you could use an intrinsic property of categories, e.g. maybe Leinster measure, to minimize $\epsilon$ rather than depend on a metric within the objects of the category. I seem to remember something about “Hom” behaving like a metric in some circumstances (?)

Toby: Well, in a $2$-Hilbert space (h'm, doesn't exist yet, try searching This Week's Finds or the Café), $Hom$ behaves like the inner product in a Hilbert space. Of course if your category is a $2$-Hilbert space, then your objects are vectors whose components are themselves finite-dimensional Hilbert spaces, so you've still got metrics put in by hand. ($Fin Hilb$ itself is the primordial $2$-Hilbert space.) Still, perhaps something can be done just using the fact that you're in a dagger compact category.

And of course! another way in which $Hom$ can be like a metric is that, in a metric space (this is described at that link), the metric really is the hom-object operation of a certain enriched category. That's probably what you were thinking of. The enriching category is a poset (the poset of nonnegative real numbers under $\geq$), so minimising $\epsilon$ is now like taking a (co)limit. Say, maybe this will work out!
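
Spelled out, the enriched-category reading of a metric space looks as follows. This is the standard observation due to Lawvere; the symbols $X$ and $d$ are notation chosen for this sketch, and the enriching poset is taken to be $[0,\infty]$ with $\geq$ as its morphisms and $+$ as its tensor product.

```latex
% Enriching category: the poset ([0,\infty], \geq), with tensor product +
% and unit 0. Objects of the enriched category are the points of X.
\begin{align*}
\text{hom-object} \;&:\quad Hom(x,y) = d(x,y) \in [0,\infty], \\
\text{composition} \;&:\quad d(x,y) + d(y,z) \;\geq\; d(x,z), \\
\text{identity} \;&:\quad 0 \;\geq\; d(x,x).
\end{align*}
```

Composition is the triangle inequality and the identity law forces $d(x,x)=0$. Symmetry and the separation axiom are not part of the enriched structure and have to be imposed separately if they are wanted.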

Eric: Neat! The vague gut feeling just got a little clearer, but still extremely fuzzy. Maybe I’ll keep talking and something will make sense :)

The idea behind least squares can also be thought of in terms of orthogonal projections, so if we had some kind of intrinsic inner product (involving colimits??) then we could use that to decompose $F$ into $F'p$ and $\epsilon$ where

$F'p\cdot\epsilon = 0.$

In fact, you could even say the challenge is to find an $F'$ such that $F'p$ is orthogonal to the residual $\epsilon$. This $F'$ is the “best we can do”.

It would be neat if we could eventually say something like “Kan extension is our best attempt to find a functor $F':C'\to D$ such that $F'p$ and the residual $\epsilon$ are orthogonal.” Or something…

Eric: Here is an example from the world of time series. Maybe it can be a model for a more general construction. Say we have two time series $x$ and $y$ and we would like to find the constant $b$ that minimizes the variance of $a$ in the expression

$y = a + b x.$

This problem is equivalent to finding $b$ such that $bx$ and $a$ are uncorrelated, i.e. their covariance is zero. The computation is simple

$cov(y,x) = cov(a,x) + b\, cov(x,x).$

Setting

$b = \frac{cov(y,x)}{cov(x,x)}$

gives

$cov(a,x) = 0.$

My suspicion is that this is somehow related to Kan extension.
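
The covariance computation above can be checked numerically. The following is a minimal sketch in plain Python; the two series are made-up illustrative numbers (not real market data), and `mean`, `cov` and `variance_of_residual` are hypothetical helper names introduced only for this example.

```python
def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Population covariance of two equally long series."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

# Made-up illustrative data, not taken from any real market.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# The coefficient from the text: b = cov(y, x) / cov(x, x).
b = cov(y, x) / cov(x, x)

# The residual series a = y - b x.
a = [yi - b * xi for xi, yi in zip(x, y)]

# The residual is uncorrelated with x (up to rounding error).
assert abs(cov(a, x)) < 1e-12

def variance_of_residual(slope):
    """Variance of y - slope * x for a candidate slope."""
    r = [yi - slope * xi for xi, yi in zip(x, y)]
    return cov(r, r)

# Any other slope gives a residual with larger variance.
assert variance_of_residual(b) < variance_of_residual(b + 0.1)
assert variance_of_residual(b) < variance_of_residual(b - 0.1)
```

The last two assertions illustrate the equivalence claimed in the text: the slope that makes the residual uncorrelated with $x$ is also the one that minimizes the residual variance.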

Note: See also an article I wrote a while ago:

Visualizing Market Risk: A Physicists Perspective

This shows, geometrically, why the minimum variance solution is also the solution that splits the original into orthogonal pieces.

category: reference

Revised on April 8, 2010 23:31:57 by Zoran Škoda (193.55.10.104)