an elementary treatment of Hilbert spaces


Purpose of this page

The purpose of this page is to examine how much of the theory of Hilbert spaces can be done in an “elementary” fashion. More specifically, the aim is to develop as much as possible of the standard theory without using the fact that Hilbert spaces are special normed vector spaces, or special metric spaces.

The reason for trying this is in the spirit of centipede mathematics. One can view the theory of locally convex topological vector spaces as the result of pulling off most of the legs of Hilbert spaces, though perhaps one should go one step further and say that Hilbert spaces themselves are the result of pulling the “finite dimensional” leg off the Euclidean centipede. Indeed, this is almost the standard treatment of LCTVSs except that the usual starting point is normed vector spaces rather than Hilbert spaces. In that view, Hilbert spaces are special normed vector spaces rather than normed vector spaces being Hilbert spaces without a leg or two.

Although the intention is that the treatment be elementary, we shall remark on the relationship to the standard theory and thus the commentary will not necessarily be elementary. For example, concepts that are traditionally defined by using the metric space structure of a Hilbert space will need recasting and we shall need to reassure the reader that the new definition is equivalent to the original.

We shall work over $\mathbb{C}$ throughout.

Basic Definitions

We start with the basic definition of an inner product space.


An inner product space is a vector space, say $V$, equipped with a function $|V| \times |V| \to \mathbb{C}$, written $\langle u, v \rangle$, satisfying:

  1. $\langle v, u\rangle = \overline{\langle u, v\rangle}$,
  2. $\langle v + \lambda w, u \rangle = \langle v, u \rangle + \lambda \langle w, u\rangle$,
  3. $\langle v, v\rangle \in [0,\infty)$ with $\langle v, v\rangle = 0$ if and only if $v = 0$.
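The three axioms can be checked numerically on a concrete example. The following is a minimal sketch, assuming the standard inner product on $\mathbb{C}^2$ as the model; the helper name `inner` and the sample vectors are illustrative choices, not part of the treatment above.

```python
# Hypothetical helper: the standard inner product on C^n, linear in the
# first argument and conjugate-linear in the second, as in axioms 1-3.
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

# Sample vectors and scalar (arbitrary illustrative values).
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
w = [0.5j, -1 + 0j]
lam = 2 - 3j

# Axiom 1: conjugate symmetry, <v, u> = conj(<u, v>).
conj_sym = abs(inner(v, u) - inner(u, v).conjugate()) < 1e-12

# Axiom 2: linearity in the first argument.
v_plus = [a + lam * b for a, b in zip(v, w)]
linear = abs(inner(v_plus, u) - (inner(v, u) + lam * inner(w, u))) < 1e-12

# Axiom 3: <v, v> is a non-negative real number, and <0, 0> = 0.
vv = inner(v, v)
positive = abs(vv.imag) < 1e-12 and vv.real > 0
zero_case = inner([0j, 0j], [0j, 0j]) == 0

print(conj_sym, linear, positive, zero_case)
```

A numerical check of this kind only illustrates the axioms on particular vectors; it does not prove them.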

Ideally, we want to deal solely with Hilbert spaces but first we need to figure out how to deal with completeness without recourse to metric space theory. We do this by using orthonormal families.


Let $(V,\langle -, - \rangle)$ be an inner product space. An orthogonal family in $V$ is a subset $B \subseteq V$ with the property that $\langle b, b'\rangle = 0$ whenever $b,b' \in B$ are distinct.

The family is said to be orthonormal if, in addition, $\langle b, b\rangle = 1$ for all $b \in B$.

Two vectors, say $u$ and $v$, are said to be orthogonal if the family $\{u,v\}$ is an orthogonal family.
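As a concrete illustration of these definitions, here is a small sketch in $\mathbb{C}^2$ with the standard inner product. The function names `is_orthogonal` and `is_orthonormal` are hypothetical, and a finite list of distinct vectors stands in for the subset $B$.

```python
# Hypothetical helper: standard inner product on C^n (linear in the first slot).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def is_orthogonal(family, tol=1e-12):
    # <b, b'> = 0 for every pair of entries at distinct positions.
    return all(abs(inner(b, c)) < tol
               for i, b in enumerate(family)
               for j, c in enumerate(family) if i != j)

def is_orthonormal(family, tol=1e-12):
    # Orthogonal, and in addition <b, b> = 1 for every member.
    return is_orthogonal(family, tol) and all(
        abs(inner(b, b) - 1) < tol for b in family)

# An orthogonal family that is not orthonormal: each member has <b, b> = 2.
fam = [[1, 1j], [1, -1j]]

# Rescaling each member by 1/sqrt(2) makes the family orthonormal.
scaled = [[c / 2 ** 0.5 for c in b] for b in fam]

print(is_orthogonal(fam), is_orthonormal(fam), is_orthonormal(scaled))
```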

Using orthogonal families, we can express the notion of completeness as follows.


A Hilbert space is an inner product space, $(H, \langle -, -\rangle)$ in which the following property holds. Let $(b_n)$ be an orthonormal sequence and $(\lambda_n)$ a sequence of positive real numbers such that $\sum \lambda_n^2$ is bounded. Then there is some $v \in H$ such that for all $u \in H$,

$$ \sum \lambda_n \langle b_n, u \rangle = \langle v, u \rangle. $$

We need to justify this notion of completeness. One direction is simple: the sequence of partial sums of the series $\sum \lambda_n b_n$ is Cauchy and so if the space is complete, it has a limit and this limit satisfies the criterion.
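The defining identity can be illustrated in a finite truncation, where it reduces to linearity in the first argument. The sketch below assumes $\mathbb{C}^{50}$ with the standard basis as the orthonormal family and $\lambda_n = 1/n$; all names and values are illustrative, and a finite-dimensional check says nothing about completeness itself.

```python
# Hypothetical helper: standard inner product on C^n (linear in the first slot).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

N = 50  # truncation level (an arbitrary illustrative choice)

def e(n):
    # n-th standard basis vector of C^N (0-indexed), playing the role of b_n.
    return [1 if k == n else 0 for k in range(N)]

# Positive coefficients lambda_n = 1/(n+1); the sum of their squares stays bounded.
lams = [1 / (n + 1) for n in range(N)]
square_sum = sum(l * l for l in lams)

# Candidate vector v = sum_n lambda_n b_n.
v = list(lams)

# An arbitrary test vector u.
u = [complex(k % 3, -k % 5) for k in range(N)]

# Both sides of: sum_n lambda_n <b_n, u> = <v, u>.
lhs = sum(lams[n] * inner(e(n), u) for n in range(N))
rhs = inner(v, u)

print(square_sum, abs(lhs - rhs))
```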

The other direction takes a little more effort. The quickest (but dirtiest) route is simply to observe that if that condition is satisfied then whenever we have an isometry from $\ell^0$ (with its standard inner product) into our space then it extends to $\ell^2$. A slightly more concrete route is as follows. Start with a Cauchy sequence in $H$, say $(x_n)$, and then apply the Gram–Schmidt process. This results in an orthonormal sequence, say $(b_n)$. For each $k$, we define $\lambda_k$ as the limit (in $\mathbb{C}$) of $(\langle x_n, b_k\rangle)$. We can make $\lambda_k$ a positive real number by taking the phase factor into $b_k$. Let $(s_n)$ be the sequence of partial sums of the series $\sum \lambda_k b_k$. Then $(s_n)$ is Cauchy and the interpolation of $(s_n)$ and $(x_n)$ is also Cauchy. By assumption, $(s_n)$ has a weak limit. As it is Cauchy, the existence of a weak limit is enough to show that it has a strong limit. Thus $(x_n)$ also converges and so $H$ is complete.
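The Gram–Schmidt step used in this argument can be sketched on a finite list of vectors. This is a minimal illustration, assuming $\mathbb{C}^3$ with the standard inner product in place of $H$; the function name `gram_schmidt` and the tolerance used to discard dependent vectors are assumptions of the sketch.

```python
# Hypothetical helper: standard inner product on C^n (linear in the first slot).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    # Returns an orthonormal list with the same span as the input vectors.
    basis = []
    for x in vectors:
        r = list(x)
        for b in basis:
            # Remove the component of r along b, namely <r, b> b.
            c = inner(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        n2 = inner(r, r).real
        if n2 > tol:
            # Rescale so that <b, b> = 1; (numerically) dependent vectors are skipped.
            s = n2 ** 0.5
            basis.append([ri / s for ri in r])
    return basis

# The third vector is the sum of the first two, so it contributes nothing new.
xs = [[1, 1j, 0], [1, 0, 1], [2, 1j, 1]]
basis = gram_schmidt(xs)

print(len(basis))
```

Because the inner product is linear in its first argument, the coefficient of $b$ removed at each step is $\langle r, b\rangle$ rather than $\langle b, r\rangle$.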

The tenor of the new definition and its equivalence to the standard one is a theme that will run throughout this page. In our elementary treatment, weak definitions are to be preferred to the standard strong ones. Their equivalence exposes some of the deep results of Hilbert space theory.

As this is intended as an elementary treatment, it is likely that at some point we will want to assume that our Hilbert space is “small”, by which (of course) we mean “separable”. Fortunately, it is not hard to formulate separability without recourse to metric spaces.


An inner product space is separable if it contains a sequence, say $(x_n)$ with the property that $\langle x, x_n\rangle = 0$ for all $n$ implies that $x = 0$.


The Cauchy–Schwarz inequality is one of the basic results of Hilbert space theory. As we wish to avoid any mention of a norm, we state it as follows.


Let $H$ be a Hilbert space, $u,v \in H$. Then

$$ |\langle u, v \rangle|^2 \le \langle u, u \rangle \langle v, v \rangle $$

with equality if and only if $u$ and $v$ are collinear.

I removed the square roots here, since there doesn't seem to be much point to them if you're not going to connect with a metric. (On the other hand, surely even an elementary treatment of Hilbert spaces may deign to mention the geometric concept of distance? It's one thing to not assume a knowledge of metric spaces, but it's another thing to refuse to even mention norms.) —Toby

As you can (probably) tell, I’m making this up as I go along! I’ve not decided yet whether to allow distances or not, so at the moment I’m avoiding them if possible. Part of my goal is to avoid overwhelming the “reader” with notation just for the sake of it. So I may well implicitly use the norm as $\langle v, v\rangle$ but without introducing the $\|v\|$ notation. But although I expect I’ll be the main contributor here, I don’t particularly want to be so I’m more than happy for others to weigh in with their opinions on what an “elementary treatment” would look like. If I disagree then it’ll force me to think carefully why I disagree and that can only improve things.

Pondering a little more, I think that I’m avoiding distance not because I don’t think it belongs in an elementary treatment - as you imply, what could be more simple to understand than distance? - but because it’s a way of keeping metric spaces at bay: once I start talking about distances then it’ll be easy to talk about metrics and the like.

I think you’re right about the square roots, by the way. —Andrew

Since accidentally using metric space theory is always a danger, I understand not wanting to mention distances and norms right away. On the other hand, I do think that any introductory treatment ought to mention them at some point, and even to prove that they satisfy the triangle inequality and Cauchy completeness. But those should be theorems specifically about the concept of distance in a Hilbert space, not anything fundamental to the development of the general theory of Hilbert spaces. So sure, keep them out for now. —Toby


The “if and only if” part gives us the key to the most direct proof. It will simplify matters a little if we assume that $u$ and $v$ are non-zero, the case where one of them is zero being easy to establish.

Now if $u$ and $v$ are collinear then there is some $\lambda \in \mathbb{C}$ such that $u = \lambda v$. There is only one possibility for $\lambda$ which can be found by taking inner products with $v$: if $\lambda$ exists such that $u = \lambda v$ then $\lambda = \langle u, v\rangle/\langle v, v\rangle$. So we consider the question: is $\langle v, v\rangle u = \langle u, v\rangle v$? Or, equivalently, is $\langle v, v\rangle u - \langle u, v\rangle v = 0$?

We have a test for when a vector is non-zero using the inner product. One of the axioms for an inner product says that a vector $w$ is zero if and only if $\langle w, w\rangle = 0$. Moreover, we also know that in any case $\langle w, w\rangle \ge 0$. Thus we have

$$ \left\langle \langle v, v\rangle u - \langle u, v\rangle v, \langle v, v\rangle u - \langle u, v\rangle v \right\rangle \ge 0 \qquad (1) $$

with equality if and only if $u$ and $v$ are collinear.

Before expanding out the left hand side of this, we note that

$$ \left\langle \langle v, v\rangle u - \langle u, v\rangle v , v \right\rangle = 0 $$

whence (1) simplifies, after dividing by the positive real number $\langle v, v\rangle$, to

$$ \left\langle \langle v, v\rangle u - \langle u, v\rangle v, u \right\rangle \ge 0 $$

with equality if and only if $u$ and $v$ are collinear. Expanding this out yields

$$ \langle v,v\rangle \langle u,u\rangle - \langle u, v \rangle \langle v, u\rangle \ge 0 $$

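with equality if and only if $u$ and $v$ are collinear. Since $\langle u, v\rangle \langle v, u\rangle = |\langle u, v\rangle|^2$, rearranging gives the inequality as stated above, and square-rooting then produces the traditional statement of the Cauchy–Schwarz inequality.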
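The steps of this proof can be traced numerically. The following sketch assumes $\mathbb{C}^3$ with the standard inner product and arbitrarily chosen sample vectors; the names `inner`, `w`, `gap`, and `gap2` are illustrative only.

```python
# Hypothetical helper: standard inner product on C^n (linear in the first slot).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

# Two non-collinear sample vectors (arbitrary illustrative values).
u = [1 + 2j, 3 - 1j, 0.5j]
v = [2 - 1j, 1j, 1 + 1j]

vv = inner(v, v)
uv = inner(u, v)

# The vector tested in the proof: w = <v, v> u - <u, v> v.
w = [vv * a - uv * b for a, b in zip(u, v)]

# w is orthogonal to v, which is what lets (1) simplify.
w_perp_v = abs(inner(w, v)) < 1e-9

# The quantity <v, v><u, u> - <u, v><v, u> from the final display.
gap = (vv * inner(u, u) - uv * inner(v, u)).real

# Consistency with (1): <w, w> = <v, v> * gap.
identity_holds = abs(inner(w, w) - vv * gap) < 1e-9

# Collinear case: with u2 = lambda v the gap vanishes.
u2 = [(2 - 1j) * b for b in v]
gap2 = (vv * inner(u2, u2) - inner(u2, v) * inner(v, u2)).real

print(w_perp_v, identity_holds, gap, gap2)
```

Here `gap` is strictly positive for the non-collinear pair and `gap2` vanishes up to rounding for the collinear pair, matching the equality condition.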

Subspaces and Complements

Revised on May 14, 2010 15:36:50 by Urs Schreiber (