Introduction to Linear Algebra with Mathematica

Preface


In this section, we discuss how to expand a function f from the Hilbert space 𝔏²([𝑎, b], w) with respect to eigenfunctions of a classical Sturm--Liouville problem. Since many practical examples of such expansions are presented in the following sections, we concentrate our attention on the theoretical exposition. Such a representation of an arbitrary function f(x) is called an eigenfunction expansion of f(x). A natural question arises: what does “represents” mean? Does it mean in the sense of pointwise convergence? Or mean convergence? Or perhaps some other concept altogether?

Spectral Decomposition


Decomposition of complicated systems into their constituent parts is one of science’s most powerful strategies for the analysis and understanding of large-scale systems with linearly coupled components. Spectral decomposition—splitting a linear operator into independent modes of simple behavior—is greatly appreciated in mathematical physics. For example, wave dynamics is usually captured by a superposition of simple modes. Quantum mechanics and statistical mechanics identify the energy eigenvalues of Hamiltonians as the basic objects in thermodynamics: transitions among the energy eigenstates yield heat and work. The eigenvalue spectrum reveals itself most directly in other kinds of spectra, such as the frequency spectra of light emitted by the gases that permeate the galactic filaments of our universe.

Spectral decomposition often allows a problem to be simplified by approximations that use only the dominant contributing modes. Indeed, human-face recognition can be efficiently accomplished using a small basis of “eigenfaces”. Certainly, there are many applications that highlight the importance of decomposition. In this section, we concentrate our attention on the application of decomposition theory to differential equations generated by a linear second order self-adjoint operator subject to boundary conditions. It is known that, depending on the boundary conditions, the corresponding boundary value problem may have a discrete spectrum (a set of eigenvalues), a continuous spectrum, or a combination of both.

When solving an inhomogeneous differential equation

\[ L \left[ x, \texttt{D} \right] y = f \qquad ( \texttt{D} = {\text d}/{\text d}x) , \]
with a differential operator L, we expand the input function and the unknown solution into the series over eigenfunctions:
\[ f(x) = \sum_n f_n \phi_n (x) , \qquad y(x) = \sum_n c_n \phi_n (x) , \]
where { ϕn(x) } are eigenfunctions of the differential operator L. Upon substituting these series expansions into the differential equation, we reduce the problem to a simple multiplication problem:
\[ L \left[ x, \texttt{D} \right] \sum_n c_n \phi_n (x) = \sum_n c_n L \left[ x, \texttt{D} \right] \phi_n (x) = \sum_n c_n \lambda_n \,\phi_n (x) = \sum_n f_n \phi_n (x) \]
because ϕn(x) are eigenfunctions. Assuming that all infinite series above converge, we obtain
\[ c_n \lambda_n = f_n \qquad \Longrightarrow \qquad c_n = \frac{f_n}{\lambda_n} . \]
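To make this recipe concrete, here is a minimal Mathematica sketch (our own illustration; the right-hand side f(x) = x(π − x) and the truncation order are assumptions, not taken from the text). We solve −y′′ = f on [0, π] with y(0) = y(π) = 0, whose eigenfunctions are ϕn(x) = sin(nx) with eigenvalues λn = n².

(* sketch: solve -y'' = f, y(0) = y(Pi) = 0, by eigenfunction expansion *)
f[x_] := x (Pi - x);
fn[n_] := (2/Pi) Integrate[f[x] Sin[n x], {x, 0, Pi}];  (* Fourier sine coefficient *)
y[x_] = Sum[fn[n]/n^2 Sin[n x], {n, 1, 10}];            (* c_n = f_n / lambda_n *)
exact = DSolveValue[{-u''[x] == f[x], u[0] == 0, u[Pi] == 0}, u[x], x];
Plot[{y[x], exact}, {x, 0, Pi}]                         (* the two curves coincide *)

Only the odd modes contribute here, since f is symmetric about the midpoint x = π/2.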

 

In quantum mechanics, the state of a physical system is represented by a vector in a Hilbert space: a complex vector space with an inner product. It is customary to use Dirac notation, in which the vectors in the space are denoted by | v ⟩, called a ket, where v is some symbol that identifies the vector. The multiple of a vector by a complex number c is written as c| v ⟩. In Dirac notation, the inner product of the vectors | v ⟩ and | w ⟩ is written ⟨ v | w ⟩. Then the eigenfunction expansion takes the form:
\[ |\, f\,\rangle = \sum_n f_n |\,\phi_n \rangle \qquad \mbox{or} \qquad |\, f\,\rangle = \sum_n f_n |\,n \rangle . \]

Projection


Orthogonality is one of the central concepts in the theory of Hilbert spaces. Another concept, intimately related to orthogonality, is orthogonal projection. Before getting to projections, we need the notion of a convex set: a set C in a vector space is convex if, together with any two points x and y, it contains the whole segment t x + (1 − t) y, 0 ≤ t ≤ 1. Convexity is a purely algebraic concept, but as we will see, it interacts with the topology induced by the inner product.

Two vectors x and y from an inner-product vector space are called orthogonal iff their inner product is zero:
\[ \langle x\,,\,y \rangle = \langle y\,,\,x \rangle = 0 \qquad \iff \qquad x \perp y . \]

Let X be an inner-product space, and let S ⊆ X be any of its subsets. We denote by S⊥ the set of vectors that are perpendicular to all the elements in S,

\[ S^{\perp} = \left\{ x\in X \,: \, \langle x\,,\,y \rangle = 0 \qquad \forall y \in S \right\} . \]
This set S⊥ is called the orthogonal complement or annihilator of S.

An inner product is also denoted with a vertical bar instead of a comma:   ⟨ ⋅ | ⋅ ⟩

Note that S ⊆ S⊥⊥ for any set S from a space with an inner product.

Example 1: Let us consider the space ℭ[−1, 1] of complex-valued continuous functions on the finite interval [−1, 1], with the inner product

\[ \langle f\,,\, g \rangle = \int_{-1}^1 f(x)^{\ast} g(x)\,{\text d} x , \]
where the asterisk designates the complex conjugate. Let M be the set of functions in ℭ[−1, 1] that vanish on the interval [−1, 0], and let N be the set of all functions in ℭ[−1, 1] that vanish on [0, 1]. Every function in M is orthogonal to every function in N. Thus, N ⊆ M⊥ and M ⊆ N⊥.
End of Example 1
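The orthogonality in Example 1 is easy to confirm in Mathematica; the particular functions below are our own sample choices of elements of M and N.

f[x_] := Piecewise[{{x, x >= 0}}, 0];        (* continuous, vanishes on [-1, 0] *)
g[x_] := Piecewise[{{x + x^2, x <= 0}}, 0];  (* continuous, vanishes on [0, 1] *)
Integrate[f[x] g[x], {x, -1, 1}]             (* -> 0; both functions are real, so no conjugation is needed *)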

Lemma 1: Let S be a subset of an inner-product space X, S ⊆ X. The set S⊥ is a closed linear subspace of X, and
\[ S^{\perp} \cap S \subset \{ 0 \} . \]
We start by showing that S⊥ is a linear subspace. Let x, y ∈ S⊥, so
\[ \langle x\,, \, z \rangle = \langle y\,, \, z \rangle = 0 \qquad \mbox{for any} \quad z \in S. \]
For arbitrary scalars α, β, we have
\[ \langle \alpha\,x + \beta\, y\,, \, z \rangle = \alpha \langle x\,, \, z \rangle + \beta \langle y\,, \, z \rangle = 0 \qquad \mbox{for any} \quad z \in S , \]
which implies that αx + βy ∈ S⊥, i.e., S⊥ is a linear subspace.

We next show that S⊥ is closed. Let {xn} be a sequence in S⊥ that converges to x ∈ X. By the continuity of the inner product,

\[ \langle x\,, \, z \rangle = \lim_{n\to\infty} \langle x_n\,, \, z \rangle = 0 \qquad \mbox{for any} \quad z \in S. \]
This means that x ∈ S⊥.

Suppose that x ∈ S⊥ ∩ S. As an element of S⊥, x is orthogonal to all the elements in S, and in particular to itself; hence ⟨ x , x ⟩ = 0, which by the defining property of the inner product implies that x = 0.

Theorem 1: Let ℌ be a Hilbert space and C ⊂ ℌ a closed convex set in it. Then for every x ∈ ℌ there exists a unique element y ∈ C such that the distance from x to the set C is
\[ d(x, C) = \| x-y \| , \]
where
\[ d(x, C) = \inf_{y\in C} \| x-y \| . \]
The mapping   x ↦ y   is called the projection of x onto the set C, and it is denoted by PC.
If x ∈ C, take y = x. If x ∉ C, then δ = dist(x, C) > 0 because C is closed.

We start by showing the existence of a distance minimizer. By the definition of the infimum, there exists a sequence {yn} ⊂ C satisfying

\[ d(x, C) = \lim_{n\to\infty} d(x, y_n ) . \]
Since C is convex,   ½(yn + ym) ∈ C for all m, n, and therefore,
\[ \left\| \frac{1}{2} \left( y_n + y_m \right) - x \right\| \geqslant d(x, C) . \]
By the parallelogram identity (this inner-product property was proved in the first section), \( \displaystyle \| a - b \|^2 = 2 \left( \| a \|^2 + \| b \|^2 \right) - \| a + b \|^2 , \quad \) we get
\begin{align*} 0 \leqslant \| y_n - y_m \|^2 &= \| \left( y_n - x \right) - \left( y_m - x \right) \|^2 \\ &= 2\,\| y_n - x \|^2 + 2\,\| y_m - x \|^2 - \| y_n + y_m - 2\,x \|^2 \\ & \leqslant 2\,\| y_n - x \|^2 + 2\,\| y_m - x \|^2 - 4\,d^2 (x, C) \ \to \ 0 \quad \mbox{as} \quad m,n \to \infty . \end{align*}
It follows that {yn} is a Cauchy sequence and hence converges to a limit y (which is where completeness is essential). Since C is closed, yC. Finally, by the continuity of the norm,
\[ \| x - y \| = \lim_{n\to\infty} \| x - y_n \| = d(x, C) . \]
which completes the existence proof of a distance minimizer.

Next, we show the uniqueness of the distance minimizer. Suppose that y, zC both satisfy

\[ \| y-x \| = \| z - x \| = d(x, C) . \]
By the parallelogram identity,
\[ \| y+z -2\,x \|^2 + \| y-z \|^2 = 2\,\| y-x \|^2 + 2\,\| z-x \|^2 , \]
which leads to
\[ \left\| \frac{y+z}{2} - x \right\|^2 = d^2 (x, C) - \frac{1}{4} \,\| y - z \|^2 . \]
If y ≠ z, then (y + z)/2, which belongs to C, is closer to x than the distance from x to C, which is a contradiction.
The existence of a unique projection does not hold, in general, in complete normed spaces (i.e., Banach spaces). A distance minimizer does exist in finite-dimensional normed spaces, but it may not be unique. In infinite-dimensional Banach spaces, distance minimizers may fail to exist.
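A small numerical illustration of Theorem 1 (our own example with an assumed convex set): take C to be the closed unit disk in ℝ² and x = (3, 4) outside it; minimizing the squared distance produces the unique closest point.

(* distance minimizer onto the closed convex set C = {u^2 + v^2 <= 1} *)
ArgMin[{(u - 3)^2 + (v - 4)^2, u^2 + v^2 <= 1}, {u, v}]
(* -> {3/5, 4/5}, the radial projection of (3, 4); unique, as Theorem 1 guarantees *)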

Corollary 1: Let ℌ be a Hilbert space and C ⊂ ℌ a closed convex set in it. The projection   PC : ℌ → C   is idempotent, PCPC = PC.

Corollary 2: Let M be a closed linear subspace of a Hilbert space ℌ. Then
\[ y = P_M x \]
if and only if (abbreviated as iff)
\[ y \in M \qquad\mbox{and} \qquad x-y \in M^{\perp} . \]

Projection Theorem: Let M be a closed linear subspace of a Hilbert space ℌ. Then every vector x ∈ ℌ has a unique decomposition
\[ x = m + n , \qquad m \in M , \quad n \in M^{\perp} . \]
Furthermore, m = PMx, so
\[ ℌ = M \oplus M^{\perp} . \]
Let x ∈ ℌ. By Corollary 2,
\[ x - P_M x \in M^{\perp} , \]
hence,
\[ x = P_M x + \left( x - P_M x \right) \]
satisfies the required properties of the decomposition.

Next, we show that the decomposition is unique. Assume

\[ x = m_1 + n_1 = m_2 + n_2 , \]
where m₁, m₂ ∈ M and n₁, n₂ ∈ M⊥. Then
\[ M \ni m_1 - m_2 = n_1 - n_2 \in M^{\perp} . \]
Uniqueness follows from the fact that M ∩ M⊥ = { 0 }.
The element m in the Projection theorem is called the orthogonal projection of x on M. It is worth reiterating that m is the closest element in M to x. The mapping PM ∶ ℌ → ℌ defined by PM(x) = m is called the projection operator (or simply the projection) of ℌ onto M.
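The following Mathematica sketch illustrates the Projection Theorem in 𝔏²(−1, 1); the subspace M = span{1, x, x²} and the vector f(x) = eˣ are our own choices.

basis = Orthogonalize[{1, x, x^2}, Integrate[#1 #2, {x, -1, 1}] &];
projM[f_] := Sum[Integrate[f e, {x, -1, 1}] e, {e, basis}];  (* P_M f *)
m = projM[Exp[x]];          (* the element of M closest to Exp[x] *)
h = Exp[x] - m;             (* the component x - P_M x, lying in the orthogonal complement *)
Simplify[Integrate[h #, {x, -1, 1}] & /@ basis]              (* -> {0, 0, 0} *)

Since all functions involved are real, the inner product needs no conjugation.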

Example 2: The projection theorem does not hold when its conditions are not satisfied. Take, for example, ℌ = ℓ², with the linear subspace

\[ M = \left\{ (a_n ) \in \ell^2 \, : \, \exists N; \quad \forall n > N, \quad a_n = 0 \right\} . \]
This linear subspace is not closed, and its orthogonal complement is { 0 }, so
\[ M \oplus M^{\perp} = M \ne \ell^2 . \]
End of Example 2

Corollary 3: For every linear subspace M of a Hilbert space ℌ,
\[ \left( M^{\perp} \right)^{\perp} = \overline{M} \quad (\mbox{closure}). \]

Orthogonality


Recall that a Hilbert space ℌ is a real or complex vector space with inner product that is also a complete metric space with respect to the distance function induced by the inner product; \( \displaystyle d(x,y) = \| x - y \| = \langle x-y, x-y \rangle^{1/2} . \)
A set S in a vector space X with inner product ⟨ ∣ ⟩ is called orthogonal if x ⊥ y for every pair of distinct vectors x, y ∈ S, so ⟨ x , y ⟩ = 0 for y ≠ x. If, in addition, ∥x∥ = 1 for all x ∈ S, the system S is referred to as orthonormal.

An orthonormal/orthogonal system S in a Hilbert space ℌ is called a complete orthonormal/orthogonal system, or an orthonormal/orthogonal basis, in a vector space X ⊆ ℌ if S is not a proper subset of any other orthonormal/orthogonal system of the space X. That is, it is complete if the only vector orthogonal to all elements of S is zero.

Theorem 3: Every Hilbert space ℌ with at least one nonzero element has a complete orthonormal system. Moreover, for any orthonormal system S in a Hilbert space ℌ, there exists a complete orthonormal system that contains S as a subset.
Let S be an orthonormal system in ℌ. Such a system certainly exists in ℌ: for instance, if x ≠ 0, then { x/∥x∥ } is an orthonormal system. Consider the family of all orthonormal systems in ℌ that contain S, partially ordered by inclusion: S₁ ≺ S₂ when S₁ ⊆ S₂. The union of any totally ordered subfamily {T} is again an orthonormal system and serves as an upper bound for {T}. Then, according to Zorn's lemma, there exists a maximal element S0 that contains S. The system S0 is complete precisely because it is maximal: a nonzero vector orthogonal to every element of S0 could be normalized and adjoined to S0, contradicting maximality.

Orthogonal Expansions


A Hilbert space ℌ is separable if and only if every orthonormal basis of ℌ is countable.

The goal of this subsection is to represent an arbitrary element of a Hilbert space ℌ in terms of a basis of some kind. If dim(ℌ) < ∞, the goal is trivial, and if dim(ℌ) = ∞, the goal is unrealistic if one insists on looking at a Hamel basis, because any such basis is uncountable and hence too big to be useful. The only realistic expectation is to hope to express an arbitrary element of ℌ as a series of the basis elements, as was achieved in the Fourier series section. This means that ℌ has a Schauder basis, which immediately suggests that we investigate separable Hilbert spaces.


Therefore, we focus mostly but not exclusively on separable Hilbert spaces. Many of the results developed in this chapter are valid for non-separable Hilbert spaces. Examples include the projection theorem and the Riesz representation theorem (discussed in a later section).

Corollary 4: Every separable Hilbert space ℌ contains a countable complete orthonormal system.
Recall that ℌ is separable if it contains a countable dense subset. Let {xn} be a dense countable set. In particular:
\[ \overline{\mbox{span} \left\{ x_n \, : \, n \in \mathbb{N} \right\}} = ℌ. \]
We can construct inductively a subset {yn} of linearly independent vectors such that for every n there exists an integer N with
\[ \mbox{span} \left\{ y_k \, : \, 1 \le k \le N \right\} = \mbox{span} \left\{ x_k \, : \, 1 \le k \le n \right\} . \]
In other words,
\[ \overline{\mbox{span} \left\{ y_n \, : \, n \in \mathbb{N} \right\}} = ℌ. \]
By applying Gram-Schmidt orthonormalization, we obtain an orthonormal system {un} such that
\[ \overline{\mbox{span} \left\{ u_n \, : \, n \in \mathbb{N} \right\}} = ℌ. \]
We will show that this orthonormal system is complete. Suppose that v is orthogonal to all the {un}. Every x ∈ ℌ is a limit
\[ x = \lim_{n\to\infty} \sum_{k=1}^n c_{n,k} u_k . \]
It follows by the continuity of the inner product that
\[ \langle x\,\vert\, v \rangle = \lim_{n\to\infty} \sum_{k=1}^n c_{n,k} \langle u_k \,\vert\, v \rangle = 0, \]
i.e., v is orthogonal to all vectors in ℌ; hence v = 0, from which it follows that the {un} form a complete orthonormal system.

Example 3: It is possible for a separable inner product space (hence for a separable Hilbert space) to contain uncountably many pairs of orthogonal vectors.

As in Example 1, let us consider the space of continuous functions ℭ[−1, 1] with the inner product

\[ \langle f\,,\, g \rangle = \int_{-1}^1 \overline{f(x)}\,g(x)\,{\text d} x . \]
Let M be the set of functions in ℭ[−1, 1] that vanish on the interval [−1, 0], and let N be the set of all functions in ℭ[−1, 1] that vanish on the interval [0, 1]. Every function in M is orthogonal to every function in N.

Every pair of functions (f, g) ∈ M × N is orthogonal. Since both M and N are uncountable, we have proved our assertion.

End of Example 3

Let S = { xα : α ∈ A } be a complete orthogonal system in a Hilbert space ℌ. For any element f ∈ ℌ, its Fourier coefficients are defined by
\begin{equation} \label{EqExpand.1} f_{\alpha} = \frac{\langle f, x_{\alpha} \rangle}{\langle x_{\alpha}, x_{\alpha} \rangle} = \frac{\langle f, x_{\alpha} \rangle}{\| x_{\alpha} \|^2} . \end{equation}

Theorem 5: Let S = { xα : α ∈ A } be a complete orthogonal system in Hilbert space ℌ. Then Parseval's identity holds:
\begin{equation} \label{EqExpand.2} \| f \|^2 = \sum_{\alpha \in A} | f_{\alpha} |^2 \| x_{\alpha} \|^2 = \sum_{\alpha \in A} \frac{\left\vert \langle f, x_{\alpha} \rangle \right\vert^2}{\| x_{\alpha} \|^2} . \end{equation}

For every element f from a Hilbert space ℌ and a complete orthogonal system { xα }, the formula
\begin{equation} \label{EqExpand.4} f = \sum_{i\ge 1} f_{\alpha_i} x_{\alpha_i} = \lim_{n\to\infty} \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \end{equation}
is called the Fourier expansion of f with respect to the complete basis { xα }. The set of scalars { fα } is called the set of Fourier components of f with respect to the orthogonal system.

For a given complete orthogonal system S = { xα : α ∈ A } in Hilbert space ℌ, any vector f from ℌ can be expanded into Fourier series:

\begin{equation} \label{EqExpand.2a} f = \sum_{\alpha} f_{\alpha} x_{\alpha} = \sum_{\alpha} \frac{\langle f , x_{\alpha} \rangle}{\langle x_{\alpha} , x_{\alpha} \rangle}\, x_{\alpha} = \sum_{\alpha \in A}\frac{\langle f , x_{\alpha} \rangle}{\| x_{\alpha} \|^2}\, x_{\alpha} . \end{equation}

Lemma (Bessel inequality): For any orthogonal system S = { xα : α ∈ A }, the Bessel inequality holds:
\begin{equation} \label{EqExpand.3} \| f \|^2 \ge \sum_{\alpha \in A} | f_{\alpha} |^2 \| x_{\alpha} \|^2 = \sum_{\alpha \in A} \frac{\left\vert \langle f, x_{\alpha} \rangle \right\vert^2}{\| x_{\alpha} \|^2} . \end{equation}
Let α₁, α₂, … , αn be an arbitrary finite set of indices. For any finite set of numbers \( \displaystyle c_{1}, c_2 , \ldots , c_n , \) we have for an arbitrary orthonormal system { xα } that
\begin{align*} \left\| f - \sum_{i=1}^n c_i x_{\alpha_i} \right\|^2 &= \left\langle f - \sum_{i=1}^n c_i x_{\alpha_i} , f - \sum_{i=1}^n c_i x_{\alpha_i} \right\rangle \\ &= \| f \|^2 - \sum_{i=1}^n c_i^{\ast} f_{\alpha_i} - \sum_{i=1}^n c_i f_{\alpha_i}^{\ast} + \sum_{i=1}^n \left\vert c_i \right\vert^2 \\ &= \| f \|^2 - \sum_{i=1}^n \left\vert f_{\alpha_i} \right\vert^2 + \sum_{i=1}^n \left\vert f_{\alpha_i} - c_i \right\vert^2 . \end{align*}
Therefore, minimum of expression \( \displaystyle \left\| f - \sum_{i=1}^n c_i x_{\alpha_i} \right\|^2 \) is attained when \( \displaystyle c_i = f_{\alpha_i} , \quad i = 1,2, \ldots , n . \) Hence,
\[ \left\| f - \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \right\|^2 = \| f \|^2 - \sum_{i=1}^n \left\vert f_{\alpha_i} \right\vert^2 . \]
So we conclude that
\[ \sum_{i=1}^n \left\vert f_{\alpha_i} \right\vert^2 \le \| f \|^2 . \]
Since the indices α₁, α₂, … , αn were chosen arbitrarily, we conclude that fα ≠ 0 for at most countably many indices; so Bessel's inequality is valid.
Observe that the sequence of partial sums \( \displaystyle \left\{ \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \right\} \) is a Cauchy sequence because of the orthogonality of { xα }. Indeed,
\[ \left\| \sum_{i=k}^n f_{\alpha_i} x_{\alpha_i} \right\|^2 = \left\langle \sum_{i=k}^n f_{\alpha_i} x_{\alpha_i} , \sum_{i=k}^n f_{\alpha_i} x_{\alpha_i} \right\rangle = \sum_{i=k}^n \left\vert f_{\alpha_i} \right\vert^2 \]
tends to zero as k ⟶ ∞ by Bessel's inequality. Since ℌ is complete, the limit \( \displaystyle g = \lim_{n\to\infty} \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \) exists. We are going to show that the vector f − g is orthogonal to every element of S.

Due to the continuity of the inner product,

\[ \left\langle f-g, x_{\alpha} \right\rangle = \lim_{n\to\infty} \left\langle f- \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} , x_{\alpha} \right\rangle = f_{\alpha} - f_{\alpha} = 0. \]
Hence, f − g = 0 because the system { xα } is complete. Since the norm is continuous, we get
\[ 0 = \lim_{n\to\infty} \left\| f - \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \right\|^2 = \| f \|^2 - \lim_{n\to\infty} \sum_{i=1}^n | f_{\alpha_i} |^2 \| x_{\alpha_i} \|^2 , \]
which proves Parseval's identity.
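A numerical Mathematica check of Bessel's inequality and of the limiting Parseval identity, using the orthogonal Legendre polynomials Pn on [−1, 1] with ∥Pn∥² = 2/(2n + 1); the choice f(x) = eˣ is ours.

c[n_] := (2 n + 1)/2 Integrate[Exp[x] LegendreP[n, x], {x, -1, 1}]; (* f_n *)
bessel[m_] := Sum[c[n]^2 2/(2 n + 1), {n, 0, m}];  (* sum of |f_n|^2 ||P_n||^2 *)
N[{bessel[2], bessel[6], Integrate[Exp[2 x], {x, -1, 1}]}]
(* -> {3.62543, 3.62686, 3.62686}: the partial sums increase up to ||f||^2 *)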

Theorem --- Gram-Schmidt orthonormalization: Let { xn } be either a finite or a countable sequence of linearly independent vectors in an inner-product space X. Then it is possible to construct an orthonormal sequence { yn } that has the same cardinality as the sequence { xn }, such that
\[ \mbox{span}\left\{ y_k \, : \, 1 \leqslant k \leqslant n \right\} = \mbox{span}\left\{ x_k \, : \, 1 \leqslant k \leqslant n \right\} \qquad \forall n \in \mathbb{N} . \]
Take a countable dense subset – which can be arranged as a sequence { vj } and the existence of which is the definition of separability – and orthonormalize it. Thus, if v₁ ≠ 0, set e₁ = v₁/‖v₁‖. Proceeding by induction, we can suppose to have found, for a given integer n, orthonormal elements ei,    i = 1, … , m, where m ≤ n, such that
\[ \mbox{span}\left\{ e_1 , e_2 , \ldots , e_m\right\} = \mbox{span}\left\{ v_1 , v_2 , \ldots , v_n \right\} . \tag{GS.1} \]
To show the inductive step, observe that if vn+1 is in the span in (GS.1), then the same ei’s work for n + 1. So we may as well assume that the next element, vn+1, is not in the span in (GS.1). It follows that
\[ w = v_{n+1} - \sum_{i=1}^m \langle v_{n+1} , e_i \rangle e_i \ne 0 \quad \Longrightarrow \quad e_{m+1} = \frac{w}{\| w \|} \tag{GS.2} \]
makes sense. By construction it is orthogonal to all the earlier ei’s so adding em+1 gives the equality of the spans for n + 1.

Thus, we may continue indefinitely, since in fact the only way the dense set could be finite is if we were dealing with the space with one element, 0, in the first place. There are only two possibilities, either we get a finite set of ei’s or an infinite sequence. In either case this must be a maximal orthonormal sequence. That is, we claim

\[ ℌ \ni u \perp e_i \quad \forall i \qquad \Longrightarrow \qquad u = 0 . \tag{GS.3} \]
This uses the density of the vn’s. Let u ∈ ℌ satisfy (GS.3). There must exist a sequence {wj}, where each wj is one of the vn, such that wj → u in X. Now, each vn, and hence each wj, is a finite linear combination of the ek’s, so by Bessel’s inequality
\[ \| w_j \|^2 = \sum_k \left\vert \langle w_j , e_k \rangle \right\vert^2 = \sum_k \left\vert \langle w_j - u , e_k \rangle \right\vert^2 \le \| u - w_j \|^2 , \tag{GS.4} \]
where ⟨ u , ek ⟩ = 0 for all k has been used. Thus,    ∥wj∥ → 0, and hence u = 0.

Example 4: Let us consider the set of power monomials on interval [−1, 1],

\[ S = \left\{ 1, \ x, \ x^2 , \ x^3 , \cdots , x^n , \cdots \right\} . \]
Since the interval is symmetric about the origin, the orthogonal polynomials ψn(x), n ∈ ℕ, are either even or odd, because the even polynomials are automatically orthogonal to the odd polynomials.

The method starts with normalizing φ0(x) = 1 to yield ψ0(x) as

\[ \psi_0 (x) = C_0 \phi_0 (x) = C_0 . \]
The normalization constant C0 is determined from
\[ 1 = \int_{-1}^1 \left\vert \psi_0 (x) \right\vert^2 {\text d} x = \left\vert C_0 \right\vert^2 \int_{-1}^1 {\text d} x = 2 \left\vert C_0 \right\vert^2 . \]
Hence, C0 = 2−½ and we have
\[ \psi_0 (x) = \frac{1}{\sqrt{2}} . \]
The lowest order normalized polynomial ψ0(x) is an even function of x.

Proceeding to construct the next orthogonal polynomial from φ₁(x) = x yields

\begin{align*} \psi_1 (x) &= C_1 \left( \phi_1 (x) - \psi_0 \int_{-1}^1 {\text d} x\,\phi_1 (x)\,\psi_0 (x) \right) \\ &= C_1 \left( x - \frac{1}{2} \int_{-1}^1 {\text d} x\, x \right) \\ &= C_1 x \end{align*}
because the integral of the odd function x over a symmetric interval is zero. The normalization is found to be
\[ C_1 = \sqrt{\frac{3}{2}} \qquad \Longrightarrow \qquad \psi_1 (x) = \sqrt{\frac{3}{2}}\, x . \]
The next polynomial is constructed from φ₂(x) = x². We know that this function is an even function and is automatically orthogonal to all odd functions of x. Thus, we only need to orthogonalize it against ψ₀(x) = 1/√2:
\begin{align*} \psi_2 (x) &= C_2 \left( x^2 - \frac{1}{\sqrt{2}} \int_{-1}^1 {\text d} x\,x^2 \frac{1}{\sqrt{2}} \right) \\ &= C_2 \left( x^2 - \frac{1}{3} \right) \end{align*}
and C₂ is found as
\[ C_2 = \sqrt{\frac{45}{8}} \qquad \Longrightarrow \qquad \psi_2 (x) = \sqrt{\frac{45}{8}} \left( x^2 - \frac{1}{3} \right) . \]
The orthogonalization of the next polynomial is non-trivial. The process starts with φ₃(x) = x³, so one finds
\begin{align*} \psi_3 (x) &= C_3 \left( x^3 - x\,\frac{3}{2} \int_{-1}^1 {\text d} x\,x^4 \right) \\ &= C_3 \left( x^3 - x\,\frac{3}{5} \right) , \end{align*}
etc.

This set of polynomials, apart from multiplicative constants, is the same as the set of Legendre polynomials {Pn(x)}. However, the Legendre polynomials are normalized differently: they satisfy Pn(1) = 1 instead of having unit norm.

End of Example 4
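Mathematica's built-in Orthogonalize command reproduces the polynomials ψ₀, …, ψ₃ of Example 4 in one line and confirms that they are exactly the normalized Legendre polynomials:

psi = Orthogonalize[{1, x, x^2, x^3}, Integrate[#1 #2, {x, -1, 1}] &];
Simplify[psi - Table[Sqrt[(2 n + 1)/2] LegendreP[n, x], {n, 0, 3}]]
(* -> {0, 0, 0, 0} *)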

Theorem --- Riesz-Fischer: Let { un } be an orthonormal sequence in a Hilbert space ℌ. Let { cn } be a sequence of scalars. Then the partial sums
\[ s_n = \sum_{i=1}^n c_i u_i \]
converge strongly (in norm) as n → ∞ if and only if
\[ \sum_{i\ge 1} \left\vert c_i \right\vert^2 < \infty . \]
The theorem was proven independently in 1907 by Frigyes Riesz and Ernst Sigismund Fischer.
Look at the difference between sn and sm:
\[ \| s_n - s_m \|^2 = \left\| \sum_{k=m+1}^n c_k u_k \right\|^2 = \sum_{k=m+1}^n \left\vert c_k \right\vert^2 , \]
where we used the orthonormality of the {uk}. Thus, the partial sums of the series of | ck |² form a Cauchy sequence if and only if {sk} is a Cauchy sequence. Note that the completeness of ℌ is crucial.

Theorem 8: Let {uα : α ∈ A} be an orthonormal system in the Hilbert space ℌ. All the following conditions are equivalent:
  1. {uα : α ∈ A} is complete.
  2. For all x ∈ ℌ:    Σα∈A ⟨ x , uα ⟩ uα = x.
  3. Generalized Parseval identity. For all x, y ∈ ℌ:    ⟨ x , y ⟩ = Σα∈A ⟨ x , uα ⟩* ⟨ y , uα ⟩.
  4. Parseval identity. For all x ∈ ℌ:    ∥x∥² = Σα∈A |⟨ x , uα⟩|².
Suppose that (1) holds, i.e., the orthonormal system is complete. Given x ∈ ℌ, let {αn} be a sequence of indices that contains all the indices for which the Fourier coefficients of x do not vanish. For every index αn,
\[ \left\langle x - \sum_{k\ge 1} \langle x\, \vert \, u_{\alpha_k} \rangle \,u_{\alpha_k} , u_{\alpha_n} \right\rangle = 0 . \]
In fact, for all α∈A,
\[ \left\langle x - \sum_{k\ge 1} \langle x\, \vert \, u_{\alpha_k} \rangle \,u_{\alpha_k} , u_{\alpha} \right\rangle = 0 . \]
It follows that \( \displaystyle x - \sum_{k\ge 1} \langle x \,\vert\, u_{\alpha_k} \rangle\,u_{\alpha_k} \) is orthogonal to all the vectors {uα}; but since we assumed that the orthonormal system is complete, it follows that it is zero, i.e.,
\[ x = \sum_{k\ge 1} \langle x\,\vert\, u_{\alpha_k} \rangle\, u_{\alpha_k} , \]
and once again we may extend the sum over all α∈A.

Suppose that (2) holds:

\[ x = \sum_{\alpha \in A} \langle x\,\vert\, u_{\alpha} \rangle\, u_{\alpha} . \]
Given x, y ∈ ℌ, let {αn} be a sequence of indices that contains all the indices for which at least one of the Fourier components of either x or y does not vanish. By the continuity of the inner product:
\[ \langle x\,\vert\, y \rangle = \left\langle \sum_{k\ge 1} \langle x , u_{\alpha_k} \rangle \,u_{\alpha_k}, \sum_{j\ge 1} \langle y , u_{\alpha_j} \rangle \,u_{\alpha_j} \right\rangle = \sum_{k\ge 1} \langle x , u_{\alpha_k} \rangle^{\ast} \langle y , u_{\alpha_k} \rangle . \]

Suppose that (3) holds. Setting x = y, we obtain the Parseval identity.

Suppose that (4) holds. Let x ∈ ℌ be orthogonal to all the {uα}; then for all α ∈ A,

\[ \langle x \,\vert\, u_{\alpha} \rangle = 0 . \]
It follows from the Parseval identity that x = 0, i.e., the orthonormal system is complete.
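A quick numeric sanity check of the generalized Parseval identity (condition 3); the basis of normalized Legendre polynomials on [−1, 1] and the vectors x = eᵗ, y = cos t are our own choices.

basis = Table[Sqrt[(2 n + 1)/2] LegendreP[n, t], {n, 0, 12}];
ip[a_, b_] := NIntegrate[a b, {t, -1, 1}];       (* real inner product *)
{ip[Exp[t], Cos[t]], Total[ip[Exp[t], #] ip[Cos[t], #] & /@ basis]}
(* -> two nearly identical numbers (about 1.9336), as identity (3) predicts *)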

Example 5: We consider the Haar functions that constitute an orthonormal basis in Hilbert space 𝔏²([0,1]). The Haar wavelet is a sequence of rescaled "square-shaped" functions defined as follows:

\begin{align*} \phi_0 (t) &= 1 , \\ \phi_1 (t) &= 1_{[0,1/2)} - 1_{[1/2,1]} , \\ \phi_2 (t) &= \sqrt{2} \left( 1_{[0,1/4)} - 1_{[1/4,1/2)} \right) , \\ \phi_3 (t) &= \sqrt{2} \left( 1_{[1/2,3/4)} - 1_{[3/4,1]} \right) , \end{align*}
and in general,
\[ \phi_{2^n +k} (t) = 2^{n/2} \left( 1_{[2^{-n} k, 2^{-n}\left( k+1/2 \right) )} - 1_{[2^{-n} \left( k + 1/2 \right) , 2^{-n} \left( k+1 \right) )} \right) , \qquad n\in \mathbb{N}, \quad k =0,1, \ldots , 2^n -1. \]
The Haar sequence or wavelet was proposed in 1909 by the Hungarian--Jewish mathematician Alfréd Haar (1885--1933). The Haar wavelets form an orthonormal system in 𝔏²([0, 1]). Also, the span of all {ϕn} is the same as the span of all step functions with dyadic intervals. It is known that this span is dense in 𝔏²([0, 1]), hence the Haar functions form an orthonormal basis in 𝔏²([0, 1]).

It therefore follows that for every f ∈ 𝔏²([0, 1]):

\[ f = \sum_{n\ge 0} \langle f , \phi_n \rangle \,\phi_n . \]
The limit is in 𝔏²([0, 1]). The question is whether the sum also converges pointwise (almost everywhere). Such questions are usually quite hard. For the specific choice of the Haar basis, it is relatively easy, due to the "good" ordering of those functions.
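Below is a minimal Mathematica sketch of the Haar expansion (the target f(t) = t² and the truncation at 16 terms are our choices); the indexing follows the formula above with m = 2ⁿ + k.

haar[0][t_] := 1;
haar[m_][t_] := With[{n = Floor[Log2[m]]},
  With[{k = m - 2^n},
   2^(n/2) (Boole[2^-n k <= t < 2^-n (k + 1/2)] -
            Boole[2^-n (k + 1/2) <= t < 2^-n (k + 1)])]];
cs = Table[Integrate[t^2 haar[m][t], {t, 0, 1}], {m, 0, 15}];  (* <f, phi_m> *)
approx[t_] := Sum[cs[[m + 1]] haar[m][t], {m, 0, 15}];
Plot[{t^2, approx[t]}, {t, 0, 1}]   (* piecewise-constant L^2 approximation *)

End of Example 5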
Theorem 9: Every infinite orthonormal sequence in an inner-product space weakly converges to zero.
This is an immediate consequence of the Bessel inequality: if {un} is an orthonormal sequence, then for every x ∈ X
\[ \sum_{n\ge 1} | \langle x\,\vert\, u_n \rangle |^2 \leqslant \| x \|^2 , \]
from which follows that
\[ \lim_{n\to\infty} | \langle x\,\vert\, u_n \rangle | = 0 \]
for every x ∈ X. Hence, the sequence { un } converges weakly to zero, i.e., un ⇀ 0.

Obviously, an infinite orthonormal sequence does not (strongly) converge to zero as the corresponding sequence of norms is constant and equal to one, and the norm is continuous with respect to (strong) convergence.

Example 6: Let us consider the sequence in Hilbert space ℌ = 𝔏²([0, 2π]):

\[ u_n (x) = \frac{1}{\sqrt{2\pi}} \, e^{{\bf j}nx} , \qquad {\bf j}^2 = -1. \]
It is easy to check that they constitute an orthonormal system; thus,
\[ \lim_{n\to\infty} \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(x)\, e^{{\bf j}nx} {\text d}x = 0 , \qquad \forall f \in ℌ. \]
End of Example 6
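A quick numeric confirmation of Example 6 (the choice f(x) = x is ours): the inner products with un decay to zero even though ∥un∥ = 1 for every n.

inner[n_] := NIntegrate[x Exp[I n x], {x, 0, 2 Pi}]/Sqrt[2 Pi];
Abs[inner[#]] & /@ {1, 4, 16, 64}
(* -> roughly {2.5066, 0.6267, 0.1567, 0.0392}, decaying like 1/n *)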

Eigenfunction Expansions


This subsection establishes the main characteristic of self-adjoint operators: they have a set of orthogonal eigenfunctions. Recall that a self-adjoint operator T acting in a Hilbert space ℌ is an operator for which the following identity holds:
\[ \langle T\,x, y \rangle = \langle x, T\,y \rangle , \qquad \forall x, y \in D(T), \]
where D(T) is the domain of operator T, which is assumed to be dense in ℌ.
Theorem 10: If T is a self-adjoint operator, then its eigenvalues are real and eigenvectors of T corresponding to distinct eigenvalues are orthogonal.

Now we are going to take advantage of this property by considering the Sturm--Liouville differential operator

\begin{equation*} %\label{EqSturm.3} L\left[ x, \texttt{D} \right] = q(x)\,\texttt{I} - \texttt{D}\,p(x)\,\texttt{D} , \qquad \texttt{D} = \frac{\text d}{{\text d}x} , \quad \texttt{I} = \texttt{D}^0 , \end{equation*}
in the Hilbert space 𝔏²([𝑎, b], w). Here p(x), its derivative p′(x), and q(x), w(x) are given continuous functions on some interval [𝑎, b]. It is also assumed that the weight function w(x) is strictly positive on the closed interval [𝑎, b], but the function p(x) may vanish at the end points x = 𝑎 and/or x = b in the case of a singular problem. Accordingly, we consider the classical Sturm--Liouville problem that consists of the differential equation with a parameter λ:
\begin{equation} \label{EqSturm.1} L\left[ x, \texttt{D} \right] y = \lambda\,w\,y \qquad\mbox{or} \qquad \frac{\text d}{{\text d}x} \left[ p(x)\,\frac{{\text d}y}{{\text d}x} \right] - q(x)\, y + \lambda \,w (x)\,y(x) =0 , \qquad a < x < b , \end{equation}
subject to the homogeneous boundary conditions of the third kind
\begin{equation} \label{EqSturm.2} \alpha_0 y(a) - \alpha_1 y'(a) =0 , \qquad \beta_0 y(b ) + \beta_1 y' (b) =0 , \qquad |\alpha_0 | + |\alpha_1 | \ne 0 \quad\mbox{and} \quad |\beta_0 | + |\beta_1 | \ne 0. \end{equation}

First, using integration by parts, it is not hard to show that the Sturm--Liouville operator L is self-adjoint:

\[ \left\langle L \left[ x, \texttt{D} \right] u \,\vert \,v \right\rangle = \int_a^b \left( q(x)\,u(x) - \frac{{\text d}}{{\text d}x}\left[ p(x)\, \frac{{\text d}u}{{\text d}x} \right] \right) v(x)\,{\text d}x = \left\langle u\,\vert\, L \left[ x, \texttt{D} \right] v \right\rangle . \]
It turns out that the eigenfunctions corresponding to different eigenvalues are orthogonal with respect to the weight function
\begin{equation} \label{EqExpand.3a} \left\langle \phi_n , \phi_k \right\rangle = \int_a^b \phi_n (x)\,\phi_k (x)\,w(x)\,{\text d} x = 0 \qquad\mbox{if} \quad n\ne k . \end{equation}
Theorem 11: If a function f belongs to the domain D of the Sturm--Liouville operator L (this means that f possesses two continuous derivatives and satisfies the corresponding boundary conditions), then f can be expanded into a uniformly convergent series with respect to the eigenfunctions of the operator L:
\begin{equation} \label{EqExpand.6} f(x) = \sum_n c_n \phi_n (x) , \qquad c_n = \frac{\langle f(x), \phi_n (x) \rangle}{\| \phi_n \|_2^2} . \end{equation}
The expressions in Eq.\eqref{EqExpand.6} are
\[ \langle f , \phi_n \rangle = \int_a^b f(x)^{\ast}\,\phi_n (x)\,w(x)\,{\text d}x , \qquad \| \phi_n \|_2^2 = \int_a^b \left\vert \phi_n (x)\right\vert^2 w(x)\,{\text d}x , \]
where asterisk stands for complex conjugate.

In 1907, mathematicians Frigyes Riesz and Ernst Fischer independently published foundational results that establish completeness of the space 𝔏²([𝑎, b]).

However, most important is the ability to compute the eigenfunction decomposition (which is actually the spectral decomposition with respect to the differential operator L) of a wide class of functions. That is, for any function f ∈ 𝔏²([𝑎, b], w) satisfying the prescribed boundary conditions, we wish to synthesize it as

\begin{equation} \label{EqSturm.6} f(x) = \sum_k c_k \phi_k (x) , \end{equation}
where ϕk(x) are eigenfunctions. We wish to find out whether we can represent a function in this way, and if so, we wish to calculate ck (and of course we would want to know if the sum converges). Although this topic is considered in detail in a dedicated section, we outline the main ideas here.

Assuming that series \eqref{EqSturm.6} converges, we multiply it by w(x) ϕn(x) and integrate with respect to x from 𝑎 to b. This yields

\[ \langle f\,|\,\phi_n \rangle = \int_a^b \overline{f(x)} \phi_n (x)\,w (x)\, {\text d} x = \sum_k c_k \int_a^b \phi_k (x)^{\ast} \phi_n (x)\,w (x)\, {\text d} x = \sum_k c_k \langle \phi_k , \phi_n \rangle . \]
Using the orthogonal property of eigenfunctions
\begin{equation} \label{EqSturm.7} \langle \phi_k , \phi_n \rangle = \int_a^b \overline{\phi_k (x)} \phi_n (x)\,w (x)\, {\text d} x = \begin{cases} 0, & \ \mbox{ if} \quad n \ne k , \\ \| \phi_n (x) \|^2_2 , & \ \mbox{ for} \quad n=k ; \end{cases} \end{equation}
we obtain
\[ \langle f\,|\,\phi_n \rangle = c_n \| \phi_n (x) \|_2^2 \qquad \Longleftrightarrow \qquad \int_a^b f(x)^{\ast} \phi_n (x)\,w (x)\,{\text d}x = c_n \int_a^b \left\vert \phi_n (x) \right\vert^2 w (x)\, {\text d} x . \]
Therefore, we find
\begin{equation} \label{EqSturm.8} c_n = \frac{\langle f(x)\,|\,\phi_n (x) \rangle}{\| \phi_n (x) \|_2^2} . \end{equation}
and the Fourier series \eqref{EqSturm.6} becomes
\[ %\begin{equation} \label{EqSturm.9} f(x) = \sum_k \frac{\langle f(x)\,|\,\phi_k (x) \rangle}{\| \phi_k (x) \|_2^2} \,\phi_k (x) . \tag{9} \]
Upon introducing the orthonormal functions en = ϕn/∥ϕn∥, we rewrite Eq.(9) in compact form:
\[ f(x) = \sum_{k} \langle f\,|\,e_k \rangle \, e_k (x) . \]
In particular, we get the delta-function expansion
\[ \delta (x-t) = \sum_{k} e_k^{\ast} (t)\,w(t)\,e_k (x) , \]
where w(t) is the weight function. Of course, these "equations" are understood in a weak sense, as limits of the corresponding partial sums.

These Fourier coefficients \eqref{EqSturm.8} satisfy the so-called Bessel inequality:

\[ \sum_{k\ge 1} \left\vert c_k \right\vert^2 \| \phi_k \|^2 \le \left\langle f , f \right\rangle . \]
A set of orthogonal functions { φk } in 𝔏² is called complete in the closed interval [𝑎, b] whenever the vanishing of the inner products of a member of 𝔏²([𝑎, b], w) with all of the orthogonal functions implies that this member is equal to zero almost everywhere in the domain.
The term complete was introduced in 1910 by the famous Russian mathematician Vladimir Steklov (1864--1926). A set of functions { φk } in 𝔏² is complete if and only if the Parseval identity holds:
\[ %\begin{equation} \label{EqSturm.10} \sum_{k\ge 1} \left\vert c_k \right\vert^2 \| \phi_k \|^2 = \left\langle f, f \right\rangle = \int_a^b \left\vert f(x) \right\vert^2 w(x)\,{\text d}x = \| f(x) \|^2_2 . \tag{10} \]
The identity above was stated by the famous French mathematician Marc-Antoine Parseval (1755--1836) in 1799.

The main reason to study Sturm--Liouville problems is that their eigenfunctions provide a basis for expansion of a certain class of functions. This tells us that basis functions have close relationships with linear operators, and this basis is orthogonal for self-adjoint second order differential operators. In other words, solutions of differential equations can be approximated more efficiently by means of better basis functions. For example, a periodic function can be approximated more efficiently by periodic basis functions (Fourier series) than by polynomials (Taylor series).
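To close the circle, here is a compact Mathematica sketch (our own worked example) of the simplest Sturm--Liouville problem: −y′′ = λy, y(0) = y(π) = 0, i.e., p = w = 1, q = 0, and α₁ = β₁ = 0 in \eqref{EqSturm.2}. Its eigenfunctions sin(nx) give the Fourier sine expansion of the assumed target f(x) = x.

c[n_] := Integrate[x Sin[n x], {x, 0, Pi}]/(Pi/2);   (* c_n = <f, phi_n>/||phi_n||^2 *)
Simplify[c[n], n \[Element] Integers]                (* -> 2 (-1)^(1 + n)/n *)
partial[x_, m_] := Sum[c[n] Sin[n x], {n, 1, m}];
Plot[{x, partial[x, 5], partial[x, 25]}, {x, 0, Pi}]
(* the partial sums converge to x inside the interval; at x = Pi they stay at 0 *)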

  1. Let ℌ be a Hilbert space and let PA and PB be orthogonal projections on closed subspaces A and B.
    1. Show that if PAPB is an orthogonal projection, then it projects on A ∩ B.
    2. Show that PAPB is an orthogonal projection if and only if    PAPB = PBPA.
    3. Show that if PAPB is an orthogonal projection, then   PA + PBPAPB   is an orthogonal projection on A + B.
    4. Find an example in which    PAPBPBPA.
  2. Let M be the set of even (almost everywhere) functions from 𝔏²(−∞, ∞), i.e.,
    \[ M = \left\{ f \in 𝔏²\, : \, f(x) =f(-x) \right\} . \]
    1. Show that M is a closed subspace.
    2. Find M⊥.
  3. What is the orthogonal complement of the following sets of Hilbert space 𝔏²[0, 1]?
    1. The set of polynomials.
    2. The set of polynomials in x².
    3. The set of polynomials with 𝑎0 = 0.
    4. The set of polynomials with coefficients summing up to zero.
  4. Let M = {x = (x1, ... , xn ) ∈ ℝn ∶ x1 + x2 + ⋯ + xn = 1}. Show that M is closed and convex, and find the element in M closest to the origin.
  5. Let C be a closed convex subset of a Hilbert space ℌ, let x ∈ ℌ − C, and let y be the closest element of C to x. Prove that, for every z ∈ C, Re⟨x − y, z − y⟩ ≤ 0.

 
