Introduction to Linear Algebra with Mathematica

Preface


In this section, we discuss how to expand a function f from the Hilbert space 𝔏²([𝑎, b], w) with respect to eigenfunctions of a classical Sturm--Liouville problem. Since many practical examples of such expansions are presented in the following sections, we concentrate our attention on the theoretical exposition. Such a representation of an arbitrary function f(x) is called an eigenfunction expansion of f(x). A natural question arises: what does “represents” mean? Does it mean in the sense of pointwise convergence? Or mean convergence? Or perhaps some other concept altogether?

Spectral Decomposition


Decomposition of complicated systems into their constituent parts is one of science’s most powerful strategies for the analysis and understanding of large-scale systems with linearly coupled components. Spectral decomposition—splitting a linear operator into independent modes of simple behavior—is greatly appreciated in mathematical physics. For example, wave dynamics is usually captured by a superposition of simple modes. Quantum mechanics and statistical mechanics identify the energy eigenvalues of Hamiltonians as the basic objects in thermodynamics: transitions among the energy eigenstates yield heat and work. The eigenvalue spectrum reveals itself most directly in other kinds of spectra, such as the frequency spectra of light emitted by the gases that permeate the galactic filaments of our universe.

Spectral decomposition often allows a problem to be simplified by approximations that use only the dominant contributing modes. Indeed, human-face recognition can be efficiently accomplished using a small basis of “eigenfaces”. Certainly, there are many applications that highlight the importance of decomposition. In this section, we concentrate our attention on the application of decomposition theory to differential equations generated by a linear second order self-adjoint operator subject to boundary conditions. It is known that, depending on the boundary conditions, the corresponding boundary value problem may have a discrete spectrum (a set of eigenvalues), a continuous spectrum, or a combination of both.

When solving an inhomogeneous differential equation

\[ L \left[ x, \texttt{D} \right] y = f \qquad ( \texttt{D} = {\text d}/{\text d}x) , \]
with a differential operator L, we expand the input function and the unknown solution into the series over eigenfunctions:
\[ f(x) = \sum_n f_n \phi_n (x) , \qquad y(x) = \sum_n c_n \phi_n (x) , \]
where { ϕn(x) } are eigenfunctions of the differential operator L. Upon substituting these series expansions into the differential equation, we reduce the problem to a simple multiplication problem:
\[ L \left[ x, \texttt{D} \right] \sum_n c_n \phi_n (x) = \sum_n c_n L \left[ x, \texttt{D} \right] \phi_n (x) = \sum_n c_n \lambda_n \,\phi_n (x) = \sum_n f_n \phi_n (x) \]
because ϕn(x) are eigenfunctions. Assuming that all infinite series above converge, we obtain
\[ c_n \lambda_n = f_n \qquad \Longrightarrow \qquad c_n = \frac{f_n}{\lambda_n} . \]
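To make this recipe concrete, here is a minimal Mathematica sketch (our own illustration; the right-hand side f(x) = x(π − x) and the truncation order are assumptions, not taken from the text). We solve −y′′ = f on [0, π] with y(0) = y(π) = 0, whose eigenfunctions are ϕn(x) = sin(nx) with eigenvalues λn = n².

(* sketch: solve -y'' = f, y(0) = y(Pi) = 0, by eigenfunction expansion *)
f[x_] := x (Pi - x);
fn[n_] := (2/Pi) Integrate[f[x] Sin[n x], {x, 0, Pi}];  (* Fourier sine coefficient *)
y[x_] = Sum[fn[n]/n^2 Sin[n x], {n, 1, 10}];            (* c_n = f_n / lambda_n *)
exact = DSolveValue[{-u''[x] == f[x], u[0] == 0, u[Pi] == 0}, u[x], x];
Plot[{y[x], exact}, {x, 0, Pi}]                         (* the two curves coincide *)

Only the odd modes contribute here, since f is symmetric about the midpoint x = π/2.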

 

In quantum mechanics, the state of a physical system is represented by a vector in a Hilbert space: a complex vector space with an inner product. It is customary to use Dirac notation, in which the vectors in the space are denoted by | v ⟩, called a ket, where v is some symbol that identifies the vector. The multiple of a vector by a complex number c is written as c| v ⟩. In Dirac notation, the inner product of the vectors | v ⟩ and | w ⟩ is written ⟨ v | w ⟩. Then the eigenfunction expansion takes the form:
\[ |\, f\,\rangle = \sum_n f_n |\,\phi_n \rangle \qquad \mbox{or} \qquad |\, f\,\rangle = \sum_n f_n |\,n \rangle . \]

Projection


Orthogonality is one of the central concepts in the theory of Hilbert spaces. Another concept, intimately related to orthogonality, is orthogonal projection. Before getting to projections, we need the notion of a convex set: a set C in a vector space is convex if, together with any two points x and y, it contains the whole segment t x + (1 − t) y, 0 ≤ t ≤ 1. Convexity is a purely algebraic concept, but as we will see, it interacts with the topology induced by the inner product.

Two vectors x and y from an inner-product vector space are called orthogonal iff their inner product is zero:
\[ \langle x\,,\,y \rangle = \langle y\,,\,x \rangle = 0 \qquad \iff \qquad x \perp y . \]

Let X be an inner-product space, and let S ⊆ X be any of its subsets. We denote by S⊥ the set of vectors that are perpendicular to all the elements in S,

\[ S^{\perp} = \left\{ x\in X \,: \, \langle x\,,\,y \rangle = 0 \qquad \forall y \in S \right\} . \]
This set S⊥ is called the orthogonal complement or annihilator of S.

An inner product is also denoted with a vertical bar instead of a comma:   ⟨ ⋅ | ⋅ ⟩

Note that S ⊆ S⊥⊥ for any set S from a space with an inner product.

Example 1: Let us consider the space ℭ[−1, 1] of complex-valued continuous functions on the finite interval [−1, 1], with the inner product

\[ \langle f\,,\, g \rangle = \int_{-1}^1 f(x)^{\ast} g(x)\,{\text d} x , \]
where the asterisk designates the complex conjugate. Let M be the set of functions in ℭ[−1, 1] that vanish on the interval [−1, 0], and let N be the set of all functions in ℭ[−1, 1] that vanish on [0, 1]. Every function in M is orthogonal to every function in N. Thus, N ⊆ M⊥ and M ⊆ N⊥.
End of Example 1
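The orthogonality in Example 1 is easy to confirm in Mathematica; the particular functions below are our own sample choices of elements of M and N.

f[x_] := Piecewise[{{x, x >= 0}}, 0];        (* continuous, vanishes on [-1, 0] *)
g[x_] := Piecewise[{{x + x^2, x <= 0}}, 0];  (* continuous, vanishes on [0, 1] *)
Integrate[f[x] g[x], {x, -1, 1}]             (* -> 0; both functions are real, so no conjugation is needed *)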

Lemma 1: Let S be a subset of an inner-product space X, S ⊆ X. The set S⊥ is a closed linear subspace of X, and
\[ S^{\perp} \cap S \subset \{ 0 \} . \]
We start by showing that S⊥ is a linear subspace. Let x, y ∈ S⊥, so
\[ \langle x\,, \, z \rangle = \langle y\,, \, z \rangle = 0 \qquad \mbox{for any} \quad z \in S. \]
For arbitrary scalars α, β, we have
\[ \langle \alpha\,x + \beta\, y\,, \, z \rangle = \alpha \langle x\,, \, z \rangle + \beta \langle y\,, \, z \rangle = 0 \qquad \mbox{for any} \quad z \in S , \]
which implies that αx + βy ∈ S⊥, i.e., S⊥ is a linear subspace.

We next show that S⊥ is closed. Let {xn} be a sequence in S⊥ that converges to x ∈ X. By the continuity of the inner product,

\[ \langle x\,, \, z \rangle = \lim_{n\to\infty} \langle x_n\,, \, z \rangle = 0 \qquad \mbox{for any} \quad z \in S. \]
This means that x ∈ S⊥.

Suppose that x ∈ S⊥ ∩ S. As an element of S⊥, x is orthogonal to all the elements in S, and in particular to itself; hence ⟨ x , x ⟩ = 0, which by the defining property of the inner product implies that x = 0.

Theorem 1: Let ℌ be a Hilbert space and C ⊂ ℌ a closed convex set in it. Then for every x ∈ ℌ there exists a unique element y ∈ C such that the distance from x to the set C is
\[ d(x, C) = \| x-y \| , \]
where
\[ d(x, C) = \inf_{y\in C} \| x-y \| . \]
The mapping   x ↦ y   is called the projection of x onto the set C, and it is denoted by PC.
If x ∈ C, take y = x. If x ∉ C, then δ = dist(x, C) > 0 because C is closed.

We start by showing the existence of a distance minimizer. By the definition of the infimum, there exists a sequence {yn} ⊂ C satisfying

\[ d(x, C) = \lim_{n\to\infty} d(x, y_n ) . \]
Since C is convex,   ½(yn + ym) ∈ C for all m, n, and therefore,
\[ \left\| \frac{1}{2} \left( y_n + y_m \right) - x \right\| \geqslant d(x, C) . \]
By the parallelogram identity (this inner-product property was proved in the first section), \( \displaystyle \| a - b \|^2 = 2 \left( \| a \|^2 + \| b \|^2 \right) - \| a + b \|^2 , \quad \) we get
\begin{align*} 0 \leqslant \| y_n - y_m \|^2 &= \| \left( y_n - x \right) - \left( y_m - x \right) \|^2 \\ &= 2\,\| y_n - x \|^2 + 2\,\| y_m - x \|^2 - \| y_n + y_m - 2\,x \|^2 \\ & \leqslant 2\,\| y_n - x \|^2 + 2\,\| y_m - x \|^2 - 4\,d^2 (x, C) \ \to \ 0 \quad \mbox{as} \quad m,n \to \infty . \end{align*}
It follows that {yn} is a Cauchy sequence and hence converges to a limit y (which is where completeness is essential). Since C is closed, yC. Finally, by the continuity of the norm,
\[ \| x - y \| = \lim_{n\to\infty} \| x - y_n \| = d(x, C) . \]
which completes the existence proof of a distance minimizer.

Next, we show the uniqueness of the distance minimizer. Suppose that y, zC both satisfy

\[ \| y-x \| = \| z - x \| = d(x, C) . \]
By the parallelogram identity,
\[ \| y+z -2\,x \|^2 + \| y-z \|^2 = 2\,\| y-x \|^2 + 2\,\| z-x \|^2 , \]
which leads to
\[ \left\| \frac{y+z}{2} - x \right\|^2 = d^2 (x, C) - \frac{1}{4} \,\| y - z \|^2 . \]
If y ≠ z, then (y + z)/2, which belongs to C, is closer to x than the distance from x to C, which is a contradiction.
The existence of a unique projection does not hold, in general, in complete normed spaces (i.e., Banach spaces). A distance minimizer does exist in finite-dimensional normed spaces, but it may not be unique. In infinite-dimensional Banach spaces, distance minimizers may fail to exist.
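A small numerical illustration of Theorem 1 (our own example with an assumed convex set): take C to be the closed unit disk in ℝ² and x = (3, 4) outside it; minimizing the squared distance produces the unique closest point.

(* distance minimizer onto the closed convex set C = {u^2 + v^2 <= 1} *)
ArgMin[{(u - 3)^2 + (v - 4)^2, u^2 + v^2 <= 1}, {u, v}]
(* -> {3/5, 4/5}, the radial projection of (3, 4); unique, as Theorem 1 guarantees *)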

Corollary 1: Let ℌ be a Hilbert space and C ⊂ ℌ a closed convex set in it. The projection   PC : ℌ → C   is idempotent, PCPC = PC.

Corollary 2: Let M be a closed linear subspace of a Hilbert space ℌ. Then
\[ y = P_M x \]
if and only if (abbreviated as iff)
\[ y \in M \qquad\mbox{and} \qquad x-y \in M^{\perp} . \]

Projection Theorem: Let M be a closed linear subspace of a Hilbert space ℌ. Then every vector x ∈ ℌ has a unique decomposition
\[ x = m + n , \qquad m \in M , \quad n \in M^{\perp} . \]
Furthermore, m = PMx, so
\[ ℌ = M \oplus M^{\perp} . \]
Let x ∈ ℌ. By Corollary 2,
\[ x - P_M x \in M^{\perp} , \]
hence,
\[ x = P_M x + \left( x - P_M x \right) \]
satisfies the required properties of the decomposition.

Next, we show that the decomposition is unique. Assume

\[ x = m_1 + n_1 = m_2 + n_2 , \]
where m₁, m₂ ∈ M and n₁, n₂ ∈ M⊥. Then
\[ M \ni m_1 - m_2 = n_1 - n_2 \in M^{\perp} . \]
Uniqueness follows from the fact that M ∩ M⊥ = { 0 }.
The element m in the Projection theorem is called the orthogonal projection of x on M. It is worth reiterating that m is the closest element in M to x. The mapping PM ∶ ℌ → ℌ defined by PM(x) = m is called the projection operator (or simply the projection) of ℌ onto M.
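The following Mathematica sketch illustrates the Projection Theorem in 𝔏²(−1, 1); the subspace M = span{1, x, x²} and the vector f(x) = eˣ are our own choices.

basis = Orthogonalize[{1, x, x^2}, Integrate[#1 #2, {x, -1, 1}] &];
projM[f_] := Sum[Integrate[f e, {x, -1, 1}] e, {e, basis}];  (* P_M f *)
m = projM[Exp[x]];          (* the element of M closest to Exp[x] *)
h = Exp[x] - m;             (* the component x - P_M x, lying in the orthogonal complement *)
Simplify[Integrate[h #, {x, -1, 1}] & /@ basis]              (* -> {0, 0, 0} *)

Since all functions involved are real, the inner product needs no conjugation.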

Example 2: The projection theorem does not hold when its conditions are not satisfied. Take, for example, ℌ = ℓ², with the linear subspace

\[ M = \left\{ (a_n ) \in \ell^2 \, : \, \exists N; \quad \forall n > N, \quad a_n = 0 \right\} . \]
This linear subspace is not closed, and its orthogonal complement is { 0 }, so
\[ M \oplus M^{\perp} = M \ne \ell^2 . \]
End of Example 2

Corollary 3: For every linear subspace M of a Hilbert space ℌ,
\[ \left( M^{\perp} \right)^{\perp} = \overline{M} \quad (\mbox{closure}). \]

Orthogonality


Recall that a Hilbert space ℌ is a real or complex vector space with inner product that is also a complete metric space with respect to the distance function induced by the inner product; \( \displaystyle d(x,y) = \| x - y \| = \langle x-y, x-y \rangle^{1/2} . \)
A set S in a vector space X with inner product ⟨ ∣ ⟩ is called orthogonal if x ⊥ y for every pair of distinct vectors x, y ∈ S, so ⟨ x , y ⟩ = 0 for y ≠ x. If, in addition, ∥x∥ = 1 for all x ∈ S, the system S is referred to as orthonormal.

An orthonormal/orthogonal system S in a Hilbert space ℌ is called a complete orthonormal/orthogonal system, or an orthonormal/orthogonal basis, in a vector space X ⊆ ℌ if S is not a proper subset of any other orthonormal/orthogonal system of the space X. That is, it is complete if the only vector orthogonal to all elements of S is zero.

Theorem 3: Every Hilbert space ℌ with at least one nonzero element has a complete orthonormal system. Moreover, for any orthonormal system S in a Hilbert space ℌ, there exists a complete orthonormal system that contains S as a subset.
Let S be an orthonormal system in ℌ. Such a system certainly exists in ℌ: for instance, if x ≠ 0, then { x/∥x∥ } is an orthonormal system. Consider the family of all orthonormal systems in ℌ that contain S, partially ordered by inclusion: S₁ ≺ S₂ when S₁ ⊆ S₂. The union of any totally ordered subfamily {T} is again an orthonormal system and serves as an upper bound for {T}. Then, according to Zorn's lemma, there exists a maximal element S0 that contains S. The system S0 is complete precisely because it is maximal: a nonzero vector orthogonal to every element of S0 could be normalized and adjoined to S0, contradicting maximality.

Orthogonal Expansions


A Hilbert space ℌ is separable if and only if every orthonormal basis of ℌ is countable.

The goal of this subsection is to represent an arbitrary element of a Hilbert space ℌ in terms of a basis of some kind. If dim(ℌ) < ∞, the goal is trivial, and if dim(ℌ) = ∞, the goal is unrealistic if one insists on looking at a Hamel basis, because any such basis is uncountable and hence too big to be useful. The only realistic expectation is to hope to express an arbitrary element of ℌ as a series of the basis elements, as was achieved in the Fourier series section. This means that ℌ has a Schauder basis, which immediately suggests that we investigate separable Hilbert spaces.


Therefore, we focus mostly but not exclusively on separable Hilbert spaces. Many of the results developed in this chapter are valid for non-separable Hilbert spaces. Examples include the projection theorem and the Riesz representation theorem (discussed in a later section).

Corollary 4: Every separable Hilbert space ℌ contains a countable complete orthonormal system.
Recall that ℌ is separable if it contains a countable dense subset. Let {xn} be a dense countable set. In particular:
\[ \overline{\mbox{span} \left\{ x_n \, : \, n \in \mathbb{N} \right\}} = ℌ. \]
We can construct inductively a subset {yn} of linearly independent vectors such that for every n there exists an integer N with
\[ \mbox{span} \left\{ y_k \, : \, 1 \le k \le N \right\} = \mbox{span} \left\{ x_k \, : \, 1 \le k \le n \right\} . \]
In other words,
\[ \overline{\mbox{span} \left\{ y_n \, : \, n \in \mathbb{N} \right\}} = ℌ. \]
By applying Gram-Schmidt orthonormalization, we obtain an orthonormal system {un} such that
\[ \overline{\mbox{span} \left\{ u_n \, : \, n \in \mathbb{N} \right\}} = ℌ. \]
We will show that this orthonormal system is complete. Suppose that v is orthogonal to all the {un}. Every x ∈ ℌ is a limit
\[ x = \lim_{n\to\infty} \sum_{k=1}^n c_{n,k} u_k . \]
It follows by the continuity of the inner product that
\[ \langle x\,\vert\, v \rangle = \lim_{n\to\infty} \sum_{k=1}^n c_{n,k} \langle u_k \,\vert\, v \rangle = 0, \]
i.e., v is orthogonal to all vectors in ℌ; hence v = 0, from which it follows that the {un} form a complete orthonormal system.

Example 3: It is possible for a separable inner product space (hence for a separable Hilbert space) to contain uncountably many pairs of orthogonal vectors.

As in Example 1, let us consider the space of continuous functions ℭ[−1, 1] with the inner product

\[ \langle f\,,\, g \rangle = \int_{-1}^1 \overline{f(x)}\,g(x)\,{\text d} x . \]
Let M be the set of functions in ℭ[−1, 1] that vanish on the interval [−1, 0], and let N be the set of all functions in ℭ[−1, 1] that vanish on the interval [0, 1]. Every function in M is orthogonal to every function in N.

Every pair of functions (f, g) ∈ M × N is orthogonal. Since both M and N are uncountable, we have proved our assertion.

End of Example 3

Let S = { xα : α ∈ A } be a complete orthogonal system in a Hilbert space ℌ. For any element f ∈ ℌ, its Fourier coefficients are defined by
\begin{equation} \label{EqExpand.1} f_{\alpha} = \frac{\langle f, x_{\alpha} \rangle}{\langle x_{\alpha}, x_{\alpha} \rangle} = \frac{\langle f, x_{\alpha} \rangle}{\| x_{\alpha} \|^2} . \end{equation}

Theorem 5: Let S = { xα : α ∈ A } be a complete orthogonal system in Hilbert space ℌ. Then Parseval's identity holds:
\begin{equation} \label{EqExpand.2} \| f \|^2 = \sum_{\alpha \in A} | f_{\alpha} |^2 \| x_{\alpha} \|^2 = \sum_{\alpha \in A} \frac{\left\vert \langle f, x_{\alpha} \rangle \right\vert^2}{\| x_{\alpha} \|^2} . \end{equation}

For every element f from a Hilbert space ℌ and a complete orthogonal system { xα }, the formula
\begin{equation} \label{EqExpand.4} f = \sum_{i\ge 1} f_{\alpha_i} x_{\alpha_i} = \lim_{n\to\infty} \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \end{equation}
is called the Fourier expansion of f with respect to the complete basis { xα }. The set of scalars { fα } is called the set of Fourier components of f with respect to the orthogonal system.

For a given complete orthogonal system S = { xα : α ∈ A } in Hilbert space ℌ, any vector f from ℌ can be expanded into Fourier series:

\begin{equation} \label{EqExpand.2a} f = \sum_{\alpha} f_{\alpha} x_{\alpha} = \sum_{\alpha} \frac{\langle f , x_{\alpha} \rangle}{\langle x_{\alpha} , x_{\alpha} \rangle}\, x_{\alpha} = \sum_{\alpha \in A}\frac{\langle f , x_{\alpha} \rangle}{\| x_{\alpha} \|^2}\, x_{\alpha} . \end{equation}

Lemma (Bessel inequality): For any orthogonal system S = { xα : α ∈ A }, the Bessel inequality holds:
\begin{equation} \label{EqExpand.3} \| f \|^2 \ge \sum_{\alpha \in A} | f_{\alpha} |^2 \| x_{\alpha} \|^2 = \sum_{\alpha \in A} \frac{\left\vert \langle f, x_{\alpha} \rangle \right\vert^2}{\| x_{\alpha} \|^2} . \end{equation}
Let α₁, α₂, … , αn be an arbitrary finite set of indices. For any finite set of numbers \( \displaystyle c_{1}, c_2 , \ldots , c_n , \) we have for an arbitrary orthonormal system { xα } that
\begin{align*} \left\| f - \sum_{i=1}^n c_i x_{\alpha_i} \right\|^2 &= \left\langle f - \sum_{i=1}^n c_i x_{\alpha_i} , f - \sum_{i=1}^n c_i x_{\alpha_i} \right\rangle \\ &= \| f \|^2 - \sum_{i=1}^n c_i^{\ast} f_{\alpha_i} - \sum_{i=1}^n c_i f_{\alpha_i}^{\ast} + \sum_{i=1}^n \left\vert c_i \right\vert^2 \\ &= \| f \|^2 - \sum_{i=1}^n \left\vert f_{\alpha_i} \right\vert^2 + \sum_{i=1}^n \left\vert f_{\alpha_i} - c_i \right\vert^2 . \end{align*}
Therefore, minimum of expression \( \displaystyle \left\| f - \sum_{i=1}^n c_i x_{\alpha_i} \right\|^2 \) is attained when \( \displaystyle c_i = f_{\alpha_i} , \quad i = 1,2, \ldots , n . \) Hence,
\[ \left\| f - \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \right\|^2 = \| f \|^2 - \sum_{i=1}^n \left\vert f_{\alpha_i} \right\vert^2 . \]
So we conclude that
\[ \sum_{i=1}^n \left\vert f_{\alpha_i} \right\vert^2 \le \| f \|^2 . \]
Since the indices α₁, α₂, … , αn were chosen arbitrarily, we conclude that fα ≠ 0 for at most countably many indices; so Bessel's inequality is valid.
Observe that the sequence of partial sums \( \displaystyle \left\{ \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \right\} \) is a Cauchy sequence because of the orthogonality of { xα }. Indeed,
\[ \left\| \sum_{i=k}^n f_{\alpha_i} x_{\alpha_i} \right\|^2 = \left\langle \sum_{i=k}^n f_{\alpha_i} x_{\alpha_i} , \sum_{i=k}^n f_{\alpha_i} x_{\alpha_i} \right\rangle = \sum_{i=k}^n \left\vert f_{\alpha_i} \right\vert^2 \]
tends to zero as k ⟶ ∞ by Bessel's inequality. Since ℌ is complete, the limit \( \displaystyle g = \lim_{n\to\infty} \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \) exists. We are going to show that the vector f − g is orthogonal to every element of S.

Due to the continuity of the inner product,

\[ \left\langle f-g, x_{\alpha} \right\rangle = \lim_{n\to\infty} \left\langle f- \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} , x_{\alpha} \right\rangle = f_{\alpha} - f_{\alpha} = 0. \]
Hence, f − g = 0 because the system { xα } is complete. Since the norm is continuous, we get
\[ 0 = \lim_{n\to\infty} \left\| f - \sum_{i=1}^n f_{\alpha_i} x_{\alpha_i} \right\|^2 = \| f \|^2 - \lim_{n\to\infty} \sum_{i=1}^n | f_{\alpha_i} |^2 \| x_{\alpha_i} \|^2 , \]
which proves Parseval's identity.
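A numerical Mathematica check of Bessel's inequality and of the limiting Parseval identity, using the orthogonal Legendre polynomials Pn on [−1, 1] with ∥Pn∥² = 2/(2n + 1); the choice f(x) = eˣ is ours.

c[n_] := (2 n + 1)/2 Integrate[Exp[x] LegendreP[n, x], {x, -1, 1}]; (* f_n *)
bessel[m_] := Sum[c[n]^2 2/(2 n + 1), {n, 0, m}];  (* sum of |f_n|^2 ||P_n||^2 *)
N[{bessel[2], bessel[6], Integrate[Exp[2 x], {x, -1, 1}]}]
(* -> {3.62543, 3.62686, 3.62686}: the partial sums increase up to ||f||^2 *)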

Theorem --- Gram-Schmidt orthonormalization: Let { xn } be either a finite or a countable sequence of linearly independent vectors in an inner-product space X. Then it is possible to construct an orthonormal sequence { yn } that has the same cardinality as the sequence { xn }, such that
\[ \mbox{span}\left\{ y_k \, : \, 1 \leqslant k \leqslant n \right\} = \mbox{span}\left\{ x_k \, : \, 1 \leqslant k \leqslant n \right\} \qquad \forall n \in \mathbb{N} . \]
Take a countable dense subset – which can be arranged as a sequence { vj } and the existence of which is the definition of separability – and orthonormalize it. Thus, if v₁ ≠ 0, set e₁ = v₁/‖v₁‖. Proceeding by induction, we can suppose to have found, for a given integer n, orthonormal elements ei,    i = 1, … , m, where m ≤ n, such that
\[ \mbox{span}\left\{ e_1 , e_2 , \ldots , e_m\right\} = \mbox{span}\left\{ v_1 , v_2 , \ldots , v_n \right\} . \tag{GS.1} \]
To show the inductive step, observe that if vn+1 is in the span in (GS.1), then the same ei’s work for n + 1. So we may as well assume that the next element, vn+1, is not in the span in (GS.1). It follows that
\[ w = v_{n+1} - \sum_{i=1}^m \langle v_{n+1} , e_i \rangle e_i \ne 0 \quad \Longrightarrow \quad e_{m+1} = \frac{w}{\| w \|} \tag{GS.2} \]
makes sense. By construction it is orthogonal to all the earlier ei’s so adding em+1 gives the equality of the spans for n + 1.

Thus, we may continue indefinitely, since in fact the only way the dense set could be finite is if we were dealing with the space with one element, 0, in the first place. There are only two possibilities, either we get a finite set of ei’s or an infinite sequence. In either case this must be a maximal orthonormal sequence. That is, we claim

\[ ℌ \ni u \perp e_i \quad \forall i \qquad \Longrightarrow \qquad u = 0 . \tag{GS.3} \]
This uses the density of the vn’s. Let u ∈ ℌ satisfy (GS.3). There must exist a sequence {wj}, where each wj is one of the vn, such that wj → u in X. Now, each vn, and hence each wj, is a finite linear combination of the ek’s, so by Bessel’s inequality
\[ \| w_j \|^2 = \sum_k \left\vert \langle w_j , e_k \rangle \right\vert^2 = \sum_k \left\vert \langle w_j - u , e_k \rangle \right\vert^2 \le \| u - w_j \|^2 , \tag{GS.4} \]
where ⟨ u , ek ⟩ = 0 for all k has been used. Thus,    ∥wj∥ → 0, and hence u = 0.

Example 4: Let us consider the set of power monomials on interval [−1, 1],

\[ S = \left\{ 1, \ x, \ x^2 , \ x^3 , \cdots , x^n , \cdots \right\} . \]
Since the interval is symmetric about the origin, the orthogonal polynomials ψn(x), n ∈ ℕ, are either even or odd, because the even polynomials are automatically orthogonal to the odd polynomials.

The method starts with normalizing φ0(x) = 1 to yield ψ0(x) as

\[ \psi_0 (x) = C_0 \phi_0 (x) = C_0 . \]
The normalization constant C0 is determined from
\[ 1 = \int_{-1}^1 \left\vert \psi_0 (x) \right\vert^2 {\text d} x = \left\vert C_0 \right\vert^2 \int_{-1}^1 {\text d} x = 2 \left\vert C_0 \right\vert^2 . \]
Hence, C0 = 2−½ and we have
\[ \psi_0 (x) = \frac{1}{\sqrt{2}} . \]
The lowest order normalized polynomial ψ0(x) is an even function of x.

Proceeding to construct the next orthogonal polynomial from φ₁(x) = x yields

\begin{align*} \psi_1 (x) &= C_1 \left( \phi_1 (x) - \psi_0 \int_{-1}^1 {\text d} x\,\phi_1 (x)\,\psi_0 (x) \right) \\ &= C_1 \left( x - \frac{1}{2} \int_{-1}^1 {\text d} x\, x \right) \\ &= C_1 x \end{align*}
because the integral of the odd function x over a symmetric interval is zero. The normalization is found to be
\[ C_1 = \sqrt{\frac{3}{2}} \qquad \Longrightarrow \qquad \psi_1 (x) = \sqrt{\frac{3}{2}}\, x . \]
The next polynomial is constructed from φ₂(x) = x². We know that this function is an even function and is automatically orthogonal to all odd functions of x. Thus, we only need to orthogonalize it against ψ₀(x) = 1/√2:
\begin{align*} \psi_2 (x) &= C_2 \left( x^2 - \frac{1}{\sqrt{2}} \int_{-1}^1 {\text d} x\,x^2 \frac{1}{\sqrt{2}} \right) \\ &= C_2 \left( x^2 - \frac{1}{3} \right) \end{align*}
and C₂ is found as
\[ C_2 = \sqrt{\frac{45}{8}} \qquad \Longrightarrow \qquad \psi_2 (x) = \sqrt{\frac{45}{8}} \left( x^2 - \frac{1}{3} \right) . \]
The orthogonalization of the next polynomial is non-trivial. The process starts with φ₃(x) = x³, so one finds
\begin{align*} \psi_3 (x) &= C_3 \left( x^3 - x\,\frac{3}{2} \int_{-1}^1 {\text d} x\,x^4 \right) \\ &= C_3 \left( x^3 - x\,\frac{3}{5} \right) , \end{align*}
etc.

This set of polynomials, apart from multiplicative constants, is the same as the set of Legendre polynomials {Pn(x)}. However, the Legendre polynomials are normalized differently: they satisfy Pn(1) = 1 instead of having unit norm.

End of Example 4
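Mathematica's built-in Orthogonalize command reproduces the polynomials ψ₀, …, ψ₃ of Example 4 in one line and confirms that they are exactly the normalized Legendre polynomials:

psi = Orthogonalize[{1, x, x^2, x^3}, Integrate[#1 #2, {x, -1, 1}] &];
Simplify[psi - Table[Sqrt[(2 n + 1)/2] LegendreP[n, x], {n, 0, 3}]]
(* -> {0, 0, 0, 0} *)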

Theorem --- Riesz-Fischer: Let { un } be an orthonormal sequence in a Hilbert space ℌ. Let { cn } be a sequence of scalars. Then the partial sums
\[ s_n = \sum_{i=1}^n c_i u_i \]
converge strongly (in norm) as n → ∞ if and only if
\[ \sum_{i\ge 1} \left\vert c_i \right\vert^2 < \infty . \]
The theorem was proven independently in 1907 by Frigyes Riesz and Ernst Sigismund Fischer.
Look at the difference between sn and sm:
\[ \| s_n - s_m \|^2 = \left\| \sum_{k=m+1}^n c_k u_k \right\|^2 = \sum_{k=m+1}^n \left\vert c_k \right\vert^2 , \]
where we used the orthonormality of the {uk}. Thus, the partial sums of the series of | ck |² form a Cauchy sequence if and only if {sk} is a Cauchy sequence. Note that the completeness of ℌ is crucial.

Theorem 8: Let {uα : α ∈ A} be an orthonormal system in the Hilbert space ℌ. All the following conditions are equivalent:
  1. {uα : α ∈ A} is complete.
  2. For all x ∈ ℌ:    Σα∈A ⟨ x , uα ⟩ uα = x.
  3. Generalized Parseval identity. For all x, y ∈ ℌ:    ⟨ x , y ⟩ = Σα∈A ⟨ x , uα ⟩* ⟨ y , uα ⟩.
  4. Parseval identity. For all x ∈ ℌ:    ∥x∥² = Σα∈A |⟨ x , uα⟩|².
Suppose that (1) holds, i.e., the orthonormal system is complete. Given x ∈ ℌ, let {αn} be a sequence of indices that contains all the indices for which the Fourier coefficients of x do not vanish. For every index αn,
\[ \left\langle x - \sum_{k\ge 1} \langle x\, \vert \, u_{\alpha_k} \rangle \,u_{\alpha_k} , u_{\alpha_n} \right\rangle = 0 . \]
In fact, for all α∈A,
\[ \left\langle x - \sum_{k\ge 1} \langle x\, \vert \, u_{\alpha_k} \rangle \,u_{\alpha_k} , u_{\alpha} \right\rangle = 0 . \]
It follows that \( \displaystyle x - \sum_{k\ge 1} \langle x \,\vert\, u_{\alpha_k} \rangle\,u_{\alpha_k} \) is orthogonal to all the vectors {uα}; but since we assumed that the orthonormal system is complete, it follows that it is zero, i.e.,
\[ x = \sum_{k\ge 1} \langle x\,\vert\, u_{\alpha_k} \rangle\, u_{\alpha_k} , \]
and once again we may extend the sum over all α∈A.

Suppose that (2) holds:

\[ x = \sum_{\alpha \in A} \langle x\,\vert\, u_{\alpha} \rangle\, u_{\alpha} . \]
Given x, y ∈ ℌ, let {αn} be a sequence of indices that contains all the indices for which at least one of the Fourier components of either x or y does not vanish. By the continuity of the inner product:
\[ \langle x\,\vert\, y \rangle = \left\langle \sum_{k\ge 1} \langle x , u_{\alpha_k} \rangle \,u_{\alpha_k}, \sum_{j\ge 1} \langle y , u_{\alpha_j} \rangle \,u_{\alpha_j} \right\rangle = \sum_{k\ge 1} \langle x , u_{\alpha_k} \rangle^{\ast} \langle y , u_{\alpha_k} \rangle . \]

Suppose that (3) holds. Setting x = y, we obtain the Parseval identity.

Suppose that (4) holds. Let x ∈ ℌ be orthogonal to all the {uα}; then for all α ∈ A,

\[ \langle x \,\vert\, u_{\alpha} \rangle = 0 . \]
It follows from the Parseval identity that x = 0, i.e., the orthonormal system is complete.
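A quick numeric sanity check of the generalized Parseval identity (condition 3); the basis of normalized Legendre polynomials on [−1, 1] and the vectors x = eᵗ, y = cos t are our own choices.

basis = Table[Sqrt[(2 n + 1)/2] LegendreP[n, t], {n, 0, 12}];
ip[a_, b_] := NIntegrate[a b, {t, -1, 1}];       (* real inner product *)
{ip[Exp[t], Cos[t]], Total[ip[Exp[t], #] ip[Cos[t], #] & /@ basis]}
(* -> two nearly identical numbers (about 1.9336), as identity (3) predicts *)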

Example 5: We consider the Haar functions that constitute an orthonormal basis in Hilbert space 𝔏²([0,1]). The Haar wavelet is a sequence of rescaled "square-shaped" functions defined as follows:

\begin{align*} \phi_0 (t) &= 1 , \\ \phi_1 (t) &= 1_{[0,1/2)} - 1_{[1/2,1]} , \\ \phi_2 (t) &= \sqrt{2} \left( 1_{[0,1/4)} - 1_{[1/4,1/2)} \right) , \\ \phi_3 (t) &= \sqrt{2} \left( 1_{[1/2,3/4)} - 1_{[3/4,1]} \right) , \end{align*}
and in general,
\[ \phi_{2^n +k} (t) = 2^{n/2} \left( 1_{[2^{-n} k, 2^{-n}\left( k+1/2 \right) )} - 1_{[2^{-n} \left( k + 1/2 \right) , 2^{-n} \left( k+1 \right) )} \right) , \qquad n\in \mathbb{N}, \quad k =0,1, \ldots , 2^n -1. \]
The Haar sequence or wavelet was proposed in 1909 by the Hungarian--Jewish mathematician Alfréd Haar (1885--1933). The Haar wavelets form an orthonormal system in 𝔏²([0, 1]). Also, the span of all {ϕn} is the same as the span of all step functions with dyadic intervals. It is known that this span is dense in 𝔏²([0, 1]), hence the Haar functions form an orthonormal basis in 𝔏²([0, 1]).

It therefore follows that for every f ∈ 𝔏²([0, 1]):

\[ f = \sum_{n\ge 0} \langle f , \phi_n \rangle \,\phi_n . \]
The limit is in 𝔏²([0, 1]). The question is whether the sum also converges pointwise (almost everywhere). Such questions are usually quite hard. For the specific choice of the Haar basis, it is relatively easy, due to the "good" ordering of those functions.
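Below is a minimal Mathematica sketch of the Haar expansion (the target f(t) = t² and the truncation at 16 terms are our choices); the indexing follows the formula above with m = 2ⁿ + k.

haar[0][t_] := 1;
haar[m_][t_] := With[{n = Floor[Log2[m]]},
  With[{k = m - 2^n},
   2^(n/2) (Boole[2^-n k <= t < 2^-n (k + 1/2)] -
            Boole[2^-n (k + 1/2) <= t < 2^-n (k + 1)])]];
cs = Table[Integrate[t^2 haar[m][t], {t, 0, 1}], {m, 0, 15}];  (* <f, phi_m> *)
approx[t_] := Sum[cs[[m + 1]] haar[m][t], {m, 0, 15}];
Plot[{t^2, approx[t]}, {t, 0, 1}]   (* piecewise-constant L^2 approximation *)

End of Example 5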
Theorem 9: Every infinite orthonormal sequence in an inner-product space weakly converges to zero.
This is an immediate consequence of the Bessel inequality: if {un} is an orthonormal sequence, then for every x ∈ X
\[ \sum_{n\ge 1} | \langle x\,\vert\, u_n \rangle |^2 \leqslant \| x \|^2 , \]
from which follows that
\[ \lim_{n\to\infty} | \langle x\,\vert\, u_n \rangle | = 0 \]
for every x ∈ X. Hence, the sequence { un } converges weakly to zero, i.e., un ⇀ 0.

Obviously, an infinite orthonormal sequence does not (strongly) converge to zero as the corresponding sequence of norms is constant and equal to one, and the norm is continuous with respect to (strong) convergence.

Example 6: Let us consider the sequence in Hilbert space ℌ = 𝔏²([0, 2π]):

\[ u_n (x) = \frac{1}{\sqrt{2\pi}} \, e^{{\bf j}nx} , \qquad {\bf j}^2 = -1. \]
It is easy to check that they constitute an orthonormal system; thus,
\[ \lim_{n\to\infty} \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(x)\, e^{{\bf j}nx} {\text d}x = 0 , \qquad \forall f \in ℌ. \]
End of Example 6
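A quick numeric confirmation of Example 6 (the choice f(x) = x is ours): the inner products with un decay to zero even though ∥un∥ = 1 for every n.

inner[n_] := NIntegrate[x Exp[I n x], {x, 0, 2 Pi}]/Sqrt[2 Pi];
Abs[inner[#]] & /@ {1, 4, 16, 64}
(* -> roughly {2.5066, 0.6267, 0.1567, 0.0392}, decaying like 1/n *)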

Eigenfunction Expansions


This subsection establishes the main characteristic of self-adjoint operators: they have a set of orthogonal eigenfunctions. Recall that a self-adjoint operator T acting in a Hilbert space ℌ is an operator for which the following identity holds:
\[ \langle T\,x, y \rangle = \langle x, T\,y \rangle , \qquad \forall x, y \in D(T), \]
where D(T) is the domain of operator T, which is assumed to be dense in ℌ.
Theorem 10: If T is a self-adjoint operator, then its eigenvalues are real and eigenvectors of T corresponding to distinct eigenvalues are orthogonal.

Now we are going to take advantage of this property by considering the Sturm--Liouville differential operator

\begin{equation*} %\label{EqSturm.3} L\left[ x, \texttt{D} \right] = q(x)\,\texttt{I} - \texttt{D}\,p(x)\,\texttt{D} , \qquad \texttt{D} = \frac{\text d}{{\text d}x} , \quad \texttt{I} = \texttt{D}^0 , \end{equation*}
in the Hilbert space 𝔏²([𝑎, b], w). Here p(x), its derivative p′(x), and q(x), w(x) are given continuous functions on some interval [𝑎, b]. It is also assumed that the weight function w(x) is strictly positive on the closed interval [𝑎, b], but the function p(x) may vanish at the end points x = 𝑎 and/or x = b in the case of a singular problem. Accordingly, we consider the classical Sturm--Liouville problem that consists of the differential equation with a parameter λ:
\begin{equation} \label{EqSturm.1} L\left[ x, \texttt{D} \right] y = \lambda\,w\,y \qquad\mbox{or} \qquad \frac{\text d}{{\text d}x} \left[ p(x)\,\frac{{\text d}y}{{\text d}x} \right] - q(x)\, y + \lambda \,w (x)\,y(x) =0 , \qquad a < x < b , \end{equation}
subject to the homogeneous boundary conditions of the third kind
\begin{equation} \label{EqSturm.2} \alpha_0 y(a) - \alpha_1 y'(a) =0 , \qquad \beta_0 y(b ) + \beta_1 y' (b) =0 , \qquad |\alpha_0 | + |\alpha_1 | \ne 0 \quad\mbox{and} \quad |\beta_0 | + |\beta_1 | \ne 0. \end{equation}

First, using integration by parts, it is not hard to show that the Sturm--Liouville operator L is self-adjoint:

\[ \left\langle L \left[ x, \texttt{D} \right] u \,\vert \,v \right\rangle = \int_a^b \left( q(x)\,u(x) - \frac{{\text d}}{{\text d}x}\left[ p(x)\, \frac{{\text d}u}{{\text d}x} \right] \right) v(x)\,{\text d}x = \left\langle u\,\vert\, L \left[ x, \texttt{D} \right] v \right\rangle . \]
It turns out that the eigenfunctions corresponding to different eigenvalues are orthogonal with respect to the weight function
\begin{equation} \label{EqExpand.3a} \left\langle \phi_n , \phi_k \right\rangle = \int_a^b \phi_n (x)\,\phi_k (x)\,w(x)\,{\text d} x = 0 \qquad\mbox{if} \quad n\ne k . \end{equation}
Theorem 11: If a function f belongs to the domain D of the Sturm--Liouville operator L (this means that f possesses two continuous derivatives and satisfies the corresponding boundary conditions), then f can be expanded into a uniformly convergent series with respect to the eigenfunctions of the operator L:
\begin{equation} \label{EqExpand.6} f(x) = \sum_n c_n \phi_n (x) , \qquad c_n = \frac{\langle f(x), \phi_n (x) \rangle}{\| \phi_n \|_2^2} . \end{equation}
The expressions in Eq.\eqref{EqExpand.6} are
\[ \langle f , \phi_n \rangle = \int_a^b f(x)^{\ast}\,\phi_n (x)\,w(x)\,{\text d}x , \qquad \| \phi_n \|_2^2 = \int_a^b \left\vert \phi_n (x)\right\vert^2 w(x)\,{\text d}x , \]
where asterisk stands for complex conjugate.

In 1907, mathematicians Frigyes Riesz and Ernst Fischer independently published foundational results that establish completeness of the space 𝔏²([𝑎, b]).

However, most important is the ability to compute the eigenfunction decomposition (which is actually the spectral decomposition with respect to the differential operator L) of a wide class of functions. That is, for any function f ∈ 𝔏²([𝑎, b], w) satisfying the prescribed boundary conditions, we wish to synthesize it as

\begin{equation} \label{EqSturm.6} f(x) = \sum_k c_k \phi_k (x) , \end{equation}
where ϕk(x) are eigenfunctions. We wish to find out whether we can represent a function in this way, and if so, we wish to calculate ck (and of course we would want to know if the sum converges). Although this topic is considered in detail in a dedicated section, we outline the main ideas here.

Assuming that series \eqref{EqSturm.6} converges, we multiply it by w(x) ϕn(x) and integrate with respect to x from 𝑎 to b. This yields

\[ \langle f\,|\,\phi_n \rangle = \int_a^b \overline{f(x)} \phi_n (x)\,w (x)\, {\text d} x = \sum_k c_k \int_a^b \phi_k (x)^{\ast} \phi_n (x)\,w (x)\, {\text d} x = \sum_k c_k \langle \phi_k , \phi_n \rangle . \]
Using the orthogonal property of eigenfunctions
\begin{equation} \label{EqSturm.7} \langle \phi_k , \phi_n \rangle = \int_a^b \overline{\phi_k (x)} \phi_n (x)\,w (x)\, {\text d} x = \begin{cases} 0, & \ \mbox{ if} \quad n \ne k , \\ \| \phi_n (x) \|^2_2 , & \ \mbox{ for} \quad n=k ; \end{cases} \end{equation}
we obtain
\[ \langle f\,|\,\phi_n \rangle = c_n \| \phi_n (x) \|_2^2 \qquad \Longleftrightarrow \qquad \int_a^b f(x)^{\ast} \phi_n (x)\,w (x)\,{\text d}x = c_n \int_a^b \left\vert \phi_n (x) \right\vert^2 w (x)\, {\text d} x . \]
Therefore, we find
\begin{equation} \label{EqSturm.8} c_n = \frac{\langle f(x)\,|\,\phi_n (x) \rangle}{\| \phi_n (x) \|_2^2} . \end{equation}
and the Fourier series \eqref{EqSturm.6} becomes
\[ %\begin{equation} \label{EqSturm.9} f(x) = \sum_k \frac{\langle f(x)\,|\,\phi_k (x) \rangle}{\| \phi_k (x) \|_2^2} \,\phi_k (x) . \tag{9} \]
Upon introducing the orthonormal functions en = ϕn/∥ϕn∥, we rewrite Eq.(9) in compact form:
\[ f(x) = \sum_{k} \langle f\,|\,e_k \rangle \, e_k (x) . \]
In particular, we get the delta-function expansion
\[ \delta (x-t) = \sum_{k} e_k^{\ast} (t)\,w(t)\,e_k (x) , \]
where w(t) is the weight function. Of course, these "equations" are understood in a weak sense, as limits of the corresponding partial sums.

These Fourier coefficients \eqref{EqSturm.8} satisfy the so-called Bessel inequality:

\[ \sum_{k\ge 1} \left\vert c_k \right\vert^2 \| \phi_k \|^2 \le \left\langle f , f \right\rangle . \]
A set of orthogonal functions { φk } in 𝔏² is called complete in the closed interval [𝑎, b] whenever the vanishing of the inner products of a member of 𝔏²([𝑎, b], w) with all of the orthogonal functions implies that this member is equal to zero almost everywhere in the domain.
The term complete was introduced in 1910 by the famous Russian mathematician Vladimir Steklov (1864--1926). A set of functions { φk } in 𝔏² is complete if and only if the Parseval identity holds:
\[ %\begin{equation} \label{EqSturm.10} \sum_{k\ge 1} \left\vert c_k \right\vert^2 \| \phi_k \|^2 = \left\langle f, f \right\rangle = \int_a^b \left\vert f(x) \right\vert^2 w(x)\,{\text d}x = \| f(x) \|^2_2 . \tag{10} \]
The identity above was stated by the famous French mathematician Marc-Antoine Parseval (1755--1836) in 1799.

The main reason to study Sturm--Liouville problems is that their eigenfunctions provide a basis for expansion of a certain class of functions. This tells us that basis functions have close relationships with linear operators, and this basis is orthogonal for self-adjoint second order differential operators. In other words, solutions of differential equations can be approximated more efficiently by means of better basis functions. For example, a periodic function can be approximated more efficiently by periodic basis functions (Fourier series) than by polynomials (Taylor series).
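To close the circle, here is a compact Mathematica sketch (our own worked example) of the simplest Sturm--Liouville problem: −y′′ = λy, y(0) = y(π) = 0, i.e., p = w = 1, q = 0, and α₁ = β₁ = 0 in \eqref{EqSturm.2}. Its eigenfunctions sin(nx) give the Fourier sine expansion of the assumed target f(x) = x.

c[n_] := Integrate[x Sin[n x], {x, 0, Pi}]/(Pi/2);   (* c_n = <f, phi_n>/||phi_n||^2 *)
Simplify[c[n], n \[Element] Integers]                (* -> 2 (-1)^(1 + n)/n *)
partial[x_, m_] := Sum[c[n] Sin[n x], {n, 1, m}];
Plot[{x, partial[x, 5], partial[x, 25]}, {x, 0, Pi}]
(* the partial sums converge to x inside the interval; at x = Pi they stay at 0 *)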

  1. Let ℌ be a Hilbert space and let PA and PB be orthogonal projections on closed subspaces A and B.
    1. Show that if PAPB is an orthogonal projection, then it projects on A ∩ B.
    2. Show that PAPB is an orthogonal projection if and only if    PAPB = PBPA.
    3. Show that if PAPB is an orthogonal projection, then   PA + PBPAPB   is an orthogonal projection on A + B.
    4. Find an example in which    PAPBPBPA.
  2. Let M be the set of even (almost everywhere) functions from 𝔏²(−∞, ∞), i.e.,
    \[ M = \left\{ f \in 𝔏²\, : \, f(x) =f(-x) \right\} . \]
    1. Show that M is a closed subspace.
    2. Find M⊥.
  3. What is the orthogonal complement of the following sets of Hilbert space 𝔏²[0, 1]?
    1. The set of polynomials.
    2. The set of polynomials in x².
    3. The set of polynomials with 𝑎0 = 0.
    4. The set of polynomials with coefficients summing up to zero.
  4. Let M = {x = (x1, ... , xn ) ∈ ℝn ∶ x1 + x2 + ⋯ + xn = 1}. Show that M is closed and convex, and find the element in M closest to the origin.
  5. Let C be a closed convex subset of a Hilbert space ℌ, let x ∈ ℌ − C, and let y be the closest element of C to x. Prove that, for every z ∈ C, Re⟨x − y, z − y⟩ ≤ 0.

 
