A typical Sturm--Liouville problem for a self-adjoint differential operator of second order (this operator is referred to as the Sturm--Liouville operator)
Here p(x), q(x), and w(x) are continuous functions on an interval (𝑎, b), and p(x) is also assumed to have a continuous derivative on this interval. Boundary conditions of the third kind \eqref{EqOrtho.2} are chosen for illustration and could be of a different type (for example, Fourier series expansion is based on periodic boundary conditions). What is important is that the operator generated by the Sturm--Liouville equation \eqref{EqOrtho.1} and the boundary conditions \eqref{EqOrtho.2} should be self-adjoint. So any boundary conditions that support the Lagrange identity (which guarantees that the operator is self-adjoint) can be used instead of \eqref{EqOrtho.2}.
The positive function w(x) is called the weight or density function.
The problem of finding a (real or complex) number λ such that the boundary value problem (BVP for short) \eqref{EqOrtho.1}, \eqref{EqOrtho.2} has a non-trivial solution is called the Sturm--Liouville problem. Such values of the parameter λ are called eigenvalues, and the corresponding not identically zero solutions are called eigenfunctions. When all parameters of the problem are nonnegative/positive, the Sturm--Liouville problem has nonnegative/positive eigenvalues. In this case, the (unbounded) linear differential operator
\( L\left[ x, \texttt{D} \right] \) is called nonnegative/positive.
We assume that the conditions on the parameters of the problem \eqref{EqOrtho.1}, \eqref{EqOrtho.2} are fulfilled so as to guarantee the existence of discrete eigenvalues 0 ≤ λ_{0} < λ_{1} < λ_{2} < ··· , to which correspond eigenfunctions ϕ_{0}(x), ϕ_{1}(x), ϕ_{2}(x), …. With these in hand, we consider the problem of approximating a given function f(x), 𝑎 ≤ x ≤ b, by a linear combination of the eigenfunctions
Of course, it is always assumed that the function f(x) in expansion \eqref{EqOrtho.3} satisfies the homogeneous boundary conditions \eqref{EqOrtho.2}---otherwise convergence at endpoints fails.
The word "approximate" can have a variety of meanings. Normally, we want to choose the coefficients c_{n} in such a way that the value of the partial sum
is near that of f(x) for each value of x in the interval [𝑎, b]. In this case we speak of pointwise approximation, so
\[
\lim_{N\to\infty} S_N (f; x) = f(x) .
\]
Pointwise convergence is usually needed in order to compute the value of a solution by the method of separation of variables (see next part).
However, it is much easier to find coefficients c_{n} in such a way that S_{N}(f; x) only approximates f(x) in the sense of least squares or, more briefly, in the mean.
When the separation of variables method (see next chapter) is applied to partial differential equations, it leads to a corresponding Sturm--Liouville problem. However, this is only one part of constructing the solution to the (initial) boundary value problem. The second important part is the representation of a function as a series over eigenfunctions. In many problems for second order partial differential equations, the set of eigenfunctions is orthogonal and the coefficients in Eq.\eqref{EqOrtho.3} can be calculated explicitly. In the majority of equations of order higher than two, eigenfunctions are either not orthogonal or not complete and, therefore, the separation of variables approach is not successful. Nevertheless, we are going to treat a wide class of partial differential equations of the second order for which separation of variables reduces the problems under consideration to Sturm--Liouville problems having a complete orthogonal set of eigenfunctions.
Let us consider a set X of real- or complex-valued functions on a finite interval [𝑎, b]. For positive weight function w and two arbitrary functions f and g from X, we can define an inner product by
\begin{equation} \label{EqOrtho.5}
\langle f\,\vert\, g \rangle =
\langle f\,, \, g \rangle = \int_a^b f(x)^{\ast} g(x)\,w(x)\,{\text d} x = \int_a^b \overline{f(x)}\, g(x)\,w(x)\,{\text d} x ,
\end{equation}
where the asterisk or overline denotes the complex conjugate operation. Note that there are two common ways to denote the inner product: with a vertical bar (in physics) or with a comma (in mathematics). So if \( f(x) = u(x) + {\bf j} v(x) \) is a complex-valued function, then \( \overline{f(x)} = u(x) - {\bf j} v(x) \) is its complex conjugate. With the inner product so introduced and the corresponding norm, we can treat the functions in 𝔏² as if they were vectors, establishing a clear analogy with Euclidean space.
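As a minimal numerical sketch (the function names and the midpoint-rule discretization are my own choices, not part of the text), the inner product \eqref{EqOrtho.5} and the norm it induces can be approximated for real-valued functions as follows:

```python
import math

def inner_product(f, g, a, b, w=lambda x: 1.0, n=2000):
    # Composite midpoint rule for <f, g> = integral_a^b f(x) g(x) w(x) dx
    # (real-valued f and g; a complex-valued f would be conjugated first).
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) * w(a + (k + 0.5) * h)
                   for k in range(n))

def norm(f, a, b, w=lambda x: 1.0):
    # Norm generated by the inner product: ||f|| = sqrt(<f, f>)
    return math.sqrt(inner_product(f, f, a, b, w))

# sin and cos are orthogonal on [-pi, pi] with weight w = 1:
print(inner_product(math.sin, math.cos, -math.pi, math.pi))  # ≈ 0
print(norm(math.sin, -math.pi, math.pi))                     # ≈ sqrt(pi)
```

With a positive weight w other than 1, the same code computes the weighted inner product used throughout this section.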
Once an inner product is given, it generates a norm:
If f(x) is a function defined on an interval (finite or not) |𝑎, b|, we define the norm of f to be
The set of all functions on the interval |𝑎, b| having finite norm \eqref{EqOrtho.6} is denoted by 𝔏²(|𝑎, b|, w) or simply 𝔏².
The complete set of functions with respect to the norm above is denoted by
𝔏²([𝑎, b], w), L²([𝑎, b], w), or simply 𝔏²([𝑎, b]) when the weight function w is known.
Recall that the integral of a function over an interval [𝑎, b] does not depend on the values of the function at a finite number of points because they do not affect the integral. Therefore, when one integrates a function over [𝑎, b], it does not matter what the values of the function at the end points are, or even whether the function is defined at these points. When we don't care about the end points, we use a special and convenient notation: |𝑎, b|, which represents one of the following cases:
Two functions f(x) and g(x) from 𝔏² defined on some interval [𝑎, b] are called orthogonal, if the integral of the product of functions over the interval is zero:
This property of functions being orthogonal is abbreviated as f ⊥ g.
Generally speaking, 𝔏²([𝑎, b], w) is the completion of this space, but we will not go into that here. To guarantee completeness of 𝔏²([𝑎, b], w), the integration in the definition of the norm ∥·∥ must be understood in the Lebesgue sense rather than the Riemann sense.
Actually, the set of functions 𝔏² is a Hilbert space and its norm (a measure of the size of the function) is generated by the inner product \eqref{EqOrtho.5}.
𝔏²(|𝑎, b|, w) is an example of an infinite-dimensional linear vector space, whose elements cannot be built up from linear combinations of a finite number of basis functions. For example, Eq.\eqref{EqOrtho.3} is an infinite sum; when does it converge, and does a convergent sum always give the right answer? Not every infinite collection of orthogonal functions is rich enough to expand every function in 𝔏²(|𝑎, b|, w); this is the question of completeness. How do we generate suitable sets of orthogonal functions? How do we get the coefficients c_{n} when we know the function f and the basis functions { ϕ_{n}(x) }? We address these questions in turn.
We answer the last question for an arbitrary set of functions { ϕ_{n}(x) }, requiring only the orthogonality property.
A collection of functions { ϕ_{0}(x), ϕ_{1}(x), … , ϕ_{n}(x), … } defined on |𝑎, b| is called orthogonal on |𝑎, b| if
approaches zero as N → ∞. The smallness of this integral does not imply that S_{N}(f; x) is close to f(x) for all x. It does mean that S_{N}(f; x) is close to f(x) except for a set of intervals whose total length is small. If
we say that the sequence S_{N}(f; x) converges in the mean to f(x). This integral is called the mean square deviation of S_{N}(f; x) from f(x).
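The following sketch illustrates convergence in the mean on a concrete example of my choosing: f(x) = x on (−π, π), whose Fourier sine coefficients are the classical b_n = 2(−1)^{n+1}/n. The mean square deviation shrinks as N grows, even though the partial sums never match f at the endpoints.

```python
import math

def b(n):
    # Fourier sine coefficients of f(x) = x on (-pi, pi): b_n = 2 (-1)^(n+1) / n
    return 2.0 * (-1) ** (n + 1) / n

def mean_square_deviation(N, m=2000):
    # integral_{-pi}^{pi} (f(x) - S_N(f; x))^2 dx via the composite midpoint rule
    h = 2 * math.pi / m
    total = 0.0
    for k in range(m):
        x = -math.pi + (k + 0.5) * h
        s = sum(b(n) * math.sin(n * x) for n in range(1, N + 1))
        total += (x - s) ** 2
    return total * h

for N in (2, 8, 32):
    print(N, mean_square_deviation(N))  # deviation decreases as N grows
```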
We consider the problem of approximating f(x) in the sense of least squares for a restricted class of eigenfunctions { ϕ_{n}(x) }_{n≥0}. In particular, we assume that this class of eigenfunctions is complete in 𝔏²([0, ℓ], ρ) and that the eigenfunctions corresponding to distinct eigenvalues are orthogonal on the interval (0, ℓ) with respect to the positive weight function ρ(x):
Example 1:
We consider a set of Zernike polynomials that are orthogonal on the unit disk. They are named after optical physicist Frits Zernike (1888--1966), winner of the 1953 Nobel Prize in Physics and the inventor of phase-contrast microscopy. These polynomials play important roles in various optics branches such as beam optics and imaging.
There are even and odd Zernike polynomials. The odd Zernike polynomials are defined as \( Z_n^{-m} (\rho , \theta ) = R_n^m (\rho )\,\sin (m\theta ) \) (odd functions of the azimuthal angle θ), where m and n are nonnegative integers with n ≥ m ≥ 0, and the even Zernike polynomials are defined as \( Z_n^{m} (\rho , \theta ) = R_n^m (\rho )\,\cos (m\theta ) , \) where \( R_n^m \) denotes the radial polynomial.
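A minimal sketch (the helper names are mine; the standard formula for the radial part R_n^m is assumed) that checks the orthogonality of the radial polynomials, which for fixed m reads \( \int_0^1 R_n^m (\rho )\, R_{n'}^{m} (\rho )\,\rho\,{\text d}\rho = \delta_{n n'} / (2n+2) \):

```python
import math

def zernike_radial(n, m, rho):
    # Radial Zernike polynomial R_n^m(rho) for nonnegative integers n >= m >= 0;
    # R_n^m is identically zero when n - m is odd.
    if (n - m) % 2:
        return 0.0
    return sum((-1) ** k * math.factorial(n - k)
               / (math.factorial(k)
                  * math.factorial((n + m) // 2 - k)
                  * math.factorial((n - m) // 2 - k))
               * rho ** (n - 2 * k)
               for k in range((n - m) // 2 + 1))

def radial_inner(n1, n2, m, steps=4000):
    # Midpoint-rule approximation of integral_0^1 R_{n1}^m R_{n2}^m rho d(rho)
    h = 1.0 / steps
    return h * sum(zernike_radial(n1, m, (k + 0.5) * h)
                   * zernike_radial(n2, m, (k + 0.5) * h)
                   * (k + 0.5) * h
                   for k in range(steps))

print(radial_inner(2, 4, 0))  # ≈ 0   (orthogonality of R_2^0 and R_4^0)
print(radial_inner(2, 2, 0))  # ≈ 1/6 = 1/(2n+2) for n = 2
```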
Coefficients c_{k} occur only in the first sum. Since this is a sum of squares, it is clearly minimized by making all its terms zero. In other words, we are forced to set
Therefore, this choice of the coefficients minimizes the error \( \| f(x) - \sum_{k=0}^N c_k \phi_k (x) \|^2 . \) We have shown that formula \eqref{EqOrtho.7} provides the best approximation to f(x) in the sense of least squares. This choice of coefficients is independent of N, the number of terms. This conclusion follows from the orthogonality of the eigenfunctions. The coefficients c_{k} given by \eqref{EqOrtho.7} are called the Fourier coefficients of f(x) with respect to the orthogonal functions { ϕ_{k}(x) }. The series \( \displaystyle f(x) \sim \sum_{k\ge 0} c_k \phi_k (x) \) is called the
Fourier series of f(x). Since this series may or may not converge (its meaning will be discussed separately), we cannot claim that
\( \displaystyle f(x) = \sum_{k\ge 0} c_k \phi_k (x) . \) It is common to use the tilde notation:
\[
f(x) \sim \sum_{k\ge 0} c_k \phi_k (x)
\]
simply to denote the fact that c_{k} are Fourier coefficients defined by Eq.\eqref{EqOrtho.7}, and the Fourier series gives a procedure to recover the function f(x) from its Fourier coefficients.
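The least-squares optimality of the Fourier coefficients can be checked numerically; this sketch (my choice of example: f(x) = x on (−π, π) with its classical sine coefficients b_n = 2(−1)^{n+1}/n) shows that perturbing any one coefficient increases the mean-square error by exactly (perturbation)² · ∥ϕ_k∥²:

```python
import math

def err2(coeffs, m=4000):
    # ||f - sum_n c_n sin(n x)||^2 on (-pi, pi) for f(x) = x, midpoint rule
    h = 2 * math.pi / m
    total = 0.0
    for k in range(m):
        x = -math.pi + (k + 0.5) * h
        s = sum(c * math.sin((n + 1) * x) for n, c in enumerate(coeffs))
        total += (x - s) ** 2
    return total * h

# Fourier coefficients b_n = 2 (-1)^(n+1) / n of f(x) = x, for n = 1..5
fourier = [2.0 * (-1) ** (n + 1) / n for n in range(1, 6)]

# Perturb b_1 by 0.1: the error grows by 0.1^2 * ||sin(x)||^2 = 0.01 * pi,
# because the cross term vanishes by orthogonality.
perturbed = fourier[:]
perturbed[0] += 0.1

print(err2(fourier), err2(perturbed))  # the Fourier choice gives the smaller error
```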
Example 2:
End of Example 2
Note: Formula \eqref{EqOrtho.7} for the Fourier coefficients can be derived directly if the series \( \displaystyle \sum_{k\ge 0} c_k \phi_k (x) \) converges to f(x) uniformly. If this were true, we could multiply the series by ϕ_{n}(x) ρ(x) and integrate from 0 to ℓ term by term. Because of orthogonality, all the integrals are zero except for k = n. Hence
The partial sums corresponding to the left-hand side are nondecreasing in N and are bounded from above by the norm ∥f(x)∥². Therefore, the series converges if the norm ∥f(x)∥² is finite, that is, if f(x) ∈ 𝔏². We may allow f(x) or ρ(x) to be discontinuous or even unbounded at some points as long as this integral (possibly defined as an improper integral) is finite.
Example 3:
Consider the collection of cosine functions { 1, cos(x), cos(2x), cos(3x), … } on an interval of length 2π, which we know from the previous sections is orthogonal. We have \( \int_0^{2\pi} \cos (mx)\,\cos (nx)\, {\text d}x = 0 \) for m ≠ n, while \( \| 1 \|^2 = 2\pi \) and \( \| \cos (kx) \|^2 = \pi \) for k ≥ 1.
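These orthogonality relations and norms can be verified numerically (a minimal sketch; the helper name `ip` and the midpoint-rule discretization are mine):

```python
import math

def ip(m, n, steps=2000):
    # <cos(m x), cos(n x)> on [0, 2 pi] with weight w = 1, midpoint rule
    h = 2 * math.pi / steps
    return h * sum(math.cos(m * (k + 0.5) * h) * math.cos(n * (k + 0.5) * h)
                   for k in range(steps))

print(ip(1, 2))  # ≈ 0      (orthogonality for m != n)
print(ip(0, 0))  # ≈ 2*pi   (the squared norm of the constant function 1)
print(ip(3, 3))  # ≈ pi     (the squared norm of cos(3x))
```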
An orthogonal system of functions { ϕ_{k}(x) }_{k≥0} defined on an interval (finite or not) |𝑎, b| is said to be complete relative to a class of functions if there is no nontrivial function in the class that is orthogonal to every member of the collection. In other words,
\( \langle f, \phi_k \rangle =0 \) for all k ≥ 0 implies that f(x) ≡ 0 on |𝑎, b| or, more precisely, that \( \displaystyle \| f \|^2_2 = \int_a^b \left\vert f (x) \right\vert^2 \rho (x)\, {\text d}x =0 . \)
Example 4:
The collection of cosine functions { 1, cos(x), cos(2x), cos(3x), … } on an interval of length 2π considered previously is orthogonal but not complete. Indeed, if f is any odd function (for instance, x or sin x), then \( \displaystyle f(-x) = -f(x) . \) Calculations show that \( \langle f, \cos (kx) \rangle =0 \) because the product of an odd function and an even function is an odd function. ■
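The incompleteness can be seen numerically as well (a sketch with my own helper names): sin(x) has positive norm, yet it is orthogonal to the constant 1 and to every cos(kx) on an interval of length 2π.

```python
import math

def ip(f, g, steps=2000):
    # <f, g> = integral over [-pi, pi] with weight 1, composite midpoint rule
    h = 2 * math.pi / steps
    return h * sum(f(-math.pi + (k + 0.5) * h) * g(-math.pi + (k + 0.5) * h)
                   for k in range(steps))

# sin(x) is not the zero function (||sin||^2 = pi), yet it is orthogonal to
# 1 and to every cos(kx); hence the cosine system cannot be complete:
for k in range(5):
    print(k, ip(math.sin, lambda x, k=k: math.cos(k * x)))  # all ≈ 0
print(ip(math.sin, math.sin))  # ≈ pi > 0
```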
Theorem 1:
If \( \displaystyle \left\{ \phi_k (x) \right\}_{k\ge 0} \) is a complete orthogonal system of functions on the interval |𝑎, b|, then every (piecewise continuous or, more generally, square integrable) function f(x) on |𝑎, b| has the expansion
The orthogonal expansion \( \displaystyle f(x) \sim \sum_{n\ge 0} c_n \phi_n (x) \) for a complete orthogonal system on [𝑎, b] holds in the 𝔏² sense, but not necessarily pointwise; i.e., for a fixed x ∈ [𝑎, b] the series on the right-hand side
might not necessarily converge and, even if it does, it might not converge to f(x).
Theorem 2:
Let ϕ_{0}, ϕ_{1}, ϕ_{2}, …, ϕ_{n}, … be the eigenfunctions of the self-adjoint Sturm--Liouville problem \eqref{EqOrtho.1},\eqref{EqOrtho.2}
Let f and its derivative \( f' \) be piecewise continuous on the closed interval \( 0 \le x \le \ell . \) Then the series
\[
f(x) = \sum_{n\ge 1} c_n \phi_n (x)
\]
whose coefficients c_{n} are given by Eq.\eqref{EqOrtho.7}
converges to \( \displaystyle \frac{1}{2} \left[ f(x+0) + f(x-0) \right] \) at each point in the open interval (0,ℓ).
Example 6:
There are other known orthogonal and complete sets of functions that are used in areas other than differential equations. In particular, the Li--Torney system of step functions is very useful in computer science. The collection of Walsh functions forms a complete orthogonal set of functions that can be used to represent any discrete function---just as trigonometric functions can be used to represent any continuous function in Fourier analysis. These functions, as well as the Walsh--Hadamard code, are named after the American mathematician Joseph L. Walsh (1895--1973). Applications of the Walsh functions can be found wherever digit representations are used, including speech recognition, medical and biological image processing, and digital holography. The Haar wavelet is a sequence of rescaled "square-shaped" functions which together form a wavelet family or basis. Wavelet analysis is similar to Fourier analysis. The Haar sequence was proposed in 1909 by the Hungarian mathematician Alfréd Haar (1885--1933).
The Faber–Schauder system is a Schauder basis for the space C([0, 1]) of continuous functions on [0, 1]. The Franklin system is obtained from the Faber--Schauder system by the Gram–Schmidt orthonormalization procedure.
■
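As a sketch of such a system of step functions (the helper names and the sampling grid are mine), the Haar wavelets ψ_{j,k}(t) = 2^{j/2} ψ(2^{j} t − k) form an orthonormal family on [0, 1); since they are piecewise constant on dyadic intervals, a uniform dyadic sampling computes their inner products exactly:

```python
def haar(j, k, t):
    # Haar wavelet psi_{j,k}(t) = 2^(j/2) * psi(2^j t - k) on [0, 1), where
    # the mother wavelet psi is 1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere.
    u = (2 ** j) * t - k
    if 0 <= u < 0.5:
        return 2.0 ** (j / 2)
    if 0.5 <= u < 1:
        return -(2.0 ** (j / 2))
    return 0.0

def ip(f, g, steps=1024):
    # <f, g> on [0, 1); midpoint sampling is exact for these step functions
    # because they are constant on each dyadic subinterval of length 1/steps.
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(steps))

# The family is orthonormal: <psi_{j,k}, psi_{j',k'}> = delta_{jj'} delta_{kk'}
print(ip(lambda t: haar(1, 0, t), lambda t: haar(1, 0, t)))  # ≈ 1
print(ip(lambda t: haar(0, 0, t), lambda t: haar(1, 0, t)))  # ≈ 0
print(ip(lambda t: haar(2, 3, t), lambda t: haar(0, 0, t)))  # ≈ 0
```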
Theorem (Parseval):
Let \( \displaystyle \left\{ \phi_k (x) \right\}_{k\ge 0} \) be a complete orthogonal system of functions on the interval |𝑎, b|, and f ∈ 𝔏²([𝑎, b], ρ). Then
This theorem originated in the 1799 work of the French mathematician
Marc-Antoine Parseval (1755--1836). It is also known as Rayleigh's energy theorem, or Rayleigh's identity, because the integral \( \displaystyle E[f] = \frac{1}{\ell} \int_{-\ell}^{\ell} \left\vert f (x) \right\vert^2 {\text d}x = \| f \|^2 / \ell \) usually represents the energy in applications. Therefore, the Parseval theorem can be interpreted as saying that the energy in f equals the energy in its Fourier coefficients. The function space formed by all functions with finite energy (that is, square integrable functions; it is usually denoted by L²) forms the so-called Hilbert space.
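Parseval's identity can be checked on a concrete example of my choosing: for f(x) = x on (−π, π), expanded over the orthogonal system {sin(nx)} with b_n = 2(−1)^{n+1}/n and ∥sin(nx)∥² = π, the identity reads 2π³/3 = 4π Σ 1/n².

```python
import math

# ||f||^2 = integral_{-pi}^{pi} x^2 dx = 2*pi^3/3 must equal
# sum_n b_n^2 ||sin(nx)||^2 = sum_n (2/n)^2 * pi = 4*pi * sum_n 1/n^2.
f_norm2 = 2 * math.pi ** 3 / 3
partial = sum((2.0 / n) ** 2 * math.pi for n in range(1, 100001))
print(f_norm2, partial)  # the partial sums approach ||f||^2 from below
```

This is also the classical route to Euler's value Σ 1/n² = π²/6.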
This problem occurs, for instance, in the heat conduction problem in a bar or rod of unit length. The boundary condition at x = 0 corresponds to a rate of heat flow that is proportional to the temperature, and units are chosen so that the constant of proportionality is 1. The boundary condition at x = 1 corresponds to a perfect insulation there.
The solution of the differential equation \( y'' + \lambda\,y =0 \) may have one of three forms, depending on λ, so it is necessary to consider these cases separately. Let us start with λ = 0, when the general solution is a linear function
\[
y = a+ b\,x ,
\]
with some constants a and b. The boundary conditions require that
\[
a-b =0 \qquad\mbox{and}\qquad b=0,
\]
respectively. The only solution is a = b = 0; hence, the boundary value problem has only the trivial solution, so λ = 0 is not an eigenvalue.
If λ is negative, we can let λ = -μ² so that μ > 0. Then the differential equation for y becomes
of the Hilbert space 𝔏²([𝑎, b], ρ) of square integrable functions into the set ℓ² of all sequences (infinite vectors) such that \( \displaystyle \sum_{k\ge 0} c_k^2 \| \phi_k (x) \|_2^2 \) is finite. Then infinite series
provides the inverse transform ℱ^{−1} that assigns to every infinite vector from ℓ² a function. Restoration of a function from its sequence of Fourier coefficients is an ill-posed problem. Therefore, several regularizations are known for defining the infinite series \eqref{EqOrtho.3}.
It is most natural to consider pointwise convergence, where we examine the discrepancy at every point. It is difficult (maybe impossible) to provide general results about pointwise convergence for general orthogonal functions, but for Fourier series a lot has been done. We have to ask that f(x) be a lot smoother than merely belonging to 𝔏², which is a "big" space of functions. You might think that having f(x) continuous would be enough, but that does not work, as the Gibbs phenomenon shows.
Here is a slightly different statement of the conditions copied from Keener:
Suppose f(x) has a continuous first derivative on the interval, except possibly at a finite number of points at which f(x) has a jump and the left and right derivatives exist. Then the Fourier series of f converges to ½(f(x+) + f(x−)) at every point of the open interval (0, ℓ). At x = 0 and x = ℓ, the series converges to ½(f(0+) + f(ℓ−)).
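A minimal numerical illustration (the square wave and its classical Fourier sine series are my choice of example): at a jump the partial sums sit at the average of the one-sided limits, while at a point of continuity they approach the value of the function.

```python
import math

def S(N, x):
    # Partial Fourier sum of the square wave f = sign(x) on (-pi, pi):
    # f ~ (4/pi) * sum over odd n <= N of sin(n x)/n
    return (4 / math.pi) * sum(math.sin(n * x) / n for n in range(1, N + 1, 2))

print(S(501, math.pi / 2))  # tends to f(pi/2) = 1 at a point of continuity
print(S(501, 0.0))          # exactly 0 = (f(0+) + f(0-))/2 at the jump
```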
The most remarkable result regarding pointwise convergence almost everywhere was obtained by Hans Rademacher (1922) and Dmitrii Menshov (1923).
Rademacher–Menshov Theorem:
If the coefficients c_{n} of a Fourier series
\( \displaystyle \sum_n c_n \phi_n (x) \) with respect to an orthogonal system { ϕ_{n}(x) } on an interval satisfy \( \displaystyle \sum_n |c_n|^2 \| \phi_n \|^2 \log^2 n < \infty , \)
then the Fourier series converges almost everywhere.
Another form of convergence is uniform convergence. This used to be the gold standard of convergence. For continuous functions, you can measure the maximum difference between the series and the function:
The maximum discrepancy is another kind of norm, called the uniform norm (also called the sup norm, for supremum, which is the mathematical way of defining a maximum for discontinuous functions). Pointwise convergence does not imply uniform convergence! The Gibbs phenomenon, which is discussed later, is the most famous example of this. But uniform convergence does imply pointwise convergence.
There are significant differences between the behavior of Fourier- and power-series expansions. A power series is essentially an expansion about a point, using only information from that point about the function to be expanded (including, of course, the values of its derivatives). We already know that such expansions only converge within a radius of convergence defined by the position of the nearest singularity. However, a Fourier series (or any expansion in orthogonal functions) uses information from the entire expansion interval, and therefore can describe functions that have “nonpathological” singularities within that interval. However, we also know that the representation of a function by an orthogonal expansion is only guaranteed to converge in the mean. This feature comes into play for the expansion of functions with discontinuities, where there is no unique value to which the expansion must converge. However, for Fourier series, it can be shown that if a function f(x) satisfying the Dirichlet conditions is discontinuous at a point x_{0}, its Fourier series evaluated at that point will be the arithmetic average of the limits of the left and right approaches.
where E[f] is known as the energy of the 2ℓ-periodic function f ∈ L²[-ℓ,ℓ].
Li, H. and Torney, D.C., A complete system of orthogonal step functions, Proceedings of the American Mathematical Society, 2004, Vol. 132, No. 12, pp. 3491--3502.
Menchoff, D., Sur les séries de fonctions orthogonales (Première Partie: La convergence), Fundamenta Mathematicae, 1923, Vol. 4, pp. 82--105; doi: 10.4064/fm-4-1-82-105.
Rademacher, H., Einige Sätze über Reihen von allgemeinen Orthogonalfunktionen, Mathematische Annalen, 1922, Vol. 87, pp. 112--138; doi: 10.1007/BF01458040.
Titchmarsh, E.C., Eigenfunction Expansions Associated with Second-order Differential Equations, Part I, 1946; second edition, 1962.
Titchmarsh, E.C., Eigenfunction Expansions Associated with Second-order Differential Equations, Part II, 1958.
Walsh, J.L., A closed set of normal orthogonal functions, American Journal of Mathematics, 1923, Vol. 45, No. 1, pp. 5--24.