MATLAB TUTORIAL for the Second Course. Part 2.5: Iterative Methods

Iterative Methods

An iterative method for solving the linear system A x = b starts with an initial approximation p0 to the solution x and generates a sequence of vectors \( \left\{ {\bf p}_k \right\}_{k\ge 0} \) that converges to x. Iterative methods convert the system A x = b into an equivalent system of the form x = B x + w, for some fixed matrix B and vector w. After the initial vector p0 is selected, the sequence of approximate solutions is generated by computing \( {\bf x}^{(k+1)} = {\bf B}\,{\bf x}^{(k)} + {\bf w} , \) for each \( k =0,1,2,\ldots . \)

For large systems containing thousands of equations, iterative methods often have decisive advantages over direct methods in terms of speed and demands on computer memory. Sometimes, if the accuracy requirements are not stringent, a modest number of iterations will suffice to produce an acceptable solution. Iterative methods are also often very efficient for sparse problems, in which the nonzero elements of A can be kept in a sparse-storage format. In other cases it is not necessary to store A at all: in problems involving the numerical solution of partial differential equations, for instance, each row of A might be generated as needed and not retained after use. Another important advantage of iterative methods is that they are usually stable: they actually dampen errors, due to roundoff or minor blunders, as the process continues.

We consider some iterative methods for solving linear and nonlinear vector equations. We start with an extension of fixed-point iteration that applies to systems of linear equations.

Jacobi Iteration
We start with an example that clarifies the method.

Example. Consider the system of equations

\[ \begin{split} 4x-2y+z&=9, \\ 2x+8y-3z&=19, \\ x+2y -5z&=2. \end{split} \]
These equations can be written in the form
\[ \begin{split} x&= (9+2y-z)/4 , \\ y&=(19-2x+3z)/8 , \\ z&=(x+2y-2)/5. \end{split} \]
This suggests the following Jacobi iterative process
\[ \begin{split} x_{k+1}&= (9+2y_k -z_k )/4 , \\ y_{k+1}&=(19-2x_k +3z_k )/8 , \\ z_{k+1}&=(x_k +2y_k -2)/5. \end{split} \]
Starting with some initial values, say (2,1,0), we fill out the table
 k      x_k       y_k       z_k
 1    2.75      1.875     0.4
 2    3.0875    1.8375    0.9
 3    2.94375   1.94062   0.9525
 4    2.98219   1.99625   0.965
 5    3.00688   1.99133   0.994938
 6    2.99693   1.99638   0.997906
 7    2.99871   1.99998   0.997939
 8    3.00051   1.99955   0.999736
 9    2.99984   1.99977   0.999921
10    2.99991   2.00001   0.999878
This iterative process strongly suggests that the given system has the solution (3, 2, 1). ■
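These computations are easy to reproduce in MATLAB; the following short script (a sketch of our own, with temporary variables xn, yn, zn) prints the table above:

% Jacobi iteration for the first example, starting from (2, 1, 0)
x = 2;  y = 1;  z = 0;
for k = 1:10
    xn = (9 + 2*y - z)/4;       % new components use only the old values
    yn = (19 - 2*x + 3*z)/8;
    zn = (x + 2*y - 2)/5;
    x = xn;  y = yn;  z = zn;   % update all three components at once
    fprintf('%2d  %9.6f  %9.6f  %9.6f\n', k, x, y, z)
end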

This process is called Jacobi iteration and can be used to solve certain types of linear systems. It is named after the German mathematician Carl Gustav Jacob Jacobi (1804--1851), who made fundamental contributions to elliptic functions, dynamics, differential equations, and number theory. His first name is sometimes given as Karl, and he was the first Jewish mathematician to be appointed professor at a German university.

Linear systems with as many as 100,000 or more variables often arise in the solutions of partial differential equations (Chapter 7). The coefficient matrices for these systems are sparse; that is, a large percentage of the entries of the coefficient matrix are zero. An iterative process provides an efficient method for solving these large systems.

Sometimes the Jacobi method does not work. Let us experiment with the following system of equations.

Example. Consider the system of equations

\[ \begin{split} 4x-2y+8z&=16, \\ 2x+y-5z&=3, \\ 5x+2y -z&=18. \end{split} \]
We rewrite these equations as
\[ \begin{split} x&= (16+2y-8z)/4 , \\ y&= 3-2x+5z , \\ z&= 5x +2y-18. \end{split} \]
This suggests the following Jacobi iterative process
\[ \begin{split} x_{k+1}&= (16+2y_k -8z_k )/4 , \\ y_{k+1}&=3-2x_k +5z_k , \\ z_{k+1}&=5x_k +2y_k -18. \end{split} \]
Starting with some initial values, say (2,1,0), we fill out the table
 k      x_k         y_k         z_k
 1      4.5         -1          -6
 2     15.5        -36           2.5
 3    -19          -15.5       -12.5
 4     21.25       -21.5      -144.
 5    281.25      -759.5        45.25
 6   -466.25      -333.25     -130.75
 7     98.875      281.75    -3015.75
 8   6176.38    -15273.5      1039.88
 9  -9712.5     -7150.38       316.875
10  -4204.94    21012.4      62881.3
So this iterative process diverges and cannot converge to the true solution (3, 2, 1). ■

The last example shows that we need some criterion to determine whether the Jacobi iteration will converge. Hence, we make the following definition.

A square matrix A of dimensions \( n \times n \) is said to be strictly diagonally dominant provided that

\[ | a_{k,k} | > \sum_{j=1, j\ne k}^n |a_{k,j}| \qquad \mbox{for} \quad k=1,2,\ldots , n. \qquad ■ \]
This means that in each row of the matrix the magnitude of the element on the main diagonal must exceed the sum of the magnitudes of all other elements in the row.
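In MATLAB this condition is easy to test; the following one-line check (a helper of our own, not a built-in) returns true exactly when A is strictly diagonally dominant:

% 2|a(k,k)| > sum_j |a(k,j)| is equivalent to |a(k,k)| > sum_{j~=k} |a(k,j)|
isSDD = @(A) all( 2*abs(diag(A)) > sum(abs(A),2) );
isSDD([4 -2 1; 2 8 -3; 1 2 -5])    % 1: the first example system is dominant
isSDD([4 -2 8; 2 1 -5; 5 2 -1])    % 0: the second one is not, and Jacobi diverged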

Theorem (Jacobi iteration): Suppose that A is a strictly diagonally dominant matrix. Then A x = b has a unique solution x = p. Iteration using the Jacobi formula

\[ x_j^{(k+1)} = \frac{1}{a_{j,j}} \left[ b_j - a_{j,1} x_1^{(k)} - \cdots - a_{j,j-1} x_{j-1}^{(k)} - a_{j,j+1} x_{j+1}^{(k)} - \cdots - a_{j,n} x_n^{(k)} \right] , \quad j=1,2,\ldots , n , \]
will produce a sequence of vectors \( \left\{ {\bf p}_k \right\}_{k\ge 0} \) that will converge to p for any choice of the starting vector p0. ■

This formula can be written in compact matrix form:

\[ {\bf x}^{(k+1)} = {\bf B} {\bf x}^{(k)} + {\bf w} = - {\bf \Lambda}^{-1} \left( {\bf L} + {\bf U} \right) {\bf x}^{(k)} + {\bf \Lambda}^{-1} {\bf b} , \]
where \( {\bf A} = {\bf L} + {\bf \Lambda} + {\bf U} \) with
\[ {\bf \Lambda} = \begin{bmatrix} a_{1,1} & 0 & \cdots & 0 \\ 0 & a_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots & a_{n,n} \end{bmatrix} , \quad {\bf U} = \begin{bmatrix} 0& a_{1,2} & \cdots & a_{1,n} \\ 0&0& \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots & 0 \end{bmatrix} , \quad {\bf L} = \begin{bmatrix} 0&0& \cdots & 0 \\ a_{2,1} & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & 0 \end{bmatrix} . \]
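This matrix form translates directly into MATLAB. Here is a minimal sketch for the first example (the loop count of 10 is an arbitrary choice of ours):

% Jacobi iteration in matrix form for the first example
A = [4 -2 1; 2 8 -3; 1 2 -5];   b = [9; 19; 2];
D = diag(diag(A));              % Lambda: the diagonal part of A
L = tril(A,-1);                 % strictly lower triangular part
U = triu(A,1);                  % strictly upper triangular part
x = [2; 1; 0];                  % initial vector p0
for k = 1:10
    x = D \ (b - (L + U)*x);    % x^(k+1) = -D^(-1)(L+U) x^(k) + D^(-1) b
end
x                               % close to the true solution (3, 2, 1)

The following function carries out the same iteration componentwise, stopping when successive iterates agree to within tol.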

function x1 = jacobi1(a,b,x0,tol)
% Jacobi iteration for a*x = b, starting from the column vector x0;
% iterates until successive approximations agree to within tol.
n = length(b);
x1 = zeros(n,1);
for j = 1 : n
    % first iteration: each component uses only the old vector x0
    x1(j) = (b(j) - a(j,[1:j-1,j+1:n]) * x0([1:j-1,j+1:n])) / a(j,j);
end
k = 1;
x_new = zeros(n,1);
while norm(x1-x0,1) > tol
    for j = 1 : n
        x_new(j) = (b(j) - a(j,[1:j-1,j+1:n]) * x1([1:j-1,j+1:n])) / a(j,j);
    end
    x0 = x1;
    x1 = x_new;
    k = k + 1;
end
k          % display the number of iterations performed

Here is another MATLAB version that solves the linear system A x = b by starting with an initial guess \( {\bf x} = {\bf p}_0 \) and generating a sequence \( \left\{ {\bf p}_k \right\}_{k\ge 0} \) that converges to the solution. A sufficient condition for the method to be applicable is that A is strictly diagonally dominant.

function x = jacobi(A,b,p,delta,max1)
% Input:    A is an n x n nonsingular matrix
%           b is an n x 1 matrix (= column vector)
%           p is an n x 1 matrix: the initial guess, as a column vector
%           delta is the tolerance for p
%           max1 is the maximum number of iterations
% Output:   x is an n x 1 matrix (= column vector): the Jacobi approximation
n = length(b);
x = zeros(1,n);                 % row vector of updated components
for k=1:max1
    for j=1:n
        x(j)=(b(j)-A(j,[1:j-1,j+1:n])*p([1:j-1,j+1:n]))/A(j,j);
    end
    err=abs(norm(x'-p));
    relerr=err/(norm(x)+eps);   % eps guards against division by zero
    p=x';
    if (err < delta) || (relerr < delta)
        break
    end
end
x=x';
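A typical call for the first example, assuming the function is saved as jacobi.m, is:

A = [4 -2 1; 2 8 -3; 1 2 -5];
b = [9; 19; 2];
x = jacobi(A, b, [2; 1; 0], 1e-9, 100)   % converges to (3, 2, 1)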

Gauss-Seidel Iteration
Sometimes the convergence can be sped up. Let us reconsider our first example.

Example. The system of equations

\[ \begin{split} 4x-2y+z&=9, \\ 2x+8y-3z&=19, \\ x+2y -5z&=2. \end{split} \]
can be iterated as follows:
\[ \begin{split} x_{k+1}&= (9+2y_k -z_k )/4 , \\ y_{k+1}&=(19-2x_{k+1} +3z_k )/8 , \\ z_{k+1}&=(x_{k+1} +2y_{k+1} -2)/5. \end{split} \]
It seems reasonable that \( x_{k+1} \) could be used in place of \( x_k \) in the computations of \( y_{k+1} \) and \( z_{k+1} \); similarly, \( y_{k+1} \) might be used in the computation of \( z_{k+1} \).

This algorithm, also known as the Liebmann method or the method of successive displacement, is named after the German mathematicians Carl Friedrich Gauss (1777--1855) and Philipp Ludwig von Seidel (1821--1896). Gauss mentioned it only in a private letter to his student Gerling in 1823, and it was not published until 1874, by Seidel, who did not recommend its use. Seidel also discovered, in 1847, the crucial analytic concept of uniform convergence; in 1857 he decomposed the first-order monochromatic aberrations into five constituent aberrations, now commonly referred to as the five Seidel aberrations, and the lunar crater Seidel is named after him. The following table confirms that Gauss-Seidel iteration converges faster than Jacobi iteration.

If we start with \( {\bf p}_0 = (2,1,0) , \) then the iteration converges to the solution (3, 2, 1) much faster than under the Jacobi method.
 k      x_k       y_k       z_k
 1    2.75      1.6875    0.825
 2    2.8875    1.9625    0.9625
 3    2.99063   1.98828   0.993438
 4    2.99578   1.99859   0.998594
 5    2.99965   1.99956   0.999754
 6    2.99984   1.99995   0.999947
 7    2.99999   1.99995   0.999947
 8    2.99999   1.99998   0.999991
 9    3.        2.        1.
10    3.        2.        1.

We now generalize the Gauss-Seidel iteration procedure. Suppose we are given the linear system

\[ \begin{split} a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,j} x_j + \cdots + a_{1,n} x_n &= b_1 , \\ a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,j} x_j + \cdots + a_{2,n} x_n &= b_2 , \\ \qquad \vdots \qquad \vdots & \quad \vdots \\ a_{n,1} x_1 + a_{n,2} x_2 + \cdots + a_{n,j} x_j + \cdots + a_{n,n} x_n &= b_n . \end{split} \]
Let the kth point be \( {\bf p}_k = \left( x_1^{(k)} ,x_2^{(k)} ,\ldots , x_j^{(k)} , \ldots , x_n^{(k)} \right) ; \) then the next point is \( {\bf p}_{k+1} = \left( x_1^{(k+1)} ,x_2^{(k+1)} ,\ldots , x_j^{(k+1)} , \ldots , x_n^{(k+1)} \right) . \) The superscript (k) on the coordinates of pk enables us to identify the coordinates that belong to this point. The Gauss--Seidel iteration formulas use row j to solve for \( x_j^{(k+1)} \) in terms of a linear combination of the previous values:
\[ x_j^{(k+1)} = \frac{1}{a_{j,j}} \left[ b_j - a_{j,1} x_1^{(k+1)} - \cdots - a_{j,j-1} x_{j-1}^{(k+1)} - a_{j,j+1} x_{j+1}^{(k)} - \cdots - a_{j,n} x_n^{(k)} \right] \]
for \( j=1,2,\ldots , n . \) This formula becomes clear if we rewrite the system in the matrix form:
\[ \begin{bmatrix} a_{1,1} & 0 & \cdots & 0 \\ a_{2,1} & a_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{bmatrix} \begin{bmatrix} x_1^{(k+1)} \\ x_2^{(k+1)} \\ \vdots \\ x_n^{(k+1)} \end{bmatrix} + \begin{bmatrix} 0& a_{1,2} & \cdots & a_{1,n} \\ 0&0& \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots & 0 \end{bmatrix} \begin{bmatrix} x_1^{(k)} \\ x_2^{(k)} \\ \vdots \\ x_n^{(k)} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} . \]
That is,
\[ \left( {\bf L} + {\bf \Lambda} \right) {\bf x}^{(k+1)} + {\bf U} \, {\bf x}^{(k)} = {\bf b} , \]
where L and U are, as before, the strictly lower and strictly upper triangular parts of A, and Λ is its diagonal part. Then
\[ {\bf x}^{(k+1)} = \left( {\bf L} + {\bf \Lambda} \right)^{-1} \left[ {\bf b} - {\bf U} \, {\bf x}^{(k)} \right] , \]
which can be written in compact form
\[ {\bf x}^{(k+1)} = {\bf B} {\bf x}^{(k)} + {\bf w} , \]
where \( {\bf B} = - \left( {\bf L} + {\bf \Lambda} \right)^{-1} {\bf U} \) and \( {\bf w} = \left( {\bf L} + {\bf \Lambda} \right)^{-1} {\bf b} . \) Let \( {\bf B}_S \) and \( {\bf B}_J \) be the iteration matrices of the Gauss-Seidel and Jacobi methods, respectively. For a large class of matrices (in particular, for consistently ordered matrices, which include many tridiagonal systems) the spectral radii satisfy
\[ \rho \left( {\bf B}_S \right) = \rho \left( {\bf B}_J \right)^2 , \]
so one Gauss-Seidel sweep reduces the error by roughly as much as two Jacobi sweeps, making the Gauss-Seidel method about twice as efficient. ■
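We can check this numerically for the first example; the figures in the comments below come from our own computation, and since this matrix is not consistently ordered the squaring relation holds only approximately:

A = [4 -2 1; 2 8 -3; 1 2 -5];
D = diag(diag(A));  L = tril(A,-1);  U = triu(A,1);
BJ = -D \ (L + U);            % Jacobi iteration matrix
BS = -(L + D) \ U;            % Gauss-Seidel iteration matrix
rhoJ = max(abs(eig(BJ)))      % about 0.41
rhoS = max(abs(eig(BS)))      % about 0.19, close to rhoJ^2 = 0.17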

We present MATLAB code that solves the linear system A x = b by starting with the initial guess \( {\bf x} = {\bf p}_0 \) and generating a sequence \( \left\{ {\bf p}_k \right\}_{k\ge 0} \) that converges to the solution. A sufficient condition for the method to be applicable is that A is either strictly diagonally dominant or symmetric and positive definite.

function x = gseidel(A,b,p,delta,max1)
% Input:    A is an n x n nonsingular matrix
%           b is an n x 1 matrix (= column vector)
%           p is an n x 1 matrix: the initial guess, as a column vector
%           delta is the tolerance for p
%           max1 is the maximum number of iterations
% Output:   x is an n x 1 matrix (= column vector): the Gauss-Seidel approximation
n = length(b);
x = zeros(1,n);                 % row vector of updated components
for k=1:max1
    for j=1:n
        if j==1
            x(1)=(b(1)-A(1,2:n)*p(2:n))/A(1,1);
        elseif j==n
            x(n)=(b(n)-A(n,1:n-1)*x(1:n-1)')/A(n,n);
        else
            % new values x(1:j-1), old values p(j+1:n)
            x(j)=(b(j)-A(j,1:j-1)*x(1:j-1)' - A(j,j+1:n)*p(j+1:n))/A(j,j);
        end
    end
    err=abs(norm(x'-p));
    relerr=err/(norm(x)+eps);
    p=x';
    if (err < delta) || (relerr < delta)
        break
    end
end
x=x';
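Running it on the first example with the same starting vector illustrates the faster convergence (assuming the function is saved as gseidel.m):

A = [4 -2 1; 2 8 -3; 1 2 -5];
b = [9; 19; 2];
x = gseidel(A, b, [2; 1; 0], 1e-9, 100)   % reaches (3, 2, 1) in fewer sweeps than jacobi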

The SOR Method
A third iterative method, called the Successive Overrelaxation (SOR) Method, is a generalization of and improvement on the Gauss-Seidel Method. Here is the idea:

We can write the Gauss-Seidel equation as

\[ {\bf \Lambda} {\bf x}^{(k+1)} = {\bf b} - {\bf L}\, {\bf x}^{(k+1)} - {\bf U} {\bf x}^{(k)} \]
so that
\[ {\bf x}^{(k+1)} = {\bf \Lambda}^{-1} \left[ {\bf b} - {\bf L}\, {\bf x}^{(k+1)} - {\bf U} {\bf x}^{(k)} \right] . \]
We can subtract \( {\bf x}^{(k)} \) from both sides to get
\[ {\bf x}^{(k+1)} - {\bf x}^{(k)} = {\bf \Lambda}^{-1} \left[ {\bf b} - {\bf L}\, {\bf x}^{(k+1)} - {\bf \Lambda} {\bf x}^{(k)} - {\bf U} {\bf x}^{(k)} \right] . \]
The bracketed term on the right-hand side can be viewed as the Gauss-Seidel correction; scaling it by a relaxation parameter ω yields the SOR iteration
\[ {\bf x}^{(k+1)} = {\bf x}^{(k)} + \omega {\bf \Lambda}^{-1} \left[ {\bf b} - {\bf L}\, {\bf x}^{(k+1)} - {\bf \Lambda} {\bf x}^{(k)} - {\bf U} {\bf x}^{(k)} \right] , \]
where the relaxation parameter ω is usually chosen with \( 1 < \omega < 2 \) (over-relaxation); convergence requires \( 0 < \omega < 2 . \) Notice that if ω = 1, then this is the Gauss-Seidel method. We can multiply both sides by the matrix Λ and divide both sides by ω to rewrite this as
\[ \frac{1}{\omega} \,{\bf \Lambda} \,{\bf x}^{(k+1)} = \frac{1}{\omega} \,{\bf \Lambda} \,{\bf x}^{(k)} + \left[ {\bf b} - {\bf L}\, {\bf x}^{(k+1)} - {\bf \Lambda} {\bf x}^{(k)} - {\bf U} {\bf x}^{(k)} \right] , \]
then collect the \( {\bf x}^{(k+1)} \) terms on the left hand side to have
\[ \left( {\bf L} + \frac{1}{\omega} \,{\bf \Lambda} \right) {\bf x}^{(k+1)} = \left( \frac{1}{\omega} \,{\bf \Lambda} - {\bf \Lambda} - {\bf U} \right) {\bf x}^{(k)} + {\bf b} . \]
When we solve for \( {\bf x}^{(k+1)} , \) we get SOR iteration:
\[ {\bf x}^{(k+1)} = \left( {\bf L} + \frac{1}{\omega} \,{\bf \Lambda} \right)^{-1} \left( \frac{1}{\omega} \,{\bf \Lambda} - {\bf \Lambda} - {\bf U} \right) {\bf x}^{(k)} + \left( {\bf L} + \frac{1}{\omega} \,{\bf \Lambda} \right)^{-1} {\bf b} , \]
which can also be written as \( {\bf x}^{(k+1)} = {\bf B} {\bf x}^{(k)} + {\bf w} . \) Written out componentwise, the iteration reads:
\[ x^{(k+1)}_i = x_i^{(k)} + \frac{\omega}{a_{i,i}} \left( b_i - \sum_{j=1}^{i-1} a_{i,j} x_j^{(k+1)} - \sum_{j=i}^n a_{i,j} x_j^{(k)} \right) . \]
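The section gives no MATLAB code for SOR, so here is a minimal componentwise sketch; the function name sor and its stopping rule are our own choices:

function x = sor(A, b, x0, omega, delta, max1)
% Solve A*x = b by successive over-relaxation with parameter omega;
% omega = 1 reproduces the Gauss-Seidel method.
n = length(b);
x = x0;                            % current iterate, a column vector
for k = 1:max1
    xold = x;
    for i = 1:n
        % residual of equation i, using the newest available components
        r = b(i) - A(i,1:i-1)*x(1:i-1) - A(i,i:n)*x(i:n);
        x(i) = x(i) + omega * r / A(i,i);
    end
    if norm(x - xold) < delta
        break
    end
end

With omega = 1 this reproduces the Gauss-Seidel iterates tabulated above; for model problems arising from partial differential equations, a suitable omega between 1 and 2 can reduce the number of iterations dramatically.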

