
Global Eigensolvers



Unlike local eigensolvers, global methods for eigenproblems do not require a good initial guess for an eigenvalue. In addition, they provide the entire eigenspectrum. We will discuss two such approaches that are based on two different concepts: the QR method and the Lanczos method. The first one is general, while the second one is appropriate for symmetric sparse systems.



The QR Eigensolver

The main idea of this method is to construct a sequence of similar matrices ${\bf A}_1, {\bf A}_2,
\ldots, {\bf A}_k$, where ${\bf A}_1 = {\bf A} $ and, as $k\rightarrow \infty$, ${\bf A}_k$ approaches an upper triangular matrix of the form

\begin{displaymath}{\bf A}_k \rightarrow \left [ \begin{array}{ccccc}
\lambda_1 & \times & \times & \cdots & \times\\
 & \lambda_2 & \times & \cdots & \times\\
 & & \lambda_3 & & \vdots\\
 & & & \ddots & \\
 & & & & \lambda_n
\end{array}\right ]\end{displaymath}

The Schur triangularization theorem [#!Golub-vanLoan!#] guarantees that this triangular matrix exists if an orthogonal matrix ${\bf Q}$ is involved in the similarity transformation. The scalar entries on the diagonal correspond to real eigenvalues. However, for non-symmetric matrices we may encounter complex eigenvalues, which manifest themselves as $2 \times 2$ submatrices on the diagonal in the form

\begin{displaymath}{\bf A}_k\rightarrow \left [ \begin{array}{cccccc}
\lambda _1 & \times & \times & \times & \cdots & \times\\
 & a_{22} & a_{23} & \times & \cdots & \times\\
 & a_{32} & a_{33} & \times & \cdots & \times\\
 & & & \lambda_4 & & \vdots\\
 & & & & \ddots & \\
0 & & & & & \lambda_n
\end{array}\right ]\end{displaymath}

These $2 \times 2$ submatrices, as shown above, also appear for double eigenvalues, and they are sometimes referred to as bumps. The eigenvalues $\lambda_2$ and $\lambda_3$ can then be obtained by simply solving the second-order characteristic polynomial of the $2 \times 2$ submatrix.
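For the $2 \times 2$ submatrix shown above, with entries $a_{22}, a_{23}, a_{32}, a_{33}$, this amounts to the quadratic

\begin{displaymath}\det \left [ \begin{array}{cc}
a_{22}-\lambda & a_{23}\\
a_{32} & a_{33}-\lambda \end{array}\right ] = 0
\quad \Rightarrow \quad
\lambda_{2,3} = \frac{(a_{22}+a_{33}) \pm \sqrt{(a_{22}+a_{33})^2 - 4\,(a_{22}a_{33}-a_{23}a_{32})}}{2}.\end{displaymath}

We now present the basic QR iteration: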

\begin{displaymath}\begin{array}{ll}
\mbox{Initialize:}& {\bf A}_0 = {\bf A} \; ; \quad {\bf A}_0 = {\bf Q}_0 {\bf R}_0\\
\mbox{Iterate:}& {\bf A}_1 = {\bf R}_0 {\bf Q}_0 \; ; \quad {\bf A}_1 = {\bf Q}_1 {\bf R}_1\\
& {\bf A}_2 = {\bf R}_1 {\bf Q}_1 \; ; \quad {\bf A}_2 = {\bf Q}_2 {\bf R}_2\\
& \hspace{.5in} \vdots
\end{array}\end{displaymath}

To prove that all matrices in the sequence are similar we first examine

\begin{displaymath}{\bf A}_1 = {\bf R}_0 {\bf Q}_0 = ({\bf Q}_0^T {\bf A}_0) {\bf Q}_0\end{displaymath}

and since ${\bf Q}_0^T{\bf Q}_0 = {\bf I}$ we conclude that ${\bf A}_1$ is similar to ${\bf A}_0 = {\bf A}.$

Similarly, at iteration $k$,

\begin{displaymath}{\bf A}_k = {\bf R}_{k-1} {\bf Q}_{k-1} =
({\bf Q}^T_{k-1} {\bf A}_{k-1} ) {\bf Q}_{k-1}.\end{displaymath}

The method converges if all eigenvalues have distinct magnitudes, in which case the diagonal entries converge to the eigenvalues ordered as

\begin{displaymath}\vert\lambda_1\vert > \vert\lambda_2\vert> \ldots > \vert\lambda_n\vert,\end{displaymath}

(see [#!Wilkinson65!#]). The rate of convergence is determined by the ratios

\begin{displaymath}\frac{\vert\lambda_{i+1}\vert}{\vert\lambda_{i}\vert}.
\end{displaymath}
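To make the iteration concrete, the following is a minimal serial C++ sketch (illustrative only, not the code from this book): the QR factorization uses modified Gram-Schmidt for clarity rather than efficiency or robustness, the names Matrix, QRFactor, and QREigenvalues are our own, and the matrix is assumed to have real eigenvalues of distinct magnitude.

// Minimal sketch of the basic QR iteration (not the book's code).
#include <cmath>
#include <iostream>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Factor A = Q R with modified Gram-Schmidt (A assumed nonsingular).
void QRFactor(const Matrix &A, Matrix &Q, Matrix &R) {
  int n = A.size();
  Q = A;                                  // columns of Q start as columns of A
  R.assign(n, std::vector<double>(n, 0.0));
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < j; ++i) {         // subtract projections on previous q_i
      double dot = 0.0;
      for (int k = 0; k < n; ++k) dot += Q[k][i] * Q[k][j];
      R[i][j] = dot;
      for (int k = 0; k < n; ++k) Q[k][j] -= dot * Q[k][i];
    }
    double nrm = 0.0;
    for (int k = 0; k < n; ++k) nrm += Q[k][j] * Q[k][j];
    R[j][j] = std::sqrt(nrm);
    for (int k = 0; k < n; ++k) Q[k][j] /= R[j][j];
  }
}

// Basic QR iteration: A_{k+1} = R_k Q_k.  The diagonal of A_k approaches the
// eigenvalues (assuming real eigenvalues of distinct magnitude).
std::vector<double> QREigenvalues(Matrix A, int maxIter) {
  int n = A.size();
  Matrix Q, R;
  for (int it = 0; it < maxIter; ++it) {
    QRFactor(A, Q, R);
    for (int i = 0; i < n; ++i)           // A = R * Q
      for (int j = 0; j < n; ++j) {
        double sum = 0.0;
        for (int k = 0; k < n; ++k) sum += R[i][k] * Q[k][j];
        A[i][j] = sum;
      }
  }
  std::vector<double> lambda(n);
  for (int i = 0; i < n; ++i) lambda[i] = A[i][i];
  return lambda;
}

int main() {
  // Small symmetric test matrix; eigenvalues approximately 4.25, 2.55, 1.20.
  Matrix A(3, std::vector<double>(3));
  A[0][0] = 3; A[0][1] = 1; A[0][2] = 0;
  A[1][0] = 1; A[1][1] = 3; A[1][2] = 1;
  A[2][0] = 0; A[2][1] = 1; A[2][2] = 2;
  std::vector<double> lambda = QREigenvalues(A, 100);
  for (unsigned i = 0; i < lambda.size(); ++i)
    std::cout << "lambda_" << i + 1 << " = " << lambda[i] << std::endl;
  return 0;
}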



We note here that a similar sequence can be generated, at about the same or even lower computational cost, by using an LU decomposition instead of the QR factorization; this is the so-called LR method. However, no pivoting can be employed, as pivoting would destroy the similarity of the transformation and may change the eigenvalues (e.g., their sign), and thus the LU-based procedure can be unstable. We also note that the QR decomposition can be applied to a singular matrix, unlike the LU factorization. So overall, the QR eigensolver is definitely the preferred choice. However, one case in which the LU factorization may be preferred over the QR is when ${\bf A}$ is banded and diagonally dominant. The reason is that LU preserves the bandwidth, unlike the QR approach, which fills in the (initially) zero off-diagonal elements at the rate of one additional diagonal on each side per iteration. The diagonal dominance of ${\bf A}$ guarantees that no pivoting is needed for stability, so overall the LR eigensolver will be the faster method in this case.



The Hessenberg QR Eigensolver



The basic QR method has two main disadvantages. First, it requires a QR factorization at each step, which costs

\begin{displaymath}{\cal O}\left ( \frac{4n^3}{3}\right )\end{displaymath}

i.e., twice as much as an LU decomposition. Second, it does not converge if two eigenvalues have the same magnitude. Fortunately, both problems can be overcome.

With regard to the cost, the main idea is to first transform the matrix ${\bf A}$ to an upper Hessenberg matrix, i.e.,

\begin{displaymath}{\bf A} \longrightarrow {\bf He}^u\end{displaymath}

(see section [*]), and subsequently perform the QR factorization on ${\bf He}^u.$ The method works because all subsequent matrices maintain the Hessenberg form, as can be verified with an example! The initial cost to perform the Hessenberg reduction is

\begin{displaymath}{\cal O}\left (\frac{5}{3} n^3\right ),\end{displaymath}

however, the QR factorization of a Hessenberg matrix costs only ${\cal O}(n^2)$, and this is the cost we incur in every iteration. Therefore, the Hessenberg QR eigensolver is ${\cal O}(n)$ times more efficient than the basic QR eigensolver.
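To see where the ${\cal O}(n^2)$ cost per iteration comes from, the following C++ sketch (again illustrative, not the book's code) performs one step ${\bf H} = {\bf Q}{\bf R}$ followed by ${\bf H} \leftarrow {\bf R}{\bf Q}$ for an upper Hessenberg matrix: only the $n-1$ subdiagonal entries must be annihilated, one Givens rotation each, and each rotation touches ${\cal O}(n)$ entries; applying the same rotations from the right restores the Hessenberg form.

// Sketch: one QR step H -> RQ for an upper Hessenberg matrix using Givens
// rotations (illustrative, not the book's code).  Each of the n-1 rotations
// touches O(n) entries, so one full step costs O(n^2), and the result R*Q
// is again upper Hessenberg.
#include <cmath>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

void HessenbergQRStep(Matrix &H) {
  int n = H.size();
  std::vector<double> c(n - 1), s(n - 1);

  // Factor H = Q R: annihilate the subdiagonal entries h_{i+1,i}
  // with Givens rotations acting on rows i and i+1.
  for (int i = 0; i < n - 1; ++i) {
    double a = H[i][i], b = H[i + 1][i];
    double r = std::sqrt(a * a + b * b);
    c[i] = (r == 0.0) ? 1.0 : a / r;
    s[i] = (r == 0.0) ? 0.0 : b / r;
    for (int j = i; j < n; ++j) {         // rotate rows i and i+1
      double t1 = H[i][j], t2 = H[i + 1][j];
      H[i][j]     =  c[i] * t1 + s[i] * t2;
      H[i + 1][j] = -s[i] * t1 + c[i] * t2;
    }
  }
  // H now holds R (upper triangular).  Form R Q by applying the transposed
  // rotations to the columns; only rows 0..i+1 carry nonzeros in columns
  // i and i+1, so this sweep also costs O(n^2).
  for (int i = 0; i < n - 1; ++i) {
    for (int j = 0; j <= i + 1; ++j) {    // rotate columns i and i+1
      double t1 = H[j][i], t2 = H[j][i + 1];
      H[j][i]     =  c[i] * t1 + s[i] * t2;
      H[j][i + 1] = -s[i] * t1 + c[i] * t2;
    }
  }
}

In contrast, for a full matrix every entry below the diagonal must be eliminated, which brings the cost per step back to ${\cal O}(n^3)$.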



Shifted QR Eigensolver



The convergence rate is at most linear for the QR eigensolver, i.e., the error in every iteration is

\begin{displaymath}\epsilon_{k+1} \sim C \epsilon_k\end{displaymath}

where $\epsilon_k = \lambda_n-(a_{nn})_k.$

This is dictated by the ratio

\begin{displaymath}\frac{\vert\lambda_{i+1}\vert}{\vert\lambda_i\vert}\end{displaymath}

which we want to make as small as possible. The idea of a shift is similar to what we did in the power method (see section [*]).

Let us first assume that we have only real and simple eigenvalues. Instead of applying the QR factorization to ${\bf He}^u$ we will apply it to $({\bf He}^u - \sigma {\bf I})$, thereby shifting the entire eigenspectrum by $\sigma $ to $(\lambda_i-\sigma).$
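The benefit can be seen from the convergence ratio: with the shift, the entry $h_{n,n-1}$ is driven to zero at a rate governed by

\begin{displaymath}\frac{\vert\lambda_n - \sigma\vert}{\vert\lambda_{n-1} - \sigma\vert}\end{displaymath}

(assuming the eigenvalues are ordered so that $\lambda_n$ is the one closest to $\sigma $), which can be made very small by choosing $\sigma $ close to $\lambda_n$.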

Specifically, we choose $\sigma $ such that the element in position $(n,n-1)$, $h_{n,n-1} \rightarrow 0$ as rapidly as possible. Upon convergence, the entry in position $(n,n)$ (after we account for the shift $\sigma $) will be the desired eigenvalue. Clearly, the closest estimate to $\lambda_n$ we have is the entry $h^{(k)}_{nn}$ of the matrix $({\bf H}_e)_k$. The modified algorithm to generate the sequence is then



Perform QR factorization:

\begin{displaymath}({\bf H}_e)_k - h^{(k)}_{nn} {\bf I} = {\bf Q}_k {\bf R}_k\end{displaymath}

Obtain New Iterate:

\begin{displaymath}({\bf H}_e)_{k+1} ={\bf R}_k {\bf Q}_k + h^{(k)}_{nn} {\bf I}\end{displaymath}

and so on.

Then $({\bf H}_e)_{k+1}$ is similar to $({\bf H}_e)_k$ because

\begin{eqnarray*}({\bf H}_e)_{k+1} & = & {\bf R}_k {\bf Q}_k + h^{(k)}_{nn} {\bf I}\\
& = & {\bf Q}^T_k {\bf Q}_k {\bf R}_k {\bf Q}_k + h^{(k)}_{nn} {\bf Q}^T_k {\bf Q}_k\\
& = & {\bf Q}^T_k [{\bf Q}_k {\bf R}_k + h^{(k)}_{nn} {\bf I}] {\bf Q}_k\\
& = & {\bf Q}_k^T ({\bf H}_e)_k {\bf Q}_k .
\end{eqnarray*}


The convergence of the shifted method is quadratic in general and cubic if the original matrix ${\bf A}$ is symmetric.

For double or complex eigenvalues the shift strategy has to change in order to deal effectively with the ``bumps'' on the diagonal of the ${\bf A}_k$ matrix, as discussed earlier. Specifically, we modify the algorithm so that each iteration has two substeps. We can use the eigenvalues $\lambda_2$ and $\lambda_3$ of the submatrix to obtain the iterates as follows:

\begin{eqnarray*}({\bf H}_e)_1 - \lambda_2{\bf I} & = & {\bf Q}_1 {\bf R}_1\\
({\bf H}_e)_2 & = & {\bf R}_1 {\bf Q}_1 + \lambda_2 {\bf I}\\
({\bf H}_e)_2 - \lambda_3 {\bf I} & = & {\bf Q}_2 {\bf R}_2\\
({\bf H}_e)_3 & = & {\bf R}_2 {\bf Q}_2 + \lambda_3 {\bf I}\\
& \vdots &
\end{eqnarray*}


and so on, for subsequent iterations. This method requires complex arithmetic because $\lambda_2$ and $\lambda_3$ may be a complex pair, which is certainly a disadvantage. However, Wilkinson [#!Wilkinson!#] devised another method that avoids complex arithmetic and is based on the sum $(\lambda_2 + \lambda_3)$ and the product $\lambda_2\cdot \lambda_3$ of the complex pair.
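A sketch of the underlying idea (the standard double-shift formulation): even when $\lambda_2$ and $\lambda_3$ form a complex-conjugate pair, their sum and product are real, since they are the trace and the determinant of the real $2 \times 2$ bump. If the bump has entries

\begin{displaymath}\left [ \begin{array}{cc}
a & b\\
c & d \end{array}\right ], \qquad
s = \lambda_2 + \lambda_3 = a + d, \qquad
p = \lambda_2 \lambda_3 = ad - bc,\end{displaymath}

then the two shifted substeps above combine into a transformation built from the real matrix

\begin{displaymath}({\bf H}_e - \lambda_2 {\bf I})({\bf H}_e - \lambda_3 {\bf I}) =
{\bf H}_e^2 - s {\bf H}_e + p {\bf I},\end{displaymath}

so the double shift can be carried out without ever forming $\lambda_2$ and $\lambda_3$ explicitly.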



Remark: After the last eigenvalue $\lambda_n$ has been obtained, deflation of the Hessenberg matrix can be applied to find the next one at reduced computational cost. In general, it takes about two to three QR iterations to compute one eigenvalue. The total cost is about ${\cal O}(10n^3)$ but such an operation count is approximate and problem dependent.



The Symmetric QR Eigensolver: Wilkinson Shift



Symmetric matrices are special, first because they have only real eigenvalues but also because the Hessenberg transformation reduces the matrix to a symmetric tridiagonal matrix. The QR iteration maintains this property for the entire sequence. The other good news is that the QR factorization of a symmetric tridiagonal matrix costs only ${\cal O}(n)$, instead of ${\cal O}(n^2)$ for the Hessenberg matrix. Therefore, the computational cost in this case is ${\cal O}(n)$ times the number of iterations to converge to all eigenvalues.

Wilkinson [#!Wilkinson!#] suggested that instead of using $h^{(k)}_{nn}$ as the shift $\sigma $ to accelerate convergence, a more effective choice is the eigenvalue of the $2 \times 2$ matrix

\begin{displaymath}\left [ \begin{array}{cc}
h_{n-1,n-1} & h_{n,n-1}\\
h_{n,n-1} & h_{nn}\end{array}\right ]\end{displaymath}

which is closer to the entry $h_{nn}$. Convergence of the symmetric QR eigensolver is cubic and the operation count is ${\cal O}(n^2)$, but there is an ${\cal O}(\frac{2}{3} n^3)$ cost associated with the reduction of ${\bf A}$ to a tridiagonal matrix.
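In closed form, this Wilkinson shift can be written (a standard formulation, with the convention $\mbox{sign}(0) = 1$) as

\begin{displaymath}\delta = \frac{h_{n-1,n-1} - h_{nn}}{2}, \qquad
\sigma = h_{nn} - \frac{\mbox{sign}(\delta)\, h^2_{n,n-1}}{\vert \delta \vert + \sqrt{\delta^2 + h^2_{n,n-1}}},\end{displaymath}

i.e., the root of the quadratic characteristic polynomial of the $2 \times 2$ block that lies closer to $h_{nn}$; this form also avoids cancellation in finite precision.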



Parallel QR Eigensolver: Divide-and-Conquer

A fully parallel algorithm for the symmetric eigenvalue problem was developed by Dongarra and Sorensen [#!DongarraS87!#]. It is based on the QR iteration and the divide-and-conquer approach we have encountered many times in this book.

First, we tridiagonalize the full symmetric matrix ${\bf A} \rightarrow {\bf T}$, where the latter has the form

\begin{displaymath}{\bf T} = \left [ \begin{array}{ccccc}
a_1 & b_1 & & & \\
b_1 & a_2 & b_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & b_{n-2} & a_{n-1} & b_{n-1}\\
 & & & b_{n-1} & a_n
\end{array}\right ]\end{displaymath}

We then divide ${\bf T}$ into two symmetric tridiagonal matrices of half size as follows:

\begin{displaymath}{\bf T}_1 = \left [ \begin{array}{ccccc}
a_1 & b_1 & & & \\
b_1 & a_2 & b_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & b_{k-2} & a_{k-1} & b_{k-1}\\
 & & & b_{k-1} & a_k - b_k
\end{array}\right ]\end{displaymath}

and

\begin{displaymath}{\bf T}_2 = \left [ \begin{array}{ccccc}
a_{k+1}-b_k & b_{k+1} & & & \\
b_{k+1} & a_{k+2} & b_{k+2} & & \\
 & \ddots & \ddots & \ddots & \\
 & & b_{n-2} & a_{n-1} & b_{n-1}\\
 & & & b_{n-1} & a_n \end{array}\right ]\end{displaymath}

so that

\begin{displaymath}{\bf T} = \left [ \begin{array}{cc}
{\bf T}_1 & 0\\
0 & {\bf T}_2\end{array}\right ] + b_k {\bf z}{\bf z}^T\end{displaymath}

where

\begin{displaymath}{\bf z} = \left ( \, 0, \; \ldots, \; 0, \; \underbrace{1}_{k}, \; \underbrace{1}_{k+1}, \; 0, \; \ldots, \; 0 \, \right )^T,\end{displaymath}

i.e., ${\bf z}$ has ones in positions $k$ and $k+1$ and zeros elsewhere.
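To make the splitting concrete, consider a small sketch with $n = 4$ and $k = 2$:

\begin{displaymath}\left [ \begin{array}{cccc}
a_1 & b_1 & & \\
b_1 & a_2 & b_2 & \\
 & b_2 & a_3 & b_3\\
 & & b_3 & a_4 \end{array}\right ]
=
\left [ \begin{array}{cccc}
a_1 & b_1 & & \\
b_1 & a_2 - b_2 & & \\
 & & a_3 - b_2 & b_3\\
 & & b_3 & a_4 \end{array}\right ]
+ b_2 \left [ \begin{array}{c} 0\\ 1\\ 1\\ 0 \end{array}\right ]
\left [ \begin{array}{cccc} 0 & 1 & 1 & 0 \end{array}\right ],\end{displaymath}

so subtracting $b_k$ from the diagonal entries $a_k$ and $a_{k+1}$ exactly compensates for the rank-one term $b_k {\bf z}{\bf z}^T$.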

Next, we can diagonalize both ${\bf T}_1$ and ${\bf T}_2$ using appropriate orthogonal matrices ${\bf Q}_1$ and ${\bf Q}_2$ as follows

\begin{displaymath}{\bf T}_1 = {\bf Q}_1 {\bf D}_1 {\bf Q}^T_1\;
\mbox{and}\; {\bf T}_2 = {\bf Q}_2 {\bf D}_2 {\bf Q}^T_2\end{displaymath}

where ${\bf D}_1$ and ${\bf D}_2$ are diagonal; we collect their entries into the diagonal matrix

\begin{displaymath}{\bf D} \equiv \left [ \begin{array}{cc}
{\bf D}_1 & {\large 0}\\
{\large 0} & {\bf D}_2 \end{array}\right ]
= \left [ \begin{array}{cccc}
d_1 & & & {\large 0}\\
 & d_2 & & \\
 & & \ddots & \\
{\large 0} & & & d_n\end{array}\right ].\end{displaymath}

Based on this definition and by introducing

\begin{displaymath}\mbox{\boldmath $\xi$ }= \left [ \begin{array}{cc}
{\bf Q}_1 & 0\\
0 & {\bf Q}_2 \end{array} \right ]^T {\bf z}\end{displaymath}

we can re-write the matrix ${\bf T}$ as follows

\begin{displaymath}{\bf T} = \left [ \begin{array}{cc}
{\bf Q}_1 & 0 \\
0 & {\bf Q}_2 \end{array}\right ]
\left ( {\bf D} + b_k \mbox{\boldmath $\xi$} \mbox{\boldmath $\xi$}^T \right )
\left [ \begin{array}{cc}
{\bf Q}_1 & 0\\
0 & {\bf Q}_2\end{array}\right ]^T\end{displaymath}

which shows that the matrix ${\bf T}$ is similar to the matrix

\begin{displaymath}{\bf G} \equiv D + b_k \mbox{\boldmath $\xi$} \mbox{\boldmath $\xi$}^T.\end{displaymath}

It turns out that the eigenvalues $\lambda_i$ of ${\bf G}$ are given by the following equation

\begin{displaymath}1 + b_k \sum^n_{i=1} \frac{\xi_i^2}{d_i-\lambda} = 0\end{displaymath}

where $\xi_i$ are the elements of the vector $\mbox{\boldmath$\xi$ }.$ In [#!Demmel97!#] several approaches to solving this equation (the so-called secular equation) are presented. Having computed the eigenvalues $\lambda_i$ of ${\bf G}$, which are also eigenvalues of ${\bf A}$, we can now compute the eigenvectors. First, the corresponding eigenvectors of ${\bf G}$ are (see [#!Demmel97!#,#!DongarraS87!#])

\begin{displaymath}{\bf y}_i = ({\bf D} -\lambda_i {\bf I})^{-1} \mbox{\boldmath $\xi$}.\end{displaymath}

Finally, the eigenvectors of the original matrix ${\bf A}$ are given by

\begin{displaymath}{\bf v}_i = \left [ \begin{array}{cc}
{\bf Q}_1 & 0\\
0 & {\bf Q}_2 \end{array}\right ] {\bf y}_i.
\end{displaymath}

Clearly, this is only the first level of dividing the problem. We can now repeat this process recursively for ${\bf T}_1$ and ${\bf T}_2$, which are also tridiagonal and half the size of the original matrix. These, in turn, can be further divided into two, and so on. The central serial kernel, solving the secular equation above for the eigenvalues, is sketched below; the fully parallel MPI/C++ algorithm is described in the reference at the end of this section.
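The sketch below (with illustrative names, not the book's MPI/C++ code) finds the eigenvalues of ${\bf G} = {\bf D} + b_k \mbox{\boldmath $\xi$}\mbox{\boldmath $\xi$}^T$ by solving the secular equation with bisection; the parameter rho plays the role of $b_k$. It assumes $b_k > 0$, distinct $d_i$ sorted in increasing order, and nonzero $\xi_i$, in which case exactly one root lies in each interval $(d_i, d_{i+1})$ and one lies to the right of $d_n$.

// Minimal serial sketch (not the book's MPI/C++ code): eigenvalues of
// G = D + rho * xi xi^T obtained by solving the secular equation
//   f(lambda) = 1 + rho * sum_i xi_i^2 / (d_i - lambda) = 0
// with bisection.  Assumes rho > 0, the d_i distinct and increasing,
// and all xi_i nonzero.
#include <iostream>
#include <vector>

double secular(double lambda, double rho,
               const std::vector<double> &d, const std::vector<double> &xi) {
  double f = 1.0;
  for (unsigned i = 0; i < d.size(); ++i)
    f += rho * xi[i] * xi[i] / (d[i] - lambda);
  return f;
}

std::vector<double> SecularEigenvalues(double rho,
                                       const std::vector<double> &d,
                                       const std::vector<double> &xi) {
  int n = d.size();
  double xi2 = 0.0;                        // |xi|^2 bounds the largest root:
  for (int i = 0; i < n; ++i)              // lambda_max <= d_n + rho*|xi|^2
    xi2 += xi[i] * xi[i];

  std::vector<double> lambda(n);
  for (int i = 0; i < n; ++i) {
    double lo = d[i];                      // root i lies in (d_i, d_{i+1});
    double hi = (i + 1 < n) ? d[i + 1]     // the last root lies in
                            : d[n - 1] + rho * xi2;  // (d_n, d_n + rho*|xi|^2]
    // f is increasing on (lo, hi), negative just above lo and non-negative
    // near hi, so bisection converges to the unique root in the interval.
    for (int it = 0; it < 100; ++it) {
      double mid = 0.5 * (lo + hi);
      if (secular(mid, rho, d, xi) < 0.0) lo = mid; else hi = mid;
    }
    lambda[i] = 0.5 * (lo + hi);
  }
  return lambda;
}

int main() {
  // Tiny example: D = diag(1, 2, 3), rho = 0.5, xi = (1, 1, 1)^T.
  std::vector<double> d(3), xi(3, 1.0);
  d[0] = 1.0; d[1] = 2.0; d[2] = 3.0;
  std::vector<double> lambda = SecularEigenvalues(0.5, d, xi);
  for (unsigned i = 0; i < lambda.size(); ++i)
    std::cout << "lambda_" << i + 1 << " = " << lambda[i] << std::endl;
  return 0;
}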



Reference: J. Dongarra and D.C. Sorensen, ``A fully parallel algorithm for the symmetric eigenvalue problem'', SIAM J. Sci. and Stat. Comp., vol. 8, p. S139-S154, (1987).



 