Eigenvalues and Eigenvectors

The objective of this section is to find invariant subspaces of a linear operator. For a given vector space V over the field of complex numbers \( \mathbb{C} \) (or real numbers \( \mathbb{R} \) ), let \( T:\,V\,\to\,V \) be a linear transformation; we want to find subspaces M of V such that \( T(M) \subseteq M . \) The operator T can be a matrix transformation, a linear integral operator, or an unbounded linear differential operator. Obviously, the zero subspace {0} and the whole space V are invariant under every linear transformation. We refer to these as trivial invariant subspaces.

Let us begin by looking for one-dimensional invariant subspaces. If M is spanned by a nonzero vector x, that is, M = span {x}, then \( T(M) \subseteq M \) if and only if there exists a scalar \( \lambda \in \mathbb{C} \) such that Tx = λx.

If A is a square \( n\times n \) matrix (or a linear transformation), then a nonzero vector \( {\bf x} \in \mathbb{R}^n \) is called an eigenvector of A (or of a linear operator T) if Ax is a scalar multiple of x; that is,
\[ {\bf A}\,{\bf x} = \lambda \,{\bf x} \]
for some scalar λ. The scalar λ is called an eigenvalue of A (or of the linear operator T), and x is said to be an eigenvector corresponding to λ. An ordered pair \( \left( \lambda , {\bf x} \right) \) of an eigenvalue and a corresponding eigenvector is called an eigenpair.
The set of all eigenvalues of the matrix A or the linear transformation T is called the spectrum of A or of T and is denoted by σ(A) or σ(T), respectively.

In general, the image of a vector x under multiplication by a square matrix A (or under the corresponding linear transformation T) differs from x in both magnitude and direction. However, in the special case where x is an eigenvector of A, multiplication by A maps x onto the line spanned by x, changing only its length (and possibly its orientation). Let M be the subspace spanned by the eigenvector x. If y is another nonzero vector from M, then y = cx for some scalar c. Hence, \( T\,{\bf y} = T \left( c\,{\bf x} \right) = c\,T \,{\bf x} = \lambda\,c\,{\bf x} = \lambda \,{\bf y} . \) Thus, λ depends on the linear transformation T and the subspace M of eigenvectors, but not on a particular choice of a vector spanning M. So any nonzero vector \( {\bf z} \in V \) satisfying \( T\,{\bf z} = \lambda\,{\bf z} \) is an eigenvector of T corresponding to, or associated with, the eigenvalue λ.

The set of all eigenvectors corresponding to an eigenvalue, together with the zero vector, forms a vector space, called the eigenspace corresponding to that eigenvalue and denoted by \( E_{\lambda} \ \mbox{or}\ E(\lambda ) . \) The dimension of this eigenspace is called the geometric multiplicity of the eigenvalue.

Let I denote the identity operator (or the identity matrix) on the vector space V (or on \( \mathbb{R}^n \) ); then \( \lambda \in \mathbb{C} \) is an eigenvalue of the linear operator T (or of the corresponding matrix transformation \( T_A \) ) if and only if \( \lambda {\bf I} - T \) is not injective (one-to-one). The corresponding vector equation \( {\bf A} \,{\bf x} = \lambda\,{\bf x} , \) or \( \left( \lambda\,{\bf I} - {\bf A} \right) {\bf x} = {\bf 0} , \) has a nontrivial solution if and only if the matrix \( \lambda\,{\bf I} - {\bf A} \) is singular, that is, its determinant is zero: \( \det \left( \lambda\,{\bf I} - {\bf A} \right) =0 . \) Since this algebraic equation has degree n, we conclude that an n×n matrix has at most n distinct eigenvalues.

For a square n-by-n matrix A, the polynomial of degree n \( \chi_A (\lambda ) = \lambda^n + c_{n-1} \lambda^{n-1} + \cdots + c_0 = \det \left( \lambda\,{\bf I} - {\bf A} \right) \) is called the characteristic polynomial of matrix A.

We call the set Eλ of all eigenvectors together with the zero vector an eigenspace because Eλ is the kernel (null space) of the matrix \( \lambda\,{\bf I} - {\bf A} , \) which is known to be a vector space.
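Both objects can be computed directly in Mathematica. The sketch below uses a small made-up 2×2 matrix purely for illustration; note that the built-in CharacteristicPolynomial follows the convention det(A − λI), which differs from det(λI − A) used above by the factor (−1)ⁿ.
A = {{2, 1}, {0, 3}};                  (* hypothetical 2-by-2 example *)
CharacteristicPolynomial[A, lambda]    (* built-in convention: Det[A - lambda IdentityMatrix[2]] *)
Det[lambda IdentityMatrix[2] - A]      (* the convention used in the text; equal here since (-1)^2 = 1 *)
NullSpace[2 IdentityMatrix[2] - A]     (* a basis of the eigenspace E_2, namely {{1, 0}} *)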

Now suppose that p(λ) is an arbitrary polynomial. Then for any square matrix A we can define a matrix polynomial according to the formula:

\[ p(\lambda ) = a_m \lambda^m + a_{m-1} \lambda^{m-1} + \cdots + a_0 \qquad \Longrightarrow \qquad p \left({\bf A}\right) = a_m {\bf A}^m + a_{m-1} {\bf A}^{m-1} + \cdots + a_0 {\bf I} . \]
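As a minimal Mathematica sketch of this substitution (the matrix and the polynomial p(λ) = λ² − 3λ + 2 below are chosen only for illustration):
A = {{1, 2}, {0, 3}};                                  (* illustrative matrix *)
pA = MatrixPower[A, 2] - 3 A + 2 IdentityMatrix[2]     (* the matrix polynomial p(A) = A^2 - 3A + 2I *)
Note that the constant term \( a_0 \) enters as \( a_0 {\bf I} , \) a multiple of the identity matrix, not as an entrywise addition.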

Theorem: Every linear operator (or matrix) on a finite-dimensional complex vector space has an eigenvalue.

To show that a linear operator T on an n-dimensional vector space V has an eigenvalue, fix any nonzero vector \( {\bf x} \in V . \) The vectors \( {\bf x}, T\,{\bf x} , T^2 {\bf x} , \ldots , T^n {\bf x} \) cannot be linearly independent because V has dimension n and we have n+1 vectors. Thus, there exist complex numbers \( a_0 , a_1 , \ldots , a_n , \) not all 0, such that
\[ a_0 {\bf x} + a_1 T\,{\bf x} + \cdots + a_n T^n {\bf x} = {\bf 0} . \]
Make the a's the coefficients of a polynomial, which can be written in factored form as
\[ a_0 + a_1 \lambda + \cdots + a_n \lambda^n = c \left( \lambda - \lambda_1 \right) \cdots \left( \lambda - \lambda_s \right) , \]
where c is a non-zero complex number, each λj is complex, and the equation holds for all complex λ. Then we have
\[ {\bf 0} = \left( a_0 {\bf I} + a_1 T + \cdots + a_n T^n \right) {\bf x} = c \left( T - \lambda_1 {\bf I} \right) \cdots \left( T - \lambda_s {\bf I}\right) {\bf x} , \]
which means that \( T - \lambda_j {\bf I} \) is not injective for at least one j. In other words, T has an eigenvalue. ■

Theorem: Let \( \left( \lambda , {\bf x} \right) \) be an eigenpair of a square matrix A, and let p(λ) be a polynomial. Then

\[ p \left( {\bf A} \right) {\bf x} = p \left( \lambda \right) {\bf x} , \]
that is, \( \left( p \left( \lambda \right) , {\bf x} \right) \) is an eigenpair of p(A). ■

It is sufficient to show that \( {\bf A}^k {\bf x} = \lambda^k {\bf x} \) for all \( k \ge 0 ; \) the statement for a general polynomial then follows by taking linear combinations of these identities. We proceed by induction on k.

For the base case k=0, observe that \( {\bf A}^0 = {\bf I} , \) the identity matrix, and \( \lambda^0 = 1 , \) so that \( {\bf A}^0 {\bf x} = {\bf x} = \lambda^0 {\bf x} . \) For the inductive step, suppose that \( {\bf A}^k {\bf x} = \lambda^k {\bf x} \) for some \( k \ge 0 . \) Then

\[ {\bf A}^{k+1} {\bf x} = {\bf A} \left( {\bf A}^k {\bf x} \right) = {\bf A} \left( \lambda^k {\bf x} \right) = \lambda^k \left( {\bf A}\,{\bf x} \right) = \lambda^k \left( \lambda \,{\bf x} \right) = \lambda^{k+1} {\bf x} . \]
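A quick numerical confirmation of the theorem in Mathematica; the matrix, its eigenpair (3, [1, 1]ᵀ), and the polynomial p(λ) = λ² + 1 below are illustrative assumptions, not part of the proof:
A = {{2, 1}, {1, 2}};                          (* illustrative matrix with eigenpair (3, {1, 1}) *)
x = {1, 1}; lambda = 3;
pA = MatrixPower[A, 2] + IdentityMatrix[2];    (* p(A) for p(lambda) = lambda^2 + 1 *)
pA.x == (lambda^2 + 1) x                       (* True, confirming p(A) x = p(lambda) x *)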

Theorem: For every n-by-n matrix A, eigenvectors associated with distinct eigenvalues are linearly independent. ■

According to the Lagrange interpolation theorem, there are polynomials \( p_1 (\lambda ), p_2 (\lambda ) , \ldots , p_s (\lambda ) \) such that
\[ p_i \left( \lambda_k \right) = \begin{cases} 1 , & \ \mbox{ if } i=k , \\ 0 , & \ \mbox{ if } i \ne k , \end{cases} \]
for each \( i = 1, 2, \ldots , s . \) Suppose that \( c_1 , c_2 , \ldots , c_s \) are scalars such that, for eigenvectors \( {\bf e}_1 , {\bf e}_2 , \ldots , {\bf e}_s \) associated with s distinct eigenvalues \( \lambda_1 , \lambda_2 , \ldots , \lambda_s , \) we have
\[ c_1 {\bf e}_1 + c_2 {\bf e}_2 + \cdots + c_s {\bf e}_s = {\bf 0} . \]
We must show that each ci = 0. The previous theorem ensures that \( p_i \left( {\bf A} \right) {\bf e}_k = p_i \left( \lambda_k \right) {\bf e}_k , \) so for each \( i = 1, 2, \ldots , s \)
\begin{align*} {\bf 0} &= p_i \left( {\bf A} \right) {\bf 0} = p_i \left( {\bf A} \right) \left( c_1 {\bf e}_1 + c_2 {\bf e}_2 + \cdots + c_s {\bf e}_s \right) \\ &= c_1 p_i \left( {\bf A} \right) {\bf e}_1 + c_2 p_i \left( {\bf A} \right) {\bf e}_2 + \cdots + c_s p_i \left( {\bf A} \right) {\bf e}_s \\ &= c_1 p_i \left( \lambda_1 \right) {\bf e}_1 + c_2 p_i \left( \lambda_2 \right) {\bf e}_2 + \cdots + c_s p_i \left( \lambda_s \right) {\bf e}_s \\ &= c_i {\bf e}_i . \end{align*}
Therefore, each ci = 0. ■
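The interpolation polynomials \( p_i \) used in the proof above can be produced with Mathematica's InterpolatingPolynomial; a short sketch with three made-up distinct eigenvalues 0, -1, 4:
p1 = InterpolatingPolynomial[{{0, 1}, {-1, 0}, {4, 0}}, lambda];       (* p1 equals 1 at lambda = 0 and 0 at the other eigenvalues *)
Simplify[{p1 /. lambda -> 0, p1 /. lambda -> -1, p1 /. lambda -> 4}]   (* {1, 0, 0} *)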

Theorem: For every square matrix A, if λ is an eigenvalue of A, then its complex conjugate \( \overline{\lambda} \) is an eigenvalue of the adjoint matrix \( {\bf A}^{\ast} = \overline{{\bf A}^{\mathrm T}} = \left( \overline{\bf A} \right)^{\mathrm T} , \) and \( \mbox{dim}\,E_{\lambda} \left( {\bf A} \right) = \mbox{dim} \,E_{\overline{\lambda}} \left( {\bf A}^{\ast} \right) . \)

Let v be an eigenvector of T corresponding to the eigenvalue λ. For an arbitrary vector u, we compute the inner product in two ways:
\[ \left\langle T^{\ast} {\bf u} , {\bf v} \right\rangle = \left\langle {\bf u} , T\,{\bf v} \right\rangle = \left\langle {\bf u} , \lambda\,{\bf v} \right\rangle = \left\langle \overline{\lambda} \,{\bf u} , {\bf v} \right\rangle . \]
Hence \( \left\langle \left( T^{\ast} - \overline{\lambda}\,{\bf I} \right) {\bf u} , {\bf v} \right\rangle = 0 \) for every vector u, so v is orthogonal to the range of \( T^{\ast} - \overline{\lambda}\,{\bf I} . \) This operator is therefore not surjective and, on a finite-dimensional space, not injective either. So we see that \( \overline{\lambda} \) is an eigenvalue of the adjoint operator. ■
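The statement is easy to test in Mathematica; the 2×2 matrix below is an arbitrary illustration (recall that Mathematica writes the imaginary unit as I):
A = {{0, -1}, {1, 0}};                          (* illustrative real matrix with eigenvalues I and -I *)
Sort[Eigenvalues[ConjugateTranspose[A]]] == Sort[Conjugate[Eigenvalues[A]]]   (* True *)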
Example: Consider the matrix
\[ {\bf A} = \begin{bmatrix} 1&1&5 \\ 1&1&1 \\ 1&1&1 \end{bmatrix} . \]
Its characteristic polynomial is
\[ \det \left( \lambda {\bf I} - {\bf A} \right) = \det \begin{bmatrix} \lambda -1&-1&-5 \\ -1&\lambda -1&-1 \\ -1&-1&\lambda -1 \end{bmatrix} = \lambda \left( \lambda +1 \right) \left( \lambda -4 \right) . \]
Therefore, the given 3×3 matrix has three distinct real eigenvalues: λ = 0, λ = -1, and λ = 4. We check with Mathematica:
A = {{1,1,5}, {1,1,1}, {1,1,1}}
Eigenvalues[A]
To find the associated eigenvectors, we have to solve the corresponding three systems of equations (the third equation of each system is a linear combination of the two displayed and is omitted):
\[ (a)\quad \begin{cases} x_1 + x_2 + 5\,x_3 &= 0 , \\ x_1 + x_2 + x_3 &=0, \end{cases} \qquad (b) \quad \begin{cases} x_1 + x_2 + 5\,x_3 &= -x_1 , \\ x_1 + x_2 + x_3 &= - x_2 , \end{cases} \qquad (c) \quad \begin{cases} x_1 + x_2 + 5\,x_3 &= 4\,x_1 , \\ x_1 + x_2 + x_3 &= 4\, x_2 . \end{cases} \]
Solving system (a), we get x3 = 0 and x1 = -x2. Therefore, the eigenvectors associated with the eigenvalue λ = 0 are
\[ {\bf e}_0 = \begin{pmatrix} -x_2 \\ x_2 \\ 0 \end{pmatrix} = x_2 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} , \]
where x2 is an arbitrary nonzero scalar. Since the corresponding eigenspace E0 is spanned by the vector \( \left[ -1, 1, 0 \right]^{\mathrm T} , \) its geometric multiplicity is 1. Solving system (b), we find that x1 = -3x2 and x3 = x2. Therefore, the associated eigenvector becomes
\[ {\bf e}_m = \begin{pmatrix} -3\,x_2 \\ x_2 \\ x_2 \end{pmatrix} = x_2 \begin{pmatrix} -3 \\ 1 \\ 1 \end{pmatrix} . \]
So the eigenspace E-1 is spanned by the vector \( \left[ -3, 1, 1 \right]^{\mathrm T} . \) Finally, solving system (c), we obtain x1 = 2x2 and x3 = x2. Hence, the eigenvectors associated with λ = 4 are given by the formula
\[ {\bf e}_4 = \begin{pmatrix} 2\,x_2 \\ x_2 \\ x_2 \end{pmatrix} = x_2 \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} . \]
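Each of the three eigenpairs found above can be verified directly in Mathematica by checking that A e = λ e:
A = {{1,1,5}, {1,1,1}, {1,1,1}};
A.{-1, 1, 0} == 0 {-1, 1, 0}       (* True: eigenpair (0, [-1, 1, 0]^T) *)
A.{-3, 1, 1} == -{-3, 1, 1}        (* True: eigenpair (-1, [-3, 1, 1]^T) *)
A.{2, 1, 1} == 4 {2, 1, 1}         (* True: eigenpair (4, [2, 1, 1]^T) *)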
Fortunately, Mathematica is capable of doing all these tedious calculations for you:
A = {{1,1,5}, {1,1,1}, {1,1,1}}
Eigenvectors[A]
or, with a single command, one can find the eigenpairs:
Eigensystem[A]

Example: Consider the matrix
\[ {\bf A} = \begin{bmatrix} 16&15&8 \\ 14&13&7 \\ 9&6&4 \end{bmatrix} . \]
Its characteristic polynomial is
\[ \det \left( \lambda {\bf I} - {\bf A} \right) = \det \begin{bmatrix} \lambda -16&-15&-8 \\ -14&\lambda -13&-7 \\ -9&-6&\lambda -4 \end{bmatrix} = \left( \lambda +1 \right)^2 \left( \lambda -1 \right) . \]
The eigenvalue λ = -1 has two linearly independent eigenvectors:
\[ {\bf e}_1 = \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix} \qquad\mbox{and} \qquad {\bf e}_2 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} . \]
The eigenvalue λ = 1 has the eigenspace spanned by the vector \( \left[ 2, 1, 1 \right]^{\mathrm T} . \)

Example: The matrix depending on a parameter θ
\[ {\bf A}_{\theta} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin \theta & \cos\theta \end{bmatrix} , \qquad 0 < \theta \le \frac{\pi}{2} , \]
induces a counterclockwise rotation of \( \mathbb{R}^2 \) through θ radians around the origin. Evaluating the determinant of \( \lambda\,{\bf I} - {\bf A}_{\theta} , \)
\[ \det \left( \lambda\,{\bf I} - {\bf A}_{\theta} \right) = \det \begin{bmatrix} \lambda -\cos\theta & \sin\theta \\ -\sin \theta & \lambda - \cos\theta \end{bmatrix} = \lambda^2 -2\lambda \,\cos\theta +1 , \]
we find the eigenvalues by equating it to zero:
\[ \lambda^2 -2\lambda \,\cos\theta +1 =0 \qquad \Longrightarrow \qquad \lambda_{\pm} = \cos\theta \pm \sqrt{\cos^2 \theta -1} = \cos\theta \pm {\bf j} \sqrt{1- \cos^2 \theta} = \cos\theta \pm {\bf j} \sin \theta = e^{\pm{\bf j} \theta} . \]
Here j is the unit vector in the positive vertical direction on the complex plane, so \( {\bf j}^2 = -1 . \) Therefore, the given 2×2 matrix has two simple complex conjugate eigenvalues both having magnitude 1.

To find eigenvectors corresponding to λ+, we need to compute nonzero solutions of the homogeneous equation

\[ x_1 \cos \theta - x_2 \sin\theta = e^{{\bf j} \theta} \, x_1 = x_1 \left( \cos \theta + {\bf j}\,\sin\theta \right) \]
because the second equation \( x_1 \sin \theta + x_2 \cos\theta = x_2 e^{{\bf j} \theta} \) is a multiple of the above one. Solving it, we get \( x_1 = {\bf j} \,x_2 . \) Hence, \( \left( e^{{\bf j} \theta} , \left[ 1 , -{\bf j} \right]^{\mathrm T} \right) \) is an eigenpair of \( {\bf A}_{\theta} . \) Taking the complex conjugate, we conclude that \( \left( e^{-{\bf j} \theta} , \left[ 1 , {\bf j} \right]^{\mathrm T} \right) \) is another eigenpair. The pair of eigenvectors \( \left[ 1 , -{\bf j} \right]^{\mathrm T} , \left[ 1 , {\bf j} \right]^{\mathrm T} \) forms a basis for \( \mathbb{C}^2 . \) It is noteworthy that the eigenvalues of Aθ depend on \( \theta \in \left( 0, \frac{\pi}{2} \right] , \) but the associated eigenvectors do not: each of the vectors \( \left[ 1 , \pm {\bf j} \right]^{\mathrm T} \) is an eigenvector of all the matrices Aθ, independently of θ.
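These eigenpairs can be confirmed symbolically in Mathematica, writing θ as the symbol t and the imaginary unit j as I:
Atheta = {{Cos[t], -Sin[t]}, {Sin[t], Cos[t]}};
Simplify[Atheta.{1, -I} - Exp[I t] {1, -I}]     (* {0, 0}: ( e^(j t), [1, -j]^T ) is an eigenpair *)
Simplify[Atheta.{1, I} - Exp[-I t] {1, I}]      (* {0, 0}: the conjugate eigenpair *)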

Example: For an integer \( n \ge 2 , \) the identity matrix \( {\bf I}_n \) of dimensions \( n \times n \) has the single eigenvalue λ = 1, repeated n times. Any basis of \( \mathbb{R}^n \) consists of eigenvectors of \( {\bf I}_n \) because every nonzero vector is an eigenvector of \( {\bf I}_n \) associated with λ = 1. ■

Example: The infinite set of monomials \( \left\{ 1, x, x^2 , \ldots , x^n , \ldots \right\} \) forms a basis for the vector space of all polynomials. ■

A defective matrix is a square matrix that does not have a complete basis of eigenvectors.
In other words, a square matrix A is called defective if A has an eigenvalue λ of algebraic multiplicity m greater than 1, but for which the associated eigenspace has a basis of fewer than m vectors; that is, the dimension of the eigenspace associated with λ is less than m.
Example: Consider the matrix
\[ {\bf A} = \begin{bmatrix} 24 & -10&4 \\ 62 & - 26 & 11 \\ 25&-11&6 \end{bmatrix} \]
that has only two distinct eigenvalues, λ = 1 and λ = 2. Its characteristic polynomial \( \chi (\lambda ) = \det \left( \lambda {\bf I} - {\bf A} \right) = \left( \lambda -2 \right) \left( \lambda -1 \right)^2 \) has two roots: λ2 = 2, which is simple, and λ1 = 1, which has algebraic multiplicity 2. To these two eigenvalues correspond only two linearly independent eigenvectors:
\[ {\bf e}_2 = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} \qquad\mbox{and} \qquad {\bf e}_1 = \begin{bmatrix} 2 \\ 5 \\ 1 \end{bmatrix} . \]
Since we have only two linearly independent eigenvectors in the three-dimensional space \( \mathbb{R}^3 , \) the given matrix is defective. ■
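The deficiency can be confirmed in Mathematica by computing the characteristic polynomial and the dimensions of the two eigenspaces:
A = {{24, -10, 4}, {62, -26, 11}, {25, -11, 6}};
Factor[CharacteristicPolynomial[A, lambda]]   (* roots lambda = 1 (double) and lambda = 2; the built-in convention is Det[A - lambda I] *)
NullSpace[A - IdentityMatrix[3]]              (* a single basis vector, so the geometric multiplicity of lambda = 1 is 1 *)
NullSpace[A - 2 IdentityMatrix[3]]            (* a single basis vector for the simple eigenvalue lambda = 2 *)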

Theorem: For every n-by-n matrix A, the geometric multiplicity of each eigenvalue is less than or equal to the algebraic multiplicity of that eigenvalue. ■

Let L0 be the eigenspace of A corresponding to an eigenvalue λ0, and let dim L0 = m. We denote by \( {\bf e}_1 , {\bf e}_2 , \ldots , {\bf e}_m \) a basis of the eigenspace L0 and extend it to a basis of \( \mathbb{R}^n \) by adjoining vectors \( {\bf e}_{m+1} , {\bf e}_{m+2} , \ldots , {\bf e}_n . \) Since \( {\bf A} \,{\bf e}_i = \lambda_0 {\bf e}_i , \ i=1,2,\ldots , m , \) it follows that the matrix A is similar to a block upper triangular matrix
\[ {\bf A} \sim \begin{bmatrix} {\bf \Lambda}_0 & {\bf B}_{m \times (n-m)} \\ {\bf 0}_{(n-m)\times m} & {\bf C}_{(n-m)\times (n-m)} \end{bmatrix} , \]
where Λ0 is a diagonal m-by-m matrix all of whose diagonal entries are equal to λ0. Hence, the characteristic polynomial of transformation corresponding to matrix A can be written as
\[ \det \left( \lambda {\bf I} - {\bf A} \right) = \left( \lambda - \lambda_0 \right)^m Q_{n-m} (\lambda ) , \]
where Qn-m(λ) is a polynomial of degree n-m. Obviously, m cannot exceed the multiplicity of the root λ0 of the characteristic polynomial. ■

Example: We consider so-called N-matrices, which have nonzero entries on the main diagonal and in the first and last columns:

\[ {\bf A} = \begin{bmatrix} a_{11} &0&0&0& \cdots &a_{1n} \\ a_{21} &a_{22} &0&0& \cdots &a_{2n} \\ a_{31} &0&a_{33} &0 & \cdots & a_{3n} \\ \vdots&\vdots&\vdots&\vdots& \ddots&\vdots \\ a_{n1} &0&0&0& \cdots & a_{nn} \end{bmatrix} . \]
All other entries are assumed to be zero. Because of the zeros in columns 2 through (n-1) of an N-matrix, computation of its characteristic polynomial det(λI - A) is very easy. Expanding by cofactors along each of these columns, we obtain
\begin{align*} \det \left( \lambda {\bf I} - {\bf A} \right) &= \left( \lambda - a_{22} \right) \left( \lambda - a_{33} \right) \cdots \left( \lambda - a_{(n-1)(n-1)}\right) \det \begin{bmatrix} \lambda - a_{11} & -a_{1n} \\ -a_{n1} & \lambda - a_{nn} \end{bmatrix} \\ &= \left( \lambda - a_{22} \right) \left( \lambda - a_{33} \right) \cdots \left( \lambda - a_{(n-1)(n-1)}\right) \left[ \lambda^2 - \lambda \left( a_{11} + a_{nn} \right) + \left( a_{11} a_{nn} - a_{1n} a_{n1} \right) \right] . \end{align*}
By proper selection of the elements in an N-matrix, one can produce a great variety of examples of matrices with desired eigenvalues. For example, the following 3 × 3 N-matrix has the characteristic polynomial
\[ \det \left( \lambda {\bf I} -{\bf A} \right) = \left( \lambda - d \right) \left(\lambda - a -b \right) \left( \lambda - a + b \right) , \qquad \mbox{for} \quad {\bf A} = \begin{bmatrix} a&0&b \\ c&d& c \\ b & 0 & a \end{bmatrix} . \]
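A symbolic check of this factorization in Mathematica (recall that the built-in CharacteristicPolynomial uses the convention det(A − λI), which for a 3 × 3 matrix is the negative of det(λI − A)):
A3 = {{a, 0, b}, {c, d, c}, {b, 0, a}};
Factor[CharacteristicPolynomial[A3, lambda]]   (* factors as (d - lambda)(a - b - lambda)(a + b - lambda), in agreement with the formula above *)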
By choosing appropriate values of the entries, one can obtain a diagonalizable matrix with either all distinct eigenvalues or a double eigenvalue (for instance, when d = a - b). More variety is seen in the 4 × 4 matrix
\[ {\bf A} = \begin{bmatrix} a&0&0& b \\ c_1 & d & 0 & c_1 \\ c_2 & 0 & e & c_2 \\ b&0&0&a \end{bmatrix} . \]
Eigenvalues[{{a, 0, 0, b}, {c1, d, 0, c1}, {c2, 0, e, c2}, {b, 0, 0, a}}]
{a - b, a + b, d, e}
    ■

 

  1. Councilman, S., Eigenvalues and eigenvectors of "N-matrices", The American Mathematical Monthly, 1986, Vol. 93, No. 5, pp. 392--395. doi: 10.2307/2323607