We show how to define a function of a square matrix using a
diagonalization procedure. This method applies only to diagonalizable
square matrices and is not suitable for defective matrices. Recall that a matrix A is called diagonalizable if there exists a nonsingular matrix S such that
\[
\mathbf{S}^{-1}\mathbf{A}\,\mathbf{S} = \mathbf{\Lambda}
\]
is a diagonal matrix. In other words, the matrix A is similar to a diagonal matrix.
An \(n \times n\) square matrix is diagonalizable if and only if it has n linearly independent eigenvectors, that is, if and only if the geometric
multiplicity of each eigenvalue equals its algebraic multiplicity. Then the matrix S can be built from eigenvectors of A,
column by column.
Let A be a square diagonalizable matrix, and let Λ be the corresponding diagonal matrix of its eigenvalues:
\[
\mathbf{\Lambda} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix},
\]
where \(\lambda_1 , \lambda_2 , \ldots , \lambda_n\) are eigenvalues (that may be equal) of the matrix A.
Let \(\mathbf{v}_1 , \mathbf{v}_2 , \ldots , \mathbf{v}_n\) be linearly independent eigenvectors, corresponding to these eigenvalues.
We build the nonsingular matrix S from these eigenvectors (every column is an eigenvector):
\[
\mathbf{S} = \left[ \mathbf{v}_1 \ \mathbf{v}_2 \ \cdots \ \mathbf{v}_n \right].
\]
For any reasonable function f (we do not make this notion precise; it suffices for f to be smooth) defined on the spectrum (the set of all eigenvalues) of the diagonalizable matrix
A, we define the function of this matrix by the formula
\[
f(\mathbf{A}) = \mathbf{S}\, f(\mathbf{\Lambda})\, \mathbf{S}^{-1} = \mathbf{S} \begin{bmatrix} f(\lambda_1) & 0 & \cdots & 0 \\ 0 & f(\lambda_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda_n) \end{bmatrix} \mathbf{S}^{-1} .
\]
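This procedure is easy to code. Here is a minimal Mathematica sketch of the formula above; the helper name fDiag is ours (not a built-in), and it assumes the input matrix is diagonalizable:

fDiag[f_, a_?SquareMatrixQ] := Module[{vals, vecs, s},
  {vals, vecs} = Eigensystem[a];   (* eigenvalues and eigenvectors of a *)
  s = Transpose[vecs];             (* eigenvectors become the columns of S *)
  s . DiagonalMatrix[f /@ vals] . Inverse[s]   (* S f(Λ) S^(-1) *)
]

For instance, fDiag[Sqrt, A] should reproduce the principal square root constructed in the first example below.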
Example: Consider the matrix
\[
\mathbf{A} = \begin{bmatrix} 1 & 4 & 16 \\ 18 & 20 & 4 \\ -12 & -14 & -7 \end{bmatrix}
\]
that has three distinct eigenvalues \(\lambda_1 = 9\), \(\lambda_2 = 4\), \(\lambda_3 = 1\). Mathematica confirms:
A = {{1,4,16},{18,20,4},{-12,-14,-7}}
Eigenvalues[A]
Out[2]= {9, 4, 1}
Eigenvectors[A]
Out[3]= {{1, -2, 1}, {4, -5, 2}, {4, -4, 1}}
Using the eigenvectors, we build the transition matrix S (every column is an eigenvector):
\[
\mathbf{S} = \begin{bmatrix} 1 & 4 & 4 \\ -2 & -5 & -4 \\ 1 & 2 & 1 \end{bmatrix} .
\]
Then we are ready to construct eight (it is \(2^3\) roots
because each square root of an eigenvalue has two values; for
instance, \(\sqrt{9} = \pm 3\)) square roots of this positive definite matrix:
\[
\sqrt{\mathbf{A}} = \mathbf{S} \begin{bmatrix} \pm 3 & 0 & 0 \\ 0 & \pm 2 & 0 \\ 0 & 0 & \pm 1 \end{bmatrix} \mathbf{S}^{-1} ,
\]
with an appropriate choice of roots on the diagonal. In particular, for the positive roots 3, 2, and 1 we get
\[
\sqrt{\mathbf{A}} = \begin{bmatrix} 3 & 4 & 8 \\ 2 & 2 & -4 \\ -2 & -2 & 1 \end{bmatrix} .
\]
We check with Mathematica for these specific roots of the eigenvalues; any of the other sign combinations can be taken in the same way.
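A short sketch of that check (the names S and R are ours; A is the matrix defined above):

S = Transpose[{{1, -2, 1}, {4, -5, 2}, {4, -4, 1}}];   (* eigenvectors as columns *)
R = S . DiagonalMatrix[{3, 2, 1}] . Inverse[S];        (* the root with +3, +2, +1 *)
R . R == A                                             (* True *)

Replacing {3, 2, 1} by any of the eight sign patterns {±3, ±2, ±1} produces the remaining square roots.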
Two matrix functions built from these square roots,
\[
\mathbf{\Phi}(t) = \frac{\sin \left( t\,\sqrt{\mathbf{A}} \right)}{\sqrt{\mathbf{A}}} \qquad\mbox{and}\qquad \mathbf{\Psi}(t) = \cos \left( t\,\sqrt{\mathbf{A}} \right)
\]
(both are independent of the choice of square root), are the unique solutions of the following initial value
problems:
\[
\ddot{\mathbf{\Phi}} + \mathbf{A}\,\mathbf{\Phi} = \mathbf{0}, \qquad \mathbf{\Phi}(0) = \mathbf{0}, \quad \dot{\mathbf{\Phi}}(0) = \mathbf{I},
\]
and
\[
\ddot{\mathbf{\Psi}} + \mathbf{A}\,\mathbf{\Psi} = \mathbf{0}, \qquad \mathbf{\Psi}(0) = \mathbf{I}, \quad \dot{\mathbf{\Psi}}(0) = \mathbf{0}.
\]
■
Example: Consider the matrix
\[
\mathbf{A} = \begin{bmatrix} -20 & -42 & -21 \\ 6 & 13 & 6 \\ 12 & 24 & 13 \end{bmatrix}
\]
that has two distinct eigenvalues \(\lambda_1 = 4\) and \(\lambda_2 = \lambda_3 = 1\). Mathematica confirms:
A = {{-20, -42, -21}, {6, 13, 6}, {12, 24, 13}}
Eigenvalues[A]
Out[2]= {4, 1, 1}
Eigenvectors[A]
Out[3]= {{-7, 2, 4}, {-1, 0, 1}, {-2, 1, 0}}
Since the double eigenvalue \(\lambda = 1\) has
two linearly independent eigenvectors, the given matrix is
diagonalizable, and we are able to build the transition matrix from its eigenvectors:
\[
\mathbf{S} = \begin{bmatrix} -7 & -1 & -2 \\ 2 & 0 & 1 \\ 4 & 1 & 0 \end{bmatrix} .
\]
For three functions of the matrix, the exponential \(e^{\mathbf{A}\,t}\) and the trigonometric functions \(\mathbf{\Phi}(t) = \frac{\sin \left( t\,\sqrt{\mathbf{A}} \right)}{\sqrt{\mathbf{A}}}\) and \(\mathbf{\Psi}(t) = \cos \left( t\,\sqrt{\mathbf{A}} \right)\), we construct the corresponding matrix functions via \(f(\mathbf{A}) = \mathbf{S}\,f(\mathbf{\Lambda})\,\mathbf{S}^{-1}\). These matrix functions are the unique solutions of the following initial
value problems:
\[
\dot{\mathbf{X}} = \mathbf{A}\,\mathbf{X}, \quad \mathbf{X}(0) = \mathbf{I}; \qquad
\ddot{\mathbf{\Phi}} + \mathbf{A}\,\mathbf{\Phi} = \mathbf{0}, \quad \mathbf{\Phi}(0) = \mathbf{0}, \ \dot{\mathbf{\Phi}}(0) = \mathbf{I}; \qquad
\ddot{\mathbf{\Psi}} + \mathbf{A}\,\mathbf{\Psi} = \mathbf{0}, \quad \mathbf{\Psi}(0) = \mathbf{I}, \ \dot{\mathbf{\Psi}}(0) = \mathbf{0}.
\]
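As a sketch, the exponential matrix, say, can be reproduced in Mathematica and compared with the built-in MatrixExp (the names S and expAt are ours; A is the matrix defined above):

S = Transpose[{{-7, 2, 4}, {-1, 0, 1}, {-2, 1, 0}}];   (* eigenvectors as columns *)
expAt = S . DiagonalMatrix[{Exp[4 t], Exp[t], Exp[t]}] . Inverse[S];
Simplify[expAt == MatrixExp[A t]]                      (* True *)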
■
Example: Consider the matrix
\[
\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 2 & -6 & -4 \end{bmatrix}
\]
that has two complex conjugate eigenvalues \(\lambda_{1,2} = 1 \pm 2i\) and one real eigenvalue \(\lambda_3 = -2\).
Mathematica confirms:
A = {{1, 2, 3}, {2, 3, 4}, {2, -6, -4}}
Eigenvalues[A]
Out[2]= {1 + 2 I, 1 - 2 I, -2}
Eigenvectors[A]
Out[3]= {{-1 - I, -2 - I, 2}, {-1 + I, -2 + I, 2}, {-7, -6, 11}}
We build the transition matrix from its eigenvectors:
\[
\mathbf{S} = \begin{bmatrix} -1 - i & -1 + i & -7 \\ -2 - i & -2 + i & -6 \\ 2 & 2 & 11 \end{bmatrix} .
\]
Now we are ready to define a function of the given square matrix. For example, if \(f(\lambda) = e^{\lambda\,t}\),
we obtain the corresponding exponential matrix:
\[
e^{\mathbf{A}\,t} = \mathbf{S} \begin{bmatrix} e^{(1+2i)\,t} & 0 & 0 \\ 0 & e^{(1-2i)\,t} & 0 \\ 0 & 0 & e^{-2t} \end{bmatrix} \mathbf{S}^{-1} .
\]
The matrix function \(e^{\mathbf{A}\,t}\) is the unique solution of the following
matrix initial value problem:
\[
\frac{{\text d}}{{\text d}t}\,\mathbf{X}(t) = \mathbf{A}\,\mathbf{X}(t), \qquad \mathbf{X}(0) = \mathbf{I},
\]
where I is the identity matrix.
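As a sketch (the names S and expAt are ours; A is the matrix defined above), the same exponential matrix can be built from the complex eigenvalues and reduced to a real form:

S = Transpose[{{-1 - I, -2 - I, 2}, {-1 + I, -2 + I, 2}, {-7, -6, 11}}];
expAt = S . DiagonalMatrix[{Exp[(1 + 2 I) t], Exp[(1 - 2 I) t], Exp[-2 t]}] . Inverse[S];
Simplify[ComplexExpand[expAt]]     (* real entries built from E^t Cos[2 t], E^t Sin[2 t], and E^(-2 t) *)
Simplify[expAt == MatrixExp[A t]]  (* True *)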
■
Theorem:
For a square matrix A, the geometric multiplicity of any of its eigenvalues
is less than or equal to its algebraic multiplicity.
▣
Let λ be an eigenvalue of an n×n matrix A, and suppose that the
dimension of its eigenspace, ker(λI - A), is k. Let
\(\mathbf{x}_1 , \mathbf{x}_2 , \ldots , \mathbf{x}_k\) be a
basis for this eigenspace. We build the n×k matrix X from these
eigenvectors:
\[
\mathbf{X} = \left[ \mathbf{x}_1 \ \mathbf{x}_2 \ \cdots \ \mathbf{x}_k \right].
\]
We complete the matrix X with an n×(n-k) matrix X′ so that the square
n×n matrix S = [X X′] becomes invertible. Then
\[
\mathbf{A}\,\mathbf{X} = \lambda\,\mathbf{X},
\]
so
\[
\mathbf{S}^{-1}\mathbf{A}\,\mathbf{X} = \lambda\,\mathbf{S}^{-1}\mathbf{X} = \lambda \begin{bmatrix} \mathbf{I}_k \\ \mathbf{0} \end{bmatrix}.
\]
Compute
\[
\mathbf{S}^{-1}\mathbf{A}\,\mathbf{S} = \begin{bmatrix} \lambda\,\mathbf{I}_k & \mathbf{B} \\ \mathbf{0} & \mathbf{C} \end{bmatrix}
\]
for some k×(n-k) matrix B and some (n-k)×(n-k) matrix C. Since similar matrices have the same
characteristic polynomial, we get
\[
\chi_{\mathbf{A}}(z) = \det \left( z\mathbf{I} - \mathbf{S}^{-1}\mathbf{A}\,\mathbf{S} \right) = (z - \lambda)^k \det \left( z\mathbf{I}_{n-k} - \mathbf{C} \right).
\]
Consequently, λ is a root of χA(z) = 0 with
multiplicity at least k.
■
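For a quick illustration in Mathematica (the matrix here is our own choice), a 2×2 Jordan block shows that the inequality can be strict:

J = {{2, 1}, {0, 2}};   (* the eigenvalue 2 has algebraic multiplicity 2 *)
Eigenvalues[J]          (* {2, 2} *)
Eigenvectors[J]         (* {{1, 0}, {0, 0}}: only one independent eigenvector, so the geometric multiplicity is 1 *)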
Theorem: Let T be a linear operator on an n-dimensional vector space
V. Then T is diagonalizable if and only if its minimal polynomial ψ(λ) is the product of simple terms:
\[
\psi(\lambda) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s \right),
\]
where \(\lambda_1 , \lambda_2 , \ldots , \lambda_s\) are distinct scalars (which are the eigenvalues
of T).
▣
Suppose that T is diagonalizable. Let \(\lambda_1 , \lambda_2 , \ldots , \lambda_s\)
be the distinct eigenvalues of T, and define
\[
p(\lambda) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s \right).
\]
We know that p(λ) divides the minimal polynomial ψ(λ) of T. Let
\(\beta = \{ \mathbf{v}_1 , \mathbf{v}_2 , \ldots , \mathbf{v}_n \}\) be a basis for
V consisting of eigenvectors of T, and consider one vector \(\mathbf{v}_i\) from β. Then
\(\left( T - \lambda_i I \right) \mathbf{v}_i = \mathbf{0}\) for some eigenvalue \(\lambda_i\).
Since λ - λi divides p(λ), there is a polynomial q(λ) such that
\(p(\lambda) = q(\lambda) \left( \lambda - \lambda_i \right)\). Hence
\[
p(T)\,\mathbf{v}_i = q(T) \left( T - \lambda_i I \right) \mathbf{v}_i = \mathbf{0} .
\]
It follows that \(p(T) = \mathbf{0}\) since p(T) moves each element of the basis β for
V into the zero vector. Since the minimal polynomial divides every polynomial that annihilates T, we also have ψ(λ) dividing p(λ); therefore, p(λ) is the minimal polynomial.
Conversely, suppose that there are distinct scalars \(\lambda_1 , \lambda_2 , \ldots , \lambda_s\)
such that the minimal polynomial factors as
\[
\psi(\lambda) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s \right).
\]
According to the previous theorem, all λi are eigenvalues of T. We apply mathematical induction on
n = dim(V). Clearly, T is diagonalizable for n = 1. Now suppose that T is
diagonalizable whenever dim(V) < n for some n > 1, and suppose that dim(V) = n. Let
U be the range of the transformation \(\lambda_s I - T\). Clearly \(\dim(U) < n\)
because λs is an eigenvalue of T. If \(\dim(U) = 0\), then
T = λsI, which is clearly diagonalizable. So suppose that 0 < dim(U) < n.
Then U is T-invariant, and for any \(\mathbf{x} \in U\),
\[
\left( T - \lambda_1 I \right) \left( T - \lambda_2 I \right) \cdots \left( T - \lambda_{s-1} I \right) \mathbf{x} = \mathbf{0} .
\]
It follows that the minimal polynomial for TU, the restriction of T to the subspace U,
divides the polynomial \(\left( \lambda - \lambda_1 \right) \cdots \left( \lambda - \lambda_{s-1} \right)\). Hence, by the induction hypothesis, TU
is diagonalizable. Furthermore, λs is not an eigenvalue of TU. Therefore,
\[
U \cap \ker \left( \lambda_s I - T \right) = \{ \mathbf{0} \} .
\]
Now let \(\beta_1 = \{ \mathbf{v}_1 , \ldots , \mathbf{v}_m \}\) be a basis for
U consisting of eigenvectors of TU (and hence of T), and let
\(\beta_2 = \{ \mathbf{w}_1 , \ldots , \mathbf{w}_k \}\) be a basis for
the kernel of \(\lambda_s I - T\), the eigenspace of T corresponding to λs.
Then β1 and β2 are disjoint. Also observe that m + k = n by the dimension theorem applied to
\(\lambda_s I - T\). We show that \(\beta = \beta_1 \cup \beta_2\) is
linearly independent. Consider scalars \(a_1 , \ldots , a_m , b_1 , \ldots , b_k\) such that
\[
a_1 \mathbf{v}_1 + \cdots + a_m \mathbf{v}_m + b_1 \mathbf{w}_1 + \cdots + b_k \mathbf{w}_k = \mathbf{0} .
\]
Let
\[
\mathbf{x} = a_1 \mathbf{v}_1 + \cdots + a_m \mathbf{v}_m , \qquad \mathbf{y} = b_1 \mathbf{w}_1 + \cdots + b_k \mathbf{w}_k .
\]
Then \(\mathbf{x} \in U\), \(\mathbf{y} \in \ker \left( \lambda_s I - T \right)\), and
x + y = 0. It follows that
\[
\mathbf{x} = - \mathbf{y} \in U \cap \ker \left( \lambda_s I - T \right) = \{ \mathbf{0} \},
\]
and therefore x = 0. Since β1 is linearly independent, we have
\(a_1 = \cdots = a_m = 0\). Similarly, \(b_1 = \cdots = b_k = 0\),
and we conclude that β is a linearly independent subset of
V consisting of n eigenvectors of T, and therefore, T is diagonalizable.
■
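Since Mathematica has no built-in minimal polynomial for matrices, here is a small sketch (the helper name matrixMinPoly is ours) that finds the first linear dependence among the powers I, A, A², …:

matrixMinPoly[a_?SquareMatrixQ, x_] := Module[{powers, null, c},
  Do[
   powers = Table[Flatten[MatrixPower[a, j]], {j, 0, m}];
   null = NullSpace[Transpose[powers]];   (* coefficients c with c . {I, A, ..., A^m} == 0 *)
   If[null =!= {},
    c = First[null];
    Return[Expand[(c/Last[c]) . x^Range[0, m]]]],   (* normalize to a monic polynomial *)
   {m, 1, Length[a]}]
]

For the diagonalizable matrix of the second example, matrixMinPoly[{{-20, -42, -21}, {6, 13, 6}, {12, 24, 13}}, x] returns 4 - 5 x + x^2, that is, (x - 1)(x - 4): a product of simple terms, as the theorem predicts.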
Theorem:
An n × n matrix A is diagonalizable if and
only if A has n linearly independent eigenvectors.
▣
If the matrix A is diagonalizable, so that \(\mathbf{S}^{-1}\mathbf{A}\,\mathbf{S} = \mathbf{\Lambda}\) is diagonal, then \(\mathbf{A}\,\mathbf{S} = \mathbf{S}\,\mathbf{\Lambda}\) shows that the columns of S are n linearly independent eigenvectors of A. Conversely, if A has n linearly independent eigenvectors, they can be written
as column vectors to form a nonsingular matrix S,
and then \(\mathbf{S}^{-1}\mathbf{A}\,\mathbf{S}\) is the diagonal
matrix having the eigenvalues of A as its diagonal
entries. Thus, if A is not diagonalizable, then A does
not have n linearly independent eigenvectors, and we cannot form
the matrix S.
■
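In Mathematica, this criterion can be checked directly with the built-in DiagonalizableMatrixQ (available in recent versions), or by counting independent eigenvectors:

A = {{-20, -42, -21}, {6, 13, 6}, {12, 24, 13}};
DiagonalizableMatrixQ[A]                    (* True *)
MatrixRank[Eigenvectors[A]] == Length[A]    (* True: three linearly independent eigenvectors *)

For a defective matrix such as {{2, 1}, {0, 2}}, both tests fail: DiagonalizableMatrixQ returns False, and the rank of Eigenvectors is 1 < 2.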