Recall that a monic polynomial\( p(\lambda ) = \lambda^s + a_{s-1} \lambda^{s-1} + \cdots
+ a_1 \lambda + a_0 \) is the polynomial with leading term to be 1. The Cayley--Hamilton theorem tells us that
for any square n × n matrix A, there exists a polynomial p(λ) in one variable λ
that annihilates A, namely, \( p({\bf A}) = {\bf 0} \) is zero matrix. This
theorem specifies that the characteristic polynomial \( \chi (\lambda ) = \det
\left(\lambda {\bf I} - {\bf A} \right) \) is an annihilator for A. Hence, there is a
polynomial of least degree with this property of degree at most n. If g(λ) is such a polynomial, we
can divide g(λ) by its leading coefficient to obtain another polynomial ψ(λ) of the
same degree with leading coefficient 1, that is, ψ(λ) is a monic polynomial.
Let A be a square n × n matrix. The minimal polynomialψ(λ) for A
is the monic polynomial of least positive degree that annihilates the matrix: ψ(A) is zero matrix.
Theorem: Let V be a vector space of dimension n over the
field of either real numbers \( \mathbb{R} \) or complex numbers \( \mathbb{C} . \)
Let \( T\,:\,V \to V \) be a linear transformation. Then the minimal polynomial
ψ(λ), that is the monic polynomial of minimum degree for which \( \psi (T) =0, \) is
of degree less than or equal to n. ■
We prove theorem using mathematical induction on the dimension of V.
Suppose that dimension of V is 1. Then any nonzero vector from V is a constant multiple of a generating
vector x. Since T is a linear transformation, \( T{\bf x} = k{\bf x} , \)
for some scalar k. Hence, \( \left( k\,I - T \right) {\bf x} = {\bf 0} , \) where
I is the identity map on V. This shows that ψ(λ) = λ - k is the minimal
polynomial of T. Since the degree of ψ(λ) is 1, we see that the theorem is true for the case n=1.
Now, to make an induction on the dimension, we assume that the theorem is true for all spaces U of dimension less than
n. Let dimV = n, and suppose that x is a non-zero vector in V. Then the n+1
vectors x, Tx, T^{2}x, ... , T^{n}x are linearly
dependent to that there is a set { a_{0}, a_{1}, ... , a_{n} } of scalars, not all of them zero, such that
Writing \( g(\lambda ) = a_0 + a_1 \lambda + a_2 \lambda^2 + \cdots + a_n \lambda^n , \)
we see that degree \( \mbox{deg}\,g(\lambda ) \le n \) and g(T)x = 0.
If g(T)V = 0, then because ψ(λ) is the minimal polynomial, we have
\( \mbox{deg}\,g(\lambda ) \le n \) and so the theorem holds.
Suppose now that \( g(T)\,V \ne 0 . \) Let \( U = \left\{ {\bf v} \,:\, g(T){\bf v} = {\bf v} \right\} \)
be the set of vectors annihilated by g(T). Then U is a proper subspace of V and its dimension is
r < n. By the inductional assumption, there is a polynomial m(λ) of degree less than or
equal to r such that m(T)U = 0. Moreover, \( T\,U \subseteq U \) since
g(T) commutes with T, so that U is an invariant subspace of V. Now consider the
quotient spaceV/U. We have
dimV/U = n-r and so, by the inductional hypothesis again, there is a polynomial m_{2}
of degree \( \le n-r \) such that \( m_2 (T) V/U =0. \) This
means that \( m_2 (T) V \subseteq U , \) so that \( m(T)\,m_2 (T)\,V =0 . \)
Writing \( h(\lambda ) = m(\lambda )\, m_2 (\lambda ) , \) we have h(T)V = 0 so that is
ψ(λ) is the minimal polynomial, so
Thus the theorem holds in all cases and the proof is complete.
Theorem: Let A be a square n × n matrix.
If p(λ) is a nonconstant polynomial that annihilates matrix A, then the
minimal polynomial ψ(λ) of matrix A divides the annihilating polynomial p(λ). ■
Let p(λ) be a polynomial for which \( p({\bf A}) = {\bf 0} . \) By division
algorithm for polynomials (see section Polynomials), there exist polynomials q(x)
and r(x) such that
\[
p (x ) = q(x) \,\psi (x) + r(x) ,
\]
where r(x) has degree less than the degree of ψ(x). Substituting A into the above equation
and using the identity \( \psi ({\bf A}) = p({\bf A}) = {\bf 0} , \) we have
\( r ({\bf A}) = {\bf 0} . \) Since r(x) has degree less than ψ(x) and
ψ(x) is the minimal polynomial of A, r(x) must be the zero polynomial. Then we can simplify
\( p (x ) = q(x) \,\psi (x) , \) proving the theorem.
Corollary: The minimal polynomial for any square matrix is unique. ■
Theorem: Let A be a square n × n matrix and
ψ(λ) be its minimal polynomial. A scalar λ (real
or complex) is an eigenvalue of A if and only if
\( \psi (\lambda ) =0 . \) Hence, the characteristic polynomial and the minimal polynomial
have the same zeroes. ■
Let χ(λ) be the characteristic polynomial of matrix A. Since the minimal polynomial divides
χ(λ), we have χ(λ) = q(λ) ψ(λ) for some polynomial q(λ). Let λ
be a zero of the minimal polynomial ψ(λ). Then
the minimal polynomial for A must be either \( \left( \lambda -1 \right) \left( \lambda -2 \right) \)
or \( \left( \lambda -1 \right)^2 \left( \lambda -2 \right) . \) Substituting A
into \( \psi (\lambda ) = \left( \lambda -1 \right) \left( \lambda -2 \right) \) show
that \( \psi ({\bf A} ) = {\bf 0} , \) and hence ψ is the minimal polynomial for A.
Therefore, the minimal polynomial for matrix B is its characteristic polynomial:
\( \psi (\lambda ) = \chi (\lambda ) = \left( \lambda -1 \right)^2 \left( \lambda -2 \right) . \)
Example: Let \( \texttt{D} \) be the linear derivative
operator on the space P_{n} of polynomials of
degree up to n, with basis of monomials \( 1, x, x^2 , \ldots , x^n . \)
Then in this basis for P_{n}, the derivative operator is expressed via matrix
Its characteristic polynomial is \( \chi (\lambda ) = \lambda^n \) because
\( \lambda{\bf I} - \texttt{D} \) is an upper triangular matrix. Since
\( \texttt{D}^n t^n \ne 0 , \) the minimal polynomial for \( \texttt{D} \) must be
λ^{n}.
If we consider the derivative operator on space of all polynomials (of arbitrary degree), then there exist no
polynomial for which \( p \left(\texttt{D}\right) = 0. \) Hence
\( \texttt{D} \) has no minimal polynomial in this space.
A projection is a linear transformation P from a vector space to itself such that P^{2} = P.
That is, whenever P is applied twice to any value, it gives the same result as if it were applied once
(idempotent). It leaves its image unchanged.
A projection (idempotent) matrix always has two eigenvalues of 1 and 0 because its minimum polynomial is
\( \psi (\lambda ) = \lambda \left( \lambda -1 \right) . \)
Example: Let us consider the 3 × 3 matrix from the first example, which has the minimal polynomial
\( \psi (\lambda ) = \left( \lambda -1 \right) \left( \lambda -2 \right) . \) For this matrix
correspond two projection matrices on the corresponding eigenspaces:
It is not had to verify that \( {\bf P}_1^2 = {\bf P}_1 \) and \( {\bf P}_2^2 = {\bf P}_2 . \)
The characteristic polynomial of P_{1} is \( \chi_1 (\lambda ) = \det \left( \lambda {\bf I} - {\bf P}_1 \right) =
\lambda \left( \lambda -1 \right)^2 , \) while the characteristic polynomial of P_{2} is
\( \chi_2 (\lambda ) = \det \left( \lambda {\bf I} - {\bf P}_2 \right) =
\lambda^2 \left( \lambda -1 \right) . \)
Recall that a permutation matrix is a square binary matrix that has exactly one entry of 1 in each
row and each column and zeroes elsewhere. Each such matrix of size n, say P, represents a permutation of n
elements and, when used to multiply another matrix, say A, results in permuting the rows
(when pre-multiplying, i.e., PA) or columns (when post-multiplying, AP) of the
matrix A.
Its characteristic polynomial is \( \chi (\lambda ) = \det\left( \lambda {\bf I} - {\bf P} \right) =
\left( \lambda -1 \right)^2 \left( \lambda +1 \right) . \) However, its minimal polynomial is a product of two
terms: \( \psi (\lambda ) = \left( \lambda -1 \right) \left( \lambda +1 \right) = \lambda^2 -1 \)
because \( {\bf P}^2 = {\bf I} , \) the identity matrix.
We consider also another permutation matrix
\( {\bf P} = \begin{bmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \end{bmatrix} . \) Since this matrix
satisfies the equation P^{3} = I, the permutation matrix has one real eigenvalue λ = 1
(which is always the case because some power of a permutation matrix is the identity matrix), and two complex conjugate
eigenvalues \( \lambda_{2,3} = - \frac{1}{2} \pm {\bf j}\, \frac{\sqrt{3}}{2} \) that are
the roots of the quadratic equation \( \lambda^2 + \lambda + 1 =0 . \) Hence the minimal polynomial
for the given permutation matrix is the same as its characteristic polynomial:
\( \psi (\lambda ) = \left( \lambda -1 \right) \left(\lambda^2 + \lambda +1 \right) = \lambda^3 -1 . \)
has four distinct eigenvalues 1, -1, j, -j because its fourth power is the identity matrix:
P^{4} = I. Hence, the minimal polynomial for this permutation matrix is
\( \psi (\lambda ) = \lambda^4 -1 . \)
■
The following theorem gives a practical way to determine the minimal polynomial of a square matrix A without
actual checking which possible choice obtained from the characteristic polynomial
Theorem (V. Dobrushkin): Let A be a square n × n matrix and
\( {\bf R}_A (\lambda ) = \left( \lambda {\bf I} - {\bf A} \right)^{-1} = \frac{1}{\chi (\lambda )}
\, \left[ \mbox{Cof}_{j,i} \right] \) be its resolvent, where [Cof]_{i,j} is the cofactor matrix for
\( \lambda {\bf I} - {\bf A} \) and \( \chi (\lambda ) = \det \left(
\lambda {\bf I} - {\bf A} \right) . \) Upon reducing common factors (if any), the resolvent can be written as
where ψ(λ) is the minimal polynomial for A. It is assumed that ψ(λ) and every entry
of matrix Q are relatively prime. ■
Theorem: Let T be a linear operator on an n-dimensional vector space
V such that V is a T-cyclic subspace of itself. Then the characteristic polynomial χ(λ)
and the minimal polynomial ψ(λ) have the same degree, so χ(λ) = ψ(λ).
■
Theorem: Let p(λ) be an annihilating polynomial
of an n × n matrix A. Every eigenvalue λ is a root of p. That is,
\( p(\lambda ) =0 . \)
■
Let \( (\lambda , {\bf x}) \) be an eigenpair of A. Then
Theorem: If square matrices A and B are similar
(that is, \( {\bf A} = {\bf T}^{-1} {\bf B} {\bf T} \) for some nonsingular matrix T),
then their minimal polynomials are the same: \( \psi_A (\lambda ) = \psi_B (\lambda ). \)
■
Corollary: If matrix C is the direct sum of two matrices,
\( {\bf C} = {\bf A} \oplus {\bf B} , \) then its minimal polynomial is the product of corresponding minimal polynomials:
\( \psi_C (\lambda ) = \psi_A (\lambda )\,\psi_B (\lambda ) . \) ■
Theorem: Let T be a linear operator on a finite-dimensional
vector space, and let ψ(λ) be its minimal polynomial. Then T is invertible if and only if
\( \psi (0 ) \ne 0 . \)
■
Let T be invertible and \( \psi (\lambda ) = \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_0 \)
be its minimal polynomial. Then multiplying the equation
\( \psi (T ) = T^n + a_{n-1} T^{n-1} + \cdots + a_0 I =0\) by T^{-1}, we get
It is obvious that a_{0} cannot be zero because we will get an annihilating polynomial of lesser degree than n. Hence
\( \psi (0 ) = a_0 \ne 0 . \)
Now suppose that \( \psi (0 ) = a_0 \ne 0 . \) Then transformation T has no zero eigenvalue.
So its determinant, \( \det T = \lambda_1 \lambda_2 \cdots \lambda_n \ne 0 \) and T
is invertible.
respectively. The block matrix A_{12} plays no role in determination of eigenvalues.
■
Theorem:
If n ≥ 2, the monic polynomial\( p(\lambda ) = \lambda^n +
a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0 \) is both the
minimal polynomial and the characteristic polynomial of its
companion matrix
First, we prove that the characteristic polynomial for the companion matrix is
\( p_n (\lambda ) = \lambda^n + a_{n-1} \lambda^{n-1} +
\cdots + a_1 \lambda + a_0 , \) by induction. For n = 2, we
have
Suppose that the statement is true for any dimension less than n, we
consider the characteristic polynomial for the companion matrix of size
n. Expanding along with the first column, we get
where \( p_{n-1} (\lambda ) = \lambda^{n-1} +
a_{n-1} \lambda^{n-2} + \cdots + a_2 \lambda + a_1 . \) Matrix in the
right-hand side is lower triangular and its determinant is the product of its
diagonal terms---in our case (-1). Since it has size n-1, the
determinant becomes (-1)^{n-1}, and we obtain the characteristic
polynomial to be p_{n}.
To prove that the minimal polynomial is the characteristic polynomial,
we trancate the companion matrix as
in which e_{2}, e_{3}, ... ,
e_{n} is the standard basis of ℂ^{n} and
\( {\bf a} = \left[ a_0 , a_1 , \ldots , a_{n-1} \right]^{\mathrm T} . \) Then we have C_{p}e_{n} =
- a and C_{p}e_{j-1} =
e_{j} for each j = 2, 3, ..., n. So
\( {\bf e}_j = {\bf C}_p {\bf e}_{j-1} = {\bf C}_p^2
{\bf e}_{j-2} = \cdots = {\bf C}_p^{j-1} {\bf e}_{1} \) for each
j = 1,2, ..., n. Then
Thus, \( p \left( {\bf C}_{p} \right) = {\bf 0} . \)
If g(z) = z^{m} + b_{m-1}z^{m-1} + ... + b_{1}z + b_{0} is any polynomial of degree less than n, then
The linear independence of e_{1}, e_{2}, ...,
e_{m+1} ensures that g(C_{p}) e_{1} ≠ 0, so g cannot annihilate C_{p}. This concludes the proof because the only monic polynomial that annihilates the companion matrix is the characteristic one.
M. D. Burrow, The Minimal Polynomial of a Linear Transformation, American
Mathematical Monthly, 80,
(1973), pp. 1129--1131.