We start with some important definitions. Our objective is to
define a function of a square matrix. We begin with polynomials
because to each polynomial \( q \left(
\lambda \right) = q_0 + q_1 \lambda + q_2 \lambda^2 + \cdots + q_n
\lambda^n \) we can naturally assign a matrix
\( q \left( {\bf A} \right) = q_0 {\bf I} + q_1
{\bf A} + q_2 {\bf A}^2 + \cdots + q_n {\bf A}^n , \) since we
know how to multiply matrices and how to add them.
A scalar polynomial \( q \left( \lambda \right) \) is called an annulled polynomial
(or annihilating polynomial) of the square matrix A if \( q \left( {\bf A} \right) = {\bf 0} , \)
with the understanding that \( {\bf A}^0 = {\bf I} , \) the identity matrix, replaces
\( \lambda^0 =1 \) when substituting
A for λ.
The minimal polynomial of a square
matrix A is a unique monic polynomial ψ of lowest
degree such that
\( \psi \left( {\bf A} \right) = {\bf 0} . \)
Every square matrix possesses a minimal polynomial.
A square matrix A whose characteristic
polynomial \( \chi (\lambda ) = \det \left( \lambda
{\bf I} - {\bf A} \right) \) (where I is the
identity matrix) coincides with its minimal polynomial
is called a nonderogatory matrix. A derogatory matrix
is one that is not nonderogatory.
Arthur Cayley (1821--1895) was a British mathematician, one of the founders of the modern British school of pure
mathematics. As a child, Cayley enjoyed solving complex math problems for amusement. He entered Trinity College,
Cambridge, where he excelled in Greek, French, German, and Italian, as well as mathematics. He worked as a lawyer
for 14 years. During this period of his life, Cayley produced between two and three hundred papers. Many of his
publications were results of collaboration with his friend J. J. Sylvester.
Sir William Rowan Hamilton (1805--1865) was an Irish physicist, astronomer, and mathematician, who made important
contributions to classical mechanics, optics, and algebra. He was the first foreign member of the American National
Academy of Sciences. Hamilton had a remarkable ability to learn languages,
including modern European languages, and Hebrew, Persian, Arabic, Hindustani, Sanskrit, and even Marathi and Malay.
Hamilton was part of a small but well-regarded school of mathematicians associated with Trinity College in Dublin,
which he entered at age 18. Paradoxically, the credit for discovering the quantity now called the Lagrangian and
Lagrange's equations belongs to Hamilton. In 1835, being secretary to the meeting of the British Association
which was held that year in Dublin, he was knighted by the lord-lieutenant.
James Joseph Sylvester (1814--1897) was an English mathematician. He made fundamental contributions to matrix
theory, invariant theory, number theory, partition theory, and combinatorics. Sylvester was born James Joseph in
London, England, but later he adopted the surname Sylvester when his older brother did so upon emigration to the
United States. At the age of 14, Sylvester was a student of Augustus De Morgan at the University of London.
His family withdrew him from the University after he was accused of stabbing a fellow student with a knife.
Subsequently, he attended the Liverpool Royal Institution.
However, Sylvester was not issued a degree, because graduates at that time were required to state their acceptance
of the Thirty-Nine Articles of the Church of England, and Sylvester could not do so because he was Jewish, the same
reason given in 1843 for his being denied appointment as Professor of Mathematics at Columbia College (now University)
in New York City. For the same reason, he was unable to compete for a Fellowship or obtain a Smith's prize. In 1838
Sylvester became professor of natural philosophy at University College London and in 1839 a Fellow of the Royal
Society of London. In 1841, he was awarded a BA and an MA by Trinity College, Dublin. In the same year he moved to
the United States to become a professor of mathematics at the University of Virginia, but left after less than four
months following a violent encounter with two students he had disciplined. He moved to New York City and began
friendships with the Harvard mathematician Benjamin Peirce and the Princeton physicist Joseph Henry, but in
November 1843, after his rejection by Columbia, he returned to England.
One of Sylvester's lifelong passions was for poetry; he read and translated works from the original French, German,
Italian, Latin and Greek, and many of his mathematical papers contain illustrative quotes from classical poetry. In
1876 Sylvester again crossed the Atlantic Ocean to become the inaugural professor of mathematics at the new
Johns Hopkins University in Baltimore, Maryland. His salary was $5,000 (quite generous for the time), which he
demanded be paid in gold. After negotiation, agreement was reached on a salary that was not paid in gold. In
1878 he founded the American Journal of Mathematics. In 1883, he returned to England to take up the Savilian
Professorship of Geometry at Oxford University. He held this chair until his death. Sylvester invented a great number
of mathematical terms such as "matrix" (in 1850), "graph" (combinatorics) and "discriminant." He coined the term
"totient" for Euler's totient function.
Theorem: (Cayley--Hamilton) Every square matrix A is annulled by its
characteristic polynomial, that is, \( \chi ({\bf A}) = {\bf 0} , \) where
\( \chi (\lambda ) = \det \left( \lambda {\bf I} - {\bf A} \right) . \) ■
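As a quick illustration of the Cayley--Hamilton theorem, the following Python sketch (not part of the MuPad session; the sample matrix is an arbitrary choice) evaluates \( \chi ({\bf A}) = {\bf A}^2 - \operatorname{tr}({\bf A})\,{\bf A} + \det ({\bf A})\,{\bf I} \) for a 2×2 matrix and confirms that it vanishes.

```python
# Verify the Cayley--Hamilton theorem for a 2x2 matrix: for A = [[a, b], [c, d]]
# the characteristic polynomial is chi(lambda) = lambda^2 - tr(A) lambda + det(A),
# so chi(A) = A^2 - tr(A)*A + det(A)*I must be the zero matrix.

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cayley_hamilton_residual(A):
    """Return chi(A) = A^2 - tr(A)*A + det(A)*I; should be the zero matrix."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A2 = matmul(A, A)
    I = [[1, 0], [0, 1]]
    return [[A2[i][j] - tr * A[i][j] + det * I[i][j] for j in range(2)]
            for i in range(2)]

A = [[-2, -5], [1, 4]]               # any square matrix works
print(cayley_hamilton_residual(A))   # -> [[0, 0], [0, 0]]
```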
Theorem: For polynomials p
and q and a square matrix A,
\( p \left( {\bf A} \right) = q \left( {\bf A}
\right) \) if and only if p and q take the same
values on the spectrum (set of all eigenvalues) of A; for an eigenvalue of
multiplicity \( m > 1 , \) this includes the values of the derivatives up to
order \( m - 1 . \)
Theorem: The minimal polynomial ψ of a square matrix A divides
any other annihilating polynomial p for which \( p \left( {\bf A} \right) = {\bf 0} . \)
In particular, the minimal polynomial divides the characteristic polynomial of A. ■
It follows from the above theorem that the minimal polynomial
is always a product of monomials \( \lambda -
\lambda_i \) built from the eigenvalues of A: since ψ divides the
characteristic polynomial, its roots can only be eigenvalues. Moreover,
every eigenvalue must actually appear, because applying
\( \psi ({\bf A}) = {\bf 0} \) to an eigenvector x of \( \lambda_i \)
gives \( \psi (\lambda_i )\, {\bf x} = {\bf 0} , \) so \( \psi (\lambda_i ) = 0 . \)
Theorem: The minimal polynomial for a
linear operator on a finite-dimensional vector space is unique.
Theorem: A square
matrix A is diagonalizable if and only if its
minimal polynomial is a product of simple
terms: \(
\psi (\lambda ) = \left( \lambda - \lambda_1 \right)
\left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s
\right) , \) where λ1, ... ,
λs are eigenvalues of matrix A. ■
Therefore, a square matrix A fails to be diagonalizable if
and only if its minimal polynomial contains at least one repeated factor
\( \left( \lambda - \lambda_i \right)^m \)
with m > 1.
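The role of repeated factors can be seen in a small Python sketch (the two matrices below are hypothetical examples, not taken from the text): both have characteristic polynomial \( (\lambda - 3)^2 , \) but the Jordan block is annihilated only by \( (\lambda - 3)^2 , \) so it is not diagonalizable, while the scalar matrix is annihilated already by \( \lambda - 3 . \)

```python
# Hypothetical 2x2 examples: both have characteristic polynomial (lambda - 3)^2.
J = [[3, 1], [0, 3]]   # Jordan block: minimal polynomial (lambda - 3)^2 -> not diagonalizable
D = [[3, 0], [0, 3]]   # scalar matrix: minimal polynomial (lambda - 3)   -> diagonalizable

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def shift(A, lam):
    """Return A - lam*I for a 2x2 matrix A."""
    return [[A[i][j] - (lam if i == j else 0) for j in range(2)] for i in range(2)]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

print(is_zero(shift(D, 3)))                        # True : (lambda - 3) annihilates D
print(is_zero(shift(J, 3)))                        # False: (lambda - 3) does not annihilate J
print(is_zero(matmul(shift(J, 3), shift(J, 3))))   # True : (lambda - 3)^2 annihilates J
```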
Theorem: Every matrix that commutes with a
square matrix A is a polynomial in A if and
only if A is nonderogatory.
Theorem: If A is upper
triangular and nonderogatory, then any solution X of the
equation \( f \left( {\bf X} \right) = {\bf A} \)
is upper triangular. ■
In MuPad, one can define the minimal polynomial using
the following code:
Then, to find the minimal polynomial of a previously defined square matrix
A, just enter
MatrixMinimalPolynomial[A, x]
Suppose that A is a diagonalizable \( n \times n \) matrix; this means
that none of its eigenvalues is defective and there exists a basis of n linearly independent eigenvectors.
Then its minimal polynomial (the polynomial of least possible degree that annihilates the matrix) is a product of
simple terms:
\( \psi (\lambda ) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_k \right) , \)
where \( \lambda_1 , \lambda_2 , \ldots , \lambda_k \) are the distinct eigenvalues of
matrix A ( \( k \le n \) ).
Let \( f(\lambda ) \) be a function defined on the spectrum of the matrix A. This condition means that
every eigenvalue \( \lambda_i \) is in the domain of f, and that every eigenvalue \( \lambda_i \) of multiplicity \( m_i > 1 \) is in the interior of the domain, with f being \( m_i - 1 \) times differentiable at \( \lambda_i . \) Following James Sylvester (who worked as a lawyer before taking up a professorship in mathematics), we build a function
\( f\left( {\bf A} \right) \) of a diagonalizable square matrix A. To define a function of a square matrix, we construct k
Sylvester auxiliary matrices, one for each distinct eigenvalue \( \lambda_i , \quad i= 1,2,\ldots , k: \)
\( {\bf Z}_i = \prod_{j \ne i} \frac{{\bf A} - \lambda_j {\bf I}}{\lambda_i - \lambda_j} . \)
Sylvester's auxiliary matrix \( {\bf Z}_i \) is
actually the i-th
Lagrange interpolating basis polynomial, evaluated at the matrix A. Now we define the function
\( f\left( {\bf A} \right) \) according to the
formula
\( f\left( {\bf A} \right) = \sum_{i=1}^k f\left( \lambda_i \right) {\bf Z}_i . \)
This formula was published by James Sylvester in 1883:
J. J. Sylvester. On the equation to the secular inequalities in the planetary
theory. Philosophical Magazine, 16:267–269, 1883.
Each Sylvester matrix \( {\bf Z}_i \) is a
projection matrix onto the eigenspace of the corresponding eigenvalue, so
\( {\bf Z}_i^2 = {\bf Z}_i , \ i = 1,2,\ldots , k. \)
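The product formula \( {\bf Z}_i = \prod_{j\ne i} ({\bf A} - \lambda_j {\bf I})/(\lambda_i - \lambda_j ) \) can be checked in a few lines of Python; the 2×2 matrix below, with eigenvalues 3 and 1, is a hypothetical example chosen for this sketch, and exact rational arithmetic keeps the checks exact.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sylvester_aux(A, eigs, i):
    """Z_i = product over j != i of (A - eigs[j] I) / (eigs[i] - eigs[j])."""
    n = len(A)
    Z = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    for j, lam in enumerate(eigs):
        if j == i:
            continue
        shifted = [[A[r][c] - (lam if r == c else 0) for c in range(n)]
                   for r in range(n)]
        Z = [[x / (eigs[i] - lam) for x in row] for row in matmul(Z, shifted)]
    return Z

# Hypothetical example (not from the text): eigenvalues 3 and 1
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
eigs = [3, 1]
Z = [sylvester_aux(A, eigs, i) for i in range(2)]

print([[float(x) for x in row] for row in Z[0]])   # [[0.5, 0.5], [0.5, 0.5]]
print(all(matmul(Zi, Zi) == Zi for Zi in Z))       # True: each Z_i is a projector
```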
respectively. Since the minimal polynomial \( \psi (\lambda ) = (\lambda -3)(\lambda +1) \) is a product of simple factors, we build Sylvester's auxiliary matrices:
\( \begin{pmatrix} -\frac{1}{4} & -\frac{5}{4} \\
\frac{1}{4} & \frac{5}{4} \end{pmatrix} \)
Check to make sure the matrices are mutually orthogonal
Z3*Zneg1
\( \begin{pmatrix} 0 & 0 \\
0 & 0 \end{pmatrix} \)
Check that the Sylvester auxiliary matrices add up to the identity matrix
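The matrix A of this example is not reproduced in this excerpt, but the displayed auxiliary matrix is consistent with the choice \( {\bf A} = \begin{pmatrix} -2 & -5 \\ 1 & 4 \end{pmatrix} \) (assumed here purely for illustration), whose eigenvalues are 3 and −1. A Python sketch with exact rational arithmetic then reproduces both checks:

```python
from fractions import Fraction

# Assumed for illustration: the example matrix is not reproduced in this
# excerpt, but A below has minimal polynomial (lambda - 3)(lambda + 1) and
# reproduces the displayed auxiliary matrix exactly.
A = [[Fraction(-2), Fraction(-5)], [Fraction(1), Fraction(4)]]
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Z_3 = (A - (-1) I)/(3 - (-1))  and  Z_{-1} = (A - 3 I)/((-1) - 3)
Z3   = [[(A[i][j] + I[i][j]) / 4 for j in range(2)] for i in range(2)]
Zneg = [[(A[i][j] - 3 * I[i][j]) / -4 for j in range(2)] for i in range(2)]

print([[float(x) for x in row] for row in Z3])   # [[-0.25, -1.25], [0.25, 1.25]]
print(matmul(Z3, Zneg) == [[0, 0], [0, 0]])      # True: mutually orthogonal
print([[Z3[i][j] + Zneg[i][j] for j in range(2)]
       for i in range(2)] == I)                  # True: they sum to the identity
```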
We consider two other functions \( \displaystyle
\Phi (\lambda ) = \frac{\sin \left( \sqrt{\lambda }\,t \right)}{\sqrt{\lambda}}
\) and \( \displaystyle
\Psi (\lambda ) = \cos \left( \sqrt{\lambda }\,t \right) . \)
It is not a problem to determine the corresponding matrix functions:
\(
\left[\left[1,1,\left[\left(\begin{array}{c} 4\\ -4\\ 1
\end{array}\right)\right]\right],\left[9,1,\left[\left(\begin{array}{c}
1\\ -2\\ 1
\end{array}\right)\right]\right],\left[4,1,\left[\left(\begin{array}{c}
2\\ - \frac{5}{2}\\ 1 \end{array}\right)\right]\right]\right] \)
To create the identity matrix for Sylvester's method, use the
following code
Identity:=matrix::identity(3)
\(
\left(\begin{array}{ccc} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1
\end{array}\right) \)
Define the three Sylvester auxiliary matrices for each eigenvalue
\(
\left(\begin{array}{ccc} -3 & -4 & -4\\ 6 & 8 & 8\\ -3 & -4 & -4
\end{array}\right) \)
We check with MuPad that these auxiliary matrices are
projectors: they must be mutually orthogonal
\( \left( {\bf Z}_i {\bf Z}_j = {\bf 0} \mbox{ for } i \ne j \right) \)
and each must equal its own square \( \left( {\bf Z}_i^2 = {\bf Z}_i \right) , \) that is,
\(
\left(\begin{array}{ccc} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0
\end{array}\right) \)
A projection matrix should have eigenvalues either zero or one; so we
check:
linalg::eigenvalues(Z1)
\(
\left\{0,1\right\}\)
linalg::eigenvalues(Z4)
\(
\left\{0,1\right\}\)
linalg::eigenvalues(Z9)
\(
\left\{0,1\right\}\)
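The 3×3 matrix of this example is likewise not shown in this excerpt, but it can be reassembled from the eigen-data reported above as \( {\bf A} = {\bf P}\,{\bf D}\,{\bf P}^{-1} . \) The Python sketch below does this with exact rational arithmetic and recovers the displayed auxiliary matrix for the eigenvalue 9:

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(M):
    """Gauss-Jordan inverse over exact rationals (M assumed invertible)."""
    n = len(M)
    aug = [[Fraction(M[i][j]) for j in range(n)] +
           [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Eigen-data reported above: eigenvalues 1, 9, 4 with eigenvectors
# (4,-4,1), (1,-2,1), (2,-5/2,1); reassemble A = P D P^{-1}.
P = [[4, 1, 2], [-4, -2, Fraction(-5, 2)], [1, 1, 1]]
D = [[1, 0, 0], [0, 9, 0], [0, 0, 4]]
A = matmul(matmul(P, D), inverse(P))

def shift(M, lam):
    return [[M[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]

# Z_9 = (A - 1*I)(A - 4*I) / ((9 - 1)(9 - 4))
Z9 = [[x / 40 for x in row] for row in matmul(shift(A, 1), shift(A, 4))]

print([[int(x) for x in row] for row in Z9])  # [[-3, -4, -4], [6, 8, 8], [-3, -4, -4]]
print(matmul(Z9, Z9) == Z9)                   # True: Z_9 is a projector
```

The recovered Z9 matches the matrix displayed earlier, which confirms the eigen-data and the projector interpretation at the same time.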
With Sylvester's matrices, we are able to construct arbitrary
functions of the given matrix. We start with
the square root function \( r(\lambda ) =
\sqrt{\lambda} . \) Indeed, choosing a
particular branch, we get
The other four roots are the negatives of the above matrices.
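As a sanity check on the square-root construction \( \sqrt{\bf A} = \sum_i \sqrt{\lambda_i}\,{\bf Z}_i , \) here is a small Python sketch on a hypothetical 2×2 matrix with eigenvalues 3 and 1 (principal branch for both eigenvalues); squaring the result recovers A:

```python
import math

# Hypothetical 2x2 example (not from the text): eigenvalues 3 and 1, with
# projectors Z3 and Z1; the principal branch of sqrt is chosen for both.
Z3 = [[0.5, 0.5], [0.5, 0.5]]      # projector onto the eigenspace of 3
Z1 = [[0.5, -0.5], [-0.5, 0.5]]    # projector onto the eigenspace of 1

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# sqrt(A) = sqrt(3) Z3 + sqrt(1) Z1
R = [[math.sqrt(3) * Z3[i][j] + 1.0 * Z1[i][j] for j in range(2)] for i in range(2)]
R2 = matmul(R, R)
print([[round(x, 10) for x in row] for row in R2])   # [[2.0, 1.0], [1.0, 2.0]] = A
# the other square roots of A come from flipping the sign of sqrt(lambda_i)
```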
We consider two other functions \( \displaystyle \Phi (\lambda ) =
\frac{\sin \left( t\,\sqrt{\lambda} \right)}{\sqrt{\lambda}} \) and \( \displaystyle \Psi (\lambda ) =
\cos \left( t\,\sqrt{\lambda} \right) . \) It is not a problem to determine the corresponding matrix functions
These two matrix functions are solutions of the matrix differential equation
\( \displaystyle \ddot{\bf P} (t) + {\bf A}\,{\bf P} (t) = {\bf 0} \) subject to
the initial conditions
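That \( \Psi ({\bf A}) \) solves \( \ddot{\bf P} + {\bf A}\,{\bf P} = {\bf 0} \) can also be checked without MuPad: since \( \frac{d^2}{dt^2} \cos \left( \sqrt{\lambda}\,t \right) = -\lambda \cos \left( \sqrt{\lambda}\,t \right) \) and \( {\bf A}\,{\bf Z}_i = \lambda_i {\bf Z}_i , \) the identity \( -\ddot{\Psi} = {\bf A}\,\Psi \) holds term by term. The Python sketch below verifies this numerically on a hypothetical 2×2 matrix (eigenvalues 3 and 1, with its two projectors):

```python
import math

# Hypothetical example: A with eigenvalues 3 and 1 and projectors Z3, Z1.
# Psi(A) = cos(sqrt(3) t) Z3 + cos(t) Z1, and since
# d^2/dt^2 cos(sqrt(l) t) = -l cos(sqrt(l) t), we must have
# A * Psi = 3 cos(sqrt(3) t) Z3 + 1 * cos(t) Z1 = -Psi''.
A  = [[2.0, 1.0], [1.0, 2.0]]
Z3 = [[0.5, 0.5], [0.5, 0.5]]
Z1 = [[0.5, -0.5], [-0.5, 0.5]]
t = 0.7

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Psi = [[math.cos(math.sqrt(3) * t) * Z3[i][j] + math.cos(t) * Z1[i][j]
        for j in range(2)] for i in range(2)]
minus_Psi_ddot = [[3 * math.cos(math.sqrt(3) * t) * Z3[i][j]
                   + math.cos(t) * Z1[i][j] for j in range(2)] for i in range(2)]

APsi = matmul(A, Psi)
ok = all(abs(APsi[i][j] - minus_Psi_ddot[i][j]) < 1e-12
         for i in range(2) for j in range(2))
print("Psi'' + A Psi = 0 verified at t =", t, ":", ok)
```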
These two complex auxiliary matrices are also projections: \( {\bf Z}_{1+2{\bf j}}^2 =
{\bf Z}_{1+2{\bf j}}^3 = {\bf Z}_{1+2{\bf j}} , \) and similar relations for its complex conjugate.
Now suppose you want to build a matrix function corresponding to the exponential function
\( f(\lambda ) = e^{\lambda \,t} . \) Using Sylvester's auxiliary matrices, we have
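A quick cross-check of the exponential formula \( e^{{\bf A}t} = \sum_i e^{\lambda_i t}\,{\bf Z}_i \) against the power series \( \sum_k ({\bf A}t)^k / k! \) on a hypothetical 2×2 matrix (eigenvalues 3 and 1, with its projectors):

```python
import math

# Hypothetical example: A with eigenvalues 3 and 1 and projectors Z3, Z1.
A  = [[2.0, 1.0], [1.0, 2.0]]
Z3 = [[0.5, 0.5], [0.5, 0.5]]
Z1 = [[0.5, -0.5], [-0.5, 0.5]]
t = 0.1

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Sylvester's formula: e^{A t} = e^{3t} Z3 + e^{t} Z1
sylv = [[math.exp(3 * t) * Z3[i][j] + math.exp(t) * Z1[i][j] for j in range(2)]
        for i in range(2)]

# Truncated Taylor series: sum_{k=0}^{24} (A t)^k / k!
series = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at I
term = [[1.0, 0.0], [0.0, 1.0]]     # current term (A t)^k / k!
At = [[A[i][j] * t for j in range(2)] for i in range(2)]
for k in range(1, 25):
    term = [[x / k for x in row] for row in matmul(term, At)]
    series = [[series[i][j] + term[i][j] for j in range(2)] for i in range(2)]

agree = all(abs(sylv[i][j] - series[i][j]) < 1e-12
            for i in range(2) for j in range(2))
print("Sylvester formula and Taylor series agree:", agree)
```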
Note: Although MuPad has a dedicated command for calculating the
Sylvester matrix
of two polynomials, we will not be using this command because we actually
utilize a completely different technique: Lagrange's interpolating polynomials,
extended to square matrices and called Sylvester's auxiliary matrices, as seen
in the examples presented on this page.