Exercises

Find the square roots of the 2 × 2 matrix with complex entries
\[ {\bf A} = \begin{bmatrix} 8 + 3{\bf j}&-7 + {\bf j} \\ -2+ {\bf j}&5-3{\bf j} \end{bmatrix} . \]
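One way to attack this exercise uses the 2 × 2 identity \( {\bf A}^2 = (\mbox{tr}\,{\bf A})\,{\bf A} - (\det {\bf A})\,{\bf I} , \) a special case of the Cayley--Hamilton theorem proved below. It yields \( {\bf B} = ({\bf A} + s{\bf I})/t , \) with s a square root of det A and t a square root of tr A + 2s, and B² = A whenever t ≠ 0; varying the two branch choices yields the other square roots. A quick numerical sketch in Python (the function names are illustrative):

```python
import cmath

# The exercise matrix, stored as nested lists of complex numbers
A = [[8 + 3j, -7 + 1j],
     [-2 + 1j, 5 - 3j]]

def sqrtm2(A):
    """One square root of a 2x2 complex matrix via Cayley--Hamilton:
    B = (A + s*I) / t with s = sqrt(det A), t = sqrt(tr A + 2s)."""
    (a, b), (c, d) = A
    s = cmath.sqrt(a * d - b * c)       # a branch of sqrt(det A)
    t = cmath.sqrt(a + d + 2 * s)       # a branch of sqrt(tr A + 2s)
    return [[(a + s) / t, b / t],
            [c / t, (d + s) / t]]

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

B = sqrtm2(A)
print(matmul2(B, B))   # should reproduce A (up to rounding)
```

Indeed \( {\bf B}^2 = ({\bf A}^2 + 2s{\bf A} + s^2 {\bf I})/t^2 = (\mbox{tr}\,{\bf A} + 2s)\,{\bf A}/t^2 = {\bf A} . \) For this matrix det A = 36 and tr A = 13, so one may take s = 6 and t = 5, giving \( {\bf B} = ({\bf A} + 6{\bf I})/5 . \)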
Samuelson's formula allows the characteristic polynomial to be computed recursively without divisions. See

Paul A. Samuelson, A Method of Determining Explicitly the Coefficients of the Characteristic Equation, The Annals of Mathematical Statistics, Volume 13, Number 4 (1942), 424--429.
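Samuelson's recursion is commonly implemented in the Toeplitz form later popularized by Berkowitz. The following pure-Python sketch (the function name and conventions are illustrative) computes det(λI − A) using only additions and multiplications:

```python
def berkowitz(A):
    """Characteristic polynomial det(lambda*I - A), computed without
    divisions by the Samuelson--Berkowitz recursion.  Returns the
    coefficients [1, c_{n-1}, ..., c_0], highest degree first."""
    n = len(A)
    poly = [1, -A[n - 1][n - 1]]               # char poly of the 1x1 corner
    for k in range(n - 2, -1, -1):
        m = n - 1 - k                          # size of the trailing block M
        a = A[k][k]
        R = A[k][k + 1:]                       # row to the right of a
        C = [A[i][k] for i in range(k + 1, n)] # column below a
        M = [row[k + 1:] for row in A[k + 1:]]
        # First column of the Toeplitz factor: 1, -a, -R.C, -R.M.C, ...
        col = [1, -a]
        v = C[:]
        for _ in range(m - 1):
            col.append(-sum(r * x for r, x in zip(R, v)))
            v = [sum(M[i][j] * v[j] for j in range(m)) for i in range(m)]
        col.append(-sum(r * x for r, x in zip(R, v)))
        # New coefficient vector = Toeplitz(col) . poly
        poly = [sum(col[i - j] * poly[j]
                    for j in range(len(poly)) if 0 <= i - j < len(col))
                for i in range(len(poly) + 1)]
    return poly

A = [[48, -30, -14, 1], [65, -41, -19, 0],
     [17, -10, -5, 3], [-35, 22, 10, 0]]
print(berkowitz(A))   # [1, -2, 0, 2, -1]
```

Applied to the 4 × 4 matrix of the example below, it recovers the coefficients of λ⁴ − 2λ³ + 2λ − 1.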

The characteristic polynomial of a matrix A may be computed in the Wolfram Language as CharacteristicPolynomial[A, lambda].
Example: Consider the matrix
\[ {\bf A} = \begin{bmatrix} 48&-30&-14&1 \\ 65&-41&-19&0 \\ 17&-10&-5&3 \\ -35&22&10&0 \end{bmatrix} . \]
Using Mathematica, we find its characteristic polynomial:
A = {{48, -30, -14, 1}, {65, -41, -19, 0}, {17, -10, -5, 3}, {-35, 22, 10, 0}}
CharacteristicPolynomial[A, x]
This gives \( \chi (\lambda ) = \lambda^4 - 2\lambda^3 + 2\lambda -1 = \left( \lambda -1 \right)^3 \left( \lambda +1 \right) . \) The matrix A is not diagonalizable because its minimal polynomial, which coincides with the characteristic polynomial, has the repeated root λ = 1.

Let T be a linear transformation on a vector space V. A subspace U of V is called a T-invariant subspace of V if \( T(U) \subseteq U , \) that is, if \( T({\bf v}) \in U \) for all \( {\bf v} \in U . \)

Example: Suppose that T is a linear operator on a vector space V. Then the following subspaces of V are T-invariant.
  1. The zero space { 0 }.
  2. The original space V.
  3. The range of T; that is, if T is defined by a matrix A in some basis, then the column space is A-invariant.
  4. The kernel (= null space) of T.
  5. The eigenspace Eλ for any eigenvalue λ of T.
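These invariance claims can be tested numerically. A small sketch (the helper names and the 2 × 2 example matrix are illustrative) checks membership u ∈ span(S) by comparing ranks computed exactly over the rationals:

```python
from fractions import Fraction

def rank(rows):
    """Row rank via exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def in_span(basis, u):
    """u lies in span(basis) iff appending u does not raise the rank."""
    return rank(basis + [u]) == rank(basis)

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# A has eigenvalue 1 with eigenvector (1, 1); that eigenspace is A-invariant
A = [[3, -2],
     [1, 0]]
v = [1, 1]
print(in_span([v], matvec(A, v)))   # True: A v stays in span{v}
```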

Let T be a linear operator on a vector space V, and let x be a nonzero element of V. The subspace
\[ U = \mbox{span} \left( \left\{ {\bf x} , T({\bf x}) , T^2 ({\bf x}) , \ldots \right\} \right) \]
is called the T-cyclic subspace of V generated by x.

A cyclic subspace is the smallest T-invariant subspace of the vector space V containing the vector x. We will use cyclic subspaces to establish the Cayley--Hamilton theorem, which is our main objective.

Example: Let us consider the derivative operator \( \texttt{D} = {\text d}/{\text d}x \) on the space of polynomials. Then the \( \texttt{D} \)-cyclic subspace generated by \( x^2 \) is span \( \left\{ x^2 , 2x, 2 \right\} = P_2 (\mathbb{R}) . \)
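This example can be reproduced with a short sketch in which a polynomial is represented by its list of coefficients (the helper name deriv is illustrative):

```python
def deriv(p):
    """Derivative of a polynomial given by coefficients [c0, c1, c2, ...]."""
    return [i * c for i, c in enumerate(p)][1:] or [0]

# D-cyclic subspace generated by x^2: iterate the operator until it vanishes
p = [0, 0, 1]                    # the polynomial x^2
chain = [p]
while any(c != 0 for c in chain[-1]):
    chain.append(deriv(chain[-1]))
chain.pop()                      # drop the zero polynomial
print(chain)                     # [[0, 0, 1], [0, 2], [2]]
```

The three polynomials x², 2x, 2 have distinct degrees, hence are linearly independent and span the three-dimensional space \( P_2 (\mathbb{R}) . \)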

Let A be a square n × n matrix and \( p(\lambda ) = a_s \lambda^s + a_{s-1} \lambda^{s-1} + \cdots + a_1 \lambda + a_0 \) be a polynomial in the variable λ of degree s. A polynomial for which \( p({\bf A} ) = {\bf 0} \) is called an annihilating polynomial for the matrix A; one also says that p(λ) is an annihilator of the matrix A.

An annihilating polynomial for a given square matrix is not unique: any polynomial multiple of an annihilating polynomial again annihilates the matrix.

Example: We reconsider the matrix
\[ {\bf A} = \begin{bmatrix} 48&-30&-14&1 \\ 65&-41&-19&0 \\ 17&-10&-5&3 \\ -35&22&10&0 \end{bmatrix} . \]
Then we ask Mathematica to verify that the characteristic polynomial \( \chi (\lambda ) = \left( \lambda -1 \right)^3 \left( \lambda +1 \right) \) annihilates the matrix:
A = {{48, -30, -14, 1}, {65, -41, -19, 0},{17, -10, -5, 3}, {-35, 22, 10, 0}}
p[x_] = CharacteristicPolynomial[A , x]
Out[2]= -1 + 2 x - 2 x^3 + x^4
A.A.A.A -2*A.A.A +2*A -IdentityMatrix[4]
Out[3]= {{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}
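The same verification can be carried out in pure Python by evaluating χ(A) with Horner's scheme, which avoids forming the powers of A separately (the function names are illustrative):

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate p(A) by Horner's scheme; coeffs are listed highest
    degree first, e.g. [1, -2, 0, 2, -1] for x^4 - 2x^3 + 2x - 1."""
    n = len(A)
    P = [[0] * n for _ in range(n)]
    for c in coeffs:
        P = matmul(P, A)
        for i in range(n):
            P[i][i] += c                 # P := P*A + c*I
    return P

A = [[48, -30, -14, 1], [65, -41, -19, 0],
     [17, -10, -5, 3], [-35, 22, 10, 0]]
chi = [1, -2, 0, 2, -1]                  # coefficients of (x-1)^3 (x+1)
print(poly_at_matrix(chi, A))            # the 4x4 zero matrix
```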

Theorem: Let T be a linear operator on a finite-dimensional vector space V, and let U be a T-invariant subspace of V. Then the characteristic polynomial of TU divides the characteristic polynomial of T.

We choose a basis \( \gamma = \left\{ {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_k \right\} \) for U and extend it to a basis \( \beta = \left\{ {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_k , \ldots , {\bf v}_n \right\} \) for V. Let A be the matrix representation of the operator T in the basis β, and let B1 be the matrix of the restricted operator TU in γ. Then A has the block form:
\[ {\bf A} = \begin{bmatrix} {\bf B}_1 & {\bf B}_2 \\ {\bf 0}_{(n-k)\times k} & {\bf B}_3 \end{bmatrix} , \]
where the matrices B2 and B3 have the appropriate sizes. Let χ(λ) be the characteristic polynomial of T and g(λ) be the characteristic polynomial of TU. Then
\[ \chi (\lambda ) = \det \left( \lambda {\bf I} - {\bf A} \right) = \det \begin{bmatrix} \lambda {\bf I}_k - {\bf B}_1 & - {\bf B}_2 \\ {\bf 0} & \lambda {\bf I}_{n-k} - {\bf B}_3 \end{bmatrix} = g(\lambda )\cdot \det \left( \lambda {\bf I}_{n-k} - {\bf B}_3 \right) . \]
Thus, g(λ) divides χ(λ). ■

Theorem: Let T be a linear operator on a finite-dimensional vector space V, and let U denote the T-cyclic subspace of V generated by a nonzero vector \( {\bf v} \in V . \) Let \( k = \mbox{dim}(U) . \) Then \( \left\{ {\bf v} , T\,{\bf v} , \ldots , T^{k-1} {\bf v} \right\} \) is a basis for U, so there exist scalars a0, a1,...,ak-1 such that

\[ a_0 {\bf v} + a_1 T\,{\bf v} + a_2 T^2 {\bf v} + \cdots + a_{k-1} T^{k-1} {\bf v} + T^k {\bf v} = {\bf 0} . \]
In this case, \( f(\lambda ) = a_0 + a_1 \lambda + \cdots + a_{k-1} \lambda^{k-1} + \lambda^k \) is the characteristic polynomial of TU.

Since \( {\bf v} \ne {\bf 0} , \) the set \( \left\{ {\bf v} \right\} \) is linearly independent. Let j be the largest positive integer for which
\[ \beta = \left\{ {\bf v}, T\,{\bf v} , \ldots , T^{j-1} {\bf v} \right\} \]
is linearly independent. Such a j exists because V is a finite-dimensional space. Let Z = span(β). Then β is a basis for Z. Furthermore, \( T^j {\bf v} \in Z \) by the maximality of j. We use this information to show that Z is a T-invariant subspace of V. Let \( {\bf u} \in Z . \) Since u is a linear combination of the elements of β, there exist scalars b0, b1, ... , bj-1 such that
\[ {\bf u} = b_0 {\bf v} + b_1 T\,{\bf v} + \cdots + b_{j-1} T^{j-1} {\bf v} , \]
and hence
\[ T\,{\bf u} = b_0 T\,{\bf v} + b_1 T^2{\bf v} + \cdots + b_{j-1} T^{j} {\bf v} . \]
Thus, Tu is a linear combination of the elements of β, and hence belongs to Z. So Z is T-invariant. Furthermore, \( {\bf v} \in Z . \) Since U is the smallest T-invariant subspace of V that contains v, we have \( U \subseteq Z . \) Clearly \( Z \subseteq U , \) and so we conclude that Z = U. It follows that β is a basis for U, and therefore dim(U) = j. Thus, j = k.

Now view β as an ordered basis for U. Let a0, a1, ... , ak-1 be the scalars such that

\[ a_0 {\bf v} + a_1 T\,{\bf v} + \cdots + a_{k-1} T^{k-1} {\bf v} + T^k {\bf v} = {\bf 0} . \]
Observe that the matrix that corresponds to TU in basis β is
\[ \left[ T_U \right]_{\beta} = \begin{bmatrix} 0&0& \cdots &0&-a_0 \\ 1&0& \cdots & 0&-a_1 \\ \vdots&\vdots& \ddots &\vdots &\vdots \\ 0&0&\cdots &1&-a_{k-1} \end{bmatrix} , \]
which has the characteristic polynomial
\[ f(\lambda ) = a_0 + a_1 \lambda + \cdots + a_{k-1} \lambda^{k-1} + \lambda^k . \]
Thus, f(λ) is the characteristic polynomial for TU. ■
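The companion matrix displayed above can be constructed and checked numerically: by the theorem, it is annihilated by its own characteristic polynomial f(λ). A sketch (the helper names are illustrative):

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Horner evaluation of a polynomial at a matrix (highest degree first)."""
    n = len(A)
    P = [[0] * n for _ in range(n)]
    for c in coeffs:
        P = matmul(P, A)
        for i in range(n):
            P[i][i] += c
    return P

def companion(a):
    """Companion matrix of f(x) = x^k + a_{k-1} x^{k-1} + ... + a_0,
    in the form displayed above (subdiagonal of 1s, last column -a_i)."""
    k = len(a)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1
    for i in range(k):
        C[i][k - 1] = -a[i]
    return C

# f(x) = x^2 - 3x + 2 = (x - 1)(x - 2), so a = [a_0, a_1] = [2, -3]
C = companion([2, -3])
print(poly_at_matrix([1, -3, 2], C))   # [[0, 0], [0, 0]]
```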

Theorem (Cayley--Hamilton): Let A be an n×n matrix and In the n×n identity matrix. The characteristic polynomial of A is defined as \( \chi (\lambda ) = \det \left( \lambda {\bf I} - {\bf A} \right) , \) where det denotes the determinant and λ is a scalar (real or complex). Then the characteristic polynomial annihilates the matrix A: \( \chi ({\bf A} ) = {\bf 0} . \)

We show that \( \chi ({\bf A} )\,{\bf v} = {\bf 0} \) for every vector v in V. This is obvious when v = 0 because χ(A)v depends linearly on v. Now suppose that \( {\bf v} \ne {\bf 0} . \) Let U be the A-cyclic subspace generated by v, and suppose that dim (U) = k. By the preceding theorem, there exist scalars \( a_0 , a_1 , \ldots , a_{k-1} \) such that
\[ a_0 {\bf v} + a_1 {\bf A}\,{\bf v} + \cdots + a_{k-1} {\bf A}^{k-1} {\bf v} + {\bf A}^k {\bf v} = {\bf 0} , \]
and \( g(\lambda ) = a_0 + a_1 \lambda + \cdots + a_{k-1} \lambda^{k-1} + \lambda^k \) is the characteristic polynomial of the restriction of A to U. For the vector v, we have
\[ g\left( {\bf A} \right) {\bf v} = \left( a_0 {\bf I} + a_1 {\bf A} + \cdots + a_{k-1} {\bf A}^{k-1} + {\bf A}^k \right) {\bf v} = {\bf 0} . \]
Since U is A-invariant, g(λ) divides χ(λ); write \( \chi (\lambda ) = q(\lambda )\, g(\lambda ) \) for some polynomial q. Then \( \chi ({\bf A})\,{\bf v} = q({\bf A})\, g({\bf A})\,{\bf v} = {\bf 0} . \) Since v was arbitrary, \( \chi ({\bf A}) = {\bf 0} . \) ■
     
[Portraits: Arthur Cayley and William Rowan Hamilton]
Arthur Cayley (1821--1895) was an English mathematician and a leader of the British school of pure mathematics that emerged in the 19th century. Arthur showed great skill in numerical calculations at a private school in Blackheath and, after he moved to King's College School in 1835, at age 14 rather than the usual age of entry of 16, his aptitude for advanced mathematics became apparent. However, it was not only mathematics at which he excelled, for he won prizes in many subjects. In particular, he won the Chemistry Prize in each of his final two years despite not specialising in science. His mathematics teacher advised his parents that Arthur be encouraged to pursue his studies in this area rather than follow his father's wishes to enter the family business as a merchant.

In 1838 Arthur began his studies at Trinity College, Cambridge, having George Peacock as tutor in his first year. He was coached by William Hopkins who encouraged him to read papers by continental mathematicians. His favourite mathematical topics were linear transformations and analytical geometry and while still an undergraduate he had three papers published in the newly founded Cambridge Mathematical Journal edited by Duncan Gregory. Cayley graduated as Senior Wrangler in 1842 and won the first Smith's prize. After the examinations, Cayley and his friend Edmund Venables led a reading party of undergraduates to Aberfeldy in Scotland.

The Cambridge fellowship had a limited tenure, since Cayley was not prepared to take Holy Orders, so he had to find a profession. He chose law and began training in April 1846. While still training to be a lawyer Cayley went to Dublin to hear William Rowan Hamilton lecture on quaternions. He sat next to George Salmon during these lectures and the two were to exchange mathematical ideas over many years. Cayley was a good friend of Hamilton's although the two disagreed as to the importance of the quaternions in the study of geometry. Another of Cayley's friends was James Joseph Sylvester who was also in the legal profession. The two both worked at the courts of Lincoln's Inn in London and they discussed deep mathematical questions during their working day. Others at Lincoln's Inn who were active mathematicians included Hugh Blackburn.

Sir William Rowan Hamilton (1805--1865) was an Irish mathematician, astronomer, and mathematical physicist, who made important contributions to classical mechanics, optics, and algebra. His studies of mechanical and optical systems led him to discover new mathematical concepts and techniques. His best known contribution to mathematical physics is the reformulation of Newtonian mechanics, now called Hamiltonian mechanics. This work has proven central to the modern study of classical field theories such as electromagnetism, and to the development of quantum mechanics. In pure mathematics, he is best known as the inventor of quaternions.

The Cayley--Hamilton theorem was first proved in 1853 in terms of inverses of linear functions of quaternions, a non-commutative ring, by Hamilton. This corresponds to the special case of certain 4 × 4 real or 2 × 2 complex matrices. The theorem holds for general quaternionic matrices. Cayley in 1858 stated it for 3 × 3 and smaller matrices, but only published a proof for the 2 × 2 case. The general case was first proved by the German mathematician Ferdinand Georg Frobenius (1849--1917) in 1878.