MATLAB TUTORIAL, part 2.1: Sylvester Method

Suppose that A is a diagonalizable $ n \times n $ matrix; this means that all its eigenvalues are not defective and there exists a basis of n linearly independent eigenvectors. Now assume that its minimal polynomial (the polynomial of least possible degree that anihilates the matrix) is a product of simple terms:

\[ \psi (\lambda ) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_k \right) , \]

where $ \lambda_1 , \lambda_2 , \ldots , \lambda_k $ are distinct eigenvalues of the matrix A ( $ k \le n $ ).

Let $ f(\lambda ) $ be a function defined on the spectrum of the matrix A. The last condition means that every eigenvalue $λ i$ is in the domain of $f$ , and that every eigenvalue $λ i$ with multiplicity $m$ _i > 1 is in the interior of the domain, with $f$ being ( $m i — 1$ ) times differentiable at $λ i$ . We build a function $ f\left( {\bf A} \right) $ of diagonalizable square matrix A according to James Sylvester, who was an English lawyer and music tutor before his appointment as a professor of mathematics in 1885. To define a function of a square matrix, we need to construct k Sylvester auxiliary matrices for each distinct eigenvalue $ \lambda_i , \quad i= 1,2,\ldots k: $

\[ {\bf Z}_{\lambda_i} = {\bf Z}_{i} = \dfrac{\left( {\bf A} - \lambda_1 {\bf I} \right) \cdots \left( {\bf A} - \lambda_{i-1} {\bf I} \right) \left( {\bf A} - \lambda_{i+1} {\bf I} \right) \cdots \left( {\bf A} - \lambda_k {\bf I} \right)}{(\lambda_i - \lambda_1 ) \cdots (\lambda_i - \lambda_{i-1} ) (\lambda_i - \lambda_{i+1} ) \cdots (\lambda_i - \lambda_k )} . \]

Now we define the function $ f\left( {\bf A} \right) $ according to the formula:

\[ f\left( {\bf A} \right) = \sum_{i=1}^k f(\lambda_i )\,{\bf Z}_i . \]

Each Sylvester's matrix is a projection matrix on eigenspace of the corresponding eigenvalue.

Example 1.7.1: Consider the $ 3 \times 3 $ matrix $ {\bf A} = \begin{bmatrix} 1&4&16 \\ 18&20&4 \\ -12&-14&-7 \end{bmatrix} $ that has three distinct eigenvalues Then we define the resolvent, $ {\bf R}_{\lambda} ({\bf A} ) = (\lambda {\bf I} - {\bf A})^{-1} : $


% 3x3 matrix with 2 distinct real eigenvalues
syms lambda
A = [-20 -42 -21;6 13 6;12 24 13] % define matrix
A_eig = eig(A) % calculate eigenvalues
resolvent = simplify(inv((lambda*eye(3)-A)))
z1 = simplify((lambda-1)*resolvent) % use of simplify prevents divide by 0 error
Z1 = subs(z1,lambda, 1) % plugin 1 for lambda
z4 = simplify((lambda-4)*resolvent)
Z4 = subs(z4,lambda, 4) % plugin 4 for lambda

resolvent =
$ \begin{pmatrix} \frac{\lambda -25}{\sigma_1} & -\frac{42}{\sigma_1} &-\frac{21}{\sigma_1} \\ \frac{6}{\sigma_1} & \frac{\lambda +8}{\sigma_1} & \frac{6}{\sigma_1} \\ \frac{12}{\sigma_1} & \frac{24}{\sigma_1} & \frac{\lambda +8}{\sigma_1} \end{pmatrix} $
where $ \sigma_1 = \lambda^2 -5\,\lambda +4 $
z1 =
$ \begin{pmatrix} \frac{\lambda -25}{\lambda -4} & - \frac{42}{\lambda -4} & -\frac{21}{\lambda -4} \\ \frac{6}{\lambda -4} & \frac{\lambda +8}{\lambda -4} & \frac{6}{\lambda -4} \\ \frac{12}{\lambda -4} & \frac{24}{\lambda -4} & \frac{\lambda +8}{\lambda -4} \end{pmatrix} $
Z1 =
$ \begin{pmatrix} 8&14&7 \\ -2&-3&-2 \\ -4&-8&-3 \end{pmatrix} $
z4 =
$ \begin{pmatrix} \frac{\lambda -25}{\lambda -1} & -\frac{42}{\lambda -1} & -\frac{21}{\lambda -1} \\ \frac{6}{\lambda -1} & \frac{\lambda +8}{\lambda -1} & \frac{6}{\lambda -1} \\ \frac{12}{\lambda -1} & \frac{24}{\lambda -1} & \frac{\lambda +8}{\lambda -1} \end{pmatrix} $
Z4 =
$ \begin{pmatrix} -7&-14&-7 \\ 2&4&2 \\ 4&8&4 \end{pmatrix} $


% eigenvalues of a matrix
A = [1,4,16;18,20,4;-12,-14,-7]
A_eig = eig(A)

A =
   -20   -42  -21
     6    13    6
    12    24   13
A_eig =
    1.0000
    4.0000
    1.0000


% eigen values of Z
Z1_eval = eig(Z1)
Z4_eval = eig(Z4)

Z1_eval =
$ \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} $
Z4_eval =
$ \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} $

In this section, we write a function sylvester(M) which returns the auxiliary matrices of matrix M. We provide the code that can work with 2x2 matrices.

Open up the editor, and start out the program with a function header

			function [Z1 Z2]=sylvester(M)

It is good practice to explain what your function does, so you should then type as a comment with something like this: This function takes a matrix M as an input, and returns its fundamental matrix.

We want M to be inputted by the user, so type

			M=input('Please input a matrix!')

Now the first step is to return a vector with the eigenvalues of M:

			eigenvalues=eig(M);

The Sylvester method only works on matrices with simple eigenvalues, so we need to check the multiplicity of eigenvalues. This is equivalent to checking whether the size of the unique vector that contains the unique values of the vector eigenvalues is the same as the size of the original vector. The function that returns the size of a matrix is size(), while the function that returns the vector of unique values is unique().

			if size(unique(eigenvalues))~=2
          display('The matrix you entered does not have simple eigenvalues, thus the Sylvester method  cannot be used');
             else
             ...
             end

After the else, that is if we do in fact have simple eigenvalues we can code the substantive part of the command. After that is done, we will have to put and end to finish the if statement. Once we have the eigenvalues, we want to calculate the auxiliary matrices.

			Z1=(1/(eigenvalues(1)-eigenvalues(2))*(M-eigenvalues(1)*eye(2))
          Z2=(1/(eigenvalues(2)-eigenvalues(1))*(M-eigenvalues(2)*eye(2))

Thus the complete function should look like this

			function [Z1 Z2]=sylvester(M)
          %   This function takes a matrix M as an input, and returns its auxiliary matrices.

            eigenvalues=eig(M);

            if size(unique(eigenvalues))~=2
                 display('The matrix you entered does not have simple eigenvalues, thus the Sylvester method  cannot be used');
            else
                 Z1=(1/(eigenvalues(1)-eigenvalues(2)))*(M-eigenvalues(1)*eye(2))
                 Z2=(1/(eigenvalues(2)-eigenvalues(1)))*(M-eigenvalues(2)*eye(2))
             end
             end

---------------- New version ----------------------

We start with very important definitions. Our objective is to define a function of a square matrix. We beging with polynomials because for each such function $ q \left( \lambda \right) = q_0 + q_1 \lambda + q_2 \lambda^2 + \cdots + q_n \lambda^n $ we can naturally assign a matrix $ q \left( {\bf A} \right) = q_0 {\bf I} + q_1 {\bf A} + q_2 {\bf A}^2 + \cdots + q_n {\bf A}^n $ because we know how to multiply matrices and how to add them. A scalar polynomial $ q \left( \lambda \right) $ is called an annulled polynomial (or annihilating polynomial) of the square matrix A, if $ q \left( {\bf A} \right) = {\bf 0} , $ with the understanding that $ {\bf A}^0 = {\bf I} , $ the identity matrix, replaces $ \lambda^0 =1 $ when substituting λ for A.

The minimal polynomial of a square matrix A is a unique monic polynomial ψ of lowest degree such that $ \psi \left( {\bf A} \right) = {\bf 0} . $ Every square matrix has a minimal polynomial.

A square matrix A for which the characteristic polynomial $ \chi (\lambda ) = \det \left( \lambda {\bf I} - {\bf A} \right) $ and the minimal polynomial are the same is called a nonderogatory matrix (where I is the identity matrix). A derogatory matrix is one that is not non-derogatory.

Example: Consider two matrices

\[ {\bf A} = \begin{bmatrix} 11&-4&-2 \\ 20&-7&-4 \\ 5&-2&0 \end{bmatrix} \qquad\mbox{and} \qquad {\bf B} = \begin{bmatrix} 13&-5&-2 \\ 24&-9&-4 \\ -7&-3&0 \end{bmatrix} \]

that share the same characteristic polynomial:

\[ \chi_{\bf A} (\lambda ) = \det \left( \lambda {\bf I} - {\bf A} \right) = \det \left( \lambda {\bf I} - {\bf B} \right) = \chi_{\bf B} (\lambda ) = \left( \lambda -1 \right)^2 (\lambda -2) = \lambda^3 - 4\,\lambda^2 + 5\,\lambda -2 . \]

Indeed

\[ {\bf A}^3 -4\,{\bf A}^2 + 5\,{\bf A} -2\,{\bf A}^0 = \begin{bmatrix} 71&-28&-14 \\ 140&-55&-28 \\ 35&-14&-6 \end{bmatrix} -4 \begin{bmatrix} 31&-12&-6 \\ 60&-23&-12 \\ 15&-6&-2 \end{bmatrix} + 5 \begin{bmatrix} 11&-4&-2 \\ 20&-7&-4 \\ 5&-2&0 \end{bmatrix} -2 \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} = \begin{bmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{bmatrix} \]

and

\[ {\bf B}^3 -4\,{\bf B}^2 + 5\,{\bf B} -2\,{\bf B}^0 = \begin{bmatrix} 77&-31&-14 \\ 152&-61&-28 \\ 41&-17&-6 \end{bmatrix} -4 \begin{bmatrix} 35&-14&-6 \\ 68&-27&-12 \\ 19&-8&-2 \end{bmatrix} + 5 \begin{bmatrix} 13&-5&-2 \\ 24&-9&-4 \\ -7&-3&0 \end{bmatrix} -2 \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} = \begin{bmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{bmatrix} . \]

On the other hand, the minimal polynomial for matrix A is

\[ \psi_{\bf A} (\lambda ) = \left( \lambda -1 \right) \left( \lambda -2 \right) = \lambda^2 -3\,\lambda +2 \]

since

\[ \psi_{\bf A} \left({\bf A} \right) = \left( {\bf A} - {\bf I} \right) \left( {\bf A} -2 {\bf I} \right) = {\bf A}^2 -3\,{\bf A} +2 \,{\bf I} = \begin{bmatrix} 31&-12&-6 \\ 60&-23&-12 \\ 15&-6&-2 \end{bmatrix} - 3 \begin{bmatrix} 11&-4&-2 \\ 20&-7&-4 \\ 5&-2&0 \end{bmatrix} +2 \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} = \begin{bmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{bmatrix} . \]

Therefore, matrix A is a derogatory matrix. However, matrix B is a nonderogatory matrix because

\[ \psi_{\bf B} (\lambda ) = \left( \lambda -1 \right)^2 \left( \lambda -2 \right) = \lambda^3 - 4\,\lambda^2 +5\,\lambda -2 . \qquad\quad ■ \]


Arthur Cayley		James Sylvester		William Hamilton

Arthur Cayley (1821--1895) was a British mathematician, one of the founders of the modern British school of pure mathematics. As a child, Cayley enjoyed solving complex math problems for amusement. He entered Trinity College, Cambridge, where he excelled in Greek, French, German, and Italian, as well as mathematics. He worked as a lawyer for 14 years. During this period of his life, Cayley produced between two and three hundred papers. Many of his publications were results of collaboration with his friend J. J. Sylvester.

Sir William Rowan Hamilton (1805--1865) was an Irish physicist, astronomer, and mathematician, who made important contributions to classical mechanics, optics, and algebra. He was first foreign member of the American National Academy of Sciences. Hamilton had a remarkable ability to learn languages, including modern European languages, and Hebrew, Persian, Arabic, Hindustani, Sanskrit, and even Marathi and Malay. Hamilton was part of a small but well-regarded school of mathematicians associated with Trinity College in Dublin, which he entered at age 18. Paradoxically, the credit for discovering the quantity now called the Lagrangian and Lagrange's equations belongs to Hamilton. In 1835, being secretary to the meeting of the British Association which was held that year in Dublin, he was knighted by the lord-lieutenant.

James Joseph Sylvester (1814--1897) was an English mathematician. He made fundamental contributions to matrix theory, invariant theory, number theory, partition theory, and combinatorics. Sylvester was born James Joseph in London, England, but later he adopted the surname Sylvester when his older brother did so upon emigration to the United States. At the age of 14, Sylvester was a student of Augustus De Morgan at the University of London. His family withdrew him from the University after he was accused of stabbing a fellow student with a knife. Subsequently, he attended the Liverpool Royal Institution.

However, Sylvester was not issued a degree, because graduates at that time were required to state their acceptance of the Thirty-Nine Articles of the Church of England, and Sylvester could not do so because he was Jewish, the same reason given in 1843 for his being denied appointment as Professor of Mathematics at Columbia College (now University) in New York City. For the same reason, he was unable to compete for a Fellowship or obtain a Smith's prize. In 1838 Sylvester became professor of natural philosophy at University College London and in 1839 a Fellow of the Royal Society of London. In 1841, he was awarded a BA and an MA by Trinity College, Dublin. In the same year he moved to the United States to become a professor of mathematics at the University of Virginia, but left after less than four months following a violent encounter with two students he had disciplined. He moved to New York City and began friendships with the Harvard mathematician Benjamin Peirce and the Princeton physicist Joseph Henry, but in November 1843, after his rejection by Columbia, he returned to England.

One of Sylvester's lifelong passions was for poetry; he read and translated works from the original French, German, Italian, Latin and Greek, and many of his mathematical papers contain illustrative quotes from classical poetry. In 1876 Sylvester again crossed the Atlantic Ocean to become the inaugural professor of mathematics at the new Johns Hopkins University in Baltimore, Maryland. His salary was $5,000 (quite generous for the time), which he demanded be paid in gold. After negotiation, agreement was reached on a salary that was not paid in gold. In 1878 he founded the American Journal of Mathematics. In 1883, he returned to England to take up the Savilian Professor of Geometry at Oxford University. He held this chair until his death. Sylvester invented a great number of mathematical terms such as "matrix" (in 1850), "graph" (combinatorics) and "discriminant." He coined the term "totient" for Euler's totient function.

Theorem: (Cayley--Hamilton) Every square matrix A is annulled by its characteristic polynomial, that is, $ \chi ({\bf A}) = {\bf 0} , $ where $ \chi (\lambda ) = \det \left( \lambda {\bf I} - {\bf A} \right) . $ ■

Example: Consider the matrix

\[ {\bf T} = \begin{bmatrix} 29&390&336&648&-480 \\ 3&26&24 & 48&-36 \\ 12&153&131&255&-189 \\ 6&69&60&116&-87 \\ 21&249&216&420&-313 \end{bmatrix} . \]

Its characteristic polynomial is of degree 5:

\[ \chi \left( \lambda \right) = \det \left( \lambda {\bf I} - {\bf A} \right) = \left( \lambda +1 \right)^3 \left( \lambda +4 \right)^2 = \lambda^5 +11\,\lambda^4 + 43\,\lambda^3 + 73\,\lambda^2 + 58\,\lambda +16 ; \]

however, its minimal polynomial is of degree 2:

\[ \psi \left( \lambda \right) = \left( \lambda +1 \right) \left( \lambda +4 \right) = \lambda^2 + 5\,\lambda + 4 . \]

Indeed,

\[ \psi \left( {\bf T} \right) = \left( {\bf T} +{\bf I} \right) \left( {\bf T} +4\,{\bf I} \right) = {\bf T}^2 + 5\,{\bf T} + 4\,{\bf I} = \begin{bmatrix} -149&-1950&-1680&-3240&2400 \\ -15&-134&-120 &-240&180 \\ -60&-765&-659&-1275&945 \\ -30&-345&-300&-584&435 \\ -105&-1245&-1080&-2100&1561 \end{bmatrix} +5 \begin{bmatrix} 29&390&336&648&-480 \\ 3&26&24 & 48&-36 \\ 12&153&131&255&-189 \\ 6&69&60&116&-87 \\ 21&249&216&420&-313 \end{bmatrix} + 4 \begin{bmatrix} 1&0&0&0&0 \\ 0&1&0 &0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \end{bmatrix} = \begin{bmatrix} 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix} . \]

Therefore, matrix T is not a derogatory matrix. ■

Theorem: For polynomials p and q and a square matrix A, $ p \left( {\bf A} \right) = q \left( {\bf A} \right) $ if and only if p and q take the same values on the spectrum (set of all eigenvalues) of A.

Theorem: The minimal polynomial ψ of a square matrix A divides any other annihilating polynomial p for which $ p \left( {\bf A} \right) = {\bf 0} . $ In particular, the minimal polynomial divides the characteristic polynomial of A. ■

From the above theorem follows that the minimal polynomial always is a product of monomials $ \lambda - \lambda_i $ for each eigenvalue λ_i because the characteristic polynomial contains all such multiples. So all eigenvalues must be present in the product form for a minimal polynomial.

Theorem: The minimal polynomial for a linear operator on a finite-dimensional vector space is unique.

Theorem: A square matrix A is diagonalizable if and only if its minimal polynomial is a product of simple terms: $ \psi (\lambda ) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s \right) , $ where λ₁, ... , λ_s are eigenvalues of matrix A. ■

Therefore, a square matrix A is not diagonalizable if and only if its minimal polynomial contains at least one multiple $ \left( \lambda - \lambda_1 \right)^m $ for m > 1.

Theorem: Every matrix that commutes with a square matrix A is a polynomial in A if and only if A is nonderogatory.

Theorem: If A is upper triangular and nonderogatory, then any solution X of the equation $ f \left( {\bf X} \right) = {\bf A} $ is upper triangular. ■

In Mathematica, one can define the minimal polynomial using the following code:

MatrixMinimalPolynomial[a_List?MatrixQ, x_] :=
Module[{i, n = 1, qu = {},
mnm = {Flatten[IdentityMatrix[Length[a]]]}},
While[Length[qu] == 0, AppendTo[mnm, Flatten[MatrixPower[a, n]]];
qu = NullSpace[Transpose[mnm]];
n++];
First[qu].Table[x^i, {i, 0, n - 1}]]

Then to find a minimal polynomial of previously defined square matrix A, just enter

MatrixMinimalPolynomial[A, x]

Suppose that A is a diagonalizable $ n \times n $ matrix; this means that all its eigenvalues are not defective and there exists a basis of n linearly independent eigenvectors. Then its minimal polynomial (the polynomial of least possible degree that annihilates the matrix) is a product of simple terms:

\[ \psi (\lambda ) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_k \right) , \]

where $ \lambda_1 , \lambda_2 , \ldots , \lambda_k $ are distinct eigenvalues of matrix A ( $ k \le n $ ).

Sylvester's auxiliary matrix $ {\bf Z}_i $ is actually the Lagrange interpolating polynomial. Now we define the function $ f\left( {\bf A} \right) $ according to the formula:

\[ f\left( {\bf A} \right) = \sum_{i=1}^k f(\lambda_i )\,{\bf Z}_i . \]

This formula was formulated by James Sylvester in 1883:
J. J. Sylvester. On the equation to the secular inequalities in the planetary theory. Philosophical Magazine, 16:267–269, 1883.

Each Sylvester's matrix $ {\bf Z}_i $ is a projection matrix on eigenspace of the corresponding eigenvalue, so $ {\bf Z}_i^2 = {\bf Z}_i , \ i = 1,2,\ldots , k; $ but $ {\bf Z}_1 {\bf Z}_2 = {\bf 0} . $

Example: Consider a 2-by-2 matrix

\[ {\bf A} = \begin{bmatrix} 4&5 \\ -1& -2 \end{bmatrix} \]

that has two distinct eigenvalues $ \lambda_1 =3 $ and $ \lambda_2 =-1 . $ The eigenvectors corresponding these eigenvalues are

\[ {\bf v}_1 = \left\langle -5 , \ 1 \right\rangle^T \qquad {\bf v}_2 = \left\langle 1 , \ -1 \right\rangle^T , \]

respectively. Since the minimal polynomial is $ \psi (\lambda ) = (\lambda -3)(\lambda +1) $ is a product of simple factors, we build Sylvester's auxiliary matrices:

\[ {\bf Z}_3 = \frac{{\bf A} +1}{3+1} = \frac{1}{4} \begin{bmatrix} 5&5 \\ -1& -1 \end{bmatrix} \quad\mbox{and} \quad {\bf Z}_{-1} = \frac{{\bf A} -3}{-1-3} = \frac{1}{4} \begin{bmatrix} -1&-5 \\ 1&5 \end{bmatrix} . \]

These matrices are projectors on eigenspaces:

\[ {\bf Z}_3^2 = {\bf Z}_3 \qquad \mbox{and} \qquad {\bf Z}_{-1}^2 = {\bf Z}_{-1} , \quad \mbox{but} \quad {\bf Z}_3 {\bf Z}_{-1} = {\bf 0}. \]

Let us take an arbitrary vector, say $ {\bf x} = \langle 2, \ 2 \rangle^T , $ and expand it with respect to eigenvectors:

\[ {\bf x} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = a\, {\bf v}_1 + b\,{\bf v}_2 = a \begin{bmatrix} -5 \\ 1 \end{bmatrix} + b \begin{bmatrix} 1 \\ -1 \end{bmatrix} , \]

where coefficients a and b need to be determined. Of course, we can find them by solving the system of linear equations:

\[ \begin{split} 2 &= a(-5) + b , \\ 2 &= a - b . \end{split} \]

However, an easier way to do this is to apply Sylvester's auxiliary matrices:

\begin{align*} {\bf Z}_3 \, {\bf x} &= \frac{1}{4} \begin{bmatrix} 5&5 \\ -1& -1 \end{bmatrix} \, \begin{bmatrix} 2\\ 2 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 5&5 \\ -1& -1 \end{bmatrix} \, \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 10 \\ 2 \end{bmatrix} = \begin{bmatrix} 5\\ -1 \end{bmatrix} = -{\bf v}_1 , \\ {\bf Z}_{-1}\, {\bf x} &= \frac{1}{4} \begin{bmatrix} -1&-5 \\ 1&5 \end{bmatrix} \, \begin{bmatrix} 2\\ 2 \end{bmatrix} = \begin{bmatrix} -3 \\ 3 \end{bmatrix} = -3\, {\bf v}_2 . \end{align*}

Therefore, coefficients are $ a =-1 $ and $ b =-3 . $ So we get the expansion:

\[ {\bf x} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = - {\bf v}_1 - 3\,{\bf v}_2 = - \begin{bmatrix} -5 \\ 1 \end{bmatrix} -3 \begin{bmatrix} 1 \\ -1 \end{bmatrix} . \]

We can define some functions, for instance, $ f(\lambda ) = e^{\lambda \,t} $ and $ g(\lambda ) = \cos \left( \lambda \,t\right) . $ Indeed,

\begin{align*} f({\bf A}) &= e^{3t}\, {\bf Z}_3 + e^{-t}\, {\bf Z}_{-1} = e^{3t}\, \frac{1}{4} \begin{bmatrix} 5&5 \\ -1& -1 \end{bmatrix} + e^{-t} \, \frac{1}{4} \begin{bmatrix} -1&-5 \\ 1&5 \end{bmatrix} , \\ g({\bf A}) &= \cos (3t) \, {\bf Z}_3 + \cos\, t\, {\bf Z}_{-1} = \frac{\cos \,3t}{4} \begin{bmatrix} 5&5 \\ -1& -1 \end{bmatrix} + \frac{\cos\, t}{4} \begin{bmatrix} -1&-5 \\ 1&5 \end{bmatrix} .\end{align*}

Example: Consider the $ 3 \times 3 $ matrix $ {\bf A} = \begin{bmatrix} 1&4&16 \\ 18&20&4 \\ -12&-14&-7 \end{bmatrix} $ that has three distinct eigenvalues; Mathematica confirms

A = {{1,4,16},{18,20,4},{-12,-14,-7}}

Eigenvalues[A]

Out[2]= 9, 4, 1

Then we define Sylvester auxiliary matrices

\[ {\bf Z}_1 = \frac{\left( {\bf A} - 9{\bf I} \right)\left( {\bf A} - 4{\bf I} \right)}{(1-9)(1-4)} = \begin{bmatrix} -4&-8&-12 \\ 4&8&12 \\ -1&-2&-3 \end{bmatrix}, \quad {\bf Z}_4 = \frac{\left( {\bf A} - 9{\bf I} \right)\left( {\bf A} - {\bf I} \right)}{(4-9)(4-1)} = \begin{bmatrix} 8&12&16 \\ -10&-15&-20 \\ 4&6&8 \end{bmatrix} , \quad {\bf Z}_9 = \frac{\left( {\bf A} - 4{\bf I} \right)\left( {\bf A} - {\bf I} \right)}{(9-1)(9-4)} = \begin{bmatrix} -3&-4&-4 \\ 6&8&8 \\ -3&-4&-4 \end{bmatrix} . \]

We check with Mathematica that these auxiliary matrices are projectors:

Z1 = (A - 9*IdentityMatrix[3]).(A - 4*IdentityMatrix[3])/3/8

{{-4, -8, -12}, {4, 8, 12}, {-1, -2, -3}}

Z4 = (A - 9*IdentityMatrix[3]).(A - 1*IdentityMatrix[3])/(4 - 9)/(4 - 1)

{{8, 12, 16}, {-10, -15, -20}, {4, 6, 8}}

Z9 = (A - 4*IdentityMatrix[3]).(A - 1*IdentityMatrix[3])/(9 - 4)/(9 - 1)

{{-3, -4, -4}, {6, 8, 8}, {-3, -4, -4}}

Eigenvalues[Z1]

{1, 0, 0}

Eigenvectors[Z1]

{{4, -4, 1}, {-3, 0, 1}, {-2, 1, 0}}

Eigenvalues[Z4]

{1, 0, 0}

Eigenvectors[Z4]

{{4, -5, 2}, {-2, 0, 1}, {-3, 2, 0}}

Eigenvalues[Z9]

{1, 0, 0}

Eigenvectors[Z9]

{{1, -2, 1}, {-4, 0, 3}, {-4, 3, 0}}

Now we show that these auxiliary matrices are mutually orthogonal.

Z1.Z4

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}

Z1.Z1 - Z1

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}

Z1.Z9

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}

Z4.Z4 - Z4

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}

Z9.Z4

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}

Z9.Z9 - Z9

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}

With Sylvester's matrices, we are able to construct arbitrary admissible functions of the given matrix. We start with the square root function $ r(\lambda ) = \sqrt{\lambda} , $ which is actually comprises of two branches $ r(\lambda ) = \sqrt{\lambda} $ and $ r(\lambda ) = -\sqrt{\lambda} . $ Indeed, choosing a particular branch, we get

\begin{align*} {\bf R}_1 &= {\bf Z}_1 + 2\,{\bf Z}_4 + 3\, {\bf Z}_9 = \begin{bmatrix} 3&4&8 \\ 2&2&-4 \\ -2&-2&1 \end{bmatrix} , \\ {\bf R}_2 &= {\bf Z}_1 - 2\,{\bf Z}_4 + 3\, {\bf Z}_9 = \begin{bmatrix} -29&-44&-56 \\ 42&62&76 \\ -18&-26&-31 \end{bmatrix} , \\ {\bf R}_3 &= {\bf Z}_1 + 2\,{\bf Z}_4 - 3\, {\bf Z}_9 = \begin{bmatrix} 21&28&32 \\ -34&-46&-52 \\ 16&22&25 \end{bmatrix} , \\ {\bf R}_4 &= {\bf Z}_1 - 2\,{\bf Z}_4 - 3\, {\bf Z}_9 = \begin{bmatrix} -11&-20&-32 \\ 6&14&28 \\ 0&-2&-7 \end{bmatrix} . \end{align*}

The other four roots are the negatives of the above matrices.

We consider two other functions $ \displaystyle \Phi (\lambda ) = \frac{\sin \left( t\,\sqrt{\lambda} \right)}{\sqrt{\lambda}} $ and $ \displaystyle \Psi (\lambda ) = \cos \left( t\,\sqrt{\lambda} \right) . $ It is not a problem to determine the corresponding matrix functions

\begin{align*} {\bf \Phi} (t) &= \sin t \,{\bf Z}_1 + \frac{\sin 2\,t}{2} \,{\bf Z}_4 + \frac{\sin 3\,t}{3}\, {\bf Z}_9 , \\ {\bf \Psi} (t) &= \cos t\,{\bf Z}_1 + \cos 2\,t\,{\bf Z}_4 + \cos 3\,t\, {\bf Z}_9 . \end{align*}

These two matrix functions are solutions of the matrix differential equation $ \displaystyle \ddot{\bf P} (t) + {\bf A}\,{\bf P} (t) = {\bf 0} $ subject to the initial conditions

\[ {\bf \Phi} (0) = {\bf 0} , \quad \dot{\bf \Phi} (0) = {\bf I} , \quad {\bf \Psi} (0) = {\bf I} , \quad \dot{\bf \Psi} (0) = {\bf 0} \]

because $ \displaystyle {\bf Z}_1 + {\bf Z}_4 + {\bf Z}_9 = {\bf I} $ and $ \displaystyle {\bf A}\,{\bf Z}_1 - {\bf Z}_1 = {\bf 0}, \quad {\bf A}\,{\bf Z}_4 - 4\,{\bf Z}_4 = {\bf 0}, \quad {\bf A}\,{\bf Z}_9 - 9\,{\bf Z}_9 = {\bf 0} . $


function [S1,S2,S3] = Sylvester_3X3(A)
% This function is designed to return the Sylvester Axuiliary Matrices %
% S1,S2,S3 of 3x3 diagonalizable matrix A.
% If there are less than three auxiliary matrices, it will return the proper
% amount of axuliary matrices, and return the other values as simply 0.
% Let's find the characteristic polynomial.
syms x
vec_poly = minpoly(A); % coefficients of minimal polynomial
min_poly = poly2sym(vec_poly) % create symbolic polynomial from vector of coefficients
% Eigenvalues
u = solve(min_poly,x); % distinct eigenvalues of A

E = eye(3); % 3X3 identity matrix

S1 = 0;
S2 = 0;
S3 = 0;

switch numel(u)
  case 2
    % For our first unique eigenvalue...
    S1 = (A-E*u(2))/(u(1) - u(2));
    % Now for our second...
    S2 = (A-E*u(1))/(u(2) - u(1));
  case 3
    S1 = (A - E*u(2))*(A-E*u(3))/((u(1)-u(2))*(u(1)-u(3)));
    S2 = (A - E*u(1))*(A-E*u(3))/((u(2)-u(1))*(u(2)-u(3)));
    S3 = (A - E*u(1))*(A-E*u(2))/((u(3)-u(1))*(u(3)-u(2)));
end

Example: Consider the nonderogatory (diagonalizable) $ 3 \times 3 $ matrix $ {\bf A} = \begin{bmatrix} -20&-42&-21 \\ 6&13&6 \\ 12&24&13 \end{bmatrix} $ that has only two distinct eigenvalues

Eigenvalues[A]

Out[2]= 4, 1, 1

Then we define the resolvent:

\[ {\bf R}_{\lambda} ({\bf A}) = \left( \lambda {\bf I} - {\bf A} \right)^{-1} = \frac{1}{(\lambda -1)(\lambda -4)} \,\begin{bmatrix} \lambda -25 & -42 & -21 \\ 6 & 8+\lambda & 6 \\ 12&24&8+\lambda \end{bmatrix} . \]

resolvent = FullSimplify[Inverse[lambda*IdentityMatrix[3] - A]]

Out[2]= {{(-25 + lambda)/( 4 - 5 lambda + lambda^2), -(42/(4 - 5 lambda + lambda^2)), -(21/( 4 - 5 lambda + lambda^2))}, {6/(4 - 5 lambda + lambda^2), ( 8 + lambda)/(4 - 5 lambda + lambda^2), 6/( 4 - 5 lambda + lambda^2)}, {12/(4 - 5 lambda + lambda^2), 24/( 4 - 5 lambda + lambda^2), (8 + lambda)/(4 - 5 lambda + lambda^2)}}

Using the resolvent, we define Sylvester's auxiliary matrices (that are actually projector matrices on eigenspaces)

Z1 = FullSimplify[(lambda - 1)*resolvent] /. lambda -> 1

Out[3]= {{8, 14, 7}, {-2, -3, -2}, {-4, -8, -3}}

Z4 = FullSimplify[(lambda - 4)*resolvent] /. lambda -> 4

Out[4]= {{-7, -14, -7}, {2, 4, 2}, {4, 8, 4}}

\[ {\bf Z}_1 = \begin{bmatrix} 8&14&7 \\ -2& -3&-2 \\ -4&-8&-3 \end{bmatrix} , \qquad {\bf Z}_4 = \begin{bmatrix} -7&-14&-7 \\ 2& 4&2 \\ 4&8&4 \end{bmatrix} . \]

These auxiliary matrices are actually projection matrices; this means that the following relations hold:

\[ {\bf Z}_1 {\bf Z}_1 = {\bf Z}_1 , \qquad {\bf Z}_4 {\bf Z}_4 = {\bf Z}_4 , \qquad {\bf Z}_1 {\bf Z}_4 = {\bf 0} . \]

Their eigenvalues are

Eigenvalues[Z1]

{1, 1, 0}

Eigenvalues[Z4]

{1, 0, 0}

Although matrix A has several square roots

\[ {\bf R}_1 = {\bf Z}_1 + 2\, {\bf Z}_4 = \begin{bmatrix} -6&-14&-7 \\ 2&5&2 \\ 4&8&5 \end{bmatrix} , \qquad {\bf R}_2 = {\bf Z}_1 - 2\, {\bf Z}_4 = \begin{bmatrix} 22&42&21 \\ -6&-11&-6 \\ -12&-24 & -11 \end{bmatrix} , \]

the following matrix functions are defined uniquely:

\begin{align*} {\bf \Phi} (t) &= \sin t \,{\bf Z}_1 + \frac{\sin 2\,t}{2} \,{\bf Z}_4 , \\ {\bf \Psi} (t) &= \cos t\,{\bf Z}_1 + \cos 2\,t\,{\bf Z}_4 . \end{align*}

These two matrix functions are solutions of the matrix differential equation $ \displaystyle \ddot{\bf P} (t) + {\bf A}\,{\bf P} (t) = {\bf 0} $ subject to the initial conditions

\[ {\bf \Phi} (0) = {\bf 0} , \quad \dot{\bf \Phi} (0) = {\bf I} , \quad {\bf \Psi} (0) = {\bf I} , \quad \dot{\bf \Psi} (0) = {\bf 0} \]

because $ \displaystyle {\bf Z}_1 + {\bf Z}_4 = {\bf I} $ and $ \displaystyle {\bf A}\,{\bf Z}_1 - {\bf Z}_1 = {\bf 0}, \quad {\bf A}\,{\bf Z}_4 - 4\,{\bf Z}_4 = {\bf 0} . $

Example: The matrix $ {\bf A} = \begin{bmatrix} 1&2&3 \\ 2&3&4 \\ 2&-6&-4 \end{bmatrix} $ has two complex conjugate eigenvalues $ \lambda = 1 \pm 2{\bf j} , $ and one real eigenvalue $ \lambda = -2 . $ Since $ (\lambda -1 + 2{\bf j})(\lambda -1 - 2{\bf j}) = (\lambda -1)^2 +4 , $ the Sylvester auxiliary matrix corresponding to the real eigenvalue becomes

\[ {\bf Z}_{-2} = \frac{\left( {\bf A} - {\bf I} \right)^2 + 4\, {\bf I} }{(-3)^2 + 4} = \frac{1}{13}\,\begin{bmatrix} 14&-14&-7 \\ 12&-12&-6 \\ -22&22&11 \end{bmatrix} . \]

% 3x3 matrix with 2 complex conjugate eigenvalues
syms lambda
A = [1 2 3; 2 3 4; 2 -6 -4] % define matrix
A_eig = eig(A) %calculate eigenvalues
resolvent = simplify(inv((lambda*eye(3)-A)))
z_2 = simplify(((A-eye(3))^2+4*eye(3))/((lambda-1)^2+4))
Z_2 = subs(z_2, lambda, -2) %plugin -2 for lambda
z1_2j = ((A+(-1+2i)*eye(3))*(A+2*eye(3)))/(4i*(3+2i))
Z1_2j = subs(z1_2j, lambda, 1+2i) %plugin 1+2j for lambda

A =
     1   2   3
     2   3   4
     2  -6  -4
A_eig =
    -2.0000 + 0.0000i
     1.0000 + 2.0000i
     1.0000 - 2.0000i

resolvent = $ \begin{pmatrix} \frac{\lambda^2 + \lambda +12}{\sigma_1} & \frac{2(\lambda -5)}{\sigma_1} & ??\frac{21}{\sigma_1} \\ \frac{2 \left( \lambda +8 \right)}{\sigma_1} & \frac{\lambda^2 + 3\,\lambda -10}{\sigma_1} & ??\frac{2}{\sigma_1} \\ \frac{2 \left( \lambda -9 \right)}{\sigma_1} & -\frac{2 \left( 3\,\lambda -5 \right)}{\sigma_1} & ??\frac{\lambda +8}{\sigma_1} \end{pmatrix} $
where $ \sigma_1 = \lambda^3 + \lambda +10 $
z2 =
$ \begin{pmatrix} \frac{14}{\sigma_1} & -\frac{14}{\sigma_1} & -\frac{7}{\sigma_1} \\ \frac{12}{\sigma_1} & -\frac{12}{\sigma_1} & -\frac{6}{\sigma_1} \\ -\frac{22}{\sigma_1} & \frac{22}{\sigma_1} & \frac{11}{\sigma_1} \end{pmatrix} $
where $ \sigma_1 = (\lambda -1)^2 +4 $
z_2 =
$ \begin{pmatrix} \frac{14}{13}&1-\frac{14}{13}&-\frac{7}{13} \\ \frac{12}{13}&-\frac{12}{13} &-\frac{6}{13} \\ -\frac{22}{13}&\frac{22}{13}&\frac{11}{13} \end{pmatrix} $
z1_2j =

A =
    -0.0385 - 0.8077i   0.5385 + 0.
    -0.4615 - 1.1923i   0.9615 + 0.
     0.8462 + 0.7692i  -0.8462 + 0.

Z1_2j =
$ \begin{pmatrix} - \frac{1}{26} - \frac{21}{26}\,i & \frac{7}{13} + \frac{4}{13}\, i & \frac{7}{26} \\ - \frac{6}{13} - \frac{31}{26}\,i & \frac{25}{26} + \frac{5}{26}\, i & \frac{3}{13} \\ \frac{11}{13} + \frac{10}{13}\,i & - \frac{11}{13} + \frac{3}{13}\, i & \frac{1}{13} + \end{pmatrix} $
This auxiliary matrix is a projection; as so $ {\bf Z}_{-2} {\bf Z}_{-2} = {\bf Z}_{-2}^2 = {\bf Z}_{-2}^3 = {\bf Z}_{-2} . $

Sylvester's auxiliary matrices corresponding to two complex conjugate eigenvalues are also complex conjugate:

\[ {\bf Z}_{1+2{\bf j}} = \frac{\left( {\bf A} + \left( -1 + 2{\bf j} \right) {\bf I} \right) \left( {\bf A} + 2{\bf I} \right) }{4{\bf j} (3+ 2{\bf j})} = \frac{1}{26}\,\begin{bmatrix} -1&14&7 \\ -12&25&6 \\ 22&-22& 2 \end{bmatrix} + \frac{\bf j}{26}\,\begin{bmatrix} -21&8&-9 \\ -31&5&-17 \\ 20&6&16 \end{bmatrix} = \overline{{\bf Z}_{1-2{\bf j}} } . \]

These two complex auxiliary matrices are also projections: $ {\bf Z}_{1+2{\bf j}}^2 = {\bf Z}_{1+2{\bf j}}^3 = {\bf Z}_{1+2{\bf j}} , $ and similar relations for its complex conjugate.

Now suppose you want to build a matrix function corresponding to the exponential function $ f(\lambda ) = e^{\lambda \,t} . $ Using Sylvester's auxiliary matrices, we have

\[ f({\bf A}) = e^{{\bf A}\,t} = e^{(1+2{\bf j})t}\, {\bf Z}_{1+2{\bf j}} + e^{(1-2{\bf j})t}\, {\bf Z}_{1-2{\bf j}} + e^{-2t} \,{\bf Z}_{-2} . \]

factoring out the real-valued function $ e^{t} , $ the first two terms become

\[ e^{2{\bf j}\,t}\, {\bf Z}_{1+2{\bf j}} + e^{-2{\bf j}\,t}\, {\bf Z}_{1-2{\bf j}} = 2\,\mbox{Re}\, e^{2{\bf j}\,t}\, {\bf Z}_{1+2{\bf j}} = 2\,\cos 2t \left( \Re\, {\bf Z}_{1+2{\bf j}} \right) - 2\,\sin 2t \left( \Im\, {\bf Z}_{1+2{\bf j}} \right) . \]

This allows us to simplify the answer to

\[ f({\bf A}) = e^{{\bf A}\,t} = e^t \left( \frac{\cos 2t}{13} \,\begin{bmatrix} -1&14&7 \\ -12&25&6 \\ 22&-22& 2 \end{bmatrix} -\frac{\sin 2t}{13} \, \begin{bmatrix} -21&8&-9 \\ -31&5&-17 \\ 20&6&16 \end{bmatrix} \right) + e^{-2t} \,{\bf Z}_{-2} . \]

The obtained exponential matrix function $ e^{{\bf A}\,t} $ is a unique solution to the initial value problem:

\[ \dot{\bf X} (t) = {\bf X} (t) , \qquad {\bf X} (0) = {\bf I} . \]

MATLAB TUTORIAL, part 2.1: Sylvester Formula

---------------- New version ----------------------