This web page serves as an introduction to the definition of a function of a square matrix, which may have
several meanings. In particular, for a scalar function f and a square matrix A, we
specify \( f ({\bf A} ) \) to be a matrix of the same dimensions as A.
Since our objective is applications of matrix functions in differential equations, we
usually utilize holomorphic functions (those represented by convergent power series) of a single variable. Previously, we
used several equivalent definitions of matrix functions for admissible functions, that is, functions defined on the spectrum of A:
the values \( f(\lambda_i ), \ f' (\lambda_i ), \ \ldots , \ f^{(m_i -1)} (\lambda_i ) \)
must exist for every eigenvalue \( \lambda_i \) of multiplicity \( m_i . \)
Many other definitions of a function of a square matrix are known; see the monograph by Nicholas J. Higham (born 1961):
Higham, Nicholas J. Functions of Matrices: Theory and Computation. SIAM, Philadelphia, 2008.
We know that every non-zero complex number has precisely two square roots; for instance, -1 has the two square roots
j and -j, where j is the imaginary unit, pointing in the positive vertical direction
of the complex plane \( \mathbb{C} . \) The situation is rather more complicated for
matrices. In fact, we know that some matrices have infinitely many square roots, others have only a finite number of roots
(that we can determine), and some have none at all.
The matrix square root and logarithm are among the most commonly occurring matrix functions,
arising most frequently in the context of symmetric positive definite matrices. The
key roles that the square root plays in, for example, the matrix sign function, the definite generalized
eigenvalue problem, the polar decomposition, and the geometric mean, make it a useful
theoretical and computational tool. We consider matrix square roots in connection with solving second order differential
equations, where the functions \( \cos \left( t\sqrt{\bf A} \right) \) and \( \dfrac{\sin \left( t\sqrt{\bf A} \right)}{\sqrt{\bf A}} \)
occur frequently. Although a square root may not exist for some matrices, and for other matrices there could be
infinitely many square roots, the above functions are uniquely defined for an arbitrary square matrix A.
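To see why, recall the standard Maclaurin expansions (stated here for completeness): both series contain only nonnegative integer powers of A, so no choice of branch of the square root actually enters:
\[
\cos \left( t\sqrt{\bf A} \right) = \sum_{k\ge 0} \frac{(-1)^k t^{2k}}{(2k)!}\, {\bf A}^k , \qquad
\frac{\sin \left( t\sqrt{\bf A} \right)}{\sqrt{\bf A}} = \sum_{k\ge 0} \frac{(-1)^k t^{2k+1}}{(2k+1)!}\, {\bf A}^k .
\]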
Indeed, one infinite family of square roots of the \( 2 \times 2 \) identity matrix
comprises the Householder reflections
\[
{\bf H} (\theta ) = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} , \qquad 0 \le \theta < 2\pi .
\]
We always assume the choice of branches of the square root function is fixed in the neighbourhoods
of the eigenvalues of the matrix of interest, since otherwise the square root is not
even continuous.
Theorem: An arbitrary
n-by-n matrix A has a square root if
and only if in the "ascent sequence" of integers d1,
d2, ... defined by
\[
d_i = \dim \ker \left( {\bf A}^{i} \right) - \dim \ker \left( {\bf A}^{i-1} \right) , \qquad i=1,2,\ldots ,
\]
no two terms are the same odd integer.
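As an illustration of how this criterion can be checked in practice, here is a small Wolfram Language sketch; the helper names ascentSequence and hasSquareRootQ are ours, not taken from the text:
ascentSequence[a_?MatrixQ] := Module[{dims},
  dims = Table[Length[NullSpace[MatrixPower[a, k]]], {k, 0, Length[a]}];
  Differences[dims]]                                  (* d1, d2, ..., dn *)
hasSquareRootQ[a_?MatrixQ] := Module[{odd},
  odd = Select[ascentSequence[a], OddQ];
  Length[odd] == Length[DeleteDuplicates[odd]]]       (* no repeated odd term *)
hasSquareRootQ[{{0, 1}, {0, 0}}]                      (* False: the nilpotent Jordan block *)
hasSquareRootQ[{{0, 1, 0}, {0, 0, 0}, {0, 0, 0}}]     (* True *)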
To see what can happen, consider first a diagonalizable 2-by-2 matrix A with eigenvalues a and b, so that
\( {\bf A} = {\bf S}\,{\bf \Lambda}\,{\bf S}^{-1} \) with \( {\bf \Lambda} = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} . \) Then the matrices
\[
{\bf S} \begin{bmatrix} \pm\sqrt{a} & 0 \\ 0 & \pm\sqrt{b} \end{bmatrix} {\bf S}^{-1}
\]
furnish square roots of A. If \( a\ne 0 \) and \( b \ne 0 , \) then there
are four such square roots. If just one of a and b is zero, there are two such square roots. If
a = b = 0, there is just one. But are these the only square roots?
Now it is easy to check that (with A, S, and Λ as above)
\[
{\bf X}^2 = {\bf A} \qquad \Longleftrightarrow \qquad \left( {\bf S}^{-1} {\bf X}\, {\bf S} \right)^2 = {\bf \Lambda} .
\]
Therefore, it suffices to find all matrices X satisfying \( {\bf X}^2 = {\bf \Lambda} . \) Let
\[
{\bf X} = \begin{bmatrix} x & y \\ z & w \end{bmatrix}
\]
so that from \( {\bf X}^2 = {\bf \Lambda} \) we obtain four simultaneous quadratic equations
\[
x^2 + yz = a, \qquad y\left( x+w\right) = 0, \qquad
z\left( x+w \right) = 0, \qquad w^2 + yz = b .
\]
If \( x+w \ne 0 , \) then we require that \( y=z=0 , \)
which in turn leads to \( x = \pm \sqrt{a} , \quad w= \pm \sqrt{b} , \) and the
matrix roots already found above. If, on the other hand, we have \( x+w =0, \) then we get
\[
x^2 + yz =a, \qquad x^2 + yz =b .
\]
These equations are contradictory if \( a\ne b , \) so for matrices with two distinct
eigenvalues (and hence two linearly independent eigenvectors) we have already found all square roots: there are
exactly four of them when both a and b are nonzero, and just two when one of
a and b is zero.
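As a concrete check, one can let Mathematica solve the four quadratic equations directly; here a = 1 and b = 4 serve as a hypothetical instance:
Solve[{x^2 + y z == 1, y (x + w) == 0, z (x + w) == 0, w^2 + y z == 4}, {x, y, z, w}]
(* exactly four solutions: y -> 0, z -> 0, x -> ±1, w -> ±2 *)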
The second type of matrix to consider is one which has just one eigenvalue but two linearly independent eigenvectors.
Then \( a =b , \quad {\bf \Lambda} = a{\bf I} = {\bf A} , \) and the matrix S
is redundant. In this case the above equations with \( x+w =0 \) yield the consistent equations
\[
x^2 + yz =a, \qquad x^2 + yz =a .
\]
The case y = 0 leads back to the "obvious" square roots already found, and the case
\( y \ne 0 \) gives \( z= (a-x^2 )/y \) and yields
\[
{\bf X} = \begin{bmatrix} x & y \\ \frac{a-x^2}{y} & -x \end{bmatrix} , \qquad y \ne 0,
\]
as a square root of aI for any x and \( y \ne 0. \) In particular,
by taking a = 0 this gives all the additional square roots of the zero matrix.
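A quick symbolic verification of this family (a Wolfram Language sketch):
X = {{x, y}, {(a - x^2)/y, -x}};
Simplify[X . X]     (* {{a, 0}, {0, a}} for any x and any y != 0 *)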
There remains just the third type of matrix to discuss, namely a defective A that has
just one eigenvalue a and only one linearly independent eigenvector. In this case linear algebra tells
us that this matrix is similar to the Jordan block
\[
{\bf J} = \begin{bmatrix} a & 1 \\ 0 & a \end{bmatrix} .
\]
When \( a \ne 0 , \) this Jordan block does have square roots, for instance \( \pm \begin{bmatrix} \sqrt{a} & \frac{1}{2\sqrt{a}} \\ 0 & \sqrt{a} \end{bmatrix} . \)
Finally, if a = 0 (i.e., the defective matrix A has both its determinant and its trace equal to zero),
then J possesses no square root at all (indeed, if \( {\bf X}^2 \) were a nonzero nilpotent matrix, then
\( {\bf X}^4 = {\bf 0} , \) so the 2-by-2 matrix X would itself be nilpotent and \( {\bf X}^2 = {\bf 0} , \) a contradiction). It follows that every 2-by-2 matrix that does not possess
a square root is of the form \( {\bf S} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} {\bf S}^{-1} \) for some nonsingular matrix S.
The situation is more complicated in higher dimensions. It is not true that a non-defective 3-by-3 matrix has
only eight square roots. Let us consider a symmetric positive definite matrix
that has two distinct eigenvalues: \( \lambda =4 \) is simple, and
\( \lambda = 1 \) has multiplicity 2. Matrix A is not defective, so it is
diagonalizable. However, this matrix has more than eight square roots; we list some of them
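The reason is the repeated eigenvalue: any involution of the two-dimensional \( \lambda = 1 \) eigenspace produces a new square root. A Wolfram Language sketch with the diagonal matrix diag(4, 1, 1) as a stand-in (an assumption: it shares the eigenvalues described above but is not the matrix from the text):
A = DiagonalMatrix[{4, 1, 1}];
H[t_] := {{2, 0, 0}, {0, Cos[t], Sin[t]}, {0, Sin[t], -Cos[t]}};   (* Householder block acting on the 1-eigenspace *)
Simplify[H[t] . H[t]]      (* equals A for every t, so there is a continuum of square roots *)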
Next, consider a 2-by-2 matrix that has one eigenvalue \( \lambda = 0 \) of geometric multiplicity 1. The zeroth power of
A is the identity matrix, so its kernel is trivial. The null space of matrix A is
one-dimensional, spanned by the vector \( \langle 1 , 1 \rangle . \) Therefore, the
kernel of A is one-dimensional and the first number is d1 = 1. All higher powers of matrix
A are zero matrices, having null spaces of dimension 2. Therefore, we get the sequence
\( d_1 =1 , \ d_2 =1, \ d_3 = d_4 = \cdots = 0 ; \) since two terms of this sequence are the same odd integer, the given matrix has no square root.
It has one triple eigenvalue \( \lambda = 0 \) of geometric multiplicity 2. The zeroth power of
A is the identity matrix, so its kernel is trivial. The null space of matrix A is a
two-dimensional space spanned by the vectors \( \langle 1 , 0, 0 \rangle \) and
\( \langle 0 , 0, 1 \rangle . \) The minimal polynomial of matrix
A is \( \psi (\lambda ) = \lambda^2 , \) and none of the methods discussed above is able to find a square root of the given matrix.
Now we verify that the conditions of the above theorem are valid for matrix A. Indeed, the
null space of the given matrix is 2-dimensional. Since all higher powers of matrix A are zero matrices,
we get the sequence:
\[
d_1 =2, \ d_2 =1, \ d_3 = d_4 = \cdots = 0.
\]
Therefore, we see that the conditions of the theorem are fulfilled: no two terms of the sequence are the same odd integer. ■
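As a sanity check, here is a short Wolfram Language computation using the nilpotent matrix {{0, 1, 0}, {0, 0, 0}, {0, 0, 0}} as a stand-in consistent with the description above (an assumption, since the matrix itself is not reproduced in this excerpt), together with one explicit square root:
A = {{0, 1, 0}, {0, 0, 0}, {0, 0, 0}};                   (* assumed stand-in *)
Table[Length[NullSpace[MatrixPower[A, k]]], {k, 0, 3}]   (* {0, 2, 3, 3}, so d1 = 2, d2 = 1 *)
X = {{0, 0, 1}, {0, 0, 0}, {0, 1, 0}};
X . X == A                                               (* True *)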
It is instructive to consider the case where A is
a rank-1 n-by-n matrix, so that
\( {\bf A} = {\bf u}\otimes {\bf v} =
{\bf u}\,{\bf v}^{\ast} \) is
an outer product of two n-vectors. In this case (rank(A) = 1), if the only nonzero eigenvalue
\( \lambda = {\bf v}^{\ast} {\bf u} \ne 0 , \) then a function of the rank-1 matrix is the sum of two terms:
\[
f \left( {\bf A} \right) = f(0)\, {\bf I} + \frac{f(\lambda ) - f(0)}{\lambda}\, {\bf A} .
\]
This matrix has one simple eigenvalue \( \lambda =4 \) and a double eigenvalue
\( \lambda = 0 . \) Mathematica confirms:
A = Outer[Times, {1, 1, 1}, {1, 0, 3}] (* or *)
A = {{1, 0, 3}, {1, 0, 3}, {1, 0, 3}}
ss = Eigenvalues[A]
{4, 0, 0}
Eigenvectors[A]
{{1, 1, 1}, {-3, 0, 1}, {0, 1, 0}}
CharacteristicPolynomial[A, x]
4 x^2 - x^3
Since matrix A is diagonalizable, its minimal polynomial is of second degree:
\( \psi (\lambda ) = \lambda \left( \lambda -4 \right) . \)
We seek a square root of A in the form
\[
{\bf A}^{1/2} = b_0 {\bf I} + b_1 {\bf A} ,
\]
where the coefficients are determined from the system of algebraic equations obtained by evaluating the scalar function at the eigenvalues:
\[
\sqrt{0} = b_0 , \qquad \sqrt{4} = b_0 + 4\, b_1 ,
\]
so that \( b_0 = 0 , \ b_1 = \pm \frac{1}{2} , \) and \( {\bf A}^{1/2} = \pm \frac{1}{2}\, {\bf A} . \)
Since \( {\bf A}^2 = 4\,{\bf A} , \) the null spaces of all powers of A coincide, and the ascent sequence becomes \( d_1 =2, \ d_2 = d_3 = \cdots =0 , \) so the theorem also confirms that a square root exists.
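A one-line check of the roots just found (a sketch; A is the rank-1 matrix defined in the Mathematica code above):
A = {{1, 0, 3}, {1, 0, 3}, {1, 0, 3}};
(A/2) . (A/2) == A      (* True, and likewise for -A/2 *)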
Now suppose we want to define the exponential matrix function that corresponds to the scalar function
\( f(\lambda ) = e^{\lambda \,t} . \) Again, the corresponding matrix function is the sum of two terms:
\[
e^{{\bf A}\,t} = {\bf I} + \frac{e^{4t} - 1}{4}\, {\bf A} .
\]
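This two-term formula can be verified directly against Mathematica's built-in MatrixExp (with A the same rank-1 matrix as above):
Simplify[MatrixExp[A t] - (IdentityMatrix[3] + ((Exp[4 t] - 1)/4) A)]
(* {{0, 0, 0}, {0, 0, 0}, {0, 0, 0}} *)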
\(
\left(\begin{array}{c} 2\\ -1\\ 1 \end{array}\right) , \quad
\left(\begin{array}{c} -\frac{1}{3}\\ 3\\ 1 \end{array}\right) , \quad
\left(\begin{array}{c} \frac{1}{2}\\ \frac{3}{2}\\ 1 \end{array}\right)
\)
There are a number of different methods we can use to find the square
roots of matrix A. We will first use the diagonalization
method. To achieve this, we first build the transition matrix of
eigenvectors and its inverse.
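For reference, here is the same diagonalization computation sketched in the Wolfram Language. The matrix A below is an assumption, read off from the products sqrtA*sqrtA displayed in the checks further down; Eigensystem may order and scale the eigenvectors differently from the list above:
A = {{-75, -45, 107}, {252, 154, -351}, {48, 30, -65}};   (* assumed from the squaring checks below *)
{vals, vecs} = Eigensystem[A];                            (* eigenvalues 9, 4, 1 in some order *)
S = Transpose[vecs]; Sinv = Inverse[S];
sqrtA = S . DiagonalMatrix[Sqrt[vals]] . Sinv;            (* the all-positive choice of signs *)
sqrtA . sqrtA == A                                        (* True *)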
\(
\left(\begin{array}{ccc} 1 & 0 & 0\\ 0 & 9 & 0\\ 0 & 0 & 4 \end{array}\right)
\)
Since each of the three distinct eigenvalues admits two scalar square roots, this matrix has a total of 8
square roots. For our purposes we will choose one of the eight roots, as shown
below.
sqrtlambda:=lambda^(1/2)
\(
\left(\begin{array}{ccc} 1 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & 2 \end{array}\right)
\)
We have constructed one of the eight square roots of this positive
definite matrix. The other seven are obtained by changing the
\( \pm \) signs of the diagonal entries of the
sqrtlambda matrix.
sqrtA:=S*sqrtlambda*Sinv
\(
\left(\begin{array}{ccc} -21 & -13 & 31\\ 54 & 34 & -75\\ 6 & 4 & -7
\end{array}\right) \)
We check that the obtained matrix is a square root; squaring it recovers the original matrix A:
sqrtA*sqrtA
\(
\left(\begin{array}{ccc} -75 & -45 & 107\\ 252 & 154 & -351\\ 48 & 30
& -65 \end{array}\right)
\)
Now we find another square root, based on a different choice of signs in the diagonal matrix:
sqrt2:=matrix([[1,0,0],[0,-3,0],[0,0,2]])
\(
\left(\begin{array}{ccc} 1 & 0 & 0\\ 0 & -3 & 0\\ 0 & 0 & 2
\end{array}\right) \)
Using the above diagonal matrix, we build another matrix square root:
sqrtA2:=S*sqrt2*Sinv
\(
\left(\begin{array}{ccc} 9 & 5 & -11\\ -216 & -128 & 303\\ -84 & -50 &
119 \end{array}\right) \)
We also check that this matrix is a square root by multiplying it
by itself:
sqrtA2*sqrtA2
\(
\left(\begin{array}{ccc} -75 & -45 & 107\\ 252 & 154 & -351\\ 48 & 30
& -65 \end{array}\right) \)
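All eight square roots can be generated at once by running over the sign choices; continuing the Wolfram Language sketch above (same A, S, Sinv, vals):
roots = Table[S . DiagonalMatrix[s Sqrt[vals]] . Sinv, {s, Tuples[{1, -1}, 3]}];
And @@ Table[r . r == A, {r, roots}]      (* True: every one of the eight squares to A *)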
We show how to find square roots using Sylvester's method.
Then we find the resolvent, \( {\bf R}_{\lambda} \left( {\bf A} \right) = \left( \lambda {\bf I} - {\bf A} \right)^{-1} : \)
Next we calculate the residues corresponding to the square root. Since matrix A has three distinct
eigenvalues, these residues are multiples of Sylvester's auxiliary matrices.
Since there are three linearly independent eigenvectors, matrix A is diagonalizable. Then its
minimal polynomial is of the second degree: \( \psi (\lambda )= (\lambda -1)(\lambda -4) . \)
The resolvent has \( \psi (\lambda ) \) in its denominator:
R = Simplify[Inverse[lambda*IdentityMatrix[3] - A]]
Matrix Z1 is the projection operator onto the eigenspace corresponding to the eigenvalue \( \lambda =1 , \)
while Z4 is the projection operator onto the eigenspace corresponding to the eigenvalue \( \lambda =4 . \)
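The following Wolfram Language sketch shows how the projections Z1 and Z4, and a square root built from them, can be computed from residues of the resolvent. The matrix A here is a stand-in with eigenvalues 4, 1, 1 (an assumption, since the matrix discussed in the text is not reproduced in this excerpt):
A = {{2, 1, 1}, {1, 2, 1}, {1, 1, 2}};                    (* stand-in SPD matrix with eigenvalues 4, 1, 1 *)
R = Simplify[Inverse[lambda*IdentityMatrix[3] - A]];      (* resolvent *)
Z1 = Simplify[(lambda - 1) R] /. lambda -> 1;             (* residue at lambda = 1: projection onto the 1-eigenspace *)
Z4 = Simplify[(lambda - 4) R] /. lambda -> 4;             (* residue at lambda = 4: projection onto the 4-eigenspace *)
sqrtA = Sqrt[1] Z1 + Sqrt[4] Z4;
sqrtA . sqrtA == A                                        (* True *)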