Motivated by the definitions of matrix functions presented in the previous
sections, we briefly discuss the exciting topic of matrix roots. Again, we do
not treat it in full generality and restrict ourselves to a particular class of
square matrices: positive definite matrices. As
we will see in the next chapter, some physical
problems lead to the matrix differential equation
\( {\bf A}\,\ddot{\bf X} + {\bf B}\,\dot{\bf X} + {\bf C}\,
{\bf X} = {\bf 0} . \) A typical approach to solving this matrix
differential equation is to apply the Laplace transformation. This reduces the
corresponding initial value problem to an algebraic matrix quadratic
equation, which in turn leads to matrix functions involving
square roots. In particular, when B = 0, the solution is
expressed through functions depending on a matrix square root. Although
differential equations use only specific functions involving matrix square roots,
this section goes beyond these functions. The reader is advised to consult the references listed below for a deeper exposition of matrix square roots.
This web page serves as an introduction to the definition of a
function of a square matrix, which may have several meanings. In particular,
for a scalar function f and a square matrix A, we
specify f(A) to be a matrix of the same dimensions as
A. Since our objective is the application of matrix functions to
differential equations, we usually use holomorphic functions (those represented by
convergent power series) of a single variable. Previously, we used
several equivalent definitions of matrix functions for admissible functions,
that is, functions defined on the spectrum of A: the values of f and of
its derivatives up to order m_{i} - 1 exist at every eigenvalue λ_{i} of multiplicity
m_{i}. Many other definitions of a function of
a square matrix are known (see the monograph by
Nicholas J. Higham, born in 1961).
We know that every non-zero complex number has precisely two square roots;
for instance, -1 has the two square roots
j and -j, where j is the unit
vector in the positive vertical direction
on the complex plane ℂ. The situation is more complicated for
matrices. In fact, some matrices have infinitely many square
roots, others have only a finite number of them
(and we can find some of them), while some have none at all.
The matrix square root and logarithm are among the most commonly
occurring matrix functions, arising most frequently in the context of
symmetric positive definite matrices. The key roles that the square root plays
in, for example, the matrix sign function, the definite generalized
eigenvalue problem, the polar decomposition, and the geometric mean, make it a
useful theoretical and computational tool. We consider matrix square roots
in connection with solving second order differential equations, where functions such as
\( \cos \left( \sqrt{\bf A}\,t \right) \) and \( \sin \left( \sqrt{\bf A}\,t \right) / \sqrt{\bf A} \)
occur frequently. Although a square root may not exist for some matrices, and
other matrices have infinitely many square roots, these
functions are uniquely defined
for an arbitrary square matrix A because they are the unique
solutions of certain initial value problems.
Indeed, one infinite family of square roots of the 2 × 2 identity matrix
comprises the Householder reflections
\[
{\bf H}(\theta ) = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} .
\]
Then for any θ, H²(θ) = I, the identity
matrix. We always assume that the choice of branch of the square root
function is fixed in a neighborhood of each eigenvalue of the
matrix of interest, since otherwise the square root is not even
continuous.
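The identity H²(θ) = I can be checked symbolically. The sketch below assumes the standard form of the reflection family, with rows (cos θ, sin θ) and (sin θ, -cos θ); the name H and the variable theta are used only for this illustration.

```mathematica
(* Assumed form of the Householder reflection family *)
H[theta_] := {{Cos[theta], Sin[theta]}, {Sin[theta], -Cos[theta]}}

(* The square should simplify to the 2 x 2 identity matrix for symbolic theta *)
Simplify[H[theta] . H[theta]] == IdentityMatrix[2]
```

Because theta is left symbolic, a result of True covers the whole family at once, which is why the identity matrix has infinitely many square roots.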
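Theorem:
An n-by-n matrix A has a square root if
and only if in the "ascent sequence" of integers d_{1}, d_{2}, ... defined by
\[
d_i = \dim \left( \ker {\bf A}^i \right) - \dim \left( \ker {\bf A}^{i-1} \right) , \qquad i = 1, 2, \ldots ,
\]
no two terms are the same odd integer.
We start with 2-by-2 matrices. Suppose that A is diagonalizable,
\( {\bf A} = {\bf S}\,{\bf \Lambda}\,{\bf S}^{-1} , \) where Λ is the diagonal matrix
with the eigenvalues a and b on its diagonal. Then the matrices
\[
{\bf S} \begin{bmatrix} \pm\sqrt{a} & 0 \\ 0 & \pm\sqrt{b} \end{bmatrix} {\bf S}^{-1}
\]
furnish square roots of A. If \( a\ne 0 \ \& \ b \ne 0 , \) then there
are four such square roots. If just one of a and b is zero, there are two such square roots. If
a = b = 0, there is just one. But are these the only square roots?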
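Now it is easy to check that (with A, S, and Λ as above)
\[
{\bf X}^2 = {\bf A} \qquad \Longleftrightarrow \qquad \left( {\bf S}^{-1} {\bf X}\,{\bf S} \right)^2 = {\bf \Lambda} .
\]
Therefore, it suffices to find all matrices X satisfying \( {\bf X}^2 = {\bf \Lambda} . \) Let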
\[
{\bf X} = \begin{bmatrix} x & y \\ z & w \end{bmatrix}
\]
so that from \( {\bf X}^2 = {\bf \Lambda} \) we obtain four simultaneous quadratic equations
\[
\begin{cases}
x^2 + yz = a, \qquad y\left( x+w\right) = 0, \\
z\left( x+w \right) = 0, \qquad w^2 + yz = b.
\end{cases}
\]
If \( x+w \ne 0 , \) then we require that \( y=z=0 , \)
which in turn leads to \( x = \pm \sqrt{a} , \quad w= \pm \sqrt{b} , \) and the
matrix roots already found above. If, on the other hand, we have \( x+w =0, \) then we get
\[
x^2 + yz =a, \qquad x^2 + yz =b .
\]
These equations are contradictory if \( a\ne b , \) so for matrices with two distinct
eigenvalues (and hence two linearly independent eigenvectors) we have already found all square roots, and we can
state that there are exactly four of them if \( a\ne b \) and both are nonzero, and exactly two if one of
a and b is zero.
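The count of four can be illustrated numerically. The sketch below uses a hypothetical matrix, not one taken from the text, chosen so that its eigenvalues 4 and 1 are distinct and nonzero; the names vals, vecs, S and roots are illustrative.

```mathematica
(* Hypothetical example: trace 5, determinant 4, hence eigenvalues 4 and 1 *)
A = {{2, 2}, {1, 3}};
{vals, vecs} = Eigensystem[A];
S = Transpose[vecs];   (* columns of S are eigenvectors of A *)

(* All four sign choices for the diagonal of the square root of Lambda *)
roots = Flatten[
   Table[S . DiagonalMatrix[{s1, s2}*Sqrt[vals]] . Inverse[S],
     {s1, {1, -1}}, {s2, {1, -1}}], 1];

(* Each candidate squared should reproduce A *)
Map[(# . # == A) &, roots]
```

With exact integer input every comparison should evaluate to True, confirming four distinct square roots for this particular matrix.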
The second type of matrix to consider is one which has just one eigenvalue, but two independent eigenvectors.
Then \( a =b , \quad {\bf \Lambda} = a{\bf I} = {\bf A} \) and the matrix S
is redundant. Then the above equations with \( x+w =0 \) yield the consistent equations
\[
x^2 + yz =a, \qquad x^2 + yz =a .
\]
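The case y = 0 forces \( x = \pm\sqrt{a} \) while z remains arbitrary: taking z = 0 leads back to the
"obvious" square roots already found, and \( z \ne 0 \) gives the lower triangular square roots
\( \begin{bmatrix} x & 0 \\ z & -x \end{bmatrix} \) with \( x^2 = a . \) The cases
\( y \ne 0 \) give \( z= (a-x^2 )/y \) and yield
\[
{\bf X} = \begin{bmatrix} x & y \\ \frac{a-x^2}{y} & -x \end{bmatrix} , \qquad y \ne 0,
\]
as a square root of aI for any x and \( y \ne 0. \) In particular,
by taking a = 0, these two families give all the additional square roots of the zero matrix.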
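The two-parameter family with \( y \ne 0 \) can be verified symbolically. This is a minimal check, with x, y and a left as unassigned symbols and y assumed to be nonzero.

```mathematica
(* Family of square roots of a*I with x + w = 0 and y != 0 *)
X = {{x, y}, {(a - x^2)/y, -x}};

(* Should reduce to {{a, 0}, {0, a}} *)
Simplify[X . X]
```

Since x and y stay free, a single evaluation covers every member of the family.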
There remains just the third type of matrix to discuss, namely a defective A that has
just one eigenvalue a and only one linearly independent eigenvector. In this case, linear algebra tells
us that this matrix is similar to the Jordan form:
\[
{\bf J} = \begin{bmatrix} a & 1 \\ 0 & a \end{bmatrix} .
\]
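As a sketch of how the same four quadratic equations behave for this type, assume that the Jordan form is the block J with a on the diagonal and 1 above it, and that \( a \ne 0 . \) With X having the entries x, y, z, w as before, the equation \( {\bf X}^2 = {\bf J} \) reads
\[
x^2 + yz = a, \qquad y\left( x+w \right) = 1, \qquad z\left( x+w \right) = 0, \qquad w^2 + yz = a .
\]
The second equation forces \( x+w \ne 0 , \) hence z = 0, \( x = w = \pm\sqrt{a} , \) and \( y = 1/(2x) . \) Under these assumptions there are exactly two square roots:
\[
{\bf X} = \pm \begin{bmatrix} \sqrt{a} & \frac{1}{2\sqrt{a}} \\ 0 & \sqrt{a} \end{bmatrix} .
\]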
Finally, if a = 0 (i.e., both the determinant and the trace of the defective matrix A are zero),
then J possesses no square root at all. It follows that every 2-by-2 matrix that does not possess
a square root is of the form
\[
{\bf S} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} {\bf S}^{-1}
\]
for some invertible matrix S.
As an example, consider a 2-by-2 matrix A
that has one eigenvalue λ = 0 of geometric multiplicity 1. The zero
power of A is the identity matrix, so its kernel is trivial. The
null space of the matrix A is one-dimensional and is spanned by
the vector \( \langle 1 , 1 \rangle . \) Therefore, the
first number is d_{1} =1. All higher powers of
A are zero matrices, whose null spaces have dimension 2. Therefore, we get the sequence
\( d_1 =1 , \ d_2 =1, \ d_3 = d_4 = \cdots = 0 . \) Since two terms are the same odd integer, the given matrix has no square root.
■
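The ascent sequence is easy to compute as successive differences of the dimensions of the null spaces of the powers of A. The sketch below uses a hypothetical matrix that matches the description above (nilpotent, with null space spanned by ⟨1, 1⟩); it is an assumed instance, not necessarily the matrix originally displayed, and ascentSequence is an illustrative helper rather than a built-in function.

```mathematica
(* d_1, ..., d_k as differences of the nullities of A^0, A^1, ..., A^k;
   the nullity of A^0 = I is 0, so 0 is prepended by hand *)
ascentSequence[A_, k_] :=
  Differences[Prepend[Table[Length[NullSpace[MatrixPower[A, i]]], {i, 1, k}], 0]]

(* Assumed instance: A.A is the zero matrix and A.{1, 1} = {0, 0} *)
A = {{1, -1}, {1, -1}};
ascentSequence[A, 4]   (* expected: {1, 1, 0, 0} *)
```

The repeated odd term 1 is exactly what the theorem rules out for a matrix with a square root.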
The situation is more complicated in higher dimensions. It is not true
that every non-defective 3-by-3 matrix has at most eight square roots. Let us consider a
symmetric positive definite matrix A
that has two distinct eigenvalues: λ = 4 is simple, and
λ = 1 has multiplicity 2. The matrix A is not defective,
so it is diagonalizable. However, this matrix has more than eight square roots.
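A sketch of why the count exceeds eight, using only the stated spectrum: write \( {\bf A} = {\bf S}\,\mbox{diag}(4, 1, 1)\,{\bf S}^{-1} \) for some invertible matrix S (not specified here). The 2-by-2 identity block belonging to λ = 1 has the reflections with rows (cos θ, sin θ) and (sin θ, -cos θ) as square roots, so for every angle θ
\[
{\bf X}(\theta ) = {\bf S} \begin{bmatrix} \pm 2 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & \sin\theta & -\cos\theta \end{bmatrix} {\bf S}^{-1} , \qquad {\bf X}^2 (\theta ) = {\bf S}\,\mbox{diag}(4, 1, 1)\,{\bf S}^{-1} = {\bf A} .
\]
Hence the repeated eigenvalue produces an infinite family of square roots in addition to the eight obtained from the sign choices \( \pm 2, \ \pm 1, \ \pm 1 . \)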
Consider next a 3-by-3 matrix A that has one triple eigenvalue \( \lambda = 0 \) of geometric multiplicity 2. The zero power of
A is the identity matrix, so its kernel is trivial. The null space of the matrix A is a
two-dimensional space spanned by the vectors \( \langle 1 , 0, 0 \rangle \) and
\( \langle 0 , 0, 1 \rangle . \) The minimal polynomial of the matrix
A is \( \psi (\lambda ) = \lambda^2 , \) and none of the methods discussed so far
is able to find a square root of the given matrix.
Now we verify that the conditions of the above theorem hold for the matrix A. Indeed, the
null space of the given matrix is 2-dimensional, so d_{1} = 2. Since A² and all higher powers of A are zero matrices,
whose null spaces are 3-dimensional, we get the sequence:
\[
d_1 =2, \ d_2 =1, \ d_3 = d_4 = \cdots = 0.
\]
Therefore, we see that the conditions of the theorem are fulfilled,
and matrix A has a square root. ■
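The theorem guarantees existence but does not produce a root. As an illustration, the sketch below takes a hypothetical nilpotent matrix with the properties described above (null space spanned by ⟨1, 0, 0⟩ and ⟨0, 0, 1⟩, minimal polynomial λ²) together with one candidate square root; both matrices are assumptions made for this example.

```mathematica
(* Assumed instance: A.A is the zero matrix, and the nullity of A is 2 *)
A = {{0, 1, 0}, {0, 0, 0}, {0, 0, 0}};

(* Candidate square root, itself nilpotent of index 3 *)
X = {{0, 0, 1}, {0, 0, 0}, {0, 1, 0}};

{X . X == A, Length[NullSpace[A]]}   (* expected: {True, 2} *)
```

The candidate X is not a polynomial in A, since every polynomial in A has the form b0 I + b1 A and its square cannot equal A. This is consistent with the remark that the methods discussed earlier cannot find such a root.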
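It is instructive to consider the case where A is
a rank-1 n-by-n matrix, so that
\( {\bf A} = {\bf u}\otimes {\bf v} =
{\bf u}\,{\bf v}^{\ast} \) is
an outer product of two n-vectors (actually, they
are n×1 matrices). In this case (rank(A) = 1),
a function of the rank-1 matrix is the sum of two terms:
\[
f({\bf A}) = f(0)\,{\bf I} + \frac{f\left( {\bf v}^{\ast}{\bf u} \right) - f(0)}{{\bf v}^{\ast}{\bf u}}\,{\bf u}\,{\bf v}^{\ast} , \qquad {\bf v}^{\ast}{\bf u} \ne 0 .
\]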
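A short justification of the two-term structure, sketched under the assumptions that f is given by a convergent power series \( f(\lambda ) = \sum_{k\ge 0} c_k \lambda^k \) and that \( {\bf v}^{\ast}{\bf u} \ne 0 : \)
\[
{\bf A}^2 = {\bf u} \left( {\bf v}^{\ast}{\bf u} \right) {\bf v}^{\ast} = \left( {\bf v}^{\ast}{\bf u} \right) {\bf A} \qquad \Longrightarrow \qquad {\bf A}^k = \left( {\bf v}^{\ast}{\bf u} \right)^{k-1} {\bf A} , \quad k \ge 1,
\]
\[
f({\bf A}) = c_0 {\bf I} + \sum_{k\ge 1} c_k \left( {\bf v}^{\ast}{\bf u} \right)^{k-1} {\bf A} = f(0)\,{\bf I} + \frac{f\left( {\bf v}^{\ast}{\bf u} \right) - f(0)}{{\bf v}^{\ast}{\bf u}}\,{\bf A} .
\]
The square root is not analytic at the origin, so for it the formula is used as a definition and the result has to be verified directly.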
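As an example, consider the rank-1 matrix \( {\bf A} = {\bf u}\,{\bf v}^{\ast} \) with
\( {\bf u} = \langle 1, 1, 1 \rangle \) and \( {\bf v} = \langle 1, 0, 3 \rangle , \) so that \( {\bf v}^{\ast}{\bf u} = 4 . \)
This matrix has one simple eigenvalue \( \lambda =4 \) and a double eigenvalue
\( \lambda = 0 . \) Mathematica confirms: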
A = Outer[Times, {1, 1, 1}, {1, 0, 3}]   (* same as A = {{1, 0, 3}, {1, 0, 3}, {1, 0, 3}} *)
ss = Eigenvalues[A]   (* {4, 0, 0} *)
Eigenvectors[A]   (* {{1, 1, 1}, {-3, 0, 1}, {0, 1, 0}} *)
CharacteristicPolynomial[A, x]   (* 4 x^2 - x^3 *)
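Since the matrix A is diagonalizable, its minimal polynomial is of second degree:
\( \psi (\lambda ) = \lambda \left( \lambda -4 \right) . \)
We seek a square root of A in the form
\[
{\bf A}^{1/2} = b_0 {\bf I} + b_1 {\bf A} ,
\]
where the coefficients are determined from the system of algebraic equations
\[
b_0 = \sqrt{0} = 0, \qquad b_0 + 4\, b_1 = \sqrt{4} = 2 ,
\]
so that \( b_0 = 0, \ b_1 = 1/2 , \) and we obtain the square root
\( {\bf R}_{+} = \frac{1}{2}\, {\bf A} . \)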
For the other branch of the square root, we have \(
f\left( {\bf v}^{\ast}{\bf u} \right) = -\sqrt{4} = -2 . \)
So another square root of the matrix A
becomes \( {\bf R}_{-} = -\frac{1}{2}\, {\bf A}
. \) From the obvious relation A² = 4 A, it
follows that
\[
\left( \pm \frac{1}{2}\, {\bf A} \right)^2 = \frac{1}{4}\, {\bf A}^2 = {\bf A} .
\]
Hence, both matrices R_{+} and R_{-} are
square roots of A.
From the above relation, we conclude that the null spaces of all
positive powers of A are the same and the sequence becomes
\( d_1 =2, \ d_2 = d_3 = \cdots =0 . \)
Application of the Theorem confirms that the matrix A has a square
root.
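Both claims can be checked directly for the matrix A = {{1, 0, 3}, {1, 0, 3}, {1, 0, 3}} defined above. The names Rplus and Rminus are illustrative stand-ins for the two roots A/2 and -A/2.

```mathematica
A = {{1, 0, 3}, {1, 0, 3}, {1, 0, 3}};
Rplus = A/2;
Rminus = -A/2;

(* Both candidates squared should give back A *)
{Rplus . Rplus == A, Rminus . Rminus == A}   (* expected: {True, True} *)

(* Nullities of A, A^2, A^3 coincide, so d_1 = 2 and d_2 = d_3 = 0 *)
Table[Length[NullSpace[MatrixPower[A, i]]], {i, 1, 3}]   (* expected: {2, 2, 2} *)
```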
Now suppose we want to define the exponential matrix function that corresponds to the scalar function
\( f(\lambda ) = e^{\lambda \,t} . \) Again, the corresponding matrix function is the sum of two terms:
\[
e^{{\bf A}\,t} = {\bf I} + \frac{e^{4t} - 1}{4}\, {\bf A} .
\]
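For the rank-1 matrix A = {{1, 0, 3}, {1, 0, 3}, {1, 0, 3}} with \( {\bf v}^{\ast}{\bf u} = 4 , \) f(0) = 1 and \( f(4) = e^{4t} , \) the two-term formula suggests the closed form \( {\bf I} + \frac{1}{4} \left( e^{4t} - 1 \right) {\bf A} . \) A minimal check against the built-in MatrixExp, with closedForm as an illustrative name:

```mathematica
A = {{1, 0, 3}, {1, 0, 3}, {1, 0, 3}};
closedForm = IdentityMatrix[3] + (Exp[4 t] - 1)/4*A;

(* The difference should simplify to the 3 x 3 zero matrix *)
Simplify[MatrixExp[A*t] - closedForm]
```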
Next we calculate residues, corresponding to the square root. Since matrix A has three distinct
eigenvalues, these residues are multiples of Sylvester's auxiliary matrices.
Since there are three linearly independent eigenvectors, the
matrix A is diagonalizable. Because it has only the two distinct eigenvalues 1 and 4, its minimal
polynomial must be of the second degree: \( \psi (\lambda )= (\lambda -1)(\lambda -4) . \)
The resolvent \( {\bf R}_{\lambda} ({\bf A}) =
\left( \lambda{\bf I} - {\bf A} \right)^{-1} \) of A,
when presented as a ratio of two polynomials without common factors, will contain
the minimal polynomial ψ in the denominator.
R = Simplify[Inverse[lambda*IdentityMatrix[3] - A]]
The matrix Z1 is the projection operator onto the eigenspace corresponding to the eigenvalue \( \lambda =1 , \)
while Z4 is the projection operator onto the eigenspace corresponding to the eigenvalue \( \lambda =4 . \)
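For a diagonalizable matrix with minimal polynomial \( \psi (\lambda )= (\lambda -1)(\lambda -4) , \) the two projections can be written with Sylvester's formula. The following is a sketch under that assumption only; the symbols \( {\bf Z}_1 , \ {\bf Z}_4 \) stand for Z1 and Z4:
\[
{\bf Z}_1 = \frac{{\bf A} - 4\,{\bf I}}{1-4} , \qquad {\bf Z}_4 = \frac{{\bf A} - {\bf I}}{4-1} , \qquad f({\bf A}) = f(1)\,{\bf Z}_1 + f(4)\,{\bf Z}_4 .
\]
Choosing the branches \( \sqrt{1} = 1 \) and \( \sqrt{4} = 2 \) gives
\[
{\bf A}^{1/2} = {\bf Z}_1 + 2\,{\bf Z}_4 = \frac{1}{3} \left( {\bf A} + 2\,{\bf I} \right) ,
\]
and the relation \( {\bf A}^2 = 5\,{\bf A} - 4\,{\bf I} , \) which follows from \( \psi ({\bf A}) = {\bf 0} , \) confirms that \( \frac{1}{9} \left( {\bf A} + 2\,{\bf I} \right)^2 = \frac{1}{9} \left( {\bf A}^2 + 4\,{\bf A} + 4\,{\bf I} \right) = {\bf A} . \) The other sign choices \( \pm {\bf Z}_1 \pm 2\,{\bf Z}_4 \) give three more square roots of the same polynomial type.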
Now we check the conditions under which the square root exists. Since the
determinant of A is 1, any power of A is again a
unimodular matrix. So the null spaces of the matrix A and of all its
powers are trivial. Therefore, the sequence d_{i} in the
Theorem consists of all zeroes, and the given matrix has a square
root.
Higham, Nicholas J., Functions of Matrices: Theory and Computation, SIAM, 2008.
Konvalina, J., A combinatorial formula for powers of 2×2 matrices, Mathematics Magazine, 2015, Vol. 88, Issue 4, pp. 280--284. https://doi.org/10.4169/math.mag.88.4.280
Levinger, B., The square root of a 2 × 2 matrix, Mathematics Magazine, 1980, Vol. 53, Issue 4, pp. 222--224. https://doi.org/10.1080/0025570X.1980.11976858
MacKinnon, N., Four routes to matrix roots, The Mathematical Gazette, 1989, Vol. 73, pp. 135--136.
Scott, H.H., On square-rooting matrices, The Mathematical Gazette, 1990, Vol. 74, pp. 111--114.
Somayya, P.C., A method for finding a square root of a 2 × 2 matrix, The Mathematics Education, 1997, Vol. XXXI, No. 1, Bihar State, India.
Yandl, A.L. and Swenson, C., A class of matrices with zero determinant, Mathematics Magazine, 2012, Vol. 85, Issue 2, pp. 126--130. https://doi.org/10.4169/math.mag.85.2.126