Square Roots

It is known that some matrices have infinitely many square roots, some nilpotent matrices have no square root at all, and for some matrices a certain finite number of roots can be built according to the algorithms discussed previously. Necessary and sufficient conditions for the existence of a square root of a matrix can be found in Higham's monograph.

Theorem: A square n × n matrix A has a square root if and only if in the "ascent sequence" of integers \( d_1 , d_2 , d_3 , \ldots , \) defined by

\[ d_k = \dim \left( \mbox{ker} \left( {\bf A}^k \right) \right) - \dim \left( \mbox{ker} \left( {\bf A}^{k-1} \right) \right) , \qquad k=1,2,\dots , \]
no two terms are the same odd integer.     ■
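The ascent sequence is easy to compute in Mathematica. The following sketch uses a 2 × 2 nilpotent Jordan block of our own choosing (not taken from the text) to show a matrix whose sequence repeats the odd integer 1 and therefore has no square root:
nil = {{0, 1}, {0, 0}}    (* nilpotent Jordan block *)
d[k_] := Length[NullSpace[MatrixPower[nil, k]]] - Length[NullSpace[MatrixPower[nil, k - 1]]]
Table[d[k], {k, 1, 3}]
{1, 1, 0}
Since the value 1 appears twice in the sequence, the theorem confirms that this nilpotent matrix has no square root.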
  1. Higham, Nicholas, Functions of Matrices: Theory and Computation, SIAM, 2008, https://doi.org/10.1137/1.9780898717778
  2. G.W. Cross, P. Lancaster, Square roots of complex matrices, Linear and Multilinear Algebra, 1, (1974), pp. 289--293.
  3. Åke Björck, Sven Hammarling, A Schur method for the square root of a matrix, Linear Algebra and its Applications, 52-53, July 1983, pp. 127--140.
  4. Roger A. Horn, Charles R. Johnson, Matrix Analysis, Cambridge University Press, 2012.
Although a square root can be determined for the majority of square matrices, most applications call for matrices with nonnegative eigenvalues. It is natural to call such matrices positive semi-definite, or positive definite when all eigenvalues are strictly positive. Several definitions of positive definite/semi-definite matrices are in use, and all of them are equivalent. However, we start with a definition that differs slightly from the one universally accepted by mathematicians. Positive definite matrices are usually required to be symmetric or self-adjoint, but we drop this condition because it is not needed for our applications in engineering: the notion is usually applied to the product \( {\bf A}^{\mathrm T} {\bf A} \) or \( {\bf A}^{\ast} {\bf A} , \) which is a symmetric or self-adjoint matrix anyway.

A square matrix A is called positive definite if all its eigenvalues are positive. Similarly, A is called positive semi-definite if all its eigenvalues are nonnegative.


Theorem: When a matrix A has linearly independent columns (that is, full column rank), the product \( {\bf A}^{\mathrm T} {\bf A} \) is a square, symmetric, positive definite, and invertible matrix. If A does not have full column rank, then \( {\bf A}^{\mathrm T} {\bf A} \) is a square, symmetric, and positive semi-definite matrix.
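As a quick illustration (the full-column-rank 3 × 2 matrix below is our own choice, not taken from the text), Mathematica confirms that \( {\bf A}^{\mathrm T} {\bf A} \) has positive eigenvalues:
A = {{1, 0}, {1, 1}, {0, 1}}    (* columns are linearly independent *)
Transpose[A].A
{{2, 1}, {1, 2}}
Eigenvalues[%]
{3, 1}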

Example: Consider two 3 × 3 matrices
\[ {\bf A} = \begin{bmatrix} 11 &11&-15 \\ 20&59&-90 \\ 14&37&-56 \end{bmatrix} \qquad\mbox{and} \qquad {\bf B} = \begin{bmatrix} 2 &-1&4 \\ 2&3&4 \\ 1&2&2 \end{bmatrix} . \]
Matrix A is positive definite because its eigenvalues are \( \lambda = 1, 4, 9. \) On the other hand, matrix B is positive semi-definite because its eigenvalues are \( \lambda = 6, 1, 0 . \)

Since each of these matrices is diagonalizable (each has three distinct eigenvalues), we can use Sylvester's method. First, we calculate Sylvester's auxiliary matrices for A:

\begin{align*} {\bf Z}_1 &= \frac{\left( {\bf A} - 9\,{\bf I} \right) \left( {\bf A} - 4\,{\bf I} \right)}{(1-9)(1-4)} = \begin{bmatrix} 1&3&-5 \\ -5&-15&25 \\ -3&-9&15 \end{bmatrix} , \\ {\bf Z}_4 &= \frac{\left( {\bf A} - 9\,{\bf I} \right) \left( {\bf A} - {\bf I} \right)}{(4-9)(4-1)} = \begin{bmatrix} -2&-7&11 \\ 4&14&-22 \\ 2&7& -11 \end{bmatrix} , \\ {\bf Z}_9 &= \frac{\left( {\bf A} - {\bf I} \right) \left( {\bf A} - 4\,{\bf I} \right)}{(9-1)(9-4)} = \begin{bmatrix} 2&4&-6 \\ 1&2&-3 \\ 1&2&-3 \end{bmatrix} . \end{align*}
A = {{11, 11, -15}, {20, 59, -90}, {14, 37, -56}}
Eigenvalues[A]
{9, 4, 1}
Z1 = (A - 9*IdentityMatrix[3]).(A - 4*IdentityMatrix[3])/8/3
{{1, 3, -5}, {-5, -15, 25}, {-3, -9, 15}}
Z4 = -(A - 9*IdentityMatrix[3]).(A - IdentityMatrix[3])/5/3
{{-2, -7, 11}, {4, 14, -22}, {2, 7, -11}}
Z9 = (A - 4*IdentityMatrix[3]).(A - IdentityMatrix[3])/5/8
{{2, 4, -6}, {1, 2, -3}, {1, 2, -3}}
Each of Sylvester's auxiliary matrices is a projector:
\[ {\bf Z}_1^2 = {\bf Z}_1 , \quad {\bf Z}_4^2 = {\bf Z}_4 , \quad {\bf Z}_9^2 = {\bf Z}_9 . \]
This yields eight square roots, half of which are the negatives of the other half:
\[ \pm {\bf Z}_1 \pm 2\,{\bf Z}_4 \pm 3\,{\bf Z}_9 . \]
In particular,
\[ {\bf R} = {\bf Z}_1 + 2\,{\bf Z}_4 + 3\,{\bf Z}_9 = \begin{bmatrix} 3&1&-1 \\ 6&19&-28 \\ 4&11&-16 \end{bmatrix} . \]
It is not hard to verify that \( {\bf R}^2 = {\bf A} . \)
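Continuing the Mathematica session above, we can let the computer do this verification (a small check added here for completeness):
R = Z1 + 2*Z4 + 3*Z9
{{3, 1, -1}, {6, 19, -28}, {4, 11, -16}}
R.R == A
True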

Now we consider the positive semi-definite matrix B. Its Sylvester auxiliary matrices are

\begin{align*} {\bf Z}_6 &= \frac{\left( {\bf B} - 0\,{\bf I} \right) \left( {\bf B} - {\bf I} \right)}{(6-0)(6-1)} = \frac{1}{30}\begin{bmatrix} 4&4&8 \\ 12&12&24 \\ 7&7&14 \end{bmatrix} , \\ {\bf Z}_1 &= \frac{{\bf B} \left( {\bf B} - 6\,{\bf I} \right)}{(1-6)(1-0)} = \frac{1}{5} \begin{bmatrix} 6&-9&12 \\ -2&3&-4 \\ -2&3&-4 \end{bmatrix} , \\ {\bf Z}_0 &= \frac{\left( {\bf B} - 6\,{\bf I} \right) \left( {\bf B} - {\bf I} \right)}{(0-6)(0-1)} = \frac{1}{6} \begin{bmatrix} -2&10&-16 \\ 0&0&0 \\ 1&-5&8 \end{bmatrix} . \end{align*}
Each of Sylvester's auxiliary matrices is a projector:
\[ {\bf Z}_6^2 = {\bf Z}_6 , \quad {\bf Z}_1^2 = {\bf Z}_1 , \quad {\bf Z}_0^2 = {\bf Z}_0 . \]
Since the zero eigenvalue contributes nothing, this yields four square roots, half of which are the negatives of the other half:
\[ \pm \sqrt{6} {\bf Z}_6 \pm {\bf Z}_1 . \]
In particular,
\[ {\bf R} = \sqrt{6}\, {\bf Z}_6 + {\bf Z}_1 = \frac{1}{15} \begin{bmatrix} 2 \left( 9 + \sqrt{6} \right)&-27 + 2\sqrt{6}&4 \left( 9 + \sqrt{6} \right) \\ 6\left( -1 + \sqrt{6} \right)&3 \left( 3 + 2\sqrt{6} \right)&12 \left( -1 + \sqrt{6} \right) \\ 7\sqrt{6} /2 - 6&7\sqrt{6} /2 + 9&-12 + 7\sqrt{6} \end{bmatrix} . \]
It is not hard to verify that \( {\bf R}^2 = {\bf B} , \) as the short Mathematica check below confirms.
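The following sketch redefines B and the projectors from scratch (the variable names Z6 and Z1B are our own):
B = {{2, -1, 4}, {2, 3, 4}, {1, 2, 2}}
Z6 = B.(B - IdentityMatrix[3])/30;
Z1B = B.(B - 6*IdentityMatrix[3])/(-5);
R = Sqrt[6]*Z6 + Z1B;
Simplify[R.R - B]
{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}
Mathematica returns the zero matrix, so \( {\bf R}^2 = {\bf B} \) indeed.    ■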

Example: Consider the matrix
\[ {\bf A} = \begin{bmatrix} 7+ 2\,{\bf j} & 2 + 2\,{\bf j} \\ 2 - 3\,{\bf j} & 6 -2\,{\bf j} \end{bmatrix} \]
that has two positive eigenvalues λ = 4 and λ = 9. To find its four square roots, we use Sylvester's method by calculating the auxiliary matrices:
\begin{align*} {\bf Z}_4 &= \frac{{\bf A} - 9\,{\bf I}_2 }{4-9} = \frac{1}{5}\begin{bmatrix} 2-2{\bf j}&-2-2{\bf j} \\ -2+3{\bf j}&3+2{\bf j} \end{bmatrix} , \\ {\bf Z}_9 &= \frac{{\bf A} - 4\,{\bf I}_2}{9-4} = \frac{1}{5} \begin{bmatrix} 3+2{\bf j}&2+2{\bf j} \\ 2-3{\bf j}&2-2{\bf j} \end{bmatrix} . \end{align*}
This yields four square roots, half of which are the negatives of the other half:
\[ \pm 2 {\bf Z}_4 \pm 3{\bf Z}_9 . \]
In particular,
\[ {\bf R} = 2{\bf Z}_4 + 3{\bf Z}_9 = \frac{1}{5} \begin{bmatrix} 13+2{\bf j}&2+2{\bf j} \\ 2-3{\bf j}&12-2{\bf j} \end{bmatrix} . \]
This matrix has two positive eigenvalues, \( \lambda = 3 \) and \( \lambda = 2 , \) and satisfies \( {\bf R}^2 = {\bf A} , \) as the Mathematica check below confirms.
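A minimal check (written with Mathematica's imaginary unit I in place of the text's j):
A = {{7 + 2 I, 2 + 2 I}, {2 - 3 I, 6 - 2 I}}
Z4 = (A - 9*IdentityMatrix[2])/(4 - 9);
Z9 = (A - 4*IdentityMatrix[2])/(9 - 4);
R = 2*Z4 + 3*Z9;
Simplify[R.R - A]
{{0, 0}, {0, 0}}
Sort[Eigenvalues[R]]
{2, 3}
The zero matrix confirms that R is indeed a square root of A.    ■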

Example: Consider the 3 × 2 matrix:
\[ {\bf A} = \begin{bmatrix} 1&7 \\ 1&7 \\ 0&0 \end{bmatrix} \qquad \Longrightarrow \qquad {\bf A}^{\mathrm T} = \begin{bmatrix} 1&1&0 \\ 7&7&0 \end{bmatrix} . \]
Calculations provided by Mathematica show
\[ {\bf B} = {\bf A}^{\mathrm T} {\bf A} = \begin{bmatrix} 2&14 \\ 14& 98 \end{bmatrix} . \]
A = {{1, 7}, {1, 7}, {0, 0}}
B = Transpose[A].A
{{2, 14}, {14, 98}}
Eigenvalues[%]
{100, 0}
Therefore, the product \( {\bf B} = {\bf A}^{\mathrm T} {\bf A} \) is a positive semi-definite 2 × 2 matrix with eigenvalues λ = 0, 100. Its Sylvester auxiliary matrices are
\begin{align*} {\bf Z}_0 &= \frac{{\bf B} - 100\,{\bf I}}{0-100} = \frac{1}{50} \begin{bmatrix} 49&-7 \\ -7& 1 \end{bmatrix} , \\ {\bf Z}_{100} &= \frac{{\bf B}}{100-0} = \frac{1}{100} \begin{bmatrix} 2&14 \\ 14& 98 \end{bmatrix} . \end{align*}
Each of these matrices is a projector:
\[ {\bf Z}_0^2 = {\bf Z}_0 \qquad\mbox{and} \qquad {\bf Z}_{100}^2 = {\bf Z}_{100} . \]
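This, too, is easy to confirm in Mathematica (continuing the session above; the variable names Z0 and Z100 are our own):
Z0 = (100*IdentityMatrix[2] - B)/100;
Z100 = B/100;
{Z0.Z0 == Z0, Z100.Z100 == Z100}
{True, True}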
Now we are ready to determine square roots:
\[ {\bf R}_{\pm} = \pm 10\,{\bf Z}_{100} = \pm \frac{1}{10} \begin{bmatrix} 2&14 \\ 14& 98 \end{bmatrix} . \]
Indeed,
R = B/10
R.R-B
{{0, 0}, {0, 0}}

Example: Consider the 3 × 3 matrix:
\[ {\bf A} = \begin{bmatrix} 12&7&-16 \\ 1&2&4 \\ 1&-2&8 \end{bmatrix} . \]
It has one defective double eigenvalue λ = 9 and one simple eigenvalue λ = 4. Its minimal polynomial is the same as the characteristic polynomial \( \chi (\lambda ) = \left( \lambda -9 \right)^2 (\lambda -4) . \) So we calculate the resolvent
\[ {\bf R}_{\lambda} \left( {\bf A} \right) = \left( \lambda {\bf I} - {\bf A} \right)^{-1} = \frac{1}{\left( \lambda -9 \right)^2 (\lambda -4)} \begin{bmatrix} \lambda^2 -10\lambda + 24&7\lambda -24&60-16\lambda \\ \lambda -4&\lambda^2 -20\lambda + 112&4\lambda -64 \\ \lambda -4&31 -2\lambda&\lambda^2 -14\lambda +17 \end{bmatrix} . \]
Then square roots can be found from the formula (taking either sign for each \( \sqrt{\lambda} \) produces four roots in total):
\[ \lim_{\lambda \to 9} \, \frac{\text d}{{\text d} \lambda} \,{\bf R}_{\lambda} \left( {\bf A} \right) \left( \lambda -9 \right)^2 \sqrt{\lambda} + \lim_{\lambda \to 4} \, {\bf R}_{\lambda} \left( {\bf A} \right) \left( \lambda -4 \right) \sqrt{\lambda} . \]
Mathematica comes to the rescue with the calculations:
A = {{12, 7, -16}, {1, 2, 4}, {1, -2, 8}}
R[s_] = Inverse[s*IdentityMatrix[3] - A]
r1 = D[Simplify[R[s]*(s - 9)^2]*Sqrt[s], s] /. s -> 9
{{7/2, 41/50, -(58/25)}, {1/6, -(349/150), 362/75}, {1/6, -(349/150), 362/75}}
r2 = Simplify[(s - 4)*R[s]]*2 /. s -> 4
{{0, 8/25, -(8/25)}, {0, 96/25, -(96/25)}, {0, 46/25, -(46/25)}}
Now we find two roots by adding and subtracting these two residues:
rr = r1 + r2
{{7/2, 57/50, -(66/25)}, {1/6, 227/150, 74/75}, {1/6, -(73/150), 224/75}}
rr2 = r1 - r2
{{7/2, 1/2, -2}, {1/6, -(37/6), 26/3}, {1/6, -(25/6), 20/3}}
rr2.rr2
{{12, 7, -16}, {1, 2, 4}, {1, -2, 8}}
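For completeness, the other combination is also a square root (this extra check is our own addition):
rr.rr == A
True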
So we found two roots:
\[ \frac{1}{150} \begin{bmatrix} 525 & 171 & -396 \\ 25&227&148 \\ 25&-73&448 \end{bmatrix} \qquad\mbox{and} \qquad \frac{1}{6} \begin{bmatrix} 21&3&-12 \\ 1&-37 &52 \\ 1&-25&40 \end{bmatrix} . \]
The two other roots are just the negatives of these two.    ■

 
