Matrix Norms


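The set \( \mathcal{M}_{m,n} \) of all m × n matrices over the field of either real or complex numbers is a vector space of dimension m · n. In order to determine how close two matrices are, and in order to define the convergence of sequences of matrices, a special concept of matrix norm is employed, with notation \( \| {\bf A} \| . \) A norm is a function from a real or complex vector space to the nonnegative real numbers that satisfies the following conditions:

  • Positivity: \( \| {\bf A} \| \ge 0 , \) and \( \| {\bf A} \| = 0 \) if and only if \( {\bf A} = {\bf 0} . \)
  • Homogeneity: \( \| k\,{\bf A} \| = |k| \, \| {\bf A} \| \) for every scalar \( k . \)
  • Triangle inequality: \( \| {\bf A} + {\bf B} \| \le \| {\bf A} \| + \| {\bf B} \| . \)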

Since the set of all matrices admits the operation of multiplication in addition to the basic operation of addition (which is included in the definition of vector spaces), it is natural to require that a matrix norm satisfy the special property:
\[ \| {\bf A} \, {\bf B} \| \le \| {\bf A} \| \, \| {\bf B} \| . \]
Once a norm is defined, the most natural way to measure the distance between two matrices A and B is d(A, B) = ‖A − B‖ = ‖B − A‖. However, not all distance functions have a corresponding norm. For example, a trivial distance that has no equivalent norm is d(A, A) = 0 and d(A, B) = 1 if A ≠ B.

The norm of a matrix may be thought of as its magnitude or length because it is a nonnegative number. The definitions of the most common matrix norms are summarized below for an \( m \times n \) matrix A, to which corresponds a self-adjoint (m+n)×(m+n) matrix B:

\[ {\bf A} = \left[ \begin{array}{cccc} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{array} \right] \qquad \Longrightarrow \qquad {\bf B} = \begin{bmatrix} {\bf 0} & {\bf A}^{\ast} \\ {\bf A} & {\bf 0} \end{bmatrix} . \]
Here A* denotes the adjoint matrix: \( {\bf A}^{\ast} = \overline{{\bf A}^{\mathrm T}} = \overline{\bf A}^{\mathrm T} . \)
For a rectangular m-by-n matrix A and given norms \( \| \ \| \) in \( \mathbb{R}^n \mbox{ and } \mathbb{R}^m , \) the norm of A is defined as follows:
\begin{equation} \label{EqBasic.2} \| {\bf A} \| = \sup_{{\bf x} \ne {\bf 0}} \ \dfrac{\| {\bf A}\,{\bf x} \|_m}{\| {\bf x} \|_n} = \sup_{\| {\bf x} \| = 1} \ \| {\bf A}\,{\bf x} \| . \end{equation}
This matrix norm is called the operator norm or induced norm.
The term "induced" refers to the fact that the definition of a norm for vectors such as A x and x is what enables the definition above of a matrix norm. This definition of matrix norm is not computationally friendly, so we use other options. The most important norms are as follows.

The operator norm induced by the p-norm on the domain and the q-norm on the codomain, p, q ≥ 1, is:

\begin{equation} \label{EqBasic.3} \| {\bf A} \|_{p,q} = \sup_{{\bf x} \ne 0} \, \frac{\| {\bf A}\,{\bf x} \|_q}{\| {\bf x} \|_p} = \sup_{\| {\bf x} \|_p =1} \, \| {\bf A}\,{\bf x} \|_q , \end{equation}
where \( \| {\bf x} \|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p \right)^{1/p} .\)

The 1-norm (commonly known as the maximum column sum norm) of a matrix A may be computed as

\begin{equation} \label{EqBasic.4} \| {\bf A} \|_1 = \max_{1 \le j \le n} \,\sum_{i=1}^m | a_{i,j} | . \end{equation}

The infinity norm (∞-norm) of a matrix A may be computed as

\begin{equation} \label{EqBasic.5} \| {\bf A} \|_{\infty} = \| {\bf A}^{\ast} \|_{1} = \max_{1 \le i \le m} \,\sum_{j=1}^n | a_{i,j} | , \end{equation}
which is simply the maximum absolute row sum of the matrix.
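
The two formulas above can be checked directly against the built-in command. The following is a minimal sketch; the matrix and the names colSums and rowSums are hypothetical, chosen only for illustration.

```mathematica
(* Hypothetical 2 x 3 test matrix, not taken from the text *)
A = {{1, -2, 3}, {-4, 5, -6}};

(* Total adds the rows together, which gives the column sums: {5, 7, 9} *)
colSums = Total[Abs[A]];

(* Level specification {2} adds the entries inside each row: {6, 15} *)
rowSums = Total[Abs[A], {2}];

(* Compare the hand-made sums with the built-in norms *)
Max[colSums] == Norm[A, 1]
Max[rowSums] == Norm[A, Infinity]
```

Both comparisons should return True, with the 1-norm equal to 9 and the ∞-norm equal to 15 for this matrix.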

In the special case of p = 2 we get the Euclidean norm (which is equal to the largest singular value of a matrix)

\begin{equation} \label{EqBasic.6} \| {\bf A} \|_2 = \sup_{\bf x} \left\{ \| {\bf A}\, {\bf x} \|_2 \, : \quad \mbox{with} \quad \| {\bf x} \|_2 =1 \right\} = \sigma_{\max} \left( {\bf A} \right) = \sqrt{\rho \left( {\bf A}^{\ast} {\bf A} \right)} , \end{equation}
where σmax(A) represents the largest singular value of matrix A.
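
As an illustration of the three expressions in the formula above, the sketch below computes the Euclidean norm in three ways. The matrix and the names s1, s2, s3 are assumptions made for this example only; a numerical (machine-precision) matrix is used so that all three results are approximate numbers.

```mathematica
(* Hypothetical test matrix, converted to machine numbers *)
A = N[{{1, -2, 3}, {-4, 5, -6}}];

(* Largest singular value *)
s1 = Max[SingularValueList[A]];

(* Square root of the spectral radius of A* A (its eigenvalues are nonnegative) *)
s2 = Sqrt[Max[Eigenvalues[ConjugateTranspose[A].A]]];

(* Built-in Euclidean norm *)
s3 = Norm[A, 2];

{s1, s2, s3}
```

The three entries of the output list should agree up to rounding error.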

The Frobenius norm (non-induced norm):

\begin{equation} \label{EqBasic.7} \| {\bf A} \|_F = \left( \sum_{i=1}^m \sum_{j=1}^n |a_{i,j} |^2 \right)^{1/2} = \left( \mbox{tr}\, {\bf A} \,{\bf A}^{\ast} \right)^{1/2} = \left( \mbox{tr}\, {\bf A}^{\ast} {\bf A} \right)^{1/2} . \end{equation}
The Euclidean norm and the Frobenius norm are related via the inequality:
\[ \| {\bf A} \|_2 = \sigma_{\max}\left( {\bf A} \right) \le \| {\bf A} \|_F = \left( \sum_{i=1}^m \sum_{j=1}^n |a_{i,j} |^2 \right)^{1/2} = \left( \mbox{tr}\, {\bf A} \,{\bf A}^{\ast} \right)^{1/2} . \]
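
The inequality can be seen from the singular values of A. As a short worked step (the symbols r for the rank of A and \( \sigma_1 \ge \cdots \ge \sigma_r > 0 \) for its nonzero singular values are notation introduced here for illustration):
\[ \| {\bf A} \|_F^2 = \mbox{tr} \left( {\bf A}^{\ast} {\bf A} \right) = \sum_{i=1}^r \sigma_i^2 \qquad \Longrightarrow \qquad \sigma_{\max}^2 \le \sum_{i=1}^r \sigma_i^2 \le r \, \sigma_{\max}^2 . \]
Taking square roots gives \( \| {\bf A} \|_2 \le \| {\bf A} \|_F \le \sqrt{r} \, \| {\bf A} \|_2 , \) and the left inequality becomes an equality exactly when A has rank at most one.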

There is also another function, the spectral radius, that provides the infimum of all norms of a square matrix: \( \rho ({\bf A}) \le \|{\bf A}\| . \)

The spectral radius of a square matrix A is
\begin{equation} \label{EqBasic.8} \rho ({\bf A}) = \lim_{k\to \infty} \| {\bf A}^k \|^{1/k} = \max \left\{ |\lambda | : \ \lambda \mbox{ is eigenvalue of }\ {\bf A} \right\} . \end{equation}
Theorem 2: For any matrix norm ‖·‖ on the space of square matrices and for any square matrix A, we have
\[ \rho\left( {\bf A} \right) \le \| {\bf A} \| . \]
For any positive integer k, we have
\begin{equation} \label{EqBasic.9} \rho ({\bf A}) \le \| {\bf A}^k \|^{1/k} . \end{equation}
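
The limit formula for the spectral radius and the bound above can be observed numerically. The following is a rough sketch, assuming the Euclidean norm and an illustrative 3 × 3 matrix; the name rho and the particular exponents k are choices made here, not part of the theorem.

```mathematica
(* Illustrative matrix in machine precision *)
A = N[{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}];

(* Spectral radius: the largest absolute value of the eigenvalues *)
rho = Max[Abs[Eigenvalues[A]]];

(* The quantities ||A^k||^(1/k) for a few exponents k *)
Table[{k, Norm[MatrixPower[A, k]]^(1/k)}, {k, {1, 2, 5, 10, 20}}]

rho
```

Each value in the second column of the table should be no smaller than rho (up to rounding), and the values should approach rho as k grows. The entry for k = 1 is simply Norm[A].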

Some properties of the matrix norms are presented in the following theorem.

Theorem 3: Let A and B be \( m \times n \) matrices and let \( k \) be a scalar.

  • \( \| {\bf A} \| \ge 0 \) for any matrix A.
  • \( \| {\bf A} \| =0 \) if and only if the matrix A is zero: \( {\bf A} = {\bf 0}. \)
  • \( \| k\,{\bf A} \| = |k| \, \| {\bf A} \| \) for any scalar \( k. \)
  • \( \| {\bf A} + {\bf B}\| \le \| {\bf A} \| + \| {\bf B} \| .\)
  • \( \left\vert \| {\bf A} \| - \| {\bf B}\|\right\vert \le \| {\bf A} - {\bf B} \| .\)
  • \( \| {\bf A} \, {\bf B}\| \le \| {\bf A} \| \, \| {\bf B} \| \) whenever the product A B is defined (for instance, when A and B are square matrices of the same size).

All these norms are equivalent and we present some inequalities:

\[ \| {\bf A} \|_2^2 \le \| {\bf A}^{\ast} \|_{\infty} \cdot \| {\bf A} \|_{\infty} = \| {\bf A} \|_{1} \cdot \| {\bf A} \|_{\infty} , \]
where A* is the adjoint matrix to A (transposed and complex conjugate).

Theorem 4: Let ‖ ‖ be any matrix norm and let B be a matrix such that  ‖B‖ < 1. Then matrix I + B is invertible and
\[ \| \left( {\bf I} + {\bf B} \right)^{-1} \| \le \frac{1}{1 - \| {\bf B} \|} . \]
Theorem 5: Let ‖ ‖ be any matrix norm, and let the matrix I + B be singular, where I is the identity matrix. Then ‖B‖ ≥ 1.
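
The bound of Theorem 4 can be tested on a small example. The sketch below uses the ∞-norm and a hypothetical 2 × 2 matrix B with ‖B‖ = 0.4; the names nB, lhs, and rhs are introduced only for this illustration.

```mathematica
(* Hypothetical matrix with norm less than 1 *)
B = {{0.1, -0.2}, {0.3, 0.1}};

(* Infinity norm of B: the largest absolute row sum, here 0.4 *)
nB = Norm[B, Infinity]

(* Left-hand and right-hand sides of the inequality in Theorem 4 *)
lhs = Norm[Inverse[IdentityMatrix[2] + B], Infinity];
rhs = 1/(1 - nB);

lhs <= rhs
```

The final comparison should return True. A single numerical check of this kind illustrates the theorem but does not prove it.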

Mathematica has a special command for evaluating norms:
Norm[A] = Norm[A,2] for evaluating the Euclidean norm of the matrix A;
Norm[A,1] for evaluating the 1-norm;
Norm[A, Infinity] for evaluating the ∞-norm;
Norm[A, "Frobenius"] for evaluating the Frobenius norm.

A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}
Norm[A]
Sqrt[3/2 (95 + Sqrt[8881])]
N[%]
16.8481

 

Example 3: Evaluate the norms of the matrix \( {\bf A} = \left[ \begin{array}{ccc} \phantom{-}1 & -7 & 4 \\ -2 & -3 & 1\end{array} \right] . \)

The absolute column sums of A are \( 1 + | -2 | =3 , \) \( |-7| + | -3 | =10 , \) and \( 4+1 =5 . \) The largest of these is 10 and therefore \( \| {\bf A} \|_1 = 10 . \)

A = {{1, -7, 4}, {-2, -3, 1}};
Norm[A, 1]
10

The absolute row sums of A are \( 1 + | -7 | + 4 =12 \) and \( | -2 | + |-3| + 1 = 6 ; \) therefore, \( \| {\bf A} \|_{\infty} = 12 . \)

Norm[Transpose[A], 1]
12

The Euclidean norm of A is the largest singular value. So we calculate

\[ {\bf S} = {\bf A}^{\ast} {\bf A} = \begin{bmatrix} 5&-1&2 \\ -1&58&-31 \\ 2&-31&17 \end{bmatrix} , \qquad \mbox{tr} \left( {\bf S} \right) = 80. \]
Its eigenvalues are
Eigenvalues[Transpose[A].A]
{40 + Sqrt[1205], 40 - Sqrt[1205], 0}
Taking the square root of the largest one, we obtain the Euclidean norm of matrix A:
N[Sqrt[40 + Sqrt[1205]]]
8.64367
Mathematica also knows how to find the Euclidean norm:
Norm[A, 2]
Sqrt[40 + Sqrt[1205]]
We compare it with the Frobenius norm:
Norm[A, "Frobenius"]
4 Sqrt[5]
N[%]
8.94427
Norm[A]
Sqrt[40 + Sqrt[1205]]
N[%]
8.64367
To find the exact value of the Euclidean norm by hand, we evaluate the product
\[ {\bf M} = {\bf A}\,{\bf A}^{\ast} = \left[ \begin{array}{ccc} \phantom{-}1 & -7 & 4\\ -2 & -3 & 1 \end{array} \right] \, \left[ \begin{array}{cc} \phantom{-}1 & -2 \\ -7 & -3 \\ \phantom{-}4& \phantom{-}1 \end{array} \right] = \left[ \begin{array}{cc} 66 & 23 \\ 23& 14 \end{array} \right] , \qquad \mbox{tr} \left( {\bf M} \right) = 80. \]

This matrix \( {\bf A}\,{\bf A}^{\ast} \) has two eigenvalues \( 40 \pm \sqrt{1205} . \) Hence, the Euclidean norm of the matrix A is \( \sqrt{40 + \sqrt{1205}} \approx 8.64367 . \)

Therefore,
\[ \| {\bf A} \|_2 = 8.64367 < \| {\bf A} \|_F = 8.94427 < \| {\bf A} \|_1 = 10 < \| {\bf A} \|_{\infty} = 12 . \]
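
As a quick check of the inequality \( \| {\bf A} \|_2^2 \le \| {\bf A} \|_1 \cdot \| {\bf A} \|_{\infty} \) stated earlier, using only the values computed in this example:
\[ \| {\bf A} \|_2^2 = 40 + \sqrt{1205} \approx 74.71 \le \| {\bf A} \|_1 \cdot \| {\bf A} \|_{\infty} = 10 \cdot 12 = 120 . \]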
   ■
Example 4: Let us consider the matrix
\[ {\bf A} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} . \]
Its conjugate transpose (adjoint) matrix is
\[ {\bf A}^{\ast} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix}^{\mathrm T} = \begin{bmatrix} 1&4&7 \\ 2&5&8 \\ 3&6&9 \end{bmatrix} . \]
So
\[ {\bf S} = {\bf A}^{\ast} {\bf A} = \begin{bmatrix} 66&78&90 \\ 78&93&108 \\ 90&108&126 \end{bmatrix} . \]
We check with Mathematica:
A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}
S = Transpose[A].A
Their eigenvalues are
Eigenvalues[A]
{3/2 (5 + Sqrt[33]), 3/2 (5 - Sqrt[33]), 0}
Eigenvalues[S]
{3/2 (95 + Sqrt[8881]), 3/2 (95 - Sqrt[8881]), 0}
N[%]
{283.859, 1.14141, 0.}
Therefore, the largest singular value of A is
\[ \sigma = \sqrt{\frac{3}{2} \left( 95 + \sqrt{8881} \right)} \approx 16.8481. \]
We also check the opposite product
\[ {\bf M} = {\bf A}\, {\bf A}^{\ast} = \begin{bmatrix} 14&32&50 \\ 32&77&122 \\ 50&122&194 \end{bmatrix} . \]
Eigenvalues[A.Transpose[A]]
{3/2 (95 + Sqrt[8881]), 3/2 (95 - Sqrt[8881]), 0}
These matrices S and M have the same eigenvalues. Therefore, we found the Euclidean (operator) norm of A to be approximately 16.8481. Mathematica knows this norm:
Norm[A]
Sqrt[3/2 (95 + Sqrt[8881])]
The spectral radius of A is the largest absolute value of its eigenvalues:
\[ \rho ({\bf A}) = \frac{3}{2} \left( 5 + \sqrt{33} \right) \approx 16.1168 , \]
which is slightly less than its operator (Euclidean) norm.

The Frobenius norm of matrix \( {\bf A} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} \) is

\[ \| {\bf A} \|_F = \left( \sum_{i=1}^m \sum_{j=1}^n |a_{i,j} |^2 \right)^{1/2} = \left( 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 + 7^2 +8^2 +9^2 \right)^{1/2} = \sqrt{285} = \left( \mbox{tr}\, {\bf A} \,{\bf A}^{\ast} \right)^{1/2} . \]
A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}
Tr[A.Transpose[A]]
285
Sum[k^2, {k, 1, 9}]
285
N[Sqrt[285]]
16.8819
Mathematica has a dedicated command:
Norm[A, "Frobenius"]
Sqrt[285]
N[%]
16.8819

To find the 1-norm of A, we add the absolute values of the entries in every column; it turns out that the last column has the largest sum, so

\[ \| {\bf A} \|_1 = 3+6+9=18. \]
If we add the absolute values of the entries in every row, then the last row has the largest sum and we get
\[ \| {\bf A} \|_{\infty} = 7+8+9=24. \]
   ■