Linear Systems of Algebraic Equations

This page presents some topics from Linear Algebra needed for construction of solutions to systems of linear algebraic equations and some applications. We use matrices and vectors as essential elements in obtaining and expressing the solutions.

SVD factorization

One of the most fruitful ideas in the theory of matrices is that of matrix decomposition or canonical form. The theoretical utility of matrix decompositions has long been appreciated. Previously, we have discussed LU-decomposition and QR-factorization for rectangular m-by-n matrices. For a square \( n \times n \) matrix, we know even more canonical forms:

  • An eigenvalue decomposition (abbreviated EVD) of a symmetric (self-adjoint) matrix
    \[ {\bf A} = {\bf P}\,{\bf \Lambda}\,{\bf P}^{\mathrm T} \qquad \mbox{or} \qquad {\bf A} = {\bf P}\,{\bf \Lambda}\,{\bf P}^{\ast} , \]
    where P is an n-by-n orthogonal (unitary) matrix whose columns are eigenvectors of A, and Λ is the diagonal matrix whose diagonal entries are the eigenvalues corresponding to the column vectors of P.
  • A Hessenberg decomposition of a square matrix with real entries, due to the German electrical engineer Karl Hessenberg (1904--1959),
    \[ {\bf A} = {\bf P}\,{\bf H}\,{\bf P}^{\mathrm T} , \]
    in which P is an orthogonal matrix and H is in upper Hessenberg form: \( {\bf H} = \begin{bmatrix} x&x & \cdots &x&x&x \\ x&x& \cdots &x&x&x \\ 0&x& \cdots&x&x&x \\ \vdots&\vdots & \ddots &\vdots& \vdots& \vdots \\ 0&0& \cdots &x&x&x \\ 0&0& \cdots &0&x&x \end{bmatrix} . \)
  • Schur's decomposition for an n-by-n matrix A with real entries and real eigenvalues: \( {\bf A} = {\bf P}\,{\bf S}\,{\bf P}^{\mathrm T} , \) where P is an orthogonal matrix such that \( {\bf P}^{\mathrm T} {\bf A} \, {\bf P} = \begin{bmatrix} \lambda_1 & x & x& \cdots & x \\ 0&\lambda_2 & x & \cdots & x \\ 0&0&\lambda_3& \cdots & x \\ \vdots&\vdots& \vdots & \ddots & \vdots \\ 0&0&0& \cdots & \lambda_n \end{bmatrix} , \) in which \( \lambda_1 , \lambda_2 , \ldots , \lambda_n \) are the eigenvalues of A repeated according to multiplicity.
  • Cholesky decomposition of a positive definite matrix: \( {\bf A} = {\bf L}\,{\bf L}^{\ast} , \) where L is a lower triangular matrix.
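As a minimal computational illustration of the last item, Sage can compute a Cholesky factor and check it; the 2-by-2 matrix below is an assumed example, not taken from elsewhere on this page.
sage: M = matrix(RDF, 2, 2, [4, 2, 2, 3])      # a positive definite matrix
sage: L = M.cholesky()                         # lower triangular factor
sage: (L * L.transpose() - M).norm() < 1e-12   # check M = L L^T up to roundoff
True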

Of the many useful decompositions, the singular value decomposition---that is, the factorization of an m-by-n matrix A into the product \( {\bf U}\, {\bf \Sigma} \,{\bf V}^{\ast} \) of a unitary (or orthogonal) \( m \times m \) matrix U, an \( m \times n \) 'diagonal' matrix \( {\bf \Sigma} , \) and another unitary \( n \times n \) matrix \( {\bf V}^{\ast} \) ---has assumed a special role. There are several reasons. In the first place, the fact that the decomposition is achieved by unitary matrices makes it an ideal vehicle for discussing the geometry of n-space. Second, it is stable; small perturbations in A correspond to small perturbations in \( {\bf \Sigma} . \) Third, the diagonality of \( {\bf \Sigma} \) makes it easy to operate with and to solve equations.

It is an intriguing observation that most of the classical matrix decompositions predated the widespread use of matrices. In 1873, the Italian mathematician Eugenio Beltrami (1835--1899) published the first paper on the SVD, which was followed by the work of Camille Jordan in 1874, whom we can consider a codiscoverer. Later James Joseph Sylvester (1814--1897), Erhard Schmidt (1876--1959), and Hermann Weyl (1885--1955) were responsible for establishing the existence of the singular value decomposition (SVD) and developing its theory. The term ``singular value'' seems to have come from the literature on integral equations. The British-born American mathematician Harry Bateman (1882--1946) used it in a research paper published in 1908.

Let A be an arbitrary real- or complex-valued \( m \times n \) matrix. Then
  • \( {\bf A}^{\ast} {\bf A} \) and \( {\bf A} \, {\bf A}^{\ast} \) are both self-adjoint (symmetric in the real case) positive semidefinite matrices of dimensions \( n \times n \) and \( m \times m ,\) respectively.
  • The equality \( {\bf A} = {\bf U}\,{\bf \Sigma}\, {\bf V}^{\ast} = {\bf U}\,{\bf \Sigma}\, {\bf V}^{-1} \) is called the singular value decomposition (or SVD) of A, where V is a unitary (orthogonal in the real-valued case) \( n \times n \) matrix, U is a unitary/orthogonal \( m \times m \) matrix, and \( {\bf \Sigma} = {\bf U}^{\ast} {\bf A}\, {\bf V} \) is a diagonal \( m \times n \) matrix, with \( {\bf \Sigma} = \mbox{diag}( \sigma_1 , \sigma_2 , \ldots , \sigma_p ), \) where \( p = \min (m,n) \) and the nonnegative numbers \( \sigma_1 \ge \sigma_2 \ge \cdots \ge\sigma_p \ge 0 \) are the singular values of A.
  • If r is the rank of A, then A has exactly r positive singular values \( \sigma_1 \ge \sigma_2 \ge \cdots \ge\sigma_r > 0 \) with \( \sigma_{r+1} = \sigma_{r+2} = \cdots = \sigma_p =0 . \)
  • Any real \( m \times n \) matrix A can be expressed as a finite sum of rank 1 matrices in normalized form, that is, \( {\bf A} = \sigma_1 {\bf R}_1 + \sigma_2 {\bf R}_2 + \cdots + \sigma_r {\bf R}_r , \) where \( {\bf R}_i = {\bf u}_i {\bf v}_i^{\ast} = {\bf u}_i \otimes {\bf v}_i , \) \( {\bf u}_i \) is the i-th column of U and a unit eigenvector of \( {\bf A}\,{\bf A}^{\ast} , \) and \( {\bf v}_i \) is the i-th column of V and a unit eigenvector of \( {\bf A}^{\ast} {\bf A} . \)
  • The pseudoinverse \( {\bf A}^{\dagger} \) of a matrix with SVD \( {\bf A} = {\bf U}\,{\bf \Sigma}\, {\bf V}^{\ast} \) is the matrix \( {\bf A}^{\dagger} = {\bf V} {\bf \Sigma}^{\dagger} {\bf U}^{\ast} \) with
    \[ {\bf \Sigma}^{\dagger} = \mbox{diag} \left( \frac{1}{\sigma_1} , \cdots , \frac{1}{\sigma_r} , 0, \cdots , 0\right) \in \mathbb{R}^{n\times m} . \]
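As a sketch of the last formula, one can build \( {\bf \Sigma}^{\dagger} \) by transposing the shape of Σ and inverting its nonzero entries; the 3-by-2 matrix below is chosen only for illustration (it also appears in an example later on this page).
sage: A = matrix(RDF, 3, 2, [1, 2, 0, 2, 2, 0])
sage: U, S, V = A.SVD()                          # full SVD: U is 3x3, S is 3x2, V is 2x2
sage: Sd = matrix(RDF, 2, 3)                     # Sigma^dagger has the transposed shape
sage: Sd[0,0] = 1/S[0,0]; Sd[1,1] = 1/S[1,1]     # both singular values are nonzero here
sage: Aplus = V * Sd * U.transpose()             # Moore--Penrose pseudoinverse A^+
sage: (Aplus * A - identity_matrix(RDF, 2)).norm() < 1e-12
True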

Theorem: If A is an \( m \times n \) matrix with real entries, then

  • A and \( {\bf A}^{\mathrm T} {\bf A} \) have the same null space (kernel).
  • A and \( {\bf A}^{\mathrm T} {\bf A} \) have the same row space.
  • \( {\bf A}^{\mathrm T} \) and \( {\bf A}^{\mathrm T} {\bf A} \) have the same column space.
  • A and \( {\bf A}^{\mathrm T} {\bf A} \) have the same rank.
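These statements are easy to check in Sage for a concrete matrix; the rank-one 3-by-2 matrix below is an assumed illustration.
sage: A = matrix(QQ, 3, 2, [1, 2, 2, 4, 3, 6])          # a rank-one example
sage: B = A.transpose() * A
sage: A.right_kernel() == B.right_kernel()              # same null space
True
sage: A.row_space() == B.row_space()                    # same row space
True
sage: A.transpose().column_space() == B.column_space()  # same column space
True
sage: A.rank() == B.rank()                              # same rank
True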

Theorem: If A is an \( m \times n \) real matrix, then

  • \( {\bf A}^{\mathrm T} {\bf A} \) is orthogonally diagonalizable: \( {\bf V}^{\mathrm T} \left( {\bf A}^{\mathrm T} {\bf A} \right) {\bf V} = \mbox{diag} \left( \sigma_1^2 , \ldots , \sigma_n^2 \right) . \)
  • The eigenvalues \( \{ \sigma_1^2 , \ldots , \sigma_n^2 \} \) of \( {\bf A}^{\mathrm T} {\bf A} \) are nonnegative.
  • The column vectors of U are eigenvectors of the symmetric matrix \( {\bf A}\,{\bf A}^{\mathrm T} \); its eigenvalues are \( \{ \sigma_1^2 , \ldots , \sigma_n^2 \} \) together with \( m-n \) additional zeros (provided \( m \ge n \) ). ■

If A is a square matrix, then there is in general no relation between the eigenvalues of A and those of \( {\bf A}^{\mathrm T} {\bf A} , \) as the following example shows.

Example. Consider the matrix

\[ {\bf A} = \begin{bmatrix} 4&2&5 \\ 0&16&0 \\ 0&14&9 \end{bmatrix} \]
that we met previously. This matrix A has three distinct positive (integer) eigenvalues \( \lambda = 16, 9, 4 . \) Two symmetric matrices
\[ {\bf B}_1 = {\bf A}^T {\bf A} = \begin{bmatrix} 16&8&20 \\ 8&456&136 \\ 20&136&106 \end{bmatrix} \quad \mbox{and} \quad {\bf B}_2 = {\bf A} \, {\bf A}^T = \begin{bmatrix} 45&32&73 \\ 32&256&224 \\ 73&224&277 \end{bmatrix} \]
share the same positive irrational eigenvalues (but have different eigenvectors):
\[ \lambda_1 \approx 503.039 , \qquad \lambda_2 \approx 64.7801, \qquad \lambda_3 \approx 10.1813 . \]
We can use a standard Sage command to obtain the SVD of the given matrix:
sage: A=matrix(QQ,3,3,[4,2,5,0,16,0,0,14,9])
sage: A.eigenvalues()
[16, 9, 4]
sage: A.change_ring(RDF).SVD()
(
[-0.1619372019614091   0.6332468129089919   0.7568188796280878]
[-0.6742825611554915  -0.6309792503036638   0.3836772255526993]
[-0.7204993895504005   0.4481781561241365  -0.5291662971417107],

[22.42852313777857               0.0               0.0]
[              0.0 8.048607015909965               0.0]
[              0.0               0.0 3.190811019408077],

[-0.02888058227759853  0.31471125955446966   0.9487479829106075]
[ -0.9451976265172841  -0.3174065013054804    0.076515094949342]
[-0.32521893978272465   0.8945445411150043 -0.30663122014554733]
)

RDF means the real double field (double precision floating point numbers), as opposed to the 53-bit precision real field RR; if you define the matrix over this field from the start, the commands simplify:
sage: A=matrix(RDF,3,3,[4,2,5,0,16,0,0,14,9])
sage: A.eigenvalues()
[4.0, 9.0, 16.0]
sage: A.SVD()
with the same output as before.
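As a quick numerical check of this output (a sketch reusing the same matrix over RDF), the three factors returned by SVD() multiply back to A, and the first factor is orthogonal.
sage: A = matrix(RDF, 3, 3, [4,2,5, 0,16,0, 0,14,9])
sage: U, S, V = A.SVD()
sage: (U*S*V.transpose() - A).norm() < 1e-12                        # A = U*Sigma*V^T up to roundoff
True
sage: (U*U.transpose() - identity_matrix(RDF, 3)).norm() < 1e-12    # U is orthogonal
True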

Definition: If A is an \( m \times n \) matrix, and if \( \lambda_1 , \lambda_2 , \ldots , \lambda_n \) are the eigenvalues of \( {\bf A}^{\mathrm T} {\bf A} , \) then the numbers

\[ \sigma_1 = \sqrt{\lambda_1} , \quad \sigma_2 = \sqrt{\lambda_2} , \quad \ldots \quad , \quad \sigma_n = \sqrt{\lambda_n} \]

are called the singular values of A. In other words, a nonnegative real number σ is a singular value for an m-by-n matrix A if and only if there exist unit-length vectors \( {\bf u} \in \mathbb{R}^m \quad\mbox{and} \quad {\bf v} \in \mathbb{R}^n \) such that \( {\bf A}{\bf v} = \sigma {\bf u} \) and \( {\bf A}^{\mathrm T}{\bf u} = \sigma {\bf v} . \) The vectors u and v are called left singular and right singular vectors for σ, respectively. ■
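A short Sage sketch of this definition, using an assumed 2-by-3 matrix (the same matrix is worked out by hand in an example below): the diagonal entries returned by SVD() agree with the square roots of the eigenvalues of \( {\bf A}^{\mathrm T}{\bf A} . \)
sage: A = matrix(RDF, 2, 3, [1, 2, 6, 2, -3, -2])
sage: B = A.transpose() * A                      # Gram matrix A^T A; its eigenvalues are 49, 9, 0
sage: U, S, V = A.SVD()
sage: evs = sorted([abs(ev) for ev in B.eigenvalues()], reverse=True)
sage: all(abs(sqrt(evs[i]) - S[i,i]) < 1e-9 for i in range(2))      # sigma_i = sqrt(lambda_i)
True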

Theorem: Let A be an m-by-n matrix with complex or real entries. Then its largest singular value equals the Euclidean 2-norm (operator norm): \( \max \{ \sigma_i \} = \| {\bf A} \|_2 = \max_{\| {\bf x} \| =1} \| {\bf A} {\bf x} \| , \) and the sum of the squares of the singular values equals the square of the Frobenius norm: \( \sum_i \sigma_i^2 = \| {\bf A} \|_F^2 = \sum_{i,j} |a_{i,j} |^2 = \mbox{trace} \left( {\bf A}^{\ast} {\bf A} \right) . \)
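Both identities can be confirmed numerically; the sketch below reuses the assumed 2-by-3 matrix from the previous snippet together with Sage's built-in matrix norms.
sage: A = matrix(RDF, 2, 3, [1, 2, 6, 2, -3, -2])
sage: U, S, V = A.SVD()
sage: sv = [S[i,i] for i in range(2)]                        # the two singular values
sage: abs(max(sv) - A.norm(2)) < 1e-9                        # largest singular value = 2-norm
True
sage: abs(sum(s^2 for s in sv) - A.norm('frob')^2) < 1e-9    # sum of squares = Frobenius norm squared
True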

There are two known forms of the singular value decomposition---the brief (or reduced) form and the expanded (full) form:

Theorem: (Singular Value Decomposition in expanded form) If A is an \( m \times n \) matrix of rank r, then A can be factored as

\[ {\bf A} = \left[ {\bf u}_1 \ {\bf u}_2 \ \cdots \ {\bf u}_r \vert {\bf u}_{r+1} \ \cdots \ {\bf u}_m \right] \left[ \begin{array}{cccc|c} \sigma_1 & 0 & \cdots & 0 & \\ 0& \sigma_2 & \cdots & 0 & \\ \vdots & \vdots & \ddots & \vdots & {\bf 0}_{r\times (n-r)} \\ 0&0& \cdots & \sigma_r & \\ \hline \\ && {\bf 0}_{(m-r)\times r} & &{\bf 0}_{(m-r)\times (n-r)} \end{array} \right] \begin{bmatrix} {\bf v}_1^{\mathrm T} \\ \vdots \\ {\bf v}_r^{\mathrm T} \\ \hline {\bf v}_{r+1}^{\mathrm T} \\ \vdots \\ {\bf v}_n^{\mathrm T} \end{bmatrix} \]
in which U, Σ, and V have sizes \( m \times m , \) \( m \times n , \) and \( n \times n ,\) respectively, and in which:
  • \( {\bf V} = \left[ {\bf v}_1 \ {\bf v}_2 \ \cdots \ {\bf v}_n \right] \) orthogonally diagonalizes \( {\bf A}^{\mathrm T} {\bf A} = {\bf V} \mbox{diag} (\sigma_i^2 ) \,{\bf V}^{\mathrm T} .\) Column vectors in V are usually called right singular vectors of A.
  • The nonzero diagonal entries of Σ are \( \sigma_1 = \sqrt{\lambda_1} , \ \sigma_2 = \sqrt{\lambda_2} , \ldots , \sigma_r = \sqrt{\lambda_r} , \) where \( \lambda_1 , \lambda_2 , \ldots , \lambda_r \) are the nonzero eigenvalues of \( {\bf A}^{\mathrm T} {\bf A} \) corresponding to the column vectors of V.
  • The column vectors of V are ordered so that \( \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r >0. \)
  • \( {\bf u}_i = \frac{{\bf A} {\bf v}_i}{\| {\bf A} {\bf v}_i \|} = \frac{1}{\sigma_i} \, {\bf A}\,{\bf v}_i \quad (i=1,2,\ldots , r) . \)
  • The columns of \( \left[ {\bf u}_1 , {\bf u}_2 , \ldots , {\bf u}_r \right] \) form an orthonormal basis for the column space of A. Column vectors in U are usually called left singular vectors of A.
  • \( \left[ {\bf u}_1 , {\bf u}_2 , \ldots , {\bf u}_r , {\bf u}_{r+1} , \ldots , {\bf u}_m \right] \) is an extension of \( \left[ {\bf u}_1 , {\bf u}_2 , \ldots , {\bf u}_r \right] \) to an orthonormal basis for \( {\mathbb R}^m . \)

When the rank r of the \( m \times n \) matrix A is less than \( \min \{ m, n \} , \) most entries of the 'diagonal' \( m \times n \) matrix Σ are zeroes, except the singular values \( \sigma_1 , \ldots , \sigma_r . \) Therefore, when this matrix is multiplied on the left by U and on the right by \( {\bf V}^{\mathrm T} , \) the columns \( {\bf u}_{r+1} , \ldots , {\bf u}_m \) and \( {\bf v}_{r+1} , \ldots , {\bf v}_n \) do not contribute to the final matrix. So we can drop them and represent A in the reduced form, as the following theorem states.

Let r be the rank of the m-by-n matrix A. Then \( \sigma_{r+1} = \sigma_{r+2} = \cdots = \sigma_p = 0 , \) where \( p = \min (m,n) . \) Partition \( {\bf U} = \left[ {\bf U}_1 \ {\bf U}_2 \right] \) and \( {\bf V} = \left[ {\bf V}_1 \ {\bf V}_2 \right] , \) where \( {\bf U}_1 = \left[ {\bf u}_1 \ , \ldots , \ {\bf u}_r \right] \) and \( {\bf V}_1 = \left[ {\bf v}_1 \ , \ldots , \ {\bf v}_r \right] \) have r columns. Then with \( {\bf \Sigma}_r = \mbox{diag}(\sigma_1 , \ldots , \sigma_r ) : \)

\begin{align*} {\bf A} &= \left[ {\bf U}_1 \ {\bf U}_2 \right] \left( \begin{array}{cc} {\bf \Sigma}_r & {\bf 0}_{r\times (n-r)} \\ {\bf 0}_{(m-r)\times r} & {\bf 0}_{(m-r) \times (n-r)} \end{array} \right) \left[ {\bf V}_1 \ {\bf V}_2 \right]^{\mathrm T} \qquad\mbox{full SVD} \\ &= {\bf U}_1 {\bf \Sigma}_r {\bf V}_1^{\mathrm T} = \sum_{i=1}^r \sigma_i {\bf u}_i \otimes {\bf v}_i \qquad \mbox{brief SVD} . \end{align*}

Theorem: (Singular Value Decomposition in brief/reduced form) If A is an \( m \times n \) matrix of rank r, then A can be expressed in the form \( {\bf A}_{m\times n} = {\bf U}_{m\times r} {\bf \Sigma}_r {\bf V}^{\mathrm T}_{r\times n} , \) where Σr has size r-by-r and

\[ {\bf \Sigma}_r = \left[ \begin{array}{cccc} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0&0& \cdots & \sigma_r \end{array} \right] , \quad {\bf U}_{m\times r} = \left[ {\bf u}_1 \ \cdots \ {\bf u}_r \right] , \quad {\bf V}_{n\times r} = \left[ {\bf v}_1 \ \cdots \ {\bf v}_r \right] . \]
It is customary to place the successive diagonal entries of the r-by-r matrix Σr in nonincreasing order; \( {\bf U}_{m\times r} \) is an m-by-r matrix with orthonormal columns, and \( {\bf V}_{n\times r} \) is an n-by-r matrix with orthonormal columns. ■

The matrices U and V are not uniquely determined by A, but the diagonal entries of Σ are necessarily the singular values of A. The set \( \{ {\bf u}_1 , \ldots , {\bf u}_r \} \) is an orthonormal basis for the column space of A. The set \( \{ {\bf u}_{r+1} , \ldots , {\bf u}_m \} \) is an orthonormal basis for the kernel (null space) of AT. The set \( \{ {\bf v}_1 , \ldots , {\bf v}_r \} \) is an orthonormal basis for the row space of A. The set \( \{ {\bf v}_{r+1} , \ldots , {\bf v}_n \} \) is an orthonormal basis for the kernel of A.
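In Sage the SVD() command returns the expanded (full) form; the sketch below shows one way to pass to the reduced form by keeping only the first r columns of U and V (the 3-by-2 matrix is an assumed example).
sage: A = matrix(RDF, 3, 2, [1, 2, 0, 2, 2, 0])
sage: U, S, V = A.SVD()                          # full SVD: U is 3x3, S is 3x2, V is 2x2
sage: r = A.rank()                               # numerical rank; here r = 2
sage: U1 = U[:, :r]; Sr = S[:r, :r]; V1 = V[:, :r]
sage: (U1 * Sr * V1.transpose() - A).norm() < 1e-12    # the reduced form reproduces A
True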

One way to understand how a matrix A deforms space is to consider its action on the unit sphere in \( {\mathbb R}^n . \) An arbitrary element x of this unit sphere can be represented by \( {\bf x} = x_1 v_1 + x_2 v_2 + \cdots + x_n v_n \) with \( \sum_{i=1}^n x_i^2 =1 . \) Its image is \( {\bf A}{\bf x} = \sigma_1 x_1 u_1 + \cdots + \sigma_p x_p u_p , \) where \( p = \min (m,n) . \) Letting \( y_i = \sigma_i x_i , \) we see that the image of the unit sphere consists of the vectors \( y_1 u_1 + y_2 u_2 + \cdots + y_p u_p , \) where
\[ \frac{y_1^2}{\sigma_1^2} + \frac{y_2^2}{\sigma_2^2} + \cdots + \frac{y_p^2}{\sigma_p^2} = \sum_1^p x_i^2 \le 1. \]
If A has full column rank (linearly independent columns), so \( p=n , \) the inequality is actually an equality. Otherwise, some of the \( x_i \) are missing on the right, and the sum can be any number from 0 to 1. This shows that A maps the unit sphere of \( {\mathbb R}^n \) to a p-dimensional ellipsoid with semi-axes in the directions \( {\bf u}_i \) and with magnitudes \( \sigma_i . \) If \( p=n , \) the image is just the surface of the ellipsoid; otherwise it is the solid ellipsoid.

Example. If \( {\bf A} = \left[ \begin{array}{ccc} 1\ & 2& \ 6 \\ 2\ & -3 \ & -2 \end{array} \right] , \) then the linear transformation \( {\bf x} \,\mapsto \,{\bf A}{\bf x} \) maps the unit sphere \( \left\{ {\bf x} : \, \| {\bf x} \| =1 \right\} \) in \( {\bf \mathbb R}^3 \) onto an ellipse in \( {\bf \mathbb R}^2 . \)

The quantity \( \| {\bf A}{\bf x} \|^2 \) is maximized at the same x that maximizes \( \| {\bf A}{\bf x} \| , \) and \( \| {\bf A}{\bf x} \|^2 \) is easier to study:
\[ \| {\bf A}{\bf x} \|^2 = \left( {\bf A}{\bf x} \right)^{\mathrm T} \left( {\bf A}{\bf x} \right) = {\bf x}^{\mathrm T} {\bf A}^{\mathrm T} {\bf A} {\bf x} = {\bf x}^{\mathrm T} \left( {\bf A}^{\mathrm T} {\bf A} \right) {\bf x} . \]
Also, both \( {\bf A}^{\mathrm T} {\bf A} \) and \( {\bf A} {\bf A}^{\mathrm T} \) are symmetric matrices:
\[ {\bf B}_1 = {\bf A}^{\mathrm T} {\bf A} = \begin{bmatrix} 5&-4&2 \\ -4&13&18 \\ 2&18&40 \end{bmatrix} \quad \mbox{and} \quad {\bf B}_2 = {\bf A} {\bf A}^{\mathrm T} = \begin{bmatrix} 41&-16 \\ -16 &17 \end{bmatrix} \]
of which B1 has three eigenvalues \( \lambda_1 =49, \quad \lambda_2 =9, \quad \lambda_3 =0 \) and B2 has two eigenvalues \( \lambda_1 =49, \quad \lambda_2 =9 . \) Corresponding unit eigenvectors for B1 (right singular vectors for A) are, respectively,
\[ {\bf v}_1 = \frac{1}{\sqrt{5}} \,\begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} , \quad {\bf v}_2 = \frac{1}{3\sqrt{5}} \, \begin{bmatrix} 5 \\ -4 \\ 2 \end{bmatrix} , \quad {\bf v}_3 = \frac{1}{3} \, \begin{bmatrix} 2 \\ 2 \\ -1 \end{bmatrix} . \]
Evaluating products, we get
\[ {\bf A} {\bf v}_1 = \frac{7}{\sqrt{5}} \,\begin{bmatrix} 2 \\ -1 \end{bmatrix} , \quad {\bf A} {\bf v}_2 = \frac{3}{\sqrt{5}} \, \begin{bmatrix} 1 \\ 2 \end{bmatrix} , \quad {\bf A}{\bf v}_3 = {\bf 0} . \]
Therefore, the maximum value of \( \| {\bf A}{\bf x} \|^2 \) is 49, attained when x is the unit vector \( {\bf v}_1 . \) The vector \( {\bf A}{\bf v}_1 \) is the point on the ellipse farthest from the origin, with \( \| {\bf A}{\bf v}_1 \| = 7 . \)

The singular values of A are the square roots of the eigenvalues of \( {\bf B}_1 = {\bf A}^{\mathrm T} {\bf A} : \)

\[ \sigma_1 = \sqrt{49} = 7, \quad \sigma_2 = \sqrt{9} =3, \quad \sigma_3 =0. \]
The first singular value of A is the maximum of \( \| {\bf A}{\bf x} \| \) over all unit vectors, and the maximum is attained at the unit vector \( {\bf v}_1 . \) This singular value 7 is the 2-norm of matrix A:
\[ \sigma_1 = 7 = \| {\bf A} \|_2 < \left( \sum_{i=1}^2 \sum_{j=1}^3 \left\vert a_{i,j} \right\vert^2 \right)^{1/2} = \sqrt{58} = \| {\bf A} \|_F \approx 7.61577 , \]
which is the Frobenius norm. Normalizing \( {\bf A}{\bf v}_{1} \) and \( {\bf A}{\bf v}_{2} , \) we obtain the left singular vectors of A:
\[ {\bf u}_1 = \frac{1}{\sigma_1} \,{\bf A} {\bf v}_1 = \frac{1}{\sqrt{5}} \,\begin{bmatrix} 2 \\ -1 \end{bmatrix} , \qquad {\bf u}_2 = \frac{1}{\sigma_2} \,{\bf A} {\bf v}_2 = \frac{1}{\sqrt{5}} \,\begin{bmatrix} 1 \\ 2 \end{bmatrix} . \]
Now we construct the orthogonal matrices
\[ {\bf U} = \left[ {\bf u}_1 \ {\bf u}_2 \right] = \frac{1}{\sqrt{5}}\, \begin{bmatrix} 2&1 \\ -1&2 \end{bmatrix} \quad\mbox{and} \quad {\bf V} = \left[ {\bf v}_1 \ {\bf v}_2 \ {\bf v}_3 \right] = \frac{1}{3\sqrt{5}}\, \begin{bmatrix} 0&5&2\sqrt{5} \\ 3&-4& 2\sqrt{5} \\ 6&2&-\sqrt{5} \end{bmatrix} . \]
This allows us to obtain the expanded form of SVD for matrix A
\[ {\bf A} = {\bf U} {\bf \Sigma} {\bf V}^{\mathrm T} = \frac{1}{15} \begin{bmatrix} 2&1 \\ -1&2 \end{bmatrix} \,\begin{bmatrix} 7&0&0 \\ 0&3&0 \end{bmatrix} \, \begin{bmatrix} 0&3&6 \\ 5&-4&2 \\ 2\sqrt{5} & 2\sqrt{5} & -\sqrt{5} \end{bmatrix} \]
and the reduced form
\[ {\bf A} = {\bf U} {\bf \Sigma}_{2\times 2} {\bf V}^{\mathrm T}_{2\times 3} = \frac{1}{15} \begin{bmatrix} 2&1 \\ -1&2 \end{bmatrix} \,\begin{bmatrix} 7&0 \\ 0&3 \end{bmatrix} \, \begin{bmatrix} 0&3&6 \\ 5&-4&2 \end{bmatrix} , \]
where
\[ {\bf \Sigma} = \begin{bmatrix} 7&0&0 \\ 0&3&0 \end{bmatrix} , \qquad {\bf \Sigma}_{2\times 2} = \begin{bmatrix} 7&0 \\ 0&3 \end{bmatrix} , \quad {\bf V}^{\mathrm T}_{2\times 3} = \begin{bmatrix} 0&3&6 \\ 5&-4&2 \end{bmatrix} . \]
Finally, we represent matrix A as a sum of rank 1 matrices:
\[ {\bf A} = \sigma_1 {\bf R}_1 + \sigma_2 {\bf R}_2 = 7 {\bf R}_1 + 3 {\bf R}_2 , \]
where
\[ {\bf R}_1 = {\bf u}_1 {\bf v}_1^{\mathrm T} = \frac{1}{5} \,\begin{bmatrix} 0&2&4 \\ 0&-1&-2 \end{bmatrix} , \quad {\bf R}_2 = {\bf u}_2 {\bf v}_2^{\mathrm T} = \frac{1}{15} \,\begin{bmatrix} 5&-4&2 \\ 10&-8&4 \end{bmatrix} . \]
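The hand computation above can be confirmed numerically; the sketch below rebuilds A from the factors returned by Sage (individual singular vectors may differ from those above by a sign, which does not change the rank-one sum).
sage: A = matrix(RDF, 2, 3, [1, 2, 6, 2, -3, -2])
sage: U, S, V = A.SVD()
sage: A1 = sum(S[i,i] * U.column(i).outer_product(V.column(i)) for i in range(2))
sage: (A1 - A).norm() < 1e-12                    # sigma_1 u_1 v_1^T + sigma_2 u_2 v_2^T = A
True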

Example. Consider the full rank matrix

\[ {\bf A} = \begin{bmatrix} 1&2 \\ 0&2 \\ 2&0 \end{bmatrix} . \]
Two symmetric matrices
\[ {\bf B}_1 = {\bf A}^T {\bf A} = \begin{bmatrix} 5&2 \\ 2&8 \end{bmatrix} \quad \mbox{and} \quad {\bf B}_2 = {\bf A} \, {\bf A}^T = \begin{bmatrix} 5&4&2 \\ 4&4&0 \\ 2&0&4 \end{bmatrix} , \]
of which the latter is singular, share the same nonzero eigenvalues \( \lambda_{1} = 9, \quad \lambda_2 = 4, \) with corresponding eigenvectors:
\[ {\bf v}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} , \quad {\bf v}_2 = \begin{bmatrix} 2 \\ -1 \end{bmatrix} , \]
and, for B2, also including \( \lambda_{3} = 0 , \)
\[ {\bf u}_1 = \begin{bmatrix} 5 \\ 4 \\ 2 \end{bmatrix} , \quad {\bf u}_2 = \begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} , \quad {\bf u}_3 = \begin{bmatrix} -2 \\ 2 \\ 1 \end{bmatrix} . \]
Since these two matrices are symmetric, the corresponding eigenvectors are orthogonal. We can normalize these vectors by dividing by \( \| {\bf v}_1 \|_2 = \| {\bf v}_2 \|_2 = \sqrt{5} \approx 2.23607 \) and \( \| {\bf u}_1 \|_2 = 3\sqrt{5} \approx 6.7082, \) \( \| {\bf u}_2 \|_2 = \sqrt{5} \approx 2.23607 , \) \( \| {\bf u}_3 \|_2 = 3 . \) This yields right singular vectors
\[ {\bf v}_1 = \frac{1}{\sqrt{5}} \,\begin{bmatrix} 1 \\ 2 \end{bmatrix} , \quad {\bf v}_2 = \frac{1}{\sqrt{5}} \,\begin{bmatrix} 2 \\ -1 \end{bmatrix} , \]
and left singular vectors
\[ {\bf u}_1 = \frac{1}{3\sqrt{5}} \,\begin{bmatrix} 5 \\ 4 \\ 2 \end{bmatrix} , \quad {\bf u}_2 = \frac{1}{\sqrt{5}} \,\begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} , \quad {\bf u}_3 = \frac{1}{3} \,\begin{bmatrix} -2 \\ 2 \\ 1 \end{bmatrix} . \]
Taking square roots of the positive eigenvalues shared by B1 and B2, we obtain the singular values of matrix A:
\[ \sigma_1 = \sqrt{9} =3 , \quad \sigma_2 = \sqrt{4} =2 , \quad\mbox{and} \quad \sigma_3 =0. \]
Then we build two orthogonal matrices:
\[ {\bf V} = \left[ {\bf v}_1 , \ {\bf v}_2 \right] = \frac{1}{\sqrt{5}} \,\begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix} \]
and
\[ {\bf U} = \left[ {\bf u}_1 , \ {\bf u}_2 , \ {\bf u}_3 \right] = \frac{1}{3\sqrt{5}} \,\begin{bmatrix} 5 & 0 & -2\sqrt{5} \\ 4 & -3 & 2\sqrt{5} \\ 2 & 6 & \sqrt{5} \end{bmatrix} , \]
and the 'diagonal' matrix (which has the same dimensions as A)
\[ {\bf \Sigma} = \begin{bmatrix} \sigma_1 &0 \\ 0&\sigma_2 \\ 0&0 \end{bmatrix} = \begin{bmatrix} 3 &0 \\ 0&2 \\ 0&0 \end{bmatrix} . \]
This allows us to determine the SVD in expanded form
\[ {\bf A} = {\bf U}\, {\bf \Sigma} \,{\bf V}^{\mathrm T} = \frac{1}{15}\,\begin{bmatrix} 5&0& -2\sqrt{5} \\ 4&-3& 2\sqrt{5} \\ 2&6&\sqrt{5} \end{bmatrix} \begin{bmatrix} 3&0 \\0&2 \\ 0&0 \end{bmatrix} \begin{bmatrix} 1&2 \\ 2&-1 \end{bmatrix} , \]
reduced form
\[ {\bf A} = {\bf U}_{3\times 2} {\bf \Sigma}_{2 \times 2} {\bf V}^{\mathrm T} = \frac{1}{15}\,\begin{bmatrix} 5&0 \\ 4&-3 \\ 2&6 \end{bmatrix} \begin{bmatrix} 3&0 \\0&2 \end{bmatrix} \begin{bmatrix} 1&2 \\ 2&-1 \end{bmatrix} , \]
and represent A as a sum of rank 1 matrices:
\[ {\bf A} = \sigma_1 {\bf R}_1 + \sigma_2 {\bf R}_2 = \frac{1}{5} \, \begin{bmatrix} 5&10 \\ 4&8 \\ 2&4 \end{bmatrix} + \frac{2}{5} \,\begin{bmatrix} 0&0 \\ -2&1 \\ 4&-2 \end{bmatrix} , \]
where
\[ {\bf R}_1 = {\bf u}_1 {\bf v}_1^{\mathrm T} = \frac{1}{15} \, \begin{bmatrix} 5&10 \\ 4&8 \\ 2&4 \end{bmatrix} , \quad {\bf R}_2 = {\bf u}_2 {\bf v}_2^{\mathrm T} = \frac{1}{5} \, \begin{bmatrix} 0&0 \\ -2&1 \\ 4&-2 \end{bmatrix} . \]
The pseudoinverse (also called the Moore--Penrose inverse) can be found using the reduced SVD form:
\[ {\bf A}^{\dagger} = {\bf V} \,{\bf \Sigma}_{2\times 2}^{-1} {\bf U}_{3\times 2}^{\mathrm T} = \frac{1}{15} \, \begin{bmatrix} 1&2 \\ 2&-1 \end{bmatrix} \begin{bmatrix} 1/3&0 \\ 0&1/2 \end{bmatrix} \begin{bmatrix} 5&4&2 \\ 0&-3&6 \end{bmatrix} = \frac{1}{18} \, \begin{bmatrix} 2&-2&8 \\ 4&5&-2 \end{bmatrix} . \]
Indeed,
\[ {\bf A}^{\dagger} {\bf A} = \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix} . \]
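Since A has linearly independent columns, the standard formula \( {\bf A}^{\dagger} = \left( {\bf A}^{\mathrm T} {\bf A} \right)^{-1} {\bf A}^{\mathrm T} \) gives the same pseudoinverse and allows an exact check of the value found above:
sage: A = matrix(QQ, 3, 2, [1, 2, 0, 2, 2, 0])
sage: Ap = (A.transpose() * A).inverse() * A.transpose()   # valid because A has full column rank
sage: Ap == matrix(QQ, 2, 3, [2, -2, 8, 4, 5, -2]) / 18    # the matrix computed above
True
sage: Ap * A == identity_matrix(QQ, 2)
True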

Example. Consider the 3-by-2 matrix A whose transpose is \( {\bf A}^{\mathrm T} = \begin{bmatrix} 4& -1&1 \\ -4 & 1& -1 \end{bmatrix} . \) First, we compute two symmetric singular matrices:

\[ {\bf B}_1 = {\bf A}^T {\bf A} = \begin{bmatrix} 18 &-18 \\ -18 &18 \end{bmatrix} \quad \mbox{and} \quad {\bf B}_2 = {\bf A} \, {\bf A}^T = \begin{bmatrix} 32 &-8&8 \\ -8 &2&-2 \\ 8 &-2&2 \end{bmatrix} . \]
The eigenvalues of B1 are 36 and 0, with corresponding unit eigenvectors (the right singular vectors of A)
\[ {\bf v}_1 = \frac{1}{\sqrt{2}} \, \begin{bmatrix} -1 \\ 1 \end{bmatrix} \quad \mbox{and} \quad {\bf v}_2 = \frac{1}{\sqrt{2}} \, \begin{bmatrix} 1 \\ 1 \end{bmatrix} . \]
These unit vectors form the columns of V:
\[ {\bf V} = \left[ {\bf v}_1 \ {\bf v}_2 \right] = \frac{1}{\sqrt{2}} \, \begin{bmatrix} -1&1 \\ 1&1 \end{bmatrix} . \]
The singular values are \( \sigma_1 = \sqrt{36} =6 \quad\mbox{and} \quad \sigma_2 =0 . \) Since there is only one nonzero singular value, the nonzero block of Σ may be written as a single number; that is, \( {\bf \Lambda} = \Lambda =6. \) The matrix Σ has the same size as A, with Λ in its upper left corner:
\[ {\bf \Sigma} = \begin{bmatrix} \Lambda &0 \\ 0&0 \\ 0&0 \end{bmatrix} = \begin{bmatrix} 6&0 \\ 0&0 \\ 0&0 \end{bmatrix} . \]
To determine U, first calculate
\[ {\bf A} {\bf v}_1 = \frac{1}{\sqrt{2}} \, \begin{bmatrix} -8 \\ 2 \\ -2 \end{bmatrix} = \sqrt{2} \, \begin{bmatrix} -4 \\ 1 \\ -1 \end{bmatrix} , \quad {\bf A} {\bf v}_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} . \]
As a check on the calculations, we verify that \( \| {\bf A} {\bf v}_1 \| = \sigma_1 =6 . \) The only column found for U so far is
\[ {\bf u}_1 = \frac{1}{\sigma_1} \, {\bf A} {\bf v}_1 = \frac{\sqrt{2}}{6} \, \begin{bmatrix} -4 \\ 1 \\ -1 \end{bmatrix} . \]
The other columns of U are found by extending the set \( \{ {\bf u}_1 \} \) to an orthonormal basis for \( \mathbb{R}^3 . \) In this case, we need two orthogonal unit vectors u2 and u3 that are orthogonal to u1. Each vector must satisfy \( {\bf u}_1^{\mathrm T} {\bf x} =0, \) which is equivalent to the equation \( 4x_1 - x_2 + x_3 =0 . \) A basis for the solution set of this equation is
\[ {\bf w}_1 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} , \qquad {\bf w}_2 = \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix} . \]
Upon application of the Gram--Schmidt process to \( \{ {\bf w}_1 , \ {\bf w}_2 \} , \) we obtain
\[ {\bf u}_2 = \frac{1}{\sqrt{2}} \,\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} , \qquad {\bf u}_3 = \frac{1}{3} \, \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix} . \]
Finally, set \( {\bf U} = \left[ {\bf u}_1 \ {\bf u}_2 \ {\bf u}_3 \right] , \) take Σ and \( {\bf V}^{\mathrm T} \) from above, and write the SVD of A:
\[ {\bf A} = \frac{1}{3\sqrt{2}} \,\begin{bmatrix} -4&0&\sqrt{2} \\ 1&3&2\sqrt{2} \\ -1&3&-2\sqrt{2} \end{bmatrix} \begin{bmatrix} 6&0 \\ 0&0 \\ 0&0 \end{bmatrix} \frac{1}{\sqrt{2}} \,\begin{bmatrix} -1&1 \\ 1 & 1 \end{bmatrix} . \]
However, its reduced form is particularly simple:
\[ {\bf A} = {\bf U}_{3\times 1} {\bf \Sigma}_{1 \times 1} {\bf V}^{\mathrm T}_{1\times 2} = \begin{bmatrix} -4 \\ 1 \\ -1 \end{bmatrix} \cdot \left[ -1 \ 1 \right] . \]
This is because we can keep the number Λ instead of the matrix Σ, and the remaining scalar factors combine to 1:
\[ {\bf U}_{3\times 1} = \frac{\sqrt{2}}{6} \, \begin{bmatrix} -4 \\ 1 \\ -1 \end{bmatrix} , \quad {\bf \Sigma}_{1 \times 1} = \Lambda = 6, \quad {\bf V}^{\mathrm T}_{1\times 2} = \frac{1}{\sqrt{2}} \left[ -1 \ 1 \right] . \]
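A numerical sketch for this rank-one matrix (Sage may flip the signs of \( {\bf u}_1 \) and \( {\bf v}_1 \) simultaneously, which leaves the product unchanged):
sage: A = matrix(RDF, 3, 2, [4, -4, -1, 1, 1, -1])
sage: U, S, V = A.SVD()
sage: abs(S[0,0] - 6) < 1e-9 and abs(S[1,1]) < 1e-9        # singular values 6 and 0
True
sage: u1 = U.column(0); v1 = V.column(0)
sage: (S[0,0]*u1.outer_product(v1) - A).norm() < 1e-12     # A = sigma_1 u_1 v_1^T
True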

