
To enhance pedagogical effectiveness, the treatment of the dot product is presented in several distinct sections:

  • Dot product
  • Metric
  • Geometrical interpretation
  • Duality
  • Orthogonality
  • Projection

 

Solvability


Consider the linear system of equations,
\begin{equation} \label{EqDot.1} \mathbf{A}\,\mathbf{x} = \mathbf{b} , \qquad \mathbf{A} \in \mathbb{F}^{m\times n} , \end{equation}
where A is an m-by-n matrix with entries from the field 𝔽, x is an n × 1 column vector representing n unknowns, and b is a given m × 1 column vector. Suppose that matrix A is composed of n column vectors
\[ \mathbf{A} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} , \]
where each ak is the k-th column of matrix A, k = 1, 2, … , n. Then the matrix-vector product A x can be expressed as a linear combination of the columns of A:
\[ \mathbf{A}\,\mathbf{x} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n = \mathbf{b} \in \mathbb{F}^{m\times 1} . \]
Therefore, the matrix-vector product A x can be viewed as a formal dot product of the row of columns [a₁ a₂ ⋯ an] with the column of scalars x. This formulation highlights that solving the equation A x = b is equivalent to finding scalars (x₁, x₂, … , xn) such that the corresponding linear combination of the column vectors of A equals the vector b. This means that b must be a vector in span{a₁, a₂, … , an}. Hence, there exists a solution to the system \eqref{EqDot.1} if and only if b is an element of the subspace generated by the column vectors of matrix A.
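This column view is easy to check numerically. Here is a minimal sketch in Python (the text's own sessions use Mathematica; the matrix and vector below are made-up illustrations):

```python
# Check that the matrix-vector product A x equals the linear combination
# x1*a1 + x2*a2 + ... + xn*an of the columns of A.

def mat_vec(A, x):
    """Row-by-row matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def column_combination(A, x):
    """x1*a1 + ... + xn*an, where ak is the k-th column of A."""
    m, n = len(A), len(A[0])
    return [sum(x[k] * A[i][k] for k in range(n)) for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, -1, 2]

assert mat_vec(A, x) == column_combination(A, x)  # both give [5, 11]
```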
Let S ⊆ ℝn. Then S⊥ = {x ∈ ℝn   :   x • s = 0 for every s ∈ S} . The funny exponent, ⊥, is called “perp.”
We recall some notation.
Every m-by-n matrix A ∈ ℝm×n defines a linear transformation TA : ℝn×1 ⇾ ℝm×1 by multiplication from the left: TA(x) = A x. Then:
  • kernel (also known as the null space): ker(A) = {x ∈ ℝn×1 : A x = 0} ⊆ ℝn×1;
  • image: im(A) = {A x : x ∈ ℝn×1} ⊆ ℝm×1;
  • cokernel: the quotient (factor) space coker(A) = ℝm×1/im(A);
  • left null space: ker(AT);
  • coimage: the quotient space coim(A) = ℝn×1/ker(A).
Lemma 1: There is an isomorphism between the left null space and the dual of the cokernel: ker(AT) ≌ (coker(A))′; moreover, the left null space and the cokernel have the same dimension: \[ \dim\,\mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \dim \mbox{coker} (\mathbf{A}) = m - \mbox{rank}(\mathbf{A}) . \] Also \[ \dim(\mbox{coim}(\mathbf{A})) = \dim \mathbb{F}^n - \dim\mbox{ker}(\mathbf{A}) = \mbox{rank}(\mathbf{A}) \] because coim(A) ≌ im(A) = ColumnSpace(A) ⊆ 𝔽m.
An element of the cokernel is a coset [v] = v + im(A), where v ∈ ℝm×1. A linear functional on coker(A) is a map \[ \varphi \ : \ \mbox{coker}(\mathbf{A}) \to \mathbb{R} \] such that φ(α[v] + β[w]) = αφ([v]) + βφ([w]) for all scalars α, β.

To define such a φ, it suffices to define a linear functional on ℝm ≌ ℝm×1 that vanishes on im(A), because then it descends to the quotient.

We construct functionals vanishing on im(A): every linear functional on ℝm can be written as ℓy(v) = y • v for a unique y ∈ ℝm×1, using the dot product.

The condition “ℓy vanishes on im(A)” means: \[ \ell _y (\mathbf{A}\,\mathbf{v}) = \langle {\bf y}, \mathbf{A}\,\mathbf{v}\rangle = {\bf y} \bullet \mathbf{A}\, \mathbf{v} =0\quad \mathrm{for\ all\ }\mathbf{v}\in \mathbb{R}^{n}. \] Equivalently, \[ \langle \mathbf{A}^T \mathbf{y}, \mathbf{v}\rangle = 0\quad \mathrm{for\ all\ }\mathbf{v}\in \mathbb{R}^{n}, \] so ATy = 0, i.e., y ∈ ker(AT). Thus: Linear functionals on coker(A) correspond exactly to vectors y ∈ ker(AT).
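This correspondence is easy to test numerically. A minimal Python sketch (our own illustration, outside the text's Mathematica sessions), using the rank-1 matrix that reappears in Example 51 below:

```python
# If A^T y = 0, then ell_y(A v) = <y, A v> = 0 for every v, i.e. the
# functional ell_y vanishes on im(A).

def mat_vec(A, x):
    """Row-by-row matrix-vector product."""
    return [sum(r * s for r, s in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2],
     [3, 6]]      # rank 1: the second row is 3 times the first
y = [-3, 1]       # a vector in ker(A^T), checked below

assert mat_vec(transpose(A), y) == [0, 0]      # A^T y = 0
for v in ([1, 0], [0, 1], [7, -2]):            # ell_y vanishes on im(A)
    assert dot(y, mat_vec(A, v)) == 0
```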

Construction of the explicit isomorphism: Define \[ \Phi : \ \ker (\mathbf{A}^T)\rightarrow (\operatorname{coker} (\mathbf{A}))' \] by \[ \Phi (\mathbf{y})([\mathbf{v}]) = \langle {\bf y}, {\bf v}\rangle= {\bf y} \bullet {\bf v} . \] We must check that Φ:

  1. is well-defined: If [v] = [v'], then v' = v + A w for some w. Then \begin{align*} \langle \mathbf{y}, \mathbf{v}'\rangle &=\langle \mathbf{y}, \mathbf{v}+{\bf A}\,{\bf w}\rangle =\langle \mathbf{y}, \mathbf{v}\rangle +\langle \mathbf{y}, {\bf A}\,{\bf w}\rangle \\ &=\langle \mathbf{y},\mathbf{v}\rangle +\langle \mathbf{A}^T\mathbf{y}, \mathbf{w}\rangle =\langle \mathbf{y}, \mathbf{v}\rangle +0, \end{align*} since y ∈ ker(AT). So Φ(y)([v]) is independent of the representative.
  2. Linearity in y: Clear from linearity of the dot product.
  3. Injective: If Φ(y) = 0, then yv = 0 for all v; hence, y = 0.
  4. Surjective: Any ψ from coker(A)′ corresponds to a linear functional on ℝm vanishing on image(A); that functional is ℓy for a unique y, and then y ∈ ker(AT). So ψ = Φ(y).

Therefore, Φ is a vector space isomorphism: \[ \ker (\mathbf{A}^T) \cong \operatorname{coker}(\mathbf{A})'. \] In words: the left null space consists exactly of those functionals that vanish on the image and therefore “see” only the quotient by it.

   
Example 51: Let \[ \mathbf{A} = \left[ \begin{matrix}1&2\\ 3&6 \end{matrix}\right] : \mathbb{R}^{2} \rightarrow \mathbb{R}^{2} . \] Construct the image and cokernel:

The second row is 3 times the first, so rank(A) = 1. The image is \[ \operatorname{im} (\mathbf{A}) = \operatorname{span} \left\{ \left[ \begin{matrix}1\\ 3\end{matrix}\right] \right\} . \] Thus, \[ \operatorname{coker} (\mathbf{A}) = \mathbb{R}^{2}/\operatorname{im} (\mathbf{A}) \] is 1‑dimensional. A convenient representative of a nonzero coset is, say, \[ [\mathbf{v}]=\left[ \left[ \begin{matrix}1\\ 0\end{matrix}\right] \right] . \] Construct the left null space, ker(AT):

Compute \[ \mathbf{A}^T = \left[ \begin{matrix}1&3\\ 2&6\\ \end{matrix}\right] . \] We want ATy = 0, i.e., \[ \left[ \begin{matrix}1&3\\ 2&6\\ \end{matrix}\right] \left[ \begin{matrix}y_1\\ y_2\end{matrix}\right] =\left[ \begin{matrix}0\\ 0\end{matrix}\right] . \] This gives \[ \left\{ \, \begin{array}{l}\textstyle y_1+3y_2=0,\\ \textstyle 2y_1+6y_2=0,\end{array}\right. \] and the second equation is redundant. So \[ y_1=-3y_2. \] A basis vector for ker(AT) is \[ \mathbf{y} = \left[ \begin{matrix}-3\\ 1\end{matrix}\right] . \] So \( \displaystyle \quad \ker (\mathbf{A}^T) = \operatorname{span} \left\{ \left[ \begin{matrix}-3\\ 1\end{matrix}\right] \right\} , \quad \) a 1‑dimensional subspace.

The functional on the cokernel:

Take \( \displaystyle \quad y=\left[ \begin{matrix}-3\\ 1\end{matrix}\right] . \quad \) Define \( \displaystyle \quad \Phi (\mathbf{y})([\mathbf{v}]) = \langle \mathbf{y}, \mathbf{v}\rangle = \mathbf{y}^T\mathbf{v}. \)

Check that this is well-defined on cosets:

If v' = v + A w, then \[ \mathbf{y}^T \mathbf{v}' = \mathbf{y}^T \mathbf{v} + \mathbf{y}^T {\bf A}\,{\bf w} = \mathbf{y}^T \mathbf{v} + (\mathbf{A}^T \mathbf{y})^T\mathbf{w} = \mathbf{y}^T\mathbf{v}+0, \] since ATy = 0.

So Φ(y) is a linear functional on coker(A).

Evaluate it on our chosen coset: \[ \Phi (y)\left( \left[ \left[ \begin{matrix}1\\ 0\end{matrix}\right] \right] \right) =y^T\left[ \begin{matrix}1\\ 0\end{matrix}\right] =-3. \] This is nonzero, so Φ(y) is a nontrivial functional on the 1‑dimensional space coker(A); hence it spans (coker(A))′.

Thus, in this example:

  • ker(AT) is the line spanned by \( \displaystyle \quad \left[ \begin{matrix}-3\\ 1\end{matrix}\right] .\)
  • (coker(A))′ is the line spanned by the functional [v] ↦ yv.
  • The map y ↦ ([v] ↦ yv) is a linear isomorphism between these two 1‑dimensional spaces.
   ■
End of Example 51
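The well-definedness check of Example 51 can be replayed numerically; here is a short Python sketch (ours, outside the Mathematica session): shifting the representative v by any A w leaves the value of the functional unchanged.

```python
# For y in ker(A^T), the value y . v is independent of the coset
# representative: y . (v + A w) = y . v for every w.
A = [[1, 2], [3, 6]]
y = [-3, 1]        # spans ker(A^T)
v = [1, 0]         # representative of a nonzero coset [v]

def mat_vec(A, x):
    return [sum(r * s for r, s in zip(row, x)) for row in A]

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

for w in ([1, 1], [2, -5], [0, 3]):
    shifted = [vi + ti for vi, ti in zip(v, mat_vec(A, w))]
    assert dot(y, shifted) == dot(y, v) == -3   # Phi(y)([v]) = -3
```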
Theorem 19: Every m × n matrix A ∈ 𝔽m×n generates the linear transformation TA : 𝔽n×1 ⇾ 𝔽m×1 by matrix multiplication from the left, TA(x) = A x. Then the kernel of its transpose and its image are related via the following relation:
\begin{equation} \label{EqDot.2} \mbox{ker} \left( \mathbf{A}^{\mathrm T} \right) = \mbox{im}\left( \mathbf{A} \right)^{\perp} . \end{equation}
If x ∈ ker(AT), then ATx = 0. Taking the dot product with an arbitrary y, we get \[ \mathbf{A}^{\mathrm T}\mathbf{x} \bullet \mathbf{y} = \mathbf{x} \bullet \mathbf{A}\,\mathbf{y} = 0 \qquad \forall \mathbf{y} \in \mathbb{F}^{n\times 1} , \] where the first equality, ATx • y = x • A y, was shown in Theorem 1 of the previous section. Hence, x belongs to the orthogonal complement of the image of TA. Since all these relations can be read backward, the statement follows.
   
Example 52: We consider a singular 3-by-3 matrix \[ \mathbf{A} = \begin{bmatrix} 1& \phantom{-}2& \phantom{-}3 \\ 4& \phantom{-}5& \phantom{-}6 \\ 1& -1& -3 \end{bmatrix} \] of rank 2, as Mathematica confirms.
A = {{1, 2, 3}, {4, 5, 6}, {1, -1, -3}}; MatrixRank[A]
2
Since A is a square matrix, its index is zero, so the dimensions of its kernel and of the kernel of the transposed matrix are the same: both equal 3 − rank(A) = 1. Using Mathematica we find their kernels:
NullSpace[A]
{{1, -2, 1}}
NullSpace[Transpose[A]]
{{3, -1, 1}}
Hence, their kernels are spanned by the following vectors: \[ \mbox{ker}\left( \mathbf{A} \right) = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} , \quad \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \mbox{span} \left\{ \begin{pmatrix} 3 \\ -1 \\ 1 \end{pmatrix} \right\} . \] The image of A is \[ W = \mbox{image}\left( \mathbf{A} \right) = \left\{ \begin{pmatrix} x + 2y + 3z \\ 4x + 5y + 6z \\ x - y -3z \end{pmatrix} \ : \quad x, y, z \in \mathbb{R} \right\} . \] Since the image of any linear operator is a subspace of the codomain, we denote it by W for simplicity. Although there are three parameters in its definition (x, y, z), one of them can be excluded. Let us denote \[ a = x+2y+3z , \quad b = 4x + 5y + 6z , \quad c = x -y -3z. \] Excluding z, we obtain two equations \[ b - 2a = 2x + y , \qquad a+c = 2x + y . \] Hence, we get a relation between the three parameters \[ b - 2a = a + c \qquad \Longrightarrow \qquad c = b - 3a . \] This leads to an explicit formula for the image subspace: \[ W = \mbox{image}\left( \mathbf{A} \right) = \left\{ \begin{pmatrix} a \\ b \\ b - 3a \end{pmatrix} \ : \quad a, b \in \mathbb{R} \right\} . \] This space is two-dimensional---there are two parameters that generate W. Now we check whether the null space of AT is orthogonal to W.
v = {3, -1, 1}; u = {a, b, b - 3*a}; Dot[v, u]
0
This output shows that the image of TA is orthogonal to ker(AT).

   

Now we consider the following 4-by-4 matrix: \[ \mathbf{A} = \begin{bmatrix} 1&2&3&4 \\ 5&6&7&8 \\ -2&0&2&4 \\ 7&6&5&4 \end{bmatrix} . \] This matrix has rank 2.

A = {{1, 2, 3, 4}, {5, 6, 7, 8}, {-2, 0, 2, 4}, {7, 6, 5, 4}}; MatrixRank[A]
2
Using Mathematica, we find its kernel and kernel of its transpose matrix:
NullSpace[A]
{{2, -3, 0, 1}, {1, -2, 1, 0}}
NullSpace[Transpose[A]]
{{3, -2, 0, 1}, {-3, 1, 1, 0}}
\[ \mbox{ker} \left( \mathbf{A} \right) = \mbox{span} \left\{ \begin{pmatrix} 2\\ -3\\ 0\\ 1 \end{pmatrix} , \ \begin{pmatrix} 1 \\ -2 \\ 1 \\ 0 \end{pmatrix} \right\} , \] and \[ \mbox{ker} \left( \mathbf{A}^{\mathrm T} \right) = \mbox{span} \left\{ \begin{pmatrix} 3 \\ -2 \\ 0 \\ 1 \end{pmatrix} , \ \begin{pmatrix} -3 \\ 1 \\ 1 \\ 0 \end{pmatrix} \right\} . \] Let W be the image of operator TA: \[ W = \mbox{image}(\mathbf{A} ) = \left\{ \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} \ : \quad a,b,c,d \in \mathbb{R} \right\} , \] where we do not yet know any relation among the four coefficients generating W. What we do know, by Theorem 19, is that W is orthogonal to ker(AT). This yields \begin{align*} \left( 3, -2, 0, 1 \right) \bullet \left( a, b, c, d \right) &= 3a -2b +d = 0 , \\ \left( -3, 1, 1, 0 \right) \bullet \left( a, b, c, d \right) &= -3a + b + c = 0 . \end{align*}
Solve[{3*a - 2*b + d == 0, -3*a + b + c == 0}, {a, b}]
{{a -> 1/3 (2 c + d), b -> c + d}}
Solving these two equations, we eliminate two parameters: \[ 3a = 2c + d, \qquad b = c+d . \] Then the image subspace can be written explicitly: \[ W = \mbox{image}(\mathbf{A} ) = \left\{ \begin{pmatrix} \frac{2c + d}{3} \\ c+d \\ c \\ d \end{pmatrix} \ : \quad c,d \in \mathbb{R} \right\} . \] According to Theorem 19, this subspace W must be orthogonal to each of the two vectors generating ker(AT): \[ W \perp \left\{ \begin{pmatrix} 3 \\ -2 \\ 0 \\ 1 \end{pmatrix} , \quad \begin{pmatrix} -3 \\ 1 \\ 1 \\ 0 \end{pmatrix} \right\} . \] Using Mathematica, we verify the orthogonality relation:
{(2*c + d)/3, c+d, c, d} . {3, -2, 0, 1}
0
{(2*c + d)/3, c+d, c, d} . {-3, 1, 1, 0}
0
   ■
End of Example 52
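As an extra sanity check (in Python; the text's computations use Mathematica), the two vectors spanning ker(AT) for the 4-by-4 matrix must annihilate every column of A, since the columns span im(A):

```python
# The columns of A span im(A), so each vector spanning ker(A^T) must be
# orthogonal to every column.
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [-2, 0, 2, 4],
     [7, 6, 5, 4]]
left_null = [[3, -2, 0, 1], [-3, 1, 1, 0]]   # basis of ker(A^T) found above

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

for y in left_null:
    for col in zip(*A):          # iterate over the columns of A
        assert dot(y, col) == 0
```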

This theorem establishes a duality between matrix A and its transpose AT (more precisely, between the corresponding linear transformations TA and TAT, but we will refer to matrices):

  • If A is surjective (onto), then AT is injective (one-to-one).
  • If the operator AT is surjective, then A is injective.
Let A ∈ 𝔽m×n be an m-by-n matrix, which defines a linear transformation TA : 𝔽n×1 ⇾ 𝔽m×1 by matrix multiplication from left. Its index is \[ \mbox{ind}\left( \mathbf{A}\right) = n - m = \dim\left( \mbox{ker}\mathbf{A} \right) - \dim\left( \mbox{ker}\mathbf{A}^{\mathrm T} \right) . \]
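The index formula is just arithmetic on dimensions. A tiny Python sketch (the function name is ours, not from the text) makes the bookkeeping explicit via the rank-nullity counts dim ker(A) = n − r and dim ker(AT) = m − r:

```python
# index = dim ker(A) - dim ker(A^T) = (n - r) - (m - r) = n - m,
# independent of the rank r.
def index_from_dims(n, m, r):
    dim_ker = n - r          # rank-nullity for A
    dim_left_ker = m - r     # rank-nullity for A^T
    return dim_ker - dim_left_ker

assert index_from_dims(n=3, m=3, r=2) == 0    # square matrices have index 0
assert index_from_dims(n=3, m=4, r=2) == -1   # the 4-by-3 matrix B of Example 54
assert index_from_dims(n=4, m=3, r=2) == 1    # the 3-by-4 matrix A of Example 54
```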
Recall that the kernel of matrix A is also called the null space in linear algebra. The kernel of the transpose matrix, which we denote ker(AT), is known as the left null space of matrix A. Note that the left null space can be described in terms of the columns of AT:
\[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) \equiv \left\{ \mathbf{z} \in \mathbb{F}^{m\times 1} \ : \ \mathbf{A}^{\mathrm T} \mathbf{z} = 0 \right\} = \left\{ \mathbf{z} \ : \quad \sum_{k=1}^m z_k \mathbf{a}_k^{\mathrm T} = 0 \right\} , \]
where \( \displaystyle \quad \mathbf{a}_k^{\mathrm T} \quad \) is the k-th column vector of the transpose matrix AT. The left null space is also isomorphic to a space of row vectors:
\[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) \equiv \left\{ \mathbf{y} \in \mathbb{F}^{1\times m} \ : \ \mathbf{y}\, \mathbf{A} = 0 \right\} = \left\{ \mathbf{y} \ : \quad \sum_{k=1}^m y_k \mathbf{a}_k^{\mathrm T} = 0 \right\} . \]
   
Example 53: To enhance understanding of the left null space, we consider a 2-by-3 matrix that we fill with some integers: \[ \mathbf{A} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \end{bmatrix} , \qquad \mathbf{A} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \end{bmatrix} . \tag{2.1} \] Its transpose is \[ \mathbf{A}^{\mathrm T} = \begin{bmatrix} a_{1,1} & a_{2,1} \\ a_{1,2} & a_{2,2} \\ a_{1,3} & a_{2,3} \end{bmatrix} , \qquad \mathbf{A}^{\mathrm T} = \begin{bmatrix} 1&4 \\ 2&5 \\ 3&6 \end{bmatrix} . \tag{2.2} \] Matrix A ∈ 𝔽m×n can be represented as a row of column vectors: \[ \mathbf{A} = \begin{bmatrix} {\bf a}_1 & {\bf a}_2 & \cdots & {\bf a}_n \end{bmatrix} , \] where in our case of m = 2 and n = 3, we have \[ \mathbf{a}_1 = \begin{pmatrix} a_{1,1} \\ a_{2,1} \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \end{pmatrix} , \quad \mathbf{a}_2 = \begin{pmatrix} a_{1,2} \\ a_{2,2} \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix} , \quad \] and \[ \mathbf{a}_3 = \begin{pmatrix} a_{1,3} \\ a_{2,3} \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \end{pmatrix} . \] We can find a similar representation for the transpose matrix \[ \mathbf{A}^{\mathrm T} = \begin{bmatrix} {\bf a}_1^{\mathrm T} & {\bf a}_2^{\mathrm T} & \cdots & {\bf a}_m^{\mathrm T} \end{bmatrix} , \] where \[ {\bf a}_1^{\mathrm T} = \begin{pmatrix} a_{1,1} \\ a_{1,2} \\ a_{1,3} \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} , \quad {\bf a}_2^{\mathrm T} = \begin{pmatrix} a_{2,1} \\ a_{2,2} \\ a_{2,3} \end{pmatrix} = \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} . \] Then the product of the transpose matrix AT and a column vector z ∈ ℝ2×1 can be written as \[ {\bf A}^{\mathrm T} \mathbf{z} = \begin{bmatrix} a_{1,1} & a_{2,1} \\ a_{1,2} & a_{2,2} \\ a_{1,3} & a_{2,3} \end{bmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} a_{1,1} z_1 + a_{2,1} z_2 \\ a_{1,2} z_1 + a_{2,2} z_2 \\ a_{1,3} z_1 + a_{2,3} z_2 \end{pmatrix} . 
\] Using column vectors of matrix AT, we rewrite it as \[ {\bf A}^{\mathrm T} \mathbf{z} = z_1 \begin{pmatrix} a_{1,1} \\ a_{1,2} \\ a_{1,3} \end{pmatrix} + z_2 \begin{pmatrix} a_{2,1} \\ a_{2,2} \\ a_{2,3} \end{pmatrix} = z_1 {\bf a}_1^{\mathrm T} + z_2 {\bf a}_2^{\mathrm T} . \] Hence, the left null space as a column space is \[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \left\{ \mathbf{z} \in \mathbb{F}^{m\times 1} \ : \quad \sum_{k=1}^m z_k \mathbf{a}_k^{\mathrm T} = 0 \right\} . \] For our 2-by-3 matrix, it becomes \[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \left\{ \mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \ : \quad z_1 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + z_2 \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 0 \right\} , \] which is equivalent to solving the following system of equations: \begin{align*} z_1 + 4 z_2 &= 0 , \\ 2 z_1 + 5 z_2 &=0 , \\ 3 z_1 + 6 z_2 &= 0 . \end{align*} From the first equation, we get z₁ = −4z₂. Substituting this expression into the remaining equations leads to \[ -8 z_2 + 5 z_2 = -3 z_2 = 0, \quad -12 z_2 + 6 z_2 = -6 z_2 = 0 . \] These two equations have only the zero solution. Mathematica confirms:
AA = {{1, 2, 3}, {4, 5, 6}}; NullSpace[Transpose[AA]]
{}

Now we represent the left null space as a row vector space. To achieve this, we multiply matrix A by the row vector y = [y₁, y₂] from the left. This yields \[ \mathbf{y}\,\mathbf{A} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \end{pmatrix} = \begin{bmatrix} y_1 a_{1,1} + y_2 a_{2,1} & y_1 a_{1,2} + y_2 a_{2,2} & y_1 a_{1,3} + y_2 a_{2,3} \end{bmatrix} . \] Transposing, \[ \left( \mathbf{y}\,\mathbf{A} \right)^{\mathrm T} = y_1 \begin{pmatrix} a_{1,1} \\ a_{1,2} \\ a_{1,3} \end{pmatrix} + y_2 \begin{pmatrix} a_{2,1} \\ a_{2,2} \\ a_{2,3} \end{pmatrix} = y_1 \mathbf{a}_1^{\mathrm T} + y_2 \mathbf{a}_2^{\mathrm T} . \] Therefore, we can define the left null space as the row vector space: \[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \left\{ \mathbf{y} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \ : \quad y_1 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + y_2 \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 0 \right\} , \] which leads to exactly the same system of equations as the column description.

   ■
End of Example 53
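A compact Python rendering of Example 53's conclusion (illustrative; the text's session uses Mathematica): the columns (1, 2, 3) and (4, 5, 6) of AT are linearly independent, so the only solution of ATz = 0 is z = 0.

```python
# A^T has columns (1,2,3) and (4,5,6); a nonzero 2x2 minor shows they are
# linearly independent, so ker(A^T) = {0}.
A_T = [[1, 4],
       [2, 5],
       [3, 6]]

minor = A_T[0][0] * A_T[1][1] - A_T[0][1] * A_T[1][0]   # 1*5 - 4*2
assert minor == -3 and minor != 0

# direct elimination check: with z1 = -4*z2 (from the first equation), the
# second equation 2*z1 + 5*z2 becomes -3*z2, which forces z2 = 0, hence z1 = 0
for z2 in (1, -2, 5):
    z1 = -4 * z2
    assert 2 * z1 + 5 * z2 == -3 * z2 and -3 * z2 != 0
```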
Theorem 20: Let A be a real m × n matrix and let b ∈ ℝm×1. There exists a solution x to the equation A x = b if and only if b ∈ ker(AT)⊥.
First suppose b ∈ ker(AT)⊥. This says that if ATx = 0, then b • x = 0. In other words, taking transposes, \[ \mathbf{x}^{\mathrm T} \mathbf{A} = 0 \qquad \Longrightarrow \qquad \mathbf{x}^{\mathrm T} \mathbf{b} = 0 . \] Thus, if P is a product of elementary matrices such that P A is in row reduced echelon form, and P A has a row of zeros in the k-th position (obtained as the k-th row of P times A), then there is also a zero in the k-th position of P b, because the k-th entry of P b is just the k-th row of P times b. Thus, the row reduced echelon forms of A and of the augmented matrix [A | b] have the same number of zero rows. Hence, the rank of the augmented matrix equals the rank of matrix A. According to Theorem 1, there exists a solution x to the system A x = b. It remains to prove the converse statement.

Let z ∈ ker(AT) and suppose A x = b. We need to verify b • z = 0. Since for the dot product we have A u • v = u • ATv, we obtain \[ \mathbf{b} \bullet \mathbf{z} = \mathbf{A}\,\mathbf{x} \bullet \mathbf{z} = \mathbf{x} \bullet \mathbf{A}^{\mathrm T} \mathbf{z} = \mathbf{x} \bullet \mathbf{0} = 0 . \]

   
Example 54: We consider two matrices \[ \mathbf{A} = \begin{bmatrix} 1&2&3&4 \\ 1&-1&1&-2 \\ 2&1&4&2 \end{bmatrix} , \quad \mathbf{B} = \begin{bmatrix} 1 & 2 & \phantom{-}3 \\ 3&2&\phantom{-}1 \\ 1&1&\phantom{-}1 \\ 1&0&-1 \end{bmatrix} . \] We consider the linear systems of equations A x = b and B x = b. Their transpose matrices become \[ \mathbf{A}^{\mathrm T} = \begin{bmatrix} 1&1&2 \\ 2&-1&1 \\ 3&1&4 \\ 4 &-2&2 \end{bmatrix} , \quad \mathbf{B}^{\mathrm T} = \begin{bmatrix} 1&3&1&\phantom{-}1 \\ 2&2&1&\phantom{-}0 \\ 3&1&1&-1 \end{bmatrix} . \] Using Mathematica, we find the kernel of AT (left null space for the matrix)
A = {{1, 2, 3, 4}, {1, -1, 1, -2}, {2, 1, 4, 2}}; NullSpace[Transpose[A]]
{{-1, -1, 1}}
Hence, the null space of AT is spanned by the vector (1, 1, −1) (Mathematica returns the scalar multiple (−1, −1, 1)). So the system A x = b has a solution if and only if the vector b is orthogonal to (1, 1, −1). This means that the coefficients of b = (b₁, b₂, b₃) satisfy the condition b₁ + b₂ = b₃. We check with Mathematica:
Solve[{A . {x, y, z, w} == {1, 1, 2}}, {x, y, z, w}]
{{z -> 3/5 - (3 x)/5, w -> -(1/5) + x/5 - y/2}}
As we see, a solution of the system A x = b exists, but it is not unique (it depends on two parameters): \[ \mathbf{A}\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} \quad \Longrightarrow \quad \mathbf{x} = \begin{pmatrix} x \\ y \\ \frac{3 (1 -x)}{5} \\ \frac{x -1}{5} - \frac{y}{2} \end{pmatrix} . \]

Now we determine the left null space of matrix B (with the aid of Mathematica)

B = {{1, 2, 3}, {3, 2, 1}, {1, 1, 1}, {1, 0, -1}}; NullSpace[Transpose[B]]
{{1, -1, 0, 2}, {-1, -1, 4, 0}}
\[ \mbox{ker} \left( \mathbf{B}^{\mathrm T} \right) = \left\{ \mathbf{y} \in \mathbb{R}^{4\times 1} \ : \ \mathbf{y}^{\mathrm T}\mathbf{B} = 0 \right\} \] Mathematica tells us that the left null space of matrix B is a two-dimensional subspace of ℝ4×1 spanned by two vectors: \[ \mbox{ker} \left( \mathbf{B}^{\mathrm T} \right) = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -1 \\ 0 \\ 2 \end{pmatrix} , \quad \begin{pmatrix} -1 \\ -1 \\ 4 \\ 0 \end{pmatrix} \right\} . \] Hence, the null space of BT is spanned by the two vectors (1, −1, 0, 2) and (−1, −1, 4, 0). So the system B x = b has a solution if and only if the vector b is orthogonal to these two vectors: \[ b_1 - b_2 + 2 b_4 =0 , \qquad -b_1 - b_2 + 4 b_3 = 0 . \]
Solve[{b1 - b2 + 2*b4 == 0, -b1 - b2 + 4*b3 == 0}, {b1, b2}]
{{b1 -> 2 b3 - b4, b2 -> 2 b3 + b4}}
So the vector b must depend on two parameters: \[ \mathbf{b} = \begin{bmatrix} 2 b_3 - b_4 & 2 b_3 + b_4 & b_3 & b_4 \end{bmatrix}^{\mathrm T} , \qquad b_3 , b_4 \in \mathbb{R} . \] However, a solution of the system B x = b, when it exists, is not unique because the kernel of B is a one-dimensional space \[ \mbox{ker} \left( \mathbf{B} \right) = \mbox{span}\left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} . \]
NullSpace[B]
{{1, -2, 1}}
The corresponding linear operator TB : ℝ3×1 ⇾ ℝ4×1, defined by TB(x) = B x, has index −1: \[ \mbox{ind}\left( \mathbf{B} \right) = 3 - 4 = \dim\left( \mbox{ker}\mathbf{B} \right) - \dim\left( \mbox{ker}\mathbf{B}^{\mathrm T} \right) = 1 - 2 = -1 . \]    ■
End of Example 54
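A Python cross-check of Example 54 (illustrative; the text uses Mathematica): the two vectors spanning ker(BT) annihilate every column of B, and any b of the stated two-parameter form satisfies both compatibility conditions.

```python
B = [[1, 2, 3],
     [3, 2, 1],
     [1, 1, 1],
     [1, 0, -1]]
left_null = [[1, -1, 0, 2], [-1, -1, 4, 0]]   # basis of ker(B^T) found above

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# y . (each column of B) = 0, i.e. y B = 0
for y in left_null:
    for col in zip(*B):
        assert dot(y, col) == 0

# the solvability conditions hold for the two-parameter family of b
for b3, b4 in [(1, 0), (0, 1), (2, -3)]:
    b = [2 * b3 - b4, 2 * b3 + b4, b3, b4]
    assert all(dot(y, b) == 0 for y in left_null)
```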

Complex matrices:    Although Theorem 20 is stated for real matrices—primarily for clarity—it remains valid for complex matrices as the following example shows.    

Example 55: We consider a singular 2-by-2 matrix with complex entries \[ \mathbf{B} = \begin{bmatrix} 1 & 1 + \mathbf{j} \\ \mathbf{j} & \mathbf{j} -1 \end{bmatrix} , \qquad \mbox{with} \qquad \mathbf{B}^{\mathrm T} = \begin{bmatrix} 1 & \mathbf{j} \\ 1 + \mathbf{j} & \mathbf{j} -1 \end{bmatrix} . \] The left null space of B is \[ \mbox{ker} \left( \mathbf{B}^{\mathrm T} \right) = \left\{ \mathbf{y} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \ : \ \mathbf{y}\,\mathbf{B} = 0 \right\} = \left\{ y_1 \begin{pmatrix} 1 \\ 1 + \mathbf{j} \end{pmatrix} + y_2 \begin{pmatrix} \mathbf{j} \\ \mathbf{j} -1 \end{pmatrix} = 0 \right\} . \]
B = {{1, 1 + \[ImaginaryJ]}, {\[ImaginaryJ], \[ImaginaryJ] - 1}};
NullSpace[Transpose[B]]
{{-I, 1}}
Therefore, the left null space of matrix B is spanned by the vector v = (−j, 1). According to Theorem 20, the vector b in the equation B x = b must be orthogonal to v; otherwise this equation has no solution. So \[ \mathbf{v} \perp \mathbf{b} \qquad \iff \qquad \mathbf{b} = b\begin{bmatrix} 1 & \mathbf{j} \end{bmatrix}^{\mathrm T} , \quad b \in \mathbb{C} . \]    ■
End of Example 55
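Example 55 can be replayed in Python with `1j` as the imaginary unit (a sketch of ours; note that the dot product here is bilinear, using the transpose rather than the conjugate transpose, exactly as in the theorem):

```python
# The left-null vector y = (-j, 1) satisfies B^T y = 0, and b = (1, j)
# satisfies y . b = 0 under the bilinear (non-conjugated) dot product.
B = [[1, 1 + 1j],
     [1j, 1j - 1]]
y = [-1j, 1]

B_T = [list(col) for col in zip(*B)]
assert [sum(r * s for r, s in zip(row, y)) for row in B_T] == [0, 0]

b = [1, 1j]
assert sum(u * v for u, v in zip(y, b)) == 0
```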

The following statement is known as the Fredholm alternative, named after the Swedish mathematician Ivar Fredholm (1866--1927); it is one of Fredholm's theorems and a result in Fredholm theory. It may be expressed in several ways, as a theorem of linear algebra, a theorem of integral equations, or as a theorem on Fredholm operators.

Theorem 21 (Fredholm alternative): Let A ∈ ℝm×n be an m × n matrix. Then A maps ℝn×1 onto ℝm×1 by multiplication from the left if and only if the only solution to ATx = 0 is x = 0.
If the only solution to ATx = 0 is x = 0, then ker(AT) = {0} and so ker(AT)⊥ = ℝm×1, because every b ∈ ℝm×1 has the property that b • 0 = 0. Therefore, A x = b has a solution for any b ∈ ℝm×1, because the vectors b for which there is a solution are exactly those in ker(AT)⊥ by Theorem 20. In other words, A maps ℝn×1 onto ℝm×1.

Conversely, if A is onto, then by Theorem 20 every b ∈ ℝm×1 is in ker(AT)⊥, and so if ATx = 0, then b • x = 0 for every b. In particular, this holds for b = x. Hence if ATx = 0, then x = 0.

   
Example 56: We start with square matrices (that always have index zero) \[ \mathbf{A} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} , \qquad \mathbf{B} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&7 \end{bmatrix} . \] The kernel (null space) of matrix A is one dimensional, as Mathematica tells us
A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}; NullSpace[A]
{{1, -2, 1}}
So \[ \mbox{ker}\left( \mathbf{A} \right) = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} . \] The left null space of matrix A (also according to Mathematica, whom we trust) is the same:
NullSpace[Transpose[A]]
{{1, -2, 1}}
The Fredholm alternative claims that the mapping TA : ℝ3×1 ⇾ ℝ3×1 cannot be surjective (onto). To verify this statement, we consider three equations that define the image of operator TA: \begin{align*} x + 2 y + 3 z &= a , \\ 4 x + 5 y + 6 z &= b , \\ 7 x + 8 y + 9 z &= c . \end{align*} Solving the first equation for x, x = a − 2y − 3z, and substituting this expression into the two other equations gives \[ 4 \left( a - 2 y - 3 z \right) + 5 y + 6 z = b, \quad 7 \left( a - 2 y - 3 z \right) + 8 y + 9 z = c. \] From the first of these, we express y and substitute it into the second. You will be surprised that the variable z is eliminated and we obtain a relation among the three parameters: c = 2b − a. So the image of matrix A becomes
7*(a - 2*(4*a/3 - 2*z - b/3) - 3*z) + 8*(4*a/3 - 2*z - b/3) + 9*z
-a + 2 b
\[ \mbox{image}\left( \mathbf{A} \right) = \left\{ \begin{pmatrix} a \\ b \\ 2b - a \end{pmatrix} \ : \quad a, b \in \mathbb{R} \right\} . \tag{6.1} \] This formula shows that the image of TA has dimension 2, so TA cannot be onto because the codomain has dimension 3. We verify (6.1) with a couple of numerical examples.

{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}} . {3, 2, 1}
{10, 28, 46}
\[ \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 10 \\ 28 \\ 46 \end{pmatrix} = \begin{pmatrix} 10 \\ 28 \\ 2 \cdot 28 - 10 \end{pmatrix} . \]
{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}} . {4, -2, 2}
{6, 18, 30}
\[ \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} \begin{pmatrix} 4 \\ -2 \\ 2 \end{pmatrix} = \begin{pmatrix} 6 \\ 18 \\ 30 \end{pmatrix} = \begin{pmatrix} 6 \\ 18 \\ 2 \cdot 18 - 6 \end{pmatrix} . \]

Let us consider matrix B. It is not a singular matrix since its rank is 3.

B = {{1, 2, 3}, {4, 5, 6}, {7, 8, 7}}; MatrixRank[B]
3
Hence, we know that ker(B) = {0}; the same conclusion holds for the left null space---it is the trivial zero space. Therefore, the linear mapping TB is onto.
NullSpace[Transpose[B]]
{}
   ■
End of Example 56
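The dichotomy in Example 56 can be cross-checked in Python (our own sketch, using a determinant test rather than the text's `MatrixRank`/`NullSpace` calls): a nonzero determinant forces trivial kernels on both sides, so TB is onto, while the singular matrix A fails the test.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

B = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 7]]
assert det3(B) == 6          # nonzero, so ker(B) = ker(B^T) = {0}: T_B is onto

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
assert det3(A) == 0          # singular, so T_A is not onto
```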

If vector b is orthogonal to all solutions of ATy = 0 or yA = 0, then A x = b has a solution. If not, then no solution exists. This duality between the operator and its adjoint is what makes the Fredholm Alternative tick—and the zero index guarantees that the dimensions align perfectly for the theorem to apply.    

Example 58: Let A be an m × n matrix with m > n. Then A cannot map ℝn×1 onto ℝm×1. The reason is that AT is an n × m ("fat") matrix with m > n, so in the augmented matrix \[ \left[ \mathbf{A}^{\mathrm T} \, \mid \,\mathbf{0} \right] \] there must be some free variables. Thus, there exists a nonzero vector x ∈ ker(AT), i.e., a nonzero x with ATx = 0, and by the Fredholm alternative A is not onto.

As an example, we consider the following matrix \[ \mathbf{A} = \begin{bmatrix} 1&2&0 \\ 3&2&1 \\ 1&1&1 \\ 4&5&2 \end{bmatrix} . \] Since the rank of this matrix is 3, its image is a three-dimensional subspace of ℝ4×1.

A = {{1, 2, 0}, {3, 2, 1}, {1, 1, 1}, {4, 5, 2}}; MatrixRank[A]
3
The kernel (or null space) of matrix A is {0}.
NullSpace[A]
{}
The kernel of AT is one-dimensional: it is spanned by the vector (−4, −1, −5, 3). Therefore, matrix A has index −1.
NullSpace[Transpose[A]]
{{-4, -1, -5, 3}}
The image of matrix A is \[ \mbox{image}(\mathbf{A}) = \left\{ \begin{pmatrix} a \\ b \\ c \\ \frac{4a + b + 5c}{3} \end{pmatrix} \quad : \quad a,b,c \in \mathbb{R} \right\} : \] the first three coordinates a, b, c can be prescribed arbitrarily, as the following Solve output shows, and the fourth coordinate 4x + 5y + 2z = (4a + b + 5c)/3 is then determined by them.
Solve[{x + 2*y == a, 3*x + 2*y + z == b, x + y + z == c}, {x, y, z}]
{{x -> 1/3 (-a + 2 b - 2 c), y -> 1/3 (2 a - b + c), z -> 1/3 (-a - b + 4 c)}}
Therefore, the vector [3, 3, 3, 10]T is in the image of matrix A (more precisely, of the linear transformation TA : ℝ3×1 ⇾ ℝ4×1 generated by matrix multiplication), but the vector [3, 3, 3, 9]T is not. So the index of matrix A is \[ \mbox{ind}(\mathbf{A}) = \dim\left( \mbox{ker}\mathbf{A} \right) - \dim\left( \mbox{ker}\mathbf{A}^{\mathrm T} \right) = -1 . \] The linear transformation TA is one-to-one, but not onto.    ■
End of Example 58
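Example 58's membership test can be sketched in Python (ours, outside the Mathematica session): a vector b lies in im(A) exactly when it is orthogonal to the left-null vector (−4, −1, −5, 3).

```python
A = [[1, 2, 0],
     [3, 2, 1],
     [1, 1, 1],
     [4, 5, 2]]
y = [-4, -1, -5, 3]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# y annihilates every column of A, consistent with y being in ker(A^T)
assert all(dot(y, col) == 0 for col in zip(*A))

assert dot(y, [3, 3, 3, 10]) == 0   # in the image
assert dot(y, [3, 3, 3, 9]) != 0    # not in the image
```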

 

  1. Find the index of the following matrices and determine which of them defines an injective (one-to-one) or surjective (onto) linear mapping.
    1. \( \displaystyle \quad \begin{bmatrix} 2&1&0 \\ 4&3&2 \\ 1&2&3 \\ 0&2&1 \end{bmatrix} ; \)
    2. \( \displaystyle \quad \begin{bmatrix} 1&0&0 \\ 2&2&2 \\ 1&2&0 \\ 0&1&0 \end{bmatrix} ; \)
    3. \( \displaystyle \quad \begin{bmatrix} 0&3&0&2&0&1&0 \\ 0&2&3&1&1&8&6 \\ 0&1&1&5&0&2&3 \\ 0&2&1&7&0&4&3 \end{bmatrix} ; \)
    4. \( \displaystyle \quad \begin{bmatrix} 0&1&0&2&0&1&0 \\ 0&3&2&6&0&5&4 \\ 0&1&1&2&0&2&2 \\ 0&2&1&4&0&3&2 \end{bmatrix} ; \)
    5. \( \displaystyle \quad \begin{bmatrix} 0&1&0&2&1&1&2 \\ 0&3&2&6&1&5&1 \\ 0&1&1&2&0&2&1 \\ 0&2&1&4&0&3&1 \end{bmatrix} . \)
  2. Suppose A is an m × n matrix and that m ≤ n. Suppose also that the rank of A equals m. Show that TA maps 𝔽n×1 onto 𝔽m×1. Hint: the vectors e₁, e₂, … , em occur as columns in the row reduced echelon form of A.
  3. Suppose A is an m × n matrix and that m ≥ n. Suppose also that the rank of A equals n. Show that TA is one-to-one. Hint: If not, there exists a nonzero vector x such that A x = 0, and this implies at least one column of A is a linear combination of the others. Show that this would require the column rank to be less than n.
  4. Suppose A is an m × n matrix and that m < n. Show that A is not one-to-one.

 
