
To enhance pedagogical effectiveness, the treatment of the dot product is presented in several distinct sections:

  • Dot product
  • Metric
  • Geometrical interpretation
  • Duality
  • Orthogonality
  • Projection

 

Solvability


Consider the linear system of equations,
\begin{equation} \label{EqDot.1} \mathbf{A}\,\mathbf{x} = \mathbf{b} , \qquad \mathbf{A} \in \mathbb{F}^{m\times n} , \end{equation}
where A is an m-by-n matrix with entries from the field 𝔽, x is an n × 1 column vector representing n unknowns, and b is a given m × 1 column vector. Suppose that matrix A is composed of n column vectors
\[ \mathbf{A} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} , \]
where each ak is the k-th column of matrix A, k = 1, 2, … , n. Then the matrix-vector product A x can be expressed as a linear combination of the columns of A:
\[ \mathbf{A}\,\mathbf{x} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n = \mathbf{b} \in \mathbb{F}^{m\times 1} . \]
Therefore, the matrix-vector product A x can be viewed as a formal dot product of the row of columns [a₁ a₂ ⋯ an] with the column of scalars x. This formulation highlights that solving the equation A x = b is equivalent to finding scalars (x₁, x₂, … , xn) such that the corresponding linear combination of the column vectors of A equals the vector b. This means that b must be a vector in span{a₁, a₂, … , an}. Hence, there exists a solution to the system \eqref{EqDot.1} if and only if b is an element of the subspace generated by the column vectors of matrix A.
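This column view is easy to check numerically. Here is a minimal sketch in Python (the text's own sessions use Mathematica; the matrix and vector below are made-up illustrations):

```python
# Check that the matrix-vector product A x equals the linear combination
# x1*a1 + x2*a2 + ... + xn*an of the columns of A.

def mat_vec(A, x):
    """Row-by-row matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def column_combination(A, x):
    """x1*a1 + ... + xn*an, where ak is the k-th column of A."""
    m, n = len(A), len(A[0])
    return [sum(x[k] * A[i][k] for k in range(n)) for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, -1, 2]

assert mat_vec(A, x) == column_combination(A, x)  # both give [5, 11]
```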
Let S ⊆ ℝn. Then S⊥ = {x ∈ ℝn   :   x • s = 0 for every s ∈ S} . The funny exponent, ⊥, is called “perp.”
We recall some notation.
Every m-by-n matrix A ∈ ℝm×n defines a linear transformation TA : ℝn×1 ⇾ ℝm×1 by multiplication from the left: TA(x) = A x. Then:
  • kernel (also known as the null space): ker(A) = {x ∈ ℝn×1 : A x = 0} ⊆ ℝn×1;
  • image: im(A) = {A x : x ∈ ℝn×1} ⊆ ℝm×1;
  • cokernel: the quotient (factor) space coker(A) = ℝm×1/im(A);
  • left null space: ker(AT);
  • coimage: the quotient space coim(A) = ℝn×1/ker(A).
Lemma 1: There is an isomorphism between the left null space and the dual of the cokernel: ker(AT) ≌ (coker(A))′; moreover, the left null space and the cokernel have the same dimension: \[ \dim\,\mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \dim \mbox{coker} (\mathbf{A}) = m - \mbox{rank}(\mathbf{A}) . \] Also \[ \dim(\mbox{coim}(\mathbf{A})) = \dim \mathbb{F}^n - \dim\mbox{ker}(\mathbf{A}) = \mbox{rank}(\mathbf{A}) \] because coim(A) ≌ im(A) = ColumnSpace(A) ⊆ 𝔽m.
An element of the cokernel is a coset [v] = v + im(A), where v ∈ ℝm×1. A linear functional on coker(A) is a map \[ \varphi \ : \ \mbox{coker}(\mathbf{A}) \to \mathbb{R} \] such that φ(α[v] + β[w]) = αφ([v]) + βφ([w]) for all scalars α, β.

To define such a φ, it suffices to define a linear functional on ℝm ≌ ℝm×1 that vanishes on im(A), because then it descends to the quotient.

We construct functionals vanishing on im(A): every linear functional on ℝm can be written as ℓy(v) = y • v for a unique y ∈ ℝm×1, using the dot product.

The condition “ℓy vanishes on im(A)” means: \[ \ell _y (\mathbf{A}\,\mathbf{v}) = \langle {\bf y}, \mathbf{A}\,\mathbf{v}\rangle = {\bf y} \bullet \mathbf{A}\, \mathbf{v} =0\quad \mathrm{for\ all\ }\mathbf{v}\in \mathbb{R}^{n}. \] Equivalently, \[ \langle \mathbf{A}^T \mathbf{y}, \mathbf{v}\rangle = 0\quad \mathrm{for\ all\ }\mathbf{v}\in \mathbb{R}^{n}, \] so ATy = 0, i.e., y ∈ ker(AT). Thus: Linear functionals on coker(A) correspond exactly to vectors y ∈ ker(AT).
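This correspondence is easy to test numerically. A minimal Python sketch (our own illustration, outside the text's Mathematica sessions), using the rank-1 matrix that reappears in Example 51 below:

```python
# If A^T y = 0, then ell_y(A v) = <y, A v> = 0 for every v, i.e. the
# functional ell_y vanishes on im(A).

def mat_vec(A, x):
    """Row-by-row matrix-vector product."""
    return [sum(r * s for r, s in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2],
     [3, 6]]      # rank 1: the second row is 3 times the first
y = [-3, 1]       # a vector in ker(A^T), checked below

assert mat_vec(transpose(A), y) == [0, 0]      # A^T y = 0
for v in ([1, 0], [0, 1], [7, -2]):            # ell_y vanishes on im(A)
    assert dot(y, mat_vec(A, v)) == 0
```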

Construction of the explicit isomorphism: Define \[ \Phi : \ \ker (\mathbf{A}^T)\rightarrow (\operatorname{coker} (\mathbf{A}))' \] by \[ \Phi (\mathbf{y})([\mathbf{v}]) = \langle {\bf y}, {\bf v}\rangle= {\bf y} \bullet {\bf v} . \] We must check that Φ:

  1. is well-defined: If [v] = [v'], then v' = v + A w for some w. Then \begin{align*} \langle \mathbf{y}, \mathbf{v}'\rangle &=\langle \mathbf{y}, \mathbf{v}+{\bf A}\,{\bf w}\rangle =\langle \mathbf{y}, \mathbf{v}\rangle +\langle \mathbf{y}, {\bf A}\,{\bf w}\rangle \\ &=\langle \mathbf{y},\mathbf{v}\rangle +\langle \mathbf{A}^T\mathbf{y}, \mathbf{w}\rangle =\langle \mathbf{y}, \mathbf{v}\rangle +0, \end{align*} since y ∈ ker(AT). So Φ(y)([v]) is independent of the representative.
  2. Linearity in y: Clear from linearity of the dot product.
  3. Injective: If Φ(y) = 0, then yv = 0 for all v; hence, y = 0.
  4. Surjective: Any ψ from coker(A)′ corresponds to a linear functional on ℝm vanishing on image(A); that functional is ℓy for a unique y, and then y ∈ ker(AT). So ψ = Φ(y).

Therefore, Φ is a vector space isomorphism: \[ \ker (\mathbf{A}^T) \cong \operatorname{coker}(\mathbf{A})'. \] In words: the left null space consists exactly of those functionals that vanish on the image and therefore “see” only the quotient by it.

   
Example 51: Let \[ \mathbf{A} = \left[ \begin{matrix}1&2\\ 3&6 \end{matrix}\right] : \mathbb{R}^{2} \rightarrow \mathbb{R}^{2} . \] Construct the image and cokernel:

The second row is 3 times the first, so rank(A) = 1. The image is \[ \operatorname{im} (\mathbf{A}) = \operatorname{span} \left\{ \left[ \begin{matrix}1\\ 3\end{matrix}\right] \right\} . \] Thus, \[ \operatorname{coker} (\mathbf{A}) = \mathbb{R}^{2}/\operatorname{im} (\mathbf{A}) \] is 1‑dimensional. A convenient representative of a nonzero coset is, say, \[ [\mathbf{v}]=\left[ \left[ \begin{matrix}1\\ 0\end{matrix}\right] \right] . \] Construct the left null space, ker(AT):

Compute \[ \mathbf{A}^T = \left[ \begin{matrix}1&3\\ 2&6\\ \end{matrix}\right] . \] We want ATy = 0, i.e., \[ \left[ \begin{matrix}1&3\\ 2&6\\ \end{matrix}\right] \left[ \begin{matrix}y_1\\ y_2\end{matrix}\right] =\left[ \begin{matrix}0\\ 0\end{matrix}\right] . \] This gives \[ \left\{ \, \begin{array}{l}\textstyle y_1+3y_2=0,\\ \textstyle 2y_1+6y_2=0,\end{array}\right. \] and the second equation is redundant. So \[ y_1=-3y_2. \] A basis vector for ker(AT) is \[ \mathbf{y} = \left[ \begin{matrix}-3\\ 1\end{matrix}\right] . \] So \( \displaystyle \quad \ker (\mathbf{A}^T) = \operatorname{span} \left\{ \left[ \begin{matrix}-3\\ 1\end{matrix}\right] \right\} , \quad \) a 1‑dimensional subspace.

The functional on the cokernel:

Take \( \displaystyle \quad y=\left[ \begin{matrix}-3\\ 1\end{matrix}\right] . \quad \) Define \( \displaystyle \quad \Phi (\mathbf{y})([\mathbf{v}]) = \langle \mathbf{y}, \mathbf{v}\rangle = \mathbf{y}^T\mathbf{v}. \)

Check that this is well-defined on cosets:

If v' = v + A w, then \[ \mathbf{y}^T \mathbf{v}' = \mathbf{y}^T \mathbf{v} + \mathbf{y}^T {\bf A}\,{\bf w} = \mathbf{y}^T \mathbf{v} + (\mathbf{A}^T \mathbf{y})^T\mathbf{w} = \mathbf{y}^T\mathbf{v}+0, \] since ATy = 0.

So Φ(y) is a linear functional on coker(A).

Evaluate it on our chosen coset: \[ \Phi (y)\left( \left[ \left[ \begin{matrix}1\\ 0\end{matrix}\right] \right] \right) =y^T\left[ \begin{matrix}1\\ 0\end{matrix}\right] =-3. \] This is nonzero, so Φ(y) is a nontrivial functional on the 1‑dimensional space coker(A); hence it spans (coker(A))′.

Thus, in this example:

  • ker(AT) is the line spanned by \( \displaystyle \quad \left[ \begin{matrix}-3\\ 1\end{matrix}\right] .\)
  • (coker(A))′ is the line spanned by the functional [v] ↦ yv.
  • The map y ↦ ([v] ↦ yv) is a linear isomorphism between these two 1‑dimensional spaces.
   ■
End of Example 51
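The well-definedness check of Example 51 can be replayed numerically; here is a short Python sketch (ours, outside the Mathematica session): shifting the representative v by any A w leaves the value of the functional unchanged.

```python
# For y in ker(A^T), the value y . v is independent of the coset
# representative: y . (v + A w) = y . v for every w.
A = [[1, 2], [3, 6]]
y = [-3, 1]        # spans ker(A^T)
v = [1, 0]         # representative of a nonzero coset [v]

def mat_vec(A, x):
    return [sum(r * s for r, s in zip(row, x)) for row in A]

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

for w in ([1, 1], [2, -5], [0, 3]):
    shifted = [vi + ti for vi, ti in zip(v, mat_vec(A, w))]
    assert dot(y, shifted) == dot(y, v) == -3   # Phi(y)([v]) = -3
```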
Theorem 19: Every m × n matrix A ∈ 𝔽m×n generates the linear transformation TA : 𝔽n×1 ⇾ 𝔽m×1 by matrix multiplication from the left, TA(x) = A x. Then the kernel of its transpose and its image are related via the following relation:
\begin{equation} \label{EqDot.2} \mbox{ker} \left( \mathbf{A}^{\mathrm T} \right) = \mbox{im}\left( \mathbf{A} \right)^{\perp} . \end{equation}
If x ∈ ker(AT), then ATx = 0. Taking the dot product with an arbitrary y, we get \[ \mathbf{A}^{\mathrm T}\mathbf{x} \bullet \mathbf{y} = \mathbf{x} \bullet \mathbf{A}\,\mathbf{y} = 0 \qquad \forall \mathbf{y} \in \mathbb{F}^{n\times 1} , \] where the first equality, ATx • y = x • A y, was shown in Theorem 1 of the previous section. Hence, x belongs to the orthogonal complement of the image of TA. Since all these relations can be read backward, the statement follows.
   
Example 52: We consider a singular 3-by-3 matrix \[ \mathbf{A} = \begin{bmatrix} 1& \phantom{-}2& \phantom{-}3 \\ 4& \phantom{-}5& \phantom{-}6 \\ 1& -1& -3 \end{bmatrix} \] of rank 2, as Mathematica confirms.
A = {{1, 2, 3}, {4, 5, 6}, {1, -1, -3}}; MatrixRank[A]
2
Since A is a square matrix, its index is zero, so the dimensions of its kernel and of the kernel of the transposed matrix are the same: both equal 3 − rank(A) = 1. Using Mathematica we find their kernels:
NullSpace[A]
{{1, -2, 1}}
NullSpace[Transpose[A]]
{{3, -1, 1}}
Hence, their kernels are spanned by the following vectors: \[ \mbox{ker}\left( \mathbf{A} \right) = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} , \quad \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \mbox{span} \left\{ \begin{pmatrix} 3 \\ -1 \\ 1 \end{pmatrix} \right\} . \] The image of A is \[ W = \mbox{image}\left( \mathbf{A} \right) = \left\{ \begin{pmatrix} x + 2y + 3z \\ 4x + 5y + 6z \\ x - y -3z \end{pmatrix} \ : \quad x, y, z \in \mathbb{R} \right\} . \] Since the image of any linear operator is a subspace of the codomain, we denote it by W for simplicity. Although there are three parameters in its definition (x, y, z), one of them can be excluded. Let us denote \[ a = x+2y+3z , \quad b = 4x + 5y + 6z , \quad c = x -y -3z. \] Excluding z, we obtain two equations \[ b - 2a = 2x + y , \qquad a+c = 2x + y . \] Hence, we get a relation between the three parameters \[ b - 2a = a + c \qquad \Longrightarrow \qquad c = b - 3a . \] This leads to an explicit formula for the image subspace: \[ W = \mbox{image}\left( \mathbf{A} \right) = \left\{ \begin{pmatrix} a \\ b \\ b - 3a \end{pmatrix} \ : \quad a, b \in \mathbb{R} \right\} . \] This space is two-dimensional---there are two parameters that generate W. Now we check whether the null space of AT is orthogonal to W.
v = {3, -1, 1}; u = {a, b, b - 3*a}; Dot[v, u]
0
This output shows that the image of TA is orthogonal to ker(AT).

   

Now we consider the following 4-by-4 matrix: \[ \mathbf{A} = \begin{bmatrix} 1&2&3&4 \\ 5&6&7&8 \\ -2&0&2&4 \\ 7&6&5&4 \end{bmatrix} . \] This matrix has rank 2.

A = {{1, 2, 3, 4}, {5, 6, 7, 8}, {-2, 0, 2, 4}, {7, 6, 5, 4}}; MatrixRank[A]
2
Using Mathematica, we find its kernel and kernel of its transpose matrix:
NullSpace[A]
{{2, -3, 0, 1}, {1, -2, 1, 0}}
NullSpace[Transpose[A]]
{{3, -2, 0, 1}, {-3, 1, 1, 0}}
\[ \mbox{ker} \left( \mathbf{A} \right) = \mbox{span} \left\{ \begin{pmatrix} 2\\ -3\\ 0\\ 1 \end{pmatrix} , \ \begin{pmatrix} 1 \\ -2 \\ 1 \\ 0 \end{pmatrix} \right\} , \] and \[ \mbox{ker} \left( \mathbf{A}^{\mathrm T} \right) = \mbox{span} \left\{ \begin{pmatrix} 3 \\ -2 \\ 0 \\ 1 \end{pmatrix} , \ \begin{pmatrix} -3 \\ 1 \\ 1 \\ 0 \end{pmatrix} \right\} . \] Let W be the image of operator TA: \[ W = \mbox{image}(\mathbf{A} ) = \left\{ \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} \ : \quad a,b,c,d \in \mathbb{R} \right\} , \] where we do not yet know any relation among the four coefficients generating W. What we do know, by Theorem 19, is that W is orthogonal to ker(AT). This yields \begin{align*} \left( 3, -2, 0, 1 \right) \bullet \left( a, b, c, d \right) &= 3a -2b +d = 0 , \\ \left( -3, 1, 1, 0 \right) \bullet \left( a, b, c, d \right) &= -3a + b + c = 0 . \end{align*}
Solve[{3*a - 2*b + d == 0, -3*a + b + c == 0}, {a, b}]
{{a -> 1/3 (2 c + d), b -> c + d}}
Solving these two equations, we eliminate two parameters: \[ 3a = 2c + d, \qquad b = c+d . \] Then the image subspace can be written explicitly: \[ W = \mbox{image}(\mathbf{A} ) = \left\{ \begin{pmatrix} \frac{2c + d}{3} \\ c+d \\ c \\ d \end{pmatrix} \ : \quad c,d \in \mathbb{R} \right\} . \] According to Theorem 19, this subspace W must be orthogonal to each of the two vectors generating ker(AT): \[ W \perp \left\{ \begin{pmatrix} 3 \\ -2 \\ 0 \\ 1 \end{pmatrix} , \quad \begin{pmatrix} -3 \\ 1 \\ 1 \\ 0 \end{pmatrix} \right\} . \] Using Mathematica, we verify the orthogonality relation:
{(2*c + d)/3, c+d, c, d} . {3, -2, 0, 1}
0
{(2*c + d)/3, c+d, c, d} . {-3, 1, 1, 0}
0
   ■
End of Example 52
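As an extra sanity check (in Python; the text's computations use Mathematica), the two vectors spanning ker(AT) for the 4-by-4 matrix must annihilate every column of A, since the columns span im(A):

```python
# The columns of A span im(A), so each vector spanning ker(A^T) must be
# orthogonal to every column.
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [-2, 0, 2, 4],
     [7, 6, 5, 4]]
left_null = [[3, -2, 0, 1], [-3, 1, 1, 0]]   # basis of ker(A^T) found above

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

for y in left_null:
    for col in zip(*A):          # iterate over the columns of A
        assert dot(y, col) == 0
```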

This theorem establishes a duality between matrix A and its transpose AT (more precisely, between the corresponding linear transformations TA and TAT, but we will refer to matrices):

  • If A is surjective (onto), then AT is injective (one-to-one).
  • If the operator AT is surjective, then A is injective.
Let A ∈ 𝔽m×n be an m-by-n matrix, which defines a linear transformation TA : 𝔽n×1 ⇾ 𝔽m×1 by matrix multiplication from left. Its index is \[ \mbox{ind}\left( \mathbf{A}\right) = n - m = \dim\left( \mbox{ker}\mathbf{A} \right) - \dim\left( \mbox{ker}\mathbf{A}^{\mathrm T} \right) . \]
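The index formula is just arithmetic on dimensions. A tiny Python sketch (the function name is ours, not from the text) makes the bookkeeping explicit via the rank-nullity counts dim ker(A) = n − r and dim ker(AT) = m − r:

```python
# index = dim ker(A) - dim ker(A^T) = (n - r) - (m - r) = n - m,
# independent of the rank r.
def index_from_dims(n, m, r):
    dim_ker = n - r          # rank-nullity for A
    dim_left_ker = m - r     # rank-nullity for A^T
    return dim_ker - dim_left_ker

assert index_from_dims(n=3, m=3, r=2) == 0    # square matrices have index 0
assert index_from_dims(n=3, m=4, r=2) == -1   # the 4-by-3 matrix B of Example 54
assert index_from_dims(n=4, m=3, r=2) == 1    # the 3-by-4 matrix A of Example 54
```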
Recall that the kernel of matrix A is also called the null space in linear algebra. The kernel of the transpose matrix, which we denote ker(AT), is known as the left null space of matrix A. Note that the left null space can be described in terms of the columns of AT:
\[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) \equiv \left\{ \mathbf{z} \in \mathbb{F}^{m\times 1} \ : \ \mathbf{A}^{\mathrm T} \mathbf{z} = 0 \right\} = \left\{ \mathbf{z} \ : \quad \sum_{k=1}^m z_k \mathbf{a}_k^{\mathrm T} = 0 \right\} , \]
where \( \displaystyle \quad \mathbf{a}_k^{\mathrm T} \quad \) is the k-th column vector of the transpose matrix AT. The left null space is also isomorphic to a space of row vectors:
\[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) \equiv \left\{ \mathbf{y} \in \mathbb{F}^{1\times m} \ : \ \mathbf{y}\, \mathbf{A} = 0 \right\} = \left\{ \mathbf{y} \ : \quad \sum_{k=1}^m y_k \mathbf{a}_k^{\mathrm T} = 0 \right\} . \]
   
Example 53: To enhance understanding of the left null space, we consider a 2-by-3 matrix that we fill with some integers: \[ \mathbf{A} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \end{bmatrix} , \qquad \mathbf{A} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \end{bmatrix} . \tag{2.1} \] Its transpose is \[ \mathbf{A}^{\mathrm T} = \begin{bmatrix} a_{1,1} & a_{2,1} \\ a_{1,2} & a_{2,2} \\ a_{1,3} & a_{2,3} \end{bmatrix} , \qquad \mathbf{A}^{\mathrm T} = \begin{bmatrix} 1&4 \\ 2&5 \\ 3&6 \end{bmatrix} . \tag{2.2} \] Matrix A ∈ 𝔽m×n can be represented as a row of column vectors: \[ \mathbf{A} = \begin{bmatrix} {\bf a}_1 & {\bf a}_2 & \cdots & {\bf a}_n \end{bmatrix} , \] where in our case of m = 2 and n = 3, we have \[ \mathbf{a}_1 = \begin{pmatrix} a_{1,1} \\ a_{2,1} \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \end{pmatrix} , \quad \mathbf{a}_2 = \begin{pmatrix} a_{1,2} \\ a_{2,2} \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix} , \quad \] and \[ \mathbf{a}_3 = \begin{pmatrix} a_{1,3} \\ a_{2,3} \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \end{pmatrix} . \] We can find a similar representation for the transpose matrix \[ \mathbf{A}^{\mathrm T} = \begin{bmatrix} {\bf a}_1^{\mathrm T} & {\bf a}_2^{\mathrm T} & \cdots & {\bf a}_m^{\mathrm T} \end{bmatrix} , \] where \[ {\bf a}_1^{\mathrm T} = \begin{pmatrix} a_{1,1} \\ a_{1,2} \\ a_{1,3} \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} , \quad {\bf a}_2^{\mathrm T} = \begin{pmatrix} a_{2,1} \\ a_{2,2} \\ a_{2,3} \end{pmatrix} = \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} . \] Then the product of the transpose matrix AT and a column vector z ∈ ℝ2×1 can be written as \[ {\bf A}^{\mathrm T} \mathbf{z} = \begin{bmatrix} a_{1,1} & a_{2,1} \\ a_{1,2} & a_{2,2} \\ a_{1,3} & a_{2,3} \end{bmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} a_{1,1} z_1 + a_{2,1} z_2 \\ a_{1,2} z_1 + a_{2,2} z_2 \\ a_{1,3} z_1 + a_{2,3} z_2 \end{pmatrix} . 
\] Using column vectors of matrix AT, we rewrite it as \[ {\bf A}^{\mathrm T} \mathbf{z} = z_1 \begin{pmatrix} a_{1,1} \\ a_{1,2} \\ a_{1,3} \end{pmatrix} + z_2 \begin{pmatrix} a_{2,1} \\ a_{2,2} \\ a_{2,3} \end{pmatrix} = z_1 {\bf a}_1^{\mathrm T} + z_2 {\bf a}_2^{\mathrm T} . \] Hence, the left null space as a column space is \[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \left\{ \mathbf{z} \in \mathbb{F}^{m\times 1} \ : \quad \sum_{k=1}^m z_k \mathbf{a}_k^{\mathrm T} = 0 \right\} . \] For our 2-by-3 matrix, it becomes \[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \left\{ \mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \ : \quad z_1 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + z_2 \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 0 \right\} , \] which is equivalent to solving the following system of equations: \begin{align*} z_1 + 4 z_2 &= 0 , \\ 2 z_1 + 5 z_2 &=0 , \\ 3 z_1 + 6 z_2 &= 0 . \end{align*} From the first equation, we get z₁ = −4z₂. Substituting this expression into the remaining equations leads to \[ -8 z_2 + 5 z_2 = -3 z_2 = 0, \quad -12 z_2 + 6 z_2 = -6 z_2 = 0 . \] These two equations have only the zero solution. Mathematica confirms:
AA = {{1, 2, 3}, {4, 5, 6}}; NullSpace[Transpose[AA]]
{}

Now we represent the left null space as a row vector space. To achieve this, we multiply matrix A by the row vector y = [y₁, y₂] from the left. This yields \[ \mathbf{y}\,\mathbf{A} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \end{pmatrix} = \begin{bmatrix} y_1 a_{1,1} + y_2 a_{2,1} & y_1 a_{1,2} + y_2 a_{2,2} & y_1 a_{1,3} + y_2 a_{2,3} \end{bmatrix} . \] Transposing, \[ \left( \mathbf{y}\,\mathbf{A} \right)^{\mathrm T} = y_1 \begin{pmatrix} a_{1,1} \\ a_{1,2} \\ a_{1,3} \end{pmatrix} + y_2 \begin{pmatrix} a_{2,1} \\ a_{2,2} \\ a_{2,3} \end{pmatrix} = y_1 \mathbf{a}_1^{\mathrm T} + y_2 \mathbf{a}_2^{\mathrm T} . \] Therefore, we can define the left null space as the row vector space: \[ \mbox{ker}\left( \mathbf{A}^{\mathrm T} \right) = \left\{ \mathbf{y} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \ : \quad y_1 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + y_2 \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 0 \right\} , \] which leads to exactly the same system of equations as the column description.

   ■
End of Example 53
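A compact Python rendering of Example 53's conclusion (illustrative; the text's session uses Mathematica): the columns (1, 2, 3) and (4, 5, 6) of AT are linearly independent, so the only solution of ATz = 0 is z = 0.

```python
# A^T has columns (1,2,3) and (4,5,6); a nonzero 2x2 minor shows they are
# linearly independent, so ker(A^T) = {0}.
A_T = [[1, 4],
       [2, 5],
       [3, 6]]

minor = A_T[0][0] * A_T[1][1] - A_T[0][1] * A_T[1][0]   # 1*5 - 4*2
assert minor == -3 and minor != 0

# direct elimination check: with z1 = -4*z2 (from the first equation), the
# second equation 2*z1 + 5*z2 becomes -3*z2, which forces z2 = 0, hence z1 = 0
for z2 in (1, -2, 5):
    z1 = -4 * z2
    assert 2 * z1 + 5 * z2 == -3 * z2 and -3 * z2 != 0
```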
Theorem 20: Let A be a real m × n matrix and let b ∈ ℝm×1. There exists a solution x to the equation A x = b if and only if b ∈ ker(AT)⊥.
First suppose b ∈ ker(AT)⊥. This says that if ATx = 0, then b • x = 0. In other words, taking transposes, \[ \mathbf{x}^{\mathrm T} \mathbf{A} = 0 \qquad \Longrightarrow \qquad \mathbf{x}^{\mathrm T} \mathbf{b} = 0 . \] Thus, if P is a product of elementary matrices such that P A is in row reduced echelon form, and P A has a row of zeros in the k-th position (obtained as the k-th row of P times A), then there is also a zero in the k-th position of P b, because the k-th entry of P b is just the k-th row of P times b. Thus, the row reduced echelon forms of A and of the augmented matrix [A | b] have the same number of zero rows. Hence, the rank of the augmented matrix equals the rank of matrix A. According to Theorem 1, there exists a solution x to the system A x = b. It remains to prove the converse statement.

Let z ∈ ker(AT) and suppose A x = b. We need to verify b • z = 0. Since for the dot product we have A u • v = u • ATv, we obtain \[ \mathbf{b} \bullet \mathbf{z} = \mathbf{A}\,\mathbf{x} \bullet \mathbf{z} = \mathbf{x} \bullet \mathbf{A}^{\mathrm T} \mathbf{z} = \mathbf{x} \bullet \mathbf{0} = 0 . \]

   
Example 54: We consider two matrices \[ \mathbf{A} = \begin{bmatrix} 1&2&3&4 \\ 1&-1&1&-2 \\ 2&1&4&2 \end{bmatrix} , \quad \mathbf{B} = \begin{bmatrix} 1 & 2 & \phantom{-}3 \\ 3&2&\phantom{-}1 \\ 1&1&\phantom{-}1 \\ 1&0&-1 \end{bmatrix} . \] We consider the linear systems of equations A x = b and B x = b. Their transpose matrices become \[ \mathbf{A}^{\mathrm T} = \begin{bmatrix} 1&1&2 \\ 2&-1&1 \\ 3&1&4 \\ 4 &-2&2 \end{bmatrix} , \quad \mathbf{B}^{\mathrm T} = \begin{bmatrix} 1&3&1&\phantom{-}1 \\ 2&2&1&\phantom{-}0 \\ 3&1&1&-1 \end{bmatrix} . \] Using Mathematica, we find the kernel of AT (left null space for the matrix)
A = {{1, 2, 3, 4}, {1, -1, 1, -2}, {2, 1, 4, 2}}; NullSpace[Transpose[A]]
{{-1, -1, 1}}
Hence, the null space of AT is spanned by the vector (1, 1, −1) (Mathematica returns the scalar multiple (−1, −1, 1)). So the system A x = b has a solution if and only if the vector b is orthogonal to (1, 1, −1). This means that the coefficients of b = (b₁, b₂, b₃) satisfy the condition b₁ + b₂ = b₃. We check with Mathematica:
Solve[{A . {x, y, z, w} == {1, 1, 2}}, {x, y, z, w}]
{{z -> 3/5 - (3 x)/5, w -> -(1/5) + x/5 - y/2}}
As we see, a solution of the system A x = b exists, but it is not unique (it depends on two parameters): \[ \mathbf{A}\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} \quad \Longrightarrow \quad \mathbf{x} = \begin{pmatrix} x \\ y \\ \frac{3 (1 -x)}{5} \\ \frac{x -1}{5} - \frac{y}{2} \end{pmatrix} . \]

Now we determine the left null space of matrix B (with the aid of Mathematica)

B = {{1, 2, 3}, {3, 2, 1}, {1, 1, 1}, {1, 0, -1}}; NullSpace[Transpose[B]]
{{1, -1, 0, 2}, {-1, -1, 4, 0}}
\[ \mbox{ker} \left( \mathbf{B}^{\mathrm T} \right) = \left\{ \mathbf{y} \in \mathbb{R}^{4\times 1} \ : \ \mathbf{y}^{\mathrm T}\mathbf{B} = 0 \right\} \] Mathematica tells us that the left null space of matrix B is a two-dimensional subspace of ℝ4×1 spanned by two vectors: \[ \mbox{ker} \left( \mathbf{B}^{\mathrm T} \right) = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -1 \\ 0 \\ 2 \end{pmatrix} , \quad \begin{pmatrix} -1 \\ -1 \\ 4 \\ 0 \end{pmatrix} \right\} . \] Hence, the null space of BT is spanned by the two vectors (1, −1, 0, 2) and (−1, −1, 4, 0). So the system B x = b has a solution if and only if the vector b is orthogonal to these two vectors: \[ b_1 - b_2 + 2 b_4 =0 , \qquad -b_1 - b_2 + 4 b_3 = 0 . \]
Solve[{b1 - b2 + 2*b4 == 0, -b1 - b2 + 4*b3 == 0}, {b1, b2}]
{{b1 -> 2 b3 - b4, b2 -> 2 b3 + b4}}
So the vector b must depend on two parameters: \[ \mathbf{b} = \begin{bmatrix} 2 b_3 - b_4 & 2 b_3 + b_4 & b_3 & b_4 \end{bmatrix}^{\mathrm T} , \qquad b_3 , b_4 \in \mathbb{R} . \] However, a solution of the system B x = b, when it exists, is not unique because the kernel of B is a one-dimensional space \[ \mbox{ker} \left( \mathbf{B} \right) = \mbox{span}\left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} . \]
NullSpace[B]
{{1, -2, 1}}
The corresponding linear operator TB : ℝ3×1 ⇾ ℝ4×1, defined by TB(x) = B x, has index −1: \[ \mbox{ind}\left( \mathbf{B} \right) = 3 - 4 = \dim\left( \mbox{ker}\mathbf{B} \right) - \dim\left( \mbox{ker}\mathbf{B}^{\mathrm T} \right) = 1 - 2 = -1 . \]    ■
End of Example 54
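A Python cross-check of Example 54 (illustrative; the text uses Mathematica): the two vectors spanning ker(BT) annihilate every column of B, and any b of the stated two-parameter form satisfies both compatibility conditions.

```python
B = [[1, 2, 3],
     [3, 2, 1],
     [1, 1, 1],
     [1, 0, -1]]
left_null = [[1, -1, 0, 2], [-1, -1, 4, 0]]   # basis of ker(B^T) found above

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# y . (each column of B) = 0, i.e. y B = 0
for y in left_null:
    for col in zip(*B):
        assert dot(y, col) == 0

# the solvability conditions hold for the two-parameter family of b
for b3, b4 in [(1, 0), (0, 1), (2, -3)]:
    b = [2 * b3 - b4, 2 * b3 + b4, b3, b4]
    assert all(dot(y, b) == 0 for y in left_null)
```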

Complex matrices:    Although Theorem 20 is stated for real matrices—primarily for clarity—it remains valid for complex matrices as the following example shows.    

Example 55: We consider a singular 2-by-2 matrix with complex entries \[ \mathbf{B} = \begin{bmatrix} 1 & 1 + \mathbf{j} \\ \mathbf{j} & \mathbf{j} -1 \end{bmatrix} , \qquad \mbox{with} \qquad \mathbf{B}^{\mathrm T} = \begin{bmatrix} 1 & \mathbf{j} \\ 1 + \mathbf{j} & \mathbf{j} -1 \end{bmatrix} . \] The left null space of B is \[ \mbox{ker} \left( \mathbf{B}^{\mathrm T} \right) = \left\{ \mathbf{y} = \begin{bmatrix} y_1 & y_2 \end{bmatrix} \ : \ \mathbf{y}\,\mathbf{B} = 0 \right\} = \left\{ y_1 \begin{pmatrix} 1 \\ 1 + \mathbf{j} \end{pmatrix} + y_2 \begin{pmatrix} \mathbf{j} \\ \mathbf{j} -1 \end{pmatrix} = 0 \right\} . \]
B = {{1, 1 + \[ImaginaryJ]}, {\[ImaginaryJ], \[ImaginaryJ] - 1}};
NullSpace[Transpose[B]]
{{-I, 1}}
Therefore, the left null space of matrix B is spanned by the vector v = (−j, 1). According to Theorem 20, the vector b in the equation B x = b must be orthogonal to v; otherwise this equation has no solution. So \[ \mathbf{v} \perp \mathbf{b} \qquad \iff \qquad \mathbf{b} = b\begin{bmatrix} 1 & \mathbf{j} \end{bmatrix}^{\mathrm T} , \quad b \in \mathbb{C} . \]    ■
End of Example 55
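Example 55 can be replayed in Python with `1j` as the imaginary unit (a sketch of ours; note that the dot product here is bilinear, using the transpose rather than the conjugate transpose, exactly as in the theorem):

```python
# The left-null vector y = (-j, 1) satisfies B^T y = 0, and b = (1, j)
# satisfies y . b = 0 under the bilinear (non-conjugated) dot product.
B = [[1, 1 + 1j],
     [1j, 1j - 1]]
y = [-1j, 1]

B_T = [list(col) for col in zip(*B)]
assert [sum(r * s for r, s in zip(row, y)) for row in B_T] == [0, 0]

b = [1, 1j]
assert sum(u * v for u, v in zip(y, b)) == 0
```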

The following statement is known as the Fredholm alternative, named after the Swedish mathematician Ivar Fredholm (1866--1927); it is one of Fredholm's theorems and a result in Fredholm theory. It may be expressed in several ways, as a theorem of linear algebra, a theorem of integral equations, or as a theorem on Fredholm operators.

Theorem 21 (Fredholm alternative): Let A ∈ ℝm×n be an m × n matrix. Then A maps ℝn×1 onto ℝm×1 by multiplication from the left if and only if the only solution to ATx = 0 is x = 0.
If the only solution to ATx = 0 is x = 0, then ker(AT) = {0} and so ker(AT)⊥ = ℝm×1, because every b ∈ ℝm×1 has the property that b • 0 = 0. Therefore, A x = b has a solution for any b ∈ ℝm×1, because the vectors b for which there is a solution are exactly those in ker(AT)⊥ by Theorem 20. In other words, A maps ℝn×1 onto ℝm×1.

Conversely, if A is onto, then by Theorem 20 every b ∈ ℝm×1 is in ker(AT)⊥, and so if ATx = 0, then b • x = 0 for every b. In particular, this holds for b = x. Hence if ATx = 0, then x = 0.

   
Example 56: We start with square matrices (that always have index zero) \[ \mathbf{A} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} , \qquad \mathbf{B} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&7 \end{bmatrix} . \] The kernel (null space) of matrix A is one dimensional, as Mathematica tells us
A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}; NullSpace[A]
{{1, -2, 1}}
So \[ \mbox{ker}\left( \mathbf{A} \right) = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} . \] The left null space of matrix A (also according to Mathematica, whom we trust) is the same:
NullSpace[Transpose[A]]
{{1, -2, 1}}
The Fredholm alternative claims that the mapping TA : ℝ3×1 ⇾ ℝ3×1 cannot be surjective (onto). To verify this statement, we consider three equations that define the image of operator TA: \begin{align*} x + 2 y + 3 z &= a , \\ 4 x + 5 y + 6 z &= b , \\ 7 x + 8 y + 9 z &= c . \end{align*} Solving the first equation for x, x = a − 2y − 3z, and substituting this expression into the two other equations gives \[ 4 \left( a - 2 y - 3 z \right) + 5 y + 6 z = b, \quad 7 \left( a - 2 y - 3 z \right) + 8 y + 9 z = c. \] From the first of these, we express y and substitute it into the second. You will be surprised that the variable z is eliminated and we obtain a relation among the three parameters: c = 2b − a. So the image of matrix A becomes
7*(a - 2*(4*a/3 - 2*z - b/3) - 3*z) + 8*(4*a/3 - 2*z - b/3) + 9*z
-a + 2 b
\[ \mbox{image}\left( \mathbf{A} \right) = \left\{ \begin{pmatrix} a \\ b \\ 2b - a \end{pmatrix} \ : \quad a, b \in \mathbb{R} \right\} . \tag{6.1} \] This formula shows that the image of TA has dimension 2, so TA cannot be onto because the codomain has dimension 3. We verify (6.1) with a couple of numerical examples.

{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}} . {3, 2, 1}
{10, 28, 46}
\[ \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 10 \\ 28 \\ 46 \end{pmatrix} = \begin{pmatrix} 10 \\ 28 \\ 2 \cdot 28 - 10 \end{pmatrix} . \]
{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}} . {4, -2, 2}
{6, 18, 30}
\[ \begin{bmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{bmatrix} \begin{pmatrix} 4 \\ -2 \\ 2 \end{pmatrix} = \begin{pmatrix} 6 \\ 18 \\ 30 \end{pmatrix} = \begin{pmatrix} 6 \\ 18 \\ 2 \cdot 18 - 6 \end{pmatrix} . \]

Let us consider matrix B. It is not a singular matrix since its rank is 3.

B = {{1, 2, 3}, {4, 5, 6}, {7, 8, 7}}; MatrixRank[B]
3
Hence, we know that ker(B) = {0}; the same conclusion holds for the left null space---it is the trivial zero space. Therefore, the linear mapping TB is onto.
NullSpace[Transpose[B]]
{}
   ■
End of Example 56
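The dichotomy in Example 56 can be cross-checked in Python (our own sketch, using a determinant test rather than the text's `MatrixRank`/`NullSpace` calls): a nonzero determinant forces trivial kernels on both sides, so TB is onto, while the singular matrix A fails the test.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

B = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 7]]
assert det3(B) == 6          # nonzero, so ker(B) = ker(B^T) = {0}: T_B is onto

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
assert det3(A) == 0          # singular, so T_A is not onto
```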

If vector b is orthogonal to all solutions of ATy = 0 or yA = 0, then A x = b has a solution. If not, then no solution exists. This duality between the operator and its adjoint is what makes the Fredholm Alternative tick—and the zero index guarantees that the dimensions align perfectly for the theorem to apply.    

Example 58: Let A be an m × n matrix with m > n. Then A cannot map ℝn×1 onto ℝm×1. The reason is that AT is an n × m ("fat") matrix with m > n, so in the augmented matrix \[ \left[ \mathbf{A}^{\mathrm T} \, \mid \,\mathbf{0} \right] \] there must be some free variables. Thus, there exists a nonzero vector x ∈ ker(AT), i.e., a nonzero x with ATx = 0, and by the Fredholm alternative A is not onto.

As an example, we consider the following matrix \[ \mathbf{A} = \begin{bmatrix} 1&2&0 \\ 3&2&1 \\ 1&1&1 \\ 4&5&2 \end{bmatrix} . \] Since the rank of this matrix is 3, its image is a three-dimensional subspace of ℝ4×1.

A = {{1, 2, 0}, {3, 2, 1}, {1, 1, 1}, {4, 5, 2}}; MatrixRank[A]
3
The kernel (or null space) of matrix A is {0}.
NullSpace[A]
{}
The kernel of AT is one-dimensional: it is spanned by the vector (−4, −1, −5, 3). Therefore, matrix A has index −1.
NullSpace[Transpose[A]]
{{-4, -1, -5, 3}}
The image of matrix A is \[ \mbox{image}(\mathbf{A}) = \left\{ \begin{pmatrix} a \\ b \\ c \\ \frac{4a + b + 5c}{3} \end{pmatrix} \quad : \quad a,b,c \in \mathbb{R} \right\} : \] the first three coordinates a, b, c can be prescribed arbitrarily, as the following Solve output shows, and the fourth coordinate 4x + 5y + 2z = (4a + b + 5c)/3 is then determined by them.
Solve[{x + 2*y == a, 3*x + 2*y + z == b, x + y + z == c}, {x, y, z}]
{{x -> 1/3 (-a + 2 b - 2 c), y -> 1/3 (2 a - b + c), z -> 1/3 (-a - b + 4 c)}}
Therefore, the vector [3, 3, 3, 10]T is in the image of matrix A (more precisely, of the linear transformation TA : ℝ3×1 ⇾ ℝ4×1 generated by matrix multiplication), but the vector [3, 3, 3, 9]T is not. So the index of matrix A is \[ \mbox{ind}(\mathbf{A}) = \dim\left( \mbox{ker}\mathbf{A} \right) - \dim\left( \mbox{ker}\mathbf{A}^{\mathrm T} \right) = -1 . \] The linear transformation TA is one-to-one, but not onto.    ■
End of Example 58
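Example 58's membership test can be sketched in Python (ours, outside the Mathematica session): a vector b lies in im(A) exactly when it is orthogonal to the left-null vector (−4, −1, −5, 3).

```python
A = [[1, 2, 0],
     [3, 2, 1],
     [1, 1, 1],
     [4, 5, 2]]
y = [-4, -1, -5, 3]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# y annihilates every column of A, consistent with y being in ker(A^T)
assert all(dot(y, col) == 0 for col in zip(*A))

assert dot(y, [3, 3, 3, 10]) == 0   # in the image
assert dot(y, [3, 3, 3, 9]) != 0    # not in the image
```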

 

  1. Find the index of the following matrices and determine which of them defines an injective (one-to-one) or surjective (onto) linear mapping.
    1. \( \displaystyle \quad \begin{bmatrix} 2&1&0 \\ 4&3&2 \\ 1&2&3 \\ 0&2&1 \end{bmatrix} ; \)
    2. \( \displaystyle \quad \begin{bmatrix} 1&0&0 \\ 2&2&2 \\ 1&2&0 \\ 0&1&0 \end{bmatrix} ; \)
    3. \( \displaystyle \quad \begin{bmatrix} 0&3&0&2&0&1&0 \\ 0&2&3&1&1&8&6 \\ 0&1&1&5&0&2&3 \\ 0&2&1&7&0&4&3 \end{bmatrix} ; \)
    4. \( \displaystyle \quad \begin{bmatrix} 0&1&0&2&0&1&0 \\ 0&3&2&6&0&5&4 \\ 0&1&1&2&0&2&2 \\ 0&2&1&4&0&3&2 \end{bmatrix} ; \)
    5. \( \displaystyle \quad \begin{bmatrix} 0&1&0&2&1&1&2 \\ 0&3&2&6&1&5&1 \\ 0&1&1&2&0&2&1 \\ 0&2&1&4&0&3&1 \end{bmatrix} . \)
  2. Suppose A is an m × n matrix and that m ≤ n. Suppose also that the rank of A equals m. Show that TA maps 𝔽n×1 onto 𝔽m×1. Hint: the vectors e₁, e₂, … , em occur as columns in the row reduced echelon form of A.
  3. Suppose A is an m × n matrix and that m ≥ n. Suppose also that the rank of A equals n. Show that TA is one-to-one. Hint: If not, there exists a nonzero vector x such that A x = 0, and this implies at least one column of A is a linear combination of the others. Show that this would require the column rank to be less than n.
  4. Suppose A is an m × n matrix and that m < n. Show that A is not one-to-one.

 
