In this section, we consider vector spaces over the field of real numbers rather than the complex numbers. The dot product of two n-dimensional vectors x = (x₁, x₂, … , xn) and y = (y₁, y₂, … , yn),
Here, u is a non-zero vector in ℝ² that defines a one-dimensional subspace spanned by u, and v is an arbitrary vector in the plane. We label the tip of v as point Q, with its tail positioned at the origin O.
The point P lies on the line defined by u and is the closest point on that line to Q. Equivalently, the triangle formed by points O, P, and Q is a right triangle, with the right angle at P.
The distance from O to P, measured positively in the direction of u, is called the
component of v in the direction of u, and is denoted compu(v). The vector \( \displaystyle \quad \vec{OP} \quad \) is referred to as the projection of v onto u, and is denoted proju(v) or v∥. We are going to find formulas for these quantities.
Let θ be the included angle between v and u. From trigonometry, considering the right triangle (O, P, Q), we know that
From the geometric interpretation of the dot product, \( \displaystyle \quad \mathbf{v} \bullet \mathbf{u} = \| \mathbf{v} \| \,\| \mathbf{u} \| \,\cos\theta , \) we get
Therefore, the dot product u • v of two vectors can be interpreted as a simple numerical product of two quantities: the magnitude of the vector u and the signed magnitude of the projection of v onto the span of u, which we denote by v∥ .
In other words, to compute the dot product of two vectors, one multiplies the length of one of them by the component of the other in its direction.
Since the formula for the dot product \eqref{EqDot.1} does not depend on the specific choice of the direction vector (as long as it is non-zero), we can express the dot product as:
The vector proju(v) = OP can now be computed by re-scaling u to the correct length. Specifically, we
first normalize u by dividing it by its own length, and then multiply by |OP|. In formulas:
The following definition summarizes what we have just found.
Let u be a non-zero vector and v any vector. Then the component of v in the direction of u is
defined to be the scalar
\begin{equation} \label{EqDot.11}
\mbox{comp}_{\bf u}\left( \mathbf{v} \right) = \frac{\mathbf{u} \bullet \mathbf{v}}{\| \mathbf{u} \|} .
\end{equation}
The projection of v onto u is defined to be the vector
\begin{equation} \label{EqDot.12}
\mbox{proj}_{\bf u}\left( \mathbf{v} \right) = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}} \,\mathbf{u} = \frac{\mathbf{u} \bullet \mathbf{v}}{\| \mathbf{u} \|^2} \,\mathbf{u} .
\end{equation}
These two operations are also called the scalar projection and vector projection, respectively.
Formula \eqref{EqDot.12} may be interpreted as the orthogonal projection of the vector v, not merely onto the vector u, but onto the subspace span{u}. This aligns with the geometric intuition commonly introduced in elementary linear algebra: we construct the perpendicular from the tip of v to the line defined by the direction of u.
It is important to note that the resulting scalar quantity—namely, the component of v in the direction of u—can be negative. This occurs when the angle between v and u is obtuse, indicating that v points in a direction opposite to u. In such cases, the projection of v onto span{u} lies on the extension of the line in the direction opposite to u, and the component of v along u is therefore negative.
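Formulas \eqref{EqDot.11} and \eqref{EqDot.12} are easy to try in code. The following is a minimal Python sketch (the tutorial's own computations use Mathematica; the vectors u and v below are illustrative), including a case with an obtuse angle, where the component is negative:

```python
# Editorial sketch: scalar and vector projection for plain Python lists
# standing in for vectors in R^n.
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def comp(u, v):
    """Component of v in the direction of u: (u . v)/||u||."""
    return dot(u, v) / math.sqrt(dot(u, u))

def proj(u, v):
    """Projection of v onto span{u}: ((u . v)/(u . u)) u."""
    c = dot(u, v) / dot(u, u)
    return [c * a for a in u]

u = [3, 4]
v = [-6, 3]          # obtuse angle with u: u . v = -18 + 12 = -6 < 0
print(comp(u, v))    # -1.2, a negative component
print(proj(u, v))    # (-6/25)*(3, 4) = [-0.72, -0.96]
```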
Formula \eqref{EqDot.12} shows that the projection vector depends linearly on the vector v. We know that every linear transformation can be expressed as matrix multiplication, so we would like to express the projection vector as
Example 41:
Let u = [1, −2, 3]T be a vector in ℝ³. Since its squared norm is 14, we find the corresponding projection matrix
\[
\mathbf{P} = \mathbf{P}_u = \frac{1}{14}\,\mathbf{u} \,\mathbf{u}^{\mathrm T} .
\]
The actual evaluation of the matrix P is delegated to Mathematica:
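Before turning to Mathematica, the claim can be verified with exact rational arithmetic. The following Python sketch (editorial, not part of the original session) builds P = u uᵀ/14 and confirms that it is idempotent:

```python
# Editorial sketch: projection matrix P = u u^T / (u . u) for u = (1, -2, 3).
from fractions import Fraction

u = [1, -2, 3]
uu = sum(a * a for a in u)                      # u . u = 14
P = [[Fraction(a * b, uu) for b in u] for a in u]

# A projection matrix is idempotent: P P = P.
P2 = [[sum(P[i][k] * P[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
print(P)
print(P2 == P)   # True
```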
An important application of projections is that every vector v can be uniquely written as a sum of two
orthogonal vectors, one of which is a scalar multiple of some given non-zero vector u, and the other of
which is orthogonal to u.
Theorem 13:
Let u be a non-zero vector, and let v be any vector from the same vector space. Then there exist unique vectors v∥ and v⊥ such that
\begin{equation} \label{EqDot.13}
\mathbf{v} = \mathbf{v}_{\|} + \mathbf{v}_{\perp} ,
\end{equation}
where v∥ is a scalar multiple of u, and v⊥ is orthogonal to u.
We visualize the situation with Picture 2.
To show that such a decomposition exists, let
\[
\mathbf{v}_{\|} = \mbox{proj}_u (\mathbf{v}) = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}}\,\mathbf{u} ,
\]
and define v⊥ by the identity v⊥ = v − v∥.
By definition, Eq.(5) is satisfied, and v∥ is a scalar multiple of u. We must show
that v⊥ is orthogonal to u. For this, we verify that their dot product equals zero:
\begin{align*}
\mathbf{u} \bullet \mathbf{v}_{\perp} &= \mathbf{u} \bullet \left( \mathbf{v} - \mathbf{v}_{\|} \right)
\\
&= \mathbf{u} \bullet \mathbf{v} - \mathbf{u} \bullet \mathbf{v}_{\|}
\\
&= \mathbf{u} \bullet \mathbf{v} - \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}}\,\mathbf{u} \bullet \mathbf{u}
\\
&= \mathbf{u} \bullet \mathbf{v} - \mathbf{u} \bullet \mathbf{v} = 0 .
\end{align*}
To show uniqueness, suppose that Eq.(13) holds with v∥ = λu. Taking the dot product of both sides of (13) with u and using u • v⊥ = 0, this yields
\begin{align*}
\mathbf{u} \bullet \mathbf{v} &= \mathbf{u} \bullet \left( \mathbf{v}_{\|} + \mathbf{v}_{\perp} \right)
\\
&= \mathbf{u} \bullet \lambda \mathbf{u} + \mathbf{u} \bullet \mathbf{v}_{\perp}
\\
&= \lambda \,\| \mathbf{u} \|^2 ,
\end{align*}
which implies λ = u • v/∥u∥². Thus, there can be no more than one such vector v∥. Since v⊥ must equal v − v∥, it follows that there can be no more than one choice for both v∥ and v⊥, proving their uniqueness.
Example 42:
Let us consider two vectors from ℝ³:
\[
\mathbf{u} = \left( 3,-4,5 \right) , \qquad \mathbf{v} = \left( 2.71828, 3.14159, 1.61803 \right) .
\]
We want to decompose vector v into v = v∥ + v⊥, where v∥ is parallel to u and v⊥ is orthogonal to u. Using formula (12), we find the vector that is parallel to u:
\[
\mathbf{v}_{\|} = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}} \,\mathbf{u} = 0.0735726 \,\mathbf{u} = 0.0735726 \left( 3,-4,5 \right)
\]
because
\[
\mathbf{u} \bullet \mathbf{u} = 50, \qquad \mathbf{u} \bullet \mathbf{v} = 3.67863
\]
u = {3, -4, 5}; v = {2.71828, 3.14159, 1.61803};
Dot[u, u]
50
Dot[u, v]
3.67863
Dot[u, v]/Dot[u, u]
0.0735726
Then the orthogonal component becomes
\[
\mathbf{v}_{\perp} = \mathbf{v} - \mathbf{v}_{\|} = \left( 2.49756, 3.43588, 1.25017 \right) .
\]
v - Dot[u, v]/Dot[u, u]*u
{2.49756, 3.43588, 1.25017}
u = {3, -4, 5}; v = {2.71828, 3.14159, 1.61803};
(* the arrow au for vector u, referenced below, is defined analogously to av *)
au = Graphics3D[{Blue, Thickness[0.01], Arrowheads[0.04],
Arrow[{{0.0, 0.0, 0}, u}]}];
av = Graphics3D[{Black, Thickness[0.01], Arrowheads[0.04],
Arrow[{{0.0, 0.0, 0}, v}]}];
Show[av, au]
Figure 42.1: Two orthogonal vectors v⊥ and v∥
■
End of Example 42
Projection on a subspace
It is natural to extend the definition of vector projection from the one-dimensional line generated by a single vector to a subspace spanned by an orthogonal set. This generalization is not only elegant but also essential for solving minimum-distance problems in linear algebra and optimization.
Recall that a subspace of a vector space is a subset that is closed under both vector addition and scalar multiplication. That means any linear combination of vectors in the subspace remains within the subspace.
Let W be a subspace of ℝn and let x be a vector in ℝn. The closest vector to x in W is denoted by xW; equivalently, the difference x − xW is orthogonal to all vectors in W. This vector xW is called the orthogonal projection of the vector x on the subspace W, and the length of the vector x − xW can be regarded as the distance from x to W.
For ℝ³, common subspaces include lines through the origin and planes through the origin.
The following example serves as an intuitive development of vector projection in 3-dimensional physical space.
Example 43:
A basis for the xy-plane is given by the two standard coordinate vectors
\[
\mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \mathbf{i} , \qquad \mathbf{e}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \mathbf{j} .
\]
The set {e₁, e₂} is orthonormal, so for any vector x = (x₁, x₂, x₃), we have
\[
\mathbf{x}_W = \left( \mathbf{x} \bullet \mathbf{e}_1 \right) \mathbf{e}_1 + \left( \mathbf{x} \bullet \mathbf{e}_2 \right) \mathbf{e}_2 = \begin{pmatrix} x_1 \\ x_2 \\ 0 \end{pmatrix} .
\]
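In code, this projection reads as follows (a minimal Python sketch; the tutorial's own sessions use Mathematica):

```python
# Editorial sketch of Example 43: projecting onto the xy-plane
# using the orthonormal pair e1, e2.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

e1 = [1, 0, 0]
e2 = [0, 1, 0]

def proj_xy(x):
    # x_W = (x . e1) e1 + (x . e2) e2
    c1, c2 = dot(x, e1), dot(x, e2)
    return [c1 * a + c2 * b for a, b in zip(e1, e2)]

print(proj_xy([7, -2, 5]))   # [7, -2, 0]
```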
There are two equivalent approaches to the definition of orthogonal projection: we can seek p ∈ W
such that y − p ∈ W⊥, or we can seek p ∈ W such that ∥y − p∥ is minimized. There are several issues to consider: uniqueness,
existence, and computation of the projection vector p. We start with the uniqueness question.
Theorem 14:
Let W be a subspace of ℝn, and let y be a vector in ℝn.
Then there is at most one vector p in W such that y − p ∈ W⊥.
Suppose there exist two vectors p, q ∈ W such that y − p ∈ W⊥ and y − q ∈ W⊥. Since W and W⊥ are both subspaces of ℝn, the difference p − q is an element of W. On the other hand, (y − q) − (y − p) = p − q ∈ W⊥. However, W ∩ W⊥ = {0}, so p − q = 0 and p = q.
Example 44:
Let us find the distance from the point (5, −3, 2) to the plane defined by 2x + 5y − 3z = 0. We demonstrate two approaches to determining this distance.
The first approach is based on projecting the target vector onto the normal vector of the plane, which can be read off the plane equation: n = (2, 5, −3).
Now we compute the projection of the position vector v = (5, −3, 2) onto n using formula \eqref{EqDot.12}:
\[
\mbox{proj}_n (\mathbf{v}) = \frac{\mathbf{v} \bullet \mathbf{n}}{\| \mathbf{n} \|^2} \,\mathbf{n} = \frac{11}{38} \begin{pmatrix} -2 \\ -5 \\ 3 \end{pmatrix} . \]
v = {5, -3, 2}; n = {2, 5, -3};
x = (Dot[v, n])/Norm[n]^2 *n
{-(11/19), -(55/38), 33/38}
Now, this vector is parallel to the normal vector n, so it is perpendicular to the plane. Subtracting it from the target vector v = (5, −3, 2)
gives a vector parallel to the plane, and this is the position vector of the point we seek.
\[
\mathbf{q} = \mathbf{v} - \mathbf{x} = \frac{1}{38} \begin{pmatrix} 212 \\ -59 \\ 43 \end{pmatrix} .
\]
q = v - x
{106/19, -(59/38), 43/38}
So the closest point is Q = (212, −59, 43)/38.
We were not asked for it, but note that the distance from the point v
to the plane is given by ∥x∥ = 11/√38.
Norm[x]
11/Sqrt[38]
Now we demonstrate another approach, which uses orthogonal projection onto the subspace W defined by the plane.
Recall that the orthogonal complement of W, denoted by W⊥ is
\[
W^{\perp} = \left\{ \mathbf{y} \in \mathbb{R}^3 \ : \ \mathbf{x} \bullet \mathbf{y} = 0 \quad \forall \mathbf{x} \in W \right\} ,
\]
which leads to the direct sum decomposition ℝ³ = W ⊕ W⊥. With this approach, we need to determine a basis of the subspace W. From the plane equation, we find
\[
2x = -5y + 3z.
\]
Setting y = 0, z = 2, and then y = 2, z = 0, we obtain two linearly independent vectors (their zero components sit in different positions, so neither is a multiple of the other)
\[
\mathbf{u}_1 = \begin{pmatrix} 3 \\ 0 \\ 2 \end{pmatrix} , \qquad \mathbf{u}_2 = \begin{pmatrix} -5 \\ 2 \\ 0 \end{pmatrix} .
\]
With these vectors at hand, we build the matrix
\[
\mathbf{A} = \begin{bmatrix} 3 & -5 \\ 0 & 2 \\ 2&0 \end{bmatrix} .
\]
Using formula (8) we find the projection vector
\[
\mathbf{p} = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} \mathbf{v} = \frac{1}{38}\begin{bmatrix} 34 & -10 & 6 \\ -10&13&15 \\ 6&15&29 \end{bmatrix} \begin{pmatrix} 5 \\ -3 \\ 2 \end{pmatrix} = \frac{1}{38}\begin{pmatrix} 212 \\ -59 \\ 43 \end{pmatrix} .
\]
A = {{3, -5}, {0, 2}, {2, 0}};
P = A . Inverse[Transpose[A] . A] . Transpose[A]
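Both approaches can be cross-checked with exact rational arithmetic. The following Python sketch (editorial; the tutorial itself uses Mathematica) recovers the same closest point and the distance 11/√38:

```python
# Editorial sketch: project v = (5, -3, 2) along the normal of the plane
# 2x + 5y - 3z = 0 and recover the closest point (212, -59, 43)/38.
from fractions import Fraction
import math

n = [2, 5, -3]
v = [5, -3, 2]
nn = sum(Fraction(a * a) for a in n)                  # |n|^2 = 38
t = sum(Fraction(a * b) for a, b in zip(n, v)) / nn   # (v . n)/|n|^2 = -11/38

q = [Fraction(vi) - t * ni for vi, ni in zip(v, n)]   # closest point on plane
print(q)                          # [Fraction(106, 19), Fraction(-59, 38), Fraction(43, 38)]

dist = math.sqrt(float(t * t * nn))                   # |v . n| / ||n||
print(dist, 11 / math.sqrt(38))                       # the two agree
```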
The more subtle question is whether the projection vector always exists for an arbitrary element of a finite-dimensional real space. We have seen that the answer is yes for a one-dimensional subspace, formula \eqref{EqDot.12}. The next theorem generalizes this result to any
finite-dimensional subspace, giving a simple formula for the orthogonal projection of a vector in terms of an orthogonal basis for the subspace.
Theorem 15:
Let W be a subspace of ℝn and let {u₁, u₂, … , um} be an orthogonal basis for W. Then for any vector x in ℝn,
the orthogonal projection of x
onto W
is given by the formula
\begin{equation} \label{EqDot.4}
\mathbf{x}_W = \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\,\mathbf{u}_{1} + \frac{\mathbf{x} \bullet \mathbf{u}_2}{\mathbf{u}_2 \bullet \mathbf{u}_2}\,\mathbf{u}_{2} + \cdots + \frac{\mathbf{x} \bullet \mathbf{u}_m}{\mathbf{u}_m \bullet \mathbf{u}_m}\,\mathbf{u}_{m} .
\end{equation}
If vectors e₁, e₂, … , em constitute an orthonormal basis for W, then the projection matrix can be written as
Let
\[
\mathbf{y} = \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\,\mathbf{u}_{1} + \frac{\mathbf{x} \bullet \mathbf{u}_2}{\mathbf{u}_2 \bullet \mathbf{u}_2}\,\mathbf{u}_{2} + \cdots + \frac{\mathbf{x} \bullet \mathbf{u}_m}{\mathbf{u}_m \bullet \mathbf{u}_m}\,\mathbf{u}_{m} .
\]
This vector is contained in W
because it is a linear combination of {u₁, u₂, … , um}. Hence, we just need to show that x − y is in W⊥, i.e., that uj • (x − y) = 0 for each j = 1, 2, … , m. For u₁ we have
\begin{align*}
\mathbf{u}_1 \bullet \left( \mathbf{x} - \mathbf{y} \right) &= \mathbf{u}_1 \bullet \left( \mathbf{x} - \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\,\mathbf{u}_{1} - \cdots - \frac{\mathbf{x} \bullet \mathbf{u}_m}{\mathbf{u}_m \bullet \mathbf{u}_m}\,\mathbf{u}_{m} \right)
\\
&=\mathbf{u}_1 \bullet \mathbf{x} - \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\left( \mathbf{u}_{1} \bullet \mathbf{u}_{1} \right) - 0 - \cdots
\\
&= 0 .
\end{align*}
A similar calculation shows that uj • (x − y) = 0 for each j = 1, 2, … , m. Hence, (x − y) ∈ W⊥, as desired.
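Formula \eqref{EqDot.4} can be exercised directly in code. Here is a small Python sketch (editorial; the orthogonal pair u₁, u₂ is an illustrative choice, not taken from the text) that forms the projection and checks that the residual x − x_W is orthogonal to both basis vectors:

```python
# Editorial sketch: projection onto W = span{u1, u2} via formula (4),
# using exact rational arithmetic.
from fractions import Fraction

def dot(x, y):
    return sum(Fraction(a) * b for a, b in zip(x, y))

u1 = [1, 1, 1]
u2 = [2, -3, 1]        # u1 . u2 = 0, so the basis is orthogonal
x = [3, 2, 1]

coeffs = [dot(x, u) / dot(u, u) for u in (u1, u2)]
xW = [coeffs[0] * a + coeffs[1] * b for a, b in zip(u1, u2)]
r = [xi - wi for xi, wi in zip(x, xW)]       # residual x - x_W

print(xW)                       # [15/7, 25/14, 29/14]
print(dot(r, u1), dot(r, u2))   # 0 0
```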
Example 45:
We consider two vectors in three-dimensional physical space
\[
\mathbf{a} = \left( 2,2,-1 \right) , \qquad \mathbf{b} = \left( -1, 1, 2 \right) .
\]
These vectors generate a two-dimensional flat (a plane through the origin) with normal vector n = (5, −3, 4), which we determine by taking their cross product.
a = {2, 2, -1}; b = {-1, 1, 2};
Cross[a, b]
{5, -3, 4}
We denote this two-dimensional flat by W = span(a, b). Its equation in ℝ³ is
\[
5\,x -3\,y + 4\,z = 0.
\]
We plot the plane along with the normal vector.
Using formula (6), we evaluate the projection vector
\[
\mathbf{x}_W \approx \frac{2x + 2y -z}{9} \left( 2, 2, -1 \right) + \frac{-x + y + 2z}{6} \left( -1, 1, 2 \right) .
\]
A word of caution: formula (6) presupposes an orthogonal basis, whereas a • b = −2 ≠ 0 here, so this two-term sum only approximates the true orthogonal projection onto W. For instance, for the vector x = (3, 2, 1) it gives
\[
\mathbf{a} + \frac{1}{6}\,\mathbf{b} = \frac{1}{6} \left( 11, 13, -2 \right) .
\]
{3, 2, 1} . a
9
{3, 2, 1} . b
1
a + b/6
{11/6, 13/6, -(2/3)}
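Since a • b = −2 ≠ 0, the pair {a, b} is not an orthogonal basis, so the two-term formula applies only approximately here. The following Python sketch (editorial, using exact rational arithmetic) compares the two-term sum with the exact orthogonal projection obtained from the normal vector n = (5, −3, 4):

```python
# Editorial sketch: the two-term sum vs. the exact projection onto W.
from fractions import Fraction

def dot(x, y):
    return sum(Fraction(p) * q for p, q in zip(x, y))

a, b = [2, 2, -1], [-1, 1, 2]
n = [5, -3, 4]                      # a x b, normal to W = span{a, b}
x = [3, 2, 1]

print(dot(a, b))                    # -2: the basis is NOT orthogonal

# Two-term sum from formula (4), as used in the example:
s = [dot(x, a) / dot(a, a) * p + dot(x, b) / dot(b, b) * q
     for p, q in zip(a, b)]
print(s)                            # [11/6, 13/6, -2/3]

# Exact projection: subtract the normal component of x
t = dot(x, n) / dot(n, n)
xW = [Fraction(xi) - t * ni for xi, ni in zip(x, n)]
print(xW)                           # [17/10, 139/50, -1/25], slightly different
```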
■
End of Example 45
Corollary 1:
Let W be a subspace of ℝn and let {e₁, e₂, … , em} be an orthonormal basis for W. Then for any vector x in ℝn,
the orthogonal projection of x
onto W
is given by the formula
\begin{equation} \label{EqDot.5}
\mathbf{x}_W = \left( \mathbf{x} \bullet \mathbf{e}_1 \right) \mathbf{e}_1 + \left( \mathbf{x} \bullet \mathbf{e}_2 \right) \mathbf{e}_2 + \cdots + \left( \mathbf{x} \bullet \mathbf{e}_m \right) \mathbf{e}_m .
\end{equation}
This formula is both simple and powerful because it guarantees the existence of a unique projection vector in W, and it minimizes the Euclidean distance between the target vector and any point in W.
Corollary 2:
Let p be the projection of the target vector y ∈ ℝn on a subspace W. For any x in W with x ≠ p, the strict inequality ∥y − p∥ < ∥y − x∥ holds.
Suppose x ∈ W. Then p − x ∈ W and y − p ∈ W⊥, so by the Pythagorean theorem,
\[
\| \left( \mathbf{y} - \mathbf{p} \right) \|^2 + \| \left( \mathbf{p} - \mathbf{x} \right) \|^2 = \| \left( \mathbf{y} - \mathbf{x} \right) \|^2 .
\]
Hence ∥y − p∥ ≤ ∥y − x∥ and equality holds only when x = p.
Earlier we said that an equivalent way to define the orthogonal projection of
target vector y on a subspace W is to find the point in W which is closest to y (provided such a
point exists). Corollary 2 establishes this equivalence for a finite-dimensional case.
Now we transfer formula \eqref{EqDot.4} into another form using matrix language.
Suppose we have the n-dimensional vector space ℝn and a set of m linearly independent vectors u₁, u₂, … , um ∈ ℝn. These vectors generate a subspace W ⊂ ℝn. We want to find the combination of these vectors that is closest to some target vector v. In other words, we need to find the projection of v onto the subspace spanned by {u₁, u₂, … , um}.
We organize generating vectors {u₁, u₂, … , um} as columns into an n × m matrix called A:
Since {u₁, u₂, … , um} is a basis of W, its span coincides with the column space of the matrix A.
We are looking for a vector w ∈ W, that is, a linear combination w = Ac of the generating vectors with a coefficient vector c ∈ ℝm, that is closest to the target vector v. The projection vector will then be vW = Ac. We make the error vector v⊥ = v − vW orthogonal to the subspace W onto which the vector is projected. This means that v⊥ must be orthogonal to every member of the generating set {u₁, u₂, … , um}. Equating the dot product with every generating vector to zero, we get the system of normal equations
\[
\mathbf{A}^{\mathrm T} \mathbf{A}\,\mathbf{c} = \mathbf{A}^{\mathrm T} \mathbf{v} ,
\]
which determines c uniquely because the matrix ATA is invertible, A having m linearly independent columns. Therefore, the projection vector on the subspace W is expressed through the matrix A of column generators as follows:
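The normal equations are easy to exercise in code. Below is a minimal Python sketch (editorial; the column data u₁, u₂, v are illustrative and not taken from the text) that solves the 2 × 2 system by Cramer's rule and checks that the residual is orthogonal to W:

```python
# Editorial sketch: solve A^T A c = A^T v for a 2-column A, exactly.
from fractions import Fraction

def dot(x, y):
    return sum(Fraction(p) * q for p, q in zip(x, y))

u1 = [1, 0, 2, 1]
u2 = [0, 1, 1, -1]                       # columns of A (illustrative data)
v = [1, 2, 3, 4]

# Assemble A^T A (the Gram matrix) and A^T v
g11, g12, g22 = dot(u1, u1), dot(u1, u2), dot(u2, u2)
b1, b2 = dot(u1, v), dot(u2, v)

det = g11 * g22 - g12 * g12              # nonzero: columns are independent
c1 = (b1 * g22 - g12 * b2) / det
c2 = (g11 * b2 - g12 * b1) / det

vW = [c1 * p + c2 * q for p, q in zip(u1, u2)]
r = [vi - wi for vi, wi in zip(v, vW)]
print(dot(r, u1), dot(r, u2))            # 0 0: residual is orthogonal to W
```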
Theorem 16:
Let A be an n × m
matrix with linearly independent columns and let
W = ColumnSpace (A) be the column space of matrix A.
Then the m × m symmetric matrix ATA
is invertible, and for any column vector x in ℝn×1, its projection on subspace W is expressed by formula \eqref{EqDot.8}, where P ∈ ℝn×n.
We first show that the null space of ATA consists of a single element, the zero vector; this yields the invertibility of the symmetric matrix ATA. Suppose that ATAc = 0. Multiplying on the left by cT, we get cTATAc = ∥Ac∥² = 0, so Ac = 0; that is, c belongs to NullSpace(A). Since the columns of the matrix A are linearly independent, we conclude that c = 0. Hence, NullSpace(ATA) = {0}, as desired.
Let x be a column vector in ℝn×1 and let c be the solution of ATAc = ATx. Then, applying the inverse of the symmetric matrix ATA, we determine the projection vector according to formula \eqref{EqDot.8}.
Example 46:
In four-dimensional space, we consider two noncollinear vectors
\[
\mathbf{u}_1 = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} , \qquad \mathbf{u}_2 = \begin{pmatrix} 4 \\ 3 \\ 2 \\ 1 \end{pmatrix} .
\]
Let W = span{u₁, u₂} be the two-dimensional subspace generated by these two vectors. We want to find the projection of the vector v = (1, 1, 1, 1) on this subspace. First, we build the matrix whose columns are these vectors:
\[
\mathbf{A} = \begin{bmatrix} 1 & 4 \\ 2 & 3 \\ 3 & 2 \\ 4 & 1 \end{bmatrix} .
\]
The corresponding projection matrix is given by formula (8):
\[
\mathbf{P} = \mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{10} \begin{bmatrix} 7& 4& 1& -2 \\ 4& 3& 2& 1 \\ 1& 2& 3& 4 \\ -2& 1& 4& 7 \end{bmatrix} .
\]
A = {{1, 4}, {2, 3}, {3, 2}, {4, 1}};
P = A . Inverse[Transpose[A] . A] . Transpose[A]
In particular, to find the projection of an arbitrary vector on W, one just multiplies it (from the left) by the matrix P. For the target vector v = (1, 1, 1, 1), we have
\[
\mathbf{v}_W = \mathbf{P}\,\mathbf{v} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \mathbf{v} \in W .
\]
P . {1, 1, 1, 1}
{1, 1, 1, 1}
Since vector v = (1, 1, 1, 1) belongs to subspace W, its projection coincides with v.
However, if we choose another vector (not in W), we get
\[
\mathbf{y}_W = \mathbf{P}\,\begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} .
\]
This means that vector y = (1, −1, −1, 1) is orthogonal to W.
Example A: Verification of Eq.(9) in ℝ³
We consider a two-dimensional subspace W ⊂ ℝ³ spanned by the two orthonormal vectors
\[
\mathbf{e}_1 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} , \qquad \mathbf{e}_2 = \frac{1}{\sqrt{14}} \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix} .
\]
We build the 3 × 2 matrix with these vectors as columns:
\[
\mathbf{Q} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 \end{bmatrix} .
\]
Then we form a projection matrix according to Eq.(9):
\[
\mathbf{P} = \mathbf{Q}\,\mathbf{Q}^{\mathrm T} = \frac{1}{42} \begin{bmatrix} 26 & -4 & 20 \\ -4& 41& 5 \\ 20& 5& 17 \end{bmatrix} .
\]
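A quick floating-point check of this matrix (a Python sketch, editorial; the exact entries are those displayed above):

```python
# Editorial sketch: verify P = Q Q^T for the orthonormal pair e1, e2.
import math

e1 = [x / math.sqrt(3) for x in (1, 1, 1)]
e2 = [x / math.sqrt(14) for x in (2, -3, 1)]

# Since Q has columns e1, e2, we have P[i][j] = e1[i]*e1[j] + e2[i]*e2[j].
P = [[e1[i] * e1[j] + e2[i] * e2[j] for j in range(3)] for i in range(3)]

expected = [[26, -4, 20], [-4, 41, 5], [20, 5, 17]]   # times 1/42
ok = all(abs(P[i][j] - expected[i][j] / 42) < 1e-12
         for i in range(3) for j in range(3))
print(ok)   # True
```

Note that the trace of P equals 2, the dimension of W, as expected for a rank-2 projector.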
Theorem 17:
Let W be a nontrivial subspace of ℝn, and define a linear transformation T : ℝn ⇾ ℝn by T(x) = xW, the orthogonal projection of x onto W. Then
1. T is a linear transformation.
2. T(x) = x if and only if x is in W.
3. T(x) = 0 if and only if x is in W⊥.
4. Composition T ∘ T = T.
5. The image of T is W.
We have to verify two defining properties of linear transformation. Let x, y be arbitrary vectors from ℝn, and let x = xW + x⊥ and y = yW + y⊥ be their orthogonal decompositions. Since W and its orthogonal complement W⊥ are both subspaces of ℝn, the sums xW + yW and x⊥ + y⊥ are in W and W⊥, respectively. Therefore, the orthogonal decomposition of x + y is (xW + yW) + (x⊥ + y⊥). Hence,
\[
T \left( \mathbf{x} + \mathbf{y} \right) = \left( \mathbf{x} + \mathbf{y} \right)_W = \mathbf{x}_W + \mathbf{y}_W = T(\mathbf{x}) + T(\mathbf{y}) .
\]
Now let λ be a scalar. Then λxW is in W and λx⊥ is in W⊥, so the orthogonal decomposition of λx is λxW + λx⊥, and therefore,
\[
T \left( \lambda\mathbf{x} \right) = \left( \lambda\mathbf{x} \right)_W = \lambda \mathbf{x}_W = \lambda\,T(\mathbf{x}) .
\]
Since these two defining properties of a linear transformation are satisfied by the projection mapping T, we conclude that T is a linear transformation.
For every vector x ∈ ℝn, there is a unique decomposition
\[
\mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = \mathbf{x}_W + \mathbf{x}_W^{\perp} ,
\]
where
xW ∈ W,
x⊥W ∈ W⊥ (the orthogonal complement of W),
and by definition of orthogonal projection,
T(x) = xW.
We now prove the required identity in both directions.
(⇒ ) If T(x) = x, then x ∈ W.
Assume T(x) = x. By definition of T,
\[
T(\mathbf{x}) = \mathbf{x}_W .
\]
So x = T(x) = xW.
But xW ∈ W. Therefore, x is in W.
(⇐ ) If x is in W, then T(x) = x.
Assume x ∈ W. In the decomposition
\[
\mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }},
\]
the component in W must be x itself, and the orthogonal component must be 0. So
\[
\mathbf{x}_W = \mathbf{x},\quad \mathbf{x}_{W^{\perp }} = 0.
\]
Hence,
T(x) = xW = x.
Since both implications hold, we conclude
T(x) = x ⇔ x ∈ W.
Similar to the previous part, the proof is based on unique decomposition
\[
\mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = \mathbf{x}_W + \mathbf{x}_W^{\perp} .
\]
(⇒ ) If T(x) = 0, then x is in W⊥.
Assume T(x) = 0. Then
\[
\mathbf{x}_W = T(\mathbf{x}) =0.
\]
Using the decomposition x = xW + x⊥W, we get
\[
\mathbf{x} = 0 + \mathbf{x}_{W^{\perp }} = \mathbf{x}_{W^{\perp }}.
\]
But x⊥W is in W⊥, so x ∈ W⊥.
(⇐ ) If x is in W⊥, then T(x) = 0.
Assume x ∈ W⊥. In the decomposition,
x = xW + x⊥W,
the component in W⊥ must be x itself, and the component in W must be 0. Thus,
\[
\mathbf{x}_W = 0,\quad \mathbf{x}_{W^{\perp }} = \mathbf{x} ,
\]
and therefore T(x) = xW = 0. Since both implications hold, we conclude that T(x) = 0 ⇔ x ∈ W⊥.
For any x in ℝn, the vector T(x) is in W, so T ∘ T(x) = T(T(x)) = T(x) by property 2.
Any vector x in W is in the image of T because T(x) = x for such vectors, so W is contained in the image of T. On the other hand, for any vector x in ℝn, the output T(x) = xW is in W, so the image of T is contained in W. Hence, the image of T is exactly W.
Example 47:
Using Mathematica, we generate two three-dimensional vectors
u1 = RandomInteger[{-9, 9}, 3]
{-6, 1, 9}
u2 = RandomInteger[{-9, 9}, 3]
{-7, 1, 5}
Then we build the 3×2 matrix having these as column vectors:
A = Transpose[{u1, u2}]
\[
\mathbf{A} = \begin{bmatrix} -6&-7 \\ \phantom{-}1&\phantom{-}1 \\ \phantom{-}9&\phantom{-}5 \end{bmatrix} .
\]
Using this matrix, we form the projection matrix on the space W = span{u₁, u₂} based on formula (8):
\[
\mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{1106} \begin{bmatrix} 1090 & -132 & 4 \\
-132 & 17& 33 \\ 4 & 33 & 1105 \end{bmatrix} .
\]
It is well known that any linear transformation is equivalent to matrix multiplication and vice versa. Therefore, the matrix P corresponds to a linear transformation, namely the orthogonal projection onto W.
Let W be a two-dimensional subspace of ℝ³ that is spanned by the two vectors u = [-9, 8, 5] and v = [0, -3, 0]:
u = RandomInteger[{-9, 9}, 3]
{-9, 8, 5}
v = RandomInteger[{-9, 9}, 3]
{0, -3, 0}
We organize these vectors in matrix form:
\[
\mathbf{A} = \begin{bmatrix} -9&0 \\ 8&-3 \\ 5&0 \end{bmatrix} .
\]
The projection transformation on the subspace W is equivalent to multiplication by the matrix
\[
\mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{106} \begin{bmatrix} 81 & 0 & -45 \\
0 & 106& 0 \\ -45 & 0 & 25 \end{bmatrix} .
\]
Let us take x = u + 2 v = [-9, 2, 5] ∈ W. Then its projection on W is
\[
{\bf x}_W = {\bf P}\,\mathbf{x} = \mathbf{x} .
\]
P . {-9, 2, 5}
{-9, 2, 5}
Here we again use W = span{u, v} with u = [-9, 8, 5] and v = [0, -3, 0]. The cross product of u and v is orthogonal to W:
\[
\mathbf{x} = \mathbf{u} \times \mathbf{v} = \begin{bmatrix} 15&0&27 \end{bmatrix} .
\]
The composition of two linear transformations corresponds to the product of their matrices, so we need to check that P P = P. Taking the projection matrix found above, the product is
\[
{\bf P}\,{\bf P} = \frac{1}{106} \begin{bmatrix} 81 & 0 & -45 \\
0 & 106& 0 \\ -45 & 0 & 25 \end{bmatrix} = {\bf P}.
\]
Properties of projection mappings can be reformulated in matrix language:
Theorem 18:
Let W be a subspace of column vectors ℝn×1, define T : ℝn×1 ⇾ ℝn×1 by T(x) = xW, and let P = PW be the standard matrix for T, so xW = Px. Then
ColumnSpace(P) = W.
NullSpace(P) = ker(P) = W⊥.
P² = P.
If W ≠ {0}, then λ = 1 is an eigenvalue of P and the 1-eigenspace for P is W.
If W ≠ ℝn×1, then zero is an eigenvalue of P and 0-eigenspace for P is W⊥.
Projection matrix P is similar to the diagonal matrix with m ones and n − m zeroes on the diagonal, where m = dim(W).
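Several of these properties can be checked numerically. The following Python sketch (editorial; the tutorial's sessions use Mathematica) takes the 3 × 3 projection matrix found in Example 47 and verifies idempotency together with the two eigenspace statements:

```python
# Editorial sketch: properties 3-5 of Theorem 18 for the projector
# P = (1/106) [[81, 0, -45], [0, 106, 0], [-45, 0, 25]] from Example 47.
from fractions import Fraction

P = [[Fraction(81, 106), 0, Fraction(-45, 106)],
     [0, 1, 0],
     [Fraction(-45, 106), 0, Fraction(25, 106)]]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

w = [-9, 2, 5]            # lies in W = ColumnSpace(P)
z = [15, 0, 27]           # lies in W-perp (the cross product u x v)

print(matmul(P, P) == P)          # True: P^2 = P
print(matvec(P, w) == w)          # True: eigenvalue 1 on W
print(matvec(P, z) == [0, 0, 0])  # True: eigenvalue 0 on W-perp
```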
The properties of projection matrices would be very hard to prove in matrix language alone. By translating all the statements into assertions about linear transformations, they become much more transparent.
We show both inclusions: ColumnSpace(P) ⊆ W and W ⊆ ColumnSpace(P).
A. ColumnSpace(P) ⊆ W.
By definition, the column space of P is
ColumnSpace(P) = { P x : x ∈ ℝn×1 }.
However, P x = T(x) = xW, and by definition of orthogonal projection, xW ∈ W for every x. Therefore, every vector of the form P x lies in W, so ColumnSpace(P) ⊆ W.
B. W ⊆ ColumnSpace(P).
Let w ∈ W. Since T is the orthogonal projection onto W, we know from the previous theorem that
T(w) = w.
In matrix form, this is
\[
\mathbf{P}\,\mathbf{w} = \mathbf{w}.
\]
However, this means that w is equal to P times some vector (namely w itself), so w ∈ ColumnSpace(P). Hence,
W ⊆ ColumnSpace(P).
Combining the two inclusions, we obtain W = ColumnSpace(P).
We start with relating the matrix kernel to the transformation kernel.
Since P is the standard matrix of T, we have
\[
T(\mathbf{x}) = {\bf P}\,\mathbf{x} \quad \mathrm{for\ all\ }\mathbf{x} \in \mathbb{R}^{n\times 1} .
\]
Thus,
\[
{\bf P}\,\mathbf{x} = 0\quad \Longleftrightarrow \quad T(\mathbf{x})=0.
\]
Therefore,
\[
\ker (\mathbf{P}) = \ker (T).
\]
So it suffices to show
\[
\ker (T)=W^{\perp }.
\]
2) Use the orthogonal decomposition:
Every vector x ∈ ℝn has a unique decomposition
\[
\mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = \mathbf{x}_W + \mathbf{x}_W^{\perp} ,
\]
where
xW is in W,
x⊥W ∈ W⊥ (the orthogonal complement of W).
T(x) = xW.
3) Show ker(T) ⊆ W⊥.
If x ∈ ker(T), then T(x) = xW = 0.
Using decomposition x = xW + x⊥W, we get
\[
\mathbf{x} = \mathbf{0} + \mathbf{x}_W^{\perp} = \mathbf{x}_W^{\perp} .
\]
Thus, x ∈ W⊥, and ker(T) ⊆ W⊥.
4) Show W⊥ ⊆ ker(T).
Let x ∈ W⊥. Then in the decomposition
x = xW + x⊥W,
we must have
\[
\mathbf{x}_{W^{\perp }} = \mathbf{x},\qquad \mathbf{x}_W = 0.
\]
Thus, T(x) = xW = 0,
so x ∈ ker(T).
Hence,
W⊥ ⊆ ker(T).
5) Now we combine the results.
We have shown that ker(T) = W⊥.
And since ker(P) = ker(T), we conclude that
NullSpace(P) = ker(P) = W⊥.
Step 1: Use the geometric meaning of projection.
Each x from ℝn×1 (or ℝn) can be written as
x = xW + x⊥W,
where xW ∈ W is the orthogonal projection of x onto W, and x⊥W ∈ W⊥ is the component orthogonal to W.
By definition of T,
T(x) = xW.
Now apply T again to T(x):
\[
T(T(\mathbf{x}))=T(\mathbf{x}_W).
\]
However, xW already lies in W. The orthogonal projection of a vector in W onto W is just itself. So
\[
T(\mathbf{x}_W) = \mathbf{x}_W.
\]
Therefore, for every x ∈ ℝn ≌ ℝn×1,
\[
T(T(\mathbf{x}))=T(\mathbf{x}).
\]
Step 2: Translate to matrices:
In terms of the standard matrix P, we have
\[
T(\mathbf{x})= {\bf P}\,\mathbf{x},\quad T(T(\mathbf{x})) = \mathbf{P}({\bf P}\,\mathbf{x}) = \mathbf{P}^2 \mathbf{x}.
\]
From the identity T(T(x)) = T(x) for all x ∈ ℝn×1, we get
\[
\mathbf{P}^2 \mathbf{x} = \mathbf{P}\,\mathbf{x}\quad \mathrm{for\ all\ }{\bf x}\in \mathbb{R}^{n}.
\]
Since this identity holds for every x, the two matrices must coincide:
\[
\mathbf{P}^2 = \mathbf{P}.
\]
So the projection matrix P is idempotent.
1) Show that 1 is an eigenvalue of P:
Take any nonzero vector w from W. Since w already lies in W, its orthogonal projection onto W is itself:
\[
T(\mathbf{w})=\mathbf{w}.
\]
In matrix form,
P w = w.
Rewriting it as
\[
\mathbf{P}\,\mathbf{w} = 1\cdot \mathbf{w}.
\]
So w is an eigenvector of P with eigenvalue λ = 1. Because W ≠ {0}, such a nonzero w exists, so λ = 1 is indeed an eigenvalue of P.
2) Show that the 1-eigenspace of P is exactly W.
(a) Every vector in W is a 1-eigenvector.
Let w ∈ W. As above,
\[
\mathbf{P}\,\mathbf{w}=\mathbf{w}.
\]
So every vector in W satisfies P w = 1·w. Thus,
\[
W\subseteq E_1(\mathbf{P}),
\]
where E₁(P) denotes the 1-eigenspace of P.
(b) Every 1-eigenvector lies in W.
Indeed, let x ∈ ℝn×1 satisfy
P x = x.
By definition of P, P x = xW, the orthogonal projection of x onto W. So
\[
\mathbf{x} = \mathbf{P}\,\mathbf{x} = \mathbf{x}_W.
\]
However, xW ∈ W. Hence, x ∈ W.
Therefore, every vector x ∈ ℝn×1 with P x = x lies in W, so
\[
E_1(\mathbf{P})\subseteq W.
\]
3) Conclusion:
Combining (a) and (b), we have
\[
E_1(\mathbf{P})=W,
\]
and since W ≠ { 0 }, this shows that λ = 1 is an eigenvalue of P and its eigenspace is exactly W.
1) Zero is an eigenvalue of P
Because W is a proper subspace, W⊥ ≠ {0}. Take any nonzero vector v ∈ W⊥.
The orthogonal projection of v onto W is the zero vector:
\[
T(\mathbf{v}) = \mathbf{v}_W = 0.
\]
In matrix form,
P v = 0.
So
\[
\mathbf{P}\,\mathbf{v} = 0\cdot \mathbf{v},
\]
which means v is an eigenvector of P with eigenvalue λ = 0. Since v ≠ 0, λ = 0 is indeed an eigenvalue.
2) The 0-eigenspace is W⊥.
Let E₀(P) denote the 0-eigenspace of P, i.e.,
\[
E_0 (\mathbf{P}) = \left\{ \mathbf{x}\in \mathbb{R}^{n\times 1} \ : \ \mathbf{P}\,\mathbf{x} = 0 \right\} .
\]
(a) W⊥ ⊆ E₀(P).
If x ∈ W⊥, then as above its projection onto W is zero:
\[
\mathbf{P}\,\mathbf{x} = \mathbf{x}_W = 0.
\]
So x ∈ E₀(P). Hence,
W⊥ ⊆ E₀(P).
(b) E₀(P) ⊆ W⊥.
Now let x be in E₀(P), so P x = 0. But P x = xW, the projection of x onto W. Thus, xW = 0.
Write the orthogonal decomposition
\[
\mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = 0 + \mathbf{x}_{W^{\perp }} = \mathbf{x}_{W^{\perp }} .
\]
So x ∈ W⊥. Therefore,
\[
E_0(\mathbf{P}) \subseteq W^{\perp }.
\]
3) Conclusion:
Combining (a) and (b), we get
\[
E_0(\mathbf{P})=W^{\perp }.
\]
Since W ≠ ℝn ≌ ℝn×1, we have W⊥ ≠ {0}, so the 0-eigenspace is nontrivial and λ = 0 is an eigenvalue of P, with eigenspace exactly W⊥.
We’ve already uncovered the key structure:
W is the 1-eigenspace of P, with dim(W) = m.
W⊥ is the 0-eigenspace of P, with dimW⊥ = n − m.
ℝn ≌ ℝn×1 = W ⊕ W⊥.
Now we turn that into a similarity with a diagonal matrix.
1) Build a basis adapted to W and W⊥.
Choose a basis of W:
\[
\{ \mathbf{w}_1, \dots , \mathbf{w}_m \} .
\]
Choose a basis of W⊥:
\[
\{ \mathbf{u}_1,\dots , \mathbf{u}_{n-m}\} .
\]
Then
\[
\mathcal{B}=\left\{ \mathbf{w}_1, \dots , \mathbf{w}_m , \mathbf{u}_1,\dots ,\mathbf{u}_{n-m} \right\}
\]
is a basis of ℝn×1 ≌ ℝn.
2) Describe P in this basis:
For each wi ∈ W, projection onto W fixes it:
\[
\mathbf{P}\,\mathbf{w}_i = \mathbf{w}_i .
\]
For each uj ∈ W⊥, projection onto W kills it:
\[
\mathbf{P}\,\mathbf{u}_j = 0.
\]
So, relative to the basis ℬ, the matrix of P acts as follows:
On the first m basis vectors: eigenvalue 1.
On the last n − m basis vectors: eigenvalue 0.
Thus, in the basis ℬ, the matrix of P is
\[
\Lambda = \left( \begin{matrix}I_m&0\\ 0&0_{(n-m)\times (n-m)}\end{matrix}\right) ,
\]
a diagonal matrix with m ones and n − m zeroes on the diagonal.
3) Express this as a similarity:
Let S be the change-of-basis matrix whose columns are the vectors of ℬ written in the standard basis:
\[
\mathbf{S} = \left[ \begin{matrix}\mathbf{w}_1&\cdots &\mathbf{w}_m&\mathbf{u}_1&\cdots &\mathbf{u}_{n-m}\end{matrix}\right] .
\]
Then the matrix of P in the standard basis is related to its matrix Λ in the basis ℬ by
\[
\mathbf{S}^{-1} \mathbf{P}\,\mathbf{S} = \Lambda .
\]
So the projection matrix P is similar to the diagonal matrix with m ones and n − m zeroes on the diagonal, where m = dimW.
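The similarity statement can be checked without computing S⁻¹, because S⁻¹PS = Λ is equivalent to PS = SΛ. A Python sketch with an assumed small example (W = span{(1, 1, 1)} ⊂ ℝ³, basis of W⊥ chosen as (1, −1, 0) and (1, 0, −1)):

```python
from fractions import Fraction as F

# Assumed example: P = (1/3)*(all-ones) projects onto W = span{(1,1,1)}.
# Columns of S: a basis of W followed by a basis of W⊥.
P = [[F(1, 3)] * 3 for _ in range(3)]
S = [[1,  1,  1],
     [1, -1,  0],
     [1,  0, -1]]                       # columns: w, u1, u2
Lam = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]  # diag(1, 0, 0)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# S^{-1} P S = Λ  is equivalent to  P S = S Λ  (no inverse needed)
assert matmul(P, S) == matmul(S, Lam)
```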
Example 48:
In this example, we take W to be the subspace spanned by two randomly generated integer vectors.
u = RandomInteger[9, 4]
{4, 4, 2, 2}
v = RandomInteger[9, 4]
{5, 2, 0, 8}
We organize these vectors in matrix form
\[
\mathbf{A} = \begin{bmatrix} 4&5 \\ 4&2 \\ 2&0 \\ 2&8 \end{bmatrix} .
\]
A = Transpose[{u, v}]
{{4, 5}, {4, 2}, {2, 0}, {2, 8}}
The projection matrix onto W = span{u, v} is defined by Eq.(8), and we have
\[
\mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{446} \begin{bmatrix} 182 & 164 &76 & 124 \\
164 & 236& 142&-50 \\ 76 & 142 & 93 & -83 \\
124 & -50 & -83 & 381\end{bmatrix} .
\tag{9.1}
\]
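Eq.(9.1) can be reproduced independently. The following Python sketch computes A(AᵀA)⁻¹Aᵀ with exact rational arithmetic (only the 2×2 Gram matrix needs inverting) and also confirms that P is idempotent:

```python
from fractions import Fraction as F

# Verify Eq.(9.1): P = A (A^T A)^{-1} A^T for the columns u, v above.
u = [4, 4, 2, 2]
v = [5, 2, 0, 8]
A = [[u[i], v[i]] for i in range(4)]                 # 4×2

def dot(a, b): return sum(x * y for x, y in zip(a, b))

G = [[dot(u, u), dot(u, v)], [dot(u, v), dot(v, v)]]  # A^T A = [[40,44],[44,93]]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]           # 1784 = 4·446
Ginv = [[F(G[1][1], det), F(-G[0][1], det)],
        [F(-G[1][0], det), F(G[0][0], det)]]

# P = A · Ginv · A^T, entrywise
P = [[sum(A[i][r] * Ginv[r][s] * A[j][s] for r in range(2) for s in range(2))
      for j in range(4)] for i in range(4)]

M = [[182, 164, 76, 124], [164, 236, 142, -50],
     [76, 142, 93, -83], [124, -50, -83, 381]]
assert P == [[F(M[i][j], 446) for j in range(4)] for i in range(4)]

# projection matrices are idempotent: P² = P
P2 = [[sum(P[i][k] * P[k][j] for k in range(4)) for j in range(4)] for i in range(4)]
assert P2 == P
```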
Hence,
\[
W = \mbox{span} \left\{ \mathbf{u}_1 = \begin{pmatrix} 182 \\ 164 \\ 76 \\ 124 \end{pmatrix} , \ \mathbf{u}_2 = \begin{pmatrix} 164 \\ 236 \\ 142 \\ -50 \end{pmatrix} \right\} .
\]
We need to show that
\[
W = \mbox{span} \left\{ \mathbf{u} = \begin{pmatrix} 4 \\ 4 \\ 2 \\ 2 \end{pmatrix} , \ \mathbf{v} = \begin{pmatrix} 5 \\ 2 \\ 0 \\ 8 \end{pmatrix} \right\} .
\]
This means that there exist constants c₁, c₂, c₃, and c₄ such that
\[
\begin{split}
c_1 \mathbf{u} + c_2 \mathbf{v} &= \mathbf{u}_1 ,
\\
c_3 \mathbf{u} + c_4 \mathbf{v} &= \mathbf{u}_2 .
\end{split}
\]
We ask Mathematica to solve the corresponding system of linear equations.
To compute the null space of P, we need to solve the following system of linear equations:
\[
\mathbf{P}\,\mathbf{x} = \mathbf{0} \qquad \iff \qquad \frac{1}{3} \begin{bmatrix} 2 & 1 & -1 \\ 1&2&1 \\ -1 &1 & 2 \end{bmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} .
\]
Fortunately, Mathematica has a dedicated command to determine the null space:
NullSpace[P]
{{1, -1, 1}}
Therefore,
\[
\mbox{NullSpace}[\mathbf{P}] = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} \right\} .
\]
Now we compute W⊥. This space consists of all vectors x = [x, y, z] that are orthogonal to w₁ and w₂:
\[
\mathbf{w}_1 \bullet \mathbf{x} = 0 \quad \& \quad \mathbf{w}_2 \bullet \mathbf{x} = 0 .
\]
So we need to solve the homogeneous system of equations
\[
\begin{split}
x + y &= 0 , \\
y + z & = 0 .
\end{split}
\]
So y = −x = −z. Hence, W⊥ is spanned by a single vector, [1, −1, 1], which is exactly the null space of the matrix P.
Let
\[
W = \mbox{span}\left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} , \quad \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \right\} .
\]
Using formula (9), we find the standard matrix of the orthogonal projection onto W to be
\[
\mathbf{P} = \frac{1}{6} \begin{bmatrix} 5 & 2 & -1 \\ 2&2&2 \\ -1&2&5 \end{bmatrix} .
\]
Since the cross product e₁ × e₂ of the (normalized) spanning vectors is orthogonal to both of them, the orthogonal complement W⊥ is spanned by the vector
\[
W^{\perp} = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} .
\]
Cross[e1, e2]
{-(1/Sqrt[6]), Sqrt[2/3], -(1/Sqrt[6])}
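All the claims about this small example are easy to confirm numerically. A short Python check with exact rationals:

```python
from fractions import Fraction as F

# P = (1/6)[[5,2,-1],[2,2,2],[-1,2,5]] from the text: it should fix
# the two spanning vectors of W and annihilate (1,-2,1) ∈ W⊥.
P = [[F(5, 6), F(2, 6), F(-1, 6)],
     [F(2, 6), F(2, 6), F(2, 6)],
     [F(-1, 6), F(2, 6), F(5, 6)]]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]

assert matvec(P, [1, 1, 1]) == [1, 1, 1]      # first spanning vector fixed
assert matvec(P, [1, 0, -1]) == [1, 0, -1]    # second spanning vector fixed
assert matvec(P, [1, -2, 1]) == [0, 0, 0]     # W⊥ is annihilated
```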
Returning to the matrix P from Eq.(9.1), we use Mathematica to determine its eigenvalues and eigenvectors:
So we see that the projection matrix P has two eigenvalues, λ = 1 and λ = 0, each of multiplicity 2 (P is not defective). The eigenvectors corresponding to λ = 1 are
\[
\mathbf{v}_1 = \begin{pmatrix} 5 \\ 2 \\ 0 \\ 8 \end{pmatrix} , \qquad \mathbf{v}_2 = \begin{pmatrix} 11 \\ 14 \\ 8 \\ 0 \end{pmatrix}
\]
We check that these vectors v₁ and v₂ are eigenvectors of matrix P:
v1 = {5, 2, 0, 8};
P . v1
{5, 2, 0, 8}
v2 = {11, 14, 8, 0};
P . v2
{11, 14, 8, 0}
We need to show that the vectors {v₁, v₂} span the subspace W, i.e., that u = [4, 4, 2, 2] and v = [5, 2, 0, 8] can be expressed as linear combinations of {v₁, v₂}. Since v = v₁, it suffices to show that u is a linear combination of {v₁, v₂}, i.e., that the system of algebraic equations
\[
c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 = \mathbf{u}
\]
has a solution. We delegate this job to Mathematica:
Since c₁ = ¼ and c₂ = ¼ solve this system, the eigenvectors {v₁, v₂} indeed span the subspace W.
From the previous example, we know that the projection matrix (9.1) has the double eigenvalue λ = 0, to which correspond two eigenvectors
\[
\mathbf{u}_1 = \begin{pmatrix} -14 \\ 11 \\ 0 \\ 6 \end{pmatrix} , \qquad \mathbf{u}_2 = \begin{pmatrix} 2 \\ -5 \\ 6 \\ 0 \end{pmatrix} .
\]
We check with Mathematica that these vectors belong to W⊥:
u1 = {-14, 11, 0, 6};
P . u1
{0, 0, 0, 0}
u2 = {2, -5, 6, 0};
P . u2
{0, 0, 0, 0}
Since eigenvectors {u₁, u₂} are linearly independent, they span W⊥.
Using projection matrix (9.1), we build the matrix from its eigenvectors:
\[
\mathbf{S} = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix} = \begin{bmatrix} 5& 11& -14&2 \\
2&14& 11 & -5 \\
0&8&0&6 \\
8&0&6&0 \end{bmatrix} .
\]
Now we calculate the product:
S = {{5, 11, -14, 2}, {2, 14, 11, -5}, {0, 8, 0, 6}, {8, 0, 6, 0}};
Inverse[S] . P . S
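The Mathematica output is not reproduced above, but one can confirm that Inverse[S].P.S equals diag(1, 1, 0, 0) without inverting S, since S⁻¹PS = Λ is equivalent to PS = SΛ. A Python sketch with exact rationals:

```python
from fractions import Fraction as F

# P from Eq.(9.1), S from the text; Λ = diag(1, 1, 0, 0).
M = [[182, 164, 76, 124], [164, 236, 142, -50],
     [76, 142, 93, -83], [124, -50, -83, 381]]
P = [[F(M[i][j], 446) for j in range(4)] for i in range(4)]
S = [[5, 11, -14, 2], [2, 14, 11, -5], [0, 8, 0, 6], [8, 0, 6, 0]]
Lam = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

assert matmul(P, S) == matmul(S, Lam)   # hence S⁻¹·P·S = Λ
```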
where xW is the orthogonal projection of x onto W.
In other words, to find refW(x),
one starts at x, then moves to x − x⊥ = xW,
then continues in the same direction one more time, to end on the opposite side of W.
If W has an orthonormal basis {e₁, e₂, … , em} collected into the n × m matrix Q = [e₁ e₂ ⋯ em], then the projection and reflection matrices become
\[
\mathbf{P} = \mathbf{Q}\,\mathbf{Q}^{\mathrm T} , \qquad \mathbf{R} = 2\,\mathbf{Q}\,\mathbf{Q}^{\mathrm T} - \mathbf{I} ,
\]
respectively. If an n × m matrix Q has linearly independent (not necessarily orthonormal) columns, then the projection matrix P = PW onto the subspace W = ColumnSpace(Q) is defined by Eq.\eqref{EqDot.8}, and we get the reflection matrix
\[
\mathbf{R} = 2\,\mathbf{Q} \left( \mathbf{Q}^{\mathrm T} \mathbf{Q} \right)^{-1} \mathbf{Q}^{\mathrm T} - \mathbf{I} .
\]
We formulate the main properties of the reflection matrix without proofs; they are left as exercises.
Properties of reflection matrix:
Let R be a reflection matrix across subspace W ⊂ ℝn. Then the following properties hold.
Reflection matrices are involutions: R² = I.
Reflections are orthogonal transformations: RTR = I.
They preserve Euclidean lengths: ∥Rx∥ = ∥x∥.
They flip exactly the orthogonal complement of the subspace: Rx = −x for every x ∈ W⊥.
Example 49:
Since reflections were treated succinctly, we present several examples that clarify this topic.
Example A: Reflection across a line in ℝ²
Let U ⊂ ℝ² be the line spanned by a unit vector \( \displaystyle \quad {\bf u} = \frac{1}{\sqrt{2}} \left[ 1, 1 \right] . \quad \) Then reflection across this line becomes
\[
R_U ({\bf x}) = 2 \left( \frac{x_1 + x_2}{2} \right) \left[ 1, 1 \right] - \left[ x_1 , x_2 \right] = \left[ x_2 , x_1 \right] .
\]
So this reflection simply swaps the coordinates. On the other hand, if we choose V to be the line spanned by the unit vector \( \displaystyle \quad {\bf v} = \frac{1}{\sqrt{5}} \left[ 2, 1 \right] , \quad \) then the reflection across this line becomes
\[
R_V ({\bf x}) = 2 \left( \frac{2\,x_1 + x_2}{5} \right) \left[ 2, 1 \right] - \left[ x_1 , x_2 \right] = \frac{1}{5} \left[ 3\,x_1 + 4\, x_2 , 4\, x_1 -3\,x_2 \right] .
\]
The corresponding reflection matrix is
\[
\mathbf{R} = \frac{1}{5} \begin{bmatrix} 3&\phantom{-}4 \\ 4&-3 \end{bmatrix} .
\]
With Mathematica, we verify its involution and orthogonal properties.
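For readers without Mathematica at hand, here is the same verification as a short Python sketch with exact rationals:

```python
from fractions import Fraction as F

# R = (1/5)[[3,4],[4,-3]], the reflection across the line spanned by (2,1).
R = [[F(3, 5), F(4, 5)], [F(4, 5), F(-3, 5)]]
I2 = [[1, 0], [0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

assert matmul(R, R) == I2                      # involution: R² = I
Rt = [[R[j][i] for j in range(2)] for i in range(2)]
assert matmul(Rt, R) == I2                     # orthogonality: RᵀR = I
# the direction vector (2, 1) of the line V is fixed by the reflection
assert [sum(R[i][j] * [2, 1][j] for j in range(2)) for i in range(2)] == [2, 1]
```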
If the line U is spanned by a unit vector
\[
{\bf u} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} ,
\]
then the reflection matrix across this line is expressed by the classical formula
\[
R_U = \begin{bmatrix} \cos 2\theta & \phantom{-}\sin 2 \theta \\
\sin 2\theta & -\cos 2\theta \end{bmatrix} .
\]
We check with Mathematica its involution property R² = I, where I is the identity matrix:
So for the reflection across the line y = x, we have 2θ = π/2. Therefore the reflection matrix becomes
\[
R = \begin{bmatrix} 0&1 \\ 1&0 \end{bmatrix} .
\]
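The classical formula is easy to test numerically. A Python sketch checking the involution property for an arbitrary angle and the 2θ = π/2 special case (floating point, so comparisons use a small tolerance):

```python
import math

# R(θ) = [[cos 2θ, sin 2θ], [sin 2θ, -cos 2θ]], reflection across the
# line making angle θ with the x-axis.
def R(theta):
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# involution at an arbitrary angle: R(θ)² = I up to rounding error
M = matmul(R(0.7), R(0.7))
assert all(abs(M[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(2) for j in range(2))

# 2θ = π/2 reproduces the coordinate-swap matrix [[0,1],[1,0]]
Rswap = R(math.pi / 4)
assert abs(Rswap[0][0]) < 1e-12 and abs(Rswap[0][1] - 1) < 1e-12
```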
Example B: Reflection across a plane in ℝ³
Let the plane U have unit normal vector n, written as a column vector. Then the reflection matrix is
\[
{\bf R}_U = {\bf I} - 2 \mathbf{n}\,\mathbf{n}^{\mathrm T} .
\]
For instance, the reflection across the plane x₂ = 0, with normal vector n = [0, 1, 0], is
\[
{\bf R}_U = \begin{bmatrix} 1& \phantom{-}0 & 0 \\ 0&-1&0 \\ 0& \phantom{-}0 & 1 \end{bmatrix} .
\]
So n = [0, 1, 0], and we have
\[
{\bf R}_U \left( x_1 , x_2 , x_3 \right) = \left[ x_1 , - x_2 , x_3 \right] .
\]
Example C: General Reflection across a subspace in ℝn
For instance, let us consider a 2-dimensional subspace U in ℝ4 spanned by the unit vectors
\[
{\bf e}_1 = \frac{1}{\sqrt{5}} \left[ 2, \ 1, \ 0 , 0 \right] , \qquad {\bf e}_2 = \frac{1}{\sqrt{10}} \left[ 0, \ 1, \ 3, \ 0 \right] .
\]
The reflection matrix across the subspace generated by these unit vectors is defined by Eq.\eqref{EqDot.9}; note that e₁ and e₂ are unit vectors but not orthogonal, so the Gram matrix in that formula is essential. We use Mathematica to write the subroutine:
ReflectionMatrixFromBasis[basis_List] := Module[
{Q, G, P, n},
n = Length[First[basis]];
Q = Transpose[basis]; (* n x k *)
G = Transpose[Q].Q; (* Gram matrix k x k *)
P = Q.Inverse[G].Transpose[Q]; (* orthogonal projector *)
Simplify[2 P - IdentityMatrix[n]]
];
So the reflection of the vector x = [1, -3, 4, 2] is
\[
{\bf R}_U ({\bf x}) = {\bf R}_U (1, -3, 4, 2) = \frac{1}{49} \begin{bmatrix} -125 & 201 & 80 & -98 \end{bmatrix} .
\]
We check whether both vectors x = [1, -3, 4, 2] and reflected vector have the same length.
The following code handles an arbitrary randomly generated subspace:
SeedRandom[2026];
n = 7; k = 3;
basis = Table[RandomReal[{-1,1}, n], {k}];
R = ReflectionMatrixFromBasis[basis];
VerifyReflection[R]
■
End of Example 49
We present Mathematica code for dynamic manipulation of reflection across the plane in ℝ³. You can drag x and watch the reflection move symmetrically across the plane.
The following transformation was used in a 1958 paper by the American mathematician Alston Scott Householder (1904--1993).
A Householder reflection (also Householder transformation) is a linear transformation used to reflect vectors in ℝn across a hyperplane. Given a nonzero vector v ∈ ℝn, one defines the unit vector \( \displaystyle \quad
{\bf u} = \frac{\bf v}{\| {\bf v}\| }, \quad \)
and the associated Householder matrix
\[
{\bf H} = {\bf I}-2\,{\bf u}\,{\bf u}^{\mathsf{T}},
\]
where I is the n×n identity matrix. The matrix H represents reflection across the hyperplane orthogonal to u, i.e. the set
\[
\{ {\bf x}\in \mathbb{R}^{n} \ : \ {\bf u} \bullet {\bf x} =0 \} .
\]
For any vector x ∈ ℝn, the action of H can be written explicitly as
\[
\mathbf{H}\,\mathbf{x} = \mathbf{x} - 2 \left( \mathbf{u} \bullet \mathbf{x} \right) \mathbf{u} ,
\]
which is the familiar geometric formula for reflecting a point across a hyperplane with unit normal u.
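This reflection formula can be confirmed directly. A Python sketch using the illustrative unit normal u = (1, 2, 2)/3 (our choice, made so that all entries stay rational):

```python
from fractions import Fraction as F

# Householder matrix H = I - 2 u u^T for the exact unit vector u = (1,2,2)/3.
u = [F(1, 3), F(2, 3), F(2, 3)]
H = [[(1 if i == j else 0) - 2 * u[i] * u[j] for j in range(3)] for i in range(3)]
x = [F(5), F(-1), F(2)]

Hx = [sum(H[i][j] * x[j] for j in range(3)) for i in range(3)]
ux = sum(a * b for a, b in zip(u, x))
assert Hx == [x[i] - 2 * ux * u[i] for i in range(3)]   # H x = x - 2(u·x)u

# H is involutory, H(Hx) = x, and it reverses the normal: H u = -u
assert [sum(H[i][j] * Hx[j] for j in range(3)) for i in range(3)] == x
assert [sum(H[i][j] * u[j] for j in range(3)) for i in range(3)] == [-a for a in u]
```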
Householder reflections play a central role in modern numerical linear algebra, particularly in algorithms that require stable orthogonal transformations.
One of the most important applications is the QR factorization of a matrix A ∈ ℝm×n. The goal is to write
\( \displaystyle \quad
{\bf A} = {\bf Q}\,{\bf R}, \quad \)
where Q is orthogonal and R is upper triangular. Householder reflections provide a numerically stable and conceptually clean way to construct such a factorization.
The idea is to successively apply Householder transformations to zero out subdiagonal entries of A. At the first step, one constructs a Householder matrix H₁ that maps the first column of A to a vector whose only nonzero entry is in the first position. Applying H₁ on the left yields
\[
\mathbf{H}_1 \mathbf{A} = \begin{bmatrix} r_{11} & * \\ \mathbf{0} & \mathbf{A}_1 \end{bmatrix} .
\]
Repeating the same construction on the trailing submatrix A₁, and so on, produces an upper triangular matrix R = Hk ⋯ H₂H₁A after k = min(m − 1, n) steps. Since each Hᵢ is symmetric and involutory, the product
\[
\mathbf{Q} = \mathbf{H}_1 \mathbf{H}_2 \cdots \mathbf{H}_k
\]
is orthogonal, and the final matrix is upper triangular, giving the desired QR factorization.
Householder-based QR is preferred in many contexts because it is backward stable and uses orthogonal transformations, which preserve the 2-norm and thus control error growth.
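To make the procedure concrete, here is a minimal Householder-QR sketch in plain Python. The function name `householder_qr` and the unpivoted floating-point implementation are illustrative assumptions, not the document's code:

```python
import math

# Minimal Householder QR (floats, no pivoting); a sketch, not a library API.
def householder_qr(A):
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    Q = [[float(i == j) for j in range(m)] for i in range(m)]
    for k in range(min(m - 1, n)):
        # build the Householder vector that zeroes R[k+1:, k]
        x = [R[i][k] for i in range(k, m)]
        alpha = -math.copysign(math.sqrt(sum(t * t for t in x)), x[0] or 1.0)
        v = x[:]
        v[0] -= alpha
        vnorm = math.sqrt(sum(t * t for t in v))
        if vnorm < 1e-15:
            continue                      # column already triangular
        v = [t / vnorm for t in v]
        # apply H = I - 2 v v^T to the trailing rows of R and of Q
        for Mat in (R, Q):
            for j in range(len(Mat[0])):
                s = 2 * sum(v[i - k] * Mat[i][j] for i in range(k, m))
                for i in range(k, m):
                    Mat[i][j] -= s * v[i - k]
    # Q now holds H_k···H_1; its transpose is the orthogonal factor
    Qt = [[Q[j][i] for j in range(m)] for i in range(m)]
    return Qt, R

A = [[2.0, 1.0], [1.0, 3.0], [0.0, 1.0]]
Q, R = householder_qr(A)   # A == Q·R, Q orthogonal, R upper triangular
```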
For symmetric matrices, Householder reflections are also used to reduce a dense symmetric matrix to tridiagonal form, a key preprocessing step in many eigenvalue algorithms. By carefully choosing the Householder vectors, one can zero out entries below the first subdiagonal while preserving symmetry, leading to efficient and stable eigenvalue computations.
Householder reflections are closely related to Givens rotations, another class of orthogonal transformations used in numerical linear algebra. While Givens rotations act nontrivially on only two coordinates at a time, Householder reflections act globally but can annihilate many entries in a single step. As a result, Householder methods are often more efficient for dense matrices, whereas Givens rotations are advantageous for sparse or structured problems.
From a geometric viewpoint, every Householder reflection is an orthogonal transformation with determinant −1, and any orthogonal matrix can be expressed as a product of at most n Householder reflections. This gives Householder transformations a foundational role in the structure theory of orthogonal groups.
Example 50:
This example presents a Mathematica code for dynamic manipulation and visualization of Householder transformations in 3-dimensional physical space.
Clear[householderMatrix, n, H]
(* Householder matrix for a unit normal n *)
householderMatrix[n_List] := IdentityMatrix[Length[n]] - 2 Outer[Times, n, n];
(* Example: 3D normal and reflection *)
n = Normalize[{1, 2, 1}];
H = householderMatrix[n];
plane = Module[{v1, v2},
   v1 = Normalize[NullSpace[{n}][[1]]];   (* a vector in the plane *)
   v2 = Normalize[Cross[n, v1]];          (* second in-plane direction *)
   ParametricPlot3D[s v1 + t v2, {s, -3, 3}, {t, -3, 3},
    Mesh -> None,
    PlotStyle -> Opacity[0.3, LightBlue]]
   ];
Manipulate[
Module[{x = {x1, x2, x3}, xr},
xr = H.x;
Show[
plane,
Graphics3D[{
Thick,
Red, Arrow[{{0, 0, 0}, x}],
Green, Arrow[{{0, 0, 0}, xr}],
Black, Arrow[{{0, 0, 0}, n}],
Black, PointSize[Large], Point[x], Point[xr]
}],
Axes -> True,
AxesLabel -> {"x", "y", "z"},
Boxed -> False,
ImageSize -> Large
]
],
{{x1, 2}, -3, 3},
{{x2, -1}, -3, 3},
{{x3, 3}, -3, 3}
]
■
End of Example 50
The Householder reflection H = I − 2uuᵀ is a remarkably simple formula with deep geometric and algebraic content. It realizes reflection across a hyperplane in ℝⁿ; it is symmetric, orthogonal, and involutory, and has a transparent eigenstructure. These properties make it an indispensable tool in numerical linear algebra, particularly in QR factorization and eigenvalue algorithms, where stability and norm preservation are essential. Its conceptual clarity and computational efficiency explain why Householder's construction remains a standard ingredient in modern scientific computing.
For a given target point (4, 3, 2, 1) in ℝ4, find the point in \( \displaystyle \quad \mbox{span} \left\{ \begin{pmatrix} 1 \\ -1 \\ 2 \\ 3 \end{pmatrix} , \ \begin{pmatrix} 0 \\ 1 \\ 3 \\ 2 \end{pmatrix} \right\} , \quad \) which is closest to the target point.
For a given target point (4, 3, 2, 1) in ℝ4, find the point in \( \displaystyle \quad \mbox{span} \left\{ \begin{pmatrix} 1 \\ 0 \\ 2 \\ 0 \end{pmatrix} , \ \begin{pmatrix} 0 \\ 3 \\ 2 \\ 0 \end{pmatrix} \right\} , \quad \) which is closest to the target point.
For a given target point (4, 3, 2, 1) in ℝ4, find the point on the plane described by 2x + y − 3z + 4w = 0 which is closest to the target point.