
To enhance pedagogical effectiveness, the treatment of the dot product is presented in several distinct sections:

Dot product

Metric

Geometrical interpretation

Duality

Orthogonality

Solvability

In this section, we consider vector spaces over the field of real numbers rather than complex numbers. The dot product of two n-dimensional vectors x = (x₁, x₂, … , xn) and y = (y₁, y₂, … , yn),

\[ \mathbf{x} \bullet \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n , \]
becomes extremely simple for two collinear vectors. Namely, if y = λx, then their dot product is just the product of two numbers
\[ \mathbf{x} \bullet \left( \lambda \mathbf{x} \right) = \| \mathbf{x} \| \cdot \lambda \| \mathbf{x} \| = \lambda \| \mathbf{x} \|^2 , \]
where
\[ \| \mathbf{x} \| = + \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} \]
is the Euclidean norm (or length) of vector x.
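These formulas are easy to replay outside Mathematica. The short Python sketch below (our own illustration; the helper names dot and norm are not part of the text) verifies the collinear-vector identity x • (λx) = λ∥x∥².

```python
import math

def dot(x, y):
    # x . y = x1*y1 + x2*y2 + ... + xn*yn
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # Euclidean norm ||x|| = +sqrt(x1^2 + ... + xn^2)
    return math.sqrt(dot(x, x))

x = [1.0, 2.0, 3.0]
lam = -2.5
# x . (lam*x) equals lam*||x||^2 for any scalar lam, including negative ones
assert math.isclose(dot(x, [lam * a for a in x]), lam * norm(x) ** 2)
```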

 

Projection


 

It is sometimes important to find the component of a vector in a particular direction associated with another vector. Consider the following picture:
au = Graphics[{Black, Thickness[0.01], Arrowheads[0.07], Arrow[{{0.7, 0.175}, {2, 0.5}}]}]; av = Graphics[{Blue, Thickness[0.01], Arrowheads[0.07], Arrow[{{0, 0}, {0.5, 1}}]}]; rec = Graphics[{Orange, Line[{{0.6, 0.15}, {0.56, 0.27}, {0.67, 0.29}, {0.7, 0.175}}]}]; op = Graphics[{Purple, Thickness[0.01], Line[{{0, 0}, {0.7, 0.175}}]}]; qp = Graphics[{Purple, Thickness[0.01], Dashed, Line[{{0.7, 0.2}, {0.5, 1}}]}]; txt = Graphics[{Text[Style["O", Black, 20], {0, -0.15}], Text[Style["P", Black, 20], {0.7, 0.07}], Text[Style["u", Black, Bold, 20], {1.8, 0.3}], Text[Style["v", Black, Bold, 20], {0.22, 0.7}], Text[Style["Q", Black, 20], {0.6, 1.05}]}]; Show[au, av, qp, rec, op, txt]
Figure 1: Orthogonal projection
Here, u is a non-zero vector in ℝ² that defines a one-dimensional subspace spanned by u, and v is an arbitrary vector in the plane. We label the tip of v as point Q, with its tail positioned at the origin O. The point P lies on the line defined by u and is the closest point on that line to Q. Equivalently, the triangle formed by points O, P, and Q is a right triangle, with the right angle at P. The distance from O to P, measured positively in the direction of u, is called the component of v in the direction of u, and is denoted compu(v). The vector \( \displaystyle \quad \vec{OP} \quad \) is referred to as the projection of v onto u, and is denoted proju(v) or v∥. We are going to find formulas for these quantities.

Let θ be the included angle between v and u. From trigonometry, considering the right triangle (O, P, Q), we know that

\[ \cos\theta = \frac{| OP|}{|OQ|} = \frac{|OP|}{\| \mathbf{v} \|} , \]
and hence the length of projection becomes
\[ \left\vert \vec{OP} \right\vert = \left\| \mathbf{v}_{\|} \right\| = \left\| \mbox{proj}_{u} (\mathbf{v}) \right\| = \| \mathbf{v} \| \,\cos\theta . \]
From the geometric interpretation of the dot product, \( \displaystyle \quad \mathbf{v} \bullet \mathbf{u} = \| \mathbf{v} \| \,\| \mathbf{u} \| \,\cos\theta , \) we get
\begin{equation} \label{EqDot.1} \mathbf{u} \bullet \mathbf{v} = \left\| \mathbf{u} \right\| \cdot \left\| \mathbf{v}_{\|} \right\| = \left\| \mathbf{u} \right\| \cdot \left\| \mbox{proj}_{u} (\mathbf{v}) \right\| . \end{equation}
Therefore, the dot product u • v of two vectors can be interpreted as a simple numerical product of two quantities: the magnitude of vector u and the magnitude of the projection of v onto the span of vector u, which we denote by v∥. In other words, to compute the dot product, one must determine the projection of one vector in the direction of the other, along with the direction vector itself. Since the formula for the dot product \eqref{EqDot.1} does not depend on the specific choice of the direction vector (as long as it is non-zero), we can express the dot product as:
\[ \mathbf{u} \bullet \mathbf{v} = \left\| \mathbf{v} \right\| \cdot \left\| \mathbf{u}_{\|} \right\| = \left\| \mathbf{v} \right\| \cdot \left\| \mbox{proj}_{v} (\mathbf{u}) \right\| . \]
au = Graphics[{Black, Thickness[0.01], Arrowheads[0.07], Arrow[{{0.7, 0.175}, {2, 0.5}}]}]; av = Graphics[{Blue, Thickness[0.01], Arrowheads[0.07], Arrow[{{0, 0}, {0.5, 1}}]}]; rec = Graphics[{Orange, Line[{{0.6, 0.15}, {0.56, 0.27}, {0.67, 0.29}, {0.7, 0.175}}]}]; op = Graphics[{Purple, Thickness[0.01], Line[{{0, 0}, {0.7, 0.175}}]}]; qp = Graphics[{Purple, Thickness[0.01], Dashed, Line[{{0.7, 0.2}, {0.5, 1}}]}]; txt = Graphics[{Text[Style["O", Black, 20], {0, -0.15}], Text[Style["P", Black, 20], {0.7, 0.07}], Text[Style["S", Black, 20], {2.0, 0.6}], Text[Style["u", Black, Bold, 20], {1.8, 0.3}], Text[Style["v", Black, Bold, 20], {0.22, 0.7}], Text[Style["R", Black, 20], {0.4, 1.15}]}]; aav = Graphics[{Red, Thickness[0.01], Arrowheads[0.07], Arrow[{{0, 0}, {0.6, 1.2}}]}]; rs = Graphics[{Purple, Thickness[0.01], Dashed, Line[{{2, 0.5}, {0.6, 1.2}}]}]; Show[au, av, qp, rec, op, txt, aav, rs]
Figure 2: Orthogonal projection on another vector

Now we express the length of the projection vector through the dot product:

\[ \left\| \mathbf{v}_{\|} \right\| = \left\| \mbox{proj}_{u} (\mathbf{v}) \right\| = \| \mathbf{v} \| \, \cos\theta = \frac{\mathbf{u} \bullet \mathbf{v}}{\| \mathbf{u} \|} . \]

The vector proju(v) = \( \vec{OP} \) can now be computed by rescaling u to the correct length. Specifically, we first normalize u by dividing it by its own length, and then multiply by |OP|. In formulas:

\[ \mbox{proj}_u (\mathbf{v}) = \vec{OP} =\frac{|OP|}{\| \mathbf{u} \|}\,\mathbf{u} = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}} \,\mathbf{u} . \]
The following definition summarizes what we have just found.
Let u be a non-zero vector and v any vector. Then the component of v in the direction of u is defined to be the scalar \begin{equation} \label{EqDot.11} \mbox{comp}_{\bf u}\left( \mathbf{v} \right) = \frac{\mathbf{u} \bullet \mathbf{v}}{\| \mathbf{u} \|} . \end{equation} The projection of v onto u is defined to be the vector \begin{equation} \label{EqDot.12} \mbox{proj}_{\bf u}\left( \mathbf{v} \right) = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}} \,\mathbf{u} = \frac{\mathbf{u} \bullet \mathbf{v}}{\| \mathbf{u} \|^2} \,\mathbf{u} . \end{equation} These two operations are also called the scalar projection and vector projection, respectively.
Formula \eqref{EqDot.12} may be interpreted as the orthogonal projection of the vector v, not merely onto the vector u, but onto the subspace span{u}. This aligns with the geometric intuition commonly introduced in elementary linear algebra: we construct the perpendicular from the tip of v to the line defined by the direction of u. It is important to note that the resulting scalar quantity—namely, the component of v in the direction of u—can be negative. This occurs when the angle between v and u is obtuse, indicating that v points in a direction opposite to u. In such cases, the projection of v onto span{u} lies on the extension of the line in the direction opposite to u, and the component of v along u is therefore negative.
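The scalar and vector projections just defined can be sketched in a few lines of Python (the function names comp and proj are our own, not from the text); the last assertion illustrates the remark about obtuse angles producing a negative component.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def comp(u, v):
    # comp_u(v) = (u . v)/||u||; negative when the angle between u and v is obtuse
    return dot(u, v) / math.sqrt(dot(u, u))

def proj(u, v):
    # proj_u(v) = (u . v)/(u . u) * u
    c = dot(u, v) / dot(u, u)
    return [c * a for a in u]

# v - proj_u(v) is orthogonal to u (up to floating-point rounding).
u, v = [3.0, -7.0, 2.0], [6.0, 4.0, -5.0]
r = [a - b for a, b in zip(v, proj(u, v))]
assert math.isclose(dot(u, r), 0.0, abs_tol=1e-12)

# Obtuse angle: the component is negative, as Figure 3 illustrates.
assert comp([1.0, 0.0], [-1.0, 1.0]) < 0
```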

au = Graphics[{Black, Thickness[0.01], Arrowheads[0.07], Arrow[{{0.7, 0.175}, {2, 0.5}}]}]; av = Graphics[{Blue, Thickness[0.01], Arrowheads[0.07], Arrow[{{0, 0}, {-0.5, -1}}]}]; rec = Graphics[{Orange, Line[{{-0.6, -0.15}, {-0.56, -0.27}, {-0.67, -0.29}, {-0.7, \ -0.175}}]}]; op = Graphics[{Purple, Thickness[0.01], Line[{{0, 0}, {0.7, 0.175}}]}]; qp = Graphics[{Purple, Thickness[0.01], Dashed, Line[{{-0.7, -0.2}, {-0.5, -1}}]}]; txt = Graphics[{Text[Style["O", Black, 20], {0.1, 0.15}], Text[Style["P", Black, 20], {-0.7, -0.09}], Text[Style["u", Black, Bold, 20], {1.8, 0.3}], Text[Style["v", Black, Bold, 20], {-0.22, -0.7}], Text[Style["Q", Black, 20], {-0.6, -1.05}]}]; pr = Graphics[{Red, Thickness[0.01], Arrowheads[0.07], Arrow[{{0, 0}, {-0.7, -0.2}}]}]; ax = Graphics[{Black, Thick, Arrowheads[0.04], Arrow[{{-0.9, 0}, {2.0, 0}}]}]; ay = Graphics[{Black, Thick, Arrowheads[0.04], Arrow[{{0, -1.0}, {0, 0.7}}]}]; Show[au, av, qp, rec, op, txt, pr, ax, ay]
Figure 3: Projection (in red) in opposite direction
   
Example 40: For vectors u = (3, −7, 2) and v = (6, 4, −5) in ℝ³, let us find their projections on each other.

In order to use formula \eqref{EqDot.12}, we need to find \[ \mathbf{u} \bullet \mathbf{v} = 3 \cdot 6 - 7 \cdot 4 - 2 \cdot 5 = -20 . \]

u = {3, -7, 2}; v = {6, 4, -5}; Dot[u,v]
-20
The norms of the given vectors are \[ \| \mathbf{u} \| = \sqrt{62} \approx 7.87401, \qquad \| \mathbf{v} \| = \sqrt{77} \approx 8.77496 \]
Norm[u]
\( \displaystyle \quad \sqrt{62} \)
Norm[v]
\( \displaystyle \quad \sqrt{77} \)
Now we determine the required projections: \[ \mbox{proj}_u (\mathbf{v}) = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}} \,\mathbf{u} = \frac{\mathbf{u} \bullet \mathbf{v}}{\| \mathbf{u} \|^2} \,\mathbf{u} = \frac{-20}{62}\,\mathbf{u} = - \frac{10}{31}\,\left( 3, -7, 2 \right) , \] and \[ \mbox{proj}_v (\mathbf{u}) = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{v} \bullet \mathbf{v}} \,\mathbf{v} = \frac{\mathbf{u} \bullet \mathbf{v}}{\| \mathbf{v} \|^2} \,\mathbf{v } = - \frac{20}{77} \,\left( 6, 4, -5 \right) . \]    ■
End of Example 40
    Formula \eqref{EqDot.12} shows that the projection vector depends on vector v linearly. We know that every linear transformation can be expressed as matrix multiplication. So we would like to express the projection vector as
\[ \mbox{proj}_{\bf u}\left( \mathbf{v} \right) = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}} \,\mathbf{u} = \mathbf{P}\,\mathbf{v} , \quad \mathbf{u} , \mathbf{v} \in \mathbb{R}^{n\times 1} . \]
This rank-one n-by-n matrix is called the projection matrix on direction u:
\begin{equation} \label{EqDot.2} \mathbf{P}_u = \mathbf{P} = \frac{\mathbf{u} \,\mathbf{u}^{\mathrm T}}{\| \mathbf{u} \|^2} , \quad \mathbf{u} \in \mathbb{R}^{n\times 1} . \end{equation}
   
Example 41: Let u = [1, −2, 3]T be a vector in ℝ³. Since its squared norm is 14, we find the corresponding projection matrix \[ \mathbf{P} = \mathbf{P}_u = \frac{1}{14}\,\mathbf{u} \cdot \mathbf{u}^{\mathrm T} . \] Actual evaluation of matrix P is delegated to Mathematica:
u = {{1, -2, 3}}; ut = {{1}, {-2}, {3}}; ut . u
{{1, -2, 3}, {-2, 4, -6}, {3, -6, 9}}
So the rank-one projection matrix becomes \[ \mathbf{P} = \frac{1}{14} \begin{bmatrix} \phantom{-}1 & -2 & \phantom{-}3 \\ -2 & \phantom{-}4 & -6 \\ \phantom{-}3 & -6 & \phantom{-}9 \end{bmatrix} . \]    ■
End of Example 41
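The result of Example 41 can be double-checked in exact rational arithmetic. The sketch below (pure Python; the helpers matvec and matmul are our own) also confirms two defining properties of a projection matrix: P P = P, and P fixes every vector along u.

```python
from fractions import Fraction

u = [1, -2, 3]
n2 = sum(a * a for a in u)                          # ||u||^2 = 14
P = [[Fraction(a * b, n2) for b in u] for a in u]   # outer product u u^T / 14

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

assert P[0] == [Fraction(1, 14), Fraction(-1, 7), Fraction(3, 14)]
assert matmul(P, P) == P        # idempotent: projecting twice changes nothing
assert matvec(P, u) == u        # P fixes vectors in the direction of u
```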
    An important application of projections is that every vector v can be uniquely written as a sum of two orthogonal vectors, one of which is a scalar multiple of some given non-zero vector u, and the other of which is orthogonal to u.
Theorem 13: Let u be a non-zero vector, and let v be any vector from the same vector space. Then there exist unique vectors v∥ and v⊥ such that \begin{equation} \label{EqDot.13} \mathbf{v} = \mathbf{v}_{\|} + \mathbf{v}_{\perp} , \end{equation} where v∥ is a scalar multiple of u, and v⊥ is orthogonal to u.
We visualize the situation with Figure 2.

To show that such a decomposition exists, let \[ \mathbf{v}_{\|} = \mbox{proj}_u (\mathbf{v}) = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}}\,\mathbf{u} , \] and define v⊥ by the identity v⊥ = v − v∥. By definition, Eq.\eqref{EqDot.13} is satisfied, and v∥ is a scalar multiple of u. We must show that v⊥ is orthogonal to u. For this, we verify that their dot product equals zero: \begin{align*} \mathbf{u} \bullet \mathbf{v}_{\perp} &= \mathbf{u} \bullet \left( \mathbf{v} - \mathbf{v}_{\|} \right) \\ &= \mathbf{u} \bullet \mathbf{v} - \mathbf{u} \bullet \mathbf{v}_{\|} \\ &= \mathbf{u} \bullet \mathbf{v} - \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}}\,\mathbf{u} \bullet \mathbf{u} \\ &= \mathbf{u} \bullet \mathbf{v} - \mathbf{u} \bullet \mathbf{v} = 0 . \end{align*} To show uniqueness, suppose that Eq.\eqref{EqDot.13} holds and v∥ = λu. Taking the dot product of both sides of \eqref{EqDot.13} with u and using u • v⊥ = 0, this yields \begin{align*} \mathbf{u} \bullet \mathbf{v} &= \mathbf{u} \bullet \left( \mathbf{v}_{\|} + \mathbf{v}_{\perp} \right) \\ &= \mathbf{u} \bullet \lambda \mathbf{u} + \mathbf{u} \bullet \mathbf{v}_{\perp} \\ &= \lambda \,\| \mathbf{u} \|^2 , \end{align*} which implies λ = u • v/∥u∥². Thus, there can be no more than one such vector v∥. Since v⊥ must equal v − v∥, it follows that there can be no more than one choice for both v∥ and v⊥, proving their uniqueness.

   
Example 42: Let us consider two vectors from ℝ³: \[ \mathbf{u} = \left( 3,-4,5 \right) , \qquad \mathbf{v} = \left( 2.71828, 3.14159, 1.61803 \right) . \] We want to decompose vector v into v = v∥ + v⊥, where v∥ is parallel to u and v⊥ is orthogonal to u. Using formula \eqref{EqDot.12}, we find the vector that is parallel to u: \[ \mathbf{v}_{\|} = \frac{\mathbf{u} \bullet \mathbf{v}}{\mathbf{u} \bullet \mathbf{u}} \,\mathbf{u} = 0.0735726 \,\mathbf{u} = 0.0735726 \left( 3,-4,5 \right) \] because \[ \mathbf{u} \bullet \mathbf{u} = 50, \qquad \mathbf{u} \bullet \mathbf{v} = 3.67863 \]
u = {3, -4, 5}; v = {2.71828, 3.14159, 1.61803}; Dot[u, u]
50
Dot[u, v]
3.67863
Dot[u, v]/Dot[u, u]
0.0735726
Then the orthogonal component becomes \[ \mathbf{v}_{\perp} = \mathbf{v} - \mathbf{v}_{\|} = \left( 2.49756, 3.43588, 1.25017 \right) . \]
v - Dot[u, v]/Dot[u, u]*u
{2.49756, 3.43588, 1.25017}
u = {3, -4, 5}; v = {2.71828, 3.14159, 1.61803}; vpar = Dot[u, v]/Dot[u, u]*u; vperp = v - vpar; av = Graphics3D[{Black, Thickness[0.01], Arrowheads[0.04], Arrow[{{0, 0, 0}, vpar}]}]; au = Graphics3D[{Blue, Thickness[0.01], Arrowheads[0.04], Arrow[{{0, 0, 0}, vperp}]}]; Show[av, au]
Figure 42.1: Two orthogonal vectors v∥ and v⊥
   ■
End of Example 42
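As a sanity check on Example 42, the following Python snippet (helper name dot is our own) recomputes the decomposition and confirms that v⊥ is orthogonal to u, up to floating-point rounding.

```python
import math

u = [3, -4, 5]
v = [2.71828, 3.14159, 1.61803]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

c = dot(u, v) / dot(u, u)          # (u . v)/(u . u) = 3.67863/50
v_par = [c * a for a in u]         # component parallel to u
v_perp = [a - b for a, b in zip(v, v_par)]

assert math.isclose(c, 0.0735726)
assert math.isclose(dot(u, v_perp), 0.0, abs_tol=1e-12)
```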

Projection on a subspace

It is natural to extend the definition of vector projection from the one-dimensional line generated by a single vector to a subspace spanned by an orthogonal set. This generalization is not only elegant but also essential for solving minimum distance problems in linear algebra and optimization. Recall that a subspace of a vector space is a subset that is closed under both vector addition and scalar multiplication. That means any linear combination of vectors in the subspace remains within the subspace.
Let W be a subspace of ℝn and let x be a vector in ℝn. The closest vector to x in W is denoted by xW. Equivalently, the difference x − xW is orthogonal to all vectors in W. This vector xW is called the orthogonal projection of the vector x on the subspace W. The length of the vector x − xW can be regarded as the distance from x to the subspace W.
For ℝ³, some common subspaces include lines through the origin and planes through the origin. The following example serves as an intuitive development of vector projection in 3-dimensional physical space.    
Example 43: A basis for the xy-plane is given by the two standard coordinate vectors \[ \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \mathbf{i} , \qquad \mathbf{e}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \mathbf{j} . \] The set {e₁, e₂} is orthogonal, so for any vector x = (x₁, x₂, x₃), we have \[ \mathbf{x}_W = \left( \mathbf{x} \bullet \mathbf{e}_1 \right) \mathbf{e}_1 + \left( \mathbf{x} \bullet \mathbf{e}_2 \right) \mathbf{e}_2 = \begin{pmatrix} x_1 \\ x_2 \\ 0 \end{pmatrix} . \]
ax = Graphics[{Black, Thick, Arrowheads[0.04], Arrow[{{0.2, 0.2}, {-0.5, -0.5}}]}]; ay = Graphics[{Black, Thick, Arrowheads[0.04], Arrow[{{-0.3, 0}, {1, 0}}]}]; az = Graphics[{Black, Thick, Arrowheads[0.04], Arrow[{{0, -0.4}, {0, 1}}]}]; au = Graphics[{Blue, Thickness[0.01], Arrowheads[0.07], Arrow[{{0.0, 0.0}, {0.35, 0.75}}]}]; azu = Graphics[{Purple, Dashed, Thickness[0.01], Line[{{0.35, -0.3}, {0.35, 0.75}}]}]; axu = Graphics[{Purple, Dashed, Thickness[0.01], Line[{{0.35, -0.3}, {-0.3, -0.3}}]}]; ayu = Graphics[{Purple, Dashed, Thickness[0.01], Line[{{0.35, -0.3}, {0.65, 0.0}}]}]; pr = Graphics[{Red, Thickness[0.01], Arrowheads[0.07], Arrow[{{0.0, 0.0}, {0.35, -0.3}}]}]; txt = Graphics[{Text[Style["O", Black, 20], {-0.1, 0.1}], Text[Style["x", Black, 20], {-0.35, -0.45}], Text[Style["y", Black, Bold, 20], {0.9, 0.1}], Text[Style["v", Black, Bold, 20], {0.25, 0.7}], Text[Style["projection", Red, Bold, 20], {0.3, -0.4}], Text[Style["z", Black, 20], {0.1, 0.9}]}]; Show[ax, ay, az, au, azu, axu, ayu, pr, txt]
Figure 4.1: Projection (in red) on xy-plane
   ■
End of Example 43
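Example 43 translates into a one-line computation. The sketch below (the function name proj_xy is our own) applies the orthogonal-basis formula with e₁ and e₂: projecting onto the xy-plane simply zeroes the third coordinate.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

e1, e2 = [1, 0, 0], [0, 1, 0]   # orthonormal basis of the xy-plane in R^3

def proj_xy(x):
    # x_W = (x . e1) e1 + (x . e2) e2
    c1, c2 = dot(x, e1), dot(x, e2)
    return [c1 * a + c2 * b for a, b in zip(e1, e2)]

assert proj_xy([3, -7, 2]) == [3, -7, 0]
```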

There are two equivalent approaches to the definition of orthogonal projection: we can seek p ∈ W such that y − p ⊥ W, or we can seek p ∈ W such that ∥y − p∥ is minimized. There are several issues to consider: uniqueness, existence, and computation of the projection vector p. We start with the uniqueness question.

Theorem 14: Let W be a subspace of ℝn, and let y be a vector in ℝn. Then there is at most one vector p in W such that y − p ⊥ W.
Suppose there exist two vectors p, q ∈ W such that y − p ⊥ W and y − q ⊥ W. Since W is a subspace of ℝn, the difference p − q is an element of W. On the other hand, (y − q) − (y − p) = p − q is an element of W⊥, because W⊥ is also a subspace. However, W ∩ W⊥ = {0}, so p − q = 0 and p = q.
   
Example 44: Let us find the distance from the point (5, −3, 2) to the plane defined by 2x + 5y −3z = 0. We demonstrate two approaches for distance determination.

The first approach is based on projection of the target vector onto the normal vector to the plane, which can be read off from the plane's equation: n = (2, 5, −3). Now we compute the projection of the position vector v = (5, −3, 2) onto n using Eq.\eqref{EqDot.12}: \[ \mbox{proj}_n (\mathbf{v}) = \frac{\mathbf{v} \bullet \mathbf{n}}{\| \mathbf{n} \|^2} \,\mathbf{n} = \frac{11}{38} \begin{pmatrix} -2 \\ -5 \\ 3 \end{pmatrix} . \]

v = {5, -3, 2}; n = {2, 5, -3}; x = (Dot[v, n])/Norm[n]^2 *n
{-(11/19), -(55/38), 33/38}
Now, this vector is parallel to the normal vector n, so it is perpendicular to the plane. Subtracting it from the target vector v = (5, −3, 2) gives a vector parallel to the plane, and this is the position vector for the point we seek. \[ \mathbf{q} = \mathbf{v} - \mathbf{x} = \frac{1}{38} \begin{pmatrix} 212 \\ -59 \\ 43 \end{pmatrix} . \]
q = v - x
{106/19, -(59/38), 43/38}
So the closest point is Q = (212, −59, 43)/38. We weren't asked for it, but note that the distance from the point v to the plane is given by ∥x∥ = 11/√38.
Norm[x]
11/Sqrt[38]

Now we demonstrate another approach, using orthogonal projection onto W:

Recall that the orthogonal complement of W, denoted by W⊥, is \[ W^{\perp} = \left\{ \mathbf{y} \in \mathbb{R}^3 \ : \ \mathbf{x} \bullet \mathbf{y} = 0 \quad \forall \mathbf{x} \in W \right\} , \] which leads to the direct sum decomposition ℝ³ = W ⊕ W⊥. With this approach, we need to determine a basis of the subspace W. From the plane equation, we find \[ 2x = -5y + 3z. \] Setting y = 0, z = 2, and then y = 2, z = 0, we obtain two linearly independent vectors (independence is clear from the placement of their zero components) \[ \mathbf{u}_1 = \begin{pmatrix} 3 \\ 0 \\ 2 \end{pmatrix} , \qquad \mathbf{u}_2 = \begin{pmatrix} -5 \\ 2 \\ 0 \end{pmatrix} . \] With these vectors at hand, we build the matrix \[ \mathbf{A} = \begin{bmatrix} 3 & -5 \\ 0 & 2 \\ 2&0 \end{bmatrix} . \] Using formula \eqref{EqDot.8}, we find the projection vector \[ \mathbf{p} = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} \mathbf{v} = \frac{1}{38}\begin{bmatrix} 34 & -10 & 6 \\ -10&13&15 \\ 6&15&29 \end{bmatrix} \begin{pmatrix} 5 \\ -3 \\ 2 \end{pmatrix} = \frac{1}{38}\begin{pmatrix} 212 \\ -59 \\ 43 \end{pmatrix} . \]

A = {{3, -5}, {0, 2}, {2, 0}}; P = A . Inverse[Transpose[A] . A] . Transpose[A]
{{17/19, -(5/19), 3/19}, {-(5/19), 13/38, 15/38}, {3/19, 15/38, 29/38}}
P . {5, -3, 2}
{106/19, -(59/38), 43/38}
Subtracting this vector p from v, we obtain the same normal component x ⊥ W as before, confirming that p coincides with the point q ∈ W found with the first approach.    ■

End of Example 44
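Both routes of Example 44 can be replayed in exact arithmetic. In the sketch below (pure Python with Fraction; the 2 × 2 inverse is written out by hand, and all variable names are our own) the normal-vector approach and the subspace-projection approach give the same closest point on the plane.

```python
from fractions import Fraction as F

v = [5, -3, 2]
n = [2, 5, -3]                                  # normal of 2x + 5y - 3z = 0

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Route 1: subtract the component of v along the normal.
c = F(dot(v, n), dot(n, n))                     # (v . n)/||n||^2 = -11/38
q1 = [a - c * b for a, b in zip(v, n)]

# Route 2: project onto W = span{u1, u2} via c = (A^T A)^{-1} A^T v, p = A c,
# with the 2x2 Gram-matrix inverse expanded by hand.
u1, u2 = [3, 0, 2], [-5, 2, 0]
g11, g12, g22 = dot(u1, u1), dot(u1, u2), dot(u2, u2)   # entries of A^T A
det = g11 * g22 - g12 * g12
b1, b2 = dot(u1, v), dot(u2, v)                 # entries of A^T v
c1 = F(g22 * b1 - g12 * b2, det)
c2 = F(-g12 * b1 + g11 * b2, det)
q2 = [c1 * a + c2 * b for a, b in zip(u1, u2)]

assert q1 == q2 == [F(106, 19), F(-59, 38), F(43, 38)]
```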

The more subtle question is whether the projection vector always exists for an arbitrary element of a finite-dimensional real space. We have seen that the answer is yes for a one-dimensional subspace, formula \eqref{EqDot.12}. The next theorem generalizes this result to any finite-dimensional subspace, giving a simple formula for the orthogonal projection of a vector in terms of an orthogonal basis for the subspace.

Theorem 15: Let W be a subspace of ℝn and let {u₁, u₂, … , um} be an orthogonal basis for W. Then for any vector x in ℝn, the orthogonal projection of x onto W is given by the formula \begin{equation} \label{EqDot.4} \mathbf{x}_W = \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\,\mathbf{u}_{1} + \frac{\mathbf{x} \bullet \mathbf{u}_2}{\mathbf{u}_2 \bullet \mathbf{u}_2}\,\mathbf{u}_{2} + \cdots + \frac{\mathbf{x} \bullet \mathbf{u}_m}{\mathbf{u}_m \bullet \mathbf{u}_m}\,\mathbf{u}_{m} . \end{equation}
If vectors e₁, e₂, … , em constitute an orthonormal basis for W, then the projection matrix can be written as
\[ {\bf P}_W = \sum_i \left\langle {\bf e}_i , \cdot \right\rangle {\bf e}_i . \tag{6A} \]
Let \[ \mathbf{y} = \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\,\mathbf{u}_{1} + \frac{\mathbf{x} \bullet \mathbf{u}_2}{\mathbf{u}_2 \bullet \mathbf{u}_2}\,\mathbf{u}_{2} + \cdots + \frac{\mathbf{x} \bullet \mathbf{u}_m}{\mathbf{u}_m \bullet \mathbf{u}_m}\,\mathbf{u}_{m} . \] This vector is contained in W because it is a linear combination of {u₁, u₂, … , um}. Hence, we just need to show that x − y is in W⊥, i.e., that uj • (x − y) = 0 for each j = 1, 2, … , m. For u₁ we have \begin{align*} \mathbf{u}_1 \bullet \left( \mathbf{x} - \mathbf{y} \right) &= \mathbf{u}_1 \bullet \left( \mathbf{x} - \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\,\mathbf{u}_{1} - \cdots - \frac{\mathbf{x} \bullet \mathbf{u}_m}{\mathbf{u}_m \bullet \mathbf{u}_m}\,\mathbf{u}_{m} \right) \\ &=\mathbf{u}_1 \bullet \mathbf{x} - \frac{\mathbf{x} \bullet \mathbf{u}_1}{\mathbf{u}_1 \bullet \mathbf{u}_1}\left( \mathbf{u}_{1} \bullet \mathbf{u}_{1} \right) - 0 - \cdots \\ &= 0 , \end{align*} where we used the orthogonality u₁ • uj = 0 for j ≠ 1. A similar calculation shows that uj • (x − y) = 0 for each j = 2, … , m. Hence, (x − y) ∈ W⊥, as desired.
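Theorem 15 is straightforward to test numerically. In the sketch below we pick an orthogonal (not orthonormal) pair of our own choosing and confirm, in exact rationals, that the residual x − xW is orthogonal to both basis vectors.

```python
from fractions import Fraction as F

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

u1, u2 = [1, 1, 1], [2, -3, 1]        # an orthogonal basis of a plane in R^3
assert dot(u1, u2) == 0

x = [3, 2, 1]
c1 = F(dot(x, u1), dot(u1, u1))       # (x . u1)/(u1 . u1)
c2 = F(dot(x, u2), dot(u2, u2))       # (x . u2)/(u2 . u2)
xW = [c1 * a + c2 * b for a, b in zip(u1, u2)]

# The residual x - xW lies in the orthogonal complement of W.
r = [a - b for a, b in zip(x, xW)]
assert dot(u1, r) == 0 and dot(u2, r) == 0
```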
   
Example 45: We consider two vectors in three dimensional physical space \[ \mathbf{a} = \left( 2,2,-1 \right) , \qquad \mathbf{b} = \left( -1, 2, 2 \right) . \] These vectors are orthogonal because a • b = −2 + 4 − 2 = 0, so {a, b} is an orthogonal basis of the two dimensional flat they generate. Its normal vector n = (6, −3, 6), which we determine by taking their cross product, is parallel to (2, −1, 2).
a = {2, 2, -1}; b = {-1, 2, 2}; Cross[a, b]
{6, -3, 6}
We denote the two dimensional flat by W = span(a, b). Its equation in ℝ³ is \[ 2\,x - y + 2\,z = 0. \] We plot the plane along with the normal vector.
plane = Plot3D[(y - 2*x)/2, {x, -2, 2}, {y, -2, 2}]; vec = Graphics3D[{Blue, Thick, Arrow[{{0, 0, 0}, {2, -1, 2}}]}]; Show[vec, plane]
Figure 5.1: Plane with normal vector
Using formula \eqref{EqDot.4}, we evaluate the projection vector \[ \mathbf{x}_W = \frac{2x + 2y -z}{9} \left( 2, 2, -1 \right) + \frac{-x + 2y + 2z}{9} \left( -1, 2, 2 \right) . \] For instance, vector x = (3, 2, 1) has the following projection on flat W: \[ \mathbf{x}_W = \mathbf{a} + \frac{1}{3}\,\mathbf{b} = \frac{1}{3} \left( 5, 8, -1 \right) . \]
{3, 2, 1} . a
9
{3, 2, 1} . b
3
a + b/3
{5/3, 8/3, -(1/3)}
   ■
End of Example 45
Corollary 1: Let W be a subspace of ℝn and let {e₁, e₂, … , em} be an orthonormal basis for W. Then for any vector x in ℝn, the orthogonal projection of x onto W is given by the formula \begin{equation} \label{EqDot.5} \mathbf{x}_W = \left( \mathbf{x} \bullet \mathbf{e}_1 \right) \mathbf{e}_1 + \left( \mathbf{x} \bullet \mathbf{e}_2 \right) \mathbf{e}_2 + \cdots + \left( \mathbf{x} \bullet \mathbf{e}_m \right) \mathbf{e}_m . \end{equation}
This formula is both simple and powerful because it guarantees the existence of a unique projection vector in W, and it minimizes the Euclidean distance between the target vector and any point in W.
Corollary 2: Let p be the projection of the target vector y ∈ ℝn on a subspace W. For any x in W with x ≠ p, the inequality ∥y − p∥ ≤ ∥y − x∥ holds.
Suppose x ∈ W. Then p − x ∈ W and y − p ⊥ W, and so the Pythagorean theorem gives \[ \| \mathbf{y} - \mathbf{p} \|^2 + \| \mathbf{p} - \mathbf{x} \|^2 = \| \mathbf{y} - \mathbf{x} \|^2 . \] Hence ∥y − p∥ ≤ ∥y − x∥, and equality holds only when x = p.
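Corollary 2 is just the Pythagorean theorem in disguise, and a quick numeric check makes that concrete (our own data, with W taken to be the xy-plane):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def dist2(x, y):
    # squared Euclidean distance ||x - y||^2
    return sum((a - b) ** 2 for a, b in zip(x, y))

y = [1, 2, 3]
p = [1, 2, 0]          # orthogonal projection of y onto the xy-plane W
x = [4, 0, 0]          # any other point of W

# y - p is orthogonal to p - x, so the squared distances add up.
assert dot([a - b for a, b in zip(y, p)], [a - b for a, b in zip(p, x)]) == 0
assert dist2(y, p) + dist2(p, x) == dist2(y, x)
assert dist2(y, p) <= dist2(y, x)
```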
    Earlier we said that an equivalent way to define the orthogonal projection of target vector y on a subspace W is to find the point in W which is closest to y (provided such a point exists). Corollary 2 establishes this equivalence for a finite-dimensional case.

Now we recast formula \eqref{EqDot.4} in matrix language. Suppose we have the n-dimensional vector space ℝn and a set of m linearly independent vectors u₁, u₂, … , um ∈ ℝn. These vectors generate a subspace W ⊂ ℝn. We want to find the combination of these vectors that is closest to some target vector v. In other words, we need to find the projection of v onto the subspace spanned by {u₁, u₂, … , um}.

We organize generating vectors {u₁, u₂, … , um} as columns into an n × m matrix called A:

\[ \mathbf{A} = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_m \end{bmatrix} . \]
Since {u₁, u₂, … , um} is a basis of W, this subspace coincides with the column space of matrix A. We are looking for a vector w ∈ W that is a linear combination of the generating vectors
\[ \mathbf{w} = c_1 \mathbf{u}_1 + c_2 \mathbf{u}_2 + \cdots + c_m \mathbf{u}_m = \mathbf{A}\,\mathbf{c} \]
and is closest to the target vector v. Then the projection vector will be vW = Ac. So we make the error vector v⊥ = v − vW orthogonal to the subspace W onto which the vector is projected. This means that v⊥ must be orthogonal to every member of the generating set {u₁, u₂, … , um}. Equating its dot product with every generating vector to zero, we get the system of equations
\[ \mathbf{u}_j \bullet \left( \mathbf{v} - \mathbf{v}_W \right) = \mathbf{u}_j \bullet \left( \mathbf{v} - \mathbf{A}\,\mathbf{c} \right) = 0 , \qquad j=1,2,\ldots , m . \]
We can rewrite this system in succinct (matrix) form because matrix/vector multiplication is expressed as dot product for every row:
\[ \mathbf{A}^{\mathrm T} \left( \mathbf{v} - \mathbf{A}\,\mathbf{c} \right) = 0 \qquad \iff \qquad \mathbf{A}^{\mathrm T} \mathbf{v} = \mathbf{A}^{\mathrm T} \mathbf{A}\,\mathbf{c} . \]
We isolate column vector c ∈ ℝm×1:
\[ \mathbf{c} = \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} \mathbf{v} \]
because the m × m matrix AᵀA is invertible whenever A has m linearly independent columns (this is proved in Theorem 16 below). Therefore, the projection vector on subspace W is expressed through matrix A of column generators as follows:
\begin{equation} \label{EqDot.8} \mathbf{v}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} \mathbf{v} = \mathbf{P} \,\mathbf{v} . \end{equation}

If W has an orthonormal basis {e₁, e₂, … , em} collected into the matrix

\[ \mathbf{Q} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_m \end{bmatrix} \in \mathbb{R}^{n\times m} , \]
then the projection matrix on subspace W becomes
\begin{equation} \label{EqDot.9} \mathbf{P} = \mathbf{Q}\,\mathbf{Q}^{\mathrm T} . \end{equation}
This follows because QᵀQ = I, the m × m identity matrix.
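The last two formulas can be illustrated with a 3-4-5 orthonormal pair of our own choosing, for which all arithmetic stays rational: QᵀQ is the 2 × 2 identity, and P = QQᵀ is an idempotent projection matrix (here it projects onto the xy-plane).

```python
from fractions import Fraction as F

e1 = [F(3, 5), F(4, 5), 0]      # orthonormal columns: 3-4-5 right triangle
e2 = [F(-4, 5), F(3, 5), 0]
Q = [[e1[i], e2[i]] for i in range(3)]          # 3x2 matrix with columns e1, e2

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

Qt = [list(r) for r in zip(*Q)]                 # transpose of Q
assert matmul(Qt, Q) == [[1, 0], [0, 1]]        # Q^T Q = I_2

P = matmul(Q, Qt)                               # projection matrix Q Q^T
assert P == [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # projects onto the xy-plane
assert matmul(P, P) == P                        # idempotent
```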
Theorem 16: Let A be an n × m matrix with linearly independent columns and let W = ColumnSpace (A) be the column space of matrix A. Then the m × m (self-adjoint) matrix AᵀA is invertible, and for any column vector x in ℝn×1, its projection on subspace W is expressed by formula \eqref{EqDot.8}, where P ∈ ℝn×n.
We first show that the null space of AᵀA consists of a single element, the zero vector; this yields invertibility of the self-adjoint matrix AᵀA. Suppose that AᵀAc = 0. Then Aᵀ(Ac) = 0, so the vector Ac is orthogonal to every column of A, that is, Ac ∈ W⊥. On the other hand, Ac is a linear combination of the columns of A, so Ac ∈ W. Since W ∩ W⊥ = {0}, we get Ac = 0, and therefore c is in NullSpace(A). Since the columns of matrix A are linearly independent, we have c = 0. Hence, NullSpace(AᵀA) = {0}, as desired.

Let x be a column vector in ℝn×1 and let c be the solution of AᵀAc = Aᵀx. Then, applying the inverse of the self-adjoint matrix AᵀA, we determine the projection vector according to formula \eqref{EqDot.8}.

   
Example 46: In four dimensional space, we consider two noncollinear vectors \[ \mathbf{u}_1 = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} , \qquad \mathbf{u}_2 = \begin{pmatrix} 4 \\ 3 \\ 2 \\ 1 \end{pmatrix} . \] Let W = span{u₁, u₂} be the two-dimensional subspace generated by these two vectors. We want to find the projection of vector v = (1, 1, 1, 1) on this subspace. First, we build the matrix whose columns are these vectors \[ \mathbf{A} = \begin{bmatrix} 1 & 4 \\ 2 & 3 \\ 3 & 2 \\ 4 & 1 \end{bmatrix} . \] The corresponding projection matrix is given by formula \eqref{EqDot.8}: \[ \mathbf{P} = \mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{10} \begin{bmatrix} 7& 4& 1& -2 \\ 4& 3& 2& 1 \\ 1& 2& 3& 4 \\ -2& 1& 4& 7 \end{bmatrix} . \]
A = {{1, 4}, {2, 3}, {3, 2}, {4, 1}}; P = A . Inverse[Transpose[A] . A] . Transpose[A]
{{7/10, 2/5, 1/10, -(1/5)}, {2/5, 3/10, 1/5, 1/10}, {1/10, 1/5, 3/10, 2/5}, {-(1/5), 1/10, 2/5, 7/10}}
In particular, to find the projection of an arbitrary vector on W, just multiply it (from the left) by matrix P. For target vector v = (1, 1, 1, 1), we have \[ \mathbf{v}_W = \mathbf{P}\,\mathbf{v} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \mathbf{v} \in W . \]
P . {1, 1, 1, 1}
{1, 1, 1, 1}
Since vector v = (1, 1, 1, 1) belongs to subspace W, its projection coincides with v. However, if we choose another vector (not in W), we get \[ \mathbf{y}_W = \mathbf{P}\,\begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} . \] This means that vector y = (1, −1, −1, 1) is orthogonal to W.

Example A: Verification of Eq.(9) in ℝ³

We consider a two-dimensional subspace W ⊂ ℝ³ spanned by two orthonormal vectors \[ \mathbf{e} _1 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} , \qquad \mathbf{e} _2 = \frac{1}{\sqrt{14}} \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix} . \] We build the 3 × 2 matrix of these vectors \[ \mathbf{Q} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 \end{bmatrix} . \] Then we form the projection matrix according to Eq.\eqref{EqDot.9}: \[ \mathbf{P} = \mathbf{Q}\,\mathbf{Q}^{\mathrm T} = \frac{1}{42} \begin{bmatrix} 26 & -4 & 20 \\ -4& 41& 5 \\ 20& 5& 17 \end{bmatrix} . \]

Q = {{1/Sqrt[3], 2/Sqrt[14]}, {1/Sqrt[3], -3/Sqrt[14]}, {1/Sqrt[3], 1/Sqrt[14]}};
P = Q . Transpose[Q]
{{13/21, -(2/21), 10/21}, {-(2/21), 41/42, 5/42}, {10/21, 5/42, 17/42}}
Finally, we compare it with Eq.(8) to verify that the projection matrix is the same:
PP = FullSimplify[Q . Inverse[Transpose[Q] . Q] . Transpose[Q]]
{{13/21, -(2/21), 10/21}, {-(2/21), 41/42, 5/42}, {10/21, 5/42, 17/42}}
   ■
End of Example 46
Theorem 17: Let W be a nontrivial subspace of ℝn, and define a linear transformation T : ℝn ⇾ ℝn by T(x) = xW, the orthogonal projection of x onto W. Then
  1. T is a linear transformation.
  2. T(x) = x if and only if x is in W.
  3. T(x) = 0 if and only if x is in W⊥.
  4. The composition T ∘ T = T.
  5. The image of T is W.
  1. We have to verify the two defining properties of a linear transformation. Let x, y be arbitrary vectors from ℝn, and let x = xW + xW⊥ and y = yW + yW⊥ be their orthogonal decompositions. Since W and its orthogonal complement W⊥ are both subspaces of ℝn, the sums xW + yW and xW⊥ + yW⊥ are in W and W⊥, respectively. Therefore, the orthogonal decomposition of x + y is (xW + yW) + (xW⊥ + yW⊥). Hence, \[ T \left( \mathbf{x} + \mathbf{y} \right) = \left( \mathbf{x} + \mathbf{y} \right)_W = \mathbf{x}_W + \mathbf{y}_W = T(\mathbf{x}) + T(\mathbf{y}) . \] Now let λ be a scalar. Then λxW is in W and λxW⊥ is in W⊥, so the orthogonal decomposition of λx is λxW + λxW⊥, and therefore, \[ T \left( \lambda\mathbf{x} \right) = \left( \lambda\mathbf{x} \right)_W = \lambda \mathbf{x}_W = \lambda\,T(\mathbf{x}) . \] Since these two defining properties of a linear transformation are satisfied by the projection mapping T, we conclude that T is a linear transformation.
  2. For every vector x ∈ ℝn, there is a unique decomposition \[ \mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = \mathbf{x}_W + \mathbf{x}_W^{\perp} , \] where
    • xW ∈ W,
    • xW⊥ ∈ W⊥ (the orthogonal complement of W),
    • and by definition of orthogonal projection, T(x) = xW.
    We now prove the required identity in both directions.

    (⇒ ) If T(x) = x, then xW.

    Assume T(x) = x. By definition of T, \[ T(\mathbf{x}) = \mathbf{x}_W . \] So x = T(x) = xW. But xWW. Therefore, x is in W.

    (⇐ ) If x is in W, then T(x) = x.

    Assume xW. In the decomposition \[ \mathbf{x} = \mathbf{x}_W+x_{W^{\perp }}, \] the component in W must be x itself, and the orthogonal component must be 0. So \[ \mathbf{x}_W = \mathbf{x},\quad x_{W^{\perp }}=0. \] Hence, T(x) = xW = x.

    Since both implications hold, we conclude T(x) = x     ⇔     xW.

  3. Similar to the previous part, the proof is based on unique decomposition \[ \mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = \mathbf{x}_W + \mathbf{x}_W^{\perp} . \]

    (⇒ ) If T(x) = 0, then x is in W.

    Assume T(x) = 0. Then \[ \mathbf{x}_W = T(\mathbf{x}) =0. \] Using the decomposition x = xW + xW⊥, we get \[ \mathbf{x} = 0 + \mathbf{x}_{W^{\perp }} = \mathbf{x}_{W^{\perp }}. \] But xW⊥ is in W⊥, so x ∈ W⊥.

    (⇐ ) If x is in W, then T(x) = 0.

    Assume xW. In the decomposition, x = xW + xW, the component in W must be x itself, and the component in W must be 0. Thus, \[ \mathbf{x}_W = 0,\quad \mathbf{x}_{W^{\perp }} = \mathbf{x}. \]

  4. For any x in ℝn, the vector T(x) is in W, so TT(x) = T(T(x)) = T(x) by property 2.
  5. Any vector x in W is in the range of T because T(x) = x for such vectors. On the other hand, for any vector x in ℝn, the output T(x) = xW is in W, so the image of T is contained in W. Hence, the image of T is exactly W.
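    Properties 2–4 of the theorem can also be cross-checked numerically, independently of Mathematica. The following minimal Python sketch uses exact rational arithmetic from the standard library; the subspace W = span{(1, 1, 0), (0, 1, 1)} ⊂ ℝ³ and the test vectors are chosen purely for illustration.

```python
from fractions import Fraction as F

def matmul(A, B):
    # exact matrix product over the rationals
    return [[sum(a*b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(M):
    # inverse of a 2x2 matrix via the adjugate formula
    (a, b), (c, d) = M
    det = a*d - b*c
    return [[ d/det, -b/det],
            [-c/det,  a/det]]

# P = A (A^T A)^{-1} A^T projects onto W = span{(1,1,0), (0,1,1)} in R^3
A = [[F(1), F(0)],
     [F(1), F(1)],
     [F(0), F(1)]]
At = transpose(A)
P = matmul(matmul(A, inv2(matmul(At, A))), At)

# property 2: T(x) = x for x in W (take x = u1 + 2 u2 = (1, 3, 2))
x = [[F(1)], [F(3)], [F(2)]]
assert matmul(P, x) == x

# property 3: T(y) = 0 for y in the orthogonal complement (y = (1, -1, 1))
y = [[F(1)], [F(-1)], [F(1)]]
assert matmul(P, y) == [[0], [0], [0]]

# property 4: idempotence, T∘T = T, that is, P P = P
assert matmul(P, P) == P
```

Because all arithmetic is done with fractions, the equalities above are exact rather than approximate.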
   
Example 47:
  1. Using Mathematica, we generate two three-dimensional vectors
          u1 = RandomInteger[{-9, 9}, 3]
          {-6, 1, 9}
          u2 = RandomInteger[{-9, 9}, 3]
          {-7, 1, 5}
    Then we build a 3×2 matrix having these vectors as columns
          A = Transpose[{u1, u2}]
    \[ \mathbf{A} = \begin{bmatrix} -6&-7 \\ \phantom{-}1&\phantom{-}1 \\ \phantom{-}9&\phantom{-}5 \end{bmatrix} . \] Using this matrix, we form the projection matrix onto the subspace W = span{u₁, u₂} based on formula (8): \[ \mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{1106} \begin{bmatrix} 1090 & -132 & 4 \\ -132 & 17& 33 \\ 4 & 33 & 1105 \end{bmatrix} . \]
          P = A . Inverse[Transpose[A] . A] . Transpose[A]
          {{545/553, -(66/553), 2/553}, {-(66/553), 17/1106, 33/1106}, {2/553, 33/1106, 1105/1106}}
    It is well known that every linear transformation of ℝn is given by matrix multiplication and vice versa. Therefore, the matrix P corresponds to a linear transformation.
  2. Let W be a two-dimensional subspace of ℝ³ that is spanned by two vectors u = [-9, 8, 5] and v = [0, -3, 0]:
          u = RandomInteger[{-9, 9}, 3]
          {-9, 8, 5}
          v = RandomInteger[{-9, 9}, 3]
          {0, -3, 0}
    We organize these vectors in matrix form: \[ \mathbf{A} = \begin{bmatrix} -9&0 \\ 8&-3 \\ 5&0 \end{bmatrix} . \] Projection transformation on subspace W is equivalent to matrix multiplication \[ \mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{106} \begin{bmatrix} 81 & 0 & -45 \\ 0 & 106& 0 \\ -45 & 0 & 25 \end{bmatrix} . \]
          P = A . Inverse[Transpose[A] . A] . Transpose[A]
         {{81/106, 0, -(45/106)}, {0, 1, 0}, {-(45/106), 0, 25/106}}
    We check that matrix P is singular
         Det[P]
          0
    Let us take x = u + 2 v = [-9, 2, 5] ∈ W. Then its projection on W is \[ {\bf x}_W = {\bf P}\,\mathbf{x} = \mathbf{x} . \]
         P . {-9, 2, 5}
          {-9, 2, 5}
  3. In this example, we use W from the previous example, which is the span of two vectors u = [-9, 8, 5] and v = [0, -3, 0]. Their cross product is orthogonal to W: \[ \mathbf{x} = \mathbf{u} \times \mathbf{v} = \begin{bmatrix} 15&0&27 \end{bmatrix} . \]
          u = {-9, 8, 5}; v = {0, -3, 0};
          x = Cross[u, v]
          {15, 0, 27}
    Its projection on W becomes \[ \mathbf{x}_W = {\bf P}\,\mathbf{x} = \frac{1}{106} \begin{bmatrix} 81 & 0 & -45 \\ 0 & 106& 0 \\ -45 & 0 & 25 \end{bmatrix} \begin{pmatrix} 15 \\ 0 \\ 27 \end{pmatrix} = \begin{pmatrix} 0\\ 0 \\ 0 \end{pmatrix} . \]
          P . x
          {0, 0, 0}
  4. Composition of two linear transformations corresponds to the product of their corresponding matrices, so we need to check that P P = P. For our example, we take the projection matrix from part 2. Then the product is \[ {\bf P}\,{\bf P} = \frac{1}{106} \begin{bmatrix} 81 & 0 & -45 \\ 0 & 106& 0 \\ -45 & 0 & 25 \end{bmatrix} = {\bf P}. \]
          P.P
          {{81/106, 0, -(45/106)}, {0, 1, 0}, {-(45/106), 0, 25/106}}
  5. This property is equivalent to property 2.
   ■
End of Example 47
    Properties of projection mappings can be reformulated in matrix language:
Theorem 18: Let W be a subspace of column vectors ℝn×1, define T : ℝn×1 ⇾ ℝn×1 by T(x) = xW, and let P = PW be the standard matrix for T, so xW = Px. Then
  1. ColumnSpace(P) = W.
  2. NullSpace(P) = ker(P) = W⊥.
  3. P² = P.
  4. If W ≠ {0}, then λ = 1 is an eigenvalue of P and the 1-eigenspace for P is W.
  5. If W ≠ ℝn×1, then zero is an eigenvalue of P and the 0-eigenspace for P is W⊥.
  6. Projection matrix P is similar to the diagonal matrix with m ones and nm zeroes on the diagonal, where m = dim(W).
The properties of projection matrices would be hard to prove in matrix language alone. By translating all the statements into assertions about linear transformations, they become much more transparent.
  1. We show both inclusions: ColumnSpace(P) ⊆ W and W ⊆ ColumnSpace(P).

    A. ColumnSpace(P) ⊆ W.

    By definition, the column space of P is ColumnSpace(P) = { P x : x ∈ ℝn×1 }.

    However, P x = T(x) = xW, and by definition of orthogonal projection, xWW for every x. Therefore, every vector of the form P x lies in W, so ColumnSpace(P) ⊆ W.

    B. W ⊆ ColumnSpace(P).

    Let wW. Since T is the orthogonal projection onto W, we know from the previous theorem that T(w) = w. In matrix form, this is \[ \mathbf{P}\,\mathbf{w} = \mathbf{w}. \] However, this means that w is equal to P times some vector (namely w itself), so w ∈ ColumnSpace(P). Hence, W ⊆ ColumnSpace(P).

    Combining the two inclusions, we obtain W = ColumnSpace(P).
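    This two-inclusion argument can be illustrated with a concrete computation. Below is a small Python sketch (standard library only); the projector P onto W = span{(1, 1, 0), (0, 1, 1)} ⊂ ℝ³ is chosen for illustration.

```python
from fractions import Fraction as F

# Projector onto W = span{u1, u2} with u1 = (1,1,0), u2 = (0,1,1);
# P = (1/3) [[2,1,-1],[1,2,1],[-1,1,2]] (illustrative example)
P = [[F(2, 3), F(1, 3), F(-1, 3)],
     [F(1, 3), F(2, 3), F(1, 3)],
     [F(-1, 3), F(1, 3), F(2, 3)]]
u1, u2 = (F(1), F(1), F(0)), (F(0), F(1), F(1))

# A: every column p of P lies in W.  Because u1 = (1,1,0) and u2 = (0,1,1),
# the only candidate combination c1*u1 + c2*u2 = p has c1 = p[0], c2 = p[2];
# it works iff the middle component matches: p[1] = c1 + c2.
for j in range(3):
    p = [P[i][j] for i in range(3)]
    c1, c2 = p[0], p[2]
    assert p[1] == c1 + c2          # so ColumnSpace(P) ⊆ W

# B: W ⊆ ColumnSpace(P), since P fixes every vector of W: P u = u,
# exhibiting u as P times some vector.
for u in (u1, u2):
    Pu = [sum(P[i][k] * u[k] for k in range(3)) for i in range(3)]
    assert Pu == list(u)
```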

  2. We start by relating the matrix kernel to the transformation kernel.

    Since P is the standard matrix of T, we have \[ T(\mathbf{x}) = {\bf P}\,\mathbf{x} \quad \mathrm{for\ all\ }\mathbf{x} \in \mathbb{R}^{n\times 1} . \] Thus, \[ {\bf P}\,\mathbf{x} = 0\quad \Longleftrightarrow \quad T(\mathbf{x})=0. \] Therefore, \[ \ker (\mathbf{P}) = \ker (T). \] So it suffices to show \[ \ker (T)=W^{\perp }. \] 2) Use the orthogonal decomposition: Every vector x ∈ ℝn has a unique decomposition \[ \mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = \mathbf{x}_W + \mathbf{x}_W^{\perp} , \] where

    • xW is in W,
    • xWW (the orthogonal complement of W).
    • T(x) = xW.

    3) Show ker(T) ⊆ W⊥.

    If x ∈ ker(T), then T(x) = xW = 0.

    Using the decomposition x = xW + xW⊥, we get \[ \mathbf{x} = \mathbf{0} + \mathbf{x}_W^{\perp} = \mathbf{x}_W^{\perp} . \] Thus, x ∈ W⊥, and ker(T) ⊆ W⊥.

    4) Show W⊥ ⊆ ker(T).

    Let xW. Then in the decomposition x = xW + xW, we must have \[ \mathbf{x}_{W^{\perp }} = \mathbf{x},\qquad \mathbf{x}_W = 0. \] Thus, T(x) = xW = 0, so x ∈ ker(T). Hence, W ⊆ ker(T).

    5) Now we combine the results. We have shown that ker(T) = W⊥. And since ker(P) = ker(T), we conclude that NullSpace(P) = ker(P) = W⊥.
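    A quick way to see this identity in a concrete case: for any x, the residual x − Px is orthogonal to W and is annihilated by P. A minimal Python sketch with exact arithmetic (the 3×3 projector and test vector are chosen for illustration):

```python
from fractions import Fraction as F

# Illustrative projector onto W = span{(1,1,0), (0,1,1)} in R^3
P = [[F(2, 3), F(1, 3), F(-1, 3)],
     [F(1, 3), F(2, 3), F(1, 3)],
     [F(-1, 3), F(1, 3), F(2, 3)]]
u1, u2 = (1, 1, 0), (0, 1, 1)

def apply(M, x):
    return [sum(M[i][k] * x[k] for k in range(3)) for i in range(3)]

# The residual x - Px lands in the orthogonal complement of W:
x = [F(1), F(2), F(3)]
r = [a - b for a, b in zip(x, apply(P, x))]
assert sum(a * b for a, b in zip(r, u1)) == 0   # r ⟂ u1
assert sum(a * b for a, b in zip(r, u2)) == 0   # r ⟂ u2

# and P annihilates it, consistent with ker(P) = NullSpace(P) = W-perp:
assert apply(P, r) == [0, 0, 0]
```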

  3. Step 1: Use the geometric meaning of projection. Each x from ℝn×1 (or ℝn) can be written as x = xW + xW⊥, where xW ∈ W is the orthogonal projection of x onto W, and xW⊥ ∈ W⊥ is the component orthogonal to W.

    By definition of T, T(x) = xW. Now apply T again to T(x): \[ T(T(\mathbf{x}))=T(\mathbf{x}_W). \] However, xW already lies in W. The orthogonal projection of a vector in W onto W is just itself. So \[ T(\mathbf{x}_W) = \mathbf{x}_W. \] Therefore, for every x ∈ ℝn ≌ ℝn×1, \[ T(T(\mathbf{x}))=T(\mathbf{x}). \]

    Step 2: Translate to matrices: In terms of the standard matrix P, we have \[ T(\mathbf{x})= {\bf P}\,\mathbf{x},\quad T(T(\mathbf{x})) = \mathbf{P}({\bf P}\,\mathbf{x}) = \mathbf{P}^2 \mathbf{x}. \] From the identity T(T(x)) = T(x) for all x ∈ ℝn×1, we get \[ \mathbf{P}^2 \mathbf{x} = \mathbf{P}\,\mathbf{x}\quad \mathrm{for\ all\ }{\bf x}\in \mathbb{R}^{n}. \] The only matrix that satisfies P²x = P x for all x is one with \[ \mathbf{P}^2 = \mathbf{P}. \] So the projection matrix P is idempotent.

  4. 1) Show that 1 is an eigenvalue of P:

    Take any nonzero vector w from W. Since w already lies in W, its orthogonal projection onto W is itself: \[ T(\mathbf{w})=\mathbf{w}. \] In matrix form, P w = w.

    Rewriting it as \[ \mathbf{P}\,\mathbf{w} = 1\cdot \mathbf{w}, \] we see that w is an eigenvector of P with eigenvalue λ = 1. Because W ≠ {0}, such a nonzero w exists, so λ = 1 is indeed an eigenvalue of P.

    2) Show that the 1-eigenspace of P is exactly W.

    (a) Every vector in W is a 1-eigenvector. Let wW. As above, \[ \mathbf{P}\,\mathbf{w}=\mathbf{w}. \] So every vector in W satisfies P w = 1·w. Thus, \[ W\subseteq E_1(\mathbf{P}), \] where E₁(P) denotes the 1-eigenspace of P.

    (b) Every 1-eigenvector lies in W.

    Indeed, let x ∈ ℝn×1 satisfy P x = x.

    By definition of P, P x = xW, the orthogonal projection of x onto W. So \[ \mathbf{x} = \mathbf{P}\,\mathbf{x} = \mathbf{x}_W. \] However, xWW. Hence, xW. Therefore, every vector x ∈ ℝn×1 with P x = x lies in W, so \[ E_1(\mathbf{P})\subseteq W. \] 3) Conclusion: Combining (a) and (b), we have \[ E_1(\mathbf{P})=W, \] and since W ≠ { 0 }, this shows that λ = 1 is an eigenvalue of P and its eigenspace is exactly W.

  5. 1) Zero is an eigenvalue of P

    Because W is a proper subspace, W⊥ ≠ {0}. Take any nonzero vector v ∈ W⊥.

    The orthogonal projection of v onto W is the zero vector: \[ T(\mathbf{v}) = \mathbf{v}_W = 0. \] In matrix form, P v = 0. So \[ \mathbf{P}\,\mathbf{v} = 0\cdot \mathbf{v}, \] which means v is an eigenvector of P with eigenvalue λ = 0. Since v ≠ 0, λ = 0 is indeed an eigenvalue.

    2) The 0-eigenspace is W.

    Let E₀(P) denote the 0-eigenspace of P, i.e., \[ E_0 (\mathbf{P}) = \left\{ \mathbf{x}\in \mathbb{R}^{n\times 1} \ : \ \mathbf{P}\,\mathbf{x} = 0 \right\} . \]

    (a) WE₀(P).

    If xW, then as above its projection onto W is zero: \[ \mathbf{P}\,\mathbf{x} = \mathbf{x}_W = 0. \] So xE₀(P). Hence, WE₀(P).

    (b) E₀(P) ⊆ W⊥.

    Now let x be in E₀(P), so P x = 0. But P x = xW, the projection of x onto W. Thus, xW = 0.

    Write the orthogonal decomposition \[ \mathbf{x} = \mathbf{x}_W + \mathbf{x}_{W^{\perp }} = 0 + \mathbf{x}_{W^{\perp }}=\mathbf{x}_{W^{\perp }}. \] So x ∈ W⊥. Therefore, \[ E_0(\mathbf{P})\subseteq W^{\perp }. \] 3) Conclusion: Combining (a) and (b), we get \[ E_0(\mathbf{P})=W^{\perp }. \] Since W ≠ ℝn ≌ ℝn×1, we have W⊥ ≠ {0}, so the 0-eigenspace is nontrivial and λ = 0 is an eigenvalue of P, with eigenspace exactly W⊥.

  6. We’ve already uncovered the key structure:
    • W is the 1-eigenspace of P, with dim(W) = m.
    • W⊥ is the 0-eigenspace of P, with dim W⊥ = n − m.
    • ℝn ≌ ℝn×1 = W ⊕ W⊥.
    Now we turn that into a similarity with a diagonal matrix.

    1) Build a basis adapted to W and W⊥.

    Choose a basis of W: \[ \{ \mathbf{w}_1, \dots , \mathbf{w}_m \} . \] Choose a basis of W⊥: \[ \{ \mathbf{u}_1,\dots , \mathbf{u}_{n-m}\} . \] Then \[ \mathcal{B}=\left\{ \mathbf{w}_1, \dots , \mathbf{w}_m , \mathbf{u}_1,\dots ,\mathbf{u}_{n-m} \right\} \] is a basis of ℝn×1 ≌ ℝn.

    2) Describe P in this basis:

    For each wi ∈ W, projection onto W fixes it: \[ \mathbf{P}\,\mathbf{w}_i = \mathbf{w}_i . \] For each uj ∈ W⊥, projection onto W annihilates it: \[ \mathbf{P}\,\mathbf{u}_j = 0. \] So, relative to the basis ℬ, the matrix of P acts as follows:

    • On the first m basis vectors: eigenvalue 1.
    • On the last nm basis vectors: eigenvalue 0.
    Thus, in the basis ℬ, the matrix of P is \[ \Lambda = \left( \begin{matrix}I_m&0\\ 0&0_{(n-m)\times (n-m)}\end{matrix}\right) , \] a diagonal matrix with m ones and nm zeroes on the diagonal.

    3) Express this as similarity:

    Let S be the change-of-basis matrix whose columns are the vectors of ℬ written in the standard basis: \[ \mathbf{S} = \left[ \begin{matrix}w_1&\cdots &w_m&u_1&\cdots &u_{n-m}\end{matrix}\right] . \] Then the matrix of P in the standard basis is related to its matrix Λ in the basis ℬ by \[ \mathbf{S}^{-1} \mathbf{P}\,\mathbf{S} = \Lambda . \] So the projection matrix P is similar to the diagonal matrix with m ones and nm zeroes on the diagonal, where m = dimW.
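    The similarity S⁻¹PS = Λ can be verified exactly for a small example. The Python sketch below (standard library only) uses the illustrative projector onto W = span{(1, 1, 0), (0, 1, 1)} ⊂ ℝ³, with W⊥ spanned by (1, −1, 1).

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a*b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def inv3(M):
    # 3x3 inverse via the cofactor (adjugate) formula
    def cof(i, j):
        m = [[M[r][c] for c in range(3) if c != j]
             for r in range(3) if r != i]
        return (-1)**(i + j) * (m[0][0]*m[1][1] - m[0][1]*m[1][0])
    det = sum(M[0][j] * cof(0, j) for j in range(3))
    return [[cof(j, i) / det for j in range(3)] for i in range(3)]

# Illustrative projector P onto W = span{(1,1,0), (0,1,1)}
P = [[F(2, 3), F(1, 3), F(-1, 3)],
     [F(1, 3), F(2, 3), F(1, 3)],
     [F(-1, 3), F(1, 3), F(2, 3)]]

# S = [w1 w2 u]: a basis of W followed by a basis of W-perp
S = [[F(1), F(0), F(1)],
     [F(1), F(1), F(-1)],
     [F(0), F(1), F(1)]]

Lam = matmul(inv3(S), matmul(P, S))
assert Lam == [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # m ones, n-m zeroes
```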

   
Example 48: In this example, we use W generated by two vectors
       u = RandomInteger[9, 4]
       {4, 4, 2, 2}
       v = RandomInteger[9, 4]
       {5, 2, 0, 8}
We organize these vectors in matrix form \[ \mathbf{A} = \begin{bmatrix} 4&5 \\ 4&2 \\ 2&0 \\ 2&8 \end{bmatrix} . \]
       A = Transpose[{u, v}]
      {{4, 5}, {4, 2}, {2, 0}, {2, 8}}
The projection matrix on W = span{u, v} is defined by Eq.(8) and we have \[ \mathbf{P}_W = \mathbf{A} \left( \mathbf{A}^{\mathrm T} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathrm T} = \frac{1}{446} \begin{bmatrix} 182 & 164 &76 & 124 \\ 164 & 236& 142&-50 \\ 76 & 142 & 93 & -83 \\ 124 & -50 & -83 & 381\end{bmatrix} . \tag{9.1} \]
       P = A . Inverse[Transpose[A] . A] . Transpose[A]
      {{91/223, 82/223, 38/223, 62/223}, {82/223, 118/223, 71/223, -(25/223)}, {38/223, 71/223, 93/446, -(83/446)}, {62/223, -(25/223), -(83/446), 381/446}}
We check that matrix P is singular:
       Det[P]
0
Now we illustrate the properties of Theorem 18 one by one.
  1. The column space of the projection matrix P is spanned by its four columns \[ W = \mbox{span} \left\{ \mathbf{u}_1 = \begin{pmatrix} 182 \\ 164 \\ 76 \\ 124 \end{pmatrix} , \ \mathbf{u}_2 = \begin{pmatrix} 164 \\ 236 \\ 142 \\ -50 \end{pmatrix} , \ \mathbf{u}_3 = \begin{pmatrix} 76 \\ 142 \\ 93 \\ -83 \end{pmatrix} , \ \mathbf{u}_4 = \begin{pmatrix} 124 \\ -50 \\-83 \\ 381 \end{pmatrix} \right\} . \] First, we check the rank of the projection matrix, which is 2.
           MatrixRank[P]
           2
    Next, we verify that the first two vectors u₁ and u₂ are linearly independent:
          MatrixRank[{{182, 164}, {164, 236}, {76, 142}, {124, -50}}]
          2
    Hence, \[ W = \mbox{span} \left\{ \mathbf{u}_1 = \begin{pmatrix} 182 \\ 164 \\ 76 \\ 124 \end{pmatrix} , \ \mathbf{u}_2 = \begin{pmatrix} 164 \\ 236 \\ 142 \\ -50 \end{pmatrix} \right\} . \] and we need to show that \[ W = \mbox{span} \left\{ \mathbf{u} = \begin{pmatrix} 4 \\ 4 \\ 2 \\ 2 \end{pmatrix} , \ \mathbf{v} = \begin{pmatrix} 5 \\ 2 \\ 0 \\ 8 \end{pmatrix} \right\} . \] This means that there exist constants c₁, c₂, c₃, and c₄ such that \[ \begin{split} c_1 \mathbf{u} + c_2 \mathbf{v} &= \mathbf{u}_1 , \\ c_3 \mathbf{u} + c_4 \mathbf{v} &= \mathbf{u}_2 . \end{split} \] We ask Mathematica to solve the corresponding system of linear equations.
          LinearSolve[Transpose[A], {{182, 164, 76, 124}, {164, 236, 142, -50}}]
          {{73/3, 154/3, 104/3, -(112/3)}, {127/6, -(31/3), -(47/3), 205/3}, {0, 0, 0, 0}, {0, 0, 0, 0}}
    Since this system has a solution, Mathematica confirms that the column space of P coincides with W.
  2. Let \[ W = \mbox{span}\left\{ \mathbf{w}_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} , \quad \mathbf{w}_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right\} \subset \mathbb{R}^{3\times 1} . \] This is a 2-dimensional subspace of ℝ3×1. We need an orthonormal basis for W. First, we normalize w₁: \[ \mathbf{u}_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} . \] Then we make w₂ orthogonal to u₁: \[ w'_2 = w_2 - \left( \mathbf{u}_1 \bullet \mathbf{w}_2 \right) \mathbf{u}_1 = \begin{pmatrix} -1/2 \\ 1/2 \\ 1 \end{pmatrix} . \]
          u1 = {1, 1, 0}/Sqrt[2];        w2 = {0, 1, 1};        w2p = w2 - Dot[u1, w2]*u1
          {-(1/2), 1/2, 1}
    We normalize this vector
           u2 = w2p/Norm[w2p]
           {-(1/Sqrt[6]), 1/Sqrt[6], Sqrt[2/3]}
    \[ \mathbf{u}_2 = \frac{\mathbf{w}'_2}{\| \mathbf{w}'_2 \|} = \frac{1}{\sqrt{6}} \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix} . \] So an orthonormal basis for W is {u₁, u₂}.

    Now we build the projection matrix \[ \mathbf{P} = \mathbf{Q}\,\mathbf{Q}^{\mathrm T} = \frac{1}{3} \begin{bmatrix} 2 & 1 & -1 \\ 1&2&1 \\ -1 &1 & 2 \end{bmatrix} , \] where \[ \mathbf{Q} = \left\{ \mathbf{u}_1 , \mathbf{u}_2 \right\} = \frac{1}{\sqrt{6}} \begin{bmatrix} \sqrt{3}& -1 \\ \sqrt{3} & 1 \\ 0 & 2 \end{bmatrix} . \]

           Q = Transpose[{u1, u2}]
           {{1/Sqrt[2], -(1/Sqrt[6])}, {1/Sqrt[2], 1/Sqrt[6]}, {0, Sqrt[2/3]}}
           P = Q . Transpose[Q]
           {{2/3, 1/3, -(1/3)}, {1/3, 2/3, 1/3}, {-(1/3), 1/3, 2/3}}
    This is the projection matrix onto W. Of course, you can use formula (8), but it provides the same answer (Mathematica confirms)
           PP = Q . Inverse[Transpose[Q] . Q] . Transpose[Q]
           {{2/3, 1/3, -(1/3)}, {1/3, 2/3, 1/3}, {-(1/3), 1/3, 2/3}}
    To compute the null space of P, we need to solve the following system of linear equations: \[ \mathbf{P}\,\mathbf{x} = \mathbf{0} \qquad \iff \qquad \frac{1}{3} \begin{bmatrix} 2 & 1 & -1 \\ 1&2&1 \\ -1 &1 & 2 \end{bmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} . \] Fortunately, Mathematica has a dedicated command to determine the null space:
           NullSpace[P]
           {{1, -1, 1}}
    Therefore, \[ \mbox{NullSpace}[\mathbf{P}] = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} \right\} . \] Now we compute W⊥. This space consists of all vectors x = [x, y, z] that are orthogonal to w₁ and w₂: \[ \mathbf{w}_1 \bullet \mathbf{x} = 0 \quad \& \quad \mathbf{w}_2 \bullet \mathbf{x} = 0 \] So we need to solve the homogeneous system of equations \[ \begin{split} x + y &= 0 , \\ y + z & = 0 . \end{split} \] So y = −x = −z. Hence, W⊥ is spanned by one vector, [1, −1, 1], which is exactly the null space of matrix P.
  3. Let \[ W = \mbox{span}\left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} , \quad \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \right\} . \] Using formula (9), we find its standard matrix of the orthogonal projection onto W to be \[ \mathbf{P} = \frac{1}{6} \begin{bmatrix} 5 & 2 & -1 \\ 2&2&2 \\ -1&2&5 \end{bmatrix} . \]
           e1 = {1, 1, 1}/Sqrt[3]; e2 = {1, 0, -1}/Sqrt[2];
           Q = Transpose[{e1, e2}];
           P = Q . Transpose[Q]
           {{5/6, 1/3, -(1/6)}, {1/3, 1/3, 1/3}, {-(1/6), 1/3, 5/6}}
    Of course, you can use formula (8), but the answer will be the same:
           Q . Inverse[Transpose[Q] . Q] . Transpose[Q]
           {{5/6, 1/3, -(1/6)}, {1/3, 1/3, 1/3}, {-(1/6), 1/3, 5/6}}
    Verify that P² = P.
           P . P == P
          True
    Since the cross product e₁ × e₂ is orthogonal to these vectors, the orthogonal complement W⊥ is spanned by the vector \[ W^{\perp} = \mbox{span} \left\{ \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\} . \]
           Cross[e1, e2]
          {-(1/Sqrt[6]), Sqrt[2/3], -(1/Sqrt[6])}
  4. We return to matrix P from Eq.(9.1) and use Mathematica to determine its eigenvalues and eigenvectors:
          P = {{91/223, 82/223, 38/223, 62/223}, {82/223, 118/223, 71/223, -(25/223)}, {38/223, 71/223, 93/446, -(83/446)}, {62/223, -(25/223), -(83/446), 381/446}}
          Eigenvalues[P]
          {1, 1, 0, 0}
    So we see that the projection matrix P has two eigenvalues, λ = 1 and λ = 0, both of multiplicity 2 (P is not defective). The eigenvectors corresponding to λ = 1 are \[ \mathbf{v}_1 = \begin{pmatrix} 5 \\ 2 \\ 0 \\ 8 \end{pmatrix} , \qquad \mathbf{v}_2 = \begin{pmatrix} 11 \\ 14 \\ 8 \\ 0 \end{pmatrix} \]
          Eigenvectors[P]
          {{5/8, 1/4, 0, 1}, {11/8, 7/4, 1, 0}, {-(7/3), 11/6, 0, 1}, {1/ 3, -(5/6), 1, 0}}
    We check that these vectors v₁ and v₂ are eigenvectors of matrix P:
          v1 = {5, 2, 0, 8};
          P . v1
          {5, 2, 0, 8}
          v2 = {11, 14, 8, 0};
          P . v2
          {11, 14, 8, 0}
    We need to show that vectors {v₁, v₂} span subspace W. So we need to show that vectors u = [4, 4, 2, 2] and v = [5, 2, 0, 8] can be expressed as linear combinations of {v₁, v₂}. Since v = v₁, it is sufficient to show that vector u is a linear combination of {v₁, v₂}. So we need to show that the system of algebraic equations \[ c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 = \mathbf{u} \] has a solution. We delegate this job to Mathematica:
          u = {4, 4, 2, 2};
          Solve[{c1*v1 + c2*v2 == u}, {c1, c2}]
          {{c1 -> 1/4, c2 -> 1/4}}
    Since c₁ = ¼ and c₂ = ¼ solve this system, u lies in the span of the eigenvectors, so {v₁, v₂} span subspace W.
  5. From the previous example, we know that the projection matrix (9.1) has the double eigenvalue λ = 0, with two corresponding eigenvectors \[ \mathbf{u}_1 = \begin{pmatrix} -14 \\ 11 \\ 0 \\ 6 \end{pmatrix} , \qquad \mathbf{u}_2 = \begin{pmatrix} 2 \\ -5 \\ 6 \\ 0 \end{pmatrix} . \] We check with Mathematica that these vectors belong to W⊥:
          u1 = {-14, 11, 0, 6};
          P . u1
          {0, 0, 0, 0}
          u2 = {2, -5, 6, 0};
          P . u2
          {0, 0, 0, 0}
    Since eigenvectors {u₁, u₂} are linearly independent, they span W⊥.
  6. Using projection matrix (9.1), we build the matrix from its eigenvectors: \[ \mathbf{S} = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix} = \begin{bmatrix} 5& 11& -14&2 \\ 2&14& 11 & -5 \\ 0&8&0&6 \\ 8&0&6&0 \end{bmatrix} . \] Now we calculate the product:
          S = {{5, 11, -14, 2}, {2, 14, 11, -5}, {0, 8, 0, 6}, {8, 0, 6, 0}};
          Inverse[S] . P . S
          {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}
    \[ \mathbf{S}^{-1} \mathbf{P}\,\mathbf{S} = \begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{bmatrix} . \]
   ■
End of Example 48

 

Reflections


Let W be a subspace of ℝn, and let x be a vector in ℝn. The reflection of x over W is defined to be the vector
\[ \mbox{ref}_W (\mathbf{x}) = \mathbf{x} - 2\,\mathbf{x}^{\perp} = 2\,\mathbf{x}_W - \mathbf{x} , \]
where xW is the orthogonal projection of x onto W. In other words, to find refW(x), one starts at x, then moves to x − x⊥ = xW, then continues in the same direction one more time, to end on the opposite side of W.

If W has an orthonormal basis {e₁, e₂, … , em} collected into matrix

\[ \mathbf{Q} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_m \end{bmatrix} \in \mathbb{R}^{n\times m} , \]
then the projection matrix on subspace W and reflection across W become
\[ \mathbf{P} = \mathbf{Q}\,\mathbf{Q}^{\mathrm T} , \qquad \mathbf{R}_W = 2\,\mathbf{Q}\,\mathbf{Q}^{\mathrm T} - \mathbf{I} = 2\,\mathbf{P} - \mathbf{I} , \]
respectively. If n × m matrix Q has linearly independent columns, then the projection matrix P = PW on subspace W = ColumnSpace(Q) is defined by Eq.\eqref{EqDot.8} and we get the reflection matrix
\begin{equation} \label{EqDot.10} \mathbf{P} = \mathbf{Q}\left(\mathbf{Q}^{\mathrm T} \mathbf{Q} \right)^{-1} \mathbf{Q}^{\mathrm T} , \qquad \mathbf{R}_W = 2\,\mathbf{P} - \mathbf{I} . \end{equation}
We formulate the main properties of the reflection matrix without proofs (they are left as exercises).
Properties of reflection matrix: Let R be a reflection matrix across subspace W ⊂ ℝn. Then the following properties hold.
  1. Reflection matrices are involutions:     R² = I.
  2. Reflections are orthogonal transformations:     RTR = I.
  3. They preserve Euclidean lengths:     ∥Rx∥ = ∥x∥.
  4. They flip exactly the orthogonal complement of the subspace.
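    All four properties can be cross-checked numerically for a concrete reflection R = 2P − I. The Python sketch below uses exact rational arithmetic; the projector onto W = span{(1, 1, 0), (0, 1, 1)} ⊂ ℝ³ and the test vectors are chosen for illustration.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a*b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Projector onto W = span{(1,1,0), (0,1,1)} and the reflection R = 2P - I
P = [[F(2, 3), F(1, 3), F(-1, 3)],
     [F(1, 3), F(2, 3), F(1, 3)],
     [F(-1, 3), F(1, 3), F(2, 3)]]
I = [[F(i == j) for j in range(3)] for i in range(3)]
R = [[2*P[i][j] - I[i][j] for j in range(3)] for i in range(3)]

assert matmul(R, R) == I                     # 1. involution: R^2 = I
Rt = [list(row) for row in zip(*R)]
assert matmul(Rt, R) == I                    # 2. orthogonality: R^T R = I

x = [F(1), F(2), F(3)]
Rx = [sum(R[i][k]*x[k] for k in range(3)) for i in range(3)]
assert sum(c*c for c in Rx) == sum(c*c for c in x)   # 3. ||Rx|| = ||x||

y = [F(1), F(-1), F(1)]                      # y spans W-perp
assert [sum(R[i][k]*y[k] for k in range(3)) for i in range(3)] == \
       [-c for c in y]                       # 4. R flips W-perp
```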

   

Example 49: Since reflections were treated succinctly, we present several examples to clarify this topic.

Example A: Reflection across a line in ℝ²

Let U ⊂ ℝ² be the line spanned by a unit vector \( \displaystyle \quad {\bf u} = \frac{1}{\sqrt{2}} \left[ 1, 1 \right] . \quad \) Then reflection across this line becomes \[ R_U ({\bf x}) = 2 \left( \frac{x_1 + x_2}{2} \right) \left[ 1, 1 \right] - \left[ x_1 , x_2 \right] = \left[ x_2 , x_1 \right] . \] So this reflection simply swaps the coordinates. On the other hand, if we choose V spanned by the unit vector \( \displaystyle \quad {\bf v} = \frac{1}{\sqrt{5}} \left[ 2, 1 \right] , \quad \) then reflection across this line becomes \[ R_V ({\bf x}) = 2 \left( \frac{2\,x_1 + x_2}{5} \right) \left[ 2, 1 \right] - \left[ x_1 , x_2 \right] = \frac{1}{5} \left[ 3\,x_1 + 4\, x_2 , 4\, x_1 -3\,x_2 \right] . \] The corresponding reflection matrix is \[ \mathbf{R} = \frac{1}{5} \begin{bmatrix} 3&\phantom{-}4 \\ 4&-3 \end{bmatrix} . \] With Mathematica, we verify its involution and orthogonality properties.
r1 = {{3, 4}, {4, -3}}/5; Simplify[r1 . r1] == IdentityMatrix[2]
True
Simplify[Transpose[r1] . r1] == IdentityMatrix[2]
True
If the line U is spanned by a unit vector \[ {\bf u} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} , \] then the reflection matrix across this line is expressed by the classical formula \[ R_U = \begin{bmatrix} \cos 2\theta & \phantom{-}\sin 2 \theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} . \] We check its involution property with Mathematica: R² = I, where I is the identity matrix:
r2 = {{Cos[2*t], Sin[2*t]}, {Sin[2*t], -Cos[2*t]}};
Simplify[r2 . r2] == IdentityMatrix[2]
True
Next, we check its orthogonality:
Simplify[Transpose[r2] . r2] == IdentityMatrix[2]
True
So for the reflection across the line y = x, we have 2θ = π/2. Therefore the reflection matrix becomes \[ R = \begin{bmatrix} 0&1 \\ 1&0 \end{bmatrix} . \]
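    The classical 2θ formula can also be checked in floating point outside of Mathematica; the minimal Python sketch below reflects an arbitrarily chosen point across the line y = x (θ = π/4) and confirms the coordinate swap and the involution property.

```python
import math

# Reflection across the line spanned by (cos t, sin t), using the classical
# matrix [[cos 2t, sin 2t], [sin 2t, -cos 2t]]
def refl(t):
    c, s = math.cos(2*t), math.sin(2*t)
    return [[c, s], [s, -c]]

# t = pi/4 reflects across the line y = x, which swaps coordinates
R = refl(math.pi / 4)
x, y = 3.0, -7.0
rx = R[0][0]*x + R[0][1]*y
ry = R[1][0]*x + R[1][1]*y
assert math.isclose(rx, y, abs_tol=1e-12) and math.isclose(ry, x, abs_tol=1e-12)

# involution: applying the reflection twice returns the original point
assert math.isclose(R[0][0]*rx + R[0][1]*ry, x, abs_tol=1e-12)
assert math.isclose(R[1][0]*rx + R[1][1]*ry, y, abs_tol=1e-12)
```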

Example B: Reflection across a plane in ℝ³

Let plane U have unit normal vector n, written as a column vector. Then the reflection matrix is \[ {\bf R}_U = {\bf I} - 2 \mathbf{n}\,\mathbf{n}^{\mathrm T} . \] For instance, reflection across the plane x₂ = 0, with normal vector n = [0, 1, 0], is \[ {\bf R}_U = \begin{bmatrix} 1& \phantom{-}0 & 0 \\ 0&-1&0 \\ 0& \phantom{-}0 & 1 \end{bmatrix} , \] so that \[ {\bf R}_U \left( x_1 , x_2 , x_3 \right) = \left[ x_1 , - x_2 , x_3 \right] . \]
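The normal-vector form R = I − 2nnᵀ can be checked exactly; a short Python sketch for the plane x₂ = 0 (the in-plane test vector is chosen for illustration):

```python
from fractions import Fraction as F

# Reflection across a plane with unit normal n: R = I - 2 n n^T.
# For n = (0, 1, 0) this should give diag(1, -1, 1).
n = [F(0), F(1), F(0)]
I3 = [[F(i == j) for j in range(3)] for i in range(3)]
R = [[I3[i][j] - 2*n[i]*n[j] for j in range(3)] for i in range(3)]
assert R == [[1, 0, 0], [0, -1, 0], [0, 0, 1]]

# R fixes every vector in the plane (n . x = 0) and negates the normal
x = [F(7), F(0), F(-2)]                     # lies in the plane x2 = 0
assert [sum(R[i][k]*x[k] for k in range(3)) for i in range(3)] == x
assert [sum(R[i][k]*n[k] for k in range(3)) for i in range(3)] == [0, -1, 0]
```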

Now we consider a general plane in ℝ³.

(* --- Define plane and reflection --- *) n = Normalize[{1, 2, 1}]; P = IdentityMatrix[3] - Outer[Times, n, n]; (* projector onto plane *) R = IdentityMatrix[3] - 2 Outer[Times, n, n]; (* reflection *) (* --- Choose a point to reflect --- *) x = {2, -1, 3}; xr = R.x; (* --- Plane graphics --- *) plane = ParametricPlot3D[ {s, t, -(n.{s, t, 0})/n[[3]]}, {s, -3, 3}, {t, -3, 3}, Mesh -> None, PlotStyle -> Opacity[0.3, Blue] ]; (* --- Vectors --- *) vecs = Graphics3D[{ Thick, Red, Arrow[{{0,0,0}, x}], Green, Arrow[{{0,0,0}, xr}], Black, Arrow[{{0,0,0}, n}], Purple, Arrow[{{0,0,0}, P.x}] }]; Show[plane, vecs, Axes -> True, AxesLabel -> {"x","y","z"}, Boxed -> False, ImageSize -> Large]
Figure 10.1: Reflection in ℝ³

Example C: General Reflection across a subspace in ℝn

For instance, let us consider a 2-dimensional subspace U in ℝ4, which is spanned by the unit (but not mutually orthogonal) vectors \[ {\bf e}_1 = \frac{1}{\sqrt{5}} \left[ 2, \ 1, \ 0 , 0 \right] , \qquad {\bf e}_2 = \frac{1}{\sqrt{10}} \left[ 0, \ 1, \ 3, \ 0 \right] . \] The reflection matrix across the subspace generated by these vectors is defined by Eq.\eqref{EqDot.10}. So we use Mathematica to generate the subroutine:

ReflectionMatrixFromBasis[basis_List] := Module[ {Q, G, P, n}, n = Length[First[basis]]; Q = Transpose[basis]; (* n x k *) G = Transpose[Q].Q; (* Gram matrix k x k *) P = Q.Inverse[G].Transpose[Q]; (* orthogonal projector *) Simplify[2 P - IdentityMatrix[n]] ];
\[ {\bf R} = 2\,{\bf Q}\left( {\bf Q}^{\mathrm T} {\bf Q} \right)^{-1} {\bf Q}^{\mathrm T} - {\bf I} , \qquad {\bf Q} = \left[ {\bf e}_1 \ {\bf e}_2 \right] . \] With Mathematica, we define vectors
u1 = {2,1,0,0}/Sqrt[5]; u2 = {0,1,3,0}/Sqrt[10];
Form the reflection matrix R
R = ReflectionMatrixFromBasis[{u1, u2}];
\[ \mathbf{R} = \frac{1}{49} \begin{bmatrix} 31&36&-12&0 \\ 36&-23& 24&0 \\ -12&24&41&0 \\ 0&0&0&-49 \end{bmatrix} . \] Test vector
x = {1, -3, 4, 2}; reflected = R . x
{-(125/49), 201/49, 80/49, -2}
So the reflection of the vector x = [1, -3, 4, 2] is \[ {\bf R}_U ({\bf x}) = {\bf R}_U (1, -3, 4, 2) = \frac{1}{49} \begin{bmatrix} -125 & 201 & 80 & -98 \end{bmatrix} . \] We check that the vector x = [1, -3, 4, 2] and its reflection have the same length.
x = {1, -3, 4, 2}; Norm[x]
Sqrt[30]
Norm[reflected]
Sqrt[30]
Verification:
VerifyReflection[R_] := { Simplify[R.R] === IdentityMatrix[Length[R]], (* R^2 = I *) Simplify[Transpose[R].R] === IdentityMatrix[Length[R]] (* orthogonal *) }
The following code handles an arbitrary randomly generated subspace:
SeedRandom[2026]; n = 7; k = 3; basis = Table[RandomReal[{-1,1}, n], {k}]; R = ReflectionMatrixFromBasis[basis]; VerifyReflection[R]
   ■
End of Example 49
    We present Mathematica code for dynamic manipulation of reflection across the plane in ℝ³. You can drag x and watch the reflection move symmetrically across the plane.
Clear[n, P, R, v1, v2] (* Normal vector *) n = Normalize[{1, 2, 1}]; (* Two orthonormal tangent vectors spanning the plane *) v1 = Normalize[NullSpace[{n}][[1]]]; v2 = Normalize[Cross[n, v1]]; (* Projector and reflection *) P = IdentityMatrix[3] - KroneckerProduct[n, n]; R = IdentityMatrix[3] - 2 KroneckerProduct[n, n]; (* Plane plot *) plane = ParametricPlot3D[ s v1 + t v2, {s, -3, 3}, {t, -3, 3}, Mesh -> None, PlotStyle -> Opacity[0.3, LightBlue] ]; Manipulate[ Module[{x = {x1, x2, x3}, xr, px}, xr = R.x; px = P.x; Show[ plane, Graphics3D[{ Thick, Red, Arrow[{{0, 0, 0}, x}], Green, Arrow[{{0, 0, 0}, xr}], Purple, Arrow[{{0, 0, 0}, px}], Black, Arrow[{{0, 0, 0}, n}], Black, PointSize[Large], Point[x], Point[xr] }], Axes -> True, AxesLabel -> {"x", "y", "z"}, Boxed -> False, ImageSize -> Large ] ], {{x1, 2}, -3, 3}, {{x2, -1}, -3, 3}, {{x3, 3}, -3, 3} ]
In this code:
  • Red: original vector x.
  • Green: reflected vector R x.
  • Purple: projection P x onto the plane.
  • Black: normal vector n.

The Householder Reflection

The following transformation was used in a 1958 paper by the American mathematician Alston Scott Householder (1904--1993).
A Householder reflection (also Householder transformation) is a linear transformation used to reflect vectors in ℝn across a hyperplane. Given a nonzero vector v ∈ ℝn, one defines the unit vector \( \displaystyle \quad {\bf u} = \frac{\bf v}{\| {\bf v}\| }, \quad \) and the associated Householder matrix \[ {\bf H} = {\bf I}-2\,{\bf u}\,{\bf u}^{\mathsf{T}}, \] where I is the n×n identity matrix. The matrix H represents reflection across the hyperplane orthogonal to u, i.e. the set \[ \{ {\bf x}\in \mathbb{R}^{n} \ : \ {\bf u} \bullet {\bf x} =0 \} . \]
For any vector x ∈ ℝn, the action of H can be written explicitly as
\[ {\bf H}\,{\bf x} = \mathbf{x} - 2\left( {\bf u}^{\mathsf{T}}\mathbf{x}\right) {\bf u}, \]
which is the familiar geometric formula for reflecting a point across a hyperplane with unit normal u.
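A standard use of this formula is to choose u so that H sends a given vector v onto ∥v∥e₁, which is exactly the step used in QR factorization below. A minimal Python sketch in plain floating point (the vector v is made up for illustration):

```python
import math

def householder_apply(u, x):
    # H x = x - 2 (u . x) u, with u a unit vector
    d = 2 * sum(a*b for a, b in zip(u, x))
    return [a - d*b for a, b in zip(x, u)]

v = [3.0, 4.0, 0.0]
alpha = math.sqrt(sum(c*c for c in v))   # ||v|| = 5
w = [v[0] - alpha, v[1], v[2]]           # v - ||v|| e1
nw = math.sqrt(sum(c*c for c in w))
u = [c / nw for c in w]                  # unit normal of the mirror

Hv = householder_apply(u, v)             # should be (5, 0, 0)
assert math.isclose(Hv[0], 5.0, abs_tol=1e-12)
assert math.isclose(Hv[1], 0.0, abs_tol=1e-12)
assert math.isclose(Hv[2], 0.0, abs_tol=1e-12)
```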

Householder reflections play a central role in modern numerical linear algebra, particularly in algorithms that require stable orthogonal transformations. One of the most important applications is the QR factorization of a matrix A ∈ ℝm×n. The goal is to write \( \displaystyle \quad {\bf A} = {\bf Q}\,{\bf R}, \quad \) where Q is orthogonal and R is upper triangular. Householder reflections provide a numerically stable and conceptually clean way to construct such a factorization.

The idea is to successively apply Householder transformations to zero out subdiagonal entries of A. At the first step, one constructs a Householder matrix H₁ that maps the first column of A to a vector whose only nonzero entry is in the first position. Applying H₁ on the left yields

\[ {\bf H}_1 {\bf A} = \left( \begin{matrix}*&*&\cdots \\ 0&*&\cdots \\ \vdots &\vdots &\ddots \end{matrix}\right) . \]
One then repeats this process on the trailing submatrix, constructing H₂, H₃, … . The product
\[ {\bf Q} = {\bf H}_1^{\mathsf{T}} {\bf H}_2^{\mathsf{T}}\cdots \]
is orthogonal, and the final matrix is upper triangular, giving the desired QR factorization. Householder-based QR is preferred in many contexts because it is backward stable and uses orthogonal transformations, which preserve the 2-norm and thus control error growth.
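The whole procedure fits in a few dozen lines. The following Python sketch (plain floats, no external libraries; the 3×2 input matrix is made up for illustration) applies one Householder reflection per column and accumulates Q:

```python
import math

# A compact Householder QR sketch: successively reflect each column so
# that its subdiagonal entries vanish.
def householder_qr(A):
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    Q = [[float(i == j) for j in range(m)] for i in range(m)]
    for k in range(min(m - 1, n)):
        x = [R[i][k] for i in range(k, m)]
        alpha = -math.copysign(math.sqrt(sum(c*c for c in x)), x[0])
        v = x[:]
        v[0] -= alpha
        nv = math.sqrt(sum(c*c for c in v))
        if nv < 1e-15:
            continue
        u = [c / nv for c in v]
        # apply H_k = I - 2 u u^T to the trailing block of R ...
        for j in range(n):
            d = 2 * sum(u[i - k] * R[i][j] for i in range(k, m))
            for i in range(k, m):
                R[i][j] -= d * u[i - k]
        # ... and accumulate the same reflections in Q
        for j in range(m):
            d = 2 * sum(u[i - k] * Q[i][j] for i in range(k, m))
            for i in range(k, m):
                Q[i][j] -= d * u[i - k]
    # Q currently holds H_last ... H_1; the Q in A = Q R is its transpose
    return [list(col) for col in zip(*Q)], R

A = [[2.0, 1.0], [1.0, 3.0], [2.0, 1.0]]
Q, R = householder_qr(A)
# R is upper triangular and Q R reproduces A
assert all(abs(R[i][j]) < 1e-10 for i in range(3) for j in range(2) if i > j)
for i in range(3):
    for j in range(2):
        assert abs(sum(Q[i][k] * R[k][j] for k in range(3)) - A[i][j]) < 1e-10
```

The sign choice `alpha = -copysign(...)` avoids cancellation when forming v, which is one reason this construction is numerically stable.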

For symmetric matrices, Householder reflections are also used to reduce a dense symmetric matrix to tridiagonal form, a key preprocessing step in many eigenvalue algorithms. By carefully choosing the Householder vectors, one can zero out entries below the first subdiagonal while preserving symmetry, leading to efficient and stable eigenvalue computations.
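As a rough sketch of this two-sided use of reflections, the following plain-Python function (a hypothetical helper, `tridiagonalize`, not a library routine) applies each H on the left and on the right, so the similarity transform HᵀAH = HAH preserves symmetry while zeroing entries beyond the first subdiagonal:

```python
import math

def tridiagonalize(A):
    """Reduce a symmetric matrix to tridiagonal form by Householder similarity."""
    n = len(A)
    T = [row[:] for row in A]
    for k in range(n - 2):
        x = [T[i][k] for i in range(k + 1, n)]   # below the first subdiagonal
        alpha = -math.copysign(math.sqrt(sum(c * c for c in x)), x[0])
        v = x[:]
        v[0] -= alpha
        norm = math.sqrt(sum(c * c for c in v))
        if norm == 0.0:
            continue
        u = [0.0] * (k + 1) + [c / norm for c in v]  # embed in R^n
        # T <- H T H with H = I - 2 u u^T (left then right application)
        for j in range(n):
            dot = sum(u[i] * T[i][j] for i in range(n))
            for i in range(n):
                T[i][j] -= 2 * dot * u[i]
        for i in range(n):
            dot = sum(T[i][j] * u[j] for j in range(n))
            for j in range(n):
                T[i][j] -= 2 * dot * u[j]
    return T
```

Since each step is a similarity transform, the eigenvalues (and in particular the trace) of the original matrix are preserved.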

Householder reflections are closely related to Givens rotations, another class of orthogonal transformations used in numerical linear algebra. While Givens rotations act nontrivially on only two coordinates at a time, Householder reflections act globally but can annihilate many entries in a single step. As a result, Householder methods are often more efficient for dense matrices, whereas Givens rotations are advantageous for sparse or structured problems.
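To make the contrast concrete, here is a minimal plain-Python Givens rotation (the helper name `givens` is ours): it annihilates exactly one entry of a two-component vector, whereas a single Householder reflection can zero an entire subcolumn at once.

```python
import math

def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] sends (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

c, s = givens(3.0, 4.0)
r = c * 3.0 + s * 4.0    # rotated first component: hypot(3, 4) = 5
z = -s * 3.0 + c * 4.0   # rotated second component: annihilated to 0
```

Because only two coordinates are touched, a Givens rotation leaves the rest of a sparse matrix undisturbed, which is exactly the advantage mentioned above.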

From a geometric viewpoint, every Householder matrix is an orthogonal transformation with determinant −1, and, by the Cartan–Dieudonné theorem, any orthogonal matrix in ℝⁿ can be expressed as a product of at most n Householder reflections. This gives Householder transformations a foundational role in the structure theory of orthogonal groups.

Example 50: This example presents a Mathematica code for dynamic manipulation and visualization of Householder transformations in 3-dimensional physical space.
Clear[householderMatrix, n, H];
(* Householder matrix for a unit normal n *)
householderMatrix[n_List] := IdentityMatrix[Length[n]] - 2 KroneckerProduct[n, n];

(* Example: reflection in 3D with unit normal n *)
n = Normalize[{1, 2, 1}];
H = householderMatrix[n];

(* Orthonormal basis {v1, v2} of the mirror plane orthogonal to n *)
v1 = Normalize[NullSpace[{n}][[1]]];
v2 = Normalize[Cross[n, v1]];
plane = ParametricPlot3D[s v1 + t v2, {s, -3, 3}, {t, -3, 3},
   Mesh -> None, PlotStyle -> Opacity[0.3, LightBlue]];

Manipulate[
 Module[{x = {x1, x2, x3}, xr},
  xr = H.x;  (* reflected vector *)
  Show[plane,
   Graphics3D[{Thick, Red, Arrow[{{0, 0, 0}, x}],
     Green, Arrow[{{0, 0, 0}, xr}],
     Black, Arrow[{{0, 0, 0}, n}],
     Black, PointSize[Large], Point[x], Point[xr]}],
   Axes -> True, AxesLabel -> {"x", "y", "z"},
   Boxed -> False, ImageSize -> Large]],
 {{x1, 2}, -3, 3}, {{x2, -1}, -3, 3}, {{x3, 3}, -3, 3}]
   ■
End of Example 50
The Householder reflection is given by the remarkably simple formula
\[ {\bf H} = {\bf I} - 2\mathbf{u}\,\mathbf{u}^{\mathsf{T}} \]
with deep geometric and algebraic content. It realizes reflection across a hyperplane in ℝⁿ; it is symmetric, orthogonal, and involutory; and it has a transparent eigenstructure (eigenvalue −1 on the span of u and eigenvalue +1 on the orthogonal hyperplane). These properties make it an indispensable tool in numerical linear algebra, particularly in QR factorization and eigenvalue algorithms, where stability and norm preservation are essential. Its conceptual clarity and computational efficiency explain why Householder's construction remains a standard ingredient in modern scientific computing.

 

  1. For the given target point (4, 3, 2, 1) in ℝ4, find the point in \( \displaystyle \quad \mbox{span} \left\{ \begin{pmatrix} 1 \\ -1 \\ 2 \\ 3 \end{pmatrix} , \ \begin{pmatrix} 0 \\ 1 \\ 3 \\ 2 \end{pmatrix} \right\} \quad \) that is closest to the target point.
  2. For the given target point (4, 3, 2, 1) in ℝ4, find the point in \( \displaystyle \quad \mbox{span} \left\{ \begin{pmatrix} 1 \\ 0 \\ 2 \\ 0 \end{pmatrix} , \ \begin{pmatrix} 0 \\ 3 \\ 2 \\ 0 \end{pmatrix} \right\} \quad \) that is closest to the target point.
  3. For the given target point (4, 3, 2, 1) in ℝ4, find the point on the hyperplane described by 2x + y − 3z + 4w = 0 that is closest to the target point.

 
