Recall that 𝔽 denotes one of the following sets of numbers: ℤ (integers), ℚ (rational numbers), or ℝ (real numbers); strictly speaking, ℤ is a ring rather than a field. We exclude ℂ, the complex numbers, from consideration in this section because applications of affine transformations in geometry and computer graphics use only real numbers. We denote by 𝔽^{m×n} or 𝔽^{m,n} the vector space of m-by-n matrices with entries from 𝔽.

According to Wikipedia, the term linear function can refer to two distinct concepts, based on the context:

  • In Calculus, a linear function is a polynomial function of degree zero or one; in other words, a function of the form f(x) = m x + b for some constants m and b ∈ ℝ.
  • In Linear Algebra, a linear function is a linear mapping, or linear transformation: f(λx + y) = λf(x) + f(y) for any scalar λ and any two vectors x and y.
A matrix A of size m × n defines a linear map upon multiplication from the left:
\begin{align*} \mathbb{F}^{n\times 1} &\longrightarrow \, \mathbb{F}^{m\times 1} , \\ \mathbb{F}^{n\times 1} \ni {\bf x} &\longrightarrow \, \mathbf{A}\,\mathbf{x} \in \mathbb{F}^{m\times 1} , \end{align*}
also denoted by A : 𝔽^{n×1} ⇾ 𝔽^{m×1}. The same matrix defines a linear transformation between row vectors upon multiplication from the right,
\begin{align*} \mathbb{F}^{1\times n} &\longleftarrow \, \mathbb{F}^{1\times m} , \\ \mathbb{F}^{1\times n} \ni {\bf v}\,{\bf A} &\longleftarrow \, \mathbf{v} \in \mathbb{F}^{1\times m} . \end{align*}
Such a map has the basic property A 0 = 0 for column vectors and 0 A = 0 for row vectors.
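As a small worked illustration (the matrix and vectors below are chosen only for this example and do not come from the text), take m = 2 and n = 3:
\[ \mathbf{A} = \begin{bmatrix} 1&2&0 \\ 0&1&3 \end{bmatrix} , \qquad \mathbf{A} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} \in \mathbb{F}^{2\times 1} , \qquad \begin{bmatrix} 1 & -1 \end{bmatrix} \mathbf{A} = \begin{bmatrix} 1 & 1 & -3 \end{bmatrix} \in \mathbb{F}^{1\times 3} . \]
So left multiplication sends columns of length 3 to columns of length 2, while right multiplication sends rows of length 2 to rows of length 3.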
$Post := If[MatrixQ[#1], MatrixForm[#1], #1] & (* outputs matrices in MatrixForm *)
Remove[ "Global`*"] // Quiet (* remove all variables *)

Affine Transformations

An affine transformation or affinity (in 1748, Leonhard Euler introduced the term affine, which stems from the Latin affinis, "connected with") is a geometric transformation that preserves the parallelism of lines and the ratio of distances between points on the same line. Affine transformations are closely related to projective transformations; both are widely used in computer graphics, image processing, machine learning, and neural networks to perform geometric transformations in a simple way using transformation matrices.

Although there are several open libraries that implement affine transformations, such as OpenGL and OpenCV, we prefer to use Mathematica and its built-in commands AffineTransform and TransformationMatrix.

The following definition does not satisfy mathematical scrutiny because it does not say explicitly what the domain and codomain of the transformation are; a formal definition will be given later. However, it gives us an idea of where affine transformations come from.

(Naïve Representation) Any map f : 𝔽^{n×1} → 𝔽^{m×1} of the form \begin{equation} \label{EqAffine.1} \mathbb{F}^{n\times 1} \ni \mathbf{x} \longrightarrow f({\bf x}) = {\bf A}\,{\bf x} + \mathbf{b} \end{equation} for some matrix A ∈ 𝔽^{m×n} and some column vector b ∈ 𝔽^{m×1} is called an affine map. Similarly, a mapping between row vectors \begin{equation} \label{EqAffine.2} \mathbb{F}^{1\times m} \ni \mathbf{v} \longrightarrow f({\bf v}) = {\bf v}\,{\bf A} + \mathbf{w} \end{equation} for some row vector w ∈ 𝔽^{1×n} is also called an affine transformation.
Both formulae \eqref{EqAffine.1} and \eqref{EqAffine.2} are just shorthand for the general transformation of the form
\[ \begin{cases} y_1 &= a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n + b_1 , \\ y_2 &= a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n + b_2 , \\ \ \vdots & \qquad \vdots \qquad \vdots \\ y_m &= a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n + b_m . \end{cases} \]

An affine map is a geometric transformation that preserves lines and parallelism, but not necessarily Euclidean distances and angles. Since f(0) = b, such a map can be linear only when b = 0 in Eq.\eqref{EqAffine.1} or w = 0 in Eq.\eqref{EqAffine.2}.
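
To see why a nonzero shift destroys linearity, one can compare both sides of the additivity condition for f(x) = A x + b (a short check that uses only the definition above):
\[ f(\mathbf{x} + \mathbf{y}) = \mathbf{A}\,\mathbf{x} + \mathbf{A}\,\mathbf{y} + \mathbf{b} , \qquad f(\mathbf{x}) + f(\mathbf{y}) = \mathbf{A}\,\mathbf{x} + \mathbf{A}\,\mathbf{y} + 2\,\mathbf{b} . \]
The two expressions agree only when b = 0.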

Every affine transformation of the plane is one of the following four basic transformations or a composition of them:

  • Translate moves a set of points a fixed distance in each coordinate.
  • Scale scales a set of points up or down in each coordinate.
  • Rotate rotates a set of points about the origin.
  • Shear offsets a set of points a distance proportional to their x and y coordinates.
Note that only shear and non-uniform scale change the shape determined by a set of points. The subclass of affine transformations that preserve angles, but not necessarily lengths, consists of the similarity transformations, which are the affine examples of conformal maps.
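
In the plane, these four basic transformations can be written, for instance, as follows. The parameter names t_x, t_y, s_x, s_y, θ, and k are our own labels, and the shear shown is a horizontal one:
\begin{align*} \mbox{Translate:} \quad & \mathbf{x} \mapsto \mathbf{x} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} , & \mbox{Scale:} \quad & \mathbf{x} \mapsto \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \mathbf{x} , \\ \mbox{Rotate:} \quad & \mathbf{x} \mapsto \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \phantom{-}\cos\theta \end{bmatrix} \mathbf{x} , & \mbox{Shear:} \quad & \mathbf{x} \mapsto \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix} \mathbf{x} . \end{align*}
The rotation is counterclockwise by the angle θ; a vertical shear uses the transposed shear matrix.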

Example 1: Let us consider an affine transformation \[ \begin{bmatrix} x \\ y \end{bmatrix} \,\mapsto \, \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \phantom{-}1 \\ -1 \end{bmatrix} \] It is a “shear” combined with a vertical stretch, followed by a translation. The effect of this transformation on the square with vertices (−1, −1), (−1, 1), (1, 1), (1, −1) is shown in the following figure. The image of this square is a parallelogram.

First, we rewrite the transformation in coordinates explicitly: \[ \begin{split} x &\mapsto x+y +1 , \\ y &\mapsto 2\,y - 1 . \end{split} \] Then the vertical line x = −1 is mapped into the line \[ x \mapsto y , \qquad y \mapsto 2\,y - 1 . \] The images of the vertices become \[ \left( -1,-1 \right) \mapsto \left( -1, -3 \right) , \quad \left( -1, 1 \right) \mapsto \left( 1, 1 \right) , \quad \left( 1, 1 \right) \mapsto \left( 3, 1 \right) , \quad \left( 1, -1 \right) \mapsto \left( 1, -3 \right) . \]

Graphics[{Opacity[.35], Blue, Rectangle[{-1, -1}, {1, 1}], Red, Polygon[{{-1, -3}, {1, 1}, {3, 1}, {1, -3}}]}, Axes -> True]
Square before affine map.
     
Parallelogram after affine map.
Clear[A, x, y, xy]
A = {{1, 1}, {0, 2}};
b = {1, -1};
xy = {x, y}
A . xy
{x + y, 2 y}
Solve[{(A . xy)[[1]] == 1, (A . xy)[[2]] == -1}, {x, y}]
\( \displaystyle \quad \left( x\,\to\,\frac{3}{2} \ y\,\to\, -\frac{1}{2} \right) \)
t = AffineTransform[{A, b}];
t[{x, y}]
{1 + x + y, -1 + 2 y}
TransformationMatrix@AffineTransform[{A, b}] // MatrixForm
\( \displaystyle \quad \begin{pmatrix} 1&1&1 \\ 0&2&-1 \\ 0&0&1 \end{pmatrix} \)
   ■
End of Example 1
Theorem 1: If an affine transformation has an inverse, then the inverse is also an affine transformation.
Let q = A x + b be an affine transformation written for column vectors. It has an inverse only when A is a nonsingular matrix, so det(A) ≠ 0. Then A x = q − b. Applying the inverse matrix (which exists for nonsingular matrices) to both sides, we obtain x = A⁻¹q − A⁻¹b. Hence, x = B q + v, where B = A⁻¹ and v = −A⁻¹b.
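
As a worked instance, take the matrix and shift displayed in Example 1, that is, A with rows (1, 1) and (0, 2) and b = (1, −1). The formulas of the proof give
\[ \mathbf{A}^{-1} = \begin{bmatrix} 1 & -\frac{1}{2} \\ 0 & \phantom{-}\frac{1}{2} \end{bmatrix} , \qquad \mathbf{v} = -\mathbf{A}^{-1} \mathbf{b} = \begin{bmatrix} -\frac{3}{2} \\ \phantom{-}\frac{1}{2} \end{bmatrix} , \qquad \mathbf{x} = \mathbf{A}^{-1} \mathbf{q} + \mathbf{v} = \begin{bmatrix} q_1 - \frac{1}{2}\, q_2 - \frac{3}{2} \\ \frac{1}{2}\, q_2 + \frac{1}{2} \end{bmatrix} . \]
Substituting q₁ = x + y + 1 and q₂ = 2y − 1 indeed returns (x, y).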
   
Example 2: Matrix \( \displaystyle \quad \mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0&0 \end{bmatrix} \) maps all points to the x-axis, so it is a projection on this axis. The area of any closed region will become zero. We have det(A) = 0, which verifies that any closed region’s area will be scaled by zero.

In general, for any given closed region, the area under an affine transformation A x + b is scaled by |det(A)|, the absolute value of the determinant. In particular, this result is valid for any linear mapping y = A x.    ■

End of Example 2
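A quick check of the area rule, using the matrix of Example 1 (the unit square is our own choice of test region): the unit square with vertices (0, 0), (1, 0), (1, 1), (0, 1) is sent by the matrix with rows (1, 1) and (0, 2) to the parallelogram with vertices (0, 0), (1, 0), (2, 2), (1, 2). This parallelogram is spanned by the columns of the matrix, so its area is
\[ \left| \det \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \right| = 2 , \]
twice the area of the original square. Adding a translation b moves the parallelogram but does not change its area.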
Corollary 1: A composition of affine transformations is an affine transformation.
Let f(x) = A x + a and g(x) = B x + b be affine transformations. Then (g ∘ f)(x) = g(f(x)) = B(A x + a) + b = (B A)x + (B a + b), which is again of the form "matrix times x plus a constant vector."
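
As an illustration with concrete numbers (chosen here only for the example), let f rotate by 90° and then shift, and let g scale by 2 and then shift:
\[ \mathbf{A} = \begin{bmatrix} 0 & -1 \\ 1 & \phantom{-}0 \end{bmatrix} , \quad \mathbf{a} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} , \qquad \mathbf{B} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} , \quad \mathbf{b} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} . \]
Then
\[ (g \circ f)(\mathbf{x}) = \begin{bmatrix} 0 & -2 \\ 2 & \phantom{-}0 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 2 \\ 1 \end{bmatrix} , \qquad (f \circ g)(\mathbf{x}) = \begin{bmatrix} 0 & -2 \\ 2 & \phantom{-}0 \end{bmatrix} \mathbf{x} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} . \]
Both compositions are affine, but they differ in the translation part, so composition of affine maps is not commutative in general.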
   
Example 3:    ■
End of Example 3

The basic properties of affine transformations are summarized in the following statement.

Theorem 2: Let f(x) = A x + b be an affine transformation with nonsingular matrix A. Then f
  1. maps a line to a line,
  2. maps a line segment to a line segment,
  3. preserves the property of parallelism among lines and line segments,
  4. maps an n-gon to an n-gon,
  5. maps a parallelogram to a parallelogram,
  6. preserves the ratio of lengths of two parallel segments, and
  7. preserves the ratio of areas of two figures.
  1. Let L be a line and let L: p + tm, t ∈ ℝ, with m ≠ 0, be an equation of L in vector form. Then for every t ∈ ℝ, \[ f \left( \mathbf{p} + t\,{\bf m} \right) = \mathbf{A}\left( \mathbf{p} + t\,{\bf m} \right) + \mathbf{b} = \mathbf{p}_1 + t\,\mathbf{m}_1 , \] where p₁ = A p + b and m₁ = A m, and m₁ ≠ 0 whenever A is nonsingular. Hence, f(L) = L₁, where L₁ : p₁ + tm₁, t ∈ ℝ, is again a line.
  2. The proof is the same as that for (1), with t restricted to [0, 1].
  3. Suppose that L: p + tm and L₁ : p₁ + tm₁, t ∈ ℝ, are parallel lines. Then m₁ = km for some k ∈ ℝ. Therefore, \begin{align*} f \left( \mathbf{p} + t\,\mathbf{m} \right) &= \mathbf{A} \left( \mathbf{p} + t\,\mathbf{m} \right) + {\bf b} = \left( \mathbf{A}\,\mathbf{p} + {\bf b} \right) + t \left( {\bf A}\,{\bf m} \right) = \mathbf{q} + t\,\mathbf{n} , \\ f \left( \mathbf{p}_1 + t\,\mathbf{m}_1 \right) &= f \left( \mathbf{p}_1 + t\,k\,\mathbf{m} \right) = \mathbf{A} \left( \mathbf{p}_1 + t\,k\,\mathbf{m} \right) + {\bf b} \\ &= \left( \mathbf{A} \, \mathbf{p}_1 + {\bf b} \right) + t \left( \mathbf{A} \,k\,{\bf m} \right) = \mathbf{p}_2 + t\,\mathbf{m}_2 , \end{align*} where q = A p + b, n = A m, p₂ = A p₁ + b, and m₂ = k A m = k n. That is, L and L₁ are mapped to lines with proportional direction vectors, hence to parallel lines.

    It is clear that for two line segments or a line and a line segment the proof is absolutely analogous.

  4. We prove this by strong induction on n. For the base case, when n = 3, consider a triangle T. Then T and its interior can be represented in vector form as T : u + sv + tw, where s, t ∈ [0, 1], s + t ≤ 1, and the vectors v and w are not collinear. Then \begin{align*} f(T) &= f \left( {\bf u} + s {\bf v} + t {\bf w} \right) = \mathbf{A} \left( {\bf u} + s {\bf v} + t {\bf w} \right) + {\bf b} \\ &= \left( {\bf A}\,{\bf u} + {\bf b} \right) + s \left( \mathbf{A}\,{\bf v} \right) + t \left( \mathbf{A}\,{\bf w} \right) \\ &= {\bf u}_1 + s{\bf v}_1 + t{\bf w}_1 , \end{align*} where s, t ∈ [0, 1], s + t ≤ 1. Since v and w are not collinear and A is nonsingular, v₁ = Av and w₁ = Aw are not collinear either. Thus, T is mapped to a triangle T₁, which completes the proof of the base case.

    Now suppose that f maps each n-gon to an n-gon for all n, 3 ≤ n ≤ k, and let P be a polygon with k + 1 sides. We know that every polygon with at least 4 sides has a diagonal contained completely in its interior. Let \( \displaystyle \quad \overline{AB} \) be such a diagonal in P. This diagonal divides P into two polygons, P₁ and P₂, containing t and k + 3 − t sides, respectively (the diagonal is counted as a side of each), for some t, 3 ≤ t ≤ k. By the inductive hypothesis, f(P₁) and f(P₂) will be t-sided and (k + 3 − t)-sided polygons, respectively. Since each of these polygons has the segment from f(A) to f(B) as a side, the union of f(P₁) and f(P₂) forms a polygon with t + (k + 3 − t) − 2 = k + 1 sides, which concludes the proof.

  5. The proof that a parallelogram is mapped to a parallelogram is analogous to the proof that triangles get mapped to triangles in part (4), by simply dropping the condition that s + t ≤ 1.
  6. Consider parallel line segments, S₁ and S₂, given in vector form as Sᵢ : pᵢ + t uᵢ, t ∈ [0, 1]. Because they are parallel, u₂ = ku₁ for some k ∈ ℝ. As |uᵢ| is the length of Sᵢ, the ratio of lengths of S₂ and S₁ is |k|. From parts (1) and (2), Sᵢ is mapped into a segment of length |Auᵢ|. Since Au₂ = A(ku₁) = k(A u₁), we get |Au₂| = |k| |Au₁|, which shows that the ratio of lengths of f(S₂) and f(S₁) is also |k|.
  7. We postpone discussion of the proof of this property until the end of this section.
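
The following small computation illustrates parts (3) and (6) and shows that angles need not be preserved. It uses the matrix of Example 1; the direction vectors u₁ = (1, 1), u₂ = 2u₁, e₁ = (1, 0), and e₂ = (0, 1) are our own choice:
\[ \mathbf{A} = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} : \qquad \mathbf{A}\,\mathbf{u}_1 = \begin{bmatrix} 2 \\ 2 \end{bmatrix} , \quad \mathbf{A}\,\mathbf{u}_2 = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 2\,\mathbf{A}\,\mathbf{u}_1 , \qquad \mathbf{A}\,\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} , \quad \mathbf{A}\,\mathbf{e}_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} . \]
The images of u₁ and u₂ remain parallel with the same length ratio 2, whereas the images of the perpendicular vectors e₁ and e₂ have dot product 1 ≠ 0 and are no longer perpendicular.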
   
Example 4:
   ■
End of Example 4

Affine Spaces

Recall that the Cartesian product of two sets A and B, denoted A × B, is the set of all ordered pairs (a, b), where a is in A and b is in B. Since our main object of interest is ℝ, the set of real numbers, its direct product ℝ² = ℝ × ℝ inherits a linear structure from field ℝ. This space provides the main historical example of the Cartesian plane in analytic geometry.

The set of all such pairs (i.e., the Cartesian product ℝ × ℝ, denoted by ℝ²) is assigned to the set of all points in the plane as well as to the set of all free vectors. All three sets (the set of points on the plane, the set of 2-tuples, and the set of free vectors in ℝ²) are in one-to-one and onto correspondence with each other. Therefore, they are traditionally all denoted by ℝ², and context specifies which of these sets is in use. One can similarly define the Cartesian product of n sets, also known as an n-fold Cartesian product, which can be represented by an n-dimensional array, where each element is an n-tuple. In Euclidean space, points and vectors are usually identified with n-tuples.

Theorem 3: A subset of 𝔽n is an affine set if and only if it is the solution set to a system of linear equations over 𝔽.
If S ⊆ 𝔽n is an affine set, it has the form S = b + W, where W is a subspace of 𝔽n and b is a vector in 𝔽n. Given such a set, let W₁ be a direct complement to W in 𝔽n. Note that we can assume b is in W₁.

Let A be the standard matrix representation of the projection onto W₁ along W. For w₁ in W₁ and w in W, we have \[ {\bf A} \left( {\bf w}_1 + {\bf w} \right) = {\bf w}_1 . \] In particular, NullSpace(A) = W and A b = b. Lemma 2 (the general solution of A x = b is the sum of a particular solution and the general solution of the homogeneous equation A x = 0), stated in another section, thus ensures that S is the solution set to A x = b. This completes the proof in one direction.

The same Lemma 2 gives us the proof in the other direction.
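
To make the construction in the proof concrete, here is a small example in ℝ² (the subspace, complement, and shift are chosen by us for illustration). Let W be spanned by (1, 1), let W₁ be spanned by (0, 1), and let b = (0, 1) ∈ W₁. Every vector decomposes as (x, y) = x (1, 1) + (y − x)(0, 1), so the projection onto W₁ along W and the corresponding system are
\[ \mathbf{A} = \begin{bmatrix} \phantom{-}0 & 0 \\ -1 & 1 \end{bmatrix} , \qquad \mathbf{A}\,\mathbf{b} = \mathbf{b} , \qquad \mathbf{A} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \quad \iff \quad y - x = 1 . \]
The solution set is the line { (t, t + 1) : t ∈ ℝ }, which is exactly b + W.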

   

Example 5: Let us consider the affine transformation S : ℝ³ → ℝ⁴ given by S(x) = A x + b, where \[ \mathbf{A} = \begin{bmatrix} -2& 3& 5 \\ 3& 7& -1 \\ 5& 27& 7 \\ -9& 2& 16 \end{bmatrix}, \qquad {\bf b} = \begin{bmatrix} 3 \\ -6 \\ -12 \\ 15 \end{bmatrix} . \] First, we find the solution set of the linear system A x = b, which is an affine set by Theorem 3, with the aid of Mathematica.

A = {{-2, 3, 5}, { 3, 7, -1}, {5, 27, 7}, {-9, 2, 16}};
b = {{3}, {-6}, {-12}, {15}};
RowReduce[Join[A, b, 2]] // MatrixForm
\( \displaystyle \quad \begin{pmatrix} 1& 0& -\frac{38}{23}& -\frac{39}{23} \\ 0& 1& \frac{13}{23}& -\frac{3}{23} \\ 0& 0& 0& 0 \\ 0& 0& 0& 0 \end{pmatrix} \)
From the row-reduced form given above, we can write the solution set as a line given parametrically \[ L = \left\{ \left( x, y, z \right) \ : \ x = \frac{38}{23}\, t -\frac{39}{23} , \quad y = -\frac{13}{23}\, t - \frac{3}{23} , \quad z = t , \quad \forall t \in \mathbb{R} \right\} . \] We can decompose L into a sum of two components: the first is the line L₀, which passes through the origin; the second is a translation by a particular vector vₚ. To find the particular vector vₚ, all we have to do is set t = 0 in the parametric definition of L given above, which yields \( \displaystyle \quad {\bf v}_p = \left[ -\frac{39}{23} , \ -\frac{3}{23} , \ 0 \right] . \quad \) Once we know vₚ, the line L₀ is simply the remaining portion of the solution \[ L_0 = \left\{ \left( x, y, z \right) \ : \ x = \frac{38}{23}\, t , \quad y = -\frac{13}{23}\, t , \quad z = t , \quad \forall t \in \mathbb{R} \right\} . \] Clearly, L₀ is a line through the origin, and is thus a subspace of ℝ³. The line L can be realized as a translate of the line L₀ by the particular solution vₚ. Now let us plot these two lines along with the particular solution vₚ.
lineL = ParametricPlot3D[{38/23*t - 39/23, -13/23*t - 3/23, t}, {t, -3, 3}, PlotStyle -> {Thickness[0.007], Blue}];
lineL0 = ParametricPlot3D[{38/23*t, -13/23*t, t}, {t, -3, 3}, PlotStyle -> {Thickness[0.007], Red}];
pts = Graphics3D[{Black, Sphere[{0, 0, 0}, 0.13], Black, Sphere[{-39/23, -3/23, 0}, 0.13]}];
arrow = Graphics3D[{Arrowheads[0.05], Thickness[0.007], Purple, Arrow[{{0, 0, 0}, {-39/23, -3/23, 0}}]}];
txt = Graphics3D[{Text[Style["O", Black, 20], {0, 0, 0.6}], Text[Style["P", Black, 20], {-39/23, -3/23, 0.6}]}];
Show[lineL, lineL0, txt, arrow, pts]
Coset of 1D space.
   ■
End of Example 5

In particular, a Euclidean plane contains points P(x, y) and vectors v(x, y) simultaneously because they both have the same coordinates.

In computer graphics, the main problem is to render or display three-dimensional objects (or models) by projecting or mapping them into two-dimensional images. The two-dimensional data must then be converted into a form that the computer can display (rasterization). This requires a viewpoint or direction of projection and a viewing or projection plane. Fortunately, a monitor is just a two-dimensional array of a finite number of pixels, short for picture elements.

This practical situation with rasterizing data in computer graphics shows that we need to distinguish points from vectors. It is important because points and vectors have some mutually exclusive properties. A point has location but no extent, while a vector in ℝⁿ has both direction and magnitude (norm) but no fixed location.

In order to define an affine plane (which is a two-dimensional geometric object), we need to separate points from vectors (that is, directed line segments) according to their kind. Namely, we tag points with an extra coordinate "1," but we mark vectors with "0." Hence, we write points as P(x, y, 1) and vectors as v(x, y, 0). However, you can move vectors to the point plane and attach them to points. This allows us to move a point P into another position Q along vector v. In coordinates, it can be written as

\[ P(x, y, 1) + \mathbf{v}(a, b, 0) = Q(x+a, y+b, 1) \quad \Longrightarrow \quad \mathbf{v} = Q - P = \overline{PQ} . \]

From this perspective, we are not allowed to add points, but we can add points and vectors, as well as vectors and vectors. For any two points P and Q from the inhabited set A, there exists a unique vector v ∈ V such that Q = P + v; so we can denote this vector by \( \overline{PQ} \) or Q − P. In general, an affine space consists of an inhabited set of points A together with a vector space V and a subtraction operation on pairs of points that produces a unique vector.
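
A short numerical illustration of this bookkeeping (the particular coordinates are ours):
\[ P(1, 2, 1) + \mathbf{v}(3, -1, 0) = Q(4, 1, 1) , \qquad Q - P = (3, -1, 0) , \qquad P + Q = (5, 3, 2) . \]
The first result has last coordinate 1, so it is a point; the second has last coordinate 0, so it is a vector; the third has last coordinate 2 and is neither, which is why the sum of two points is not defined.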

In order to visualize an affine plane, we consider the vector space ℝ³. Inside ℝ³, we can choose two planes, as in the picture below. We'll call the yellow one the vector space V ≌ ℝ² and the blue one as the point plane A. The plane V passes through the origin since it is a vector space, but the blue plane A does not. However, the inhabited set A looks almost exactly the same as V, having the exact same, flat geometry, and in fact A and V are simply translates of one another. This plane A is a classical example of an affine space. You will learn in Part 3 that A is a coset of V.

Wrong model of affine plane.
     
Affine plane.

The left picture shows an attempt to introduce vector structure in the inhabited set A. Let T : A → V be a translation of the point set to the vector space. You may try to define addition of two points as

\[ P \left( + \right) Q = T^{-1} \left( T(P) + T(Q) \right) . \]
However, the resulting vector (P(+)Q) does not belong to the inhabited set A. It is impossible to introduce a vector structure into an affine space; it is not a vector space.

Now we are ready to give the general definition of an affine space, according to Wikipedia.

An affine space is a geometric structure that generalizes some of the properties of Euclidean spaces in such a way that these are independent of the concepts of distance and measure of angles, keeping only the properties related to parallelism and ratio of lengths for parallel line segments. Affine space is the setting for affine geometry.
In the context of linear algebra, an affine space is a set of points A equipped with a set of transformations (that is, bijective mappings), the translations, which form a vector space (over a given field, commonly the real numbers), and such that for any given ordered pair of points there is a unique translation sending the first point to the second one; the composition of two translations is their sum in the vector space of translations.
An affine space with vector space V is a nonempty set A of points and a vector-valued map d : A × A → V, called a difference function, such that for all P, Q, R ∈ A
  1. d(P, Q) + d(Q, R) = d(P, R),
  2. the restricted map d_P = d|_{{P}×A} : {P} × A → V, defined by (P, Q) ↦ d(P, Q), is a bijection.

The first condition (1) is just the usual “parallelogram property” of the addition of vectors. From the second condition, it follows that for every pair of points P and Q from A, there exists a unique vector v ∈ V such that P + v = Q, which is naturally denoted by \( \displaystyle \quad \overline{PQ} \quad \mbox{or} \quad \vec{PQ} . \) From properties of a vector space, we derive

\[ P + \mathbf{0} = P , \qquad \left( P + \mathbf{v} \right) + \mathbf{u} = P + \left( \mathbf{v} + \mathbf{u} \right) \quad \forall \mathbf{v}, \mathbf{u} \in V. \]
   
Example 6: Any finite dimensional vector space V has an affine space structure specified by choosing the inhabited set A = V and letting d be subtraction in the vector space V, that is, d(P, Q) = Q − P. We will refer to the affine structure (V, V, d) on a vector space V as the canonical affine structure on V. In particular, the vector space ℝⁿ can be viewed as the affine space (ℝⁿ, ℝⁿ, d), denoted by 𝔸ⁿ. The affine space 𝔸ⁿ is called the real affine space of dimension n.    ■
End of Example 6
Lemma 1: In an affine space (A, V, d) with difference function d we have
  1. d(P, P) = 0 for all points P ∈ A,
  2. d(P, Q) = −d(Q, P) for all points P, Q ∈ A.
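Both statements follow directly from the first condition in the definition of the difference function. A short derivation (ours, since the lemma is stated without proof) reads
\[ d(P,P) + d(P,P) = d(P,P) \ \Longrightarrow \ d(P,P) = \mathbf{0} , \qquad d(P,Q) + d(Q,P) = d(P,P) = \mathbf{0} \ \Longrightarrow \ d(P,Q) = -d(Q,P) . \]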
In an affine space (A, V, d), for any three points P, Q, R ∈ A and any real number λ, addition and scalar multiplication of pairs with common first point P are defined via \[ \begin{split} \left( P, Q \right) + \left( P, R \right) &= d^{-1} \left( d\left( P, Q \right) + d\left( P, R \right) \right) , \\ \lambda \left( P, Q \right) &= d^{-1} \left( \lambda\, d \left( P, Q \right) \right) . \end{split} \] Here d⁻¹ denotes the inverse of the bijection (P, Q) ↦ d(P, Q) on {P} × A. The resulting vector space is the tangent space to A at point P, denoted T_P(A). For v ∈ T_P(A) ≌ V, we denote by P + v the point Q with d(P, Q) = v.
   
Example 7: Let us consider the subset A of 𝔸³ consisting of all points (x, y, z) satisfying the equation \[ x^2 + y^2 - z = 0 . \] The set of points A is a paraboloid of revolution, with axis Oz. The surface A can be made into a genuine affine space by defining an action s : A × ℝ² → A of ℝ² on A (addition of vectors to points, which is equivalent to specifying the difference operation d): for every point (x, y, x² + y²) on A and any vector v = (v, u) ∈ ℝ², \[ \left( x, y , x^2 + y^2 \right) + \begin{bmatrix} v \\ u \end{bmatrix} = \left( x + v , y + u , (x+v)^2 + (y+u)^2 \right) . \]    ■
End of Example 7

As the notion of parallel lines is one of the main properties that is independent of any metric, affine geometry is often considered as the study of parallel lines.

Affine Transformations

Let P₁ and P₂ be two arbitrary points in the inhabited set A of an affine space (A, V, d). An affine combination of points P₁ and P₂ is \[ P = \alpha\, P_1 + \left( 1 - \alpha \right) P_2 , \qquad \alpha \in \mathbb{R} ; \] for α ∈ [0, 1], the point P lies on the segment joining P₁ and P₂. In general, an affine combination of m points P₁, P₂, … , Pₘ is \[ \sum_{i=1}^m \alpha_i P_i = P_0 + \sum_{i=1}^m \alpha_i \left( P_i - P_0 \right) , \quad \sum_{i=1}^m \alpha_i = 1 , \] where P₀ is an arbitrary reference point.
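
The right-hand side of the general formula does not depend on the choice of the reference point P₀ precisely because the coefficients sum to 1. A short check, with Q₀ denoting a second, arbitrary reference point (our notation):
\[ \left[ Q_0 + \sum_{i=1}^m \alpha_i \left( P_i - Q_0 \right) \right] - \left[ P_0 + \sum_{i=1}^m \alpha_i \left( P_i - P_0 \right) \right] = \left( 1 - \sum_{i=1}^m \alpha_i \right) \left( Q_0 - P_0 \right) = \mathbf{0} . \]
For instance, with P₁ = (0, 0), P₂ = (2, 1), and α = 1/2, the combination αP₁ + (1 − α)P₂ is the midpoint (1, 1/2).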
   
Example 8: We start with two points in a two-dimensional affine space.
ar = Graphics[{Blue, Thick, Arrow[{{0, 0}, {1, 0.5}}]}];
ar2 = Graphics[{Black, Dashed, Thick, Arrow[{{1, 0.5}, {2, 1}}]}];
txt = Graphics[{Text[Style[Subscript[P, 1], Black, 20], {0, -0.15}], Text[Style[Subscript[P, 2], Black, 20], {2.1, 0.9}]}];
dot = Graphics[{{Purple, Disk[{1, 0.5}, 0.02]}, {Purple, Disk[{0, 0}, 0.02]}, {Purple, Disk[{2, 1}, 0.02]}}];
txt2 = Graphics[{Text[Style["+ t (", Black, 20], {1.3, 0.45}], Text[Style["-", Black, 20], {1.6, 0.45}], Text[Style[")", Black, 20], {1.81, 0.45}]}];
txt3 = Graphics[{Text[Style[Subscript[P, 1], Black, 20], {1.1, 0.45}], Text[Style[Subscript[P, 2], Black, 20], {1.46, 0.45}], Text[Style[Subscript[P, 1], Black, 20], {1.74, 0.45}]}];
Show[ar, ar2, txt, dot, txt2, txt3]
Linear combination of two points.

Now we consider three points; the following figure shows a combination of these points: \[ P = \alpha_1 P_1 + \alpha_2 P_2 + \alpha_3 P_3 , \qquad \alpha_1 + \alpha_2 + \alpha_3 = 1. \]

ar = Graphics[{Blue, Thick, Arrow[{{0, 0}, {1, 0.5}}]}];
ar2 = Graphics[{Black, Dashed, Thick, Arrow[{{1, 0.5}, {2, 1}}]}];
ar3 = Graphics[{Blue, Thick, Arrow[{{1, 0.5}, {1.7, 0}}]}];
line = Graphics[{Black, Dashed, Thick, Line[{{2, 1}, {3, -0.7}, {0, 0}}]}];
dot = Graphics[{{Purple, Disk[{1, 0.5}, 0.02]}, {Purple, Disk[{0, 0}, 0.02]}, {Purple, Disk[{2, 1}, 0.02]}}];
dot2 = Graphics[{{Purple, Disk[{3, -0.7}, 0.02]}, {Red, Disk[{1.7, 0}, 0.03]}}];
Show[ar, ar2, ar3, line, dot, dot2] (* combine the pieces into one figure *)
Linear combination of three points.
   ■
End of Example 8
Formally, an affine transformation is a mapping from one affine space to another (which may be, and in fact usually is, the same space) that preserves affine combinations.
The properties of affine transformations on points and vectors are summarized in the following theorem.
Theorem 4: Let P and Q be points and u and v be vectors in an affine space 𝔸 = (A, V, d). Let F : 𝔸 → 𝔹 be an affine transformation from 𝔸 to another affine space 𝔹. Then for all scalars α and β
  1. F(αP + βQ) = α F(P) + β F(Q) whenever α + β = 1,
  2. F(v) = F(P − Q) = F(P) − F(Q) for v = P − Q,
  3. F(P + αv) = F(P) + αF(v),
  4. F(u + v) = F(u) + F(v),
  5. F(αv) = αF(v).
  1. The first two properties are the definition of an affine transformation of a point and a vector.
  2. Showing part (4) is straightforward if P, Q, and R are points in 𝔸 such that u = P − Q and v = Q − R and the head-to-tail axiom is applied several times:
  3. \begin{align*} F(\mathbf{u} + \mathbf{v}) &= F \left( (P - Q) + (Q - R) \right) \\ &= F(P - R) = F(P) - F(R) \\ &= F(P) - F(Q) + F(Q) - F(R) \\ &= F(P - Q) + F(Q-R) \\ &= F(\mathbf{u}) + F(\mathbf{v}) . \end{align*}
   
Example 9:    ■
End of Example 9

Recall that a linear transformation T : ℝ² ⇾ ℝ² is uniquely determined by taking a line segment (or its endpoints) in the domain and mapping it into another line segment (or its endpoints) in the codomain. This is no longer the case for an affine map f : 𝔸² → 𝔸². It turns out that an affine transformation f : 𝔸² → 𝔸² is uniquely determined by taking a triangle (or three non-collinear points) in the domain and mapping it into another triangle (or three points) in the codomain. To see how this works, let the triangle in the domain be defined by its three vertices

\[ T_1 = \left\{ \left( x_1 , y_1 \right) , \ \left( x_2 , y_2 \right) , \ \left( x_3 , y_3 \right) \right\} . \]
Similarly, suppose these points are mapped into
\[ T_2 = \left\{ \left( z_1 , w_1 \right) , \ \left( z_2 , w_2 \right) , \ \left( z_3 , w_3 \right) \right\} . \]
Then
\[ f \left( T_1 \right) = T_2 \qquad \iff \qquad f\left( x_j , y_j \right) = \left( z_j , w_j \right) , \quad j=1,2,3. \]
Given that f is determined by formula f(x) = A x + b, with A and b given respectively by
\[ \mathbf{A} = \begin{bmatrix} a&b \\ c & d \end{bmatrix} , \qquad \mathbf{b} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} , \]
we get the following system of equations:
\[ \begin{bmatrix} a&b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} x_j \\ y_j \end{bmatrix} + \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} z_j \\ w_j \end{bmatrix} , \qquad j=1,2,3. \]
This gives the new single matrix equation
\[ \begin{bmatrix} x_1 & y_1 & 0&0& 1 & 0 \\ 0&0& x_1 & y_1 & 0&1 \\ x_2 & y_2 & 0&0&1&0 \\ 0&0& x_2 & y_2 &0&1 \\ x_3&y_3 & 0&0&1&0 \\ 0&0& x_3 & y_3 & 0&1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} z_1 \\ w_1 \\ z_2 \\ w_2 \\ z_3 \\ w_3 \end{bmatrix} . \]
This matrix/vector equation can be solved for our six unknowns {𝑎, b, c, d, α, β}, which determine the affine map uniquely. Therefore, an affine transformation of a two-dimensional affine space has six degrees of freedom.
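Because the unknowns (a, b, α) appear only in the equations for z_j and (c, d, β) only in those for w_j, the 6 × 6 system can also be viewed as two 3 × 3 systems sharing one coefficient matrix. We call this matrix M here, a name not used elsewhere in the text:
\[ \mathbf{M} \begin{bmatrix} a \\ b \\ \alpha \end{bmatrix} = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix} , \qquad \mathbf{M} \begin{bmatrix} c \\ d \\ \beta \end{bmatrix} = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} , \qquad \mathbf{M} = \begin{bmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{bmatrix} . \]
Since
\[ \det \mathbf{M} = \left( x_2 - x_1 \right) \left( y_3 - y_1 \right) - \left( x_3 - x_1 \right) \left( y_2 - y_1 \right) \]
is twice the signed area of the triangle with the given vertices, both systems have a unique solution exactly when the three points are not collinear.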
Theorem 5: Given two ordered sets of three non-collinear points each, there exists a unique affine transformation f mapping one set onto the other.
We first show that the special (ordered) triple of vectors, \[ \left\{ {\bf 0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} , \quad {\bf i} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} , \quad {\bf j} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} , \] can be mapped by an appropriate affine transformation to an arbitrary (ordered) triple of vectors \[ \left\{ \mathbf{p} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix} , \quad \mathbf{q} = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} , \quad \mathbf{r} = \begin{bmatrix} r_1 \\ r_2 \end{bmatrix} \right\} , \] which corresponds to three non-collinear points. Let \[ \mathbf{A} = \begin{bmatrix} q_1 - p_1 & r_1 - p_1 \\ q_2 - p_2 & r_2 - p_2 \end{bmatrix} \quad\mbox{and} \quad \mathbf{b} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix} . \] One can immediately verify that \[ \mathbf{A}\,{\bf 0} + {\bf b} = \mathbf{p} , \quad \mathbf{A}\,{\bf i} + {\bf b} = \mathbf{q} , \quad \mathbf{A}\,{\bf j} + {\bf b} = \mathbf{r} . \] Note that the columns of A correspond to the vectors q − p and r − p. Since the points (p₁, p₂), (q₁, q₂), and (r₁, r₂) are non-collinear, the vectors q − p and r − p are non-parallel. Hence, the determinant of A is nonzero. Thus, A is invertible, and f(x) = A x + b is an invertible affine transformation.

Let (p, q, r) and (p₁, q₁, r₁) be two ordered triples of position vectors representing two arbitrary triples of non-collinear points. Using the result we have just proven, there exist affine transformations f and g mapping the special triple {0, i, j} to {p, q, r} and to {p₁, q₁, r₁}, respectively. Then g ∘ f⁻¹ is an affine transformation (by Theorem 1 and Corollary 1) that maps {p, q, r} into {p₁, q₁, r₁}. The uniqueness of this transformation is left to you.

   
Example 10: Let us consider two sets of points on the plane ℝ²: \[ \begin{split} T_1 &= \left\{ \left( -5, -3 \right) , \ \left( 2, 10 \right) , \ \left( 3, -5 \right) \right\} , \\ T_2 &= \left\{ \left( -4, 1 \right) , \ \left( -3, 11 \right) , \ \left( 1, 9 \right) \right\} . \end{split} \] We want to find an affine transformation that maps points from T₁ into T₂. So we use Mathematica.
X1 ={-5, 2, 3}; Y1 = {-3, 10, -5};
X2 = {-4, -3, 1}; Y2 = {1, 11, 9};
R = {{X2[[1]]}, {Y2[[1]]}, {X2[[2]]}, {Y2[[2]]}, {X2[[3]]}, {Y2[[3]]}};
d = {{X1[[1]], Y1[[1]], 0, 0, 1, 0}, {0, 0, X1[[1]], Y1[[1]], 0, 1}, {X1[[2]], Y1[[2]], 0, 0, 1, 0}, {0, 0, X1[[2]], Y1[[2]], 0, 1}, {X1[[3]], Y1[[3]], 0, 0, 1, 0}, {0, 0, X1[[3]], Y1[[3]], 0, 1}};
d // MatrixForm
\( \displaystyle \quad \begin{pmatrix} -5 &-3& 0&0&1&0 \\ 0&0&-5&-3&0&1 \\ 2&10&0&0&1&0 \\ 0&0&2&10&0&1 \\ 3&-5&0&0&1&0 \\ 0&0&3&-5&0&1 \end{pmatrix} \)
AB = Inverse[d] . R
{{67/118}, {-(27/118)}, {62/59}, {12/59}, {-(109/59)}, {405/59}}
A = {{AB[[1, 1]], AB[[2, 1]]}, { AB[[3, 1]], AB[[4, 1]]}};
% // MatrixForm
\( \displaystyle \quad \begin{pmatrix} \frac{67}{118} & -\frac{27}{118} \\ \frac{62}{59} & \frac{12}{59} \end{pmatrix} \)
B = {{AB[[5, 1]]}, {AB[[6, 1]]}};
% // MatrixForm
\( \displaystyle \quad \begin{pmatrix} - \frac{109}{59} \\ \frac{405}{59} \end{pmatrix} \)
A . {{X1[[1]]}, {Y1[[1]]}} + B;
% // MatrixForm
\( \displaystyle \quad \begin{pmatrix} -4 \\ 1 \end{pmatrix} \)
A . {{X1[[2]]}, {Y1[[2]]}} + B;
% // MatrixForm
\( \displaystyle \quad \begin{pmatrix} -3 \\ 11 \end{pmatrix} \)
A . {{X1[[3]]}, {Y1[[3]]}} + B;
% // MatrixForm
\( \displaystyle \quad \begin{pmatrix} 1 \\ 9 \end{pmatrix} \)
The last three Mathematica commands simply verify that the vertices (xₖ, yₖ) of triangle T₁ are sent by x ↦ A x + B to their corresponding counterparts (zₖ, wₖ) of T₂.
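
The same matrix can be cross-checked by hand with the construction from the proof of Theorem 5. Let f and g be the affine maps sending {0, i, j} to T₁ and to T₂, respectively; the names A_f and A_g for their matrices are ours. Then
\[ \mathbf{A}_f = \begin{bmatrix} 7 & \phantom{-}8 \\ 13 & -2 \end{bmatrix} , \qquad \mathbf{A}_g = \begin{bmatrix} 1 & 5 \\ 10 & 8 \end{bmatrix} , \qquad \mathbf{A}_g \mathbf{A}_f^{-1} = \frac{1}{118} \begin{bmatrix} 1 & 5 \\ 10 & 8 \end{bmatrix} \begin{bmatrix} 2 & \phantom{-}8 \\ 13 & -7 \end{bmatrix} = \begin{bmatrix} \frac{67}{118} & -\frac{27}{118} \\ \frac{62}{59} & \phantom{-}\frac{12}{59} \end{bmatrix} , \]
which agrees with the matrix A found by Mathematica. The translation part follows from the requirement that (−5, −3) be sent to (−4, 1):
\[ \begin{bmatrix} -4 \\ 1 \end{bmatrix} - \mathbf{A}_g \mathbf{A}_f^{-1} \begin{bmatrix} -5 \\ -3 \end{bmatrix} = \begin{bmatrix} -4 \\ 1 \end{bmatrix} - \begin{bmatrix} -\frac{127}{59} \\ -\frac{346}{59} \end{bmatrix} = \begin{bmatrix} -\frac{109}{59} \\ \phantom{-}\frac{405}{59} \end{bmatrix} . \]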

   ■
End of Example 10

Homogeneous Coordinates

An n-dimensional affine space (A, V, d) is specified by the vector space V and the inhabited set of points A. The n-dimensional vector space V is completely described by providing an ordered basis for it. From the definition of an affine space, it is known that for every pair of points in A there exists a vector in V that “connects” them. Once a particular point O is selected from A, every other point in A can be obtained by adding a vector from V to O. Therefore, supplying an ordered basis for V and a single point in A is sufficient to specify the affine space A.

A frame for the n-dimensional affine space 𝔸 = (A, V, d) consists of the set of basis vectors e₁, e₂, … , en for V and a point O from A. The point O locates the origin of the frame within A. We use the notation ϕ = (e₁, e₂, … , en, O) to denote a frame. Every vector u in V can be expressed as \[ \mathbf{u} = c_1 \mathbf{e}_1 + c_2 \mathbf{e}_2 + \cdots + c_n \mathbf{e}_n , \] and every point P in A can be written as \[ P = k_1 \mathbf{e}_1 + k_2 \mathbf{e}_2 + \cdots + k_n \mathbf{e}_n + O. \]
Specifying a frame for an affine space is equivalent to providing a coordinate system for it; once a frame has been determined, any point or vector in the affine space can be described by a set of scalar values. To do this in matrix notation, however, the following definition must be made, which is often specified as a third axiom in the definition of an affine space:
\[ 0 \cdot P = O \qquad \mbox{and} \qquad 1 \cdot P = P \in \mathbb{A} . \]

We start our demonstration of affine transformations with the planar case, so we choose a frame ϕ = (e₁, e₂, O) for an affine space 𝔸 = (A, ℝ², d). Any vector u in ℝ² can be written in either column form or row form:

\[ \mathbf{u} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & O \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ 0 \end{bmatrix} = \left( \alpha_1 \,:\,\alpha_2 \, : \,0 \right) \begin{pmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ O \end{pmatrix} . \]
Hence, the column vector \( \displaystyle \quad \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ 0 \end{bmatrix} \quad \) and the corresponding row vector \( \displaystyle \quad \left( \alpha_1 \,:\, \alpha_2 \, : \, 0 \right) \quad \) are coordinate vectors written in column form and in row form, respectively. In projective geometry, coordinate vectors are traditionally written in row form with components separated by ":". Similarly, a point P in the inhabited set A can be expressed as
\[ P = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & O \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ 1 \end{bmatrix} = \left( \beta_1 \, : \, \beta_2 \,:\, 1 \right) \begin{pmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ O \end{pmatrix} . \]
Similar expressions are valid for three-dimensional affine spaces, and in general they extend to the arbitrary n-dimensional case. Since there is no standard notation for affine coordinates, some authors prefer column notation while others use row form. Therefore, we place both notations together and let the reader decide which one is preferable.
Example 11: Given the frame \[ \phi = \left( \begin{bmatrix} \phantom{-}2 \\ -3 \end{bmatrix} , \ \begin{bmatrix} 1 \\ 4 \end{bmatrix} , \ \begin{pmatrix} 6 & 2 \end{pmatrix} \right) , \] determine the point Q that has the coordinates (-3, 2, 1).

Solution: We use the coordinates to form a linear combination of the vectors in the frame that we then add to the frame’s origin. Because we are adding a vector to a point the result will indeed be a point. \[ Q = -3 \begin{bmatrix} \phantom{-}2 \\ -3 \end{bmatrix} + 2 \begin{bmatrix} 1 \\ 4 \end{bmatrix} + \begin{pmatrix} 6 & 2 \end{pmatrix} = \begin{bmatrix} 2 \\ 19 \end{bmatrix} . \]

-3*{2, -3} + 2 *{1, 4} + {6, 2}
{2, 19}
   ■
End of Example 11
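
The computation in Example 11 can also be phrased as the matrix product used in the formulas above, with the frame stored as a matrix whose columns are e₁, e₂, and O. The following is only a minimal sketch; the names frame and toPlane are ours and are not built-in Mathematica functions.

```mathematica
(* columns: basis vectors e1, e2 and the origin O of the frame (illustrative names) *)
frame = Transpose[{{2, -3}, {1, 4}, {6, 2}}];
toPlane[coords_] := frame . coords

toPlane[{-3, 2, 1}]   (* last coordinate 1, a point:  {2, 19} *)
toPlane[{-3, 2, 0}]   (* last coordinate 0, a vector: {-4, 17} *)
```

The last coordinate decides whether the origin of the frame contributes, which is exactly the distinction between points and vectors made above.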

Often it is desirable to find the coordinates of a point relative to one frame given the coordinates of that point relative to another frame. This operation, called a change of frames, is analogous to the change of basis operation in vector spaces. Let β = (v₁, v₂, v₃, O) and ϕ = (e₁, e₂, e₃, Q) be two frames for the 3-dimensional affine space 𝔸. To find the coordinate vector of an arbitrary point P in frame ϕ, denoted by ⟦P⟧ϕ, given ⟦P⟧β = [α₁, α₂, α₃, 1], we must first write the basis vectors and origin of β in terms of the basis vectors and origin of ϕ:

\begin{align*} \mathbf{v}_1 &= a_1 \mathbf{e}_1 + b_1 \mathbf{e}_2 + c_1 \mathbf{e}_3 , \\ \mathbf{v}_2 &= a_2 \mathbf{e}_1 + b_2 \mathbf{e}_2 + c_2 \mathbf{e}_3 , \\ \mathbf{v}_3 &= a_3 \mathbf{e}_1 + b_3 \mathbf{e}_2 + c_3 \mathbf{e}_3 , \\ O &= a_4 \mathbf{e}_1 + b_4 \mathbf{e}_2 + c_4 \mathbf{e}_3 + Q . \end{align*}
Then
\begin{align*} [\![ P ]\!]_{\phi} &= [\![ \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \alpha_3 \mathbf{v}_3 ]\!]_{\phi} + [\![ O ]\!]_{\phi} \\ &= \alpha_1 [\![ \mathbf{v}_1 ]\!]_{\phi} + \alpha_2 [\![ \mathbf{v}_2 ]\!]_{\phi} + \alpha_3 [\![ \mathbf{v}_3 ]\!]_{\phi} + [\![ O ]\!]_{\phi} \\ &= \begin{bmatrix} [\![ \mathbf{v}_1 ]\!]_{\phi} & [\![ \mathbf{v}_2 ]\!]_{\phi} & [\![ \mathbf{v}_3 ]\!]_{\phi} & [\![ O ]\!]_{\phi} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ 1 \end{bmatrix} \\ &= \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ 0&0&0&1 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ 1 \end{bmatrix} , \end{align*}
where the 4 × 4 matrix in the last line is the change of frame matrix.
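
The construction above can be sketched in Mathematica for any dimension. In this sketch a frame is stored as the pair {list of basis vectors, origin}; the helper changeOfFrame and the two planar frames are hypothetical and serve only as an illustration.

```mathematica
(* illustrative helper: matrix that converts beta-coordinates into phi-coordinates *)
changeOfFrame[{bBasis_, bOrigin_}, {pBasis_, pOrigin_}] := Module[
  {toPhi = Inverse[Transpose[pBasis]], n = Length[bOrigin]},
  ArrayFlatten[{
    {toPhi . Transpose[bBasis], Transpose[{toPhi . (bOrigin - pOrigin)}]},
    {ConstantArray[0, {1, n}], {{1}}}}]]

beta = {{{1, 1}, {-1, 1}}, {2, 0}};   (* hypothetical frame: v1, v2, origin *)
phi  = {{{2, 0}, {0, 1}}, {0, 1}};    (* hypothetical frame: e1, e2, origin *)

m = changeOfFrame[beta, phi]
(* {{1/2, -1/2, 1}, {1, 1, -1}, {0, 0, 1}} *)
m . {1, 2, 1}
(* {1/2, 2, 1} *)
```

As a check, the point with β-coordinates (1, 2, 1) is 1·(1, 1) + 2·(-1, 1) + (2, 0) = (1, 3), and the ϕ-coordinates (1/2, 2, 1) give (1/2)·(2, 0) + 2·(0, 1) + (0, 1) = (1, 3) as well.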
Example 12: Let β and ϕ be two frames for the same affine space such that \[ \beta = \left( \begin{bmatrix} 3 \\ 2 \end{bmatrix} , \ \begin{bmatrix} 1 \\ 4 \end{bmatrix} , \ \left( 5, \ 2 \right) \right) , \qquad \phi = \left( \begin{bmatrix} 7 \\ 3 \end{bmatrix} ,\ \begin{bmatrix} -3 \\ -2 \end{bmatrix} ,\ \left( 3,\ -2 \right) \right) . \] If ⟦Q⟧β = (-3, 1, 1), then find ⟦Q⟧ϕ.

Solution: The basis vectors in β can be written as \begin{align*} \begin{bmatrix} 3 \\ 2 \end{bmatrix} &= 0 \cdot \begin{bmatrix} 7 \\ 3 \end{bmatrix} - \begin{bmatrix} -3 \\ -2 \end{bmatrix} + 0 \cdot \left( 3,\ -2 \right) , \\ \begin{bmatrix} 1 \\ 4 \end{bmatrix} &= -2 \cdot \begin{bmatrix} 7 \\ 3 \end{bmatrix} -5\cdot \begin{bmatrix} -3 \\ -2 \end{bmatrix} + 0 \cdot \left( 3,\ -2 \right) , \\ \left( 5, \ 2 \right) &= - \frac{8}{5}\cdot \begin{bmatrix} 7 \\ 3 \end{bmatrix} -\frac{22}{5} \cdot \begin{bmatrix} -3 \\ -2 \end{bmatrix} + 1 \cdot \left( 3,\ -2 \right) , \end{align*}

Inverse[{{7, -3}, {3, -2}}] . {3, 2}
{0, -1}
Inverse[{{7, -3}, {3, -2}}] . {1, 4}
{-2, -5}
Inverse[{{7, -3}, {3, -2}}] . ({5, 2} - {3, -2})  (* origin of β measured from the origin of ϕ *)
{-(8/5), -(22/5)}
so the change of frame matrix M, whose columns are these coordinate vectors, is \[ \mathbf{M} = \begin{bmatrix} 0&-2& -\frac{8}{5} \\ -1&-5& -\frac{22}{5} \\ 0&0&1 \end{bmatrix} . \] Knowing M we can compute ⟦Q⟧ϕ: \[ [\![ Q ]\!]_{\phi} = \begin{bmatrix} 0&-2& -\frac{8}{5} \\ -1&-5& -\frac{22}{5} \\ 0&0&1 \end{bmatrix} \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -\frac{18}{5} \\ - \frac{32}{5} \\ 1 \end{bmatrix} . \]
{{0,-2,-8/5}, {-1,-5,-22/5}, {0,0,1}} . {-3, 1, 1}
{-(18/5), -(32/5), 1}
   ■
End of Example 12
August Möbius

Compared to Euclidean geometry, projective geometry has a different setting and has extra points for a given dimension. This allows translation to be described as a linear transformation, so that all the transformations we would like to effect can be represented by matrix multiplication. Recall that a translation is not a linear transformation of a vector space. The way out of this dilemma is to turn the n-dimensional problem into an (n+1)-dimensional problem by using homogeneous coordinates, introduced by the German mathematician August Ferdinand Möbius (1790--1868) in his 1827 work Der barycentrische Calcul.

The real projective plane ℙ² can be given in terms of equivalence classes. For non-zero elements of ℝ³, define (x₁, y₁, z₁) ~ (x₂, y₂, z₂) to mean there is a non-zero λ so that (x₁, y₁, z₁) = (λx₂, λy₂, λz₂). Then ~ is an equivalence relation and the projective plane can be defined as the set of equivalence classes of ℝ³ ∖ {0}. If (x, y, z) is one of the elements of the equivalence class p, then these are taken to be homogeneous coordinates of p. The homogeneous coordinates or projective coordinates of the point are denoted with colons, either (x:y:z) or [x:y:z].

Homogeneous coordinates for an n-dimensional space consist of tuples with n+1 coordinates, where the extra coordinate is kept at a special value.

When z ≠ 0, the point [x:y:z] represents the point (x/z, y/z) in the Euclidean plane ℝ². Homogeneous coordinates of the form (x, y, 0) do not correspond to a point in the Cartesian plane. Instead, they correspond to the unique point at infinity in the direction (x, y). Hence, the projective plane ℙ² can be seen as the plane ℝ² plus all the points at infinity, each of which lies along a different direction. The plane ℙ² also makes sense of the notion that two parallel lines intersect at infinity.
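
A small sketch may help to make these statements concrete. The helper toCartesian is a hypothetical name, and the second part rests on an assumption not used elsewhere in this section: a line a x + b y + c = 0 is stored as the triple {a, b, c}, in which case the cross product of two such triples gives homogeneous coordinates of the common point of the two lines.

```mathematica
(* illustrative helper: Cartesian point represented by [x : y : z] with z != 0 *)
toCartesian[{x_, y_, z_}] /; z != 0 := {x, y}/z

toCartesian[{2, 4, 2}]      (* {1, 2} *)
toCartesian[{-3, -6, -3}]   (* {1, 2}: proportional triples give the same point *)

(* parallel lines x + y = 1 and x + y = 2, stored as {a, b, c} *)
Cross[{1, 1, -1}, {1, 1, -2}]
(* {-1, 1, 0}: last coordinate 0, a point at infinity in the direction (-1, 1) *)
```

The direction (-1, 1) is the common direction of the two lines, in agreement with the idea that parallel lines meet at the point at infinity along their direction.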

A projective transformation does not, in general, preserve parallelism, length, or angle, but it still preserves collinearity and incidence. A projective transformation can be represented as the transformation of an arbitrary quadrangle (i.e., a system of four points) into another one.

Example 13:    ■
End of Example 13

Affine Matrices

Suppose that 𝔸 and 𝔹 are n-dimensional and m-dimensional affine spaces, respectively. Let α = (a₁, a₂, … , an, Oα) and β = (b₁, b₂, … , bm, Oβ) be frames for 𝔸 and 𝔹. Suppose further that there exists an affine transformation F : 𝔸 → 𝔹, so that if P is a point in the inhabited set A, then Q = F(P) is a point in the inhabited set B. Finally, let ⟦P⟧α = [α₁, α₂, … , αn, 1]. Then
\begin{align*} \mathbf{Q} &= F(\mathbf{P}) \\ &= F \left( \alpha_1 \mathbf{a}_1 + \alpha_2 \mathbf{a}_2 + \cdots + \alpha_n \mathbf{a}_n + O_{\alpha} \right) \\ &= \alpha_1 F \left( \mathbf{a}_1 \right) + \alpha_2 F \left( \mathbf{a}_2 \right) + \cdots + \alpha_n F \left( \mathbf{a}_n \right) +F \left( O_{\alpha} \right) \end{align*}
where the last step is possible because of properties (c), (d) and (e) of theorem 5. Hence,
\begin{align*} [\![\mathbf{Q} ]\!]_{\beta} &= \left[ \alpha_1 F \left( \mathbf{a}_1 \right) + \alpha_2 F \left( \mathbf{a}_2 \right) + \cdots + \alpha_n F \left( \mathbf{a}_n \right) + F \left( O_{\alpha} \right) \right]_{\beta} \\ &= \begin{bmatrix} \left[ F \left( \mathbf{a}_1 \right) \right]_{\beta} & \cdots & \left[ F \left( \mathbf{a}_n \right) \right]_{\beta} & \left[ F \left( O_{\alpha} \right) \right]_{\beta} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ 1 \end{bmatrix} \\ &= \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} & a_{1, n+1} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} & a_{2, n+1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} & a_{m, n+1} \\ 0&0& \cdots & 0 & 1 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ 1 \end{bmatrix} \end{align*}
since
\[ \left[ F \left( \mathbf{a}_1 \right) \right]_{\beta} = \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{m,1} \\ 0 \end{bmatrix} , \quad \left[ F \left( \mathbf{a}_2 \right) \right]_{\beta} = \begin{bmatrix} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{m,2} \\ 0 \end{bmatrix} , \ \cdots , \quad \left[ F \left( O_{\alpha} \right) \right]_{\beta} = \begin{bmatrix} a_{1, n+1} \\ a_{2, n+1} \\ \vdots \\ a_{m, n+1} \\ 1 \end{bmatrix} . \]
The matrix
\[ \mathbf{M} = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} & a_{1, n+1} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} & a_{2, n+1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} & a_{m, n+1} \\ 0&0& \cdots & 0 & 1 \end{bmatrix} \]
is the standard matrix of the affine transformation. In the common cases of two- and three- dimensional affine spaces M has the form
\[ \mathbf{M} = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ 0&0&1 \end{bmatrix} , \quad \mathbf{M} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ 0&0&0&1 \end{bmatrix} \]
In case of row vectors, these matrices become
\[ \mathbf{M} = \begin{pmatrix} a_1 & a_2 & 0 \\ b_1 & b_2 & 0 \\ c_1 & c_2 &1 \end{pmatrix} , \quad \mathbf{M} = \begin{pmatrix} a_1 & a_2 & a_3 & 0 \\ b_1 & b_2 & b_3 & 0 \\ c_1 & c_2 & c_3 & 0 \\ d_1 & d_2 &d_3 &1 \end{pmatrix} \]
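
The two conventions can be compared directly. The sketch below assumes that the row-form matrix is taken to be the transpose of the column-form one (so its entries are relabeled relative to the display above); the names mCol, mRow, and p are illustrative.

```mathematica
mCol = {{a1, a2, a3}, {b1, b2, b3}, {0, 0, 1}};   (* acts on column vectors from the left *)
mRow = Transpose[mCol];                           (* acts on row vectors from the right *)
p = {x, y, 1};

Expand[mCol . p - p . mRow]
(* {0, 0, 0}: both conventions produce the same coordinates *)
```

In Mathematica a flat list such as p serves as both a row and a column vector, so only the side on which the matrix stands changes.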
   

Example 14: Let 𝔸 = (A, ℝ², d) be an affine space with frame α = [e₁, e₂, O], where O = (0, 0). Let T : 𝔸 → 𝔸 be defined as T(P) = P + t, where t = [Δx, Δy]. Find the 3 × 3 matrix T that implements this transformation.

Solution: Since only frame α is used, we have \[ \left[ T(\mathbf{e}_1 ) \right]_{\alpha} = [\![ \mathbf{e}_1 ]\!] = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad \mbox{and} \qquad \left[ T(\mathbf{e}_2 ) \right]_{\alpha} = [\![ \mathbf{e}_2 ]\!] = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \] while \[ \left[ T(O) \right]_{\alpha} = \left[ O + \mathbf{t} \right]_{\alpha} = \left[ O \right]_{\alpha} + \left[ {\bf t} \right]_{\alpha} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} \Delta x \\ \Delta y \\ 0 \end{bmatrix} = \begin{bmatrix} \Delta x \\ \Delta y \\ 1 \end{bmatrix} . \] Thus the matrix T is given by \[ \mathbf{T} = \begin{bmatrix} 1 & 0 & \Delta x \\ 0&1& \Delta y \\ 0&0&1 \end{bmatrix} . \]    ■

End of Example 14
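
The translation matrix of Example 14 can be reproduced with the built-in functions TranslationTransform and TransformationMatrix. This is a short sketch in which dx and dy stand for Δx and Δy, and the symbols x, y, u, v are arbitrary.

```mathematica
t = TransformationMatrix[TranslationTransform[{dx, dy}]]
(* {{1, 0, dx}, {0, 1, dy}, {0, 0, 1}} *)

t . {x, y, 1}   (* a point is shifted:    {dx + x, dy + y, 1} *)
t . {u, v, 0}   (* a vector is unchanged: {u, v, 0} *)
```

The last column of the matrix acts only on coordinate triples ending in 1, which is why translation moves points but leaves vectors alone.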

Upon embedding the n-dimensional case into the (n+1)-dimensional one, we can define an affine transformation as a regular linear transformation via matrix/vector multiplication. Since the matrix form is so handy for building up complicated transforms from simpler ones, it is very useful to be able to represent all affine transforms by matrices.

We also extend our augmented m-by-(n+1) matrix [A | b] from Eq.\eqref{EqAffine.1} into an (m+1) × (n+1) matrix

\begin{equation} \label{EqAffine.3} \left[ \mathbf{A} \mid \mathbf{b} \right] \quad \Longrightarrow \quad {\bf A}_b = \begin{bmatrix} \mathbf{A} & \mathbf{b} \\ 0 \cdots 0 & 1 \end{bmatrix} \quad\mbox{or} \quad {\bf A}_b = \begin{pmatrix} \mathbf{A} & \mathbf{0} \\ \mathbf{b} & 1 \end{pmatrix} , \end{equation}
where the first form acts on column vectors and the second on row vectors. Either extension defines a map 𝔽m×(n+1) → 𝔽(m+1)×(n+1). On the subset V ⊂ 𝔽(n+1)×1 consisting of vectors with last component 1, we recover the affine maps
\begin{equation} \label{EqAffine.4} \begin{bmatrix} \mathbf{A} & \mathbf{b} \\ 0 \cdots 0 & 1 \end{bmatrix} \begin{pmatrix} \mathbf{x} \\ 1 \end{pmatrix} = \begin{bmatrix} {\bf A}\,{\bf x} + \mathbf{b} \\ 1 \end{bmatrix} . \end{equation}
Since V does not contain the zero vector, it is not a vector subspace. But if V₀ denotes the subspace consisting of vectors having last component 0, then
\begin{equation} \label{EqAffine.5} V = \left\{ \mathbf{t} + \mathbf{v} \, | \ \mathbf{v} \in V_0 \right\} = \mathbf{t} + V_0 , \end{equation}
where t denotes any vector having last component 1. We view V as a translate of a vector subspace. Any subset of a vector space that is obtained by translation from a vector subspace is called an affine subspace. For example, the set of solutions of a linear system is an affine subspace: it is a translate of the subspace of solutions of the associated homogeneous system.
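
The remark about linear systems can be checked on a small example. The system below is hypothetical; LinearSolve returns one particular solution and NullSpace a basis of the associated homogeneous system.

```mathematica
a = {{1, 1, 1}, {1, -1, 0}};  b = {3, 1};   (* hypothetical system a . x == b *)
x0 = LinearSolve[a, b];       (* one particular solution *)
ns = NullSpace[a];            (* basis of the solutions of a . x == 0 *)

(* every translate x0 + c v of the null space still solves the system *)
Simplify[a . (x0 + c First[ns]) - b]
(* {0, 0} *)
```

Here the solution set is the line x0 + c v, a translate of the one-dimensional null space.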

Composition of affine maps is expressed by the following formula:

\begin{equation} \label{EqAffine.6} {\bf A}_b \,{\bf B}_c = \begin{bmatrix} \mathbf{A} & \mathbf{b} \\ 0 \cdots 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} \mathbf{B} & \mathbf{c} \\ 0 \cdots 0 & 1 \end{bmatrix} = \begin{bmatrix} \mathbf{A}\,\mathbf{B} & \mathbf{A}\,\mathbf{c} + \mathbf{b} \\ 0 \cdots 0 & 1 \end{bmatrix} . \end{equation}
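
Formulas \eqref{EqAffine.4} and \eqref{EqAffine.6} can be checked numerically. The helper affineMatrix is a hypothetical name for the column-vector form of the extended matrix, and the numerical data are arbitrary.

```mathematica
(* illustrative helper: block matrix {{A, b}, {0 ... 0, 1}} built from A and b *)
affineMatrix[a_, b_] := Append[
  MapThread[Append, {a, b}],
  Append[ConstantArray[0, Length[First[a]]], 1]]

a  = {{1, 2}, {3, 4}};   b = {5, 6};
bm = {{0, 1}, {-1, 0}};  c = {7, 8};

affineMatrix[a, b] . {2, -1, 1} == Append[a . {2, -1} + b, 1]
(* True *)
affineMatrix[a, b] . affineMatrix[bm, c] == affineMatrix[a . bm, a . c + b]
(* True *)
```

Note the order: the product A_b B_c applies B_c first and A_b second, which is why the translation part of the product is A c + b.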
   

Example 14:    ■

End of Example 14

2D Affine Transformations

Matrix Representation of 2D Affine Transformations:

Translation:

\[ \mathbf{T} = \begin{bmatrix} 1&0&\Delta x \\ 0&1&\Delta y \\ 0&0&1 \end{bmatrix} . \]

Scale:

\[ \mathbf{S} = \begin{bmatrix} s_x & 0 & 0 \\ 0& s_y & 0 \\ 0&0&1 \end{bmatrix} . \]

Reflection in the plane is specified by a line and maps points by flipping the plane about this line. For instance, negating the x-coordinate reflects the plane about the y-axis:

\[ \mathbf{F}_x = \begin{bmatrix} -1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} . \]

Rotation in positive (counterclockwise) direction by angle θ (in radians):

\[ \mathbf{R}[\theta ] = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0&0&1 \end{bmatrix} . \]

Shear:

\[ \mathbf{H} = \begin{bmatrix} 1&a&0 \\ b&1&0 \\ 0&0&1 \end{bmatrix} . \]
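
To see what these matrices do, one can apply them to the corners of the unit square. The parameter values and the helper imageOf below are chosen only for illustration.

```mathematica
scale   = {{2, 0, 0}, {0, 3, 0}, {0, 0, 1}};     (* s_x = 2, s_y = 3 *)
reflect = {{-1, 0, 0}, {0, 1, 0}, {0, 0, 1}};    (* the matrix F_x above *)
shear   = {{1, 1/2, 0}, {0, 1, 0}, {0, 0, 1}};   (* a = 1/2, b = 0 *)

corners = {{0, 0, 1}, {1, 0, 1}, {1, 1, 1}, {0, 1, 1}};   (* unit square, homogeneous *)
imageOf[m_] := Map[Most[m . #] &, corners]                (* drop the trailing 1 *)

imageOf[scale]     (* {{0, 0}, {2, 0}, {2, 3}, {0, 3}} *)
imageOf[reflect]   (* {{0, 0}, {-1, 0}, {-1, 1}, {0, 1}} *)
imageOf[shear]     (* {{0, 0}, {1, 0}, {3/2, 1}, {1/2, 1}} *)
```

The resulting corner lists can be passed to Polygon inside Graphics to draw the transformed squares.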
   
Example 15: http://igg.unistra.fr/People/seo/2%20affine%20transformation.pdf    ■
End of Example 15
   
Example 16: http://igg.unistra.fr/People/seo/2%20affine%20transformation.pdf    ■
End of Example 16
   
Example 17: http://igg.unistra.fr/People/seo/2%20affine%20transformation.pdf    ■
End of Example 17
   
Example 18:    ■
End of Example 18

Rotation: In its most general form, rotation is defined to take place about some fixed point. We will consider the simplest case where the fixed point is the origin of the coordinate frame.    
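
A rotation about a fixed point other than the origin can be assembled from the matrices already listed: translate the fixed point to the origin, rotate, and translate back. The sketch below uses a hypothetical fixed point c = (1, 1) and compares the result with the built-in RotationTransform; the helper names tr and rot are ours.

```mathematica
tr[{dx_, dy_}] := {{1, 0, dx}, {0, 1, dy}, {0, 0, 1}}
rot[t_] := {{Cos[t], -Sin[t], 0}, {Sin[t], Cos[t], 0}, {0, 0, 1}}

c = {1, 1};                            (* hypothetical fixed point *)
m = tr[c] . rot[Pi/2] . tr[-c]
(* {{0, -1, 2}, {1, 0, 0}, {0, 0, 1}} *)
m . {2, 1, 1}                          (* {1, 2, 1} *)
m == TransformationMatrix[RotationTransform[Pi/2, c]]
(* True *)
```

The fixed point itself is unchanged, since m . {1, 1, 1} returns {1, 1, 1}.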

Example 19:    ■
End of Example 19

3D Affine Transformations

Now, we can extend all of the previously discussed ideas to 3D in the following way. First, we convert every 3D point P(x, y, z) to homogeneous coordinates, written in either row form or column form:
\[ \left( x\, : \, y \, : \, z \, : \, 1 \right) \in \mathbb{R}^{1 \times 4} \qquad \mbox{or} \qquad \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \in \mathbb{R}^{4 \times 1} . \]
The following matrices constitute the basic affine transforms in 3D, expressed in homogeneous form:
\[ \mbox{Translate: } \ \begin{bmatrix} 1&0&0& \Delta x \\ 0&1&0 &\Delta y \\ 0&0&1& \Delta z \\ 0&0&0&1 \end{bmatrix} , \qquad \mbox{Scale: } \ \begin{bmatrix} s_x & 0&0&0 \\ 0&s_y & 0&0 \\ 0&0&s_z & 0 \\ 0&0&0&1 \end{bmatrix} , \]
\[ \mbox{Shear: } \ \begin{bmatrix} 1 & h_{xy} & h_{xz} & 0 \\ h_{yx} & 1 & h_{yz} & 0 \\ h_{zx} & h_{zy} & 1 & 0 \\ 0&0&0&1 \end{bmatrix} . \]
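
Because these transforms are composed by matrix multiplication, their order matters. The following sketch, with arbitrarily chosen numbers, applies a translation and a uniform scaling to the same point in both orders.

```mathematica
t = {{1, 0, 0, 1}, {0, 1, 0, 2}, {0, 0, 1, 3}, {0, 0, 0, 1}};   (* translate by (1, 2, 3) *)
s = DiagonalMatrix[{2, 2, 2, 1}];                                 (* scale by 2 in x, y, z *)
p = {1, 1, 1, 1};                                                 (* the point (1, 1, 1) *)

t . s . p   (* scale first, then translate: {3, 4, 5, 1} *)
s . t . p   (* translate first, then scale: {4, 6, 8, 1} *)
```

With column vectors, the matrix written closest to the point acts first.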
   
Example 20:    ■
End of Example 20

Reflection in 3-space is specified by a plane and flips points in space about this plane. In this case, reflection is just a special case of scaling in which the scale factor is negative. A common simple version of this is when the plane about which the reflection is performed is one of the coordinate planes (corresponding to x = 0, y = 0, or z = 0).

For example, to reflect points about the xz-coordinate plane (that is, the plane y = 0), we can scale the y-coordinate by −1. Using the scaling matrix above, we have the following transformation matrix:

\[ \mathbf{F}_y = \begin{bmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix} . \]
The cases for the other two coordinate planes are similar.
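
A brief check of the matrix F_y, written here as fy; the symbols x, y, z are arbitrary coordinates.

```mathematica
fy = DiagonalMatrix[{1, -1, 1, 1}];

fy . {x, y, z, 1}              (* {x, -y, z, 1} *)
fy . fy == IdentityMatrix[4]   (* True: reflecting twice restores every point *)
Det[fy]                        (* -1: a reflection reverses orientation *)
```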

   
Example 21:    ■
End of Example 21

Rotation: In its most general form, rotation is defined to take place about some fixed vector in space ℝ³. We will consider the simplest case where the fixed vector is one of the coordinate axes. There are three basic rotations: about the x-, y-, and z-axes. In each case, the rotation is counterclockwise through an angle θ (given in radians). The rotation is assumed to be in accordance with the right-hand rule: if your right thumb is aligned with the axis of rotation, then positive rotation is indicated by the direction in which the fingers of this hand curl. To produce a clockwise rotation, simply negate the angle involved.

Consider a rotation about the z-axis. The z-unit vector and origin are unchanged. The x-unit vector is mapped to (cos θ, sin θ, 0, 0), and the y-unit vector is mapped to (− sin θ, cos θ, 0, 0). Thus the rotation matrix is:

\[ \mbox{Rotation about $z$ axis: } \ \begin{bmatrix} \cos \theta_z & -\sin \theta_z &0&0 \\ \sin \theta_z & \cos \theta_z &0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix} . \]
   
Example 22:    ■
End of Example 22
RotationMatrix[\[Theta], {0, 0, 1}] // MatrixForm
\( \displaystyle \quad \begin{pmatrix} \cos \theta_z & -\sin \theta_z &0 \\ \sin \theta_z & \cos \theta_z &0 \\ 0&0&1 \end{pmatrix} \)
Observe that both points and vectors are altered by rotation. For the other two axes we have:
\[ \mbox{Rotation about $x$ axis: } \ \begin{bmatrix} 1&0&0&0 \\ 0&\cos \theta_x & -\sin \theta_x & 0 \\ 0&\sin \theta_x & \cos \theta_x & 0 \\ 0&0&0&1 \end{bmatrix} , \]
   
Example 23:    ■
End of Example 23
RotationMatrix[\[Theta], {1, 0, 0}] // MatrixForm
\( \displaystyle \quad \begin{pmatrix} 1&0&0 \\ 0&\cos \theta_x & -\sin \theta_x \\ 0&\sin \theta_x & \cos \theta_x \end{pmatrix} \)
\[ \mbox{Rotation about $y$ axis: } \ \begin{bmatrix} \cos \theta_y & 0 & \sin \theta_y & 0 \\ 0&1&0&0 \\ -\sin \theta_y & 0 & \cos \theta_y & 0 \\ 0&0&0&1 \end{bmatrix} . \]
A rotation by angle θ about an arbitrary axis can be decomposed into the concatenation of rotations about the x, y, and z axes.
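
The built-in RotationMatrix returns 3 × 3 matrices, so a small helper is needed to place them into the 4 × 4 homogeneous form used here. The helper embed is a hypothetical name; the sketch also shows that axis rotations do not commute, so the order of the factors in such a concatenation matters.

```mathematica
(* illustrative helper: put a 3 x 3 matrix into the upper-left block of a 4 x 4 affine matrix *)
embed[r_] := Append[Map[Append[#, 0] &, r], {0, 0, 0, 1}]

ry = embed[RotationMatrix[t, {0, 1, 0}]];
ry /. t -> Pi/2
(* {{0, 0, 1, 0}, {0, 1, 0, 0}, {-1, 0, 0, 0}, {0, 0, 0, 1}} *)

rx = embed[RotationMatrix[Pi/2, {1, 0, 0}]];
rz = embed[RotationMatrix[Pi/2, {0, 0, 1}]];
rx . rz == rz . rx   (* False *)
```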

Since a quaternion q basically stores the axis vector w and the angle of rotation θ, it is not surprising that we can write the components of a rotation matrix based on quaternion data:

\[ {\bf q} = \left( \cos\theta , \sin\theta \,\mathbf{w} \right) = \left( a, \left( x, y, z \right) \right) , \qquad \|\mathbf{w} \| = 1. \]
Then the matrix of rotation by angle 2θ about w is given by
\[ {\bf R}_q = \begin{bmatrix} 1 - 2y^2 - 2z^2 & 2xy -2az & 2xz + 2ay & 0 \\ 2xy + 2az & 1 - 2x^2 -2 z^2 & 2yz - 2ax & 0 \\ 2xz - 2ay & 2yz + 2ax & 1 - 2x^2 - 2y^2 & 0 \\ 0&0&0&1 \end{bmatrix} . \]
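
The quaternion formula can be compared with the built-in RotationMatrix. The sketch below codes only the upper-left 3 × 3 block; the helper quatRotation and the sample axis w are illustrative choices, and the variable th stands for θ.

```mathematica
(* illustrative helper: 3 x 3 rotation block for a unit quaternion q = (a, (x, y, z)) *)
quatRotation[{a_, {x_, y_, z_}}] := {
  {1 - 2 y^2 - 2 z^2, 2 x y - 2 a z, 2 x z + 2 a y},
  {2 x y + 2 a z, 1 - 2 x^2 - 2 z^2, 2 y z - 2 a x},
  {2 x z - 2 a y, 2 y z + 2 a x, 1 - 2 x^2 - 2 y^2}}

quatRotation[{Cos[Pi/4], Sin[Pi/4] {0, 0, 1}}]
(* {{0, -1, 0}, {1, 0, 0}, {0, 0, 1}}: rotation by Pi/2 about the z-axis *)

w = {1, 2, 2}/3;    (* a unit axis vector *)
FullSimplify[quatRotation[{Cos[th], Sin[th] w}] - RotationMatrix[2 th, w]]
(* should reduce to the 3 x 3 zero matrix *)
```

The first call illustrates the factor of two: the quaternion built from θ = π/4 rotates by π/2.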
   
Example 24: Let us consider a line that goes through the point P(7, 11, -5) and is parallel to the vector w = (3, 1, 8). We want to rotate the point Q(6, -9, 15) about this line through angles θ that are multiples of 5° until we are back at the point Q. Each rotation through a fixed angle θ is one application of an affine map. The plot of these points is essentially the circle of rotation for Q. The formula for the rotated point Qnew is given by    ■
End of Example 24
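
The setting of Example 24 can be explored with the built-in RotationTransform[θ, w, p], which gives the rotation about the axis w anchored at the point p. This is only a sketch under the assumption that this built-in map is the affine map intended in the example; the variable names are ours.

```mathematica
p = {7, 11, -5};  w = {3, 1, 8};  q = {6, -9, 15};
rot = RotationTransform[5 Degree, w, p];   (* rotation by 5° about the line through p along w *)
orbit = NestList[rot, N[q], 72];           (* 72 steps of 5° make one full turn *)

Chop[Last[orbit] - q]                      (* {0, 0, 0} up to rounding *)
ListPointPlot3D[orbit, BoxRatios -> Automatic]
```

The plotted points lie on a circle in the plane through Q perpendicular to w.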

  1. Suppose that an affine space A has two frames \[ \beta = \left( \begin{bmatrix} 2 \\ -2 \end{bmatrix} , \ \begin{bmatrix} 3 \\ 1 \end{bmatrix} , \ (2, -4)\right) \] and \[ \phi = \left( \begin{bmatrix} 3 \\ 1 \end{bmatrix} , \ \begin{bmatrix} 1 \\ 1 \end{bmatrix} , \ (-2, 5)\right) \] Find the change of frame matrix M and, given ⟦Q⟧β = (5, -3, 1), use it to find ⟦Q⟧ϕ.
  2. Suppose M is the change of frame matrix that transforms coordinates relative to frame β to coordinates relative to frame ϕ. Prove that M⁻¹ exists.
  3. Determine the matrix representation of the affine transformation S : 𝔸 → 𝔸 if 𝔸 = (A, ℝ²) and S(P) = Q where Q = (x+2y, y) if P = (x, y). What type of transformation is this?

  1. Anton, Howard (2005), Elementary Linear Algebra (Applications Version) (9th ed.), Wiley International
  2. Dunn, F. and Parberry, I. (2002). 3D math primer for graphics and game development. Plano, Tex.: Wordware Pub.
  3. Foley, James D.; van Dam, Andries; Feiner, Steven K.; Hughes, John F. (1991), Computer Graphics: Principles and Practice (2nd ed.), Reading: Addison-Wesley, ISBN 0-201-12110-7
  4. Matrices and Linear Transformations
  5. Rogers, D.F., Adams, J. A., Mathematical Elements for Computer Graphics, McGraw-Hill Science/Engineering/Math, 1989.
  6. Szeliski, R., Computer Vision: Algorithms and Applications, 2nd edition, Springer,
  7. Watt, A., 3D Computer Graphics, Addison-Wesley; 3rd edition, 1999.