Linear Transformations

Functions are used throughout mathematics, physics, and engineering to study the structure of sets and the relationships between them. You are familiar with the notation y = f(x), where f is a function that acts on numbers, signified by the input variable x, and produces numbers signified by the output variable y. By convention, the input variable is written to the right of the function, following the left-to-right writing direction of European languages.

In general, a function f : X → Y is a rule that associates with each x in the set X a unique element y = f(x) in Y. We say that f maps the set X into the set Y and maps the element x to the element y. The set X is the domain of f and the set Y is its codomain. The set of all outputs that f actually produces is called its image (or range). In linear algebra, we are interested in functions that map vectors into vectors while preserving the vector operations.

A function T : ℝᵐ → ℝⁿ is a linear transformation if it satisfies two conditions:
  1. T(v + u) = T(v) + T(u)      Preservation of addition;
  2. T(kv) = kT(v)            Preservation of scalar multiplication;
for all vectors v, u in ℝᵐ and for all scalars k.
As is clear from the definition above, we can similarly extend it to the complex field (ℂᵐ → ℂⁿ) or the rational field (ℚᵐ → ℚⁿ). In fact, we can extend this definition to arbitrary vector spaces:
Let V and U be vector spaces over a scalar field 𝔽 (which is either ℂ or ℝ or ℚ). A function T : V → U is called a linear transformation (also called a linear mapping or vector space homomorphism) if T preserves vector addition and scalar multiplication.
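As a minimal numerical sketch (added here for illustration, not part of the original text), the two defining conditions can be checked for a matrix map x ↦ Ax with NumPy; the matrix A and the test vectors are arbitrary choices.

```python
import numpy as np

# An arbitrary 3x2 matrix defines a map T(x) = A @ x from R^2 into R^3.
A = np.array([[1.0, -2.0],
              [0.0,  3.0],
              [4.0,  1.0]])

def T(x):
    return A @ x

rng = np.random.default_rng(0)
v, u = rng.standard_normal(2), rng.standard_normal(2)
k = 2.5

# Preservation of addition and of scalar multiplication (up to round-off).
print(np.allclose(T(v + u), T(v) + T(u)))   # True
print(np.allclose(T(k * v), k * T(v)))      # True
```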
Theorem 1: Let V be a finite-dimensional vector space of dimension n ≥ 1, and let β = { v1, v2, … , vn } be a basis for V. Let U be any vector space, and let { u1, u2, … , un } be a list of vectors from U. The function T : V → U defined by
\[ T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right) = a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_n {\bf u}_n \]
is a linear transformation for any n scalars 𝑎1, 𝑎2, … , 𝑎n.
Because β is a basis of V, every vector v ∈ V is represented as a linear combination of basis vectors with unique scalars 𝑎1, 𝑎2, … , 𝑎n: v = 𝑎1v1 + 𝑎2v2 + ⋯ + 𝑎nvn. So there is a unique corresponding element
\[ T \left( {\bf v} \right) = T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right) = a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_n {\bf u}_n \]
in U. Hence T is indeed a well-defined function.

To show that T is a linear transformation, take any vectors v = 𝑎1v1 + 𝑎2v2 + ⋯ + 𝑎nvn and w = b1v1 + b2v2 + ⋯ + bnvn in V and any scalar k. We have

\begin{align*} T \left( {\bf v} + {\bf w} \right) &= T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n + b_1 {\bf v}_1 + b_2 {\bf v}_2 + \cdots + b_n {\bf v}_n \right) \\ &= T \left( \left[ a_1 + b_1 \right] {\bf v}_1 + \left[ a_2 + b_2 \right] {\bf v}_2 + \cdots + \left[ a_n + b_n \right] {\bf v}_n \right) \\ &= \left[ a_1 + b_1 \right] {\bf u}_1 + \left[ a_2 + b_2 \right] {\bf u}_2 + \cdots + \left[ a_n + b_n \right] {\bf u}_n \\ &= a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_n {\bf u}_n + b_1 {\bf u}_1 + b_2 {\bf u}_2 + \cdots + b_n {\bf u}_n \\ &= a_1 T \left( {\bf v}_1 \right) + a_2 T \left( {\bf v}_2 \right) + \cdots + a_n T \left( {\bf v}_n \right) + b_1 T \left( {\bf v}_1 \right) + b_2 T \left( {\bf v}_2 \right) + \cdots + b_n T \left( {\bf v}_n \right) \\ &= T \left( {\bf v} \right) + T \left( {\bf w} \right) \end{align*}
and
\begin{align*} T \left( k\,{\bf v} \right) &= T \left( k\left[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right] \right) \\ &= T \left( k\,a_1 {\bf v}_1 + k\,a_2 {\bf v}_2 + \cdots + k\,a_n {\bf v}_n \right) \\ &= k\,a_1 {\bf u}_1 + k\,a_2 {\bf u}_2 + \cdots + k\,a_n {\bf u}_n \\ &= k\, T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right) = k\, T \left( {\bf v} \right) . \end{align*}
The two required conditions are satisfied, and T is a linear transformation.
This theorem provides a convenient way to construct linear transformations.
Example 1: Let us construct a linear transformation from ℝ² into ℝ³. We choose two basis vectors in ℝ² and corresponding vectors in ℝ³:
\[ {\bf v}_1 = \begin{bmatrix} -1 \\ \phantom{-}2 \end{bmatrix}, \quad {\bf v}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \qquad \mbox{and} \qquad {\bf u}_1 = \begin{bmatrix} -1 \\ \phantom{-}1 \\ \phantom{-}2 \end{bmatrix}, \quad {\bf u}_2 = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} . \]
We can now define T: ℝ² ↦ ℝ³ by
\[ T \left( x \begin{bmatrix} -1 \\ \phantom{-}2 \end{bmatrix} + y \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right) = x\begin{bmatrix} -1 \\ \phantom{-}1 \\ \phantom{-}2 \end{bmatrix} + y \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2y-x \\ x+y \\ 2x + 3y \end{bmatrix} . \]
To prove that T is a linear transformation, all we need to do is appeal to Theorem 1. An alternative way to write T is as follows:
\[ T \left( \begin{bmatrix} y-x \\ 2x+y \end{bmatrix} \right) = \begin{bmatrix} 2y-x \\ x+y \\ 2x + 3y \end{bmatrix} \]
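As a computational sketch of this example (an illustration added here, not from the original text), the standard matrix of T can be recovered from the requirement T(vᵢ) = uᵢ, i.e., A V = U with V = [v₁ v₂] and U = [u₁ u₂]:

```python
import numpy as np

# Basis of R^2 and the chosen images in R^3 from Example 1.
V = np.array([[-1, 1],
              [ 2, 1]], dtype=float)          # columns are v1, v2
U = np.array([[-1, 2],
              [ 1, 1],
              [ 2, 3]], dtype=float)          # columns are u1, u2

# T(v_i) = u_i forces the standard matrix A to satisfy A @ V = U.
A = U @ np.linalg.inv(V)

# Check against the closed form T([y - x, 2x + y]) = [2y - x, x + y, 2x + 3y].
x, y = 1.7, -0.4
inp = np.array([y - x, 2 * x + y])
out = np.array([2 * y - x, x + y, 2 * x + 3 * y])
print(np.allclose(A @ inp, out))              # True
```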
End of Example 1
Example 2: In order to create a linear transformation from ℳ2×2, we need a basis for ℳ2×2. The standard basis consists of the four matrices
\[ {\bf E}_{11} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad {\bf E}_{12} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad {\bf E}_{21} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad {\bf E}_{22} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} . \]
By Theorem 1, choosing any four vectors u1, u2, u3, u4 in a vector space U and setting T(E11) = u1, T(E12) = u2, T(E21) = u3, T(E22) = u4 defines a linear transformation T : ℳ2×2 → U.
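A minimal NumPy sketch of this construction; the target space ℝ³ and the four image vectors below are arbitrary illustrative choices, not prescribed by the text.

```python
import numpy as np

# By Theorem 1, a linear map T on 2x2 matrices is fixed by the images of the
# standard basis E11, E12, E21, E22; the images below are an arbitrary choice.
images = [np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 1.0]),
          np.array([3.0, 0.0, 0.0]), np.array([1.0, 1.0, 1.0])]

def T(M):
    # M = a*E11 + b*E12 + c*E21 + d*E22, so T(M) = a*u1 + b*u2 + c*u3 + d*u4.
    a, b, c, d = M.ravel()
    return a * images[0] + b * images[1] + c * images[2] + d * images[3]

M1 = np.array([[1.0,  2.0], [3.0, 4.0]])
M2 = np.array([[0.0, -1.0], [5.0, 2.0]])
print(np.allclose(T(M1 + 2 * M2), T(M1) + 2 * T(M2)))   # True
```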
End of Example 2
Let V and U be vector spaces over the same field 𝔽 (which is either ℂ or ℝ or ℚ).
A map T : U → V is linear if for all vectors u, v ∈ U and arbitrary scalars α, β, we have
\[ T \left( \alpha{\bf u} + \beta{\bf v} \right) = \alpha\,T \left( {\bf u} \right) + \beta\,T \left( {\bf v} \right) \]
The set of all linear functions from U into V is denoted by ℒ(U, V). When U = V, we write ℒ(U) instead of ℒ(U, U).
Example 3:
  1. Let ℭ[𝑎, b] be the set of all real-valued continuous functions on the closed interval [𝑎, b]. Then the integral operator
    \[ T(f) = \int_a^b f(x)\,{\text d} x \]
    provides a linear transformation from ℭ[𝑎, b] into ℝ.
  2. Let ℭ(ℝ) be the set of all continuous functions on the real axis (−∞, ∞). Then the shift operator
    \[ T(f)(x) = f(x - x_0 ) , \qquad x_0 \in \mathbb{R}, \]
    is a linear transformation in the space ℭ(ℝ).
  3. Let ℘≤n be the linear space of polynomials of degree n or less. Then the derivative operator
    \[ \texttt{D}\left( \sum_{i=0}^n a_i x^i \right) = \sum_{i=1}^n i\, a_i x^{i-1} \]
    provides a linear transformation from ℘≤n into ℘≤n-1 (see the sketch after this list).
  4. Let ℳm×n be the set of all m × n matrices with entries from the field 𝔽. Then transposition, T(A) = Aᵀ, gives a linear transformation from ℳm×n into ℳn×m.
  5. Let ℭ[𝕋] be the set of infinitely differentiable periodic functions on the unit circle 𝕋 (one-dimensional torus). Then the expansion of a function from ℭ[𝕋] into its Fourier series
    \[ f(x) \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) \, e^{-{\bf j} nx} , \qquad \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{{\bf j} nx} {\text d}x , \]
    provides a linear transformation from ℭ[𝕋] into the set of infinite sequences.
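For item 3, here is a short NumPy sketch (added for illustration, not part of the original text) representing the derivative operator D on ℘≤n as an n × (n + 1) matrix acting on coefficient vectors:

```python
import numpy as np

def derivative_matrix(n):
    """Matrix of D : P_{<=n} -> P_{<=n-1} in the monomial bases 1, x, ..., x^n."""
    D = np.zeros((n, n + 1))
    for i in range(1, n + 1):
        D[i - 1, i] = i          # d/dx (a_i x^i) = i * a_i * x^{i-1}
    return D

# p(x) = 2 + 3x - x^3 has coefficient vector (2, 3, 0, -1); p'(x) = 3 - 3x^2.
coeffs = np.array([2.0, 3.0, 0.0, -1.0])
print(derivative_matrix(3) @ coeffs)   # [ 3.  0. -3.]
```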

 

Isometric transformations


A linear transformation A is isometric when ∥Ax∥ = ∥x∥ for all vectors x.
This implies that the eigenvalues of an isometric transformation have modulus 1, that is, λ = exp(jφ) for some real φ. It also follows that isometries preserve inner products: ⟨ Ax , Ay ⟩ = ⟨ x , y ⟩.
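A brief numerical sketch of these properties, using a plane rotation as a concrete isometry (an arbitrary illustrative choice):

```python
import numpy as np

phi = 0.7
A = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])    # a rotation is isometric

rng = np.random.default_rng(1)
x, y = rng.standard_normal(2), rng.standard_normal(2)

print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x)))   # norms preserved
print(np.isclose((A @ x) @ (A @ y), x @ y))                   # inner products preserved
print(np.allclose(np.abs(np.linalg.eigvals(A)), 1.0))         # eigenvalues have modulus 1
```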

When W is an invariant subspace of an isometric transformation A on a finite-dimensional space, then the orthogonal complement W⊥ is also an invariant subspace.

 

Orthogonal transformations


A transformation A is orthogonal if A is isometric and its inverse exists.
For an orthogonal transformation O, the identity OᵀO = I holds, so Oᵀ = O⁻¹. If A and B are orthogonal, then AB and A⁻¹ are also orthogonal.

Let A : V → V be orthogonal with dim(V) < ∞. Then A is called direct orthogonal if det(A) = +1; such a matrix describes a rotation. In particular, a rotation of ℝ² through angle φ is given by

\[ {\bf R} = \begin{bmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \phantom{-}\cos\varphi \end{bmatrix} . \]
So the rotation angle φ is determined by the trace, tr(A) = 2 cos(φ), with 0 ≤ φ ≤ π. Let λ₁ and λ₂ be the roots of the characteristic equation. Then Re(λ₁) = Re(λ₂) = cos(φ), with λ₁ = exp(jφ) and λ₂ = exp(−jφ).
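A quick numerical check of these facts; the angle φ below is an arbitrary illustrative choice:

```python
import numpy as np

phi = 1.2
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])

print(np.isclose(np.linalg.det(R), 1.0))             # direct orthogonal: det = +1
print(np.isclose(np.arccos(np.trace(R) / 2), phi))   # angle recovered from tr(R) = 2*cos(phi)
print(np.sort_complex(np.linalg.eigvals(R)))         # approx exp(-j*phi), exp(+j*phi)
```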

In ℝ³ (with det(A) = +1), λ₁ = 1 and λ₂ = λ₃* = exp(jφ). With respect to a basis whose first vector spans the eigenspace of λ₁ (the rotation axis), the rotation is given by the matrix

\[ {\bf R} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \phantom{-}\cos\varphi \end{bmatrix} . \]
A transformation A is called mirrored orthogonal if det(A) = −1. Vectors from E₋₁ are mirrored by A with respect to the invariant subspace E₋₁⊥. A mirroring in ℝ² in the line ⟨\( \left( \cos \left( \frac{1}{2}\,\varphi \right) , \sin \left( \frac{1}{2}\,\varphi \right) \right) \)⟩ is given by
\[ {\bf S} = \begin{bmatrix} \cos\varphi & \phantom{-}\sin\varphi \\ \sin\varphi & - \cos\varphi \end{bmatrix} . \]
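A short check that S is indeed a mirroring in the stated line; the angle is an arbitrary illustrative choice:

```python
import numpy as np

phi = 0.9
S = np.array([[np.cos(phi),  np.sin(phi)],
              [np.sin(phi), -np.cos(phi)]])

print(np.isclose(np.linalg.det(S), -1.0))            # mirrored orthogonal
m = np.array([np.cos(phi / 2), np.sin(phi / 2)])     # direction of the mirror line
print(np.allclose(S @ m, m))                         # mirror line is fixed (eigenvalue +1)
n = np.array([-np.sin(phi / 2), np.cos(phi / 2)])    # normal to the mirror line
print(np.allclose(S @ n, -n))                        # normal direction is flipped (eigenvalue -1)
```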
Mirrored orthogonal transformations in ℝ³ are rotational mirrorings: rotations about an axis ⟨a⟩ through angle φ combined with a mirroring in the plane ⟨a⟩⊥. The matrix of such a transformation is given by
\[ {\bf S} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \phantom{-}\cos\varphi \end{bmatrix} . \]
For all direct orthogonal transformations O in ℝ³, O(x) × O(y) = O(x × y).

ℝⁿ (n < ∞) can be decomposed into invariant subspaces of dimension 1 or 2 for each orthogonal transformation.

 

Unitary transformations


Let V be a complex vector space with an inner product. A linear transformation U of V is called unitary if it is isometric and its inverse exists.
An n × n matrix U is unitary if U*U = I, the identity matrix. Its determinant satisfies |det(U)| = 1. Each isometric transformation in a finite-dimensional complex vector space is unitary.
Theorem 2: For an n × n matrix A, the following statements are equivalent:
  1. A is unitary.
  2. The columns of A form an orthonormal set.
  3. The rows of matrix A form an orthonormal set.
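A NumPy sketch of these facts; the QR factor of a random complex matrix is used here as a convenient way to produce a unitary matrix (an illustrative construction, not part of the text):

```python
import numpy as np

# A random complex matrix; its QR factor Q is unitary (up to round-off).
rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(M)

print(np.allclose(U.conj().T @ U, np.eye(3)))    # U* U = I: columns are orthonormal
print(np.allclose(U @ U.conj().T, np.eye(3)))    # rows are orthonormal as well
print(np.isclose(abs(np.linalg.det(U)), 1.0))    # |det(U)| = 1
```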

 

Symmetric transformations


A linear transformation A of ℝⁿ is called symmetric if ⟨ Ax , y ⟩ = ⟨ x , Ay ⟩ for any vectors x and y from the vector space.
A square matrix A is symmetric if Aᵀ = A. A linear transformation is symmetric if its matrix with respect to an orthonormal basis is symmetric. All eigenvalues of a symmetric transformation are real, and eigenvectors corresponding to distinct eigenvalues are orthogonal. If A is symmetric, then Aᵀ = A = A* with respect to any orthonormal basis. The product AᵀA is symmetric for any matrix A.
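A short numerical illustration of these properties, using an arbitrarily generated symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                        # a symmetric matrix: A.T == A

eigvals, eigvecs = np.linalg.eigh(A)     # eigh is tailored to symmetric/Hermitian matrices
print(np.all(np.isreal(eigvals)))                     # all eigenvalues are real
print(np.allclose(eigvecs.T @ eigvecs, np.eye(4)))    # eigenvectors are orthonormal
print(np.allclose(B.T @ B, (B.T @ B).T))              # B^T B is symmetric for any B
```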

 

Self-adjoint transformations


A transformation A : ℂⁿ → ℂⁿ is called self-adjoint or Hermitian if ⟨ Ax , y ⟩ = ⟨ x , Ay ⟩ for any vectors x and y from the vector space.
The product AB of two self-adjoint matrices A and B is self-adjoint if and only if they commute, that is, if the commutator vanishes: [A, B] = AB − BA = 0.

Eigenvalues of any self-adjoint matrix are real numbers.
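A brief sketch with an arbitrarily generated Hermitian matrix; the commuting partner B = A² is a deliberately simple illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = (M + M.conj().T) / 2                              # a Hermitian (self-adjoint) matrix

print(np.allclose(np.linalg.eigvals(A).imag, 0.0))    # eigenvalues are real

# B = A^2 is Hermitian and commutes with A, so the product AB is again Hermitian.
B = A @ A
AB = A @ B
print(np.allclose(A @ B - B @ A, 0))                  # commutator vanishes
print(np.allclose(AB, AB.conj().T))                   # AB is self-adjoint
```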

 

Normal transformations


A linear transformation A is called normal if A*A = AA*.
Let the distinct roots of the characteristic equation of a normal matrix A be βᵢ with multiplicities nᵢ. Then the dimension of each eigenspace Vᵢ equals nᵢ. These eigenspaces are mutually perpendicular, and each vector x ∈ V can be written in exactly one way as
\[ {\bf x} = \sum_i {\bf x}_i , \qquad {\bf x}_i = P_i {\bf x} \in V_i , \]
where Pᵢ is the orthogonal projection onto Vᵢ.
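A minimal sketch of this spectral decomposition, using a plane rotation as an example of a normal (but not symmetric) matrix; the projections are built from the normalized eigenvectors:

```python
import numpy as np

# A rotation in R^2 is normal (A A* = A* A) but not symmetric; its spectral
# projections are complex even though A is real.
phi = 0.8
A = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])
print(np.allclose(A @ A.T, A.T @ A))                    # A is normal

eigvals, vecs = np.linalg.eig(A)                        # columns of vecs are eigenvectors
x = np.array([1.0, 2.0])

# Projections P_i onto the one-dimensional eigenspaces; x = sum_i P_i x.
projections = []
for i in range(len(eigvals)):
    v = vecs[:, i] / np.linalg.norm(vecs[:, i])
    projections.append(np.outer(v, v.conj()))
print(np.allclose(sum(P @ x for P in projections), x))  # decomposition recovers x
```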

 
