Linear Transformations

Functions are used throughout mathematics, physics, and engineering to study the structure of sets and the relationships between sets. You are familiar with the notation y = f(x), where f is a function that acts on numbers, represented by the input variable x, and produces numbers, represented by the output variable y. By custom, the input variable is written to the right of the function symbol, in keeping with European languages, which are written from left to right.

In general, a function f : X → Y is a rule that associates with each x in the set X a unique element y = f(x) in Y. We say that f maps the set X into the set Y and maps the element x to the element y. The set X is the domain of f, and the set Y is called its codomain (some authors say range). The set of all outputs in Y that a particular function actually attains is its image. In linear algebra, we are interested in functions that map vectors to vectors while preserving the vector operations.

A function T : ℝᵐ → ℝⁿ is a linear transformation if it satisfies two conditions:
  1. T(v + u) = T(v) + T(u)      Preservation of addition;
  2. T(kv) = kT(v)            Preservation of scalar multiplication;
for all vectors v, u in ℝᵐ and for all scalars k.
As is clear from the definition above, it extends in the same way to complex spaces, ℂᵐ → ℂⁿ, or rational ones, ℚᵐ → ℚⁿ. In fact, the definition extends to arbitrary vector spaces:
Let V and U be vector spaces over a scalar field 𝔽 (which is either ℂ or ℝ or ℚ). A function T : V → U is called a linear transformation (also called a linear mapping or vector space homomorphism) if T preserves vector addition and scalar multiplication.
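The two defining conditions can be spot-checked numerically. The following is a minimal sketch, not a proof: the matrix M, the helper names, and the sample vectors are all hypothetical choices made for illustration. Passing the check on finitely many samples is only evidence of linearity, while a failure is a genuine counterexample.

```python
from fractions import Fraction
from itertools import product

def apply(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def add(v, u):
    return [a + b for a, b in zip(v, u)]

def scale(k, v):
    return [k * a for a in v]

def looks_linear(T, samples, scalars):
    """Test additivity and homogeneity on finitely many samples only."""
    additive = all(T(add(v, u)) == add(T(v), T(u))
                   for v, u in product(samples, repeat=2))
    homogeneous = all(T(scale(k, v)) == scale(k, T(v))
                      for k in scalars for v in samples)
    return additive and homogeneous

# Hypothetical 2x3 matrix: defines a map from R^3 to R^2.
M = [[1, 2, 0], [0, -1, 3]]

def matrix_map(v):
    return apply(M, v)

def shifted_map(v):
    # Adding a constant vector moves the origin, so this map is not linear.
    return add(apply(M, v), [1, 0])

samples = [[Fraction(1), Fraction(0), Fraction(2)],
           [Fraction(-1, 2), Fraction(3), Fraction(1)],
           [Fraction(0), Fraction(0), Fraction(0)]]
scalars = [Fraction(0), Fraction(-2), Fraction(3, 4)]

print(looks_linear(matrix_map, samples, scalars))
print(looks_linear(shifted_map, samples, scalars))
```

Exact rational arithmetic is used so that the equality tests are not disturbed by rounding.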
Theorem 1: Let V be a finite-dimensional vector space of dimension n ≥ 1, and let β = { v₁, v₂, … , vₙ } be a basis for V. Let U be any vector space, and let { u₁, u₂, … , uₙ } be a list of vectors from U. The function T : V → U defined by
\[ T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right) = a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_n {\bf u}_n \]
for all scalars 𝑎₁, 𝑎₂, … , 𝑎ₙ is a linear transformation.
Because β is a basis for V, every vector v ∈ V can be written in exactly one way as a linear combination of the basis vectors, v = 𝑎₁v₁ + 𝑎₂v₂ + ⋯ + 𝑎ₙvₙ, with uniquely determined scalars 𝑎₁, 𝑎₂, … , 𝑎ₙ. So there is a unique corresponding element
\[ T \left( {\bf v} \right) = T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right) = a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_n {\bf u}_n \]
in U. Hence T really is a function.

To show that T is a linear transformation, take any vectors v = 𝑎₁v₁ + 𝑎₂v₂ + ⋯ + 𝑎ₙvₙ and w = b₁v₁ + b₂v₂ + ⋯ + bₙvₙ in V and any scalar k. We have

\begin{align*} T \left( {\bf v} + {\bf w} \right) &= T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n + b_1 {\bf v}_1 + b_2 {\bf v}_2 + \cdots + b_n {\bf v}_n \right) \\ &= T \left( \left[ a_1 + b_1 \right] {\bf v}_1 + \left[ a_2 + b_2 \right] {\bf v}_2 + \cdots + \left[ a_n + b_n \right] {\bf v}_n \right) \\ &= \left[ a_1 + b_1 \right] {\bf u}_1 + \left[ a_2 + b_2 \right] {\bf u}_2 + \cdots + \left[ a_n + b_n \right] {\bf u}_n \\ &= a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_n {\bf u}_n + b_1 {\bf u}_1 + b_2 {\bf u}_2 + \cdots + b_n {\bf u}_n \\ &= a_1 T \left( {\bf v}_1 \right) + a_2 T \left( {\bf v}_2 \right) + \cdots + a_n T \left( {\bf v}_n \right) + b_1 T \left( {\bf v}_1 \right) + b_2 T \left( {\bf v}_2 \right) + \cdots + b_n T \left( {\bf v}_n \right) \\ &= T \left( {\bf v} \right) + T \left( {\bf w} \right) \end{align*}
\begin{align*} T \left( k\,{\bf v} \right) &= T \left( k\left[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right] \right) \\ &= T \left( k\,a_1 {\bf v}_1 + k\,a_2 {\bf v}_2 + \cdots + k\,a_n {\bf v}_n \right) \\ &= k\,a_1 {\bf u}_1 + k\,a_2 {\bf u}_2 + \cdots + k\,a_n {\bf u}_n \\ &= k\, T \left( a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n \right) = k\, T \left( {\bf v} \right) . \end{align*}
The two required conditions are satisfied, and T is a linear transformation.
This theorem provides a really nice way to create linear transformations.
Example 1: Let us construct a linear transformation from ℝ² into ℝ³. We choose two basis vectors in ℝ² and corresponding vectors in ℝ³:
\[ {\bf v}_1 = \begin{pmatrix} -1 \\ \phantom{-}2 \end{pmatrix}, \quad {\bf v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad \mbox{and} \qquad {\bf u}_1 = \begin{bmatrix} -1 \\ \phantom{-}1 \\ \phantom{-}2 \end{bmatrix}, \quad {\bf u}_2 = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} . \]
We can now define T : ℝ² → ℝ³ by
\[ T \left( x \begin{bmatrix} -1 \\ \phantom{-}2 \end{bmatrix} + y \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right) = x\begin{bmatrix} -1 \\ \phantom{-}1 \\ \phantom{-}2 \end{bmatrix} + y \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2y-x \\ x+y \\ 2x + 3y \end{bmatrix} . \]
No separate proof of linearity is needed: by Theorem 1, T is a linear transformation. Since x v₁ + y v₂ has components y − x and 2x + y, an alternative way to write T is as follows:
\[ T \left( \begin{bmatrix} y-x \\ 2x+y \end{bmatrix} \right) = \begin{bmatrix} 2y-x \\ x+y \\ 2x + 3y \end{bmatrix} \]
End of Example 1
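The construction in Example 1 can be traced step by step in code. This is a minimal sketch under the assumption that inputs are given in standard coordinates (p, q): it first solves x v₁ + y v₂ = (p, q) for the coordinates x and y, then forms x u₁ + y u₂. The function names `coordinates` and `T` and the variable `standard_matrix_columns` are illustrative only.

```python
from fractions import Fraction

# Data of Example 1: basis vectors of R^2 and their prescribed images in R^3.
v1, v2 = (-1, 2), (1, 1)
u1, u2 = (-1, 1, 2), (2, 1, 3)

def coordinates(p, q):
    """Solve x*v1 + y*v2 = (p, q) by Cramer's rule and return (x, y)."""
    det = v1[0] * v2[1] - v2[0] * v1[1]   # nonzero because v1, v2 form a basis
    x = Fraction(p * v2[1] - v2[0] * q, det)
    y = Fraction(v1[0] * q - p * v1[1], det)
    return x, y

def T(p, q):
    """Image of the vector (p, q) under the transformation of Example 1."""
    x, y = coordinates(p, q)
    return tuple(x * a + y * b for a, b in zip(u1, u2))

# Images of the standard basis vectors: the columns of the matrix of T
# with respect to the standard bases.
standard_matrix_columns = [T(1, 0), T(0, 1)]

print(T(-1, 2), T(1, 1))
print(standard_matrix_columns)
```

By construction T sends v₁ to u₁ and v₂ to u₂; the last two lines read off the matrix that represents T in standard coordinates.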
Example 2: In order to create a linear transformation from ℳ2×2, we need a basis for ℳ2×2. The standard basis
End of Example 2
Let V and U be vector spaces over the same field 𝔽 (which is either ℂ or ℝ or ℚ).
A map T : U → V is linear if for all vectors u, v ∈ U and arbitrary scalars α, β, we have
\[ T \left( \alpha{\bf u} + \beta{\bf v} \right) = \alpha\,T \left( {\bf u} \right) + \beta\,T \left( {\bf v} \right) \]
The set of all linear functions from U into V is denoted by ℒ(U, V). When U = V, we write ℒ(U) instead of ℒ(U, U).
Example 3:
  1. Let ℭ[𝑎, b] be the set of all real-valued continuous functions on the closed interval [𝑎, b]. Then the integral operator
    \[ \int_a^b f(x)\,{\text d} x \]
    provides a linear transformation from ℭ[𝑎, b] into ℝ.
  2. Let ℭ(ℝ) be the set of all continuous functions on the real axis (−∞, ∞). Then the shift operator
    \[ T(f)(x) = f(x - x_0 ) , \qquad x_0 \in \mathbb{R}, \]
    is a linear transformation in the space ℭ(ℝ).
  3. Let ℘≤n be the linear space of polynomials of degree n or less. Then the derivative operator
    \[ \texttt{D}\left( \sum_{i=0}^n a_i x^i \right) = \sum_{i=1}^n i\,a_i x^{i-1} \]
    provides a linear transformation from ℘≤n into ℘≤n−1.
  4. Let ℳm×n be the set of all m × n matrices with entries from the field 𝔽. Then transposition gives a linear transformation from ℳm×n into ℳn×m.
  5. Let ℭ[𝕋] be the set of infinitely differentiable periodic functions on the unit circle 𝕋 (one-dimensional torus). Then the expansion of a function from ℭ[𝕋] into its Fourier series
    \[ f(x) \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) \, e^{-{\bf j} nx} , \qquad \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{{\bf j} nx} {\text d}x , \]
    provides a linear transformation from ℭ[𝕋] into the set of infinite sequences.
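As an illustration of item 3, the derivative operator can be applied to coefficient lists and tested against the single condition T(αu + βv) = αT(u) + βT(v). This is a sketch only: polynomials are stored as lists of coefficients with the lowest degree first, the usual rule that the derivative of xⁱ is i xⁱ⁻¹ is assumed, and the sample polynomials and scalars are hypothetical.

```python
from fractions import Fraction

def derivative(coeffs):
    """Map (a_0, ..., a_n) to (a_1, 2*a_2, ..., n*a_n), the derivative's coefficients."""
    return [i * a for i, a in enumerate(coeffs)][1:]

def add(p, q):
    return [a + b for a, b in zip(p, q)]

def scale(k, p):
    return [k * a for a in p]

# Two hypothetical polynomials of degree at most 3, lowest degree first.
p = [1, 0, -4, 2]                 # 1 - 4x^2 + 2x^3
q = [Fraction(1, 2), 3, 0, -1]    # 1/2 + 3x - x^3
alpha, beta = 5, Fraction(-2, 3)

lhs = derivative(add(scale(alpha, p), scale(beta, q)))
rhs = add(scale(alpha, derivative(p)), scale(beta, derivative(q)))
print(lhs == rhs)
```

The output list has one entry fewer than the input, which mirrors the drop in degree from ℘≤n to ℘≤n−1.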


Isometric transformations

A linear transformation A is isometric when ∥A x∥ = ∥ x∥ for every vector x.
This implies that the eigenvalues of an isometric transformation have modulus one, so λ = exp(jφ) for some real φ. It also follows that ⟨ Ax , Ay ⟩ = ⟨ x , y ⟩ for all vectors x and y.

When W is an invariant subspace of an isometric transformation A acting on a finite-dimensional space, its orthogonal complement W⊥ is also an invariant subspace.
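The properties of an isometry can be checked on a concrete matrix. The sketch below uses a hypothetical 2 × 2 matrix with rational entries, chosen only so that norms and inner products can be compared exactly; the eigenvalues are obtained from the characteristic polynomial λ² − tr(A) λ + det(A) = 0.

```python
from fractions import Fraction
import cmath

# Hypothetical isometry of R^2 with rational entries (its columns are orthonormal).
A = [[Fraction(3, 5), Fraction(-4, 5)],
     [Fraction(4, 5), Fraction(3, 5)]]

def apply(M, x):
    return [sum(a * t for a, t in zip(row, x)) for row in M]

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

# Arbitrary sample vectors.
x = [Fraction(2), Fraction(-7)]
y = [Fraction(1, 3), Fraction(5)]
Ax, Ay = apply(A, x), apply(A, y)

norm_preserved = inner(Ax, Ax) == inner(x, x)     # compares squared norms exactly
inner_preserved = inner(Ax, Ay) == inner(x, y)

# Roots of the characteristic polynomial of a 2x2 matrix.
tr = float(A[0][0] + A[1][1])
det = float(A[0][0] * A[1][1] - A[0][1] * A[1][0])
root = cmath.sqrt(tr * tr - 4 * det)
eigenvalues = [(tr + root) / 2, (tr - root) / 2]

print(norm_preserved, inner_preserved)
print([abs(lam) for lam in eigenvalues])
```

Here the eigenvalues form a complex-conjugate pair on the unit circle, in line with λ = exp(jφ).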


Orthogonal transformations

A transformation A is orthogonal if A is isometric and its inverse exists.
For an orthogonal transformation O, the identity OᵀO = I holds, so Oᵀ = O⁻¹. If A and B are orthogonal, then AB and A⁻¹ are also orthogonal.

Let A : V → V be orthogonal with dim(V) < ∞. Then A is called direct orthogonal if det(A) = +1; such a matrix A describes a rotation. In particular, a rotation of ℝ² through angle φ is given by

\[ {\bf R} = \begin{bmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \phantom{-}\cos\varphi \end{bmatrix} . \]
So the rotation angle φ is determined by the trace, tr(A) = 2cos(φ), with 0 ≤ φ ≤ π. Let λ₁ and λ₂ be the roots of the characteristic equation. Then Re(λ₁) = Re(λ₂) = cos(φ), with λ₁ = exp(jφ) and λ₂ = exp(−jφ).
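The relation between the trace and the rotation angle can be sketched as follows. The helper names `rotation`, `determinant`, and `angle_from_trace` are hypothetical, and the sample angle is arbitrary. Because the recovered angle is restricted to 0 ≤ φ ≤ π, the trace alone cannot tell a rotation through φ from one through −φ.

```python
import math

def rotation(phi):
    """Matrix of the rotation of R^2 through angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s], [s, c]]

def determinant(R):
    return R[0][0] * R[1][1] - R[0][1] * R[1][0]

def angle_from_trace(R):
    """Recover phi in [0, pi] from tr(R) = 2 cos(phi)."""
    half_trace = (R[0][0] + R[1][1]) / 2
    # Clamp to [-1, 1] to guard against rounding just outside the domain of acos.
    return math.acos(max(-1.0, min(1.0, half_trace)))

phi = 2.1   # arbitrary sample angle in [0, pi]
R = rotation(phi)
recovered = angle_from_trace(R)
recovered_from_negative = angle_from_trace(rotation(-phi))

print(determinant(R), recovered, recovered_from_negative)
```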

In ℝ³, λ₁ = 1 and λ₂ = λ₃* = exp(jφ). A rotation about the eigenspace corresponding to λ₁ is given by the matrix

\[ {\bf R} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \phantom{-}\cos\varphi \end{bmatrix} . \]
A transformation A is called mirrored orthogonal if det(A) = −1. Vectors from E₋₁, the eigenspace for the eigenvalue −1, are mirrored by A with respect to the invariant subspace E₋₁⊥. A mirroring in ℝ² in the line < \( \left( \cos \left( \frac{1}{2}\,\varphi \right) , \sin \left( \frac{1}{2}\,\varphi \right) \right) \) > is given by
\[ {\bf S} = \begin{bmatrix} \cos\varphi & \phantom{-}\sin\varphi \\ \sin\varphi & - \cos\varphi \end{bmatrix} . \]
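A short numerical check makes the geometry of S concrete. In this sketch (the names `reflection`, `axis`, and `normal` are illustrative, and the angle is arbitrary) the direction (cos(φ/2), sin(φ/2)) should be left fixed, the perpendicular direction should change sign, and the determinant should equal −1.

```python
import math

def reflection(phi):
    """Matrix S of the text for a given angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [s, -c]]

def apply(S, x):
    return [sum(a * t for a, t in zip(row, x)) for row in S]

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

phi = 1.2   # arbitrary sample angle
S = reflection(phi)
axis = [math.cos(phi / 2), math.sin(phi / 2)]       # spans the mirror line
normal = [-math.sin(phi / 2), math.cos(phi / 2)]    # perpendicular to the mirror line

axis_fixed = close(apply(S, axis), axis)
normal_flipped = close(apply(S, normal), [-t for t in normal])
det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]

print(axis_fixed, normal_flipped, det_S)
```

Applying S twice returns every vector to itself, as expected of a mirroring.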
Mirrored orthogonal transformations in ℝ³ are rotational mirrorings: rotations about the axis < a > through angle φ combined with a reflection in the mirror plane < a >⊥. The matrix of such a transformation is given by
\[ {\bf S} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \phantom{-}\cos\varphi \end{bmatrix} . \]
For every orthogonal transformation O in ℝ³, O(x) × O(y) = det(O) O(x × y); in particular, O(x) × O(y) = O(x × y) when O is a rotation.
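The behaviour of the cross product under the two 3 × 3 matrices displayed above can be tested numerically. This is a sketch with arbitrary sample vectors; `block_matrix` is a hypothetical helper that builds either matrix from its top-left entry. For the rotation R the cross product is carried along unchanged, while for the rotational mirroring S, whose determinant is −1, an extra sign appears.

```python
import math

def apply(O, x):
    return [sum(a * t for a, t in zip(row, x)) for row in O]

def cross(x, y):
    return [x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0]]

def block_matrix(corner, phi):
    """Matrix with `corner` in the top-left entry and a 2x2 rotation block below it."""
    c, s = math.cos(phi), math.sin(phi)
    return [[corner, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

phi = 0.9
x, y = [1.0, 2.0, -1.0], [0.5, -3.0, 2.0]
R = block_matrix(1.0, phi)     # rotation, det = +1
S = block_matrix(-1.0, phi)    # rotational mirroring, det = -1

rotation_commutes = close(cross(apply(R, x), apply(R, y)), apply(R, cross(x, y)))
mirroring_flips_sign = close(cross(apply(S, x), apply(S, y)),
                             [-t for t in apply(S, cross(x, y))])

print(rotation_commutes, mirroring_flips_sign)
```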

For each orthogonal transformation, ℝⁿ (n < ∞) can be decomposed into invariant subspaces of dimension 1 or 2.


Unitary transformations

Let V be a complex vector space with inner product. A linear transformation U of V is called unitary if it is isometric and its inverse exists.
An n × n matrix U is unitary if U*U = I, the identity matrix. Its determinant satisfies |det(U)| = 1. Each isometric transformation in a finite-dimensional complex vector space is unitary.
Theorem 2: For an n × n matrix A, the following statements are equivalent:
  1. A is unitary.
  2. The columns of A form an orthonormal set.
  3. The rows of matrix A form an orthonormal set.
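The equivalences can be observed on a small example. The matrix U below is a hypothetical 2 × 2 unitary matrix picked for illustration, and the inner product is assumed to be the standard one on ℂⁿ, conjugate-linear in its first argument. Its determinant turns out to be a complex number of modulus one that is not real.

```python
import math

r = 1 / math.sqrt(2)
# Hypothetical unitary matrix chosen for illustration.
U = [[r, r],
     [1j * r, -1j * r]]

def hermitian_inner(x, y):
    """Standard inner product on C^n, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def is_orthonormal(vectors, tol=1e-12):
    """True when the inner products of all pairs match the Kronecker delta."""
    return all(abs(hermitian_inner(u, v) - (1 if i == j else 0)) < tol
               for i, u in enumerate(vectors)
               for j, v in enumerate(vectors))

columns = [list(col) for col in zip(*U)]
rows = [list(row) for row in U]
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]

print(is_orthonormal(columns), is_orthonormal(rows))
print(det_U, abs(det_U))
```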


Symmetric transformations

A linear transformation A of ℝⁿ is called symmetric if ⟨ Ax , y ⟩ = ⟨ x , Ay ⟩ for all vectors x and y in ℝⁿ.
A square real matrix A is symmetric if Aᵀ = A. A linear transformation is symmetric if and only if its matrix with respect to an orthonormal basis is symmetric; for such a matrix, Aᵀ = A = A*. All eigenvalues of a symmetric transformation are real, and eigenvectors corresponding to distinct eigenvalues are orthogonal. The product AᵀA is symmetric for every real matrix A.
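For a 2 × 2 symmetric matrix these facts can be verified by hand or in a few lines of code. The sketch below assumes a matrix of the form [[a, b], [b, d]] with b ≠ 0; the matrix A and the helper `eigenpairs_symmetric_2x2` are hypothetical. The discriminant of the characteristic polynomial equals (a − d)² + 4b², which is never negative, so both eigenvalues are real.

```python
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]   # hypothetical symmetric matrix

def apply(M, x):
    return [sum(a * t for a, t in zip(row, x)) for row in M]

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def eigenpairs_symmetric_2x2(M):
    """Eigenvalues and eigenvectors of [[a, b], [b, d]], assuming b != 0."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    disc = (a - d) ** 2 + 4 * b * b      # nonnegative, so the roots are real
    lam1 = (a + d + math.sqrt(disc)) / 2
    lam2 = (a + d - math.sqrt(disc)) / 2
    # The vector (b, lam - a) solves (M - lam*I) v = 0 when b != 0.
    return [(lam1, [b, lam1 - a]), (lam2, [b, lam2 - a])]

(lam1, w1), (lam2, w2) = eigenpairs_symmetric_2x2(A)

# Arbitrary sample vectors for the defining identity <Ax, y> = <x, Ay>.
x, y = [1.0, -2.0], [4.0, 0.5]
symmetric_pairing = inner(apply(A, x), y) == inner(x, apply(A, y))

print(lam1, lam2, inner(w1, w2), symmetric_pairing)
```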


Self-adjoint transformations

A transformation H : ℂⁿ → ℂⁿ is called self-adjoint or Hermitian if ⟨ Hx , y ⟩ = ⟨ x , Hy ⟩ for all vectors x and y in ℂⁿ.
A product AB of two self-adjoint matrices A and B is self-adjoint if and only if the two matrices commute, that is, their commutator is zero: [A, B] = AB − BA = 0.

Eigenvalues of any self-adjoint matrix are real numbers.
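The role of the commutator can be illustrated with small matrices. In this sketch the matrices `sigma_x` and `sigma_y` are two of the Pauli matrices, which are self-adjoint but do not commute, and `B` is a hypothetical self-adjoint matrix that does commute with `sigma_x`; all helper names are illustrative.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of a matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def is_self_adjoint(A):
    return A == adjoint(A)

def commute(A, B):
    return matmul(A, B) == matmul(B, A)

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
B = [[2, 1], [1, 2]]            # equals 2*I + sigma_x, so it commutes with sigma_x

product_xy = matmul(sigma_x, sigma_y)   # factors do not commute
product_xB = matmul(sigma_x, B)         # factors commute

print(commute(sigma_x, sigma_y), is_self_adjoint(product_xy))
print(commute(sigma_x, B), is_self_adjoint(product_xB))
```

The entries here are small integers and multiples of j, so exact equality tests are safe; general floating-point matrices would call for a tolerance.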


Normal transformations

A linear transformation A is called normal if A*A = AA*.
Let the distinct roots of the characteristic equation of a normal matrix A be βᵢ with multiplicities nᵢ. Then the dimension of each eigenspace Vᵢ equals nᵢ. These eigenspaces are mutually perpendicular, and each vector x ∈ V can be written in exactly one way as
\[ {\bf x} = \sum_i {\bf x}_i , \qquad {\bf x}_i = P_i {\bf x} \in V_i , \]
where Pᵢ is the orthogonal projection onto Vᵢ.
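A small worked case shows the decomposition at work. The sketch assumes the hypothetical normal matrix A = [[0, −1], [1, 0]], whose eigenvalues are j and −j, and builds each projection from the formula Pᵢ = (A − βₖI)/(βᵢ − βₖ) with k ≠ i. That formula is a choice made for this example; it applies here because the two eigenvalues are distinct.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def adjoint(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def combine(s, A, t, B):
    """Entrywise linear combination s*A + t*B."""
    return [[s * a + t * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def apply(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I = [[1, 0], [0, 1]]
A = [[0, -1], [1, 0]]          # hypothetical normal matrix
beta1, beta2 = 1j, -1j         # its eigenvalues

is_normal = matmul(adjoint(A), A) == matmul(A, adjoint(A))

# P_i = (A - beta_k I) / (beta_i - beta_k), valid for two distinct eigenvalues.
P1 = combine(1 / (beta1 - beta2), A, -beta2 / (beta1 - beta2), I)
P2 = combine(1 / (beta2 - beta1), A, -beta1 / (beta2 - beta1), I)

x = [2, 3]                     # arbitrary sample vector
x1, x2 = apply(P1, x), apply(P2, x)
overlap = sum(a.conjugate() * b for a, b in zip(x1, x2))   # inner product <x1, x2>

print(is_normal, x1, x2, overlap)
```

The two components add up to x and are perpendicular to each other, and the matrix itself is recovered as β₁P₁ + β₂P₂.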

