Vladimir Dobrushkin
https://math.uri.edu/~dobrush/

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the appendix entitled GNU Free Documentation License.

Matrix Transformations

Let V and U be vector spaces. We call a function \( T\,:\,V \to U \) a linear transformation (or operator) from V into U if, for all \( {\bf x} , {\bf y} \in V \) and any scalar α, we have
  • \( T({\bf x} + {\bf y}) = T({\bf x}) + T({\bf y}) \)  (Additivity property),
  • \( T( \alpha\,{\bf x} ) = \alpha\,T( {\bf x} ) \)          (Homogeneity property).
We often simply call T linear. The space V is referred to as the domain of the linear transformation, and the space U is called the codomain of T. We summarize some almost obvious statements about linear transformations in the following proposition.

Theorem: Let V and U be vector spaces and let \( T\,:\, V\to U \) be a function.

  1. If T is linear, then \( T({\bf 0}) = {\bf 0} . \)
  2. T is linear if and only if \( T(c{\bf x} + {\bf y} ) = c\,T({\bf x}) + T({\bf y}) \) for all \( {\bf x}, {\bf y} \in V \) and any scalar c (a convenient one-condition test, illustrated numerically below).
  3. T is linear if and only if for \( {\bf x}_1 , \ldots , {\bf x}_n \in V \) and any real or complex scalars \( a_1 , \ldots , a_n \)
\[ T \left( \sum_{i=1}^n a_i {\bf x}_i \right) = \sum_{i=1}^n a_i T \left( {\bf x}_i \right) . \]
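
Part 2 of the theorem gives a convenient one-condition test for linearity. Here is a minimal numerical sketch in Python with NumPy (the particular map T and the shift S below are our own illustrations, not from the text): matrix multiplication passes the test, while a translation fails it and also violates \( T({\bf 0}) = {\bf 0} . \)

import numpy as np

# a sample linear map T : R^3 -> R^2 given by a fixed matrix
A = np.array([[3.0, 0.0, -2.0],
              [0.0, 2.0,  5.0]])
def T(x):
    return A @ x

# check the combined criterion T(c x + y) = c T(x) + T(y)
rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)
c = 2.5
print(np.allclose(T(c*x + y), c*T(x) + T(y)))    # True

# a translation S(x) = x + b fails the test (and S(0) != 0)
b = np.ones(3)
def S(x):
    return x + b
print(np.allclose(S(c*x + y), c*S(x) + S(y)))    # False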

  Recall that a set of vectors β is said to generate or span a vector space V if every element of V can be represented as a linear combination of vectors from β.

Example: The span of the empty set \( \varnothing \) consists of a single element, the zero vector 0. Therefore, \( \varnothing \) is linearly independent and it is a basis for the trivial vector space consisting of the zero vector alone. Its dimension is zero.


Example: Recall from the section on Vector Spaces that the set of all ordered n-tuples of real numbers is denoted by the symbol \( \mathbb{R}^n . \) It is customary to represent ordered n-tuples in matrix notation as column vectors. For example, the matrix
\[ {\bf v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \qquad\mbox{or}\qquad {\bf v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} \]
can be used as an alternative to \( {\bf v} = \left[ v_1 , v_2 , \ldots , v_n \right] \quad\mbox{or}\quad {\bf v} = \left( v_1 , v_2 , \ldots , v_n \right) . \) The latter is called the comma-delimited form of a vector and the former is called the column-vector form.

In \( \mathbb{R}^n , \) the vectors \( {\bf e}_1 = [1,0,0,\ldots , 0] , \quad {\bf e}_2 =[0,1,0,\ldots , 0], \quad \ldots , \quad {\bf e}_n =[0,0,\ldots , 0,1] \) form a basis for this n-dimensional real space, called the standard basis; hence the dimension of \( \mathbb{R}^n \) is n. For example, the vectors
\[ {\bf e}_1 = {\bf i} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad {\bf e}_2 = {\bf j} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \qquad {\bf e}_3 = {\bf k} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]
form the standard basis for \( \mathbb{R}^3 . \) Any three-dimensional vector can be expressed in terms of these basis vectors:
\[ {\bf x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ x_2 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ x_3 \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = x_1 {\bf e}_1 + x_2 {\bf e}_2 + x_3 {\bf e}_3 = x_1 {\bf i} + x_2 {\bf j} + x_3 {\bf k} . \]
If f is a function with domain \( \mathbb{R}^m \) and codomain \( \mathbb{R}^n , \) then we say that f is a transformation from \( \mathbb{R}^m \) to \( \mathbb{R}^n \) or that f maps \( \mathbb{R}^m \) into \( \mathbb{R}^n , \) which we denote by writing \( f\,:\, \mathbb{R}^m \to \mathbb{R}^n . \) In the special case where n = m, a transformation is sometimes called an operator on \( \mathbb{R}^n . \)
Suppose we have the system of linear equations
\begin{eqnarray*} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1m} x_m &=& b_1 , \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2m} x_m &=& b_2 , \\ \vdots & & \vdots \\ a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nm} x_m &=& b_n , \end{eqnarray*}
which can be written in matrix notation as
\[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} , \]
or more briefly as
\[ {\bf A}\, {\bf x} = {\bf b} . \]
Although the latter represents a linear system of equations, we can view it instead as a transformation that maps a vector x in \( \mathbb{R}^m \) to a vector in \( \mathbb{R}^n \) by multiplying x on the left by A. We call this a matrix transformation and denote it by \( T_{\bf A}:\, \mathbb{R}^m \to \mathbb{R}^n \) (in the case where m = n, it is called a matrix operator). This transformation is generated by matrix multiplication.

Theorem: Let \( T:\, \mathbb{R}^m \to \mathbb{R}^n \) be a linear transformation. Then there exists a unique matrix A such that

\[ T \left( {\bf x} \right) = {\bf A}\, {\bf x} \qquad\mbox{for all } {\bf x} \in \mathbb{R}^m . \]
In fact, A is the \( n \times m \) matrix whose j-th column is the vector \( T\left( {\bf e}_j \right) , \) where \( {\bf e}_j \) is the j-th column of the identity matrix in \( \mathbb{R}^m : \)
\[ {\bf A} = \left[ T \left( {\bf e}_1 \right) , T \left( {\bf e}_2 \right) , \cdots , T \left( {\bf e}_m \right) \right] . \]

Write \( {\bf x} = {\bf I}_m {\bf x} = \left[ {\bf e}_1 \ \cdots \ {\bf e}_m \right] {\bf x} = x_1 {\bf e}_1 + \cdots + x_m {\bf e}_m , \) and use the linearity of T to compute
\begin{align*} T \left( {\bf x} \right) &= T \left( x_1 {\bf e}_1 + \cdots + x_m {\bf e}_m \right) = x_1 T \left( {\bf e}_1 \right) + \cdots + x_m T \left( {\bf e}_m \right) \\ &= \left[ T \left( {\bf e}_1 \right) \ \cdots \ T \left( {\bf e}_m \right) \right] \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} = {\bf A} \, {\bf x} . \end{align*}
This representation is unique: if B is another matrix with \( T \left( {\bf x} \right) = {\bf B}\,{\bf x} \) for all \( {\bf x} \in \mathbb{R}^m , \) then \( {\bf A}\,{\bf e}_j = {\bf B}\,{\bf e}_j \) for each standard basis vector, so the corresponding columns of A and B agree and A = B.
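
To see the construction in the theorem at work, here is a short Python sketch (using NumPy; the particular map T below is our own example, not from the text): we build A column-by-column from the images \( T({\bf e}_j) \) and check that \( T({\bf x}) = {\bf A}\,{\bf x} . \)

import numpy as np

# a sample linear map T : R^2 -> R^3 defined by formulas, not by a matrix
def T(x):
    x1, x2 = x
    return np.array([x1 + x2, x1 - x2, 2*x2])

# build A column-by-column from the images of the standard basis vectors
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A = np.column_stack([T(e1), T(e2)])
print(A)                          # [[ 1.  1.] [ 1. -1.] [ 0.  2.]]

x = np.array([3.0, -2.0])
print(np.allclose(T(x), A @ x))   # True: T(x) = A x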


Example: The transformation T from \( \mathbb{R}^4 \) to \( \mathbb{R}^3 \) defined by the equations
\begin{eqnarray*} 3\, x_1 -2\, x_2 + 5\, x_3 - 7\, x_4 &=& b_1 , \\ x_1 + 7\, x_2 -3\, x_3 + 5\, x_4 &=& b_2 , \\ 4\, x_1 -3\, x_2 + x_3 -6\, x_4 &=& b_3 , \end{eqnarray*}
can be represented in matrix form as
\[ \begin{bmatrix} 3 & -2 & 5 & -7 \\ 1 & 7 & -3 & 5 \\ 4 & -3 & 1 & -6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} , \]
from which we see that the transformation can be interpreted as a multiplication by
\[ {\bf A} = \begin{bmatrix} 3 & -2 & 5 & -7 \\ 1 & 7 & -3 & 5 \\ 4 & -3 & 1 & -6 \end{bmatrix} . \]
Therefore, the transformation T is generated by matrix A. For example, if
\[ {\bf x} = \begin{bmatrix} -1 \\ 2 \\ 3 \\ -4 \end{bmatrix} , \]
then
\[ T_{\bf A} \left( {\bf x} \right) = {\bf A}\, {\bf x} = \begin{bmatrix} 3 & -2 & 5 & -7 \\ 1 & 7 & -3 & 5 \\ 4 & -3 & 1 & -6 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \\ 3 \\ -4 \end{bmatrix} = \begin{bmatrix} 36 \\ -16 \\ 17 \end{bmatrix} . \]
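
The arithmetic is easy to confirm in a couple of lines of Python (a sketch using NumPy):

import numpy as np

A = np.array([[3, -2,  5, -7],
              [1,  7, -3,  5],
              [4, -3,  1, -6]])
x = np.array([-1, 2, 3, -4])
print(A @ x)                      # [ 36 -16  17]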


Example: If 0 is the \( n \times m \) zero matrix, then
\[ T_{\bf 0} \left( {\bf x} \right) = {\bf 0}\,{\bf x} = {\bf 0} , \]
so multiplication by the zero matrix maps every vector in \( \mathbb{R}^m \) to the zero vector in \( \mathbb{R}^n . \) Such a transformation is called the zero transformation from \( \mathbb{R}^m \) to \( \mathbb{R}^n . \)


Example: If I is the \( n \times n \) identity matrix, then \( T_{\bf I} \left( {\bf x} \right) = {\bf I}\,{\bf x} = {\bf x} \) for every vector \( {\bf x} \in \mathbb{R}^n , \) so multiplication by I maps every vector in \( \mathbb{R}^n \) to itself. Such a transformation is called the identity operator on \( \mathbb{R}^n . \) ■

 

Theorem: For every \( n \times m \) matrix A, the matrix transformation \( T_{\bf A}:\, \mathbb{R}^m \to \mathbb{R}^n \) has the following properties for all vectors v and u in \( \mathbb{R}^m \) and for every scalar k:

  1. \( T_{\bf A} \left( {\bf 0} \right) = {\bf 0} . \)
  2. \( T_{\bf A} \left( k{\bf u} \right) = k\,T_{\bf A} \left( {\bf u} \right) . \)
  3. \( T_{\bf A} \left( {\bf v} \pm {\bf u} \right) = T_{\bf A} \left( {\bf v} \right) \pm T_{\bf A} \left( {\bf u} \right) . \)

All parts are restatements of the following properties of matrix arithmetic:
\[ {\bf A}{\bf 0} = {\bf 0}, \qquad {\bf A}\left( k{\bf u} \right) = k \left( {\bf A}\, {\bf u} \right) , \qquad {\bf A} \left( {\bf v} \pm {\bf u} \right) = {\bf A} \left( {\bf v} \right) \pm {\bf A} \left( {\bf u} \right) . \]
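
These identities can be spot-checked numerically. A minimal sketch (NumPy, reusing the matrix from the earlier example with arbitrarily chosen vectors of our own):

import numpy as np

A = np.array([[3., -2.,  5., -7.],
              [1.,  7., -3.,  5.],
              [4., -3.,  1., -6.]])
u = np.array([1., 0., 2., -1.])
v = np.array([0., 3., 1., 2.])
k = 4.0
print(np.allclose(A @ np.zeros(4), np.zeros(3)))   # T_A(0) = 0
print(np.allclose(A @ (k*u), k*(A @ u)))           # homogeneity
print(np.allclose(A @ (v - u), A @ v - A @ u))     # additivity (with -)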

Theorem: \( T\,:\, \mathbb{R}^m \to \mathbb{R}^n \) is a matrix transformation if and only if the following relationships hold for all vectors v and u in \( \mathbb{R}^m \) and for every scalar k:

  1. \( T \left( {\bf v} + {\bf u} \right) = T \left( {\bf v} \right) + T \left( {\bf u} \right) \)  (Additivity property),
  2. \( T \left( k{\bf v} \right) = k\,T \left( {\bf v} \right) \)          (Homogeneity property).

If T is a matrix transformation, then properties 1 and 2 follow from the previous theorem.

Conversely, assume that properties 1 and 2 hold. We must show that there exists an n-by-m matrix A such that

\[ T \left( {\bf x} \right) = {\bf A}\, {\bf x} \]
for every \( {\bf x} \in \mathbb{R}^m . \) Since T satisfies properties 1 and 2, it is linear, and therefore
\[ T \left( k_1 {\bf x}_1 + k_2 {\bf x}_2 + \cdots + k_m {\bf x}_m \right) = k_1 T \left( {\bf x}_1 \right) + k_2 T \left( {\bf x}_2 \right) + \cdots + k_m T \left( {\bf x}_m \right) \]
for all vectors \( {\bf x}_i , \quad i=1,2,\ldots , m , \) and for all scalars \( k_i , \quad i=1,2,\ldots , m . \) Let A be the matrix
\[ {\bf A} = \left[ T \left( {\bf e}_1 \right) \, | \, T \left( {\bf e}_2 \right) \, | \, \cdots \, | \, T \left( {\bf e}_m \right) \right] , \]
where \( {\bf e}_i , \quad i=1,2,\ldots , m , \) are the standard basis vectors for \( \mathbb{R}^m . \) We know that A x is a linear combination of the columns of A in which the successive coefficients are the entries \( x_1 , x_2 , \ldots , x_m \) of x. That is,
\[ {\bf A}\,{\bf x} = x_1 T \left( {\bf e}_1 \right) + x_2 T \left( {\bf e}_2 \right) + \cdots + x_m T \left( {\bf e}_m \right) . \]
Using the linearity of T, we have
\[ {\bf A}\,{\bf x} = T \left( x_1 {\bf e}_1 + x_2 {\bf e}_2 + \cdots + x_m {\bf e}_m \right) = T \left( {\bf x} \right) , \]
which completes the proof.

Theorem: Every linear transformation from \( \mathbb{R}^m \) to \( \mathbb{R}^n \) is a matrix transformation, and conversely, every matrix transformation from \( \mathbb{R}^m \) to \( \mathbb{R}^n \) is a linear transformation.

Theorem: If \( T_{\bf A}\,:\, \mathbb{R}^m \to \mathbb{R}^n \) and \( T_{\bf B}\,:\, \mathbb{R}^m \to \mathbb{R}^n \) are matrix transformations, and if \( T_{\bf A} \left( {\bf v} \right) = T_{\bf B} \left( {\bf v} \right) \) for every vector \( {\bf v} \in \mathbb{R}^m , \) then A = B.

To say that \( T_{\bf A} \left( {\bf v} \right) = T_{\bf B} \left( {\bf v} \right) \) for every vector in \( \mathbb{R}^m \) is the same as saying that
\[ {\bf A}\,{\bf v} = {\bf B}\,{\bf v} \]
for every vector v in \( \mathbb{R}^m . \) This will be true, in particular, if v is any of the standard basis vectors \( {\bf e}_1 , {\bf e}_2 , \ldots , {\bf e}_m \) for \( \mathbb{R}^m ; \) that is,
\[ {\bf A}\,{\bf e}_j = {\bf B}\,{\bf e}_j \qquad (j=1,2,\ldots , m) . \]
Since every entry of \( {\bf e}_j \) is 0 except the j-th, which is 1, it follows that \( {\bf A}\,{\bf e}_j \) is the j-th column of A and \( {\bf B}\,{\bf e}_j \) is the j-th column of B. Thus, \( {\bf A}\,{\bf e}_j = {\bf B}\,{\bf e}_j \) implies that corresponding columns of A and B are the same, and hence A = B.
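
The key step, that \( {\bf A}\,{\bf e}_j \) extracts the j-th column of A, is easy to verify numerically (a sketch with a hypothetical 3-by-4 matrix):

import numpy as np

A = np.arange(12.0).reshape(3, 4)    # any 3x4 matrix will do
e2 = np.eye(4)[:, 1]                 # second standard basis vector of R^4
print(np.allclose(A @ e2, A[:, 1]))  # True: A e_2 is the second column of A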

  The above theorem tells us that there is a one-to-one correspondence between \( n \times m \) matrices and matrix transformations from \( \mathbb{R}^m \) to \( \mathbb{R}^n \) in the sense that every \( n \times m \) matrix A generates exactly one matrix transformation (multiplication by A) and every matrix transformation from \( \mathbb{R}^m \) to \( \mathbb{R}^n \) arises from exactly one \( n \times m \) matrix. We call that matrix the standard matrix for the transformation; it is given by the formula

\[ {\bf A} = \left[ T \left( {\bf e}_1 \right) \,|\, T \left( {\bf e}_2 \right) \,|\, \cdots \,|\, T \left( {\bf e}_m \right) \right] . \]
This suggests the following procedure for finding standard matrices.

Algorithm for finding the standard matrix of a linear transformation:
Step 1: Find the images of the standard basis vectors \( {\bf e}_1 , {\bf e}_2 , \ldots , {\bf e}_m \) for \( \mathbb{R}^m . \)
Step 2: Construct the matrix that has the images obtained in Step 1 as its successive columns. This matrix is the standard matrix for the transformation.
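
The two steps translate directly into code. A minimal Python sketch with NumPy (the function name standard_matrix and its calling convention, a callable T together with the domain dimension m, are our own choices):

import numpy as np

def standard_matrix(T, m):
    # Step 1: images of the standard basis vectors e_1, ..., e_m of R^m
    images = [T(np.eye(m)[:, j]) for j in range(m)]
    # Step 2: use those images as the successive columns of A
    return np.column_stack(images)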

Example: Find the standard matrix A for the linear transformation:
\[ T \left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = \begin{bmatrix} 3\,x_1 -2\,x_3 \\ 2\,x_2 + 5\,x_3 \end{bmatrix} . \]

To answer the question, we apply the linear transformation T to each standard basis vector:

\[ T \left( {\bf e}_1 \right) = 3\, {\bf i} = \begin{bmatrix} 3 \\ 0 \end{bmatrix} , \quad T \left( {\bf e}_2 \right) = 2\, {\bf j} = \begin{bmatrix} 0 \\ 2 \end{bmatrix} , \quad T \left( {\bf e}_3 \right) = -2\, {\bf i} + 5\,{\bf j} = \begin{bmatrix} -2 \\ 5 \end{bmatrix} . \]
Therefore, the transformation acts as multiplication by its standard matrix:
\[ T \left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = {\bf A}\,{\bf x} = \begin{bmatrix} 3&0&-2 \\ 0&2&5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} . \]
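
Continuing the sketch above, the helper standard_matrix recovers this matrix from the formula for T:

def T(x):
    x1, x2, x3 = x
    return np.array([3*x1 - 2*x3, 2*x2 + 5*x3])

print(standard_matrix(T, 3))
# [[ 3.  0. -2.]
#  [ 0.  2.  5.]]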


Example: Over the field of complex numbers, the vector space \( \mathbb{C} \) of all complex numbers has dimension 1 because the single element \( \{ 1 \} \) forms a basis.

Over the field of real numbers, the vector space \( \mathbb{C} \) of all complex numbers has dimension 2 because the two elements \( \{ 1, {\bf j} \} \) form a basis.
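
Viewing \( \mathbb{C} \) over \( \mathbb{R} \) as \( \mathbb{R}^2 \) ties this example back to matrix transformations: multiplication by a fixed complex number is a linear operator on \( \mathbb{R}^2 \) and therefore has a standard matrix. A sketch of our own (with a hypothetical \( z = 2 + 3{\bf j} \)):

import numpy as np

a, b = 2.0, 3.0                    # z = a + b j
def mult_by_z(x):                  # x = [Re w, Im w] represents w in C
    return np.array([a*x[0] - b*x[1], b*x[0] + a*x[1]])

# its standard matrix with respect to the basis {1, j}
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(np.column_stack([mult_by_z(e1), mult_by_z(e2)]))
# [[ 2. -3.]
#  [ 3.  2.]]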