Linear Algebra, Part 1: Duality (Mathematica)

Duality in matrix/vector multiplication

Although duality is covered in Part 3, we introduce some of its elements here in order to show a different point of view in the context of matrix/vector multiplication.

So far, we have been using a naïve definition of a vector as an ordered (or indexed) finite sequence of scalars; it is usually written in parentheses as an n-tuple (x₁, x₂, … , x_n), an element of the Cartesian product 𝔽ⁿ = 𝔽 × 𝔽 × ⋯ × 𝔽 of n copies of the field of scalars 𝔽. Later, we will extend this definition to include abstract objects; for the moment, however, it is sufficient to work with n-tuples. Recall that for scalars we use either the rational numbers ℚ, the real numbers ℝ, or the complex numbers ℂ.

Usually, an n-tuple (x₁, x₂, … , x_n), written in parentheses, and a row vector [x₁, x₂, … , x_n], written in brackets, look like the same object to human eyes. One of the pedagogical virtues of any software package is its requirement to pay close attention to the types of objects used in specific contexts. In particular, the computer algebra system Mathematica treats these two versions of a vector differently because

\[ \left( x_1 , x_2 , \ldots , x_n \right) \in \mathbb{F}^n , \]

\[ \mbox{but} \]

\[ \left[ x_1 , x_2 , \ldots , x_n \right] \in \mathbb{F}^{1\times n} , \]

where 𝔽^m×n denotes the space of m × n matrices with entries from the field 𝔽. Hence, Mathematica treats the bracket notation as a matrix with one row.

x = {1, 2}; y = {{1, 2}};
x == y

False

Mathematica considers the single curly bracket notation as a list of entries, which MatrixForm displays as a column vector for convenience. On the other hand, the double curly bracket notation is considered as a list of lists and is represented as a matrix with one row.

x = {1, 2};
MatrixForm[x]

\( \displaystyle \begin{pmatrix} 1 \\ 2 \end{pmatrix} \)

MatrixForm[{x}]

\( \displaystyle \begin{pmatrix} 1 & 2 \end{pmatrix} \)

Note that the dot product of these objects is non-commutative. That is, the order matters:

x . y

Dot::dotsh: Tensors {1,2} and {{1,2}} have incompatible shapes.

y . x

{5}
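The asymmetry comes from the shapes of the two objects. The following is a minimal sketch, reusing the names x and y from above and assuming only the built-in functions Dimensions and First, that inspects the shapes and extracts the scalar from the one-element result:

```mathematica
x = {1, 2};      (* flat list of shape {2} *)
y = {{1, 2}};    (* list of lists of shape {1, 2}, that is, a 1×2 matrix *)

Dimensions[x]    (* {2} *)
Dimensions[y]    (* {1, 2} *)

(* Dot contracts the last index of the left factor with the first index
   of the right factor: y . x pairs 2 with 2, whereas x . y would pair 2 with 1 *)
First[y . x]     (* 5 *)
```

Here First simply removes the outer braces of the one-element list {5}.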

Matrices, as two-dimensional arrays, provide a dual version of a row vector, known as a column vector:

\[ {\bf x} = \left[ \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array} \right] \in \mathbb{F}^{n\times 1} , \qquad {\bf a} = \left[ a_1 , a_2 , \ldots , a_n \right] \in \mathbb{F}^{1\times n} . \]

As matrices, column vectors (x ∈ 𝔽^n×1) can be multiplied by row vectors (a ∈ 𝔽^1×n) from the left when both have the same number of entries:

\[ \left[ a_1 , a_2 , \ldots , a_n \right] \,\left[ \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array} \right] = \left[ a_1 x_1 + \cdots + a_n x_n \right] \in \mathbb{F}^{1\times 1} . \]

The result is a matrix with one row and one column. In order to extract the entry from this 1×1 matrix, mathematicians consider the expression as a linear combination. It can also be defined by the dot product:

\[ \left( a_1 , \ldots , a_n \right) \bullet \left( x_1 , \ldots , x_n \right) = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \in \mathbb{R} , \]

when all entries are real numbers. The extension of the dot product to complex numbers (the inner product) is postponed until Part 5. Although the dot product formula above can be evaluated for complex numbers as well, it does not define a metric (distance) in complex vector spaces. Physicists usually use Dirac's notation (also known as bra-ket notation), ⟨a | x⟩, as an alternative to the linear combination or dot product.
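To see why the dot product formula fails to measure length over the complex numbers, consider the vector z = (1, i), which is chosen here only for illustration and does not appear elsewhere in this section. Mathematica's Dot does not conjugate either argument, so a nonzero vector can have a zero dot product with itself:

```mathematica
z = {1, I};

z . z               (* 1*1 + I*I = 0, although z is not the zero vector *)
Conjugate[z] . z    (* 2, conjugating one factor gives |1|^2 + |I|^2 *)
```

Conjugating one of the factors is the usual remedy, and it is the idea behind the inner product mentioned above.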

Every numeric vector a = (𝑎₁, 𝑎₂, … , 𝑎_n) ∈ 𝔽ⁿ determines the corresponding dual vector (known as a covector) a* that defines a linear functional a* : 𝔽ⁿ ⇾ 𝔽 via the simple rule

\[ {\bf a}^{\ast} ({\bf x}) = \left\langle {\bf a} \mid {\bf x} \right\rangle = a_1 x_1 + \cdots + a_n x_n , \quad {\bf x} \in \mathbb{F}^n . \]

When the vector a = (𝑎₁, 𝑎₂, … , 𝑎_n) is considered as a covector, it is usually called a bra-vector and is denoted by ⟨a∣, while x = (x₁, x₂, … , x_n) ∈ 𝔽ⁿ is called a ket-vector, or just a vector, and is denoted by ∣x⟩. Then the linear combination \( \displaystyle a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \) (which is the dot product for real numbers) defines a linear functional (or linear form), which we denote by a*. Hence, every n-tuple a = (𝑎₁, 𝑎₂, … , 𝑎_n) defines a covector (or linear functional). In the case of real numbers, the action of a covector on a vector is just the dot product.
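The passage from an n-tuple to its covector can be sketched in Mathematica as a function that returns a function. The helper name covector and the sample numbers below are hypothetical and serve only as an illustration; the symbols u1, u2, u3, w1, w2, w3, p, q are assumed to have no assigned values.

```mathematica
(* covector[a] returns the linear functional v -> a . v (hypothetical helper) *)
covector[a_List] := Function[v, a . v]

f = covector[{3, -1, 2}];
f[{1, 4, 2}]     (* 3*1 - 1*4 + 2*2 = 3 *)

(* linearity check on symbolic vectors and symbolic scalars p, q *)
u = {u1, u2, u3}; w = {w1, w2, w3};
Simplify[f[p u + q w] - (p f[u] + q f[w])]     (* 0 *)
```

The zero in the last line expresses the linearity a*(p u + q w) = p a*(u) + q a*(w).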

Example 1: The linear functional on ℝ⁴ given by \[ f\left( x_1 , x_2 , x_3 , x_4 \right) = 2\,x_1 - 3\, x_2 + x_3 - 5\, x_4 \] is dual to the numeric vector a = (2, −3, 1, −5). This vector generates a linear functional on ℝ⁴ by means of dot product

Clear[x];
fx1 = {Subscript[x, 1], Subscript[x, 2], Subscript[x, 3], Subscript[x, 4]};
coefVec1 = {2, -3, 1, -5};
fx1 . coefVec1

\( \displaystyle 2\,x_1 - 3\, x_2 + x_3 - 5\, x_4 \)

\[ \langle {\bf a} \mid {\bf x} \rangle = {\bf a} \bullet {\bf x} = 2\,x_1 - 3\, x_2 + x_3 - 5\, x_4 , \] for an arbitrary vector x = (x₁, x₂, x₃, x₄) ∈ ℝ⁴. This functional, written in bra-ket notation, can be restated in standard mathematical form as \[ {\bf a}^{\ast} \, : \ \mathbb{R}^4 \to \mathbb{R} , \qquad\mbox{with} \qquad {\bf a}^{\ast} ({\bf x}) = \langle {\bf a} \mid {\bf x} \rangle = {\bf a} \bullet {\bf x} . \]

On the other hand, the numeric vector a = (2, −3, 1) ∈ ℝ³ is dual to the linear functional (known as a covector) given by \[ {\bf a}^{\ast} \left( {\bf x} \right) = \left\langle {\bf a} \mid {\bf x} \right\rangle = {\bf a} \bullet {\bf x} = 2\, x_1 -3\, x_2 + x_3 , \quad {\bf x} \in \mathbb{R}^3 . \]

fx2 = {Subscript[x, 1], Subscript[x, 2], Subscript[x, 3]};
coefVec2 = {2, -3, 1};
fx2 . coefVec2

\( \displaystyle 2\,x_1 - 3\, x_2 + x_3 \)

In particular, the actions of the covector ⟨a | on some ket-vectors from ℝ³ are given below: \begin{align*} {\bf a}^{\ast} (1, 2, 3) &= {\bf a} \bullet \left( 1, 2, 3 \right) = 2\cdot 1 -3\cdot 2 + 3 = -1, \\ {\bf a}^{\ast} (-1, 2, -5) &= {\bf a} \bullet \left( -1, 2, -5 \right) = 2\cdot (-1) -3 \cdot 2 -5 = -13 . \end{align*}
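These two values can be confirmed numerically. The short check below is a sketch that uses only the built-in Dot; the name aVec is introduced here for illustration.

```mathematica
aVec = {2, -3, 1};

aVec . {1, 2, 3}      (* 2 - 6 + 3 = -1 *)
aVec . {-1, 2, -5}    (* -2 - 6 - 5 = -13 *)
```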

End of Example 1

In Linear Algebra, the word duality traditionally refers to the dual roles of vector and covector associated with a given numeric vector, as in the definition above. The row/column duality just described is one of many beautiful dualities we highlight in this tutorial. We will see many other examples of duality, including image/inverse image, geometric/numeric, and projection/span. Problems that seem difficult from one viewpoint often turn out to be much easier from the dual viewpoint. Each of the two perspectives illuminates the other in a beautiful and useful way, and we shall make good use of this principle many times.

We can regard an m × n matrix, for instance, as a list of m (row) vectors in 𝔽^1×n, or as a list of n (column) vectors in 𝔽^m×1, and the difference is not superficial, as we shall come to see. Every m × n matrix

\[ {\bf A} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & a_{2,3} & \cdots & a_{2,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & a_{m,3} & \cdots & a_{m,n} \end{bmatrix} \in \mathbb{F}^{m\times n} \]

defines a linear transformation T_A : 𝔽^n×1 ⇾ 𝔽^m×1 by multiplication from the left:

\[ T_A ({\bf x}) = {\bf A}\, {\bf x} \in \mathbb{F}^{m\times 1} , \qquad \mbox{with} \quad {\bf x} \in \mathbb{F}^{n\times 1} . \]

However, the same m × n matrix A can operate on row vectors y ∈ 𝔽^1×m from the right:

\[ {\bf y}\,{\bf A} \in \mathbb{F}^{1\times n} , \qquad \mbox{with} \qquad {\bf y}\,{\bf A} = \left( {\bf A}^{\mathrm T} {\bf y}^{\mathrm T} \right)^{\mathrm T} , \]

where "T" stands for transposition, a dual operation that turns rows into columns and vice versa. Since the following exposition uses matrix multiplication from the left, we are forced to consider vectors as columns. Then

\begin{align*} {\bf A}\,{\bf x} &= \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & a_{2,3} & \cdots & a_{2,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & a_{m,3} & \cdots & a_{m,n} \end{bmatrix} \,\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \\ &= \begin{bmatrix} a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n \\ a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n \\ \vdots \\ a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n \end{bmatrix} \\ &= {\bf c}_1 ({\bf A})\, x_1 + {\bf c}_2 ({\bf A})\, x_2 + \cdots + {\bf c}_n ({\bf A})\, x_n , \end{align*}

where c_i(A) , i = 1, 2, … , n, are columns of matrix A:

\[ {\bf c}_1 ({\bf A}) = \left[ \begin{array}{c} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{m,1} \end{array} \right] , \quad{\bf c}_2 ({\bf A}) = \left[ \begin{array}{c} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{m,2} \end{array} \right] , \quad \cdots \quad , {\bf c}_n ({\bf A}) = \left[ \begin{array}{c} a_{1,n} \\ a_{2,n} \\ \vdots \\ a_{m,n} \end{array} \right] \in \mathbb{F}^{m\times 1} . \]

In other words, A x is a linear combination of the column vectors of matrix A with weights taken from the coordinates of vector x. Alternatively, using dual covectors, we can say the same thing as follows:

\[ {\bf A}\,{\bf x} = \begin{bmatrix} {\bf r}_1^{\ast} ( {\bf x}) \\ {\bf r}_2^{\ast} ( {\bf x}) \\ \vdots \\ {\bf r}_m^{\ast} ( {\bf x}) \end{bmatrix} \in \mathbb{F}^{m\times 1} \qquad\mbox{or} \qquad {\bf A}\,{\bf x} = \begin{bmatrix} {\bf r}_1 ({\bf A}) \bullet {\bf x} \\ {\bf r}_2 ({\bf A}) \bullet {\bf x} \\ \vdots \\ {\bf r}_m ({\bf A}) \bullet {\bf x} \end{bmatrix} \in \mathbb{R}^{m\times 1} , \]

where r_j(A), j = 1, 2, … , m, are row vectors of matrix A:

\[ {\bf r}_1 ({\bf A}) = \left( a_{1,1} , a_{1,2} , \ldots , a_{1,n} \right) , \quad \cdots \quad , {\bf r}_m ({\bf A}) = \left( a_{m,1} , a_{m,2} , \ldots , a_{m,n} \right) . \]
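Both descriptions of the product A x, as a weighted sum of columns and as a stack of row dot products, can be compared symbolically. The sketch below uses a 3 × 2 matrix with generic entries; the names mat, vec, byColumns, byRows and the symbols c, t are introduced only for this illustration and are assumed to have no assigned values.

```mathematica
mat = Array[c, {3, 2}];     (* 3×2 matrix with symbolic entries c[i, j] *)
vec = Array[t, 2];          (* vector with symbolic entries t[1], t[2] *)

byColumns = Sum[vec[[i]] mat[[All, i]], {i, 2}];   (* weighted sum of the columns *)
byRows = Map[# . vec &, mat];                      (* one dot product per row *)

Simplify[byColumns - mat . vec]   (* {0, 0, 0} *)
Simplify[byRows - mat . vec]      (* {0, 0, 0} *)
```

Both differences vanish, so the column view and the row view produce the same vector as the built-in product mat . vec.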

Example 2: We consider a 2 × 3 matrix A and vectors x ∈ ℝ^3×1, y ∈ ℝ^1×2, where \[ {\bf A} = \begin{bmatrix} 1&2&3 \\ 4&5&6 \end{bmatrix} , \qquad {\bf x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} , \quad {\bf y} = \left[ y_1 , y_2 \right] . \]

Clear[A, x, y];
A = {{1, 2, 3}, {4, 5, 6}};
xVec = {Subscript[x, 1], Subscript[x, 2], Subscript[x, 3]};
yVec = {{Subscript[y, 1], Subscript[y, 2]}};
Grid[{MatrixForm[#] & /@ {A, xVec, yVec}}]

\( \displaystyle \begin{pmatrix} 1&2&3 \\ 4&5&6 \end{pmatrix} \qquad \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad \begin{pmatrix} y_1 & y_2 \end{pmatrix} \)

Since columns of matrix A are \[ {\bf c}_1 ({\bf A}) = \left[ \begin{array}{c} 1 \\ 4 \end{array} \right] , \quad {\bf c}_2 ({\bf A}) = \left[ \begin{array}{c} 2 \\ 5 \end{array} \right] , \quad {\bf c}_3 ({\bf A}) = \left[ \begin{array}{c} 3 \\ 6 \end{array} \right] , \]

Subscript[col, 1] = A[[All, 1]];
Subscript[col, 2] = A[[All, 2]];
Subscript[col, 3] = A[[All, 3]];
Grid[{MatrixForm[#] & /@ {Subscript[col, 1], Subscript[col, 2], Subscript[col, 3]}}]

\( \displaystyle \begin{pmatrix} 1 \\ 4 \end{pmatrix} \qquad \begin{pmatrix} 2 \\ 5 \end{pmatrix} \qquad \begin{pmatrix} 3 \\ 6 \end{pmatrix} \)

and rows are \[ {\bf r}_1 ({\bf A}) = \left( 1, 2, 3 \right) , \quad {\bf r}_2 ({\bf A}) = \left( 4, 5, 6 \right) , \]

Subscript[row, 1] = A[[1]];
Subscript[row, 2] = A[[2]];
Grid[{# & /@ {Subscript[row, 1], Subscript[row, 2]}}]

{1, 2, 3} {4, 5, 6}

we find its product with vector x as \[ {\bf A}\,{\bf x} = \begin{bmatrix} 1\,x_1 + 2\,x_2 + 3\,x_3 \\ 4\, x_1 + 5\, x_2 + 6\,x_3 \end{bmatrix} \in \mathbb{R}^{2 \times 1} . \] This column vector can be split into a sum of three vectors: \[ {\bf A}\,{\bf x} = x_1 \left[ \begin{array}{c} 1 \\ 4 \end{array} \right] + x_2\left[ \begin{array}{c} 2 \\ 5 \end{array} \right] + x_3 \left[ \begin{array}{c} 3 \\ 6 \end{array} \right] = \begin{bmatrix} 1\,x_1 \\ 4\,x_1 \end{bmatrix} + \begin{bmatrix} 2\,x_2 \\ 5\,x_2 \end{bmatrix} + \begin{bmatrix} 3\,x_3 \\ 6\,x_3 \end{bmatrix} \in \mathbb{R}^{2\times 1} . \] Each entry in the column vector A x is a dot product of two vectors: \[ {\bf A}\,{\bf x} = \begin{bmatrix} 1\,x_1 + 2\,x_2 + 3\,x_3 \\ 4\, x_1 + 5\, x_2 + 6\,x_3 \end{bmatrix} = \begin{bmatrix} {\bf r}_1 ({\bf A}) \bullet {\bf x} \\ {\bf r}_2 ({\bf A}) \bullet {\bf x} \end{bmatrix} \in \mathbb{R}^{2\times 1} , \] where \begin{align*} {\bf r}_1 ({\bf A}) \bullet {\bf x} &= \left( 1, 2, 3 \right) \bullet \left( x_1 , x_2 , x_3 \right) = x_1 + 2\, x_2 + 3\, x_3 \in \mathbb{R} , \\ {\bf r}_2 ({\bf A}) \bullet {\bf x} &= \left( 4, 5, 6 \right) \bullet \left( x_1 , x_2 , x_3 \right) = 4\,x_1 + 5\, x_2 + 6\, x_3 \in \mathbb{R} . \end{align*}

Transpose[A]*xVec
% // MatrixForm

\( \displaystyle \left\{ \left\{ x_1 , \ 4\,x_1 \right\} , \ \left\{ 2\,x_2 , \ 5\, x_2 \right\} , \ \left\{ 3\,x_3 , \ 6\,x_3 \right\} \right\} \)

\( \displaystyle \begin{pmatrix} x_1 & 4\, x_1 \\ 2\,x_2 & 5\, x_2 \\ 3\,x_3 & 6\, x_3 \end{pmatrix} \)

Grid[{{Subscript[row, 1] . xVec}, {Subscript[row, 2] . xVec}}]

\( \displaystyle \begin{array}{c} x_1 + 2\,x_2 + 3\,x_3 \\ 4\,x_1 + 5\,x_2 + 6\, x_3 \end{array} \)
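The elementwise product Transpose[A]*xVec lists the three scaled columns separately; adding them recovers the ordinary product. The sketch below assumes plain symbols x1, x2, x3 (with no assigned values) in place of the subscripted variables, so that it can be run on its own.

```mathematica
A = {{1, 2, 3}, {4, 5, 6}};
xVec = {x1, x2, x3};     (* plain symbols standing in for Subscript[x, i] *)

A . xVec                                (* {x1 + 2 x2 + 3 x3, 4 x1 + 5 x2 + 6 x3} *)
Total[Transpose[A]*xVec]                (* the three scaled columns added together *)
A . xVec === Total[Transpose[A]*xVec]   (* True *)
```

Total adds the rows of the 3 × 2 array, which are exactly the scaled columns of A.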

On the other hand, the dual multiplication from the right yields \begin{align*} {\bf y}\,{\bf A} &= \left[ {\bf y} \bullet {\bf c}_1 ({\bf A}) , {\bf y} \bullet {\bf c}_2 ({\bf A}) ,{\bf y} \bullet {\bf c}_3 ({\bf A}) \right] \\ &= \left[ y_1 + 4\,y_2 , \ 2\,y_1 + 5\,y_2 , \ 3\, y_1 + 6\,y_2 \right] \in \mathbb{R}^{1\times 3} . \end{align*}
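The right multiplication and its relation to transposition can be checked in the same way. This is a sketch under the assumption that y1 and y2 are plain symbols with no assigned values; they stand in for the subscripted entries of y.

```mathematica
A = {{1, 2, 3}, {4, 5, 6}};
yVec = {{y1, y2}};       (* 1×2 row matrix *)

yVec . A                 (* {{y1 + 4 y2, 2 y1 + 5 y2, 3 y1 + 6 y2}} *)

(* the identity y A = (A^T y^T)^T *)
yVec . A === Transpose[Transpose[A] . Transpose[yVec]]   (* True *)
```

The result keeps its double braces because it is a 1 × 3 matrix rather than a flat list.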

End of Example 2

