Preface

This tutorial was made solely for the purpose of education, and it was designed for students taking Applied Math 0340. It is primarily for students who have some experience using Mathematica. If you have never used Mathematica before and would like to learn more of the basics of this computer algebra system, it is strongly recommended that you look at the APMA 0330 tutorial. As a friendly reminder, don't forget to clear variables in use and/or the kernel.

Finally, the commands in this tutorial are all written in bold black font, while Mathematica output is in normal font. This means that you can copy and paste all commands into Mathematica, change the parameters, and run them. You, as the user, are free to use the scripts for your needs to learn the Mathematica program, and have the right to distribute this tutorial and refer to it as long as it is credited appropriately.


How to define vectors

A vector is a quantity that has magnitude and direction and that is commonly represented by a directed line segment whose length represents the magnitude and whose orientation in space represents the direction. In mathematics, it is always assumed that vectors can be added or subtracted, and multiplied by a scalar (a real or complex number). It is also assumed that there exists a unique zero vector (of zero magnitude and no direction), which can be added to or subtracted from any vector without changing the outcome. The zero vector is not the number zero. Wind, for example, has both a speed and a direction and, hence, is conveniently expressed as a vector. The same can be said of moving objects, momentum, forces, electromagnetic fields, and weight. (Weight is the force produced by the acceleration of gravity acting on a mass.)

A set of vectors that is closed under these operations is usually called a vector space (also a linear space), which is an abstract notion in mathematics. Historically, the first ideas leading to vector spaces can be traced back as far as the 17th century; however, the idea crystallized with the work of the German mathematician Hermann Günther Grassmann (1809--1877), who published a paper in 1862. A vector space is a collection of objects called vectors, which may be added together and multiplied ("scaled") by numbers, called scalars in this context. Scalars are often taken to be real numbers, but there are also vector spaces with scalar multiplication by complex numbers, rational numbers, or generally any field. The operations of vector addition and scalar multiplication must satisfy certain requirements, called axioms.

When a basis in a vector space has been chosen, a vector can be expanded with respect to the basis vectors. For example, when unit vectors in n-dimensional space \( \mathbb{R}^n \) have been chosen, any vector can be uniquely expanded with respect to this basis. In three-dimensional space, it is customary to use the Cartesian coordinate system and denote these unit vectors by i (abscissa or horizontal axis, usually denoted by x), j (ordinate or vertical axis, usually denoted by y), and k (applicate, usually denoted by z). With respect to these unit vectors, any vector can be written as \( {\bf v} = x\,{\bf i} + y\,{\bf j} + z\,{\bf k} , \) or more generally, as \( {\bf v} = v_1 {\bf i} + v_2 {\bf j} + v_3 {\bf k} ,\) where \( v_1, v_2 , v_3 \) are called the coordinates of the vector v. Coordinates are always specified relative to an ordered basis. Then the vector v is uniquely defined by an ordered triple of real numbers

\[ {\bf v} = \left[ x, y , z \right] . \]

In general, a vector in n-dimensional space is identified by an n-tuple (an ordered array of numbers or arbitrary entries), and in infinite-dimensional space by a sequence of numbers. The number of components in the vector is called its dimension. Coordinate vectors of finite-dimensional vector spaces can be represented by either a column vector (which is usually the case) or a row vector. Generally speaking, Mathematica does not distinguish column vectors from row vectors, but it does when the user specifies it. One can define vectors using Mathematica commands: List, Array, curly brackets.

In mathematics and applications, it is customary to distinguish column vectors

\[ {\bf v} = \left( \begin{array}{c} v_1 \\ v_2 \\ \vdots \\ v_m \end{array} \right) \qquad \mbox{also written as } \qquad {\bf v} = \left[ \begin{array}{c} v_1 \\ v_2 \\ \vdots \\ v_m \end{array} \right] , \]

for which we use lowercase letters in boldface type, from row vectors (ordered n-tuple)

\[ \vec{v} = \left[ v_1 , v_2 , \ldots , v_n \right] . \]

Here the entries \( v_i \) are known as the components of the vector. Column vectors and row vectors can also be defined with matrix commands, as examples of an \( n\times 1 \) matrix and a \( 1\times n \) matrix, respectively.

Vectors in Mathematica are built, manipulated and interrogated similarly to matrices (see next subsection). However, as simple lists (“one-dimensional,” not “two-dimensional” such as matrices that look more tabular), they are simpler to construct and manipulate. A vector is delimited by a single pair of curly brackets, {...}, which allows us to distinguish it from a matrix with just one row, {{...}}, if we look carefully. The number of “slots” in a vector is not referred to in Mathematica as rows or columns, but rather as its length, which is returned by the command Length.
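As a small illustration of the construction commands mentioned above, the following sketch builds three vectors and inspects their shape. The names v1, v2, v3 are chosen here only for the example.

```mathematica
(* Three ways to build a vector; v1, v2, v3 are illustrative names *)
v1 = List[1, 2, 3]            (* same as {1, 2, 3} *)
v2 = Array[#^2 &, 4]          (* {1, 4, 9, 16} *)
v3 = Table[1/k, {k, 1, 3}]    (* {1, 1/2, 1/3} *)

Length[v2]                    (* 4: the number of components *)
Dimensions[v2]                (* {4}: a flat list, neither row nor column *)
Dimensions[{v2}]              (* {1, 4}: a matrix with a single row *)
```

Comparing the last two outputs shows how a plain vector differs from a one-row matrix.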

In Mathematica, defining vectors and matrices is done by typing every row in curly brackets:

v ={1,2^6 ,Sin[x]}

Out[1]= {1, 64, Sin[x]}

So v is a vector with three components, v[[1]] = 1, v[[2]] = 64 (Mathematica evaluates 2^6 immediately), and v[[3]] = Sin[x]. Its entries can be numbers or functions or even vectors and other entities. We usually denote vectors with lowercase letters and matrices with uppercase letters. Say we define a \( 2\times 3 \) matrix (with two rows and three columns) as

A ={{1,2,3},{-1,3,0}}

However, to see the traditional form of the matrix on the screen, one needs to add a corresponding command, either TraditionalForm or MatrixForm. These special commands tell Mathematica that the output should be displayed with the elements of the list arranged in a regular array. The parentheses around the assignment keep A itself a plain list, so that it can still be used in later computations:

(A ={{1,2,3},{-1,3,0}}) // MatrixForm

Out[1]= \( \begin{pmatrix} 1&2&3 \\ -1&3&0 \end{pmatrix} \)

A column vector can be constructed with curly brackets { }, with commas separating the entries (one entry per row). The output, however, may not look like a column vector. To fix this, apply MatrixForm to the variable that holds the vector.

u={1,2,3,4}
MatrixForm[u]

Out[5]//MatrixForm= \( \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} \)

To construct a row vector, the operation is very similar to constructing a column vector, except that two sets of curly brackets are used. Again, the output does not look like a row vector, and so MatrixForm must be called to put the row vector in the format that you are more familiar with:

n={{1,2,3,4}}
MatrixForm[n]

Out[7]//MatrixForm= \( \begin{pmatrix} 1 & 2 & 3 & 4 \end{pmatrix} \)

Mathematica usually does not distinguish column-vectors from row-vectors. For instance, let us define a matrix and two vectors:

A ={{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}
a = {1, 0, 2}
a // MatrixForm

Out[4]= \( \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} \)

b = {{1, 0, 2}}
b // MatrixForm

Out[6]= \( \begin{pmatrix} 1 & 0 & 2 \end{pmatrix} \)

So we see that a is displayed as a column vector, although Mathematica stores it as a plain list of length 3, while b is a row vector, which is a matrix of dimensions \( 1 \times 3 .\) When we multiply the matrix A by a from the left or from the right, Mathematica treats this list either as a \( 3 \times 1 \) column or as a \( 1 \times 3 \) row, whichever makes the product meaningful:

A.a

Out[7]= {7, 16, 25}

a.A

Out[8]= {15, 18, 21}

However, b is a genuine \( 1 \times 3 \) matrix, so only the product b.A is defined; Mathematica will not accept multiplication of A by b from the right:

b.A

Out[9]= {{15, 18, 21}}

A.b

Dot::dotsh: Tensors {{1,2,3},{4,5,6},{7,8,9}} and {{1,0,2}} have incompatible shapes.
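One possible way around this error, sketched below, is either to transpose the row vector into a \( 3 \times 1 \) matrix or to flatten it into a plain list before multiplying. The matrix A and the vector b are the ones defined above.

```mathematica
A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
b = {{1, 0, 2}};

Dimensions[b]        (* {1, 3}: a matrix with one row *)
A . Transpose[b]     (* {{7}, {16}, {25}}: a 3 x 1 matrix *)
A . Flatten[b]       (* {7, 16, 25}: a plain list, the same as A.a *)
```

Which form is preferable depends on whether later commands expect a matrix or a flat list.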

Let S be a set of vectors \( {\bf v}_1 , \ {\bf v}_2 , \ \ldots , \ {\bf v}_k .\) A vector v is said to be a linear combination of the vectors from S if and only if there are scalars \( c_1 , \ c_2 , \ \ldots , \ c_k \) such that \( {\bf v} = c_1 {\bf v}_1 + c_2 {\bf v}_2 + \cdots + c_k {\bf v}_k .\) That is, a linear combination of vectors from S is a sum of scalar multiples of those vectors. Let S be a nonempty subset of a vector space V. Then the span of S in V is the set of all possible (finite) linear combinations of the vectors in S (including the zero vector). It is usually denoted by span(S). Equivalently, the span of a set of vectors in a vector space is the intersection of all subspaces containing that set.

Example: The vector [-2, 8, 5, 0] is a linear combination of the vectors [3, 1, -2, 2], [1, 0, 3, -1], and [4, -2, 1, 0], because

\[ 2\,[3,\, 1,\, -2,\,2] + 4\,[1,\,0,\,3,\,-1] -3\,[4,\,-2,\, 1,\, 0] = [-2,\,8,\, 5,\, 0] . \qquad ■ \]
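The example above can be checked in Mathematica, and the coefficients can also be recovered when they are not known in advance. The following is a minimal sketch; the names v1, v2, v3, w and the unknowns c1, c2, c3 are chosen here for illustration and must not have values assigned beforehand.

```mathematica
v1 = {3, 1, -2, 2}; v2 = {1, 0, 3, -1}; v3 = {4, -2, 1, 0};
w = {-2, 8, 5, 0};

(* Direct check of the stated combination *)
2 v1 + 4 v2 - 3 v3 == w        (* True *)

(* Recover the coefficients from the vector equation *)
Solve[c1 v1 + c2 v2 + c3 v3 == w, {c1, c2, c3}]
(* {{c1 -> 2, c2 -> 4, c3 -> -3}} *)
```

If the vector w were not in the span of the three vectors, Solve would return an empty list.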

Let S be a subset of a vector space V.

(1) S is a linearly independent subset of V if and only if no vector in S can be expressed as a linear combination of the other vectors in S.
(2) S is a linearly dependent subset of V if and only if some vector v in S can be expressed as a linear combination of the other vectors in S.

Theorem: A nonempty set \( S = \{ {\bf v}_1 , \ {\bf v}_2 , \ \ldots , \ {\bf v}_r \} \) in a vector space V is linearly independent if and only if the only coefficients satisfying the vector equation

\[ k_1 {\bf v}_1 + k_2 {\bf v}_2 + \cdots + k_r {\bf v}_r = {\bf 0} \]

are \( k_1 =0, \ k_2 =0, \ \ldots , \ k_r =0 . \)

Theorem: A nonempty set \( S = \{ {\bf v}_1 , \ {\bf v}_2 , \ \ldots , \ {\bf v}_r \} \) in a vector space V is linearly independent if and only if the matrix of the column-vectors from S has rank r. ■
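The rank criterion is easy to try with the built-in command MatrixRank. The sketch below uses two illustrative lists of vectors, named vecs and rows here: the three vectors from the previous example, and the rows of the \( 3 \times 3 \) matrix used earlier.

```mathematica
(* Three vectors in R^4; Transpose places them as columns *)
vecs = {{3, 1, -2, 2}, {1, 0, 3, -1}, {4, -2, 1, 0}};
MatrixRank[Transpose[vecs]]     (* 3: linearly independent *)

(* Three vectors in R^3 that turn out to be dependent *)
rows = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
MatrixRank[Transpose[rows]]     (* 2 < 3: linearly dependent *)

(* A nontrivial dependence relation among them *)
rows[[1]] - 2 rows[[2]] + rows[[3]]    (* {0, 0, 0} *)
```

Because the rank of a matrix equals the rank of its transpose, the same numbers are obtained when the vectors are left as rows.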

If \( S = \{ {\bf v}_1 , \ {\bf v}_2 , \ \ldots , \ {\bf v}_n \} \) is a set of vectors in a finite-dimensional vector space V, then S is called a basis for V if:

S spans V;
S is linearly independent. ■

Mathematica has three multiplication commands for vectors: the dot and outer products (for arbitrary vectors), given by Dot and Outer, and the cross product (for three-dimensional vectors), given by Cross.

The dot product of two vectors of the same size \( {\bf x} = \left[ x_1 , x_2 , \ldots , x_n \right] \) and \( {\bf y} = \left[ y_1 , y_2 , \ldots , y_n \right] \) (regardless of whether they are columns or rows) is the number, denoted either by \( {\bf x} \cdot {\bf y} \) or \( \left\langle {\bf x} , {\bf y} \right\rangle ,\)

\[ \left\langle {\bf x} , {\bf y} \right\rangle = {\bf x} \cdot {\bf y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n , \]

when entries are real, or

\[ \left\langle {\bf x} , {\bf y} \right\rangle = {\bf x} \cdot {\bf y} = \overline{x_1} y_1 + \overline{x_2} y_2 + \cdots + \overline{x_n} y_n , \]

when entries are complex. The dot product was first introduced by the American physicist and mathematician Josiah Willard Gibbs (1839--1903) in the 1880s. The outer product is the tensor product of two coordinate vectors \( {\bf u} = \left[ u_1 , u_2 , \ldots , u_m \right] \) and \( {\bf v} = \left[ v_1 , v_2 , \ldots , v_n \right] ; \) denoted \( {\bf u} \otimes {\bf v} , \) it is the m-by-n matrix W whose entries satisfy \( w_{i,j} = u_i v_j . \) For real vectors, the outer product \( {\bf u} \otimes {\bf v} \) is equivalent to the matrix multiplication \( {\bf u} \, {\bf v}^{\mathrm T} , \) provided that u is represented as a column \( m \times 1 \) vector, and v as a column \( n \times 1 \) vector. For complex vectors, the product \( {\bf u} \, {\bf v}^{\ast} \) is often used instead, in which the entries of v are conjugated. Here \( {\bf v}^{\ast} = \overline{{\bf v}^{\mathrm T}} . \)
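The two dot product formulas above can be compared in Mathematica. Note that the built-in Dot does not conjugate anything, so the complex version has to be written with Conjugate explicitly. The vectors below are made-up sample data.

```mathematica
(* Real entries *)
x = {1, 2, 3}; y = {4, -5, 6};
x . y                    (* 4 - 10 + 18 = 12 *)

(* Complex entries: Dot alone gives the sum of products without conjugation *)
z = {1 + I, 2}; w = {3, I};
z . w                    (* 3 + 5 I *)
Conjugate[z] . w         (* 3 - I, conjugating the first vector as in the formula above *)
```

With the conjugate included, the product of a vector with itself is always real and nonnegative, which is what makes it usable for defining a length.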

For three-dimensional vectors \( {\bf a} = a_1 \,{\bf i} + a_2 \,{\bf j} + a_3 \,{\bf k} = \left[ a_1 , a_2 , a_3 \right] \) and \( {\bf b} = b_1 \,{\bf i} + b_2 \,{\bf j} + b_3 \,{\bf k} = \left[ b_1 , b_2 , b_3 \right] , \) it is possible to define a special multiplication, called the cross product:

\[ {\bf a} \times {\bf b} = \det \left[ \begin{array}{ccc} {\bf i} & {\bf j} & {\bf k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{array} \right] = {\bf i} \left( a_2 b_3 - b_2 a_3 \right) - {\bf j} \left( a_1 b_3 - b_1 a_3 \right) + {\bf k} \left( a_1 b_2 - a_2 b_1 \right) . \]
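The determinant formula can be compared with the built-in command Cross. In the sketch below, the vectors a and b are sample data chosen for illustration; the result is perpendicular to both factors.

```mathematica
a = {1, 2, 3}; b = {4, 5, 6};

c = Cross[a, b]
(* {2*6 - 3*5, 3*4 - 1*6, 1*5 - 2*4} = {-3, 6, -3} *)

{c . a, c . b}           (* {0, 0}: c is orthogonal to a and to b *)
Cross[b, a] == -c        (* True: the cross product is anticommutative *)
```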

Example: Taking, for instance, m = 4 and n = 3, we have

\[ {\bf u} \otimes {\bf v} = {\bf u} \, {\bf v}^{\mathrm T} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix} = \begin{bmatrix} u_1 v_1 & u_1 v_2 & u_1 v_3 \\ u_2 v_1 & u_2 v_2 & u_2 v_3 \\ u_3 v_1 & u_3 v_2 & u_3 v_3 \\ u_4 v_1 & u_4 v_2 & u_4 v_3 \end{bmatrix} . \]
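A numerical instance of this 4-by-3 outer product can be produced with Outer. The vectors u and v below are sample data, and the name W is used only for this illustration.

```mathematica
u = {1, 2, 3, 4}; v = {1, 0, -1};

W = Outer[Times, u, v]
(* {{1, 0, -1}, {2, 0, -2}, {3, 0, -3}, {4, 0, -4}} *)

Dimensions[W]                      (* {4, 3} *)

(* The same matrix as a product of a 4 x 1 column and a 1 x 3 row *)
W == Transpose[{u}] . {v}          (* True *)
```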

An inner product of two vectors of the same size, usually denoted by \( \left\langle {\bf x} , {\bf y} \right\rangle ,\) is a generalization of the dot product if it satisfies the following properties:

\( \left\langle {\bf v}+{\bf u} , {\bf w} \right\rangle = \left\langle {\bf v} , {\bf w} \right\rangle + \left\langle {\bf u} , {\bf w} \right\rangle . \)
\( \left\langle {\bf v} , \alpha {\bf u} \right\rangle = \alpha \left\langle {\bf v} , {\bf u} \right\rangle \) for any scalar α.
\( \left\langle {\bf v} , {\bf u} \right\rangle = \overline{\left\langle {\bf u} , {\bf v} \right\rangle} , \) where overline means complex conjugate.
\( \left\langle {\bf v} , {\bf v} \right\rangle \ge 0 , \) with equality if and only if \( {\bf v} = {\bf 0} . \)

The fourth condition in the list above is known as the positive-definite condition. A vector space together with an inner product is called an inner product space. Every inner product space is a metric space: the distance between two vectors u and v is the norm of their difference, \( \| {\bf u} - {\bf v} \| . \) The norm is given by

\[ \| {\bf u} \| = \sqrt{\left\langle {\bf u} , {\bf u} \right\rangle} . \]

The nonzero vectors u and v of the same size are orthogonal (or perpendicular) when their inner product is zero: \( \left\langle {\bf u} , {\bf v} \right\rangle = 0 . \) We abbreviate it as \( {\bf u} \perp {\bf v} . \) If A is an \( n \times n \) positive definite matrix and u and v are n-vectors, then we can define the weighted Euclidean inner product

\[ \left\langle {\bf u} , {\bf v} \right\rangle = {\bf A} {\bf u} \cdot {\bf v} = {\bf u} \cdot {\bf A}^{\ast} {\bf v} \qquad\mbox{and} \qquad {\bf u} \cdot {\bf A} {\bf v} = {\bf A}^{\ast} {\bf u} \cdot {\bf v} . \]

In particular, if w₁, w₂, ... , w_n are positive real numbers, which are called weights, and if u = ( u₁, u₂, ... , u_n) and v = ( v₁, v₂, ... , v_n) are vectors in \( \mathbb{R}^n , \) then the formula

\[ \left\langle {\bf u} , {\bf v} \right\rangle = w_1 u_1 v_1 + w_2 u_2 v_2 + \cdots + w_n u_n v_n \]

defines an inner product on \( \mathbb{R}^n , \) that is called the weighted Euclidean inner product with weights w₁, w₂, ... , w_n.

Example: The Euclidean inner product and the weighted Euclidean inner product (when \( \left\langle {\bf u} , {\bf v} \right\rangle = \sum_{k=1}^n a_k u_k v_k \) for some positive numbers \( a_k , \ k=1,2,\ldots , n \)) are special cases of a general class of inner products on \( \mathbb{R}^n \) called matrix inner products. Let A be an invertible n-by-n matrix. Then the formula

\[ \left\langle {\bf u} , {\bf v} \right\rangle = {\bf A} {\bf u} \cdot {\bf A} {\bf v} = {\bf v}^{\mathrm T} {\bf A}^{\mathrm T} {\bf A} {\bf u} \]

defines an inner product generated by A.
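The following sketch evaluates a matrix inner product for real vectors and shows how the weighted Euclidean inner product arises from a diagonal matrix. The matrix M, the weights wts, and the vectors u and v are hypothetical sample data, not taken from the text.

```mathematica
u = {1, 2}; v = {3, 1};
M = {{2, 1}, {0, 1}};                 (* an invertible matrix, Det[M] = 2 *)

(M . u) . (M . v)                     (* {4, 2}.{7, 1} = 30 *)
v . Transpose[M] . M . u              (* 30, the same number *)

(* Weighted Euclidean inner product with weights 2 and 3 *)
wts = {2, 3};
wts . (u v)                           (* 2*1*3 + 3*2*1 = 12 *)

(* The same value from the diagonal matrix of square roots of the weights *)
sq = DiagonalMatrix[Sqrt[wts]];
Simplify[(sq . u) . (sq . v)]         (* 12 *)
```

Here u v is the elementwise product of the two lists, so wts . (u v) is exactly the weighted sum in the formula above.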

Example: In the set of integrable functions on an interval [a,b], we can define the inner product of two functions f and g as

\[ \left\langle f , g \right\rangle = \int_a^b \overline{f} (x)\, g(x) \, {\text d}x \qquad\mbox{or} \qquad \left\langle f , g \right\rangle = \int_a^b f(x)\,\overline{g} (x) \, {\text d}x . \]

Then the norm \( \| f \| \) (also called the 2-norm) becomes the square root of

\[ \| f \|^2 = \left\langle f , f \right\rangle = \int_a^b \left\vert f(x) \right\vert^2 \, {\text d}x . \]

In particular, the 2-norm of the function \( f(x) = 5x^2 +2x -1 \) on the interval [0,1] is

\[ \| 5 x^2 +2x -1 \| = \sqrt{\int_0^1 \left( 5x^2 +2x -1 \right)^2 {\text d}x } = \sqrt{7} . \]
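This value can be confirmed with Integrate. In the sketch below, the names f and normSq are chosen for illustration, and x is assumed to have no value assigned.

```mathematica
f[x_] := 5 x^2 + 2 x - 1

normSq = Integrate[f[x]^2, {x, 0, 1}]    (* 7 *)
Sqrt[normSq]                             (* Sqrt[7] *)
N[Sqrt[normSq]]                          (* 2.64575 *)
```

Expanding the square gives \( 25x^4 + 20x^3 - 6x^2 - 4x + 1 , \) whose integral over [0,1] is 5 + 5 - 2 - 2 + 1 = 7.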

Example: Consider the set of polynomials of degree at most n. If

\[ {\bf p} = p(x) = p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n \quad\mbox{and} \quad {\bf q} = q(x) = q_0 + q_1 x + q_2 x^2 + \cdots + q_n x^n \]

are two polynomials, and if \( x_0 , x_1 , \ldots , x_n \) are distinct real numbers (called sample points), then the formula

\[ \left\langle {\bf p} , {\bf q} \right\rangle = p(x_0 ) q(x_0 ) + p(x_1 )q(x_1 ) + \cdots + p(x_n ) q(x_n ) \]

defines an inner product, which is called the evaluation inner product at \( x_0 , x_1 , \ldots , x_n . \) ■
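As a small illustration of the evaluation inner product, take n = 2 with the three sample points -1, 0, 1. The polynomials p and q and the list pts below are made-up sample data.

```mathematica
p[x_] := 1 + x
q[x_] := x^2
pts = {-1, 0, 1};                (* three distinct sample points *)

p /@ pts                         (* {0, 1, 2} *)
q /@ pts                         (* {1, 0, 1} *)
(p /@ pts) . (q /@ pts)          (* 0*1 + 1*0 + 2*1 = 2 *)
```

The inner product is thus the ordinary dot product of the two vectors of sampled values.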

With the dot product, we can assign a length to a vector, which is also called the Euclidean norm or 2-norm:

\[ \| {\bf x} \| = \sqrt{ {\bf x}\cdot {\bf x}} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} . \]

In linear algebra, functional analysis, and related areas of mathematics, a norm is a function that assigns a strictly positive length or size to each vector in a vector space—save for the zero vector, which is assigned a length of zero. On an n-dimensional complex space \( \mathbb{C}^n ,\) the most common norm is

\[ \| {\bf z} \| = \sqrt{ {\bf z}\cdot {\bf z}} = \sqrt{\overline{z_1} \,z_1 + \overline{z_2}\,z_2 + \cdots + \overline{z_n}\,z_n} = \sqrt{|z_1|^2 + |z_2 |^2 + \cdots + |z_n |^2} . \]

A unit vector u is a vector whose length equals one: \( {\bf u} \cdot {\bf u} =1 . \) We say that two vectors x and y are perpendicular if their dot product is zero. Many other norms are known, among which we mention the taxicab norm or Manhattan norm, which is also called the 1-norm:

\[ \| {\bf x} \|_1 = \sum_{k=1}^n | x_k | = |x_1 | + |x_2 | + \cdots + |x_n | . \]

For the Euclidean norm (and, more generally, for any norm induced by an inner product), the Cauchy--Bunyakovsky--Schwarz (or simply CBS) inequality holds:
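Both norms are available through the built-in command Norm, whose optional second argument selects the p-norm. The vector x below is sample data.

```mathematica
x = {3, -4};

Norm[x]                   (* 5, the Euclidean norm *)
Sqrt[x . x]               (* 5, the same value from the dot product *)
Norm[x, 1]                (* 7, the taxicab norm |3| + |-4| *)
Normalize[x]              (* {3/5, -4/5}, a unit vector in the direction of x *)

Norm[{1 + I, 1 - I}]      (* 2, since |1 + I|^2 + |1 - I|^2 = 4 *)
```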

\[ | {\bf x} \cdot {\bf y} | \le \| {\bf x} \| \, \| {\bf y} \| . \]


[Portraits: Augustin-Louis Cauchy, Viktor Yakovlevich Bunyakovsky, Hermann Amandus Schwarz]

The inequality for sums was published by Augustin-Louis Cauchy (1789--1857) in 1821, while the corresponding inequality for integrals was first proved by Viktor Yakovlevich Bunyakovsky (1804--1889) in 1859. The modern proof of the integral inequality (which is actually a repetition of Bunyakovsky's) was given by Hermann Amandus Schwarz (1843--1921) in 1888. With the Euclidean norm, we can define the dot product as

\[ {\bf x} \cdot {\bf y} = \| {\bf x} \| \, \| {\bf y} \| \, \cos \theta , \]

where \( \theta \) is the angle between the two vectors. ■
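The CBS inequality and the angle formula can be tried on a pair of real vectors. The vectors x and y below are sample data; VectorAngle is the built-in command that returns the angle directly.

```mathematica
x = {1, 0}; y = {1, 1};

(* CBS inequality for the Euclidean norm *)
Abs[x . y] <= Norm[x] Norm[y]             (* True, since 1 <= Sqrt[2] *)

(* Angle recovered from the dot product formula *)
ArcCos[(x . y)/(Norm[x] Norm[y])]         (* Pi/4 *)
VectorAngle[x, y]                         (* Pi/4 *)
```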


MATHEMATICA TUTORIAL for the Second Course. Part I: Vectors.

Vladimir Dobrushkin
