Vector Spaces

Portraits: Giusto Bellavitis, Michail Ostrogradsky, and William Hamilton.

In applications, people deal with a variety of quantities that describe the physical world. Examples of such quantities include distance, displacement, speed, velocity, acceleration, force, mass, momentum, energy, work, power, etc. All these quantities can be divided into two categories: vectors and scalars. A vector quantity is fully described by both a magnitude and a direction, whereas a scalar quantity is fully described by its magnitude alone. In mathematics, physics, and engineering, a Euclidean vector (or simply a vector) is a geometric object that has magnitude (or length) and direction. Many familiar physical notions, such as forces, velocities, and accelerations, involve both a magnitude (the amount of the force, velocity, or acceleration) and a direction. In most physical situations involving vectors, only the magnitude and direction of the vector are significant; consequently, we regard vectors with the same length and direction as equal, irrespective of their positions.

It is customary to identify vectors with arrows (geometric objects). The tail of the arrow is called the initial point of the vector and the tip the terminal point. To emphasize this approach, an arrow is placed above the initial and terminal points; for example, the notation \( {\bf v} = \vec{AB} \) tells us that A is the initial point of the vector v and B is its terminal point. In this tutorial (as in most scientific papers and textbooks), we denote vectors by boldface lowercase letters such as v, u, or x.

Graphics[{Arrowheads[0.1], {Thick, Blue, Arrow[{{0, 0}, {3, 1}}]}}]  (* a vector drawn as a blue arrow from (0,0) to (3,1) *)

Any two vectors x and y can be added in a "tail-to-head" manner; that is, either x or y may be applied at any point, and the other vector is then applied at the endpoint of the first. The endpoint of the second vector is the endpoint of their sum, which is denoted by x + y. Besides vector addition there is another natural operation that can be performed on vectors: multiplication by a scalar, where scalars are often taken to be real numbers. When a vector is multiplied by a real number k, its magnitude is multiplied by |k|, and its direction remains the same when k is positive and is reversed when k is negative. Such a vector is denoted by kx.
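These two operations are easy to experiment with in Mathematica; the following minimal sketch (with arbitrarily chosen components) adds two plane vectors componentwise and scales one of them:
x = {1, 2}; y = {3, -1};
x + y    (* tail-to-head sum *)
-2 x     (* scaling by -2: length doubles, direction reverses *)
Out[1]= {4, 1}
Out[2]= {-2, -4}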

The concept of a vector, as we know it today, evolved gradually over a period of more than 200 years. The Italian mathematician, senator, and municipal councilor Giusto Bellavitis (1803--1880) abstracted the basic idea in 1835. The idea of an n-dimensional Euclidean space for n > 3 appeared in a work on the divergence theorem by the Russian mathematician Michail Ostrogradsky (1801--1862) in 1836, in the geometrical tracts of Hermann Grassmann (1809--1877) in the early 1840s, and in a brief paper of Arthur Cayley (1821--1895) in 1846. Unfortunately, the first two authors were virtually ignored in their lifetimes. In particular, the work of Grassmann was quite philosophical and extremely difficult to read. The term vector was introduced by the Irish mathematician, astronomer, and mathematical physicist William Rowan Hamilton (1805--1865) as part of a quaternion.

Vectors can also be described algebraically. Historically, the first vectors were Euclidean vectors, which can be expanded through standard basis vectors that serve as coordinates. Any such vector can then be uniquely represented by a sequence of scalars called coordinates or components. The set of such ordered n-tuples is denoted by \( \mathbb{R}^n . \) When the scalars are complex numbers, the set of ordered n-tuples of complex numbers is denoted by \( \mathbb{C}^n . \) Motivated by these two approaches, we present the general definition of a vector space.

A vector space V over the set of either real or complex numbers (called scalars) is a set of elements, called vectors, together with two operations that satisfy the eight axioms listed below.
1. The first operation is an inner operation that assigns to any two vectors x and y a third vector, commonly written as x + y and called the sum of these two vectors.
2. The second operation is an outer operation that assigns to any scalar k and vector x another vector, denoted by kx.

  1. Associativity of addition: \( ({\bf v} + {\bf u}) + {\bf w} = {\bf v} + ({\bf u} + {\bf w}) \) for any \( {\bf v} , {\bf u} , {\bf w} \in V . \)
  2. Commutativity of addition: \( {\bf v} + {\bf u} = {\bf u} + {\bf v} \) for any \( {\bf v} , {\bf u} \in V . \)
  3. Identity element of addition: there exists an element \( {\bf 0} \in V , \) called the zero vector, such that \( {\bf v} + {\bf 0} = {\bf v} \) for every vector from V.
  4. Inverse elements of addition: for every vector v, there exists an element \( -{\bf v} \in V , \) called the additive inverse of v, such that \( {\bf v} + (-{\bf v}) = {\bf 0} . \)
  5. Compatibility of scalar multiplication with field multiplication: \( a(b{\bf v}) = (ab){\bf v} \) for any scalars a and b and arbitrary vector v.
  6. Identity element of scalar multiplication: \( 1{\bf v} = {\bf v} , \) where 1 denotes the multiplicative identity.
  7. Distributivity of scalar multiplication with respect to vector addition: \( k\left( {\bf v} + {\bf u}\right) = k{\bf v} + k{\bf u} \) for any scalar k and arbitrary vectors v and u.
  8. Distributivity of scalar multiplication with respect to field addition: \( \left( a+b \right) {\bf v} = a\,{\bf v} + b\,{\bf v} \) for any two scalars a and b and arbitrary vector v. ■
Portrait: Hermann Grassmann.

Historically, the first ideas leading to vector spaces can be traced back as far as the 17th century; however, the idea crystallized with the work of the Prussian/German mathematician Hermann Günther Grassmann (1809--1877), who published a paper in 1862. He was also a linguist, physicist, neohumanist, general scholar, and publisher. His mathematical work was little noted until he was in his sixties. It is interesting that, while a student at the University of Berlin, Hermann studied theology, also taking classes in classical languages, philosophy, and literature. He does not appear to have taken courses in mathematics or physics. Although he lacked university training in mathematics, it was the field that most interested him when he returned to Stettin (Province of Pomerania, Kingdom of Prussia; present-day Szczecin, Poland) in 1830 after completing his studies in Berlin.


Example: The set \( \mathbb{R}^n \mbox{ or } \mathbb{C}^n \) of all ordered n-tuples of real or complex numbers is our first familiar example of a vector space. This space has a standard basis: \( {\bf e}_1 = (1,0,0,\ldots ,0 ) ,\quad {\bf e}_2 = (0,1,0,\ldots , 0 ), \ldots , {\bf e}_n = (0,0,\ldots , 0,1) .\) In ℝ³ these unit vectors are denoted by
\[ {\bf i} = \left( 1, 0 ,0 \right) , \qquad {\bf j} = \left( 0,1,0 \right) , \qquad {\bf k} = (0,0,1) . \]
Therefore, every vector \( {\bf v} = \left( v_1 , v_2 , v_3 \right) \) can be expressed as a linear combination of standard unit vectors:
\[ {\bf v} = \left( v_1 , v_2 , v_3 \right) = v_1 {\bf i} + v_2 {\bf j} + v_3 {\bf k} . \]
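This expansion is easy to check in Mathematica; in the following sketch the components a1, a2, a3 are left as arbitrary symbols:
a1 {1, 0, 0} + a2 {0, 1, 0} + a3 {0, 0, 1}    (* a1 i + a2 j + a3 k *)
Out[1]= {a1, a2, a3}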

The above representation v = (v1, v2, ... ,vn) of vectors in ℝn is called the comma-delimited form. However, since a vector in ℝn is just a list of its n components in a specific order, any notation that displays those components in the correct order is a valid way of representing the vector. For example, the vector can be written as
\[ {\bf v} = \left[ v_1 \ v_2 \ \cdots \ v_n \right] \qquad\mbox{or} \qquad {\bf v} = \left\langle v_1 \ v_2 \ \cdots \ v_n \right\rangle . \]
It is customary to represent vectors with respect to some basis as columns:
\[ {\bf v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \qquad\mbox{or} \qquad {\bf v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} . \]

Mathematica does not automatically distinguish columns from rows, so the user should specify which object is in use. A string is an ordered collection of characters or symbols enclosed in quotation marks, such as S = "Mary had a little lamb".

In Mathematica, there are no sets as such, since Mathematica imposes an ordering on its data; instead it deals with lists, which can be treated as sets if the order of their elements is ignored. The list of elements a, b, and c is enclosed in curly braces as L = { a, b, c }, which Mathematica treats as a row vector. The empty list is { }, and L[[k]] gives the k-th element of the list L.

Let us define two vectors, one as a row vector, denoted v1, and the other as a column vector, denoted v2:

v1 = { 1, 2, 3}
v2 = {{1}, {2}, {3}}
We can also check that they are a row vector and a column vector, respectively, with the commands:
v1 // TraditionalForm
v2 // MatrixForm
So v1 is treated as a 1×3 row, while v2 is a column vector, that is, a 3×1 matrix. Therefore, these two vectors have different dimensions and cannot be added. Nevertheless, Mathematica does not care, and if one enters
v1 + v2
the output will be the 3×1 column vector:
{ {2}, {4}, {6} }
Example: Set of continuous functions. Let |a,b| denote an open, closed, or half-open interval on the real axis. The set \( C(|a,b|) \) of all continuous functions on the interval |a,b| is a vector space.

Example: A polynomial of degree n is an expression of the form
\[ p_n (x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n , \]
where n is a nonnegative integer and each coefficient ai is a scalar. The zero polynomial is the polynomial whose coefficients are all zero. The polynomials \( p_n (x) = a_0 + a_1 x + \cdots + a_n x^n \) and \( q_m (x) = b_0 + b_1 x + \cdots + b_m x^m , \) where for simplicity \( n\ge m , \) can be added:
\[ p_n (x) + q_m (x) = \left( a_0 + b_0 \right) + \left( a_1 + b_1 \right) x + \cdots + \left( a_m + b_m \right) x^m + a_{m+1} x^{m+1} + \cdots + a_n x^n , \]
and multiplied by a constant k:
\[ k\, p_n (x) = k\,a_0 + k\,a_1 x + k\,a_2 x^2 + \cdots + k\,a_n x^n . \]
Under these operations of addition and scalar multiplication, the set of all polynomials of degree not exceeding n is a vector space.
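One convenient computational model (one possible encoding, not the only one) stores such a polynomial as the list of its n + 1 coefficients, so that addition and scalar multiplication reduce to the componentwise list operations used above:
p = {1, 2, 3};   (* p(x) = 1 + 2x + 3x^2 *)
q = {4, -1, 0};  (* q(x) = 4 - x, padded with a zero coefficient *)
p + q            (* coefficients of (p+q)(x) = 5 + x + 3x^2 *)
5 p              (* coefficients of 5 p(x) = 5 + 10x + 15x^2 *)
Out[1]= {5, 1, 3}
Out[2]= {5, 10, 15}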


The dot product of two vectors of the same size, \( {\bf x} = \left[ x_1 , x_2 , \ldots , x_n \right] \) and \( {\bf y} = \left[ y_1 , y_2 , \ldots , y_n \right] \) (regardless of whether they are columns or rows), is the number, denoted either by \( {\bf x} \cdot {\bf y} \) or \( \left\langle {\bf x} , {\bf y} \right\rangle ,\)
\[ \left\langle {\bf x} , {\bf y} \right\rangle = {\bf x} \cdot {\bf y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n , \]
when entries are real, or
\[ \left\langle {\bf x} , {\bf y} \right\rangle = {\bf x} \cdot {\bf y} = \overline{x_1} y_1 + \overline{x_2} y_2 + \cdots + \overline{x_n} y_n , \]
when entries are complex.
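In Mathematica, Dot (typed as a period) forms the sum of products but never conjugates, so for complex vectors the conjugation of the first factor must be written out explicitly (following the convention adopted above); a small sketch with arbitrarily chosen entries:
x = {1, I, 2}; y = {3, 1, -I};
x . y               (* plain sum of products, no conjugation *)
Conjugate[x] . y    (* the complex dot product <x, y> *)
Out[1]= 3 - I
Out[2]= 3 - 3 I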

Portrait: Josiah Willard Gibbs.
The dot product was first introduced by the American physicist and mathematician Josiah Willard Gibbs (1839--1903) in the 1880s. The outer product (or tensor product) of two coordinate vectors \( {\bf u} = \left[ u_1 , u_2 , \ldots , u_m \right] \) and \( {\bf v} = \left[ v_1 , v_2 , \ldots , v_n \right] , \) denoted \( {\bf u} \otimes {\bf v} , \) is the m-by-n matrix W whose entries satisfy \( w_{i,j} = u_i v_j . \) The outer product \( {\bf u} \otimes {\bf v} \) is equivalent to the matrix multiplication \( {\bf u} \, {\bf v}^{\ast} \) (or \( {\bf u} \, {\bf v}^{\mathrm T} \) if the vectors are real), provided that u is represented as an \( m \times 1 \) column vector and v as an \( n \times 1 \) column vector. Here \( {\bf v}^{\ast} = \overline{{\bf v}^{\mathrm T}} . \)

For three-dimensional vectors \( {\bf a} = a_1 \,{\bf i} + a_2 \,{\bf j} + a_3 \,{\bf k} = \left[ a_1 , a_2 , a_3 \right] \) and \( {\bf b} = b_1 \,{\bf i} + b_2 \,{\bf j} + b_3 \,{\bf k} = \left[ b_1 , b_2 , b_3 \right] , \) it is possible to define a special multiplication, called the cross product:
\[ {\bf a} \times {\bf b} = \det \left[ \begin{array}{ccc} {\bf i} & {\bf j} & {\bf k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{array} \right] = {\bf i} \left( a_2 b_3 - b_2 a_3 \right) - {\bf j} \left( a_1 b_3 - b_1 a_3 \right) + {\bf k} \left( a_1 b_2 - a_2 b_1 \right) . \]
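Mathematica's built-in Cross implements this determinant formula; a quick check with the standard basis vectors and with arbitrarily chosen vectors:
Cross[{1, 0, 0}, {0, 1, 0}]    (* i x j = k *)
Cross[{2, 1, -1}, {0, 3, 4}]
Out[1]= {0, 0, 1}
Out[2]= {7, -8, 6}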
Example: If m = 4 and n = 3, then
\[ {\bf u} \otimes {\bf v} = {\bf u} \, {\bf v}^{\mathrm T} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix} = \begin{bmatrix} u_1 v_1 & u_1 v_2 & u_1 v_3 \\ u_2 v_1 & u_2 v_2 & u_2 v_3 \\ u_3 v_1 & u_3 v_2 & u_3 v_3 \\ u_4 v_1 & u_4 v_2 & u_4 v_3 \end{bmatrix} . \]
In Mathematica, the outer product has a special command:
Outer[Times, {1, 2, 3, 4}, {a, b, c}]
Out[1]= {{a, b, c}, {2 a, 2 b, 2 c}, {3 a, 3 b, 3 c}, {4 a, 4 b, 4 c}}

An inner product of two vectors of the same size, usually denoted by \( \left\langle {\bf x} , {\bf y} \right\rangle ,\) is a generalization of the dot product; it must satisfy the following properties:

  • \( \left\langle {\bf v}+{\bf u} , {\bf w} \right\rangle = \left\langle {\bf v} , {\bf w} \right\rangle + \left\langle {\bf u} , {\bf w} \right\rangle . \)
  • \( \left\langle {\bf v} , \alpha {\bf u} \right\rangle = \alpha \left\langle {\bf v} , {\bf u} \right\rangle \) for any scalar α.
  • \( \left\langle {\bf v} , {\bf u} \right\rangle = \overline{\left\langle {\bf u} , {\bf v} \right\rangle} , \) where overline means complex conjugate.
  • \( \left\langle {\bf v} , {\bf v} \right\rangle \ge 0 , \) with equality if and only if \( {\bf v} = {\bf 0} . \)

The fourth condition in the list above is known as the positive-definite condition.

A vector space together with an inner product is called an inner product space. Every inner product space is a normed space, and hence a metric space. The norm is given by

\[ \| {\bf u} \| = \sqrt{\left\langle {\bf u} , {\bf u} \right\rangle} , \]
and the corresponding metric by \( d \left( {\bf u} , {\bf v} \right) = \| {\bf u} - {\bf v} \| . \)
Two nonzero vectors u and v of the same size are orthogonal (or perpendicular) when their inner product is zero: \( \left\langle {\bf u} , {\bf v} \right\rangle = 0 . \) We abbreviate this as \( {\bf u} \perp {\bf v} . \)
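For instance, the vectors (1, 2, −1) and (3, −1, 1) are orthogonal in ℝ³, since 1·3 + 2·(−1) + (−1)·1 = 0:
{1, 2, -1} . {3, -1, 1}
Out[1]= 0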

A generalized length function on a vector space can be imposed in many different ways, not necessarily through an inner product. What is important is that this generalized length, called a norm in mathematics, should satisfy the following four axioms.

A norm on a vector space V is a nonnegative function \( \| \, \cdot \, \| \, : \, V \to [0, \infty ) \) that satisfies the following axioms for any vectors \( {\bf u}, {\bf v} \in V \) and arbitrary scalar k.
  1. \( \| {\bf u} \| \) is real and nonnegative;
  2. \( \| {\bf u} \| =0 \) if and only if u = 0;
  3. \( \| k\,{\bf u} \| = |k| \, \| {\bf u} \| ;\)
  4. \( \| {\bf u} + {\bf v} \| \le \| {\bf u} \| + \| {\bf v} \| . \)
With any positive definite matrix (a self-adjoint matrix having positive eigenvalues) one can define a corresponding norm. If A is an \( n \times n \) positive definite matrix and u and v are n-vectors, then we can define the weighted Euclidean inner product
\[ \left\langle {\bf u} , {\bf v} \right\rangle = {\bf A} {\bf u} \cdot {\bf v} = {\bf u} \cdot {\bf A}^{\ast} {\bf v} \qquad\mbox{and} \qquad {\bf u} \cdot {\bf A} {\bf v} = {\bf A}^{\ast} {\bf u} \cdot {\bf v} . \]
In particular, if w1, w2, ... , wn are positive real numbers, which are called weights, and if u = ( u1, u2, ... , un) and v = ( v1, v2, ... , vn) are vectors in \( \mathbb{R}^n , \) then the formula
\[ \left\langle {\bf u} , {\bf v} \right\rangle = w_1 u_1 v_1 + w_2 u_2 v_2 + \cdots + w_n u_n v_n \]
defines an inner product on \( \mathbb{R}^n , \) that is called the weighted Euclidean inner product with weights w1, w2, ... , wn.
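A weighted Euclidean inner product is straightforward to evaluate directly; in the sketch below the weights 1, 2, 3 and the vectors are illustrative choices:
w = {1, 2, 3}; u = {1, 1, 2}; v = {2, -1, 1};
Total[w u v]    (* w1 u1 v1 + w2 u2 v2 + w3 u3 v3 *)
Out[1]= 6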
Example: The Euclidean inner product and the weighted Euclidean inner product (when \( \left\langle {\bf u} , {\bf v} \right\rangle = \sum_{k=1}^n a_k u_k v_k \) for some positive numbers \( a_k , \ k=1,2,\ldots , n \) ) are special cases of a general class of inner products on \( \mathbb{R}^n \) called matrix inner products. Let A be an invertible n-by-n matrix. Then the formula
\[ \left\langle {\bf u} , {\bf v} \right\rangle = {\bf A} {\bf u} \cdot {\bf A} {\bf v} = {\bf v}^{\mathrm T} {\bf A}^{\mathrm T} {\bf A} {\bf u} \]
defines an inner product generated by A.
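Both forms of this inner product can be compared numerically; the matrix A below is an arbitrary invertible choice:
A = {{1, 1}, {0, 2}};   (* Det[A] = 2, so A is invertible *)
u = {1, 2}; v = {3, -1};
(A . u) . (A . v)             (* Au . Av *)
v . (Transpose[A] . A . u)    (* v^T A^T A u gives the same number *)
Out[1]= -2
Out[2]= -2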
Example: In the set of integrable functions on an interval [a,b], we can define the inner product of two functions f and g as
\[ \left\langle f , g \right\rangle = \int_a^b \overline{f} (x)\, g(x) \, {\text d}x \qquad\mbox{or} \qquad \left\langle f , g \right\rangle = \int_a^b f(x)\,\overline{g} (x) \, {\text d}x . \]
Then the norm \( \| f \| \) (also called the 2-norm) becomes the square root of
\[ \| f \|^2 = \left\langle f , f \right\rangle = \int_a^b \left\vert f(x) \right\vert^2 \, {\text d}x . \]
In particular, the 2-norm of the function \( f(x) = 5x^2 +2x -1 \) on the interval [0,1] is
\[ \| 5 x^2 +2x -1 \| = \sqrt{\int_0^1 \left( 5x^2 +2x -1 \right)^2 {\text d}x } = \sqrt{7} . \]
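The value \( \sqrt{7} \) is quickly confirmed in Mathematica:
Sqrt[Integrate[(5 x^2 + 2 x - 1)^2, {x, 0, 1}]]
Out[1]= Sqrt[7]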

Example: Consider the set of polynomials of degree at most n. If
\[ {\bf p} = p(x) = p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n \quad\mbox{and} \quad {\bf q} = q(x) = q_0 + q_1 x + q_2 x^2 + \cdots + q_n x^n \]
are two polynomials, and if \( x_0 , x_1 , \ldots , x_n \) are distinct real numbers (called sample points), then the formula
\[ \left\langle {\bf p} , {\bf q} \right\rangle = p(x_0 ) q(x_0 ) + p (x_1 )q(x_1 ) + \cdots + p(x_n ) q(x_n ) \]
defines an inner product, which is called the evaluation inner product at \( x_0 , x_1 , \ldots , x_n . \)
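As an illustration, take p(x) = 1 + x, q(x) = x², and the (arbitrarily chosen) sample points 0, 1, 2:
p[x_] := 1 + x; q[x_] := x^2;
pts = {0, 1, 2};
Total[(p /@ pts) (q /@ pts)]    (* p(0)q(0) + p(1)q(1) + p(2)q(2) *)
Out[1]= 14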

With the dot product, we can assign a length to a vector, which is also called the Euclidean norm or 2-norm:

\[ \| {\bf x} \|_2 = \| {\bf x} \| = \sqrt{ {\bf x}\cdot {\bf x}} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} . \]
This norm can be generalized for arbitrary real \( p \ge 1 : \) \[ \| {\bf x} \|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p \right)^{1/p} . \]
The norm corresponding to p = 1 has a special name, the Taxicab norm or Manhattan norm, which is also called the 1-norm:
\[ \| {\bf x} \|_1 = \sum_{k=1}^n | x_k | = |x_1 | + |x_2 | + \cdots + |x_n | . \]
Another commonly used norm is
\[ \| {\bf x} \|_{\infty} = \max_{1\le k \le n} | x_k | . \]
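All of these norms are available through Mathematica's built-in Norm command: Norm[u, 1] is the taxicab norm, Norm[u] (or Norm[u, 2]) the Euclidean norm, and Norm[u, Infinity] the maximum norm; the vector below is an arbitrary sample:
u = {3, -4, 1};
{Norm[u, 1], Norm[u], Norm[u, Infinity]}
Out[1]= {8, Sqrt[26], 4}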
The following code was used to visualize the unit balls for three different norms:
region = RegionPlot[ Abs[x] + Abs[y] < 1, {x, -1.5, 1.5}, {y, -1.5, 1.5}]  (* unit ball for the 1-norm *)
r1 = Graphics[Arrow[{{-1.5, 0}, {1.5, 0}}]]
r2 = Graphics[Arrow[{{0, -1.5}, {0, 1.5}}]]
i1 = Graphics[ Text[Style["i", FontSize -> 14, Red, Thick], {1.0, -0.2}]]
j1 = Graphics[ Text[Style["j", FontSize -> 14, Red, Thick], {0.2, 1.0}]]
oo = Graphics[ Text[Style["0", FontSize -> 14, Red, Thick], {0.1, -0.1}]]
Show[i1, j1, r1, r2, region,oo]

region = RegionPlot[x^2 + y^2 < 1, {x, -1.5, 1.5}, {y, -1.5, 1.5}]  (* unit ball for the 2-norm *)
i1 = Graphics[ Text[Style["i", FontSize -> 14, Red, Thick], {1.1, -0.1}]]
j1 = Graphics[ Text[Style["j", FontSize -> 14, Red, Thick], {0.1, 1.1}]]
Show[i1, j1, r1, r2, region, oo]

region = RegionPlot[ Max[Abs[x], Abs[y]] < 1, {x, -1.5, 1.5}, {y, -1.5, 1.5}]  (* unit ball for the maximum norm *)
Show[i1, j1, r1, r2, region, oo]
         
Unit balls for the norm \( \| \cdot \|_1 \) (left), the norm \( \| \cdot \|_2 \) (center), and the norm \( \| \cdot \|_{\infty} \) (right) on \( \mathbb{R}^2 . \)
It is not hard to verify that these norms satisfy
\begin{align*} \| {\bf x} \|_{\infty} &\le \| {\bf x} \|_1 \le n\,\| {\bf x} \|_{\infty} , \\ \| {\bf x} \|_2 &\le \| {\bf x} \|_1 \le \sqrt{n} \| {\bf x} \|_2 , \\ \| {\bf x} \|_{\infty} &\le \| {\bf x} \|_2 \le \sqrt{n} \| {\bf x} \|_{\infty} , \end{align*}
i.e., these norms are equivalent.
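A quick numeric sanity check of these bounds for a sample vector in ℝ³ (chosen arbitrarily):
u = {3, -4, 1}; n = 3;
Norm[u, Infinity] <= Norm[u, 1] <= n Norm[u, Infinity]
Norm[u] <= Norm[u, 1] <= Sqrt[n] Norm[u]
Norm[u, Infinity] <= Norm[u] <= Sqrt[n] Norm[u, Infinity]
Out[1]= True
Out[2]= True
Out[3]= True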
Example: Taking the vector \( {\bf v} = \left( 2, {\bf j} , -2 \right) , \) where j denotes the imaginary unit, we calculate its norms:
Norm[{2, \[ImaginaryJ], -2}]
Norm[{2, \[ImaginaryJ], -2}, 3/2]
Out[1]= 3
Out[2]= (1 + 4 Sqrt[2])^(2/3)
In linear algebra, functional analysis, and related areas of mathematics, a norm is a function that assigns a strictly positive length or size to each vector in a vector space—save for the zero vector, which is assigned a length of zero. On an n-dimensional complex space \( \mathbb{C}^n ,\) the most common norm is
\[ \| {\bf z} \| = \sqrt{ {\bf z}\cdot {\bf z}} = \sqrt{\overline{z_1} \,z_1 + \overline{z_2}\,z_2 + \cdots + \overline{z_n}\,z_n} = \sqrt{|z_1|^2 + |z_2 |^2 + \cdots + |z_n |^2} . \]
A unit vector u is a vector whose length equals one: \( {\bf u} \cdot {\bf u} =1 . \) We say that two vectors x and y are perpendicular if their dot product is zero. Many other norms are known.
         
Portraits: Augustin-Louis Cauchy, Viktor Yakovlevich Bunyakovsky, and Hermann Amandus Schwarz.
For the norm induced by an inner product (in particular, for the Euclidean norm), the Cauchy--Bunyakovsky--Schwarz (or simply CBS) inequality holds:
\[ | {\bf x} \cdot {\bf y} | \le \| {\bf x} \| \, \| {\bf y} \| . \]
The inequality for sums was published by the French mathematician and physicist Augustin-Louis Cauchy (1789--1857) in 1821, while the corresponding inequality for integrals was first proved by the Russian mathematician Viktor Yakovlevich Bunyakovsky (1804--1889) in 1859. The modern proof of the integral inequality (which essentially repeats Bunyakovsky's) was given by the German mathematician Hermann Amandus Schwarz (1843--1921) in 1888. With the Euclidean norm, we can define the dot product as
\[ {\bf x} \cdot {\bf y} = \| {\bf x} \| \, \| {\bf y} \| \, \cos \theta , \]
where \( \theta \) is the angle between two vectors.
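Both the CBS inequality and the angle formula are easy to test numerically (with arbitrarily chosen vectors):
u = {1, 2, 3}; v = {2, 0, -1};
Abs[u . v] <= Norm[u] Norm[v]           (* the CBS inequality *)
ArcCos[u . v/(Norm[u] Norm[v])] // N    (* the angle between u and v, in radians *)
Out[1]= True
Out[2]= 1.6906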

The Xiang inequality:

\[ \left( {\bf x} \cdot {\bf y} \right)^2 = \left( \sum_{i=1}^n x_i y_i \right)^2 \le \left( \sum_{i=1}^n \max \left\{ x_i^2 , y_i^2 \right\} \right) \left( \sum_{i=1}^n \min \left\{ x_i^2 , y_i^2 \right\} \right) . \]
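This bound, too, can be checked numerically (a small sketch with arbitrarily chosen vectors):
x = {1, 3, -2}; y = {2, 1, 2};
(x . y)^2 <= Total[MapThread[Max, {x^2, y^2}]] Total[MapThread[Min, {x^2, y^2}]]
Out[1]= True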
  1. Omey, E., On Xiang's observations concerning the Cauchy--Schwarz inequality, The American Mathematical Monthly, 2015, Vol. 122, No. 7, pp. 696--698.