
Linear combinations

The following definition gives important notation and terminology for constructing vector spaces. It is also widely used in other areas of mathematics, such as real analysis and differential equations.
Let S = { v1, v2, ... , vn } be a set of n > 0 vectors in a vector space V over a field 𝔽 (typically the real numbers ℝ, the complex numbers ℂ, or the rational numbers ℚ). If a1, a2, ... , an are scalars from the same field, then a linear combination of these vectors with these scalars as coefficients is
$a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n .$

The term linear combination is due to the American astronomer and mathematician George William Hill (1838--1914), who introduced it in a research paper on planetary motion published in 1900. Working out of his home in West Nyack, NY, independently and largely in isolation from the wider scientific community, he made major contributions to celestial mechanics and to the theory of ordinary differential equations. He taught at Columbia University for a few years. The importance of his work was explicitly acknowledged by Henri Poincaré in 1905.

In other words, a linear combination of vectors from S is a sum of scalar multiples of those vectors.

Observe that in any vector space V, 0v = 0 for each vector v ∈ V. Thus, the zero vector is a linear combination of any nonempty subset of V. A linear combination is an expression that defines another vector; in this case, we say that the vector v is represented as a linear combination of vectors from the (finite) set S:

${\bf v} = a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n .$
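In plain code, forming a linear combination is just scalar multiplication followed by vector addition. The following Python sketch (a hypothetical helper we add for illustration; the tutorial's own code is Mathematica) makes the definition concrete:

```python
# Hypothetical helper: form the linear combination a1*v1 + ... + an*vn
# of vectors given as plain Python lists of numbers.
def linear_combination(coeffs, vectors):
    assert len(coeffs) == len(vectors)
    n = len(vectors[0])
    # sum the scalar multiples componentwise
    return [sum(a * v[i] for a, v in zip(coeffs, vectors)) for i in range(n)]

# 2*(1, 0, 2) + (-1)*(0, 1, 1) = (2, -1, 3)
print(linear_combination([2, -1], [[1, 0, 2], [0, 1, 1]]))
```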

Example 1: Our first example has little connection to vector spaces, but it illustrates the usefulness of the term linear combination in real life.

It is well known that the U.S. political system is binary---only two parties take turns leading the country. This situation is a result of the plurality voting system used to determine the winner of any election. Most countries in the world do not use the plurality method for nationwide elections (the U.S., the U.K., and the lower house in India are among the exceptions) because it is easy to manipulate and can be self-contradictory (a loser can win an election). Let us denote by r the vector (or basket) indicating the Republican party, and by b the corresponding vector for the Democratic party. Then an election can be represented by their linear combination v:

${\bf v} = x {\bf b} + y {\bf r} ,$
where x is the number of voters who prefer the Democratic party candidate, and y is the number of voters for the Republican party candidate. Note that in this example the scalars are taken from the set of nonnegative integers, denoted by ℕ, which is not a field. ■

Example 2: Consider the following table, which shows the vitamin content of 100 grams of six foods with respect to vitamins B1 (thiamine), B2 (riboflavin), B3 (niacin equivalents), C (ascorbic acid), and B9 (folate).

Food         B1 (mg)   B2 (mg)   B3 (mg)   C (mg)   Folate (mcg)
Watermelon   0.05      0.03      0.45      12.31    4.56
Honey        0.03      0.038     0.02      1.7      6.8
Pork         0.396     0.242     4.647     0.3      212
Salmon       0.02      0.2       6         0        25
Lettuce      0.07      0.08      0.375     9.2      73
Tomato       0.528     0.489     5.4       39       20
The vitamin content of 100 grams of each food can be recorded as a column vector in ℝ5; for example the vitamin vector for tomato is
$\begin{bmatrix} 0.528 \\ 0.489 \\ 5.4 \\ 39 \\ 20 \end{bmatrix} .$
Then the vitamin vector for a meal combining all six foods is their linear combination:
$a_1 \begin{bmatrix} 0.05 \\ 0.03 \\ 0.45 \\ 12.31 \\ 4.56 \end{bmatrix} + a_2 \begin{bmatrix} 0.03 \\ 0.038 \\ 0.02 \\ 1.7 \\ 6.8 \end{bmatrix} + a_3 \begin{bmatrix} 0.396 \\ 0.242 \\ 4.647 \\ 0.3 \\ 212 \end{bmatrix} + a_4 \begin{bmatrix} 0.02 \\ 0.2 \\ 6 \\ 0 \\ 25 \end{bmatrix} + a_5 \begin{bmatrix} 0.07 \\ 0.08 \\ 0.375 \\ 9.2 \\ 73 \end{bmatrix} + a_6 \begin{bmatrix} 0.528 \\ 0.489 \\ 5.4 \\ 39 \\ 20 \end{bmatrix} ,$
where the coefficient ai represents the number of 100-gram portions of the corresponding food.
$Post := If[MatrixQ[#1], MatrixForm[#1], #1] & (* outputs matrices in MatrixForm *)
nutMat = {{.05, .03, .45, 12.31, 4.56}, {.03, .038, .02, 1.7, 6.8}, {.396, .242, 4.647, .3, 212}, {.02, .2, 6, 0, 25}, {.07, .08, .375, 9.2, 73}, {.528, .489, 5.4, 39, 20}};
porVec = Table[RandomInteger[{10, 30}, 1], 6]
TableForm[Flatten[porVec].nutMat, TableHeadings -> {{"B1", "B2", "B3", "C", "Folate"}, None}]
The portions form a vector (where a₁ = watermelon, a₂ = honey, etc.). ■

Example 3: Humans distinguish colors due to special sensors, called cones, in their eyes. Approximately 65% of all cones are sensitive to red light, 33% to green light, and only 2% to blue (but the blue cones are the most sensitive). Most color models in use today are oriented either toward hardware (such as color monitors and printers) or toward applications where color manipulation is a goal (such as the creation of color graphics for animation). Colors on computer monitors and video cameras are commonly based on the RGB color model, in which red, green, and blue light are added together in various ways to reproduce a broad array of colors. Red is the color at the end of the visible spectrum of light, next to orange and opposite violet; it has a dominant wavelength of approximately 625–740 nanometers. Green is the color between blue and yellow on the visible spectrum, evoked by light with a dominant wavelength of roughly 495–570 nm. The human eye perceives blue when observing light with a dominant wavelength between approximately 450 and 495 nanometers. The CMY (cyan, magenta, yellow) and CMYK (cyan, magenta, yellow, black) models are used for color printing. Images represented in the RGB color model consist of three component images, one for each primary color.
When fed into an RGB monitor, these three images combine on the screen to produce a composite color image. The number of bits used to represent each pixel in RGB space is called the pixel depth. Consider an RGB image in which each of the red, green, and blue images is an 8-bit image. Under these conditions each RGB color pixel, which is a triplet (R, G, B), is said to have a depth of 24 (= 3×8) bits. The total number of colors in a 24-bit RGB image is $$\left( 2^8 \right)^3 = 16,777,216.$$

One way to identify the primary colors is to assign the vectors
• r = (1,0,0) pure red,
• g = (0,1,0) pure green,
• b = (0,0,1) pure blue
in ℝ³ and to create all other colors by forming linear combinations of r, g, and b using coefficients between 0 and 1, inclusive; these coefficients represent the percentage (gray scale) of each pure color in the mix. The set of all such color vectors is called RGB color space or the RGB color cube. Thus, each color vector c in this cube is expressed as a linear combination of the form \begin{align*} {\bf c} &= a_1 {\bf r} + a_2 {\bf g} + a_3 {\bf b} \\ &= a_1 (1,0,0) + a_2 (0,1,0) + a_3 (0,0,1) = (a_1 , a_2 , a_3 ) , \end{align*} where 0 ≤ ak ≤ 1. As indicated in the figure below, the corners of the cube represent the pure primary colors together with the colors black, white, magenta, cyan, and yellow. The vectors along the diagonal running from black to white correspond to gray scale.
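The color-mixing arithmetic above can be sketched in a few lines of Python (an illustration we add; the function name mix is ours, not part of any color library):

```python
# Sketch: colors as linear combinations of the primaries r, g, b
# with coefficients a1, a2, a3 in [0, 1].
r, g, b = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def mix(a1, a2, a3):
    """Return the color vector a1*r + a2*g + a3*b."""
    return tuple(a1 * x + a2 * y + a3 * z for x, y, z in zip(r, g, b))

# mixing full red and full green gives yellow (1, 1, 0);
# equal parts of all three primaries give a gray
yellow = mix(1, 1, 0)
gray = mix(0.5, 0.5, 0.5)
```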
a1 = Graphics[{Thick, Line[{{0, 0}, {2, 0}, {2, 2}, {0, 2}, {0, 0}}]}];
a2 = Graphics[{Thick, Dashed, Line[{{0.5, 2.75}, {0.5, 0.75}, {2.5, 0.75}}]}];
a3 = Graphics[{Thick, Line[{{0.5, 2.75}, {2.5, 2.75}, {2.5, 0.75}}]}];
b0 = Graphics[{Thick, Dotted, Line[{{0.5, 0.75}, {2.0, 2.0}}]}];
b1 = Graphics[{Thick, Dashed, Line[{{0.5, 0.75}, {0, 0}}]}];
b2 = Graphics[{Thick, Dashed, Line[{{0.5, 0.75}, {0.5, 2.75}}]}];
l1 = Graphics[{Thick, Line[{{0, 2}, {0.5, 2.75}}]}];
l2 = Graphics[{Thick, Line[{{2, 0}, {2.5, 0.75}}]}];
l3 = Graphics[{Thick, Line[{{2, 2}, {2.5, 2.75}}]}];
d1 = Graphics[{Red, Disk[{0, 0}, 0.07]}];
d2 = Graphics[{Magenta, Disk[{0, 2}, 0.07]}];
d3 = Graphics[{Blue, Disk[{0.5, 2.75}, 0.07]}];
d4 = Graphics[{Black, Disk[{0.5, 0.75}, 0.07]}];
d5 = Graphics[{Yellow, Disk[{2.0, 0}, 0.07]}];
d6 = Graphics[{White, Disk[{2.0, 2.0}, 0.07]}];
d7 = Graphics[{Cyan, Disk[{2.5, 2.75}, 0.07]}];
d8 = Graphics[{Green, Disk[{2.5, 0.75}, 0.07]}];
txt1 = Graphics[Text[Style["Red (1,0,0)", FontSize -> 14, Black], {-0.85, 0.0}]];
txt2 = Graphics[Text[Style["Yellow (1,1,0)", FontSize -> 14, Black], {3.0, 0.0}]];
txt3 = Graphics[Text[Style["Magenta (1,0,1)", FontSize -> 14, Black], {-1.1, 2.0}]];
txt4 = Graphics[Text[Style["White (1,1,1)", FontSize -> 14, Black], {3.0, 2.0}]];
txt5 = Graphics[Text[Style["Black (0,0,0)", FontSize -> 14, Black], {-0.85, 0.75}]];
txt6 = Graphics[Text[Style["Green (0,1,0)", FontSize -> 14, Black], {3.5, 0.75}]];
txt7 = Graphics[Text[Style["Blue (0,0,1)", FontSize -> 14, Black], {-0.5, 2.75}]];
txt8 = Graphics[Text[Style["Cyan (0,1,1)", FontSize -> 14, Black], {3.5, 2.75}]];
Show[a1, a2, a3, b0, b1, b2, l1, l2, l3, d1, d2, d3, d4, d5, d6, d7, d8, txt1, txt2, txt3, txt4, txt5, txt6, txt7, txt8]
The above code generates the RGB color cube. ■

Example 4: In a rectangular xy-coordinate system every vector in the plane can be expressed in exactly one way as a linear combination of the standard unit vectors.
For example, the only way to express the vector (4,3) as a linear combination of i = (1,0) and j = (0,1) is $(4,3) = 4\,(1,0) + 3\,(0,1) = 4\,{\bf i} + 3\,{\bf j} .$
i = {1, 0}; j = {0, 1};
trans1 = {4, 3}.{i, j}
{4, 3}
However, suppose that we introduce a third vector v = (1,1). Then the vector (4,3) has several representations as linear combinations of the vectors i, j, and v: \begin{eqnarray*} (4,3) &=& 4\,(1,0) + 3\,(0,1) + 0\, (1,1) = 4\,{\bf i} + 3\,{\bf j} + 0\, {\bf v} \\ &=& 5\,(1,0) + 4\,(0,1) - (1,1) = 5\,{\bf i} + 4\,{\bf j} - {\bf v} \\ &=& 3\,(1,0) + 2\,(0,1) + (1,1) = 3\,{\bf i} + 2\,{\bf j} + {\bf v} , \end{eqnarray*} and so on. Therefore, we get infinitely many linear combinations representing one vector.
$Post := If[MatrixQ[#1], MatrixForm[#1], #1] & (* outputs matrices in MatrixForm *)
i = {1, 0}; j = {0, 1};
v = {1, 1};
vec3 = {i, j, v}
trans2 = {4, 3, 0}.vec3
{4, 3}
TrueQ[trans1 == trans2]
True
End of Example 4
As we saw in the previous example, a vector may have several representations as a linear combination of vectors from a given set.
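The three representations worked out in Example 4 can be checked mechanically; the Python sketch below (our own illustration, mirroring the Mathematica computation above) confirms that all three coefficient triples produce the same vector:

```python
# The vector (4, 3) has many representations over {i, j, v} once
# the redundant vector v = (1, 1) is added to the standard basis.
i, j, v = (1, 0), (0, 1), (1, 1)

def combine(a, b, c):
    # form a*i + b*j + c*v componentwise
    return tuple(a * x + b * y + c * z for x, y, z in zip(i, j, v))

# three different coefficient triples, one and the same vector:
reps = [combine(4, 3, 0), combine(5, 4, -1), combine(3, 2, 1)]
```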
Example 5: In 3-dimensional space, the electric field E of a point charge q1 with position vector r1 at a point P---called the field point---with position vector r is given by the formula
${\bf E} = \frac{k_e q_1}{\left\vert {\bf r} - {\bf r}_1 \right\vert^3} \left( {\bf r} - {\bf r}_1 \right) ,$
where ke is the Coulomb constant, ke = 8.99×10⁹ N⋅m²/C², and q1 is the charge of the particle.

We define the position vectors in Mathematica:

r = {x,y,z}; r1 ={x1,y1,z1} ;
When we place a semicolon at the end of an expression, Mathematica does not display its output. Next, we will write an expression for the field (with ke = 1). Recall that in Mathematica, argument names are followed by underscores when a function is defined.
EField[r_ , r1_ , q1_ ] := q1/((r-r1).(r-r1))^(3/2) (r-r1)
We now take only the first two components of the field and try to make a two-dimensional plot of the field lines
{E1x,E1y} = Take[EField[{x,y,0},{1,1,0},1],2];
{(-1 + x)/((-1 + x)^2 + (-1 + y)^2)^( 3/2), (-1 + y)/((-1 + x)^2 + (-1 + y)^2)^(3/2)}
where Take[list, n] takes the first n elements of list and makes a new list of them.

VectorPlot[{E1x, E1y}, {x, 0, 2}, {y, 0, 2}, Axes -> True]
The figure presents the vector field of a single particle with charge q1.

Dimensions[EField[r, {1, 1, 0}, 1]]
{3}
Now we introduce another location of the second charge
r2 = {x2,y2,z2} ;
Then calculate the field due to this charge at the same field point
EField2[r_ , r2_ , q2_ ] := q2/((r-r2).(r-r2))^(3/2) (r-r2)
Now add the two fields to get the total electric field at r:
Etotal[r_, r1_, r2_, q1_, q2_] = EField[r,r1,q1] + EField2[r , r2 , q2 ]
{(q1 (x - x1))/((x - x1)^2 + (y - y1)^2 + (z - z1)^2)^(3/2) + ( q2 (x - x2))/((x - x2)^2 + (y - y2)^2 + (z - z2)^2)^(3/2), ( q1 (y - y1))/((x - x1)^2 + (y - y1)^2 + (z - z1)^2)^(3/2) + ( q2 (y - y2))/((x - x2)^2 + (y - y2)^2 + (z - z2)^2)^(3/2), ( q1 (z - z1))/((x - x1)^2 + (y - y1)^2 + (z - z1)^2)^(3/2) + ( q2 (z - z2))/((x - x2)^2 + (y - y2)^2 + (z - z2)^2)^(3/2)}
Let us see what the field lines of a dipole look like. A dipole is a combination of two charges of equal strength and opposite signs. Let the positive charge of +1 be at (1,1,0) and the negative charge of -1 be at (2,1,0). Let us also assume that the field point is at (x,y,0). Since we are interested in a two-dimensional plot of the field lines, we separate the first two components of Etotal
vEtotal[r_] = EField[r, {1, 1, 0}, 1] + EField2[r, {2, 1, 0}, -1]
{-((-2 + x)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + x)/((-1 + x)^2 + (-1 + y)^2 + z^2)^( 3/2), -((-1 + y)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + y)/((-1 + x)^2 + (-1 + y)^2 + z^2)^( 3/2), -(z/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + z/((-1 + x)^2 + (-1 + y)^2 + z^2)^(3/2)}
Then
{Etotal1, Etotal2} = Take[EField[r, {1, 1, 0}, 1], 2] + Take[EField2[r, {2, 1, 0}, -1], 2]
{-((-2 + x)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + x)/((-1 + x)^2 + (-1 + y)^2 + z^2)^( 3/2), -((-1 + y)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + y)/((-1 + x)^2 + (-1 + y)^2 + z^2)^(3/2)}
Now we are ready to plot this field
strPlt = StreamPlot[{-((-2 + x)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + x)/((-1 + x)^2 + (-1 + y)^2 + z^2)^(3/2), -((-1 + y)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + y)/((-1 + x)^2 + (-1 + y)^2 + z^2)^(3/2)} /. {z -> 0}, {x, 0, 3}, {y, 0, 2}, Axes -> True, Ticks -> None];
GraphicsRow[{strPlt}]
Electric field lines of a dipole.

Another version of the same plot:

charge[q_, {x0_, y0_, z0_}][x_, y_, z_] := q/((x - x0)^2 + (y - y0)^2 + (z - z0)^2)^(3/2) {x - x0, y - y0, z - z0};
projector[{x_, y_, z_}] := {x, y} VectorPlot[ projector[ charge[1, {0, 4, 0}][x, y, 0] + charge[-1, {0, -4, 0}][x, y, 0]], {x, -10, 10}, {y, -10, 10}];
StreamPlot[ projector[ charge[1, {-2, 0, 0}][x, y, 0] + charge[-1, {2, 0, 0}][x, y, 0]], {x, -5, 5}, {y, -5, 5}]

First, we define the Coulomb fields at the origin using "Ec" in the code below
Ec[x_,y_] := {x/(x^2 + y^2)^(3/2), y/(x^2 + y^2)^(3/2)};
A completely different vector field is obtained when we add two equal charges:
StreamPlot[Ec[x + 2, y] + Ec[x - 2, y], {x, -5, 5}, {y, -5, 5}]
Electric field lines of two equal charges.

We can visualize vector fields in 3-dimensional space.
StreamPlot3D[{-((-2 + x)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + x)/((-1 + x)^2 + (-1 + y)^2 + z^2)^(3/2), -((-1 + y)/((-2 + x)^2 + (-1 + y)^2 + z^2)^(3/2)) + (-1 + y)/((-1 + x)^2 + (-1 + y)^2 + z^2)^(3/2), 0}, {x, 0, 3}, {y, 0, 2}, {z, -1, 1}, Axes -> True, Ticks -> None]
The three-dimensional electric field of the dipole.

■
Example 6: The vector [-2, 8, 5, 0] is a linear combination of the vectors [3, 1, -2, 2], [1, 0, 3, -1], and [4, -2, 1, 0], because it is a sum of scalar multiples of the three vectors:
$2\,[3,\, 1,\, -2,\,2] + 4\,[1,\,0,\,3,\,-1] -3\,[4,\,-2,\, 1,\, 0] = [-2,\,8,\, 5,\, 0] .$

We verify this answer with Mathematica using multiplication of a coefficient vector and the matrix built from these three vectors:
{2, 4, -3}.{{3, 1, -2, 2}, {1, 0, 3, -1}, {4, -2, 1, 0}}
{-2, 8, 5, 0}
■

Both a vector and a matrix can be multiplied by a scalar; in Mathematica this operation is denoted by *. Matrices and vectors can be added or subtracted only when their dimensions agree.

Generators

Vector spaces can be generated from arbitrary sets.
Let A be a set, and 𝔽 a field. The free vector space over 𝔽 generated by A is the vector space Free(A) consisting of all formal finite linear combinations of elements of A.
Example 7: Let A = { ♣, ♦, ♠, ♥ } and 𝔽 = ℝ. Then Free(A) over the field of real numbers is the four-dimensional vector space consisting of all elements (a♣, b♦, c♠, d♥), where a, b, c, d ∈ ℝ. Addition and scalar multiplication are defined in the obvious way:
$(a_1 \clubsuit , b_1 \diamondsuit , c_1 \spadesuit , d_1 \heartsuit ) + (a_2 \clubsuit , b_2 \diamondsuit , c_2 \spadesuit , d_2 \heartsuit ) \\ = \left( (a_1 + a_2 ) \clubsuit , (b_1 + b_2 ) \diamondsuit , (c_1 + c_2 ) \spadesuit , (d_1 + d_2 ) \heartsuit \right)$
and
$k\,(a\, \clubsuit , b\, \diamondsuit , c\, \spadesuit , d\, \heartsuit ) = (ka\, \clubsuit , kb\, \diamondsuit , kc\, \spadesuit , kd\, \heartsuit ) ,$
for k ∈ ℝ. The four suits thus serve as labels that order the real components of ℝ⁴.
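One convenient way to realize Free(A) in code is to store an element as a map from generators to coefficients. The following Python sketch uses this representation (our own choice, not from the text; missing keys mean coefficient 0):

```python
# Elements of Free(A) as dicts mapping generators in A to real
# coefficients; a missing key means coefficient 0.
A = ["♣", "♦", "♠", "♥"]

def add(u, w):
    # componentwise addition of two formal linear combinations
    return {s: u.get(s, 0) + w.get(s, 0) for s in A}

def scale(k, u):
    # scalar multiplication by k
    return {s: k * u.get(s, 0) for s in A}

u = {"♣": 1.0, "♥": 2.0}
w = {"♥": 3.0, "♦": -1.0}
total = add(u, w)   # 1♣ - 1♦ + 0♠ + 5♥
```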
End of Example 7
The situation becomes more fruitful when we consider a subset of a vector space instead of an arbitrary set. Starting from a finite number of vectors v1, v2, … , vn from a vector space V, we can construct a vector subspace of V as follows. Consider the subset consisting of all (finite) linear combinations of these vectors. This subset of all linear combinations is itself a vector space---the smallest vector space containing the given set of vectors.
The span of a set S of vectors, also called linear span, is the linear space formed by all the vectors that can be written as linear combinations of the vectors belonging to the given set. We also say that the set S generates the vector space and denote it by span(S). Then W = span(S) is generated by S and its elements are called generators of W.
Observation: A set of vectors generates (or spans) a vector space if their linear combinations fill the space.

Although the definition of span places no restriction on the size of the generating set S, we will mostly work with finitely generated vector spaces. Moreover, we try to use as few generating vectors as possible. Any generating set of vectors can be enlarged indefinitely, say by adding kw (where k is any scalar and w ∈ S), without changing the resulting span.

A vector space V is finitely generated when it has a finite generating family S of vectors: V = span(S) = span(v1, v2, … , vn).
Example 8: The span of a single non-zero vector {v} consists of all constant multiples of this vector. This set is referred to as a line generated by v. If v = 0, zero vector, then its span consists of a single vector---0. We call this vector space a trivial space.
End of Example 8

Let S be a finite set of vectors. Its span, denoted by span(S), does not depend on the order in which the elements of S are listed; hence we call S a set, not a list. Its span is defined as

$\mbox{span}(S) = \left\{ \sum_{i=1}^N k_i {\bf v}_i \, : \, N \ge 1 , \quad {\bf v}_i \in S, \quad k_i \in \mathbb{F} \right\} .$
If W = span(S) is a span of a nonempty set S, we can enlarge S and include any linear combination w of elements from S. Then W = span(S∪{w}). Also, if any vector from the set S is a linear combination of other vectors from S, this vector can be removed from the generating set S without changing span(S).

The span is a subspace: the trivial combination (all coefficients equal to 0) shows that the zero vector belongs to span(S). A multiple of a linear combination is again a linear combination:

$k \left( x_1 {\bf v}_1 + x_2 {\bf v}_2 + \cdots + x_n {\bf v}_n \right) = k\,x_1 {\bf v}_1 + k\,x_2 {\bf v}_2 + \cdots + k\,x_n {\bf v}_n .$
(by the axioms of vector spaces), and similarly, the sum of two linear combinations
$\left( x_1 {\bf v}_1 + \cdots + x_n {\bf v}_n \right) + \left( y_1 {\bf v}_1 + \cdots + y_n {\bf v}_n \right) = \left( x_1 + y_1 \right) {\bf v}_1 + \cdots + \left( x_n + y_n \right) {\bf v}_n$
is again a linear combination. The commutativity of addition shows that
$\mbox{span}\left( {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_n \right) = \mbox{span}\left( {\bf v}_2 , {\bf v}_1 , \ldots , {\bf v}_n \right)$
and the span does not depend on the order in which the elements vi are listed. It is also obvious that if k ≠ 0, then
$\mbox{span}\left( {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_n \right) = \mbox{span}\left( k\,{\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_n \right) , \qquad k \ne 0.$
Theorem 1: Let S be a nonempty set of vectors v1, v2, … , vn from a vector space V. Then for any scalar k ∈ 𝔽,
$\mbox{span}\left( {\bf v}_1 , \ldots , {\bf v}_n \right) = \mbox{span}\left( {\bf v}_1 , \ldots , {\bf v}_i + k\,{\bf v}_j , \ldots , {\bf v}_n \right) .$
Theorem 2: If S = { v1, v2, … , vn } and T = { u1, u2, … , ur } are nonempty sets of vectors in a vector space V, then
$\mbox{span}\left( {\bf v}_1 , \ldots , {\bf v}_n \right) = \mbox{span}\left( {\bf u}_1 , \ldots , {\bf u}_r \right)$
if and only if each vector in S is a linear combination of those in T and each vector in T is a linear combination of those in S.
Example 9: Let us consider two vectors
${\bf v} = \left( 2, 2, 2 \right) \qquad \mbox{and} \qquad {\bf u} = \left( -2, 2, -10 \right) . \tag{9.1}$
When vectors are defined by arrays of numbers as above, it is assumed that they both start at the origin with tips indicated in Eq.(9.1). These two vectors generate a plane through the origin:
$a\,x +b\,y + c\,z = 0, \tag{9.2}$
for some real numbers 𝑎, b, and c. In order to identify these coefficients, we substitute both vectors into equation (9.2) to obtain
$\begin{split} 2 \left( a +b + c \right) &= 0, \\ - 2 \, a + 2\,b -10\,c &= 0. \end{split} \tag{9.3}$
To solve this system of algebraic equations, we employ Mathematica:
\$Post := If[MatrixQ[#1], MatrixForm[#1], #1] & (* outputs matrices in MatrixForm*)
Clear[a, b, c, v, u];
v = {2, 2, 2}; u = {-2, 2, -10};
The substitution is accomplished below by filling "#n" slots with the new variables, 𝑎, b, and c, respectively, shown in the square brackets in each equation.
sol1 = Solve[{2 #1 + 2 #2 + 2 #3 == 0 &[a, b, c], -2 #1 + 2 #2 - 10 #3 == 0 &[a, b, c]}, {a, b}]
{{a -> -3 c, b -> 2 c}}
Now, we substitute the values we just found for a and b into a slightly rearranged second equation of (9.3) and get a numeric value of c.

In the code below, the symbols "/." are interpreted by Mathematica as "given that..."

sol2 = Solve[-2*a + 2*b == -10 /. sol1, c]
{{c -> -1}}
Using this constant value for c in sol1 gives
sol3 = sol1 /. c -> -1
{{a -> 3, b -> -2}}
So we obtain the required values of coefficients that we use to define the equation of the plane:
$3\,x -2\,y -z = 0. \tag{9.4}$
Another option for solving system (9.3) is to add the two equations.
eq83a = v.{a, b, c}
eq83b = u.{a, b, c}
2 a + 2 b + 2 c
-2 a + 2 b - 10 c
sol4 = eq83a + eq83b
4 b - 8 c
Setting this result equal to zero and solving for b provides its value in terms of c:
$4\, b - 8\, c = 0 \qquad \Longrightarrow \qquad b = 2\, c .$
sol5 = Solve[sol4 == 0, b]
{{b -> 2 c}}
Using this relation, we obtain from either of equations (9.3) that
$2\,a + 2\cdot 2\,c + 2\,c = 0 \qquad \Longrightarrow \qquad a = -3\, c .$
sol6 = eq83a /. sol5
{2 a + 6 c}
sol7 = Solve[sol6 == 0, a]
{{a -> -3 c}}
eq84 = Plus @@ {#1 a, #2 b, #3 c} &[sol3[[1, 1, 2]], sol3[[1, 2, 2]], sol2[[1, 1, 2]]]
3 a - 2 b - c
Note that Mathematica delivers its solutions as "expressions," each of which is composed of "parts" represented in the code as "levels" enclosed by double square brackets. Thus, for example, the coefficient for a (slot #1 above) is the second element (3) of the first element (a -> 3) of the solution's first element ({a -> 3, b -> -2}), which is the entirety of sol3; each is shown below in the order described here.
sol3[[1, 1, 2]]
3
sol3[[1, 1]]
a -> 3
sol3[[1]]
{a -> 3, b -> -2}
We are now in a position to illustrate our work graphically. This also provides a chance to check our work. The task is to represent a two-dimensional abstraction: two vectors lying on the same plane. But, because the plane itself is at an angle, we need to show our 2-D abstraction in a three-dimensional (3-D) space.
plane1 = Plot3D[3*x - 2*y, {x, -3, 3}, {y, -3, 3}, PlotStyle -> {Blue, Opacity[0.5]}, Mesh -> None, Axes -> False];
vArr = Graphics3D[{Black, Arrowheads[0.02], Thickness[0.01], Arrow[{{0, 0, 0}, {2, 2, 2}}]}];
uArr = Graphics3D[{Black, Arrowheads[0.02], Thickness[0.01], Arrow[{{0, 0, 0}, {-2, 2, -10}}]}];
txt1 = Graphics3D[{White, Text["v", {2, 2.2, 2}]}];
txt2 = Graphics3D[{White, Text["u", {-2, 2.3, -10}]}];
Show[plane1, vArr, uArr, txt1, txt2, PlotLabel -> "Two vectors generate a plane"]
Two vectors generate a plane.
We wish to believe two things: one is that both vectors lie on the same plane; the other is that both vectors begin at the origin. They both start at the origin by construction, but we want to verify that the picture confirms it. Viewing the graphic from above does not confirm this information. We can rotate the 3-D box so that we view it from the side, as if our head were at about the top of the southeast corner of the box. The plane nearly disappears: at that viewpoint the plane appears as a line which coincides with the two vectors. It looks very likely that the vectors lie on the plane. This is known as a "conjecture."
We need more evidence.
Show[plane1, vArr, uArr, PlotLabel -> "Two vectors generate a plane", ViewPoint -> {1.7374874736246806, -2.7734029029471206, 0.8598683718580487}, ViewVertical -> {-0.13490979383439392, 0.2153449849487252, 0.9671741751023932}]
For our second hope, that the two vectors each begin at the origin, we will add two elements to our graphic. One is the "zero plane," on which all z values are zero; the other is a red point at the origin.
origin = Graphics3D[{Red, PointSize -> Large, Point[{0, 0, 0}]}];
plane2 = Plot3D[0, {x, -3, 3}, {y, -3, 3}, PlotRange -> {{-3, 3}, {-3, 3}, {-15, 15}}, PlotStyle -> Opacity[0]];
Show[origin, plane2, Axes -> True]
Adding these two visual elements to our graphic, the presence of the red dot increases our confidence that the two vectors begin at the origin.
Show[plane1, plane2, vArr, uArr, txt1, txt2, origin]
From above we can approximate the position of the origin and imagine that the red dot is that point with respect to the x and y axes. But the location of the red point relative to the z-axis is harder to see. Again, tilting the graphic so that the zero plane collapses into a line, below we see that the conjecture about our vectors may actually be true.
Show[plane1, plane2, vArr, uArr, txt1, txt2, origin, ViewPoint -> {1.7282485218855632, -2.9091491088169534, -0.0029170652883053483}, ViewVertical -> {0.0004402983097000594, -0.0007411511822849846, 0.9999996284160928}]
Until now, we have used words like "conjecture," "wish to believe," "hope," "imagine," "increases our confidence," "may actually be true," and "need more evidence." These words, to mathematicians, constitute what is uncharitably referred to as "hand waving." The etymology of this term is interesting and leads to a lot of jokes involving mathematicians and engineers. All of these words suggest uncertainty, and that is intentional.
Mathematicians prefer a formal proof relying on deductive reasoning, which follows a specific format (Definition, Theorem, Proof, Remarks) handed down over the centuries. Keywords in a proof are "let," "show," and the concluding acronym "QED," which stands for "Quod Erat Demonstrandum," literally "which was to be demonstrated," or colloquially "we proved it true and are through talking about it." Such a proof is provided below. We need to show that any vector from the plane (9.4) can be expressed as a linear combination of the vectors (9.1):
\[ \begin{pmatrix} x \\ y \\ 3x -2y \end{pmatrix} = c_1 {\bf v} + c_2 {\bf u} = \begin{pmatrix} 2\,c_1 - 2\,c_2 \\ 2\,c_1 + 2\,c_2 \\ 2\,c_1 - 10\,c_2 \end{pmatrix} . \tag{9.5} \]
sol8 = Solve[eq84 == 0, c]
{{c -> 3 a - 2 b}}
We rewrite this system of algebraic equations in vector form:
$\begin{bmatrix} 2 & -2 \\ 2 & 2 \\ 2 & -10 \end{bmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} x \\ y \\ 3x-2y \end{pmatrix} .$
Applying Gauss elimination to the corresponding augmented matrix, we obtain
$\left[ \begin{array}{cc|c} 2 & -2 & x \\ 2 & 2 & y \\ 2 & -10 & 3x -2y \end{array} \right] \sim \left[ \begin{array}{cc|c} 2 & -2 & x \\ 0 & 4 & y-x \\ 0 & -8 & 2x - 2y \end{array} \right] \sim \left[ \begin{array}{cc|c} 2 & -2 & x \\ 0 & 4 & y-x \\ 0 & 0 & 0 \end{array} \right] .$
The first equivalence is obtained by subtracting the first row from each of the other two rows. The last follows by adding twice the second row to the third row.

Therefore, system (9.5) has a unique solution for any x, y ∈ ℝ:

$c_1 = \frac{1}{4} \left( x + y \right) , \qquad c_2 = \frac{1}{4} \left( y - x \right) .$
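As a quick cross-check, the Python sketch below (our own addition, using exact rational arithmetic; the function name on_plane is ours) verifies that these coefficients reproduce the generic point (x, y, 3x − 2y) of the plane:

```python
# Verify that c1 = (x + y)/4, c2 = (y - x)/4 give
# c1*v + c2*u = (x, y, 3x - 2y) for the vectors of Example 9.
from fractions import Fraction as F

v = (2, 2, 2)
u = (-2, 2, -10)

def on_plane(x, y):
    c1 = F(x + y, 4)
    c2 = F(y - x, 4)
    return tuple(c1 * a + c2 * b for a, b in zip(v, u))

assert on_plane(2, 0) == (2, 0, 6)   # 3*2 - 2*0 = 6
```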
Mathematica arrives at the same answer:
mat = Transpose[{v, u}];
vec = {a, b, sol8[[1, 1, 2]]};
GaussianElimination[m_List?MatrixQ, v_List?VectorQ] := Last /@ RowReduce[Flatten /@ Transpose[{m, v}]]
GaussianElimination[mat, vec]
It is frustrating for students when a computer (or a professor) skips steps and just produces the answer. Using Trace, you can see each step of the Mathematica code execution.
trGauss = Trace[GaussianElimination[mat, vec]] // TableForm
To examine the code (and the process) more closely, you can extract a line of code and evaluate it separately. Here we take the sixth line from the cell above and use the name assignments instead of the matrix and vector values to get the solution for an interim step.
trGauss[[1, 6]]
Last /@ RowReduce[Flatten /@ Transpose[{mat, vec}]]
Mathematica has a number of approaches to solving this problem, each resulting in the same answer:
Solve[{2 c1 - 2 c2 == a, 2 c1 + 2 c2 == b, 2 c1 - 10 c2 == 3 a - 2 b}, {c1, c2}]
LinearSolve[mat, vec]
design = Insert[mat // Transpose, vec, 3] // Transpose;
RowReduce[design][[{1, 2}, 3]]
augMat1 = ResourceFunction["AugmentedMatrix"][({{2, -2}, {2, 2}, {2, -10}}).({{c1}, {c2}}) == ({{x}, {y}, {3 x - 2 y}}), {c1, c2}];
rmat1 = RowReduce[augMat1];
% // MatrixForm
End of Example 9
Suppose we are given a system of m algebraic equations with n unknowns
\begin{align*} a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n &= b_1 , \\ a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n &= b_2 , \\ \vdots\qquad\qquad & \qquad \vdots \tag{1} \\ a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n &= b_m . \end{align*}
Let us build n column vectors from coefficients of the system:
${\bf a}_1 = \begin{pmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{m,1} \end{pmatrix} , \qquad {\bf a}_2 = \begin{pmatrix} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{m,2} \end{pmatrix} , \qquad \cdots \qquad {\bf a}_n = \begin{pmatrix} a_{1,n} \\ a_{2,n} \\ \vdots \\ a_{m,n} \end{pmatrix} .$
Then system (1) can be written in vector form:
$x_1 {\bf a}_1 + x_2 {\bf a}_2 + \cdots + x_n {\bf a}_n = {\bf b} , \tag{2}$
where
${\bf b} = \begin{pmatrix} b_{1} \\ b_{2} \\ \vdots \\ b_{m} \end{pmatrix} .$
The system (1) is consistent (or compatible) precisely when vector equation (2) is valid for suitable values of the coefficients xi, namely when b is a linear combination of the vectors aj:
${\bf b} \in \mbox{span}\left( {\bf a}_1 , {\bf a}_2 , \ldots , {\bf a}_n \right) .$
From Theorem 1, it follows that system (1) is consistent precisely when
$\mbox{span}\left( {\bf a}_1 , {\bf a}_2 , \ldots , {\bf a}_n \right) = \mbox{span}\left( {\bf b} , {\bf a}_1 , {\bf a}_2 , \ldots , {\bf a}_n \right) .$
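The membership test b ∈ span(a1, … , an) can be carried out by Gaussian elimination on the augmented matrix [a1 … an | b]: the system is inconsistent exactly when a reduced row has all zero coefficients but a nonzero right-hand side. A Python sketch (the function name in_span is our own; exact arithmetic via fractions):

```python
# Decide whether b lies in span(a1, ..., an), with the vectors a_j given
# as the columns of the coefficient matrix.
from fractions import Fraction as F

def in_span(columns, b):
    m, n = len(b), len(columns)
    # build the augmented matrix [a1 ... an | b] with exact rationals
    M = [[F(columns[j][i]) for j in range(n)] + [F(b[i])] for i in range(m)]
    row = 0
    for col in range(n):
        piv = next((r for r in range(row, m) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[row], M[piv] = M[piv], M[row]
        # eliminate the pivot column from all other rows
        for r in range(m):
            if r != row and M[r][col] != 0:
                f = M[r][col] / M[row][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[row])]
        row += 1
    # consistent iff no row reads 0 ... 0 | nonzero
    return all(any(x != 0 for x in r[:n]) or r[n] == 0 for r in M)
```

For instance, with the vectors of Example 6 the test confirms that [-2, 8, 5, 0] lies in the span of [3, 1, -2, 2], [1, 0, 3, -1], and [4, -2, 1, 0].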

The same system (1) can be solved for all data b ∈ 𝔽m precisely when the linear span of a1, a2, … , an is the whole space

$\mbox{span}\left( {\bf a}_1 , {\bf a}_2 , \ldots , {\bf a}_n \right) = \mathbb{F}^m ,$
namely when the m-tuples a1, a2, … , an form a set of generators of 𝔽m.

Assume now that system (1) has infinitely many solutions (for a suitably given right-hand side b), but all have the same value xj for some fixed j. To fix ideas, let us assume that x1 has the same value in all solutions of (1). By the basic principle of linear algebra, all solutions are obtained from a particular one, by addition of a solution of

$x_1 {\bf a}_1 + x_2 {\bf a}_2 + \cdots + x_n {\bf a}_n = {\bf 0} . \tag{3}$
Hence x1 has a fixed value in all solutions of (2) precisely when every solution of (3) has x1 = 0. This simply means that a1 is not a linear combination of a2, a3, … , an. Indeed, any expression
${\bf a}_1 = c_2 {\bf a}_2 + \cdots + c_n {\bf a}_n ,$
corresponds to a relation
${\bf a}_1 - c_2 {\bf a}_2 - \cdots - c_n {\bf a}_n = {\bf 0} .$
Hence it corresponds to a solution of (3) whose first component is x1 = 1 ≠ 0. In other words, x1 has a fixed value in all solutions of (2) precisely when a1 is not a linear combination of a2, a3, … , an, namely,
${\bf a}_1 \notin \mbox{span}\left( {\bf a}_2 , {\bf a}_3 , \ldots , {\bf a}_n \right) .$
More generally, xj has a fixed value in all solutions of (1) precisely when aj is not a linear combination of the other m-tuples.
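This membership test can be sketched in code (plain Python over the rationals; the helper names `rank` and `in_span` are ours). A vector lies in the span of others exactly when appending it does not increase the rank:

```python
from fractions import Fraction

def rank(rows):
    """Row rank over the rationals, via exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * t for a, t in zip(M[i], M[r])]
        r += 1
    return r

def in_span(others, a):
    # a is a combination of the others iff appending it leaves the rank
    # unchanged (row rank equals column rank, so vectors may be rows here)
    return rank(others + [a]) == rank(others)

# a1 = (1, 2, -1) is not a combination of a2 = (3, 0, 1) alone,
# so x1 is pinned to the same value in every solution
print(in_span([[3, 0, 1]], [1, 2, -1]))  # False
```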

To illustrate how vectors define a space through their linear combinations, we pick two non-collinear vectors u and v and plot them.

The figure below includes a parallelogram (in light blue) that illustrates all linear combinations of the two vectors u and v with positive coefficients less than one. So this parallelogram represents only a part of the span of u and v.

u = {2, 3}; v = {5, 2};
vecArr = Graphics[{Thick, Red, Arrow[{{0, 0}, u}], Arrow[{{0, 0}, v}]}];
poly = Graphics[{LightBlue, Polygon[{{0, 0}, u, u + v, v}]}];
Show[poly, vecArr]

Two vectors form a parallelogram.

We now wish to see how vectors that are linear combinations of the outlining vectors u and v fill this region. Choosing coefficients a₁ = 1.5 and a₂ = 1, we multiply these, respectively, by the components of u, creating a new vector

{1.5, 1}*u
{3., 3}

This creates the corresponding vector, which can be visualized as an arrow; we see that it fits within the span of u and v.

arrRand1 = Graphics[{Thick, Black, Arrow[{{0, 0}, {1.5, 1}*u}]}];
Show[poly, vecArr, arrRand1, Axes -> True]

Repeat with different values of the coefficients:

arrRand2 = Graphics[{Thick, Black, Arrow[{{0, 0}, {2.84, 1}*u}]}];
Show[poly, vecArr, arrRand1, arrRand2, Axes -> True]

Similarly,

arrRand3 = Graphics[{Thick, Black, Arrow[{{0, 0}, u + v}]}];
Show[poly, vecArr, arrRand1, arrRand2, arrRand3, Axes -> True]

In order to demonstrate this linear combination phenomenon, we plot with Mathematica several combinations:
u = {2, 3}; v = {5, 2};
vecArr = Graphics[{Thick, Red, Arrow[{{0, 0}, u}], Arrow[{{0, 0}, v}]}];
poly = Graphics[{LightBlue, Polygon[{{0, 0}, u, u + v, v}]}, PlotLabel -> "Vectors Generating a Plane"];
poly2 = DiscretizeRegion[Polygon[{{0, 0}, u, u + v, v}]];
polyMesh = Drop[MeshCoordinates@poly2, 4];
Animate[
 If[end < 115,
  Graphics[Table[Arrow[{{0, 0}, polyMesh[[i]]}], {i, 1, end}],
   PlotLabel -> "Vectors Generating a Plane"],
  Show[poly, vecArr,
   Graphics[Table[Arrow[{{0, 0}, polyMesh[[i]]}], {i, 1, end}],
    PlotLabel -> "Vectors Generating a Plane"]]],
 {{end, 1, "Add Vectors"}, 1, 115},
 AnimationRepetitions -> 1]

Remark: It is important to keep in mind that linear combinations are always finite, even if the generating set is not. To illustrate this point, consider the vector space ℝ[x] = span(1, x, x², x³, …), which is the set of all polynomials (of any degree). Recall from calculus that we can represent the function sin(x) in the form
$\sin (x) = \lim_{N\to\infty} \sum_{k=0}^N (-1)^k \frac{x^{2k+1}}{(2k+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots .$
We have written sin(x) as a sum of scalar multiples of x, x³, x⁵, and so on. However, sin(x) is not a polynomial in x because sin(x) can only be written as an infinite sum of these monomials, not a finite one.
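A quick numerical check (plain Python; the name `sin_partial` is ours) shows that each truncation is an honest finite linear combination, i.e. a polynomial, that merely approximates sin(x):

```python
import math
from math import factorial

def sin_partial(x, N):
    """The degree-(2N+1) Taylor polynomial of sin: a *finite* linear
    combination of the monomials x, x^3, x^5, ..., x^(2N+1)."""
    return sum((-1)**k * x**(2*k + 1) / factorial(2*k + 1)
               for k in range(N + 1))

# Every partial sum is a polynomial; the error at x = 1 shrinks with N,
# but no finite combination equals sin exactly as a function.
print(abs(sin_partial(1.0, 5) - math.sin(1.0)) < 1e-9)  # True
```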

1. Which of the following are linear combinations of two vectors v = (1, 2, −1) and u = (3, 0, 1)? $({\bf a}) \quad (1, 1, 1), \qquad ({\bf b}) \quad (-3, 6, -5) , \qquad ({\bf c}) \quad (3 , 2, 1).$
2. Which of the following are linear combinations of three vectors v = (1, 2, −1), u = (1, 1, 1), and w = (3, 0, 1)? $({\bf a}) \quad (3, -5, 3), \qquad ({\bf b}) \quad (-6, 5, -4) , \qquad ({\bf c}) \quad (3 , 2, 1).$
3. Express the following as linear combinations of v = (−1, 2, 4), u = (1, −3, 1), and w = (3, 1, 2). $({\bf a}) \quad (-10, 7, -1), \qquad ({\bf b}) \quad (8, 1, -11) , \qquad ({\bf c}) \quad (3 , 22, 7).$
4. Which of the following are linear combinations of matrices ${\bf A} = \begin{bmatrix} \phantom{-}1 & \phantom{-}2 \\ -3 & -1 \end{bmatrix} , \qquad {\bf B} = \begin{bmatrix} 1 & -1 \\ 3 & -2 \end{bmatrix} , \qquad {\bf C} = \begin{bmatrix} 4 & 1 \\ 3 & 2 \end{bmatrix} \ ?$ $({\bf a}) \quad \begin{bmatrix} -5 & 6 \\ -18 & 2 \end{bmatrix} , \qquad ({\bf b}) \quad \begin{bmatrix} -12& -5 \\ -6 & -15 \end{bmatrix} , \qquad ({\bf c}) \quad \begin{bmatrix} 1&2 \\ 1&2 \end{bmatrix} .$
5. In each part, express the vector as a linear combination of the polynomials ${\bf p}_1 = 2- x^2 , \quad {\bf p}_2 = 3 -x + 5\,x^2 , \quad {\bf p}_3 = 4 +2\,x - x^2 .$ $({\bf a}) \quad -5 x + 14 x^2 , \qquad ({\bf b}) \quad 7+3 x - 9 x^2 , \qquad ({\bf c}) \quad 7 \left( 1 + x - 3 x^2 \right) .$
6. Determine whether the polynomials from the previous exercise span ℘₂, the space of polynomials of degree at most 2.
7. Are the given functions $({\bf a}) \quad \cos 2x, \qquad ({\bf b}) \quad 5, \qquad ({\bf c}) \quad 0$ linear combinations of the following functions ${\bf f} = 2\,\cos^2 x , \qquad {\bf g} = \sin^2 x , \qquad {\bf h} = 3?$