A linear equation in n unknowns x₁, x₂, … , x_n is an equation of the form \[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b , \] where the literal coefficients 𝑎₁, 𝑎₂, … , 𝑎_n and b are given scalars.

The linear equation above is said to be written in standard form when a constant term "b" is separated by equal sign from other terms and placed at the right side. Also it is assumed that all unknown variables appear at most once in the left-hand side pf the equation. Of course, your computer solver does not care where you place this constant term, but we humans prefer to deal with the equation in standard form, which simplifies computer algorithms. For example, Mathematica solves successfully any of the following equations not in standard form:

\[ 2\,x + 3\,y + 5 = 0 \qquad\mbox{or} \qquad 2\,x + 4\,y = y - 5 . \]

Solve[{2 x + 3 y + 5 == 0}, {x, y}]

Solve::svars: Equations may not give solutions for all "solve" variables.
{{y -> -(5/3) - (2 x)/3}}

Solve[{2 x + 4 y == y - 5}, {x, y}]

Solve::svars: Equations may not give solutions for all "solve" variables.
{{y -> -(5/3) - (2 x)/3}}

Reduce[2 x + 3 y + 5 == 0, {x, y}]

y == -(5/3) - (2 x)/3

Reduce[2 x + 4 y == y - 5, {x, y}]

y == -(5/3) - (2 x)/3

When presenting linear equations, unknown variables can be written in either indexed form or just using distinct characters. For example, the same equation can be written in either way:

\[ 2\, 田_1 + 3\,田_2 + 4\,田_3 = 5 \qquad\mbox{or} \qquad 2\, 箱 + 3\,碗 + 4\,杯 = 5 . \]

Let us consider a system of m linear equations in standard form with n unknowns

\begin{equation} \label{EqRow.1} \begin{split} a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n &= b_1 , \\ a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n &= b_2 , \\ \qquad \vdots \qquad & \quad \vdots \\ a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n &= b_m . \end{split} \end{equation}

Any system having the form (1) above is called a linear system in n variables, or an m × n linear system in standard form. The 𝑎_i,j’s are called the system coefficients, and form the entries of the coefficient matrix A = [𝑎_i,j]. The variables x_j and the right-hand sides b_i form the variable vector x ∈ 𝔽ⁿ and the right-hand side vector b ∈ 𝔽^m (also is known as a constant term), respectively.
The system in (1) is homogeneous provided (b₁, b₂, … , b_m) = 0.

We rewrite this system in matrix/vector form

\begin{equation} \label{EqRow.2} {\bf A} \,{\bf x} = {\bf b} \in \mathbb{F}^{m\times 1} , \qquad {\bf x} \in \mathbb{F}^{n\times 1} , \end{equation}

where A = [𝑎_i,j] is called the coefficient matrix of system (1) and b is the right-hand term, with

\[ {\bf A} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & a_{2,3} & \cdots & a_{2,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & a_{m,3} & \cdots & a_{m,n} \end{bmatrix} \in \mathbb{F}^{m\times n}, \quad {\bf b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} , \quad {\bf x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{F}^{n\times 1} . \]

The advantage of writing linear systems in matrix/vector form (beyond the fact that it requires less writing) is that we can make use of the various properties of matrices and matrix manipulations. Note that entries of the unknown vector x are embraced in parentheses instead of brackets to emphasize that their notation depends on humans and can be replaced by any characters.

The entries of the coefficient matrix A may have more complicated form than just being scalars; we will see later that systems of linear equations and corresponding coefficient matrices could be functions depending on some parameters.

By a solution of such a system (1), we mean any numeric vector x = (x₁, x₂, … , x_n) that simultaneously solves every equation in the system. The solution set or the general solution of the linear system (1) is a collection (which could be empty) of all solutions of the system.

Mathematica has a dedicated command LinearSolve that provides solutions to the linear matrix/vector equation A x = b. Therefore, if your objective is to know how to find a solution of Eq.(2), you can stop reading this futile tutorial and just utilize your ability to press buttons on the keyboard---Mathematica will do this job for you. Otherwise, keep reading and thinking.

A first observation of the system of linear equations (1) or its matrix/vector form (2) is unimportant of notation x in it. If the coefficient matrix A and constant term b are determined by the problem under consideration (or by Nature), notation for unknown variables depends strictly on your proclivity. By the way, Mathematica does not include notation for unknown variable when corresponding command is invoked: LinearSolve[A, b]. Moreover, Mathematica accepts b written as an n-tuple or as a column vector.

Example 1: We illustrate different approaches when solving the following underdetermined system of equations (when there are fewer equations than unknowns) \[ \begin{split} 2\,x_1 + 3\, x_2 + 5\, x_3 &= 3, \\ 3\,x_1 - 2\,x_2 - 2\, x_3 &= 1 . \end{split} \tag{1.1} \]

Needs["Notation`"]
Symbolize[NotationTemplateTag[Subscript[x, 1]]];
Symbolize[NotationTemplateTag[Subscript[x, 2]]];
Symbolize[NotationTemplateTag[Subscript[x, 3]]];
Reduce[{2 Subscript[x, 1] + 3 Subscript[x, 2] + 5 Subscript[x, 3] == 3, 3 Subscript[x, 1] - 2 Subscript[x, 2] - 2 Subscript[x, 3] == 1}, {Subscript[x, 1], Subscript[x, 2], Subscript[x, 3]}]
(* this is what you see in notebook: *)
\( \displaystyle \quad \begin{split} 2\,x_1 + 3\, x_2 + 5\, x_3 &= 3 \\ 3\,x_1 - 2\,x_2 - 2\, x_3 &= 1 \end{split} \)

Subscript[x, 2] == -(11/4) + (19 Subscript[x, 1])/4 && Subscript[x, 3] == 9/4 - (13 Subscript[x, 1])/4

Each equation in (1.1) can be written as dot products a₁ • x = 3 and a₂ • x = 1, where \[ {\bf a}_1 = \left( 2, 3, 5 \right) , \quad {\bf a}_2 = \left( 3, -2, -2 \right) , \] and x = (x₁, x₂, x₃) ∈ ℝ³. The solution set of the equation a • x = b is also a plane in ℝ³.

Symbolize[NotationTemplateTag[Subscript[a, 1]]];
Symbolize[NotationTemplateTag[Subscript[a, 2]]];
Subscript[a, 1] = {2, 3, 5};
Subscript[a, 2] = {3, -2, -2};
x = {Subscript[x, 1], Subscript[x, 2], Subscript[x, 3]};
Subscript[a, 1] . x

2 Subscript[x, 1] + 3 Subscript[x, 2] + 5 Subscript[x, 3]

Subscript[a, 2] . x

3 Subscript[x, 1] - 2 Subscript[x, 2] - 2 Subscript[x, 3]

Reduce[{Subscript[a, 1] . x == 3, Subscript[a, 2] . x == 1}, {Subscript[x, 1], Subscript[x, 2], Subscript[x, 3]}]

Subscript[x, 2] == -(11/4) + (19 Subscript[x, 1])/4 && Subscript[x, 3] == 9/4 - (13 Subscript[x, 1])/4

Geometrically we expect that the intersection of two nonparallel planes in ℝ³ is a line. (If you hold out two sheets of paper and turn the sheets so that they are not parallel, their intersection is a line. Imagine that the sheets are extended forever and ever.) The equation of this line is \[ S = \left\{ \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right] \in \mathbb{R}^{3\times 1} \ : \ 2\,x_1 + 3\, x_2 + 5\, x_3 = 3 \quad\mbox{and} \quad 3\,x_1 - 2\,x_2 - 2\, x_3 = 1 \right\} . \] In order to plot the intersection of these two planes, we use a trick by introducing new variables (x, y, and z) without indexing that are common in physics and engineering. Below on the left you see the two planes with their intersection line in red; on the right is the edge view of the same graphic where the line momentarily becomes a point.

Intersection of two planes		Edge view of intersection of two planes
Intersection of two planes defines a line.		Edge view of two planes and line of intersection.

Clear[x, y] f[x_, y_] = (3 - 2*x - 3*y)/5;
g[x_, y_] = (3*x - 2*y - 1)/2;
line = Solve[((3 - 2*x - 3*y)/5) == ((3*x - 2*y - 1)/2), y][[1, 1, 2]]; plt = Plot3D[{f[x, y], g[x, y]}, {x, -1, 2}, {y, -1, 2}, Axes -> True, Boxed -> False, AxesLabel -> {"x", "y", "z"}];
ppdLine = ParametricPlot3D[{x, line, f[x, line]}, {x, -1, 2}, Boxed -> False, PlotStyle -> {Red, Thickness[.025]}];
show1 = Labeled[Show[plt, ppdLine, ImageSize -> 275], "Two planes and line\n of intersection"];
show2 = Labeled[ Show[plt, ppdLine, ImageSize -> 275, ViewPoint -> {-0.6139237514164889, -3.308148189190016, 0.3595179909344441}], "Edge view of\ntwo planes and line\n of intersection"]; GraphicsRow[{show1, show2}]

We know that a line is the set of points that can be written as the sum of a shift vector plus all scalar multiples of a nonzero vector (known as the generator). The set S does not look like a line. In fact, it is hard to look at S and tell if there are any points at all in S. To show that S is a line of points, we must resort to algebra and arithmetic, and solve the two deﬁning equations simultaneously. In particular, to answer our question we must solve the system (1.1).

We convert the given row problem (1.1) into column problem. Let V ⊆ ℝ² be spanned on three vectors: \[ V = \mbox{span}\left\{ \begin{bmatrix} 2 \\ 3 \end{bmatrix} , \ \begin{bmatrix} \phantom{-}3 \\ -2 \end{bmatrix} , \ \begin{bmatrix} \phantom{-}5 \\ -2 \end{bmatrix} \right\} . \] V is a subspace of ℝ², and in fact, it is generated by linear combinations \[ V = \left\{ x_1 \begin{bmatrix} 2 \\ 3 \end{bmatrix} + x_2 \begin{bmatrix} \phantom{-}3 \\ -2 \end{bmatrix} + x_3 \begin{bmatrix} \phantom{-}5 \\ -2 \end{bmatrix} \ : \ x_1 , x_2 , x_3 \mbox{ are real numbers} \right\} . \] System of equations (1.1) has a solution iff the right-hand side vector \( \displaystyle {\bf b} = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \) belongs to V: \[ \begin{bmatrix} 3 \\ 1 \end{bmatrix} = x_1 \begin{bmatrix} 2 \\ 3 \end{bmatrix} + x_2 \begin{bmatrix} \phantom{-}3 \\ -2 \end{bmatrix} + x_3 \begin{bmatrix} \phantom{-}5 \\ -2 \end{bmatrix} . \] Our immediate task is to solve the system of equations (1.1). So we ask Mathematica for help. First, we introduce new variables in order to distinguish our new approach with the previously one.

Clear[x, y]
Symbolize[NotationTemplateTag[Subscript[y, 1]]];
Symbolize[NotationTemplateTag[Subscript[y, 2]]];
Symbolize[NotationTemplateTag[Subscript[y, 3]]];
Solve[{2*Subscript[y, 1] + 3*Subscript[y, 2] + 5*Subscript[y, 3] == 3, 3*Subscript[y, 1] - 2*Subscript[y, 2] - 2*Subscript[y, 3] == 1}, {Subscript[y, 1], Subscript[y, 2]}]

{{Subscript[y, 1] -> 1/13 (9 - 4 Subscript[y, 3]), Subscript[y, 2] -> 1/13 (7 - 19 Subscript[y, 3])}}

Solve[{2*y1 + 3*y2 + 5*y3 == 3, 3*y1 - 2*y2 - 2*y3 == 1}, {y1, y2}]

\[ x_1 = \frac{1}{13} \left( 9 - 4\, x_3 \right) , \quad x_2 = \frac{1}{13} \left( 7 - 19\, x_3 \right) \] How does Mathematica know the general solution ? To answer this question, we multiply the first equation by 2 and the second equation by 3; this leads to \[ \begin{split} 4\,x_1 + 6\, x_2 + 10\, x_3 &= 6, \\ 9\,x_1 - 6\,x_2 - 6\, x_3 &= 3 . \end{split} \] Upon adding these results, we eliminate variable x₂ because it is multiplied by zero: \[ \begin{split} 4\,x_1 + 6\, x_2 + 10\, x_3 &= 6, \\ 13\,x_1 + 0\,x_2 + 4\, x_3 &= 9 . \end{split} \tag{1.2} \] Solving the latter, we get \[ x_1 = \frac{1}{13} \left( 9 - 4\, x_3 \right) . \] Mathematica can accomplish that with one line of code.

Solve[(2*# & /@ (2*Subscript[y, 1] + 3*Subscript[y, 2] + 5*Subscript[y, 3] == 3))[[ 1]] + (3*# & /@ (3*Subscript[y, 1] - 2*Subscript[y, 2] - 2*Subscript[y, 3] == 1))[[ 1]] == (2*# & /@ (2*Subscript[y, 1] + 3*Subscript[y, 2] + 5*Subscript[y, 3] == 3))[[ 2]] + (3*# & /@ (3*Subscript[y, 1] - 2*Subscript[y, 2] - 2*Subscript[y, 3] == 1))[[2]], Subscript[y, 1]][[1, 1]]

A detailed discussion of the line of code above is given below. However, at the risk of being pedantic, one can break up the code above into a number of steps, making it more readable and understandable.

eq1 = 2*Subscript[y, 1] + 3*Subscript[y, 2] + 5*Subscript[y, 3] == 3;(*start by naming each equation*) eq2 = 3*Subscript[y, 1] - 2*Subscript[y, 2] - 2*Subscript[y, 3] == 1;
scaledEq1 = (2*# & /@ eq1)(*the equations are scaled by each multiplier*) scaledEq2 = (3*# & /@ eq2)

2 (2 Subscript[y, 1] + 3 Subscript[y, 2] + 5 Subscript[y, 3]) == 6
3 (3 Subscript[y, 1] - 2 Subscript[y, 2] - 2 Subscript[y, 3]) == 3

scaledEq1[[1]](*note that each scaled equation has a part on either side of the == sign*) scaledEq1[[2]]

2 (2 Subscript[y, 1] + 3 Subscript[y, 2] + 5 Subscript[y, 3])
6

combinedEq = scaledEq1[[1]] + scaledEq2[[1]] == scaledEq1[[2]] + scaledEq2[[2]] FullSimplify[%](*the Subscript[y, 2] variable is canceled out*)

3 (3 Subscript[y, 1] - 2 Subscript[y, 2] - 2 Subscript[y, 3]) + 2 (2 Subscript[y, 1] + 3 Subscript[y, 2] + 5 Subscript[y, 3]) == 9
13 Subscript[y, 1] + 4 Subscript[y, 3] == 9

Solve[combinedEq, Subscript[y, 1]][[1, 1]](*solve for Subscript[y, 1] in terms of Subscript[y, 3]*)
Subscript[solny, 1] = %[[2]]

Subscript[y, 1] -> 1/13 (9 - 4 Subscript[y, 3])
1/13 (9 - 4 Subscript[y, 3])

Plot[Subscript[solny, 1], {Subscript[y, 3], -1, 2}](*which is the line representing the intersection of the two \ planes*)

Back substitution x₁ into the first equation yields \[ \frac{2}{13} \left( 9 - 4\,x_3 \right) + 3\, x_2 + 5\, x_3 = 3 . \] This allows us to determine x₂ as a function of x₃. Since both unknowns x₁ and x₂ are expressed through x₃ can be considered x₃ as a free variable.

======================= Roger ==================

However, we can makes the system (1.2) looks nicer. In order to achive it, we use Chiò's method, invented by the Italian mathematician Felice Chiò (1813--1871) in 1853. At his time, computers wee not available and people did not (and still do not) feel comfortable working with fractions.

First, we observe that there is no need to continue writing the 'unknowns' x₁, x₂, x₃ because we work only with their coefficients. Therefore, we introduce vectors with corresponding coefficients that we abbreviate as \[ {\bf r}_1 = \begin{bmatrix} 2&3&5&3 \end{bmatrix} \qquad\mbox{and} \qquad {\bf r}_2 = \begin{bmatrix} 3&-2&-2&1 \end{bmatrix} . \] We multiply the first vector r₁ by 3 and subtract the second vectorr₂ multiplied by 2. This yields \[ 3\,{\bf r}_1 = \begin{bmatrix} 6&9&15&9 \end{bmatrix} \qquad\mbox{and} \qquad 2\,{\bf r}_2 = \begin{bmatrix} 6&-4&-4&2 \end{bmatrix} . \] So \[ {\bf w} = 3\,{\bf r}_1 - 2\,{\bf r}_2 = \begin{bmatrix} 0&13&19&7 \end{bmatrix} . \] Now we compare the second components of vectors w and r₂, we can make them equal upon multiplication w vt 2 and r₂, by 13: \[ 2\,{\bf w} = \begin{bmatrix} 0&26&38&14 \end{bmatrix} , \qquad 13\,{\bf r}_2 = \begin{bmatrix} 39&-26&-26&13 \end{bmatrix} . \] Adding these two vectors, we obtain \[ {\bf u} = 2\,{\bf w} + 13\,{\bf r}_2 = \begin{bmatrix} 39&0&12&27 \end{bmatrix} . \] Since system of equations corresponding vectors r₁ and r₂ is equivalent to the system of equations generated by vectors u and w, we claim that our system of equations is equivalent to the following system: \[ \begin{split} 0\,x_1 + 13\, x_2 + 19\, x_3 &= 7, \\ 39\,x_1 + 0\,x_2 + 12\, x_3 &= 27 . \end{split} \tag{1.4} \] Each of equations in (1.4) contains only two variables, so we can determine one of them: \[ 13\, x_2 = 7 - 19\, x_3 , \qquad 39\, x_1 = 27 - 12 \, x_3 \quad \iff \quad 13\, x_1 = 9 - 4\, x_ 3 . \] We can use the same approach eliminating variables in each equation of system (1.2). Again we start with vectors r₁ and r₂, but now eliminate variables x₁ and x₃. We can use vector w because its first component is zero. Comparing its third component (19) and the third component (−2) of r₂, we see that we need to build another vector \[ {\bf v} = 2\,{\bf w} + 19\,{\bf r}_2 = \begin{bmatrix} 39&12&0&33 \end{bmatrix} . \] Then the system of equations w, v contains only two unknown variables in equa equation \[ \begin{split} 0\,x_1 + 13\, x_2 + 19\, x_3 &= 7, \\ 39\,x_1 + 12\,x_2 + 0\, x_3 &= 33 . \end{split} \tag{1.5} \] Solving this system with respect to x₁ and x₃, we obtain \[ 13\, x_1 = 11 - 4\, x_2 , \qquad 19\, x_3 = 7 - 13\, x_ 2 , \] where x₂ is a free variable.

=================== to be checked ==============

\[ \begin{split} 0\,x_1 + 78\, x_2 + 114\, x_3 &= 42, \\ 13\,x_1 + 0\,x_2 + 4\, x_3 &= 9 . \end{split} \tag{1.3} \]

13*{4, 6, 10, 6} - 4*{13, 0, 4, 9}

{0, 78, 114, 42}

We can simplify system (11.3) further by dividing each term in the first equation by 6. This gives us a nice system: \[ \begin{split} 0\,x_1 + 13\, x_2 + 19\, x_3 &= 7, \\ 13\,x_1 + 0\,x_2 + 4\, x_3 &= 9 . \end{split} \tag{1.4} \] The x₃ variable can simply be anything, so let x₃ = 13t. Then solution of Eq.(1.1) is expressed in the form of line in ℝ³: \[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{13} \begin{bmatrix} 9 \\ 7 \\ 0 \end{bmatrix} + t \begin{bmatrix} -4 \\ -19 \\ 13 \end{bmatrix} , \qquad t \in \mathbb{R} . \]

Now we multiply the first equation by 2 and the second one by 5. Upon adding the results, we obtain \[ \begin{split} 4\,x_1 + 6\, x_2 + 10\, x_3 &= 6, \\ 19\,x_1 - 4\,x_2 &= 11 . \end{split} \] Solving for x₁ and then substitute into the first equation, we get the general solution \[ x_1 = \frac{1}{19} \left( 11 + 4\, x_2 \right) , \quad x_3 = \frac{1}{19} \left( 7 - 13\, x_2 \right) \] We verify with Mathematica:

Solve[{2*y1 + 3*y2 + 5*y3 == 3, 3*y1 - 2*y2 - 2*y3 == 1}, {y1, y3}]

So now variable x₂ is free and two variables are expressed through it. Similarly, we can obtain the general solution that has x₁ as a free variable: \[ x_2 = \frac{1}{4} \left( -11 + 19\, x_1 \right) , \quad x_3 = \frac{1}{4} \left( 9 - 13\, x_1 \right) . \]

End of Example 1

There are three fundamentally different ways of interpreting any linear system (1):

Every system of linear equations (1) can be considered as a linear transformation T : 𝔽ⁿ ⇾ 𝔽^m represented by the coefficient matrix A = [𝑎_i,j] that maps x ∈ 𝔽ⁿ into 𝔽^m:
\[ T({\bf x}) = \begin{pmatrix} a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n \\ \vdots \\ a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n \end{pmatrix} \in \mathbb{F}^m , \]
where output is written in column form to save space and x = (x₁, x₂, … , x_n) ∈ 𝔽ⁿ. Then solving linear system (1) is equivalent to determination of all (if any) inverse images of b under transformation T. If we rewrite vectors x and b in column form, we can define the linear transformation T in matrix/vector form:
\[ {\bf A}\,{\bf x} = {\bf b} , \qquad {\bf A} \in \mathbb{F}^{m\times n} , \quad {\bf x} \in \mathbb{F}^{n\times 1} . \quad {\bf b} \in \mathbb{F}^{m\times 1} . \]
Column problem: Find all ways to linearly combine the columns of A to produce b ∈ 𝔽^m×1. This is a problem about linear combinations in the range 𝔽^m×1.
Row problem: Find all vectors in the domain 𝔽ⁿ that simultaneously solve each of the linear equation
\[ < {\bf r}_i \mid {\bf x} > = a_{i,1} x_1 + a_{i,2} x_2 + \cdots + a_{i,n} x_n = b_i , \quad i =1, 2, \ldots , m, \]
for each row r_i(A) of matrix A. Equivalently, find all vectors in the intersection of the solution sets of all these equations <r_i(A)| x> = b_i, i = 1, 2, … , m. Geometrically, these sets are hyperspaces and they all lie in 𝔽ⁿ.

Since linear system of equations (1) defines a linear transformation T : 𝔽ⁿ ⇾ 𝔽^m, solving the system is equivalent to determination of T⁻¹(b). Linear mappings can be onto, one-to-one, both, or neither. This leads to the following conclusion.

Theorem 1: Every system of linear equations has either no solution, exactly one solution, or infinitely many solutions.

Another way of phrasing this theorem is as follows: if a system of linear equations has at least two solutions then it must have infinitely many solutions. With this in mind, we start by assuming that there are two distinct solutions to the linear system.

If A x = b is the matrix form of the linear system, where A ∈ 𝔽^m×n, x ∈ 𝔽^n×1, and b ∈ 𝔽^m×1, then there exist two distinct solutions of the linear system, which means that there exist two distinct solutions x₁ ≠ x₂ ∈ 𝔽^n×1 such that A x₁ = b and A x₂ = b, Then for any constant k ∈ 𝔽, it is true that \[ {\bf A} \left( k\, x_1 + (1-k)\, x_2 \right) = k\, {\bf A}\,{\bf x}_1 + \left( 1- k \right) {\bf A}\,{\bf x}_2 = k\, {\bf b} + \left( 1- k \right) {\bf b} = {\bf b} . \] Hence, every linear combination of two vectors is a solution of the linear system. Since there are infinitely many such vectors (one for each choice of k ∈ 𝔽, the proof is completed.

Example 2: Consider the simple 2 × 3 system \[ \begin{split} x - 2\,y + 3\,z &= -7, \\ 3\,x + 5\,y - 2\,z &= 23 . \end{split} \tag{4.1} \] The left-had side of the system defines a linear transformation T : 𝔽³ ⇾ 𝔽² via simple rule: \[ T(x, y, z) = \left( x - 2\,y + 3\,z , 3\,x + 5\,y - 2\,z \right) \in \mathbb{R}^2 . \] Therefore, solving this system means finding all inputs (x, y, z) from ℝ³ that are mapped by T into a "target" vector (−7, 23) ∈ ℝ². Any linear transformation can be rewritten in matrix/vector form A x = b, where \[ {\bf A} = \begin{bmatrix} 1 & -2 & \phantom{-}3 \\ 3 & \phantom{-}5 & -2 \end{bmatrix}, \qquad {\bf x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} , \quad {\bf b} = \begin{bmatrix} -7 \\ 23 \end{bmatrix} . \]

So we can also read the system as a problem about linear combinations of the columns of A. In this case, for instance, A x = b becomes a column problem \[ x \begin{pmatrix} 1 \\ 3 \end{pmatrix} + y \begin{pmatrix} -2 \\ 5 \end{pmatrix} + z \begin{pmatrix} 3 \\ -2 \end{pmatrix} = \begin{pmatrix} -7 \\ 23 \end{pmatrix} , \] or \[ x\,{\bf a}_1 + y\,{\bf a}_2 + z\,{\bf a}_2 = {\bf b} , \] where \[ {\bf a}_1 = \begin{pmatrix} 1 \\ 3 \end{pmatrix} , \quad {\bf a}_2 = \begin{pmatrix} -2 \\ 5 \end{pmatrix} , \quad {\bf a}_3 = \begin{pmatrix} 3 \\ -2 \end{pmatrix} \] denotes column j of matrix A. Seen in this light, solving the linear system means finding a linear combination of the vectors {a₁, a₂, a₃} that adds up to (−7, 23).

Each individual equation in the system (4.1) has the form r_i(A) • (x, y, z) = b_i, i = 1, 2, where r_i(A) denotes row i of A (and b_i is the ith entry in b ). In particular, \[ {\bf r}_1 ({\bf A}) = \left( 1, -2, 3 \right) , \quad {\bf r}_2 ({\bf A}) = \left( 3, 5, -2 \right) . \] We could write the 2-by-3 system (1.1) as \[ \begin{split} \left( 1, -2, 3 \right) \bullet (x, y, z) &= -7 , \\ \left( 3, 5, -2 \right) ) \bullet (x, y, z) &= 23 . \end{split} \] So from the row standpoint, the problem A x = b becomes: Find all vectors in ℝ³ (if any) that dot with each row of A to give the corresponding entry in b ∈ ℝ². This is the row problem. It is dual to the column problem.

End of Example 23

What we need is a description of the solution set that makes the contents apparent. To get such a description, we may need to reformulate the problem in such a way that the general solution remains the same but its description becomes lucid. The key idea is the following:

Two systems of linear equations with the same number of unknowns are said to be equivalent if they have the same solution set.

In order to develop an efficient algorithm for solving linear systems of equations, we need to determine operations on the systems that do not change the solution sets. First, we observe that the system formulated in row form admits interchanging the order of summation. This is also related to reindexing the variables. For example, the following two systems are equivalent because they define the same general solution:

\[ \begin{cases} x_1 + 2\, x_2 + 3\, x_3 &= 4 , \\ 5\, x_1 + 6\, x_2 + 7\, x_3 &= 8 , \end{cases} \qquad \mbox{and} \qquad \begin{cases} 2\, y_1 + y_2 + 3\, y_3 &= 4 , \\ 6\, y_1 + 5 \, y_2 + 7\, y_3 &= 8 . \end{cases} \]

The order in which internal summation is performed does not matter. However, when we reformulate this problems in matrix/vector form, this leads to two different coefficient matrices, but not the right-hand term:

\[ \begin{bmatrix} 1&2&3 \\ 5&6&7 \end{bmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{bmatrix} 4 \\ 8 \end{bmatrix} \qquad \mbox{and} \qquad \begin{bmatrix} 2&1&3 \\ 6&5&7 \end{bmatrix} \begin{pmatrix} y_1 = x_2 \\ y_2 = x_1 \\ y_3 = x_3 \end{pmatrix} = \begin{bmatrix} 4 \\ 8 \end{bmatrix} \]

This example shows that in the column problem, you can exchange columns in the coefficient matrix, but it requires a corresponding swap indexed unknown variables. Hence, the order in which columns are recorded in the coefficient matrix defines the corresponding order in unknown variables.

If you multiply any particular variable by a constant (making substitution x_i = c·y_i), this yields a corresponding division by c of the column in the coefficient matrix. For instance, using the previous example, we can divide the second column by 2 by making a corresponding scaling in the second variable:

\[ \begin{bmatrix} 1&2&3 \\ 5&6&7 \end{bmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{bmatrix} 4 \\ 8 \end{bmatrix} \qquad \mbox{and} \qquad \begin{bmatrix} 1&1&3 \\ 5&3&7 \end{bmatrix} \begin{pmatrix} y_1 = x_1 \\ y_2 = 2\,x_2 \\ y_3 = x_3 \end{pmatrix} = \begin{bmatrix} 4 \\ 8 \end{bmatrix} . \]

Now we turn our attention to the row-problem. Obviously, the order of presenting equations in the system does not matter and any two or more equations can be exchanged without altering a solution set. Surprisingly, there is one more familiar operation on equations in system (1) that is safe to apply: any equation can be replaced by a linear combination of all other equations. A basic operation over the equations of a linear equation system is the linear combination.

A linear combination of the equations of system (1) is, by definition, an equation of the form: \[ c_1 \left( a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n \right) + c_2 \left( a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n \right) \\ + \cdots + c_m \left( a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n \right) = c_1 b_1 + c_2 b_2 + \cdots + c_m b_m , \] where the c_i’s are real or complex numbers called the coefficients of the linear combination. A linear combination of linear equations is a linear equation.

Theorem 2: Every solution of a system of equations is also a solution of every linear combination of the equations of the system.

Unfortunately, I don't have enough time to type the solution to this theorem. You may try to do it using, for example, math induction.

Example 3: The system \[ \begin{split} x - 3\,y &= -1 , \\ 2\,x + y &= 12 , \end{split} \] has the solution (x, y) = (5, 2). Indeed, replacing 5 for x and 2 for y in each of the two equations, we have 5 − 3 × 2 = −1 and 2 × 5 + 2 = 12. Let us consider the generic linear combination of the equations of the system \[ c_1 \left( x - 3\,y \right) + c_2 \left( 2\,x + y \right) = -c_1 + 12\, c_2 \] and verify that this equation has the solution (5, 2); in fact, if (x, y) = (5, 2), we obtain \[ c_1 \left( 5 - 3 \cdot 2 \right) + c_2 \left( 2\cdot 5 + 2 \right) = -c_1 + 12\, c_2 . \]

End of Example 3

Let us name the m equations of a system with the symbols E₁ , E₂ , … , E_m. The notations simplify the indication of the linear combinations; the expression c₁E₁ + c₂E₂ means: multiply the equation E₁ by c₁, the equation E₂ by c₂, and add. For example, let us consider the system of two equations

\[ \begin{split} E_1 \ : \quad x - 3\,y &= -1 , \\ E_2 \ : \quad 2\,x + y &= 12 , \end{split} \]

The sum 3E₁ − 4E₂ is an abbreviation of the equation 3·(x − 3·y) − 4·(2·x + y) = −3 −4·12 = −51, that is, −5x −13y = −51, so we can simplify it further: 5x + 13y = 51.

Corollary 1: If the equation E_i in the system (1) is a linear combination of the remaining equations, the system is equivalent to that obtained by eliminating the equation E_i.

Example 4: The system \[ \begin{split} E_1 \ : \quad 3\,x - y &= 0 , \\ E_2 \ : \, -x + 2\,y &= 5 , \\ E_3 \ : \ \ \phantom{-x +} 5\, y &= 15 , \end{split} \] is equivalent to the system of the last two equations because E₃ = E₁ + 3·E₂. Therefore, the system admits the unique solution (1, 3).

End of Example 4

Note: Although systems with different number of equations can be equivalent, their corresponding augmented matrices are NOT equivalent because the equivalence relation between matrices is applicable only to those matrices that have the same dimensions.

However, a linear combination condition can be splitted into two simple operations on the equations not only for pedagogical purposes, but for computational reason: it is easier to code using simple operations. Then together with swapping equations, we obtain three basic operations on a system of linear equations that are safe to perform:

E1 (Interchange):		Interchange two equations (denoted by E_i ↕ E_j for swapping rows i and j).
E2 (Scaling):		Multiply both sides of an equation by a non-zero constant (abbreviation: E_i ← c·E_i).
E3 (Replacement):		Replace one equation by the sum of itself and a constant multiple of another equation (E_i ← E_i + c·E_j).

The three operations E1, E2, and E3 described above are called the elementary reduction operations or Gaussian operations on a system of linear equations that can be used for modifying a system to produce a new but equivalent system.

Theorem 3: If a linear system is changed to another by one of the elementary reduction operations, then the two systems have the same set of solutions.

E1: Let S be the solution set of the original system of equations. Using row formulation of the problem, we have \[ {\bf r}_1 \bullet {\bf x} = b_1 , \ldots , {\bf r}_m \bullet {\bf x} = b_m , \] where r₁, … , r_m are rows in the left-hand side of Eq.(1). Hence changing the order of the equations has no eﬀect on the elements in the set, and so the system obtained by interchanging two equations is equivalent to the original system.

E2: Take any linear system of m equations in n unknowns, and let S be the solution set of the system. Let r_i • x = b_i be an equation in the system, let c be a nonzero real number, and suppose that a new linear system is formed by replacing the equation r_i • x = b_i with the equation c·r_i • x = c·b_i (leaving the other equations unchanged). Let S₁ be the solution set of the modiﬁed system. Now if y ∈ S, then y satisﬁes all the equations in the original system including the equation r_i • x = b_i. Hence, y satisﬁes the equation c·r_i • x = c·b_i, and all the other equations, so y ∈ S₁. On the other hand, if y ∈ S₁, then y satisfies the equation c·r_i • x = c·b_i together with the other equations in the original system. Since c ≠ 0, c·r_i • x = c·b_i implies (1/c)·c·r_i • x = (1/c)·c·b_i and hence r_i • x = b_i. Thus, y satisﬁes all the equations in the original system, so y ∈ S. This means that S₁ ⊆ S, so S₁ = S, and the systems of equations are equivalent.

E3: Let S be the solution set of the original system of equations. Let r_i • x = b_i and r_j • x = b_j be two equations in the system, let d be a real number, and suppose a new linear system is formed by replacing the equation r_i • x = b_i with the equation d·r_j • x + r_i • x = d·b_j + b_i (leaving the rest of the equations unchanged). Let S₂ be the solution set of the modiﬁed system. Now if y ∈ S, then y satisﬁes all the equations in the original system, including the equations r_i • x = b_i and r_j • x = b_j. Thus, r_i • y = b_i and r_j • y = b_j, so d·r_j • y + r_i • y = d·b_j + b_i. Hence, y satisﬁes all the equations in the modiﬁed system. Thus, y ∈ S₂ and we have S ⊆ S₂.

On the other hand, if y ∈ S₂, then y satisﬁes the equation d·r_j • x + r_i • x = d·b_j + b_i together with the other equations in the original system. In particular, y satisﬁes the original system equations r_j • y = b_j. So \[ \left[ d\,{\bf r}_j \bullet {\bf y} + {\bf r}_i \bullet {\bf y} \right] = \left[ d\,b_j + b_i \right] \] and \[ {\bf r}_j \bullet {\bf y} = b_j , \] where brackets are used to indicate groupings, and so we must have \[ -d \,{\bf r}_j \bullet {\bf y} + \left[ d\,{\bf r}_j \bullet {\bf y} + {\bf r}_i \bullet {\bf y} \right] = - d\,b_j + \left[ d\,b_j + b_i \right] . \] This yields r_j • y = b_j. Hence, the vector y satisﬁes all the equations in the original system, so y ∈ S. Thus, S₂ ⊆ S; hence S₂ = S, and the systems of equations are equivalent.

Example 5: We consider the following system of equations \[ \begin{array}{c} \mbox{E}_1 \ : \\ \mbox{E}_2 \ : \end{array} \qquad \begin{cases} \phantom{3\,}x_1 + 2\, x_2 + 3\, x_3 &= 2, \\ 3\,x_1 - 2\,x_2 - 2\, x_3 &= 1 . \end{cases} \tag{5.1} \] Let us add the first equation to the second one; this yields an equivalent system of equations \[ \begin{array}{c} \mbox{E}_1 \ : \\ \mbox{E}_1 + \mbox{E}_2 \ : \end{array} \qquad \begin{split} x_1 + 2\, x_2 + 3\, x_3 &= 2, \\ 4\,x_1 + 0\,x_2 + 1\, x_3 &= 3 . \end{split} \] However, the latter contains only two independent variables instead of initial three variables. This means that we eliminate one of the variables and reduce the dimension of the equation: equation 3·x₁ + x₃ = 3 is 1 × 2 equation.

For further simplification, we multiply the second equation by (−3) and add to the first one. As a result, we get \[ \begin{array}{c} \mbox{E}_1 - 3\left( \mbox{E}_1 + \mbox{E}_2 \right) \ : \\ \mbox{E}_1 + \mbox{E}_2 \ : \end{array} \qquad \begin{split} -11\,x_1 + 2\, x_2 + 0\, x_3 &= -7, \\ 4\,x_1 + 0\,x_2 + 1\, x_3 &= 3 . \end{split} \tag{5.2} \] Although system (5.2) is still 2 × 3 system, each of the equations contains only two unknowns. This allows us to explicitly express its general solution \[ x_2 = \frac{1}{2} \left( 11\, x_1 - 7 \right) , \qquad x_3 = 3 - 4\, x_1 . \tag{5.3} \] In formula (5.3), variable x₁ is a free variable, which we denote by t: \[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{7}{2} \\ 3 \end{bmatrix} + t\, \begin{bmatrix} 1 \\ \frac{11}{2} \\ -4 \end{bmatrix} , \qquad t \in \mathbb{R} . \tag{5.4} \] We verify with Mathematica

A = {{1, 2, 3}, {3, -2, -2}};
b = {2, 1};
LinearSolve[A, b]

{3/4, 5/8, 0}

So Mathematica provides a single solution corresponding t = 3/4.

Then we swap the first two columns that leads to another coefficient matrix.

B = {{2, 1, 3}, {-2, 3, -2}};
LinearSolve[B, b]

{5/8, 3/4, 0}

Hence, both matrices, A and B, provide the same solution set if we swap variables x Since the null space of matrix A is known (many thanks to Mathematica), the general solution of the given system (5.1) is \[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \frac{3}{4} \\ \frac{5}{8} \\ 0 \end{bmatrix} + t \begin{bmatrix} 2 \\ 11 \\ -8 \end{bmatrix} , \qquad t \in \mathbb{R} . \]

NullSpace[A]

Now we exchange the first column with the third one.

A2 = {{3, 2, 1}, {-2, -2, -3}};
b = {2, 1};
LinearSolve[A2, b]

{3, -(7/2), 0}

In order to verify that the answer is correct, we need to find t such that \[ \begin{bmatrix} 0 \\ -\frac{7}{2} \\ 3 \end{bmatrix} = \begin{bmatrix} \frac{3}{4} \\ \frac{5}{8} \\ 0 \end{bmatrix} + t \begin{bmatrix} 2 \\ 11 \\ -8 \end{bmatrix} . \] From equations \[ \begin{split} 0 &= \frac{3}{4} + 2t , \\ -\frac{7}{2} &= \frac{5}{8} + 11\,t , \\ 3 &= -8\,t . \end{split} \] we find the unique solution t = −3/8.

End of Example 5

These operations (E1, E2, E3), known as elementary reduction operations, do not alter the set of solutions since the restrictions on the variables x₁, x₂, … , x_n given by the new equations imply the restrictions given by the old ones (that is, we can undo the manipulations made to retrieve the old system). Therefore, these three row operations are reversible.

Indeed, these elementary operations are invertible because they can be undone by other elementary operations of the same type. In particular, the multiplication operation c·E_j is reversed by (1/c)·E_j, the swap operation of equations i and j, Ei ↕ Ej, is undone by itself, and the addition operation E_i + c·E_j is undone by R_i − c·E_j. This reversibility of elementary operations ensures that, not only are solutions to the linear system not lost when equations are modified, but no new solutions are introduced either (since no solutions are lost when undoing the elementary operations). In fact, this is exactly why we require c ≠ 0 in the multiplication operation c·E_j.

Example 6: Two friends, Joe (Biden) and Donald (Trump), were taking the final exam in Linear Algebra course. So they were given the same assignment to solve a 2 × 2 system of linear equations. They record the given system as follows: \[ \begin{array}{c} \mbox{E}_1 \ : \\ \mbox{E}_2 \ : \end{array} \qquad \begin{split} x_1 - 3\, x_2 &= -7, \\ 2\,x_1 + x_2 &= \phantom{-}7 ; \end{split} \qquad {\bf A} = \begin{bmatrix} 1 & -3 \\ 2&\phantom{-}1 \end{bmatrix} \tag{6.J} \] and \[ \begin{split} -3 \,x + y &= -7, \\ x + 2\,y &= \phantom{-}7 ; \end{split} \qquad {\bf B} = \begin{bmatrix} -3&1 \\ \phantom{-}1 & 2 \end{bmatrix} . \tag{6.D} \] As you see, they used different notations: x₁ = y and x₂ = x. Other than that, their systems are identical. However, their coefficient matrices A and B have swapped columns. Let us see how they handled this problem.

Joe multiplied the first equation in (6.J) by (−2) and add to the second one: \[ \begin{split} -2\,x_1 + 6\, x_2 &= 14, \\ 2\,x_1 + x_2 &= \phantom{-}7 \end{split} \qquad \Longrightarrow \qquad \begin{split} x_1 - 3\, x_2 &= -7, \\ 0\,x_1 + 7\,x_2 &= 21 . \end{split} \tag{J.2} \] The last equation contains zero multiplied by x₁; in this case, we say that x₁ is eliminated and we immediately read off the value of another variable: x₂ = 3.

Having x₂ = 3 at hand, Joe substituted its value in the first equation \[ x_1 - 3\cdot 3 = -7 \qquad \Longrightarrow \qquad x_1 = 2 . \]

On the other hand, Donald multiplied the second equation by 3, which does not change its solution set \[ \begin{split} x - 3\, y &= -7, \\ 6\,x + 3\,y &= 21 . \end{split} \] If we add theses two equations, it will give a simpler equation \[ 7\, x + 0\,y = 14 \qquad \Longrightarrow \qquad x = 2 . \] So performing some elementary operations, we transfer the given system to such a system that allows us determine one of the solutions by simple inspection.

End of Example 6

Anton, Howard (2005), Elementary Linear Algebra (Applications Version) (9th ed.), Wiley International
Beezer, R., A First Course in Linear Algebra, 2015.
Beezer, R., A Second Course in Linear Algebra, 2013.

Introduction to Linear Algebra

Systems of Linear Equations

Matrix Algebra

Vector Spaces

Eigenvalues, Eigenvectors

Euclidean Spaces

Matrix Decompositions

Applications

Functions of Matrices

Miscellany

Preliminaries

Glossary

Reference

Elementary Reduction Operations