Linear Algebra

This page supports the main stream of the web site by providing and reviewing some basic information regarding Linear Algebra. We demonstrate the capabilities of Sage for this topic.

Elementary Row Operations

Our objective is to find an efficient method for solving systems of linear equations. There are three operations that we can perform on the equations of a linear system without altering the set of solutions:
  • multiply both sides of an equation by a non-zero constant;
  • interchange two equations;
  • add a multiple of one equation to another.
These operations, usually called elementary row operations, do not alter the set of solutions of the linear system \( {\bf A}\,{\bf x} = {\bf b} \) since the restrictions on the variables \( x_1 , \, x_2 , \, \ldots , \,x_n \) given by the new equations imply the restrictions given by the old ones. At the same time, these operations really only involve the coefficients of the variables and the right-hand sides of the equations, b.

Since the coefficient matrix A together with the given constant vector b contains all the information we need to use, we unite them into one matrix, called the augmented matrix:

\[ {\bf M} = \left( {\bf A}\,\vert \,{\bf b} \right) \qquad \mbox{or} \qquad {\bf M} = \left[ {\bf A}\,\vert \,{\bf b} \right] . \]
Now rather than manipulating the equations, we can instead manipulate the rows of this augmented matrix. These observations form the motivation behind a method to solve systems of linear equations, known as Gaussian elimination, named after Carl Friedrich Gauss (1777--1855) of Germany. Various modifications have been made to the method of elimination, and we present one of these, known as the Gauss--Jordan method. It was the German geodesist Wilhelm Jordan (1842--1899), and not the French mathematician Camille Jordan (1838--1922), who introduced the Gauss--Jordan method of solving systems of linear equations. Camille Jordan is credited with the Jordan normal form, a well-known linear algebra topic.

Elimination produces an upper triangular system, called the row echelon form for Gaussian elimination and the reduced row echelon form for the Gauss--Jordan algorithm. Gaussian elimination introduces zeroes below the pivots, while the Gauss--Jordan algorithm contains an additional phase in which it introduces zeroes above the pivots. Since this additional phase makes Gauss--Jordan elimination require roughly 50% more operations than Gaussian elimination, most computer algorithms are based on Gaussian elimination.

Suppose M is the augmented matrix of a linear system. We can solve the linear system by performing elementary row operations on M. In Sage these row operations are implemented with the following matrix methods.

sage: M.swap_rows()             # interchange two rows
sage: M.rescale_row()           # scale a row by using a scale factor
sage: M.add_multiple_of_row()   # add a multiple of one row to another row, replacing the row

Remember, row numbering starts at 0. Pay careful attention to the changes made to the matrix by each of the following commands.

Example. Consider the system of linear equations

\[ \begin{cases} 2\,x + 3\,y + z &= -1, \\ 4\,x +7\,y + 5\,z &= 5, \\ x -2\,y +2\,z &= 11 . \end{cases} \]
First, we form the augmented matrix
\[ {\bf M} = \left[ \begin{array}{ccc|c} 2&3&1&-1 \\ 4&7&5&5 \\ 1&-2&2&11 \end{array} \right] . \]
The idea of the elimination procedure is to reduce the augmented matrix to an equivalent "upper triangular" matrix. So the first question is how to determine the pivots. Depending on this choice, we get the corresponding row echelon form. Therefore, the Gaussian algorithm may lead to different row echelon forms; hence, the row echelon form is not unique.

From a computational point of view, the pivot should be the largest element in its column. If we look at the first column, the largest entry is in the second row. To move it into the first row and choose "4" as the pivot, we would need to swap the first two rows. It is convenient, but not necessary, to have a 1 in this position, which we call the pivot.

Another option is to choose "2" in the first row as the pivot because it is not zero. We want to start with a 1 in the first position of the first column. To achieve this, we can either divide each entry in the first row by 2 or switch the first and the third rows, because the last row already has a "1" in the first position. If we stick with the former option, we begin by scaling the first row by 1/2.
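
To follow along in Sage, we first enter the augmented matrix over the rationals, so that the arithmetic stays exact:

sage: M=matrix(QQ,[[2,3,1,-1],[4,7,5,5],[1,-2,2,11]]); M
[ 2  3  1 -1]
[ 4  7  5  5]
[ 1 -2  2 11]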

sage: M.rescale_row(0,1/2); M 

The first argument to rescale_row() is the index of the row to be scaled and the second argument is the scale factor. We keep the exact fraction 1/2 (rather than 0.5) so that the computation stays within the rationals.

Now that there is a 1 in the first position of the first column we continue by eliminating the entries below it.

sage: M.add_multiple_of_row(1,0,-4); M
\[ {\bf M} \,\sim\, \left[ \begin{array}{ccc|c} 1&3/2&1/2&-1/2 \\ 4&7&5&5 \\ 1&-2&2&11 \end{array} \right] \,\sim\, \left[ \begin{array}{ccc|c} 1&3/2&1/2&-1/2 \\ 0&1&3&7 \\ 1&-2&2&11 \end{array} \right] . \]

The first argument is the index of the row to be replaced, the second argument is the row to form a multiple of, and the final argument is the scale factor. Thus, M.add_multiple_of_row(n,m,A) would replace row n with (row n)+A*(row m).
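
The same operation clears the remaining entry in the first column (this step produces the second matrix below):

sage: M.add_multiple_of_row(2,0,-1); M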

\[ {\bf M} \,\sim\, \left[ \begin{array}{ccc|c} 1&3/2&1/2&-1/2 \\ 4&7&5&5 \\ 1&-2&2&11 \end{array} \right] \,\sim\, \left[ \begin{array}{ccc|c} 1&3/2&1/2&-1/2 \\ 0&1&3&7 \\ 0&-7/2&3/2&23/2 \end{array} \right] \]
or without scaling
\[ {\bf M} \,\sim\, \left[ \begin{array}{ccc|c} 2&3&1&-1 \\ 0&1&3&7 \\ 0&-7&3&23 \end{array} \right] \,\sim\, \left[ \begin{array}{ccc|c} 1&3/2&1/2&-1/2 \\ 0&1&3&7 \\ 0&-7/2&3/2&23/2 \end{array} \right] . \]

Since all entries below the first pivot are now zero, we can move on to the second column. Our next step would normally be to get a 1 in the second position of the second column by scaling or swapping rows, but the second row already has a 1 there, so it can serve as the pivot row. (When a swap is needed, swap_rows() takes the indices of the two rows to interchange; just remember that row 1 is the second row and row 2 is the third row.)

Now we want to eliminate the -7/2 below this 1. This is done by multiplying the second row by 7/2 and adding it to the third row.

sage: M.add_multiple_of_row(2,1,7/2); M
\[ {\bf M} \,\sim\, \left[ \begin{array}{ccc|c} 1&3/2&1/2&-1/2 \\ 0&1&3&7 \\ 0&0&12&36 \end{array} \right] . \]
To get a 1 as the last entry in the third column we can scale the third row by 1/12.
sage: M.rescale_row(2,1/12); M
At this point, the matrix is in echelon form (well, having the 1's down the diagonal of the matrix is not required, but it does make our work easier). All that remains to find the solution is to put the matrix in reduced echelon form, which requires that we replace all entries in the first three columns that are not on the main diagonal (where the 1's are) with zeros. We will start with the third column and work our way up and to the left. Remember that this is an augmented matrix and we are going to ignore the right-most column; it just "goes along for the ride."
sage: M.add_multiple_of_row(1,2,-3); M
sage: M.add_multiple_of_row(0,2,-1/2); M
sage: M.add_multiple_of_row(0,1,-3/2); M

Therefore, our reduction is complete. We see that the solution to our linear system is
\( x_1 =1, \ x_2 = -2, \ \mbox{and} \ x_3 = 3. \)

There is an easy way to check your work, or to carry out these steps in the future. First, let's reload the matrix M.

sage: M=matrix(QQ,[[2,3,1,-1],[4,7,5,5],[1,-2,2,11]]); M
The function echelon_form() will display the echelon form of a matrix without changing the matrix (over the rationals this is in fact the reduced row echelon form).
sage: M.echelon_form()
Notice that the matrix M is unchanged.
sage: M
To replace M with its reduced echelon form, use the echelonize() function.
sage: M.echelonize(); M 
Sage has the matrix method .pivots() to quickly and easily identify the pivot columns of the reduced row-echelon form of a matrix. Notice that we do not have to row-reduce the matrix first; we just ask which columns of a matrix A would be the pivot columns of the matrix B that is row-equivalent to A and in reduced row-echelon form. By definition, the indices of the pivot columns for an augmented matrix of a system of equations are the indices of the dependent variables, and the remaining columns correspond to the free variables. But be careful: Sage numbers columns starting from zero, while mathematicians typically number variables starting from one.
sage: coeff = matrix(QQ, [[ 1,  4, 0, -1,  0,   7, -9],
... [ 2, 8,-1, 3, 9, -13, 7],
... [ 0, 0, 2, -3, -4, 12, -8],
... [-1, -4, 2, 4, 8, -31, 37]])
sage: const = vector(QQ, [3, 9, 1, 4])
sage: aug = coeff.augment(const)
sage: dependent = aug.pivots()
sage: dependent
(0, 2, 3)
So, incrementing each column index by 1 gives us the set D of indices of the dependent variables. To get the free variables, we can use the following code. Study it and then read the explanation following.
sage: free = [index for index in range(7) if not index in dependent]
sage: free
[1, 4, 5, 6]
This is a Python programming construction known as a "list comprehension," but in this setting I prefer to call it "set builder notation." Let's dissect the command in pieces. The brackets ([,]) create a new list. The items in the list will be values of the variable index. range(7) generates the integers starting at 0 and stopping just before 7. (While perhaps a bit odd, this works very well when we consistently start counting at zero.) So range(7) yields 0,1,2,3,4,5,6. Think of these as candidate values for index, which are generated by for index in range(7). Then we test each candidate, and keep it in the new list if it is not in the list dependent. This is entirely analogous to the following mathematics:
\[ F= \{ f \, \vert \, 1 \le f \le 7, \ f \not\in D \} , \]
where F is free, f is index, and D is dependent, and we make the 0/1 counting adjustments. This ability to construct sets in Sage with notation so closely mirroring the mathematics is a powerful feature worth mastering. We will use it repeatedly. It was a good exercise to use a list comprehension to form the list of columns that are not pivot columns. However, Sage has us covered.
sage: free_and_easy = coeff.nonpivots()
sage: free_and_easy
(1, 4, 5, 6)

Example. We illustrate how to find an inverse matrix using row operations. Consider the matrix

\[ {\bf A} = \left[ \begin{array}{ccc} 1& 2 & 3 \\ 2&5&3 \\ 1&0 & 8 \end{array} \right] . \]
To accomplish this goal, we adjoin a block containing the identity matrix and apply row operations to this augmented matrix \( \left[ {\bf A} \,| \,{\bf I} \right] \) until the left side is reduced to I. These operations will convert the right side to \( {\bf A}^{-1} ,\) so the final matrix will have the form \( \left[ {\bf I} \,| \,{\bf A}^{-1} \right] .\)
\begin{align*} \left[ {\bf A} \,| \,{\bf I} \right] &= \left[ \begin{array}{ccc|ccc} 1& 2 & 3 & 1&0&0 \\ 2&5&3 & 0&1&0 \\ 1&0 & 8 & 0&0&1 \end{array} \right] \\ &\sim \left[ \begin{array}{ccc|ccc} 1& 2 & 3 & 1&0&0 \\ 0&1&-3 & -2&1&0 \\ 0&-2 &5 & -1&0&1 \end{array} \right] \qquad \begin{array}{l} \mbox{add $-2$ times the first row to the second} \\ \mbox{and $-1$ times the first row to the third} \end{array} \\ &\sim \left[ \begin{array}{ccc|ccc} 1& 2 & 3 & 1&0&0 \\ 0&1&-3 & -2&1&0 \\ 0&0 &-1 & -5&2&1 \end{array} \right] \qquad \begin{array}{l} \mbox{add 2 times the second row to the third} \end{array} \\ &\sim \left[ \begin{array}{ccc|ccc} 1& 2 & 3 & 1&0&0 \\ 0&1&-3 & -2&1&0 \\ 0&0 &1 & 5&-2&-1 \end{array} \right] \qquad \begin{array}{l} \mbox{multiply the third row by $-1$} \end{array} \\ &\sim \left[ \begin{array}{ccc|ccc} 1& 2 & 0 & -14&6&3 \\ 0&1&0 & 13&-5&-3 \\ 0&0&1 & 5&-2&-1 \end{array} \right] \qquad \begin{array}{l} \mbox{add 3 times the third row to the second} \\ \mbox{and $-3$ times the third row to the first} \end{array} \\ &\sim \left[ \begin{array}{ccc|ccc} 1& 0 & 0 & -40&16&9 \\ 0&1&0 & 13&-5&-3 \\ 0&0 &1 & 5&-2&-1 \end{array} \right] \qquad \begin{array}{l} \mbox{add $-2$ times the second row to the first} \end{array} . \end{align*}
Thus,
\[ {\bf A}^{-1} = \left[ \begin{array}{ccc} -40& 16 & 9 \\ 13&-5&-3 \\ 5&-2 &-1 \end{array} \right] . \]
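
The same computation can be reproduced in Sage, for instance by augmenting A with the identity matrix and row reducing, or simply by calling the built-in inverse:

sage: A = matrix(QQ, [[1,2,3],[2,5,3],[1,0,8]])
sage: A.augment(identity_matrix(QQ,3)).rref()
[  1   0   0 -40  16   9]
[  0   1   0  13  -5  -3]
[  0   0   1   5  -2  -1]
sage: A.inverse()
[-40  16   9]
[ 13  -5  -3]
[  5  -2  -1]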

Row Echelon Forms

In the previous section, we demonstrated how the Gaussian elimination procedure is applied to a matrix, row by row, in order to obtain a row echelon form. Specifically, a matrix is in row echelon form if

  • all nonzero rows (rows with at least one nonzero element) are above any rows of all zeroes (all zero rows, if any, belong at the bottom of the matrix), and
  • the leading coefficient (the first nonzero number from the left, also called the pivot) of a nonzero row is always strictly to the right of the leading coefficient of the row above it (some texts add the condition that the leading coefficient must be 1, but this is not necessary).
These two conditions imply that all entries in a column below a leading coefficient are zeroes.

Unlike the row echelon form, the reduced row echelon form of a matrix is unique and does not depend on the algorithm used to compute it. It is obtained by applying the Gauss-Jordan elimination procedure. A matrix is in reduced row echelon form (also called row canonical form) if it satisfies the following conditions:

  • It is in row echelon form.
  • Every leading coefficient is 1 and is the only nonzero entry in its column.

The Gaussian elimination algorithm can be generalized to block matrices. Suppose that an \( (m+n) \times (m+n) \) matrix M is partitioned into 2-by-2 blocks \( {\bf M} = \begin{bmatrix} {\bf A}_{m\times m} & {\bf B}_{m\times n} \\ {\bf C}_{n\times m} & {\bf D}_{n\times n} \end{bmatrix} .\) In ordinary Gaussian elimination, A is just a nonzero entry in the upper left corner (a 1-by-1 matrix) and C is the column vector below it (an \( (m+n-1) \times 1 \) matrix). To eliminate C, we subtract \( {\bf C}\,{\bf A}^{-1} \) times the first block row from the second block row.

Now we illustrate the Gaussian elimination algorithm when an \( (m+n) \times (m+n) \) matrix M is partitioned into blocks \( {\bf M} = \begin{bmatrix} {\bf A} & {\bf B} \\ {\bf C} & {\bf D} \end{bmatrix} ,\) where A is an \( m \times m \) nonsingular matrix, B is an \( m \times n \) matrix, C is an \( n \times m \) matrix, and D is an \( n \times n \) square matrix. Multiplying M by a "triangular" block matrix, we obtain

\[ \left[ \begin{array}{c|c} {\bf I}_{m\times m} & {\bf 0}_{m\times n} \\ \hline -{\bf C}_{n\times m} {\bf A}^{-1}_{m\times m} & {\bf I}_{n\times n} \end{array} \right] \left[ \begin{array}{c|c} {\bf A}_{m\times m} & {\bf B}_{m\times n} \\ \hline {\bf C}_{n\times m} & {\bf D}_{n\times n} \end{array} \right] = \left[ \begin{array}{c|c} {\bf A}_{m\times m} & {\bf B}_{m\times n} \\ \hline {\bf 0}_{n\times m} & {\bf D}_{n\times n} - {\bf C}\,{\bf A}^{-1} {\bf B} \end{array} \right] . \]
The \( n \times n \) matrix in the lower right corner, \( {\bf D}_{n\times n} - {\bf C}\,{\bf A}^{-1} {\bf B} ,\) is called the Schur complement of A in M.
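
Here is a small Sage sketch of this block elimination; the particular 2-by-2 blocks below are chosen only for illustration:

sage: A = matrix(QQ, [[2,1],[1,3]]); B = matrix(QQ, [[1,0],[0,2]])
sage: C = matrix(QQ, [[1,1],[0,1]]); D = matrix(QQ, [[4,1],[1,4]])
sage: M = block_matrix([[A, B], [C, D]])
sage: E = block_matrix([[identity_matrix(QQ,2), zero_matrix(QQ,2,2)], [-C*A.inverse(), identity_matrix(QQ,2)]])
sage: E*M                         # the lower left block is now zero
sage: D - C*A.inverse()*B         # the Schur complement of A in M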

Recall the difference between row echelon form (also called Gauss form) and reduced row echelon form (Gauss--Jordan). The following matrices are in row echelon form but not reduced row echelon form:

\[ \left[ \begin{array}{cccc} 1 & * & * & * \\ 0 & 1 & * & * \\ 0&0&1&* \\ 0&0&0&1 \end{array} \right] , \qquad \left[ \begin{array}{cccc} 1 & * & * & * \\ 0 & 1 & * & * \\ 0&0&1&* \\ 0&0&0&0 \end{array} \right] , \qquad \left[ \begin{array}{cccc} 1 & * & * & * \\ 0 & 1 & * & * \\ 0&0&0&0 \\ 0&0&0&0 \end{array} \right] , \qquad \left[ \begin{array}{cccc} 1 & * & * & * \\ 0 & 0 & 1 & * \\ 0&0&0&1 \\ 0&0&0&0 \end{array} \right] . \]
All matrices of the following types are in reduced row echelon form:
\[ \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0&0&1&0 \\ 0&0&0&1 \end{array} \right] , \qquad \left[ \begin{array}{cccc} 1 & 0 & 0 & * \\ 0 & 1 & 0 & * \\ 0&0&1&* \\ 0&0&0&0 \end{array} \right] , \qquad \left[ \begin{array}{cccc} 1 & 0 & * & * \\ 0 & 1 & * & * \\ 0&0&0&0 \\ 0&0&0&0 \end{array} \right] , \qquad \left[ \begin{array}{cccc} 0 & 1 & * & 0 \\ 0 & 0 & 0 & 1 \\ 0&0&0&0 \\ 0&0&0&0 \end{array} \right] . \]

Here is an example of using the row reduction procedure:

sage: A=matrix(QQ,3,3,[2,-4,1,4,-8,7,-2,4,-3])
sage: b=matrix(QQ,3,1,[4,2,5])
sage: B=A.augment(b)   # construct the augmented matrix
sage: B.add_multiple_of_row(0,0,-1/2); B
[ 1 -2 1/2 2]
[ 4 -8 7 2]
[-2 4 -3 5]
sage: B.add_multiple_of_row(1,0,-4); B
[ 1 -2 1/2 2]
[ 0 0 5 -6]
[-2 4 -3 5]
sage: B.add_multiple_of_row(2,0,2); B
[ 1 -2 1/2 2]
[ 0 0 5 -6]
[ 0 0 -2 9]
sage: B.add_multiple_of_row(1,1,-4/5); B
[ 1 -2 1/2 2 ]
[ 0 0 1 -6/5]
[ 0 0 -2 9 ]
sage: B.add_multiple_of_row(2,1,2); B
[ 1 -2 1/2 2]
[ 0 0 1 -6/5]
[ 0 0 0 33/5]

Therefore, we have obtained the Gauss (row echelon) form of the augmented matrix. To obtain the Gauss-Jordan (reduced row echelon) form, we would repeat a similar procedure working upward from the pivots. This can be achieved with a standard Sage command (recall that over the rationals echelon_form() returns the reduced form):

sage: B.echelon_form()
[ 1 -2 0 0]
[ 0 0 1 0]
[ 0 0 0 1]

We demonstrate the Gauss-Jordan procedure in the following example.

Example. Solve the linear vector equation \( {\bf A}\,{\bf x} = {\bf b} , \) where

\[ {\bf A} = \left[ \begin{array}{ccc} -4 &1&1 \\ -1 &4&2 \\ 2&2&-3 \end{array} \right] , \qquad {\bf b} = \left[ \begin{array}{c} 6 \\ -1 \\ -20 \end{array} \right] . \]
sage: A=matrix(QQ,3,3,[[-4,1,1],[-1,4,2],[2,2,-3]]);A
[ -4 1 1]
[ -1 4 2]
[ 2 2 -3]
sage: A.swap_rows(0,1);A
[ -1 4 2]
[ -4 1 1]
[ 2 2 -3]
sage: A.swap_columns(0,1);A
[ 4 -1  2]
[ 1 -4  1]
[ 2  2 -3]
Note that swap_rows() and swap_columns() modify A in place, and swapping columns also permutes the unknowns. Since these swaps were only for illustration, we restore the original matrix before solving.
sage: A=matrix(QQ,3,3,[[-4,1,1],[-1,4,2],[2,2,-3]])
sage: b=vector([6,-1,-20])
sage: B=A.augment(b);B
[ -4 1 1 6]
[ -1 4 2 -1]
[ 2 2 -3 -20]
sage: B.rref()
[ 1 0 0 -61/55]
[ 0 1 0 -144/55]
[ 0 0 1 46/11]

So the last column gives the required vector x.
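
The same solution can also be obtained in one step, without forming the augmented matrix; for instance:

sage: A.solve_right(b)
(-61/55, -144/55, 46/11)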

Another way of expressing this: a system in n variables is consistent if and only if column n + 1 is not a pivot column of the augmented matrix B. Sage has the matrix method .pivots() to easily identify the pivot columns of a matrix. Let's consider an example.

sage: coeff = matrix(QQ, [[ 2, 1,  7, -7],
... [-3, 4, -5, -6],
... [ 1, 1, 4, -5]])
sage: const = vector(QQ, [2, 3, 2])
sage: aug = coeff.augment(const)
sage: aug.rref()
[ 1  0  3 -2  0]
[ 0  1  1 -3  0]
[ 0  0  0  0  1]
sage: aug.pivots()
(0, 1, 4)

We can look at the reduced row-echelon form of the augmented matrix and see a leading one in the final column, so we know the system is inconsistent. However, we could just as easily not form the reduced row-echelon form and just look at the list of pivot columns computed by aug.pivots(). Since aug has 5 columns, the final column is numbered 4, which is present in the list of pivot columns, as expected.

Example. We demonstrate Gaussian and Gauss--Jordan elimination by solving the same system of algebraic equations with two distinct right-hand side vectors:

\[ \begin{split} x_1 + 2\,x_2 + 3\,x_3 &= 4, \\ 2\,x_1 + 5\,x_2 + 3\,x_3 &= 5, \\ x_1 + 8\,x_3 &= 9 ; \end{split} \qquad\mbox{and} \qquad \begin{split} x_1 + 2\,x_2 + 3\,x_3 &= 1, \\ 2\,x_1 + 5\,x_2 + 3\,x_3 &= 6, \\ x_1 + 8\,x_3 &= -6 . \end{split} \]
The row echelon form of the former system of equations is
\[ \left[ \begin{array}{ccc|c} 1&2&3&4 \\ 0&1&-3&-3 \\ 0&0&-1&-1 \end{array} \right] . \]
Therefore its solution becomes \( x_1 = 1, \ x_2 =0, \ x_3 =1 . \) The augmented matrix for the latter system can be reduced to
\[ \left[ \begin{array}{ccc|c} 1&0&0&2 \\ 0&1&0&1 \\ 0&0&1&-1 \end{array} \right] , \]
which leads to the solution \( x_1 = 2, \ x_2 =1, \ x_3 =-1 . \)
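
Both right-hand sides can be handled in a single computation by augmenting A with the two constant vectors; a short sketch in Sage:

sage: A = matrix(QQ, [[1,2,3],[2,5,3],[1,0,8]])
sage: b1 = vector(QQ, [4,5,9]); b2 = vector(QQ, [1,6,-6])
sage: A.augment(b1).augment(b2).rref()
[ 1  0  0  1  2]
[ 0  1  0  0  1]
[ 0  0  1  1 -1]

The last two columns reproduce the two solutions found above.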

Example. What conditions must \( b_1 , \) \( b_2 , \) and \( b_3 \) satisfy in order for the system

\[ \begin{split} x_1 + x_2 + 2\,x_3 &= b_1 , \\ x_1 + x_3 &= b_2 , \\ 2\,x_1 + x_2 + 3\,x_3 &= b_3 \end{split} \]
to be consistent?

The augmented matrix corresponding to the symmetric coefficient matrix is

\[ \left[ \begin{array}{ccc|c} 1&1&2& b_1 \\ 1&0&1& b_2 \\ 2&1&3& b_3 \end{array} \right] , \]
which can be reduced to row echelon form as follows
\begin{align*} \left[ {\bf A} \,| \,{\bf b} \right] &\sim \left[ \begin{array}{ccc|c} 1&1&2& b_1 \\ 0&-1&-1& b_2 -b_1 \\ 0&-1&-1& b_3 - 2\,b_1 \end{array} \right] \qquad \begin{array}{l} -1 \mbox{ times the first row was added to the} \\ \mbox{second and $-2$ times the first row was added to the third} \end{array} \\ &\sim \left[ \begin{array}{ccc|c} 1&1&2& b_1 \\ 0&1&1& b_1 - b_2 \\ 0&-1&-1& b_3 -2\, b_1 \end{array} \right] \qquad \begin{array}{l} \mbox{the second row was } \\ \mbox{multiplied by $-1$} \end{array} \\ &\sim \left[ \begin{array}{ccc|c} 1&1&2& b_1 \\ 0&1&1& b_1 - b_2 \\ 0&0&0& b_3 - b_2 - b_1 \end{array} \right] \qquad \begin{array}{l} \mbox{the second row was added} \\ \mbox{to the third} \end{array} . \end{align*}
It is now evident from the third row in the matrix that the system has a solution if and only if \( b_1 , \) \( b_2 , \) and \( b_3 \) satisfy the condition
\[ b_3 - b_2 -b_1 =0 \qquad\mbox{or} \qquad b_3 = b_1 + b_2 . \]
To express this condition another way, \( {\bf A}\, {\bf x} = {\bf b} \) is consistent if and only if b is a vector of the form
\[ {\bf b} = \begin{bmatrix} b_1 \\ b_2 \\ b_1 + b_2 \end{bmatrix} = b_1 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + b_2 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} , \]
where \( b_1 \) and \( b_2 \) are arbitrary. The vectors \( \langle 1, 0 , 1 \rangle^T \) and \( \langle 0 , 1, 1 \rangle^T \) constitute a basis of the column space of the coefficient matrix. ■
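
This basis can be confirmed directly in Sage:

sage: A = matrix(QQ, [[1,1,2],[1,0,1],[2,1,3]])
sage: A.column_space()
Vector space of degree 3 and dimension 2 over Rational Field
Basis matrix:
[1 0 1]
[0 1 1]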