Sensitivity of Solutions

Previously, we discussed how to solve linear vector equation Ax = b using elimination procedure. One might expect that if the the entries of matrix A are slightly modified, then its solution will be closed to the original solution. Unfortunately, there exist linear systems Ax = b whose solutions are not stable under small perturbations of either b or A. Now we pay more attension to this issue.

Example 1: Consider the following two systems. The first one

\[ \begin{split} 577\,x + 880\,y &= 3, \\ 28\,x + 427\,y &= 9, \end{split} \]

has the unique solution, as Mathematica confirms

Solve[{577 x + 880 y ==3, 28 x + 427 y == 9}, {x,y}]

We will call the second system as the approximate system, and it is given as follows:

\[ \begin{split} 577\,x + 880\,y &= 3, \\ 28\,x + 427.1\,y &= 9, \end{split} \]

Solve[{577 x + 880 y ==3, 28 x + 427.1 y == 9}, {x,y}]

In matrix form, the augmented matrices for these two systems are given by

\[ \begin{bmatrix} 577&880&3 \\ 280&427&9 \end{bmatrix} \qquad\mbox{and} \qquad \begin{bmatrix} 577&880&3 \\ 280&427.1&9 \end{bmatrix} \]

It is apparant that these two systems have very close solution sets. ■

Example 2: We consider two systems of linear equations with almost identical entries:

\[ \begin{split} 577\,x + 880\,y &= 3, \\ 280\,x + 427\,y &= 9, \end{split} \]

and the approximate system

\[ \begin{split} 577\,x + 880\,y &= 3, \\ 280\,x + 427.1\,y &= 9, \end{split} \]

We use Mathematica to solve these systems:

LinearSolve[{{577, 880}, {280, 427}}, {3, 9}]

\( \left\{ \frac{2213}{7} , \frac{3451}{7} \right\} \)

N[%]

{316.143, -207.286}

Now for approximate system

N[LinearSolve[{{577, 880}, {280, 427.1}}, {3, 9}]]

{-180.891, 118.61}

So we see that solutions two these two systems are completely different. This is clearly interesting, and we need to know why the behavior of solutions are so unstable to small changes. In real life, there is almost always error in a linear system because its values are derived from measurements with inherent error and humans are sometimes are not accurate. We must know whether it is possible to avoid this sensitivity to error and also what causes it.

Since we used a two-dimensional system, it can be visualized easily in the plane. So let us solve for y in each equation.

Expand[NSolve[577 x + 880 y == 3, y]]

Expand[NSolve[280 x + 427 y == 9, y]]

Note that these two lines are very close to being parallel because they have nearly identical slopes. This may be the cause of our sensitivity to slight amounts of error since a small change in the slope of one line can move their intersection point a great distance from the original intersection point. ■

Example 3: We reconsider the previous example, which we rewrite using floating points to isolate one variable:

\[ \begin{split} 0.655682 \,x + y &= 0.00340909 \\ \left( 0.655738 + E \right) x + y &= 0.0210773 , \end{split} \]

where the error in the slope is E. The exact system is the one for E = 0, and they become parallel lines when E = 0.0292836 with no solution the system. When E = 0, the solution to the system is as given below:

Ae = {{0.655682, 1, 0.00340909}, {0.655738, 1, 0.0210773}}
RAe = RowReduce[Ae]
exactSol = RAe[[All, -1]]

{315.516, -206.875}

Note that if we use more decimal places and solve the equations

\[ \begin{split} 0.6556818181818181 \,x + y &= 0.003409090909090909 , \\ 0.6557377049180327\, x + y &= 0.02107728337236534 , \end{split} \]

we get more accurate answer:

\[ x = 316.143, \qquad y = -207.286 . \]

The exact solution's x coordinate is given by exactSol[[1]] and its y coordinate is given by exactSol[[2]]. For E ≠ 0, if we denote X and Y as solutions to the approximate system, then its distance from the exact solution is given by

\[ d = \sqrt{\left( X - exactSol[[1]] \right)^2 + \left( Y - exactSol[[2]] \right)^2} . \]

We ask Mathematica for help with these calculations:

As = {{0.655682, 1, 0.00340909}, {0.655738 +EE, 1, 0.0210773}}
approxSol = RowReduce[As] [[All, -1]]

{(1. (0.0176682 + 8.67362*10^-19 EE))/(0.000056 + 1. EE), ( 0.00340909 (-3.39813 + 1. EE))/(0.000056 + 1. EE)}

Next, we plot error function:

errorFun = Sqrt[(exactSol[[1]] - approxSol[[1]])^2 + (exactSol[[2]] - approxSol[[2]])^2]; Plot[errorFun, {EE, -1,1}, PlotRange -> {{-0.3, 0.3}, {340, 400}}, PlotStyle -> {{Red, Thickness[0.01]}}, AxesOrigin ->{-0.3,340}]

Plot of error within interval (-0.3, 0.3)

The plot of the distance between the exaxct solution and the approximate system solutions has a vertical asymptote at E = = 0.0292836, which is the value of E that makes the lines parallel, and so the system is without a solution. The closer E is to this value, the larger the error. As E moves away from 0.0292836, the error stabilizes at a value close to 278.

We are going to plot these solutions as a parametric plot in the xy-plane to see their behavior more clearly. ■

Example 4: Consider the system

\[ \begin{bmatrix} 10&7&8&7 \\ 7&5&6&5 \\ 8&6&10&9 \\ 7&5&9&10 \end{bmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{bmatrix} 32 \\ 23 \\ 33 \\ 31 \end{bmatrix} . \]

It is easy to verify that the given system has the solution x = [1, 1, 1, 1]. If we perturb slightly the right-hand side,obtaining the new system

\[ \begin{bmatrix} 10&7&8&7 \\ 7&5&6&5 \\ 8&6&10&9 \\ 7&5&9&10 \end{bmatrix} \begin{pmatrix} x_1 + \Delta x_1 \\ x_2 + \Delta x_2 \\ x_3 + \Delta x_3 \\ x_4 + \Delta x_4 \end{pmatrix} = \begin{bmatrix} 32.1 \\ 22.9 \\ 33.1 \\ 30.9 \end{bmatrix} , \]

the new solutions turns out to be

\[ {\bf x} = \left[ 9.2, -12.6, 4.3, -1.1 \right] . \]

In other words, a relative error of the order 1/200 in th edata (here, b) produces a relative error of the order 10/1 in the solution, which represents an amplification of the relative error of the order 2000.

Now, let us perturb the matrix slightly, obtaining the new system

\[ \begin{bmatrix} 10&7&8.1&7.2 \\ 7.08&5.04&6&5 \\ 8&5.98&9.98&9 \\ 6.99&4.99&9&9.98 \end{bmatrix} \begin{pmatrix} x_1 + \Delta x_1 \\ x_2 + \Delta x_2 \\ x_3 + \Delta x_3 \\ x_4 + \Delta x_4 \end{pmatrix} = \begin{bmatrix} 32 \\ 23 \\ 33 \\ 31 \end{bmatrix} . \]

This time, the solution is

\[ {\bf x} = \left[ -81, 137, -34, 22 \right] . \]

Again, a small change in the data alters the result ratherdrastically. ■

The problem is that the matrix of the system is badly conditioned, a concept that we will now explain.

Determinant value and ill-conditioning

As some previous examples show, there exist some pathological problems of finding solutions to a linear equation A x = b that can flank the robast numerical solvers dedicated for such problems. To analyze the sensitivity of solutions to linear equations A x = b we have to employ a matrix norm ‖ · ‖ (see section), but not the determinant. It turns out that the determinant does not reflect whether matrix elements are small or large.

Let us consider a linear equation with the matrix

\[ {\bf A} = \begin{bmatrix} 3\times 10^5 & 2\times 10^5 \\ 4\times 10^5 & 5\times 10^5 \end{bmatrix} . \]

Its determinant is det(A) = 7×10¹⁰ and its Euclidean norm ≈ 728538. So det(A) ≫ ‖ A ‖.

Now suppose we divide the 2×2 system for our example by 10¹⁰ without changing anything in the statement of the problem. In other words, multiplying or dividing the matrix A should not lead to fundamental change. This yelds the matrix

\[ {\bf B} = \begin{bmatrix} 3\times 10^{-5} & 2\times 10^{-5} \\ 4\times 10^{-5} & 5\times 10^{-5} \end{bmatrix} . \]

We find det(B) = 7×10^-10, so det(A) ≪ ‖ B ‖ ≈ 0.0000728538.

Using matrix pertubation theory and carefully stady the effect a small change of the given matrix on the solution vector x + Δx looks like when when you start with A + ΔA. Following Trefethen & Bau, when we change A into A + δA, its inverse A⁻¹ changes to A⁻¹ + δ(A⁻¹):

\[ \delta \left({\bf A}^{-1} \right) = - {\bf A}^{-1} \cdot \delta{\bf A} \cdot {\bf A}^{-1}. \]

Then we get

\begin{equation} \label{EqSent.1} \frac{\| \Delta {\bf x} \|}{\| {\bf x} \|} \le \| {\bf A} \| \, \| {\bf A}^{-1} \| \, \frac{\| \Delta {\bf A} \|}{\| {\bf A} \|} . \end{equation}

This relates an error bound on ‖Δx‖/ ‖x‖ with an error bound on ‖ΔA‖/ ‖A‖. The coefficient in front of ‖ΔA‖/ ‖A‖ tells us whether or not a small pertubation will tend to get magnified. Equation \eqref{EqSent.1} naturally leads us to introducing the condition number

\begin{equation} \label{EqSent.2} \kappa ({\bf A}) = \| {\bf A} \| \, \| {\bf A}^{-1} \| . \end{equation}

When κ(A) is of order unity, we are dealing with a well-conditioned problem, i.e., a small pertubation is not amplified. Conversely, when κ(A) is large, a pertubation is typically amplified, so our problem is ill-conditioned. Crucially, this condition number involves both the matrix A and its inverse A⁻¹. Hence, the condition number has nothing to do with the determinant. As a result, an overall scaling has no effect. For example, the condions numbers of the previously considered 2×2 matrices are

\[ \kappa ({\bf A}) = 7.5824, \qquad \kappa ({\bf B}) =7.5824 \]

A = {{3*10^5 , 2*10^5}, {4*10^5, 5*10^5}}
N[Norm[A] *Norm[Inverse[A]]]

7.582401374401513

These numbers are not vwry large, so the corresponding system of linear equations are well-conditions.

Example 5: We reconsider the system from Example 4:

\[ \begin{bmatrix} 10&7&8&7 \\ 7&5&6&5 \\ 8&6&10&9 \\ 7&5&9&10 \end{bmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{bmatrix} 32 \\ 23 \\ 33 \\ 31 \end{bmatrix} . \]

Using Mathematica, we find the corresponing condition number:

A = {{10,7,8,7}, {7,5,6,5},{8,6,10,9},{7,5,9,10}};
kappa = N[Norm[A]*Norm[Inverse[A]]]

2984.09

Since this number is around 3,000, the corresponding problem is ill-conditioned. Now we check with another norms:

N[Norm[A, 1]*Norm[Inverse[A], 1]]

4488

N[Norm[A, Infinity]*Norm[Inverse[A], Infinity]]

4488

N[Norm[A, "Frobenius"]*Norm[Inverse[A], "Frobenius"]]

3009.58

All of them indicate that the problem is ill-conditioned. ■

† \( \varnothing \)

Higham, Nicholas, Gaussian Elimination, Manchester Institute for Mathematical Sciences, School of Mathematics, The University of Manchester, 2011.
Trefethen, L.N., Bau, D. III, Numerical Linear Algebra, Society for Inductrial and Applied Mathematics, Pennsylvania, 1997.

Introduction to Linear Algebra

Systems of Linear Equations

Matrix Algebra

Vector Spaces

Functions of Matrices

Euclidean Vector Spaces

Numerical Methods

Applications

Miscellany

Preliminaries

Glossary

Reference

Sensitivity of Solutions