GAUSSIAN ELIMINATION FOR DENSE MATRICES: NUMERICAL CONSIDERATIONS
We review the impact of inexact computation (caused by data uncertainty and computer rounding error) on Gaussian elimination. We need to ensure that the algorithm produces answers that are consistent with the given data; that is, that the algorithm is stable. We want to know whether small changes to the original data lead to large changes in the solution; that is, whether the problem is ill-conditioned. We show how to control stability and measure conditioning. We introduce the concepts of pivoting, scaling, iterative refinement, and condition number estimation; these will be tools used in addressing sparsity issues in the rest of the book.
Neither problem data nor computer arithmetic is exact. Thus, when the algorithms of the previous chapter are implemented on a computer, the solutions will not be exact. As we prepare to solve very large systems of equations, perhaps with millions of unknowns, it is natural to be concerned about the effect of these inaccuracies on the final result.
There are two ways to study the effect of arithmetic errors on the solution of sets of equations. The most obvious way, accumulating their effect through every stage of the algorithm, is extremely tedious and usually produces very pessimistic results unless full account is taken of the correlations between errors. There is a less direct way to assess this error, which has become standard practice for numerical linear algebra computations. It was pioneered by Wilkinson (1961) and involves answering two questions:
- (i) Is the computed solution x the exact solution of a ‘nearby’ problem?
- (ii) If small changes are made to the given problem, are changes in the exact solution also small?
A positive answer to the first question means that we have been able to control the computing error. That is, the error in the computed solution is no greater than would result from making small perturbations to the original problem and then solving the perturbed problem exactly. In this case, we say that the algorithm is stable. We seek always to achieve this.
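To make instability concrete, here is a small sketch (not from the book; Python and NumPy are used purely for illustration) in which Gaussian elimination without row interchanges meets a tiny pivot. The huge multiplier swamps the data in the second row, and the computed solution is not close to the exact solution of any nearby problem, whereas a pivoted solve behaves well.

```python
import numpy as np

def lu_no_pivot(A):
    """LU factorization with no row interchanges (illustration only)."""
    U = A.astype(float).copy()
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]       # huge multiplier if pivot is tiny
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, np.triu(U)

def tri_solve(L, U, b):
    """Forward substitution with L, then back substitution with U."""
    n = len(b)
    y = b.astype(float).copy()
    for i in range(n):
        y[i] -= L[i, :i] @ y[:i]
    x = y.copy()
    for i in range(n - 1, -1, -1):
        x[i] = (x[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# A tiny pivot in the (1,1) position makes unpivoted elimination unstable.
eps = 1.0e-17
A = np.array([[eps, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0])        # exact solution is very close to (1, 1)

L, U = lu_no_pivot(A)
x_bad = tri_solve(L, U, b)      # multiplier 1/eps destroys the data of row 2
x_good = np.linalg.solve(A, b)  # LAPACK applies partial pivoting

print(x_bad, x_good)
```

Interchanging the two rows before eliminating keeps every multiplier bounded by one, and the same arithmetic then gives an accurate answer; this is the partial pivoting strategy discussed later in the chapter.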
Direct Methods for Sparse Matrices, second edition. I. S. Duff, A. M. Erisman, and J. K. Reid. © Oxford University Press 2017. Published 2017 by Oxford University Press.

If the answer to the second question is positive, that is, if small changes in the data produce small changes in the solution, we say that the problem is well-conditioned. Conversely, when small changes in the data produce large changes in the solution, we say that the problem is ill-conditioned. Note that this is a property of the problem and has nothing to do with the method used to solve it. Care is always needed in the ill-conditioned case. Data errors and errors introduced by the algorithm may cause large changes to the solution. Sometimes (see, for example, the simple case discussed in Section 4.10) the ill-conditioning may be so severe that only a few figures, or even none, are computed with certainty even though the algorithm is stable.
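A small numerical sketch (assumed for illustration, not taken from the book) shows what ill-conditioning looks like: the rows of the 2x2 matrix below are almost parallel, so perturbing one entry of the right-hand side by about 0.005% moves the exact solution from (1, 1) to (0, 2).

```python
import numpy as np

# A nearly singular matrix: its rows are almost parallel.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

b1 = np.array([2.0, 2.0001])  # exact solution (1, 1)
b2 = np.array([2.0, 2.0002])  # second component perturbed by about 0.005%

x1 = np.linalg.solve(A, b1)
x2 = np.linalg.solve(A, b2)   # exact solution of the perturbed problem: (0, 2)

# The condition number (about 4e4 here) measures this sensitivity:
# relative changes in the data may be amplified by up to this factor.
print(np.linalg.cond(A))
```

The algorithm is stable in both solves; it is the problem itself that amplifies the perturbation, which is exactly the distinction drawn in the text.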
When both the algorithm is stable and the problem is well-conditioned, the computed solution will be accurate (see Exercise 4.1). Because ill-conditioning is a property of the problem, it may be altered by changing the problem formulation. An example of this can be seen in the solution of the linear least-squares problem

    Ax = b,

where A has m rows and n columns, with m > n, which we solve in the sense of minimizing ||b - Ax||_2. We can formulate the least-squares system as the normal equations

    A^T A x = A^T b,

where now A^T A is square, symmetric, and positive definite. While the exact solutions of the two sets of equations are the same, the sensitivity of the solution to small changes in b is not the same as its sensitivity to small changes in A^T b. As an illustration, consider a small problem of this kind: changing the first component of b by 0.01% produces only a comparably small change in the solution, whereas changing the first component of A^T b by 0.01% produces a far larger change. It is clear that the original problem is far less sensitive to small changes than the normal equations; that is, reformulation of the problem as a set of normal equations is undesirable from the point of view of conditioning. This is true more generally for normal equations, see Higham (2002, Chapter 20), but a full exploration of the sparse least-squares problem is beyond the scope of this book.
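The loss of conditioning comes from the fact that the condition number of A^T A is the square of that of A. The sketch below (synthetic data, not the book's example) builds a 20x5 matrix with prescribed singular values running from 1 down to 1e-4, so cond(A) is 1e4 while cond(A^T A) is 1e8, and compares a backward-stable least-squares solver with a solve through the normal equations.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 5

# Build A = Q1 * diag(s) * Q2^T with prescribed singular values 1 ... 1e-4.
Q1, _ = np.linalg.qr(rng.standard_normal((m, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0.0, -4.0, n)
A = Q1 @ np.diag(s) @ Q2.T

print(np.linalg.cond(A))          # about 1e4
print(np.linalg.cond(A.T @ A))    # about 1e8: the square of cond(A)

# Solve a consistent problem both ways.
x_true = np.ones(n)
b = A @ x_true
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]   # QR-based, backward stable
x_ne = np.linalg.solve(A.T @ A, A.T @ b)      # via the normal equations

err_ls = np.linalg.norm(x_ls - x_true)
err_ne = np.linalg.norm(x_ne - x_true)
print(err_ls, err_ne)  # the normal-equations error is typically far larger
```

Roughly speaking, the normal-equations route can lose about twice as many digits as the direct least-squares solve, which is the squaring of sensitivity described above.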
This discussion has been deliberately vague, without defining the precise meanings of large and small or giving rigorous bounds. These issues are addressed and illustrated in the rest of this chapter.