Inner product of two packed vectors
Taking an inner product of a packed vector with a full-length vector is very simply and economically achieved by scanning the packed vector, as shown in the Fortran code of Figure 2.2.1. A fast inner product between two packed vectors may be obtained by first expanding one of them into a full-length vector. An alternative, if they both have their components in order, is to scan the two in phase as shown in the Fortran code of Figure 2.3.1.
Fig. 2.3.1. Code for the inner product between two packed ordered vectors (value_x, index_x) and (value_y, index_y) with taux and tauy entries.
Adding packed vectors
Another very important operation is that of adding a multiple of one vector to another, say
If x and y are held as packed vectors with their components unordered, a satisfactory technique is again to use a full-length integer vector p known initially
Hf this is done n times while factorizing a sparse matrix of order n, O(n2) operations would be performed, which is likely to dominate the operation count.
to be zero. We then modify the packed form of x using p as a working vector but restoring it to zero as follows:
- (i) For each entry yk, place its position in the packed vector in pk. For example, if the vector y had three entries, ую, y15, yi say, then entries 10, 15, and 1 in p would be set to 1, 2, and 3, respectively.
- (ii) Scan x. For each entry x* check p*. If it is nonzero, use it to find the value of у*, modify x*, and reset p* to zero.
- (iii) Scan y. For each entry y* check p*. If it is nonzero, add a new component with value ay* to the packed form of x (we have a fill-in). Reset p* to zero.
Fortran code for these operations is shown in Figure 2.4.1. If their components
Fig. 2.4.1. Code for the operation x := x + ay.
are in order, an in-phase scan of the two packed vectors, analogous to that shown in Figure 2.3.1, may be made. Note that fill-ins need not occur at the end of the packed form, so we can be sure that the order is preserved only if a fresh packed form is constructed (so that the operation is of the form z := x + ay). We set the writing of such code as Exercise 2.1. Perhaps surprisingly, it is more complicated than the unordered code. In fact, there appears to be little advantage in using ordered packed vectors except to avoid needing the full-length vector p, but even this may be avoided at the expense of slightly more complicated code (see Exercise 2.2).
An alternative approach is as follows:
- (i) For each entry xk, place its position in the packed vector in pk.
- (ii) Scan y. For each entry y*, check p*. If it is nonzero, use it to find the position of x* and modify it, x* := x* + аур otherwise, add i to the packed form of x and set x* := ay*.
- (iii) Scan the revised packed form of x. For each i, set p* := 0.
While it is slightly more expensive than the first approach, since x is likely to have more entries than y, it offers one basic advantage. It permits the sequence of operations
to be performed with only one execution of steps (i) and (iii). Step (ii) is simply repeated with y(1), y(2),..., while the indices for x remain in p. Step (ii) may be simplified if it is known in advance that the sparsity pattern of x contains that of y.
We will see later (Chapter 10) that there are times during the solution of sparse equations when each of these approaches is preferable.