
# Minimum degree ordering

The minimum degree algorithm (Tinney scheme 2) was introduced in Section 7.3 and we will now consider its implementation.

To facilitate the pivot selection, we require access to whole rows (or columns) of the matrix. Taking full advantage of symmetry means that only half of the matrix is stored and access to the whole of a row demands access to the row and column of the stored half of the matrix. The simplest alternative is to store the entire matrix, and we make this assumption here. The storage burden is eased by not having to store the numerical values at this stage.

The critical factor in the efficient implementation of the minimum degree ordering is to avoid the explicit storage of the fill-ins. Rather, the fill-in is treated implicitly by using generated elements (cliques) to store updates to the reduced matrix. This clique representation of Gaussian elimination was popularized by Rose (1972), and used in a somewhat different way by George and Liu (1981).

We will describe the non-finite-element case first, although some simplifications are possible for finite-element problems. The first pivot is chosen as the diagonal entry in a row with least entries, say (k+1) entries. Following the operations involving the first pivot, the reduced matrix will have a full submatrix in the rows and columns corresponding to the k off-diagonal entries in the pivot row. The graph associated with this full submatrix is called the pivotal clique. The important thing concerning the storage of cliques (see Section 2.18) is that a clique on k nodes requires only k indices, while the corresponding submatrix requires storage for ½k(k + 1) numerical values.

Rather than finding out where fill-ins take place and revising the lists of entries in the rows, as we did when working with actual values (see Section 10.2), we keep the original representations of the rows together with a representation of the full submatrix as a clique. To update the count of number of entries in a row that was involved in the pivotal step, we scan the index list of its entries (excluding the pivotal column) and the index list of the clique, incrementing the count each time an index occurs for the first time.

For example, in the case illustrated in Figure 11.2.1, the clique associated with the first elimination has index list (3,5,6) and row 3 has index list (3,5,7), so the count is incremented to 3 by scanning the clique and to 4 when 7 is found to occur for the first time when scanning the index list (3,5,7). The index of the pivot can remain as an index for the pivotal clique. Note that the index lists of rows outside the clique do not change. Each list may be held as a sparse vector (Section 2.7) or as a linked list (Section 2.10).

Fig. 11.2.1. The first step of elimination. Fill-ins are shown as •.
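The counting scheme above can be sketched in Python (a minimal illustration, assuming plain Python lists in place of the sparse vectors or linked lists of Sections 2.7 and 2.10; the function name is ours):

```python
def updated_degree(row_list, clique_list, pivot):
    """Count the entries of a row involved in the pivotal step,
    treating fill-in implicitly: each index counts once, whether it
    comes from the row's original list or from the pivotal clique."""
    seen = set()
    for j in clique_list:        # the pivotal clique is scanned first
        if j != pivot:
            seen.add(j)
    for j in row_list:           # original entries, excluding the pivot column
        if j != pivot:
            seen.add(j)
    return len(seen)

# The example of Fig. 11.2.1: clique (3,5,6), row 3 with list (3,5,7);
# scanning the clique gives 3, and index 7 raises the count to 4.
print(updated_degree([3, 5, 7], [3, 5, 6], pivot=1))  # 4
```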

At a typical stage of the elimination, the pivot is chosen as the diagonal entry in a row with least entries. The set of entries in the pivot row is formed by merging its index list (excluding eliminated columns) with those of all the cliques that involve the pivot. This set, less the index of the pivot, provides the index set of the pivotal clique. Since all the old cliques that involve the pivot are included in the pivotal clique, there is no need to keep them. The number of entries in each altered row is now calculated.
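This merging step can be sketched as follows (a Python illustration with sets standing in for index lists; the names and data layout are ours, not the book's, and for clarity the pivot index is dropped from the new clique even though the text notes it can remain):

```python
def form_pivotal_clique(pivot, row_list, cliques, eliminated):
    """Merge the pivot row's uneliminated entries with every clique
    involving the pivot; the result, less the pivot itself, is the
    index set of the pivotal clique.  Old cliques involving the pivot
    are subsumed by the new one and discarded."""
    members = {j for j in row_list if j not in eliminated and j != pivot}
    for c in [c for c in cliques if pivot in c]:
        members |= {j for j in c if j not in eliminated and j != pivot}
        cliques.remove(c)        # absorbed: no longer needed
    cliques.append(members)
    return members
```

Because absorbed cliques are removed as the new one is formed, the total storage for clique lists never grows, which is the property noted below.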

As when working with actual values, it is advisable to store the indices of rows with equal numbers of entries as doubly-linked lists. At each pivotal step, each index of a row whose count changes is removed from its list and is inserted in its new list.
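One way to organize such lists, sketched here with arrays emulating the links (the class and its layout are illustrative, not a production implementation):

```python
class DegreeLists:
    """Rows bucketed by degree.  The next_/prev_ arrays emulate
    doubly-linked lists, so moving a row to a new bucket is O(1)."""
    def __init__(self, n, max_deg):
        self.head = [-1] * (max_deg + 1)   # first row of each degree list
        self.next_ = [-1] * n
        self.prev_ = [-1] * n
        self.deg = [0] * n

    def insert(self, i, d):
        self.deg[i] = d
        self.next_[i] = self.head[d]
        self.prev_[i] = -1
        if self.head[d] != -1:
            self.prev_[self.head[d]] = i
        self.head[d] = i

    def remove(self, i):
        d = self.deg[i]
        if self.prev_[i] != -1:
            self.next_[self.prev_[i]] = self.next_[i]
        else:
            self.head[d] = self.next_[i]
        if self.next_[i] != -1:
            self.prev_[self.next_[i]] = self.prev_[i]

    def update(self, i, new_deg):
        """Move row i to its new degree list after a pivotal step."""
        self.remove(i)
        self.insert(i, new_deg)

    def min_degree_row(self):
        """Return (row, degree) for a row of least degree."""
        for d, h in enumerate(self.head):
            if h != -1:
                return h, d
        return -1, -1
```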

The algorithm has a very interesting property. Since each new list for a clique is always formed by merging old lists (and removing the pivot index) and the old lists are then discarded, the total number of list entries does not increase in spite of fill-ins. Hence, provided the original matrix can be stored and provided adequate data structures are in use, there is no possibility of failure through insufficient storage.

It is very worthwhile to recognize rows with identical structure, since once identical they will remain so until pivotal. Such a collection of rows is known as a super-row and the corresponding variable is known as a supervariable. Index lists of supervariables will be shorter and can be searched more quickly. Only one is needed for each super-row and only one degree needs to be calculated for all the rows of a super-row.

We explained in Section 2.14 how to identify efficiently all the supervariables for a given matrix. For testing whether two rows become identical after eliminations, an efficient approach is to use a hashing function. Amestoy et al. (1996a) place the super-rows modified in a pivot step in n hash buckets by using the hash function 1 + s(i) mod (n − 1), where s(i) is the sum of the indices of the supervariables and cliques associated with super-row i. A linked list can be used for those in the same bucket. Whenever a super-row is placed in a bucket that is not empty, it is checked against the others in the bucket. This does not recognize a row that becomes identical to a row that is not modified, but they will be so recognized in the next pivotal step that involves them.
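A sketch of this hashing scheme, assuming `adj[i]` holds the set of supervariable and clique indices associated with super-row i (the helper names are ours):

```python
def hash_bucket(adj, i, n):
    """Bucket 1 + s(i) mod (n - 1), following Amestoy et al. (1996a),
    where s(i) sums the indices associated with super-row i."""
    return 1 + sum(adj[i]) % (n - 1)

def detect_identical(modified, adj, n):
    """Group the super-rows modified in a pivotal step by bucket and
    confirm equality only within a bucket (hashes can collide)."""
    buckets = {}
    merges = []
    for i in modified:
        b = hash_bucket(adj, i, n)
        for j in buckets.get(b, []):
            if adj[i] == adj[j]:        # genuinely identical structure
                merges.append((j, i))   # absorb i into super-row j
                break
        else:
            buckets.setdefault(b, []).append(i)
    return merges
```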

The graph whose nodes are the supervariables and whose edges correspond to full submatrices in their rows and columns is known as the quotient graph (George and Liu 1981).

Once one variable of a supervariable has been eliminated, the others may be eliminated immediately without any further fill-in. This strategy was called ‘mass node elimination’ when it was first proposed. Following such a mass elimination, the pivotal clique has size equal to the number of variables in the pivot row less the number of variables in the pivot block. This is known as the external degree, since it does not count entries in the pivot block. In some cases, choosing the pivot to minimize the external degree instead of the true degree can have a significant effect on the number of entries in the factorization of the reordered matrix. This was proposed by Liu (1985), who reported that he found a reduction of 3-7% in the size of the factorized matrix. Bigger gains have been found since then. Some examples are shown in Table 11.3.1. One reason for this is that the external degree gives a better approximation to the fill-in than the true degree. We note that most modern codes work with a reduced matrix where blocks are represented by a single supernode so that the gains from using external degrees are readily available.
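The distinction can be made concrete with a small hypothetical helper (assuming `neighbours` is the set of supervariables adjacent to the block and `block_size` the number of variables the block holds; the function name is ours):

```python
def true_and_external_degree(neighbours, block_size):
    """For a supervariable of block_size identical rows with the given
    neighbours outside the block, the true degree also counts the other
    block variables; the external degree does not, and equals the size
    of the pivotal clique left after a mass elimination."""
    true_deg = len(neighbours) + block_size - 1
    ext_deg = len(neighbours)
    return true_deg, ext_deg
```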

Three devices may be employed to reduce the cost of the calculations (see Duff and Reid 1983, and Exercise 11.1):

• (a) When recalculating the degree of a super-row that is affected in a pivotal step, start with the index list of the pivotal clique, since this is bound to be involved.
• (b) When scanning cliques during degree calculation, it is easy to recognize any clique that lies entirely within the pivotal clique. Such a clique may be removed without losing any information about the structure of the reduced system.
• (c) Once an entry of the original matrix has been ‘overlaid’ by a clique, it may be removed without altering the overall pattern.
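Devices (b) and (c) amount to simple subset and coverage tests; a Python sketch (function names are ours, with sets standing in for clique index lists):

```python
def absorb_cliques(cliques, pivotal_clique):
    """Device (b): a clique lying entirely within the pivotal clique
    carries no extra information about the structure of the reduced
    system and may be discarded."""
    return [c for c in cliques if not c <= pivotal_clique]

def prune_overlaid(row_list, cliques_of_row):
    """Device (c): an original-matrix entry already 'overlaid' by a
    clique of the row may be dropped without altering the pattern."""
    covered = set().union(*cliques_of_row) if cliques_of_row else set()
    return [j for j in row_list if j not in covered]
```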

Of course, there may be many rows with the least number of entries. Liu (1985) suggested choosing a set with no interactions (none has an index list that includes any of the others). This is known as multiple minimum degree or MMD. The lack of interaction may be helpful when working in parallel.
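A greedy sketch of selecting such a non-interacting set (illustrative only; `adj[i]` is the index list of row i held as a set, and the function name is ours):

```python
def independent_min_degree_set(candidates, adj):
    """From rows of least degree (candidates), keep a set in which no
    chosen row appears in another chosen row's index list, so that all
    of them can be eliminated in the same pass (Liu's MMD)."""
    chosen = []
    for i in candidates:
        if all(i not in adj[j] and j not in adj[i] for j in chosen):
            chosen.append(i)
    return chosen
```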

To illustrate these improvements to the minimum degree ordering that we have considered in this chapter, we show in Table 11.2.1 specimen times for five codes. The problems are three of the graded-L triangulations of George and Liu (1978b). Although these are old results on small problems they illustrate the gains well, and the most recent code in the comparison (MA27A) is essentially

Table 11.2.1 Times in IBM 370/168 seconds for various implementations of minimum degree ordering.

| Order | Entries | MA17A (1970) | MA17E (1973) | YSMP (1978) | SPARSPAK (1980) | MA27A (1981) |
|------:|--------:|-------------:|-------------:|------------:|----------------:|-------------:|
| 265 | 1 753 | 1.56 | 0.68 | 0.24 | 0.27 | 0.15 |
| 1 009 | 6 865 | 29.9 | 6.86 | 1.27 | 1.11 | 0.58 |
| 3 466 | 23 896 | >250 | 62.4 | 6.05 | 4.04 | 2.05 |

still state-of-the-art for a minimum degree code. Codes MA17A and MA17E are from the HSL Archive and do not use cliques; MA17A uses the storage scheme described in Table 2.13.2 and so needs to run through links to find the row or column index, whereas MA17E additionally holds row and column indices explicitly. YSMP (Yale Sparse Matrix Package—Eisenstat, Gursky, Schultz, and Sherman 1982), SPARSPAK (University of Waterloo), and MA27A (HSL) all use successive refinements of clique amalgamation, which is why they are faster than the first two. Since there is more than one version of each of these codes, the date of the version used in this comparison is given in parentheses. These dates also serve to indicate the advances in the implementation of the minimum degree algorithm.

The finite-element case is simpler in that the problem is given in the form of a set of cliques. The degrees of all the variables should be calculated initially from the clique representation without performing any assemblies. Thereafter, the implementation can be just like the non-element case except that the whole pattern is stored by cliques. We return to the finite-element case in Sections 11.10 and 11.11.
