Home Computer Science
Table of Contents:
Fixed-point multiplication is relatively a complex operation compared to addition/ subtraction whether implemented in hardware or software. It requires comparatively more hardware, and as a result, it is not usually included in the instruction set of smaller processors. Multiplication is normally there implemented by a simple but slow method of repeated addition. To compute X x Y is only to add the multiplicand X to itself Y times, where Y is the multiplier. Often multiplication is implemented by multiplying X by Y, к bits at a time, and then adding the resulting terms as obtained. This process is used for the multiplication of unsigned binary numbers in pencil-and-paper calculations with k = 1. Numerous types of other algorithms do exist and have been implemented in various computers. Let us now begin this topic with a relatively simpler problem of multiplying two unsigned (non-negative) integers, and then with the most popular techniques for multiplication of numbers in 2's complement representation.
Multiplication of unsigned binary integers X and Y might be carried out using traditional paper-and-pencil method in which the main operations involved in this type of multiplication method are shifting and addition of partial products. However, the multiplication of an и-bit binary integer with an ш-bit binary integer yields a product of up to (m + n) bits in length (e.g. Ill x 11 = 10101). Here, the steps being followed are effectively inefficient, mainly in that each of the 1-bit partial products must be kept stored until the final addition step is completed. To make it more efficient, firstly, it is possible to perform a running addition on the partial products rather than waiting until the end. This means that it is desirable to add each partial product term as it is generated to the sum of the preceding terms to form the current partial product. This eliminates the need for storage of all the partial products; of course, fewer additional registers are needed to accomplish this. Secondly, it is feasible to save some amount of time on the generation of partial products; for each 1 on the multiplier, an add operation and a shift operation are required, but for each 0 in the multiplier, only a shift is required.
Shift-and-add algorithm: Following the above-mentioned strategy, multiplication can now be performed using an adder circuitry in the ALU, and carrying out a number of sequential steps. The circuit performs multiplication by using a single и-bit adder и times to implement the spatial addition needed in this multiplication. At the start, the multiplicand and multiplier are loaded into two registers (say, X and Y), respectively. A third register (say, the Z register) is also needed and is initially set to 0. There is also a 1-bit register (say, C register) that holds a potential carry bit generated from addition. Initially, this register is set to 0.
The multiplication operation is carried out in the following way. Control logic reads the bits of the multiplier (Y„ _u Y„_2,.... Y0) starting from Y0 one at a time. If Y0 = 1, then the multiplicand (X) is added to the Z register, and the result is stored in the Z register, with the C bit being used to hold carry. Then, all of the bits of the C, Z, and Y registers are shifted one bit to the right so that C bit goes into Z„ _ u Z0 goes into Y„ .,, and Y0 is lost. If Y0 = 0, then no addition is performed, and only the right-shift is carried out. This process is repeated for each bit of the original multiplier. Each cycle is ended with C, Z, and Y being shifted right one bit position to allow for growth of the partial product as the multiplier is shifted out of register Y. Because of this shifting, multiplier bit Y, appears at the LSB (least significant bit) position of Y to generate the Add/No-add signal at the correct time, starting with Y0 during the first cycle, Yx during the second cycle, and so on. In this way, all the multiplier bits after being used are discarded by the right-shift operation. Note that the carry-out from the adder held in C register is the leftmost bit of the partial product (i + 1), and it must be shifted right with the contents of Z and Y. After и cycles (end of multiplication), the high- order half of the product is held in register Z, and the low-order half is in register Y, giving the resulting product of 2n-bits.
A brief detail of this topic with required hardware and also an appropriate flowchart of this scheme as well as an example showing its working are given in the website: http:// routledge.com/9780367255732.
The multiplication of signed-magnitude numbers requires a straightforward extension of the unsigned case as already discussed above. The magnitude part of the product P = X x Y is computed as usual by the shift-and-add multiplication algorithm, and the sign ps of product P is computed separately from the sign of X and Y as follows: ps: = xs ® ys. The signed-magnitude multiplication can be, however, then implemented on the same lines using the same sequential method as already described, after including the computation of the sign of the product result separately in the existing circuit.
Multiplication of 2's complement signed operands is somewhat different and exhibits some peculiarities, especially when the operands are negative: e.g. if we consider, multiplying unsigned integers 11 (1011) by 13 (1101) to yield a product 143 (10001111). If we interpret these two numbers in 2's complement representation, we have -5 (1011) multiplied by -3 (1101), giving -113 (10001111). This example reveals that straightforward multiplication will not work if both the multiplicand and the multiplier are negative. In fact, it can be shown by other examples that straightforward multiplication will also not work if either the multiplicand or the multiplier is negative. More precisely, it can be said that the multiplication process must then treat positive and negative operands differently.
To resolve this dilemma, there exist a number of ways. One of these would be to convert both the multiplier and multiplicand to positive numbers, perform the multiplication, and then take the two's complement of the result if and only if the sign of the two original numbers differs. In order to avoid all such complications, particularly the final transformation step as mentioned above, one simpler technique that is most commonly used is Booth's algorithm. This approach additionally has the merit of potentially speeding up the multiplication process, relative to a more straightforward approach.
188.8.131.52.2 Booth’s Algorithm
One of the most elegant and widely used schemes for two's complement multiplication was proposed by Andrew D. Booth sometimes in 1951. Booth's algorithm is simple to understand and easy to implement that employs both addition and subtraction, and it treats positive and negative operands uniformly, with as such no special actions required for negative numbers separately. It generates a 2n-bit product while uniformly treating both positive and negative 2's complement и-bit operands. Moreover, one of the distinct advantages of this algorithm is that it can be readily extended in various ways to speed up the multiplication process. A version of this algorithm has been found to have been implemented in multiply instruction of reputed RISC (reduced instruction set computer) processor ARM6.
Booth's algorithm can be described as follows. Here too, the multiplicand and the multiplier are placed in two registers X and Y, respectively. There is also a 1-bit register placed logically to the right of the least significant bit (Y0) of the Y register and designated as Y_,. The results of the multiplication as usual will finally appear in the Z and Y register.
At the beginning, the registers Z and Y_, are initialized to 0. As before, control logic here also starts scanning from the rightmost bit of the multiplier onwards, one at a time. Now, as each bit of multiplier is examined, the bit to its right is also examined. The comparison between these two bits is then made, which indicates the following:
• If the two bits are the same (i.e. the current bit under scan and the one just on its
right are the same (1 1 or 0 <— 0)), then all of the bits of the Z, Y, and Y_j registers
are simply shifted to the right 1 bit;
In all the cases, the right shift is such that the leftmost bit of Z, namely Z„ _,, not only shifted into Z„ _2, but also remains in Z„ _ v This is required to preserve the sign of the number in the result of the multiplication to be obtained in Z and Y. This action is commonly known as the arithmetic shift since it preserves the sign bit.
It is important to note the significance of using the 1-bit register Y_v Now to start the multiplication process, we always assume that there is a virtual bit designated as Y_, on the right of the rightmost bit of the multiplier that is always taken as 0, since initialized with 0. As the right shift is always given in all the cases, Y0 always holds the current bit of the multiplier to be scanned, and Y_x consequently always holds the bit which is just to the right of the currently scanning bit of the multiplier. Thus, after shift (completion of one cycle), it is the pair (Y„, Y_,) that is always to be taken into consideration to determine the type of the actions (add, subtract, shift) to be taken.
Figure 7.8 shows an example detailing the sequence of steps mentioned in Booth's algorithm for the multiplication of 6 by 5. The result of this multiplication is finally obtained in Z and Y, i.e. 0 0 0 11 1 1 0, which is equal to 30 as usual.
Booth's algorithm can also be used equally well, directly for negative multipliers, as shown in Figure 7.9, for the multiplication of 6 by -5. The result of this multiplication is finally obtained in Z and Y, i.e. 1 11 00 01 0, which is equal to -30 as usual. In fact, by using
Multiplication example using Booth's algorithm (6 x 5 in Z and У).
Multiplication example using Booth's algorithm (the product of 6 x -5 in Z and У).
other examples, it can be shown that Booth's algorithm works equally well with any combination of positive and negative numbers, making no distinction between them as such.
Working of Booth's algorithm with both negative multipliers and negative multiplicands is shown in the website: http://routledge.com/9780367255732.
184.108.40.206.2 Alternative Method
Figure 7.10 illustrates the normal method and Booth's algorithm with a more compact manner using a different procedure from that of the examples just discussed. Figure 7.10a shows a normal multiplication scheme with a 4-bit signed operand -5 as multiplicand, multiplied by +7, the multiplier, to get the 8-bit product, -35. The sign extension of the multiplicand in the partial product is shown with an underline. Thus, the hardware as used for unsigned binary integer multiplication (already shown in the website) can also be used here for negative multiplicands if it provides for sign extension of the partial products.
Figure 7.10b shows a different multiplication procedure using Booth's algorithm as already defined, taking the same example of a 4-bit signed operand -5 (multiplicand) multiplied by +7 (multiplier) to get the 8-bit product, -35. The multiplier bit here is recoded (bit-pair recoding) when it is scanned from right to left following the original rules as already described above in Booth's algorithms, but essentially with a very little redefinition used for this type of multiplication scheme. It is to be remembered that there is always an implied 0 that lies to the right of the least significant bit (i.e. the rightmost bit) of the multiplier. Now, the scheme is as follows:
• When moving from 0 to 1 (1 <— 0), the current bit 1 of the multiplier will be replaced
• When moving from 1 to 0 (0 1), the current bit 0 of the multiplier will be replaced
Multiplication example (-5 x 7) using different method to implement Booth's algorithm: (a) normal multiplication scheme and (b) Booth's multiplication scheme.
• When moving either from 1 to 1 or from 0 to 0 (i.e. the current bit under scan and its immediately preceding right bit of the multiplier are the same (1 <— 1 or 0 <— 0)), the current bit of the multiplier, be it 1 or 0, will be replaced by 0.
In Figure 7.10b, it really occurs (least significant bit of the multiplier is 1). That is why, the least significant bit of the multiplier here is recoded as -1 according to the above rule (since the original 1 in the multiplier changes from 1 to -1 due to the implied 0 lying to its right). Thus, the recoded multiplier in this example becomes +100 -1 when the original multiplier was 0111. Here also, the sign extension of the multiplicand in the partial product is shown with an underline (Figure 7.10b).
Following the same method as already depicted in Figure 7.10b, it can be shown by suitable example that Booth's algorithm can also be equally applicable in multiplication with any combination of positive and negative numbers in a similar manner, making no distinction between them as such.
Some distinct features (remarks) of Booth's algorithm are given in the website: http:// routledge.com/9780367255732.
Fast Multiplication: Carry-Save Addition
In multiplication, the final result is ultimately produced by adding the partial products (called summands) as generated row-wise during the multiplication process due to bit-by- bit multiplication of the multiplier with the multiplicand, and is placed in rows after rows with necessary left shifts. When these rows are added column-wise to obtain the final result, the superior carry-save addition technique (already explained in the previous section) using a CSA can be effectively employed here to make the multiplication operation even faster.
In multiplication with longer operands, many summands are produced that need to be added to produce the final result. A more significant reduction in computation time can then be regularly achieved by way of grouping the summands in threes, and carry-save addition can now be performed on each of these groups in parallel to generate a set of S and C vectors in one full-adder delay. Next, the set of the S and C vectors thus obtained can again be grouped into threes, and carry-save addition once again is performed on them, generating a further set of S and C vectors in one more full-adder delay. This process, however, will be continued until there are only two vectors remaining. They can be then simply added by a conventional ripple-carry adder (CPA) or by a CLA (already explained in the previous section) in a usual way to produce the final desired product.