
RISC Pipelining

The principles of instruction pipelining (see pipeline architecture, Chapter 8) suit the inherent characteristics of RISC architecture well. Pipelining has therefore been used extensively in RISC machines since the early 1980s, including RISC scalar and superscalar processors with adequate hardware support, and it is one of the major contributors to RISC's high performance. The pipeline implemented in the instruction cycle of a RISC processor is composed of a few basic stages, and in an actual implementation each stage may be decomposed further into a number of substages to smooth its operation and improve throughput. Because a RISC architecture has a small, simple, and regular instruction set, dividing the instruction cycle into four or five stages (and even more in recent designs) is quite appropriate. Such a division offers a potential speed-up approaching a factor of 4 or 5, respectively, and sometimes more with deeper pipelines. The basic stages are as follows:

  • Instruction fetch unit;
  • Instruction decode unit;
  • Operand address calculation and operand fetching;
  • Execution of the instruction;
  • Write back of the result at the destined location.

The responsibilities carried out by each of these stages are described on the website and explained more elaborately in Chapter 8.
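The speed-up figure quoted above can be checked with a small calculation. The sketch below assumes an ideal pipeline (no stalls or hazards, every stage taking exactly one cycle) and compares it with a non-pipelined machine that spends one cycle per stage on each instruction; the function names are illustrative and do not come from the text.

```python
def pipeline_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed by an ideal pipeline: fill it once, then finish one instruction per cycle."""
    if n_instructions < 1 or n_stages < 1:
        raise ValueError("need at least one instruction and one stage")
    return n_stages + (n_instructions - 1)


def speedup(n_instructions: int, n_stages: int) -> float:
    """Speed-up over a non-pipelined machine taking n_stages cycles per instruction."""
    unpipelined_cycles = n_instructions * n_stages
    return unpipelined_cycles / pipeline_cycles(n_instructions, n_stages)


if __name__ == "__main__":
    # The speed-up climbs towards the stage count as the instruction stream grows.
    for n in (1, 10, 100, 10_000):
        print(f"{n:>6} instructions, 5 stages: speed-up {speedup(n, 5):.3f}")
```

Under these assumptions the speed-up is n*k/(k + n - 1) for n instructions and k stages, so it approaches k only for long instruction streams; real pipelines fall short of this because of hazards and stalls.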

RISC and CISC Union: Hybrid Architecture

Until the mid-1990s, processor architects were divided into two opposing camps. CISC designs were preferred mainly because of the immense success of microprogramming and the wide availability of existing versatile software, whereas RISC designs were favoured chiefly for their simplicity and efficiency. The demarcation between the two has since been fading, and a trend began of including the useful features of one in the architecture of the other. Many modern CISC processors include a larger number of GPRs (essentially a RISC ideal) to improve their efficiency. Intel, from its Pentium III onwards, included an additional set of eight 128-bit vector registers to support SIMD-based multimedia applications (SSE technology, see Chapter 3). AMD's newer x86 chips added a further eight GPRs and eight extra SSE registers for multimedia. This trend continues, and the Itanium (IA-64) series that followed the Intel Pentium moves further still by providing 128 GPRs.

Moreover, the Pentium and Athlon families of processors now exploit a CISC-RISC hybrid architecture that uses a decoder to convert CISC instructions into corresponding simpler RISC-like instructions before execution. These are then executed very fast by an embedded, deeply pipelined RISC core equipped with many performance-enhancing hardware and software facilities that were traditionally possible only in a true RISC design. These hybrid processors are fully compatible with the software developed for their CISC predecessors, yet they can compete on equal terms with processors based on true RISC designs. This concept is expected to be developed further in the days to come, although these processors are not well suited to popular mobile and embedded applications. On the other hand, contemporary RISC processors have also moved towards CISC-like designs by including more instructions and added functions (against RISC culture) than older CISC designs had. For example, the Motorola G4 processor used in Power Macs and eMacs adds 162 new instructions to the existing RISC architecture for its AltiVec unit to handle multimedia and digital signal processing applications more efficiently.
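The decoding step described above can be pictured with a toy translator. The sketch below is purely illustrative: the instruction syntax, the temporary register names (tmp0, tmp1) and the micro-operation tuples are hypothetical and do not reflect the internal formats of any real Pentium or Athlon decoder.

```python
from typing import List, Tuple

MicroOp = Tuple[str, ...]


def is_memory(operand: str) -> bool:
    """In this toy syntax, an operand written in square brackets lives in memory."""
    return operand.startswith("[") and operand.endswith("]")


def decode(instruction: str) -> List[MicroOp]:
    """Split one two-operand CISC-style instruction into load/operate/store micro-ops."""
    opcode, operands = instruction.split(maxsplit=1)
    dst, src = [part.strip() for part in operands.split(",")]
    micro_ops: List[MicroOp] = []

    if is_memory(src):
        # Bring a memory source into a temporary register first.
        micro_ops.append(("LOAD", "tmp0", src[1:-1]))
        src = "tmp0"

    if is_memory(dst):
        # Read-modify-write: load, operate register-to-register, store back.
        address = dst[1:-1]
        micro_ops.append(("LOAD", "tmp1", address))
        micro_ops.append((opcode, "tmp1", "tmp1", src))
        micro_ops.append(("STORE", address, "tmp1"))
    else:
        micro_ops.append((opcode, dst, dst, src))
    return micro_ops


if __name__ == "__main__":
    for uop in decode("ADD [total], eax"):
        print(uop)
```

A register-to-register instruction maps to a single micro-operation, while an instruction with a memory destination expands into three simple operations, which is the property that lets a load/store style core execute them.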

Types of RISC Processors

RISC machines available from different manufacturers exploit many design choices within the constraints of RISC. Each one differs from the others and has attractive features of its own; each is competent, yet experts still arrive at dissimilar conclusions about them. Some of these machines also compromise by including several useful features of CISC in their design to make them more versatile. The field of RISC design is full of pitfalls and choices, and no common conclusions have yet been reached. The subject is still open and invites much exploration in the years to come. We give here brief architectural details of only a few notable families, to provide a clear understanding of the strength and capability of RISC architectures.

PowerPC Processors

In the early 1990s (1993), Apple, IBM, and Motorola jointly developed a family of single-chip microprocessors, the PowerPC, following the line of the existing IBM RISC System/6000 (RS/6000) series of computers. This family includes the 601, 603, 604, and 620, as well as other models; most share certain common features, while a few models offer something more. Interestingly, although the PowerPC follows typical RISC design in having a fixed instruction length of one 32-bit word, it includes a variety of formats and addressing modes and has a substantially large number of instructions (more than 200 distinct types), which goes against the RISC culture and philosophy.

A brief description of all PowerPCs, with associated figures, is given on the website: http://

SPARC Family of Processors

Sun Microsystems introduced its first open microprocessor architecture in 1987. It is an architecture rather than an implementation of a chip, and it is popularly known as SPARC (Scalable Processor ARChitecture). Different technologies (CMOS, ECL, GaAs, gate array, VLSI, etc.) and different specifications were used by different licensed manufacturers, such as Fujitsu, Cypress, LSI Logic, and Texas Instruments, to fabricate the chip, but the basic instruction set architecture has remained the same.

SPARC is basically a 32-bit design consisting of an integer unit (IU), an FPU, an optional user-supplied co-processor, an MMU, various sizes of off-chip cache, and different types of memory organisation. One of its notable features is an overlapping register window scheme in which 32 registers, called R0 to R31, are visible and accessible at any time. It works on boundary-aligned 32-bit words and supports a paged linear address space of 2^32 individually addressable 8-bit bytes. Memory is big-endian, as in the Motorola 68000 family. Although the SPARC design is essentially a uniprocessor architecture, it makes provision for connecting multiple SPARC chips to build a symmetric multiprocessor (SMP) system with common shared main memory. Special instructions have therefore been included to handle multiprocessor synchronization and related issues.
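The overlapping register window scheme can be illustrated with a simple mapping from the 32 visible register numbers to a larger physical register file. The sketch below relies on the SPARC convention, not spelled out in the text above, that the 32 visible registers split into 8 globals, 8 outs, 8 locals, and 8 ins, and that the outs of a caller become the ins of the callee. The window count and the physical numbering are assumptions chosen for illustration only.

```python
N_WINDOWS = 8        # assumed; the number of windows varies between implementations
N_GLOBALS = 8        # registers 0..7 are shared by every window
WINDOW_STRIDE = 16   # each window contributes 8 outs + 8 locals of its own


def physical_register(cwp: int, r: int) -> int:
    """Map visible register r (0..31) in window cwp to an index in the physical file."""
    if not 0 <= r <= 31:
        raise ValueError("visible registers are numbered 0..31")
    if r < N_GLOBALS:
        return r  # globals do not move with the window
    # Registers 8..15 are outs, 16..23 locals, 24..31 ins. The ins of window
    # cwp land on the same slots as the outs of window cwp + 1.
    windowed = (cwp * WINDOW_STRIDE + (r - N_GLOBALS)) % (N_WINDOWS * WINDOW_STRIDE)
    return N_GLOBALS + windowed


def save(cwp: int) -> int:
    """Procedure entry: move to the next window (the window pointer is decremented)."""
    return (cwp - 1) % N_WINDOWS


def restore(cwp: int) -> int:
    """Procedure return: go back to the caller's window."""
    return (cwp + 1) % N_WINDOWS


if __name__ == "__main__":
    caller = 3
    callee = save(caller)
    # The caller's first out register and the callee's first in register coincide.
    print(physical_register(caller, 8), physical_register(callee, 24))
```

Because the caller's outs and the callee's ins are the same physical registers, parameters are passed without copying them through memory, while each procedure still has private locals.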

UltraSPARC Processors

The UltraSPARC is a superpipelined superscalar processor with several salient features not available in the ordinary SPARC. The superscalar processor consists of two independent pipelines; each pipeline consists of nine stages, and each stage is completed strictly in one processor clock cycle. Each pipeline has two execution units, one for integer operations with its own register set and one for floating-point operations with a separate register set, and these can operate simultaneously in parallel. As a result, a total of four new instructions can enter the execution phase every clock cycle. The processor uses two levels of cache: an external cache (E-cache) and two internal caches, one for instructions (I-cache) and one for data (D-cache). The MMU has two translation lookaside buffers (caches of page table entries): one for instructions (iTLB) and one for data (dTLB). The UltraSPARC series of processors handles both addresses and data as 64-bit values, but maintains backward compatibility with the earlier 32-bit versions. A recent release, the UltraSPARC III, fabricated in 0.18-µm technology with a clock speed in the range of 750-900 MHz, has many useful advanced features. It is targeted to reach around 1.5 GHz in the near future.
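The combined effect of a nine-stage pipeline and four instructions issued per cycle can be estimated with a rough model. The sketch below assumes ideal conditions (the instruction mix always fills all four issue slots, with no hazards or cache misses); the function names and defaults are illustrative, taken only from the figures quoted above.

```python
import math


def ideal_cycles(n_instructions: int, issue_width: int = 4, depth: int = 9) -> int:
    """Cycles to finish n instructions when issue_width enter a depth-stage pipeline each cycle."""
    if n_instructions < 1:
        raise ValueError("need at least one instruction")
    issue_groups = math.ceil(n_instructions / issue_width)
    # The first group takes `depth` cycles; each later group completes one cycle after the last.
    return depth + issue_groups - 1


def ideal_ipc(n_instructions: int, issue_width: int = 4, depth: int = 9) -> float:
    """Average instructions completed per cycle under the same ideal assumptions."""
    return n_instructions / ideal_cycles(n_instructions, issue_width, depth)


if __name__ == "__main__":
    for n in (4, 400, 40_000):
        print(f"{n:>6} instructions: {ideal_cycles(n)} cycles, IPC {ideal_ipc(n):.2f}")
```

In this model the throughput approaches four instructions per cycle only for long runs of instructions; in practice the limit also depends on finding the right mix of integer and floating-point work for the execution units.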

A brief description of all SPARC and UltraSPARC processors, with related figures, is given on the website:

MIPS Processors

MIPS, the Stanford chip, slightly modified from the Berkeley RISC chip, was initiated by Hennessy, who subsequently formed MIPS Computer Systems (later a division of Silicon Graphics) in 1984 to introduce MIPS I as the first member of the MIPS RX000 series of microprocessors. The 32-bit MIPS R2000 was launched in 1985, followed in 1988 by the architecturally identical R3000, which differed only in speed and price, and then by the 64-bit R4000 in 1991. Later members of the same series, such as the R10000 announced in 1994, add numerous architectural extensions and far more complex instruction pipelines. Subsequently, in 1999, the 32-bit MIPS32 and the 64-bit MIPS64 were released.

The major components of this single-IC microprocessor family include a register file of 32 GPRs, each 32 bits wide, and the processing logic to perform the basic fixed-point arithmetic/logic functions on 32-bit operands. Floating-point operations are executed by an on-chip or off-chip FPU, also supported by an optional floating-point co-processor conforming to the IEEE 754 standard. MIPS does not earmark any of the registers in the chip for special purposes. As a result, local/global variables and input/output parameters can all be placed in any of the registers for even faster execution. A unit called the system control co-processor provides communication with external memory (both cache and main memory), together with automatic address translation logic supported by special-purpose arithmetic circuits that perform the address computations required to handle a virtual memory system. MIPS uses a deeper instruction pipeline of five stages, compared with the four stages used in its contemporary counterpart, the ordinary SPARC. Each instruction in the instruction set is 32 bits long and word-aligned. The address space is 2^32 bytes (4 gigabytes) and byte addressable, and the upper 2 gigabytes of the address space are reserved for the operating system. The memory can be configured as either big-endian or little-endian by selecting a pin on the chip, thereby satisfying users in both camps. The machine supports paged virtual memory management. In general, the MIPS architecture has been built using trade-offs that place more of the burden on software, which ultimately created problems for software designers, particularly in the design of optimizing compilers. This was done for the sake of making the hardware simpler and faster to realize enhanced performance.
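The addressing properties described above (a 4-gigabyte byte-addressable space with its upper half reserved for the operating system, word-aligned 32-bit instructions, and selectable byte order) can be made concrete with a few helper functions. This is a minimal sketch: the function names are hypothetical, and the boundary 0x80000000 is simply the midpoint of the 4-gigabyte space implied by the text.

```python
import struct

ADDRESS_SPACE_BYTES = 2 ** 32   # 4 gigabytes, byte addressable
OS_BASE = 0x8000_0000           # start of the upper 2 gigabytes


def is_reserved_for_os(address: int) -> bool:
    """True if the address falls in the upper half of the 32-bit address space."""
    if not 0 <= address < ADDRESS_SPACE_BYTES:
        raise ValueError("address does not fit in 32 bits")
    return address >= OS_BASE


def is_word_aligned(address: int) -> bool:
    """Instructions are 32 bits long, so they must start on a multiple of 4 bytes."""
    return address % 4 == 0


def word_bytes(value: int, big_endian: bool) -> bytes:
    """Byte sequence of a 32-bit word in memory, lowest address first."""
    return struct.pack(">I" if big_endian else "<I", value)


if __name__ == "__main__":
    print(word_bytes(0x12345678, big_endian=True).hex())    # most significant byte first
    print(word_bytes(0x12345678, big_endian=False).hex())   # least significant byte first
    print(is_reserved_for_os(0x7FFF_FFFC), is_reserved_for_os(0x8000_0000))
```

The two byte orders store the same 32-bit value; only the order of the bytes in memory changes, which is why a single configuration pin is enough to serve both conventions.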

A brief description of MIPS processors, with related figures, is given on the website: http://
