Intel X-86 Addressing Modes: IA-32 and IA-64
The X-86 is equipped with a variety of the addressing modes described earlier, which aim to provide efficient execution of high-level language programs. We consider here only the generic addressing modes of the X-86 family of processors. The 64-bit operation of the Pentium 4 and Core 2 has, however, modified and extended a few aspects of each of these modes. In these processors, registers of different widths (16 bits, 32 bits, and 64 bits) are available, each entrusted with specific responsibilities in providing the different addressing modes. Figure 3.16 illustrates a representative generalized scheme of the address calculation mechanism involved in the different addressing modes. A set of segment registers is employed to determine the segment currently being referenced. Each segment register holds an index into the segment descriptor table, which contains the starting address of the corresponding segment. Associated with each such user-visible segment register is a segment descriptor register (given the same name in Figure 3.16 to indicate the correspondence), not visible to the programmer, which records the access rights for the segment, the length (limit) of the segment, and the starting address of the segment. Altogether, six such segment registers are available, but which one is used for a particular reference depends on the context of execution and the instruction. In addition, two other registers, namely the base register and the index register, may be used to build the effective address (EA) of the operand.
Intel provides all the generic addressing modes already discussed, along with other useful modes of its own, using dedicated and general-purpose registers of different types and lengths, as befits its CPU organisation. Some of the most useful addressing modes included are immediate mode, register mode, register direct mode, register indirect mode, displacement mode, base with displacement (register relative) mode, base-plus-index mode, base relative-plus-index mode (base with index and displacement), scaled-index mode, scaled-index with displacement mode, based scaled-index with displacement mode, relative mode, and RIP-relative mode, among many others.
A schematic representation of addressing mode calculation mechanism (Intel processor).
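The address calculation scheme outlined above can be illustrated with a minimal sketch. It assumes the commonly documented X-86 form in which the effective address is the sum of a base register, an index register multiplied by a scale factor of 1, 2, 4, or 8, and a displacement, after which the starting address of the segment is added. The names SegmentDescriptor, effective_address, and linear_address are hypothetical, chosen only for illustration, and the limit check is deliberately simplified (access rights are ignored).

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    """Hypothetical stand-in for a segment descriptor register."""
    base: int   # starting address of the segment
    limit: int  # highest valid offset within the segment


def effective_address(base=0, index=0, scale=1, displacement=0, bits=32):
    """EA = base + index * scale + displacement, wrapped to the address width."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4, or 8")
    mask = (1 << bits) - 1
    return (base + index * scale + displacement) & mask


def linear_address(segment, ea):
    """Add the segment's starting address to the EA after a limit check."""
    if ea > segment.limit:
        raise ValueError("offset exceeds the segment limit")
    return segment.base + ea


# Based scaled-index with displacement: element 3 of an array of 4-byte items
# that starts 8 bytes past the address held in the base register.
data_segment = SegmentDescriptor(base=0x400000, limit=0xFFFF)
ea = effective_address(base=0x1000, index=3, scale=4, displacement=8)
print(hex(ea), hex(linear_address(data_segment, ea)))
```

Leaving out some of the terms gives the simpler modes in the list: a displacement alone corresponds to displacement mode, a base alone to register indirect mode, and a base plus a displacement to base with displacement mode.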
All these generic addressing modes available with the advanced X-86 processors, including the Pentium and its successors, can also be described in tabular form.
A brief description of all these addressing modes, their working, and their usefulness, with examples and a tabular summary (Table 3.1), is given on the website: http://routledge.com/9780367255732.
The normal approach to CPU design following the Von Neumann concept includes one or more registers (both hardware and software) that support CPU operations to realize certain predefined objectives, and as such, the organisation of these registers plays a decisive role in the design of the CPU. The number of registers available in a CPU, their types, the way they are used while the CPU executes programs, and other similar features together determine the class of CPU organisation, which can be broadly classified into two distinct categories, namely
Accumulator-based CPU (single accumulator organisation);
General register-organised CPU (Multiple register).
Accumulator-Based CPU (Single Accumulator Organisation)
If the CPU has only one hardware register (the accumulator) and all operations are performed with this implied accumulator register, the CPU is known as an accumulator-based CPU (single accumulator organisation). Intel, Motorola, Zilog, and many others used this approach in the early days.
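As a rough illustration of what an implied accumulator means for the instruction format, the following sketch interprets a handful of one-address instructions. The opcode names (LOAD, ADD, MUL, STORE) and the function run_accumulator are assumptions made for this example and do not correspond to any particular processor's instruction set.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single implied accumulator."""
    acc = 0
    for opcode, address in program:
        if opcode == "LOAD":
            acc = memory[address]          # ACC <- M[address]
        elif opcode == "ADD":
            acc = acc + memory[address]    # ACC <- ACC + M[address]
        elif opcode == "MUL":
            acc = acc * memory[address]    # ACC <- ACC * M[address]
        elif opcode == "STORE":
            memory[address] = acc          # M[address] <- ACC
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return acc


# X = (A + B) * C: each instruction names only one memory operand,
# because the accumulator is always the other operand and the destination.
memory = {"A": 2, "B": 3, "C": 4, "X": 0}
program = [("LOAD", "A"), ("ADD", "B"), ("MUL", "C"), ("STORE", "X")]
run_accumulator(program, memory)
print(memory["X"])  # 20
```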
A schematic block diagram of a small accumulator-based CPU is given on the website: http://routledge.com/9780367255732.
General Register-Organised CPU (Multiple Register)
Almost all processors (CPUs) of today, be they microprocessors or processors used in large mainframe systems, have a set of registers used as multiple processor registers. Since more than one hardware register is available in these CPUs, they are categorized as general-register organisation CPUs. In some processors of this class, each register is assigned a unique number 0, 1, 2, ..., n-1 for its identification; in others, each register is recognized by its own unique name. Most mainframe systems are realized with CPUs having general-register organisation. One representative system of this kind is the IBM 360/370 family of computers, in which there are 16 general-purpose hardware registers named R1, R2, R3, etc. (each of 32 bits and used for different purposes), four floating-point registers (each of 64 bits), and many other dedicated registers devoted to other specific purposes. Various types of instructions with numerous addressing modes are available in the instruction set, which makes this organisation highly versatile and immensely powerful. At the other end, almost all modern microprocessors, including the famous Motorola MC68000 series and the Intel X-86 family of processors, have exploited general-register organisation (Section 3.3). A rough descriptive overview of the register organisation in the microprocessors of these two families has already been presented at the beginning of this chapter and also on the website: http://routledge.com/9780367255732. A schematic block diagram of such a general-register organisation of CPU is depicted in Figure 3.3.
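The effect of having several numbered registers can be sketched with a small interpreter for three-address register instructions. This is only an illustration under assumed conventions: the instruction tuples, the opcode names, and the function run_register_machine are hypothetical and are not taken from the IBM 360/370, MC68000, or X-86 instruction sets.

```python
def run_register_machine(program, memory, n_registers=16):
    """Execute three-address instructions on registers numbered 0..n-1."""
    regs = [0] * n_registers
    for instruction in program:
        opcode, operands = instruction[0], instruction[1:]
        if opcode == "LOAD":
            rd, address = operands
            regs[rd] = memory[address]            # Rd <- M[address]
        elif opcode == "ADD":
            rd, rs1, rs2 = operands
            regs[rd] = regs[rs1] + regs[rs2]      # Rd <- Rs1 + Rs2
        elif opcode == "MUL":
            rd, rs1, rs2 = operands
            regs[rd] = regs[rs1] * regs[rs2]      # Rd <- Rs1 * Rs2
        elif opcode == "STORE":
            rs, address = operands
            memory[address] = regs[rs]            # M[address] <- Rs
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return regs


# X = (A + B) * (C + D): both partial sums stay in registers.
memory = {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0}
program = [
    ("LOAD", 1, "A"), ("LOAD", 2, "B"), ("ADD", 1, 1, 2),
    ("LOAD", 2, "C"), ("LOAD", 3, "D"), ("ADD", 2, 2, 3),
    ("MUL", 1, 1, 2), ("STORE", 1, "X"),
]
run_register_machine(program, memory)
print(memory["X"])  # 45
```

With a single accumulator, one of the two partial sums would have to be written to a temporary memory location before the multiplication; here no such memory traffic is needed.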
Motorola, a strong competitor of Intel that held a share of nearly 50% or even more of the microprocessor market over many years with its MC 68000 series, finally decided to shift its focus away from this line of business and concentrate on the then newer and most promising area of mobile communication. As a result, Motorola sold its microprocessor division to a company now called Freescale Semiconductor, Inc. Consequently, Intel, the only remaining giant of those days, continued from its IA-32 architecture and put further effort into developing more technology-enriched hardware to introduce the IA-64 architecture. Today it holds a major share of the desktop and notebook market, in addition to a limited share of mid-range workstations. That is why Intel products have become the most dominant ones in this field and deserve detailed discussion.
The Intel IA-32/IA-64 Architecture
The 32-bit Intel architecture, popularly known as IA-32, began with the introduction of the Intel 80386 microprocessor around 1985. Through progressively more powerful processors, up to the recent IA-64 Pentium and multicore series, over a period of some 30 years, Intel has implemented an evolution of the same basic instruction set architecture (IA-32) along with MMX, with continuous enhancement and modification befitting the more advanced, technology-enriched underlying hardware. The distinctive attributes found in IA-32, and later in IA-64 in the Pentium and multicore line of processors, are discussed throughout this book at the places relevant to the respective topics.
The register organisation of the member processors under Intel IA-32 and IA-64 has already been described in Section 3.3.3.
Stack-Organised CPU (A Stack Processor)
Recall that the number of registers present in a CPU and their internal organisation play a decisive role in the design of a register-based CPU, and also influence the corresponding machine instruction set design. A few alternatives to this approach were explored earlier; one of these gave rise to a different strategy in CPU design, commonly known as the stack-organised CPU. The main architectural feature of this concept is that the instructions have implicit operands held only in a stack data structure, specifically those located at the top of the stack (TOS), and the results of operations are always returned to the TOS. All read and write operations refer to the TOS only. A stack memory thus replaces the accumulator and the other CPU registers used for temporary data storage. Moreover, machine instructions with no operand addresses (zero-address instructions) are shorter, thereby saving both CPU execution time and memory space, which are vital criteria in CPU design. The earlier large mainframe computers B5500, B6500, and B6700 produced by Burroughs Corporation, and later the HP3000 from Hewlett-Packard, use this approach in their CPU design. A more recent example of this type of CPU design is the Sun picoJava microprocessor, intended for fast execution of compiled Java code.
The stack-organised CPU is used in computers with stack organisation (not stack orientation), which we have already discussed in detail in Section 3.4, including stack operations. It should be noted that stack-oriented operations are available in all CPUs, even those that do not belong to this class of stack-organised CPUs.
The stack is mostly located in memory, but its top few elements are held in hardware registers in the processor to avoid frequent memory accesses, because all temporary storage locations are now part of the stack (memory). As a result, the stack access time is drastically reduced, since most accesses involve only the top few elements, which are in registers, and therefore only register transfers within the processor are required. Computer systems of this category from different vendors use different numbers of registers for this purpose, which is essentially a trade-off and a critical issue in this CPU design. The assigned memory is partitioned into three separate segments, namely program, data, and stack, and a number of hardware registers are used as pointers to the program and data segments, as well as to handle and operate on them. Stack computers have a variety of instructions which, when executed, perform operations on the data occupying the top few locations of the stack. The results thus generated are also left on the stack. During execution, data are moved between the stack and the memory. A computer organised around a stack offers several advantages when compared to a multi-register-organised CPU. Apart from many others, the major features are as follows:
None of the Intel CPUs have stack addressing, but they do have special instructions PUSH and POP to put items on the stack and remove them, respectively. In contrast, all of the Motorola 68000 series have stack addressing using autoindexing.
There are some special features of a generalized stack-organised CPU that emphasize certain elegant procedures these machines usually follow to execute the arithmetic expressions using reverse Polish notation (RPN).
Expression Evaluation and Reverse Polish Notation
A stack organisation is very effective for evaluating arithmetic expressions. Under this organisation, the data-processing instructions do not need to contain addresses as they generally do in a conventional Von Neumann computer. For example, the ADD operation a + b on a stack-organised machine is specified by the following sequence of instructions; the stack manipulation involved is hidden from the programmer, who does not have to worry about it at all:
PUSH a: Loads the memory operand a into the TOS.
PUSH b: Loads the memory operand b into the TOS above a, causing a's location to become TOS - 1.
ADD: The top two words of the stack are popped into the ALU, where they are added. The sum is automatically pushed back onto the TOS with no further instruction needed, and the stack pointer (SP) is automatically adjusted.
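The three-instruction sequence above can be traced with a minimal sketch of a stack machine. The class StackMachine and its method names are hypothetical; a Python list stands in for the stack, with its last element playing the role of the TOS, so the stack pointer adjustment happens implicitly as the list grows and shrinks.

```python
class StackMachine:
    """Toy zero-address machine: ADD names no operands at all."""

    def __init__(self, memory):
        self.memory = memory
        self.stack = []  # last element is the TOS

    def push(self, address):
        # PUSH: copy a memory operand onto the TOS.
        self.stack.append(self.memory[address])

    def pop(self, address):
        # POP: remove the TOS and store it in memory.
        self.memory[address] = self.stack.pop()

    def add(self):
        # ADD: pop the top two words, add them, push the sum back.
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)


machine = StackMachine({"a": 5, "b": 7, "sum": 0})
machine.push("a")   # stack: [5]
machine.push("b")   # stack: [5, 7]; a is now at TOS - 1
machine.add()       # stack: [12]
machine.pop("sum")  # stack: []
print(machine.memory["sum"])  # 12
```

Only PUSH and POP carry a memory address; the arithmetic instruction itself needs none, which is what makes it a zero-address instruction.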
Mathematical formulas are commonly expressed in what is known as infix notation where a binary operation appears between the operands (e.g. u+v). The arithmetic expressions can be represented in prefix notation where the operator is placed before the operands (e.g. +uv). This representation is often referred to as Polish notation. The postfix notation, referred to as reverse Polish notation (RPN), places the operator after the operands (e.g. uv+). It is to be noted that, regardless of the complexity of an expression, no parentheses are ever required while using RPN. This notation is ideal for evaluating arithmetic and other expressions on a computer which is stack-organised. The expression consists of n symbols, where each one is either an operand (variable or constant) or an operator. The procedure consists of first converting the arithmetic expression into its equivalent RPN using a suitable algorithm. The expression in RPN thus generated will then finally be evaluated by another algorithm using a stack.
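The two-step procedure described above (conversion to RPN, then evaluation using a stack) can be sketched as follows. The text does not name a specific conversion algorithm; this sketch uses the well-known shunting-yard method as one suitable choice, restricted to the left-associative operators +, -, *, and /, and it assumes a well-formed expression that has already been split into tokens. The function names to_rpn and eval_rpn are illustrative only.

```python
import operator

# Binary operators with their precedence and the function that applies them.
OPERATORS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def to_rpn(tokens):
    """Convert a list of infix tokens to RPN (shunting-yard, left-associative)."""
    output, pending = [], []
    for token in tokens:
        if token in OPERATORS:
            # Emit pending operators of equal or higher precedence first.
            while (pending and pending[-1] != "("
                   and OPERATORS[pending[-1]][0] >= OPERATORS[token][0]):
                output.append(pending.pop())
            pending.append(token)
        elif token == "(":
            pending.append(token)
        elif token == ")":
            while pending[-1] != "(":
                output.append(pending.pop())
            pending.pop()  # discard the matching "("
        else:
            output.append(token)  # operand
    while pending:
        output.append(pending.pop())
    return output


def eval_rpn(rpn, values):
    """Evaluate an RPN token list using a stack, as a stack machine would."""
    stack = []
    for token in rpn:
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token][1](left, right))
        else:
            stack.append(values[token])
    return stack.pop()


rpn = to_rpn("( a + b ) * ( c - d )".split())
print(" ".join(rpn))  # a b + c d - *
print(eval_rpn(rpn, {"a": 2, "b": 3, "c": 7, "d": 4}))  # 15
```

The parentheses of the infix form disappear in the RPN output, and the evaluation loop never looks ahead: each operator simply acts on the top two stack entries.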
A brief description of a generic stack-organised CPU implementation, with a figure, as well as the implementation of RPN with algorithms and solved examples, is given on the website: http://routledge.com/9780367255732.
Stack-Organised Symbolic LISP Processor
A symbolic processor is a stack-organised machine that has been developed primarily for artificial intelligence (AI) applications. The machine architecture is divided into layers that allow the use of a pure stack model to design an overall simple instruction set, and the implementation is carried out with a simple stack-oriented machine. To make the operation faster, cache memory is used to implement the stack buffer, and temporary memories (scratchpad) are used to communicate with main memory. Most of the instructions are executed in one machine cycle, following the dominant RISC philosophy. Integer instructions fetch operands from the TOS and place them into the stack buffer and in scratchpad memory. Fixed-point additions are carried out in parallel. Floating-point operations are carried out by the tag processors. The system has been built primarily to execute LISP instructions.
A brief description of a generic LISP processor, with a figure, is given on the website: http://routledge.com/9780367255732.
The CPU is considered the chief resource in the computer system and consists of a control unit, an ALU, and a set of registers. The main task of the CPU is to fetch each active instruction from memory one after another, decode it, and then execute it. This sequence of actions, known as the instruction cycle, is the central theme of any kind of CPU operation. The different types of registers and their organisation inside a CPU have been described with illustrations from both the IA-32 and IA-64 architectures. The basic organisation of two different categories of CPU, based on the number of available registers, namely the accumulator-based CPU (a single hardware register) and the general-purpose register CPU (a set of general-purpose hardware registers organised in many different ways), has been discussed, along with a completely different type of CPU organisation known as the stack-organised CPU, which has also been implemented in large commercial computers. A representative real-life stack-organised processor architecture (the LISP processor) has been presented for a clear understanding of this subject. The generic instruction set, along with the MMX instructions (multimedia operations) of both IA-32 and IA-64, which determines the types of operations that a CPU can perform and is decided at the time of CPU design, has also been explained. The numerous addressing schemes used in the instructions of a CPU, which strongly influence the instruction execution time and the performance of the CPU as a whole, have been described. Generic addressing modes, including the important concepts of pointers and indexed addressing, as employed in today's dominant Intel X-86 processor family, including both IA-32 and IA-64, have been presented as real-life representative examples.
In fact, this chapter introduces the basic structure and different types of organisation of a generic CPU with relevant topics, setting aside for the time being its advanced forms, such as pipelined/superscalar architecture and multicore architecture, which are discussed in Chapter 8.
a. How long is a clock cycle?
b. What is the duration of a particular type of machine instruction consisting of four clock cycles?
i. Pointer variable
ii. Global variable. Explain why.
a. Using an accumulator-type computer with one-address instructions.
b. Using a general register-type computer with two-address instructions.
c. Using a general register-type computer with three-address instructions.
d. Using a stack-organised computer with zero-address instructions.
3.16 Explain why a given arithmetic expression (infix) needs to be converted to RPN for effective use of a stack organisation. Convert the following expression into RPN, and show the evaluation procedure in the stack-organised CPU.