Home Computer Science
Direct Memory Access (DMA) I/O
Table of Contents:
The problem still remains; interrupts at different stages are required: more for PIO and relatively less for highly efficient interrupt-driven I/O, and processing an interrupt done only by CPU is, however, expensive. In addition, for both the cases, all data transfers also involve CPU and must be routed only through the CPU. As a result, the costly CPU time is badly wasted that causes an adverse impact on CPU activity, and also limits the I/O transfer rate. A befitting mechanism is thus needed that would target to ultimately relieve and release the CPU to a large extent from this hazardous time-consuming I/O-related activities, and that can be accomplished by simply letting the peripheral devices themselves to directly manage the memory buses without the intervention of the CPU. This would definitely improve not only the speed of data transfer but also the overall performance of the system, as the CPU is now released to carry out its own useful work.
With the development of hardware technology, the I/O device (or its controller) can be equipped with the ability to directly transfer a block of information to or from main memory without CPU's intervention. This requires that the I/O device (or its controller) is capable of generating memory addresses and transferring data to or from the system bus: i.e. it then must be a bus master. The CPU is still responsible for initiating each block transfer, and the I/O device controller can then carry out the physical data transfer without further program execution by the CPU. The CPU and I/O controller interact only when the CPU must yield control of the system bus to the I/O controller in response to a request issued from the controller. This request is in the form of an interrupt, to be serviced by the CPU. The CPU is now involved to process the interrupt. After servicing the interrupt, the CPU is no longer there, and it can then go back to resume the execution of its previously ongoing executing program. The access of the bus is now transferred and is now rest with the controller (requesting device) which then completes the required number of cycles for the data transfer, and then again hands over the control of the bus back to CPU. This type of I/O capability is called direct memory access (DMA). For large volumes of data transfer, DMA technique is found much faster than if it is carried out by the CPU, and is observed to be adequately efficient. To implement DMA and interrupt facilities, most computers may then require the system's I/O interface to contain special DMA and interrupt control units.
Figure 5.5 shows a block diagram indicating how the DMA mechanism works.
Schematic diagram of DMA I/O.
The controller is normally provided with an interrupt capability, and in this situation, it sends interrupts to the CPU to signal the beginning and end of the data transfer. The logic necessary to control the DMA activities can easily be placed in a single integrated circuit. DMA controllers are available that can supervise DMA transfers involving several I/O devices, each with a different priority of access to the system bus.
When the CPU intends to receive or send (read or write) a block of data, it issues a request to the DMA module with certain information, and the DMA transfer operation then proceeds as follows:
Intel 8257 chip supports four DMA channels, by which four peripheral devices can independently request for DMA data transfer at a time. The DMA controller has 8-bit internal data buffer, a read/write unit, a control unit, a priority-resolving unit along with a set of registers. Intel 8237 critically differs architecturally from 8257 and provides a better performance compared to 8257. It is an advanced programmable DMA controller capable of transferring a byte or a bulk of data between system memory and peripherals in either direction. Memory-to-memory data transfer facility is also available in this chip. This DMA controller can be interfaced to the processor family 80x86 with DRAM memory. Similar to 8257, the 8237 also supports four independent DMA channels (numbered 0, 1, 2, and 3), which may be expanded to any number, by cascading more number of 8237.
Here, each channel can be programmed independently, and any one of the channels may be made individually active at any point of time. But the distinctive feature of this chip is that it provides many programmable controls and dynamic reconfigurability attributes, which eventually enhance the data transfer rate of the system remarkably.
Many CPUs like those of the MC 68000 series have no internal mechanism for resolving multiple DMA requests-, this must be done by external logic. The DMA controller 68450 chip contains four copies of the basic DMA controller logic that enables the 68450 to carry out a sequence of DMA block transfers without reference to the CPU. When the current data count reaches zero, a DMA channel that has been programmed for chained DMA transfer (as mentioned in Bullet 4) fetches the new value of DC and IOAR from a memory region (MR) that stores a set of DC-IOAR pairs. A special memory address register in every DMA channel holds the base address of MR.
DMA is subsequently accepted as a standard approach, commonly used in all personal computers, minicomputers, and mainframes for carrying out I/O activities.
Different Transfer Types
Under DMA control, data can be transferred in one of the several following ways:
Implementation Mechanisms: Different Approaches
The DMA mechanisms can be implemented in a variety of ways.
Single bus: DMA detached from 1/0.
Single bus: DMA-I/O integrated.
DMA-I/O with separate I/O bus.
I/O Processor (I/O Channels)
While the introduction of a DMA controller in the I/O nodule is a radical breakthrough, it is after all not able to totally freeing the CPU. Moreover, DMA sometimes uses many bus cycles at a time, as in the case of a disk I/O, and during these cycles, the CPU will have to wait for bus access (as DMA always has a higher bus priority) that summarily restricts the performance to attain the desired level. Further development is thus targeted in quest of an enhanced I/O module so that this modified I/O module could control the entire I/O operation on its own, setting the CPU totally aside. This type of I/O module that almost fully relieves the CPU from the burden of I/O execution is often referred to as an I/O channel. The final and ultimate approach is then to convert this I/O channel to a full-fledged processor so that the CPU can now be relieved almost totally. This is accomplished by including a local memory to this I/O channel so that it can manage a large set of different devices with minimal or almost no involvement of the CPU. This module then basically consists of a local memory attached with a specialized processor and includes I/O devices. It shows that this unit as a whole then itself becomes a stand-alone computer. An I/O module having this kind of architecture is known as an I/O processor (IOP). An IOP can perform several independent data transfers between main memory and one or more I/O devices without recourse to the CPU. Usually, an IOP is connected to the devices it controls, by a separate bus system called the I/O bus or I/O interface. It is not uncommon for larger systems to use small computers as IOPs which are primarily communication links between I/O devices and main memory, and hence the use of the term channel for IOP. The IOPs are also called peripheral processing units (PPUs) to emphasize their subsidiary roles with respect to the CPU.
A channel or IOP is essentially a dedicated computer with its own instruction set processor to independently carry out entire I/O operations along with other processing tasks, such as arithmetic, logic, branching, and code translation required mostly for I/O processing. The CPU only initiates an I/O transfer by instructing the I/O channel to execute a specific program available in main memory, and then the CPU goes off. This program will indicate the device or devices to be taken, the area or areas of memory for storage, the priority, and the different types of actions to be taken in case of certain error conditions. The I/O channel uses this information and executes the entire I/O data processing, while CPU is fully devoted to its own work in parallel. When I/O activity is over, the channel interrupts the CPU and sends only all related necessary information. Traditionally, the use of I/O channels has been observed to be associated with mainframe or large-scale system attached with a large number of peripherals
(disks and tape storage devices), which are used simultaneously by many different users in multitasking as well as in on-line transaction processing (OLTP) environment handling bulk volume of data. As the development of chip-based microprocessors has dramatically progressed, the use of I/O channels has now extended to minicomputers and even to microcomputers. However, the fully developed I/O channel is best studied on the mainframe system, and possibly the best-known example in this regard is the flagship IBM/370 system.
The IOP in the IBM 370 system is commonly called a channel. A typical computer system may have a number of channels, and each channel may be attached to one or more similar or different types of I/O devices through I/O modules. Three types of channels are in common use: a selector channel, a multiplexor channel, and a block multiplexor channel (a hybrid of features of both the multiplexor and selector channels). The interface being used from an I/O module (channel) to a device (i.e. a device controller along with a device) is either a serial or a parallel. Although a parallel interface is traditionally a common choice for high-speed devices, but with the emergence of next-generation advanced high-speed serial interfaces, parallel interfaces have eventually lost their inherent importance to a considerable extent, and hence, are found to be much less common. However, the I/O channel is best implemented in the mainframe system, and possibly the well-known example in this regard is the flagship IBM/370 system.
A brief detail of I/O channel along with its different types, and its implementation in IBM/370 system, is given with figure in the website: http://routledge.com/9780367255732.
I/O Processor (IOP) And Its Organisation
The I/O channel has been finally promoted to a full-fledged IOP using a mechanism (already described in the "Introduction", Section 5.3.4) to make the CPU almost totally free from any I/O activities. The handshaking between the CPU and IOP at the time of establishing communications may take different forms depending mostly on the particular configuration of the computer being used. However, the memory unit in most cases acts as a mediator providing message centre (input-output communication region (IOCR)) facilities where each processor leaves some information for the IOP to follow. This is one form of indirect handshaking. The direct handshaking between CPU and IOP is generally done through dedicated control lines. Standard DMA or bus grant/acknowledge lines are also used for arbitration of the system bus between these two processors. However, Figure 5.9 illustrates here a schematic block diagram of a representative system containing an IOP. A sequence of steps is then required to be followed at the time of CPU-IOP interaction and communication for needed information exchange.
During IOP operation, the CPU is free and executes its own tasks with other programs. There may be a situation when the CPU and the IOP both compete with each other to get simultaneous memory access, and hence, the IOP is often restricted to have only a limited number of devices so that the number of memory accesses can be minimized. In the case of operation of a slower device, this may even lead to a situation of memory-access saturation, since I/O operations use DMA, and the CPU may then have to wait during this transfer, which may cause a notable degradation in CPU performance.
All the Intel CPUs have explicit I/O instructions to read or write bytes, words, or longs. These instructions specify the I/O port number desired, either directly as a field with the
Block diagram of a representative system containing an IOP.
instruction or indirectly using the register DX to hold it. In addition, of course, DMA chips (Intel 8257/8237) are frequently used to relieve the CPU from handling I/O burden. None of the Motorola chips have I/O instructions. It is expected that the I/O device register will be addressed via memory mapping. Here too, DMA is widely used.
A brief detail of IOP along with its working is given with figure in the website: http://routledge.com/9780367255732.