Home Computer Science
Translation Lookaside Buffer (TLB)
Every virtual memory reference can cause two physical memory access: one to fetch the appropriate page table entry, and another one to fetch the desired data after receiving the information from page table. To overcome (reduce) this double memory access time, most virtual memory schemes make use of an additional special fast cache to store the original page tables. This special cache is usually called a Translation Lookaside Buffer (TLB), The TLB works in the same way as memory cache, and contains those page table entries that have been most recently used, or likely to be referenced obeying the principle of locality in memory references. It is expected that pages belonging to the same working set will be directly translated using the TLB entries, thereby guaranteeing significant improvement in performance. The TLB can also be implemented with a special associative memory, or can use part of the cache memory (see virtual address cache in cache addressing). In general, the page table is placed in RAM. The reader is here cautioned not to confuse cache memory-main memory handshaking with main memory-virtual memory handshaking; they belong to two different levels in memory hierarchy. Figure 4.14 shows a scheme of address translation mechanism with the use of TLB. To facilitate the mapping, each virtual address is considered here to be represented by three fields; (i) the leftmost field holds the virtual page number, (ii) the middle field represents the cache block number, and (iii) the rightmost field is the word address within the block.
The virtual page number is used as a key to search through the TLB for match. In case of a match (hit) in the TLB, the page frame number is retrieved from the matched page entry. The cache block and word address are copied directly to form the physical address. In case there is a miss, i.e. the match cannot be found in TLB, the hashing function is used to generate the pointer to identify one of the page tables or a page table entry (in case of one page table) where the desired page frame can be obtained. In spite of that, if the requested page cannot be found in the page table, a page fault is declared. A page fault occurs when the referenced page is not available (resident) in the main memory, the running process hence, is suspended. An interrupt (page-fault interrupt) is issued. A process switch is made to another ready-to-run process while the missing page as requested is transferred from the
Translation lookaside buffer and cache operation.
disk unit to the physical (main) memory (interrupt servicing). The suspended process of address translation then resumes computation. It should be noted that both TLB entries and page table entries need to be dynamically updated to reflect the latest memory reference history, and that is maintained by using a few additional bits in the translation maps (in TLB, and memory page table).
The mapping scheme as described can be extended with multiple level page table (along with multiple page tables at each level). Multi-level paging requires multiple memory references to access a sequence of page tables to generate the desired physical address, and hence, consumes a relatively longer time, but can accommodate expanded memory space (larger memory), and can provide sophisticated protections of page accesses at different levels of the memory hierarchy to safeguard the users' interest. It is once again recalled that the virtual memory and its related aspects are created by the computer systems, but those are totally managed by the operating system using different policies, and hence are outside the domain of the organisation and architecture of the computer systems. Still, some topics related to their management are described here only for the sake of completeness of this discussion.
It has been observed that most programs do not reference their logical address space uniformly, rather randomly access only a small number of pages forming a set, and the changes in this set then occur slowly with time. The pages whose addresses are referenced during the time interval from (f - f,) to t denoted by (f - tv t) constitute the working set w (f, f ,). An alternative definition proposed by Denning is that at any instant time, f, there exists a set consisting of all pages used by the к most recent memory references.
This is defined as working set w(k, t). This к is sometimes called the window size of the working set. Only the working sets of active processes are resident in main memory. It has been found that k+1 most recent references must have used all the pages used by the к most recent references, and possibly others. If the working set is not properly maintained, then page faults are generated frequently and continuously, resulting in what is called thrashing. This makes a constant to and fro journey of pages from virtual memory causing unnecessary additional overhead (to service frequent page-fault interrupts) resulting in severe performance degradation.
If some pages are loaded before letting the processes run to avoid such page faults is called prepaging. The argument against this approach to bring such pages into memory in advance is that if the program is in transition between one working set to another, and has not arrived at a stable situation, the pages that are already brought in advance, are not really going to be referenced, thereby causing a huge waste already incurred in time and costly memory space. The alternative of this approach is what is called demand paging scheme in which the pages that are actually referenced will only be loaded, and no others. However, the relative merits of these two different approaches are still a debatable issue. But, the concept of working set itself ensures that if this set is maintained in the fastest level of memory M, (here it is main memory, in case of main memory-virtual memory handshaking), the number of references to can be made considerably greater than the number of references made to the other lower levels (virtual memory) of the memory hierarchy. This is also true at the time of cache memory-main memory handshaking.
A brief detail of the working set and its implications in handling pages is given in the following web site: http://routledge.com/9780367255732.
Demand Paging Systems
Paged memory management uses various types of memory allocation policy to operate a virtual memory. Amongst them, demand paging system is one of the most often used policy that properly matches the paging scheme principle. In analogy to the well-known demand feeding algorithm for babies; when the baby cries, you feed it (as opposed to feeding at regular intervals of day). Similarly, in demand paging, pages are brought in memory only when an actual request is issued for a page, and not in advance, i.e. the pages are brought into main memory only upon demand. This policy allows only pages (instead of processes) to be transferred between the main memory and the swap device (virtual memory). The idea of demand paging nicely goes with the concept of working set which is formed only with the pages as being referenced on demand made by the active processes. The major advantage of this method is that it offers the flexibility to dynamically accommodate a larger number of processes within a limited size of physical memory in multiprogramming environment. This, in turn, will substantially increase the system throughput with less overhead (e.g. page fault), and decreases the memory traffic load. Pages of the process not under demand will not involve in this game. This demand paging policy was first implemented in UNIX BSD 4.0 release. Later, UNIX System V and many other modern operating systems also employed this demand-paging scheme in their memory management.
Page Replacement Principles
Since the total number of available page frame is much smaller than the number of pages, the frames will eventually be mostly fully occupied. In order to accommodate a new page to resolve the page fault situation, one of the resident pages must be replaced. Memory management policies, in fact, include the preemptive allocation and deallocation of page frames to active processes, and also replace page frames whenever required. Page replacement refers to the process in which a resident page in main memory (page frame) is replaced by a new page transferred from the virtual memory. This replacement operates on a per-process basis. Different page replacement policies, however, have been suggested. The ultimate goal of a page replacement policy is thus to maximize the hit ratio or equivalently, to minimize the number of possible page faults so that the effective memory-access time can be reduced. A good policy should also take into account the program locality property, and will after all target a page frame to remove whose absence in the memory would have the smallest adverse effect on the running programs. The policy to be chosen, however, is often affected by the page size, and by the number of available page frames. The effectiveness of a policy mostly depends on the program behaviour, memory traffic patterns encountered, as well as on the type of paging scheme being implemented.
When the removal of a page is ultimately required to make room for a new page to be brought in, the operating system will always try to choose a clean page to remove, i.e. a page frame that has not been changed, its disk copy is already up-to-date, no rewrite is, hence, necessary, and the new page will then simply overwrite the page frame to be evicted. But, if the page frame selected to be removed has been modified (dirty page) while in main memory, it must be written back to the disk to bring the disk copy to latest status, which thereby consumes additional time. Thus, an extra bit (dirty bit) is then needed to be attached with every page entry in the page table to indicate whether the page is clean or dirty, and this information is used to select a page at the time of page removal.
It is always desirable to maintain in memory a high ratio of clean pages to dirty pages to minimize the rewrite time at the next page fault. Some operating systems take a chance that whenever the disk is idle, a dirty page is selected by suitable prediction and is picked up, and it is then copied to the disk using DMA or data channels (IOP) while the CPU is busy with its other activities. Such copying never changes its memory contents. Writes to disk with the intention of making dirty pages clean are called sneaky writes. It is still a debatable issue whether the sneaky write is really a profitable one if the administrative overhead attached with it is also taken into account.
Performance of a page removal algorithm is usually measured in terms of the success function S, the number of success, or the failure function F, the number of failures. If P is the number of page references in the page trace, then S + F = P. Another measure of performance is the success frequency functions S = S/P, or failure frequency function f = F/P, hence, /= 1 - s. The effectiveness of a page removal algorithm is measured in terms of how often the system will access virtual memory to fetch a page, it had replaced. If the replacement algorithm is a good one, the system would not have to go very often.