
Virtual Memory
Mini-Lecture

Bob Brown
Computer Science Department
Southern Polytechnic State University
Copyright © 1999 by Bob Brown

Scarce, Expensive Memory

In the earliest computers, memory was a scarce and expensive resource. Computers with 2K words were not uncommon. Programmers spent significant time making their programs fit into the available memory. Careful selection of algorithms and really tight coding helped, but sometimes a program was just too big to fit. In that case, programmers used overlays, the technique of dividing a program into pieces that were used at different times during the course of executing the program.

Use of Overlays

Consider an assembler program, the language translator that turns symbolic programs in assembly language into binary programs. Earlier we learned that assemblers make two passes through the source code. In the first pass the size of each item is determined, the location counter is stepped through the program, and symbols and their locations are saved in a symbol table. On the second pass, binary values for instructions and data are generated and the addresses are looked up in the symbol table. Pass 2 generates the binary program.

The memory map of the assembler might look like the first part of the figure, with routines common to both passes, an area for the symbol table, and the code for the two passes. If that program were too big to fit in memory, the programmer might write it so that when it starts its memory map looks like the second part of the figure. The common routines, symbol table, and pass 1 code are in memory, but the pass 2 code is not. At the end of pass 1, when that code is no longer needed, the pass 2 code is loaded in its place. This technique is called overlaying. The third part of the figure shows the memory map after the pass 2 code overlays the pass 1 code.

Dividing a program into overlays and loading them was the responsibility of the programmer. It was a difficult and error-prone task. In 1961, a group of researchers in Manchester, England, set out to find a way to automate the process of managing overlays.

The key idea of the Manchester group, and the idea that makes virtual memory possible, is to separate the concept of the address space from the physical memory locations.

To understand what that means, we must examine the way we think about address space and memory in a computer without virtual memory. Consider a computer capable of generating 32-bit addresses. Such a computer has an address space of four gigabytes. However, it would be unusual for a computer actually to have 4 GB of memory. It would be much more likely for the computer to have from a few megabytes to a few hundred megabytes of actual memory.

Useful and Useless Address Spaces

The next figure shows the address space of a computer with 32-bit addresses and 16 MB of actual memory. The address space is still 4 GB, but only 16 MB of it is useful. If we have a program that requires more than 16 MB, it can't be run on this computer.

Remember, the idea that the Manchester researchers had was to separate the concepts of address space and physical memory. Virtual memory allows programmers to use the entire address space defined by the size of an address, regardless of the amount of physical memory present.

Virtual Memory

In a virtual memory system, the program and data are stored in secondary storage (disk) and only the parts in use at any given moment are brought into real memory. This works because of locality of reference. When we discussed cache systems, we described temporal locality of reference and spatial locality of reference. It is in the nature of von Neumann architecture computer systems that if a word from memory has been used recently, it is likely to be used again in the near future. Similarly, if a word has been used recently, words near it are likely to be used in the relatively near future.

Virtual memory is implemented by dividing the portion of the address space used by a program into fixed-size pieces called pages. The size of the pages is a power of two, typically from 512 bytes to about 64 K bytes. We will use a page size of 4 K in the example, and discuss the trade-offs in selecting a page size later.
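
Because the page size is a power of two, the low-order bits of an address select a byte within a page. Here is a quick check in Python (a sketch of ours, not part of the lecture, using the 4 K example page size):

    PAGE_SIZE = 4096                              # the 4 K example page size
    assert PAGE_SIZE & (PAGE_SIZE - 1) == 0       # confirm it is a power of two
    OFFSET_BITS = PAGE_SIZE.bit_length() - 1      # bits that select a byte in a page
    print(OFFSET_BITS)                            # prints 12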

Real memory is divided into page frames which are the same size as the virtual pages. When a page is needed during the running of a program, it is copied into a page frame in real memory. The amount of real memory available is called the physical address space. The area available for the programmer to use (determined by the size of an address) is called the virtual address space.

The process of moving program pages to and from real memory is called paging. Paging is transparent to the writer of the program. Remember our two definitions: the physical address space is the amount of real memory available, and the virtual address space is the area available to the programmer, determined by the size of an address.

We have skipped over some details up to now, and one of them is the need for address translation, or mapping. Address translation is made necessary by two facts: the virtual address space is larger than the physical address space, and a page may be loaded into any available page frame, so a virtual address by itself tells us nothing about where the page actually resides. Address translation is accomplished by a memory management unit, or MMU. The MMU may be a part of the CPU or a separate unit. The MMU and the operating system work together to implement virtual memory.

Return to the example of a computer with a 32-bit address, 16 MB of real memory, and a virtual memory system that uses 4 K pages. The memory management process translates a 32-bit virtual address into a 24-bit physical address. This is done with the aid of a page table. If there are 32 bits in an address, and a page is 4 K, there are 2^20, or about one million, pages in the virtual address space. (How do you know that's true?) These pages are represented by a page table with 2^20 entries. Each entry contains a valid bit that indicates whether the page is currently in main memory, a dirty bit indicating whether the page has been modified, and a frame number pointing to a page frame in real memory. Since there are 4,096 4 K frames in a real memory of 16 M cells (why?), the frame number in our example page table will be 12 bits.
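
We can verify this arithmetic with a few lines of Python (a sketch using only the numbers from the example):

    VIRTUAL_BITS  = 32                   # bits in a virtual address
    PHYSICAL_BITS = 24                   # 16 MB of real memory
    PAGE_BITS     = 12                   # 4 K pages

    pages  = 2 ** (VIRTUAL_BITS - PAGE_BITS)     # entries the page table needs
    frames = 2 ** (PHYSICAL_BITS - PAGE_BITS)    # page frames in real memory
    print(pages, frames)                 # 1048576 4096, so 12 bits per frame number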

Consider the figure below. The CPU has generated a 32-bit address as shown in the upper right of the figure. As far as the CPU is concerned, this is just a 32-bit number. For purposes of address translation, it is useful to think of the address as a 20-bit page number and a 12-bit "offset" that identifies a cell within a page. We will think of a physical address as a page frame number and an offset. Since the pages and page frames are the same size, the offset from the virtual address can simply be copied into the offset part of the physical address.

However, we have some work to do to translate the 20-bit page number into a 12-bit page frame number. In the example, the virtual page number is three. We look in entry three of the page table and find that the valid bit is a one, indicating that the page is already in real memory. We check the frame number from the page table and find that it is six. Virtual page three is stored in page frame six. We copy this number into the frame part of the physical address, and we have mapped a 32-bit virtual address onto a 24-bit physical address. The memory hardware can now retrieve the desired cell.
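
In software, the whole translation might be sketched as follows. The dictionary page table and the entry layout are our own illustration; a real MMU performs this lookup in hardware:

    PAGE_BITS   = 12
    OFFSET_MASK = (1 << PAGE_BITS) - 1

    # Each entry: (valid bit, dirty bit, frame number).  Entry three says
    # virtual page three is resident in page frame six.
    page_table = {3: (1, 0, 6)}

    def translate(virtual_address):
        """Map a 32-bit virtual address onto a 24-bit physical address."""
        page   = virtual_address >> PAGE_BITS     # upper 20 bits: page number
        offset = virtual_address & OFFSET_MASK    # lower 12 bits, copied unchanged
        valid, dirty, frame = page_table.get(page, (0, 0, 0))
        if not valid:
            raise LookupError("page fault")       # handled by the OS; see below
        return (frame << PAGE_BITS) | offset

    print(hex(translate(0x3ABC)))                 # page 3 -> frame 6: prints 0x6abc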

In the example, we assumed that the page containing the needed cell was already in main memory. That doesn't always happen. A reference to a cell that's not in main memory is called a page fault. In case of a page fault, the MMU signals an interrupt upon finding a zero valid bit. The operating system then reads the page from disk into a free page frame, updates the page table, and restarts the instruction that caused the page fault.

The process of reading pages in from disk only when they are needed is called demand paging. Pages are not loaded into page frames until there is a demand for them.
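
Extending the sketch above, a demand-paging fault handler might look like this in outline. The names are invented and the disk read is only a placeholder:

    free_frames = [7, 8, 9]                       # frames holding no page yet

    def access(page_table, page):
        """Load a page into a frame only on first reference (demand paging)."""
        entry = page_table.get(page)
        if entry is None or entry[0] == 0:        # valid bit is zero: page fault
            frame = free_frames.pop()             # assume a free frame exists
            # ...read the page from disk into the frame here...
            page_table[page] = (1, 0, frame)      # now resident and clean
        return page_table[page][2]                # the frame number

    table = {}
    print(access(table, 3))                       # first touch faults: prints 9
    print(access(table, 3))                       # already resident: prints 9 again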

A program typically starts with none of its pages in real memory. The first reference causes a page fault. Soon, however, all the pages needed for a given part of the program are in memory. This set of pages is called the working set. The composition of the working set changes over time as the program runs.

As long as the working set of a program is smaller than the available physical memory, the program runs nearly as fast as it would if it had free access to enough real memory for the entire program. If there is not enough real memory to hold the working set, page faults occur frequently and the CPU spends more time moving pages around than it does running the program. This condition is called thrashing.

When There's No Free Frame -- Page Replacement Policy

When a page fault occurs and there's no free page frame, the operating system must make room for the new page by replacing a page already in main memory. The designer of a virtual memory system must decide on a page replacement policy to determine which page will get evicted from its page frame when there's no free frame.

We rejected LRU (least recently used) as a replacement policy for cache because of the amount of time consumed by bookkeeping. Cache operations happen very fast. In virtual memory, things happen more slowly, and it's possible we can afford the effort required for LRU.
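
A sketch of that bookkeeping in Python (our own illustration; notice that the bookkeeping runs on every single reference):

    from collections import OrderedDict

    class LRUFrames:
        """Track resident pages; evict the least recently used when full."""
        def __init__(self, num_frames):
            self.num_frames = num_frames
            self.resident = OrderedDict()         # pages, least recent first

        def reference(self, page):
            if page in self.resident:
                self.resident.move_to_end(page)   # bookkeeping on every reference
                return None                       # hit: nothing evicted
            victim = None
            if len(self.resident) >= self.num_frames:
                victim, _ = self.resident.popitem(last=False)  # evict the LRU page
            self.resident[page] = True
            return victim                         # the evicted page, if any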

An alternative is FIFO, or first-in first-out. The oldest page is evicted from its page frame, regardless of when it was last used. The advantage of FIFO is that bookkeeping only has to happen when a new page is loaded, and not every time a page is referenced.
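
The FIFO version below touches its queue only when a page is loaded, which is what makes the bookkeeping so much cheaper (again, our own sketch):

    from collections import deque

    class FIFOFrames:
        """Evict the page resident longest, however recently it was used."""
        def __init__(self, num_frames):
            self.num_frames = num_frames
            self.queue = deque()                  # load order, oldest on the left

        def reference(self, page):
            if page in self.queue:
                return None                       # hit: no bookkeeping at all
            victim = None
            if len(self.queue) >= self.num_frames:
                victim = self.queue.popleft()     # evict the oldest resident page
            self.queue.append(page)               # bookkeeping only on a load
            return victim                         # the evicted page, if any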

How Big Is A Page?

Earlier we glossed over the problem of how big to make a page. No matter what we choose for a page size, most programs won't be an exact multiple of the page size. On the average, half of the last page in a program will be wasted. Since the unused bytes in a page occupy space in a page frame when the page is loaded, we would like to minimize the waste, and this drives us toward smaller page sizes. Similarly, we find that the first and last pages in each part of the working set generate some waste, driving us to smaller pages. Smaller pages waste less main memory.

However, each page requires a page table entry. Smaller pages mean more pages, and so more page table entries. Further, larger pages use I/O more effectively than smaller pages. As with much of the rest of computer architecture, the designer is faced with a set of trade-offs.
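
A rough comparison makes the trade-off concrete. The 10 MB program size and the 4-byte page table entry below are made-up numbers for illustration:

    PROGRAM_SIZE = 10 * 2**20                     # a hypothetical 10 MB program
    ENTRY_SIZE   = 4                              # assume 4 bytes per table entry

    for page in (512, 4096, 65536):
        wasted  = page // 2                       # half the last page, on average
        entries = -(-PROGRAM_SIZE // page)        # pages needed (ceiling division)
        print(f"{page:6}-byte pages: ~{wasted:5} bytes wasted, "
              f"{entries:6} entries, {entries * ENTRY_SIZE:7} bytes of table")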

Originally published: 2000-10-28


Last updated: 2014-10-02 7:17