
Address Translation

Example Machine Memory Layout: System/161

  • System/161 emulates a 32-bit MIPS architecture.

  • Addresses are 32 bits wide: from 0x0 to 0xFFFFFFFF.

This MIPS architecture defines four address regions:
  • 0x0–0x7FFFFFFF: process virtual addresses. Accessible to user processes, translated by the kernel. 2 GB.

  • 0x80000000–0x9FFFFFFF: kernel direct-mapped addresses. Only accessible to the kernel, translated by subtracting 0x80000000. 512 MB. Cached.

  • 0xA0000000–0xBFFFFFFF: kernel direct-mapped addresses. Only accessible to the kernel, translated by subtracting 0xA0000000. 512 MB. Uncached.

  • 0xC0000000–0xFFFFFFFF: kernel virtual addresses. Only accessible to the kernel, translated by the kernel. 1 GB.
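Translation for the two direct-mapped kernel segments is pure arithmetic: subtract the segment base. A minimal C sketch of the idea (the constant and function names here are my own; an actual kernel's macros may differ):

```c
#include <stdint.h>

#define MIPS_KSEG0 0x80000000u  /* kernel direct-mapped, cached   */
#define MIPS_KSEG1 0xA0000000u  /* kernel direct-mapped, uncached */

/* Direct-mapped translation: subtract the segment base.
 * No page tables or kernel data structures are consulted. */
static uint32_t kseg0_to_paddr(uint32_t kvaddr) {
    return kvaddr - MIPS_KSEG0;
}

/* And the reverse: produce a cached kernel virtual address
 * for a given physical address. */
static uint32_t paddr_to_kseg0(uint32_t paddr) {
    return paddr + MIPS_KSEG0;
}
```

Because the mapping is fixed by the architecture, the kernel can use these addresses before it has set up any translation machinery at all.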

(Figure: MIPS address space layout.)

Mechanism v. Policy

  • We continue with the details of virtual address translation today.

However, it is important to note that both hardware and software are involved:
  • The hardware memory management unit (MMU) speeds up translation once the kernel has told it how to translate an address, or by following architectural conventions. The MMU is the mechanism.

  • The operating system memory management subsystem manages translation policies by telling the MMU what to do.

  • Goal: the system follows the policies established by the operating system while involving the operating system directly as rarely as possible.

Efficient Translation

Goal: almost every virtual address translation should be able to proceed without kernel assistance.

Why?
  • The kernel is too slow!

  • Recall: kernel sets policy, hardware provides the mechanism.

Explicit Translation

Process: "Dear kernel, I’d like to use virtual address 0x10000. Please tell me what physical address this maps to. KTHXBAI!"

Does this work?
  • No! Unsafe! We can’t allow processes to use physical addresses directly. All addresses must be translated.

Implicit Translation

  • Process: "Machine! Store to address 0x10000!"

  • MMU: "Where the heck is virtual address 0x10000 supposed to map to? Kernel…​help!"

  • (Exception.)

  • Kernel: "Machine, virtual address 0x10000 maps to physical address 0x567400."

  • MMU: "Thanks! Process: store completed!"

  • Process: "KTHXBAI."
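The exchange above can be sketched as a tiny simulation: the "MMU" consults a small mapping table, and on a miss it faults to the "kernel," which installs the translation so the access can be retried. Everything here is illustrative, including the table size, the handler name, and the physical address 0x567400 borrowed from the dialogue:

```c
#include <stdbool.h>
#include <stdint.h>

#define NMAPPINGS 8

/* Hypothetical miniature MMU mapping table, for illustration only. */
struct mapping { uint32_t vaddr, paddr; bool valid; };
static struct mapping mmu_table[NMAPPINGS];

/* The MMU either knows the translation or it doesn't. */
static bool mmu_translate(uint32_t vaddr, uint32_t *paddr) {
    for (int i = 0; i < NMAPPINGS; i++) {
        if (mmu_table[i].valid && mmu_table[i].vaddr == vaddr) {
            *paddr = mmu_table[i].paddr;
            return true;
        }
    }
    return false;  /* MMU doesn't know: raise an exception */
}

/* "Kernel" exception handler: install the missing mapping. */
static void kernel_fault_handler(uint32_t vaddr, uint32_t paddr) {
    for (int i = 0; i < NMAPPINGS; i++) {
        if (!mmu_table[i].valid) {
            mmu_table[i] = (struct mapping){ vaddr, paddr, true };
            return;
        }
    }
}

/* A process store: translate implicitly, faulting to the kernel on a miss. */
static uint32_t store(uint32_t vaddr) {
    uint32_t paddr;
    if (!mmu_translate(vaddr, &paddr)) {          /* exception       */
        kernel_fault_handler(vaddr, 0x567400u);   /* kernel resolves */
        mmu_translate(vaddr, &paddr);             /* retry succeeds  */
    }
    return paddr;
}
```

Note that the second access to the same address never involves the kernel: the MMU already has the mapping, which is exactly the efficiency goal stated above.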

Translation Example

(Figure: step-by-step address translation example.)

K.I.S.S.: Base and Bound

Simplest virtual address mapping approach.

  1. Assign each process a base physical address and bound.

  2. Check: Virtual Address is OK if Virtual Address < bound.

  3. Translate: Physical Address = Virtual Address + base
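The check-then-translate steps above fit in a few lines of C. This is a sketch, not any particular kernel's code; the structure and field names are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

/* One base-and-bound pair per process. */
struct process { uint32_t base, bound; };

/* Returns true and sets *paddr if vaddr passes the bounds check. */
static bool bb_translate(const struct process *p,
                         uint32_t vaddr, uint32_t *paddr) {
    if (vaddr >= p->bound) {
        return false;          /* protection: one comparison */
    }
    *paddr = vaddr + p->base;  /* translation: one addition  */
    return true;
}
```

The hardware analog is equally simple: two registers, one comparator, one adder.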

Base and Bounds: Example

(Figure: base and bounds translation example.)

Base and Bounds: Pros

  • Pro: simple! Hardware only needs to know base and bounds.

  • Pro: fast!

    • Protection: one comparison.

    • Translation: one addition.

Base and Bounds: Cons

  • Con: is this a good fit for our address space abstraction?

    • No! Address spaces encourage discontiguous allocation. Base and bounds allocation must be mostly contiguous; otherwise we lose memory to internal fragmentation.

  • Con: also significant chance of external fragmentation due to large contiguous allocations.

(Figure: base and bounds fragmentation example.)

K.I.Simplish.S.: Segmentation

One base and bounds isn’t a good fit for the address space abstraction.

But can we extend this idea?
  • Yes! Multiple bases and bounds per process. We call each a segment.

  • We can assign each logical region of the address space—code, data, heap, stack—to its own segment.

    • Each can be a separate size.

    • Each can have separate permissions.

Segmentation works as follows:

  1. Each segment has a start virtual address, base physical address, and bound.

  2. Check: Virtual Address is OK if it is inside some segment, i.e., for some segment:
    Segment Start ≤ V.A. < Segment Start + Segment Bound.

  3. Translate: For the segment that contains this virtual address:
    Physical Address = (V.A. - Segment Start) + Segment Base
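The segment lookup and translation described above can be sketched in C as follows (again illustrative, not any real kernel's structures):

```c
#include <stdbool.h>
#include <stdint.h>

/* Each segment: where it starts in the virtual address space,
 * where it lives in physical memory, and how large it is. */
struct segment { uint32_t start, base, bound; };

/* Find the segment containing vaddr (N comparisons), then translate
 * with one subtraction and one addition. */
static bool seg_translate(const struct segment *segs, int nsegs,
                          uint32_t vaddr, uint32_t *paddr) {
    for (int i = 0; i < nsegs; i++) {
        const struct segment *s = &segs[i];
        if (vaddr >= s->start && vaddr < s->start + s->bound) {
            *paddr = (vaddr - s->start) + s->base;
            return true;
        }
    }
    return false;  /* not in any segment: illegal access */
}
```

With only a handful of segments per process, hardware can perform all the comparisons in parallel.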

Segmentation: Example

(Figure: segmentation translation example.)

Segmentation: Pros

Have we found our ideal solution to the address translation challenge?

  • Pro: still fairly simple:

    • Protection (Segment Exists): N comparisons for N segments.

    • Translation: one addition. (Once segment located.)

  • Pro: can organize and protect regions of memory appropriately.

  • Pro: better fit for address spaces leading to less internal fragmentation.

Segmentation: Cons

  • Con: still requires that each segment be contiguous in memory!

  • Con: potential for external fragmentation due to segment contiguity.

Let’s Regroup

Ideally, what would we like?
  • Fast mapping from any virtual byte to any physical byte.

  • Operating system cannot do this. Can hardware help?

Translation Lookaside Buffer

  • Common systems trick: when something is too slow, throw a cache at it.

  • Translation Lookaside Buffers—or TLBs—typically use content-addressable memory or CAMs to quickly search for a cached virtual-physical translation.
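In software we can only model the CAM's parallel search with a loop, but the sketch below shows the logic: look for a cached virtual-to-physical entry, and report a miss if none matches. The entry format is illustrative; the size is borrowed from the MIPS R3000, which has a 64-entry TLB:

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64  /* e.g., the MIPS R3000 TLB has 64 entries */

struct tlb_entry { uint32_t vaddr, paddr; bool valid; };

/* A CAM compares the query against every entry simultaneously;
 * this software model loops, but hardware answers in one step. */
static bool tlb_lookup(const struct tlb_entry tlb[TLB_ENTRIES],
                       uint32_t vaddr, uint32_t *paddr) {
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vaddr == vaddr) {
            *paddr = tlb[i].paddr;
            return true;
        }
    }
    return false;  /* TLB miss: trap to the kernel to load the entry */
}
```

On a hit the kernel is never involved; on a miss the kernel refills the TLB, exactly the mechanism-versus-policy split described earlier.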

TLB Example

(Figure: TLB lookup example.)

What’s the Catch?

  • CAMs are limited in size. We cannot make them arbitrarily large.

So at this point:
  • Segments are too large and lead to internal fragmentation.

  • Mapping individual bytes would mean that the TLB would not be able to cache many entries and performance would suffer.

  • Is there a middle ground?


Created 3/9/2017
Updated 8/17/2017
Commit 4eceaab
Built 3/9/2017 @ 19:00 EDT