Today

  1. Finish long-winded, dull example from last time.

  2. Review.

  3. File handles.

  4. Process life cycle:

    • Birth: fork()

    • Change: exec()

    • Death: exit()

    • The Afterlife: wait()

$ cat announce.txt

  • Today is the add/drop deadline!

  • Recitations start this week and office hours continue.

  • A preliminary website is up! It has a link to the calendar, information about how to contact the course staff, and (most importantly): ASST0.

  • We will create your class accounts later today which will allow you to use the Discourse site and our private GitLab instance.

A Note on Grading

  • My goal is to help everyone "make an A."

  • If you are confused about anything, or have any doubts, email staff@ops-class.org and we will reply promptly.

A Note on Assignment Grading

  • This year we have removed many of the free points from the assignments, particularly the code reading questions.

  • The goal is not to demoralize anyone. The goal is to ensure that you are working on the parts of the assignments that are important.

  • However, we strongly advise you to complete the recommended questions and exercises.

Last Time

We discussed the process abstraction.
  • Unfortunately at this point we are discussing an abstraction (processes) built on other abstractions (threads, address spaces, files) that we haven’t discussed yet!

  • There is a certain circularity to operating system design but we had to break through at some point.

  • Bear with me: we will get there, and we will also keep returning to the examples we’ve already introduced.

  • Questions about material presented Friday?

Process Example: Firefox

Firefox has multiple threads. What are they doing?
  • Waiting for and processing interface events: mouse clicks, keyboard input, etc.

  • Redrawing the screen as necessary in response to user input, web page loading, etc.

  • Loading web pages, usually multiple parts in parallel to speed things up.

Firefox is using memory. For what?
  • Firefox.exe: the executable code of Firefox itself.

  • Shared libraries for web page parsing, security, etc.

  • Stacks storing local variables for running threads.

  • A heap storing dynamically-allocated memory.

Firefox has files open. Why?
  • Configuration files.

  • Fonts.

$ top # more process information

top

Process Example: bash

  • Let’s do this for real using standard Linux system utilities.

Finding bash

finding bash
  • ps aux gives me all processes, then grep for the one I’m after.

  • …or, do it all in one shot using pgrep.

  • …or, if I know it’s running in my current session, a bare ps will do.

bash

process bash 3

$ ps -Lf # thread information

threads bash
What are:
  • UID: user the process is running as.

  • PID: process ID.

  • PPID: parent process ID.

  • PRI: scheduling priority.

  • SZ: size of the core image of the process (in pages under Linux's ps).

  • WCHAN: if the process is not running, description of what it is waiting on.

  • RSS: total amount of resident memory in use by the process (kB).

  • TIME: measure of the amount of time that the process has spent running.

If bash had multiple threads running, this view would list each of them. It shows only one, so bash is single-threaded.

bash

process bash 2

$ ps # process information

  • I wish we could see a process with multiple threads…​

$ ps -Lf # thread information

ps threads

$ pmap # memory mappings

pmap

bash

process bash 1

$ lsof # open files

lsof
True confessions: I cheated here.
  • /home/challen/.bashrc was not actually open when I ran this command.

  • bash didn’t have any interesting files open and I was embarrassed.

bashrc not open

Professor resorts to lying

Let’s imagine we caught bash during startup when it is reading its configuration parameters.

bash

process bash

Aside: the /proc/ file system

  • How do top, ps, pmap, lsof, and other process examination utilities gather information?

  • Linux reuses the file abstraction for this purpose.

procfilesystem

OK…​ Let’s Review

OS Abstraction Cheat Sheet

  • Threads save processor state.

  • Address spaces map the addresses used by processes (virtual addresses) to real memory addresses (physical addresses).

  • Files map offsets into a file to blocks on disk.

  • File-like objects look like files to a process but are not actually stored on disk and may not completely obey file semantics.

    • You can’t seek on a network socket or open certain network-mounted files.

  • Processes organize these other operating system abstractions.

Review: Abstractions

Abstractions simplify application design by:
  • hiding undesirable properties,

  • adding new capabilities, and

  • organizing information.

Review: Processes

Processes organize information about other abstractions and represent a single thing that the computer is "doing."

Processes contain:
  • one or more threads,

  • an address space, and

  • zero or more open file handles.

Review: Inter-Process Communication (IPC)

IPC mechanisms include:
  • files,

  • return codes,

  • pipes,

  • shared memory,

  • and signals.

Review: Protection

One major operating system goal is to protect processes from each other.

So Now: Questions About Processes?

Updated Process Model

  • For today’s material, being precise about how processes use files becomes important.

  • So let’s update our model. Here’s what we had last time:

process
  • And here’s today’s change:

process updated

File Handles

  • The file descriptor that processes receive from open() and pass to other file system system calls is just an int, an index into the process file table.

  • That int refers to a file handle object maintained by the kernel.

  • That file handle object contains a reference to a separate file object, also maintained by the kernel.

  • The file system then maps that file object to blocks on disk.

  • So three levels of indirection:

    • file descriptor → file handle.

    • file handle → file object.

    • file object → blocks on disk.

  • Why?

Are you just trying

to confuse me?

Sharing File State

The additional level of indirection allows certain pieces of state to be shared separately.
  • File descriptors are private to each process.

  • File handles are created privately by each open() call but are shared between parent and child after process creation.

    • File handles store the current file offset, or the position in the file that the next read will come from or write will go to. File handles can be deliberately shared between two processes.

  • File objects hold other file state and can be shared transparently between many processes.

Operating System Design Principles

  • Separate policy from mechanism.

  • Facilitate control or sharing by adding a level of indirection.

Process Creation

Where do processes come from?

fork() # create a new process

fork() is the UNIX system call that creates a new process.
  • fork() creates a new process that is a copy of the calling process.

  • After fork() we refer to the caller as the parent and the newly-created process as the child. This relationship enables certain capabilities.

process updated

fork() Semantics

  • Generally fork() tries to make an exact copy of the calling process.

    • Recent versions of UNIX have relaxed this requirement and there are now many flavors of fork() that copy different amounts of state and are suitable for different purposes.

    • For the purposes of this class, ignore them.

  • Threads are a notable exception!

fork() Against Threads

  • Single-threaded fork() has reliable semantics because the only thread the process had is the one that called fork().

    • So nothing else is happening while we complete the system call.

  • Multi-threaded fork() creates a host of problems that many systems choose to ignore.

    • Linux will only copy state for the thread that called fork().

Multi-Threaded fork()

There are two major problems with multi-threaded fork():
  1. Another thread could be blocked in the middle of doing something (uniprocessor systems), or

  2. another thread could be actually doing something (multiprocessor systems).

This ends up being a big mess. Let’s just copy the calling thread.

fork()

  1. fork() copies one thread: the caller.

  2. fork() copies the address space.

  3. fork() copies the process file table.


After fork()

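pid_t returnCode = fork();
if (returnCode == 0) {
  /* I am the child. */
} else if (returnCode > 0) {
  /* I am the parent; returnCode is the child's PID. */
} else {
  /* fork() failed and no child was created. */
}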
  • The child thread begins executing at the exact same point where its parent called fork().

    • With one exception: fork() returns twice, the PID to the parent and 0 to the child.

  • All contents of memory in the parent and child are identical.

  • Both child and parent have the same files open at the same position.

    • But since they share file handles, changes to the file offset made by either the parent or the child are visible to the other!

Calm Like A fork()bomb

What does this code do?

while (1) {
  fork();
}

Next Time

We continue the process lifecycle:
  • change (exec()),

  • death (exit()), and

  • heaven (wait()).

  • Heaven?

  • Write the code for our simple shell.