
Processes and File Handles

Process Example: Firefox

Firefox has multiple threads. What are they doing?
  • Waiting for and processing interface events: mouse clicks, keyboard input, etc.

  • Redrawing the screen as necessary in response to user input, web page loading, etc.

  • Loading web pages, usually multiple parts in parallel to speed things up.

Firefox is using memory. For what?
  • Firefox.exe: the executable code of Firefox itself.

  • Shared libraries for web page parsing, security, etc.

  • Stacks storing local variables for running threads.

  • A heap storing dynamically-allocated memory.

Firefox has files open. Why?
  • Configuration files.

  • Fonts.

$ top # more process information

top

Process Example: bash

  • Let’s do this for real using standard Linux system utilities.

Finding bash

finding bash
  • ps aux gives me all processes, then grep for the one I’m after.

  • …or, do it all in one shot using pgrep.

  • …or, if I know it’s running in my current session, a bare ps will do.

bash

process bash 3

$ ps -Lf # thread information

threads bash
What are:
  • UID: user the process is running as.

  • PID: process ID.

  • PPID: parent process ID.

  • PRI: scheduling priority.

  • SZ: size of the core image of the process (in physical pages).

  • WCHAN: if the process is not running, description of what it is waiting on.

  • RSS: total amount of resident memory in use by the process (kB).

  • TIME: measure of the amount of time that the process has spent running.

threads bash
  • If bash had multiple threads running, this view would show them. It shows only one, so bash does not have multiple threads.

bash

process bash 2

$ ps # process information

  • I wish we could see a process with multiple threads…

$ ps -Lf # thread information

ps threads

$ pmap # memory mappings

pmap

bash

process bash 1

$ lsof # open files

lsof
True confessions: I cheated here.
  • /home/challen/.bashrc was not actually open when I ran this command.

  • bash didn’t have any interesting files open and I was embarrassed.

Let’s imagine we caught bash during startup when it is reading its configuration parameters.

bash

process bash

Aside: the /proc/ file system

  • How do top, ps, pmap, lsof, and other process examination utilities gather information?

  • Linux reuses the file abstraction for this purpose.

procfilesystem

OS Abstraction Cheat Sheet

  • Threads save processor state.

  • Address spaces map the addresses used by processes (virtual addresses) to real memory addresses (physical addresses).

  • Files map offsets into a file to blocks on disk.

  • File-like objects look like files to a process but are not actually stored on disk and may not completely obey file semantics.

    • You can’t seek on a network socket or open certain network-mounted files.

  • Processes organize these other operating system abstractions.

Updated Process Model

  • For today’s material, being precise about how processes use files becomes important.

  • So let’s update our model. Here’s what we had last time:

process
  • And here’s today’s change:

process updated

File Handles

  • The file descriptor that processes receive from open() and pass to other file system system calls is just an int, an index into the process file table.

  • That int refers to a file handle object maintained by the kernel.

  • That file handle object contains a reference to a separate file object also maintained by the kernel.

  • The file object is then mapped by the file system to blocks on disk.

  • So three levels of indirection:

    • file descriptor → file handle.

    • file handle → file object.

    • file object → blocks on disk.

  • Why?

Sharing File State

The additional level of indirection allows certain pieces of state to be shared separately.
  • File descriptors are private to each process.

  • File handles are private to each process by default, but a parent and child share them after process creation.

    • File handles store the current file offset, or the position in the file that the next read will come from or write will go to. File handles can be deliberately shared between two processes.

  • File objects hold other file state and can be shared transparently between many processes.

Operating System Design Principles

  • Separate policy from mechanism.

  • Facilitate control or sharing by adding a level of indirection.

Process Creation

Where do processes come from?

fork() # create a new process

fork() is the UNIX system call that creates a new process.
  • fork() creates a new process that is a copy of the calling process.

  • After fork() we refer to the caller as the parent and the newly-created process as the child. This relationship enables certain capabilities.

process updated

fork() Semantics

  • Generally fork() tries to make an exact copy of the calling process.

    • Recent versions of UNIX have relaxed this requirement and there are now many flavors of fork() that copy different amounts of state and are suitable for different purposes.

    • For the purposes of this class, ignore them.

  • Threads are a notable exception!

fork() Against Threads

  • Single-threaded fork() has reliable semantics because the only thread the process has is the one that called fork().

    • So nothing else is happening while we complete the system call.

  • Multi-threaded fork() creates a host of problems that many systems choose to ignore.

    • Linux will only copy state for the thread that called fork().

Multi-Threaded fork()

There are two major problems with multi-threaded fork():
  1. Another thread could be blocked in the middle of doing something (uniprocessor systems), or

  2. another thread could be actually doing something (multiprocessor systems).

This ends up being a big mess. Let’s just copy the calling thread.

fork()

  1. fork() copies one thread: the caller.

  2. fork() copies the address space.

  3. fork() copies the process file table.


After fork()

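pid_t returnCode = fork();
if (returnCode < 0) {
  // fork() failed; no child was created.
} else if (returnCode == 0) {
  // I am the child.
} else {
  // I am the parent; returnCode is the child's PID.
}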
  • The child thread resumes executing at the exact same point where its parent called fork().

    • With one exception: fork() returns twice, returning the child’s PID to the parent and 0 to the child.

  • All contents of memory in the parent and child are identical.

  • Both child and parent have the same files open at the same position.

    • But, since they are sharing file handles, changes to the file offset made by the parent/child will be reflected in the child/parent!

Calm Like A fork()bomb

What does this code do?

while (1) {
  fork();
}

Created 2/17/2017
Updated 8/17/2017
Commit 4eceaab // History // View
Built 1/31/2016 @ 19:00 EDT