06 - Process attributes and context switching
Class: CSCE-313
Notes:
Outline:
- Process attributes
- Process context switching as a way of virtualizing the CPU
- Process states and some early thoughts on process schedulers
The story so far
Notes:
- We write a program in some high-level language
- The compiler turns it into an executable stored in a specially formatted file (an executable and linking format, e.g. ELF on Linux)
- In that file there are sections that define where the code and data segments are
- The toolchain also takes care of libraries, which become part of the executable
- Every time you want some service from the OS, you go from privilege level 3 to 0 and execute code in the kernel
- There is a dispatching mechanism that redirects you to the right kernel entry point
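The dispatching idea in the notes above can be sketched in C. This is a minimal, Linux-only illustration, assuming glibc's generic `syscall()` wrapper; the function name `raw_getpid` is made up for this example. The system call number (`SYS_getpid`) is what selects the kernel entry point.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for our PID through the generic
 * syscall() entry point instead of the usual getpid() library wrapper.
 * The number SYS_getpid tells the kernel's dispatcher which service we want. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both `raw_getpid()` and the ordinary `getpid()` should report the same value, since the library wrapper ends up requesting the same kernel service.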
Program execution flow
- Load code and data segments from executable file into memory
- Create stack and heap
- Transfer control to program's entry point
- Provide services to it (network connections, file I/O, etc.)
Notes:
- The heap is created on demand: the OS gives you one or two extra pages in your virtual address space
- You jump into the entry point of your program (which is not your main function)
- This will build a stack frame for your main function
- As you do system calls you get more services from your OS
- Note that usually your code is at a low address and your stack is at a high address and grows toward lower addresses (grows down)
- It is the code that periodically requests the OS to increase its own virtual address space
- The OS will just give you pages; it will not manage the heap. You get to manage the heap (free, malloc, new, etc.).
- Context switching is saving all of this info into some block of memory (the Process Control Block) and restoring the state of another process.
- The moment you assign the instruction pointer, the CPU is executing that other process's instructions
- It is really nothing more than switching memory blocks
- "Saving one set of registers and restoring from another set of registers"
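The point that the program itself asks the OS for more address space can be sketched with `sbrk`, the classic Unix call that moves the end of the heap segment (the "program break"). This is only an illustration: the helper name `grow_heap_by` is hypothetical, and real allocators such as `malloc` decide on their own when to call `sbrk`/`brk` or `mmap`.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: ask the OS to extend the heap segment by `bytes`
 * and report how far the program break moved (or -1 if refused). */
long grow_heap_by(long bytes)
{
    char *before = sbrk(0);                 /* current end of the heap segment */
    if (sbrk((intptr_t)bytes) == (void *)-1)
        return -1;                          /* the OS refused the request */
    char *after = sbrk(0);                  /* new end of the heap segment */
    return (long)(after - before);
}
```

Note that the OS only hands over address space here; carving it into blocks and tracking which blocks are free is left to the program's allocator.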
Overview of memory regions
| Region | Description |
|---|---|
| Stack | Temporary data such as function/method parameters, return addresses, and local variables. |
| Heap | Memory dynamically allocated to the process during its run time. |
| Text | The compiled program code (machine instructions). The Program Counter points into this region at the next instruction to execute. |
| Data | Global and static variables. |
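To make the table concrete, here is a small sketch that collects one address from each region. The struct and function names (`region_addrs`, `sample_regions`) are invented for this example, and the exact numeric values depend on the platform and on address space randomization.

```c
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 7;      /* lives in the data region */

struct region_addrs {
    uintptr_t text;              /* address of a function (code) */
    uintptr_t data;              /* address of a global variable */
    uintptr_t heap;              /* address returned by malloc (0 on failure) */
    uintptr_t stack;             /* address of a local variable */
};

/* Hypothetical helper: sample one address from each memory region. */
struct region_addrs sample_regions(void)
{
    int local = 0;               /* lives in this function's stack frame */
    void *block = malloc(16);    /* lives on the heap */
    struct region_addrs r;

    r.text  = (uintptr_t)&sample_regions;
    r.data  = (uintptr_t)&initialized_global;
    r.heap  = (uintptr_t)block;
    r.stack = (uintptr_t)&local;
    free(block);
    return r;
}
```

Printing the four fields on a typical Linux system usually shows the stack address as the largest, which matches the "code low, stack high" picture, but that ordering is a convention rather than a guarantee.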
The lifecycle of a process
| State | Description |
|---|---|
| Start | This is the initial state when a process is first started/created. |
| Ready | The process is waiting to be assigned to a processor by the operating system so that it can run. A process may enter this state after the start state, or from the running state when the scheduler interrupts it to assign the CPU to some other process. |
| Running | Once the process has been assigned to a processor by the OS scheduler, the process state is set to running and the processor executes its instructions. |
| Waiting | Process moves into the waiting state if it needs to wait for a resource, such as waiting for user input, or waiting for a file to become available. |
| Terminated or exited | Once the process finishes its execution, or it is terminated by the operating system, it is moved to the terminated state where it waits to be removed from main memory. |
Notes:
- These are states of a process
- Start: process has just been created
- Ready: process is already in the scheduler's queue for the CPU
- Running: already executing on some processor (some core of the CPU)
- Waiting: process is waiting for a resource (e.g., waiting for the disk)
- Terminated or exited: process is done (it can be in the zombie state at this point)
- Think of these states as a transition diagram: a process can move from one state to another
State of a Process in Action
Notes:
- A process starts off in the Start state, then moves to Ready (in the queue). If the CPU picks it, it moves to the Running state. If it gets a signal, it could get Terminated. If it asks for a resource or synchronizes with another process, it moves to the Blocked/Waiting state.
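The transition diagram described above can be written down as a small lookup function. This is a simplified sketch: the enum and function names are made up here, the Waiting to Ready edge is the usual textbook assumption (the notes do not spell it out), and a real kernel allows more transitions (for example, a signal can terminate a process that is not running).

```c
#include <stdbool.h>

enum proc_state { ST_START, ST_READY, ST_RUNNING, ST_WAITING, ST_TERMINATED };

/* Returns true if the simplified lifecycle allows moving from -> to. */
bool can_transition(enum proc_state from, enum proc_state to)
{
    switch (from) {
    case ST_START:   return to == ST_READY;        /* admitted to the queue */
    case ST_READY:   return to == ST_RUNNING;      /* picked by the scheduler */
    case ST_RUNNING: return to == ST_READY         /* preempted */
                         || to == ST_WAITING       /* asked for a resource */
                         || to == ST_TERMINATED;   /* finished or killed */
    case ST_WAITING: return to == ST_READY;        /* resource became available (assumed) */
    default:         return false;                 /* terminated is final */
    }
}
```

For instance, `can_transition(ST_WAITING, ST_RUNNING)` is false in this model: a process that was blocked has to go back through the ready queue before it gets the CPU again.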
How does the OS represent a process in the kernel?
- At any time, there are many processes in the system, each in its particular state.
- The OS data structure representing each process is called the Process Control Block (PCB)
- The PCB contains all the info about a process
- The PCB is also where the OS keeps the hardware execution state (PC, SP, registers, etc.) when the process is not running
- This state is everything that is needed to restore the hardware to the same configuration it was in, when the process was switched out of the hardware
Notes:
- For the OS, the PCB is the handle to a process
- It is the place where the OS keeps information about that entity
- The file descriptor table of a process is also found in its PCB
Process Control Block (PCB)
| Field | Description |
|---|---|
| Process State | The current state of the process, i.e., whether it is ready, running, waiting, and so on. |
| Process privileges | Required to allow/disallow access to system resources. |
| Process ID | Unique identifier for each process in the operating system. |
| Pointer | A pointer to the parent process. |
| Program Counter | A pointer to the address of the next instruction to be executed for this process. |
| CPU registers | The contents of the CPU registers, which must be saved so the process can later resume in the running state. |
| CPU Scheduling Information | Process priority and other information required to schedule the process. |
| Memory management information | Information about the page table, memory limits, or segment table, depending on the memory system used by the operating system. |
| Accounting information | The amount of CPU time used for process execution, time limits, execution ID, etc. |
| IO status information | A list of I/O devices allocated to the process. |

Notes:
- Privileges
  - Linux and Unix have a notion of privileges, which allow a process to do extra things
  - This is administered through the notion of identity: if you have a user ID of 0 (the root user), you can do anything.
  - You can have special privileges, for example for network or other operations
  - Privileges let you perform some subset of the privileged functionality without having to be root
  - Often the easiest way to administer systems is all or nothing, but people have still tried finer-grained schemes.
- Program Counter and CPU registers are what really define the state of the process; this is essentially what the process is.
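As a rough picture of the table above, a PCB can be thought of as a C struct, with a context switch being a copy of the register set out of and into it. Everything here is a toy model: the field names, the register layout, and the helpers `pcb_switch` and `demo_switch` are assumptions for illustration, not how any real kernel lays out its process descriptor.

```c
#include <stdint.h>

enum pcb_state { PCB_START, PCB_READY, PCB_RUNNING, PCB_WAITING, PCB_TERMINATED };

struct saved_regs {
    uint64_t pc;                 /* program counter */
    uint64_t sp;                 /* stack pointer */
    uint64_t gpr[16];            /* general-purpose registers */
};

struct pcb {
    int               pid;
    enum pcb_state    state;
    unsigned          uid;            /* identity used for privilege checks */
    struct pcb       *parent;
    struct saved_regs regs;           /* meaningful only while not running */
    int               priority;       /* scheduling information */
    void             *page_table;     /* memory management information */
    uint64_t          cpu_time_used;  /* accounting information */
};

/* Toy context switch: save the CPU state into `from`, load it from `to`. */
void pcb_switch(struct saved_regs *cpu, struct pcb *from, struct pcb *to)
{
    from->regs  = *cpu;
    from->state = PCB_READY;
    *cpu        = to->regs;
    to->state   = PCB_RUNNING;
}

/* Usage: A is running, we switch to B, and report where the CPU resumes. */
uint64_t demo_switch(void)
{
    struct saved_regs cpu = { .pc = 100, .sp = 200 };
    struct pcb a = { .pid = 1, .state = PCB_RUNNING };
    struct pcb b = { .pid = 2, .state = PCB_READY, .regs = { .pc = 500, .sp = 600 } };

    pcb_switch(&cpu, &a, &b);
    return cpu.pc;               /* B's saved PC: B continues where it left off */
}
```

The design point this tries to show is that the saved registers are the only part of the PCB that has to be touched on every switch; the rest is bookkeeping the OS consults when it needs it.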
Notes:
- Where is your exit code? It is in your PCB!
- You may also have a lot of other information there
- uid defines your identity; a process can have several user IDs that are each used in a specific way
- Some privileges are defined in kernel_cap_t...
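One way to see that the exit code outlives the process is to have a parent collect it with `waitpid`: the kernel keeps the status in the terminated (zombie) child's record until the parent asks for it. A minimal sketch, with the helper name `child_exit_code` chosen just for this example:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; return the code the parent collects,
 * or -1 if something went wrong. */
int child_exit_code(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                   /* fork failed */
    if (pid == 0)
        _exit(code);                 /* child: terminate immediately */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;                   /* could not reap the child */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Only the low 8 bits of the exit code are reported through `WEXITSTATUS`, so values between 0 and 255 round-trip unchanged.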
OS's Internal Tables
An OS keeps a lot of information in main memory, much of this info is about:
- Resources (e.g., devices, memory state)
- Running programs/processes
The PCB is a process's metadata. It's not the same as program data.
Program's data (variables, allocated memory) and code are kept in the process image (i.e., address space), which is separate and usually bigger in footprint.
Notes:
- Think of the PCB as holding information about your virtual address space and about the state of the process in general -> the "metadata of a process"
Some questions
How many processes can be in the running state simultaneously?
- With n cores, up to n processes can be running simultaneously (one per core)
What state do you think a process is in most of the time?
- Waiting
How many processes can a Linux system theoretically support?
- Theoretically you are limited by what you can address
- That is some large number of bytes, but not infinite
- This question is equivalent to asking how many unique PIDs you can have
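Two of these answers can be checked on a running system. The sketch below assumes Linux: `sysconf(_SC_NPROCESSORS_ONLN)` is a widely available extension (not strict POSIX) that reports online cores, and `/proc/sys/kernel/pid_max` holds the value at which PIDs wrap around. The function names are made up for this example.

```c
#include <stdio.h>
#include <unistd.h>

/* Number of cores currently online: an upper bound on how many
 * processes can be in the running state at the same instant. */
long online_cores(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* PID wrap-around limit configured in the kernel, or -1 if the
 * file is missing or unreadable (e.g. on a non-Linux system). */
long read_pid_max(void)
{
    long value = -1;
    FILE *f = fopen("/proc/sys/kernel/pid_max", "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

In practice the number of processes is also capped by memory and per-user limits well before the PID space runs out.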
Context Switch: PCB and Hardware States
- When a process is running, its hardware state (PC, SP, registers, etc.) is in the CPU. The hardware registers contain the current values.
- When the OS stops running a process, it saves the current values of the registers into the process's PCB.
- When the OS is ready to start executing a new process, it loads the hardware registers from the values stored in that process's PCB.
The process of changing the CPU hardware state from one process to another is called a context switch.
This can happen 100 or 1000 or more times a second!
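A process can ask how many context switches it has gone through so far. The sketch below uses `getrusage`, whose `ru_nvcsw` and `ru_nivcsw` fields count voluntary and involuntary switches (these fields are maintained on Linux; other systems may leave them at zero). The helper names are invented for this example.

```c
#include <sys/resource.h>
#include <time.h>

/* Total context switches (voluntary + involuntary) charged to this
 * process so far, or -1 if the counters cannot be read. */
long context_switches_so_far(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_nvcsw + ru.ru_nivcsw;
}

/* Sleep for about 1 ms (which gives up the CPU) and report how many
 * switches were added while doing so. */
long switches_caused_by_nap(void)
{
    long before = context_switches_so_far();
    struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };
    nanosleep(&nap, NULL);
    return context_switches_so_far() - before;
}
```

Sleeping is a voluntary switch: the process blocks, the scheduler runs something else, and the process is switched back in when the timer expires.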
The Execution Trace of Processes
Notes:
- B starts to run first. An interrupt happens, and the scheduler decides to context switch B out and start running A. A runs for a while, something happens, and the scheduler runs again to context switch back to B
- When B runs again, it should not see any difference from its previous state: all register values should be the same as when the scheduler first interrupted it
- Think of this as two threads: the first one is executing and at some point gives up the CPU; when the second thread starts executing, it sees no difference from when it left off.
Examples: tracing processes execution
| Description | Command |
|---|---|
| Trace the system calls of the command ls | $ strace ls -la top3.txt |
| Display only the write system call of the ls command | $ strace -e write ls -la top3.txt |
| Trace the system calls of the command mkdir | $ strace mkdir books |
| Trace network-related system calls | $ strace -e network nc -v -n 127.0.0.1 80 |
| Trace a process using its PID | $ strace -p 1908 |
| Print a relative timestamp on entry to each system call (time elapsed since the previous call started) | $ strace -r mkdir books |
| Get a statistical report for an execution trace | $ strace -c ping google.com |

Notes:
- There is a very useful tool for this called strace
- It allows you to see the exact sequence of system calls that the program is executing
- Remember the environment is an array of strings of the form key=value
- The traced program will ask the OS for more memory, and then it does a memory map, while doing various other things including opening the file, loading libraries, etc.
- With strace you will be able to see the return values of system calls
- It can take many arguments that allow you to trace things like nmap
- With this you can tell if a program is doing some suspicious stuff.
Scheduling and the Process Control Block (PCB)
Mechanism of a context switch:
- Save the registers for process A
- Restore/load registers from process B
- The PCB contains all information specific to a process.
ctxsw(char *from_sp, char *to_sp) X
States of a Process
User view: A process is executing continuously
In reality: Several processes compete for the CPU and other resources
A process may be
- running: it holds the CPU and is executing instructions
- ready: it is waiting to get back on the CPU
- blocked: it is waiting for some I/O event to occur
Lab 3 context switching
Notes:
- Given some metadata, is this entry valid?
- Here is where you will store the state of that thread (in your thread library)
- This is an OS-defined data structure: when you make a particular library call, it saves the caller's context into this data structure. In this example only these 2 indices are valid
- You record which thread is executing in current: you have a notion of the current thread (pointing to the PID that is executing)
- When you want to context switch, you make a library call that saves the context of the caller and then automatically restores the register values of the other thread.
- The way to save your context and restore somebody else's context in one step is the library function swapcontext
  - It will basically do the trick
- Now the question is how you set up the context for a thread the very first time
  - The way to do that is through a combination of library functions: makecontext and getcontext
  - getcontext: gets the caller's context
  - makecontext: changes this context to make it appear as if the given function was being invoked by the thread
Overall steps:
- Find another thread eligible to execute
- Save your own context (current thread)
- Restore the context from that other thread to pick up
Example:
- A program that yields control every time it prints a statement (saying "context switch me").
- Your threading library (your scheduler) will then pick another thread to execute.
- It will appear as if this thread invokes that function
Representing Threads
- A thread has state (registers)
- Its own stack
Notes:
- Think of a thread as an execution context, which is like a process that has its own stack which updates as it makes calls to functions
- A thread has a stack and a set of registers that capture the entire state of the thread
Switching between Threads - I
- Coroutine linkage
- When the green thread calls t_yield, blue isn’t invoked on green’s stack
Notes:
- This is different from function calls
- Let's say we have these 3 threads and all their contexts have been initialized somehow
- When the green thread calls t_yield(), the threading library picks some other thread to execute; let's say it picks the blue thread
- The blue thread now starts executing; to the blue thread it looks as if it just returned from its previous invocation of t_yield()
- When a thread calls t_yield(), it actually resumes somebody else, and to this somebody else it looks like its own earlier t_yield() call returned
- The important takeaway: when you call t_yield(), the execution that follows is not on the current thread's stack; it moves to the stack of the thread that t_yield() picked.
Switching between Threads - II
int t_yield() {
    // the caller uses the slot at index "current" in the proc table
    int next = find_next_valid();
    if (next == current) return ERROR; // only the main thread remains
    // atomically (a single swapcontext call does both steps):
    //   save the caller's context in slot "current";
    //   restore the context from slot "next";
    return SUCCESS;
}
When a thread invokes t_yield(), the "other" thread returns from swapcontext!
Notes:
- This is what t_yield() looks like.
- It finds the next guy to execute
  - If it cannot find another guy, t_yield() returns to the caller (with an error code) instead of switching.
- Now that you have found the next guy, you save your context
- Then you restore the context of the other guy.
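The steps above can be sketched as a tiny runnable program piece built on the real `getcontext`/`makecontext`/`swapcontext` calls. This is not the lab's library: the names `yield_sketch`, `create_sketch`, `run_sketch`, `table`, and `trace` are all made up for this illustration, error checking is omitted, the round-robin policy is an assumption, and worker stacks are never freed.

```c
#include <stdlib.h>
#include <ucontext.h>

#define MAX_THREADS 4
#define STACK_SIZE  (64 * 1024)

struct slot { int valid; ucontext_t ctx; };

static struct slot table[MAX_THREADS];    /* slot 0 is the main thread */
static int current = 0;                   /* index of the running thread */
static int trace[16], trace_len = 0;      /* which slot ran, in order */

/* Round-robin scan for the next valid slot other than `current`. */
static int find_next_valid(void)
{
    for (int i = 1; i <= MAX_THREADS; i++) {
        int cand = (current + i) % MAX_THREADS;
        if (table[cand].valid) return cand;
    }
    return current;
}

int yield_sketch(void)
{
    int prev = current, next = find_next_valid();
    if (next == prev) return -1;              /* nobody else to run */
    current = next;
    /* Saves our registers in table[prev] and loads table[next]'s registers. */
    swapcontext(&table[prev].ctx, &table[next].ctx);
    return 0;                                 /* reached when we are resumed */
}

static void worker(void)
{
    for (int i = 0; i < 2; i++) {
        trace[trace_len++] = current;         /* "do some work" */
        yield_sketch();
    }
    table[current].valid = 0;                 /* finished: give up the slot */
    yield_sketch();                           /* switch away for good */
}

static void create_sketch(int id)
{
    getcontext(&table[id].ctx);               /* start from a valid context */
    table[id].ctx.uc_stack.ss_sp = malloc(STACK_SIZE);
    table[id].ctx.uc_stack.ss_size = STACK_SIZE;
    table[id].ctx.uc_link = &table[0].ctx;    /* fallback if worker returns */
    makecontext(&table[id].ctx, worker, 0);   /* first switch enters worker() */
    table[id].valid = 1;
}

/* Runs two workers to completion; returns the number of trace entries. */
int run_sketch(void)
{
    table[0].valid = 1;
    create_sketch(1);
    create_sketch(2);
    while (yield_sketch() == 0) { }           /* keep yielding until alone */
    return trace_len;
}
```

Calling `run_sketch()` once lets the two workers alternate (slot 1, slot 2, slot 1, slot 2) with the main thread getting a turn in between, and each call to `yield_sketch()` returns on the stack of whichever thread is resumed, not the one that called it.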
Test example
#include <stdint.h>
#include <stdio.h>
// t_init, t_create, t_yield and t_finish come from the lab's threading library

void dosomething(int32_t x, int32_t y) {
    for (int32_t i = x; i < y; i++) {
        // Perform some computation
        printf("Hi! (%d -> %d): Running: %d\n", x, y, i);
        t_yield(); // Yield control to the other workers
    }
    printf("Hi! (%d -> %d): Done!\n", x, y);
    t_finish(); // All the work is done!
}

int main() {
    t_init(); // Initialize the runtime
    if (t_create(dosomething, 0, 10) != 0) return -1;
    if (t_create(dosomething, 10, 20) != 0) return -1;
    if (t_create(dosomething, 20, 30) != 0) return -1;
    while (t_yield() >= 1); // Wait for the workers to finish their tasks
    return 0;
}
Notes:
- t_init() initializes your threading library
- Then we create some threads given different arguments
- Then we yield to enable our thread library to start executing and pick one of the threads to execute
- The library is the one who is going to decide which of the 3 threads to pick first!
- Each thread gets its own stack!
A simple example - I
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t uctx_main, uctx_func1, uctx_func2;

static void func1(void) {
    printf("%s: started\n", __func__);
    printf("%s: swapcontext(&uctx_func1, &uctx_func2)\n", __func__);
    swapcontext(&uctx_func1, &uctx_func2); // save func1, resume func2
    printf("%s: returning\n", __func__);
}

static void func2(void) {
    printf("%s: started\n", __func__);
    printf("%s: swapcontext(&uctx_func2, &uctx_func1)\n", __func__);
    swapcontext(&uctx_func2, &uctx_func1); // save func2, resume func1
    printf("%s: returning\n", __func__);
}
Notes:
- In this case it is a round robin kind of thread scheduling
A simple example - II
int main(int argc, char *argv[]) {
    char func1_stack[16384];
    char func2_stack[16384];

    getcontext(&uctx_func1);
    uctx_func1.uc_stack.ss_sp = func1_stack;
    uctx_func1.uc_stack.ss_size = sizeof(func1_stack);
    uctx_func1.uc_link = &uctx_main;
    makecontext(&uctx_func1, func1, 0);

    getcontext(&uctx_func2);
    uctx_func2.uc_stack.ss_sp = func2_stack;
    uctx_func2.uc_stack.ss_size = sizeof(func2_stack);
    /* Successor context is f1(), unless argc > 1 */
    uctx_func2.uc_link = (argc > 1) ? NULL : &uctx_func1;
    makecontext(&uctx_func2, func2, 0);

    printf("%s: swapcontext(&uctx_main, &uctx_func2)\n", __func__);
    swapcontext(&uctx_main, &uctx_func2);

    printf("%s: exiting\n", __func__);
    exit(EXIT_SUCCESS);
}