06 - Process attributes and context switching

#processes

Class: CSCE-313

Notes:

Outline:

Process attributes
Process context switching as a way of virtualizing the CPU
Process states and some early thoughts on process schedulers

The story so far

Pasted image 20260213134847.png|500

Notes:

We write a program in some high-level language
The compiler and linker turn it into an executable stored in a specially formatted file (an executable and linking format, e.g., ELF on Linux)
In that file there are sections that define where the code and data segments are
- The linker also takes care of libraries, which become part of the executable
Every time you want some service from the OS, you go from privilege level 3 (user) to 0 (kernel) and execute code in the kernel
- A dispatching mechanism redirects each system call to its entry point in the kernel

Program execution flow

Load code and data segments from executable file into memory
Create stack and heap
Transfer control to program's entry point
Provide services to it (network, files, I/O, etc.)

Notes:

The heap is created on demand: the OS adds one or two extra pages to your virtual address space
You jump into the entry point of your program (which is not your main function)
- The entry code will build a stack frame for your main function
As you make system calls, you get more services from your OS

Pasted image 20260213134952.png|500

Note that usually your code is at a low address and your stack is at a high address and grows towards lower addresses (grows down)
It is the program's code that periodically requests the OS to increase its own virtual address space
- The OS will just give you pages; it will not manage the heap. You get to manage the heap (malloc, free, new, etc.).
Context switching is saving all of this information into a block of memory (the Process Control Block) and restoring the state of another process.
The moment you assign the instruction pointer, the CPU is executing the other process's instructions
A context switch is really nothing more than switching memory blocks
- "Saving one set of registers and restoring from another set of registers"

Overview of memory regions

Stack	This region contains temporary data such as method/function parameters, return addresses, and local variables.
Heap	This region holds memory that is dynamically allocated to the process during its run time.
Text	This section contains the program's executable machine instructions. The Program Counter points into this section.
Data	This section contains global and static variables.
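
The four regions above can be observed from inside a running program. The following is a minimal sketch, assuming a typical Linux process layout; `region_sample`, `sample_regions`, and `print_regions` are hypothetical names introduced only for illustration. It takes one address from each region so they can be compared.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int global_counter = 42;   /* initialized global: data region */

/* One sample address from each region (hypothetical helper type). */
struct region_sample {
    uintptr_t text;   /* address of a function */
    uintptr_t data;   /* address of a global variable */
    uintptr_t heap;   /* address returned by malloc (0 if malloc failed) */
    uintptr_t stack;  /* address of a local variable */
};

struct region_sample sample_regions(void) {
    int local = 0;                           /* local variable: stack */
    int *dynamic = malloc(sizeof *dynamic);  /* dynamic allocation: heap */
    struct region_sample s;

    s.text  = (uintptr_t)&sample_regions;
    s.data  = (uintptr_t)&global_counter;
    s.heap  = (uintptr_t)dynamic;
    s.stack = (uintptr_t)&local;

    free(dynamic);                           /* free(NULL) is a no-op */
    return s;
}

void print_regions(const struct region_sample *s) {
    assert(s != NULL);
    printf("text : 0x%" PRIxPTR "\n", s->text);
    printf("data : 0x%" PRIxPTR "\n", s->data);
    printf("heap : 0x%" PRIxPTR "\n", s->heap);
    printf("stack: 0x%" PRIxPTR "\n", s->stack);
}
```

Calling `print_regions` on the result of `sample_regions()` from `main` prints the four addresses. On a typical Linux system the stack address is the largest, but the exact values change between runs because of address space layout randomization.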

The lifecycle of a process

State	Description
Start	This is the initial state when a process is first started/created.
Ready	The process is waiting for the operating system to assign it a processor so that it can run. A process may enter this state after the start state, or from the running state when the scheduler interrupts it to give the CPU to another process.
Running	Once the process has been assigned to a processor by the OS scheduler, the process state is set to running and the processor executes its instructions.
Waiting	Process moves into the waiting state if it needs to wait for a resource, such as waiting for user input, or waiting for a file to become available.
Terminated or exited	Once the process finishes its execution, or it is terminated by the operating system, it is moved to the terminated state where it waits to be removed from main memory.

Notes:

These are the states of a process
- Start: the process has just been created
- Ready: the process is in the scheduler's queue, waiting for the CPU
- Running: the process is executing on some processor (some core of the CPU)
- Waiting: the process is waiting for a resource (e.g., waiting for the disk)
- Terminated or exited: the process is done (it may be a zombie at this point)
Think of these states as a transition diagram: a process can move from one state to another

State of a Process in Action

Pasted image 20260213135511.png|500

Notes:

A process starts off in the Start state, then moves to Ready (in the queue). If the scheduler picks it, it moves to the Running state. If it gets a signal, it could be Terminated. If it asks for a resource or synchronizes with another process, it moves to the Waiting state (also called "Blocked").
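
The transition diagram can be written down as a small state machine. This is a sketch of the classic five-state model only; the names `proc_state`, `can_transition`, and `set_state` are hypothetical, and it simplifies reality (for example, a signal can also terminate a process that is ready or waiting).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum proc_state { P_START, P_READY, P_RUNNING, P_WAITING, P_TERMINATED };

/* Returns true if the five-state diagram has an arrow from `from` to `to`. */
bool can_transition(enum proc_state from, enum proc_state to) {
    switch (from) {
    case P_START:      return to == P_READY;            /* admitted to the queue */
    case P_READY:      return to == P_RUNNING;          /* picked by the scheduler */
    case P_RUNNING:    return to == P_READY             /* preempted */
                           || to == P_WAITING           /* asked for a resource */
                           || to == P_TERMINATED;       /* exited or killed */
    case P_WAITING:    return to == P_READY;            /* resource became available */
    case P_TERMINATED: return false;                    /* no way back */
    }
    return false;
}

/* Moves a process to `next`, insisting that the move is legal. */
void set_state(enum proc_state *cur, enum proc_state next) {
    assert(cur != NULL);
    assert(can_transition(*cur, next));
    *cur = next;
}
```

Note that a waiting process never goes straight to Running: it must first become Ready and be picked by the scheduler again.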

How does the OS represent a process in the kernel?

At any time, there are many processes in the system, each in its particular state.
The OS data structure representing each process is called the Process Control Block (PCB)
- The PCB contains all the info about a process
- The PCB is also where the OS keeps the hardware execution state (PC, SP, registers, etc.) when the process is not running
- This state is everything that is needed to restore the hardware to the same configuration it was in, when the process was switched out of the hardware

Notes:

For the OS, a process is a concrete object, and the PCB is the handle to it
It is the place where the kernel keeps information about that entity
The file descriptor table of a process is also found in its PCB

Process Control Block (PCB)

Process State	The current state of the process, i.e., whether it is ready, running, waiting, etc.
Process privileges	Required to allow or disallow access to system resources.
Process ID	Unique identifier for each process in the operating system.
Pointer	A pointer to the parent process.
Program Counter	A pointer to the address of the next instruction to be executed for this process.
CPU registers	The contents of the CPU registers, which must be saved when the process leaves the running state and restored when it runs again.
CPU Scheduling Information	Process priority and other information required to schedule the process.
Memory management information	Information about the page table, memory limits, or segment table, depending on the memory system used by the operating system.
Accounting information	The amount of CPU time used for process execution, time limits, execution ID, etc.
IO status information	A list of I/O devices allocated to the process.
Notes:

Privileges
- Linux and Unix have a notion of privileges, which allow a process to do extra things
- This is administered through the notion of identity: if you have a user ID of 0 (the root user), you can do anything.
- You can have special privileges, for example for networking or other operations
- Such privileges let you perform some subset of the privileged functionality without having to be root
- Often the easiest way to administer systems is all or nothing, but people have still tried finer-grained schemes.
The Program Counter and CPU registers are what really define the state of the process; this is essentially what the process is.
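
The PCB fields listed in the table above could be laid out roughly as the C struct below. This is a simplified sketch under stated assumptions: every field name and size here is hypothetical, and the real Linux equivalent (`task_struct`) is far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum proc_state { P_START, P_READY, P_RUNNING, P_WAITING, P_TERMINATED };

/* Hardware execution state saved while the process is off the CPU. */
struct cpu_context {
    uint64_t pc;        /* program counter */
    uint64_t sp;        /* stack pointer */
    uint64_t regs[16];  /* general-purpose registers (count is illustrative) */
};

struct pcb {
    enum proc_state    state;           /* process state */
    unsigned           uid;             /* process privileges: identity */
    int                pid;             /* process ID */
    struct pcb        *parent;          /* pointer to the parent process */
    struct cpu_context ctx;             /* program counter and CPU registers */
    int                priority;        /* CPU scheduling information */
    void              *page_table;      /* memory management information */
    uint64_t           cpu_time_used;   /* accounting information */
    int                open_files[16];  /* I/O status: descriptor table, -1 = free */
    int                exit_code;       /* valid once the process has terminated */
};

/* Fills in a fresh PCB for a process that has just been created. */
void pcb_init(struct pcb *p, int pid, unsigned uid, struct pcb *parent) {
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->state  = P_START;
    p->pid    = pid;
    p->uid    = uid;
    p->parent = parent;
    for (size_t i = 0; i < 16; i++)
        p->open_files[i] = -1;
}
```

The struct holds only metadata. The program's code and data live in the process's address space, which the PCB merely points to.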

Pasted image 20260213141109.png|500

Notes:

Where is your exit code? It is in your PCB!
You may also have a lot of other information there
uid defines your identity; a process can have several user IDs (for example real, effective, and saved), each used in a specific way
Some privileges are defined in kernel_cap_t...

OS's Internal Tables

An OS keeps a lot of information in main memory, much of this info is about:

Resources (e.g., devices, memory state)
Running programs/processes

The PCB is a process's metadata. It's not the same as program data.

Program's data (variables, allocated memory) and code are kept in the process image (i.e., address space), which is separate and usually bigger in footprint.

Notes:

Think of the PCB as holding information about your virtual address space and about the state of the process in general -> the "metadata of a process"

Some questions

How many processes can be in the running state simultaneously?

With n cores, at most n processes can be in the running state simultaneously

What state do you think a process is in most of the time?

Waiting

How many processes can a Linux system theoretically support?

Theoretically you are limited by what you can address
That is some large number, but not infinite
This question is equivalent to asking how many unique PIDs you can have
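
Two of the answers above can be checked on a live system. The sketch below assumes Linux with glibc: `_SC_NPROCESSORS_ONLN` is a common extension rather than a strict POSIX requirement, and `/proc/sys/kernel/pid_max` is Linux-specific. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Cores currently online; each can hold one process in the running state. */
long online_cores(void) {
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Reads one integer from a file; returns -1 if it cannot be read. */
long read_long_from_file(const char *path) {
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    long value = -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Linux-specific: the value at which PIDs wrap around, or -1 if unavailable. */
long pid_max(void) {
    return read_long_from_file("/proc/sys/kernel/pid_max");
}
```

Printing `online_cores()` and `pid_max()` from `main` gives the bound on simultaneously running processes and the configured bound on distinct PIDs for the machine at hand.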

Context Switch: PCB and Hardware States

When a process is running, its hardware state (PC, SP, registers, etc.) is in the CPU. The hardware registers contain the current values.
When the OS stops running a process, it saves the current values of the registers into the process's PCB.
When the OS is ready to start executing a new process, it loads the hardware registers from the values stored in that process's PCB.

The process of changing the CPU hardware state from one process to another is called a context switch.

This can happen 100 or 1000 or more times a second!
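
The save and restore steps described above can be simulated in plain C. This is a toy model, not kernel code: the "CPU" is just a struct, and `hw_state`, `pcb`, and `context_switch` are hypothetical names. A real kernel does this in assembly on the actual registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 4   /* toy register file; real CPUs have many more */

/* The hardware execution state: what lives in the CPU while a process runs. */
struct hw_state {
    uint64_t pc;
    uint64_t sp;
    uint64_t regs[NUM_REGS];
};

/* A minimal PCB: only the part needed for a context switch. */
struct pcb {
    int             pid;
    struct hw_state saved;   /* valid only while the process is NOT running */
};

/* Switches the simulated CPU from process `from` to process `to`. */
void context_switch(struct hw_state *cpu, struct pcb *from, struct pcb *to) {
    assert(cpu != NULL && from != NULL && to != NULL);
    from->saved = *cpu;   /* save the outgoing process's registers in its PCB */
    *cpu = to->saved;     /* load the incoming process's registers from its PCB */
}
```

After `context_switch(&cpu, &a, &b)` the simulated CPU holds exactly the values that process B had when it was last switched out, which is why B cannot tell that it was ever paused.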

The Execution Trace of Processes

Pasted image 20260213142020.png|600

Notes:

B starts to run first. An interrupt happens, and the scheduler decides to context switch B out and start running A. A runs for a while, then something happens and the scheduler runs again to context switch back to B
When B runs again, it should not see any difference between its previous state and this one: all register values should be the same as when the scheduler first interrupted it
Think of this as two threads: the first one is executing and at some point gives up the CPU. When the second thread resumes executing, it will not see any difference from when it left off.

Examples: tracing processes execution

Description	Command
Trace the system calls of the command ls	`$ strace ls -la top3.txt`
Display only the write system call of the ls command	`$ strace -e write ls -la top3.txt`
Trace the system calls of the command mkdir	`$ strace mkdir books`
Trace network-related system calls	`$ strace -e network nc -v -n 127.0.0.1 80`
Trace a process using its PID	`$ strace -p 1908`
Print a relative timestamp on entry to each system call	`$ strace -r mkdir books`
Get a statistical report for an execution trace	`$ strace -c ping google.com`
Notes:

There is a very useful tool for this called strace
It allows you to see the exact sequence of system calls that the program is executing
Remember the environment is an array of strings of the form key=value (you will see it passed to execve at the start of the trace)
A typical trace shows the program asking the OS for more memory and mapping memory, along with various other steps such as opening files and loading libraries.
With strace you will be able to see the return values of system calls
It takes many options, and you can use it to trace tools like nmap
With this you can tell if a program is doing something suspicious.
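
A tiny program makes a convenient target for practicing with strace. The sketch below is an assumption-level example rather than part of the lecture material; `say_hello` is a hypothetical function that calls the `write` system call wrapper directly so that the call is easy to spot in a trace.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Writes a fixed message to `fd` with a single write() call.
 * Returns the number of bytes written, or -1 on error. */
ssize_t say_hello(int fd) {
    assert(fd >= 0);
    const char msg[] = "hello from write()\n";
    return write(fd, msg, strlen(msg));
}
```

If `main` calls `say_hello(STDOUT_FILENO)` and the compiled program is run under `strace -e write`, the trace should contain the corresponding `write` call together with its return value.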

Scheduling and the Process Control Block (PCB)

Mechanism of a context switch:

Pasted image 20260213143028.png|450

Save the registers for process A
Restore/load registers from process B

Pasted image 20260216140531.png|600

The PCB contains all information specific to a process.

`ctxsw(char *from_sp, char *to_sp)` X

Pasted image 20260213143135.png|600

States of a Process

User view: A process is executing continuously
In reality: Several processes compete for the CPU and other resources
A process may be

running: it holds the CPU and is executing instructions
ready: it is waiting to get back on the CPU
blocked: it is waiting for some I/O event to occur

Pasted image 20260216140827.png|500

Lab 3 context switching

Pasted image 20260213143227.png|600

Notes:

Each table entry carries some metadata: is this entry valid?
Each entry is also where you will store the state (context) of that thread
- (this table belongs to your thread library)
The context is an OS-defined data structure: when you make a particular library call, it saves the caller's context into this structure. In this example only these two indices are valid
You keep a notion of the current thread: a variable (current) pointing to the ID of the thread that is executing
When you want to context switch, you make a library call that saves the context of the caller and then restores the register values of another thread.
The library function that saves your context and restores somebody else's context in one step is swapcontext
- It will basically do the trick
- The question now is how to set up the context for a thread the very first time
- The way to do that is through a combination of two library functions: getcontext and makecontext
  - getcontext: gets the caller's context
  - makecontext: changes that context to make it appear as if the given function was being invoked by the thread

Overall steps:

Find another thread eligible to execute
Save your own context (current thread)
Restore the context of that other thread so it picks up where it left off

Example:

A program that yields control every time it prints a statement ("saying context switch me").
Your threading library (acting as the scheduler) will then pick another thread to execute.
It will appear as if this thread invokes that function

Representing Threads

Pasted image 20260216140959.png|400

A thread has state (registers)
Its own stack

Notes:

Think of a thread as an execution context: like a process, it has its own stack, which is updated as it makes function calls
A thread has a stack and a set of registers that capture the entire state of the thread

Switching between Threads - I

Pasted image 20260216141041.png|400

Coroutine linkage
When the green thread calls t_yield, blue isn’t invoked on green’s stack

Notes:

This is different from function calls
Let's say we have these 3 threads and all their contexts have been initialized somehow
When the green thread calls t_yield(), the threading library picks some other thread to execute; let's say it picks the blue thread
The blue thread now starts executing. To the blue thread, it looks as if it just returned from its previous invocation of t_yield()
When a thread calls t_yield(), it actually resumes somebody else, and to this somebody else it looks as if its own earlier t_yield() call has just returned.
The important takeaway: when you call t_yield(), the execution that follows will not be on the current thread's stack; it will move to the stack of the thread that t_yield() picked.

Switching between Threads - II

int t_yield() {
    // the caller uses the slot at index "current" in the proc table
    next = find_next_valid();
    if (next == current) return ERROR; // only the main thread remains
    atomically: // a single swapcontext call performs both steps
        save caller's context in slot "current";
        restore context from slot "next";
    return SUCCESS;
}

When a thread invokes t_yield(), the "other" thread returns from swapcontext!

Notes:

This is what t_yield() looks like.
It finds the next thread to execute
- If it cannot find another thread, t_yield() returns an error instead of switching.
Once you have found the next thread, you save your own context
Then you restore the context of the other thread.
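
The pseudocode can be turned into a small runnable sketch using the real `getcontext`, `makecontext`, and `swapcontext` functions from `<ucontext.h>`. Everything else here is an assumption made for illustration and not the lab's actual design: the table layout, the names `slot`, `worker`, and `run_demo`, the `0`/`-1` return convention, and the single worker thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <ucontext.h>

#define MAX_THREADS 4
#define STACK_SIZE  (64 * 1024)

struct slot {
    bool       valid;  /* is this table entry in use? */
    ucontext_t ctx;    /* saved registers and stack pointer */
};

static struct slot table[MAX_THREADS];
static char worker_stack[STACK_SIZE];
static int  current = 0;   /* index of the thread that is executing */
static int  steps   = 0;   /* units of work done by the worker */

/* Round-robin search, starting just after the current slot. */
static int find_next_valid(void) {
    for (int i = 1; i <= MAX_THREADS; i++) {
        int candidate = (current + i) % MAX_THREADS;
        if (table[candidate].valid)
            return candidate;
    }
    return current;
}

/* Returns 0 after being resumed, or -1 if there is nobody else to run. */
int t_yield(void) {
    int prev = current;
    int next = find_next_valid();
    if (next == prev)
        return -1;
    current = next;
    /* Saves the caller in slot prev and resumes slot next in one call. */
    return swapcontext(&table[prev].ctx, &table[next].ctx);
}

static void worker(void) {
    for (int i = 0; i < 3; i++) {
        steps++;
        t_yield();                  /* hand the CPU back after each step */
    }
    table[current].valid = false;   /* finished: free the slot */
    t_yield();                      /* leave for good; this slot is never resumed */
}

int run_demo(void) {
    table[0].valid = true;          /* slot 0 is the main thread */

    int rc = getcontext(&table[1].ctx);   /* start from the caller's context */
    assert(rc == 0);
    (void)rc;
    table[1].ctx.uc_stack.ss_sp   = worker_stack;
    table[1].ctx.uc_stack.ss_size = sizeof worker_stack;
    table[1].ctx.uc_link          = &table[0].ctx;  /* safety net if worker returns */
    makecontext(&table[1].ctx, worker, 0);          /* make it start in worker() */
    table[1].valid = true;

    while (t_yield() == 0) { }      /* keep yielding until only main remains */
    return steps;
}
```

Each call to `t_yield` in `run_demo` returns only after the worker has yielded back, so control ping-pongs between the two stacks until the worker marks its slot invalid. `run_demo` is meant to be called once, since the table and counters are static.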

Test example

#include <stdio.h>
#include <stdint.h>
// t_init, t_create, t_yield and t_finish come from the lab's threading library

void dosomething (int32_t x, int32_t y) {
    for (int32_t i = x; i < y; i++) {
        // Perform some computation
        printf ("Hi! (%d -> %d): Running: %d\n", x, y, i);
        t_yield (); // Yield the control to other workers
    }

    printf ("Hi! (%d -> %d): Done!\n", x, y);
    t_finish (); // All the work is done!
}

int main () {
    t_init (); // Initialize the runtime

    if (t_create (dosomething, 0, 10) != 0) return -1;
    if (t_create (dosomething, 10, 20) != 0) return -1;
    if (t_create (dosomething, 20, 30) != 0) return -1;

    while (t_yield () >= 1); // Wait for the workers to finish their tasks
    return 0;
}

Notes:

t_init() initializes your threading library
Then we create some threads, each with different arguments
Then we yield so that our thread library can start executing and pick one of the threads to run
The library is the one that decides which of the 3 threads to pick first!
Each thread gets its own stack!

A simple example - I

#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t uctx_main, uctx_func1, uctx_func2;

static void func1 (void) {
    printf ("%s: started\n", __func__);
    printf ("%s: swapcontext(&uctx_func1, &uctx_func2)\n", __func__);
    swapcontext (&uctx_func1, &uctx_func2); // save func1's context, resume func2
    printf ("%s: returning\n", __func__);
}

static void func2 (void) {
    printf ("%s: started\n", __func__);
    printf ("%s: swapcontext(&uctx_func2, &uctx_func1)\n", __func__);
    swapcontext (&uctx_func2, &uctx_func1); // save func2's context, resume func1
    printf ("%s: returning\n", __func__);
}

Notes:

In this case it is a round robin kind of thread scheduling

A simple example - II

int main (int argc, char *argv[]) {
	char func1_stack[16384];
	char func2_stack[16384];

	/* Build a context that runs func1 on its own stack */
	getcontext (&uctx_func1);
	uctx_func1.uc_stack.ss_sp = func1_stack;
	uctx_func1.uc_stack.ss_size = sizeof (func1_stack);
	uctx_func1.uc_link = &uctx_main; /* resume main when func1 returns */
	makecontext (&uctx_func1, func1, 0);

	/* Build a context that runs func2 on its own stack */
	getcontext (&uctx_func2);
	uctx_func2.uc_stack.ss_sp = func2_stack;
	uctx_func2.uc_stack.ss_size = sizeof (func2_stack);
	/* Successor context is f1(), unless argc > 1 */
	uctx_func2.uc_link = (argc > 1) ? NULL : &uctx_func1;
	makecontext (&uctx_func2, func2, 0);

	printf ("%s: swapcontext(&uctx_main, &uctx_func2)\n", __func__);
	swapcontext (&uctx_main, &uctx_func2);
	printf ("%s: exiting\n", __func__);
	exit (EXIT_SUCCESS);
}
