02 - Architectural Support for Operating Systems
Class: CSCE-313
Notes:
Place of the Operating System
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121134852.png)
- Processes architecturally are on another ring of the hardware
- The middleware is implementing isolation among processes and protecting devices from being accessible by processes
Function of an Operating System
- Isolation, Uniform interface, Efficient & fair sharing
- This is what the OS tries to do
- Analogy with the government
- Does no useful work,
- Simply legislates on resource use by competing applications,
- Enforces policy such as fairness of resource sharing
A process (simplified)
A process is an instance of a running program
- Started by the system call exec
State of a process
- CPU
- Memory (address space)
- Store code, data, stack, heap
- A process has its own address space
- Registers
- Program Counter, Stack Pointer, regular (general-purpose) registers
- IO information
- Open files (and others)
- Some state that the kernel is keeping
From program to process
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121135539.png)
- Think of multiple processes all basically using some portion of your physical memory in the form of virtual memory
- All are being multiplexed on that same physical memory
The Process, Refined
An executing program with restricted rights
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121140124.png)
- Enforcing mechanism must not hinder functionality or hurt performance
- Even with isolation in place, a program should behave the same, and run about as fast, as it would without it.
- One of the principal ways the OS implements isolation is virtual addressing
- Even though two processes may use identical virtual addresses, those addresses are mapped to different physical memory, so a process cannot reach outside the bounds of its own virtual memory
- Each process is living in its own universe, the OS gives the illusion to the process that it has control of the whole memory.
User vs Kernel
Application/User Code (Untrusted)
- Runs on the processor with all potentially dangerous operations disabled
Kernel Code (Trusted)
- Runs directly on processor with unlimited rights
- Performs any hardware operations
But run on the same machine!
Notes:
- You want to prevent a binary affecting the OS
Hardware must support
- Privileged Instructions
- Unsafe instructions cannot be executed in user mode
- There are some instructions that you can only do when running in privileged mode
- To do this, the processor needs a notion of what can be done and what cannot be done
- For example a process cannot change its own address space, it has to ask the OS to change it
- Memory Isolation
- Memory accesses outside a process’s address space prohibited
- A process cannot break out of its virtual addresses container
- Interrupts
- Ensure kernel can regain control from running process
- We need some way to interrupt the execution of the CPU, because if not, the sharing of resources would become unfair
- Safe Transfers
- Correctly transfer control from user-mode to kernel-mode and back
- We need some way to correctly and safely transfer control from unprivileged code to privileged code and vice versa.
1. Privilege levels differentiate instruction sets
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121141157.png)
- The OS runs in ring 0
- User space programs run in ring 3
- There are two bits in a register that determine your current privilege level, which determines what instructions you can execute.
- A process cannot change its own privilege level
2. VM can give the illusion of an entire contiguous address space
- Virtual memory as a mechanism that is supported in hardware
- The hardware maps a virtual page, at the time of reference, to an actual physical page
- It is just the mapping that is set up by the OS, but the actual referencing is done by the hardware
- To the process it looks like the address space is contiguous
- A process cannot change its address space
3. Interrupts wrest control from applications
- Hardware Interrupts. All interrupts are guaranteed to be taken on an instruction boundary. [Vol 3A, 6-6]
- Asynchronous
- External devices such as timers
- Software Interrupts (INT 0x80, syscall)
- Synchronous: explicit instruction
- System calls
- Exceptions: div/0, … In some cases, %rip points to the faulting instruction.
- Protection violations, Page faults, …
Notes:
- When an interrupt arrives, the CPU finishes the current instruction and then jumps to a handler address that the OS set up in advance, so the user space program has no choice but to jump.
- "On an interrupt, you WILL execute my code"
- In hardware interrupts, a pin goes high when an interrupt is triggered. The CPU then does an acknowledgement cycle with the PIC (programmable interrupt controller), and the PIC sends it a number between 0 and 255; this is the interrupt number. The CPU indexes into a table, translates that number into an address, and jumps to that address.
- Hardware interrupts are considered external. The number must be between 0 and 255 because the interrupt vector table has 256 entries
- This results in the CPU going to a different location in order to handle the interrupt
- In software interrupts, the vector number comes from the instruction itself (for example INT n); the OS decides which vector it uses for system calls
- System calls are implemented like that, they generate a software interrupt and the CPU jumps to the address
- It sets up its registers the same way it would do for function calls
- You do not know where you are going, all you are saying is go to some address. You do not control it, it is the OS that sets it up for you.
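- Modern x86-64 Linux no longer uses INT 0x80 for system calls; it uses the dedicated syscall instruction, which jumps to an entry point the kernel registered in a model-specific register instead of going through the interrupt vector table
- Note processes do not do the privileged work themselves; it is the kernel that does it.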
- The only way you can become ring 0 is because you get trapped into my address space because of an interrupt, but you are no longer running your own code, you are now running my code.
- Exceptions that happen as your program is being executed can also trigger an interrupt
- The CPU will generate an exception and it will be handled exactly in the same manner
- Whatever is the address stored against that index, the OS sets it for you, you become ring 0 and the kernel handles it for you, when it returns to you, you are also returned to ring 3.
- The instruction pointer is restored so that you resume at the instruction you were about to execute
Interrupts and exceptions result in control change
An interrupt/exception results in the transfer of control to the OS in response to some event
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121143344.png)
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121143401.png)
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121143418.png)
- Note you will be returned to the address you left from.
- Everything will be the same for you, all the registers will be the same, it will just be a little bit of a delay between the previous instruction and the next, but everything will be okay.
Interrupt & Exception handlers
An interrupt handler is code that services an interrupt, then resumes the interrupted program.
- Linux initializes an interrupt vector table to handle interrupts.
- The processor saves the %rflags, %cs, and %rip registers onto the stack before transferring control to the interrupt handler routine.
- Handler usually saves all processor registers
- Services the interrupt
- Restores processor registers
- Returns with iret (restores %rflags, %cs, and %rip)
- Interrupt handlers must not change registers before saving them
- They must be able to resume where the program was interrupted.
Notes:
- When an interrupt happens, your hardware will automatically save your instruction pointer, your status, and other registers
- Why does the CPU need to save the instruction pointer? Because it needs to remember where it came from in order to be able to go back at some point
- You had better make sure that your registers have the same values as when you left.
- The save and restore also need to be atomic; if they are not, you will have trouble.
Synchronous "interrupts"
Synchronous exceptions can be triggered by executing an instruction.
From within the user application there are 3 types of synchronous exceptions:
- Faults,
- Usually occurs before the instruction completes and is restartable.
- Page Fault
- Aborts
- A severe, unrecoverable exception (e.g., hardware failure or double fault), not restartable. Example: ECC checksum failure.
- Traps (INT 3) or System Calls (SYSCALL)
Examples: Exception in Intel Processors
See https://wiki.osdev.org/exceptions
/CSCE-313/Visual%20Aids/Pasted%20image%2020260121143947.png)
- Our mechanism for implementing syscalls is that the OS picks one of these indices
- When the user space program wants to do something privileged it has to tell the OS and generate a "Trap" so that your process is able to change its privilege level and jump to that address.
- It has to carefully manage its registers
- Saves all your registers and then restores them
- The OS knows where to return you because the return address was already pushed onto the stack
Example 1: page fault
- User writes to memory location
- That page of user memory is not mapped yet (because memory pages are mapped only when necessary)
- Page handler must load page into physical memory
- Returns to faulting instruction
- Successful on second try
int a[1000];
int main(void) {
    a[500] = 13; /* first touch of this page may trigger a page fault */
    return 0;
}
- The address is valid (it is part of your virtual address space), but the page is not in physical memory at the time we access it; that is how we generate a page fault.
- There is a well-defined procedure when an exception happens
- Has to do with returning back to user space
- You will end up retrying that instruction
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260123140511.png)
Notes:
- The instruction that generates the page fault is a memory read or write
Example 2: illegal memory reference
Illegal Memory Reference
- User writes to memory location
- Address is not valid
- Page handler detects invalid address
- Sends SIGSEGV* signal to user process
- User process exits with “segmentation fault”
int a[1000];
int main(void) {
    a[5000] = 13; /* far out of bounds: undefined behavior, typically an invalid address */
    return 0;
}
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260123140809.png)
Notes:
- Software interrupts are called signals in the Unix world
Example 3: Abort
Aborts are severe and unrecoverable errors.
- Examples: memory parity error, machine check
- Aborts current program and hands control over to the OS
- This is the way for the OS to put essential error checking
- Also, an excellent way to make OS resilient when applications are failing or crashing
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260123140914.png)
- Some are recoverable, some are just not recoverable
- Divide by zero is handled differently for INT and Floating Point (FP).
- System response can be +ve INF (FP) or fault (SIGFPE default is irrecoverable)
- Therefore, Divide by Zero can technically fall in several categories of exceptions
- Depending on how it is handled in a system.
...
Traps or system calls
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260123141019.png)
- Generate a trap on vector 128 (0x80, the system call vector), passing the arguments through the values in your registers.
- There is some trickiness because a user process lives in a virtual address space. If you pass a pointer as an argument, the kernel must read from your virtual address space, and while reading it may find that some pages are not mapped. This can generate a page fault inside the kernel, which the kernel must handle carefully before it can finish retrieving the arguments.
System calls: example
#include <unistd.h>
int main(void) {
write(1, "Hello, world\n", 13);
return 0;
}
- Your stdout is represented by a file descriptor (remember every process starts with 3 descriptors)
- write() here refers to the libc wrapper around the system call
- Compile it with gcc write-syscall.c
- Run it with ./a.out
- Note write() itself is not buffered: each call makes one syscall. It is the stdio functions in libc (printf, fwrite) that buffer output in user space, so that many calls can be flushed with a single syscall.
WRITE(2) Linux Programmer's Manual WRITE(2)
NAME
write - write to a file descriptor
SYNOPSIS
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
DESCRIPTION
write() writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd.
- Got it from using the man command (man 2 write)
- Your libraries (libc) are basically encapsulating system calls
stdin, stdout, stderr
A process usually has three file descriptors
- stdin (0), stdout (1), stderr (2)
- In C++: cin, cout, cerr
secho
stdin> Howdy!
stdout: Howdy!
stderr: Howdy!
- The program (secho) reads "Howdy!" from standard input and simply sends it to standard output and standard error
secho > /dev/null
stdin> Howdy!
stderr: Howdy!
- Here the shell redirects the program's standard output to /dev/null (trash)
- stderr is still displayed (it was not dumped)
secho > /dev/null 2>&1
Howdy!
- This is just shell syntax for saying that you want to include stderr in the redirect
- Now both stderr and stdout are pointing to /dev/null (they are both dumped)
- Why didn't the stdin> prompt appear this time?
- In the code we are printing this prompt to stderr; since stderr now points to /dev/null, we never get to see it.
Notes:
- C++ provides convenient methods on cin, cout, cerr to make programming easier
System calls: an example in assembly
main:
pushq %rbp
movq %rsp, %rbp
movl $13, %edx
leaq .LC0(%rip), %rax
movq %rax, %rsi
movl $1, %edi
call write@PLT
movl $0, %eax
popq %rbp
ret
- If we generate the assembly for the program, it looks something like this
- Note the call to write goes to glibc (call write@PLT); the actual syscall instruction is executed inside glibc
- glibc (GNU C Library) is the fundamental C standard library for GNU/Linux and other Unix-like systems. It provides essential functions for nearly all applications (memory allocation, file I/O, string handling, system call wrappers), acts as the bridge between user programs and the operating system kernel, and implements standards such as POSIX and ISO C
System call invocation
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260123143317.png)
- Philosophy: If you keep your end of the bargain, then you will get the service that you asked for
Control Flow in System Calls
Example: file open
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260123143422.png)
- Dashed lines mean context switch due to software interrupt,
- This is also where execution mode changes to/from “privileged”
Notes:
- The system call number for open is 2 (on x86-64 Linux)
- The kernel has to be very careful in copying each of the arguments from the process's virtual address space
- This is figured out by the kernel
- Internally it will do a dispatch to say "I want to invoke open_handler, etc."
- At the end it will return to glibc and then to your user program
System calls
...
Types of System Calls
| Process Control | File Management | Device Management |
|---|---|---|
| • load • execute • end, abort • create process • terminate process • get/set process attributes • wait for time, wait event, signal event • allocate, free memory | • create file, delete file • open, close • read, write, reposition • get/set file attributes | • request device, release device • read, write, reposition • get/set device attributes • logically attach or detach devices |

| Information Maintenance | Communication |
|---|---|
| • get/set time or date • get/set system data • get/set process, file, or device attributes | • create, delete communication connection • send, receive messages • transfer status information • attach or detach remote devices |