Midterm-II Prep

Notes:

1.1 Program Execution & Process Behavior

Problem 1

Write a small C program that uses fork() to create a child process. The child should print its own PID and its parent's PID, then exit. The parent should wait for the child and print whether the child exited normally. Predict the output before running it.

Answer:

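#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
	pid_t pid = fork();

	if (pid < 0) {
		perror("fork failed");
		return 1;
	}

	if (pid == 0) {
		// Child: report both PIDs, then exit
		printf("My PID: %d, My Parent's PID: %d\n", (int)getpid(), (int)getppid());
		exit(0);
	} else {
		// Parent: block until the child has terminated
		int status;
		wait(&status);

		if (WIFEXITED(status)) {
			printf("Parent: child exited normally\n");
		} else {
			printf("Parent: child did not exit normally\n");
		}
		return 0;
	}
}

Predicted output (PIDs vary between runs): the child's line appears first, then "Parent: child exited normally", because the parent blocks in wait() until the child has exited.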

Problem 2

Consider the following program:

  int fd = open("data.txt", O_RDONLY);
  char buf[10];
  int n = read(fd, buf, 10);
  printf("%d\n", n);

Assume data.txt exists but contains only 5 bytes. What values could read() return? What happens if fd == -1? How should the program correctly handle errors?

Answer:

What values could read() return?
- read() returns the number of bytes actually read, which may be fewer than the 10 requested. Since data.txt contains only 5 bytes, the possible return values are:
1. n = 5 (most likely / normal case)
  - Because only 5 bytes exist, the kernel copies those 5 bytes into buf
2. n = 0 (possible if already at EOF)
  - If the file offset was already at the end of file (for example if the file had already been read earlier), then:
  - read() returns 0
  - This means EOF (End Of File)
3. n = -1 (error case)
  - If an error occurs during read(), it returns -1 and sets errno.
What happens if fd == -1?
- If fd == -1, open() failed: no file was opened and there is no valid descriptor to read from.
- Common reasons:
  - file does not exist
  - permission denied
  - too many open files
  - invalid path
- If we still call: read(...)
  - read() immediately fails
  - returns -1
  - sets errno = EBADF (Bad file descriptor)
  - The program will print -1
How should the program correctly handle errors?
- Check for fd == -1 immediately after open() and report the failure with perror(). The return value of read() needs the same kind of check (n == -1) before buf is used. For open():
```
    int fd = open("data.txt", O_RDONLY);
    if (fd == -1) {
        perror("open failed");
        return 1;
    }
```
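Checking open() is only half of the fix, since read() can fail too. A minimal sketch of the whole sequence with both checks; the helper name read_prefix is made up for illustration and is not part of the original problem:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: read up to cap bytes from path into buf.
 * Returns the byte count (0 means EOF) or -1 after reporting the error. */
ssize_t read_prefix(const char *path, char *buf, size_t cap) {
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        perror("open");          /* e.g. ENOENT, EACCES */
        return -1;
    }

    ssize_t n = read(fd, buf, cap);
    if (n == -1) {
        perror("read");          /* report before close() can touch errno */
    }

    close(fd);
    return n;
}
```

For the 5-byte data.txt, a call such as read_prefix("data.txt", buf, 10) would be expected to return 5; the caller still has to treat 0 and -1 separately.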

Problem 3

A program repeatedly calls:

read(fd, buf, 1024);

but occasionally receives -1 with errno == EINTR.

What caused this?
- While the process was blocked in: read(fd, buf, 1024);
- a signal was delivered, such as:
  - SIGINT
  - SIGCHLD
  - SIGALRM
- If the signal handler runs and the system call is not automatically restarted (the handler was installed without SA_RESTART), read() returns -1 and sets errno to EINTR

How should the code be written to handle this properly?

It should retry the read() when the error is EINTR.

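#include <errno.h>
#include <unistd.h>

// Retry read() for as long as it is interrupted by a signal
ssize_t safe_read(int fd, void *buf, size_t count) {
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;
}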

Or directly inline:

ssize_t n;
while ((n = read(fd, buf, 1024)) == -1 && errno == EINTR) {
    ; // retry
}

Then after that, handle the other cases normally:
- n > 0 → bytes read
- n == 0 → EOF
- n == -1 → real error

Why is it unsafe to simply exit on this error?
- Because EINTR is often not a real failure. It is usually a temporary interruption.
- If the program exits immediately:
  - it may stop reading valid input even though nothing is wrong with the file or pipe
  - it may terminate due to a completely normal signal
  - it can lose data or behave unreliably
  - it makes the program fragile in signal-heavy environments
- So exiting on EINTR treats a recoverable condition like a fatal error.

Answer:

EINTR happens when a blocked read() is interrupted by a signal before finishing. The code should retry the call in a loop when errno == EINTR. Exiting immediately is unsafe because the interruption is temporary and not an actual read failure.

Problem 4

A pipe has a buffer size of 64 KB. A process writes continuously to the pipe while another process reads slowly.

What happens when the buffer fills?
- A pipe is a bounded kernel buffer (here 64 KB).
- If the writer keeps writing while the reader is slow, the pipe eventually becomes full.
- At that point, the kernel cannot accept more data from the writer.
Does write() block or fail?
- It depends on how the pipe (file descriptor) is configured.
- Default behavior (blocking I/O)
  - write() blocks (sleeps) until:
  - the reader consumes some data, creating space in the pipe buffer.
  - So the writer process is paused by the kernel and resumes later.
  - This is the normal expected behavior.
- Non-blocking mode (O_NONBLOCK)
  - If the write end of the pipe has O_NONBLOCK set, e.g. fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
  - then a write() to a full pipe:
    - does NOT block
    - fails immediately
    - returns -1 and sets errno = EAGAIN (or EWOULDBLOCK)
Under what conditions would SIGPIPE be generated?
- SIGPIPE happens in a completely different situation:
  - When a process writes to a pipe that has no readers.
- This occurs when:
  - all file descriptors referring to the read end of the pipe are closed
  - and a process calls write() on the write end
- Then the kernel:
  1. sends SIGPIPE to the writing process
  2. write() fails and returns -1
  3. errno is set to EPIPE
- If the signal is not caught or ignored → the process is terminated (default action).
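The non-blocking case can be observed directly. The sketch below fills a pipe that nobody reads and counts how many bytes fit before write() reports EAGAIN. The function name and the 1024-byte chunk size are assumptions for illustration, and the capacity it reports depends on the system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of bytes accepted by the pipe
 * before a non-blocking write() fails with EAGAIN, or -1 on error. */
long fill_pipe_nonblocking(void) {
    int p[2];
    if (pipe(p) == -1) return -1;

    /* Add O_NONBLOCK to the existing flags instead of overwriting them. */
    int flags = fcntl(p[1], F_GETFL);
    if (flags == -1 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    char chunk[1024] = {0};
    long total = 0;
    for (;;) {
        ssize_t n = write(p[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;                 /* may be a short count */
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            break;                      /* buffer is full */
        } else if (n == -1 && errno == EINTR) {
            continue;                   /* interrupted, try again */
        } else {
            total = -1;                 /* unexpected failure */
            break;
        }
    }

    close(p[0]);
    close(p[1]);
    return total;
}
```

Without the O_NONBLOCK step, the same loop would simply hang in write() once the buffer filled, which is the default blocking behavior described above.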

Problem 5

What happens to each of the following across an exec() call?

Process ID
- Preserved.
  - The process does not become a new process
  - Only the program running inside it changes
  - Therefore: The PID remains the same across exec()
Open file descriptors
- Preserved (usually).
  - All open file descriptors remain open across exec()
  - The file offset and flags remain the same
- Exception:
  - Any descriptor marked with FD_CLOEXEC is automatically closed.
  - Example: fcntl(fd, F_SETFD, FD_CLOEXEC);
Signal handlers
- Reset to default.
  - Any signal handler installed with signal() or sigaction() is cleared.
  - After exec(), signals go back to their default disposition.
- Important nuance:
  - Signals set to SIG_IGN (ignored) remain ignored across exec() (POSIX rule; SIGCHLD is the one case POSIX leaves unspecified).
Memory heap
- Replaced / destroyed.
  - The entire address space is replaced:
    - heap
    - stack
    - global variables
    - code segment
    - mmap regions
- All dynamic allocations (malloc) are gone.
Current working directory
- Preserved.
  - The working directory is a process attribute, not part of the program image.
  - Therefore it remains unchanged after exec().
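The FD_CLOEXEC exception can be set up with a small helper. This is a sketch only; the name set_cloexec is made up here. It reads the current descriptor flags first so that it adds the bit rather than replacing the whole flag set.

```c
#include <fcntl.h>

/* Hypothetical helper: mark fd close-on-exec without disturbing other
 * descriptor flags. Returns 0 on success, -1 on error. */
int set_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```

A descriptor marked this way stays usable in the current program and is closed only when an exec() succeeds.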

Problem 6

Consider a program that opens a file, writes some data, and then calls fork(). What happens to the file descriptor in the child? Is the write position shared? What could go wrong if both parent and child write to the file without coordination?

What happens to the file descriptor in the child after fork()?
- After fork(), the child receives a copy of the parent’s file descriptor table.
- So:
  - The child has a valid file descriptor referring to the same open file
  - Both descriptors refer to the same open file description in the kernel
Is the write position (file offset) shared?
- Yes — the file offset is shared.
- Because both descriptors point to the same kernel open-file object, they share:
  - file offset (write/read position)
  - status flags (e.g., O_APPEND)
  - access mode
- So if:
  - parent writes 10 bytes → offset moves forward
  - child then writes → it continues from the updated offset
What could go wrong if both parent and child write to the file without coordination?
- This creates a race condition.
- Possible problems:
  1. Interleaved / corrupted output
    - Writes from parent and child may overlap unpredictably. Example:
      Expected:
```
Parent line
Child line
```
      Possible real result:
```
ParChild line
ent line
```
  2. Lost updates / overwritten data
    - If both processes:
      - read current offset
      - compute something
      - then write
    - One write may overwrite the other.
  3. Non-atomic large writes
    - Regular file writes are not guaranteed to be atomic (unlike small pipe writes).
    - So partial interleaving can occur.
  4. Ordering becomes unpredictable
    - You cannot assume:
      - parent writes first
      - child writes second
    - The OS scheduler decides.
- Common solutions:
  - use wait() so only one process writes at a time
  - use file locking (flock, fcntl locks)
  - open file with O_APPEND (offset movement becomes atomic per write)
  - use IPC / synchronization (pipes, semaphores, etc.)
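The shared offset is easy to confirm. In the sketch below (the function name and the temporary file path are assumptions for illustration), only the child writes, yet the parent's descriptor reports a moved offset afterwards.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child writes 5 bytes, then the parent asks where
 * its own descriptor is positioned. Returns that offset, or -1 on error. */
off_t offset_after_child_write(void) {
    char path[] = "/tmp/forkdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) return -1;
    unlink(path);                 /* file disappears once fd is closed */

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        ssize_t n = write(fd, "child", 5);
        _exit(n == 5 ? 0 : 1);
    }

    int status;
    off_t pos = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* The parent never wrote, yet its offset has moved. */
        pos = lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    return pos;
}
```

Here waitpid() also acts as the coordination step: the parent touches the file only after the child is done, so there is no race.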

1.2 Signals

Problem 1

Stacks as a Data Structure (in C). Implement a stack in C using a dynamic array (i.e., using malloc/realloc). Your implementation should support push, pop, peek, and is_empty. Can your implementation be made signal-safe?

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

typedef struct {
    int *data;
    size_t size;      // number of elements currently in stack
    size_t capacity;  // allocated capacity
} Stack;

#define INITIAL_CAPACITY 4

int stack_init(Stack *s) {
    if (s == NULL) return -1;

    s->data = malloc(INITIAL_CAPACITY * sizeof(int));
    if (s->data == NULL) {
        return -1;
    }

    s->size = 0;
    s->capacity = INITIAL_CAPACITY;
    return 0;
}

void stack_destroy(Stack *s) {
    if (s == NULL) return;

    free(s->data);
    s->data = NULL;
    s->size = 0;
    s->capacity = 0;
}

bool is_empty(const Stack *s) {
    return (s == NULL || s->size == 0);
}

int push(Stack *s, int value) {
    if (s == NULL) return -1;

    if (s->size == s->capacity) {
        size_t new_capacity = s->capacity * 2;
        int *new_data = realloc(s->data, new_capacity * sizeof(int));
        if (new_data == NULL) {
            return -1; // resize failed
        }
        s->data = new_data;
        s->capacity = new_capacity;
    }

    s->data[s->size] = value;
    s->size++;
    return 0;
}

int pop(Stack *s, int *out) {
    if (s == NULL || s->size == 0) {
        return -1; // empty stack
    }

    s->size--;
    if (out != NULL) {
        *out = s->data[s->size];
    }
    return 0;
}

int peek(const Stack *s, int *out) {
    if (s == NULL || s->size == 0 || out == NULL) {
        return -1;
    }

    *out = s->data[s->size - 1];
    return 0;
}

int main(void) {
    Stack s;

    if (stack_init(&s) != 0) {
        fprintf(stderr, "Failed to initialize stack\n");
        return 1;
    }

    push(&s, 10);
    push(&s, 20);
    push(&s, 30);

    int x;
    if (peek(&s, &x) == 0) {
        printf("peek = %d\n", x);
    }

    while (!is_empty(&s)) {
        if (pop(&s, &x) == 0) {
            printf("pop = %d\n", x);
        }
    }

    stack_destroy(&s);
    return 0;
}

A dynamic-array stack can be implemented with malloc/realloc, but it cannot be truly async-signal-safe because memory allocation functions are not async-signal-safe. If signal safety is required, the stack should use preallocated memory and avoid calling malloc, realloc, or free from code that may run in a signal handler.
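A minimal sketch of the preallocated alternative, assuming a fixed upper bound is acceptable. The names FixedStack, FIXED_CAPACITY, and the fixed_* functions are made up for illustration. This removes only the allocator problem: if a handler and the main program both touch the same stack, the main program still has to block the signal around each update.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIXED_CAPACITY 64   /* assumed limit, chosen for illustration */

typedef struct {
    int data[FIXED_CAPACITY];  /* storage lives inside the struct: no malloc */
    size_t size;               /* number of elements currently in stack */
} FixedStack;

void fixed_init(FixedStack *s) { s->size = 0; }

bool fixed_is_empty(const FixedStack *s) { return s->size == 0; }

/* Returns 0 on success, -1 if the stack is full. */
int fixed_push(FixedStack *s, int value) {
    if (s->size == FIXED_CAPACITY) return -1;
    s->data[s->size++] = value;
    return 0;
}

/* Returns 0 and stores the top in *out, or -1 if the stack is empty. */
int fixed_pop(FixedStack *s, int *out) {
    if (s->size == 0) return -1;
    *out = s->data[--s->size];
    return 0;
}

/* Returns 0 and stores the top in *out without removing it, or -1. */
int fixed_peek(const FixedStack *s, int *out) {
    if (s->size == 0) return -1;
    *out = s->data[s->size - 1];
    return 0;
}
```

The trade-off is that push can now fail when the stack is full instead of growing.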

Problem 2

Suppose a program installs a handler for SIGINT.

void handler(int sig) {
    printf("Signal received\n");
}

Why is this handler potentially unsafe?
- This is unsafe because printf() is NOT async-signal-safe.
  - If a signal interrupts the program while it is already inside printf() (or another stdio function), and the handler calls printf() again:
  - internal stdio buffers may be in an inconsistent state
  - locks inside the C library may already be held
  - this can cause deadlock, memory corruption, or undefined behavior
- In other words, somewhere in the middle of one printf() call, the handler starts a second printf() call.
- stdio keeps a user-level buffer for each stream; the stream also holds the file descriptor that the buffer is eventually written to.
- To serialize calls on a stream, each stdio function acquires the stream's lock on entry and releases it on exit.
- If the interrupted printf() already holds that lock, the printf() in the handler tries to acquire the same lock and can never get it, because the interrupted call cannot resume until the handler returns. That is the deadlock.
- So a handler should call only functions that are async-signal-safe.
- None of the standard I/O functions are async-signal-safe.
Which function should be used instead of printf()?
- Use write(). It is a system call that goes straight to the kernel and takes no user-level lock, so it cannot deadlock the way printf() can.
- The output may now appear in a different order than the calls were made.
- The rest of the program typically uses stdio functions such as printf(), which accumulate output in a user-level buffer.
- write() bypasses that buffer and reaches the descriptor immediately, so the handler's message can appear before text that printf() buffered earlier and has not yet flushed.
What property must functions used in signal handlers satisfy?
- They must be async-signal-safe. The lecture describes this loosely as atomicity.
- Atomic here means the signal never catches the function in the middle of doing something.
- The underlying notion is consistency:
  - A consistent block is a stretch of code we do not want interrupted partway through
  - Whatever the signal handler touches should be updated atomically
- From the handler's point of view, an operation either has not started or has already finished when the signal arrives.
- This means:
  - the function can be safely called from within a signal handler
  - it must not:
    - use non-reentrant global state
    - acquire locks that may already be held
    - rely on heap allocation or stdio buffering
    - call other unsafe functions internally
- POSIX defines a specific list of async-signal-safe functions (examples):
  - write
  - _exit
  - signal
  - sigaction
  - kill
  - fork (with caveats)
  - wait
  - read

Problem 3

Write a program that installs a signal handler for SIGINT. The handler should increment a counter and print how many times the user has pressed Ctrl+C. After 3 presses, the program should exit cleanly. What constraints apply to code inside a signal handler?

Caveat: standard signals do not queue. If several SIGINTs arrive while one is already pending, they are coalesced into a single pending signal, so the counter may undercount. Real-time signals do queue, but we are not covering them.

Answer:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>

volatile sig_atomic_t count = 0;

void handler(int sig) {
    count++;
	
    if (count == 1) {
        write(STDOUT_FILENO, "Ctrl+C pressed 1 time\n", 22);
    } else if (count == 2) {
        write(STDOUT_FILENO, "Ctrl+C pressed 2 times\n", 23);
    } else if (count >= 3) {
        write(STDOUT_FILENO, "Exiting after 3 presses\n", 24);
        _exit(0);   // async-signal-safe exit
    }
}

int main(void) {
    struct sigaction sa;
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    sigaction(SIGINT, &sa, NULL);

    while (1) {
        pause();   // wait for signals
    }

    return 0;
}

Constraints:

Only async-signal-safe functions may be called
Must avoid non-atomic shared data access
Keep handlers short and simple

Problem 4

Which of the following are valid things to do safely inside a signal handler, and why?

Call printf()
- Not safe.
- Why:
  - printf() is not async-signal-safe
  - it uses stdio buffers and internal library state
  - if the signal interrupts code already using stdio, calling printf() in the handler can cause deadlock or undefined behavior
Set a volatile sig_atomic_t flag
- Safe.
- Why:
  - sig_atomic_t is the standard type meant for values that can be accessed safely between normal code and a signal handler
  - volatile prevents the compiler from optimizing away reads/writes
Call malloc()
- Not safe.
- Why:
  - malloc() is not async-signal-safe
  - it may manipulate heap metadata or locks
  - if the signal arrives while the program is already inside malloc()/free(), calling it again in the handler can corrupt memory or deadlock
Write to a global array
- Not safe in general.
- Why:
  - writing arbitrary global data is only safe if you can guarantee it will not be accessed concurrently in an unsafe way
  - normal writes to arrays are not automatically atomic
  - this can create races or inconsistent state if the main program also uses that array
- So:
  - writing one simple sig_atomic_t flag: safe
  - modifying a general global array: not generally safe

Notes:

If you declare a variable volatile (ideally volatile sig_atomic_t), you can safely touch it in your handler
- Without volatile, the compiler is free to keep the variable in a register during execution
- If your handler then reads x from memory, it may not see the value your program most recently wrote
- When you declare a variable as volatile, every read and write goes to its memory address
  - The compiler may not cache the value in a register.
You do not want to call malloc in a handler: if the signal interrupted malloc or free in the main program, the heap's bookkeeping may be half-updated, and a second call can corrupt it or deadlock on the allocator's lock

Example

long long x;
x++;

There is no guarantee that the increment on x is atomic
On some machines a long long is read and written in two halves, so a signal can arrive between them
A handler that reads x at that moment may see a half-updated, inconsistent value

Anything that is atomic is signal-safe?

No — atomic ≠ signal-safe.
Atomic
- An operation is atomic if it cannot be interrupted in the middle and appears indivisible.
- Example:
  - writing a sig_atomic_t
  - some hardware instructions
  - some lock-free operations
- Atomicity is about data consistency / race conditions.
Async-signal-safe
- A function is signal-safe if it is guaranteed by POSIX to be callable from inside a signal handler.
- This is about reentrancy and internal library state.
- A function may:
  - acquire locks
  - use global buffers
  - allocate memory
  - call other unsafe functions
- Even if parts of it use atomic instructions, the function as a whole can still be unsafe.
  - atomic_fetch_add(&x, 1); is atomic, but that alone does not make it safe in a handler:
    - C11 only permits a handler to use atomic objects that are lock-free
    - if the implementation falls back to a hidden lock for that type, the call can deadlock like any other locking function

Problem 5

What is the difference between SIGKILL and SIGTERM? Can a process catch or ignore either? Write the kill command invocation you would use to send each.

SIGTERM (Signal 15)
- Graceful termination request.
- Default action: terminate the process
- The process can catch, handle, or ignore this signal
  - Gives the program a chance to:
    - clean up resources
    - close files
    - save state
    - terminate child processes
- Example use cases:
  - shutting down servers
  - stopping background jobs cleanly
SIGKILL (Signal 9)
- Forced termination.
  - Default action: immediately terminate the process
  - The process cannot catch, ignore, or block this signal
  - The kernel terminates the process without running any more of its code
- Consequences:
  - no cleanup
  - stdio buffers are not flushed, so buffered output is lost (the kernel still closes the descriptors)
  - shared resources may be left inconsistent
  - temporary files may remain
- Used when:
  - process is stuck
  - ignoring SIGTERM
  - in a runaway state (a process in uninterruptible sleep does not die until that sleep ends)
Can a process catch or ignore them?

Signal	Catchable?	Ignorable?	Blockable?
SIGTERM	Yes	Yes	Yes
SIGKILL	No	No	No

kill command invocation:

Send SIGTERM (default)

kill <pid>

or:

kill -TERM <pid>

kill -15 <pid>

Send SIGKILL

kill -KILL <pid>

or:

kill -9 <pid>

1.3 Pipes and Inter-Process Communication

Practice Problem 10.

Using pipe() and fork(), write a program where the parent sends the string "hello" to the child through a pipe, and the child reads it and prints it. What happens to the unused ends of the pipe, and why is it important to close them?

Answer:

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int main(void) {
    int p[2];
    if (pipe(p) == -1) {
        perror("pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return 1;
    }

    if (pid == 0) { // child (reader)
        close(p[1]);                    // close unused write end
        char buf[16];
        ssize_t n = read(p[0], buf, sizeof(buf));
        if (n > 0) {                    // skip printing on EOF or error
            write(STDOUT_FILENO, buf, n);   // print what was read
            write(STDOUT_FILENO, "\n", 1);
        }
        close(p[0]);
    } else { // parent (writer)
        close(p[0]);                    // close unused read end
        write(p[1], "hello", 5);
        close(p[1]);                    // important → send EOF
        wait(NULL);
    }
    return 0;
}

What happens to unused pipe ends and why close them?

Each process must close the pipe end it does not use.
This prevents:
- resource leaks
- incorrect blocking behavior
Most importantly:
- If the writer does not close its write end, the reader will not see EOF and may block forever waiting for more data.

Practice Problem 11.

Two processes communicate through a pipe. The writer closes its end and exits. What does the reader observe when it tries to read() from the pipe? What signal, if any, is involved?

Answer:
When the writer closes its end and exits, the reader calling:

read(pipe_fd, buf, size);

will observe:
- read() first returns any data still buffered in the pipe; once the pipe is empty, read() returns 0 → EOF
This means:
- No more writers exist and no more data will arrive.
What signal is involved?
- No signal is sent to the reader.
- Signals only occur in the opposite direction:
  - writing to a pipe with no readers → SIGPIPE

Practice Problem 12.

Explain what a broken pipe is. Write a scenario (in pseudocode or prose) that would trigger SIGPIPE, and describe what the default behavior is when a process receives it.

What is a broken pipe?

A broken pipe occurs when a process attempts to write to a pipe whose read end has been closed.

Scenario that triggers SIGPIPE

Example (prose):
1. Parent creates the pipe and forks
2. Parent closes its own copy of the read end
3. Child closes its read end and exits
4. Parent later calls: write(pipe_write_fd, data, size);
Since:
- no process has the read end open
The kernel:
- sends SIGPIPE to the writing process
- write() fails and returns -1
- errno = EPIPE

Default behavior of SIGPIPE

The process is terminated immediately
No cleanup occurs unless the signal is caught or ignored

1.4 MCQs

Problem 1

A process opens a file and then calls fork(). Which statement is correct?

The parent and child get independent file offsets.
The parent and child share the same file offset.
The child cannot access the file unless it calls open() again.
The file descriptor is closed automatically in the child.

Problem 2

After a successful exec() call:

The process keeps its heap and stack.
The process keeps its open file descriptors.
The process keeps its signal handlers unchanged.
The process keeps its environment variables unchanged.

(Choose all that apply.)

Note:

The environment is passed to the new program image unless a different one is explicitly provided (e.g., through the envp argument of execve).

Problem 3

Suppose a process writes to a pipe whose read end has been closed. What happens?

write() returns 0.
write() returns −1 and errno = EPIPE.
The process receives SIGPIPE.
The pipe silently discards the data.

(Choose all that apply.)

Problem 4

If a process calls: read(fd, buf, 100) on a pipe, when does it return 0?

When the pipe buffer is empty.
When the write end of the pipe has been closed and no more data remains.
When the process receives a signal.
When the file descriptor reaches the end of its buffer.

Note:

For pipes, read() returning 0 means EOF (end-of-file).
This happens only when:
- All write ends of the pipe are closed, and
- All data already in the pipe buffer has been consumed.
At that point, the kernel knows:
- No more bytes can ever arrive → return 0.

Problem 5

Which signals cannot be caught or ignored?

A. SIGSTOP
B. SIGKILL
C. SIGINT
D. SIGTERM

Note:

SIGKILL
- Forces immediate termination.
- Cannot be caught, ignored, or blocked.
- Used when a process must be killed no matter what.
SIGSTOP
- Forces the process to stop (pause execution).
- Cannot be caught, ignored, or blocked.
- Used by the kernel / shell for job control.

Problem 6

Which statement about Unix directories is true?

Directory entries store file data directly.
Directory entries map filenames to inode numbers.
Directories are stored only in memory.
A directory can contain two entries with the same name.

Note:

In Unix file systems, a directory is essentially a special file that contains entries of the form:
- filename → inode number
The inode stores metadata and pointers to the actual file data blocks.
The directory itself does not store file data — it only stores these mappings.
❌ Directory entries store file data directly.
- False — file data is stored in data blocks referenced by the inode, not in the directory.
❌ Directories are stored only in memory.
- False — directories are persistent structures stored on disk.
❌ A directory can contain two entries with the same name.
- False: filenames must be unique within the same directory. Hard links let several different names refer to the same inode, but each name is still unique in its directory.

Problem 7

If a parent process never calls wait() for a child that exits, the child becomes:

A. blocked
B. orphaned
C. zombie
D. suspended

Note:

When a child process exits, the kernel:
- keeps a small entry in the process table
- stores the child’s exit status and accounting info
This state is called a zombie process.
It remains a zombie until the parent calls: wait() or waitpid() to reap the child
❌ A. blocked
- Blocked means waiting for I/O, a lock, or some resource — not the case here.
❌ B. orphaned
- An orphan happens when the parent exits before the child.
- Then the child is adopted by init (or systemd).
❌ D. suspended
- Suspended means stopped (e.g., by SIGSTOP), not exited.

Problem 8

Given:

  -rw-r----- file.txt

Which processes can read the file?

The owner
Members of the group
All users
Only root

Note:

Breakdown:
- Owner: rw- → read and write
- Group: r-- → read only
- Others: --- → no permissions

Problem 9

Which of the following are system calls?

printf()
read()
malloc()
write()

Note:

✅ read()
- This is a system call used to read data from files, pipes, sockets, etc.
- It causes a trap into the kernel to perform I/O.
✅ write()
- Also a system call used to write data to file descriptors.
❌ printf()
- This is a C library function (stdio).
- It eventually may call write(), but it is not itself a system call.
❌ malloc()
- This is also a library function for heap allocation.
- It may internally use system calls like brk() or mmap(), but malloc() itself is not a system call.

Last minute checklist

What state a process must save during a context switch
- The kernel must save the process’s CPU execution context, including:
  - program counter (instruction pointer)
  - CPU registers
  - stack pointer
  - processor status/flags
  - memory management state (e.g., page table pointer)
- This allows the process to resume exactly where it left off.
Why user level threads are cooperative unless extra machinery is added
- User-level threads execute on their own stack and run continuously until they explicitly yield control of the CPU (e.g., by calling a function like t_yield()). The kernel sees only the one process, so it never switches between its user-level threads. Without extra machinery, such as a periodic timer signal whose handler invokes the scheduler, the thread library has no way to wrest control from a running thread; it relies entirely on the thread's "cooperation" to voluntarily invoke the scheduler and let another thread run.
What read returning 0 means for a regular file and for a pipe
- Regular File: It means the End of File (EOF) has been reached, and there are no more bytes left to read.
- Pipe: It also indicates EOF, but specifically implies that all write ends of the pipe have been closed and the pipe is empty. (If a write end were still open, the read() call would block and wait for data instead of returning 0).
Why short counts are possible
- Short counts occur when read() or write() transfers fewer bytes than you requested. This is not an error and can happen because:
  - The EOF was reached before the requested number of bytes could be read.
  - When reading from a network socket or pipe, only part of the requested data may have arrived so far, and read() returns what is available.
  - Record-oriented devices (like magnetic tape) may only return data one record at a time.
  - The system call was interrupted by an asynchronous signal mid-transfer.
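Because short counts are legal, code that must transfer a whole buffer loops until everything is out. A sketch of such a loop for write(); the helper name write_all is made up for illustration:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all count bytes are out.
 * Returns 0 on success, -1 on a real error (errno is left set). */
int write_all(int fd, const void *buf, size_t count) {
    const char *p = buf;
    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted before any byte moved */
            return -1;
        }
        p += n;                  /* short count: skip what was written */
        count -= (size_t)n;
    }
    return 0;
}
```

The matching loop for read() looks the same, with one extra exit when read() returns 0 for EOF.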
Why open twice is different from dup2
- Calling open() twice creates two completely independent entries in the kernel's system-wide File Table. Because the sessions are separate, each has its own independent file cursor/offset.
- Conversely, dup2() copies a file descriptor so that both the original and the new descriptor point to the exact same File Table entry. Therefore, descriptors copied via dup2 share the exact same file offset and status flags.
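The difference shows up as soon as one descriptor moves its offset. The sketch below uses dup() rather than dup2(); it creates the same kind of sharing but picks the new descriptor number itself. The function name and the temporary path are assumptions for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demo: write 3 bytes through one descriptor, then report the
 * offsets seen through a dup() copy and through a second open().
 * Returns 0 on success, -1 on error. */
int compare_offsets(off_t *dup_offset, off_t *reopen_offset) {
    char path[] = "/tmp/dupdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) return -1;

    int rc = -1;
    int copy = dup(fd);                  /* same File Table entry as fd */
    int again = open(path, O_RDONLY);    /* brand-new File Table entry */
    if (copy != -1 && again != -1 && write(fd, "abc", 3) == 3) {
        *dup_offset = lseek(copy, 0, SEEK_CUR);      /* moved along with fd */
        *reopen_offset = lseek(again, 0, SEEK_CUR);  /* still at the start */
        rc = 0;
    }

    if (copy != -1) close(copy);
    if (again != -1) close(again);
    close(fd);
    unlink(path);
    return rc;
}
```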
Why a child can affect the parent's file position after fork
- When fork() is called, the child inherits an exact copy of the parent's file descriptor table. Because both the parent's and the child's file descriptors point to the exact same shared File Table entry in the kernel, they share the same file cursor (offset). Therefore, if the child reads or writes to the file, it advances that shared cursor, inherently changing the file position for the parent as well.
Why the first open commonly returns descriptor 3
- Unix processes generally begin life with three standard open file descriptors already assigned to the terminal: 0 (standard input), 1 (standard output), and 2 (standard error). Because the open() system call guarantees it will always return the lowest available unopened descriptor, the first new file a program opens will naturally be assigned descriptor 3.
How a shell redirects standard output
- The shell redirects output by using the dup2() system call to manipulate file descriptors. For example, to redirect output into a pipe or a file, the forked child opens the target and calls dup2(fd, 1). dup2() closes whatever descriptor 1 referred to and makes it refer to the write end of the pipe or the open file instead. As a result, anything the program attempts to write to standard output is automatically routed to the new destination.
- Example:
  - The shell (in the forked child):
    1. opens the target file
    2. uses dup2(fd, STDOUT_FILENO)
    3. closes the original descriptor
    4. executes the program (exec())
  - Now descriptor 1 points to the file instead of the terminal.
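The four steps can be sketched as a small function that behaves roughly like `echo hello > path`. The function name is made up, and the sketch assumes an echo program can be found on PATH; a real shell does considerably more.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sketch of `echo hello > path`. Returns the child's exit
 * status, or -1 if the child could not be created or did not exit normally. */
int run_echo_redirected(const char *path) {
    pid_t pid = fork();
    if (pid == -1) return -1;

    if (pid == 0) {
        /* 1. open the target file */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1) _exit(126);
        /* 2. make descriptor 1 refer to it */
        if (dup2(fd, STDOUT_FILENO) == -1) _exit(126);
        /* 3. close the original descriptor; descriptor 1 keeps the file open */
        if (fd != STDOUT_FILENO) close(fd);
        /* 4. run the program; it inherits the redirected descriptor 1 */
        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127);                      /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

The redirection is done in the child after fork(), so the shell's own standard output is left untouched.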
What fflush actually forces the library to do
- Standard I/O functions accumulate data in a user-space memory buffer for efficiency. Calling fflush() forces the C standard library to empty that buffer by immediately executing the underlying raw Unix write system call, transferring the accumulated data to the output file descriptor or device.
- It does not guarantee the data reaches disk, only that it leaves the library buffer.
Why mixing fprintf with write can reorder results
- fprintf uses standard I/O, which buffers its output in memory (often waiting for a newline character or for the buffer to fill up before writing).
- In contrast, write is a raw Unix I/O system call that sends data to the OS immediately without buffering. If you mix them, the unbuffered write output will print instantly, while the fprintf output may remain held in the buffer and print later when it is finally flushed, altering the expected chronological order of the outputs.
Why stdio buffers can be duplicated across fork
- Standard I/O functions (like printf or fprintf) accumulate data in user-space memory buffers to improve efficiency. When fork() is called, the OS creates an exact copy of the parent's entire virtual address space for the child. Because the stdio buffer resides in this memory space, any unflushed data sitting in the parent's buffer at the moment of the fork is perfectly duplicated into the child's memory.
How to interpret octal permission values
- Each octal digit is the sum of the bit values for Read (R), Write (W), and Execute (X). The three digits of the octal value stand, in order, for Owner, Group, and Others.
  - r = 4 (100 in binary)
  - w = 2 (010 in binary)
  - x = 1 (001 in binary)
- For example, an octal value of 7 (111 in binary) grants full r, w, and x permissions, while 6 (110 in binary) grants only Read and Write permissions.
- Example: 754
  - 754 → owner: rwx (7), group: r-x (5), others: r-- (4)
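The same decoding can be done mechanically by testing the nine permission bits from left to right. A small sketch; the helper name mode_to_string is made up for illustration:

```c
/* Hypothetical helper: turn the low nine permission bits of mode into a
 * string such as "rwxr-xr--". out must hold at least 10 bytes. */
void mode_to_string(unsigned mode, char out[10]) {
    const char letters[] = "rwx";
    for (int i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);          /* 0400 first, 0001 last */
        out[i] = (mode & bit) ? letters[i % 3] : '-';
    }
    out[9] = '\0';
}
```

For 0754 this produces rwxr-xr--, matching the breakdown above. Note the leading 0, which is what makes a C literal octal.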
What execute means on a directory
- On a directory, the execute (x) bit acts as the "search" bit. It means you are allowed to find a file by its precise name and, most importantly, access the metadata (the inode) of the files within that directory. Without it, you cannot read or write to any files inside, even if you know they are there.
What set uid changes during program execution
- If a program file has the set-uid bit (u+s) enabled, executing it sets the process's effective User ID (UID) to the UID of the file's owner instead of the caller's. The program therefore runs with the owner's privileges for as long as it executes, unless it drops them itself.
Why signal handlers must stay simple
- Signals are delivered asynchronously, meaning they can interrupt the main program between any two instructions. If the main program is in the middle of a non-atomic operation (like updating a global data structure) when the signal arrives, a handler that reads or modifies that same data sees it in an inconsistent state, which leads to unpredictable bugs. Therefore, handlers must be kept extremely simple: set a volatile sig_atomic_t flag, or call only async-signal-safe functions.
Why SIGKILL cannot be caught or ignored
- SIGKILL (along with SIGSTOP) cannot be caught, overridden, or ignored by a process. This is a deliberate design choice by the OS to ensure that the kernel always retains a guaranteed, surefire mechanism to forcefully terminate runaway or unresponsive processes.
Why a SIGCHLD handler often uses waitpid in a loop
- Standard UNIX signals do not queue; they are coalesced. If multiple children terminate simultaneously while the parent is already inside the SIGCHLD handler (where the signal is temporarily blocked), the kernel will only record one single pending SIGCHLD signal for all of them. Using waitpid in a loop (usually with the WNOHANG flag) ensures the handler checks for and reaps all terminated children before returning, preventing zombies.
How to protect shared data from asynchronous signal delivery
- To protect shared data from being corrupted by unpredictable interrupts, you must manually block the signal before entering a critical section of code, and unblock it immediately after. This is done using the sigprocmask() system call, which temporarily adds the signal to the process's blocked signal mask.
What SIGPIPE means
- SIGPIPE is a software-generated signal that indicates a broken pipe. The kernel sends this signal to a process when it attempts to write data to a pipe that no longer has any active readers (e.g., the reading process has already exited or closed its read descriptor).
How to trace shared versus copied state without guessing
- Instead of guessing what a process is doing under the hood, you can use the strace utility. strace intercepts and records the exact sequence of system calls a program is executing (including file opens, forks, and network connections) along with their return values, allowing you to trace exactly how the OS handles its state.
- Use the rule:
  - Shared kernel objects (after fork):
    - open file descriptions (file offset, flags)
    - pipes
    - sockets
  - Copied user memory:
    - heap
    - stack
    - globals
    - stdio buffers
- General principle:
  - If the state lives in the kernel, it is usually shared.
  - If it lives in user address space, it is copied.