Fork and pipe

#fork #processes #pipe

Class: CSCE-313

Notes:

Question 1

What is returned by fork() in the child process?

Options:

PID of the child
1
PID of the parent
0

Overall explanation:

When you call fork(), the OS creates a new process (the child) that is almost an exact copy of the parent.
- In the parent process, fork() returns the PID of the child
- In the child process, fork() returns 0
If it fails, it returns -1 (in the parent, since the child never gets created)
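
A minimal sketch of the three cases above, assuming a POSIX system. The names ForkResult and observe_fork are made up for these notes; the child relays the value it saw from fork() through its exit status so the parent can inspect it.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper type for this sketch: what each side saw from fork().
struct ForkResult {
    pid_t parent_saw;  // fork()'s return value in the parent (child's PID, or -1)
    int child_saw;     // fork()'s return value in the child, relayed via exit status
};

ForkResult observe_fork() {
    pid_t pid = fork();
    if (pid < 0) {
        // fork() failed: no child exists, and the parent simply keeps running.
        return {pid, -1};
    }
    if (pid == 0) {
        // Child: fork() returned 0 here. Report that value as the exit status.
        _exit(static_cast<int>(pid));
    }
    // Parent: pid is the child's PID. Reap the child and read its status.
    int status = 0;
    waitpid(pid, &status, 0);
    int child_saw = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return {pid, child_saw};
}
```

Calling observe_fork() from main() should give a positive parent_saw and a child_saw of 0 whenever the fork succeeds.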

Tags: 03 - Inter Process Communication - Pipes and FIFO#Forks

Fork & exec

Evaluate whether the following statements regarding fork and exec are true or false

Question 2

After a fork(), the parent receives the PID of the child process as the return value.

Options:

True
False

Overall explanation:

The parent process gets the PID of the newly created child as the return value.
The child process gets 0.

Tags: Lab 2 - Unix Pipes

Question 3

After a fork(), the child receives its own PID as the return value.

Options:

True
False

Overall explanation:

The parent process gets the PID of the newly created child as the return value.
The child process gets 0.

Tags: Lab 2 - Unix Pipes

Question 4

After a fork(), the parent and child processes are nearly identical.

Options:

True
False

Overall explanation:

Right after fork(), the child process is basically a copy of the parent:
- Same code
- Same variables / memory contents (logically)
- Same open file descriptors
- Same execution point (both continue right after the fork() line)
The main differences are things like:
- Different PID
- Different return value from fork()
- Parent/child relationship info

Tags: Lab 2 - Unix Pipes

Question 5

During a fork(), the parent process is terminated if an error occurs.

Options:

True
False

Overall explanation:

If fork() fails, it returns -1 in the parent process, and no child process is created.
But the parent is not terminated automatically.
So the parent just keeps running and can handle the error (usually by checking whether the value returned by fork() is negative).

Tags: Lab 2 - Unix Pipes

Question 6

After a fork(), the parent process is guaranteed to run first.

Options:

True
False

Overall explanation:

After fork(), the parent and child are both runnable, and the OS scheduler decides who runs first.
So:
- Sometimes the parent runs first
- Sometimes the child runs first
- Sometimes they interleave in weird ways
The scheduler bases its decision on complex algorithms, current system load, process priorities, and even the number of available CPU cores.
There is no guarantee unless you use synchronization (like wait(), signals, pipes, etc.).

Tags: 03 - Inter Process Communication - Pipes and FIFO#Forks

Question 7

After a fork(), the child process is guaranteed to run first.

Options:

True
False

Overall explanation:

After fork(), both processes are ready to run, and the OS scheduler chooses which one runs first.

Tags: 03 - Inter Process Communication - Pipes and FIFO#Forks

Question 8

After a fork(), the next process to run is determined by the scheduler.

Options:

True
False

Overall explanation:

Once a fork() system call is executed, both the parent and the child processes are ready to run. The operating system's scheduler is the component responsible for deciding which one gets CPU time first.
Key Concepts
- Non-determinism: You cannot assume a specific order. On some systems or under certain loads, the parent might continue immediately; on others, the child might be given priority to execute its first few instructions (like an exec() call).
- Race Conditions: If your code depends on the parent finishing something before the child starts (or vice versa) without explicit synchronization, you have a race condition.
- Synchronization: To control this behavior, you use system calls like wait() or waitpid(), which force the parent to pause until the child changes state or terminates.
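
A small sketch of forcing an order with waitpid(), assuming a POSIX system. The function name child_then_parent is hypothetical. Each process writes one marker byte into a shared pipe, and because the parent only writes after waitpid() returns, the child's byte should always come first.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the order in which the two processes wrote their marker byte,
// or an empty string if setup failed.
std::string child_then_parent() {
    int fds[2];
    if (pipe(fds) != 0) return "";

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {
        // Child: write its marker, then exit without running any parent code.
        close(fds[0]);
        ssize_t w = write(fds[1], "C", 1);
        _exit(w == 1 ? 0 : 1);
    }

    // Parent: block until the child has terminated, then write its own marker.
    waitpid(pid, nullptr, 0);
    ssize_t w = write(fds[1], "P", 1);
    close(fds[1]);

    char buf[2] = {0, 0};
    ssize_t n = (w == 1) ? read(fds[0], buf, sizeof buf) : -1;
    close(fds[0]);
    return n > 0 ? std::string(buf, static_cast<size_t>(n)) : "";
}
```

Without the waitpid() call, either byte could land in the pipe first, which is exactly the race condition described above.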

Tags: 03 - Inter Process Communication - Pipes and FIFO#Forks

Question 9

After an exec(), a new process is created.

Options:

True
False

Overall explanation:

While fork() creates a new process, exec() (and its variants like execve) replaces the current process.
What actually happens during exec()?
- No New PID: The process ID (PID) remains exactly the same. The "identity" of the process doesn't change in the eyes of the operating system.
- Memory Overwrite: The current executable image (code, data, stack, and heap) is completely wiped out and replaced with the new program's code and data.
- Execution Start: The process begins executing the new program from its main() function.
- The "Point of No Return": If exec() is successful, it never returns to the original program because that program's code no longer exists in that process's memory.
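
One way to check the "No New PID" point, sketched under the assumption that /bin/sh exists and that `echo $$` prints the shell's own PID. The function name exec_keeps_pid is made up for these notes. The child redirects stdout into a pipe, calls execv(), and the parent compares what the new program printed with the PID that fork() returned.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true if the PID reported by the exec'd program equals the PID
// that fork() gave the parent for that child.
bool exec_keeps_pid() {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        // Child: route stdout into the pipe, then become /bin/sh (assumed path).
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        char* const args[] = {(char*)"sh", (char*)"-c", (char*)"echo $$", nullptr};
        execv("/bin/sh", args);
        _exit(127);  // only reached if execv failed
    }

    // Parent: read the PID printed by the new program image.
    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return n > 0 && std::atol(buf) == static_cast<long>(pid);
}
```

The open pipe descriptor surviving the execv() call is also what makes the redirection work here.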

Tags: [[04 - Process API - Part I#What is exec()?]]

Question 10

After an exec(), new program code is loaded into the process.

Options:

True
False

Overall explanation:

When a process calls exec(), the operating system loads the new executable file into the address space of the existing process.
What Happens Internally?
- Code Replacement: The current text segment (the actual machine instructions) is overwritten by the machine code of the new program.
- Data Reset: The global variables, heap, and stack are also reinitialized to match the requirements of the new program.
- Entry Point: Once the loading is complete, the CPU starts executing the new program from its defined entry point (usually main).
Essentially, exec() is like a "brain transplant"—the body (the Process ID and system resources) stays the same, but the thoughts and instructions (the code and data) are entirely new.

Tags: [[03 - Inter Process Communication - Pipes and FIFO#An aside-execl]]

Question 11

You must always call fork() before you call exec().

Options:

True
False

Overall explanation:

You can call exec() at any time in any process. If you call exec() without calling fork() first:
- The current process simply transforms into the new program.
- The original program "disappears" and is replaced.
- The Process ID (PID) stays the same, but the code being executed changes.
When would you do this?
- Chain Loading: A program that performs some initialization (like setting environment variables) and then hands over control entirely to another program.
- Shell Wrappers: A script that sets up a specific environment and then calls exec on the final application so that the application inherits the script's PID.
The Common Pattern (Fork-Exec)
- In most cases, like when you type a command into a terminal, the shell must fork first. If it didn't, the shell itself would be replaced by the command you typed, and the terminal would close as soon as that command finished.

Tags: [[04 - Process API - Part I#What is exec()?]]

Question 12

You must always call exec() after you call fork().

Options:

True
False

Overall explanation:

While fork() and exec() are often seen together, they are independent system calls that serve different purposes. You can absolutely use fork() without ever calling exec().
The Difference in Outcomes
- Fork then Exec: The child process discards the parent's code and loads a completely different program (like the shell running ls).
- Fork only: The child process continues running the same code as the parent, usually branching off using an if (pid == 0) statement to perform specific tasks.
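
A sketch of the "fork only" case, assuming a POSIX system. The function name add_in_child is hypothetical. The child never calls exec(); it keeps running the same program, does a small piece of work inside the if (pid == 0) branch, and sends the result back through a pipe.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Computes a + b in a forked child and hands the result to the parent
// through a pipe. Returns false if any step failed.
bool add_in_child(int a, int b, int& result) {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        // Child: same program code as the parent, different branch.
        close(fds[0]);
        int sum = a + b;
        ssize_t w = write(fds[1], &sum, sizeof sum);
        _exit(w == static_cast<ssize_t>(sizeof sum) ? 0 : 1);
    }

    // Parent: read the child's answer, then reap the child.
    close(fds[1]);
    ssize_t n = read(fds[0], &result, sizeof result);
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return n == static_cast<ssize_t>(sizeof result);
}
```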

Tags: [[04 - Process API - Part I#What is exec()?]]

Question 13

When a parent process exits, all child processes must immediately exit.

Options:

True
False

Overall explanation:

In Unix-like systems, a child process is not tied to the "life support" of its parent. If a parent process terminates, the child processes continue to run independently.
What happens when a Parent exits first?
When a parent process finishes before its children, the children become Orphan Processes. However, they aren't left without a leader for long:
- Re-parenting: The operating system automatically "re-parents" these orphans. Traditionally, they are adopted by the first system process, init (which has a PID of 1), or a modern equivalent like systemd.
- Cleanup: The new parent (init) takes on the responsibility of "reaping" the child (calling wait()) when it eventually finishes, ensuring that the system doesn't get cluttered with zombie processes.
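
A rough sketch of watching re-parenting happen, assuming a POSIX system. The function name orphan_gets_new_parent and the polling interval are made up for these notes. A middle process forks a grandchild and exits right away; the grandchild polls getppid() until it changes and reports the new value through a pipe.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true if the grandchild observed a parent PID different from the
// middle process after that middle process exited.
bool orphan_gets_new_parent() {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t middle = fork();
    if (middle < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (middle == 0) {
        pid_t middle_pid = getpid();  // remembered before the second fork
        pid_t grandchild = fork();
        if (grandchild == 0) {
            close(fds[0]);
            // Poll (for at most about 5 seconds) until the kernel re-parents us.
            for (int i = 0; i < 5000 && getppid() == middle_pid; ++i) usleep(1000);
            pid_t now = getppid();
            ssize_t w = write(fds[1], &now, sizeof now);
            _exit(w == static_cast<ssize_t>(sizeof now) ? 0 : 1);
        }
        _exit(0);  // middle process exits immediately, orphaning the grandchild
    }

    // Original process: reap the middle process, then read the report.
    close(fds[1]);
    waitpid(middle, nullptr, 0);
    pid_t new_parent = 0;
    ssize_t n = read(fds[0], &new_parent, sizeof new_parent);
    close(fds[0]);
    return n == static_cast<ssize_t>(sizeof new_parent) && new_parent != middle;
}
```

Which PID the orphan ends up with (1, or a subreaper) depends on the system, so the sketch only checks that it is no longer the middle process.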

Tags: [[04 - Process API - Part I#What happens in exec()]]

Question 14

If exec() succeeds, it returns to the calling program.

Options:

True
False

Overall explanation:

When exec() succeeds, it is often described as a "one-way trip." Because the process's entire address space is overwritten with the new program, there is no "calling program" left to return to.
Why doesn't it return?
- Memory Overwrite: The stack, heap, and code segments of the original program are wiped out. The instruction pointer is reset to the beginning of the new program.
- The "Vanishing" Act: The code that actually contained the exec() call no longer exists in that process's memory.
- Success vs. Failure:
  - On Success: The function never returns. The process simply starts living its "new life" as the new program.
  - On Failure: If the file isn't found or permissions are denied, exec() will return to the calling program with a value of -1 so you can handle the error.
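
A sketch of the failure path, assuming a POSIX system. The function name errno_after_failed_exec is hypothetical. The child tries to execv() a path, and if execution continues past that call the exec must have failed, so the child passes errno back as its exit status.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the errno seen in the child after a failed execv(), or -1 if the
// fork itself failed or the child did not exit normally. If the path is a
// real program, the return value is that program's exit code instead.
int errno_after_failed_exec(const char* path) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        char* const args[] = {const_cast<char*>(path), nullptr};
        int rc = execv(path, args);
        // Reaching this line means execv() failed and returned -1.
        _exit(rc == -1 ? errno : 0);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For a path that does not exist, the value coming back should be ENOENT.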

Tags: [[04 - Process API - Part I#What happens in exec()]]

Question 15

A child process created by fork() initially has the same program code as its parent.

Options:

True
False

Overall explanation:

The answer is True.
When fork() is called, the operating system creates a "clone" of the parent process. This includes an exact copy of the parent's memory, including the text segment where the program code resides.
How it Works
- Identical Start: Immediately after the fork(), both processes are executing the exact same binary. They both start at the instruction immediately following the fork() call.
- Shared vs. Copied: While modern operating systems use a technique called Copy-on-Write (COW) to save memory, logically, the child has its own copy of everything the parent had.
- The Fork Return Value: The only way the processes know "who is who" is by checking the return value of fork().
  - In the parent, fork() returns the PID of the child.
  - In the child, fork() returns 0.

Tags: 03 - Inter Process Communication - Pipes and FIFO#Forks

Question 16

Calling wait() or waitpid() allows a parent to wait for a child process to terminate.

Options:

True
False

Overall explanation:

The wait() and waitpid() system calls are the primary tools used for process synchronization. They allow a parent process to pause its own execution until one of its children changes state (usually meaning it has finished executing).
Why this is necessary
- Synchronization: If a parent needs the results of a child's work before moving on, it must wait().
- Resource Cleanup: When a child process terminates, it doesn't disappear completely. It enters a "Zombie" state where it still occupies an entry in the system's process table. The parent must call wait() to "reap" the child and release those final resources.
Difference between the two
- wait(): The parent waits for any of its child processes to terminate. It is a blocking call, meaning the parent stays "asleep" until a child dies.
- waitpid(): This is more specific. It allows the parent to wait for a particular child process (by PID). It also has options for "non-blocking" checks (using the WNOHANG flag), so the parent can check if a child is done without actually stopping its own work.
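
A sketch contrasting the non-blocking and blocking forms of waitpid(), assuming a POSIX system. The names WaitDemo and poll_then_wait are made up for these notes. The child is held at a "gate" pipe so the WNOHANG check is guaranteed to happen while it is still alive.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical record of what the parent observed.
struct WaitDemo {
    pid_t child;          // PID returned by fork()
    pid_t while_running;  // waitpid(..., WNOHANG) result while the child is alive
    pid_t after_exit;     // blocking waitpid() result
    int exit_code;        // child's exit code, or -1
};

WaitDemo poll_then_wait() {
    WaitDemo d{-1, -1, -1, -1};
    int gate[2];
    if (pipe(gate) != 0) return d;

    d.child = fork();
    if (d.child < 0) {
        close(gate[0]);
        close(gate[1]);
        return d;
    }
    if (d.child == 0) {
        // Child: block until the parent closes the write end (EOF on read).
        close(gate[1]);
        char c;
        while (read(gate[0], &c, 1) > 0) {}
        _exit(5);
    }

    close(gate[0]);
    int status = 0;
    // Non-blocking check: returns 0 because the child has not changed state.
    d.while_running = waitpid(d.child, &status, WNOHANG);
    close(gate[1]);  // release the child
    // Blocking wait: returns the child's PID once it has terminated.
    d.after_exit = waitpid(d.child, &status, 0);
    if (WIFEXITED(status)) d.exit_code = WEXITSTATUS(status);
    return d;
}
```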

Tags: 04 - Process API - Part I#Example command interpreter (solved)

Question 17

In UNIX, where is the buffer used to implement a pipe (as in the pipe system call) located?

Options:

On disk
In user space
In shared memory mapped by the processes
In kernel space

Overall explanation:

A pipe is a form of Inter-Process Communication (IPC) managed by the operating system. Because it acts as a bridge between two processes that usually have isolated memory, the kernel must provide a neutral ground for the data to sit.
Security and Isolation: Processes in UNIX are not allowed to access each other's memory directly (user space). By placing the buffer in kernel space, the OS ensures that the data is protected and managed through controlled system calls (read and write).
The Flow of Data: When one process writes to a pipe, the data is copied from its user-space buffer into a buffer in the kernel. When the second process reads from the pipe, the kernel copies that data from its own buffer into the second process's user-space memory.
Limited Size: Because this buffer exists in the kernel's memory, it is typically of a fixed, limited size (often 64 KB on modern Linux). If the buffer is full, the writing process is blocked by the kernel until the reader clears some space.
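
A sketch for probing the "Limited Size" point, assuming a POSIX system. The function name measure_pipe_capacity and the 512-byte chunk size are choices made for these notes. It switches the write end to non-blocking mode and writes until the kernel refuses more data, so a full buffer shows up as an error instead of a blocked process.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Fills a fresh pipe with non-blocking writes and returns how many bytes fit
// before the kernel reported the buffer full. Returns 0 on setup failure.
std::size_t measure_pipe_capacity() {
    int fds[2];
    if (pipe(fds) != 0) return 0;

    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return 0;
    }

    char chunk[512] = {0};
    std::size_t total = 0;
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += static_cast<std::size_t>(n);
            continue;
        }
        if (n == -1 && errno == EINTR) continue;
        break;  // typically EAGAIN: the kernel buffer is full
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

The exact number depends on the kernel and its configuration, so treat whatever it prints on your machine as an observation rather than a guarantee.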

Tags: 03 - Inter Process Communication - Pipes and FIFO#What is a pipe in Linux?

Question 18

Immediately after fork() completes, if a parent process and its child process read from the same virtual address, which of the following is true? Read about the Copy-on-write (COW) optimization that the Unix kernel uses with fork() to avoid copying a process’s entire memory unless it’s actually needed.

Options:

They must read from different physical memory locations.
The child always reads uninitialized memory.
They may read from the same physical memory location.
The parent’s memory is copied immediately.

Overall explanation:

Copy-on-write (COW) in Unix is a resource-management technique used to optimize memory and storage efficiency by postponing data duplication until it is absolutely necessary. The core idea is that multiple processes can share the same physical data (memory pages or disk blocks) with read-only permissions, and a private copy is created only when one of them attempts to modify it.
How Copy-on-Write (COW) Works
- In a traditional, "naive" fork(), the kernel would immediately duplicate every single page of physical memory. For a large process, this is incredibly slow and wasteful—especially if the child is just going to call exec() and throw that memory away anyway.
- Instead, the Unix kernel uses COW:
  - Shared Pages: Immediately after fork(), the kernel points the virtual memory of both the parent and the child to the exact same physical pages in RAM.
  - Read-Only Marking: The kernel marks these shared physical pages as read-only.
  - The "Copy" Trigger: As long as both processes are only reading, they both look at the same physical memory. However, the moment either process tries to write to a page, the CPU triggers a protection fault. The kernel then:
    1. Intervenes and creates a real copy of that specific page.
    2. Updates the writing process's page table to point to the new copy.
    3. Marks the page as writable for that process.

Question 19

After a successful exec() call, which of the following remains unchanged?

Options:

The program code
The return address of exec()
The stack contents
The process ID (PID)

Overall explanation:

What Stays the Same?
- Even though the "soul" of the process changes, the "body" provided by the kernel remains:
  - PID: The unique identifier for the process does not change.
  - Parent PID (PPID): The relationship to the parent process is preserved.
  - File Descriptors: Open files (unless specifically marked "close-on-exec") remain open and accessible to the new program. This is how shells redirect input and output before running a command.
  - Real User/Group ID: The ownership of the process typically stays the same.

Tags: [[04 - Process API - Part I#What happens in exec()]]

Question 20

Which of the following best describes the relationship between a parent and child process immediately after fork()?

Options:

The child replaces the parent
They have separate virtual address spaces
They share the same PID
They execute different programs

Overall explanation:

Even though they might share physical RAM initially via Copy-on-Write (COW), their virtual address spaces are distinct.
If the child changes a variable at virtual address 0x400500, the parent's value at that same virtual address remains unchanged.
The kernel manages separate page tables for each process to ensure that one cannot accidentally (or intentionally) corrupt the memory of the other.
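
A sketch of the "child changes a variable" scenario, assuming a POSIX system. The names Snapshot and write_in_child are hypothetical. The child overwrites a local variable and reports its view through its exit status; the parent then checks its own copy.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical record of the variable's value as seen by each process.
struct Snapshot {
    int parent_value;
    int child_value;
};

Snapshot write_in_child() {
    int value = 1;  // same virtual address in both processes after fork()
    pid_t pid = fork();
    if (pid < 0) return {value, -1};
    if (pid == 0) {
        // Child: this write only affects the child's address space. On a
        // kernel using COW, this is where the page would get a private copy.
        value = 99;
        _exit(value);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    int child_value = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return {value, child_value};  // parent's value is still 1
}
```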

Tags: [[03 - Inter Process Communication - Pipes and FIFO#UNIX fork()]]

Consider the two programs shown below.

// Program 1
#include <cstdio>
#include <unistd.h>

int main() {
    printf("PID %d running prog1\n", getpid());
    return 0;
}

// Program 2
#include <cstdio>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    char *argv[2];
    argv[0] = (char *)"prog1";
    argv[1] = nullptr;

    printf("PID %d running prog2 (parent before fork)\n", getpid());

    pid_t pid = fork();

    if (pid == 0) {  
        // Child process  
        execv("./prog1", argv);  
  
        // Only runs if execv fails  
        perror("execv failed");
        return 1;
    } else {  
        // Parent process  
        waitpid(pid, nullptr, 0);
        printf("PID %d exiting from prog2 (parent after wait)\n", getpid());
    }

    return 0;
}

Question 21

How many different PIDs will print out if you run Program 2?

Output:

macc@craftlab:~/TAMUcode/CSCE-313$ ./prog2
PID 2897235 running prog2 (parent before fork)
PID 2897236 running prog1
PID 2897235 exiting from prog2 (parent after wait)

PID A: The original process running Program 2.
PID B: The new process created by fork(). Even though the child calls execv() to run Program 1, it retains the same PID it was given during the fork.

Answer: 2

Question 22

How many lines of output will you see (assuming execv("./prog1", argv) succeeds)?

Output:

PID 2897235 running prog2 (parent before fork)
PID 2897236 running prog1
PID 2897235 exiting from prog2 (parent after wait)

The first line comes from the parent before the fork.
The second line comes from Program 1 (the child process after the execv transformation).
The third line comes from the parent after the waitpid call finishes.

Answer: 3

Question 23

What are the lines of output? When outputting a line, use xxx to denote an unknown PID.

Output (with PIDs unknown):

PID xxx running prog2 (parent before fork)
PID xxx running prog1
PID xxx exiting from prog2 (parent after wait)

Parent starts: Prints the "before fork" message.
fork() occurs: A child process is created with a new PID.
Child branch: The child calls execv("./prog1", argv). Its memory is wiped and replaced by Program 1. It prints Program 1's message and then exits from Program 1's main(). Because execv succeeded, Program 2's code no longer exists in the child, so it never reaches the perror or the end of Program 2.
Parent branch: The parent calls waitpid(pid, nullptr, 0). It pauses until the child (Program 1) finishes.
Parent resumes: Once the child is done, the parent prints the "after wait" message and exits.

Question 24

Which process prints the line PID ... running prog1 ?

Options:

Both the parent and child processes print this line
The child process created by fork(), after it successfully replaces itself with Program 1 using execv()
None of the above
The original parent process running Program 2, before the call to fork()

Overall explanation:

The Split: When fork() is called, two identical processes exist. The child process enters the if (pid == 0) block.
The Transformation: Inside that block, the child calls execv("./prog1", argv). At this exact moment, the code for Program 2 is wiped from the child's memory and replaced with the code for Program 1.
The Execution: Program 1 begins its execution from its own main() function. The very first thing it does is execute its own printf statement: printf("PID %d running prog1\n", getpid());.
The Identity: Even though the child is now running the code from "Program 1," it is still the same process entity that was created by the fork, which is why it maintains its unique Child PID.

Tags: Lab 2 - Unix Pipes#2. Create child to run first command

Question 25

In Program 2, how many times does fork() return?

Answer: 2

Overall explanation:

While you only see the word fork() written once in the source code, you have to remember that fork() creates a nearly identical clone of the process at the exact moment the function is finishing.
Why it returns twice:
- The Parent's Return: The original process (the parent) calls fork(). The operating system completes the task and returns the Process ID (PID) of the new child to the parent.
- The Child's Return: The newly created process (the child) starts its life at the exact same point—inside the fork() system call. The operating system returns 0 to this new child process.
The Logic of the "Split"
- Because the function returns different values to each process, it allows the code to branch. This is why the if (pid == 0) check works: even though both processes are looking at the same line of code, they receive different results from the function call, leading them down different paths.

Creating a pipeline

Do Problem 1 on the class website. When indicating the sequence of statements corresponding to [A], [B], etc., make sure that your response has no whitespace and that the statements in a sequence are separated by commas. For example, if [A] should be replaced by the sequence 3, 4, and 7 in that order, indicate it as 3,4,7.

Problem 1. Here’s the skeleton of a shell function implementing a simple two-command pipeline, such as “cmd1 | cmd2”.

void simple_pipe(char* cmd1, char** argv1, char* cmd2, char** argv2) {
    int pipefd[2], r, status;

    [A]

    pid_t child1 = fork();
    if (child1 == 0) {
        [B]
        execvp(cmd1, argv1);
    }
    assert(child1 > 0);

    [C]

    pid_t child2 = fork();
    if (child2 == 0) {
        [D]
        execvp(cmd2, argv2);
    }
    assert(child2 > 0);

    [E]
}

And here is a grab bag of system calls. STDIN_FILENO is a macro that expands to 0, and STDOUT_FILENO expands to 1.

1. close(pipefd[0]);
2. close(pipefd[1]);
3. dup2(pipefd[0], STDIN_FILENO);
4. dup2(pipefd[0], STDOUT_FILENO);
5. dup2(pipefd[1], STDIN_FILENO);
6. dup2(pipefd[1], STDOUT_FILENO);
7. pipe(pipefd);
8. r = waitpid(child1, &status, 0);
9. r = waitpid(child2, &status, 0);

Your task is to assign system call IDs, indicated by numerals such as “1”, to slots, such as “A”, to achieve a correct foreground pipeline. You can assume that execvp(cmd, argv) is a version of exec that replaces the caller process with the execution of cmd run with arguments in argv.

You may use each system call ID once, more than once, or not at all.
You may use zero or more system call IDs per slot. Write them in the order they should appear in the code.
You may assume that no signals are delivered to the shell process (so no system call ever returns an EINTR error).
The simple_pipe function should wait for both commands in the pipeline to complete before returning.

Question 26

Which system call(s) should be placed at location [A] to correctly initialize the pipeline?

Enter the system call number(s) in the order they should appear.
If more than one system call is needed, separate them with commas and do not include spaces.

Answer:

7

Overall explanation:

System Call 7 (pipe(pipefd);): This creates the pipe. It must be called in the parent process before the fork() calls so that both child processes inherit the same file descriptors. This establishes the shared kernel buffer that will allow cmd1 to send its output to cmd2.

Question 27

Which system call(s) should be placed at location [B] so that cmd1 writes its output to the pipe?

Enter the system call number(s) in the order they should appear.
If more than one system call is needed, separate them with commas and do not include spaces.

Answer:

6,1

Overall explanation:

6 (dup2(pipefd[1], STDOUT_FILENO);): This is the most critical step. It duplicates the write end of the pipe (pipefd[1]) onto the standard output file descriptor (1). After this, any data cmd1 tries to print to the screen is instead diverted into the pipe.
1 (close(pipefd[0]);): Good practice in pipe programming is to close the file descriptors you aren't using. Since child1 is only writing, it should close the read end of the pipe (pipefd[0]) to avoid resource leaks and potential hangs.

Question 28

Which system call(s) should be placed at location [D] so that cmd2 reads its input from the pipe?

Enter the system call number(s) in the order they should appear.
If more than one system call is needed, separate them with commas and do not include spaces.

Answer:

3,2

Overall explanation:

3 (dup2(pipefd[0], STDIN_FILENO);): This duplicates the read end of the pipe (pipefd[0]) onto standard input (file descriptor 0). Now, when cmd2 tries to read data, it will pull directly from the kernel buffer populated by cmd1.
2 (close(pipefd[1]);): The second child is only reading from the pipe, so it must close the write end (pipefd[1]). This is especially important for cmd2 because if any process (including itself) still has the write end open, cmd2 might hang forever waiting for more data that will never come.
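
Putting slots [A], [B], and [D] together, here is one possible sketch of the whole function, assuming a POSIX system. It is not the graded solution: the name simple_pipe_status, the int return value, the _exit(127) fallbacks, the extra close() calls in the children, and the choice to do all of the parent's closing and waiting after the second fork are assumptions made for these notes.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs "cmd1 | cmd2" in the foreground. Returns cmd2's exit code,
// or -1 if setup failed or cmd2 did not exit normally.
int simple_pipe_status(char* cmd1, char** argv1, char* cmd2, char** argv2) {
    int pipefd[2];
    if (pipe(pipefd) != 0) return -1;        // [A]: call 7

    pid_t child1 = fork();
    if (child1 == 0) {
        dup2(pipefd[1], STDOUT_FILENO);      // [B]: call 6
        close(pipefd[0]);                    // [B]: call 1
        close(pipefd[1]);                    // extra: stdout now refers to the pipe
        execvp(cmd1, argv1);
        _exit(127);                          // only reached if execvp failed
    }

    pid_t child2 = -1;
    if (child1 > 0) {
        child2 = fork();
        if (child2 == 0) {
            dup2(pipefd[0], STDIN_FILENO);   // [D]: call 3
            close(pipefd[1]);                // [D]: call 2
            close(pipefd[0]);                // extra: stdin now refers to the pipe
            execvp(cmd2, argv2);
            _exit(127);
        }
    }

    // Parent: close both ends so cmd2 sees EOF once cmd1 is done writing.
    close(pipefd[0]);
    close(pipefd[1]);

    // Wait for both commands before returning (foreground pipeline).
    int status = 0;
    if (child1 > 0) waitpid(child1, &status, 0);
    if (child2 <= 0) return -1;
    waitpid(child2, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the parent skipped its two close() calls, cmd2 would keep waiting for input, because the parent would still be holding a write end of the pipe.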

Question 29

When a parent process exits, what happens to its child processes?

Options:

Child processes are suspended until the parent restarts.
Child processes continue running and are adopted by another process.
Child processes become zombie processes permanently.
All child processes are immediately terminated.

Overall explanation:

When a parent process terminates, the kernel looks at all of its active children and performs a re-parenting operation:
- The New Parent: Traditionally, the orphans are adopted by the very first process started by the system, init (PID 1). On modern Linux systems using systemd, they are often adopted by systemd or a specific "subreaper" process.
- The Role of the Adopter: The new parent process is programmed to automatically call wait() on these orphans when they eventually finish. This ensures that when the child finally dies, its resources are cleaned up and it doesn't stay a "zombie" forever.
Why the other options are incorrect:
- Suspended until the parent restarts: False. In UNIX, once a process exits, it cannot "restart" and reclaim its children. The child simply continues its execution independently.
- Become zombie processes permanently: False. A process only becomes a zombie after it has finished executing but before its parent has acknowledged it. Orphaned children can still run for hours or days before finishing.
- Immediately terminated: False. While some specific systems or non-standard configurations (like using prctl with PR_SET_PDEATHSIG on Linux) can change this, the default behavior is that they keep running.

Tags: 04 - Process API - Part I

Question 30

For each of the actions below, select the most appropriate system call from the following list.

Options:

wait
open
exec
sleep
exit
kill
fork
read

Answer:

Obtain the termination status of a child process.: wait
End the execution of the current process.: exit
Deliver a signal to another process.: kill
Load and execute a different program within the current process.: exec
Create a child process.: fork

Overall explanation:

Obtain the termination status of a child process: wait
- (Specifically, wait or waitpid blocks the parent until a child finishes, returns that child's PID, and stores its termination status, from which the exit code can be read with WEXITSTATUS.)
End the execution of the current process: exit
- (This syscall terminates the process and passes an exit status back to the kernel for the parent to collect.)
Deliver a signal to another process: kill
- (Despite the name, kill is used to send any signal, such as SIGTERM, SIGKILL, or SIGUSR1, to a process.)
Load and execute a different program within the current process: exec
- (This replaces the current process image with a new one.)
Create a child process: fork
- (This creates a near-identical clone of the calling process.)
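
A sketch tying fork, kill, and wait together, assuming a POSIX system. The function name terminate_child_with is hypothetical, and it assumes the chosen signal's default action is to terminate the process (true for SIGTERM, SIGKILL, and SIGUSR1). The parent delivers a signal to the child and then uses the status from waitpid() to see how the child ended.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that just sleeps, sends it signo, and returns the signal
// number that terminated it (or -1 if it ended some other way).
int terminate_child_with(int signo) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: restore the default action in case it was inherited as
        // ignored (this call is a harmless no-op for SIGKILL), then sleep.
        signal(signo, SIG_DFL);
        for (;;) pause();
    }

    // Parent: kill() delivers the signal; it does not have to be SIGKILL.
    if (kill(pid, signo) != 0) return -1;

    int status = 0;
    waitpid(pid, &status, 0);  // obtain the child's termination status
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```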