Lab 4 - Signals

Notes:

Instructions

In this lab, you will enhance the banking system from previous labs by adding signal handling capabilities. Signals are software interrupts that allow processes to handle asynchronous events. You'll implement handlers for various signals to make the system more robust and responsive to events like user interrupts (Ctrl+C), timeouts, and child process termination.

What are Signals?

Signals are notifications sent to a process to inform it that a particular event has occurred. These events might be:

User interrupts (like pressing Ctrl+C, which sends SIGINT)
Timer expiration (SIGALRM)
Child process termination (SIGCHLD)
Various error conditions (SIGSEGV, SIGPIPE, etc.)

When a signal arrives, the process can:

Handle the signal with a custom handler
Ignore the signal
Allow the default action to occur
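
As a minimal sketch of these three choices, the helper below installs either a custom handler, SIG_IGN (ignore), or SIG_DFL (default action) through sigaction(). The names set_disposition, on_usr1, and g_seen are hypothetical and are not part of the lab's starter code. SIGUSR1 is used here only so that experimenting does not interfere with Ctrl+C.

```cpp
#include <cstring>
#include <signal.h>

// Hypothetical flag: set to 1 when the custom handler runs.
volatile sig_atomic_t g_seen = 0;

// Hypothetical custom handler for SIGUSR1.
void on_usr1(int) { g_seen = 1; }

// Installs `handler` for `signo`. `handler` may be a function,
// SIG_IGN (ignore the signal), or SIG_DFL (restore the default action).
// Returns true when sigaction() succeeds.
bool set_disposition(int signo, void (*handler)(int)) {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = handler;
    sa.sa_flags = 0;
    return sigaction(signo, &sa, nullptr) == 0;
}
```

Calling set_disposition(SIGUSR1, on_usr1) and then raise(SIGUSR1) runs the handler; after set_disposition(SIGUSR1, SIG_IGN) the same raise() has no visible effect.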

Starter Code

The provided code builds upon the banking application:

This system consists of a Client (parent) that manages three Servers (children) via Named Pipes (FIFOs).

Process Logic (client.cpp)
- Startup: The client uses fork() and execvp() to launch ./finance, ./logging, and ./file.
- Communication: It creates three RequestChannel objects. Every time a user picks a menu option, the client sends a Request and waits for a Response.
- Shutdown: It sends a QUIT request to each server and then uses wait() to clean up the child processes.
The Communication Layer (channel.cpp)
- This file handles the actual "speaking" between processes.
- Crucial for Lab 4: The functions send_request and receive_request both call wait_with_timeout(). This is where your SIGALRM logic will be triggered. If your signal handler sets the timeout_occurred flag correctly, these functions will stop waiting and return an error to the user.
The Signal Controller (signals.cpp)
This is your primary workspace. It acts as the "Event Listener" for the entire system:
- shutdown_requested: Set by SIGINT. Tells the while loop in client.cpp to stop and start the cleanup.
- timeout_occurred: Set by SIGALRM. Tells the RequestChannel to stop waiting for a server that is taking too long.
- server_processes Vector: A registry of the PIDs of the servers. When SIGCHLD fires, you use this list to see which specific server (finance, file, or logging) has stopped.
The Server Files (finance.cpp, file.cpp, logging.cpp)
- These are infinite while(true) loops.
- They wait for a Request, process it, and send back a Response.
- They exit only when they receive a Request of type QUIT.

How it all connects:

Event	Signal	Handler Action	System Result
User hits Ctrl+C	`SIGINT`	Set `shutdown_requested`	Client finishes current loop and exits gracefully.
The server is slow	`SIGALRM`	Set `timeout_occurred`	channel.cpp stops waiting and returns an error.
Server process dies	`SIGCHLD`	Update Registry	Server Status menu option shows "TERMINATED".

Code Structure

lab3/
├── signals.h              # Signal handling declarations 
├── signals.cpp            # Signal handler implementations 
├── finance.cpp            # Financial transaction server 
├── file.cpp               # File operations server 
├── logging.cpp            # Logging server 
├── client.cpp             # Main client program 
├── common.h               # Common declarations 
├── common.cpp             # Common implementations 
├── channel.h              # IPC channel declarations 
├── channel.cpp            # IPC channel implementations 
├── test_signals.cpp       # Signal handling tests 
└── Makefile               # Build system

Signal Files (signals.h, signals.cpp): These files contain the signal handling infrastructure for the banking system. The signals.h file declares the SignalHandling namespace with atomic flags for tracking signal states and function declarations for signal handlers. The signals.cpp file implements these handlers and the signal management functions. You will need to implement several TODOs in these files.
Client Program (client.cpp): The client code includes TODO sections.

Objectives

After completing this lab, you will understand:

How to implement and manage signal handlers
How to handle timeouts in system operations
How to properly block and unblock signals
How to implement graceful shutdown mechanisms
How to handle child process termination

In signals.cpp:

Here you will implement the signal handlers and the helper functions that block and unblock signals. Detailed information about each function can be found in the code comments. The following TODOs are to be implemented in signals.cpp:

Atomic Flags Declaration

Declare atomic variables for shutdown_requested (boolean), timeout_occurred (boolean), and child_exited (integer) to track the system state. These atomic variables ensure thread-safe operations when dealing with asynchronous signals.

Refer to this manual page to understand more about them: https://en.cppreference.com/w/cpp/atomic/atomic
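
The pattern these flags support can be sketched as follows: a handler stores into an atomic flag, and ordinary code polls it. This is only an illustration under assumed names (demo_stop, demo_stop_handler, and count_until_stopped are hypothetical), and it uses SIGUSR1 with raise() so that it can run without user input.

```cpp
#include <atomic>
#include <cstring>
#include <signal.h>

// Hypothetical flag, playing the role that shutdown_requested plays in the lab.
std::atomic<bool> demo_stop(false);

// The handler does one thing: record that the signal arrived.
void demo_stop_handler(int) { demo_stop.store(true); }

// Runs a loop that stops as soon as the flag is set. The loop raises
// SIGUSR1 against itself once `raise_at` iterations have completed.
// Returns the number of iterations completed, or -1 on setup failure.
int count_until_stopped(int raise_at, int limit) {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = demo_stop_handler;
    if (sigaction(SIGUSR1, &sa, nullptr) == -1) {
        return -1;
    }

    demo_stop.store(false);
    int iterations = 0;
    while (!demo_stop.load() && iterations < limit) {
        if (iterations == raise_at) {
            raise(SIGUSR1);  // handler runs before raise() returns
        }
        ++iterations;
    }
    return iterations;
}
```

The C++ standard only guarantees that atomics are usable from a signal handler when they are lock-free, so checking demo_stop.is_lock_free() once at startup is a reasonable precaution.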

Signal Handler Setup

Implement a robust signal handling mechanism using the sigaction structure. Signal handlers are crucial for managing asynchronous events in your application. Begin by carefully initializing the sigaction structure for three specific signals: SIGINT, SIGALRM, and SIGCHLD.

For the SIGINT and SIGALRM handlers, use sigaction with default settings, meaning you'll set no specific flags. These handlers are typically used to manage interrupt signals (like Ctrl+C) and timer-related events.

The SIGCHLD handler requires special attention. Use the SA_RESTART flag for this handler, which is designed to automatically restart system calls interrupted by the child process signal. This flag is particularly useful when dealing with child processes, as it prevents system calls from failing due to signal interruptions. The SIGCHLD signal is sent to the parent process when a child process terminates, stops, or is resumed, making it crucial for process management.

Refer to this simple signal handler code in C++ to understand more about the implementation.

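#include <iostream>
#include <signal.h>
#include <unistd.h>

// signal handler: uses only async-signal-safe calls (write, _exit)
void signal_handler(int s)
{
    (void)s;  // the signal number is not needed here
    const char msg[] = "Caught SIGINT, exiting\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    _exit(1);
}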

// entry
int main(int argc, char* argv[])
{
    // setup signal handler
    struct sigaction sigIntHandler;

    sigIntHandler.sa_handler = signal_handler;
    sigemptyset(&sigIntHandler.sa_mask);
    sigIntHandler.sa_flags = 0;
    sigaction(SIGINT, &sigIntHandler, NULL);

    std::cout << "Press Ctrl+C to exit." << std::endl;

    for (;;) {
        sleep(1);
    }

    return 0;
}

SIGINT Handler Implementation

The SIGINT handler is responsible for managing the application's graceful shutdown mechanism when a user interrupts the program (typically via Ctrl+C). Implement a two-stage shutdown process that provides users with control over the termination sequence. On the first SIGINT signal, initiate a graceful shutdown by setting a predefined shutdown flag. Use only signal-safe functions for logging and communication to prevent potential race conditions or undefined behavior.

When the first SIGINT is received, set the shutdown flag, which should prevent new operations from starting and prepare the application for a clean exit. If a second SIGINT is received before the graceful shutdown completes, immediately force the application to exit. Ensure all print statements are done using signal-safe functions like write() instead of printf() or cout.

SIGALRM Handler Implementation

The SIGALRM handler is crucial for managing operation timeouts in the application. When a SIGALRM signal is received, immediately set a timeout flag to indicate that an operation has exceeded its allocated time. Use signal-safe functions to record the event for the user.

This handler should focus on marking the timeout state, allowing the main application logic to respond appropriately. The timeout flag serves as a communication mechanism between the signal handler and the main application thread, signaling that a time-sensitive operation has failed to complete within the expected timeframe.

SIGCHLD Handler Implementation

The SIGCHLD handler is responsible for managing child process termination. It uses waitpid() with the WNOHANG flag to retrieve information about terminated child processes without blocking. This approach prevents the handler from blocking and allows multiple child process status checks in a single call.

For each terminated child process, the critical task is updating the server registry. Iterate through the server processes vector to find the matching process, update its status, and log the termination event as described in the code comments.

Signal Blocking Implementation

Signal blocking is a critical mechanism for protecting critical sections of code from interruption. The block_signals() function should create a signal set using sigset_t and carefully block specific signals to prevent unexpected interruptions during sensitive operations. Focus on blocking only the SIGINT signal, which can potentially disrupt critical processes.

When implementing signal blocking, initialize the signal set with sigemptyset() so that it starts empty, add the signals to block with sigaddset(), and then apply the block with sigprocmask(). This ensures that the specified signals are temporarily prevented from interrupting the current execution context.

The blocking mechanism should be used sparingly and for short durations to prevent prolonged signal suppression, which could make the application less responsive. Ensure that signals are unblocked as soon as the critical section is complete to maintain the application's responsiveness and ability to handle asynchronous events.

Signal Unblocking Implementation

The unblock_signals() function serves as the counterpart to signal blocking, restoring the normal signal handling behavior. Create a signal set the same way as in the blocking function, then call sigprocmask() with SIG_UNBLOCK to remove the block. This function is crucial for returning the application to a state where it can receive and process signals normally.
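
A blocked signal is not lost; it stays pending and is delivered once it is unblocked. The sketch below illustrates this with hypothetical helpers (install_sigint_counter, change_sigint_mask, sigint_is_pending, and g_int_count are illustrative names, not the lab's functions). It is not the lab's required unblock_signals(), which must also log its own message.

```cpp
#include <cstdio>
#include <cstring>
#include <signal.h>

// Hypothetical counter: how many times the SIGINT handler has run.
volatile sig_atomic_t g_int_count = 0;

void count_sigint(int) { g_int_count = g_int_count + 1; }

// Installs the counting handler for SIGINT.
bool install_sigint_counter() {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = count_sigint;
    return sigaction(SIGINT, &sa, nullptr) == 0;
}

// Adds SIGINT to (SIG_BLOCK) or removes it from (SIG_UNBLOCK)
// the calling thread's signal mask.
bool change_sigint_mask(int how) {
    sigset_t set;
    if (sigemptyset(&set) == -1 || sigaddset(&set, SIGINT) == -1) {
        std::perror("building signal set");
        return false;
    }
    if (sigprocmask(how, &set, nullptr) == -1) {
        std::perror("sigprocmask");
        return false;
    }
    return true;
}

// Reports whether a SIGINT is waiting to be delivered.
bool sigint_is_pending() {
    sigset_t pending;
    if (sigpending(&pending) == -1) {
        return false;
    }
    return sigismember(&pending, SIGINT) == 1;
}
```

For example, after change_sigint_mask(SIG_BLOCK) a raise(SIGINT) leaves g_int_count unchanged and sigint_is_pending() returns true; the handler runs during the later change_sigint_mask(SIG_UNBLOCK) call.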

Timeout Mechanism Implementation

The wait_with_timeout() function introduces a timeout mechanism for operations that might hang or take too long. Use the alarm() system call to set a timer for the specified number of seconds. When the timer expires, the kernel delivers SIGALRM, which puts an upper bound on how long the program waits for a potentially stuck operation.

Timeout Cancellation Implementation

The cancel_timeout() function provides a way to cancel any pending alarm explicitly. Call alarm(0) to cancel any existing alarm without setting a new one. This is particularly useful when an operation completes successfully before the timeout, preventing unnecessary signal generation.
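
The arm-then-cancel pairing can be sketched as below. The names arm_timeout, disarm_timeout, and demo_timeout_flag are hypothetical; the lab's wait_with_timeout() and cancel_timeout() have their own required return values and log messages, which this sketch does not try to reproduce.

```cpp
#include <atomic>
#include <unistd.h>

// Hypothetical flag, standing in for timeout_occurred.
std::atomic<bool> demo_timeout_flag(false);

// Clears any stale timeout state and starts a one-shot timer.
// SIGALRM is delivered after `seconds` unless the timer is cancelled.
void arm_timeout(unsigned seconds) {
    demo_timeout_flag.store(false);
    alarm(seconds);
}

// Cancels the pending alarm, if any. alarm(0) returns the number of
// seconds that were left on the previous timer, or 0 if none was set.
unsigned disarm_timeout() {
    return alarm(0);
}
```

A caller would typically write arm_timeout(5), perform the blocking operation, call disarm_timeout(), and then check the flag to decide whether the operation timed out.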

Error Checking Requirement:

Wherever required for system calls, check the return values. If the return value is -1, use perror() to print the specific error message and handle the failure appropriately. Proper error checking is crucial for creating robust and reliable system-level applications.
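
One possible shape for this kind of checking is sketched below, assuming a hypothetical wrapper named install_handler_checked (noop_handler is also illustrative). It reports failures with perror() and tells the caller whether the installation worked.

```cpp
#include <cstdio>
#include <cstring>
#include <signal.h>

// Hypothetical handler that does nothing; used only for the demonstration.
void noop_handler(int) {}

// Installs `handler` for `signo` with the given sa_flags.
// Every call that can fail is checked; on failure the error is
// printed with perror() and false is returned.
bool install_handler_checked(int signo, void (*handler)(int), int flags) {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    if (sigemptyset(&sa.sa_mask) == -1) {
        std::perror("sigemptyset");
        return false;
    }
    sa.sa_handler = handler;
    sa.sa_flags = flags;
    if (sigaction(signo, &sa, nullptr) == -1) {
        std::perror("sigaction");
        return false;
    }
    return true;
}
```

A quick way to see the failure path is to pass SIGKILL: its disposition cannot be changed, so sigaction() returns -1 and the wrapper returns false.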

In the client.cpp

There are only a few changes in the client file: register each child process with the signal handling code after it is created, block signals before entering a critical section, and unblock them immediately after the critical-section transaction completes. Watch for the code comments in the file to understand more about the changes to be made.

The other files in the system (servers, common.h, channel code) are complete and provide the infrastructure your signal handling code will interact with. Focus your efforts on implementing the TODOs in signals.cpp and client.cpp while maintaining the existing functionality of the banking system.

Important System Limitations

Server Operation Behavior

The current implementation has some important limitations that you should be aware of:

Non-blocking Timeouts:
- While the client implements timeouts for operations, the server processes continue executing their operations even after a timeout occurs
- For example, if a deposit operation is timed out after 30 seconds but the server takes 40 seconds to process it, the deposit will still occur
- There is no rollback mechanism implemented in the servers
Transaction Consistency:
- Due to the above behavior, you might observe that:
- A "timed out" deposit might still appear in your balance
- A "failed" operation might actually succeed on the server side
- The log file might show operations that the client reported as timed out
Why This Happens:
- The servers are implemented to process requests fully once received
- Timeout signals only affect the client's waiting period
- Server operations are not designed to be interruptible
- Implementing true distributed transaction rollback would require significant additional complexity

Tasks

Implement Signal Handlers (50 points)

In signals.cpp, implement the following:

Declaration of Atomic Flags (5 points)
Signal Handler Setup Using Sigaction (15 points)
- Must use sigaction() for signal management
- Configure handlers with appropriate flags
SIGINT Handler (10 points)
- Must set shutdown_requested flag
- Must handle repeated SIGINT (force exit on second press)
- Must log the event properly
SIGALRM Handler (10 points)
- Must set the timeout_occurred flag
- Must log timeout events
SIGCHLD Handler (10 points)
- Must update the server registry for terminated child processes
- Must log the event properly

Signal Blocking/Unblocking in Client (20 points)

In signals.cpp, implement the following:

Must block appropriate signals during transactions (10 points)
Must unblock appropriate signals after transactions (10 points)

Implement Timeout Management (15 points)

In signals.cpp, implement the following functions:

Transaction Timeouts (10 points)
Cancel Timeout Function (5 points)

Server Registration with Signal Handlers (5 points)

In the client.cpp, implement the following:

Signal Blocking Around Transactions (10 points)

In the client.cpp, implement the following:

Block and Unblock signals before and after each critical transaction (10 points)

Implementation

In signals.cpp

Atomic flags

std::atomic<bool> shutdown_requested(false);
std::atomic<bool> timeout_occurred(false);
std::atomic<int> child_exited(0);

So that section becomes:

namespace SignalHandling {

    std::atomic<bool> shutdown_requested(false);
    std::atomic<bool> timeout_occurred(false);
    std::atomic<int> child_exited(0);
    
    // Server process registry
    std::vector<ServerProcess> server_processes;

What this is doing:

shutdown_requested(false);
- creates the actual storage for the flag that tells the client, “begin graceful shutdown.”
timeout_occurred(false);
- creates the flag that will be turned on when SIGALRM happens.
child_exited(0);
- creates a counter for how many child processes have terminated.

Why we write them here in signals.cpp:

signals.h

#ifndef _SIGNALS_H_
#define _SIGNALS_H_

#include <signal.h>
#include <atomic>
#include <string>
#include <vector>
#include <sys/types.h>
#include <iostream>

namespace SignalHandling {
    // Signal flags (using std::atomic for thread safety)
    extern std::atomic<bool> shutdown_requested;
    extern std::atomic<bool> timeout_occurred;
    extern std::atomic<int> child_exited;

    // Server process tracking
    struct ServerProcess {
        pid_t pid;
        std::string name;
        bool active;
    };

    extern std::vector<ServerProcess> server_processes;

    // Signal handlers
    void setup_handlers();
    void sigint_handler(int sig);
    void sigalrm_handler(int sig);
    void sigchld_handler(int sig);

    // Signal operations
    void block_signals();
    void unblock_signals();
    bool wait_with_timeout(int seconds);
    void cancel_timeout();

    // Server management
    void register_server(pid_t pid, const std::string& name);
    bool is_server_active(const std::string& name);
    void print_server_status();

    // Logging
    void log_signal_event(const std::string& message);
}

// Helper template for executing functions with timeout
template<typename Func>
bool execute_with_timeout(Func operation, int timeout_seconds) {
    SignalHandling::timeout_occurred = false;

    // Set alarm
    alarm(timeout_seconds);

    bool result = operation();

    // Cancel alarm
    alarm(0);
    return result && !SignalHandling::timeout_occurred;
}

#endif

In signals.h, they are declared with extern, which means “these variables exist somewhere else.”
In signals.cpp, we now provide the real definitions.

Why std::atomic:

Signals can arrive asynchronously, meaning your normal code and your signal handler may both touch the same variable.
Atomic variables make each read or write a single indivisible operation, so neither the handler nor the normal code can observe a half-updated value.

Why the initial values are these:

shutdown_requested = false because the program has not been asked to shut down yet.
timeout_occurred = false because no timeout has happened yet.
child_exited = 0 because no child has been reaped yet.

One important distinction:

shutdown_requested and timeout_occurred are state flags.
child_exited is a counter, so it starts at zero and increases.

Signal Handler Setup

The lab specifically says to use sigaction for SIGINT, SIGALRM, and SIGCHLD, with SA_RESTART only for SIGCHLD.

void setup_handlers() {

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);

    sa.sa_handler = sigint_handler;
    sa.sa_flags = 0;
    if (sigaction(SIGINT, &sa, NULL) == -1) {
        perror("sigaction SIGINT");
    }

    sa.sa_handler = sigalrm_handler;
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) == -1) {
        perror("sigaction SIGALRM");
    }

    sa.sa_handler = sigchld_handler;
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
        perror("sigaction SIGCHLD");
    }

    log_signal_event("Signal handlers initialized");
}

What sigaction is doing

A signal handler is just a function that the OS calls when a particular signal arrives.

So this line pattern:

sigaction(SIGINT, &sa, NULL);

means:

for signal SIGINT
use the configuration stored in sa
and install it as the new behavior for that signal

So setup_handlers() is basically saying:

when Ctrl+C happens, call sigint_handler
when an alarm timeout happens, call sigalrm_handler
when a child process changes state or exits, call sigchld_handler

Why we reuse the same `struct sigaction sa`

You start with:

struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sigemptyset(&sa.sa_mask);

This initializes the structure cleanly.

memset(&sa, 0, sizeof(sa));

This fills the whole structure with zeroes so there is no garbage data left in it.

sigemptyset(&sa.sa_mask);

This says:
- while the handler is running,
- do not additionally block any extra signals through sa_mask

So you start from a clean base configuration.

Then for each signal, you update the two important fields:

sa.sa_handler
sa.sa_flags

and call sigaction(...).

Handlers

First handler: SIGINT

sa.sa_handler = sigint_handler;
sa.sa_flags = 0;
if (sigaction(SIGINT, &sa, NULL) == -1) {
    perror("sigaction SIGINT");
}

sa.sa_handler = sigint_handler;
- tells the OS which function to call when SIGINT arrives.
sa.sa_flags = 0;
- means no special behavior, just default signal handling behavior with your handler.
sigaction(SIGINT, &sa, NULL)
- installs that configuration for SIGINT.
What SIGINT usually is
- This is the signal sent when the user presses Ctrl+C in the terminal.
- So later, when the user presses Ctrl+C, your function void sigint_handler(int sig) will run.

Second handler: SIGALRM

sa.sa_handler = sigalrm_handler;
sa.sa_flags = 0;
if (sigaction(SIGALRM, &sa, NULL) == -1) {
    perror("sigaction SIGALRM");
}

This is almost identical.
What changes here?
- Only the signal and the handler function:
  - signal = SIGALRM
  - handler = sigalrm_handler
What SIGALRM is for
- This signal is sent when an alarm() timer expires.
- So later, when your code does something like: alarm(5); then after 5 seconds, if not canceled, SIGALRM is delivered and your sigalrm_handler runs.

Third handler: SIGCHLD

sa.sa_handler = sigchld_handler;
sa.sa_flags = SA_RESTART;
if (sigaction(SIGCHLD, &sa, NULL) == -1) {
    perror("sigaction SIGCHLD");
}

What SIGCHLD is
- This signal is sent to a parent process when one of its child processes exits, stops, or resumes.
- In your lab, this matters because your banking system launches server processes, and the client needs to know if one of them terminated. The instructions explicitly say the SIGCHLD handler should use waitpid(..., WNOHANG) and update the server registry.
Why SA_RESTART
- SA_RESTART tells the system:
  - if a signal interrupts certain blocking system calls,
  - automatically restart them when possible
- This is useful for SIGCHLD because child termination can happen while the parent is doing other system calls. Without SA_RESTART, some blocking calls might fail early, returning -1 with errno set to EINTR.
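
The difference SA_RESTART makes can be observed with a small experiment, sketched here under assumed names (read_during_alarm, on_demo_alarm, and g_pipe_write_fd are hypothetical). A read() on an empty pipe is interrupted by SIGALRM about 50 ms later. The handler writes one byte into the pipe, so a restarted read() has something to return. The behavior described is what Linux does for reads on pipes; other systems may differ in detail.

```cpp
#include <cerrno>
#include <cstring>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical global: write end of the pipe, used by the handler.
static int g_pipe_write_fd = -1;

// Handler: puts one byte in the pipe (write() is async-signal-safe).
void on_demo_alarm(int) {
    const char byte = 'x';
    ssize_t n = write(g_pipe_write_fd, &byte, 1);
    (void)n;
}

// Blocks in read() on an empty pipe while a 50 ms timer is running.
// `flags` is 0 or SA_RESTART. Returns what read() returned and stores
// errno in *err. Returns -2 if the setup itself failed.
ssize_t read_during_alarm(int flags, int* err) {
    int fds[2];
    if (pipe(fds) == -1) { *err = errno; return -2; }
    g_pipe_write_fd = fds[1];

    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_demo_alarm;
    sa.sa_flags = flags;
    if (sigaction(SIGALRM, &sa, nullptr) == -1) { *err = errno; return -2; }

    struct itimerval timer;
    std::memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_usec = 50000;  // one-shot, 50 ms
    if (setitimer(ITIMER_REAL, &timer, nullptr) == -1) { *err = errno; return -2; }

    char buf = 0;
    errno = 0;
    ssize_t n = read(fds[0], &buf, 1);  // sleeps until the signal arrives
    *err = errno;

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

With flags set to 0, the call is expected to return -1 with errno equal to EINTR. With SA_RESTART, read() is restarted after the handler returns and picks up the byte the handler wrote, returning 1.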

After setup_handlers() runs, the program is now prepared for asynchronous events:

SIGINT → graceful shutdown behavior
SIGALRM → timeout behavior
SIGCHLD → child/server termination tracking

Implement SIGINT handler

The lab wants a two-stage shutdown:

1st Ctrl+C → do not kill the program immediately; set the shutdown flag so the client can exit cleanly
2nd Ctrl+C → force the process to end right away

The lab also explicitly says:

use the shutdown flag
log the exact messages
print to terminal using signal-safe functions like write()

Use this implementation:

void sigint_handler(int sig) {
    (void)sig;  // suppress unused parameter warning

    const char first_msg[] =
        "SIGINT received - initiating graceful shutdown\n";
    const char second_msg[] =
        "Second SIGINT received - forcing exit\n";

    if (!shutdown_requested) {
        // First SIGINT - Graceful termination
        shutdown_requested = true;

        write(STDOUT_FILENO, first_msg, sizeof(first_msg) - 1);
        log_signal_event("SIGINT received - initiating graceful shutdown");
    } else {
        // Second SIGINT - force exit
        write(STDOUT_FILENO, second_msg, sizeof(second_msg) - 1);
        log_signal_event("Second SIGINT received - forcing exit");
        _exit(1);
    }
}

Why the handler has this form

The function is:

void sigint_handler(int sig)

because signal handlers must match that standard signature:
- return type: void
- one parameter: the signal number that triggered it
So when the user presses Ctrl+C, the OS sends SIGINT, and this function runs.

Why we do `(void)sig;`

(void)sig;

This just tells the compiler:
- “yes, I know sig exists”
- “I am intentionally not using it”
Without this, some compilers may warn that the parameter is unused.
You could also use sig in a condition, but here it is not necessary because this function is already specifically installed for SIGINT.

Why the messages are stored as `const char[]`

const char first_msg[] =
    "SIGINT received - initiating graceful shutdown\n";
const char second_msg[] =
    "Second SIGINT received - forcing exit\n";

This is important because write() works with raw character buffers.
write() expects:
- a file descriptor
- a pointer to bytes
- the number of bytes to write
So storing the messages as C-style character arrays makes them easy to pass into write().
Also notice the \n at the end so the terminal output goes to a new line cleanly.

The key condition: first Ctrl+C or second Ctrl+C

if (!shutdown_requested) {

This means:
- if shutdown has not already been requested,
- then this is the first SIGINT
Since shutdown_requested starts as false, the first Ctrl+C enters this branch.
If the user presses Ctrl+C again later, then shutdown_requested is already true, so execution goes into the else branch.
That is how the two-stage logic works.

Why we print with `write()` instead of `cout` or printf

write(STDOUT_FILENO, first_msg, sizeof(first_msg) - 1);

The lab specifically says terminal printing inside the signal handler must use signal-safe functions.

Why not cout or printf?

Because signal handlers run asynchronously.
If a signal interrupts code while the C++ stream system or stdio library is already doing something internally, calling cout or printf from the handler can lead to undefined behavior.
write() is the safe low-level system call.

Breaking that line down:

STDOUT_FILENO = standard output
first_msg = the text buffer
sizeof(first_msg) - 1 = number of characters, excluding the null terminator '\0'

That last part matters because write() needs the exact byte count.
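
The sizeof(...) - 1 arithmetic can also be captured once in a small helper, sketched here as an optional convenience (write_literal is a hypothetical name, not something the lab provides). The template deduces the array size N, which includes the terminating '\0', and writes N - 1 bytes.

```cpp
#include <cstddef>
#include <unistd.h>

// Writes a string literal (or any char array) to `fd` without its
// terminating '\0'. N is deduced from the array type, so for "abc\n"
// N is 5 and 4 bytes are written. Returns whatever write() returns.
template <std::size_t N>
ssize_t write_literal(int fd, const char (&msg)[N]) {
    return write(fd, msg, N - 1);
}
```

Inside a handler this would be used as write_literal(STDOUT_FILENO, "some message\n"). It only works on arrays, not on const char* pointers, because the size must be known at compile time.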

Logging the event

log_signal_event("SIGINT received - initiating graceful shutdown");

The TODO explicitly tells you to log the event using the predefined logging function and to use the exact message text shown in the instructions. So this line matches the lab requirement directly.
One important note:
- in strict POSIX terms, a function like log_signal_event() is not truly async-signal-safe, because its implementation uses things like localtime(), open(), and std::string.
- however, your lab specifically instructs you to call the predefined logging function from the handler, so for this assignment you should follow the lab’s expected design.

Why use `_exit(1)` instead of `exit(1)`

Inside a signal handler, _exit() is safer than exit().

exit()

runs cleanup handlers
flushes stdio buffers
may call code that is not signal-safe

_exit()

ends the process immediately at the system-call level
avoids a lot of unsafe cleanup behavior

Since this is happening inside a signal handler, _exit(1) is the better choice.
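
The difference can be observed directly. In the sketch below (cleanup_ran_in_child, report_cleanup, and g_report_fd are hypothetical names), a child process registers an atexit() function that writes one byte to a pipe, then terminates with either exit() or _exit(). The parent checks whether the byte arrived.

```cpp
#include <cstdlib>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical global: pipe end the cleanup function reports on.
static int g_report_fd = -1;

// Registered with atexit(); runs only if normal exit processing happens.
void report_cleanup() {
    const char c = 'c';
    ssize_t n = write(g_report_fd, &c, 1);
    (void)n;
}

// Forks a child that ends with exit(0) or _exit(0).
// Returns true if the child's atexit function ran.
bool cleanup_ran_in_child(bool use_underscore_exit) {
    int fds[2];
    if (pipe(fds) == -1) return false;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {                      // child
        close(fds[0]);
        g_report_fd = fds[1];
        std::atexit(report_cleanup);
        if (use_underscore_exit) {
            _exit(0);                    // skips atexit functions
        }
        std::exit(0);                    // runs atexit functions
    }

    close(fds[1]);                       // parent keeps only the read end
    char c = 0;
    ssize_t n = read(fds[0], &c, 1);     // 1 byte if cleanup ran, 0 at EOF
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return n == 1;
}
```

cleanup_ran_in_child(false) reports that the cleanup function ran, while cleanup_ran_in_child(true) reports that it was skipped, which is exactly the property that makes _exit() the safer call inside a handler.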

Implement SIGALRM handler

The purpose of SIGALRM here is just:

mark that a timeout happened
notify the user with a signal-safe terminal message
log the exact timeout event text the lab requires

The lab explicitly says the handler should set timeout_occurred and log "SIGALRM received - operation timed out". It also explains that this flag is how the main application knows an operation took too long.

Use this implementation:

void sigalrm_handler(int sig) {
    (void)sig;  // suppress unused parameter warning
	
    const char msg[] = "SIGALRM received - operation timed out\n";
	
    timeout_occurred.store(true);
	
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    log_signal_event("SIGALRM received - operation timed out");
}

What this handler is for

SIGALRM is sent when an alarm timer expires.

So somewhere else in your code, there will be logic like:

alarm(seconds);

If the operation does not finish before that timer expires, the OS sends SIGALRM, and this handler runs.

This handler does not directly cancel the operation or stop the server. It just records:

“the timeout happened”

That matches the lab’s design: the signal handler communicates the timeout through the shared atomic flag, and the normal program logic reacts afterward.

Setting the timeout flag

timeout_occurred.store(true);

This is the core action of the handler.

You are telling the rest of the program:

a timeout has occurred
whatever operation was waiting too long should now be treated as timed out

This is exactly what the lab wants. The timeout flag is the communication bridge between the asynchronous signal handler and the regular application logic.

Why this handler is intentionally small

A signal handler should do as little as possible.

That is especially true here, because SIGALRM is just meant to notify the application that time is up.

So the good design is:

set one flag
print one short message
log the event
return immediately

Then the normal program flow can check timeout_occurred and decide what to do next.

Implement SIGCHLD handler

This handler keeps your server registry accurate when one of the child server processes dies.

Use this implementation; the registry update goes inside the while loop:

void sigchld_handler(int sig) {
    (void)sig;  // suppress unused parameter warning
	
    int status;
    pid_t pid;
    
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        child_exited++;
        
        // Update server registry
        for (auto& server : server_processes) {
            if (server.pid == pid) {
                server.active = false;
				
                std::stringstream ss;
                ss << "Child process terminated: " << server.name
                   << " (PID: " << pid << ")";
                log_signal_event(ss.str());
				
                break;
            }
        }
    }
}

What SIGCHLD means

SIGCHLD is sent to the parent process whenever one of its child processes changes state, especially when it terminates.

In your lab, the parent is the client, and the children are the three servers:

finance
file
logging

The lab explains that when a server dies, the SIGCHLD handler should update the registry so the Server Status menu can show that server as "TERMINATED".

So this handler’s job is:

find out which child died
mark the matching server as inactive
log that event

Why we use `waitpid(-1, &status, WNOHANG)`

This line is the heart of the handler:

while ((pid = waitpid(-1, &status, WNOHANG)) > 0)

Let’s break it apart.

waitpid(...)

This asks the OS whether any child process has changed state.
First argument: -1
- “check any child process”
- So you are not asking about one specific PID. You are asking: “has any of my children exited?”
Second argument: &status
- This stores information about how the child ended.
- For example:
  - exited normally
  - killed by a signal
  - stopped (reported only when WUNTRACED is also passed)
- In this specific TODO, you do not actually use status, but waitpid still needs a place to write that information.
Third argument: WNOHANG
- This is extremely important.
- WNOHANG means:
  - do not block
  - if no child has exited yet, return immediately
- That is exactly what the lab wants, because a signal handler should not get stuck waiting.

Why the while loop is needed

You might wonder: why not just call waitpid() once?

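Because one SIGCHLD signal can correspond to more than one child being ready to reap. Standard signals are not queued, so if several children exit while a SIGCHLD is already pending, the handler may run only once for all of them.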

So the loop:

while ((pid = waitpid(-1, &status, WNOHANG)) > 0)

means:

keep collecting terminated children
one by one
until there are no more ready

This is what the lab means when it says the handler should allow multiple child process status checks in a single call.
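
The effect of the loop can be checked outside a handler. The sketch below uses hypothetical helpers (reap_exited_children and spawn_then_reap are illustrative names): it creates several children that exit immediately, waits until each one has exited without reaping it (waitid() with WNOWAIT), and then runs the same WNOHANG loop once.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Same loop shape as in the handler: collect every child that has
// already exited, without ever blocking. Returns how many were reaped.
int reap_exited_children() {
    int reaped = 0;
    int status = 0;
    while (waitpid(-1, &status, WNOHANG) > 0) {
        ++reaped;
    }
    return reaped;
}

// Forks `n` children that exit at once, makes sure each has exited
// (WNOWAIT leaves it reapable), then reaps them all in one pass.
// Returns the number reaped, or -1 on failure.
int spawn_then_reap(int n) {
    for (int i = 0; i < n; ++i) {
        pid_t pid = fork();
        if (pid == -1) return -1;
        if (pid == 0) _exit(0);          // child: terminate immediately

        siginfo_t info;
        if (waitid(P_PID, static_cast<id_t>(pid), &info,
                   WEXITED | WNOWAIT) == -1) {
            return -1;
        }
    }
    return reap_exited_children();
}
```

With three children, a single pass of the loop reaps all three. A single waitpid() call in the same situation would have collected only one and left the others as zombies.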

Why `child_exited++`

This increments your atomic counter every time you successfully reap one terminated child.

So if:

one server dies → counter increases by 1
two children are reaped in the same signal handling pass → counter increases twice

This gives the program a running count of how many child processes have terminated.

Why we loop through `server_processes`

Your server_processes vector stores records of this type (defined in signals.h):

struct ServerProcess {
    pid_t pid;
    std::string name;
    bool active;
};

So each server has:

its process ID
its name
whether it is still active

When waitpid() returns a PID, you need to determine:

which entry in server_processes has that PID?

That is why you do:

for (auto& server : server_processes) {

and then compare:

if (server.pid == pid)

That is the match.

Marking the server as terminated

Once you find the matching server:

server.active = false;

This is the actual registry update.

Before termination:
- server.active == true
After termination:
- server.active == false

Then later, when print_server_status() runs, it will print:

"ACTIVE" if active is true
"TERMINATED" if active is false

So this one assignment is what makes the server-status feature work correctly.

Logging the event message

The lab says the log must use exactly this format:

Child process terminated: <server-name> (PID: <pid>)

So we build it like this:

std::stringstream ss;
ss << "Child process terminated: " << server.name
   << " (PID: " << pid << ")";
log_signal_event(ss.str());

If the finance server with PID 12345 dies, the message becomes:

Child process terminated: finance (PID: 12345)
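The formatting step can be checked in isolation. This is a small sketch; termination_message() is a hypothetical helper name, not a function from the starter code.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <sys/types.h>

// Hypothetical helper: builds the log line in the format the lab requires.
std::string termination_message(const std::string& name, pid_t pid) {
    std::ostringstream ss;
    ss << "Child process terminated: " << name
       << " (PID: " << pid << ")";
    return ss.str();
}
```

With this helper, termination_message("finance", 12345) produces the same line as the example above, so a quick assert on that string is an easy way to catch spacing or punctuation mistakes.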

Why break; is important

After you find the matching server and update it:

break;

You should stop the for loop because:

one PID should correspond to one server entry
there is no reason to keep scanning the rest of the vector

This makes the code cleaner and avoids accidental repeated work.

What happens if no server matches the PID?

In normal lab behavior, every child server should be registered with:

register_server(pid, name);

So there should usually be a match.

If there is no match:

the loop simply finishes without changing anything

That is okay.
It just means the process was not in your registry.
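The lookup, the status update, the early exit, and the no-match case can be sketched together. The struct mirrors the one shown from signals.h; mark_terminated() and demo_mark_terminated() are hypothetical names, and the PIDs are made-up values.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>

struct ServerProcess {
    pid_t pid;
    std::string name;
    bool active;
};

// Hypothetical helper: marks the entry whose PID matches as terminated.
// Returns true if an entry was found, false if the PID is not registered.
bool mark_terminated(std::vector<ServerProcess>& servers, pid_t pid) {
    for (auto& server : servers) {
        if (server.pid == pid) {
            server.active = false;
            return true;   // same effect as break: stop at the first match
        }
    }
    return false;          // no match: the registry is left unchanged
}

// Demo with made-up PIDs: one registered PID and one unknown PID.
bool demo_mark_terminated() {
    std::vector<ServerProcess> servers = {
        {100, "finance", true}, {101, "logging", true}, {102, "file", true}};
    bool found = mark_terminated(servers, 101);
    bool missing = mark_terminated(servers, 999);
    return found && !missing &&
           servers[0].active && !servers[1].active && servers[2].active;
}
```

Only the logging entry flips to inactive; the unknown PID 999 changes nothing, which matches the "that is okay" case described above.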

One important system note

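Strictly speaking, a production-grade signal handler should call only async-signal-safe functions. Things like:

std::stringstream
iterating a std::vector
log_signal_event()

are not async-signal-safe, because they can allocate memory or touch state that the interrupted code was in the middle of modifying.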

But for this lab, the starter design clearly expects you to:

update server_processes
build the required log message
call log_signal_event()

So for the assignment, this is the correct implementation style.
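For comparison only, here is a sketch of the pattern often used outside of lab settings: the handler just records that something happened, and the main loop does the real work later. This is not what the starter design asks for; minimal_sigchld_handler(), process_child_events(), and demo_deferred_handling() are hypothetical names.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

// Written by the handler, read by normal code.
volatile sig_atomic_t child_event_pending = 0;

// Async-signal-safe handler: it only sets a flag.
extern "C" void minimal_sigchld_handler(int) {
    child_event_pending = 1;
}

// Called from the main loop, where containers, string streams, and
// logging are safe to use. Returns the number of children reaped.
int process_child_events() {
    if (!child_event_pending) return 0;
    child_event_pending = 0;
    int reaped = 0;
    int status = 0;
    while (waitpid(-1, &status, WNOHANG) > 0) {
        ++reaped;   // the registry update and logging would go here
    }
    return reaped;
}

// Demo: install the handler, send SIGCHLD to ourselves, then process it.
bool demo_deferred_handling() {
    struct sigaction sa {};
    sa.sa_handler = minimal_sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, nullptr) == -1) return false;

    raise(SIGCHLD);                         // handler runs before raise() returns
    bool flagged = (child_event_pending == 1);
    process_child_events();                 // no children exist, so nothing is reaped
    return flagged && child_event_pending == 0;
}
```

The trade-off is that the registry is updated slightly later, when the main loop next checks the flag, in exchange for a handler that cannot corrupt state it interrupted.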

Implement Signal Blocking

The lab is very specific here: block_signals() should build a sigset_t, add only SIGINT, and then apply the block with sigprocmask(). It also says to log the exact message "Signals blocked for critical section".

Use this:

void block_signals() {
    sigset_t set;

    if (sigemptyset(&set) == -1) {
        perror("sigemptyset");
        return;
    }

    if (sigaddset(&set, SIGINT) == -1) {
        perror("sigaddset");
        return;
    }

    if (sigprocmask(SIG_BLOCK, &set, NULL) == -1) {
        perror("sigprocmask SIG_BLOCK");
        return;
    }

    log_signal_event("Signals blocked for critical section");
}

What signal blocking means

Normally, if the user presses Ctrl+C, your process can receive SIGINT immediately.

But sometimes the program is in a critical section, meaning:

you do not want that interruption to happen in the middle of a sensitive operation
for example, maybe you are in the middle of a transaction or updating shared state

So block_signals() temporarily says:

“for now, do not deliver SIGINT to me”

The lab explicitly says this function is for protecting critical sections from interruption, and that you should focus on blocking only SIGINT.

Why we create `sigset_t`

sigset_t set;

A sigset_t is just a set of signals.

Think of it like a container that can hold:

SIGINT
SIGALRM
SIGCHLD
etc.

In this function, we only want the set to contain SIGINT.

Why `sigemptyset(&set)` comes first

if (sigemptyset(&set) == -1) {
    perror("sigemptyset");
    return;
}

This initializes the signal set so it starts empty.

That is important because you do not want garbage or old signal values inside it.

After this line, the set contains:

nothing

Then you explicitly add the signals you want.

Why `sigaddset(&set, SIGINT)`

if (sigaddset(&set, SIGINT) == -1) {
    perror("sigaddset");
    return;
}

This adds SIGINT to the set.

So now the set contains exactly:

SIGINT

Why `sigprocmask(SIG_BLOCK, &set, NULL)`

if (sigprocmask(SIG_BLOCK, &set, NULL) == -1) {
    perror("sigprocmask SIG_BLOCK");
    return;
}

This is the system call that actually changes the process signal mask.

SIG_BLOCK
- This tells the OS:
  - add the signals in set to the currently blocked signals
&set
- This is the set containing only SIGINT
NULL
- This means:
  - we are not saving the old mask here
So after this call, SIGINT becomes blocked.
If the user presses Ctrl+C during that time, the signal is not handled immediately. Instead, it remains pending until signals are unblocked.
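To illustrate what the third argument is for, here is a sketch that does save the old mask and later restores it with SIG_SETMASK. The lab's block_signals() passes NULL and does not need this; sigint_is_blocked() and demo_save_and_restore_mask() are hypothetical names.

```cpp
#include <cassert>
#include <signal.h>

// Returns true if SIGINT is currently in the caller's blocked mask.
bool sigint_is_blocked() {
    sigset_t current;
    sigemptyset(&current);
    // With a null "set" argument, sigprocmask only reports the current mask.
    if (sigprocmask(SIG_BLOCK, nullptr, &current) == -1) return false;
    return sigismember(&current, SIGINT) == 1;
}

// Demo: block SIGINT while saving the previous mask, then put it back.
bool demo_save_and_restore_mask() {
    bool blocked_before = sigint_is_blocked();

    sigset_t set, old_mask;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    if (sigprocmask(SIG_BLOCK, &set, &old_mask) == -1) return false;
    bool blocked_inside = sigint_is_blocked();

    // SIG_SETMASK restores exactly what was in effect before the block.
    if (sigprocmask(SIG_SETMASK, &old_mask, nullptr) == -1) return false;
    return blocked_inside && sigint_is_blocked() == blocked_before;
}
```

Saving and restoring matters mostly when blocking can be nested: a plain SIG_UNBLOCK would unblock SIGINT even if an outer caller still wanted it blocked.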

Why this protects the critical section

Suppose the program is doing something delicate, like:

starting a transaction
writing related pieces of state
sending/receiving operation data

If SIGINT were delivered right in the middle, your graceful shutdown could begin at an awkward moment.

By blocking SIGINT during that short window, you reduce the chance of leaving the operation in an inconsistent state.

That is exactly why the lab says this mechanism should be used around critical sections and only for short durations.

Why we check for -1 and use `perror()`

The lab says that whenever system calls require error checking, you should test for -1 and use perror().

That is why each step is guarded:

sigemptyset(...)
sigaddset(...)
sigprocmask(...)

If one fails, perror() prints the OS error reason, and return; exits the function early.

Implement Signal Unblocking

Use the mirror image of block_signals():

void unblock_signals() {
    sigset_t set;

    if (sigemptyset(&set) == -1) {
        perror("sigemptyset");
        return;
    }

    if (sigaddset(&set, SIGINT) == -1) {
        perror("sigaddset");
        return;
    }

    if (sigprocmask(SIG_UNBLOCK, &set, NULL) == -1) {
        perror("sigprocmask SIG_UNBLOCK");
        return;
    }

    log_signal_event("Signals unblocked");
}

What each part is doing:

sigset_t set;
- creates the signal set we will use.
sigemptyset(&set)
- starts with an empty set.
sigaddset(&set, SIGINT)
- adds only SIGINT, because that is the only signal this lab wants blocked/unblocked.
sigprocmask(SIG_UNBLOCK, &set, NULL)
- tells the OS to remove SIGINT from the blocked signal mask, so Ctrl+C can be delivered normally again.
log_signal_event("Signals unblocked");
- records the exact message the lab requires.

Why this matters:

block_signals() temporarily protects a critical section.
unblock_signals() restores responsiveness afterward.
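If SIGINT arrived while it was blocked, it stays pending and is delivered as soon as it is unblocked. Standard signals are not queued, so several Ctrl+C presses during the blocked window collapse into a single delivery.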
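The pending behavior can be observed directly. The sketch below uses raise(SIGINT) to stand in for Ctrl+C and a throwaway handler; count_sigint() and demo_pending_sigint() are hypothetical names, not part of the lab code.

```cpp
#include <cassert>
#include <signal.h>

volatile sig_atomic_t sigint_seen = 0;

extern "C" void count_sigint(int) { sigint_seen = sigint_seen + 1; }

// Returns true if SIGINT stayed pending while blocked and the handler
// then ran exactly once after unblocking.
bool demo_pending_sigint() {
    struct sigaction sa {}, old_sa {};
    sa.sa_handler = count_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, &old_sa) == -1) return false;

    sigset_t set, old_mask, pending;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigprocmask(SIG_BLOCK, &set, &old_mask);

    sigint_seen = 0;
    raise(SIGINT);                     // simulate Ctrl+C twice inside
    raise(SIGINT);                     // the critical section
    sigpending(&pending);
    bool was_pending = sigismember(&pending, SIGINT) == 1;
    bool ran_while_blocked = (sigint_seen != 0);

    sigprocmask(SIG_UNBLOCK, &set, nullptr);   // pending SIGINT is delivered here
    int handled = sigint_seen;

    // Put the previous handler and mask back.
    sigaction(SIGINT, &old_sa, nullptr);
    sigprocmask(SIG_SETMASK, &old_mask, nullptr);
    return was_pending && !ran_while_blocked && handled == 1;
}
```

The two raise() calls produce only one handler run, which is why a user who presses Ctrl+C repeatedly during a transaction still triggers a single graceful-shutdown request afterward.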

Implement Timeout Mechanism

For this TODO, the implementation is very small. You just need to clear the timeout flag and start the alarm.

Use this:

bool wait_with_timeout(int seconds) {
    timeout_occurred = false;
	
    alarm(seconds);
	
    return true;
}

What this is doing

Reset the timeout flag

timeout_occurred = false;

This makes sure that before starting a new timed wait, the program does not still think an old timeout happened.
That is important because otherwise:
- a previous SIGALRM could have set the flag to true
- and your new operation would incorrectly look like it already timed out
So every new timed operation should begin with:
- timeout_occurred = false

Start the timer

alarm(seconds);

This tells the OS:
- “send me a SIGALRM after seconds seconds”
So if seconds is 5, then after 5 seconds the process receives SIGALRM, and your already-implemented sigalrm_handler() will run.
That handler then:
- sets timeout_occurred = true
- prints the timeout message
- logs the timeout event
So wait_with_timeout() and sigalrm_handler() work together:
- wait_with_timeout() starts the timer
- sigalrm_handler() reacts when the timer expires

Why it still returns true

return true;

Right now, this function is not actually waiting for anything itself.
It is only:
- resetting the flag
- arming the alarm
So it returns true because the timeout mechanism was successfully started.
In other words, this function does not mean:
- “the operation finished successfully”
It only means:
- “the timer was set”

Why this is called a timeout mechanism

Without a timeout, an operation might:

block forever
wait forever on I/O
hang if something goes wrong

By setting an alarm, you create a limit:

if the operation finishes in time, great
if not, SIGALRM fires and your program knows it took too long
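As an illustration of how an alarm can cut a blocked operation short, the sketch below blocks in read() on a pipe that nobody writes to. It assumes the handler is installed without SA_RESTART so that read() fails with EINTR; how the lab's RequestChannel actually waits is not shown here, and on_alarm() and demo_read_with_timeout() are hypothetical names.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <unistd.h>

volatile sig_atomic_t timed_out = 0;

extern "C" void on_alarm(int) { timed_out = 1; }

// Returns true if a read() with no data available was interrupted by SIGALRM.
bool demo_read_with_timeout(unsigned seconds) {
    struct sigaction sa {}, old_sa {};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                   // no SA_RESTART: read() returns EINTR
    if (sigaction(SIGALRM, &sa, &old_sa) == -1) return false;

    int fds[2];
    if (pipe(fds) == -1) return false;

    timed_out = 0;                     // reset the flag, as wait_with_timeout() does
    alarm(seconds);                    // arm the timer
    char byte;
    ssize_t n = read(fds[0], &byte, 1);   // nobody writes, so this blocks
    int saved_errno = errno;
    alarm(0);                          // cancel in case read() returned early

    close(fds[0]);
    close(fds[1]);
    sigaction(SIGALRM, &old_sa, nullptr);
    return n == -1 && saved_errno == EINTR && timed_out == 1;
}
```

Calling demo_read_with_timeout(1) should return after about one second instead of hanging forever, with the flag set by the handler.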

Important note

alarm(seconds) sets a process-wide timer:

if there was already an old alarm running, this call replaces it

And if later you want to cancel the alarm after the operation completes successfully, that is usually done with:

alarm(0);

That cancels any pending alarm.

So in practice the pattern is often:

wait_with_timeout(5);
/* do operation */
alarm(0);   // cancel timer if operation finished in time

Implement Timeout Cancellation

You just need to cancel any existing alarm timer.

Use this:

void cancel_timeout() {
    alarm(0);
}

How `alarm()` works

The timer keeps running until either:
- it expires → SIGALRM is delivered
- or you cancel it

How to cancel an alarm

The POSIX rule is:

alarm(0);

means:

cancel any pending alarm
do not schedule a new one

So if an operation finishes successfully before the timeout:

you must cancel the alarm
otherwise the signal will still fire later and falsely mark a timeout

This is exactly what the lab description is referring to when it says this prevents unnecessary signal generation.
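A small sketch of the cancel step: alarm() returns the number of seconds that were left on the previous timer, or 0 if none was pending. demo_cancel_alarm() is a hypothetical name used only here.

```cpp
#include <cassert>
#include <unistd.h>

// Arms a 5-second timer and cancels it immediately.
bool demo_cancel_alarm() {
    alarm(0);                        // start from a clean state
    alarm(5);                        // schedule SIGALRM in 5 seconds
    unsigned remaining = alarm(0);   // cancel; returns the seconds that were left
    unsigned after = alarm(0);       // nothing pending any more, so this returns 0
    return remaining >= 1 && remaining <= 5 && after == 0;
}
```

Because the timer is cancelled before it expires, no SIGALRM is ever generated and timeout_occurred would stay false.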

In client.cpp

Register finance/logging/file server with signal handler

Add this right after the if (pid == 0) { ... } block:

SignalHandling::register_server(pid, "finance");

So it becomes:

// Start finance server
pid_t pid = fork();
if (pid < 0) {
    perror("Fork failed");
    exit(1);
}
if (pid == 0) { // Child process
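    // Keep the string alive until execvp(); calling c_str() on a temporary would leave a dangling pointer.
    std::string max_account_arg = to_string(max_account);
    char* args[] = {(char*)"./finance", (char*)"-m", (char*)max_account_arg.c_str(), nullptr};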
    execvp(args[0], args);
    perror("Execvp failed");
    exit(1);
}

// Register finance server with signal handler
SignalHandling::register_server(pid, "finance");

Why this is needed

The lab says that in client.cpp you must register all servers with the signal handlers after they are created using fork() and exec().

Also, the signal system uses a server_processes vector as a registry of the server PIDs so that when SIGCHLD happens, it can determine which specific server died and update its status.

So this one line is what connects:

the child process you just started
to the signal-handling system you already built

What `register_server(pid, "finance")` does

From signals.cpp, register_server does this:

pushes a new entry into server_processes
stores:
- the child PID
- the server name
- active = true

Conceptually, it adds something like:

{ pid_of_finance_server, "finance", true }

That means the program now knows:

this PID belongs to the finance server
this server is currently active

Why it must happen in the parent, not the child

Notice that you place this line after the child if (pid == 0) block.

That means only the parent executes it.

That is correct, because:

the parent is the one maintaining the server registry
the parent is the one that receives SIGCHLD
the parent is the one that later prints server status

The child should not register itself in the parent's registry.

Why we use pid

After fork():

in the child, pid == 0
in the parent, pid is the actual child process ID

So when execution reaches this line in the parent:

SignalHandling::register_server(pid, "finance");

pid is the real PID of the finance server process.
That is exactly what you want stored in the registry.
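The parent-only registration can be sketched with stand-ins. demo_registry and demo_register_server() are hypothetical substitutes for the real registry and SignalHandling::register_server() in signals.cpp, and the child simply exits instead of calling execvp().

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct ServerProcess {
    pid_t pid;
    std::string name;
    bool active;
};

// Stand-in registry; the real one lives in signals.cpp.
std::vector<ServerProcess> demo_registry;

void demo_register_server(pid_t pid, const std::string& name) {
    demo_registry.push_back({pid, name, true});
}

bool demo_fork_and_register() {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        _exit(0);   // child: a real server would call execvp() here
    }

    // Only the parent reaches this line; pid is the child's real process ID.
    demo_register_server(pid, "finance");

    int status = 0;
    waitpid(pid, &status, 0);   // clean up the demo child

    const ServerProcess& entry = demo_registry.back();
    return entry.pid == pid && entry.pid > 0 && entry.active;
}
```

The child never touches the registry because it leaves through _exit() (or is replaced by execvp() in the real client) before the registration line.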

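The next two TODOs, for the logging and file servers, follow the same pattern. Pass whichever variable holds the result of that server's fork() call; the variable names below are guesses and may differ in your starter code: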

SignalHandling::register_server(pid, "logging");
SignalHandling::register_server(file_pid, "file");

Implement signal blocking before transaction

The idea is:

before starting the login transaction, temporarily block SIGINT
perform the transaction
once the transaction is done, unblock SIGINT

So the code should become:

    SignalHandling::block_signals();
	
    retry_operation("login", login_operation);
	
    SignalHandling::unblock_signals();

What this is trying to protect

The lab wants signal blocking around a critical section.

Here, the critical section is the actual login transaction:

retry_operation("login", login_operation);

Why is this sensitive?

Because during login, the program is in the middle of:
- building a request
- sending it to the logging server
- waiting for a response
- updating current_user

If SIGINT arrives right in the middle, the transaction could be interrupted at an awkward time.

So the purpose of blocking is:

do not let Ctrl+C interrupt the operation halfway through

Implement signal blocking/unblocking before transaction

In Deposit, the code should become:

    SignalHandling::block_signals();
	
    retry_operation("deposit", deposit_operation);
	
    SignalHandling::unblock_signals();

In Withdraw, the code should be:

    SignalHandling::block_signals();
	
    retry_operation("withdrawal", withdraw_operation);
	
    SignalHandling::unblock_signals();

In View Balance, the code becomes:

    SignalHandling::block_signals();
	
    retry_operation("balance check", balance_operation);
	
    SignalHandling::unblock_signals();

In Upload File, the code becomes:

    SignalHandling::block_signals();
	
    retry_operation("file upload", upload_operation);
	
    SignalHandling::unblock_signals();

In Download File, the code becomes:

    SignalHandling::block_signals();
	
    retry_operation("file download", download_operation);
	
    SignalHandling::unblock_signals();

In Logout, the code becomes:

    SignalHandling::block_signals();
	
    retry_operation("logout", logout_operation);
	
    SignalHandling::unblock_signals();
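Since the same block/run/unblock sequence is repeated for every menu option, one optional way to avoid forgetting the unblock call is a small scope guard. This is only a sketch under assumptions: demo_block_signals() and demo_unblock_signals() are stand-ins that record call order instead of calling sigprocmask(), BlockScope and run_blocked() are hypothetical names, and whether retry_operation can throw is not known from the lab text.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-ins for SignalHandling::block_signals() / unblock_signals();
// they only record the order in which they were called.
std::vector<std::string> call_log;
void demo_block_signals()   { call_log.push_back("block"); }
void demo_unblock_signals() { call_log.push_back("unblock"); }

// Guard whose destructor always unblocks, even if the operation throws.
struct BlockScope {
    BlockScope()  { demo_block_signals(); }
    ~BlockScope() { demo_unblock_signals(); }
};

// Runs op inside the blocked window and returns its result.
bool run_blocked(const std::function<bool()>& op) {
    BlockScope scope;
    return op();
}

bool demo_run_blocked() {
    call_log.clear();
    bool ok = run_blocked([] { call_log.push_back("op"); return true; });
    try {
        run_blocked([]() -> bool { throw std::runtime_error("server error"); });
    } catch (const std::runtime_error&) {
        call_log.push_back("caught");
    }
    const std::vector<std::string> expected = {
        "block", "op", "unblock", "block", "unblock", "caught"};
    return ok && call_log == expected;
}
```

For the graded submission, the explicit block_signals() / unblock_signals() calls shown above are still what the lab asks for; the guard only shows how the pairing could be enforced structurally.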
