PA 2 - FAT Filesystem

Class: CSCE-313


Notes:

Instructions

Introduction & Learning Objectives

In this PA, you will build a user-space library (libfs) that implements a simplified FAT (File Allocation Table) file system. Instead of interacting with a physical hard drive, your file system will operate on a virtual disk - a standard file on your Linux environment that has been artificially divided into fixed-size 4096-byte blocks.

By the end of this assignment, you will understand:

The Virtual Disk Architecture

Our file system operates entirely on a virtual disk where every block is exactly 4096 bytes (BLOCK_SIZE). You cannot read or write 10 bytes to the disk; you must read the entire 4096-byte block into memory, modify the specific bytes you care about, and write the entire 4096-byte block back. This is handled via the provided disk.c driver.

The disk is divided into four strict regions, always in this exact order:

  1. Superblock: Always exactly 1 block (Block 0).
  2. FAT Array: Can span 1 or more blocks, depending on the total disk size.
  3. Root Directory: Always exactly 1 block.
  4. Data Region: The remaining blocks, used strictly to hold the actual contents of your files.

The Core Data Structures

The Superblock (Block 0)

The Superblock is the "master record" of the file system. When you mount the disk, this is the very first block you read. It contains the geometry of the entire disk.

typedef struct __attribute__((__packed__)) Superblock {  
    uint8_t  sig[8];           // Must be exactly "CSCE-313"  
    uint16_t numBlocks;        // Total blocks on the virtual disk  
    uint16_t rootIndex;        // Block index of the Root Directory  
    uint16_t dataIndex;        // Block index of the Data Region  
    uint16_t numDataBlocks;    // Total number of data blocks  
    uint8_t  numFAT;           // Number of blocks the FAT occupies  
    uint8_t  padding[4079];    // Pads the struct to exactly 4096 bytes  
} Superblock;

Note: The __attribute__((__packed__)) directive tells the GCC compiler not to add arbitrary padding between variables, ensuring our struct maps perfectly to the raw disk bytes.

The File Allocation Table (FAT)

When a file spans multiple blocks, the FAT functions as a linked-list data structure that maps the logical sequence of data to its physical locations on the disk.

The Root Directory

The Root Directory is a flat catalog of all files on the disk. Our file system has a strict limit of 128 maximum files. A 32-byte entry represents each file.

typedef struct __attribute__((__packed__)) RootEntry {
    uint8_t  name[16];         // File name (null-terminated if < 16 chars)
    uint32_t size;             // File size in bytes
    uint16_t indexFirstBlock;  // Index of the first data block (or 0xFFFF if empty)
    uint8_t  padding[10];      // Pads the struct to exactly 32 bytes
} RootEntry;

Because 128 x 32 bytes = 4096 bytes, the Root Directory fits perfectly into a single block.

The File Descriptor Table

To allow user programs to read and write to files, the OS must keep track of active "sessions." If a user opens a file and reads the first 100 bytes, the file system must remember that the cursor is at byte 100.

You must declare a global array of 32 File Descriptors in your fs.c file: FileDescriptor fd_table[32];

You should define the FileDescriptor struct like this:

typedef struct {
  int is_used;       // 1 if this FD is currently open, 0 if it is free
  int root_index;    // The index of this file in your Root Directory array
  size_t offset;     // The current read/write cursor position in bytes
} FileDescriptor;

Why root_index? Do not store the file's name in this struct! By storing the index (0 to 127) of the file's entry in your Root Directory array, fs_read and fs_write can instantly access the file's size and starting FAT block without having to do a slow strcmp search every time the user wants to read a byte.

image.png

FAT Chain Example

image-1.png383x639

Note: This file system does not support folders or sub-directories. All files are stored in the Root Directory.

Codebase Organization

The repository is strictly organized to separate your implementation from the testing tools.

📁 CSCE-313-FAT-Lab/
├── 📄 Makefile                 # Top-level Makefile to build everything
├── 📄 autograder.sh            # The final grading script
├── 📁 src/                     # Core file system library
│   ├── 📄 disk.c / disk.h      # Provided block I/O driver (DO NOT MODIFY)
│   ├── 📄 fs.c                 # YOUR IMPLEMENTATION GOES HERE
│   ├── 📄 fs.h                 # API headers (DO NOT MODIFY)
│   └── 📄 Makefile             # Compiles your code into libfs.a
├── 📁 utils/                   # Standalone command-line tools
│   ├── 📄 fs_cli.c             # Source code for the interactive shell
│   ├── 📄 mkfs.x               # Pre-compiled virtual disk formatter
│   ├── 📄 ref_fs.x             # Pre-compiled TA reference implementation
│   └── 📄 Makefile             
└── 📁 tests/                   # C-level Integration Tests
    ├── 📄 test_integration.c   # Tests FD limits and state
    ├── 📄 test_file_io.c       # Tests read/write offset logic
    └── 📄 Makefile

What You Must Implement (The API)

You are required to implement the entire file system API inside src/fs.c. You should tackle this in four distinct steps. Do not modify src/fs.h.

1: Mounting and Unmounting

2: Directory Operations

3: File Descriptors

You must maintain the global fd_table array to track the state of up to 32 open files.

4: Reading and Writing (The Bounce Buffer)

This is the most complex phase. You must translate byte-level requests into block-level disk I/O.

Edge Cases & Expected Error Handling

Your file system should work reliably against these specific edge cases. Unless otherwise specified, your functions should return -1 to indicate an error.

1. Invalid File Descriptors

2. File Creation Limits (fs_create)

3. Open File Limits (fs_open)

4. Reading Past EOF (fs_read)

5. Out of Space / Disk Full (fs_write)

How to Compile and Run Your Code

Build, compile, and test your code iteratively using the provided tools.

1. Compiling the Project

You can build specific parts of the project, or everything at once.

2. Formatting a Virtual Disk

Before your file system can do anything, you need a disk to operate on. Use the provided mkfs.x utility to create an empty virtual disk file.

# Creates a virtual disk named 'disk.fs' with 100 total blocks  
./utils/mkfs.x disk.fs 100

3. Interactive Testing (Command-Line Interface)

Once you have implemented the code, you can use fs_cli.x to manually interact with your virtual disk and test your logic.

# Print the disk info (Tests fs_mount and fs_info)
./utils/fs_cli.x info disk.fs

# Create a file inside the virtual disk (Tests fs_create)
./utils/fs_cli.x add disk.fs my_test_file.txt

# List all files on the virtual disk (Tests fs_ls)
./utils/fs_cli.x ls disk.fs

4. Running the C-Level Tests

Once you complete the implementation, you must test your internal state machine (File Descriptors) and offset math.

# Make sure you have a fresh disk
./utils/mkfs.x disk.fs 100

# Run the integration test (Tests fs_open, fs_close, fd limits)
./tests/test_integration.x

# Run the File I/O test (Tests fs_read, fs_write offsets)
./tests/test_file_io.x

5. The Autograder

When you believe your entire implementation is complete and robust, run the grading script from the root directory. This script stresses your file system by creating 128 files, writing large text files that cross block boundaries, and forcing disk-full edge cases.

6. Understanding the Provided Tests

Use the testing suite to verify specific parts of your architecture before running the final autograder:

Key Systems Concepts to Understand

Let’s understand a few core quirks of how C and Operating Systems handle memory and hardware.

1. Struct Padding and __attribute__((__packed__))

By default, the C compiler optimizes your code for the CPU, not for the hard drive. CPUs prefer to read memory in even chunks (like 4 or 8 bytes at a time, known as "word alignment"). If you create a struct with a 1-byte char followed by a 4-byte int, the compiler will secretly insert 3 bytes of invisible "padding" between them so the int starts on an even memory boundary.

Why this is dangerous: In this lab, your Superblock and RootEntry structs must map flawlessly to the raw bytes on the virtual disk. If the compiler injects secret padding, your file sizes and block indices will read garbage data from the disk.

The Solution: We use __attribute__((__packed__)) at the end of our struct definitions. This is a directive that orders the GCC compiler to aggressively pack the variables together exactly as written, byte-for-byte, disabling all hardware-optimized padding.

2. The "Bounce Buffer" (Block I/O vs. Byte I/O)

The fundamental challenge of writing a file system is bridging the gap between what the user wants and what the hardware can actually do.

To solve this, you must implement a Bounce Buffer. This is a temporary 4,096-byte array you allocate inside your fs_read and fs_write functions.

3. Pointer Arithmetic vs. Byte Math

When reading data into memory, you will frequently need to calculate memory offsets. You must be extremely careful with pointer types in C.

If you have an array representing your FAT: uint16_t *fat_array = malloc(FAT_SIZE);

And you want to jump 4,096 bytes forward in memory to read the second block of the FAT, you might be tempted to do this: fat_array + 4096

The Trap: In C, adding a number to a pointer advances it by elements, not by bytes. Because fat_array points to 16-bit integers (2 bytes each), adding 4096 actually advances the memory address by 8,192 bytes, skipping a whole block and causing a segmentation fault!

The Fix: Whenever you are doing raw byte math to find offsets, always cast your pointer to a 1-byte type (like uint8_t * or char *) first: (uint8_t *)fat_array + 4096

4. File Descriptors and "State"

A file on a disk is completely static; it has no idea if anyone is reading it. But an Operating System needs to know exactly where a user left off reading so it can give them the next chunk of text.

A File Descriptor (FD) is how the OS tracks this active "state." It is simply an index to an entry in a table maintained by your library. For every open file, your library must track:

  1. Which file is open (its index in the Root Directory).
  2. The offset: A cursor tracking exactly how many bytes into the file the user has currently read or written.

When a user calls fs_read(fd, buf, 10), you look at your FD table, see that their cursor is at byte 50, fetch bytes 50-59, and then update their cursor to 60. When they call fs_close(fd), you simply mark that table entry as available again.

Implementation

1. Mounting and Unmounting

The key idea is:

Your assignment says the disk layout is always:

  1. block 0 = superblock
  2. blocks 1 .. numFAT = FAT
  3. block rootIndex = root directory
  4. data region after that

Also, the superblock must contain the exact 8-byte signature "CSCE-313", the FAT is an array of uint16_t, and the root directory fits in exactly one block because 128 entries × 32 bytes = 4096 bytes.

What each function should do

fs_mount(const char *diskname)

fs_umount(void)

fs_info(void)

Correct implementation

int fs_mount(const char *diskname) {
    if (diskname == NULL) {
        return -malo
    }

    /* Do not allow mounting twice */
    if (sblk != NULL) {
        return -1;
    }

    /* Open the virtual disk */
    if (block_disk_open(diskname) < 0) {
        return -1;
    }

    /* Allocate and read superblock */
    sblk = malloc(sizeof(Superblock));
    if (sblk == NULL) {
        block_disk_close();
        return -1;
    }

    if (block_read(0, sblk) < 0) {
        free(sblk);
        sblk = NULL;
        block_disk_close();
        return -1;
    }

    /* Validate signature: exactly 8 bytes, no null terminator required */
    if (memcmp(sblk->sig, "CSCE-313", 8) != 0) {
        free(sblk);
        sblk = NULL;
        block_disk_close();
        return -1;
    }

    /* Allocate FAT memory: numFAT blocks, each BLOCK_SIZE bytes */
    fat = malloc(sblk->numFAT * BLOCK_SIZE);
    if (fat == NULL) {
        free(sblk);
        sblk = NULL;
        block_disk_close();
        return -1;
    }

    /* Read FAT blocks from disk: FAT starts at block 1 */
    for (int i = 0; i < sblk->numFAT; i++) {
        if (block_read(1 + i, (uint8_t *)fat + (i * BLOCK_SIZE)) < 0) {
            free(fat);
            fat = NULL;
            free(sblk);
            sblk = NULL;
            block_disk_close();
            return -1;
        }
    }

    /* Read root directory block */
    if (block_read(sblk->rootIndex, root) < 0) {
        free(fat);
        fat = NULL;
        free(sblk);
        sblk = NULL;
        block_disk_close();
        return -1;
    }

    /* Initialize file descriptor table */
    for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
        filedes[i].id = -1;
        filedes[i].offset = 0;
        filedes[i].rootIndex = -1;
    }

    numFilesOpen = 0;
    next_id = 0;

    return 0;
}

int fs_umount(void) {
    if (sblk == NULL) {
        return -1;
    }

    /* Cannot unmount if files are still open */
    if (numFilesOpen > 0) {
        return -1;
    }

    /* Write superblock back */
    if (block_write(0, sblk) < 0) {
        return -1;
    }

    /* Write FAT blocks back */
    for (int i = 0; i < sblk->numFAT; i++) {
        if (block_write(1 + i, (uint8_t *)fat + (i * BLOCK_SIZE)) < 0) {
            return -1;
        }
    }

    /* Write root directory back */
    if (block_write(sblk->rootIndex, root) < 0) {
        return -1;
    }

    /* Free memory */
    free(fat);
    fat = NULL;

    free(sblk);
    sblk = NULL;

    /* Close disk */
    if (block_disk_close() < 0) {
        return -1;
    }

    return 0;
}

int fs_info(void) {
    if (sblk == NULL) {
        return -1;
    }

    int fat_free = 0;
    int root_free = 0;

    /* Count free FAT entries */
    for (int i = 0; i < sblk->numDataBlocks; i++) {
        if (fat[i] == get_fat_free()) {
            fat_free++;
        }
    }

    /* Count free root directory entries */
    for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
        if (root[i].name[0] == '\0') {
            root_free++;
        }
    }

    printf("FS Info:\n");
    printf("total_blk_count=%u\n", sblk->numBlocks);
    printf("fat_blk_count=%u\n", sblk->numFAT);
    printf("rdir_blk=%u\n", sblk->rootIndex);
    printf("data_blk=%u\n", sblk->dataIndex);
    printf("data_blk_count=%u\n", sblk->numDataBlocks);
    printf("fat_free_ratio=%d/%u\n", fat_free, sblk->numDataBlocks);
    printf("rdir_free_ratio=%d/%d\n", root_free, FS_FILE_MAX_COUNT);

    return 0;
}

fs_mount questions

1. Why do we use sblk != NULL to check if mounted?

Because sblk is a global pointer:

static Superblock* sblk = NULL;

Before mounting, it is NULL.

After successful mounting, it points to allocated memory.

So this:

if (sblk != NULL) return -1;

means:

“A filesystem is already mounted, so do not mount another one on top of it.”

That prevents inconsistent state.

2. Why do we malloc(sizeof(Superblock))?

Because the superblock lives on disk, but after mounting we want a copy in RAM.

So:

sblk = malloc(sizeof(Superblock));
block_read(0, sblk);

means:

This matches the assignment’s description that block 0 is always the superblock.

3. Why use memcmp(sblk->sig, "CSCE-313", 8)?

Because sig is:

uint8_t sig[8];

That is exactly 8 bytes.

And "CSCE-313" is also 8 characters.

Important detail:

memcmp(..., 8) compares exactly 8 raw bytes, which is what we want.

4. Why is FAT allocation sblk->numFAT * BLOCK_SIZE?

The FAT occupies multiple whole disk blocks, and each block is 4096 bytes. The assignment explicitly says the FAT can span 1 or more blocks, and the disk only works in whole blocks. 

So if numFAT == 2, then FAT size in bytes is:

2 * 4096

That is why we allocate:

fat = malloc(sblk->numFAT * BLOCK_SIZE);

5. Why cast to (uint8_t *)fat when reading FAT blocks?

This is one of the most important C details in the whole assignment.

You are reading the FAT one disk block at a time:

block_read(1 + i, (uint8_t *)fat + (i * BLOCK_SIZE))

Why not just do fat + i * BLOCK_SIZE?

Because fat is a uint16_t *.

In C, pointer arithmetic moves by elements, not bytes.

So:

fat + 1

moves forward by sizeof(uint16_t) bytes, which is 2 bytes.

But here we want to move by exact byte offsets, like 4096 bytes at a time. The assignment itself warns about this pointer-arithmetic trap and says to cast to a 1-byte type first. 

So:

(uint8_t *)fat + (i * BLOCK_SIZE)

means:

“Treat FAT as raw bytes, then move forward exactly i * 4096 bytes.”

6. Why do we read root directly into root?

Because your code already has:

static RootEntry root[FS_FILE_MAX_COUNT];

And the assignment says the root directory is exactly one block. Since each entry is 32 bytes and there are 128 entries, that is exactly 4096 bytes total.

So this works perfectly:

block_read(sblk->rootIndex, root);

We are reading one block straight into the array.

7. Why initialize all file descriptors during mount?

Because open-file state should start clean each time you mount.

In your starter code, a descriptor is unused when:

filedes[i].id == -1

So we set all of them to unused.

Also:

numFilesOpen = 0;
next_id = 0;

resets the session bookkeeping.

fs_umount questions

8. Why check numFilesOpen > 0 in fs_umount?

Because unmounting while files are still open would mean:

That creates invalid state.

Your assignment explicitly says to ensure no files are open before unmounting.

9. Why write metadata back in fs_umount?

Because while the disk is mounted, you will modify:

Those changes live in memory until you save them.

So unmount is the “flush everything back to disk” step.

fs_info questions

10. Why count FAT free entries only up to numDataBlocks?

Because FAT entries correspond to the data region only, not all disk blocks. The assignment is very explicit about that. 

So the valid FAT entries are:

fat[0] ... fat[numDataBlocks - 1]

Not the entire disk block count.

11. Count free FAT entries

for (int i = 0; i < sblk->numDataBlocks; i++) {
    if (fat[i] == get_fat_free()) {
        fat_free_count++;
    }
}

This loops through the FAT entries that actually correspond to data blocks.

A FAT entry is free if it contains:

0x0000

which your helper returns through:

get_fat_free()

So we are literally counting:

“How many data blocks are currently unused?”

12. Count fee root entries

for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
    if (root[i].name[0] == '\0') {
        root_free_count++;
    }
}

Each root directory entry represents one possible file.

If the first character of the filename is '\0', that means the entry is empty.

So this counts:

“How many file slots are still available in the root directory?”

Since the root directory has room for 128 files, this loop checks all 128 entries.

Why root[i].name[0] == '\0' means empty:
A filename is stored in:

uint8_t name[FS_FILENAME_LEN];

If a root entry is unused, its name should be all zeros, or at least start with zero.

So checking only the first byte is enough to tell whether the entry is empty.

That is a very common pattern in C strings.

2. Directory Operations

fs_create implementation

Creating a file here does not allocate any data block yet.

That is a really important point.

When you create an empty file:

So the first block is set to the FAT end-of-chain marker:

0xFFFF

This means:

“There is currently no first block allocated.”

What checks should happen first

Before writing anything into the root directory, fs_create should verify:

  1. a filesystem is mounted
  2. filename is not NULL
  3. filename is not empty
  4. filename fits in FS_FILENAME_LEN (16 total, so max 15 visible characters + '\0')
  5. the file does not already exist
  6. there is at least one empty root directory slot

The assignment explicitly requires duplicate-name rejection, root-full rejection, and filename-length rejection. 

Correct implementation:

int fs_create(const char *filename) {
    if (sblk == NULL || filename == NULL) {
        return -1;
    }

    /* Reject empty filename */
    if (filename[0] == '\0') {
        return -1;
    }

    /* Max allowed: 15 chars + null terminator */
    if (strlen(filename) >= FS_FILENAME_LEN) {
        return -1;
    }

    int free_index = -1;

    for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
        /* Track first empty root slot */
        if (root[i].name[0] == '\0') {
            if (free_index == -1) {
                free_index = i;
            }
        }
        /* Reject duplicate filename */
        else if (strcmp((char *)root[i].name, filename) == 0) {
            return -1;
        }
    }

    /* No empty slot found => root directory full */
    if (free_index == -1) {
        return -1;
    }

    /* Initialize the new root entry */
    memset(&root[free_index], 0, sizeof(RootEntry));
    strcpy((char *)root[free_index].name, filename);
    root[free_index].size = 0;
    root[free_index].indexFirstBlock = get_fat_eoc();

    return 0;
}

1. Mounted check

if (sblk == NULL || filename == NULL) {
    return -1;
}

If the filesystem is not mounted, then root[] is not meaningful as active filesystem state.

Also, if filename == NULL, you cannot safely call strlen or strcmp on it.

2. Empty filename check

if (filename[0] == '\0') {
    return -1;
}

A zero-length file name should be rejected.

Remember empty names would conflict with how you detect unused root entries:

root[i].name[0] == '\0'

So allowing "" as a real file name would break your directory logic.

3. Length check

if (strlen(filename) >= FS_FILENAME_LEN) {
    return -1;
}

Your header defines:

#define FS_FILENAME_LEN 16

That means the array can hold at most:

So length 16 or more is too long.

That matches the assignment’s rule: filenames longer than 15 characters plus null terminator must be rejected.

4. One scan handles both duplicate detection and free-slot detection

This part is efficient and clean:

int free_index = -1;

for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
    if (root[i].name[0] == '\0') {
        if (free_index == -1) {
            free_index = i;
        }
    }
    else if (strcmp((char *)root[i].name, filename) == 0) {
        return -1;
    }
}

During the loop:

At the end:

This satisfies both assignment requirements: no duplicates, no creation when the root directory is full.

5. Why root[i].name[0] == '\0' means “empty slot”

Because unused root entries are effectively blank.

A filename stored in a C-style char array starts with '\0' if it is empty.

So this is the standard way to say:

“This root directory entry is unused.”

6. Why memset first is a good idea

memset(&root[free_index], 0, sizeof(RootEntry));

This clears the entire entry before filling in fields.

That means:

Then you overwrite the fields that matter:

strcpy((char *)root[free_index].name, filename);
root[free_index].size = 0;
root[free_index].indexFirstBlock = get_fat_eoc();

Using memset is not strictly required, but it is very safe and clean.

7. Why indexFirstBlock = get_fat_eoc()

This is one of the most important ideas in fs_create.

A newly created file has no data blocks yet.

So its “first block” is not a real FAT index.

Instead, you mark it as:

0xFFFF

which is your end-of-chain marker:

get_fat_eoc()

Later, when fs_write writes the first byte into an empty file, that is when you will allocate the first FAT block.

Why we do not call find_free_fat() here

This is a very common confusion.

You might think:

“I am creating a file, so shouldn’t I allocate a block now?”

But for this filesystem, no.

Creating the file only creates its directory entry.

The file is still empty, so it does not need any data block yet.

You only allocate a FAT block when the file actually needs to store data, usually during fs_write.

So fs_create does not touch FAT at all.

Common mistake

Mistake 1: using > instead of >= in length check

Wrong:

if (strlen(filename) > FS_FILENAME_LEN)

That would incorrectly allow a 16-character name, which leaves no space for the null terminator.

Correct:

if (strlen(filename) >= FS_FILENAME_LEN)

fs_delete implement

Big picture first

When a file exists, its root directory entry stores:

If the file has data, the FAT works like a linked list:

first block -> next block -> next block -> ... -> 0xFFFF

Deleting the file means:

  1. find the first block from the root entry
  2. follow the chain in the FAT
  3. free each FAT entry by setting it to 0x0000
  4. erase the root directory entry so the file no longer exists

One very important edge case

A file may be empty.

For an empty file:

indexFirstBlock == 0xFFFF

That means:

This is because fs_create initializes empty files with indexFirstBlock = 0xFFFF.

Implementation

Correct implementation:

int fs_delete(const char *filename) {
    if (sblk == NULL || filename == NULL) {
        return -1;
    }

    int root_index = -1;

    /* Find the file in the root directory */
    for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
        if (root[i].name[0] != '\0' &&
            strcmp((char *)root[i].name, filename) == 0) {
            root_index = i;
            break;
        }
    }

    /* File not found */
    if (root_index == -1) {
        return -1;
    }

    /* Make sure the file is not currently open */
    for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
        if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
            return -1;
        }
    }

    /* Free all FAT blocks belonging to the file */
    uint16_t curr = root[root_index].indexFirstBlock;

    while (curr != get_fat_eoc()) {
        uint16_t next = fat[curr];
        fat[curr] = get_fat_free();
        curr = next;
    }

    /* Clear the root entry */
    memset(&root[root_index], 0, sizeof(RootEntry));
    root[root_index].indexFirstBlock = get_fat_eoc();

    return 0;
}

1. Validate basic input

if (sblk == NULL || filename == NULL) {
    return -1;
}

This makes sure:

Without this, you could crash or operate on invalid state.

2. Find the file in the root directory

int root_index = -1;

for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
    if (root[i].name[0] != '\0' &&
        strcmp((char *)root[i].name, filename) == 0) {
        root_index = i;
        break;
    }
}

This loop searches through all 128 root directory entries.

We only compare names for entries that are actually in use:

root[i].name[0] != '\0'

If found, we store the index of the file’s root entry.

That index is important because later functions often refer back to the file through its root directory slot. The assignment specifically explains that the file descriptor table stores root_index for this reason.

3. If the file does not exist, fail

if (root_index == -1) {
    return -1;
}

If you never found a matching filename, there is nothing to delete.

4. Check whether the file is open

for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
    if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
        return -1;
    }
}

This is very important.

Your assignment explicitly says:

If a file is currently open, fs_delete should return -1. 

Since your codebase uses:

filedes[i].id == -1

to mean “unused FD”, this loop checks all currently open descriptors and sees whether any of them refer to this file’s root directory entry.

If yes, deletion is not allowed.

Why check by rootIndex instead of name?

5. Start FAT traversal

uint16_t curr = root[root_index].indexFirstBlock;

This gets the FAT index of the first data block of the file.

There are two possibilities:

Case A: file is empty
Then:

curr == 0xFFFF

So the loop below never runs.

Case B: file has data

6. Walk the FAT chain and free each block

while (curr != get_fat_eoc()) {
    uint16_t next = fat[curr];
    fat[curr] = get_fat_free();
    curr = next;
}

This is the heart of the delete logic.

Let’s unpack it carefully.

Why do we save next before freeing the current block?

Suppose the chain is:

fat[4] = 7
fat[7] = 10
fat[10] = 0xFFFF

That means the file uses blocks:

4 -> 7 -> 10 -> EOC

If you did this in the wrong order:

fat[curr] = 0x0000;
curr = fat[curr];

you would immediately lose the next pointer, because after setting fat[curr] = 0x0000, the chain is broken.

So the correct order is:

  1. remember next
  2. free current
  3. move to next

That is why we do:

uint16_t next = fat[curr];
fat[curr] = get_fat_free();
curr = next;

Example traversal

Suppose:

and the FAT contains:

fat[4]  = 7
fat[7]  = 10
fat[10] = 0xFFFF

Then the loop does:

Iteration 1

Iteration 2

Iteration 3

Now loop ends.

All three blocks are freed.

This matches the assignment’s requirement to traverse the FAT chain and mark each block as free.

7. Clear the root entry

memset(&root[root_index], 0, sizeof(RootEntry));
root[root_index].indexFirstBlock = get_fat_eoc();

This removes the file from the directory.

After memset, the root entry becomes blank:

Then we explicitly set:

indexFirstBlock = 0xFFFF

This is a nice consistent state for an empty/nonexistent file entry.

Strictly speaking, after memset it would be 0, but setting it back to EOC keeps the semantic meaning consistent with how empty files are represented elsewhere.

Deleting a file is not just about freeing data blocks.

If you only freed the FAT chain but left the root entry intact, the filesystem would still think the file exists.

So deletion must do both:

fs_ls implementation

What fs_ls should do

It should:

An entry is active if its name is not empty.

Implementation

Correct implementation:

int fs_ls(void) {
    if (sblk == NULL) {
        return -1;
    }
	
    printf("FS Ls:\n");
	
    for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
        if (root[i].name[0] != '\0') {
            printf("file: %s, size: %u, data_blk: %u\n",
                   root[i].name,
                   root[i].size,
                   root[i].indexFirstBlock);
        }
    }
	
    return 0;
}

1. Check whether the filesystem is mounted

if (sblk == NULL) {
    return -1;
}

If no filesystem is mounted, there is no valid root directory to list.

So just like your other functions, fs_ls should fail in that case.

2. Print a header

printf("FS Ls:\n");

This is just a label before the file list.

Many versions of this assignment use exactly this style.

3. Loop through the whole root directory

for (int i = 0; i < FS_FILE_MAX_COUNT; i++)

Your root directory has room for 128 entries, so this checks all possible file slots.

4. Skip empty entries

if (root[i].name[0] != '\0')

This means:

So fs_ls only prints real files.

5. Print file information

printf("file: %s, size: %u, data_blk: %u\n",
       root[i].name,
       root[i].size,
       root[i].indexFirstBlock);

This prints exactly the three things the assignment asks for:

Important note about data_block

root[i].indexFirstBlock

This is the FAT index of the first block, not necessarily the absolute disk block number.

That is an important distinction.

For example:

sblk->dataIndex + root[i].indexFirstBlock

But for this function, most versions of the assignment want the stored starting block value from the root entry, which is the FAT index.

Also, for an empty file, this value will be:

0xFFFF

because empty files do not yet have any allocated data block.

3. File Descriptors

fs_open() implementation

What fs_open must do

It should:

  1. make sure the filesystem is mounted
  2. validate filename
  3. find the file in the root directory
  4. fail if the file does not exist
  5. fail if that file is already open
  6. find the first free file descriptor slot
  7. fail if all 32 are in use
  8. initialize that FD:
    • assign an ID
    • set offset = 0
    • set rootIndex to the root directory entry index
  9. increment open-file bookkeeping
  10. return the FD index

Implementation

Correct implementation:

int fs_open(const char *filename) {
    if (sblk == NULL || filename == NULL) {
        return -1;
    }

    int root_index = -1;

    /* Find the file in the root directory */
    for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
        if (root[i].name[0] != '\0' &&
            strcmp((char *)root[i].name, filename) == 0) {
            root_index = i;
            break;
        }
    }

    /* File does not exist */
    if (root_index == -1) {
        return -1;
    }

    /* Do not allow opening the same file twice */
    for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
        if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
            return -1;
        }
    }

    /* Find first free file descriptor slot */
    for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
        if (filedes[i].id == -1) {
            filedes[i].id = next_id++;
            filedes[i].offset = 0;
            filedes[i].rootIndex = root_index;
            numFilesOpen++;
            return i;
        }
    }

    /* No free FD slot */
    return -1;
}

Search the root directory for the file

int root_index = -1;

for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
    if (root[i].name[0] != '\0' &&
        strcmp((char *)root[i].name, filename) == 0) {
        root_index = i;
        break;
    }
}

This loop looks through all 128 root entries.

We only compare non-empty entries:

root[i].name[0] != '\0'

If we find a matching name, we save that root directory index.

That is important because the file descriptor table does not store the filename itself. It stores the index of the file’s root directory entry, which is exactly the design the assignment recommends.

3. Fail if the file does not exist

if (root_index == -1) {
    return -1;
}

If you did not find a matching file in the root directory, then fs_open must fail.

The assignment explicitly says that if the file does not exist, return -1.

4. Reject opening the same file twice

for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
    if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
        return -1;
    }
}

This is an important lab-specific rule.

Your assignment says:

For simplicity in this lab, you do not need to support independent read/write offsets for the same file. If a user calls fs_open on a file that is already marked as open in your FD table, return -1. 

So we scan the FD table and check whether any currently open descriptor already refers to that same root directory entry.

If yes, fail.

5. Find the first free file descriptor slot

for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
    if (filedes[i].id == -1) {
        ...
    }
}

Your code uses:

id == -1

to mean “unused FD slot”.

So this loop is looking for the first available descriptor index from 0 to 31.

6. Initialize the file descriptor

filedes[i].id = next_id++;
filedes[i].offset = 0;
filedes[i].rootIndex = root_index;
numFilesOpen++;
return i;

This claims the FD slot.

Why return i instead of filedes[i].id?

This is a very important distinction.

There are two numbers here:

All later API calls like:

fs_close(fd)
fs_read(fd, ...)
fs_write(fd, ...)

expect the array index fd, not the internal session ID.

So the correct return value is:

return i;

not the unique ID

Think of the FD table like this:

FD index id offset rootIndex
0 -1 0 -1
1 -1 0 -1
2 -1 0 -1

Suppose notes.txt is in root entry 7.

fd = fs_open("notes.txt");

might produce:

FD index id offset rootIndex
0 0 0 7
1 -1 0 -1
2 -1 0 -1

and return:

0

meaning the user’s file descriptor is 0.

Later, if the user reads 100 bytes, offset may become 100.

fs_close implementation

What fs_close must do

From the instructions:

In your implementation, “unused” means:

filedes[i].id == -1

Implementation

Correct implementation:

int fs_close(int fd) {
    /* Validate fd */
    if (!is_valid_fd(fd)) {
        return -1;
    }

    /* Mark descriptor as unused */
    filedes[fd].id = -1;
    filedes[fd].offset = 0;
    filedes[fd].rootIndex = -1;

    numFilesOpen--;

    return 0;
}

1. Validate the file descriptor

You already have this helper:

static inline int is_valid_fd(int fd) {
    return (fd >= 0 && fd < FS_OPEN_MAX_COUNT && filedes[fd].id != -1);
}

So:

if (!is_valid_fd(fd)) {
    return -1;
}

means:

If either fails → return -1

2. Mark the FD as unused

filedes[fd].id = -1;

This is the most important line.

Because in your system:

id == -1  → unused

That is what makes the FD slot available again.

Mental model

Before closing:

FD id offset rootIndex
0 5 120 3

After:

FD id offset rootIndex
0 -1 0 -1

fs_stat implementation

What fs_stat should do

Given a file descriptor fd, it should:

  1. validate the FD
  2. use filedes[fd].rootIndex to find the file in the root directory
  3. return the file’s size

Implementation

Correct implementation:

int fs_stat(int fd) {
    /* Validate fd */
    if (!is_valid_fd(fd)) {
        return -1;
    }

    int root_index = filedes[fd].rootIndex;

    return root[root_index].size;
}

Get the root directory index

int root_index = filedes[fd].rootIndex;

This is the key design idea of your filesystem:

So instead of searching by name again, you directly jump to:

root[root_index]

fs_lseek implementation

What fs_lseek must do

Given:

It should:

  1. validate fd
  2. get the file size from the root directory
  3. check:
    • if offset > size → invalid → return -1
  4. otherwise:
    • update filedes[fd].offset = offset
    • return 0

Implementation

Correct implementation:

int fs_lseek(int fd, size_t offset) {
    /* Validate fd */
    if (!is_valid_fd(fd)) {
        return -1;
    }

    int root_index = filedes[fd].rootIndex;
    size_t file_size = root[root_index].size;

    /* Cannot seek past end of file */
    if (offset > file_size) {
        return -1;
    }

    /* Update offset */
    filedes[fd].offset = offset;

    return 0;
}

Enforce the key rule

if (offset > file_size) {
    return -1;
}

This is the most important rule.

The assignment explicitly says:

you cannot seek past the end of a file

So:

offset size allowed?
< size valid
= size valid
> size invalid

Why offset == size is allowed

If:

offset == size

That means:

This is valid, because:

Mental model

Think of the file like a byte array:

[0][1][2][3][4][5][6][7][8][9]

If size = 10:

Example

fd = fs_open("file.txt");  // size = 10

fs_lseek(fd, 5);   // OK
fs_lseek(fd, 10);  // OK (EOF)
fs_lseek(fd, 11);  // FAIL

4. Reading and Writing (The Bounce Buffer)

fs_read implementation

What fs_read does

Given:

it:

  1. checks how many bytes are actually available before EOF
  2. loops block by block
  3. reads each full 4096-byte disk block into a bounce buffer
  4. copies only the needed part into the user buffer
  5. advances the file descriptor offset
  6. returns the number of bytes actually read

Implementation

Correct implementation:

int fs_read(int fd, void *buf, size_t count) {
    if (!is_valid_fd(fd) || buf == NULL) {
        return -1;
    }

    if (count == 0) {
        return 0;
    }

    int r_idx = filedes[fd].rootIndex;
    size_t file_size = root[r_idx].size;
    size_t cur_offset = filedes[fd].offset;

    /* Already at EOF */
    if (cur_offset >= file_size) {
        return 0;
    }

    /* Do not read past EOF */
    size_t bytes_left_in_file = file_size - cur_offset;
    size_t bytes_to_read = (count < bytes_left_in_file) ? count : bytes_left_in_file;

    size_t total_read = 0;
    uint8_t bounce[BLOCK_SIZE];

    while (total_read < bytes_to_read) {
        size_t abs_offset = cur_offset + total_read;

        int disk_block = get_block_for_offset(fd, abs_offset, 0);
        if (disk_block < 0) {
            return -1;
        }

        if (block_read(disk_block, bounce) < 0) {
            return -1;
        }

        size_t block_offset = abs_offset % BLOCK_SIZE;
        size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
        size_t bytes_remaining = bytes_to_read - total_read;

        size_t chunk = (bytes_remaining < bytes_left_in_block)
                     ? bytes_remaining
                     : bytes_left_in_block;

        memcpy((uint8_t *)buf + total_read, bounce + block_offset, chunk);
        total_read += chunk;
    }

    filedes[fd].offset += total_read;
    return (int)total_read;
}

1. Validate inputs

if (!is_valid_fd(fd) || buf == NULL) {
    return -1;
}

Your assignment says invalid FDs must return -1. 

And buf == NULL is also invalid because there is nowhere to copy data.

2. Handle count == 0

if (count == 0) {
    return 0;
}

Reading zero bytes is valid. It just moves nothing.

3. Check EOF first

if (cur_offset >= file_size) {
    return 0;
}

The assignment explicitly says:

So if the cursor is already at the end, there is nothing to read.

4. Clamp the read to EOF

size_t bytes_left_in_file = file_size - cur_offset;
size_t bytes_to_read = (count < bytes_left_in_file) ? count : bytes_left_in_file;

This is the line that enforces:

“Read only what remains before EOF.” 

Example:

Then:

5. Why we need a bounce buffer

The assignment explains that the disk only reads whole 4096-byte blocks, even if the user wants only a few bytes. So you must:

  1. read the entire block into a temporary array
  2. copy just the needed bytes from that array into the user buffer 

That is why we do:

uint8_t bounce[BLOCK_SIZE];

and later:

block_read(disk_block, bounce);

6. Compute which block to read

size_t abs_offset = cur_offset + total_read;
int disk_block = get_block_for_offset(fd, abs_offset, 0);

As the loop progresses, abs_offset moves through the file.

Examples:

That is exactly why the helper exists.

7. Find the position inside the block

size_t block_offset = abs_offset % BLOCK_SIZE;

This tells you where inside the 4096-byte block your read starts.

Examples:

So this is the byte position inside the bounce buffer.

8. Decide how many bytes to copy from this block

size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
size_t bytes_remaining = bytes_to_read - total_read;

size_t chunk = (bytes_remaining < bytes_left_in_block)
             ? bytes_remaining
             : bytes_left_in_block;

This chooses the smaller of:

So if a read spans multiple blocks, each loop copies only the chunk that belongs to the current block.

9. Copy from bounce buffer into user buffer

memcpy((uint8_t *)buf + total_read, bounce + block_offset, chunk);

This says:

This is exactly the assignment’s bounce-buffer design.

10. Update state

total_read += chunk;
...
filedes[fd].offset += total_read;

The assignment says to update the FD offset by the number of bytes actually moved.

So when the read is done, the cursor advances.

Example walk-through

Suppose:

Then:
Step 1: clamp to EOF

Step 2: first iteration

Step 3: second iteration

Done.

That is how one logical read request turns into two block-level operations.

fs_read implementation

What fs_write must do

Given:

it should:

  1. validate input
  2. write starting at the current FD offset
  3. overwrite if offset is inside the file
  4. append if offset is at EOF
  5. allocate FAT blocks if the write extends the file
  6. use a bounce buffer for each affected disk block
  7. stop early if disk becomes full
  8. update:
    • FD offset
    • file size
  9. return actual bytes written

The key difference from fs_read

In fs_read, the flow was:

disk block -> bounce buffer -> user buffer

In fs_write, it is:

disk block -> bounce buffer -> modify part of it -> write bounce buffer back to disk

That is called read-modify-write.

Why do we do that?

Because the disk only reads and writes whole 4096-byte blocks. If the user wants to write 20 bytes into the middle of a block, you cannot just write those 20 bytes directly to disk or you would destroy the rest of the block.

So the pattern is:

  1. read full block into bounce buffer
  2. overwrite the needed bytes in memory
  3. write full block back

One subtle but important rule

When extending a file, a block might be newly allocated and never written before.

In that case, there is no old block content that matters.

So for newly allocated blocks, it is safest to initialize the bounce buffer to zeros before copying user data into it.

For existing blocks, do a normal block_read() first.

Implementation

Correct implementation:

int fs_write(int fd, void *buf, size_t count) {
    if (!is_valid_fd(fd) || buf == NULL) {
        return -1;
    }

    if (count == 0) {
        return 0;
    }

    int r_idx = filedes[fd].rootIndex;
    size_t cur_offset = filedes[fd].offset;
    size_t old_size = root[r_idx].size;
    size_t total_written = 0;
    uint8_t bounce[BLOCK_SIZE];

    while (total_written < count) {
        size_t abs_offset = cur_offset + total_written;
        size_t block_offset = abs_offset % BLOCK_SIZE;
        size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
        size_t bytes_remaining = count - total_written;
        size_t chunk = (bytes_remaining < bytes_left_in_block)
                     ? bytes_remaining
                     : bytes_left_in_block;

        int disk_block = get_block_for_offset(fd, abs_offset, 1);

        /* Disk full: if nothing written yet, return 0; otherwise partial write */
        if (disk_block < 0) {
            break;
        }

        /*
         * Determine whether this block already existed in the file before
         * this write began. If yes, read old contents. If no, start with zeros.
         *
        if (abs_offset < old_size) {
            if (block_read(disk_block, bounce) < 0) {
                return -1;
            }
        } else {
            memset(bounce, 0, BLOCK_SIZE);
        }

        memcpy(bounce + block_offset, (uint8_t *)buf + total_written, chunk);

        if (block_write(disk_block, bounce) < 0) {
            return -1;
        }

        total_written += chunk;
    }

    /* Update FD offset by actual bytes written */
    filedes[fd].offset += total_written;

    /* Grow file size if needed */
    if (filedes[fd].offset > root[r_idx].size) {
        root[r_idx].size = filedes[fd].offset;
    }

    return (int)total_written;
}

1. Validate input

if (!is_valid_fd(fd) || buf == NULL) {
    return -1;
}

This matches the general FD error-handling rule:

2. Handle zero-byte write

if (count == 0) {
    return 0;
}

Writing zero bytes is valid and should do nothing.

3. Save starting state

int r_idx = filedes[fd].rootIndex;
size_t cur_offset = filedes[fd].offset;
size_t old_size = root[r_idx].size;
size_t total_written = 0;
uint8_t bounce[BLOCK_SIZE];

These variables matter a lot.

4. Loop block by block

while (total_written < count)

This is necessary because:

5. Compute where we are in the file

size_t abs_offset = cur_offset + total_written;
size_t block_offset = abs_offset % BLOCK_SIZE;
size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
size_t bytes_remaining = count - total_written;
size_t chunk = (bytes_remaining < bytes_left_in_block)
             ? bytes_remaining
             : bytes_left_in_block;

This is almost exactly the same structure as fs_read.

6. Ask helper for the correct disk block

int disk_block = get_block_for_offset(fd, abs_offset, 1);

The important part is the 1.

That means:

“If this offset needs a block that doesn’t exist yet, allocate one.”

So unlike fs_read, fs_write can grow the FAT chain.

7. Handle disk-full / partial-write case

if (disk_block < 0) {
    break;
}

If allocation fails, that usually means there are no free FAT blocks left.

Then we stop writing.

This is exactly what you want for partial writes:

Since total_written starts at 0 and only increases after successful writes, breaking here naturally gives the right result.

Why this handles the edge case correctly

Suppose:

Then:

And if the disk is full before the very first byte:

Also exactly correct.

8. Decide whether to read old block contents or start with zeros

if (abs_offset < old_size) {
    if (block_read(disk_block, bounce) < 0) {
        return -1;
    }
} else {
    memset(bounce, 0, BLOCK_SIZE);
}

Case A: writing into a block that already existed in the file

Then we must preserve the untouched bytes in that block.

So we:

block_read(disk_block, bounce);

Case B: writing into a newly allocated block beyond old EOF

There is no previous file data to preserve there.

So we can safely do:

memset(bounce, 0, BLOCK_SIZE);

Why abs_offset < old_size is the right test

Because old_size is the file size before this write started.

If the current absolute offset is still inside that old file:

If the current absolute offset is past or at old EOF:

That is a clean way to distinguish overwrite vs growth.

9. Copy user bytes into bounce buffer

memcpy(bounce + block_offset, (uint8_t *)buf + total_written, chunk);

This writes the new bytes into the correct part of the block image.

So if the user writes 20 bytes starting at offset 100:

That is exactly the read-modify-write behavior you need.

10. Write whole block back to disk

if (block_write(disk_block, bounce) < 0) {
    return -1;
}

After the bounce buffer has been updated, we write the entire block back.

11. Advance progress

total_written += chunk;

Now the next iteration continues at the next byte to write.

12. Update FD offset

filedes[fd].offset += total_written;

This moves the file cursor forward by the number of bytes actually written.

That matches the assignment’s state-management rules.

13. Update file size if the file grew

if (filedes[fd].offset > root[r_idx].size) {
    root[r_idx].size = filedes[fd].offset;
}

This is very important.

If the write extends past old EOF, the file size must grow.

If it was only an overwrite inside the file, the size stays the same.

So new file size is:

max(old size, new offset)

and that is exactly what this if does.

Example 1: overwrite inside existing file

Suppose:

Then:

Example 2: append to end of file

Suppose:

Then:

Example 3: multi-block write

Suppose:

Then:
Iteration 1

Iteration 2

Done.

Example 4: disk full partial write

Suppose:

Then:

Helper functions

find_free_fat() implementation

Correct implementation:

static int find_free_fat() {
    if (fat == NULL || sblk == NULL) {
        return -1;
    }

    for (int i = 0; i < sblk->numDataBlocks; i++) {
        if (fat[i] == get_fat_free()) {
            return i;  // Found a free block
        }
    }

    return -1;  // No free blocks → disk is full
}

1. Safety check

if (fat == NULL || sblk == NULL)

This ensures:

2. Loop through FAT entries

for (int i = 0; i < sblk->numDataBlocks; i++)

We only check valid data-region entries, not raw memory.

Important detail:
The FAT size in memory is:

sblk->numFAT * BLOCK_SIZE   // in bytes

BUT the number of valid FAT entries is:

sblk->numDataBlocks

Why? Because:

So you MUST loop like this:

for (int i = 0; i < sblk->numDataBlocks; i++)

NOT over the full allocated memory.

Takeaway:

3. Check for free entry

if (fat[i] == get_fat_free())

Your helper:

static inline uint16_t get_fat_free() {
    return 0x0000;
}

So this is equivalent to:

if (fat[i] == 0x0000)

4. Return index

return i;

This index is:

To convert it to a disk block later:

disk_block = sblk->dataIndex + i;

5. If nothing found

return -1;

Meaning:

get_block_for_offset implementation

What get_block_for_offset does

Given:

it answers:

“Which absolute disk block contains that byte?”

To do that, it:
1. starts at the file’s indexFirstBlock
2. follows the FAT chain
3. advances one FAT link per 4096 bytes
4. converts the final FAT index into an absolute disk block: sblk->dataIndex + fat_index

Implementation

Correct implementation:

static int get_block_for_offset(int fd, size_t offset, int allocate) {
    if (!is_valid_fd(fd) || sblk == NULL) {
        return -1;
    }

    int r_idx = filedes[fd].rootIndex;
    uint16_t fat_idx = root[r_idx].indexFirstBlock;

    /* Which file block do we need?
       offset 0..4095   -> block 0 of file
       offset 4096..8191 -> block 1 of file
       etc.
    */
    size_t blocks_to_advance = offset / BLOCK_SIZE;

    /* Empty file / no first block yet */
    if (fat_idx == get_fat_eoc()) {
        if (!allocate) {
            return -1;
        }

        int new_idx = find_free_fat();
        if (new_idx < 0) {
            return -1;
        }

        root[r_idx].indexFirstBlock = (uint16_t)new_idx;
        fat[new_idx] = get_fat_eoc();
        fat_idx = (uint16_t)new_idx;
    }

    /* Walk forward through the FAT chain */
    for (size_t i = 0; i < blocks_to_advance; i++) {
        uint16_t next = fat[fat_idx];

        if (next == get_fat_eoc()) {
            if (!allocate) {
                return -1;
            }

            int new_idx = find_free_fat();
            if (new_idx < 0) {
                return -1;
            }

            fat[fat_idx] = (uint16_t)new_idx;
            fat[new_idx] = get_fat_eoc();
            fat_idx = (uint16_t)new_idx;
        } else {
            fat_idx = next;
        }
    }

    /* Convert FAT index -> absolute disk block index */
    return sblk->dataIndex + fat_idx;
}

offset / BLOCK_SIZE

This line is the key:

size_t blocks_to_advance = offset / BLOCK_SIZE;

Examples:

So this tells us how many FAT links to follow.

Why start from indexFirstBlock

The root directory stores:

root[r_idx].indexFirstBlock

That is the FAT index of the file’s first data block, or 0xFFFF if the file is empty. 

So the FAT chain always begins there.

Why return sblk->dataIndex + fat_idx

This is one of the most important rules in the assignment:

Why the allocate parameter exists

For fs_read, you will call:

get_block_for_offset(fd, offset, 0)

because reading should never allocate blocks.

For fs_write, you will later call it with allocate = 1, so if the file grows past its current chain, the helper can create new FAT blocks.

So even though allocate is not really used by fs_read, implementing it now will help you later with fs_write.

fat[fat_idx] = (uint16_t)new_idx;
fat[new_idx] = get_fat_eoc();
fat_idx = (uint16_t)new_idx;

We are extending a linked list.

Remember:

So a file looks like:

Block A → Block B → Block C → 0xFFFF

When do we execute this code?
This happens when:

next == get_fat_eoc()

Meaning:

Line-by-line explanation

Extend the chain

fat[fat_idx] = (uint16_t)new_idx;

Mark the new block as end of chain

fat[new_idx] = get_fat_eoc();

Move forward in the chain

fat_idx = (uint16_t)new_idx;

Visual example

Before allocation
Suppose the FAT looks like:

fat[4] = 7
fat[7] = 0xFFFF

File chain:

4 → 7 → EOC

You need one more block
Let’s say:

new_idx = 10

After these three lines

fat[7] = 10;
fat[10] = 0xFFFF;
fat_idx = 10;

Now FAT:

fat[4] = 7
fat[7] = 10
fat[10] = 0xFFFF

Chain becomes:

4 → 7 → 10 → EOC

Evidence

autograder_pass_evidence.png