PA 2 - FAT Filesystem
Class: CSCE-313
Notes:
Instructions
Introduction & Learning Objectives
In this PA, you will build a user-space library (libfs) that implements a simplified FAT (File Allocation Table) file system. Instead of interacting with a physical hard drive, your file system will operate on a virtual disk - a standard file on your Linux environment that has been artificially divided into fixed-size 4096-byte blocks.
By the end of this assignment, you will understand:
- Block-based I/O: How operating systems read and write data in fixed chunks rather than byte-by-byte.
- File System Metadata: How structs are packed into blocks to track file locations, sizes, and free space.
- Linked Allocation: How the File Allocation Table stitches fragmented data blocks together into contiguous files.
- File Descriptors: How the OS manages state for multiple open files simultaneously.
The Virtual Disk Architecture
Our file system operates entirely on a virtual disk where every block is exactly 4096 bytes (BLOCK_SIZE). You cannot read or write 10 bytes to the disk; you must read the entire 4096-byte block into memory, modify the specific bytes you care about, and write the entire 4096-byte block back. This is handled via the provided disk.c driver.
The disk is divided into four strict regions, always in this exact order:
- Superblock: Always exactly 1 block (Block 0).
- FAT Array: Can span 1 or more blocks, depending on the total disk size.
- Root Directory: Always exactly 1 block.
- Data Region: The remaining blocks, used strictly to hold the actual contents of your files.
The Core Data Structures
The Superblock (Block 0)
The Superblock is the "master record" of the file system. When you mount the disk, this is the very first block you read. It contains the geometry of the entire disk.
typedef struct __attribute__((__packed__)) Superblock {
uint8_t sig[8]; // Must be exactly "CSCE-313"
uint16_t numBlocks; // Total blocks on the virtual disk
uint16_t rootIndex; // Block index of the Root Directory
uint16_t dataIndex; // Block index of the Data Region
uint16_t numDataBlocks; // Total number of data blocks
uint8_t numFAT; // Number of blocks the FAT occupies
uint8_t padding[4079]; // Pads the struct to exactly 4096 bytes
} Superblock;
Note: The __attribute__((__packed__)) directive tells the GCC compiler not to add arbitrary padding between variables, ensuring our struct maps perfectly to the raw disk bytes.
The File Allocation Table (FAT)
When a file spans multiple blocks, the FAT functions as a linked-list data structure that maps the logical sequence of data to its physical locations on the disk.
- It is an array of 16-bit integers (
uint16_t). - Crucial Indexing Rule: The indices of the FAT array correspond only to the blocks in the Data Region, not the absolute blocks of the virtual disk.
- FAT Index 0 corresponds to the very first data block. To calculate the absolute disk block number that you must pass into
block_read()orblock_write(), you must offset the index by the start of the data region using this formula:Absolute_Disk_Block = superblock.dataIndex + FAT_Index - The value stored at
fat[N]is the FAT index of the next block in the file's chain. - Special Values:
0xFFFFindicates the End of Chain (EOC).0x0000indicates a free, unused block.
The Root Directory
The Root Directory is a flat catalog of all files on the disk. Our file system has a strict limit of 128 maximum files. A 32-byte entry represents each file.
typedef struct __attribute__((__packed__)) RootEntry {
uint8_t name[16]; // File name (null-terminated if < 16 chars)
uint32_t size; // File size in bytes
uint16_t indexFirstBlock; // Index of the first data block (or 0xFFFF if empty)
uint8_t padding[10]; // Pads the struct to exactly 32 bytes
} RootEntry;
Because 128 x 32 bytes = 4096 bytes, the Root Directory fits perfectly into a single block.
The File Descriptor Table
To allow user programs to read and write to files, the OS must keep track of active "sessions." If a user opens a file and reads the first 100 bytes, the file system must remember that the cursor is at byte 100.
You must declare a global array of 32 File Descriptors in your fs.c file: FileDescriptor fd_table[32];
You should define the FileDescriptor struct like this:
typedef struct {
int is_used; // 1 if this FD is currently open, 0 if it is free
int root_index; // The index of this file in your Root Directory array
size_t offset; // The current read/write cursor position in bytes
} FileDescriptor;
Why root_index? Do not store the file's name in this struct! By storing the index (0 to 127) of the file's entry in your Root Directory array, fs_read and fs_write can instantly access the file's size and starting FAT block without having to do a slow strcmp search every time the user wants to read a byte.
/CSCE-313/Labs/Visual%20Aids/image.png)
FAT Chain Example
/CSCE-313/Labs/Visual%20Aids/image-1.png)
Note: This file system does not support folders or sub-directories. All files are stored in the Root Directory.
Codebase Organization
The repository is strictly organized to separate your implementation from the testing tools.
📁 CSCE-313-FAT-Lab/
├── 📄 Makefile # Top-level Makefile to build everything
├── 📄 autograder.sh # The final grading script
├── 📁 src/ # Core file system library
│ ├── 📄 disk.c / disk.h # Provided block I/O driver (DO NOT MODIFY)
│ ├── 📄 fs.c # YOUR IMPLEMENTATION GOES HERE
│ ├── 📄 fs.h # API headers (DO NOT MODIFY)
│ └── 📄 Makefile # Compiles your code into libfs.a
├── 📁 utils/ # Standalone command-line tools
│ ├── 📄 fs_cli.c # Source code for the interactive shell
│ ├── 📄 mkfs.x # Pre-compiled virtual disk formatter
│ ├── 📄 ref_fs.x # Pre-compiled TA reference implementation
│ └── 📄 Makefile
└── 📁 tests/ # C-level Integration Tests
├── 📄 test_integration.c # Tests FD limits and state
├── 📄 test_file_io.c # Tests read/write offset logic
└── 📄 Makefile
What You Must Implement (The API)
You are required to implement the entire file system API inside src/fs.c. You should tackle this in four distinct steps. Do not modify src/fs.h.
1: Mounting and Unmounting
int fs_mount(const char *diskname): Open the disk usingblock_disk_open. Read the Superblock (Block 0), verify theCSCE-313signature, allocate memory for your FAT and Root Directory, and read them from the disk. Return 0 on success, -1 on failure.int fs_umount(void): Write your in-memory FAT and Root Directory back to the disk to save changes. Free allocated memory and close the disk. Return 0 on success.int fs_info(void): Print the disk geometry (total blocks, FAT blocks, free space ratio) to the console based on the Superblock data.
2: Directory Operations
int fs_create(const char *filename): Scan the Root Directory for an empty slot. Initialize the file name, set size to 0, and set the first block to0xFFFF.int fs_delete(const char *filename): Find the file in the Root Directory. Traverse its FAT chain and set every corresponding FAT entry to 0x0000 to free the blocks. Clear the Root Entry. If a file is currently open,fs_deleteshould return -1.int fs_ls(void): Iterate through the Root Directory and print the name, size, and starting block of all active files.
3: File Descriptors
You must maintain the global fd_table array to track the state of up to 32 open files.
-
fs_open(const char *filename): 1. Search the Root Directory for the requested filename. If it doesn't exist, return -1. 2. Scan yourfd_tablearray for the first available descriptor (is_used == 0). If 32 files are already open, return -1. 3. Claim the descriptor: Setis_used = 1, setroot_indexto the index where you found the file in the Root Directory, and setoffset = 0. 4. Return the index of the file descriptor (0-31) to the user. -
fs_close(int fd): Validate that thefdis within bounds (0-31) and is currently open. If valid, simply set itsis_usedflag to 0. -
fs_stat(int fd): Validate thefd. Use itsroot_indexto look up the file in your Root Directory array, and return the size variable from that entry. -
fs_lseek(int fd, size_t offset): Validate thefd. Check if the requested offset is larger than the file's current size. If it is, return -1 (you cannot seek past the end of a file). Otherwise, update the FD'soffsetvariable and return 0.Note on Sparse Files: In this lab, seeking past EOF returns an error. However, in production Linux systems, this is used to create Sparse Files. You can read how the standard lseek handles this (and the
SEEK_HOLEflag) in the Linux Programmer's Manual (man 2 lseek).
4: Reading and Writing (The Bounce Buffer)
This is the most complex phase. You must translate byte-level requests into block-level disk I/O.
int fs_read(int fd, void *buf, size_t count): This function reads up tocountbytes from the file into the user-providedbuf, starting at the current file offset.- Iterative Multi-Block I/O: If count is larger than 4,096 bytes (or spans across a block boundary), your function must loop. For each block involved:
- Identify the physical disk block using the current offset and the FAT chain.
- Load the entire 4,096-byte disk block into a temporary bounce buffer.
- Use
memcpyto copy the relevant portion from the bounce buffer into the correct position within the user'sbuf.
- State Management: After each successful read, increment the file descriptor's
offsetby the number of bytes actually moved. - Boundary Conditions: * If the user asks for more bytes than are remaining in the file (e.g., asking for 100 bytes when only 20 remain before EOF), read only the remaining 20 bytes.
- Return Value: Return the total number of bytes successfully read. If the offset is already at EOF, return 0.
- Iterative Multi-Block I/O: If count is larger than 4,096 bytes (or spans across a block boundary), your function must loop. For each block involved:
int fs_write(int fd, void *buf, size_t count): Read the block into a bounce buffer, memcpy the user's new data into it, and block_write it back to the disk. If writing past the end of the file, you must find a free block in the FAT, link it, and update the file's size in the Root Directory. Return the actual number of bytes written.- Writes occur at the current file descriptor offset.
- If offset < size → overwrite existing data
- If offset == size → append
- Writes occur at the current file descriptor offset.
Edge Cases & Expected Error Handling
Your file system should work reliably against these specific edge cases. Unless otherwise specified, your functions should return -1 to indicate an error.
1. Invalid File Descriptors
- For
fs_read,fs_write,fs_close,fs_stat, andfs_lseek: If the user passes a File Descriptor index that is out of bounds (e.g., < 0 or >= 32), or passes an FD that is not currently open (is_used == 0), you must immediately return -1.
2. File Creation Limits (fs_create)
- File Already Exists: You cannot have two files with the exact same name. If
fs_createis called with a filename that already exists in the Root Directory, return -1. - Root Directory Full: If there are already 128 active files in the Root Directory, return -1.
- Filename Too Long: If the provided filename exceeds 15 characters (plus the null terminator), return -1.
3. Open File Limits (fs_open)
- Too Many Open Files: If the FD table already has 32 active sessions, return -1.
- Opening the Same File Twice: For simplicity in this lab, you do not need to support independent read/write offsets for the same file. If a user calls
fs_openon a file that is already marked as open in your FD table, return -1.
4. Reading Past EOF (fs_read)
- A user might ask to read 100 bytes when only 20 bytes remain in the file. This is not a fatal error! You should read the remaining 20 bytes, update the offset, and return 20 (the actual number of bytes read).
- If the user calls
fs_readwhen their offset is already at the very end of the file, do not crash. Simply return 0 to indicate that 0 bytes were read.
5. Out of Space / Disk Full (fs_write)
- Similar to reading, if a user asks to write 5,000 bytes, but the FAT only has one free block left (4,096 bytes), you must perform a partial write.
- Write the 4,096 bytes, update the file's size and the FD offset, and return 4096. The user's application will see that the return value is smaller than what it requested and will know the disk is full.
- If the disk is completely 100% full when
fs_writeis called, return 0.
How to Compile and Run Your Code
Build, compile, and test your code iteratively using the provided tools.
1. Compiling the Project
You can build specific parts of the project, or everything at once.
- To build everything: Run
makefrom the root directory. This compiles the core library (src/), then the utilities (utils/), and finally the tests (tests/). - To build just the library: Navigate to the source folder (cd src) and run
make.
2. Formatting a Virtual Disk
Before your file system can do anything, you need a disk to operate on. Use the provided mkfs.x utility to create an empty virtual disk file.
# Creates a virtual disk named 'disk.fs' with 100 total blocks
./utils/mkfs.x disk.fs 100
3. Interactive Testing (Command-Line Interface)
Once you have implemented the code, you can use fs_cli.x to manually interact with your virtual disk and test your logic.
# Print the disk info (Tests fs_mount and fs_info)
./utils/fs_cli.x info disk.fs
# Create a file inside the virtual disk (Tests fs_create)
./utils/fs_cli.x add disk.fs my_test_file.txt
# List all files on the virtual disk (Tests fs_ls)
./utils/fs_cli.x ls disk.fs
4. Running the C-Level Tests
Once you complete the implementation, you must test your internal state machine (File Descriptors) and offset math.
# Make sure you have a fresh disk
./utils/mkfs.x disk.fs 100
# Run the integration test (Tests fs_open, fs_close, fd limits)
./tests/test_integration.x
# Run the File I/O test (Tests fs_read, fs_write offsets)
./tests/test_file_io.x
5. The Autograder
When you believe your entire implementation is complete and robust, run the grading script from the root directory. This script stresses your file system by creating 128 files, writing large text files that cross block boundaries, and forcing disk-full edge cases.
6. Understanding the Provided Tests
Use the testing suite to verify specific parts of your architecture before running the final autograder:
test_integration.x(State & File Descriptors): Checks yourfd_tablelogic. It verifies that you enforce the 32-file limit, properly initialize offsets, and successfully recycle File Descriptors after they are closed viafs_close.test_file_io.x(The Bounce Buffer): Checks your block offset math. It verifies thatfs_readandfs_writecan correctly locate data within a block, cross block boundaries using the FAT chain, and return the exact requested byte counts without reading garbage memory.autograder.sh(Stress & Edge Cases): The final gauntlet. It verifies the 128-file Root Directory limit, ensuresfs_deleteproperly reclaims FAT blocks, tests >4000-byte files to force block spanning, and triggers "Disk Full" scenarios to check for safe partial writes.
Key Systems Concepts to Understand
Let’s understand a few core quirks of how C and Operating Systems handle memory and hardware.
1. Struct Padding and __attribute__((__packed__))
By default, the C compiler optimizes your code for the CPU, not for the hard drive. CPUs prefer to read memory in even chunks (like 4 or 8 bytes at a time, known as "word alignment"). If you create a struct with a 1-byte char followed by a 4-byte int, the compiler will secretly insert 3 bytes of invisible "padding" between them so the int starts on an even memory boundary.
Why this is dangerous: In this lab, your Superblock and RootEntry structs must map flawlessly to the raw bytes on the virtual disk. If the compiler injects secret padding, your file sizes and block indices will read garbage data from the disk.
The Solution: We use __attribute__((__packed__)) at the end of our struct definitions. This is a directive that orders the GCC compiler to aggressively pack the variables together exactly as written, byte-for-byte, disabling all hardware-optimized padding.
2. The "Bounce Buffer" (Block I/O vs. Byte I/O)
The fundamental challenge of writing a file system is bridging the gap between what the user wants and what the hardware can actually do.
- The User wants byte-level access: "Read exactly 13 bytes starting at offset 50."
- The Hardware only does block-level access: "I can only give you the entire 4,096-byte block."
To solve this, you must implement a Bounce Buffer. This is a temporary 4,096-byte array you allocate inside your fs_read and fs_write functions.
- For Reading: You use
block_readto load the entire 4KB block from the disk into your bounce buffer. Then, you use pointer arithmetic to locate the exact starting byte inside that buffer, and usememcpyto copy just the 13 bytes the user asked for into their application. - For Writing (Read-Modify-Write): You cannot just write 13 bytes to the disk, or you will overwrite the other 4,083 bytes in that block with zeros. You must first
block_readthe existing block into your bounce buffer,memcpythe user's 13 new bytes into the bounce buffer to overwrite the old data, and thenblock_writethe entire mixed 4KB buffer back to the disk.
3. Pointer Arithmetic vs. Byte Math
When reading data into memory, you will frequently need to calculate memory offsets. You must be extremely careful with pointer types in C.
If you have an array representing your FAT: uint16_t *fat_array = malloc(FAT_SIZE);
And you want to jump 4,096 bytes forward in memory to read the second block of the FAT, you might be tempted to do this: fat_array + 4096
The Trap: In C, adding a number to a pointer advances it by elements, not by bytes. Because fat_array points to 16-bit integers (2 bytes each), adding 4096 actually advances the memory address by 8,192 bytes, skipping a whole block and causing a segmentation fault!
The Fix: Whenever you are doing raw byte math to find offsets, always cast your pointer to a 1-byte type (like uint8_t * or char *) first: (uint8_t *)fat_array + 4096
4. File Descriptors and "State"
A file on a disk is completely static; it has no idea if anyone is reading it. But an Operating System needs to know exactly where a user left off reading so it can give them the next chunk of text.
A File Descriptor (FD) is how the OS tracks this active "state." It is simply an index to an entry in a table maintained by your library. For every open file, your library must track:
- Which file is open (its index in the Root Directory).
- The offset: A cursor tracking exactly how many bytes into the file the user has currently read or written.
When a user calls fs_read(fd, buf, 10), you look at your FD table, see that their cursor is at byte 50, fetch bytes 50-59, and then update their cursor to 60. When they call fs_close(fd), you simply mark that table entry as available again.
Implementation
1. Mounting and Unmounting
The key idea is:
fs_mountloads the filesystem metadata from disk into memory.fs_umountsaves your in-memory metadata back to disk.fs_infojust reports what is currently mounted.
Your assignment says the disk layout is always:
- block 0 = superblock
- blocks 1 .. numFAT = FAT
- block rootIndex = root directory
- data region after that
Also, the superblock must contain the exact 8-byte signature "CSCE-313", the FAT is an array of uint16_t, and the root directory fits in exactly one block because 128 entries × 32 bytes = 4096 bytes.
What each function should do
fs_mount(const char *diskname)
- fail if a disk is already mounted
- open the disk
- allocate memory for the superblock
- read block 0 into it
- verify the signature
- allocate memory for the FAT
- read all FAT blocks into memory
- read the root directory block into root
- initialize all file descriptors as unused
- reset open-file bookkeeping
fs_umount(void)
- fail if nothing is mounted
- fail if any file is still open
- write superblock back to block 0
- write FAT blocks back to disk
- write root directory back to its block
- free allocated memory
- reset pointers/state
- close the disk
fs_info(void)
- fail if nothing is mounted
- print geometry using the superblock
- count:
- free FAT entries = FAT entries equal to 0x0000
- free root entries = empty filenames
- print the ratios
Correct implementation
int fs_mount(const char *diskname) {
if (diskname == NULL) {
return -malo
}
/* Do not allow mounting twice */
if (sblk != NULL) {
return -1;
}
/* Open the virtual disk */
if (block_disk_open(diskname) < 0) {
return -1;
}
/* Allocate and read superblock */
sblk = malloc(sizeof(Superblock));
if (sblk == NULL) {
block_disk_close();
return -1;
}
if (block_read(0, sblk) < 0) {
free(sblk);
sblk = NULL;
block_disk_close();
return -1;
}
/* Validate signature: exactly 8 bytes, no null terminator required */
if (memcmp(sblk->sig, "CSCE-313", 8) != 0) {
free(sblk);
sblk = NULL;
block_disk_close();
return -1;
}
/* Allocate FAT memory: numFAT blocks, each BLOCK_SIZE bytes */
fat = malloc(sblk->numFAT * BLOCK_SIZE);
if (fat == NULL) {
free(sblk);
sblk = NULL;
block_disk_close();
return -1;
}
/* Read FAT blocks from disk: FAT starts at block 1 */
for (int i = 0; i < sblk->numFAT; i++) {
if (block_read(1 + i, (uint8_t *)fat + (i * BLOCK_SIZE)) < 0) {
free(fat);
fat = NULL;
free(sblk);
sblk = NULL;
block_disk_close();
return -1;
}
}
/* Read root directory block */
if (block_read(sblk->rootIndex, root) < 0) {
free(fat);
fat = NULL;
free(sblk);
sblk = NULL;
block_disk_close();
return -1;
}
/* Initialize file descriptor table */
for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
filedes[i].id = -1;
filedes[i].offset = 0;
filedes[i].rootIndex = -1;
}
numFilesOpen = 0;
next_id = 0;
return 0;
}
int fs_umount(void) {
if (sblk == NULL) {
return -1;
}
/* Cannot unmount if files are still open */
if (numFilesOpen > 0) {
return -1;
}
/* Write superblock back */
if (block_write(0, sblk) < 0) {
return -1;
}
/* Write FAT blocks back */
for (int i = 0; i < sblk->numFAT; i++) {
if (block_write(1 + i, (uint8_t *)fat + (i * BLOCK_SIZE)) < 0) {
return -1;
}
}
/* Write root directory back */
if (block_write(sblk->rootIndex, root) < 0) {
return -1;
}
/* Free memory */
free(fat);
fat = NULL;
free(sblk);
sblk = NULL;
/* Close disk */
if (block_disk_close() < 0) {
return -1;
}
return 0;
}
int fs_info(void) {
if (sblk == NULL) {
return -1;
}
int fat_free = 0;
int root_free = 0;
/* Count free FAT entries */
for (int i = 0; i < sblk->numDataBlocks; i++) {
if (fat[i] == get_fat_free()) {
fat_free++;
}
}
/* Count free root directory entries */
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] == '\0') {
root_free++;
}
}
printf("FS Info:\n");
printf("total_blk_count=%u\n", sblk->numBlocks);
printf("fat_blk_count=%u\n", sblk->numFAT);
printf("rdir_blk=%u\n", sblk->rootIndex);
printf("data_blk=%u\n", sblk->dataIndex);
printf("data_blk_count=%u\n", sblk->numDataBlocks);
printf("fat_free_ratio=%d/%u\n", fat_free, sblk->numDataBlocks);
printf("rdir_free_ratio=%d/%d\n", root_free, FS_FILE_MAX_COUNT);
return 0;
}
fs_mount questions
1. Why do we use sblk != NULL to check if mounted?
Because sblk is a global pointer:
static Superblock* sblk = NULL;
Before mounting, it is NULL.
After successful mounting, it points to allocated memory.
So this:
if (sblk != NULL) return -1;
means:
“A filesystem is already mounted, so do not mount another one on top of it.”
That prevents inconsistent state.
2. Why do we malloc(sizeof(Superblock))?
Because the superblock lives on disk, but after mounting we want a copy in RAM.
So:
sblk = malloc(sizeof(Superblock));
block_read(0, sblk);
means:
- reserve memory for one superblock
- read disk block 0 into that memory
This matches the assignment’s description that block 0 is always the superblock.
3. Why use memcmp(sblk->sig, "CSCE-313", 8)?
Because sig is:
uint8_t sig[8];
That is exactly 8 bytes.
And "CSCE-313" is also 8 characters.
Important detail:
- this field is not necessarily null-terminated, so
strcmp()is not the right choice.
memcmp(..., 8) compares exactly 8 raw bytes, which is what we want.
4. Why is FAT allocation sblk->numFAT * BLOCK_SIZE?
The FAT occupies multiple whole disk blocks, and each block is 4096 bytes. The assignment explicitly says the FAT can span 1 or more blocks, and the disk only works in whole blocks. 
So if numFAT == 2, then FAT size in bytes is:
2 * 4096
That is why we allocate:
fat = malloc(sblk->numFAT * BLOCK_SIZE);
5. Why cast to (uint8_t *)fat when reading FAT blocks?
This is one of the most important C details in the whole assignment.
You are reading the FAT one disk block at a time:
block_read(1 + i, (uint8_t *)fat + (i * BLOCK_SIZE))
Why not just do fat + i * BLOCK_SIZE?
Because fat is a uint16_t *.
In C, pointer arithmetic moves by elements, not bytes.
So:
fat + 1
moves forward by sizeof(uint16_t) bytes, which is 2 bytes.
But here we want to move by exact byte offsets, like 4096 bytes at a time. The assignment itself warns about this pointer-arithmetic trap and says to cast to a 1-byte type first. 
So:
(uint8_t *)fat + (i * BLOCK_SIZE)
means:
“Treat FAT as raw bytes, then move forward exactly i * 4096 bytes.”
6. Why do we read root directly into root?
Because your code already has:
static RootEntry root[FS_FILE_MAX_COUNT];
And the assignment says the root directory is exactly one block. Since each entry is 32 bytes and there are 128 entries, that is exactly 4096 bytes total.
So this works perfectly:
block_read(sblk->rootIndex, root);
We are reading one block straight into the array.
7. Why initialize all file descriptors during mount?
Because open-file state should start clean each time you mount.
In your starter code, a descriptor is unused when:
filedes[i].id == -1
So we set all of them to unused.
Also:
numFilesOpen = 0;
next_id = 0;
resets the session bookkeeping.
fs_umount questions
8. Why check numFilesOpen > 0 in fs_umount?
Because unmounting while files are still open would mean:
- user processes still think those files are active
- but your metadata is being shut down and freed
That creates invalid state.
Your assignment explicitly says to ensure no files are open before unmounting.
9. Why write metadata back in fs_umount?
Because while the disk is mounted, you will modify:
- FAT
- root directory
- maybe even superblock in future logic
Those changes live in memory until you save them.
So unmount is the “flush everything back to disk” step.
fs_info questions
10. Why count FAT free entries only up to numDataBlocks?
Because FAT entries correspond to the data region only, not all disk blocks. The assignment is very explicit about that. 
So the valid FAT entries are:
fat[0] ... fat[numDataBlocks - 1]
Not the entire disk block count.
11. Count free FAT entries
for (int i = 0; i < sblk->numDataBlocks; i++) {
if (fat[i] == get_fat_free()) {
fat_free_count++;
}
}
This loops through the FAT entries that actually correspond to data blocks.
A FAT entry is free if it contains:
0x0000
which your helper returns through:
get_fat_free()
So we are literally counting:
“How many data blocks are currently unused?”
12. Count fee root entries
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] == '\0') {
root_free_count++;
}
}
Each root directory entry represents one possible file.
If the first character of the filename is '\0', that means the entry is empty.
So this counts:
“How many file slots are still available in the root directory?”
Since the root directory has room for 128 files, this loop checks all 128 entries.
Why root[i].name[0] == '\0' means empty:
A filename is stored in:
uint8_t name[FS_FILENAME_LEN];
If a root entry is unused, its name should be all zeros, or at least start with zero.
So checking only the first byte is enough to tell whether the entry is empty.
That is a very common pattern in C strings.
2. Directory Operations
fs_create implementation
Creating a file here does not allocate any data block yet.
That is a really important point.
When you create an empty file:
- it has a name
- it has size 0
- it has no data blocks yet
So the first block is set to the FAT end-of-chain marker:
0xFFFF
This means:
“There is currently no first block allocated.”
What checks should happen first
Before writing anything into the root directory, fs_create should verify:
- a filesystem is mounted
- filename is not NULL
- filename is not empty
- filename fits in
FS_FILENAME_LEN(16 total, so max 15 visible characters +'\0') - the file does not already exist
- there is at least one empty root directory slot
The assignment explicitly requires duplicate-name rejection, root-full rejection, and filename-length rejection. 
Correct implementation:
int fs_create(const char *filename) {
if (sblk == NULL || filename == NULL) {
return -1;
}
/* Reject empty filename */
if (filename[0] == '\0') {
return -1;
}
/* Max allowed: 15 chars + null terminator */
if (strlen(filename) >= FS_FILENAME_LEN) {
return -1;
}
int free_index = -1;
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
/* Track first empty root slot */
if (root[i].name[0] == '\0') {
if (free_index == -1) {
free_index = i;
}
}
/* Reject duplicate filename */
else if (strcmp((char *)root[i].name, filename) == 0) {
return -1;
}
}
/* No empty slot found => root directory full */
if (free_index == -1) {
return -1;
}
/* Initialize the new root entry */
memset(&root[free_index], 0, sizeof(RootEntry));
strcpy((char *)root[free_index].name, filename);
root[free_index].size = 0;
root[free_index].indexFirstBlock = get_fat_eoc();
return 0;
}
1. Mounted check
if (sblk == NULL || filename == NULL) {
return -1;
}
If the filesystem is not mounted, then root[] is not meaningful as active filesystem state.
Also, if filename == NULL, you cannot safely call strlen or strcmp on it.
2. Empty filename check
if (filename[0] == '\0') {
return -1;
}
A zero-length file name should be rejected.
Remember empty names would conflict with how you detect unused root entries:
root[i].name[0] == '\0'
So allowing "" as a real file name would break your directory logic.
3. Length check
if (strlen(filename) >= FS_FILENAME_LEN) {
return -1;
}
Your header defines:
#define FS_FILENAME_LEN 16
That means the array can hold at most:
- 15 actual filename characters
- 1 null terminator
So length 16 or more is too long.
That matches the assignment’s rule: filenames longer than 15 characters plus null terminator must be rejected.
4. One scan handles both duplicate detection and free-slot detection
This part is efficient and clean:
int free_index = -1;
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] == '\0') {
if (free_index == -1) {
free_index = i;
}
}
else if (strcmp((char *)root[i].name, filename) == 0) {
return -1;
}
}
During the loop:
- if an entry is empty, remember the first empty slot
- if an entry is in use, compare its name to the requested filename
- if duplicate found, fail immediately
At the end:
- if no empty slot exists, the root directory is full
- otherwise use the remembered slot
This satisfies both assignment requirements: no duplicates, no creation when the root directory is full.
5. Why root[i].name[0] == '\0' means “empty slot”
Because unused root entries are effectively blank.
A filename stored in a C-style char array starts with '\0' if it is empty.
So this is the standard way to say:
“This root directory entry is unused.”
6. Why memset first is a good idea
memset(&root[free_index], 0, sizeof(RootEntry));
This clears the entire entry before filling in fields.
That means:
name[]becomes all zerosizebecomes 0indexFirstBlockbecomes 0 initially- padding bytes become clean too
Then you overwrite the fields that matter:
strcpy((char *)root[free_index].name, filename);
root[free_index].size = 0;
root[free_index].indexFirstBlock = get_fat_eoc();
Using memset is not strictly required, but it is very safe and clean.
7. Why indexFirstBlock = get_fat_eoc()
This is one of the most important ideas in fs_create.
A newly created file has no data blocks yet.
So its “first block” is not a real FAT index.
Instead, you mark it as:
0xFFFF
which is your end-of-chain marker:
get_fat_eoc()
Later, when fs_write writes the first byte into an empty file, that is when you will allocate the first FAT block.
Why we do not call find_free_fat() here
This is a very common confusion.
You might think:
“I am creating a file, so shouldn’t I allocate a block now?”
But for this filesystem, no.
Creating the file only creates its directory entry.
The file is still empty, so it does not need any data block yet.
You only allocate a FAT block when the file actually needs to store data, usually during fs_write.
So fs_create does not touch FAT at all.
Common mistake
Mistake 1: using > instead of >= in length check
Wrong:
if (strlen(filename) > FS_FILENAME_LEN)
That would incorrectly allow a 16-character name, which leaves no space for the null terminator.
Correct:
if (strlen(filename) >= FS_FILENAME_LEN)
fs_delete implement
Big picture first
When a file exists, its root directory entry stores:
- its name
- its size
- the FAT index of its first data block
If the file has data, the FAT works like a linked list:
first block -> next block -> next block -> ... -> 0xFFFF
Deleting the file means:
- find the first block from the root entry
- follow the chain in the FAT
- free each FAT entry by setting it to 0x0000
- erase the root directory entry so the file no longer exists
One very important edge case
A file may be empty.
For an empty file:
indexFirstBlock == 0xFFFF
That means:
- there is no actual FAT chain to free
- you should skip the FAT traversal
- you should still clear the root entry
This is because fs_create initializes empty files with indexFirstBlock = 0xFFFF.
Implementation
Correct implementation:
int fs_delete(const char *filename) {
if (sblk == NULL || filename == NULL) {
return -1;
}
int root_index = -1;
/* Find the file in the root directory */
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] != '\0' &&
strcmp((char *)root[i].name, filename) == 0) {
root_index = i;
break;
}
}
/* File not found */
if (root_index == -1) {
return -1;
}
/* Make sure the file is not currently open */
for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
return -1;
}
}
/* Free all FAT blocks belonging to the file */
uint16_t curr = root[root_index].indexFirstBlock;
while (curr != get_fat_eoc()) {
uint16_t next = fat[curr];
fat[curr] = get_fat_free();
curr = next;
}
/* Clear the root entry */
memset(&root[root_index], 0, sizeof(RootEntry));
root[root_index].indexFirstBlock = get_fat_eoc();
return 0;
}
1. Validate basic input
if (sblk == NULL || filename == NULL) {
return -1;
}
This makes sure:
- the filesystem is mounted
- the filename pointer is valid
Without this, you could crash or operate on invalid state.
2. Find the file in the root directory
int root_index = -1;
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] != '\0' &&
strcmp((char *)root[i].name, filename) == 0) {
root_index = i;
break;
}
}
This loop searches through all 128 root directory entries.
We only compare names for entries that are actually in use:
root[i].name[0] != '\0'
If found, we store the index of the file’s root entry.
That index is important because later functions often refer back to the file through its root directory slot. The assignment specifically explains that the file descriptor table stores root_index for this reason.
3. If the file does not exist, fail
if (root_index == -1) {
return -1;
}
If you never found a matching filename, there is nothing to delete.
4. Check whether the file is open
for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
return -1;
}
}
This is very important.
Your assignment explicitly says:
If a file is currently open,
fs_deleteshould return -1. 
Since your codebase uses:
filedes[i].id == -1
to mean “unused FD”, this loop checks all currently open descriptors and sees whether any of them refer to this file’s root directory entry.
If yes, deletion is not allowed.
Why check by rootIndex instead of name?
- Because your file descriptor table stores:
rootIndex - which is the index of the file in the root directory.
- That is better than storing the name because:
- it is faster
- it avoids repeated string comparisons
- it matches the assignment’s intended design
5. Start FAT traversal
uint16_t curr = root[root_index].indexFirstBlock;
This gets the FAT index of the first data block of the file.
There are two possibilities:
Case A: file is empty
Then:
curr == 0xFFFF
So the loop below never runs.
Case B: file has data
- Then
curris some valid FAT index like 0, 5, 19, etc.
6. Walk the FAT chain and free each block
while (curr != get_fat_eoc()) {
uint16_t next = fat[curr];
fat[curr] = get_fat_free();
curr = next;
}
This is the heart of the delete logic.
Let’s unpack it carefully.
Why do we save next before freeing the current block?
Suppose the chain is:
fat[4] = 7
fat[7] = 10
fat[10] = 0xFFFF
That means the file uses blocks:
4 -> 7 -> 10 -> EOC
If you did this in the wrong order:
fat[curr] = 0x0000;
curr = fat[curr];
you would immediately lose the next pointer, because after setting fat[curr] = 0x0000, the chain is broken.
So the correct order is:
- remember next
- free current
- move to next
That is why we do:
uint16_t next = fat[curr];
fat[curr] = get_fat_free();
curr = next;
Example traversal
Suppose:
indexFirstBlock = 4
and the FAT contains:
fat[4] = 7
fat[7] = 10
fat[10] = 0xFFFF
Then the loop does:
Iteration 1
curr = 4next = fat[4] = 7fat[4] = 0x0000curr = 7
Iteration 2
curr = 7next = fat[7] = 10fat[7] = 0x0000curr = 10
Iteration 3
curr = 10next = fat[10] = 0xFFFFfat[10] = 0x0000curr = 0xFFFF
Now loop ends.
All three blocks are freed.
This matches the assignment’s requirement to traverse the FAT chain and mark each block as free.
7. Clear the root entry
memset(&root[root_index], 0, sizeof(RootEntry));
root[root_index].indexFirstBlock = get_fat_eoc();
This removes the file from the directory.
After memset, the root entry becomes blank:
- name becomes empty
- size becomes 0
- all padding becomes 0
Then we explicitly set:
indexFirstBlock = 0xFFFF
This is a nice consistent state for an empty/nonexistent file entry.
Strictly speaking, after memset it would be 0, but setting it back to EOC keeps the semantic meaning consistent with how empty files are represented elsewhere.
Deleting a file is not just about freeing data blocks.
If you only freed the FAT chain but left the root entry intact, the filesystem would still think the file exists.
So deletion must do both:
- reclaim the data blocks
- remove the directory entry
fs_ls implementation
What fs_ls should do
It should:
- make sure a filesystem is mounted
- loop through all 128 root directory entries
- print only the entries that are active
- for each active file, print:
- filename
- size
- first data block
An entry is active if its name is not empty.
Implementation
Correct implementation:
int fs_ls(void) {
if (sblk == NULL) {
return -1;
}
printf("FS Ls:\n");
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] != '\0') {
printf("file: %s, size: %u, data_blk: %u\n",
root[i].name,
root[i].size,
root[i].indexFirstBlock);
}
}
return 0;
}
1. Check whether the filesystem is mounted
if (sblk == NULL) {
return -1;
}
If no filesystem is mounted, there is no valid root directory to list.
So just like your other functions, fs_ls should fail in that case.
2. Print a header
printf("FS Ls:\n");
This is just a label before the file list.
Many versions of this assignment use exactly this style.
3. Loop through the whole root directory
for (int i = 0; i < FS_FILE_MAX_COUNT; i++)
Your root directory has room for 128 entries, so this checks all possible file slots.
4. Skip empty entries
if (root[i].name[0] != '\0')
This means:
- if the first character is not null, the file exists
- if it is null, that slot is unused
So fs_ls only prints real files.
5. Print file information
printf("file: %s, size: %u, data_blk: %u\n",
root[i].name,
root[i].size,
root[i].indexFirstBlock);
This prints exactly the three things the assignment asks for:
- name
- size
- starting block
Important note about data_block
root[i].indexFirstBlock
This is the FAT index of the first block, not necessarily the absolute disk block number.
That is an important distinction.
For example:
- if indexFirstBlock = 0, that means the first data block in the data region
- the actual disk block would be:
sblk->dataIndex + root[i].indexFirstBlock
But for this function, most versions of the assignment want the stored starting block value from the root entry, which is the FAT index.
Also, for an empty file, this value will be:
0xFFFF
because empty files do not yet have any allocated data block.
3. File Descriptors
fs_open() implementation
What fs_open must do
It should:
- make sure the filesystem is mounted
- validate filename
- find the file in the root directory
- fail if the file does not exist
- fail if that file is already open
- find the first free file descriptor slot
- fail if all 32 are in use
- initialize that FD:
- assign an ID
- set offset = 0
- set rootIndex to the root directory entry index
- increment open-file bookkeeping
- return the FD index
Implementation
Correct implementation:
int fs_open(const char *filename) {
if (sblk == NULL || filename == NULL) {
return -1;
}
int root_index = -1;
/* Find the file in the root directory */
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] != '\0' &&
strcmp((char *)root[i].name, filename) == 0) {
root_index = i;
break;
}
}
/* File does not exist */
if (root_index == -1) {
return -1;
}
/* Do not allow opening the same file twice */
for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
return -1;
}
}
/* Find first free file descriptor slot */
for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
if (filedes[i].id == -1) {
filedes[i].id = next_id++;
filedes[i].offset = 0;
filedes[i].rootIndex = root_index;
numFilesOpen++;
return i;
}
}
/* No free FD slot */
return -1;
}
Search the root directory for the file
int root_index = -1;
for (int i = 0; i < FS_FILE_MAX_COUNT; i++) {
if (root[i].name[0] != '\0' &&
strcmp((char *)root[i].name, filename) == 0) {
root_index = i;
break;
}
}
This loop looks through all 128 root entries.
We only compare non-empty entries:
root[i].name[0] != '\0'
If we find a matching name, we save that root directory index.
That is important because the file descriptor table does not store the filename itself. It stores the index of the file’s root directory entry, which is exactly the design the assignment recommends.
3. Fail if the file does not exist
if (root_index == -1) {
return -1;
}
If you did not find a matching file in the root directory, then fs_open must fail.
The assignment explicitly says that if the file does not exist, return -1.
4. Reject opening the same file twice
for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
if (filedes[i].id != -1 && filedes[i].rootIndex == root_index) {
return -1;
}
}
This is an important lab-specific rule.
Your assignment says:
For simplicity in this lab, you do not need to support independent read/write offsets for the same file. If a user calls fs_open on a file that is already marked as open in your FD table, return -1. 
So we scan the FD table and check whether any currently open descriptor already refers to that same root directory entry.
If yes, fail.
5. Find the first free file descriptor slot
for (int i = 0; i < FS_OPEN_MAX_COUNT; i++) {
if (filedes[i].id == -1) {
...
}
}
Your code uses:
id == -1
to mean “unused FD slot”.
So this loop is looking for the first available descriptor index from 0 to 31.
6. Initialize the file descriptor
filedes[i].id = next_id++;
filedes[i].offset = 0;
filedes[i].rootIndex = root_index;
numFilesOpen++;
return i;
This claims the FD slot.
filedes[i].id = next_id++;- This gives the session a unique ID and marks the slot as in use.
- Even though the user sees the FD index i, your internal id is what your starter code uses to distinguish free from occupied entries.
filedes[i].offset = 0;- This means when the file is opened, the current read/write cursor starts at the beginning of the file.
- That exactly matches the assignment: newly opened files should start with offset 0.
filedes[i].rootIndex = root_index;- This links the FD slot to the file’s entry in the root directory.
- Later, fs_stat, fs_read, fs_write, and fs_lseek will use this to quickly find:
- file size
- first block
- file metadata
- without doing another filename search.
numFilesOpen++;- This keeps your open-file count accurate.
- That matters because
fs_umountchecks whether any files are still open before unmounting.
return i;- This returns the file descriptor index, which is what the assignment expects.
- The assignment specifically says
fs_openshould return the descriptor index from 0 to 31.
Why return i instead of filedes[i].id?
This is a very important distinction.
There are two numbers here:
filedes[i].id= internal session markeri= the actual file descriptor index the user uses
All later API calls like:
fs_close(fd)
fs_read(fd, ...)
fs_write(fd, ...)
expect the array index fd, not the internal session ID.
So the correct return value is:
return i;
not the unique ID
Think of the FD table like this:
| FD index | id | offset | rootIndex |
|---|---|---|---|
| 0 | -1 | 0 | -1 |
| 1 | -1 | 0 | -1 |
| 2 | -1 | 0 | -1 |
Suppose notes.txt is in root entry 7.
fd = fs_open("notes.txt");
might produce:
| FD index | id | offset | rootIndex |
|---|---|---|---|
| 0 | 0 | 0 | 7 |
| 1 | -1 | 0 | -1 |
| 2 | -1 | 0 | -1 |
and return:
0
meaning the user’s file descriptor is 0.
Later, if the user reads 100 bytes, offset may become 100.
fs_close implementation
What fs_close must do
From the instructions:
- check that
fdis valid (0–31) - check that it is currently open
- mark it as unused
In your implementation, “unused” means:
filedes[i].id == -1
Implementation
Correct implementation:
int fs_close(int fd) {
/* Validate fd */
if (!is_valid_fd(fd)) {
return -1;
}
/* Mark descriptor as unused */
filedes[fd].id = -1;
filedes[fd].offset = 0;
filedes[fd].rootIndex = -1;
numFilesOpen--;
return 0;
}
1. Validate the file descriptor
You already have this helper:
static inline int is_valid_fd(int fd) {
return (fd >= 0 && fd < FS_OPEN_MAX_COUNT && filedes[fd].id != -1);
}
So:
if (!is_valid_fd(fd)) {
return -1;
}
means:
fdmust be between 0 and 31- AND the FD must currently be in use
If either fails → return -1
2. Mark the FD as unused
filedes[fd].id = -1;
This is the most important line.
Because in your system:
id == -1 → unused
That is what makes the FD slot available again.
Mental model
Before closing:
| FD | id | offset | rootIndex |
|---|---|---|---|
| 0 | 5 | 120 | 3 |
After:
| FD | id | offset | rootIndex |
|---|---|---|---|
| 0 | -1 | 0 | -1 |
fs_stat implementation
What fs_stat should do
Given a file descriptor fd, it should:
- validate the FD
- use
filedes[fd].rootIndexto find the file in the root directory - return the file’s size
Implementation
Correct implementation:
int fs_stat(int fd) {
/* Validate fd */
if (!is_valid_fd(fd)) {
return -1;
}
int root_index = filedes[fd].rootIndex;
return root[root_index].size;
}
Get the root directory index
int root_index = filedes[fd].rootIndex;
This is the key design idea of your filesystem:
- The FD does not store the filename
- It stores the index of the file in the root directory
So instead of searching by name again, you directly jump to:
root[root_index]
fs_lseek implementation
What fs_lseek must do
Given:
- a file descriptor
fd - a desired offset
It should:
- validate
fd - get the file size from the root directory
- check:
- if
offset > size→ invalid → return -1
- if
- otherwise:
- update
filedes[fd].offset = offset - return 0
- update
Implementation
Correct implementation:
int fs_lseek(int fd, size_t offset) {
/* Validate fd */
if (!is_valid_fd(fd)) {
return -1;
}
int root_index = filedes[fd].rootIndex;
size_t file_size = root[root_index].size;
/* Cannot seek past end of file */
if (offset > file_size) {
return -1;
}
/* Update offset */
filedes[fd].offset = offset;
return 0;
}
Enforce the key rule
if (offset > file_size) {
return -1;
}
This is the most important rule.
The assignment explicitly says:
you cannot seek past the end of a file
So:
| offset | size | allowed? |
|---|---|---|
| < size | ✅ | valid |
| = size | ✅ | valid |
| > size | ❌ | invalid |
Why offset == size is allowed
- This is subtle but important.
If:
offset == size
That means:
- cursor is at end of file (EOF)
This is valid, because:
- next read → returns 0
- next write → appends
Mental model
Think of the file like a byte array:
[0][1][2][3][4][5][6][7][8][9]
If size = 10:
- valid offsets: 0 → 10
- invalid offsets: > 10
Example
fd = fs_open("file.txt"); // size = 10
fs_lseek(fd, 5); // OK
fs_lseek(fd, 10); // OK (EOF)
fs_lseek(fd, 11); // FAIL
4. Reading and Writing (The Bounce Buffer)
fs_read implementation
What fs_read does
Given:
fd- user buffer
buf count
it:
- checks how many bytes are actually available before EOF
- loops block by block
- reads each full 4096-byte disk block into a bounce buffer
- copies only the needed part into the user buffer
- advances the file descriptor offset
- returns the number of bytes actually read
Implementation
Correct implementation:
int fs_read(int fd, void *buf, size_t count) {
if (!is_valid_fd(fd) || buf == NULL) {
return -1;
}
if (count == 0) {
return 0;
}
int r_idx = filedes[fd].rootIndex;
size_t file_size = root[r_idx].size;
size_t cur_offset = filedes[fd].offset;
/* Already at EOF */
if (cur_offset >= file_size) {
return 0;
}
/* Do not read past EOF */
size_t bytes_left_in_file = file_size - cur_offset;
size_t bytes_to_read = (count < bytes_left_in_file) ? count : bytes_left_in_file;
size_t total_read = 0;
uint8_t bounce[BLOCK_SIZE];
while (total_read < bytes_to_read) {
size_t abs_offset = cur_offset + total_read;
int disk_block = get_block_for_offset(fd, abs_offset, 0);
if (disk_block < 0) {
return -1;
}
if (block_read(disk_block, bounce) < 0) {
return -1;
}
size_t block_offset = abs_offset % BLOCK_SIZE;
size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
size_t bytes_remaining = bytes_to_read - total_read;
size_t chunk = (bytes_remaining < bytes_left_in_block)
? bytes_remaining
: bytes_left_in_block;
memcpy((uint8_t *)buf + total_read, bounce + block_offset, chunk);
total_read += chunk;
}
filedes[fd].offset += total_read;
return (int)total_read;
}
1. Validate inputs
if (!is_valid_fd(fd) || buf == NULL) {
return -1;
}
Your assignment says invalid FDs must return -1. 
And buf == NULL is also invalid because there is nowhere to copy data.
2. Handle count == 0
if (count == 0) {
return 0;
}
Reading zero bytes is valid. It just moves nothing.
3. Check EOF first
if (cur_offset >= file_size) {
return 0;
}
The assignment explicitly says:
- if the offset is already at EOF, return 0
- do not crash 
So if the cursor is already at the end, there is nothing to read.
4. Clamp the read to EOF
size_t bytes_left_in_file = file_size - cur_offset;
size_t bytes_to_read = (count < bytes_left_in_file) ? count : bytes_left_in_file;
This is the line that enforces:
“Read only what remains before EOF.” 
Example:
- file size = 1000
- current offset = 990
- user asks for 50 bytes
Then:
- only 10 bytes remain
- so bytes_to_read = 10
5. Why we need a bounce buffer
The assignment explains that the disk only reads whole 4096-byte blocks, even if the user wants only a few bytes. So you must:
- read the entire block into a temporary array
- copy just the needed bytes from that array into the user buffer 
That is why we do:
uint8_t bounce[BLOCK_SIZE];
and later:
block_read(disk_block, bounce);
6. Compute which block to read
size_t abs_offset = cur_offset + total_read;
int disk_block = get_block_for_offset(fd, abs_offset, 0);
As the loop progresses, abs_offset moves through the file.
Examples:
- if current file offset is 3000 and you already read 0 bytes, abs_offset = 3000
- after reading 1096 bytes, abs_offset = 4096
- now you need the next file block
That is exactly why the helper exists.
7. Find the position inside the block
size_t block_offset = abs_offset % BLOCK_SIZE;
This tells you where inside the 4096-byte block your read starts.
Examples:
- abs_offset = 3000 → block_offset = 3000
- abs_offset = 4096 → block_offset = 0
- abs_offset = 5000 → block_offset = 904
So this is the byte position inside the bounce buffer.
8. Decide how many bytes to copy from this block
size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
size_t bytes_remaining = bytes_to_read - total_read;
size_t chunk = (bytes_remaining < bytes_left_in_block)
? bytes_remaining
: bytes_left_in_block;
This chooses the smaller of:
- how many bytes are still needed overall
- how many bytes remain in the current block
So if a read spans multiple blocks, each loop copies only the chunk that belongs to the current block.
9. Copy from bounce buffer into user buffer
memcpy((uint8_t *)buf + total_read, bounce + block_offset, chunk);
This says:
- destination = current place inside the user buffer
- source = current place inside the bounce buffer
- number of bytes = chunk
This is exactly the assignment’s bounce-buffer design.
10. Update state
total_read += chunk;
...
filedes[fd].offset += total_read;
The assignment says to update the FD offset by the number of bytes actually moved.
So when the read is done, the cursor advances.
Example walk-through
Suppose:
- block size = 4096
- file size = 5000
- FD offset = 4000
- user asks for count = 300
Then:
Step 1: clamp to EOF
- Bytes left in file = 5000 - 4000 = 1000
- So read request is fine: bytes_to_read = 300
Step 2: first iteration
- abs_offset = 4000
- block_offset = 4000 % 4096 = 4000
- bytes left in current block = 4096 - 4000 = 96
- So first chunk = 96 bytes
Step 3: second iteration
- Now total_read = 96
- abs_offset = 4096
- block_offset = 0
- remaining bytes = 300 - 96 = 204
- So second chunk = 204 bytes
Done.
That is how one logical read request turns into two block-level operations.
fs_read implementation
What fs_write must do
Given:
fd- user buffer
buf count
it should:
- validate input
- write starting at the current FD offset
- overwrite if offset is inside the file
- append if offset is at EOF
- allocate FAT blocks if the write extends the file
- use a bounce buffer for each affected disk block
- stop early if disk becomes full
- update:
- FD offset
- file size
- return actual bytes written
The key difference from fs_read
In fs_read, the flow was:
disk block -> bounce buffer -> user buffer
In fs_write, it is:
disk block -> bounce buffer -> modify part of it -> write bounce buffer back to disk
That is called read-modify-write.
Why do we do that?
Because the disk only reads and writes whole 4096-byte blocks. If the user wants to write 20 bytes into the middle of a block, you cannot just write those 20 bytes directly to disk or you would destroy the rest of the block.
So the pattern is:
- read full block into bounce buffer
- overwrite the needed bytes in memory
- write full block back
One subtle but important rule
When extending a file, a block might be newly allocated and never written before.
In that case, there is no old block content that matters.
So for newly allocated blocks, it is safest to initialize the bounce buffer to zeros before copying user data into it.
For existing blocks, do a normal block_read() first.
Implementation
Correct implementation:
int fs_write(int fd, void *buf, size_t count) {
if (!is_valid_fd(fd) || buf == NULL) {
return -1;
}
if (count == 0) {
return 0;
}
int r_idx = filedes[fd].rootIndex;
size_t cur_offset = filedes[fd].offset;
size_t old_size = root[r_idx].size;
size_t total_written = 0;
uint8_t bounce[BLOCK_SIZE];
while (total_written < count) {
size_t abs_offset = cur_offset + total_written;
size_t block_offset = abs_offset % BLOCK_SIZE;
size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
size_t bytes_remaining = count - total_written;
size_t chunk = (bytes_remaining < bytes_left_in_block)
? bytes_remaining
: bytes_left_in_block;
int disk_block = get_block_for_offset(fd, abs_offset, 1);
/* Disk full: if nothing written yet, return 0; otherwise partial write */
if (disk_block < 0) {
break;
}
/*
* Determine whether this block already existed in the file before
* this write began. If yes, read old contents. If no, start with zeros.
*
if (abs_offset < old_size) {
if (block_read(disk_block, bounce) < 0) {
return -1;
}
} else {
memset(bounce, 0, BLOCK_SIZE);
}
memcpy(bounce + block_offset, (uint8_t *)buf + total_written, chunk);
if (block_write(disk_block, bounce) < 0) {
return -1;
}
total_written += chunk;
}
/* Update FD offset by actual bytes written */
filedes[fd].offset += total_written;
/* Grow file size if needed */
if (filedes[fd].offset > root[r_idx].size) {
root[r_idx].size = filedes[fd].offset;
}
return (int)total_written;
}
1. Validate input
if (!is_valid_fd(fd) || buf == NULL) {
return -1;
}
This matches the general FD error-handling rule:
- invalid FD → -1
- null buffer → invalid
2. Handle zero-byte write
if (count == 0) {
return 0;
}
Writing zero bytes is valid and should do nothing.
3. Save starting state
int r_idx = filedes[fd].rootIndex;
size_t cur_offset = filedes[fd].offset;
size_t old_size = root[r_idx].size;
size_t total_written = 0;
uint8_t bounce[BLOCK_SIZE];
These variables matter a lot.
cur_offset- This is where the write starts.
old_size- This is the file size before the write begins.
- We save it because during the write, the file may grow, and we still want to know whether a block existed before or is newly allocated.
total_written- Tracks progress through the user’s buffer.
4. Loop block by block
while (total_written < count)
This is necessary because:
- the write may span multiple blocks
- each disk I/O operation only handles one block at a time
5. Compute where we are in the file
size_t abs_offset = cur_offset + total_written;
size_t block_offset = abs_offset % BLOCK_SIZE;
size_t bytes_left_in_block = BLOCK_SIZE - block_offset;
size_t bytes_remaining = count - total_written;
size_t chunk = (bytes_remaining < bytes_left_in_block)
? bytes_remaining
: bytes_left_in_block;
This is almost exactly the same structure as fs_read.
abs_offset- The current byte position in the file.
block_offset- Where inside the current block the write starts.
chunk- How many bytes to write in this iteration.
- It is the smaller of:
- bytes still needed overall
- space remaining in this block
6. Ask helper for the correct disk block
int disk_block = get_block_for_offset(fd, abs_offset, 1);
The important part is the 1.
That means:
“If this offset needs a block that doesn’t exist yet, allocate one.”
So unlike fs_read, fs_write can grow the FAT chain.
7. Handle disk-full / partial-write case
if (disk_block < 0) {
break;
}
If allocation fails, that usually means there are no free FAT blocks left.
Then we stop writing.
This is exactly what you want for partial writes:
- if some bytes were already written, return that partial amount
- if none were written, return 0
Since total_written starts at 0 and only increases after successful writes, breaking here naturally gives the right result.
Why this handles the edge case correctly
Suppose:
- user wants 5000 bytes
- only one free block remains
Then:
- first iteration writes 4096 bytes successfully
- next call to get_block_for_offset(..., 1) fails
- loop breaks
- function returns 4096
And if the disk is full before the very first byte:
- first allocation fails immediately
- total_written stays 0
- function returns 0
Also exactly correct.
8. Decide whether to read old block contents or start with zeros
if (abs_offset < old_size) {
if (block_read(disk_block, bounce) < 0) {
return -1;
}
} else {
memset(bounce, 0, BLOCK_SIZE);
}
- This is one of the trickiest parts.
Case A: writing into a block that already existed in the file
Then we must preserve the untouched bytes in that block.
So we:
block_read(disk_block, bounce);
- Then modify only the needed bytes.
Case B: writing into a newly allocated block beyond old EOF
There is no previous file data to preserve there.
So we can safely do:
memset(bounce, 0, BLOCK_SIZE);
- and then copy user data into the needed region.
- This prevents garbage bytes from being written into the unwritten portions of a newly allocated block.
Why abs_offset < old_size is the right test
Because old_size is the file size before this write started.
If the current absolute offset is still inside that old file:
- the block already existed
- read old contents
If the current absolute offset is past or at old EOF:
- this region did not exist before
- zero-fill bounce buffer first
That is a clean way to distinguish overwrite vs growth.
9. Copy user bytes into bounce buffer
memcpy(bounce + block_offset, (uint8_t *)buf + total_written, chunk);
This writes the new bytes into the correct part of the block image.
So if the user writes 20 bytes starting at offset 100:
- the first 100 bytes of the block stay as they were
- bytes 100–119 are replaced
- rest of the block stays as it was
That is exactly the read-modify-write behavior you need.
10. Write whole block back to disk
if (block_write(disk_block, bounce) < 0) {
return -1;
}
After the bounce buffer has been updated, we write the entire block back.
11. Advance progress
total_written += chunk;
Now the next iteration continues at the next byte to write.
12. Update FD offset
filedes[fd].offset += total_written;
This moves the file cursor forward by the number of bytes actually written.
That matches the assignment’s state-management rules.
13. Update file size if the file grew
if (filedes[fd].offset > root[r_idx].size) {
root[r_idx].size = filedes[fd].offset;
}
This is very important.
If the write extends past old EOF, the file size must grow.
If it was only an overwrite inside the file, the size stays the same.
So new file size is:
max(old size, new offset)
and that is exactly what this if does.
Example 1: overwrite inside existing file
Suppose:
- file size = 1000
- FD offset = 100
- user writes 20 bytes
Then:
- write stays inside existing file
- no new blocks are allocated
- size remains 1000
- offset becomes 120
Example 2: append to end of file
Suppose:
- file size = 1000
- FD offset = 1000
- user writes 50 bytes
Then:
- write starts at EOF
- maybe same last block, maybe a new block depending on position
- file grows to 1050
- offset becomes 1050
Example 3: multi-block write
Suppose:
- offset = 4000
- count = 300
Then:
Iteration 1
- block offset = 4000
- space left in block = 96
- write 96 bytes
Iteration 2
- next offset = 4096
- block offset = 0
- remaining = 204
- write 204 bytes into next block
Done.
Example 4: disk full partial write
Suppose:
- write needs 3 blocks
- only 2 are available
Then:
- first two blocks succeed
- third allocation fails
- loop breaks
- function returns bytes written so far
Helper functions
find_free_fat() implementation
Correct implementation:
static int find_free_fat() {
if (fat == NULL || sblk == NULL) {
return -1;
}
for (int i = 0; i < sblk->numDataBlocks; i++) {
if (fat[i] == get_fat_free()) {
return i; // Found a free block
}
}
return -1; // No free blocks → disk is full
}
1. Safety check
if (fat == NULL || sblk == NULL)
This ensures:
- filesystem is mounted
- FAT is loaded into memory
2. Loop through FAT entries
for (int i = 0; i < sblk->numDataBlocks; i++)
We only check valid data-region entries, not raw memory.
Important detail:
The FAT size in memory is:
sblk->numFAT * BLOCK_SIZE // in bytes
BUT the number of valid FAT entries is:
sblk->numDataBlocks
Why? Because:
- FAT entries correspond only to data blocks, not all disk blocks
So you MUST loop like this:
for (int i = 0; i < sblk->numDataBlocks; i++)
NOT over the full allocated memory.
Takeaway:
numFAT= how many blocks the FAT occupies on disknumDataBlocks= how many entries exist in the FAT (what you actually use)- The FAT is conceptually:
uint16_t fat[numDataBlocks]; - Each entry corresponds to one data block.
- So:
fat[0]→ first data blockfat[1]→ second data block- ...
fat[numDataBlocks - 1]
- The FAT is conceptually:
3. Check for free entry
if (fat[i] == get_fat_free())
Your helper:
static inline uint16_t get_fat_free() {
return 0x0000;
}
So this is equivalent to:
if (fat[i] == 0x0000)
4. Return index
return i;
This index is:
- a FAT index, NOT a disk block
To convert it to a disk block later:
disk_block = sblk->dataIndex + i;
5. If nothing found
return -1;
Meaning:
- Disk is full — no free blocks available
get_block_for_offset implementation
What get_block_for_offset does
Given:
- an open file descriptor
- a byte offset inside that file
it answers:
“Which absolute disk block contains that byte?”
To do that, it:
1. starts at the file’s indexFirstBlock
2. follows the FAT chain
3. advances one FAT link per 4096 bytes
4. converts the final FAT index into an absolute disk block: sblk->dataIndex + fat_index
Implementation
Correct implementation:
static int get_block_for_offset(int fd, size_t offset, int allocate) {
if (!is_valid_fd(fd) || sblk == NULL) {
return -1;
}
int r_idx = filedes[fd].rootIndex;
uint16_t fat_idx = root[r_idx].indexFirstBlock;
/* Which file block do we need?
offset 0..4095 -> block 0 of file
offset 4096..8191 -> block 1 of file
etc.
*/
size_t blocks_to_advance = offset / BLOCK_SIZE;
/* Empty file / no first block yet */
if (fat_idx == get_fat_eoc()) {
if (!allocate) {
return -1;
}
int new_idx = find_free_fat();
if (new_idx < 0) {
return -1;
}
root[r_idx].indexFirstBlock = (uint16_t)new_idx;
fat[new_idx] = get_fat_eoc();
fat_idx = (uint16_t)new_idx;
}
/* Walk forward through the FAT chain */
for (size_t i = 0; i < blocks_to_advance; i++) {
uint16_t next = fat[fat_idx];
if (next == get_fat_eoc()) {
if (!allocate) {
return -1;
}
int new_idx = find_free_fat();
if (new_idx < 0) {
return -1;
}
fat[fat_idx] = (uint16_t)new_idx;
fat[new_idx] = get_fat_eoc();
fat_idx = (uint16_t)new_idx;
} else {
fat_idx = next;
}
}
/* Convert FAT index -> absolute disk block index */
return sblk->dataIndex + fat_idx;
}
offset / BLOCK_SIZE
This line is the key:
size_t blocks_to_advance = offset / BLOCK_SIZE;
Examples:
- offset = 0 → 0 / 4096 = 0 → still in first file block
- offset = 3000 → 0 → still in first file block
- offset = 4096 → 1 → second file block
- offset = 9000 → 2 → third file block
So this tells us how many FAT links to follow.
Why start from indexFirstBlock
The root directory stores:
root[r_idx].indexFirstBlock
That is the FAT index of the file’s first data block, or 0xFFFF if the file is empty. 
So the FAT chain always begins there.
Why return sblk->dataIndex + fat_idx
This is one of the most important rules in the assignment:
- FAT indices are not absolute disk block numbers
- FAT index 0 means “the first block in the data region”
- to read the real disk block, you must offset by dataIndex
Why the allocate parameter exists
For fs_read, you will call:
get_block_for_offset(fd, offset, 0)
because reading should never allocate blocks.
For fs_write, you will later call it with allocate = 1, so if the file grows past its current chain, the helper can create new FAT blocks.
So even though allocate is not really used by fs_read, implementing it now will help you later with fs_write.
Navigating the linked list
fat[fat_idx] = (uint16_t)new_idx;
fat[new_idx] = get_fat_eoc();
fat_idx = (uint16_t)new_idx;
We are extending a linked list.
Remember:
- The FAT is a linked list structure
- Each entry points to the next block
- 0xFFFF = end of chain (EOC)
So a file looks like:
Block A → Block B → Block C → 0xFFFF
When do we execute this code?
This happens when:
next == get_fat_eoc()
Meaning:
- “We reached the end of the file’s current block chain”
- But we still need to go further (because offset requires more blocks), so:
- we must allocate a new block and link it
Line-by-line explanation
Extend the chain
fat[fat_idx] = (uint16_t)new_idx;
- Before:
fat[fat_idx] = 0xFFFF (end of chain) - After:
fat[fat_idx] = new_idx - So we change:
... → fat_idx → 0xFFFFinto... → fat_idx → new_idx - This extends the chain
Mark the new block as end of chain
fat[new_idx] = get_fat_eoc();
fat[new_idx] = get_fat_eoc();- Now the new block becomes the last block:
new_idx → 0xFFFF - So the chain is now:
... → fat_idx → new_idx → 0xFFFF - This keeps the chain valid
Move forward in the chain
fat_idx = (uint16_t)new_idx;
- This updates your traversal pointer so you continue walking the chain.
- Without this, you would still be pointing at the previous block.
Visual example
Before allocation
Suppose the FAT looks like:
fat[4] = 7
fat[7] = 0xFFFF
File chain:
4 → 7 → EOC
You need one more block
Let’s say:
new_idx = 10
After these three lines
fat[7] = 10;
fat[10] = 0xFFFF;
fat_idx = 10;
Now FAT:
fat[4] = 7
fat[7] = 10
fat[10] = 0xFFFF
Chain becomes:
4 → 7 → 10 → EOC
Evidence
/CSCE-313/Labs/Visual%20Aids/autograder_pass_evidence.png)