03 - Inter Process Communication - Pipes and FIFO

#Linux #processes

Outline

The pipe system call (see beej.us)
- By default pipes do not have a name
- But Linux also has named pipes (FIFOs)
The fork system call
- Create copies of processes
FIFOs, or named pipes
- Exist as a file type in the file system
- But NTFS does not support FIFOs as named pipes

Pipes

Real-time communication between processes

...

Notes:

Processes are units of isolation, so it is difficult for one process to look into another process
To communicate, processes do message passing (mechanism 1)
The other mechanism is shared memory
- It is possible for two different processes to map the same file into both of their address spaces
Message passing comes in two flavors: message queues and anonymous pipes
- With a message queue, a process writes full messages and the reader just pulls full messages.
- One message could be 5 bytes, another 10 bytes, and so on.
- In an anonymous pipe, these message boundaries are not preserved; the reader only sees a stream of bytes
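The loss of message boundaries can be observed directly. The sketch below is only an illustration: the function name `read_two_writes_at_once` is made up for this example, and it uses the `pipe()` call introduced in the next section. It writes a 5-byte and a 10-byte message and then issues a single `read()`; on Linux that one read typically returns all 15 bytes at once.

```c
#include <unistd.h>

/* Hypothetical demo: write a 5-byte and a 10-byte "message" into a fresh
 * pipe, then issue ONE read(). Returns how many bytes that single read
 * delivered, or -1 on error. */
ssize_t read_two_writes_at_once(void) {
    int fd[2];
    char buf[64];

    if (pipe(fd) == -1)
        return -1;

    /* Two separate writes: the pipe does not remember where one ends. */
    if (write(fd[1], "12345", 5) != 5 ||
        write(fd[1], "abcdefghij", 10) != 10) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);

    /* The reader asks for up to 64 bytes and gets both messages merged. */
    ssize_t n = read(fd[0], buf, sizeof buf);
    close(fd[0]);
    return n;
}
```

The reader has no way to tell from the byte stream alone that the writer made two calls.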

What is a pipe in Linux?

...
Notes:

A data structure in the kernel. When you create one you get 2 new file descriptors (so a process that started with the original 3 now has 5)
- One to write on one end
- One to read on the other end
How to create a pipe?
- Use the pipe() system call
- The kernel will fill in the 2 file descriptors for you
  - Basically small integers
  - "Every process has a file descriptor table"
    - You know these indices are valid; what they are pointing to is generally something you cannot manipulate

man 2 pipe

$ man 2 pipe

NAME
     pipe – create descriptor pair for interprocess communication

SYNOPSIS
     #include <unistd.h>
     int pipe(int fildes[2]);

DESCRIPTION
     The pipe() function creates a pipe (an object that allows unidirectional data flow) and allocates
     a pair of file descriptors.  The first descriptor connects to the read end of the pipe; the second
     connects to the write end.

     Data written to fildes[1] appears on (i.e., can be read from) fildes[0].  This allows the output
     of one program to be sent to another program: the source's standard output is set up to be the
     write end of the pipe; the sink's standard input is set up to be the read end of the pipe.  The
     pipe itself persists until all of its associated descriptors are closed.

     A pipe whose read or write end has been closed is considered widowed.  Writing on such a pipe
     causes the writing process to receive a SIGPIPE signal.  Widowing a pipe is the only way to
     deliver end-of-file to a reader: after the reader consumes any buffered data, reading a widowed
     pipe returns a zero count.

     The generation of the SIGPIPE signal can be suppressed using the F_SETNOSIGPIPE fcntl command.

Somehow you need to hand one end of the pipe to another process
A process calls pipe() and gets back 2 descriptors, which correspond to a unidirectional data channel that can be used for inter-process communication
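The "widowed pipe" behavior from the man page can be checked with a small experiment. This is a sketch under assumptions: the two function names are invented for illustration, and SIGPIPE is temporarily ignored with `signal()` so that the failed write reports `EPIPE` instead of terminating the process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo: returns the errno of a write() to a pipe whose read
 * end is closed (EPIPE is expected), or -1 if setup failed. */
int errno_of_write_to_widowed_pipe(void) {
    int fd[2];
    int result = -1;

    /* Ignore SIGPIPE so the write fails with EPIPE instead of killing us. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);
    if (old_handler == SIG_ERR)
        return -1;

    if (pipe(fd) == 0) {
        close(fd[0]);                      /* widow the pipe: no reader left */
        if (write(fd[1], "x", 1) == -1)
            result = errno;
        close(fd[1]);
    }
    signal(SIGPIPE, old_handler);          /* restore the previous handler */
    return result;
}

/* Hypothetical demo: returns what read() reports on an empty pipe whose
 * write end is closed (0, meaning end-of-file, is expected). */
ssize_t read_from_widowed_pipe(void) {
    int fd[2];
    char c;

    if (pipe(fd) == -1)
        return -1;
    close(fd[1]);                          /* no writer left */
    ssize_t n = read(fd[0], &c, 1);
    close(fd[0]);
    return n;
}
```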

How to create a pipe?

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char* argv[]){
    int fd[2];
    char buf[30];
    
    //create pipe
    if(pipe(fd) == -1){
        perror("pipe");
        exit(EXIT_FAILURE);
    }
    //write to pipe
    printf("writing to file descriptor #%d\n", fd[1]);
    write(fd[1],"CSCE 313",9);
    
    //read from pipe
    printf("reading from file descriptor #%d\n", fd[0]);
    read(fd[0], buf,9);
    printf("read \"%s\"\n",buf);
    return 0;
}

pipe() takes an array of two ints as an argument.
What are the return values of pipe()?
- -1: pipe failure
- 0 : pipe success
pipe() fills in the array with two file descriptors.

Notes:

pipe(fd) is a system call
- If it returns -1, then you do not create a pipe
- Otherwise you have created a pipe
- It is unidirectional and doesn't maintain message boundaries: once data goes through the pipe, the pipe won't remember how it was split into messages
- If you are building a protocol based on pipes, your reader has to build some letter ... in order to be able to work with pipes
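One common way for a reader to recover message boundaries is a length prefix. The sketch below is an assumption about how such a protocol could look, not something prescribed by the lecture; `write_all`, `read_all`, `send_msg`, and `recv_msg` are hypothetical helper names. Each message is sent as a 4-byte length followed by the payload.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes, retrying on short writes. */
static int write_all(int fd, const void *data, size_t len) {
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read exactly len bytes, retrying on short reads. */
static int read_all(int fd, void *data, size_t len) {
    char *p = data;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;          /* error, or EOF before the message ended */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: 4-byte length header, then the payload bytes. */
int send_msg(int fd, const char *msg) {
    uint32_t len = (uint32_t)strlen(msg);
    if (write_all(fd, &len, sizeof len) == -1)
        return -1;
    return write_all(fd, msg, len);
}

/* Receive one message into out (NUL-terminated). Returns its length,
 * or -1 on error or if the message does not fit in cap bytes. */
int recv_msg(int fd, char *out, size_t cap) {
    uint32_t len;
    if (read_all(fd, &len, sizeof len) == -1)
        return -1;
    if (len >= cap)
        return -1;
    if (read_all(fd, out, len) == -1)
        return -1;
    out[len] = '\0';
    return (int)len;
}
```

With this framing, two calls to `send_msg()` on the write end can be read back as two separate messages by two calls to `recv_msg()` on the read end. The header is written in the host's byte order, which is fine between processes on the same machine.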

int fd[2]

Integer array got populated with the file descriptors that you can use

pipe(fd)

The OS picks two free entries from your file descriptor table, let's say 7 and 11
It tells you which descriptor is the read end (7) and which is the write end (11)
Now your file descriptor table is populated
Now you can say write(11, "...", #bytes)

write(fd[1], "CSCE 313", 9)

Write 9 bytes from the string "CSCE 313" into this descriptor
Note that a write to a pipe is atomic as long as it is no larger than PIPE_BUF bytes.
Note that "CSCE 313" is actually 9 bytes: 8 visible characters plus the null terminator \0 that ends every C string literal, which occupies a byte.
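The byte count can be double-checked by the compiler. In this small sketch (the function names are made up for illustration), `strlen` counts only the visible characters while `sizeof` on the literal also counts the terminating `\0`.

```c
#include <string.h>

/* Number of characters before the terminator: 8 for "CSCE 313". */
size_t visible_chars(void) {
    return strlen("CSCE 313");
}

/* Size of the literal including the trailing '\0': 9 for "CSCE 313". */
size_t bytes_with_terminator(void) {
    return sizeof("CSCE 313");
}
```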

read(fd[0], buf, 9)

Read from this file descriptor into this buffer (at most 9 bytes)
This is a destructive read: once you read these 9 bytes, they are gone from the data channel
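A short experiment makes the destructive read visible. This is a minimal sketch with a hypothetical function name, `read_in_two_parts`: it writes the 9 bytes once, then reads 4 bytes and then 5 bytes. The second read continues where the first stopped, because the first 4 bytes are no longer in the pipe.

```c
#include <unistd.h>

/* Hypothetical demo: writes "CSCE 313" (9 bytes with '\0') into a pipe,
 * reads the first 4 bytes into first and the remaining 5 into rest.
 * Returns 0 on success, -1 on error. */
int read_in_two_parts(char first[5], char rest[5]) {
    int fd[2];

    if (pipe(fd) == -1)
        return -1;

    ssize_t w = write(fd[1], "CSCE 313", 9);
    close(fd[1]);

    /* First read consumes "CSCE"; those bytes leave the pipe for good. */
    ssize_t a = (w == 9) ? read(fd[0], first, 4) : -1;
    /* Second read gets what is left: " 313" plus the '\0'. */
    ssize_t b = (a == 4) ? read(fd[0], rest, 5) : -1;
    close(fd[0]);

    if (a != 4 || b != 5)
        return -1;
    first[4] = '\0';    /* rest already ends with the '\0' that was written */
    return 0;
}
```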

How to use a pipe?

Pasted image 20260202173312.png|200

When any bytes are written to fd[1], the OS makes them available for reading from fd[0].

Linux pipe command example 1 (cmd1 | cmd2)

...

Notes:

$ cat /usr/share/dict/words

It will just print the contents of the file to the terminal

$ cat /usr/share/dict/words | less

The output of cat is now passed as the input to the command on the right (less), which lets you scroll through the contents of the file.
The shell associates the standard output of the cat program with the write end of the pipe
The shell connects the read end of the pipe to the standard input of the less program.

Linux pipe command example 2 (cmd1 | cmd2)

...

Notes:

$ cat /usr/share/dict/words | grep 'zy'

grep matches a regular expression against each line of its input
If a line matches, grep sends it to standard output (prints it to the terminal)

$ cat /usr/share/dict/words | grep 'zy.*s'

Matches any line that contains zy, followed by anything, followed by s

$ cat /usr/share/dict/words | grep '^zy.*s$'
- For words that start with 'zy' and end with 's'

### Linux command example 3 (cmd1 | cmd2)
...

### Linux command example 4 (cmd1 | cmd2)
...

**Notes**:
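```bash
cat games.txt | sort | uniq | head -3 > top3.txt
```

uniq discards duplicated lines (it only compares adjacent lines, which is why sort runs first)
head -3 keeps the first 3 lines of its input, which the shell then writes to top3.txt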

Unix philosophy

...

Notes:

There was a time when pipes didn't exist in Unix
The philosophy before was that as clients wanted more and more functionality, developers just kept adding more and more features, and the code became unmanageable
The philosophy of Unix with pipes is that you write simple programs that each do a single thing well, then use composition (assisted by pipes) to let them communicate with other processes and make bigger systems

Linux command example 6 (cmd1 | cmd2)

Notes:

This is exactly how your shell is creating the pipe command

Forks

UNIX `fork()`

...

Notes:

The mental model is that when a process invokes fork(), what it gets back is the process ID of the child process
In Unix we can make a complete hierarchy of parent-child relationships
When a process calls fork(), another identical process is created. You then return from fork(), and your copy also returns from fork().
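The two returns from `fork()` can be told apart by the return value. The following is a minimal sketch (the function name `exit_code_of_child` is hypothetical): the child sees 0 and exits with a chosen code, while the parent sees the child's PID and uses it to wait for that exit code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child that exits with `code`, and return the
 * exit code the parent observes (or -1 on error). */
int exit_code_of_child(int code) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;              /* fork failed: no child was created */

    if (pid == 0) {
        /* Child: fork() returned 0. Leave immediately with the code. */
        _exit(code);
    }

    /* Parent: fork() returned the child's PID, so we can wait for it. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses `_exit()` rather than `return` so that it does not continue running the caller's code as a second copy.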

UNIX `fork()` example 1

...

Notes:

When this program starts to execute, it first calls fork()
- Now two of you are executing!
Both processes are executing an identical program, so both of them will print whatever the program prints

#include <stdio.h>
#include <unistd.h>

int main() {
	fork();  // from here on, two processes run the same code
	
	printf("Welcome to CSCE-313:\n");
	return 0;
}

They will both terminate at return 0.
Every call to fork() creates one new process, so two processes return from it: the original and its copy.

UNIX `fork()` example 2

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
    fork();
    fork();
    printf("Welcome to CSCE 313!\n");
    return 0;
}

Once fork() returns there are actually two of you!
- Both of them will return
- Two separate processes, but they have identical behavior

davidkebo@CSCE-C02F159MMD6M forks % ./a.out
Welcome to CSCE 313!
Welcome to CSCE 313!
Welcome to CSCE 313!
Welcome to CSCE 313!

The number of times the message prints is equal to the total number of processes: 2^n (n is the number of fork system calls)
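The 2^n count can be verified by letting every process report in through a pipe. This is a sketch under assumptions: the function name `count_processes_after_forks` and the cap of 10 forks are choices made for this illustration. Every process writes one byte, and the original process counts the bytes until EOF.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: call fork() n times in a row and return how many
 * processes exist afterwards (expected 2^n), or -1 on error. */
int count_processes_after_forks(int n) {
    int fd[2];
    if (n < 0 || n > 10 || pipe(fd) == -1)   /* cap n to keep this small */
        return -1;

    pid_t root = getpid();
    for (int i = 0; i < n; i++)
        fork();                 /* every existing process forks again */

    /* Each process, original or copy, reports in with one byte. */
    (void)write(fd[1], "x", 1);
    close(fd[1]);

    if (getpid() != root) {
        close(fd[0]);
        _exit(0);               /* copies stop here */
    }

    /* The original counts bytes; EOF arrives once every copy has closed
     * its write end. */
    int count = 0;
    char c;
    while (read(fd[0], &c, 1) == 1)
        count++;
    close(fd[0]);

    while (wait(NULL) > 0) {
        /* reap the direct children */
    }
    return count;
}
```

Calling it with n = 2 matches the four "Welcome" lines in the output above.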

Pasted image 20260130135316.png

At some point the process calls fork(), the OS creates an identical copy of it, so both of them will return (at some time).

An aside-`execl`

#include <unistd.h>
int main() {
	execl("/bin/echo", "echo", "Hello World", NULL);
	return 1;
}

It takes over your code segment, data, and stack, and recreates the address space with the contents of the new program
It looks into an ELF file, which has the complete description (i.e. data, code, etc.)
In this example, the program replaces itself with the program echo and is therefore able to print "Hello World" to the terminal.

exec is a system call that replaces a process’s user address space with data read from an executable file.

Continuity of the process but replacement of its contents.
The process keeps the same PID, open file descriptors, environment, etc.,
But the program text and data are swapped out for a new image.

Notes:

Your virtual address space is overwritten and you basically become the new program
The first argument passed to the new program (argv[0]) is the name of the program itself
A program can do different things depending on this first argument
- The reason that you need the NULL at the end is that execl takes a variable-length argument list (varargs), and NULL marks where it ends
When you exec, your file descriptors stay the same: open descriptors are kept across exec unless they are marked close-on-exec
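Because exec replaces the caller, programs usually fork first and exec in the child. The sketch below assumes `/bin/sh` exists (as it does on typical Unix systems); the function name `run_shell_command` is hypothetical. Note the argv[0] entry `"sh"` and the `NULL` that ends the argument list.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: run `command` through /bin/sh in a child process
 * and return its exit code (or -1 on error). */
int run_shell_command(const char *command) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the shell.
         * Arguments: path, argv[0], argv[1], argv[2], then NULL. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);             /* only reached if execl() failed */
    }

    /* Parent: same PID as before, still running this code. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_shell_command("exit 3")` returns 3, the exit code of the program that replaced the child.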

Another example:

less /usr/share/dict/words

We can tell the shell to replace itself with the less program showing that file

exec less /usr/share/dict/words

Now when you close the less program your shell will also end!
- That shell is gone! It was replaced by the exec call
So exec is a way to replace yourself

Looking ahead-exec family of functions

See the online man page (https://man7.org/linux/man-pages/man3/exec.3.html). Only the first two are of interest to us at this time.

int execl ( const char *path, const char *arg, ... );
int execlp( const char *file, const char *arg, ... );
int execle( const char *path, const char *arg, ..., char *const envp[] );

int execv ( const char *path, char *const argv[] );
int execvp( const char *file, char *const argv[] );
int execve( const char *path, char *const argv[], char *const envp[] );

Underneath there is only one system call (execve on Linux); the other functions just arrange their arguments and invoke that one system call
execl:
- The l stands for the fact that the arguments that you give to the new program are in list form
execlp:
- You do not have to give the full path of the executable you want to replace yourself with; you just give a name such as ls, and the function call finds the correct file
- It searches the directories in PATH the way the shell does: is there an ls in this directory? If not, is there one in this other directory?
- The p signifies this file lookup.
- Gives you some independence from hard-coding the locations of executables
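The PATH lookup can be tried out with `execlp`. This is a sketch assuming an `echo` program is available somewhere on PATH; the function name `capture_echo` is hypothetical. The child redirects its standard output into a pipe before exec, which also shows that open descriptors survive the exec.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: run `echo text` via execlp() and capture what it
 * prints into out (NUL-terminated). Returns the byte count or -1. */
ssize_t capture_echo(const char *text, char *out, size_t cap) {
    int fd[2];
    if (cap == 0 || pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);   /* stdout now feeds the pipe */
        close(fd[1]);
        /* No path given: execlp() searches PATH for "echo". */
        execlp("echo", "echo", text, (char *)NULL);
        _exit(127);                   /* only reached if exec failed */
    }

    close(fd[1]);                     /* parent only reads */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fd[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```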

What is a file descriptor?

Imagine you're at a library.

You request a book, but the librarian gives you a numbered slip. It's your handle to the book.
You use the slip to get access to the book—you read, and eventually return the slip.
The slip is like a file descriptor: a small integer the operating system gives you to refer to an open resource.

Key point: the descriptor is not the file itself, just a handle.

Notes:

When you create or open a file, the kernel actually creates some data structure for it
File descriptors are how a user process refers to those kernel objects
- For a pipe, two integers in your file descriptor table identify the read and write ends
The only way to access a file is through the kernel, because the file lives on disk, which processes cannot access directly, so the kernel needs to hand it to you

Analogy (library):

You ask the librarian for a book; instead, what you get is a file descriptor, and it is through this file descriptor that you read and write the book. You never actually get the book directly.

What is a file descriptor, actually?

In every process there is a file descriptor table
It normally starts with 0, 1, and 2; these are the descriptors that you are born with
If you create a socket (for network communications), the kernel exposes it to you through a file descriptor as well.
The only handles that you have to the pipe are these file descriptors

Pasted image 20260202140246.png

Unix maintains a giant system-wide table of open files, and every process has its own small file descriptor table. All of the entries in the small table point into the giant table, which is how different processes can refer to the same file.
When a parent forks, the child gets a copy of the parent's descriptor table, but the entries are just pointers: both tables point to the same entries in the system-wide table.
Think of a file descriptor table as being an array pointing to things in a kernel data structure.
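One observable consequence of parent and child pointing at the same kernel object is a shared file offset. The sketch below is an illustration under assumptions: it uses a temporary file under `/tmp` and a hypothetical function name. The child writes 5 bytes, and the parent then finds its own offset advanced to 5 even though the parent never wrote anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's file offset after the child
 * wrote 5 bytes through the inherited descriptor, or -1 on error. */
long parent_offset_after_child_write(void) {
    char path[] = "/tmp/fd_demo_XXXXXX";
    int fd = mkstemp(path);     /* creates and opens a temporary file */
    if (fd == -1)
        return -1;
    unlink(path);               /* name removed; fd keeps the file alive */

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: same descriptor number, same underlying open file. */
        ssize_t n = write(fd, "hello", 5);
        _exit(n == 5 ? 0 : 1);
    }

    waitpid(pid, NULL, 0);
    /* Parent: ask where "our" offset is without moving it. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```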

`fork()` and `pipe()`

It is not useful for one process to use a pipe to talk to itself. Typically, a process creates a pipe just before it forks more child processes.

The pipe is inherited by the children, and then used for communicating either between the parent & child processes, or between two sibling processes.

In the following program, the parent writes a message to the pipe.

The child reads from the pipe 1 byte at a time until the pipe is empty.

Notes:

The file descriptor tables of a parent and a child process are exactly the same right after fork()
When you fork, the child also gets descriptors for the same pipe
- This kind of attribute is copied across a fork()
- Now A and B both hold the same pipe
A file descriptor is an indirect way of referring to a pipe
Both processes could use both ends, but by convention each one keeps only the end it needs
- The parent closes one end and the child closes the other end of the pipe, leaving the other side untouched.
- Now there is a single writer and a single reader on the pipe. This is important: the kernel allows several writers and readers, but the reader only sees EOF once every write descriptor is closed

Pasted image 20260130142119.png|500

`fork()` and `pipe()` example 1

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int main(int argc, char* argv[]) {
    int pipefds[2];
    pid_t pid;
    char buf[30];
    
    //create pipe
    if(pipe(pipefds)==-1){
        perror("pipe");
        exit(EXIT_FAILURE);
    }
    
    memset(buf,0,30);
    pid = fork();
    
    if(pid>0){
        printf("PARENT: writing to the pipe\n");
        //parent close the read end
        close(pipefds[0]);
        //parent write in the pipe write end
        write(pipefds[1],"CSCI3150",9);
        //after finishing writing, parent close the write end
        close(pipefds[1]);
        //parent wait for child
        wait(NULL);
    } else {
        //child did not close the write end
        //child read from the pipe read end until the pipe is empty
        while(read(pipefds[0],buf,1)==1){
            printf("CHILD read from pipe --%s\n",buf);
        }
        close(pipefds[0]);
        printf("CHILD: EXITING!");
        exit(EXIT_SUCCESS);
    }
    
    return 0;
}

When fork() returns, if it is successful, then both of you will be able to return.
In the parent process fork() will return the process id of the child, but in the child it will return 0.
Using this information now, the parent and the child can do different things
Parent:
- The parent closes the read end of the pipe
- Then it writes into the write end of the pipe
- Then closes the pipe (this is not enough)
Child (else clause):
- Keeps reading from the read end of the pipe
  - Reads 1 byte and puts it in buf
- Unfortunately it never exits, even after all the data in the pipe has been read

davidkebo@CSCE-C02F159MMD6M pipes % ./a.out
PARENT: writing to the pipe
CHILD read from pipe --C
CHILD read from pipe --S
CHILD read from pipe --C
CHILD read from pipe --I
CHILD read from pipe --3
CHILD read from pipe --1
CHILD read from pipe --5
CHILD read from pipe --0
CHILD read from pipe --

Problem:
The child process doesn't exit because pipefds[1] is open. The system assumes that a write could occur while the write end is still open, and the system will not report EOF.

The child basically received the same two end points of the pipe
The child does not see the end of file because the child itself is also holding the write end of the pipe, so the system assumes that a write could still occur.
So the reader also needs to close the write end of the pipe

The child blocks and waits to read from pipefds[0], and the operating system doesn't know that no process will be writing to pipefds[1].

`fork()` and `pipe()` example 2

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int main (int argc, char *argv[]) {
    int pipefds[2];
    pid_t pid;
    char buf[30];
    
    if (pipe (pipefds) == -1) {
        perror ("pipe");
        exit (EXIT_FAILURE);
    }
    
    memset (buf, 0, 30);
    pid = fork ();
    
    if (pid > 0) {
        printf ("PARENT: writing to the pipe\n");
        close (pipefds[0]);
        // 9 bytes: 8 characters plus the null terminator
        write (pipefds[1], "CSCI3150", 9);
        close (pipefds[1]);
        wait (NULL);
    } else {
        close(pipefds[1]);    // IMPORTANT!! (close write end of pipe)
        while (read (pipefds[0], buf, 1) == 1) {
            printf ("CHILD read from pipe --%s\n", buf);
        }
        close (pipefds[0]);
        printf ("CHILD: EXITING!");
        exit (EXIT_SUCCESS);
    }
    
    return 0;
}

davidkebo@CSCE-C02F159MMD6M week3 % ./a.out
PARENT: writing to the pipe
CHILD read from pipe --C
CHILD read from pipe --S
CHILD read from pipe --C
CHILD read from pipe --I
CHILD read from pipe --3
CHILD read from pipe --1
CHILD read from pipe --5
CHILD read from pipe --0
CHILD read from pipe --
CHILD: EXITING!

Now the read end reaches EOF.

Note:

When the child process finishes reading all the bytes of data, we close the related file descriptors.

Pasted image 20260202142351.png|500

Pipes for bidirectional communication

Pasted image 20260202142628.png

Problem:

Pipes are unidirectional and should not be used to write back.
Using a single pipe for bidirectional communication will fail.

Pasted image 20260202142705.png

Solution:

The parent needs to create the two pipes. If a pipe is created by the child, it will not be inherited by the parent, so the child cannot use its descriptors to communicate back to the parent
We can use two pipes for bidirectional communication between parent and child.
1. The parent process uses pcfd to send data to the child process.
2. The child process uses cpfd to send data to the parent process.
In both cases, the calls to write() use index 1, and the read() calls use index 0 on the file descriptor array.
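The two-pipe arrangement can be sketched as follows. The names `pcfd` and `cpfd` come from the notes above; the function name `ask_child_to_uppercase` and the uppercase "service" are made up for illustration. The sketch assumes the message is short enough to fit in the pipe buffer, since the parent writes everything before it starts reading.

```c
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: send msg to a child over pcfd, get it back in
 * uppercase over cpfd. Returns the reply length or -1 on error. */
int ask_child_to_uppercase(const char *msg, char *reply, size_t cap) {
    int pcfd[2], cpfd[2];       /* parent->child and child->parent */
    size_t len = strlen(msg);

    if (len >= cap || pipe(pcfd) == -1)
        return -1;
    if (pipe(cpfd) == -1) {
        close(pcfd[0]); close(pcfd[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(pcfd[0]); close(pcfd[1]);
        close(cpfd[0]); close(cpfd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: reads index 0 of pcfd, writes index 1 of cpfd. */
        close(pcfd[1]);
        close(cpfd[0]);
        char c;
        while (read(pcfd[0], &c, 1) == 1) {
            c = (char)toupper((unsigned char)c);
            if (write(cpfd[1], &c, 1) != 1)
                _exit(1);
        }
        close(pcfd[0]);
        close(cpfd[1]);
        _exit(0);
    }

    /* Parent: writes index 1 of pcfd, reads index 0 of cpfd. */
    close(pcfd[0]);
    close(cpfd[1]);
    ssize_t w = write(pcfd[1], msg, len);
    close(pcfd[1]);             /* child sees EOF and finishes */

    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(cpfd[0], reply + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    reply[total] = '\0';
    close(cpfd[0]);
    waitpid(pid, NULL, 0);
    return (w == (ssize_t)len) ? (int)total : -1;
}
```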

Copy of a file descriptor: `dup2()`

The dup2() system call creates a copy of a file descriptor.

If the copy is successfully created, then the original and copy file descriptors may be used interchangeably.
They both refer to the same open file description and thus share file offset and file status flags.

int dup2(int oldfd, int newfd)

oldfd: old file descriptor
newfd: the descriptor number that dup2() turns into the copy. If newfd is already open, it is closed first.

Pasted image 20260130142544.png|600

The white table is the kernel-wide descriptor table

Bash uses dup2() with pipes to link commands together.

Example: ls | sort

The ls process closes its read end of the pipe and links the write end to its standard output.
The sort process closes its write end of the pipe and links the read end to its standard input.
After the dup2() calls, ls closes its original write descriptor and sort closes its original read descriptor, since the copies on descriptors 1 and 0 are the ones in use.

Anything that ls writes to its standard output, sort would read from its standard input.

Notes:

dup2() does not copy data; it makes one descriptor refer to whatever another descriptor refers to
Entries in a file descriptor table point to objects in the kernel
- Something that the kernel is maintaining for you
When you duplicate 4 onto 1 with dup2(4, 1), you are saying: whatever 4 refers to, please make 1 refer to it too.
- Therefore writing to 1 and writing to 4 have the same effect, because both point to the same underlying object
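The effect of `dup2()` can be seen by redirecting standard output into a pipe for a moment. This is a minimal sketch with a hypothetical function name; it assumes `len` is small enough to fit in the pipe buffer. After `dup2(fd[1], STDOUT_FILENO)`, a plain write to descriptor 1 lands in the pipe.

```c
#include <unistd.h>

/* Hypothetical demo: temporarily point stdout at a pipe, write msg to
 * descriptor 1, restore stdout, and read the bytes back from the pipe.
 * Returns the number of bytes read back, or -1 on error. */
ssize_t write_via_redirected_stdout(const char *msg, size_t len,
                                    char *out, size_t cap) {
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    int saved = dup(STDOUT_FILENO);          /* remember the real stdout */
    if (saved == -1 || dup2(fd[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);     /* descriptor 1 still refers to the write end */

    ssize_t w = write(STDOUT_FILENO, msg, len);

    /* Restoring stdout also closes the last write descriptor. */
    dup2(saved, STDOUT_FILENO);
    close(saved);

    ssize_t n = (w == (ssize_t)len) ? read(fd[0], out, cap) : -1;
    close(fd[0]);
    return n;
}
```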

`dup2` in action

Pasted image 20260130143031.png|350

Imagine the left column wants to become ls and the right column wants to become sort
Left column calls dup2(4,1)
Right column calls dup2(3,0)
After dup2 there are two descriptors for the write end, so you need to close one of them, always.
- You can get away with not closing an extra read descriptor, but you ALWAYS NEED to close the extra write descriptor.
You can figure out whether you are a child or not!
- Yes, children can do different things!
Note: closing the last write descriptor is what delivers EOF to the reader. If an extra write descriptor stays open, the reader never sees EOF, but you want sort to see it so that it knows there is no more data to read.
- You want the readers to notice that the stream has ended
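Putting the pieces together, a shell-like `ls | sort` could be wired up roughly as below. This is a sketch, not how any particular shell is implemented; the function name `run_ls_pipe_sort` is hypothetical, and it assumes `ls` and `sort` can be found on PATH. Note that the parent also closes both pipe descriptors, otherwise `sort` would never see EOF.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: run the equivalent of `ls | sort`. Returns 0 if
 * both programs exited successfully, -1 otherwise. */
int run_ls_pipe_sort(void) {
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t ls_pid = fork();
    if (ls_pid == -1) {
        close(fd[0]); close(fd[1]);
        return -1;
    }
    if (ls_pid == 0) {
        close(fd[0]);                    /* ls does not read the pipe */
        dup2(fd[1], STDOUT_FILENO);      /* stdout -> write end */
        close(fd[1]);                    /* drop the extra write descriptor */
        execlp("ls", "ls", (char *)NULL);
        _exit(127);
    }

    pid_t sort_pid = fork();
    if (sort_pid == -1) {
        close(fd[0]); close(fd[1]);
        waitpid(ls_pid, NULL, 0);
        return -1;
    }
    if (sort_pid == 0) {
        close(fd[1]);                    /* sort does not write the pipe */
        dup2(fd[0], STDIN_FILENO);       /* stdin <- read end */
        close(fd[0]);
        execlp("sort", "sort", (char *)NULL);
        _exit(127);
    }

    /* The parent holds both ends too and must close them. */
    close(fd[0]);
    close(fd[1]);

    int s1, s2;
    if (waitpid(ls_pid, &s1, 0) != ls_pid ||
        waitpid(sort_pid, &s2, 0) != sort_pid)
        return -1;
    return (WIFEXITED(s1) && WEXITSTATUS(s1) == 0 &&
            WIFEXITED(s2) && WEXITSTATUS(s2) == 0) ? 0 : -1;
}
```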

Limitations of pipes

Unix pipes are "unidirectional" channels from one process to another, and cannot be turned into bidirectional or multicast channels.
- This limitation makes pipes difficult to use for complex inter-process communication.
- Bidirectional communication can be simulated using a pair of pipes, but
  - Inconsistent buffering between both pipes can lead to complications.
Pipes can only be shared between processes that have a common ancestor that creates the pipe.
There is no way for a process to "reference" a pipe that it (or an ancestor) did not directly create via pipe.
This restriction prevents many useful forms of inter-process communication from being layered on top of pipes.
- Pipes cannot be used for clients to connect to long-running services.
- Processes cannot open additional pipes to other processes that they already have a pipe to.

Notes:

Pipes are unidirectional (only one way)
Pipes have destructive reads: once a process reads data, that data is gone.
There are some request/control protocols like FTP where you have a control channel and a data channel
- In this situation it becomes tough: even though you have a defined hierarchy, by the time the data channel has to be created it is already too late, because the fork has already happened. This is inconvenient.

Pasted image 20260202143145.png|200

Is there a way to create a pipe between C and E?
- Find a common ancestor
- When A forks C or B, they inherit the ends of the pipe and A can close them
- When B does a fork, those two ends of the pipe are inherited by E so B can close them
- ...
- A needs to know that one of my children would eventually want to talk to one of my grandchildren
  - There is a whole lot of planning that you need to address

Using Files for Inter Process Communication

Two programs can communicate with one another by using an intermediate medium such as a file.

davidkebo@linux:~$ ls
Desktop Documents Downloads Music Pictures Public Templates Videos code src-cloud src-cloud.zip
davidkebo@linux:~$ ls> content.txt
davidkebo@linux:~$ more content.txt
Desktop
Documents
Downloads
Music
Pictures
Public
Templates
Videos
code
content.txt
src-cloud
src-cloud.zip

The program ls sends its output to the file "content.txt".
The program more takes "content.txt" as an input.
ls indirectly communicates with more via the file "content.txt".

FIFO

FIFO (named PIPE)

FIFO is a first-in, first-out message-passing IPC in which bytes are sent and received as unstructured streams. It is also known as a named pipe.

The named pipe is a POSIX pipe in contrast to anonymous pipes created with the pipe() system call.

FIFOs work by associating a filename with the pipe. Once created, any process (with correct access permissions) can access the FIFO by calling open() on the associated filename.

Once the processes have opened the file, they can use the standard read() and write() functions to communicate.
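A FIFO round trip could look like the sketch below. The path under `/tmp`, the file name `channel`, and the function name `fifo_round_trip` are assumptions for this illustration. A fork is used only for convenience here: the two sides find each other through the file name, so they could just as well be unrelated processes.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: a child writes msg into a FIFO by name and the
 * parent reads it back by name. Returns the byte count or -1. */
ssize_t fifo_round_trip(const char *msg, char *out, size_t cap) {
    char dir[] = "/tmp/fifo_demo_XXXXXX";
    char path[64];

    if (cap == 0 || mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/channel", dir);
    if (mkfifo(path, 0600) == -1) {      /* create the named pipe */
        rmdir(dir);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        rmdir(dir);
        return -1;
    }
    if (pid == 0) {
        /* Writer: open() blocks until a reader opens the FIFO. */
        int wfd = open(path, O_WRONLY);
        if (wfd == -1)
            _exit(1);
        ssize_t n = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(n == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* Reader: open() blocks until the writer opens the FIFO. */
    size_t total = 0;
    int rfd = open(path, O_RDONLY);
    if (rfd == -1) {
        kill(pid, SIGKILL);              /* do not leave the writer blocked */
    } else {
        ssize_t n;
        while (total < cap - 1 &&
               (n = read(rfd, out + total, cap - 1 - total)) > 0)
            total += (size_t)n;
        close(rfd);
    }
    out[total] = '\0';

    waitpid(pid, NULL, 0);
    unlink(path);                        /* remove the FIFO's name */
    rmdir(dir);
    return (rfd == -1) ? -1 : (ssize_t)total;
}
```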

- For words that start with 'zy' and end with 's'

### Linux command example 3 (cmd1 | cmd2)
...

### Linux command example 4 (cmd1 | cmd2)
...

**Notes**:
{{CODE_BLOCK_12}}
- Discard duplicated lines
- `head -3` returns the first 3 lines of the output file

### Unix philosophy
...

**Notes**:
- There was a time where pipes didn't exist in linux
- Philosophy before wast that as clients want more and more functionality, they just keep adding more and more features, and the code becomes unmanageable
- The philosphy of Unix with pipes is that you can write simple programs to do a single thing well, then use composition (assisted with pipes) to communicated with other processes to make bigger systems

### Linux command example 6 (cmd1 | cmd2)

**Notes**:
- This is exactly how your shell is creating the pipe command


## Forks
### UNIX `fork()`
...

**Notes**:
- The mental model is that when a process involves `fork()`, what it gets back is the process ID of the child process
- In Unix we can make a complete hierarchy of parent-child relationships
- When a process calls fork, another identical process is created, now you will return from fork and your copy will also return from fork.

### UNIX `fork()` example 1
...

**Notes**:
- When this program starts to execute, it first makes a fork
	- Now two of you are executed!
- Now both of them are executing an identical program so both of them will print whatever the program prints

{{CODE_BLOCK_13}}
- They will both terminate at `return 0`.
- Every time a thread calls `fork()` it creates two copies.

### UNIX `fork()` example 2

{{CODE_BLOCK_14}}
- Once `fork()` returns there are actually two of you!
	- Both of them will return
	- Two separate process but they have identical behavior

{{CODE_BLOCK_15}}
- The number of times the message prints is equal to number of process created. 2𝑛 (𝑛 is the number of fork system calls)

![Pasted image 20260130135316.png](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130135316.png)
- At some point the process calls `fork()`, the OS creates an identical copy of it, so both of them will return (at some time).

### An aside-`execl`

{{CODE_BLOCK_16}}
- Will take over your code segment, data stack, and recreate an address space with the contents of this program
- It will be able to look into an elf file which will have the complete description (i.e. data, code, etc.)
- In this example, this program gets replaced itself with the program `echo` and therefore is able to print "Hello World" to the terminal.

**exec** is a system call that replaces a process’s user address space with data read from
an executable file.
- Continuity of the process but replacement of its contents.
- The process keeps the same PID, open file descriptors, environment, etc.,
- But the program text and data are swapped out for a new image.

**Notes**:
- Your virtual address space will be overwritten and you basically become into the new address space
- The first argument is the name of the program itself
- This program can do different things depending on the first argument provided
	- The reason that you need the NULL at the end is because this is ragargs
- When you `exec`, your file descriptor is the same?

Another example:
{{CODE_BLOCK_17}}
- We can tell shell to replace itself with the less program in that file

{{CODE_BLOCK_18}}
- Now when you close the less program your shell will also end!
	- That shell is gone! it was overriden by the `execl` instruction
- So `execl` is a way to replace yourself

### Looking ahead-exec family of functions

See [online man page](https://man7.org/linux/man-pages/man3/exec.3.html). Only the first two are of interest to us at this time.

{{CODE_BLOCK_19}}
- There is only one system call, the others just invoke arguments and involve that one system call
	
- `execl`:
	- Stands for the fact that the arguments that you will give to this program are in list form
- `execlp`:
	- You do not have to get the full file name of the executable you want to replace yourself with, you just give the `ls` name, and is the function call that will find the correct file
	- The shell will say: is there an `ls` in this directory? if not, is there in this other directory?
	- Signifies file lookup.
	- Gives you some independence of actually hard coding names of executables

### What is a file descriptor?
Imagine you're at a library.
- You request a book, but the librarian gives you a numbered slip. It's your handle to the book.
- You use the slip to get access to the book—you read, and eventually return the slip.
- The slip is like a file descriptor: a <span style="color:rgb(254, 134, 22)">small integer</span> the operating system gives you to refer to an open resource.

Key point: the descriptor is not the file itself, just a handle.

**Notes**:
- When you create a file, you are actually created some kernel data structure
- To communicate between kernel and user process you need file descriptors
	- Integers in your file descriptor table will indicate read and write ends
- The only way to access a file is through the kernel because that file is in the disk which is not accessible by processes, so the kernel needs to handle it to you
	
**Analogy (library)**:
- You ask the librarian to read a book, instead what you get is a file descriptor, and it is through this file descriptor that you can read and write the book, you never actually get the book directly.

### What is a file descriptors actually?
- In every process there is a file descriptor table
- Normally start with 0, 1, and 2, these are the descriptors that you are born with
- If you create a socket (for network communications), the kernel exposes these things through the kernel to file descriptors.
- The only handles that you have to the pipe is through these file descriptors

![Pasted image 20260202140246.png](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202140246.png)
- Unix maintains a giant file descriptor table and every process have their little file descriptor tables associated with them, all of the entries in the small descriptor table are pointing to the gigant file descriptor table, this is how different processes can refer to the same file.
- When a parent forks, you are not copying anything, just the pointers of the same file descriptor table?
- *Think of a file descriptor table as being an array pointing to things in a kernel data structure*.

### `fork()` and `pipe()`
It is not useful for one process to use a pipe to talk to itself. Typically, a process creates a pipe <span style="color:rgb(159, 239, 0)">just before</span> it forks more child processes.

The pipe is inherited by the children, and then used for communicating either between the parent & child processes, or between two sibling processes.

In the following program, the <span style="color:rgb(159, 239, 0)">parent writes</span> a message to the pipe. 

The <span style="color:rgb(159, 239, 0)">child reads</span> from the pipe <span style="color:rgb(254, 134, 22)">1 byte at a time</span> until the pipe is empty.

**Notes**:
- The file descriptor tables of a parent and a child process are exactly the same
- When you did a fork actually the child also got a copy of the same pipe
	- This kind of attributes are copied across a `fork()`
	- Now A and B got the same pipe
- A file descriptor is an indirect way of referring to a pipe
- Though only one of them can actually have access to the pipe at a time
	- Now the parent closes one end and the child closes the other end of the pipe, while letting the other side untouched.
	- Now there is a single writer and a single reader on the pipe, this is important (you can only have one single writer but multiple readers)

![Pasted image 20260130142119.png|500](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130142119.png)

### `fork()` and `pipe()` example 1

{{CODE_BLOCK_20}}
- When fork() returns, if it is successful, then both of you will be able to return.
- In the parent process fork() will return the process id of the child, but in the child it will return 0.
- Using this information now, the parent and the child can do different things
- **Parent**:
	- The parent closes the read end of the pipe
	- Then it writes into the write end of the pipe
	- Then closes the pipe (<span style="color:rgb(254, 134, 22)">this is not enough</span>)
- **Child** (else clause):
	- Keeps reading from the read end of the pipe
		- Reads 1 byte and puts it in buff
	- Unfortunately never returns when all the data in the pipe has finished

{{CODE_BLOCK_21}}

**Problem**:
The child process doesn't exit because its own copy of `pipefds[1]` is still open. As long as any descriptor for the write end is open, the system assumes that a write could still occur and will not report EOF.
- The child inherited both endpoints of the pipe
- The child does not see end-of-file because it is also holding the write end of the pipe
- So the reader also needs to close its copy of the write end of the pipe

The <span style="color:rgb(254, 134, 22)">child blocks</span> and waits to read from `pipefds[0]`, because the operating system cannot know that no process will ever write to `pipefds[1]`.

### `fork()` and `pipe()` example 2

{{CODE_BLOCK_22}}

{{CODE_BLOCK_23}}
- Now the read end reaches EOF.

**Note**:
- When the child process finishes reading all the bytes of data, we close the related file descriptors.

![Pasted image 20260202142351.png|500](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142351.png)

### Pipes for bidirectional communication

![Pasted image 20260202142628.png](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142628.png)

**Problem**:
- Pipes are <span style="color:rgb(254, 134, 22)">unidirectional</span> and should not be used to write back.
- Using a single pipe for bidirectional communication will fail.

![Pasted image 20260202142705.png](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142705.png)

**Solution**:
- The parent needs to create both pipes before forking. A pipe created by the child after the fork is not inherited by the parent, so the child cannot use it to communicate back to the parent.
- We can use <span style="color:rgb(159, 239, 0)">two pipes</span> for bidirectional communication between parent and child.
	1. The parent process uses `pcfd` to send data to the child process.
	2. The child process uses `cpfd` to send data to the parent process.
- In both cases, the calls to `write()` use index 1, and the `read()` calls use index 0 on the file descriptor array.
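
The two-pipe arrangement above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the helper name `shout_via_child` and the upper-casing "service" performed by the child are assumptions made for the example, and it assumes a short message that fits comfortably in the pipe buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent sends `msg` to the child over pcfd, the
 * child upper-cases each byte and sends it back over cpfd.
 * Returns the reply length, or -1 on error. Assumes a short message. */
ssize_t shout_via_child(const char *msg, char *reply, size_t cap)
{
    int pcfd[2]; /* parent -> child */
    int cpfd[2]; /* child -> parent */
    assert(msg != NULL && reply != NULL && cap > 0);

    /* Both pipes are created BEFORE fork() so the child inherits them. */
    if (pipe(pcfd) == -1)
        return -1;
    if (pipe(cpfd) == -1) {
        close(pcfd[0]); close(pcfd[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(pcfd[0]); close(pcfd[1]);
        close(cpfd[0]); close(cpfd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: reads from pcfd[0], writes to cpfd[1]. */
        close(pcfd[1]);
        close(cpfd[0]);
        char c;
        while (read(pcfd[0], &c, 1) == 1) {
            c = (char)toupper((unsigned char)c);
            if (write(cpfd[1], &c, 1) != 1)
                _exit(1);
        }
        close(pcfd[0]);
        close(cpfd[1]);
        _exit(0);
    }

    /* Parent: writes to pcfd[1], reads from cpfd[0]. */
    close(pcfd[0]);
    close(cpfd[1]);
    size_t len = strlen(msg);
    ssize_t written = write(pcfd[1], msg, len);
    close(pcfd[1]); /* lets the child see EOF on pcfd[0] */

    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(cpfd[0], reply + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    reply[total] = '\0';
    close(cpfd[0]);
    waitpid(pid, NULL, 0);
    return (written == (ssize_t)len) ? (ssize_t)total : -1;
}
```

Calling `shout_via_child("hello pipes", reply, sizeof reply)` from `main` should leave the upper-cased text in `reply`. Note that each process closes the two ends it does not use, so each pipe ends up with exactly one writer and one reader.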

### Copy of a file descriptor: `dup2()`

**The `dup2()` system call creates a copy of a file descriptor.**
- If the copy is successfully created, then the original and copy file descriptors may be used interchangeably.
- They both refer to the same open file description and thus share file offset and file status flags.

{{CODE_BLOCK_24}}

`oldfd`: old file descriptor
`newfd`: the descriptor number that becomes the copy. If `newfd` is already open, `dup2()` closes it first.

![Pasted image 20260130142544.png|600](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130142544.png)
- The white table is the kernel-wide descriptor table

**Bash uses `dup2()` with pipes to link commands together.**

Example: `ls | sort`
1. The ls process closes its `read` end of the pipe and links the write end to its standard output.
2. The sort process closes the `write` end of the pipe and links the read end to become its standard input.
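3. After the `dup2()` calls, `ls` closes its original `write` descriptor and `sort` closes its original `read` descriptor. The copies on standard output and standard input stay open.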

Anything that `ls` writes to its standard output, `sort` would read from its standard input.

**Notes**:
- `dup2()` copies a descriptor table entry from one descriptor to another. It does not copy any data.
- Entries in a file descriptor table point to kernel objects (open files, pipes, and so on)
	- Something that the kernel is maintaining for you
- When you duplicate from 4 to 1 (`dup2(4, 1)`), you are saying: whatever 4 was referring to, please make 1 refer to it too.
	- Therefore writing to 1 and writing to 4 will have the same effect, they are both pointing to the same underlying object
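
A small single-process sketch of this behaviour, under stated assumptions: the helper name `capture_stdout_write` is hypothetical, and the message is assumed to be short enough to fit in the pipe buffer. It points descriptor 1 at the write end of a pipe, writes to descriptor 1, restores the real standard output, and then reads the bytes back from the pipe.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: temporarily makes descriptor 1 refer to a pipe,
 * writes `msg` to descriptor 1, and reads it back from the read end.
 * Returns the number of bytes captured, or -1 on error. */
ssize_t capture_stdout_write(const char *msg, char *out, size_t cap)
{
    int fds[2];
    assert(msg != NULL && out != NULL && cap > 0);

    if (pipe(fds) == -1)
        return -1;

    int saved = dup(STDOUT_FILENO); /* remember the real stdout */
    if (saved == -1 || dup2(fds[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]); /* descriptor 1 is now the only write end */

    /* Writing to descriptor 1 now goes into the pipe. */
    ssize_t written = write(STDOUT_FILENO, msg, strlen(msg));

    /* Restoring stdout replaces descriptor 1, which closes the last
     * descriptor for the write end of the pipe. */
    dup2(saved, STDOUT_FILENO);
    close(saved);

    ssize_t n = read(fds[0], out, cap - 1);
    close(fds[0]);
    if (written < 0 || n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```

Nothing appears on the terminal during the call, because descriptor 1 referred to the pipe at the moment of the `write()`.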

### `dup2` in action

![Pasted image 20260130143031.png|350](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130143031.png)
- Imagine the left column wants to become `ls` and the right column wants to become `sort`
- Left column calls `dup2(4, 1)`, so its standard output now refers to the write end of the pipe
- Right column calls `dup2(3, 0)`, so its standard input now refers to the read end of the pipe
- After `dup2`, two descriptors refer to the write end, so you always need to close one of them.
	- You can get away with not closing an extra read descriptor, but you ALWAYS NEED to close the extra write descriptor.
- You can figure out whether you are the child or not!
	- Yes, children can do different things!
- **Note**: the reader sees `EOF` only when every descriptor for the write end has been closed. If an extra write descriptor stays open, the reader won't see `EOF`, but you want it to, so that `sort` knows there is no more data to read.
	- You want the readers to notice the stream has ended
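
The `ls | sort` wiring described above might look roughly like this in C. It is a sketch, not the shell's actual implementation: the helper name `run_pipeline` is an assumption, and it relies on `execvp()` finding both programs on the `PATH`.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `left | right` and returns the exit status of
 * `right`, or -1 on failure. Both argv arrays must end with NULL. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];
    assert(left != NULL && right != NULL);

    if (pipe(fds) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(fds[0]); close(fds[1]);
        return -1;
    }
    if (lpid == 0) {
        /* Left child (the "ls" role): stdout becomes the write end. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]); /* close the extra write descriptor */
        execvp(left[0], left);
        _exit(127); /* only reached if exec failed */
    }

    pid_t rpid = fork();
    if (rpid == -1) {
        close(fds[0]); close(fds[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0) {
        /* Right child (the "sort" role): stdin becomes the read end. */
        close(fds[1]); /* otherwise it would never see EOF */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent keeps neither end; an open write end here would
     * also prevent the right child from seeing EOF. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with `char *const ls_argv[] = {"ls", NULL};` and `char *const sort_argv[] = {"sort", NULL};`, calling `run_pipeline(ls_argv, sort_argv)` prints the sorted directory listing and returns the exit status of `sort`.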

### Limitations of pipes

- Unix pipes are "unidirectional" channels from one process to another, and cannot be turned into bidirectional or multicast channels.
	- This limitation makes pipes difficult to use for complex inter-process communication.
	- Bidirectional communication can be simulated using a pair of pipes, but
		- Inconsistent buffering between both pipes can lead to complications.
	
- Pipes can only be shared between processes that have a common ancestor that creates the pipe.
- There is no way for a process to "reference" a pipe that it (or an ancestor) did not directly create via **pipe**.
- This restriction prevents many useful forms of inter-process communication from being layered on top of pipes.
	- Pipes cannot be used for clients to connect to long-running services.
	- Processes cannot open additional pipes to other processes that they already have a pipe to.

**Notes**:
- Pipes are unidirectional (only one way)
- Pipes have destructive reads: once a process reads data from the pipe, that data is gone.
- There are some request/control protocols like `ftp` where you have a control channel and a data channel
	- This situation becomes tough: even though you have a defined hierarchy, by the time the data channel needs to be created it is already too late, because the fork has already happened. This is inconvenient.

![Pasted image 20260202143145.png|200](/img/user/00%20-%20TAMU%20Brain/6th%20Semester%20(Spring%2026)/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202143145.png)
- Is there a way to create a pipe between C and E?
	- Find a common ancestor
	- When A forks C or B, they inherit the ends of the pipe and A can close them
	- When B does a fork, those two ends of the pipe are inherited by E so B can close them
	- ...
	- A needs to know in advance that one of its children will eventually want to talk to one of its grandchildren
		- There is a whole lot of planning that you need to address

### Using Files for Inter Process Communication

Two programs can communicate with one another by using an intermediate medium such as a file.

{{CODE_BLOCK_25}}

- The program `ls` sends its output to the file "content.txt".
- The program `more` takes "content.txt" as an input.
- `ls` indirectly communicates with `more` via the file "content.txt".
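
The same idea between a parent and a child process can be sketched in C. The helper name `pass_via_file` and the file path passed to it are hypothetical. Unlike a pipe, a plain file gives the reader no EOF-on-close signal, so this sketch simply waits for the writer to exit before reading.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child writes `msg` to the file at `path`, then the
 * parent reads it back. Returns the number of bytes read, or -1 on error. */
ssize_t pass_via_file(const char *path, const char *msg, char *out, size_t cap)
{
    assert(path != NULL && msg != NULL && out != NULL && cap > 0);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child (the "ls" role): write the output to the file. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd == -1)
            _exit(1);
        size_t len = strlen(msg);
        int ok = write(fd, msg, len) == (ssize_t)len;
        close(fd);
        _exit(ok ? 0 : 1);
    }

    /* Parent (the "more" role): wait until the writer has finished,
     * because the file itself does not say when the data is complete. */
    int status = 0;
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    ssize_t n = read(fd, out, cap - 1);
    close(fd);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```

The data stays in the file after it is read, which is the opposite of a pipe's destructive reads.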

# FIFO

### FIFO (named PIPE)

FIFO is a `first-in`, `first-out` message-passing IPC in which bytes are sent and received as `unstructured streams`. It is also known as a named pipe.

A named pipe is a POSIX facility that, in contrast to anonymous pipes created with the `pipe()` system call, has a name in the file system.

FIFOs work by associating a filename with the pipe. Once created, any process (with correct access permissions) can access the FIFO by calling open() on the associated filename.

Once the processes have opened the file, they can use the standard read() and write() functions to communicate.
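
A minimal sketch of these steps, assuming the FIFO is created with `mkfifo()` at a caller-supplied path. The helper name `fifo_round_trip` is hypothetical. A parent and child are used here only to keep the example self-contained; because the FIFO is opened by name, the two processes would not need to be related.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: creates a FIFO at `path`, has a child write `msg`
 * into it, and reads the message back in the parent.
 * Returns the number of bytes read, or -1 on error. */
ssize_t fifo_round_trip(const char *path, const char *msg, char *out, size_t cap)
{
    assert(path != NULL && msg != NULL && out != NULL && cap > 0);

    if (mkfifo(path, 0600) == -1) /* fails if the path already exists */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        /* Writer: open() blocks until a reader has opened the FIFO. */
        int wfd = open(path, O_WRONLY);
        if (wfd == -1)
            _exit(1);
        size_t len = strlen(msg);
        int ok = write(wfd, msg, len) == (ssize_t)len;
        close(wfd); /* reader sees EOF once this is closed */
        _exit(ok ? 0 : 1);
    }

    /* Reader: open() blocks until the writer has opened the FIFO. */
    int rfd = open(path, O_RDONLY);
    size_t total = 0;
    ssize_t n = 0;
    if (rfd != -1) {
        while (total < cap - 1 &&
               (n = read(rfd, out + total, cap - 1 - total)) > 0)
            total += (size_t)n;
        close(rfd);
    }
    out[total] = '\0';
    waitpid(pid, NULL, 0);
    unlink(path); /* remove the name from the file system */
    return (rfd == -1 || n < 0) ? -1 : (ssize_t)total;
}
```

After `open()`, the code uses the same `read()`, `write()`, and `close()` calls as with an anonymous pipe, including the rule that the reader sees EOF only when the write end is closed.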
