03 - Inter Process Communication - Pipes and FIFO
Outline
- The pipe system call (see beej.us)
- By default pipes do not have a name
- But there are named pipes in linux
- The fork system call
- Create copies of processes
- fifo or named pipes
- Exists as types in the file system
- But ntfs does not support fifo as a named pipe
Pipes
Real-time communication between processes
...
Notes:
- Processes are units of isolation, so it is difficult for one process to look into another process
- To communicate, processes can do message passing (mechanism 1)
- The other mechanism is shared memory
- It is possible for two different processes to map the same file into their address spaces
- One form of message passing is the message queue
- A process can write full messages into a message queue, and the reader just pulls messages.
- One message could be 5 bytes, the next 10 bytes, and so on.
- The other type is the anonymous pipe
- In an anonymous pipe, these message boundaries are not preserved
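The loss of message boundaries can be seen with a small sketch. It uses the pipe() call introduced in the next section; the function name `bytes_from_one_read` is made up for illustration. The sender performs two writes (5 bytes, then 10 bytes), and the receiver issues one read.

```c
#include <unistd.h>

/* Writes a 5-byte and a 10-byte "message" into a fresh pipe, then issues a
 * single read. Returns how many bytes that one read delivered, or -1 on error. */
ssize_t bytes_from_one_read(void) {
    int fd[2];
    char buf[64];

    if (pipe(fd) == -1)
        return -1;

    /* Two separate writes: the sender thinks of these as two messages. */
    if (write(fd[1], "AAAAA", 5) != 5 ||
        write(fd[1], "BBBBBBBBBB", 10) != 10) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);

    /* One read: the pipe hands back whatever bytes are buffered,
     * with no record of where one write ended and the next began. */
    ssize_t n = read(fd[0], buf, sizeof buf);
    close(fd[0]);
    return n;
}
```

On Linux the single read returns all 15 buffered bytes, so the reader cannot tell that two messages were sent.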
What is a pipe in Linux?
...
Notes:
- A pipe is a data structure in the kernel; when you create one you get two new file descriptors, for a total of 5 (two more than the original 3)
- One to write on one end
- One to read on the other end
- How to create a pipe?
- Use the pipe() system call
- The kernel will fill in the two file descriptors for you
- Basically small integers
- "Every process has a file descriptor table"
- You know these indices are valid; what they point to is generally something you cannot manipulate directly
man 2 pipe
$ man 2 pipe
NAME
pipe – create descriptor pair for interprocess communication
SYNOPSIS
#include <unistd.h>
int pipe(int fildes[2]);
DESCRIPTION
The pipe() function creates a pipe (an object that allows unidirectional data flow) and allocates
a pair of file descriptors. The first descriptor connects to the read end of the pipe; the second
connects to the write end.
Data written to fildes[1] appears on (i.e., can be read from) fildes[0]. This allows the output
of one program to be sent to another program: the source's standard output is set up to be the
write end of the pipe; the sink's standard input is set up to be the read end of the pipe. The
pipe itself persists until all of its associated descriptors are closed.
A pipe whose read or write end has been closed is considered widowed. Writing on such a pipe
causes the writing process to receive a SIGPIPE signal. Widowing a pipe is the only way to
deliver end-of-file to a reader: after the reader consumes any buffered data, reading a widowed
pipe returns a zero count.
The generation of the SIGPIPE signal can be suppressed using the F_SETNOSIGPIPE fcntl command.
- Somehow you need to hand one end of the pipe to the other party (another process)
- A process calls pipe() and gets back 2 descriptors, which correspond to a unidirectional data channel that can be used for inter-process communication
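The "widowed pipe" behavior from the man page can be checked with a small sketch. The man page above mentions F_SETNOSIGPIPE; this sketch instead assumes the portable approach of ignoring SIGPIPE with signal(), so the failed write reports an error rather than terminating the process. The function name is made up for illustration.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Returns the errno produced by writing to a pipe whose read end is closed,
 * 0 if the write unexpectedly succeeded, or -1 if setup failed. */
int errno_from_widowed_write(void) {
    int fd[2];

    /* Ignore SIGPIPE so the failed write returns -1 instead of killing us. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);
    if (old_handler == SIG_ERR)
        return -1;

    if (pipe(fd) == -1) {
        signal(SIGPIPE, old_handler);
        return -1;
    }

    close(fd[0]);                      /* no reader is left: the pipe is widowed */
    ssize_t n = write(fd[1], "x", 1);
    int err = (n == -1) ? errno : 0;

    close(fd[1]);
    signal(SIGPIPE, old_handler);      /* restore the previous disposition */
    return err;
}
```

With SIGPIPE ignored, the write fails and errno is set to EPIPE.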
How to create a pipe?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char* argv[]){
int fd[2];
char buf[30];
//create pipe
if(pipe(fd) == -1){
perror("pipe");
exit(EXIT_FAILURE);
}
//write to pipe
printf("writing to file descriptor #%d\n", fd[1]);
write(fd[1],"CSCE 313",9);
//read from pipe
printf("reading from file descriptor #%d\n", fd[0]);
read(fd[0], buf,9);
printf("read \"%s\"\n",buf);
return 0;
}
- pipe() takes an array of two ints as an argument.
- What are the return values of pipe()?
- -1: pipe failure
- 0: pipe success
- pipe() fills in the array with two file descriptors.
Notes:
- pipe(fd) is a system call
- If it returns -1, then no pipe was created
- Otherwise you have created a pipe
- It is unidirectional and doesn't maintain message boundaries: once data goes through the pipe, the pipe does not remember which write each chunk came from
- If you are building a protocol based on pipes, your reader has to build some letter ... in order to be able to work with pipes
int fd[2]
- Integer array got populated with the file descriptors that you can use
pipe(fd)
- The OS picks two free entries from your file descriptor table, lets say 7 and 11
- It will return which end is read (7) and which is write (11)
- Now your file descriptor table is populated
- Now you can say
write(11, "...", #bytes)
write(fd[1], "CSCE-313", 9)
- Write 9 bytes into this descriptor from the string "CSCE-313"
- Note that a write of at most PIPE_BUF bytes is atomic.
- Note the "CSCE-313" is actually 9 bytes because a string literal ends with a null terminator \0, which occupies a byte.
read(fd[0], buf, 9)
- Read from this file descriptor into this buffer, (read only 9 bytes)
- This is a destructive read, when you read these 9 bytes, they are gone on the data channel
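Because the pipe does not keep message boundaries, a protocol built on pipes has to add its own framing. One common choice, shown here only as a sketch, is a length prefix: each message is sent as a 4-byte length followed by the payload. The names `send_msg`, `recv_msg`, `write_all`, and `read_all` are hypothetical helpers, and the header is in host byte order, which is fine for two processes on the same machine.

```c
#include <stdint.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes. Returns 0 or -1. */
static int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes. Returns 0, or -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical framing: a 4-byte length header followed by the payload. */
int send_msg(int fd, const void *msg, uint32_t len) {
    if (write_all(fd, &len, sizeof len) == -1)
        return -1;
    return write_all(fd, msg, len);
}

/* Reads one framed message into out (capacity cap).
 * Returns the message length, or -1 on error or if it does not fit. */
int recv_msg(int fd, void *out, uint32_t cap) {
    uint32_t len;
    if (read_all(fd, &len, sizeof len) == -1)
        return -1;
    if (len > cap)
        return -1;
    if (read_all(fd, out, len) == -1)
        return -1;
    return (int)len;
}
```

With this framing, two calls to `send_msg` are recovered as two separate messages by two calls to `recv_msg`, even though the bytes sit back to back in the pipe.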
How to use a pipe?
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202173312.png)
When any bytes are written to fd[1],the OS makes them available for reading from fd[0].
Linux pipe command example 1 (cmd1 | cmd2)
...
Notes:
$ cat /usr/share/dict/words
- It will just print the contents of the file to the terminal
$ cat /usr/share/dict/words | less
- The file's contents are now passed as the input to the right command, less, which lets you scroll through them.
- The shell associates the standard output of the cat program with the write end of the pipe
- The shell connects the read end of the pipe to the standard input of the less program.
Linux pipe command example 2 (cmd1 | cmd2)
...
Notes:
$ cat /usr/share/dict/words | grep 'zy'
- grep allows you to match, using a regular expression, against its input
- If grep finds a matching line, it sends it to the standard output (prints it to the terminal)
$ cat /usr/share/dict/words | grep 'zy.*s'
- Matches any line that contains zy, then anything, then s
$ cat /usr/share/dict/words | grep '^zy.*s$'
- For words that start with 'zy' and end with 's'
Linux command example 3 (cmd1 | cmd2)
...
Linux command example 4 (cmd1 | cmd2)
...
Notes:
```bash
cat games.txt | sort | uniq | head -3 > top3.txt
```
- uniq discards duplicated lines (sort puts them next to each other first)
- head -3 returns the first 3 lines of its input, which are then saved to top3.txt
Unix philosophy
...
Notes:
- There was a time when pipes didn't exist in Unix
- The philosophy before was that as clients wanted more and more functionality, developers just kept adding more and more features, and the code became unmanageable
- The philosophy of Unix with pipes is that you can write simple programs that each do a single thing well, then use composition (assisted by pipes) to communicate with other processes and make bigger systems
Linux command example 6 (cmd1 | cmd2)
Notes:
- This is exactly how your shell is creating the pipe command
Forks
UNIX fork()
...
Notes:
- The mental model is that when a process invokes fork(), what it gets back is the process ID of the child process
- In Unix we can make a complete hierarchy of parent-child relationships
- When a process calls fork(), another identical process is created; both you and your copy then return from fork().
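The return value of fork() is what lets the two copies tell themselves apart. The following is a minimal sketch (the function name `child_exit_code` is made up): the child sees 0 and exits with a chosen code, while the parent sees the child's PID and waits for it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once. The child exits with the given code; the parent waits for it
 * and returns the code it observed, or -1 on error. */
int child_exit_code(int code) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;                 /* fork failed: no child was created */

    if (pid == 0)
        _exit(code);               /* child: fork() returned 0 */

    /* parent: fork() returned the child's PID */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `child_exit_code(7)` returns 7 in the parent: the child ran the `_exit` branch, the parent ran the `waitpid` branch.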
UNIX fork() example 1
...
Notes:
- When this program starts to execute, it first makes a fork
- Now two of you are executing!
- Now both of them are executing an identical program so both of them will print whatever the program prints
#include <stdio.h>
#include <unistd.h>
int main() {
fork();
printf("Welcome to CSCE-313:\n");
return 0;
}
- They will both terminate at return 0.
- Every time a process calls fork() there are two copies afterwards: the original and the new child.
UNIX fork() example 2
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
fork();
fork();
printf("Welcome to CSCE 313!\n");
return 0;
}
- Once fork() returns there are actually two of you!
- Both of them will return
- Two separate processes, but they have identical behavior
davidkebo@CSCE-C02F159MMD6M forks % ./a.out
Welcome to CSCE 313!
Welcome to CSCE 313!
Welcome to CSCE 313!
Welcome to CSCE 313!
- The number of times the message prints is equal to the number of processes created: 2^n, where n is the number of fork system calls executed one after another (here 2^2 = 4)
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130135316.png)
- At some point the process calls fork(); the OS creates an identical copy of it, so both of them will return (at some time).
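The 2^n count can be checked without reading terminal output. In this sketch (the function name is made up), every process that exists after n consecutive fork() calls writes one byte into a shared pipe, and the original process counts the bytes. It assumes every fork() succeeds; a failed fork would simply lower the count.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Calls fork() n times in a row and returns how many processes exist
 * afterwards, counted by the original process. Returns -1 on error. */
int count_processes_after_forks(int n) {
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t original = getpid();
    for (int i = 0; i < n; i++)
        fork();                    /* every existing process forks again */

    /* Each process, original or copy, reports in with one byte. */
    char mark = 'x';
    write(fds[1], &mark, 1);
    close(fds[1]);

    if (getpid() != original) {
        close(fds[0]);
        _exit(0);                  /* copies stop here */
    }

    /* The original reads until EOF, which arrives only after every
     * copy has closed its write end. */
    int count = 0;
    char c;
    while (read(fds[0], &c, 1) == 1)
        count++;
    close(fds[0]);

    while (wait(NULL) > 0)         /* reap direct children */
        ;
    return count;
}
```

For n = 2 this matches the four "Welcome" lines printed above.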
An aside-execl
#include <unistd.h>
int main() {
execl("/bin/echo", "echo", "Hello World", NULL);
return 1;
}
- execl will take over your code segment, data, and stack, and recreate the address space with the contents of the new program
- It will look into an ELF file, which has the complete description (i.e. data, code, etc.)
- In this example, the program replaces itself with the program echo, which then prints "Hello World" to the terminal.
exec is a system call that replaces a process’s user address space with data read from
an executable file.
- Continuity of the process but replacement of its contents.
- The process keeps the same PID, open file descriptors, environment, etc.,
- But the program text and data are swapped out for a new image.
Notes:
- Your virtual address space will be overwritten, and you basically become the new program
- The first argument is the path of the executable; the second is, by convention, the name of the program itself (argv[0])
- A program can do different things depending on the name it is invoked with
- The reason that you need the NULL at the end is that execl takes a variable number of arguments (varargs), and NULL marks the end of the list
- When you exec, your open file descriptors stay the same (unless they are marked close-on-exec)
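Since exec replaces the calling process, a program that wants to keep running usually forks first and lets the child call exec. A minimal sketch of that pattern, assuming /bin/echo exists as in the example above (the function name `run_echo` is made up):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs /bin/echo with one argument in a child process.
 * Returns echo's exit code, or -1 on error. */
int run_echo(const char *text) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* path, argv[0], argv[1], then NULL to end the variadic list */
        execl("/bin/echo", "echo", text, (char *)NULL);
        _exit(127);                /* only reached if execl failed */
    }

    /* The parent keeps its own program image and waits for the child. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child inherits the parent's standard output, so the text printed by echo still shows up on the same terminal.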
Another example:
less /usr/share/dict/words
- We can tell the shell to replace itself with the less program, opened on that file
exec less /usr/share/dict/words
- Now when you close the less program your shell will also end!
- That shell is gone! It was overwritten by the exec
- So exec is a way to replace yourself
Looking ahead-exec family of functions
See the online man page (https://man7.org/linux/man-pages/man3/exec.3.html). Only the first two are of interest to us at this time.
int execl ( const char *path, const char *arg, ... );
int execlp( const char *file, const char *arg, ... );
int execle( const char *path, const char *arg, ..., char *const envp[] );
int execv ( const char *path, char *const argv[] );
int execvp( const char *file, char *const argv[] );
int execve( const char *path, char *const argv[], char *const envp[] );
- Of these, only execve is an actual system call; the others are library wrappers that arrange their arguments and invoke it
- execl:
- The "l" stands for the fact that the arguments you give to the program are in list form
- execlp:
- You do not have to give the full path of the executable you want to replace yourself with; you just give a name such as ls, and the function call finds the correct file
- It searches the directories listed in PATH, the same way the shell does: is there an ls in this directory? If not, is there one in this other directory?
- The "p" signifies this path lookup.
- Gives you some independence from hard-coding the locations of executables
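The difference between execl and execlp can be sketched as follows. The helper `run_by_name` is hypothetical; it passes only a bare program name and relies on execlp to search PATH. The exit code 127 for "not found" is a convention borrowed from shells, not something execlp itself defines.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs a program given only its name (no directory), with no arguments.
 * Returns its exit code, 127 if it could not be executed, or -1 on error. */
int run_by_name(const char *name) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* execlp looks the name up in the directories listed in PATH */
        execlp(name, name, (char *)NULL);
        _exit(127);                /* lookup or exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_by_name("true")` works without knowing whether the executable lives in /bin or /usr/bin.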
What is a file descriptor?
Imagine you're at a library.
- You request a book, but the librarian gives you a numbered slip. It's your handle to the book.
- You use the slip to get access to the book—you read, and eventually return the slip.
- The slip is like a file descriptor: a small integer the operating system gives you to refer to an open resource.
Key point: the descriptor is not the file itself, just a handle.
Notes:
- When you open or create a file, you are actually creating a kernel data structure
- To communicate between the kernel and a user process you need file descriptors
- Integers in your file descriptor table identify the read and write ends
- The only way to access a file is through the kernel, because the file is on disk, which is not directly accessible by processes, so the kernel needs to hand it to you
Analogy (library):
- You ask the librarian for a book to read; what you get instead is a file descriptor, and it is through this file descriptor that you read and write the book. You never actually get the book directly.
What is a file descriptor, actually?
- In every process there is a file descriptor table
- It normally starts with 0, 1, and 2 (standard input, standard output, and standard error); these are the descriptors that you are born with
- If you create a socket (for network communications), the kernel also exposes it to you as a file descriptor.
- The only handles that you have to the pipe are these file descriptors
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202140246.png)
- Unix maintains one system-wide table of open files, and every process has its own small file descriptor table. The entries in the small per-process table point into the system-wide table; this is how different processes can refer to the same open file.
- When a parent forks, the child gets a copy of the parent's file descriptor table, but the copied entries point to the same system-wide entries, so nothing underneath is duplicated.
- Think of a file descriptor table as being an array pointing to things in a kernel data structure.
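That parent and child point at the same underlying entry can be observed through the file offset, which lives in the shared entry rather than in the per-process table. This is a sketch under the assumption that /tmp is writable; the function name and the temporary file template are made up.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 3 bytes through an inherited descriptor; the parent then
 * asks for its own current offset on the same descriptor.
 * Returns that offset, or -1 on error. */
long parent_offset_after_child_write(void) {
    char path[] = "/tmp/fd_share_XXXXXX";
    int fd = mkstemp(path);        /* creates and opens a temporary file */
    if (fd == -1)
        return -1;
    unlink(path);                  /* file lives on until fd is closed */

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* child: same descriptor number, same open file underneath */
        ssize_t n = write(fd, "abc", 3);
        _exit(n == 3 ? 0 : 1);
    }

    waitpid(pid, NULL, 0);
    off_t pos = lseek(fd, 0, SEEK_CUR);   /* where is the parent's offset? */
    close(fd);
    return (long)pos;
}
```

The parent never wrote anything, yet its offset has moved to 3, because the offset belongs to the shared open-file entry.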
fork() and pipe()
It is not useful for one process to use a pipe to talk to itself. Typically, a process creates a pipe just before it forks more child processes.
The pipe is inherited by the children, and then used for communicating either between the parent & child processes, or between two sibling processes.
In the following program, the parent writes a message to the pipe.
The child reads from the pipe 1 byte at a time until the pipe is empty.
Notes:
- The file descriptor tables of a parent and a child process are exactly the same right after the fork
- When you did a fork, the child also got descriptors for the same pipe
- These kinds of attributes are copied across a fork()
- Now A and B have the same pipe
- A file descriptor is an indirect way of referring to a pipe
- Though each process should use only one end of the pipe
- Now the parent closes one end and the child closes the other end of the pipe, while leaving the other side untouched.
- Now there is a single writer and a single reader on the pipe. This is important: the reader only sees end-of-file once every descriptor for the write end has been closed
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130142119.png)
fork() and pipe() example 1
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>
int main(int argc, char* argv[]) {
int pipefds[2];
pid_t pid;
char buf[30];
//create pipe
if(pipe(pipefds)==-1){
perror("pipe");
exit(EXIT_FAILURE);
}
memset(buf,0,30);
pid = fork();
if(pid>0){
printf("PARENT: writing to the pipe\n");
//parent close the read end
close(pipefds[0]);
//parent write in the pipe write end
write(pipefds[1],"CSCI3150",9);
//after finishing writing, parent close the write end
close(pipefds[1]);
//parent wait for child
wait(NULL);
} else {
//child did not close the write end
//child read from the pipe read end until the pipe is empty
while(read(pipefds[0],buf,1)==1){
printf("CHILD read from pipe --%s\n",buf);
}
close(pipefds[0]);
printf("CHILD: EXITING!");
exit(EXIT_SUCCESS);
}
return 0;
}
- When fork() returns, if it is successful, both of you will return from it.
- In the parent process fork() will return the process ID of the child, but in the child it will return 0.
- Using this information, the parent and the child can do different things
- Parent:
- The parent closes the read end of the pipe
- Then it writes into the write end of the pipe
- Then it closes the write end (this is not enough)
- Child (else clause):
- Keeps reading from the read end of the pipe
- Reads 1 byte and puts it in buf
- Unfortunately read() never returns once all the data in the pipe has been consumed
davidkebo@CSCE-C02F159MMD6M pipes % ./a.out
PARENT: writing to the pipe
CHILD read from pipe --C
CHILD read from pipe --S
CHILD read from pipe --C
CHILD read from pipe --I
CHILD read from pipe --3
CHILD read from pipe --1
CHILD read from pipe --5
CHILD read from pipe --0
CHILD read from pipe --
Problem:
The child process doesn't exit because its own copy of pipefds[1] is still open. The system assumes that a write could occur while any write end is still open, so it will not report EOF.
- The child received the same two end points of the pipe
- The child does not see end-of-file because the child itself is also holding the write end of the pipe, and the system assumes that a write could occur.
- So the reader also needs to close its copy of the write end of the pipe
The child blocks and waits to read from pipefds[0], and the operating system doesn't know that no process will be writing to pipefds[1].
fork() and pipe() example 2
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
int main (int argc, char *argv[]) {
int pipefds[2];
pid_t pid;
char buf[30];
if (pipe (pipefds) == -1) {
perror ("pipe");
exit (EXIT_FAILURE);
}
memset (buf, 0, 30);
pid = fork ();
if (pid > 0) {
printf ("PARENT: writing to the pipe\n");
close (pipefds[0]);
// 9 bytes: the 8 characters plus the terminating '\0'
write (pipefds[1], "CSCI3150", 9);
close (pipefds[1]);
wait (NULL);
} else {
close(pipefds[1]); // IMPORTANT!! (close the write end of the pipe)
while (read (pipefds[0], buf, 1) == 1) {
printf ("CHILD read from pipe --%s\n", buf);
}
close (pipefds[0]);
printf ("CHILD: EXITING!");
exit (EXIT_SUCCESS);
}
return 0;
}
davidkebo@CSCE-C02F159MMD6M week3 % ./a.out
PARENT: write in pipe.
CHILD read from pipe --C
CHILD read from pipe --S
CHILD read from pipe --C
CHILD read from pipe --I
CHILD read from pipe --3
CHILD read from pipe --1
CHILD read from pipe --5
CHILD read from pipe --0
CHILD read from pipe --
CHILD: EXITING!
- Now the read end reaches EOF.
Note:
- When the child process finishes reading all the bytes of data, we close the related file descriptors.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142351.png)
Pipes for bidirectional communication
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142628.png)
Problem:
- Pipes are unidirectional and should not be used to write back.
- Using a single pipe for bidirectional communication will fail.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142705.png)
Solution:
- The parent needs to create both pipes before forking; if a pipe were created by the child, it would not be inherited by the parent, so the child could not use it to communicate back to the parent
- We can use two pipes for bidirectional communication between parent and child.
- The parent process uses pcfd to send data to the child process.
- The child process uses cpfd to send data to the parent process.
- In both cases, the calls to write() use index 1, and the read() calls use index 0 on the file descriptor array.
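The two-pipe arrangement can be sketched like this, reusing the names pcfd and cpfd from above. The upper-casing child and the function name `uppercase_via_child` are made up for illustration, and the sketch assumes a short message: the parent writes everything before it starts reading, which could deadlock if the message were larger than the pipe buffers.

```c
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg to a child over pcfd; the child upper-cases each byte and replies
 * over cpfd. Stores the reply in out (NUL-terminated) and returns its length,
 * or -1 on error. */
ssize_t uppercase_via_child(const char *msg, char *out, size_t cap) {
    int pcfd[2], cpfd[2];          /* parent-to-child, child-to-parent */
    if (cap == 0 || pipe(pcfd) == -1)
        return -1;
    if (pipe(cpfd) == -1) {
        close(pcfd[0]); close(pcfd[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(pcfd[0]); close(pcfd[1]);
        close(cpfd[0]); close(cpfd[1]);
        return -1;
    }

    if (pid == 0) {
        close(pcfd[1]);            /* child only reads from pcfd */
        close(cpfd[0]);            /* child only writes to cpfd */
        char c;
        while (read(pcfd[0], &c, 1) == 1) {
            c = (char)toupper((unsigned char)c);
            if (write(cpfd[1], &c, 1) != 1)
                _exit(1);
        }
        _exit(0);                  /* exiting closes the remaining ends */
    }

    close(pcfd[0]);                /* parent only writes to pcfd */
    close(cpfd[1]);                /* parent only reads from cpfd */

    size_t len = strlen(msg);
    ssize_t w = write(pcfd[1], msg, len);
    close(pcfd[1]);                /* lets the child see EOF on pcfd */

    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(cpfd[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';

    close(cpfd[0]);
    waitpid(pid, NULL, 0);
    return (w == (ssize_t)len) ? (ssize_t)total : -1;
}
```

Each process closes the two ends it does not use, so both directions reach EOF cleanly.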
Copy of a file descriptor: dup2()
The dup2() system call creates a copy of a file descriptor.
- If the copy is successfully created, then the original and copy file descriptors may be used interchangeably.
- They both refer to the same open file description and thus share file offset and file status flags.
int dup2(int oldfd, int newfd)
oldfd: the existing file descriptor to copy
newfd: the descriptor number that becomes the copy; if newfd is already open, dup2() closes it first.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130142544.png)
- The white table is the kernel-wide descriptor table
Bash uses dup2() with pipes to link commands together.
Example: ls | sort
- The ls process closes its read end of the pipe and links the write end to its standard output.
- The sort process closes its write end of the pipe and links the read end to become its standard input.
- After the dup2() calls, ls closes its original write-end descriptor and sort closes its original read-end descriptor.
Anything that ls writes to its standard output, sort would read from its standard input.
Notes:
- dup2() copies one descriptor table entry onto another descriptor; it does not copy any data
- Entries in a file descriptor table point to objects that the kernel maintains for you
- When you duplicate from 4 to 1, you are saying: whatever 4 was referring to, please make 1 refer to it too.
- Therefore writing to 1 and writing to 4 will have the same effect; they are both pointing to the same underlying object
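The effect of duplicating a pipe's write end onto descriptor 1 can be seen inside a single process. In this sketch (the function name is made up), standard output is temporarily pointed at a pipe, one write() to descriptor 1 is made, and the bytes are then read back from the pipe. It assumes the text is small enough to fit in the pipe buffer.

```c
#include <unistd.h>

/* Redirects stdout into a pipe with dup2(), writes text to descriptor 1,
 * restores stdout, and reads the text back from the pipe.
 * Returns the number of bytes read into out, or -1 on error. */
ssize_t capture_stdout_write(const char *text, size_t len,
                             char *out, size_t cap) {
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    int saved = dup(STDOUT_FILENO);        /* remember the real stdout */
    if (saved == -1 || dup2(fd[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);          /* descriptor 1 is now the only write end */

    ssize_t w = write(STDOUT_FILENO, text, len);   /* lands in the pipe */

    dup2(saved, STDOUT_FILENO);            /* restore; drops the write end */
    close(saved);

    ssize_t n = (w == (ssize_t)len) ? read(fd[0], out, cap) : -1;
    close(fd[0]);
    return n;
}
```

Nothing appears on the terminal during the call: the write to descriptor 1 went to whatever descriptor 1 referred to at that moment, which was the pipe.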
dup2 in action
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130143031.png)
- Imagine the left column wants to become ls and the right column wants to become sort
- Left column calls dup2(4, 1), so its standard output refers to the write end of the pipe
- Right column calls dup2(3, 0), so its standard input refers to the read end of the pipe
- The moment you call dup2(), you have two descriptors for the same write end, so you need to close one of them, always.
- You can get away with not closing an extra read descriptor, but you ALWAYS NEED to close the extra write descriptor.
- You can figure out whether you are a child or not from the return value of fork()!
- Yes, children can do different things!
- Note: the reader sees EOF only when every descriptor for the write end has been closed. If an extra write descriptor stays open, the reader never sees EOF, but you want it to, so that sort knows there is no more data to read.
- You want the readers to notice that the stream has ended
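Putting pipe(), fork(), dup2(), and exec together gives roughly what a shell does for `cmd1 | cmd2`. This is a sketch, not the code of any particular shell; the function name `run_pipeline` is made up, and it uses execvp (from the exec family listed earlier) so that each command can be passed as an argv array.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the equivalent of `left | right`.
 * Returns the exit code of right, or -1 on error. */
int run_pipeline(char *const left[], char *const right[]) {
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(fd[0]); close(fd[1]);
        return -1;
    }
    if (lpid == 0) {
        dup2(fd[1], STDOUT_FILENO);    /* stdout -> write end of the pipe */
        close(fd[0]);
        close(fd[1]);                  /* keep only descriptor 1 */
        execvp(left[0], left);
        _exit(127);                    /* exec failed */
    }

    pid_t rpid = fork();
    if (rpid == -1) {
        close(fd[0]); close(fd[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0) {
        dup2(fd[0], STDIN_FILENO);     /* stdin <- read end of the pipe */
        close(fd[0]);
        close(fd[1]);                  /* extra write end must go, or no EOF */
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent holds both ends too and must close them,
     * otherwise right would never see EOF. */
    close(fd[0]);
    close(fd[1]);

    int status;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For `ls | sort`, the caller would pass `{"ls", NULL}` and `{"sort", NULL}`. Note that three processes hold the pipe at some point (the parent and both children), and each one closes every descriptor it does not need.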
Limitations of pipes
- Unix pipes are "unidirectional" channels from one process to another, and cannot be turned into bidirectional or multicast channels.
- This limitation makes pipes difficult to use for complex inter-process communication.
- Bidirectional communication can be simulated using a pair of pipes, but
- Inconsistent buffering between both pipes can lead to complications.
- Pipes can only be shared between processes that have a common ancestor that creates the pipe.
- There is no way for a process to "reference" a pipe that it (or an ancestor) did not directly create via pipe.
- This restriction prevents many useful forms of inter-process communication from being layered on top of pipes.
- Pipes cannot be used for clients to connect to long-running services.
- Processes cannot open additional pipes to other processes that they already have a pipe to.
Notes:
- Pipes are unidirectional (only one way)
- Pipes have destructive reads: once a process reads data, that data is gone from the pipe.
- There are some request control protocols like ftp where you have a control channel and a data channel
- In this situation it becomes tough: even though you have a defined hierarchy, by the time the data channel needs to be created it is already too late, because the fork has already happened. This is inconvenient.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202143145.png)
- Is there a way to create a pipe between C and E?
- Find a common ancestor
- When A forks C or B, they inherit the ends of the pipe and A can close them
- When B does a fork, those two ends of the pipe are inherited by E so B can close them
- ...
- A needs to know that one of my children would eventually want to talk to one of my grandchildren
- There is a whole lot of planning that you need to address
Using Files for Inter Process Communication
Two programs can communicate with one another by using an intermediate medium such as a file.
davidkebo@linux:~$ ls
Desktop Documents Downloads Music Pictures Public Templates Videos code src-cloud src-cloud.zip
davidkebo@linux:~$ ls> content.txt
davidkebo@linux:~$ more content.txt
Desktop
Documents
Downloads
Music
Pictures
Public
Templates
Videos
code
content.txt
src-cloud
src-cloud.zip
- The program ls sends its output to the file "content.txt".
- The program more takes "content.txt" as an input.
- ls indirectly communicates with more via the file "content.txt".
FIFO
FIFO (named PIPE)
FIFO is a first-in, first-out message-passing IPC in which bytes are sent and received as unstructured streams. It is also known as a named pipe.
The named pipe is a POSIX pipe in contrast to anonymous pipes created with the pipe() system call.
FIFOs work by associating a filename with the pipe. Once created, any process (with correct access permissions) can access the FIFO by calling open() on the associated filename.
Once the processes have opened the file, they can use the standard read() and write() functions to communicate.
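A minimal sketch of the FIFO workflow described above: create the FIFO with mkfifo(), then have a writer and a reader each open it by filename. The function name and the fixed "ping" message are made up for illustration. The two processes here happen to be parent and child for brevity, but they find the FIFO only through its path, so unrelated processes could do the same.

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates a FIFO at path, has a child write "ping" into it, and reads the
 * message in the parent. Returns the number of bytes read, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, char *out, size_t cap) {
    if (mkfifo(path, 0600) == -1)      /* the FIFO appears in the file system */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        int wfd = open(path, O_WRONLY);    /* blocks until a reader opens */
        if (wfd == -1)
            _exit(1);
        ssize_t n = write(wfd, "ping", 4);
        close(wfd);
        _exit(n == 4 ? 0 : 1);
    }

    ssize_t n = -1;
    int rfd = open(path, O_RDONLY);        /* blocks until a writer opens */
    if (rfd != -1) {
        n = read(rfd, out, cap);
        close(rfd);
    } else {
        kill(pid, SIGKILL);                /* do not leave the writer blocked */
    }

    waitpid(pid, NULL, 0);
    unlink(path);                          /* remove the name when done */
    return n;
}
```

While the FIFO exists, `ls -l` on its path shows a file whose type character is `p`.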
- For words that start with 'zy' and end with 's'
### Linux command example 3 (cmd1 | cmd2)
...
### Linux command example 4 (cmd1 | cmd2)
...
**Notes**:
{{CODE_BLOCK_12}}
- Discard duplicated lines
- `head -3` returns the first 3 lines of the output file
### Unix philosophy
...
**Notes**:
- There was a time where pipes didn't exist in linux
- Philosophy before wast that as clients want more and more functionality, they just keep adding more and more features, and the code becomes unmanageable
- The philosphy of Unix with pipes is that you can write simple programs to do a single thing well, then use composition (assisted with pipes) to communicated with other processes to make bigger systems
### Linux command example 6 (cmd1 | cmd2)
**Notes**:
- This is exactly how your shell is creating the pipe command
## Forks
### UNIX `fork()`
...
**Notes**:
- The mental model is that when a process involves `fork()`, what it gets back is the process ID of the child process
- In Unix we can make a complete hierarchy of parent-child relationships
- When a process calls fork, another identical process is created, now you will return from fork and your copy will also return from fork.
### UNIX `fork()` example 1
...
**Notes**:
- When this program starts to execute, it first makes a fork
- Now two of you are executed!
- Now both of them are executing an identical program so both of them will print whatever the program prints
{{CODE_BLOCK_13}}
- They will both terminate at `return 0`.
- Every time a thread calls `fork()` it creates two copies.
### UNIX `fork()` example 2
{{CODE_BLOCK_14}}
- Once `fork()` returns there are actually two of you!
- Both of them will return
- Two separate process but they have identical behavior
{{CODE_BLOCK_15}}
- The number of times the message prints is equal to number of process created. 2𝑛 (𝑛 is the number of fork system calls)
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130135316.png)
- At some point the process calls `fork()`, the OS creates an identical copy of it, so both of them will return (at some time).
### An aside-`execl`
{{CODE_BLOCK_16}}
- Will take over your code segment, data stack, and recreate an address space with the contents of this program
- It will be able to look into an elf file which will have the complete description (i.e. data, code, etc.)
- In this example, this program gets replaced itself with the program `echo` and therefore is able to print "Hello World" to the terminal.
**exec** is a system call that replaces a process’s user address space with data read from
an executable file.
- Continuity of the process but replacement of its contents.
- The process keeps the same PID, open file descriptors, environment, etc.,
- But the program text and data are swapped out for a new image.
**Notes**:
- Your virtual address space will be overwritten and you basically become into the new address space
- The first argument is the name of the program itself
- This program can do different things depending on the first argument provided
- The reason that you need the NULL at the end is because this is ragargs
- When you `exec`, your file descriptor is the same?
Another example:
{{CODE_BLOCK_17}}
- We can tell shell to replace itself with the less program in that file
{{CODE_BLOCK_18}}
- Now when you close the less program your shell will also end!
- That shell is gone! it was overriden by the `execl` instruction
- So `execl` is a way to replace yourself
### Looking ahead-exec family of functions
See [online man page](https://man7.org/linux/man-pages/man3/exec.3.html). Only the first two are of interest to us at this time.
{{CODE_BLOCK_19}}
- There is only one system call, the others just invoke arguments and involve that one system call
- `execl`:
- Stands for the fact that the arguments that you will give to this program are in list form
- `execlp`:
- You do not have to get the full file name of the executable you want to replace yourself with, you just give the `ls` name, and is the function call that will find the correct file
- The shell will say: is there an `ls` in this directory? if not, is there in this other directory?
- Signifies file lookup.
- Gives you some independence of actually hard coding names of executables
### What is a file descriptor?
Imagine you're at a library.
- You request a book, but the librarian gives you a numbered slip. It's your handle to the book.
- You use the slip to get access to the book—you read, and eventually return the slip.
- The slip is like a file descriptor: a <span style="color:rgb(254, 134, 22)">small integer</span> the operating system gives you to refer to an open resource.
Key point: the descriptor is not the file itself, just a handle.
**Notes**:
- When you create a file, you are actually created some kernel data structure
- To communicate between kernel and user process you need file descriptors
- Integers in your file descriptor table will indicate read and write ends
- The only way to access a file is through the kernel because that file is in the disk which is not accessible by processes, so the kernel needs to handle it to you
**Analogy (library)**:
- You ask the librarian to read a book, instead what you get is a file descriptor, and it is through this file descriptor that you can read and write the book, you never actually get the book directly.
### What is a file descriptors actually?
- In every process there is a file descriptor table
- Normally start with 0, 1, and 2, these are the descriptors that you are born with
- If you create a socket (for network communications), the kernel exposes these things through the kernel to file descriptors.
- The only handles that you have to the pipe is through these file descriptors
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202140246.png)
- Unix maintains a giant file descriptor table and every process have their little file descriptor tables associated with them, all of the entries in the small descriptor table are pointing to the gigant file descriptor table, this is how different processes can refer to the same file.
- When a parent forks, you are not copying anything, just the pointers of the same file descriptor table?
- *Think of a file descriptor table as being an array pointing to things in a kernel data structure*.
### `fork()` and `pipe()`
It is not useful for one process to use a pipe to talk to itself. Typically, a process creates a pipe <span style="color:rgb(159, 239, 0)">just before</span> it forks more child processes.
The pipe is inherited by the children, and then used for communicating either between the parent & child processes, or between two sibling processes.
In the following program, the <span style="color:rgb(159, 239, 0)">parent writes</span> a message to the pipe.
The <span style="color:rgb(159, 239, 0)">child reads</span> from the pipe <span style="color:rgb(254, 134, 22)">1 byte at a time</span> until the pipe is empty.
**Notes**:
- The file descriptor tables of a parent and a child process are exactly the same
- When you did a fork actually the child also got a copy of the same pipe
- This kind of attributes are copied across a `fork()`
- Now A and B got the same pipe
- A file descriptor is an indirect way of referring to a pipe
- Though only one of them can actually have access to the pipe at a time
- Now the parent closes one end and the child closes the other end of the pipe, while letting the other side untouched.
- Now there is a single writer and a single reader on the pipe, this is important (you can only have one single writer but multiple readers)
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130142119.png)
### `fork()` and `pipe()` example 1
{{CODE_BLOCK_20}}
- When fork() returns, if it is successful, then both of you will be able to return.
- In the parent process fork() will return the process id of the child, but in the child it will return 0.
- Using this information now, the parent and the child can do different things
- **Parent**:
- The parent closes the read end of the pipe
- Then it writes into the write end of the pipe
- Then closes the pipe (<span style="color:rgb(254, 134, 22)">this is not enough</span>)
- **Child** (else clause):
- Keeps reading from the read end of the pipe
- Reads 1 byte and puts it in buff
- Unfortunately never returns when all the data in the pipe has finished
{{CODE_BLOCK_21}}
**Problem**:
The child process doesn't exit because `pipefds[1]` is open. The system assumes that a write could occur while the write end is still open, and the system will not report EOF.
- The child basically received the same two end points of the pipe
- The child does not see the end of file because the child is also holding the write end of the pipe, and the system assumes that a write could still occur.
- So the reader also needs to close its copy of the write end of the pipe
The <span style="color:rgb(254, 134, 22)">child blocks</span> and waits to read from `pipefds[0]`, and the operating system doesn't know that no process will be writing to `pipefds[1]`.
### `fork()` and `pipe()` example 2
{{CODE_BLOCK_22}}
{{CODE_BLOCK_23}}
- Now the read end reaches EOF.
**Note**:
- When the child process finishes reading all the bytes of data, we close the related file descriptors.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142351.png)
### Pipes for bidirectional communication
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142628.png)
**Problem**:
- Pipes are <span style="color:rgb(254, 134, 22)">unidirectional</span> and should not be used to write back.
- Using a single pipe for bidirectional communication will fail.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202142705.png)
**Solution**:
- The parent needs to create both pipes before calling `fork()`. If the child created a pipe, the parent would not inherit it, so the child could not use its descriptors to communicate back to the parent.
- We can use <span style="color:rgb(159, 239, 0)">two pipes</span> for bidirectional communication between parent and child.
1. The parent process uses `pcfd` to send data to the child process.
2. The child process uses `cpfd` to send data to the parent process.
- In both cases, the calls to `write()` use index 1, and the `read()` calls use index 0 on the file descriptor array.
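
The two-pipe arrangement above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the helper name `round_trip` and the upper-casing done by the child are assumptions made for the example, while `pcfd` and `cpfd` follow the names used above. It also assumes the message is small enough to fit in the pipe buffer, because the parent writes everything before it starts reading.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the lecture): the parent sends `msg` to the
 * child over pcfd, the child upper-cases each byte and sends it back over
 * cpfd. Assumes `msg` fits in the pipe buffer.
 * Returns the number of reply bytes stored in `reply`, or -1 on error. */
ssize_t round_trip(const char *msg, char *reply, size_t cap)
{
    assert(msg != NULL && reply != NULL && cap > 0);

    int pcfd[2]; /* parent -> child */
    int cpfd[2]; /* child -> parent */

    /* Both pipes are created BEFORE fork() so the child inherits them. */
    if (pipe(pcfd) == -1)
        return -1;
    if (pipe(cpfd) == -1) {
        close(pcfd[0]);
        close(pcfd[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(pcfd[0]); close(pcfd[1]);
        close(cpfd[0]); close(cpfd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: reads from pcfd[0], writes to cpfd[1]. */
        close(pcfd[1]);
        close(cpfd[0]);
        char c;
        while (read(pcfd[0], &c, 1) == 1) {
            c = (char)toupper((unsigned char)c);
            if (write(cpfd[1], &c, 1) != 1)
                _exit(1);
        }
        close(pcfd[0]);
        close(cpfd[1]);
        _exit(0);
    }

    /* Parent: writes to pcfd[1], reads from cpfd[0]. */
    close(pcfd[0]);
    close(cpfd[1]);

    size_t len = strlen(msg);
    ssize_t written = write(pcfd[1], msg, len);
    close(pcfd[1]); /* the child now sees EOF on pcfd[0] */

    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(cpfd[0], reply + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    reply[total] = '\0';

    close(cpfd[0]);
    waitpid(pid, NULL, 0);
    return written == (ssize_t)len ? (ssize_t)total : -1;
}
```

Each process closes the two ends it does not use right after the fork, so each pipe ends up with exactly one writer and one reader. Calling `round_trip("hello", buf, sizeof buf)` from a `main()` should leave the upper-cased text in `buf`.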
### Copy of a file descriptor: `dup2()`
**The `dup2()` system call creates a copy of a file descriptor.**
- If the copy is successfully created, then the original and copy file descriptors may be used interchangeably.
- They both refer to the same open file description and thus share file offset and file status flags.
{{CODE_BLOCK_24}}
`oldfd`: old file descriptor
`newfd`: new file descriptor which is used by dup2() to create a copy.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130142544.png)
- The white table is the kernel-wide descriptor table
**Bash uses `dup2()` with pipes to link commands together.**
Example: `ls | sort`
1. The ls process closes its `read` end of the pipe and links the write end to its standard output.
2. The sort process closes the `write` end of the pipe and links the read end to become its standard input.
3. After `dup2()`, ls closes its original `write`-end descriptor and sort closes its original `read`-end descriptor; the copies on standard output and standard input stay open.
Anything that `ls` writes to its standard output, `sort` would read from its standard input.
**Notes**:
- `dup2()` copies a descriptor table entry from one descriptor to another; it does not copy any data
- Entries in a file descriptor table are pointing to objects in the file system
- Something that the kernel is maintaining for you
- When you duplicate from 4 to 1, you are saying, whatever 4 was referring to, please copy it to 1.
- Therefore writing to 1 and writing to 4 will have the same effect, they are both pointing to the same underlying object
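
To make the "whatever 4 was referring to, please copy it to 1" idea concrete, here is a small sketch. The helper name `capture_stdout_write` is hypothetical and the descriptor numbers depend on what is already open in the process, so the code uses `fds[1]` and `STDOUT_FILENO` rather than literal 4 and 1. It assumes the message fits in the pipe buffer, since nothing reads the pipe until after the write.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the lecture): temporarily makes descriptor 1
 * refer to the write end of a pipe, writes `msg` through descriptor 1, then
 * restores stdout and reads the bytes back from the pipe's read end.
 * Returns the number of bytes stored in `out`, or -1 on error. */
ssize_t capture_stdout_write(const char *msg, char *out, size_t cap)
{
    assert(msg != NULL && out != NULL && cap > 0);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    fflush(stdout);                 /* keep buffered output out of the pipe */
    int saved = dup(STDOUT_FILENO); /* keep a way back to the real stdout */
    if (saved == -1 || dup2(fds[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* Descriptor 1 and fds[1] now refer to the same write end. */
    close(fds[1]);                  /* close the extra write descriptor */

    size_t len = strlen(msg);
    ssize_t written = write(STDOUT_FILENO, msg, len);

    /* Restoring stdout replaces descriptor 1, which closes the last write
     * end of the pipe, so the read loop below can reach EOF. */
    int restored = dup2(saved, STDOUT_FILENO);
    close(saved);
    if (restored == -1) {
        close(fds[0]);
        return -1;
    }

    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);

    return written == (ssize_t)len ? (ssize_t)total : -1;
}
```

The bytes written to descriptor 1 come out of the pipe's read end, which shows that after `dup2()` both descriptors refer to the same underlying object.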
### `dup2` in action
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260130143031.png)
- Imagine the left column wants to become `ls` and the right column wants to become `sort`
- Left column calls `dup2(4,1)`
- Right column calls `dup2(3,0)`
- After `dup2`, there are two descriptors for the write end, so you always need to close one of them.
- You can get away with not closing an extra read descriptor, but you ALWAYS NEED to close the extra write descriptor.
- You can figure out whether you are a child or not!
- Yes, children can do different things!
- **Note**: closing the last write descriptor is what delivers `EOF` to the reader. If an extra write descriptor stays open, the reader won't see it, but you actually want the reader to see it so that `sort` knows there is no more data to read.
- You want the readers to notice the stream has ended
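
The steps above can be put together into a rough sketch of how a shell might wire up `ls | sort`. This is an illustration under assumptions, not how Bash is actually implemented: the helper name `run_pipeline` is hypothetical, and error handling is kept minimal.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the lecture): runs `left | right`, where
 * both arguments are NULL-terminated argv arrays.
 * Returns the exit status of the right-hand command, or -1 on error. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (lpid == 0) {
        /* Left command: stdout becomes the write end of the pipe. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]); /* close the extra write descriptor */
        execvp(left[0], left);
        _exit(127);    /* reached only if execvp() failed */
    }

    pid_t rpid = fork();
    if (rpid == -1) {
        close(fds[0]);
        close(fds[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0) {
        /* Right command: stdin becomes the read end of the pipe. */
        close(fds[1]); /* otherwise this process would never see EOF */
        if (dup2(fds[0], STDIN_FILENO) == -1)
            _exit(127);
        close(fds[0]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends too, or the reader never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With `char *l[] = {"ls", NULL};` and `char *r[] = {"sort", NULL};`, calling `run_pipeline(l, r)` behaves like `ls | sort`. Note that three processes hold the write end right after the forks (the parent and both children), and all but the `ls` side must close it before `sort` can see the end of its input.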
### Limitations of pipes
- Unix pipes are "unidirectional" channels from one process to another, and cannot be turned into bidirectional or multicast channels.
- This limitation makes pipes difficult to use for complex inter-process communication.
- Bidirectional communication can be simulated using a pair of pipes, but
- Inconsistent buffering between both pipes can lead to complications.
- Pipes can only be shared between processes that have a common ancestor that creates the pipe.
- There is no way for a process to "reference" a pipe that it (or an ancestor) did not directly create via **pipe**.
- This restriction prevents many useful forms of inter-process communication from being layered on top of pipes.
- Pipes cannot be used for clients to connect to long-running services.
- Processes cannot open additional pipes to other processes that they already have a pipe to.
**Notes**:
- Pipes are unidirectional (only one way)
- Pipes have destructive reads, once a process reads the entirety of the data, that data is gone.
- There are some request control protocols like `ftp` where you have a control channel and a data channel
- In this situation it becomes tough: even though you have a defined hierarchy, by the time the data channel needs to be created it is already too late, because the fork has already happened. This is inconvenient.
/CSCE-313/Lecture/Visual%20Aids/Pasted%20image%2020260202143145.png)
- Is there a way to create a pipe between C and E?
- Find a common ancestor
- When A forks C or B, they inherit the ends of the pipe and A can close them
- When B does a fork, those two ends of the pipe are inherited by E so B can close them
- ...
- A needs to know that one of my children would eventually want to talk to one of my grandchildren
- There is a whole lot of planning that you need to address
### Using Files for Inter Process Communication
Two programs can communicate with one another by using an intermediate medium such as a file.
{{CODE_BLOCK_25}}
- The program `ls` sends its output to the file "content.txt".
- The program `more` takes "content.txt" as an input.
- `ls` indirectly communicates with `more` via the file "content.txt".
# FIFO
### FIFO (named PIPE)
FIFO is a `first-in`, `first-out` message-passing IPC in which bytes are sent and received as `unstructured streams`. It is also known as a named pipe.
The named pipe is a POSIX pipe in contrast to anonymous pipes created with the `pipe()` system call.
FIFOs work by associating a filename with the pipe. Once created, any process (with correct access permissions) can access the FIFO by calling `open()` on the associated filename.
Once the processes have opened the file, they can use the standard `read()` and `write()` functions to communicate.
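
A minimal sketch of this open-by-name workflow is shown below. The helper name `fifo_send_receive` and the path passed to it are assumptions for illustration. A `fork()` is used only to get two processes for the demo: the child inherits no descriptor for the FIFO and finds it purely through its filename, as an unrelated process would.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the lecture): creates a FIFO at `path`,
 * lets a child process write `msg` into it, and reads the message back in
 * the parent. Returns the number of bytes stored in `out`, or -1 on error. */
ssize_t fifo_send_receive(const char *path, const char *msg,
                          char *out, size_t cap)
{
    assert(path != NULL && msg != NULL && out != NULL && cap > 0);

    if (mkfifo(path, 0600) == -1) /* fails if `path` already exists */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        /* Writer: open() blocks until some process opens the FIFO to read. */
        int wfd = open(path, O_WRONLY);
        if (wfd == -1)
            _exit(1);
        size_t len = strlen(msg);
        ssize_t w = write(wfd, msg, len);
        close(wfd);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* Reader: open() blocks until the writer has opened the FIFO. */
    int rfd = open(path, O_RDONLY);
    if (rfd == -1) {
        kill(pid, SIGKILL); /* the writer would otherwise block forever */
        waitpid(pid, NULL, 0);
        unlink(path);
        return -1;
    }

    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(rfd, out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';

    close(rfd);
    int status = 0;
    waitpid(pid, &status, 0);
    unlink(path); /* remove the FIFO's name from the file system */

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)total;
}
```

Unlike an anonymous pipe, the FIFO persists in the file system until it is unlinked, which is why the sketch removes it at the end.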