12 - Thread Synchronization
Class: CSCE-313
Notes:
Atomic Operations
An atomic operation either "runs to completion" or "not at all".
- It is indivisible: in the sense that during the operation done by
- B cannot "observe" or "act" on the state under modification by A.
- Fundamental building block-without atomic operations we have no way for threads to work together and make sense of the resulting computation
Each instruction in the x86 instruction set is atomic. An instruction fully finishes before the current process/thread can be preempted/interrupted.
Notes:
- An operation that either runs to completion OR not at all.
- Its intermediate state i snot visible to anybody else outside
- Isolation: exectuing a thread completely isolated from other threads
Synchronization Variable - Lock to Provide Mutual Exclusion
- First step towards making shared data thread-safe
- The idea is to make a sequence of instructions "atomic" prevent two threads from executing the block concurrently
- The idea is to make multiple instructions atomic to stop context switching from being useful
General idea:
/CSCE-313/Lecture/Visual%20Aids/image-38.png)
/CSCE-313/Lecture/Visual%20Aids/image-39.png)
Notes:
- We are going to place a lock around
*p= p* = 1for example in 11 - Concurrency and Threads#Race Condition Demonstration - This guarantees that only one of them acquires the lock, which will prevent everybody else from being in the critical section at the same time
- When the thread that has the lock unlocks the lock, then one other thread can enter that critical section.
- This is the developers responsibility: Identifying shared state and critical sections
- In this case, if x had a certain value (x = 0) and then t1 and t2 both wanting to increment it, no matter which ones runs first, the critical sections will be serialized.
What is a mutex?
- A mutex is a lock that we acquire before using a shared resource and releasing it after use.
- When the lock is "set", no other thread can access the locked region of code
[advisory]. - It is used to protect data or other resources from concurrent access.
- A mutex lock (pthreads) can only be released by the thread that locked it.
Notes:
- Mutex = mutual exclusion
- Purpose: While one entity has the mutex, another entity cannot get that mutex (cannot acquire the lock at the same time)
- This is basically you making sure that the critical section is serialize, this is called Advisory.
- Your OS or compiler has no idea what your app semantics are
- Only you understand what is consistent for your application
What is a critical section?
- A critical section is a segment of a program that multiple threads should not access simultaneously.
- The segment needs to be executed exclusively & atomically, such as accessing a resource (file, input or output port, global data, etc.)
- It contains shared variables or resources that need to be synchronized to maintain consistency of data variables.
- In concurrent programming, if one thread tries to change the value of shared data at the same time as another thread tries to read the value (i.e., data race across threads), the result is unpredictable.
Notes:
- The segment of our program that we need to protect
- Bad things happen when we do not protect it (i.e. overwrites)
- You want to protect code A using a lock and code B using the same lock so that these sections are serialized (happen in order)
Mutex thread synchronization
| Thread |
Thread |
|---|---|
| mutex lock | mutex lock |
| critical section | critical section |
| mutex unlock | mutex unlock |
Notes:
- Thread 0 executes completely isolated from thread 1
- In general if the only thing you are doing is read(), then you can get away without locking.
- But if you are writing, this is where critical sections happen
- It is the entire read-write section that needs to be protected
- Critical section should include the read that derived the thing you will write to
Mutex thread synchronization
...
Mutex in C++ for Thread Safety
...
...
Notes:
- If they don't use the same mutex, then you are not serializing anything
- This is just a big heavy duty block that basically has one thread finishing, then the second thread executing or the other way around.
Now when we do:
./a.out 1000000
data = 2000000
- You will be able to see a deterministic output
...
Mutex in C++ -- Finer Locking
...
Notes:
- We can only protect the read-modify-write section
- It produces the same correct result
- So yes, we can put locks inside loops to do fine locking
When we do:
./a.out 1000000
data = 2000000
- Once again we are in happy territory
Timeline of Coarse/Fine Grained Locking
Notes:
- Coarse grained: A thread that acquire a lock right at the beginning of the function, and then releases it at the end
- Fine grained: you have much more locks and unlocks, since these happen between small sections of code that are critical
Producer-Consumer Synchronization
...
Notes:
- Mutual exclusion is not enough!
- It is good but for shared variables we may need other types of primitives to be able to do more general things
- Here is an example if a Queue
- You have multiple consumers and multiple producers
- You can already see that there are shared variables between consumer and producer
- producer needs to touch head/tail and size to add an element
- consumer as it pulls an item from the list needs to change the size and look at head/tail.
- What if both of them read tail at the same time and both try actions at the same time?
- The order in which things happens, you can actually lose elements
- This sounds like a problem for mutex
- Only one guy can be in the mutation at any given time
Producer-Consumer: problems
/CSCE-313/Lecture/Visual%20Aids/image-27.png)
- There’s a race condition in accessing the list if there are multiple producers
- Push-back at a queue end may not be thread-safe.
- It may require incrementing size, which may not be thread-safe.
- If there are multiple consumers, more than one thread may pass the test even if there is only one item in the queue
- Two threads may call
list.pop().
- Two threads may call
Notes:
- If the list size is 0, they do a busy wait
- When both think the queue is now empty, they can both try to read/write to the list, which is where bad things happen
- You might say: I do not need to protect anything because these are thread safe. But actually the problem is not that this is thread safe or not, it is that thread safety has gone out of the window much earlier.
- Both think that there are elements in the queue that they can pull off. The fact that you have multiple threads that both determine that there is at least one element in the queue and add an element in the queue is a violation. Now they basically both invoke list.pop, and depending on how well it is written, that can be our problem.
Solution:
You might attempt to solve multiple consumer’s race condition by trying to do
the following:
/CSCE-313/Lecture/Visual%20Aids/image-28.png)
This leads to a deadlock when list is empty, because “consumer” is blocked
with mutex held.
- Consumer holds the lock while the queue is empty,
- Producer must wait for lock to add data to the queue.
It would be nice if the lock could somehow be transferred to another thread while one consumer thread is “stuck”.
Notes:
- We are guaranteeing that only one thing sis at that critical section at any given time
- Mutexes do not give us dependence order, the order in which to execute things
- For this we need a new primitive
- Example:
- Lets say that a consumer tried to pull an item from the que and found that the size of the queue was empty, now they will just wait for something to appear but since they are holding the mutex, that won't never happen! the producer is not able to run!
- You are deadlock!
- Lets say that a consumer tried to pull an item from the que and found that the size of the queue was empty, now they will just wait for something to appear but since they are holding the mutex, that won't never happen! the producer is not able to run!
- Mutexes are not enforcing an ordering
- The only way I can actually test whether I can consume or not is by acquire the lock, but now when I determine that I can actually not do my thing, I want to be able to relinquish this lock and try again later.
- One thing that the consumer can do is to throw an error here, and have the higher level program call consumer again.
- But this i still a busy wait, we are just trying again repeatedly!
- The only way a consumer can progress is if a producer runs, but since you are holding the lock, nothing else can run, so you are essentially deadlock
- If you have written your mutexes correctly, if you return an exception you will be able to unlock the consumer
- But this is still a busy/pooling situation
- Takeaway: Mutexes give you mutual exclusion but they do not directly provide serialization and ordering!
What if...
- we determine that we cannot proceed (even though it is holding the lock)
- It would be nice is that at this point it could release the lock and go off to sleep
- So instead of a busy wait, you atomically release the mutex and go to sleep
- The producer can actually define a signal that can tell the consumer to wake up! when waking up the consumer acquires the mutex
- It still have mutual exclusion, and you can proceed!
- It turns out that if the consumer wakes up and turns out there are 0 elements in the list, once again, it releases the mutex and goes to sleep
What we need is "condition variables"
A condition variable (CV) is
- a queue of waiting threads, with
- a lock for access to CS (multiple CVs per lock is ok)
wait(cond_t *cv, mutex_t *lock)
- assumes lock is held when wait() is called
- puts caller to sleep & releases the lock atomically
- when awoken, reacquires lock before returning
signal(cond_t *cv)
- wake a single waiting thread (if >= 1 thread is waiting)
- if there is no waiting thread, just return w/o doing anything
/CSCE-313/Lecture/Visual%20Aids/image-29.png)
Notes:
- Think of a condition variable as of being a queue were threads can wait, and that it is associated with a lock
- When you call wait(), you will release the lock and park yourself in the queue corresponding to this condition variable
- The OS will pick one random process from the queue later, the first thing the chosen thread will do is to acquire the mutex.
- The consumer calls wait on this condition vraiable when there is no items on the lists
- This says: here is this mutex, here is this queue -> release the mutex and put me on the queue
- Now what the producer does is that it signals on the condition variable
- This says:
- if tehre is someone waiting on the queue -> please wake them (one) up
- if there is none waiting on the queue -> that signal is lost
- This says:
- What about having multiple waits?
- You get blocked on the first wait() you call, if it is the case that you wake up from that wait you can continue calling other waits one at a time.
- You can't call wait() twice in an object!
- The first wait() you call you are already blocked!
- There are no nested waits
- This functions as some sort of agreement between the waiter and the signaler
- This mechanism is so general that you can basically encode any condition that you want basically
- Remember:
- When you call wait() you better hold this mutex
- Whenever somebody signals this condition variable, you either pick one or eveyone waiting on this queue, but the first thing they will do is to acquire this mutex for this critical section
Example:
void Consumer () {
mutex.lock();
while (list.size() == 0) {
mutex.unlock();
sleep until list = empty;
mutex.lock();
}
int data = list.front();
list.pop();
mutex.unlock();
}
void Producer () {
int data = 0;
while (true) {
data = produce_data();
mutex.lock();
list.push_back(data);
mutex.unlock();
nofify_consumer();
}
}
/CSCE-313/Lecture/Visual%20Aids/image-30.png)
Notes:
- Once the lock is acquired, it tests again
- Between releasing the lock and sleep there is a lot of space
- Note if the signal is lost and the consumer goes to sleep, it may potentially go to sleep forever
- It is dangerous because when you release the lock, you need to make this things atomic
- This is so that if there is a signal, then that signal is not lost
"Broadcasting" on a condition variable
/CSCE-313/Lecture/Visual%20Aids/image-31.png)
- Two burgers already available in the warmer.
- 5 consumers show up and place their orders.
- 2 consumers get their burgers—the remaining 3 start playing.
- When each burger is ready:
- The producer notifies all waiting consumers.
- All consumers queue up to the counter—but the first one gets the burger.
- The remaining consumers return to their tables and continue playing.
Notes:
- Waking everybody = broadcasting on the condition variable
- Since you have two burgers in the burner, two consumers are automatically satisfied, but the remaining 3 are parked on the condition variable queue
- Every time that the cook produces 1 or more burgers, he sends a broadcast signal to the condition variable queue
- They all now compete to get the mutex, only one of them succeeds, and that client gets the burger and the rest will go back to sleep: they will acquire the mutex, they will find that the condition is false (there are no burgers) and they will release and go to sleep.
- The advantage of the broadcaster is that if you happen to produce 2 burgers you just need to wakeup everybody at the same time and you will know that 2 of them will get served automatically
- Be sure that your Exam will have condition variables. This is a fundamental thing and are very useful!
A detour through the C++ Thread API
/CSCE-313/Lecture/Visual%20Aids/image-32.png)
#include <iostream>
#include <string>
#include <thread>
void hello_worldstring s {
cout << s << std::endl;
}
class HelloWorld {
public:
void hello_worldstring s {
cout << s << " from " <<
"HelloWorld.hello_world." << endl;
}
void operator string s) const { (2
cout << s << "from " <<
" HelloWorld.operator()." << endl;
}
};
int main() {
std::string s = "Hello World";
thread t1(hello_world, s); (1)
HelloWorld hw;
thread t2(hw, s);
thread t3hello_world, &hw, s;
std::thread t4([s] {
cout << s << "from a lambda." << endl;
});
t1.join();
t2.join();
t3.join();
t4.join();
}
Notes:
- When your thread object is created, you are already runnable!
- Having a method in a class that implements the operator method
- Then another way to create a thread is by giving an instance of that class and what is going to happen is that the function that is going to be invoked is the operator?
- The way you provide the self is to provide the pointer to the argument
- For us the most important use is for lambdas (anonymous function)
- This anonymous function takes no arguments and captures the variable
sfrom its scope- C++ did a slightly different design to lambdas
- You will copy by value
- Captures s within the scope of you lambda
Producer-Consumer: Correctly
/CSCE-313/Lecture/Visual%20Aids/image-33.png)
Step 1: Declare a condition variable and a mutex
Step 2: The producer produces data in a way such that the consumer can notice e.g., pushing into the vector that consumer checks
Step 3: Calls notify_one/all() on the condition to wake up the consumers so that they can check again
Notes:
- The highlighted in consumer allows you to say: now I have a scope block and if I ever exit I will release the lock? And when somebody wakes me please run this lambda, if this lambda is true, only then return me from the wait, if this lambda is not true, then our condition is not satisfied so release the lock again and go back to sleep
- This is sort of encapsulating that whole while loop!
- What the consumer is doing is that before you can touch the shared data structure, it needs to acquire a lock that it shares with the producer, so that they can exchange data.
- C++ has exceptions, so it is dangerous to do things like
m.lock(), you might actually end up five levels above while you still hold the mutex - The suggested usage is to create a scope lock.
- You lock the mutex inside a type class called
unique_lock. - No matter how you exit from this function you will release the lock
- You lock the mutex inside a type class called
Produce-Consumer
/CSCE-313/Lecture/Visual%20Aids/image-34.png)
Step 3: The consumer(s) do the following:
- Calls
wait()on the condition Wait()needs a “wrapped” lock (as unique_lock) and a predicate function- The “wrapper” also locks the lock
- Consume data
- Unlock the lock
- The “wrapper” also locks the lock
An example producer-consumer scenario
/CSCE-313/Lecture/Visual%20Aids/image-35.png)
Producer-consumer with bounded size?
/CSCE-313/Lecture/Visual%20Aids/image-36.png)
Notes:
- You make another condition variable that signals when...
- You have application layer conditions:
- Two condition variables.
- can_write
- can_read
- You program your application so that it notifies these condition variables correctly
- First see wether the producer can proceed (if size of the queue is less than the produced size)
- Make sure that there is at least one item in the queue
- Two condition variables.
- The fact that you can encode any condition in a condition variable, makes them very general
Producer-Consumer in C
#define QUEUE_SZ 10
int queue[QUEUE_SZ];
int queue_hd = 0, queue_tl = 0, queue_nitems = 0;
pthread_mutex_t m;
pthread_cond_t can_write, can_read;
void*producer(void *pv) {
static int counter = 1;
while (1) {
pthread_mutex_lock(&m);
while(queue_nitems == QUEUE_SZ) {
...
}
...
}
}
Cond Var I
int done = 0;
void *child (void *arg) {
printf ("child\n");
done = 1;
return NULL;
}
int main (int argc, char *argv[]) {
pthread_t p;
printf ("parent: begin\n");
pthread_create (&p, 0, child, 0);
while (done == 0);
printf ("parent: end\n");
return 0;
}
Notes:
- pthread_t gives you the thread id;
- 0 basically says give me a standard thread
&pis a pointer to a function- It returns anything (a generic pointer), you as an application know what pointer type you are passing and what pointer type you are getting back. You know what you are going to be passed.
- You will have to cast it to that appropriate type before we can use it
- You cannot provide several arguments to child, to do that you will need to create a struct and pass a pointer to that struct.
Cond Var II
void *child (void *arg) {
printf ("child\n");
done = 1;
pthread_cond_signal(&cv);
return NULL;
}
int main (int argc, char *argv[]) {
pthread_t p;
printf ("parent: begin\n");
pthread_create(&p, 0, child, 0);
while(done == 0) {
pthread_cond_wait(&cv, &m);
}
printf("parent: end\n");
return 0;
}
Notes:
- The parent creates a child, and blocks itself on a condition variable queue
- Problem: The child sets done to 1, signals the condition variable, but that signal is lost because the parent is not blocked yet, now the parent blocks itself in the condition variable queue and it is blocked forever
Cond Var III
void *child (void *arg) {
printf ("child\n");
pthread_mutex_lock(&m);
pthread_cond_signal(&c);
pthread_mutex_unlock(&m);
return NULL;
}
int main (int argc, char *argv[]) {
pthread_t p;
printf ("parent: begin\n");
pthread_create (&p, 0, child, 0);
pthread_mutex_lock(&m);
pthread_cond_wait(&c, &m);
pthread_mutex_unlock(&m);
printf ("parent: end\n");
return 0;
}
Notes:
- Now we acquire mutexes before going to sleep!
- Unfortunately either this won't work (because any can execute first)
- You have removed state, which was used to determine whether the parent should sleep or not
- Now you are completely dependent of...
- What if we put the lock before doing
pth_create? - The point: it is very tricky to reason about concurrency
Cond Var IV (works)
void *child (void *arg) {
printf ("child\n");
othread_mutex_lock(&m);
done = 1;
pthread_cond_signal(&c);
pthread_mutex_unlock(&m);
return NULL;
}
int main (int argc, char *argv[]) {
pthread_t p;
printf ("parent : begin\n");
pthread_create (&p, 0, child, 0);
othread_mutex_lock(&m);
while (done == 0)
pthread_cond_wait(&c, &m);
pthread_mutex_unlock(&m);
printf ("parent: end\n");
return 0;
}
Notes:
- Now it is testing for a state!
- Now guarantees that once it blocks in the condition variable, the child has not got executed
What if the two operations are not atomic?
release(mutex);
add yourself to the Q;
Lost update problem
...