Part 1: Processes vs. Threads
Multi-process programming on Linux with C
Let us look at a simple program that demonstrates the use of processes in Linux. Download and open the processes.c file and study the use of the fork() system call and its return values.
Note the wait(NULL) call by the parent process. This call blocks the parent until one of its child processes terminates; a parent with several children must call wait() once per child to wait for all of them. If the parent terminates while the child is still running, the child is called an orphan process.
• Compile the code in a terminal (console):
gcc -o processes processes.c
• Run the program in a terminal (console):
./processes
Exercise 1
Compile and run processes.c. Observe the output. Why is the line “We just cloned a process..!” printed twice? Fix the code such that the line only prints once.
Creating and terminating threads
Listing 1: Snippet of threads.c
for(t=0;t<NUM_THREADS;t++)
{
    printf("main thread: creating thread %ld\n", t);

    //pthread_create spawns a new thread and returns 0 on success
    rc = pthread_create(&threads[t], NULL, work, (void *)t);
}
threads.c contains a simple example of creating (spawning) threads with the pthread library and terminating them. In threads.c, the loop runs NUM_THREADS times and calls the pthread_create function to create/spawn new threads.
pthread_create takes in four arguments:
1. thread – Reference to a thread variable of type pthread_t (element in threads array in this example)
2. attr – Thread attributes
3. start_routine – The function to be executed by the newly spawned thread (function work in this example)
4. arg – The single void * argument passed to the thread function (t in this example). When the thread function needs multiple arguments, we can pass a pointer to a structure instead of a single variable.
• To find out more about different C functions, you can use the manual in command line:
man 3 pthread_create
• Compile the code in a terminal (console):
gcc -pthread -o threads threads.c
• Run the program in a terminal (console):
./threads
Exercise 2
Compile and run the threads.c program. Observe the output. Modify the NUM_THREADS value and observe the order of thread execution. Do threads execute in the same order they are spawned each time the program runs? Is the final value of the variable counter always the same? Explain.
Part 2: Process and Thread Synchronization
A critical section is a section of code that uses mutual exclusion to ensure that:
• Only one thread at a time can execute in the critical section
• All other threads have to wait on entry
• When a thread leaves a critical section, another can enter
A race condition happens when two concurrent threads (or processes) access a shared resource without any synchronization. Race conditions arise in software when an application depends on the sequence or timing of processes or threads to operate properly.
A race condition can also happen when the result of the program depends on the order in which the threads enter the critical section. These inconsistent results occur because multiple processes/threads share resources (e.g. arrays, variables, files).
Process Synchronization with Semaphores
Download and study the semaph.c program, which illustrates the use of semaphores for synchronizing Linux processes. To manage inter-process communication, we need to create a shared memory space. This shared memory needs to be destroyed when the program completes. Observe the use of semaphore-related function calls in the code. You may refer to the man page of each function call to learn more.
• Compile the code in a terminal (console):
gcc -pthread -o semaph semaph.c
• Run the program in a terminal (console):
./semaph
Thread Synchronization with Mutexes and Condition Variables
Download and study the race-condition.c program, which illustrates a multithreaded program with a race condition. race-condition.c demonstrates manipulation of a shared global_counter by multiple threads. There are 4 ADD threads that each increment the global_counter by 1, and 4 SUB threads that each decrement the global_counter by 1. At the end, the program should print the final value of the global_counter. Please note that sleep(rand() % 2) is called inside the add and sub functions to delay their completion by zero or one second.
• Compile the code in a terminal (console)
gcc -pthread -o race race-condition.c
• Run the program in a terminal (console)
./race
Exercise 3
Compile and run the race-condition.c program. Observe the output.
You should notice that the final value of the global counter is printed before all threads have completed, because the main thread does not wait for them to finish.
pthread_join is a pthread library function that blocks the calling thread until the target thread has terminated. For example, in the race-condition.c program, if the main thread calls pthread_join for all the ADD and SUB threads before printing the final result of the global variable, we should see the real final value after all ADD and SUB threads have completed.
int pthread_join(pthread_t thread, void **retval);
// example
pthread_join(thread, NULL);
Exercise 4
Modify race-condition.c (new name race-condition-ex4.c) to ensure that all ADD threads and SUB threads complete before printing the final result. Compile, run, and observe the output. (run multiple times to see if the output is consistent)
Although the main thread now waits for all threads to finish, you may still see a wrong final result. Each ADD thread increments the counter by 1 and each SUB thread decrements the counter by 1, and there are equal numbers of each. Thus, the final value of the global_counter should equal its initial value of 10. The inconsistent result is caused by a race condition on global_counter.
Mutexes
A mutex is a synchronization construct that is used to control access to a critical section in the code. A mutex variable acts like a lock, and the thread that acquires the mutex gets to access the critical section. Once a thread has acquired a mutex lock to a critical section, no other thread can acquire it until the first thread releases the mutex.
pthread mutex example:
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock(&lock);
// critical section here
pthread_mutex_unlock(&lock);
Exercise 5
Modify race-condition.c (new name: race-condition-ex5.c) by adding a mutex variable to control access to the global_counter. Compile, run, and observe the output. (run multiple times to see if the output is consistent)
Condition variables
Mutexes provide a mechanism for controlling access to a critical section and preventing races. However, they cannot be used to make a thread wait until another thread completes some arbitrary task. Condition variables provide a mechanism for threads to be signaled by other threads rather than continuously polling to check whether a certain condition has been met. Condition variables are always used together with a mutex variable. Related pthreads functions:
• Create and destroy:
pthread_cond_init(condition, attr)
pthread_cond_destroy(condition)
• Waiting and signaling:
pthread_cond_wait(condition, mutex)
pthread_cond_signal(condition)
pthread_cond_broadcast(condition)
Download and study cond.c which demonstrates use of condition variables. The main thread creates three threads. Two of those threads increment a “count” variable, while the third thread watches the value of “count”. When “count” reaches a predefined limit, the waiting thread is signaled by one of the incrementing threads. The waiting thread “awakens” and then modifies count. The program continues until the incrementing threads reach TCOUNT. The main program prints the final value of count.
Exercise 6
Modify race-condition.c (new name: race-condition-ex6.c) using condition variables to prevent SUB threads from executing until all ADD threads are completed.
Further reading and examples: https://computing.llnl.gov/tutorials/pthreads/
Part 3: Producer-Consumer Problem
In this part we solve the producer-consumer problem using (i) processes and semaphores, and (ii) threads, mutexes, and condition variables.
Recall the producer consumer problem from the lecture. Let us limit the scope of the problem as follows: There are two producers and one consumer. Each producer generates a random number between 1 and 10 (inclusive) and inserts (writes) to producer_buffer. Consumer reads (consumes) these numbers one at a time and updates the sum of all numbers consumed in consumer_sum variable. The producer_buffer can store only 10 numbers and the producers are not allowed to overwrite any unconsumed number.
Exercise 7
prod-con-threads.c is a basic code skeleton for producer consumer problem without synchronization constructs. Implement the above-mentioned producer-consumer scenario with synchronization using pthreads, mutexes and/or condition variables starting from the code skeleton (prod-con-threads.c). You may add any number of synchronization constructs and other functions as required.
Implementing the same producer-consumer logic with processes involves requesting shared memory from the kernel, because separate processes do not share global variables and need such memory for inter-process communication. Refer to semaph.c for an example that uses shared memory with processes.
Exercise 8
Implement the above-mentioned producer-consumer scenario with synchronization using processes and semaphores. The very basic approach should be as follows:
// allocate shared memory
// allocate semaphores
if (fork()) producer(); // producer 1
if (fork()) producer(); // producer 2
consumer();
// cleanup shared memory
Exercise 9
Limit the total number of items (numbers) produced by the producers to 100 and measure the time taken to complete the program for both cases (processes and pthreads). Comment on your observations.
Part 4: Debugging
Viewing Processes and Threads
To view the running processes and threads in a Linux console, we can use the ps and top commands. These commands should be invoked in a separate terminal window.
To see a detailed list of the processes running on your system, run any of the following commands in a terminal:
• ps -ef
• ps -A
• top
If too much information is printed to read at once, you can pipe the output through the less command to scroll through it at your own pace:
ps -A | less
More information on ps: http://man7.org/linux/man-pages/man1/ps.1.html or type in man ps in the console.
To list individual threads under each process:
top -H
More information on top: http://man7.org/linux/man-pages/man1/top.1.html or type in man top in the console.
You may also try htop, an improved version of top which supports advanced visualization features.
To kill a running process use either one of these commands:
• kill <pid>
• pkill
• killall
Debugging C Programs
There are multiple tools available for debugging C programs. The gdb debugger is a command line debugger for C (and many other languages). To use the gdb debugger, we need to compile the source code with the -g compiler flag. (When you compile with -g, the compiler includes debugging information in the binary, which lets gdb relate the running program to source lines and variable names.) gdb provides debugging features such as adding breakpoints, step execution, and examining the call stack.
• Compiling the code in a terminal (console)
gcc -g -o prog prog.c
• Invoke gdb
gdb prog
• Run the program inside gdb
run <prog argument 1> <prog argument 2>
Resources on gdb
• Gdb tutorial from UChicago https://www.classes.cs.uchicago.edu/archive/2017/winter/51081-1/LabFAQ/lab2/gdb.html
• Official gdb documentation https://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_toc.html
Valgrind is a more advanced tool suite that helps us debug applications as well as detect performance issues.
It includes advanced features such as detecting race conditions (for example with its Helgrind tool) and false sharing.