$25
Part I: Game of Life
In this assignment you will implement a parallel version of the Game of Life in OpenMP. The universe of the Game of Life is a cellular automaton, in which cells live on a 2-dimensional world. They are born, live and die over successive generations. The world is defined as a binary-valued array, and each generation evolves according to the following rules:
• Each cell can be one of two possible states: alive or dead.
• Every cell interacts with its eight neighbours, which are the cells that are horizontally, vertically and diagonally adjacent.
• At each step in time, the following transitions occur:
- Any live cell with fewer than two live neighbours dies, as if caused by under-population.
- Any live cell with two or three live neighbours lives on to the next generation.
- Any live cell with more than three live neighbours dies, as if by overcrowding. - Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.
The first generation can be created randomly and the successive generations will follow the above rules simultaneously to every cell. Each generation is a pure function of the preceding one. The rules continue to be applied repeatedly to create further generations. Here is the pseudocode for the Game of Life:
for t in 0:T-1 //time step loop forall (i,j) in 1:N x 1:N //simulation domain nNeigh = number of live neighbors of World(t)[i,j]
World(t+1)[i,j] = DEAD if World(t)[i,j] AND (nNeigh == 2 or nNeigh ==3) then
World(t+1)[i,j] = LIVE else
World(t+1)[i,j] = (nNeigh == 3) end if
end forall
1
2
3
4
5
6
7
8
9
10
1
Serial Code
end for
11
Serial Code
We are providing you with a working serial program that implements the Game of Life. The provided code creates a random world, distributing cells uniformly according to a specified probability. It then runs the game of life for a given number of generations, sending each generation to the gnuplot plotting program.
1
The program may be run from the command prompt as follows:
./life [-n <int>] [-i <int>] [-s <int>] [-p <float >] [-t <int>] [-d] [-step] [-g <int
With the arguments interpreted as follows:
-n <integer> Number of mesh points in one dimension
-i <integer> Number of generations (or iterations)
-s <integer> Random seed
-p <float> Distribution probability
-t <integer> Number of threads
-d disables display
-g <integer> selects a game type
-step pauses every iteration (for debugging)
2 >]
3
4
5
6
7
8
9
10
11
12
Requirements
• To run the program, you will need the gnuplot plotting program to be installed on your computer. You can download the software from http://www.gnuplot.info .
• The Makefile that we have provided includes an “arch” file that defines appropriate command line flags for the compiler.
To compile, type make
To clean the objects and executables make clean Example run:
./life -n 500 -i 200 -p 0.2
1
2
3
4
5
6
• Since this program requires an interactive interface, you won’t be able to use the campus clusters for plotting. Please develop and test your implementations on your local machines or on the computer labs on campus.
Task Parallelism
In this part of the assignment, you will employ multithreading to handle graphical display. You will modify the life simulator to run with two threads. One thread (the plotter thread)
Data Parallelism
should handle the display (via gnu plot) while the other thread should perform the cell updates. The two activities may take place concurrently. To synchronise the execution of the threads, barriers can be used. Data should be shared between the threads via sharedmemory and the mesh plot must appear for each iteration.
Data Parallelism
In this part of the assignment, you will introduce additional threads to share the computational workload in the cell update. To implement data parallelism, use a one-dimensional decomposition where the computational domain is to be divided (approximately) equally among all workers involved in the computation. Lastly, have the master thread initialize the game board. Although this approach isn’t scalable because of the first touch policy, the impact on this assignment won’t be noticed. Having the master initialise the game board ensures reproducible results and is extremely helpful for debugging.
You can introduce data or task parallelism in any order you like but the final implementation should use a single thread for plotting and the remaining threads to perform cell updates concurrently with the plotter thread. Starting with data parallel implementation might be easier.
Testing Correctness
• You’ll notice that the serial program we provided allows you to input a random seed via the -s parameter. If you don’t specify this parameter, you obtain a seed based on the time of day. The program outputs this seed so that you will be able to reproduce a given run. In order to establish correctness, run your program on differing numbers of threads, including a single thread.
• Another approach is to run with a problem that has a known solution. The literature is full of problems with known solutions. The simplest problems to test have the static patterns that do not change. We have provided one, which is called block still. You can find some examples on the Wikipedia page for Conway’s Game of Life Others include “blinkers” and gliders. A command line option -g can be used to select a game number. You can add the glider (spaceship) game if you like.
• Another command line option -step has been added to allow you to “single step” through the game, one iteration at a time. This is handy for watching moving patterns.
• Finally, another simple way of checking your results against the single thread implementation is to output the contents of all game board locations in a systematic order (say, row major order), printing a single digit (1 or 0) for each position. You can then use the diff program to compare results. Of course, this assumes that your single processor implementation is correct!
Experiments
Experiments
You are going to conduct an experimental performance study on KUACC. The aim of this experimental study is to get performance data across different numbers of threads and under different input sizes. In these experiments, you will disable plotting feature. You should still spare a thread for handling the display but this thread will not perform plotting.
• The first experiment will be a strong scaling study such that you will run your parallelized code multiple times under different thread counts. Please use the command line arguments below for this study.
bash$ ./life -n 2000 -i 500 -p 0.2 -d -t <num-of-threads>
1
<num-of-threads>will be 1, 2, 4, 8, 16, and 32. Plot the speedup and execution time figures as a function of thread count and include them in your report.
• In the second experiment, you will evaluate your code’s performance under different input sizes, but with fixed thread count. The command that you will run is as follow.
bash$ ./life -n <input-sizes> -i 500 -p 0.2 -d -t 16
1
<input-sizes>that you will test are 2000, 4000, 6000, 8000, and 10000. Plot the execution time as a function of input size. To observe the scaling, also include a plot which shows the execution time per data point (runningtime/n2), where n is the input size. Include your figures in the report.
To ensure that all of your performance data is taken from the same machine, you can run your commands for the first experiment and the second experiment in the same job script. In other words, all commands for both experiments will be submitted as a single job.
More details on how to run your code in KUACC cluster can be found in Section Environment .