CS6210 Project 2 - Barrier Synchronization Algorithms

Overview
OpenMP allows you to run parallel algorithms on shared-memory multiprocessor/multicore machines. For this assignment you will implement two spin barriers using OpenMP. MPI allows you to run parallel algorithms on distributed memory systems, such as compute clusters or other distributed systems. You will implement two spin barriers using MPI. Finally, you will choose one of your OpenMP barrier implementations and one of your MPI barrier implementations and combine the two in an MPI-OpenMP combined program in order to synchronize between multiple cluster nodes that are each running multiple threads.
You will run experiments to evaluate the performance of your barrier implementations (information about compute resources for running experiments is in a later section). You will run your OpenMP barriers on an 8-way SMP (symmetric multi-processor) system, and your MPI and MPI-OpenMP combined experiments on a cluster of up to 24 nodes with 12 cores each.
Finally, you will create a write-up that explains what you did, presents your experimental results, and most importantly, analyzes your results to explain the trends and phenomena you see (some hints for analysis are given below).
Detailed Instructions
Part 1: Learn about OpenMP and MPI
The first thing you want to do is learn how to program, compile, and run OpenMP and MPI programs.
Setup
Included with this project is a Vagrantfile which provides a VM with the required environment and software already installed.
If you don't have it installed already, download the latest version of Vagrant for your platform. You'll also need the latest version of VirtualBox.
OpenMP
You can compile and run OpenMP programs on any Linux machine that has libomp installed. You can try the example code in the assignment folder (examples/OpenMP). Additional informational resources:
- OpenMP Website
- OpenMP Specification
- Introduction to OpenMP (video series)
- LLNL's OpenMP Tutorial
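If you have not used OpenMP before, a minimal program like the following sketch (not part of the provided starter code; the file name hello_omp.c is just illustrative) is an easy way to check that your toolchain works.

/* hello_omp.c - minimal OpenMP check (illustrative sketch). */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* Spawn a team of threads; each prints its ID and the team size. */
    #pragma omp parallel
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}

Compile with, e.g., gcc -fopenmp hello_omp.c -o hello_omp and run it; the OMP_NUM_THREADS environment variable controls how many threads are spawned.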
MPI
You can compile and run MPI programs on any Linux machine that has mpich installed (e.g., the Vagrant box). Although MPI is normally used for performing computations across different network-connected machines, it will also run on a single machine. This setup can be used for developing and testing your project locally. You can try running the example code in the assignment folder (examples/MPI), as well as looking at the following informational resources:
- MPI website
- MPICH website
- OpenMPI website (for general MPI API documentation)
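Likewise, a minimal MPI program such as the following sketch (again, not part of the starter code; hello_mpi.c is an illustrative name) can be used to verify your mpich installation.

/* hello_mpi.c - minimal MPI check (illustrative sketch). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Compile with mpicc hello_mpi.c -o hello_mpi and launch with, for example, mpiexec.mpich -np 4 ./hello_mpi.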
Part 2: Develop OpenMP Barriers
Given to you:
1. The barrier function interfaces are specified in omp/gtmp.h in the assignment folder. Don't change the function signatures.
2. The omp/harness.c is a rudimentary test harness for you to test your implementation. Feel free to modify the harness to your needs.
What you need to do:
Complete the implementations of your 2 barriers in omp/gtmp1.c and omp/gtmp2.c
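To give a sense of the structure expected here, below is a sketch of one simple OpenMP spin barrier: a centralized counter barrier with sense reversal. It assumes gtmp.h declares gtmp_init(int), gtmp_barrier(void), and gtmp_finalize(void); check the provided header for the actual signatures. This is only an illustration of the pattern, and a careful implementation would pay more attention to memory ordering and contention than this sketch does.

/* Sketch of a centralized sense-reversing barrier (assumed interface:
 * gtmp_init(int), gtmp_barrier(void), gtmp_finalize(void)). */
#include <omp.h>
#include "gtmp.h"

static int num_threads;
static int count;     /* threads still to arrive in the current episode */
static int sense;     /* global sense; flipped by the last arriver */

void gtmp_init(int nthreads)
{
    num_threads = nthreads;
    count = nthreads;
    sense = 0;
}

void gtmp_barrier(void)
{
    int local_sense, arrived;

    #pragma omp atomic read
    local_sense = sense;
    local_sense = !local_sense;    /* this episode releases on the flipped value */

    #pragma omp atomic capture
    arrived = --count;

    if (arrived == 0) {
        count = num_threads;       /* last arriver resets the count ... */
        #pragma omp flush
        #pragma omp atomic write
        sense = local_sense;       /* ... and releases the spinners */
    } else {
        int s;
        do {                       /* spin until the sense flips */
            #pragma omp atomic read
            s = sense;
        } while (s != local_sense);
    }
}

void gtmp_finalize(void)
{
    /* nothing to clean up for this variant */
}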
Part 3: Develop MPI Barriers
You may not simply use MPI's built-in MPI_Barrier() as one of your two barrier implementations. However, you can optionally use it as a third barrier in your experiments, as a baseline/control, if you choose.
Given to you:
1. The barrier function interfaces are specified in mpi/gtmpi.h in the assignment folder. Don't change the function signatures.
2. The mpi/harness.c is a rudimentary test harness for you to test your implementation. Feel free to modify the harness to your needs.
What you need to do:
Complete the implementations of your 2 barriers in mpi/gtmpi1.c and mpi/gtmpi2.c
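For comparison, here is a sketch of one simple MPI spin barrier: a centralized barrier in which every process notifies rank 0 and rank 0 then releases everyone. It assumes gtmpi.h declares gtmpi_init(int), gtmpi_barrier(void), and gtmpi_finalize(void); check the provided header for the actual interface. The barriers you actually submit will likely be more scalable algorithms from the MCS paper (e.g., dissemination or tournament); this only illustrates the message pattern.

/* Sketch of a centralized MPI barrier (assumed interface: gtmpi_init(int),
 * gtmpi_barrier(void), gtmpi_finalize(void)). */
#include <mpi.h>
#include "gtmpi.h"

static int num_procs;

void gtmpi_init(int nprocs)
{
    num_procs = nprocs;
}

void gtmpi_barrier(void)
{
    int rank, i, dummy = 0;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Arrival phase: wait for a notification from every other process. */
        for (i = 1; i < num_procs; i++)
            MPI_Recv(&dummy, 1, MPI_INT, i, 0, MPI_COMM_WORLD, &status);
        /* Release phase: tell every other process it may proceed. */
        for (i = 1; i < num_procs; i++)
            MPI_Send(&dummy, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
    } else {
        MPI_Send(&dummy, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        MPI_Recv(&dummy, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &status);
    }
}

void gtmpi_finalize(void)
{
    /* nothing to clean up for this variant */
}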
Part 4: Develop MPI-OpenMP Combined Barrier
Now choose one of the OpenMP barriers you implemented, and one of the MPI barriers you implemented. Combine them to create a barrier that synchronizes between multiple nodes that are each running multiple threads. You'll also want to preserve your original code for the two barriers so that you can still run experiments on them separately. You can compare the performance of the combined barrier to your standalone MPI barrier. Note that you will need to run more than one MPI process per node in the standalone configuration to make it comparable to one multithreaded MPI process per node in the combined configuration, so that the total number of threads is the same when you compare.
Given to you:
You are given a template combined/Makefile which generates a binary named combined to test the combined barrier.
What you need to do:
Implement the combined barrier along with your own testing harness to generate a binary named "combined". Please provide the appropriate Makefile. The invocation for the binary will be as follows:
mpiexec.mpich -np <num_proc> ./combined <num_threads>
Note that you are free to create your own harness for the combined barrier. The Gradescope autograder will only test that the combined barrier compiles and runs.
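One common structure for the combined barrier is: a local OpenMP barrier so that all threads on a node have arrived, an MPI barrier entered by a single representative thread per process, and a second local barrier that releases the node's threads once the inter-node synchronization has completed. The sketch below assumes illustrative names (combined_barrier, gtmp_barrier, gtmpi_barrier); wire in whichever of your own barriers you chose.

/* Sketch of the combined barrier structure (function names are illustrative). */
#include <omp.h>
#include "gtmp.h"
#include "gtmpi.h"

void combined_barrier(void)
{
    /* Phase 1: wait until every thread on this node has arrived. */
    gtmp_barrier();

    /* Phase 2: one representative thread per process synchronizes across nodes.
     * (The master construct has no implied barrier, hence phase 3 below.) */
    #pragma omp master
    gtmpi_barrier();

    /* Phase 3: hold the local threads until the inter-node barrier is done. */
    gtmp_barrier();
}

Since only one thread per process makes MPI calls in this scheme, initializing MPI with MPI_Init_thread() and requesting MPI_THREAD_FUNNELED is typically sufficient.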
Part 5: Run Experiments
1. The next step is to do a performance evaluation of your barriers on a large cluster (PACE). Information on how to use the cluster is described under Resources.
2. You will measure your OpenMP barriers on a single cluster node, scaling the number of threads from 2 to 8.
3. You will measure your MPI barriers on multiple cluster nodes, scaling from 2 to 12 MPI processes, one process per node.
4. You will measure your MPI-OpenMP combined barrier on multiple cluster nodes, scaling from 2 to 8 MPI processes running 2 to 12 OpenMP threads per process.
Some things to think about in your experiments:
1. You can use the gettimeofday() function to take timing measurements. See the man page for details about how to use it. You can also use some other method if you prefer, but explain in your write-up which measurement tool you used and why you chose it. Consider things like the accuracy of the measurement and the precision of the value returned.
2. If you're trying to measure an operation that completes too fast for your measurement tool (i.e., if your tool is not precise enough), you can run that operation several times in a loop, measure the time to run the entire loop, and then divide by the number of iterations. This gives the average time for a single loop iteration. Think a moment about why that works, and how it increases the precision of your measurement (a sketch of this pattern appears after this list).
3. Finally, once you've chosen a measurement tool, think a bit about how you will take that measurement. You want to be sure you measure the right things and exclude the wrong things from the measurement. You also want to account for variation in the results (for example, you probably don't want to measure just once, but measure several times and take the average).
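For example, the measurement pattern from items 1 and 2 above might look like the following sketch. NUM_ITER and the call to gtmp_barrier() are illustrative placeholders for whatever your own harness measures (in an OpenMP harness, every thread in the team would execute this loop).

/* Sketch: average the cost of one barrier episode over many iterations. */
#include <stdio.h>
#include <sys/time.h>
#include "gtmp.h"

#define NUM_ITER 100000    /* illustrative iteration count */

double time_barrier(void)
{
    struct timeval start, end;
    int i;

    gettimeofday(&start, NULL);
    for (i = 0; i < NUM_ITER; i++)
        gtmp_barrier();            /* the operation being measured */
    gettimeofday(&end, NULL);

    double elapsed_us = (end.tv_sec - start.tv_sec) * 1e6
                      + (end.tv_usec - start.tv_usec);
    return elapsed_us / NUM_ITER;  /* average microseconds per barrier */
}

Repeating this whole measurement several times and averaging (or discarding outliers) helps account for run-to-run variation.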
Part 6: Write-Up
The last part is to create the write-up. This should be a PDF file, and it should include at minimum the following:
1. The names of both team members.
2. An introduction that provides an overview of what you did (do not assume the reader has already read this assignment description).
3. An explanation of how the work was divided between the team members (i.e., who did what).
4. A description of the barrier algorithms that you implemented. You do not need to go into as much implementation detail (with pseudocode and so forth) as the MCS paper did. However, you should include a good high-level description of each algorithm. You should not simply say that you implemented algorithm X from the paper and refer the reader to the MCS paper for details.
5. An explanation of the experiments, including what experiments you ran, your experimental set-up, and your experimental methodology. Give thorough details. Do not assume the reader has already read this assignment description.
6. Your experimental results. DO present your data using graphs. DO NOT use tables of numbers when a graph would be better (hint: a graph is usually better). DO NOT include all your raw data in the write-up. Compare both your OpenMP barriers. Compare both your MPI barriers. Present the results for your MPI-OpenMP barrier.
7. An analysis of your experimental results. You should explain why you got the results that you did (think about the algorithm details and the architecture of the machine on which you experimented). Explain any trends or interesting phenomena. If you see anything in your results that you did not expect, explain what you expected to see and why your actual results are different. There should be at least a couple of interesting points per experiment. The key is not to explain only the what of your results, but the how and why as well.
8. A conclusion.
Resources
1. You will have access to the coc-ice PACE cluster for use with this project.
2. Please refer to the Cluster-HOWTO for details on using the PACE cluster.
Submission Instructions
Submit the following to the Project 2 module in Gradescope:
1. project2/ - your source code, organized as follows:
   project2/
       omp/
           Makefile
           gtmp.h
           gtmp1.c
           gtmp2.c
           harness.c
       mpi/
           Makefile
           gtmpi.h
           gtmpi1.c
           gtmpi2.c
           harness.c
       combined/
           Makefile (generates the "combined" binary)
           *.c (all required sources)
           *.h (all required headers)
2. Report.pdf - Your write-up (as a single PDF file) that includes all the things listed above will be submitted to the same Gradescope module.
