CSC718 Parallel Programming Final Exam Solution

1. (10 points) What is GPU programming? What are the strengths of GPU programming over CPU programming?
2. (10 points) An oceanographer gives you a serial program and asks you how much faster it might run on 8 processors. You can only find one function amenable to a parallel solution. Benchmarking on a single processor reveals 80% of the execution time is spent inside this function. What is the best speedup a parallel version is likely to achieve on 8 processors using Amdahl's law?
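For question 2, Amdahl's law bounds the speedup at 1 / ((1 - f) + f / p), where f is the fraction of run time that can be parallelized and p is the processor count. The arithmetic can be sketched in C as below; the function name `amdahl_speedup` is illustrative and does not come from the exam materials.

```c
#include <assert.h>

/* Amdahl's law: upper bound on speedup when a fraction `parallel_fraction`
 * of the serial run time is spread evenly over `processors` processors. */
double amdahl_speedup(double parallel_fraction, int processors)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(processors >= 1);

    double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / processors);
}
```

With f = 0.8 and p = 8, `amdahl_speedup(0.8, 8)` evaluates 1 / (0.2 + 0.1), which is roughly 3.33.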
3. (10 points) For a problem size of interest, 6 percent of the operations of a parallel program are inside I/O functions that are executed on a single processor. What is the minimum number of processors needed in order for the parallel program to exhibit a speedup of 10?
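For question 3, one reading is to treat the 6 percent of operations as the serial fraction in Amdahl's law and look for the smallest p with 1 / (0.06 + 0.94 / p) >= 10. The sketch below searches for that p directly; the function name `min_processors` is hypothetical, and the choice of Amdahl's law rather than another speedup model is an assumption.

```c
#include <assert.h>

/* Smallest processor count whose Amdahl bound reaches `target` speedup.
 * Returns -1 when the target is at or above the limit 1 / serial_fraction,
 * which no finite number of processors can reach. */
int min_processors(double serial_fraction, double target)
{
    assert(serial_fraction > 0.0 && serial_fraction < 1.0);
    assert(target >= 1.0);

    if (target >= 1.0 / serial_fraction)
        return -1;

    int p = 1;
    while (1.0 / (serial_fraction + (1.0 - serial_fraction) / p) < target)
        p++;
    return p;
}
```

Solving by hand gives 0.94 / p <= 0.04, so p >= 23.5, and `min_processors(0.06, 10.0)` returns 24.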
4. (20 points) Average memory access time can be calculated using AMAT = Hit time + Miss rate x Miss penalty.
1) (10 points) If a direct mapped cache has a hit rate of 95%, a hit time of 4 ns, and a miss penalty of 100 ns, what is the AMAT?
2) (10 points) If replacing the cache with a 2-way set associative increases the hit rate to 97%, but increases the hit time to 5 ns, what is the new AMAT?
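Substituting the numbers from question 4 into the AMAT formula gives the worked arithmetic below. It assumes the miss penalty stays at 100 ns for the 2-way set associative cache, since the question does not state a new value.

```latex
\begin{align*}
\text{AMAT}_{\text{direct mapped}} &= 4\,\text{ns} + (1 - 0.95) \times 100\,\text{ns} = 9\,\text{ns} \\
\text{AMAT}_{\text{2-way}}         &= 5\,\text{ns} + (1 - 0.97) \times 100\,\text{ns} = 8\,\text{ns}
\end{align*}
```

Under that assumption the 2-way cache is faster on average, despite its longer hit time.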
5. (50 points) Programming Assignment: The simplest harmonic progression is
$\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots$
Let $S_n = \sum_{i=1}^{n} \frac{1}{i}$.
A sequential harmonic progression summation program (hst.c) is given in the exam; it computes the sums to arbitrary precision after the decimal point. The program requires two parameters, n and d, and computes $S_n$ to d digits of precision after the decimal point. For example, $S_7 = 2.592857142857$ to 12 digits of precision after the decimal point.
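Since hst.c itself is not reproduced here, the following is only a sketch of one common way to get d correct digits: expand each term 1/i by schoolbook long division, add the digits column by column, and propagate carries once at the end. The function name `harmonic_sum`, the `GUARD_DIGITS` margin, and the truncating (not rounding) output are all assumptions, not the behavior of the provided program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Extra digits carried beyond d to absorb truncation error (assumed margin). */
#define GUARD_DIGITS 10

/* Writes S_n = 1/1 + ... + 1/n into `out` as "<integer>.<d digits>",
 * truncated after d digits. Returns 0 on success, -1 on failure. */
int harmonic_sum(long n, int d, char *out, size_t out_size)
{
    assert(n >= 1 && d >= 1 && out != NULL && out_size > 0);
    int width = d + GUARD_DIGITS;
    long *frac = calloc((size_t)width, sizeof *frac);
    if (frac == NULL)
        return -1;

    long whole = 0;
    for (long i = 1; i <= n; i++) {
        long rem = 1;
        whole += rem / i;              /* integer part of 1/i */
        rem %= i;
        for (int k = 0; k < width; k++) {
            rem *= 10;                 /* long division, one digit per step */
            frac[k] += rem / i;
            rem %= i;
        }
    }

    /* Column sums may exceed 9: carry from the last digit toward the first. */
    for (int k = width - 1; k > 0; k--) {
        frac[k - 1] += frac[k] / 10;
        frac[k] %= 10;
    }
    whole += frac[0] / 10;
    frac[0] %= 10;

    snprintf(out, out_size, "%ld.", whole);
    size_t len = strlen(out);
    int ok = len + (size_t)d + 1 <= out_size;
    if (ok) {
        for (int k = 0; k < d; k++)
            out[len + k] = (char)('0' + frac[k]);
        out[len + d] = '\0';
    }
    free(frac);
    return ok ? 0 : -1;
}
```

Called as `harmonic_sum(7, 12, buf, sizeof buf)` with a 64-byte `buf`, this sketch produces the string 2.592857142857, matching the example above.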
1) (25 points) Convert the sequential C program, hst.c, to an MPI parallel program (the hst.c source code can be found in the hps.zip file). Benchmark the program computing $S_{1,000,000}$ to 100 digits of precision, using 1, 2, 3, and 4 processors.
Problem     | Np=1 | Np=2 | Np=3 | Np=4
Programming |      |      |      |
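One possible decomposition for the MPI version gives rank r the terms i = r + 1, r + 1 + size, and so on. Each rank accumulates uncarried digit-column sums, the columns are added element-wise, and carries are propagated once on the root. The sketch below simulates the ranks in a single process so that it runs without an MPI installation. In a real program `rank` and `size` would come from `MPI_Comm_rank` and `MPI_Comm_size`, and the element-wise sum would be an `MPI_Reduce` with `MPI_SUM` over `MPI_LONG`. All function names here are hypothetical, and the digit layout is an assumption rather than the layout used in hst.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Adds the digits of 1/i for i = rank+1, rank+1+size, ... <= n into cols.
 * cols[0] is the integer part, cols[1..width-1] the decimal digits.
 * Sums are left uncarried so they can be combined by plain addition. */
void partial_columns(long n, int rank, int size, long *cols, int width)
{
    assert(n >= 1 && size >= 1 && rank >= 0 && rank < size && width >= 1);
    for (long i = rank + 1; i <= n; i += size) {
        long rem = 1;
        for (int k = 0; k < width; k++) {
            cols[k] += rem / i;
            rem = (rem % i) * 10;
        }
    }
}

/* Propagates carries from the least significant column upward. */
void propagate_carries(long *cols, int width)
{
    for (int k = width - 1; k > 0; k--) {
        cols[k - 1] += cols[k] / 10;
        cols[k] %= 10;
    }
}

/* Simulates `size` ranks in one process. Returns 0 on success, -1 on
 * allocation failure. On success total[] holds one digit per column
 * (total[0] may exceed 9 for large n, since it is the integer part). */
int simulate_ranks(long n, int size, long *total, int width)
{
    assert(size >= 1 && width >= 1);
    long *part = malloc((size_t)width * sizeof *part);
    if (part == NULL)
        return -1;

    for (int k = 0; k < width; k++)
        total[k] = 0;
    for (int rank = 0; rank < size; rank++) {
        for (int k = 0; k < width; k++)
            part[k] = 0;
        partial_columns(n, rank, size, part, width);
        for (int k = 0; k < width; k++)
            total[k] += part[k];   /* stand-in for MPI_Reduce with MPI_SUM */
    }
    free(part);
    propagate_carries(total, width);
    return 0;
}
```

Because the column sums are combined before any carry is propagated, the result does not depend on how many ranks share the work, which is a useful check when benchmarking with 1 to 4 processes.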

2) (25 points) Convert the sequential C program, hst.c, to an OpenMP parallel program. Benchmark the program computing $S_{1,000,000}$ to 100 digits of precision, using 1, 2, 3, and 4 threads.
Problem     | Nt=1 | Nt=2 | Nt=3 | Nt=4
Programming |      |      |      |
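For the OpenMP version, one approach is to let each thread accumulate digit-column sums in a private array, merge the arrays inside a critical section, and propagate carries after the parallel region. The sketch below follows that pattern. The function name `harmonic_columns_omp` and the digit layout are assumptions rather than the structure of hst.c. The code also compiles and runs serially if OpenMP is not enabled (with GCC, OpenMP is enabled by `-fopenmp`).

```c
#include <assert.h>
#include <stdlib.h>

/* Fills cols[0..width-1] with the digits of S_n: cols[0] is the integer
 * part, cols[1..] the decimal digits (the last few are inexact because
 * every term is truncated). Returns 0 on success; on -1 (allocation
 * failure) the contents of cols are unspecified. */
int harmonic_columns_omp(long n, long *cols, int width)
{
    assert(n >= 1 && cols != NULL && width >= 1);
    int failed = 0;

    for (int k = 0; k < width; k++)
        cols[k] = 0;

    #pragma omp parallel
    {
        /* Private accumulator, so the hot loop needs no synchronization. */
        long *local = calloc((size_t)width, sizeof *local);
        if (local == NULL) {
            #pragma omp atomic write
            failed = 1;
        }

        /* Every thread must reach this loop, even after a failed calloc. */
        #pragma omp for schedule(static)
        for (long i = 1; i <= n; i++) {
            if (local == NULL)
                continue;
            long rem = 1;
            for (int k = 0; k < width; k++) {
                local[k] += rem / i;
                rem = (rem % i) * 10;
            }
        }

        if (local != NULL) {
            #pragma omp critical
            {
                for (int k = 0; k < width; k++)
                    cols[k] += local[k];
            }
            free(local);
        }
    }
    if (failed)
        return -1;

    /* Carries are propagated once, serially, after all threads merge. */
    for (int k = width - 1; k > 0; k--) {
        cols[k - 1] += cols[k] / 10;
        cols[k] %= 10;
    }
    return 0;
}
```

The thread count for the Nt = 1 to 4 runs can be set through the `OMP_NUM_THREADS` environment variable. A static schedule seems reasonable here because every term costs the same `width` division steps.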

Submit the following items to the ‘Final Exam’ dropbox on the D2L website:
• A short description of how to compile and run your program
• MPI source code from 5.1)
• OpenMP source code from 5.2)
• Benchmark results from 5.1) and 5.2)
