In this problem you are asked to write two parallel algorithms (one in Pthreads and the other in openMP) for solving a dense linear equations of the form A*x=b, where A is an n*n matrix and b is a vector. You will use Gaussian elimination without pivoting.
The algorithm has two parts:
(a) Gaussian Elimination: the original system of equation is reduced to an upper triangular form Ux=y, where U is a matrix of size n*n in which all elements below the diagonal are zeros which are the modified values of the A matrix, and the diagonal elements have the value 1. The vector y is the modified version of the b vector when you do the updates on the matrix and the vector in the Gaussian elimination stage.
(b) Back Substitution: the new system of equations is solved to obtain the values of x.
The Gaussian elimination stage of the algorithm comprises (n-1) steps. In the algorithm, the ith step eliminates nonzero subdiagonal elements in column i by subtracting the ith row from row j in the range [i+1,n], in each case scaling the ith row by the factor Aji/ Aii so as to make the element Aji zero. See the figure.
Hence the algorithm sweeps down the matrix from the top left corner to the bottom right corner, leaving subdiagonal elements behind it.
1
The whole point of parallel programming is performance, so you will be graded partially on the efficiency of your algorithm. Therefore, once you achieve a correct solution, you should perform some experimentation with your programs to try to optimize its performance. You should hand in two working and documented programs. You should provide a basic correctness argument for your solution (arguing that the relevant dependencies and other issues are taken care of by proper synchronization). In addition, you should describe some of the different versions of your program that you tried, what performance results you got out of them, and why you think your current version is reasonably efficient (or what more you could do to make it more efficient).
Suggestions:
• A sequential C code is provided on Blackboard.
• Gaussian elimination involves O(n3) operations. The back substitution stage requires only O(n2) operations, so you are not expected to parallelize back substitution.
• The algorithm should scale, assuming n is much larger than the number of processors.
• There should be clear comments at the beginning of the code explaining your algorithm.