CSE6332 Cloud and Big Data: Project 4

For this project, you need to implement block matrix addition using Spark in Scala. Do not use Hadoop Map-Reduce. You will take two sparse matrices as input, convert them to block matrices, and then perform block matrix addition on them.

A sparse matrix is a dataset of triples (i, j, v), where i and j are the indices (of type Int in Scala) and v is the matrix value (of type Double in Scala) at indices i and j. A block matrix is a Spark RDD of blocks, of type RDD[((Int, Int), Block)] in Scala, where the two Ints are the block coordinates. A Block has type Array[Double] in Scala and has size rows * columns, where rows and columns are arguments of the main program. A matrix element Mij is stored inside the block with block coordinates (i/rows, j/columns), at location (i%rows)*columns + (j%columns) inside that block. The block matrix addition of M and N is done by finding blocks from M and N with the same block coordinates and adding those blocks together using regular matrix addition in Scala.

Your project is to convert two sparse matrices M and N, which are read from files, to block matrices using the Scala function createBlockMatrix, and then to add them using block matrix addition. In your Scala main program, args(0) is the number of rows, args(1) is the number of columns, args(2) is the first input matrix M, and args(3) is the second input matrix N. You should print to the output only the block, derived from the matrix addition, with block coordinates (1, 2).

As in Project-2, there are two small sparse matrices of size 35*48 in the files M-matrix-small.txt and N-matrix-small.txt for testing in local mode, using rows=8 and columns=6. Your block with coordinates (1, 2) must be similar to that in small-solution.txt. Then, there are two moderate-sized matrices of size 5000*10000 in the files M-matrix-large.txt and N-matrix-large.txt for testing in distributed mode (located on Expanse), using rows=200 and columns=300, so each block matrix will have 5*10 blocks.
Your block with coordinates (1, 2) must be similar to that in large-solution.txt.
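The two steps above (the index arithmetic of createBlockMatrix, and the join-and-add of block matrix addition) can be sketched as follows. This is a minimal sketch written over plain Scala collections so it runs without a Spark cluster; in the actual project the same shape applies to RDDs (`map` stays `map`, the `groupBy`/fill below is typically an `aggregateByKey` on the RDD, and pairing blocks with equal coordinates is an `RDD.join`). All names here are illustrative, not a prescribed interface.

```scala
// Sketch only: plain-collections stand-in for the Spark RDD pipeline.
object BlockMatrixSketch {
  type Block = Array[Double]
  type BlockMatrix = Map[(Int, Int), Block]

  // Convert a sparse matrix (a collection of (i, j, v) triples) into a
  // block matrix with blocks of size rows * columns.
  def createBlockMatrix(triples: Seq[(Int, Int, Double)],
                        rows: Int, columns: Int): BlockMatrix =
    triples
      .groupBy { case (i, j, _) => (i / rows, j / columns) }  // block coords
      .map { case (coords, ts) =>
        val block = new Array[Double](rows * columns)         // zero-filled
        for ((i, j, v) <- ts)
          block((i % rows) * columns + (j % columns)) = v     // local offset
        coords -> block
      }

  // Block matrix addition: keep blocks whose coordinates appear in both
  // M and N (join semantics) and add them element-wise.
  def add(m: BlockMatrix, n: BlockMatrix): BlockMatrix =
    (m.keySet & n.keySet).map { coords =>
      coords -> m(coords).zip(n(coords)).map { case (x, y) => x + y }
    }.toMap
}
```

For example, with rows=8 and columns=6, element (9, 13) lands in block (9/8, 13/6) = (1, 2) at offset (9%8)*6 + (13%6) = 7, so printing the result block keyed by (1, 2) gives the required output.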