Starting from:

$30

CS342-Project 1 Processes, IPC, and Threads Solved

Part A. Processes (Mutrix-Vector Multiplication)
 

 In this project, you will develop a multi-process application that will perform matrix-vector multiplication for large matrices and vectors. More precisely, the application will multiply an nxn matrix M with a vector v of size n. For matrix M, the element in row i and column j is denoted as mij. For vector v,  jth element is denoted as  vj. Then the matrix-vector product is the vector x of length n, whose ith  element xi is given by:

 

!

𝑥! = 𝑚!"𝑣! !!!

 

 The matrix M information is stored in an input text file (ascii).  The vector v information is also stored in an input text  file (ascii). We assume that the row-column coordinates of each matrix element is also is stored in the file. Hence for each nonzero matrix value we store a triple (i, j,mij) in a line of the matrixfile. The  line format is: 

                        <rownumber <columnumber <value

            For exampe, for row 3, column 13, and the corresponding value 2, the line will be: 

            3 13 2

We assume the position of element vj in the vector v is also stored together with the value as a 2-tuple (j, vj) in a line of the vectorfile. The line format will be: 

                        <rownumber <value

            For example, if the 5th element of vector v has value 10, we will store: 

                        5 10 

 For vector v, information about zero-valued entries will be stored in the input file as well. But for matrix, we do not store information about zero-valued entries in the input file. 

 You will develop a multi-process program called mv that will do the matrixvector multiplication in the following way (for educational purposes). It will take the following parameters. 

 

                        mv matrixfile  vectorfile resultfile K 

 

 The main process will read the matrixfile (which can be quite large) and will partition it into K splits (each split will be a file). The partitioning will be very simple. Assume there are L values (lines) in matrixfile. Then the first split will contain the first   s = ceiling(L/K) values from the matrixfile, the next split will contain the next s values, and so on. The last split may contain less than s, which is fine. The number of values L in the file can be obtained by first reading the file from beginning to end in the main process before generating the splits (yes this is not efficient,  but for this project it is fine). 

 After generating the split files, the main process will create K child processes to process (map) the split files. These child processes will be called as mapper processes (mappers). Each mapper process will process another split file. Mapper processes will run and process their splits concurrently. 

 The processing of a split file in a mapper process will be as follows. The mapper process will first read the whole vectorfile and put the vector values into an array in main memory. If the vector is of size n,  the array size will be n. Then the mapper will start reading its split file one value at a time (one line at a time). The value will be multiplied by the corresponding element in vector v and the result will be added into the corresponding entry in a (partial) result array.  If, for example, the triple read from a line is (i, j,  mij),  then mij will be multiplied by vj and the multiplication result will be added to the value at entry i of the partial result array. Each line of  the split file will be read and processed  as described. In this way, the mapper will create a partial result for the multiplication. After mapper has finished with processing its split file, it will write out the partial result array into an intermediate output file, one result per line in the following format:

                        <rownumber <value 

 When all mappers finish, a reducer process (another child) will be started by the main process. The reducer process will open and read and process the intermediate files produced by the mappers. It will read the intermediate files and will sum up the respective values corresponding to the same vector index. At the end, the result vector will be obtained (after processing all input files). The result vector will be printed out to the resultfile in sorted order with respect to row numbers. The line format for the resultfile will be: 

                        <rownumber <value 

 Let us consider an example.  Assume we have a 6x6 matix and a vector of size 6, as follows. 

 

                                                              Matrix M           

                                                           1 2 3 4 5 6            Vector v

1
0
3
0
2
2
1

2

3

4

5

6
2
5
0
0
1
0
4
1
4
3
0
2
0
0
0
0
2
3
5
4
1
3
2
1
1
0
0
2
0
0
3
4
2
1
0
1
1

2

3

4

5

6

                                                             

 Assume there are 2 mappers. The first mapper will read the first 12 lines and the second mapper will read the remaining 11 lines. The figure below shows the results (intermediate files) produced by mapper 1 and mapper 2. Reducer will read  the intermediate files and will aggregate the values corresponding to the same row numbers, and finally will generate the resultfile as shown below. In example below, (4, 17) from intermediate file 1 and (4, 1)  from intermediate file 2 are aggregated to be (4, 18). The figure below is also an example for the content format of the files. 

 

 

Part B. IPC (with pipes)
 

 Implement the same program this time using pipes instread of use of intermediate files between mapper processes and reducer process. The name of the program will be mvp.  If there are K mappers, then there will K pipes created. Each pipe will be used for sending data from a mapper process to the reducer process. Each mapper will put data in a different pipe. Reducer process will get data from all pipes. 

 

Part C. Threads
 

 Implement the same program using threads this time. The name of the program will be mvt. There will be K mapper threads and 1 reducer thread created. Global variables (structures like arrays or linked lists)  will be used to pass information between mapper  threads and the reducer thread. 

 

Part D: Experiments 
 

Do some timing experiments to measure the running time (turnaround time) of the 3 programs for various values of  inputs. Try to draw some conclusion. 

More products