Starting from:

$30

CITS3402-Assignment 1 Sparse Matrices Solved

Matrix algebra underpins many compute intense applications in many fields (most fields of Engineering, Machine Learning, Scientific Modelling etc.); computers were originally used to crunch numbers so numbers we shall crunch.

Your goal, broadly speaking, is to:

•    Create a piece of software implementing a variety of sparse matrix operations

•    Ensure this software conforms to our input/output specification

•    Performance test your software and comment on any speedup you observe (or lack thereof) in a brief report.

3        Sparse Matrix Introduction
3.1         What is a Matrix
A mathematical matrix is a rectangular array of numbers arranged in rows and columns. For simplicity we use the convention of (rows, columns) when giving dimensions. An example of a 3 × 2 matrix (read ’three by two’) is

                                                                                                                  (1)

In many practical cases involving matrix algebra, many elements will contain zero as their value. Storing all elements at all times wastes a potentially vast amount of memory in these cases. For a given machine this heavily restricts the size of problem able to be computed and so sparse matrices were developed to save space at the cost of more time. Loosely speaking, a matrix is considered ’sparse’ if it contains ≈ 10% non-zero elements.

The general idea is to save space by storing fewer zero elements (since they generally do not matter in any calculation). There are three broadly accepted sparse matrix representations used:

3.2          Coordinate Format (COO)
The most intuitive sparse matrix format specifies each non-zero element as a triple

                                                              Ai,j = (i,j,val)                                                        (2)

For example, the matrix,

                                                                                                               (3)

becomes

                                                 X = [(0,2,1),(1,0,3),(1,2,2)]                                           (4)

This representation while simple can be cumbersome when wanting to look for all values in a particular row or column.

3.3           Compressed Sparse Row Format (CSR)
The compressed sparse row format stores a matrix with three arrays:

•    NNZ: The non-zero values stored in row-major order (left to right, top to bottom)

•    IA: The number of elements in each row. An extra element IA[0] = 0 is used by convention. This array can be used to index into the NNZ array for each i-th row

•    JA: Stores the column index of each non-zero element.

For example, the matrix,

                                                                                                               (5)

becomes
 
NNZ = [1,3,2]
(6)
IA = [0,1,3,3]
(7)
JA = [2,0,2]
(8)
3.4           Compressed Sparse Column Format (CSC)
The compressed sparse column format is very similar to CSR format except the non-zero values are stored in a column-wise ordering, the IA array corresponds to the extent of each column and the JA array provides row indices. For example, the matrix,

                                                                                                               (9)

becomes
 
NNZ = [3,1,2]
(10)
IA = [0,1,1,3]
(11)
JA = [1,0,1]
(12)
4      Tasks
Your code will be a simple command-line application that will:

•    Read in up to two dense matrix files

•    Convert this matrix (these matrices) to a suitable sparse format.

•    Perform a single matrix algebra routine specified by command-line

•    Time the whole process and log the results to a file

•    Cleanup all memory usage and terminate

We elaborate on the tasks required of you (and your code) below:

4.1        Reading in Files
Depending on the algebra operation requested one or two matrices will be required. These should be provided by command-line as .in files (see Section B).

4.2         Sparse Matrix Representation
You have a choice of using any of the three sparse matrix representations presented in Section 3. There will be pros and cons to each representation depending on what operation is requested; you will need to comment on this in your report.

4.3         Matrix Algebra Routines
We request five different operations to be implemented. They range in difficulty (and are presented in roughly increasing difficulty). Marks are distribution accordingly.

4.3.1        Scalar Multiplication

Type: Single matrix

Command-line argument: --sm \%f For a matrix A and scalar (float) α compute:

                                                                 Y = α · A                                                         (13)

i.e. multiply each element of A by α.

The command line argument expects the scalar to be supplied immediately after (either an int or float). You only need to consider a floating point scalar when using floating-point matrices.

4.3.2      Trace

Type: Single matrix

Command-line argument: --tr

The trace(t) of a matrix A is given as:

                                                               t = X (Ai,i)                                                        (14)

i=1,n

i.e. the sum of each diagonal element. Note: the Trace is only well defined for square matrices.

4.3.3        Matrix Addition

Type: Double matrix

Command-line argument: --ad

For two matrices A and B (of identical dimensions (n,m) their pair-wise sum is:

                                                 Yi,j = Ai,j + Bi,j∀i,j|i ≤ n,j ≤ m                                         (15)

i.e. add each element of both matrices together.

4.3.4      Transpose

Type: Single matrix

Command-line argument: --ts

The transpose of matrix A (denoted A0) is given by:

                                                       A0j,i = Ai,j∀i ≤ n,j ≤ m                                               (16)

i.e. The rows of A become the columns of A0 and vice-versa for the columns.

4.3.5        Matrix Multiplication

Type: Double matrix

Command-line argument: --mm

Note: Sparse matrix multiplication is a notorious bottle-neck in many codes; achieving speedup will be difficult and that is okay. Focus on implementing a simple, correct solution and working from there.

4.4         Timing and Logging
Your code is required to time the following events (in seconds)

•    The time to load in and convert any matrix files.

•    Time to execute the requested operation

This information must be logged to a .out file with the a header containing the following information:

•    Operation requested

•    File1

•    File2 (if needed)

•    Number of threads used

Followed by the result of your computation (single value or dense-matrix depending on the operation).

4.5       Report
Your report should be fairly brief but covers the following:

•    The operation system used in your submission.

•    The sparse matrix representation(s) used

•    For each matrix operation implemented:

–    Description of the parallelism implemented

–    Some informal reasoning about expected run-time and scalability

–    Testing results

–    Comment on the performance observed

We expect the report to be around six pages (including figures).

4.6       Restrictions
We place a number of restrictions on the type of matrix your solution will encounter and the design of your solution. They are briefly summarised below as a reminder

•    Input File format - See App. B for a detailed description of the matrix file-type we use.

•    Log-File format - See App. D for an example. Some further clarifications:

–    Operation requested - Provide the command line argument fed to the program as the label

–    Input Filenames - Provide the command line argument fed to the program as the label

–    Number of threads - Given as an integer

–    Output filename - Provide the date and time of execution plus your student number(s) .out as the filename

–    Computation result - Finally, output the result of the requested operation on the provided matrix (matrices). Floating point results will be tested up to 1e − 6 difference

–    We provide example output files for all types of operations for reference

•    Matrix Datatype

–    Integers

–    Floats

–    You will never be required to perform operations involving matrices of different data-types

•    Command-Line Arguments - See App. C for a detailed description. We guarantee that we will test your solution with arguments presented in the order described and will only provide valid command line arguments.

•    Input matrices - We provide a number of guarantees in our test matrices

–    All command line arguments provided will be valid

–    All input files will be well-formed, valid and of a single data-type

–    The dimensions of tested matrices will range in powers of two in the range [2,16,384]

–    Input matrices are not guaranteed to be square

–    Input matrices are not guaranteed to be the correct dimensions for the requested operation (e.g. we may ask for the Trace of a nonsquare matrix or addition of two matrices differing in dimension(s))

More products