CMPUT398- Lab 2 Solved

Objective
The purpose of this lab is to introduce you to the CUDA API by implementing vector addition. You will implement vector addition by writing the GPU kernel code as well as the associated host code.

All parts of this lab will be submitted as one zipped file through eclass. Details for submission are at the end of the lab.

Instructions
Edit the code where the TODOs are specified and perform the following:

•          Allocate device memory

•          Copy host memory to device

•          Initialize thread block and kernel grid dimensions

•          Invoke CUDA kernel

•          Copy results from device to host

•          Free device memory

•          Write the CUDA kernel
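
The steps above can be sketched as follows. This is a minimal illustration, not the lab's skeleton code: the names `vecAdd` and `vecAddOnDevice` and the block size of 256 threads are assumptions chosen for the example, and the skeleton's own variable names and helper macros will differ. Only standard CUDA runtime calls are used (`cudaMalloc`, `cudaMemcpy`, `cudaGetLastError`, `cudaFree`).

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Kernel: each thread adds one pair of elements.
// The bounds check is needed because the grid is rounded up to a whole
// number of blocks, so the last block may hold threads with i >= len.
__global__ void vecAdd(const float *in1, const float *in2, float *out, int len) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < len) {
        out[i] = in1[i] + in2[i];
    }
}

// Hypothetical host wrapper (not part of the lab skeleton).
// Returns cudaSuccess on success, or the first CUDA error encountered.
cudaError_t vecAddOnDevice(const float *hostIn1, const float *hostIn2,
                           float *hostOut, int len) {
    if (len <= 0) return cudaErrorInvalidValue;

    const size_t bytes = static_cast<size_t>(len) * sizeof(float);
    float *devIn1 = nullptr;
    float *devIn2 = nullptr;
    float *devOut = nullptr;

    // Allocate device memory for both inputs and the output.
    cudaError_t err = cudaMalloc((void **)&devIn1, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&devIn2, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&devOut, bytes);

    // Copy host memory to device.
    if (err == cudaSuccess)
        err = cudaMemcpy(devIn1, hostIn1, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(devIn2, hostIn2, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        // Initialize thread block and kernel grid dimensions.
        // 256 threads per block is an assumed value, not a requirement.
        const int threadsPerBlock = 256;
        dim3 block(threadsPerBlock);
        dim3 grid((len + threadsPerBlock - 1) / threadsPerBlock);  // ceil(len / 256)

        // Invoke the CUDA kernel and check for launch errors.
        vecAdd<<<grid, block>>>(devIn1, devIn2, devOut, len);
        err = cudaGetLastError();
    }

    // Copy results from device to host. A blocking cudaMemcpy on the default
    // stream waits for the kernel to finish before copying.
    if (err == cudaSuccess)
        err = cudaMemcpy(hostOut, devOut, bytes, cudaMemcpyDeviceToHost);

    // Free device memory. cudaFree accepts a null pointer, so this is safe
    // even if an earlier allocation failed.
    cudaFree(devIn1);
    cudaFree(devIn2);
    cudaFree(devOut);
    return err;
}
```

Calling `vecAddOnDevice(a, b, out, n)` with three host arrays of `n` floats should leave `out[i] == a[i] + b[i]` for every `i`. Checking the return value of each runtime call, as done here, makes failures such as an out-of-memory allocation visible instead of silently producing wrong output.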

 

Local Setup Instructions
Steps:

1.     Download “Lab2.zip”.

2.     Unzip the file.

3.     Open the Visual Studio solution in Visual Studio 2013.

4.     Build the project. Note that the project has three configurations:

a.     Test

b.     Debug

c.     Submission

For testing it is recommended that you run the “Debug” configuration.

 

 

But make sure you have the “Submission” configuration selected when you finally submit.

5.     Run the program by pressing the following button:

 

 

Don’t try to run the program when the “Test” configuration is selected.

Vector Add Testing
1.     The “Debug” configuration will show “false” in the last line of the program's output if your code is incorrect.

 

 

The output vector is written under “Dataset\VectorAdd\Test\[0-9]”, for example “Dataset\VectorAdd\Test\0\myOutput.raw”. The first line of the file is the size of the array. The “Debug” configuration runs only the first test, “Dataset\VectorAdd\Test\0”.

2.     You can also run the program from the Command Prompt (cmd).

 

VectorAdd -e <expected.raw> -i <input1.raw>,<input2.raw> -o <output.raw> -t vector

 

Make sure you are in the directory with the executable before trying to run the command.

3.     To run all tests, use the “Test” configuration: select it and build the solution, but do not start the debugger.

 

Build > Build Solution

 

In the Build Output window you should see the following:

"Vector Add Testing Test 0..."
COMMAND
Same

Note that if the test fails, you will see “Different or error” instead of “Same”.

Alternatively, you can just run the file “Test.bat” provided to you instead.

Using NSIGHT To Analyze Performance Instructions
This section is complemented by the file “Guide on Debugging, Testing, Submitting and Profiling CUDA Project” on e-class. Please read that file if you have not already done so.

In Application Settings, enter:

Application: <Path\to\the\project>\Test\VectorAdd.exe

Arguments: -e output.raw -i input0.raw,input1.raw -o myOutput.raw -t vector

Working Directory: <Path\to\the\project>\Dataset\VectorAdd\Test\<Test Number>

 

Then follow the remaining steps as outlined in the guide.

Submit an image called “cuda_summary.jpg” containing a screenshot of the CUDA Summary page that you get from running NSIGHT.

 

Questions
Assume that the input vectors to your program have length N. Answers to the following questions must be expressed in terms of N.

1.     How many floating-point operations are being performed in your vector add kernel? EXPLAIN.

2.     How many global memory reads are being performed by your vector add kernel? EXPLAIN.

3.     How many global memory writes are being performed by your vector add kernel? EXPLAIN.

4.     In the vector add project, how many bytes are transferred from the Host to the Device? EXPLAIN.

5.     In the vector add project, how many bytes are transferred from the Device to the Host? EXPLAIN.
