

CS6700 : Reinforcement Learning
Programming Assignment #1

• You have to turn in well-documented code along with a detailed report of the results of the experiments. Typeset your report in LaTeX.
• Be precise with your explanations. Unnecessary verbosity will be penalized.
• Check the Moodle discussion forums regularly for updates regarding the assignment.
• Please start early.

You are to conduct experiments on the 10-arm bandit testbed described in Section 2.3 of the book. Please turn in the code for the testbed as well as the algorithms. Label the graphs clearly, with axis labels, parameter values, question numbers, etc. Ensure that the code is adequately commented. Also turn in a short write-up of your observations from the experiments and answers to all the questions asked below.
Remember: The graphs are to run for 1000 plays, with each curve being the average of the performance on 2000 different bandit problems, generated as per the description in the book.
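The averaging procedure above can be sketched as follows. This is only an illustrative outline, not a reference solution: it assumes the testbed of Section 2.3 (true action values drawn from a unit-variance normal distribution with mean 0, rewards drawn from a unit-variance normal distribution centred on the chosen arm's true value), and it uses epsilon-greedy with sample-average estimates purely as a placeholder algorithm. The function name `run_epsilon_greedy` and all parameter names are hypothetical.

```python
import numpy as np

def run_epsilon_greedy(n_problems=2000, n_arms=10, n_plays=1000,
                       epsilon=0.1, seed=0):
    """Return the reward at each play, averaged over all bandit problems.

    Assumed testbed: q*(a) ~ N(0, 1), reward ~ N(q*(a), 1).
    Epsilon-greedy is a placeholder, not a prescribed algorithm.
    """
    rng = np.random.default_rng(seed)
    # One row per bandit problem, one column per arm.
    q_true = rng.normal(0.0, 1.0, size=(n_problems, n_arms))
    q_est = np.zeros((n_problems, n_arms))    # sample-average estimates
    counts = np.zeros((n_problems, n_arms))   # number of pulls per arm
    rows = np.arange(n_problems)
    avg_reward = np.zeros(n_plays)

    for t in range(n_plays):
        # Greedy arm per problem (ties go to the lowest index).
        greedy = np.argmax(q_est, axis=1)
        random_arm = rng.integers(n_arms, size=n_problems)
        explore = rng.random(n_problems) < epsilon
        arms = np.where(explore, random_arm, greedy)

        # Sample a noisy reward around the true value of the chosen arm.
        rewards = rng.normal(q_true[rows, arms], 1.0)

        # Incremental sample-average update for the chosen arm only.
        counts[rows, arms] += 1
        q_est[rows, arms] += (rewards - q_est[rows, arms]) / counts[rows, arms]

        avg_reward[t] = rewards.mean()
    return avg_reward
```

Running all problems in parallel as rows of one array keeps the loop over plays short. Plotting the returned array against the play index gives one curve of the kind the graphs call for.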
Points will be given according to the following criteria:
• Correct generation of the bandit problems.
• Correct coding of the learning algorithms.
• Correct code for gathering data to plot the graphs.
• Performance of the learning algorithms (correctness, optimality).
• Neatness of the graphs, appropriate labelling, and well-commented code.
Note: You can program in any language you want, but if you need any help later on, it will have to be in a language the TAs are comfortable with. So check with us before you start.

Use graphing software of your choice (e.g., Matlab or Gnuplot) to produce the graphs. Ensure that you have labelled the graphs correctly.
• Compare the performance with the others. What do you observe? Why do you think this is so?
• What is the computational cost of computing the median? Is it the rate-determining step? Can you make it faster?
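As a starting point for the median question, the sketch below contrasts two ways of finding the median of an array of values. It assumes the values are held in a NumPy array (for instance, per-arm estimates, which is an assumption about where the median arises). The function names are hypothetical. A full sort costs O(n log n), whereas a selection routine such as `np.partition` only places the middle element(s) in sorted position and typically runs in linear time.

```python
import numpy as np

def median_by_sort(values):
    """Median via a full sort: O(n log n)."""
    s = np.sort(values)
    n = len(s)
    mid = n // 2
    if n % 2:
        return s[mid]
    return 0.5 * (s[mid - 1] + s[mid])

def median_by_selection(values):
    """Median via selection: only the middle element(s) are positioned."""
    n = len(values)
    mid = n // 2
    if n % 2:
        return np.partition(values, mid)[mid]
    # For even n, place both middle elements in their sorted positions.
    part = np.partition(values, [mid - 1, mid])
    return 0.5 * (part[mid - 1] + part[mid])
```

Whether this step dominates the running time depends on the number of arms and how often the median is recomputed, so it is worth timing both versions inside the actual experiment before drawing conclusions.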
Submission Guidelines
Submit a single zip file containing the following files in the specified directory structure. Use the following naming convention: rollno_PA1.zip
A sample submission would look like this:
rollno_PA1
    experiment
        testbed file(s)
        algorithm file(s)
        plotting file(s)
        ···
    report.pdf
    README