CSCI-GA3033-090 Homework 4

Assignment
Make a copy of the Colab notebook to your drive, and then go through the skeleton code in it. At the very end, you will find a function that is supposed to run the bandit algorithms and plot their cumulative regret over time. Complete this function, and verify that it works by testing it with the two given environments and the FullyRandom solver.
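As a rough illustration of what "cumulative regret over time" means, the sketch below runs a uniformly random solver against a Bernoulli bandit and accumulates the gap between the best arm's mean and the chosen arm's mean at every step. The class names, the `pull`, `select_arm`, and `update` methods, and the `means` attribute are all hypothetical stand-ins; the notebook's skeleton may use a different interface and may define regret on realized rewards instead of means.

```python
import random


class BernoulliBandit:
    """Hypothetical stand-in for the notebook's Bernoulli environment."""

    def __init__(self, means, seed=0):
        self.means = list(means)          # true success probability per arm
        self.best_mean = max(self.means)  # mean reward of the optimal arm
        self._rng = random.Random(seed)

    def pull(self, arm):
        # Reward is 1.0 with probability means[arm], else 0.0.
        return 1.0 if self._rng.random() < self.means[arm] else 0.0


class FullyRandom:
    """Hypothetical baseline solver: picks an arm uniformly at random."""

    def __init__(self, n_arms, seed=0):
        self.n_arms = n_arms
        self._rng = random.Random(seed)

    def select_arm(self):
        return self._rng.randrange(self.n_arms)

    def update(self, arm, reward):
        pass  # a random solver learns nothing from rewards


def cumulative_regret(env, solver, n_steps):
    """Return a list whose t-th entry is the regret accumulated up to step t."""
    regrets = []
    total = 0.0
    for _ in range(n_steps):
        arm = solver.select_arm()
        reward = env.pull(arm)
        solver.update(arm, reward)
        # Assumption: regret is measured on means (expected regret).
        total += env.best_mean - env.means[arm]
        regrets.append(total)
    return regrets
```

For a random solver the resulting curve should grow roughly linearly, which makes it a useful sanity check before adding the smarter solvers.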

Note: For a full score on this problem, the following must be true: each solver must be drawn in a different color, and each environment (Bernoulli bandit and Gaussian bandit) must be shown on its own plot. Make sure to label each of the two plots, and label each line in each plot with the associated algorithm. For formatting guidance, look at the given plot.
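One way to meet these formatting requirements is sketched below with matplotlib: one subplot per environment, one colored and labeled line per solver, plus a title and legend on each subplot. The function name `plot_regret` and the nested-dictionary input layout are assumptions made for this example, not part of the notebook's skeleton.

```python
import matplotlib.pyplot as plt


def plot_regret(results):
    """Plot cumulative regret curves.

    `results` is assumed to map environment name -> {solver name -> list of
    cumulative regret values, one per time step}.
    """
    n_envs = len(results)
    fig, axes = plt.subplots(1, n_envs, figsize=(6 * n_envs, 4), squeeze=False)
    for ax, (env_name, curves) in zip(axes[0], results.items()):
        for i, (solver_name, regrets) in enumerate(curves.items()):
            steps = range(1, len(regrets) + 1)
            # "C0", "C1", ... index matplotlib's default color cycle,
            # so each solver gets a distinct color.
            ax.plot(steps, regrets, color=f"C{i}", label=solver_name)
        ax.set_title(env_name)            # label the plot with its environment
        ax.set_xlabel("Time step")
        ax.set_ylabel("Cumulative regret")
        ax.legend()                       # label each line with its algorithm
    fig.tight_layout()
    return fig
```

In a Colab cell, calling `plot_regret(...)` followed by `plt.show()` should display both plots side by side.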

Once you have finished that function, implement the EpsilonGreedy, UCB, and Thompson Sampling solvers. Make sure that when you run the Colab notebook, it generates the two associated plots: one for the Bernoulli bandit with all the algorithms, and another for the Gaussian bandit with all the algorithms.
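The three solvers could look roughly like the minimal sketch below. It assumes a hypothetical `select_arm` / `update` interface, uses the UCB1 variant of UCB, and uses a Beta(1, 1) prior for Thompson Sampling, which only fits 0/1 rewards; the Gaussian bandit would need a Gaussian posterior instead. The `run` helper and all parameter defaults are likewise illustrative choices, not the notebook's.

```python
import math
import random


class EpsilonGreedy:
    """Explore uniformly with probability epsilon, otherwise exploit."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.n_arms, self.epsilon = n_arms, epsilon
        self.counts = [0] * n_arms      # pulls per arm
        self.values = [0.0] * n_arms    # running mean reward per arm
        self._rng = random.Random(seed)

    def select_arm(self):
        if self._rng.random() < self.epsilon:
            return self._rng.randrange(self.n_arms)
        return max(range(self.n_arms), key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


class UCB1:
    """Pick the arm with the highest mean plus exploration bonus."""

    def __init__(self, n_arms):
        self.n_arms, self.t = n_arms, 0
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select_arm(self):
        for a in range(self.n_arms):    # pull every arm once first
            if self.counts[a] == 0:
                return a
        return max(range(self.n_arms), key=lambda a: self.values[a]
                   + math.sqrt(2 * math.log(self.t) / self.counts[a]))

    def update(self, arm, reward):
        self.t += 1
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


class ThompsonBernoulli:
    """Thompson Sampling with a Beta posterior (assumes 0/1 rewards)."""

    def __init__(self, n_arms, seed=0):
        self.n_arms = n_arms
        self.alpha = [1.0] * n_arms     # 1 + observed successes
        self.beta = [1.0] * n_arms      # 1 + observed failures
        self._rng = random.Random(seed)

    def select_arm(self):
        samples = [self._rng.betavariate(self.alpha[a], self.beta[a])
                   for a in range(self.n_arms)]
        return max(range(self.n_arms), key=lambda a: samples[a])

    def update(self, arm, reward):
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward


def run(solver, probs, n_steps, seed=1):
    """Hypothetical helper: play a Bernoulli bandit, return pulls per arm."""
    rng = random.Random(seed)
    pulls = [0] * len(probs)
    for _ in range(n_steps):
        arm = solver.select_arm()
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        solver.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

On a bandit with well-separated arms, all three solvers should end up pulling the best arm far more often than the others, which is what flattens their cumulative regret curves relative to FullyRandom.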
