CS489 - Assignment 3 Solved

Reinforcement Learning

1 Introduction
The goal of this assignment is to experiment with model-free control, including on-policy learning (Sarsa) and off-policy learning (Q-learning). To gain a deeper understanding of the principles behind these two iterative approaches and the differences between them, you will implement both Sarsa and Q-learning on the Cliff Walking example.
2 Cliff Walking

Figure 1: Cliff Walking
Consider the gridworld shown in Figure 1. This is a standard undiscounted, episodic task, with a start state (S), a goal state (G), and the usual actions causing movement up, down, right, and left. The reward is -1 on all transitions except those into the region marked “The Cliff”. Stepping into this region incurs a reward of -100 and sends the agent instantly back to the start.
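The transition rule described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference solution. Since the figure is not reproduced here, the 4x12 layout with the cliff along the bottom row between S and G is an assumption taken from the usual textbook version of this example, and so is the choice to leave the agent in place when a move would leave the grid. The names ROWS, COLS, START, GOAL, ACTIONS and step are hypothetical.

```python
from typing import Tuple

State = Tuple[int, int]  # (row, column), row 0 is the top row

# Assumed layout (the figure is not reproduced here): a 4x12 grid with
# S in the bottom-left corner, G in the bottom-right corner, and the
# cliff filling the bottom row between them.
ROWS, COLS = 4, 12
START: State = (3, 0)
GOAL: State = (3, 11)
ACTIONS = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # up, down, right, left


def step(state: State, action: int) -> Tuple[State, int, bool]:
    """Apply one action and return (next_state, reward, done)."""
    d_row, d_col = ACTIONS[action]
    # Assumption: a move that would leave the grid keeps the agent in place.
    row = min(max(state[0] + d_row, 0), ROWS - 1)
    col = min(max(state[1] + d_col, 0), COLS - 1)
    if row == ROWS - 1 and 0 < col < COLS - 1:
        # Stepped into the cliff: reward -100, back to start, episode continues.
        return START, -100, False
    # Every other transition costs -1; the episode ends at the goal.
    return (row, col), -1, (row, col) == GOAL
```

For example, step(START, 2) moves right from S into the cliff, so it returns the start state with reward -100, while step(START, 0) moves up to (2, 0) with reward -1.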
3 Experiment Requirements
• Programming language: python3
• You should build the Cliff Walking environment and search for the optimal travel path using Sarsa and Q-learning, respectively.
• Different settings of ε in the ε-greedy policy lead to different amounts of exploration during policy updates. Try several values of ε (e.g. ε = 0.1 and ε = 0) to investigate their impact on performance.
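As a rough sketch of how the two update rules differ, the following Python code implements tabular Sarsa and Q-learning with ε-greedy action selection. It assumes a hypothetical environment function step(state, action) that returns (next_state, reward, done) with four actions indexed 0 to 3. The step size alpha = 0.5, the episode count, the zero initialization of the table and the fixed seed are illustrative assumptions, not values given by the assignment. No discount factor appears because the task is undiscounted.

```python
import random
from collections import defaultdict

N_ACTIONS = 4  # up, down, right, left


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = q[state]
    return values.index(max(values))  # ties go to the lowest action index


def sarsa(step, start, epsilon, alpha=0.5, episodes=500, seed=0):
    """On-policy control: bootstrap from the action actually taken next."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * N_ACTIONS)
    for _ in range(episodes):
        state, done = start, False
        action = epsilon_greedy(q, state, epsilon, rng)
        while not done:
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            target = reward + (0.0 if done else q[next_state][next_action])
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
    return q


def q_learning(step, start, epsilon, alpha=0.5, episodes=500, seed=0):
    """Off-policy control: bootstrap from the greedy value of the next state."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * N_ACTIONS)
    for _ in range(episodes):
        state, done = start, False
        while not done:
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            target = reward + (0.0 if done else max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q
```

Given such a step function and a start state, one could call sarsa(step, start, epsilon=0.1) and q_learning(step, start, epsilon=0.1), repeat with epsilon=0, and compare the greedy paths read off the returned tables. The only difference between the two functions is the target: Sarsa uses the value of the action it will actually take next, while Q-learning uses the maximum value in the next state.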
