Homework #3
Defeat Policy Iteration
Problem
Description
Policy iteration (PI) is perhaps the most underappreciated algorithm for solving MDPs. Although each iteration is expensive, it generally requires very few iterations to find an optimal policy. In this problem, you'll gain an appreciation for how hard it is to get policy iteration to break a sweat.
It is currently unknown whether any MDP requires a number of PI iterations that is more than linear in its number of states. Your goal is to create an MDP with at most 30 states on which PI runs for at least 15 iterations before the algorithm terminates.
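For orientation, here is a minimal sketch of policy iteration in Python. It is not the provided tester. The data layout (`mdp[s][a]` as a list of `(probability, next_state, reward)` tuples), the all-zeros starting policy, the tie-breaking rule, and the way iterations are counted are all assumptions made for illustration, so the tester may report a different count for the same MDP.

```python
def evaluate_policy(mdp, policy, gamma, tol=1e-10):
    """Approximate the value of a fixed policy by repeated in-place sweeps."""
    values = [0.0] * len(mdp)
    while True:
        delta = 0.0
        for s, actions in enumerate(mdp):
            v = sum(p * (r + gamma * values[t]) for p, t, r in actions[policy[s]])
            delta = max(delta, abs(v - values[s]))
            values[s] = v
        if delta < tol:
            return values


def policy_iteration(mdp, gamma=0.75):
    """Return (policy, values, iterations) for a small tabular MDP.

    Assumption: one iteration = one evaluation followed by one greedy
    improvement pass, including the final pass that changes nothing.
    """
    policy = [0] * len(mdp)  # assumed starting policy: first action everywhere
    iterations = 0
    while True:
        values = evaluate_policy(mdp, policy, gamma)
        iterations += 1
        changed = False
        for s, actions in enumerate(mdp):
            q = [sum(p * (r + gamma * values[t]) for p, t, r in outcomes)
                 for outcomes in actions]
            best = max(range(len(q)), key=q.__getitem__)
            # Switch only on a strict improvement so ties cannot cause cycling.
            if q[best] > q[policy[s]] + 1e-9:
                policy[s] = best
                changed = True
        if not changed:
            return policy, values, iterations
```

Running this on candidate MDPs gives a rough idea of how many improvement rounds a design forces, but the provided tester remains the reference for the submission requirement.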
Procedure
● Construct an MDP with at most 30 states and at most 2 actions per state. You may assume the discount factor is 3/4. The MDP may have stochastic transitions.
● Use an editor or a simple program to create a JSON description of the target MDP that is parseable by the tester.
○ The JSON must use double quotes, not single quotes.
○ The entire description must be less than 100,000 characters.
● Validate your description with a JSON linter and a character counter:
○ http://jsonlint.com/
○ http://www.charactercountonline.com/
● Test your MDP locally with the provided tester to ensure you meet the submission requirements.
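The checks listed above can also be run locally before using the tester. The following is a sketch only: `check_description` is a hypothetical helper, not part of the provided tooling, and the checks on probabilities and on `to` targets are assumptions about what a well-formed description needs rather than documented tester behavior.

```python
import json

MAX_CHARS = 100_000  # the description must be shorter than this
MAX_STATES = 30
MAX_ACTIONS = 2


def check_description(text):
    """Return a list of problems found in a JSON MDP description (empty if none)."""
    problems = []
    if len(text) >= MAX_CHARS:
        problems.append(f"description has {len(text)} characters, limit is {MAX_CHARS}")
    try:
        mdp = json.loads(text)  # strict JSON: single-quoted strings are rejected
    except json.JSONDecodeError as exc:
        return problems + [f"not valid JSON: {exc}"]
    if not isinstance(mdp, dict):
        return problems + ["top-level value must be a JSON object"]

    states = mdp.get("states", [])
    if len(states) > MAX_STATES:
        problems.append(f"found {len(states)} states, limit is {MAX_STATES}")
    state_ids = {s["id"] for s in states}
    for s in states:
        actions = s["actions"]
        if len(actions) > MAX_ACTIONS:
            problems.append(f"state {s['id']} has {len(actions)} actions, limit is {MAX_ACTIONS}")
        for a in actions:
            # Assumed check: outgoing probabilities of each action sum to 1.
            total = sum(t["probability"] for t in a["transitions"])
            if abs(total - 1.0) > 1e-9:
                problems.append(f"state {s['id']} action {a['id']}: probabilities sum to {total}")
            # Assumed check: every transition targets a declared state id.
            for t in a["transitions"]:
                if t["to"] not in state_ids:
                    problems.append(f"state {s['id']} action {a['id']}: unknown target {t['to']}")
    return problems
```

If the description is generated by a program, writing it with `json.dumps` produces double-quoted strings automatically, which avoids the single-quote problem that arises when printing a Python dict directly.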
Example
The following is an example of the JSON definition of a simple MDP:
{
  "gamma": 0.75,
  "states": [
    {
      "id": 0,
      "actions": [
        {
          "id": 0,
          "transitions": [
            {
              "id": 0,
              "probability": 0.5,
              "reward": 0,
              "to": 0
            },
            {
              "id": 1,
              "probability": 0.5,
              "reward": 0,
              "to": 1
            }
          ]
        }
      ]
    },
    {
      "id": 1,
      "actions": [
        {
          "id": 0,
          "transitions": [
            {
              "id": 0,
              "probability": 1,
              "reward": 1,
              "to": 1
            }
          ]
        }
      ]
    }
  ]
}
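As a sanity check on how this format reads, the state values of the example can be worked out by hand. Each state has a single action, so there is only one policy. The derivation below assumes the usual convention that a transition's reward is received when the transition is taken and that the next state's value is discounted by gamma; the tester's exact convention is not stated here.

```latex
\begin{align*}
V(1) &= 1 + 0.75\,V(1)
  &&\Rightarrow\quad V(1) = \frac{1}{1 - 0.75} = 4 \\
V(0) &= 0.5\,\bigl(0 + 0.75\,V(0)\bigr) + 0.5\,\bigl(0 + 0.75\,V(1)\bigr)
  = 0.375\,V(0) + 1.5
  &&\Rightarrow\quad V(0) = \frac{1.5}{0.625} = 2.4
\end{align*}
```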
Resources
The concepts explored in this homework are covered by:
● Lectures
○ Lesson 1: Smoov & Curly's Bogus Journey
○ Lesson 5: AAA
● Readings
○ Littman (1996), chapters 1-2
Additionally, a tool to create and test your MDPs can be found here: