Description
The goal of the Final Course Project is to explore advanced methods and/or applications in reinforcement learning. You will be expected to prepare a proposal, a milestone report, a final report, and a final presentation. All projects should evaluate novel ideas that pertain to deep RL or its applications, and every project must involve reinforcement learning algorithms. You are encouraged to use your ongoing research as your project in this course, provided that it relates to deep reinforcement learning. You may discuss the topic of your final project with course staff by email, by private message on Piazza, or in office hours. If you are not sure about the topic, we encourage you to speak with us. There are a few possible directions, each with its own checkpoint.
Multiagent RL
Building and solving multiagent tasks (including but not limited to agent communication, transportation problems, and multi-agent cooperation). Any of these might lead to a research project. Steps:
1. Build a multiagent environment from scratch (this can be an extension of your work in Assignment 1).
2. Solve the environment using any tabular method.
3. Solve the environment using any deep RL method (DQN, DDQN, AC, A2C, DDPG, TRPO, PPO, etc.) and compare the results.
Checkpoints:
• Solve the environment using any tabular method
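To make steps 1 and 2 concrete, here is a minimal sketch of a two-agent grid world solved with independent tabular Q-learning. Everything in it is an illustrative assumption rather than a course requirement: the `TwoAgentGrid` class, the collision rule, the reward values, and the hyperparameters are all hypothetical, and each agent simply learns its own Q-table over the joint state.

```python
import random
from collections import defaultdict

# Moves as (row delta, column delta): right, left, down, up, stay.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]


class TwoAgentGrid:
    """Hypothetical environment: each agent must reach its own goal cell."""

    def __init__(self, size=4, max_steps=50):
        self.size = size
        self.max_steps = max_steps
        self.goals = [(size - 1, size - 1), (0, size - 1)]

    def reset(self):
        self.pos = [(0, 0), (self.size - 1, 0)]
        self.t = 0
        return tuple(self.pos)

    def step(self, actions):
        proposed = []
        for (r, c), a in zip(self.pos, actions):
            dr, dc = ACTIONS[a]
            # Clip moves at the grid border.
            nr = min(max(r + dr, 0), self.size - 1)
            nc = min(max(c + dc, 0), self.size - 1)
            proposed.append((nr, nc))
        # Assumed collision rule: if both target the same cell, neither moves.
        if proposed[0] != proposed[1]:
            self.pos = proposed
        self.t += 1
        rewards = [10.0 if p == g else -1.0 for p, g in zip(self.pos, self.goals)]
        solved = all(p == g for p, g in zip(self.pos, self.goals))
        done = solved or self.t >= self.max_steps
        return tuple(self.pos), rewards, done


def train(episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Independent Q-learning: one Q-table per agent over the joint state."""
    rng = random.Random(seed)
    env = TwoAgentGrid()
    n_actions = len(ACTIONS)
    q = [defaultdict(float), defaultdict(float)]
    returns = []
    for _ in range(episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            actions = []
            for i in range(2):
                if rng.random() < epsilon:  # explore
                    actions.append(rng.randrange(n_actions))
                else:  # exploit (ties go to the lowest action index)
                    values = [q[i][(state, a)] for a in range(n_actions)]
                    actions.append(values.index(max(values)))
            next_state, rewards, done = env.step(actions)
            for i in range(2):
                best_next = max(q[i][(next_state, a)] for a in range(n_actions))
                # Simplification: a timeout is treated like a true terminal state.
                target = rewards[i] + (0.0 if done else gamma * best_next)
                key = (state, actions[i])
                q[i][key] += alpha * (target - q[i][key])
            total += sum(rewards)
            state = next_state
        returns.append(total)
    return q, returns


q_tables, episode_returns = train(episodes=200)
print("mean team return, last 20 episodes:", sum(episode_returns[-20:]) / 20)
```

The per-episode returns collected here are the kind of curve you would later compare against a deep RL method in step 3.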
RC Cars
Set up the simulator, train the cars in the simulator, and apply the results to real RC cars. These steps may require prior knowledge of robotics or autonomous vehicles.
Steps:
1. Install and explore the DeepRacer RC cars simulator
2. Check existing solutions and apply any RL methods to teach the car to drive in the simulator
3. Apply what was learned to the real RC cars (with the ultimate goal of making a car move forward using only the RL algorithm)
Checkpoints:
• Apply any RL methods to teach the car to drive in the simulator
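For the simulator step, much of the work is reward design. The sketch below assumes the simulator is AWS DeepRacer, whose console trains with a user-supplied `reward_function(params)` and provides input keys such as `all_wheels_on_track`, `distance_from_center`, and `track_width`. The linear shaping and the numeric values are illustrative assumptions, not tuned settings.

```python
def reward_function(params):
    """Reward staying close to the center line (illustrative values only)."""
    # Near-zero reward once any wheel leaves the track.
    if not params["all_wheels_on_track"]:
        return 1e-3

    half_width = 0.5 * params["track_width"]
    # Normalized offset: 0.0 on the center line, 1.0 at the track border.
    offset = min(params["distance_from_center"] / half_width, 1.0)

    # Linearly decreasing reward, floored so it never reaches exactly zero.
    return float(max(1.0 - offset, 1e-3))
```

A shaped reward like this is only a starting point; speed or steering terms can be added once the car reliably stays on the track.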
Exploring Deep RL Algorithms
Explore recent advances in RL. This may include solving ANY of the below environments using deep RL algorithms.
Possible environments include:
• Google Research Football Environment [blog post, github, includes participation in the tournament]
• MALMO (platform built on top of Minecraft) [github]
• Robotics by OpenAI [details, blog post]
• Atari by OpenAI [details]
Steps:
1. Set up the environment
2. Check the existing baseline methods applied to solve it
3. Apply deep RL to improve the results
Checkpoints:
• Check existing baseline methods applied to solve it
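Comparing against a baseline needs a consistent evaluation routine. The sketch below is one possible harness: it runs a policy for a fixed number of episodes and reports the mean and standard deviation of the episodic return. It assumes an environment whose `reset()` returns an observation and whose `step(action)` returns `(observation, reward, done, info)`; real libraries may use a different signature. `CoinFlipEnv` is a hypothetical stand-in so the example runs without extra installs.

```python
import random
import statistics


class CoinFlipEnv:
    """Hypothetical toy environment: reward 1.0 for matching the observed coin."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.coin = self.rng.randrange(2)
        return self.coin

    def step(self, action):
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randrange(2)  # next observation
        done = self.t >= self.horizon
        return self.coin, reward, done, {}


def evaluate(env, policy, episodes=100):
    """Return (mean, standard deviation) of the episodic return for a policy."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns.append(total)
    return statistics.mean(returns), statistics.stdev(returns)


policy_rng = random.Random(1)
baseline_mean, baseline_std = evaluate(CoinFlipEnv(), lambda obs: policy_rng.randrange(2))
agent_mean, agent_std = evaluate(CoinFlipEnv(), lambda obs: obs)
print(f"random baseline: {baseline_mean:.2f} +/- {baseline_std:.2f}")
print(f"matching policy: {agent_mean:.2f} +/- {agent_std:.2f}")
```

Using the same episode count and environment seed for both policies keeps the comparison fair; the same function can then score a published baseline and your own agent.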
You can also propose your own topic, in which case you will get an individual checkpoint.
If you get interesting results, we encourage you to share your project with the public by participating in the CSE Demo Days or other events, so it helps to choose a topic that you are really interested in.
If you do not know what to choose, go with Exploring Deep RL Algorithms on Atari by OpenAI.
To propose your own topic, please talk to Alina [avereshc@buffalo.edu].
Google Form link will be added later.
The project proposal should be a one-page, single-spaced extended abstract motivating and outlining the project you plan to complete. Your proposal should have the following structure:
1. Topic
2. Objective. Explain the objective of the project and why that objective is relevant and important.
3. Related Work. Briefly review the most relevant prior work, and highlight where those works fall short of meeting the objectives described above.
4. Technical Outline. Explain your approach at a high level, making clear the novel technical contribution. State which environment and algorithm you plan to use.
Before you submit your proposal, it should be approved by a member of the course staff.
Each direction has its own expectations for the middle checkpoint. If you do your own project, the checkpoint has to be confirmed during the proposal stage.
Complete your project in either a Jupyter Notebook or a Python script. In your report, include:
• The main motivation of your project (Why is it important/novel?)
• Preliminary materials (Discuss the algorithms, some background info you need to know)
• Implementation details
• Your results
Present your work
Presentation Days: will be added later
Present your work during the Presentation Days. Registration slots will be available about a week before the presentation dates. The whole team should present the work. Note: your presentation should represent the work you have submitted. If you take part in the CSE Demo Days, you will give a short presentation on that day.
Presentation details
Length: 10 mins + followup questions
Presentation Templates: UB branded ppt templates or UB CSE PowerPoint template
Suggested presentation structure:
– Project Title / Team’s Name / Course / Date [1 slide]
– Project Description [1 slide]
– Background [max 2 slides]
– Implementation [max 2 slides]
– Results (Graphs & Any Visuals) [max 4 slides]
– Key Observations / Summary [1 slide]
– Thank You [1 slide]