The goal of this project is to implement a reinforcement learning algorithm based on double Q-learning with function approximation. Two multi-layer neural networks should serve as the Q-function approximators. The environment is Acrobot-v1 from the OpenAI Gym package.
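As a starting point, the double Q-learning update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution. To stay dependency-free, it uses a linear approximator (the hypothetical `LinearQ` class) in place of the multi-layer networks the project requires. The helper names `double_q_target` and `double_q_step`, the learning rate, and the discount factor are likewise illustrative choices. The demo transition is made up and only matches the shape of Acrobot-v1 data (6 observation components, 3 discrete actions).

```python
import random


class LinearQ:
    """Stand-in approximator (assumption): Q(s, a) = w_a . s, one weight vector per action.
    The project itself calls for a multi-layer neural network here."""

    def __init__(self, n_features, n_actions, rng):
        self.w = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
                  for _ in range(n_actions)]

    def values(self, state):
        # Q-values for every action in the given state.
        return [sum(wi * si for wi, si in zip(w_a, state)) for w_a in self.w]

    def update(self, state, action, target, lr):
        # Semi-gradient step on 0.5 * (target - Q(s, a))^2, treating the target as constant.
        error = target - self.values(state)[action]
        self.w[action] = [wi + lr * error * si
                          for wi, si in zip(self.w[action], state)]
        return error


def double_q_target(selector, evaluator, reward, next_state, done, gamma):
    """The selector picks the greedy next action; the evaluator scores that action."""
    if done:
        return reward
    next_q = selector.values(next_state)
    best_action = max(range(len(next_q)), key=next_q.__getitem__)
    return reward + gamma * evaluator.values(next_state)[best_action]


def double_q_step(q_a, q_b, transition, gamma, lr, rng):
    """Update one approximator, chosen by a fair coin flip, using the other as evaluator."""
    state, action, reward, next_state, done = transition
    if rng.random() < 0.5:
        learner, other = q_a, q_b
    else:
        learner, other = q_b, q_a
    target = double_q_target(learner, other, reward, next_state, done, gamma)
    return learner.update(state, action, target, lr)


# Made-up transition with Acrobot-v1 shapes: 6 observation components, 3 actions.
rng = random.Random(0)
q_a = LinearQ(n_features=6, n_actions=3, rng=rng)
q_b = LinearQ(n_features=6, n_actions=3, rng=rng)
demo_transition = ([1.0, 0.0, 1.0, 0.0, 0.0, 0.0], 1, -1.0,
                   [0.99, 0.1, 0.98, -0.2, 0.5, -0.3], False)
td_error = double_q_step(q_a, q_b, demo_transition, gamma=0.99, lr=0.01, rng=rng)
```

The key design point is that action selection and action evaluation use different approximators, which is what reduces the overestimation bias of ordinary Q-learning. When acting in the environment, one common choice is to be epsilon-greedy with respect to the sum of the two Q estimates.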
• A report in PDF format, including a description of the problem, a description of the neural networks, pseudo-code for the algorithm, the performance of the algorithm, training time, lessons learned, and possible improvements.
• Source code in Python.
Any code taken from the Internet (or from peers in the class) must be clearly labeled with its source.