Starting from:

$30

CS221-Homework 5 Multi-agent Pac-Man Solved

Problem 1: Minimax
 Before you code up Pac-Man as a minimax agent, notice that instead of just one adversary, Pac-Man could have multiple ghosts as adversaries. So we will extend the minimax algorithm from class (which had only one min stage for a single adversary) to the more general case of multiple adversaries. In particular, your minimax tree will have multiple min layers (one for each ghost) for every max layer.
Specifically, consider the limited depth tree minimax search with evaluation functions taught in class. Suppose there are n+1 agents on the board, a0,…,an, where a0 is Pac-Man and the rest are ghosts. Pac-Man acts as a max agent, and the ghosts act as min agents. A single depth consists of all n+1 agents making a move, so depth 2 search will involve Pac-Man and each ghost moving two times. In other words, a depth of 2 corresponds to a height of 2(n+1) in the minimax game tree.

Write the recurrence for Vmax,min(s,d) in math. You should express your answer in terms of the following functions: IsEnd(s), which tells you if s is an end state; Utility(s), the utility of a state; Eval(s), an evaluation function for the state s; Player(s), which returns the player whose turn it is; Actions(s), which returns the possible actions; and Succ(s,a), which returns the successor state resulting from taking an action at a certain state. You may use any relevant notation introduced in lecture.
Now fill out MinimaxAgent class in submission.py using the above recurrence. Remember that your minimax agent should work with any number of ghosts, and your minimax tree should have multiple min layers (one for each ghost) for every max layer. 

Your code should also expand the game tree to an arbitrary depth. Score the leaves of your minimax tree with the supplied self.evaluationFunction, which defaults to scoreEvaluationFunction. The class MinimaxAgent extends MultiAgentSearchAgent, which gives access to self.depth and self.evaluationFunction. Make sure your minimax code makes reference to these two variables where appropriate as these variables are populated from the command line options.

Other functions that you might use in the code: GameState.getLegalActions() which returns all the possible legal moves, where each move is Directions.X for some X in the set {NORTH, SOUTH, WEST, EAST, STOP}. Go through ReflexAgent code as suggested before to see how the above are used and also for other important methods like GameState.getPacmanState(), GameState.getGhostStates(), GameState.getScore() etc. These are further documented inside the MinimaxAgent class.

Implementation HintsPac-Man is always agent 0, and the agents move in order of increasing agent index. Use self.index in your minimax implementation to refer to the Pac-Man's index. Notice that only Pac-Man will actually be running your MinimaxAgent.
All states in minimax should be GameStates, either passed in to getAction or generated via GameState.generateSuccessor. In this project, you will not be abstracting to simplified states.
The evaluation function in this part is already written (self.evaluationFunction), and you should call this function without changing it. Use self.evaluationFunction in your definition of Vmax,min wherever you used Eval(s) in part 1a. Recognize that now we're evaluating states rather than actions. Look-ahead agents evaluate future states whereas reflex agents evaluate actions from the current state.
You can assume that you will always have at least one action returned from getLegalActions function.
If there is a tie between multiple actions for the best move, you may break the tie however you see fit.
The minimax values of the initial state in the minimaxClassic layout are 9, 8, 7, -492 for depths 1, 2, 3 and 4 respectively. You can use these numbers to verify whether your implementation is correct. Note that your minimax agent will often win (just under 50% of the time for us--be sure to test on a large number of games using the -n and -q flags) despite the dire prediction of depth 4 minimax.python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4
Further ObservationsOn larger boards such as openClassic and mediumClassic (the default), you'll find Pac-Man to be good at not dying, but quite bad at winning. He'll often thrash around without making progress. He might even thrash around right next to a dot without eating it. Don't worry if you see this behavior. Why does Pac-Man thrash around right next to a dot? (These questions are here for you to ponder upon; no need to include in the write-up.)
Consider the following run:python pacman.py -p MinimaxAgent -l trappedClassic -a depth=3
Why do you think Pac-Man rushes the closest ghost in minimax search on trappedClassic?
Problem 2: Alpha-beta pruning
Make a new agent that uses alpha-beta pruning to more efficiently explore the minimax tree in AlphaBetaAgent. Again, your algorithm will be slightly more general than the pseudo-code in the slides, so part of the challenge is to extend the alpha-beta pruning logic appropriately to multiple minimizer agents.You should see a speed-up (perhaps depth 3 alpha-beta will run as fast as depth 2 minimax). Ideally, depth 3 on mediumClassic should run in just a few seconds per move or faster.

python pacman.py -p AlphaBetaAgent -a depth=3
The AlphaBetaAgent minimax values should be identical to the MinimaxAgent minimax values, although the actions it selects can vary because of different tie-breaking behavior. Again, the minimax values of the initial state in the minimaxClassic layout are 9, 8, 7, and -492 for depths 1, 2, 3, and 4, respectively. Running the command given above this paragraph, the minimax values of the initial state should be 9, 18, 27, and 36 for depths 1, 2, 3, and 4, respectively.
Problem 3: Expectimax
 Random ghosts are of course not optimal minimax agents, so modeling them with minimax search is not optimal. Instead, write down the recurrence for Vmax,opp(s,d), which is the maximum expected utility against ghosts that each follow the random policy which chooses a legal move uniformly at random. Your recurrence should resemble that of Problem 1a (meaning you should write it in terms of the same functions that were specified in 1a).
 Fill in ExpectimaxAgent, where your agent will no longer take the min over all ghost actions, but the expectation according to your agent's model of how the ghosts act. Assume Pac-Man is playing against multiple RandomGhost, which each chooses getLegalActions uniformly at random.You should now observe a more cavalier approach to close quarters with ghosts. In particular, if Pac-Man perceives that he could be trapped but might escape to grab a few more pieces of food, he'll at least try:

python pacman.py -p ExpectimaxAgent -l trappedClassic -a depth=3
You may have to run this scenario a few times to see Pac-Man's gamble pay off. Pac-Man would win half the time on an average and for this particular command, the final score would be -502 if Pac-Man loses and 532 or 531 (depending on your tiebreaking method and the particular trial) if it wins (you can use these numbers to validate your implementation).Why does Pac-Man's behavior in expectimax differ from the minimax case (i.e., why doesn't he head directly for the ghosts)? Again, just think about it; no need to write it up.
Problem 4: Evaluation function (extra credit)
Write a better evaluation function for Pac-Man in the provided function betterEvaluationFunction. The evaluation function should evaluate states (rather than actions). You may use any tools at your disposal for evaluation, including any util.py code from the previous assignments. With depth 2 search, your evaluation function should clear the smallClassic layout with two random ghosts more than half the time for full credit and still run at a reasonable rate.python pacman.py -l smallClassic -p ExpectimaxAgent -a evalFn=better -q -n 10
We will run your Pac-Man agent 20 times, and calculate the average score you obtained in the winning games. Starting from 1300, you obtain 1 point per 100 point increase in your average winning score. In grader.py, you can see the how extra credit is awarded. You get 8 extra points when you reach 2000 or more. In addition, the top 3 people in the class will get additional points.

Check out the current top scorers on the leaderboard! You will be added automatically when you submit.Hints and Observations

Having gone through the rest of the assignment, you should play Pac-Man again yourself and think about what kinds of features you want to add to the evaluation function. How can you add multiple features to your evaluation function?
You may want to use the reciprocal of important values rather than the values themselves for your features.
The betterEvaluationFunction should run in the same time limit as the other problems.

More products