Sale!

Choice and Game Theory Assignment 3

$30.00

Category:
Rate this product

COMP4418,  – Assignment 3

Worth: 15%
1. [20 Marks] (Social Choice and Game Theory)
(a)
a b c
d
e f g
In the tournament in the above Figure, assuming all the arcs missing from the figure are
downward arc, list
• the uncovered set;
• the top cycle;
• the set of Copeland winners;
• the set of Banks winners; and
• the set of Condorcet winners.
(b) Compute all the Nash equilibria of the following two player game.
D E
A 2, 4 8, 5
B 6, 6 4, 4
1
2. [30 Marks] (Decision Making)
(a) For each of the following games, choose the best model among (A) Markov chain (Markov
process); (B) Markov decision process (MDP); (C) Hidden Markov model (HMM); (D)
Partially-observable Markov decision process (POMDP); and (E) None/Other.
• Blackjack
• Candy Crush
• Chess
• Minesweeper
• Snakes and Ladders
• Texas Hold ’em Poker
For the next questions, consider the Markov Decision Process depicted below. Edges are labelled
“name of the action (reward associated), probability of the transition”.
Stay (1), 1 s1 s2 Stay (0), 1 s3
Leave (0), 1
Leave (5), 0.5
Leave (5), 0.5 Stay (−2), 1
Leave (−2), 1
(b) Using your intuition, give an optimal policy for situations where the discount factor is very
high (for instance, δ = 0.999)? Explain your reasoning in two or three sentences.
(c) Using your intuition, give an optimal policy for situations where the discount factor is very
low (for instance, δ = 0.001)? Explain your reasoning in two or three sentences.
(d) Represent the values computed during the first three iterations of the Value Iteration algorithm
using the following format where L represents the action Leave and S represents the action
Stay. Use a discounting factor of 0.6.
V0(s) V0(s, S) V0(s, L) V1(s) V1(s, S) V1(s, L) V2(s) V2(s, S) V2(s, L) V3(s)
s1 0 1 . . .
s2 0 . . .
s3 0
(e) Let π be the following policy: π(s1) = L, π(s2) = L, π(s3) = S. If π is assumed to hold, the
MDP turns into a Markov Chain. Represent this Markov Chain / Markov Process.
(f) Assuming the agent uses π, express the value associated to each state as a function of the
discount factor δ. Provide the formal derivation of the result as part of your answer. Elaborate
on whether the computations of this question support the intuition of questions 2b and 2c.
2
Submission
• Put your written solutions in a single PDF file assn3.pdf
• Submit using the command: give cs4418 assn3 assn3.pdf
Late Submissions
Due to the assignment due date being extended to the 22nd November there will be no late submissions
allowed.

Choice and Game Theory Assignment 3
$30.00
Open chat
Need help?
Hello
Can we help?