Problem 3: Epsilon-Greedy Algorithm - HMW 4

by Thomas Weinberger
Dear Gabin,

Your approach looks reasonable! Regarding the last term: try writing each $\mathbb{E}[X_{i}^{t^*}]$ explicitly as a sum of products (the regret incurred when choosing an arm $1 \leq t \leq k$, times the probability that this arm is the one, $t^*$, that looks best). Moreover, note that the latter probability can be upper bounded by the probability that arm $t^*$ looks better than just the optimal arm (arm 1 by convention).
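
To make the hint concrete, here is a rough sketch of the shape such a decomposition could take. The notation is assumed for illustration and may differ from the homework's: $\Delta_t$ stands for the gap between the mean of arm 1 and the mean of arm $t$ (so $\Delta_1 = 0$), and $\hat{\mu}_t$ for the empirical mean of arm $t$ at the step in question.

```latex
% Sketch under assumed notation (\Delta_t, \hat{\mu}_t are not from the original post).
% Expected regret of one exploitation step, split over which arm looks best:
\begin{align*}
\mathbb{E}[\text{regret}]
  &= \sum_{t=1}^{k} \Delta_t \, \Pr[t^* = t] \\
  % Arm 1 contributes nothing since \Delta_1 = 0.
  &= \sum_{t=2}^{k} \Delta_t \, \Pr[t^* = t] \\
  % If arm t looks best among all arms, it in particular looks
  % at least as good as arm 1, so the event is contained in a simpler one.
  &\leq \sum_{t=2}^{k} \Delta_t \, \Pr[\hat{\mu}_t \geq \hat{\mu}_1].
\end{align*}
```

The last line only compares two arms at a time, which is usually the form in which a concentration inequality can be applied.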

You might also want to take a look at my recent post in the discussion forum.

Best,
Thomas