Due to the imposed Lipschitzness of the reward (with Lipschitz constant $L$), the reward around each grid point changes at most linearly in the distance from that point. Moreover, since the grid points are spaced at a distance of $\epsilon$ apart, we can approximate the regret within each interval around a given grid point by the regret at the grid point itself, being off by at most $L\epsilon$ per round.
The only thing that remains is then to add the approximation error $L\epsilon T$ to the bound obtained in i), plug in $K = 1/\epsilon$, and optimize for $\epsilon$ (set the derivative w.r.t. $\epsilon$ equal to $0$).
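Spelled out, and assuming the bound from i) has the standard UCB-type form $O(\sqrt{KT\log T})$ for $K$ arms (adapt the constants to whatever i) actually gives), the combination reads:

$$
\mathrm{Regret}(T) \;\le\; \underbrace{O\!\big(\sqrt{KT\log T}\big)}_{\text{bound from i)}} \;+\; \underbrace{L\epsilon T}_{\text{approximation error}}, \qquad K = \lceil 1/\epsilon \rceil,
$$

which after plugging in $K \approx 1/\epsilon$ becomes the single-variable objective $O\!\big(\sqrt{T\log T/\epsilon}\big) + L\epsilon T$ to be minimized over $\epsilon \in (0,1)$.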
My first guess was actually correct. Applying the first-order optimality condition is permitted here because, asymptotically (for $T \to \infty$), the leading term $\sqrt{T\log T/\epsilon} + L\epsilon T$ (which is convex in $\epsilon$!) dominates the lower-order terms hidden in the big-O. Hence the objective function is (informally speaking) "asymptotically convex", and setting the first derivative to zero is sufficient and necessary for obtaining the (asymptotically) optimal choice of $\epsilon$.
Ignoring the big-O notation for a moment, the first derivative is equal to $\frac{d}{d\epsilon}\big(\sqrt{T\log T/\epsilon} + L\epsilon T\big) = -\tfrac{1}{2}\sqrt{T\log T}\,\epsilon^{-3/2} + LT$. Setting this to zero and solving for $\epsilon$, you obtain the optimal choice $\epsilon^* = \Theta\!\big((\log T / (L^2 T))^{1/3}\big)$ (as I explained above, the first-order optimality condition holds even though the function is, strictly speaking, not convex).
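As a quick sanity check on the calculus, here is a small Python sketch. It assumes the part-i) bound is $\sqrt{KT\log T}$ with $K = 1/\epsilon$ and all constants suppressed (an assumption for illustration), and compares the closed-form minimizer from the first-order condition against a brute-force grid search.

```python
import math

def objective(eps, T, L):
    """Regret bound with constants suppressed:
    sqrt(K * T * log T) with K = 1/eps, plus approximation error L * eps * T."""
    return math.sqrt(T * math.log(T) / eps) + L * eps * T

def analytic_eps(T, L):
    """Root of the derivative: -1/2 * sqrt(T log T) * eps^(-3/2) + L*T = 0,
    i.e. eps = (sqrt(T log T) / (2 L T))^(2/3)."""
    return (math.sqrt(T * math.log(T)) / (2 * L * T)) ** (2 / 3)

def grid_search_eps(T, L, n=200_000):
    """Brute-force minimizer of the objective over a fine grid of eps in (0, 1)."""
    best_eps, best_val = None, float("inf")
    for i in range(1, n):
        eps = i / n
        val = objective(eps, T, L)
        if val < best_val:
            best_eps, best_val = eps, val
    return best_eps

T, L = 10**6, 1.0
ea = analytic_eps(T, L)
eg = grid_search_eps(T, L)
print(ea, eg)  # the two minimizers should agree up to the grid resolution
```

Both minimizers land near $\epsilon \approx 0.015$ for $T = 10^6$, $L = 1$, consistent with the $\Theta((\log T / T)^{1/3})$ scaling derived above.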