
Problem 2 Final Exam 2020

by Andreea-Alexandra Musat
Number of replies: 3

Hello, 


I have a question regarding the second problem from the 2020 final exam. We are given n i.i.d. samples from a Bernoulli distribution with parameter \mu, where we know that \mu lies between \kappa and 1 - \kappa for some \kappa \in [0, \frac{1}{2}], and we are asked to accurately estimate the entropy of this distribution.
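For reference, the quantity to be estimated is the binary entropy of the distribution,

h(\mu) = h_2(\mu) = -\mu \log(\mu) - (1 - \mu) \log(1 - \mu),

which is what the plug-in estimator in 1) below targets.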

1) The proposed solution is to construct an estimate \hat{\mu}(S) of the true parameter \mu and then to use it for computing the entropy. This should be done by taking into account that \mu \in [\kappa, 1-\kappa]. However, the proposed estimator is \hat{\mu}(S) = \min\{\max\{\kappa, \frac{1}{n} \sum_{i=1}^n X_i\}, \frac{1}{2}\}. I don't understand why it makes sense for our estimator to have a maximum value of \frac{1}{2}. This means that if \mu > \frac{1}{2} (e.g., \kappa = \frac{1}{4}, \mu = 0.7, which is perfectly fine according to our assumptions), our estimate will never be more than \frac{1}{2}. I guess instead we'd want \hat{\mu}(S) = \min\{\max\{\kappa, \frac{1}{n} \sum_{i=1}^n X_i\}, 1 - \kappa\} in order to make sure that our estimator is in the correct range?
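To make the issue concrete, here is a minimal numerical sketch (variable names are mine) comparing the two clippings on the \kappa = \frac{1}{4}, \mu = 0.7 example:

import numpy as np

rng = np.random.default_rng(0)
kappa, mu, n = 0.25, 0.7, 10_000  # mu > 1/2 is allowed, since mu is in [kappa, 1 - kappa]
sample_mean = rng.binomial(1, mu, size=n).mean()

# Estimator as written in the solution: clips the sample mean into [kappa, 1/2]
mu_hat_half = min(max(kappa, sample_mean), 0.5)
# Estimator with the corrected upper clip: [kappa, 1 - kappa]
mu_hat_corrected = min(max(kappa, sample_mean), 1 - kappa)

print(sample_mean)       # ~ 0.7
print(mu_hat_half)       # 0.5 -- stuck at the cap, biased no matter how large n is
print(mu_hat_corrected)  # ~ 0.7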


2) For the second part, I understand that X_i - \mu is \frac{1}{4}-subgaussian and the resulting bound on \mathbb{P}(|\hat{\mu}(S) - \mu| \geq \epsilon) (this is simply Hoeffding's lemma applied to the Bernoulli random variable, plus the subgaussian tail bound). Just to make sure: do you obtain the bound on \mathbb{P}(|\hat{h}(S) - h| \geq \epsilon) via the mean value theorem for h_2?
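To spell out the first step (writing \bar{X}_n := \frac{1}{n} \sum_{i=1}^n X_i for the sample mean, my notation): since each X_i - \mu is \frac{1}{4}-subgaussian, the average \bar{X}_n - \mu is \frac{1}{4n}-subgaussian, so

\mathbb{P}(|\bar{X}_n - \mu| \geq \epsilon) \leq 2 e^{-2 n \epsilon^2},

and because clipping to [\kappa, 1 - \kappa] can only move \bar{X}_n closer to \mu \in [\kappa, 1 - \kappa], the same bound holds with \hat{\mu}(S) in place of \bar{X}_n.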


Thank you!

Andreea

In reply to Andreea-Alexandra Musat

Re: Problem 2 Final Exam 2020

by Thomas Weinberger
Hello Andreea,

1) You are of course correct; it should read \hat{\mu}(S) := \min\{\max\{\kappa, \frac{1}{n}\sum_{i=1}^n X_i\}, 1-\kappa\}.
2) I don't think this is due to the mean value theorem. I will try to reconstruct the argument soon and will then post it here (until then, don't worry, this will not appear in the midterm...)

Best,
Thomas
In reply to Thomas Weinberger

Re: Problem 2 Final Exam 2020

by Andreea-Alexandra Musat
Hello Thomas,

Thanks for the reply!

To expand a bit on my reasoning for (2):
h_2(x) = -x \log(x) - (1 - x) \log(1 - x) is continuous and differentiable on (0, 1), and thus for any a, b \in (0, 1) with a < b there exists c \in (a, b) such that h_2^\prime(c) = \frac{h_2(b) - h_2(a)}{b - a} (*). Because X_i - \mu is \frac{1}{4}-subgaussian, we have \mathbb{P}(|\hat{\mu}(S) - \mu| \geq \epsilon) \leq \delta (for some \epsilon that depends on \delta), and so if we multiply both sides inside the probability by |h_2^\prime(c)| we get \mathbb{P}(|h_2^\prime(c)| \, |\hat{\mu}(S) - \mu| \geq |h_2^\prime(c)| \, \epsilon) \leq \delta. The bound on \mathbb{P}(|h_2(\hat{\mu}(S)) - h_2(\mu)| \geq \epsilon^\prime) then follows from (*) with a and b taken as \hat{\mu}(S) and \mu, together with the fact that h_2^\prime(c) is bounded: both points lie in [\kappa, 1 - \kappa], so c does too, and there |h_2^\prime(x)| = |\log \frac{1-x}{x}| \leq \log \frac{1-\kappa}{\kappa}.
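Putting the pieces together explicitly (my notation, with L := \log \frac{1-\kappa}{\kappa} the bound on |h_2^\prime| over [\kappa, 1 - \kappa]): (*) gives |h_2(\hat{\mu}(S)) - h_2(\mu)| \leq L \, |\hat{\mu}(S) - \mu|, so

\mathbb{P}(|h_2(\hat{\mu}(S)) - h_2(\mu)| \geq \epsilon^\prime) \leq \mathbb{P}(|\hat{\mu}(S) - \mu| \geq \epsilon^\prime / L) \leq 2 \exp(-2 n \epsilon^{\prime\,2} / L^2),

using the Hoeffding bound from my first message.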
But of course, there might be better/shorter ways of getting to this.

Best,
Andreea