Re: 2018 exam solutions

by Jean-Cédric Chappelier

As mentioned/sketched in the QAS, in 2018 we had two different occurrences of the course (Spring and Fall) and changed it from 6 ECTS to 4 ECTS; I assume you are referring to **Spring** 2018.

  • questions VI.6 and VI.13 are about using the **full** CYK chart to do lexical initialization of multi-token words: simply put the corresponding non-terminal in the corresponding cell (which is on a row above the first).
    For instance, for the word "satellite antennas", which is a noun (N) made of two tokens, add the non-terminal N to the **second** row (same column as "satellite"); see the small sketch after this list.
  • question VI.9 is about comparing taggings of sentences of different lengths (due to different tokenizations, more precisely to multi-token words, the tokens of which can also be (single-token) words):
    in such a case the probabilities can**not** be compared: it is not the same probability space. More precisely (back to the fundamentals of the first lecture on probabilistic tagging): P(tags | words) = P(words | tags) * P(tags) / P(words) is not divided by the same P(words) (since the "words" change with the tokenization); thus it does not make any sense to compare only the numerators (which is what an HMM computes).
  • question VI.11 draws a relationship between HMM probabilization and SCFG probabilization: basically, you could rewrite an HMM as an SCFG, since an HMM is nothing but a probabilized regular language and any regular language is also context-free (see the sketch after this list).
  • question VI.14: what is your precise question there? (maybe post your answer)
  • question VI.15 is about complexity: the complexity of the Viterbi algorithm (HMM) is linear (w.r.t. the size of the input sentence) whereas CYK is cubic (see the loop-structure sketch after this list).
  • regarding question V.5: I also need a more precise question here, but I would say: lowercasing, normalizing whitespace, URLs, and usernames makes sense in this context; removing punctuation and adding gender does not at all; and removing hash signs is unclear and has pros and cons (to be discussed). A small normalization sketch is given after this list.
  • question III.3: sure: make use of the chart, build on question 1, and proceed similarly to factorize the probabilities (no need to compute everything, just the parts that differ among the choices); furthermore, make use of the fact that one of the "bottom" probabilities (NP --> process) is very small (and thus so are all the parse trees making use of that derivation).
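
To make the multi-token lexical initialization of VI.6/VI.13 concrete, here is a minimal sketch in Python; the toy lexicon, the names `LEXICON` and `lexical_init`, and the (row = span length, column = start position) indexing convention are my own illustration, not part of the exam:

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> set of non-terminals
LEXICON = {
    "satellite antennas": {"N"},   # a noun made of two tokens
    "satellite": {"Adj", "N"},
    "antennas": {"N"},
}

def lexical_init(tokens, lexicon):
    """Fill the chart cell (row = span length, col = start position) for every
    lexicon entry found in the input, including multi-token ones."""
    chart = defaultdict(set)
    n = len(tokens)
    for start in range(n):
        for length in range(1, n - start + 1):
            surface = " ".join(tokens[start:start + length])
            if surface in lexicon:
                # multi-token words land on row `length`, i.e. above the first row
                chart[(length, start)] |= lexicon[surface]
    return chart

tokens = ["the", "satellite", "antennas", "work"]
chart = lexical_init(tokens, LEXICON)
print(chart[(2, 1)])  # {'N'}: "satellite antennas", second row, column of "satellite"
print(chart[(1, 1)])  # the single token "satellite", first row
```

The only point is that "satellite antennas" ends up in the cell (length 2, column of "satellite"), one row above the single-token cells.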
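For VI.11, here is a small sketch of the idea that an HMM can be rewritten as a right-linear (hence regular, hence context-free) stochastic grammar; the two-tag HMM and its probabilities are made up purely for illustration:

```python
# Emission probabilities P(word | tag) and transition probabilities P(next | tag)
emissions = {"DET": {"the": 1.0}, "N": {"cat": 0.6, "dog": 0.4}}
transitions = {"<s>": {"DET": 1.0}, "DET": {"N": 1.0}, "N": {"</s>": 1.0}}

def hmm_to_scfg(emissions, transitions):
    """Turn each HMM tag into a non-terminal X_tag; one right-linear rule per
    (emitted word, next tag) pair, with probability P(word|tag) * P(next|tag)."""
    rules = []
    for tag, words in emissions.items():
        for word, p_emit in words.items():
            for nxt, p_trans in transitions.get(tag, {}).items():
                if nxt == "</s>":
                    rules.append((f"X_{tag} -> {word}", p_emit * p_trans))
                else:
                    rules.append((f"X_{tag} -> {word} X_{nxt}", p_emit * p_trans))
    # start rules come from the initial distribution
    for tag, p in transitions["<s>"].items():
        rules.append((f"S -> X_{tag}", p))
    return rules

for rule, p in hmm_to_scfg(emissions, transitions):
    print(f"{rule}    [{p:.2f}]")
```

Each non-terminal's rule probabilities sum to 1, so the result is a well-formed SCFG that assigns the same probabilities as the original HMM.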
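For VI.15, the complexity difference comes directly from the loop structure of the two algorithms; a schematic sketch (the tag set and the grammar are of fixed size, so only the loops over the input length n matter):

```python
def viterbi_cost(n, tags):
    # one pass over the n positions, constant work (|T|^2) per position -> O(n)
    steps = 0
    for _ in range(n):            # input positions
        for _ in tags:            # current tag
            for _ in tags:        # previous tag
                steps += 1
    return steps

def cyk_cost(n):
    # all spans of length >= 2 and all split points inside each span -> O(n^3)
    steps = 0
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            for _split in range(1, length):
                steps += 1
    return steps

print(viterbi_cost(20, ["DET", "N", "V"]))  # grows linearly with n
print(cyk_cost(20))                         # grows cubically with n
```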
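For V.5, a minimal normalization sketch along the lines suggested above; the regular expressions and the placeholder tokens `<url>` / `<user>` are my own choices:

```python
import re

def normalize_tweet(text, strip_hash=False):
    text = text.lower()                                # lowercasing
    text = re.sub(r"https?://\S+", "<url>", text)      # normalize URLs
    text = re.sub(r"@\w+", "<user>", text)             # normalize usernames
    if strip_hash:                                     # debatable: "#nlp" -> "nlp"
        text = re.sub(r"#(\w+)", r"\1", text)
    text = re.sub(r"\s+", " ", text).strip()           # normalize whitespace
    return text                                        # punctuation is kept on purpose

print(normalize_tweet("Check  this @EPFL course: https://example.org #NLP"))
# -> 'check this <user> course: <url> #nlp'
```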