Why is t_4 kept in this exercise? In the lecture notes on “limited scope for syntactic dependencies”, we have P(t_i | t_1, ..., t_{i-1}) = P(t_i | t_{i-k}, ..., t_{i-1}). With k=1 and i=3 this gives just P(t_3 | t_2). There is no i+1 anywhere, so how can t_4 appear?
The "limited scope for syntactic dependencies (1 neighbor)" hypothesis is valid in both directions (left or right, whether you consider a word sequence or a character sequence). So the k neighbors on either side, left or right, can be considered here.
> The "limited scope for syntactic dependencies (1 neighbor)" hypothesis is valid in both directions
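The reason is that the original objective is to maximize the joint probability. In that joint, t_3 appears in two transition factors, P(t_3 | t_2) and P(t_4 | t_3), so the best choice of t_3 depends on t_4 as well as on t_2.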
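To make this concrete, here is a sketch of the factorization for k=1, using the notation from the question. The exercise itself is not quoted in this thread, so treating P(t_1) as a separate initial term is an assumption, and any emission terms such as P(w_3 | t_3) would simply be added to the list of factors containing t_3.

```latex
\begin{align*}
P(t_1,\dots,t_n)
  &= \prod_{i=1}^{n} P(t_i \mid t_1,\dots,t_{i-1})
   = P(t_1)\prod_{i=2}^{n} P(t_i \mid t_{i-1})
   && \text{(hypothesis with } k=1\text{)} \\
\text{factors containing } t_3
  &:\quad P(t_3 \mid t_2)\,P(t_4 \mid t_3)
\end{align*}
```

The second factor comes from setting i=4 in the hypothesis, not i=3, which is why no "i+1" is needed for t_4 to show up.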
To convince yourself: either start again from the original objective, or apply the Viterbi algorithm (and understand why we need the backward reconstruction phase at its end -- same reason).
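As an illustration of the Viterbi suggestion, here is a minimal sketch on a toy two-tag model. All tags, words, and probabilities below are made up for the example and are not taken from the exercise. It compares a greedy left-to-right tagger, which only looks at the previous tag, with Viterbi, which maximizes the joint probability and recovers the tags by backward reconstruction.

```python
# Toy model: every number here is hypothetical, chosen only for illustration.
TAGS = ["N", "V"]
INIT = {"N": 0.5, "V": 0.5}
TRANS = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.9, "V": 0.1}}
EMIT = {"N": {"a": 0.6, "b": 0.3, "c": 0.1},
        "V": {"a": 0.1, "b": 0.3, "c": 0.6}}
WORDS = ["a", "b", "c"]


def greedy(words):
    """Pick each tag using only the previous tag (left context only)."""
    path = [max(TAGS, key=lambda t: INIT[t] * EMIT[t][words[0]])]
    for w in words[1:]:
        prev = path[-1]
        path.append(max(TAGS, key=lambda t: TRANS[prev][t] * EMIT[t][w]))
    return path


def viterbi(words):
    """Return the tag sequence maximizing the joint probability, and that probability."""
    # delta[t]: best probability of any tag sequence ending in tag t so far.
    delta = {t: INIT[t] * EMIT[t][words[0]] for t in TAGS}
    backpointers = []
    for w in words[1:]:
        new_delta, ptr = {}, {}
        for t in TAGS:
            best_prev = max(TAGS, key=lambda s: delta[s] * TRANS[s][t])
            new_delta[t] = delta[best_prev] * TRANS[best_prev][t] * EMIT[t][w]
            ptr[t] = best_prev
        delta = new_delta
        backpointers.append(ptr)
    # Backward reconstruction: the last tag fixes the earlier ones via the pointers.
    last = max(TAGS, key=lambda t: delta[t])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


greedy_path = greedy(WORDS)
viterbi_path, viterbi_prob = viterbi(WORDS)
print("greedy :", greedy_path)
print("viterbi:", viterbi_path, viterbi_prob)
```

On this toy input the two taggers disagree on the second tag: greedy picks V because P(V | N) is the largest transition out of N, while Viterbi picks N because the third tag makes the factor P(t_3 | t_2) matter too. The second tag is only settled during the backward pass, once the best final tag is known, which is the same effect as t_4 influencing t_3 in the question.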
Yes, OK, I see. Thanks a lot!