Last year's Quiz 2 - 1st question

Last year's Quiz 2 - 1st question

par Beatriz Borges,
Number of replies: 5

The question reads:

Using K. Oflazer 1996 spelling error correction algorithm presented in class, the correction of some input string X with a threshold θ = 2 will visit at least...

I think the correct answer would be the 3 options bellow: 

  • all the strings in the lexicon that are at distance 1 from X 
  • all the strings in the lexicon that are at distance 2 from X 
  • all the strings in the lexicon that are at distance 3 from X

, because there is no "early stopping", and only after looking at "distance 3" strings can the algorithm know that any string derived from those would have distance > θ, thus ending its execution. Is this correct?


Thank you!

In reply to Beatriz Borges

Re: Last year's Quiz 2 - 1st question

par Jean-Cédric Chappelier,

No, not ALL the strings in the lexicon that are at distance 3 from X would be considered. There is indeed a kind of "early stopping" (whatever this means) with the cut-off. Try for instance X=yyymple; string "example" is in the lexicon, at distance 3 from X, but won't be considered as the algorithm will stop as soon as prefix-string "exa".
Makes sense?


In reply to Jean-Cédric Chappelier

Re: Last year's Quiz 2 - 1st question

par Valentin Oliver Loftsson,

So, are you saying that if no potential corrections are found having edit distance of \le \theta = 2 from X, the algorithm keeps searching until it finds one?

I'm not sure if I fully understand. Say we have the string "exa". Its edit distance to string X is 7  which is obviously why "exa" would not be considered either, right?

In reply to Valentin Oliver Loftsson

Re: Last year's Quiz 2 - 1st question

par Jean-Cédric Chappelier,
  1. ???
    Not at all. I'm not saying such a thing.

  2. Maybe you confuse D (distance) and D_c (cut-off). stoping or continuing is not about D, but about D_c.
    If slides are unclear, may I suggest you to have a look at some of the practical session solution code?
In reply to Jean-Cédric Chappelier

Re: Last year's Quiz 2 - 1st question

par Valentin Oliver Loftsson,
Ah, okay. So the stop/continue condition depends on the cut-off edit distance. The cut-off edit distance between "exa" and "yyymple" is 3 < \theta and that is why it stops at "exa"?

If I'm totally off track I might need to look at the code and find some more detailed explanations. Thanks for your answers professor.
In reply to Valentin Oliver Loftsson

Re: Last year's Quiz 2 - 1st question

par Jean-Cédric Chappelier,

> So the stop/continue condition depends on the cut-off edit distance. The cut-off edit distance between "exa" and "yyymple" is 3<θ and that is why it stops at "exa"?

yes.