CS-431: Handson 12 Question 6

Hi,

Looking through the solutions of Handson 12, I don't fully understand the solutions to Question 6.

In particular for 6.1: if it's important to find all the documents that are relevant for a given query, then, if we only care about how many of those relevant documents we retrieve, both systems are equivalent since they have the same recall. What is it that makes System 2 better here?
Additionally, for 6.2 I don't see why we can in general assume that System 1 is better without knowing more about our evaluation. For example, isn't it possible that the first evaluation does not correspond closely to what will be observed in the real world, in which case the second system would be better?

Thank you.

Re: Handson 12 Question 6

by Jean-Cédric Chappelier - Wednesday, 7 December 2022, 08:32

[Sorry, I missed that one...]
There is no such thing as "the recall" (as in "they have the same recall") of such systems: you can vary the recall of such systems pretty much the way you want by asking for more or fewer retrieved documents; and you can see it even in the hand-out question: each system was evaluated at 3 different recall points.
The question is much more: at what recall do you want to use the system?
In the first setup, you want to have high recall (not miss many relevant documents). And, of course, at that working point, you'd like the best precision (in order to save you time filtering garbage out). Thus system 2 fits your needs better.
In the second setup, you work at low recall (think of how many links, not even talking about pages ;-), you actually look at when googling the Web); and, again, you want the best precision there; thus system 1.
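
To make the "working point" idea concrete, here is a minimal sketch in Python. The two rankings `s1` and `s2` and the total of 4 relevant documents are made up for illustration (they are not the hand-out data), and the helper names are hypothetical. It computes the precision each system reaches at the smallest cutoff that achieves a given target recall.

```python
def precision_recall_at_k(ranking, total_relevant, k):
    """Precision and recall over the top-k results.

    ranking: list of booleans, best-ranked first (True = relevant document).
    """
    hits = sum(ranking[:k])
    return hits / k, hits / total_relevant


def precision_at_recall(ranking, total_relevant, target_recall):
    """Precision at the smallest cutoff k whose recall reaches target_recall.

    Returns None if the ranking never reaches that recall.
    """
    for k in range(1, len(ranking) + 1):
        precision, recall = precision_recall_at_k(ranking, total_relevant, k)
        if recall >= target_recall:
            return precision
    return None


# Hypothetical rankings (assumption, not the hand-out data): 4 relevant documents in total.
TOTAL_RELEVANT = 4
s1 = [True, True, False, False, False, False, True, False, False, True]
s2 = [False, True, True, False, True, True, False, False, False, False]

for target in (0.25, 1.0):
    p1 = precision_at_recall(s1, TOTAL_RELEVANT, target)
    p2 = precision_at_recall(s2, TOTAL_RELEVANT, target)
    print(f"target recall {target}: S1 precision {p1:.2f}, S2 precision {p2:.2f}")
```

With these made-up rankings, `s1` has the better precision at the low-recall working point and `s2` has the better precision at the high-recall one, so which system is "better" depends on where you intend to operate it.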

Now regarding the evaluation: well, that's as usual (see week 2): if your evaluation is inappropriate or was poorly designed, well... there is not much you can do about it (except redo a better evaluation ;-) )

Makes sense?

Re: Handson 12 Question 6

by Lucas Dodgson - Wednesday, 7 December 2022, 09:20

It does. Thank you very much for the clarification!