Inconsistency in the use of the Stop method compared with previous information

by Bastien Wermeille
Number of replies: 4

Hello,

I asked during the first part of the project if we needed to be able to stop a gossiper and restart it later with the run command.

The answer at that time was no. However, one of the hw2 tests, TestGossiper_Crash_Recovery, seems to rely on this behavior.

So I assume we have to change our implementation to support this feature and be able to restart the gossiper on the exact same address. Is that correct?

Best regards,

Bastien Wermeille

In reply to Bastien Wermeille

Re: Inconsistency in the use of the Stop method compared with previous information

by Kirill Nikitin

Hi,

As you can see in TestGossiper_Crash_Recovery (we updated the test in the latest push to make it clearer), the only thing it checks is that your gossiper can re-index previously shared and downloaded files. This state survival requirement is specified on page 6 of the handout.

Other properties, such as the address or the routing table, do not need to persist. So you can think of it as a completely new gossiper that needs to re-index the state of shared files.

Kirill

In reply to Kirill Nikitin

Re: Inconsistency in the use of the Stop method compared with previous information

by Elie Daou

Does this include crash recovery of half-downloaded files? If a gossiper crashes mid-download with only some of a file's chunks, are we expected to recover from that?

In reply to Elie Daou

Re: Inconsistency in the use of the Stop method compared with previous information

by Kirill Nikitin

If your gossiper crashes mid-download, it should have the file indexed (so it is available for search) and know which chunks are present when it restarts. Essentially, this means that you should index the file and save the information about it in persistent storage as soon as you obtain its metafile, because at that stage you have all the necessary knowledge. That being said, I do not think that we have any tests that specifically check this behavior.
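
To make this concrete, here is a minimal Go sketch of the idea. It is not the reference implementation, and every name in it (FileRecord, IndexedFile, SaveRecord, SaveChunk, Reindex, SimulateCrash) is hypothetical. It also assumes that the metafile is the concatenation of 32-byte SHA-256 chunk hashes, that the metahash is the SHA-256 of the metafile, and that chunks are stored as one file per chunk hash; adapt these to whatever the handout specifies.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"os"
	"path/filepath"
)

// FileRecord is a hypothetical index entry, written to disk as soon as the
// metafile is known (before any chunk has been downloaded).
type FileRecord struct {
	Name     string   `json:"name"`
	MetaHash string   `json:"metahash"`
	Chunks   []string `json:"chunks"` // hex-encoded chunk hashes, in file order
}

// IndexedFile is a record plus, for each chunk, whether its data is on disk.
type IndexedFile struct {
	FileRecord
	Present []bool
}

func hashHex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// NewRecord splits the metafile into chunk hashes (assumed layout, see above).
func NewRecord(name string, metafile []byte) FileRecord {
	rec := FileRecord{Name: name, MetaHash: hashHex(metafile)}
	for i := 0; i+sha256.Size <= len(metafile); i += sha256.Size {
		rec.Chunks = append(rec.Chunks, hex.EncodeToString(metafile[i:i+sha256.Size]))
	}
	return rec
}

// SaveRecord persists the record as <dir>/<metahash>.json.
func SaveRecord(dir string, rec FileRecord) error {
	data, err := json.Marshal(rec)
	if err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, rec.MetaHash+".json"), data, 0o644)
}

// SaveChunk stores a downloaded chunk as <dir>/<chunk hash>.chunk.
func SaveChunk(dir string, chunk []byte) error {
	return os.WriteFile(filepath.Join(dir, hashHex(chunk)+".chunk"), chunk, 0o644)
}

// Reindex is what a freshly started gossiper would call: it reloads every
// saved record and checks which of its chunks exist on disk.
func Reindex(dir string) (map[string]IndexedFile, error) {
	paths, err := filepath.Glob(filepath.Join(dir, "*.json"))
	if err != nil {
		return nil, err
	}
	index := make(map[string]IndexedFile)
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			return nil, err
		}
		var rec FileRecord
		if err := json.Unmarshal(data, &rec); err != nil {
			return nil, err
		}
		present := make([]bool, len(rec.Chunks))
		for i, h := range rec.Chunks {
			_, statErr := os.Stat(filepath.Join(dir, h+".chunk"))
			present[i] = statErr == nil
		}
		index[rec.MetaHash] = IndexedFile{FileRecord: rec, Present: present}
	}
	return index, nil
}

// SimulateCrash indexes a three-chunk file, stores only its first chunk,
// then re-indexes from disk as a restarted gossiper would.
func SimulateCrash() ([]bool, error) {
	dir, err := os.MkdirTemp("", "gossiper-state")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	chunks := [][]byte{[]byte("chunk-0"), []byte("chunk-1"), []byte("chunk-2")}
	var metafile []byte
	for _, c := range chunks {
		sum := sha256.Sum256(c)
		metafile = append(metafile, sum[:]...)
	}
	rec := NewRecord("demo.txt", metafile)
	if err := SaveRecord(dir, rec); err != nil {
		return nil, err
	}
	if err := SaveChunk(dir, chunks[0]); err != nil {
		return nil, err
	}
	// The "crash" happens here: nothing in memory is reused below.
	index, err := Reindex(dir)
	if err != nil {
		return nil, err
	}
	return index[rec.MetaHash].Present, nil
}
```

Calling SimulateCrash() from a main function or a test returns the per-chunk flags for the half-downloaded file: the first chunk is present and the other two are missing. The file is still indexed and searchable by its metahash, because the record was written before any chunk arrived.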

As for the download itself, you do not need to resume it after the crash. Treat it as a one-time command.

Let me know if anything is still unclear.

Kirill