
[Mldonkey-users] RFC: QueryZonesMd4 / QueryZonesMd4Reply


From: Pierre Etchemaite
Subject: [Mldonkey-users] RFC: QueryZonesMd4 / QueryZonesMd4Reply
Date: Thu, 27 Feb 2003 03:18:50 +0100

        Hi,

I have an idea for a protocol extension (I know, it may not be the best
moment ;) )

Since it seems that there's some data corruption going on on the network,
and that currently a single flipped bit is enough to spoil a whole chunk
(9500kB) of data, recovering from data corruption is not very graceful :(

The proposed extension tries to be simple, and to match the already
existing protocol mechanisms (mainly QueryChunkMd4 and QueryChunkMd4Reply).
It requires two new packet types, that I'll call QueryZonesMd4 and
QueryZonesMd4Reply.

QueryZonesMd4: 
        packet ID (not chosen yet)
        data: file MD4, offset of the beginning of the chunk (are there
          chunk numbers in the current protocol ?)

QueryZonesMd4Reply:
        packet ID (also to be chosen)
        data: file MD4, offset of the beginning of the chunk, array of 53
          MD4 hashes

The hashes are the MD4 hashes of the zones (180kB) contained in the chunk.
9500/180 = 52.777..., so a chunk holds 53 zones and the last one is smaller
than 180kB. More flexible choices are possible, but I'm not sure it's worth
the added complexity.
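
A minimal sketch of the zone arithmetic above, assuming "kB" means 1024
bytes (so a chunk is 9,728,000 bytes and a zone 184,320 bytes). The names
CHUNK_SIZE, ZONE_SIZE and zone_ranges are made up for illustration and are
not MLdonkey identifiers.

```python
# Assumption: "kB" in the proposal means 1024 bytes.
CHUNK_SIZE = 9500 * 1024   # 9,728,000 bytes
ZONE_SIZE = 180 * 1024     # 184,320 bytes


def zone_ranges(chunk_len=CHUNK_SIZE, zone_size=ZONE_SIZE):
    """Return (offset, length) pairs, relative to the chunk start, one per zone."""
    return [(start, min(zone_size, chunk_len - start))
            for start in range(0, chunk_len, zone_size)]


zones = zone_ranges()
print(len(zones))   # number of zones in a full chunk
print(zones[-1])    # offset and length of the short last zone
```

Under that assumption a full chunk has 53 zones, and the last one is
143,360 bytes (140kB). Passing a smaller chunk_len covers the final chunk of
a file, which is usually shorter than 9500kB.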

As you can see, QueryZonesMd4 should be small, and QueryZonesMd4Reply under
1kB (53 hashes of 16 bytes each is 848 bytes, plus the file MD4 and the
offset). Well worth the effort to avoid redownloading 9500kB!
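
To check the size claim, here is one possible wire layout, only a sketch.
Everything beyond "file MD4, offset, array of hashes" is an assumption: the
one-byte packet ID, the 32-bit little-endian offset, and the function names
are all hypothetical. The packet ID itself is left as a placeholder since
none has been chosen.

```python
import struct

MD4_LEN = 16                      # an MD4 digest is 16 bytes
HEADER = struct.Struct("<B16sI")  # assumed: packet ID, file MD4, 32-bit LE offset
PLACEHOLDER_ID = 0x00             # hypothetical; the proposal assigns no packet ID


def encode_query(packet_id, file_md4, chunk_offset):
    """Build a QueryZonesMd4 payload."""
    if len(file_md4) != MD4_LEN:
        raise ValueError("file MD4 must be 16 bytes")
    return HEADER.pack(packet_id, file_md4, chunk_offset)


def encode_reply(packet_id, file_md4, chunk_offset, zone_hashes):
    """Build a QueryZonesMd4Reply payload: header followed by the zone hashes."""
    if any(len(h) != MD4_LEN for h in zone_hashes):
        raise ValueError("every zone hash must be 16 bytes")
    return encode_query(packet_id, file_md4, chunk_offset) + b"".join(zone_hashes)


def decode_reply(payload):
    """Split a reply payload back into (packet_id, file_md4, offset, hashes)."""
    packet_id, file_md4, chunk_offset = HEADER.unpack_from(payload)
    body = payload[HEADER.size:]
    if len(body) % MD4_LEN:
        raise ValueError("truncated hash array")
    hashes = [body[i:i + MD4_LEN] for i in range(0, len(body), MD4_LEN)]
    return packet_id, file_md4, chunk_offset, hashes


dummy_hashes = [bytes([i]) * MD4_LEN for i in range(53)]
reply = encode_reply(PLACEHOLDER_ID, b"\x11" * MD4_LEN, 9728000, dummy_hashes)
print(len(encode_query(PLACEHOLDER_ID, b"\x11" * MD4_LEN, 9728000)), len(reply))
```

With this layout the query is 21 bytes and a full reply is 869 bytes, which
stays under 1kB. A 32-bit offset limits files to 4GB, so a real
implementation would have to follow whatever the existing packets do.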


Example of implementation support, in MLdonkey:

* add a new chunk state:
    WaitingForSubhashes of int
  the parameter is the time when the chunk entered that state (for timeout)

* when validating a chunk, if the hash is wrong, the chunk is switched to the
    "WaitingForSubhashes of last_time" state

* (chunks in that new state should not be chosen or written to. When
   storing chunk states to files, do as if the state was AbsentVerified ?
   Or better, save the state)

* when selecting a chunk from a client (find_client_bloc), look for chunks
    in WaitingForSubhashes state:
    if one is found, but has timed out (let's say, stayed more than 15 mins
    in that state), change its state to AbsentVerified;
    otherwise, if the client has the chunk, and was not already asked
    about it, send a QueryZonesMd4 to the client and tag the client as
    already queried.
    Then select from chunks in other states.

    As you can see, several QueryZonesMd4 can be sent to different sources.
    Since sources may not know about that extension, we need to cope with
    peers that never answer.

* when receiving a QueryZonesMd4 from another client, check that we have the
    file, then that we have the requested chunk, hash all the zones of the
    chunk and send the results in a QueryZonesMd4Reply packet.
    It may be necessary to implement some throttling to avoid DoS attacks.

* when receiving a QueryZonesMd4Reply from another client, check that we
    have the file, that the chunk is in the WaitingForSubhashes state, then
    compute the zone hashes locally, compare them with the received ones,
    and invalidate the zones whose hashes differ;
    the chunk state is then switched to PartialTemp.
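
The hashing, comparison and timeout steps in the list above could look
roughly like this. It is a sketch, not MLdonkey code: all function names are
hypothetical, the digest is passed in as a parameter, and the demo uses MD5
purely as a 16-byte stand-in because hashlib.new("md4") is not available on
every OpenSSL build.

```python
import hashlib
import time

ZONE_SIZE = 180 * 1024        # assumes "kB" means 1024 bytes
SUBHASH_TIMEOUT = 15 * 60     # seconds; the "15 mins" suggested above


def zone_hashes(chunk, digest):
    """Responder side: hash every zone of a chunk we hold."""
    return [digest(chunk[i:i + ZONE_SIZE])
            for i in range(0, len(chunk), ZONE_SIZE)]


def faulty_zones(chunk, peer_hashes, digest):
    """Requester side: indices of zones whose local hash differs from the peer's."""
    local = zone_hashes(chunk, digest)
    if len(local) != len(peer_hashes):
        raise ValueError("zone count mismatch, ignore this reply")
    return [i for i, (mine, theirs) in enumerate(zip(local, peer_hashes))
            if mine != theirs]


def timed_out(entered_at, now=None):
    """True once a chunk has waited for subhashes longer than the timeout."""
    now = time.time() if now is None else now
    return now - entered_at > SUBHASH_TIMEOUT


def stand_in_digest(data):
    # Stand-in only: the proposal uses MD4, MD5 is used here so the demo runs.
    return hashlib.md5(data).digest()


# Demo: flip a single bit in the second zone of a small three-and-a-bit-zone chunk.
good = bytes(3 * ZONE_SIZE + 100)
corrupt = bytearray(good)
corrupt[ZONE_SIZE + 5] ^= 0x01
peer = zone_hashes(good, stand_in_digest)
print(faulty_zones(bytes(corrupt), peer, stand_in_digest))
```

Only the zones returned by faulty_zones need to be downloaded again; the
rest of the chunk can be kept, which is the whole point of the extension.
Note that the result is only as trustworthy as the peer that answered, so
the chunk MD4 still has to be checked once the zones are refetched.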


Zone hashes could also be requested proactively, to detect broken peers; but
since that adds overhead in the common case (no corruption), maybe it should
only be triggered under bad conditions (several broken chunks already
received).

Comments ?

Pierre.



