Why you should know about DHT and Kademlia ?
Hi ! This post is a short introduction to the Kademlia Distributed Hash Table (DHT) and why it matters so much.
A hash table is a key/value data store: given a hash function it computes from the given data an index to an array of buckets/slots where the data is stored. The hash table is a fundamental building block used in many applications where you need to quickly find a data given a key.
The Distributed Hash Table enables both to partition and to replicate the data store amongst several nodes. This is a key building block used in many decentralized applications: BitTorrent, Ethereum, IPFS just to name a few.
Bittorrent/IPFS example: a file is decomposed in several blocks of a fixed size, for each block you apply a hash function (such as SHA) that returns for each a signature, then you ask to the peer to peer network: Who is having this block given this signature ? You receive a list of peers, then you try to contact one and fetch from it the given block. To verify that the data is correct you apply the hash function.
For Bitcoin, Ethereum and many other cryptocurrencies, it is very similar you ask for a given block through its hash, you receive it and you recheck the data validity with the hash function (or check the signature).
So, given a space that corresponds to the length of the key, each peer in the network takes care of a part of this space. Due to the fact that peers are coming and leaving any time, we must have a replication in order to ensure the liveness of the system: we need to have at least a peer that is storing a unique key to be safe.
A peer has an id that should be random: a hash of some data, a public key, a signature... In the original paper, sha1 is used so the id is 160 bits.
Kademlia use a XOR metric distance which is the bitwise XOR performed on the ids. This is a nice idea because it is fast to compute, it is symmetric (x xor y = y xor x).
The key space is decomposed using k-buckets where the peer stores at most k neighbors. Those buckets are organized using the previous XOR metric distance. All entries inside the bucket are sorted by the time of the last communication.
Example for an update: a peer receive an update from another, it updates the corresponding bucket, if the peer already exists it is moved on the most recent entry, if it is not full it is added, if it is full we try to contact the least recent entry, if it is not reachable (time-out) the new peer replaces it, otherwise we do nothing and we keep the long lived one.
This is a nice trick to keep peers that are long lived, it acts as a DoS protection if someone tries to force updating of all the entries. Such entries are representing a routing table. Each entry of the bucket contains the id of the peer, its IP with the associated port.
Protocol
The core protocol used by Kademlia is the following :
- Ping: a peer sends a ping and the target peer should send a pong as response and echo the message sent. It is used to say "Hey buddy ! I am alive !". Both peers are updating the associated bucket for each other, it is used both for the discovery and as an update for the last communication.
- Store: the peer sends the key and the associated data, the target peer stores it (it sends an error message if the data is empty or too big for instance).
- Find_Node: the peer sends a key, the target peer returns the list of entries that are the closest (XOR distance) to the given key.
- Find_Value: the peer sends a key, the target peer returns the value if present, otherwise it returns the result of Find_Node.
Now, we can describe the Node lookup, looking for the "nearest" neighbors of a given key. We search for the non empty bucket closest to the key. If there are less entries than a given bound, we extend to other buckets the selection. Otherwise, we pickup a given amount of the entries of this bucket and we send Find_Node (or Find_Value), we receive from those entries potential new entries (we had them to the list of entries to be asked if we not yet asked already) and we update the associated buckets. The algorithm is iterated until there is no new entry, or new entry near enough the key. If no Node lookup has been performed after a given time on a given bucket, then a random key is generated that matches the given bucket and a Node lookup is performed: this allows to refresh the DHT.
In a similar way to the previous algorithm, we have :
- a Store with Replication, we lookup for the "nearest" neighbors and we perform a Store for each of them. After a given time, the key/value stored is sent to ensure the liveness (in a similar way as we refresh through Node lookup).
- a Find Value, we call Find_Value on the "nearest" neighbors until the value is found or a list of neighbors is returned if it is not found. Additionally, a Store of the key/data must be performed by the caller to the closest node seen which did not return the value (for the replication).
It is just a basic overview of it, but it should give you a good idea of how it works.
If you want to dig further, I recommend to read the following :
- http://xlattice.sourceforge.net/components/protocol/kademlia/specs.html
- Kademlia Paper: https://pdos.csail.mit.edu/%7Epetar/papers/maymounkov-kademlia-lncs.pdf
Implementations:
- C : https://github.com/jech/dht (It is short, clean and used in production code : transmission client and several applications)
- GO : https://github.com/nictuku/dht, https://github.com/prettymuchbryce/kademlia
- Python: https://github.com/bmuller/kademlia
- Java: https://github.com/atomashpolskiy/bt
@originalworks
The @OriginalWorks bot has determined this post by @boucaron to be original material and upvoted it!
To call @OriginalWorks, simply reply to any post with @originalworks or !originalworks in your message!
You got a 1.56% upvote from @upme requested by: @boucaron.
Send at least 1.5 SBD to @upme with a post link in the memo field to receive upvote next round.
To support our activity, please vote for my master @suggeelson, as a STEEM Witness
This post has received a 0.61 % upvote from @buildawhale thanks to: @boucaron. Send at least 1 SBD to @buildawhale with a post link in the memo field for a portion of the next vote.
To support our daily curation initiative, please vote on my owner, @themarkymark, as a Steem Witness