#2 - Indexer: blocks.py
A deep dive into Hivemind's design
I'll kick off the technical aspects of this series by focusing on the indexer component, which is where Hivemind's basic logic begins. This component is responsible for reading data from the blockchain as it is added, parsing it, and then populating the database based on the transactions in each block.
In short, it is an important part of how database state is maintained.
Python script on GitHub: hivemind/hive/indexer/blocks.py (master branch)
In the indexer, we have a number of modules working together. In this post, I will focus on blocks.py.
This module is a "blocks processor": it works through the transactions and operations in each block to trigger the relevant actions in other indexer modules that update the database. This is also where the hive_blocks table for the database is updated.
It uses a class named Blocks to host several methods for handling block-related data.
Process methods in this class
In class Blocks we have a number of methods involved in the processing of blocks.
head_num() is used to query the current head block number in the database.
head_date() is used to query the current head block date in the database.
There are three methods used in processing blocks. Two of them wrap the process: one handles a single block at a time, and the other handles an array of blocks. Both call the _process() method (expanded on below) to do the actual processing; the array version makes one call to _process() for each block.
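As a rough illustration of that wrapper relationship, here is a minimal sketch. Only _process() is named in this post; the wrapper names process and process_multi, the shape of the block dict, and the processed list are assumptions made for the example, not taken from the module.

```python
class Blocks:
    """Toy stand-in for the real class; only the call structure is modelled."""

    processed = []  # block numbers seen by _process(), kept for illustration

    @classmethod
    def process(cls, block):
        """Handle a single block (hypothetical wrapper name)."""
        return cls._process(block)

    @classmethod
    def process_multi(cls, blocks):
        """Handle a list of blocks by calling _process() once per block."""
        return [cls._process(block) for block in blocks]

    @classmethod
    def _process(cls, block):
        # The real method parses transactions; here we only record the number.
        cls.processed.append(block["num"])
        return block["num"]


Blocks.process({"num": 1})
Blocks.process_multi([{"num": 2}, {"num": 3}])
print(Blocks.processed)
```

Keeping the real work in one private method means the single-block and multi-block paths cannot drift apart.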
_process() loops through all the transactions in a block, and through all the operations within each transaction, to trigger the relevant methods.
That means it looks for post operations, account metadata updates, custom JSON operations and so on: all the operations that fall within Hivemind's scope. When one is found, the relevant class is invoked to update the database. For example, a custom JSON operation triggers a call to the CustomOp class to process it.
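The nested loop and dispatch can be sketched as follows. This is not the module's code: the function name process_block, the handlers dict, and the block layout (each operation as a [type, payload] pair) are assumptions chosen to keep the example self-contained.

```python
def process_block(block, handlers):
    """Walk every operation of every transaction and dispatch by op type.

    Returns the number of operations that had a registered handler.
    """
    handled = 0
    for tx in block["transactions"]:
        for op_type, op in tx["operations"]:
            handler = handlers.get(op_type)
            if handler is None:
                continue  # operation type is outside our scope, skip it
            handler(op)
            handled += 1
    return handled


# Hypothetical handler: collect custom JSON ops instead of writing to a DB.
json_ops = []
handlers = {"custom_json": json_ops.append}

block = {
    "transactions": [
        {"operations": [
            ["custom_json", {"id": "follow", "json": "[]"}],
            ["transfer", {"from": "a", "to": "b"}],
        ]},
        {"operations": [
            ["custom_json", {"id": "community", "json": "[]"}],
        ]},
    ]
}

count = process_block(block, handlers)
print(count, [op["id"] for op in json_ops])
```

In this sketch the transfer operation is ignored because no handler is registered for it, which mirrors the idea of only acting on operations that fall within scope.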
_verify_head() is used to ensure that hive and Steem have the same head block. To recover from forks, it removes blocks from hive by going backwards through recent blocks, until they're synced.
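To make the fork recovery idea concrete, here is a minimal sketch of walking backwards from the local head until it agrees with the chain. The function name, the (num, hash) tuples, and the chain_hash_at callable are all assumptions for illustration; the real method works against the database and the node.

```python
def rewind_to_common_block(local_blocks, chain_hash_at):
    """Pop local blocks from the tip until the tip hash matches the chain.

    local_blocks: list of (num, hash) tuples, oldest first (mutated in place).
    chain_hash_at: callable returning the chain's hash for a block number.
    Returns the popped block numbers, newest first.
    """
    popped = []
    while local_blocks:
        num, block_hash = local_blocks[-1]
        if chain_hash_at(num) == block_hash:
            break  # tips agree, nothing more to undo
        popped.append(num)
        local_blocks.pop()
    return popped


# Local copy diverged from the chain at block 3.
local = [(1, "a"), (2, "b"), (3, "x"), (4, "y")]
chain = {1: "a", 2: "b", 3: "c", 4: "d"}

popped = rewind_to_common_block(local, chain.get)
print(popped, local)
```

After the rewind, normal processing can resume from the last block both sides agree on.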
DB methods in this class
In class Blocks we also have methods that handle database-related operations.
_get() is used to retrieve a block from the database.
_push() is used to insert a row in the blocks table.
_pop() is used to remove blocks from the database and to roll back entries made in other tables by transactions within that block, such as posts and transfers.
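The three operations can be sketched with an in-memory SQLite database. The table names, columns, and the free functions push, get and pop are hypothetical stand-ins, not Hivemind's schema or API; the point is only that removing a block also removes the rows derived from it.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical, simplified schema for illustration only.
db.execute("CREATE TABLE blocks (num INTEGER PRIMARY KEY, hash TEXT)")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, block_num INTEGER)")


def push(num, block_hash):
    """Insert one row into the blocks table."""
    db.execute("INSERT INTO blocks (num, hash) VALUES (?, ?)", (num, block_hash))


def get(num):
    """Fetch a block row, or None if it is not stored."""
    return db.execute(
        "SELECT num, hash FROM blocks WHERE num = ?", (num,)
    ).fetchone()


def pop(num):
    """Remove block `num` and later, plus rows derived from those blocks."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("DELETE FROM posts WHERE block_num >= ?", (num,))
        db.execute("DELETE FROM blocks WHERE num >= ?", (num,))


push(1, "a")
push(2, "b")
db.execute("INSERT INTO posts (block_num) VALUES (1)")
db.execute("INSERT INTO posts (block_num) VALUES (2)")

pop(2)
print(get(1), get(2))
```

Deleting the dependent rows and the block inside one transaction keeps the database from ending up with posts that point at a block that no longer exists.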
That's it for this walkthrough of what I learned about how blocks.py works as part of the indexer component in Hivemind.
Studying this module has helped me understand how I will have to handle fork recovery for new tables that Native Ads will introduce to the database. I will need to implement ways to get all transactions affected by a block rollback and to remove them.
Most of the work can use existing logic. A little more code will need to be written to fully cover the ads implementation. For example, I will need to implement custom SQL logic to filter out ads from the ads table and remove them.
This has been an educational study for me!
Posts in this series
#1 - Overview and opportunities
#2 - Indexer: blocks.py
I am working on a new feature called Native Ads that may be added to Hivemind Communities in a future update.
For an overview of the Native Ads feature and how it will work, read this doc.
If you would like to take a look at the code, check out my fork of Hivemind on GitHub.