SteemSQL had to be redesigned to manage the awesome growth of Steemit.
- Introduction post
- SteemSQL - Update 1
- SteemSQL - Update 2 - Language detection
- SteemSQL - Update 3
- SteemSQL - Update 4
- SteemSQL - Update 5
- SteemSQL - Update 6
- SteemSQL - Update 7
- SteemSQL - Update 8
In my last witness report, I expressed my concern about SteemSQL's ability to ingest the growing mass of data that is generated every day.
Indeed, if you look at the following chart taken from my last daily stats, you will find that the number of transactions to be processed has tripled (+199%) on the last 30 days.
For posts and comments, this represents respectively an increase of 315% and 297%
And the number of upvotes has more than doubled with an increase of 112%
Gradually, the performances of data injection in the database have deteriorated. SteemSQL ended up struggling with the numerous updates all transactions generated. At some times, I found it with being late for more than 24 hours compared to the data available in the block chain (that's why I was unable to publish my daily stats for 2 days).
So, I decided to take the bull by the horns and performed a thorough analysis that led to several changes to the database and a complete redesign of the Database Injector.
1. Changes to the database.
These mainly concern the indexes of the database. I found that many indexes had been created to improve performances of some queries, but that finally these indexes were rarely used.
The indexes of the database have therefore been analyzed one by one, some have been deleted, others have been modified.
2. Changes to the Database Injector
The Database Injector is the process responsible for reading each block of the blockchain, and distributing the transactions in the different tables of the database.
I worked on the optimization of these processes by separating the injection of the transactions from the update of the impacted non-transactional tables (Accounts and Comments tables)
These processes now run in parallel thanks to an optimized multithreaded implementation of the different tasks.
Wow, thanks to this work, I have been able to reduce the overall processing time and got it divided by 3 !!
And I am persuaded that it is still possible to further improve it because I worked a bit in emergency mode in order to restore the database to a functional state as quickly as possible. I had to take into account that more and more people rely on SteemSQL for their process Or analysis.
Things are done and SteemSQL is now ready to face without any problem a doubling of the number of transactions. This is something I really wish for Steemit!
Oh ... and for those who missed this announcement ... in the middle of my optimization work, I found a bit of time to launch the "Happy Birthday" action on SteemitBoard ... that was just to think to something different while I was doing small pauses ... :P
Thanks for reading !
All payout from this post will be dedicated keep SteemSQL running. Thanks for your support.