Some struggle with Steem-Python
Using steem-python to access the Steem API can sometimes be painfully slow, especially when using the official Steemit RPC nodes.
The RPC nodes seem to be frequently under heavy load, which makes response times extremely unreliable. In many cases this only leads to slow results, but it can also end in incomplete or missed data. One example of this is streaming data from the blockchain with the function stream_comments().
Part of my goal with @dorabot was to enable interactions via Steemit. To make that happen, I needed a reliable way to process all comments posted to the blockchain. My initial choice was to use a function called stream_comments in the Steemd class.
def stream_comments(self, *args, **kwargs):
    """ Generator that yields posts when they come in

    To be used in a for loop that returns an instance of `Post()`.
    """
The stream_comments() function is built to listen to all the blocks in the blockchain and return a Post object for each post/comment. Very convenient, but returning a Post object means there is an additional API call for each one, and that makes it horrendously slow.
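One way to avoid the extra call is to work on raw blocks and pick out the comment operations directly. Below is a minimal sketch of that idea, not the code @dorabot runs. It assumes a raw block is a dictionary with a `transactions` list, where each transaction holds `operations` as `[type, data]` pairs. The helper name `comment_ops` and the sample block are made up for illustration.

```python
def comment_ops(block):
    """Return the data of every 'comment' operation in a raw block dict."""
    if not block:
        # A node may return nothing for a block that does not exist yet.
        return []
    found = []
    for tx in block.get("transactions", []):
        for op_type, op_data in tx.get("operations", []):
            if op_type == "comment":
                found.append(op_data)
    return found


# Hand-made block in the assumed layout, for illustration only.
sample_block = {
    "timestamp": "2017-10-20T12:00:00",
    "transactions": [
        {"operations": [["vote", {"voter": "alice", "author": "bob", "permlink": "hello"}]]},
        {"operations": [["comment", {"author": "bob", "permlink": "hello", "body": "Hi"}]]},
    ],
}

for comment in comment_ops(sample_block):
    print(comment["author"], comment["permlink"])
```

Because the operation data is already in the block, no second request is needed to read the author, permlink or body of a comment.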
When I was looking for a solution I stumbled upon this post made by @pibara.
He was having similar issues and made a pretty neat script to measure the performance of fetching raw blocks from the blockchain. The post above is old by now and doesn't include the latest bug fix. The following line needs to be moved outside of the while-loop for the script to function properly.
index = start
The script uses no fancy API calls and is an excellent way to measure the pure performance of fetching blocks from the blockchain.
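The general idea of such a measurement can be sketched as follows. This is not @pibara's script, only a rough illustration of timing plain block fetches. The function name `time_block_fetches` and the stand-in fetcher are hypothetical, and the fetcher is passed in as a parameter so the sketch runs without a network connection.

```python
import time


def time_block_fetches(get_block, start, count):
    """Fetch `count` consecutive blocks and return the seconds each call took."""
    durations = []
    index = start  # set once, before the loop, so the position is never reset
    while index < start + count:
        began = time.perf_counter()
        get_block(index)
        durations.append(time.perf_counter() - began)
        index += 1
    return durations


# Stand-in fetcher, for illustration only; it returns an empty fake block.
def fake_get_block(num):
    return {"block_num": num, "transactions": []}


timings = time_block_fetches(fake_get_block, start=1000, count=5)
print("slowest call: %.6f s" % max(timings))
```

With steem-python, the real fetcher would presumably be the get_block function of the Steemd class, passed in place of `fake_get_block`.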
Running the script from @pibara, it was sometimes slow to fetch data from the Steemit RPC nodes, but at least it could easily keep up. So it was clear that the stream_comments function should be avoided.
My next step was to use this script to compare the performance of different RPC nodes. Ever since I started using steem-python I have noticed that the response time can vary greatly. And as @followbtcnews (in cooperation with @crimsonclad) had just released a full node as part of the Minnow Support Project, I thought it would be a cool thing to compare. Please follow this link for the announcement from @followbtcnews.
I started the test with the default Steemit nodes. As this is the default, no specific configuration is needed. As soon as you create an object of the Steem class, like below, you will connect to one of the default nodes.
steem = Steem()
Steemit RPC Node
And below is the code to connect to a different node.
my_nodes = ['https://steemd.minnowsupportproject.org']
steem = Steem(my_nodes)
MSP's RPC Node
As you can see above, the results from the default nodes vary quite a bit while the server from @followbtcnews is very consistent. Don't focus too much on the "seconds behind" value. There seems to be some kind of delay in how the blocks are presented through the API, so being ~60 seconds behind is normal.
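A "seconds behind" figure can be thought of as the gap between the current time and the timestamp of the latest block received. The sketch below shows one way to compute such a gap. How the script actually derives its value is not shown here, so treat this as an assumption. It also assumes block timestamps are UTC strings of the form `2017-10-20T12:00:00`, and the function name `seconds_behind` is made up for illustration.

```python
from datetime import datetime, timezone


def seconds_behind(block_timestamp, now=None):
    """Seconds between a block's timestamp and `now`, both treated as UTC."""
    block_time = datetime.strptime(block_timestamp, "%Y-%m-%dT%H:%M:%S")
    block_time = block_time.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - block_time).total_seconds()


# Fixed "now" so the example is reproducible.
example_now = datetime(2017, 10, 20, 12, 1, 0, tzinfo=timezone.utc)
lag = seconds_behind("2017-10-20T12:00:00", now=example_now)
print(lag)  # 60.0
```

A value that stays roughly constant means the node keeps up with the chain. A value that keeps growing means it is falling behind.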
Seeing these results I will definitely be moving @dorabot away from the default RPC nodes.
Thank you for reading!
Stay tuned for future updates.
Please let me know if you have any questions.
And please ping me (@danielsaori) if you connect to Discord.
Proud member of #minnowsupportproject & #teamaustralia
Thank you @aggroed, @ausbitbank, @teamsteem,
@theprophet0, @someguy123, @canadian-coconut and @sirknight
Click HERE to learn more about Minnow Support Project.
Click HERE to connect to our Discord chat server.