Rocking with Steem-Python

in #steemdev, 7 years ago

Making Steem-Python more reliable


dorabot: Python code in action...

The official RPC nodes have been under heavy load recently. Because of that, it has paid off to swap nodes completely, or at least to implement some kind of failover mechanism.

I have now started to use a mix of the following nodes:
https://gtg.steem.house:8090
https://steemd.minnowsupportproject.org
https://rpc.steemliberator.com
https://steemd.privex.io
https://steemd.steemit.com
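
A mixed node list like this is typically consumed by rotating through it whenever the active node misbehaves. The following is a minimal sketch of that idea in plain Python, not steem-python's own implementation; the `NodeRotator` class and its attribute names are hypothetical and exist only for illustration.

```python
from itertools import cycle

# Node URLs taken from the list above.
NODES = [
    "https://gtg.steem.house:8090",
    "https://steemd.minnowsupportproject.org",
    "https://rpc.steemliberator.com",
    "https://steemd.privex.io",
    "https://steemd.steemit.com",
]


class NodeRotator:
    """Hypothetical helper: hands out nodes round-robin."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = cycle(nodes)
        self.current = next(self._nodes)

    def next_node(self):
        """Switch to the next node, wrapping around at the end of the list."""
        self.current = next(self._nodes)
        return self.current


rotator = NodeRotator(NODES)
print(rotator.current)      # first node in the list
print(rotator.next_node())  # second node in the list
```

Round-robin is the simplest policy; it does not remember which nodes failed recently, so a broken node will be retried once per full cycle.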

Although this has improved response times and reliability, it has not been perfect, and one problem in particular stood out. Please read along to understand the issue, how I fixed it, and the change I proposed to the steem-python library.

Much of my coding has gone into getting my bot, @dorabot, up and running, and it has moved along pretty smoothly. The issue I ran into, though, was reliably streaming data from the blockchain in order to pick up comments sent to the bot. For example, I have a function that picks a random winner from all users who upvoted a post or a comment. It is triggered by posting a reply that includes the two words "@dorabot" and "?winner". As you see below, any other text can be included as well.
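
To make the trigger concrete, here is a small sketch of how such a check and draw could look. It is not @dorabot's actual code: the function names, the case-insensitive matching, and the voter names are assumptions made for illustration.

```python
import random

TRIGGER_WORDS = ("@dorabot", "?winner")


def is_winner_request(body):
    """Return True when the comment body contains both trigger words."""
    text = body.lower()
    return all(word in text for word in TRIGGER_WORDS)


def pick_winner(voters, rng=random):
    """Pick one voter at random; return None when nobody has voted."""
    voters = list(voters)
    if not voters:
        return None
    return rng.choice(voters)


comment = "@dorabot Let me pick the ?winner :)"
if is_winner_request(comment):
    # Hypothetical voter names; a real bot would read them from the post's votes.
    print(pick_winner(["alice", "bob", "carol"]))
```

Because the check only looks for the two words anywhere in the body, any surrounding text is ignored.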

The steem-python library contains a number of different functions to stream comments or fetch blocks from the blockchain, each with different features or additional processing. In the background, though, many of them use the same API call, get_block, to actually fetch the data.

Although I had started to use several different RPC nodes and implemented failover in case of errors, I regularly ran into issues with the "get_block" API call. In my implementation I used the "get_blocks()" function, part of Steemd.

The issue was not with the API call itself, but rather with the combination of how steem-python handles API calls and how the "get_blocks()" function treats the returned data. Steem-python uses a library called "urllib3" to execute all API calls. This is handled in "http_client.py", part of "steembase".

The code in "http_client.py" has its own error checking: it will fail over to a redundant node, hold off further requests, and so on, to give the server a chance to recover from temporary issues. The issue I found was with responses where the server actually replied but returned a non-200 status code, such as 403 or 504.

The code below is a snippet of the exec() function, part of http_client.py. When an HTTP error such as a 403 is received from the server, it is only logged and nothing else is done. The response will be empty and is returned to the calling function.


if response.status not in tuple([*response.REDIRECT_STATUSES, 200]):
    logger.info('non 200 response:%s', response.status)
 
return self._return(
    response=response,
    args=args,
    return_with_args=return_with_args)
(from http_client.py - end of exec() function).

An empty response by itself is nothing bad, as long as it is dealt with properly, but the "get_blocks()" function in Steemd does nothing to handle one. No error is thrown; instead, it reruns exactly the same request. So if one RPC node is broken and constantly returns a 403 error, we will loop to infinity...
The code below shows this loop.


while missing:
    for block in self._get_blocks(missing):
        blocks[block['block_num']] = block

        available = set(blocks.keys())
        missing = required - available
(from steemd.py - part of the get_blocks() function).
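
One way to see why the loop never ends, and how a guard would stop it, is to rebuild it with a stub in place of the node. This is a sketch rather than the library's code: `get_blocks_bounded`, the `fetch` callable, and the `max_attempts` limit are hypothetical names introduced here.

```python
def get_blocks_bounded(fetch, block_nums, max_attempts=5):
    """Fetch the requested blocks, giving up after max_attempts empty rounds.

    `fetch` is a hypothetical callable that takes a set of block numbers
    and returns an iterable of block dicts (possibly empty).
    """
    required = set(block_nums)
    blocks = {}
    missing = set(required)
    attempts = 0

    while missing:
        before = len(blocks)
        for block in fetch(missing):
            blocks[block["block_num"]] = block
        missing = required - set(blocks)

        if len(blocks) == before:
            # No progress this round: count it instead of spinning forever.
            attempts += 1
            if attempts >= max_attempts:
                raise RuntimeError("no blocks received for %s" % sorted(missing))
        else:
            attempts = 0

    return [blocks[n] for n in block_nums]


def broken_node(missing):
    """Simulates a node that always answers with an empty response."""
    return []


try:
    get_blocks_bounded(broken_node, [16269929, 16269930])
except RuntimeError as err:
    print(err)
```

Without the attempt counter, `missing` would never shrink against `broken_node` and the `while` loop would run forever, which is the behaviour described above.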

With the few simple lines in the following snippet, you can test this behaviour on your own. Run it from the Python command line.
I supply two nodes below. The first one is a non-existent GitHub link, which will always return a 404 error. DING! Perfect for this test. :) I'm using "steem.hostname" to check the active node. As you can see, I have to cancel the loop, as it would otherwise keep on running forever.


>>> from steem import Steem
>>> my_nodes = ['https://github.com/logddin', 'https://steemd.minnowsupportproject.org']
>>> steem = Steem(my_nodes)
>>> blocks = steem.get_blocks_range(16269929,16269930)
... (###output truncated)
^CNon 200: github.com (###stopped the loop with Ctrl+C)
.... (###output truncated)
>>> steem.hostname
'github.com'
>>>

Below is an example using the get_account() function. Here we don't risk getting stuck in an infinite loop; instead, the empty response causes the function to immediately throw the error seen below. If you have used the steem-python library, you have probably seen a number of these. This is not really an issue on its own, as some simple error checking with "try:/except:" statements will easily catch it.


>>> steem.get_account('dorabot')
Traceback (most recent call last):
... (###output truncated)
TypeError: 'NoneType' object is not iterable
>>>
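
A try/except wrapper for this case could look roughly like the sketch below. The helper `call_or_none` and the stand-in `fake_get_account` are hypothetical; the stand-in only reproduces the TypeError from the session above so the example runs without a network connection.

```python
def call_or_none(func, *args, **kwargs):
    """Run func and return None instead of raising on an empty response.

    The session above shows an empty response surfacing as a TypeError,
    so that is the only exception caught here.
    """
    try:
        return func(*args, **kwargs)
    except TypeError as err:
        print("empty response, skipping: %s" % err)
        return None


def fake_get_account(name):
    """Stand-in for steem.get_account() hitting a broken node."""
    empty_response = None
    return dict(empty_response)  # raises TypeError, as in the traceback


account = call_or_none(fake_get_account, "dorabot")
print(account)
```

Catching TypeError this broadly can also hide genuine programming mistakes, so in a real bot it is worth logging the error and keeping the wrapped call as small as possible.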

The code below shows the modifications I have made to the exec() function in http_client.py. I copied the logic from the error handling done earlier in the exec() function. The original code also had a bug: the second if-statement below was written as an elif-statement and hence would never be executed. That bug would cause another infinite loop if all the RPC nodes malfunctioned.


if response.status not in tuple([*response.REDIRECT_STATUSES, 200]):
	logger.info('non 200 response:%s', response.status)
	# try switching nodes before giving up ### Added
	if _ret_cnt > 2: ### Added
		time.sleep(5 * _ret_cnt)  ### Added
	if _ret_cnt > 10: ### Added
		return self._return(response=response.status) ### Added
	self.next_node() ### Added
	return self.exec(name, *args, return_with_args=return_with_args, _ret_cnt=_ret_cnt + 1) ### Added

return self._return(
	response=response,
	args=args,
	return_with_args=return_with_args)
(from http_client.py - with my modifications).
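
The retry logic can be tried in isolation with a toy client that needs neither the network nor steem-python. This is a simplified model, not the patched library: `FakeClient`, its `statuses` table, and the injectable `sleep` are assumptions, it treats only 200 as success (the real check also accepts redirect statuses), and it returns None when giving up where the modified code returns the status.

```python
import time


class FakeClient:
    """Toy stand-in for the HTTP client; only the retry logic is modelled."""

    def __init__(self, nodes, statuses, sleep=time.sleep):
        self.nodes = nodes        # list of node names
        self.statuses = statuses  # hypothetical: node name -> HTTP status returned
        self.index = 0
        self.sleep = sleep        # injectable so tests need not wait

    @property
    def hostname(self):
        return self.nodes[self.index]

    def next_node(self):
        self.index = (self.index + 1) % len(self.nodes)

    def exec(self, name, _ret_cnt=0):
        status = self.statuses[self.hostname]
        if status != 200:
            # Same thresholds as the modified exec(): back off after a few
            # retries, give up after ten, otherwise try the next node.
            if _ret_cnt > 2:
                self.sleep(5 * _ret_cnt)
            if _ret_cnt > 10:
                return None
            self.next_node()
            return self.exec(name, _ret_cnt=_ret_cnt + 1)
        return "%s answered by %s" % (name, self.hostname)


client = FakeClient(
    ["github.com", "steemd.minnowsupportproject.org"],
    {"github.com": 404, "steemd.minnowsupportproject.org": 200},
)
print(client.exec("get_block"))
print(client.hostname)
```

Because each `if` is evaluated independently, the back-off and the give-up check can both fire on the same retry, which is the point of using two if-statements instead of an if/elif pair.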

With these modifications, we can re-run the same test done above with the get_blocks_range() function.

The script starts by sending the request to github.com, but as a 404 error is received, it quickly fails over to steemd.minnowsupportproject.org.
(Thanks to @followbtcnews & @crimsonclad for hosting this RPC node as part of MSP!!!)


>>> from steem import Steem
>>> my_nodes = ['https://github.com/logddin', 'https://steemd.minnowsupportproject.org']
>>> steem = Steem(my_nodes)
>>> steem.hostname
'github.com'
>>> blocks = steem.get_blocks_range(16269929,16269930)
Non 200: github.com ### Print statement added for troubleshooting
>>> steem.hostname
'steemd.minnowsupportproject.org'
>>>

Since implementing this, I have had no more issues with streaming blocks from the blockchain. And as a bonus, since fewer empty responses are returned, my logs show that I deal with far fewer exceptions.

I have submitted two pull requests on GitHub. Let's see what the reviewers say and whether something is wrong with my logic.
https://github.com/steemit/steem-python/pulls

Thank you for reading!
Stay tuned for future updates.

Please let me know if you have any questions.
And please ping me (@danielsaori) if you connect to Discord.

Click HERE to connect to MSP's Discord server.



@dorabot Let me pick the ?winner :)

The winner is: wandrnrose7!!!

Congratulations @wandrnrose7!! 😀
But I’m afraid there is no prize... Only the honor of being picked by @dorabot 😉

hugs always a pleasure, dear. This stuff is way over my head, lol

