Improve Performance Using Asynchronous Design



Image Credit: Pixabay.com

I noticed for some time that many of my online Steemit tools were slow to return results, especially when the result set contains many items. For example, more than 100 accounts delegate Steem Power to me, and this online tool took a while to list the delegators.

I dug into the code and finally found out that the Converter from steem-python is slow.

from steem import Steem
from steem.converter import Converter
from nodes import steem_nodes  # list of RPC node URLs

steem = Steem(nodes=steem_nodes)
converter = Converter(steemd_instance=steem)
for row in rows:  # one conversion per row of the result set
    r.append({"sp": converter.vests_to_sp(row["vests"])})

converter.vests_to_sp() is called for every single row in the data set, and it is time-consuming. Looking into converter.py:

def steem_per_mvests(self):
    info = self.steemd.get_dynamic_global_properties()
    return (Amount(info["total_vesting_fund_steem"]).amount /
            (Amount(info["total_vesting_shares"]).amount / 1e6))

def vests_to_sp(self, vests):
    return vests / 1e6 * self.steem_per_mvests()

We can see why steem_per_mvests is time-consuming: it fetches data from the Steem blockchain through the steemd object, so every call is an RPC round-trip.
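To see the cost, here is a quick timing check (a minimal sketch; it assumes the same nodes.py module with a steem_nodes list used in the snippets below):

import time
from steem import Steem
from steem.converter import Converter
from nodes import steem_nodes  # assumed: list of RPC node URLs

converter = Converter(steemd_instance=Steem(nodes=steem_nodes))

start = time.time()
ratio = converter.steem_per_mvests()  # one RPC round-trip per call
print("steem_per_mvests = %f, took %.2fs" % (ratio, time.time() - start))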

To improve performance, we can cache the value of steem_per_mvests() for, say, an hour. Here is a cached version of vests_to_sp that converts VESTS to Steem Power:

import os

def file_get_contents(filename):
    with open(filename) as f:
        return f.read()

def vests_to_sp(vests):
    # Fallback ratio, used when the cache file is missing or unreadable.
    steem_per_mvests = 489.85031585637665
    fname = "cache/steem_per_mvests.txt"
    try:
        if os.path.isfile(fname):
            x = file_get_contents(fname).strip()
            if len(x) > 1:
                x = float(x)
                if x > 0:
                    steem_per_mvests = x
    except (OSError, ValueError):
        pass  # keep the fallback ratio on any read/parse error
    return vests / 1e6 * steem_per_mvests
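With this cached version in place, the loop from the beginning of the post no longer touches the blockchain at all (same hypothetical rows and r as above):

for row in rows:
    r.append({"sp": vests_to_sp(row["vests"])})  # pure local arithmetic now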

The next step is to write a script, e.g. update_steem_per_mvests.py, that fetches the value from the Steem blockchain and saves it locally to a text file, e.g. cache/steem_per_mvests.txt:

from steem.converter import Converter
from steem import Steem
from nodes import steem_nodes  # list of RPC node URLs

def file_put_contents(filename, data):
    with open(filename, 'w') as f:
        f.write(data)

steem = Steem(nodes=steem_nodes)
converter = Converter(steemd_instance=steem)

# Fetch the current STEEM-per-MVESTS ratio and cache it on disk.
x = converter.steem_per_mvests()
file_put_contents('cache/steem_per_mvests.txt', str(x))
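One caveat with a file-based cache: a reader can hit the file mid-write and see a partial value (the try/except above masks this by falling back to the default). A safer writer, sketched here rather than taken from the original post, writes to a temporary file first and renames it, since os.replace is atomic:

import os

def file_put_contents_atomic(filename, data):
    tmp = filename + '.tmp'
    with open(tmp, 'w') as f:
        f.write(data)
    os.replace(tmp, filename)  # atomic rename; readers never see a partial file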

You can then schedule it in crontab, e.g.:

@hourly python3 update_steem_per_mvests.py > /dev/null 2>&1

Getting data from the blockchain is slow, and we should avoid it as much as we can. For exchange rates and other values that don't need 100% real-time accuracy, we can store the data locally or in a cache and let another script update it asynchronously at an interval. Fetching real-time data is resource-intensive, and an asynchronous approach usually yields a more responsive system.
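If you would rather not run a separate cron job, the same trade-off can be expressed as a read-through cache keyed on file age. This is a sketch with hypothetical names; unlike the cron version, the first request after expiry pays the RPC cost itself:

import os
import time

def read_cached(fname, fetch, max_age=3600):
    # Return the cached float if the file is younger than max_age seconds;
    # otherwise call fetch(), store the result, and return it.
    try:
        if time.time() - os.path.getmtime(fname) < max_age:
            with open(fname) as f:
                return float(f.read().strip())
    except (OSError, ValueError):
        pass  # missing or corrupt cache: fall through to a fresh fetch
    value = fetch()
    with open(fname, 'w') as f:
        f.write(str(value))
    return value

For example: ratio = read_cached('cache/steem_per_mvests.txt', converter.steem_per_mvests).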

Another example: the @justyy voting bot is accelerated by using a cached list of delegators, updated regularly by another script. This makes the bot more responsive and, of course, shortens each voting round. The pattern is sketched below.
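The delegator cache follows the same pattern; a minimal sketch (the file name and JSON format are my assumptions, not from the post):

import json

def load_delegators(fname="cache/delegators.json"):
    # Rewritten every few minutes by a separate updater script.
    with open(fname) as f:
        return json.load(f)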

Support me and my work as a witness by voting for me here!


I noticed that some of my online Steemit tools were slow to return data, especially when the result contains many rows; waiting tens of seconds is very unfriendly. For example, 138 accounts currently delegate to the YY Bank, and this delegation query tool took tens of seconds to return the data.

Today I looked into it a bit and found that using the Converter from steem-python is particularly slow.

from steem import Steem
from steem.converter import Converter
from nodes import steem_nodes  # list of RPC node URLs

steem = Steem(nodes=steem_nodes)
converter = Converter(steemd_instance=steem)
for row in rows:  # one conversion per row of the result set
    r.append({"sp": converter.vests_to_sp(row["vests"])})

The method converter.vests_to_sp() is slow, and it is called inside a loop, which explains why larger data sets take longer. Looking at the code in converter.py:

def steem_per_mvests(self):
    info = self.steemd.get_dynamic_global_properties()
    return (Amount(info["total_vesting_fund_steem"]).amount /
            (Amount(info["total_vesting_shares"]).amount / 1e6))

def vests_to_sp(self, vests):
    return vests / 1e6 * self.steem_per_mvests()

We can see that steem_per_mvests actually calls steemd to fetch data from the blockchain.

In practice we don't need this value to be accurate in real time, and it doesn't change drastically, so we can simply cache the value of steem_per_mvests() and update it periodically. Here is a rewritten, cached version of vests_to_sp:

import os

def file_get_contents(filename):
    with open(filename) as f:
        return f.read()

def vests_to_sp(vests):
    # Fallback ratio, used when the cache file is missing or unreadable.
    steem_per_mvests = 489.85031585637665
    fname = "cache/steem_per_mvests.txt"
    try:
        if os.path.isfile(fname):
            x = file_get_contents(fname).strip()
            if len(x) > 1:
                x = float(x)
                if x > 0:
                    steem_per_mvests = x
    except (OSError, ValueError):
        pass  # keep the fallback ratio on any read/parse error
    return vests / 1e6 * steem_per_mvests

We need an update script, update_steem_per_mvests.py, that periodically fetches the value from the Steem blockchain and writes it to steem_per_mvests.txt:

from steem.converter import Converter
from steem import Steem
from nodes import steem_nodes  # list of RPC node URLs

def file_put_contents(filename, data):
    with open(filename, 'w') as f:
        f.write(data)

steem = Steem(nodes=steem_nodes)
converter = Converter(steemd_instance=steem)

# Fetch the current STEEM-per-MVESTS ratio and cache it on disk.
x = converter.steem_per_mvests()
file_put_contents('cache/steem_per_mvests.txt', str(x))

Finally, just put it in crontab to run on a schedule (adjust the frequency as needed):

@hourly python3 update_steem_per_mvests.py > /dev/null 2>&1

When designing software, we should avoid fetching data from the blockchain in real time as much as possible, because the performance is poor. Instead, we cache the data that doesn't need to be 100% accurate and update it asynchronously; the overall system then feels much faster and more responsive.

YY's voting bot fetches the list of YY Bank shareholders on every run, and this doesn't need to be accurate in real time. So another script runs periodically (every few minutes) and writes the list to a local file; when the list is needed, it is returned instantly, which greatly reduces the time of each bot round.

Also published on my blog: https://justyy.com/archives/6115

Support my work and my witness campaign by voting for me here!


Slowness is indeed a big problem.

@justyy, I really admire you!

The voting bot is late today; it hasn't come to upvote me yet 😄

steemsql was lagging by more than 3 hours.

Thanks for sharing; I hadn't noticed this issue before.

Yeah, that method really hurt; many of my scripts use that call, and it's quite slow.
