Performance Degraded: Investigating the Nodes

in Witness Activities, 7 months ago (edited)

Since yesterday, many services have been impacted. As I am currently travelling, I may not have enough time to fix this properly, but I have investigated a bit and identified the root cause.

Broken RPC Node

Most of the services use a list of RPC nodes, and one of them malfunctioned and returned outdated data. I guess the RPC node got stuck.

I restarted the services yesterday, but the problem came back today. I can see that the services broke whenever they switched to that problematic node.

Fail-over

There is a fail-over so that when a node is down or simply being rate-limited, the service switches to the next available node (round-robin scheduling). However, there are currently no checks on the actual data returned. The "down" check only looks at whether the HTTP status is 200.
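To illustrate the gap, here is a minimal sketch of this kind of round-robin fail-over. It is not the actual service code: the class name `RoundRobinFailover` and the injected `fetch` callable (which returns an HTTP status and a parsed body) are hypothetical, chosen so the example runs without a network.

```python
from typing import Callable, List, Tuple


class RoundRobinFailover:
    """Hypothetical sketch: move to the next node when a call does not return 200."""

    def __init__(self, nodes: List[str], fetch: Callable[[str], Tuple[int, dict]]):
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = list(nodes)
        self.fetch = fetch  # assumed to return (http_status, json_body)
        self.index = 0      # position of the node currently in use

    def call(self) -> dict:
        # Try each node at most once, starting from the current one.
        for _ in range(len(self.nodes)):
            node = self.nodes[self.index]
            try:
                status, body = self.fetch(node)
            except OSError:
                # Treat connection errors the same as a non-200 response.
                status, body = 0, {}
            if status == 200:
                # Only the status is checked; the body is trusted as-is.
                return body
            # Node is down or rate-limited: rotate to the next one.
            self.index = (self.index + 1) % len(self.nodes)
        raise RuntimeError("all nodes failed")
```

In this sketch, a node that answers 200 with an old head block is accepted like any healthy node, which matches the failure I observed.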

I have removed the broken node from the list, and will closely monitor this.

TL;DR

When an RPC node returns outdated data but with a 200 success response, it will mess up the services. We need to have a better way to identify this (at a minimal cost).
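One cheap way to detect this could be a freshness check on the node's head block. The sketch below is only an idea, not something deployed. It assumes each node's `get_dynamic_global_properties` result has been fetched already and contains `head_block_number` and a UTC `time` string such as `2024-01-01T00:00:00`. The function names and the thresholds (60 seconds, 20 blocks) are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def head_block_age(props: dict, now: Optional[datetime] = None) -> timedelta:
    """Return how old the node's head block is, assuming props['time'] is UTC."""
    now = now or datetime.now(timezone.utc)
    head_time = datetime.strptime(props["time"], "%Y-%m-%dT%H:%M:%S")
    return now - head_time.replace(tzinfo=timezone.utc)


def is_fresh(props: dict,
             max_age: timedelta = timedelta(seconds=60),
             now: Optional[datetime] = None) -> bool:
    """A node counts as fresh if its head block is at most max_age old."""
    return head_block_age(props, now) <= max_age


def lagging_nodes(heads: Dict[str, int], max_block_lag: int = 20) -> List[str]:
    """Return nodes whose head block is more than max_block_lag behind the best one."""
    if not heads:
        return []
    best = max(heads.values())
    return sorted(node for node, head in heads.items()
                  if best - head > max_block_lag)
```

The time-based check needs only one extra call per node, while the comparison across nodes does not depend on the local clock. Either result could be used to skip a node in the fail-over list even when it responds with 200.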

Steem to the Moon🚀!


Keep it up, boss! It is thanks to talented people like you that Steem is where it is today.

Also, if you have time, could you help me make up a missed vote? Thank you very much!
https://steemcn.xyz/hive-180932/@cheva/3ze63z

Haha, what a sweet talker. I have made it up to you here.

Hello @cheva! You are The Best!


command: !thumbup is powered by witness @justyy and his contributions are: https://steemyy.com
More commands are coming!

Since the voting came yesterday, the voting has not come today. I just saw your post and could see the situation. When will this problem be completely resolved? The system is losing people's trust. I hope the instability of Steemit is corrected and stable operation is achieved ASAP. T^T

Vote missing two consecutive days
