RE: Steem recovered from the hf20 fork thanks to abit and its emergency patches

in #witness • 6 years ago

Seriously?
Why were 0.20.0 software versions running?
Why didn't anybody review the code?
Why is everything hidden in secret chats?
I have no idea what version to check out.

This is utter bullshit.
Zero communication to us low-ranking witnesses.
No proper documentation or instructions given.

This case was another example that Steem is not decentralised.


@isnochsys I understand you are frustrated. Imagine how I feel. HF20 has been in the works for so long. It really is a pity this bug was missed, most probably through a simple testing error: seconds instead of hours were used in the payout window (my guess is that this was done to test it more quickly). Then the bug occurred, causing the 0.19 nodes to halt; the obvious move was to revert to 0.19, as the exchanges use it. Then a second bug, in 0.19, actually prevented rolling back to it, also because (some) HF20 nodes were still producing with the bug enabled, making the network dirty.

About your questions/remarks:

  • Of course people reviewed the code. They even ran testnets and simulated forks. This was really an unfortunate miss. We are all human. See below.

  • 0.20 was running because a witness runs it to signal / vote for the hardfork to happen at the scheduled HF time. The software is built so that new features should only be activated after the "scheduled HF time", which in this case was Tuesday next week. Obviously, as a witness it is your task to keep the older software running as a backup for a while, until you are certain the new software is running OK. Many did this, but when they switched back to 0.19, the bug in that software occurred. Network stuck.

  • Not "everything" is hidden in secret chats. It is the same as when disaster strikes: it's best not to start shouting around and create useless panic. What happens when the chain halts: one (and only one) node with a patch should restart producing blocks, regardless of whether other nodes are running, and start broadcasting blocks again with the patch enabled. Normally the software waits for "a recent block" before producing, so there must be only one producing node set to ignore this (which can be done in config.ini). Imagine hundreds of witnesses producing these 'stale blocks'; that would not help a restart. As the chain state, when crashed, still keeps the current vote slate, it is essential for a "quick" relaunch that the witnesses who had a top-20 position prior to the crash are on duty, ready to apply patches and start signing again as soon as possible. It is less important that nodes who only sign a block once in a while also know this immediately, until the code is verified and working correctly. As you say, you need 36 hours for a replay; imagine finding out during the replay that the patch you applied did not fix it: another replay. Hence stuff is kept in a smaller group when such an emergency arises.

  • Steem is decentralised. But all nodes crashed. The token distribution might be skewed, but that is something different from having a decentralised network. The chain is 100% decentralised. Even this fix was done not by Steemit Inc (the main developer team of the Steem chain) but by @abit, as Holger also mentioned. Obviously @abit deserves your witness vote; he has been working on Graphene for who knows how long and comes from the BTS era, where he is one of the core team (DAO) developers. Personally, I've been voting for @abit since I learned about him when I started on this platform.
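To make the "recent block" point above a bit more concrete, here is a rough Python sketch of the idea, not the actual steemd code. The names `may_produce`, `allow_stale` and `MAX_HEAD_AGE` are made up for illustration, and the timestamps are arbitrary. As far as I remember, the matching config.ini switch is `enable-stale-production`, but please check that against your own steemd version before relying on it.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how old the head block may be before the chain counts as "stale".
MAX_HEAD_AGE = timedelta(seconds=30)


def may_produce(head_block_time, now, allow_stale=False):
    """Return True if this node should try to produce a block."""
    if allow_stale:
        # The single designated restart node skips the freshness check.
        return True
    # Everyone else waits until a recent block has been seen.
    return now - head_block_time <= MAX_HEAD_AGE


# Illustrative timestamps only: the chain halted three hours before "now".
now = datetime(2018, 1, 1, 12, 0, 0)
halted_head = now - timedelta(hours=3)

print(may_produce(halted_head, now))                    # normal witness: False
print(may_produce(halted_head, now, allow_stale=True))  # restart node: True
```

With only one node allowed to skip the check, everyone else resumes on top of its blocks once a fresh head block shows up, instead of hundreds of nodes each extending their own stale view.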

Lastly: the patch was needed to get beyond the "rogue" HF20 chain, which was still producing blocks and moving the last irreversible block forward. Since we agreed that chain should be halted, it had to be ignored by means of a patch. As the chain has been restored and HF19 overtook HF20, all is good now and you can / should even be able to restart your HF19 node without any patches. Just give it time when you arrive at the "drama block sets". It won't harm to put a checkpoint in your config.ini:

checkpoint = [26037575, "018d4d47225e6cada82b9aaabc8503ee318c547c"]
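If you want to sanity-check a checkpoint line before pasting it, here is a small Python sketch. It relies on my understanding that Graphene-style block IDs carry the block number in their first four bytes (eight hex characters); the arithmetic works out for the checkpoint above. The function names are made up for illustration, and `passes_checkpoint` only mimics the idea of a checkpoint, it is not how steemd implements it.

```python
# The checkpoint from the config.ini line above: (block number, block ID).
CHECKPOINT = (26037575, "018d4d47225e6cada82b9aaabc8503ee318c547c")


def block_num_from_id(block_id_hex):
    """Read the block number from the first 4 bytes (8 hex chars) of a block ID."""
    return int(block_id_hex[:8], 16)


def passes_checkpoint(block_num, block_id_hex, checkpoint=CHECKPOINT):
    """Accept a block unless it sits at the checkpoint height with a different ID."""
    cp_num, cp_id = checkpoint
    if block_num != cp_num:
        return True  # the checkpoint says nothing about other heights
    return block_id_hex.lower() == cp_id


print(block_num_from_id(CHECKPOINT[1]))  # 26037575
print(passes_checkpoint(*CHECKPOINT))    # True
```

A block from the other fork at height 26037575 would have a different ID and would be refused, which is the whole point of pinning the block.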

I find these moments kinda magical in a way. You have the chain. It stopped. It paused kinda. It stopped because something was going wrong. It saves itself by 'crashing'. A fix is applied (yes it took a while because of HF 20 nodes still running) and then voila - 'move along people, nothing happened'.

It was fixed without the help of Steemit Inc, so it's kind of decentralised.

The main problem is price manipulation; because of that, it is not a bad idea to hide this chat.

Check out 0.19.12 and apply the patches from abit, then replay. I pasted my changes to the config.ini.

The replay will take me 36 hours or so. If it doesn't work after that, maybe there will already be a better routine in place...
a real hotfix, maybe even a patch.

There is a branch which applies the 2048 fix: https://github.com/steemit/steem/commits/20180917-increase-fork-buffer

but no official release yet.

so it's kind of decentralised

There is no decentralisation if Steem is completely down because a few witnesses wanted to run new software.

The witnesses are elected by the Steem users and have the power to vote for a new hardfork. So when HF20 was announced, every witness had to decide whether to install it or not. When a defined percentage of witnesses had gone to HF20, it was approved. Normally, this should not be a problem, and the new features are then activated on the scheduled date.
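Roughly, the two conditions described above (enough witnesses signalling, and the scheduled date reached) look like this in a Python sketch. This is only an illustration, not the steemd logic: `hardfork_active` is a made-up name, the dates are arbitrary, and the threshold of 17 out of 21 is what I remember for Steem, so treat it as an assumption.

```python
from datetime import datetime

# Assumption: 17 of the 21 scheduled witnesses must signal the new version.
REQUIRED_WITNESSES = 17


def hardfork_active(votes_for, scheduled_time, now, required=REQUIRED_WITNESSES):
    """New features switch on only with enough witness votes AND after the scheduled time."""
    return votes_for >= required and now >= scheduled_time


# Illustrative dates only.
scheduled = datetime(2018, 1, 8, 15, 0, 0)
before = datetime(2018, 1, 1, 12, 0, 0)
after = datetime(2018, 1, 8, 15, 0, 1)

print(hardfork_active(19, scheduled, before))  # enough votes, too early: False
print(hardfork_active(19, scheduled, after))   # enough votes, time reached: True
print(hardfork_active(12, scheduled, after))   # time reached, too few votes: False
```

The bug meant that the HF20 nodes did not stay quiet before the scheduled time, which is exactly what the first case is supposed to guarantee.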

This time, there were two undiscovered bugs, one in HF 19 and one in HF 20.

After the crash, the elected witnesses have the duty and the power to recover and to start the steem chain again.

I do not see why this is not decentralized.

why this is not decentralized

Because the whole community has to decide who those 20 central witnesses are. Those twenty are a centralised power and decide what to run for the whole network.
We should rearrange our votes for them if they fail like yesterday.
Vote for @abit!
