Witness Back Online, and WHY Steem Crashed

in #witness-update6 years ago (edited)

My witness server is back online on v0.19.12 after turning it off last night. I was already running this version, but when the last replay finished after a restart, it was still giving errors. I didn't know about truncating the block_log.

So I thought I had to get a new block_log and start over :/ Doh! After a long wait, everything is back in order again.

It appears the outage was caused by HF20 code conflicting with existing HF19 code, since witnesses were running both versions on the blockchain at the same time.

It seems the chain failed when a vote was applied during the lockout period, which had changed between HF19 and HF20. The lockout period is the window just before a post pays out, during which you can't vote on it. A vote was made that the two versions of the code running on different witnesses handled differently, and that bug crashed the chain. At least, that's how I understand it (I may be wrong).
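To make the idea concrete, here is a minimal sketch of how two versions of a lockout rule can disagree about the same vote. It is not the actual steemd code: the `Vote` class, the `accepts_vote` function, the two lockout lengths, and the timestamps are all hypothetical values picked for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Vote:
    cast_at: datetime    # when the vote was broadcast
    payout_at: datetime  # when the post is scheduled to pay out


def accepts_vote(vote: Vote, lockout: timedelta) -> bool:
    """Return True if the vote lands before the lockout window opens."""
    return vote.payout_at - vote.cast_at > lockout


# Hypothetical lockout lengths for an "old" and a "new" rule set.
# These are NOT the real HF19/HF20 values.
OLD_RULE = timedelta(hours=12)
NEW_RULE = timedelta(hours=6)

# Arbitrary illustrative timestamps: a vote cast 8 hours before payout.
payout = datetime(2018, 1, 1, 12, 0)
vote = Vote(cast_at=payout - timedelta(hours=8), payout_at=payout)

old_ok = accepts_vote(vote, OLD_RULE)  # inside the 12h window: rejected
new_ok = accepts_vote(vote, NEW_RULE)  # outside the 6h window: accepted

print("old rule accepts:", old_ok)
print("new rule accepts:", new_ok)
```

When nodes give different answers for the same vote, they also disagree about whether a block containing that vote is valid, which is the kind of split described above.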


Thank you for your support. Peace.


If you appreciate and value the content, please consider: Upvoting, Sharing or Reblogging below.
Follow me for more content to come!


My goal is to share knowledge, truth and moral understanding in order to help change the world for the better. If you appreciate and value what I do, please consider supporting me as a Steem Witness by voting for me at the bottom of the Witness page.


Glad to see you back online. You're a witness I support.

I wrote more details about it here.

The bug had existed since July 18, and was only discovered recently, when there was enough of a mix of 0.19 and 0.20 witnesses running together that consensus could not be reached.
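A rough sketch of why the mix matters: if a block is only valid under one version's rules, only the witnesses on that version will accept it. The 21-witness schedule and the 75% threshold below are assumptions for illustration, and `accepting_share` and `reaches_consensus` are hypothetical helpers, not part of steemd.

```python
def accepting_share(witness_versions, accepting_versions):
    """Fraction of witnesses whose version accepts the block."""
    accepting = sum(1 for v in witness_versions if v in accepting_versions)
    return accepting / len(witness_versions)


def reaches_consensus(witness_versions, accepting_versions, threshold=0.75):
    """True if enough witnesses accept the block (threshold is an assumption)."""
    return accepting_share(witness_versions, accepting_versions) >= threshold


# Hypothetical schedules of 21 witnesses.
mostly_old = ["0.19"] * 20 + ["0.20"] * 1
mixed = ["0.19"] * 12 + ["0.20"] * 9

# A block that only 0.19 nodes consider valid:
print(reaches_consensus(mostly_old, {"0.19"}))  # 20 of 21 accept
print(reaches_consensus(mixed, {"0.19"}))       # 12 of 21 accept

# A block that only 0.20 nodes consider valid:
print(reaches_consensus(mixed, {"0.20"}))       # 9 of 21 accept
```

With almost everyone on one version, a disputed block still clears the bar. With the mixed schedule, neither side gets there, so the chain stalls.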

Glad to see everyone recovered from it well. :)

Thanks for the support ;) We're still ticking after taking a licking ;)

What a chaos.

However, these issues are very useful for showing that blockchain technology is still in its infancy.

14 hours of downtime is a bit extreme. Things will need to improve a lot before private corporations start seeing blockchain as a useful tool.

We need to squash bugs faster, just for starters. And find fallback mechanisms to prevent a bug from creating such a long downtime.

Thanks for keeping us informed. Keep up with the good work!

Yeah, it's long because of the size of the chain. Next time it will be longer, as the chain keeps growing... Smaller chains don't have as much trouble getting back online. Better testing methods are indeed required. You're welcome, thanks!

Yes, I got some info on the bug that made the platform stand still for some hours. I was afraid when I couldn't log into my account.
I've also heard of the tough fight it took the witnesses to bring back the platform. More grease to your elbows for the good work.

Yup, it was a long outage this time. The more the chain grows (now at around 150 GB), the longer it will take to come back online when issues like this happen.

Thanks for the short and concise explanation, it's been the easiest one to understand so far! It really was a ballet of emotion yesterday, and I kinda enjoyed all the Steemians freaking out and supporting one another. It was a pretty entertaining day. Not that every day on Steemit isn't filled with drama, this just took it to a new level.


LOL, SteemDrama, tune in next time ;)

That's what I've heard, too, @krnel. In talking with different folks today and yesterday, it sounds like that particular instance, voting in the lockout period, was not a part of the testnet, which would be the reason why it wasn't caught there.

Yeah, it's a shame, hopefully that mess-up won't happen again :/

Not that particular mess-up. That wouldn't be good at all. It's another one I worry about. Any other one. When it comes to human error, I understand it's inevitable (I certainly couldn't stand up under any scrutiny of perfection), but for me, that's not an adequate excuse to not do everything in your power to ensure something else doesn't stop the blockchain. I don't get anything out of a stopped blockchain, so this fix-it-when-it-breaks mentality is hard for me. I also have a hard time accepting the inevitability of anything other than death, and even that I'm not rushing headlong to embrace. :)

I am wondering what happened in detail. Usually this should fork the chain. I guess that is what happened, and then the nodes were not set up to deal with the fork, so everything crashed.

I saw that at some point there were more than 500 blocks awaiting consensus. What happened to those? Did they get replayed?

I also noticed that one of my accounts suddenly had too little voting power even though it had not made any votes... No idea what could have caused that.

No, people upgraded to HF20, and since not everyone upgrades at the same time, some people are on 20 and some on 19, and there was a coding conflict between the two. Some blocks had to be expunged because they were corrupted.

Thank you for the update. Do you foresee any more issues with the fork?

No, but who knows ;) I think they will get their code right the next time hehe ;)

So which version are you all on now? Confusing? MUCH!

You can see here: https://steemd.com/witnesses

Most are on 19.12 and some on 19.6

Superb, thank you.

Thanks for letting us know what happened.

You're welcome ;)

I wondered what happened. I tried for hours to get on. I've been through this before and knew it would eventually resolve. Thanks my friend @krnel

Yup, eventually things work again.
