RE: Do forks with fallback - my HF21 wish

in #steem, 6 years ago (edited)

You overlooked that we had to do exactly such a "fallback" when we were still on 19.2 and some 20.2 boxes inadvertently started running code prematurely just a week before, causing an unintended early fork.

Which in turn caused all of us who had successfully prepared for this fork to have to roll back our already updated and properly prepared boxes to 19.2 again. (Those of us with some wisdom and experience had our backups in reserve on 19.2, so we only needed to swap and reset to the last correct headblock checkpoint, but it was still a clusterfuck of bugs and errors that led to a two-day outage.)

Your post isn't wrong, but the problem remains that the very creators of the code that has failed us twice now cannot even explain what to expect it to do, have not documented anything in any clear and accessible way, and often do not even comment the places they change in the code, which in turn they cannot explain or predict to others. But sure, tell the "witnesses" to "read the code" - easy for armchair observers to say. Impossible in practice in this context.

Then we don't test. Clones of our chain and front ends are popping up all over, like weku, smoke io and others. But for some reason, even though those small-time outfits can clone our entire infrastructure, we can't seem to have a test net that doesn't apparently need "frequent restarts and wasn't running mixed version nodes", according to one insider developer on that team I've spoken to, though I won't indict him here since he is new to that team and I doubt it's his fault.

So yes, roll back is a thing. But WAY before that, lack of basic coding standards, peer review BEFORE dev commits, documentation, testing, experience and competence are bigger things.

So yes, roll back is a thing. But WAY before that, lack of basic coding standards, peer review BEFORE dev commits, documentation, testing, experience and competence are bigger things.

Yep. Rollback should 'not be seen as an option' in testing and prep, so it doesn't become a crutch, but it should be available if absolutely necessary.

There is no way to prevent bugs in production systems. If there were, cars and planes would never crash. That said, we definitely don't have the safeguards in place here to mitigate them to the level we should.

Fundamental errors were made here. Fundamental checks were not done. Fundamental understanding of the changes being made en masse here was not achieved. Fundamental documentation was not provided PRIOR to the testing and release phases. Fundamental testing environments did NOT exist in a proper fashion for anyone to use, least of all the average witness.

Someone will definitely try to refute this in a follow-up comment here, I'd bet on it, and they will be taking advantage of doublespeak to try and throw shade and fool the public, to discredit those who stand up with this claim because we don't have to worry about losing our income by falling out of the top 20 or our stinc employment, trust, fam. Cui bono - who benefits?

Roll backs are an emergency exit. Before you leap through them, 100 other things should have been done before the plane left the ground.

"The witnesses are at fault, and should have read the code" is at once a truth and a gross red herring, for reasons even the average non-technical reader would understand if we found the right simple metaphors to explain them.

Maybe this metaphor will work. The plane crashes. The engineers made a change to it that they did not create proper tests for, didn't document well, and did not publish about in advance. They had no one check their work before bolting it into the plane and, when queried, said, well, we can't really articulate it in English, but fly the plane awhile and we'll see how it works out.

After the crash they say, well, it's the job of the fuel guys, the ground crew, the flight attendants, the pilot and the co-pilot to make sure our random undocumented changes worked, right?

And the public doesn't know they are wrong, so they get away with such remarks.

And that's why there's a lot of FUD and pissed off people pointing fingers right now.

But it all comes back down to the failure of the engineers who set this up. And of those who allowed it to become this way.

I broadly agree with you but I have to say no to this:

Roll backs are an emergency exit. Before you leap through them, 100 other things should have been done before the plane left the ground.

No way. It's not the first thing, but patch after patch to a live system - no. We're talking vaguely here, your 100 things might be 5 of mine, but as it sounds, no. You've got to know when to say: it's not actually ready, let users come first and let's take our time to get it right.

Ask yourself again, what's the rush for HF20? The sun will rise again.

This reply confused me. It's like you are stating you want to disagree, but then you sort of go ahead and agree? No snark, you lost me here.

Haha! Okay, I see that. What I'm saying is yes, it's for emergencies, and it's theoretically always available, even now, but it is unthinkable to many witnesses. So no, not the 100th thing you try, the 5th thing.

The difference of attitude I'm talking about is perhaps more important. Repeat after me: We can go back. No one believes that.

Thanks. Like I'm saying elsewhere, it's both, it's not either / or. What happened last week isn't what I'm talking about. In the diagrams you see a for-real plan B as fallback. We haven't ever had that.

Yes to testing, one hundred times yes. I had an idea before about testing changes as a matter of course for a defined time period, say 1 month. After that the test coins (TESTS, I believe, is the convention) are worth something on the main chain, at least something. The idea was not taken seriously, but this may be the time to advance it again, or at least to start looking outside the box at such solutions.

Hear! hear!

Rollback will not be possible if there are changes to the blockchain (database schema); it will require a complete replay.

Not entirely. We did it just last week, and only had to go back to the last good headblock.

  1. I was not able to reply because of lack of "MANA"

Not entirely. We did it just last week, and only had to go back to the last good headblock.

I was not around, but from what I understand it was from one minor version to another minor version. It was a soft fork and not a hard fork, which often includes changes to the "consensus" logic and also to the format/schema in which blockchain snapshots are stored. Part of the blockchain state from many plugins is now stored in RocksDB. So a restart is possible without replay in cases where the schema and the consensus are not changed.

I believe eventually STEEM is heading to a model where only the consensus related data will be on the immutable blockchain and rest will be in various databases.

Also, I need to elaborate on the "roll back" - generally roll back means going back to an earlier state. So what I meant to say is that a complete re-index will be needed if there are changes to the consensus state. Roll back would go against the "immutable nature" of blockchains. TheDAO attack on the Ethereum chain is probably the best example, where the immutability was not touched and forks were brought in to fix the issues (with the smart contract): https://ethereum.github.io/blog/2016/06/17/critical-update-re-dao-vulnerability/

go back to the last good headblock.

I am not sure how this was done - were the blocks after the last good head block ignored?

You aren't wrong at all about any of your assessment, past or present.

Except there was a fork from mixed node versions, leading to a split chain, aka an actual unintended fork. We DID roll back to a checkpoint block and restarted the chain, quickly, and lost transactions (reversed, as if they never happened), so if we do it fast enough (too late now because there is way too much to undo), it is not entirely impossible.

Oh, I was not aware of this - interesting scenario. Thank you for explaining... This sounds like a classic Byzantine generals scenario.

Careful, we might sound smart and able to code and NEVER make the top 20...

Oh, everything I said above was bluffing ... There are infinite parallel blockchains and infinite numbers of top 20s ... as people from this chain, which has done more hard freezes, I mean forks, than every other blockchain in the known universe of blockchains put together, the immutable genesis blocks of all the chains will bless us with infinite amounts of mana ... the super intellectual state machines using probabilistic methods to maintain intergalactic consensus will help us with the intelligence to even understand the meaning of 42 ... believe in Satoshi ... don't fear ... Amen! Aham Brahamasmi.

When I am not bluffing I speak like this. Will this help?

PS: 42 is the "Answer to the Ultimate Question of Life, the Universe, and Everything" in The Hitchhiker's Guide to the Galaxy books.

If you weren't married with a new baby on top of all you do here (congratulations on the baby again), I might change my sexual persuasion and hit on you, just on the sheer awesomeness of that comment and the fact you own 2000 banana plants and know how to prep a jackfruit.
