RE: The Gridcoin Fireside #6 - Gridcoin 4.0.3.0: The Scraper. This one is kind of a big deal.
I think it would be great to have a scraper node application process: Fulfill these X criteria to become a candidate scraper.
One thing not covered in the discussion was geographic and hardware diversity for scraper nodes. I'm sure you realise why having all scrapers on AWS in the USA (for example) would be a bad idea; if AWS or the USA were offline for some time then no superblocks would occur. Clearly this isn't as serious as blockchain full-node diversity, but 5 scrapers feels a little on the light side to me.
I think an enhancement may be to use what you described as the DPoS-type thinking: if we had more validated scrapers, a simple round-robin cycle could be used (see the sketch below). Perhaps the maximum we should have is 25, to prevent BOINC projects from feeling like they are being DDoSed, and if we have 50 validated scrapers we just take turns in who provides the consensus for each SB.
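To make the round-robin idea concrete, here is a rough sketch in Python (this has nothing to do with the actual scraper code; the names and the rotation rule are made up purely to illustrate the concept):

```python
# Hypothetical sketch: rotate which validated scrapers are "on duty" for each
# superblock so that every operator takes a turn. This is NOT how the current
# scraper works; it only illustrates the round-robin idea discussed above.

def scrapers_for_superblock(validated_scrapers, superblock_index, active_count=5):
    """Pick a deterministic, rotating subset of scrapers for a given superblock."""
    n = len(validated_scrapers)
    start = (superblock_index * active_count) % n
    # Wrap around the list so the rotation eventually covers every validated scraper.
    return [validated_scrapers[(start + i) % n] for i in range(active_count)]

validated = [f"scraper-{i:02d}" for i in range(50)]  # e.g. 50 validated operators
print(scrapers_for_superblock(validated, superblock_index=0))
print(scrapers_for_superblock(validated, superblock_index=1))
```

Since every node can compute the same subset from the superblock index, no extra coordination would be needed beyond agreeing on the list of validated scrapers.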
I would probably happily operate a scraper node. I'm not the most active in the community, but I've been around Gridcoin since Classic days, and I've been a BOINCer for over a decade.
I can answer this for our current status. As of right now, all 5 scrapers are in unique locations and spread around the globe. At present we have scrapers covering US East, US West, the UK, Germany, and Australia. We also have a diverse group of service platforms hosting the scrapers, with at least 3 unique providers being utilized (to my knowledge). I agree more scrapers would be beneficial. Are there any other geographic zones we need to cover? Ideally we should cover more of Asia, but most node graphs seem to show a low percentage of our userbase in Asia (perhaps VPNs are masking this?).
Ok, well that's good to hear. We shouldn't try to include a geographic location at the expense of another key attribute, but Asia, South America, and Africa would ideally have some representation.
You don't really need that many more. Perhaps 10-15 is already far more than enough. 5 scrapers allow 2 to be down and require 3 to agree to converge; 10 scrapers allow 4 to be down and require 6 to agree to converge. Remember the whole algorithm is designed to essentially make the scrapers transparent: the nodes do not trust individual scrapers. 25 and up really starts to put load on the BOINC servers again, for essentially no benefit.
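For anyone curious about the arithmetic behind those numbers, here is a quick sketch assuming a 60% convergence threshold (the exact rule in the scraper code may differ slightly):

```python
import math

# Quick sketch of the agreement arithmetic, assuming roughly 60% of scrapers
# must publish matching stats for the network to converge on a superblock.
CONVERGENCE_RATIO = 0.6

def convergence_numbers(scraper_count):
    required = math.ceil(CONVERGENCE_RATIO * scraper_count)  # scrapers that must agree
    tolerable_down = scraper_count - required                # scrapers that can be offline
    return required, tolerable_down

for n in (5, 10, 15, 25):
    required, down = convergence_numbers(n)
    print(f"{n} scrapers: {required} must agree, {down} can be down")
```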
We also cannot distribute the Einstein credentials to the entire network, or even to a subset unless those people are validated, as that is not in keeping with the project's intended use of the credentials. Hence the scrapers will always be run by trusted members of the community.
I am all for having more, to a point. I would be happy for you to run one if you want, @scalextrix.
For many people, myself included, downtime is not a fear. Collusion is the fear. As you state in the recording, this is blockchain. The rule is "trust no node". So if we must trust for now, which we must, trusting 5 people is incredibly scary, but understandable for a boot-strap. Trusting 10-15 is the absolute minimum in my eyes. Trusting 25 would be good. Down the road, as the tech develops, trusting even more would be best.
I would not worry about BOINC server load if we build an ecosystem that encourages BOINC development.
I would not worry about credentials if we encourage or incentivize entities to seek scraper status.
Additionally, the more entities that put their reputation at risk by hosting scraper nodes, the fewer scraper nodes are needed to establish reliability. A university department is less likely to collude and risk damaging its reputation than an individual. This also depends on how blockchain law evolves.
This is one of those things we have to be thinking decades ahead on, or we'll be building a system that can easily be replaced by one that is better.
Hmm... You are not trusting 5 people. The algorithm is built to cross-verify. Collusion is the real issue. Starting from the position that the scrapers are independent, it requires a bad actor to gain direct control of 60% of the scrapers and publish stats in such a way as to achieve convergence (i.e. matching hash and signature). Because of the hashed and signed nature of the messages, man-in-the-middle attacks will not work. (The man in the middle does NOT have the private key to sign the messages properly.)
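To illustrate why a man in the middle gets nowhere, here is a toy Python example using the `cryptography` package with Ed25519 keys (purely illustrative; this is not the scraper's actual message format or key scheme):

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# A scraper signs the stats it publishes with its private key...
scraper_key = Ed25519PrivateKey.generate()
stats = b"project=einstein;user=123;total_credit=45678"
signature = scraper_key.sign(stats)

# ...and every node verifies the message against the scraper's known public key.
public_key = scraper_key.public_key()
public_key.verify(signature, stats)  # passes: the stats are authentic

# A man in the middle can alter the stats, but cannot produce a valid
# signature for the altered message without the scraper's private key.
tampered = b"project=einstein;user=123;total_credit=99999"
try:
    public_key.verify(signature, tampered)
except InvalidSignature:
    print("Tampered stats rejected")
```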
The sole issue here is the probability of non-independence of the nodes. Certainly the chance of "collusion" is higher with a low number of nodes, but getting beyond even 10 or 15, the chance of collusion between 60% of the nodes is exceedingly small.
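As a back-of-the-envelope illustration (and only that, since it leans on exactly the independence assumption under discussion): if each scraper independently "goes bad" with some probability p, the chance that 60% or more of them do so falls off quickly as the count grows.

```python
from math import comb

# Toy model: each scraper is independently compromised with probability p.
# P(at least 60% compromised) is a binomial tail that shrinks rapidly with n.
# Illustrative only; real collusion risk depends on how independent the
# operators actually are, which is the whole point of the discussion above.

def prob_60pct_bad(n, p):
    k_min = -(-(6 * n) // 10)  # ceil(0.6 * n) using integer arithmetic
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

for n in (5, 10, 15, 25):
    print(f"{n} scrapers, p = 0.1 each: P(>=60% bad) = {prob_60pct_bad(n, 0.1):.2e}")
```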
We need to publish a list of the scraper nodes to show people that they are truly independent.
Sorry for the necro-post, but collusion is about trust! haha I hear what you're saying. With 5 scraper nodes we are trusting that those 5... actually just 3 of them... will not collude.
I understand the need for security-through-privacy, etc. The boot-strap as-is is fine. Looking forward to building it out!
Hi, yes, I'm certainly not advocating for a lot of scrapers, or, as you say, we would be hurting projects again. My thinking is to perhaps have enough validated scrapers available "waiting in the wings" in case of catastrophe.
Not so much because I think it's a problem having 5, but I have seen enough "decentralize everything" rhetoric get thrown at projects.
I'll drop you a PM on Slack; let me know what's required, and if I can comply then I will be happy to support the network in this effort.
There is a difference between "decentralize everything" and "this is probably not decentralized enough". Having a core mechanism that can be easily corrupted with minimal effort or cost is fairly dangerous. Look at EOS and other DPoS systems, along with how incredibly reputable figures in the Bitcoin and blockchain space approach decentralization.
This is a mechanism that should be as decentralized as possible. It does not need to be 100% distributed, however.
To be clear, though, 5 is fine for a boot-strap. It's similar to how Bitcoin started.
Scrapers waiting in the wings is a good idea.
I think you hit on some hot topics for where I also hope this mechanism goes.
I think there are dozens of possibilities for where we can take this mechanism, and I am very excited to see what people come up with. Things like the gamification/incentive potential. Maybe second, third, or fourth sets of scraper nodes could be incentivized to collect additional off-chain data/statistics from other distributed computing platforms (or other things like solar energy production).
The round-robin of 50 scrapers sounds a lot like combining Brod's DWP proposal with this scraper mechanism, which I think holds potential but would need input from more technically minded people.
For now I think keeping things as-is is the most practical route, as this scraper, while minimal, lets us focus on the other things we need to fix. I see this as a boot-strap of what can become a fairly intricate system with some pretty neat effects. I would not feel comfortable with the mechanism as-is if the network grows rapidly or if a significant amount of time passes (for now, let's say a year) without revisiting its operation and the points you brought up.
50 isn't a magic number I think we should reach. I'm just saying that if we had more quality providers than we need, it would be better to share the load than turn offers away.