
RE: Step 2 on developing UA: Calculating UA scores from a MongoDB subset

in #utopian-io, 7 years ago

An algorithmic deterrent/solution to spam is a great idea and I hope it goes forward, but @lextenebris raised the point that we need to create a snapshot and ACTUALLY see/test how the algorithm performs. First, can we identify the bots? Secondly, how will it impact users? Mathematical models are by definition cleaner than the real world in which they must perform. (Just ask any physicist ;-)
I would like to see a public tool that lets any member check the REAL impact of the algorithm ON THEM. And then maybe a vote!

What are the odds of getting some such tool?


As I've said multiple times before, UA is not intended as a stand-alone tool. It is a metric derived from the already existing follower data, and it could be combined with other metrics to detect spam: accounts with a very small UA && a very high post frequency && very high post rewards could very well be "reward-grabbing self-upvoters". Note, however, that UA is not just intended as an anti-spam tool; it can be used for all kinds of things. Regarding post curation, look at it as a "mathematical assistant and extension of human curation": it's not just algorithmic, it's an extension of human activity, like a word processor is to handwriting, a CMS is to publishing website content, and a car is to transportation.
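To make that combined profile concrete, here is a minimal sketch of such a filter. The field names (`ua`, `postsPerDay`, `avgRewardPerPost`) and all threshold values are assumptions invented for illustration; nothing in the UA code defines them, and real cut-offs would have to be tuned on actual data.

```javascript
// Hypothetical thresholds: illustrative values only, not taken from the UA project.
const THRESHOLDS = { maxUa: 0.01, minPostsPerDay: 10, minRewardPerPost: 5 };

// Flags an account only when all three signals coincide:
// very small UA && very high post frequency && very high post rewards.
function looksLikeRewardGrabber(account, t = THRESHOLDS) {
  return account.ua < t.maxUa &&
    account.postsPerDay > t.minPostsPerDay &&
    account.avgRewardPerPost > t.minRewardPerPost;
}

// Made-up sample accounts, not real Steem data.
const accounts = [
  { name: 'alice', ua: 0.8, postsPerDay: 2, avgRewardPerPost: 30 },
  { name: 'bot42', ua: 0.001, postsPerDay: 40, avgRewardPerPost: 12 },
  { name: 'newbie', ua: 0.002, postsPerDay: 1, avgRewardPerPost: 0.1 },
];

const flagged = accounts
  .filter((a) => looksLikeRewardGrabber(a))
  .map((a) => a.name);
console.log(flagged);
```

Note that a low UA alone does not flag anyone here: the hypothetical `newbie` account has a tiny UA but posts rarely and earns little, so it passes.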

@yulem, we are still developing the code to implement UserAuthority (UA). This is post #2 about that development; we're not done yet. As this article shows, I created a GitHub repo. Snapshot all you want with it. Because the Steem blockchain dataset is so large, it's impractical to begin development on the entire dataset: how would we tell whether the results are correct or whether the code contains a calculation bug? Therefore, we're first developing the code against my test / example matrix and graph. Once that works completely, we will optimize the algorithm for faster / more efficient performance on the real-life Steem dataset.

The development roadmap from here on looks something like this:

  • implement writing to and reading from the binary UA-results index (we can later add more data to that index if we like, but at first it will only contain account info and UA-results);
  • optimize the calculate.js algorithm: as it is now, it computes an entire UA iteration in RAM. We need to change that by adding intermediary database writes to flush RAM, plus additional data to detect where we left off / need to restart. That would also allow for data sharding / parallelization, btw;
  • chronologically read all blockchain blocks for follower transactions, so we can update "followDB" (which stores the entire Steem follower matrix) per blockchain block;
  • try to implement UA within the upvote calculations of the Utopian-IO bot: we will then not just count human votes per post, but UA-scores per human vote. 50 bot upvotes (all with very low UAs) on a post will then count for far fewer rewards than 1 upvote from, for example, @stellabelle or the no.1 witness @jesta (very high UA). This last step is a very important "public tool" of the kind you ask for.
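The arithmetic behind that last roadmap item can be sketched as follows. This is only an illustration, assuming the simplest possible weighting (summing the voters' UA scores); how UA will actually enter the Utopian-IO bot's calculation is not decided here. The function `uaWeightedScore`, the `uaByAccount` lookup, the account names and all UA values are hypothetical.

```javascript
// Sums the UA scores of a post's voters instead of counting votes.
// `uaByAccount` is a hypothetical lookup (account name -> UA score);
// voters without a known UA contribute 0.
function uaWeightedScore(voters, uaByAccount) {
  return voters.reduce(
    (sum, name) => sum + (uaByAccount.has(name) ? uaByAccount.get(name) : 0),
    0
  );
}

// Made-up UA values: one high-UA account and 50 low-UA bot accounts.
const uaByAccount = new Map([['highUaAccount', 1.0]]);
const botVoters = [];
for (let i = 0; i < 50; i++) {
  const name = `bot${i}`;
  uaByAccount.set(name, 0.001);
  botVoters.push(name);
}

const botScore = uaWeightedScore(botVoters, uaByAccount);
const humanScore = uaWeightedScore(['highUaAccount'], uaByAccount);

console.log(botVoters.length, 'bot votes score', botScore);
console.log('1 high-UA vote scores', humanScore);
```

With these invented numbers, the 50 bot votes together add up to roughly 0.05, far below the single high-UA vote at 1.0, even though a plain vote count would rank them 50 to 1.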

Nota bene: Another great "public tool" would be a live interactive followerGraph where an account's UA-score is visualized by node size.
Feel free to collab with, for example, @lextenebris to develop such a public tool. I would resteem development posts on that with much enthusiasm!!

Thanks for your thoughtful reply. I do get where you're coming from and fully support you in that approach with its associated goals.
Maybe "tool" is a bad word choice.
Steemit is a select microcosm of humanity and as such we are embarking on a very noble experiment in its governance. Consider each hard-fork as an amended constitution (in which UA should probably play a role) which governs us all. Having witnessed other such experiments and seen them fail miserably for a variety of reasons, I'm no doubt overly sensitive to the unanticipated ramifications of even the most seemingly insignificant acts. Hence my tendency to simulate before deployment.
I'm afraid my development days are behind me, but I get your point. I fully entrust the future to your younger and capable hands.
I might be available for beta testing or QA ;-)
