Step 2 on developing UA: Calculating UA scores from a MongoDB subset

in #utopian-io, 7 years ago (edited)

Following the previous article, "First step to building a PoC for UA", @scipio and I have continued developing the project. In that first article, I discussed how we built a subset of the Steem blockchain data, restricted to follower data, which we read (as an intermediate step) through the MongoDB interface of Steemdata.com. In this second development step, we calculate UA scores over multiple iterations and test the results against the example follower matrix dataset that @scipio presented in his first article.

Technical approach

We started with a JSON file holding the follower matrix of the example follower graph. The file lists the user names (the nodes in the graph) and the user IDs. The IDs matter for step 3, writing UA data to and reading it from a one-dimensional binary index file: there, accounts / users must be stored by ID, in the chronological order in which they joined Steem. We calculate the UA results for all accounts iteratively, using intermediate MongoDB collections (UA and UA1).
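To make the idea of computing scores over multiple iterations concrete, here is a minimal sketch. It assumes a PageRank-style update with a damping factor, which is a stand-in and not necessarily the formula UA-JS uses. The account layout (`id`, `name`, `following`), the function name `computeScores`, and its parameters are all hypothetical.

```javascript
// Sketch only: a PageRank-style iteration used as a stand-in for the UA formula.
// The account shape ({ id, name, following }) is a hypothetical layout.
function computeScores(accounts, { damping = 0.85, iterations = 50 } = {}) {
  const n = accounts.length;
  const indexById = new Map(accounts.map((account, i) => [account.id, i]));
  // Keep only follow edges that point at accounts present in the dataset.
  const targets = accounts.map((account) =>
    account.following.filter((id) => indexById.has(id)).map((id) => indexById.get(id))
  );

  let scores = new Array(n).fill(1 / n); // start from a uniform distribution
  for (let round = 0; round < iterations; round++) {
    const next = new Array(n).fill((1 - damping) / n);
    let unassigned = 0; // score held by accounts that follow nobody
    for (let i = 0; i < n; i++) {
      if (targets[i].length === 0) {
        unassigned += scores[i];
        continue;
      }
      const share = (damping * scores[i]) / targets[i].length;
      for (const j of targets[i]) next[j] += share;
    }
    // Spread the unassigned score evenly so the total stays at 1.
    for (let i = 0; i < n; i++) next[i] += (damping * unassigned) / n;
    scores = next;
  }
  return scores;
}

// Made-up three-account graph: bob and carol follow alice, alice follows bob.
const exampleGraph = [
  { id: 0, name: 'alice', following: [1] },
  { id: 1, name: 'bob', following: [0] },
  { id: 2, name: 'carol', following: [0] },
];
const exampleScores = computeScores(exampleGraph);
exampleGraph.forEach((account, i) => console.log(account.name, exampleScores[i].toFixed(4)));
```

In this toy graph, carol has no followers and ends up with the lowest score. Each round reads only the previous round's scores, which may be why two intermediate collections are enough for the real computation.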

Installation of UA-JS

Do you want to test how the current algorithm works and verify its output? Then install the code via the following steps!

Step 1: Install node


  • Clone the GitHub repository somewhere on your device.
  • Install node (there are plenty of resources online that explain how to do that).
  • Open a console and navigate to the uaJS folder.
  • Run npm install.

Step 2: Create the testDB collection


  • Install mongodb and launch mongod (there are plenty of tutorials on the internet).
  • Use mongoimport from the mongodb folder to import the test collection: mongoimport --db uaJS --collection testDB --file testDB.json. Replace testDB.json with the full path to the file.

Step 3: Run computeUA.js


  • Run the test with node test/computeUA.

On the first run, the uaDB and uaDB_1 collections do not exist yet, so you won't need these two lines; comment them out:

/*     await db.collection('uaDB').drop();
      await db.collection('uaDB_1').drop();*/
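As an alternative to commenting these lines in and out by hand, a small guard could check whether a collection exists before dropping it. The sketch below assumes `db` is a `Db` handle from the official `mongodb` Node.js driver; `dropIfExists` is a hypothetical helper and not part of UA-JS.

```javascript
// Hypothetical helper, not part of UA-JS: drop a collection only if it exists.
// `db` is assumed to be a Db handle from the official `mongodb` driver.
async function dropIfExists(db, name) {
  const matches = await db.listCollections({ name }).toArray();
  if (matches.length === 0) return false; // nothing to drop on a first run
  await db.collection(name).drop();
  return true;
}

// Intended use inside the test script (assumed shape):
//   await dropIfExists(db, 'uaDB');
//   await dropIfExists(db, 'uaDB_1');
```

With such a guard, the same script could run unchanged on the first and on later runs.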

image.png

This result is consistent with the result from @scipio's post. You can create your own collection, modeled on testDB, and test the algorithm.

This is a two-man effort, so we decided to share the payout of this post equally between @scipio and @stoodkev.

@stoodkev && @scipio



Posted on Utopian.io - Rewarding Open Source Contributors

Thanks for this! This is really awesome.

This seems too under the radar, should get more attention and hype. Do you have any sense of if the witnesses might be on board with this?

I think the witnesses are following development with enthusiasm, just look at my posts and @stoodkev's posts on UA, and check for upvotes and comments.

PS: you told me you wanted to help, but after 1 week I haven't heard from you on Discord. Do you still want to help with development?

Yeah, I've had less free time than I thought; I'm having trouble at home and at work. I'll try to get on, but don't depend on me for now, I suppose.

An algorithmic deterrent/solution to spam is a great idea and I hope it goes forward but @lextenebris raised the point that we need to create a snapshot and ACTUALLY see/test how the algorithm performs. First, can we identify the bots, and secondly, how will it impact users? Mathematical models are by definition cleaner than the real world in which they must perform. (Just ask any physicist ;-)
I would like to see a public tool that would let any member check to see the REAL impact of the algorithm ON THEM. And then maybe a vote!

What are the odds of getting some such tool?

As I've said multiple times before, UA is not intended as a stand-alone tool. It is a metric derived from the already existing follower data, and it could be combined with other metrics to detect spam: accounts with a very small UA && a very high post frequency && very high post rewards could very well be "reward-grabbing self-upvoters". Note, therefore, that UA is not intended only as an anti-spam tool; it can be used for all kinds of things. Regarding post curation, look at it as a "mathematical assistant and extension of human curation": it's not just algorithmic, it's an extension of human activity, as a word processor is to handwriting, a CMS is to publishing website content, and a car is to transportation.
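The combined profile described above could be sketched as a simple predicate. The threshold values and the field names (`ua`, `postsPerDay`, `avgRewardPerPost`) are illustrative assumptions, not values from the UA project.

```javascript
// Illustrative thresholds only; the UA project does not specify any values.
const THRESHOLDS = { maxUa: 0.01, minPostsPerDay: 20, minRewardPerPost: 50 };

// Flags the profile "very small UA && very high post frequency && very high
// post rewards". UA is one signal among several, never the only one.
function looksLikeRewardGrabber(account, thresholds = THRESHOLDS) {
  return (
    account.ua < thresholds.maxUa &&
    account.postsPerDay > thresholds.minPostsPerDay &&
    account.avgRewardPerPost > thresholds.minRewardPerPost
  );
}

// Made-up accounts for demonstration.
const suspect = { name: 'example-bot', ua: 0.001, postsPerDay: 40, avgRewardPerPost: 120 };
const regular = { name: 'example-user', ua: 0.3, postsPerDay: 2, avgRewardPerPost: 120 };
console.log(looksLikeRewardGrabber(suspect), looksLikeRewardGrabber(regular));
```

A low UA alone does not trigger the flag here, which matches the point that UA is meant to be combined with other metrics.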

@yulem, we are still developing code to implement UserAuthority (UA). This is post #2 about that development; we're not done yet. As this article shows, I created a GitHub repo. Snapshot all you want with it. Because of the sheer size of the Steem blockchain data, it's impossible to begin development on the entire dataset: how would we tell whether the results are correct or whether the code contains a calculation bug? Therefore, we're first developing code using my test / example matrix and graph. Once that works completely, we will optimize the algorithm for faster / more efficient performance on the real-life Steem dataset.

The development roadmap looks something like this, from here on:

  • implement writing to and reading from the binary UA-results index (we can later add more data in that index if we like, but first it will only contain account info and UA-results);
  • optimize the calculate.js algorithm: as it stands, it computes an entire UA iteration in RAM, and we need to change that by adding intermediate database writes to flush RAM, plus additional data to detect where we left off / need to restart. That would also allow for data sharding / parallelization, by the way;
  • we then need to chronologically read all blockchain blocks for follower transactions, so we can update "followDB" (which stores the entire Steem follower matrix) per blockchain block;
  • then we will try to implement UA within the upvote calculations on the Utopian-IO bot: we will then not just count human votes per post, but UA-scores per human vote. 50 bot upvotes (all very low UAs) on a post will then count for far less reward than 1 upvote from, for example, @stellabelle or the no.1 witness @jesta (very high UA). This last step is a very important instance of the "public tool" you ask for.
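The last roadmap item, counting UA-scores per vote instead of raw votes, can be sketched as follows. The function name, the account names, and the UA values are made up for illustration; how the Utopian-IO bot would actually use the scores is not specified here.

```javascript
// Hypothetical sketch: weight each upvote by the voter's UA score instead of
// counting every vote as 1. Voters without a known UA contribute nothing.
function weightedVoteScore(voters, uaByAccount) {
  return voters.reduce((sum, voter) => sum + (uaByAccount.get(voter) ?? 0), 0);
}

// Made-up data: 50 low-UA bot accounts versus a single high-UA account.
const uaByAccount = new Map([['trusted-account', 0.9]]);
const botVoters = [];
for (let i = 0; i < 50; i++) {
  const name = `bot-${i}`;
  uaByAccount.set(name, 0.001);
  botVoters.push(name);
}

const botScore = weightedVoteScore(botVoters, uaByAccount);
const trustedScore = weightedVoteScore(['trusted-account'], uaByAccount);
console.log({ botVotes: botVoters.length, botScore, trustedScore });
```

With these made-up values, the 50 bot votes together weigh less than the single high-UA vote, which is the effect described in the roadmap item.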

Nota bene: Another great "public tool" would be a live interactive followerGraph where an account's UA-score is visualized by node size.
Feel free to collaborate with, for example, @lextenebris to develop such a public tool. I would resteem development posts on that with much enthusiasm!!

Thanks for your thoughtful reply. I do get where you're coming from and fully support you in that approach with its associated goals.
Maybe "tool" is a bad word choice.
Steemit is a select microcosm of humanity, and as such we are embarking on a very noble experiment in its governance. Consider each hard fork as an amended constitution (in which UA should probably play a role) that governs us all. Having witnessed other such experiments and seen them fail miserably for a variety of reasons, I'm no doubt overly sensitive to the unanticipated ramifications of even the most seemingly insignificant acts. Hence my tendency to simulate before deployment.
I'm afraid my development days are behind me, but I get your point. I fully entrust the future to your younger and capable hands.
I might be available for beta testing or QA ;-)

Hey @stoodkev I am @utopian-io. I have just upvoted you!

Achievements

  • You have less than 500 followers. Just gave you a gift to help you succeed!
  • Seems like you contribute quite often. AMAZING!

Community-Driven Witness!

I am the first and only Steem Community-Driven Witness. Participate on Discord. Let's GROW TOGETHER!

Up-vote this comment to grow my power and help Open Source contributions like this one. Want to chat? Join me on Discord https://discord.gg/Pc8HG9x

Thank you for the contribution. It has been approved.

You can contact us on Discord.
[utopian-moderator]

Congratulations! This post has been randomly Resteemed! To join the ResteemSupport network and be entered into the lottery please upvote this post and see the following rules.
