
RE: Underbelly - No experience necessary

in #community • 6 years ago

From my perspective, the convergence statistic mechanism is the broken part of the process, not the overall desire to determine what's true about this set of data.

Or to be more specific, bringing in machine learning when it has absolutely no place being there is like invoking the Cheshire smile in order to justify coherent output. It's deliberately taking understood, cogent, meaning-laden information in, then running it through the gauntlet of ML transformations, only to get something out the other side which may or may not have any traceable relationship to the original sources.

If I had been consulted, I would have suggested that they get the entire ML part of things out of the way, remove all the handwaving around trying to cut out one or another species of activity which clearly exists on the blockchain, and simply start by asking: what can we see using the simplest possible elements, the ones we can calculate trivially? And the answer is "the number of accounts who interact with you versus the number of accounts you interact with." This provides a simple ratio which is easy to understand and requires no modifications. Calculate that ratio over a large portion of the user population that you're interested in. See what numbers they tend to get tagged with. Then, and only then, start to look for correlations between those metrics and the behavior of the accounts that they land on, generally starting with the accounts which express the most extreme values to be found in the metrics space.
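As a rough illustration of that ratio, here is a minimal Python sketch. It assumes the interactions have already been pulled from the chain into plain `(source, target)` account-name pairs; the function names, the input shape, and the choice to report an account with no outgoing interactions as infinity are all assumptions made for the example, not anything taken from an existing tool.

```python
from collections import defaultdict


def interaction_ratios(interactions):
    """Return {account: inbound_count / outbound_count}.

    `interactions` is an iterable of (source, target) account-name pairs
    (hypothetical input shape). Counts are of distinct accounts, so
    repeated interactions between the same pair count once.
    """
    inbound = defaultdict(set)   # accounts that interacted with the key
    outbound = defaultdict(set)  # accounts the key interacted with
    for source, target in interactions:
        outbound[source].add(target)
        inbound[target].add(source)

    ratios = {}
    for account in set(inbound) | set(outbound):
        n_in = len(inbound.get(account, ()))
        n_out = len(outbound.get(account, ()))
        # Assumption: an account that never reaches out gets an infinite
        # ratio, which keeps it visible at the extreme end of the ranking.
        ratios[account] = n_in / n_out if n_out else float("inf")
    return ratios


def extremes(ratios, k=3):
    """Return the k lowest and k highest (account, ratio) pairs."""
    ranked = sorted(ratios.items(), key=lambda item: item[1])
    return ranked[:k], ranked[-k:]
```

Calling `extremes(interaction_ratios(pairs), k=10)` gives the two ends of the metric space, which is where the post suggests starting to look at actual account behavior.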

You don't go in knowing what you will find, because then you've already preloaded your expectations set with reasons to justify one behavior over another. First, see what is there. Then, see what things look alike. Then, reason from the perspective that the similarities that you see are real similarities and that at some level these accounts or other processes you're observing are related. Then, only then, can you really start to talk about what and how those relations could come to be, post hoc or propter hoc.

But first you collect the data. Then you look at the data. You consider the data. And then you tell me what the data seems to suggest. Doing any of the latter things before the first thing? Not a good job.

I'm fond of building graphs and other diagrams which show what is actually going on. I particularly like to show what is going on in ways that people don't expect, or from directions people don't expect. But a lot of the activity that's going on is going on just as I have expected, just as I do expect, and just as I am no longer interested in expecting.

Which is part of why I kind of fell out of doing any kind of blockchain analysis for quite a while. That and SteemData going down. So far, I haven't seen a lot of reason I should get back in the saddle on that particular horse.


Amazing conversation to read. Thank you both, @tarazkp and @lextenebris; it was enlightening and informative.

Just on the contribution score and my post: at the moment it's a concept. I do not have the skills to take this further on my own. My hope was that the post would start a discussion, and from there I could see who is interested and who could and would help. I need expert and skilled opinions and help. If you are interested in helping me, let me know, and when I set up the discussion group I will add you.

I do fully agree that trying to tell people what they should like, or what good content is, is not the way to go. And it is by no means what I am trying to do.

"One of the primary goals of Steem’s reward system is to produce the best discussions on the internet"

Taken from Steemit's white paper. A discussion is a two-way thing; there are inputs and outputs. The current rep system does little to identify the people who are creating the best discussions. I am not trying to measure what the best discussion is, but to identify those actively trying to pursue it.

Also, 'engagement' changes vastly between communities and topics; these blanket scores across the entire platform may very well be better suited to Hivemind, SMTs and oracles. But we don't have them yet. What we do have is Steemit in its current form, and time to create something before all this comes into play.

"the number of accounts who interact with you versus the number of accounts you interact with"
I have done this, @lextenebris, but never published it. Only to a limited extent did I "start to look for correlations between those metrics and the behavior of the accounts that they land on".

@tarazkp, in my very rough, unfinished model, your rank is 52, just so you can make a comparison to the scores. Funnily enough, my UA rank is 116, so I sit rather high, but in my contribution score (CS) my rank is 880. If votes were given based on either of these, I would be better off with UA, although I truly believe there are well more than 116 accounts with more 'authority' than me.

I think it is going to be really interesting to see what comes of all of this. I am not a fan of rankings in general, but if they are going to exist, I'd rather they not reward circle jerking, which I think UA does a bit at the moment.
