Improving Steem’s rankings to cater to diverse content preferences

smooth (70) 9 years ago (edited)

Would Steem fail if every blog post of a lady putting on her makeup is rewarded $26,000?

I don't know that it would (FWIW, I do not believe that every such post would be so rewarded, only the first that emerged as a hit or others that offered something new). Look around you. The most commercially successful music is cookie cutter. The most successful TV shows are dumb. Nobody even buys books any more at all (other than maybe romance novels?). One of the top posts at reddit right now (5000 votes) is: "Excluding my mom, what's the worst sex you've ever had?"

What is successful is what addresses the market as it exists, and that includes a great deal of demand for shallow content and being popular for being popular.

Also, implementation factors are very important. Voting is processed by the consensus code which needs to not only be kept relatively simple but also very high performance. This doesn't mean that your ideas can't work but to be a credible proposal you need to completely specify how and when the various processing steps occur. Of course that can be the subject of later posts.

$707.68

anonymint (65) 9 years ago (edited)

Would Steem fail if every blog post of a lady putting on her makeup is rewarded $26,000?

I don't know that it would (FWIW, I do not believe that every such post would be so rewarded, only the first that emerged as a hit or others that offered something new).

I realize that opening statement could be interpreted the way you did. Rather I intended it in the context of the article, meaning that given the current optimum voting strategy of each voter maximizing his/her reward by choosing to vote on the posts with the highest $rewards, then there will always be some post that is awarded far too much while so many others are rewarded far too little. Btw, my proposal isn't intended to entirely interfere with the intended psychological effect of quadratic weighting of voting power which the white paper says is intended to motivate blog posters by causing them to incorrectly assess the odds of their likely average reward. Rather my proposal is merely so that the voting power isn’t incorrectly incentivized to vote as a groupthink monolith. I realize some people may be voting their conscience in spite of it not maximizing their curation rewards, but still my proposal benefits them because their conscience will then actually be more highly reflected in the ranking for the cluster that shares the same like-mindedness.

Look around you. The most commercially successful music is cookie cutter. The most successful TV shows are dumb. Nobody even buys books any more at all (other than maybe romance novels?). One of the top posts at reddit right now (5000 votes) is: "Excluding my mom, what's the worst sex you've ever had?"

My rebuttal to the claim that one-size-fits-all content is what the masses want, is to observe the decline in viewership of non-interactive media (e.g. newspapers and TV) since the Internet. Apparently the masses want to customize their experience with media, while sharing that experience/content with like-minded community (and 5000 people in a like-minded cluster is sufficient socializing and good feeling being a member of a group that shares interests). Meaning I agree and disagree with you, in that yes masses want to share things that many others also like, but they want to prioritize their sharing around their mutual likes, not everything under the sun. And they probably do also want to venture outside their priority interests sometimes to see what is going on outside. Even my 26-year-old Filipina gf tells me she doesn't like seeing violent sharings on her Facebook timeline. She wants cutesy and humorous content, e.g. dogs dancing, etc. She would have absolutely no interest in reading this proposal. But she will partake of an occasional guy who got crushed under a bus (or apparently a pornographic scandal she did not tell me about lol).

Given the millions of users who visit Reddit every day, 5000 up votes seems quite low. This seems to confirm that millions of viewers are being spammed with content they probably aren’t interested in. We might have an opportunity to improve upon Reddit’s ranking algorithm.

What is successful is what addresses the market as it exists, and that includes a great deal of demand for shallow content and being popular for being popular.

A “Trending for Others” ranking choice could still provide the original unclustered rankings when users want to venture outside their voting preferences. These could even be sparsely interleaved by the UI in the clustered rankings to provide some statistical opportunities for users to morph their voting preferences and not get stuck in a localized groupthink. Yeah we probably need both! Good point.

Note none of this reduces the other main benefit of my proposal which is that curator rewards would be confined to clusters, to remove the incentive for voting in one global groupthink.

Also, implementation factors are very important. Voting is processed by the consensus code which needs to not only be kept relatively simple but also very high performance.

The voting is recorded on the blockchain in I presume real-time, but the rewards are only recorded periodically. Note the rankings and rewards can be computed in real-time for the UI independently of the consensus validators. Thus afaics, it is only when the rewards need to be periodically recorded that those computations need to be performed by the consensus validators. Thus it appears the cost can perhaps be amortized over significant periods.

$575.81

12 votes

smooth (70) 9 years ago (edited)

Agree overall on most points. Re. implementation, payouts are transactions too. Currently those perform relatively simple calculations based on shares, and those transactions would need to remain limited in computational cost. I don't think this proposal changes that much though, it just has more complicated (but still computationally simple) accounting of shares by cluster. My concern is primarily the k-means clustering; incremental k-means variants exist, so it is probably solvable.
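To illustrate what an incremental variant might look like, here is a minimal sketch of the classic sequential (online) k-means update, where each new point nudges only its nearest centroid and old points are never revisited. The vote vectors, the starting centroids, and the function names are all hypothetical; nothing here is taken from the Steem codebase or from the proposal itself.

```python
from typing import List


def nearest(centroids: List[List[float]], x: List[float]) -> int:
    """Return the index of the centroid closest to x (squared Euclidean distance)."""
    def dist2(c: List[float]) -> float:
        return sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    return min(range(len(centroids)), key=lambda j: dist2(centroids[j]))


def online_update(centroids: List[List[float]], counts: List[int], x: List[float]) -> int:
    """Fold one new point into the clustering in O(k * d) time."""
    j = nearest(centroids, x)
    counts[j] += 1
    step = 1.0 / counts[j]  # running-mean step size for cluster j
    centroids[j] = [ci + step * (xi - ci) for ci, xi in zip(centroids[j], x)]
    return j


# Hypothetical data: a voter is a vector of votes on three posts
# (+1 up vote, -1 down vote, 0 no vote).
centroids = [[1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]]
counts = [1, 1]
assigned = online_update(centroids, counts, [1.0, 0.0, -1.0])
```

The per-vote cost depends only on the number of clusters and the vector length, not on the number of past votes, which is the property that would matter for keeping payout transactions cheap.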

Also, dividing users into clusters requires a minimum number of users in each cluster, thus a minimum number of users on the site as a whole. At present I doubt that is feasible as the number of users on the site is just barely reaching the point where it works at all. Ideally of course the number of users will be much larger in the future, so something like this could be phased in.

$8.59

5 votes

stijn (50) 9 years ago

Please check out this post I made about a Steemit bug. The devs should see it so they can fix it: https://steemit.com/bug/@stijn/steemit-bug-needs-to-be-fixed

$0.00

dana-edwards (74) 9 years ago

You are right about that but I think if the interface had clearly defined groups like Reddit, which were moderated like Reddit or Facebook, where only people in that group selected as moderators can determine for example to allow a post into that topic, then maybe you can focus the curation per topic.

Right now if I check the basic income topic I can't find anything related to basic income without digging through all sorts of noise.

$0.00

ian.ridgwell (35) 9 years ago

It would help but not resolve the main issue. Currently users are rewarded more for rewarding the content creators with the highest rewards rather than the best content.

$0.00

dana-edwards (74) 9 years ago

The best content is subjective. The only way to determine best is to go by rank and the only rank we have is votes. So the most votes is considered the best by the community. No different from Reddit or Slashdot or any other collaborative filtering.

$0.00

anonymint (65) 9 years ago (edited)

where only people in that group selected as moderators can determine for example to allow a post into that topic ... if I check the basic income topic I can't find anything related to basic income without digging through all sorts of noise

Afaics, my proposal when orthogonally clustered per topic (aka tag or hashtag) would automatically (i.e. algorithmically) accomplish the same by ranking the irrelevant posts at the bottom of the list of posts for the topic, without needing to manually pick and trust moderators.

The best content is subjective. The only way to determine best is to go by rank and the only rank we have is votes. So the most votes is considered the best by the community. No different from Reddit or Slashdot or any other collaborative filtering.

True, but hypothetically my clustering proposal should offer the advantage that rankings are customized for each cluster, i.e. for each group of people who tend to vote with the same preferences, thus automatically customizing rankings to different people’s likes (up votes) and dislikes (down votes).

$0.00

ian.ridgwell (35) 9 years ago

Not all votes are equal. Rich/powerful voters have more important votes.

$0.00

complexring (68) 9 years ago

I would not even try to map the data to the Grassmannian and use the naturally defined metric to determine clustering of the data on various manifolds.

Bad ideas don't work in practice.

Seriously. Avoid this notion at all costs. Grassmannians are tricky beasts.

$23.82

9 votes

anonymint (65) 9 years ago

Is it the intermediate mean for Dₛₜ? You think I should cluster directly on the votes? That worry crossed my mind but I hadn't yet had time to delve into the ramifications. I am not familiar with why they are tricky.

$8.18

5 votes

complexring (68) 9 years ago (edited)

I dunno man. Anytime I start thinking about the space of k-dimensional linear subspaces on an n-dimensional space (i.e. Gr(n,k)), and about points on that space, my mind begins to get warped.

The big question is how can other metrics be used as a way of identifying clustered points on a manifold.

I think we'll see how persistent homology will be coming into play.

$23.79

7 votes

dana-edwards (74) 9 years ago

I think the only thing Steemit needs to do is improve the user interface so we can find posts on topics we look for. Now it's hard to do because the interface doesn't make it simple.

In Reddit you can subscribe to groups. Groups have moderators. You can expect to see only posts of a certain topic within certain groups. Same with Facebook.

Steemit doesn't yet have clear groups. This is why it's hard to find the topics which we look for. I don't think the math or technical aspects need to change but I do think the interface needs to be at least Reddit level.

$0.42

anonymint (65) 9 years ago (edited)

Afaics, the problem cannot be solved only with filtering the display by tag or grouping without changing the underlying computation of the rewards. Without my proposal, the voters have a mathematical incentive to vote on only the highest payout posts, in order to maximize their own curator rewards. Maybe it is not clear to everyone why I think that is the case. The reason is because I presume the voter has no way to predict which voters will vote on each blog post and thus which blog posts will be optimum for them to vote on to maximize their curator rewards constrained to their cluster. So I presume absent an a priori strategy to maximize their curator rewards, they will choose to vote honestly according to their content quality preferences. Note this aspect of the game theory needs to be pondered carefully.

Additionally, I quote what I wrote in reply to @smooth:

I realize some people may be voting their conscience in spite of it not maximizing their curation rewards, but still my proposal benefits them because their conscience will then actually be more highly reflected in the ranking for the cluster that shares the same like-mindedness.

$0.11

alexgr (66) 9 years ago (edited)

As the system is right now, perhaps he doesn't even need to "predict". He can vote, see whether others voted, and then if they did vote and raised money preserve the vote. If they didn't raise money, he can proceed to unvote and vote something else: https://steemit.com/steemit/@alexgr/curation-gamed-through-unvoting

$0.03

2 votes

smooth (70) 9 years ago

This doesn't work. I'll explain why in a response to your post.

$0.03

2 votes

dana-edwards (74) 9 years ago

I am not concerned about the rewards. I think the rewards are working great. Popular content is getting the most money. Sure you can have content which gets a lot of votes but which doesn't get much money because not a lot of voting weight is behind it but that will change over time as more people have Steem Power.

$0.00

anonymint (65) 9 years ago (edited)

I don’t understand why you are not concerned that people are apparently not always voting for the content which they like the best, but instead for the content they think can give them the highest curator rewards? I am doubting that you have understood deeply enough the various concepts and impacts of this proposal.

Without my proposal, the voters have a mathematical incentive to vote on only the highest payout posts, in order to maximize their own curator rewards.

The point is voters are being incentivized by the curator rewards to not vote on the most relevant content, but rather on the content that other whales have voted on. This apparently causes a swarm vortex effect where all the votes get sucked towards whatever got the early momentum, regardless of whether that blog post was much better in quality than all the others.
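A toy illustration of that pull, assuming (as a simplification of the quadratic weighting mentioned earlier in this thread) that a post's weight is the square of the voting power summed on it. The function name and the numbers are made up, and the actual Steem curation formula is not spelled out here, so treat this only as a sketch of the incentive's direction.

```python
def marginal_weight(existing_power: float, my_power: float) -> float:
    """Increase in a post's quadratic weight when my_power joins existing_power."""
    return (existing_power + my_power) ** 2 - existing_power ** 2


# The same one unit of voting power added to a small post and to a big post.
small_post = marginal_weight(existing_power=10.0, my_power=1.0)
big_post = marginal_weight(existing_power=1000.0, my_power=1.0)
```

Under this assumption the increase is 2 * existing_power * my_power + my_power ** 2, so the same vote moves a post that already has momentum far more than it moves an unnoticed one.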

Granted I am not sure that everyone is voting to maximize their curator rewards. But if say two whales who have 3 million Steem Power voted for a blog post, then many others will vote for it, because not only their curator rewards will be higher for doing so than wasting their vote on another blog post, but also because that blog post being at the top of the listing will also cause it to be seen by EVERYBODY more often.

Hypothetically my proposal not only divvies up the curator rewards by like-mindedness, it also divvies up what people see at the top of the listings so not everybody sees the same top ranked blog posts.

In the abstract, my proposal is about adding more degrees-of-freedom to the ranking system in all aspects of it.

$0.00

jacobt (58) 9 years ago

https://steemit.com/steemit/@jacobt/the-circlejerk-needs-to-end-now

I posted my proposal to solve this issue the other day which actually fits in with your algo as well.

$0.08

saknan (54) 9 years ago

This needs to be on the first page. Upvoted.

$0.07

2 votes

kevinpham20 (61) 9 years ago (edited)

EVERYONE UPVOTE THIS MAN!! We need him to have a vested interest in the platform so he'll keep on contributing his brilliance to the community!

$0.07

3 votes

firepower (79) 9 years ago (edited)

This was long overdue! As a non-coder I can only hope that some of what you've suggested gets implemented. This is a great proposal and I hope it will bring many positive changes to this platform. I hope it makes it to the front page and stays there for a while! :)

$0.00

3 votes

kimmar (40) 9 years ago

This is what will make this platform great. People from all walks of life putting their efforts in. Great post!

$0.00

jcweiss (37) 9 years ago (edited)

I agree that the ranking and reward systems are imperfect. However, I do not see the solution as fully considered in its present state.

Primarily, what are the implications for the allocation of rewards across tags? How much should #steemit get versus #beauty?

Might multiple cluster membership or topic modeling be a better fit?

And by automagically, I assume machine learning of some variety?

$0.00

anonymint (65) 9 years ago (edited)

Multiple cluster membership is what I alluded to in the last paragraph of the last section:

Also the like-minded distances and clustering algorithm could be applied orthogonally to each hashtag (a.k.a. Tag or category), so that voters can be clustered differently for different hashtags.

You are correct to point out I was vague in the last section, but afaics the rewards should still be according to voting power; thus the relative allocation between clusters and hashtags could remain an orthogonal attribute of the voting power. In other words, the total reward for the blog post’s author would still be based on the sum of the voting power for it, but the rankings would be clustered. However, the voter’s reward should be constrained to the voting power of the clusters he/she is a member of, so that the Nash equilibrium of voting for the posts with the highest rewards is I think removed. The proposed algorithm incentivizes the voting power to be divvied up more granularly by providing orthogonal rankings for like-minded clusters of voters and this could be computed orthogonally for each hashtag. I will be pondering more the game theory ramifications of this proposal.

I did actually think of this issue, but I just forgot to write it into the last section. My energy level and focus was tailing off apparently as I composed the end of the document. There were many details I thought of and I may not have written them all down yet. I will as I remember.

The automagic is the algorithm as described.
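The split described above (author totals summed over all voting power, rankings computed per cluster) could be sketched roughly as follows. The voters, posts, voting powers, and cluster labels are hypothetical, and the cluster assignment is simply given as input here instead of being computed by the clustering algorithm.

```python
from collections import defaultdict

# Hypothetical votes as (voter, post, voting_power), plus a voter -> cluster map.
votes = [
    ("alice", "makeup", 5.0),
    ("bob", "makeup", 3.0),
    ("carol", "k-means", 4.0),
    ("dave", "k-means", 1.0),
    ("alice", "k-means", 1.0),
]
cluster_of = {"alice": "A", "bob": "A", "carol": "B", "dave": "B"}


def tally(votes, cluster_of):
    """Sum voting power per post, both globally and within each voter cluster."""
    total = defaultdict(float)                             # post -> power from all voters
    per_cluster = defaultdict(lambda: defaultdict(float))  # cluster -> post -> power
    for voter, post, power in votes:
        total[post] += power
        per_cluster[cluster_of[voter]][post] += power
    return total, per_cluster


def ranking(scores):
    """Posts ordered from highest to lowest summed voting power."""
    return sorted(scores, key=lambda post: scores[post], reverse=True)


total, per_cluster = tally(votes, cluster_of)
global_rank = ranking(total)        # basis for the author's total reward
rank_a = ranking(per_cluster["A"])  # what cluster A would see at the top
rank_b = ranking(per_cluster["B"])  # what cluster B would see at the top
```

In this toy data the global ranking puts "makeup" first, while cluster B sees only "k-means". A voter's curator reward would then be computed from the per-cluster sums of his/her own cluster, not from the global totals.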

$0.07