What's a Minnow to do? The Game Theory of Steem, Part 4

in #gametheory, 8 years ago (edited)

And we're back with more riveting analysis of the game theory of Steem! For the past two articles (Part 2, Part 3), we've been using the example of a simple beauty contest to study the incentives for voting on content in Steem. In particular, we've been looking at the incentives for minnows - that is, people who don't have enough Steem Power to affect the outcome of the voting, but nevertheless get paid for their votes. We're almost done with minnows; I just want to tie up a few loose ends first.

The conclusion we ended with in Part 3 was that minnows are incentivized to vote for the content they think will win, not necessarily the content they think is good. At the end of Part 3, I asked for comments on why we shouldn't believe this conclusion, and got some excellent replies. One was by @jholmes91, who mentioned that in the real Steem, the following two things are true:

  1. Not everybody votes at once, and
  2. Steem is way too complicated for everybody to vote rationally.

These are great points, and we'll look at them together in this article.

In game theory, we call a game in which moves are not simultaneous a sequential game, and there's a huge amount of research on how we should model these. One of the basic ideas is that the simple concept of Nash Equilibrium (that we discussed in Part 3) doesn't work very well in sequential games. I'm not going to go into a ton of detail here, but one solution concept that has received a lot of attention over the years (it won Selten his Nobel Prize) is that of the Subgame Perfect Equilibrium (SPE). At its core, an SPE is a sequence of moves by all players that everybody is happy with. What does it mean to be "happy with" a sequence of moves? It means that I take into account the fact that my actions at stage 1 impact your actions at stage 2, and so on and so forth - so to compute an SPE, I have to do 2 things perfectly:

  1. Write out all eventualities (even if there are billions of them), and
  2. Assume all of my opponents are doing the same thing and making perfectly rational choices at every stage.

Does that sound like something we expect people to be doing? Have you ever done that? Me neither. The reality is that it's very difficult, even for a computer, to think that far into the future. In a sense, everybody would need to be infinitely rational to be able to know how to play optimally in a sequential game; to analyze behavior in these settings, we need a way to model the fact that people's rationality is not infinite.
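To get a feel for how quickly "all eventualities" blows up, here is a rough count. It assumes, purely for illustration, that each voter in turn picks one of a fixed set of posts, so the number of possible vote sequences is the number of posts raised to the number of voters. The function name and the numbers are hypothetical.

```python
def count_vote_sequences(num_posts: int, num_voters: int) -> int:
    """Number of distinct vote sequences when each voter picks one post in turn."""
    return num_posts ** num_voters


# Three posts, a growing number of voters.
for voters in (5, 10, 20, 40):
    print(voters, count_vote_sequences(3, voters))
```

Even with only three posts, the count passes a billion somewhere between 10 and 20 voters.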

One notion that tries to get at this is called bounded rationality. Today, in particular, I'm going to look at a very simple abstraction called level-k thinking. Here's how it works:

Level-K thinking

To vote optimally on Steem, each person needs to be able to anticipate the effect their vote has on everybody else's payoffs and vice-versa. To anticipate this, each person needs to compute an infinitely-long, infinitely-branching chain of hypothetical decisions. What if we could model actual human behavior by imagining that people can only look K steps into the future? It would provide a tractable way of analyzing decision-making in sequential games.

Here's the setup: We assume that the population of voters is broken down into groups that we'll call Type-0, Type-1, Type-2, and so on.

  • Type-0 voters are not strategic; they don't take into account the effect their votes have on other voters.
  • Type-1 voters use a very simple strategy: they assume that everybody else is Type-0.
  • Type-2 voters assume that everybody else is Type-1.
  • and so on and so forth.

From what I can tell, experiments that have tried to validate this model have been reasonably successful, and they show that most human decision-makers are Type-0, -1, or -2. So now let's build our Steem voting model under this assumption.

Steem Voting

So: let's assume that there are m Steem posts active at the moment, and at each time step t a new voter looks at the m posts and decides which one to vote on. For simplicity, assume that each voter only has 1 vote; there are n voters, so we'll end up with a total of n votes.

Each voter has idiosyncratic preferences over the posts; say vij is the value that voter i assigns to post j winning. After everybody has voted (after n time-steps), the votes are tallied; write Nj as the number of votes cast for post j. Finally, let's pretend payouts are proportional to the vote share, so the payout to people who voted for j is Nj/n; the number of votes for j divided by the total number of votes. Note that this is not how Steem actually works; the real payout is proportional to the number of votes squared, along with a few other complications - but we have to start somewhere. We're also assuming that nobody can change their vote once it's been cast.
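The proportional payout rule can be sketched in a few lines. The squared variant is included only for comparison; how "votes squared" is normalized here is my own assumption, not a description of the real Steem code, and the function names are made up.

```python
from collections import Counter


def proportional_payouts(votes):
    """Payout per post as its share of all votes cast: N_j / n."""
    tally = Counter(votes)
    n = len(votes)
    return {post: count / n for post, count in tally.items()}


def squared_payouts(votes):
    """Comparison only: weight each post by N_j ** 2, normalized to sum to 1."""
    tally = Counter(votes)
    total = sum(count ** 2 for count in tally.values())
    return {post: count ** 2 / total for post, count in tally.items()}


votes = ["puppies", "puppies", "puppies", "kittens"]
print(proportional_payouts(votes))
print(squared_payouts(votes))
```

Squaring widens the gap between the leader and everybody else, but it does not change which post is in front.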

So now we have all we need to write down player i's payoff function, where player i is voting for k, and j wins.
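One payoff function consistent with the definitions above, offered as an assumption rather than as the exact formula: voter i collects the value of the winning post j, plus the payout share of the post k that he voted for.

```latex
u_i(k, j) = v_{ij} + \frac{N_k}{n}
```

Here N_k is the final tally for post k, including voter i's own vote.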

The best payoff a player can get is when he votes for his favorite and it wins. Note here that we could also include a "vote-your-conscience" term as suggested by @nkyinkyim. I'm not doing this here for simplicity, but I'd encourage a curious reader to work out how that might change things. Here is our question: How do the various level-K players vote optimally at each stage?

Level-0

There are a lot of ways to model level-0 players. We'll take the easiest: we'll model a level-0 player as someone who thinks that he's the last one to vote, and nobody will come after him. This type of player has a very simple best response: if there is a tie for 1st place, vote for your favorite of these. Otherwise, vote for the one with the most votes.

This is a very interesting conclusion! What we're saying is that a Level-0 player will usually vote for whichever is winning, even if he likes a different one a lot. This is because a single vote doesn't have the power to change the outcome, so he might as well vote with the crowd.
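The Level-0 rule is simple enough to write down directly. This is a minimal sketch; the function name, the dictionary layout, and the example numbers are all assumptions made for illustration.

```python
def level0_vote(tally, values):
    """Pick a post for a voter who believes he is the last one to vote.

    tally:  dict mapping post -> votes cast so far
    values: dict mapping post -> this voter's value for that post winning
    """
    top = max(tally.values())
    leaders = [post for post, count in tally.items() if count == top]
    # One leader: vote with the crowd. Several: pick the favorite among them.
    return max(leaders, key=lambda post: values[post])


tally = {"puppies": 4, "kittens": 2, "otters": 4}
values = {"puppies": 0.2, "kittens": 0.9, "otters": 0.5}
print(level0_vote(tally, values))  # otters: the favorite among the tied leaders
```

Note that this voter likes kittens best by a wide margin and still doesn't vote for them.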

Well yeah, but why is it interesting? It's interesting because it's the exact same behavior that we predicted in Part 3 when we were assuming perfectly-strategic, infinitely-rational behavior. A Level-0 player is totally non-strategic, and yet in the sequential game he votes the same as a totally strategic player. This suggests that assuming simultaneous votes and perfect rationality might not have been such a bad assumption after all.

Level-1

Do we run into any more nuances as we increase the rationality? Remember that a Level-1 player thinks that all the other players are Level-0. So a Level-1 player knows that if she can vote to get her own favorite out in front, it will win because all subsequent players will vote for it.

So what's her strategy? Well, to really do this justice, we would need to model her beliefs about future voters' preferences - but let's sweep that under the rug for the moment and assume she's pessimistic, and believes that future voters don't have the same favorite as her. Here's the Level-1 optimal strategy: if there's currently a vote tie, break the tie by voting for your favorite of those that are tied. Otherwise, vote for the one that currently has the most votes.

Was that an echo? That's the exact same strategy as the Level-0 guy! Why? It's simple: it's because her vote doesn't have enough power to really sway things, so she'll vote for the one that's going to get her a monetary payout unless she can put one of her favorites out in front.

Level-2 and so on

We've already done all the work we need to! Since Level-0 and Level-1 have the exact same strategies, all subsequent Level-K will do the same. (This is a sloppy induction argument; maybe someday I should write a post on proper induction.)
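If every level follows the same rule, a whole election is easy to simulate. The sketch below assumes voters arrive one at a time with randomly drawn values (the random draw is purely for illustration) and that all of them use the tie-break-then-follow-the-leader rule described above. Function names and post labels are hypothetical.

```python
import random


def crowd_vote(tally, values):
    """The shared rule: favorite among tied leaders, otherwise the sole leader."""
    top = max(tally.values())
    leaders = [post for post, count in tally.items() if count == top]
    return max(leaders, key=lambda post: values[post])


def simulate(posts, num_voters, seed=0):
    """Run one sequential election where every voter follows crowd_vote."""
    rng = random.Random(seed)
    tally = {post: 0 for post in posts}
    first_choice = None
    for i in range(num_voters):
        # Idiosyncratic values, drawn at random purely for illustration.
        values = {post: rng.random() for post in posts}
        choice = crowd_vote(tally, values)
        if i == 0:
            first_choice = choice
        tally[choice] += 1
    return first_choice, tally


first_choice, tally = simulate(["puppies", "kittens", "otters"], num_voters=50)
print(first_choice, tally)
```

In this sketch the first voter's favorite collects every vote: once a single vote has been cast there is always a strict leader, and everybody after that piles on.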

So What?

The whole point in writing this article is to explore the effect of going from a very simple model (the puppies/kittens example of Part 3) to a considerably more complicated model. I had a hunch going into this that we'd get exactly the same behavior; it turns out I was right.

Always nice when these things come out so clean. Or is it too clean...

Discussion question: Did I "rig the game" and set up my level-k model in some way that guaranteed I'd get the same result? Whenever I read a surprising result, I try to turn on my BS meter and check if there's a fishy assumption somewhere that makes everything work out.

Thanks for reading this far! If you haven't seen them already, check out

Part 1 of the game theory series: Introduction

Part 2 of the game theory series: Beauty Contests

Part 3 of the game theory series: Voting with the crowd

A little about me


@biophil What are your thoughts on this? Can answer here or in my post. I haven't been able to find an answer and value your opinion and I believe many others as well are interested in an answer.

What are the safeguards in place if a Whale sells their $100 upvotes for $50 of bitcoin outside of the system? Detailed example link below.

https://steemit.com/steemit/@sephiroth/i-ve-been-wanting-to-get-an-answer-on-this-what-s-to-prevent-someone-from-selling-their-votes-what-if-a-whale-sells-their-usd100

Fantastic question. See my new proposal.

This is a really, really excellent series. You write very clearly and explain things well. Well done!

Having said that, I don't think it describes my voting strategy. I usually consciously avoid voting up highly popular posts because (1) I have limited voting power and (2) I realize that voting after whales doesn't earn me much curation rewards, whereas voting before them does. So, I'm constantly scouring the "new" page for content that I think is great AND that I think whales will think is great, and then I try to vote it up early before the whales have spotted it.

How does my approach fit with your theory? Or, is that covered in your next installment?

Also, wouldn't the fact that voting awards scale exponentially rather than linearly change your analysis? By assuming linear, I think you did "rig the game" a little.

I'm looking forward to the next installment!

Thanks for the kind words! Your support and promotion is always greatly appreciated. The more of my posts I can push to the front page, the more I'll write.

You're right on all counts, except that assuming linear/proportional rewards actually isn't rigging the game in my simplified model. We'd get the same results using the proper quadratic rewards.

Your suggestion is to add in

  1. Whales
  2. A model of voters' beliefs regarding other voters' preferences
  3. A true-to-Steem model of early-voting incentive.

Adding in any of these things would definitely change the answer, and I suspect that we could very easily argue that your voting strategy is game-theoretically justified and individually optimal. We could model you as a Type-1 player who has probabilistic beliefs about the preferences of whales, and votes for posts specifically in an effort to push them up high enough to get whale attention. Then the Type-0 whale comes along and votes for what he likes. I probably won't do this analysis myself quite yet - I'm itching to get on to analyzing whale behavior.

(2) I realize that voting after whales doesn't earn me much curation rewards, whereas voting before them does.

What makes you think that?

My understanding is that earlier votes are weighted more than later votes so as to incentivize people to scour news posts for good content. No?

I don’t remember reading that in the white paper.

Edit: I just re-read the applicable sections of the white paper and I do not see any mention of your expectation. I note the curation reward algorithm was finalized after the release of the white paper.

The current curation reward system is not detailed in the whitepaper, and the section on the website where it explains the math is nothing like the current implementation in the code.

Thanks for the shout out on my reply on part three. Two comments: One, I may create a post on working out the "vote-your-conscience" impact as I can't stop thinking about it. And two, on your question: Did I "rig the game" and set up my level-k model in some way that guaranteed I'd get the same result?

I started to write a comment questioning Type-1 being overly pessimistic about the preferences of future voters. I was still thinking puppies and kittens. If Type-1 voters believe that preferences over any significant amount of time are 50/50, then a rational Type-1 player is almost always incented to vote their preference. (50% of vi + P is more than 50% of P if we are calculating potential payouts at the time of voting). However, I realize your example here is much more complicated in that there are multiple choices (or posts) to choose from, so the pessimism of the Type-1 voter is warranted.

I think your analysis further supports (if we are to believe we are rational actors to any degree) the idea that there is a value to the voter in voting their preference. Otherwise the first post to gain any advantage would always skyrocket to the top. I'd love to see some empirical analysis on Steem data on whether such a tipping point exists!!!

Definitely, write the conscience post! I don't have a ton of SP, but I'll promote your post where I can.

So you're right - assuming pessimism here is rigging the game a bit. See my reply to @jholmes91 on this post; basically, if the Type-1 voter can put one of her favorites into a tie, and she believes that other voters like her preference also, then her best response may be to vote her favorite since in expectation (given her prior beliefs of others' preferences), it will win.

I assumed pessimism for 2 reasons:

  1. It's easier.
  2. I'm skeptical of the idea that people are Bayesian decision-makers! There seems to be a lot of empirical evidence that they are not; see Kahneman and friends and Prospect Theory.

I have a hunch that your model of vote-your-conscience could act as a proxy for assuming some sort of more-complicated belief structure. If I really want to vote for what I like, that's at least a little bit like believing that others are going to vote for it. But it tells a cleaner story and is easier to analyze, which I think is valuable.

My first mention in a Steemit article! Great article by the way. I will have to mull over the discussion question as I'm not too sure at the moment, but my initial thoughts are to do with the other players' beliefs. There could be a change in the belief system in Steemit, so right now we are all thinking yes let's vote for the most popular posts. But this belief is not observable to everyone, and it could change over time as people learn about the platform (learning by doing)?

Yeah, I think that's a place where we need to be careful. If the level-1 player can put her favorite into a tie and she believes that other voters like her favorite, then she can probabilistically sway opinion towards her favorite if it's only down by 1 vote. Of course, if she's wrong, she will have wasted her vote and played suboptimally.

Your post made me think a lot about Steemit and where it is going. https://steemit.com/steemit/@jholmes91/why-i-believe-in-steemit-s-success

Excellent post, highly recommend reading the full series as game theory is so pertinent to what is making Steemit tick. Steemit may end up being the largest game theory experiment in history...

I hope you keep these coming for awhile. I am learning a lot by reading them. I doubt I am the only one.

Thanks! I'm learning a lot by writing them, as well.

After thinking of it, I think it's unlikely the issue of "just rewards" to be solved by a single-dimensional algorithm.

Meaning, that it will probably require a multi-tiered approach by combining things like

a) author long-term "reputation"
b) author ratings by large or appointed curators who may do this manually
c) post ratings with something like a star system which build (a)
d) voting

...then all these should be weight-combined so that the first page displays not the highest earner (unless asked so by the drop down filter), but the higher combined quality/voting.

Whether segmentation of interest groups plays a role, I don't know. What I do know, is that the algorithm will have to evolve - and it will.

Another idea that sprung to mind, was the "request" of a user to be "evaluated" for quality. After writing his post down and before pressing submit, user pays a fixed amount of STEEM (like 1-2-5 steem) to apply for a "curators review" within the next 30 minutes or so... (money goes to curator or curators - no matter what the curator decides), and the post then gets a "quality multiplier" depending what the curator decided. Then the quality post starts at a different "base" level compared to the normal posts and thus attracts more attention from the get go. In case the curator-(p)reviewer misbehaves and the quality of the post is crap, a flagging/downvoting will remove a multiple of what the curator gained.

The thing about adding complexity is that if done sloppily, it can make it very much easier to game.

I've been thinking that a curation service something like what you describe could arise endogenously as it is. A whale could basically offer to look at peoples' posts for a specified fee, and then upvote the best of them. It would be crude, but effective.

Don’t we have to factor in the display of listings, which are ranked by pre-existing votes?

Wouldn’t level-0 strategy dominate the game theory because higher level players will never see the lower ranked content later in the sequential game?

Related point is will anyone be able to vote their conscience when they can’t find the content they like?

Soon there will be so much content that we otherwise will not be able to find the content we are interested in.

Yeah, I think that's basically what I'm saying - in my model, it doesn't matter what level you are - if there's a tie, you vote for your favorite among the tied posts; if a post is strictly winning, you vote for the one that's winning. In this way, the first voter on the scene gets to choose which post wins.

Sure, those are fair comments. But take a close look at my article - I'm arguing that even if everybody could find what they like, they would still be incentivized to vote with the crowd.

I’ve concluded that the minnows have the incentive to always vote their conscience:

The curator rewards are such a small fraction of the voter’s Steem Power, that only those with significant Steem Power are economically motivated by curator rewards. Thus we could posit that minnow voters have no great mental calculation cost when voting since they can vote their conscience without significant curator reward implications. Although the minnows don’t individually have much impact on the payout rewards, collectively they do since there are many more of them than there are whales. Thus we can conclude the white paper is more or less correct w.r.t. minnows, and w.r.t. whales, the votes are not a replacement for micropayments because the whale’s vote doesn’t involve an insignificant economic value.

Nevertheless the dominant game theory seems to remain that the most upvoted posts are likely to be the most upvoted in a vortex of one-size-fits-all, simply because other posts get buried and not seen. And if whales vote early enough, they drive which posts get seen the most. And whales have a dominant game theory which would be to vote for the most popular posts if they were motivated by the curator rewards, but for the largest of the whales that is unlikely to be the case because:

  1. Their time is too expensive to spend it doing curating all the time, thus they can’t amass significant curator rewards relative to their Steem Power holdings.
  2. They are more motivated by the long-term success of the site.

Hypothetically a potential improvement appears to be the one I blogged.

Another fly-in-the-ointment is that minnows can potentially be a Sybil attack, since there is no way to really confirm that a sign up is a unique user.

Yeah, I think that all makes a lot of sense. Curator rewards are so small for minnows that it's all dust - we can't say it's irrational to vote for what you like because it will mean forfeiting 0.003 SP.

  1. Their time is too expensive to spend it doing curating all the time, thus they can’t amass significant curator rewards relative to their Steem Power holdings.

I realized they could employ a bot.

ooooh you got a minnow vote (me lol). Well written

I am curious to see if the price guesses are a good contrarian indicator as is usually the case with "retail" data. problem is that the community probably has a pretty high density of professionals so it might actually end up being a positive correlation

You're saying the correlation is usually negative? For what value of "usually"? If enough people know the trick, it would get priced in, right?

retail sentiment is usually negatively correlated. basically if everybody "knows" something is going to happen, the opposite does as that was already priced in by the people who figured things out before everybody knows.

http://www.forexfactory.com/showthread.php?t=228144
http://www.businessinsider.com/contrarian-indicators-signal-buy-stocks-2015-9
