What's a Minnow to do? The Game Theory of Steem, Part 4

And we're back with more riveting analysis of the game theory of Steem! For the past two articles (Part 2, Part 3), we've been using the example of a simple beauty contest to study the incentives for voting on content in Steem. In particular, we've been looking at the incentives for minnows - that is, people who don't have enough Steem Power to affect the outcome of the voting, but nevertheless get paid for their votes. We're almost done with minnows; I just want to tie up a few loose ends first.

The conclusion we ended with in Part 3 was that minnows are incentivized to vote for the content they think will win, not necessarily the content they think is good. At the end of Part 3, I asked for comments on why we shouldn't believe this conclusion, and got some excellent replies. One was by @jholmes91, who mentioned that in the real Steem, the following two things are true:

Not everybody votes at once, and
Steem is way too complicated for everybody to vote rationally.

These are great points, and we'll look at them together in this article.

In game theory, we call a game in which moves are not simultaneous a sequential game, and there's a huge amount of research on how we should model these. One of the basic ideas is that the simple concept of Nash Equilibrium (that we discussed in Part 3) doesn't work very well in sequential games. I'm not going to go into a ton of detail here, but one solution concept that has received a lot of attention over the years (it won Selten his Nobel Prize) is that of the Subgame Perfect Equilibrium (SPE). At its core, an SPE is a sequence of moves by all players that everybody is happy with. What does it mean to be "happy with" a sequence of moves? It means that I take into account the fact that my actions at stage 1 impact your actions at stage 2, and so on and so forth - so to compute an SPE, I have to do 2 things perfectly:

Write out all eventualities (even if there are billions of them), and
Assume all of my opponents are doing the same thing and making perfectly rational choices at every stage.

Does that sound like something we expect people to be doing? Have you ever done that? Me neither. The reality is that it's very difficult, even for a computer, to think that far into the future. In a sense, everybody would need to be infinitely rational to be able to know how to play optimally in a sequential game; to analyze behavior in these settings, we need a way to model the fact that people's rationality is not infinite.
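To get a feel for how quickly "all eventualities" blows up, here is a rough back-of-the-envelope sketch. It assumes a stylized game in which each voter in turn picks exactly one of a fixed set of posts; the function name and the numbers are made up for illustration.

```python
def count_vote_sequences(num_posts: int, num_voters: int) -> int:
    """Count complete voting histories when each voter picks one post in turn.

    Each voter has num_posts options, so the game tree has
    num_posts ** num_voters leaves.
    """
    return num_posts ** num_voters


# Illustrative numbers only: 3 posts and a growing crowd of voters.
for voters in (5, 10, 20, 40):
    print(voters, "voters ->", count_vote_sequences(3, voters), "histories")
```

Even with only 3 posts, 40 voters already give more than a billion billion histories, which is why nobody actually writes out the whole tree.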

One notion that tries to get at this is called bounded rationality. Today, in particular, I'm going to look at a very simple abstraction called level-k thinking. Here's how it works:

Level-K thinking

To vote optimally on Steem, each person needs to be able to anticipate the effect their vote has on everybody else's payoffs and vice-versa. To anticipate this, each person needs to compute an infinitely-long, infinitely-branching chain of hypothetical decisions. What if we could model actual human behavior by imagining that people can only look K steps into the future? It would provide a tractable way of analyzing decision-making in sequential games.

Here's the setup: We assume that the population of voters is broken down into groups that we'll call Type-0, Type-1, Type-2, and so on (these are the Level-0, Level-1, and Level-2 players in the sections below).

Type-0 voters are not strategic; they don't take into account the effect their votes have on other voters.
Type-1 voters use a very simple strategy: they assume that everybody else is Type-0.
Type-2 voters assume that everybody else is Type-1.
and so on and so forth.

From what I can tell, experiments that have tried to validate this model have been reasonably successful, and they show that most human decision-makers are Type-0, -1, or -2. So now let's build our Steem voting model under this assumption.

Steem Voting

So: let's assume that there are m Steem posts active at the moment, and at each time step t a new voter looks at the m posts and decides which one to vote on. For simplicity, assume that each voter only has 1 vote; there are n voters, so we'll end up with a total of n votes.

Each voter has idiosyncratic preferences over the posts; say v_ij is the value that voter i assigns to post j winning. After everybody has voted (after n time-steps), the votes are tallied; write N_j for the number of votes cast for post j. Finally, let's pretend payouts are proportional to the vote share, so the payout to people who voted for j is N_j/n: the number of votes for j divided by the total number of votes. Note that this is not how Steem actually works; the real payout is proportional to the number of votes squared, along with a few other complications - but we have to start somewhere. We're also assuming that nobody can change their vote once it's been cast.
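As a quick illustration of the simplified payout rule (vote share only, not the real Steem formula), here is a minimal sketch. The function name and the representation of votes as a list of post indices are assumptions made for this example.

```python
from collections import Counter


def vote_share_payouts(votes, num_posts):
    """Return the per-post payout under the simplified rule payout_j = N_j / n.

    votes: list of post indices (0 .. num_posts - 1), one entry per voter.
    This is the toy model from the text, not Steem's actual payout curve.
    """
    n = len(votes)
    if n == 0:
        return [0.0] * num_posts
    counts = Counter(votes)  # counts[j] is N_j; missing posts count as 0
    return [counts[j] / n for j in range(num_posts)]


# Five voters, three posts: post 0 gets 3 votes, posts 1 and 2 get 1 each.
print(vote_share_payouts([0, 0, 1, 0, 2], 3))  # [0.6, 0.2, 0.2]
```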

So now we have all we need to write down player i's payoff function, where player i is voting for k, and j wins.
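One payoff function consistent with the definitions above, assuming the two pieces (the value of the winner and the vote-share payout) simply add up, would be:

$$u_i(k, j) = v_{ij} + \frac{N_k}{n}$$

Here v_ij is the value voter i places on post j winning, and N_k/n is the payout for having voted for post k. The additive form is an assumption made for illustration; other ways of combining the two terms are possible.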

The best payoff a player can get is when he votes for his favorite and it wins. Note here that we could also include a "vote-your-conscience" term as suggested by @nkyinkyim. I'm not doing this here for simplicity, but I'd encourage a curious reader to work out how that might change things. Here is our question: How do the various level-K players vote optimally at each stage?

Level-0

There are a lot of ways to model level-0 players. We'll take the easiest: we'll model a level-0 player as someone who thinks that he's the last one to vote, and nobody will come after him. This type of player has a very simple best response: if there is a tie for 1st place, vote for your favorite of these. Otherwise, vote for the one with the most votes.
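The Level-0 rule described above can be sketched in a few lines. The function name and the list-based inputs are hypothetical choices for this example; the logic is just "vote for the leader, and break a tie for first place with your own preferences".

```python
def level0_vote(vote_counts, values):
    """Pick a post for a voter who believes nobody votes after them.

    vote_counts[j]: votes cast so far for post j.
    values[j]: how much this voter values post j winning (v_ij in the text).
    """
    top = max(vote_counts)
    # All posts currently tied for first place (just one if there is a clear leader).
    leaders = [j for j, count in enumerate(vote_counts) if count == top]
    # Among the leaders, choose the one this voter likes best.
    return max(leaders, key=lambda j: values[j])


# Clear leader: vote for post 0 even though post 1 is the favorite.
print(level0_vote([3, 1, 0], [0.1, 0.9, 0.5]))  # 0
# Tie between posts 0 and 1: vote for the preferred of the two, post 1.
print(level0_vote([2, 2, 0], [0.1, 0.9, 0.5]))  # 1
```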

This is a very interesting conclusion! What we're saying is that a Level-0 player will usually vote for whichever is winning, even if he likes a different one a lot. This is because a single vote doesn't have the power to change the outcome, so he might as well vote with the crowd.

Well yeah, but why is it interesting? It's interesting because it's the exact same behavior that we predicted in Part 3 when we were assuming perfectly-strategic, infinitely-rational behavior. A Level-0 player is totally non-strategic, and yet in the sequential game he votes the same as a totally strategic player. This suggests that assuming simultaneous votes and perfect rationality might not have been such a bad assumption after all.

Level-1

Do we run into any more nuances as we increase the rationality? Remember that a Level-1 player thinks that all the other players are Level-0. So a Level-1 player knows that if she can vote to get her own favorite out in front, it will win because all subsequent players will vote for it.

So what's her strategy? Well, to really do this justice, we would need to model her beliefs about future voters' preferences - but let's sweep that under the rug for the moment and assume she's pessimistic, and believes that future voters don't have the same favorite as her. Here's the Level-1 optimal strategy: if there's currently a vote tie, break the tie by voting for your favorite of those that are tied. Otherwise, vote for the one that currently has the most votes.

Was that an echo? That's the exact same strategy as the Level-0 guy! Why? It's simple: it's because her vote doesn't have enough power to really sway things, so she'll vote for the one that's going to get her a monetary payout unless she can put one of her favorites out in front.

Level-2 and so on

We've already done all the work we need to! Since Level-0 and Level-1 have the exact same strategies, all subsequent Level-K will do the same. (This is a sloppy induction argument; maybe someday I should write a post on proper induction.)
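To see what happens when every voter follows this same rule, here is a small simulation sketch. It assumes voters arrive one at a time and that each voter's values are independent uniform random draws; the function names, the random preferences, and the seed are illustrative assumptions, not part of the model above.

```python
import random


def herd_vote(vote_counts, values):
    """Shared strategy: vote for the leader; break a first-place tie by preference."""
    top = max(vote_counts)
    leaders = [j for j, count in enumerate(vote_counts) if count == top]
    return max(leaders, key=lambda j: values[j])


def simulate(num_posts, num_voters, seed=0):
    """Run one sequential vote and return (final counts, first voter's pick)."""
    rng = random.Random(seed)
    counts = [0] * num_posts
    first_pick = None
    for t in range(num_voters):
        # Assumed preferences: independent uniform draws for each voter.
        values = [rng.random() for _ in range(num_posts)]
        choice = herd_vote(counts, values)
        if t == 0:
            first_pick = choice
        counts[choice] += 1
    return counts, first_pick


counts, first_pick = simulate(num_posts=4, num_voters=10)
print(counts, "first voter picked post", first_pick)
```

In this sketch the first voter faces an all-zero tie and picks their favorite, and everyone after that piles onto the leader, so all the votes end up on a single post.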

So What?

The whole point in writing this article is to explore the effect of going from a very simple model (the puppies/kittens example of Part 3) to a considerably more complicated model. I had a hunch going into this that we'd get exactly the same behavior; it turns out I was right.

Always nice when these things come out so clean. Or is it too clean...

Discussion question: Did I "rig the game" and set up my level-k model in some way that guaranteed I'd get the same result? Whenever I read a surprising result, I try to turn on my BS meter and check if there's a fishy assumption somewhere that makes everything work out.

Thanks for reading this far! If you haven't seen them already, check out

Part 1 of the game theory series: Introduction

Part 2 of the game theory series: Beauty Contests

Part 3 of the game theory series: Voting with the crowd

A little about me