How to Solve Spam and Democratize Steem: Introducing UserAuthority

I have developed and contributed to quite a few search engine algorithms, so I know from experience how hard it is to invent and implement proper ranking mechanisms, especially in environments that are new and paradigm-shifting, as Utopian and Steem are. Utopian's @stoodkev has already contributed a tremendous amount of work to the Utopian Bot.

Having said that, I hereby propose a number of improvements to the Utopian Voting Bot. I hope these will be implemented not only on Utopian, but across the entire Steem ecosystem. At the bottom of the post, I propose two hard forks (HF) based on UserAuthority.

UserAuthority (UA) seen as a Probability Distribution

Inspired by the inner workings of the Google search engine, I hereby propose the metric UserAuthority, which corresponds to the principal eigenvector of the normalized follower matrix of the Steem ecosystem.

Let:
UA(A) = (1-d) + d[UA(1)/C(1) + ... + UA(n)/C(n)]
where
UA(A) = UserAuthority of user A
UA(x) = UserAuthority of user x, for each of the users 1..n following user A
C(x) = the total number of users that user x is following
d = damping factor (set to 0.85)
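To make the formula concrete, here is a minimal Python sketch of a single evaluation of UA(A). The function name `user_authority`, the account names and all numbers are hypothetical and chosen only for illustration.

```python
DAMPING = 0.85  # d, as set in the formula above


def user_authority(followers, ua, following_count, d=DAMPING):
    """Evaluate UA(A) once, given the accounts that follow A.

    followers       -- names of the accounts following A
    ua              -- current UA estimate for each of those accounts
    following_count -- C(x): how many accounts each of them follows
    """
    return (1 - d) + d * sum(ua[x] / following_count[x] for x in followers)


# Hypothetical numbers: A is followed by "bob" (who follows 2 accounts)
# and "carol" (who follows 4 accounts); both currently have UA = 1.
ua = {"bob": 1.0, "carol": 1.0}
following_count = {"bob": 2, "carol": 4}
ua_a = user_authority(["bob", "carol"], ua, following_count)
print(round(ua_a, 4))  # 0.15 + 0.85 * (1/2 + 1/4) = 0.7875
```

Each follower passes on its own authority divided by the number of accounts it follows, so a follow from an account that follows few others weighs more.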

In layman's terms, this probability distribution can serve as an authoritative measure of perceived quality for every post, upvote and downvote a user makes, based on the entire Steem follower graph. A user's UserAuthority is the probability that a random user follows (likes) that user. The principle is intuitively justified: a user gains authority either when many users follow them or when just a few users with a high UserAuthority do. This mechanism also deals effectively with fake and/or bot accounts, making it nearly impossible to deliberately mislead the ecosystem in order to get higher author rewards. Consequently, if spam accounts or bot accounts mainly follow each other recursively, each will have a low UserAuthority within the entire Steem follower graph.

Nota bene: known spammer / bot accounts can be manually excluded from the cumulative follower graph. This might be needed because a large number of people follow, for example, the heavily debated booster bots in order not to miss their notifications.
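Such a manual exclusion could look like the following sketch. It assumes the follower graph is held as a plain dict that maps each account to the list of accounts it follows; the helper `exclude_accounts` and the account names are hypothetical.

```python
def exclude_accounts(follows, excluded):
    """Return a copy of the follow mapping without the excluded accounts.

    Excluded accounts are dropped both as followers (keys) and as
    followed accounts (list entries).
    """
    return {
        user: [f for f in followed if f not in excluded]
        for user, followed in follows.items()
        if user not in excluded
    }


# Hypothetical graph in which two users also follow a booster bot.
follows = {
    "alice": ["bob", "boosterbot"],
    "bob": ["alice", "boosterbot"],
    "boosterbot": ["alice"],
}
cleaned = exclude_accounts(follows, {"boosterbot"})
print(cleaned)  # {'alice': ['bob'], 'bob': ['alice']}
```

In this sketch C(x) is taken from the length of each list, so after the exclusion "alice" and "bob" each count as following one account rather than two.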

Mathematical Proof by Example

Please consider the following simplified example follower graph. The unidirectional arrows represent a user following another user.

Update 1: I hereby add the link matrix that follows from the example follower graph above:

Update 4: Note that it is enough to save, for every account, which accounts that account follows (read a matrix row left-to-right). Reading a matrix column top-to-bottom then shows which accounts follow that account.
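The two reading directions can be sketched in Python as follows, assuming the rows are stored as a dict that maps each account to the accounts it follows. The helper `followers_of` and the three accounts are hypothetical.

```python
from collections import defaultdict


def followers_of(follows):
    """Derive the columns (who follows X) from the rows (whom X follows)."""
    followers = defaultdict(list)
    for user, followed in follows.items():
        followers[user]  # make sure accounts without followers get an entry
        for target in followed:
            followers[target].append(user)
    return dict(followers)


# Rows of a hypothetical matrix: "a" follows "b" and "c", and so on.
follows = {"a": ["b", "c"], "b": ["c"], "c": []}
followers = followers_of(follows)
print(followers)  # {'a': [], 'b': ['a'], 'c': ['a', 'b']}
```

Only the rows need to be stored; the columns can be rebuilt from them whenever they are needed.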

Please also consider the iteration data: in just a couple of iterations, the UserAuthority mechanism successfully identified spammers / bots via the total follower graph.

Update 2: Instead of Gaussian elimination (row reduction via Gauss-Jordan), I have used an iterative approach to reduce the computational cost of recalculating the UserAuthority binary index daily.

Update 3: Using Gaussian row reduction would be very RAM-heavy in the case of a large follower graph. By solving many simple follower equations per user instead, every iteration better approximates each user's UserAuthority. The cost of one iteration grows linearly with the number of follow relations in the graph, whereas the cost of Gaussian elimination grows cubically with the number of users. The "1"s at iteration 0 are simply an initial approximation in which every account is weighted equally. If you chose random numbers at iteration 0 as an estimate for UserAuthority instead of all "1"s, the iterations would eventually end up at the same equilibrium state (it would only take some more iteration steps to get there).
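The iterative approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the spreadsheet calculation itself: the function `compute_ua`, its stopping tolerance and the three-account graph are hypothetical, and each follow list is assumed to contain no duplicates.

```python
def compute_ua(follows, d=0.85, initial=None, tol=1e-10, max_iter=1000):
    """Approximate UA for every account by repeated substitution.

    follows -- dict mapping each account to the accounts it follows
    initial -- optional starting estimate per account (default: all 1.0)
    """
    users = set(follows)
    for followed in follows.values():
        users.update(followed)
    ua = {u: 1.0 for u in users} if initial is None else dict(initial)

    for _ in range(max_iter):
        new = {u: 1 - d for u in users}
        # One pass over every follow relation: cost linear in their number.
        for user, followed in follows.items():
            if not followed:
                continue
            share = d * ua[user] / len(followed)
            for target in followed:
                new[target] += share
        delta = max(abs(new[u] - ua[u]) for u in users)
        ua = new
        if delta < tol:
            break
    return ua


# Hypothetical graph: "a" follows "b" and "c", "b" follows "c", "c" follows "a".
follows = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
from_ones = compute_ua(follows)
from_other = compute_ua(follows, initial={"a": 5.0, "b": 0.1, "c": 2.0})
```

Both runs settle on the same values up to the tolerance, which illustrates that the starting estimate only affects the number of iterations, not the equilibrium.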

Screen Shot 2017-11-17 at 18.03.33.png

Technical implementation

The Steem blockchain stores all data in chronological order. To use UserAuthority for algorithmic curation purposes, however, a pre-calculated reversed binary index holding all results shown in the Excel example is needed. The required data can be retrieved as BSON via SteemData's MongoDB interface.

Update 5: Only a subset of the data in SteemData's MongoDB interface is needed to construct the follower matrix:
{uid: [follows], total_follows: total_follows}
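As an illustration, records of roughly this shape could be turned into the rows of the normalized follower matrix as sketched below. The field names `uid`, `follows` and `total_follows` are an assumed reading of the structure above, not the actual SteemData schema, and the sample records are made up. No database call is shown.

```python
def build_link_rows(records):
    """Build sparse rows of the normalized follower matrix.

    Each row maps a followed account to 1 / total_follows, so that the
    entries of a non-empty row sum to 1.
    """
    rows = {}
    for rec in records:
        total = rec["total_follows"]
        if total:
            rows[rec["uid"]] = {target: 1.0 / total for target in rec["follows"]}
        else:
            rows[rec["uid"]] = {}  # account follows nobody
    return rows


# Made-up sample records in the assumed shape.
records = [
    {"uid": "alice", "follows": ["bob", "carol"], "total_follows": 2},
    {"uid": "bob", "follows": [], "total_follows": 0},
]
rows = build_link_rows(records)
print(rows["alice"])  # {'bob': 0.5, 'carol': 0.5}
```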

Further implementations

It is possible to expand on the UserAuthority mechanism by adding weights for manually assigned trusted "witnesses" in the form of content moderators. In a Utopian-IO context, a contribution post can only be voted upon after a topical moderator has manually assessed it.

Update 6:
UserAuthority (UA) not only allows for algorithmic curation mechanisms, but it can also democratize the entire Steem ecosystem if adopted widely.

For example:

Proposal for HF21: a minimum amount of UA is needed to downvote
downvotable(user) bool = (UA(user) >= UA(threshold)) ? true : false
=> that prevents "flag wars"

Proposal for HF22: implement UA to curate monetary rewards (author / curation):
upvote_reward = UA * SP
=> that effectively combats SPAM-rewards / self-upvoting / delegating SP to multiple self-owned bots for self-upvoting.
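Both proposed rules are simple enough to sketch in Python. The threshold value and all account figures below are hypothetical; the proposals above do not fix any numbers.

```python
UA_THRESHOLD = 1.0  # hypothetical value, not part of the proposal


def can_downvote(user_ua, threshold=UA_THRESHOLD):
    """HF21 sketch: only accounts with enough UA may downvote."""
    return user_ua >= threshold


def upvote_reward(user_ua, steem_power):
    """HF22 sketch: scale the influence of Steem Power (SP) by UA."""
    return user_ua * steem_power


# Two hypothetical accounts holding the same amount of SP.
trusted = upvote_reward(2.0, 1000)   # widely followed account
bot = upvote_reward(0.15, 1000)      # self-owned bot with delegated SP
print(can_downvote(2.0), can_downvote(0.15))  # True False
```

With equal SP, the reward weight of the hypothetical bot is a small fraction of that of the widely followed account, which is the effect HF22 aims for.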

Nota bene: the only action needed from a user who disagrees with the actions of some other user is to simply unfollow that user.

Posted on Utopian.io - Rewarding Open Source Contributors

I've published a follow-up article:

UserAuthority (UA): explanations, applications and implications