Although several reputation systems for Steem have been proposed (some of them very recently), I believe that none of them is suitable as a reputation metric for our upcoming SteemSTEM app (to be officially released very soon).
[Credits: @hightouch]
Therefore, I decided to build one myself. Here I share the code and its ingredients, and would love to read comments or suggestions for potential improvements. The code is open (available from this GitHub repository) and can be used by anyone freely.
What will SteemSTEM do with such reputation indicators for now? Well, I don’t know (yet!), but it was fun to develop ;)
A good reputation for the SteemSTEM community must in my opinion include two ingredients, an authorship component and an engagement component, contributing in equal parts. Indeed, we absolutely need both authors to provide amazing contributions, and readers questioning authors and entertaining worthy discussions. I suspect that holds true for any community by the way.
AUTHORSHIP INDICATOR
The authorship metric is built from a few key principles.
Each SteemSTEM vote on a post at x% gives x reputation points at the time of the vote. The SteemSTEM curation team scours Steem to find the best STEM content. As all these blogs contributed to what SteemSTEM has become today, it makes sense they all enter any given reputation indicator.
It would be weird to assign a large reputation score to someone who contributed a lot two years ago but then left Steem. However, the reputation of that person should still be non-negligible. After all, this person has left a trace in our memories. For this reason, I decided to introduce reputation points whose value varies with time. After a given time (the authorship point half-life), one point loses half its value. After twice that time, the remaining 0.5 point is worth only 1/4 point, and so on: a simple exponential decay.
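The decay rule can be sketched in a few lines of Python. This is only an illustration of the principle: the function name and the 100-day half-life used in the example are assumptions chosen for readability, not taken from the repository.

```python
def decayed_value(points, age_seconds, half_life_seconds):
    """Value today of `points` earned `age_seconds` ago.

    The value is halved every time one half-life elapses.
    """
    return points * 0.5 ** (age_seconds / half_life_seconds)


# Hypothetical half-life of 100 days, for illustration only.
half_life = 100 * 24 * 3600.0

# A 100% vote gives 100 points at the time of the vote.
print(decayed_value(100.0, 0.0, half_life))            # nothing lost yet
print(decayed_value(100.0, half_life, half_life))      # half the value remains
print(decayed_value(100.0, 2 * half_life, half_life))  # a quarter remains
```

Writing the decay as a power of 0.5 keeps the half-life as the only parameter, which matches the way it is described above.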
As SteemSTEM strives to push for quality as much as possible, we prefer someone writing one excellent post a week (thus supported very strongly) over someone writing 5 good posts a week (thus supported moderately five times). For this reason, the reputation score today (i.e. accounting for the fact that each gained point has lost value with time) is divided by the square root of the number of posts. The square root tames the effect when a very large number of posts is reached.
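To make the square-root division concrete, here is a small sketch comparing two made-up authors. The function name and the vote percentages are assumptions for illustration, and the points are taken as already decayed.

```python
import math


def authorship_raw_score(decayed_points):
    """Sum of the (already decayed) points of an author, divided by the
    square root of the number of supported posts."""
    if not decayed_points:
        return 0.0
    return sum(decayed_points) / math.sqrt(len(decayed_points))


# Hypothetical author with one post voted at 100%.
one_excellent_post = [100.0]
# Hypothetical author with five posts voted at 40% each.
five_good_posts = [40.0] * 5

print(authorship_raw_score(one_excellent_post))  # 100 / sqrt(1)
print(authorship_raw_score(five_good_posts))     # 200 / sqrt(5), about 89.4
```

In this example the second author collected twice as many points in total, yet ends up slightly below the first one once the number of posts is accounted for.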
Finally, we may want to remove individuals from the algorithm, like the team, blacklisted people, bots, etc. Moreover, the total amount of reputation points is fixed to a given value so that each score is renormalized at the end of the day.
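The exclusion and renormalization step could look like the sketch below. The function name, the account names and the raw scores are all hypothetical; only the idea (drop excluded accounts, then rescale so that the scores add up to a fixed total) comes from the description above.

```python
def normalize_scores(raw_scores, excluded, total_points):
    """Drop excluded accounts, then rescale the remaining scores so that
    they sum to `total_points`."""
    kept = {name: score for name, score in raw_scores.items()
            if name not in excluded}
    norm = sum(kept.values())
    if norm == 0:
        # Nothing to distribute: everybody gets zero.
        return {name: 0.0 for name in kept}
    return {name: total_points * score / norm for name, score in kept.items()}


# Hypothetical raw scores and exclusion list.
raw = {"alice": 3.0, "bob": 1.0, "some-bot": 10.0}
scores = normalize_scores(raw, excluded={"some-bot"}, total_points=1000)
print(scores)  # alice gets 3/4 of the total, bob 1/4
```

Excluding accounts before normalizing means that the removed accounts do not take any share of the fixed total.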
RESULTS FROM THE STEEMSTEM AUTHORSHIP INDICATOR
I adopted an authorship half-life of 3.5 months and excluded all team members (management and curators), bots and blacklisted authors from the run. The total number of available authorship reputation points is normalized to 1000.
The top 30 most reputed SteemSTEM authors of all time, out of 2662 authors, are (with their score):
1 abigail-dantes 6.305
2 chloroform 6.244
3 egotheist 5.918
4 scienceblocks 5.884
5 steemit-italia 5.691
6 lordneroo 5.632
7 zen-art 5.216
8 nonzerosum 5.089
9 highonthehog 4.931
10 conficker 4.902
11 nikolanikola 4.893
12 effofex 4.808
13 anaestrada12 4.804
14 tomastonyperez 4.555
15 deathbatter 4.466
16 samminator 4.456
17 hidden84 4.385
18 agmoore 4.361
19 romulexx 4.357
20 jfermin70 4.266
21 answerswithjoe 4.159
22 dysfunctional 4.093
23 elvigia 4.047
24 n4zrizulkafli 4.021
25 alexander.alexis 4.017
26 dedicatedguy 3.993
27 lupafilotaxia 3.958
28 anasav 3.909
29 scienceangel 3.856
30 irelandscape 3.809
The code has been run on Sep 24th at 10:11:25 AM.
ENGAGEMENT INDICATOR
Here, I track every single comment to any SteemSTEM-supported post and give reputation points to the comment author.
First, if the comment length is smaller than N characters, it is considered spammy and no point is given. Moreover, if the comment has been posted more than W weeks after the SteemSTEM vote, no point is given either. I want meaningful comments that help illustrate that supported posts are interesting during the time in which they are hot or trending (on the #steemstem tag).
If non-zero, the score is given by the square root of the comment length. Once again, the square root makes a large difference between smallish and average comments, but tames the difference once a given length is crossed. This is the only way I have found so far to derive the score, and I am only partially satisfied with it. But at least it provides some level of quantification of the engagement of the readers.
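The two cuts and the square-root score can be put together as follows. This is a sketch under stated assumptions: the function and parameter names are illustrative, the comment length is taken to be the number of characters of the comment body, and the values of N and W used in the example are placeholders.

```python
import math


def comment_points(body, delay_seconds, min_length, max_delay_seconds):
    """Points earned by one comment at the time it is posted.

    body              -- text of the comment
    delay_seconds     -- time elapsed between the SteemSTEM vote and the comment
    min_length        -- the N parameter (shorter comments are treated as spam)
    max_delay_seconds -- the W parameter, expressed in seconds
    """
    if len(body) < min_length:
        return 0.0
    if delay_seconds > max_delay_seconds:
        return 0.0
    return math.sqrt(len(body))


# Placeholder values for N and W, for illustration only.
N = 100
W = 14 * 24 * 3600.0

one_day = 24 * 3600.0
print(comment_points("x" * 50, one_day, N, W))        # too short: 0 points
print(comment_points("x" * 400, one_day, N, W))       # sqrt(400) points
print(comment_points("x" * 400, 30 * one_day, N, W))  # too late: 0 points
```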
As with the authorship indicator, any earned engagement point loses value with time, the score today is divided by the square root of the number of comments and some individuals can be removed from the algorithm.
The final score is normalized as in the authorship case, the total number of available points being fixed to a given value (taken to be the same as for authorship, since engagement and authorship are considered equally important).
RESULTS FROM THE STEEMSTEM ENGAGEMENT INDICATOR
I adopted an engagement half-life of 1.75 months, and excluded comments whose length is smaller than 100 characters (N=100). I fixed W to 2 weeks. I excluded all team members (management and curators), bots and blacklisted authors from the run. The total number of available engagement points is 1000.
The top 30 most engaging SteemSTEM comment authors of all time, out of 23134 comment authors, are (with their score):
1 erh.germany 2.886
2 agmoore 2.764
3 steemit-italia 2.623
4 amestyj 2.621
5 abigail-dantes 2.357
6 scienceblocks 2.175
7 fran.frey 2.149
8 insight-out 2.114
9 rudyardcatling 2.096
10 lupafilotaxia 2.079
11 dedicatedguy 2.031
12 samminator 1.979
13 tsoldovieri 1.950
14 alexander.alexis 1.921
15 cyprianj 1.853
16 herbayomi 1.847
17 tomastonyperez 1.833
18 jamalgayoni 1.756
19 steepup 1.726
20 alexdory 1.682
21 kimberlylane 1.678
22 synick 1.665
23 olamseu 1.656
24 emperorhassy 1.628
25 lucylin 1.625
26 osariemen 1.611
27 ied 1.576
28 egotheist 1.575
29 delpilar 1.526
30 chireerocks 1.487
The code has been run on Sep 24th at 10:11:25 AM.
FINAL REPUTATION INDICATOR
The final reputation is given by the average of the two metrics above. The top 25 (with their scores) are:
1 abigail-dantes 4.331
2 steemit-italia 4.157
3 scienceblocks 4.029
4 egotheist 3.746
5 chloroform 3.579
6 agmoore 3.563
7 lordneroo 3.489
8 nonzerosum 3.247
9 erh.germany 3.245
10 samminator 3.217
11 tomastonyperez 3.194
12 conficker 3.151
13 effofex 3.115
14 lupafilotaxia 3.019
15 dedicatedguy 3.012
16 alexander.alexis 2.969
17 anaestrada12 2.959
18 nikolanikola 2.911
19 cyprianj 2.795
20 tsoldovieri 2.734
21 zen-art 2.723
22 jfermin70 2.605
23 alexdory 2.598
24 highonthehog 2.576
25 amestyj 2.572
The code has been run on Sep 24th at 10:11:25 AM.
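As a quick illustration of the averaging, the sketch below combines two score tables. The function name is illustrative, and treating an account that is absent from one table as having a score of 0 there is an assumption of this sketch. The abigail-dantes values are those from the two tables above; the other account is made up.

```python
def final_reputation(authorship, engagement):
    """Average the authorship and engagement scores of every account.

    Assumption: an account missing from one of the two tables counts as 0
    in that table.
    """
    accounts = set(authorship) | set(engagement)
    return {name: 0.5 * (authorship.get(name, 0.0) + engagement.get(name, 0.0))
            for name in accounts}


authorship = {"abigail-dantes": 6.305}
engagement = {"abigail-dantes": 2.357, "reader-only": 2.0}  # second entry is hypothetical

final = final_reputation(authorship, engagement)
print(round(final["abigail-dantes"], 3))  # (6.305 + 2.357) / 2 = 4.331
print(final["reader-only"])               # only engagement contributes
```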
MORE ABOUT THE CODE
The code can be obtained from the following GitHub repository. It is programmed in Python 3 and requires steem-python.
I am not happy with the way the engagement indicator is computed, because I need to get the information on each post separately, which takes an enormous amount of time. For this reason, the information is saved into a file once the SteemSTEM upvote on a post is older than two weeks (as any later comment would just bring 0 points). This requires removing the ‘null’ author from the algorithm, as it is used to trace posts without a single comment.
To run it, it is sufficient to complete the setup part of the code,
## Setup
half_life_vote = 3.5*30*24*3600. # 3.5 months - authorship point half-life
half_life_comment = 1.75*30*24*3600. # 1.75 months - engagement point half-life
comment_timelimit = 14*24*3600. # 2 weeks - the W number
comment_spam_limit= 100 # minimum number of characters for a comment to be valid (N)
comment_filename = 'comments_data.txt' # where to save the treated comments
load_backup = True # Using the file with the saved comments
normalized_rep = 1000 # Score normalization
## Exclusions
team = [ 'null' ]
bots = [ ]
blacklist = [ ]
and execute the program.