Repository
https://github.com/pibara-utopian/asyncsteem
What Will I Learn?
- You will learn how to write a simple statistics script in Python that processes a set period (a day or a week) worth of historic transactions from the STEEM blockchain.
Requirements
- Experience with Python
- A base understanding of the STEEM blockchain.
Difficulty
- Intermediate
Tutorial Contents
Asyncsteem is a limited, non-official STEEM Python API implementation designed specifically to work in conjunction with the Twisted asynchronous Python framework. While the prime use case for asyncsteem is a networking or bridging setup, in this tutorial we look at using asyncsteem to create a statistics script. We will walk through the steps needed to use the higher-level API of asyncsteem so that the library processes historic data rather than just a live stream, and we will look at what is needed to keep the script from running forever while it waits for new blockchain events.
Step 1: imports
#!/usr/bin/python
import io
import time
from twisted.internet import reactor
from twisted.logger import Logger, textFileLogObserver
from asyncsteem import ActiveBlockChain
Next to the ActiveBlockChain that we import from the asyncsteem library, we import a few essential Twisted components: most importantly the reactor, but also some asynchronous logging facilities.
Step 2: Bot boilerplate.
We will be handing control to the asyncsteem library later, but before we do, the high-level asyncsteem API requires us to define a bot object first.
#Simple testing bot class for demonstration purposes
class TestBot:
    #Constructor taking a reference to the asyncsteem
    #independent business logic
    def __init__(self,businesslogic):
        self.businesslogic = businesslogic
    #Handler for 'comment' operations from streaming
    #the block-chain
    def comment(self,tm,comment_event,client):
        pass
    #This method is called when blocks streaming traverses
    #to a new day according to UTC timestamps.
    def day(self,tm,event,client):
        pass

#BusinessLogic is the script's own, asyncsteem independent
#class; its implementation is not shown in this tutorial.
businesslogic = BusinessLogic()
bot = TestBot(businesslogic)
We will revisit the TestBot later to add functionality. For now, we walk through the base functionality of the bot. The bot part of the script acts as if it were following a live stream of blockchain transactions. The core loop of the asyncsteem ActiveBlockChain streams to a bot the operations that the bot is registered for. Our bot above registers for just one type of operation, the comment operation. If you want to gather stats on other types of operations, just add the appropriate methods to your own bot class. You could, for example, define a transfer method for keeping track of fund transfers, or a vote method for keeping track of votes. Let's have a look at the arguments of these methods:
- tm : The (blockchain) timestamp for this operation.
- comment_event: An operation-dependent object with operation-specific attributes.
- client : A low-level JSON-RPC client that can be used for doing additional asynchronous STEEM API calls.
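One way to picture this registration is a dispatcher that looks up a handler on the bot by operation name and skips operations the bot has no method for. This is a sketch only, not asyncsteem's actual implementation. The `dispatch` helper and the `CountingBot` class are hypothetical names used for illustration, and the sample events are made up.

```python
class CountingBot:
    """Hypothetical bot that only handles 'comment' operations."""
    def __init__(self):
        self.comments = 0

    def comment(self, tm, comment_event, client):
        self.comments += 1


def dispatch(bot, op_name, tm, event, client=None):
    """Call bot.<op_name>(tm, event, client) if the bot defines it.

    Returns True when a handler was found and called, False otherwise.
    """
    handler = getattr(bot, op_name, None)
    if not callable(handler):
        return False
    handler(tm, event, client)
    return True


demo_bot = CountingBot()
# The bot defines comment(), so this operation is delivered.
handled_comment = dispatch(demo_bot, "comment", "2018-06-01T00:00:03",
                           {"author": "alice"})
# The bot has no vote() method, so this operation is skipped.
handled_vote = dispatch(demo_bot, "vote", "2018-06-01T00:00:06",
                        {"voter": "bob"})
```

Adding a `vote` or `transfer` method to the bot class would be enough for such a dispatcher to start delivering those operations as well.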
Step 3: Streaming a historic day to the bot.
#Initiate and instantiate logger to log to a logfile
#asynchronously
obs = textFileLogObserver(io.open("demobot.log", "a"))
log = Logger(observer=obs,namespace="asyncsteem")
#Determine how many days to rewind the blockchain
#streaming so we process the last available day of
#'closed-for-voting' blog posts.
days = 8
if time.gmtime().tm_hour > 11:
    days = 7
#Instantiate our asyncsteem ActiveBlockChain
bc = ActiveBlockChain(reactor,
                      rewind_days=days,
                      day_limit=1,
                      log=log,
                      nodelist="stage",
                      stop_when_empty=True)
#Register our demo bot with the ActiveBlockChain
#to start receiving events.
bc.register_bot(bot,"testbot")
#Start the Twisted reactor
reactor.run()
#After the whole day is processed, the reactor is
#stopped and we can finalize our data for further
#processing.
businesslogic.finalize()
The first thing we need now is a logger object. Twisted has several logging options that fall outside of the scope of this tutorial. In this tutorial, we use a simple file-based logger.
Next, we want to rewind our blockchain streaming. For the purpose of our script, we want to process the votes on all posts for which the voting window of six and a half days has just recently closed. As we are about to stream a single day to our bot, we look at the current hour in GMT and choose whether to rewind seven or eight days: from noon onwards we rewind seven days, before noon eight.
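The choice between seven and eight days can be isolated into a small function, which makes the boundary at noon easy to check. The reasoning, assuming that rewinding starts at a UTC day boundary: the last posts of the day that began seven days ago were made just before midnight six days ago, and six and a half days later is noon today. `rewind_days_for` is a hypothetical helper name; the logic mirrors the script above.

```python
import time


def rewind_days_for(utc_hour):
    """Return 7 from noon UTC onwards, otherwise 8 (same rule as the script)."""
    return 7 if utc_hour > 11 else 8


# Value the script would use if it were started right now.
days_now = rewind_days_for(time.gmtime().tm_hour)
```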
Now it's time to instantiate our ActiveBlockChain object. Let's zoom in on the arguments we use:
- reactor : A twisted reactor
- rewind_days : The number of days to rewind the blockchain operation streaming.
- day_limit: Stop streaming blocks after this many days.
- log: The twisted logger
- nodelist: The name of the STEEM nodelist to use.
- stop_when_empty : When true, stop the Twisted reactor as soon as no pending RPC calls are left.
In our instance, we set the ActiveBlockChain to stop after one day of streaming.
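The idea behind stop_when_empty can be pictured as a counter of pending calls that fires a callback once it drops to zero. The sketch below is an illustration of that idea only and makes no claim about how asyncsteem implements it; `PendingCalls` and its methods are hypothetical names. In a Twisted program the callback would typically be `reactor.stop`.

```python
class PendingCalls:
    """Hypothetical tracker: calls on_empty once no calls remain pending."""
    def __init__(self, on_empty):
        self.pending = 0
        self.on_empty = on_empty

    def started(self):
        # A new RPC call has been queued.
        self.pending += 1

    def finished(self):
        # An RPC call completed (with a result or an error).
        self.pending -= 1
        if self.pending == 0:
            self.on_empty()


stopped = []
tracker = PendingCalls(on_empty=lambda: stopped.append(True))
tracker.started()
tracker.started()
tracker.finished()
still_running = (stopped == [])   # one call is still pending
tracker.finished()                # last call done, on_empty fires
```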
Finally, we register our bot with the ActiveBlockChain and start the Twisted reactor, which in our case starts streaming one day's worth of blockchain events to the bot's comment method.
Step 4: Using the RPC client from our bot.
class TestBot:
    def __init__(self,businesslogic):
        self.businesslogic = businesslogic
    def comment(self,tm,comment_event,client):
        #Empty error handler; We expect errors because
        # the original post might have been deleted.
        def process_votes_error(errno, msg, rpcclient):
            pass
        #Handler for return data from get_active_votes
        #JSON-RPC call
        def process_votes(vote_events,c2):
            if (len(vote_events) > 0):
                #If there are votes, we want to first sort them
                #by time of them having been cast.
                sortedvotes = sorted(vote_events, key=lambda kv: kv["time"])
                for vote in sortedvotes:
                    #Process a single vote in our asyncsteem
                    #independent business logic
                    self.businesslogic.process_vote(comment_event["author"],vote)
        #Create a new JSON-RPC get_active_votes call
        #using our blog post designation, don't queue
        #it yet.
        opp = client.get_active_votes(comment_event["author"],
                                      comment_event["permlink"])
        #Set error and result handlers for our JSON-RPC call
        opp.on_result(process_votes)
        opp.on_error(process_votes_error)
        #Operations will get queued at method exit.
    def day(self,tm,event,client):
        pass
We define a new operation for the RPC client and then bind two functions to that operation for handling errors and results. In our case, we request the active votes for the blog post or comment we are currently processing. This operation is added to the queue that also holds the requests for blocks, and is processed asynchronously when network capacity becomes available. When the result arrives, we sort the votes by the time they were cast and invoke some business logic.
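The sorting step can be tried in isolation. The votes below are made-up sample data; only the `time` key is taken from the tutorial code, while `voter` and `rshares` are assumed field names used for illustration. The sketch relies on timestamps being ISO 8601 strings of equal length, which sort chronologically as plain strings.

```python
sample_votes = [
    {"voter": "carol", "rshares": -500, "time": "2018-06-01T12:30:00"},
    {"voter": "alice", "rshares": 1200, "time": "2018-06-01T08:15:03"},
    {"voter": "bob", "rshares": 300, "time": "2018-06-01T08:15:00"},
]

# Same sort as in process_votes: order by the time the vote was cast.
sorted_votes = sorted(sample_votes, key=lambda vote: vote["time"])
voters_in_order = [vote["voter"] for vote in sorted_votes]
```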
Step 5: Day event
class TestBot:
    #.... (__init__ and comment as in step 4)
    def day(self,tm,event,client):
        #Request a list of all relevant accounts that
        #voted in a certain way in our processed day.
        accounts= self.businesslogic.accounts()
        for account in accounts:
            #Define a handler for return data from a
            #get_accounts JSON-RPC call.
            def process_accounts(accounts_event,c2):
                self.businesslogic.process_account(accounts_event)
            #Create a new JSON-RPC get_accounts call
            #for this account, don't queue it yet.
            opp = client.get_accounts([account])
            #Set a result handler for the get_accounts JSON-RPC call.
            opp.on_result(process_accounts)
Other than blockchain events, the ActiveBlockChain also has hooks for time markers between events. In our script we define the day event. As our ActiveBlockChain is set to stream only one day, this event is triggered at the end of our comment stream, giving us a last chance to do other operations. As we have been processing vote events on one day of posts, and as votes are linked to accounts, we use it to gather some account-level data from the STEEM JSON-RPC service. What we do here is create an operation that fetches this information and set another callback for the results.
Once the last operation initiated like this has finished, the reactor stops and control returns to the end of our script, where businesslogic.finalize() is called.
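The bot calls four methods on the business logic object: process_vote, accounts, process_account and finalize. The class below is a hypothetical stand-in that satisfies that interface, not the code used for the original script. It assumes that a vote is a dict with `voter` and `rshares` keys, that a negative `rshares` value denotes a flag, and that an account result is a list of dicts with a `name` key.

```python
import json


class BusinessLogic:
    """Hypothetical stand-in for the omitted business logic."""
    def __init__(self):
        self.by_pair = {}
        self.meta = {}

    def process_vote(self, author, vote):
        # Assumption: negative rshares means the vote is a flag.
        rshares = int(vote["rshares"])
        if rshares < 0:
            key = vote["voter"] + "->" + author
            self.by_pair[key] = self.by_pair.get(key, 0) - rshares

    def accounts(self):
        # Every account that appears on either side of a flag.
        names = set()
        for key in self.by_pair:
            names.update(key.split("->"))
        return sorted(names)

    def process_account(self, accounts_event):
        # Assumption: the result is a list of dicts with a 'name' key.
        for account in accounts_event:
            self.meta[account["name"]] = account

    def finalize(self):
        # Return the collected data as JSON text.
        report = {"flags": {"by_pair": self.by_pair}, "meta": self.meta}
        return json.dumps(report, indent=2, sort_keys=True)


logic = BusinessLogic()
logic.process_vote("bob", {"voter": "alice", "rshares": "-150"})
logic.process_vote("bob", {"voter": "alice", "rshares": -50})
logic.process_vote("bob", {"voter": "carol", "rshares": 900})
logic.process_account([{"name": "alice", "reputation": 25}])
report = json.loads(logic.finalize())
```

Keeping this class free of any asyncsteem or Twisted imports means it can be exercised with plain dicts, as in the last lines of the sketch.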
Output
The business logic, while trivial, makes up quite some code and thus falls outside of the scope of this tutorial. Using it, the above example code produces a JSON file that in turn can be used by a second script to produce visualizations using the graphviz Python library. Let's have a look at some of the output from the script to see what is possible:
{
  "flags" : {
    "by_pair": {
      "robotq->biblegateway" : 77871798513,
      "anarchofarmer->biblegateway" : 5696441283,
      "lethn1->biblegateway" : 5196933813,
      "blacklist-a->biblegateway" : 1041918690,
      "yanes94->biblegateway" : 20721945343,
      "promienie->biblegateway" : 598526449,
      "dredo->biblegateway" : 592519960,
      "phoneinf->biblegateway" : 50656737664,
      "mitchyb->biblegateway" : 1025398922,
      "anti-phishing->biblegateway" : 8719987210,
      "sofian88->biblegateway" : 600306936,
      "kiikoh->biblegateway" : 160789172,
      "palmalp->biblegateway" : 607545891,
      "anti-spam->biblegateway" : 27400450011,
      "sudutpandang->biblegateway" : 1197101988,
      ...
    }
  },
  "meta" : {
    ...
    "biblegateway" : {
      "recovery_account" : "anonsteem",
      "reputation" : -1642643858064,
      "vesting_shares" : 15693.512806,
      "delegated_vesting_shares" : 0,
      "received_vesting_shares" : 183387.038389,
      "proxy" : ""
    },
    ...
  }
}
Consolidating this info in a top-X flags visualization, the graphviz library can be used to visualize this info as below:
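As a rough sketch of that consolidation step, the function below ranks `by_pair` entries and emits Graphviz DOT text for the top X pairs. It builds the DOT text with plain string formatting so that it makes no assumptions about the graphviz Python package; `top_flags_dot` is a hypothetical name, and the sample values are copied from the output above.

```python
def top_flags_dot(by_pair, top=3):
    """Build Graphviz DOT text for the `top` heaviest voter->author pairs."""
    ranked = sorted(by_pair.items(), key=lambda kv: kv[1], reverse=True)[:top]
    lines = ["digraph flags {"]
    for pair, weight in ranked:
        voter, author = pair.split("->")
        # One edge per pair, labelled with the accumulated weight.
        lines.append('  "%s" -> "%s" [label="%d"];' % (voter, author, weight))
    lines.append("}")
    return "\n".join(lines)


sample = {
    "robotq->biblegateway": 77871798513,
    "phoneinf->biblegateway": 50656737664,
    "anti-spam->biblegateway": 27400450011,
    "kiikoh->biblegateway": 160789172,
}
dot_text = top_flags_dot(sample, top=2)
```

The resulting text can be saved to a file and rendered with any Graphviz front end.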
Conclusion
We have shown how to write a simple asyncsteem script for processing old STEEM blockchain data and creating statistics reports. We left out the case-specific logic and focused on the use of asyncsteem for this purpose. Here is an example of the type of statistics reports that can be created in this way. I hope the above tutorial gives you the ideas and inspiration you need to go out and write your own asyncsteem-based statistics scripts for the STEEM blockchain.
Proof of Work Done
This tutorial is based on the stats collection part of the flag-wars stats scripts that get posted under my @pibarabot account.