Block.ops - An Analysis Tool

For the last year I have carried out a monthly analysis of the Steem blockchain activity by application (i.e. by the different websites and applications through which you can post to the Steem blockchain - often termed dApps).

My aim is now to build a tool that can automate such complex analyses of Steem data, providing both historic time series and rapidly updatable real-time results. In addition to the dApp analysis there are many other projects for which such a system could be useful.

This tool is Block.ops.

[Image: gears_blockops.jpg]

Repository

https://github.com/miniature-tiger/block.ops

New Project - Block.ops

What is the project about?
  • Block.ops is a tool that will allow automation of complex analyses of Steem data.
  • The data will be sourced directly from the Steem blockchain through the AppBase API. This data will be filtered and formatted into the desired structure and then stored in MongoDB.
  • Initially the Block.ops project will focus on historic analysis but the roadmap also includes the build of real-time results, updating as new Steem blocks are created.
  • New features from this initial contribution are described in the section below.
Technology Stack
  • Block.ops is being built with JavaScript, Node.js, the AppBase API and MongoDB.
  • This is my first project with the latter three of these technologies so it will be a learning experience!
Roadmap
  • The short-term roadmap is:
    (1) Build Block.ops to automatically produce the monthly analysis of the Steem blockchain activity by application, including author, post, and payout statistics and rankings.
    (2) Allow analyses to be generated for any chosen date range, not just monthly.
    (3) Consider, map and build more granular analyses by application and by user, including time-series.
    (4) Build automated charts and graphs to illustrate the data.
    (5) Consider which other data from the block operations are to be stored (votes, follower operations, transfers etc) and which other analyses are to be built.
    (6) Once the mappings are defined, fill the database (this will take some time but can be done gradually if necessary - once the data has been loaded for an analysis of one month it is then available for all other historic analyses, assuming the data is stored in the required format).

  • The longer-term roadmap is still to be defined but is likely to include:
    (1) A front-end UI for more manageable launching of analyses and reading of results.
    (2) Build for production of real-time results.
    (3) Production of API for particular analyses.

How to contribute?
  • If you are interested in this project, you can contact me as miniature-tiger on Discord.

New Features - Initial contribution

Creation of blockDates index

The Steem blockchain operations data (posting data, rewards, voting etc) can be extracted from each block of the blockchain. However, analyses are typically carried out by date rather than for a certain number of blocks. The first task is therefore to create an index of blockDates by finding the first block of each day (in UTC time).

The index is created starting from the first block. It then moves forward a day at a time, estimating where the first block of each day should be (based on three-second blocks), adjusting and re-estimating (to account for dropped blocks), and validating by comparing the timestamps of the chosen block and the immediately prior block. There is a workaround for dates with a large number of missing blocks.
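To make the approach concrete, here is a minimal sketch of the estimate-and-validate loop in Node.js. The function names (findFirstBlockOfDay, getBlockHeader), the public api.steemit.com node and the exact adjustment logic are illustrative assumptions rather than the actual Block.ops code.

```javascript
const rp = require('request-promise-native');

// Fetch a block header via AppBase JSON-RPC (condenser_api.get_block_header).
async function getBlockHeader(blockNumber) {
    const response = await rp({
        method: 'POST',
        uri: 'https://api.steemit.com',
        body: { jsonrpc: '2.0', method: 'condenser_api.get_block_header', params: [blockNumber], id: 1 },
        json: true,
    });
    return response.result; // includes a "timestamp" field such as "2016-03-24T16:05:00"
}

// Estimate the first block of targetDate (UTC), then adjust until the chosen block
// falls on the target day and the immediately prior block falls before it.
async function findFirstBlockOfDay(targetDate, knownBlock, knownTimestamp) {
    const secondsToTarget = (targetDate - knownTimestamp) / 1000;
    let candidate = knownBlock + Math.round(secondsToTarget / 3); // assume 3-second blocks
    for (;;) {
        const header = await getBlockHeader(candidate);
        const blockTime = new Date(header.timestamp + 'Z');
        const offsetSeconds = (targetDate - blockTime) / 1000;
        if (offsetSeconds > 0) {
            // Candidate is still on the previous day: jump forward and re-estimate.
            candidate += Math.max(1, Math.round(offsetSeconds / 3));
        } else {
            const prior = await getBlockHeader(candidate - 1);
            if (new Date(prior.timestamp + 'Z') < targetDate) {
                return { blockNumber: candidate, timestamp: blockTime }; // validated first block of the day
            }
            // Overshot (dropped blocks make the estimate land too late): jump back and retry.
            candidate += Math.min(-1, Math.round(offsetSeconds / 3));
        }
    }
}

// Example: find the first block of 1 January 2018 UTC, estimating forward from
// the first ever block (#1, 24 March 2016 16:05 UTC, as noted in the report below).
findFirstBlockOfDay(new Date('2018-01-01T00:00:00Z'), 1, new Date('2016-03-24T16:05:00Z'))
    .then(result => console.log(result.blockNumber, result.timestamp.toISOString()))
    .catch(console.error);
```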

This commit also includes the construction of various Steem API functions for accessing AppBase. I toyed with both Steem-js and dSteem but in the end built my own functions (first from scratch and then using the generic npm modules request and request-promise-native). There's no overwhelming reason for this choice, other than that this is my first project of this type and I like to learn from the ground up.
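For reference, a bare-bones AppBase call built on request-promise-native looks something like the sketch below; the wrapper name (callAppBase) and the choice of node are my own illustrations, not necessarily what Block.ops uses.

```javascript
const rp = require('request-promise-native');

const STEEM_NODE = 'https://api.steemit.com';

// Generic AppBase JSON-RPC call: POST the method name and parameters, return the result.
async function callAppBase(method, params) {
    const response = await rp({
        method: 'POST',
        uri: STEEM_NODE,
        body: { jsonrpc: '2.0', method, params, id: 1 },
        json: true, // stringifies the request body and parses the JSON response
    });
    if (response.error) {
        throw new Error(`AppBase error for ${method}: ${JSON.stringify(response.error)}`);
    }
    return response.result;
}

// Example usage: fetch a full block, including its transactions and operations.
callAppBase('condenser_api.get_block', [20000000])
    .then(block => console.log(block.timestamp, block.transactions.length))
    .catch(console.error);
```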

There is also a report which checks whether the blockDates setup has completed as desired. It looks like this:

[Image: blockDates report.png]

As might be expected with 3 second blocks, virtually all the first blocks of each day are at midnight. The two exceptions are the first block (24 March 2016 16h05) and 1 June 2017 (3 seconds past midnight).

Almost all days have some dropped / missed blocks in comparison to the expected number based on 3 second intervals. I'd be interested in understanding how this happens if anyone can point me to a good explanation. Generally the number is small but occasionally there are days with significant outages. Well, you can't make omelettes without breaking a few eggs.

The code is here:
https://github.com/miniature-tiger/block.ops/commit/adea42f91017cde8270d4f4c3db8457b0515ad5d

Block Operations Loop and Comment Analysis

The second main commit covers:
(1) The creation of the loop to pull consecutive blocks of operations from the blockchain based on date parameters;
(2) The analysis and formatting of comment operations data from each block, including separating out the dApp information, and inserting the individual records into the MongoDB;
(3) A report on the market share of each application by comment numbers.

Aside from comments (which include posts), all other block operations are ignored. More functionality will be added in the next contribution.
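To give a flavour of how these pieces fit together, here is a simplified sketch of the loop: pull each block, keep only the comment operations, read the posting application from json_metadata, and insert a record into MongoDB. The connection string, database name, collection name and field names are assumptions made for the sketch rather than the actual Block.ops schema; callAppBase is the JSON-RPC helper sketched above.

```javascript
const rp = require('request-promise-native');
const { MongoClient } = require('mongodb');

// Same AppBase JSON-RPC pattern as the helper sketched earlier.
const callAppBase = (method, params) =>
    rp({ method: 'POST', uri: 'https://api.steemit.com',
         body: { jsonrpc: '2.0', method, params, id: 1 }, json: true })
        .then(response => response.result);

async function processBlockRange(startBlock, endBlock) {
    const client = await MongoClient.connect('mongodb://localhost:27017');
    const db = client.db('blockOps');

    for (let blockNumber = startBlock; blockNumber < endBlock; blockNumber += 1) {
        const block = await callAppBase('condenser_api.get_block', [blockNumber]);
        for (const transaction of block.transactions) {
            for (const [operationName, operationData] of transaction.operations) {
                if (operationName !== 'comment') continue; // posts and comments only; skip all other operations

                // The posting application is usually recorded as "appname/version" in json_metadata.
                let application = 'unknown';
                try {
                    const metadata = JSON.parse(operationData.json_metadata);
                    if (metadata && metadata.app) application = String(metadata.app).split('/')[0];
                } catch (error) { /* missing or malformed json_metadata */ }

                await db.collection('comments').insertOne({
                    blockNumber,
                    timestamp: new Date(block.timestamp + 'Z'),
                    author: operationData.author,
                    permlink: operationData.permlink,
                    application,
                });
            }
        }
    }
    await client.close();
}
```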

The report looks like this. Already looking good! More work will be required to classify applications into dApps that people can actually use, bots, and libraries.

[Image: MarketshareComments.png]
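The market-share numbers themselves can be produced with a straightforward aggregation over the stored comment records. As before, the collection and field names below follow the sketch above rather than the actual Block.ops schema.

```javascript
// db is a connected MongoDB database handle (e.g. client.db('blockOps') from the sketch above).
async function reportCommentMarketShare(db, startDate, endDate) {
    const results = await db.collection('comments').aggregate([
        { $match: { timestamp: { $gte: startDate, $lt: endDate } } },
        { $group: { _id: '$application', comments: { $sum: 1 } } },
        { $sort: { comments: -1 } },
    ]).toArray();

    const total = results.reduce((sum, row) => sum + row.comments, 0);
    for (const row of results) {
        const share = ((row.comments / total) * 100).toFixed(2);
        console.log(`${row._id}: ${row.comments} comments (${share}%)`);
    }
}
```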

The code is here:
https://github.com/miniature-tiger/block.ops/commit/70641d4048dbfc2578fea3125d1870b7338cbdd8

Capturing blocks processed / Date range parameters

The third main commit provides two main additions:

  • Work on parameters allowing the date range to be defined by two dates, or by a single date plus a number of blocks to process (the latter option is for testing, where you only really want to run a small number of blocks).
  • The capture of which blocks have been processed, with status "OK" or "error", and the reporting of blocks processed between date ranges. This will be used in future to allow blocks dropped in error to be reprocessed and to prevent reprocessing of previously analysed blocks when longer date ranges are chosen (a sketch of this record-keeping follows below).
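A minimal sketch of how this record-keeping could work is shown below; the blocksProcessed collection and its field names are illustrative assumptions, not the actual Block.ops schema.

```javascript
// db is a connected MongoDB database handle, as in the earlier sketches.

// Record the outcome for a single block; upserting means reprocessing simply refreshes the status.
async function markBlockProcessed(db, blockNumber, timestamp, status) {
    await db.collection('blocksProcessed').updateOne(
        { blockNumber },
        { $set: { blockNumber, timestamp, status } }, // status: 'OK' or 'error'
        { upsert: true }
    );
}

// Report processed blocks by status between two dates, so that blocks dropped in error
// can be identified and reprocessed, and already-processed blocks can be skipped.
async function reportBlocksProcessed(db, startDate, endDate) {
    const counts = await db.collection('blocksProcessed').aggregate([
        { $match: { timestamp: { $gte: startDate, $lt: endDate } } },
        { $group: { _id: '$status', blocks: { $sum: 1 } } },
    ]).toArray();
    console.log(counts); // counts of blocks with status 'OK' and 'error'
}
```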

The report looks like this:
[Image: Reportblocksprocessed.png]

GitHub Account

My account on github is here:
https://github.com/miniature-tiger
