SteemData 1.2
I've decided to ship early, and not wait until SteemData 2.0. The main reason is that I'd like to push out all the breaking changes now, to reduce the amount of pain in the future.
Features in 1.2
Fast updates and eventual consistency
Before 1.2, I ran a handful of workers in a loop, scraping account-related updates one by one. Steem now has over 120,000 accounts, and this approach doesn't scale. It also meant that an account could only be updated once every few hours, so some of the data was stale.
I have solved this problem by switching to an asynchronous, event-based model (powered by Celery, a distributed task queue, with RabbitMQ as its message broker), where posts, accounts and their virtual operations are updated shortly after new blocks become available.
I have repurposed the old worker model as a fail-safe: if the event-based approach ever fails in a way that would cause data loss, the background worker back-fills the missing data afterwards.
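To make the two paths concrete, here is a minimal sketch of the idea, not the actual SteemData code. It swaps Celery and RabbitMQ for Python's standard-library `queue` and a thread, and MongoDB for an in-memory `Store`. The `CHAIN` data and the names `Store`, `event_worker` and `backfill` are all made up for illustration.

```python
import queue
import threading

# Hypothetical chain data: block number -> accounts touched in that block.
CHAIN = {1: ["alice"], 2: ["bob"], 3: ["carol"], 4: ["alice"]}


class Store:
    """In-memory stand-in for the database (an assumption, not MongoDB)."""

    def __init__(self):
        self.account_updated_at = {}   # account -> latest block applied
        self.seen_blocks = set()       # blocks whose updates were applied

    def apply_block(self, block_num):
        for name in CHAIN[block_num]:
            previous = self.account_updated_at.get(name, 0)
            self.account_updated_at[name] = max(previous, block_num)
        self.seen_blocks.add(block_num)


def event_worker(events, store):
    """Fast path: consume block events as they arrive; None means shut down."""
    while True:
        block_num = events.get()
        if block_num is None:
            break
        store.apply_block(block_num)


def backfill(store, first, last):
    """Fail-safe path: re-apply any block the event path never delivered."""
    missing = [n for n in range(first, last + 1) if n not in store.seen_blocks]
    for block_num in missing:
        store.apply_block(block_num)
    return missing


def demo():
    events, store = queue.Queue(), Store()
    consumer = threading.Thread(target=event_worker, args=(events, store))
    consumer.start()
    for block_num in (1, 2, 4):        # the event for block 3 is lost
        events.put(block_num)
    events.put(None)
    consumer.join()
    return backfill(store, 1, 4), store
```

Calling `demo()` runs the event path with one dropped event, then lets the back-fill pass pick up the block that was missed. The point of the split is that the fast path can be lossy without the database staying wrong for long.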
Structural Changes and Types
This release contains a handful of design improvements and changes, which are not backwards compatible. I do not expect any major breaking changes for 2.0.
Also, the typing support has been improved greatly.
Historic Prices
I've added hourly snapshots for STEEM, implied SBD and Bitcoin prices.
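As an illustration of what "hourly" can mean in practice, the sketch below floors each observation's timestamp to the start of its hour and keeps one record per hour. This is an assumption about one reasonable approach, not a description of how SteemData stores prices. The function names and the price keys are hypothetical.

```python
from datetime import datetime, timezone


def hour_bucket(ts):
    """Floor a datetime to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def record_snapshot(snapshots, ts, prices):
    """Store prices under their hour bucket, keeping the first one seen."""
    snapshots.setdefault(hour_bucket(ts), dict(prices))
    return snapshots


def demo():
    snapshots = {}
    # Placeholder values, not real market prices.
    observations = [
        (datetime(2017, 2, 1, 10, 5, tzinfo=timezone.utc), {"steem_usd": 1.0}),
        (datetime(2017, 2, 1, 10, 45, tzinfo=timezone.utc), {"steem_usd": 2.0}),
        (datetime(2017, 2, 1, 11, 0, 1, tzinfo=timezone.utc), {"steem_usd": 3.0}),
    ]
    for ts, prices in observations:
        record_snapshot(snapshots, ts, prices)
    return snapshots
```

With this keying, two observations inside the same hour collapse into a single snapshot, which keeps historic price queries small and predictable.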
Performance Improvements
The new Mongo deployment is WiredTiger-enabled.
I have reworked the indexes on all collections, which yields a 2- to 10-fold query performance improvement for most historic queries.
SteemData is now also hosted on a more expensive server, powered by an Intel i7-6700K with 64GB RAM. The hardware upgrade should yield a performance gain of over 2x.
Open Source
All of the code powering SteemData is now available on GitHub, and is licensed under the highly permissive MIT license.
steemdata-node
If you're looking for a Docker-based, easy-to-use steemd RPC deployment, this is it.
It comes with all blockchain plugins enabled, the latest seed node list, and an automatic blockchain snapshot download on first run for quick syncing times (thanks to @gtg).
steemdata-mongo
This repo contains all the code responsible for syncing the STEEM blockchain with MongoDB.
steemdata
This is a core library for working with STEEM blockchain data. It is database agnostic (could be used for SQL or any other database in the future).
steemdata.com
Right now, the website only hosts basic instructions and stats.
Eventually, I would like to build:
- an API for 3rd party apps
- blockchain explorer
- steemle inspired charts and analytics
TODO (until next release)
- Integrate Comments
- Add Relationships via HRefs
- Create Sample Notebooks
- Documentation!
Now that a stable base is in place, I'd like to work on making this project more useful and friendly to the people who can benefit from it. If you're a developer, please talk to me (I am @furion on steemit.chat).
Upgrade Now
The old version of SteemData will be shutting down on Feb 10th. Please upgrade to SteemData 1.2; see steemdata.com for connection info.
Crowdfunding
We have raised $5,120 of the $5,000 goal so far. Big thanks to @cass for making this project possible.
Supporter | Amount |
---|---|
@cass | $4,900 |
@fabien | $100 |
@abit | $100 |
@tuck-fheman | $20 |
Donations should be sent to @steemdata. The list of friendly donors will be published and updated here, as well as in future announcements.
If you'd like to support my work, feel free to vote @furion for witness.