Steem Developer Update (Graphene 2.0)



Image created by AlexanderAlUS under CC BY-SA 3.0

This update is for those who are interested in what our team is working on behind the scenes, and it lets the community know what to expect in the next couple of weeks. We are really excited about what some of these updates mean for Steem and blockchain technology in general.

Introducing Graphene 2.0

Graphene is the underlying database technology that powers many different blockchains (Steem, Bitshares, Golos, etc). Graphene 1.0 was groundbreaking in its ability to process hundreds of thousands of transactions per second. It is extremely developer friendly and enabled the development of Steem in just a couple of months. Graphene 2.0 is a significant overhaul of this backend technology that is aimed at helping platforms like steemit.com scale in a secure and economical manner.

Adopting Memory Mapped File for Storage

Under Graphene 2.0, all blockchain consensus state will be maintained in a memory mapped file that can be shared among multiple processes. This means application state is effectively “on disk” and the operating system will handle paging data to and from disk as needed. As the blockchain's memory requirements grow, this will provide several major benefits:

  1. Faster load and exit times
  2. Parallel access to the database
  3. More robust against crashes
  4. Less frequent database corruption
  5. Instant “snapshotting” of entire state
  6. Serve more RPC requests from the same memory
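
To make the idea concrete, here is a minimal Python sketch of state kept in a memory mapped file. It is an illustration only, not Graphene code: the file layout (a single 64-bit counter at offset 0), `STATE_SIZE`, and every function name are assumptions invented for this example.

```python
import mmap
import os
import struct
import tempfile

STATE_SIZE = 4096               # hypothetical fixed size of the demo state file
COUNTER = struct.Struct("<Q")   # hypothetical layout: one 64-bit counter at offset 0


def create_state_file(path: str) -> None:
    """Create a zero-filled file that backs the mapped state."""
    with open(path, "wb") as f:
        f.write(b"\x00" * STATE_SIZE)


def increment_counter(path: str) -> int:
    """Map the file read-write, bump the counter in place, return the new value."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), STATE_SIZE) as state:
            (value,) = COUNTER.unpack_from(state, 0)
            COUNTER.pack_into(state, 0, value + 1)
            state.flush()  # ask the OS to write the dirty pages back to the file
            return value + 1


def read_counter(path: str) -> int:
    """Map the same file again, as a second process could, and read the counter."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), STATE_SIZE, access=mmap.ACCESS_READ) as state:
            (value,) = COUNTER.unpack_from(state, 0)
            return value


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "state.bin")
    create_state_file(demo_path)
    increment_counter(demo_path)
    increment_counter(demo_path)
    print(read_counter(demo_path))
```

Nothing is explicitly loaded or saved here: the state lives in the file, each mapping sees it immediately, and the operating system decides which pages stay in RAM.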

Problems with Graphene 1.0

Graphene is designed to keep all blockchain consensus state in memory using what is arguably one of the highest performance in-memory data structures (Boost Multi-index Containers). For traditional cryptocurrencies this approach scales very well because the application state (account balances) is small relative to the transactional throughput (balance transfers and trades).
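
For readers unfamiliar with multi-index containers, the rough idea is one set of records reachable through several indices that are kept in sync. The Python sketch below imitates that idea only; it is not Boost and not Graphene code, and the `Account` record and `AccountIndex` class are hypothetical names chosen for the example.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Account:
    name: str
    balance: int


class AccountIndex:
    """One set of records with two indices: by name and ordered by balance."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Account] = {}
        self._by_balance: List[Tuple[int, str]] = []  # kept sorted as (balance, name)

    def upsert(self, account: Account) -> None:
        old = self._by_name.get(account.name)
        if old is not None:
            # Drop the stale entry from the ordered index before re-inserting.
            i = bisect.bisect_left(self._by_balance, (old.balance, old.name))
            del self._by_balance[i]
        self._by_name[account.name] = account
        bisect.insort(self._by_balance, (account.balance, account.name))

    def get(self, name: str) -> Account:
        """Lookup through the name index."""
        return self._by_name[name]

    def richest(self, n: int) -> List[str]:
        """Ordered query through the balance index, largest balance first."""
        return [name for _, name in self._by_balance[::-1][:n]]
```

Every extra index speeds up one kind of query but adds to the memory each record occupies, which is part of why a large application state is costly to hold entirely in RAM.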

Steem has a much larger application state than any other cryptocurrency. This state includes all of the article content, feed lists, and votes. Additionally, this state is queried by thousands of passive readers who are interested in browsing blockchain explorers like steemit.com.

Steem is currently the second largest blockchain measured by transactions per second. Only Bitcoin is processing more transactions than Steem. The Steem consensus state is growing faster than that of any other blockchain because almost every operation adds more state than it consumes (especially for full nodes serving steemit.com).

Currently the Steem nodes that power steemit.com consume over 14 GB of RAM each, and this number is growing at a rapid rate. Adding a new feature usually means increasing the amount of RAM required.

Slow Exit and Load Times

When a full node starts up it must process and index many gigabytes of data. This process currently takes tens of seconds when there are no problems. If any problems are detected while loading the saved state, then the entire blockchain must be processed to regenerate the state from the history of transactions. This blockchain replay process can take over 5 minutes on even the fastest machines.

When a full node shuts down, it must save all of this data to disk. This can also take tens of seconds. If anything goes wrong while saving, then the next load of the database will require a full 5+ minute replay of the blockchain.

Single-Threaded Bottleneck Limits Connections

Graphene was designed to be single threaded for performance reasons. The very nature of blockchain technology requires a deterministic generation of consensus state which means a definite sequential order of operations for everything that impacts shared state. The overhead of multithreaded synchronization is greater than any benefits we might gain.

In a normal blockchain environment this is perfectly OK, but Steem isn’t your normal blockchain. Our Steem nodes are processing requests from thousands of clients every second. Each of these requests must be proxied to the thread that is allowed to read and write to the database. To make a long story short, each Steem node is only able to process about 150 simultaneous connections before users start experiencing a degradation in website performance.
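
The proxying pattern described above can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the actual node code: `DatabaseThread`, `submit`, and `apply_block` are hypothetical names, and the "database" is just a dictionary.

```python
import queue
import threading
from concurrent.futures import Future


class DatabaseThread:
    """Owns the state; every read and write is funneled through one thread."""

    def __init__(self) -> None:
        self._state = {"head_block": 0}
        self._requests: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Requests are executed strictly one at a time, in arrival order.
        while True:
            item = self._requests.get()
            if item is None:  # sentinel used by stop()
                break
            func, future = item
            try:
                future.set_result(func(self._state))
            except Exception as exc:
                future.set_exception(exc)

    def submit(self, func) -> Future:
        """Called from any client thread; the caller waits on the returned Future."""
        future: Future = Future()
        self._requests.put((func, future))
        return future

    def stop(self) -> None:
        self._requests.put(None)
        self._thread.join()


def apply_block(state: dict) -> int:
    """Toy write operation: advance the head block number."""
    state["head_block"] += 1
    return state["head_block"]
```

No locks are needed around the state because only one thread ever touches it, but for the same reason every client request waits in the same queue, including requests that only read.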

In order to maintain good performance for all users, steemit.com runs many instances of the Steem node and load balances requests among those instances. Each of these instances requires another 14 GB of RAM (and growing).
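
As a back-of-the-envelope illustration of what this costs, the sketch below uses the two approximate figures quoted above (about 150 connections per node and 14 GB per instance). The connection count passed in is a hypothetical input, not a measurement of steemit.com traffic.

```python
import math

CONNECTIONS_PER_NODE = 150  # approximate limit quoted above
RAM_PER_NODE_GB = 14        # per-instance footprint quoted above ("over 14 GB")


def nodes_needed(simultaneous_connections: int) -> int:
    """Instances required so that no node exceeds its connection limit."""
    return math.ceil(simultaneous_connections / CONNECTIONS_PER_NODE)


def total_ram_gb(simultaneous_connections: int) -> int:
    """Total RAM when every instance keeps its own full copy of the state."""
    return nodes_needed(simultaneous_connections) * RAM_PER_NODE_GB
```

For a hypothetical 3,000 simultaneous connections this works out to 20 instances and 280 GB of RAM, all of it holding copies of the same state.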

Software Crashes Are Expensive to Recover From

Any software bugs that cause an unexpected crash will result in a corrupt application state. When a node crashes it can take minutes to recover while maxing out a CPU core.

Any process that is servicing requests from users is at greater risk of software bugs and crashes because these processes change more frequently than the core consensus logic.

API Versioning

Anytime we upgrade our API, we have to run a full node to serve it. Supporting multiple versions of our API in parallel therefore requires significant resources. Under Graphene 2.0, multiple APIs can share the same database and can be started and stopped at will.

Better Access Control

It is now possible to serve all of our blockchain database queries from a process that has mapped the database in READ-ONLY mode. This means the operating system will enforce that no API call can inadvertently corrupt the state of the blockchain consensus database.
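
A small Python sketch of the read-only idea, using the standard `mmap` module. The file contents and the function names are invented for the example. Note that Python itself rejects the write with a `TypeError`; in a native process holding a read-only mapping, the operating system would refuse the write instead.

```python
import mmap
import os
import tempfile


def open_read_only(path: str) -> mmap.mmap:
    """Map an existing, non-empty file so that it can be read but not modified."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays valid after f is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def try_write(view: mmap.mmap) -> bool:
    """Attempt to overwrite the first bytes; return True only if it succeeded."""
    try:
        view[0:4] = b"EVIL"
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "state.bin")
    with open(demo_path, "wb") as f:
        f.write(b"consensus-state")
    view = open_read_only(demo_path)
    print(view[:9], try_write(view))
    view.close()
```

An API process built this way can answer queries from the shared state while having no way to alter it, whatever bugs its request handlers contain.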

Parallel Network Protocols

Under the new model we can separate the P2P networking code from the core database code and logic. This separation will allow us to add multiple networking protocols in parallel while maintaining an operating-system-enforced firewall between publicly facing network code and the core blockchain validation logic.

This will allow us to start, stop, and restart the P2P networking infrastructure without having to restart the entire blockchain database.

Summary

Graphene 2.0 will involve significant updates to the very core of Steem and will take us a couple of weeks to implement and test. Because this update is so far-reaching, all other new blockchain features will be put on hold until this migration is complete. There will be an extensive period of testing where old and new versions of Steem will be running side by side to ensure we do not accidentally introduce an unexpected, consensus-changing hard fork.
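
One way to picture side-by-side testing is to feed the same blocks to both versions and compare a digest of the resulting state after every block. The Python sketch below is a hypothetical illustration of that idea, not our actual test harness: the state is a plain dictionary, and `apply_old` and `apply_new` stand in for the two implementations.

```python
import hashlib
import json
from typing import Callable, Iterable, Optional


def state_digest(state: dict) -> str:
    """Deterministic fingerprint of a state (keys sorted before hashing)."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def first_divergence(
    blocks: Iterable,
    apply_old: Callable[[dict, object], None],
    apply_new: Callable[[dict, object], None],
) -> Optional[int]:
    """Return the 1-based height of the first block after which the two
    implementations disagree, or None if they agree on every block."""
    old_state: dict = {}
    new_state: dict = {}
    for height, block in enumerate(blocks, start=1):
        apply_old(old_state, block)
        apply_new(new_state, block)
        if state_digest(old_state) != state_digest(new_state):
            return height
    return None
```

Any height returned by a check like this would point at the exact block where the new code computed a different state, which is the kind of discrepancy that would otherwise surface as an accidental hard fork.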

After the migration to Graphene 2.0 is complete, we will return our attention to Curation Guilds.
