Steem Pressure #3 - Steem Node 101

Everything you’ve always wanted to know about your first Steem Node but were afraid to ask.

Our goal for today

We will set up the system, build steemd manually, configure a simple Steem seed node, run it and sync it with the current head block.

What is that "simple Steem seed node"?


Video created for Steem Pressure series.

A consensus node.

Every Steem node is at least a consensus node.
In its simplest form, a consensus node is one that connects to the Steem network solely to receive and relay blocks and transactions.

The more Steem nodes are running, the more decentralized and resilient the Steem network is.

A consensus node is also called a “Low Memory Node” - the name comes from the compile-time option LOW_MEMORY_NODE=ON, which is used to build steemd in such a way that data and fields not needed for consensus are not stored in the object database.

A low memory node is all that is needed (and therefore recommended) for seed nodes, witness nodes, and nodes run by exchanges.

A full node.

Running a low memory node is enough in many cases, but we still need full nodes to be able to use certain plugins and their APIs.

Importantly, “full node” means something different here than in the Bitcoin realm, where what a Bitcoin full node does is pretty much what a Steem consensus node can already do.

Here, on Steem, the word “full” doesn’t refer to anything related to the blockchain - it refers to the fully featured set of APIs enabled within steemd.

For example, the Condenser that powers the steemit.com site uses those APIs to display posts, comments, votes, feeds, and tags.

Many of those calls don’t need to be served by steemd, and in the future they will be served by various microservices.

Full nodes have significantly higher resource requirements, but this issue will not be covered in this episode.

Setting up the hardware

In the previous episode Steem Pressure #2 - toys for boys and girls, I gave you some tips about the hardware you might need.

In this episode, I will use an entry-level dedicated machine:
Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz on an Ivy Bridge with 32GB DDR3 1333MHz RAM and 3x 120GB SSD

OS setup

I’m currently using Ubuntu 16.04 LTS.
You can assume that it’s a default clean install.

It’s up to you how you set up your system to suit your needs and best utilize its hardware components. Every case is different, so there’s no ultimate solution here. In the end, if you run a Steem node as a public service, you are expected to be qualified enough for sysadmin tasks.

In this example, the performance of each of the three disks is good on “Gandalf’s scale” ;-) i.e. each can complete the trivial benchmark presented in the previous episode in less than 8 seconds.

I’ve used a 12GB swap partition on each of the three drives, all with the same priority.
The / and /boot partitions are configured as software RAID1 across all three drives.
/home is configured as software RAID0 across all three drives.

It’s OK to use RAID0 in my case because I’m not going to store anything important on it (anything I can’t afford to lose in the event of a power or drive failure), and it lets the array pass the test in around 4 seconds.

Software prerequisites

Let's prepare our system so that we can build and run steemd.

Add a steem user:
useradd -s /bin/bash -m steem
Update the list of packages:
apt update
Make sure that your packages are up to date:
apt upgrade
Make sure that you have a reliable time source:
apt install ntp

If you wish to use tmpfs for the shared memory file (an optional approach), make sure there is enough free space on the target tmpfs device, then create a directory and set its ownership:
mount -o remount,size=48G /run
mkdir /run/steem && chown steem:steem $_
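Before remounting /run with a bigger size, it can be worth checking what the tmpfs currently offers. These are standard util-linux/coreutils tools, nothing Steem-specific:

```shell
# Show whether /run is mounted as tmpfs, and its size and usage.
findmnt -t tmpfs /run   # prints the mount entry only if /run is a tmpfs
df -h /run              # human-readable size, used and available space
```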

Building steemd

To use Docker or not?

Use it. It’s way faster and easier, and it protects you from many of the mistakes you would likely make by not following the manual carefully.

Manual way

I'm not going to use Docker now, because I want you to get more familiar with the steemd building process.

Install packages needed to build steemd:

apt install \
    autoconf \
    automake \
    cmake \
    doxygen \
    g++ \
    git \
    libboost-chrono-dev \
    libboost-context-dev \
    libboost-coroutine-dev \
    libboost-date-time-dev \
    libboost-filesystem-dev \
    libboost-iostreams-dev \
    libboost-locale-dev \
    libboost-program-options-dev \
    libboost-serialization-dev \
    libboost-signals-dev \
    libboost-system-dev \
    libboost-test-dev \
    libboost-thread-dev \
    libncurses5-dev \
    libreadline-dev \
    libssl-dev \
    libtool \
    make \
    perl \
    pkg-config \
    python3 \
    python3-jinja2 \
    wget


This list is likely to change in future releases, but you can always take a look at the Dockerfile to get an idea of what is needed… or just use Docker instead (we will cover this question in one of the upcoming episodes).

From now on, you can perform all steps as the user steem:
su - steem

Clone Steem from GitHub:
git clone https://github.com/steemit/steem

Checkout stable branch and update submodules:

cd steem
git checkout stable
git submodule update --init --recursive


Create two directories: one for the build process and one for the resulting binaries:
mkdir ~/build ~/bin

Use cmake to configure steem for the build process:

cd ~/build
cmake -DCMAKE_BUILD_TYPE=Release \
      -DLOW_MEMORY_NODE=ON -DCLEAR_VOTES=ON \
      ../steem


Finally, build steemd:
make -j$(nproc) steemd
And copy it to ~/bin for convenience:

cp ~/build/programs/steemd/steemd ~/bin
cd; ~/bin/steemd --version


Congratulations! You have your own steemd.

Configuring a simple node.

How simple could that be?

Create a directory for data:
mkdir testdata

Create a simple configuration file (type or paste the contents below, then press Ctrl-D):
cat > testdata/config.ini

p2p-endpoint = 0.0.0.0:2001
seed-node = gtg.steem.house:2001
public-api =
enable-plugin = witness

[log.console_appender.stderr]
stream=std_error

[logger.default]
level=info
appenders=stderr

I have configured it to listen on all interfaces (0.0.0.0) on port 2001.
You need to provide at least one seed node (address and port) - in this example I’m using my own public seed node: gtg.steem.house:2001
doc/seednodes.txt is recommended as an authoritative source of reliable seed nodes. You can use this one-liner to add them to your configuration file:

while read s; do echo seed-node = ${s%% *}; done < ~/steem/doc/seednodes.txt >> config.ini
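The `${s%% *}` expansion in that one-liner strips everything from the first space onward, so any trailing comments in seednodes.txt are dropped. Expanded into a more readable form (the sample file contents below are made up for illustration):

```shell
# A made-up sample resembling doc/seednodes.txt entries.
cat > /tmp/seednodes.txt <<'EOF'
gtg.steem.house:2001           # @gtg
seed.example.net:2001          # hypothetical witness
EOF

# ${s%% *} removes the longest suffix starting at the first space,
# leaving only the host:port part of each line.
while read s; do
    echo "seed-node = ${s%% *}"
done < /tmp/seednodes.txt
# prints:
# seed-node = gtg.steem.house:2001
# seed-node = seed.example.net:2001
```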

I have not enabled any public APIs. The empty list has to be set explicitly; otherwise the default setting is used, which contains the login, database, and account_by_key APIs.
I have enabled the witness plugin. You might find this surprising, but it has nothing to do with being a witness. It will allow your node to have an idea about bandwidth restrictions, which is part of the witness plugin. It is not required, but it is recommended, so it’s a good opportunity to highlight this option.
The remaining lines are related to logging; the levels info and higher will go to stderr. In this way, we won’t be flooded by p2p debug messages or get bored by the lack of messages during resync.

You might point shared-file-dir to /run/steem if you configured it earlier, or explicitly set shared-file-size to a value other than the default.
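If you went the tmpfs route, the corresponding config.ini entries could look like this (option names as in steemd v0.19.x; the size value is an assumption and must leave room for the shared memory file to grow):

```
shared-file-dir = /run/steem
shared-file-size = 40G
```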

Please note that this configuration works for the current stable release, v0.19.2, and its minor revisions; it will become obsolete with AppBase.

Run steemd, run!

Once the node is configured, it’s time to start synchronization, which can take quite a lot of time depending on your setup.

~/bin/steemd -d testdata

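steemd runs in the foreground, so closing your SSH session would kill it. One simple way to keep it alive (plain POSIX tools, nothing steemd-specific; screen or tmux would work just as well) is nohup:

```shell
# Start steemd detached from the terminal, logging to a file.
nohup ~/bin/steemd -d testdata > steemd.log 2>&1 &
echo $!              # the PID, in case you want to stop it later with kill
tail -f steemd.log   # watch the sync progress
```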

That’s it

My example node synced 20M blocks in slightly less than 9 hours, reaching the current head block 100 minutes later.

If your log shows fresh blocks arriving every 3 seconds, it means that your node is fully synced and you are getting blocks produced by witnesses.

There are ways to speed things up, such as putting the shared memory file on ramfs (pros: RAM is fast; cons: RAM is expensive and can quickly run out) or on tmpfs (pros: it will use swap when out of RAM; cons: it will use swap when out of RAM), or using a local copy of block_log to replay the blocks it already contains and then sync only what is missing up to the current head block.
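For the block_log approach, a sketch could look like this (the source path is hypothetical; --replay-blockchain tells steemd to rebuild its state from the local block_log before syncing the rest from the network):

```shell
# Place a trusted block_log copy where steemd expects to find it.
mkdir -p testdata/blockchain
cp /backup/steem/block_log testdata/blockchain/   # hypothetical source path

# Rebuild state from local blocks, then sync the remainder from peers.
~/bin/steemd -d testdata --replay-blockchain
```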

What’s next?

In the next episode, I will show you the performance differences between various setups and how quickly they can replay up to 20M blocks to give you some reference data.


If you believe I can be of value to Steem, please vote for me (gtg) as a witness on Steemit's Witnesses List or set (gtg) as a proxy that will vote for witnesses for you.
Your vote does matter!
You can contact me directly on steem.chat, as Gandalf



Steem On
