Beem: race condition slowing down multi-threaded blockchain stream

Project Information

A GitHub issue was created: https://github.com/holgern/beem/issues/32

Expected behavior

b.stream(start=start, stop=stop, threading=True, thread_num=8) should return all ops from the given block range and fetch the data with 8 threads in the background. Using multiple threads should speed up the streaming process in comparison to a single-threaded call.

Actual behavior

The call throws many connection errors from various nodes and is comparatively slow. Running the same call with threading=False works without any connection errors and is faster.

How to reproduce

#!/usr/bin/python
from beem.blockchain import Blockchain
import sys
b = Blockchain()
opcount = 0
for op in b.stream(start=23483000, stop=23483500, threading=True, thread_num=8,
                   opNames=['vote']):
    sys.stdout.write("\r%s" % op['block_num'])
    opcount += 1
print("\n", opcount)

Output (log truncated here; see the GitHub issue for the full output):

# time python bug_threading.py 
23483103Error: Invalid error code (_ssl.c:2273)
Lost connection or internal error on node: wss://steemd.pevo.science (1/-1) 

Error: rsv is not implemented, yet
Lost connection or internal error on node: wss://steemd.pevo.science (2/-1) 

Error: [Errno 9] Bad file descriptor
Error: Invalid WebSocket Header
Lost connection or internal error on node: wss://steemd.pevo.science (4/-1) 

Lost connection or internal error on node: wss://steemd.pevo.science (4/-1) 

Retrying in 5 seconds
[...]

23483500
 12230

real    4m40.085s
user    0m32.147s
sys     0m1.010s

The script takes a long time to run, and errors are reported from various nodes. Across several runs of the script above, the log contains errors that I haven't seen before with typical node or connection issues:

Error: socket is already closed.
Error: rsv is not implemented, yet
Error: Handshake status 400 Bad Request
Error: invalid literal for int() with base 10: '07:16:40'  # <- this was utcnow() at that time
Error: invalid literal for int() with base 10: 'Tm7e05+XYwFFwwm+UDClM9rFIfo='
Error: invalid literal for int() with base 10: '(Ubuntu)'
Error: Illegal frame
Error: ('Invalid opcode %r', 6)
Error: ('Underlying socket has been closed.',)

The same without threading:

#!/usr/bin/python
from beem.blockchain import Blockchain
import sys
b = Blockchain()
opcount = 0
for op in b.stream(start=23483000, stop=23483500, threading=False, thread_num=8,
                   opNames=['vote']):
    sys.stdout.write("\r%s" % op['block_num'])
    opcount += 1
print("\n", opcount)

Output:

# time python bug_threading.py 
23483500
 12230

real    0m52.549s
user    0m15.767s
sys     0m0.527s

No connection error, and the single-threaded version is faster than the multi-threaded version.

Further investigations:

  • The issue only occurred with websocket nodes; https nodes were not affected.
  • Using separate Steem() instances per thread made the errors disappear, but the performance was poor.
  • The cause was identified as a multi-threading race condition: multiple threads read data from the same frame buffer, mixing up responses from different requests.
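
To illustrate the kind of race described in the last point, here is a minimal sketch of one common way to avoid it: making the send/receive pair on a shared connection atomic with a lock. FakeSocket, LockedRpc and fetch_all are hypothetical names made up for this example. This is not beem's code, and not necessarily how fe0bc6d resolves the problem.

```python
import threading
from collections import deque


class FakeSocket:
    """Hypothetical stand-in for a shared websocket: replies queue up in send order."""

    def __init__(self):
        self._frames = deque()

    def send(self, request_id):
        # Pretend the node answers immediately; the reply lands in the shared buffer.
        self._frames.append({"id": request_id, "result": "block %d" % request_id})

    def recv(self):
        # Returns whichever frame is first in the buffer, no matter who asked for it.
        return self._frames.popleft()


class LockedRpc:
    """Hypothetical wrapper: one lock makes send + recv a single atomic step."""

    def __init__(self, sock):
        self._sock = sock
        self._lock = threading.Lock()

    def call(self, request_id):
        # Without this lock, another thread could send or receive between our
        # send() and recv(), and we would pick up a reply to its request.
        with self._lock:
            self._sock.send(request_id)
            return self._sock.recv()


def fetch_all(rpc, block_nums, thread_num=8):
    """Fetch all block numbers with thread_num workers; map request -> reply id."""
    results = {}

    def worker(nums):
        for n in nums:
            results[n] = rpc.call(n)["id"]

    # Give each worker every thread_num-th block number.
    chunks = [block_nums[i::thread_num] for i in range(thread_num)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


replies = fetch_all(LockedRpc(FakeSocket()), list(range(200)))
mismatches = [n for n, reply_id in replies.items() if n != reply_id]
print("requests:", len(replies), "mismatched replies:", len(mismatches))
```

A single lock serializes all requests on the connection, so it trades some parallelism for correctness. Matching replies to requests by their id would be another option.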

The project owner implemented a fix in fe0bc6d that solved the problem. The fix is part of release 0.19.40.

With this fix in place, multi-threaded streaming is much faster than the single-threaded variant:

No Threads with wss duration: 22.18 s - votes: 4957
No Threads with https duration: 119.82 s - votes: 4957
8 Threads with wss duration: 6.33 s - votes: 4957
8 Threads with https duration: 14.76 s - votes: 4957

(This benchmark script uses a different block range than the scripts above.)
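
For a rough sense of scale, the speedup factors can be computed from the four durations listed above. This is only arithmetic on the reported numbers; the dictionary layout and variable names are my own.

```python
# Durations in seconds, copied from the benchmark output above.
durations = {
    "wss": {"single": 22.18, "threads8": 6.33},
    "https": {"single": 119.82, "threads8": 14.76},
}

# Speedup = single-threaded duration / 8-thread duration, rounded to one decimal.
speedups = {
    transport: round(d["single"] / d["threads8"], 1)
    for transport, d in durations.items()
}

for transport, factor in speedups.items():
    print("%s: %.1fx faster with 8 threads" % (transport, factor))
```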

Environment

  • beem revision ab740d2
  • Python 3.6.5

GitHub Account

https://github.com/crokkon
