Quick Tech Update: The sircork witness is indeed temporarily manually disabled for upgrades and software updates and to make me go nuts, mostly.


Hey there, steemitizens,

This is my life right now: running on a quick 4-hour nap in my desk chair, with our witness offline for the last twenty-four hours so far, for what should have been only some quick updates and a NIC replacement.

I should have never said "only"...

sigh

I got an opportunity to save some money by taking advantage of a promotional offer from one of the several hosting companies I use for my projects, companies, and clients. I jumped on it, because it offered a machine with much bigger processors and drives than the one I already had for our team witness.

Welp. No good deed goes unpunished!

The new server was AWESOME and blazing fast. For about 5 minutes. Then I noticed that replaying the chain to it as I built the witness software was taking torturously long. I disabled my current server and tried many things, but as the night and day wore on, connection issues were just killing me. The machine would run, the replays would go... sloooooowly. And I'd keep getting disconnected. It was unbearably frustrating.

I immediately reported a potentially failing NIC in the new server to my tech support team, and they began diagnostics that proved me correct. We replaced the NIC.

Then I began replaying again, much better this time... except it kept stalling at about the 14 millionth block and halting. Restarting the node would start the replay over. And over. And over.

We had all blamed the NIC, so we replaced it, but no luck: the issues persisted. More diagnostics ran, and it turned out to be failed physical RAM, so we replaced all 128GB with new sticks. FINALLY the machine passed the tests, and it seems to be totally solid with the new NIC and RAM onboard. A new replay is underway for the 3rd or 4th time since provisioning this new equipment. All seems to be going well at last.

Back to the current box. Fired it up, and it stalled and reported starting a chain with zero blocks. Wtf? It had been running fine for a month or more, since the 64GB upgrade in early March, right up until just hours ago.

I found it was no longer able to reach its chosen seed nodes. I fixed that with some trial and error to find one that was actually responding at all, and kicked it all off again. And so it too began replaying. Hopefully to actually complete this time.
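For anyone who wants to skip the trial and error, here is a minimal sketch of the idea: walk a list of candidate seed nodes and stop at the first one that accepts a TCP connection. The function name `first_reachable` and the hosts and port below are made-up placeholders, not real seed nodes, so swap in your own list. It only proves the port answers, not that the peer is healthy or synced.

```python
import socket


def first_reachable(seeds, timeout=3.0):
    """Return the first (host, port) that accepts a TCP connection, or None."""
    for host, port in seeds:
        try:
            # create_connection resolves the name and opens a TCP socket;
            # the "with" block closes it again as soon as we know it works.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical placeholder seeds: replace with entries from your own config.
    candidates = [
        ("seed1.example.com", 2001),
        ("seed2.example.com", 2001),
    ]
    print(first_reachable(candidates))
```

Whatever it prints is only a candidate; the real test is still whether the replay gets going once the node is pointed at it.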

So this is how we witnesses spend our days.

Sometimes.

Sometimes it just works, and other times it's just work.

And so it goes.

We will be back online shortly, and the chain will live without us in the meantime. Maybe. We consider ourselves lucky that this is only the third time in 8 months of witnessing that we have been down at ALL. We are still maintaining a very healthy position in the fewest-blocks-missed listings, and in this instance we have missed no blocks at all, because it is a controlled, intended outage that just happens to be taking an unintended extra long damn time. And we regret that.

[edit] We are back online after a 36-hour outage and once again smoothly producing blocks on 8 Xeon processors with 24 cores, 128GB of RAM, and sizzling fast, low-latency gigabit connectivity from our west coast USA data center.[/edit]

Sincerely, your diligent, dedicated, upgrading, nerdy but sometimes borked, witness team CTO,
@SirCork
on behalf of Team Three Headed Beast, witness #64, in full partnership with CMO @RhondaK and CFO @Beanz
