Witness Update - Acknowledgement of error and changes to @curie witness operation structure (April 2nd, 2018)


This witness update is an open acknowledgement of a major error made on the @curie witness and an outline of the changes in our witness operation structure that have been put in place to ensure there is not a repeat.

  • On April 1st, 2018, @curie witness operator @locikll made an error while upgrading to a new witness server. The @curie witness began double-producing blocks, causing block collisions.
  • This event exposed a failing in the monitoring protocols in place for the @curie witness. The initial error was a major mistake, and it was compounded by the fact that it took some hours before the mistake was fixed and the @curie witness recovered.
  • We take this matter very seriously; operating a top 20 witness on the Steem blockchain is a serious responsibility and we openly acknowledge failing to live up to this responsibility. The following steps have been taken to ensure this does not happen again:
  1. @locikll has been replaced as primary witness operator. @markangeltrueman (UTC time zone) will take over as primary witness operator. Mark is Curie's lead developer for several ongoing development projects, and in his full-time job he heads up a team of third-line support engineers for a large payments company, after spending 15 years as a software engineer.
  2. @locikll will continue as backup witness operator to provide additional monitoring and coverage as needed; the primary and backup witness operators will coordinate schedules so that at least one is providing coverage every day. @locikll will continue to manage the RPC node and other non-witness infrastructure.
  3. A "four eyes" QA process will be applied to any changes made to the witness node outside of emergency incidents. No changes to the witness node configuration will be made by either the primary or the backup witness operator without the other operator checking them first.
  4. Automated monitoring will be implemented to notify primary and backup witness operators via SMS when blocks are missed.
  5. A backup monitor position has been created with direct phone contact to the primary and backup witness operators. @carlgnash is assuming the duties of backup witness monitor; this gives the Curie witness 24/7 human monitoring coverage, as @carlgnash is on US Pacific time, with waking hours that overlap the UTC night. This is intended as a final stopgap to ensure the fastest possible response time at all hours.
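
As a rough illustration of the automated monitoring described in step 4, the sketch below watches a cumulative missed-block counter and notifies each operator when it increases. This is not the actual @curie tooling: the class name `MissedBlockMonitor`, its `observe` method, and the `notify` callable (standing in for an SMS gateway) are all hypothetical, and how the counter is read from a Steem node is deliberately left out.

```python
from typing import Callable, Iterable, Optional


class MissedBlockMonitor:
    """Hypothetical helper: alerts operators when a missed-block counter grows."""

    def __init__(self, notify: Callable[[str, str], None], recipients: Iterable[str]):
        # notify(recipient, message) is an assumed stand-in for an SMS gateway call.
        self.notify = notify
        self.recipients = list(recipients)
        self.last_total: Optional[int] = None

    def observe(self, total_missed: int) -> int:
        """Record a counter reading and return the number of newly missed blocks."""
        previous = self.last_total
        self.last_total = total_missed
        # The first reading only sets the baseline; an unchanged counter needs no alert.
        if previous is None or total_missed <= previous:
            return 0
        newly_missed = total_missed - previous
        message = f"witness missed {newly_missed} block(s); total is now {total_missed}"
        for recipient in self.recipients:
            self.notify(recipient, message)
        return newly_missed
```

In use, a scheduler would call `observe` with the latest counter value every few seconds, and `recipients` would hold the phone numbers of the primary and backup operators.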

  • For those new to Curie, please follow @curie, and join us on Discord: https://discord.gg/jQtWbfj
    For all witness-related queries, there is a dedicated channel on the Discord server: #witness-enquiries. Please address any questions or concerns about the @curie witness in this channel.

  • To learn more about Curie operations, please read the Curie Whitepaper at curiesteem.com

  • Follow @curie's votes to support the authors. Please consider following our trail and voting for curated authors. If you are a SteemAuto user, @curie is an available trail to follow.
