Update: The full v0.20.6 release is now published. Many of the issues mentioned are the same, it seems.
I'm really excited to see the Steem community leading the way in terms of clarifying expectations in DPoS between witnesses and core consensus developers. If you haven't read it already (and you care about this stuff), check out the post:
Steem Governance is Multiparty
From the post:
In order for Steem blockchain developers (like Steemit) to produce code as effectively as possible, and Steem holders to be as informed as possible, we believe that it would be a substantial benefit to the ecosystem for each Witness to publish the standards that each will hold for proposed Steem code, as well as standards they will apply to the development process used to create that new code. These would be the standards they are committed to enforcing before they would consider adopting and running code in a new version of Steem. We also believe these should include additional standards for each Witnesses’ own procedure which would ensure that their requirements for new code are met.
I'm glad to hear Steemit, Inc will be publishing their own standards along with the community.
As HF20 went live, I put this post together where I outlined some of my thoughts and personal standards for pushing out code. I'll list them here again along with some thoughts specific to the upcoming v0.20.6 release:
- Code review the changes being deployed on Github. Look for anything obvious/confusing/concerning and ask questions of the team as appropriate.
The v0.20.5 diff to master as of this post has 82 file changes over 131 commits. 47 commits have come after v0.20.6rc1. It's a lot to review, so we should be cautious rolling it out.
I've looked through much of this code and since I'm not (yet) a C++ developer, my usefulness in that regard is limited. I am going through some C++ tutorials in free time (that I don't really have), as that is a skill I want to obtain. Many of the changes seem to be about removing the witness_api plugin and moving code around to a block producer container.
Going through the code line by line may not be as helpful as reviewing each issue ticket and PR mentioned in the release notes (more on that later).
- Run the code on a seed node for a period of time and monitor the logs closely. Are there any changes, issues, or unexpected problems? If so, discuss with the other witnesses, community, and Steemit, Inc. developers and create Github issues as needed.
- I have not yet done this on v0.20.6rc1 as most of the changes seemed to me to be witness related.
- Run the code on a backup witness node that is not actively signing blocks.
- I did this for a couple days and then switched over to producing for a time to test that out as well. This led to some helpful discussions here and some github issues here and here.
- Test new functionality on the testnet.
- There doesn't appear to be new functionality here, but tweaks to the existing functionality. I'm hoping for more direction from Steemit, Inc developers on how we can best go about testing this.
- Ask application developers if they've tested their applications against the changes, and if they are comfortable with those changes.
- My current understanding of v0.20.6 means, ideally, there should be limited impact on application developers. They will have to remove the witness_api plugin from their configuration file, but that's most likely it. If the RC changes impact their user base negatively, that's something we want to know about now and something they should be testing against. I'm here to help if I can.
- Discuss with other technical witnesses their own testing and confidence level with the proposed changes.
- I've started some discussions along these lines already, specifically related to how we can best leverage the testnet and get clarity on the intentions of the code to be deployed.
- Ask questions of the Steemit, Inc. development team to determine their confidence level and figure out if there are specific areas of the code which they may be most concerned about (and which I should focus my testing).
This post will be an extension of that process along with other conversations in other mediums that have already started.
I'm hoping for more direction from the developers of this code as to its intention, how those intentions have already been tested and verified, and how we can test them further on the testnet prior to release on the mainnet.
So where to go from here?
I'm thinking an additional approach to get the process started may be to look through each pull request and issue mentioned in the release notes and ensure I understand the intention of the changes and get some clarity on whether or not I think they have been tested to the community's satisfaction.
You can see the release notes here: Steem 0.20.6 RC1 and there may be a new full release coming soon.
Here are the questions I plan to go through for each change mentioned:
- What is the intention of the change?
- How has the change been tested by the developers?
- How can the change be tested on the testnet?
- Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
- As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
First, some commentary on the overview:
This is a bug fix release primarily targeting the RC Plugin.
Cool! We know this will be a continual process of refinement as we improve the Resource Credit system. You can learn more about this system here:
- Developer Guide: Resource Credit System
- RC Bandwidth System
- RC Bandwidth Parameters
- Resource dynamics for witnesses
Note: witness_api plugin no longer exists, and should be removed from config.ini if present.
Attention witnesses! Remove witness_api from your config.ini before trying to start this version up. It will error and not start if you leave it in there.
Reindexing
Nodes running the RC plugin need to manually force a reindex with --replay.
Noted. That's all witnesses.
Nodes that do not run the RC plugin do not need to upgrade or replay.
Hurray appbase. Means exchanges hopefully get bothered less by the social blockchain aspects of improvements we're making.
At this time we recommend witnesses and API nodes upgrade.
But... none have yet. I did for a bit and reverted as it caused some confusion as noted above. Maybe we should get clarity on this directive from Steemit, inc? Why ask if no one is doing it? What do we still need as witnesses to feel comfortable upgrading? That's what this post is all about.
This is an optional release for exchanges. While the RC values will be different on 0.20.5 and 0.20.6, we have determined through our testing the difference will not be significant and should not cause problems for exchanges. If an exchange wants to update we recommend reindexing 0.20.6 up on a separate machine and switching only once the 0.20.6 node is functional.
Cool. As witnesses, we may want to clarify this for people as they will see some accounts able to post that previously could not because of negative resource credits.
Fixes From the Release Notes
Here I'll go through each issue and related PRs that I can find. Hopefully the release notes cover everything seen in the full code diff, but that is an assumption made by the approach I'm taking here.
- RCs are no longer calculated prior to HF20 (#3007)
Code is here: https://github.com/steemit/steem/pull/3026/files
What is the intention of the change?
RC are computed from the beginning of time. This was useful for testing, but it should be disabled to speed replays.
Seems pretty straight forward to me and a good idea to speed things along. :)
How has the change been tested by the developers?
Based on this comment in the code, we may not have a good way to test this on testnet. That's a little concerning, but again, it seems like a straight forward change.
How can the change be tested on the testnet?
Doesn't look like it. :(
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
N/A
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
I'm not super clear what this line is about, but everything else... seems okay. I'd certainly like it better if it was testable on the testnet in some way. theoreticalbts worked on the code and mvandeberg approved the changes.
- Removed the old bandwidth system. This includes the removal of several API fields that were specific to bandwidth and the removal of the witness_api, which was used exclusively for returning information related to bandwidth. (#3029)
What is the intention of the change?
The old bandwidth system has now been replaced by the Resource Credit system. Remove the old code which no longer does anything.
How has the change been tested by the developers?
According to this comment by Ned, the code isn't being used and is just being commented out:
With this change, Bandwidth, which is already deprecated and replaced by RCs, is functionally being commented out. No replay would be needed. The benefit of this however is any subsequent replays would occur faster due to not needing to calculate available bandwidths.
Which you can see in this pull request: https://github.com/steemit/steem/pull/3022/files
How can the change be tested on the testnet?
This should hopefully be a non-issue if the code really isn't being used at all right now.
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
N/A again. I'm starting to think this question isn't very helpful, though it was important for the HF20 release.
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
In this case, I'm trusting the developers know enough about the code as part of implementing the resource credit system to begin with, that this code can safely be commented out. Later it was completely removed in https://github.com/SteemCommunity/steem/pull/3
Shout out to @reggaemuffin and the SteemCommunity GitHub repo https://github.com/SteemCommunity/steem for creating https://github.com/SteemCommunity/steem/pull/3
This pr removes bandwidth and reserve ration calculations completely.
It keeps the apis for them for backwards compatibility, but serves dummy values.
It removes all calculations regarding bandwidth to speed up replay.
Unit tests ensure that all apis are working and no bugs are present.
Which mvandeberg agreed to here and implemented here: https://github.com/steemit/steem/pull/3049 with theoreticalbts reviewing and approving it.
Concerns I have relate to the API changes:
The changed APIs are as follows: condenser_api.get_dynamic_global_properties - Removed the following fields: average_block_size, current_reserve_ratio, max_virtual_bandwidth. Removed the following fields from the extended_account_object: average_bandwidth, lifetime_bandwidth, last_bandwidth_update, average_market_bandwidth, lifetime_market_bandwidth, last_market_bandwidth_update. Removed condenser_api. get_account_bandwidth. Removed the witness_api.
If application developers are using any of this functionality, they should speak up now. It's possible their applications are already returning invalid data and "failing silently." Removing this may cause them to fail loudly.
- Changes were made to the RC parameters to conserve current pricing while more aggressively protecting blockchain resources at high usage levels. (#3061)
What is the intention of the change?
It's outlined well in the issue ticket and we can read the results in the PR here: https://github.com/steemit/steem/pull/3062/files
How has the change been tested by the developers?
From the issue:
A simulation shows these changes should have the following approximate multipliers on the RC costs of standardized transactions at current pool levels:
2018-10-11 17:00:00:1 long_post 0.265
2018-10-11 17:00:00:1 short_post 0.265
2018-10-11 17:00:00:1 transfer 0.970
2018-10-11 17:00:00:1 vote 0.265As you can see, the RC cost of standardized transactions actually drops based on these changes.
How can the change be tested on the testnet?
I'm not sure... but I don't see any discussion about it.
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
My hunch is this might be tricky to test due to needing real-world activity and usage to see how it responds to times of high load we only see on the mainnet.
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
This one has me a little concerned because simulations were done (as far as I know) on HF20 also, and it failed. It's possible these changes to the RC system in the real world could actually have a negative impact. How will we know and how will we test it prior to the code going out? What criteria will we use and measure to know if this change is "right" as in, how will it impact new users with low SP compared to how they are interacting today? Is this relying heavily on "when resource usage is low" and if so, what happens when things are being used a lot?
I still have some questions about this one, but I'm not clear on what the best approach is for getting those questions clarified.
- The state byte cost for votes has been reduced as votes are only required in consensus before a comment has been paid. (#3064)
What is the intention of the change?
Seems pretty straight forward to me. Instead of STATE_BYTES_SCALE (10000) we get STATE_COMMENT_VOTE_BYTE_SIZE (525)
How has the change been tested by the developers?
Looks simple enough from this comment:
Good catch. This should have been done from the beginning.
How can the change be tested on the testnet?
It's a simple math change, from what I can see.
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
All RC changes seem a little tricky to test since you need real-world activity to fully simulate the changes.
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
Seems it will improving voting by making it more accurately reflected as a cheaper operation. At least, I think that's what it does. :)
- Negative RCs are limited to 2 hours worth of regeneration. (#3050)
What is the intention of the change?
"No description provided." worried me a bit, but I asked for clarification within a discussion and got this reponse regarding this line in particular:
int64_t min_mana = -1 * ( STEEM_RC_MAX_NEGATIVE_PERCENT * mbparams.max_mana ) / STEEM_100_PERCENT;
I was given a detailed answer in a semi-public forum which I'll include here if/when I get permission from the commenter to do so.
How has the change been tested by the developers?
Looks like there's an open issue in the SteemCommunity repo for this.
How can the change be tested on the testnet?
Not sure.
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
Not sure, but again my hunch is it can't easily be tested with real-world activity on the testnet.
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
I have some concers about it letting spammers through, as it seemed to do when I was testing it out in production. I mentioned this here and there are still some concerns here that it might be too easy on spammers.
- The execution time cost of custom json operations has been reduced by 20 times for all custom json operations except those used by the follow plugin. (#3027)
What is the intention of the change?
More accruately reflect costs of custom json operations (that's something that SteemMonsters uses a lot, as far as I know).
How has the change been tested by the developers?
This comment summarizes it well:
The state byte numbers are calculated from the python script.
The execution time numbers were derived from empirical testing. The previous custom json time was based on our full node (all plugins enabled) execution times, which includes significant execution times for custom ops used by the follow plugin. We want to account for those because they do incur significant costs to operating Steemit.com but we acknowledge that other Dapps make extensive use of custom ops and do not increase operating costs for Steemit.com at all. This change reduces the cost by 20x for all other custom ops, bring the cost in line with other lower impact operations.
How can the change be tested on the testnet?
Similar to other RC changes... hard to do.
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
Same as above.
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
Seems safe enough. If the numbers are off in production and we see too many people spamming custom json, we can always adjust them again and, eventually, as Ned mentioned here move them to a witness parameter in the future. This seems like a good change to improve the user experience for apps like SteemMonsters.
- RC Execution Time was not be charged against ops. This has been fixed and execution time is properly being checked. (#2972)
What is the intention of the change?
Seems to fix something that was missing all along, this one liner:
result.resource_count[ resource_execution_time ] += vtor.execution_time_count;
How has the change been tested by the developers?
I'm not entirely sure as the questions asked here went unanswered.
How can the change be tested on the testnet?
I don't know, but I'd like to ensure this small change doesn't have a big, unintended impact.
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
Not sure.
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
I'd like to see more clarity on this one from the developers. If the HF20 rollout hurt people's usability and this particular function wasn't being charged at all, could turning it on make things even worse? Has this been properly tested in a real world situation? Having run my own node on v0.20.6rc1 for a while and produced blocks, I didn't see a huge amount being rejected, so... maybe it's fine?
- Fixed command line interface compatibility issues with boolean flags. Flags work as intended on the command line but are settable to boolean values in the config file. For example, on the command line you can set --enable-stale-production or set it as enable-stale-production=true in the config. (#3025)
What is the intention of the change?
Pretty straight forward. Fixes some inconsistencies.
How has the change been tested by the developers?
Seems to me like a "it works or it doesn't" thing.
How can the change be tested on the testnet?
N/A
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
N/A
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
All good since these params aren't often used anyway.
- The account creation fee in the command line wallet uses the correct HF 20 value. (#3019)
What is the intention of the change?
Thanks to @smooth for this contribution. Seems account creation calculations since HF20 were off a bit? I'm not sure what
* asset( STEEM_CREATE_ACCOUNT_WITH_STEEM_MODIFIER, STEEM_SYMBOL )
evaluates to now.
How has the change been tested by the developers?
I'm not quite sure since there wasn't much commentary about this. Seems that account creation wasn't quite right before? It seems like a minor change, but I'd like to know more to ensure it won't make account creations too cheap (or expensive).
How can the change be tested on the testnet?
Create accounts and notice the new cost.
Are there any concerns related to how the testnet functions with regards to this change which may not translate well to the mainnet?
The last I checked, account creation was free on the testnet and this was an issue properly testing HF20. That may be an issue here as well.
As a witness, am I confident this change won't alter the user experience (or security) of the change and if it will, have I done enough to educate the community and help set proper expectations for the rollout?
I think we're okay as long as this fix implements the correct, expected costs of account creation. I need to know more about what it's supposed to be and how this may impact any existing account creation services.
Okay... that's enough for now as 1am is approaching quickly. There may be a new release by the time I wake up in the morning so this may have to be done again with a full v0.20.6 release anyway.
Luke Stokes is a father, husband, programmer, STEEM witness, DAC launcher, and voluntaryist who wants to help create a world we all want to live in. Learn about cryptocurrency at UnderstandingBlockchainFreedom.com