feat: Add validator slot id groups #123
Conversation
Related to codex-storage/nim-codex#457 and codex-storage/nim-codex#458. To cover the entire SlotId (uint256) address space, each validator must validate a portion of the SlotId space. When a slot is filled, the SlotId is put into a bucket based on the value of the SlotId and the number of buckets (validators) configured. Similar to `myRequests` and `mySlots`, a function called `validationSlots` can be used to retrieve the `SlotIds` being validated for a particular bucket (validator index). This facilitates loading actively filled slots in need of validation when a validator starts.

The `validators` value in the network-level configuration specifies the minimum number of validators to be deployed to the network; there can be more than that number of validators on the network. In the Codex client, each validator will opt to validate a bucket of the `SlotId` space by specifying an index in `[0, validators-1]`. It is important that at least one validator per bucket is deployed on the network, or that the configuration value is set to 1. For example, if the `validators` configuration value were set to 3, there should be a minimum of 3 validators deployed on the network, with each one specifying a different validator bucket to cover.

These changes do not prevent any one validator participant from watching the entire SlotId space to potentially earn more.
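To make the bucket idea concrete, here is a minimal client-side sketch (TypeScript with ethers, illustrative only): the ABI fragment, the function names, and the modulo rule for assigning a SlotId to a bucket are assumptions drawn from the description above, not verified against the actual contract.

```typescript
import { ethers } from "ethers";

// Hypothetical ABI fragment — `validators()` and `validationSlots()` are names
// taken from this PR's description, not verified signatures.
const abi = [
  "function validators() view returns (uint8)",
  "function validationSlots(uint16 validatorIndex) view returns (bytes32[])",
];

const provider = new ethers.JsonRpcProvider("http://localhost:8545");
const marketplace = new ethers.Contract(
  "0x0000000000000000000000000000000000000000", // placeholder address
  abi,
  provider
);

// A SlotId is a uint256-sized value; its bucket is assumed to be derived purely
// from that value, e.g. modulo the configured number of validators (buckets).
function bucketOf(slotId: string, validators: bigint): bigint {
  return BigInt(slotId) % validators;
}

// On startup, a validator assigned to `validatorIndex` loads only the filled
// slots in its own bucket instead of scanning the whole SlotId space.
async function loadValidationSlots(validatorIndex: number): Promise<string[]> {
  const slotIds: string[] = await marketplace.validationSlots(validatorIndex);
  return slotIds;
}
```

For example, with `validators` set to 3, a SlotId whose numeric value is `5` lands in bucket `5 % 3 = 2` and would be picked up by the validator started with index 2 via `loadValidationSlots(2)`.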
Rename: `validationSlots` to `myValidationSlots`, `addToValidationSlots` to `addToMyValidationSlots`, and `removeFromValidationSlots` to `removeFromMyValidationSlots`.
So I am a bit surprised by this design. I understand that you are taking a similar approach to how we solved similar issues for other concepts (Slots, Requests), but this seems to me to be a bit of an over-engineered solution that is not necessary and will incur additional, IMHO unnecessary, costs (gas) for users of the network. Also, while we go with this solution (Validators) for now, we know we will move away from it in the future. Moreover, I don't really like the static nature of the Validator groups, which are specified when the contract is deployed. Let's discuss it on the call today.
The alternative idea proposed by @AuHau (modified for detail and completeness):
The potential downsides of this approach are:
To address the other point: the current PR approach is "static" in that the minimum number of validators is set at the network level in the contract. Yes, this is true, and changing it would require a new contract deployment. The new design idea outlined above would still require this value to be set, but it would need to be hardcoded in the codebase instead. Because this is a temporary solution, I don't see a network-level configuration setting as a show stopper. If the network grows large enough that validators have trouble keeping up with the number of slots they need to monitor, this can be worked around in a number of ways: adding more memory, using the existing

Ultimately, as @AuHau stated, this is a temporary solution and will most likely be replaced by aggregators in the future. By my estimate, the amount of work involved above is much more than the solution currently proposed in this PR. Because of this, in my opinion, we should keep what has already been proposed in this PR.
I put some more thought into this and have attempted to map the pros and cons for each design.

Loading Validator slots on startup
Legend
1. The number of slots to validate may exceed the validators' capability to validate all required slots in a single period. If this happens, SPs may be able to get paid out at the end of contracts before validators have had a chance to validate that they didn't miss proofs. A potential solution for this is to disallow
2. Assumes the configured number of validators is large enough to keep the bucket size within the validators' capacity to validate slots within a single period. The chance of exceeding validator capabilities is lower than if there were no slot buckets, due to fewer slots requiring validation.
3. Slot state maintenance has a few disadvantages:
4. Before a validator begins validating, it must check the current state of the slot. The more slots to validate, the longer it will take to start validating slots at startup. With buckets in place, the number of slots will be smaller than without buckets.
5. Building the slot state may rob needed CPU/IO from other parts of the Codex startup routines.
6. To ensure that the entire SlotId space is covered by validators, a minimum of one validator per bucket must be running at all times. The number of SlotIds to be covered is smaller than with no buckets at all, in which case one validator must be running that covers the entire space.
7.
8. Compared to fetching all
9. Compared to fetching and building the slot state locally.
11. With no slot buckets to limit the number of slots to validate, there is an increased chance the validator will not be able to validate all slots in the network in a given period.
12. No additional validator CLI params are needed, only
13. No need to consider whether one validator for each bucket is running; however, in order to guarantee the entirety of the
14. When slot buckets are used, an additional CLI param is required to assign the validator to a particular bucket. If not specified, a random bucket could be assigned by default, or slots in all buckets could be assigned. In both cases,
15. Compared to fetching
16. The local slot state structure is more complex and will take more time to implement and maintain than storing the state on chain.

Other considerations
Scoring

Scores were drawn using the current client state as a baseline, to indicate whether a point was a supplementary benefit or a drawback relative to that baseline. All four options added the net positive benefit of allowing validators to fetch active slots in the network, so that point has been left out of the comparison. We might want to add different additional weights to some of the points, e.g. the introduction of validation delays could potentially have a higher weight.
After a discussion, it was highlighted that in order to prevent SPs from being locked into lengthy contracts, a maximum duration of initially 30 days will be implemented. Due to this maximum, querying and processing 30 days of historical

@markspanbroek @AuHau, I would like to point out that even though the number of slots is limited to slots filled in the past 30 days, and limited to the assigned bucket, the number could be substantial, and it could take quite a lot of time to check the state of each slot in the

Did we figure out what we decided to do if there are too many slots for the validator to be able to validate in a given period?
Add an error log if this occurs.

Another point is that the validator does not have an entire period to validate all slots. It has from the end of the last period to

One idea for an optimisation is to listen for
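A rough sketch of what the local approach plus an event subscription could look like on the client side (TypeScript with ethers, illustrative only): the `SlotFilled` event shape, the SlotId derivation, and the block-count estimate for 30 days are all assumptions, not the actual marketplace interface.

```typescript
import { ethers } from "ethers";

// Illustrative only — the event arguments and the helper deriving a SlotId
// from them are assumptions, not the real marketplace ABI.
const abi = ["event SlotFilled(bytes32 indexed requestId, uint256 slotIndex)"];
const provider = new ethers.JsonRpcProvider("http://localhost:8545");
const marketplace = new ethers.Contract(
  "0x0000000000000000000000000000000000000000", // placeholder address
  abi,
  provider
);

// Hypothetical SlotId derivation; the real scheme lives in the contracts.
function slotIdFrom(requestId: string, slotIndex: bigint): string {
  return ethers.keccak256(
    ethers.AbiCoder.defaultAbiCoder().encode(
      ["bytes32", "uint256"],
      [requestId, slotIndex]
    )
  );
}

async function loadAndWatch(myBucket: bigint, buckets: bigint) {
  const tracked = new Set<string>();
  const inMyBucket = (slotId: string) => BigInt(slotId) % buckets === myBucket;

  // 1. Build the initial slot state from roughly 30 days of history
  //    (block count is a rough, chain-dependent estimate assuming ~12 s blocks).
  const latest = await provider.getBlockNumber();
  const fromBlock = Math.max(0, latest - (30 * 24 * 60 * 60) / 12);
  const past = await marketplace.queryFilter("SlotFilled", fromBlock, latest);
  for (const event of past) {
    const [requestId, slotIndex] = (event as ethers.EventLog).args;
    const slotId = slotIdFrom(requestId, slotIndex);
    if (inMyBucket(slotId)) tracked.add(slotId);
  }

  // 2. Keep the state current by subscribing to new events, instead of
  //    re-scanning history every period.
  await marketplace.on("SlotFilled", (requestId, slotIndex) => {
    const slotId = slotIdFrom(requestId, slotIndex);
    if (inMyBucket(slotId)) tracked.add(slotId);
  });

  return tracked;
}
```

Subscribing keeps the tracked set current between periods without re-querying history, which is in the spirit of the optimisation mentioned above.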
@emizzle I believe this PR can be closed as we are going with the local approach, right?
Closing in favour of codex-storage/nim-codex#890