
[coordinator] placement: add+remove fully atomic #1022

Merged: 2 commits merged into master from schallert/coord_safe_remove on Oct 5, 2018

Conversation

schallert (Collaborator)

I realized that even with the change to make adds "safe" by default,
there could still be a race: the placement against which all shards were
validated as AVAILABLE might no longer be the current placement by the
time AddInstances was called.

This changes placement adds/removes to use the placement algorithm to
construct the new placement, and to perform a CheckAndSet against the old
placement.

Additionally, this implements "safe" adds in the delete handler as well.
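
Roughly, the flow described above looks like the following sketch. Store, Algorithm, and the Placement type here are simplified stand-ins for the real m3cluster interfaces, not the actual API; only the check-and-set shape is the point.

	package placementsketch

	import "errors"

	// Placement stands in for the real placement type.
	type Placement struct {
		AllShardsAvailable bool
	}

	type Store interface {
		Placement() (Placement, int, error)         // current placement + its version
		CheckAndSet(p Placement, version int) error // fails if the stored version moved
	}

	type Algorithm interface {
		AddInstances(p Placement, instanceIDs []string) (Placement, error)
	}

	func validateAllAvailable(p Placement) error {
		if !p.AllShardsAvailable {
			return errors.New("not all shards are AVAILABLE")
		}
		return nil
	}

	// safeAddInstances validates the exact placement it will CAS against, so
	// a concurrent update between validation and write makes the CAS fail
	// instead of applying a decision made on a stale placement.
	func safeAddInstances(store Store, algo Algorithm, ids []string) error {
		cur, version, err := store.Placement()
		if err != nil {
			return err
		}
		if err := validateAllAvailable(cur); err != nil {
			return err
		}
		next, err := algo.AddInstances(cur, ids)
		if err != nil {
			return err
		}
		return store.CheckAndSet(next, version)
	}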

badInsts := []string{}
for _, inst := range p.Instances() {
	if inst.Shards().NumShards() == 0 {
		continue
Collaborator

I don't think this is possible in a valid placement? Could you add this to the list of badInsts for now and return it as an error?
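
What the suggestion might look like, as a hedged sketch (the Instance interface below is a simplified stand-in for the snippet's placement.Instance, and the error wording is an assumption):

	package placementsketch

	import (
		"fmt"
		"strings"
	)

	// Instance mirrors the shape used in the snippet above.
	type Instance interface {
		ID() string
		NumShards() int
	}

	// checkInstancesHaveShards collects zero-shard instances into badInsts
	// and surfaces them as an error instead of silently skipping them.
	func checkInstancesHaveShards(insts []Instance) error {
		badInsts := []string{}
		for _, inst := range insts {
			if inst.NumShards() == 0 {
				badInsts = append(badInsts, inst.ID())
			}
		}
		if len(badInsts) > 0 {
			return fmt.Errorf("instances have no shards: [%s]", strings.Join(badInsts, ", "))
		}
		return nil
	}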

schallert (Collaborator, Author)

Hmm, if it's not possible in a valid placement I'll revert this. I mainly did it because I had a test with an instance that had no shards.

Collaborator

Yeah, we check against instances with no shards in a sharded placement, so this should not be possible.

	if err != nil {
		return nil, err
	}

	return newPlacement, nil
}

if err := validateAllAvailable(curPlacement); err != nil {
Collaborator

This should work, but I feel a more reusable approach would be to pass a placement validation function to the service through options and call it there, in case one day we change the logic in the placement service and forget to update it here. What do you think?
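
A sketch of that options-based approach (ValidationFn, ServiceOptions, and the setter/getter names are assumptions for illustration, not the actual m3 placement service API):

	package placementsketch

	// Placement stands in for the real placement type.
	type Placement struct{}

	// ValidationFn lets callers inject the validation step once via options,
	// so the handlers and the placement service share a single definition
	// and cannot drift apart.
	type ValidationFn func(p Placement) error

	type ServiceOptions struct {
		validationFn ValidationFn
	}

	// SetValidationFn returns options with the given validation function set.
	func (o ServiceOptions) SetValidationFn(fn ValidationFn) ServiceOptions {
		o.validationFn = fn
		return o
	}

	// ValidationFn returns the configured validation function.
	func (o ServiceOptions) ValidationFn() ValidationFn {
		return o.validationFn
	}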

schallert (Collaborator, Author)

That does sound more flexible, but so far this is the only use case we've had for it, and I'd rather limit the scope of this PR to unblock progress.

Collaborator

kk, can you make a TODO here?

codecov bot commented Oct 5, 2018

Codecov Report

Merging #1022 into master will decrease coverage by 0.07%.
The diff coverage is 84.84%.


@@            Coverage Diff             @@
##           master    #1022      +/-   ##
==========================================
- Coverage   76.72%   76.65%   -0.08%     
==========================================
  Files         436      436              
  Lines       37002    37037      +35     
==========================================
- Hits        28391    28390       -1     
- Misses       6585     6616      +31     
- Partials     2026     2031       +5
Flag Coverage Δ
#dbnode 81.19% <ø> (-0.12%) ⬇️
#m3em 66.39% <ø> (ø) ⬆️
#m3ninx 75.33% <ø> (+0.07%) ⬆️
#m3nsch 51.19% <ø> (ø) ⬆️
#query 63.92% <84.84%> (+0.06%) ⬆️
#x 84.72% <ø> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 94784bd...ea0b441.

schallert force-pushed the schallert/coord_safe_remove branch from ea0b441 to 6dcfaf0 on October 5, 2018 19:14
schallert merged commit bb310cd into master on Oct 5, 2018
schallert deleted the schallert/coord_safe_remove branch on October 5, 2018 19:51