Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[api] Implement replace endpoint #1162

Merged
merged 6 commits into from
Nov 13, 2018
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
*.cov
*.html
*.tmp
.DS_Store
test.log

# glide manages this
Expand Down Expand Up @@ -46,4 +47,3 @@ site/
!m3db.io/**/vendor
# Automatically populated from asset sources
m3db.io/openapi

4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -90,15 +90,15 @@ produce a lightweight production image from a single Dockerfile. Accordingly, it
17.05 or later to build.

```
docker build -t m3dbnode:$(git rev-parse head) .
docker build -f docker/m3dbnode/Dockerfile -t m3dbnode:$(git rev-parse head) .
docker run --name m3dbnode m3dbnode:$(git rev-parse head)
```

If you wish to build an image with the source code included you can stop the build after the
`builder` stage:

```
docker build -t m3dbnode:$(git rev-parse head) --target builder .
docker build -f docker/m3dbnode/Dockerfile -t m3dbnode:$(git rev-parse head) --target builder .
```

## Configuration
Expand Down
31 changes: 23 additions & 8 deletions docs/operational_guide/placement_configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -64,10 +64,10 @@ The instructions below all contain sample curl commands, but you can always revi

#### Placement Initialization

Send a POST request to the `/api/v1/placement/init` endpoint
Send a POST request to the `/api/v1/services/m3db/placement/init` endpoint
Copy link
Collaborator

@justinjc justinjc Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think right now it's a little ambiguous as to when to use /api/v1/... and when to use others like /api/v1/services/m3db/.... There are other mentions of /api/v1/placement like in docs/how_to (Github docs) and in our docker integration tests. It would be great to get some clarification on when to use which, but perhaps that's for a larger discussion in another PR.

cc @richardartoul

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinjc It should always be the /services/ ones, we just kept the old names in to not make a breaking change. Anywhere we have a URL without the service name should just be updated to include it


```bash
curl -X POST localhost:7201/api/v1/placement/init -d '{
curl -X POST localhost:7201/api/v1/services/m3db/placement/init -d '{
"num_shards": <DESIRED_NUMBER_OF_SHARDS>,
"replication_factor": <DESIRED_REPLICATION_FACTOR>(recommended 3),
"instances": [
Expand Down Expand Up @@ -104,10 +104,10 @@ curl -X POST localhost:7201/api/v1/placement/init -d '{

#### Adding a Node

Send a POST request to the `/api/v1/placement` endpoint
Send a POST request to the `/api/v1/services/m3db/placement` endpoint

```bash
curl -X POST <M3_COORDINATOR_HOST_NAME>:<M3_COORDINATOR_PORT(default 7201)>/api/v1/placement -d '{
curl -X POST <M3_COORDINATOR_HOST_NAME>:<M3_COORDINATOR_PORT(default 7201)>/api/v1/services/m3db/placement -d '{
"instances": [
{
"id": "<NEW_NODE_ID>",
Expand All @@ -126,16 +126,31 @@ After sending the add command you will need to wait for the M3DB cluster to reac

#### Removing a Node

Send a DELETE request to the `/api/v1/placement/<NODE_ID>` endpoint.
Send a DELETE request to the `/api/v1/services/m3db/placement/<NODE_ID>` endpoint.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice thanks, I think the legacy stuff still works but good to update the docs to the new stuff


```bash
curl -X DELETE <M3_COORDINATOR_HOST_NAME>:<M3_COORDINATOR_PORT(default 7201)>/api/v1/placement/<NODE_ID>
curl -X DELETE <M3_COORDINATOR_HOST_NAME>:<M3_COORDINATOR_PORT(default 7201)>/api/v1/services/m3db/placement/<NODE_ID>
```

After sending the delete command you will need to wait for the M3DB cluster to reach the new desired state. You'll know that this has been achieved when the placement shows that all shards for all hosts are in the `Available` state.

#### Replacing a Node

Currently, the best way to replace a node (due to hardware failure or any other reason) is to perform a node remove followed by a node add. We will eventually support node replacement as a single operation, but that is currently not implemented.

Send a POST request to the `/api/v1/services/m3db/placement/replace` endpoint containing hosts to replace and candidates to replace it with.

```bash
curl -X POST <M3_COORDINATOR_HOST_NAME>:<M3_COORDINATOR_PORT(default 7201)>/api/v1/services/m3db/placement/replace -d '{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this all work out correctly with various numbers of inputs and outputs? I.E if I want to replace one node with two, or remove two and add one etc

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should so long as that algorithm considers that change valid, since we don't do much rather than be a glorified wrapper around that. @cw9 can chime in on the algo stuff

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine, the algo basically calls the add() and then call the remove() on non sharded placements

"leavingInstanceIDs": ["<OLD_NODE_ID>"],
"candidates": [
{
"id": "<NEW_NODE_ID>",
"isolationGroup": "<NEW_NODE_ISOLATION_GROUP>",
"zone": "<ETCD_ZONE>",
"weight": <NODE_WEIGHT>,
"endpoint": "<NEW_NODE_HOST_NAME>:<NEW_NODE_PORT>(default 9000)",
"hostname": "<NEW_NODE_HOST_NAME>",
"port": <NEW_NODE_PORT>
}
]
}'
```
20 changes: 0 additions & 20 deletions src/dbnode/example/README.md

This file was deleted.

195 changes: 0 additions & 195 deletions src/dbnode/example/m3db-node-config.yaml

This file was deleted.

9 changes: 0 additions & 9 deletions src/query/api/v1/handler/placement/add.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,6 @@
package placement

import (
"fmt"
"net/http"
"path"
"time"
Expand Down Expand Up @@ -62,14 +61,6 @@ var (
// AddHandler is the handler for placement adds.
type AddHandler Handler

type unsafeAddError struct {
hosts string
}

func (e unsafeAddError) Error() string {
return fmt.Sprintf("instances [%s] do not have all shards available", e.hosts)
}

// NewAddHandler returns a new instance of AddHandler.
func NewAddHandler(opts HandlerOptions) *AddHandler {
return &AddHandler{HandlerOptions: opts, nowFn: time.Now}
Expand Down
2 changes: 1 addition & 1 deletion src/query/api/v1/handler/placement/add_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ func TestPlacementAddHandler_SafeOK(t *testing.T) {
)
switch serviceName {
case M3AggregatorServiceName:
req = httptest.NewRequest(AddHTTPMethod, M3DBAddURL, strings.NewReader(`{"instances":[{"id": "host1","isolation_group": "rack1","zone": "test","weight": 1,"endpoint": "http://host1:1234","hostname": "host1","port": 1234}]}`))
req = httptest.NewRequest(AddHTTPMethod, M3AggAddURL, strings.NewReader(`{"instances":[{"id": "host1","isolation_group": "rack1","zone": "test","weight": 1,"endpoint": "http://host1:1234","hostname": "host1","port": 1234}]}`))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 thanks

default:
req = httptest.NewRequest(AddHTTPMethod, M3DBAddURL, strings.NewReader(`{"instances":[{"id": "host1","isolation_group": "rack1","zone": "test","weight": 1,"endpoint": "http://host1:1234","hostname": "host1","port": 1234}]}`))
}
Expand Down
25 changes: 25 additions & 0 deletions src/query/api/v1/handler/placement/common.go
Original file line number Diff line number Diff line change
Expand Up @@ -381,6 +381,15 @@ func RegisterRoutes(r *mux.Router, opts HandlerOptions) {
r.HandleFunc(M3DBDeleteURL, deleteFn).Methods(DeleteHTTPMethod)
r.HandleFunc(M3AggDeleteURL, deleteFn).Methods(DeleteHTTPMethod)
r.HandleFunc(M3CoordinatorDeleteURL, getFn).Methods(GetHTTPMethod)

// Replace
var (
replaceHandler = NewReplaceHandler(opts)
replaceFn = applyMiddleware(replaceHandler.ServeHTTP)
)
r.HandleFunc(M3DBReplaceURL, replaceFn).Methods(ReplaceHTTPMethod)
r.HandleFunc(M3AggReplaceURL, replaceFn).Methods(ReplaceHTTPMethod)
r.HandleFunc(M3CoordinatorReplaceURL, replaceFn).Methods(ReplaceHTTPMethod)
}

func newPlacementCutoverNanosFn(
Expand Down Expand Up @@ -460,6 +469,14 @@ type m3aggregatorPlacementOpts struct {
propagationDelay time.Duration
}

type unsafeAddError struct {
hosts string
}

func (e unsafeAddError) Error() string {
return fmt.Sprintf("instances [%s] do not have all shards available", e.hosts)
}

func validateAllAvailable(p placement.Placement) error {
badInsts := []string{}
for _, inst := range p.Instances() {
Expand Down Expand Up @@ -517,3 +534,11 @@ func parseServiceFromRequest(r *http.Request) (string, error) {

return "", errUnableToParseService
}

func isStateless(serviceName string) bool {
switch serviceName {
case M3CoordinatorServiceName:
return true
}
return false
}
Loading