Tracking Issue: Make VSchema management serializable #17398

Open
4 tasks
mattlord opened this issue Dec 17, 2024 · 0 comments

Feature Description

Today, we have no consistency guarantees around VSchema writes. We only offer an API for writing the entire object, so if there are any concurrent read-modify-write cycles, you can lose intermediate writes. Given that the VSchema plays a critical role in serving queries and data, this is not an ideal situation.

You can see the general problem reported in a specific context here: #15794

We already support linearizing writes using the topo key version. For example, in the etcd implementation:

// Update is part of the topo.Conn interface.
func (s *Server) Update(ctx context.Context, filePath string, contents []byte, version topo.Version) (topo.Version, error) {
    nodePath := path.Join(s.root, filePath)
    if version != nil {
        // We have to do a transaction. This means: if the
        // current file revision is what we expect, save it.
        txnresp, err := s.cli.Txn(ctx).
            If(clientv3.Compare(clientv3.ModRevision(nodePath), "=", int64(version.(EtcdVersion)))).
            Then(clientv3.OpPut(nodePath, string(contents))).
            Commit()
        if err != nil {
            return nil, convertError(err, nodePath)
        }
        if !txnresp.Succeeded {
            return nil, topo.NewError(topo.BadVersion, nodePath)
        }
        return EtcdVersion(txnresp.Header.Revision), nil
    }
    // No version specified. We can use a simple unconditional Put.
    resp, err := s.cli.Put(ctx, nodePath, string(contents))
    if err != nil {
        return nil, convertError(err, nodePath)
    }
    return EtcdVersion(resp.Header.Revision), nil
}
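To make the version check concrete, here is a minimal sketch of the same compare-then-write idea against an in-memory map. Everything in it (`memStore`, `errBadVersion`, the `"vschema"` path) is a hypothetical stand-in invented for illustration, not the Vitess topo or etcd client API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBadVersion stands in for topo.BadVersion in this sketch (assumption).
var errBadVersion = errors.New("bad version")

// memStore is a hypothetical in-memory stand-in for a topo backend.
type memStore struct {
	mu       sync.Mutex
	data     map[string][]byte
	versions map[string]int64
}

func newMemStore() *memStore {
	return &memStore{data: map[string][]byte{}, versions: map[string]int64{}}
}

// Get returns the contents together with the version they were read at.
func (s *memStore) Get(path string) ([]byte, int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.data[path], s.versions[path]
}

// Update writes contents. If expected is non-nil, the write only succeeds
// when the stored version still equals *expected; otherwise it is rejected.
func (s *memStore) Update(path string, contents []byte, expected *int64) (int64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if expected != nil && s.versions[path] != *expected {
		return 0, errBadVersion
	}
	s.data[path] = contents
	s.versions[path]++
	return s.versions[path], nil
}

func main() {
	s := newMemStore()
	s.Update("vschema", []byte("v0"), nil) // unconditional initial write

	// Two writers read the same version before either one writes.
	_, seenByA := s.Get("vschema")
	_, seenByB := s.Get("vschema")

	_, errA := s.Update("vschema", []byte("from A"), &seenByA)
	_, errB := s.Update("vschema", []byte("from B"), &seenByB)

	fmt.Println("A:", errA) // A: <nil>
	fmt.Println("B:", errB) // B: bad version
}
```

Writer B has to re-read and reapply its change instead of silently overwriting A's write, which is the property the versioned `Update` path provides.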

This is used today for Keyspaces, Shards, and Tablets via KeyspaceInfo, ShardInfo, and TabletInfo respectively, which wrap the Keyspace, Shard, and Tablet records and store the version of the key that was read from the topo server. Taking Keyspace as an example:

  • KeyspaceInfo:
    // KeyspaceInfo is a meta struct that contains metadata to give the
    // data more context and convenience. This is the main way we interact
    // with a keyspace.
    type KeyspaceInfo struct {
        keyspace string
        version  Version
        *topodatapb.Keyspace
    }
  • The version set on read:
    // GetKeyspace reads the given keyspace and returns it
    func (ts *Server) GetKeyspace(ctx context.Context, keyspace string) (*KeyspaceInfo, error) {
        if ctx.Err() != nil {
            return nil, ctx.Err()
        }
        if err := ValidateKeyspaceName(keyspace); err != nil {
            return nil, vterrors.Wrapf(err, "GetKeyspace: %s", err)
        }
        keyspacePath := path.Join(KeyspacesPath, keyspace, KeyspaceFile)
        data, version, err := ts.globalCell.Get(ctx, keyspacePath)
        if err != nil {
            return nil, err
        }
        k := &topodatapb.Keyspace{}
        if err = k.UnmarshalVT(data); err != nil {
            return nil, vterrors.Wrap(err, "bad keyspace data")
        }
        return &KeyspaceInfo{
            keyspace: keyspace,
            version:  version,
            Keyspace: k,
        }, nil
    }
  • The version is used on write to ensure that we linearize the writes and can only update the latest/current version (so we do not lose intermediate changes):
    // UpdateKeyspace updates the keyspace data. It checks the keyspace is locked.
    func (ts *Server) UpdateKeyspace(ctx context.Context, ki *KeyspaceInfo) error {
        if ctx.Err() != nil {
            return ctx.Err()
        }
        // make sure it is locked first
        if err := CheckKeyspaceLocked(ctx, ki.keyspace); err != nil {
            return err
        }
        data, err := ki.Keyspace.MarshalVT()
        if err != nil {
            return err
        }
        keyspacePath := path.Join(KeyspacesPath, ki.keyspace, KeyspaceFile)
        version, err := ts.globalCell.Update(ctx, keyspacePath, data, ki.version)
        if err != nil {
            return err
        }
        ki.version = version
        event.Dispatch(&events.KeyspaceChange{
            KeyspaceName: ki.keyspace,
            Keyspace:     ki.Keyspace,
            Status:       "updated",
        })
        return nil
    }

We do not, however, use this same mechanism for VSchemas today. Workflow-related commands can come from both systems and humans, humans can run the vtctldclient VSchema commands (GetVSchema and ApplyVSchema), systems can call the VSchema-related RPCs, and vtgates can do the same via their VSchema SQL interface. All of these read-modify-write cycles can happen concurrently, with no mechanism to ensure consistency by linearizing the writes (that is, applying all writes in a sequential order, without losing intermediate ones or going back in logical time) so that you can only update the current/latest VSchema. This can lead to undefined behavior, which in turn can lead to major failures and downtime.
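The lost-write scenario can be replayed deterministically. The sketch below assumes a store that, like the current VSchema API, only supports unconditional whole-object writes; `vschemaStore` and the table names are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// vschemaStore is a hypothetical store that only supports reading and
// unconditionally replacing the whole object, with no version check.
type vschemaStore struct {
	tables []string
}

// read returns a copy of the full table list.
func (s *vschemaStore) read() []string {
	return append([]string(nil), s.tables...)
}

// write replaces the whole object, whatever its current state is.
func (s *vschemaStore) write(tables []string) {
	s.tables = append([]string(nil), tables...)
}

// lostUpdate replays one unlucky interleaving of two read-modify-write
// cycles and returns the final stored state.
func lostUpdate() []string {
	s := &vschemaStore{tables: []string{"customer"}}

	a := s.read() // writer A reads
	b := s.read() // writer B reads the same state

	a = append(a, "orders")
	s.write(a) // A's write lands first

	b = append(b, "payments")
	s.write(b) // B overwrites A's write without ever seeing it

	return s.read()
}

func main() {
	fmt.Println(lostUpdate()) // [customer payments]
}
```

Both writers succeeded from their own point of view, yet the `orders` table added by writer A is gone from the final state.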

We should fix this in all topo server implementations (etcd, consul, and zookeeper all support versions).

This work will involve several pieces:

  • Add versioning to the low level topo interface for VSchemas using the existing Info struct model #17399
  • Add an interface for modifying discrete parts of the VSchema via concrete actions: AddTable, AddVindex, RemoveTable, etc.
  • Provide a replacement or alternative so that we can transition away from the one-shot ApplyVSchema command. If we keep it long term, find a way to make it linearizable and consistent as well
  • Ensure that SrvVSchema management is also linearizable (it contains copies of the keyspaces' VSchema protos)
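
As a rough illustration of how the first two pieces could fit together, the sketch below pairs an Info-style wrapper with a discrete `addTable` action that retries on a version conflict. It is not a proposed Vitess API: `vschemaInfo`, `topoStub`, `addTable`, the retry limit, and the map-based `vschema` type are all assumptions made for this example, and the stub holds a single keyspace.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBadVersion stands in for topo.BadVersion (assumption).
var errBadVersion = errors.New("bad version")

// vschema is a drastically simplified stand-in for the VSchema proto.
type vschema struct {
	Tables map[string]bool
}

// vschemaInfo follows the KeyspaceInfo pattern: the record plus the
// version it was read at. The name is hypothetical.
type vschemaInfo struct {
	keyspace string
	version  int64
	vschema
}

// topoStub is a hypothetical versioned store holding one keyspace's VSchema.
type topoStub struct {
	mu      sync.Mutex
	version int64
	tables  map[string]bool
}

func copyTables(in map[string]bool) map[string]bool {
	out := make(map[string]bool, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// getVSchema returns a private copy of the record and records its version.
func (ts *topoStub) getVSchema(keyspace string) *vschemaInfo {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	return &vschemaInfo{keyspace: keyspace, version: ts.version, vschema: vschema{Tables: copyTables(ts.tables)}}
}

// updateVSchema only accepts a write based on the current version.
func (ts *topoStub) updateVSchema(vi *vschemaInfo) error {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	if vi.version != ts.version {
		return errBadVersion
	}
	ts.tables = copyTables(vi.Tables)
	ts.version++
	vi.version = ts.version
	return nil
}

// addTable is a discrete action: re-read, apply one change, and retry if
// another writer got in first. The retry limit of 5 is arbitrary.
func addTable(ts *topoStub, keyspace, table string) error {
	for attempt := 0; attempt < 5; attempt++ {
		vi := ts.getVSchema(keyspace)
		vi.Tables[table] = true
		err := ts.updateVSchema(vi)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errBadVersion) {
			return err
		}
	}
	return fmt.Errorf("addTable %s: too many version conflicts", table)
}

func main() {
	ts := &topoStub{tables: map[string]bool{"customer": true}}
	var wg sync.WaitGroup
	for _, name := range []string{"orders", "payments"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			if err := addTable(ts, "commerce", n); err != nil {
				fmt.Println("error:", err)
			}
		}(name)
	}
	wg.Wait()
	fmt.Println(len(ts.getVSchema("commerce").Tables)) // 3
}
```

Because each action re-reads before applying its single change, a conflicting write causes a retry rather than a lost update, and both concurrently added tables survive.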

Use Case(s)

Removing a sharp edge in Vitess cluster management.
