Add multi-database support to cluster mode #1671

Open
wants to merge 5 commits into base: unstable
Conversation

@xbasel (Member) commented Feb 5, 2025

This commit introduces multi-database support in cluster mode while maintaining backward compatibility and requiring no API changes. Key features include:

  • Database-agnostic hashing: The hashing algorithm is unchanged. Identical keys map to the same slot across all databases. No changes to slot calculation. This ensures consistency in key distribution and maintains compatibility with existing single-database setups.

  • Implementation is fully backward compatible with no API changes.

  • The core structure remains an array of databases, each containing a list of hashtables (one per slot).

Cluster management commands are global commands, except for GETKEYSINSLOT and COUNTKEYSINSLOT, which run in selected-DB context.

The MIGRATE command operates in the selected-DB context. Note that the MIGRATE parameter destination-db is honored: keys can be migrated to a different database on the target node, as in non-cluster mode.
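As an illustration of the destination-db semantics described above, the following sketch builds the argument list for a MIGRATE call in its documented `MIGRATE host port "" destination-db timeout KEYS key [key ...]` form. The helper name and values are hypothetical; this is not code from the PR.

```python
def build_migrate_command(host, port, destination_db, timeout_ms, keys):
    """Build the raw argument list for a MIGRATE call that sends `keys`
    to `destination_db` on the target node. Illustrative only."""
    # When the KEYS variant is used, the single-key argument slot must be
    # an empty string placeholder.
    return ["MIGRATE", host, str(port), "", str(destination_db),
            str(timeout_ms), "KEYS", *keys]

cmd = build_migrate_command("10.0.0.2", 6379, 3, 5000, ["user:1", "user:2"])
print(" ".join(cmd))
```

Because the hashing algorithm is database-agnostic, the same keys land in the same slot regardless of which destination-db they are sent to.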

Slot migration process changes when multiple databases are used:

    Iterate through all databases:
        SELECT database
        keys = GETKEYSINSLOT
        MIGRATE source target keys
Valkey-cli has been updated to support resharding across all databases.

#1319

Signed-off-by: xbasel <[email protected]>

codecov bot commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 89.23077% with 7 lines in your changes missing coverage. Please review.

Project coverage is 71.12%. Comparing base (2eac2cc) to head (0cf9b2d).
Report is 32 commits behind head on unstable.

Files with missing lines Patch % Lines
src/valkey-cli.c 80.64% 6 Missing ⚠️
src/cluster.c 88.88% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1671      +/-   ##
============================================
+ Coverage     70.97%   71.12%   +0.14%     
============================================
  Files           121      123       +2     
  Lines         65238    65543     +305     
============================================
+ Hits          46305    46619     +314     
+ Misses        18933    18924       -9     
Files with missing lines Coverage Δ
src/cluster_legacy.c 86.24% <100.00%> (+0.34%) ⬆️
src/config.c 78.35% <ø> (-0.06%) ⬇️
src/db.c 89.94% <ø> (+0.37%) ⬆️
src/valkey-benchmark.c 61.83% <ø> (+1.69%) ⬆️
src/cluster.c 89.17% <88.88%> (-0.07%) ⬇️
src/valkey-cli.c 56.10% <80.64%> (+0.22%) ⬆️

... and 33 files with indirect coverage changes

@xbasel xbasel marked this pull request as ready for review February 10, 2025 21:37
@xbasel xbasel requested a review from zuiderkwast February 10, 2025 22:13
@@ -1728,12 +1714,6 @@ void swapMainDbWithTempDb(serverDb *tempDb) {
void swapdbCommand(client *c) {
int id1, id2;

/* Not allowed in cluster mode: we have just DB 0 there. */


Would that be enough for SWAPDB to work in cluster mode? What would happen in a setup with 2 shards, each responsible for half of the slots in the DBs?

@xbasel (Member Author) Feb 11, 2025

With this implementation SWAPDB must be executed on all primary nodes. There are three options:

  1. Allow SWAPDB and shift responsibility to the user – risky and non-atomic; it can cause temporary inconsistency and data corruption, and would need strong warnings.
  2. Keep SWAPDB disabled in cluster mode – safest; avoids inconsistency.
  3. Make SWAPDB cluster-wide and atomic – complex, and its feasibility is unclear.

I think option 2 is the safest bet. @JoBeR007 wdyt?

Contributor

Is SWAPDB replicated as a single command? Then it's atomic.

If it's risky, it's risky in standalone mode with replicas too, right?

I think we can allow it. Swapping the data can only be done in some non-realtime workloads anyway I think.


I think the replication risk and the risk of having to execute SWAPDB on all primary nodes are unrelated: as a user you can't control the first, but the user is the main source of risk in the second case.
I would keep SWAPDB disabled in cluster mode, if we decide to continue with this implementation.

Contributor

In cluster mode, consistency is per slot.

Member Author

> Is SWAPDB replicated as a single command? Then it's atomic.
>
> If it's risky, it's risky in standalone mode with replicas too, right?
>
> I think we can allow it. Swapping the data can only be done in some non-realtime workloads anyway I think.

I don’t think it’s very risky with standalone replicas. The only downside is that if SWAPDB propagation to the replica takes time, a client might still access the wrong database; at least the client won’t be able to modify the wrong database, as replicas are read-only.
In cluster mode, the same (logical) DB can end up as DB0 on one node and DB1 on another, but similar issues already exist today: FLUSHDB on one node doesn’t clear the entire DB, since data exists in other slots/nodes. But as you said, consistency is per slot.

Contributor

Yes, FLUSHDB is very similar in this regard. If a failover happens just before this command has been propagated to replicas, it's a big thing, but it's no surprise I think. The client can use WAIT or check replication offset to make sure the FLUSHDB or SWAPDB was successful on the replicas.
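The WAIT pattern mentioned above can be sketched as a command sequence: issue the destructive command, then WAIT to block until it has reached a given number of replicas. The helper name is hypothetical and this is only an illustration of the client-side pattern, not project code.

```python
def swapdb_with_wait(db1, db2, numreplicas, timeout_ms):
    """Return the command pair a client could issue so that SWAPDB is
    confirmed on `numreplicas` replicas before proceeding. Illustrative."""
    return [
        ["SWAPDB", str(db1), str(db2)],
        # WAIT numreplicas timeout: blocks until the preceding write is
        # acknowledged by that many replicas, or the timeout elapses.
        ["WAIT", str(numreplicas), str(timeout_ms)],
    ]

for step in swapdb_with_wait(0, 1, 1, 1000):
    print(" ".join(step))
```

The same pattern applies to FLUSHDB; in cluster mode the client would need to run it against every primary.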

@soloestoy soloestoy requested review from soloestoy and removed request for zuiderkwast February 12, 2025 06:28