-
Notifications
You must be signed in to change notification settings - Fork 702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NEW] Support for Empty SET #68
Comments
This is another feature which had quite a strong demand redis/redis#6048 . I think we should make a decision to distinguish between empty vs NULL for data type like set. @valkey-io/core-team |
I am looking into this issue |
I'm skeptical. If it had been like that from the beginning that commands like SPOP leaves an empty set, then that would have been much better. Then SPOP, HDEL, ZREM and other commands would just not delete a key. Only DEL would delete a key. But for historical reasons, we have what we have. Adding empty sets now and at the same keeping the default behavior of deleting the key when it becomes empty has problems:
Regarding alternatives, maybe a new config to make commands not automatically delete empty keys is a good idea? At first sight it seems to be a very big semantical change, but in practice maybe it doesn't actually break many applications? I'm just speculating, but if we allow users to test it and evaluate it, perhaps we can make a more informed decision later. Are there any other alternatives? |
I have discussed with @hpatro yesterday, I think there could be two approaches.
|
In terms of SMEMBERS behavior, We probably don't want to modify SMEMBERS to return null for a non-existent key and an empty array for an empty set, as each command should have a single, clear purpose. Its role is to retrieve the members of a set, not to determine whether the key exists. |
Does it make sense to start by adding one new command I think I think For RDB - using |
Thanks @murphyjacob4 I considered introducing new commands, but IMO, there are potential downsides:
|
The proposal makes sense to me, but I share the skepticism raised earlier by @zuiderkwast. The
I have a doubt about the usability of "Client capa" also doesn’t seem quite right here, as it involves global/server-wide behavior changes. What happens if one client expects empty sets while another does not? This feels problematic. Server configurations might fare a bit better than "client capa," but even then, altering existing behavior in this way feels too drastic and potentially messy, because the change proposed shouldn't be limited to SET but applicable to pretty much all data structures (other than strings). |
I often hear similar requests, not only for I am wondering if there could be an alternative solution. For instance, we could set attributes for the key, allowing the user to specify that the key can be empty by using a command like |
@soloestoy Thanks for the suggestion, |
I don’t think this will lead to memory and RDB expansion. It only requires 1 bit to indicate whether empty is allowed. There is enough space in the struct serverObject {
unsigned type : 4;
unsigned encoding : 4;
unsigned lru : LRU_BITS; /* LRU time (relative to global lru_clock) or
* LFU data (least significant 8 bits frequency
* and most significant 16 bits access time). */
int refcount;
void *ptr;
}; for example, encoding can be compressed to 3 bits (provided that we need to utilize the type and categorize the encoding, allowing for 8 encodings (3 bits) under one type, instead of the current setup with only 16 encodings (4 bits) without distinguishing types). This way, the extra 1 bit can indicate whether empty is allowed. |
@soloestoy Do we want to do this at a key level or provide a config at server level to flip when the application is ready to deal with empty object(s)? My initial suggestion to @sungming2 was to explore |
This is too big a change to be gated by a server config IMO. Also I feel that it is a bit odd to require the clients inquire a server config in order to reason about, essentially, the protocol level behavior. I like @soloestoy's per object attribute idea. I think it is extensible too for other similar changes. We would need an |
Key attributes would be a new level of complexity too. I don't love this idea either. Out of the suggestions seen so far, the one I dislike least is to add new command syntax. But I don't think we need all the proposed commands and the "MT" suffix looks cryptic. A command like The ones I think seem useful and not too ugly:
|
Introducing new commands was the initial proposal by @nmvk / @madolson. There are two challenges with it.
|
I think maybe we can do |
I don't think |
Yes, I think everything can be a follow up. I like the discussion to cover all use cases to get a complete picture, but then narrow it down as much as possible.
In many cases, it can be done with just a new parameter to an existing command. That doesn't create a whole new jungle of commands. For lists, we'd come far with only these:
I'm not sure there are good uses cases for empty lists though. They're not used in the same way as sets. Similarly, TTL for list elements probably don't make sense. We should base these additions on real use cases and not just add everything based on symmetry. If we start off with the bare minimum for sets, it could be only one optional parameter for SPOP ( |
I just take look this issue, I think this is a huge change if empty set is allowed. We have 4 important datatype, Set, List, Hash, Sorted Set, (here we do not consider the String), and there are similar commands among these 4 datatypes: Set: SADD key member [member ...] List: RPUSH key element [element ...] Hash: HSET key field value [field value ...] Sorted Set: ZADD key [NX | XX] [GT | LT] [CH] [INCR] score member [score member Let's clarify our plan.
|
I think SREM is a little higher demand for developers. I thought about the use cases. Use Case: Like postExample:
In scenarios managing likes for a post, SREM is often used to remove users from a set of "likes" Use Case: Caching SQL query resultsExample:
If the query returns no results, the application must either avoid storing the result or insert a dummy value to represent the absence of data (just create an empty set). |
Not really. I think we would also need to provide a mechanism to create an empty set. @zuiderkwast |
@sungming2 Thanks for the use cases. It seems we need a new variant for SREM then. To cache an empty result, it seems we need to create an empty set. I just got an idea. The variadic SADD is used for caching all the elements at once, so what if we allow SADD without any elements to create an empty set? We make all |
Yeah, this is part of the proposal above. |
However, I want us to clearly think and vote if we want it at a individual container level or empty object as a primitive for all containers in the system. |
I personally think I will vote individual container level..
|
This seems like a good idea and can be extended to List, Sorted Set, and Hash too if needed. If a container is created empty, we could also stipulate that it be retained after the last member is removed. |
@hpatro Ahh,
@PingXie That would require some attribute or metadata on the key, which I don't like to introduce. I'd rather add |
😀 if SADD already allows an empty set to be created then I agree we need a more explicit handshake like the new command you suggested (btw what does MT stand for?) otherwise it will still be a breaking change (to infer the keep-empty intent) |
EMPTY |
In US English, you don't pronounce the P in empty, so it sounds like em-tee, MT? In UK English, I believe the P is pronounced. That's what the audio samples on wiktionary suggest: https://en.wiktionary.org/wiki/empty |
We also just had this conversation that
One thing I hear fairly often is people do not like that streams have attributes which make them much harder to reason about. I'm in agreement with @zuiderkwast that let's not add attributes.
I'm more convinced about just supporting this to see if people use it, and what was also sort of what I was suggesting @nmvk do. I'm not really sure empty lists make sense, but having an empty command for set, hash, and sorted set seems doable. In short, I'm against adding a lot of complexity for a feature we don't have a lot of validation is useful yet. |
Yes, about SET with the GET argument. The GET argument completely changes the return value, which is convoluted and confusing. The other arguments aren't, IMHO. |
I like
For the record, I am NOT proposing to extend "empty container" support to all data structures. There is, however, a key difference in whether a design is extensible or not, so we don't end up with a bunch of ad hoc solutions. "KEEPEMPTY" checks that "extensibility" mark for me. |
I'm not a native speaker, but I don't think the idea of the "P" being silent is accurate. :) For that matter, I don't think we should go with "KEEPMT." |
I wonder if we can add What about combining config-based empty set support with
|
[core team discussion]
|
@PingXie Was there any conclusion? It feels like all of the ideas are not ideal. |
No decisions (we didn't have quorum yesterday anyways). The least bad option right now seems to be new ad-hoc commands. |
From what I've seen - I think the way many people who require empty sets are implementing this is by adding some kind of sentinel value - I.e. Could we use similar semantics in the engine? Basically - all sets would be "non-empty sets" unless the user comes and specifies With this option, we only need one command total per data type, and we should be able to meet all use cases I can see. We can optionally add more commands, but I don't know if they are necessarily needed now:
I guess semantics would be similar to a container attribute, but I think I find this syntax is more friendly, and maintainability doesn't seem too bad given the limited blast radius (just one command, only need to add it to the data types we want to support). Plus it has the benefit of being similar to what users are already doing with sentinels, but with some syntactical sugar. Curious what others think, I'm not super attached to naming so open to naming suggestions as well. |
I think the semantics change, rather than the syntax, is at the core of the concerns as it modifies the behavior of a well-known data structure. |
I've added this issue to the agenda for 1/20. Hope we can reach to consensus on how to proceed further on this. @valkey-io/core-team |
Before we propose some solutions, I would like to confirm, the EMPTY SET feature will apply to key level, datatype level, specific client level or global server level? |
Typically everything is done at the key level within Redis, but we could choose to do something else. |
Current thinking is to defer the implementation of this feature until we are done with #640. Since hash field expiration is a much more highly requested feature, let's make sure that we are implementing it correctly. |
The problem/use-case that the feature addresses
Currently, a key is deleted when the set is empty. Distinguishing between an empty set and a key that does not exist offers many advantages. For example, if we store SQL results in a set, an empty set can represent that a query was executed but the result was empty, whereas null might imply that we need to make a database call to store in the set.
Description of the feature
Without breaking any backward compatibilities we scan support the Empty set with set of new commands and arguments
Write commands
SADD key
Existing SADD would be modified to make members as optional, invoking SADD with only key will create an empty set.
SPOP key [count] MT
→ Additional argumentAdd new optional argument to retain the SET if no element exist.
SREMMT key member [member ...]
→ [New Command]Same as SREM would keep the empty set once all members are removed.
SMOVEMT
,SINTERSTOREMT
,SDIFFSTOREMT
, andSUNIONSTOREMT
would be implemented as followups if needed. One can delete the empty set usingDEL
command.Read commands
SCARD
would still return 0 for empty set. To distinguish empty set from a non existing key one can useEXISTS
followed bySCARD
. Same can be approach for theSISMEMBER
andSMISMEMBER
.SMEMBERSMT key
→ [New Command]Similar to
SMEMBERS
but will return null when key does not exist. This command can be achieved by usingEXISTS
+SMEMBERS
but would be supported for ease of use.Alternatives you've considered
Provide a new param to change the default behavior.
The text was updated successfully, but these errors were encountered: