[NEW] Add scripting languages (for EVAL, etc.) using module API #1261
Comments
This will be a good addition to Valkey, providing the underlying abstraction to support new engines easily. A few questions come to mind, some of which were discussed in the weekly meeting:
|
One cannot easily create parallel Lua engines. This is due to symbol collisions with the statically linked Lua, discussed here and here. Disabling the Lua support makes this straightforward, at the expense of losing the built-in Lua. Otherwise, module implementors must carefully move their Lua symbols so they do not collide (I never tried this). But then they can't use system-installed Lua libraries in their modules (maybe that's fine). At least once it is figured out for one Lua module, it will be figured out for all. Then it's a documentation issue =). Valkey could rename its Lua symbols, since it is building and statically linking Lua from source. I'm not sure what that might break elsewhere. I haven't tried this since 2016, but since I just upgraded the |
Yes.
What do you mean by host? Officially support or vendor? It's not impossible. I have no answer.
ModJS adds a new command EVALJS. Modules can always add their own commands, but the idea here is to provide an API for modules to hook in to EVAL and FUNCTIONs. We can extend this to triggers or events of some sort. Scripts without a shebang are Lua scripts for backward compatibility, at least by default. We could add a config to change the default engine though, but I imagine that all other languages will use a shebang.
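The engine selection described here can be sketched in a few lines of C. This is only an illustration under stated assumptions: `script_engine_name` is a hypothetical helper, not a Valkey function, and the real parser also handles flags and validation that are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Valkey: copies the engine name from a
 * script's shebang line into `out`. Scripts that do not start with "#!"
 * fall back to "lua", mirroring the backward-compatible default.
 * Returns 0 on success, -1 if the name is empty or does not fit in `out`. */
int script_engine_name(const char *code, char *out, size_t out_size) {
    assert(code != NULL && out != NULL);

    const char *name = "lua"; /* default engine when there is no shebang */
    size_t len = 3;

    if (strncmp(code, "#!", 2) == 0) {
        name = code + 2;
        /* The name ends at the first blank or line break; flags may follow. */
        len = strcspn(name, " \t\r\n");
        if (len == 0) return -1;
    }
    if (len + 1 > out_size) return -1;

    memcpy(out, name, len);
    out[len] = '\0';
    return 0;
}
```

A configurable default engine, as suggested above, would only change the initial value of `name`.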
They should be able to take advantage of the framework provided by EVAL + EVALSHA + SCRIPT LOAD, FUNCTION LOAD + FCALL, etc. There's a difference between scripts and functions. Scripts are part of an application and are written by the application developers, while functions are assumed to be installed by a database admin. The caller of the function doesn't need to know which language the function was written in.
Yes. Do you have a suggestion? INFO? A new subcommand of FUNCTION or SCRIPT? |
@neomantra We discussed this, but we were not sure why. Dynamically linked symbols don't collide with statically linked symbols, do they? Does Lua itself use dynamic linking for its modules? Worst case, we can only have one Lua at a time. 😢 |
Yeah, I was trying to see what Valkey's stance is on supporting other scripting engines. |
Yes, since they use the exact same symbol names and the linker can't disambiguate. LuaJIT is intended to be a drop-in replacement for Lua 5.1. A Lua 5.4 load would be similar. There are common names like
I had a bit of a conversation with Claude (for this chat, "better" IMO than ChatGPT) on how to get around it, and it is tricky and platform-specific. I don't have a sharing account, but it wasn't something I could do in 2016. Not sure if it is gauche to share prompts:
I did test this on ARM/OSX versus x64/Linux and got segfaults on both. I realized I quoted the wrong section earlier and my reply was meant to suggest you shouldn't make this a goal:
|
@zuiderkwast I would like to implement the WASM engine using this approach. I can include the module API changes as part of the work I'm doing with WASM. |
@rjd15372 sounds great, but I'd prefer a separate PR for only the module API and a dummy engine module for testing that just returns the script code back or something. |
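A dummy engine of the kind suggested here could look roughly like the sketch below. It is self-contained and uses simplified stand-in types (`DummyCompiledFunction` and the `dummy_*` functions are hypothetical names), so it does not match the real module API signatures; it only illustrates the idea of "compiling" by storing the source and "executing" by returning it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a compiled-function record; the real type would
 * come from the module API header. */
typedef struct DummyCompiledFunction {
    char *name;     /* function name (unused for EVAL-style scripts) */
    void *function; /* opaque "compiled code"; here, a copy of the source */
} DummyCompiledFunction;

/* "Compiles" a script by copying its source. Returns an array holding one
 * function, or NULL on allocation failure; *out_num receives the count. */
DummyCompiledFunction **dummy_compile_code(const char *code, size_t *out_num) {
    assert(code != NULL && out_num != NULL);
    *out_num = 0;

    size_t len = strlen(code) + 1;
    DummyCompiledFunction **list = malloc(sizeof(*list));
    DummyCompiledFunction *fn = malloc(sizeof(*fn));
    char *copy = malloc(len);
    if (!list || !fn || !copy) {
        free(list);
        free(fn);
        free(copy);
        return NULL;
    }

    memcpy(copy, code, len);
    fn->name = NULL;
    fn->function = copy;
    list[0] = fn;
    *out_num = 1;
    return list;
}

/* "Executes" a function: the result is simply the stored source text. */
const char *dummy_call_function(const DummyCompiledFunction *fn) {
    return fn->function;
}

/* Releases one compiled function and the source copy it owns. */
void dummy_free_function(DummyCompiledFunction *fn) {
    free(fn->function);
    free(fn);
}
```

Because the output equals the input, a test can check the whole load, call and free path without depending on any language runtime.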
@zuiderkwast sure, I wasn't implying that all the work would be in a single PR. I was thinking along the same lines as you. |
@zuiderkwast @madolson I opened a PR #1277 with the changes to the module API. |
I am generally aligned with the proposal of extending scripting language support via modules. I think it strikes a good balance between extensibility and complexity. |
Great questions, @hpatro!
Yes but there should be one "inbox" engine - the current Lua one. All others will come in via the modules
If by "first party" you meant "inbox", I think there should be one and only, i.e., the current Lua engine. Others will be shipped out of band via modules.
Makes sense.
This would be bad coupling.
Can we encode the version in the shebang as well? |
@zuiderkwast what is your thought on supporting other scripting languages in FUNCTION? |
btw, we should capture the details in an RFC once we wrap up the discussion, I think. |
My thought is that it's a good idea and already implemented in #1277. But EVAL is very easy to use and preparations for that are started in #1312. I'd like a scripting engine module to provide FUNCTION and EVAL. It doesn't seem that hard to achieve both. |
The main languages/engines I can imagine being used are WASM and JavaScript (e.g. V8), because they are very well sandboxed by design, while Lua isn't. I can see other Lua versions provided by modules too, like LuaJIT, Lua 5.4 and Luau. For the Lua variants, it would be nice to allow a module to be the default engine if built-in Lua is disabled, so that the many applications written for regular Lua can benefit without modification. |
I do not object to supporting more scripting languages, but all of them should come via modules, even WASM in the future. We had better keep only Lua in the core, but we could give users and developers an option to enable or disable it. |
Yes, module API for all new languages. We already agreed about this in the meeting some weeks ago. |
These will be the module API additions after merging the PRs:
To register a scripting engine, the module should call the following function:
/* Registers a new scripting engine in the server.
*
* - `engine_name`: the name of the scripting engine. This name will match
* against the engine name specified in the script header using a shebang.
*
* - `engine_ctx`: engine specific context pointer.
*
* - `engine_methods`: the struct with the scripting engine callback functions
* pointers.
*/
int ValkeyModule_RegisterScriptingEngine(ValkeyModuleCtx *ctx,
const char *engine_name,
ValkeyModuleScriptingEngineCtx *engine_ctx,
ValkeyModuleScriptingEngineMethods *engine_methods);
The ValkeyModuleScriptingEngineMethods struct holds the engine's callback function pointers:
typedef struct ValkeyModuleScriptingEngineMethods {
/* Compile code function callback. When a new script is loaded, this
* callback is called with the script code; it compiles the code and returns
* a list of `ValkeyModuleScriptingEngineCompiledFunction` objects. */
ValkeyModuleScriptingEngineCompileCodeFunc compile_code;
/* The callback function called when the `EVAL` or `FCALL` command is called
* on a function registered in this engine. */
ValkeyModuleScriptingEngineCallFunctionFunc call_function;
/* The callback function used to reset the runtime environment used
* by the scripting engine for EVAL scripts. */
ValkeyModuleScriptingEngineResetEvalEnvFunc reset_eval_env;
/* Function callback to free the memory of a registered engine function. */
ValkeyModuleScriptingEngineFreeFunctionFunc free_function;
/* Function callback to return memory overhead for a given function. */
ValkeyModuleScriptingEngineGetFunctionMemoryOverheadFunc get_function_memory_overhead;
/* Function callback to get current used memory by the engine. */
ValkeyModuleScriptingEngineGetUsedMemoryFunc get_used_memory;
/* Function callback to return memory overhead of the engine. */
ValkeyModuleScriptingEngineGetEngineMemoryOverheadFunc get_engine_memory_overhead;
} ValkeyModuleScriptingEngineMethods;
Each of the above function pointer types, plus the parameter types they use, is defined as follows:
/* This struct represents a scripting engine function that results from the
* compilation of a script by the engine implementation.
*/
typedef struct ValkeyModuleScriptingEngineCompiledFunction {
char *name; /* Function name */
size_t name_len; /* The length of the function name string */
void *function; /* Opaque object representing a function, usually it's
the function compiled code. */
char *desc; /* Function description */
size_t desc_len; /* The length of the description string */
uint64_t f_flags; /* Function flags */
} ValkeyModuleScriptingEngineCompiledFunction;
typedef enum ValkeyModuleScriptingEngineSubsystemType {
VMSE_EVAL,
VMSE_FUNCTION,
VMSE_ALL
} ValkeyModuleScriptingEngineSubsystemType;
/* The callback function called when either `EVAL`, `SCRIPT LOAD`, or
* `FUNCTION LOAD` command is called to compile the code.
* This callback function evaluates the source code passed and produces a list
* of pointers to the compiled functions structure.
* In the `EVAL` and `SCRIPT LOAD` case, the list only contains a single
* function.
* In the `FUNCTION LOAD` case, there are as many functions as there are calls
* to the `server.register_function` function in the source code.
*
* - `engine_ctx`: the engine specific context pointer.
*
* - `type`: the subsystem type. Either EVAL or FUNCTION.
*
* - `code`: string pointer to the source code.
*
* - `timeout`: timeout for the library creation (0 for no timeout).
*
* - `out_num_compiled_functions`: out param with the number of objects
* returned by this function.
*
* - `err` - out param with the description of error (if occurred).
*
* Returns an array of compiled function objects, or `NULL` if some error
* occurred.
*/
typedef ValkeyModuleScriptingEngineCompiledFunction **(*ValkeyModuleScriptingEngineCompileCodeFunc)(
ValkeyModuleScriptingEngineCtx *engine_ctx,
ValkeyModuleScriptingEngineSubsystemType type,
const char *code,
size_t timeout,
size_t *out_num_compiled_functions,
char **err);
/* The callback function called when either `EVAL`, or `FCALL`, command is
* called.
* This callback function executes the `compiled_function` code.
*
* - `module_ctx`: the module runtime context.
*
* - `engine_ctx`: the engine specific context pointer.
*
* - `server_ctx`: the context opaque structure that represents the server-side
* runtime context for the function.
*
* - `compiled_function`: pointer to the compiled function registered by the
* engine.
*
* - `type`: the subsystem type. Either EVAL or FUNCTION.
*
* - `keys`: the array of key strings passed in the `FCALL` command.
*
* - `nkeys`: the number of elements present in the `keys` array.
*
* - `args`: the array of string arguments passed in the `FCALL` command.
*
* - `nargs`: the number of elements present in the `args` array.
*/
typedef void (*ValkeyModuleScriptingEngineCallFunctionFunc)(
ValkeyModuleCtx *module_ctx,
ValkeyModuleScriptingEngineCtx *engine_ctx,
ValkeyModuleScriptingEngineServerRuntimeCtx *server_ctx,
ValkeyModuleScriptingEngineCompiledFunction *compiled_function,
ValkeyModuleScriptingEngineSubsystemType type,
ValkeyModuleString **keys,
size_t nkeys,
ValkeyModuleString **args,
size_t nargs);
typedef struct ValkeyModuleScriptingEngineCallableLazyEvalReset {
void *context;
/*
* Callback function used for resetting the EVAL context implemented by
* an engine. This callback will be called by a background thread when it's
* ready for resetting the context.
*
* - `context`: a generic pointer to a context object, stored in the
* callableLazyEvalReset struct.
*
*/
void (*engineLazyEvalResetCallback)(void *context);
} ValkeyModuleScriptingEngineCallableLazyEvalReset;
/* The callback function called when `SCRIPT FLUSH` command is called. The
* engine should reset the runtime environment used for EVAL scripts.
*
* - `engine_ctx`: the engine specific context pointer.
*
* - `async`: if set to 1, the reset is done asynchronously through
* the callback structure returned by this function.
*/
typedef ValkeyModuleScriptingEngineCallableLazyEvalReset *(*ValkeyModuleScriptingEngineResetEvalEnvFunc)(
ValkeyModuleScriptingEngineCtx *engine_ctx,
int async);
/* Free the given function */
typedef void (*ValkeyModuleScriptingEngineFreeFunctionFunc)(
ValkeyModuleScriptingEngineCtx *engine_ctx,
ValkeyModuleScriptingEngineCompiledFunction *compiled_function,
ValkeyModuleScriptingEngineSubsystemType type);
/* Return memory overhead for a given function, such memory is not counted as
* engine memory but as general structs memory that hold different information
*/
typedef size_t (*ValkeyModuleScriptingEngineGetFunctionMemoryOverheadFunc)(
ValkeyModuleScriptingEngineCompiledFunction *compiled_function);
/* Return the current used memory by the engine.
*
* - `engine_ctx`: the engine specific context pointer.
*
* - `type`: the subsystem type.
*/
typedef size_t (*ValkeyModuleScriptingEngineGetUsedMemoryFunc)(
ValkeyModuleScriptingEngineCtx *engine_ctx,
ValkeyModuleScriptingEngineSubsystemType type);
/* Return memory overhead for engine (struct size holding the engine) */
typedef size_t (*ValkeyModuleScriptingEngineGetEngineMemoryOverheadFunc)(
ValkeyModuleScriptingEngineCtx *engine_ctx);
A scripting engine can also unregister itself from the server. For that purpose it can call the following API function:
/* Removes the scripting engine from the server.
*
* `engine_name` is the name of the scripting engine.
*
*/
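int ValkeyModule_UnregisterScriptingEngine(ValkeyModuleCtx *ctx, const char *engine_name);
To support the SCRIPT KILL and FUNCTION KILL commands, an engine can query the state of the function it is executing:
typedef enum ValkeyModuleScriptingEngineExecutionState {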
VMSE_STATE_EXECUTING,
VMSE_STATE_KILLED,
} ValkeyModuleScriptingEngineExecutionState;
/* Returns the state of the current function being executed by the scripting
* engine.
*
* `server_ctx` is the server runtime context.
*
* It will return VMSE_STATE_KILLED if the function was already killed either by
* a `SCRIPT KILL`, or `FUNCTION KILL`.
*/
ValkeyModuleScriptingEngineExecutionState ValkeyModule_GetFunctionExecutionState(
ValkeyModuleScriptingEngineServerRuntimeCtx *server_ctx); |
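One way an engine might use the execution state is to poll it periodically from its interpreter loop and stop when a kill is reported. The sketch below is self-contained and makes several assumptions: `ExecState`, `FakeRuntimeCtx`, `fake_get_execution_state` and `run_steps` are hypothetical stand-ins, with `fake_get_execution_state` playing the role of `ValkeyModule_GetFunctionExecutionState`. The polling interval is also an assumption, not something the API prescribes.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the execution state enum quoted above. */
typedef enum { VMSE_STATE_EXECUTING, VMSE_STATE_KILLED } ExecState;

/* Fake server runtime context used to simulate a kill request. */
typedef struct FakeRuntimeCtx {
    size_t polls;   /* how many times the state has been queried */
    size_t kill_at; /* poll number at which a kill is reported; 0 = never */
} FakeRuntimeCtx;

/* Stand-in for the real state query: reports KILLED from poll `kill_at` on. */
ExecState fake_get_execution_state(FakeRuntimeCtx *ctx) {
    ctx->polls++;
    if (ctx->kill_at != 0 && ctx->polls >= ctx->kill_at) return VMSE_STATE_KILLED;
    return VMSE_STATE_EXECUTING;
}

/* Runs up to `total_steps` interpreter steps, checking the execution state
 * every `check_interval` steps. Returns the number of steps executed. */
size_t run_steps(FakeRuntimeCtx *ctx, size_t total_steps, size_t check_interval) {
    assert(ctx != NULL && check_interval > 0);
    size_t done = 0;
    while (done < total_steps) {
        if (done % check_interval == 0 &&
            fake_get_execution_state(ctx) == VMSE_STATE_KILLED) {
            break; /* abort promptly once the script has been killed */
        }
        done++; /* placeholder for executing one interpreter instruction */
    }
    return done;
}
```

Checking only every few steps keeps the overhead low while still bounding how long a killed script keeps running.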
The problem/use-case that the feature addresses
Description of the feature
Looking at function.c, there is work started that should allow different "engines" for functions (FUNCTION LOAD, FCALL, etc.). For example, there is a function to register an engine. Currently, Lua is the only engine, implemented in functions_lua.c. There is some separation here.
Extend the existing modularity to the module API: ValkeyModule_RegisterScriptingEngine or similar.
The module registers a callback that is invoked for executing code in commands like EVAL, EVALSHA and FCALL.
At the beginning of an EVAL script, users can add a shebang, a line like #!lua and some optional flags or parameters, to select the scripting engine. This mechanism already exists, but currently only "lua" exists. A module should be able to provide its own languages.
To add a Lua engine in parallel to the built-in Lua implementation, the module can register with a different name like "lua5.4" or "luajit". For a module to be able to replace the default "lua" engine, the built-in Lua support needs to be disabled. For that, see #1204.
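The naming rule in the last paragraph can be illustrated with a tiny name registry. This is a minimal sketch, not the server's implementation: `EngineRegistry`, `registry_find` and `registry_add` are hypothetical, the fixed-size table is a simplification, and exact case-sensitive matching of engine names is an assumption.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENGINES 8

/* Hypothetical table of registered engine names. */
typedef struct EngineRegistry {
    const char *names[MAX_ENGINES];
    int count;
} EngineRegistry;

/* Returns the slot index of `name`, or -1 if no such engine is registered. */
int registry_find(const EngineRegistry *r, const char *name) {
    for (int i = 0; i < r->count; i++) {
        if (strcmp(r->names[i], name) == 0) return i;
    }
    return -1;
}

/* Registers an engine name. Returns 0 on success, or -1 if the name is
 * already taken or the table is full. */
int registry_add(EngineRegistry *r, const char *name) {
    assert(r != NULL && name != NULL);
    if (r->count >= MAX_ENGINES || registry_find(r, name) >= 0) return -1;
    r->names[r->count++] = name;
    return 0;
}
```

With the built-in engine holding "lua", a module can still register "luajit" or "lua5.4", but a second "lua" is rejected unless the built-in one was never registered.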
Alternatives you've considered
...
Additional information
Related discussions: