Add integration test framework #20

Open
wants to merge 6 commits into base: main
Conversation

murphyjacob4
Collaborator

In addition to the framework itself, this adds some initial tests:

  • valkey_search_integration_test.py: General functional testing of ValkeySearch
  • stability_test.py: A stress test that performs many VSS operations while performing various one-off operations (BGSAVE, FLUSHDB, etc.)

The goal is to capture the "sunny day" scenarios in the functional tests, and to discover potential edge cases and crashes with the stress tests. Given the multi-threaded nature of the ValkeySearch engine, it is common for a change to pass simple functional tests while introducing a subtle multi-threading issue that is only triggered under stress.

To run:

bazel test //testing/integration:stability_test --test_arg=--valkey_server_path=/path/to/valkey-server --test_arg=--valkey_cli_path=/path/to/valkey-cli --test_arg=--memtier_path=/path/to/memtier_benchmark --test_output=streamed

bazel test //testing/integration:vector_search_integration_test --test_arg=--valkey_server_path=/path/to/valkey-server --test_arg=--valkey_cli_path=/path/to/valkey-cli --test_output=streamed

You will need a local build of the Valkey server and the Valkey CLI for both tests, and for the stability test you will also need a local build of memtier_benchmark. We may want to consider moving to valkey-benchmark later for simplicity.

@murphyjacob4
Collaborator Author

Note that the tests are failing right now. This appears to be related to an issue with the memory allocator overrides interacting with gRPC. These are real failures: when I remove the memory allocator overrides, all tests pass. See #22 for more info.

@yairgott
Collaborator

I suggest adding a compilation flag to disable the memory allocator override specifically for the integration tests. This would allow all integration tests to pass until the root cause is identified and resolved.

Once this adjustment is made, it would be highly beneficial to expand the presubmit tests to include the integration tests. For reference, you can review how the unittests are configured to run: unittests.yml.

args: dict[str, str],
modules: dict[str, str],
password: str | None = None,
) -> subprocess.Popen[Any]:
Member

I think we should be returning an instance of a new object type here, rather than a raw subprocess.Popen. The new object would be a place to concentrate node-specific controls and behaviors, etc.

Collaborator Author

Yeah that makes sense. For now it is pretty basic, but nice to have a seam for future functionality.
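
For illustration, a minimal sketch of what such a wrapper could look like (the class name and fields here are hypothetical, not the PR's actual API):

import subprocess
from typing import Any


class ValkeyNode:
    """Thin wrapper around a valkey-server subprocess (hypothetical sketch)."""

    def __init__(self, args: list[str], port: int, node_dir: str):
        self.args = args          # Full valkey-server command line.
        self.port = port          # Port this node listens on.
        self.node_dir = node_dir  # Scratch directory holding this node's files.
        self.process: subprocess.Popen[Any] = subprocess.Popen(args)

    def terminate(self) -> None:
        # Stop the underlying server process and wait for it to exit.
        self.process.terminate()
        self.process.wait()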

Comment on lines +101 to +102
cluster_args["cluster-config-file"] = os.path.join(
node_dir, "nodes.conf"
Member

"nodes.conf" seems to be a pre-existing file, i.e., a hidden parameters to this function. I'd rather see a dict get passed in and have this function construct nodes.conf.

Collaborator Author

nodes.conf is created by the Valkey subprocess in the test temp directory. At this time we have no need to edit or create it ourselves, so I don't see why we would parameterize it. If we ever need to inspect or modify it, we can parameterize it then.

index_name:
vector_dimensions:
"""
args = [
Member

We should either make this more generic or move it to the specific test file that's using it.

Collaborator Author

Made it more generic



def create_flat_index(
r: valkey.ValkeyCluster, index_name: str, vector_dimensions: int
Member

Same as above, I don't see this as a real utility routine.

Collaborator Author

Unified the two into one create_index function with configurable mapping of attributes
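
As a rough, hedged illustration (not the exact code in this PR), a generic helper along those lines might look like the following, assuming the valkey-py client and the usual FT.CREATE SCHEMA syntax:

import valkey


def create_index(
    client: valkey.ValkeyCluster,
    index_name: str,
    attributes: dict[str, list[str]],
) -> None:
    # Build FT.CREATE <index> SCHEMA <attribute> <schema args...> for each
    # attribute, so callers control the full per-attribute schema.
    args: list[str] = ["FT.CREATE", index_name, "SCHEMA"]
    for attribute, schema_args in attributes.items():
        args.append(attribute)
        args.extend(schema_args)
    client.execute_command(*args)


# Example: a FLAT vector attribute with 3 dimensions.
# create_index(client, "my_index", {
#     "embedding": ["VECTOR", "FLAT", "6", "TYPE", "FLOAT32",
#                   "DIM", "3", "DISTANCE_METRIC", "L2"],
# })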

Member

@allenss-amazon allenss-amazon left a comment

I have several concerns about this PR. Since this is the start of the integration framework, I feel that perhaps it would be best if we had a discussion on the requirements for the framework. For example, IMO, we want a framework that will allow individual tests to be run in parallel on a big box so that testing wall-time is minimized. Other concerns are related to code layering and reuse.

@murphyjacob4
Collaborator Author

we want a framework that will allow individual tests to be run in parallel on a big box so that testing wall-time is minimized

To me this seems like an incremental improvement. Do you have any concerns with the high level approach?

Parallelization should be afforded by the Python test framework we end up using, in addition to a port-finding utility similar to what the Valkey tests use to spin up the Valkey subprocesses on individual ports.
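
For example, a minimal port-finding helper (a hedged sketch, not part of this PR) could simply ask the OS for a free port:

import socket


def find_free_port() -> int:
    # Bind to port 0 so the OS picks an unused TCP port. There is a small
    # race before valkey-server binds it, so callers should retry startup
    # if the port is taken in the meantime.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]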

Other concerns are related to code layering and reuse

If you could call these out inline I can address them. I see your comments about the index creation functions - are there any others?

Pull request closed

This seems a bit premature to me. Was this on purpose? I am going to reopen for now, unless there is a concrete alternative proposed that would make this PR unnecessary.

@murphyjacob4 murphyjacob4 reopened this Jan 28, 2025
@allenss-amazon
Member

I'm new to GitHub and wasn't sure whether closing was the right thing to do or not. So apologies...

@allenss-amazon
Member

Parallelization is an improvement. My concern is that it needs to be prioritized soon, so that the "increment" of the "incremental" doesn't become so large that it's a major stumbling block.

@allenss-amazon
Member

As for the code layering/factoring: I think that, long term, we don't want our code using the raw library types (subprocess.Popen, valkey.ValkeyCluster, etc.). Rather, I think we should have intermediate abstractions for nodes and clusters. That's because we're likely to create additional functionality for these objects that goes beyond what their underlying types support. For example, as a test developer, I'll want the ability to "restart" a node. Where would I put that code? In my mind, it should be part of this new node abstraction that I'd like to see.
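
As a hedged sketch of where such a restart could live (all names are hypothetical, not part of this PR), building on the node-wrapper idea above:

import subprocess
import time
from typing import Any

import valkey


class ValkeyNode:
    """Hypothetical node abstraction wrapping the valkey-server subprocess."""

    def __init__(self, args: list[str], port: int):
        self.args = args  # Full valkey-server command line for this node.
        self.port = port
        self.process: subprocess.Popen[Any] = subprocess.Popen(args)

    def restart(self, timeout: float = 10.0) -> None:
        # Stop the current valkey-server process, relaunch it with the same
        # arguments, then wait until it answers PING again.
        self.process.terminate()
        self.process.wait()
        self.process = subprocess.Popen(self.args)
        client = valkey.Valkey(port=self.port)
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                if client.ping():
                    return
            except valkey.exceptions.ConnectionError:
                time.sleep(0.1)
        raise TimeoutError(f"Node on port {self.port} did not come back up")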

@yairgott
Collaborator

FYI, I mitigated the memory tracking issue reported in the issue.

@murphyjacob4
Collaborator Author

I'm new to GitHub and wasn't sure whether closing was the right thing to do or not. So apologies...

No worries. Obviously it depends on the project - but the general etiquette is to close PRs when they are obsolete, the owner is unresponsive, or the request is not aligned with the project direction. It is effectively you saying "denied, don't bother with further discussion" :)

My concern is that it needs to be prioritized soon, so that the "increment" of the "incremental" doesn't become so large that it's a major stumbling block

Makes sense. I am happy to look into it following PR submission. But as you know, we are still setting up the project, and getting integration tests running on presubmits feels like a bigger milestone than parallelized testing. Suffice it to say, I think filing an issue is a good way to track this and make sure it isn't deferred for too long.

As for the code layering/factoring: I think that, long term, we don't want our code using the raw library types (subprocess.Popen, valkey.ValkeyCluster, etc.). Rather, I think we should have intermediate abstractions for nodes and clusters. That's because we're likely to create additional functionality for these objects that goes beyond what their underlying types support. For example, as a test developer, I'll want the ability to "restart" a node. Where would I put that code? In my mind, it should be part of this new node abstraction that I'd like to see.

Yeah, I get your concern. I think we are still building the framework out, and there will be lots of incremental work to do things like support fault injection. But regardless, I think we want to use valkey-py (valkey.ValkeyCluster) for the client, which gives us a really large test interface - but doesn't get us 100% of the way there. I would hesitate to wrap the client in an abstraction, since I don't know that it will give us a huge benefit and it seems like more maintenance work.

That said - consolidating the Valkey subprocess.Popen handle into a wrapper object seems reasonable to do before submission; I can start work now and let you know when it is done. Later, we can follow up with support for faults like pausing the process to simulate lag or restarting the process, but I would defer that until we have tests that need it. Generally - let's not let "perfect" be the enemy of "good". Unit tests alone are not high enough fidelity for this project to prevent regressions. Getting something in and iterating feels better for project health, since the alternative is effectively postponing development.

Either way - let's continue to discuss the direction in depth at our contributor meeting tomorrow.

@murphyjacob4
Collaborator Author

There is also the question of merging efforts with https://github.com/valkey-io/valkey-bloom/blob/unstable/tests/valkey_bloom_test_case.py. We should have a discussion about this, but I think the majority of the logic here should be portable if we decide to switch frameworks (both are Python, both use valkey-py for the interaction, and both spin up Valkey as a subprocess for testing). I would suggest we pursue this thread in parallel.

@allenss-amazon
Member

Merging with Bloom makes sense to me.

I'm fully aligned with using a standard client. I'm just saying that we're going to be augmenting many, if not all, of our basic objects with unusual functionality that the base objects just don't support. By adding an intermediate layer (subclassing is probably the best solution here; I agree we want to inherit 100% of the base object functionality), we can seamlessly add that functionality in the future. Things like debugging output can also be centralized.
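
For instance, a hedged sketch of that intermediate layer, assuming we subclass the valkey-py cluster client (the subclass name here is hypothetical):

import logging

import valkey

logger = logging.getLogger(__name__)


class LoggingValkeyCluster(valkey.ValkeyCluster):
    # Inherits 100% of the base client's functionality; only adds a
    # centralized place to log every command the tests issue.
    def execute_command(self, *args, **kwargs):
        logger.debug("valkey command: %s", args)
        return super().execute_command(*args, **kwargs)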
