Add integration test framework #20

Open
wants to merge 6 commits into base: main
Conversation

murphyjacob4
Collaborator

In addition to the framework itself, this adds some initial tests:

  • valkey_search_integration_test.py: General functional testing of ValkeySearch
  • stability_test.py: A stress test that performs many VSS operations while performing various one-off operations (BGSAVE, FLUSHDB, etc.)

The goal is to capture the "sunny day" scenarios in the functional tests, and to discover potential edge cases and crashes with the stress tests. Given the multi-threaded nature of the ValkeySearch engine, it is common for a change to pass simple functional tests while introducing a subtle multi-threading issue that is only triggered under stress.

To run:

bazel test //testing/integration:stability_test --test_arg=--valkey_server_path=/path/to/valkey-server --test_arg=--valkey_cli_path=/path/to/valkey-cli --test_arg=--memtier_path=/path/to/memtier_benchmark --test_output=streamed

bazel test //testing/integration:vector_search_integration_test --test_arg=--valkey_server_path=/path/to/valkey-server --test_arg=--valkey_cli_path=/path/to/valkey-cli --test_output=streamed

You will need a local build of the Valkey server and the Valkey CLI for both tests, and for the stability test you will also need a local build of memtier_benchmark. We may want to consider moving to valkey-benchmark later for simplicity.

@murphyjacob4
Collaborator Author

Note that the tests are failing right now. This appears to be related to an issue with the memory allocator overrides interacting with gRPC. These are real failures: when I remove the memory allocator overrides, all tests pass. See #22 for more info.

@yairgott
Collaborator

I suggest adding a compilation flag to disable the memory allocator override specifically for the integration tests. This would allow all integration tests to pass until the root cause is identified and resolved.

Once this adjustment is made, it would be highly beneficial to expand the presubmit tests to include the integration tests. For reference, you can review how the unittests are configured to run: unittests.yml.

args: dict[str, str],
modules: dict[str, str],
password: str | None = None,
) -> subprocess.Popen[Any]:
Member

I think we should be returning an instance of a new object type here, rather than a raw subprocess.Popen. The new object would be a place to concentrate node-specific controls and behaviors, etc.

Collaborator Author

Yeah that makes sense. For now it is pretty basic, but nice to have a seam for future functionality.
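
For illustration, a minimal sketch of what such a wrapper could look like (the class name and fields here are hypothetical, not the PR's actual API):

import subprocess
from typing import Any


class ValkeyNode:
    """Thin wrapper around a valkey-server subprocess (hypothetical sketch)."""

    def __init__(self, args: list[str], port: int, node_dir: str):
        self.args = args          # Full valkey-server command line.
        self.port = port          # Port this node listens on.
        self.node_dir = node_dir  # Scratch directory holding this node's files.
        self.process: subprocess.Popen[Any] = subprocess.Popen(args)

    def terminate(self) -> None:
        # Stop the underlying server process and wait for it to exit.
        self.process.terminate()
        self.process.wait()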

Comment on lines +101 to +102
cluster_args["cluster-config-file"] = os.path.join(
node_dir, "nodes.conf"
Member

"nodes.conf" seems to be a pre-existing file, i.e., a hidden parameters to this function. I'd rather see a dict get passed in and have this function construct nodes.conf.

Collaborator Author

nodes.conf is created by the Valkey subprocess in the test temp directory. At this time we have no need to edit or create it ourselves, so I don't see why we would parameterize it. If we ever need to inspect or modify it, we can parameterize it then.

index_name:
vector_dimensions:
"""
args = [
Member

We should either make this more generic or move it to the specific test file that's using it.

Collaborator Author

Made it more generic



def create_flat_index(
r: valkey.ValkeyCluster, index_name: str, vector_dimensions: int
Member

Same as above, I don't see this as a real utility routine.

Collaborator Author

Unified the two into one create_index function with configurable mapping of attributes
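
As a rough, hedged illustration (not the exact code in this PR), a generic helper along those lines might look like the following, assuming the valkey-py client and the usual FT.CREATE SCHEMA syntax:

import valkey


def create_index(
    client: valkey.ValkeyCluster,
    index_name: str,
    attributes: dict[str, list[str]],
) -> None:
    # Build FT.CREATE <index> SCHEMA <attribute> <schema args...> for each
    # attribute, so callers control the full per-attribute schema.
    args: list[str] = ["FT.CREATE", index_name, "SCHEMA"]
    for attribute, schema_args in attributes.items():
        args.append(attribute)
        args.extend(schema_args)
    client.execute_command(*args)


# Example: a FLAT vector attribute with 3 dimensions.
# create_index(client, "my_index", {
#     "embedding": ["VECTOR", "FLAT", "6", "TYPE", "FLOAT32",
#                   "DIM", "3", "DISTANCE_METRIC", "L2"],
# })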

Member

@allenss-amazon allenss-amazon left a comment

I have several concerns about this PR. Since this is the start of the integration framework, I feel that perhaps it would be best if we had a discussion on the requirements for the framework. For example, IMO, we want a framework that will allow individual tests to be run in parallel on a big box so that testing wall-time is minimized. Other concerns are related to code layering and reuse.

@murphyjacob4
Collaborator Author

we want a framework that will allow individual tests to be run in parallel on a big box so that testing wall-time is minimized

To me this seems like an incremental improvement. Do you have any concerns with the high level approach?

Parallelization should be afforded by the Python test framework we end up using, in addition to a port-finding utility similar to what the Valkey tests use to spin up the Valkey subprocesses on individual ports.
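
For example, a minimal port-finding helper (a hedged sketch, not part of this PR) could simply ask the OS for a free port:

import socket


def find_free_port() -> int:
    # Bind to port 0 so the OS picks an unused TCP port. There is a small
    # race before valkey-server binds it, so callers should retry startup
    # if the port is taken in the meantime.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]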

Other concerns are related to code layering and reuse

If you could call these out inline I can address them. I see your comments about the index creation functions - are there any others?

Pull request closed

This seems a bit premature to me. Was this on purpose? I am going to reopen for now, unless there is a concrete alternative proposed that would make this PR unnecessary.

@murphyjacob4 murphyjacob4 reopened this Jan 28, 2025
@allenss-amazon
Member

I'm new to GitHub and wasn't sure whether closing was the right thing to do or not. So apologies...

@allenss-amazon
Member

Parallelization is an improvement. My concern is that it needs to be prioritized soon, so that the "increment" of the "incremental" doesn't become so large that it's a major stumbling block.

@allenss-amazon
Member

As for the code layering/factoring: I think that, long term, we don't want our code using the raw library types (subprocess.Popen, valkey.ValkeyCluster, etc.). Rather, I think we should have intermediate abstractions for nodes and clusters. That's because we're likely to create additional functionality for these objects that goes beyond what their underlying types support. For example, as a test developer, I'll want the ability to "restart" a node. Where would I put that code? In my mind, it should be part of this new node abstraction that I'd like to see.
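
As a hedged sketch of where such a restart could live (all names are hypothetical, not part of this PR), building on the node-wrapper idea above:

import subprocess
import time
from typing import Any

import valkey


class ValkeyNode:
    """Hypothetical node abstraction wrapping the valkey-server subprocess."""

    def __init__(self, args: list[str], port: int):
        self.args = args  # Full valkey-server command line for this node.
        self.port = port
        self.process: subprocess.Popen[Any] = subprocess.Popen(args)

    def restart(self, timeout: float = 10.0) -> None:
        # Stop the current valkey-server process, relaunch it with the same
        # arguments, then wait until it answers PING again.
        self.process.terminate()
        self.process.wait()
        self.process = subprocess.Popen(self.args)
        client = valkey.Valkey(port=self.port)
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                if client.ping():
                    return
            except valkey.exceptions.ConnectionError:
                time.sleep(0.1)
        raise TimeoutError(f"Node on port {self.port} did not come back up")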

@yairgott
Collaborator

FYI, I mitigated the memory tracking issue reported in the issue.

@murphyjacob4
Collaborator Author

I'm new to GitHub and wasn't sure whether closing was the right thing to do or not. So apologies...

No worries. Obviously it depends on the project - but the general etiquette is to close PRs when they are obsolete, the owner is unresponsive, or the request is not aligned with the project direction. It is effectively you saying "denied, don't bother with further discussion" :)

My concern is that it needs to be prioritized soon, so that the "increment" of the "incremental" doesn't become so large that it's a major stumbling block

Makes sense. I am happy to look into it following PR submission. But as you know, we are still setting up the project, and getting integration tests running on presubmits feels like a bigger milestone than parallelized testing. Suffice it to say, I think filing an issue is a good way to track this and make sure it isn't deferred for too long.

As for the code layering/factoring: I think that, long term, we don't want our code using the raw library types (subprocess.Popen, valkey.ValkeyCluster, etc.). Rather, I think we should have intermediate abstractions for nodes and clusters. That's because we're likely to create additional functionality for these objects that goes beyond what their underlying types support. For example, as a test developer, I'll want the ability to "restart" a node. Where would I put that code? In my mind, it should be part of this new node abstraction that I'd like to see.

Yeah, I get your concern. I think we are still building the framework out, and there will be lots of incremental work to do things like support fault injection. But regardless, I think we want to use valkey-py (valkey.ValkeyCluster) for the client, which gives us a really large test interface - but doesn't get us 100% of the way there. I would hesitate to wrap the client in an abstraction, since I don't know that it will give us a huge benefit and it seems like more maintenance work.

That said - consolidating the Valkey subprocess.Popen handle into a wrapper object seems reasonable to do before submission; I can start work now and let you know when it is done. Later, we can follow up with support for faults like pausing the process to simulate lag or restarting the process, but I would defer that until we have tests that need it. Generally - let's not let "perfect" be the enemy of "good". Unit tests alone are not high enough fidelity for this project to prevent regressions. Getting something in and iterating feels better for project health, since the alternative is effectively postponing development.

Either way - let's continue to discuss the direction in depth at our contributor meeting tomorrow.

@murphyjacob4
Collaborator Author

There is also the question of merging efforts with https://github.com/valkey-io/valkey-bloom/blob/unstable/tests/valkey_bloom_test_case.py. We should have a discussion about this, but I think the majority of the logic here should be portable if we decide to switch frameworks (both are Python, both use valkey-py for the interaction, and both spin up Valkey as a subprocess for testing). I would suggest we pursue this thread in parallel.

@allenss-amazon
Member

Merging with Bloom makes sense to me.

I'm fully aligned with using a standard client. I'm just saying that we're going to be augmenting many, if not all, of our basic objects with unusual functionality that the base objects just don't support. By adding an intermediate layer (subclassing is probably the best solution here; I agree we want to inherit 100% of the base object functionality), we can seamlessly add that functionality in the future. Things like debugging output can also be centralized.
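
For instance, a hedged sketch of that intermediate layer, assuming we subclass the valkey-py cluster client (the subclass name here is hypothetical):

import logging

import valkey

logger = logging.getLogger(__name__)


class LoggingValkeyCluster(valkey.ValkeyCluster):
    # Inherits 100% of the base client's functionality; only adds a
    # centralized place to log every command the tests issue.
    def execute_command(self, *args, **kwargs):
        logger.debug("valkey command: %s", args)
        return super().execute_command(*args, **kwargs)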
