Sled Agent x Falcon: Use VMMs for Sled Agent testing #5226

Open
9 tasks
smklein opened this issue Mar 7, 2024 · 2 comments
Labels: Sled Agent (Related to the Per-Sled Configuration and Management), Testing & Analysis (Tests & Analyzers)

Comments


smklein commented Mar 7, 2024

TL;DR

It's time to introduce a test wrapper for Sled Agent tests to execute within a VMM.

Summary

Sled Agent tests used mocks to interface with the OS (intercepting calls to the system). Then, to a limited degree, they used fakes (see: #2422) to simulate the system. However, these tests still require a significant amount of plumbing, test-only interfaces, and constraints to execute correctly.

We'd benefit significantly from using a combination of falcon and nextest features to wrap "the ability to run your code in the context of a new, isolated VMM".

Background

Goal

Here's what would be a really nice end-state:

  • You write some code within Sled Agent which manipulates "global state" on your sled (e.g., managing disks, launching zones, manipulating dump devices, etc, -- whatever!)
  • In the same file where you want to write the code to perform these actions, you write a test like the following:
// Some function poking at global state, that you want to test.
pub async fn manage_system_state() { ... }

#[cfg(all(test, target_os = "illumos", feature = "vmm-test"))]
mod test {
  use super::*;

  #[vmm_test(config = default)]
  async fn my_test() {
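     // This runs inside the VMM, so it only sees this test's zones.
     let zones = std::process::Command::new("zoneadm")
         .arg("list")
         .output()
         .expect("failed to run zoneadm");
     // `stdout` is raw bytes, so convert it before printing.
     println!(
         "My own, test-specific set of zones in my VMM: {}",
         String::from_utf8_lossy(&zones.stdout)
     );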

     // Use your test code to manipulate the state of the system.
     manage_system_state().await;
  }

  #[vmm_test(config = default)]
  async fn my_other_test() {
     // Run in a separate VMM - no worry about conflicting with the state of "my_test".
     ...
  }
}
  • To run these tests, you should be able to run cargo nextest run, pointing specifically at this test target. The tests should run under a pfexec invocation, so that the test runner can launch VMMs.
    • This exact command could be invoked via cargo xtask, and itself added to CI.
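As a rough sketch of what such an xtask could assemble, the snippet below builds (but does not execute) the privileged command line. The package name passed in by the caller and the helper names `vmm_test_command` and `render` are assumptions made for illustration; only the `vmm-test` feature name comes from the example above.

```rust
use std::process::Command;

/// Build the privileged nextest invocation without running it.
/// ASSUMPTION: the VMM tests are gated behind a `vmm-test` cargo feature,
/// as in the example above. The helper itself is hypothetical.
pub fn vmm_test_command(package: &str) -> Command {
    // `pfexec` runs the rest of the command line with elevated privileges.
    let mut cmd = Command::new("pfexec");
    cmd.args([
        "cargo",
        "nextest",
        "run",
        "--package",
        package,
        "--features",
        "vmm-test",
    ]);
    cmd
}

/// Render a command as a single line, e.g. for logging from an xtask.
pub fn render(cmd: &Command) -> String {
    let mut parts = vec![cmd.get_program().to_string_lossy().into_owned()];
    parts.extend(cmd.get_args().map(|a| a.to_string_lossy().into_owned()));
    parts.join(" ")
}
```

An xtask subcommand could call `vmm_test_command(...)` and then `.status()` on the result; keeping construction separate from execution makes the command line easy to check without needing privileges.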

Tasks

  • Create a vmm_test attribute macro, which spins up a node, mounts test binaries, and runs commands within the new VM.
    • Extend this macro with "config" options, to allow tests to specify "what their machine looks like". This should largely translate to calling into Falcon's API, though it would be nice to set some reasonable single-sled defaults.
    • Extend this macro to grab logs and other debug information from tests, so we can inspect system state on test failure.
    • Ensure this test runner destroys VMMs on cleanup.
    • Consider optimizing this runner to "revert system state before the test started" if we want to re-use it between multiple tests.
  • Ensure that any tests using this framework are adequately labelled. For example, we could mark the tests as "ignored" to ensure that the vanilla cargo nextest run invocation is not broken when executed without adequate permissions.
  • Use or work around nextest-rs/nextest#1358 ("Add support for per-test target runners") to invoke pfexec from nextest, granting adequate permissions to the specific tests that want to launch VMMs.
  • Ensure these tests are run on the "lab environment" in CI
  • Migrate Sled Agent tests to use this framework. Good targets include: The StorageManager, ServiceManager, and ZoneBundler tests, though there are many more viable candidates.
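The "grab logs on failure" and "destroy VMMs on cleanup" tasks could plausibly be handled by a drop guard that the macro expands to. The sketch below is one possible shape, not Falcon's API: `VmmBackend`, `VmmGuard`, and `RecordingBackend` are hypothetical names, and the backend here only records calls in memory so the ordering can be checked.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// HYPOTHETICAL abstraction over whatever actually launches the VMM.
pub trait VmmBackend {
    /// Copy logs and other debug state out of the VMM.
    fn collect_logs(&mut self, test_name: &str);
    /// Tear the VMM down.
    fn destroy(&mut self);
}

/// Owns a VMM for the duration of one test and cleans it up on drop.
pub struct VmmGuard<B: VmmBackend> {
    backend: B,
    test_name: String,
}

impl<B: VmmBackend> VmmGuard<B> {
    pub fn new(backend: B, test_name: &str) -> Self {
        VmmGuard { backend, test_name: test_name.to_string() }
    }
}

impl<B: VmmBackend> Drop for VmmGuard<B> {
    fn drop(&mut self) {
        // A panicking thread means the test body failed, so save
        // debug information before the VMM disappears.
        if std::thread::panicking() {
            self.backend.collect_logs(&self.test_name);
        }
        // Always destroy the VMM, whether the test passed or failed.
        self.backend.destroy();
    }
}

/// In-memory stand-in that records which calls were made, in order.
#[derive(Clone, Default)]
pub struct RecordingBackend {
    pub events: Rc<RefCell<Vec<String>>>,
}

impl VmmBackend for RecordingBackend {
    fn collect_logs(&mut self, test_name: &str) {
        self.events.borrow_mut().push(format!("logs:{test_name}"));
    }
    fn destroy(&mut self) {
        self.events.borrow_mut().push("destroy".to_string());
    }
}
```

Because cleanup lives in `Drop`, it runs on both the success path and the unwinding path; it would not run if the test process is killed outright, so a real runner would likely still need some out-of-band cleanup.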
@smklein smklein added Testing & Analysis Tests & Analyzers Sled Agent Related to the Per-Sled Configuration and Management labels Mar 7, 2024
@karencfv

karencfv commented Mar 7, 2024

Oooohhhh nice! This will help with #4835

@smklein

smklein commented Mar 8, 2024

Chatted with @andrewjstone a little bit about this. We're thinking that "while a test attribute macro might be cool", it also would probably make more sense for tests to have a bit better "explicit control" over VMMs. That would let us do things like maybe "write test code that can cause reboots, and watch what happens".
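To make the "explicit control" idea concrete, here is a toy model of a test holding a VMM handle, rebooting it, and observing what survives. Everything in it is hypothetical: `SimulatedVmm` is an in-memory stand-in with no relation to Falcon's actual API, and the volatile/persistent split is a simplification of real system state.

```rust
use std::collections::BTreeMap;

/// Toy, in-memory stand-in for a VMM that a test controls explicitly.
/// HYPOTHETICAL: none of these names come from Falcon or Sled Agent.
#[derive(Debug, Default)]
pub struct SimulatedVmm {
    boot_count: u32,
    /// State that is lost on reboot (think: tmpfs contents).
    volatile: BTreeMap<String, String>,
    /// State that survives a reboot (think: on-disk data).
    persistent: BTreeMap<String, String>,
}

impl SimulatedVmm {
    /// "Launch" the VMM; the first boot counts as boot number 1.
    pub fn launch() -> Self {
        SimulatedVmm { boot_count: 1, ..Default::default() }
    }

    pub fn write_volatile(&mut self, key: &str, value: &str) {
        self.volatile.insert(key.to_string(), value.to_string());
    }

    pub fn write_persistent(&mut self, key: &str, value: &str) {
        self.persistent.insert(key.to_string(), value.to_string());
    }

    /// Look a key up, checking volatile state before persistent state.
    pub fn read(&self, key: &str) -> Option<&str> {
        self.volatile
            .get(key)
            .or_else(|| self.persistent.get(key))
            .map(String::as_str)
    }

    /// Reboot: volatile state is dropped, persistent state is kept.
    pub fn reboot(&mut self) {
        self.volatile.clear();
        self.boot_count += 1;
    }

    pub fn boot_count(&self) -> u32 {
        self.boot_count
    }
}
```

A test written against a handle like this decides for itself when the reboot happens, which an attribute macro that owns the whole VMM lifecycle would make awkward to express.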
