Ability to run tier 2 tests locally (#481)
* Tier 2 test support for user yaml, added documentation

* Support for tier 2 tests from command line

* Added documentation for tier2 tests and platform choice

* removed leftover print

* Added capability to use tier2 overrides from suite_tests

* string format fix

* create_experiment now uses tier2 overrides if file is present

* Simplified tier test selection, removed print calls from debug

* fixed call to tier2 enum

* fixes to suite tests

* removed unused function
mranst authored Jan 3, 2025
1 parent ac67008 commit ba8ea75
Showing 5 changed files with 165 additions and 8 deletions.
36 changes: 35 additions & 1 deletion docs/code_tests/suite_tests.md
@@ -45,6 +45,7 @@ One recommended override, per the example below, is to request a local copy of `
```
This works because, unless otherwise specified, `sbatch` automatically inherits all current environment variables, which you have already configured in step 2 above.
If you prefer, you can create a dedicated `sbatch` script to wrap the `swell t1test ...` command, to be extra sure that the environment is exactly as it should be.
By default, tier tests run on the `nccs_discover_sles15` platform; alternative platforms can be specified with the `-p` flag, as with other swell commands.

4. Repeat (2) for other tests you would like to run. Currently, we recommend running the following tests:
- `3dvar`
@@ -56,4 +57,37 @@ Debug as necessary.

## Tier 2 tests

Coming soon!
Swell tier 2 tests are run in a similar way to tier 1 tests. Overrides from `~/.swell/swell-test.yaml` are read and used in the test. Tier 2 tests generally involve building JEDI before running each test suite, which is expensive in both time and compute. The build clones the JEDI Git repositories, so it must be started on a Discover login node, because compute nodes do not have internet access. In addition, access to the private Git repositories needed to build JEDI requires the user to be a member of the JCSDA-internal organization on GitHub. A `~/.git-credentials` file containing GitHub access token information must be created (see the [jedi bundle documentation](https://github.com/GEOS-ESM/jedi_bundle/blob/develop/docs/git_credentials.md)).

The recommended way to run tier 2 tests on NCCS Discover is as follows:

1. (Optional but recommended) Create a file called `~/.swell/swell-test.yaml`.
As with tier 1 tests, setting `test_root` controls where test outputs are stored (replace the path below with a real one):

```yaml
test_root: /path/to/tier2/test/outputs
```
(If unset, the test function will create a temporary directory that is deleted by the operating system when the `sbatch` job concludes.
You can still run the tests without this, but you won't be able to study the outputs.) Other overrides will be passed to the test.
By default, tier 2 tests build JEDI at the beginning of the job. To reuse an existing JEDI build instead, add the following lines to `~/.swell/swell-test.yaml`:
```yaml
jedi_build_method: use_existing
existing_jedi_build_directory: /path/to/jedi/build/directory
existing_jedi_source_directory: /path/to/jedi/source/directory
```
2. Ensure your SWELL interactive environment is set up correctly (see step 2 under tier 1 tests).

3. Start a tier 2 test on a login node. A login node is needed for the internet access required to clone the JEDI repositories, so please do this sparingly to conserve NCCS resources. If you already have a successful JEDI build and want to test it further, use the overrides in `~/.swell/swell-test.yaml` to avoid rebuilding JEDI and occupying a login node.
```sh
swell t2test <suite> -p <platform>
# The platform is specified with -p and defaults to nccs_discover_sles15.
```
The command runs in your current shell, so it uses the environment you configured in step 2 above.

4. Repeat (3) for other tests you would like to run. Currently, we recommend running the following tests:
- `3dvar`
- `hofx`
- `ufo_testing`
- `convert_ncdiags`
- `3dfgat_atmos`
- `build_jedi`
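
Before starting a tier 2 run that builds JEDI, it can help to confirm the prerequisites described above. The sketch below is a hypothetical pre-flight helper and is not part of swell: the name `check_tier2_prerequisites` and its messages are illustrative, and it only checks that the files exist, not that the access token is valid.

```python
from pathlib import Path
from typing import List


def check_tier2_prerequisites(home: Path) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    # The JEDI build clones private repositories using credentials from this file.
    if not (home / ".git-credentials").is_file():
        problems.append("missing ~/.git-credentials")

    # The override file is optional, so its absence is only worth a note.
    if not (home / ".swell" / "swell-test.yaml").is_file():
        problems.append("no ~/.swell/swell-test.yaml (outputs go to a temporary directory)")

    return problems
```

Calling `check_tier2_prerequisites(Path.home())` on a login node and printing the result gives a quick summary before committing to a long build.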
@@ -18,6 +18,7 @@
from swell.deployment.prepare_config_and_suite.question_and_answer_defaults import GetAnswerDefaults
from swell.utilities.logger import Logger
from swell.utilities.jinja2 import template_string_jinja2
from swell.utilities.dictionary import update_dict


# --------------------------------------------------------------------------------------------------
@@ -298,6 +299,14 @@ def override_with_external(self) -> None:
with open(test_file, 'r') as ymlfile:
override_dict = yaml.safe_load(ymlfile)

# Update overrides with tier2 suite test file if available
tier2_test_file = os.path.join(get_swell_path(), 'test', 'suite_tests',
self.suite + '-tier2.yaml')
if os.path.exists(tier2_test_file):
with open(tier2_test_file, 'r') as ymlfile:
tier2_override_dict = yaml.safe_load(ymlfile)
override_dict = update_dict(override_dict, tier2_override_dict)

# Now append with any user provided override
if self.override is not None:

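The hunk above layers overrides in order: the suite test file, then `<suite>-tier2.yaml` if present, then any user-provided override. A minimal sketch of that layering, assuming `update_dict` performs a recursive merge in which later values win. Its implementation is not shown in this commit, so `merge_overrides` below is a hypothetical stand-in, and the `options` keys are illustrative only.

```python
import copy


def merge_overrides(base: dict, new: dict) -> dict:
    """Hypothetical stand-in for update_dict: recursive merge where `new` wins."""
    merged = copy.deepcopy(base)
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings, so merge them key by key.
            merged[key] = merge_overrides(merged[key], value)
        else:
            # Scalars and lists from the later layer replace the earlier value.
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values only; real override files contain suite-specific keys.
tier1 = {"jedi_build_method": "use_existing", "options": {"window": "PT6H"}}
tier2 = {"jedi_build_method": "create", "options": {"npx": 91}}
user = {"options": {"window": "PT3H"}}

layered = merge_overrides(merge_overrides(tier1, tier2), user)
```

Under these assumptions, `layered` takes `jedi_build_method` from the tier 2 layer and `window` from the user layer, and the inputs are left unmodified.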
33 changes: 30 additions & 3 deletions src/swell/swell.py
@@ -16,7 +16,7 @@
from swell.deployment.launch_experiment import launch_experiment
from swell.tasks.base.task_base import task_wrapper, get_tasks
from swell.test.test_driver import test_wrapper, valid_tests
from swell.test.suite_tests.suite_tests import run_suite
from swell.test.suite_tests.suite_tests import run_suite, TestSuite
from swell.utilities.suite_utils import get_suites
from swell.utilities.welcome_message import write_welcome_message
from swell.utilities.scripts.utility_driver import get_utilities, utility_wrapper
@@ -244,15 +244,42 @@ def test(test: str) -> None:


@swell_driver.command()
@click.option('-p', '--platform', 'platform', type=click.Choice(get_platforms()),
default="nccs_discover_sles15", help=platform_help)
@click.argument('suite', type=click.Choice(("hofx", "3dvar", "ufo_testing")))
def t1test(suite: Literal["hofx", "3dvar", "ufo_testing"]) -> None:
def t1test(
suite: Literal["hofx", "3dvar", "ufo_testing"],
platform: Optional[str] = "nccs_discover_sles15"
) -> None:
"""
Run a particular swell suite from the tier 1 tests.
Arguments:
suite (str): Name of the suite to run (e.g., hofx, 3dvar, ufo_testing)
"""
run_suite(suite)
run_suite(suite, platform, TestSuite.TIER1)


# --------------------------------------------------------------------------------------------------


@swell_driver.command()
@click.option('-p', '--platform', 'platform', type=click.Choice(get_platforms()),
default="nccs_discover_sles15", help=platform_help)
@click.argument('suite', type=click.Choice(("hofx", "3dvar", "ufo_testing",
"convert_ncdiags", "3dfgat_atmos", "build_jedi")))
def t2test(
suite: Literal["hofx", "3dvar", "ufo_testing",
"convert_ncdiags", "3dfgat_atmos", "build_jedi"],
platform: Optional[str] = "nccs_discover_sles15"
) -> None:
"""
Run a particular swell suite from the tier 2 tests.
Arguments:
suite (str): Name of the suite to run (e.g., hofx, 3dvar, ufo_testing)
"""
run_suite(suite, platform, TestSuite.TIER2)


# --------------------------------------------------------------------------------------------------
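
`TestSuite` lets both commands share `run_suite` while keeping the tier explicit at the call site. A small sketch of how such an enum behaves, using a local copy of the class. The helper `override_file_name` is hypothetical; the commit hard-codes the `-tier2.yaml` suffix rather than deriving it from the enum value.

```python
from enum import Enum


class TestSuite(Enum):
    TIER1 = "tier1"
    TIER2 = "tier2"


def override_file_name(suite: str, tier: TestSuite) -> str:
    """Hypothetical helper: derive a per-tier override file name from the enum value."""
    return f"{suite}-{tier.value}.yaml"


# Members are singletons and can be looked up from their string value.
tier = TestSuite("tier2")
print(tier is TestSuite.TIER2)  # prints "True"
print(override_file_name("3dvar", tier))  # prints "3dvar-tier2.yaml"
```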
6 changes: 6 additions & 0 deletions src/swell/test/suite_tests/build_jedi-tier1.yaml
@@ -0,0 +1,6 @@
jedi_build_method: create
bundles:
- fv3-jedi
- soca
- iodaconv
- ufo
89 changes: 85 additions & 4 deletions src/swell/test/suite_tests/suite_tests.py
@@ -5,19 +5,67 @@
from pathlib import Path
from datetime import datetime
from importlib import resources
from enum import Enum

from swell.deployment.create_experiment import create_experiment_directory
from swell.deployment.launch_experiment import launch_experiment
from swell.utilities.dictionary import update_dict


def run_suite(suite: str):
class TestSuite(Enum):
    TIER1 = "tier1"
    TIER2 = "tier2"


def build_jedi_for_tier2(test_dir: Path, experiment_id_root: str, platform: str, test_config: dict):
    suite_overrides_file = (resources.files("swell") /
                            "test" /
                            "suite_tests" /
                            "build_jedi-tier1.yaml")

    with suite_overrides_file.open("r") as f:
        suite_overrides = yaml.safe_load(f)

    experiment_id = experiment_id_root + "build_jedi"

    override = {
        "experiment_id": experiment_id,
        "experiment_root": str(test_dir),
        **suite_overrides
    }

    if "override" in test_config:
        override = update_dict(override, test_config['override'])

    # test_dir must be a Path (not a str) for the `/` operator below
    experiment_dir = test_dir / experiment_id
    experiment_dir.mkdir(parents=True, exist_ok=True)
    override_yml = experiment_dir / "override.yaml"

    with open(override_yml, "w") as f:
        yaml.dump(override, f)

    create_experiment_directory(
        "build_jedi", "defaults", platform,
        str(override_yml), False, None
    )

    suite_path = str(experiment_dir / f"{experiment_id}-suite")
    log_path = str(experiment_dir / "log")

    launch_experiment(suite_path, True, log_path)

    return experiment_dir


def run_suite(suite: str, platform: str, test_tier: TestSuite):
# Add a random int to the experiment_id to mitigate errors from workflows
# created at (roughly) the same time.
ii = random.randint(0, 99)
experiment_id = f"t{datetime.now().strftime('%Y%jT%H%M')}r{ii:02d}{suite}"

# Get test directory from `~/.swell/swell-test.yml`
experiment_id_root = f"t{datetime.now().strftime('%Y%jT%H%M')}r{ii:02d}"
experiment_id = f"{experiment_id_root}{suite}"

# Get test directory from `~/.swell/swell-test.yaml`
test_config = {
"test_root": Path(tempfile.TemporaryDirectory().name)
}
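
The experiment ID combines a timestamp, a two-digit random number, and the suite name. Splitting out `experiment_id_root` lets the JEDI build experiment share the same prefix as the suite it belongs to. A sketch of the same format string with a fixed timestamp and number so the result is reproducible (the function name `make_experiment_id` is illustrative, not part of swell):

```python
from datetime import datetime


def make_experiment_id(now: datetime, ii: int, suite: str) -> str:
    # %Y = four-digit year, %j = zero-padded day of year, %H%M = hour and minute
    root = f"t{now.strftime('%Y%jT%H%M')}r{ii:02d}"
    return f"{root}{suite}"


print(make_experiment_id(datetime(2025, 1, 3, 14, 5), 7, "hofx"))
# prints "t2025003T1405r07hofx"
```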
@@ -47,6 +95,22 @@ def run_suite(suite: str):
with suite_overrides_file.open("r") as f:
suite_overrides = yaml.safe_load(f)

# If it exists, update suite overrides from (suite)-tier2.yaml
if test_tier == TestSuite.TIER2:
tier2_suite_overrides_file = (resources.files("swell") /
"test" /
"suite_tests" /
f"{suite}-tier2.yaml")
if Path(tier2_suite_overrides_file).exists():
with open(tier2_suite_overrides_file, 'r') as f:
tier2_suite_overrides = yaml.safe_load(f)
print("Updating suite with tier 2 overrides " +
f"from: {tier2_suite_overrides_file}")
suite_overrides = update_dict(suite_overrides, tier2_suite_overrides)
else:
print(f"Could not find tier 2 override file for {suite}," +
" defaulting to tier 1 overrides")

override = {
"experiment_id": experiment_id,
"experiment_root": str(testdir),
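
The tier 2 override file is optional: when `<suite>-tier2.yaml` is missing, the run falls back to the tier 1 overrides alone. A sketch of that selection logic against a temporary directory, assuming the tier 1 file follows the `<suite>-tier1.yaml` naming seen in `build_jedi-tier1.yaml`. The function `select_override_files` is hypothetical and only returns the file list; it does not load or merge anything.

```python
import tempfile
from pathlib import Path
from typing import List


def select_override_files(directory: Path, suite: str, tier2: bool) -> List[Path]:
    """Return override files to apply in order; later files take precedence."""
    files = [directory / f"{suite}-tier1.yaml"]
    tier2_file = directory / f"{suite}-tier2.yaml"
    if tier2 and tier2_file.exists():
        files.append(tier2_file)
    return files


with tempfile.TemporaryDirectory() as tmp:
    suite_dir = Path(tmp)
    # Only hofx has a tier 2 file in this made-up layout.
    (suite_dir / "hofx-tier1.yaml").write_text("{}\n")
    (suite_dir / "hofx-tier2.yaml").write_text("{}\n")
    (suite_dir / "3dvar-tier1.yaml").write_text("{}\n")

    with_tier2 = [p.name for p in select_override_files(suite_dir, "hofx", tier2=True)]
    fallback = [p.name for p in select_override_files(suite_dir, "3dvar", tier2=True)]
```

Here `with_tier2` lists both files for `hofx`, while `fallback` contains only the tier 1 file for `3dvar`.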
@@ -71,12 +135,29 @@ def run_suite(suite: str):
experiment_dir = testdir / experiment_id
experiment_dir.mkdir(parents=True, exist_ok=True)

# Build JEDI for tier 2 tests if existing build is not specified in user yaml
if test_tier == TestSuite.TIER2:
if not ("jedi_build_method" in test_config
and test_config["jedi_build_method"] == "use_existing"
and 'existing_jedi_source_directory' in test_config
and 'existing_jedi_build_directory' in test_config):
jedi_dir = build_jedi_for_tier2(testdir, experiment_id_root, platform, test_config)

tier2_override = {"jedi_build_method": "use_existing",
"existing_jedi_source_directory": f"{jedi_dir}/jedi_bundle/source",
"existing_jedi_build_directory": f"{jedi_dir}/jedi_bundle/build"}

override = update_dict(override, tier2_override)

if suite == "build_jedi":
return None

override_yml = experiment_dir / "override.yaml"
with open(override_yml, "w") as f:
yaml.dump(override, f)

create_experiment_directory(
suite, "defaults", "nccs_discover_sles15",
suite, "defaults", platform,
str(override_yml), False, None
)

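The condition that decides whether JEDI is built can be read as a single predicate: build unless the user configuration names a complete existing build. A sketch mirroring that condition (the function name `needs_jedi_build` and the example paths are illustrative):

```python
def needs_jedi_build(test_config: dict) -> bool:
    """Build unless the config selects use_existing and gives both directories."""
    has_existing_build = (
        test_config.get("jedi_build_method") == "use_existing"
        and "existing_jedi_source_directory" in test_config
        and "existing_jedi_build_directory" in test_config
    )
    return not has_existing_build


complete = {
    "jedi_build_method": "use_existing",
    "existing_jedi_source_directory": "/path/to/jedi/source/directory",
    "existing_jedi_build_directory": "/path/to/jedi/build/directory",
}
# A misspelled or missing key does not raise an error; it triggers a build.
incomplete = {
    "jedi_build_method": "use_existing",
    "existing_jedi_build_directory": "/path/to/jedi/build/directory",
}

print(needs_jedi_build(complete))    # prints "False"
print(needs_jedi_build(incomplete))  # prints "True"
print(needs_jedi_build({}))          # prints "True"
```

Because all three keys are required, it is worth double-checking their spelling in `~/.swell/swell-test.yaml` before starting a run on a login node.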
