WIP: Introduce TrialRunner Abstraction #720

bpkroth · 2024-03-19T21:24:40Z

This is another step in adding support for parallel trial execution #380.

Here we separate the running of an individual trial into a single class, TrialRunner.

Multiple TrialRunners are instantiated at CLI invocation with the --num-trial-runners argument.
Each TrialRunner is associated with a single copy of the root Environment and is made unique by a trial_runner_id value that is included in that Environment's global_config.
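
As a rough illustration of the idea above, here is a minimal sketch of creating several runners, each with its own copy of a root environment and a unique id. The `Environment` and `TrialRunner` classes and the `create_trial_runners` helper below are hypothetical stand-ins written for this example, not the actual mlos_bench classes, and the 0-based ids are an assumption that mirrors the launcher test in this PR.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Environment:
    """Hypothetical stand-in for the root Environment (not the mlos_bench class)."""

    name: str
    global_config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class TrialRunner:
    """Hypothetical stand-in: pairs a runner id with its own Environment copy."""

    trial_runner_id: int
    environment: Environment


def create_trial_runners(root_env: Environment, num_trial_runners: int) -> List[TrialRunner]:
    """Create one runner per id, each owning an independent copy of root_env."""
    if num_trial_runners < 1:
        raise ValueError("num_trial_runners must be at least 1")
    runners = []
    for runner_id in range(num_trial_runners):  # assumption: ids are 0-based
        env = deepcopy(root_env)  # each runner gets its own copy, never a shared one
        env.global_config["trial_runner_id"] = runner_id
        runners.append(TrialRunner(runner_id, env))
    return runners


root = Environment("root", {"experiment_id": "demo"})
runners = create_trial_runners(root, num_trial_runners=3)
```

Copying the environment per runner keeps the root untouched and means a later parallel scheduler would not have runners sharing mutable state.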

TODO:

tests

In future PRs we will add:

New Scheduler implementations to run TrialRunners in parallel.
Async polling of status results in each TrialRunner independently.

bpkroth · 2025-01-07T00:06:57Z

mlos_bench/mlos_bench/tests/launcher_parse_args_test.py

+    assert {
+        trial_runner.environment.const_args["trial_runner_id"]
+        for trial_runner in launcher.trial_runners
+    } == set(range(0, len(launcher.trial_runners)))


FIXME: I think we use 0-based indexing in some places and 1-based indexing in others. We should be consistent about that.

bpkroth · 2025-01-07T00:08:23Z

mlos_bench/mlos_bench/storage/base_storage.py

            return config

+        # NOTE: This may no longer be necessary with the new schema.
+        def add_new_config_data(


These were refactored originally to avoid changing the schema. We don't need that anymore, but it still might be handy to keep around.

@motus, thoughts on reverting vs. just leaving this part of it? It also affects the _save_params code movement.

bpkroth · 2025-01-07T00:08:55Z

mlos_bench/mlos_bench/storage/base_storage.py

@@ -446,6 +455,25 @@ def tunables(self) -> TunableGroups:
            """
            return self._tunables

+        @abstractmethod
+        def assign_trial_runner(self, trial_runner_id: int) -> int:


Move this up to be near the property?

bpkroth · 2025-01-07T00:09:19Z

mlos_bench/mlos_bench/storage/__init__.py

@@ -129,7 +129,7 @@
 >>> # Access ExperimentData by experiment id.
 >>> experiment_data = storage.experiments["my_experiment_id"]
 >>> experiment_data.trials
-{1: Trial :: my_experiment_id:1 cid:1 SUCCEEDED}
+{1: Trial :: my_experiment_id:1 cid:1 rid:None SUCCEEDED}


No runner assigned here. May want to note that in the docstring.

motus added 30 commits February 21, 2024 15:34

do not pass the optimizer into _run()

616f44e

mypy fixes

33e332a

start splitting the optimization loop into two

0247259

first complete version of the optimization loop (not tested yet)

483e378

Merge branch 'main' into sergiym/run/2loops

addd5a4

allow running mlos_bench.run._main directly from unit tests + add a unit test for bench (not tested)

e97266f

move in-process launch to a separate unit test file

64771fd

add is_warm_up flag to the optimization step

bd7c55e

Merge branch 'main' of github.com:microsoft/MLOS into sergiym/run/2loops

387722a

in-process optimizaiton loop invocation works!

9f15aee

add multi-iteration optimization to in-process test; fix the mlos_core bulk registration (check for is_warm_up)

65cd072

make in-process launcerh tests pass

c010d95

remove unnecessary local variables to make pylint happy

7cfef3a

move trial_config_repeat_count checks to the launcher

7233180

make experiment.load() return trial_ids and use them in the optimization loop

be7dcec

use proper last_trial_id in the main loop; fix the unit tests

3c52e03

update launcher tests with the new output patterns

0d9dc97

remove unused variable

4e171e0

Merge branch 'main' into sergiym/run/2loops

ab69fa0

better naming for functions in the optimization loop

52adab8

start implementing the scheduler class

df893d9

change the default value for is_warm_up parameter to False

5aca764

Merge branch 'sergiym/run/2loops' into sergiym/run/scheduler

4d183df

started to implement teh start() method of the sync scheduler

309e10c

Merge branch 'main' of github.com:microsoft/MLOS into sergiym/run/scheduler

9a72b40

implement proper Scheduler constructor

ffe23e1

more clean-ups to the base scheduler

cb863e0

minor pylint fixes

990b019

add _add_trial_to_queue() method

2ac0520

better handling of warm-up phase (no redundant code)

b95100a

bpkroth added 26 commits January 6, 2025 20:34

Merge branch 'main' into trial-runner-abstraction

116f1a6

fix merge

86ca87c

format

388c3b1

typing

d250929

test fixup

f1a8dff

revert vscode formatting extension to black -- external pre-commit extension is buggy

0aaf2eb

wip: adjust how trial runners data is saved

4c735ee

be in context

d506672

better debugging output

47a52a3

backwards compat

1ae1deb

comment

03fd44e

wip test

488f637

revert vscode formatting extension to black -- external pre-commit extension is buggy

db1feeb

Merge branch 'main' into vscode-tweaks

df91045

Merge branch 'vscode-tweaks-2' into vscode-tweaks

b1b96e5

Merge branch 'main' into trial-runner-abstraction

3872909

Merge branch 'vscode-tweaks' into trial-runner-abstraction

c0164ac

remove 80

83f8244

fixup

29cc0ad

backwards compat

4e972b6

remove the interim test

94b327f

fixup

23ea0de

Merge branch 'main' into trial-runner-abstraction

59f3e9c

Add another test

99c4878

checks some scheduler behavior by means of looking at the data it produced - buggy

222f2c8

fixup range

003d6a4
