Add advanced Optuna plugin authoring methods #3050

Closed: wants to merge 35 commits
Changes from all commits (35 commits)
4298d66
add suggestion bundle support
granthamtaylor Jan 7, 2025
f1d291f
run pre-commit
granthamtaylor Jan 7, 2025
add0bac
update init
granthamtaylor Jan 7, 2025
9995829
simplify bundling setup. Now happens automatically via typing
granthamtaylor Jan 7, 2025
f481da6
remove Suggestion type
granthamtaylor Jan 7, 2025
356944d
allow for some suggestions to be defined as kwargs during bundling
granthamtaylor Jan 7, 2025
0554472
organize imports
granthamtaylor Jan 7, 2025
9fb2cb1
fix unit tests
granthamtaylor Jan 8, 2025
6acca75
remove positional only arg
granthamtaylor Jan 8, 2025
f57dc7a
replace and with or in tests (again)
granthamtaylor Jan 8, 2025
a912bfa
simplify to suggest over any dictionary
granthamtaylor Jan 8, 2025
b76e948
support recursive dictionaries
granthamtaylor Jan 8, 2025
0923a0d
update recursive method
granthamtaylor Jan 8, 2025
8b9af79
remove space
granthamtaylor Jan 8, 2025
8d80b9f
fix in-place process error
granthamtaylor Jan 8, 2025
d8a5bf4
simplify a tiny bit
granthamtaylor Jan 8, 2025
85712fe
add functionality for suggestion callback
granthamtaylor Jan 9, 2025
7364d25
fix test
granthamtaylor Jan 9, 2025
a488fa0
update typing to include Union operator
granthamtaylor Jan 9, 2025
65e87c7
fixed new unit test
granthamtaylor Jan 9, 2025
8bc3adb
remove ws
granthamtaylor Jan 9, 2025
5af7c60
clean up callback method
granthamtaylor Jan 10, 2025
749ac03
add typing-extensions based concat
granthamtaylor Jan 10, 2025
2e9190b
add optimize decorator
granthamtaylor Jan 11, 2025
cd31d51
run pre-commit
granthamtaylor Jan 11, 2025
8139800
add ParamSpec for typing
granthamtaylor Jan 11, 2025
fabd88e
fix import statements
granthamtaylor Jan 11, 2025
9589dfa
update docs
granthamtaylor Jan 11, 2025
4841e4b
Add validation to delay
granthamtaylor Jan 11, 2025
490cd4c
test runtime validation
granthamtaylor Jan 11, 2025
d8c318c
whitespace
granthamtaylor Jan 11, 2025
d7ebaba
add unparameterized decorator
granthamtaylor Jan 11, 2025
77a2326
add tuple output validation
granthamtaylor Jan 11, 2025
e4c8a46
clean up some tests
granthamtaylor Jan 11, 2025
734c2cf
whitespace
granthamtaylor Jan 11, 2025
187 changes: 165 additions & 22 deletions plugins/flytekit-optuna/README.md
# Fully Parallelized Wrapper Around Optuna Using Flyte
## Overview

This documentation provides a guide to a fully parallelized Flyte plugin for Optuna. This wrapper leverages Flyte's scalable and distributed workflow orchestration capabilities to parallelize Optuna's hyperparameter optimization across multiple trials efficiently.

![Timeline](timeline.png)


## Features

- **Ease of Use**: This plugin requires no external data storage or experiment tracking.
- **Parallelized Trial Execution**: Enables concurrent execution of Optuna trials, dramatically speeding up optimization tasks.
- **Scalability**: Leverages Flyte’s ability to scale horizontally to handle large-scale hyperparameter tuning jobs.
- **Flexible Integration**: Compatible with various machine learning frameworks and training pipelines.
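The concurrency model behind parallelized trial execution can be pictured with plain `asyncio`: a semaphore caps how many objective invocations are in flight at once. This is an illustrative sketch of the pattern, not the plugin's actual implementation:

```python
import asyncio


async def objective(x: int) -> float:
    # stand-in for a remote Flyte task invocation
    await asyncio.sleep(0)
    return float(x * x)


async def run_trials(n_trials: int, concurrency: int) -> list[float]:
    # the semaphore plays the role of the Optimizer's `concurrency` argument
    semaphore = asyncio.Semaphore(concurrency)

    async def run_one(i: int) -> float:
        async with semaphore:
            return await objective(i)

    # gather preserves submission order in its results
    return await asyncio.gather(*(run_one(i) for i in range(n_trials)))


results = asyncio.run(run_trials(n_trials=4, concurrency=2))
print(results)  # [0.0, 1.0, 4.0, 9.0]
```

Because only the semaphore is held across the `await`, trials beyond the concurrency limit queue up rather than fail, which is what allows `n_trials` to far exceed `concurrency`.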

## Installation

- Install `flytekit` (`pip install flytekit`)
- Install the plugin (`pip install flytekitplugins-optuna`)

## Getting Started

### Prerequisites

- A Flyte deployment configured and running.
- Python 3.9 or later.
- Familiarity with Flyte and asynchronous programming.

### Define the Objective Function

The objective function defines the problem to be optimized. It should include the hyperparameters to be tuned and return a value to minimize or maximize.

```python
import math

import flytekit as fl

image = fl.ImageSpec(packages=["flytekitplugins.optuna"])

@fl.task(container_image=image)
async def objective(x: float, y: int, z: int, power: int) -> float:
    return math.log((((x - 5) ** 2) + (y + 4) ** 4 + (3 * z - 3) ** 2)) ** power

```

### Configure the Flyte Workflow

The Flyte workflow orchestrates the parallel execution of Optuna trials. Below is an example:

```python
import flytekit as fl
from flytekitplugins.optuna import Optimizer, suggest

@fl.eager(container_image=image)
async def train(concurrency: int, n_trials: int) -> float:

    optimizer = Optimizer(objective=objective, concurrency=concurrency, n_trials=n_trials)

    await optimizer(
        x=suggest.float(low=-10, high=10),
        y=suggest.integer(low=-10, high=10),
        z=suggest.category([-5, 0, 3, 6, 9]),
        power=2,
    )

    print(optimizer.study.best_value)

    return optimizer.study.best_value

```

### Register and Execute the Workflow

Submit the workflow to Flyte for execution:

```bash
pyflyte register files .
pyflyte run --name train
```

### Monitor Progress

You can monitor the progress of the trials via the Flyte Console. Each trial runs as a separate task, and the results are aggregated by the Optuna wrapper.

You may access the `optuna.Study` like so: `optimizer.study`.

Therefore, with `plotly` installed, you may create Flyte Decks of the study like so:

```python
import optuna
import plotly

fig = optuna.visualization.plot_timeline(optimizer.study)
fl.Deck("timeline", plotly.io.to_html(fig))
```

This integration allows one to define fully parallelized HPO experiments via `@eager` in as little as 20 lines of code. The objective task is optimized via Optuna under the hood, such that one may extract the `optuna.Study` at any time for the purposes of serialization, storage, visualization, or interpretation.
## Advanced Configuration

### Custom Dictionary Inputs

Suggestions may be nested inside recursive dictionaries, alongside fixed values:
```python
import flytekit as fl
import optuna
from flytekitplugins.optuna import Optimizer, suggest

image = fl.ImageSpec(packages=["flytekitplugins.optuna"])


@fl.task(container_image=image)
async def objective(params: dict[str, int | float | str]) -> float:
    ...


@fl.eager(container_image=image)
async def train(concurrency: int, n_trials: int):

    study = optuna.create_study(direction="maximize")

    optimizer = Optimizer(objective=objective, concurrency=concurrency, n_trials=n_trials, study=study)

    params = {
        "lambda": suggest.float(1e-8, 1.0, log=True),
        "alpha": suggest.float(1e-8, 1.0, log=True),
        "subsample": suggest.float(0.2, 1.0),
        "colsample_bytree": suggest.float(0.2, 1.0),
        "max_depth": suggest.integer(3, 9, step=2),
        "objective": "binary:logistic",
        "tree_method": "exact",
        "booster": "dart",
    }

    await optimizer(params=params)
```
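The recursive handling can be pictured as a depth-first walk that replaces every suggestion object with a concrete value while leaving fixed entries untouched. This is an illustrative sketch, not the plugin's internals; the `Suggestion` class here is hypothetical:

```python
from typing import Any


class Suggestion:
    """Hypothetical stand-in for the plugin's suggestion objects."""

    def __init__(self, value: Any):
        self.value = value

    def resolve(self) -> Any:
        # a real implementation would call the matching trial.suggest_* method
        return self.value


def resolve_params(params: dict) -> dict:
    out = {}
    for key, value in params.items():
        if isinstance(value, dict):
            out[key] = resolve_params(value)  # recurse into nested dictionaries
        elif isinstance(value, Suggestion):
            out[key] = value.resolve()
        else:
            out[key] = value  # fixed arguments pass through unchanged
    return out


params = {"booster": "dart", "tree": {"max_depth": Suggestion(7)}}
print(resolve_params(params))  # {'booster': 'dart', 'tree': {'max_depth': 7}}
```

Because the walk builds a new dictionary rather than mutating its input, the same suggestion template can be reused across trials.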

### Custom Callbacks

In some cases, you may need to define the suggestions programmatically. This may be done with a callback function wrapped in the `optimize` decorator:

```python
import flytekit as fl
import optuna
from flytekitplugins.optuna import optimize

image = fl.ImageSpec(packages=["flytekitplugins.optuna"])


@fl.task(container_image=image)
async def objective(params: dict[str, int | float | str]) -> float:
    ...


@optimize
def optimizer(trial: optuna.Trial, verbosity: int, tree_method: str):

    params = {
        "verbosity": verbosity,
        "tree_method": tree_method,
        "objective": "binary:logistic",
        # defines booster; gblinear for linear functions
        "booster": trial.suggest_categorical("booster", ["gbtree", "gblinear", "dart"]),
        # sampling ratio of columns for each tree
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.2, 1.0),
    }

    if params["booster"] in ["gbtree", "dart"]:
        # maximum depth of the tree; signifies complexity of the tree
        params["max_depth"] = trial.suggest_int("max_depth", 3, 9, step=2)

    if params["booster"] == "dart":
        params["sample_type"] = trial.suggest_categorical("sample_type", ["uniform", "weighted"])
        params["normalize_type"] = trial.suggest_categorical("normalize_type", ["tree", "forest"])

    return objective(params)


@fl.eager(container_image=image)
async def train(concurrency: int, n_trials: int):

    optimizer.concurrency = concurrency
    optimizer.n_trials = n_trials

    study = optuna.create_study(direction="maximize")

    await optimizer(verbosity=0, tree_method="exact")
```

## Troubleshooting

**Resource Constraints**: Ensure sufficient compute resources are allocated for the number of parallel jobs specified.

**Flyte Errors**: Refer to the Flyte logs and documentation to debug workflow execution issues.
4 changes: 2 additions & 2 deletions plugins/flytekit-optuna/flytekitplugins/optuna/__init__.py
from .optimizer import Optimizer, optimize, suggest

__all__ = ["Optimizer", "optimize", "suggest"]