-
Notifications
You must be signed in to change notification settings - Fork 307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add advanced Optuna plugin authoring methods #3050
Conversation
Code Review Agent Run #0054ccActionable Suggestions - 4
Additional Suggestions - 1
Review Details
|
Changelist by BitoThis pull request implements the following key changes.
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #3050 +/- ##
==========================================
+ Coverage 82.79% 92.08% +9.29%
==========================================
Files 3 112 +109
Lines 186 4601 +4415
==========================================
+ Hits 154 4237 +4083
- Misses 32 364 +332 ☔ View full report in Codecov by Sentry. |
Code Review Agent Run #0bfefdActionable Suggestions - 5
Review Details
|
async def objective(x: float, y: int, z: int, power: int) -> float: | ||
return 1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The objective
function returns a hardcoded value 1.0
which may not provide meaningful test coverage for validation. Consider implementing test cases that validate different input combinations and their impact on the optimization process.
Code suggestion
Check the AI-generated fix before applying
async def objective(x: float, y: int, z: int, power: int) -> float: | |
return 1.0 | |
async def objective(x: float, y: int, z: int, power: int) -> float: | |
return x + y + z + power # Example computation using all parameters |
Code Review Run #0bfefd
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
def test_unparameterized_callback(): | ||
|
||
|
||
@fl.task | ||
async def objective(letter: str, number: Union[float, int], other: str, fixed: str) -> float: | ||
|
||
loss = len(letter) + number + len(other) + len(fixed) | ||
|
||
return float(loss) | ||
|
||
@optimize | ||
def optimizer(trial: optuna.Trial, fixed: str): | ||
|
||
letter = trial.suggest_categorical("booster", ["A", "B", "BLAH"]) | ||
|
||
if letter == "A": | ||
number = trial.suggest_int("number_A", 1, 10) | ||
elif letter == "B": | ||
number = trial.suggest_float("number_B", 10., 20.) | ||
else: | ||
number = 10 | ||
|
||
other = trial.suggest_categorical("other", ["Something", "another word", "a phrase"]) | ||
|
||
return objective(letter, number, other, fixed) | ||
|
||
|
||
@fl.eager | ||
async def train(concurrency: int, n_trials: int) -> float: | ||
|
||
optimizer.n_trials = n_trials | ||
optimizer.concurrency = concurrency | ||
|
||
await optimizer(fixed="hello!") | ||
|
||
return float(optimizer.study.best_value) | ||
|
||
loss = asyncio.run(train(concurrency=2, n_trials=10)) | ||
|
||
assert isinstance(loss, float) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider extracting the optimization logic from test_unparameterized_callback()
into a separate helper function to improve test readability and maintainability. The test contains complex optimization setup that could be reused.
Code suggestion
Check the AI-generated fix before applying
@@ -75,40 +75,45 @@
def test_unparameterized_callback():
+ optimizer = create_test_optimizer()
+ loss = asyncio.run(train(optimizer, concurrency=2, n_trials=10))
+ assert isinstance(loss, float)
+
def create_test_optimizer():
@fl.task
async def objective(letter: str, number: Union[float, int], other: str, fixed: str) -> float:
loss = len(letter) + number + len(other) + len(fixed)
return float(loss)
@optimize
def optimizer(trial: optuna.Trial, fixed: str):
letter = trial.suggest_categorical("booster", ["A", "B", "BLAH"])
if letter == "A":
number = trial.suggest_int("number_A", 1, 10)
elif letter == "B":
number = trial.suggest_float("number_B", 10., 20.)
else:
number = 10
other = trial.suggest_categorical("other", ["Something", "another word", "a phrase"])
return objective(letter, number, other, fixed)
return optimizer
Code Review Run #0bfefd
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
if not isinstance(self.delay, int) or (self.delay < 0): | ||
raise ValueError("delay must be an integer greater than 0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message states 'greater than 0' but the validation allows 0 (delay < 0
). Consider aligning the validation with the error message.
Code suggestion
Check the AI-generated fix before applying
if not isinstance(self.delay, int) or (self.delay < 0): | |
raise ValueError("delay must be an integer greater than 0") | |
if not isinstance(self.delay, int) or (self.delay <= 0): | |
raise ValueError("delay must be an integer greater than 0") |
Code Review Run #0bfefd
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
elif signature.return_annotation is float: | ||
args = signature.return_annotation.__args__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition signature.return_annotation is float
appears to be duplicated from line 88, which may lead to unreachable code. Consider removing the duplicate condition or updating it to check for tuple return type.
Code suggestion
Check the AI-generated fix before applying
elif signature.return_annotation is float: | |
args = signature.return_annotation.__args__ | |
elif hasattr(signature.return_annotation, '__args__'): | |
args = signature.return_annotation.__args__ |
Code Review Run #0bfefd
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
def optimize( | ||
objective: Optional[Union[CallbackType, PythonFunctionTask]] = None, | ||
concurrency: int = 1, | ||
n_trials: int = 1, | ||
study: Optional[optuna.Study] = None, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding validation for concurrency
and n_trials
parameters to ensure they are positive integers. These parameters control optimization behavior and should be validated early.
Code suggestion
Check the AI-generated fix before applying
def optimize( | |
objective: Optional[Union[CallbackType, PythonFunctionTask]] = None, | |
concurrency: int = 1, | |
n_trials: int = 1, | |
study: Optional[optuna.Study] = None, | |
): | |
def optimize( | |
objective: Optional[Union[CallbackType, PythonFunctionTask]] = None, | |
concurrency: int = 1, | |
n_trials: int = 1, | |
study: Optional[optuna.Study] = None, | |
): | |
if concurrency < 1: | |
raise ValueError('concurrency must be a positive integer') | |
if n_trials < 1: | |
raise ValueError('n_trials must be a positive integer') |
Code Review Run #0bfefd
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Code Review Agent Run #f1b6b3Actionable Suggestions - 2
Review Details
|
y0 = (((x - 5) ** 2) + (y + 4) ** 4 + (3 * z - 3) ** 2) ** power | ||
y1 = (((x - 2) ** 4) + (y + 1) ** 2 + (4 * z - 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider extracting the complex mathematical expressions into separate variables or helper functions to improve readability and maintainability. The expressions (((x - 5) ** 2) + (y + 4) ** 4 + (3 * z - 3) ** 2) ** power
and (((x - 2) ** 4) + (y + 1) ** 2 + (4 * z - 1))
could be broken down into smaller parts.
Code suggestion
Check the AI-generated fix before applying
y0 = (((x - 5) ** 2) + (y + 4) ** 4 + (3 * z - 3) ** 2) ** power | |
y1 = (((x - 2) ** 4) + (y + 1) ** 2 + (4 * z - 1)) | |
def calc_y0(x, y, z, power): | |
term1 = (x - 5) ** 2 | |
term2 = (y + 4) ** 4 | |
term3 = (3 * z - 3) ** 2 | |
return (term1 + term2 + term3) ** power | |
def calc_y1(x, y, z): | |
term1 = (x - 2) ** 4 | |
term2 = (y + 1) ** 2 | |
term3 = (4 * z - 1) | |
return term1 + term2 + term3 | |
y0 = calc_y0(x, y, z, power) | |
y1 = calc_y1(x, y, z) |
Code Review Run #f1b6b3
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
|
||
x, y = suggestions["x"], suggestions["y"] | ||
|
||
return (((x - 5) ** 2) + (y + 4) ** 4) ** power |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The objective function appears to have been simplified by removing the z
parameter from the calculation, but it is still being passed as an argument. Consider either using the z
parameter in the calculation or removing it from the function signature if it's not needed.
Code suggestion
Check the AI-generated fix before applying
- async def objective(suggestions: dict[str, Union[int, float]], z: int, power: int) -> float:
+ async def objective(suggestions: dict[str, Union[int, float]], power: int) -> float:
Code Review Run #f1b6b3
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Fully Parallelized Wrapper Around Optuna Using Flyte
Overview
This documentation provides a guide to a fully parallelized Flyte plugin for Optuna. This wrapper leverages Flyte's scalable and distributed workflow orchestration capabilities to parallelize Optuna's hyperparameter optimization across multiple trials efficiently.
Features
Installation
flytekit
flytekitplugins.optuna
Getting Started
Prerequisites
Define the Objective Function
The objective function defines the problem to be optimized. It should include the hyperparameters to be tuned and return a value to minimize or maximize.
Configure the Flyte Workflow
The Flyte workflow orchestrates the parallel execution of Optuna trials. Below is an example:
Register and Execute the Workflow
Submit the workflow to Flyte for execution:
pyflyte register files . pyflyte run --name train
Monitor Progress
You can monitor the progress of the trials via the Flyte Console. Each trial runs as a separate task, and the results are aggregated by the Optuna wrapper.
You may access the
optuna.Study
like so:optimizer.study
.Therefore, with
plotly
installed, you may create create Flyte Decks of the study like so:Advanced Configuration
Custom Dictionary Inputs
Suggestions may be defined in recursive dictionaries:
Custom Callbacks
In some cases, you may need to create define the suggestions programmatically. This may be done
Troubleshooting
Resource Constraints: Ensure sufficient compute resources are allocated for the number of parallel jobs specified.
Flyte Errors: Refer to the Flyte logs and documentation to debug workflow execution issues.
Summary by Bito
This PR enhances the Optuna plugin by implementing parameter bundling capabilities and improving validation. Key improvements include nested parameter suggestions, enhanced type handling, flexible configuration options, and support for multi-objective optimization. The implementation includes delay parameter control and supports both direct function calls and decorator syntax. The test suite has been comprehensively enhanced with simplified objective functions and new test cases.Unit tests added: True
Estimated effort to review (1-5, lower is better): 3