Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bounds_transformer could bypass global_bounds due to the test logic within _trim function in domain_reduction.py #441

Merged
merged 15 commits into from
Jan 12, 2024

Conversation

perezed00
Copy link
Contributor

Previously, when the upper limit of new_bounds < lower limit of global_bounds, the new_bounds could bypass the global_bounds. The proposed changes requires the little change to the existing code, but could cause one of the entries in new_bounds to stagnate at the global_bounds if the new_bounds entry is always outside of the global_bounds.

Previously, when the new upper limit was less than the original lower limit, the new_bounds could bypass the global_bounds.
@codecov
Copy link

codecov bot commented Aug 14, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (11a0c6a) 98.42% compared to head (6996dd1) 98.46%.
Report is 1 commits behind head on master.

Files Patch % Lines
bayes_opt/target_space.py 90.47% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #441      +/-   ##
==========================================
+ Coverage   98.42%   98.46%   +0.04%     
==========================================
  Files           8        8              
  Lines         573      588      +15     
  Branches       84       88       +4     
==========================================
+ Hits          564      579      +15     
+ Misses          5        3       -2     
- Partials        4        6       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@till-m
Copy link
Member

till-m commented Aug 15, 2023

This should've been fixed in #398 (c.f. also #394). I don't think this change has been deployed to PyPI yet though. It has been deployed with v1.4.3.

@perezed00
Copy link
Contributor Author

perezed00 commented Aug 15, 2023

Thanks for linking me to those posts, I was not aware of them. However, this issue is distinct from #394, which addresses a window-width bypass. In 1.4.3 the global_bounds can still be bypassed when the suggested upper bound is smaller than the original lower bound. This PR addresses the current issue by identifying if either of the suggested new_bounds are outside of the original_bounds, in which case it sets the suggested new_bound value to the global_bound value.

This issue arises, for example, when an optimizer instance loads a log that was generated by an instance with bounds greater than those of the current instance.

I will return to this PR to add a test case to tests/test_seq_domain_red.py and to verify the order of the various checks in _trim. I suspect that the proposed solution would be more robust if the entries of new_bounds are sorted before the global bounds are enforced.

@till-m
Copy link
Member

till-m commented Aug 15, 2023

Hey @perezed00,

just to confirm, the case you're trying to cover is that sometimes, the lower bound can be greater than the upper bound (i.e. the bounds are out of order) and then the lower bound is not trimmed properly and allowed to be greater than the global bound?

@perezed00
Copy link
Contributor Author

perezed00 commented Aug 15, 2023

No, the case I've identified is roughly as follows:
Suppose the global bounds for a variable are [5, 10]. If bounds_transformer suggests new_bounds = [1, 4], then both bounds are smaller than the lower global_bound. v1.4.3 only checks if one of the bounds is smaller than global limit, so the second bound sneaks in.


Here is the same info, but written more programmatically:
Suppose the global_bounds for a variable are [x_lower, x_upper].
Suppose bounds_transformer suggests new_bounds=[nx_lower, nx_upper].

Assuming nx_lower<nx_upper:
If (nx_lower < x_lower and nx_upper > x_lower), v1.4.3 catches the error.
If (nx_lower < x_lower and nx_upper < x_lower), v1.4.3 does not catch the error. (Caught by this PR)

Not assuming nx_lower<nx_upper:
If (nx_lower > nx_upper and nx_upper < x_lower), needs to be checked.

@till-m
Copy link
Member

till-m commented Aug 15, 2023

Thanks, that makes sense! I guess some more testing is in order. If the case you describe only shows up when loading logs from a differently configured optimizer, it makes sense that it wasn't covered before.
I assume you now have a solid understanding of the trim logic. Would you mind adding some comments to the code as part of this PR to make it clear what's happening in the _trim function?

@perezed00
Copy link
Contributor Author

Sounds good. I don't mind adding some comments to the _trim function.
The case I've described can also arise after the use of optimizer.set_bounds(new_bounds={})

In summary, I'll revise this PR after doing the following:

  • Add comments to _trim
  • Verify the (nx_lower > nx_upper and nx_upper < x_lower) case described above
  • Check if the entries of new_bounds should be sorted before global bound enforcement
  • Add a test case to tests/test_seq_domain_red.py

It will probably be a few days before I get a chance to implement these, but I'll post again when complete.

@till-m till-m mentioned this pull request Sep 4, 2023
@till-m
Copy link
Member

till-m commented Sep 4, 2023

Hey @perezed00, did you get around to have another look at this? Seems like the issue might've popped up for another user so I'd be eager to get this fixed.

@perezed00
Copy link
Contributor Author

perezed00 commented Sep 4, 2023 via email

Added test cases to catch an error when both bounds of new_bounds exceeded the global_bounds
_trim function now avoids an error when both bounds for a given parameter in new_bounds exceed the global_bounds
@perezed00
Copy link
Contributor Author

perezed00 commented Sep 4, 2023

Hi, @till-m

I have updated this pull request. Regarding the list of tasks above, here's what was done:

  • Comments were added to _trim. The section of _trim that deals with the window size could still use more comments, however.
  • The (nx_lower > nx_upper and nx_upper < x_lower) case described above is tested, and not an issue
  • The entries of new bounds are now sorted before global bounds are enforced. This change was not needed but it makes the organization of _trim a little easier to follow
  • Test cases were added to tests/test_seq_domain_red.py

Additionally, the name of variables in _trim was made more uniform. With the exception of test_notebooks_run.py all of the tests in /tests pass. test_notebooks_run.py was not tested.

Copy link
Member

@till-m till-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me in general. Some small questions and things to verify.

bayes_opt/domain_reduction.py Outdated Show resolved Hide resolved
bayes_opt/domain_reduction.py Show resolved Hide resolved
@perezed00
Copy link
Contributor Author

Here's an alternative solution to the bounds problem:

When determining the center of the shrinking window, the transformer should only consider the subset of the target_space that is within the global_bounds. This should prevent the transformer from suggesting out-of-bounds points to begin with. This is better than simply loading a log with a smaller domain, because it still allows the Gaussian process to use the data from the larger domain.

I should note that I think the original paper has a method of dealing with this problem (see page 6 if interested), but I do think that the solution above would circumvent the issue without much added work.

@till-m
Copy link
Member

till-m commented Sep 6, 2023

Regarding your alternative solution -- I like your suggestion, maybe the appropriate way to implement this would be to restrict TargetSpace.max() to return the maximum obeying the global bounds. The problem is, that the global bounds are currently only stored in the transformer, not in the target space. (cc: @fmfn @bwheelz36 @leandrobbraga)

Also, I think it'd be good to have a warning to inform the user that the transformer observed a maximum outside the global bounds and why this can happen.

@perezed00
Copy link
Contributor Author

perezed00 commented Sep 6, 2023

I don't think this would be too hard to implement in the _update function within domain_reduction.py. I'll take a look at that now. Implementing the change in _update would avoid upstream changes to target_space.py which is used throughout the library.

Since we have identified a few separate issues, I propose we implement the solutions in separate PRs.

Here is how I see the issues:
Issue #0: The transformer is suggesting boundaries that exceed the global bounds.
Issue #1: Some erroneous boundaries do not get caught by existing _trim logic.
Issue #2: Current solutions to issue #1 can prevent the optimizer from converging in certain cases.

Issue #0 is the root cause of issues #1 and #2. This PR addresses #1, but leaves #0 and #2 unresolved. The suggested solution above may address #0 directly. For the sake of organized updates lets keep this discussion focused on #1 and make a new PR for #0. What do you think? If you agree I can make a new branch / PR.

I don't mind if this PR get's closed without integration (I'll leave that decision to you), I just don't want this single PR to become a discussion that is too long to follow.

@till-m
Copy link
Member

till-m commented Sep 21, 2023

Hi @perezed00,

my apologies, for not getting back to you, I have been rather busy recently.
I think it is generally better to tackle issues at the root. How would your solution to issue #0 look like (mathematically speaking)? I think there are several ways of doing what you're trying to accomplish with subtle, but important differences.

@perezed00
Copy link
Contributor Author

Hi @till-m,

No problem, I have also been very busy. Basically, the if the transformer sees a set of parameters in the target_space that optimize the function, then it tries to converge to that set of parameters. Mathematically, can mask the target_space so that the transformer only sees the parameters within the global bounds. I implemented a windowing function within `domain_reduction.py' that doesn't require modifying other portions of the code base. I'll upload that now, along with some improved comments.

During this process, I learned that the mask will not stop the transformer from suggesting bounds outside of the global_bounds, since that process also depends on the zoom/pan parameters. I am not sure that it can be avoided, and I don't think it needs to be addressed. This is because the _trim function enforces the global bounds.

On the other hand, as far as I can tell, the mask does drastically improve convergence. I believe that a transformer with a mask is as good as (or better than) a "normal" transformer at converging. This can (should?) be explicitly tested.

If, in your opinion, the convergence seems okay, we might want to suppress the convergence warning since it is quite common at this point.

Added windowing function to improve the convergence of optimizers that use domain_reduction. Improved comments and documentation.
Copy link
Member

@till-m till-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay...
Biggest thing from my side is the fact that the max_window should be calculated using the target space. Rest looks good to me :)

bayes_opt/domain_reduction.py Outdated Show resolved Hide resolved
bayes_opt/domain_reduction.py Outdated Show resolved Hide resolved
bayes_opt/domain_reduction.py Outdated Show resolved Hide resolved
bayes_opt/domain_reduction.py Show resolved Hide resolved
@perezed00
Copy link
Contributor Author

Sorry for the late reply here , I am in the final stretch of dissertation work. I will finish this in 5 days.

@till-m
Copy link
Member

till-m commented Nov 8, 2023

No worries. Good luck with the dissertation!

@till-m
Copy link
Member

till-m commented Dec 10, 2023

Hey @perezed00, don't mean to rush you, but any update on this? :)

@perezed00
Copy link
Contributor Author

Hey @till-m ,

Thanks for your patience. The changes have been implemented. target_space.max() now respects the bounds and I improved the warning messages. Warnings are now more specific and less frequent. Let me know if you see any other issues.

Cheers

@till-m till-m self-requested a review December 11, 2023 08:30
This function was used to prototype a solution. It should not have been pushed and can be removed.
@till-m
Copy link
Member

till-m commented Dec 13, 2023

hey, I'm a bit swamped right now, will have a look as soon as I can. Thanks for the contributions :)

Copy link
Member

@till-m till-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this is basically ready for merge, excluding the 3 minor comments + the warning discussion.
@osullivryan If you have the time, I think a review from your side would also be appreciated.

tests/test_target_space.py Show resolved Hide resolved
tests/test_target_space.py Show resolved Hide resolved
bayes_opt/target_space.py Outdated Show resolved Hide resolved
tests/test_target_space.py Show resolved Hide resolved
bayes_opt/target_space.py Show resolved Hide resolved
@till-m
Copy link
Member

till-m commented Dec 17, 2023

Thanks for adding so much high-quality documentation, by the way!

@osullivryan
Copy link
Contributor

Hey @perezed00 and @till-m ,

I took a look. Nice catch and fix! Awesome!

@till-m
Copy link
Member

till-m commented Jan 9, 2024

From my side, I think this is good to go. @bwheelz36 @leandrobbraga @fmfn feel free to review. If you have no comments, I will merge this PR soon.

@perezed00 thank you very much for your contribution. As you may have noticed, we are a bit short on maintainers for this project currently. Based on the way you wrote this PR, the thoughtfulness of your approaches and the clarity of the code you wrote, I think it would be very valuable to have you on the team. If you're interested, please let me know :)

@perezed00
Copy link
Contributor Author

@till-m Thanks for the invite to join the team. I accept, I would be happy to join. Just a bit of a warning however: I defend my dissertation in April, so I might be less responsive a couple months before then.

Cheers!

@bwheelz36
Copy link
Collaborator

Looks good to me.
regarding the codecov checks, I'm not sure if you knew but you can check the code cov analysis of this PR here:
https://app.codecov.io/gh/bayesian-optimization/BayesianOptimization/pull/441
Apparently there are a few lines like if (pbounds[0] > global_bounds[i, 1]): which aren't being tested - should the test case be covering these?

@perezed00
Copy link
Contributor Author

@bwheelz36 ,

Thanks, that is helpful info. I wasn't aware, but I'll take a look now

@till-m till-m merged commit 844927b into bayesian-optimization:master Jan 12, 2024
6 of 7 checks passed
@perezed00 perezed00 deleted the perezed00-trim_patch-1 branch January 15, 2024 21:12
till-m added a commit that referenced this pull request Jan 24, 2024
* Fixes issue-436: Constrained optimization does not allow duplicate points (#437)

* Update docs of `bayesian_optimization.py` and `observer.py`.

* Fix minor style issue in module docstring

* Update docs of `__init__.py` and `events.py`.

* Fix minor style issue in class docstring

* Add workflow to check docstrings

* Update bayes_opt/bayesian_optimization.py

Co-authored-by: Leandro Braga <[email protected]>

* Improve acq_max seeding of L-BFGS-B optimization (#297)

* bounds_transformer could bypass global_bounds due to the test logic within _trim function in domain_reduction.py (#441)

* Update trim bounds in domain_reduction.py

Previously, when the new upper limit was less than the original lower limit, the new_bounds could bypass the global_bounds.

* Update test_seq_domain_red.py

Added test cases to catch an error when both bounds of new_bounds exceeded the global_bounds

* Update domain_reduction.py

_trim function now avoids an error when both bounds for a given parameter in new_bounds exceed the global_bounds

* Update domain_reduction.py comments

* fixed English in domain_reduction.py

* use numpy to sort bounds,  boundary exceeded warn.

* simple sort test added

* domain_red windows target_space to global_bounds

Added windowing function to improve the convergence of optimizers that use domain_reduction. Improved comments and documentation.

* target_space.max respects bounds; SDRT warnings

* Remove unused function.

This function was used to prototype a solution. It should not have been pushed and can be removed.

* Updated target_space.py docstrings

* Update tests/test_target_space.py

Co-authored-by: till-m <[email protected]>

* Added pbound warnings, updated various tests.

* updated line spacing for consistency and style

* added pbound test condition

---------

Co-authored-by: till-m <[email protected]>

* DomainReduction docs, docstyle

* Add missing doc dependency

---------

Co-authored-by: YoungJae Bae <[email protected]>
Co-authored-by: Leandro Braga <[email protected]>
Co-authored-by: ptapping <[email protected]>
Co-authored-by: Edgar <[email protected]>
till-m added a commit that referenced this pull request Feb 26, 2024
* Replace custom colour implementation, add docs for `logger.py`, `util.py` (#435)

* Replace custom colour implementation, add docs for `logger.py`, `util.py`

* minor typo/syntax fixes

* User `or` to separate different possible types

* Update docs & linting for `constraints.py`, `target_space.py` (#440)

* Run tests on any PR

* Update docs, linting

* Update bayes_opt/constraint.py

Co-authored-by: Leandro Braga <[email protected]>

* Rename mislabelled parameters

---------

Co-authored-by: Leandro Braga <[email protected]>

* Update various docstrings, add workflow to check docstrings (#445)

* Fixes issue-436: Constrained optimization does not allow duplicate points (#437)

* Update docs of `bayesian_optimization.py` and `observer.py`.

* Fix minor style issue in module docstring

* Update docs of `__init__.py` and `events.py`.

* Fix minor style issue in class docstring

* Add workflow to check docstrings

* Update bayes_opt/bayesian_optimization.py

Co-authored-by: Leandro Braga <[email protected]>

---------

Co-authored-by: YoungJae Bae <[email protected]>
Co-authored-by: Leandro Braga <[email protected]>

* Pydocstyle (#453)

* Improve acq_max seeding of L-BFGS-B optimization (#297)

---------

Co-authored-by: ptapping <[email protected]>

* Domain reduction, Sphinx docs (#455)

* Fixes issue-436: Constrained optimization does not allow duplicate points (#437)

* Update docs of `bayesian_optimization.py` and `observer.py`.

* Fix minor style issue in module docstring

* Update docs of `__init__.py` and `events.py`.

* Fix minor style issue in class docstring

* Add workflow to check docstrings

* Update bayes_opt/bayesian_optimization.py

Co-authored-by: Leandro Braga <[email protected]>

* Improve acq_max seeding of L-BFGS-B optimization (#297)

* bounds_transformer could bypass global_bounds due to the test logic within _trim function in domain_reduction.py (#441)

* Update trim bounds in domain_reduction.py

Previously, when the new upper limit was less than the original lower limit, the new_bounds could bypass the global_bounds.

* Update test_seq_domain_red.py

Added test cases to catch an error when both bounds of new_bounds exceeded the global_bounds

* Update domain_reduction.py

_trim function now avoids an error when both bounds for a given parameter in new_bounds exceed the global_bounds

* Update domain_reduction.py comments

* fixed English in domain_reduction.py

* use numpy to sort bounds,  boundary exceeded warn.

* simple sort test added

* domain_red windows target_space to global_bounds

Added windowing function to improve the convergence of optimizers that use domain_reduction. Improved comments and documentation.

* target_space.max respects bounds; SDRT warnings

* Remove unused function.

This function was used to prototype a solution. It should not have been pushed and can be removed.

* Updated target_space.py docstrings

* Update tests/test_target_space.py

Co-authored-by: till-m <[email protected]>

* Added pbound warnings, updated various tests.

* updated line spacing for consistency and style

* added pbound test condition

---------

Co-authored-by: till-m <[email protected]>

* DomainReduction docs, docstyle

* Add missing doc dependency

---------

Co-authored-by: YoungJae Bae <[email protected]>
Co-authored-by: Leandro Braga <[email protected]>
Co-authored-by: ptapping <[email protected]>
Co-authored-by: Edgar <[email protected]>

* Small fixes, minor cosmetic changes

* Add some more docs to target space and constraint, cosmetic changes

* Remove duplicate code snippet

* Remove numpydoc + adjust "*" formatting accordingly

* Explicitly add D417, adjust code accordingly

* Adjust `TargetSpace.probe()` behaviour to be in line with docstring.

* Update bayes_opt/target_space.py

Co-authored-by: Edgar <[email protected]>

* Update README.md

---------

Co-authored-by: Leandro Braga <[email protected]>
Co-authored-by: YoungJae Bae <[email protected]>
Co-authored-by: ptapping <[email protected]>
Co-authored-by: Edgar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants