In GPI-LS for continuous action, use GPI only for selecting weights as default #126

Merged
merged 1 commit into from
Oct 25, 2024

Conversation

LucasAlegre
Owner

@LucasAlegre LucasAlegre commented Oct 25, 2024

In the continuous-action case, GPI can only be approximated, so it is not fully equivalent to GPI in the discrete-action case. I have found that GPI-LS with continuous actions works much better with use_gpi=False. That is, GPI is used only to select the new weights at each iteration, and not when evaluating the algorithm.

This also reduces training time, since computing an action takes less time.
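
For readers unfamiliar with the distinction, here is a rough sketch of the two action-selection modes. It is not the code touched by this PR: `toy_policy`, `toy_q`, `act_without_gpi` and `act_with_approx_gpi` are hypothetical names, and the sketch assumes a weight-conditioned policy plus a vector-valued critic. It only illustrates why GPI is approximate with continuous actions (the maximization runs over a finite set of candidate actions) and why it costs more per step.

```python
import numpy as np


def act_without_gpi(policy, obs, w):
    """use_gpi=False: one call to the policy conditioned on the target weights."""
    return policy(obs, w)


def act_with_approx_gpi(policy, q_fn, obs, w, weight_support):
    """Approximate GPI: pick the best action among a finite candidate set.

    Candidates are the actions proposed by the policy for each stored weight
    vector. Each candidate is scored by its best scalarized Q-value (under w)
    across the stored weight vectors. Cost: n policy calls and n*n critic calls.
    """
    candidates = [policy(obs, w_i) for w_i in weight_support]
    scores = [
        max(float(np.dot(q_fn(obs, a, w_j), w)) for w_j in weight_support)
        for a in candidates
    ]
    return candidates[int(np.argmax(scores))]


# Toy stand-ins (assumptions for illustration only): 1-D action, 2 objectives.
def toy_policy(obs, w):
    # Proposes an action equal to the weight placed on the first objective.
    return np.array([w[0]])


def toy_q(obs, action, w):
    # Vector-valued critic: objective 1 rewards large actions, objective 2 small ones.
    return np.array([action[0], 1.0 - action[0]])


obs = np.zeros(3)
w = np.array([0.8, 0.2])
weight_support = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]

a_plain = act_without_gpi(toy_policy, obs, w)
a_gpi = act_with_approx_gpi(toy_policy, toy_q, obs, w, weight_support)
```

In this toy setup the two modes return different actions. With learned, imperfect critics the GPI choice is not guaranteed to be better, which is consistent with the observation above.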

@LucasAlegre LucasAlegre self-assigned this Oct 25, 2024
@LucasAlegre LucasAlegre changed the title from "Set use_gpi=False in the continuous action case" to "In GPI-LS for continuous action, use GPI only for selecting weights as default" Oct 25, 2024
@LucasAlegre LucasAlegre requested a review from ffelten October 25, 2024 22:37
@LucasAlegre LucasAlegre merged commit 665657b into main Oct 25, 2024
4 checks passed
@LucasAlegre LucasAlegre deleted the gpi-ls-continuous-action-use-gpi-default branch October 25, 2024 23:57
2 participants