In GPI-LS for continuous action, use GPI only for selecting weights as default #126

LucasAlegre · 2024-10-25T22:24:51Z

In the continuous-action case, GPI can only be approximated, so it is not fully equivalent to the discrete case. I have found that GPI-LS with continuous actions works much better with use_gpi=False. That is, GPI is used only to select the new weights at each iteration, not when evaluating the algorithm.

This also reduces training time, since evaluating an action takes less time.
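To illustrate the difference, here is a minimal sketch of action selection with and without GPI. It is not the code changed in this PR: `actor`, `critic`, `select_action`, and `weight_support` are hypothetical stand-ins, and it assumes GPI is approximated by maximizing only over the actions proposed by the policies of previously seen weights. Under that assumption, the GPI path needs one actor call and one critic call per stored weight vector, while the direct path needs a single actor call.

```python
import numpy as np


def actor(obs, w):
    """Hypothetical weight-conditioned policy pi(s, w) (toy stand-in)."""
    return np.tanh(obs * w)


def critic(obs, action, w):
    """Hypothetical multi-objective critic Q(s, a, w); one value per objective."""
    return -((action - w) ** 2)


def select_action(obs, w, weight_support, use_gpi):
    """Pick an action for preference weights w.

    use_gpi=False: act directly with the policy conditioned on w.
    use_gpi=True:  approximate GPI by scoring the actions proposed for each
                   stored weight vector and keeping the best one under w.
    """
    if not use_gpi or len(weight_support) == 0:
        return actor(obs, w)

    # One candidate action per weight vector in the support.
    candidates = [actor(obs, w_i) for w_i in weight_support]
    # Scalarize each candidate's multi-objective value with the target weights.
    scores = [float(np.dot(critic(obs, a, w), w)) for a in candidates]
    return candidates[int(np.argmax(scores))]


obs = np.array([0.5, -0.2])
w = np.array([0.7, 0.3])
weight_support = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), w]

direct_action = select_action(obs, w, weight_support, use_gpi=False)
gpi_action = select_action(obs, w, weight_support, use_gpi=True)
```

Because the maximization runs over a finite set of candidate actions rather than the whole continuous action space, the result is only an approximation of GPI, which is consistent with the caveat in the description above.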

Set use_gpi=False in the continuous action case

4c8c2e2

LucasAlegre self-assigned this Oct 25, 2024

LucasAlegre changed the title ~~Set use_gpi=False in the continuous action case~~ In GPI-LS for continuous action, use GPI only for selecting weights as default Oct 25, 2024

LucasAlegre requested a review from ffelten October 25, 2024 22:37

ffelten approved these changes Oct 25, 2024

LucasAlegre merged commit 665657b into main Oct 25, 2024
4 checks passed

LucasAlegre deleted the gpi-ls-continuous-action-use-gpi-default branch October 25, 2024 23:57
