Discussion: Config of RL Model #71

fabioseel · 2025-01-13T12:50:50Z

Should the model be completely defined by the config, or should the RL model automatically add simple linear layers such as the action parameterization?

OK, so here are my thoughts on this after developing for a while:

Pro complete definition by config

- cleaner / easier to understand
- simpler weights file for reuse (as the state_dict will not have additional keys)
- more consistent loss definition (target circuit)

Contra complete definition by config

- the weights file is not reusable in other frameworks out of the box
  - model parts will need to be ignored on load (how do we define this nicely? In the config?)
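To make the "ignore on load" question concrete, here is a minimal sketch of one possible approach: filter the checkpoint by key prefix before loading. The config key `ignore_on_load`, the helper `drop_ignored_keys`, and the layer names are all hypothetical and not part of the library; plain strings stand in for tensors so the example runs without PyTorch.

```python
from collections import OrderedDict
from typing import Iterable, Mapping


def drop_ignored_keys(state_dict: Mapping[str, object],
                      ignore_prefixes: Iterable[str]) -> "OrderedDict[str, object]":
    """Return a copy of state_dict without keys that start with an ignored prefix."""
    prefixes = tuple(ignore_prefixes)
    return OrderedDict(
        (key, value)
        for key, value in state_dict.items()
        if not key.startswith(prefixes)
    )


# Hypothetical checkpoint: plain strings stand in for tensors.
checkpoint = OrderedDict([
    ("encoder.conv1.weight", "w0"),
    ("encoder.conv1.bias", "b0"),
    ("action_parameterization.weight", "w1"),
    ("action_parameterization.bias", "b1"),
])

# Hypothetical config entry listing the model parts to skip on load.
config = {"ignore_on_load": ["action_parameterization."]}

filtered = drop_ignored_keys(checkpoint, config["ignore_on_load"])
print(list(filtered))
```

With a real PyTorch model, the filtered mapping could then be passed to `load_state_dict(filtered, strict=False)`, which reports missing and unexpected keys instead of raising on them.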

How do we deal with the head / core / tail structure enforced by samplefactory?

I see the following options:

- Implement autodiscovery of where to 'split' the circuit into these three parts.
  - This would be the easiest to use, as long as it works. And that's a big IF.
- Somehow define (e.g. through the config) which part contains which circuits.
  - This binds us quite strictly to the samplefactory model style, which we don't really want.
- Modify the samplefactory code.
  - With the introduction of a custom learner this could be possible. Perhaps one could even implement it in a way that is backwards compatible with samplefactory and integrate it into the library?
  - Risks: we may lose some of the optimization performance, as the library does some 'magic' for the individual parts that I have not dived into yet. In particular, the core allows the use of an RNN, and the learner applies some optimizations for that. While I think this is probably the best solution considering what we get, it's also the most complicated, and we might deviate further from 'standard' samplefactory.
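As an illustration of the second option (defining the parts through the config), here is a rough sketch. It assumes the circuits run in a fixed forward order and that the config lists circuit names per part. The function `split_circuits`, the config layout, and the circuit names are hypothetical; nothing here uses the samplefactory API.

```python
from typing import Dict, List

PARTS = ("head", "core", "tail")


def split_circuits(circuit_order: List[str],
                   part_config: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Check that head + core + tail reproduces the forward order exactly,
    then return the circuits grouped by part."""
    unknown = set(part_config) - set(PARTS)
    if unknown:
        raise ValueError(f"unknown parts in config: {sorted(unknown)}")
    # Concatenating the parts in order must give back the full circuit order,
    # i.e. every circuit is assigned exactly once and the parts are contiguous.
    assigned = [name for part in PARTS for name in part_config.get(part, [])]
    if assigned != circuit_order:
        raise ValueError(
            f"parts {assigned} do not match the circuit order {circuit_order}"
        )
    return {part: list(part_config.get(part, [])) for part in PARTS}


# Hypothetical circuit names and config section.
circuit_order = ["vision", "memory", "policy"]
part_config = {"head": ["vision"], "core": ["memory"], "tail": ["policy"]}

parts = split_circuits(circuit_order, part_config)
print(parts)
```

The validation step is the main point: a config that assigns circuits out of order or leaves one out fails early, rather than silently producing a model that does not match the intended structure.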

fabioseel mentioned this issue Jan 13, 2025

Reinforcement Learning #31

Open

10 tasks

fabioseel changed the title ~~decide / discuss: should model be completely defined by config or should the RL Model automatically add simple linear layers such as the Action Parameterization?~~ Discussion: Config of RL Model Jan 13, 2025

fabioseel added this to the Sample Factory + RL milestone Jan 13, 2025

fabioseel added the Feature (A new capability in the library) and Major (A large issue that may require a significant commit) labels Jan 13, 2025
