Nice Work on CodeRM but Many Missing Citations #1

ashwin296 · 2025-02-07T05:34:13Z

This is really nice work that contributes new reward models in the coding domain.

However, many relevant prior works are not cited. The claim that "very few models have explored the potential of reinforcement learning" in the introduction is misleading (currently, only CodeRL is cited in Section 5.4).

To name a few:

It would be great to clarify the contributions to reward model training (which are quite valuable) and adjust the phrasing to better reflect prior works.

Thanks again for open-sourcing!

-- a Reinforcement Learner : )

wenhuchen · 2025-02-07T16:03:42Z

Thanks for the reminder. We have read these insightful papers and will compare against them in the paper.

ashwin296 · 2025-02-07T17:05:33Z

And PG-TD, where there are value estimates.
