[megatron] fix: critic and reward model load tokenizer from config #301

HollowMan6 · 2025-02-18T22:58:53Z

Currently, the worker fails if the critic or reward model path does not contain a tokenizer. This PR fixes that case by falling back to a tokenizer specified in the config.

- For the critic model, we fall back to loading from `critic.model.tokenizer_path`.
- For the reward model, we first fall back to loading from `reward_model.model.rm_tokenizer`, and then from `reward_model.model.input_tokenizer` if that is not set.
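
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names, the `path` config key, and the list of tokenizer file names are all assumptions made for the example, and the config is modeled as a plain nested dict rather than the project's real config object.

```python
import os
from typing import Mapping, Optional, Sequence

# Assumption: file names that usually indicate a tokenizer is present
# in a Hugging Face style model directory.
TOKENIZER_FILES = ("tokenizer.json", "tokenizer_config.json", "tokenizer.model")


def has_tokenizer(model_path: str) -> bool:
    """Return True if the directory contains any known tokenizer file."""
    return any(
        os.path.isfile(os.path.join(model_path, name)) for name in TOKENIZER_FILES
    )


def resolve_tokenizer_path(model_path: str, fallbacks: Sequence[Optional[str]]) -> str:
    """Prefer the model path; otherwise use the first fallback that is set."""
    if has_tokenizer(model_path):
        return model_path
    for candidate in fallbacks:
        if candidate:  # skip None and empty strings
            return candidate
    raise ValueError(
        f"No tokenizer found in {model_path!r} and no fallback is configured"
    )


def critic_tokenizer_path(config: Mapping) -> str:
    # Hypothetical layout: config["critic"]["model"]["path"] is the model directory.
    model = config["critic"]["model"]
    return resolve_tokenizer_path(model["path"], [model.get("tokenizer_path")])


def reward_tokenizer_path(config: Mapping) -> str:
    # rm_tokenizer takes priority; input_tokenizer is used only if it is unset.
    model = config["reward_model"]["model"]
    return resolve_tokenizer_path(
        model["path"], [model.get("rm_tokenizer"), model.get("input_tokenizer")]
    )
```

The resolved path would then be passed to whatever tokenizer loader the worker already uses. Checking for files up front is only one way to detect a missing tokenizer; catching the loader's exception and retrying with the fallback path would be an equally reasonable design.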

HollowMan6 · 2025-02-19T08:03:28Z

Just fixed the yapf formatting issue

verl/workers/megatron_workers.py

Currently, the worker will fail if the critic or reward model path doesn't contain a tokenizer. This PR tries to fix this by loading tokenizer from the config for the previously mentioned case.

- For the critic model, we fall back to load from `critic.model.tokenizer_path`.
- For the reward model, we first fall back to load from `reward_model.model.rm_tokenizer`, and then `reward_model.model.input_tokenizer` if that is not set.

Signed-off-by: Hollow Man

HollowMan6 force-pushed the tokenizer branch 2 times, most recently from 9c51e60 to ab24a03 on February 19, 2025 08:02

eric-haibin-lin reviewed Feb 21, 2025

verl/workers/megatron_workers.py (outdated, resolved)

HollowMan6 force-pushed the tokenizer branch from ab24a03 to 9b45db2 on February 21, 2025 15:04

HollowMan6 force-pushed the tokenizer branch from 9b45db2 to 2864034 on February 21, 2025 15:06

HollowMan6 requested a review from eric-haibin-lin on February 21, 2025 15:09
