Why are there so few training epochs but so many validation epochs? Is this normal? #59

Open
MeetingTheSea opened this issue Aug 8, 2023 · 1 comment
@MeetingTheSea

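After getting the model running, I noticed that training has only 1803 epochs while validation has 466232. Is this normal? It seems counterintuitive to me. Also, the data split looks like 8:1:1, yet once the program is running it feels as though the training data loader and the validation data loader have been swapped. I have barely changed the code. What is causing this?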

@Wicknight
Collaborator

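Hello @MeetingTheSea!
Several factors can cause this, for example different batch sizes for training and validation, or your evaluation mode. My guess is that you are using the full-sort evaluation mode (inherited from RecBole). In this mode every item is scored for every user, so batch_num * batch_size should be roughly 10% * number of users * number of items, which is a different calculation from the one used for training data. I suggest increasing eval_batch_size by a few orders of magnitude, for example 100x or 1000x, so that the evaluation is parallelized better.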
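
To make the arithmetic above concrete, here is a minimal sketch of the estimate batch_num * batch_size ≈ 10% * users * items. The function name `estimate_eval_batches` and the dataset sizes are hypothetical, chosen only for illustration. It is a rough back-of-the-envelope calculation and not RecBole's actual data loader logic.

```python
def estimate_eval_batches(n_users, n_items, eval_percent, eval_batch_size):
    """Rough number of evaluation batches under full-sort scoring.

    Assumes (as in the reply above) that about eval_percent % of the
    users-times-items pairs are scored and that eval_batch_size counts
    user-item pairs. Hypothetical helper, not part of RecBole.
    """
    scored_pairs = n_users * n_items * eval_percent // 100
    # Integer ceiling division: a partial last batch still counts as one batch.
    return -(-scored_pairs // eval_batch_size)


# Hypothetical dataset sizes, for illustration only.
n_users, n_items = 50_000, 20_000

base = 4096
for factor in (1, 100, 1000):
    batch_size = base * factor
    n_batches = estimate_eval_batches(n_users, n_items, 10, batch_size)
    print(f"eval_batch_size={batch_size}: about {n_batches} batches")
```

Under these assumptions the batch count shrinks roughly in proportion to the increase in eval_batch_size, which is why a 100x or 1000x larger value reduces the number of validation steps so sharply.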

Wicknight self-assigned this Dec 8, 2023
2 participants