Question about the size of saved model files #116

Open
SSSS88888 opened this issue Jan 15, 2025 · 2 comments
Comments

@SSSS88888

image

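According to the author's description, pretrain_512.pth and full_sft_512.pth should be only 26M, and pretrain_768.pth should be 108M, but the saved files are several times larger. Why is that?
I also checked that the model structure in LMConfig is correct, and the size printed to the console when training starts is correct too. Has anyone else run into this?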

image
@jingyaogong
Owner

jingyaogong commented Jan 16, 2025

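The M here stands for million parameters (the parameter count), not the file size.

A float32 value takes 4 bytes.

Weight file size for a 100M-parameter model ≈ 100M × 4 bytes = 400,000,000 bytes ≈ 400 MB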
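To make the arithmetic above concrete, here is a minimal sketch that estimates the on-disk size of the raw weights from a parameter count. The helper names (`estimate_checkpoint_bytes`, `format_size`) are hypothetical and not part of this repository, the parameter counts are the rounded 26M and 108M figures quoted in this thread, and the estimate ignores any file-format overhead or extra state stored in the checkpoint.

```python
# Bytes per parameter for common tensor dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def estimate_checkpoint_bytes(num_params: int, dtype: str = "float32") -> int:
    """Approximate size of the raw weights, ignoring file-format overhead."""
    return num_params * BYTES_PER_PARAM[dtype]


def format_size(num_bytes: int) -> str:
    """Render a byte count in decimal megabytes (1 MB = 10**6 bytes)."""
    return f"{num_bytes / 1e6:.0f} MB"


if __name__ == "__main__":
    # Rounded parameter counts taken from the thread (assumed, not measured).
    for name, params in [("26M model", 26_000_000), ("108M model", 108_000_000)]:
        size = estimate_checkpoint_bytes(params)
        print(f"{name}: about {format_size(size)} in float32")
```

Under these assumptions a 26M-parameter model comes out at roughly 104 MB and a 108M-parameter model at roughly 432 MB in float32, which fits the "several times larger" files described in the question. Saving the weights in a 2-byte dtype such as float16 would roughly halve those numbers.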

@SSSS88888
Author

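> The M here stands for million parameters (the parameter count), not the file size.
>
> A float32 value takes 4 bytes.
>
> Weight file size for a 100M-parameter model = 100M × 4 bytes = 400,000,000 bytes ≈ 400 MB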

Thank you so much, I understand now!
