According to the author's description, pretrain_512.pth and full_sft_512.pth should be only 26M, and pretrain_768.pth should be 108M, yet the saved files are several times larger. Why is that? I have also checked that the model structure in LMConfig is correct, and the model size printed to the console at the start of training is correct as well. Has anyone else run into this?
The "M" here means million parameters, not file size.
A float32 value takes 4 bytes, so a 100M-parameter model's weight file ≈ 100M × 4 bytes ≈ 400,000,000 bytes ≈ 400 MB.
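To check this against a saved checkpoint, here is a minimal sketch (assuming a standard PyTorch state_dict saved with `torch.save`; the file name `pretrain_512.pth` is just the example from this issue, and the `"model"` wrapper key is an assumption about how the training script saves the weights):

```python
import torch

# Load the checkpoint onto the CPU; no GPU is needed just to inspect sizes.
ckpt = torch.load("pretrain_512.pth", map_location="cpu")

# Some training scripts wrap the weights, e.g. {"model": state_dict, ...};
# fall back to treating the whole object as the state_dict otherwise.
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

tensors = [t for t in state_dict.values() if torch.is_tensor(t)]
num_params = sum(t.numel() for t in tensors)
num_bytes = sum(t.numel() * t.element_size() for t in tensors)

print(f"parameters: {num_params / 1e6:.1f}M")
print(f"expected weight size: {num_bytes / 1e6:.1f} MB (float32 = 4 bytes/param)")
```

For a 26M-parameter model stored in float32 this prints roughly 104 MB, which matches a file "several times larger" than 26.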
Thanks a lot, I understand now!