LLaMA does not enable bias terms by default. But following Su Jianlin's latest idea, adding the bias terms back on q and k can noticeably improve length-extrapolation performance. Would the author consider testing this in pretraining? https://kexue.fm/archives/9577
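For reference, a minimal sketch of what "adding the bias back on q and k" might look like in a LLaMA-style attention module. This is PyTorch; the class name and the `qk_bias` flag are illustrative assumptions, not this repo's actual code:

```python
import torch
import torch.nn as nn

class QKBiasAttentionProjections(nn.Module):
    """Hypothetical sketch: LLaMA-style attention projections with bias
    re-enabled on q and k only. Names and the qk_bias flag are
    illustrative, not taken from this repository."""

    def __init__(self, hidden_size: int, num_heads: int, qk_bias: bool = True):
        super().__init__()
        self.head_dim = hidden_size // num_heads
        # LLaMA uses bias=False on all projections by default;
        # here only q and k get their bias terms back.
        self.q_proj = nn.Linear(hidden_size, hidden_size, bias=qk_bias)
        self.k_proj = nn.Linear(hidden_size, hidden_size, bias=qk_bias)
        self.v_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.o_proj = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor):
        # Return the raw projections; RoPE and the attention
        # computation itself are unchanged by this tweak.
        return self.q_proj(x), self.k_proj(x), self.v_proj(x)
```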
Good point, I'll add it the next time I pretrain.
Does the currently released pretraining code include the bias terms?
The run with bias terms added is still training; the corresponding code will be released once training finishes.
Is this project trained from scratch based on the LLaMA architecture?