How do I add to or modify the vocabulary file vocab.txt? #2

Open
Crescentz opened this issue Sep 25, 2020 · 3 comments

Comments

@Crescentz

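BERT's Chinese vocab.txt contains too few Chinese characters. When this happens in a vertical domain, how do I add my own tokens? The [unused] entries are not enough.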

@ZhuiyiTechnology
Owner

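1. Add them to vocab.txt.
2. Append them through the compound_tokens parameter.

The above only applies to bert4keras. It is worth studying closely how the training script appends words: https://github.com/ZhuiyiTechnology/WoBERT/blob/master/train.py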
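
To make the two steps above concrete, here is a minimal, library-free sketch of the idea. It is not the code from train.py: the toy vocabulary, the 2-dimensional embedding table, and the helper names `char_ids` and `add_tokens` are all hypothetical. It assumes that each new word is appended to the end of the vocabulary and that its embedding is initialized as the mean of the embeddings of the existing tokens it is composed of, which is my understanding of what `compound_tokens` does in bert4keras.

```python
# Toy base vocabulary standing in for vocab.txt (hypothetical contents).
base_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "心", "肌", "梗", "塞", "炎"]
token_dict = {tok: i for i, tok in enumerate(base_vocab)}

# Toy embedding table, one 2-d vector per base token (hypothetical values).
embeddings = [[float(i), float(i) * 2.0] for i in range(len(base_vocab))]


def char_ids(word, vocab):
    """Split a word into characters and look each one up in the given vocab."""
    unk = vocab["[UNK]"]
    return [vocab.get(ch, unk) for ch in word]


def add_tokens(new_words, token_dict, embeddings):
    """Append new words to the vocab and initialize their embeddings.

    Returns, for each added word, the list of ORIGINAL token ids it is
    built from. This list of id lists is the assumed shape of the
    compound_tokens argument.
    """
    original = dict(token_dict)  # snapshot, so ids refer to the base vocab
    dim = len(embeddings[0])
    compound_tokens = []
    for word in new_words:
        if word in token_dict:
            continue  # already present, nothing to add
        ids = char_ids(word, original)
        compound_tokens.append(ids)
        token_dict[word] = len(token_dict)  # step 1: extend the vocabulary
        # step 2: new row = mean of the constituent token embeddings
        embeddings.append(
            [sum(embeddings[i][d] for i in ids) / len(ids) for d in range(dim)]
        )
    return compound_tokens


compound_tokens = add_tokens(["心肌梗塞", "心肌炎"], token_dict, embeddings)
print(compound_tokens)
print(token_dict["心肌梗塞"], embeddings[token_dict["心肌梗塞"]])
```

With a real model, the extended `token_dict` would go to the tokenizer and the `compound_tokens` list to the model builder, as the linked train.py does; check that script for the exact calls before relying on this sketch.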

@yuhaiyan-77

Hello, I am unable to download the files. Is there another way to download the model?

@alanbreeze

Downloads have been restored.
