I get a parameter error when I use a pretrained model #78

Open
coobMagicX opened this issue Mar 20, 2023 · 3 comments
Comments

@coobMagicX

When I use the pretrained model with the PyTorch version, search.py fails with an assertion error: the lengths of codebase and codevecs do not match.

Traceback (most recent call last):
  File "search.py", line 150, in <module>
    assert len(codebase)==len(codevecs), "inconsistent number of chunks, check whether the specified files for codebase and code vectors are correct!"
AssertionError: inconsistent number of chunks, check whether the specified files for codebase and code vectors are correct!

@guxd
Owner

guxd commented Mar 30, 2023

This is probably because you did not specify the --chunk_size argument.
The default value (2M) is set for the dataset we provide. If you use your own dataset, you need to set a chunk size that fits it.
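To illustrate why the chunk size matters, here is a minimal sketch. It is not the code from search.py: the helper names `split_into_chunks` and `expected_num_chunks` and the toy data are hypothetical, and it only assumes that the codebase and the code vectors are each split into consecutive chunks. If the two sides end up split with different chunk sizes, the number of chunks differs and a check like the assertion above fails.

```python
import math


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def expected_num_chunks(n_items, chunk_size):
    """Number of chunks produced for n_items at the given chunk size."""
    return math.ceil(n_items / chunk_size)


# Hypothetical toy data: 10 code snippets and their 10 vectors.
snippets = [f"snippet_{i}" for i in range(10)]
vectors = [[float(i)] for i in range(10)]

# Splitting the two sides with different chunk sizes gives different counts.
codebase = split_into_chunks(snippets, chunk_size=4)
codevecs = split_into_chunks(vectors, chunk_size=5)
mismatch = len(codebase) != len(codevecs)

# Using the same chunk size on both sides makes the counts agree.
codevecs_fixed = split_into_chunks(vectors, chunk_size=4)
consistent = len(codebase) == len(codevecs_fixed)
```

In this toy setup the mismatch goes away once both sides use the same chunk size, which is the kind of consistency the --chunk_size argument is meant to control.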

@coobMagicX
Author

Yes. I used the dataset downloaded from Google Drive and did not change chunk_size at first, but the lengths still did not match.
I have now switched to the Keras version of the project. Could you please provide the raw code datasets used for training, or tell me where I can get them? Thank you so much.

@guxd
Owner

guxd commented Jun 6, 2023

The raw code datasets are available at /pytorch/train.rawcode.rar
