
Why 1023 in this code? #123

Closed
ty5491003 opened this issue Sep 24, 2019 · 2 comments

@ty5491003

length=min(length, 1023 - (len(context_tokens) if prefix else 0)),

From this code, we can see that the maximum length of the generated text is 1023 tokens, no matter how we adjust the `--length` parameter when generating. Why is the constant 1023 used here?
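For illustration, here is a minimal sketch of the clamping behavior in the quoted line (the function name and default arguments are hypothetical stand-ins, not the repo's actual signature):

```python
def effective_length(requested_length, context_tokens=None, prefix=None):
    # Mirrors: length = min(length, 1023 - (len(context_tokens) if prefix else 0))
    return min(requested_length, 1023 - (len(context_tokens) if prefix else 0))

# Even a very large --length is capped at 1023 when no prefix is supplied.
print(effective_length(5000))  # 1023

# With a prefix, its token count is subtracted from the 1023-token budget.
print(effective_length(5000, context_tokens=[0] * 100, prefix="Hello"))  # 923
```

So the requested length only takes effect when it already fits inside the budget.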

@minimaxir
Owner

The original GPT-2 model is fixed at a context window of 1024 tokens; if you go over, it'll error.

The fix is a sliding-window approach, which is in #87, but it needs testing.
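To sketch the sliding-window idea (hypothetical helper names; this is not the implementation in the linked PR): once generation reaches the window limit, only the most recent tokens are re-fed as context, so output can grow past 1024 tokens.

```python
def generate_long(model_step, prompt_tokens, total_new_tokens, window=1024):
    """Generate more than `window` tokens with a fixed-context model.

    `model_step` is a hypothetical callable: given a token list of length
    at most `window - 1`, it returns the next token.
    """
    tokens = list(prompt_tokens)
    for _ in range(total_new_tokens):
        # Keep only the most recent tokens, leaving room for the new one,
        # so the model never sees more than its context window allows.
        context = tokens[-(window - 1):]
        tokens.append(model_step(context))
    return tokens
```

The trade-off is that tokens older than the window are forgotten, so long outputs can drift off-topic.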

@ty5491003
Author

Got it, thx.
