In v0.4.0, PretrainedConfig.get_config_dict fails for TinyStories-Instruct-2Layers-33M #322
Comments
Hey @PhilipQuirke, sorry to hear that! Can you give me a fuller example? This short one seems to work for me:
from nnsight import LanguageModel
model = LanguageModel("roneneldan/TinyStories-Instruct-2Layers-33M", device_map="auto", dispatch=True)
with model.generate("hello", max_new_tokens=10, pad_token_id=model.tokenizer.eos_token_id) as tracer:
    final_output = model.generator.output.save()
I can't attach an ipynb. Below is the body of my test ipynb, which demonstrates the issue. Does this help?
@PhilipQuirke Just pushed a new release. Can you try now?
That's certainly a big improvement: the TinyStories model now loads successfully. Thanks very much! Further testing revealed another change. The commented line now fails in 0.4.0, but only for the TinyStories model (model_num == 1):
Do you want a separate issue for this?
@PhilipQuirke Ah, I see the problem. For now you can use
Our "TinySQL" project uses nnsight to investigate "text to SQL" models built on the following base models:
The following code worked with nnsight v0.3.7 for all three models:
The same code fails in v0.4.0 for the TinyStories model (only) with an endless recursive copy starting in:
FYI, we have developed a set of notebooks using nnsight that mirror and then extend the nnsight tutorials, and that work for the above 3 models.
Once we have finished our investigation, we intend to publish our training datasets, models, notebooks and code library as "worked examples".
We'd appreciate your help resolving this issue with v0.4.0 so our notebooks work with the latest nnsight code.
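Since the behavior differs between v0.3.7 and v0.4.0, it may help to record exactly which package versions are installed when reproducing the failure. The sketch below is only an illustration, not part of the original notebook: the helper names `installed_version` and `release_tuple` are hypothetical, and checking `transformers` and `torch` alongside `nnsight` is an assumption about which dependencies are relevant here.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version: str):
    """Parse the leading numeric 'X.Y.Z' part of a version string into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at suffixes such as 'rc1' or '.dev0'
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


if __name__ == "__main__":
    # Package list is an assumption; adjust to the environment being debugged.
    for name in ("nnsight", "transformers", "torch"):
        print(name, installed_version(name))
```

Comparing `release_tuple(installed_version("nnsight"))` against `(0, 4, 0)` is one way to guard version-specific workarounds in a notebook, provided the package is installed.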