
In v0.4.0, PretrainedConfig.get_config_dict fails for TinyStories-Instruct-2Layers-33M #322

Open
PhilipQuirke opened this issue Feb 4, 2025 · 5 comments

@PhilipQuirke

Our "TinySQL" project uses nnsight to investigate "text to SQL" models, built on the following base models:

  • roneneldan/TinyStories-Instruct-2Layers-33M
  • Qwen/Qwen2.5-0.5B-Instruct
  • withmartian/Llama-3.2-1B-Instruct

The following code worked with nnsight v0.3.7 for all three models:

with model.generate(inputs['input_ids'], max_new_tokens=10, pad_token_id=model.tokenizer.eos_token_id) as tracer:
	final_output = model.generator.output.save()

The same code fails in v0.4.0, but only for the TinyStories model, with an endless recursion starting in:

File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/configuration_auto.py", line 1021, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)

FYI, we have developed a set of notebooks using nnsight that mirror and then extend the nnsight tutorials, and they work for all three models above.
Once we have finished our investigation, we intend to publish our training datasets, models, notebooks, and code library as "worked examples".

We'd appreciate your help resolving this issue with v0.4.0 so our notebooks work with the latest nnsight code.

@JadenFiotto-Kaufman
Member

Hey @PhilipQuirke, sorry to hear that! Can you give me a more complete example? This short one seems to work for me:

from nnsight import LanguageModel

model = LanguageModel("roneneldan/TinyStories-Instruct-2Layers-33M", device_map="auto", dispatch=True)

with model.generate("hello", max_new_tokens=10, pad_token_id=model.tokenizer.eos_token_id) as tracer:
	final_output = model.generator.output.save()

@PhilipQuirke
Author

Can't attach an ipynb. Below is the body of my test notebook, which demonstrates the issue. Does this help?

!pip install -U nnsight -q          # Fails
#!pip install nnsight==0.3.7 -q     # Works
import nnsight

!pip install transformers -q
from transformers import AutoTokenizer, AutoModelForCausalLM

import torch

model_location = "roneneldan/TinyStories-Instruct-2Layers-33M"

tokenizer = AutoTokenizer.from_pretrained(model_location)

# model without flash attention
auto_model = AutoModelForCausalLM.from_pretrained(
    model_location,
    torch_dtype=torch.float32,
    device_map="auto",
)

tokenizer.padding_side = "left"
tokenizer.add_special_tokens({'pad_token': '<|pad|>'})

auto_model.resize_token_embeddings(len(tokenizer), mean_resizing=False)
auto_model.config.pad_token_id = tokenizer.pad_token_id
auto_model.resize_token_embeddings(len(tokenizer))

model = nnsight.LanguageModel(auto_model, tokenizer)
model.tokenizer = tokenizer

the_prompt = "Instructions: get distance from locations Context: CREATE TABLE locations ( distance INT, size INT) Response:"

inputs = model.tokenizer(the_prompt, return_tensors="pt", padding=True)
with model.generate(inputs['input_ids'], max_new_tokens=10, pad_token_id=model.tokenizer.eos_token_id) as tracer:
    final_output = model.generator.output.save()

@JadenFiotto-Kaufman
Member

@PhilipQuirke Just pushed a new release. Can you try now?

@PhilipQuirke
Author

That's a big improvement: the TinyStories model now loads successfully. Thanks very much!

Further testing revealed another change. The commented line below now fails in 0.4.0, but only for the TinyStories model (model_num == 1):

    N_LAYERS = len(model.transformer.h) if model_num == 1 else len(model.model.layers)
    N_HEADS = model.config.num_attention_heads # Works in 0.3.7, fails in 0.4.0
    D_MODEL = model.transformer.wte.embedding_dim if model_num == 1 else model.config.hidden_size
    D_HEAD = D_MODEL // N_HEADS  

Do you want a separate issue for this?
Alternatively, I'm happy to change my code. Is there a better, more consistent way to calculate the above four values across models?

@JadenFiotto-Kaufman
Member

@PhilipQuirke Ah, I see the problem. For now you can use model._model.config instead of model.config
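
As a side note on the per-model branching above: the four values can also be derived from the config object alone. Transformers maps the canonical names (num_hidden_layers, num_attention_heads, hidden_size) onto architecture-specific attributes for most models, but not necessarily all, so the fallback names below are assumptions for configs that only expose GPT-style names. This is a minimal sketch, not an nnsight-sanctioned API:

```python
# Minimal sketch: derive layer/head/dimension counts from a config object.
# The canonical Hugging Face names are tried first; the fallbacks
# (num_layers, num_heads, n_embd) cover older GPT-style configs.
def model_dims(cfg):
    n_layers = getattr(cfg, "num_hidden_layers", None) or getattr(cfg, "num_layers", None)
    n_heads = getattr(cfg, "num_attention_heads", None) or getattr(cfg, "num_heads", None)
    d_model = getattr(cfg, "hidden_size", None) or getattr(cfg, "n_embd", None)
    return n_layers, n_heads, d_model, d_model // n_heads
```

With the v0.4.0 workaround above, this could be called as model_dims(model._model.config) for all three models, avoiding the model_num branches.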
