forward() got an unexpected keyword argument 'num_items_in_batch' #35838

Bachstelze opened this issue Jan 22, 2025 · 6 comments

System Info

Recent transformers versions can't train encoder-decoder models.
Related issue and pull request: #34575
System info:

  • transformers version: 4.48.1
  • Platform: Linux-6.8.0-36-generic-x86_64-with-glibc2.39
  • Python version: 3.12.8
  • Huggingface_hub version: 0.24.6
  • Safetensors version: 0.4.5
  • Accelerate version: 1.2.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: Tesla V100-PCIE-32GB
Traceback (most recent call last):
  File "/home/hilsenbek/workspace/thesis/syntax_transformer/training/train_cross_attention.py", line 110, in <module>
    trainer.train()
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/transformers/trainer.py", line 2171, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/transformers/trainer.py", line 2531, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/transformers/trainer.py", line 3675, in training_step
    loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/transformers/trainer.py", line 3731, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 433, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/accelerate/utils/operations.py", line 823, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/accelerate/utils/operations.py", line 811, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/amp/autocast_mode.py", line 43, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/transformers/models/encoder_decoder/modeling_encoder_decoder.py", line 603, in forward
    encoder_outputs = self.encoder(
                      ^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hilsenbek/.conda/envs/harness/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: RobertaModel.forward() got an unexpected keyword argument 'num_items_in_batch'

Who can help?

@ArthurZucker
@gheinrich

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Follow the blog post https://huggingface.co/blog/encoder-decoder and train the resulting encoder-decoder model with the Trainer.
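
A minimal reproduction sketch based on that blog post (the model names, toy data, and training arguments here are illustrative assumptions, not the exact script from the post):

from transformers import (
    EncoderDecoderModel,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = EncoderDecoderModel.from_encoder_decoder_pretrained("roberta-base", "roberta-base")
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# A tiny toy dataset is enough: the failure happens on the first training step,
# as soon as the Trainer forwards num_items_in_batch into model(**inputs).
enc = tokenizer(["hello world"] * 8, padding=True, return_tensors="pt")
train_dataset = [
    {
        "input_ids": enc["input_ids"][i],
        "attention_mask": enc["attention_mask"][i],
        "labels": enc["input_ids"][i].clone(),
    }
    for i in range(len(enc["input_ids"]))
]

args = TrainingArguments(output_dir="enc-dec-repro", per_device_train_batch_size=2, max_steps=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# On transformers 4.48.1 this raises:
# TypeError: RobertaModel.forward() got an unexpected keyword argument 'num_items_in_batch'
trainer.train()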

Expected behavior

Training should work as it did in older transformers versions.

@Bachstelze Bachstelze added the bug label Jan 22, 2025
@Rocketknight1 (Member)

This seems related to the trainer changes - cc @muellerzr @SunMarc

@shubhamjain0594

Getting the same error for the Gemma model.

@SilverSoldier

Same for Bloom, which treats unexpected arguments as deprecated and throws ValueError: Got unexpected arguments: {'num_items_in_batch': 5120}.

These three lines in Trainer.compute_loss seem to be causing the problem:

loss_kwargs["num_items_in_batch"] = num_items_in_batch
inputs = {**inputs, **loss_kwargs}
outputs = model(**inputs)
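
A possible user-side workaround (an illustrative sketch, not a fix proposed in this thread; the subclass name is made up, and it assumes the base compute_loss only injects num_items_in_batch when it receives a non-None value):

from transformers import Trainer

class DropLossKwargsTrainer(Trainer):
    # Accept and discard num_items_in_batch instead of letting it be merged into
    # the model inputs; the base implementation then calls model(**inputs)
    # without the extra keyword, as older versions did.
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        return super().compute_loss(model, inputs, return_outputs=return_outputs)

It can then be used as a drop-in replacement for Trainer when constructing the trainer.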

@ArthurZucker (Collaborator)

We'll do a patch as soon as there is a fix!

@SunMarc (Member) commented Jan 23, 2025

Can you share the traceback for the Gemma model error, @shubhamjain0594?

For the Bloom error, this can be easily fixed by setting accepts_loss_kwargs = False in the Bloom modeling code. It happens because Bloom's forward allows extra kwargs to be passed through, hence the issue.

For the encoder-decoder model, it is because we allow **kwargs in the forward and kwargs_encoder is not set correctly.

I'll let @muellerzr decide how to fix these. Maybe the easiest fix would be to just set accepts_loss_kwargs = True for models that support it.
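
For context, a simplified paraphrase of the kwarg routing in modeling_encoder_decoder.py (illustrative only, not the exact library source; the helper name below is made up): extra keywords that do not start with decoder_ are treated as encoder kwargs, which is how num_items_in_batch ends up in RobertaModel.forward().

def split_encoder_decoder_kwargs(kwargs):
    # Simplified paraphrase of how EncoderDecoderModel.forward routes extra kwargs.
    kwargs_encoder = {k: v for k, v in kwargs.items() if not k.startswith("decoder_")}
    kwargs_decoder = {k[len("decoder_"):]: v for k, v in kwargs.items() if k.startswith("decoder_")}
    return kwargs_encoder, kwargs_decoder

enc_kwargs, dec_kwargs = split_encoder_decoder_kwargs({"num_items_in_batch": 5120})
print(enc_kwargs)  # {'num_items_in_batch': 5120} -> forwarded to self.encoder(...),
                   # whose forward() rejects it with the TypeError shown above.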

@shubhamjain0594

[Image: screenshot of the Gemma traceback]

@SunMarc here you go. Does this help?
