[BugFix] Model State Reload with Quantized Stubs in SparseAutoModelForCausalLM #2226
The snapshot_download call can pull down a lot of extra files, since it downloads the whole folder. This might interact weirdly with the resolve_recipe call, which specifically tries to download the recipe. This works, but I would like to be a bit more selective with the download.
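One way to narrow the download, as a rough sketch only: huggingface_hub's snapshot_download accepts an allow_patterns argument of glob-style patterns. The pattern list and both helper names below are assumptions made for illustration. They are not taken from this PR, and the set of files reload_model_state actually needs is not stated in this thread.

```python
from fnmatch import fnmatch

# Assumed patterns for illustration; adjust to whatever the reload step needs.
WEIGHT_PATTERNS = ["*.safetensors", "*.bin", "*.json"]


def select_files(filenames, patterns=WEIGHT_PATTERNS):
    """Return only the filenames that match at least one glob pattern."""
    return [
        name
        for name in filenames
        if any(fnmatch(name, pattern) for pattern in patterns)
    ]


def download_selected(repo_id, patterns=WEIGHT_PATTERNS):
    """Download only the matching files of a Hub repo; return the local folder."""
    # Imported lazily so select_files works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=list(patterns))
```

select_files is a local preview of which files a given pattern list would keep. With these patterns the recipe file is skipped, so it would still be fetched separately by resolve_recipe.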
LGTM pending @mgoin's comment
Addressed in latest commit @mgoin
Thanks a lot! It's additional complexity but I think good to have
It was a great callout. Really appreciate it.
Description
Identified a bug in the main branch where the model state fails to reload when using quantized stubs with SparseAutoModelForCausalLM.from_pretrained(...). The issue was due to the reload_model_state method expecting weight files in a local directory, not accounting for remotely hosted model directories.
Solution
Propose downloading the model directory before invoking reload_model_state to ensure weight files are available locally for model state reload.
Testing
Tested with the following script, confirming the fix resolves the issue:
Observations
Before the Fix:
Model state fails to reload due to missing local weight files, as shown in warnings and errors in the logs.
After the Fix:
Successfully reloaded model state with the fix.
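For readers who want the shape of the approach described under Solution, here is a minimal sketch. It assumes the fix amounts to resolving the model path to a local directory before the reload step. The function name ensure_local_dir and the injected download_fn are hypothetical and do not come from the PR's diff.

```python
import os


def ensure_local_dir(model_path, download_fn):
    """Return a local directory holding the model files.

    model_path:  a local directory or a remote model stub (hypothetical input).
    download_fn: callable that takes the stub and returns a local directory,
                 for example a wrapper around huggingface_hub.snapshot_download.
    """
    if os.path.isdir(model_path):
        # Already local: the weight files can be read directly.
        return model_path
    # Not a local directory: fetch the files first, then reload from that path.
    return download_fn(model_path)
```

Passing the downloader in as an argument keeps the path check testable without network access. The returned directory is what a reload step such as reload_model_state would then read from.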