adding-support-for-mamba2 #1009
base: main
…niz-Guelmez/mlx-examples into adding-support-for-mamba2
…t MambaMixer block pass)
…niz-Guelmez/mlx-examples into adding-support-for-mamba2
…niz-Guelmez/mlx-examples into adding-support-for-mamba2
Codestral Mamba and other models rely on the Mamba2 architecture. Hopefully we can get this soon.
…tes Gibberish on Codestral
I think it has something to do with the Codestral repo on HF: the layers are not converted correctly. I'll try that later when I'm home.
…odestral. working: rokyang/mamba2-130m-hf rokyang/mamba2-370m-hf rokyang/mamba2-780m-hf rokyang/mamba2-1.3b-hf rokyang/mamba2-2.7b-hf
python -m mlx_lm.generate --model /Users/gokdenizgulmez/Desktop/Mamba-Codestral-7B-v0.1-4bit --prompt "# A function that computes fibonacci def fibonacci(" -m 64
==========
n): print(f"{os.path.abspath(".")/data/data/data/com.android.launcher.png) ## 🙌🏼 🙌🙌🙌🙌🙌🙌 class _State(Enum): def __init__ (self
==========
Prompt: 16 tokens, 84.547 tokens-per-sec
Generation: 64 tokens, 13.774 tokens-per-sec
Peak memory: 4.139 GB
Hey @awni, I finished it with mamba-codestral, I will push the quantised version up but you can also use
P.S. There is no prompt format, though.
@Goekdeniz-Guelmez Thanks for all your hard work on this!
@Goekdeniz-Guelmez I tried both Codestral and Mamba2 2.7B. Both models generate pretty bad responses. The 2.7B doesn't really work at all, even in 8-bit, so I think there must be a bug there. The Codestral one can generate text but doesn't seem to be able to end correctly. Is that your experience?
Yes, I’ve tried it again with a maximum number of generated tokens and got the same problems. I’ll look into it tomorrow.
I think I got something wrong somewhere in the SSM computation.
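One way to narrow down a bug like this is to compare the model's SSM output against a slow reference on small random inputs. Below is a minimal NumPy sketch, not the code from this PR. It assumes the single-head Mamba2 recurrence as commonly described: a scalar negative `A`, a per-step `dt`, state update `h_t = exp(dt_t * A) * h_{t-1} + dt_t * outer(x_t, B_t)` and output `y_t = h_t @ C_t + D * x_t`. The function names `ssm_scan_reference` and `ssm_quadratic_reference` are hypothetical and only for illustration.

```python
import numpy as np


def ssm_scan_reference(x, dt, A, B, C, D=0.0):
    """Sequential single-head reference (assumed Mamba2-style recurrence).

    x:  (T, P) inputs for one head
    dt: (T,)   positive step sizes
    A:  scalar decay parameter (negative)
    B:  (T, N) input projections
    C:  (T, N) output projections
    D:  scalar skip-connection weight
    """
    T, P = x.shape
    N = B.shape[1]
    h = np.zeros((P, N))  # hidden state, reset at the start of the sequence
    y = np.zeros((T, P))
    for t in range(T):
        decay = np.exp(dt[t] * A)                      # discretized A
        h = decay * h + dt[t] * np.outer(x[t], B[t])   # state update
        y[t] = h @ C[t] + D * x[t]                     # readout plus skip
    return y


def ssm_quadratic_reference(x, dt, A, B, C, D=0.0):
    """Same computation written as a masked, attention-like matrix product."""
    cum = np.cumsum(dt * A)  # cum[t] = A * sum of dt[0..t]
    # decay[t, s] = exp(A * sum of dt[s+1..t]) for s <= t, and 0 for s > t
    decay = np.tril(np.exp(cum[:, None] - cum[None, :]))
    scores = (C @ B.T) * decay * dt[None, :]
    return scores @ x + D * x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, P, N = 8, 4, 3
    x = rng.normal(size=(T, P))
    dt = rng.uniform(0.01, 0.5, size=T)
    B = rng.normal(size=(T, N))
    C = rng.normal(size=(T, N))
    y_seq = ssm_scan_reference(x, dt, -0.5, B, C, D=1.0)
    y_quad = ssm_quadratic_reference(x, dt, -0.5, B, C, D=1.0)
    print("max abs difference:", np.abs(y_seq - y_quad).max())
```

If the two references agree with each other but not with the model's mixer output for the same tensors, the mismatch is more likely in the chunked scan or in how the converted weights are split than in the recurrence itself.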