attempting to convert tiiuae/falcon-180B-chat #1472

Closed · silvacarl2 opened this issue Sep 11, 2023 · 12 comments
Labels: enhancement (New feature or request)

Comments

silvacarl2 commented Sep 11, 2023

we are attempting to convert tiiuae/falcon-180B-chat to ct2 format.

this is the command:

ct2-transformers-converter --model tiiuae/falcon-180B-chat --output_dir tiiuae-falcon-180b-instruct-int8-float16 --force --copy_files tokenizer.json README.md tokenizer_config.json generation_config.json special_tokens_map.json --quantization int8_float16 --trust_remote_code

but we get this crash:

[2023-09-11 16:04:11,429] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 100%|████████████████| 81/81 [02:28<00:00, 1.84s/it]
Traceback (most recent call last):
  File "/home/silvacarl/.local/bin/ct2-transformers-converter", line 8, in <module>
    sys.exit(main())
  File "/home/silvacarl/.local/lib/python3.8/site-packages/ctranslate2/converters/transformers.py", line 1719, in main
    converter.convert_from_args(args)
  File "/home/silvacarl/.local/lib/python3.8/site-packages/ctranslate2/converters/converter.py", line 50, in convert_from_args
    return self.convert(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/ctranslate2/converters/converter.py", line 89, in convert
    model_spec = self._load()
  File "/home/silvacarl/.local/lib/python3.8/site-packages/ctranslate2/converters/transformers.py", line 140, in _load
    spec = loader(model, tokenizer)
  File "/home/silvacarl/.local/lib/python3.8/site-packages/ctranslate2/converters/transformers.py", line 192, in __call__
    spec = self.get_model_spec(model)
  File "/home/silvacarl/.local/lib/python3.8/site-packages/ctranslate2/converters/transformers.py", line 1331, in get_model_spec
    self.set_decoder(spec.decoder, model.transformer)
  File "/home/silvacarl/.local/lib/python3.8/site-packages/ctranslate2/converters/transformers.py", line 1359, in set_decoder
    self.set_layer_norm(layer_spec.input_layer_norm, layer.ln_attn)
AttributeError: 'TransformerDecoderLayerSpec' object has no attribute 'input_layer_norm'

any ideas?

we are running it on an A40 with 576 GiB of RAM.
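
For reference, the 40B/180B Falcon checkpoints declare a different decoder layout (parallel ln_attn/ln_mlp layer norms) than the 7B one, which is the code path the converter is walking when it fails above on layer.ln_attn. A minimal sketch to check which layout a checkpoint uses, assuming the Hugging Face Falcon config exposes a new_decoder_architecture flag:

# Sketch: inspect the checkpoint config to see which Falcon decoder layout it declares.
# Assumption: the config exposes `new_decoder_architecture` (True for the 40B/180B-style
# layers that use ln_attn/ln_mlp); older trust_remote_code configs may name it differently.
# Note: access to the 180B repo is gated; the same check works for tiiuae/falcon-40b.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("tiiuae/falcon-180B-chat", trust_remote_code=True)
print(getattr(config, "new_decoder_architecture", None))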

guillaumekln (Collaborator) commented Sep 12, 2023

This specific error may require small changes in the converter, but there are currently more general issues related to very large models.

During conversion it will likely hit this other error (#1324), which is a limitation in the current model serialization. Then at runtime the model would require at least 180 GB in int8 and would need to be split across multiple GPUs, which is currently not supported in CTranslate2 (#1052).
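
As a rough back-of-envelope check on that figure (weights only; the KV cache, activations, and runtime overhead come on top):

# Rough weight-only memory estimate for a ~180B-parameter model.
# Real usage is higher once the KV cache, activations, and framework overhead are added.
params = 180e9
for dtype, bytes_per_param in [("float16", 2), ("int8", 1)]:
    print(f"{dtype}: ~{params * bytes_per_param / 1e9:.0f} GB")
# float16: ~360 GB, int8: ~180 GB -- far more than a single 48 GB A40, hence the need
# for multi-GPU model splitting (#1052).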

guillaumekln added the enhancement (New feature or request) label on Sep 12, 2023
silvacarl2 (Author):

we are happy to supply you with a server you can test with if that would help.

silvacarl2 (Author):

Non-issue for us now.

aflah02 commented Jan 16, 2024

@silvacarl2 How did you get this to work? I still get the same error as the one you posted, but for falcon-40b.

silvacarl2 (Author):

I didn't; we decided to try something else. There are ten zillion choices now.

aflah02 commented Jan 16, 2024

Lol yeah, that's true.
Could you share what worked best for you for inference on falcon-40b and other large models like it? I've had good success with ctranslate2 for smaller models so far, while trying to get logprobs for inputs.
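
For the logprob part, a minimal sketch of how per-token log-probabilities can be pulled from a converted model with CTranslate2's scoring API (the model directory name here is hypothetical and assumes a decoder-only model converted with the tokenizer files copied alongside it):

# Sketch: score a prompt with a converted CTranslate2 generator to get per-token log-probs.
# `model_dir` is a hypothetical output directory from ct2-transformers-converter.
import ctranslate2
from transformers import AutoTokenizer

model_dir = "tiiuae-falcon-40b-instruct-int8_float16"
tokenizer = AutoTokenizer.from_pretrained(model_dir)  # tokenizer files copied via --copy_files
generator = ctranslate2.Generator(model_dir, device="cuda", compute_type="int8_float16")

tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("The quick brown fox"))
result = generator.score_batch([tokens])[0]
for token, log_prob in zip(result.tokens, result.log_probs):
    print(token, log_prob)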

silvacarl2 (Author):

Sorry, I thought you were trying to do falcon-180b.

This may work for 40b; I don't remember:

ct2-transformers-converter --model tiiuae/falcon-40b-instruct --output_dir tiiuae-falcon-40b-instruct-int8_float16 --force --copy_files tokenizer.json README.md tokenizer_config.json generation_config.json special_tokens_map.json --quantization int8_float16 --trust_remote_code

It needs 88 GB of RAM.
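
Once the conversion succeeds, loading and running the converted model typically looks something like this (a minimal sketch assuming the output directory above, the copied tokenizer files, and a CUDA device with enough memory):

# Sketch: load the converted falcon-40b model and generate a short completion.
import ctranslate2
from transformers import AutoTokenizer

model_dir = "tiiuae-falcon-40b-instruct-int8_float16"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
generator = ctranslate2.Generator(model_dir, device="cuda", compute_type="int8_float16")

prompt_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Write one sentence about llamas."))
results = generator.generate_batch(
    [prompt_tokens],
    max_length=64,
    sampling_topk=10,
    include_prompt_in_result=False,
)
print(tokenizer.decode(results[0].sequences_ids[0]))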

aflah02 commented Jan 16, 2024

Thanks @silvacarl2!
Is there a way to do this without the int8 quantization as well? I tried the same command without the quantization flag and got the same error you had originally posted for the 180B model. I'm also curious about the 180B model: which route did you go with to run it?

silvacarl2 (Author):

Is there a way to do this without the int8 quantization as well?

Yes, just use int8 instead of int8_float16.

I'm also curious about the 180B model: which route did you go with to run it?

We gave up on it; there are many better choices now.
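
For reference, dropping the --quantization flag entirely keeps the original weight type during conversion; a sketch of that variant (which, as noted below, still hit the same spec error at the time):

ct2-transformers-converter --model tiiuae/falcon-40b-instruct --output_dir tiiuae-falcon-40b-instruct --force --copy_files tokenizer.json README.md tokenizer_config.json generation_config.json special_tokens_map.json --trust_remote_code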

aflah02 commented Jan 16, 2024

@silvacarl2 Thanks for the response!
I think you might've misunderstood my first question: I want to do this without any quantization, so I did not use that flag, but it did not work and I got the error you mentioned in this issue.

I agree there certainly are much stronger models now!

silvacarl2 (Author):

Got it. Well, that's as far as we went with it. 8-(

aflah02 commented Jan 16, 2024

Ah got it! Thanks for the help 😊
