Adding PhiConfig #1568

Merged: 3 commits from michaelfeil:phi-support into OpenNMT:master on Nov 29, 2023
Conversation

@michaelfeil (Contributor) commented on Nov 28, 2023:

Work in progress to add one more model (microsoft/phi).

I am really not sure if my conversions are right, maybe give it a check. Heavily based on MixFormerSequentialLoader.

Closes #1559

@michaelfeil (Contributor, Author) commented on Nov 28, 2023:

Using the same test script as elsewhere. To reproduce, run the converter with:

--model microsoft/phi-1_5 --trust_remote_code --output_dir /home/michi/ct2_convert_phi
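
For reference, the full invocation was presumably through the standard ct2-transformers-converter entry point that CTranslate2 installs (an assumption; only the flags are given above):

ct2-transformers-converter --model microsoft/phi-1_5 --trust_remote_code --output_dir /home/michi/ct2_convert_phi

Then compare the converted model against the Hugging Face reference with the script below: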
import ctranslate2
from transformers import AutoTokenizer, AutoModelForCausalLM
from timeit import default_timer as timer

# Load the converted CTranslate2 model.
model_ct2 = ctranslate2.Generator(
    "/home/michi/ct2_convert_phi",
    device="cpu",
    compute_type="float32",
)

# Load the reference Hugging Face model and tokenizer.
# (Tokenizers run on CPU and take no device argument.)
name_hf = "microsoft/phi-1_5"
model_hf = AutoModelForCausalLM.from_pretrained(name_hf, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(name_hf, trust_remote_code=True)

text = "# this code print('hello', name) in python3 \n\ndef hello_name(name: "

def inference_ct2(text: str):
    # CTranslate2 generators take string tokens, not token ids.
    tokens_in = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
    s = timer()
    tokens_out = model_ct2.generate_batch([tokens_in], max_length=32, min_length=32)
    total_ct2 = timer() - s
    text_out = tokenizer.decode(tokens_out[0].sequences_ids[0])
    return total_ct2, text_out

def inference_hf(text: str):
    inputs = tokenizer.encode(text, return_tensors="pt")
    s = timer()
    outputs = model_hf.generate(inputs, max_length=32, min_length=32)
    total_hf = timer() - s
    text_out2 = tokenizer.batch_decode(outputs)[0]
    return total_hf, text_out2

# Warm up both backends so the timed runs are comparable.
_, _ = inference_ct2("warm up")
_, _ = inference_hf("warm up")

total_ct2, text_out_ct2 = inference_ct2(text)
total_hf, text_out_hf = inference_hf(text)

print("hf\n", text_out_hf, "\nct2\n", text_out_ct2)
print(f"time for ct2={total_ct2} time for hf={total_hf}")
# Outputs must match; ct2 generates one extra token compared to HF.
assert text_out_ct2[:len(text_out_hf)] == text_out_hf
print("done")

I get:

>>> inference_ct2("warm up")
(6.334434970000075, 'warm up before exercising.\n\n3. Why did John feel energized and ready to tackle the day after his workout?\nAnswer: John felt energized and')
>>> inference_hf("warm up")
(7.759652636999817, 'warm up before exercising.\n\n3. Why did John feel energized and ready to tackle the day after his workout?\nAnswer: John felt energized')

So the outputs match (CTranslate2 emits one extra token, as asserted above), and CTranslate2 is roughly 1.2x faster than Transformers on this CPU run.

@vince62s (Member):

Can you fix flake8?

@michaelfeil (Contributor, Author):

Done!

@vince62s (Member):

Looking at the transformers library, I don't see the use of MixFormerSequentialLoader anywhere, so I am just wondering if we could get rid of it, or if there is really a foundation to keep this class.

@michaelfeil (Contributor, Author):

Valid point, I forgot that. It was my starting point.

@vince62s (Member) commented on Nov 29, 2023:

Sorry to be annoying, but shouldn't we remove the other one completely? Unless we want to support older versions of the transformers library for now.

@michaelfeil (Contributor, Author):

No idea which models from Hugging Face use "MixFormerSequentialLoader" - I assume there are some models for that.

This PR is for supporting "PhiConfig". I initially thought the structure was very similar to MixFormerSequentialLoader; that was not the case. I think it makes sense to keep it and open an extra class, right?
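
For context, loaders in CTranslate2's Transformers converter are registered per Hugging Face config class, so supporting PhiConfig means adding a new loader class next to the existing MixFormerSequentialLoader rather than editing it. A minimal sketch of that registration pattern follows; the class body is illustrative, not the exact code merged here:

# Sketch of the loader pattern in python/ctranslate2/converters/transformers.py.
# The register_loader decorator and ModelLoader base class exist in that module;
# the body below is illustrative, not the exact PhiLoader merged in this PR.
@register_loader("PhiConfig")
class PhiLoader(ModelLoader):
    @property
    def architecture_name(self):
        # Hugging Face architecture handled by this loader.
        return "PhiForCausalLM"

    def get_model_spec(self, model):
        # Build a CTranslate2 decoder spec from the HF config and copy
        # the weights layer by layer (details omitted).
        ...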

@vince62s (Member):

Let's keep both for now, but I really think this is the same model and they only renamed it. I see your changes vs the older one, but I think the specs are equivalent. Obviously you tested it, so I'll merge it.

@vince62s merged commit c6f7f3b into OpenNMT:master on Nov 29, 2023 - 17 checks passed.
@michaelfeil deleted the phi-support branch on Nov 29, 2023, 18:51.
Linked issue: Phi-1.5 conversion not working (#1559)