please implement the `from_pretrained` tokenizer method #12

cregouby · 2022-01-29T16:12:12Z

Hello,

current behavior

The Hugging Face Quicktour ends with the following paragraph:

but currently, {hftokenizer} does not allow loading pretrained tokenizers.

expected behavior

I'd like to be able to reuse pretrained tokenizers, whether they ship with the language models available in the wild (BERT, RoBERTa and friends) and/or sit in my local cache folder, so that I can feed those models with the ids returned by the {hftokenizer} call `tokenizer$encode()$ids`.

I'd also like the Quicktour vignette to cover the API for doing so.
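
For reference, here is a minimal sketch of the equivalent behavior on the Python side, assuming the Hugging Face `tokenizers` package is installed and the Hub (or a populated local cache) is reachable. `Tokenizer.from_pretrained` and `encode(...).ids` are calls from the Python library; the helper name `encode_with_pretrained` and the `bert-base-uncased` identifier are only illustrative, and none of this is existing {hftokenizer} API.

```python
def encode_with_pretrained(text, identifier="bert-base-uncased"):
    """Load a pretrained tokenizer by identifier and return token ids for `text`.

    Hypothetical helper for illustration only. The import sits inside the
    function so that defining the helper does not require the package.
    """
    from tokenizers import Tokenizer  # Hugging Face `tokenizers` (Python)

    # Fetches tokenizer.json for `identifier` from the Hub or the local cache.
    tokenizer = Tokenizer.from_pretrained(identifier)

    # encode() returns an Encoding object; .ids holds the integer token ids
    # that a model such as BERT expects as input.
    return tokenizer.encode(text).ids
```

Calling `encode_with_pretrained("Hello, world!")` would return a list of integers; the R counterpart requested here would presumably expose the same ids through `tokenizer$encode()$ids`.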

Thanks a lot!
