Support for new languages #111

pyai88 · 2025-01-11T14:46:51Z

Hello, thank you for the great work. I have a couple of questions:

1. Should this method work on a new language out-of-the-box? I'm seeing properties from the training language in my output, so I'm wondering if I've made a mistake.
2. If it doesn't work out-of-the-box, would fine-tuning the pre-trained model be preferable to training from scratch?
3. To help me estimate the training cost, could you provide guidance on how much data is typically needed for a new language and how long you trained your pre-trained model for?

Thank you in advance.

Plachtaa · 2025-01-12T10:17:27Z

Hi there,
If the output in an unseen language sounds accented, you can try fine-tuning the current checkpoint on the language you want.
I can't give an estimate of how many hours of data are required; my only suggestion is to use as much as you have.
