Hi, we have 12M names and we would like to fine-tune Whisper on them. I am also happy to share the results with you.
The question is: is it better to fine-tune Whisper on recordings of the entire spoken name, or on individual names, with a recorded snippet of each name spoken separately?
Hey @silvacarl2! Sorry for the late reply here! The best option would be to fine-tune on the closest scenario to what you expect at inference time. If you expect the model to transcribe the entire spoken name, then you should go with that.
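As a rough illustration of the "entire spoken name" option, here is a minimal sketch of how the training pairs could be laid out, assuming each recording contains one full name and you know its first and last name. The `build_manifest` helper, the `audio`/`text` field names, and the file paths are all hypothetical placeholders, not part of Whisper or any library; adapt them to whatever your fine-tuning script expects.

```python
import json


def build_manifest(recordings):
    """Turn (audio_path, first_name, last_name) tuples into JSONL lines.

    Each line pairs one recording with the full name as the target transcript,
    so training examples look like what the model will see at inference time.
    """
    lines = []
    for audio_path, first_name, last_name in recordings:
        # Target text is the whole name as spoken, not one name part per clip.
        text = f"{first_name} {last_name}".strip()
        lines.append(json.dumps({"audio": audio_path, "text": text}))
    return lines


# Hypothetical example data; paths and names are placeholders.
recordings = [
    ("clips/0001.wav", "Maria", "Silva"),
    ("clips/0002.wav", "John", "Smith"),
]
manifest = build_manifest(recordings)
```

If inference will instead see isolated first names or last names, the same layout applies with one name part per clip and per transcript.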