Fix and enable XTTS streaming #478
base: alltalkbeta
SilyNoMeta commented Jan 4, 2025 • edited
- adds the ability to enable streaming in the XTTS settings (disabled for other engines, depending on their capabilities)
- uses the state of the streaming flag when handling requests to the OpenAI-compatible Speech API
- fixes the streaming mode
Add initial support for pickletensor models to F5-TTS
* Tested with the @RASPAUDIO French model available here: https://huggingface.co/RASPIAUDIO/F5-French-MixedSpeakers-reduced
Add language auto-detection
* adds langdetect as a requirement for colab, standalone and textgen
* adds "auto" to the language dropdown in the Advanced Engine/Model Settings panel
* replaces the hardcoded "en" with "auto" when called by the OpenAI-compatible Speech API
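A minimal sketch of how an "auto" language setting could be resolved, assuming the `detect` function from the `langdetect` package does the detection. The names `resolve_language`, `detector` and `fallback` are hypothetical and do not come from this PR; the actual change may be structured differently.

```python
from typing import Callable


def _langdetect_detector(text: str) -> str:
    # Imported lazily so this sketch still loads when langdetect is not installed.
    from langdetect import detect
    return detect(text)


def resolve_language(
    requested: str,
    text: str,
    detector: Callable[[str], str] = _langdetect_detector,
    fallback: str = "en",
) -> str:
    """Return the language code to use for a request (hypothetical helper)."""
    if requested != "auto":
        # An explicit language always wins over detection.
        return requested
    try:
        return detector(text)
    except Exception:
        # Detection can fail on empty or very short text, or when the
        # detection package is missing; fall back to a fixed default then.
        return fallback


if __name__ == "__main__":
    # Explicit language is passed through unchanged.
    print(resolve_language("fr", "Bonjour tout le monde"))
    # "auto" delegates to the detector (a stub here, to avoid the dependency).
    print(resolve_language("auto", "Bonjour tout le monde", detector=lambda _: "fr"))
```

Passing the detector in as a parameter keeps the fallback path easy to exercise without the third-party dependency.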
Hi @SilyNoMeta. As you may have noticed, there is a GitHub merge/sequencing issue with the next 4 PRs that you sent, all of which seem to touch tts_server.py. It should be easy enough to sort out, but I am looking deeper at the code changes before I pull things in. I push them all to a staging area first and then up to alltalkbeta. That aside, I have two questions for you on this update:
Line 951 in c83faf9
Lines 1129 to 1130 in c83faf9
but it's already pulled in as model_engine: Line 193 in c83faf9
Is this because you are attempting to reload the variables from the actual underlying engine on each run, in case the mapped voice changed? If so, I'm probably going to move this back to an update of model_engine, just to keep all variables the same throughout the script. I'm just checking that this is what you are doing, or whether there was some other reason or issue you encountered. Sorry to have to ask, but I like to know why the code is doing certain things. I also have a huge update that is 80% done and that I will have to merge in after all these new PRs. It includes quite a few changes to TTS generation and a new RVC pipeline, so I need to be certain about the core functionality of the generate functions. Thanks
Hey! Most of the changes in tts_server.py and model_engine.py are, as you said, not necessary, so you shouldn't bother merging them if they will conflict with your working branches. What actually happened was that I got errors while trying to enable streaming support through my new settings. The "true" fix ended up being the addition of a StreamingResponse when the new flag is set on the OpenAI Speech API compatible web service:
Lines 1150 to 1156 in c83faf9
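The branch described above can be sketched with the standard library alone. This is an illustration under assumptions, not the code at the referenced lines: `fake_tts_chunks` and `build_speech_body` are hypothetical names, and the bytes are placeholders rather than real audio. In a FastAPI/Starlette server, the iterator returned in the streaming case is what would be handed to `StreamingResponse`.

```python
from typing import Iterator, Union


def fake_tts_chunks(text: str, chunk_size: int = 8) -> Iterator[bytes]:
    """Hypothetical stand-in for an engine that yields audio as it is generated."""
    data = text.encode("utf-8")  # placeholder bytes, not real audio
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def build_speech_body(text: str, streaming_enabled: bool) -> Union[bytes, Iterator[bytes]]:
    """Return an iterator of chunks when streaming is on, else the full payload."""
    chunks = fake_tts_chunks(text)
    if streaming_enabled:
        # Streaming: hand the iterator back so the caller can send each chunk
        # as soon as it is produced (e.g. by wrapping it in a StreamingResponse).
        return chunks
    # Non-streaming: drain the generator and return one complete body.
    return b"".join(chunks)


if __name__ == "__main__":
    for chunk in build_speech_body("hello streaming world", streaming_enabled=True):
        print(chunk)
    print(build_speech_body("hello streaming world", streaming_enabled=False))
```

The point of the sketch is that the same generator serves both modes; only the flag decides whether it is consumed up front or passed through to the response.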
As for the model engine redefinition, what happened was that when I was playing with the GUI, the newly saved settings were not used when I tested the API:
Lines 1129 to 1130 in c83faf9
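The stale-settings symptom can be reproduced in isolation. The sketch below assumes the settings live in a JSON file that the GUI rewrites on save; the file name, the `streaming` key and `load_engine_settings` are hypothetical and only illustrate why a value read once at startup differs from one re-read at request time.

```python
import json
import tempfile
from pathlib import Path


def load_engine_settings(path: Path) -> dict:
    """Read the engine settings file from disk (hypothetical JSON layout)."""
    return json.loads(path.read_text(encoding="utf-8"))


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        settings_path = Path(tmp) / "model_settings.json"
        settings_path.write_text(json.dumps({"streaming": False}), encoding="utf-8")

        # Loaded once, as a server would do at startup.
        cached = load_engine_settings(settings_path)

        # Simulate the GUI saving a new value while the server keeps running.
        settings_path.write_text(json.dumps({"streaming": True}), encoding="utf-8")

        # Re-read at request time: this copy sees the new value.
        fresh = load_engine_settings(settings_path)
        return cached["streaming"], fresh["streaming"]


if __name__ == "__main__":
    print(demo())
```

Either re-reading per request or updating the shared engine object when the GUI saves would address the symptom; which one fits better depends on how the rest of the script uses `model_engine`.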
Good luck with your work! I'm now hyped!! 🍿