How would you use a different TTS program for this? #62

Mike-MW · 2024-12-06T21:09:49Z

ElevenLabs is a bit pricey. I'd prefer to use something like Amazon Polly. Yeah, it's lower quality, but you get more speech for the price, and I don't exactly have a lot of excess cash to throw around.

tizu69 · 2024-12-14T15:12:25Z

have a look at https://github.com/DougDougGithub/Babagaboosh/blob/main/eleven_labs.py - ElevenLabsManager is quite modular, and as long as Polly has an API you should be able to integrate it easily. sadly I don't currently have the resources to get Polly working on my machine, so I can't provide any tested code snippets, but https://docs.aws.amazon.com/polly/latest/dg/SynthesizeSpeechSamplePython.html plus a library that plays the audio seems like it could help.
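For reference, here is a rough, untested-against-AWS sketch of what a Polly counterpart to ElevenLabsManager might look like. `PollyManager`, `make_polly_client`, `text_to_audio`, the default region, and the default voice "Joanna" are all assumptions made for illustration, not names taken from Babagaboosh; the only real API call is boto3's `synthesize_speech`, which needs boto3 installed and AWS credentials configured.

```python
from typing import Any


def make_polly_client(region_name: str = "us-east-1") -> Any:
    """Create a real Polly client. Requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the rest of the sketch loads without it

    return boto3.client("polly", region_name=region_name)


class PollyManager:
    """Hypothetical stand-in for ElevenLabsManager; all names are illustrative."""

    def __init__(self, client: Any, voice_id: str = "Joanna") -> None:
        self.client = client
        self.voice_id = voice_id

    def text_to_audio(self, text: str, path: str = "polly_output.mp3") -> str:
        """Synthesize `text` with Polly, save it as an MP3, and return the path."""
        response = self.client.synthesize_speech(
            Text=text,
            OutputFormat="mp3",
            VoiceId=self.voice_id,
        )
        # AudioStream is a file-like streaming body; read it fully and save it.
        with open(path, "wb") as audio_file:
            audio_file.write(response["AudioStream"].read())
        return path
```

Something like `PollyManager(make_polly_client()).text_to_audio("Hello")` would then give you an MP3 path to hand to whatever audio player the project already uses.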

Mike-MW · 2024-12-14T15:32:04Z

Thanks for the reply, but I ended up rebuilding the whole thing in Rust; Python is just such a painful language to deal with. I have a version where the voice-to-text (VTT) and TTS are both handled by the OpenAI API, so I only need one API key.
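For anyone staying on the Python side, the same "one API key for both directions" idea could look roughly like the sketch below. This is not how the Rust rewrite is implemented; the function names, the `whisper-1` and `tts-1` models, the `alloy` voice, and the output filename are assumptions picked for illustration. It needs the `openai` package and an `OPENAI_API_KEY` environment variable.

```python
from typing import Any


def make_openai_client() -> Any:
    """Create a real client. Requires the `openai` package and OPENAI_API_KEY."""
    from openai import OpenAI  # imported lazily; reads the key from the environment

    return OpenAI()


def transcribe(client: Any, audio_path: str) -> str:
    """Speech to text: send a recorded audio file and return the transcript."""
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    return result.text


def speak(client: Any, text: str, out_path: str = "reply.mp3") -> str:
    """Text to speech: synthesize `text` and save the audio to `out_path`."""
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=text,
    )
    response.write_to_file(out_path)
    return out_path
```

Both helpers take the same client, so a single key covers the microphone transcript and the spoken reply; the chat completion in between would use that client too.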

tizu69 · 2024-12-14T23:05:29Z

is it open source? I think that'd be interesting to check out :)

Mike-MW · 2024-12-14T23:17:08Z

https://github.com/slbsh/chatgpt_slop this is the link

The instructions are as follows:
Edit the config to include the following:
your OpenAI API key
global listen: whether you want to be able to activate it while another window is focused
device: the system name of your microphone
backend: set to linux by default, but switch it to dshow if you use Windows
keycode: the code of the key that activates the recording. If you want a key that's not on a standard keyboard and you have a keyboard/mouse with macro keys, you can use 124-135 for F13-F24
The prompt can span multiple lines, but only if you put """ before and """ after it
message limit: increase or decrease it to control how much of the conversation ChatGPT keeps in memory
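If you're not sure which number goes with which macro key, the 124-135 range for F13-F24 mentioned above is just a fixed offset. A tiny Python sketch of the arithmetic (the function name is made up for illustration; it isn't part of the project):

```python
F13_KEYCODE = 124  # from the instructions above: 124-135 covers F13-F24


def function_key_to_keycode(n: int) -> int:
    """Return the keycode for function key F<n>, for n from 13 to 24."""
    if not 13 <= n <= 24:
        raise ValueError("only F13 through F24 are covered by the 124-135 range")
    # Each key after F13 is one code higher, so F24 lands on 135.
    return F13_KEYCODE + (n - 13)
```

So F13 is 124, F16 is 127, and F24 is 135. These values happen to match the Windows virtual-key codes for F13-F24, though whether that's where the program takes them from on every backend is a guess.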

Once you have the config sorted, save and close it. Then open a terminal window by right-clicking in the folder and clicking Terminal, type "cargo run", and hit Enter. It will compile the files, and you can then test it to make sure it's working.

When finished, you can press CTRL+C to exit. It will show an error when you do so, but don't worry about it.

You can then run "cargo build --release" in the terminal to build a release version inside the "target" folder, with an EXE that takes you straight to the listening part. Before you use it, you must copy the config TOML file into the release folder.

Not the most user-friendly thing in the world, but it's a start.
