Batch queue please? #11

unifirer · 2023-12-20T12:50:10Z

unifirer
Dec 20, 2023

With the DeepSpeed optimisation, we can actually do multiple books in one night!

If you have some spare time, a batch queue would allow us to do so!

Another good reason to do so: somewhere in the process, the maximum length of a WAV file is under 17 hours. Beyond that, the output reports 17 hours but contains at most 4.5 hours of audio. Maybe it's a limit of ffmpeg, or of the WAV file or its header. I couldn't find the cause, though.

thank you very much

erew123 · 2023-12-20T18:37:15Z

erew123
Dec 20, 2023
Maintainer

Well, there is a 4GB limitation https://en.wikipedia.org/wiki/WAV#Limitations and it is an old standard, going back to 1991. I can't speak as to how things such as ffmpeg would handle such large files. But that's 6.8 hours at 44100Hz, so maybe closer to 10-12 hours tops at 22050Hz (with headers/padding overheads).
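
As a rough check on those figures, here is a minimal sketch of the arithmetic, assuming uncompressed audio and ignoring the small header overhead. The function names are made up for illustration. The second function shows what duration a wrapped 32-bit size field would imply once a file grows past the limit. The 32-bit float mono example at 24 kHz is only an assumption about the output format, not something confirmed in this thread.

```python
RIFF_LIMIT_BYTES = 2**32  # WAV/RIFF headers store sizes in 32-bit unsigned fields


def bytes_per_second(sample_rate, channels=2, bytes_per_sample=2):
    """Data rate of uncompressed audio (defaults: 16-bit stereo)."""
    return sample_rate * channels * bytes_per_sample


def max_wav_hours(sample_rate, channels=2, bytes_per_sample=2):
    """Longest duration, in hours, that fits under the 4 GiB size field."""
    rate = bytes_per_second(sample_rate, channels, bytes_per_sample)
    return RIFF_LIMIT_BYTES / rate / 3600


def wrapped_hours(actual_hours, sample_rate, channels=2, bytes_per_sample=2):
    """Duration implied by the size field if it overflows and wraps around."""
    rate = bytes_per_second(sample_rate, channels, bytes_per_sample)
    total_bytes = int(actual_hours * 3600 * rate)
    return (total_bytes % RIFF_LIMIT_BYTES) / rate / 3600


if __name__ == "__main__":
    print(f"44100 Hz, 16-bit stereo: {max_wav_hours(44100):.2f} h")
    print(f"22050 Hz, 16-bit stereo: {max_wav_hours(22050):.2f} h")
    # Hypothetical format: 24 kHz, mono, 32-bit float samples.
    print(f"24000 Hz, 32-bit mono:   {max_wav_hours(24000, 1, 4):.2f} h")
    print(f"17 h written, wrapped:   {wrapped_hours(17, 24000, 1, 4):.2f} h")
```

Some tools treat the size field as signed, which would halve these limits, so the numbers are upper bounds rather than guarantees.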

For an average book, you're looking at generating 70,000-100,000 words, or 3,000-6,000 sentences (or WAV files, I should say), and then trying to combine them into one file. It's quite a technical overhead on a system to do that and I'm not sure where the limitations are. It would be a decent amount of testing and coding to make something like that work.

Can I ask what your exact use case is?

unifirer · 2023-12-20T19:51:27Z

unifirer
Dec 20, 2023
Author

Oh, that makes sense. It outputs 24 kHz and does 9.5 hours without a problem, so the maximum is somewhere between 10 and 17 hours. The final combination process was so intensive that my game lagged out and my computer froze for 2 minutes.

Audiobook production is my use case. If you have the spare time and energy, a new feature that outputs multiple files from a queue, as a batch, would help a lot. Something like the batch processing feature that HandBrake has.

Please inform me of interesting jobs and opportunities.


unifirer · 2023-12-20T19:55:51Z

unifirer
Dec 20, 2023
Author

Perhaps instead of queuing multiple videos, queue multiple text files.


unifirer · 2023-12-20T20:03:17Z

unifirer
Dec 20, 2023
Author

With text files we can split them into chapters too!
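
A minimal sketch of what chapter splitting could look like, assuming chapters are marked by lines that start with the word "Chapter" (books formatted differently would need another pattern). `split_into_chapters` is a made-up helper name, not part of AllTalk.

```python
import re

# Assumption: a chapter starts at any line beginning with "Chapter <something>".
CHAPTER_HEADING = re.compile(r"^(chapter[ \t]+\w+.*)$", re.IGNORECASE | re.MULTILINE)


def split_into_chapters(text):
    """Return a list of (title, body) pairs, one per detected chapter."""
    matches = list(CHAPTER_HEADING.finditer(text))
    if not matches:
        # No headings found: treat the whole text as a single unit.
        return [("Untitled", text.strip())]

    chapters = []
    front_matter = text[: matches[0].start()].strip()
    if front_matter:
        # Keep anything before the first heading instead of dropping it.
        chapters.append(("Front matter", front_matter))

    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        chapters.append((match.group(1).strip(), body))
    return chapters
```

Each chapter could then be generated and saved as its own audio file, which would also keep every file well below the WAV size limit discussed above.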


erew123 · 2023-12-21T05:12:47Z

erew123
Dec 21, 2023
Maintainer

OK. Well, it's not a core feature that has been on my roadmap, and it would take some thinking around how best to split out that much text, send bits off for generation, potentially combine every X generated WAVs, then keep going and do a smaller combine at the end. Compiling that much audio into one file, and how to handle that process, is a potentially complicated task. And obviously, the only way I could test the robustness of such a thing would be to write the code, set a machine off and running for X hours, and if it crashes or errors, change the code and send it off again.
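
The staged combine described above could be sketched roughly like this, using only Python's standard `wave` module. This is an illustration under assumptions, not AllTalk's code: the function names and the `batch_size` default are invented, all inputs are assumed to share one PCM format, and the `wave` module does not read 32-bit float WAVs.

```python
import wave
from pathlib import Path


def concat_wavs(inputs, output, chunk_frames=65536):
    """Stream several same-format PCM WAV files into one output file."""
    inputs = list(inputs)
    if not inputs:
        raise ValueError("no input files to combine")
    with wave.open(str(inputs[0]), "rb") as first:
        params = first.getparams()
    with wave.open(str(output), "wb") as out:
        out.setparams(params)  # frame count in the header is fixed up on close
        for path in inputs:
            with wave.open(str(path), "rb") as src:
                # Compare channels, sample width and sample rate.
                if src.getparams()[:3] != params[:3]:
                    raise ValueError(f"format mismatch in {path}")
                while True:
                    frames = src.readframes(chunk_frames)
                    if not frames:
                        break
                    out.writeframes(frames)
    return Path(output)


def staged_combine(wav_paths, work_dir, final_path, batch_size=200):
    """Combine every `batch_size` files into a part, then join the parts."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    wav_paths = list(wav_paths)
    parts = []
    for start in range(0, len(wav_paths), batch_size):
        part = work_dir / f"part_{start // batch_size:04d}.wav"
        parts.append(concat_wavs(wav_paths[start:start + batch_size], part))
    return concat_wavs(parts, final_path)
```

Reading and writing in fixed-size chunks keeps memory use flat regardless of book length. The final file is still subject to the WAV size limit, so a very long book may be better left as separate parts.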

I also don't want to get into feature creep while working on what is a new project, as I do have certain goals. Ultimately, it would probably be best handled by a separate script that calls on the AllTalk API. That script could deal with any filtering/cleaning and breaking down of the text ahead of time, and you could probably feed in a text file or something.
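
A separate batch script along those lines might look like the sketch below. Everything here is hypothetical: `synthesize` stands in for whatever call sends text to the AllTalk API and writes a WAV file (the thread does not specify the endpoint or its parameters), and the sentence splitter is deliberately naive.

```python
import re
from pathlib import Path


def split_sentences(text, max_chars=250):
    """Split on sentence punctuation, then pack sentences into chunks."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def run_batch(input_dir, output_dir, synthesize, max_chars=250):
    """Process every .txt file in name order; return {book name: [wav paths]}.

    `synthesize(text, wav_path)` is a caller-supplied function (hypothetical
    here) that generates speech for `text` and saves it to `wav_path`.
    """
    output_dir = Path(output_dir)
    results = {}
    for txt in sorted(Path(input_dir).glob("*.txt")):
        book_dir = output_dir / txt.stem
        book_dir.mkdir(parents=True, exist_ok=True)
        chunks = split_sentences(txt.read_text(encoding="utf-8"), max_chars)
        wavs = []
        for number, chunk in enumerate(chunks):
            wav = book_dir / f"{number:05d}.wav"
            if not wav.exists():  # lets a crashed overnight run resume
                synthesize(chunk, wav)
            wavs.append(wav)
        results[txt.stem] = wavs
    return results
```

Dropping several text files into one folder and calling `run_batch` once would give the queue behaviour requested above, with one output folder per book.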

So, I'm not saying no, but by the same token, I've got too much going on currently to jump off to batch queue text generation, and I need to get those core things done first.

I'll move this over from issues to discussions so I've got a reference there. I'll mull the problem over in my head and see where I get to with the other bits of my roadmap.

Also, that means if others discover this conversation, they are welcome to chip in, if it's an idea they are interested in.

unifirer · 2023-12-23T04:41:08Z

unifirer
Dec 23, 2023
Author

Thank you for your hard work

erew123 · 2024-01-06T16:58:41Z

erew123
Jan 6, 2024
Maintainer

Almost done.... Not available yet!

If you're going to be making money from this... I hope you'll make me a donation at some point ;)

unifirer · 2024-01-06T19:50:41Z

unifirer
Jan 6, 2024
Author

Thank you very much. I haven't made a cent yet because in every hour of speech generated, I had around 5 seconds of continuous screech or rumbles. I've managed to get it down to 1 second after lowering the parameters. Thank you for that function too. Once I start making a profit, you'll get a percentage share for sure. I'm still not sure about the demand yet, but the market seems to be way overpriced right now.

unifirer · 2024-01-06T21:40:23Z

unifirer
Jan 6, 2024
Author

> Almost done.... Not available yet!
>
> If you're going to be making money from this... I hope you'll make me a donation at some point ;)

I tried to make a donation when I first started using your app and tried again now, but I can't see a link. Do you have it somewhere else? It's not on your profile.

erew123 · 2024-01-06T22:11:34Z

erew123
Jan 6, 2024
Maintainer

Thanks! Though, you're right, I've got no links at the moment for anything like that... I've yet to figure it out! Guess I've been too busy coding!

Here you go though! https://github.com/erew123/alltalk_tts#-alltalk-tts-generator

I'm going to mark this closed now hah!
