96% of the way through a 24 hour generation and I am hit with a Too much text error. How do I break up the line without starting over? #68

Open
couchpotatochip21 opened this issue Oct 30, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@couchpotatochip21

I was generating an audiobook from a 700-page textbook when, at 96% completion, I hit the following error:
RuntimeError: Possible latent mismatch: try recomputing voice latents. Error: Too much text provided. Break the text up into separate segments and re-try inference.

The offending line is fairly long, but I can't find any way to skip it or break it up. Is there any way for me to salvage this run?

@couchpotatochip21
Author

I found a fix. Go to the Audiobook Maker folder and open audiobooks > (youraudiobooknamefolder) > text_audio_map.json. Then find the line that is too long (look for the highest-numbered .wav file; it will lead you to where the failed line should be) and shorten it. I have not tested how this affects the audio, but the script has continued.
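For anyone who would rather not scroll through the file by hand, here is a minimal Python sketch that lists the longest text entries in a JSON file. The layout of text_audio_map.json is not described in this thread, so the code makes no assumption about it and simply walks every string in the file. The `max_chars` threshold is an arbitrary illustrative value, not a documented Tortoise limit, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def iter_strings(node, path=""):
    """Yield (location, string) for every string value in a parsed JSON tree."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{path}/{index}")


def find_long_strings(data, max_chars=400):
    """Return (location, length) pairs for strings longer than max_chars,
    longest first. max_chars is an assumed cutoff; tune it to taste."""
    hits = [(loc, len(text)) for loc, text in iter_strings(data) if len(text) > max_chars]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def scan_file(json_path, max_chars=400):
    """Load a JSON file (e.g. text_audio_map.json) and report long strings."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return find_long_strings(data, max_chars)
```

Calling `scan_file("text_audio_map.json")` from inside the audiobook's folder returns the locations of the longest entries, which are the likely candidates to shorten. Back up the file before editing it.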

@JarodMica
Owner

Ooh yeah, this is an issue with Tortoise, and I may need to handle it better between my segmenter and Tortoise. For now, your fix will work.

Also, 700 pages, that's wild! Glad to hear it's being put to good use.

@JarodMica JarodMica added the bug Something isn't working label Oct 30, 2024
@couchpotatochip21
Author

Thank you so much for the reply!

Unfortunately the export failed, something about segment length? I will try it again and reply in a bit with the error. Do I need to modify the segment length to fit the new sentence? I cut the overlong segment in half, so it may have problems with the audio clip length.

@couchpotatochip21
Author

C:\Users\_\Documents\audobookmaker\audiobook_maker>call venv\Scripts\activate
[2024-10-30 05:13:45,184] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
W1030 05:13:54.908000 23832 torch\distributed\elastic\multiprocessing\redirects.py:27] NOTE: Redirects are currently not supported in Windows or MacOs.
Traceback (most recent call last):
  File "C:\Users\_\Documents\audobookmaker\audiobook_maker\src\controller.py", line 793, in export_audiobook
    output_filename = self.model.export_audiobook(directory_path, pause_duration)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\_\Documents\audobookmaker\audiobook_maker\src\model.py", line 368, in export_audiobook
    combined_audio.export(output_filename, format="mp3")
  File "C:\Users\_\Documents\audobookmaker\audiobook_maker\venv\Lib\site-packages\pydub\audio_segment.py", line 895, in export
    wave_data.writeframesraw(pcm_for_wav)
  File "C:\Users\_\AppData\Local\Programs\Python\Python311\Lib\wave.py", line 547, in writeframesraw
    self._ensure_header_written(len(data))
  File "C:\Users\_\AppData\Local\Programs\Python\Python311\Lib\wave.py", line 588, in _ensure_header_written
    self._write_header(datasize)
  File "C:\Users\_\AppData\Local\Programs\Python\Python311\Lib\wave.py", line 600, in _write_header
    self._file.write(struct.pack('<L4s4sLHHLLHH4s',
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
struct.error: argument out of range

I replaced my username with _ in this snippet.
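One plausible reading of the traceback above, offered as a guess rather than a confirmed diagnosis: the failure is in `wave.py` while packing the WAV header, whose size fields are 32-bit unsigned integers (`L` in the `'<L4s4sLHHLLHH4s'` format string). If the combined audio for the whole book is larger than roughly 4 GiB of raw PCM, the size no longer fits and `struct.pack` raises `struct.error`. That would be unrelated to the edited segment's length. The sketch below shows the limit and estimates whether a given duration would fit; the sample rate, channel count, and sample width in the usage note are illustrative assumptions, not values taken from Audiobook Maker.

```python
import struct

# Largest value a 32-bit unsigned WAV header field can hold.
WAV_MAX_BYTES = 2**32 - 1
# wave.py writes 36 + data length into the first header field.
RIFF_HEADER_OVERHEAD = 36


def pcm_bytes(duration_s, frame_rate, channels, sample_width):
    """Raw PCM size in bytes for audio of the given length and format."""
    return int(duration_s * frame_rate) * channels * sample_width


def fits_in_wav(duration_s, frame_rate, channels, sample_width):
    """True if audio of this length can be described by a standard WAV header."""
    size = pcm_bytes(duration_s, frame_rate, channels, sample_width)
    return RIFF_HEADER_OVERHEAD + size <= WAV_MAX_BYTES


def header_field_overflows(value):
    """True if value cannot be packed as a little-endian 32-bit unsigned int."""
    try:
        struct.pack("<L", value)
    except struct.error:
        return True
    return False
```

For example, `fits_in_wav(30 * 3600, 24000, 1, 2)` checks a 30-hour book at an assumed 24 kHz, mono, 16-bit format. If this is the cause, exporting the book as several shorter files would keep each one under the limit; whether Audiobook Maker offers that option is not confirmed anywhere in this thread.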

@Joe-Orlina

Joe-Orlina commented Jan 17, 2025

Hi, I recently purchased the Audiobook Maker and I think it's really swell! I had this problem myself and found a nice workaround. I've played around with Ollama and AnythingLLM, but what worked for me was ChatGPT: upload the .txt file you would have placed in the Audiobook Maker and enter this prompt:

"Apply these rules to reformat the uploaded text:

Clause Separation:
Each clause is placed on a separate line, ensuring no sentence runs onto the next line.

Word Limit Rule (25 words):
If a sentence exceeds 25 words:
Commas, semicolons, or colons are replaced with a full stop (period).
The next sentence starts with a capital letter.
The remainder of the sentence is moved to the next line.

Preserving Formatting Consistency:
No phrases will be deleted from the original text.
Only punctuation changes will be made according to the word limit rule.
Character names, stage directions, and overall line structure are preserved."

What this does is reformat the entire .txt file so that each line holds a shorter sentence. A sentence long enough to cause the error gets broken into smaller segments, and when the Audiobook Maker plays the audio generated from those lines, it sounds smooth and you won't notice the difference.

It's a neat little hack if your GPU isn't strong enough to generate massive lines of TTS. I'm currently generating a text doc and I'm still checking to see if it turned out okay, but so far, listening to the lines there doesn't seem to be a problem. Thought I would share in case this helps anyone else!

Update: this is definitely not a perfect fix. ChatGPT seems to omit entire phrases, some clauses are still broken where they run from one line to the next, and the Audiobook Maker pauses between sentences. I'm still looking for a better way of doing this.
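Since the rules in the prompt are mechanical, they can also be applied with a small script, which avoids the omitted-phrase problem because nothing is rewritten by a model. The following is a minimal Python sketch under my own assumptions, not the Audiobook Maker's segmenter. The function names are hypothetical, sentence detection is a naive split on `.`, `!`, and `?` (so abbreviations like "Dr." will be split too), and a single clause longer than the limit is left as-is.

```python
import re

MAX_WORDS = 25  # word limit taken from the prompt above


def split_long_sentence(sentence, max_words=MAX_WORDS):
    """Split one sentence at commas, semicolons, or colons so each piece
    stays within max_words where possible. No words are removed."""
    sentence = sentence.strip()
    if len(sentence.split()) <= max_words:
        return [sentence]
    # Split after , ; : and keep the delimiter attached to its clause.
    clauses = [c for c in re.split(r"(?<=[,;:])\s+", sentence) if c]
    pieces, current = [], ""
    for clause in clauses:
        candidate = f"{current} {clause}".strip()
        if current and len(candidate.split()) > max_words:
            pieces.append(current)
            current = clause
        else:
            current = candidate
    if current:
        pieces.append(current)
    result = []
    for i, piece in enumerate(pieces):
        if i < len(pieces) - 1:
            # Replace the trailing , ; : with a full stop.
            piece = piece.rstrip(",;:") + "."
        if i > 0:
            # Start the new sentence with a capital letter.
            piece = piece[0].upper() + piece[1:]
        result.append(piece)
    return result


def reformat_text(text, max_words=MAX_WORDS):
    """Put each sentence on its own line, splitting overlong ones."""
    lines = []
    for paragraph in text.splitlines():
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if sentence:
                lines.extend(split_long_sentence(sentence, max_words))
    return "\n".join(lines)
```

Reading the source .txt, passing its contents through `reformat_text`, and writing the result to a new file gives an input where every word of the original is still present. The pauses the Audiobook Maker inserts between sentences would still occur at the new full stops.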

@couchpotatochip21
Author

Thank you very much for the solution! I no longer need the textbook I was originally trying to turn into an audiobook, but I am glad there is a solution!
