IndexError: too many indices for tensor of dimension 2 #119

Open
Chasebyui22 opened this issue Jan 7, 2025 · 1 comment

Comments

@Chasebyui22

I'm getting an error saying the .txt/.wav file can't be found. When I remove the .txt/.wav extension, I get a different error.

Error calling Python override of QThread::run(): Traceback (most recent call last):
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\controller.py", line 48, in run
    self.function(self.directory_path, self.is_continue, self.report_progress, self.sentence_generated_callback, self.should_stop)
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\model.py", line 288, in generate_audio_for_sentence_threaded
    audio_path = self.generate_audio_proxy(sentence, speaker_settings, s2s_validated)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\model.py", line 314, in generate_audio_proxy
    success = tts_engines.generate_audio(self.tts_engine, sentence, voice_parameters, tts_engine_name, audio_path)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\tts_engines.py", line 39, in generate_audio
    return generate_with_f5tts(tts_engine, sentence, voice_parameters, audio_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\tts_engines.py", line 137, in generate_with_f5tts
    with open(ref_text, "r", encoding="utf-8") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'voices/f5tts\\chase\\chase.txt'

Error when I remove the .txt/.wav extension:

Error calling Python override of QThread::run(): Traceback (most recent call last):
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\controller.py", line 48, in run
    self.function(self.directory_path, self.is_continue, self.report_progress, self.sentence_generated_callback, self.should_stop)
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\model.py", line 288, in generate_audio_for_sentence_threaded
    audio_path = self.generate_audio_proxy(sentence, speaker_settings, s2s_validated)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\model.py", line 314, in generate_audio_proxy
    success = tts_engines.generate_audio(self.tts_engine, sentence, voice_parameters, tts_engine_name, audio_path)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\tts_engines.py", line 39, in generate_audio
    return generate_with_f5tts(tts_engine, sentence, voice_parameters, audio_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\src\tts_engines.py", line 148, in generate_with_f5tts
    tts_engine.infer(
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\api.py", line 120, in infer
    wav, sr, spect = infer_process(
                     ^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\infer\utils_infer.py", line 438, in infer_process
    return infer_batch_process(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\infer\utils_infer.py", line 506, in infer_batch_process
    duration_in_sec = prediction_model(audio, text_list)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\model\modules.py", line 939, in forward
    x = self.transformer(inp, text=text)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\model\modules.py", line 849, in forward
    x = block(x, mask=mask, rope=rope)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\model\modules.py", line 777, in __call__
    attn_output = self.attn(x=norm, mask=mask, rope=rope)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\model\modules.py", line 403, in forward
    return self.processor(self, x, mask=mask, rope=rope)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\f5_tts\model\modules.py", line 432, in __call__
    query = apply_rotary_pos_emb(query, freqs, q_xpos_scale)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\torch\amp\autocast_mode.py", line 43, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Smith\Documents\Chase\Code\audiobook_maker\venv\Lib\site-packages\x_transformers\x_transformers.py", line 696, in apply_rotary_pos_emb
    freqs = freqs[:, -seq_len:, :]
            ~~~~~^^^^^^^^^^^^^^^^^
IndexError: too many indices for tensor of dimension 2

I'm pretty sure I have the correct Torch version as well.
Any ideas?

Chasebyui22 changed the title from "FileNotFoundError: [Errno 2] No such file or directory: 'voices/f5tts\\chase\\chase.txt'" to "IndexError: too many indices for tensor of dimension 2" on Jan 7, 2025
@JarodMica
Owner

JarodMica commented Jan 12, 2025

FileNotFoundError: [Errno 2] No such file or directory: 'voices/f5tts\chase\chase.txt'

This first one might be caused by file extensions being hidden on Windows, so check whether that's the case. If it is, the file on disk is actually named chase.txt.txt, which is why chase.txt can't be found.
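One way to check for this without changing Explorer settings is to list the voice folder from Python. This is only a rough sketch: the helper name `find_doubled_extensions` is made up for illustration and is not part of audiobook_maker, and the folder path in the usage note is taken from the traceback above.

```python
from pathlib import Path


def find_doubled_extensions(voice_dir):
    """Return files whose last two extensions are identical, e.g. chase.txt.txt."""
    doubled = []
    for path in sorted(Path(voice_dir).iterdir()):
        # Compare extensions case-insensitively, since Windows does not care about case.
        suffixes = [s.lower() for s in path.suffixes]
        if path.is_file() and len(suffixes) >= 2 and suffixes[-1] == suffixes[-2]:
            doubled.append(path)
    return doubled
```

Calling `find_doubled_extensions("voices/f5tts/chase")` from the project root should return any files such as `chase.txt.txt` or `chase.wav.wav`. If it does, renaming them so they have a single extension should get rid of the FileNotFoundError.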

IndexError: too many indices for tensor of dimension 2
I believe this is from the audio file being stereo and not mono. Could you try converting it to mono?
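If you don't have an audio editor handy, something like the following could do the conversion with only the Python standard library. Treat it as a sketch under a few assumptions: it only handles 16-bit PCM WAV files (anything else raises an error), the function name `convert_wav_to_mono` is made up for illustration, and stereo input being the cause is still just my guess.

```python
import array
import sys
import wave


def convert_wav_to_mono(src_path, dst_path):
    """Average the channels of a 16-bit PCM WAV file and write a mono copy.

    Returns the number of channels found in the source file.
    """
    with wave.open(str(src_path), "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM WAV files")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L, R, L, R, ...), so average each frame's channels.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + n_channels]) // n_channels
            for i in range(0, len(samples), n_channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(str(dst_path), "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())

    return n_channels
```

For example, `convert_wav_to_mono("chase.wav", "chase_mono.wav")` writes a mono copy next to the original and returns 2 if the source was stereo. You would then need to point the voice at the mono file (or replace the original with it).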

I plan to add a "Voice Loader" to the GUI so that everything is handled correctly without you guys having to fiddle around with files too much. Once that feature is in, these problems should be resolved.
