Hope you are doing well. I have trained a Tacotron 2 model and attempted to run synthesize.py with --mode live. I got the error below:
Traceback (most recent call last):
File "synthesize.py", line 100, in <module>
main()
File "synthesize.py", line 94, in main
synthesize(args, hparams, taco_checkpoint, wave_checkpoint, sentences)
File "synthesize.py", line 35, in synthesize
wavenet_in_dir = tacotron_synthesize(args, hparams, taco_checkpoint, sentences)
File "/mnt/disks/data/home/johnang3/git/rmt2/tacotron/synthesize.py", line 126, in tacotron_synthesize
run_live(args, checkpoint_path, hparams)
File "/mnt/disks/data/home/johnang3/git/rmt2/tacotron/synthesize.py", line 27, in run_live
generate_fast(synth, greetings)
File "/mnt/disks/data/home/johnang3/git/rmt2/tacotron/synthesize.py", line 15, in generate_fast
model.synthesize(text, None, None, None, None)
File "/mnt/disks/data/home/johnang3/git/rmt2/tacotron/synthesizer.py", line 84, in synthesize
wav = audio.inv_mel_spectrogram(mels.T, hparams)
File "/mnt/disks/data/home/johnang3/git/rmt2/datasets/audio.py", line 90, in inv_mel_spectrogram
S = _mel_to_linear(_db_to_amp(D + hparams.ref_level_db), hparams) # Convert back to linear
File "/mnt/disks/data/home/johnang3/git/rmt2/datasets/audio.py", line 160, in _mel_to_linear
return np.maximum(1e-10, np.dot(_inv_mel_basis, mel_spectrogram))
ValueError: shapes (1025,80) and (80,6,89) not aligned: 80 (dim 1) != 6 (dim 1)
This is coming from the "if basenames is None" block, specifically the line:
wav = audio.inv_mel_spectrogram(mels.T, hparams)
This is notably different from the corresponding line that would be executed if basenames were present:
wav = audio.inv_mel_spectrogram(mel.T, hparams)
Where mel is a single member of mels. If I change the first line to use mels[0].T instead of mels.T, the runtime error goes away, but the result is a very short file. I believe that each mel corresponds to only a very short slice of the input, so I tried inverting each of them and concatenating the results. Although this did produce a wav file, it didn't sound at all correct. So I'm not sure how to fix this correctly.
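To make the shape problem concrete, here is a minimal NumPy sketch. It assumes, from the error message alone, that mels has shape (89, 6, 80), which would mean 89 batch items of 6 frames and 80 mel channels each. The zero-filled arrays and the try_dot helper are stand-ins for illustration, not code from the repository. The second half illustrates one possible explanation that I have not verified: if a plain string reaches code that expects a list of sentences, iterating it yields one batch item per character, which would give many very short mels.

```python
import numpy as np

# Stand-in for _inv_mel_basis: (num_freq, num_mels) as in the error message.
inv_mel_basis = np.zeros((1025, 80))

# Assumed from the error: mels.T was (80, 6, 89), so mels is (89, 6, 80).
mels = np.zeros((89, 6, 80))


def try_dot(a, b):
    """Return the result shape of np.dot(a, b), or the error text on failure."""
    try:
        return np.dot(a, b).shape
    except ValueError as exc:
        return str(exc)


# Transposing the whole batch reverses all three axes, so np.dot tries to
# contract 80 against 6 and fails, as in the traceback.
batch_result = try_dot(inv_mel_basis, mels.T)

# Transposing one item gives (80, 6), which np.dot accepts.
single_result = try_dot(inv_mel_basis, mels[0].T)

# Hypothetical input text, only to show string-vs-list iteration.
text = "hello world"
items_from_string = len([ch for ch in text])    # one item per character
items_from_list = len([s for s in [text]])      # one item per sentence

print(batch_result)
print(single_result)
print(items_from_string, items_from_list)
```

If that explanation holds, wrapping the text in a list before it reaches the synthesizer would be the thing to try, rather than concatenating the per-item audio.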