The FastSpeech project (https://github.com/xcmyz/FastSpeech) generates mel spectrograms from text quite fast.
I am trying to integrate FastSpeech mel generation with the SqueezeWave vocoder instead of using mel2samp.py to generate mels...pt.
I tried saving mel_postnet_torch (the mel spectrogram) to a .pt file and then used that file to generate a wav
with SqueezeWave, but I get the following error:
Traceback (most recent call last):
File "inference.py", line 87, in
args.sampling_rate, args.is_fp16, args.denoiser_strength)
File "inference.py", line 57, in main
audio = squeezewave.infer(mel, sigma=sigma).float()
File "/mount/data/SqueezeWave/glow.py", line 261, in infer
output = self.WN[k]((audio_0, spect))
File "/home/alok/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/mount/data/SqueezeWave/glow.py", line 165, in forward
spect = self.cond_layer(spect)
File "/home/alok/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/alok/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 187, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected 3-dimensional input for 3-dimensional weight [2048, 80, 1], but got 4-dimensional input of size [1, 1, 80, 133] instead
Any idea what could be the issue?
I added a line to save the computed mel right after
https://github.com/xcmyz/FastSpeech/blob/master/synthesis.py#L66
torch.save(mel_postnet_torch,"filename.pt")
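For reference, the error message itself points at a shape mismatch: the conditioning layer has a weight of shape [2048, 80, 1], so it expects a 3-dimensional input of (batch, 80 mel channels, frames), while the tensor that reached it is [1, 1, 80, 133], with one extra singleton dimension. Below is a minimal sketch of one possible workaround, assuming the extra dimension is simply a redundant batch axis. The helper name to_vocoder_input is hypothetical and is not part of FastSpeech or SqueezeWave.

```python
import torch


def to_vocoder_input(mel: torch.Tensor, n_mels: int = 80) -> torch.Tensor:
    """Hypothetical helper: coerce a mel tensor to (batch, n_mels, frames)."""
    # Drop a redundant singleton axis, e.g. [1, 1, 80, 133] -> [1, 80, 133].
    if mel.dim() == 4 and mel.size(1) == 1:
        mel = mel.squeeze(1)
    # Add a batch axis to an unbatched spectrogram, e.g. [80, 133] -> [1, 80, 133].
    elif mel.dim() == 2:
        mel = mel.unsqueeze(0)
    if mel.dim() != 3:
        raise ValueError(f"cannot interpret mel of shape {tuple(mel.shape)}")
    # If the mel channels ended up on the last axis, move them to axis 1.
    if mel.size(1) != n_mels and mel.size(2) == n_mels:
        mel = mel.transpose(1, 2)
    if mel.size(1) != n_mels:
        raise ValueError(f"expected {n_mels} mel channels, got shape {tuple(mel.shape)}")
    return mel.contiguous()


# Stand-in for the tensor shape reported in the traceback.
mel = torch.zeros(1, 1, 80, 133)
fixed = to_vocoder_input(mel)
print(fixed.shape)
```

If the inference script adds a batch dimension of its own when it loads the .pt file (the 4-dimensional shape suggests this may be happening, but that is an assumption), it may be enough to save an unbatched tensor instead, for example fixed.squeeze(0) with shape [80, 133].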