artifacts at the verge between existing and extended frequency bands #23

Cunfu-Zhuge · 2024-01-02T02:46:27Z

I trained the model myself and found a line of artifacts at the verge between the existing and extended frequency bands in its output. I read the paper related to this repo, and none of the three reasons it gives for this phenomenon apply to my training process, so I don't know what causes it.
My guess is that it comes from the way I converted .flac to .wav for the VCTK dataset. Could you please tell me how you converted .flac to .wav?
Thank you!

Cunfu-Zhuge · 2024-01-02T02:50:35Z

I also found that the "model" parameter in the official pretrained model is set to "hdemucs-snake-ftb-lstm-peg-concat", but the code does not support this value. The only supported values for this parameter are "Aero" and "SEAnet".

At0nale · 2024-02-12T11:27:32Z

I had the same issue, although I am not converting .flac to .wav, so I would rule that out. I have partially fixed the issue by adding variance to the way I downsample my training examples (I now use 10 different downsampling methods).

It works nicely for clean voice samples, but as soon as there is a bit of noise or distortion in the voice, I get a frequency boost at the verge between the existing and extended frequency bands again.

Any update on this issue?

yezhangyinge · 2024-05-30T03:45:14Z

> I trained the model myself and found a line of artifacts at the verge between the existing and extended frequency bands in its output. I read the paper related to this repo, and none of the three reasons it gives for this phenomenon apply to my training process, so I don't know what causes it. My guess is that it comes from the way I converted .flac to .wav for the VCTK dataset. Could you please tell me how you converted .flac to .wav? Thank you!

I also see "a line of artifacts at the verge between existing and extended frequency bands" with the model I trained. Did you solve this problem? Do you have any suggestions?

yezhangyinge · 2024-05-31T06:33:34Z

> I had the same issue, although I am not converting .flac to .wav, so I would rule that out. I have partially fixed the issue by adding variance to the way I downsample my training examples (I now use 10 different downsampling methods).
>
> It works nicely for clean voice samples, but as soon as there is a bit of noise or distortion in the voice, I get a frequency boost at the verge between the existing and extended frequency bands again.
>
> Any update on this issue?

Could you tell me how you added variance to the way you downsample your training examples (the 10 different downsampling methods)? I ran into this problem too.
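
The thread does not say which downsampling methods were used, so the following is only one possible reading of "variance in the way I downsample": for each training example, pick the anti-aliasing filter at random before decimating, so the model does not learn to key on one fixed filter roll-off near the band edge. The function names, the three window types, the tap counts, and the cutoff range are all assumptions chosen for illustration, not anything taken from the repo or from the comment above.

```python
import numpy as np

def windowed_sinc_lowpass(cutoff, num_taps, window):
    """FIR low-pass filter; cutoff is a fraction of the Nyquist frequency (0..1)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    h = cutoff * np.sinc(cutoff * n)   # ideal low-pass impulse response
    h = h * window(num_taps)           # taper to a finite length
    return h / h.sum()                 # normalize to unity gain at DC

def random_downsample(x, factor, rng):
    """Low-pass with a randomly chosen filter, then keep every factor-th sample.

    Hypothetical helper: the choices below are illustrative assumptions.
    """
    window = [np.hanning, np.hamming, np.blackman][rng.integers(3)]
    num_taps = int(rng.choice([31, 63, 127]))
    # Jitter the cutoff slightly below the new Nyquist frequency.
    cutoff = rng.uniform(0.85, 1.0) / factor
    h = windowed_sinc_lowpass(cutoff, num_taps, window)
    y = np.convolve(x, h, mode="same")
    return y[::factor]

# Usage: make a low-resolution training input from a 16 kHz signal.
rng = np.random.default_rng(0)
hr = rng.standard_normal(16000)
lr = random_downsample(hr, factor=2, rng=rng)
```

Calling `random_downsample` inside the dataset's item loader would give each example a slightly different band edge. Whether this matches the 10 methods mentioned above is unknown.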

pokepress · 2024-08-23T04:13:47Z

Just to be clear, what do these artifacts sound like? I've been using my modified version of this project, and even though I've primarily been doing 44.1->44.1 conversion, I'm getting buzzing/sibilance in the 7-8 kHz range for my AM radio upscale project.

pokepress · 2024-08-26T03:36:41Z

So, having done more experimentation this weekend, it seems like in my case this issue correlates with the weight of the STFT loss. Those coefficients are adjustable in the base version, so try lowering them and see if it helps. I'm not 100% convinced the STFT loss is the underlying cause of the artifacts (it seems odd that it would affect such specific ranges differently than others), but it does seem to make it worse.

pokepress · 2024-09-19T03:10:30Z

I've done more work on this in my fork of the project. Right now, I'm testing a modified version of the STFT loss that allows for restricting the frequency range of the comparison (in terms of code, it zeros out parts of the STFT results above/below specified points), which should force the loss to focus on that particular frequency range. It shows some promise, but I'll need to keep increasing the weights to see how effective it really is.
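
For readers who want to try the idea described above, here is a minimal NumPy sketch of a band-restricted STFT loss. It is not the code from the fork: the function names, the FFT size and hop, and the combination of a spectral-convergence term with a log-magnitude term are assumptions for illustration. The only part taken from the comment is the masking step, which zeros out STFT bins outside a chosen frequency range before the comparison.

```python
import numpy as np

def stft_mag(x, n_fft=512, hop=128):
    """Magnitude STFT with a Hann window; returns shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def band_limited_stft_loss(pred, target, sr, f_lo, f_hi, n_fft=512, hop=128):
    """STFT loss computed only on bins with f_lo <= f <= f_hi (in Hz).

    Hypothetical helper: bins outside the band are zeroed in both spectra,
    so they contribute nothing to either term.
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    p = stft_mag(pred, n_fft, hop) * mask
    t = stft_mag(target, n_fft, hop) * mask
    eps = 1e-7
    # Spectral convergence: relative error of the magnitude spectra.
    sc = np.linalg.norm(t - p) / (np.linalg.norm(t) + eps)
    # Log-magnitude L1 distance; zeroed bins give log(eps) - log(eps) = 0.
    log_mag = np.mean(np.abs(np.log(t + eps) - np.log(p + eps)))
    return sc + log_mag

# Usage: compare two signals only around a band edge at 4 kHz.
sr = 16000
time = np.arange(sr) / sr
target = np.sin(2 * np.pi * 1000 * time) + np.sin(2 * np.pi * 6000 * time)
pred = np.sin(2 * np.pi * 1000 * time) + 0.5 * np.sin(2 * np.pi * 6000 * time)
edge_loss = band_limited_stft_loss(pred, target, sr, f_lo=3500, f_hi=4500)
```

Because the log-magnitude term is averaged over all bins, including the zeroed ones, a narrow band yields a small value, so its weight would likely need to be raised relative to a full-band loss. A training version would use a differentiable framework rather than NumPy.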
